I just wanted to share some of the articles and libraries I found within Bayesian Optimization that apply to experimental science.
I think this kind of optimization is especially exciting for closed-loop experiments.
I am by no means an expert so please chime in if you have other resources you want to share.
This is a great list! I had not seen the presentation from Jensen and Hendricks - thanks for sharing!
As @evwolfson mentioned, I created Summit for solving scientific problems with Bayesian optimization. We really focus on making it easy to get started, but if you try it and have any issues let me know!
Also, I’d classify the applications of Bayesian optimization into three categories:
BO is something I want to work into my own LC validation efforts, but I haven’t quite gotten to the point where I have spare time to plan and work on it. Do any of these solutions keep track of or measure a value relevant to time spent on optimization? From what I’ve seen, this is typically called “regret”: a metric that summarizes the loss of materials, time, and resources from not using the optimal variables. It combines the time spent optimizing with the difference between the optimal return of a black-box function and the returns actually obtained during optimization; something like sum[F(optimal) - F(tests)] + scaled_time_metric. The higher the value, the higher the regret.
From my understanding, it would be a great metric to track for conveying the importance of optimization and the impact of BO on minimizing that “regret”.
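To make the formula above concrete, here is a minimal sketch in plain Python. The function names, the `time_weight` scaling factor, and the example yields are all made up for illustration. It also assumes the optimum `f_optimal` is known, which is usually only true for benchmark problems, not for a real experiment.

```python
def cumulative_regret(f_optimal, observed, time_spent=0.0, time_weight=0.0):
    """Sum of shortfalls versus the optimum, plus an optional scaled time term.

    Mirrors sum[F(optimal) - F(tests)] + scaled_time_metric for a
    maximisation problem. `time_weight` is a hypothetical scaling factor.
    """
    shortfall = sum(f_optimal - y for y in observed)
    return shortfall + time_weight * time_spent


def simple_regret(f_optimal, observed):
    """Gap between the optimum and the best value found so far."""
    return f_optimal - max(observed)


# Made-up results from five runs of a problem whose optimum is 1.0
yields = [0.2, 0.5, 0.4, 0.8, 0.9]
print("cumulative regret:", cumulative_regret(1.0, yields))
print("simple regret:", simple_regret(1.0, yields))
```

Cumulative regret penalises every poor experiment along the way, while simple regret only looks at the best result found, so which one to report depends on whether the intermediate runs themselves are costly.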
Yes, that is the correct definition of regret. There are some nuances here. BO algorithms have an acquisition function which scores the quality of potential new experiments. While most acquisition functions do not explicitly keep track of cost, they will implicitly have a trade-off between exploring (and possibly finding a better solution) and exploiting (optimizing around the best known parameters).
For example, expected improvement is known for being quite exploitative. Upper confidence bound (with the correct hyperparameters) tends to explore more. There are more acquisition functions that have different trade-offs.
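As a rough illustration of that difference, the sketch below writes out the textbook closed forms of expected improvement and upper confidence bound for a maximisation problem, using only the standard library. The candidate means and standard deviations are invented, and `beta=2.0` is an arbitrary choice, not a recommended setting; in practice these values would come from the surrogate model of whichever BO library you use.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """EI for maximisation, given the model's predictive mean and std."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


def upper_confidence_bound(mean, std, beta=2.0):
    """UCB for maximisation; a larger beta weights uncertainty more."""
    return mean + beta * std


best_so_far = 0.80
near_best = (0.85, 0.05)   # slightly better than the incumbent, low uncertainty
uncertain = (0.60, 0.30)   # worse predicted mean, high uncertainty

for name, (mean, std) in [("near_best", near_best), ("uncertain", uncertain)]:
    ei = expected_improvement(mean, std, best_so_far)
    ucb = upper_confidence_bound(mean, std)
    print(f"{name}: EI={ei:.4f}  UCB={ucb:.2f}")
```

With these made-up numbers, expected improvement ranks the near-best candidate higher, while upper confidence bound with `beta=2.0` ranks the uncertain one higher.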
One thing that often confuses people is that BO algorithms usually do not converge to an optimal point. Common acquisition functions (expected improvement, upper confidence bound) will, to some extent, continue exploring as long as they have the budget and there is uncertainty in the model. So, you typically set a budget in advance and stop once you hit it or are satisfied with the results.
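The stopping rule can be sketched as a simple loop. This is only a skeleton: `run_with_budget`, `toy_objective`, and `random_suggest` are hypothetical names, and the suggestion step here is plain random sampling standing in for a real acquisition-function step.

```python
import random


def run_with_budget(objective, suggest, budget, target=None):
    """Run at most `budget` experiments; stop early once `target` is reached."""
    history = []
    for _ in range(budget):
        x = suggest(history)          # a real BO step would use the model here
        y = objective(x)              # run the experiment
        history.append((x, y))
        if target is not None and y >= target:
            break                     # satisfied with the result
    return history


rng = random.Random(0)


def toy_objective(x):
    """Stand-in for an experiment; maximum value 0.0 at x = 0.3."""
    return -(x - 0.3) ** 2


def random_suggest(history):
    """Stand-in for the acquisition step: ignores history, samples uniformly."""
    return rng.uniform(0.0, 1.0)


history = run_with_budget(toy_objective, random_suggest, budget=20)
best_x, best_y = max(history, key=lambda pair: pair[1])
print(f"best x={best_x:.3f}, best y={best_y:.4f} after {len(history)} runs")
```

Passing a `target` gives the "satisfied with the results" exit; leaving it as `None` simply spends the whole budget.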
Finally, if the cost of experiments varies as a function of parameters (e.g., certain liquid class parameters take longer to evaluate or need more material), you could look into cost-aware BO.
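One simple cost-aware heuristic is to divide the acquisition value by the expected cost of the experiment (Snoek et al., 2012, describe this as expected improvement per second). The sketch below assumes the cost of each candidate is already known; the candidate values and costs are invented, and this is one possible approach rather than what any particular library does.

```python
import math


def expected_improvement(mean, std, best):
    """Closed-form EI for maximisation under a Gaussian prediction."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def ei_per_unit_cost(mean, std, best, cost):
    """EI divided by the (assumed known) cost of running the experiment."""
    if cost <= 0.0:
        raise ValueError("cost must be positive")
    return expected_improvement(mean, std, best) / cost


best_so_far = 0.80
# name -> (predicted mean, predicted std, cost in arbitrary units, e.g. minutes)
candidates = {
    "fast": (0.85, 0.05, 1.0),
    "slow": (0.90, 0.10, 5.0),
}

plain_choice = max(
    candidates,
    key=lambda k: expected_improvement(candidates[k][0], candidates[k][1], best_so_far),
)
cost_aware_choice = max(
    candidates,
    key=lambda k: ei_per_unit_cost(*candidates[k][:2], best_so_far, candidates[k][2]),
)
print("plain EI picks:", plain_choice)
print("cost-aware EI picks:", cost_aware_choice)
```

With these made-up numbers, plain expected improvement prefers the slow candidate, while the cost-aware score prefers the fast one because it offers more expected improvement per unit of cost.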
That’s a super interesting breakdown. It would be cool to see these get integrated more programmatically into people’s workflows, especially accounting for cost and payoff metrics.
Interesting, there’s so much to dive into here. I know there’s a lot of wonderful digital and regular chemistry work being done in the space so please share anything you feel could be relevant!
I am currently applying Bayesian Optimisation to yeast optimisation. I will have access to a Beckman Biomek i7, and we will use our Matterhorn Studio platform to host Bayesian Optimisation algorithms that can suggest new candidate experiments 24/7, i.e. whenever an experiment has finished. If nothing goes wrong, this should close the loop! (famous last words)
I am not the bioengineer in the project, but put simply, we’re trying to optimise things like the Nitrogen-Carbon ratio. I think there is metabolic engineering down the road, but baby steps!
Btw, my talk on Bayesian Optimisation in Lab Automation is now online here. It was a great time at the Munich Lab Automators, a Meetup which started on this forum!
Also! For Bayesian Optimization or any other interests in Statistics + Mathematical Optimization theory and application, there is a one-stop-shop textbook IMO.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics) https://a.co/d/ghAjVTj
BUT: We also work hard on making BO accessible to a wider audience, which is why we have been developing a Google Sheets Plug-In, which arguably is the fastest and easiest way to get started with Bayesian Optimisation.