02 Feb 01:37

A New LLM System for Synthesis Planning

Austin Ventura

This new paper is worth examining as the probable state of the art in LLM-based chemical reaction handling and prediction. The authors report a system (MOSAIC, Multiple Optimized Systems for AI-assisted Chemical prediction) that takes a graphical representation of a proposed new reaction and attempts to produce a written synthetic procedure to realize it in a lab.

This is done by creating a fingerprint profile of the proposed reaction using RDKit and Morgan representations of the starting materials and the desired product. The calculated transformation is then binned into one of many reaction classes, which are represented as cells in a Voronoi diagram. One of those cells/centroids, for example, could represent Buchwald-Hartwig couplings onto aryl bromides, another onto aryl triflates, with all sorts of other cells assigned to all sorts of other chemical transformations from the literature, patents and journals, down to levels like “nitro reduction to amine using tin chloride”. Each of these has had its experimental procedures read and retained as fodder for the LLM phase of things, and the Llama-3.1-8B LLM architecture is used to generate 2,498 separate mini-expert-systems corresponding to reaction types.
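To make the binning step concrete, here is a minimal sketch of what "assign a reaction to its Voronoi cell" amounts to: pick the nearest centroid. Everything in it is an assumption for illustration and not the authors' code. The fingerprints are toy sets of bit indices rather than real RDKit Morgan fingerprints, the reaction fingerprint is taken as the bits that change between reactants and product, Tanimoto similarity stands in for whatever distance the paper actually uses, and the three centroid names and bit patterns are invented.

```python
from typing import Dict, FrozenSet

# A fingerprint here is just the set of "on" bit indices (toy stand-in for a
# Morgan fingerprint; assumption for illustration only).
Fingerprint = FrozenSet[int]


def reaction_fingerprint(reactants: Fingerprint, product: Fingerprint) -> Fingerprint:
    """Hypothetical difference fingerprint: bits that appear or disappear."""
    return reactants ^ product


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def assign_cell(fp: Fingerprint, centroids: Dict[str, Fingerprint]) -> str:
    """A point lies in the Voronoi cell of its most similar (nearest) centroid."""
    return max(centroids, key=lambda name: tanimoto(fp, centroids[name]))


# Invented centroids, one per reaction class (the real system has 2,498).
CENTROIDS: Dict[str, Fingerprint] = {
    "buchwald_hartwig_aryl_bromide": frozenset({1, 4, 9, 12}),
    "buchwald_hartwig_aryl_triflate": frozenset({1, 4, 9, 20}),
    "nitro_reduction_sncl2": frozenset({3, 7, 15}),
}

query = reaction_fingerprint(
    reactants=frozenset({2, 4, 9, 12, 30}),
    product=frozenset({1, 2, 30}),
)
print(assign_cell(query, CENTROIDS))
```

In this picture the selected cell decides which mini-expert-system gets asked for a procedure, so a reaction that lands near a cell boundary could plausibly be handed to the wrong expert.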

In the end, you would enter a drawing of your proposed reaction, and the system would spit out a text describing an experimental procedure to get this reaction to work, complete with solvents, temperatures, times, stoichiometries, etc., along with a predicted yield. These are of course reassembled from existing human-produced text procedures, in the same way that any LLM blends and remixes the textual data sets it’s been trained on. The key tricks here are the step that takes the drawn reaction and bins it into the correct Voronoi region (one of those 2,498 different reaction types) and then the LLM step that takes the procedures it has for that sort of reaction and attempts to whip up one that might work for you.

So let’s get down to what really matters to most of us: how well does it work? The authors tried feeding known reactions into the system and found that in single-shot predictions it gets the correct solvent about 30% of the time and the correct reagents about 22% of the time. That doesn’t sound so good, but to be fair, many times the answers come out as close-and-chemically-plausible. Allowing for such partial matches, you get 52% hits for solvents and 45% for reagents. If you let several of the many expert systems (the top three of them) pick and also count partial matches, which to me is the most generous interpretation I’m willing to lend credence to, you hit 76% for reagents and 55% for solvents. I will say that my reading of the paper doesn’t leave me certain how the top three expert systems are selected each time.
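The difference between those three ways of scoring (exact single-shot, partial single-shot, and partial across the top three experts) is easier to see in code. This is a sketch under my own assumptions, not the paper's evaluation script: I treat a "partial match" as any overlap between the predicted and the reported set of solvents, which may well be looser or stricter than the authors' definition, and the four toy reactions are made up.

```python
from typing import List, Sequence, Set


def is_hit(pred: Set[str], truth: Set[str], allow_partial: bool) -> bool:
    """Exact match, or (assumed definition) any overlap if partial credit is allowed."""
    return bool(pred & truth) if allow_partial else pred == truth


def hit_rate(
    ranked_preds: Sequence[List[Set[str]]],
    truths: Sequence[Set[str]],
    k: int = 1,
    allow_partial: bool = False,
) -> float:
    """Fraction of reactions where any of the top-k predictions counts as a hit."""
    hits = sum(
        any(is_hit(p, t, allow_partial) for p in preds[:k])
        for preds, t in zip(ranked_preds, truths)
    )
    return hits / len(truths)


# Made-up solvent data: reported solvents, then ranked predictions per reaction.
truths = [{"THF"}, {"DMF"}, {"toluene", "water"}, {"MeOH"}]
ranked = [
    [{"THF"}, {"dioxane"}, {"DMF"}],      # exact hit at rank 1
    [{"DMSO"}, {"DMF"}, {"NMP"}],         # exact hit only at rank 2
    [{"toluene"}, {"dioxane"}, {"THF"}],  # partial hit at rank 1
    [{"DCM"}, {"THF"}, {"EtOAc"}],        # miss under every scheme
]

strict_top1 = hit_rate(ranked, truths, k=1, allow_partial=False)
partial_top1 = hit_rate(ranked, truths, k=1, allow_partial=True)
partial_top3 = hit_rate(ranked, truths, k=3, allow_partial=True)
print(strict_top1, partial_top1, partial_top3)
```

Each relaxation can only raise the score, which is why the generous top-three, partial-match numbers should be read as an upper bound on how useful a single suggestion would be at the bench.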

At any rate, the system almost always gets something right, which is one of those point-of-view results: for a computational system that’s an encouraging sign that you may well be on the right track, but I would not hire a lab assistant if that phrase was in their letter of recommendation.

Applying the software to classes of catalytic reactions (Heck, Suzuki, Buchwald-Hartwig, Sonogashira, etc.) seems to have gone fairly well (these would be some where there are extensive experimental procedures available). The model’s predictions are not as good as others that have been specifically trained on these reaction types, but it’s quite good for a generalist approach. The team also put in 52 new molecules that looked plausible but had not been described yet in the literature, and 37 of these turned out to be makeable with the program’s recommendations (35 using the top recommendation, and the two others by going down to a lower-scoring alternative). Unfortunately, the full paper is not yet available with all its supplementary data, and I look forward to examining this list more closely.
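For a sense of how much weight 37 out of 52 can bear, here is a quick back-of-the-envelope calculation. The Wilson score interval is my own choice of method, not something the paper reports, and it assumes the 52 targets behave like independent trials, which a hand-picked test set may not.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


total = 52        # new target molecules attempted
made_top = 35     # made using the top recommendation
made_any = 37     # made using the top or a lower-scoring recommendation

rate_top = made_top / total
rate_any = made_any / total
low, high = wilson_interval(made_any, total)

print(f"top recommendation: {rate_top:.1%}")
print(f"any recommendation: {rate_any:.1%}")
print(f"approx. 95% interval for the latter: {low:.1%} to {high:.1%}")
```

With only 52 examples the interval is wide, so the headline success rate is compatible with a fairly broad range of underlying performance.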

Articles about the paper have made many comments about how these molecules could represent new directions in pharmaceutical structures, materials, polymers, and so on, but honestly to me that’s just noise. Making new small molecules that aren’t in the literature is not a challenge in itself - it’s the predicting of usable ways of doing it that could mean more. I make previously unknown molecules all the time, via my own predictions of reactions and reaction conditions, and my success rate is reasonably high. What I would very much like to know is how much better (or worse) MOSAIC is at it, and whether it can save me some time along the way to think about other things (see below).

That is to say, I would like to see how its predictions compare to what I (or any other experienced chemist) might have predicted based on a quick pass through literature databases. I take the point that the MOSAIC system has to some extent already had those literature passes done for it while building its various LLM modules, so it could in theory save time compared to bespoke searching. But those time savings will disappear quickly if it suggests more unproductive reactions than I can suggest myself!

And that brings up the usual thoughts about the purpose of such software (and indeed, hardware) assistance. I’ve referred to this as “redefining grunt work”, by which I mean taking things (in this case) that once were considered at the center of a synthetic chemist’s job and gradually moving them into the category of “necessary work that this machine over here can speed up for you” or even “necessary work that this machine will just do for you while you do something else”. And that means, as I’ve said before, that we chemists have to be alert not only to the encroachment of software onto our sacred turf, but (since that’s likely going to happen anyway) also to how best to turn that situation to our advantage. We have to be ready to spend our energies on higher-level problems: if we’re not thinking all the time about How To Make These Compounds, we should be using that time to think harder about What Compounds Need to be Made. And on top of that, Why We Should Be Making Them in the First Place. Those are going to be rather more difficult for any LLM to help out with!
