Paper (link)

As the life sciences learn to harness generative AI methods, molecular design is starting to shift from one paradigm to another:

property prediction methods as filters for high-throughput *in silico* screening
->
generative design of molecules using AI models guided by property prediction methods

In Swanson et al. 2024, the authors introduce SyntheMol, a generative AI model that uses Monte Carlo tree search (MCTS) to steer molecule generation toward high scores from a property prediction model. In other words, MCTS guides the model to produce molecules that satisfy your property prediction model of choice; for example, this could mean optimizing molecules to have a high cLogP value.
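To make the idea of "search guided by a property predictor" concrete, here is a minimal sketch of generic MCTS (with the standard UCT selection rule) over a toy combinatorial space. It is not the authors' implementation: the building blocks, their scores, `toy_property_score`, and the three-block "molecule" are all hypothetical stand-ins, and a real system would plug in a trained property model and real chemistry.

```python
import math
import random

# Hypothetical building blocks with made-up contributions to a toy property.
BLOCKS = {"A": 0.1, "B": 0.4, "C": 0.7, "D": 0.2, "E": 0.9}
MAX_BLOCKS = 3  # a toy "molecule" is an ordered tuple of 3 block names


def toy_property_score(molecule):
    """Stand-in for a trained property predictor; returns a value in [0, 1]."""
    return sum(BLOCKS[b] for b in molecule) / MAX_BLOCKS


class Node:
    def __init__(self, molecule=(), parent=None):
        self.molecule = molecule
        self.parent = parent
        self.children = {}   # block name -> child Node
        self.visits = 0
        self.total = 0.0     # sum of scores backed up through this node

    def is_terminal(self):
        return len(self.molecule) == MAX_BLOCKS

    def untried(self):
        return [b for b in BLOCKS if b not in self.children]

    def uct_child(self, c=1.4):
        # Standard UCT: average score plus an exploration bonus.
        return max(
            self.children.values(),
            key=lambda ch: ch.total / ch.visits
            + c * math.sqrt(math.log(self.visits) / ch.visits),
        )


def rollout(molecule, rng):
    """Complete a partial molecule with random blocks."""
    mol = list(molecule)
    while len(mol) < MAX_BLOCKS:
        mol.append(rng.choice(list(BLOCKS)))
    return tuple(mol)


def search(n_iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node()
    seen = {}  # every completed molecule scored during the search
    for _ in range(n_iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.is_terminal() and not node.untried():
            node = node.uct_child()
        # Expansion: add one new child if possible.
        if not node.is_terminal():
            block = rng.choice(node.untried())
            child = Node(node.molecule + (block,), parent=node)
            node.children[block] = child
            node = child
        # Simulation: finish the molecule and ask the predictor for a score.
        final = rollout(node.molecule, rng)
        score = toy_property_score(final)
        seen[final] = score
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += score
            node = node.parent
    return sorted(seen.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for molecule, score in search()[:5]:
        print("-".join(molecule), round(score, 3))
```

The point of the sketch is the division of labor: the tree search decides which partial molecules deserve more attention, and the property predictor is only ever called as a scoring function, so swapping in a different property means swapping one function.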

They show that this method uses fewer resources and less compute than high-throughput *in silico* screening of existing datasets. They successfully use this paradigm and model to design molecules that are effective antibiotics against Acinetobacter baumannii at a 10% hit rate, compared to the 3.5% hit rate in the training dataset.

The Case for SyntheMol: The Company

The market for CROs (Contract Research Organizations) is expected to grow from $82.B in 2024 to $129.8B in 2029, a 9.6% CAGR (link, MarketsandMarkets). CROs are crucial partners to the drug discovery industry, allowing biotech companies to either scale up research or focus their talent on clinical and discovery work.

As AI starts to permeate life sciences research, there has been a rise of new CROs that are AI-native in their molecular optimization processes. Two examples of this new breed are XtalPi and Cradle Bio. XtalPi specializes not just in optimizing molecular properties, but also in large-scale automation and AI-based chemical synthesis design in its CRO offerings. (link) While it pulls in much smaller revenue than CRO giants like Thermo Fisher Scientific and WuXi AppTec, it clearly serves a need for biopharma, and has successfully gone public on the Hong Kong Stock Exchange. In the bio space, Cradle Bio is a new AI-native protein property optimization partner for synthetic biology and therapeutics companies. (link) Based on conversations, it seems that both of these companies rely on the virtuous cycle of generating new molecules, validating their properties, and updating their AI models to become more proficient at property design or synthesis.

SyntheMol could find a similar wedge in this space, akin to XtalPi's, as a partner for optimizing the molecular properties and designability of molecules, as outlined in the paper. Another way to view this is as an optimization layer on top of the Enamine REAL space of chemicals: optimization-as-a-service over the largest existing chemical building block repository.

I am excited to see where this work goes, and hopefully we get to benefit from the much-needed new antimicrobials Swanson et al. designed!