Uses Cloud GPUs to run inference on Recursion's biological and chemical foundation models
Provides public foundation models for life sciences companies to assess the effectiveness of drugs in development
Used in 150+ publications; RxRx3-core reached 6,000+ downloads in four months
Recursion is leveraging its AI-enabled platform to accelerate new treatments for rare and "undruggable" diseases.
Recursion leverages proprietary data, including extensive cell microscopy, transcriptomic, and patient data, together with an AI-enabled platform to accelerate the discovery and development of transformative new medicines for dozens of diseases with unmet medical need. Headquartered in Salt Lake City, Utah, with additional sites in New York City, London, the Oxford area, and Montreal, the company has built a drug discovery platform that integrates automated chemistry, automated biology, and cloud computing tools to uncover novel therapeutic targets and precision-design new drugs, potentially cutting the time to discover and develop a new medicine by a factor of 10.
Observers call the paradox "Eroom's Law": despite improvements in medical technology, drug discovery has become steadily slower and more expensive. As a result, the number of FDA-approved new medicines reaching patients has stagnated, while the cost of pharmaceutical R&D has continued to increase.
There are approximately 6,000 rare diseases affecting an estimated 25 million people, many of them young children, in the United States alone. The high cost and extremely long timelines of conventional drug development are even more daunting for rare diseases. As a result, fewer than 5 percent of rare diseases have FDA-approved treatments.
Meanwhile, 1 in 5 people will develop cancer in their lifetime, and approximately 1 in 5 cancer diagnoses in the US is considered a rare cancer.
Recursion is positioned to rapidly improve the development and success rate of new treatments by combining a data-driven understanding of biology and chemistry with the latest in machine learning tools to reimagine the drug discovery and development process end to end—from target discovery informed by patient genetic data to clinical trials optimized to include the patient populations most likely to benefit. The company is advancing a pipeline of clinical stage drugs—and its lead drug, REC-4881, is a potential first-in-disease treatment for the rare disease Familial Adenomatous Polyposis (FAP) that has shown positive Phase 2 results and is one of the most advanced AI drugs in development today.
In Recursion's quest to transform how medicines are made, engineering, AI pattern recognition, and cloud computing have proven as necessary as biochemical expertise. Recursion's data pipeline incorporates image processing, inference engines, and deep learning modules. Its engineers have built a platform that supports bursts of computational power that weigh in at trillions of calculations per second.
Recursion has reached an inflection point—from proving that AI can participate in drug discovery to demonstrating that an AI-native operating system can generate clinical proof and durable value.
Najat Khan, Ph.D.
CEO and President, Recursion
Recursion is seeing tangible proof points from this AI-led approach. The company has five drugs advancing through clinical trials in its internal pipeline and more than 15 drugs in early discovery with partners such as Sanofi, Roche, and Genentech. It has also developed a suite of state-of-the-art machine learning foundation models and maps of biology, both internally and with partners (including the world's first whole-genome Neuromap and Microglia Map), that are uncovering unknown biology in difficult areas like neurological diseases. The company has the first AI-native end-to-end platform in drug discovery and is delivering measurable impact in patients.
Recursion has integrated a massive processing pipeline and neural networks into a target platform that is scalable, cost-effective, and tracking to achieve potential treatments for both rare and common conditions in cardiology, neurology, dermatology, oncology, immunology, and ophthalmology, among others.
It starts with wet biology—plates of glass bottom wells containing thousands of healthy and diseased human cells. The firm's biologists run experiments on the cells, applying stains that help characterize and quantify the features of the cellular samples: their roundness, the thickness of their membrane, the shape of their mitochondria, and other characteristics.
A microscopy team captures this data by snapping high-resolution photos of the cells at several different light wavelengths. Google Cloud keeps this massive store of data readily available for Recursion's machine learning models. A data pipeline running on Google Kubernetes Engine (GKE) extracts and analyzes cell features from the images. Mathematical models with data that represent the cell features are deployed in packages to GKE containers. The data is then processed by deep neural networks to find patterns, including those humans might not recognize. The neural nets are trained to compare healthy and diseased tissue signatures with those of tissues before and after a variety of drug treatments. This strategy yields previously unknown, entirely data-driven approaches for treating diseases, as well as the best way to design molecules to be effective drugs and the most promising patient populations (and where to find them).
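To make the signature comparison concrete, here is a minimal sketch in Python. It is not Recursion's method: the production system uses deep neural networks, whereas this toy example assumes each cell population is already summarized as a short feature vector and ranks compounds by cosine similarity to the healthy signature. The feature values, the compound names, and the functions `cosine_similarity` and `rank_treatments` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_treatments(healthy, treated_by_compound):
    """Sort compounds by how closely treated diseased cells resemble healthy cells."""
    scores = {
        name: cosine_similarity(healthy, signature)
        for name, signature in treated_by_compound.items()
    }
    # Highest similarity first: the treated cells look most like healthy ones.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical signatures: [roundness, membrane thickness, mitochondrial shape].
healthy = [0.9, 0.5, 0.7]
treated = {
    "compound_a": [0.85, 0.55, 0.65],  # diseased cells after compound A
    "compound_b": [0.30, 0.90, 0.20],  # diseased cells after compound B
}

ranking = rank_treatments(healthy, treated)
for name, score in ranking:
    print(f"{name}: similarity to healthy = {score:.3f}")
```

In this toy data, compound_a moves the diseased signature closer to the healthy one and therefore ranks first. A learned embedding would replace the hand-picked features, but the idea of scoring treatments by distance to a healthy reference stays the same.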
In the more than ten years since its founding, Recursion has expanded and improved its end-to-end operations, including automated high-throughput and automated chemistry labs. It has built one of the largest proprietary biological and chemical datasets, over 50 petabytes, including microscopy images (phenomics) and RNA changes (transcriptomics), and has integrated this with patient genetic and health data. "We generate massive amounts of fit-for-purpose data and then use modern AI tools to make sense of that complexity, building powerful maps of biology that turn discovery into a computationally searchable problem," says Hayley Donnella, PhD, VP of Frontier Research at Recursion. "The scale of the data is significant, but what matters more is that the data is complementary and connected."
As the company has scaled, it has had to rethink its approach to managing big data. Google Cloud has been an important partner. "I think of Google Cloud as the foundation, the walls, and then within those, we're able to build our own apartments that are our models or our tools," says Maureen Makes, VP of Engineering at Recursion. "We also leverage TPUs for our model inference to quickly and reliably have outputs from this data on demand. We've seen about a 50% reduction in price while keeping the same quality."
The company has also leveraged Google Cloud to release open source foundation models like Open Phenom-S/16 for life sciences companies to assess the effectiveness of drugs in development using Recursion's microscopy data, setting a new gold standard for the industry.
The potential of using Cloud TPU pods to accelerate our deep learning research while keeping operational costs and complexity low is a big draw.
Ben Mabey
Chief Technology Officer, Recursion Pharmaceuticals
To train its massive deep learning models, Recursion relies on its on-premises supercomputer, BioHive-2. This specialized infrastructure handles the heavy lifting of model training within Recursion's own data center, creating the foundational neural networks used to identify biological patterns.
Once these models are trained, Recursion leverages the scalability of Google Cloud to perform inference on new images in the pipeline. This hybrid approach allows them to combine the raw power of BioHive-2 with the flexibility of the cloud. For this critical inference stage, Recursion utilizes a combination of Google Cloud GPUs and TensorFlow TPU technology to accelerate and automate image processing.
Using GKE On-Prem is attractive as it will allow us to manage all of our Kubernetes clusters with a single, easy-to-use console.
Ben Mabey
Chief Technology Officer, Recursion Pharmaceuticals
"A hybrid approach, enabled by technologies like Kubernetes, offers the best of both worlds," says Recursion Chief Technology Officer Ben Mabey. "The cost-effectiveness and control of on-prem with the on-demand scalability of the cloud."
TPUs are an organic fit for this workflow because Recursion is already using TensorFlow to train its neural networks in its proprietary biological domains. Google has provided reference model architectures optimized for Cloud TPUs, which has allowed Recursion to easily migrate inference workloads.
"The potential of using Cloud TPU pods to accelerate our research while keeping operational costs and complexity low is a big draw," says Ben. Getting answers to researchers in an order of minutes or hours versus days is a definite value add for the business.
The efficiency of TPU processing for inference is substantial. A TPU processes at 90 trillion operations per second, nearly twice the rate of standard GPUs, while consuming only one-third of the power. GPUs are used throughout Recursion as general-purpose accelerators, while TPUs are designed to accelerate the pattern matching that drives machine learning.
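The quoted figures imply a rough throughput-per-watt comparison, sketched below in Python. This is back-of-the-envelope arithmetic, not a benchmark: it assumes "nearly twice" can be rounded up to a factor of 2 and takes "one-third of the power" literally, so the results are upper bounds. The constant and function names are hypothetical.

```python
# Figure quoted in the text: TPU throughput in trillions of operations per second.
TPU_TOPS = 90.0

# Assumptions for illustration only:
THROUGHPUT_RATIO = 2.0  # TPU / GPU throughput; "nearly twice" rounded up to 2x
POWER_DIVISOR = 3.0     # GPU / TPU power draw; "one-third of the power"


def implied_gpu_tops(tpu_tops, throughput_ratio):
    """GPU throughput implied by the TPU figure and the throughput ratio."""
    return tpu_tops / throughput_ratio


def perf_per_watt_gain(throughput_ratio, power_divisor):
    """How many times more operations per watt the TPU delivers than the GPU."""
    return throughput_ratio * power_divisor


gpu_tops = implied_gpu_tops(TPU_TOPS, THROUGHPUT_RATIO)
gain = perf_per_watt_gain(THROUGHPUT_RATIO, POWER_DIVISOR)

print(f"Implied GPU throughput: about {gpu_tops:.0f} TOPS")
print(f"Throughput-per-watt advantage: up to about {gain:.0f}x")
```

Because the real throughput ratio is slightly under 2, the implied GPU figure is slightly above 45 TOPS and the per-watt advantage slightly below 6x.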
By orchestrating this workflow—training on BioHive-2 and running inference on Cloud GPUs and TPUs—Recursion can better execute its mission "to decode biology to radically improve lives." As the company shifts into the era of AI proof points, delivering real clinical impact with its AI-led approach, Recursion continues to expand its platform with integrated data layers and foundation models for drug discovery and grow its discovery and clinical pipeline both internally and with partners.
Right now, a number of cloud providers have Kubernetes solutions, but Google is by far the most mature. In particular Google Kubernetes Engine, web console, and the CLI are all just more intuitive and the ergonomics surrounding them are a lot better.
Ben Mabey
Chief Technology Officer, Recursion Pharmaceuticals
Recursion also continues to leverage the unique capabilities of Google Cloud that initially drove its hybrid computing approach.
"One factor was support," says Ben. "The responsiveness and the high-touch customer support provided by the Google team stood out from the other cloud providers." Another was the Google stewardship of the Kubernetes project, which means that Recursion could rely on Google for its expertise. Ben explains, "Right now, a number of cloud providers have Kubernetes solutions, but Google is by far the most mature. In particular Google Kubernetes Engine, the web console, and the CLI are all just more intuitive and the ergonomics surrounding them are a lot better."
Google Cloud is also proving a better fit for deep learning. "The storage is fast, and Cloud TPU is the only cloud solution for turnkey distributed training that we view as mature," says Ben. Another attraction for Recursion is Google's demonstrated commitment to the open source community. From the start, Recursion took an open source approach to building its solution. The philosophy informs its partnership strategy as well.
"The ongoing support of the open source community provided by Google really resonates with how we approach our business and the best practices we see for moving ahead," says Ben.
Recursion is a clinical stage TechBio company advancing differentiated medicines leveraging the Recursion OS, an AI-native, end-to-end platform integrating biology, chemistry, and clinical development.
Industry: Life Sciences
Location: United States
Products: Google Kubernetes Engine (GKE), Cloud Storage, BigQuery, TensorFlow