Recursion, a digital biology company industrializing drug discovery, released its open-source RxRx19 dataset, the first human cellular morphological dataset of SARS-CoV-2, the virus that causes COVID-19. The goal in releasing RxRx19 was to quickly contribute human cellular morphological data covering more than 1,600 small molecules to researchers around the world who are working to make advances in the fight against the COVID-19 pandemic.
The dataset was derived from experiments that Recursion led, in collaboration with Utah State University, to investigate the therapeutic potential of a library of 1,672 compounds, each either approved by the Food and Drug Administration or the European Medicines Agency or in clinical-stage development, for modulating the effects of SARS-CoV-2 in human renal cortical epithelial (HRCE) cells. The images were processed using Recursion’s proprietary deep learning neural network to generate high-dimensional featurizations of each image for the identification of distinct phenotypic profiles. These featurizations are also being shared publicly.
Through RxRx19, researchers in the scientific community will have access to 305,520 5-channel fluorescence microscopy images and the corresponding deep learning embeddings, which they can analyze or apply in their own experiments. Any results and conclusions drawn from the in vitro experiments and targeted hypothesis-driven research will contribute to the growing body of COVID-19 scientific data.
“At Recursion we have repeatedly seen how artificial intelligence coupled with target-agnostic drug discovery can rapidly uncover insights that are obscured through traditional approaches,” said Ben Mabey, chief technology officer at Recursion. “The release of RxRx19 creates an unprecedented opportunity for the machine learning community to uncover those hidden insights that will be most valuable in the fight against a global pandemic. Beyond the immediate purpose, this open-source dataset will help researchers advance in their abilities to use high content imaging for compound efficacy screening, which will have a positive impact that lasts well beyond the resolution of the current crisis.”
The experiments took place over four weeks, start to finish. They were conducted at the Utah State University (USU) Biosafety Level 3 facility, and the resulting data were then analyzed by Recursion’s team of data scientists, engineers and machine learning scientists, who are currently working remotely. This effort was a timely proof point that Recursion’s flexible, target-agnostic approach can pivot to tackle the most urgent public health issues.
“I am so humbled by and proud of the work of our team who worked long hours, transporting equipment and reagents nearly 60 miles each way to our collaborator every day for weeks during the height of the current pandemic crisis,” said Chris Gibson, Ph.D., co-founder and CEO, Recursion. “The generation of more than 300,000 5-channel images, a preliminary pre-print manuscript and more in just four weeks is incredible under these circumstances. This speaks both to the scrappiness of the amazing team at Recursion, as well as the flexibility of our platform to adapt rapidly to explore broad areas of biology. This is just the start — there is more to come.”
Combined with Recursion’s RxRx1 dataset released last year, RxRx19 enables machine learning researchers to leverage modern deep learning techniques to bridge two related datasets that demonstrate completely different biological phenomena but share a consistent image-based approach. Both dataset releases are part of RxRx.ai, a planned series of open-source biological and chemistry data releases for the machine learning community. To download the free RxRx19 dataset, visit https://rxrx.ai/. For more information on Recursion’s approach to applying artificial intelligence and machine learning to drug discovery and development, visit www.recursionpharma.com.
Recursion is a digital biology company industrializing drug discovery. Recursion does this by combining automation, artificial intelligence, machine learning, in vivo validation capabilities and a highly cross-functional team to discover novel medicines that expand our collective understanding of biology. Recursion’s rich, relatable database of 4 petabytes of biological images generated in-house on the company’s robotics platform enables advanced machine learning approaches to reveal drug candidates, mechanisms of action, novel chemistry, and potential toxicity, with the eventual goal of decoding biology and advancing new therapeutics that radically improve people’s lives. Recursion is proudly headquartered in Salt Lake City. Learn more at www.recursionpharma.com, or connect on Twitter and LinkedIn.