Machine learning is starting to make the development of new pharmaceuticals, until recently a highly time- and capital-intensive process, more efficient. This is good news: according to a report by the Tufts Center for the Study of Drug Development, the cost of developing a new pharmaceutical product and bringing it to market has reached a daunting $2.5 billion, an increase of 145% since 2003.
The pharmaceutical development process is not only expensive; it is difficult and slow too. It takes about ten years for a drug to make it from the laboratory to the doctor’s clinic, with only about 1 in 10 making it past regulatory agencies. To make the process more efficient, researchers have started employing machine learning to shorten development cycles and lower costs, with a particular focus on the earliest stages of development, known as “drug discovery.”
One of the major hurdles facing drug discovery researchers is a glut of biomedical information. The world’s biomedical journals publish more work every year (the number of publishing scientists grows by 4%-5% annually), and according to Benevolent.ai founder Ken Mulvany, 10,000 new pieces of published content appear every day in biomedical databases and journals.
This tidal wave of data makes a ripe target for machine learning. Benevolent.ai, which is now the largest private AI firm in Europe, was founded specifically to help biomedical researchers working in drug discovery better analyse that vast array of data. The company uses a purpose-built natural language processing model to mine the ever-growing corpus of publicly available biomedical information, as well as proprietary datasets that the company has paid to access. The output of Benevolent’s process is a “knowledge graph” that doesn’t just find instances of specific molecules or diseases in the searched publications, but maps the relationships between the citations and makes inferences about relationships that could theoretically exist between them. The hope is that, armed with these hypotheses, drug discovery researchers can more quickly and more accurately identify new drug targets, or repurpose old research or drugs to achieve new goals.
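To make the knowledge graph idea concrete, here is a minimal Python sketch. It is not Benevolent’s actual system, whose internals are not described here. It assumes the text-mining step has already produced simple (subject, relation, object) facts, and every entity name, relation name, and function name in it is a hypothetical placeholder. The single inference rule it applies (a drug that inhibits a protein implicated in a disease might be worth testing against that disease) is likewise only an illustrative assumption.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) facts, standing in for what a
# text-mining step might extract from papers. All names are placeholders.
TRIPLES = [
    ("drug_A", "inhibits", "protein_X"),
    ("protein_X", "implicated_in", "disease_Y"),
    ("drug_B", "inhibits", "protein_Z"),
    ("protein_Z", "implicated_in", "disease_Y"),
    ("drug_A", "treats", "disease_Q"),
]


def build_graph(triples):
    """Index the facts as an adjacency list: subject -> [(relation, object)]."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def propose_hypotheses(graph):
    """Suggest drug-disease links implied by a two-step path.

    Rule (an assumption for illustration only):
    drug -inhibits-> protein -implicated_in-> disease
    yields the hypothesis "drug may_treat disease", unless the graph
    already records that the drug treats that disease.
    """
    hypotheses = []
    for subject, edges in list(graph.items()):
        already_treats = {obj for rel, obj in edges if rel == "treats"}
        for rel, protein in edges:
            if rel != "inhibits":
                continue
            for rel2, disease in graph.get(protein, []):
                if rel2 == "implicated_in" and disease not in already_treats:
                    # Keep the linking protein so a researcher can see why.
                    hypotheses.append((subject, "may_treat", disease, protein))
    return sorted(hypotheses)


if __name__ == "__main__":
    for hypothesis in propose_hypotheses(build_graph(TRIPLES)):
        print(hypothesis)
```

The point of the sketch is that none of the input facts states that drug_A or drug_B has any connection to disease_Y. The suggestion only emerges once facts from separate sources are joined in one graph, which is the kind of hypothesis a researcher would then have to test in the lab.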
IBM is another major player that has made high-profile commitments to machine learning in drug discovery, along with some very ambitious claims. IBM’s Watson Health drug discovery division recently announced a partnership with drug maker Pfizer, focusing on improving research in immuno-oncology, a type of cancer treatment that bolsters the body’s immune system in order to combat the disease.
Using Watson’s machine learning algorithms, researchers at Pfizer will analyse a massive set of medical data, which according to IBM consists of over 25 million Medline abstracts, 1 million full-text journal articles, and a constantly updated set of patient data. The aim is to select drugs for more in-depth study, help understand drug combinations, and choose candidates for immuno-oncology treatment. Other notable IBM achievements in pharmaceutical research include a new patent for a machine learning method that helps researchers better identify the associations between a drug indication (a sign that a drug may be helpful) and side effects. This gives researchers better control over the formulation and dosage of drugs in clinical trials, helping them avoid serious safety issues. That could be crucial in bringing down the costs of drug discovery, as roughly one-third of all drugs fail their phase III trials due to safety concerns.
Following closely behind leaders like Benevolent, IBM, and Google (which recently released a research paper outlining its own machine learning model) is a global scene of well-funded startups. Research firm CB Insights has identified 106 startups working on machine learning for drug discovery that have received funding since 2013.
It’s still early days for machine learning in drug discovery, and there are major hurdles yet to overcome. Among these challenges is the availability of reliable, high-quality data that researchers can use to train their machine learning models. The most relevant data for training these systems is often personal medical data, which is difficult to collect. Some researchers, like Vijay Pande at Stanford University, have tried to circumvent this difficulty by devising new deep learning methods that require fewer data points to achieve a good result. Perhaps medical-grade FitBits and other fitness trackers could help provide the much-needed data? Only time will tell. But I’m hopeful that as machine learning companies continue to accumulate successes in drug discovery, and large pharma companies become increasingly supportive, researchers will find a sensible way to get the data they need and help propel us into a new age of lower-cost and higher-efficiency drug discovery.
Note - This is the second in a series of blogs investigating the effects of machine learning on the healthcare and pharmaceutical industry. Part three of this series will focus on machine learning in clinical trials.