How AI/Machine Learning has Impacted the COVID-19 Pandemic
In recent years, machine learning has found applications in new and often unexpected areas. With the novel coronavirus outbreak in 2019 and 2020, it makes sense that many have tried to apply machine learning and artificial intelligence to various problems relating to the disease. From modeling the spread of the disease to searching for possible drugs and vaccines, machine learning has been integral to understanding many of the problems caused by the COVID-19 pandemic.
Case Study – Disease Dynamics
A simple internet search will lead you to hundreds of dashboards showing the current number of coronavirus cases around the world. This stems from how easy it is to access data relating to the virus, especially from reputable sources like Kaggle or Johns Hopkins. Combined with sophisticated models of disease dynamics, this data has enabled, for example, predictive modeling of how many people actually have the virus and of the risk of hosting an event in any county in the US.
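To give a feel for how an event-risk estimate can be computed, here is a minimal sketch. It is not the method used by any particular dashboard, which the article does not describe. It assumes that each attendee is infected independently with the same probability, so the chance that at least one attendee is infected is 1 - (1 - p)^n. The function name event_risk and the example numbers are hypothetical.

```python
def event_risk(prevalence: float, attendees: int) -> float:
    """Probability that at least one attendee is infected.

    Assumes (as a simplification) that every attendee is infected
    independently with probability `prevalence`.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if attendees < 0:
        raise ValueError("attendees must be non-negative")
    # P(at least one infected) = 1 - P(nobody infected)
    return 1.0 - (1.0 - prevalence) ** attendees


if __name__ == "__main__":
    # Illustrative inputs only; not real county data.
    for size in (10, 50, 100, 500):
        print(f"{size:>3} attendees: {event_risk(0.01, size):.1%} risk")
```

Even under this simple assumption, the risk climbs quickly with event size, which is why such estimates are usually reported per county and per crowd size.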
Some researchers have tried to analyze the dynamics of the pandemic using machine learning and wavelets. One team from Germany and China combined a traditional disease dynamics model with machine learning techniques, specifically convolutional neural networks (CNNs) and long short-term memory networks (LSTMs), to predict various rates for the virus as it spread throughout Germany: the infection rate, the undiagnosed/diagnosed case rate, and the hospitalization/death rate. Yet another group has tried to model the spread of the virus in Morocco using a hidden Markov chain.
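For readers unfamiliar with traditional disease dynamics models, the sketch below simulates the classic SIR (susceptible, infected, recovered) compartmental model. This is an assumption made for illustration: the article does not say which model the teams used, and no machine learning is involved here. The function name simulate_sir and the rates beta and gamma are hypothetical choices. In hybrid approaches like the one described above, rates of this kind are the sort of quantity a learned model could be asked to estimate from data.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    beta:  transmission rate per day (assumed constant here)
    gamma: recovery rate per day (assumed constant here)
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            # dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only; not fitted to any real outbreak.
    run = simulate_sir(population=1_000_000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=120)
    peak_day = max(range(len(run)), key=lambda d: run[d][1])
    print(f"Peak infections on day {peak_day}: {run[peak_day][1]:,.0f}")
```

Forward Euler is used only to keep the example dependency-free; a production model would more likely rely on a proper ODE solver.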
In short, the ubiquity of coronavirus spread data, disease models, and machine learning frameworks has led to an outpouring of research from both professionals and hobbyists. It has served as a data science exercise for many, and each model can be evaluated for accuracy by just waiting to see how the disease actually spreads.
Case Study – Drugs and Vaccines
As early as February 2020, the novel coronavirus spike protein was mapped. The entire 3D structure of the protein was analyzed at atomic scale and shared publicly. The spike protein is the key the virus uses to enter our cells. For proteins, structure is function, and the spike protein structure directly contributes to how contagious the coronavirus has been. The release of the 3D structure was the first step in discovering treatments, drugs, and, ultimately, vaccines to contend with the pandemic.
Once the 3D structure of the spike protein and other parts of the virus are mapped, the power of modern supercomputing can be used to simulate the virus. To find a treatment for COVID-19, a drug is needed that binds to the proteins on the surface of the virus and thereby either deactivates the virus or flags it to be attacked by the immune system. Researchers at the Lawrence Livermore National Laboratory developed a highly parallelized machine learning model that runs on the world’s third fastest supercomputer at 97.7% utilization. The model was trained on almost 2 billion small molecules that could potentially treat COVID-19.
In the hunt for a vaccine, MIT researchers developed a platform called OptiVax to speed up development of peptide vaccines. This type of vaccine is relatively new, and it hinges on finding a short amino acid sequence from the target virus to put in the vaccine. With 20 possible amino acids, the number of possible sequences of length N is 20^N, so the researchers use machine learning to tackle this massive search space. Instead of opting for a purely peptide-based vaccine, the researchers aim to pair the peptide sequence with a more traditional DNA- or RNA-based vaccine, where the peptides serve to improve coverage of different populations.
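To see how quickly that search space grows, the short sketch below counts the possible sequences for a few peptide lengths. The lengths are illustrative and not taken from the OptiVax work, and the function name peptide_search_space is hypothetical.

```python
AMINO_ACIDS = 20  # number of standard amino acids


def peptide_search_space(length: int) -> int:
    """Return the number of distinct amino acid sequences of the given length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return AMINO_ACIDS ** length


if __name__ == "__main__":
    # Illustrative lengths only; not the lengths used by the researchers.
    for n in (5, 9, 15, 25):
        print(f"N={n:>2}: {peptide_search_space(n):.3e} possible sequences")
```

Each additional amino acid multiplies the count by 20, so exhaustively scoring every candidate quickly becomes impractical as N grows.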
With the announcement of vaccines by major players like Pfizer and Moderna in late 2020, the same group of MIT researchers used machine learning to find gaps in coverage by these vaccines. The researchers used OptiVax to predict groups of people that may not be fully protected by the vaccine because of their genetics. Their work was affirmed when the later clinical trials for the major vaccines showed the exact weaknesses the researchers predicted with computer modeling.
Drug and vaccine development, in contrast to disease dynamics, requires much more expertise in biochemistry and simulation. However, research teams from top universities around the country have stepped up in a concerted effort to accelerate effective vaccine development.
The pandemic, while predicted by many experts, caught much of the world by surprise. In spite of the number of cases, hospitalizations, and deaths, the novel coronavirus has also served to bring scientists and researchers together to address a pressing, real-time problem like no other. Industry experts, academics, and hobbyists have all played their part in studying the coronavirus, and machine learning has been a key tool for this endeavor.
Alex Saad-Falcon is a published research engineer at an internationally acclaimed research institute, where he leads internal and sponsored projects. Alex has his MS in Electrical Engineering from Georgia Tech and is pursuing a PhD in machine learning.