Towards Stratified Healthcare and Precision Medicine for Developing Economies during the Covid-19 pandemic through Network Modelling and Natural Language Processing
The ever-increasing volume of information recorded every second has brought significant opportunities and challenges to both life and science. For instance, during the current Covid-19 pandemic, large numbers of medical records are being registered in large databases, and this data raises two essential, yet straightforward, questions: How can we identify and warn people at risk using data from current patients? And is there a way to identify successful procedures to treat Covid-19? In this work, we discuss how Network Modelling and Natural Language Processing, two valuable tools from Data Science, can help answer these questions in a Stratified Healthcare and Precision Medicine context. Furthermore, we provide an ethical discussion of the extent to which these methods may impact the public.
Consider an example currently unfolding in clinics and hospitals worldwide: in a developing country, numerous medical facilities are overwhelmed with Covid-19 patients. As the World Health Organization has not identified a standard procedure to tackle the virus, many doctors and nurses are struggling to treat patients effectively and to develop a prognosis. This process is affected by several factors: limited medical resources, the lack of a nationwide standard procedure, patients with different medical backgrounds, and a pressing need for a quick medical response. However, every step of treatment leaves a digital footprint in a local database. This example is an open door for Stratified Healthcare and Precision Medicine. The former is the ability to identify essential differences across populations and to act accordingly in healthcare provision, whereas the latter focuses on establishing a prognosis for each patient based on genetic, environmental, and lifestyle factors. To see how to implement these, we now introduce Network Modelling and Natural Language Processing.
Network Modelling (NM) is the study of the interactions between objects or entities, using graphs to represent their relationships. Here we present two uses of NM for the example above. Firstly, a Systems Biology approach to the study of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has revealed how long the virus survives on different surfaces and determined the structure of its proteins, helping to explain its transmission rate and its effect on human cells. With this information, hospitals can take measures to prevent community spread of the virus. Labs can also research protein reactions to assist the production of new therapies and drugs, complemented by patient-specific hospital records collected during trials, ultimately reaching Precision Medicine. Secondly, by using a communication network, it is possible to identify people at risk of dying of Covid-19 and to prevent future cases. Through the Resource Description Framework (RDF), we can trace a path between a current Covid-19 patient and the members of their household. Hence, if a patient is known to be in contact with someone from a high-risk population, additional measures can be taken to prevent contagion or to intervene preventively, thus accomplishing Stratified Healthcare.
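The household tracing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples imitate the subject-predicate-object shape of RDF but are stored as plain tuples, and every identifier and predicate name (patient:P1, livesIn, worksAt, hasRiskFactor) is invented for the example rather than taken from any real vocabulary or hospital schema. The search is a breadth-first traversal that returns the shortest contact path from a patient to each high-risk person within a chosen number of hops.

```python
from collections import deque

# Hypothetical RDF-style triples (subject, predicate, object).
# All identifiers and predicate names are invented for illustration.
TRIPLES = [
    ("patient:P1", "livesIn", "household:H1"),
    ("person:A", "livesIn", "household:H1"),
    ("person:B", "livesIn", "household:H1"),
    ("person:B", "worksAt", "place:W1"),
    ("person:C", "worksAt", "place:W1"),
    ("person:A", "hasRiskFactor", "risk:Age70Plus"),
    ("person:C", "hasRiskFactor", "risk:Diabetes"),
]

# Predicates assumed to imply physical contact between people.
CONTACT_PREDICATES = {"livesIn", "worksAt"}


def build_contact_graph(triples):
    """Build an undirected adjacency map from contact predicates only."""
    graph = {}
    for subject, predicate, obj in triples:
        if predicate in CONTACT_PREDICATES:
            graph.setdefault(subject, set()).add(obj)
            graph.setdefault(obj, set()).add(subject)
    return graph


def high_risk_people(triples):
    """Return everyone who has at least one recorded risk factor."""
    return {s for s, p, _ in triples if p == "hasRiskFactor"}


def paths_to_high_risk(triples, patient, max_hops=2):
    """Breadth-first search from `patient`.

    Returns {person: path} for each high-risk person reachable within
    `max_hops` edges, where `path` is the shortest chain of nodes found.
    """
    graph = build_contact_graph(triples)
    at_risk = high_risk_people(triples)
    found = {}
    visited = {patient}
    queue = deque([[patient]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in at_risk and node != patient:
            found[node] = path
        if len(path) - 1 >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return found


if __name__ == "__main__":
    for person, path in paths_to_high_risk(TRIPLES, "patient:P1", max_hops=4).items():
        print(person, "via", " -> ".join(path))
```

With max_hops=2 the search stays inside the household (patient, household, person); raising the limit also follows workplace links. A production system would more plausibly store the triples in an RDF store and query them with SPARQL, and any such tracing presupposes the consent discussed later in this piece.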
Natural Language Processing (NLP) is an area of Data Science focused on processing, analyzing, and interpreting human languages. In Precision Medicine, NLP is useful for analyzing pathology reports and relating records to an individual's health. Beyond the individual, it is essential to identify a standard set of procedures that, according to historical data, have been successful in treating the virus, allowing a more informed decision to be made when considering treatment options. There are several ways to do this, such as rule-based techniques or machine learning. For instance, university hospitals, in partnership with IBM, can use Watson to determine the best method to perform this analysis quickly. Afterwards, this knowledge could be applied nationwide using open-source packages like NLTK. If there is no access to such tools, engineers, mathematicians, and statisticians can collaborate with health professionals to provide an NLTK-based nationwide tool. Whatever the method, we would focus on accurately classifying each patient and finding the steps that the medical team took to treat them. This way, we can use this Precision Medicine technique to develop Stratified Healthcare.
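As a rough illustration of the rule-based route mentioned above, the following Python sketch extracts treatments and outcomes from free-text clinical notes and counts how often they co-occur. It is a toy under stated assumptions: it uses only the standard library's regular expressions rather than NLTK or Watson, the keyword lists are hypothetical, the four notes are invented, and the rules ignore negation ("no oxygen therapy required" would still match). The resulting counts describe the records; they are not evidence that a treatment works.

```python
import re
from collections import Counter

# Hypothetical keyword rules; a real system would map terms to a
# controlled medical terminology instead of hand-written patterns.
TREATMENT_PATTERNS = {
    "oxygen therapy": re.compile(r"\b(?:supplemental oxygen|oxygen therapy)\b", re.I),
    "prone positioning": re.compile(r"\bpron(?:e|ed|ing)\b", re.I),
    "mechanical ventilation": re.compile(r"\b(?:mechanical ventilation|intubat\w+)\b", re.I),
}

OUTCOME_PATTERNS = {
    "recovered": re.compile(r"\b(?:recovered|discharged)\b", re.I),
    "died": re.compile(r"\b(?:died|deceased)\b", re.I),
}


def extract(note):
    """Return (set of treatments mentioned, first outcome matched or None)."""
    treatments = {name for name, rx in TREATMENT_PATTERNS.items() if rx.search(note)}
    outcome = next((name for name, rx in OUTCOME_PATTERNS.items() if rx.search(note)), None)
    return treatments, outcome


def tally(notes):
    """Count (treatment, outcome) pairs over all notes with a known outcome."""
    counts = Counter()
    for note in notes:
        treatments, outcome = extract(note)
        if outcome is None:
            continue  # skip records without a recorded outcome
        for treatment in treatments:
            counts[(treatment, outcome)] += 1
    return counts


# Invented example notes, not real patient data.
NOTES = [
    "Patient admitted with dyspnoea. Started supplemental oxygen and prone positioning. Discharged on day 9.",
    "Severe hypoxia on arrival. Intubated on day 2. Patient died on day 6.",
    "Mild symptoms. Oxygen therapy for 3 days, then recovered.",
    "Transferred to another facility; no outcome recorded.",
]

if __name__ == "__main__":
    for (treatment, outcome), n in sorted(tally(NOTES).items()):
        print(f"{treatment:25s} {outcome:10s} {n}")
```

Keeping extraction (extract) separate from aggregation (tally) means the hand-written rules could later be replaced by a trained classifier without changing how the results are summarised.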
At this point, we have seen how NM and NLP, used together, are useful tools to develop Precision Medicine and Stratified Healthcare for the example above. Moreover, they have been essential in answering the questions posed at the beginning of this piece of work. Nevertheless, we have not yet considered the difficulties related to both approaches, which are of a technical and ethical nature.
Not surprisingly, to apply any of the tools described above to our example, we need a stable internet connection, a server to store data, and computing capacity. In developing economies, this is sometimes too much to ask. Furthermore, even if we could overcome these issues, the lack of a systematic medical terminology, like SNOMED-CT, would limit processing. Without medical supervision, the output of the analysis would be of limited value, yet guaranteeing medical assistance for research during the pandemic is not practical, even less so if professionals do not know how to use a computer. Without a doubt, if the country overcomes these challenges, a great opportunity lies ahead. Not only could we tackle the pandemic efficiently; more importantly, we could develop a Data Science-based Health System and have the tools for future disease control.
From the ethical point of view, privacy concerns arise immediately. Consent is essential when applying NM to identify at-risk cases, as patients may not want to disclose their personal information or identify the members of their households. Information breaches, hacking, discrimination by neighbours, and personal motives are among the reasons people may be reluctant to comply. Likewise, the proposed use of NLP shares these drawbacks, alongside possible misuse of information by organizations and researchers. Even though these tools may bring a benefit, they can also affect the wellbeing of the public. Thus, the state and research facilities should guarantee an ethical treatment of data, along the lines of the General Data Protection Regulation, always ensuring that consent is obtained accurately and adequately. This way, an equilibrium can be reached that both benefits the nation and protects people's rights.
References:
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. https://www.ncbi.nlm.nih.gov/pubmed/32075877 (open access: https://science.sciencemag.org/content/367/6483/1260.long)
Aerosol and Surface Stability of SARS-CoV-2 as Compared with SARS-CoV-1. https://www.ncbi.nlm.nih.gov/pubmed/32182409 (open access: https://www.nejm.org/doi/pdf/10.1056/NEJMc2004973)












