Data Science In The Health Industry: Predicting Diabetes Through Machine Learning.

Introduction

With the emergence and growth of data science and its vast applications since the early 2000s, various industries can now leverage the large amount of data available in the world today to develop data-driven solutions. According to a publication released by the Harvard Business Review, machine learning, a subset of data science that allows computers to perform tasks without being explicitly instructed on what to do, is projected to improve the healthcare industry dramatically.

Apple’s chief executive officer, Tim Cook, believes the healthcare industry is an area where Apple could make a big play. In an interview with the Consumer News and Business Channel (CNBC), Tim Cook hinted that Apple’s biggest long-term project is to break into consumer health. This is evident in Apple’s continuous improvement of the health solutions integrated into its products, especially the Apple Watch, which is optimized for collecting and monitoring personal health data such as heart rate and blood-oxygen level in real time.

In a short time, data science has begun to meet the growing need to improve the efficiency of healthcare services through the following:

  • Providing real-time monitoring systems embedded in portable and wearable devices such as wristwatches.
  • Electronic records to gather patient data and prescribe health measures and drugs based on personalized health history.
  • Application of cluster analysis for finding hidden patterns containing useful insights in patients' health data history.
  • Application of predictive analysis for extracting insights from health data to predict future outcomes.
  • Employing descriptive analysis for understanding factors responsible for health challenges.

All of these, alongside a growing number of other applications, will help health practitioners make well-informed decisions when dealing with patients if they are integrated into healthcare processes. Even in developed nations, machine learning in healthcare is yet to be fully explored.

However, as technology evolves, one major question on the minds of many is: ‘How far do we go until computers take over the workforce?’ This is a valid question, as statistics show that digitization and automation of processes often lead to technological unemployment. While I believe a noticeable amount of technological unemployment is an inevitable result of technological advancement, I also believe that seeing technology as a tool to serve humanity reshapes how we view such advancement and what drives us to pursue it.

This realization moves me to apply machine learning models to help solve real-world problems. The rest of this article is about diabetes and how machine learning can be used to predict it.

A brief history of diabetes

Diabetes is a disease caused by an abnormal amount of glucose (sugar) in the bloodstream. Its first discovery dates as far back as 1522 BC in Egypt. Although there were no sophisticated processes or established bodies of knowledge to clearly distinguish this disease or its symptoms from others, traditional methods, such as tasting a patient’s urine or checking whether it attracted ants, were used to identify patients who had diabetes. In 1800, scientific processes were developed to measure the amount of sugar in a blood sample. This was a great step towards understanding diabetes as a disease.

Diabetes in the 21st century

According to the World Health Organization (WHO) and diabetes.co.uk, the largest diabetes community in the world today, about 422 million people worldwide have diabetes, which is about 9% of the world’s adult population. About 46% of these people are undiagnosed, as most diabetic patients do not know they have diabetes until long-term damage begins to show. The World Health Organization also states that diabetes is one of the leading causes of death globally, as it often leads to other chronic conditions such as heart disease, kidney disease, and blindness.

With the decline in healthy eating habits in the 21st century and its impact, especially on children, diabetes poses a major concern for the future. According to the International Diabetes Federation (IDF) Diabetes Atlas, the population of diabetic patients is expected to increase by a factor of about 1.52 to 642 million people by 2040. The World Health Organization in April 2020 also reported that worldwide obesity had nearly tripled since 1975, with 340 million children and adolescents aged 5 to 19 and 38 million children under the age of 5 being either overweight or obese as of 2016.

The way forward

Although diabetes remains a disease with no known cure, precautionary measures have been found to reduce the likelihood of severe complications. Because the disease is asymptomatic (i.e., it shows no symptoms in the early stage), a helpful tactic is to predict each person’s risk of becoming diabetic in the future, so that people at high risk can be made aware of precautionary measures that reduce it. This can be done using supervised machine learning models.

High-level description of a supervised machine learning model

Supervised machine learning models work like training a child to identify different fruits. First, we give the child samples of fruits (e.g., oranges and apples) with labels/names attached to them. Afterwards, we test the child’s ability to name different fruits (another orange or apple) without their labels provided. How the child performs in naming unlabeled fruits is the overall test of how well the child has learned to identify fruits correctly. The child learns to identify the different fruits by matching colors, sizes, and other characteristics to each fruit.

In the same way, given some patients’ health data (e.g., Body Mass Index, blood pressure, insulin level, and age) with labels provided to differentiate diabetic patients from non-diabetic patients, supervised machine learning models build mathematical and logical equations around the labeled patients’ data in order to predict whether a new patient will have diabetes or not. Also, unlike the human brain, these mathematical equations allow machine learning models to provide calculated probabilities of being diabetic or not.
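To make the analogy concrete, here is a minimal sketch of one supervised model, logistic regression trained with gradient descent, written with only the Python standard library. It is an illustration under stated assumptions and not necessarily the approach used in the linked implementation. The patient records are synthetic (randomly generated, not real health data), only two features are used (BMI and age), and every function name and the rule that generates the labels are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function: maps any number to (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_patients(n, rng):
    # Made-up (bmi, age) records with a 0/1 "diabetic" label.
    # The labeling rule is an assumption chosen for illustration only.
    rows, labels = [], []
    for _ in range(n):
        bmi = rng.uniform(18.0, 45.0)
        age = rng.uniform(20.0, 80.0)
        risk = sigmoid(0.25 * (bmi - 30.0) + 0.05 * (age - 50.0))
        rows.append((bmi, age))
        labels.append(1 if rng.random() < risk else 0)
    return rows, labels


def scale(row):
    # Put both features on a similar scale so gradient descent behaves well.
    bmi, age = row
    return ((bmi - 30.0) / 10.0, (age - 50.0) / 20.0)


def train(rows, labels, epochs=300, lr=0.5):
    # "Showing the child labeled fruits": fit weights to labeled examples.
    w = [0.0, 0.0]
    b = 0.0
    n = len(rows)
    scaled = [scale(r) for r in rows]
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in zip(scaled, labels):
            error = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, row):
    # Calculated probability that a patient with these features is diabetic.
    w, b = model
    x = scale(row)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


rng = random.Random(0)
train_rows, train_labels = make_synthetic_patients(400, rng)
test_rows, test_labels = make_synthetic_patients(100, rng)

model = train(train_rows, train_labels)

# "Testing the child on unlabeled fruits": score on records not seen in training.
correct = sum(
    (predict_proba(model, r) >= 0.5) == (y == 1)
    for r, y in zip(test_rows, test_labels)
)
accuracy = correct / len(test_rows)

high_risk = predict_proba(model, (42.0, 70.0))  # high BMI, older patient
low_risk = predict_proba(model, (20.0, 25.0))   # low BMI, younger patient

print(f"Held-out accuracy: {accuracy:.2f}")
print(f"Probability for high-risk profile: {high_risk:.2f}")
print(f"Probability for low-risk profile: {low_risk:.2f}")
```

The training step corresponds to showing the child labeled fruits, and the held-out accuracy corresponds to testing the child on fruits without labels. The output of predict_proba is the calculated probability mentioned above, which a yes/no answer alone would not provide.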

The link below contains the implementation of different supervised learning models in predicting diabetes.

link

Thank you.