This monthly blog highlights and discusses emerging trends and challenges related to healthcare data and its ever-changing life cycle.
By Kapila Monga
The phrase “Longevity Revolution,” referring to the increase in human life expectancy by 30 years, isn’t new anymore. Various books have been written on this subject, and it is a widely discussed topic at conferences these days. It is a no-brainer that this increase in life expectancy calls for preparation to support oneself financially, psychologically, sociologically, and physiologically through those additional years. All preparation aside, however, one obstacle has yet to be conquered: we haven’t yet identified how best to prepare for or guard against Alzheimer’s disease. Alzheimer’s is a progressive disease that affects the brain and its cognitive abilities. Its symptoms gradually worsen with time. It has no cure (yet), is fatal, and adversely affects the quality of life of all involved: patients, family members, and caregivers.
While efforts are underway to identify the causes of and find a cure for Alzheimer’s disease, this post explores whether artificial neural networks (ANNs) can be used to predict the onset of Alzheimer’s disease. An ANN is a computational paradigm whose structure and operation resemble those of a mammalian brain. The hypothesis is that just as a mammalian brain can do intelligent and abstract things like image recognition and signal processing with speed, accuracy, and ease, building a computational paradigm on the principles of the mammalian brain will result in computer-driven intelligent decision making. Mammalian brains function through interconnected neurons whose activation leads to a decision-making pathway, and ANNs are designed to work the same way. The following table illustrates the underlying analogy:
ANNs have proven to be useful algorithms in situations where we can collect lots of data but are unable to identify the features that are important for predicting a certain outcome. ANNs work in these situations because they can start with given weight values (and hence a given feature set, that is, a weighted combination of input data elements) and improve them using inputs from multiple interconnected nodes and the value of the cost function. A key limitation to keep in mind, though, is that the features or predictors found through ANNs are often difficult to explain in simple terms, unlike in techniques such as regression, where features are relatively easy to interpret.
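To make the idea of starting from given weights and improving them with a cost function concrete, here is a minimal sketch of a single artificial neuron trained by gradient descent. It is the simplest possible case, not a full ANN, which would connect many such nodes in layers. The toy data (the logical OR of two inputs), the starting weights, the learning rate, and all function names are assumptions chosen for illustration and have nothing to do with Alzheimer’s data.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """One neuron: weighted combination of the inputs, then an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def cost(weights, bias, samples):
    """Mean cross-entropy between the predictions and the known outcomes."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for inputs, target in samples:
        p = predict(weights, bias, inputs)
        total -= target * math.log(p + eps) + (1 - target) * math.log(1 - p + eps)
    return total / len(samples)


def train(samples, weights, bias, learning_rate=0.5, epochs=2000):
    """Repeatedly nudge the weights in the direction that lowers the cost."""
    weights = list(weights)  # work on a copy; the caller's list is untouched
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for inputs, target in samples:
            error = predict(weights, bias, inputs) - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Toy data (an assumption for illustration): the logical OR of two inputs.
TOY_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

initial_weights, initial_bias = [0.1, -0.1], 0.0  # arbitrary starting values
cost_before = cost(initial_weights, initial_bias, TOY_SAMPLES)
trained_weights, trained_bias = train(TOY_SAMPLES, initial_weights, initial_bias)
cost_after = cost(trained_weights, trained_bias, TOY_SAMPLES)

print("cost before training:", cost_before)
print("cost after training: ", cost_after)
for inputs, target in TOY_SAMPLES:
    print(inputs, "target:", target, "prediction:", predict(trained_weights, trained_bias, inputs))
```

Even in this tiny case the learned weights are just numbers that happen to reduce the cost, which hints at why the features found by a larger network are hard to explain in simple terms.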
The causes of late-onset Alzheimer’s aren’t yet fully known. Scientists have identified some factors that indicate the probability of Alzheimer’s onset, but have also noted that the presence of these factors isn’t a sufficient condition, i.e. one might test positive on them and yet not develop Alzheimer’s. This makes Alzheimer’s onset prediction a good case for ANNs, since the predictors aren’t yet fully known (we will explore the data requirements below). According to researchers, the following factors give an indication of Alzheimer’s onset:
- Presence of allele 4 of Apolipoprotein E, i.e. APOE ε4
- Atrophy of brain cells
- Mitochondrial dysfunction
- Production of unstable molecules called free radicals
- Abnormal deposits of proteins throughout the brain
There’s a saying: “It’s not who has the best algorithm that wins; it’s who has the most data.” Now for the million-dollar question: What data would be required to predict Alzheimer’s onset, and how do we get it?
- Since the causes of Alzheimer’s aren’t yet fully known, we will need a set of all possible indicators of Alzheimer’s onset in our data universe in order to fully utilize the power of ANNs
- Data for two populations would be required: one that has Alzheimer’s and one that doesn’t or, better still, never developed Alzheimer’s during its lifetime
- We will need these data sets over a period of time (probably several decades, or even the lifetime of the populations) because we are trying to identify onset
- As for the data elements, the following will be useful for developing a baseline model: demographic data, results of nutrition-related and other blood tests, magnetic resonance measurements, cerebrospinal fluid-based biomarkers, APOE genotype, and muscle biopsy test results. This isn’t an exhaustive list and should be augmented with other data sources that, based on clinical knowledge, could point to a potential cause of Alzheimer’s onset.
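As a rough illustration of how the data elements listed above might be organized before being fed to an ANN, the sketch below stores repeated visits per participant and flattens each visit into a numeric row. Everything here is hypothetical: the class and field names, the "e3/e4" genotype encoding, and the placeholder marker names and values are assumptions for illustration, and none of it reflects the format of any real data source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Visit:
    """One observation of a participant at a point in time (hypothetical schema)."""
    age_years: float
    blood_tests: Dict[str, float] = field(default_factory=dict)
    mri_measurements: Dict[str, float] = field(default_factory=dict)
    csf_biomarkers: Dict[str, float] = field(default_factory=dict)
    muscle_biopsy: Dict[str, float] = field(default_factory=dict)


@dataclass
class Participant:
    """A person followed over time, with the eventual outcome as the label."""
    participant_id: str
    apoe_genotype: str          # assumed encoding, e.g. "e3/e4"
    developed_alzheimers: bool
    visits: List[Visit] = field(default_factory=list)


def apoe_e4_count(genotype: str) -> int:
    """Count e4 alleles (0, 1, or 2) in a genotype written like "e3/e4"."""
    return genotype.lower().split("/").count("e4")


def visit_features(participant: Participant, visit: Visit,
                   feature_names: List[str]) -> List[float]:
    """Flatten one visit into a fixed-order row; missing values become NaN."""
    # Assumes measurement names are unique across the four sources;
    # on a name clash, the later dictionary wins.
    measurements = {**visit.blood_tests, **visit.mri_measurements,
                    **visit.csf_biomarkers, **visit.muscle_biopsy}
    row = [visit.age_years, float(apoe_e4_count(participant.apoe_genotype))]
    row += [measurements.get(name, float("nan")) for name in feature_names]
    return row


def build_training_rows(participants: List[Participant],
                        feature_names: List[str]) -> Tuple[List[List[float]], List[int]]:
    """Return one row per visit, ordered by age within each participant."""
    rows, labels = [], []
    for p in participants:
        for v in sorted(p.visits, key=lambda visit: visit.age_years):
            rows.append(visit_features(p, v, feature_names))
            labels.append(1 if p.developed_alzheimers else 0)
    return rows, labels


# Placeholder values only; these are NOT real measurements.
FEATURE_NAMES = ["marker_a", "marker_b"]
participants = [
    Participant("P001", "e3/e4", True, [
        Visit(72.0, blood_tests={"marker_a": 1.2}),
        Visit(70.0, blood_tests={"marker_a": 1.0}, mri_measurements={"marker_b": 3.4}),
    ]),
    Participant("P002", "e3/e3", False, [
        Visit(71.0, csf_biomarkers={"marker_b": 2.9}),
    ]),
]

rows, labels = build_training_rows(participants, FEATURE_NAMES)
for row, label in zip(rows, labels):
    print(row, "->", label)
```

Labeling every visit with the participant’s eventual outcome is only one possible design choice. A real study would also need to decide how to handle the missing values (shown here as NaN) and how far ahead of onset a prediction is meant to apply.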
Given the wide time span required and the diverse nature of these data sets, it doesn’t seem feasible for a single health system to possess all of them, except for research organizations focused on Alzheimer’s or organizations that conduct clinical trials. Given privacy restrictions, it is all the more difficult for data scientists to get access to this data for building ANNs. Fortunately, the National Alzheimer’s Coordinating Center (NACC) provides databases that have many of the above elements, and hence can be used to develop baseline ANNs for Alzheimer’s onset prediction. Machine learning tools such as Caffe, Encog, Azure ML, and Python, among many others, can give a head start in coding an ANN. Developing ANNs for Alzheimer’s onset prediction isn’t an easy task, and requires collaboration among multiple business entities to make the data sets available and to develop, refine, test, and use the ANN. Given the impact this disease has on quality of life and life expectancy, such collective efforts to ensure preparedness are far from an overreaction.
Alzheimer’s is a serious medical condition that requires proper medical diagnosis and treatment. The contents of this article cannot replace the medical attention and intervention of a doctor who treats Alzheimer’s. This article is an effort to draw attention to the potential of predicting the onset of Alzheimer’s using ANNs, and expresses the author’s point of view on the feasibility of this topic.
The opinions expressed in this article are the author’s own and do not reflect the view of the organization the author works for, or of any other corporate or business entity.
Kapila Monga (firstname.lastname@example.org) is a Healthcare Analytics professional with 10-plus years of experience across consulting and analytics for healthcare and life sciences customers. She currently works with Cognizant Technology Solutions in their Healthcare Analytics practice in the US and helps healthcare customers leverage the transformative power of analytics and data science to make their business processes more effective.