Article

Trusting Deep Learning in Medicine

Merriam-Webster1 defines “trust” as “assured reliance on the character, ability, strength, or truth of someone or something; one in which confidence is placed.” The precept is a thread that runs throughout the fabric of medicine, from patient to physician, physician to peers, and physician to support systems. Trust is integral to the physician’s ability to comfort and heal patients. In today’s technology-based healthcare culture, demand is mounting for medical professionals to place their trust in deep learning. Before that can happen on a wide scale, developers must confront the obstacles to trust and overcome them with sensitivity.

Deep learning primer 

  • Artificial intelligence (AI) is a broad term for the development of a computer system to perform a task that would traditionally have required human intelligence, such as image recognition, speech recognition, or decision making.
  • Beneath the mantle of AI sits machine learning, based on the idea that a machine can learn from patterns and features in data, processing input on its own to solve a problem and arrive at a decision without human supervision.
  • Deep learning is a further, more complex subset of machine learning. It uses deep neural networks: layers of mathematical operations with millions of connections and parameters that are strengthened or weakened based on the desired output, more closely simulating human cognitive function (a minimal sketch follows this list).
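
For the technically curious, the idea can be sketched in a few lines of Python using the open-source PyTorch library. The layer sizes, the two-class output, and the random stand-in data below are illustrative assumptions, not a clinical model.

    # A minimal deep neural network sketch (PyTorch).
    # Layer sizes and the two-class output are illustrative assumptions.
    import torch
    import torch.nn as nn

    model = nn.Sequential(         # stacked layers of mathematical operations
        nn.Linear(256, 128),       # each layer holds learnable parameters
        nn.ReLU(),
        nn.Linear(128, 64),
        nn.ReLU(),
        nn.Linear(64, 2),          # two outputs, e.g. "normal" vs. "abnormal"
    )
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # One training step: connections are strengthened or weakened
    # according to how far the output falls from the desired answer.
    x = torch.randn(8, 256)        # a batch of 8 made-up feature vectors
    y = torch.randint(0, 2, (8,))  # made-up labels
    loss = loss_fn(model(x), y)
    loss.backward()                # compute how each parameter should change
    optimizer.step()               # adjust the parameters accordingly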

Deep learning in healthcare

The concept of deep learning lends itself elegantly to many areas of medicine – prediction, diagnosis, and treatment planning. Consider, for example, image recognition. 

ImageNet2 brought the potential value of a massive image dataset to the forefront of technology. ImageNet defines itself as, “an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a ‘synonym set’ or ‘synset’. There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated.”

A deep neural network (DNN) receives its primary education on such a dataset, learning to identify new images based on similarities to the training set.5

Imagine a small child who first recognizes that furry thing with four legs and a tail as a pet, then begins to differentiate between the family’s pet dog and pet cat, eventually applying the knowledge to correctly categorize dogs as dogs and cats as cats as they are encountered outside the home. In meeting something else – perhaps a ferret or a goat – the child realizes that its inherent characteristics are different.

Like the child, the DNN builds upon learned capability as new datasets are taken in. In medical imaging, that would be a library of images with known conclusions – cancerous vs. benign, or normal vs. abnormal in any other defined context.5
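
As a rough illustration of how such a library might be used, the sketch below fine-tunes a network pretrained on ImageNet, again in Python with PyTorch. The scans/ folder, its benign and cancerous subfolders, and the training settings are hypothetical placeholders rather than any vendor’s actual pipeline.

    # Fine-tuning an ImageNet-pretrained network on a labeled medical
    # image library (PyTorch/torchvision). The "scans/" folder and its
    # "benign"/"cancerous" subfolders are hypothetical placeholders.
    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    # ImageFolder expects one subfolder per known conclusion,
    # e.g. scans/benign/... and scans/cancerous/...
    library = datasets.ImageFolder("scans/", transform=preprocess)
    loader = torch.utils.data.DataLoader(library, batch_size=16, shuffle=True)

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, 2)  # swap the 1000-class ImageNet head

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for images, labels in loader:  # the DNN builds on its ImageNet education
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()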

Natural language processing3 allows the DNN to sift through unstructured textual content in electronic health records (EHRs), correlating that layer of information with the visual recognition output. The DNN could detect a possible tissue or structural abnormality in an X-ray, magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET), or ultrasound image, then weigh the patient’s age, lifestyle factors, and medical history to arrive at the likelihood that the imaged abnormality is a specific disease. That conclusion would in turn inform the physician’s decision to order further tests or medical intervention, and the DNN could use natural language processing to complete the loop, adding its findings to the patient’s EHR.
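
A heavily simplified sketch of that loop, in Python: the feature weights, the example values, and the write_to_ehr helper are hypothetical, and a production system would learn its weights from data rather than hard-code them.

    import math

    def disease_likelihood(image_score, age, smoker, family_history):
        # Naive weighted combination; a real system would learn these weights.
        z = (3.0 * image_score + 0.02 * age + 0.8 * smoker
             + 0.6 * family_history - 3.5)
        return 1.0 / (1.0 + math.exp(-z))  # squash to a 0-1 probability

    # Combine the imaging model's abnormality score with EHR-derived factors.
    likelihood = disease_likelihood(image_score=0.91, age=62, smoker=1, family_history=0)
    if likelihood > 0.5:
        print(f"Flag for physician review (estimated likelihood {likelihood:.0%})")
        # write_to_ehr(patient_id, finding=...)  # hypothetical EHR write-back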

That is just one scenario. In addition to supporting clinical decisions, deep learning holds great promise for delivering similar efficiencies and economies in other areas, potentially reducing the need for biopsies and other invasive procedures. It may also complement diagnostics and serve as an alert system.

These exciting opportunities, however, are not without hurdles. Trust in deep learning is a significant one.

Shifting the basis of trust

At the core of all deep learning is data provided by humans, and humans are subject to bias. Bias is unintentional influence caused by the belief system of the human curator.6 Suppose, for example, that the individual selecting cat images for a dataset had a very large pet cat at home and was partial to robust felines. That person may instinctively choose more pictures of big cats. A machine trained on this data would then be at risk of identifying kittens as an abnormality, further reinforcing the misperception through the machine’s own learning cycles.

Could human bias creep into medical applications in the form of race and ethnicity? Could an algorithm conclude that people who look like this are more likely to have this disease, or to respond to therapy in this way? Those are important considerations in an era when our country is striving to equalize the quality of healthcare across socioeconomic lines.

Bias can be overcome in two ways: first, by using large-scale, curated, and annotated medical datasets that minimize the effect of any one curator’s influence; second, by locking algorithms once the objectivity of the training data and the accuracy and reliability of the output have been established through audit processes.4
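
Two of those audit steps can be made concrete in a short Python sketch: inspecting the class balance of the training labels (a skewed distribution is one measurable symptom of curation bias), and “locking” an approved model by recording a cryptographic fingerprint of its saved weights, so any later change is detectable. The label list and file names below are hypothetical.

    import hashlib
    from collections import Counter

    # Audit step 1: check the class balance of the training labels.
    labels = ["benign", "cancerous", "benign", "benign"]  # stand-in labels
    counts = Counter(labels)
    total = sum(counts.values())
    for cls, n in counts.items():
        print(f"{cls}: {n} ({n / total:.0%})")  # flag heavy imbalance for review

    # Audit step 2: "lock" the approved model with a SHA-256 fingerprint.
    def fingerprint(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    # locked = fingerprint("approved_model.pt")          # hypothetical file
    # assert fingerprint("deployed_model.pt") == locked  # verify before use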

The second roadblock to faith in deep learning is the black box effect. In a deep neural network (as compared with a traditional statistical model), the inner workings of the algorithms are extremely complex, obscured from view, and not easily understood by anyone who is not a data scientist. The clinician is thus in the position of assuming responsibility for the output, possibly without a thorough understanding of the training data.4

Black box skepticism is overcome with transparency4: providing clinical end-users with a synopsis of the training dataset and an overview of the algorithm; delivering a gauge of the reliability of the DNN’s results compared with an accepted standard; and illustrating the security measures that safeguard against intentional, malicious tampering with algorithms.
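
One way such a reliability gauge might look, sketched in Python: compare the DNN’s calls on a held-out set against an accepted standard, such as pathology-confirmed labels, and report sensitivity and specificity. The two lists below are made-up illustrations.

    # Compare DNN predictions to an accepted standard (made-up data).
    standard = [1, 0, 1, 1, 0, 0, 1, 0]  # accepted-standard labels (1 = disease)
    dnn_call = [1, 0, 1, 0, 0, 1, 1, 0]  # the DNN's predictions

    tp = sum(1 for s, d in zip(standard, dnn_call) if s == 1 and d == 1)
    fn = sum(1 for s, d in zip(standard, dnn_call) if s == 1 and d == 0)
    tn = sum(1 for s, d in zip(standard, dnn_call) if s == 0 and d == 0)
    fp = sum(1 for s, d in zip(standard, dnn_call) if s == 0 and d == 1)

    print(f"Sensitivity: {tp / (tp + fn):.0%}")  # disease cases correctly caught
    print(f"Specificity: {tn / (tn + fp):.0%}")  # healthy cases correctly cleared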

Throughout history, every disruptive technology has faced resistance. When the stakes are high enough and the benefits are great enough, human confidence prevails.

References:

    1. Trust. Merriam-Webster. https://www.merriam-webster.com/dictionary/trust. Accessed June 20, 2019.
    2. About ImageNet. ImageNet. http://image-net.org/about-overview. Accessed June 20, 2019.
    3. Natural Language Processing (NLP): What it is and why it matters. SAS Insights. https://www.sas.com/en_us/insights/analytics/what-is-natural-language-processing-nlp.html. Accessed June 21, 2019.
    4. Transparency is Key for Clinical Decision Support, Machine Learning Tools. Xtelligent Healthcare Media. https://healthitanalytics.com/news/transparency-is-key-for-clinical-decision-support-machine-learning-vendors. Accessed June 21, 2019.
    5. Interview with GE Healthcare’s Marketing Manager Sonia Sahney. June 15, 2019.
    6. Bias. Merriam-Webster. https://www.merriam-webster.com/dictionary/bias. Accessed June 20, 2019.