Building Better Artificial Intelligence for Health Care

Diverse datasets are essential to prevent sex and gender imbalances

Michelle Dotzert, PhD

Artificial intelligence (AI) is transforming biomedicine and health care. AI technologies are being leveraged to support clinical decision-making, guide diagnosis and treatment, and provide patients with access to real-time support and guidance. Machine learning techniques, including deep learning, are used to develop these technologies by training models with a variety of datasets. There is mounting concern that incomplete or skewed training datasets, among other factors, contribute to sex and gender imbalances within AI technologies, which can have significant implications for health care providers and patients. Addressing these imbalances will be essential for the development of robust, precise AI technologies designed to enhance health care for all.

Transforming health care

AI is being implemented in the health care setting to overcome the challenges of big data and to aid in diagnosis. Large-scale analytical techniques, such as next-generation sequencing (NGS), produce enormous amounts of complex data, and manual analysis is labor-intensive and inefficient. Machine learning can be applied to rapidly analyze large datasets and identify markers and patterns that humans may otherwise miss. For example, machine learning enables circulating free DNA analysis for cancer screening, detection, and monitoring via liquid biopsy.

Machine learning can also be applied to analyze large volumes of imaging data. Convolutional neural networks (CNNs) are deep learning models that learn the weights of small filters applied across an input image, extracting the features that differentiate one image from another. CNNs have been applied to echocardiogram images and videos to aid in the diagnosis of heart disease, as well as to images of skin lesions to identify skin cancers.
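
To make the idea of learnable weights concrete, the following is a minimal sketch of the sliding-filter operation at the core of a CNN layer. It is an illustration only: the function name `convolve2d`, the tiny image, and the hand-picked kernel are assumptions made for this example, and in a real CNN the kernel values would be learned from training data rather than set by hand.

```python
def convolve2d(image, kernel):
    """Slide a kernel over an image with no padding (cross-correlation,
    the operation CNN layers conventionally call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 3x4 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
# In a trained CNN these four numbers would be learned weights.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The output is largest where the filter sits on the edge between the dark and bright regions, which is the sense in which a filter "detects" a feature. A CNN stacks many such filters and adjusts their weights during training.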

The utility of AI technologies as predictive tools has also been explored. Machine learning algorithms have been applied to clinical data to predict cardiovascular events. AI also shows promise for monitoring and prediction in the intensive care setting, and has been explored for its utility in monitoring fluid management systems and forecasting the onset of sepsis.

Novel AI platforms are designed to extend beyond the confines of the exam room and place health assessment technology in the patient’s hands. Ada is a health companion and symptom assessment app operated by Ada Health GmbH that uses proprietary technology to combine medical knowledge and AI. The free mobile app has over 10 million users worldwide.

“[Ada] reviews multiple pieces of data and provides a probabilistic assessment that suggests possible causes for a user’s symptoms and the likelihood of each based on the available information, drawing upon a global medical knowledge base,” explains Matt Fenech, medical safety lead at Ada Health.

According to Fenech, AI tools like Ada can support both physicians and patients. The platform is designed to consider multiple pieces of information, including symptoms, medical history, and risk factors, to make a personalized assessment. “This can alleviate pressure on health care services by both informing users on appropriate next steps and supporting health professionals with complex cases and decision-making that fits into their existing workflows,” says Fenech.

Sources of sex and gender imbalances

Many AI technologies are developed using machine learning techniques such as deep learning. Deep learning methods are based on artificial neural networks (algorithms inspired by the human brain) that learn from large amounts of data. Unlike conventional machine-learning techniques, deep learning is a form of representation learning in which the machine uses raw data to develop its own representations for pattern recognition. As such, the quality of raw data used for training will determine AI performance, and challenges arise from a lack of diverse datasets.

“AI in health care is only as good as the people and data it learns from, meaning a lack of diversity in the development of AI models can drastically reduce its effectiveness,” says Fenech, adding “AI trained on biased data will simply amplify that bias.”

A lack of diversity in cardiovascular disease research, for example, has the potential to contribute to sex and gender bias in AI algorithms. Historically, cardiology research failed to ensure a gender balance, and many management guidelines and risk prediction models are based on research that enrolled primarily men. It has since been determined that women often present with a different set of symptoms during a heart attack and have a different set of risk factors than men.

“Depending on the data AI has been trained on, this gender bias could also make its way into medical algorithms,” says Fenech. “For example, whilst a male patient may be told to contact a doctor immediately based on symptoms of a potential heart attack, a female patient may be told to visit a doctor within a few days, which could have serious consequences,” he explains.

A lack of gender balance in medical imaging datasets used to train AI systems can also contribute to poor performance for underrepresented genders. Using data from a chest X-ray image database (for diagnosis of common thorax diseases including pneumonia and hernia, among others), researchers demonstrated a consistent performance decrease in computer-aided diagnosis based on CNNs when images from male patients were used to train the algorithm and images from female patients were tested, and vice versa.
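
One way to surface this kind of gap is to report a model's performance separately for each group rather than as a single overall figure. The sketch below is a minimal illustration under stated assumptions: the function name `accuracy_by_group` is hypothetical, and the labels, predictions, and group codes are made-up toy values, not data from the chest X-ray study described above.

```python
from collections import defaultdict


def accuracy_by_group(labels, predictions, groups):
    """Return {group: fraction of correct predictions} for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y_true, y_pred, group in zip(labels, predictions, groups):
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Toy, made-up example: 1 = disease present, 0 = disease absent.
labels      = [1, 0, 1, 1, 0, 1, 0, 0]
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups      = ["M", "M", "M", "M", "F", "F", "F", "F"]

per_group = accuracy_by_group(labels, predictions, groups)
gap = max(per_group.values()) - min(per_group.values())

print(per_group)  # {'M': 1.0, 'F': 0.5}
print(gap)        # 0.5
```

In this toy case the overall accuracy is 75 percent, which hides the fact that every error falls on one group. Reporting the per-group figures and the gap between them makes that imbalance visible.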

Addressing bias in AI technologies

Eliminating bias in the development of AI technologies is challenging. Access to diverse datasets for algorithm training is critical to reducing bias; however, “diverse and comprehensive datasets are extremely difficult to come by, particularly for underserved populations,” says Fenech.

A lack of diversity among those developing AI technologies can also contribute to difficulties in recognizing bias. According to Fenech, “having diverse teams—in terms of gender, ethnicity, training, and background—also plays an important role in our process and increases the likelihood of unconscious biases being recognized and addressed, rather than encoded. We have a diverse medical team spanning multiple countries, genders, races, and medical specializations.”

Post-market surveillance for AI technologies can also be used to support ongoing improvements. Having a medical safety team review any potential safety issues “ensures that potential improvements to the AI platform are rapidly identified and actioned,” says Fenech.

The black box problem

In the context of AI, the “black box problem” refers to a lack of explicit declarative knowledge representations in machine learning models, meaning they are unable to provide an explanation of how and why an output was determined.

“This is particularly true of AI based on deep learning, which has algorithms with internal network structures that are generally very large, often with millions of parameters, and of such complexity that it is difficult to describe how they work in terms that patients and users can easily understand,” explains Fenech.

Explainability is essential for AI technologies used in health care, where AI outputs can have a direct impact on patient lives. Achieving explainability in AI could also help identify undesirable biases. With an explanation of the decisional process, it would be possible to identify mistaken conclusions that result from training an algorithm with unbalanced sex and gender representation. Explainability in AI would also enable the identification of representative sex and gender differences in the data, thereby enhancing desirable biases for accurate diagnoses and individualized treatment.

“In both AI and medicine, diversity is essential to counteract potential biases in data and in human judgements,” says Fenech. Implementing processes to reduce bias will support the development of powerful AI technologies with enhanced predictive and diagnostic capabilities.