Why Can’t AI Understand Health Without Genetics?

When a patient walks into a clinic in Kingston, Lagos, or Manila, the AI tool that a doctor might consult to assess her risk for heart disease, diabetes, or certain cancers was almost certainly not designed with her in mind. It was trained, in large part, on data from people who look nothing like her: genetically, demographically, or epidemiologically. That gap, researchers now warn, is not a minor technical footnote. It is a fundamental flaw at the heart of medicine’s AI revolution.

The promise of artificial intelligence in health care rests on its capacity to recognize patterns in vast datasets and translate them into clinical insight. But health is not simply a function of behavior and environment. It is deeply, inextricably shaped by genetics, by the roughly three billion base pairs that determine how bodies metabolize drugs, how immune systems respond to infection, and how inherited variants elevate or suppress disease risk. And the genetics in most AI training data, experts say, skews sharply toward a narrow slice of humanity.

A Genome-Shaped Blind Spot

The numbers are stark. More than 80 percent of genome-wide association studies, the large-scale research that maps genetic variants to disease risk, have been conducted in people of European ancestry. Yet that group represents fewer than 20 percent of the world’s population, according to analyses published in Nature and cited by the World Economic Forum. This means that when AI systems use genetic findings as a foundation for health predictions, they are drawing from a deeply unrepresentative pool.

The consequences are not hypothetical. Researchers at the University of Wisconsin-Madison published findings in Nature Genetics demonstrating that machine learning tools used in genome-wide association studies can produce systematic false positives, incorrectly linking genetic variants to disease risk in ways that mislead clinical interpretation. Using diabetes as a case study, the team showed that trusting AI-predicted risk as a proxy for actual risk introduced a “pervasive bias” into the findings.

“The problem is if you trust the machine learning-predicted diabetes risk as the actual risk, you would think all those genetic variations are correlated with actual diabetes even though they aren’t,” said Qiongshi Lu, an associate professor in the Department of Biostatistics and Medical Informatics at UW-Madison. “These days, genomic scientists routinely work with biobank datasets that have hundreds of thousands of individuals. As statistical power goes up, biases and the probability of errors are also amplified.”
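A toy simulation makes the mechanism concrete. The sketch below, which uses invented genotypes and effect sizes and is not the UW-Madison analysis, runs the same variant-by-variant scan twice: once against a measured phenotype and once against an ML-style predicted risk score that leaks a correlated covariate. The second scan flags variants the first one correctly ignores.

```python
# Illustrative sketch (not the UW-Madison analysis) of how regressing on an
# ML-predicted phenotype rather than the measured one can inflate false
# positives in a GWAS-style scan. Sample sizes, effect sizes, and the
# "predictor" are all made-up assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m = 5000, 200                       # individuals, variants
causal = np.arange(5)                  # only the first 5 variants are causal
geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)

# Measured phenotype: driven by the causal variants plus noise.
pheno = geno[:, causal] @ np.full(5, 0.3) + rng.normal(size=n)

# An imperfect ML "risk score": partly tracks the phenotype, but also leaks a
# covariate (say, BMI) that correlates with many non-causal variants -- the
# source of the spurious associations.
bmi = geno[:, 50:150] @ rng.normal(0, 0.1, size=100) + rng.normal(size=n)
predicted_risk = 0.5 * pheno + 0.8 * bmi + rng.normal(size=n)

def scan(y):
    """Per-variant p-values from simple linear regression."""
    return np.array([stats.linregress(geno[:, j], y).pvalue for j in range(m)])

for label, y in [("measured phenotype", pheno), ("predicted risk", predicted_risk)]:
    hits = np.where(scan(y) < 0.05 / m)[0]              # Bonferroni threshold
    false_hits = [j for j in hits if j not in causal]
    print(f"{label}: {len(hits)} hits, {len(false_hits)} false positives")
```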

The Polygenic Risk Score Problem

Nowhere is this tension more consequential than in the use of polygenic risk scores, composite genetic indicators that aggregate hundreds or thousands of small genetic signals to estimate an individual’s susceptibility to conditions like breast cancer, schizophrenia, or coronary artery disease. AI has accelerated the calculation and deployment of these scores. But a score derived from European genomic data, applied to a patient of West African descent, may not only be inaccurate but actively mislead clinicians.

A review published on ScienceDirect noted that “biases in genomic datasets can lead to inequitable outcomes when applying AI models across different populations,” a pattern amplified when those models are deployed at scale across healthcare systems that serve ethnically diverse patients. The review, examining the convergence of AI and genomics, called the lack of diverse training data one of the most pressing challenges to responsible integration.

“When minority groups are invisible in datasets used to deploy AI algorithms, their needs and phenotypes may become invisible.” (Lancet Digital Health)

The Lancet Digital Health has called for mandatory transparency around dataset limitations in AI health tools, noting that “without careful dissection of the ways in which biases can be encoded into AI health technologies, there is a risk of perpetuating existing health inequalities at scale.” The STANDING Together initiative, informed by more than 350 representatives from 58 countries, developed consensus recommendations to address exactly this failure mode.

What AI Models Miss When They Miss Genetics

Beyond population-level bias, the absence of genetic data from AI health models creates a more basic epistemological problem: disease is not simply a social or behavioral phenomenon. Drug metabolism, cancer predisposition, immune response, and neurological function are all powerfully shaped by individual genomic architecture. An AI that predicts patient outcomes without access to that architecture is, at best, an approximation and, at worst, a confident one.

A review in Genome Medicine described how deep learning applied to genomic data can now assist with variant calling, genome annotation, and phenotype-to-genotype correspondence, but cautioned that “challenges, limitations, and biases must be carefully addressed for the successful deployment of AI in medical applications, particularly those utilizing human genetics and genomics data.” Most clinical AI tools have not cleared that bar.

Meanwhile, research from MIT found that diagnostic AI models, trained predominantly on data from majority populations, develop a hidden capability: they can predict patient race from medical images, even though no one trained them to do so. “Many popular machine learning models have superhuman demographic prediction capacity,” said Marzyeh Ghassemi, an MIT professor. “These are models that are good at predicting disease, but during training are learning to predict other things that may not be desirable.”

EXPLAINER: What Is a Polygenic Risk Score?

A polygenic risk score (PRS) is a number calculated from many small genetic variants across a person’s genome to estimate their inherited risk for a particular disease. Unlike single-gene tests (such as BRCA testing for breast cancer), PRS draws on hundreds or thousands of genetic signals simultaneously. The accuracy of a PRS depends entirely on the population in which the underlying variants were identified, meaning a score built on European genetic data performs significantly worse for individuals of African, Asian, or mixed ancestry.
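For readers who want the arithmetic, a PRS is, at its core, a weighted sum: each risk allele a person carries is multiplied by the effect size estimated for that variant in a discovery study, and the products are added up. A minimal sketch, with variant IDs and weights invented purely for illustration:

```python
# Toy polygenic risk score: a weighted sum of risk-allele counts. The variant
# IDs and effect sizes below are invented for illustration only.

# Per-variant effect sizes (e.g., log odds ratios) from a hypothetical
# discovery GWAS -- real scores use hundreds or thousands of variants.
effect_sizes = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}

# One person's genotype: number of risk alleles (0, 1, or 2) at each variant.
genotype = {"rs0001": 2, "rs0002": 1, "rs0003": 0}

prs = sum(effect_sizes[v] * genotype[v] for v in effect_sizes)
print(f"Polygenic risk score: {prs:.2f}")

# The raw number only means something relative to a reference population: if
# the effect sizes came from European-ancestry cohorts, the same arithmetic
# can rank a patient of African or Asian ancestry incorrectly.
```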

Five Billion Left Behind

The stakes are not abstract. According to the World Economic Forum, an estimated five billion people risk being excluded from the benefits of AI health tools, largely because the data used to build those tools does not reflect their genetic or demographic reality. Most health data used to train AI systems comes from patients in the United States, Western Europe, and China. Even when inclusive datasets exist, many healthcare systems in the Global South lack the digital infrastructure to deploy and adapt AI tools effectively.

A 2024 PLOS Digital Health review from Yale University researchers catalogued how bias enters AI systems at every stage: from flawed training labels and imbalanced data to biased deployment and selective publication of results. “Left unaddressed, biased medical AI can lead to substandard clinical decisions and the perpetuation and exacerbation of longstanding healthcare disparities,” the authors wrote.

AI models trained on biased COVID-19 mortality data, for instance, showed significantly worse precision for Hispanic, Black, and Asian populations compared to white patients, a gap that transfer learning techniques only partially bridged, according to a 2025 study in BMC Medical Informatics.
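The kind of audit these studies describe can be sketched in a few lines: measure a model’s precision separately for each group, then continue training on data from the under-represented group and measure again. The example below uses synthetic data and scikit-learn, not the models or cohorts from the BMC study, so the numbers are purely illustrative.

```python
# Illustrative per-group audit plus a simple transfer step, on synthetic data.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import precision_score

rng = np.random.default_rng(1)

def make_group(n, offset):
    """Synthetic patients whose feature-outcome relationship differs by group."""
    X = rng.normal(size=(n, 10))
    y = (X[:, 0] - 0.5 * X[:, 1] - offset + rng.normal(size=n) > 0).astype(int)
    return X, y

X_major, y_major = make_group(20000, offset=0.0)   # well-represented group
X_minor, y_minor = make_group(2000, offset=1.0)    # under-represented group

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.fit(X_major, y_major)                          # trained only on the majority

for name, X, y in [("majority", X_major, y_major), ("minority", X_minor, y_minor)]:
    print(name, "precision:", round(precision_score(y, clf.predict(X)), 3))

# Transfer step: keep the learned weights, continue training on a slice of
# minority-group data, then re-check precision on held-out minority patients.
clf.partial_fit(X_minor[:1000], y_minor[:1000])
print("minority precision after transfer:",
      round(precision_score(y_minor[1000:], clf.predict(X_minor[1000:])), 3))
```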

Genomics in the Loop

Researchers are not without answers. Multimodal AI systems that integrate genomic sequence data alongside clinical records, imaging, and behavioral data have shown significantly improved diagnostic accuracy in conditions with complex genetic architecture. A systematic review published in Frontiers in Genetics found that newer transformer-based models, when trained on both genetic and clinical inputs, outperformed single-modality systems in identifying rare genetic disorders, predicting cancer predisposition genes, and classifying variants of uncertain significance.
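In architectural terms, the simplest version of this idea is late fusion: encode the genomic and clinical inputs separately, then combine the two representations for prediction. The sketch below is a generic illustration in PyTorch with made-up layer sizes, not the transformer systems covered in the review.

```python
# Minimal late-fusion sketch: separate genomic and clinical encoders whose
# outputs are concatenated for a single risk prediction. Dimensions and
# inputs are illustrative assumptions.
import torch
import torch.nn as nn

class MultimodalRiskModel(nn.Module):
    def __init__(self, n_variants=500, n_clinical=20, hidden=64):
        super().__init__()
        self.genomic_encoder = nn.Sequential(nn.Linear(n_variants, hidden), nn.ReLU())
        self.clinical_encoder = nn.Sequential(nn.Linear(n_clinical, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)   # fusion head outputs one risk logit

    def forward(self, variants, clinical):
        g = self.genomic_encoder(variants)
        c = self.clinical_encoder(clinical)
        return self.head(torch.cat([g, c], dim=-1))

model = MultimodalRiskModel()
variants = torch.randint(0, 3, (8, 500)).float()   # allele counts for 8 patients
clinical = torch.randn(8, 20)                      # labs, vitals, history features
print(model(variants, clinical).shape)             # torch.Size([8, 1])
```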

Large biobank programs such as the NIH’s All of Us Research Program and the UK Biobank have expanded genetic representation, but gaps remain large, particularly for sub-Saharan African and South Asian populations. Global initiatives like the Global Alliance for Genomics and Health are working to establish data-sharing frameworks that preserve data sovereignty while increasing the breadth of genetic data available to AI developers.

The convergence of AI and precision medicine holds enormous promise, but only if genetic risk factors are treated as essential inputs, not optional enhancements. A 2025 review in NAR Genomics and Bioinformatics from King Abdullah University of Science and Technology concluded that “a gap exists in exploring the contribution of genetic risk factors to AI-powered precision medicine” and that closing that gap is essential for the field to deliver on its potential.

Boston-based startup Bystro AI believes genomic data can function much like a living medical chart, allowing AI to interpret health questions using an individual’s biological blueprint rather than broad population averages. Instead of relying solely on generalized medical knowledge, Bystro says it lets everyday consumers as well as researchers query their own genomic data through a conversational AI interface.

Founder Alex Kotlar spoke with tech editor Faustine Ngila. The excerpt below has been edited for clarity.

Q: How does Bystro integrate a user’s genomic data into AI-driven health insights while ensuring privacy and security?

Bystro uses a proprietary system to rapidly process and analyze genetic data. It automatically cleans the data, identifies genetic variants, and annotates them using the full body of existing scientific research. Our AI system is then grounded in that genetic data. Instead of generating generic responses, it uses the user’s unique genome and known scientific literature to guide its answers. This approach helps reduce hallucinations by anchoring outputs in real, validated information.

We also run multiple verification layers to refine results and catch potential errors before delivering an answer.
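(The grounding step Kotlar describes follows the general pattern of retrieval-augmented generation: fetch the relevant pieces of a user’s annotated genome and hand them to the model as context. The sketch below is an editorial illustration of that pattern with a made-up annotation store and prompt format; it does not depict Bystro’s actual pipeline.)

```python
# Generic retrieval-grounded prompt assembly. The annotation records, the
# keyword "retrieval," and the prompt wording are all hypothetical.
ANNOTATIONS = [  # in practice, the output of a variant-annotation pipeline
    {"variant": "rs0001", "gene": "GENE_A", "note": "associated with trait X in cohort studies"},
    {"variant": "rs0002", "gene": "GENE_B", "note": "variant of uncertain significance"},
]

def build_grounded_prompt(question: str) -> str:
    # Naive keyword matching stands in for a real relevance search.
    relevant = [a for a in ANNOTATIONS
                if a["gene"].lower() in question.lower() or a["variant"] in question]
    context = "\n".join(f"- {a['variant']} ({a['gene']}): {a['note']}" for a in relevant)
    return ("Answer using only the annotations below; say so if they are insufficient.\n"
            f"Annotations:\n{context or '- none found'}\n\nQuestion: {question}")

print(build_grounded_prompt("What does my rs0001 result mean?"))
```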

On privacy, we take a strict approach. We never train on user genetic data. Raw genomic files are deleted immediately after processing, and users can delete any remaining data tied to their session at any time. The goal is to minimize data retention while still enabling meaningful analysis.

Q: What kind of health insights can users realistically gain today, and what are the limitations?

The system can provide a wide range of research-grade insights, and its capabilities continue to expand as new scientific studies are published.

For example, it can identify genetic variants associated with disease risk. This doesn’t mean someone will develop a condition, but it can highlight predispositions. A user might then take that information to a doctor to explore prevention or monitoring strategies. It can also support performance and lifestyle optimization. For instance, if someone has genetic markers linked to lower collagen production or joint fragility, the system might suggest adjustments to training or recovery routines.

That said, Bystro does not provide medical advice. It delivers research-backed insights that should be interpreted alongside a healthcare professional.

Q: How will AI move beyond population averages to truly personalized medicine?

Personalized medicine requires shifting from population-level data to individual-level insights. Tools like Bystro make that possible by giving users an interactive, continuously updated view of their own genome. AI can surface relevant patterns and risks, but humans still play a critical role in interpreting and applying that information. It’s less about certainty and more about increasing confidence based on available evidence.

For example, we worked with a user who had early-onset hearing loss. There wasn’t a single clear cause, but their genome revealed multiple contributing factors, including a predisposition to both hearing loss and inner ear infections. While not definitive proof, this combination helped explain their experience in a more personalized way.

Q: How could AI-powered genomics improve healthcare access, especially in underserved communities?

In well-resourced areas, patients can see multiple specialists to figure out a diagnosis. In underserved communities, that’s often not an option. Genomic data, combined with AI, can help narrow down potential issues much faster. While sequencing has an upfront cost, it can reduce the need for multiple appointments, tests, and referrals—ultimately lowering overall healthcare costs.

By giving both patients and doctors better information earlier, AI can help streamline care and improve access, particularly where resources are limited.

Q: What safeguards are needed to prevent misinterpretation?

There are two main risks: human misinterpretation and AI error. On the human side, most doctors are not geneticists or statisticians. Interpreting genetic risk requires expertise across multiple fields, which can lead to gaps in understanding.

AI can help bridge that gap by synthesizing complex genetic and statistical information. However, the system itself must be reliable.

To address this, Bystro uses multiple validation layers to cross-check results and reduce hallucinations. Additionally, human oversight remains essential. AI outputs should always be reviewed and contextualized by a qualified professional.

Q: What ethical and regulatory challenges do you anticipate?

One major challenge is how information is communicated. Genetic insights can be complex and sometimes distressing, especially if they relate to conditions that may not be preventable. There are longstanding ethical questions in genetics around whether, when, and how to share this kind of information.

Misinterpretation is another risk. For example, misunderstanding genetic risk can lead to unnecessary or extreme medical decisions.

As access to genomic data increases, these challenges will become more common. The solution will require a combination of better education, access to counseling, and responsible AI design.

Bystro currently focuses on providing research-based insights, not direct medical advice. As the space evolves, ensuring that information is accurate, contextualized, and responsibly delivered will be critical.

