Searching Clinical Data using Generative AI

Published: June 6, 2025

Karan Hanswadkar

Background

Healthcare is fertile ground for generative AI applications, with clinical data being the common denominator across patients, payers, and providers. Searching clinical data and running analyses over it can lead to better outcomes in patient diagnosis and care. Unfortunately, clinical data is often messy, which makes finding relevant information challenging. Efforts to standardize clinical data with medical codes date back to the 1850s, and successive iterations have made the categories more detailed. For example, the International Classification of Diseases (ICD) categorizes disease descriptions, with ICD-10 being the most prevalent version and ICD-11 now being adopted.

ICD codes follow a tree-like structure, beginning with broad categories such as “respiratory diseases” and branching into specific conditions like “chronic bronchitis” or “asthma”. While this organization helps group related illnesses logically, it complicates the search process, particularly when the everyday medical terminology doctors use doesn't align with the official language in the coding system. In contrast to code assignment, which is a one-to-one mapping from each disease or drug to a single standardized code, searching is a one-to-many problem where users look for broader families of diseases or drugs.
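The one-to-many nature of search can be sketched with a tiny slice of the ICD-10 respiratory chapter. This is a minimal illustration, not SearchAI's implementation: the names `ICD10_TREE`, `expand`, and `leaf_codes` are hypothetical, and the slice covers only a handful of codes chosen for the example.

```python
# Illustrative slice of the ICD-10 respiratory chapter (far from complete).
# Each entry maps a code or code range to its label and its child codes.
ICD10_TREE = {
    "J00-J99": {"label": "Diseases of the respiratory system", "children": ["J40-J47"]},
    "J40-J47": {"label": "Chronic lower respiratory diseases", "children": ["J42", "J45"]},
    "J42": {"label": "Unspecified chronic bronchitis", "children": []},
    "J45": {"label": "Asthma", "children": ["J45.909"]},
    "J45.909": {"label": "Unspecified asthma, uncomplicated", "children": []},
}


def expand(code, tree=ICD10_TREE):
    """Return the code followed by all of its descendants (pre-order)."""
    result = [code]
    for child in tree[code]["children"]:
        result.extend(expand(child, tree))
    return result


def leaf_codes(code, tree=ICD10_TREE):
    """Return only the most specific codes underneath (or equal to) `code`."""
    return [c for c in expand(code, tree) if not tree[c]["children"]]


if __name__ == "__main__":
    # Code assignment is one-to-one: one diagnosis, one code.
    print(ICD10_TREE["J45.909"]["label"])
    # Search is one-to-many: asking for a family returns every member.
    print(expand("J40-J47"))
    print(leaf_codes("J00-J99"))
```

A query for a broad family such as "chronic lower respiratory diseases" therefore resolves to several codes, while an individual record carries just one of them.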

SearchAI for Clinical Data

Our solution, SearchAI, enables generative AI-powered patient search on clinical databases. For example, a physician can select a specific ICD code dataset and, upon entering patient symptoms or diagnoses (e.g., “fever and cough”) into the search bar, the system retrieves relevant ICD codes related to respiratory infections, helping the physician reach an accurate diagnosis more quickly. Unlike simple SQL query generators, SearchAI uses natural language processing (NLP) techniques to interpret complex medical queries. It extracts key medical concepts and retrieves a broad set of relevant medical codes based on an understanding of both context and ontology. Overall, our solution focuses on three core ideas:

Boolean Decomposition: We train small models to decompose patient search queries into the underlying Boolean logic.
Ontology-Aware Navigation: We train hierarchical models to traverse coding ontologies while preserving their structural relationships.
Instance-Specific Tuning: We tune the hierarchies for the specific database instance to narrow down the scope of errors and make the patient search better tailored.

To illustrate, users can ask the following kinds of patient queries, and SearchAI will systematically process them:

Show sepsis patients.
Show patients who are diagnosed with Anemia.
Show patients who have external causes of abnormal reactions for surgical operations.
Show patients with external causes of falls on the same level as slipping.
Show patients with chronic ischemic heart disease.
Show patients who are on psychoactive substance use.
Show patients diagnosed with Type 2 diabetes mellitus with diabetic nephropathy.
Show patients diagnosed with prediabetes who undergo drug abuse counseling.

SearchAI introduces a modern way of interacting with clinical data that is fast and easy. The idea is to save valuable time and arrive at better outcomes for all stakeholders.
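To make the Boolean decomposition idea concrete, the sketch below shows what the decomposed form of a query such as "Show patients diagnosed with prediabetes who undergo drug abuse counseling" might look like and how it could be evaluated against patient records. It is an assumption-laden illustration: the expression classes (`Concept`, `And`, `Or`, `Not`), the `matches` and `search` helpers, and the three patient records are all hypothetical, and the decomposition itself is written by hand here rather than produced by a trained model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Concept:
    """Leaf node: a medical concept already resolved to a set of codes."""
    codes: FrozenSet[str]


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Not:
    operand: "Expr"


Expr = Union[Concept, And, Or, Not]


def matches(expr, patient_codes):
    """Evaluate a Boolean expression against one patient's set of codes."""
    if isinstance(expr, Concept):
        return bool(expr.codes & patient_codes)
    if isinstance(expr, And):
        return matches(expr.left, patient_codes) and matches(expr.right, patient_codes)
    if isinstance(expr, Or):
        return matches(expr.left, patient_codes) or matches(expr.right, patient_codes)
    if isinstance(expr, Not):
        return not matches(expr.operand, patient_codes)
    raise TypeError(f"unsupported expression node: {expr!r}")


def search(expr, patients):
    """Return the sorted IDs of all patients whose codes satisfy the expression."""
    return sorted(pid for pid, codes in patients.items() if matches(expr, codes))


# Hand-written decomposition (an assumption, not model output):
# prediabetes (R73.03) AND drug abuse counseling (Z71.51).
PREDIABETES = Concept(frozenset({"R73.03"}))
COUNSELING = Concept(frozenset({"Z71.51"}))
QUERY = And(PREDIABETES, COUNSELING)

# Made-up patient records for illustration only.
PATIENTS = {
    "p1": {"R73.03", "Z71.51"},
    "p2": {"R73.03"},
    "p3": {"E11.21"},
}

if __name__ == "__main__":
    print(search(QUERY, PATIENTS))
```

Keeping the logic as an explicit expression tree, rather than a generated SQL string, means each leaf concept can be expanded into a family of codes independently before the Boolean operators are applied.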

Hierarchical models

Coding ontologies were originally designed for human interpretation, and automating them using AI requires several adjustments. Below, we briefly describe the hierarchical models for traversing the coding ontology. Readers can refer to the full paper for the detailed algorithms.

Default Hierarchical Predictor: The baseline method follows the standard top-down traversal of the medical code hierarchy. It predicts codes by navigating through parent-child relationships in the original structure. While this approach provides a reasonable starting point and achieves moderate accuracy, it often fails when the hierarchy contains semantic gaps or inconsistent relationships between parent and child nodes.
Hierarchical Flattening: To address the above limitations, we introduced a restructuring technique that adjusts hierarchical depths based on ICD code descriptions. Our method targets ICD codes that are unreachable through conventional top-down traversal, often because their names are semantically disconnected from their parent codes or consist of overly generic, single-word labels. By re-aligning these outlier codes into a more coherent structure, the system can interpret them more accurately.
Hybrid Approach: To further improve flexibility and accuracy, we designed a hybrid method combining structured and unstructured search strategies. It begins with a traditional top-down traversal but dynamically switches to a randomized flat search when the path becomes uninformative or ambiguous. This adaptive mechanism allows the model to escape rigid hierarchical constraints and intelligently jump to more promising starting points, achieving a better balance between precision and coverage.
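The contrast between the default predictor and the hybrid approach can be sketched on a toy ontology. This is a heavily simplified illustration and not the algorithm from the paper: plain token overlap stands in for the trained hierarchical models, the fallback is a deterministic scan over all leaves rather than a randomized flat search, and the names `ONTOLOGY`, `score`, `top_down`, `flat_search`, and `hybrid` are hypothetical.

```python
import re

# Toy ontology: node -> (description, children). Node names are illustrative.
ONTOLOGY = {
    "ROOT": ("all diseases", ["RESP", "CIRC"]),
    "RESP": ("diseases of the respiratory system", ["J45", "J42"]),
    "CIRC": ("diseases of the circulatory system", ["I25"]),
    "J45": ("asthma", []),
    "J42": ("unspecified chronic bronchitis", []),
    "I25": ("chronic ischemic heart disease", []),
}


def tokens(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, node):
    """Token overlap between query and node description (stand-in for a model)."""
    return len(tokens(query) & tokens(ONTOLOGY[node][0]))


def top_down(query, root="ROOT"):
    """Greedy parent-to-child descent; returns None if a step is uninformative."""
    node = root
    while ONTOLOGY[node][1]:
        best = max(ONTOLOGY[node][1], key=lambda child: score(query, child))
        if score(query, best) == 0:
            return None  # no child shares any token with the query
        node = best
    return node


def flat_search(query):
    """Ignore the hierarchy and pick the best-matching leaf directly."""
    leaves = [n for n, (_, children) in ONTOLOGY.items() if not children]
    best = max(leaves, key=lambda leaf: score(query, leaf))
    return best if score(query, best) > 0 else None


def hybrid(query):
    """Try the top-down traversal first, then fall back to the flat search."""
    result = top_down(query)
    return result if result is not None else flat_search(query)


if __name__ == "__main__":
    print(top_down("respiratory chronic bronchitis"))
    print(top_down("asthma"))
    print(hybrid("asthma"))
```

In this toy setup the query "asthma" shares no words with "diseases of the respiratory system", so the top-down descent stalls at the first step even though the right leaf exists. That is the kind of semantic gap between parent and child that the flattened and hybrid variants are designed to work around.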

Our hierarchical models significantly enhance the machine interpretability of medical codes, making them more accurate and accessible.

Results

We evaluated SearchAI on both production and publicly available Medicare fee-for-service (FFS) datasets. We generated search queries for all ICD-10 codes and measured the proportion of cases where SearchAI successfully retrieved the intended ICD-10 code. The figures below show the results.

Figures 1a and 1b: SearchAI accuracy on FFS and production datasets.

The default hierarchical predictor achieves accuracies of only 67.35% and 60% on the FFS and production datasets, respectively. Accuracy improves substantially with the flattened and hybrid variants of the algorithm: 99% and 98.63% on the FFS dataset, and 98.3% and 98.6% on the production dataset. On both datasets, this is a gain of more than 30 percentage points over the default predictor.

SearchAI is also robust to semantic variations in the search queries: when the above queries were rephrased using ChatGPT, it achieved accuracies of 79.86% and 88.23% on the FFS and production datasets, respectively. Latency is low, ranging from a few milliseconds to a hundred milliseconds in the worst case, and SearchAI scales well with dataset size.
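The accuracy figures in this section are hit rates: the share of generated queries for which the intended code appears among the retrieved codes. A minimal sketch of that metric follows. The function `retrieval_accuracy`, the lookup table `TOY_RESULTS`, and the four test cases are hypothetical stand-ins for illustration and are not drawn from the evaluation data.

```python
def retrieval_accuracy(cases, search_fn):
    """Fraction of (query, intended_code) pairs whose intended code is retrieved."""
    if not cases:
        raise ValueError("at least one test case is required")
    hits = sum(1 for query, intended in cases if intended in search_fn(query))
    return hits / len(cases)


# Hypothetical stand-in for the search system: a fixed lookup table.
TOY_RESULTS = {
    "show asthma patients": ["J45", "J45.909"],
    "show sepsis patients": ["A41.9"],
    "show patients with prediabetes": ["R73.03"],
    "show patients with anemia": [],  # deliberately a miss
}


def toy_search(query):
    """Return the canned result list for a query, or an empty list."""
    return TOY_RESULTS.get(query.lower(), [])


# Each case pairs a generated query with the code it was generated from.
CASES = [
    ("Show asthma patients", "J45.909"),
    ("Show sepsis patients", "A41.9"),
    ("Show patients with prediabetes", "R73.03"),
    ("Show patients with anemia", "D64.9"),
]

if __name__ == "__main__":
    print(f"{retrieval_accuracy(CASES, toy_search):.2%}")
```

The same function can be reused for the robustness check by swapping in rephrased queries while keeping the intended codes fixed.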

Extensions

SearchAI also applies to other medical codes. Specifically, we tested the following:

National Drug Codes (NDC): NDCs uniquely identify medications and are essential for pharmacy operations, electronic prescribing, and billing. We are integrating NDC support by fine-tuning our model to understand drug formulations, dosages, and brand/generic variations. With this capability, clinicians and researchers can enter queries like “500mg oral amoxicillin” and receive precise NDC mappings.
Current Procedural Terminology (CPT) codes: CPT codes represent medical procedures and services. Accurate CPT coding is critical for documentation, billing, and reimbursement. We are enhancing SearchAI to accurately interpret procedural language and retrieve the most clinically relevant CPT codes, helping ensure consistency in medical reporting.
Modifier codes: These codes add specificity to CPT codes by capturing variations in how procedures are performed (e.g., bilateral performance, repeat procedures). SearchAI will assist medical professionals in identifying the appropriate modifiers based on procedure context, improving documentation accuracy.
Merit-based Incentive Payment System (MIPS) codes: MIPS codes track performance metrics related to healthcare quality, cost, and patient outcomes. Supporting them allows SearchAI to interpret clinical and policy-driven queries (e.g., “preventive care for chronic illness”) and map them to appropriate MIPS categories.

Parting thoughts

The healthcare industry can benefit from various process-related efficiencies, and SearchAI is a meaningful step in that direction. We demonstrated how generative AI models can help interpret natural language queries, traverse complex medical ontologies, and return accurate, hierarchical search results. We started from ICD codes and extended the approach to other codes, including CPT, MIPS, and modifier codes.

Ultimately, we envision SearchAI as a search engine for clinical data, one that bridges the gap between technical complexity and clinical usability.

For more insights, please see our full research on arXiv:
https://arxiv.org/abs/2505.24090