Terminology Categorization and Normalization

By Mikkel Jønsson Thomsen, Solution Architect

One of the top problems in Health IT is Semantic Interoperability. When working with medical data from many different sources, it is crucial to be able to correctly map between code sets, reference terminologies, classification systems, and proprietary coding systems.

A growing need to integrate and share medical data, both within an organization and between organizations, requires that this data be normalized to the point where everyone agrees on what it means.

Normalization provides a map between many different sources and targets of medical data. For example, an ICD-9 diagnosis is equivalent to one or many (usually a sub-hierarchy of) ICD-10 diagnoses: 729.5 (Pain in limb) is equivalent to M79.609 (Pain in unspecified limb), the latter being a sub-hierarchy containing six specific diagnoses.

The above example is relatively easy to handle because the equivalence between ICD-9 and ICD-10 is described in detail in the General Equivalence Mappings (GEM), provided by the Centers for Medicare & Medicaid Services (CMS)[1]. But as we shall see, normalization in general can be a lot more complex.
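As a minimal sketch of what a direct, one-to-many mapping looks like in code, the snippet below builds a lookup from (source, target) rows. The row layout and the function names are assumptions made for illustration and do not reflect the actual GEM file format; only the 729.5 to M79.609 pair comes from the example above.

```python
from collections import defaultdict

# Toy stand-in for a GEM table: (ICD-9 code, ICD-10 code) rows.
# Assumption: a real GEM file has more columns and many more rows.
GEM_ROWS = [
    ("729.5", "M79.609"),  # Pain in limb -> Pain in unspecified limb
]


def build_forward_map(rows):
    """Group target codes by source code so one source can map to many targets."""
    forward = defaultdict(list)
    for icd9, icd10 in rows:
        forward[icd9].append(icd10)
    return dict(forward)


def map_icd9(code, forward):
    """Return every ICD-10 code mapped from the given ICD-9 code (empty if none)."""
    return forward.get(code, [])


forward_map = build_forward_map(GEM_ROWS)
print(map_icd9("729.5", forward_map))
```

Returning a list even for a single match keeps the caller's logic the same whether a source code maps to one target or to several.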

To define normalization, we will use two typing schemes: managed versus unmanaged, and direct versus indirect. The former describes whether the normalization is externally or internally managed; for example, the General Equivalence Mappings (GEM) are externally managed. The latter describes whether the mapping is a direct equivalence or an indirect relationship. An example of an indirect relationship is the one between the National Drug File – Reference Terminology (NDF-RT)[2] and the National Drug Code Directory (NDC)[3] for prescription drug categorization, which we will discuss later. If we draw a graph with managed/unmanaged on the x-axis and direct/indirect on the y-axis, the GEM sit in the managed, direct corner and drug categorization in the managed, indirect corner.
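The two axes can be sketched as a small data structure. This is only an illustration: the class and field names are hypothetical, and it assumes that "managed" means externally managed, as in the GEM example.

```python
from dataclasses import dataclass
from enum import Enum


class Management(Enum):
    MANAGED = "managed"      # assumption: maintained by an external body
    UNMANAGED = "unmanaged"  # assumption: maintained inside the organization


class Directness(Enum):
    DIRECT = "direct"      # code-to-code equivalence
    INDIRECT = "indirect"  # reached by following relationships


@dataclass(frozen=True)
class MappingSystem:
    name: str
    management: Management
    directness: Directness


# The two examples placed on the graph in the text.
GEM = MappingSystem("ICD-9 to ICD-10 GEM", Management.MANAGED, Directness.DIRECT)
DRUG_CATEGORIZATION = MappingSystem(
    "NDC to NDF-RT drug categorization", Management.MANAGED, Directness.INDIRECT
)


def select(systems, management, directness):
    """Return the mapping systems that sit in the requested corner of the graph."""
    return [
        s for s in systems
        if s.management is management and s.directness is directness
    ]


candidates = [GEM, DRUG_CATEGORIZATION]
print([s.name for s in select(candidates, Management.MANAGED, Directness.DIRECT)])
```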

So why go to such lengths to classify normalization types for medical coding systems?

The normalization types are used to categorize mapping systems and evaluate their suitability for a given use case. For example, when processing claims for Medicare/Medicaid, it is important to have a standardized, managed mapping system that holds direct mappings between codes (procedures, diagnoses, etc.), which is exactly how the GEMs are structured. However, when doing population health analysis on medical records, there is greater value in being able to "group" diagnoses, procedures, medication prescriptions, or even lab results according to their type, usage, equipment, ingredient, etc.

This can be done using several methods. A common one is categorization by hierarchy: codes in a hierarchically structured coding system can be grouped by their common parents. Other examples include the attribute relationship system of SNOMED CT (such as the finding site of a diagnosis), or relationships to a meta coding system such as RxNorm from the NLM[4] for prescription drug vocabularies. Let's dive further into the prescription drug use case:
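Categorization by hierarchy can be sketched with a child-to-parent table. The fragment below is hand-written for illustration, on the assumption that each ICD-10 code rolls up to the code obtained by dropping its last character; it is not loaded from an official release, and the function names are hypothetical.

```python
# Hand-written child -> parent fragment, for illustration only.
PARENT = {
    "M79.601": "M79.60",
    "M79.609": "M79.60",
    "M79.60": "M79.6",
    "M79.6": "M79",
}


def ancestors(code, parent=PARENT):
    """Return the ancestors of a code, nearest parent first, root last."""
    chain = []
    while code in parent:
        code = parent[code]
        chain.append(code)
    return chain


def nearest_common_parent(a, b, parent=PARENT):
    """Return the closest ancestor shared by both codes, or None if there is none."""
    seen = set(ancestors(a, parent))
    for candidate in ancestors(b, parent):
        if candidate in seen:
            return candidate
    return None


print(nearest_common_parent("M79.601", "M79.609"))
```

Two codes fall into the same group whenever their ancestor chains meet at the category chosen for the analysis; picking a higher ancestor gives a coarser grouping.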

Case Study: Prescription Drugs

Prescription drug normalization is a frequently used example of medical code normalization, because the field is highly fragmented in terms of standardization. The lack of a central authority means that no clear nomenclature exists for drugs, so they are often named by ingredient, vendor, quantity, etc., in several different sources. This ambiguity makes it hard to maintain a clear, concise database of the drugs in use, which is needed in today's Health IT.

The most ambitious attempt to remedy this is RxNorm. It is a collection of clinical drugs from 15 different drug vocabularies and coding systems, structured into a database to provide a common coding system for clinical drugs.

By mapping NDC codes to RxNorm entries, and using these to locate the drug in the NDF-RT, we can impose a structure on the otherwise unstructured NDC. With this structure, we can group by hierarchy in both RxNorm and NDF-RT.

Let's say that we are analyzing data from two sources of clinical data and want to analyze the usage of ACE inhibitors for hypertension, e.g. Captopril and Benazepril. System A delivers a prescription drug list that includes Captopril 12.5mg with RxNorm code 308963; system B delivers a list containing Benazepril 10mg with NDC code 00093-5125-05. To correctly assert that both drugs are hypertension treatments, we use the relationship from NDC 00093-5125-05 (Benazepril) to NDF-RT code N0000161525, which is included in the hierarchy under ACE inhibitors, and the relationship from RxNorm 308963 (Captopril) to NDF-RT code N0000165544, which is also included in the hierarchy under ACE inhibitors.

Here, the normalization task involves following several relationships and analyzing the hierarchical structure of a reference data set, because no direct map exists between the specific drugs and the category of ACE inhibitors.
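The walk described above can be sketched as two lookups followed by a climb up the hierarchy. The codes are the ones quoted in the case study; everything else is an assumption made for illustration: the table layout, the function name, the `ACE_INHIBITORS` placeholder (the category's real NDF-RT identifier is not given here), and the simplification that each drug concept sits directly under the category.

```python
# (coding system, code) -> NDF-RT concept, as quoted in the case study.
TO_NDFRT = {
    ("RxNorm", "308963"): "N0000165544",      # Captopril 12.5mg, system A
    ("NDC", "00093-5125-05"): "N0000161525",  # Benazepril 10mg, system B
}

# Hypothetical placeholder for the ACE inhibitor category node.
ACE_INHIBITORS = "ACE_INHIBITORS"

# Assumed child -> parent links; the real hierarchy may have levels in between.
NDFRT_PARENT = {
    "N0000165544": ACE_INHIBITORS,
    "N0000161525": ACE_INHIBITORS,
}


def is_in_category(system, code, category):
    """Follow the code to its NDF-RT concept, then climb until the category is found."""
    concept = TO_NDFRT.get((system, code))
    while concept is not None:
        if concept == category:
            return True
        concept = NDFRT_PARENT.get(concept)
    return False


prescriptions = [("RxNorm", "308963"), ("NDC", "00093-5125-05")]
print(all(is_in_category(s, c, ACE_INHIBITORS) for s, c in prescriptions))
```

Keying the lookup on the pair of coding system and code lets records from both source systems pass through the same function without being converted to a single code set first.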

[1] http://www.cms.gov/Medicare/Coding/ICD10/

[2] http://www.nlm.nih.gov/research/umls/sourcereleasedocs/current/NDFRT/

[3] http://www.fda.gov/Drugs/InformationOnDrugs/ucm142438.htm

[4] http://www.nlm.nih.gov/research/umls/rxnorm/
