CCMapper: An adaptive NLP-based free-text chief complaint mapping algorithm

Objective

The chief complaint (CC) is among the earliest pieces of health information recorded at the start of a patient’s visit to an emergency department (ED). We propose a heuristic methodology for automatically mapping free-text CC data into a structured list of CCs.

Methods

A comprehensive structured list categorizing CCs was developed by experienced Emergency Medicine (EM) physicians. Using this list, we developed a natural language processing-based algorithm, referred to as Chief Complaint Mapper (CCMapper), that automatically maps a CC into the most appropriate category or categories. We trained and validated CCMapper using free-text CC data from the Mayo Clinic ED in Rochester, MN. For validation, two EM physicians manually mapped a random sample of free-text CCs into categories within the structured list, and we developed a consensus-based approach to handle both indifferences and disagreements between them.
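To make the mapping task concrete, the following is a minimal sketch of dictionary-based mapping from free text to categories. It is not CCMapper's actual algorithm, which is not detailed here. The category names, trigger phrases, and the functions `normalize` and `map_chief_complaint` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical structured list: category -> trigger phrases.
# These entries are illustrative assumptions, not the physicians' list.
CATEGORY_KEYWORDS = {
    "Chest pain": ["chest pain", "cp"],
    "Shortness of breath": ["shortness of breath", "sob", "dyspnea"],
    "Abdominal pain": ["abdominal pain", "abd pain", "stomach ache"],
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def map_chief_complaint(free_text):
    """Return every category whose trigger phrase appears as whole words."""
    # Pad with spaces so that "cp" does not match inside a longer token.
    padded = f" {normalize(free_text)} "
    matches = []
    for category, phrases in CATEGORY_KEYWORDS.items():
        if any(f" {normalize(phrase)} " in padded for phrase in phrases):
            matches.append(category)
    return matches


print(map_chief_complaint("CP and SOB x2 days"))
```

A single free-text entry can return several categories, which mirrors the multiple-CC case evaluated separately in the Results. An entry that matches no phrase returns an empty list and would need manual review or a new dictionary entry.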

Results

The two physicians showed a high level of agreement (κ = 0.958), with less than 2% human error. On the validation set, CCMapper achieved a total sensitivity of 94.2%, a specificity of 99.8%, and an F-score of 94.7%. For free-text data containing multiple CCs, CCMapper achieved a sensitivity of 82.3%, a specificity of 99.1%, and a total F-score of 82.3%.
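For readers less familiar with these measures, the sketch below shows the standard definitions of Cohen's kappa, sensitivity, specificity, and F-score. It assumes one label per rater per item and simple confusion-matrix counts; how the totals reported above were aggregated across categories is not stated here. The function names and example inputs are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who each assign one label per item."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and F-score from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Made-up labels and counts, for illustration only.
print(cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"]))
print(classification_metrics(tp=90, fp=10, tn=890, fn=10))
```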

Conclusion

Due to its simplicity, high performance, and capability of incorporating new free-text CC data, CCMapper can be readily adopted by other EDs to support clinical decision making. CCMapper can also facilitate the development of predictive models for the type and timing of important events in the ED (e.g., ICU admission).