Data Mining Electronic Health Records to Support Evidence-Based Clinical Decisions

This study investigated the extent to which data mining of electronic health records is used to support evidence-based clinical decisions, the reasons why only a few healthcare institutions integrate it into the clinical workflow, and ways to increase its use in actual clinical practice. A literature review was conducted to identify studies in which data mining applications were used, particularly in radiation oncology, critical care, in-hospital mortality prediction, pharmaceutically treated depression, visualization of clinical event patterns, and diabetes research. For each study reviewed, the objectives, the data mining methodology, the procedure for integrating the clinical decision support system (CDSS) into the clinical workflow, and the issues encountered and their resolutions were analyzed and documented. A brief description of the required infrastructure, including policies and procedures to ensure smooth integration and deployment of data mining applications in healthcare, was also documented whenever available. Clinical data mining is used mostly to gain new insights and to make predictions, risk assessments, and recommendations. Many studies find CDSSs a good learning environment. Issues with data preprocessing, class imbalance, feature engineering, and performance evaluation were mentioned. Other issues include the need for more active collaboration with stakeholders, access to anonymized clinical data, the formulation of interoperability standards, the reevaluation of results using data from other institutions, and, finally, the resolution of ethical and legal issues. Although clinical data mining is still in its infancy, the experience of early adopters has been promising and should encourage more research.