Machine Learning for Data Governance in a Hospital Setting
The widespread adoption of electronic health record (EHR) systems has made large amounts of data available for reporting and other analytics. Health organizations routinely generate thousands of such analytical artifacts on a daily or weekly basis.
This continuous stream of analytical insights often taxes the infrastructure and burdens the supporting IT staff. To increase the value of the available analytics and potentially reduce the number of such artifacts, it is important to answer the following questions:
- Which clinical, operational, or financial metrics are associated with a given report or dashboard?
- Is a given report a complete or partial duplicate of another report?
- Do these reports refer to the same, correct data elements in the source EHR?
- Who are the right consumers for these insights in the organization, and who should have the rights to access them?
Answering these questions is challenging given the volume of data involved and the constant appearance of new artifacts.
A leading healthcare system in the United States approached us for help. For this organization we developed a searchable catalog that contains over 100,000 analytical insights and helps to organize and classify them automatically. We use machine learning algorithms to detect duplicates and similarities among the reports and to assign the right metrics and portfolio membership. These processes continue to evolve and improve with minimal human involvement as new analytical insights are developed.
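To make the duplicate-detection idea concrete, here is a minimal sketch in Python. It is an illustration only, not the production system (which the abstract says was built in C#), and the abstract does not say which algorithms were used. The sketch assumes each report can be reduced to the set of data elements it references and scores pairs of reports by Jaccard similarity. The report records, field names, threshold, and function names are all hypothetical.

```python
from itertools import combinations


def field_set(report):
    """Normalize the data-element names a report references."""
    return {name.strip().lower() for name in report["fields"]}


def jaccard(a, b):
    """Jaccard similarity: |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_similar_reports(reports, threshold=0.8):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    sets = {r["id"]: field_set(r) for r in reports}
    pairs = []
    for a, b in combinations(sorted(sets), 2):
        score = jaccard(sets[a], sets[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# Hypothetical catalog entries, invented for illustration only.
reports = [
    {"id": "R1", "fields": ["patient_id", "admit_date", "discharge_date", "los_days"]},
    {"id": "R2", "fields": ["Patient_ID", "Admit_Date", "Discharge_Date", "LOS_Days"]},
    {"id": "R3", "fields": ["patient_id", "admit_date", "discharge_date", "payer"]},
    {"id": "R4", "fields": ["invoice_id", "payer", "amount"]},
]

for a, b, score in find_similar_reports(reports, threshold=0.5):
    print(a, b, round(score, 2))
```

In this toy data, R1 and R2 differ only in letter case and score 1.0 (a complete duplicate), while R3 shares three of five distinct fields with each of them and scores 0.6 (a partial duplicate). Comparing every pair is quadratic in the number of reports, so a catalog of over 100,000 items would likely need an approximate technique such as MinHash with locality-sensitive hashing, which is beyond this sketch.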
Once deployed in production, this system helped reduce the analytical inventory by nearly 20%, cut the turnaround time for producing new reports nearly in half, and significantly improved the quality of analytics for the organization.
This infrastructure was developed using the Microsoft technology stack: .NET/MVC, Entity Framework, C#, and SQL Server.