Big Data in The Medical Field

The Client  

Our client is one of the most famous US universities and a member of the Ivy League.

The Product

The product is an international e-platform for gathering, processing, managing, analyzing, and interpreting big data in the medical sphere. The platform gathers and normalizes dictionaries (currently 72), case records, and treatment practices into a consolidated, interconnected knowledge warehouse.

The platform's audience and users are doctors (who gain access to the knowledge and practices of the international medical community), pharmaceutical companies (which conduct research and compile statistics on medicinal drugs), and insurance companies (which assess disease characteristics for insured events).

The project is supervised by Columbia University. Medical education and healthcare institutions from the USA, Western Europe, and Japan are involved in the program. The system is built on open-source software.

Project Development

First Line Software joined the project in February 2014. Our team develops several modules for our client. Vocabulary-v5.0 is a directory of medical dictionaries and reference books. Our team designed the storage structure and the tools that standardize dictionaries automatically and unite them by means of cross-references.
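The idea of uniting standardized dictionaries through cross-references can be illustrated with a minimal sketch. It is not the project's actual storage structure (which the case study places in Oracle); the names Concept, CrossReferenceIndex, and normalize_code, as well as the dictionary names and codes, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One entry of a source dictionary (all field names are assumptions)."""
    vocabulary: str  # name of the source dictionary
    code: str        # code of the term inside that dictionary
    name: str        # human-readable term


def normalize_code(raw):
    """Bring a raw code to one spelling: trimmed, upper case, no spaces."""
    return raw.strip().upper().replace(" ", "")


class CrossReferenceIndex:
    """Symmetric links between equivalent concepts of different dictionaries."""

    def __init__(self):
        # (vocabulary, code) -> set of linked (vocabulary, code) keys
        self._links = {}

    def link(self, a, b):
        key_a = (a.vocabulary, a.code)
        key_b = (b.vocabulary, b.code)
        self._links.setdefault(key_a, set()).add(key_b)
        self._links.setdefault(key_b, set()).add(key_a)

    def equivalents(self, vocabulary, code):
        """Return the linked concepts as a sorted list of (vocabulary, code)."""
        return sorted(self._links.get((vocabulary, code), set()))
```

With such an index, a term found in one dictionary can be looked up in any other dictionary it has been linked to, for example index.equivalents("DICT_A", "A-001"). At the scale of millions of entries, a real system would keep these links in database tables instead of in memory.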

As part of the project, we work with OMOP CDM (Common Data Model). This is an application for importing and standardizing patients' case records, in which medical procedures and their results are recorded. Any hospital will be able to submit these data to our client in anonymized form. The application gathers the materials, converts them into a uniform format, and analyzes them. Later, the processed data will be used for research on medicinal drugs and treatment methods for various diseases. The First Line Software team also developed editing rules and rules for creating cross-references between dictionaries and case records.
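Converting submitted records into a uniform format might look roughly like the sketch below. It assumes that each hospital sends records with its own field names and local procedure codes; the function to_uniform_record, the field names, and the codes are hypothetical and do not reproduce the real OMOP CDM schema.

```python
def to_uniform_record(raw, field_map, code_map):
    """Convert one hospital record into a uniform shape.

    raw       -- record as submitted, keyed by the hospital's own field names
    field_map -- hospital field name -> uniform field name (assumed per source)
    code_map  -- local procedure code -> standard code (assumed per source)
    """
    record = {}
    for source_field, value in raw.items():
        target_field = field_map.get(source_field)
        if target_field is not None:
            # Fields without a mapping are dropped from the uniform record.
            record[target_field] = value

    # Replace the local code with the standard one; None marks an unmapped code
    # so that it can be reviewed instead of being silently guessed.
    local_code = record.pop("local_procedure_code", None)
    record["procedure_code"] = code_map.get(local_code)
    return record
```

Keeping the field map and the code map as data, separate from the conversion logic, means a new hospital can be connected by supplying new mappings without changing the code.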

The second module is ATHENA (Automated Terminology Harmonization, Extraction, and Normalization for Analytics). The solution automates and controls the building of the dictionaries. It tracks the latest editions of dictionaries and automatically applies the changes. In addition, our team developed a user interface for ATHENA to moderate and administer the system content. This allows the medical community that has formed around our client (experts, university employees, medical practitioners) to conveniently submit corrections and comments to the dictionary library.
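Tracking a new edition of a dictionary and applying its changes can be sketched as a comparison of two editions. This is only an illustration of the idea, not ATHENA's actual algorithm; the functions diff_editions and apply_changes are hypothetical, and each edition is assumed to be a simple mapping from code to term name.

```python
def diff_editions(old, new):
    """Compare two editions given as {code: name} mappings."""
    added = {code: name for code, name in new.items() if code not in old}
    removed = {code: name for code, name in old.items() if code not in new}
    renamed = {
        code: (old[code], new[code])
        for code in old.keys() & new.keys()
        if old[code] != new[code]
    }
    return added, removed, renamed


def apply_changes(current, added, removed, renamed):
    """Return a copy of the current dictionary with the changes applied."""
    updated = dict(current)
    for code in removed:
        updated.pop(code, None)
    updated.update(added)
    for code, (_old_name, new_name) in renamed.items():
        updated[code] = new_name
    return updated
```

Computing the difference first, instead of overwriting the stored dictionary, leaves a list of additions, removals, and renames that moderators could review before the changes take effect.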

The main challenge of the project is the volume of data accumulated by the system. The dictionary tables are measured in millions of entries. To build the dictionary model, it is necessary to establish various connections between the concepts (terms) described in these dictionaries.

The Technological Stack

  • Java back end with the Spring Framework; Bootstrap and Marionette.js front end
  • Database: Oracle (PL/SQL)
