Development of an E-platform for Big Data Collection and Processing in the Healthcare Field

Big Data in the Medical Field

The Client  

Our client is one of the best-known US universities and a member of the Ivy League.

The Product

The product is an international e-platform for gathering, processing, managing, analyzing, and interpreting big data in healthcare. The platform gathers and normalizes dictionaries (currently 72), case records, and treatment practices into a consolidated, interconnected knowledge warehouse.

The platform's audience and user groups are:

  • Doctors, who gain access to the knowledge and practices of the international medical community.
  • Pharmaceutical companies, which conduct research and compile statistics on medicinal drugs.
  • Insurance companies, which obtain assessments of disease characteristics for insured events.

The project is supervised by Columbia University. Medical education and healthcare institutions from the USA, Western Europe, and Japan are involved in the program. The system is built on open-source software.

Project Development

The First Line Software team developed several modules for the client. 


Vocabulary-v5.0 is a directory of medical dictionaries and reference books. Our team designed the storage structure and the tools that standardize dictionaries automatically and link them to one another through cross-references.
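
As a rough illustration of what standardizing dictionaries and linking them through cross-references can involve, here is a minimal Python sketch. The `Entry` layout, the `normalize_entry` and `build_cross_references` names, and the sample vocabularies and codes are all assumptions made for this example; they are not taken from the Vocabulary-v5.0 implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary entry in a common, source-independent shape (hypothetical)."""
    vocabulary: str
    code: str
    name: str


def normalize_entry(vocabulary: str, code: str, name: str) -> Entry:
    """Trim whitespace, upper-case the code, and collapse spacing in the name."""
    clean_name = " ".join(name.split()).lower()
    return Entry(vocabulary.strip(), code.strip().upper(), clean_name)


def build_cross_references(entries):
    """Link entries from different vocabularies that share a normalized name."""
    by_name = defaultdict(list)
    for entry in entries:
        by_name[entry.name].append(entry)

    links = []
    for group in by_name.values():
        for i, left in enumerate(group):
            for right in group[i + 1:]:
                if left.vocabulary != right.vocabulary:
                    links.append((left, right))
    return links


# Illustrative sample rows only; the vocabulary names and codes are made up.
raw_rows = [
    ("DictA", " a10 ", "Type 2  Diabetes"),
    ("DictB", "x-77", "type 2 diabetes"),
    ("DictB", "x-78", "Hypertension"),
]
entries = [normalize_entry(*row) for row in raw_rows]
links = build_cross_references(entries)
```

Matching on identical names is a deliberate simplification to keep the sketch short; a production system would more likely rely on curated mapping rules than on string equality.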

As part of the project, we work with the OMOP CDM (Common Data Model). This is an application for importing and standardizing patient case records, in which medical procedures and results are recorded. Any hospital will be able to submit this data to our client in anonymized form. The application gathers the materials, converts them to a uniform format, and analyzes them. The processed data will later be used for research on medicinal drugs and treatment methods for various diseases. In addition, the First Line Software team developed editing rules and rules for creating cross-references between dictionaries and case records.
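
The conversion of hospital records into a uniform format could look roughly like the Python sketch below. It assumes records arrive already anonymized, as described above. The field names, the `CODE_TO_CONCEPT` lookup, the `to_uniform_record` function, and the use of 0 for an unmapped code are assumptions made for illustration, not the project's actual schema or code.

```python
from datetime import datetime

# Hypothetical lookup from (source vocabulary, source code) to a standard concept id.
CODE_TO_CONCEPT = {
    ("DictA", "A10"): 1001,
    ("DictB", "X-77"): 1001,
}


def to_uniform_record(raw: dict, date_format: str) -> dict:
    """Convert one hospital-specific row into a uniform record.

    `date_format` is the strptime pattern the submitting hospital uses.
    """
    code = raw["code"].strip().upper()
    event_date = datetime.strptime(raw["date"], date_format).date()
    return {
        "person_key": raw["person_key"],  # already anonymized by the hospital
        "concept_id": CODE_TO_CONCEPT.get((raw["vocabulary"], code), 0),  # 0 = unmapped
        "event_date": event_date.isoformat(),
        "source_code": code,
    }


# Two hospitals reporting the same diagnosis with different codes and date formats.
record_a = to_uniform_record(
    {"person_key": "p-001", "vocabulary": "DictA", "code": "a10", "date": "15/01/2020"},
    "%d/%m/%Y",
)
record_b = to_uniform_record(
    {"person_key": "p-002", "vocabulary": "DictB", "code": " x-77 ", "date": "2020-01-15"},
    "%Y-%m-%d",
)
```

After conversion both records carry the same concept id and the same ISO date, so they can be analyzed together regardless of which hospital supplied them.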


The second module is ATHENA (Automated Terminology Harmonization, Extraction, and Normalization for Analytics). The solution automates and controls the building of dictionaries. It tracks the latest editions of the dictionaries and applies their changes automatically. Our team also developed the ATHENA user interface, which is used to moderate and administer the system content. Through it, the medical community that has formed around our client (experts, university employees, and medical practitioners) can conveniently submit corrections and comments to the dictionary library.
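
To make the idea of tracking dictionary editions more concrete, the following Python sketch compares two editions of one dictionary and reports what changed. The `diff_editions` function and the sample data are hypothetical and only illustrate the general approach; they do not describe how ATHENA itself is implemented.

```python
def diff_editions(old: dict, new: dict) -> dict:
    """Compare two editions of a dictionary, each given as {code: name}."""
    added = {code: name for code, name in new.items() if code not in old}
    removed = {code: name for code, name in old.items() if code not in new}
    renamed = {
        code: (old[code], new[code])
        for code in old.keys() & new.keys()
        if old[code] != new[code]
    }
    return {"added": added, "removed": removed, "renamed": renamed}


# Made-up editions of a single dictionary.
edition_1 = {"A10": "type 2 diabetes", "B20": "hypertension", "C30": "asthma"}
edition_2 = {"A10": "type 2 diabetes mellitus", "B20": "hypertension", "D40": "migraine"}

changes = diff_editions(edition_1, edition_2)
```

A change set of this kind can be applied to the stored dictionary automatically or queued for review by a moderator in the user interface.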

The main challenge of the project is the volume of data accumulated by the system. The dictionary tables contain millions of entries. To build the dictionary model, various connections must be established between the concepts (terms) described in these dictionaries.

The Technology Stack

  • Back end: Java, Spring Framework; front end: Bootstrap, Marionette.js
  • Database: Oracle (PL/SQL)

Learn more about First Line Software’s Healthcare Practice and Big Data Engineering Services.
