Big Data

Extracting Actionable Insights

Deriving actionable insights from Big Data to facilitate business decision-making in real time is what delivers the greatest value for companies. While many companies have achieved the basics in terms of data management and analytics, fewer have been able to put the processes and tools in place for utilizing analytics to elevate the value delivered to customers and inform their digital transformation initiatives.

Data Engineer or Data Scientist or Both?

Further complicating their efforts is some confusion over the necessary roles and responsibilities. Companies have invested in hiring Data Scientists only to find that there are still gaps that can only be filled by Big Data Engineers. Big Data Engineers prepare the Big Data infrastructure, make sure data is easily accessible, and optimize the performance of the Big Data ecosystem. The primary function of a Data Scientist is to analyze the integrated data and identify valuable, actionable insights.

Leave the Heavy Lifting to First Line

The Big Data Engineering team at First Line Software has accumulated extensive experience and expertise in managing data on a physical infrastructure level. We have also built up a team of highly trained experts who analyze the relevant data and draw value from it. Our Big Data Engineering team helps companies stay ahead of the curve.

The First Line business model is similar to the data-as-a-service model: think Big-Data-as-a-Service (BDaaS). Adopting this model relieves companies of the upfront costs of managing large quantities of data and of building a team with the expertise needed to operationalize Big Data and convert it into a business asset.

Our Work

First Line has produced a number of heavy-load processing systems and data storage solutions associated with Big Data initiatives.

Two Financial Systems

  • First Line built a heavy-load system to verify the taxes reported by suppliers and buyers of goods and services. Big Data analysis runs on a 16-host cluster of virtual servers with 20 cores each, using a Hadoop + HBase + Hive/Impala stack, and the system stores up to 3 billion active records.
  • We designed an online financial system to collect and analyze data from retail sales transactions processed through point-of-sale (POS) terminals. A server cluster accepts the POS transactions and stores the data in a Redis-based cluster. The data is then transferred securely to a PostgreSQL database. The peak load is 60,000 transactions per second, and the rated storage capacity for 4 years is 800 billion records.
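
To put the POS system's figures in perspective, the rated capacity can be converted into an average ingest rate and compared with the stated peak. The sketch below is only illustrative arithmetic: it assumes one stored record per transaction and 365-day years, neither of which is stated above, and the function and constant names are made up for the example.

```python
# Back-of-the-envelope check of the POS figures quoted above.
# Assumptions (not stated in the text): one stored record per
# transaction, and years of exactly 365 days.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

PEAK_TPS = 60_000        # stated peak load, transactions per second
RATED_RECORDS = 800e9    # stated rated capacity, records
RATED_YEARS = 4          # stated period for the rated capacity


def average_rate(total_records: float, years: float) -> float:
    """Average records per second needed to reach total_records in years."""
    return total_records / (years * SECONDS_PER_YEAR)


avg = average_rate(RATED_RECORDS, RATED_YEARS)
peak_to_average = PEAK_TPS / avg

print(f"average rate: {avg:,.0f} records/s")
print(f"peak-to-average ratio: {peak_to_average:.1f}x")
```

Under these assumptions the rated capacity corresponds to an average of roughly 6,300 records per second, so the stated peak is a little over nine times the average rate.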

Consumer Behavior Monitoring and Advertising at Retail Locations

  • First Line produced a system for processing data collected via Wi-Fi sensors installed at retail stores in a shopping mall. The system tracks Wi-Fi signals from shoppers’ devices to monitor foot traffic and sales at individual retail locations. Statistics on more than 28 million unique MAC addresses were collected over a two-year period, and over 5 billion records are stored.

Railway Data Warehouse System

  • The First Line team created an online system that addresses the complexities of railway travel and offers dynamic fare pricing, built with Java, MQ, MongoDB, and Hazelcast.
  • As the next step, First Line developed and implemented a corporate data warehouse using IBM hardware and software (DataStage, MQ, SPSS, and Cognos), with an average OLAP cube size of 600 MB.

Healthcare – Assimilating Data Based on a Vocabulary System

  • First Line created a unified medical language and vocabulary system to link thousands of medical terms and methodologies with the same meaning, from countries around the world. This functionality expanded the volume of Big Data that could be included in studies, increasing the overall value of the derived insights.


We used the most popular technology stack to solve the customer’s data challenges and carry out Big Data processing and analysis tasks:

  • Hadoop/HDFS stack with the HBase non-relational distributed database for customer data storage
  • Batch processing on Hive for data aggregation and querying, and as a source for analysis
  • Cloudera Impala query engine for low-latency querying of the customer’s data

If your company is ready to embark on a Big Data initiative and you need to rely on experts to get the job done, please contact us today for a free consultation and project estimate.


