Case study - Genomics Data Analysis Solution - Sigma IT

Solution for Genomics Data Analysis

Life Sciences – Case Study

For A Better Understanding of Human Genomes

The Client is a multinational, science-led biopharmaceutical company focused on developing life-changing medicines (NDA).

The goal of the project was to support the Client's Center for Genomic Research unit in genome data processing. The Client set an ambitious target of processing 2 million genome samples by the end of 2026, and to achieve this, a dedicated solution was built in the AWS cloud. The solution combines infrastructure and software technologies into a pipeline. Our team is responsible for maintaining and developing it to ensure its effectiveness.

Our Services

Our engineering team maintains the genomics data analysis solution: we keep it working properly, fix bugs in the data-processing logic and algorithms, and deliver new features that retrieve even more valuable data from the samples submitted to the pipeline. Our goal is to make it easier for scientists to perform further analysis and to discover mutations and correlations that can only be found at this scale with this toolset.

Our services include:
DevOps services – maintaining the cloud infrastructure, proactively eliminating external risks (e.g. those coming from the cloud provider), building and proposing improvements, and proposing and preparing the architecture of sub-solutions. The whole solution is cloud-native (AWS) and serverless. In most cases we use a few core AWS technologies: Step Functions, Lambda, queues, and Batch. From this set we can build almost any processing engine.

Development – implementing bug fixes, changes, and new features depending on the stakeholders' needs.

Cooperation with third parties – we work as one team in a Kanban model. On a daily basis, about 20 people from different parties take part, supporting each other on equal terms in the tasks assigned by the business.
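To illustrate how the building blocks named under DevOps services (Step Functions, Lambda, queues, Batch) can be combined into a processing engine, here is a minimal sketch in Python that assembles an Amazon States Language definition. It is not the Client's actual state machine: the state names, the job name, and every ARN and URL passed in are hypothetical, and the use of SQS for the queue step is an assumption.

```python
import json


def build_state_machine(lambda_arn, job_queue_arn, job_definition_arn, queue_url):
    """Return an Amazon States Language definition as a plain dict.

    All arguments are hypothetical placeholders, not values from the case study.
    """
    return {
        "Comment": "Illustrative processing engine: Lambda -> Batch -> queue",
        "StartAt": "PrepareSample",
        "States": {
            # Lightweight step: a Lambda function prepares the input.
            "PrepareSample": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": lambda_arn, "Payload.$": "$"},
                "OutputPath": "$.Payload",
                "Next": "RunHeavyJob",
            },
            # Heavy step: an AWS Batch job; ".sync" waits for the job to finish.
            "RunHeavyJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": "process-sample",  # hypothetical job name
                    "JobQueue": job_queue_arn,
                    "JobDefinition": job_definition_arn,
                },
                "ResultPath": "$.batch",
                "Next": "NotifyDownstream",
            },
            # Hand-off step: publish the result to a queue (SQS assumed).
            "NotifyDownstream": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sqs:sendMessage",
                "Parameters": {"QueueUrl": queue_url, "MessageBody.$": "$"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_state_machine(
        "arn:aws:lambda:eu-west-1:123456789012:function:prepare-sample",
        "arn:aws:batch:eu-west-1:123456789012:job-queue/example-queue",
        "arn:aws:batch:eu-west-1:123456789012:job-definition/example-job:1",
        "https://sqs.eu-west-1.amazonaws.com/123456789012/example-results",
    )
    print(json.dumps(definition, indent=2))
```

The resulting JSON could be supplied as the definition of a Step Functions state machine, for example through Terraform. Keeping it as a plain dict makes the chain of states easy to check before deployment.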


The Genomics Data Analysis Solution implements multiple architectural patterns and is designed to provide a streamlined process for analyzing genome samples. It is an orchestrated pipeline: data (genome samples) is fed in at one end and processed, and structured output comes out at the other.

It consists of multiple internal steps, but the genomic data processing logic can be divided into the following stages:
– Validation: data submitted to the pipeline is validated before processing.
– Ingestion: data from various sources is ingested into the pipeline for processing.
– Secondary Analysis: licensed software is used to refine the data further.
– Tertiary Analysis: more detailed variant analysis generates valuable data in an easy-to-understand format.
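The four stages above can be sketched as a simple ordered pipeline in Python. This is only an illustration of the orchestration idea, assuming hypothetical names throughout (`Sample`, the stage functions, and the `history` field); the real stages run on AWS services and licensed software that are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Hypothetical stand-in for a genome sample moving through the pipeline."""
    sample_id: str
    payload: bytes
    history: list = field(default_factory=list)


def validate(sample):
    # Reject obviously unusable input before any further work is done.
    if not sample.sample_id or not sample.payload:
        raise ValueError("sample must have an id and non-empty data")
    sample.history.append("validation")
    return sample


def ingest(sample):
    # Placeholder: the real stage would store the data for processing.
    sample.history.append("ingestion")
    return sample


def secondary_analysis(sample):
    # Placeholder: the real stage runs licensed software.
    sample.history.append("secondary_analysis")
    return sample


def tertiary_analysis(sample):
    # Placeholder: the real stage performs detailed variant analysis.
    sample.history.append("tertiary_analysis")
    return sample


# Stage order follows the list in the text.
STAGES = [validate, ingest, secondary_analysis, tertiary_analysis]


def run_pipeline(sample):
    """Pass the sample through every stage in order and return the result."""
    for stage in STAGES:
        sample = stage(sample)
    return sample


if __name__ == "__main__":
    result = run_pipeline(Sample("S-001", b"ACGT"))
    print(result.history)
```

Because each stage takes a sample and returns one, a stage can be replaced or a new one inserted without touching the others, which is the property an orchestrated pipeline relies on.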

Impact: Faster and Better Genetic Disease Therapies

The Genomics Data Analysis solution enables the processing of vast amounts of data, resulting in a wealth of valuable information that scientists can easily visualize and comprehend.

This information provides insights into genetic correlations (e.g., between mutations and specific diseases), leading to new ideas on how to prevent or treat mutations. This research may ultimately result in the development of novel therapies and drugs to address genetic diseases.

Core Technologies

– Python
– React
– Terraform
– Kubernetes
– Node.js
– SQL

