Transforming Clinical Research: Metric Tree Labs’ Data Labeling Service for a Healthcare Startup

A top healthcare startup based in New York, USA, and an existing client of Metric Tree Labs, has a mission to make clinical research accessible to all by improving transparency and diversity in clinical trials. The startup was building an AI model and a data pipeline to process health records and data obtained from different electronic health record (EHR) systems associated with hospitals, physicians, and clinics across the US.

The Problem Statement

The objective was to train machine learning models to identify patients suitable for clinical trials from large volumes of data. Because the models needed time to mature, the client needed a data labeling and annotation team that could manually verify, annotate, and flag the data against the inclusion and exclusion criteria of each clinical study and trial.
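
To illustrate the kind of check the annotation team performed by hand, here is a minimal Python sketch of flagging a record against inclusion and exclusion criteria. The record fields, the criteria, and the `label_record` helper are all hypothetical and chosen only for illustration; they are not the client's actual criteria, data model, or pipeline.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PatientRecord:
    # Hypothetical, simplified view of a record pulled from an EHR system.
    patient_id: str
    age: int
    diagnoses: Set[str] = field(default_factory=set)
    medications: Set[str] = field(default_factory=set)


# Illustrative criteria only: each entry is (description, predicate).
INCLUSION = [
    ("age 18 to 65", lambda r: 18 <= r.age <= 65),
    ("diagnosed with type 2 diabetes", lambda r: "type 2 diabetes" in r.diagnoses),
]
EXCLUSION = [
    ("currently on insulin", lambda r: "insulin" in r.medications),
]


def label_record(record: PatientRecord) -> dict:
    """Return an annotation listing which criteria the record fails or triggers."""
    unmet = [desc for desc, check in INCLUSION if not check(record)]
    triggered = [desc for desc, check in EXCLUSION if check(record)]
    return {
        "patient_id": record.patient_id,
        # Eligible only if every inclusion criterion holds and no exclusion applies.
        "eligible": not unmet and not triggered,
        "unmet_inclusion": unmet,
        "triggered_exclusion": triggered,
    }


if __name__ == "__main__":
    sample = PatientRecord("p-001", 70, {"type 2 diabetes"}, {"insulin"})
    print(label_record(sample))
```

Keeping the reasons alongside the eligibility flag mirrors the manual workflow, where a reviewer can see why a record was flagged before approving it.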

Metric Tree Labs’ Approach and Solution

Because large volumes of data had to be processed manually, we proposed building a dedicated remote team in India with the healthcare expertise that the data processing required. Our hiring team assembled a team of Doctors of Pharmacy led by a senior doctor. Together with our consultants, the team studied the client’s annotation criteria, created a unique process from scratch, and delivered the required output on an ongoing monthly basis. The team of 40 members operated for two years, from 2020 to 2022.

Considerations and Process Improvements

Because the data classification criteria varied for each pharmaceutical company for which the data models were developed, the process required continual adjustment and change management to improve the team’s productivity and maximize the number of files processed. The ongoing considerations included the following:

  1. Strict quality assurance to ensure that processed data was cross-checked, verified, and approved before finalization.
  2. Continuous involvement from the human resources department to track productivity and replace underperforming team members.
  3. An adaptive team process, with labeling schemes optimized for the variation in data output from different EHR systems.
  4. Analysis of labeled data against the initial selection criteria, and consideration of change requests to improve output data quality.