Full Stack Data Science Implementation

[vc_row][vc_column css_animation=”top-to-bottom”][vc_column_text]By Aviral Bhatnagar, Head of Data Science at Atidiv, a full stack data science firm

Data Science solves real-world problems

In this article, I will take you through the inside story of how my team at Atidiv helped a multibillion-dollar company leverage full stack data science to serve error-free information to 100 million+ visitors a month.

The Scale of the Problem

The client is a directory of local businesses. Its core value proposition is providing accurate information to visitors. The content is user generated. To ensure accuracy, the company runs a data moderation operation that approves each user entry, also known as a change request, before it goes live. The data moderation team of 150 members tackles over 15,000 change requests a day.

Only two things matter. Speed and Accuracy.


(How a billion dollar company maintains quality)


A second quality layer audits data moderators. Like I said – accuracy is everything.


However, the company was struggling. The process was poorly managed. Data moderators let errors slip by. Accuracy suffered. The management team spent time on low-value tasks. Staff morale was low and attrition was high.


At this point, my team was called in to help. It was an enticing challenge, and we took it up enthusiastically. Here was the initial process.

Remember – the objective is to optimise the productivity and quality of the Quality Moderation process. It is held back by poor record keeping. Delays in quality record keeping mean that errors remain live on the website – a critical failure for a website that hosts 100 million+ visitors a month. The lack of interconnectivity between productivity and quality reporting hinders the creation of incentive schemes to push metrics upward. Data is input manually, compiled manually and analysed manually.


This was a unique opportunity to unleash our full stack approach to data science.


(The full stack data science approach)


We help companies standardise processes to collect data, create platforms to host this data and help analyse this data through automated reporting and bespoke consulting.

The New Process

Step 1: Integrate Data Collection into Processes

Lack of in-house data created a dependence on global headquarters reports and manual reporting – inefficient processes that led to incomplete and delayed datasets. We integrated a small custom-built form into the data moderation process flow. The employee enters a request ID into this form, effectively linking their employee ID to the work they have done.


With each request, data flowed in real time from the form to the productivity database. Similarly, with each quality audit, data flowed in real time into the quality database.
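
To make the idea concrete, here is a minimal sketch of how a form submission could be written to a productivity database the moment it is made. It is an illustration only, not the form described above: the SQLite store, the table name productivity, its columns and the function names are all assumptions.

```python
import sqlite3
from datetime import datetime, timezone


def open_productivity_db(path=":memory:"):
    """Open (or create) a hypothetical productivity database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS productivity (
               request_id  TEXT NOT NULL,  -- change request handled
               employee_id TEXT NOT NULL,  -- moderator who handled it
               handled_at  TEXT NOT NULL   -- ISO 8601 timestamp
           )"""
    )
    return conn


def record_submission(conn, request_id, employee_id, handled_at=None):
    """Store one form submission, linking the moderator to the request."""
    if handled_at is None:
        handled_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO productivity (request_id, employee_id, handled_at) "
            "VALUES (?, ?, ?)",
            (request_id, employee_id, handled_at),
        )


# Example: a moderator submits the form for one change request.
conn = open_productivity_db()
record_submission(conn, "REQ-1001", "EMP-042")
```

Because each submission is written as soon as it happens, nothing has to be compiled by hand at the end of the day. A quality audit could be recorded the same way in a separate quality table.
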


Step 2: Integrate Data into a Common Platform

The Request ID and Employee ID serve as connecting keys that join the two databases into a single data platform. From here, data flows in real time into live visualisations monitored by team leads. Errors picked up in the quality assurance process are allocated to data moderators in real time, reducing the turnaround time to tackle errors from more than 36 hours to less than 2 hours.

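
The join on these two keys can be sketched as follows. This is a simplified illustration under stated assumptions, not the production platform: it uses an in-memory SQLite database, hypothetical table and column names, and assumes each quality record stores the employee ID of the moderator whose work was audited.

```python
import sqlite3


def build_platform():
    """Create hypothetical productivity and quality tables in one database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE productivity (
            request_id TEXT, employee_id TEXT, handled_at TEXT
        );
        CREATE TABLE quality (
            request_id TEXT, employee_id TEXT, audited_at TEXT,
            is_error INTEGER  -- 1 if the audit found an error, else 0
        );
        """
    )
    return conn


def errors_by_moderator(conn):
    """Join the two tables on request ID and employee ID, keeping only errors."""
    query = """
        SELECT p.employee_id, p.request_id, p.handled_at, q.audited_at
        FROM productivity AS p
        JOIN quality AS q
          ON q.request_id = p.request_id
         AND q.employee_id = p.employee_id
        WHERE q.is_error = 1
        ORDER BY q.audited_at
    """
    return conn.execute(query).fetchall()


# Example with made-up records: two requests handled, one flagged in audit.
conn = build_platform()
conn.executemany(
    "INSERT INTO productivity VALUES (?, ?, ?)",
    [("REQ-1", "EMP-1", "2020-01-01T09:00:00"),
     ("REQ-2", "EMP-2", "2020-01-01T09:05:00")],
)
conn.executemany(
    "INSERT INTO quality VALUES (?, ?, ?, ?)",
    [("REQ-1", "EMP-1", "2020-01-01T10:00:00", 1),
     ("REQ-2", "EMP-2", "2020-01-01T10:10:00", 0)],
)
flagged = errors_by_moderator(conn)
```

Each row in flagged names the moderator, the request and the time of the audit, which is the information a team lead would need to route an error back to the right person quickly.
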
Step 3: Data Analysis – Real Time Data Visualisations[/vc_column_text][/vc_column][/vc_row]