DataHour: Exploring the Fundamentals of DeepMatch

Online 02-02-2023 07:00 PM to 02-02-2023 08:00 PM


  • Knowledge and Learning.


DataHour Recording

About the DataHour:

The quintessential pre-task of most data-driven analysis is “stitching” multiple data sources together. Traditionally, in an analyst’s language, this is achieved through “joins”, which combine datasets based on shared entries in columns they have in common. In many modern settings, however, this does not work, because two datasets may lack a shared column (or columns), have mismatched entries, or be related through many-to-one relationships.

This typically arises for one or more of the following reasons:

  • Lack of centralized design across first-party and third-party datasets
  • Datasets that do not adhere to a standardized format
  • Errors and missing values in the data
  • Many-to-one and many-to-many fuzzy relationships
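To make the problem concrete, here is a minimal sketch of an exact join failing on mismatched entries while a simple fuzzy match recovers some pairs. It is not DeepMatch and does not represent how DeepMatch works; it uses Python's standard-library `difflib` as a stand-in. The product names, the `normalize`, `exact_join`, and `fuzzy_join` helpers, and the 0.8 cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical data: two product lists with no shared, clean key column.
internal = ["Coca-Cola 330ml Can", "Pepsi Max 500ml Bottle", "Sprite 1.5L Bottle"]
supplier = ["coca cola can 330 ml", "PEPSI MAX bottle 500ml", "Fanta Orange 330ml"]


def normalize(name):
    """Lowercase, replace hyphens with spaces, and sort tokens so word order is ignored."""
    return " ".join(sorted(name.lower().replace("-", " ").split()))


def exact_join(left, right):
    """Classic join: keep only entries that appear verbatim on both sides."""
    right_set = set(right)
    return [(item, item) for item in left if item in right_set]


def fuzzy_join(left, right, cutoff=0.8):
    """Pair each left entry with its most similar right entry, or None if below cutoff."""
    by_normalized = {normalize(r): r for r in right}
    candidates = list(by_normalized)
    pairs = []
    for item in left:
        close = difflib.get_close_matches(normalize(item), candidates, n=1, cutoff=cutoff)
        pairs.append((item, by_normalized[close[0]] if close else None))
    return pairs


if __name__ == "__main__":
    print("exact:", exact_join(internal, supplier))
    for left_name, right_name in fuzzy_join(internal, supplier):
        print(f"fuzzy: {left_name!r} -> {right_name!r}")
```

With this sample data the exact join finds nothing, because no string is identical across the two lists. The fuzzy version pairs the two products that differ only in formatting and leaves the third unmatched rather than forcing a pairing. The cutoff matters: a looser threshold would start producing false matches, which is one reason string similarity alone is rarely enough in practice.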

Today, this challenge is addressed through a mix of manual work and point solutions across industries and verticals: SKU mapping in retail and supply chain for demand planning; reconciliation in accounts receivable and payable; trade reconciliation in banking and financial services; auditing in insurance; and entity resolution across industries.

In this DataHour, Devavrat will introduce DeepMatch, an AI-powered approach to matching or joining data with an easy-to-use human-in-the-loop component. He will also demonstrate how it has been used for SKU mapping in retail and supply chain for demand planning, transaction reconciliation in banking and financial services, and auditing in insurance.

Prerequisites: a zeal for learning Data Science and Artificial Intelligence

Who is this DataHour for?

  • Students & Freshers who want to build a career in the Data-tech domain.
  • Working professionals who want to transition to the Data-tech domain.
  • Data science professionals who want to accelerate their career growth.


Devavrat Shah

Andrew (1956) and Erna Viterbi Professor at MIT

Devavrat Shah has been an Andrew (1956) and Erna Viterbi Professor of Computer Science and AI at MIT since 2005, where he founded MIT’s Statistics and Data Science Center and currently directs the Deshpande Center for Tech Innovation. Previously, he co-founded Celect, which focused on inventory optimization using AI and was acquired by Nike in 2019. He currently serves as the CTO of Ikigai Labs, which he co-founded in 2019 with the mission of building self-driving organizations by empowering data business operators to make data-driven decisions with the ease of spreadsheets. He received his B.Tech. degree from IIT Bombay and his Ph.D. degree from Stanford University, both in Computer Science. He is a Kavli Fellow of the National Academy of Sciences. He has received paper awards from the INFORMS Applied Probability Society, INFORMS Management Science and Operations Management, NeurIPS, ACM Sigmetrics and IEEE Infocom. He has also received the Erlang Prize from the INFORMS Applied Probability Society, the Rising Star Award from ACM Sigmetrics, and multiple Test of Time paper awards from ACM Sigmetrics. He is a distinguished alumnus of his alma mater, IIT Bombay.

Connect with Devavrat on LinkedIn and Website.

