TheMathCompany: Consultant - Data Engineering

Brief Description of position:

We, at TheMathCompany, enable data analytics transformations for Fortune 500 organizations across the world. We enable our clients to build core capabilities that set them on a path to achieve analytics self-sufficiency.

  • Over the last four years, we have been consistently doubling in size year-on-year with 500 (and counting…) Data Scientists & Engineers, Consultants and Visualization experts 
  • TheMathCompany has won multiple awards recognizing us as a global Data and Analytics firm:
    • One of the Fastest Growing Technology Companies in India | Deloitte Tech Fast 50 India 2019 & 2020
    • India’s #1 Indian Firm | Nikkei-FT-Statista High Growth Companies Asia-Pacific 2021
    • India’s Fastest Growing Tech Firm | Economic Times and Statista 2021
  • 35+ Fortune 500 Companies, from almost 10 different industries and countries, trust us to power their analytical transformation.


  • An exciting opportunity to be a part of the growth journey of one of the fastest growing AI & ML firms – scope for experimentation, the big & small victories, the learnings and everything in between
  • Our in-house learning and development cell - Co.ach, run by world-class data analytics experts, enables our folks to stay up to date with the latest trends and technologies
  • At TheMathCompany, we insist on a culture that provides us all with enough flexibility to accommodate our personal lives without compromising on the dream of building a great company
  • We are changing the way companies go about executing enterprise-wide data engineering and data science initiatives, and we’d love to have you grow with us on this journey


As a data engineer, you’ll have the opportunity to work across the universe of data and solve some very interesting problems by creating and maintaining scalable data pipelines that handle petabytes of data. All our projects entail working with cutting-edge technologies, petabyte-scale data processing systems, data warehouses and data lakes to help manage the ever-growing information needs of our customers.

The responsibilities are detailed below:

  • Experience in understanding and translating data, analytic requirements and functional needs into technical requirements while working with global customers
  • Build and maintain data pipelines to support large-scale data management in alignment with data strategy and data processing standards
  • Strong experience in database, data warehouse & data lake design & architecture
  • Able to train & mentor team members
  • Experience in database programming using multiple flavours of SQL
  • Deploy scalable data pipelines for analytical needs
  • Experience in Big Data ecosystem - on-prem (Hortonworks/MapR) or Cloud (Dataproc/EMR/HDInsight)
  • Worked on query languages/tools such as Hadoop, Pig, SQL, Hive, Sqoop and SparkSQL.
  • Experience in any orchestration tool such as Airflow/Oozie for scheduling pipelines
  • Exposure to the latest cloud ETL tools such as Glue/ADF/Dataflow
  • Understand and work with in-memory distributed computing frameworks such as Spark (and/or Databricks), including parameter tuning and writing optimized Spark queries
  • Hands-on experience in using Spark Streaming, Kafka and HBase
  • Experience working in an Agile/Scrum development process


We are looking for individuals who are curious, excited about learning, and keen to navigate the uncertainties and complexities that come with growing a company. Some qualifications that we think would help you thrive in this role are:

  • BE/BS/MTech/MS in computer science or equivalent work experience.
  • 6+ years of experience in building data processing applications using Hadoop, Spark, NoSQL databases and Hadoop Streaming


  • Expertise in data structures, distributed computing, and manipulating and analysing complex, high-volume data from a variety of internal and external sources
  • Experience in developing ETL designs and data models for structured/unstructured and streaming data sources
  • Experienced in SQL performance tuning & recommendations
  • Experience in building large-scale data pipelines in batch and real-time modes
  • Experience in data migration to cloud (AWS/GCP/Azure)
  • Proficient in a programming language such as Python or Scala
  • Good understanding of relational/dimensional modelling and ETL concepts
  • Good understanding of data analysis techniques
  • Solid working knowledge of SQL and scripting
  • Understanding of any reporting tool such as Tableau, QlikView or Power BI
