TheMathCompany: Associate - Data Engineering

Brief Description of position:

We, at TheMathCompany, enable data analytics transformations for Fortune 500 organizations across the world. We enable our clients to build core capabilities that set them on a path to achieve analytics self-sufficiency.

  • Over the last four years, we have been consistently doubling in size year-on-year with 500 (and counting…) Data Scientists & Engineers, Consultants and Visualization experts 
  • TheMathCompany has won multiple awards recognizing us as a global Data and Analytics firm:
    • One of the Fastest Growing Technology Companies in India | Deloitte Tech Fast 50 India 2019 & 2020
    • India’s #1 Indian Firm | Nikkei-FT-Statista High Growth Companies Asia-Pacific 2021
    • India’s Fastest Growing Tech Firm | Economic Times and Statista 2021
  • 35+ Fortune 500 Companies, from almost 10 different industries and countries, trust us to power their analytical transformation.


  • An exciting opportunity to be a part of the growth journey of one of the fastest growing AI & ML firms – scope for experimentation, the big & small victories, the learnings and everything in between
  • Our in-house learning and development cell - Co.ach, run by world-class data analytics experts, enables our folks to stay up to date with the latest trends and technologies
  • At TheMathCompany, we insist on a culture that provides us all with enough flexibility to accommodate our personal lives without compromising on the dream of building a great company
  • We are changing the way companies go about executing enterprise-wide data engineering and data science initiatives, and we’d love to have you grow with us on this journey


As a data engineer, you’ll have the opportunity to work across the universe of data and solve some very interesting problems by creating and maintaining scalable data pipelines that deal with petabytes of data. All our projects entail working with cutting-edge technologies, petabyte-scale data processing systems, data warehouses and data lakes to help manage the ever-growing information needs of our customers.

The responsibilities are detailed below:

  • Build & maintain data pipelines to support large-scale data management in alignment with data strategy and data processing standards
  • Experience in designing efficient and robust ETL workflows
  • Experience in database programming using multiple flavors of SQL
  • Deploy scalable data pipelines for analytical needs
  • Experience in Big Data ecosystem - on-prem (Hortonworks/MapR) or Cloud (Dataproc/EMR/HDInsight)
  • Experience with query languages/tools such as Hadoop, Pig, SQL, Hive, Sqoop and SparkSQL
  • Experience in any orchestration tool such as Airflow/Oozie for scheduling pipelines
  • Scheduling and monitoring of Hadoop, Hive and Spark jobs
  • Basic experience in cloud environments (AWS, Azure, GCP)
  • Understanding of in-memory distributed computing frameworks like Spark (and/or Databricks) and their parameter tuning, and of writing optimized queries in Spark
  • Experience in using Spark Streaming, Kafka and HBase
  • Experience working in an Agile/Scrum development process


We are looking for individuals who are curious, excited about learning, and comfortable navigating the uncertainties and complexities that are associated with growing a company. Some qualifications that we think would help you thrive in this role are:

  • BE/BS/MTech/MS in computer science or equivalent work experience
  • 2 to 4 years of experience in building data processing applications using Hadoop, Spark, NoSQL databases and Hadoop Streaming


  • Exposure to latest cloud ETL tools such as Glue/ADF/Dataflow is a plus
  • Expertise in data structures, distributed computing, and manipulating and analysing complex high-volume data from a variety of internal and external sources
  • Experience in building structured and unstructured data pipelines
  • Proficiency in a programming language such as Python/Scala
  • Good understanding of data analysis techniques
  • Solid hands-on working knowledge of SQL and scripting
  • Good understanding of relational/dimensional modelling and ETL concepts
  • Understanding of any reporting tools such as Tableau, QlikView or Power BI
