Job Description
Position: Data Engineer
Location: Mumbai/Digital/Full-Time
About Aramex:
Since its foundation in 1982, Aramex has grown to become a world leader in comprehensive logistics and transportation solutions, recognized for its customized services and innovative products for businesses and consumers. Listed on the Dubai Financial Market (DFM) and headquartered in the UAE, we currently have business operations in over 567 cities across 66 countries worldwide and employ over 17,000 transportation professionals.
We live in an era where technology transforms and influences our daily lives more than ever before; therefore, technological innovation is critical to our success. We are strategically leveraging technology in various ways, one of which is harnessing big data and AI.
We are looking for a data engineer to join our team. In this role, you will load and transform data from various sources into a data lake using AWS tools, help the data science team develop and deploy use cases that solve business problems, and bring strong knowledge of and expertise in big data and machine learning operations (MLOps).
If you have strong analytical skills, an understanding of data loading and transformation, familiarity with several development languages, and the ability to solve problems using data analysis, we would love to hear from you.
Use cases we are working on:
In this role, you will work on a wide variety of problems, to name a few:
- Automatic time slot prediction
- Dynamic territory optimization
- Address geocoding using deep learning
- Incomplete address auto-completion using natural language processing
- Profiling and predictive analytics for customers, consignees, and shippers
Our Tech Stack:
At Aramex, our digital journey is hosted on AWS and built on EC2, ECS, Kinesis, Glue, S3, Redshift, RDS, DynamoDB, SageMaker, Lambda, Step Functions, API Gateway, CloudFormation, and the AWS developer tools.
We use Python 3 extensively. Our data engineering frameworks include Spark and the AWS Serverless Application Model (SAM), and we practice infrastructure as code (IaC) with the AWS Cloud Development Kit (CDK).
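To give a flavor of the serverless side of this stack, here is a minimal Lambda-style handler sketch in Python. The event shape and the field names (`records`, `shipment_id`) are hypothetical, invented purely for illustration, and not part of any Aramex API:

```python
import json


def handler(event, context):
    # Hypothetical Lambda-style handler: counts shipment IDs in the
    # incoming event and returns an API Gateway-compatible response.
    # "records" and "shipment_id" are assumed names, not a real schema.
    ids = [r["shipment_id"] for r in event.get("records", [])]
    return {"statusCode": 200, "body": json.dumps({"count": len(ids)})}
```

In a real deployment, a function like this would be declared in a SAM or CDK template and wired to an API Gateway route or a Kinesis trigger.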
Your Role:
- Collaborate with data management stakeholders, such as data engineers, data scientists, and product managers, to identify requirements for complex, loosely defined business problems
- Understand the requirements and assist in architecting a cloud-based solution that is technically and commercially viable, shaping a scalable and compliant data strategy for the company
- Serve data to end users, whether analysts, applications, or external clients. Depending on the end user, this may involve setting up dashboards and visualizations, data permissions and expiry, serverless API endpoints and authentication, or data dumps and automation
- Build and manage data pipelines, maintaining ETL/ELT jobs that transform input datasets into data warehouses/data lakes and production databases
- Deploy ML models to production in collaboration with data scientists, setting up batch processing, ML pipelines, CI/CD pipelines, monitoring, and verbose logging
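To make the pipeline responsibilities above concrete, here is a minimal, illustrative extract-transform-load sketch in pure Python. The column names and the aggregation are invented for the example; a production pipeline here would use Spark, Glue, or a similar framework rather than in-memory dictionaries:

```python
import csv
import io

# Hypothetical raw input, standing in for a file landed in a data lake.
RAW = """shipment_id,city,weight_kg
A1,Mumbai,2.5
A2,Dubai,
A3,Mumbai,4.0
"""


def extract(text):
    # Extract: parse raw CSV text into a list of row dicts.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transform: drop rows with a missing weight, cast types,
    # and normalize city names.
    out = []
    for r in rows:
        if not r["weight_kg"]:
            continue
        out.append({"shipment_id": r["shipment_id"],
                    "city": r["city"].upper(),
                    "weight_kg": float(r["weight_kg"])})
    return out


def load(rows):
    # Load: stand-in for a warehouse write; aggregates weight per city.
    totals = {}
    for r in rows:
        totals[r["city"]] = totals.get(r["city"], 0.0) + r["weight_kg"]
    return totals
```

The same extract/transform/load split maps directly onto the ETL/ELT jobs described above, only with distributed storage and compute in place of local strings and dicts.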
Must have:
- Linux, Bash scripting, Git
- Python 3+, SQL, NoSQL (PartiQL etc.)
- Distributed Data Storage: Hadoop Distributed File Systems (HDFS)
- Distributed Data Processing Frameworks: MapReduce, Spark, Hive, CDC, Kafka, replication, partitioning, event-based processing, batch and file-queue processing, orchestration frameworks
- DevOps practices: Familiarity with CI/CD pipelines, setting up monitoring and logging, containerization using Docker, application/infrastructure maintenance, infrastructure as code (IaC)
- MLOps: Familiar with structuring ML systems for production, feature engineering at scale
Good to have:
- Distributed Data Storage on AWS: S3, Redshift, RDS, DynamoDB
- Distributed Data Processing on AWS: EMR, Glue, Kinesis Data Streams, Kinesis Data Firehose, Database Migration Service, Data Pipeline
- DevOps on AWS: CloudFormation, AWS developer tools for CI/CD and CodeArtifact, monitoring using CloudWatch, familiarity with the AWS SAM and CDK frameworks
- MLOps on AWS: SageMaker, Lambda, Step Functions, API Gateway, Feature Engineering at scale using Glue, Serverless Spark jobs
Big Plus:
- [AWS] One or more AWS certifications (Associate, Professional, Specialty)
- [Data Science] Hands-on knowledge of machine learning and cutting-edge deep learning algorithms and enhancements
- Understanding of security compliance frameworks, data security rules, and various data encryption methods