Brief Description of position:
Role and Responsibilities
As a Data Engineer, you are responsible for designing and implementing scalable, distributed, high-performance systems, data models, and enablers used in large-scale analytics. You will also define data requirements, develop mechanisms to gather and mine structured and unstructured data, and apply modeling and visualization techniques using Big Data programming languages and technologies such as Hadoop, Hive, Spark, Scala, Pig, Sqoop, Ambari, Airflow, Python, and PySpark.
Responsibilities:
- Design and implement a highly scalable system for data pipelines and data management
- Data Discovery: Define data requirements, explore and assess suitable data sources
- Data Migration: Create Extract, Transform, Load (ETL) pipelines to collect and mine large volumes of structured and unstructured data for processing
- Data Cleansing: Perform initial data quality checks on raw & extracted data by running various data tools in the big data environment
- Data Transformation and Integration: Extract data from identified databases, process through ELT pipelines, and curate data to a structure that is relevant to the problem by selecting appropriate techniques
- Data Modelling: Conceptualize, design and develop logical and physical data models through analysis of complex data elements, systems, data flows, dependencies, and relationships
- Develop and maintain data engineering best practices and contribute to insights on data analytics and visualization concepts, methods and techniques
- Present and review project technical objectives, status and learnings with senior leadership and develop mutually beneficial strategic alliances with customers
- Actively mentor junior members of the engineering staff by openly sharing experience and perspective and by holding routine code reviews
If you thrive in a dynamic, collaborative workplace, IBM provides an environment where you will be challenged and inspired every single day. And if you relish the freedom to bring creative, thoughtful solutions to the table, there's no limit to what you can accomplish here.
Required Technical and Professional Expertise
- 6+ years of related, relevant IT experience in the field of
- 4+ years of development experience in big data, specifically on Hadoop/Hive, Spark, and Scala
- Ability to quickly learn any open-source big data stack and develop using multiple programming languages and tools
- Expertise in Pig, Sqoop, Ambari, Airflow, Python, and PySpark
- Hands-on experience on the Linux platform
Preferred Technical and Professional Expertise
- You love collaborative environments that use agile methodologies to encourage creative design thinking and find innovative ways to develop with cutting-edge technologies
- Ambitious individual who can work under their own direction towards agreed targets and goals, with a creative approach to work
- Intuitive individual with an ability to manage change and proven time management skills
- Proven interpersonal skills, contributing to the team effort by accomplishing related results as needed
- Up-to-date technical knowledge, maintained by attending educational workshops and reviewing publications