Hands-on data engineering experience building and supporting production data pipelines.
Strong SQL skills (design, tuning, troubleshooting) and experience working with relational and analytical data stores.
3+ years of coding/scripting experience with Python (Java/Scala a plus).
3+ years of experience on AWS (e.g., S3, Glue, EMR, Lambda, IAM, CloudWatch, MWAA/Airflow).
2+ years of experience with infrastructure-as-code and CI/CD practices (Terraform preferred).
2+ years of experience with Spark; streaming experience with Kafka is a plus.
Experience leveraging AI tools (e.g., GenAI assistants) to improve productivity, with an understanding of secure and responsible usage.
Self-driven, delivery-focused, and comfortable working in a contract environment with clear milestones and timelines.
Curious and methodical; able to research new data sources, open-source tools, and ingestion patterns and propose practical approaches.
Strong operational mindset: observability, alerting, incident response, and continuous improvement to reduce pipeline failures.
Clear communicator with strong stakeholder management skills; able to surface risks early and provide crisp status updates.
Familiarity with domain areas such as Portfolio Management, Cyber Security, and ServiceNow reporting is a plus.
Qualifications:
Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related technical field (or equivalent practical experience).
Job Description:
Build and enhance ELT/ETL pipelines to ingest, transform, and curate data from multiple sources into clean, trusted datasets.
Develop and maintain data processing jobs using Spark and Python, with a focus on performance and reliability.
Implement and monitor data quality checks, reconcile issues, and partner with analysts to resolve data defects.
Translate requirements into technical tasks and deliver well-documented solutions, including runbooks and operational playbooks.
Support day-to-day operations of data pipelines: monitoring, incident triage, root-cause analysis, and meeting SLA targets.
Develop workflows and scheduling using MWAA (Airflow) and AWS-native services.
Collaborate effectively across time zones with onsite stakeholders and the India engineering team through clear communication and proactive status updates.