5+ years in data engineering, analytics engineering, or data platform roles.
Comfortable working as an individual contributor.
Experience with direct client interaction and requirements gathering.
Deep expertise in Google BigQuery - data modeling, optimization, and governance.
Experience with Python ETL and SQL pipelines.
Familiarity with BI tools in the modern data stack (Periscope, Looker, Metabase, Tableau, or equivalents).
Strong understanding of data governance - lineage, cataloging, PII controls, and access management.
Excellent stakeholder engagement and discovery facilitation skills.
Basic experience with or knowledge of dbt (data build tool) - project setup, testing, and documentation.
Experience with Amplitude or other product analytics platforms.
Prior work in EdTech or B2C SaaS data environments.
Exposure to AI for BI or semantic layer tooling.
Qualifications:
Bachelor’s degree in Computer Science, Information Technology, Engineering, or related field.
Job Description:
Implement and enhance end‑to‑end data pipelines (batch and/or streaming) to ingest data from diverse source systems into the enterprise data platform, following agreed architecture and patterns.
Engineer robust ETL/ELT workflows to transform, cleanse, and standardize data, ensuring conformance with canonical data models and business rules.
Build and optimize data layers (raw, curated, semantic) that enable self‑service analytics, BI, and data‑science use cases, with particular focus on performance, scalability, and cost efficiency.
Industrialize data solutions by implementing re‑usable frameworks, templates, and components for ingestion, quality checks, logging, and monitoring.
Apply best practices for code management, CI/CD, environment promotion, and automated testing for data pipelines and related assets.
Implement data quality and data validation checks, reconcile data across systems, and resolve data issues in collaboration with business and platform teams.
Contribute to data modeling activities (conceptual, logical, physical) and translate models into physical structures in the target data platform/warehouse.
Tune queries, jobs, and storage layouts to meet SLAs for latency, throughput, and concurrency, leveraging partitioning, indexing, caching, and other optimization techniques supported by the platform.
Implement and adhere to security, privacy, and governance standards, including role‑based access controls, data masking, and lineage/metadata capture.
Produce technical documentation for pipelines, data sets, job flows, and operational procedures, and hand over solutions to BAU/support as they are industrialized.