About me
I'm a Data Platform Engineer specialising in architecting scalable, cost-efficient data infrastructure and orchestration pipelines. My focus over the last several years has been building the cloud platforms that make enterprise-grade analytics possible whilst keeping costs under control.
Highlights of my work so far:
• Cut pipeline execution costs by 38% by migrating legacy SQL models to incremental dbt transformations, and reduced storage costs by ~13% through feature-layer redesign.
• Led migrations off licensed tooling, bypassing heavy enterprise licensing constraints with open-source architectures: dbt Cloud → dbt Core (Airflow + Astronomer Cosmos) and Airbyte/dbt Cloud → Dagster, dltHub & SQLMesh.
• Built a production-grade CDC streaming pipeline from scratch with Debezium and Kafka, moving a platform from legacy batch to near real-time ingestion into BigQuery.
• Established CI/CD and DataOps practices across entire data platforms: security scanning (Semgrep, Trivy, TruffleHog), automated dependency management with Renovate, pre-commit standards enforcement, ephemeral Airflow environments per PR, and self-hosted GitHub Actions runners on EKS.
• Embedded GDPR compliance into the platform itself, with automated PII detection (Presidio) at the ingestion layer.
Stack: Airflow (MWAA, Astronomer), Dagster, dbt, SQLMesh, dltHub, Kafka, Debezium, Snowflake, BigQuery, Trino/Starburst, Apache Iceberg, Docker, Kubernetes (EKS), Helm, GitHub Actions, Python, SQL, on AWS and GCP.
Experience
Prodege, LLC
Business Intelligence Engineer
Led the migration from dbt Cloud to dbt Core using Airflow and Astronomer Cosmos, designing a tag system that schedules model execution according to the models' dependencies.
Built a centralised CI/CD standards system spanning every data-team repository, integrating security and quality scanning (Bandit, Semgrep, Hadolint, TruffleHog, Trivy, Kubeconform, Kubescape, Kubesec) and automating dependency management with Renovate.
Designed a dbt Core project on Trino with CI/CD workflows and pre-commit hooks enforcing team standards for model layering, naming, data quality, and table materialisation.
Implemented CI/CD workflows that provision ephemeral Astronomer Airflow deployments per pull request, reducing infrastructure costs and significantly improving developer experience.
Engineered reusable tooling to automate source-table ingestion as dbt models, integrating PII detection via Presidio to enforce data classification and GDPR compliance at the ingestion layer.
Deployed self-hosted GitHub Actions runners on EKS using the Actions Runner Controller (ARC), with a custom ECR base image that cut CI/CD startup time and standardised runner environments.
Associate Business Intelligence Engineer
Architected the migration of legacy SQL models to dbt using incremental transformations and source partitioning, reducing pipeline execution costs.
Designed cost-efficient feature layer models, eliminating duplicate datasets and converting them to views, reducing storage costs and improving platform maintainability.
Established and owned CI/CD pipelines for Airflow DAG deployment to Amazon MWAA via S3, with dependency management and shared coding standards ensuring reliable production orchestration.
Drove data modelling and query optimisation on Snowflake with cost-aware design principles, contributing to a platform migration toward a lakehouse architecture on Apache Iceberg with Starburst Galaxy/Trino as the compute layer.
BitBurst GmbH
Associate Business Intelligence Engineer
Led the end-to-end migration of ELT infrastructure from Airbyte/dbt Cloud to Dagster, dltHub, and SQLMesh, driving a major architectural evolution of the data platform and reducing pipeline execution costs by 38%.
Architected and deployed a production-grade CDC pipeline from scratch using Debezium and Kafka (Docker Compose), capturing change data from a PlanetScale (Vitess) database and delivering it to BigQuery via a custom dltHub integration — establishing near real-time streaming ingestion.
Architected a star-schema redesign of core data models, eliminating duplicate datasets and reducing storage costs by ~13%, while maintaining backward compatibility through view abstraction for downstream consumers.
Established CI/CD pipelines enforcing code quality standards, automated model testing, and seamless deployment across the data platform.
Documented data models and sources with full lineage and metadata propagation, improving data discoverability and trust across teams.
BI Data Analytics
Designed and built ELT pipelines using Airbyte and dbt, incorporating automated data quality checks to improve pipeline reliability.
Established version control workflows for dbt models and ELT pipelines via GitHub, enabling collaborative engineering practices and audit-ready change tracking.
Identified fraud patterns through statistical analysis of internal metrics, translating findings into new detection rules that improved data accuracy and platform integrity.
Developed interactive dashboards in Metabase and Sigma, enabling data-driven decisions across business teams.
Allianz Kunde und Markt GmbH
Working Student
Supported the team in gathering customer-satisfaction insights and developing internal dashboards for visualisation.
Conducted data preprocessing, knowledge mining, and statistical analyses, communicating insights to stakeholders.