My Work

Showcasing my work as a Senior Data Engineer, from designing scalable ETL pipelines to architecting cloud-native data solutions on AWS.

5+ Years Experience
30+ Data Pipelines
50B+ Records Processed
AWS
Python
SQL
Spark
Kafka
Airflow
Iceberg
Glue
Lambda
PostgreSQL

All Projects

Data Engineering
Automated Data Lake Architecture with Apache Hudi

Developed a scalable Data Lake on AWS to process high-frequency Change Data Capture (CDC) logs from MS SQL Server into a transactional storage layer.

AWS Services:
S3, Glue, DynamoDB, Step Functions
Apache Hudi, AWS Glue, DynamoDB, +6
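
To illustrate the idea of landing CDC logs in a transactional layer, here is a minimal sketch of collapsing a stream of change events into the latest state per key. It is not the project's actual code: the event fields (`id`, `op`, `ts`, `row`) and the function `apply_cdc` are hypothetical, and in a real Hudi table this reconciliation is done by the table's upsert logic rather than by hand.

```python
def apply_cdc(events):
    """Collapse CDC events into the latest row state per primary key.

    Each event is a dict with hypothetical fields:
      id  - primary key of the source row
      op  - "I" (insert), "U" (update) or "D" (delete)
      ts  - ordering value, e.g. a commit timestamp
      row - the row image after the change (None for deletes)
    """
    latest = {}
    for ev in events:
        prev = latest.get(ev["id"])
        # Keep the event with the highest ordering value, so late-arriving
        # older events do not overwrite newer state.
        if prev is None or ev["ts"] >= prev["ts"]:
            latest[ev["id"]] = ev
    # Keys whose most recent event is a delete are dropped from the result.
    return {k: ev["row"] for k, ev in latest.items() if ev["op"] != "D"}


events = [
    {"id": 1, "op": "I", "ts": 1, "row": {"name": "a"}},
    {"id": 1, "op": "U", "ts": 3, "row": {"name": "c"}},
    {"id": 1, "op": "U", "ts": 2, "row": {"name": "b"}},  # arrives late
    {"id": 2, "op": "I", "ts": 1, "row": {"name": "x"}},
    {"id": 2, "op": "D", "ts": 2, "row": None},
]
print(apply_cdc(events))  # {1: {'name': 'c'}}
```

The ordering check is what makes the result independent of arrival order, which matters for high-frequency logs where events can arrive out of sequence.
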
Data Engineering
AWS CDC ETL Pipeline with Real-time Data Processing

Built a real-time CDC pipeline on AWS that migrates data from SQL Server using DMS, S3, Lambda, and Glue.

AWS Services:
AWS DMS, S3, Lambda, Glue
Python, SQL, AWS, +4
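
As a rough illustration of how the Lambda step in a pipeline like this can pick up newly landed files, the sketch below pulls bucket and key pairs out of an S3 event notification. It assumes the function is triggered by S3 object-created notifications, which the project description does not state; the names `extract_objects` and `handler` and the sample bucket are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if not s3:
            continue  # skip records that are not S3 notifications
        bucket = s3["bucket"]["name"]
        # Object keys in S3 notifications are URL-encoded.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    """Lambda entry point: report which objects arrived."""
    objects = extract_objects(event)
    # A real pipeline would hand these off to downstream processing here;
    # that step is omitted because the source does not describe it.
    return {"received": len(objects), "objects": objects}


sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-cdc-bucket"},
                "object": {"key": "cdc/orders/part+1.csv"}}}
    ]
}
print(handler(sample_event, None))
```

Keeping the parsing in a separate pure function makes it easy to test without any AWS dependencies.
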
Data Engineering
Oracle to PostgreSQL Migration

Architected a high-performance migration pipeline moving ~1 billion records per load from an Oracle Data Warehouse to Aurora PostgreSQL using parallel DB-Link async sessions, partitioned tables, and S3-based hybrid-cloud ingestion.

AWS Services:
Aurora PostgreSQL, S3, Storage Gateway
Oracle, PostgreSQL, Aurora, +7
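
The parallel-load idea behind a migration like this can be sketched as splitting a key range into contiguous chunks and copying each chunk in its own worker. This is only an illustration under stated assumptions: the project used DB-Link async sessions, whereas this sketch substitutes Python threads, and `split_range`, `copy_chunk`, `run_parallel`, and the numeric key range are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(lo, hi, parts):
    """Split the inclusive range [lo, hi] into up to `parts` contiguous,
    non-overlapping (start, end) chunks of near-equal size."""
    total = hi - lo + 1
    if total <= 0 or parts <= 0:
        return []
    parts = min(parts, total)
    base, extra = divmod(total, parts)
    chunks, start = [], lo
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder
        chunks.append((start, start + size - 1))
        start += size
    return chunks


def copy_chunk(bounds):
    """Stand-in for one session copying rows whose key lies in [start, end].
    Returns the number of keys in the chunk instead of touching a database."""
    start, end = bounds
    return end - start + 1


def run_parallel(lo, hi, workers):
    """Copy every chunk concurrently and return the total rows handled."""
    chunks = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        return sum(pool.map(copy_chunk, chunks))


print(split_range(1, 10, 3))      # [(1, 4), (5, 7), (8, 10)]
print(run_parallel(1, 1000, 8))   # 1000
```

Because the chunks never overlap and together cover the whole range, each worker can run independently and the per-chunk counts can be summed to verify completeness.
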

Technical Expertise

AWS Cloud

S3, Glue, Lambda, DMS, EMR, Kinesis, Step Functions

Apache Spark

PySpark, Spark SQL, DataFrame API, Optimizations

Real-time Streaming

Kafka, Kinesis, Apache Iceberg

Databases

PostgreSQL, MySQL, Oracle, DynamoDB

Interested in Working Together?

Let's discuss how I can help with your data engineering challenges.

Get in Touch