omarmohammed271/README.md

👋 Hi, I'm Omar Mohammed

Senior Data Engineer | Real-Time & Batch Data Architect


🧱 About Me

  • Passionate about building scalable data pipelines using Spark, Kafka, and modern data platforms.
  • Experienced in both cloud-native architectures (AWS, Azure) and on-premise systems.
  • Strong believer in the power of clean code, observability, and data quality.

💼 What I Do

  • Design and develop ETL and ELT pipelines (batch + streaming).
  • Implement data lakehouse architectures (Bronze / Silver / Gold stages).
  • Build real-time processing systems using Spark Structured Streaming & Kafka.
  • Automate workflows and scheduling with Airflow.
  • Optimize analytics databases (e.g. ClickHouse) for fast query performance.
  • Create CI/CD pipelines for data solutions (using GitHub Actions or similar).
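
To make the Bronze / Silver / Gold idea concrete, here is a minimal plain-Python sketch. It is an illustration only, not code from any of the repositories: the function names (`to_bronze`, `to_silver`, `to_gold`) and the `user, amount` record layout are hypothetical, and a real lakehouse would typically use Spark with Delta Lake tables instead of in-memory lists.

```python
from collections import defaultdict


def to_bronze(raw_lines):
    """Bronze: keep every raw record untouched, tagged with its arrival order."""
    return [{"seq": i, "raw": line} for i, line in enumerate(raw_lines)]


def to_silver(bronze_rows):
    """Silver: parse and type the records, dropping rows that cannot be parsed."""
    silver = []
    for row in bronze_rows:
        parts = row["raw"].split(",")
        if len(parts) != 2:
            continue  # malformed line: wrong number of fields
        user, amount = parts[0].strip(), parts[1].strip()
        try:
            value = float(amount)
        except ValueError:
            continue  # amount is not numeric
        if user:
            silver.append({"user": user, "amount": value})
    return silver


def to_gold(silver_rows):
    """Gold: aggregate cleaned records into a per-user total for reporting."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["user"]] += row["amount"]
    return dict(totals)


# Hypothetical sample input, including two bad lines.
raw = ["alice, 10.5", "bob, 4", "broken-line", "alice, 2.5", "carol, not-a-number"]
bronze = to_bronze(raw)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Each stage reads only the output of the previous one, so the raw data in the Bronze stage stays available for reprocessing if the Silver rules change.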

🌐 Tech Stack

| Domain | Technologies |
| --- | --- |
| Data Processing | PySpark, Spark SQL, Delta Lake |
| Streaming | Apache Kafka, Spark Structured Streaming |
| Workflow Orchestration | Apache Airflow |
| Data Storage | S3 / ADLS, Delta / Parquet |
| Analytics | ClickHouse, PostgreSQL |
| Cloud | AWS, Azure |
| CI / CD | GitHub Actions |
| Languages | Python, SQL |

🚀 Featured Projects

Here are some of my key repositories (feel free to click and explore):

  • [Real-Time Processing Pipeline]: A Kafka → Spark Streaming system with schema validation and data quality checks.
  • [Lakehouse Architecture Demo]: Multi-layer (Bronze / Silver / Gold) data lakehouse built with Delta Lake.
  • [Airflow Data Workflows]: End-to-end DAGs for ingestion, transformation, and orchestration.
  • [Analytics in ClickHouse]: Setup for real-time analytics using ClickHouse materialized views.
  • [CI/CD for Data Jobs]: GitHub Actions to test, build, and deploy data workloads.
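
As a rough idea of what "schema validation and data quality checks" can look like, the sketch below routes records into valid and rejected sets in plain Python. Everything in it is an assumption for illustration: the `EXPECTED_SCHEMA` fields, the rules in `check_quality`, and the `route` helper are hypothetical and are not taken from the Real-Time Processing Pipeline repository, which would apply such checks inside a Spark streaming job.

```python
# Hypothetical event schema: field name -> required Python type.
EXPECTED_SCHEMA = {"event_id": str, "user_id": int, "amount": float}


def validate_schema(record):
    """Return a list of schema errors (missing fields or wrong types)."""
    errors = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}")
    return errors


def check_quality(record):
    """Return a list of data quality errors for a schema-valid record."""
    errors = []
    if record["amount"] < 0:
        errors.append("amount must be non-negative")
    if not record["event_id"]:
        errors.append("event_id must not be empty")
    return errors


def route(records):
    """Split records into valid ones and (record, errors) rejects."""
    valid, rejected = [], []
    for record in records:
        errors = validate_schema(record)
        if not errors:
            # Quality rules only run once the shape of the record is known.
            errors = check_quality(record)
        if errors:
            rejected.append((record, errors))
        else:
            valid.append(record)
    return valid, rejected


# Hypothetical sample events: one good, three with different problems.
sample = [
    {"event_id": "e1", "user_id": 1, "amount": 9.99},
    {"event_id": "e2", "user_id": "2", "amount": 5.0},
    {"event_id": "e3", "user_id": 3, "amount": -1.0},
    {"event_id": "e4", "user_id": 4},
]
valid, rejected = route(sample)
```

Keeping the rejected records together with their error messages, rather than dropping them, makes it possible to inspect or replay them later.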

📈 GitHub Stats

Omar’s GitHub stats


📫 Get in Touch


⚡ Fun Facts

  • I love optimizing pipelines; every millisecond matters.
  • Outside work: I enjoy reading about distributed systems and data infrastructure.
  • Lifelong learner: currently exploring feature stores and ML data platforms.

🌐 Socials:

LinkedIn

💻 Tech Stack:

🧱 Data Engineering

Spark, Kafka, Airflow, ClickHouse, Trino, Hadoop


🐍 Programming

Python, Scala, Java, SQL


☁ Cloud & DevOps

AWS, Azure, Docker, Kubernetes, GitHub Actions, Git, GitHub, GitLab


🗄 Databases & Warehousing

PostgreSQL, MySQL, MongoDB, Redis, Delta Lake, S3


📊 Data Science / ML Tools

NumPy, Pandas, scikit-learn, PyTorch, Matplotlib

📊 GitHub Stats:



🏆 GitHub Trophies

🔝 Top Contributed Repo

