This repository showcases Databricks-based data engineering projects, demonstrating modern best practices in data ingestion, transformation, and analytics using Delta Lake, Delta Live Tables (DLT), and Medallion Architecture.
Each project is organized into its own branch for modularity, making it easy to explore specific pipelines or datasets.
- Go to the main repo page on GitHub.
- Click on the Branch dropdown near the top-left corner (above the file list).
- Select the branch corresponding to the project you want to explore.
| Branch Name | Description |
|---|---|
| `main` | Overview and instructions for the repository |
| `rideone-project` | Databricks Lakehouse project (Bronze → Silver → Gold) |
💡 Note: Each branch contains its own README with detailed setup instructions, datasets, pipelines, and results.
- Platform: Databricks Lakehouse, Delta Lake, Delta Live Tables (DLT)
- Programming: Python (PySpark), SQL
- Cloud & Storage: AWS S3, Unity Catalog
- Orchestration & Pipelines: Lakeflow Declarative Pipelines
- Version Control: Git & GitHub
- Always explore projects via their respective branches for full instructions.
- Use `git checkout <branch_name>` to switch locally after cloning.
- Each branch README explains end-to-end pipeline setup, execution, and business impact.
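
The clone-and-switch workflow described above can be sketched as a small shell helper. This is a minimal sketch, not part of the repository: the function name `explore_project` and its three arguments are hypothetical, and the repository URL is left as a placeholder for you to supply.

```shell
# Hypothetical helper: clone a repository and switch to one project branch.
#   $1 = repository URL or local path (placeholder, supply your own)
#   $2 = branch name, e.g. rideone-project
#   $3 = directory to clone into
explore_project() {
  local repo_url="$1" branch="$2" target_dir="$3"

  # Clone the repository; stop if the clone fails.
  git clone --quiet "$repo_url" "$target_dir" || return 1

  # List the remote branches so the available projects are visible.
  git -C "$target_dir" branch -r

  # Switch to the requested branch. If it only exists on the remote,
  # git creates a local tracking branch for it automatically.
  git -C "$target_dir" checkout --quiet "$branch"
}
```

For example, `explore_project <repository_url> rideone-project ./rideone` would leave a checkout of the `rideone-project` branch in `./rideone`, where that branch's README can be read.
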
✨ This structure keeps the Databricks projects organized, modular, and easy for recruiters or collaborators to review.