This project is a beginner-friendly, end-to-end data pipeline that shows how to load data from a CSV file, clean it, store it in a SQL database, and build a simple analytics dashboard in Streamlit, all in Python.
It is designed as an introductory project for anyone learning data engineering, data analytics, or Python-based ETL pipelines.
📦 Beginner_Data_Pipeline_Python_SQL_CSV_file
┣ 📄 README.md
┣ 📄 pipeline.py
┣ 📄 dashboard.py
┣ 📄 sales_data.xlsx
- Import data from a CSV file
- Clean and preprocess the dataset using Pandas
- Store cleaned data in a SQLite database
- Use SQL queries to aggregate and analyze the data
- Create a simple analytics dashboard (tables + charts) in Streamlit (a web-based visualization framework)
- Beginner-friendly, fully documented Python code
- Python 3.x
- Pandas
- SQLite3
- SQLAlchemy
- Altair (statistical visualization library)
- The first five products
- Total Revenue
- Revenue by Region
- Top-performing products
- Daily Sales Trend by Product
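The summaries listed above map naturally onto a handful of SQL aggregations. The following is a minimal sketch, not the project's actual queries: the column names (`date`, `region`, `product`, `revenue`), the sample rows, and the in-memory database are all assumptions made for illustration, and "the first five products" is read here as a preview of the first five rows.

```python
import sqlite3

# Hypothetical schema and rows, used only for illustration; the real
# Sales_data table may have different column names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_data (date TEXT, region TEXT, product TEXT, revenue REAL)"
)
rows = [
    ("2024-01-01", "North", "Laptop", 1200.0),
    ("2024-01-01", "South", "Mouse", 25.0),
    ("2024-01-02", "North", "Mouse", 50.0),
    ("2024-01-02", "East", "Laptop", 2400.0),
    ("2024-01-03", "South", "Keyboard", 80.0),
]
conn.executemany("INSERT INTO Sales_data VALUES (?, ?, ?, ?)", rows)

# Preview of the first five rows.
first_rows = conn.execute("SELECT * FROM Sales_data LIMIT 5").fetchall()

# Total revenue across all rows.
total_revenue = conn.execute("SELECT SUM(revenue) FROM Sales_data").fetchone()[0]

# Revenue by region, largest first.
revenue_by_region = conn.execute(
    "SELECT region, SUM(revenue) AS total FROM Sales_data "
    "GROUP BY region ORDER BY total DESC"
).fetchall()

# Top-performing products by total revenue (top two in this sketch).
top_products = conn.execute(
    "SELECT product, SUM(revenue) AS total FROM Sales_data "
    "GROUP BY product ORDER BY total DESC LIMIT 2"
).fetchall()

# Daily sales trend per product.
daily_trend = conn.execute(
    "SELECT date, product, SUM(revenue) AS total FROM Sales_data "
    "GROUP BY date, product ORDER BY date, product"
).fetchall()

print(total_revenue)
print(revenue_by_region)
```

Each result is a plain list of tuples, so it can be passed to `pandas.DataFrame` and from there to a table or chart in the dashboard.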
- Clone the repository:
  git clone https://github.com/Kindoli/Building-a-Beginner-Analytics-Dashboard-Python-SQL-CSV-.git
  cd Building-a-Beginner-Analytics-Dashboard-Python-SQL-CSV-
- Run the pipeline:
  python pipeline.py
This will:
- Load the CSV
- Clean the data
- Insert it into the SQL database (sales.db)
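- Run the dashboard script:
  python dashboard.py
  If dashboard.py is written as a Streamlit app, start it with streamlit run dashboard.py instead, since Streamlit apps are served through the streamlit command.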
This will generate charts and summary tables.
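import sqlite3
import pandas as pd
conn = sqlite3.connect("sales.db")  # assumption: a plain sqlite3 connection; a SQLAlchemy engine also works
df = pd.read_excel("sales_data.xlsx")  # read the raw sales file
df_clean = df.dropna()  # drop rows with missing values
df_clean.to_sql("Sales_data", conn, if_exists="replace", index=False)  # overwrite the table on each run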
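To try the load, clean, and store steps without the project's data file, the same flow can be run against a small in-memory sample. This is a sketch under stated assumptions: the CSV text, the column names, the extra `drop_duplicates()` step, and the in-memory database are all hypothetical and chosen for illustration; only the `Sales_data` table name comes from the snippet above.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample data standing in for the real sales file.
raw_csv = io.StringIO(
    "date,region,product,revenue\n"
    "2024-01-01,North,Laptop,1200\n"
    "2024-01-01,South,Mouse,\n"       # missing revenue
    "2024-01-02,North,Mouse,50\n"
    "2024-01-02,North,Mouse,50\n"     # exact duplicate of the row above
)

# Load: read the CSV text into a DataFrame.
df = pd.read_csv(raw_csv)

# Clean: drop rows with missing values, then drop exact duplicates.
df_clean = df.dropna().drop_duplicates()

# Store: write the cleaned rows to an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df_clean.to_sql("Sales_data", conn, if_exists="replace", index=False)

# Read the table back to confirm what was stored.
stored = pd.read_sql("SELECT * FROM Sales_data", conn)
print(stored)
```

Swapping `":memory:"` for a file name such as `"sales.db"` would persist the table to disk; `if_exists="replace"` means each run overwrites the previous contents rather than appending to them.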
This project helps beginners learn:
- Reading CSV files in Python
- Basic data cleaning
- SQL operations using Python
- Building a simple analytics dashboard using Streamlit
- Structuring a real-world mini data pipeline
Pull requests are welcome! Feel free to submit improvements or additional analytics steps.
