Project Objective
Develop a Retail Sales Analytics System using Python, SQL, and Power BI. The project analyzes customer behavior, product performance, and overall sales trends, with the goal of an industry-ready, data-driven solution.
Dataset Overview
The project uses a retail transaction dataset with the following key attributes:
- Customer Info: ID, Name, Demographics (Age, Gender, Income), and Segment.
- Transaction Info: Purchase Date, Total Purchases, and Amount Spent.
- Product Info: Category, Brand, and Type.
- Order Info: Customer Feedback, Shipping, Payment Method, and Order Status.
Tools & Technologies
- Python
- SQL
- Power BI
- Excel
Python Analysis
- Loaded and cleaned the dataset (handling nulls/duplicates).
- Engineered features like customer age groups.
- Aggregated data to find total spend and purchases per category.
- Analyzed distributions (gender, age) and correlations.
- Identified top customers and detected spending outliers.
- Generated summary statistics and exported the cleaned data.
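
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes pandas is the analysis library, and the column names (`customer_id`, `age`, `category`, `total_purchases`, `amount_spent`), the sample rows, the age bins, and the IQR outlier rule are all hypothetical choices made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's schema and values may differ.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4, 5, 6],
    "age": [22, 35, 35, 47, 63, 29, None],
    "category": ["Grocery", "Electronics", "Electronics", "Clothing",
                 "Grocery", "Clothing", "Grocery"],
    "total_purchases": [3, 1, 1, 2, 5, 2, 4],
    "amount_spent": [60.0, 900.0, 900.0, 120.0, 150.0, 80.0, 95.0],
})


def clean_sales(df):
    """Drop exact duplicate rows and rows missing age or amount spent."""
    cleaned = df.drop_duplicates().dropna(subset=["age", "amount_spent"])
    return cleaned.reset_index(drop=True)


def add_age_group(df):
    """Add an age_group column; the bin edges are an assumption."""
    out = df.copy()
    out["age_group"] = pd.cut(
        out["age"],
        bins=[0, 25, 40, 60, 120],
        labels=["25 and under", "26-40", "41-60", "61+"],
    )
    return out


def spend_by_category(df):
    """Total spend and purchase count per category, highest spend first."""
    totals = df.groupby("category")[["amount_spent", "total_purchases"]].sum()
    return totals.sort_values("amount_spent", ascending=False)


def top_customers(df, n=3):
    """The n customers with the highest total spend."""
    return df.groupby("customer_id")["amount_spent"].sum().nlargest(n)


def spending_outliers(df):
    """Rows whose spend falls outside 1.5 * IQR of the quartiles."""
    q1 = df["amount_spent"].quantile(0.25)
    q3 = df["amount_spent"].quantile(0.75)
    iqr = q3 - q1
    in_range = df["amount_spent"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[~in_range]


clean = add_age_group(clean_sales(raw))
category_summary = spend_by_category(clean)
summary_stats = clean["amount_spent"].describe()
# to_csv returns the CSV text when no path is given; pass a path to write a file.
csv_text = clean.to_csv(index=False)
```

Keeping each step in its own small function makes it easy to rerun a single stage when the cleaning rules or bin edges change.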
SQL Queries
- Retrieved records with basic and advanced filtering (`WHERE`).
- Performed aggregations (`SUM`, `AVG`, `COUNT`) on key metrics.
- Grouped data to analyze sales by city, category, and segment.
- Filtered groups with `HAVING` to find active customers.
- Sorted results with `ORDER BY` to identify top brands and cities.
- Joined tables to combine customer and order information.
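
To illustrate the kinds of queries listed above, here is a small sketch that runs them against an in-memory SQLite database from Python. The `customers` and `orders` tables, their columns, the sample rows, and the thresholds (two orders for an "active" customer, 100 for a large order) are assumptions for the example, and the project's actual database engine and schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per customer, one row per order.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    city TEXT,
    segment TEXT
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    category TEXT,
    brand TEXT,
    amount REAL
);
""")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Austin", "Premium"),
     (2, "Ben", "Boston", "Regular"),
     (3, "Cy", "Austin", "New")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "Electronics", "Acme", 500.0),
     (2, 1, "Grocery", "FreshCo", 40.0),
     (3, 2, "Clothing", "Threads", 120.0),
     (4, 1, "Clothing", "Threads", 60.0),
     (5, 3, "Grocery", "FreshCo", 25.0),
     (6, 2, "Electronics", "Acme", 300.0)],
)

# JOIN + GROUP BY + aggregates: sales per city, largest total first.
sales_by_city = conn.execute("""
    SELECT c.city,
           SUM(o.amount) AS total_sales,
           COUNT(*)      AS order_count,
           AVG(o.amount) AS avg_order
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.city
    ORDER BY total_sales DESC
""").fetchall()

# HAVING filters groups after aggregation: customers with 2 or more orders.
active_customers = conn.execute("""
    SELECT c.name, COUNT(*) AS order_count
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.customer_id, c.name
    HAVING COUNT(*) >= 2
    ORDER BY order_count DESC
""").fetchall()

# WHERE filters rows before aggregation: brand totals over large orders only.
top_brands = conn.execute("""
    SELECT brand, SUM(amount) AS total_sales
    FROM orders
    WHERE amount >= 100
    GROUP BY brand
    ORDER BY total_sales DESC
""").fetchall()
```

The last two queries show the difference between the two filters: `WHERE` removes individual orders before they are grouped, while `HAVING` removes whole groups after the aggregates are computed.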
Power BI Dashboard
- Imported and transformed data using Power Query Editor.
- Designed KPIs for total sales and average transaction value.
- Created charts to visualize revenue by product category.
- Analyzed customer segment performance with bar charts.
- Implemented interactive slicers for time-based analysis.
- Visualized demographic data and customer feedback ratings.
Matplotlib Visuals
- Plotted histograms and density plots for distributions (age, spending).
- Created bar charts for categorical data (sales per category).
- Used line graphs to show trends over time (monthly sales).
- Visualized relationships with heatmaps (correlation matrix).
- Compared groups using box plots and violin plots.
- Combined plots into subplots for comprehensive views.
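
Several of the chart types above can be combined into one figure along these lines. This is a sketch under stated assumptions: the numbers are made-up sample values, the names `ages`, `category_sales`, `monthly_sales`, `spend_by_segment`, and `build_overview` are hypothetical, and the 2x2 layout is just one reasonable arrangement.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up sample values standing in for the cleaned dataset.
ages = [22, 25, 29, 31, 35, 35, 38, 41, 47, 52, 58, 63]
category_sales = {"Grocery": 210.0, "Clothing": 200.0, "Electronics": 900.0}
monthly_sales = {"Jan": 400.0, "Feb": 350.0, "Mar": 560.0}
spend_by_segment = {
    "New": [25, 40, 55],
    "Regular": [120, 150, 300],
    "Premium": [60, 500, 900],
}


def build_overview():
    """Draw four chart types on a 2x2 grid and return the figure and axes."""
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))

    # Histogram: distribution of a numeric column.
    axes[0, 0].hist(ages, bins=5, edgecolor="black")
    axes[0, 0].set_title("Age distribution")
    axes[0, 0].set_xlabel("Age")

    # Bar chart: one bar per category.
    axes[0, 1].bar(list(category_sales.keys()), list(category_sales.values()))
    axes[0, 1].set_title("Sales per category")

    # Line graph: trend over time.
    axes[1, 0].plot(list(monthly_sales.keys()), list(monthly_sales.values()),
                    marker="o")
    axes[1, 0].set_title("Monthly sales")

    # Box plot: compare spend across groups (boxes sit at positions 1..n).
    axes[1, 1].boxplot(list(spend_by_segment.values()))
    axes[1, 1].set_xticks(range(1, len(spend_by_segment) + 1))
    axes[1, 1].set_xticklabels(list(spend_by_segment.keys()))
    axes[1, 1].set_title("Spend by segment")

    fig.tight_layout()
    return fig, axes


fig, axes = build_overview()
```

Calling `fig.savefig("overview.png")` afterwards writes the combined figure to disk; the file name is only an example.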