Project Objective

Build a retail sales analytics system using Python, SQL, and Power BI. The project analyzes customer behavior, product performance, and overall sales trends, with the goal of an industry-ready solution that supports data-driven decisions.

Dataset Overview

The project uses a retail transactional dataset with the following key attributes:

  • Customer Info: ID, Name, Demographics (Age, Gender, Income), and Segment.
  • Transaction Info: Purchase Date, Total Purchases, and Amount Spent.
  • Product Info: Category, Brand, and Type.
  • Order Info: Customer Feedback, Shipping, Payment Method, and Order Status.

Tools & Technologies

  • Python
  • SQL
  • Power BI
  • Excel

Python Analysis

  1. Loaded and cleaned the dataset (handling nulls/duplicates).
  2. Engineered features like customer age groups.
  3. Aggregated data to find total spend and purchases per category.
  4. Analyzed distributions (gender, age) and correlations.
  5. Identified top customers and detected spending outliers.
  6. Generated summary statistics and exported the cleaned data.
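
The steps above can be sketched with pandas roughly as follows. This is a minimal illustration rather than the project's actual notebook: the sample rows, the column names (`customer_id`, `age`, `category`, `amount_spent`), the age bin edges, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import pandas as pd

# Tiny made-up sample; the column names are assumptions, not the real schema.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4, 5],
    "age": [23, 35, 35, 51, None, 44],
    "category": ["Grocery", "Electronics", "Electronics",
                 "Clothing", "Grocery", "Electronics"],
    "amount_spent": [40.0, 300.0, 300.0, 120.0, 55.0, 5000.0],
})

# Step 1: drop exact duplicate rows, then rows with a missing age.
clean = raw.drop_duplicates().dropna(subset=["age"]).copy()

# Step 2: bucket ages into groups (bin edges are illustrative).
clean["age_group"] = pd.cut(
    clean["age"],
    bins=[0, 25, 40, 60, 120],
    labels=["<=25", "26-40", "41-60", "61+"],
)

# Step 3: total spend and number of purchases per category.
per_category = clean.groupby("category")["amount_spent"].agg(["sum", "count"])

# Step 5: top customers by spend, and outliers above Q3 + 1.5 * IQR.
top_customers = clean.groupby("customer_id")["amount_spent"].sum().nlargest(2)
q1, q3 = clean["amount_spent"].quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
outliers = clean[clean["amount_spent"] > upper_fence]

# Step 6: summary statistics; the export filename below is hypothetical.
summary = clean["amount_spent"].describe()
# clean.to_csv("cleaned_retail.csv", index=False)
```

Dropping rows with missing values is only one way to handle nulls; imputing a median age, for instance, would keep more rows at the cost of some distortion in the age distribution.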

SQL Queries

  1. Retrieved records with basic and advanced filtering (`WHERE`).
  2. Performed aggregations (`SUM`, `AVG`, `COUNT`) on key metrics.
  3. Grouped data to analyze sales by city, category, and segment.
  4. Filtered groups with `HAVING` to find active customers.
  5. Ranked results with `ORDER BY` to identify top brands and cities.
  6. Joined tables to combine customer and order information.
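
To show how these clauses fit together, here is a small sketch that runs the SQL from Python using the standard-library `sqlite3` module and an in-memory database. The `customers` and `orders` tables, their columns, the sample rows, and the thresholds (orders of at least 25, customers with at least two orders) are hypothetical; the project's real schema and database engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical two-table schema with a few made-up rows.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT, segment TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY, customer_id INTEGER, brand TEXT, amount REAL);
INSERT INTO customers VALUES
    (1, 'Ana', 'Austin', 'Regular'),
    (2, 'Ben', 'Boston', 'Premium'),
    (3, 'Cy',  'Austin', 'New');
INSERT INTO orders VALUES
    (1, 1, 'Acme', 50.0), (2, 1, 'Acme', 70.0), (3, 2, 'Zenith', 200.0),
    (4, 3, 'Acme', 30.0), (5, 2, 'Zenith', 100.0), (6, 1, 'Zenith', 20.0);
""")

# WHERE filters rows before grouping; SUM and AVG aggregate per city;
# ORDER BY ranks the cities by total sales.
by_city = conn.execute("""
    SELECT c.city, SUM(o.amount) AS total_sales, AVG(o.amount) AS avg_order
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.amount >= ?
    GROUP BY c.city
    ORDER BY total_sales DESC
""", (25.0,)).fetchall()

# HAVING filters whole groups after aggregation: keep customers with 2+ orders.
active = conn.execute("""
    SELECT c.name, COUNT(*) AS n_orders, SUM(o.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    HAVING COUNT(*) >= 2
    ORDER BY total_spent DESC
""").fetchall()

print(by_city)
print(active)
```

The key distinction is timing: `WHERE` removes individual rows before aggregation, while `HAVING` removes entire groups afterwards, which is why the "active customers" condition on `COUNT(*)` has to live in `HAVING`.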

Power BI Dashboard

  1. Imported and transformed data using Power Query Editor.
  2. Designed KPIs for total sales and average transaction value.
  3. Created charts to visualize revenue by product category.
  4. Analyzed customer segment performance with bar charts.
  5. Implemented interactive slicers for time-based analysis.
  6. Visualized demographic data and customer feedback ratings.

Matplotlib Visuals

  1. Plotted histograms and density plots for distributions (age, spending).
  2. Created bar charts for categorical data (sales per category).
  3. Used line graphs to show trends over time (monthly sales).
  4. Visualized relationships with heatmaps (correlation matrix).
  5. Compared groups using box plots and violin plots.
  6. Combined plots into subplots for comprehensive views.
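
As a rough illustration of combining several of these chart types into one figure, the sketch below builds a 2 × 2 grid of subplots covering a histogram, a bar chart, a line graph, and a box plot. All data is synthetic, and the category names, segment labels, and output filename are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data standing in for the cleaned retail dataset.
rng = np.random.default_rng(0)
ages = rng.integers(18, 70, size=200)
categories = ["Grocery", "Electronics", "Clothing"]
sales = [120, 300, 180]
months = np.arange(1, 13)
monthly_sales = 100 + 10 * months
spend_by_segment = [rng.normal(50, 10, 100), rng.normal(80, 20, 100)]

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Distribution of a numeric column.
axes[0, 0].hist(ages, bins=10)
axes[0, 0].set_title("Age distribution")

# Categorical comparison.
axes[0, 1].bar(categories, sales)
axes[0, 1].set_title("Sales per category")

# Trend over time.
axes[1, 0].plot(months, monthly_sales, marker="o")
axes[1, 0].set_title("Monthly sales")

# Group comparison; box positions default to 1 and 2, so two labels fit.
axes[1, 1].boxplot(spend_by_segment)
axes[1, 1].set_xticklabels(["Regular", "Premium"])
axes[1, 1].set_title("Spend by segment")

fig.tight_layout()
fig.savefig("retail_overview.png")  # hypothetical output filename
```

A correlation heatmap could be added to the same grid with `Axes.imshow` on the output of `DataFrame.corr()`, and `Axes.violinplot` accepts the same list-of-arrays input as `boxplot`.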