Skip to content

End-to-end SQL project analyzing Walmart sales data using PostgreSQL. Includes data cleaning with Python(pandas), business insights, and advanced SQL queries to solve real-world retail problems

Notifications You must be signed in to change notification settings

shreyashhedaoo181-png/Walmart_sales_Analysis_using_SQL_Python

 
 

Repository files navigation

Walmart Sales Analysis Using SQL and Python 🛒📊

Walmart Sales Analysis
Releases

Table of Contents

  1. Project Overview
  2. Technologies Used
  3. Installation
  4. Usage
  5. Data Cleaning
  6. Business Insights
  7. Advanced SQL Queries
  8. Contributing
  9. License
  10. Contact

Project Overview

The Walmart Sales Analysis project aims to analyze Walmart's sales data through an end-to-end process. This project utilizes PostgreSQL for data storage and querying, along with Python's pandas library for data cleaning and manipulation. The goal is to extract meaningful business insights and solve real-world retail problems using advanced SQL queries.

You can find the latest releases here. This link will guide you to download and execute the necessary files.

Technologies Used

  • PostgreSQL: A powerful, open-source object-relational database system.
  • Python: A programming language used for data analysis.
  • Pandas: A Python library for data manipulation and analysis.
  • SQL: Structured Query Language used for managing and querying data.
  • Jupyter Notebook: An open-source web application for creating and sharing documents that contain live code, equations, visualizations, and narrative text.

Installation

To set up this project on your local machine, follow these steps:

  1. Clone the repository:

    git clone https://github.com/iranboono/Walmart_sales_Analysis_using_SQL_Python.git
  2. Navigate to the project directory:

    cd Walmart_sales_Analysis_using_SQL_Python
  3. Install required Python packages:

    pip install pandas psycopg2
  4. Set up PostgreSQL:

    • Ensure you have PostgreSQL installed.
    • Create a database for the project.
    • Update the connection settings in the Python scripts as needed.

Usage

To run the analysis, execute the provided Jupyter Notebook. This notebook includes all necessary steps from data cleaning to generating insights.

  1. Launch Jupyter Notebook:

    jupyter notebook
  2. Open the notebook file: Locate the notebook file in your browser and open it.

  3. Run the cells: Execute each cell in the notebook to perform the analysis.

Data Cleaning

Data cleaning is a crucial step in any data analysis project. In this project, we used Python's pandas library to handle missing values, correct data types, and remove duplicates. The following steps outline the data cleaning process:

  1. Load the data: Import the sales data into a pandas DataFrame.
  2. Check for missing values: Identify and handle any missing data points.
  3. Correct data types: Ensure all columns have the correct data types for analysis.
  4. Remove duplicates: Eliminate any duplicate records that may skew the results.

Business Insights

After cleaning the data, we focused on generating actionable business insights. The analysis covers:

  • Sales trends: Identify peak sales periods and seasonal trends.
  • Product performance: Determine which products perform best and worst.
  • Customer demographics: Analyze customer behavior based on demographic data.
  • Store performance: Compare sales across different store locations.

These insights can guide Walmart's marketing strategies and inventory management.

Advanced SQL Queries

This project includes several advanced SQL queries designed to extract meaningful information from the sales database. Some examples include:

  • Top-selling products:

    SELECT product_name, SUM(sales) AS total_sales
    FROM sales_data
    GROUP BY product_name
    ORDER BY total_sales DESC
    LIMIT 10;
  • Monthly sales trends:

    SELECT DATE_TRUNC('month', sale_date) AS month, SUM(sales) AS total_sales
    FROM sales_data
    GROUP BY month
    ORDER BY month;
  • Customer segmentation:

    SELECT customer_segment, COUNT(DISTINCT customer_id) AS total_customers
    FROM sales_data
    GROUP BY customer_segment;

These queries provide insights that can help Walmart make informed business decisions.

Contributing

We welcome contributions to this project. If you have suggestions or improvements, please fork the repository and submit a pull request. For larger changes, consider opening an issue to discuss it first.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For questions or feedback, please reach out via GitHub. You can also find the latest releases here.


Feel free to explore the project and utilize the insights gained from the analysis. Happy analyzing!

About

End-to-end SQL project analyzing Walmart sales data using PostgreSQL. Includes data cleaning with Python(pandas), business insights, and advanced SQL queries to solve real-world retail problems

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Jupyter Notebook 100.0%