Walmart Sales Forecasting
Data science plays a major role in forecasting sales and risk in the retail sector. Most leading retail stores use data science to keep track of customer needs and make better business decisions. Walmart is one of the largest retailers to use this approach.
Project Goal: Analyze the Walmart Sales Data set to predict department-wide sales for each store. Generate a report for the regional manager to support your predictions.
Data Set Description: The data set used for this project contains historical training data covering sales from 2010-02-05 to 2012-11-01. The data for this analysis is split across two tabs: store_data and details.
The store_data tab holds weekly sales from 2010-02-05 to 2012-11-01 and has the following fields:
- Store – the store number
- Dept – the department number
- Date – the week
- Weekly_Sales – sales for the given department in the given store
The details tab has the following fields:
- Store – the store number
- Date – the week
- Temperature – average temperature in the region
- Fuel_Price – cost of fuel in the region
- CPI – the consumer price index
- Unemployment – the unemployment rate
- IsHoliday – whether the week is a special holiday week
By studying how the response variable (Weekly_Sales) depends on these predictor variables, you can forecast sales for the upcoming months.
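Because both tabs share the Store and Date fields, each sales row can be lined up with the regional conditions for the same store and week. The project itself is done in Excel or Google Sheets, so the following Python is only a minimal sketch of that matching logic. The rows are invented toy values, not taken from proj2.xlsx, and the function name `combine_for_store` is hypothetical.

```python
# Toy rows only: invented for illustration, NOT values from proj2.xlsx.
store_data = [
    {"Store": 7, "Dept": 1, "Date": "2010-02-05", "Weekly_Sales": 100.0},
    {"Store": 7, "Dept": 2, "Date": "2010-02-05", "Weekly_Sales": 50.0},
    {"Store": 7, "Dept": 1, "Date": "2010-02-12", "Weekly_Sales": 120.0},
    {"Store": 7, "Dept": 1, "Date": "2010-02-19", "Weekly_Sales": 90.0},
    {"Store": 8, "Dept": 1, "Date": "2010-02-05", "Weekly_Sales": 999.0},
]
details = [
    {"Store": 7, "Date": "2010-02-05", "Temperature": 40.0, "Fuel_Price": 2.5,
     "CPI": 210.0, "Unemployment": 8.0, "IsHoliday": False},
    {"Store": 7, "Date": "2010-02-12", "Temperature": 38.0, "Fuel_Price": 2.6,
     "CPI": 210.5, "Unemployment": 8.0, "IsHoliday": True},
    {"Store": 8, "Date": "2010-02-05", "Temperature": 55.0, "Fuel_Price": 2.4,
     "CPI": 215.0, "Unemployment": 6.5, "IsHoliday": False},
]


def combine_for_store(store_data, details, store):
    """Join the two tabs on (Store, Date), keeping rows for one store only."""
    # Index the details rows by their (Store, Date) key for quick lookup.
    lookup = {(d["Store"], d["Date"]): d for d in details}
    combined, unmatched = [], []
    for row in store_data:
        if row["Store"] != store:
            continue
        extra = lookup.get((row["Store"], row["Date"]))
        if extra is None:
            # No details for this week: a candidate for the "Data Cleaning" tab.
            unmatched.append(row)
        else:
            combined.append({**extra, **row})
    return combined, unmatched


combined, unmatched = combine_for_store(store_data, details, store=7)
for row in combined:
    print(row["Date"], row["Dept"], row["Weekly_Sales"], row["Temperature"])
print("rows without matching details:", len(unmatched))
```

Each combined row carries both the sales figure and the predictors for that week, which is the shape a regression on fields from details needs.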
Requirements:
- Download the data set: The data set is provided in Blackboard and is named “proj2.xlsx”. All of your processing for Project 2 will be done in Excel or Google Sheets.
- Data Cleaning: Choose one store (there are 45) and clean its data. Pick the store at random, or use your favorite number: if you and other students choose the same store, your projects will be reviewed in more detail. Create one tab with the combined information from store_data and details for that one store. At this stage, you must remove all inconsistencies, such as missing values and redundant variables. If you remove or correct any data in your store’s data set, create a tab named “Data Cleaning” that lists what you changed and why.
- Data Exploration: At this stage, you can create box plots of your individual store’s sales over time. Group the data by quarter (three-month periods starting in the first month for which you have data). Instructions for creating a box plot are here: https://support.microsoft.com/en-us/office/create-a-box-plot-10204530-8cdf-40fe-a711-2eb9785e510f
- Data Modelling: Because the outcome (weekly sales) is a continuous variable, it is reasonable to build a regression model. Linear regression suits this kind of problem because it predicts a continuous dependent variable. Choose two of the fields from details and perform a multiple regression on those two variables by hand. Check your answers with Excel’s built-in data analysis tools.
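For the by-hand multiple regression with two predictors, one common route is to work with sums of squared and cross-product deviations from the means. Writing S11, S22, S12, S1y, and S2y for those sums, the least-squares coefficients are b1 = (S22*S1y - S12*S2y) / (S11*S22 - S12^2), b2 = (S11*S2y - S12*S1y) / (S11*S22 - S12^2), and b0 = mean(y) - b1*mean(x1) - b2*mean(x2). The Python below is a minimal sketch of that arithmetic, useful only as a cross-check on hand or spreadsheet work. The function name `fit_two_predictors` is hypothetical, and the numbers are toy values generated from y = 1 + 2*x1 + 3*x2, not data from proj2.xlsx.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via deviation sums."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

    # Sums of squares and cross-products of deviations from the means.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))

    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        # The two predictors are perfectly collinear; no unique solution.
        raise ValueError("predictors are perfectly collinear")

    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Toy data (assumption): built so that y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With real store data, x1 and x2 would be the two chosen columns from details and y would be the weekly sales. The coefficients should agree with the regression output from Excel’s data analysis tools up to rounding.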
Forecast Sales: Using the instructions described here – project the sales for the next two