
This project segments customers of an online retail store by their purchasing behavior, using clustering algorithms such as KMeans and DBSCAN. By applying RFM (Recency, Frequency, Monetary) analysis, it identifies distinct customer segments and provides tailored product recommendations to help improve customer engagement and sales.


mehdighelich1379/Customer-Segmentation-and-Recommendation


Customer Segmentation and Marketing Strategy using Online Retail Dataset

Dataset Overview:

The Online Retail dataset consists of the following columns:

  1. InvoiceNo: Unique identifier for each invoice.
  2. StockCode: Unique identifier for each product.
  3. Description: A description of the product.
  4. Quantity: The number of units purchased.
  5. InvoiceDate: The date when the purchase was made.
  6. UnitPrice: The price per unit of the product.
  7. CustomerID: Unique identifier for each customer.
  8. Country: The country where the customer made the purchase.

Project Steps:

  1. Exploratory Data Analysis (EDA):

    • The first step was to analyze the dataset using EDA. This helped me understand the structure of the data, check for missing values, and identify any outliers or patterns.
    • I found that there were some missing CustomerID values, which I handled by either removing those rows or imputing values where necessary.
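As a minimal sketch of the checks described in this step, the snippet below builds a tiny synthetic table with the dataset's column names (all values are invented for illustration), counts missing values per column, and drops rows without a CustomerID. Dropping is only one of the two options mentioned above; the author's actual notebook may differ.

```python
import pandas as pd

# Synthetic rows using the dataset's column names; all values are invented.
df = pd.DataFrame({
    "InvoiceNo": ["10001", "10002", "10003", "10004"],
    "StockCode": ["P01", "P02", "P03", "P01"],
    "Description": ["item A", "item B", None, "item A"],
    "Quantity": [6, 2, 1, 1200],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-05", "2011-01-06", "2011-01-06", "2011-01-07"]
    ),
    "UnitPrice": [2.5, 10.0, 4.0, 2.5],
    "CustomerID": [17850.0, None, 13047.0, 17850.0],
    "Country": ["United Kingdom", "France", "United Kingdom", "United Kingdom"],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Summary statistics help spot outliers such as unusually large quantities.
quantity_summary = df["Quantity"].describe()

# Drop rows that cannot be attributed to a customer.
cleaned = df.dropna(subset=["CustomerID"])

print(missing_counts)
print(quantity_summary)
print(len(df), "->", len(cleaned), "rows")
```

Rows without a CustomerID cannot be assigned to any customer, so they contribute nothing to a per-customer RFM table, which is why dropping them is a common choice here.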
  2. Feature Engineering using RFM:

    • The main objective of the project was to identify customer segments and suggest marketing strategies. To achieve this, I used the RFM (Recency, Frequency, Monetary) technique, which helps in segmenting customers based on their purchase behavior:
      • Recency (R): How recently a customer made a purchase.
      • Frequency (F): How often a customer makes a purchase.
      • Monetary (M): How much money a customer spends.
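The three RFM features can be derived with a single group-by per customer. The sketch below is one common way to do it, not necessarily the author's: it assumes Recency is measured in days from a snapshot date set one day after the last invoice, Frequency is the number of distinct invoices, and Monetary is the sum of Quantity times UnitPrice. The `TotalPrice` column and the `snapshot` variable are names introduced here for illustration, and the transactions are synthetic.

```python
import pandas as pd

# Synthetic transactions; values are invented for illustration.
tx = pd.DataFrame({
    "InvoiceNo": ["A1", "A2", "A3", "B1"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-05", "2011-03-01", "2011-03-01", "2011-02-10"]
    ),
    "Quantity": [2, 1, 3, 10],
    "UnitPrice": [5.0, 20.0, 1.0, 2.5],
    "CustomerID": [1, 1, 1, 2],
})

# Revenue per line item (helper column, not part of the original dataset).
tx["TotalPrice"] = tx["Quantity"] * tx["UnitPrice"]

# Assumed reference date: one day after the most recent invoice.
snapshot = tx["InvoiceDate"].max() + pd.Timedelta(days=1)

rfm = tx.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),  # days since last purchase
    Frequency=("InvoiceNo", "nunique"),                            # distinct invoices
    Monetary=("TotalPrice", "sum"),                                # total spend
)

print(rfm)
```

With this definition a smaller Recency value means a more recent purchase, which is worth keeping in mind when interpreting cluster averages later.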
  3. Clustering with DBSCAN:

    • I initially applied DBSCAN (Density-Based Spatial Clustering of Applications with Noise) to identify clusters. Using this method, I found 57 clusters, but this was too many for my goal, which was to have only 6 clusters.
    • Although the Silhouette Score for DBSCAN was 0.99 (reported as 99%; the score ranges from -1 to 1), I decided to switch to a more controlled clustering technique that would give me exactly 6 clusters.
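A rough sketch of this step with scikit-learn is shown below. The data is synthetic (three tight groups plus one isolated point), and the `eps` and `min_samples` values are assumptions chosen for this toy data, not the author's settings. DBSCAN labels noise points as -1, so they are excluded both from the cluster count and from the silhouette calculation here; whether the original analysis did the same is not stated.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic RFM-like points: three tight groups plus one isolated customer.
offsets = np.array(
    [[0, 0, 0], [0.1, 0, 0], [0, 0.1, 0], [0, 0, 0.1], [0.1, 0.1, 0.1]]
)
centers = np.array([[0, 0, 0], [10, 10, 10], [20, 0, 10]])
groups = [center + offsets for center in centers]
X = np.vstack(groups + [np.array([[50.0, 50.0, 50.0]])])

# Scale features so no single RFM dimension dominates the distance.
X_scaled = StandardScaler().fit_transform(X)

# eps and min_samples are illustrative values for this toy data.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X_scaled)

# DBSCAN marks noise with the label -1; it is not a real cluster.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))

# Silhouette computed on clustered points only.
mask = labels != -1
score = silhouette_score(X_scaled[mask], labels[mask])

print(n_clusters, "clusters,", n_noise, "noise point(s), silhouette:", round(score, 3))
```

Because DBSCAN chooses the number of clusters itself, many small and very dense groups can produce a high silhouette score while still being too fine-grained for marketing purposes, which is consistent with the trade-off described above.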
  4. Clustering with KMeans:

    • I used KMeans clustering to fix the number of clusters at 6. After applying KMeans, the Silhouette Score dropped to 0.48 (48%), which was acceptable for this type of analysis and indicates a decent clustering result.
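The following is a minimal sketch of fitting KMeans with a fixed cluster count and scoring it with the silhouette metric. The data is randomly generated around six invented centers, and `n_init` and `random_state` are assumptions made for reproducibility; the score it prints says nothing about the score obtained on the real dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic RFM-like data: 20 noisy points around each of six invented centers.
rng = np.random.default_rng(0)
centers = np.array(
    [[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10], [10, 10, 0], [10, 10, 10]],
    dtype=float,
)
X = np.vstack([c + rng.normal(scale=1.0, size=(20, 3)) for c in centers])

# KMeans is distance-based, so features are standardized first.
X_scaled = StandardScaler().fit_transform(X)

# Fix the number of clusters at 6, as in the project.
kmeans = KMeans(n_clusters=6, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)

# Silhouette score lies in [-1, 1]; higher means better-separated clusters.
score = silhouette_score(X_scaled, labels)

print("clusters:", len(np.unique(labels)), "silhouette:", round(score, 3))
```

Unlike DBSCAN, KMeans assigns every point to a cluster, so there are no noise points to exclude before scoring.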
  5. Cluster Profiling:

    • For each of the 6 clusters, I calculated the average Monetary, Recency, and Frequency values. This helped me understand the characteristics of each cluster, such as whether they consisted of high-spending or frequent buyers, or if they were new or less engaged customers.
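Profiling amounts to averaging the RFM columns per cluster label. The sketch below assumes a per-customer table with a `Cluster` column (the column name and all values are illustrative) and adds a customer count per cluster, which is not mentioned in the step above but is cheap to compute alongside the means.

```python
import pandas as pd

# Synthetic per-customer RFM table with cluster labels already assigned.
rfm = pd.DataFrame({
    "Recency": [5, 15, 200, 180, 40, 60],
    "Frequency": [10, 14, 1, 3, 4, 2],
    "Monetary": [1500.0, 2500.0, 100.0, 300.0, 400.0, 200.0],
    "Cluster": [0, 0, 1, 1, 2, 2],
})

# Mean Recency, Frequency, and Monetary per cluster.
profile = rfm.groupby("Cluster")[["Recency", "Frequency", "Monetary"]].mean()

# Number of customers in each cluster, aligned on the cluster index.
profile["Customers"] = rfm.groupby("Cluster").size()

print(profile)
```

In this toy data, cluster 0 would read as recent, frequent, high-spending customers, while cluster 1 would read as lapsed, low-value customers.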
  6. Marketing Strategy:

    • Based on the RFM values and the characteristics of each cluster, I devised personalized marketing strategies to suggest to each group. For example:
      • For clusters with high Monetary and Frequency, I suggested loyalty programs or special offers to reward frequent shoppers.
      • For customers who have not purchased recently (poor Recency) but have high Monetary value, I recommended re-engagement campaigns to encourage them to make another purchase.
    • I stored these strategies in a DataFrame called marketing_strategy.
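One way to turn cluster profiles into a `marketing_strategy` DataFrame is a small rule function applied per cluster. The rules below are a hypothetical sketch: comparing each cluster to the median across clusters, the `pick_strategy` helper, and the "General promotion" fallback are all assumptions made for illustration, and only the two example strategies come from the text above. Recency is assumed to be days since the last purchase, so a larger value means a less recent customer.

```python
import pandas as pd

# Synthetic cluster profile (mean RFM per cluster); values are invented.
profile = pd.DataFrame(
    {
        "Recency": [10.0, 200.0, 60.0],      # days since last purchase
        "Frequency": [12.0, 2.0, 3.0],
        "Monetary": [2000.0, 1500.0, 100.0],
    },
    index=pd.Index([0, 1, 2], name="Cluster"),
)

# Hypothetical thresholds: the median of each metric across clusters.
medians = profile.median()

def pick_strategy(row):
    """Map one cluster's mean RFM values to a strategy label."""
    high_value = row["Monetary"] >= medians["Monetary"]
    frequent = row["Frequency"] >= medians["Frequency"]
    lapsed = row["Recency"] > medians["Recency"]
    if high_value and frequent:
        return "Loyalty program or special offers"
    if high_value and lapsed:
        return "Re-engagement campaign"
    return "General promotion"  # assumed fallback, not from the original text

marketing_strategy = (
    profile.apply(pick_strategy, axis=1).rename("Strategy").reset_index()
)

print(marketing_strategy)
```

Keeping the rules in one function makes it easy to review and adjust the thresholds once the real cluster profiles are known.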
  7. Visualization:

    • I visualized the number of customers in each cluster using a bar chart, which showed the distribution of customers across the 6 clusters.
    • I also created a mean RFM chart for each cluster to visualize how each cluster performed on the Recency, Frequency, and Monetary metrics.
    • Finally, I created a pie chart to show the percentage of customers in each cluster, providing a clear overview of the distribution of customers across the segments.
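The three charts described above can be sketched with matplotlib as follows. The cluster sizes and mean RFM values are invented, the non-interactive `Agg` backend is an assumption so that the snippet runs without a display, and the layout (three panels side by side) is a choice made here rather than the author's.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to show windows locally
import matplotlib.pyplot as plt
import pandas as pd

# Invented customer counts and mean RFM values for six clusters.
names = [f"Cluster {i}" for i in range(6)]
cluster_sizes = pd.Series([120, 45, 35, 80, 60, 60], index=names)
mean_rfm = pd.DataFrame(
    {
        "Recency": [10, 200, 60, 30, 120, 90],
        "Frequency": [12, 2, 3, 8, 1, 4],
        "Monetary": [2000, 1500, 100, 900, 50, 400],
    },
    index=names,
)

fig, (ax_bar, ax_rfm, ax_pie) = plt.subplots(1, 3, figsize=(16, 4))

# Bar chart: number of customers per cluster.
bars = ax_bar.bar(list(cluster_sizes.index), cluster_sizes.values)
ax_bar.set_title("Customers per cluster")
ax_bar.tick_params(axis="x", rotation=45)

# Grouped bar chart: mean Recency, Frequency, and Monetary per cluster.
mean_rfm.plot(kind="bar", ax=ax_rfm)
ax_rfm.set_title("Mean RFM per cluster")

# Pie chart: share of customers in each cluster.
wedges, texts, autotexts = ax_pie.pie(
    cluster_sizes.values, labels=list(cluster_sizes.index), autopct="%1.1f%%"
)
ax_pie.set_title("Share of customers")

fig.tight_layout()
```

Because Monetary is usually on a much larger scale than Recency and Frequency, plotting the three metrics on one axis can hide the smaller ones; standardizing the means or using separate panels per metric is a reasonable alternative.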

Conclusion:

In this project, I performed customer segmentation using the RFM model. I first tried DBSCAN and then used KMeans to obtain 6 customer segments. Based on the characteristics of each cluster, I developed personalized marketing strategies for each group. The visualizations give a clear picture of the customer distribution and purchasing behavior. The final marketing strategies are stored in a DataFrame and are ready for implementation.


Skills Demonstrated:

  1. Data Cleaning and Preprocessing: Handling missing values, outliers, and transforming the dataset for clustering.
  2. Feature Engineering: Using the RFM model to create relevant features (Recency, Frequency, Monetary) for customer segmentation.
  3. Clustering Algorithms: Applying DBSCAN and KMeans to perform customer segmentation.
  4. Cluster Profiling: Analyzing and interpreting the characteristics of each cluster based on RFM values.
  5. Marketing Strategy Development: Creating targeted marketing strategies based on customer segmentation.
  6. Data Visualization: Using bar charts, pie charts, and other visualizations to communicate cluster characteristics and customer distribution.

This project showcases a strong understanding of customer segmentation, clustering techniques, and marketing strategy formulation, all of which are essential for businesses looking to improve customer engagement and retention.
