Introduction to Segmentation and Clustering


If you work with data, you have probably heard the terms "cluster" and "segmentation". But what do they really mean, and what is the difference between them?

Let's start with clustering. Clustering is a statistical method for grouping similar objects into clusters. Segmentation, in contrast, is the process of grouping customers by similarity.

The two terms may sound alike, but they are not the same thing. Let's see how they differ.

Segmentation:

Recall our definition: segmentation is the process of grouping customers based on similarity. Once customers are grouped, you know whom to target. For example, suppose you own a new consulting firm that uses data analytics to improve decision making. Your target is small businesses that opened within the last year, since businesses in this category are the most likely to need your services. Identifying business owners with less than a year of experience is a segmentation process. This matters because many established companies already understand data analysis and do not need such services. Marketing should then be tailored to this segment.

But is this market segment still too large? Not all new small business owners need your services. What if we add another dimension to improve specificity, such as industry? The segment can be narrowed so that only new companies in the fitness industry are included. We add this dimension on the assumption that business owners in this industry know less about data analysis than those in industries like retail, pharmaceuticals, transportation, and banking. To refine the segment, we remove small businesses outside the fitness industry. However, not everyone in this newly defined segment wants your services either. What about location? Many online businesses are run by one person, so they are unlikely to have data analytics experience. For this reason, we also remove business owners who are not online.

The idea is to keep segmenting the market until you reach the ideal segment, sometimes called the Holy Grail of marketing. But how many dimensions can you add? Segmentation becomes more difficult as the number of variables, such as industry type, location and company size, continues to grow. The aim of market segmentation is to find relationships between variables in order to predict customer behavior, and it may not be feasible to examine hundreds of variables by hand to find those relationships.
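The step-by-step narrowing described above can be sketched as a series of filters, one per dimension. This is only an illustration: the `Business` record, its field names, and the sample data are hypothetical and do not come from a real dataset.

```python
from dataclasses import dataclass


@dataclass
class Business:
    name: str
    age_years: float  # time since the business opened
    industry: str
    online: bool


# Hypothetical records, made up purely for illustration.
businesses = [
    Business("FitStart", 0.5, "fitness", True),
    Business("IronGym", 0.8, "fitness", False),
    Business("QuickShop", 0.3, "retail", True),
    Business("YogaFlow", 3.0, "fitness", True),
]


def segment(records):
    """Narrow the market one dimension at a time."""
    new = [b for b in records if b.age_years < 1]          # opened within a year
    fitness = [b for b in new if b.industry == "fitness"]  # fitness industry only
    online = [b for b in fitness if b.online]              # online businesses only
    return online


target = segment(businesses)
print([b.name for b in target])  # ['FitStart']
```

Each extra dimension means another hand-written rule, which is why this approach becomes hard to maintain as the number of variables grows.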

This is where clustering comes into play.

Clustering lets you find relationships between data points and group them into segments. By clustering your data, you can use machine learning algorithms to discover new customer segments and buying behaviors. In short, clustering approaches the problem from a statistical perspective, while segmentation approaches it from a business perspective.

The concept of Distance

Clustering uses machine learning algorithms to determine the relationships between different variables and to create new segments based on these relationships. Most clustering algorithms assign each object to a cluster whose other members are similar to it. To do this, the algorithm relies on the concept of distance, which is a way of measuring similarity: the smaller the distance between two objects, the more similar they are. The clustering algorithm tries to keep the members of each segment as similar, or as close together, as possible.
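To make the idea of distance concrete, here is a minimal sketch that uses Euclidean distance. Euclidean distance is only one common choice and is an assumption here, since the text does not name a specific measure. The customers and their two features (age and monthly spend) are also hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points with the same number of features."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical customers described by (age, monthly spend).
alice = (25, 40)
bob = (28, 44)
carol = (60, 200)

print(euclidean(alice, bob))    # 5.0
print(euclidean(alice, carol))  # much larger, so Carol is less similar to Alice
```

Alice and Bob are close together, so a distance-based algorithm would tend to place them in the same cluster, while Carol would likely end up in another one.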

Is Clustering a Supervised or Unsupervised learning technique?

A supervised technique builds a model that uses predictors to predict a target variable. The model learns from the target variable while it is being built. Multiple regression is an example of this kind of learning. In unsupervised learning, in contrast, there is no target variable to guide the model.

Clustering is an example of an unsupervised technique. It uses different variables to group the data without a target to guide the logic. At the end of the process, a new variable, the cluster number, is created for each record.
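As a rough illustration of how a cluster number can be produced without any target variable, here is a bare-bones version of k-means. The article does not name a particular algorithm, so k-means is chosen here only as a familiar example. The data points, the naive "first k points" initialisation, and the fixed iteration count are all simplifying assumptions.

```python
import math


def kmeans(points, k, iterations=10):
    """Very small k-means sketch: returns one cluster number per point."""
    # Naive initialisation (an assumption): use the first k points as centres.
    centres = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster ends up empty
                centres[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


# Hypothetical two-feature data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
cluster_number = kmeans(points, k=2)
print(cluster_number)  # [0, 0, 0, 1, 1, 1]
```

Note that no labels were supplied: the grouping comes entirely from the distances between the points. In practice you would more likely use a tested library implementation than a hand-written loop like this one.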

Some basic applications of Cluster Analysis:

Market segmentation: Grouping people who are likely to buy a certain product based on their similarity. This application is similar to the example shown above.
Sales segmentation: Clustering shows what kinds of people buy a certain product.
Insurance: Clustering helps identify fraudulent insurance claims.
Education planning: Universities can be grouped based on tuition fees, location, quality of education and types of courses offered.

That brings us to the end of this article. You should now be able to distinguish between segmentation and clustering and understand the basic concept of cluster analysis. This was only a basic introduction. The next article focuses on preparing data for cluster analysis and provides more information on the subject.