ARM

OVERVIEW


Association Rule Mining (ARM) is a data mining technique used to identify patterns and relationships between different items in large datasets. It is commonly applied in market basket analysis, where businesses analyze customer purchase behavior to discover which products are frequently bought together. ARM helps uncover hidden correlations that can be used for recommendation systems, cross-selling strategies, and decision-making.

Key Measures in ARM

What are Association Rules?

An association rule is an implication of the form X → Y, meaning "if X occurs, then Y is likely to occur."

For example, in retail analysis, a rule like:
{Bread, Butter} → {Milk}

suggests that customers who buy bread and butter are likely to buy milk as well.
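The key measures used to score such rules (support, confidence, and lift) can be computed directly. The sketch below uses a small set of hypothetical toy transactions, not data from this project; the function names are illustrative.

```python
# Hypothetical toy transactions (not the project's dataset)
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Butter", "Milk"},
    {"Bread", "Butter", "Milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """P(consequent | antecedent) = support(X and Y) / support(X)."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))

def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline frequency."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))

# Rule {Bread, Butter} -> {Milk}
print(support({"Bread", "Butter", "Milk"}, transactions))       # 0.4
print(confidence({"Bread", "Butter"}, {"Milk"}, transactions))  # ≈ 0.667
print(lift({"Bread", "Butter"}, {"Milk"}, transactions))        # ≈ 0.833
```

Representing each transaction as a set makes the "contains all items" check a simple subset test.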

Itemsets in ARM

The Apriori Algorithm: How It Works


Apriori is one of the most popular algorithms for Association Rule Mining (ARM). It works by iteratively finding frequent itemsets and generating rules from them.

Steps of the Apriori Algorithm

1. Set a minimum support threshold.
2. Scan the transactions to find all frequent individual items.
3. Generate candidate itemsets of size k by joining frequent itemsets of size k-1.
4. Prune candidates that contain any infrequent subset, then count support and keep only the frequent candidates.
5. Repeat steps 3-4 with larger k until no new frequent itemsets are found.
6. Generate association rules from the frequent itemsets, keeping those that meet a minimum confidence threshold.

The Apriori algorithm is efficient because it uses the "Apriori Principle", which states:

If an itemset is infrequent, all its supersets must also be infrequent.

This helps in reducing the search space and improving performance.
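The pruning idea can be seen in a minimal Apriori sketch. This is an illustration on hypothetical toy transactions, not the project's code.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori sketch: grow frequent itemsets level by level,
    pruning candidates that the Apriori principle rules out."""
    n = len(transactions)

    def is_frequent(itemset):
        # Support = fraction of transactions containing the itemset
        return sum(itemset <= t for t in transactions) / n >= min_support

    # Level 1: frequent individual items
    items = {i for t in transactions for i in t}
    frequent = {frozenset([i]) for i in items if is_frequent(frozenset([i]))}
    all_frequent = set(frequent)

    k = 2
    while frequent:
        # Join: build size-k candidates from frequent (k-1)-itemsets
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune (Apriori principle): a candidate with an infrequent
        # (k-1)-subset cannot be frequent, so drop it before counting
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
        frequent = {c for c in candidates if is_frequent(c)}
        all_frequent |= frequent
        k += 1
    return all_frequent

transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Butter", "Milk"},
    {"Bread", "Butter", "Milk"},
]
print(apriori(transactions, min_support=0.4))
```

The prune step is where the search space shrinks: candidates are discarded before their support is ever counted.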

Comparison of Clustering vs. ARM

Clustering groups similar items together, as shown by bread and butter in the same circle. In contrast, ARM identifies relationships between items, represented by the arrow from bread to butter, implying a purchasing pattern. While clustering finds inherent groups, ARM focuses on if-then associations, making both techniques valuable for different types of data analysis.

💻 CODE FOR IMPLEMENTATION OF ARM

DATA SELECTION AND PREPROCESSING


Dataset Before Preparing for ARM
πŸ—ƒοΈ CLICK HERE TO VIEW THE DATASET BEFORE PREPARING FOR ARM

The dataset was carefully preprocessed to ensure it met the requirements for Association Rule Mining (ARM). First, numerical attributes that were not relevant for identifying associations were removed. Categorical variables were then transformed into a transactional format using one-hot encoding, where each row represents a unique transaction containing different categorical attributes. This step ensures that the dataset is structured for frequent pattern mining. Finally, the cleaned data was formatted into a CSV file without headers, making it compatible with ARM techniques like the Apriori algorithm for uncovering meaningful patterns and associations.
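A minimal sketch of this preprocessing, assuming a pandas workflow; the column names (job, marital, age) are hypothetical stand-ins for the real schema.

```python
import pandas as pd

# Tiny hypothetical frame standing in for the project's dataset
df = pd.DataFrame({
    "job":     ["student", "admin", "student"],
    "marital": ["single", "married", "single"],
    "age":     [21, 35, 22],          # numeric: dropped below
})

# 1. Keep only the categorical attributes
cat = df.select_dtypes(include="object")

# 2. One-hot encode so each column=value pair becomes a binary item
onehot = pd.get_dummies(cat, prefix_sep="=").astype(bool)

# 3. Each row is now a transaction; write its item names
#    to a headerless CSV for basket-style ARM tools
baskets = onehot.apply(lambda r: ",".join(onehot.columns[r]), axis=1)
baskets.to_csv("transactions.csv", index=False, header=False)
```

Prefixing each value with its column name (e.g. `job=student`) keeps identical values from different columns distinct in the transaction format.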

Dataset After Preparing for ARM
πŸ—ƒοΈ CLICK HERE TO VIEW THE DATASET AFTER PREPARING FOR ARM

APPLYING ASSOCIATION RULE MINING


Association Rule Mining (ARM) is used in this project to uncover hidden relationships between different features in the dataset. It helps identify patterns, such as which factors are strongly linked to a successful marketing campaign. This is useful for decision-making, such as improving customer targeting strategies.

Key Measures in ARM

In the context of this project, ARM uses three metrics to assess the strength of discovered patterns: Support, Confidence, and Lift.

Association Rules in This Project

ARM generates rules that describe strong relationships between features. A typical rule looks like this:
{Long Call Duration, Previous Contact} → {Subscribed}

This suggests that customers who had a long conversation and were contacted previously are more likely to subscribe to the term deposit.
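On a one-hot transaction table, the strength of such a rule can be checked directly. The frame below is hypothetical toy data, not the project's dataset; in practice a library such as mlxtend would enumerate all rules automatically.

```python
import pandas as pd

# Hypothetical one-hot transactions (1 = item present in that record)
onehot = pd.DataFrame({
    "Long Call Duration": [1, 1, 0, 1, 1],
    "Previous Contact":   [1, 1, 0, 0, 1],
    "Subscribed":         [1, 1, 0, 0, 1],
}).astype(bool)

# Rule {Long Call Duration, Previous Contact} -> {Subscribed}
antecedent = onehot["Long Call Duration"] & onehot["Previous Contact"]
consequent = onehot["Subscribed"]

support = (antecedent & consequent).mean()                 # rule frequency
confidence = (antecedent & consequent).sum() / antecedent.sum()
lift = confidence / consequent.mean()                      # strength vs. chance
print(support, confidence, lift)   # support=0.6, confidence=1.0, lift≈1.67
```

A lift above 1 here would indicate that subscription co-occurs with these conditions more often than its baseline rate alone would predict.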

Apriori Algorithm: How It Works in This Project

How This Helps in the Project

ARM Network Graph of Associations

The network graph visualizes relationships between features (attributes) in the dataset based on association rules extracted from the Apriori algorithm. Each node represents an item (categorical variable), and each edge (arrow) indicates a strong association between two items.
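Such a graph can be assembled with networkx as sketched below; the rules and lift values are hypothetical placeholders, not the project's actual output.

```python
import networkx as nx

# Hypothetical rules: (antecedent item, consequent item, lift).
# In practice these triples would come from the Apriori output.
edges = [
    ("Long Call Duration", "Subscribed", 1.8),
    ("Previous Contact", "Subscribed", 1.5),
    ("Bread", "Butter", 1.3),
]

G = nx.DiGraph()
for antecedent, consequent, lift in edges:
    # Each node is an item; each directed edge is a rule, weighted by lift
    G.add_edge(antecedent, consequent, weight=lift)

print(G.number_of_nodes(), G.number_of_edges())
```

Rendering is then a single call such as `nx.draw_networkx(G)`, with edge weights available for scaling arrow thickness by lift.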

Observations from the Graph:

Association Rule Mining (ARM) helps uncover interesting relationships between features in a dataset. The graphs below visualize the top 15 rules sorted by Support, Confidence, and Lift, providing a well-rounded view of frequently occurring, reliable, and statistically strong patterns.

Top 15 Association Rules Ranked by Support



Support measures how frequently a rule appears in the dataset. Higher support means the rule involves items that often occur together.

Observations from the Graph:

Top 15 Association Rules Ranked by Confidence



Confidence indicates the likelihood of the consequent occurring given the antecedent. Higher confidence means the rule is more predictive.

Observations from the Graph:

Top 15 Association Rules Ranked by Lift



Lift evaluates the strength of a rule compared to random chance. A lift greater than 1 means the antecedent and consequent occur together more often than would be expected if they were independent.

Observations from the Graph:

KEY FINDINGS


Top 15 Rules by Support (Most Frequent Associations)

These rules highlight the most commonly occurring itemsets in the dataset.

Threshold Used: Minimum Support = 0.05 (5%)


Top 15 Rules by Confidence (Strongest Predictive Power)

These rules show the probability that the consequent occurs, given that the antecedent is present.

Threshold Used: Minimum Confidence = 0.80 (80%)


Top 15 Rules by Lift (Strongest Relationships Beyond Randomness)

Lift measures the strength of a rule beyond chance.

Threshold Used: Minimum Lift = 1.0 (Rules must be better than random chance)
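Applying the three thresholds above to a table of mined rules is a straightforward filter; the rules below are hypothetical placeholders, not the project's results.

```python
import pandas as pd

# Hypothetical mined rules; real values would come from the Apriori output
rules = pd.DataFrame({
    "rule":       ["{A} -> {B}", "{C} -> {D}", "{E} -> {F}"],
    "support":    [0.12, 0.03, 0.07],
    "confidence": [0.85, 0.90, 0.75],
    "lift":       [1.4, 2.1, 0.9],
})

# Apply the thresholds reported above: support >= 5%,
# confidence >= 80%, lift strictly better than chance
strong = rules[(rules["support"] >= 0.05)
               & (rules["confidence"] >= 0.80)
               & (rules["lift"] > 1.0)]

# The top-15 tables are then simple sorts of the surviving rules
top_by_support = strong.nlargest(15, "support")
print(strong)
```

Only rules passing all three filters appear in the ranked tables, which is why the same rule can surface in more than one of them.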

CONCLUSION