🧠 Data Mining Lab

A practical implementation of core data mining techniques, including classification, clustering, and association rule mining, using Python and the GUI-based tools ORANGE and WEKA.


🚀 Project Overview

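This project covers classification on the Iris and Breast Cancer datasets, K-Means clustering, Apriori association rule mining, a visual pipeline built in ORANGE, and a J48 decision tree experiment in WEKA.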


📁 Project Structure

data-mining-lab/
│
├── notebooks/
│   ├── 01_iris_classification.ipynb
│   ├── 02_breast_cancer_classification.ipynb
│   ├── 03_clustering.ipynb
│   └── 04_association_rules.ipynb
│
├── results/
│   └── plots/
│
├── orange/
│   ├── screenshots/
│   └── notes.md
│
├── weka/
│   ├── screenshots/
│   └── notes.md
│
└── README.md

🌸 Iris Classification

📊 Output

Decision Tree Confusion Matrix

[Figure: Iris dataset Decision Tree confusion matrix]

KNN Confusion Matrix

[Figure: Iris dataset KNN confusion matrix]

🧠 Insight

The Iris classes are well separated, so both the Decision Tree and KNN classifiers reach perfect accuracy here.
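
As a rough illustration of what the confusion matrices above summarize, here is a minimal plain-Python sketch that tallies true versus predicted labels. The function names `confusion_matrix` and `accuracy` are hypothetical helpers written for this example, and the labels are toy values, not results from the notebooks.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def accuracy(matrix):
    """Share of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy labels for illustration only; not output from the notebooks.
labels = ["setosa", "versicolor", "virginica"]
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(matrix))
```

A perfect classifier would put every count on the diagonal; in this toy case one versicolor sample lands in the virginica column.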


🧬 Breast Cancer Classification

📊 Output

[Figure: Breast cancer confusion matrix]

🧠 Insight

Precision, recall, and F1-score give a fuller picture than accuracy alone on real-world datasets, where classes are often imbalanced and different kinds of errors carry different costs.
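
To make the point about accuracy concrete, the sketch below computes the four metrics from binary confusion-matrix counts. The helper `binary_metrics` is a hypothetical name, and the counts are toy values chosen to mimic an imbalanced problem; they are not taken from the notebook.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy counts for an imbalanced problem; not results from the notebooks.
metrics = binary_metrics(tp=5, fp=5, fn=15, tn=975)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these toy counts accuracy is 0.98 while recall is only 0.25, which is the kind of gap that accuracy alone would hide.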


🌀 Clustering (K-Means)

📊 Output

[Figure: K-Means clustering plot]

🧠 Insight

Even without labels, natural groupings emerge from the data.
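
One way to see how groupings can emerge without labels is a bare-bones version of the K-Means loop (Lloyd's algorithm). This is a plain-Python sketch under stated assumptions: the `kmeans` function, the toy points, and the choice of starting centroids are all hypothetical, and the notebook may well rely on a library implementation instead.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return assignments, centroids

# Toy 2D points for illustration only; not data from the notebooks.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
assignments, centroids = kmeans(points, centroids=[points[0], points[3]])
print(assignments)
print(centroids)
```

The result depends on the starting centroids, which is why library implementations typically run several initializations and keep the best one.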


🔗 Association Rules (Apriori)

🧠 Insight

Association rules reveal relationships between items and can be applied in recommendation systems.
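
The idea behind Apriori can be sketched in plain Python: find itemsets that meet a minimum support level by level, then derive rules that meet a minimum confidence. The function names, thresholds, and transactions below are hypothetical and exist only for illustration; the notebook may use a dedicated library rather than code like this.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: size-k candidates come only from frequent (k-1)-itemsets."""
    items = {item for t in transactions for item in t}
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        kept = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in kept.items() if s >= min_support}
        frequent.update(kept)
        k += 1
        # Join step: unions of surviving itemsets that have exactly k items.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return frequent

def association_rules(frequent, min_confidence):
    """Rules (antecedent, consequent, confidence) from the frequent itemsets."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((set(antecedent), set(itemset - antecedent), confidence))
    return found

# Toy transactions for illustration only; not data from the notebooks.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.8)
print(rules)
```

In this toy basket data the only rule that clears both thresholds is butter implies bread, since every transaction containing butter also contains bread.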


🟠 ORANGE Workflow

Used ORANGE to build a visual machine learning pipeline.

📸 Screenshot

[Figure: ORANGE workflow screenshot]


🔵 WEKA Experiment

Used WEKA Explorer to perform classification using J48 (Decision Tree).

📸 Screenshot

[Figure: WEKA output screenshot]


🛠️ Technologies Used


📊 Key Learnings


📌 Future Improvements


▶️ How to Run

  1. Clone the repository
  2. Install requirements: pip install -r requirements.txt
  3. Open notebooks in Jupyter or VS Code
  4. Run cells sequentially

✨ Conclusion

This project demonstrates a practical implementation of core data mining techniques and highlights the importance of choosing appropriate models and evaluation metrics.


👨‍💻 Author