Data & Infrastructure

Data mining

📖

Definition

Data mining is the computational process of discovering patterns, correlations, anomalies, and statistically significant structures within large datasets using a combination of statistical methods, machine learning algorithms, and database techniques. It is distinct from simple querying or reporting: rather than retrieving known facts, data mining surfaces previously unknown relationships that are not immediately obvious from inspection. Common techniques include association rule learning (identifying items frequently purchased together), clustering (grouping records by similarity without predefined labels), classification (assigning records to categories based on learned patterns), regression (predicting continuous values), and anomaly detection (flagging records that deviate significantly from expected distributions).

In commerce, data mining has been applied for decades to problems such as market basket analysis, customer segmentation, churn prediction, and fraud detection. The classic retail example — that diapers and beer are frequently purchased together on Friday evenings — originated from early association rule mining on point-of-sale data and led to shelf placement changes that demonstrably improved revenue. Modern commerce data mining operates at far greater scale and sophistication, combining transactional, behavioral, and external data sources to uncover micro-segments, pricing sensitivity patterns, and supply chain risk signals that were previously invisible. As the volume and variety of available commerce data has grown, data mining techniques have become the foundation on which most AI and predictive analytics applications are built.

🔗

AI-Ready DataBig dataCustomer Data Platform (CDP)Data Lineage

Last updated: May 12, 2026

Definition

Related Terms