Technology Overview

The AI technology stack for commerce — foundational models, agentic AI, LLMs, privacy & security, investment costs, and what comes next — based on Book Part 5 of the AI Best Practices for Commerce reference.

AI Technology Overview for Commerce


Underlying Technology & History

From the Dartmouth Conference to foundation models

The Evolution of AI

The Evolution of AI in Retail

Just as the internet once revolutionized shopping, artificial intelligence is now redefining it. What began as simple computational tools for inventory management has evolved into sophisticated systems that can predict consumer behavior, optimize supply chains in real time, and create personalized shopping experiences for billions of customers worldwide. This chapter traces the journey of AI in retail and ecommerce, from its theoretical foundations in the 1950s to the cutting-edge applications reshaping commerce today.

The Origins of Machine Intelligence

Artificial Intelligence is a term first used in 1956 at the Dartmouth Conference, a gathering of mathematicians, early computer scientists, and cognitive theorists. The focal point of the conference was the idea that “Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

The researchers spent several weeks exploring neural networks, automated reasoning, self-improving systems, and many other areas that underpin AI today. While the conference itself produced no technological breakthroughs, it laid critical groundwork for AI and machine learning by both defining the field and sparking the first wave of AI research funding (through DARPA and corporate entities).

Early Business Computing

The use of computers for business and retail started earlier, though. In 1951 a British restaurant chain named J. Lyons & Co developed the Lyons Electronic Office 1 (LEO 1) computer. It was the first computer designed for commercial and business operations. Initially it calculated bakery valuations, and later managed payroll and inventory.

Widespread business use of computers really kicked off in 1959, when IBM introduced its 1401 Data Processing System. Within five years roughly half of all computer systems in the world were IBM 1401s. Retailers used these computers for everything from inventory management to sales tracking and modeling, and to move paper-based retail data operations into the digital realm.

AI Paradigms

Expert Systems

In the 1970s and 1980s, expert systems were the dominant form of applied artificial intelligence. An expert system is designed to replicate the decision-making of a human expert in a specific domain or use case. It is not general artificial intelligence; it is closer to a rules-based workflow built from explicitly encoded knowledge and defined decision trees. Domain knowledge is represented as if-then rules, and an inference engine applies logic to those rules, chaining them forward from known facts or backward from a goal, and can explain how and why a given set of inputs produced a given output.
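The rule-plus-inference-engine idea can be sketched in a few lines of Python. This is a minimal forward-chaining illustration under stated assumptions: the rule names, facts, and the `forward_chain` helper are hypothetical and are not drawn from XCON, the Authorizer’s Assistant, or any other real system.

```python
# Hypothetical rules for illustration only: (rule name, premises, conclusion).
RULES = [
    ("R1", {"order_over_limit"}, "needs_review"),
    ("R2", {"needs_review", "long_standing_customer"}, "approve"),
    ("R3", {"needs_review", "new_customer"}, "refer_to_human"),
]


def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new facts can be derived.

    Returns the final set of facts and a trace of the rules that fired,
    which is what lets the system explain how it reached its output.
    """
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                trace.append((name, sorted(premises), conclusion))
                changed = True
    return known, trace


facts, trace = forward_chain({"order_over_limit", "long_standing_customer"}, RULES)
for name, premises, conclusion in trace:
    print(f"{name}: because {premises}, concluded {conclusion}")
```

The trace is the interesting part: every conclusion can be tied back to the specific rule and facts that produced it, which is the explainability property described above.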

Digital Equipment Corp (DEC) created the eXpert CONfigurer (XCON) in 1980, one of the earliest and most successful commercial expert systems. It was developed in part at Carnegie Mellon University (a prominent name in computing and AI) and was used to configure extremely complex VAX minicomputer orders. It had over 10,000 rules and reportedly saved DEC $40 million per year. American Express deployed the Authorizer’s Assistant in 1988, an expert system designed to help credit authorizers evaluate unusual transactions by drawing on rich institutional knowledge. It allowed junior team members to make better decisions more quickly and consistently.

Machine Learning and Recommendations

In the 1990s, rule-based expert systems gave way to data-driven machine learning systems. The biggest practical difference is that expert systems needed expert knowledge encoded manually as a large set of rules, whereas machine learning systems extracted patterns and rules directly from large data sets. This meant much less time and manual labor were needed to create a system that performed the same work.

Machine learning was much better suited to use cases built around the ever-growing volume of retail data, since models and decision-making could be updated as new data arrived, and massive amounts of data could be used to train the systems. Machine learning applies to a wide range of use cases and workloads, from logistics optimization to fraud prevention to inventory prediction, and much more.

One of the most visible successes of machine learning in the retail ecosystem is ecommerce recommendations. The “beer and diapers” story became a famous example of data mining in retail, though its factual basis is disputed. Whether real or apocryphal, the story illustrated the potential of association rule mining. The Apriori algorithm, developed by Agrawal and Srikant in 1994 at IBM’s Almaden Research Center, provided a practical method for finding such associations in transaction data.
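To make association rule mining concrete, here is a small Python sketch that computes support and confidence over a handful of baskets. It is a two-level simplification rather than the full Apriori algorithm: it applies only the core pruning idea (a pair can be frequent only if both of its items are frequent) and stops at pairs. The basket data and function names are invented for illustration.

```python
from itertools import combinations

# Hypothetical baskets, invented for illustration.
BASKETS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"beer", "diapers", "wipes"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)


def frequent_pairs(baskets, min_support):
    """Find frequent pairs, considering only pairs of frequent single items.

    The pruning step reflects the Apriori property: a pair cannot be
    frequent unless both of its items are frequent on their own.
    """
    items = sorted(set().union(*baskets))
    frequent_items = [i for i in items if support({i}, baskets) >= min_support]
    pairs = {}
    for a, b in combinations(frequent_items, 2):
        s = support({a, b}, baskets)
        if s >= min_support:
            pairs[(a, b)] = s
    return pairs


def confidence(antecedent, consequent, baskets):
    """Estimate P(consequent | antecedent) from the baskets."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


pairs = frequent_pairs(BASKETS, min_support=0.5)
print(pairs)
print(confidence({"diapers"}, {"beer"}, BASKETS))
```

With a minimum support of 0.5, "chips" and "wipes" are pruned before any pairs are counted, so only the beer and diapers pair is evaluated. On real transaction data the same pruning is what keeps the search tractable.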

This in turn led to Amazon’s patent on “Collaborative recommendations using item-to-item similarity mappings,” which was filed in 1998 and granted in 2001. This is the method which powers the “Customers who bought this item also bought these other items” feature you still see today on Amazon.com and the ecommerce sites of many other retailers.
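The general idea of item-to-item similarity can be sketched as follows. This is not Amazon’s patented method; it is a minimal illustration that assumes cosine similarity over the sets of customers who bought each item, with hypothetical purchase data and function names.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical purchase histories, invented for illustration.
PURCHASES = {
    "alice": {"tent", "sleeping_bag", "lantern"},
    "bob": {"tent", "sleeping_bag"},
    "carol": {"tent", "lantern"},
    "dave": {"novel", "lantern"},
}


def item_similarities(purchases):
    """Build an item-to-item table using cosine similarity over buyers."""
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)
    table = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a == b:
                continue
            shared = len(buyers[a] & buyers[b])
            if shared:
                # Shared buyers, normalized by how popular each item is.
                table[a][b] = shared / sqrt(len(buyers[a]) * len(buyers[b]))
    return table


def also_bought(item, table, top_n=2):
    """Return the items most similar to the given item, best first."""
    ranked = sorted(table[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


table = item_similarities(PURCHASES)
print(also_bought("tent", table))
```

One design point worth noting: the similarity table depends only on items, so it can be computed ahead of time and looked up when a product page is rendered, rather than comparing customers to each other at request time.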

The Deep Learning Revolution

While expert systems dominated the 1980s and machine learning ruled the 1990s, neural networks, the technology that would eventually transform AI, had been quietly developing in parallel since the 1940s. Warren McCulloch and Walter Pitts created the first mathematical model of artificial neurons in 1943, but it wasn’t until 2012 that neural networks would make the jump from the realm of academic research to commercial prominence.

Neural networks had actually experienced their own boom-and-bust cycle. Frank Rosenblatt’s Perceptron, demonstrated at Cornell in 1957, could learn to recognize simple patterns. The New York Times reported it as the “embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.” But in 1969, Marvin Minsky and Seymour Papert published “Perceptrons,” mathematically proving that single-layer neural networks couldn’t solve certain fundamental problems like the XOR function. Government and commercial funding dried up. The first AI winter had begun.
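The XOR limitation is easy to observe empirically. The sketch below trains a single-layer perceptron with the classic error-correction update rule on AND (which is linearly separable) and on XOR (which is not). It is an illustration rather than a reproduction of Rosenblatt’s hardware or of Minsky and Papert’s proof, and the function name, learning rate, and epoch limit are arbitrary choices.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a single-layer perceptron with the classic update rule.

    Returns (weights, bias, converged), where converged is True if some
    epoch finished with zero misclassified samples.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - predicted
            if delta != 0:
                # Nudge the decision boundary toward the misclassified point.
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_w, and_b, and_converged = train_perceptron(AND)
xor_w, xor_b, xor_converged = train_perceptron(XOR)
print("AND learned:", and_converged)
print("XOR learned:", xor_converged)
```

AND converges after a few passes, while XOR never reaches an error-free epoch no matter how many epochs are allowed, because no single straight line separates its two classes.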

The breakthrough came from an unlikely source. Geoffrey Hinton, a British cognitive psychologist turned computer scientist, refused to give up on neural networks even when they were deeply unfashionable. Working with David Rumelhart and Ronald Williams in 1986, he published “Learning representations by back-propagating errors,” which showed how multi-layer neural networks could be trained using back-propagation. The algorithm allowed

AI Paradigms: From Rules to Learning to Reasoning
1. Expert Systems (1970s–80s): If-then logic inference engines
  • Replicated decision-making in specific domains
  • DEC XCON saved $40M/year with 10,000+ rules
  • American Express Authorizer's Assistant (1988)

2. Machine Learning (1990s): Statistical learning from data
  • Extracted patterns directly from large datasets
  • Amazon's item-to-item collaborative filtering (1998)
  • 35% of Amazon revenue from recommendations by 2006

3. Deep Learning (2012+): Context understanding via neural networks
  • AlexNet at ImageNet 2012: 15.3% vs 26.2% error rate
  • GPU acceleration: 70x speedup over CPU training
  • Visual search, demand forecasting, fraud detection

4. Transformers (2017+): Self-attention for long-range relationships
  • "Attention Is All You Need" (Google Brain, 2017)
  • BERT improved Google Shopping conversion 15%
  • Home Depot: 12% drop in search abandonment

5. Foundation Models (2020+): Generalization, emergent capabilities
  • GPT-3: 175B parameters, emergent reasoning
  • Scaling laws: predictable power-law improvements
  • RAG, MoE, and on-device models in 2024–25
Source: AI Best Practices for Commerce, Section 5.1

Last updated: March 12, 2026