Cloud infrastructure accelerates AI inference performance
Saturday, June 13, 2026

Data, LLM, Search, Amazon Web Services, Strands, AWS Step Functions, Amazon Bedrock AgentCore Runtime, Amazon Bedrock Data Automation, Amazon Bedrock Knowledge Bases, Amazon Titan Embeddings

Amazon Bedrock Data Automation automates intelligent document processing pipelines

AWS launched Amazon Bedrock Data Automation (BDA), a managed service that extracts insights from documents, images, and other multimodal content through a unified API. It automatically handles classification, extraction, and validation for files of up to 3,000 pages. For commerce teams processing invoices, contracts, and claims at scale, BDA removes manual document sorting and reduces costs, while confidence scores and context understanding improve accuracy.

Amazon Web Services introduced Amazon Bedrock Data Automation (BDA) as a unified API service for extracting meaningful insights from multimodal content, including documents, images, videos, and audio files (AWS Machine Learning Blog). Unlike traditional optical character recognition (OCR) solutions that only extract text, BDA understands document context, validates extracted data, and attaches confidence scores to what it extracts. The service automatically splits documents along logical boundaries, classifies each section as a document type, and matches it to the corresponding processing blueprint. It supports files of up to 3,000 pages and 500 MB per API request (AWS Machine Learning Blog).
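Because the page and size limits apply per API request, a pipeline may want to reject oversized files before submitting them. The following is a minimal Python sketch of such a pre-flight check, not part of BDA itself. The `DocumentInfo` record and the `preflight_problems` function are hypothetical names introduced for illustration, and reading "500 MB" as 500,000,000 bytes is a conservative assumption rather than a documented definition.

```python
from dataclasses import dataclass

# Limits quoted in the article. Treating "500 MB" as 500 * 10^6 bytes
# is an assumption (the more conservative of the common readings).
MAX_PAGES = 3_000
MAX_BYTES = 500 * 1000 * 1000


@dataclass(frozen=True)
class DocumentInfo:
    """Hypothetical record describing a file before it is submitted."""
    name: str
    page_count: int
    size_bytes: int


def preflight_problems(doc: DocumentInfo) -> list[str]:
    """Return human-readable reasons the file exceeds the quoted limits.

    An empty list means the file is within both limits.
    """
    problems = []
    if doc.page_count > MAX_PAGES:
        problems.append(
            f"{doc.name}: {doc.page_count} pages exceeds {MAX_PAGES}"
        )
    if doc.size_bytes > MAX_BYTES:
        problems.append(
            f"{doc.name}: {doc.size_bytes} bytes exceeds {MAX_BYTES}"
        )
    return problems


if __name__ == "__main__":
    batch = [
        DocumentInfo("claims-batch.pdf", page_count=1_200, size_bytes=80_000_000),
        DocumentInfo("contracts.pdf", page_count=3_500, size_bytes=80_000_000),
    ]
    for doc in batch:
        for problem in preflight_problems(doc):
            print(problem)
```

Returning a list of problems rather than a boolean lets the caller log every reason a file was rejected, which is useful when oversized files need to be split by hand before resubmission.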

For commerce practitioners, BDA addresses a critical operational bottleneck. Organizations that process millions of documents daily (insurance claims, invoices, legal contracts, medical records) currently rely on manual intervention, which increases processing time, costs, and error rates (AWS Machine Learning Blog). BDA's intelligent routing removes the need to sort documents by hand or to orchestrate multiple AI models, so organizations can rework document processing workflows with minimal development effort. The service integrates with AWS Step Functions for orchestration, Amazon DynamoDB for metadata tracking, Amazon Bedrock Knowledge Bases for semantic search, and Strands Agents for specialized task coordination.

BDA offers two flexible output modes: standard output providing document summaries, extracted text, and generative insights, and custom output with blueprints that allow precise control over extracted information for specific document types (AWS Machine Learning Blog). The service extracts text in reading order, recognizes table structures, detects form fields, analyzes visual elements like charts and graphs with generated captions, and provides bounding box coordinates for precise location tracking (AWS Machine Learning Blog).
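Confidence scores and bounding boxes are typically consumed downstream, for example to send uncertain fields to a human reviewer and to highlight where on the page a value came from. The Python sketch below illustrates that pattern under stated assumptions: `ExtractedField` is a hypothetical shape and not the BDA output schema, confidence is assumed to lie between 0.0 and 1.0, and bounding boxes are assumed to be expressed as fractions of the page width and height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical shape for one extracted value; not the BDA output schema."""
    name: str
    value: str
    confidence: float  # assumed range: 0.0 to 1.0
    # Assumed fractional box: (left, top, width, height) relative to the page.
    bbox: tuple[float, float, float, float]


def triage(fields, threshold=0.9):
    """Split fields into (accepted, needs_review) by confidence threshold."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


def bbox_to_pixels(bbox, page_width, page_height):
    """Convert a fractional box to integer pixel (left, top, right, bottom)."""
    left, top, width, height = bbox
    return (
        round(left * page_width),
        round(top * page_height),
        round((left + width) * page_width),
        round((top + height) * page_height),
    )


if __name__ == "__main__":
    fields = [
        ExtractedField("invoice_total", "1,240.00", 0.97, (0.25, 0.5, 0.25, 0.25)),
        ExtractedField("due_date", "2026-07-01", 0.62, (0.5, 0.125, 0.25, 0.125)),
    ]
    accepted, needs_review = triage(fields)
    for field in needs_review:
        # Pixel box for a reviewer UI, assuming a 1000 x 2000 pixel page image.
        print(field.name, bbox_to_pixels(field.bbox, 1000, 2000))
```

The threshold of 0.9 is an arbitrary illustration. In practice it would be tuned per field, since a misread invoice total usually costs more than a misread reference note.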

AWS Machine Learning Blog