Amazon DynamoDB is AWS's fully managed NoSQL database, engineered to deliver single-digit-millisecond performance at virtually any scale. It powers some of the highest-traffic systems in the world — shopping carts, gaming leaderboards, session stores, and IoT pipelines — without you ever provisioning a server, managing replication, or worrying about running out of capacity.
DynamoDB takes a fundamentally different approach from relational databases like RDS. There are no joins, no fixed schema beyond the primary key, and you design your tables around the exact queries your application makes rather than around normalized relationships. This trade-off — less query flexibility in exchange for predictable performance and effortless scale — is what makes it so powerful for the right workloads.
This guide covers DynamoDB end-to-end: its data model, keys and indexes, capacity and consistency options, advanced features like streams and global tables, the single-table design philosophy, and hands-on CRUD operations with the AWS CLI and boto3.
What Is DynamoDB?
DynamoDB is a key-value and document database. Data lives in tables; each table holds items (analogous to rows), and each item is a set of attributes (analogous to columns, but flexible — different items can have different attributes). It's serverless: replication across three Availability Zones, scaling, and patching are all handled for you.
Primary Keys: Partition and Sort
The single most important design decision is the primary key, which uniquely identifies each item and determines how data is distributed:
- Partition key (hash key) — DynamoDB hashes it to decide which physical partition stores the item. A good partition key has high cardinality so data and traffic spread evenly.
- Sort key (range key, optional) — orders items that share a partition key, enabling efficient range queries like "all orders for user X between two dates".
A table with only a partition key enforces one item per key; a composite key (partition + sort) allows many related items per partition key, which is the basis of powerful query patterns.
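To see why cardinality matters, here is a toy simulation of key hashing. It is only a sketch: DynamoDB's real hash function and partition count are internal, so the MD5 hash and `NUM_PARTITIONS = 4` below are stand-in assumptions chosen for illustration.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages the real partition count


def partition_for(partition_key: str) -> int:
    """Map a key to a partition. MD5 is a stand-in for DynamoDB's internal hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def spread(keys):
    """Count how many items land on each partition."""
    return Counter(partition_for(k) for k in keys)


# High cardinality: 10,000 distinct user IDs
high = spread(f"user-{i}" for i in range(10_000))
# Low cardinality: the same number of items keyed by one of two status values
low = spread(("active", "inactive")[i % 2] for i in range(10_000))

print("high-cardinality:", sorted(high.items()))
print("low-cardinality: ", sorted(low.items()))
```

With distinct user IDs the items spread over every partition; with only two key values they can never occupy more than two, no matter how many partitions exist.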
Secondary Indexes
By default you can only query by the primary key. Secondary indexes let you query by other attributes:
- Global Secondary Index (GSI) — a different partition/sort key, queryable independently of the base table; can be added any time.
- Local Secondary Index (LSI) — same partition key, different sort key; must be created with the table.
Indexes are how you support multiple access patterns — design them around the questions your app needs to answer.
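As a sketch of what querying an index looks like, the function below builds the arguments for the low-level client's `query` call. The `email` attribute and the `email-index` GSI name are hypothetical; they assume you have created such an index on the Users table.

```python
def gsi_query_params(email: str) -> dict:
    """Build keyword arguments for a Query against a GSI (low-level client format)."""
    return {
        "TableName": "Users",
        "IndexName": "email-index",  # hypothetical GSI keyed on "email"
        "KeyConditionExpression": "email = :e",
        "ExpressionAttributeValues": {":e": {"S": email}},
    }


params = gsi_query_params("tushar@example.com")
print(params)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   items = boto3.client("dynamodb").query(**params)["Items"]
```

The only difference from a base-table query is `IndexName`; the key condition then refers to the index's keys instead of the table's.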
Capacity Modes: On-Demand vs Provisioned
- On-demand — pay per read/write request with no capacity planning. Ideal for unpredictable or spiky traffic and new apps.
- Provisioned — you set read/write capacity units (with optional auto scaling). Cheaper for steady, predictable workloads.
Read Consistency
DynamoDB offers two read modes. Eventually consistent reads (the default) are cheaper and may briefly return slightly stale data right after a write. Strongly consistent reads always reflect the latest committed write but consume twice the read capacity, have slightly higher latency, and are not available on global secondary indexes. Choose per query based on whether your use case can tolerate a moment of staleness.
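In boto3 the choice is a single flag on the read call. A minimal sketch, assuming `table` is a boto3 `Table` resource for a table keyed on `user_id` (the helper name `get_user` is made up for this example):

```python
def get_user(table, user_id: str, consistent: bool = False):
    """Fetch one user; returns None when no item matches the key.

    consistent=False -> eventually consistent read (the default mode)
    consistent=True  -> strongly consistent read
    """
    resp = table.get_item(Key={"user_id": user_id}, ConsistentRead=consistent)
    return resp.get("Item")


# Usage, given table = boto3.resource("dynamodb").Table("Users"):
#   profile = get_user(table, "u1")                   # fine for a profile page
#   balance = get_user(table, "u1", consistent=True)  # read-your-own-write
```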
Single-Table Design
Experienced DynamoDB users often store multiple entity types in a single table, using generic key names and prefixes (e.g. USER#123, ORDER#456) to model relationships and serve many access patterns with few queries. Because related items share a partition key, one Query can return what would take a join in SQL, which cuts round trips and request costs. Single-table design does have a learning curve, so beginners can start with one table per entity and evolve later.
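The idea is easier to see with data. The sketch below models a single table in memory; the generic key names `PK` and `SK` and the `PROFILE#` prefix are assumptions for illustration, and the `query` function is a stand-in for a real Query with a `begins_with` sort-key condition.

```python
def user_key(user_id: str) -> dict:
    return {"PK": f"USER#{user_id}", "SK": f"PROFILE#{user_id}"}


def order_key(user_id: str, order_id: str) -> dict:
    # Orders live under the owning user's partition key
    return {"PK": f"USER#{user_id}", "SK": f"ORDER#{order_id}"}


items = [
    {**user_key("123"), "name": "Tushar"},
    {**order_key("123", "456"), "total": 40},
    {**order_key("123", "789"), "total": 15},
    {**user_key("999"), "name": "Someone else"},
]


def query(table_items, pk: str, sk_prefix: str = ""):
    """In-memory stand-in for Query: match the partition key, filter by sort-key prefix."""
    matches = [i for i in table_items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])  # results come back in sort-key order


print(query(items, "USER#123"))            # the user's orders and profile in one request
print(query(items, "USER#123", "ORDER#"))  # only that user's orders
```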
Advanced Features
- DynamoDB Streams — a change log of item-level modifications that can trigger Lambda for real-time processing, replication, or analytics.
- TTL (Time to Live) — automatically delete expired items (sessions, temporary data) at no extra cost.
- Global Tables — multi-Region, active-active replication for low-latency global apps and disaster recovery.
- Transactions — all-or-nothing operations across multiple items when you need atomicity.
- DAX — an in-memory cache for microsecond reads on hot data.
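As an example of the Streams and Lambda pairing listed above, here is a sketch of a Lambda handler that summarizes a batch of stream records. The sample event is hand-written for illustration, and whether `NewImage` is present depends on the stream view type you configure.

```python
def handler(event, context=None):
    """Summarize item-level changes from a DynamoDB Streams batch."""
    summary = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summary.append({
            "event": record["eventName"],         # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],               # attribute values use DynamoDB's typed JSON
            "new_image": change.get("NewImage"),  # absent for REMOVE events
        })
    return summary


# Hand-written sample batch (not captured from a real stream)
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"user_id": {"S": "u1"}},
                "NewImage": {"user_id": {"S": "u1"}, "age": {"N": "25"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"user_id": {"S": "u2"}}},
        },
    ]
}

print(handler(sample_event))
```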
Creating a Table (AWS CLI)
aws dynamodb create-table \
    --table-name Users \
    --attribute-definitions AttributeName=user_id,AttributeType=S \
    --key-schema AttributeName=user_id,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST
CRUD Operations with boto3
import boto3
from boto3.dynamodb.conditions import Key
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Users")
# CREATE / UPDATE (put_item overwrites by key)
table.put_item(Item={"user_id": "u1", "name": "Tushar", "age": 25})
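# READ a single item by key (ConsistentRead so the write just above is visible)
resp = table.get_item(Key={"user_id": "u1"}, ConsistentRead=True)
print(resp.get("Item"))  # "Item" is missing from the response when no item matches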
# QUERY by key (efficient) - never use Scan for this
result = table.query(KeyConditionExpression=Key("user_id").eq("u1"))
print(result["Items"])
# UPDATE specific attributes
table.update_item(
    Key={"user_id": "u1"},
    UpdateExpression="SET age = :a",
    ExpressionAttributeValues={":a": 26},
)
# DELETE
table.delete_item(Key={"user_id": "u1"})
Output:
{'user_id': 'u1', 'name': 'Tushar', 'age': Decimal('25')}
[{'user_id': 'u1', 'name': 'Tushar', 'age': Decimal('25')}]
Query vs Scan (This Matters for Cost)
Query retrieves items by partition key (and optional sort-key condition) — fast and cheap because it reads only matching items. Scan reads the entire table and filters afterward — slow and expensive on large tables. Design your keys and indexes so you can always Query; reserve Scan for rare admin tasks or small tables.
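The cost difference comes from how many items each operation has to read, not how many it returns. The toy model below makes that concrete; it is an in-memory simulation with made-up data, not DynamoDB's actual storage engine.

```python
from collections import defaultdict

# 10,000 toy order items spread across 100 users
TABLE = [{"user_id": f"u{i % 100}", "order_id": i} for i in range(10_000)]

# Stand-in for key-based lookup: items grouped by partition key
BY_PARTITION = defaultdict(list)
for item in TABLE:
    BY_PARTITION[item["user_id"]].append(item)


def query(user_id: str):
    """Read only the items stored under one partition key."""
    results = BY_PARTITION.get(user_id, [])
    return results, len(results)  # (results, items read)


def scan(user_id: str):
    """Read every item in the table, then filter."""
    results = [item for item in TABLE if item["user_id"] == user_id]
    return results, len(TABLE)  # (results, items read)


q_items, q_read = query("u7")
s_items, s_read = scan("u7")
print(f"query: {len(q_items)} results, {q_read} items read")
print(f"scan:  {len(s_items)} results, {s_read} items read")
```

Both calls return the same results, but the scan touches every item in the table to find them, and in DynamoDB you pay for what is read.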
Pricing Model (in detail)
In on-demand mode you pay per million read and write request units plus storage per GB-month. In provisioned mode you pay for the read/write capacity you reserve (with auto scaling to track demand) plus storage. Extra charges apply for global tables, streams, backups, and DAX. The Free Tier includes 25 GB of storage and enough provisioned capacity for many small apps indefinitely.
Real-World Use Cases
- User profiles, sessions, preferences, and shopping carts for high-traffic apps.
- Gaming leaderboards and player state at massive concurrency.
- IoT and clickstream ingestion at huge write volumes.
- Serverless backends paired with Lambda and API Gateway.
- Order-processing systems that need predictable low latency at scale.
Common Mistakes to Avoid
- Low-cardinality partition keys — create "hot" partitions and throttling.
- Using Scan instead of Query — slow and costly; design keys/indexes to Query.
- Modeling like SQL — design around access patterns, not normalized tables and joins.
- Ignoring item-size limits — a single item is capped at 400 KB; store large blobs in S3 and keep a reference.
- Forgetting that boto3 returns numeric attributes as Decimal, which json.dumps cannot serialize without conversion.
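For the Decimal issue in particular, a small conversion helper usually does the job. This is one possible sketch; the helper names `to_plain` and `to_decimal` are invented for this example and are not part of boto3.

```python
import json
from decimal import Decimal


def to_plain(value):
    """Recursively convert Decimal values to int or float so the result is JSON-friendly."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_plain(v) for v in value]
    return value


def to_decimal(number: float) -> Decimal:
    """Convert a float for writing; going through str avoids binary float noise."""
    return Decimal(str(number))


item = {"user_id": "u1", "age": Decimal("25"), "score": Decimal("4.5")}
print(json.dumps(to_plain(item)))
print(to_decimal(3.14))
```

The reverse direction matters too: boto3's resource layer rejects Python floats on write, so numbers like `3.14` need to go in as `Decimal`.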
Frequently Asked Questions
DynamoDB or RDS? Use DynamoDB for high-scale, known-access-pattern key-value/document data; use RDS for relational data needing flexible queries and joins.
How big can an item be? 400 KB maximum; offload larger data to S3.
Can I add an index later? Yes for GSIs; LSIs must be defined at table creation.
Is DynamoDB truly serverless? Yes — no instances to manage; it scales and replicates automatically.
Summary Table
| Concept | Purpose |
|---|---|
| Partition key | Distributes data across partitions |
| Sort key | Orders items; enables range queries |
| GSI | Query by non-key attributes |
| Streams | React to changes (e.g. with Lambda) |
| On-demand | Pay-per-request, no capacity planning |
Capacity Units and Throughput Explained
Throughput in DynamoDB is measured in capacity units. One read capacity unit (RCU) represents one strongly consistent read per second for an item up to 4 KB (or two eventually consistent reads). One write capacity unit (WCU) represents one write per second for an item up to 1 KB. In provisioned mode you reserve RCUs and WCUs (optionally with auto scaling that adjusts them to track demand), while on-demand mode bills equivalent request units without any reservation. Understanding this is key to both performance and cost: large items consume more units per operation, and bursts beyond your provisioned capacity get throttled unless auto scaling or on-demand absorbs them. Monitoring consumed vs provisioned capacity in CloudWatch tells you whether to scale up, switch modes, or optimize item sizes.
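The rounding rules above are easy to turn into arithmetic. The sketch below estimates units for a single non-transactional read or write from the item size; the function names are made up for this example.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU covers up to 4 KB
WRITE_UNIT_BYTES = 1024      # one WCU covers up to 1 KB


def read_units(item_size_bytes: int, strongly_consistent: bool = True) -> float:
    """Capacity units consumed by reading one item of the given size."""
    units = max(1, math.ceil(item_size_bytes / READ_UNIT_BYTES))
    return units if strongly_consistent else units / 2


def write_units(item_size_bytes: int) -> int:
    """Capacity units consumed by writing one item of the given size."""
    return max(1, math.ceil(item_size_bytes / WRITE_UNIT_BYTES))


six_kb = 6 * 1024
print(read_units(six_kb))                             # 6 KB rounds up to two 4 KB blocks
print(read_units(six_kb, strongly_consistent=False))  # half of the strongly consistent cost
print(write_units(six_kb))                            # six 1 KB blocks
```

So a 6 KB item costs 2 RCUs to read with strong consistency, 1 RCU with eventual consistency, and 6 WCUs to write, which is why trimming item size pays off quickly on write-heavy tables.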
Backup, Restore, and Point-in-Time Recovery
DynamoDB protects your data with two backup options. On-demand backups let you take a full snapshot of a table at any moment — useful before a risky migration or for long-term archival — with no impact on performance. Point-in-time recovery (PITR), when enabled, continuously backs up your table so you can restore it to any second within the last 35 days, protecting against accidental writes or deletes. Restores always create a brand-new table, so your original data is never overwritten during recovery. For compliance, you can also export table data to S3 and query it with Athena without touching the live table.
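Both PITR steps map onto calls on the low-level boto3 client. A minimal sketch, assuming `client` is `boto3.client("dynamodb")`; the wrapper names and the `Users-restored` table name are placeholders.

```python
def enable_pitr(client, table_name: str):
    """Turn on point-in-time recovery for a table."""
    return client.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )


def restore_to_time(client, source_table: str, target_table: str, when):
    """Restore into a new table as of `when` (a datetime inside the PITR window)."""
    return client.restore_table_to_point_in_time(
        SourceTableName=source_table,
        TargetTableName=target_table,  # must not already exist
        RestoreDateTime=when,
    )


# Usage, given client = boto3.client("dynamodb"):
#   from datetime import datetime, timedelta, timezone
#   enable_pitr(client, "Users")
#   one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
#   restore_to_time(client, "Users", "Users-restored", one_hour_ago)
```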
DynamoDB vs Other NoSQL Databases
Compared with self-managed NoSQL databases like MongoDB or Cassandra, DynamoDB's biggest advantage is that it is fully managed and serverless: there are no nodes to size, patch, or rebalance, and it scales transparently. MongoDB offers richer ad-hoc query and aggregation capabilities, and Cassandra gives you control over a self-hosted cluster, but both require significant operational effort. The trade-off with DynamoDB is that you must design around access patterns up front, because there are no joins or server-side aggregations to fall back on when a new query comes along. For teams that want predictable performance without operational overhead, that trade is usually worth it.
Cost Optimization Tips
Keep DynamoDB cheap by choosing the right capacity mode (on-demand for spiky traffic, provisioned with auto scaling for steady load), enabling TTL to automatically purge expired items instead of paying to store them, using sparse indexes so GSIs only contain items you actually query, and compressing or offloading large attributes to S3. Monitor consumed capacity and throttling in CloudWatch so you can right-size before costs or errors creep up.
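For TTL, the item needs a Number attribute holding the expiry time in Unix epoch seconds. A sketch of stamping items on write, assuming TTL has been enabled on the table for a hypothetical attribute named `expires_at`:

```python
import time


def with_ttl(item: dict, ttl_seconds: int, now=None) -> dict:
    """Return a copy of the item with an expiry timestamp in epoch seconds."""
    issued_at = int(time.time()) if now is None else int(now)
    return {**item, "expires_at": issued_at + ttl_seconds}  # "expires_at" is a hypothetical name


session = with_ttl({"session_id": "s1", "user_id": "u1"}, ttl_seconds=3600)
print(session)

# The stamped item can then be written as usual, e.g. table.put_item(Item=session)
```

Expired items are removed by a background process some time after the timestamp passes rather than at that exact second, so reads that must never see expired data should still check `expires_at` themselves.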
Reference
This article follows the official AWS documentation. Read the full reference here: Amazon DynamoDB documentation.
Conclusion
DynamoDB delivers fast, serverless NoSQL at virtually any scale, provided you design for it. Choose a high-cardinality partition key, model around your access patterns, use indexes to support multiple queries, prefer Query over Scan, and lean on features like streams, TTL, and global tables. Get the data model right and DynamoDB becomes one of the most reliable, low-maintenance, and scalable parts of your architecture.