How to create a DynamoDB table on AWS
Table of Contents
- Introduction
- How to do it...
- Creating a table using CLI commands
- Creating a table using a CloudFormation template
- How it works...
- DynamoDB data model
- Data model limits
- DynamoDB keys and partitions
- Read and write capacity units
- Waiting for asynchronous operations
- Other ways to create tables
- DynamoDB features
- DynamoDB general limitations
- Conclusion
- About the author
Introduction
This article will show you how to create your first DynamoDB table on AWS. This is an important part of building quality cloud services.
Amazon DynamoDB is the primary database in AWS for building serverless applications. DynamoDB is a fully managed NoSQL database, so you do not have to manage any servers. Unlike most NoSQL databases, DynamoDB also supports strongly consistent reads, but at an additional cost.
Attributes in DynamoDB are synonymous with columns, and items are synonymous with rows in a relational database. However, there is no table-level schema in DynamoDB. You can have a different set of attributes in different items (rows). You can also have an attribute with the same name but different types in different items.
Getting ready
You need a working AWS account and should have installed and configured the AWS CLI with a profile with the necessary permissions. You are also expected to have a decent understanding of AWS CLI commands, Amazon CloudFormation, and basic database concepts. For the complete code files for this article, you can refer to:
How to do it...
Create a simple table, check its properties, update it, and finally delete the table. First, use CLI commands to create the table and then use a CloudFormation template to do the same. We will also use CLI commands to check the created table.
Creating a table using CLI commands
- We can create a simple DynamoDB table using the aws dynamodb create-table CLI command as follows:
aws dynamodb create-table \
--table-name my_table \
--attribute-definitions 'AttributeName=id, AttributeType=S' 'AttributeName=datetime, AttributeType=N' \
--key-schema 'AttributeName=id, KeyType=HASH' 'AttributeName=datetime, KeyType=RANGE' \
--provisioned-throughput 'ReadCapacityUnits=5, WriteCapacityUnits=5' \
--region us-east-1 \
--profile admin
Here, we define a table named my_table and use the attribute-definitions property to add two fields: id of type string (denoted by S) and datetime of type number (denoted by N). We then define a partition key (or hash key) and a sort key (or range key) using the key-schema property. We also define the maximum expected read and write capacity units per second using the provisioned-throughput property. I have specified the region even though us-east-1 is the default.
- List tables using the aws dynamodb list-tables CLI command to verify that our table was created:
aws dynamodb list-tables \
--region us-east-1 \
--profile admin
- Use the aws dynamodb describe-table CLI command to see the table properties:
aws dynamodb describe-table \
--table-name my_table \
--profile admin
The initial part of the response contains the table name, attribute definitions, and key schema definition we specified while creating the table.
The latter part of the response contains TableStatus, CreationDateTime, ProvisionedThroughput, TableSizeBytes, ItemCount, TableArn, and TableId.
- You may use the aws dynamodb update-table CLI command to update the table:
aws dynamodb update-table \
--table-name my_table \
--provisioned-throughput 'ReadCapacityUnits=10, WriteCapacityUnits=10' \
--profile admin
- Finally, you may delete the table using the aws dynamodb delete-table CLI command:
aws dynamodb delete-table \
--table-name my_table \
--profile admin
Creating a table using a CloudFormation template
Now, we will see the components of the CloudFormation template needed for this article. The completed template file is available in the code files.
- Start creating the CloudFormation template by defining the template format, the version, and a description:
AWSTemplateFormatVersion: '2010-09-09'
Description: Your First DynamoDB Table
- Define the Resources section with the DynamoDB Table type:
Resources:
  MyFirstTable:
    Type: AWS::DynamoDB::Table
- Define the properties section with the essential properties: TableName, ProvisionedThroughput, KeySchema, and AttributeDefinitions:
    Properties:
      TableName: my_table
      ProvisionedThroughput:
        ReadCapacityUnits: 1
        WriteCapacityUnits: 1
      KeySchema:
        -
          AttributeName: id
          KeyType: HASH
        -
          AttributeName: dateandtime
          KeyType: RANGE
      AttributeDefinitions:
        -
          AttributeName: id
          AttributeType: S
        -
          AttributeName: dateandtime
          AttributeType: N
- Update the table properties with the CloudFormation template:
Change ReadCapacityUnits and WriteCapacityUnits in the template to 5 for each. You can then update the stack using the aws cloudformation update-stack CLI command:
aws cloudformation update-stack \
--stack-name myteststack \
--template-body file://resources/your-first-dynamodb-table-cf-template-updated.yml \
--region us-east-1 \
--profile admin
Whenever an update is made, CloudFormation compares the template with the existing stack and updates only those resources that are changed.
- Verify the table update using the aws dynamodb describe-table CLI command.
- Delete the stack using the aws cloudformation delete-stack CLI command.
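The steps above do not show how the stack is created before it is updated, nor how to wait for CloudFormation to finish. The following is a minimal sketch of both, assuming the stack name myteststack and table name my_table used in this recipe. The TEMPLATE path is a hypothetical placeholder, not the path of the original template file, and the function names are illustrative.

```shell
#!/bin/sh
# STACK and TABLE come from the recipe; TEMPLATE is a hypothetical local path.
STACK=myteststack
TABLE=my_table
TEMPLATE=${TEMPLATE:-file://resources/template.yml}

# create_stack: create the stack and block until creation completes.
create_stack() {
  aws cloudformation create-stack \
    --stack-name "$STACK" \
    --template-body "$TEMPLATE" || return 1
  aws cloudformation wait stack-create-complete --stack-name "$STACK"
}

# verify_update: block until a stack update finishes, then print the
# table's provisioned throughput so the new values can be checked.
verify_update() {
  aws cloudformation wait stack-update-complete --stack-name "$STACK" || return 1
  aws dynamodb describe-table \
    --table-name "$TABLE" \
    --query 'Table.ProvisionedThroughput'
}
```

Run create_stack once, and call verify_update after each update-stack command. Add --region and --profile options as in the earlier commands if your defaults differ.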
How it works...
We used the following DynamoDB CLI command actions in this recipe: create-table, list-tables, describe-table, update-table, and delete-table. We use the corresponding components and properties within our CloudFormation template as well. Some of these options will become clear after you read the following notes.
DynamoDB data model
Data in DynamoDB is stored in tables. A table contains items (similar to rows) and each item contains attributes (similar to columns). Each item can have a different set of attributes, and the same attribute name may be used with different types in different items. DynamoDB supports the data types string, number, binary, Boolean, null, string set, number set, binary set, list, and map. It does not have a JSON data type; however, you can pass JSON data to DynamoDB using the SDK and it will be mapped to native DynamoDB data types. You can also define indexes (global secondary indexes and local secondary indexes) to improve read performance.
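To make the schemaless model concrete, here is a small sketch that writes two items with different attribute sets to the same table. It assumes the my_table table from this recipe, with id and datetime as its key attributes. The item values, the email and active attributes, and the put_items function name are all made up for illustration.

```shell
#!/bin/sh
# Attribute values use DynamoDB's typed JSON: S = string, N = number, BOOL = Boolean.
# Both items carry the key attributes (id, datetime); the rest differ.
ITEM_ONE='{"id": {"S": "user-1"}, "datetime": {"N": "1546300800"}, "email": {"S": "a@example.com"}}'
# "email" is absent here, and "active" appears only in this item.
ITEM_TWO='{"id": {"S": "user-2"}, "datetime": {"N": "1546300900"}, "active": {"BOOL": true}}'

# put_items TABLE: write both items; requires a configured AWS CLI.
put_items() {
  aws dynamodb put-item --table-name "$1" --item "$ITEM_ONE" &&
  aws dynamodb put-item --table-name "$1" --item "$ITEM_TWO"
}
```

Calling put_items my_table stores both items, even though neither the table definition nor the other item mentions email or active.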
Data model limits
The following are some of the important limits in the DynamoDB data model:
- There is an initial limit of 256 tables per region for an AWS account, but this can be changed by contacting AWS support.
- Names for tables and secondary indexes must be at least three characters long, but no more than 255 characters. Allowed characters are A-Z, a-z, 0-9, _ (underscore), - (hyphen), and . (dot).
- An attribute name must be at least one character long and less than 64 KB in size. The names of secondary index key attributes and of user-specified projected attributes are more restricted: they must be encoded using UTF-8, and the total size of each encoded name cannot exceed 255 bytes.
- The size of an item, including all the attribute names and attribute values, cannot exceed 400 KB.
- You can only create a maximum of five local secondary indexes and five global secondary indexes per table.
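The naming rule for tables and secondary indexes is easy to check locally before calling the API. The following is a minimal sketch of such a check. The function name is_valid_table_name is hypothetical, and the function only encodes the length and character rules listed above.

```shell
#!/bin/sh
# is_valid_table_name NAME: exit status 0 if NAME satisfies the length and
# character rules for table and index names, non-zero otherwise.
is_valid_table_name() {
  name=$1
  len=${#name}
  # Length must be between 3 and 255 characters.
  [ "$len" -ge 3 ] && [ "$len" -le 255 ] || return 1
  # Reject any character outside A-Z, a-z, 0-9, underscore, hyphen, and dot.
  case $name in
    *[!A-Za-z0-9_.-]*) return 1 ;;
  esac
  return 0
}
```

For example, is_valid_table_name my_table succeeds, while a two-character name or a name containing a space fails.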
DynamoDB keys and partitions
Each item is identified with a primary key, which can be either only the partition key if it can uniquely identify the item or a combination of partition key and sort key. The partition key is also called a hash key and the sort key is also called a range key. Primary key attributes (partition and sort keys) can only be string, binary, or number.
Initially, a single partition holds all table data. When a partition's limits are exceeded, new partitions are created and data is spread across them. At the time of writing, a partition is limited to 10 GB of storage, 3,000 RCU, and 1,000 WCU. Data belonging to one partition key is stored in the same partition; however, a single partition can have data for multiple partition keys. The partition key is used to locate the partition and the sort key is used to order items within that partition.
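The per-partition limits can be turned into a rough estimate of how many partitions a table needs. The sketch below is only a heuristic based on the three limits quoted above and on a formula that appeared in older AWS documentation. DynamoDB manages partitions internally, so the real number may differ. The function name estimate_partitions is hypothetical.

```shell
#!/bin/sh
# estimate_partitions RCU WCU SIZE_GB: rough partition estimate.
estimate_partitions() {
  rcu=$1; wcu=$2; size_gb=$3
  # Partitions needed for throughput: ceil(rcu/3000 + wcu/1000),
  # computed with integers as ceil((rcu + 3*wcu) / 3000).
  by_throughput=$(( (rcu + 3 * wcu + 2999) / 3000 ))
  # Partitions needed for storage: ceil(size_gb / 10).
  by_size=$(( (size_gb + 9) / 10 ))
  # Take the larger of the two, and never fewer than one partition.
  result=$by_throughput
  [ "$by_size" -gt "$result" ] && result=$by_size
  [ "$result" -lt 1 ] && result=1
  echo "$result"
}
```

With the recipe's settings (5 RCU, 5 WCU) and 1 GB of data, the estimate is a single partition. A table with 5,000 RCU, 2,000 WCU, and 25 GB of data is estimated at four partitions, driven by throughput rather than storage.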
Read and write capacity units
We specified the maximum read and write capacity units for our application per second, referred to as read capacity unit (RCU) and write capacity unit (WCU). We also updated our RCU and WCU. Updating the table properties is an asynchronous operation and may take some time to take effect.
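To choose values for RCU and WCU, it helps to know what one unit buys. According to the AWS documentation (these figures are not stated in this recipe), one RCU covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads, and one WCU covers one write per second of an item up to 1 KB. The following sketch applies that arithmetic; the function names are hypothetical.

```shell
#!/bin/sh
# rcu_needed ITEM_BYTES READS_PER_SEC MODE   (MODE: strong | eventual)
rcu_needed() {
  # Each read consumes one unit per 4 KB block, rounded up.
  units_per_read=$(( ($1 + 4095) / 4096 ))
  total=$(( units_per_read * $2 ))
  # Eventually consistent reads cost half as much; round up.
  [ "$3" = "eventual" ] && total=$(( (total + 1) / 2 ))
  echo "$total"
}

# wcu_needed ITEM_BYTES WRITES_PER_SEC
wcu_needed() {
  # Each write consumes one unit per 1 KB block, rounded up.
  units_per_write=$(( ($1 + 1023) / 1024 ))
  echo $(( units_per_write * $2 ))
}
```

For instance, reading a 6,000-byte item 10 times per second needs 20 RCU with strong consistency and 10 RCU with eventual consistency, and writing a 1,500-byte item 10 times per second needs 20 WCU.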
Waiting for asynchronous operations
The CLI commands create-table, update-table, and delete-table are asynchronous operations. Control returns immediately to the command line, but the operation continues to run in the background.
To wait for table creation, you can use the aws dynamodb wait table-exists --table-name <table-name> command, which polls the table until it is active. The wait table-exists command may be used in scripts to wait until the table is created before inserting data. Similarly, you can wait for table deletion using the aws dynamodb wait table-not-exists --table-name <table-name> command, which polls with describe-table until ResourceNotFoundException is thrown. Both wait options poll every 20 seconds and exit with a 255 return code after 25 failed checks.
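In a script, the waiter typically sits between table creation and the first write. The following is a minimal sketch of that pattern, reusing the table definition from this recipe. The function name create_table_and_wait is hypothetical, and region and profile options are omitted for brevity.

```shell
#!/bin/sh
# create_table_and_wait TABLE: create the table, then block until it is active.
create_table_and_wait() {
  table=$1
  aws dynamodb create-table \
    --table-name "$table" \
    --attribute-definitions AttributeName=id,AttributeType=S AttributeName=datetime,AttributeType=N \
    --key-schema AttributeName=id,KeyType=HASH AttributeName=datetime,KeyType=RANGE \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 || return 1
  # The waiter returns a non-zero exit code if the table never becomes active.
  aws dynamodb wait table-exists --table-name "$table" || return 1
  echo "Table $table is active"
}
```

Because the function returns non-zero when either step fails, a caller can write create_table_and_wait my_table && load_data, where load_data stands for whatever the script does next.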
Other ways to create tables
We created our table by specifying properties such as attribute-definitions, key-schema, provisioned-throughput, and so on. Instead, you can specify a JSON snippet or JSON file using the cli-input-json option. The generate-cli-skeleton option returns a sample template in the format required by the cli-input-json option.
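As an illustration of the cli-input-json approach, the sketch below writes the same table definition to a JSON file and passes it to create-table. The file location and the function names are hypothetical choices. The JSON keys are the create-table request fields that correspond to the command-line options used earlier.

```shell
#!/bin/sh
# Location of the JSON definition (hypothetical path; override TABLE_JSON to change it).
TABLE_JSON=${TABLE_JSON:-${TMPDIR:-/tmp}/my_table.json}

# Write the table definition; this step needs no AWS access.
cat > "$TABLE_JSON" <<'EOF'
{
  "TableName": "my_table",
  "AttributeDefinitions": [
    {"AttributeName": "id", "AttributeType": "S"},
    {"AttributeName": "datetime", "AttributeType": "N"}
  ],
  "KeySchema": [
    {"AttributeName": "id", "KeyType": "HASH"},
    {"AttributeName": "datetime", "KeyType": "RANGE"}
  ],
  "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
}
EOF

# print_skeleton: print an empty input template to compare against.
print_skeleton() { aws dynamodb create-table --generate-cli-skeleton; }

# create_from_json: create the table from the file (needs AWS credentials).
create_from_json() { aws dynamodb create-table --cli-input-json "file://$TABLE_JSON"; }
```

Keeping the definition in a file makes it easier to review and version than a long command line.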
We also created a table using the AWS CLI and CloudFormation. You can also create DynamoDB tables from Java code using the AWS SDK. However, in most real-world cases, CloudFormation templates are used to create and provision tables and the AWS SDK is used to work with data items.
There's more...
Let's first see some features and limitations of DynamoDB. We will also see some theory on the LSI and GSI.
DynamoDB features
The following are some of the important features of DynamoDB:
- DynamoDB is a fully managed NoSQL database service. There are no servers to manage.
- DynamoDB has the characteristics of both the key-value and the document-based NoSQL families.
- There is virtually no limit on throughput or storage. DynamoDB scales very well, but only within the provisioned throughput configuration.
- DynamoDB replicates data into three different facilities within the same region for availability and fault tolerance. You can also set up cross-region replication manually.
- It supports eventually consistent reads as well as strongly consistent reads.
- DynamoDB is schemaless at the table level. Each item (row) can have a different set of attributes. Even the same attribute name can be associated with different types in different items.
- DynamoDB automatically partitions and re-partitions data as the table grows in size.
- You can store JSON and then do nested queries on that data using the AWS SDK.
- Data is stored on SSD storage.
- DynamoDB supports atomic updates and atomic counters.
- DynamoDB supports conditional operations for put, update, and delete.
DynamoDB general limitations
Here are some of the general limitations of DynamoDB:
- DynamoDB does not support complex relational queries such as joins or complex transactions.
- DynamoDB is not suited for storing a large amount of data that is rarely accessed. S3 may be better suited for such use cases.
- You cannot select the Availability Zone for your DynamoDB table.
- Default replication of data for availability and fault tolerance is only within a region.
Local and global secondary indexes
You can define LSIs and GSIs for your tables to improve read performance. An LSI can be considered an alternate sort key for a given partition-key value. A GSI contains attributes from the base table and organizes them by a primary key that is different from that of the base table.
Secondary indexes are useful when you want to query based on non-key parameters. You can create them with the CLI as well as CloudFormation templates. There is a limit of five LSIs and five GSIs per table.
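As a sketch of how an index is declared with the CLI, the following creates the recipe's table with one LSI. An LSI shares the table's partition key, uses a different sort key, and must be defined when the table is created. The status attribute, the index name status_index, and the function name are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# One LSI: same partition key (id) as the table, alternate sort key (status).
LSI_JSON='[{
  "IndexName": "status_index",
  "KeySchema": [
    {"AttributeName": "id", "KeyType": "HASH"},
    {"AttributeName": "status", "KeyType": "RANGE"}
  ],
  "Projection": {"ProjectionType": "ALL"}
}]'

# create_table_with_lsi TABLE: every attribute used in a key schema,
# including the index sort key, must appear in --attribute-definitions.
create_table_with_lsi() {
  aws dynamodb create-table \
    --table-name "$1" \
    --attribute-definitions AttributeName=id,AttributeType=S AttributeName=datetime,AttributeType=N AttributeName=status,AttributeType=S \
    --key-schema AttributeName=id,KeyType=HASH AttributeName=datetime,KeyType=RANGE \
    --local-secondary-indexes "$LSI_JSON" \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
}
```

A GSI is declared in a similar way with the --global-secondary-indexes option, where the key schema may use a different partition key and the index carries its own provisioned throughput.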
Conclusion
You can read and learn more about LSIs and GSIs from the following links:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LSI.html
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html
If you found this article interesting, check out the Serverless Programming Cookbook for building and deploying real-world serverless applications. Explore the exciting world of Cloud offerings including Azure, Google Cloud, and IBM Cloud. The Serverless Programming Cookbook provides solutions to the most common problems faced in the world of serverless applications.
About the author
Heartin Kanikathottu is a Senior Software Engineer and Blogger with around 11 years of IT experience, currently working as a Senior Member of Technical Staff at VMware.