Streaming and Data Capture with AWS DynamoDB

Introduction

Streaming data and modifications from DynamoDB is a powerful feature that allows real-time processing of data changes in DynamoDB tables. With DynamoDB streams, you can track changes to items in a table, such as when an item is added, updated, or deleted. This enables a variety of use cases, including real-time data analytics, data synchronization across multiple systems, and triggering downstream processing or workflows.

In this article, we will explore how to use DynamoDB streams to monitor and react to data changes in DynamoDB tables. We will cover the basics of DynamoDB streams and discuss how to set them up and configure them to meet your needs. As a worked example, we will build a serverless setup in which the stream of a DynamoDB table triggers an AWS Lambda function.

How to Create a DynamoDB Table in the AWS Console

As a prerequisite, you will need to have a DynamoDB table already set up in your AWS account. There are a variety of ways to do this, including through the AWS Console, CloudFormation, or CDK. This article assumes that you have an existing DynamoDB table already in place.

Once you have a DynamoDB table set up, you can enable streams on that table to track changes to its items. To do this, you will need to specify the stream view type, which determines the information that is included in the stream records. There are four options for the stream view type:

  1. KEYS_ONLY: This option includes only the key attributes of the modified item in the stream record.
  2. NEW_AND_OLD_IMAGES: This option includes both the old and new item images in the stream record, allowing you to see both the before and after state of an item when it is modified.
  3. NEW_IMAGE: This option includes only the new item image in the stream record, showing the final state of the item after it has been modified.
  4. OLD_IMAGE: This option includes only the old item image in the stream record, showing the state of the item before it was modified.
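
To make the difference between the view types concrete, here is a minimal sketch of what having both images in one record allows. It assumes the stream uses NEW_AND_OLD_IMAGES; the function name `changedAttributes` and the record contents are hypothetical and only serve the illustration.

```javascript
// Sketch: list the attributes that differ between OldImage and NewImage.
// Assumes the NEW_AND_OLD_IMAGES view type; with any other view type one or
// both images are missing from record.dynamodb.
function changedAttributes(record) {
  const oldImage = record.dynamodb.OldImage || {};
  const newImage = record.dynamodb.NewImage || {};
  const names = new Set([...Object.keys(oldImage), ...Object.keys(newImage)]);
  // Attribute values arrive in DynamoDB JSON, e.g. { S: "text" } or { N: "42" },
  // so comparing their serialized form is good enough for this illustration.
  return [...names]
    .filter((name) => JSON.stringify(oldImage[name]) !== JSON.stringify(newImage[name]))
    .sort();
}

// Hypothetical MODIFY record: "status" changed, "note" was added, "id" is unchanged.
const exampleRecord = {
  eventName: "MODIFY",
  dynamodb: {
    OldImage: { id: { S: "order-1" }, status: { S: "pending" } },
    NewImage: { id: { S: "order-1" }, status: { S: "shipped" }, note: { S: "gift" } },
  },
};

console.log(changedAttributes(exampleRecord)); // [ 'note', 'status' ]
```

With NEW_IMAGE or OLD_IMAGE alone, this kind of comparison is not possible from the stream record itself, which is one reason to choose NEW_AND_OLD_IMAGES when consumers need to know what changed.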

Once you have chosen a stream view type, you can enable streams on your DynamoDB table by using the AWS Management Console, the AWS CLI, or the DynamoDB API. After you have enabled streams on a table, DynamoDB will begin tracking changes to the table and sending stream records to a stream, which can be processed in real time.
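
When streams are enabled through the API rather than the console, the setting is expressed as a StreamSpecification on the table. The sketch below only builds that request object; the helper name `buildEnableStreamParams` and the table name `my-table` are assumptions for illustration. An object of this shape is what the UpdateTable operation expects, for example through `UpdateTableCommand` in the AWS SDK for JavaScript v3.

```javascript
// The stream view types accepted by DynamoDB.
const STREAM_VIEW_TYPES = ["KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"];

// Hypothetical helper: builds the input for an UpdateTable call that turns
// streaming on for an existing table. It does not call AWS itself.
function buildEnableStreamParams(tableName, viewType) {
  if (!STREAM_VIEW_TYPES.includes(viewType)) {
    throw new Error(`Unknown stream view type: ${viewType}`);
  }
  return {
    TableName: tableName,
    StreamSpecification: { StreamEnabled: true, StreamViewType: viewType },
  };
}

// "my-table" is a placeholder; substitute the name of your own table.
const params = buildEnableStreamParams("my-table", "NEW_AND_OLD_IMAGES");
console.log(JSON.stringify(params, null, 2));
```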

There are several ways to process DynamoDB streams, including AWS Lambda functions, the Kinesis Client Library (through the DynamoDB Streams Kinesis Adapter), or custom applications that read from the stream directly. In this article, we will focus on using AWS Lambda to process DynamoDB streams and react to data changes in real time.

With DynamoDB streams and Lambda, you can build real-time data processing pipelines that can perform a wide range of tasks, such as:

  • Updating other systems with data changes from DynamoDB
  • Sending notifications or alerts when certain data changes occur
  • Performing real-time analytics on data changes
  • Synchronizing data across multiple systems in real time
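
Most of these tasks start from the same pattern: look at the record's `eventName` (INSERT, MODIFY, or REMOVE) and dispatch to the matching action. The following is a minimal sketch of that dispatch; `routeRecord`, the `handlers` table, and the returned strings are hypothetical stand-ins for real work such as calling another system or sending an alert.

```javascript
// Sketch: dispatch a stream record to an action based on its event name.
function routeRecord(record, handlers) {
  const action = handlers[record.eventName];
  if (!action) {
    return `ignored ${record.eventName}`;
  }
  return action(record.dynamodb);
}

// Placeholder actions; a real pipeline would update another system,
// publish a notification, or feed an analytics store here.
const handlers = {
  INSERT: (change) => `created ${JSON.stringify(change.Keys)}`,
  MODIFY: (change) => `updated ${JSON.stringify(change.Keys)}`,
  REMOVE: (change) => `deleted ${JSON.stringify(change.Keys)}`,
};

const insertRecord = { eventName: "INSERT", dynamodb: { Keys: { id: { S: "item-1" } } } };
console.log(routeRecord(insertRecord, handlers)); // created {"id":{"S":"item-1"}}
```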

We will return to these use cases later in this article.

Setting up AWS Lambda to Accept DynamoDB Streams

Here are the basic steps for setting up an AWS Lambda function to process DynamoDB stream records:

  1. Create an AWS Lambda function. When creating the function, you will need to specify a runtime (language), and write the code that will be executed by the function.
  2. Configure the Lambda function's triggers, with a DynamoDB stream as a trigger for the function.
  3. Write code to process the streaming data. This is the code that will handle the stream records that are passed to the function and perform any necessary actions based on the data contained in the records.
  4. Test and deploy the Lambda function. You can then deploy the function to make it live and start processing streaming data from DynamoDB.

Now, we will go over each of these steps in depth.

Creating an AWS Lambda Function

To set up an AWS Lambda function through the AWS Management Console, you will need to perform the following steps:

  1. Navigate to the AWS Lambda console: Go to the AWS Management Console and sign in to your AWS account. Then, navigate to the AWS Lambda console by searching for "Lambda" in the services search bar.
  2. Configure the non-default execution role: In order for your Lambda function to be able to access records and streams from DynamoDB, you must define an execution role with the proper permissions. Open the Roles page in the AWS IAM Console, and choose “Create role.” Create a role with the following properties:

    1. Trusted entity – Lambda.
    2. Permissions – AWSLambdaDynamoDBExecutionRole (hit the “+” sign next to the policy to see the policy statement).
    3. Role name – lambda-dynamodb-role.

  3. Click the "Create function" button: On the AWS Lambda dashboard, click the "Create function" button to begin the process of creating a new Lambda function.
  4. Add a function name, choose a runtime, and select the execution role you created: On the "Create function" page, choose the name and runtime that you want to use for your function. This will determine the programming language and environment that your function will run in, as well as the name you can use to refer to the function later on. In this example, we will use a Node.js runtime, but our architecture will work similarly with any of the runtimes available in Lambda, with minor code differences based on language. Make sure to also select the lambda-dynamodb-role you just created as the execution role (not the default role) to allow Lambda to read DynamoDB streams.
  5. Configure the function's triggers: To set up a Lambda function to process streaming data from DynamoDB, you will need to specify a DynamoDB stream as a trigger for the function. You can do this by clicking the "Add trigger" button and selecting "DynamoDB" from the list of trigger types. Then, choose the DynamoDB stream that you want to use as a trigger for your function. For this, use the DynamoDB table you have already created. If you receive an error about permissions, go back and make sure that the policy from Step 2 was correctly added to the role.
  6. Configure the function's other settings: You can also configure other settings for your function, such as its description, the memory and timeout settings, and the environment variables.
  7. Click the "Create function" button: After you have finished configuring your function, click the "Create function" button to create the function and make it live.
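
For readers who prefer scripting over the console, the role and trigger configured above correspond roughly to the inputs below. This is a sketch that only builds the request objects (for the IAM CreateRole and AttachRolePolicy operations and the Lambda CreateEventSourceMapping operation) and does not call AWS. The function name `my-stream-function`, the stream ARN, and the helper `buildEventSourceMappingParams` are hypothetical placeholders.

```javascript
// Trust policy that lets the Lambda service assume the execution role
// (the "Trusted entity – Lambda" choice in the console).
const lambdaTrustPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Principal: { Service: "lambda.amazonaws.com" },
      Action: "sts:AssumeRole",
    },
  ],
};

// AWS managed policy selected in the role's Permissions step.
const managedPolicyArn =
  "arn:aws:iam::aws:policy/service-role/AWSLambdaDynamoDBExecutionRole";

// Hypothetical helper: input for CreateEventSourceMapping, which is what
// adding a DynamoDB trigger in the console sets up behind the scenes.
function buildEventSourceMappingParams(functionName, streamArn) {
  return {
    FunctionName: functionName,
    EventSourceArn: streamArn,
    StartingPosition: "LATEST", // or "TRIM_HORIZON" to begin with the oldest retained record
    BatchSize: 100, // maximum number of records passed to one invocation
  };
}

// Placeholder ARN; the account ID, table name, and stream label are made up.
const exampleStreamArn =
  "arn:aws:dynamodb:us-east-1:123456789012:table/my-table/stream/2024-01-01T00:00:00.000";
const mappingParams = buildEventSourceMappingParams("my-stream-function", exampleStreamArn);
console.log(JSON.stringify(mappingParams, null, 2));
```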

Code to Process Streaming Data

Now, we need to add some code to process incoming stream events from DynamoDB. Simply copy and paste the following code in the Lambda function’s “Code” section. Once the code is added, click the highlighted “Deploy” button to make the new code live.

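// Lambda entry point: receives a batch of DynamoDB stream records.
export const handler = async (event) => {
    // Log the whole event so its structure is visible in CloudWatch.
    console.log(JSON.stringify(event, null, 2));
    for (const record of event.Records) {
        console.log(record.eventID);   // unique ID of this stream record
        console.log(record.eventName); // INSERT, MODIFY, or REMOVE
        // Keys, item images, and metadata describing the change
        console.log('DynamoDB Record: %j', record.dynamodb);
    }
};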
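
The item images inside `record.dynamodb` use DynamoDB's typed JSON, where each value is wrapped in a type descriptor such as `S` (string) or `N` (number). As a rough sketch of how that maps to plain JavaScript values, the hypothetical helpers `fromAttributeValue` and `fromImage` below handle the most common types by hand. In a real function you would more likely use the `unmarshall` utility from the `@aws-sdk/util-dynamodb` package; this version exists only to show the shape of the data.

```javascript
// Sketch: convert one DynamoDB-typed attribute value to a plain JS value.
// Only S, N, BOOL, NULL, L, and M are handled; sets and binary values
// (SS, NS, BS, B) are returned unchanged in this simplified version.
function fromAttributeValue(value) {
  if ("S" in value) return value.S;
  if ("N" in value) return Number(value.N); // may lose precision for very large numbers
  if ("BOOL" in value) return value.BOOL;
  if ("NULL" in value) return null;
  if ("L" in value) return value.L.map(fromAttributeValue);
  if ("M" in value) return fromImage(value.M);
  return value;
}

// Convert a whole image (NewImage, OldImage, or Keys) to a plain object.
function fromImage(image) {
  return Object.fromEntries(
    Object.entries(image).map(([name, value]) => [name, fromAttributeValue(value)])
  );
}

// Hypothetical NewImage as it would appear in a stream record.
const exampleImage = {
  id: { S: "order-1" },
  qty: { N: "3" },
  tags: { L: [{ S: "a" }, { S: "b" }] },
  meta: { M: { paid: { BOOL: true } } },
};

console.log(JSON.stringify(fromImage(exampleImage)));
// {"id":"order-1","qty":3,"tags":["a","b"],"meta":{"paid":true}}
```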

Adding an Item to DynamoDB

Now that you have set up the Lambda function to accept DynamoDB streams, let’s test it out by making a change to the table. We will add a single item to demonstrate this functionality.

Go to DynamoDB and create an item in the table from the AWS Console. You can do this by going to DynamoDB > Tables > Your Table > Actions > Create Item.
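
If you would like to check your processing logic before touching the table, you can also run it locally against a hand-written event. The sketch below assumes the general shape of a DynamoDB stream event for an INSERT; every value in `sampleEvent` is made up, and `describeRecords` is a hypothetical helper that mirrors what the handler above logs.

```javascript
// Hypothetical INSERT event; all IDs, keys, and values are placeholders.
const sampleEvent = {
  Records: [
    {
      eventID: "1",
      eventName: "INSERT",
      eventSource: "aws:dynamodb",
      awsRegion: "us-east-1",
      dynamodb: {
        Keys: { id: { S: "item-1" } },
        NewImage: { id: { S: "item-1" }, message: { S: "hello" } },
        SequenceNumber: "111",
        SizeBytes: 26,
        StreamViewType: "NEW_AND_OLD_IMAGES",
      },
    },
  ],
};

// Build one summary line per record so the logic can be checked without AWS.
function describeRecords(event) {
  return event.Records.map(
    (record) => `${record.eventID} ${record.eventName} ${JSON.stringify(record.dynamodb.Keys)}`
  );
}

// Same structure as a Lambda handler, invoked directly for a local run.
const handler = async (event) => {
  describeRecords(event).forEach((line) => console.log(line));
};

handler(sampleEvent); // logs: 1 INSERT {"id":{"S":"item-1"}}
```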

Once the item is created, you can go to the Lambda function’s CloudWatch logs (from the function's Monitor tab) and see the change in real time.

Extending the Application to More AWS Features

A Lambda function can be used to trigger virtually any other application. As your Lambda consumes DynamoDB streams, you can use the changes in the stream to initiate many other tasks, both within AWS architecture and outside of it. This means you can work with tools such as S3, Kinesis, SNS, SQS, and even other Lambdas!
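
As one example of such a hand-off, the sketch below prepares stream records for an SQS queue. It relies on the fact that an SQS SendMessageBatch request accepts at most 10 entries, so the records are grouped accordingly. The helper `toSqsBatches` and the message format are assumptions for illustration, and the code only builds the entries; sending them would require a call such as `SendMessageBatchCommand` from the AWS SDK for JavaScript v3 with your own queue URL.

```javascript
// SendMessageBatch accepts at most 10 entries per request.
const SQS_MAX_BATCH_ENTRIES = 10;

// Sketch: turn stream records into groups of SQS batch entries.
function toSqsBatches(records) {
  const entries = records.map((record, index) => ({
    Id: `record-${index}`, // must be unique within a single batch request
    MessageBody: JSON.stringify({
      eventName: record.eventName,
      keys: record.dynamodb.Keys,
    }),
  }));
  const batches = [];
  for (let i = 0; i < entries.length; i += SQS_MAX_BATCH_ENTRIES) {
    batches.push(entries.slice(i, i + SQS_MAX_BATCH_ENTRIES));
  }
  return batches;
}

// 23 hypothetical INSERT records are split into batches of 10, 10, and 3.
const manyRecords = Array.from({ length: 23 }, (_, i) => ({
  eventName: "INSERT",
  dynamodb: { Keys: { id: { S: `item-${i}` } } },
}));
console.log(toSqsBatches(manyRecords).map((batch) => batch.length)); // [ 10, 10, 3 ]
```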

Summary

In this blog post, we explored how to stream data from a DynamoDB table to a Lambda function using AWS's built-in integration. We started by creating a Lambda function with an execution role that granted it read access to DynamoDB streams. We then set up the stream on our DynamoDB table and specified it as the trigger for our Lambda function.

With these steps in place, any time there is a change to the DynamoDB table (such as a new item being added), the Lambda function will be automatically triggered and the updated data will be passed to it. We tested this setup by adding a new item to the DynamoDB table and verifying that it was processed by the Lambda function in real time via CloudWatch logs.

Overall, streaming data from DynamoDB to a Lambda function is a straightforward process that can be easily set up and maintained using the built-in integration provided by AWS. This can be a useful tool for building real-time data processing pipelines or triggering downstream workflows in response to changes in your DynamoDB data.

Next steps

If you're interested in learning more about the basics of Python, coding, and software development, check out our Coding Essentials Guidebook for Developers, where we cover the essential languages, concepts, and tools that you'll need to become a professional developer.

Thanks and happy coding! We hope you enjoyed this article. If you have any questions or comments, feel free to reach out to jacob@initialcommit.io.

References

  1. AWS DynamoDB Streams - https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html
  2. How to Create a DynamoDB Table - https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SampleData.CreateTables.html