A Serverless project to help you operate on every existing item in a DynamoDB table

coutoluizf/serverless-dynamodb-scanner

Serverless DynamoDB Scanner

This is a Serverless application that scans a given DynamoDB table and inserts every item into a Kinesis Stream. You can then process the Kinesis stream, allowing you to perform an operation on all existing items in a DynamoDB table.

It was inspired by a tweet from Eric Hammond:

We really want to run some code against every item in a DynamoDB table.

Surely there's a sample project somewhere that scans a DynamoDB table, feeds records into a Kinesis Data Stream, which triggers an AWS Lambda function?

We can scale DynamoDB and Kinesis manually. https://t.co/ZyAiLfLpWh

— Eric Hammond (@esh) February 5, 2019
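On the consuming side, the Lambda function triggered by the Kinesis stream has to decode each record before operating on the item. The following is a minimal sketch of such a consumer, not code from this project. It assumes the scanner wrote each item to the stream as a JSON document, and `process_item` is a hypothetical placeholder for whatever operation you want to run.

```python
import base64
import json


def process_item(item):
    """Hypothetical per-item operation; replace with your own logic."""
    return item


def handler(event, context):
    """Lambda handler for a function subscribed to the Kinesis stream."""
    processed = []
    for record in event["Records"]:
        # Kinesis delivers each record's payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: the producer serialized each DynamoDB item as JSON.
        item = json.loads(payload)
        processed.append(process_item(item))
    return {"processed": len(processed)}
```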

Usage:

This project uses the Serverless Framework to deploy a Lambda function and associated AWS resources.

To use it, follow these steps:

  1. Install the Framework and create your service:

    # Make sure you have the Serverless Framework installed
    $ npm install -g serverless
    
    $ sls create --template-url https://github.com/alexdebrie/serverless-dynamodb-scanner --path serverless-dynamodb-scanner
    
    $ cd serverless-dynamodb-scanner
  2. Update the configuration in serverless.yml.

    Add the ARN of the DynamoDB table you want to scan and the ARN of the Kinesis stream where you want the scanned items written:

    # serverless.yml
    
    custom:
      dynamodbTableArn: 'arn:aws:dynamodb:us-east-1:123456789012:table/my_table'
      kinesisStreamArn: 'arn:aws:kinesis:us-east-1:123456789012:stream/my-stream'
    
    ...
  3. Deploy your service:

    $ sls deploy
  4. When you're ready, kick off your scan by invoking the function:

    $ sls invoke -f scanner

How does it work?

The basic workflow is as follows:

[Diagram: dynamodb scanner]

A diagram that's missing a few steps ¯\_(ツ)_/¯

  1. Inside the Lambda function, check AWS SSM for a LastEvaluatedKey parameter that would be sent with our Scan call.

    If no parameter exists in SSM, we're just starting the scan.

  2. Make a Scan call to our DynamoDB table.

  3. Insert the items returned from our Scan into our Kinesis Stream via a PutRecords call.

  4. If the Scan call did not return a LastEvaluatedKey, our Scan is done! We can exit the function.

  5. If the Scan did return a LastEvaluatedKey, store the value in SSM.

  6. Do a time check: if our function has less than 15 seconds of execution time left, we invoke another instance of the function and exit the loop in this one. Otherwise, we loop back to step 2 with the new LastEvaluatedKey. The next invocation picks up where the scan left off by reading the LastEvaluatedKey from SSM.
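The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual handler: the parameter name `SSM_PARAMETER`, the helper names, and the JSON serialization of items and keys are all hypothetical. The AWS clients are passed in as arguments so the loop can be exercised without AWS access. In a real function they would be `boto3.client("dynamodb")`, `boto3.client("kinesis")`, and `boto3.client("ssm")`.

```python
import hashlib
import json

SSM_PARAMETER = "/dynamodb-scanner/last-evaluated-key"  # hypothetical name
MIN_REMAINING_MS = 15_000  # the 15-second threshold from step 6
MAX_BATCH = 500  # PutRecords accepts at most 500 records per call


def load_start_key(ssm):
    """Step 1: return the saved LastEvaluatedKey, or None on a fresh scan."""
    try:
        response = ssm.get_parameter(Name=SSM_PARAMETER)
    except ssm.exceptions.ParameterNotFound:
        return None
    return json.loads(response["Parameter"]["Value"])


def put_items(kinesis, stream_name, items):
    """Step 3: write items to the stream in batches of at most 500.

    Assumes items contain no binary attributes, so they serialize as JSON.
    A production version should also retry records reported as failed.
    """
    for start in range(0, len(items), MAX_BATCH):
        records = []
        for item in items[start:start + MAX_BATCH]:
            data = json.dumps(item, sort_keys=True).encode("utf-8")
            # Hash the payload so records spread across shards.
            key = hashlib.sha256(data).hexdigest()
            records.append({"Data": data, "PartitionKey": key})
        kinesis.put_records(StreamName=stream_name, Records=records)


def run_scan(dynamodb, kinesis, ssm, table_name, stream_name,
             remaining_ms, reinvoke):
    """Run steps 1-6; return "done" or "continued"."""
    start_key = load_start_key(ssm)
    while True:
        kwargs = {"TableName": table_name}
        if start_key is not None:
            kwargs["ExclusiveStartKey"] = start_key
        page = dynamodb.scan(**kwargs)                      # step 2
        put_items(kinesis, stream_name, page.get("Items", []))
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:                               # step 4
            return "done"
        ssm.put_parameter(Name=SSM_PARAMETER,               # step 5
                          Value=json.dumps(start_key),
                          Type="String", Overwrite=True)
        if remaining_ms() < MIN_REMAINING_MS:               # step 6
            reinvoke()
            return "continued"
```

Inside a Lambda handler, `remaining_ms` could be `context.get_remaining_time_in_millis`, and `reinvoke` could wrap an asynchronous `invoke` call on the Lambda client with `FunctionName=context.function_name` and `InvocationType="Event"`.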

Languages

  • Python 100.0%