AWS Database Migration Service 101

I am definitely not a Database Expert! But, if you are anything like me, your various roles have required you to jump in and out of that world once or twice a year. Enough to get your boots a little muddy. I was recently tasked with standing up an instance of the AWS Database Migration Service at work and I definitely had that brief wave of anxiety hit me.

But……as always, it wasn’t that bad at all once you get into it. The AWS side of things is pretty straight forward and if you come into this with a smidge of DB knowledge I think you’ll find it a rewarding experience. Kind of like an after school special where everybody learned a little bit more about themselves that day.

WTF Is A DMS And Why Should I Care?

AWS Database Migration Service (DMS) is basically Amazon’s moving truck for databases. It picks up your data from one database and delivers it safely to another. This tool can do this on AWS, between AWS services, or even from a different cloud or on-prem system. And here are two other REALLY neat parts…

  1. It can migrate between different database engines! Think Oracle -> PostgreSQL or SQL Server -> Amazon Aurora.
  2. It also keeps the source database running while the move happens! It’s often used for cloud migrations, upgrades, or consolidating multiple databases into one.

Standard Vs. Serverless

Feature DMS Standard DMS Serverless
Compute Dedicated Replication Instance AWS Managed Compute
Scaling Fixed Capacity / You Adjust Elastic Scaling Based On Demand
Management You Manage Networking, Instance Sizing, Patching Managed By AWS
Pricing Billed Per Hour For The Instance Billed By Capacity Units Used During Replication
Use Cases You Want Tight Control You Want Simplicity
Supported Endpoints All DMS Supported Engines & Endpoints *Not Yet 100% But Growing
Maturity An Old Wise Man A Cool Older Cousin

The Basics

Below is an overview of the configuration we are talking about today. As you can see, it’s honestly pretty simple at a high level. The Replication Instance sits in the middle and does the work of pulling the data out of SQL Server, massaging it as needed, and then putting it into the endpoint. In our case we are putting this information into an S3 Bucket to act as a data lake.

AWS Backup Plan

Let’s Dive Deepers

Now let’s zoom in a bit and break apart some of the individual pieces!

Replication Instance

Since we are using DMS Standard we are going to need a dedicated Replication Instance to do all of the grunt work. Configuration of this instance is pretty simple as you can see in the diagram below. The Instance Class value is required and is how we are manually defining the EC2 instance size we want to use for our specific Replication Tasks.

Also, please be mindful of the fact that you are billed as long as your instance is online and in a running state. Even if you are not running a task, the instance is costing you money. You might want to get a system in place to power this down when not in use.

AWS Backup Plan

Replication Task

This might be the most intricate part of our setup today. The Replication Task is the brains behind this entire process. It points to a Replication Instance and then uses it to actually move the data between the endpoints. You’ll also notice that this is where settings and mappings between the two endpoints are defined and used along the way.

AWS Backup Plan

Replication Endpoints

The Replication Endpoint component is used to define both the source and target endpoints you are working with. The settings here include definitions around the database type the endpoint represents and also the actual details to connect to the database.

One thing I particular I wanted to call out is the Secrets Manager Secret setting. In most scenarios the details used to actually connect to the database are sensitive and you want to keep them locked away as tight as possible. That is what we are doing here. The Secret that we are pointing to holds those connection details.

You CAN define those settings directly in the endpoint configuration itself if you like but your grandfather will look down from heaven and sigh in shame.

AWS Backup Plan

S3 Replication Endpoints

So as we touched on above, there are many different types of endpoints you can use as both a source and target for your migrations. Many of them are unique and will have specific settings or configurations that aren’t used on the others. What that being said, I did want to specifically call out S3 Replication Endpoints for two reasons.

  1. We are using one in our example here today
  2. Out of all of the “edge case” endpoint types, S3 is probably one of the more extreme examples

So what makes S3 so different? Well you might have noticed that S3 isn’t a database! Other endpoints like Oracle or PostgreSQL understand schemas and support transactions. S3 is nothing more than an object store. While more traditional engines are neat and organized by default S3 is a giant bucket where we throw everything in and rummage around.

There is one last thing I want to call out about S3 endpoints. When DMS moves files in or out of S3 it also puts supporting control files in the bucket that act as metadata and checkpoints. These files help the other processes understand what’s been loaded, how to interpret it, and more.

Slather Some Terraform On That

So now let’s take a look at a basic configuration in Terraform for reference. Please note that you can also find this in my Github account.

providers.tf

We are going to initialize our AWS provider as well as populate our secret key and id so that it knows how to authenticate back to the AWS API.

provider "aws" {
  region  = var.aws_region
  access_key = "ASDFASDFASDFASDFASDFASDF"
  secret_key = "ASDFASDFASDFASDFASDFASDF"
}

terraform.tf

This is where we are going to define some high level configuration around how we want Terraform as a whole to operate.

terraform {
  required_version = ">= 1.12"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 6.4"
    }
  }
}

data.tf

Query our current account and also the existing subnets out there we are operating in.

# Lets Us Get Access To Account Information
data "aws_caller_identity" "current" {}

# Which Subnets DMS Is Allowed To Use
data "aws_subnets" "subnet" {
  filter {
    name   = "tag:Name"
    values = var.subnets
  }
}

tasks.tf

Glue it all together with tasks…

resource "aws_dms_replication_task" "dms_dev_replication" {
  migration_type            = "full-load-and-cdc"
  replication_instance_arn  = aws_dms_replication_instance.instance.replication_instance_arn
  replication_task_id       = "dms-${var.environment}-task"
  source_endpoint_arn       = aws_dms_endpoint.dms_dev_sql_source.endpoint_arn
  target_endpoint_arn       = aws_dms_s3_endpoint.dms_dev_s3_target.endpoint_arn
  table_mappings            = file("table-mappings.json")
  replication_task_settings = file("task-settings.json")

  tags = {
    Name = "${var.environment}-dms-replication"
  }
} 

endpoints.tf

Define all of our endpoints here in this file. Note that different types of endpoints will have different arguments.

# Our Source SQL Server Endpoint We Are Pulling Data From
resource "aws_dms_endpoint" "dms_dev_sql_source" {
  endpoint_id                     = "dms-${var.environment}-sql-source"
  endpoint_type                   = "source"
  engine_name                     = "sqlserver"
  secrets_manager_arn             = aws_secretsmanager_secret.db_connection_info.arn # - Keep DB Connection Info Secret
  secrets_manager_access_role_arn = aws_iam_role.dms_dev_permissions.arn # - Role To Access Secrets Manager

  tags = {
    Name = "dms-${var.environment}-sql-source"
  }
}

# Our Target S3 Bucket We Are Migrating Data Into
resource "aws_dms_s3_endpoint" "dms_dev_s3_target" {
  endpoint_id              = "dms-${var.environment}-s3-target"
  endpoint_type            = "target"
  bucket_name              = var.target_bucket_name
  service_access_role_arn  = aws_iam_role.dms_dev_permissions.arn # - Role To Access Bucket Resources
  csv_row_delimiter        = "\\n"
  csv_delimiter            = ","
  bucket_folder            = "dev_sql_db"
  compression_type         = "NONE"
  data_format              = "parquet"
  parquet_version          = "parquet-2-0"
  enable_statistics        = true
  include_op_for_full_load = true
  timestamp_column_name    = "ingestion_timestamp"
  date_partition_enabled   = false
  date_partition_sequence  = "yyyymmdd"
  date_partition_delimiter = "slash"
  add_column_name          = true
  cdc_max_batch_interval   = 3600
  cdc_min_file_size        = 64000


  tags = {
    Name = "dms-${var.environment}-s3-target"
  }
}

instances.tf

If the tasks are the brains the instances are the muscle. Remember, that you are getting billed for these instances 24/7 even if they are not currently running any migration jobs.

resource "aws_dms_replication_instance" "instance" {
  replication_instance_class  = var.instance_class
  replication_instance_id     = "dms-${var.environment}-replication"
  allocated_storage           = 50
  availability_zone           = var.availability_zone
  replication_subnet_group_id = aws_dms_replication_subnet_group.instance.replication_subnet_group_id

  tags = {
    Name = "dms-${var.environment}-replication"
  }
}

resource "aws_dms_replication_subnet_group" "instance" {
  replication_subnet_group_description = "Subnets to use for dms-${var.environment}-replication instance"
  replication_subnet_group_id          = "dms-${var.environment}-replication"
  subnet_ids                           = data.aws_subnets.subnet.ids

  tags = {
    Name = "dms-${var.environment}-replication"
  }
}

secrets.tf

Notice we are creating the secret here but not providing it a value. We want to try to keep this as secure as possible by keeping it out of terraform state and our repos. I went back in and manually populated the information in the AWS console after the secret was built.

resource "aws_secretsmanager_secret" "db_connection_info" {
    name = "dms-${var.environment}-db-connection-info"
}

iam.tf

We need to be able to authenticate to our S3 Endpoint and also our Secret resource and this is the role we will use.

# Allow The DMS Service To Assume The Role
data "aws_iam_policy_document" "assume_role" {
  statement {
    effect = "Allow"

    principals {
      type        = "Service"
      identifiers = ["dms.amazonaws.com"]
    }

    actions = ["sts:AssumeRole"]
  }
}

# IAM Role To Assign To Our Instance
resource "aws_iam_role" "dms_dev_permissions" {
  name               = "dms-${var.environment}-dms-permissions"
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
}

# Provides The Inline Policy To The Role
resource "aws_iam_role_policy" "inline_policy" {
  name   = "inline_policy"
  role   = aws_iam_role.dms_dev_permissions.id
  policy = data.aws_iam_policy_document.inline_policy.json
}

# DMS Role Inline Policy Document
data "aws_iam_policy_document" "inline_policy" {
  statement {
    sid = "DmsPermissions"
    actions = [
      "secretsmanager:GetSecretValue",
      "secretsmanager:DescribeSecret"
    ]
    resources = [aws_secretsmanager_secret.db_connection_info]
  }
  statement {
    sid = "DmsS3Permissions"
    actions = [
      "s3:PutObject",
      "s3:PutObjectAcl",
      "s3:ListBucket",
      "s3:GetBucketLocation",
      "s3:DeleteObject"
    ]
    resources = [
      "arn:aws:s3:::${var.target_bucket_name}",
      "arn:aws:s3:::${var.target_bucket_name}/*"
    ]
  }
}

table-mappings.json

This file uses several types of rules to specify the data source, source schema, data, and any transformations that should occur during the task. You can use table mapping to specify individual tables in a database to migrate and the schema to use for the migration.

{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "015552122",
            "rule-name": "962748620",
            "object-locator": {
                "schema-name": "dbo",
                "table-name": "tbldev"
            },
            "rule-action": "include",
            "filters": []
        },
        {
            "rule-type": "selection",
            "rule-id": "962897620",
            "rule-name": "962849360",
            "object-locator": {
                "schema-name": "dbo",
                "table-name": "tbldev"
            },
            "rule-action": "include",
            "filters": []
        }
    ]
}

task-settings.json

A giant file that defines a lot of the task settings we want to configure.

{
    "Logging": {
        "EnableLogging": true,
        "EnableLogContext": true,
        "LogComponents": [
            {
                "Severity": "LOGGER_SEVERITY_DEFAULT",
                "Id": "TRANSFORMATION"
            },
            {
                "Severity": "LOGGER_SEVERITY_DEFAULT",
                "Id": "SOURCE_UNLOAD"
            },
            {
                "Severity": "LOGGER_SEVERITY_DEFAULT",
                "Id": "IO"
            },
            {
                "Severity": "LOGGER_SEVERITY_DEFAULT",
                "Id": "TARGET_LOAD"
            }
        ]
    },
    "StreamBufferSettings": {
        "StreamBufferCount": 3,
        "CtrlStreamBufferSizeInMB": 5,
        "StreamBufferSizeInMB": 8
    },
    "ErrorBehavior": {
        "FailOnNoTablesCaptured": true,
        "ApplyErrorUpdatePolicy": "LOG_ERROR",
        "FailOnTransactionConsistencyBreached": false,
    },
    "TTSettings": {
        "TTS3Settings": null,
        "TTRecordSettings": null,
        "EnableTT": false
    },
    "FullLoadSettings": {
        "CommitRate": 10000,
        "StopTaskCachedChangesApplied": false,
        "StopTaskCachedChangesNotApplied": false,
        "MaxFullLoadSubTasks": 8,
    },
    "TargetMetadata": {
        "ParallelApplyBufferSize": 0,
        "ParallelApplyQueuesPerThread": 0,
        "ParallelApplyThreads": 0,
        "TargetSchema": ""
    },
    "BeforeImageSettings": null,
    "ControlTablesSettings": {
        "historyTimeslotInMinutes": 5,
        "CommitPositionTableEnabled": false,
        "HistoryTimeslotInMinutes": 5,
        "StatusTableEnabled": false
    },
    "LoopbackPreventionSettings": null,
    "CharacterSetSettings": null,
    "FailTaskWhenCleanTaskResourceFailed": false,
    "ChangeProcessingTuning": {
        "StatementCacheSize": 50,
        "CommitTimeout": 1,
    },
    "ChangeProcessingDdlHandlingPolicy": {
        "HandleSourceTableDropped": true,
        "HandleSourceTableTruncated": true,
        "HandleSourceTableAltered": true
    },
    "PostProcessingRules": null
}

variables.tf

Define all the variables we will be using in the configuration.

variable "aws_region" {
  description = "AWS region identifier for created resources."
  type        = string
}

variable "environment" {
  description = "Which environment we are working in."
  type        = string
}

variable "availability_zone" {
  description = "AWS AZ identifier for created resources."
  type        = string
}

variable "target_bucket_name" {
  description = "The name of the target S3 bucket."
  type        = string
}

variable "instance_class" {
  description = "The class of the replication instance."
  type        = string
}

variable "subnets" {
  description = "Subnets in which to deploy the infrastructure (minimum 2 required)"
  type        = set(string)
}

terraform.tfvars

Assign values to our variables that we defined in variables.tf

aws_region = "us-east-1"

environment = "dev"

target_bucket_name = "dms-${var.environment}-s3-target"

availability_zone = "us-east-1a"

instance_class = "dms.r5.4xlarge"

subnets = [
  "dms-${var.default_tags["alk:env"]}-sql",
  "dms-${var.default_tags["alk:env"]}-sql2",
]

In Conclusion

So that’s pretty much it for now. To reiterate, the goal here primarily was to give a high level overview of this service for somebody who might NOT be a database expert. If you are working alongside a DBA the AWS side of this is not bad at all.

Using AWS Backup To Automate Windows Server Backups
AWS Managed Microsoft AD Jamboree

Start the conversation