AWS Migration Using DMS Series Part-1
by Rackspace Technology Staff
In this fast-moving tech world, everyone wants to use the best technology with maximum performance and minimum cost. AWS Cloud was introduced with the same notion.
In 2016, AWS introduced its Database Migration Service (DMS) to help customers migrate their databases to AWS.
In this blog, we look at the different methods AWS provides to migrate data from on-premises or non-AWS cloud environments to AWS (Amazon Web Services). Our primary focus is DMS, which costs very little to use. For more details on pricing, see the AWS DMS pricing page.
Amazon provides two native migration options:
- Schema Conversion Tool (SCT)
- SCT is a tool for converting database schemas from one engine to another. To learn more about SCT, see the AWS SCT documentation.
- Database Migration Service (DMS)
- DMS is a service provided by AWS to help customers move their databases to the AWS cloud. NOTE: Before migration, AWS DMS checks the target for the tables and their associated primary keys. If they do not exist, AWS DMS creates them and then starts the actual data transfer. Alternatively:
- We can pre-create the target tables ourselves.
- We can use the AWS Schema Conversion Tool (AWS SCT) to create the following objects:
- Target tables
- Triggers and many other objects
What does DMS not do?
It does not create schemas on the target database. The schema should already exist if data has to be moved into a specific schema.
It does not convert schemas or code. If that is needed, we need to use tools such as SQL Developer, MySQL Workbench, or SCT.
Database Migration Service
DMS generally runs on an EC2 server in a cloud VPC (Virtual Private Cloud), called the replication instance, which runs the replication software. It can be used to migrate the following databases:
- Oracle databases, versions 10.x to 19c
- Data warehouses
- PostgreSQL
- NoSQL databases such as MongoDB
- MySQL-compatible databases such as MariaDB
- Azure SQL Database
- Google Cloud Platform (GCP) MySQL
- And many more
For the full list, see the AWS DMS documentation on supported sources and targets.
1. A Replication Instance
The following prerequisite resources are needed to create the EC2 instance:
- Security group (with an outbound rule that lets all traffic on all ports leave (egress) the VPC)
- Network configuration. One of the following configurations can be used, but you need to choose one beforehand. For details, see the AWS DMS documentation on network setup.
a. Network configuration to a VPC using AWS VPN or Direct Connect
b. Network configuration to a VPC using the internet
c. Network configuration using ClassicLink, where the RDS DB is not in a VPC and the source is not in a VPC
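As a rough illustration of these prerequisites, the sketch below assembles the keyword arguments that boto3's `create_replication_instance` call accepts. The identifier, security group ID, subnet group name, instance class, and storage size are placeholder assumptions, not values from this post. The actual AWS call is left commented out so the block runs without credentials.

```python
def replication_instance_params(identifier, security_group_ids, subnet_group,
                                publicly_accessible=False,
                                instance_class="dms.t3.medium",
                                storage_gb=50):
    """Build keyword arguments for boto3's dms.create_replication_instance."""
    if not security_group_ids:
        raise ValueError("at least one VPC security group is required")
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": instance_class,
        "AllocatedStorage": storage_gb,
        # Security group whose outbound rule lets traffic leave the VPC.
        "VpcSecurityGroupIds": list(security_group_ids),
        "ReplicationSubnetGroupIdentifier": subnet_group,
        # False suits a VPN or Direct Connect setup; True is needed when
        # the source is reached over the internet.
        "PubliclyAccessible": publicly_accessible,
    }


# Hypothetical identifiers, for illustration only.
instance_params = replication_instance_params(
    identifier="dms-demo-instance",
    security_group_ids=["sg-0123456789abcdef0"],
    subnet_group="dms-demo-subnet-group",
)

# With credentials configured, the call would look like:
# import boto3
# boto3.client("dms").create_replication_instance(**instance_params)
```

Keeping the parameters in a plain dictionary makes it easy to review the network choice before anything is created.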
2. Endpoints
The following general information is required for each endpoint:
- Endpoint type – Choose Source when configuring the source, otherwise Target.
- Engine type – The DB engine type of the source or target being configured.
- Server name – Server name or IP address that AWS DMS can reach, for the source or target being configured.
- Port – DB port used for making DB connections.
- Encryption – If enabled, SSL is used to encrypt the connection.
- Credentials – DB user and password for an account with the required access.
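The endpoint fields listed above map fairly directly onto the arguments of boto3's `create_endpoint` call. The sketch below builds those arguments for a source and a target. The host names, engine choices, and credentials are made-up placeholders, and the AWS call itself is commented out.

```python
VALID_ENDPOINT_TYPES = {"source", "target"}


def endpoint_params(identifier, endpoint_type, engine, server, port,
                    username, password, use_ssl=False):
    """Build keyword arguments for boto3's dms.create_endpoint."""
    if endpoint_type not in VALID_ENDPOINT_TYPES:
        raise ValueError("endpoint_type must be 'source' or 'target'")
    return {
        "EndpointIdentifier": identifier,
        "EndpointType": endpoint_type,   # Endpoint type
        "EngineName": engine,            # Engine type, e.g. "oracle", "mysql"
        "ServerName": server,            # Host name or IP that DMS can reach
        "Port": port,                    # DB port
        "Username": username,            # Credentials
        "Password": password,
        # Encryption: "require" turns on SSL without certificate checks.
        "SslMode": "require" if use_ssl else "none",
    }


# Hypothetical values, for illustration only.
source_endpoint = endpoint_params(
    "demo-source", "source", "oracle", "10.0.0.15", 1521,
    "dms_user", "change-me", use_ssl=True,
)
target_endpoint = endpoint_params(
    "demo-target", "target", "mysql", "demo-db.example.internal", 3306,
    "dms_user", "change-me",
)

# With credentials configured, the call would look like:
# import boto3
# boto3.client("dms").create_endpoint(**source_endpoint)
```

In practice the password would come from a secrets store rather than being written in code.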
3. Replication task
This is the actual task that transfers data from the source to the target. It needs the following details:
- Replication instance – the EC2 instance on which replication-related caching and processing happen
- Endpoints – both source and target
- Migration type – one of the following:
- Full load: generally used for the first-time load
- CDC only (changes only): used to catch up on data changes once the first full load is complete
- Full load + CDC
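To make the three migration types concrete, here is a minimal sketch that maps them to the `MigrationType` values accepted by boto3's `create_replication_task` and builds a simple table-mapping rule. The ARNs, task name, and schema name are placeholders invented for the example. The mapping includes every table in one schema, which is only one of many possible selections.

```python
import json

# Migration types from the list above, mapped to the DMS API values.
MIGRATION_TYPES = {
    "full load": "full-load",
    "cdc only": "cdc",
    "full load + cdc": "full-load-and-cdc",
}


def table_mappings(schema, table="%"):
    """Return a JSON selection rule including the matching tables."""
    rule = {
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "1",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }
    return json.dumps({"rules": [rule]})


def replication_task_params(identifier, instance_arn, source_arn,
                            target_arn, migration, schema):
    """Build keyword arguments for boto3's dms.create_replication_task."""
    return {
        "ReplicationTaskIdentifier": identifier,
        "ReplicationInstanceArn": instance_arn,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "MigrationType": MIGRATION_TYPES[migration],
        "TableMappings": table_mappings(schema),
    }


# Placeholder ARNs and names, for illustration only.
task_params = replication_task_params(
    "demo-task",
    "arn:aws:dms:us-east-1:123456789012:rep:EXAMPLE",
    "arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE",
    "arn:aws:dms:us-east-1:123456789012:endpoint:TARGET",
    migration="full load + cdc",
    schema="HR",
)

# With credentials configured, the call would look like:
# import boto3
# boto3.client("dms").create_replication_task(**task_params)
```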
Phases of a replication task:
a. Full load:
AWS DMS checks whether the tables, unique indexes, and primary keys exist on the target. If they don't, it creates them before starting the load. Data is moved in parallel.
NOTE: DMS does not create objects on the target that are not required for the migration. For example, non-primary-key constraints, secondary indexes, and data defaults are not created.
Homogeneous migration: This is when both the source and target are of the same engine type.
Here, the schema can also be exported and imported using your engine's native tools.
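How DMS treats tables that already exist on the target is controlled through the task settings. The sketch below assumes the `TargetTablePrepMode` option under `FullLoadSettings`, and picks `DO_NOTHING` when the tables were pre-created (for example with SCT or native tools) and `DROP_AND_CREATE` otherwise. The helper name and the two-way choice are simplifications made for this example.

```python
import json


def full_load_settings(tables_precreated):
    """Return task settings JSON for the full-load phase."""
    # DO_NOTHING leaves existing target tables untouched;
    # DROP_AND_CREATE lets DMS drop and recreate them.
    mode = "DO_NOTHING" if tables_precreated else "DROP_AND_CREATE"
    return json.dumps({"FullLoadSettings": {"TargetTablePrepMode": mode}})


# Schema was imported with native tools, so keep the tables as they are.
settings_json = full_load_settings(tables_precreated=True)

# The string would be passed as ReplicationTaskSettings when creating
# the replication task with boto3's dms.create_replication_task.
```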
b. Cached changes: While a full load is running, the source may still be active, so data can change at the source end. DMS does not apply these changes until the full load is complete. Once the full load is complete, the cached changes are applied.
c. Ongoing replication: After the cached changes are applied, DMS keeps applying changes from the source to the target as they occur.
Heterogeneous migration: This is when the source and target are of different engine types. Here, SCT needs to be used in conjunction with DMS.
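The ordering of the phases can be pictured with a toy model. The code below is a conceptual sketch, not how DMS is implemented: the "tables" are plain dictionaries, and the change format `(operation, key, value)` is invented for the example. It shows a snapshot being copied first and the changes that arrived during the load being applied afterwards, in order.

```python
def migrate(source_snapshot, changes_during_load):
    """Toy model: full load first, then cached changes in arrival order."""
    # Phase 1: full load copies the rows as they were when the load began.
    target = dict(source_snapshot)

    # Phase 2: changes cached during the load are applied once it finishes.
    for operation, key, value in changes_during_load:
        if operation == "delete":
            target.pop(key, None)
        else:  # "insert" or "update"
            target[key] = value
    return target


snapshot = {1: "alice", 2: "bob"}
cached = [
    ("update", 2, "bobby"),
    ("insert", 3, "carol"),
    ("delete", 1, None),
]
result = migrate(snapshot, cached)
print(result)  # {2: 'bobby', 3: 'carol'}
```

Ongoing replication would repeat the second step for every new change read from the source.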
In this blog, we discussed the process of migrating databases using DMS. In part two, we will discuss how to configure ongoing replication from RDS to Redshift. Read more here.