From 8a0992f397e3f041610e5a5e20dec07d55400f9b Mon Sep 17 00:00:00 2001 From: yangxin Date: Tue, 30 Sep 2025 10:40:48 +0800 Subject: [PATCH 1/2] Add essential migration --- TOC-tidb-cloud-essential.md | 2 + ...om-mysql-using-data-migration-essential.md | 366 ++++++++++++++++++ ...om-mysql-using-data-migration-essential.md | 242 ++++++++++++ 3 files changed, 610 insertions(+) create mode 100644 tidb-cloud/migrate-from-mysql-using-data-migration-essential.md create mode 100644 tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md diff --git a/TOC-tidb-cloud-essential.md b/TOC-tidb-cloud-essential.md index 2ffdf37befc73..cece87301c5db 100644 --- a/TOC-tidb-cloud-essential.md +++ b/TOC-tidb-cloud-essential.md @@ -214,6 +214,8 @@ - Migrate or Import Data - [Overview](/tidb-cloud/tidb-cloud-migration-overview.md) - Migrate Data into TiDB Cloud + - [Migrate Existing and Incremental Data Using Data Migration](/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md) + - [Migrate Incremental Data Using Data Migration](/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md) - [Migrate from TiDB Self-Managed to TiDB Cloud](/tidb-cloud/migrate-from-op-tidb.md) - [Migrate and Merge MySQL Shards of Large Datasets](/tidb-cloud/migrate-sql-shards.md) - [Migrate from Amazon RDS for Oracle Using AWS DMS](/tidb-cloud/migrate-from-oracle-using-aws-dms.md) diff --git a/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md b/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md new file mode 100644 index 0000000000000..24ccdf82a451d --- /dev/null +++ b/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md @@ -0,0 +1,366 @@ +--- +title: Migrate MySQL-Compatible Databases to TiDB Cloud Essential Using Data Migration +summary: Learn how to seamlessly migrate your MySQL databases from Amazon Aurora MySQL, Amazon RDS, Azure Database for MySQL - Flexible Server, Google Cloud SQL for MySQL, or 
self-managed MySQL instances to TiDB Cloud Essential with minimal downtime using the Data Migration feature. +--- + +# Migrate MySQL-Compatible Databases to TiDB Cloud Essential Using Data Migration + +This document guides you through migrating your MySQL databases from Amazon Aurora MySQL, Amazon RDS, Azure Database for MySQL - Flexible Server, Google Cloud SQL for MySQL, or self-managed MySQL instances to TiDB Cloud using the Data Migration feature in the [TiDB Cloud console](https://tidbcloud.com/). + +This feature enables you to migrate your existing MySQL data and continuously replicate ongoing changes (binlog) from your MySQL-compatible source databases directly to TiDB Cloud Essential, maintaining data consistency whether in the same region or across different regions. The streamlined process eliminates the need for separate dump and load operations, reducing downtime and simplifying your migration from MySQL to a more scalable platform. + +If you only want to replicate ongoing binlog changes from your MySQL-compatible database to TiDB Cloud Essential, see [Migrate Incremental Data from MySQL-Compatible Databases to TiDB Cloud Essential Using Data Migration](/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md). + +## Limitations + +### Availability + +- If you don't see the [Data Migration](/tidb-cloud/migrate-from-mysql-using-data-migration.md#step-1-go-to-the-data-migration-page) entry for your TiDB Cloud Essential cluster in the [TiDB Cloud console](https://tidbcloud.com/), the feature might not be available in your region. To request support for your region, contact [TiDB Cloud Support](/tidb-cloud/tidb-cloud-support.md). + +- Amazon Aurora MySQL writer instances support both existing data and incremental data migration. Amazon Aurora MySQL reader instances only support existing data migration and do not support incremental data migration. 
+ +### Maximum number of migration jobs + +You can create up to 100 migration jobs for each organization. To create more migration jobs, you need to [file a support ticket](/tidb-cloud/tidb-cloud-support.md). + +### Filtered out and deleted databases + +- The system databases will be filtered out and not migrated to TiDB Cloud Essential even if you select all of the databases to migrate. That is, `mysql`, `information_schema`, `performance_schema`, and `sys` will not be migrated using this feature. + +- When you delete a cluster in TiDB Cloud, all migration jobs in that cluster are automatically deleted and not recoverable. + +### Limitations of existing data migration + +- During existing data migration, if the target database already contains the table to be migrated and there are duplicate keys, the rows with duplicate keys will be replaced. + +- Currently, TiDB Cloud Essential supports only logical mode. + +### Limitations of incremental data migration + +- During incremental data migration, if the table to be migrated already exists in the target database with duplicate keys, an error is reported and the migration is interrupted. In this situation, you need to verify that the MySQL source data is accurate. If it is, click the **Restart** button of the migration job, and the migration job will replace the conflicting records in the target TiDB Cloud cluster with the MySQL source records. + +- During incremental replication (migrating ongoing changes to your cluster), if the migration job recovers from an abrupt error, it might enable safe mode for 60 seconds. In safe mode, `INSERT` statements are migrated as `REPLACE`, and `UPDATE` statements are migrated as `DELETE` followed by `REPLACE`, so that the transactions affected by the abrupt error can be replayed on the target TiDB Cloud cluster without conflicts. 
In this scenario, for MySQL source tables without primary keys or non-null unique indexes, some data might be duplicated in the target TiDB Cloud cluster because the data might be inserted repeatedly into the target TiDB Cloud cluster. + +- In the following scenarios, if the migration job takes longer than 24 hours, do not purge binary logs in the source database to ensure that Data Migration can get consecutive binary logs for incremental replication: + + - During the existing data migration. + - After the existing data migration is completed and when incremental data migration is started for the first time, the latency is not 0ms. + +## Prerequisites + +Before migrating, check whether your data source is supported, enable binary logging in your MySQL-compatible database, ensure network connectivity, and grant required privileges for both the source database and the target TiDB Cloud cluster database. + +### Make sure your data source and version are supported + +Data Migration supports the following data sources and versions: + +| Data source | Supported versions | +|:-------------------------------------------------|:-------------------| +| Self-managed MySQL (on-premises or public cloud) | 8.0, 5.7, 5.6 | +| Amazon Aurora MySQL | 8.0, 5.7, 5.6 | +| Amazon RDS MySQL | 8.0, 5.7 | +| Alibaba Cloud RDS MySQL | 8.0, 5.7 | + +### Enable binary logs in the source MySQL-compatible database for replication + +To continuously replicate incremental changes from the source MySQL-compatible database to the TiDB Cloud target cluster using DM, you need the following configurations to enable binary logs in the source database: + +| Configuration | Required value | Why | +|:--------------|:---------------|:----| +| `log_bin` | `ON` | Enables binary logging, which DM uses to replicate changes to TiDB | +| `binlog_format` | `ROW` | Captures all data changes accurately (other formats miss edge cases) | +| `binlog_row_image` | `FULL` | Includes all column values in events for safe 
conflict resolution | +| `binlog_expire_logs_seconds` | ≥ `86400` (1 day), `604800` (7 days, recommended) | Ensures DM can access consecutive logs during migration | + +#### Check current values and configure the source MySQL instance + +To check the current configurations, connect to the source MySQL instance and execute the following statement: + +```sql +SHOW VARIABLES WHERE Variable_name IN +('log_bin','server_id','binlog_format','binlog_row_image', +'binlog_expire_logs_seconds','expire_logs_days'); +``` + +If necessary, change the source MySQL instance configurations to match the required values. + +
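To compare the output of the `SHOW VARIABLES` statement with the required values in one pass, a small script can help. The following Python sketch is illustrative only: the names `retention_seconds` and `check_binlog_settings` are hypothetical and not part of TiDB Cloud or MySQL, and the sketch assumes that a retention value of `0` means MySQL does not purge binary logs automatically.

```python
# Illustrative helper, not part of TiDB Cloud or MySQL.
# The required values come from the table in this section.
REQUIRED = {"log_bin": "ON", "binlog_format": "ROW", "binlog_row_image": "FULL"}
MIN_RETENTION_SECONDS = 86400  # 1 day minimum; 604800 (7 days) is recommended


def retention_seconds(variables):
    """Return the binlog retention in seconds, or None if automatic purging is off."""
    seconds = int(variables.get("binlog_expire_logs_seconds") or 0)
    days = int(variables.get("expire_logs_days") or 0)
    if seconds > 0:
        return seconds
    if days > 0:
        return days * 86400
    # Assumption: 0 (or unset) for both variables disables automatic purging.
    return None


def check_binlog_settings(variables):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, required in REQUIRED.items():
        actual = str(variables.get(name, "")).upper()
        if actual != required:
            problems.append(f"{name} is {actual or 'unset'}, expected {required}")
    retention = retention_seconds(variables)
    if retention is not None and retention < MIN_RETENTION_SECONDS:
        problems.append(
            f"binlog retention is {retention} seconds, "
            f"expected at least {MIN_RETENTION_SECONDS}"
        )
    return problems


if __name__ == "__main__":
    # Sample values typed in by hand; replace them with your own query result.
    sample = {
        "log_bin": "ON",
        "binlog_format": "MIXED",
        "binlog_row_image": "FULL",
        "binlog_expire_logs_seconds": "3600",
    }
    for problem in check_binlog_settings(sample):
        print(problem)
```

The input is a plain dictionary of variable names and values. Because `SHOW VARIABLES` returns name and value pairs, most Python database drivers let you build it with `dict(cursor.fetchall())`, assuming the driver returns rows as two-element tuples.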
+ Configure a self‑managed MySQL instance + +1. Open `/etc/my.cnf` and add the following: + + ``` + [mysqld] + log_bin = mysql-bin + binlog_format = ROW + binlog_row_image = FULL + binlog_expire_logs_seconds = 604800 # 7 days retention + ``` + +2. Restart the MySQL service to apply the changes: + + ``` + sudo systemctl restart mysqld + ``` + +3. Run the `SHOW VARIABLES` statement again to verify that the settings take effect. + +For detailed instructions, see [MySQL Server System Variables](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html) and [The Binary Log](https://dev.mysql.com/doc/refman/8.0/en/binary-log.html) in MySQL documentation. + +
+ +
+ Configure AWS RDS or Aurora MySQL + +1. In the AWS Management Console, open the [Amazon RDS console](https://console.aws.amazon.com/rds/), click **Parameter groups** in the left navigation pane, and then create or edit a custom parameter group. +2. Set the four parameters above to the required values. +3. Attach the parameter group to your instance or cluster, and then reboot to apply the changes. +4. After the reboot, connect to the instance and run the `SHOW VARIABLES` statement to verify the configuration. + +For detailed instructions, see [Working with DB Parameter Groups](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html) and [Configuring MySQL Binary Logging](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.MySQL.BinaryFormat.html) in AWS documentation. + +
+ +### Ensure network connectivity + +Before creating a migration job, you need to plan and set up proper network connectivity between your source MySQL instance, the TiDB Cloud Data Migration (DM) service, and your target TiDB Cloud cluster. + +The available connection methods are as follows: + +| Connection method | Availability | Recommended for | +|:---------------------|:-------------|:----------------| +| Public endpoints or IP addresses | All cloud providers supported by TiDB Cloud | Quick proof-of-concept migrations, testing, or when private connectivity is unavailable | +| Private links or private endpoints | AWS and Azure only | Production workloads without exposing data to the public internet | + +Choose a connection method that best fits your cloud provider, network topology, and security requirements, and then follow the setup instructions for that method. + +#### End-to-end encryption over TLS/SSL + +Regardless of the connection method, it is strongly recommended to use TLS/SSL for end-to-end encryption. While private endpoints and VPC peering secure the network path, TLS/SSL secures the data itself and helps meet compliance requirements. + +
+ Download and store the cloud provider's certificates for TLS/SSL encrypted connections + +- Amazon Aurora MySQL or Amazon RDS MySQL: [Using SSL/TLS to encrypt a connection to a DB instance or cluster](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/UsingWithRDS.SSL.html) + +
+ +#### Public endpoints or IP addresses + +When using public endpoints, you can verify network connectivity and access both now and later during the DM job creation process. TiDB Cloud will provide specific egress IP addresses and prompt instructions at that time. + +1. Identify and record the source MySQL instance's endpoint hostname (FQDN) or public IP address. +2. Ensure you have the required permissions to modify the firewall or security group rules for your database. Refer to your cloud provider's documentation for guidance. +3. Optional: Verify connectivity to your source database from a machine with public internet access using the appropriate certificate for in-transit encryption: + + ```shell + mysql -h <host> -P <port> -u <user> -p --ssl-ca=<path-to-ca-certificate> -e "SELECT version();" + ``` + +4. Later, during the Data Migration job setup, TiDB Cloud will provide an egress IP range. At that time, you need to add this IP range to your database's firewall or security group rules following the same procedure above. + +#### Private link or private endpoint + +If you use a provider-native private link or private endpoint, create a private endpoint for your source MySQL instance (RDS, Aurora, or Azure Database for MySQL). + +
+ Set up AWS PrivateLink and Private Endpoint for the MySQL source database + +AWS does not support direct PrivateLink access to RDS or Aurora. Therefore, you need to create a Network Load Balancer (NLB) and publish it as an endpoint service associated with your source MySQL instance. + +1. In the [Amazon EC2 console](https://console.aws.amazon.com/ec2/), create an NLB in the same subnet(s) as your RDS or Aurora writer. Configure the NLB with a TCP listener on port `3306` that forwards traffic to the database endpoint. + + For detailed instructions, see [Create a Network Load Balancer](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/create-network-load-balancer.html) in AWS documentation. + +2. In the [Amazon VPC console](https://console.aws.amazon.com/vpc/), click **Endpoint Services** in the left navigation pane, and then create an endpoint service. During the setup, select the NLB created in the previous step as the backing load balancer, and enable the **Require acceptance for endpoint** option. After the endpoint service is created, copy the service name (in the `com.amazonaws.vpce-svc-xxxxxxxxxxxxxxxxx` format) for later use. + + For detailed instructions, see [Create an endpoint service](https://docs.aws.amazon.com/vpc/latest/privatelink/create-endpoint-service.html) in AWS documentation. + +3. Optional: Test connectivity from a bastion or client inside the same VPC or VNet before starting the migration: + + ```shell + mysql -h <host> -P 3306 -u <user> -p --ssl-ca=<path-to-ca-certificate> -e "SELECT version();" + ``` + +4. Later, when configuring TiDB Cloud DM to connect via PrivateLink, you will need to return to the AWS console and approve the pending connection request from TiDB Cloud to this private endpoint. + +
+ +### Grant required privileges for migration + +Before starting migration, you need to set up appropriate database users with the required privileges on both the source and target databases. These privileges enable TiDB Cloud DM to read data from MySQL, replicate changes, and write to your TiDB Cloud cluster securely. Because the migration involves both full data dumps for existing data and binlog replication for incremental changes, your migration user requires specific permissions beyond basic read access. + +#### Grant required privileges to the migration user in the source MySQL database + +For testing purposes, you can use an administrative user (such as `root`) in your source MySQL database. + +For production workloads, it is recommended to have a dedicated user for data dump and replication in the source MySQL database, and grant only the necessary privileges: + +| Privilege | Scope | Purpose | +|:----------|:------|:--------| +| `SELECT` | Tables | Allows reading data from all tables | +| `RELOAD` | Global | Ensures consistent snapshots during full dump | +| `REPLICATION SLAVE` | Global | Enables binlog streaming for incremental replication | +| `REPLICATION CLIENT` | Global | Provides access to binlog position and server status | + +For example, you can use the following `GRANT` statement in your source MySQL instance to grant corresponding privileges: + +```sql +GRANT SELECT, RELOAD, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'dm_source_user'@'%'; +``` + +#### Grant required privileges in the target TiDB Cloud cluster + +For testing purposes, you can use the `root` account of your TiDB Cloud cluster. 
+ +For production workloads, it is recommended to have a dedicated user for replication in the target TiDB Cloud cluster and grant only the necessary privileges: + +| Privilege | Scope | Purpose | +|:----------|:------|:--------| +| `CREATE` | Databases, Tables | Creates schema objects in the target | +| `SELECT` | Tables | Verifies data during migration | +| `INSERT` | Tables | Writes migrated data | +| `UPDATE` | Tables | Modifies existing rows during incremental replication | +| `DELETE` | Tables | Removes rows during replication or updates | +| `ALTER` | Tables | Modifies table definitions when schema changes | +| `DROP` | Databases, Tables | Removes objects during schema sync | +| `INDEX` | Tables | Creates and modifies indexes | +| `CREATE VIEW` | Views | Creates views used by migration | + +For example, you can execute the following `GRANT` statement in your target TiDB Cloud cluster to grant corresponding privileges: + +```sql +GRANT CREATE, SELECT, INSERT, UPDATE, DELETE, ALTER, DROP, INDEX, CREATE VIEW ON *.* TO 'dm_target_user'@'%'; +``` + +## Step 1: Go to the Data Migration page + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + +2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Migration** in the left navigation pane. + +3. On the **Data Migration** page, click **Create Migration Job** in the upper-right corner. The **Create Migration Job** page is displayed. + +## Step 2: Configure the source and target connections + +On the **Create Migration Job** page, configure the source and target connections. + +1. Enter a job name, which must start with a letter and must be fewer than 60 characters. Letters (A-Z, a-z), numbers (0-9), underscores (_), and hyphens (-) are acceptable. + +2. 
Fill in the source connection profile. + + - **Data source**: the data source type. + - **Connectivity method**: select a connection method for your data source based on your security requirements and cloud provider: + - **Public IP**: available for all cloud providers (recommended for testing and proof-of-concept migrations). + - **Private Link**: available for AWS and Azure only (recommended for production workloads requiring private connectivity). + - Based on the selected **Connectivity method**, do the following: + - If **Public IP** is selected, fill in the **Hostname or IP address** field with the hostname or IP address of the data source. + - If **Private Link** is selected, fill in the following information: + - **Endpoint Service Name** (available if **Data source** is from AWS): enter the VPC endpoint service name (format: `com.amazonaws.vpce-svc-xxxxxxxxxxxxxxxxx`) that you created for your RDS or Aurora instance. + - **Private Endpoint Resource ID** (available if **Data source** is from Azure): enter the resource ID of your MySQL Flexible Server instance (format: `/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DBforMySQL/flexibleServers/<server-name>`). + - **Port**: the port of the data source. + - **User Name**: the username of the data source. + - **Password**: the password of the username. + - **SSL/TLS**: enable SSL/TLS for end-to-end data encryption (highly recommended for all migration jobs). Upload the appropriate certificates based on your MySQL server's SSL configuration. + + SSL/TLS configuration options: + + - Option 1: Server authentication only + + - If your MySQL server is configured for server authentication only, upload only the **CA Certificate**. + - In this option, the MySQL server presents its certificate to prove its identity, and TiDB Cloud verifies the server certificate against the CA. 
+ - The CA certificate protects against man-in-the-middle attacks and is required if the MySQL server is started with `require_secure_transport = ON`. + + - Option 2: Client certificate authentication + + - If your MySQL server is configured for client certificate authentication, upload **Client Certificate** and **Client private key**. + - In this option, TiDB Cloud presents its certificate to the MySQL server for authentication, but TiDB Cloud does not verify the MySQL server's certificate. + - This option is typically used when the MySQL server is configured with options such as `REQUIRE SUBJECT '...'` or `REQUIRE ISSUER '...'` without `REQUIRE X509`, allowing it to check specific attributes of the client certificate without full CA validation of that client certificate. + - This option is often used when the MySQL server accepts client certificates in self-signed or custom PKI environments. Note that this configuration is vulnerable to man-in-the-middle attacks and is not recommended for production environments unless other network-level controls guarantee server authenticity. + + - Option 3: Mutual TLS (mTLS) - highest security + + - If your MySQL server is configured for mutual TLS (mTLS) authentication, upload **CA Certificate**, **Client Certificate**, and **Client private key**. + - In this option, the MySQL server verifies TiDB Cloud's identity using the client certificate, and TiDB Cloud verifies MySQL server's identity using the CA certificate. + - This option is required when the MySQL server has `REQUIRE X509` or `REQUIRE SSL` configured for the migration user. + - This option is used when the MySQL server requires client certificates for authentication. + - You can get the certificates from the following sources: + - Download from your cloud provider (see [TLS certificate links](#end-to-end-encryption-over-tlsssl)). + - Use your organization's internal CA certificates. + - Self-signed certificates (for development/testing only). + +3. 
Fill in the target connection profile. + + - **User Name**: enter the username of the target cluster in TiDB Cloud. + - **Password**: enter the password of the TiDB Cloud username. + +4. Click **Validate Connection and Next** to validate the information you have entered. + +5. Take action according to the message you see: + + - If you use **Public IP** as the connectivity method, you need to add the Data Migration service's IP addresses to the IP Access List of your source database and firewall (if any). + - If you use **Private Link** as the connectivity method, you are prompted to accept the endpoint request: + - For AWS: go to the [AWS VPC console](https://us-west-2.console.aws.amazon.com/vpc/home), click **Endpoint services**, and accept the endpoint request from TiDB Cloud. + - For Azure: go to the [Azure portal](https://portal.azure.com), search for your MySQL Flexible Server by name, click **Setting** > **Networking** in the left navigation pane, locate the **Private endpoint** section on the right side, and then approve the pending connection request from TiDB Cloud. + +## Step 3: Choose migration job type + +In the **Choose the objects to be migrated** step, you can choose existing data migration, incremental data migration, or both. + +### Migrate existing data and incremental data + +To migrate data to TiDB Cloud once and for all, choose both **Existing data migration** and **Incremental data migration**, which ensures data consistency between the source and target databases. + +You can only use **logical mode** to migrate **existing data** and **incremental data**. + +This mode exports data from MySQL source databases as SQL statements and then executes them on TiDB. In this mode, the target tables before migration can be either empty or non-empty. + +### Migrate only incremental data + +To migrate only the incremental data of the source database to TiDB Cloud, choose **Incremental data migration**. 
In this case, the migration job does not migrate the existing data of the source database to TiDB Cloud, but only migrates the ongoing changes of the source database that are explicitly specified by the migration job. + +For detailed instructions about incremental data migration, see [Migrate Only Incremental Data from MySQL-Compatible Databases to TiDB Cloud Using Data Migration](/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md). + +## Step 4: Choose the objects to be migrated + +1. On the **Choose Objects to Migrate** page, select the objects to be migrated. You can click **All** to select all objects, or click **Customize** and then click the checkbox next to the object name to select the object. + + - If you click **All**, the migration job will migrate the existing data from the whole source database instance to TiDB Cloud and migrate ongoing changes after the full migration. This applies only if you have selected the **Existing data migration** and **Incremental data migration** checkboxes in the previous step. + - If you click **Customize** and select some databases, the migration job will migrate the existing data and migrate ongoing changes of the selected databases to TiDB Cloud. This applies only if you have selected the **Existing data migration** and **Incremental data migration** checkboxes in the previous step. + - If you click **Customize** and select some tables under a database name, the migration job will only migrate the existing data and migrate ongoing changes of the selected tables. Tables created afterwards in the same database will not be migrated. + +2. Click **Next**. + +## Step 5: Precheck + +On the **Precheck** page, you can view the precheck results. If the precheck fails, you need to resolve the issues according to the **Failed** or **Warning** details, and then click **Check again** to recheck. 
+ +If there are only warnings on some check items, you can evaluate the risk and consider whether to ignore the warnings. If all warnings are ignored, the migration job will automatically go on to the next step. + +For more information about errors and solutions, see [Precheck errors and solutions](/tidb-cloud/tidb-cloud-dm-precheck-and-troubleshooting.md#precheck-errors-and-solutions). + +For more information about precheck items, see [Migration Task Precheck](https://docs.pingcap.com/tidb/stable/dm-precheck). + +If all check items show **Pass**, click **Next**. + +## Step 6: Choose a spec and start migration + +On the **Choose a Spec and Start Migration** page, select an appropriate migration specification according to your performance requirements. For more information about the specifications, see [Specifications for Data Migration](/tidb-cloud/tidb-cloud-billing-dm.md#specifications-for-data-migration). + +After selecting the spec, click **Create Job and Start** to start the migration. + +## Step 7: View the migration progress + +After the migration job is created, you can view the migration progress on the **Migration Job Details** page. The migration progress is displayed in the **Stage and Status** area. + +You can pause or delete a migration job when it is running. + +If a migration job has failed, you can resume it after solving the problem. + +You can delete a migration job in any status. + +If you encounter any problems during the migration, see [Migration errors and solutions](/tidb-cloud/tidb-cloud-dm-precheck-and-troubleshooting.md#migration-errors-and-solutions). 
diff --git a/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md b/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md new file mode 100644 index 0000000000000..9497244854c59 --- /dev/null +++ b/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md @@ -0,0 +1,242 @@ +--- +title: Migrate Only Incremental Data from MySQL-Compatible Databases to TiDB Cloud Using Data Migration +summary: Learn how to migrate incremental data from MySQL-compatible databases hosted in Amazon Aurora MySQL, Amazon Relational Database Service (RDS), Google Cloud SQL for MySQL, Azure Database for MySQL, or a local MySQL instance to TiDB Cloud using Data Migration. +--- + +# Migrate Only Incremental Data from MySQL-Compatible Databases to TiDB Cloud Using Data Migration + +This document describes how to migrate incremental data from a MySQL-compatible database on a cloud provider (Amazon Aurora MySQL, Amazon Relational Database Service (RDS), Google Cloud SQL for MySQL, or Azure Database for MySQL) or self-hosted source database to TiDB Cloud using the Data Migration feature of the TiDB Cloud console. + +For instructions about how to migrate existing data or both existing data and incremental data, see [Migrate MySQL-Compatible Databases to TiDB Cloud Using Data Migration](/tidb-cloud/migrate-from-mysql-using-data-migration.md). + +## Limitations + +> **Note**: +> +> This section only includes limitations about incremental data migration. It is recommended that you also read the general limitations. See [Limitations](/tidb-cloud/migrate-from-mysql-using-data-migration.md#limitations). + +- If the target table is not yet created in the target database, the migration job will report an error as follows and fail. In this case, you need to manually create the target table and then retry the migration job. 
+ + ```sql + startLocation: [position: (mysql_bin.000016, 5122), gtid-set: + 00000000-0000-0000-0000-00000000000000000], endLocation: + [position: (mysql_bin.000016, 5162), gtid-set: 0000000-0000-0000 + 0000-0000000000000:0]: cannot fetch downstream table schema of + zm`.'table1' to initialize upstream schema 'zm'.'table1' in sschema + tracker Raw Cause: Error 1146: Table 'zm.table1' doesn't exist + ``` + +- If some rows are deleted or updated in the upstream and there are no corresponding rows in the downstream, the migration job will detect that there are no rows available for deletion or update when replicating the `DELETE` and `UPDATE` DML operations from the upstream. + +If you specify GTID as the start position to migrate incremental data, note the following limitations: + +- Make sure that the GTID mode is enabled in the source database. +- If the source database is MySQL, the MySQL version must be 5.6 or later, and the storage engine must be InnoDB. +- If the migration job connects to a secondary database in the upstream, the `REPLICATE CREATE TABLE ... SELECT` events cannot be migrated. This is because the statement will be split into two transactions (`CREATE TABLE` and `INSERT`) that are assigned the same GTID. As a result, the `INSERT` statement will be ignored by the secondary database. + +## Prerequisites + +> **Note**: +> +> This section only includes prerequisites about incremental data migration. It is recommended that you also read the [general prerequisites](/tidb-cloud/migrate-from-mysql-using-data-migration.md#prerequisites). + +If you want to use GTID to specify the start position, make sure that the GTID is enabled in the source database. The operations vary depending on the database type. 
+ +### For Amazon RDS and Amazon Aurora MySQL + +For Amazon RDS and Amazon Aurora MySQL, you need to create a new modifiable parameter group (that is, not the default parameter group), modify the following parameters in the parameter group, and then restart the instance to apply the changes. + +- `gtid_mode` +- `enforce_gtid_consistency` + +You can check if the GTID mode has been successfully enabled by executing the following SQL statement: + +```sql +SHOW VARIABLES LIKE 'gtid_mode'; +``` + +If the result is `ON` or `ON_PERMISSIVE`, the GTID mode is successfully enabled. + +For more information, see [Parameters for GTID-based replication](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/mysql-replication-gtid.html#mysql-replication-gtid.parameters). + +### For Google Cloud SQL for MySQL + +The GTID mode is enabled for Google Cloud SQL for MySQL by default. You can check if the GTID mode has been successfully enabled by executing the following SQL statement: + +```sql +SHOW VARIABLES LIKE 'gtid_mode'; +``` + +If the result is `ON` or `ON_PERMISSIVE`, the GTID mode is successfully enabled. + +### For Azure Database for MySQL + +The GTID mode is enabled by default for Azure Database for MySQL (versions 5.7 and later). You can check if the GTID mode has been successfully enabled by executing the following SQL statement: + +```sql +SHOW VARIABLES LIKE 'gtid_mode'; +``` + +If the result is `ON` or `ON_PERMISSIVE`, the GTID mode is successfully enabled. + +In addition, ensure that the `binlog_row_image` server parameter is set to `FULL`. You can check this by executing the following SQL statement: + +```sql +SHOW VARIABLES LIKE 'binlog_row_image'; +``` + +If the result is not `FULL`, you need to configure this parameter for your Azure Database for MySQL instance using the [Azure portal](https://portal.azure.com/) or [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/). 
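The verification step is the same for the three managed services above, so it can be scripted once. The following Python sketch is a minimal illustration: the function name `gtid_start_position_issues` is hypothetical, and the accepted values are simply the ones this document lists (`ON` or `ON_PERMISSIVE` for `gtid_mode`, and `FULL` for `binlog_row_image`).

```python
# Illustrative helper, not part of TiDB Cloud or MySQL.
# Accepted values are the ones this document treats as "enabled".
ACCEPTED_GTID_MODES = {"ON", "ON_PERMISSIVE"}


def gtid_start_position_issues(variables):
    """Return reasons why a GTID start position might not be usable."""
    issues = []
    gtid_mode = str(variables.get("gtid_mode", "")).upper()
    if gtid_mode not in ACCEPTED_GTID_MODES:
        issues.append(
            f"gtid_mode is {gtid_mode or 'unset'}, expected ON or ON_PERMISSIVE"
        )
    row_image = str(variables.get("binlog_row_image", "")).upper()
    if row_image != "FULL":
        issues.append(f"binlog_row_image is {row_image or 'unset'}, expected FULL")
    return issues
```

The function takes a dictionary built from the two `SHOW VARIABLES` results and returns an empty list when both settings match.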
+ +### For a self-hosted MySQL instance + +> **Note**: +> +> The exact steps and commands might vary depending on the MySQL version and configuration. Make sure that you understand the impact of enabling GTID and that you have properly tested and verified it in a non-production environment before performing this action. + +To enable the GTID mode for a self-hosted MySQL instance, follow these steps: + +1. Connect to the MySQL server using a MySQL client with the appropriate privileges. + +2. Execute the following SQL statements to enable the GTID mode: + + ```sql + -- Enable `enforce_gtid_consistency` first, because `gtid_mode = ON` requires it + SET GLOBAL enforce_gtid_consistency = ON; + + -- Change the GTID mode one step at a time (supported online in MySQL 5.7.6 and later) + SET GLOBAL gtid_mode = OFF_PERMISSIVE; + SET GLOBAL gtid_mode = ON_PERMISSIVE; + -- Run the last step only after the `Ongoing_anonymous_transaction_count` status variable is 0 + SET GLOBAL gtid_mode = ON; + ``` + +3. To keep the settings across restarts, also set `gtid_mode = ON` and `enforce_gtid_consistency = ON` in the `[mysqld]` section of the MySQL configuration file. Values changed with `SET GLOBAL` take effect immediately but are lost when the server restarts. + +4. Check if the GTID mode has been successfully enabled by executing the following SQL statement: + + ```sql + SHOW VARIABLES LIKE 'gtid_mode'; + ``` + + If the result is `ON` or `ON_PERMISSIVE`, the GTID mode is successfully enabled. + +## Step 1: Go to the Data Migration page + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. + +2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Migration** in the left navigation pane. + +3. On the **Data Migration** page, click **Create Migration Job** in the upper-right corner. The **Create Migration Job** page is displayed. + +## Step 2: Configure the source and target connection + +On the **Create Migration Job** page, configure the source and target connection. + +1. Enter a job name, which must start with a letter and must be less than 60 characters.
Letters (A-Z, a-z), numbers (0-9), underscores (_), and hyphens (-) are acceptable. + +2. Fill in the source connection profile. + + - **Data source**: the data source type. + - **Region**: the region of the data source, which is required for cloud databases only. + - **Connectivity method**: the connection method for the data source. Currently, you can choose public IP, VPC Peering, or Private Link according to your connection method. + - **Hostname or IP address** (for public IP and VPC Peering): the hostname or IP address of the data source. + - **Service Name** (for Private Link): the endpoint service name. + - **Port**: the port of the data source. + - **Username**: the username of the data source. + - **Password**: the password of the username. + - **SSL/TLS**: if you enable SSL/TLS, you need to upload the certificates of the data source, including any of the following: + - only the CA certificate + - the client certificate and client key + - the CA certificate, client certificate and client key + +3. Fill in the target connection profile. + + - **Username**: enter the username of the target cluster in TiDB Cloud. + - **Password**: enter the password of the TiDB Cloud username. + +4. Click **Validate Connection and Next** to validate the information you have entered. + +5. Take action according to the message you see: + + - If you use Public IP, you need to add the Data Migration service's IP addresses to the IP Access List of your source database and firewall (if any). + - If you use AWS Private Link, you are prompted to accept the endpoint request. Go to the [AWS VPC console](https://us-west-2.console.aws.amazon.com/vpc/home), and click **Endpoint services** to accept the endpoint request. + +## Step 3: Choose migration job type + +To migrate only the incremental data of the source database to TiDB Cloud, select **Incremental data migration** and do not select **Existing data migration**. 
In this way, the migration job only migrates ongoing changes of the source database to TiDB Cloud. + +In the **Start Position** area, you can specify one of the following types of start positions for incremental data migration: + +- The time when the incremental migration job starts +- GTID +- Binlog file name and position + +Once a migration job starts, you cannot change the start position. + +### The time when the incremental migration job starts + +If you select this option, the migration job will only migrate the incremental data that is generated in the source database after the migration job starts. + +### Specify GTID + +Select this option to specify a GTID set of the source database, for example, `3E11FA47-71CA-11E1-9E33-C80AA9429562:1-23`. The migration job replicates only the transactions that are not in the specified GTID set, so that ongoing changes of the source database are migrated to TiDB Cloud. + +You can run the following command to check the GTID of the source database: + +```sql +SHOW MASTER STATUS; +``` + +For information about how to enable GTID, see [Prerequisites](#prerequisites). + +### Specify binlog file name and position + +Select this option to specify the binlog file name (for example, `binlog.000001`) and binlog position (for example, `1307`) of the source database. The migration job will start from the specified binlog file name and position to migrate ongoing changes of the source database to TiDB Cloud. + +You can run the following command to check the binlog file name and position of the source database: + +```sql +SHOW MASTER STATUS; +``` + +If there is data in the target database, make sure the binlog position is correct. Otherwise, there might be conflicts between the existing data and the incremental data. If conflicts occur, the migration job will fail. If you want to replace the conflicting records with data from the source database, you can resume the migration job. + +## Step 4: Choose the objects to be migrated + +1.
On the **Choose Objects to Migrate** page, select the objects to be migrated. You can click **All** to select all objects, or click **Customize** and then click the checkbox next to the object name to select the object. + +2. Click **Next**. + +## Step 5: Precheck + +On the **Precheck** page, you can view the precheck results. If the precheck fails, resolve the issues described in the **Failed** or **Warning** details, and then click **Check again** to recheck. + +If there are only warnings on some check items, you can evaluate the risk and consider whether to ignore the warnings. If all warnings are ignored, the migration job will automatically proceed to the next step. + +For more information about errors and solutions, see [Precheck errors and solutions](/tidb-cloud/tidb-cloud-dm-precheck-and-troubleshooting.md#precheck-errors-and-solutions). + +For more information about precheck items, see [Migration Task Precheck](https://docs.pingcap.com/tidb/stable/dm-precheck). + +If all check items show **Pass**, click **Next**. + +## Step 6: Choose a spec and start migration + +On the **Choose a Spec and Start Migration** page, select an appropriate migration specification according to your performance requirements. For more information about the specifications, see [Specifications for Data Migration](/tidb-cloud/tidb-cloud-billing-dm.md#specifications-for-data-migration). + +After selecting the spec, click **Create Job and Start** to start the migration. + +## Step 7: View the migration progress + +After the migration job is created, you can view the migration progress on the **Migration Job Details** page. The migration progress is displayed in the **Stage and Status** area. + +You can pause or delete a migration job when it is running. + +If a migration job has failed, you can resume it after solving the problem. + +You can delete a migration job in any status.
+ +If you encounter any problems during the migration, see [Migration errors and solutions](/tidb-cloud/tidb-cloud-dm-precheck-and-troubleshooting.md#migration-errors-and-solutions). From 06a3a3368fe7c4bab5209735570d7f7212861746 Mon Sep 17 00:00:00 2001 From: Leon Yang Date: Tue, 30 Sep 2025 10:47:33 +0800 Subject: [PATCH 2/2] Apply suggestions from code review Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- ...om-mysql-using-data-migration-essential.md | 22 +++++++++---------- ...om-mysql-using-data-migration-essential.md | 14 ++++++------ 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md b/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md index 24ccdf82a451d..cf1eae4f755c5 100644 --- a/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md +++ b/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md @@ -37,9 +37,9 @@ You can create up to 100 migration jobs for each organization. To create more mi ### Limitations of incremental data migration -- During incremental data migration, if the table to be migrated already exists in the target database with duplicate keys, an error is reported and the migration is interrupted. In this situation, you need to make sure whether the MySQL source data is accurate. If yes, click the "Restart" button of the migration job, and the migration job will replace the target TiDB Cloud cluster's conflicting records with the MySQL source records. +- During incremental data migration, if the table to be migrated already exists in the target database with duplicate keys, an error is reported and the migration is interrupted. In this situation, you need to verify that the MySQL source data is accurate. If it is, click the "Restart" button of the migration job, and the migration job will replace the target TiDB Cloud cluster's conflicting records with the MySQL source records. 
-- During incremental replication (migrating ongoing changes to your cluster), if the migration job recovers from an abrupt error, it might open the safe mode for 60 seconds. During the safe mode, `INSERT` statements are migrated as `REPLACE`, `UPDATE` statements as `DELETE` and `REPLACE`, and then these transactions are migrated to the target TiDB Cloud cluster to make sure that all the data during the abrupt error has been migrated smoothly to the target TiDB Cloud cluster. In this scenario, for MySQL source tables without primary keys or non-null unique indexes, some data might be duplicated in the target TiDB Cloud cluster because the data might be inserted repeatedly into the target TiDB Cloud cluster. +- During incremental replication (migrating ongoing changes to your cluster), if the migration job recovers from an abrupt error, it might open the safe mode for 60 seconds. During the safe mode, `INSERT` statements are migrated as `REPLACE`, `UPDATE` statements as `DELETE` and `REPLACE`, and then these transactions are migrated to the target TiDB Cloud cluster to ensure that all the data during the abrupt error has been migrated smoothly to the target TiDB Cloud cluster. In this scenario, for MySQL source tables without primary keys or non-null unique indexes, some data might be duplicated in the target TiDB Cloud cluster because the data might be inserted repeatedly into the target TiDB Cloud cluster. - In the following scenarios, if the migration job takes longer than 24 hours, do not purge binary logs in the source database to ensure that Data Migration can get consecutive binary logs for incremental replication: @@ -85,7 +85,7 @@ SHOW VARIABLES WHERE Variable_name IN If necessary, change the source MySQL instance configurations to match the required values.
- Configure a self‑managed MySQL instance + Configure a self-managed MySQL instance 1. Open `/etc/my.cnf` and add the following: @@ -150,16 +150,16 @@ Regardless of the connection method, it is strongly recommended to use TLS/SSL f When using public endpoints, you can verify network connectivity and access both now and later during the DM job creation process. TiDB Cloud will provide specific egress IP addresses and prompt instructions at that time. 1. Identify and record the source MySQL instance's endpoint hostname (FQDN) or public IP address. -2. Ensure you have the required permissions to modify the firewall or security group rules for your database. Refer to your cloud provider's documentation for guidance as follows: +2. Ensure you have the required permissions to modify the firewall or security group rules for your database. Refer to your cloud provider's documentation for guidance. 3. Optional: Verify connectivity to your source database from a machine with public internet access using the appropriate certificate for in-transit encryption: ```shell mysql -h -P -u -p --ssl-ca= -e "SELECT version();" ``` -4. Later, during the Data Migration job setup, TiDB Cloud will provide an egress IP range. At that time, you need to add this IP range to your database's firewall or security‑group rules following the same procedure above. +4. Later, during the Data Migration job setup, TiDB Cloud will provide an egress IP range. At that time, you need to add this IP range to your database's firewall or security-group rules following the same procedure above. -#### Private link or private endpoint +#### Private link or private endpoint If you use a provider-native private link or private endpoint, create a private endpoint for your source MySQL instance (RDS, Aurora, or Azure Database for MySQL). 
@@ -277,7 +277,7 @@ On the **Create Migration Job** page, configure the source and target connection - Option 2: Client certificate authentication - - If your MySQL server is configured for client certificate authentication, upload **Client Certificate** and **Client private key**. + - If your MySQL server is configured for client certificate authentication, upload **Client Certificate** and **Client private key**. - In this option, TiDB Cloud presents its certificate to the MySQL server for authentication, but TiDB Cloud does not verify the MySQL server's certificate. - This option is typically used when the MySQL server is configured with options such as `REQUIRE SUBJECT '...'` or `REQUIRE ISSUER '...'` without `REQUIRE X509`, allowing it to check specific attributes of the client certificate without full CA validation of that client certificate. - This option is often used when the MySQL server accepts client certificates in self-signed or custom PKI environments. Note that this configuration is vulnerable to man-in-the-middle attacks and is not recommended for production environments unless other network-level controls guarantee server authenticity. @@ -305,11 +305,11 @@ On the **Create Migration Job** page, configure the source and target connection - If you use **Public IP** as the connectivity method, you need to add the Data Migration service's IP addresses to the IP Access List of your source database and firewall (if any). - If you use **Private Link** as the connectivity method, you are prompted to accept the endpoint request: - For AWS: go to the [AWS VPC console](https://us-west-2.console.aws.amazon.com/vpc/home), click **Endpoint services**, and accept the endpoint request from TiDB Cloud. 
- - For Azure: go to the [Azure portal](https://portal.azure.com), search for your MySQL Flexible Server by name, click **Setting** > **Networking** in the left navigation pane, locate the **Private endpoint** section on the right side, and then approve the pending connection request from TiDB Cloud. + - For Azure: go to the [Azure portal](https://portal.azure.com), search for your MySQL Flexible Server by name, click **Settings** > **Networking** in the left navigation pane, locate the **Private endpoint** section on the right side, and then approve the pending connection request from TiDB Cloud. ## Step 3: Choose migration job type -In the **Choose the objects to be migrated** step, you can choose existing data migration, incremental data migration, or both. +In the **Choose migration job type** step, you can choose existing data migration, incremental data migration, or both. ### Migrate existing data and incremental data @@ -323,7 +323,7 @@ This mode exports data from MySQL source databases as SQL statements and then ex To migrate only the incremental data of the source database to TiDB Cloud, choose **Incremental data migration**. In this case, the migration job does not migrate the existing data of the source database to TiDB Cloud, but only migrates the ongoing changes of the source database that are explicitly specified by the migration job. -For detailed instructions about incremental data migration, see [Migrate Only Incremental Data from MySQL-Compatible Databases to TiDB Cloud Using Data Migration](/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration.md). +For detailed instructions about incremental data migration, see [Migrate Only Incremental Data from MySQL-Compatible Databases to TiDB Cloud Essential Using Data Migration](/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md). 
## Step 4: Choose the objects to be migrated @@ -339,7 +339,7 @@ For detailed instructions about incremental data migration, see [Migrate Only In On the **Precheck** page, you can view the precheck results. If the precheck fails, you need to operate according to **Failed** or **Warning** details, and then click **Check again** to recheck. -If there are only warnings on some check items, you can evaluate the risk and consider whether to ignore the warnings. If all warnings are ignored, the migration job will automatically go on to the next step. +If there are only warnings on some check items, you can evaluate the risk and consider whether to ignore the warnings. If all warnings are ignored, the migration job will automatically proceed to the next step. For more information about errors and solutions, see [Precheck errors and solutions](/tidb-cloud/tidb-cloud-dm-precheck-and-troubleshooting.md#precheck-errors-and-solutions). diff --git a/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md b/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md index 9497244854c59..ba44b00bfd848 100644 --- a/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md +++ b/tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration-essential.md @@ -7,13 +7,13 @@ summary: Learn how to migrate incremental data from MySQL-compatible databases h This document describes how to migrate incremental data from a MySQL-compatible database on a cloud provider (Amazon Aurora MySQL, Amazon Relational Database Service (RDS), Google Cloud SQL for MySQL, or Azure Database for MySQL) or self-hosted source database to TiDB Cloud using the Data Migration feature of the TiDB Cloud console. -For instructions about how to migrate existing data or both existing data and incremental data, see [Migrate MySQL-Compatible Databases to TiDB Cloud Using Data Migration](/tidb-cloud/migrate-from-mysql-using-data-migration.md). 
+For instructions about how to migrate existing data or both existing data and incremental data, see [Migrate MySQL-Compatible Databases to TiDB Cloud Essential Using Data Migration](/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md). ## Limitations > **Note**: > -> This section only includes limitations about incremental data migration. It is recommended that you also read the general limitations. See [Limitations](/tidb-cloud/migrate-from-mysql-using-data-migration.md#limitations). +> This section only includes limitations about incremental data migration. It is recommended that you also read the general limitations. See [Limitations](/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md#limitations). - If the target table is not yet created in the target database, the migration job will report an error as follows and fail. In this case, you need to manually create the target table and then retry the migration job. @@ -38,13 +38,13 @@ If you specify GTID as the start position to migrate incremental data, note the > **Note**: > -> This section only includes prerequisites about incremental data migration. It is recommended that you also read the [general prerequisites](/tidb-cloud/migrate-from-mysql-using-data-migration.md#prerequisites). +> This section only includes prerequisites about incremental data migration. It is recommended that you also read the [general prerequisites](/tidb-cloud/migrate-from-mysql-using-data-migration-essential.md#prerequisites). If you want to use GTID to specify the start position, make sure that the GTID is enabled in the source database. The operations vary depending on the database type. ### For Amazon RDS and Amazon Aurora MySQL -For Amazon RDS and Amazon Aurora MySQL, you need to create a new modifiable parameter group (that is, not the default parameter group) and then modify the following parameters in the parameter group and restart the instance application. 
+For Amazon RDS and Amazon Aurora MySQL, you need to create a new modifiable parameter group (that is, not the default parameter group), modify the following parameters in the parameter group, and then restart the instance to apply the changes. - `gtid_mode` - `enforce_gtid_consistency` @@ -142,7 +142,7 @@ On the **Create Migration Job** page, configure the source and target connection - **Data source**: the data source type. - **Region**: the region of the data source, which is required for cloud databases only. - - **Connectivity method**: the connection method for the data source. Currently, you can choose public IP, VPC Peering, or Private Link according to your connection method. + - **Connectivity method**: the connection method for the data source. You can choose public IP or Private Link according to your connection method. - **Hostname or IP address** (for public IP and VPC Peering): the hostname or IP address of the data source. - **Service Name** (for Private Link): the endpoint service name. - **Port**: the port of the data source. @@ -151,7 +151,7 @@ On the **Create Migration Job** page, configure the source and target connection - **SSL/TLS**: if you enable SSL/TLS, you need to upload the certificates of the data source, including any of the following: - only the CA certificate - the client certificate and client key - - the CA certificate, client certificate and client key + - the CA certificate, client certificate, and client key 3. Fill in the target connection profile. @@ -163,7 +163,7 @@ On the **Create Migration Job** page, configure the source and target connection 5. Take action according to the message you see: - If you use Public IP, you need to add the Data Migration service's IP addresses to the IP Access List of your source database and firewall (if any). - - If you use AWS Private Link, you are prompted to accept the endpoint request. 
Go to the [AWS VPC console](https://us-west-2.console.aws.amazon.com/vpc/home), and click **Endpoint services** to accept the endpoint request. + - If you use AWS Private Link, you are prompted to accept the endpoint request. Go to the [AWS VPC console](https://console.aws.amazon.com/vpc/home), and click **Endpoint services** to accept the endpoint request. ## Step 3: Choose migration job type