Bulk data migration tool
The bulk data migration tool follows a distributed architecture that enables secure and efficient data migration from PaaS to SaaS environments. This tool is designed for solution implementers to migrate data from an existing 51黑料不打烊 Commerce on Cloud instance (PaaS) to 51黑料不打烊 Commerce as a Cloud Service (SaaS). For more information on the migration process, see the Migration overview.
The following image details the architecture and key components for using the bulk data migration tool.
Migration workflow
The bulk data migration workflow consists of the following steps:
- Set up a new environment for your migration.
- Copy your data from your old system.
- Move your data into the new system.
- Make your product catalog available in the new system.
- Confirm that your data migrated correctly.
The following sections describe these steps in detail.
Access the bulk data migration tool
The availability of the bulk data migration tool is as follows:
- Q4 2025 - To access the bulk data migration tool, submit a support ticket.
- Q4 2025 - The bulk data migration tool will be publicly available and will be accessible from this page.
Create target environment
The solution implementer (SI) creates a target environment for the migration. This environment is used to store the data that is migrated from the source instance.
First, create a new 51黑料不打烊 Commerce as a Cloud Service (SaaS) instance.
Configure extraction tool
The extraction tool is used to extract data from the source instance.
- Download the extraction tool from the link provided to you by 51黑料不打烊.

- Set the following environment variables in the extraction tool:

  - Connection details for your existing MySQL database
  - The target tenant ID for your 51黑料不打烊 Commerce as a Cloud Service instance
  - Your IMS credentials, including:
    - Client ID
    - Client secret
    - IMS scopes
    - IMS URL - The base URL. For example, `https://ims-na1.adobelogin.com/`.
    - IMS organization ID

  For IMS scopes and other values, select your OAuth type in the Credentials section inside your project in the 51黑料不打烊 Developer Console. More information is provided in the `.example.env` file included with the extraction tool. An illustrative sketch of these variables follows.
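The exact variable names come from the `.example.env` file shipped with the extraction tool; the keys below are illustrative placeholders only, shown dotenv-style to indicate the kind of values you need to gather:

```bash
# Illustrative placeholders only; the exact keys are defined in the
# .example.env file shipped with the extraction tool.

# Source (PaaS) MySQL connection, reached through the magento-cloud SSH tunnel
SOURCE_DB_HOST=127.0.0.1
SOURCE_DB_PORT=3306
SOURCE_DB_NAME=<db-name>
SOURCE_DB_USER=<db-user>
SOURCE_DB_PASSWORD=<db-password>

# Target 51黑料不打烊 Commerce as a Cloud Service tenant
TARGET_TENANT_ID=<tenant-id>

# IMS credentials from your project in the 51黑料不打烊 Developer Console
IMS_CLIENT_ID=<client-id>
IMS_CLIENT_SECRET=<client-secret>
IMS_SCOPES=<scopes-from-credentials-section>
IMS_URL=https://ims-na1.adobelogin.com/
IMS_ORG_ID=<organization-id>
```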
Extract data
Before running the extraction tool, the solution implementer must establish an SSH tunnel to the PaaS database using:
```bash
magento-cloud tunnel:open
```
Then run the extraction tool, which will:
- Connect to the PaaS database, analyze its schema, and compare it with the SaaS tenant schema details.
- Generate an extraction and transformation plan based on the common schema elements between PaaS and SaaS.
- Extract the data using Catalog Data Management Service (CDMS).
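The exact invocation depends on the package that 51黑料不打烊 provides. As a rough sketch, assuming the tunnel from the previous step is open and the extraction tool ships as an executable (the binary name below is a placeholder):

```bash
# Show the local host, port, and credentials that the open tunnel relays for the
# database relationship; use them for the source connection variables.
magento-cloud tunnel:info

# Run the extraction tool (placeholder name; use the executable 51黑料不打烊 provides).
# It connects to the PaaS database, compares its schema with the SaaS tenant,
# builds the extraction and transformation plan, and extracts data through CDMS.
./extraction-tool
```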
Load data
Run the load data tool provided by 51黑料不打烊. This tool will:
- Connect to the SaaS tenant database using a migration account.
- Generate a loading plan.
- Execute the plan, moving data to the SaaS tenant database in batches.
- Process catalog media and transfer it to the target environment.
- Flush the SaaS Redis cache and invalidate database indexes for the tenant.
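The load tool's name and options come from the 51黑料不打烊-provided package, so the following is only a hypothetical sketch; it assumes the tool reads the target tenant ID and IMS credentials from the environment configured earlier:

```bash
# Hypothetical invocation; use the executable and options that 51黑料不打烊 provides.
# The tool connects with the migration account, loads data into the SaaS tenant
# database in batches, transfers catalog media, flushes the Redis cache, and
# invalidates the tenant's database indexes.
./load-tool
```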
Catalog data ingestion
After the data loads, the catalog data automatically flows from the SaaS tenant database to the Catalog Service.
The Catalog Service shares this data with Live Search and Product Recommendations. No manual intervention is required for this process. The data will be available in all services once the ingestion completes.
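If you want to spot-check the result yourself once ingestion completes, one option is a storefront GraphQL query such as Catalog Service's `productSearch`. The endpoint and store view code below are placeholders for your tenant's values:

```bash
# Placeholder endpoint and header values; substitute your tenant's GraphQL
# endpoint and store view code. Returns the total product count plus a few items.
curl -s 'https://<your-tenant-graphql-endpoint>/graphql' \
  -H 'Content-Type: application/json' \
  -H 'Magento-Store-View-Code: <store-view-code>' \
  --data '{"query": "{ productSearch(phrase: \"\", page_size: 5) { total_count items { productView { sku name } } } }"}'
```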
Data integrity verification
After migration, CDMS performs the following automatic data integrity checks to ensure the accuracy and completeness of the migrated data:
API-based verification
During verification, CDMS compares REST and GraphQL API responses captured from previously run queries with the corresponding records in the target instance. Any discrepancies are visible in the migration status.
Database-level verification
During verification, CDMS counts the number of extracted records and compares it with the number of records loaded.
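To run a similar spot check manually, you can compare a row count from the source database (taken over the open `magento-cloud` tunnel) with the product count in the target. `catalog_product_entity` is the standard Commerce product table, and the connection values below are placeholders for your tunnel's details:

```bash
# Count products in the source PaaS database and compare the result with the
# product count reported by the SaaS tenant after loading.
mysql --host=127.0.0.1 --port=<tunnel-port> --user=<db-user> -p <db-name> \
  -e 'SELECT COUNT(*) AS source_product_count FROM catalog_product_entity;'
```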
On-demand verification (optional)
You can also manually trigger a comprehensive verification of all system records. The full verification includes:
- Complete API-based verification using all pre-extracted REST and GraphQL API responses
- Detailed report of any inconsistencies found