Create a dataflow to ingest data from a CRM into Experience Platform
Read this guide to learn how to create a dataflow and ingest data into 51黑料不打烊 Experience Platform using the Flow Service API.
Get started
This guide requires a working understanding of the following components of Experience Platform:
- Batch ingestion: Discover how you can quickly and efficiently upload large volumes of data in batches.
- Catalog Service: Organize and keep track of your datasets in Experience Platform.
- Data Prep: Transform and map your incoming data to match your schema requirements.
- Dataflows: Set up and manage the pipelines that move your data from sources to destinations.
- Experience Data Model (XDM) Schemas: Structure your data using XDM schemas so it鈥檚 ready for use in Experience Platform.
- Sandboxes: Safely test and develop in isolated environments without affecting production data.
- Sources: Learn how to connect your external data sources to Experience Platform.
Use Experience Platform APIs
For information on how to successfully make calls to Experience Platform APIs, read the guide on getting started with Experience Platform APIs.
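All of the calls in this guide share the same set of authentication and routing headers. The skeleton below summarizes them; the endpoint placeholder is illustrative, and the values in curly braces come from your 51黑料不打烊 developer project and sandbox configuration:

```shell
# Headers required by Experience Platform APIs:
#   Authorization   - a bearer token generated for your API client
#   x-api-key       - your client ID (API key)
#   x-gw-ims-org-id - your organization ID
#   x-sandbox-name  - the sandbox that the request runs against
curl -X GET \
  'https://platform.adobe.io/data/foundation/flowservice/{ENDPOINT}' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}'
```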
Create a base connection base
To create a dataflow for your source, you will need a fully authenticated source account and its corresponding base connection ID. If you do not have this ID, visit the sources catalog to find a list of sources for which you can create a base connection.
Create a target XDM schema target-schema
An Experience Data Model (XDM) schema provides a standardized way to organize and describe customer experience data within Experience Platform. To ingest your source data into Experience Platform, you must first create a target XDM schema that defines the structure and types of data you want to ingest. This schema serves as the blueprint for the Experience Platform dataset where your ingested data will reside.
A target XDM schema can be created by performing a POST request to the Schema Registry API. For detailed steps on how to create a target XDM schema, read the guide on creating a schema using the API.
Once created, the $id of the target XDM schema is required later for your target dataset and mapping.
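For example, a minimal schema-creation call might look like the following sketch. It assumes your contact data is profile data, so the schema extends the XDM Individual Profile class; the title and description are illustrative:

```shell
curl -X POST \
  'https://platform.adobe.io/data/foundation/schemaregistry/tenant/schemas' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}' \
  -H 'Content-Type: application/json' \
  -d '{
      "title": "ACME Contact Schema",
      "description": "A target schema for ACME contact data.",
      "allOf": [
          {
              "$ref": "https://ns.adobe.com/xdm/context/profile"
          }
      ]
  }'
```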
Create a target dataset target-dataset
A dataset is a storage and management construct for a collection of data, typically structured like a table with columns (schema) and rows (fields). Data that is successfully ingested into Experience Platform is stored within the data lake as datasets. During this step, you can either create a new dataset or use an existing one.
You can create a target dataset by making a POST request to the /dataSets endpoint of the Catalog Service API, while providing the ID of the target schema within the payload. For detailed steps on how to create a target dataset, read the guide on creating a dataset using the API.
API format
```http
POST /dataSets
```
Request
The following example shows how to create a target dataset that is enabled for Real-Time Customer Profile ingestion. In this request, the `unifiedProfile` tag (under the `tags` object) is set to `enabled:true`, which tells Experience Platform to include the dataset in Real-Time Customer Profile. The dataset name below is illustrative, and the schema ID is the `$id` of the target XDM schema created earlier.

```shell
curl -X POST \
  'https://platform.adobe.io/data/foundation/catalog/dataSets' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}' \
  -H 'Content-Type: application/json' \
  -d '{
      "name": "ACME Contact Dataset",
      "schemaRef": {
          "id": "https://ns.adobe.com/{TENANT_ID}/schemas/52b59140414aa6a370ef5e21155fd7a686744b8739ecc168",
          "contentType": "application/vnd.adobe.xed-full-notext+json; version=1"
      },
      "tags": {
          "unifiedProfile": ["enabled:true"]
      }
  }'
```
| Property | Description |
| --- | --- |
| `name` | A descriptive name for your target dataset. Use a clear and unique name to make it easier to identify and manage your dataset in future operations. |
| `schemaRef.id` | The ID of your target XDM schema. |
| `tags.unifiedProfile` | A tag that informs Experience Platform if the data should be ingested into Real-Time Customer Profile. Set this tag to `enabled:true` to enable Profile ingestion. |
Response
A successful response returns the ID of your target dataset in the form `@/dataSets/{DATASET_ID}`. This ID is required later to create a target connection.

```json
["@/dataSets/6889f4f89b982b2b90bc1207"]
```
Create a source connection source
A source connection defines how data is brought into Experience Platform from an external source. It specifies both the source system and the format of the incoming data, and it references a base connection that contains authentication details. Each source connection is unique to your organization.
- For file-based sources (such as cloud storage), a source connection can include settings like column delimiter, encoding type, compression type, regular expressions for file selection, and whether to ingest files recursively (see the sketch after this list).
- For table-based sources (such as databases, CRMs, and marketing automation providers), a source connection can specify details like the table name and column mappings.
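For comparison, the `data` and `params` sections of a file-based source connection might look like the following sketch. The path and settings shown are hypothetical, and the exact parameters vary by connector:

```json
{
    "data": {
        "format": "delimited"
    },
    "params": {
        "path": "/acme/contacts",
        "recursive": true
    }
}
```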
To create a source connection, make a POST request to the `/sourceConnections` endpoint of the Flow Service API and provide your base connection ID, connection specification ID, and the details of the table you want to ingest.
API format
```http
POST /sourceConnections
```
Request
```shell
curl -X POST \
'https://platform.adobe.io/data/foundation/flowservice/sourceConnections' \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}' \
-H 'Content-Type: application/json' \
-d '{
"name": "ACME source connection",
"description": "A source connection for ACME contact data",
"baseConnectionId": "6990abad-977d-41b9-a85d-17ea8cf1c0e4",
"data": {
"format": "tabular"
},
"params": {
"tableName": "Contact",
"columns": [
{
"name": "TestID",
"type": "string",
"xdm": {
"type": "string"
}
},
{
"name": "Name",
"type": "string",
"xdm": {
"type": "string"
}
},
{
"name": "Datefield",
"type": "string",
"meta:xdmType": "date-time",
"xdm": {
"type": "string",
"format": "date-time"
}
}
]
},
"connectionSpec": {
"id": "cfc0fee1-7dc0-40ef-b73e-d8b134c436f5",
"version": "1.0"
}
}'
```
| Property | Description |
| --- | --- |
| `name` | The name of your source connection. Use a descriptive name to make it easier to identify your connection. |
| `description` | An optional description for your source connection. |
| `baseConnectionId` | The `id` of your base connection. You can retrieve this ID by authenticating your source to Experience Platform using the Flow Service API. |
| `data.format` | The format of your data. Set this value to `tabular` for table-based sources (such as databases, CRMs, and marketing automation providers). |
| `params.tableName` | The name of the table in your source that you want to ingest. |
| `params.columns` | The columns of the table that you want to ingest, along with their corresponding XDM data types. |
| `connectionSpec.id` | The connection specification ID that corresponds to your source. |
Response
A successful response returns the ID of your source connection. This ID is required in order to create a dataflow and ingest your data.
```json
{
"id": "b7581b59-c603-4df1-a689-d23d7ac440f3",
"etag": "\"ef05d265-0000-0200-0000-6019e0080000\""
}
```
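If you want to confirm the configuration before proceeding, you can look up the source connection by its ID:

```shell
curl -X GET \
  'https://platform.adobe.io/data/foundation/flowservice/sourceConnections/b7581b59-c603-4df1-a689-d23d7ac440f3' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}'
```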
Create a target connection target
A target connection represents the connection to the destination where ingested data lands. To create a target connection, you must provide the fixed connection specification ID that corresponds to the data lake: `c604ff05-7f1a-43c0-8e18-33bf874cb11c`.
API format
```http
POST /targetConnections
```
Request
```shell
curl -X POST \
'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}' \
-H 'Content-Type: application/json' \
-d '{
"name": "ACME target connection",
"description": "ACME target connection",
"data": {
"schema": {
"id": "https://ns.adobe.com/{TENANT_ID}/schemas/52b59140414aa6a370ef5e21155fd7a686744b8739ecc168",
"version": "application/vnd.adobe.xed-full+json;version=1"
}
},
"params": {
"dataSetId": "6889f4f89b982b2b90bc1207"
},
"connectionSpec": {
"id": "c604ff05-7f1a-43c0-8e18-33bf874cb11c",
"version": "1.0"
}
}'
```
| Property | Description |
| --- | --- |
| `name` | The name of your target connection. Use a descriptive name to make it easier to identify your connection. |
| `description` | An optional description for your target connection. |
| `data.schema.id` | The `$id` of your target XDM schema. |
| `params.dataSetId` | The ID of your target dataset. |
| `connectionSpec.id` | The fixed connection specification ID of the data lake: `c604ff05-7f1a-43c0-8e18-33bf874cb11c`. |
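Response

A successful response returns the ID (`id`) of your new target connection. This ID is required in a later step to create a dataflow. An abridged example:

```json
{
    "id": "320f119a-5ac1-4ab1-88ea-eb19e674ea2e"
}
```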
Mapping mapping
Next, map your source data to the target schema that your target dataset adheres to. To create a mapping, make a POST request to the `mappingSets` endpoint of the Data Prep API. Include your target XDM schema ID and the details of the mapping sets you want to create.
API format
```http
POST /mappingSets
```
Request
```shell
curl -X POST \
'https://platform.adobe.io/data/foundation/conversion/mappingSets' \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}' \
-H 'Content-Type: application/json' \
-d '{
"version": 0,
"xdmSchema": "https://ns.adobe.com/{TENANT_ID}/schemas/52b59140414aa6a370ef5e21155fd7a686744b8739ecc168",
"xdmVersion": "1.0",
"id": null,
"mappings": [
{
"destinationXdmPath": "_id",
"sourceAttribute": "TestID",
"identity": false,
"identityGroup": null,
"namespaceCode": null,
"version": 0
},
{
"destinationXdmPath": "person.name.fullName",
"sourceAttribute": "Name",
"identity": false,
"identityGroup": null,
"namespaceCode": null,
"version": 0
},
{
"destinationXdmPath": "person.birthDate",
"sourceAttribute": "Datefield",
"identity": false,
"identityGroup": null,
"namespaceCode": null,
"version": 0
}
]
}'
```
| Property | Description |
| --- | --- |
| `xdmSchema` | The `$id` of the target XDM schema. |

Response
A successful response returns details of the newly created mapping, including its unique identifier (`id`). This ID is required in a later step to create a dataflow.
```json
{
"id": "93ddfa69c4864d978832b1e5ef6ec3b9",
"version": 0,
"createdDate": 1612309018666,
"modifiedDate": 1612309018666,
"createdBy": "{CREATED_BY}",
"modifiedBy": "{MODIFIED_BY}"
}
```
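To verify the mapping set before attaching it to a dataflow, you can look it up by ID with a standard Data Prep API lookup:

```shell
curl -X GET \
  'https://platform.adobe.io/data/foundation/conversion/mappingSets/93ddfa69c4864d978832b1e5ef6ec3b9' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}'
```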
Retrieve dataflow specifications flow-specs
Before you can create a dataflow, you must first retrieve the dataflow specifications that correspond with your source. To retrieve this information, make a GET request to the /flowSpecs
endpoint of the Flow Service API.
API format
```http
GET /flowSpecs?property=name=="{NAME}"
```
| Parameter | Description |
| --- | --- |
| `property=name=="{NAME}"` | The name of your dataflow specification. For file-based sources (such as cloud storage), set this value to `CloudStorageToAEP`. For table-based sources (such as databases, CRMs, and marketing automation providers), set this value to `CRMToAEP`. |
Request
```shell
curl -X GET \
  'https://platform.adobe.io/data/foundation/flowservice/flowSpecs?property=name=="CRMToAEP"' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}'
```
Response
A successful response returns the details of the dataflow specification responsible for bringing data from your source into Experience Platform. The response includes the unique flow spec `id` required to create a new dataflow.
To ensure you are using the correct dataflow specification, check the items.sourceConnectionSpecIds
array in the response. Confirm that the connection specification ID for your source is included in this list.
The following abridged example shows the fields most relevant to this guide; the IDs match those used in the requests elsewhere in this tutorial:

```json
{
    "items": [
        {
            "id": "14518937-270c-4525-bdec-c2ba7cce3860",
            "name": "CRMToAEP",
            "sourceConnectionSpecIds": [
                "cfc0fee1-7dc0-40ef-b73e-d8b134c436f5"
            ],
            "targetConnectionSpecIds": [
                "c604ff05-7f1a-43c0-8e18-33bf874cb11c"
            ]
        }
    ]
}
```
Create a dataflow dataflow
A dataflow is a configured pipeline that transfers data across Experience Platform services. It defines how data is ingested from external sources (like databases, cloud storage, or APIs), processed, and routed to target datasets. These datasets are then used by services such as Identity Service, Real-Time Customer Profile, and Destinations for activation and analysis.
To create a dataflow, you will need to provide values for the following items gathered in the previous steps:

- The dataflow specification ID (`flowSpec.id`)
- The source connection ID (`sourceConnectionIds`)
- The target connection ID (`targetConnectionIds`)
- The mapping ID (`mappingId`)
During this step, you can use the following parameters in `scheduleParams` to configure an ingestion schedule for your dataflow (an example configuration follows this list):

`startTime`

The designated start time for your dataflow, in epoch seconds.

`frequency`

The frequency of ingestion. Configure frequency to indicate how often the dataflow should run. You can set your frequency to:

- `once`: Set your frequency to `once` to create a one-time ingestion. Interval and backfill settings are not available for one-time ingestion jobs. By default, the scheduling frequency is set to `once`.
- `minute`: Set your frequency to `minute` to schedule your dataflow to ingest data on a per-minute basis.
- `hour`: Set your frequency to `hour` to schedule your dataflow to ingest data on a per-hour basis.
- `day`: Set your frequency to `day` to schedule your dataflow to ingest data on a per-day basis.
- `week`: Set your frequency to `week` to schedule your dataflow to ingest data on a per-week basis.

`interval`

The interval between consecutive ingestions (required for all frequencies except `once`). Configure the interval setting to establish the time frame between every ingestion. For example, if your frequency is set to `day` and the interval is 15, the dataflow will run every 15 days. You cannot set the interval to zero. The minimum accepted interval value for each frequency is as follows:

- `once`: n/a
- `minute`: 15
- `hour`: 1
- `day`: 1
- `week`: 1

`backfill`

A boolean value that determines whether to ingest historical data prior to the `startTime`.
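For example, the following `scheduleParams` configuration (shown in isolation, with illustrative values) schedules a daily ingestion with backfill enabled:

```json
{
    "scheduleParams": {
        "startTime": "1612310466",
        "frequency": "day",
        "interval": "1",
        "backfill": "true"
    }
}
```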
API format
```http
POST /flows
```
Request
```shell
curl -X POST \
'https://platform.adobe.io/data/foundation/flowservice/flows' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}' \
-H 'Content-Type: application/json' \
-d '{
"name": "ACME Contact Dataflow",
"description": "A dataflow for ACME contact data",
"flowSpec": {
"id": "14518937-270c-4525-bdec-c2ba7cce3860",
"version": "1.0"
},
"sourceConnectionIds": [
"b7581b59-c603-4df1-a689-d23d7ac440f3"
],
"targetConnectionIds": [
"320f119a-5ac1-4ab1-88ea-eb19e674ea2e"
],
"transformations": [
{
"name": "Copy",
"params": {
"deltaColumn": {
"name": "Datefield",
"dateFormat": "YYYY-MM-DD",
"timezone": "UTC"
}
}
},
{
"name": "Mapping",
"params": {
"mappingId": "93ddfa69c4864d978832b1e5ef6ec3b9",
"mappingVersion": 0
}
}
],
"scheduleParams": {
"startTime": "1612310466",
"frequency":"minute",
"interval":"15",
"backfill": "true"
}
}'
```
| Property | Description |
| --- | --- |
| `name` | The name of your dataflow. Use a descriptive name to make it easier to identify your dataflow. |
| `description` | An optional description for your dataflow. |
| `flowSpec.id` | The ID of the dataflow specification retrieved in the previous step. |
| `sourceConnectionIds` | The ID of the source connection created in an earlier step. |
| `targetConnectionIds` | The ID of the target connection created in an earlier step. |
| `transformations.params.deltaColumn` | The delta column is required to partition the data and separate newly ingested data from historical data. The supported date format for `deltaColumn` is `yyyy-MM-dd HH:mm:ss`. For Microsoft Dynamics, the supported format for `deltaColumn` is `yyyy-MM-ddTHH:mm:ssZ`. |
| `transformations.params.deltaColumn.dateFormat` | The date format used by the delta column. |
| `transformations.params.deltaColumn.timeZone` | The time zone of the delta column. |
| `transformations.params.mappingId` | The mapping ID created in an earlier step. |
| `scheduleParams.startTime` | The designated start time for your dataflow, in epoch seconds. |
| `scheduleParams.frequency` | The frequency of ingestion. Acceptable values include: `once`, `minute`, `hour`, `day`, or `week`. |
| `scheduleParams.interval` | The interval between consecutive dataflow runs. The interval value must be a non-zero integer. |
| `scheduleParams.backfill` | A boolean value (`true` or `false`) that determines whether to ingest historical data (backfill) when the dataflow is first created. |

Response
A successful response returns the ID (`id`) of the newly created dataflow.
```json
{
"id": "ae0a9777-b322-4ac1-b0ed-48ae9e497c7e",
"etag": "\"770029f8-0000-0200-0000-6019e7d40000\""
}
```
Use the UI to validate your API workflow validate-in-ui
You can use the Experience Platform user interface to validate the creation of your dataflow. Navigate to the Sources catalog in the Experience Platform UI and then select Dataflows from the header tabs. Next, use the Dataflow Name column and locate the dataflow that you created using the Flow Service API.
You can further validate your dataflow through the Dataflow activity interface. Use the right rail to view the API usage information of your dataflow. This section displays the same dataflow ID, dataset ID, and mapping ID that were generated during the dataflow creation process in Flow Service.
Next steps
This tutorial guided you through the process of creating a dataflow in Experience Platform using the Flow Service API. You learned how to create and configure the necessary components, including the target XDM schema, dataset, source connection, target connection, and the dataflow itself. By following these steps, you can automate the ingestion of data from external sources into Experience Platform, enabling downstream services such as Real-Time Customer Profile and Destinations to leverage your ingested data for advanced use cases.
Monitor your dataflow
Once your dataflow is created, you can monitor its performance directly in the Experience Platform UI. This includes tracking ingestion rates, success metrics, and any errors that occur. For more information on how to monitor dataflows, visit the tutorial on monitoring accounts and dataflows.
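For a quick programmatic check, you can also retrieve the run history of your dataflow through the Flow Service API, filtering runs by the dataflow ID created above:

```shell
curl -X GET \
  'https://platform.adobe.io/data/foundation/flowservice/runs?property=flowId==ae0a9777-b322-4ac1-b0ed-48ae9e497c7e' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}'
```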
Update your dataflow
To update configurations for your dataflow's scheduling, mapping, or general information, visit the tutorial on updating sources dataflows.
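In the API, updates are made with a PATCH request that carries the dataflow's latest etag in an If-Match header. The following sketch, which changes the ingestion frequency to daily, is illustrative:

```shell
curl -X PATCH \
  'https://platform.adobe.io/data/foundation/flowservice/flows/ae0a9777-b322-4ac1-b0ed-48ae9e497c7e' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}' \
  -H 'Content-Type: application/json' \
  -H 'If-Match: "770029f8-0000-0200-0000-6019e7d40000"' \
  -d '[
      {
          "op": "replace",
          "path": "/scheduleParams/frequency",
          "value": "day"
      }
  ]'
```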
Delete your dataflow
You can delete dataflows that are no longer necessary or were incorrectly created using the Delete function available in the Dataflows workspace. For more information on how to delete dataflows, visit the tutorial on deleting dataflows.
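In the API, the equivalent operation is a DELETE request against your dataflow ID:

```shell
curl -X DELETE \
  'https://platform.adobe.io/data/foundation/flowservice/flows/ae0a9777-b322-4ac1-b0ed-48ae9e497c7e' \
  -H 'Authorization: Bearer {ACCESS_TOKEN}' \
  -H 'x-api-key: {API_KEY}' \
  -H 'x-gw-ims-org-id: {ORG_ID}' \
  -H 'x-sandbox-name: {SANDBOX_NAME}'
```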