Data Prep UI Guide

Last update: January 3, 2023

Topics: Data Prep

Created for: Developer, User, Admin, Leader

This document describes how to use Data Prep functions in the Adobe Experience Platform user interface to map CSV files to an XDM schema.

Getting started

This tutorial requires a working understanding of the following Platform components:

- Experience Data Model (XDM) System: The standardized framework by which Platform organizes customer experience data.
  - Basics of schema composition: Learn about the basic building blocks of XDM schemas, including key principles and best practices in schema composition.
  - Schema Editor tutorial: Learn how to create custom schemas using the Schema Editor UI.
- Identity Service: Gain a better view of individual customers and their behavior by bridging identities across devices and systems.
- Real-Time Customer Profile: Provides a unified, real-time consumer profile based on aggregated data from multiple sources.
- Sources: Experience Platform allows data to be ingested from various sources while providing you with the ability to structure, label, and enhance incoming data using Platform services.

Dataflow detail

TIP

You can access dataflow detail by selecting any source from the sources catalog. For more information, see the sources overview.

Before you can map your CSV data to an XDM schema, you must first establish the details of your dataflow.

The Dataflow detail page allows you to select whether you want to ingest your CSV data into an existing target dataset or a new target dataset. An existing dataset comes with a pre-built target schema to map your data to, while a new dataset requires you to select an existing schema, or create a new schema, to map your data to.

Use an existing target dataset

To ingest your CSV data into an existing dataset, select Existing dataset. You can retrieve an existing dataset either by using the Advanced search option or by scrolling through the list of existing datasets in the dropdown menu.

With a dataset selected, provide a name for your dataflow and an optional description.

During this process, you can also enable Error diagnostics and Partial ingestion. Error diagnostics enables detailed error message generation for any erroneous records that occur in your dataflow, while Partial ingestion allows you to ingest data containing errors, up to a certain threshold that you manually define. See the partial batch ingestion overview for more information.
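
The way these two settings interact can be pictured with a small sketch. The Python below is illustrative only and is not how Platform implements ingestion: it assumes the threshold is a percentage of failed records, and `ingest_batch`, `validate`, and `require_email` are hypothetical names.

```python
from typing import Callable, Optional


def ingest_batch(
    records: list[dict],
    validate: Callable[[dict], Optional[str]],
    error_threshold_pct: float,
    error_diagnostics: bool = True,
) -> dict:
    """Ingest the valid records unless the share of failures exceeds the threshold."""
    accepted, errors = [], []
    for index, record in enumerate(records):
        problem = validate(record)  # None means the record is valid
        if problem is None:
            accepted.append(record)
        elif error_diagnostics:
            errors.append({"row": index, "message": problem})  # detailed message
        else:
            errors.append({"row": index})  # failure is counted, but not explained
    failed_pct = 100.0 * len(errors) / len(records) if records else 0.0
    succeeded = failed_pct <= error_threshold_pct
    return {
        "status": "success" if succeeded else "failed",
        "ingested": accepted if succeeded else [],
        "errors": errors,
        "failed_pct": failed_pct,
    }


def require_email(record: dict) -> Optional[str]:
    """Hypothetical validation rule: every record needs a non-empty email."""
    return None if record.get("email") else "missing email"


rows = [
    {"email": "a@example.com"},
    {"email": ""},
    {"email": "c@example.com"},
    {"email": "d@example.com"},
]
result = ingest_batch(rows, require_email, error_threshold_pct=30.0)
```

In this sketch, one of four records fails (25%), so a 30% threshold still ingests the three valid records, while a 10% threshold would fail the whole batch.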

Use a new target dataset

To ingest your CSV data into a new dataset, select New dataset and then provide an output dataset name and an optional description. Next, select a schema to map to using the Advanced search option or by scrolling through the list of existing schemas in the dropdown menu.

With a schema selected, provide a name for your dataflow and an optional description, and then apply the Error diagnostics and Partial ingestion settings you want for your dataflow. When finished, select Next.
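
The choices made on the Dataflow detail page can be summarized as a small settings object. This is a hedged sketch for reasoning about which inputs each path needs. The class `DataflowDetail` and its field names are hypothetical and do not correspond to any Platform API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataflowDetail:
    dataflow_name: str
    use_existing_dataset: bool
    dataset_name: str
    schema_name: Optional[str] = None  # only chosen by you for a new dataset
    description: str = ""              # optional in both paths
    error_diagnostics: bool = False
    partial_ingestion: bool = False
    error_threshold_pct: Optional[float] = None  # assumption: a percentage

    def problems(self) -> list[str]:
        """List the settings that still need attention before selecting Next."""
        issues = []
        if not self.dataflow_name:
            issues.append("dataflow name is required")
        if not self.dataset_name:
            issues.append("dataset name is required")
        if not self.use_existing_dataset and not self.schema_name:
            issues.append("a new dataset needs a target schema")
        if self.partial_ingestion and self.error_threshold_pct is None:
            issues.append("partial ingestion needs an error threshold")
        return issues


draft = DataflowDetail(
    dataflow_name="Loyalty CSV import",
    use_existing_dataset=False,
    dataset_name="Loyalty members",
)
```

Here `draft.problems()` reports that a schema is still missing, which mirrors the difference between the two paths: an existing dataset already carries its schema, while a new dataset does not.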

Select data

The Select data step appears, providing an interface where you can upload local files and preview their structure and contents. Select Choose files to upload a CSV file from your local system. Alternatively, you can drag and drop the CSV file you want to upload into the Drag and drop files panel.

TIP

Only CSV files are currently supported by local file upload. The maximum file size for each file is 1 GB.
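
If you prepare files with a script, you can check both constraints before uploading. The following is a minimal sketch, not part of Platform: it assumes that 1 GB means 10^9 bytes and that a `.csv` extension is a good enough signal of file type. `upload_problems` is a hypothetical helper name.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 1_000_000_000  # assumption: 1 GB interpreted as 10**9 bytes


def upload_problems(path: Path) -> list[str]:
    """Return the reasons a file would not qualify for local file upload."""
    issues = []
    if path.suffix.lower() != ".csv":
        issues.append(f"{path.name}: only CSV files are supported")
    if path.exists() and path.stat().st_size > MAX_UPLOAD_BYTES:
        issues.append(f"{path.name}: larger than 1 GB")
    return issues
```

For example, `upload_problems(Path("members.json"))` returns a single message about the unsupported file type.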

Once your file is uploaded, the preview interface updates to display the contents and structure of the file.

Depending on your file, you can select a column delimiter such as tabs, commas, pipes, or a custom column delimiter for your source data. Select the Delimiter dropdown arrow and then select the appropriate delimiter from the menu.

When finished, select Next.
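
To see why the delimiter choice matters, the sketch below parses the same sample text twice with Python's standard `csv` module. It only illustrates the effect on the preview and is not the parser that Platform uses. The sample data and the `preview` helper are made up for this example.

```python
import csv
import io

DELIMITERS = {"comma": ",", "tab": "\t", "pipe": "|"}


def preview(text: str, delimiter: str, max_rows: int = 5) -> list[list[str]]:
    """Return the header row plus up to max_rows data rows, split on the delimiter."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for _, row in zip(range(max_rows + 1), reader)]


sample = "id|email|city\n1|a@example.com|Lyon\n2|b@example.com|Oslo\n"

wrong = preview(sample, DELIMITERS["comma"])  # every row collapses into one column
right = preview(sample, DELIMITERS["pipe"])   # three columns per row
```

With the wrong delimiter, the header is read as the single column `id|email|city`. With the pipe delimiter, it is read as the three columns `id`, `email`, and `city`.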

Mapping

The mapping interface provides you with a comprehensive tool to map source fields from your source schema to their appropriate target XDM fields in the target schema.
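
Conceptually, a mapping pairs each flat CSV column with a path in the nested target schema. The sketch below shows that idea in Python. It assumes dot-separated target paths, and the column names and target paths are illustrative rather than taken from your schema. `set_path` and `apply_mapping` are hypothetical helpers, not Data Prep functions.

```python
def set_path(target: dict, dotted_path: str, value) -> None:
    """Write a value into a nested dict at a dot-separated path."""
    *parents, leaf = dotted_path.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})  # create intermediate objects as needed
    node[leaf] = value


def apply_mapping(row: dict, mapping: dict[str, str]) -> dict:
    """Build one nested record from one flat CSV row; unmapped columns are skipped."""
    record: dict = {}
    for source_field, target_path in mapping.items():
        if source_field in row:
            set_path(record, target_path, row[source_field])
    return record


mapping = {
    "first_name": "person.name.firstName",
    "last_name": "person.name.lastName",
    "email": "personalEmail.address",
}
row = {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com", "unused": "x"}
record = apply_mapping(row, mapping)
```

The resulting `record` nests the two name columns under `person.name` and the email under `personalEmail`, while the `unused` column is dropped because it has no mapping.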

Understanding the mapping interface

The mapping interface includes a dashboard that provides information on the health of your mapping fields within the context of the ingestion workflow. The dashboard displays the following details regarding your mapping fields:

| Property | Description |
| --- | --- |
| Mapped fields | Displays the total number of source fields that have been mapped to a target XDM field, regardless of errors. |
| Required fields | Displays the number of required mapping fields. |
| Identity fields | Displays the total number of mapping fields defined as identity. These mapping fields are represented by a fingerprint icon. |
| Errors | Displays the number of erroneous mapping fields. |
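
The four dashboard counts can be thought of as simple tallies over the list of mapping fields. The following sketch is an assumption about how such counts could be derived, not a description of the UI's internals. `MappingField` and `dashboard` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MappingField:
    source: str
    target: Optional[str] = None  # None until the field is mapped
    required: bool = False
    identity: bool = False
    error: Optional[str] = None   # None when the mapping has no error


def dashboard(fields: list[MappingField]) -> dict[str, int]:
    """Tally the four dashboard properties for a list of mapping fields."""
    return {
        "mapped": sum(f.target is not None for f in fields),  # counted regardless of errors
        "required": sum(f.required for f in fields),
        "identity": sum(f.identity for f in fields),
        "errors": sum(f.error is not None for f in fields),
    }


fields = [
    MappingField("email", "personalEmail.address", required=True, identity=True),
    MappingField("first_name", "person.name.firstName"),
    MappingField("dob", "person.birthDate", error="type mismatch"),
    MappingField("notes"),
]
summary = dashboard(fields)
```

Note that the `dob` field counts toward both the mapped total and the error total, which matches the "regardless of errors" wording for mapped fields.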

The mapping interface also provides a panel of options that you can use to interact with or filter your mapping fields.

To search for a particular mapping set, select Search source fields and enter the name of the source field that you want to isolate.

Select All source fields to see a dropdown menu of filtering options to better narrow down your view of the mapping interface.

The filtering options are:

| Source fields | Description |
| --- | --- |
| All source fields | This option displays all of the source fields of your source schema. This option is displayed by default. |
| Required fields | This option filters the source schema to only display the fields required to complete the mapping. |
| Identity fields | This option filters the source schema to only display the fields marked for Identity. |
| Mapped fields | This option filters the source schema to only display the fields that have already been mapped. |
| Unmapped fields | This option filters the source schema to only display the fields that have yet to be mapped. |
| Fields with recommendation | This option filters the source schema to only display the fields that contain mapping recommendations. |
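
Each filtering option listed above behaves like a predicate over the source fields. The sketch below models that idea. It assumes that the search box and the selected filter combine, which the UI may or may not do, and the dictionary keys (`source`, `target`, `required`, `identity`, `recommendations`) are hypothetical.

```python
FILTERS = {
    "All source fields": lambda f: True,  # default option
    "Required fields": lambda f: f["required"],
    "Identity fields": lambda f: f["identity"],
    "Mapped fields": lambda f: f["target"] is not None,
    "Unmapped fields": lambda f: f["target"] is None,
    "Fields with recommendation": lambda f: bool(f["recommendations"]),
}


def visible_fields(fields, option="All source fields", search=""):
    """Return the source field names that pass both the filter and the search text."""
    keep = FILTERS[option]
    return [
        f["source"]
        for f in fields
        if keep(f) and search.lower() in f["source"].lower()
    ]


fields = [
    {"source": "email", "target": "personalEmail.address",
     "required": True, "identity": True, "recommendations": []},
    {"source": "first_name", "target": None,
     "required": False, "identity": False, "recommendations": ["person.name.firstName"]},
    {"source": "last_name", "target": "person.name.lastName",
     "required": False, "identity": False, "recommendations": []},
]
unmapped = visible_fields(fields, "Unmapped fields")
```

In this sample, only `first_name` is unmapped, and searching for `name` with the default filter returns `first_name` and `last_name`.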

Select Fields with errors to see all mapping fields with errors.

An isolated view of erroneous mapping fields appears, allowing you to address errors through intelligent mapping recommendations or through the manual mapping tree.

Add a new field type

You can add a new mapping field or a calculated field by selecting New field type.

New mapping field

To add a new mapping field, select New field type and then select Add new field from the dropdown menu that appears.

Next, select the source field you would like to add from the source schema tree that appears and then select Select.

The mapping interface updates with the source field you selected and an empty target field. Select Map target field to start mapping the new source field to its appropriate target XDM field.

An interactive target schema tree appears, allowing you to manually traverse the target schema and find the appropriate target XDM field for your source field.
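
Traversing the tree by hand amounts to walking a nested structure until you reach a leaf field. The sketch below shows the same walk in Python over a nested dictionary. The schema content is a made-up fragment and `leaf_paths` is a hypothetical helper, so treat this as a mental model rather than a representation of how XDM schemas are stored.

```python
def leaf_paths(schema: dict, prefix: str = "") -> list[str]:
    """Return the dot-separated path of every leaf field in a nested schema."""
    paths = []
    for name, child in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(child, dict):
            paths.extend(leaf_paths(child, path))  # descend into an object field
        else:
            paths.append(path)                     # a leaf: child holds the data type
    return paths


target_schema = {
    "person": {
        "name": {"firstName": "string", "lastName": "string"},
        "birthDate": "date",
    },
    "personalEmail": {"address": "string"},
}

all_paths = leaf_paths(target_schema)
name_fields = [p for p in all_paths if "name" in p.lower()]
```

Listing the paths this way makes it easy to narrow the candidates for a source field such as `first_name` to the two fields under `person.name`.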

When finished, select the schema icon to close the target schema interface.

Calculated fields

Calculated fields let you create values based on the attributes in the input schema. You can then assign these values to attributes in the target schema and give them a name and description for easier reference. Calculated fields have a maximum length of 4096 characters.
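
As a rough mental model, a calculated field is a named expression over source attributes whose result is written to a target attribute. The Python sketch below stands in for that idea. It assumes the 4096-character limit applies to the expression text, and the expression string is placeholder text rather than verified Data Prep syntax. The `CalculatedField` class and the target path are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

MAX_LENGTH = 4096  # limit stated for calculated fields


@dataclass
class CalculatedField:
    name: str
    description: str
    target_path: str
    expression_text: str               # what you would type in the expression editor
    compute: Callable[[dict], object]  # Python stand-in for evaluating the expression

    def __post_init__(self) -> None:
        if len(self.expression_text) > MAX_LENGTH:
            raise ValueError("calculated field exceeds 4096 characters")


full_name = CalculatedField(
    name="Full name",
    description="First and last name joined with a space",
    target_path="person.name.fullName",
    expression_text='first_name + " " + last_name',  # placeholder, not checked syntax
    compute=lambda row: f"{row['first_name']} {row['last_name']}",
)
value = full_name.compute({"first_name": "Ada", "last_name": "Lovelace"})
```

For the functions and operators that the expression editor actually accepts, rely on the Data Prep (Mapper) functions guide rather than on this sketch.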

To create a calculated field, select New field type and then select Add calculated field.

The Create calculated field panel appears. The left dialog box contains the fields, functions, and operators supported in calculated fields. Select one of the tabs to start adding functions, fields, or operators to the expression editor.

| Tab | Description |
| --- | --- |
| Function | The functions tab lists the functions available to transform the data. To learn more about the functions you can use within calculated fields, please read the guide on using Data Prep (Mapper) functions. |
| Field | The fields tab lists fields and attributes available in the source schema. |
| Operator | The operators tab lists the operators that are available to transform the data. |

You can manually add fields, functions, and operators using the expression editor at the center. Select the editor to start creating an expression. Once you are finished, select Save to proceed.

Import mapping

You can reuse the mapping of an existing dataflow to reduce the manual configuration time of your data ingestion and limit mistakes. Select Import mapping to reuse an existing mapping.

The Import mapping window appears, providing you with a list of dataflows to choose from.

Select the preview icon to preview the mapping of the dataflow you selected.

The preview window allows you to inspect the existing mapping before importing it into your dataflow. Once you verify the mapping, you can select Back to return to the list of dataflows and inspect another mapping set, or you can select Select to proceed.

Alternatively, you can select the mapping you want to import directly from the list of dataflows. Select the dataflow that contains the mapping you want to import and then select Select to proceed.

The interface updates with the mapping you imported.

NOTE

Importing a mapping from an existing dataflow replaces any mapping sets that you have already established, as well as any ML mapping recommendations.

Select Preview data to see mapping results of up to 100 rows of sample data from the selected dataset.

During the preview, the identity column is prioritized as the first field, as it is the key information necessary when validating mapping results. When finished, select Close.
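
The preview behavior described above (at most 100 rows, identity column first) can be approximated in a few lines. This is a sketch under assumptions: it treats a mapping as a plain source-to-target dictionary and takes the identity target as an explicit argument. `preview_rows` and the sample data are hypothetical.

```python
from itertools import islice

PREVIEW_LIMIT = 100  # the preview shows up to 100 rows


def preview_rows(rows, mapping: dict[str, str], identity_target: str) -> list[dict]:
    """Map up to PREVIEW_LIMIT rows and put the identity column first."""
    # Stable sort: the identity target sorts ahead, other targets keep their order.
    ordered_targets = sorted(mapping.values(), key=lambda t: t != identity_target)
    source_for = {target: source for source, target in mapping.items()}
    return [
        {target: row.get(source_for[target]) for target in ordered_targets}
        for row in islice(rows, PREVIEW_LIMIT)
    ]


mapping = {"city": "homeAddress.city", "email": "personalEmail.address"}
sample_rows = ({"email": f"user{i}@example.com", "city": "Lyon"} for i in range(250))
preview = preview_rows(sample_rows, mapping, identity_target="personalEmail.address")
```

Although the sample generator yields 250 rows, `preview` holds only the first 100, and each previewed row lists `personalEmail.address` before `homeAddress.city`.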

To remove all mapping fields, select Clear all mappings.

Using the mapping interface

Platform automatically provides intelligent recommendations for auto-mapped fields based on the target schema or dataset that you selected. You can manually adjust mapping rules to suit your use cases or fix any duplicated mapping fields to clear any errors.
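
One way to picture a duplicated mapping is two source fields that point at the same target field. The sketch below finds such collisions in a list of mapping pairs. This reading of "duplicated mapping fields" is an assumption, and `duplicated_targets` is a hypothetical helper, not a Platform feature.

```python
from collections import defaultdict


def duplicated_targets(mapping_pairs: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return each target field that more than one source field is mapped to."""
    sources_by_target = defaultdict(list)
    for source, target in mapping_pairs:
        sources_by_target[target].append(source)
    return {t: s for t, s in sources_by_target.items() if len(s) > 1}


pairs = [
    ("email", "personalEmail.address"),
    ("contact_email", "personalEmail.address"),
    ("city", "homeAddress.city"),
]
conflicts = duplicated_targets(pairs)
```

In this sample, `conflicts` shows that both `email` and `contact_email` target `personalEmail.address`, so one of the two would need a different target field.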

Select the lightbulb icon in the target field that you want to adjust.

The Mapping recommendations pop-up panel appears, displaying a list of recommended target fields that can be mapped to a particular source field. By default, the first recommendation is automatically applied.

Sometimes, more than one recommendation is available for the source schema. When this happens, the mapping card displays the most prominent recommendation, followed by an icon that contains the number of additional recommendations available. Selecting the lightbulb icon shows a list of the additional recommendations. You can choose one of the alternate recommendations by selecting the checkbox next to the recommendation you want to map to instead.

From here, you can change the selected target field to fix an error or match your use case.

Alternatively, you can select Select manually to manually use the interactive target schema mapping tree.
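
To get a feel for ranked recommendations with a default choice, the sketch below ranks target fields by simple name similarity using Python's `difflib`. This is a deliberately simple stand-in and not the intelligent mapping that Platform uses. `normalize`, `recommend`, and the target paths are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a field name and drop separators such as underscores."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def recommend(source_field: str, target_paths: list[str], limit: int = 3) -> list[str]:
    """Rank target paths by how closely their last segment matches the source name."""
    def score(path: str) -> float:
        leaf = path.rsplit(".", 1)[-1]
        return SequenceMatcher(None, normalize(source_field), normalize(leaf)).ratio()

    return sorted(target_paths, key=score, reverse=True)[:limit]


targets = [
    "person.name.firstName",
    "person.name.lastName",
    "personalEmail.address",
    "homeAddress.city",
]
suggestions = recommend("first_name", targets)
applied = suggestions[0]  # the first recommendation is applied by default
```

Here `applied` is `person.name.firstName`, and the remaining entries in `suggestions` play the role of the alternate recommendations you could select instead.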

The target schema mapping interface appears in the same view as your mapping fields, allowing you to modify mapping pairs within the same screen. Select the target field that fits your use case or fixes your errors.

When finished, select Finish to proceed.

Next steps

By following this tutorial, you have successfully mapped a CSV file to a target XDM schema using the mapping interface in the Platform UI. See the following documents for more information:
