Datasets UI guide
This user guide provides instructions on performing common actions when working with datasets within the Adobe Experience Platform user interface.
Getting started
This user guide requires a working understanding of the following components of Adobe Experience Platform:
- Datasets: The storage and management construct for data persistence in Experience Platform.
- Experience Data Model (XDM) System: The standardized framework by which Experience Platform organizes customer experience data.
  - Basics of schema composition: Learn about the basic building blocks of XDM schemas, including key principles and best practices in schema composition.
  - Schema Editor: Learn how to build your own custom XDM schemas using the Schema Editor within the Platform user interface.
- Real-Time Customer Profile: Provides a unified, real-time consumer profile based on aggregated data from multiple sources.
- Adobe Experience Platform Data Governance: Ensure compliance with regulations, restrictions, and policies regarding the usage of customer data.
View datasets
In the Experience Platform UI, select Datasets in the left navigation to open the Datasets dashboard. The dashboard lists all available datasets for your organization. Details are displayed for each listed dataset, including its name, the schema the dataset adheres to, and the status of the most recent ingestion run.
Select the name of a dataset from the Browse tab to access its Dataset activity screen and see details of the dataset you selected. The activity tab includes a graph visualizing the rate of messages being consumed as well as a list of successful and failed batches.
Inline dataset actions
The datasets UI now offers a collection of inline actions for each available dataset. Select the ellipsis (…) of a dataset that you want to manage to see the available options in a pop-up menu. The available actions include: Preview dataset, Manage data and access labels, Enable unified profile, Manage tags, Move to folders, and Delete. More information on these actions can be found in their respective sections.
Add dataset tags
Add custom tags to organize datasets and improve search, filtering, and sorting capabilities. From the Browse tab of the Datasets workspace, select the ellipsis of a dataset that you want to manage, followed by Manage tags from the dropdown menu.
The Manage tags dialog appears. Enter a short description to create a custom tag, or choose from a pre-existing tag to label your dataset. Select Save to confirm your settings.
The Manage tags dialog can also remove existing tags from a dataset. Simply select the ‘x’ next to the tag you wish to remove and select Save.
Once a tag has been added to a dataset, datasets can be filtered based on the corresponding tag. See the section on how to filter datasets by tags for more information.
For more information on how to classify business objects for easier discovery and categorization, see the guide on managing metadata taxonomies. This guide details how a user with appropriate permissions can create pre-defined tags, assign categories to tags, and perform all related CRUD operations on tags and tag categories in the Platform UI.
Search and filter datasets
To search or filter the list of available datasets, select the filter icon at the top left of the workspace. A set of filter options appears in the left rail. You can filter your available datasets by the following criteria: Show System Datasets, Included in profile, Tags, Creation date, Modified date, Created by, and Schema.
The list of applied filters is displayed above the filtered results.
Show system datasets
By default, only datasets that you have ingested data into are shown. If you want to see the system-generated datasets, select the Yes checkbox in the Show system datasets section. System-generated datasets are only used to process other components. For example, the system-generated profile export dataset is used to process the profile dashboard.
Filter Profile-enabled datasets
The datasets that have been enabled for Profile data are used to populate customer profiles after data has been ingested. See the section on enabling datasets for Profile to learn more.
To filter your datasets based on whether they have been enabled for Profile, select the Yes checkbox from the filter options.
Filter datasets by tag
Enter your custom tag name in the Tags input, then select your tag from the list of available options to search and filter datasets that correspond to that tag.
Filter datasets by creation date
Datasets can be filtered by creation date over a custom time period. This can be used to exclude historic data or to generate specific chronological data insights and reporting. Choose a Start date and an End date by selecting the calendar icon for each field. Afterward, only datasets that meet those criteria appear in the Browse tab.
Filter datasets by modified date
Similar to the filter for creation date, you can filter your datasets based on the date they were last modified. In the Modified date section, choose a Start date and an End date by selecting the calendar icon for each field. Afterward, only datasets that were modified during that period appear in the Browse tab.
Filter by schema
You can filter datasets based on the schema that defines their structure. Either select the dropdown icon or input the schema name into the text field. A list of potential matches appears. Select the appropriate schema from the list.
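The filters described in this section combine: a dataset is listed only if it satisfies every filter that is set. The following is a minimal sketch of that logic in Python, not a description of how Platform implements it. The `DatasetRecord` class, its fields, the `filter_datasets` function, and the sample data are all hypothetical, and treating date bounds as inclusive is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetRecord:
    """Hypothetical stand-in for one row of the Datasets dashboard."""
    name: str
    schema: str
    created: date
    profile_enabled: bool = False
    tags: set = field(default_factory=set)


def filter_datasets(datasets, *, tag=None, schema=None, profile_enabled=None,
                    created_start=None, created_end=None):
    """Return the datasets that satisfy every filter that is set.

    A filter left as None is ignored, like an unused option in the filter rail.
    Date bounds are treated as inclusive, which is an assumption.
    """
    result = []
    for ds in datasets:
        if tag is not None and tag not in ds.tags:
            continue
        if schema is not None and ds.schema != schema:
            continue
        if profile_enabled is not None and ds.profile_enabled != profile_enabled:
            continue
        if created_start is not None and ds.created < created_start:
            continue
        if created_end is not None and ds.created > created_end:
            continue
        result.append(ds)
    return result


# Hypothetical sample data for illustration only.
datasets = [
    DatasetRecord("Loyalty Members", "Loyalty Schema", date(2023, 1, 10),
                  profile_enabled=True, tags={"loyalty"}),
    DatasetRecord("Web Events", "Web Schema", date(2023, 6, 5), tags={"web"}),
    DatasetRecord("Loyalty Archive", "Loyalty Schema", date(2021, 2, 1),
                  tags={"loyalty"}),
]

# Combine a tag filter with a creation date range.
recent_loyalty = filter_datasets(
    datasets, tag="loyalty",
    created_start=date(2023, 1, 1), created_end=date(2023, 12, 31))
print([ds.name for ds in recent_loyalty])
```

In this sketch, the tag filter alone would match both loyalty datasets, and adding the date range narrows the result to the one created in 2023.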
Sort datasets by created date
Datasets in the Browse tab can be sorted by either ascending or descending dates. Select the Created or Last updated column headings to alternate between ascending and descending. Once selected, the column indicates this with either an up or down arrow to the side of the column header.
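The toggle behavior of a column heading can be sketched as follows. This is an illustration only: the row data and the `sort_by_created` function are hypothetical, and which direction the first selection applies is an assumption.

```python
from datetime import date

# Hypothetical (name, created date) pairs standing in for dashboard rows.
rows = [
    ("Web Events", date(2023, 6, 5)),
    ("Loyalty Archive", date(2021, 2, 1)),
    ("Loyalty Members", date(2023, 1, 10)),
]


def sort_by_created(rows, descending):
    """Return the rows ordered by created date in the requested direction."""
    return sorted(rows, key=lambda row: row[1], reverse=descending)


descending = False  # assumed direction after the first selection
ascending_view = sort_by_created(rows, descending)

descending = not descending  # selecting the heading again flips the order
descending_view = sort_by_created(rows, descending)

print([name for name, _ in ascending_view])
print([name for name, _ in descending_view])
```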
Preview a dataset
You can preview sample data for a dataset from either the inline options of the Browse tab or the Dataset activity view. From the Browse tab, select the ellipsis (…) next to the name of the dataset you wish to preview. A menu of options appears. Next, select Preview dataset from the list of available options. If the dataset is empty, the preview link is deactivated and instead indicates that the preview is not available.
This opens the preview window, where the hierarchical view of the schema for the dataset is shown on the right.
Alternatively, from the Dataset activity screen, select Preview dataset near the top-right corner of your screen to preview up to 100 rows of data.
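To illustrate what a capped preview means, the sketch below reads at most 100 rows from a data source and ignores the rest. It uses CSV text as a stand-in for dataset contents. The `preview_rows` function and the sample data are hypothetical and do not reflect how Platform produces its preview.

```python
import csv
import io
from itertools import islice

PREVIEW_LIMIT = 100  # row cap stated in this guide


def preview_rows(csv_text, limit=PREVIEW_LIMIT):
    """Return at most `limit` data rows from CSV text as dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # islice stops reading once the limit is reached.
    return list(islice(reader, limit))


# Build 250 hypothetical rows to show that only the first 100 are returned.
sample = "id,email\n" + "\n".join(
    f"{i},user{i}@example.com" for i in range(250))
preview = preview_rows(sample)
print(len(preview))
```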
For more robust methods to access your data, Experience Platform provides downstream services such as Query Service and JupyterLab to explore and analyze data. See the following documents for more information:
Create a dataset
To create a new dataset, start by selecting Create dataset in the Datasets dashboard.
In the next screen, you are presented with the following two options for creating a new dataset:
Create a dataset with an existing schema
In the Create dataset screen, select Create dataset from schema to create a new empty dataset.
The Select schema step appears. Browse the schema listing and select the schema that the dataset will adhere to before selecting Next.
The Configure dataset step appears. Provide the dataset with a name and optional description, then select Finish to create the dataset.
Datasets can be filtered from the list of available datasets in the UI with the schema filter. See the section on how to filter datasets by schema for more information.
Create a dataset with a CSV file
When a dataset is created using a CSV file, an ad hoc schema is created to provide the dataset with a structure that matches the provided CSV file. In the Create dataset screen, select Create dataset from CSV file.
The Configure step appears. Provide the dataset with a name and optional description, then select Next.
The Add data step appears. Upload the CSV file either by dragging and dropping it onto the center of your screen or by selecting Browse to explore your file directory. The file can be up to ten gigabytes in size. Once the CSV file is uploaded, select Save to create the dataset.
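As a rough illustration of what deriving a structure from a CSV file involves, the sketch below maps each column name to a guessed type based on the column's values. This guide does not describe how Platform builds the ad hoc schema, so the type names, the `infer_type` and `infer_ad_hoc_schema` functions, and the sample data are all hypothetical.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of integer, number, or string that fits all values."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except (ValueError, TypeError):
            return False

    if values and all_parse(int):
        return "integer"
    if values and all_parse(float):
        return "number"
    return "string"


def infer_ad_hoc_schema(csv_text):
    """Map each CSV column name to a guessed type based on its values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        column: infer_type([row[column] for row in rows])
        for column in (reader.fieldnames or [])
    }


# Hypothetical CSV content for illustration only.
csv_text = (
    "customer_id,lifetime_value,email\n"
    "1,10.5,a@example.com\n"
    "2,7,b@example.com\n"
)
print(infer_ad_hoc_schema(csv_text))
```

The column names come from the header row, which is why a dataset created this way has a structure that matches the uploaded file.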
Enable a dataset for Real-Time Customer Profile
Every dataset has the ability to enrich customer profiles with its ingested data. To do so, the schema that the dataset adheres to must be compatible for use in Real-Time Customer Profile. A compatible schema satisfies the following requirements:
- The schema has at least one attribute specified as an identity property.
- The schema has an identity property defined as the primary identity.
For more information on enabling a schema for Profile, see the Schema Editor user guide.
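The two compatibility requirements listed above can be expressed as a simple check. This is a sketch under stated assumptions: the `SchemaField` class and its `is_identity` and `is_primary` flags are a hypothetical simplification, not the XDM schema format.

```python
from dataclasses import dataclass


@dataclass
class SchemaField:
    """Hypothetical, simplified view of a schema field's identity settings."""
    path: str
    is_identity: bool = False
    is_primary: bool = False


def is_profile_compatible(fields):
    """Check both requirements: an identity exists, and one is primary."""
    identities = [f for f in fields if f.is_identity]
    has_identity = len(identities) >= 1
    has_primary = any(f.is_primary for f in identities)
    return has_identity and has_primary


# Hypothetical schemas for illustration only.
with_primary = [
    SchemaField("email", is_identity=True, is_primary=True),
    SchemaField("firstName"),
]
without_primary = [
    SchemaField("email", is_identity=True),
    SchemaField("firstName"),
]

print(is_profile_compatible(with_primary))
print(is_profile_compatible(without_primary))
```

The second schema has an identity property but no primary identity, so it fails the check.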
You can enable a dataset for Profile from either the inline options of the Browse tab or the Dataset activity view. From the Browse tab of the Datasets workspace, select the ellipsis of a dataset that you want to enable for Profile. A menu of options appears. Next, select Enable unified profile from the list of available options.
Alternatively, from the dataset’s Dataset activity screen, select the Profile toggle within the Properties column. Once enabled, data that is ingested into the dataset will also be used to populate customer profiles.
Datasets that have been enabled for Profile can also be filtered on this criterion. See the section on how to filter Profile-enabled datasets for more information.
Manage and enforce data governance on a dataset
You can manage the data governance labels for a dataset by selecting the inline options of the Browse tab. Select the ellipsis (…) next to the name of the dataset you wish to manage, followed by Manage data and access labels from the dropdown menu.
Data usage labels, applied at the schema level, allow you to categorize datasets and fields according to usage policies that apply to that data. See the Data Governance overview to learn more about labels, or refer to the data usage labels user guide for instructions on how to apply labels to schemas for propagation to datasets.
Move to folders
You can place datasets within folders for better dataset management. To move a dataset into a folder, select the ellipsis (…) next to the name of the dataset you wish to manage, followed by Move to folder from the dropdown menu.
The Move dataset to folder dialog appears. Select the folder you want to move the dataset to, then select Move. A popup notification informs you that the dataset move has been successful.
Once the dataset is in a folder, you can choose to only display datasets that belong to a specific folder. To open your folder structure, select the show folders icon. Next, select your chosen folder to see all associated datasets.
Delete a dataset
You can delete a dataset from either the dataset inline actions in the Browse tab or the top right of the Dataset activity view. From the Browse view, select the ellipsis (…) next to the name of the dataset you wish to delete. A menu of options appears. Next, select Delete from the dropdown menu.
A confirmation dialog appears. Select Delete to confirm.
Alternatively, select Delete dataset from the Dataset activity screen.
A confirmation box appears. Select Delete to confirm the deletion of the dataset.
Delete a Profile-enabled dataset
If a dataset is enabled for Profile, deleting that dataset through the UI will delete it from the data lake, Identity Service, and the Profile store within Platform.
You can delete a dataset from the Profile store only (leaving the data in the data lake) using the Real-Time Customer Profile API. For more information, see the profile system jobs API endpoint guide.
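The sketch below shows roughly what preparing such an API call could look like in Python. It only builds the request and does not send it. The endpoint URL, the `dataSetId` body field, and the header names are assumptions that are not confirmed by this guide, so verify them against the profile system jobs API endpoint guide before use. The `build_profile_delete_request` function and all credential values are hypothetical placeholders.

```python
import json
from urllib.request import Request

# ASSUMPTION: endpoint URL. Verify it against the profile system jobs
# API endpoint guide before use.
SYSTEM_JOBS_URL = "https://platform.adobe.io/data/core/ups/system/jobs"


def build_profile_delete_request(dataset_id, access_token, api_key,
                                 org_id, sandbox_name):
    """Build, but do not send, a request that targets the Profile store only."""
    # ASSUMPTION: the body identifies the dataset with a "dataSetId" field.
    body = json.dumps({"dataSetId": dataset_id}).encode("utf-8")
    # ASSUMPTION: header names follow common Platform API conventions.
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox_name,
        "Content-Type": "application/json",
    }
    return Request(SYSTEM_JOBS_URL, data=body, headers=headers, method="POST")


# Placeholder values only; replace them with your own credentials and IDs.
request = build_profile_delete_request(
    "DATASET_ID", "ACCESS_TOKEN", "API_KEY", "ORG_ID", "SANDBOX_NAME")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it possible to inspect the URL and body before anything is deleted.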
Monitor data ingestion
In the Experience Platform UI, select Monitoring in the left navigation. The Monitoring dashboard lets you view the statuses of inbound data from either batch or streaming ingestion. To view the statuses of individual batches, select either Batch end-to-end or Streaming end-to-end. The dashboards list all batch or streaming ingestion runs, including those that are successful, failed, or still in progress. Each listing provides details of the batch, including the batch ID, the name of the target dataset, and the number of records ingested. If the target dataset is enabled for Profile, the number of ingested identity and profile records is also displayed.
You can select an individual batch ID to access the Batch overview dashboard and see details for the batch, including error logs if the batch failed to ingest.
If you wish to delete the batch, select Delete batch near the top right of the dashboard. Deleting a batch also removes its records from the dataset that the batch was originally ingested to.
Next steps
This user guide provided instructions for performing common actions when working with datasets in the Experience Platform user interface. For steps on performing common Platform workflows involving datasets, please refer to the following tutorials: