Guardrails for Data Ingestion

Guardrails are thresholds that provide guidance for data and system usage, performance optimization, and avoidance of errors or unexpected results in Adobe Experience Platform. Guardrails can refer to your usage or consumption of data and processing in relation to your licensing entitlements.

This document provides guidance on guardrails for data ingestion in Adobe Experience Platform.

Guardrails for batch ingestion

The following table outlines guardrails to consider when using the batch ingestion API or sources:

Type of ingestion
Guidelines
Notes
Data lake ingestion using the batch ingestion API
  • You can ingest up to 20 GB of data per hour to data lake using the batch ingestion API.
  • The maximum number of files per batch is 1500.
  • The maximum batch size is 100 GB.
  • The maximum number of properties or fields per row is 10000.
  • The maximum number of batches per minute, per user is 138.
Data lake ingestion using batch sources
  • You can ingest up to 200 GB of data per hour to data lake using batch ingestion sources such as Azure Blob, Amazon S3, and SFTP.
  • A batch size should be between 256 MB and 100 GB. This applies to both uncompressed and compressed data. Compressed data is still subject to these limits once it is decompressed in the data lake.
  • The maximum number of files per batch is 1500.
  • The minimum size of a file or folder is 1 byte. You cannot ingest zero-byte files or folders.
Read the sources overview for a catalog of sources you can use for data ingestion.
Batch ingestion to Profile
  • The maximum size of a record class is 100 KB (soft).
  • The maximum size of an ExperienceEvent class is 10 KB (soft).
  • The maximum size of a single record is 1 MB.
Number of Profile or ExperienceEvent batches ingested per day
The maximum number of Profile or ExperienceEvent batches ingested per day is 90. This means that the combined total of Profile and ExperienceEvent batches ingested each day should not exceed 90. Ingesting additional batches will affect system performance.
This is a soft limit. It is possible to go beyond a soft limit, but soft limits provide a recommended guideline for system performance.
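
The batch source guardrails above can be checked on the client side before a batch is submitted. The following is a minimal sketch, not part of any Experience Platform SDK: the function names `check_source_batch` and `exceeds_daily_soft_limit` are hypothetical, and it assumes binary units (1 MB = 1024 × 1024 bytes), which the guardrails do not specify.

```python
# Assumption: binary units (1 MB = 1024**2 bytes). The guardrails above do not
# state whether sizes are decimal or binary, so adjust these if needed.
MB = 1024 ** 2
GB = 1024 ** 3

MIN_BATCH_BYTES = 256 * MB        # batch sources: minimum batch size
MAX_BATCH_BYTES = 100 * GB        # batch sources: maximum batch size
MAX_FILES_PER_BATCH = 1500        # maximum number of files per batch
MAX_PROFILE_BATCHES_PER_DAY = 90  # soft limit, Profile + ExperienceEvent combined


def check_source_batch(file_sizes):
    """Return guardrail violations for one batch as a list of messages.

    file_sizes is a list of file sizes in bytes (hypothetical input shape).
    An empty list means no violation was found.
    """
    problems = []
    if len(file_sizes) > MAX_FILES_PER_BATCH:
        problems.append(
            f"{len(file_sizes)} files exceeds the limit of {MAX_FILES_PER_BATCH}"
        )
    if any(size == 0 for size in file_sizes):
        problems.append("batch contains a 0-byte file")
    total = sum(file_sizes)
    if total < MIN_BATCH_BYTES:
        problems.append("batch is smaller than 256 MB")
    elif total > MAX_BATCH_BYTES:
        problems.append("batch is larger than 100 GB")
    return problems


def exceeds_daily_soft_limit(profile_batches, event_batches):
    """True if the combined daily batch count is above the soft limit of 90."""
    return profile_batches + event_batches > MAX_PROFILE_BATCHES_PER_DAY
```

Because the daily batch count is a soft limit, a result of `True` from `exceeds_daily_soft_limit` is better treated as a warning than as a hard failure.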

Guardrails for streaming ingestion

Read the streaming ingestion overview for information on guardrails for streaming ingestion.

Guardrails for streaming sources

The following table outlines guardrails to consider when using streaming sources:

Type of ingestion
Guidelines
Notes
Streaming sources
  • The maximum record size is 1 MB, with the recommended size being 10 KB.
  • Streaming sources support between 4000 and 5000 requests per second when ingesting to the data lake. This applies to both newly created and existing source connections. Note: It can take up to 30 minutes for streaming data to be completely processed to data lake.
  • Streaming sources support a maximum of 1500 requests per second when ingesting data to profile or streaming segmentation.
Streaming sources such as Kafka, Azure Event Hubs, and Amazon Kinesis do not use the Data Collection Core Service (DCCS) route and can have different throughput limits. See the sources overview for a catalog of sources you can use for data ingestion.
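
The record size and throughput guardrails above can be sanity-checked before sending data. The following is a minimal sketch under stated assumptions: the helper names `classify_record` and `min_send_seconds` are hypothetical, record size is measured on the UTF-8 JSON encoding of the record, sizes use binary units, and each request carries one record. None of these assumptions come from the guardrails themselves.

```python
import json

# Assumption: binary units (1 MB = 1024**2 bytes, 1 KB = 1024 bytes).
MAX_RECORD_BYTES = 1024 ** 2          # maximum record size: 1 MB
RECOMMENDED_RECORD_BYTES = 10 * 1024  # recommended record size: 10 KB
MAX_PROFILE_RPS = 1500                # requests per second to profile


def classify_record(record):
    """Classify a record by the size of its UTF-8 JSON encoding.

    Assumption: the guardrail applies to the serialized payload size.
    """
    size = len(json.dumps(record).encode("utf-8"))
    if size > MAX_RECORD_BYTES:
        return "too_large"
    if size > RECOMMENDED_RECORD_BYTES:
        return "above_recommended"
    return "ok"


def min_send_seconds(record_count, max_rps=MAX_PROFILE_RPS):
    """Shortest time needed to send record_count requests without exceeding
    max_rps, assuming one record per request."""
    return record_count / max_rps
```

For example, under these assumptions, sending 3000 records to profile at no more than 1500 requests per second takes at least `min_send_seconds(3000)` seconds, which is 2.0.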

Next steps

See the following documentation for guardrails on other Experience Platform services, end-to-end latency information, and licensing information from the Real-Time CDP Product Description documents:
