Guardrails for Data Ingestion
Guardrails are thresholds that provide guidance for data and system usage, performance optimization, and avoidance of errors or unexpected results in Adobe Experience Platform. Guardrails can refer to your usage or consumption of data and processing in relation to your licensing entitlements.
This document provides guidance on guardrails for data ingestion in Adobe Experience Platform.
Guardrails for batch ingestion
The following list outlines guardrails to consider when using the batch ingestion API or batch sources:
- You can ingest up to 20 GB of data per hour to data lake using the batch ingestion API.
- The maximum number of files per batch is 1500.
- The maximum batch size is 100 GB.
- The maximum number of properties or fields per row is 10000.
- The maximum number of batches per minute, per user is 138.
- You can ingest up to 200 GB of data per hour to data lake using batch ingestion sources such as Azure Blob, Amazon S3, and SFTP.
- A batch should be between 256 MB and 100 GB in size. This applies to both uncompressed and compressed data: when compressed data is uncompressed in the data lake, the same limits apply to its uncompressed size.
- The maximum number of files per batch is 1500.
- The minimum size of a file or folder is 1 byte. You cannot ingest 0-byte files or folders.
- The maximum size of a record class is 100 KB (soft).
- The maximum size of an ExperienceEvent class is 10 KB (soft).
- The maximum size of a single record is 1 MB.
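A pre-flight check on the client side can catch some of these limits before a batch is submitted. The following is a minimal sketch for the batch sources limits listed above (batch size, file count, and 0-byte files), not an official Experience Platform utility. The function name `check_batch` is hypothetical, and the use of binary units (1 MB = 1024 × 1024 bytes) is an assumption, since this document does not specify how MB and GB are measured.

```python
# Assumption: binary units. The document does not say whether MB/GB are
# decimal or binary, so adjust these constants if needed.
MB = 1024 ** 2
GB = 1024 ** 3

MIN_BATCH_BYTES = 256 * MB      # batch sources: minimum batch size
MAX_BATCH_BYTES = 100 * GB      # batch sources: maximum batch size
MAX_FILES_PER_BATCH = 1500      # maximum number of files per batch


def check_batch(file_sizes):
    """Return a list of guardrail violations for one batch.

    file_sizes is a list of file sizes in bytes (hypothetical input shape).
    An empty list as the result means no violation was detected.
    """
    problems = []

    if len(file_sizes) > MAX_FILES_PER_BATCH:
        problems.append(
            f"too many files: {len(file_sizes)} > {MAX_FILES_PER_BATCH}"
        )

    if any(size == 0 for size in file_sizes):
        problems.append("batch contains a 0-byte file")

    total = sum(file_sizes)
    if total < MIN_BATCH_BYTES:
        problems.append(f"batch too small: {total} bytes < {MIN_BATCH_BYTES}")
    elif total > MAX_BATCH_BYTES:
        problems.append(f"batch too large: {total} bytes > {MAX_BATCH_BYTES}")

    return problems
```

For example, `check_batch([300 * MB])` returns an empty list, while a batch containing a 0-byte file is reported even if its total size is within range. For compressed files, pass the uncompressed sizes, because the size limits apply once the data is uncompressed in the data lake.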
Guardrails for streaming ingestion
Read the streaming ingestion overview for information on guardrails for streaming ingestion.
Guardrails for streaming sources
The following list outlines guardrails to consider when using streaming sources:
- The maximum record size is 1 MB, with the recommended size being 10 KB.
- Streaming sources support between 4000 and 5000 requests per second when ingesting to the data lake. This applies to both newly created and existing source connections. Note: It can take up to 30 minutes for streaming data to be completely processed to the data lake.
- Streaming sources support a maximum of 1500 requests per second when ingesting data to profile or streaming segmentation.
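The record size and throughput figures above can be turned into simple client-side checks. The sketch below is illustrative only: the helper names `classify_record_size` and `min_seconds_for` are hypothetical, measuring a record as its UTF-8 encoded JSON serialization is an assumption (this document does not define how record size is measured), and 1 KB is assumed to be 1024 bytes.

```python
import json

# Assumption: binary units (1 KB = 1024 bytes).
KB = 1024
MB = 1024 ** 2

RECOMMENDED_RECORD_BYTES = 10 * KB   # recommended record size
MAX_RECORD_BYTES = 1 * MB            # maximum record size


def classify_record_size(record):
    """Classify a record against the streaming sources size guardrails.

    Size is measured as the UTF-8 length of the JSON serialization,
    which is an assumption made for this sketch.
    """
    size = len(json.dumps(record).encode("utf-8"))
    if size > MAX_RECORD_BYTES:
        return "too_large"
    if size > RECOMMENDED_RECORD_BYTES:
        return "over_recommended"
    return "ok"


def min_seconds_for(n_requests, max_requests_per_second):
    """Lower bound on the time needed to send n_requests at a given rate."""
    return n_requests / max_requests_per_second
```

For example, at the 1500 requests per second limit for profile or streaming segmentation, `min_seconds_for(90000, 1500)` gives 60.0, so sending 90000 requests takes at least one minute.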
Next steps
See the following documentation for more information on guardrails for other Experience Platform services, end-to-end latency, and licensing details from the Real-Time CDP Product Description documents:
- Real-Time CDP guardrails
- End-to-end latency diagrams for various Experience Platform services
- Real-Time Customer Data Platform (B2C Edition - Prime and Ultimate Packages)
- Real-Time Customer Data Platform (B2P - Prime and Ultimate Packages)
- Real-Time Customer Data Platform (B2B - Prime and Ultimate Packages)