Substring component settings
Substring component settings let you apply several string manipulation methods to obtain the desired dimension items in reports.
Substring is available only on dimensions, and is retroactive to the data it is applied to. It is an immediate data transformation that happens before filtering or other analysis operations are applied.
From the Left/Right
Take part of a string based on its position relative to the beginning or end of the string. The From the Left and From the Right methods each provide two drop-down lists: From (where the output starts) and To (where the output ends).
- String Start: The start of the string.
- String End: The end of the string.
- Position: A static number of characters from the left or right, depending on the method.
- String: Match a character or sequence of characters to indicate the beginning or end of a string. This drop-down list also reveals additional options:
  - Match: The string to match. If the input has no match with this field, No value options apply.
  - Index: The Match criteria can be present multiple times in a string. This integer determines which match starts or ends the output, depending on the method. For example, an index of 1 represents the first match. If the index is higher than the number of matches available, No value options apply.
  - Include String: A checkbox that includes the Match string in the output if enabled.
- Length: An integer that specifies the character count to include after the starting position of the output. Only available under the To drop-down list.
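The String and Length options above can be approximated outside the product with a short Python sketch. It is only one interpretation of the From the Left method: the function names `find_nth` and `from_left` are hypothetical, `None` stands in for the No value options, and counting overlapping matches is an assumption rather than documented behavior.

```python
from typing import Optional


def find_nth(value: str, match: str, index: int) -> int:
    """Return the offset of the index-th (1-based) occurrence of match, or -1."""
    if not match or index < 1:
        return -1
    pos = -1
    for _ in range(index):
        # Continue searching one character past the previous hit.
        pos = value.find(match, pos + 1)
        if pos == -1:
            return -1
    return pos


def from_left(value: str, match: str, index: int = 1,
              include_string: bool = False,
              length: Optional[int] = None) -> Optional[str]:
    """Hypothetical analogue of From = String, To = String End or Length."""
    start = find_nth(value, match, index)
    if start == -1:
        return None  # stands in for "No value options apply"
    if not include_string:
        start += len(match)  # skip the matched text itself
    end = len(value) if length is None else start + length
    return value[start:end]


print(from_left("category:shoes:running", ":", index=1))            # shoes:running
print(from_left("category:shoes:running", ":", index=1, length=5))  # shoes
```

Passing an index larger than the number of matches returns `None`, mirroring the rule that No value options apply in that case.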
Delimiter
Use this method for fields that use a delimiter to separate multiple string values. You can either extract an individual element to use as the output, or convert the string into an object array schema element.
- Criterion: How you want to treat the delimited list of values.
  - From the Left: Start from the beginning of the delimited list and count forward.
  - From the Right: Start from the end of the delimited list and count backward.
  - Convert to array: Treat this dimension as if it is an object array schema element.
- Delimiter: The delimiter that the field uses.
- Index: Only present if the criterion is From the Left/Right. The element number as if it was in an array. For example, if the string input is "Fox,Turtle,Rabbit,Wolf" with an index of 3, the output is "Rabbit". If the index is higher than the number of delimited elements, No value options apply.
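As a rough illustration of the Delimiter method, the following Python sketch splits a string and picks an element by its 1-based index. The function name `delimited_element` is hypothetical, `None` stands in for the No value options, and the assumption that From the Right with an index of 1 returns the last element is an interpretation of "count backward", not confirmed behavior.

```python
from typing import Optional


def delimited_element(value: str, delimiter: str, index: int,
                      from_right: bool = False) -> Optional[str]:
    """Return the index-th (1-based) element of a delimited string."""
    if not delimiter:
        return None  # str.split rejects an empty separator
    parts = value.split(delimiter)
    if index < 1 or index > len(parts):
        return None  # stands in for "No value options apply"
    # Counting from the right mirrors the list: 1 is the last element.
    return parts[-index] if from_right else parts[index - 1]


animals = "Fox,Turtle,Rabbit,Wolf"
print(delimited_element(animals, ",", 3))                   # Rabbit
print(delimited_element(animals, ",", 1, from_right=True))  # Wolf
print(animals.split(","))  # comparable to the Convert to array criterion
```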
URL parse
For use with fields that contain URLs. Using the example URL https://example.com/store/index.html?cid=campaign#cart, the following options are available:
- Get protocol: Get the URL’s protocol. For example, "https://".
- Get host: Get the URL’s host. For example, "example.com".
- Get path: Get the URL’s path. For example, "store/index.html".
- Get query string value: Get the value of a single query string parameter. Place the desired query string parameter in the Query key field. If the above URL is used with the "cid" query key, the output is "campaign".
- Get hash value: Get the URL’s hash value. For example, "cart".
If the input is not a valid URL or if the desired URL component is not present, No value options apply.
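To preview what each URL parse option would return for a given value, a sketch using Python's standard `urllib.parse` module can help. The function name `parse_url` and the dictionary keys are hypothetical, and the output is shaped to match the examples above (protocol with "://", path without a leading slash). What the product treats as a valid URL may differ from this simple scheme-and-host check.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def parse_url(url: str, query_key: Optional[str] = None) -> Optional[dict]:
    """Split a URL into the components named by the URL parse options."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return None  # malformed input, stands in for "No value"
    if not parts.scheme or not parts.netloc:
        return None  # assumption: a valid URL needs a scheme and a host
    result = {
        "protocol": parts.scheme + "://",
        "host": parts.hostname,
        "path": parts.path.lstrip("/"),
        "hash": parts.fragment or None,
    }
    if query_key is not None:
        values = parse_qs(parts.query).get(query_key)
        # A missing key maps to None, standing in for "No value".
        result["query_value"] = values[0] if values else None
    return result


url = "https://example.com/store/index.html?cid=campaign#cart"
print(parse_url(url, query_key="cid"))
```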
Trim
Trim white space or special characters from the string.
- Trim whitespaces: A checkbox that removes all whitespace at the beginning and end of the string if enabled.
- Trim special characters: A checkbox that reveals a Special characters input field if enabled. All characters in this field are stripped from the output. Multi-byte characters are not supported.
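The two Trim checkboxes can be mimicked with Python's `str.strip`. This sketch assumes that special characters are removed only from the beginning and end of the string, as the word "trim" suggests; the product may treat them differently. The function name `trim` and its parameters are hypothetical.

```python
import string


def trim(value: str, trim_whitespace: bool = True,
         special_characters: str = "") -> str:
    """Strip whitespace and/or the listed characters from both ends."""
    chars = special_characters
    if trim_whitespace:
        # string.whitespace covers ASCII whitespace only.
        chars += string.whitespace
    return value.strip(chars) if chars else value


print(trim("  ##sale## ", special_characters="#"))  # sale
```

Combining both character sets in a single `strip` call removes whitespace and special characters at the ends regardless of the order in which they appear.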
Regex
Apply regular expressions to a dimension to retrieve the desired value.
- Regex: The regular expression formula.
- Output format: An optional field that lets you add text or reorder the regex subgroup output. If this field is blank, the string output is the evaluated regular expression.
- Case sensitive: A checkbox that forces the regular expression to be case-sensitive if enabled.
Customer Journey Analytics uses a subset of the Perl regex syntax. If the input does not match the regular expression and the Output format is blank, No value options apply. The following expressions are supported:
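- a: A single character a.
- a|b: A single character a or b.
- [abc]: A single character a, b, or c.
- [^abc]: Any single character except a, b, or c.
- [a-z]: Any single character in the range a-z.
- [a-zA-Z0-9]: Any single character in the range a-z, A-Z, or digits 0-9.
- ^: The beginning of the line.
- $: The end of the line.
- \A: The start of the string.
- \z: The end of the string.
- .: Any character.
- \s: Any whitespace character.
- \S: Any non-whitespace character.
- \d: Any digit.
- \D: Any non-digit.
- \w: Any word character.
- \W: Any non-word character.
- \b: A word boundary.
- \B: A non-word boundary.
- \<: The start of a word.
- \>: The end of a word.
- (...): A capturing group that contains everything enclosed.
- (?:...): A non-capturing group.
- a?: Zero or one of a.
- a*: Zero or more of a.
- a+: One or more of a.
- a{3}: Exactly 3 of a.
- a{3,}: 3 or more of a.
- a{3,6}: Between 3 and 6 of a.
Output placeholders are also supported. You can use these sequences in the Output format any number of times and in any order to achieve the desired string output.
- $&: Outputs what matched the whole expression.
- $n: Outputs what matched the nth sub-expression. For example, $1 outputs the first sub-expression.
- $`: Outputs the text that precedes the current match.
- $+: Outputs what matched the last marked sub-expression in the regular expression.
- $$: Outputs the literal character "$".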
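To experiment with a regular expression and an Output format before configuring them, a Python sketch such as the following can serve as an approximation. It uses Python's `re` module, which is not the product's regex engine and differs from the subset above in places (for example, it has no \< or \> word anchors). The function name `apply_regex` is hypothetical, only the $&, $n, and $$ placeholders are handled, the Case sensitive checkbox is assumed to be off by default, and `None` stands in for the No value options.

```python
import re
from typing import Optional


def apply_regex(value: str, pattern: str, output_format: str = "",
                case_sensitive: bool = False) -> Optional[str]:
    """Match pattern against value and build the output string."""
    flags = 0 if case_sensitive else re.IGNORECASE
    match = re.search(pattern, value, flags)
    if match is None:
        # Simplification: no match yields None even when a format is set.
        return None
    if not output_format:
        return match.group(0)  # blank format: return the whole match

    def expand(token: "re.Match[str]") -> str:
        name = token.group(1)
        if name == "$":
            return "$"              # $$ is a literal dollar sign
        if name == "&":
            return match.group(0)   # $& is the whole match
        number = int(name)
        if number > match.re.groups:
            return ""               # unknown group: emit nothing
        return match.group(number) or ""

    # Replace each supported placeholder in the Output format.
    return re.sub(r"\$(\$|&|\d+)", expand, output_format)


print(apply_regex("SKU-12345-blue", r"(\w+)-(\d+)-(\w+)", "$3/$2"))  # blue/12345
```

Because the placeholders are expanded by a callback, they can appear any number of times and in any order in the format string, which matches the behavior described for the Output format field.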