escape – Escape Function

Syntax


... escape( val = text format = format ) ...

Effect

This function returns the content of the character string in text, with certain special characters replaced by escape characters according to the rule specified in format.

The possible values of format are defined as constants with the prefix "E_" in the class CL_ABAP_FORMAT. Each value defines which special characters are replaced, and how. There are rules for special characters in markup languages (XML and HTML), in URIs and URLs, in JSON, and in regular expressions and string templates. Protection against Cross Site Scripting (XSS) attacks on Web applications also plays an important part.

format expects a data object of the type i. An invalid value for format raises an exception of the class CX_SY_STRG_PAR_VAL. For all characters with codes between x00 and xFF, the program DEMO_ESCAPE demonstrates the effect of all associated formats from the class CL_ABAP_FORMAT. The top row of its output contains the names of the constants from the class CL_ABAP_FORMAT without the prefix "E_". The other rows show the effect on the characters specified in the first two columns.

This function can be specified in general and character-like expression positions. The return value has the type string.
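As a minimal sketch of the syntax above, the following report escapes a text for HTML. The report name zdemo_escape_basic and the sample text are assumptions made for illustration, and CL_DEMO_OUTPUT is used only as a convenient way to show a result in a demonstration.

```abap
REPORT zdemo_escape_basic.

" escape returns a string, so the result can be declared inline.
DATA(html) = escape( val    = `<b>Fish & Chips</b>`
                     format = cl_abap_format=>e_html_text ).

" According to the rules for E_HTML_TEXT, html now contains
" &lt;b&gt;Fish &amp; Chips&lt;/b&gt;
cl_demo_output=>display( html ).
```

Because the function can be used in expression positions, the call could also be embedded directly in another expression, for example in a string template.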

Rules for Markup Languages (Including JavaScript)

The program DEMO_ESCAPE_MARKUP demonstrates the escape rules for markup languages. Formats with "_JS" in their name are intended for content with JavaScript components. The following table summarizes the escape rules:

| Format | & | < | > | " | ' | TAB | LF | CR | BS | FF | \ | ctrl-char |
|----|----|----|----|----|----|----|----|----|----|----|----|----|
| E_XML_TEXT | &amp; | &lt; | - | - | - | - | - | - | - | - | - | - |
| E_XML_ATTR | &amp; | &lt; | - | &quot; | &apos; | &#9; | &#xA; | &#xD; | - | - | - | - |
| E_XML_ATTR_SQ | &amp; | &lt; | - | - | &apos; | &#9; | &#xA; | &#xD; | - | - | - | - |
| E_HTML_TEXT | &amp; | &lt; | &gt; | - | - | - | - | - | - | - | - | - |
| E_HTML_ATTR | &amp; | &lt; | &gt; | &quot; | &#39; | - | - | - | - | - | - | - |
| E_HTML_ATTR_DQ | &amp; | &lt; | &gt; | &quot; | - | - | - | - | - | - | - | - |
| E_HTML_ATTR_SQ | &amp; | &lt; | &gt; | - | &#39; | - | - | - | - | - | - | - |
| E_HTML_JS | - | - | - | \" | \' | \t | \n | \r | \b | \f | \\ | \xhh |
| E_HTML_JS_HTML | &amp; | &lt; | &gt; | &quot; | &#39; | \t | \n | \r | \b | \f | \\ | \xhh |

The first column contains the names of the formats from the class CL_ABAP_FORMAT. The other columns show the escape characters that replace the special characters in the first row. None of the other characters are affected. TAB, LF, CR, BS, and FF are the control characters for Tabulator, Line Feed, Carriage Return, Backspace, and Form Feed, to which the codes x09, x0A, x0D, x08, and x0C are assigned in 7-bit ASCII. ctrl-char represents all control characters with codes less than x20 that are not covered by those shown here. Some of these can be converted to \xhh, where "hh" is the hexadecimal value of the code. A hyphen (-) in a field means that the special character is not affected.
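The difference between the attribute formats can be sketched with a value that contains both kinds of quotation marks. The report name and the sample value are assumptions for illustration. The results noted in the comments follow from the table above.

```abap
REPORT zdemo_escape_attr.

DATA(input) = `Tom's "quote"`.

" E_XML_ATTR: Tom&apos;s &quot;quote&quot;
DATA(xml_attr) = escape( val = input format = cl_abap_format=>e_xml_attr ).

" E_HTML_ATTR: Tom&#39;s &quot;quote&quot;
DATA(html_attr) = escape( val = input format = cl_abap_format=>e_html_attr ).

" E_HTML_JS: Tom\'s \"quote\"
DATA(html_js) = escape( val = input format = cl_abap_format=>e_html_js ).

cl_demo_output=>write( xml_attr ).
cl_demo_output=>write( html_attr ).
cl_demo_output=>write( html_js ).
cl_demo_output=>display( ).
```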

Rules for URL/URIs

The program DEMO_ESCAPE_URL_URI demonstrates the escape rules for URLs and URIs. All characters with codes between x00 and x7F are converted to %hh, where hh is the hexadecimal value of the code, except for the characters listed in the following table.

| Format | Unconverted Characters |
|----|----|
| E_URL | [0-9], [a-z], [A-Z], !, $, ', (, ), *, +, ,, -, ., _, &, /, :, ;, =, ?, @ |
| E_URL_FULL | [0-9], [a-z], [A-Z], !, $, ', (, ), *, +, ,, -, ., _ |
| E_URI | [0-9], [a-z], [A-Z], !, $, ', (, ), *, +, ,, -, ., _, &, /, :, ;, =, ?, @, ~, #, [, ] |
| E_URI_FULL | [0-9], [a-z], [A-Z], -, ., _, ~ |

All characters with codes of x80 or higher are converted to their UTF-8 representation. Depending on the character, one to four bytes are represented in the form %hh, where hh is the hexadecimal value of a byte.
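The following sketch contrasts E_URL with E_URL_FULL. The report name and the sample text are assumptions for illustration. E_URL suits a text that already has URL structure, whereas E_URL_FULL also converts the structural characters and is therefore a plausible choice for a single parameter value.

```abap
REPORT zdemo_escape_url.

DATA(query) = `search?q=fish and chips`.

" E_URL leaves ? and = unconverted; only the blanks are converted:
" search?q=fish%20and%20chips
DATA(url) = escape( val = query format = cl_abap_format=>e_url ).

" E_URL_FULL additionally converts ? and = to their %hh form.
DATA(url_full) = escape( val = query format = cl_abap_format=>e_url_full ).

cl_demo_output=>write( url ).
cl_demo_output=>write( url_full ).
cl_demo_output=>display( ).
```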

Rules for JSON

The program DEMO_ESCAPE_JSON demonstrates the escape rules of the format E_JSON_STRING for JSON. The special characters " and \ are prefixed with the escape character \. Control characters with the codes x08, x09, x0A, x0C, and x0D are escaped using \b, \t, \n, \f, and \r respectively. All other codes less than x20 are converted to a four-character hexadecimal representation and prefixed by \u. None of the other characters are affected.
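A minimal sketch of how E_JSON_STRING might be used when a JSON string is assembled by hand. The report name, the member name msg, and the sample text are assumptions for illustration.

```abap
REPORT zdemo_escape_json.

" In a string template, \n denotes a line feed (x0A).
DATA(text) = |He said "hi"\n|.

" The quotation marks are prefixed with \ and the line feed becomes \n.
DATA(escaped) = escape( val = text format = cl_abap_format=>e_json_string ).

" \{ and \} are literal curly brackets in a string template.
" json contains {"msg":"He said \"hi\"\n"}
DATA(json) = |\{"msg":"{ escaped }"\}|.

cl_demo_output=>display( json ).
```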

Rules for Regular Expressions

The program DEMO_ESCAPE_REGEX demonstrates the escape rules of the format E_REGEX for regular expressions. The special characters of regular expressions are prefixed by the associated escape character \. Control characters with the codes x08, x09, x0A, x0B, x0C, and x0D are escaped using \b, \t, \n, \v, \f, and \r respectively.
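E_REGEX is useful when a text from outside the program is to be searched for literally. The following sketch assumes a hypothetical report name and sample texts. It uses the predicate function contains with its regex parameter.

```abap
REPORT zdemo_escape_regex.

DATA(text) = `What is 1+1?`.
DATA(term) = `1+1`.

" Unescaped, + is a quantifier: the pattern 1+1 would match 11 or 111,
" but not the literal text 1+1.
DATA(pattern) = escape( val = term format = cl_abap_format=>e_regex ).

IF contains( val = text regex = pattern ).
  cl_demo_output=>display( `Literal match found` ).
ELSE.
  cl_demo_output=>display( `No match` ).
ENDIF.
```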

Rules for String Templates

The program DEMO_ESCAPE_STRING_TEMPLATE demonstrates the escape rules of the format E_STRING_TPL for string templates. The special characters of string templates (|, \, {, }) are prefixed by the associated escape character \. Control characters with the codes x09, x0A, and x0D are replaced by \t, \n, and \r respectively.
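This format is mainly of interest when the text of a string template is itself generated, for example in program generation. A minimal sketch, with a hypothetical report name and sample text:

```abap
REPORT zdemo_escape_string_tpl.

DATA(source_text) = `{a|b}`.

" The special characters {, | and } are each prefixed with \:
" escaped contains \{a\|b\}
DATA(escaped) = escape( val    = source_text
                        format = cl_abap_format=>e_string_tpl ).

cl_demo_output=>display( escaped ).
```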

Rules for Cross Site Scripting

The program DEMO_ESCAPE_XSS demonstrates the escape rules of the formats E_XSS_..., which help to prevent Cross Site Scripting (XSS) attacks on Web applications. Rules exist for XML/HTML content, JavaScript content, Cascading Style Sheets (CSS), and URL content.

The rules for XSS include all the rules for the individual formats, plus some extra rules. They differ markedly from the rules for markup languages, including JavaScript (see above). These extended rules are designed to protect ABAP programs from Cross Site Scripting when content can be constructed from non-secure sources. The transformations listed above are replaced or modified as follows:

  • Markup languages: Format E_XSS_ML. All characters (except [0-9], [a-z], [A-Z], ,, -, ., _, and control characters) are transformed to &#xhh; or &#xhhhh;, where hh or hhhh is the hexadecimal value of the code. All control characters are transformed to &#xfffd;.
  • JavaScript: Format E_XSS_JS. All characters (except [0-9], [a-z], [A-Z], ,, ., and _) are transformed to \xhh or \uhhhh, where hh or hhhh is the hexadecimal value of the code.
  • URL/URIs: Format E_XSS_URL. All characters (except [0-9], [a-z], [A-Z], *, -, ., and _) are transformed to %hh, where hh is the hexadecimal value of the code. All characters with codes from x80 are converted to their UTF-8 representation. Depending on the character, one to four bytes are represented in the form %hh, where hh is the hexadecimal value of a byte.
  • CSS: Format E_XSS_CSS. All characters (except [0-9], [a-z], and [A-Z]) are transformed to \hh or \hhhh, where hh or hhhh is the hexadecimal value of the code. A blank is inserted after hh or hhhh if the following character is a valid hexadecimal digit.

If the format from the class CL_ABAP_FORMAT has the additional ending "_NU", all characters with codes greater than xFF are converted to a four-character hexadecimal representation, with varying marking depending on the type of the content.
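The following sketch shows how a value from a non-secure source might be escaped for each of the four contexts listed above. The report name, the variable names, and the sample input are assumptions for illustration. The appropriate format depends on where the value is inserted in the generated page.

```abap
REPORT zdemo_escape_xss.

" Stands in for a value received from a non-secure source.
DATA(user_input) = `<script>alert('x')</script>`.

" One escaped variant for each context in which the value might appear.
DATA(for_markup) = escape( val = user_input format = cl_abap_format=>e_xss_ml ).
DATA(for_js)     = escape( val = user_input format = cl_abap_format=>e_xss_js ).
DATA(for_url)    = escape( val = user_input format = cl_abap_format=>e_xss_url ).
DATA(for_css)    = escape( val = user_input format = cl_abap_format=>e_xss_css ).

" Only the variant that matches the context is inserted, here HTML content.
DATA(html) = |<p>{ for_markup }</p>|.

cl_demo_output=>display( html ).
```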


Notes

  • The class CL_ABAP_DYN_PRG contains methods ESCAPE_XSS_... that wrap calls of the predefined function escape with the formats E_XSS_.... It is generally recommended to use the predefined function directly.

  • escape used with the rules for XSS is recommended as protection against Cross Site Scripting, but might not be secure enough in all cases. For example, it may be best to check an unsafe URL against a whitelist, so that phishing attacks can be detected as well as XSS. To rule out code injection, never generate JavaScript dynamically from unsafe sources.

Exceptions


Catchable Exceptions

CX_SY_CONVERSION_CODEPAGE_EX

  • Cause: A character cannot be converted in a conversion to UTF-8. This can only occur with characters from the surrogate area. The position and code of the character are listed in the exception object.
    Runtime Error: CONVT_CHARACTER

CX_SY_STRG_PAR_VAL

  • Cause: Invalid value in format.
    Runtime Error: STRG_ILLEGAL_PAR
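
A minimal sketch of handling an invalid format value. The report name is hypothetical, and the value -1 is an assumption: it is presumed not to match any of the E_... constants of CL_ABAP_FORMAT.

```abap
REPORT zdemo_escape_exception.

" Assumption: -1 is not among the E_... constants of CL_ABAP_FORMAT.
DATA(invalid_format) = -1.

TRY.
    DATA(result) = escape( val = `text` format = invalid_format ).
    cl_demo_output=>display( result ).
  CATCH cx_sy_strg_par_val INTO DATA(exc).
    " Without this handler, the runtime error STRG_ILLEGAL_PAR would occur.
    cl_demo_output=>display( exc->get_text( ) ).
ENDTRY.
```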