String literals represent sequences of ASCII characters. A literal is
delimited by single quotes, double quotes, or a CDATA section.
Unicode characters are not supported.
String ::= NullTermString | SmallString | CDATAString
There are two types of string literals: null-terminated string
literals and non-null-terminated string literals.
Null-terminated String Literals
NullTermStringPortion ::= ([#32-#33] | [#35-#91] | [#93-#126] |
EscapedCharacter)*
NullTermString ::= (NullTermStringPortion S?)+
The literal is composed of all characters listed in consecutive chunks of double-quote-bounded
character sequences; a trailing null byte is appended
automatically.
In general, a single double-quote-bounded character sequence is
sufficient, but more can be used if a long string will not fit on a
single line. Only whitespace can occur between consecutive chunks for
the entire series to be considered a single string; preprocessor
directives, remarks, and punctuators are not allowed between chunks.
A null-terminated string literal cannot contain more than 512
characters.
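The chunk-joining rule above can be sketched in Python. This is an illustration of the described behavior, not the BAR compiler's implementation: the function name `join_null_term_chunks` is hypothetical, escape sequences are left undecoded, and counting the 512-character limit before the null byte is added is an assumption.

```python
import re

MAX_NULL_TERM_CHARS = 512  # limit stated in the text; counting rule is assumed

# One double-quote-bounded chunk: plain characters or backslash-escaped ones.
_CHUNK = re.compile(r'"((?:[^"\\]|\\.)*)"', re.DOTALL)


def join_null_term_chunks(source: str) -> bytes:
    """Concatenate adjacent double-quoted chunks and append a null byte.

    Escape sequences are NOT decoded here; only chunk joining is shown.
    """
    pos = 0
    parts = []
    while True:
        # Only whitespace may appear between consecutive chunks.
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if pos == len(source):
            break
        match = _CHUNK.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected text between chunks at offset {pos}")
        parts.append(match.group(1))
        pos = match.end()
    if not parts:
        raise ValueError("no double-quoted chunk found")
    text = "".join(parts)
    if len(text) > MAX_NULL_TERM_CHARS:
        raise ValueError("literal exceeds 512 characters")
    # Non-ASCII input raises UnicodeEncodeError, matching the ASCII-only rule.
    return text.encode("ascii") + b"\x00"
```

For example, two chunks split across lines, `"Hello, "` followed by `"world"`, yield the bytes of `Hello, world` plus one trailing null byte, while `"a" + "b"` is rejected because `+` is not whitespace.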
Non-null-terminated String Literals (Small Strings)
SmallString ::= ([#32-#38] | [#40-#91] | [#93-#126] | EscapedCharacter)*
The literal is composed of all characters listed in a single-quote-bounded
character sequence. A null byte is not appended to the string
automatically.
A non-null-terminated string literal cannot contain more than
10 characters. Because the string is often used to identify
numerical representations of characters, which require only 1-8
characters, this category of string literal is sometimes called a
small string.
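As a contrast with null-terminated literals, the following Python sketch checks a small string body against the unescaped character ranges of the SmallString production and the 10-character limit, then emits it with no trailing null byte. The function names are hypothetical, and escape sequences are deliberately not handled here.

```python
MAX_SMALL_CHARS = 10  # limit stated in the text


def is_plain_small_char(ch: str) -> bool:
    """True for [#32-#38] | [#40-#91] | [#93-#126]: printable ASCII
    except the single quote (39) and the backslash (92)."""
    code = ord(ch)
    return 32 <= code <= 126 and code not in (39, 92)


def encode_small_string(body: str) -> bytes:
    """Encode the text found between the single quotes.

    Assumes the body contains no escape sequences. No null byte is added.
    """
    for ch in body:
        if not is_plain_small_char(ch):
            raise ValueError(f"character {ord(ch)} is not allowed unescaped")
    if len(body) > MAX_SMALL_CHARS:
        raise ValueError("small string exceeds 10 characters")
    return body.encode("ascii")
```

Here `encode_small_string("ABCD")` returns exactly four bytes, whereas the same text as a null-terminated literal would occupy five.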
Escaped Characters
EscapedCharacter ::= \ (a | b | t | n | v | f | r | " | '
| \ | (x HexDigit+) | ([0-9]+) | (#13? #10))
There is often a need to fill string literals with unprintable characters
or characters that cause problems in compilation if they appear
verbatim. For this reason, escaped characters
are supported by the BAR compiler to allow any ASCII character to be a part of
a string literal.
The following are recognized escaped characters:
- \a is character 7 (bell).
- \b is character 8 (backspace).
- \t is character 9 (horizontal tab).
- \n is character 10 (newline or line feed).
- \v is character 11 (vertical tab).
- \f is character 12 (form feed).
- \r is character 13 (carriage return).
- \" is character 34 (double-quote character).
- \' is character 39 (single-quote character).
- \\ is character 92 (backslash character).
- \x(nnn) is a custom character specified by ASCII code. The
  value of nnn is a hexadecimal number representing the code.
- \(nnn) is a custom character specified by ASCII code. The
  value of nnn is an octal number representing the code.
- \(line break) denotes that the string continues at the
  first non-whitespace character after the line break.
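The escape rules listed above can be sketched as a small decoder in Python. This is a minimal illustration under several assumptions, not the BAR compiler's algorithm: hexadecimal and octal digits are consumed greedily (following the HexDigit+ and [0-9]+ productions), codes above 127 are rejected as non-ASCII, and only spaces and tabs are skipped after a line continuation. The names `decode_escapes` and `_check_code` are hypothetical.

```python
import string

# Single-letter escapes and their character codes, as listed above.
SIMPLE_ESCAPES = {
    "a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
    '"': 34, "'": 39, "\\": 92,
}


def _check_code(code: int) -> int:
    # Assumption: only ASCII codes (0-127) are valid.
    if code > 127:
        raise ValueError(f"character code {code} is outside ASCII")
    return code


def decode_escapes(body: str) -> bytes:
    """Decode the text between a literal's delimiters into raw bytes."""
    out = bytearray()
    i, n = 0, len(body)
    while i < n:
        ch = body[i]
        if ch != "\\":
            out.append(_check_code(ord(ch)))
            i += 1
            continue
        i += 1  # move past the backslash
        if i >= n:
            raise ValueError("dangling backslash at end of literal")
        esc = body[i]
        if esc in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[esc])
            i += 1
        elif esc == "x":
            j = i + 1
            while j < n and body[j] in string.hexdigits:
                j += 1
            if j == i + 1:
                raise ValueError("\\x requires at least one hex digit")
            out.append(_check_code(int(body[i + 1:j], 16)))
            i = j
        elif esc in string.digits:
            j = i
            while j < n and body[j] in string.digits:
                j += 1
            # int(..., 8) raises ValueError if the run contains 8 or 9.
            out.append(_check_code(int(body[i:j], 8)))
            i = j
        elif esc == "\n" or body.startswith("\r\n", i):
            # Line continuation: drop the break and leading blanks after it.
            i += 2 if esc == "\r" else 1
            while i < n and body[i] in " \t":
                i += 1
        else:
            raise ValueError(f"unrecognized escape \\{esc}")
    return bytes(out)
```

Under these assumptions, `\x41\102` decodes to the two bytes `AB`. Because digits are consumed greedily, `\x41B` is read as the single code 0x41B and rejected; a chunk break is one way to end the digit run.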
CDATA Sections
CDATAString ::= <![CDATA[ ^]]>* ]]>
A CDATA section, borrowed from SGML syntax, represents a string in
which nothing between the start and end of the declaration is treated
as markup: everything, including tabs and line breaks, is taken
verbatim. The only character combination not allowed inside a CDATA
section is the closing three-character combination, ']]>'.
Use CDATA sections to define long multi-line strings. CDATA sections are
ideal for long blocks of documentation. With CDATA sections, it is
possible to define a string up to 16,383 characters long.
The compiler places a null byte as the last character of the string (CDATA
sections are always null-terminated).
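A CDATA section can be read with a simple search for the closing combination, as the following Python sketch shows. It assumes that no escape processing happens inside the section (in line with "verbatim" above) and that the 16,383-character limit is counted before the null byte is appended; `read_cdata` is a hypothetical name.

```python
from typing import Tuple

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"
MAX_CDATA_CHARS = 16383  # limit stated in the text; counting rule is assumed


def read_cdata(source: str, start: int = 0) -> Tuple[bytes, int]:
    """Read one CDATA section beginning at `start`.

    Returns the null-terminated bytes and the offset just past ']]>'.
    """
    if not source.startswith(CDATA_OPEN, start):
        raise ValueError("expected '<![CDATA['")
    body_start = start + len(CDATA_OPEN)
    # The first ']]>' always ends the section, so it cannot occur inside.
    body_end = source.find(CDATA_CLOSE, body_start)
    if body_end == -1:
        raise ValueError("unterminated CDATA section")
    body = source[body_start:body_end]
    if len(body) > MAX_CDATA_CHARS:
        raise ValueError("CDATA section exceeds 16,383 characters")
    # Everything is kept verbatim, including tabs, line breaks, backslashes.
    return body.encode("ascii") + b"\x00", body_end + len(CDATA_CLOSE)
```

A backslash followed by `n` inside the section stays as two characters in the result, unlike in a quoted literal.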