Commit ce9fac6

Reason V4 Feature [String Template Literals]
Summary: This diff implements string template literals. Test Plan: Reviewers: CC:
1 parent 8047d1d commit ce9fac6

17 files changed: +978 -49 lines changed

docs/TEMPLATE_LITERALS.md

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@

Contributors: Lexing and Parsing String Templates:
===================================================
Supporting string templates requires coordination between the lexer, parser, and
printer. The lexer (as always) creates a token stream, but when it encounters a
backtick, it enters a special lexing mode that collects the (mostly) raw text
until it hits either a closing backtick or a `${`. A `${` opens what is called
an "interpolation region": the lexer temporarily resumes the "regular" lexing
approach instead of collecting raw text, until it hits the balancing `}`. At
that point it re-enters "raw text" mode and continues until the next `${` or
the closing backtick.

- Parsing of raw text regions and regular tokenizing: Handled by
  `reason_declarative_lexer.ml`.
- Token balancing: Handled by `reason_lexer.ml`.

The output of lexing becomes tokens streamed into the parser, and the parser
`reason_parser.mly` turns those tokens into AST expressions.

## Lexing:

String templates are opened by:
- A backtick.
- Followed by any whitespace character (newline, or space/tab).

String templates are closed by:
- Any whitespace character (newline, or space/tab).
- Followed by a backtick.

```reason
let x = ` hi this is my string template `
let x = `
  The newline counts as a whitespace character both for opening and closing.
`
```

Within the string template literal, there may be regions of non-string
"interpolation" where expressions are lexed/parsed.

```reason
let x = ` hi this is my ${expressionHere() ++ "!"} template `
```

Template strings are lexed into tokens, some of which carry a string
"payload" holding a portion of the string content.
The opening backtick, closing backtick, and `${` characters do not become
tokens fed to the parser, and are not included in the text payload of
any token. The right brace `}` closing an interpolation region `${` _does_
become a token that is fed to the parser. Three tokens are
produced when lexing string templates:

- `STRING_TEMPLATE_TERMINATED(string)`: A string region that is terminated with
  a closing backtick. It may be the entire string template contents if there are
  no interpolation regions `${}`, or it may be the final string segment after
  an interpolation region `${}`, as long as it is the closing of the entire
  template.
- `STRING_TEMPLATE_SEGMENT_LBRACE(string)`: A string region occurring _before_
  an interpolation region `${`. The `string` payload of this token is the
  contents up until (but not including) the next `${`.
- `RBRACE`: A `}` character that terminates an interpolation region that
  started with `${`.
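
The token rules above can be illustrated with a toy lexer in OCaml. This is only a sketch under simplifying assumptions, not the code in `reason_declarative_lexer.ml`: the interpolated expression is kept as one hypothetical `EXPR` token instead of being lexed into regular Reason tokens, brace balancing is a plain depth counter, and escapes, nested templates, and braces inside string literals are not handled.

```ocaml
(* STRING_TEMPLATE_* and RBRACE mirror the token names in this document.
   EXPR is a hypothetical stand-in for the regular tokens that the real
   lexer would emit inside an interpolation region. *)
type token =
  | STRING_TEMPLATE_SEGMENT_LBRACE of string
  | STRING_TEMPLATE_TERMINATED of string
  | EXPR of string
  | RBRACE

(* [lex_template s] assumes [s] starts with a backtick followed by one
   whitespace character and ends with a closing backtick. *)
let lex_template (s : string) : token list =
  let n = String.length s in
  (* Raw-text mode: collect characters from [start] until "${" or "`". *)
  let rec raw start i acc =
    if i >= n then failwith "unterminated string template"
    else if s.[i] = '`' then
      List.rev
        (STRING_TEMPLATE_TERMINATED (String.sub s start (i - start)) :: acc)
    else if s.[i] = '$' && i + 1 < n && s.[i + 1] = '{' then (
      let seg =
        STRING_TEMPLATE_SEGMENT_LBRACE (String.sub s start (i - start))
      in
      interp (i + 2) (i + 2) 0 (seg :: acc))
    else raw start (i + 1) acc
  (* Interpolation mode: scan forward to the balancing "}". *)
  and interp start i depth acc =
    if i >= n then failwith "unterminated interpolation region"
    else
      match s.[i] with
      | '{' -> interp start (i + 1) (depth + 1) acc
      | '}' when depth = 0 ->
        let e = EXPR (String.sub s start (i - start)) in
        raw (i + 1) (i + 1) (RBRACE :: e :: acc)
      | '}' -> interp start (i + 1) (depth - 1) acc
      | _ -> interp start (i + 1) depth acc
  in
  if n < 2 || s.[0] <> '`' then failwith "expected opening backtick"
  else
    (* Skip the backtick and the single whitespace character after it:
       neither ends up in any token payload. *)
    raw 2 2 []
```

Note that the whitespace before the closing backtick stays in the final payload, matching the description above; removing it is left to the parsing stage.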

Simple example:

      STRING_TEMPLATE_TERMINATED
      |                          |
    ` lorem ipsum lorem ipsum bla `
    ^                             ^
    |                             |
    |                             The closing backtick also doesn't show up in the token
    |                             stream, but the last white space is part of the lexed
    |                             STRING_TEMPLATE_TERMINATED token
    |                             (it is used to compute indentation, but is stripped from
    |                             the string constant, or re-inserted by refmt if not present)
    |
    The backtick doesn't show up anywhere in the token stream. The first
    single white space after the backtick is also not part of the lexed tokens.

Multiline example:

    All of this leading line whitespace remains part of the tokens' payloads,
    but it is normalized and stripped when the parser converts the tokens
    into string expressions.
    |
    |      This newline is not part of any token
    |      |
    |      v
    |     `
    +---->  lorem ipsum lorem
            ipsum bla
          `
      ^
      |
      All of this white space on the final line is part of the token as well.


For interpolation, the token `STRING_TEMPLATE_SEGMENT_LBRACE` represents the
string contents (minus the single first white space after the backtick), up to
the `${`. As with non-interpolated string templates:

- The opening and closing backticks do not show up in the token stream.
- The first white space character after the opening backtick is not included
  in the lexed string contents.
- The final white space character before the closing backtick *is* part of the
  lexed string token (to compute indentation).
- That final white space character, along with leading line whitespace, is
  stripped from the string expression when the parsing stage converts lexed
  tokens to AST string expressions.

    ` lorem ipsum lorem ipsum bla${expression}lorem ipsum lorem ip lorem `
      |                         |            ||                         |
      STRING_TEMPLATE_SEGMENT_LBRACE         |STRING_TEMPLATE_TERMINATED
                                             RBRACE

## Parsing:

The string template tokens are turned into normal AST expressions.
`STRING_TEMPLATE_SEGMENT_LBRACE` and `STRING_TEMPLATE_TERMINATED` lexed tokens
contain all of the string contents, plus leading line whitespace for each
line, including the final whitespace before the closing backtick. The parser
normalizes these by stripping that leading whitespace, including the two
additional spaces used for nice indentation, before turning them into some
combination of string constants with a special attribute on the AST, or string
concats with a special attribute on the concat AST node.
122+
```reason
123+
124+
// This:
125+
let x = `
126+
Hello there
127+
`;
128+
// Becomes:
129+
let x = [@reason.template] "Hello there";
130+
131+
// This:
132+
let x = `
133+
${expr} Hello there
134+
`;
135+
// Becomes:
136+
let x = [@reason.template] (expr ++ [@reason.template] "Hello there");
137+
138+
```
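
The shape of the resulting expressions can be modeled with a small OCaml sketch. Everything here is hypothetical: the `expr` and `piece` types are a miniature stand-in for the real AST, `Template` models the `[@reason.template]` attribute, and the left-nested concatenation is chosen for simplicity and may not match the nesting the actual parser produces for more than one interpolation.

```ocaml
(* Hypothetical miniature AST, just enough to show the result shape. *)
type expr =
  | Str of string (* a plain string constant *)
  | Ident of string (* stand-in for any interpolated expression *)
  | Concat of expr * expr (* lhs ++ rhs *)
  | Template of expr (* the [@reason.template] attribute on a node *)

(* One piece of a template after lexing and whitespace normalization. *)
type piece =
  | Text of string
  | Interp of expr

let of_pieces (pieces : piece list) : expr =
  let to_expr = function
    | Text s -> Template (Str s)
    | Interp e -> e
  in
  (* Empty text segments contribute nothing to the concatenation. *)
  let keep = function Text "" -> false | _ -> true in
  match List.filter keep pieces with
  | [] -> Template (Str "")
  | [ Text s ] -> Template (Str s)
  | first :: rest ->
    let concat acc p = Concat (acc, to_expr p) in
    Template (List.fold_left concat (to_expr first) rest)
```

In this model a template with no interpolation becomes a single attributed constant, while a template that starts with an interpolation becomes an attributed concat whose text operands carry the attribute themselves.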

User Documentation:
===================
> This section is the user documentation for string template literals, which
> will be published to the [official Reason Syntax
> documentation](https://reasonml.github.io/) when

TODO
Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
[@reason.version 3.7];
/**
 * Comments:
 */

let addTwo = (a, b) => string_of_int(a + b);
let singleLineConstant = `
  Single line template
`;
let singleLineInterpolate = `
  Single line ${addTwo(1, 2)}!
`;

let multiLineConstant = `
  Multi line template
  Multi %a{x, y}line template
  Multi line template
  Multi line template
`;

let printTwo = (a, b) => {
  print_string(a);
  print_string(b);
};

let templteWithAttribute =
  [@attrHere]
  `
    Passing line template
    Passing line template
    Passing line template
    Passing line template
  `;

let result =
  print_string(
    `
      Passing line template
      Passing line template
      Passing line template
      Passing line template
    `,
  );

let resultPrintTwo =
  printTwo(
    "short one",
    `
      Passing line template
      Passing line template
      Passing line template
      Passing line template
    `,
  );

let hasBackSlashes = `
  One not escaped: \
  Three not escaped: \ \ \
  Two not escaped: \\
  Two not escaped: \\\
  One not escaped slash, and one escaped tick: \\`
  Two not escaped slashes, and one escaped tick: \\\`
  Two not escaped slashes, and one escaped dollar-brace: \\\${
  One not escaped slash, then a close tick: \
`;

let singleLineInterpolateWithEscapeTick = `
  Single \`line ${addTwo(1, 2)}!
`;

let singleLineConstantWithEscapeDollar = `
  Single \${line template
`;

// The backslash here is a backslash literal.
let singleLineInterpolateWithBackslashThenDollar = `
  Single \$line ${addTwo(2, 3)}!
`;

let beforeExpressionCommentInNonLetty = `
  Before expression comment in non-letty interpolation:
  ${/* Comment */ string_of_int(1 + 2)}
`;

let beforeExpressionCommentInNonLetty2 = `
  Same thing but with comment on own line:
  ${
    /* Comment */
    string_of_int(10 + 8)
  }
`;
module StringIndentationWorksInModuleIndentation = {
  let beforeExpressionCommentInNonLetty2 = `
    Same thing but with comment on own line:
    ${
      /* Comment */
      string_of_int(10 + 8)
    }
  `;
};

let beforeExpressionCommentInNonLetty3 = `
  Same thing but with text after final brace on same line:
  ${
    /* Comment */
    string_of_int(20 + 1000)
  }TextAfterBrace
`;

let beforeExpressionCommentInNonLetty3 = `
  Same thing but with text after final brace on next line:
  ${
    /* Comment */
    string_of_int(100)
  }
  TextAfterBrace
`;

let x = 0;
let commentInLetSequence = `
  Comment in letty interpolation:
  ${
    /* Comment */
    let x = 200 + 49;
    string_of_int(x);
  }
`;

let commentInLetSequence2 = `
  Same but with text after final brace on same line:
  ${
    /* Comment */
    let x = 200 + 49;
    string_of_int(x);
  }TextAfterBrace
`;

let commentInLetSequence3 = `
  Same but with text after final brace on next line:
  ${
    /* Comment */
    let x = 200 + 49;
    string_of_int(x);
  }
  TextAfterBrace
`;

let reallyCompicatedNested = `
  Comment in non-letty interpolation:

  ${
    /* Comment on first line of interpolation region */

    let y = (a, b) => a + b;
    let x = 0 + y(0, 2);
    // Nested string templates
    let s = `
      asdf${addTwo(0, 0)}
      alskdjflakdsjf
    `;
    s ++ s;
  }same line as brace with one space
  and some more text at the footer no newline
`;

let reallyLongIdent = "!";
let backToBackInterpolations = `
  Two interpolations side by side:
  ${addTwo(0, 0)}${addTwo(0, 0)}
  Two interpolations side by side with leading and trailing:
  Before${addTwo(0, 0)}${addTwo(0, 0)}After

  Two interpolations side by side second one should break:
  Before${addTwo(0, 0)}${
    reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
  }After

  Three interpolations side by side:
  Before${addTwo(0, 0)}${
    reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
    ++ reallyLongIdent
  }${
    ""
  }After
`;
