Add Update here-string syntax RFC #343

Open
wants to merge 3 commits into
base: master

MartinGC94

No description provided.

@MartinGC94 MartinGC94 marked this pull request as draft January 23, 2023 23:07
@JamesWTruher JamesWTruher added the WG-Language parser, language semantics label Feb 8, 2023
@MartinGC94 MartinGC94 marked this pull request as ready for review March 24, 2023 07:33
@JamesWTruher
Contributor

JamesWTruher commented Aug 10, 2023

The Language-WG met to discuss this on 8/10/23 and had the following observations and conclusions:

  • we agreed that there is substantial risk in altering the behavior of our current HereString and aren't willing to change the current HereString syntax
  • we agreed that there is substantial value in supporting a new syntax which allows indentation of the HereString to be handled more gracefully.
  • we felt that supporting a single-line HereString is not an overwhelming value and are concerned that any implementation may lead to an increased support burden.
  • we agreed that an appropriate syntax to designate the new behavior should be @""
    • we believe it may have reduced implementation cost as the tokenizer is already recognizing @" and this would be a special case of what we already recognize
    • we don't preclude the addition of supporting a single line HereString in the future

Specifics:

  • the behavior should match that of our current HereString with regard to single and double quotes - e.g. @'' vs @"" being constant and expandable respectively.
  • the starting token (@"" or @'') must be the last non-whitespace token of a line
  • the ending token (""@ or ''@) must be the first non-whitespace token of a line
  • the column offset of the ending token (""@) will be used to determine how much whitespace at the beginning of each line to trim.
  • if that whitespace does not exist in a line, a parse error should be generated (similar to the error that C# generates)
  • the starting and ending token lines are not part of the HereString
  • while the arguments for @@' were very well put together and persuasive we ultimately still prefer @'' for understandability
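The trimming rule in the list above can be sketched in a few lines. This is only an illustration in Python, not the PowerShell tokenizer: `trim_here_string` and `HereStringParseError` are hypothetical names, and the sketch applies the rule strictly, so an empty line also counts as missing the whitespace (the list above does not address that case).

```python
class HereStringParseError(ValueError):
    """Hypothetical stand-in for the parse error the tokenizer would report."""


def trim_here_string(body_lines, footer_line):
    """Remove the footer's leading whitespace from every body line.

    body_lines: the lines between the starting and ending token lines
                (neither token line is part of the string).
    footer_line: the line holding the ending token, e.g. '    ""@'.
    """
    # The whitespace in front of the ending token is the prefix to remove.
    indent = footer_line[: len(footer_line) - len(footer_line.lstrip(" \t"))]
    trimmed = []
    for number, line in enumerate(body_lines, start=1):
        if not line.startswith(indent):
            # Strict reading of the rule: the whitespace must be present.
            raise HereStringParseError(
                f"line {number} is indented less than the ending token"
            )
        trimmed.append(line[len(indent):])
    return "\n".join(trimmed)
```

Indentation beyond the ending token's column is kept, so a body line indented by six spaces against a four-space footer keeps two leading spaces.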

We look forward to your updates.

@MartinGC94
Author

@JamesWTruher (and WG) thanks for the review. I have a few questions. First, regarding:

the starting token (@"" or @'') must be the last token of a line

This is different from the current @' syntax, which allows whitespace characters after the header (the whitespace chars are not included in the string value though). Are you sure you want this slight difference between the old and new syntax?

if that whitespace does not exist in a line, a parse error should be generated (similar to the error that C# generates)

What about empty lines like line2 here:

    @''
    Line1

    Line3
    ''@

Should they also have that whitespace, or can they be left completely empty? Editors like VS Code will not indent empty lines, and some tools automatically remove trailing whitespace, so I think the UX will suffer if we make the whitespace mandatory for empty lines.

@JamesWTruher
Contributor

@JamesWTruher (and WG) thanks for the review. I have a few questions. First, regarding:

the starting token (@"" or @'') must be the last token of a line

This is different from the current @' syntax, which allows whitespace characters after the header (the whitespace chars are not included in the string value though). Are you sure you want this slight difference between the old and new syntax?

Sorry, I've updated that - it should be the last non-whitespace token; any spaces which follow are ignored.

As an aside, the current error message is curious, yes?

> @"           a     
ParserError: 
Line |
   1 |  @"           a
     |               ~
     | No characters are allowed after a here-string header but before the end of the line.

Strictly speaking, " " is a character.

In any event, we're not interested in introducing any differences with the starting token.

if that whitespace does not exist in a line, a parse error should be generated (similar to the error that C# generates)

What about empty lines like line2 here:

    @''
    Line1

    Line3
    ''@

Should they also have that whitespace, or can they be left completely empty? Editors like VS Code will not indent empty lines, and some tools automatically remove trailing whitespace, so I think the UX will suffer if we make the whitespace mandatory for empty lines.

With regard to empty lines - we didn't discuss it, but it's a good question. I expect the scanner should just ignore empty lines. We were more worried about the behavior of:

$a = @"
        line 1
  line 2
        line 3
        "@

C# emits a syntax error in this case, and we didn't think that we should try to be different here. I'm not sure what C# does in the empty line case; we should probably follow that lead unless there's a good reason for us to be different.

@MartinGC94
Author

@JamesWTruher I've updated the RFC to remove the single line here-string references and to use the multiple quotes syntax rather than the @ symbols. I also made the specification more precise.
Also, I tested the C# string literal behavior for empty lines and for lines that consist only of whitespace. Any such line with no more whitespace than the "footer" line is treated as a line break, so I will do the same for the PS version.
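
The blank-line behavior described above can be sketched as follows. This is a Python illustration under stated assumptions, not the RFC's implementation: `trim_line` and `trim_body` are hypothetical names, `indent` is the whitespace in front of the footer token, and the handling of whitespace-only lines that are longer than the footer's indentation (strip the prefix, keep the rest) is an assumption that the comment above does not spell out.

```python
def trim_line(line, indent):
    """Return one body line with `indent` removed, or raise ValueError."""
    if line.strip(" \t") == "" and len(line) <= len(indent):
        # Empty line, or whitespace-only line no longer than the footer's
        # indentation: it contributes nothing but a line break.
        return ""
    if not line.startswith(indent):
        # A line with content must carry the footer's full indentation.
        raise ValueError("line is indented less than the footer token")
    # Assumption: any whitespace beyond the footer's indentation is preserved.
    return line[len(indent):]


def trim_body(body_lines, indent):
    """Join the trimmed body lines with line breaks."""
    return "\n".join(trim_line(line, indent) for line in body_lines)
```

With this rule, the earlier example with a completely empty second line between `Line1` and `Line3` parses without error, which keeps editors that strip trailing whitespace from breaking the string.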

Labels
WG-Language parser, language semantics
3 participants