Why Quotes Not Multiline By Default
#1
Why don't code editors and compilers allow multi-line strings delimited by just the normal quotes?
Code:
// usually allowed
"multiline\n\
string"

"multiline\n"+
"string"

// not allowed
"multiline
string"
In addition to fewer redundant characters to type, it becomes easier for a lexer to detect an unpaired quote in a file.

I'm working on a simple scripting/markup language parser, and I am going to allow strings to extend to the next line. I am worried there is something I'm overlooking.
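A minimal sketch of how a lexer could accept strings that run past the end of a line. The function name `scan_string` and its interface are hypothetical, and escape sequences are kept verbatim rather than decoded:

```python
def scan_string(source, start):
    """Scan a double-quoted string that may span several lines.

    `start` is the index of the opening quote. Returns (value, end_index),
    where end_index is the position just after the closing quote.
    """
    assert source[start] == '"'
    # Remember where the string began so an error can point at it.
    start_line = source.count("\n", 0, start) + 1
    chars = []
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch == '"':
            return "".join(chars), i + 1
        if ch == "\\" and i + 1 < len(source):
            # Keep the escape sequence as written; decoding is out of scope here.
            chars.append(source[i:i + 2])
            i += 2
            continue
        chars.append(ch)  # a newline is an ordinary character in this sketch
        i += 1
    raise SyntaxError(f"unterminated string starting on line {start_line}")
```

With this scheme a missing closing quote is only noticed at the end of the input, which is why the error reports the line where the string started.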
A-Engine: A new beat em up game engine inspired by LF2. Coming soon

A-Engine Dev Blog - Update #8: Timeout

#2
Well, in C/C++ it is possible to do it by:

1. Adding a backslash before the newline.
E.g.:
Code:
#include <stdio.h>

int main(void) {
    const char *string = "hello \
world!";
    printf("%s\n", string);
    return 0;
}
However, any whitespace at the start of the continuation line (for example, indentation) will be included in the string.

2. A better approach that works only for string literals: write them next to each other and the compiler concatenates them.
E.g.:
Code:
#include <stdio.h>

int main(void) {
    const char *string = "hello "
                         "world!";
    printf("%s\n", string);
    return 0;
}

In a #define, you also need a trailing '\' so that the macro definition continues onto the next line; the two adjacent literals are then concatenated as before:
Code:
#include <stdio.h>
#define string "hello " \
               "world!"

int main(void) {
    printf("%s\n", string);
    return 0;
}

A few reasons I would have for not allowing multi-line string literals:
  1. Whitespace will be included, just like in the backslash example above, which is a bad approach. Programmers could easily overlook this, and they would have to break their indentation to get the intended string.
  2. Newlines should be added explicitly (with \n, for example), because different OSs and text editors represent a line break differently (e.g. \r\n on Windows, \n on Linux and modern macOS, \r on classic Mac OS). With a literal line break, you cannot tell from the source which character(s) actually end up in the string. If you press Enter in Windows Notepad, you get \r\n.
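Both points can be seen by looking at the raw characters between the quotes. This is only an illustration: `literal_body` is a hypothetical helper, and the three "files" are made-up source texts.

```python
def literal_body(source):
    """Return the raw text between the first and the last double quote."""
    return source[source.index('"') + 1 : source.rindex('"')]

# The same two-line literal saved with different line-ending conventions.
unix_file = 'msg = "multiline\nstring"'
windows_file = 'msg = "multiline\r\nstring"'
# The same literal again, with the second line indented to line up with the code.
indented_file = 'msg = "multiline\n      string"'

for name, src in [("unix", unix_file),
                  ("windows", windows_file),
                  ("indented", indented_file)]:
    body = literal_body(src)
    # repr() makes the otherwise invisible \r, \n and spaces visible.
    print(name, len(body), repr(body))
```

The three bodies differ in length even though they look like the same string in an editor.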
#3
Hey, thanks for taking the time to answer!
I am aware of these methods of writing multiline strings (I didn't know about the \r\n/\n/\r stuff though, so thanks for clarifying that), but I am not asking how to do it. I am asking about allowing this in a markup/scripting language (I am not sure which it would be considered) that I am working on, since I couldn't think of a reason not to.

The \r\n/\n/\r inconsistency problem in different OSs can be solved with a simple regex.

One point you brought up that might be a reason to avoid it is that it is easy to overlook that indentation becomes part of the string.
#4
MangaD's explanation is quite detailed, but I would like to add that C++11 has multiline string literals, in the form of raw string literals:
    C++-Code:
#include <iostream>

int main() {
    auto a = R"(Hello
world!)";
    std::cout << a << std::endl;
}

Other languages support similar features too, so there is precedent for this.

When it comes to implementing it in a new language I would discard indentation like the following:
    C++-Code:
auto a = "Hello
           world!";

would result in "Hello world!", because the second line has a single space past the column of the opening quote on the first line, and only that space lies beyond the string's indentation.

When it comes to tabs I would count them as a single space, and not bother further since if used correctly they will not cause any issues:
    C++-Code:
tab tab auto a = "Hello
tab tab            world!";

If a person were to use tabs beyond block indentation then he/she is a monster, and his/her code is beyond redemption (except for running it through clang-format).
If you want you can make sure that spaces and tabs match up, and give an error if they don't.
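A rough sketch of this rule, under my reading of the example: every column up to and including the opening quote counts as indentation, a tab counts as one column, and the line break itself is dropped (that is how "Hello world!" comes out). The function name and the error check are hypothetical.

```python
def read_indented_string(lines):
    """Join a quoted string spread over `lines`, discarding indentation.

    The first line holds the opening quote, the last line the closing one.
    Every character, tab or space, counts as exactly one column.
    """
    quote_col = lines[0].index('"')
    indent_width = quote_col + 1  # columns 0..quote_col are indentation
    parts = [lines[0][indent_width:]]
    for line in lines[1:]:
        prefix = line[:indent_width]
        if prefix.strip(" \t"):
            # Text starts to the left of the string's own column.
            raise SyntaxError("continuation line starts inside the indentation")
        parts.append(line[indent_width:])
    text = "".join(parts)  # assumption: the line break is not kept
    return text[: text.rindex('"')]

print(read_indented_string(['auto a = "Hello',
                            '           world!";']))
```

The same function handles the tab-indented variant, since each leading tab shifts both lines by one column.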

(12-30-2015, 06:27 PM)A-Man Wrote:  The \r\n/\n/\r inconsistency problem in different OSs can be solved with a simple regex.
Please do not use a regex for that; it will be slow in comparison to a simple find and erase.
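A regex-free version could look something like this. It is a find-and-replace variant rather than a pure erase, so that a lone \r is also treated as a line break; the function name and the choice of \n as the target are assumptions.

```python
def normalize_newlines(text):
    """Convert \\r\\n and lone \\r line endings to \\n using plain replaces."""
    # Order matters: handle "\r\n" first so its "\r" is not converted separately.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

Running this once over the source before lexing means the string scanner only ever sees \n.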
#5
Check [SO] Ruby Heredoc for how the Ruby guys manage indentation. I've done the same at work, but yeah, at first glance MangaD's explanation contains the main argument for why it's probably disallowed in most languages.
#6
(12-30-2015, 08:29 PM)Someone else Wrote:  When it comes to implementing it in a new language I would discard indentation like the following:
Code:
auto a = "Hello
           world!";
would result in "Hello world!", because the second line has a single space past the column of the opening quote on the first line, and only that space lies beyond the string's indentation.

When it comes to tabs I would count them as a single space, and not bother further since if used correctly they will not cause any issues:
Code:
tab tab auto a = "Hello
tab tab            world!";
If a person were to use tabs beyond block indentation then he/she is a monster, and his/her code is beyond redemption (except for running it through clang-format).
If you want you can make sure that spaces and tabs match up, and give an error if they don't.
Oh, that's a good idea. But it'd be a hassle for people whose editor doesn't auto-indent when they press Return. What do you think of generalizing the case and stripping the leading spaces/tabs common to all lines, strictly without equating a tab to 1 or 4 spaces or whatever?
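Something like the following is what I have in mind, as a sketch only. The shared prefix is compared character for character, so a tab never matches any number of spaces. The function name is made up, and blank lines are ignored when computing the prefix, which is my own assumption.

```python
import os

def strip_common_indent(lines):
    """Remove the leading whitespace that all non-blank lines share exactly."""
    def leading_ws(line):
        return line[: len(line) - len(line.lstrip(" \t"))]

    prefixes = [leading_ws(line) for line in lines if line.strip()]
    # commonprefix compares character by character, so "\t" != "    ".
    common = os.path.commonprefix(prefixes)
    return [line[len(common):] if line.strip() else "" for line in lines]
```

If one line is indented with a tab and another with spaces, the common prefix is empty and nothing is stripped, which could also be reported as an error.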


Quote:Please do not use a regex for that; it will be slow in comparison to a simple find and erase.
Right. I learnt about them not long ago, and I've got this bad habit of trying to force myself to use new stuff I learn >.<

(12-30-2015, 08:36 PM)Azriel Wrote:  Check [SO] Ruby Heredoc for how Ruby guys manage indentation. I've done the same at work, but ya MangaD's explanation contains the main argument why it's probably disallowed in most languages at first glance.
Nice! I am not sure about using a custom delimiter, as I do not expect things to get that ugly in what I am working on, and I do not want to scare people off with extended syntax. Thanks. If I need to, I will probably go with a less common delimiter, like Python's triple-quote approach:
Code:
spam = '''multi line
string
end.'''

I appreciate your time everyone.