23

Unicode has maybe 50 spaces

[\u0009\u000A-\u000D\u0020\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]

and 6 line breaks

not only CRLF, LF, CR, but also NEL (U+0085), PS (U+2029) and LS (U+2028).
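For what it's worth, one way to cope when the built-in notion of whitespace differs from the list above is to spell the characters out explicitly. The following is only a minimal sketch in Python: the character class is the one from the question, and the helper names `normalize_newlines` and `collapse_spaces` are made up for illustration.

```python
import re

# Character class taken from the question, with "+" so runs collapse together.
UNICODE_SPACE = re.compile(
    r"[\u0009\u000A-\u000D\u0020\u0085\u00A0\u1680\u180E"
    r"\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]+"
)

# The six line breaks named in the question. CRLF comes first so it is
# consumed as a single break rather than as CR followed by LF.
LINE_BREAK = re.compile(r"\r\n|[\n\r\u0085\u2028\u2029]")


def normalize_newlines(text: str) -> str:
    """Replace every listed line break with a plain LF (hypothetical helper)."""
    return LINE_BREAK.sub("\n", text)


def collapse_spaces(text: str) -> str:
    """Replace each run of listed spaces with one U+0020 and trim the ends."""
    return UNICODE_SPACE.sub(" ", text).strip(" ")
```

Normalizing line breaks first and only then collapsing spaces keeps the line structure intact, because `collapse_spaces` treats the line breaks as ordinary whitespace.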

Maybe I could understand most of the spaces and PS ("Paragraph separator"), but what are "Next Line" and "Line separator" good for?

It all looks like it was invented by a very big committee where everybody wanted their own space and the leaders were granted one line break each. But seriously, how do you deal with it when your programming language doesn't support it (or gets it wrong, as Java does, for example)?

maaartinus
  • 2,713

1 Answer

17

Maybe I could understand most of the spaces and PS ("Paragraph separator"), but what are "Next Line" and "Line separator" good for?

NEXT LINE (U+0085) is often used as the newline character on EBCDIC systems (as 0x15). It's like CR+LF, but as one character.
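The EBCDIC mapping can be checked from Python, assuming the standard library's `cp037` codec (one common EBCDIC code page, chosen here only as an example) maps byte 0x15 the way described. The helper `ebcdic_lines` is hypothetical.

```python
# Sketch: decode the EBCDIC newline byte with Python's cp037 codec.
EBCDIC_NL = b"\x15"

decoded = EBCDIC_NL.decode("cp037")
print(f"U+{ord(decoded):04X}")  # the code point the codec assigns to 0x15


def ebcdic_lines(data: bytes) -> list[str]:
    """Decode cp037 bytes and split on NEXT LINE (hypothetical helper)."""
    return data.decode("cp037").split("\u0085")
```

Other EBCDIC code pages may map their newline bytes differently, so it is worth checking the specific code page before relying on this.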

LINE SEPARATOR (U+2028) and PARAGRAPH SEPARATOR (U+2029) are explained in section 5.8 of the Unicode standard, which describes them as a plain-text version of HTML <br> and <p>, to disambiguate these functions of "newline". But in practice, these characters don't get used much.
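To make the `<br>` / `<p>` analogy concrete, here is a small sketch that turns plain text using these two characters into HTML. The function name `to_html` and the exact markup it produces are assumptions for illustration, not anything the standard prescribes.

```python
import html

LS = "\u2028"  # LINE SEPARATOR: break the line, stay in the same paragraph
PS = "\u2029"  # PARAGRAPH SEPARATOR: end the current paragraph


def to_html(text: str) -> str:
    """Map PS to paragraph boundaries and LS to <br> (hypothetical helper)."""
    # Skip empty paragraphs, e.g. the one after a trailing PS.
    paragraphs = [p for p in text.split(PS) if p]
    return "".join(
        "<p>" + "<br>".join(html.escape(line) for line in p.split(LS)) + "</p>"
        for p in paragraphs
    )
```

A plain LF or CRLF carries neither meaning unambiguously, which is the gap these two characters were meant to fill.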

dan04
  • 3,957