
Java, by default, uses UTF-16 to represent characters in the String data type.

I inherited a JavaFX project which currently has some Strings in UTF-8 and others in UTF-16. This is causing bugs (in pop-ups for example) and I'm at a stage where I must provide some uniformity and choose between one and the other. Do note that for the pop-ups, I must use UTF-8 because UTF-16 doesn't show the characters correctly (I'm not sure why this happens, nor is that the focus of this question).

If Java used UTF-8 by default, I would absolutely use it as well, because it is the de facto standard encoding and likely to remain so. However, since Java uses UTF-16 by default, I was thinking of changing everything to UTF-16 to be consistent with the language, and then converting to UTF-8 where needed when creating these pop-ups. Since UTF-16 has many pitfalls, which are well summarized in [1], I'm worried that I'm making the wrong decision.

So, which encoding should I use to store my String variables?

A similar question [2] was asked, but it concerns PHP and does not compare UTF-16 with UTF-8. I believe this qualifies as a different question because Java natively uses UTF-16.

[1] - Should UTF-16 be considered harmful?

[2] - Should I convert the whole project to UTF-8?

1 Answer


Java uses UTF-16 internally to represent Strings. But nobody needs to care about that, except for a small effect on efficiency.

UTF-8 is what nearly everyone uses as the standard for external representation, that is, whenever text leaves the program as bytes.
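To illustrate the distinction between internal and external representation, here is a minimal sketch. The class name `EncodingBoundaryDemo` and the helpers `toUtf8` and `fromUtf8` are hypothetical names chosen for this example; only `String.getBytes(Charset)`, the `String(byte[], Charset)` constructor, and `StandardCharsets` come from the JDK. It assumes the encoding only matters at the point where a String is turned into bytes or bytes are turned back into a String.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: names are illustrative, not from the project in question.
public class EncodingBoundaryDemo {

    // Encode a String as UTF-8 bytes when it leaves the program (file, socket, ...).
    public static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes that came from outside back into a String.
    public static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Grüße", written with Unicode escapes so the source file encoding does not matter.
        String original = "Gr\u00fc\u00dfe";

        // The same String can be encoded to different byte sequences.
        byte[] utf8 = toUtf8(original);
        byte[] utf16 = original.getBytes(StandardCharsets.UTF_16BE);
        System.out.println("UTF-8 bytes:    " + utf8.length);
        System.out.println("UTF-16BE bytes: " + utf16.length);

        // Decoding with the charset that was used for encoding restores the String.
        System.out.println("Round trip ok:  " + fromUtf8(utf8).equals(original));

        // Decoding with a different charset is a common source of garbled text.
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("Wrong charset:  " + garbled.equals(original));
    }
}
```

Under this view, a String variable is neither "in UTF-8" nor "in UTF-16" from the programmer's perspective. If text shows up garbled, a likely place to look is where bytes were decoded into a String, or a String encoded into bytes, with an unexpected charset.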

gnasher729