I read a message out of my Database like this:

    import MySQLdb

    db = MySQLdb.connect(host="localhost", user="?", passwd="?", db="?")
    cur = db.cursor()

    message = cur.execute("SELECT Nr, Nachricht from message")

    for row in cur.fetchall():
        print row

    cur.close()
    db.close()

The output is right, but the numbers get an L at the end, like:

DB->'2'; Output->'2L'

The 'ä', 'ü', 'ö' and 'ß' also come out wrong, like:

DB->'ü'; Output->'\xfc'

Simon Ber

1 Answer

Looks like you have a character encoding issue. The output \xfc means the character with (hexadecimal) code point 0xFC. That is indeed ü in the Latin-1 character set, but my guess is that it's not being interpreted in that way by Python (or at least it's got mangled somewhere along the way).
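To make the byte-level picture concrete, here is a minimal sketch in Python 3 (not the asker's Python 2 setup). The variable names are illustrative only; it simply shows how the single byte 0xFC behaves under Latin-1 versus UTF-8.

```python
# The single byte 0xFC, as a driver configured for Latin-1 might return it.
raw = b"\xfc"

# Decoded as Latin-1, the byte is the character ü (code point U+00FC).
as_latin1 = raw.decode("latin-1")

# On its own, the same byte is not valid UTF-8, so a strict decode fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# In UTF-8, ü is encoded as two bytes rather than one.
as_utf8_bytes = as_latin1.encode("utf-8")

print(valid_utf8, as_utf8_bytes)
```

So a lone \xfc in the output suggests the data is arriving as Latin-1 bytes rather than as decoded text.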

My suggestion would be to move everything to UTF-8. If possible, switch to Python 3, whose str type is Unicode text and whose source files default to UTF-8 (or use the unicode type instead of str in Python 2; resources are available online if you're unfamiliar with it).

Then, configure MariaDB to use UTF-8 (the utf8mb4 character set covers the full Unicode range). Once the old data has been cleared out, you should be able to store any Unicode character without it being mangled.
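On the client side, the connection also has to ask for UTF-8. The following is a rough sketch, assuming the MySQLdb driver from the question and its charset and use_unicode connection arguments; the helper names utf8_connect_kwargs and fetch_messages are hypothetical, and the "?" credentials are the placeholders from the question.

```python
def utf8_connect_kwargs(host, user, passwd, db):
    """Build keyword arguments for MySQLdb.connect() that request UTF-8."""
    return {
        "host": host,
        "user": user,
        "passwd": passwd,
        "db": db,
        "charset": "utf8mb4",   # talk to the server in full-range UTF-8
        "use_unicode": True,    # have the driver return decoded text
    }


def fetch_messages(**kwargs):
    """Return all (Nr, Nachricht) rows from the message table."""
    import MySQLdb  # imported here so the helper above works without the driver

    db = MySQLdb.connect(**kwargs)
    try:
        cur = db.cursor()
        cur.execute("SELECT Nr, Nachricht FROM message")
        rows = cur.fetchall()
        cur.close()
        return rows
    finally:
        db.close()


# Example call (needs a running server and real credentials):
# for nr, nachricht in fetch_messages(**utf8_connect_kwargs("localhost", "?", "?", "?")):
#     print(nr, nachricht)
```

Unpacking each row and printing the fields individually, rather than printing the whole tuple, also avoids seeing the escaped representation of each value.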

You might also find "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" helpful for some historical context on why Unicode is needed and what is going wrong.

Aurora0001