45

In my application I insert data in database using C code, since the strings I receive from an untrusted source I have escaped them using PQescapeByteaConn of libpq library. Which is working perfectly fine i.e. results in Octet format String. See below example,

Input String : \n\t\f\b\p\k\j\l\mestPrepared

Output String : \\012\\011\\014\\010pkjlmestPrepared

Output String is inserted in the database. Now I retrieve that data from the database in a java code using JDBC. How I can unescape the string back to its original value ?

I thought of two possible approaches,

  1. Change the database retrieval query and pass this field to any String manipulation function of postgres i.e. which can convert bytea to text.
  2. Do the decoding in Java code.

I can understand that the approach 1 will be more efficient. I have tried almost all the functions listed here but nothing is working. Please Help!!

I am using version 8.4 of postgres on a linux machine.

Evan Carroll
  • 65,432
  • 50
  • 254
  • 507
Amit
  • 591
  • 2
  • 5
  • 10

3 Answers3

39

For followers, since this seems to be the canonical question for "converting bytea to text" (i.e. so you can actually see it in pgAdmin etc.) . Here's how to just get it viewable (assuming your_column_name is type bytea)

select encode(table.your_column_name, 'escape') as your_alias_name from table_name

rogerdpack
  • 1,364
  • 1
  • 15
  • 19
25

Have you tried the encode(data bytea, format text) with escape format. In that syntax format can be any of these,

  • base64
  • hex
  • escape

So encode(E'123\\000456'::bytea, 'hex') will output the bytea as hex-encoded.

Evan Carroll
  • 65,432
  • 50
  • 254
  • 507
valgog
  • 359
  • 2
  • 2
19

Converting BYTEA to TEXT requires you to know the internal encoding of the text. Without knowing the encoding, there is nothing you can do. In a normal text column, the database stores the text as whatever SERVER_ENCODING is set as. For instance, in your example \n gets translated into \012. Well, that's a property of encoding. It's not objectively true for the universe.

If you know that encoding, it's simple however..

  1. We take your input string as an encoded literal.
  2. We encode it. This produces an escaped string (type text).
  3. Then we have to decode it back to get a type of bytea.
  4. Now having a bytea doesn't permit us to go back to text. We have to tell it what text format that bytea is in using convert_from. We tell it UTF-8.

Here is an example.

SELECT convert_from(decode(x, 'escape'), 'UTF-8')
FROM encode(E'\n\t\f\b\p\k\j\l\mestPrepared'::bytea, 'escape')
  AS t(x);
Evan Carroll
  • 65,432
  • 50
  • 254
  • 507