3

I am trying to dump some data as json with:

\copy (SELECT json_build_object('areaSqft',area_sqft, 'titleNos',title_nos, 'buildingIds',building_ids, 'geometry',geom_json) FROM my_data) to my_data.csv with delimiter ',' csv header

what I am expecting is a valid json per row, but what I get is:

"{""areaSqft"": 214.394254595041, ""geometry"": {""type"": ""MultiPolygon"", ""coordinates"": [[[[0.000015, 51.449107], [0.000154, 51.441108], [0.000238, 51.44111], [0.00024, 51.441052], [0.000137, 51.441051], [0.000041, 51.441049], [0.000015, 51.441107]]]]}, ""titleNos"": [""ZB78669""], ""buildingIds"": [7521141, 9530393, 7530394]}"

There are extra " as first and last character and "" around instead of single ".

How can I get a valid json stripping unnecessary quotes?

Randomize
  • 1,203
  • 2
  • 21
  • 36

3 Answers3

4

The quotes around fields and inside fields are part of the CSV format, and they're required here, because, according the CSV spec:

  1. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes

  2. If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote

I think that you don't want or need CSV in your case. Just take the output of SELECT, with the unaligned format of psql

=# \pset format unaligned 
Output format is unaligned.

=# select json_build_object('foo', 1, bar, 2) AS myjson from (values (E'xy\zt'), ('ab,cd')) as b(bar);

myjson {"foo" : 1, "xyzt" : 2} {"foo" : 1, "ab,cd" : 2} (2 rows)

You may also use \g output.json instead of the semi-colon at the end of the query to have psql redirect the results of that query to a file, and \pset tuples_only to remove headers and footers.

Daniel Vérité
  • 32,662
  • 3
  • 78
  • 84
3

You can just add more parameters to the query:

COPY (
       SELECT row_to_json(fruit_data) FROM (
          SELECT
        name AS fruit_name,
             quantity
             FROM data
              ) fruit_data
         , TRUE  -- add this parameter 
           ) TO 'a.file';

See also:

Paul White
  • 94,921
  • 30
  • 437
  • 687
Steve Ruben
  • 162
  • 1
  • 3
1

Putting this here for future ref, as I've had to deal with a particular column containing JSON on a CSV (already) exported file.

You can get away with it by using in your copy options for FORMAT CSV:

  • QUOTE '''' (using the single quote char instead of the default double quote one)

  • along with DELIMITER '|' (or any other one than comma, to prevent having to escape all commas in your JSON, which is error prone and makes for lower compatibility downstream when you try to use the file for loading (copy from))

The problem for me was actually that I was previously using cursor.copy_to() from Python's psycopg2 library and needed to replace it with cursor.copy_expert() for better/more control (quoting/unquoting of column names passed). And apparently the .copy_to method uses the QUOTE '''' option without it being documented anywhere I could find...

bluu
  • 111
  • 2