56

I need to store JSON objects in a SQLite database, and then do complex queries on it.

I did a table like this:

+--------------------------------------+
|document |  property | string | number|
+--------------------------------------+
|foo      |  "title"  | "test" |       | 
+--------------------------------------+
|foo      |  "id"     |        |  42   | 
+--------------------------------------+
|bar      |  "id"     |        |  43   | 
+--------------------------------------+

for the two objects

foo {"title": "test", "id": 42} 
bar {id: 43}

But I can't do "AND" queries, like:

SELECT DISTINCT id  FROM table WHERE title = "test" AND id = 42 

as you see, the part after the "WHERE" is total nonsense, but I have no idea of how to create a query that would do what I want.

So do you think there is a better way to store my data, or a workaround to do an "AND" query?

And of course, the JSON can contain any property, so I can't create a table with columns for each properties.

I am using WebSQL, which is SQLite without extensions.

I know my question is pretty specific, but could you help me?

Julien Vavasseur
  • 10,180
  • 2
  • 28
  • 47
tuxlu
  • 561
  • 1
  • 4
  • 3

4 Answers4

54

SQLite 3.9 introduced a new extension (JSON1) that allows you to easily work with JSON data .

Also, it introduced support for indexes on expressions, which (in my understanding) should allow you to define indexes on your JSON data as well.

ThePhysicist
  • 641
  • 6
  • 8
10

PostgreSQL has some nice features for JSON storage. You can select the JSON values by doing

SELECT DISTINCT FROM TABLE WHERE foo->title=test AND id=42

You can also index the specific keys on the JSON object if you use the jsonb type.

Andriy M
  • 23,261
  • 6
  • 60
  • 103
Neza
  • 109
  • 1
  • 3
7

The existing answers for this deal with actually storing the OP's data as JSON (which may have been a better solution to his underlying problem).

However, the actual question was how to get find documents in the EAV-style table provided, based on multiple properties. So, perhaps an answer to that question would be useful.

The standard SQL way of doing what you are saying is INTERSECTION.

(SELECT document FROM table WHERE property="id" AND number=43)
INTERSECTION 
(SELECT document FROM table WHERE property="title" AND string="test")

You can also do this with self joins if you prefer.

SELECT t1.document
FROM table as t1, table as t2
WHERE t1.document = t2.document
AND t1.property="id" AND t1.number=43
AND t2.property="title" AND t2.string="test"

Of course the more "correct" and arguably most efficient way would be to just make a column for each property you need, like 'id' and 'title'.

RDFozz
  • 11,731
  • 4
  • 25
  • 38
Thomas Ahle
  • 171
  • 1
  • 2
1

You can import the json into a table with a single row and one column and use the built in json functions to extract the values you need.

Based on the given SQL

SELECT DISTINCT id  FROM table WHERE title = "test" AND id = 42 

and the provided foo object

{"title": "test", "id": 42}

I assume that you have an array of json objects like this

[
  {"title": "test 1", "id": 42},
  {"title": "test 2", "id": 43}
]

In my case i had a pmd json output with an array of json objects ([finding-1, finding-2]). Each array element finding contains another json object code which contains another json object target

array : [
   finding-1 { ... }
                 |
                 |--- code : { ... }
                        |
                        |--- target { ... }
   ,finding-2 { ... }
   ,finding-3 { ... }
]

similar to this

[
 {"resource": "/C:/src/one.cls",
  "code": {
        "value": "AvoidGlobalModifier",
        "target": {
            "path": "/latest/pmd_rules_apex_bestpractices.html",
            "fragment": "avoidglobalmodifier"
        }
    },
    "severity": 4,
    "message": "Avoid using global modifier ..."
},
{"resource": "/C:/src/two.cls",
    "code": {
        "value": "ApexSOQLInjection",
        "target": {
            "path": "/latest/pmd_rules_apex_bestpractices.html",
            "fragment": "apexsoqlinjection"
        }
    },
    "severity": 4,
    "message": "... escape variables merged in DML query ..."
}] 

I created table pmd_report where i copied the json-array into the column pmd_output in a single record. The table looks like this

CREATE TABLE "pmd_report" (
    "id"    INTEGER,
    "import_date"   TEXT,
    "pmd_output"    TEXT
)

To extract the values from the json array and the json objects this SQL statement can be used

SELECT 
    json_extract(json_parsed, '$.code.target.fragment') as code_target_fragment, 
    json_extract(json_parsed, '$.severity') as severity, 
    json_extract(json_parsed, '$.message') as message, 
    json_extract(json_parsed, '$.resource') as resource, 
    json_extract(json_parsed, '$.code.value') as code_value, 
    json_extract(json_parsed, '$.code.target.path') as code_target_path 
FROM (
    SELECT jsonEach.Value as json_parsed
        FROM pmd_report,json_each(pmd_report.pmd_output, '$') as jsonEach 
) as subquery;

To store the extracted json you can insert the rows into a new table like this

INSERT INTO pmd_report_2 (fragment, severity, message, startLineNumber, endLineNumber, 
        resource, 
        code_value, 
        code_target_path
        )
SELECT code_target_fragment, severity, message, 
       startLineNumber, endLineNumber, substr(resource, 43) as resource, 
       code_value,  code_target_path
FROM (
    SELECT 
        json_extract(json_parsed, '$.code.target.fragment') as code_target_fragment, 
        json_extract(json_parsed, '$.severity') as severity, 
        json_extract(json_parsed, '$.message') as message, 
        json_extract(json_parsed, '$.startLineNumber') as startLineNumber,
        json_extract(json_parsed, '$.endLineNumber') as endLineNumber,
        json_extract(json_parsed, '$.resource') as resource, 
        json_extract(json_parsed, '$.code.value') as code_value, 
        json_extract(json_parsed, '$.code.target.path') as code_target_path 
    FROM (
            SELECT jsonEach.Value as json_parsed
            FROM pmd_report,json_each(pmd_report.pmd_output, '$') as jsonEach 
    ) as subquery ) 
as subquery_2;
surfmuggle
  • 219
  • 3
  • 14