
I am building a web UI and an HTTP API for collaboratively editing JSON documents, with a role and versioning system.

There are several types of JSON documents. Each type is described by a JSON schema, let us say:

schema_a, schema_b

Each user is assigned a role for editing a JSON document, among:

editor_1, editor_2, reviewer

Besides the "initial" JSON document, each revision of a JSON document is stored, and only one can be marked as "final":

initial, rev_1, rev_2, rev_3, …, final

In the web UI, the user makes the following selections, in this order: first a schema (which displays the list of documents following that schema), then a document, then a role (which displays the list of revisions for that role), then a revision (among "initial", "rev_1", "rev_2", "rev_3", …, "final"). The user then loads the selected revision in the editor, works on it and eventually saves the work, which creates a new revision numbered one higher than the current revision. Before saving, the user can mark the revision as "final", in which case the new revision is saved as "final" instead.
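The save rule described above can be sketched as a small function. This is only an illustration: the name next_revision is hypothetical, and treating "initial" as revision number 0 is an assumption the question does not state.

```python
def next_revision(current: str, mark_final: bool = False) -> str:
    """Return the name under which a newly saved revision is stored.

    Hypothetical helper. Assumption: "initial" counts as revision 0,
    so saving from it produces "rev_1".
    """
    if mark_final:
        # The user marked the work as final before saving.
        return "final"
    if current == "initial":
        return "rev_1"
    if current.startswith("rev_"):
        # Current revision number + 1.
        return f"rev_{int(current[len('rev_'):]) + 1}"
    raise ValueError(f"cannot derive a new revision from {current!r}")
```

For example, next_revision("rev_2") gives "rev_3", while next_revision("rev_2", mark_final=True) gives "final".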

What is the best URI structure for this hierarchical model?

Here are the two structures that come to mind (notice the trailing slashes, denoting collection resources, as opposed to item resources):

Structure 1

In this structure, path segments are organized in a sequence of collection resource–item resource pairs:

/
/schemas/
/schemas/{schema}
/schemas/{schema}/documents/
/schemas/{schema}/documents/{document}
/schemas/{schema}/documents/{document}/roles/
/schemas/{schema}/documents/{document}/roles/{role}
/schemas/{schema}/documents/{document}/roles/{role}/revisions/
/schemas/{schema}/documents/{document}/roles/{role}/revisions/{revision}

With this structure I would allow GET on all the resources, PUT and DELETE on all the item resources and POST only on this collection resource: /schemas/{schema}/documents/{document}/roles/{role}/revisions/.
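As a rough sketch of this rule, the allowed methods can be derived from the path alone, using the trailing slash to tell collection resources from item resources. The function name allowed_methods and the regular expression are illustrative assumptions, not part of the design above.

```python
import re

# The only collection resource that accepts POST in Structure 1.
REVISIONS_COLLECTION = re.compile(
    r"^/schemas/[^/]+/documents/[^/]+/roles/[^/]+/revisions/$"
)

def allowed_methods(path: str) -> set:
    """Return the HTTP methods allowed on a Structure 1 path (hypothetical helper)."""
    methods = {"GET"}  # GET is allowed on every resource
    if path.endswith("/"):
        # Trailing slash: collection resource.
        if REVISIONS_COLLECTION.match(path):
            methods.add("POST")
    else:
        # No trailing slash: item resource.
        methods |= {"PUT", "DELETE"}
    return methods
```

With this sketch, /schemas/ allows only GET, /schemas/schema_a allows GET, PUT and DELETE, and the revisions collection of a given role allows GET and POST.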

Examples. — I have omitted the headers to simplify.

Request 1:

GET /schemas/ HTTP/1.1

Response 1:

HTTP/1.1 200 OK

["/schemas/schema_a", "/schemas/schema_b"]

Request 2:

GET /schemas/schema_a HTTP/1.1

Response 2 (I use JSON Schema):

HTTP/1.1 200 OK

{ "type": "object", "properties": { "x": {"type": "number"}, "y": {"type": "boolean"}, "z": {"type": "string"} }, "required": ["x", "y"] }

Structure 2

In this structure, all path segments denote collection resources except the last one, which denotes an item resource when the path is complete (the longest URIs):

/
/documents/
/documents/{schema}/
/documents/{schema}/{document}
/revisions/
/revisions/{schema}/
/revisions/{schema}/{document}/
/revisions/{schema}/{document}/{role}/
/revisions/{schema}/{document}/{role}/{revision}
/schemas/
/schemas/{schema}

With this structure I would allow GET on all the resources, PUT and DELETE on all the item resources and POST only on this collection resource: /revisions/{schema}/{document}/{role}/.

Examples. — I have omitted the headers to simplify.

Request 1:

GET /documents/ HTTP/1.1

Response 1:

HTTP/1.1 200 OK

["/documents/schema_a/", "/documents/schema_b/"]

Request 2:

GET /documents/schema_a/ HTTP/1.1

Response 2:

HTTP/1.1 200 OK

["/documents/schema_a/document_foo", "/documents/schema_a/document_bar"]

Request 3:

GET /documents/schema_a/document_foo HTTP/1.1

Response 3:

HTTP/1.1 200 OK

{ "x": 48, "y": true }

1 Answer


One issue I see is that this design makes it too easy to send duplicate, and possibly conflicting, data in a request. If I understand your design correctly, and a document has three fields (schema_id, title, and last_modified), I could create a document with this request:

POST /documents/schema_a/

{
  "schema_id": "schema_a",
  "title": "A fancy title",
  "last_modified": "2019-05-01"
}

What if instead a client made this request:

POST /documents/schema_a/

{
  "schema_id": "schema_b",
  "title": "A fancy title",
  "last_modified": "2019-05-01"
}

What schema would you expect that document to be in after this request? Would the resource server raise an error, or would it just silently make a default choice? Would that choice be the same choice the client would expect it to make? If the server makes a choice, what's the purpose of having the one it didn't choose?
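To make the ambiguity concrete, here is a minimal sketch of one possible server-side answer: reject the request when the schema in the path and the schema in the body disagree. The function name check_schema_consistency and the choice of status code 400 are assumptions made for illustration; a server could just as well pick another policy.

```python
from typing import Any, Dict, Tuple

def check_schema_consistency(path_schema: str, body: Dict[str, Any]) -> Tuple[int, str]:
    """Return an (HTTP status, message) pair for a document creation request.

    Hypothetical helper. Assumption: a mismatch between the path and the
    body is treated as a client error rather than silently resolved.
    """
    body_schema = body.get("schema_id")
    if body_schema is not None and body_schema != path_schema:
        return 400, (
            f"schema in path ({path_schema}) conflicts with "
            f"schema_id in body ({body_schema})"
        )
    return 201, "created"
```

Even with such a check, the schema is still stated twice in every valid request, which is the duplication the questions above point at.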


My suggestion is to break your URIs into 4 resources: schemas, documents, revisions, and roles. You would then have these resource URIs available for listings:

GET /schemas
GET /documents
GET /roles
GET /revisions

And these URIs available to fetch individual entities:

GET /schemas/{id}
GET /documents/{id}
GET /roles/{id}
GET /revisions/{id}

And these URIs for updating/deleting:

PUT /documents/{id}
DELETE /documents/{id}
PUT /revisions/{id}
DELETE /revisions/{id}
(... etc)

And whatever URIs you need for creating:

POST /documents
POST /roles

All of the data you're trying to put in the URI belongs, IMHO, either in the POST/PUT body or in query parameters. For instance, your question has this URI: /schemas/{schema}/documents/. Just looking at it, I would expect it to return all documents in the given schema. You can just as easily accomplish this using query parameters instead:

GET /documents?schema={schema}
GET /documents?schema={schema}&role={role}
GET /documents?role={role}&schema={schema}

The last two examples show that query parameters used this way are order-independent, whereas data placed in the path of the URI is not. This has the benefit that you can mix and match different query parameters without having to add an entirely new route to your application. For example, your current list of routes cannot handle this query:

GET /revisions?schema={schema}
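The order-independence and the single-route filtering described above can be sketched with Python's standard urllib.parse module. The sample records, their field names, and the function filter_collection are made up for illustration only.

```python
from typing import Dict, List
from urllib.parse import parse_qsl, urlsplit

# Hypothetical in-memory data; the field names are assumptions.
REVISIONS: List[Dict[str, str]] = [
    {"id": "1", "schema": "schema_a", "role": "editor_1"},
    {"id": "2", "schema": "schema_a", "role": "reviewer"},
    {"id": "3", "schema": "schema_b", "role": "editor_1"},
]

def filter_collection(items: List[Dict[str, str]], uri: str) -> List[Dict[str, str]]:
    """Keep the items whose fields match every query parameter in the URI."""
    # The parameters end up in a dict, so their order in the URI is irrelevant.
    filters = dict(parse_qsl(urlsplit(uri).query))
    return [
        item for item in items
        if all(item.get(key) == value for key, value in filters.items())
    ]
```

One handler then serves /revisions?schema=schema_a, /revisions?role=editor_1 and any combination of the two, in either order, without adding a route per combination.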

This organization is, to me, much more REST-like and treats each resource equally. It also takes far less inside knowledge of the organizational structure to consume: I can know nothing about how documents, schemas and revisions are related, start consuming, and infer the relationships just from the data returned. If you include linked actions in your result data (as suggested by HATEOAS), then I don't need to infer anything at all: I can just start consuming your data, and you tell me everything I can do with that data in the data itself.
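As a rough illustration of the HATEOAS idea, a document representation could carry links to its related resources. The "_links" key follows the HAL convention, which is only one of several possible formats; the function name and the field names are assumptions.

```python
from typing import Any, Dict

def document_representation(doc_id: str, schema_id: str, content: Dict[str, Any]) -> Dict[str, Any]:
    """Build a response body that tells the client where it can go next.

    Hypothetical helper; the link layout is HAL-style and purely illustrative.
    """
    return {
        "id": doc_id,
        "content": content,
        "_links": {
            "self": {"href": f"/documents/{doc_id}"},
            "schema": {"href": f"/schemas/{schema_id}"},
            "revisions": {"href": f"/revisions?document={doc_id}"},
        },
    }
```

A client that receives this body can follow the "schema" or "revisions" link without knowing in advance how the URIs are built.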