9

In designing a RESTful API, the problem arises of how best to allow resources to be moved between collections.

Renaming a resource could be done with PATCH, but this is not the same thing as moving the resource between collections. It is also not clear whether it is the resource or the collection that should be patched. Does it make sense to PATCH the resource path of an object in the API if the resource path is not a direct attribute (content) of the resource?

Clearly a DELETE/POST sequence could be used, but this involves multiple operations and is not atomic. In the post "How to handle a request and delete", the issue of performance is raised and POST is suggested as a solution. However, POST by itself should not (imho) imply a DELETE. Server performance is not an issue for me; the question is more about the integrity of the RESTful API.
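To make the atomicity concern concrete, here is a minimal sketch of what can go wrong when the move is issued as two separate requests. The in-memory `store`, the paths, and the `fail_between` switch are hypothetical and stand in for a real server and a dropped connection.

```python
class StepFailed(Exception):
    """Raised to simulate a failure between the two requests."""


def move_with_delete_then_post(store, source, target, fail_between=False):
    """Emulate a client issuing DELETE on source followed by POST for target."""
    body = store.pop(source)          # step 1: DELETE the source resource
    if fail_between:
        raise StepFailed("connection lost after DELETE, before POST")
    store[target] = body              # step 2: POST creates the resource at target


# Hypothetical starting state: one resource in collection2.
store = {"/collection2/1234": {"name": "report"}}
try:
    move_with_delete_then_post(
        store, "/collection2/1234", "/collection1/1234", fail_between=True
    )
except StepFailed:
    pass

# After the simulated failure the resource is in neither collection.
lost = "/collection2/1234" not in store and "/collection1/1234" not in store
```

Reversing the order (POST first, then DELETE) would leave a duplicate rather than a lost resource, so the intermediate state is visible either way.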

Using PUT is not an option. RFC 2616 states:

The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.

Hence, either the resource is replaced in situ or it is created.

Is there a RESTful way to implement this whilst maintaining atomicity of the operation?

Jon Guiton

5 Answers

13

Is there a RESTful way to implement this whilst maintaining atomicity of the operation?

Short Answer

Just use POST

Medium Answer

Seriously; it is okay to use POST.

POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding, 2009

Long Answer

REST doesn't have collections. REST has resources and representations, and a uniform interface that includes a vocabulary of self-descriptive messages that are common to all resources.

HTTP doesn't have collections either. It defines a vocabulary of standardized self-descriptive messages that are common all over the web. In other words, when interpreting a message we don't need any specialized knowledge of either the producer or consumer of the message. GET means GET, HEAD means HEAD, 200 means OK, 404 means Not Found, conditional requests, authentication, caching... it's all the same everywhere.

The application domain of HTTP is the transfer of documents over a network. We're just sending each other little copies of documents telling the other guy what to do. If I want you to move a document (A) into a "collection" (B), then I send to you a document (C) that looks something like:

Please move document A into collection B

All of the other stuff -- the method-token, the headers values, the response codes -- that's all meta-data of the document transfer domain; information that we attached to the document so that general purpose HTTP components can do useful things.

In other words, the meta-data allows us to take advantage of the intelligence that we've built into the document transfer application so that we get more value out of it than mere transport.

So, how can we surface the idea of "collection" so that our document transfer application can take advantage of it?

There are at least two answers to this. One answer is WebDAV, which offers a definition of collection resources. And no joke, if what you want is remote web content authoring, you should give it a serious look. RFC 4918 defines the standard semantics for the COPY and MOVE method-tokens.

The other, and I think more common, approach is to describe the relationships between resources. We've got web linking, which gives us standardized forms for describing Target/Context/Relation triples. And we've got RFC 6573, which defines the semantics of the item and collection link relation types.

So we can get kind of close: if we have a representation schema like Collection+JSON which has a mechanism for describing a document's own links, then any client familiar with that schema will be able to identify the link relations within it, and those link relations can be changed by sending to the server a representation of the document with the new link values using the same messages that we would use for any other edit (ie: PUT/PATCH). The server can easily understand the request, and decide on its own whether or not to fulfill it.

But it's only close; it doesn't generalize particularly well (where do you embed link relations in a CSV file?). So that leaves you either sending multi-part documents around (ugh) or trying to embed the relations in the headers. And yes, we've already standardized a header for that.

But what we haven't defined is how the semantics of an HTTP request are further refined when link relation headers are present in the request.

And that means that general purpose components aren't going to have a clue what is going on, and aren't going to be able to act intelligently.

And that leaves you with two choices

  • drive the specification and adoption of new standard(s)
  • recognize that the action isn't worth standardizing

Discussion

The problem is that POST isn't idempotent.

Yes, but let's look carefully at what that means here.

The standard doesn't say "POST is restricted to use for non-idempotent actions". It is perfectly satisfactory to use POST for idempotent actions that aren't worth standardizing.

What it says is that general-purpose components are not allowed to assume that POST is idempotent; that the idempotent semantic constraint doesn't apply to all POST messages. Because the constraint is missing, our document transfer application can't do intelligent things like autonomously retry lost messages.

That's something we could potentially fix via a new method token (let's call it TSOP); you could write up the semantics of TSOP, and guide it through the standards process, and get it registered with IANA, and drive adoption. Ta-da! You now have general-purpose browsers that will resubmit lost form submissions.

Failing that, you are left to look into other registered methods that are unsafe and idempotent. Of the obvious ones, you are limited to PUT.

And PUT is fine -- every general purpose component in the world will understand the document transfer semantics of

PUT /a36c586a-cf90-46aa-b098-b3ffa038bebd HTTP/1.1
Content-Type: text/plain

Please move document A into collection B

So we have a resource model that is documents about changes to documents, and affordances so that clients can find the documents that are documents about documents, and we can design all that.
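A minimal sketch of why PUT fits here: the client picks the URI for the command document, so resending the same request after a lost response does not move anything twice. The `Server` class, the collection names, and the dictionary shape of the command are assumptions made for illustration only.

```python
import uuid


class Server:
    """Toy server: collections are sets of document names (an assumed model)."""

    def __init__(self):
        self.collections = {"inbox": {"A"}, "B": set()}
        self.commands = {}  # command URI -> command document stored there

    def put(self, uri, command):
        """Store the command document at `uri`, applying the move at most once."""
        if uri in self.commands:
            # Replaying the identical document is a no-op; anything else conflicts.
            return 200 if self.commands[uri] == command else 409
        source = self.collections[command["from"]]
        if command["document"] not in source:
            return 409
        source.remove(command["document"])
        self.collections[command["to"]].add(command["document"])
        self.commands[uri] = command
        return 201


server = Server()
uri = "/" + str(uuid.uuid4())  # client-chosen URI, as in the PUT request above
command = {"document": "A", "from": "inbox", "to": "B"}

first = server.put(uri, command)
retry = server.put(uri, command)  # e.g. the first response never arrived
```

Both calls leave the server in the same state, which is the property a general-purpose component needs before it can retry on its own.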

I am trying to understand exactly what isn't achieved here.

The piece that we don't have is any sort of standardized language for describing changes to previously cached resources in an HTTP response. We only have invalidation, and that applies only in a limited number of cases.

Consider that 200, in response to a PUT, means that the payload of the response is a representation of the status of the action. So we might imagine that the response payload to our PUT request could look something like:

SUCCESS

Document /A has been moved to /B
Document /A/1 has been moved to /B/1
Document /B/2 has been removed
Document /C/4 has been moved to /K/9

Of course, we're just making up a language here -- if we want adoption, then this idea would need to be tightened up into a standard. That might look like using the link header (these do appear to be link triples), and standardizing a new link relation, and then standardizing the semantics of link headers with that relation in the context of an HTTP response.

And then driving adoption.

VoiceOfUnreason
6

One possible solution might indeed be to drop that "should not (imho) imply a DELETE".

While it is somewhat counterintuitive that a constructive REST verb (POST) could have a destructive side effect, in general it is relatively normal to have side effects beyond the creation of a new resource (for example, POST on a comments endpoint might have a side effect on the num_comments attribute of a base article resource.)

The exact composition of a POST message is somewhat unspecified. In many cases, you'd want to have it look mostly like the resource that should be the operation's result, but you could also have request bodies that specify how the resource should be created instead of specifying its exact content.

To create a resource as a clone of an existing one you could use

POST /api/v1/collection1
{
    "copy_from":"/api/v1/collection2/1234"
}

while to move a resource as in your use case you would use

POST /api/v1/collection1
{
    "move_from":"/api/v1/collection2/1234"
}
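On the server side, a handler for these two request bodies might look roughly like the sketch below. The `Api` class, the in-memory `resources` dictionary, and the choice to reuse the last path segment as the new identifier are all assumptions; the point is only that the create and the delete happen under one lock, so no other request observes the resource in both places or in neither.

```python
import copy
import threading


class Api:
    """Toy in-memory API keyed by resource path (an assumed storage model)."""

    def __init__(self, resources):
        self.resources = dict(resources)
        self.lock = threading.Lock()

    def post(self, collection, body):
        """Create a resource in `collection` from a copy_from or move_from body."""
        source = body.get("copy_from") or body.get("move_from")
        if source is None:
            return 400, None
        with self.lock:  # lookup, create and delete happen as one step
            if source not in self.resources:
                return 404, None
            location = f"{collection}/{source.rsplit('/', 1)[-1]}"
            if location in self.resources:
                return 409, None
            self.resources[location] = copy.deepcopy(self.resources[source])
            if "move_from" in body:
                del self.resources[source]
            return 201, location


api = Api({"/api/v1/collection2/1234": {"title": "example"}})
status, location = api.post(
    "/api/v1/collection1", {"move_from": "/api/v1/collection2/1234"}
)
```

A real implementation would more likely rely on a database transaction than on a process-wide lock, but the request and response shapes would stay the same.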

Would it solve your problem? - I think so.

Would it work with HTTP infrastructure such as proxies, caches, etc.? - I think so.

Is it fully RESTful in spirit? - I don't really know.

2

I think you can solve this by examining your assumptions about the concept of a "collection".

To help us, let's work with a concrete example:

  • The resource is currently located at /clients/acme/users/bob
  • We want it to instead be located at /clients/zebedee/users/bob

It's common to interpret that as meaning:

The resource bob is in the /clients/acme/users collection, and we want to move it to the /clients/zebedee/users collection.

But that only makes sense if you assume that:

  • Resource identifiers are strictly hierarchical
  • Each component of a resource identifier represents a collection
  • A resource resides in exactly one collection

But these assumptions are incorrect:

  • Resource identifiers are arbitrary strings; this is part of the idea behind HATEOAS, that the structure of a URL should not determine its relationships.
  • A "collection" is just a type of resource that represents some complex structure. Its name can be just as arbitrary as any other resource.
  • A resource can't "contain" other resources, it can only link to them, so it's perfectly valid for more than one "collection resource" to link to the same resources in different combinations.

So, we can re-construct our problem in various different ways, giving us lots of choices to represent the required action:

  • /clients/acme/users/bob is just an alias or search query, and the canonical identifier for the user is /users/29d123bb-d0ff-488d-b81f-37ffe6a945f7. That resource has a client field in its representation, and a PUT or PATCH request can be used to update that field.
  • An additional "collection" resource exists at /users/ which manages users across all clients, and a PUT or PATCH request to that resource can define the required state of a user {"id": "29d123bb-d0ff-488d-b81f-37ffe6a945f7", "client": "zebedee"}.
  • The /users/ collection accepts a representation of users grouped by client, and you can send a PATCH request which atomically deletes one user and creates the other.
  • The /users/ collection accepts a PATCH request which defines the move directly as "from acme/bob to zebedee/bob".
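The first option in the list above can be sketched as follows, assuming an in-memory `users` dictionary and hypothetical `resolve` and `patch_user` helpers. The per-client path is treated as a lookup over canonical user resources, so "moving" bob is a single update to one field.

```python
# Canonical users, keyed by an opaque identifier (assumed storage model).
users = {
    "29d123bb-d0ff-488d-b81f-37ffe6a945f7": {"name": "bob", "client": "acme"},
}


def resolve(path):
    """Treat /clients/<client>/users/<name> as a query over canonical users."""
    _, _, client, _, name = path.split("/")
    for user_id, user in users.items():
        if user["client"] == client and user["name"] == name:
            return f"/users/{user_id}"
    return None


def patch_user(user_id, changes):
    """Apply a partial update to the canonical resource in one assignment."""
    users[user_id] = {**users[user_id], **changes}


before = resolve("/clients/acme/users/bob")
patch_user("29d123bb-d0ff-488d-b81f-37ffe6a945f7", {"client": "zebedee"})
old_alias = resolve("/clients/acme/users/bob")
new_alias = resolve("/clients/zebedee/users/bob")
```

The canonical URI never changes; only the alias that resolves to it does.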
IMSoP
1

You can have a move resource to which you can POST.

POST's semantics is defined by the resource itself, so you're right that doing that on the original resource is somewhat dodgy, but doing so on a move resource should be completely ok (RFC7231):

The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.

The caller shouldn't know where the POST goes anyway, since you should be using forms to do this, if you are doing REST which includes HATEOAS.

1

I'd like to thank all those who have responded to my question for their time and effort. Many of the answers are very well informed and the discussion here has been very helpful for my own understanding and also for my current project. After having carefully reviewed the answers I would like to offer my own answer to my own question.

No. There is no RESTful way to safely move a resource between collections, nor is this possible within the framework of HATEOAS. Without the definition of a new HTTP verb MOVE, moving resources safely between collections must involve the use of server-side state representations.

My reasoning is this. Without the use of a new verb, the "mv" operation is intrinsically a two step process. It can be seen as either a DELETE/PUT operation on the subject resource or PATCH/PATCH operation on the source and destination collections. It is also an idempotent operation. As @Jörg_W_Mittag stated, under UNIX this is a system call and hence an atomic operation.

There will always be a possibility that some kind of error can occur between the first operation and the second. In order that the client is free to try the operation again under conditions of failure the verbs used must preserve the intrinsic idempotent nature of the "mv" operation. In order to make "mv" a safe operation there are really only two possibilities.

a) Use a new verb such as MOVE, or construct verbs outside of the HTTP protocol and pass the information out of band (i.e. on some protocol built on top of HTTP). In this way "mv" is made into an atomic operation which can be performed safely by the client. WebDAV, as suggested by @VoiceOfUnreason, is one possibility for this.

b) Use a server side representation of client state to facilitate a simple form of transaction processing. Either

BEGIN TRANSACTION
   DELETE R
   POST R || ROLLBACK
COMMIT

or

BEGIN TRANSACTION
   PATCH C1
   PATCH C2 || ROLLBACK
COMMIT

The reason why this cannot be achieved within the framework of HATEOAS and hence in a RESTful way is that it is necessary for the server to match the bracket terms 'BEGIN TRANSACTION' with their bracket closures 'ROLLBACK' or 'COMMIT'. In order to do this the server must maintain at least 1 bit of client state information.
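As a rough illustration of the first transaction above, the sketch below keeps the transaction state on the server as a snapshot of an in-memory store. The `transaction` and `move` helpers and the paths are hypothetical; a real service would delegate this to its database.

```python
import copy
from contextlib import contextmanager


@contextmanager
def transaction(store):
    """Snapshot the store on entry and restore it if the body raises."""
    snapshot = copy.deepcopy(store)   # BEGIN TRANSACTION
    try:
        yield store                   # COMMIT is implicit on normal exit
    except Exception:
        store.clear()                 # ROLLBACK
        store.update(snapshot)
        raise


def move(store, source, target):
    """DELETE the source and create the target, or leave the store untouched."""
    with transaction(store) as tx:
        body = tx.pop(source)                     # DELETE R
        if target in tx:
            raise ValueError(f"{target} already exists")
        tx[target] = body                         # POST R


store = {"/c1/r": "payload", "/c2/r": "other"}
try:
    move(store, "/c1/r", "/c2/r")   # fails after the delete step
except ValueError:
    rolled_back = dict(store)

move(store, "/c1/r", "/c3/r")       # succeeds
```

Here the snapshot is the state the server has to hold between BEGIN and COMMIT, which is the kind of server-side bookkeeping the argument is about.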

More generally, HATEOAS treats the client-server interaction within the framework of regular languages (see https://nordicapis.com/designing-a-true-rest-state-machine/).

At any one time the client owns its own state and is presented with a set of links which it can follow to arrive at a new state as determined by the server. The server therefore acts as a state transition table.

It is well known that matching brackets is easily performed within the framework of context-free languages but is not possible within the framework of regular languages (see "About CFLs"). In other words, HATEOAS restricts the server to acting as a regular automaton precisely because it cannot store client state. The server is therefore unable to match brackets and hence unable to bracket sequences of operations into transactions.
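For illustration, bracket matching with a stack can be sketched as below. The token names mirror the transaction pseudocode above; treating transactions as nestable to arbitrary depth is an assumption made here so that a stack is actually needed, and `brackets_match` is a hypothetical helper.

```python
def brackets_match(tokens):
    """Return True if every BEGIN is closed by a later COMMIT or ROLLBACK."""
    stack = []
    for token in tokens:
        if token == "BEGIN":
            stack.append(token)          # open a bracket
        elif token in ("COMMIT", "ROLLBACK"):
            if not stack:
                return False             # closing bracket with nothing open
            stack.pop()                  # close the most recent BEGIN
    return not stack                     # anything left open is unmatched


balanced = brackets_match(["BEGIN", "DELETE R", "POST R", "COMMIT"])
unclosed = brackets_match(["BEGIN", "BEGIN", "PATCH C1", "COMMIT"])
```

The stack can grow without bound as BEGINs nest, which is what separates this from a machine with a fixed number of states.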

I think there are likely many other examples like this where REST fails to provide a solution but there is certainly a way to augment REST to give it the power of CFLs rather than RLs. This would require the use of a client stack object on the server.

The important verbs here would be PUSH, POP, EMPTY, etc. These operations could apply to symbols or to URIs. The resulting system would be far more powerful than REST, able to support transaction processing models and also backtrack search points for the client. In short, make the server into a stack automaton.

This approach would not be at all RESTful because it definitely implies that the server maintains some kind of user state, however it is very close with just the addition of stack operations to the standard set of verbs. Furthermore, it is probably the simplest representation of client state possible (other than a single variable); is safe (with timeout and stack limits) and would provide the full power of CFLs to the interoperation of the client and server.

This is not the conclusion I expected or hoped to reach. I'm now really unsure how or whether to proceed with a REST-based API for my current project, but I have learnt a lot by asking this question. Thank you, contributors.

Jon Guiton