How do you resolve a byte[] into a class instance in a way that doesn't couple the serialization/deserialization contexts together?

Question

Consider that you've got a POJO that you intend to serialize and send through a socket.

You can use whatever serialization strategy you wish (JSON, XML, protobuf, etc.) to serialize the actual POJO into a byte[], then you send it through the socket.

The byte[] arrives on the other end, but in this receiving context you do not know what class the information represents, so how do you know which POJO class to construct to begin populating its fields?

I want to do this without needing multiple endpoints/sockets, each of which would let me assume the type of data being received. I want to receive all sorts of different POJOs in the same socket context.

One idea was to share a mapping across these contexts, mapping a type code to a class type. I could then build some sort of user defined frame with which to transport the data.

| 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 |
+----------------+---------------+---------------+----------------+
|                      TYPE CODE SECTION (32 bit int)             |
+----------------+---------------+---------------+----------------+
|                      PAYLOAD LENGTH SECTION (32 bit int)        |
+----------------+---------------+---------------+----------------+
|                          PAYLOAD SECTION                        |
+----------------+---------------+---------------+----------------+
|                          PAYLOAD CONTINUED....                  |
+----------------+---------------+---------------+----------------+
:                              .....                              :
+----------------+---------------+---------------+----------------+

On serialization, I would insert this type code into the frame, then append on the byte[] that is the serialized POJO, then send this frame!

On the other end, I could extract this type code from the frame and look it up in the shared mapping. Voilà! I now know which class was sent through the socket, and can deserialize smoothly.
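
A minimal sketch of the frame described above, assuming big-endian 32-bit integers for the type code and payload length. The class names, the type codes, and the contents of the shared mapping are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical frame: 4-byte type code, 4-byte payload length, then the payload.
final class TypedFrame {
    final int typeCode;
    final byte[] payload;

    TypedFrame(int typeCode, byte[] payload) {
        this.typeCode = typeCode;
        this.payload = payload;
    }

    // Serialize as [type code][payload length][payload]; ByteBuffer is big-endian by default.
    byte[] encode() {
        return ByteBuffer.allocate(8 + payload.length)
                .putInt(typeCode)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Parse one complete frame; rejects a length that does not fit the bytes received.
    static TypedFrame decode(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        int typeCode = buf.getInt();
        int length = buf.getInt();
        if (length < 0 || length > buf.remaining()) {
            throw new IllegalArgumentException("bad payload length: " + length);
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return new TypedFrame(typeCode, payload);
    }
}

final class TypedFrameDemo {
    // The shared mapping from the question; these codes and names are invented.
    static final Map<Integer, String> TYPE_NAMES = Map.of(1, "UserCreated", 2, "UserDeleted");

    public static void main(String[] args) {
        byte[] json = "{\"name\":\"ann\"}".getBytes(StandardCharsets.UTF_8);
        byte[] wire = new TypedFrame(1, json).encode();

        TypedFrame received = TypedFrame.decode(wire);
        System.out.println(TYPE_NAMES.get(received.typeCode) + " -> "
                + new String(received.payload, StandardCharsets.UTF_8));
    }
}
```

The receiver only learns an integer from the frame; everything it does with that integer depends on TYPE_NAMES, which is exactly the shared mapping both sides have to agree on.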

However, this seems bad, because now all of these contexts are coupled together with this shared mapping. What if I want to make my project micro-service oriented? It could get ugly, especially if the mapping could be different for a different use case.

It occurred to me that this is a problem people have solved, and maybe I just don't know the name for this type of thing, or the high level design patterns/ideas.

Could someone provide some context? What solutions already exist that already solve this problem? Is there a name for this type of thing?

score 5 · Accepted Answer · answered Mar 04 '20 at 21:36

The short answer is that you have to provide sufficient metadata to allow the receiver to recognize what to do with the message data. You want some approach that supports extensibility and loose coupling.

A client and server are already somewhat tightly coupled if your metadata assumes that all receivers are Java with POJO classes readily available to instantiate.

So, rather than describe metadata about what the original Java class was on the sender's side, we might prefer to describe the nature, content, and format of the message itself. It's probably easiest to assume one of either JSON or XML on both sides, since they at least offer a way to parse the syntax of messages so you can get on to the semantics of the messages: understanding the fields and their content.

Focus on describing the domain-oriented nature and intent of the various messages, and on versioning that allows evolution, ideally independent versioning of clients and servers. In some of these approaches you'll see some fields described as "must understand", for example, while other newly added fields can be safely ignored when not understood.
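
As a rough illustration of the "must understand" idea, here is a sketch in which each field carries a flag and the receiver decides per field whether it can ignore what it does not know. The Field type and the accept method are hypothetical, not taken from any particular schema system:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class MustUnderstand {
    // A field value plus a flag saying whether a receiver may ignore it when unknown.
    static final class Field {
        final String value;
        final boolean mustUnderstand;

        Field(String value, boolean mustUnderstand) {
            this.value = value;
            this.mustUnderstand = mustUnderstand;
        }
    }

    // Returns the fields this receiver knows. Unknown optional fields are dropped;
    // an unknown field flagged must-understand makes the whole message unacceptable.
    static Map<String, String> accept(Map<String, Field> message, Set<String> knownFields) {
        Map<String, String> understood = new LinkedHashMap<>();
        for (Map.Entry<String, Field> entry : message.entrySet()) {
            if (knownFields.contains(entry.getKey())) {
                understood.put(entry.getKey(), entry.getValue().value);
            } else if (entry.getValue().mustUnderstand) {
                throw new IllegalArgumentException(
                        "cannot process required field: " + entry.getKey());
            }
        }
        return understood;
    }
}
```

An older receiver can then keep working when a newer sender adds optional fields, and fails loudly only when the sender marks a new field as essential.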

Schemas are good to use here: they give you a way to describe what forms a legal message of some type, and a way to encode that description within a message. Different message kinds will either have their own schema or augment some common schema with additional field information.

Protocol buffers are good to use when you know what you're doing, want something fast and simple, and don't want the baggage of a lot of built-in flexibility. From Wikipedia:

Canonically, messages are serialized into a binary wire format which is compact, forward- and backward-compatible, but not self-describing (that is, there is no way to tell the names, meaning, or full datatypes of fields without an external specification). There is no defined way to include or refer to such an external specification (schema) within a Protocol Buffers file. The officially supported implementation includes an ASCII serialization format, but this format—though self-describing—loses the forward- and backward-compatibility behavior, and is thus not a good choice for applications other than debugging.

Of course, you can include your own string or id number to name or identify the schema, but now you're inventing a custom schema system, and it might be better to use an existing one as there are lots of pitfalls to encounter.

score 3 · Answer 2 · answered Mar 05 '20 at 00:04

The fundamental basis of communication is that the sender wants the receiver to reach a desirable state (otherwise why would the sender bother?).

To achieve this, the sender must send a message that will cause the desired outcome in a well-behaved receiver (an insane or subversive receiver cannot be reasoned with anyway...).

Immediately this requires some form of agreement between sender and receiver. The goals of the sender need to be achievable. (If the sender's hat moves around during the communication, then each participant who wears it needs to be able to reasonably achieve their goals.)

So the question is: what goals do you wish to achieve?

If you want to describe data, your friends are JSON, YAML, XML. Use them or take inspiration from them. They define a number of primitive types, and provide a mechanism for combining them into more complex structures. This allows you to pass arbitrary arrangements of data without having to agree upfront on the structure. The data itself describes its format. The downside is that it's not immediately obvious to the client how to deal with the message.

If you want to describe specific messages, then block-style formats like PNG (yes, the image format; take a look at the spec) and BitTorrent are your friends. Each block has a header with the total block size, potentially some management flags, and a type identifier. This allows messages to be skipped if irrelevant or indecipherable, and those messages which are understood can be safely deserialised. This is great for latency, as the communication is streamlined and any reaction to a message can be done very quickly. The downside is that the messages have to be agreed upon in advance. You can get around this by making one of these messages transmit just data (like JSON).
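
A small sketch of why the size in the header matters: a receiver can step over a block it does not recognise and carry on with the next one. The layout here (4-byte length, 4-byte type, body) is a simplification chosen for the example rather than the actual PNG or BitTorrent layout, and the class and method names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class BlockReader {
    // Builds one block: [body length][type code][body].
    static byte[] block(int type, byte[] body) {
        return ByteBuffer.allocate(8 + body.length)
                .putInt(body.length)
                .putInt(type)
                .put(body)
                .array();
    }

    // Walks a sequence of blocks and returns the type codes of those it understood.
    // Blocks with an unknown type are skipped using the length from their header.
    static List<Integer> readKnown(byte[] stream, Set<Integer> knownTypes) {
        ByteBuffer buf = ByteBuffer.wrap(stream);
        List<Integer> handled = new ArrayList<>();
        while (buf.remaining() >= 8) {
            int length = buf.getInt();
            int type = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                throw new IllegalArgumentException("truncated or corrupt block");
            }
            if (knownTypes.contains(type)) {
                byte[] body = new byte[length];
                buf.get(body);          // a real receiver would deserialize body here
                handled.add(type);
            } else {
                buf.position(buf.position() + length);  // skip the unknown block
            }
        }
        return handled;
    }
}
```

Because skipping needs only the length, a sender can introduce a new block type without breaking receivers that have never heard of it.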

If you want to have an extensible language, take a look at the FTP, VT (Virtual Terminal), and Telnet wire protocols. They have a special kind of meta message for negotiating which extensions are available and which are preferred, and then for enabling those extensions. This provides a kind of half-way point. Both sides have to reach an agreement but can agree to different sets of messages, and may choose to enable or disable those messages throughout the communication. This allows new message sets to be defined later, and only when both sides have implemented them will they (possibly) be used.
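
Stripped of all wire details, the negotiation idea can be sketched as one side offering extensions in preference order and the other side answering with what it supports. This collapses the real protocols' back-and-forth into a single step, and the extension names used with it are invented:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ExtensionNegotiation {
    // Returns the extensions both sides support, keeping the offering side's preference order.
    static List<String> negotiate(List<String> offeredInPreferenceOrder, Set<String> supportedByPeer) {
        List<String> enabled = new ArrayList<>();
        for (String extension : offeredInPreferenceOrder) {
            if (supportedByPeer.contains(extension)) {
                enabled.add(extension);
            }
        }
        return enabled;
    }

    public static void main(String[] args) {
        List<String> enabled = negotiate(
                List.of("compress", "binary", "echo"),   // what this side would like to use
                Set.of("echo", "compress"));             // what the peer has implemented
        System.out.println(enabled);
    }
}
```

Only messages belonging to the agreed extensions are then sent, so a new extension stays dormant until both ends have implemented it.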

score 1 · Answer 3 · answered Mar 04 '20 at 20:33

I'd consider a design that requires disambiguation by default broken unless there is a specific reason for it. Now, are these objects completely orthogonal, or are they polymorphic? By polymorphic I mean there is some sort of base class, like

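using System;

// Base type: every event carries a discriminator in Type.
class EventBase {
     public string Sender { get; set; }
     public DateTime Time { get; set; }
     public string Type { get; set; }   // names the concrete event type
}

// Concrete event: adds its own fields on top of the base ones.
class UsernameChangedEvent : EventBase {
     public string Name { get; set; }
}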

In this scenario, you can often deserialize into the base type, read the actual type, then deserialize into the actual type. See also: Polymorphism in protocol buffers
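
A sketch of that two-step decode, written in Java to match the question and using Java versions of the two event classes above (the time field is left out for brevity). The "key=value" line format and the parse helper are stand-ins for whatever serializer you actually use, and the type string is an assumed convention:

```java
import java.util.HashMap;
import java.util.Map;

class EventBase {
    String sender;
    String type;
}

class UsernameChangedEvent extends EventBase {
    String name;
}

final class TwoStepDecoder {
    // Stand-in for a real parser: reads "key=value" lines into a map.
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new HashMap<>();
        for (String line : text.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                fields.put(line.substring(0, eq), line.substring(eq + 1));
            }
        }
        return fields;
    }

    static EventBase decode(String text) {
        Map<String, String> fields = parse(text);

        // Step 1: look only at the base-class field that names the concrete type.
        String type = fields.get("type");

        // Step 2: build the concrete type it names and fill in its own fields.
        EventBase event;
        if ("usernamechanged".equals(type)) {
            UsernameChangedEvent changed = new UsernameChangedEvent();
            changed.name = fields.get("name");
            event = changed;
        } else {
            event = new EventBase();  // unknown type: keep only the base fields
        }
        event.sender = fields.get("sender");
        event.type = type;
        return event;
    }
}
```

For example, TwoStepDecoder.decode("type=usernamechanged\nsender=alice\nname=bob") yields a UsernameChangedEvent, while an unrecognised type still yields an EventBase with the common fields populated.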

We (developers) worked long and hard to make things type safe. In general I would recommend keeping it that way.
