Why use an opaque "handle" that requires casting in a public API rather than a typesafe struct pointer?

Question

I'm evaluating a library whose public API currently looks like this:

libengine.h

/* Handle, used for all APIs */
typedef size_t enh;
/* Create new engine instance; result returned in handle /
int en_open(int mode, enh handle);
/* Start an engine */
int en_start(enh handle);
/* Add a new hook to the engine; hook handle returned in h2 /
int en_add_hook(enh handle, int hooknum, enh h2);

Note that enh is a generic handle, used as a handle to several different datatypes (engines and hooks).

Internally, most of these APIs of course cast the "handle" to an internal structure which they've malloc'd:

engine.c

struct engine
{
    // ... implementation details ...
};
int en_open(int mode, enh handle)
{
    struct engine en;
en = malloc(sizeof(*en));
if (!en)
    return -1;

// ...initialization...

*handle = (enh)en;
return 0;

}
int en_start(enh handle)
{
    struct engine en = (struct engine)handle;
return en-&gt;start(en);

}

Personally, I hate hiding things behind typedefs, especially when it compromises type safety. (Given an enh, how do I know to what it's actually referring?)

So I submitted a pull request, suggesting the following API change (after modifying the entire library to conform):

libengine.h

struct engine;           /* Forward declaration */
typedef size_t hook_h;    /* Still a handle, for other reasons */
/* Create new engine instance, result returned in en /
int en_open(int mode, struct engine *en);
/* Start an engine /
int en_start(struct engine en);
/* Add a new hook to the engine; hook handle returned in hh /
int en_add_hook(struct engine en, int hooknum, hook_h *hh);

Of course, this makes the internal API implementations look a lot better, eliminating casts, and maintaining type safety to/from the consumer's perspective.

libengine.c

struct engine
{
    // ... implementation details ...
};
int en_open(int mode, struct engine *en)
{
    struct engine _e;
_e = malloc(sizeof(*_e));
if (!_e)
    return -1;

// ...initialization...

*en = _e;
return 0;

}
int en_start(struct engine *en)
{
    return en->start(en);
}

I prefer this for the following reasons:

Added type safety
Improved clarity of types and their purpose
Removed casts and typedefs
It follows the recommended pattern for opaque types in C

However, the owner of the project balked at the pull request (paraphrased):

Personally I don't like the idea of exposing the struct engine. I still think the current way is cleaner & more friendly.

Initially I used another data type for hook handle, but then decided to switch to use enh, so all kind of handles share the same data type to keep it simple. If this is confusing, we can certainly use another data type.

Let's see what others think about this PR.

This library is currently in a private beta stage, so there isn't much consumer code to worry about (yet). Also, I've obfuscated the names a bit.

How is an opaque handle better than a named, opaque struct?

_{Note: I asked this question at Code Review, where it was closed.}

score 34 · Answer 1 · answered Aug 28 '15 at 00:54

The "simple is better" mantra has become too much dogmatic. Simple is not always better if it complicates other things. Assembly is simple - each command is much simpler than higher-level languages commands - and yet Assembly programs are more complex than higher-level languages that do the same thing. In your case, the uniform handle type enh makes the types simpler at the cost of making the functions complex. Since usually project's types tend to grow in sub-linear rate compared to it's functions, as the project gets bigger you usually prefer more complex types if it can make the functions simpler - so in this regard your approach seems to be the correct one.

The project's author is concerned that your approach is "exposing the struct engine". I would have explained to them that it's not exposing the struct itself - only the fact that there is a struct named engine. The user of the library already needs to be aware of that type - they need to know, for example, that the first argument of en_add_hook is of that type, and the first argument is of a different type. So it actually makes the API more complex, because instead of having the function's "signature" document these types it needs to be documented somewhere else, and because the compiler can no longer verify the types for the programmer.

One thing that should be noted - your new API makes user code a bit more complex, since instead of writing:

enh en;
en_open(ENGINE_MODE_1, &en);

They now need more complex syntax to declare their handle:

struct engine* en;
en_open(ENGINE_MODE_1, &en);

The solution, however, is quite simple:

struct _engine;
typedef struct _engine* engine

and now you can straightforwardly write:

engine en;
en_open(ENGINE_MODE_1, &en);

score 16 · Answer 2 · answered Aug 28 '15 at 08:46

There seems to be a confusion on both sides here:

using a handle approach does not requiring using a single handle type for all handles
exposing the struct name does not expose its details (only its existence)

There are advantages to using handles rather than bare pointers, in a language like C, because handing over the pointer allows direct manipulation of the pointee (including calls to free) whereas handing over a handle requires that the client go through the API to perform any action.

However, the approach of having a single type of handle, defined via a typedef is not type safe, and may cause lots of griefs.

My personal suggestion would thus be to move toward type safe handles, which I think would satisfy the both of you. This is accomplished rather simply:

typedef struct {
    size_t id;
} enh;

typedef struct {
    size_t id;
} oth;

Now, one cannot accidentally pass 2 as a handle nor can one accidentally pass a handle to a broomstick where a handle to the engine is expected.

So I submitted a pull request, suggesting the following API change (after modifying the entire library to conform)

That is your mistake: before engaging in significant work on an open source library, contact the author(s)/maintainer(s) to discuss the change upfront. It will allow the both of you to agree on what to do (or not do), and avoid unnecessary work and the frustration that results from it.

score 3 · Answer 3 · answered Aug 28 '15 at 01:24

I suspect the real reason is inertia, that's what they've always done and it works so why change it?

The main reason I can see is that the opaque handle lets the designer put anything at all behind it, not just a struct. If the API returns and accepts multiple opaque types they all look the same to the caller and there's never any compilation problems or recompiles needed if the fine print changes. If en_NewFlidgetTwiddler(handle **newTwiddler) changes to return a pointer to the Twiddler instead of a handle, the API doesn't change and any new code will silently be using a pointer where before it was using a handle. As well, there's no danger of the OS or anything else quietly "fixing" the pointer if it gets passes across boundaries.

The disadvantage of that, of course, is that the caller can feed anything at all into it. You have a 64 bit thing? Shove it into the 64 bit slot in the API call and see what happens.

en_TwiddleFlidget(engine, twiddler, flidget)
en_TwiddleFlidget(engine, flidget, twiddler)

Both compile but I bet only one of them does what you want.

score 3 · Answer 4 · 2016-01-14T16:53:16.260

Déjà vu

How is an opaque handle better than a named, opaque struct?

I encountered the exact same scenario, only with some subtle differences. We had, in our SDK, a lot of things like this:

typedef void* SomeHandle;

My mere proposal was to make it match our internal types:

typedef struct SomeVertex* SomeHandle;

To third parties using the SDK, it should make no difference whatsoever. It's an opaque type. Who cares? It has no effect on ABI* or source compatibility, and using new releases of the SDK requires the plugin to be recompiled anyway.

_{* Note that, as gnasher points out, there can actually be cases where the size of something like pointer to struct and void* may actually be a different size in which case it would affect ABI. Like him, I've never encountered it in practice. But from that standpoint, the second could actually improve portability in some obscure context, so that's another reason to favor the second, albeit probably moot for most people.}

Third Party Bugs

Furthermore, I had even more cause than type safety for internal development/debugging. We already had a number of plugin developers who had bugs in their code because two similar handles (Panel andPanelNew, i.e.) both used a void* typedef for their handles, and they were accidentally passing in the wrong handles to the wrong places as a result of just using void* for everything. So it was actually causing bugs on the side of those using the SDK. Their bugs also cost the internal development team enormous time, since they would send bug reports complaining about bugs in our SDK, and we'd have to debug the plugin and find that it was actually caused by a bug in the plugin passing the wrong handles to the wrong places (which is easily allowed without even a warning when every handle is an alias for void* or size_t). So we were needlessly wasting our time providing a debugging service for third parties because of mistakes caused on their part by our desire for conceptual purity in hiding away all internal information, even the mere names of our internal structs.

Keeping the Typedef

The difference is that I was proposing that we do stick to the typedef still, not to have clients write struct SomeVertex which would affect source compatibility for future plugin releases. While I personally like the idea of not typedefing away struct in C, from an SDK perspective, the typedef can help since the whole point is opacity. So there I'd suggest relaxing this standard just for the publicly-exposed API. To clients using the SDK, it should not matter whether a handle is a pointer to a struct, an integer, etc. The only thing that matters to them is that two different handles do not alias the same data type so that they don't incorrectly pass in the wrong handle to the wrong place.

Type Information

Where it matters most to avoid casting is to you, the internal devs. This kind of aesthetic of hiding away all internal names from the SDK is some conceptual aesthetic that comes at the significant cost of losing all type information, and requiring us to needlessly sprinkle casts into our debuggers to get at critical information. While a C programmer should largely be accustomed to this in C, to require this needlessly is just asking for trouble.

Conceptual Ideals

In general you want to watch out for those types of developers who put some conceptual idea of purity far above all practical, daily needs. Those will drive the maintainability of your codebase to the ground in seeking some Utopian ideal, making the whole team avoid suntan lotion in a desert out of fear that it's unnatural and may cause a vitamin D deficiency while half the crew are dying from skin cancer.

User-End Preference

Even from the strict user-end standpoint of those using the API, would they prefer a buggy API or an API that works well but exposes some name they could hardly care about in exchange? Because that's the practical trade-off. Losing type information needlessly outside of a generic context is increasing the risk of bugs, and from a large-scale codebase in a team-wide setting over a number of years, Murphy's law tends to be quite applicable. If you superfluously increase the risk of bugs, chances are you'll at least get some more bugs. It doesn't take too long in a large team setting to find that every kind of human mistake imaginable will eventually go from a potential into a reality.

So perhaps that's a question to pose to users. "Would you prefer a buggier SDK or one that exposes some internal opaque names that you'll never even care about?" And if that question seems to put forth a false dichotomy, I'd say more team-wide experience in a very large-scale setting is needed to appreciate the fact that a higher risk for bugs will ultimately manifest real bugs in the long term. It matters little how much the developer is confident in avoiding bugs. In a team-wide setting, it helps more to think about the weakest links, and at least the easiest and quickest ways to prevent them from tripping.

Proposal

So I'd suggest a compromise here which will still give you the ability to retain all the debugging benefits:

typedef struct engine* enh;

... even at the cost of typedefing away struct, will that really kill us? Probably not, so I recommend some pragmatism on your part as well, but more so to the developer who would prefer to make debugging exponentially harder by using size_t here and casting to/from integer for no good reason except to further hide information which is already 99% hidden to the user and cannot possibly do any more harm than size_t.

score 2 · Answer 5 · answered Sep 02 '15 at 06:58

Here are a situation where opaque handle is needed;

struct SimpleEngine {
    int type;  // always SimpleEngine.type = 1
    int a;
};

struct ComplexEngine {
    int type;  // always ComplexEngine.type = 2
    int a, b, c;
};

int en_start(enh handle) {
    switch(*(int*)handle) {
    case 1:
        // treat handle as SimpleEngine
        return start_simple_engine(handle);
    case 2:
        // treat handle as ComplexEngine
        return start_complex_engine(handle);
    }
}

When the library has two or more struct types that have a same header part of fields, like "type" in the above, these struct types can be considered having a common parent struct (like a base class in C++).

You can define the header part as "struct engine", like this;

struct engine {
    int type;
};

struct SimpleEngine {
    struct engine base;
    int a;
};

struct ComplexEngine {
    struct engine base;
    int a, b, c;
};

int en_start(struct engine *en) { ... }

But it's a optional decision because type-casts are needed regardless of using struct engine.

Conclusion

In some case, there are reasons why opaque handles are used instead of opaque named structs.

score 0 · Answer 6 · answered Aug 28 '15 at 01:25

I believe that the attitude stems from a long-time philosophy to defend a C library API from abuse by beginners.

In particular,

Library authors know it's a pointer to the struct, and the struct's details is visible to library code.
All experienced programmers who use the library also knows it's a pointer to some opaque structs;
- They had enough hard painful experience to know not to mess with the bytes stored in those structs.
Inexperienced programmers know neither.
- They will try to memcpy the opaque data or increment the bytes or words inside the struct. Go hacking.

The long-time traditional countermeasure is to:

Masquerade the fact that an opaque handle is actually a pointer to an opaque struct that exists in the same process-memory-space.
- To do so by claiming it is an integer value having same number of bits as a void*
- To be extra-circumspect, masquerade the pointer's bits as well, e.g.
  struct engine* peng = (struct engine*)((size_t)enh ^ enh_magic_number);

This is just to say it has long traditions; I had no personal opinion on whether it's right or wrong.

Why use an opaque "handle" that requires casting in a public API rather than a typesafe struct pointer?

6 Answers6

Conclusion