48

I'm working on a PHP web application that depends on a few 3rd-party services. These services are well documented and provided by fairly large organisations.

I feel paranoid when working with responses from these APIs, which leads me to write validation code that checks that the responses match the structure and data types specified in the documentation. This mainly comes from the fact that the data is out of my control: if I blindly trust that it will be correct and it's not (maybe someone changes the JSON structure by accident), it could lead to unexpected behaviour in my application.

My question is, do you think this is overkill? How does everyone else handle this situation?

Vector Zita
  • 2,502

8 Answers

68

Absolutely. For starters, you never know whether somebody has hacked into your connection, so that the reply you receive doesn't come from the API at all.

And some time in the last two weeks, I think, Facebook changed an API without notice, which caused lots of iOS apps to crash. If someone had verified the reply, the API call would have failed, but without crashing the app.

(A very nice case I heard of that shows why validation is needed: a server provided information about goods a customer could buy. For dresses, it included the U.K. dress size as an integer, usually 36 to 52. Except that for one dress, the size was the string “40-42”. Without validation, that could easily cause a crash.)
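To make the dress-size case concrete, here is a minimal sketch of a type check at the boundary. It is written in Python purely for illustration; `parse_dress_size` is a hypothetical helper, not part of any real API.

```python
def parse_dress_size(raw):
    """Return the dress size as an int, or raise ValueError with a clear message.

    Hypothetical helper: the field is documented as an integer, so anything
    else (such as the string "40-42") is rejected instead of being passed on.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, int):
        raise ValueError(f"expected an integer dress size, got {raw!r}")
    return raw


def safe_dress_size(raw):
    """Return the size, or None when the value does not match the contract."""
    try:
        return parse_dress_size(raw)
    except ValueError:
        # The caller can hide the size or skip the item instead of crashing.
        return None
```

With this in place the unexpected string degrades to "size unknown" for one item rather than taking the whole page or app down.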

gnasher729
  • 49,096
43

Somebody else's API is your external interface. You shouldn't blindly trust anything that crosses that boundary. Your future debuggers will thank you for not propagating the other system's errors into yours.

17

Is your API-boundary also a trust-boundary?

As you are communicating with a remote system, that's nearly a certainty. Even if the remote system itself might be trusted, the medium might not be.

Failure to successfully and consistently verify all untrusted data may result in anything from a crash in the best case to a silent hostile takeover in the worst.

Is the API stable?

Even a trusted API might not be stable, in which case extra verification is needed, along with a plan for backing out, up to denying service until the problem is fixed.

Is the implementation behind the API well-tested, mature and reliable?

It doesn't matter whether the API is stable if the implementation fails to live up to it.

Always remember there is a tradeoff

More tests mean more code which might contain bugs, and will be rarely if ever exercised.

This code must be written, maintained, and debugged, all of which drains effort needed elsewhere too.

Also, comprehensively testing the failure cases is somewhere between hard and impossible without mocking the complete API, which likely leaves bugs undiscovered and lets more accumulate, even if more slowly than in comments.

Thus, some APIs are simply relied on to work, while others are (or at least should be) verified on each call to at least some extent.

Deduplicator
  • 9,209
3

Paranoid or not depends on how robust your software must be.

I think that if your checks have minimal extra implementation cost, they are OK.

Example:

  • if you communicate with services through XML, the structural verification can be done through an XSD schema.
  • in Java/C# you can have guard statements that throw an exception if the API contract is broken
    • Example: if you get a birthday from an external service, then the guard statement assert(birthday > '1900-01-01' and birthday < '2050-01-01') will throw an exception if the birthday has an implausible value
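The birthday guard above could look roughly like this. It is a sketch in Python rather than Java/C#, and it assumes the service sends the date as an ISO 8601 string (`YYYY-MM-DD`); the function name and the error wording are made up for illustration.

```python
from datetime import date

# Plausibility bounds taken from the guard statement in the answer.
EARLIEST = date(1900, 1, 1)
LATEST = date(2050, 1, 1)


def check_birthday(raw):
    """Parse an ISO date string and reject implausible values.

    Assumption: the external service sends birthdays as 'YYYY-MM-DD'.
    date.fromisoformat raises ValueError itself on malformed strings.
    """
    birthday = date.fromisoformat(raw)
    if not EARLIEST < birthday < LATEST:
        raise ValueError(f"implausible birthday from external service: {raw}")
    return birthday
```

The guard is only a few lines, which is the point of the answer: checks this cheap are easy to justify.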
Glorfindel
  • 3,167
k3b
  • 7,621
3

Absolutely. We have been caught out by this with Microsoft APIs, for example, and we were not even set up to log that in our Azure function application. So all we saw was that requests to our endpoints failed. It changed without any warning between hand testing / UAT and actual live use of our application.

Our unit tests still worked of course, because they used the schema from the Microsoft documentation (which had not been updated). I only knew because some other kind developer commented on the Microsoft documentation!

Make sure to log what you actually receive from external APIs (both requests to your endpoints and responses to your calls), and throw meaningful errors (as appropriate) in your application.
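As a rough sketch of "log what you actually got and fail with a meaningful error": the Python below uses only the standard `json` and `logging` modules. `ExternalApiError`, `decode_response`, and the logger name are hypothetical names chosen for the example, and in real use you would want to think about redacting sensitive fields before logging the body.

```python
import json
import logging

logger = logging.getLogger("external_api")  # hypothetical logger name


class ExternalApiError(Exception):
    """Raised when an external response does not match what we expect."""


def decode_response(service_name, raw_body, required_keys):
    """Decode a JSON body, logging the raw text whenever it is rejected."""
    logger.debug("raw response from %s: %r", service_name, raw_body)
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        logger.error("%s sent invalid JSON: %r", service_name, raw_body)
        raise ExternalApiError(f"{service_name}: response is not valid JSON") from exc
    if not isinstance(payload, dict):
        logger.error("%s sent a non-object body: %r", service_name, raw_body)
        raise ExternalApiError(f"{service_name}: expected a JSON object")
    missing = sorted(key for key in required_keys if key not in payload)
    if missing:
        logger.error("%s response lacks %s; body was %r", service_name, missing, raw_body)
        raise ExternalApiError(f"{service_name}: missing keys {missing}")
    return payload
```

When the upstream schema changes, the log then contains the exact body that broke the contract, instead of just "request failed".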

This actually gives me the willies with our current project, which relies on many external APIs - we have monitoring functions and E2E tests with Cypress for vital functions running every so often, so at least we know when it happens. We are still working on how to reliably know in advance...

Vector Zita
  • 2,502
kpollock
  • 135
3

Yes, but in most cases that should not be your personal concern.

For most languages there are parsers that turn a JSON response (or whatever your transfer format is) into your internal objects. They come with all the options needed to handle different writing styles, corner cases, escape characters, special character encodings etc. Their validation code is used by thousands of other applications. You should use one of these parsers if possible and rely on their validation methods instead of validating the syntax yourself. That is, they should throw exceptions, return error codes or otherwise complain if the input doesn't match what you specified (missing fields, strings not matching your defined pattern etc).

The only validation you might want to do yourself in your own code is that the response makes sense for your business logic. Don't try to reimplement the other service though; it does not make sense to fully validate that their response is correct: if you can do that locally then you don't need to call them. (Unless you deal with totally sensitive things/hard problems, in which case you could call multiple services and combine their results.) What you can do when you want to protect against some extreme level of disaster from malicious responses is to detect answers that are way out of bounds. For example, block a transaction in your bike rental service if the bill calculated externally for a single customer goes beyond $1000 or so. But be careful, one easily overlooks corner cases that are valid (e.g. a "virtual" customer that pays for their whole company's rentals for a year).
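A sanity check of that kind might be sketched like this (Python, with hypothetical names). The 1000 limit comes from the bike rental example; the `customer_is_exempt` flag is an assumption added to cover the valid corner case of a customer who legitimately runs up large bills.

```python
from decimal import Decimal, InvalidOperation

# Illustrative limit from the bike rental example; tune it to your business.
MAX_PLAUSIBLE_BILL = Decimal("1000")


def review_bill(raw_amount, customer_is_exempt=False):
    """Return "accept" or "hold" for an externally calculated bill.

    "hold" means: do not charge automatically, flag for manual review.
    """
    try:
        amount = Decimal(str(raw_amount))
    except InvalidOperation:
        return "hold"  # not a number at all
    if not amount.is_finite() or amount < 0:
        return "hold"  # NaN, infinity, or a negative bill
    if amount > MAX_PLAUSIBLE_BILL and not customer_is_exempt:
        return "hold"  # far beyond what a single customer normally owes
    return "accept"
```

Holding the transaction for review, rather than rejecting it outright, keeps the check from hurting the legitimate outliers.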

3

Your validation shouldn't be too restrictive. There is the "tolerant reader" pattern. It means that you should be as tolerant as possible when consuming data from other services. On the other side, there is the "Magnanimous Writer" pattern. Together, they help to produce more robust communication systems.

For example, in a JSON-based interface, you probably should allow unknown properties. This allows the other side to add new properties without breaking your side.
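A tolerant reader can be as simple as picking out the fields you need and ignoring everything else. The sketch below is in Python; the `id` and `name` fields are invented for the example.

```python
def read_user(payload):
    """Extract only the fields this application uses from a decoded JSON object.

    Unknown properties are ignored on purpose, so the other side can add
    fields without breaking us. Fields we do depend on are still checked.
    """
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    user_id = payload.get("id")
    name = payload.get("name")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int):
        raise ValueError(f"expected integer 'id', got {user_id!r}")
    if not isinstance(name, str):
        raise ValueError(f"expected string 'name', got {name!r}")
    return {"id": user_id, "name": name}
```

Strict about what you rely on, indifferent to what you don't: a new `nickname` property passes straight through, while a missing `id` still fails loudly.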

user355880
  • 909
-3

Yes, it's overkill in most scenarios for writing a web application.

I think of a third-party API as similar to a library from a package manager. Do you write tests for each of the libraries you use which aren't built into PHP?

Normally APIs are versioned and documented, and shouldn't have surprise changes, just like packages.

aaaaaa
  • 169