Determine the end of a malformed (bad) http request

Question

I'm implementing an HTTP server and wonder whether there is a defined way for a server to decide that a bad request has ended, so that it can

return the corresponding 400 status, and
accept the data that follows as a new request and start parsing it afresh.

The only idea that comes to mind is an ambiguous one: search the received data for the next thing that looks like a request line and start a new parse attempt from there. The problem is that the data of a bad request may itself contain such request-line-like data without it being intended as a separate, new request.

The same question arises for client-side parsing of malformed responses, so answers that take this case into account would be appreciated.

score 0 · Answer 1 · edited Oct 07 '21 at 08:14

The header block ends with \r\n\r\n. You simply parse each entry that you need to read and split it into its parts, using strtok or strstr, or manually.
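A minimal Python sketch of that idea, assuming the received bytes are collected in a buffer. The function name `split_header_block` is hypothetical, and the sketch only locates the blank line and splits the lines; it does not validate them.

```python
def split_header_block(buffer: bytes):
    """Return (start_line, header_lines, rest), or None if the block is incomplete."""
    end = buffer.find(b"\r\n\r\n")
    if end == -1:
        return None  # the terminating blank line has not arrived yet
    # Header bytes are decoded as ISO-8859-1 so that every byte maps to a character.
    head = buffer[:end].decode("iso-8859-1")
    rest = buffer[end + 4:]  # whatever follows the blank line (body or next message)
    start_line, *header_lines = head.split("\r\n")
    return start_line, header_lines, rest
```

Note that this only tells you where a well-formed header block ends. If the blank line never arrives, or the lines in front of it are garbage, the function cannot tell where the sender intended the message to stop.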

If you are asking more about the GET line, the RFC says:

The HTTP protocol does not place any a priori limit on the length of
a URI. Servers MUST be able to handle the URI of any resource they
serve, and SHOULD be able to handle URIs of unbounded length if they
provide GET-based forms that could generate such URIs. A server
SHOULD return 414 (Request-URI Too Long) status if a URI is longer
than the server can handle (see section 10.4.15).
  Note: Servers ought to be cautious about depending on URI lengths
  above 255 bytes, because some older client or proxy
  implementations might not properly support these lengths.
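The 414 recommendation quoted above could be applied roughly as follows. This is a sketch under assumptions: `MAX_TARGET_LENGTH` is an arbitrary limit chosen for illustration, and `check_request_line` is a hypothetical helper that only checks the overall shape of the line.

```python
MAX_TARGET_LENGTH = 8000  # assumed limit; use whatever your server can actually handle

def check_request_line(line: str):
    """Return None if the line looks acceptable, otherwise the status code to send."""
    parts = line.split(" ")
    if len(parts) != 3:
        return 400  # not "METHOD SP target SP version"
    method, target, version = parts
    if not method or not target or not version.startswith("HTTP/"):
        return 400
    if len(target) > MAX_TARGET_LENGTH:
        return 414  # request target longer than the server is willing to handle
    return None
```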

Please refer to RFC 2616 to make your web server react according to the standard.

N.B.: make sure you are also ready to handle chunked transfer encoding (Transfer-Encoding: chunked) afterwards if you want to support HTTP/1.1; without it your server is limited to HTTP/1.0 at best.

Reizo · Accepted Answer · 2018-07-06T11:09:58.567

After some consideration it became quite clear that there's no universally applicable way to determine the end of a malformed message, since it is the self-describing bits of information in a message (e.g. the Content-Length header field) that allow the recipient to understand it in the first place, and in a malformed message these can no longer be trusted. If, for example, a response looked like this:

HTTP/1.1 200 OK
Content-Length: [ consider correct content length here ]
Content-Type: text/html
<html>
    <head>
        <title>Title</title>
    </head>
    <body>
HTTP OK status messages look like this:
HTTP/1.1 200 OK
    </body>
</html>

The client parser would most likely fail at the first <, since after the single line break following the Content-Type header it expects another header field name, and < is not allowed in one. It should then (probably) not 'search' the following data for another valid HTTP response, since it may receive message bodies like the one above, which contains HTTP/1.1 200 OK without that being intended as a new response.
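To illustrate why a strict parser stops at that point, here is a small Python sketch. The helper `parse_header_line` is hypothetical; the character class follows the "token" characters of the HTTP grammar, which do not include <.

```python
import re

# Characters allowed in a header field name ("tchar" in the HTTP grammar).
TOKEN = re.compile(r"^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$")

def parse_header_line(line: str):
    """Split 'Name: value' into (name, value); raise ValueError if malformed."""
    name, sep, value = line.partition(":")
    if not sep or not TOKEN.match(name):
        raise ValueError(f"malformed header line: {line!r}")
    return name, value.strip()

# Feeding it the lines from the example: the header line parses,
# the first line of the HTML body does not.
for line in ["Content-Type: text/html", "<html>"]:
    try:
        print(parse_header_line(line))
    except ValueError as err:
        print("giving up:", err)
```

Once the error is raised, the parser knows only that the message is broken, not where it ends.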

Thus the best reaction to a malformed HTTP message appears to be to close the connection, since any other attempt to interpret the data that follows is inevitably ambiguous.
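On the server side, that could look roughly like the following Python sketch. It assumes a plain blocking socket; `reject_and_close` and the exact header set of the response are illustrative choices, not something prescribed above.

```python
import socket

BAD_REQUEST = (
    b"HTTP/1.1 400 Bad Request\r\n"
    b"Connection: close\r\n"
    b"Content-Length: 0\r\n"
    b"\r\n"
)

def reject_and_close(conn: socket.socket) -> None:
    """Send a 400 response, then close without interpreting any further data."""
    try:
        conn.sendall(BAD_REQUEST)
    finally:
        conn.close()  # whatever the peer sends next is never parsed
```

The client then has to open a new connection for its next request, which gives both sides an unambiguous starting point again.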

As far as I know, however, this is not specified anywhere in the RFCs, perhaps because they are more about defining standard behaviour than about handling non-standard behaviour.
