23

I don't understand why I keep seeing async/await recommended (or sometimes, even enforced) for ASP.NET Core web applications and APIs.

As far as I can tell, every request is already being run on a thread pool (as empirically tested by logging the thread ID during each request), so making all calls use async/await within your webmethods will, at best, move the execution from your thread pool to a different thread pool.

It doesn't free up the socket, because, well, the connection is still open and the client is still waiting (synchronously or not) for a response.

There must be something I don't get, but what?

Medinoc
  • 375

4 Answers

50

As far as I can tell, every request is already being run on a thread pool (as empirically tested by logging the thread ID during each request), so making all calls use async/await within your webmethods will, at best, move the execution from your thread pool to a different thread pool.

The second part of that assumption is incorrect. async/await calls, assuming they are IO calls, are not offloaded to a different thread pool thread.

Essentially, while IO happens, the thread that encountered an await is free to pick up other requests. This improves the throughput of the web application. The fundamental reason is that IO is not done by the CPU but by the various IO devices in the PC (disk, network card, etc.); the CPU merely coordinates them. A synchronous call simply blocks the application thread until the IO device finishes: the thread sits there doing nothing and cannot serve other requests, which is not ideal for maximum throughput.
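How much this matters for throughput can be sketched in a few lines. This is a minimal console sketch, with Task.Delay standing in for true async IO (like real async IO, it holds no thread while it "waits"):

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

class AsyncThroughput
{
    static async Task Main()
    {
        // 20 simulated IO waits of 200 ms each. Awaited concurrently they
        // all overlap, so the total wall time is close to one wait, not
        // twenty; no thread is occupied while they are in flight.
        var sw = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, 20).Select(_ => Task.Delay(200)));
        Console.WriteLine($"20 overlapped waits took ~{sw.ElapsedMilliseconds} ms");
    }
}
```

Blocking for the same waits with Thread.Sleep would pin 20 pool threads for the full 200 ms each; with await, those threads are back in the pool serving other requests.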

This is a pretty good read on the matter:

https://blog.stephencleary.com/2013/11/there-is-no-thread.html

It doesn't free up the socket, because, well the connection is still open and the client is still waiting (synchronously or not) for a response.

A simplified view: Your server will bind a listener on a port (80 or 443 usually). When a request comes in, a new socket is created for every single connection (you can't have the same socket shared between 2 clients). The simplified workflow is like this:

  1. Server binds the listener port.
  2. A connection comes in.
  3. A socket is created between server and client.
  4. The request is assigned to a thread pool thread and processing begins -> this is where your async happens.
  5. The listener is again free to serve a new connection. Repeat steps 2-4.

Note that steps 4 and 5 happen in parallel.

Async in step 4 allows the physical thread to pick up multiple sockets from the listener.

There's a hard limit on how many requests can be processed at the same time. As you correctly identified, there is a limit on how many sockets you can have open, and you cannot simply close a socket on someone; that is true. However, the socket limit is in the range of tens of thousands, whereas the thread limit is in the thousands. So in order to fully saturate your sockets, which is the ideal of 100% hardware usage, you need to manage your threads better, and that is where async/await comes in.
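You can query those thread ceilings on your own machine; a minimal sketch (the exact numbers vary by .NET version, OS, and hardware):

```csharp
using System;
using System.Threading;

class PoolLimits
{
    static void Main()
    {
        // Ask the runtime for its worker-thread and IO-thread ceilings.
        ThreadPool.GetMaxThreads(out int workerThreads, out int ioThreads);
        Console.WriteLine($"max worker threads: {workerThreads}");
        Console.WriteLine($"max IO threads: {ioThreads}");
    }
}
```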

When a thread processing at step 4 encounters an await on async IO, it will simply return to the pool and be ready to process another request. The async IO device will send a notification to the CPU when it is done, so the processing of the request that was interrupted can continue. In the case of web APIs, the thread continuing after an await is not always the thread that encountered the await. This can be configured using ConfigureAwait on applications that do care about thread affinity (not the case for web API). See https://stackoverflow.com/questions/18097471/what-does-synchronizationcontext-do and https://learn.microsoft.com/en-us/archive/msdn-magazine/2011/february/msdn-magazine-parallel-computing-it-s-all-about-the-synchronizationcontext for more details.
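You can observe the thread change in a plain console app, which, like ASP.NET Core, has no SynchronizationContext. A minimal sketch (whether the two IDs actually differ on a given run is up to the scheduler, so don't assert on it):

```csharp
using System;
using System.Threading.Tasks;

class ThreadHop
{
    static async Task Main()
    {
        int before = Environment.CurrentManagedThreadId;
        await Task.Delay(50);    // thread returns to the pool here
        int after = Environment.CurrentManagedThreadId;
        // The continuation is scheduled on whichever pool thread is free,
        // which is often, but not necessarily, a different one.
        Console.WriteLine($"before await: thread {before}, after await: thread {after}");
    }
}
```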

You can imagine this as a clown juggling 3-4 balls with just one hand. The thread is the clown. The balls in the air are async IO operations handled by IO devices. The ball in the clown's hand is the request currently being actively processed by that thread. If the clown weren't allowed to throw balls in the air, the number of hands (one, in this case) would limit how many balls he could handle.

Some more clarifications from comments:

  1. IO is async in nature. The synchronous IO wait happens in the application level APIs and libraries (even if they are provided by the OS).
  2. Async-await allows applications to fully adapt to the async nature of IO.
  3. We are not talking about Task.Run here; its use case is different, and async/await is used there merely for convenience.
Ccm
  • 2,174
12

A threadpool does not have infinite threads. Each time you synchronously wait, you are holding onto a thread and doing nothing with it.

If you instead await, the suspension will bubble up to the threadpool's work scheduler, which can then use that thread to do other work. When the underlying action becomes resumable, the scheduler will give you a (probably different) thread to continue on.
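That bubbling is visible from the caller's side: an async method runs synchronously until its first await on something incomplete, then hands an unfinished Task back. A minimal sketch:

```csharp
using System;
using System.Threading.Tasks;

class Bubble
{
    static async Task Inner()
    {
        Console.WriteLine("inner: before await");
        await Task.Delay(50);     // suspension bubbles up to the caller here
        Console.WriteLine("inner: after await");
    }

    static async Task Main()
    {
        Task t = Inner();         // returns control at the first await
        Console.WriteLine($"caller sees IsCompleted = {t.IsCompleted}"); // False
        await t;                  // now actually wait for it to finish
        Console.WriteLine($"caller sees IsCompleted = {t.IsCompleted}"); // True
    }
}
```

While `t` is incomplete, the calling thread is not stuck inside `Inner`; it is free to do other work (in a web server, to serve other requests).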

Caleth
  • 12,190
9

It sounds to me like there are 2 wrong assumptions here.

  1. The point of async/await isn't to free up the socket/connection. You're freeing up the thread to do other work during some async operation.

  2. The point of async/await isn't to do long, synchronous work on another thread. You might await a database read across the network, so that the thread doesn't need to stall until the database responds (if it even does). You free up your thread so that other requests can be processed while you wait; then, when the response arrives, you get added back into the processing queue, receive a thread, and continue your work.
    Async/await frees your threads on the assumption that someone else is doing the work.

Let's pretend threads are humans and a microwave is a network call.
If you're heating something in the microwave, watching it until it's ready is "polling".
If you push a button and let it ring when it's done, that's an interrupt/signal.
Async/await is pushing the button, going to watch tv, then going back when the microwave tells you to.
Waiting for the signal frees you up, because you're delegating the heating to somewhere else (another machine). Heating is an asynchronous task.

But if your heating is synchronous (i.e., a person needs to do it), then you're basically putting the food in a microwave, walking over to your TV, walking back to the microwave, taking the food out, rubbing it until it's warm, putting it back into the microwave, walking to the TV, walking back to the microwave, and finally taking the food out again.
A lot of pointless overhead.


Let's make this a little more web-specific.

You're probably awaiting a DB transaction, network calls, or other inter-process communication.
Rather than poll until those other actors are done, you say "hey, run the rest of this code when the other guy is done" (i.e., scheduling a task to run after some signal is received).
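A minimal sketch of that shape; QueryUserAsync here is a hypothetical stand-in for a real driver call (e.g. an EF Core or Dapper query), with Task.Delay simulating the network round trip:

```csharp
using System;
using System.Threading.Tasks;

class DbSketch
{
    // Hypothetical stand-in for a real async DB call.
    static async Task<string> QueryUserAsync(int id)
    {
        await Task.Delay(50);    // simulated network round trip
        return $"user-{id}";
    }

    static async Task Main()
    {
        // Everything after the await is the "run the rest of this code
        // when the other guy is done" part: it resumes on the signal.
        string user = await QueryUserAsync(7);
        Console.WriteLine(user); // prints "user-7"
    }
}
```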

Mars
  • 273
0

It's not about CPU work; the async stuff is not for processing. It's for IO.

Once you realise that, it becomes a little easier to accept. It's still pointless rubbish though, as the async version of IO appears to be terribly slow, so you're robbing Peter to pay Paul, but with extortionate interest!

I think async was added to help with the responsiveness of UIs, from the concept that became popular in Node (which needed it because Node is single-threaded). For a web server, waiting on a thread, even for IO, is not a problem (i.e., you'll run out of sockets or memory before the number of threads becomes an issue).

So all you do is make your request slower for the benefit of "responsiveness", which for web servers is not an issue - the only way to make your web server feel more responsive is to complete the request as quickly as possible.

Don't forget: if you wait on IO using a single thread, Windows will not busy-wait that thread; you will wait on a synchronisation primitive, and the CPU will be free to process other threads. This applies to IO just as much - the OS syscall will block on an event for you.

The benefit appears to be trivially minor. You get to use fewer threads, but by the time you have too many threads, your system is overloaded and needs tuning to block incoming requests anyway. It adds nothing but, as we see from benchmarks, costs a lot. Async/await is a bad solution looking for a problem; it's like spending $10 in order to save $5.

gbjbaanb
  • 48,749
  • 7
  • 106
  • 173