33

Sometimes computers stutter a bit when they're working hard, to the point where the mouse pointer freezes for a fraction of a second or stutters intermittently for a few seconds. The same sometimes happens with keyboard input, even in very basic applications (so presumably the cause isn't complicated or costly event-loop handling within the application itself). This is very frustrating as a user.

Why can't (or why don't) operating systems absolutely prioritise user input (and repainting thereof) in threading and process scheduling?

A few ideas, maybe it's one/some of these:

  • Operating systems don't force applications to explicitly separate immediate user-input handling from any knock-on or background processing, so responsiveness relies on every application being well engineered. And not all applications are. (A sketch of that separation follows this list.)
  • The event loop and repainting require every application that could potentially be visible or responsive to user input to weigh in on what happens in the next tick, and sometimes lots of complicated things happen at once. Perhaps the user-input thread in some application even gets blocked waiting for another thread?
  • The event loop and repainting mostly only involve the currently active application, but operating systems sometimes let background processes hog the CPU. If that's the case, why do they let it happen?
  • It's deemed more important to let threads operate in short bursts, to avoid the slowdown caused by context switching, than to run them strictly in priority order (presumably there is always some cutoff/trade-off here).
    • I don't know modern CPU architectures, but presumably executing 10 instructions on virtual thread 1, then 2, then 3 is faster than cycling through them executing 1 instruction each, 10 times over. (Scale 10 and 1 up or down as appropriate.)
  • Something about interrupts that I don't understand.
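
To make the first bullet concrete, here is a minimal sketch of that separation (plain C++ with invented names, not any particular OS or GUI API): the "UI" loop handles each input event immediately and pushes anything expensive onto a background queue, so input handling never waits on slow work.

    // Minimal sketch: the UI loop only reacts to input; expensive work goes to a worker thread.
    #include <chrono>
    #include <condition_variable>
    #include <functional>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    class WorkQueue {
    public:
        void push(std::function<void()> job) {
            {
                std::lock_guard<std::mutex> lock(m_);
                jobs_.push(std::move(job));
            }
            cv_.notify_one();
        }
        void run() {  // worker thread body
            for (;;) {
                std::function<void()> job;
                {
                    std::unique_lock<std::mutex> lock(m_);
                    cv_.wait(lock, [&] { return !jobs_.empty(); });
                    job = std::move(jobs_.front());
                    jobs_.pop();
                }
                if (!job) return;  // an empty job is the shutdown signal
                job();
            }
        }
    private:
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::function<void()>> jobs_;
    };

    int main() {
        WorkQueue background;
        std::thread worker(&WorkQueue::run, &background);

        // Stand-in for the event loop: acknowledge "input" immediately, defer the slow part.
        for (int event = 0; event < 3; ++event) {
            std::cout << "input event " << event << " handled immediately\n";
            background.push([event] {
                std::this_thread::sleep_for(std::chrono::milliseconds(200));  // pretend-expensive work
                std::cout << "background job for event " << event << " finished\n";
            });
        }

        background.push({});  // shutdown
        worker.join();
    }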

Admittedly my experience is only on Windows - keen to hear whether other OSes have this solved.

7 Answers

26

As you may have noticed, there's a category of application that tries really hard to avoid input lag and only occasionally fails at it: games. Even then it's not uncommon for players to notice occasional slowdowns and stuttering.

There is an excellent blog that gives examples of hunting down these issues. A follow-up post tries to find the exact instructions responsible: What is Windows doing while hogging that lock.

As you guessed, it's the user-input thread in all applications getting blocked because the operating system holds a global lock while cleaning up a list of objects. Repeatedly.

Windows, because of its long history and its origins in the 16-bit, single-CPU, cooperative-multitasking era, isn't especially good at isolating processes from each other's performance problems, especially when UI elements are involved.

pjc50
  • 15,223
25

I would like to answer this question from a high-level, marketing perspective rather than a low-level, technical one.

All of the current mainstream operating systems are so-called general-purpose operating systems. That's not really a head-scratcher: a special-purpose OS is by definition only useful for a small group of people, so it can't really become mainstream.

In programming, there is a near-universal rule that latency and throughput trade off against each other: improving throughput worsens latency, and improving latency worsens throughput. You can't have both very low latency and very high throughput.

However, a general-purpose OS must be useful for many different purposes, so it must make a trade-off where it offers both "good enough" latency and "good enough" throughput. But that means you can't really have very low latency or very high throughput in a general-purpose OS. (And this applies to hardware, to networking, and to many other things as well; e.g. heavily superscalar CPUs with deep pipelines have high steady-state throughput but terrible latency when there's a pipeline stall, a cache miss, etc.)

Windows, for example, is used for gaming and file servers and word processing and reading emails and 3D modeling. It runs on tiny home PCs, phones, tower workstations, and 500-core servers with terabytes of RAM. It is used in places where power is of no concern and in places where power is everything.

You simply can't have an OS that is perfect at all of those things.

Now, the question is: why don't we simply give up on the idea of a general-purpose OS and use different OSes for different purposes instead? Well, commonality is economically efficient. You only have to write drivers for one OS, not 100. You only have to train administrators on one OS, not 100. Some software is needed in multiple different niches; you only need to write it once instead of multiple times.

In fact, with projects like Java+JavaFX+Swing, .NET+MAUI, Electron / React-Native, etc., we see efforts to make it possible to write an application only once and have it run on macOS, Windows, Linux, Android, and iOS. So there clearly is a desire for uniformity across OSes, even when there are only three or so of them.

So, in summary: it makes economic sense to have only one general-purpose OS, only one general-purpose hardware architecture, only one general-purpose communication network, etc. But that means we end up with a "jack-of-all-trades, master-of-none" situation, where e.g. an OS cannot afford to optimize too much for interactive latency, because that would hurt batch-processing throughput, which is important for other users.

Note that in Linux, for example, there are certain compile-time options in the kernel, as well as third-party patches, which improve interactivity. However, in the end, the Linux kernel is just a small part of the system. Input processing for graphical applications is actually handled by XInput (or whatever the Wayland equivalent of that is), so there is not much the kernel can do there.

Which brings us to another thing: abstractions. Software Engineering is all about abstractions and reuse. But, generally reusable abstractions need to be, well, general-purpose. And so we end up in a situation where a game uses maybe some in-house framework, which in turn uses Unity, which in turn uses Wayland, which in turn uses DRM, which in turn uses the Linux kernel, which in turn uses the xHCI API to talk USB HID to the mouse.

Every single crossing of one of those abstraction boundaries costs performance, and every single one of those abstractions is more general than this particular game needs it to be. But without all of these abstractions, every game would need to implement its own USB driver, its own graphics driver, its own rendering pipeline, its own filesystem, its own SSD driver, and so on, and so on, which would be prohibitively expensive. Especially if you consider that one of the big promises of the PC is modularity, where you can combine different hardware in billions of ways and have it all work.

This is very different for gaming consoles, for example, where you have exactly one well-defined hardware and software configuration. But gaming consoles are not general-purpose: you can't use them to run a database cluster.

So, effectively, it is all about trade-offs. Like everything in Engineering is.

Jörg W Mittag
  • 104,619
20

Why can't (or why don't) operating systems absolutely prioritise user input (and repainting thereof) in threading and process scheduling?

Even if the operating system tells the application about the user input, or about the need to redraw (part of) its window, instantaneously, it's still up to the application to actually do that.

what happens in the next tick

Most programs don't use ticks. They handle the stream of events that the OS supplies to them as fast as they can. If they aren't keeping up, the OS might drop or merge some events (if you haven't handled the mouse moving to one place before it moves somewhere else, you really only need the latest position).
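
For illustration, here is roughly what consuming that event stream looks like in a bare Win32 program (a standard GetMessage/DispatchMessage pump; the window-creation boilerplate is omitted). The loop simply drains whatever the OS has queued, and Windows already coalesces WM_MOUSEMOVE so that only the latest pointer position is delivered.

    #include <windows.h>

    // Classic Win32 message pump (assumes a window and its window procedure
    // have already been created elsewhere).
    int RunMessageLoop() {
        MSG msg;
        // GetMessage blocks until an event is available and returns 0 on WM_QUIT.
        while (GetMessage(&msg, nullptr, 0, 0) > 0) {
            TranslateMessage(&msg);   // e.g. turn raw key presses into WM_CHAR messages
            DispatchMessage(&msg);    // invoke the window procedure for this event
        }
        return static_cast<int>(msg.wParam);
    }

If the window procedure takes too long for any single message, everything behind it in the queue - including further input and repaints - has to wait, which is exactly the per-application lag being described.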

it relies on all applications being well engineered. And not all applications are.

Mostly this. You could more charitably characterise it as the developers prioritising something other than UI responsiveness.

Caleth
  • 12,190
11

In my experience, on most computers I have ever used, this is usually caused by inappropriate swapping to disk. Every other cause (such as operating system locks) is significantly less common.

When there is a high demand for memory, the operating system chooses memory pages to write to disk (very slow) to make room for new memory allocations.

If the memory demand is too high or the operating system chooses poorly, it can swap out pages that are in the event processing path of some process. When (and only when) an event is received that follows this code path, the CPU will fault when it wants to access a page that is not currently present in memory. The operating system will handle the fault by choosing another page to write to disk, then reading the requested page from the disk, and perhaps other pages that the OS predicts will also be needed, and then resuming the process that faulted. In that order.

The OS is unlikely to correctly predict all the pages that will be read while processing the event, so several cycles of this can occur - potentially hundreds of cycles - each potentially waiting around 15 milliseconds twice (one seek to write out a victim page, one to read the faulting page back in). Two hundred such cycles already add up to about six seconds of simply waiting for the hard drive mechanics to move.

This is exacerbated by more complex programming environments which access more pages of memory when processing an event. I have seen web browsers swap for several minutes before responding. This may contribute to the perception that programs written in more complex environments are "slower".

Current operating systems do not really provide good tools to manage swapping behaviour. If they did, they would probably be abused by some programs to make them "faster" at the expense of other programs. If a program written by a cut-throat commercial vendor could mark 100% of its memory as "important event-processing code" then it would.

4

Why? Because, generally speaking, nobody cares enough to make it better. And by "caring enough" I mean cares enough to throw some serious money at it.

Things begin with synchronous programming models: mainstream programming languages, such as C and pre-coroutine C++, make it very hard to write reactive code that responds to events as they arrive rather than occasionally stopping to wait for them.

For example, suppose you're writing a typical desktop application. There's a GUI thread. The user wants to save the file. So you stick a gui->current_document.Save() call in the place where the Save command is accepted. But now the system is not reacting to user events, it's saving the file - the GUI is frozen. If the file is on a network share, or the local mechanical hard drive is slow, the user experience is necessarily poor.

So then you think "aha, let's save in a thread!" and execute the save call on a thread pool. Things seem OK until your testers find that if they save the file and then immediately start modifying the document, the application crashes.

Yep, accessing the document data from multiple threads cannot be done "just like that". You need to produce a "safe" static snapshot of the document that's guaranteed to remain valid for as long as the Save() function needs it to. In general, it's not possible to "synchronize" the saving and modifications using just synchronization primitives like mutexes etc: the modifications may make some already-saved data irrelevant, there are problems ensuring self-consistency, etc.
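
As a rough illustration of that snapshot approach (a hypothetical Document type, not any real framework's API), the GUI thread copies the state while it still owns it and hands only the copy to the worker, so later edits cannot race with the save:

    #include <fstream>
    #include <future>
    #include <string>
    #include <utility>

    struct Document {
        std::string text;  // stand-in for the real document state
    };

    // Called on the GUI thread. The copy is made here, synchronously; only the
    // slow file write happens on the worker thread.
    std::future<void> saveAsync(const Document& live, std::string path) {
        Document snapshot = live;  // safe: no other thread is mutating 'live' yet
        return std::async(std::launch::async,
            [snap = std::move(snapshot), p = std::move(path)] {
                std::ofstream out(p, std::ios::binary);
                out << snap.text;  // slow I/O, off the GUI thread
            });
    }

Even this sketch has sharp edges: the std::future returned by std::async blocks in its destructor, so the caller has to keep it somewhere and check it later, and for a large document the synchronous copy itself becomes noticeable. Which is the point: doing this correctly takes deliberate design, not a one-line fix.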

Highly reactive software must be designed that way from day one, and the programming tools available for that are still not easy to use. Concurrent programming is still hard - there are just tools now that make it very easy to shoot yourself in the foot instantly, in a single line of code. At least in the "good old days" starting a thread took a bit more work, so people didn't bother, and things at least worked, if slowly and with lag caused by file and network access. Today it's very easy to "parallelize" code, but it's no easier to do it safely and correctly - at least not in the languages the big, highly visible desktop code bases are written in, typically C++.

3

Dan Luu posted a wonderful insight into this: https://danluu.com/input-lag/

His measurements show that an Apple IIe (released in 1983) is indeed more responsive than a state-of-the-art modern computer.

He also proposes some culprits: the rendering pipeline and overall system complexity (some might say bloat, but your mileage may vary).

Last but not least, I can tell that this matter is important to you. But as I see it, gazillions have been thrown at the OS problem, and every decade or so someone comes up with the next big thing. Yet people - clients - stick to what they know and what is compatible. How many people know of or use KolibriOS?

You could say that latency can... wait (pun only half-intended).

Olivier
  • 139
2

The reason is that everything is a compromise. If you say mouse and keyboard input is a priority for you, this means something else gets less priority. Most users don't seem to care too much whether their mouse and keyboard input is handled instantly if the computer can't do anything useful with the entered information anyway, because it is so busy with other stuff.

Of course, this is a trade-off that can be changed, but a general-purpose system cannot be optimized too heavily for specific use cases, because other use cases suffer.

It's mostly a throughput-versus-latency choice. How much are you willing to compromise? If a calculation takes 5 minutes while your input lags, is it better for it to take 8 minutes while your mouse moves smoothly? Or 15 minutes? Where would you draw the line?

For Linux, there are options to tune this, and you can experiment.

I have been using the lowlatency kernel for Ubuntu for a few years now. It dramatically reduces input latency under load. The downside is that it also reduces throughput, and all the processes using the CPU in the background take longer to complete. For me, on my work laptop, this is an acceptable compromise. On a server doing batch jobs, it would be unacceptable. There are also realtime kernels, which reduce latency further but dramatically reduce throughput, and of course they bring other problems too.

You can even tune the related kernel settings yourself and compile your own kernel, if smooth mouse movement is extremely important to you.
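
For a rough idea of what that tuning looks like, these are the kernel configuration options usually cited as the difference between Ubuntu's generic and lowlatency builds (recalled from memory - verify against /boot/config-$(uname -r) on your own system):

    # generic (throughput-leaning):
    #   CONFIG_PREEMPT_VOLUNTARY=y
    #   CONFIG_HZ=250
    #
    # lowlatency (latency-leaning):
    #   CONFIG_PREEMPT=y
    #   CONFIG_HZ=1000

In other words, the lowlatency kernel is preemptible almost everywhere and its scheduler tick fires more often, so an interactive task gets the CPU back sooner, at the cost of extra scheduling overhead for long-running batch work.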

Some related questions to dig deeper: this one on Unix Stack Exchange or this one on Ask Ubuntu.

Josef
  • 297