Eavesdropping on keyboard keystrokes

A recent study shows how it’s possible to identify typed text from the sound of keystrokes — even in far-from-ideal environments.

U.S. researchers recently published a paper demonstrating that useful information can be extracted from the sounds of keystrokes. This is certainly not the first study of its kind, and its results are no more accurate than those of its predecessors. What makes it interesting is that the researchers weren’t aiming for perfect, lab-controlled conditions. Instead, they wanted to see how the attack works in fairly realistic conditions: a somewhat noisy room, a not-so-great microphone, and so on.

Attack model

We often get eavesdropped on without even realizing it. And I’m not referring to spy movie clichés with bugs planted in offices and hotel rooms.

Imagine you’re stuck in a boring conference call at work and, at the same time, you’re discreetly catching up on work emails or personal messages without muting your microphone. Guess what? Your colleagues can hear your keystrokes. Streamers — those who love broadcasting their gaming sessions (and other stuff) — are also at risk. They might get distracted mid-stream and, for example, type a password on the keyboard. While the keyboard itself may not be visible, someone could record the sound of the keystrokes, analyze the recording, and try to figure out what was typed.

The first scientific study examining such an attack in detail was published in 2004. Back then, IBM researchers merely proposed a method and demonstrated the basic possibility of distinguishing one keystroke from another, but nothing more. Five years later, in 2009, the same researchers attempted to solve the problem using a neural network: the algorithm was trained on a 10-minute recording of keyboard input, with the text known in advance. This made it possible to associate specific keystroke sounds with typed letters. As a result, the neural network recognized up to 96% of the characters typed.

However, this result was obtained in a lab-controlled environment. The room was completely silent, a high-quality microphone was used, and the text was typed more or less consistently (with roughly the same typing speed and keystroke force). Moreover, a loud mechanical keyboard was used. This study demonstrated the theoretical possibility of an attack, but its results were difficult to apply in practice: if you change the typing style slightly, change the keyboard, or add natural ambient noise to the room, recognition becomes impossible.

Real-life eavesdropping

Everyone has their own unique way of typing. The authors of the new study found patterns in these individual styles, which helped them analyze the sounds of keystrokes. For instance, they discovered that people tend to type common letter pairs at a consistent speed. They also found that it’s fairly easy to distinguish individual words, since the sounds of the spacebar and Enter key are usually distinct from other keys.
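To make the letter-pair observation concrete, here is a minimal Python sketch that averages the delay between consecutive keys for each pair of letters. The function name digraph_timing and the sample timestamps are illustrative assumptions; the study's actual implementation is not described in this post.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def digraph_timing(keys: str, times: List[float]) -> Dict[str, float]:
    """Average delay in seconds between the two keys of each letter pair."""
    if len(keys) != len(times):
        raise ValueError("need exactly one timestamp per key")
    delays = defaultdict(list)
    for i in range(len(keys) - 1):
        pair = keys[i] + keys[i + 1]
        delays[pair].append(times[i + 1] - times[i])
    return {pair: mean(values) for pair, values in delays.items()}


# Hypothetical timestamps (seconds) for someone typing "thethe".
profile = digraph_timing("thethe", [0.00, 0.10, 0.25, 0.50, 0.62, 0.75])
for pair, delay in sorted(profile.items()):
    print(f"{pair}: {delay:.2f} s")
```

If a typist's delay for a pair such as "th" stays roughly the same across a recording, that delay becomes one more clue about which letters were pressed.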

During the experiments, the researchers assumed that the potential eavesdropping victim would be typing in an office with a normal level of background noise. Other than that, there were no special restrictions on the participants. They could use any keyboard and type however they wanted. The recording was done on a low-quality, built-in laptop microphone. For a successful attack, however, a potential spy needs to record a sufficiently long sequence of keystrokes — otherwise, it won’t be possible to train the neural network. The recording looks something like this:

[Figure: Shape of the audio signal corresponding to certain keystrokes]

Each peak in amplitude corresponds to a specific keystroke. The pause between keystrokes may vary depending on the user’s typing skill and the sequence of letters being typed. In this study, the neural network was trained to recognize these pauses specifically, and as it turns out, they also carry a lot of information — no less than the differences in keystroke sounds themselves!
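As a rough illustration of how peaks and pauses could be pulled out of a recording, the sketch below marks a keystroke wherever the signal gets loud after a short stretch of quiet, then measures the gaps between those moments. This is a simplified stand-in, not the study's method: the threshold, the 50 ms quiet window, and the synthetic signal are all assumptions, and a real recording would need noise filtering first.

```python
from typing import List


def detect_keystrokes(samples: List[float], sample_rate: int,
                      threshold: float = 0.5,
                      min_gap_s: float = 0.05) -> List[float]:
    """Return onset times (seconds) of bursts louder than `threshold`.

    A loud sample starts a new keystroke only if the signal has been
    quiet for at least `min_gap_s`, so one key press is not counted twice.
    """
    min_gap = round(min_gap_s * sample_rate)
    last_loud = -min_gap  # lets a burst at sample 0 count as an onset
    onsets: List[float] = []
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            if i - last_loud >= min_gap:
                onsets.append(i / sample_rate)
            last_loud = i
    return onsets


def pauses(onsets: List[float]) -> List[float]:
    """Time between each keystroke and the next one."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


# Synthetic one-second "recording" at 1 kHz with three short loud bursts.
rate = 1000
signal = [0.0] * rate
for start in (100, 350, 500):
    for i in range(start, start + 5):
        signal[i] = 0.9

onsets = detect_keystrokes(signal, rate)
gaps = pauses(onsets)
print(onsets, gaps)
```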

An important breakthrough in this new study was the use of the neural network to predict whole words. For example, if the neural network identifies the word “goritla” from the keystrokes, we can be fairly confident that the user actually typed “gorilla” and that a single letter was misrecognized. The more letters in a word, the more accurately it can be guessed. This holds for words of up to six letters; beyond that, accuracy stops improving.
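The “goritla” example can be mimicked with a much simpler technique than the one in the study. The sketch below swaps the neural network for plain edit distance: it picks the vocabulary word that needs the fewest single-letter changes to match the recognized string. The tiny vocabulary and the function names are hypothetical.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_word(guess: str, vocabulary: List[str]) -> str:
    """Return the vocabulary word closest to the recognized string."""
    return min(vocabulary, key=lambda word: edit_distance(guess, word))


# Hypothetical mini-dictionary; a real attack would use a full word list.
VOCABULARY = ["gorilla", "guerilla", "gorgon", "vanilla"]
print(correct_word("goritla", VOCABULARY))
```

Short words have many close neighbors in a dictionary, so a single misheard letter can point to several candidates. That is one plausible reason why longer words are easier to guess.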

A total of 20 volunteers participated in the experiment. First, they typed an already-known text, which was then correlated with the keystroke sounds and used to train the recognition algorithm. Next, the subjects typed a secret text, which the neural network tried to decipher based on the typing patterns and how well it matched real words. The accuracy varied from person to person, but on average the AI correctly guessed 43% of the text just from the keystroke sounds.

Side channels all around us

This is yet another example of a side-channel attack, in which information leaks indirectly. We’ve written a lot about such attacks: for example, a method of espionage using a light sensor, and a technique for extracting sound from video data by analyzing tiny vibrations in the image. Phone conversations can be eavesdropped on using an accelerometer, a sensor built into every smartphone. Indirect channels of information leakage are indeed many.

But of all these attacks, extracting text by analyzing keystroke sounds is arguably the most viable in practice. When we enter a credit card number or password, we can hide the keyboard from prying eyes, but protecting ourselves from eavesdropping isn’t so easy.

Of course, a 43% accuracy rate in guessing the text might not sound that impressive — especially considering it’s guessing whole words, not random characters like you’d expect in a password. Still, this new research is a significant step toward making this type of attack practical. It’s not quite there yet, but imagine someone in a café or on the train potentially stealing your password, credit card number, or even your private messages just by listening to you typing.

Perhaps future research will bring us closer to this dangerous scenario. But even now we can outline methods of protecting against such attacks and start applying them to particularly sensitive data right away. For starters, avoid typing passwords or other secret information during conference calls — especially during public online events. For many reasons, we recommend using two-factor authentication — it protects well against various password compromise scenarios.

Finally, there’s a way to counteract this specific side-channel attack. It’s based on the fact that you have a certain consistent pattern of typing on the keyboard. Want to make it harder for those sneaky hackers? Break the pattern: mix up your typing style. Both super-slow and super-fast typing can work wonders.