A group of researchers from Texas A&M University, Temple University, the New Jersey Institute of Technology, Rutgers University and the University of Dayton in the USA has revealed a side-channel attack called EarSpy, which makes it possible to eavesdrop on a target user's phone conversations by measuring the vibrations produced by the device's ear speaker.
EarSpy targets the ear speaker at the top of the phone, which is held against the ear during a call, and uses the built-in accelerometer to detect the vibrations the speaker generates while it plays back potentially confidential speech.
By contrast, earlier studies of this kind focused on vibrations generated by the phone's loudspeaker or relied on an external component to collect data.
An obvious way to intercept a conversation is for an attacker to install malware that records the call through the phone's microphone. However, Android's security measures have evolved significantly, and it is becoming increasingly difficult for malware to obtain the necessary permissions.
On the other hand, reading data from a smartphone's motion sensors does not require any special permissions. Android developers have begun to restrict the collection of sensor data, but the EarSpy attack remains viable even under these newer protections.
Malware installed on a device can use the EarSpy attack to capture potentially sensitive information and send it to an attacker. Moreover, EarSpy attacks are becoming more practical thanks to the improvements that smartphone manufacturers are making to ear speakers.
Similarly, modern devices use more sensitive motion sensors, such as accelerometers and gyroscopes, capable of registering even the smallest speaker resonances.
The researchers tested OnePlus 7T and OnePlus 9 smartphones and found that, thanks to the stereo speakers in these newer models, the accelerometer picks up significantly more data from the ear speaker than it does on older OnePlus phones, which lack stereo speakers.
The experiments clearly showed the effect of ear speaker reverberation on the accelerometer, which makes it possible to extract time-frequency domain features and spectrograms from the sensor data. The analysis focused on gender recognition, speaker identification and speech recognition.
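To illustrate what extracting a spectrogram from accelerometer readings can look like, here is a minimal sketch in Python. It is not the researchers' code: the function names, the frame length, the hop size, the 400 Hz sampling rate and the synthetic 100 Hz test signal are all assumptions chosen for illustration, and a short-time Fourier transform is used as one common way to obtain a time-frequency representation.

```python
import cmath
import math


def stft_magnitude(samples, frame_len=64, hop=32):
    """Split the trace into overlapping frames and return, for each frame,
    the magnitudes of frequency bins 0..frame_len // 2."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of one bin; slow, but dependency-free.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def dominant_frequency(mags, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in one frame, ignoring DC."""
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / frame_len


# Synthetic stand-in for one accelerometer axis: a 100 Hz vibration
# sampled at 400 Hz (both values are illustrative assumptions).
SAMPLE_RATE = 400
trace = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(256)]
spec = stft_magnitude(trace)
```

Each row of `spec` is one time slice and each column one frequency bin, which is the kind of two-dimensional input a classifier could be trained on. Calling `dominant_frequency(spec[0], SAMPLE_RATE, 64)` picks out the strongest vibration component in the first frame.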
In the gender recognition test, which aims to determine whether the speaker is male or female, the EarSpy attack reached 98% accuracy. Accuracy was almost as high, 92%, when identifying the speaker. For speech recognition, accuracy in recognizing digits spoken during a phone call reached 56%.
The effectiveness of the EarSpy attack can be reduced by lowering the call volume. A lower volume can prevent eavesdropping through this side channel. The placement of the device's hardware components and the tightness of its assembly also affect how the speaker's reverberation propagates.
To protect users against such attacks, the researchers recommend that manufacturers ensure stable sound pressure during a conversation and place motion sensors where internal vibrations do not affect them, or at least have the smallest possible impact.