Peipei Jiang, Qian Wang, Senior Member, IEEE, Xiu Lin, Man Zhou, Wenbing Ding,
Cong Wang, Fellow, IEEE, Chao Shen, Senior Member, IEEE, and Qi Li, Senior Member, IEEE
Abstract—Voice authentication has been increasingly adopted for sensitive operations on mobile devices. While voice biometrics can distinguish individuals by their spectral features (such as voiceprints), they are known to be prone to spoofing attacks, in which an attacker uses pre-recorded or synthesized samples from a legitimate user, or imitates the speaking style of the targeted user, to deceive the voice authentication system. In this paper, we design and implement a novel software-only anti-spoofing system on smartphones. Our system leverages the pop noise, which is generated by the user's oral airflow when speaking the passphrase close to the microphone. The pop noise is delicate and varies from user to user, making it hard to record in replay attacks beyond a certain distance and hard for impersonators to imitate precisely. Specifically, we design a new pop noise detection scheme to pinpoint pop noises at the phonemic level. Based on this scheme, we establish a theoretical model that calculates the sound pressure level from the speech signal to obtain an estimated pressure signal, and we then analyze its consistency with the actual pressure signal extracted from the pop noise. Furthermore, to resist spoofing attacks, we calculate the similarity score of the sequences that describe the individually unique relationship between pop noises and phonemes. Our evaluation on a dataset of 30 participants and three smartphones shows that our system achieves over 94.79% accuracy. Our system requires no additional hardware and is robust to various factors, including authentication angle, authentication distance, passphrase length, and ambient noise.
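The consistency check described above can be illustrated with a minimal sketch. It is not the authors' model: it assumes the textbook definition of sound pressure level (20 log10 of RMS pressure over a 20 µPa reference) and uses Pearson correlation as a stand-in for the unspecified consistency measure. The names `spl_db` and `consistency` are hypothetical.

```python
import math

# Standard reference pressure for sound pressure level in air (20 micropascals).
P_REF = 20e-6


def spl_db(pressure_samples):
    """Sound pressure level, in dB, of one frame of pressure samples (pascals)."""
    mean_square = sum(p * p for p in pressure_samples) / len(pressure_samples)
    rms = math.sqrt(mean_square)
    # Floor the RMS so that an all-zero frame does not hit log10(0).
    return 20.0 * math.log10(max(rms, 1e-12) / P_REF)


def consistency(estimated, actual):
    """Pearson correlation between two equal-length per-frame sequences.

    Assumption: correlation stands in for the paper's consistency analysis,
    which the abstract does not specify.
    """
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimated, actual))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        return 0.0  # a constant sequence carries no shape information
    return cov / math.sqrt(var_e * var_a)
```

In this sketch, a live utterance would be expected to yield a high correlation between the per-frame levels estimated from the speech signal and those measured from the pop noise, while a replayed sample recorded at a distance would not. The actual decision rule and threshold are defined by the paper, not here.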