VOICE BIOMETRIC REVOLUTION:
WHY VOICE ID IS NOW SECURE ENOUGH FOR DEVICE UNLOCK
INTRODUCTION
With the ever-increasing reliance on smartphones, laptops, and smart IoT devices comes a growing need to protect personal data and applications.
Consequently, our devices must implement suitably strong measures to guard against fraudulent access. Until now, the security standard for unlocking a mobile device or laptop exceeded what spoken voice could achieve. Voice was therefore relegated to a useful convenience for some functions, but it was not sufficient for fully unlocking a device with the same level of access as a 4-digit PIN, a fingerprint, or a face match.
However, there are numerous circumstances where a user needs hands-free access and cannot touch or look at the device directly.
Examples include driving, cooking, exercising, or working in an environment that requires gloves and other personal protective equipment. There are also several cases where voice is the only means of interaction. Therefore, using spoken voice to unlock a device and enable a full set of voice commands is highly desirable.
DEVICE UNLOCK SECURITY OVERVIEW
There are different ways to protect a personal device against fraudulent access.
The most common are PIN codes and biometric-based methods such as fingerprints and facial recognition.
With a 4-digit PIN code, the probability that a single random guess by an impostor is accepted is 1 in 10,000.
The accuracy requirements when using biometrics for mobile device unlock are similarly high.
The Android Compatibility Definition Document (CDD) requires that the false acceptance rate for impostors not exceed 1 in 50,000, with a maximum false rejection rate of 10%, and that spoofing detection be supported.
An example of a biometric that meets or exceeds this standard is face matching. The technology that enables Apple’s Face ID utilizes advanced hardware and software. The iPhone camera creates a depth map of your face while also capturing an infrared image. Apple puts the probability of a random person looking at your iPhone and unlocking it using Face ID at approximately 1 in 1,000,000.
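To put the three figures above on the same footing, the following minimal sketch computes the chance that an impostor gets in at least once over several tries. It assumes each attempt is independent and succeeds with the quoted per-attempt rate, which is a simplification (for example, a PIN attacker who never repeats a guess does slightly better). The function name and the attempt limit of 5 are hypothetical and chosen only for illustration.

```python
def p_any_false_accept(per_attempt_rate: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent impostor tries is accepted."""
    return 1.0 - (1.0 - per_attempt_rate) ** attempts


# Per-attempt impostor acceptance rates quoted in the text above.
RATES = {
    "4-digit PIN": 1 / 10_000,
    "Android CDD biometric limit": 1 / 50_000,
    "Face ID (Apple's figure)": 1 / 1_000_000,
}

ATTEMPTS = 5  # hypothetical number of tries allowed before lockout

for name, rate in RATES.items():
    print(f"{name}: {p_any_false_accept(rate, ATTEMPTS):.6%}")
```

Under these assumptions, a biometric that just meets the CDD limit is about five times harder to defeat by chance than a randomly guessed 4-digit PIN, for the same number of attempts.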
DISADVANTAGES OF CURRENT SOLUTIONS
While face and fingerprint biometrics offer strong device-based security, there are cases when a user would benefit from hands-free access to their locked device.
An obvious example is driving. In this case, typing a PIN or using facial or fingerprint biometrics is not safe, because these methods require the driver to interact with the device’s touch screen or to position their face in front of the camera, diverting attention from the task of driving.
Voice biometrics offer a passive interaction without diversion. The user can unlock the device and perform a task with a short spoken phrase such as «Ok, Google, read my last text message.»
However, the advantages of offering a voice-based option extend beyond the need for hands-free access for safety.
First, unlike alternative methods of unlocking a device, only voice offers the ability to unlock a secure device and execute a command such as «read my last text message» in one simple step.
Second, the user environment may be better suited to voice, such as in the case of poor lighting conditions for face capture and wet or dirty fingers for fingerprint capture. Voice now provides a secure and convenient alternative.
Finally, unlike other biometric modalities, voice biometrics doesn’t require sophisticated, device-specific fingerprint or camera sensors and will work even with low-end devices.
Voice unlock can now be used with billions of existing devices without additional hardware costs. The net result is that voice extends the value of the device to more situations.
THE SENSORY SOLUTIONS
The scientific community has recently made advancements in the voice recognition space, more precisely called the speaker verification space.
The voice modality offers a desirable approach to handling user authentication for device unlock, payments, and other activities that require high security.
However, there are currently no commercially available solutions on the market that enable secure device unlock using voice. Until now, the accuracy of voice biometrics has not met accepted standards.
Sensory has made significant progress in the use of voice biometrics for device unlock use cases. Sensory accomplished this breakthrough by combining its advanced algorithms with multiple speaker verification methods and its unique voice anti-spoofing technology. The approach is described in the following paragraphs.
The main methods for authenticating a person’s unique voiceprint can be divided into two categories: text-dependent and text-independent.
In a text-dependent approach, the analyzed phrases are fixed and known beforehand. Conversely, the text-independent approach places no constraints on the words that the user may speak for authentication.
Both approaches have pros and cons, but unique capabilities arise when the two are combined in a natural voice user interface interaction.
For instance, voice interactions with personal electronic devices commonly start with a fixed wake-up word such as «Ok, Google», «Hi, Alexa», or «Hey, Siri».
A wake-up word alerts the device to listen for a command phrase. The actual command phrase is not fixed and thus cannot be handled by the text-dependent approach. The command phrase could be something like: «What time is my next meeting?» or «Venmo 10 dollars to Alex for lunch».
The proposed solution applies text-dependent speaker recognition for the wake-up word, text-independent speaker recognition for the command/question, and voice anti-spoofing technology for the entire utterance.
The matching results are combined to provide an authentication decision. The voice anti-spoofing algorithm protects the system from spoofing attacks. The types of voice spoofing attacks covered are:
The combination of speaker verification methods and anti-spoofing results in a high level of authentication accuracy with a False Acceptance Rate below 1 in 50,000, a False Rejection Rate below 10%, and a Spoofing Acceptance Rate as low as 3%.
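The decision flow described above (a text-dependent match on the wake-up word, a text-independent match on the command, and an anti-spoofing check on the whole utterance) can be sketched as follows. This is an illustrative sketch only, not Sensory's implementation: the weighted-average fusion rule, the names `UtteranceScores` and `authenticate`, and all weights and thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class UtteranceScores:
    """Hypothetical per-utterance scores, each assumed to lie in [0, 1]."""
    text_dependent: float    # speaker match on the fixed wake-up word
    text_independent: float  # speaker match on the free-form command
    spoof: float             # estimated probability that the audio is spoofed


def authenticate(
    scores: UtteranceScores,
    td_weight: float = 0.5,         # assumed fusion weight for the wake-up word score
    match_threshold: float = 0.8,   # assumed operating point for the fused score
    spoof_threshold: float = 0.5,   # assumed cut-off for rejecting spoofed audio
) -> bool:
    """Return True only if the utterance looks live and the fused speaker score is high enough."""
    # The anti-spoofing check covers the entire utterance and can veto the unlock.
    if scores.spoof >= spoof_threshold:
        return False
    # Combine the two speaker verification results into a single score.
    fused = td_weight * scores.text_dependent + (1.0 - td_weight) * scores.text_independent
    return fused >= match_threshold


print(authenticate(UtteranceScores(text_dependent=0.9, text_independent=0.9, spoof=0.1)))
print(authenticate(UtteranceScores(text_dependent=0.9, text_independent=0.9, spoof=0.9)))
```

In a real system the thresholds would be tuned on evaluation data so that the resulting error rates meet the target operating point.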
The uniqueness of the technology lies in using a Common Deep Neural Network processing step that enables the extraction of robust features from the voice for text-dependent, text-independent, and anti-spoofing within a single network.
Combining datasets that were previously used separately for these tasks doubles the training set and enables a synergistic effect for each task and the authentication task in particular.
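The idea of one network serving all three tasks can be illustrated with a shared trunk feeding three task-specific heads. The sketch below is a toy forward pass in plain Python with random, untrained weights; the class name `SharedVoiceNet`, the layer sizes, and the single hidden layer are all assumptions for illustration and say nothing about the actual architecture.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def random_layer(rng, n_out, n_in):
    """Untrained placeholder weights with zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedVoiceNet:
    """Toy network: one shared feature extractor, three task-specific heads."""

    def __init__(self, n_features=8, n_hidden=16, n_embed=4, seed=0):
        rng = random.Random(seed)
        self.trunk = random_layer(rng, n_hidden, n_features)  # shared by all tasks
        self.td_head = random_layer(rng, n_embed, n_hidden)   # text-dependent embedding
        self.ti_head = random_layer(rng, n_embed, n_hidden)   # text-independent embedding
        self.spoof_head = random_layer(rng, 1, n_hidden)      # anti-spoofing output

    def forward(self, features):
        # The shared representation is computed once and reused by every head.
        shared = relu(linear(*self.trunk, features))
        return {
            "td_embedding": linear(*self.td_head, shared),
            "ti_embedding": linear(*self.ti_head, shared),
            "spoof_probability": sigmoid(linear(*self.spoof_head, shared)[0]),
        }


outputs = SharedVoiceNet().forward([0.1] * 8)
print(sorted(outputs))
```

Because every head reads the same shared features, training data collected for any one task also shapes the representation used by the other two, which is the kind of synergy the text describes.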
DIVERSITY OF EVALUATION DATA
Sensory pays significant attention to the diversity of data and is guided by industry standards.
Our evaluation methodology and accuracy metrics were defined according to the ISO standards: ISO/IEC 19795-1 (biometric performance testing and reporting), ISO/IEC 30107-3 (biometric presentation attack detection), and ISO/IEC TR 19795-3 (modality-specific biometric performance testing and reporting).
The following principal factors are accurately measured and taken into account:
- Biological factors: age distribution, gender
- Social factors: language
- Environmental factors: noise level and type, transmission channel, reverberation
Additionally, one more biological factor is taken into account: inter-day voice variability. This is the variability of voice characteristics caused by changes in a user’s emotional and physical state across different days.
Sensory's data comprises up to five different data sources: data collection services with fully controlled conditions of data gathering, individual subcontractors, crowd collection services with uncontrolled conditions, our data collection department, and data from partners.
The data covers 10 different languages, including European and Asian languages, 10,000 speakers, 5 environments, and near-, mid-, and far-field conditions.
EVALUATION RESULTS
As with all biometric authentication systems, accuracy is measured in terms of two types of errors: the False Accept Rate (FAR), the rate at which impostors are let through, and the False Reject Rate (FRR), the rate at which valid users are blocked.
A Detection Error Trade-off (DET) plot illustrates the trade-off between these two types of errors for a biometric matching system.
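As a minimal sketch of how these quantities are computed, the example below derives FAR and FRR from lists of match scores at a given threshold and then sweeps the threshold to obtain the points of a DET curve. The scores are made-up toy values, the function names are hypothetical, and the convention that a score at or above the threshold means "accept" is an assumption.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores at or above `threshold` are accepted."""
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


def det_points(genuine_scores, impostor_scores):
    """Sweep every observed score as a threshold; each yields one (threshold, FAR, FRR) point."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    return [(t, *far_frr(genuine_scores, impostor_scores, t)) for t in thresholds]


# Toy scores for illustration only; real evaluations use far larger trial counts.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.10, 0.35, 0.52, 0.70, 0.20]

for threshold, far, frr in det_points(genuine, impostor):
    print(f"threshold={threshold:.2f}  FAR={far:.2f}  FRR={frr:.2f}")
```

Raising the threshold lowers FAR and raises FRR, which is the trade-off a DET plot visualizes. A toy set this small cannot resolve rates anywhere near 1 in 50,000; measuring such a rate requires a very large number of impostor trials.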
CONCLUSION
Voice biometrics open up new possibilities for secure, hands-free access to information and applications on a variety of voice-enabled devices. Examples of use cases for hands-free voice biometric unlock:
This article was shared from the SmellLikeAISpirit WeChat official account.
Copyright © 2013 - 2025 Tencent Cloud. All Rights Reserved.