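I have speech audio files in WAV format, each 60 seconds long. However, the output is truncated and captures only about 15% of the length. I have tried this both in a local Jupyter Notebook and on Google Colab. According to the documentation, this request is below the API's threshold. What am I doing wrong, or how can I work around this limit?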
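import speech_recognition as sr

# Create a recognizer instance
# recognize_google() uses the Google Web Speech API
r = sr.Recognizer()
interview = sr.AudioFile('sample.wav')
with interview as source:
    print('Ready...')
    r.pause_threshold = 2  # only used by listen(); has no effect on record()
    audio = r.record(source, duration=60)  # read up to 60 seconds of the file
type(audio)  # notebook check: should be sr.AudioData
transcription = r.recognize_google(audio, language='en-CA')  # language tags use a hyphen
print(transcription)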
Posted on 2021-05-03 05:18:31
Try this code. If the output is still the same as before, you can wrap the recognition call in try/except blocks or change the pause_threshold value.
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:
    print("Ready")
    r.pause_threshold = 0.6
    audio = r.record(source)  # no duration given: read the whole file
try:
    s = r.recognize_google(audio)
    print("Text: " + s)
except sr.UnknownValueError:
    print("Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Error {0}".format(e))
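If the transcript is still cut short, one workaround that may help is to send the file in shorter pieces and join the results. The sketch below is only that, a sketch: it assumes the truncation comes from the service handling long clips poorly, which this thread does not confirm. The helper names chunk_spans and transcribe_in_chunks and the 15-second chunk length are made up for illustration. It relies on consecutive r.record() calls continuing from where the previous one stopped, and on the DURATION attribute that sr.AudioFile sets once the file is opened.

```python
def chunk_spans(total_seconds, chunk_seconds=15.0):
    """Return (offset, duration) pairs that cover total_seconds."""
    spans = []
    offset = 0.0
    while offset < total_seconds:
        # The last chunk may be shorter than chunk_seconds.
        spans.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return spans


def transcribe_in_chunks(path, chunk_seconds=15.0, language="en-CA"):
    """Hypothetical helper: transcribe a WAV file piece by piece."""
    # Imported here so chunk_spans() works without the library installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    with sr.AudioFile(path) as source:
        for _offset, duration in chunk_spans(source.DURATION, chunk_seconds):
            # Each record() call reads the next `duration` seconds of the file.
            audio = r.record(source, duration=duration)
            try:
                pieces.append(r.recognize_google(audio, language=language))
            except sr.UnknownValueError:
                pieces.append("")  # nothing intelligible in this chunk
    return " ".join(p for p in pieces if p)
```

Calling transcribe_in_chunks("sample.wav") would make one request per chunk, so a 60-second file produces four requests at the default chunk length. A word that straddles a chunk boundary may be lost or split, so the chunk length is worth experimenting with.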
https://stackoverflow.com/questions/67343334