Python code to record audio and convert it to text with speech recognition
Date: 2024-02-06 12:04:14
You can record audio with Python's `pyaudio` library and convert it to text with the Google Speech-to-Text API. A simple example follows.
First, install the `pyaudio` and `google-cloud-speech` libraries:
```
pip install pyaudio
pip install google-cloud-speech
```
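Before running the example, it can help to confirm that both packages are importable. The following is a minimal sketch, not part of the original answer; the helper name `missing_modules` is hypothetical, and it assumes the import names are `pyaudio` and `google.cloud.speech`.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (for example `google`) is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # An empty list means both packages were found.
    print(missing_modules(["pyaudio", "google.cloud.speech"]))
```

Any name in the printed list still needs to be installed with `pip`.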
Next, create a Google Cloud project and generate a service account key so that you can use the Speech-to-Text API. Save the key locally and set an environment variable to its path:
```
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```
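A missing or mistyped key path is a common reason for the client to fail at startup, so a quick check can save time. This is a minimal sketch under the assumption that the key is supplied only through `GOOGLE_APPLICATION_CREDENTIALS`; the helper name `credentials_problem` is hypothetical, and it only checks that the file exists, not that the key is valid.

```python
import os

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


def credentials_problem(environ=os.environ):
    """Return a description of the problem, or None if the key path looks usable."""
    path = environ.get(ENV_VAR)
    if not path:
        return f"{ENV_VAR} is not set"
    if not os.path.isfile(path):
        return f"{ENV_VAR} points to a missing file: {path}"
    return None


if __name__ == "__main__":
    problem = credentials_problem()
    print(problem or "Credentials file found.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without changing the real environment.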
Here is the implementation:
```python
import wave

import pyaudio
from google.cloud import speech_v1p1beta1 as speech

# Recording parameters
FORMAT = pyaudio.paInt16  # 16-bit signed samples
CHANNELS = 1              # mono
RATE = 16000              # sample rate in Hz
CHUNK = 1024              # frames read per buffer
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

# Create the PyAudio object and open an input stream
pa = pyaudio.PyAudio()
stream = pa.open(format=FORMAT, channels=CHANNELS,
                 rate=RATE, input=True,
                 frames_per_buffer=CHUNK)

# Start recording
print("Recording...")
frames = []
for _ in range(int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("Recording finished!")

# Stop recording and release the audio device
stream.stop_stream()
stream.close()
sample_width = pa.get_sample_size(FORMAT)  # query before terminating PyAudio
pa.terminate()

# Save the recording as a WAV file
with wave.open(WAVE_OUTPUT_FILENAME, 'wb') as wf:
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(sample_width)
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))

# Convert the recording to text with the Google Speech-to-Text API
client = speech.SpeechClient()
with open(WAVE_OUTPUT_FILENAME, 'rb') as audio_file:
    content = audio_file.read()

# Named `audio` here; the PyAudio object above is `pa` to avoid reusing the name
audio = speech.RecognitionAudio(content=content)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=RATE,
    language_code='en-US')

response = client.recognize(config=config, audio=audio)
for result in response.results:
    # Print the most likely transcript for each recognized segment
    print(result.alternatives[0].transcript)
```
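This code records 5 seconds of audio, saves it as `output.wav`, converts it to text with the Google Speech-to-Text API, and prints the result. Because `language_code` is set to `'en-US'`, it expects US English speech.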
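Because the recording loop reads whole chunks, the captured length is slightly under the requested 5 seconds: `int(16000 / 1024 * 5)` is 78 chunks, and 78 × 1024 frames at 16000 Hz is about 4.992 seconds. The sketch below reproduces that arithmetic and reads back the WAV header with the standard `wave` module, so you can confirm the file matches the `RecognitionConfig` values (mono, 16-bit, 16000 Hz). The helper names `captured_seconds` and `describe_wav` are hypothetical.

```python
import os
import wave

RATE = 16000
CHUNK = 1024
RECORD_SECONDS = 5


def captured_seconds(rate, chunk, seconds):
    """Duration actually captured when reading whole chunks only."""
    n_chunks = int(rate / chunk * seconds)  # same expression as the recording loop
    return n_chunks * chunk / rate


def describe_wav(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds)."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return wf.getnchannels(), wf.getsampwidth(), rate, wf.getnframes() / rate


if __name__ == "__main__":
    print(captured_seconds(RATE, CHUNK, RECORD_SECONDS))
    # Only inspect the file if a recording has already been made.
    if os.path.exists("output.wav"):
        print(describe_wav("output.wav"))
```

If the reported sample rate or channel count differs from the values passed to `RecognitionConfig`, the request is likely to fail or produce a poor transcript.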