Python code that uses Whisper to transcribe microphone input in real time and generate a timestamped SRT subtitle file

Language: Python

Category: Other

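Description: Python code that uses Whisper to transcribe microphone input in real time and write a timestamped SRT subtitle file. It uses multiple threads so that recognition does not block the capture of microphone data.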

Tags: python whisper real-time recognition microphone content generate timestamp srt subtitle file code

The following is a partial preview of the code. For the complete code, click download or open it in the bfwstudio webide.

#pip install -U openai-whisper

#pip install pyaudio
import pyaudio
import numpy as np
import whisper
import torch
import threading
import queue
import time
from datetime import datetime, timedelta

# Audio stream parameters (16 kHz mono 16-bit PCM, the input format Whisper expects)
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
CHUNK = 1024

# Initialize PyAudio
audio = pyaudio.PyAudio()

# Open the microphone input stream
stream = audio.open(format=FORMAT, 
                    channels=CHANNELS,
                    rate=RATE, 
                    input=True,
                    frames_per_buffer=CHUNK)

# Load the Whisper model
model = whisper.load_model("base")

# Queue used to pass audio data between threads
audio_queue = queue.Queue()

# Name of the SRT output file
srt_filename = "output.srt"

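def format_timedelta(td):
    """Format a timedelta as an SRT timestamp (HH:MM:SS,mmm)."""
    # td.seconds alone excludes whole days, so start from the full duration
    total_seconds = int(td.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    milliseconds = td.microseconds // 1000
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{milliseconds:03d}"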

def write_to_srt(start_time, end_time, text, subtitle_index):
    """Append one recognized subtitle entry to the SRT file."""
    with open(srt_filename, "a", encoding="utf-8") as srt_file:
        srt_file.write(f"{subtitle_index}\n")
        srt_file.write(f"{format_timedelta(start_time)} --> {format_timedelta(end_time)}\n")
        srt_file.write(f"{text}\n\n")

def audio_capture():
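    """Capture audio from the microphone and put it on the queue."""
    while True:
        # Do not raise if the input buffer overflows while the consumer is busy
        data = stream.read(CHUNK, exception_on_overflow=False)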
        audio_data = np.frombuffer(data, dtype=np.int16)
        audio_queue.put(audio_data)

def audio_transcribe():
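    """Take audio data from the queue and transcribe it."""
    # ......... (the rest of the code is omitted from this preview; log in and use the download button above to view the complete code)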
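Because the body of audio_transcribe is not shown in the preview, the following is only a sketch of how a consumer of audio_queue might look, not the author's omitted code. The names collect_window, transcribe_loop, start_transcriber, WINDOW_SECONDS, transcribe_fn and on_segment are all hypothetical. The sketch assumes fixed-length windows, timestamps derived from the number of samples consumed, and a transcribe function that returns a dict with a "segments" list whose items carry "start", "end" and "text", which is the shape Whisper's model.transcribe returns.

```python
import queue
import threading

import numpy as np

RATE = 16000          # samples per second, matching the listing
WINDOW_SECONDS = 5    # assumed window length; not taken from the original code


def collect_window(audio_queue, n_samples, timeout=1.0):
    """Gather int16 chunks from the queue until n_samples are available.

    Returns mono float32 samples scaled to [-1.0, 1.0]. The result is shorter
    than n_samples (possibly empty) if the queue stays empty for `timeout`.
    """
    chunks, total = [], 0
    while total < n_samples:
        try:
            chunk = audio_queue.get(timeout=timeout)
        except queue.Empty:
            break  # producer stalled or stopped; return what was gathered
        chunks.append(chunk)
        total += len(chunk)
    if not chunks:
        return np.zeros(0, dtype=np.float32)
    pcm = np.concatenate(chunks)
    return pcm.astype(np.float32) / 32768.0


def transcribe_loop(audio_queue, transcribe_fn, on_segment, stop_event):
    """Transcribe successive windows until stop_event is set.

    on_segment(start_s, end_s, text) receives times in seconds measured from
    the first sample consumed, based on the running sample count.
    """
    elapsed = 0.0
    while not stop_event.is_set():
        samples = collect_window(audio_queue, RATE * WINDOW_SECONDS)
        if samples.size == 0:
            continue
        offset = elapsed
        elapsed += samples.size / RATE
        result = transcribe_fn(samples)
        for seg in result["segments"]:
            text = seg["text"].strip()
            if text:
                on_segment(offset + seg["start"], offset + seg["end"], text)


def start_transcriber(audio_queue, transcribe_fn, on_segment):
    """Run transcribe_loop on a daemon thread; set the returned event to stop it."""
    stop_event = threading.Event()
    worker = threading.Thread(
        target=transcribe_loop,
        args=(audio_queue, transcribe_fn, on_segment, stop_event),
        daemon=True,
    )
    worker.start()
    return worker, stop_event
```

With the names from the listing, transcribe_fn could be lambda a: model.transcribe(a, fp16=torch.cuda.is_available()), and on_segment could wrap write_to_srt, converting the float seconds with timedelta(seconds=...) and keeping its own subtitle counter. Capture would run on a second thread with audio_capture as its target, so a slow transcription only lets the queue grow instead of stalling stream.read. Cutting audio into fixed windows can split a word across two windows; the sketch does not try to handle that.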

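The SRT entry layout that write_to_srt produces (index line, timing line, text, blank line) can be checked in isolation. The sketch below uses two hypothetical helpers, srt_timestamp and srt_block, which are not part of the original listing. It works in integer milliseconds from the full duration, since timedelta.seconds holds only the part below one day and would otherwise reset the hour field after 24 hours.

```python
from datetime import timedelta


def srt_timestamp(td):
    """Format a non-negative timedelta as HH:MM:SS,mmm (hours may exceed 23)."""
    total_ms = td // timedelta(milliseconds=1)  # exact integer milliseconds
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


def srt_block(index, start, end, text):
    """Build one SRT entry: index, timing line, text, then a blank line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n\n"


if __name__ == "__main__":
    print(srt_block(1, timedelta(0), timedelta(seconds=2, milliseconds=500), "hello"), end="")
    # 1
    # 00:00:00,000 --> 00:00:02,500
    # hello
```

Building the entry as a string first keeps the formatting testable without touching the file system; writing it out is then a single srt_file.write call.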