<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>STT on FOSS Engineer</title>
    <link>https://fossengineer.com/tags/stt/</link>
    <description>Recent content in STT on FOSS Engineer</description>
    <generator>Hugo</generator>
    <language>en-US</language>
    <lastBuildDate>Sun, 14 Jun 2026 13:07:14 +0200</lastBuildDate>
    <atom:link href="https://fossengineer.com/tags/stt/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Local AI Voice Tools</title>
      <link>https://fossengineer.com/local-ai-voice-tools/</link>
      <pubDate>Sat, 06 Jun 2026 19:30:00 +0200</pubDate>
      <guid>https://fossengineer.com/local-ai-voice-tools/</guid>
      <description>A chooser guide for Voicebox, KittenTTS, Chatterbox, and related local AI voice workflows, focused on what was actually validated locally.</description>
    </item>
    <item>
      <title>Voicebox - Local AI Voice Studio for Speech, Dictation, and Agents</title>
      <link>https://fossengineer.com/voicebox-local-ai-voice-studio/</link>
      <pubDate>Sat, 06 Jun 2026 10:50:00 +0200</pubDate>
      <guid>https://fossengineer.com/voicebox-local-ai-voice-studio/</guid>
      <description>&lt;strong&gt;Voicebox&lt;/strong&gt; is a local AI voice studio for cloning voices, generating speech, dictating into apps, transcribing captures, adding effects, composing stories, and giving AI agents voices through REST and MCP. It ships a Tauri desktop app, FastAPI backend, web UI, Docker setup, and multiple local TTS/STT engines.</description>
    </item>
    <item>
      <title>Chatterbox - Local Open-Source Text-to-Speech by Resemble AI</title>
      <link>https://fossengineer.com/chatterbox-local-open-source-tts/</link>
      <pubDate>Fri, 05 Jun 2026 11:45:00 +0200</pubDate>
      <guid>https://fossengineer.com/chatterbox-local-open-source-tts/</guid>
      <description>&lt;strong&gt;Chatterbox&lt;/strong&gt; is Resemble AI’s MIT-licensed open-source text-to-speech toolkit. It ships Python APIs, Gradio demos, English and multilingual models, voice conversion, Turbo inference, paralinguistic tags, and built-in Perth watermarking. It is not a Docker-first self-hosted app; it is a local ML package where GPU access matters.</description>
    </item>
    <item>
      <title>ComfyUI-Qwen-TTS - Qwen3 Voice Nodes for ComfyUI</title>
      <link>https://fossengineer.com/comfyui-qwen-tts/</link>
      <pubDate>Thu, 04 Jun 2026 13:45:00 +0100</pubDate>
      <guid>https://fossengineer.com/comfyui-qwen-tts/</guid>
      <description>ComfyUI-Qwen-TTS is a custom-node pack for running Qwen3-TTS workflows inside ComfyUI: text-to-speech, zero-shot voice cloning, designed voices, saved speakers, multi-role dialogue, and experimental fine-tuning.</description>
    </item>
  </channel>
</rss>
