str2speech is a simple command-line tool for converting text to speech using Transformer-based text-to-speech (TTS) models. It supports multiple models and voice presets, allowing users to generate high-quality speech audio from text.
Supports multiple TTS models, including suno/bark-small, suno/bark, and various facebook/mms-tts models. Allows selection of voice presets. Supports text input via command-line arguments or files. Outputs speech in .wav format. Works with both CPU and GPU.
Looks like the speech models have to be installed locally to work.
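To make the model-loading step concrete, here is a minimal sketch of generating a .wav file from one of the models named above. It uses the Hugging Face Transformers text-to-speech pipeline directly, not str2speech's own interface, so the function names text_to_wav and write_wav are illustrative only. It assumes the transformers, torch, and numpy packages are installed; the model weights are downloaded into the local Hugging Face cache the first time the pipeline runs.

```python
import struct
import wave


def write_wav(path, samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM WAV."""
    frames = b"".join(
        # Clip each sample, then scale it to the signed 16-bit range.
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(frames)


def text_to_wav(text, path, model="suno/bark-small"):
    """Hypothetical helper: synthesize `text` with a Transformers TTS model."""
    # Heavy imports are kept local so write_wav works without them.
    import numpy as np
    from transformers import pipeline

    tts = pipeline("text-to-speech", model=model)
    result = tts(text)  # dict with "audio" and "sampling_rate"
    samples = np.asarray(result["audio"]).ravel().tolist()
    write_wav(path, samples, result["sampling_rate"])
```

Calling text_to_wav("Hello there.", "hello.wav") should then produce a playable file, on CPU or GPU depending on how torch is installed.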
A fast, local neural text-to-speech system developed by Mycroft for the Mark II. Multiple voice models, multiple languages.
Does not have to be used in the context of Mycroft. You can run Mimic on just about any Linux machine. Has a REST API server but there is also a command line utility that lets you generate speech.
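As a rough sketch of using the REST API server from a script: the code below assumes a Mimic 3 server is already running locally on its documented default port (59125) and that it accepts the text as a POST body at /api/tts with an optional voice query parameter. The helper names build_tts_request and synthesize are made up for illustration, and the voice key shown is only an example.

```python
import urllib.parse
import urllib.request


def build_tts_request(text, voice=None, base_url="http://localhost:59125"):
    """Build a POST request for the (assumed) /api/tts endpoint."""
    query = ("?" + urllib.parse.urlencode({"voice": voice})) if voice else ""
    return urllib.request.Request(
        base_url + "/api/tts" + query,
        data=text.encode("utf-8"),  # the text to speak is the request body
        method="POST",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def synthesize(text, out_path, voice=None):
    """Send the request and save the returned WAV bytes to out_path."""
    with urllib.request.urlopen(build_tts_request(text, voice)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

For example, synthesize("Hello world.", "hello.wav", voice="en_UK/apope_low") would write the server's response to hello.wav, provided the server is up and that voice is installed.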
A Python module which implements interfaces to the native text-to-speech API of whatever platform you're running it on: Windows, macOS, or Linux (via eSpeak).
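The description above matches pyttsx3, although the text does not name the module, so treat that identification as an assumption. Under that assumption, a minimal sketch of picking the platform's native engine and speaking a sentence might look like this. The driver names (sapi5, nsss, espeak) are the ones pyttsx3 uses; driver_for and speak are hypothetical helpers.

```python
import sys

# Driver names as used by pyttsx3 (assumed to be the module described above).
DRIVERS = {"win32": "sapi5", "darwin": "nsss"}


def driver_for(platform=sys.platform):
    """Return the native TTS driver name for a sys.platform value."""
    # Anything that is not Windows or macOS falls back to eSpeak.
    return DRIVERS.get(platform, "espeak")


def speak(text, rate=150):
    """Speak `text` aloud through the platform's native engine."""
    import pyttsx3  # imported here so driver_for works without it installed

    engine = pyttsx3.init(driverName=driver_for())
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(text)
    engine.runAndWait()               # block until speech has finished
```

In practice pyttsx3.init() with no arguments picks the driver on its own; the explicit mapping is only there to show which native API sits behind each platform.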