Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying. To support the research community, we are providing access to pretrained model checkpoints, which are ready for inference and available for commercial use.
Another F/OSS personal assistant. Skill-based. Speech recognition and synthesis. Uses Node.js and Python.
The eSpeak NG (Next Generation) Text-to-Speech program is an open source speech synthesizer that supports 102 languages and accents, based on the eSpeak engine created by Jonathan Duddington. It supports spectral and Klatt formant synthesis and can use MBROLA voices.
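For quick experiments, eSpeak NG can be driven from a script through its command-line program. The following is a minimal sketch, assuming the `espeak-ng` executable is installed and on the PATH; the helper names `build_espeak_command` and `speak` are made up for illustration, while `-v` (voice), `-s` (speed in words per minute) and `-w` (write a WAV file) are the tool's own options.

```python
import shutil
import subprocess


def build_espeak_command(text, voice="en", speed_wpm=175, wav_path=None,
                         executable="espeak-ng"):
    """Build the argument list for one espeak-ng invocation (hypothetical helper)."""
    cmd = [executable, "-v", voice, "-s", str(speed_wpm)]
    if wav_path is not None:
        # -w writes the audio to a WAV file instead of playing it.
        cmd += ["-w", wav_path]
    cmd.append(text)
    return cmd


def speak(text, **kwargs):
    """Run espeak-ng, failing early if the executable cannot be found."""
    cmd = build_espeak_command(text, **kwargs)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run(cmd, check=True)
```

Calling `speak("Hello there", voice="de", wav_path="hello.wav")` would then write the speech to a file rather than playing it. Passing the arguments as a list avoids shell quoting problems with arbitrary text.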
Online text-to-speech service. Free at the low end, cheap at the high end. If you can send it text (even an RSS feed!), it will read it and turn it into recorded audio with a synthesizer. Supports 26 languages.
Claims to be a F/OSS personal AI assistant called Stella. Built on top of Arch Linux. The demo appears to be both conversational and somewhat usable.
A Python module that implements interfaces to the native text-to-speech API of whatever platform you're running it on, be it Windows, Mac OS X, or eSpeak on Linux.
F/OSS voice control system. Runs on a Raspberry Pi. Extensible. Uses speech synthesis to respond.
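The per-platform dispatch such a wrapper performs can be sketched roughly as below. This is an illustration of the idea only, not the module's actual code: the function `pick_backend` and the backend labels are assumptions chosen for the example, with SAPI5 standing in for the Windows speech API and NSSpeechSynthesizer for the Mac one.

```python
import sys

# Hypothetical backend labels, keyed by the value of sys.platform.
_BACKENDS = {
    "win32": "sapi5",   # Windows speech API
    "darwin": "nsss",   # NSSpeechSynthesizer on the Mac
}


def pick_backend(platform=None):
    """Return a backend label for the platform, defaulting to eSpeak elsewhere."""
    if platform is None:
        platform = sys.platform
    return _BACKENDS.get(platform, "espeak")
```

Keeping the mapping in one table means calling code never has to branch on the operating system itself.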