Uses CMU's Pocketsphinx to do a quick speech-to-text transcription, then greps that transcript. Any matches are located in the audio stream, and the matching segment is extracted and saved as a separate mp3.
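A minimal sketch of that pipeline, assuming the pocketsphinx 5.x Python bindings and pydub; the filenames and search term are made up, and it cuts out just the matched word's span rather than the whole sentence:

```python
import re
import wave

from pocketsphinx import Decoder   # pip install pocketsphinx
from pydub import AudioSegment     # pip install pydub (needs ffmpeg for mp3)

PATTERN = re.compile(r"weather")   # hypothetical search term
FRAME_MS = 10                      # Pocketsphinx's default frame rate is 100/s

# Transcribe a 16 kHz, 16-bit mono WAV with the default acoustic model.
decoder = Decoder(samprate=16000)
with wave.open("recording.wav", "rb") as wav:
    pcm = wav.readframes(wav.getnframes())
decoder.start_utt()
decoder.process_raw(pcm, full_utt=True)
decoder.end_utt()

# Grep the word-level alignment and save each hit as its own mp3.
audio = AudioSegment.from_wav("recording.wav")
for seg in decoder.seg():
    if PATTERN.search(seg.word):
        start_ms = seg.start_frame * FRAME_MS
        end_ms = seg.end_frame * FRAME_MS
        audio[start_ms:end_ms].export(f"match_{start_ms}.mp3", format="mp3")
```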
Rhasspy (pronounced RAH-SPEE) is an offline, multilingual voice assistant toolkit inspired by Jasper that works well with Home Assistant, Hass.io, and Node-RED. Designed so that nothing under the hood requires software you can't self-host, from speech recognition to TTS. Emits JSON events. Vocabulary can be expanded with the automated assistance feature. Will run on something as simple as a RasPi but doesn't treat x86(-64) like a second-class citizen. Commands/intents are specified in a fairly easy templating language (see the sketch below).
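A small sketch of that templating language, based on Rhasspy's sentences.ini format; the intent, rule, and slot names here are invented. `(a | b)` is a choice, `[the]` is optional, `{tag}` labels a slot, and `<light_name>` references a rule:

```ini
[ChangeLightState]
light_name = (living room lamp | garage light){name}
turn (on | off){state} [the] <light_name>
```

When a spoken command matches, the JSON event Rhasspy emits looks roughly like this (simplified; the real payload carries more fields):

```json
{
  "text": "turn on the garage light",
  "intent": { "name": "ChangeLightState" },
  "slots": { "state": "on", "name": "garage light" }
}
```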
Can be used with audio files and probably a hot mic to transcribe speech into text for later processing. Uses Git Large File Storage for the neural network objects. GPU acceleration is supported. Includes trained models as well as source code. Available on PyPI as deepspeech and deepspeech-gpu. Supports the RasPi explicitly as a platform, interestingly.
Looking at the releases page is a good way to keep up with the project: https://github.com/mozilla/DeepSpeech/releases
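A minimal sketch of transcribing a file with the deepspeech package from PyPI, assuming you've downloaded the .pbmm model and .scorer files from the releases page above (the 0.9.3 filenames are just an example):

```python
import wave

import numpy as np
from deepspeech import Model   # pip install deepspeech (or deepspeech-gpu)

ds = Model("deepspeech-0.9.3-models.pbmm")
ds.enableExternalScorer("deepspeech-0.9.3-models.scorer")

# DeepSpeech expects 16 kHz, 16-bit mono PCM.
with wave.open("audio.wav", "rb") as wav:
    pcm = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

print(ds.stt(pcm))   # prints the transcript
```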
Mozilla's open source speech recognition project. They're asking people to contribute samples of themselves reading sentences displayed on screen to grow their corpus.