A curated list of delightful Conversational AI resources.
Lingua Franca is Mycroft's multilingual Natural Language Processing library. It allows Mycroft to both understand and respond with naturally expressed entities such as numbers, dates, and times. It is a stand-alone, ready-to-use Python module that currently supports Danish, Dutch, English, French, German, Hungarian, Italian, Portuguese, Spanish, and Swedish. It provides heuristic parsing routines that extract numbers, dates, times, or durations from a spoken-language transcription; natural-language formatters for numbers, dates, times, and durations; and utilities for working with lists in multiple languages. It can also reformat figures so that a speech synthesizer pronounces them better. The extracted information can then be used to work out what the user wants and to gather the details needed to act on it.
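To make "heuristic parsing routines" concrete, here is a minimal sketch of pulling a number out of an English transcription. It is a toy written for this list, not Lingua Franca's implementation or API: the name `toy_extract_number` and the word tables are assumptions, and it only handles English numbers from 0 to 99.

```python
# Toy heuristic for illustration only; NOT Lingua Franca's code or API.
# Assumption: English input, spoken numbers between 0 and 99.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def toy_extract_number(text):
    """Return the first spoken number (0-99) found in text, or None."""
    # Treat "twenty-three" and "twenty three" the same way.
    tokens = text.lower().replace("-", " ").split()
    for i, token in enumerate(tokens):
        if token in TENS:
            value = TENS[token]
            # A tens word may be followed by a units word from one to nine.
            following = tokens[i + 1] if i + 1 < len(tokens) else None
            if following in UNITS and 1 <= UNITS[following] <= 9:
                value += UNITS[following]
            return value
        if token in UNITS:
            return UNITS[token]
    return None


print(toy_extract_number("set a timer for twenty-three minutes"))
```

A real library also has to deal with punctuation, larger numbers, ordinals, fractions, and every supported language, which is why a maintained module is preferable to a hand-rolled table like this one.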
Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides state-of-the-art general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet, CTRL...) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.
Snips NLU (Natural Language Understanding) is a Python library that parses sentences written in natural language and extracts structured information from them. Behind every chatbot and voice assistant lies a common piece of technology: Natural Language Understanding (NLU). Any time a user interacts with an AI using natural language, their words need to be translated into a machine-readable description of what they meant. The NLU engine first detects the user's intention (the intent), then extracts the parameters of the query (called slots). The developer can then use this result to determine the appropriate action or response.
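The two-step flow described above (detect the intent, then extract its slots) can be sketched with plain regular expressions. This is only an illustration of the idea and does not use the Snips NLU API: the intent names, slot names, patterns, and the `parse` function are all hypothetical, and a real NLU engine learns these from training data instead of hand-written rules.

```python
import re

# Hypothetical intents and patterns, for illustration only.
INTENT_PATTERNS = {
    "turn_light_on": re.compile(r"\b(turn|switch) on\b.*\blights?\b"),
    "get_weather": re.compile(r"\bweather\b"),
}

# Hypothetical slot patterns, keyed by the intent they belong to.
SLOT_PATTERNS = {
    "turn_light_on": {"room": re.compile(r"\bin the (?P<value>\w+)")},
    "get_weather": {"city": re.compile(r"\bin (?P<value>[a-z]+)")},
}


def parse(text):
    """Return a machine-readable description of a natural-language query."""
    lowered = text.lower()

    # Step 1: detect the intent (first pattern that matches, else None).
    intent = next(
        (name for name, pattern in INTENT_PATTERNS.items()
         if pattern.search(lowered)),
        None,
    )

    # Step 2: extract the slots defined for that intent, if any.
    slots = {}
    for slot_name, pattern in SLOT_PATTERNS.get(intent, {}).items():
        match = pattern.search(lowered)
        if match:
            slots[slot_name] = match.group("value")

    return {"input": text, "intent": intent, "slots": slots}


print(parse("Turn on the lights in the kitchen"))
```

Because matching runs on the lowercased text, slot values come back lowercased. The returned dictionary is the kind of structure a developer would then map to an action or response.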
A public-domain collection of corpora for training AI/ML bots. It consists of many YAML files containing key/value data on many different subjects, and each category contains multiple documents about related subjects. You can't drop these files into your code as they are; you'll need to write a fairly simple parser tuned to each document's schema. Libraries exist in several programming languages for efficiently using one or more of these files in your own project.
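As a rough idea of what a "fairly simple parser" might look like, the sketch below reads flat `key: value` lines into a dictionary. The flat, one-level layout is an assumption made for this example, as are the function name `parse_flat_pairs` and the sample text; the actual corpus files may be nested, in which case a full YAML library would be the safer choice.

```python
def parse_flat_pairs(text):
    """Parse flat `key: value` lines into a dict.

    Assumption: one pair per line, no nesting. Blank lines, comments,
    and YAML document separators are skipped.
    """
    pairs = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or line == "---":
            continue
        # Split on the first colon only, so values may contain colons.
        key, separator, value = line.partition(":")
        if not separator:
            continue  # Not a key/value line; ignore it.
        # Drop surrounding whitespace and optional quotes from the value.
        pairs[key.strip()] = value.strip().strip("\"'")
    return pairs


# Hypothetical sample content, not taken from the corpus itself.
sample = """\
---
# greetings
hello: "hi there"

bye: see you later
meeting: starts at 10:30
"""

print(parse_flat_pairs(sample))
```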
A Python module for parsing mathematical expressions. It works out the equation you specify (which can be in a spoken language) and solves it.
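One way such a module could work, sketched under stated assumptions: map spoken operator phrases to symbols, then evaluate the result with a restricted walk over Python's `ast` tree instead of `eval`. This is not the listed module's implementation; the phrase table and the names `solve_spoken` and `_evaluate` are hypothetical, and only English operator words with numeric digits are handled.

```python
import ast
import operator

# Hypothetical phrase table; a real module would cover far more wording.
SPOKEN_OPERATORS = {
    "plus": "+",
    "minus": "-",
    "times": "*",
    "divided by": "/",
    "to the power of": "**",
}

# Only these arithmetic operations are allowed during evaluation.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def _evaluate(node):
    """Recursively evaluate a whitelisted arithmetic syntax tree."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        left = _evaluate(node.left)
        right = _evaluate(node.right)
        return BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def solve_spoken(expression):
    """Translate spoken operators to symbols and compute the result."""
    text = expression.lower()
    # Replace longer phrases first so multi-word operators stay intact.
    for phrase in sorted(SPOKEN_OPERATORS, key=len, reverse=True):
        text = text.replace(phrase, SPOKEN_OPERATORS[phrase])
    tree = ast.parse(text.strip(), mode="eval")
    return _evaluate(tree.body)


print(solve_spoken("2 plus 3 times 4"))
```

Walking the syntax tree keeps normal operator precedence while rejecting anything that is not plain arithmetic, such as function calls or attribute access.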