A public domain collection of corpora for training AI ML bots. Consists of many YAML files containing key/value data on many different subjects. Each category contains multiple documents about different related subjects. You won't be able to drop these into your code randomly, you'll need to write a fairly simple parser tuned to the document's schema. There are several libraries in different programming languages for efficiently using one or more of these files in your own project.
NLP training corpuses for the Chatterbot python module. Contains all of the structured text used to teach the text classifier and semantic analysis engines for the module. All user contributed. Encouages contribution by the community. YAML categories The training data consists of actual conversations and fragments thereof in the file.