A public domain collection of corpora for training AI ML bots. Consists of many YAML files containing key/value data on many different subjects. Each category contains multiple documents about different related subjects. You won't be able to drop these into your code randomly, you'll need to write a fairly simple parser tuned to the document's schema. There are several libraries in different programming languages for efficiently using one or more of these files in your own project.
3749 links, including 199 private