VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. It is fully open-sourced under the MIT License]. Incorporated into NLTK.
spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 45+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.
In this post, we’ll be looking at how we can use a deep learning model to train a chatbot on my past social media conversations in hope of getting the chatbot to respond to messages the way that I would.
A modular Python framework for implementing chatops bots. Aims to make it easy to write new plugins that implement various skills and interfaces. Supports XMPP MUCs. Can be configured from inside of chat, so you don't have to edit a config file and restart the bot. Implements command access control.
An article about writing chatbots for chatops in python. Links to frameworks to help do this.
A corpus of over 520 million words which consists of a massive cross-section of the english language between 1990 and 2015. This corpus is used for NLP study, AI training, and lingustic analysis. There's an online service, you can download various forms of it, and you can add to it if you have access.
Tracery is a procedural generation system for generating text, graphics, and more. Think of it like a procgen framework rather than a tool limited to one particular use case. People use it to generate text and dialogue for games, bots (Twitter, et al), artwork, probably music, recipes, insults... Unusual kinds of games have been developed with it, such as rhythm games and dating sims(!). Worth looking into. There is a version for the Twine game development system and a port to Python (https://github.com/aparrish/pytracery), which would make it very useful to us...
A Markov chain generator in Python that is still maintained. Aims to be very extensible. Can save and restore its models as JSON files. Key methods can be overridden. Can randomly generate sentences, splice models together. Can plug NLP software into it to do more interesting things. Tries very hard to not just regurgitate things from the model; you can tweak this a bit. exocortex bots betafork
A Python module that implements an NLP chatbot. Language agnostic, can be trained to speak any spoken language. Train an instance on a corpus and it will be able to communicate in a conversational manner.
Documentation for Discord's REST API.
Documentation for the Chatterbot corpus file format.
NLP training corpuses for the Chatterbot python module. Contains all of the structured text used to teach the text classifier and semantic analysis engines for the module. All user contributed. Encouages contribution by the community. YAML categories The training data consists of actual conversations and fragments thereof in the file.