textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after. It abstracts away the boilerplate so you can get to the stuff you actually care about.
spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. spaCy comes with pre-trained statistical models and word vectors, and currently supports tokenization for 45+ languages. It features the fastest syntactic parser in the world, convolutional neural network models for tagging, parsing and named entity recognition and easy deep learning integration. It's commercial open-source software, released under the MIT license.
Somebody I know on Mastodon threw together a quick utility that picks keywords out of documents you feed it and throws them into a Neo4j graph database for indexing. It's written in Rust.
prose is a natural language processing library (English only, at the moment) in pure Go. It supports tokenization, segmentation, part-of-speech tagging, and named-entity extraction. Besides parsing English text, it can natively extract e-mail addresses, hashtags, @mentions, URLs, and emoticons. It can tag segmented and analyzed text by part of speech, including punctuation marks, and can identify types of entities (people, places). There's also the option to build and train custom models.
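To give a rough idea of what "natively extract e-mail addresses, hashtags, @mentions, URLs" means in practice, here's a minimal sketch in Python using only the standard library. This is not prose's code or API. The regular expressions and the `extract_tokens` helper are my own assumptions, deliberately simplified, and a real tokenizer handles far more edge cases.

```python
import re

# Hypothetical, simplified patterns for illustration only;
# they are NOT the rules that prose itself uses.
PATTERNS = {
    "url": re.compile(r"https?://[^\s<>]+"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # A hashtag must not be glued to a preceding word character.
    "hashtag": re.compile(r"(?<!\w)#\w+"),
    # A mention must not be the domain part of an e-mail address.
    "mention": re.compile(r"(?<![\w@.])@\w+"),
}


def extract_tokens(text):
    """Return a dict mapping each pattern name to its matches in text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    sample = "Mail bob@example.com or ping @alice about #nlp at https://example.org/docs today"
    for kind, matches in extract_tokens(sample).items():
        print(kind, matches)
```

The lookbehind on the mention pattern is there to keep the `@example` inside an e-mail address from being counted as a mention.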