pyquery lets you make jQuery-style queries on XML documents. The API is as similar to jQuery as possible. pyquery uses lxml for fast XML and HTML manipulation.
A Python module for building parsing expression grammars. Build recognizers combinatively, i.e., by plugging together discrete examples of things to look for. Very useful for command parsers.
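To illustrate what "plugging together" recognizers can look like, here is a minimal sketch of the combinator idea in plain Python. It does not use the bookmarked module; every name in it (literal, choice, sequence, parse_command) is hypothetical and chosen only for this example.

```python
# Toy parser combinators. All names are hypothetical and illustrate the
# general technique only; this is not the bookmarked module's API.

def literal(word):
    """Recognize an exact string at the given position."""
    def parse(text, pos):
        if text.startswith(word, pos):
            return word, pos + len(word)
        return None
    return parse

def choice(*parsers):
    """Ordered choice: return the first alternative that matches."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

def sequence(*parsers):
    """Match each parser in turn, collecting the matched values in a list."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

# A tiny command grammar: ("open" / "close") " " ("door" / "window")
command = sequence(
    choice(literal("open"), literal("close")),
    literal(" "),
    choice(literal("door"), literal("window")),
)

def parse_command(text):
    """Return (verb, noun) if the whole input is a valid command, else None."""
    result = command(text, 0)
    if result is None or result[1] != len(text):
        return None
    verb, _space, noun = result[0]
    return verb, noun
```

Calling parse_command("open door") yields a (verb, noun) pair, while input that does not match the whole grammar yields None. A real PEG library adds repetition, lookahead, and error reporting on top of this pattern.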
textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after. Abstracts away the boilerplate so you can get to the stuff you actually care about.
Snips NLU (Natural Language Understanding) is a Python library that lets you parse sentences written in natural language and extract structured information. Behind every chatbot and voice assistant lies a common piece of technology: Natural Language Understanding (NLU). Anytime a user interacts with an AI using natural language, their words need to be translated into a machine-readable description of what they meant. The NLU engine first detects the user's intention (the intent), then extracts the parameters of the query (called slots). The developer can then use this to determine the appropriate action or response.
A Python module that tries to make parsing HTML as easy as Requests makes HTTP requests. Written by the same developer, in fact. Built on top of Requests, so you don't have to juggle both. Python 3.6 and later only. Full JS support, CSS selectors, XPath selectors, user-agent spoofing, automatic redirects.
How to clean punctuation marks out of strings represented as lists in Python without needing to build or import a full text parser. The second answer is the most straightforward but not necessarily the most Pythonic.
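One common way to do this with only the standard library is str.translate with a deletion table. The sketch below is an illustration under that assumption and is not necessarily what any of the linked answers propose; the helper name strip_punctuation is hypothetical.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def strip_punctuation(words):
    """Return a new list with ASCII punctuation removed from each string.

    Strings that consist only of punctuation become empty and are dropped.
    """
    cleaned = (w.translate(_STRIP_PUNCT) for w in words)
    return [w for w in cleaned if w]
```

Note that string.punctuation covers ASCII only, so typographic quotes and other Unicode punctuation pass through untouched, and apostrophes inside words are removed along with everything else.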
The complete text of Text Processing in Python by David Mertz, made free to read online by the author.
A Python module that makes handling dates and times significantly easier. Get, parse, and convert datetimes in various forms, such as time_t, natural language ("23 hours from now," "last year"), and various RFC formats. Can also do timezone and arbitrary time interval algebra. Cross-platform.
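For comparison, here is a rough sketch of a few of these conversions using only the standard library. It is not the bookmarked module's API; the helper names (from_time_t, from_rfc2822, hours_from) are hypothetical, and natural-language parsing is left out because the standard library does not provide it.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def from_time_t(seconds):
    """Convert a Unix time_t value to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

def from_rfc2822(text):
    """Parse an RFC 2822 date string, such as an email Date header."""
    return parsedate_to_datetime(text)

def hours_from(moment, hours):
    """Interval arithmetic: the datetime `hours` hours after `moment`."""
    return moment + timedelta(hours=hours)
```

With these helpers, "23 hours from now" would be written as hours_from(datetime.now(timezone.utc), 23). A dedicated library mainly saves you from writing this kind of glue for each format.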