Python module for extracting text from a variety of document formats. It can also be used as a CLI utility. It handles text-based formats such as CSV, JSON, and HTML, as well as binary formats such as MS Word, MP3, and PDF. The list of supported formats is fairly extensive.
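
To illustrate the general idea of extracting text from several formats behind one entry point, here is a minimal sketch that uses only the Python standard library. It is not this module's API: the names `extract_text`, `HANDLERS`, and the `_from_*` helpers are hypothetical, and only the three text-based formats named above (CSV, JSON, HTML) are covered. Binary formats would need third-party tools and are left out.

```python
import csv
import io
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def _from_csv(raw):
    # Join every non-empty cell with a single space.
    rows = csv.reader(io.StringIO(raw))
    return " ".join(cell for row in rows for cell in row if cell)


def _json_strings(value):
    # Walk the parsed JSON and yield string values only (keys are ignored).
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from _json_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from _json_strings(item)


def _from_json(raw):
    return " ".join(_json_strings(json.loads(raw)))


def _from_html(raw):
    parser = _TextCollector()
    parser.feed(raw)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical dispatch table: file extension -> handler function.
HANDLERS = {".csv": _from_csv, ".json": _from_json, ".html": _from_html}


def extract_text(raw, filename):
    """Return plain text from `raw`, choosing a handler by file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in HANDLERS:
        raise ValueError(f"unsupported format: {suffix!r}")
    return HANDLERS[suffix](raw)
```

With this sketch, `extract_text("a,b\n1,2\n", "table.csv")` returns `"a b 1 2"`. A dispatch table keyed on file extension keeps each format's logic isolated, so supporting another format means adding one handler and one table entry.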