deanmalmgren/textract: extract text from any document. no muss. no fuss. https://github.com/deanmalmgren/textract
Mon 19 Mar 2018 10:34:38 PM PDT

A Python module for extracting text from many kinds of documents; it can also be used as a CLI utility. It handles text-based formats such as CSV, JSON, and HTML, as well as binary formats such as MS Word, MP3, and PDF. The list of supported formats is fairly extensive.
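A minimal sketch of how the module might be called from Python, assuming textract is installed (`pip install textract`) along with whatever external tools a given format needs. The wrapper name `extract_text` and its `extractor` parameter are hypothetical, introduced here only so the example can be exercised without a real document; the one library call used is `textract.process`, which returns the extracted text as bytes.

```python
from typing import Callable, Optional


def extract_text(path: str,
                 extractor: Optional[Callable[[str], bytes]] = None) -> str:
    """Return the text content of the document at `path` as a str.

    `extractor` is a hypothetical hook (not part of textract) that lets
    callers swap in another function mapping a path to bytes. When it is
    omitted, textract.process is used.
    """
    if extractor is None:
        # Imported lazily so the module is only required when actually used.
        import textract  # third-party dependency
        extractor = textract.process

    raw = extractor(path)
    # Assumption: the extracted bytes are UTF-8 (textract's default output
    # encoding); undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    import sys

    # Usage: python extract_text.py path/to/document.pdf
    if len(sys.argv) > 1:
        print(extract_text(sys.argv[1]))
```

On the command line, the equivalent is roughly `textract path/to/document.pdf`, which prints the extracted text to standard output.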

exocortex search documents cli index python text analysis content forensics modules utilities metadata leandra