Libraries for extracting web content.
News extraction, article extraction and content curation in Python.
Pythonic HTML Parsing for Humans.
Extract text from any document: Word, PowerPoint, PDFs, etc.
A module for automatic summarization of text documents and HTML pages.
Every web site provides APIs.
Fast Python port of arc90's readability tool.
Convert HTML to Markdown-formatted text.
A small library for extracting rich content from URLs.
Web Content Retrieval for Humans.