New Python HTML Libraries 2026

html5lib/html5lib-python 1K -1

added 12 months ago

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

alir3z4/html2text 2K +2

added 12 months ago

Convert HTML to Markdown-formatted text.

gawel/pyquery 2K +3

added 12 months ago

A jQuery-like library for Python.

mozilla/bleach 2K +6

added 12 months ago

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes

buriy/python-readability 2K +2

added 12 months ago

Given an HTML document, extract and clean up the main body text and title.

lxml/lxml 3K +6

added 12 months ago

lxml is the most feature-rich and easy-to-use library for processing XML and HTML in the Python language

scrapy/parsel 1K +3

added 1 year ago

Parsel lets you extract data from XML/HTML/JSON documents using XPath or CSS selectors.

psf/requests-html 13K -3

added 1 year ago

This library intends to make parsing HTML as simple and intuitive as possible.
