python - How to find/replace text in HTML while preserving HTML tags/structure
I want to use regexps to transform the text, but I want to preserve the HTML tags. For example, if I replace "stack overflow" with "stack underflow", it should still work as expected when the input is stack <sometag>overflow</sometag>.
Use a DOM library, not regular expressions, when working with HTML:
- lxml: A parser, document, and HTML serializer.
- BeautifulSoup: A parser, document, and HTML serializer.
- html5lib: A parser. It has a serializer.
- ElementTree: A document object.
- cElementTree: A document object implemented as a C extension.
- HTMLParser: A parser.
- Genshi: A parser, document, and HTML serializer.
- xml.dom.minidom: A document model built into the standard library, which html5lib can parse to.
(This list is borrowed from another source.)
Of these, I would recommend lxml, html5lib, and BeautifulSoup.
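
As a rough illustration of the idea (parse first, then touch only the text), here is a minimal sketch. To stay dependency-free it uses the standard library's html.parser, which is the Python 3 name of the HTMLParser module listed above; lxml or BeautifulSoup would do the same job with less code. The names TextReplacer and replace_in_text are made up for this example. Note that it substitutes within each text node separately, so a phrase that is split across a tag boundary (as in the question's input) only matches on the part that sits in one node.

```python
import re
from html.parser import HTMLParser


class TextReplacer(HTMLParser):
    """Re-emit the HTML as parsed, applying a regex substitution to text only."""

    def __init__(self, pattern, repl):
        # convert_charrefs=False reports entities such as &amp; as separate
        # events, so they can be written back the way they appeared.
        super().__init__(convert_charrefs=False)
        self.pattern = re.compile(pattern)
        self.repl = repl
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # The parser lowercases tag names, so </B> comes back as </b>.
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # The only place where the substitution is applied.
        self.out.append(self.pattern.sub(self.repl, data))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def replace_in_text(html, pattern, repl):
    parser = TextReplacer(pattern, repl)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# The attribute value "overflow" is left alone; only the text node changes.
print(replace_in_text('stack <b class="overflow">overflow</b>',
                      r"overflow", "underflow"))
# stack <b class="overflow">underflow</b>
```

A plain regex over the raw string would also have rewritten the class attribute here, which is the kind of breakage a parser-based approach avoids. This sketch also applies the pattern to the contents of script and style elements, so a real version would probably want to skip those.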