TagSoup : Java Glossary

TagSoup
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML (extensible Markup Language), parses HTML (Hypertext Markup Language) as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX (Simple API for XML) interface, it allows standard XML tools to be applied to even the worst HTML. TagSoup also includes a command-line processor that reads HTML files and can generate either clean HTML or well-formed XML that is a close approximation to XHTML (extensible Hypertext Markup Language).

CMP homejump to top You can get the freshest copy of this page from: or possibly from your local J: drive (Java virtual drive/mindprod.com website mirror)
http://mindprod.com/jgloss/tagsoup.html J:\mindprod\jgloss\tagsoup.html
logofeedback Please email your feedback for publication, letters to the editor, errors, omissions, typos, formatting errors, ambiguities, unclear wording, broken/redirected link reports, suggestions to improve this page or comments to Roedy Green : feedback email If you want your message kept confidential, not considered for posting, please explicitly specify that.
mindprod.com IP:[65.110.21.43]
view BlogYour face IP:[38.107.179.213]
You are visitor number 11.