
HTML tree Structure


marco perez


Hi,

I have to write a Java program that builds an element tree for any HTML document. Someone suggested JDOM, but it seems this API requires me to know the structure of the XML in advance (and I'd have to convert the HTML to XML first).

Can you recommend another API or approach that lets me take any HTML document, store its structure, and access any element?

Thanks,
MP


Any other API demands pretty much the same thing: converting the page to XHTML, and thus to XML. XML is easy to process, so there are lots of APIs for it. SGML, and therefore HTML, is not. There are very few parsers for it, and most of the ones that exist convert HTML to XHTML anyway.

You can use Tidy to convert your HTML pages to XHTML. From then on, JDOM, SAX, and a few others that don't come to mind are all APIs you can use to parse your XHTML document.
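To give a rough idea of the second step, here is a minimal sketch that walks an XHTML document without knowing its structure beforehand. It assumes the page has already been converted to well-formed XHTML (for example by Tidy) and, to stay dependency-free, it uses the DOM parser bundled with the JDK (`javax.xml.parsers`) rather than JDOM. The class name `XhtmlTreeDemo`, the helpers `parse` and `outline`, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of JDOM or Tidy.
class XhtmlTreeDemo {

    // Parse a well-formed XHTML string into a DOM tree and return its root element.
    // Assumption: the input is already valid XML (e.g. Tidy output).
    static Element parse(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
        return doc.getDocumentElement();
    }

    // Return one line per element, indented two spaces per nesting level.
    static List<String> outline(Element root) {
        List<String> lines = new ArrayList<>();
        collect(root, 0, lines);
        return lines;
    }

    // Recursive walk: no prior knowledge of the document structure is needed.
    private static void collect(Element element, int depth, List<String> lines) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(element.getTagName());
        lines.add(line.toString());

        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip text, comments, etc.; descend only into child elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, depth + 1, lines);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample input standing in for a Tidy-converted page.
        String xhtml = "<html><head><title>Demo</title></head>"
                + "<body><h1>Hello</h1><p>Some <b>bold</b> text.</p></body></html>";
        for (String line : outline(parse(xhtml))) {
            System.out.println(line);
        }
    }
}
```

The same recursive pattern should carry over to JDOM, where you would iterate over an element's children instead of filtering a `NodeList`. Either way, the tree is discovered as you walk it.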


Archived

This topic is now archived and is closed to further replies.
