
DOMDocument error


Fmdpa


I am constructing a dynamic sitemap with DOMDocument.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=12&amp;desc=vacation_hawaii</loc>
      <changefreq>weekly</changefreq>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=73&amp;desc=vacation_new_zealand</loc>
      <lastmod>2004-12-23</lastmod>
      <changefreq>weekly</changefreq>
   </url>
   <url>
      <loc>http://www.example.com/catalog?item=74&amp;desc=vacation_newfoundland</loc>
      <lastmod>2004-12-23T18:00:15+00:00</lastmod>
      <priority>0.3</priority>
   </url>
   <url id="post">
      <loc>http://www.example.com/catalog?item=83&amp;desc=vacation_usa</loc>
      <lastmod>2004-11-23</lastmod>
   </url>
</urlset>

When I load and validate the document:

$doc = new DOMDocument();
$doc->load('sitemap.xml');
echo $doc->validate();

...I receive this error: "Warning: DOMDocument::validate() [domdocument.validate]: no DTD found! in ..."


What exactly are you trying to do here? The only reason to validate is to check the document against a DTD. Near as I can tell, a DTD restricts what your document can contain. A "normal" XML document with no DTD has no such limitations, so nothing is ever invalid: all element names are permitted, anything can be nested in anything, and all attributes are permitted.

All that can go wrong is malformed XML, and that will keep the DOMDocument from loading in the first place.

So I'm thinking you have nothing to validate, and no need for a DTD?
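
To make that distinction concrete, here is a minimal sketch (not from the original posts). It assumes inline XML strings instead of sitemap.xml, and the `note` document with its tiny internal DTD is purely hypothetical. Well-formedness is checked at load time; validate() only has something to check once a DTD is present.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Well-formed XML loads fine, with or without a DTD.
$doc = new DOMDocument();
$wellFormed = $doc->loadXML('<urlset><url><loc>http://www.example.com/</loc></url></urlset>');

// Malformed XML (unclosed <url>) fails at load time; loadXML() returns false.
$brokenDoc = new DOMDocument();
$broken = $brokenDoc->loadXML('<urlset><url></urlset>');

// validate() needs a DTD. Hypothetical document with an internal DTD subset.
$withDtd = new DOMDocument();
$withDtd->loadXML('<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>hi</note>');
$valid = $withDtd->validate();

var_dump($wellFormed, $broken, $valid);
```

So "checking for errors" in the sense of malformed XML is already done by load() or loadXML(); validate() is a separate step on top of that.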


Ok, I thought validate was to check if there were errors. Well here's the real issue I'm having.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
      <loc>http://www.example.com/catalog?item=74&amp;desc=vacation_newfoundland</loc>
      <lastmod>2004-12-23T18:00:15+00:00</lastmod>
      <priority>0.3</priority>
  </url>
  <url id="post">
      <loc>http://www.example.com/catalog?item=83&amp;desc=vacation_usa</loc>
      <lastmod>2004-11-23</lastmod>
  </url>
</urlset>

$doc = new DOMDocument();
$doc->load('sitemap.xml');
$post = $doc->getElementById('post');

Do you see any problems in this code? I'm trying to fetch the element that has the id of post. It returns NULL though. Why isn't it fetching url#post element?


Thinking more clearly now. getElementById only works if you have a DTD that defines the id attribute. Plain XML doesn't know what an id is. So you need to search for a node with an attribute called id.

You could use getElementsByTagName and loop through the list, checking the attributes.

XPath might simplify things once you get used to the syntax.
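
A rough sketch of that loop approach, for anyone following along. The helper name findUrlById() and the shortened inline sitemap are made up for illustration; they are not from the original posts.

```php
<?php
// Shortened, hypothetical sitemap loaded from a string instead of sitemap.xml.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a</loc></url>
  <url id="post"><loc>http://www.example.com/b</loc></url>
</urlset>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Hypothetical helper: return the first <url> whose plain "id" attribute
// equals $id, or null if there is none.
function findUrlById(DOMDocument $doc, string $id): ?DOMElement
{
    foreach ($doc->getElementsByTagName('url') as $url) {
        if ($url->getAttribute('id') === $id) {
            return $url;
        }
    }
    return null;
}

$post = findUrlById($doc, 'post');
```

This does not depend on a DTD, because it treats id as an ordinary attribute rather than as an XML ID.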


So I puzzled my puzzler on this. The following code should have worked. Well, it does and it doesn't:

$doc = new DOMDocument();
$doc->load('sitemap.xml');
$xpath = new DOMXPath($doc);
$query = "//url[@id='post']";
$results = $xpath->query($query);
// $results->item(0) is your item

It turns out this only works without the namespace attribute in your root element. It's outside my skillset to explain that. There are ways of registering a namespace so xpath works, but I can't make it happen. Are you certain your xmlns is valid?

Anyway, I'm pretty sure the traditional loop strategy I mentioned above should work in every case.


Thanks for the replies! XPath really looks like it simplifies things! But I solved the problem using the approach you mentioned in your second-to-last post. Caution: this thread might not be done yet, though! :) This is my first experience working with PHP's XML extensions, and I may need help later on...


I'm going to refer you to this recent post of mine about the XPath thing... I'll just add DOMXPath::registerNamespace() as the DOM equivalent to a namespace declaration in an XSLT file.

As for the ID... you can define an attribute as an ID with a DTD or XML Schema. If you use a DTD, you don't even have to do validation... you just have to load the DTD.

You can also define them on the DOM level with DOMElement::setIdAttribute(). However, this second way is only applicable if you're currently creating the document and have no DTD yet.

Last but not least, you can ALWAYS, in XML, without even having a DTD or XML Schema, use XML IDs... like

<url xml:id="post">

getElementById() works automatically on them. (Note: again, this applies to XML only, i.e. not to documents loaded with loadHTML() or loadHTMLFile().)
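
Here is a small sketch of two of those options, xml:id and DOMElement::setIdAttribute(). The inline XML and the variable names are assumptions for illustration; the second document is built in code to match the "currently creating the document" case described above.

```php
<?php
// 1) xml:id is recognized as an ID by the parser, no DTD required.
$doc = new DOMDocument();
$doc->loadXML(
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    . '<url xml:id="post"><loc>http://www.example.com/b</loc></url>'
    . '</urlset>'
);
$viaXmlId = $doc->getElementById('post');

// 2) A plain "id" attribute is not an ID until it is declared as one.
$built = new DOMDocument();
$root = $built->appendChild($built->createElement('urlset'));
$url = $root->appendChild($built->createElement('url'));
$url->setAttribute('id', 'post');

$before = $built->getElementById('post'); // plain attribute, not an ID yet

$url->setIdAttribute('id', true);         // declare "id" as an ID on this element
$after = $built->getElementById('post');
```

In the second part, the lookup only succeeds after setIdAttribute() has been called on the element in question.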


Like I said, I tried the methods that would register the namespace identified in the xmlns, and changed my xpath to suit, but it did not work. That's why I wondered if the xmlns address pointed to a correct document. It looks like it points to a directory.


Huh? You don't point to something pointed by the namespace URI... you use the namespace URI itself... copy&paste it there if you will, like:

$xpath->registerNamespace('s', 'http://www.sitemaps.org/schemas/sitemap/0.9');
$results = $xpath->query("//s:url[@id='post']");
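
For completeness, a self-contained sketch that contrasts the unprefixed query with the prefixed one. The shortened inline sitemap is an assumption; the prefix "s" is arbitrary, and only the URI string has to match the xmlns value exactly.

```php
<?php
// Shortened, hypothetical sitemap loaded from a string instead of sitemap.xml.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/a</loc></url>
  <url id="post"><loc>http://www.example.com/b</loc></url>
</urlset>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// In XPath 1.0 an unprefixed name only matches elements in NO namespace,
// so this finds nothing while the default xmlns is in effect.
$unprefixed = $xpath->query("//url[@id='post']");

// Bind a prefix of our choosing to the same URI string, then use it.
$xpath->registerNamespace('s', 'http://www.sitemaps.org/schemas/sitemap/0.9');
$prefixed = $xpath->query("//s:url[@id='post']");

echo $unprefixed->length, ' ', $prefixed->length, "\n";
```

That is why the earlier query only worked once the xmlns attribute was removed from the root element.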


That is exactly what I meant and exactly the syntax I used when I tested the thing. (I simply meant that a URI points to a resource located at that URI.) What I don't know is if http://www.sitemaps.org/schemas/sitemap/0.9 identifies a valid namespace. I assume that if it doesn't, the thing won't work? Something kept it from working.


The only required thing for a namespace URI is for it to be a valid URI. It doesn't have to point to any resource. I can have "http://poiajsdpoihasgpiohafg.oasdfo"... that's still a valid URI, despite the fact that it doesn't go anywhere.

Here's an example of a namespace URI that fails, simply because it's not a valid URI (it has a space in it):

<root xmlns="namespace uri"/>

PHP will open it, but issue a warning due to the namespace being invalid.


Thanks for the replies. I'm still not sure I understand the namespace thing. Is a namespace an address that points to another resource? I've read some discussion about it in the PHP manual, and I didn't quite understand it.

I finished a PHP file with three functions that allow me to add, update or delete entries from the sitemap. It works flawlessly (so far). The only difference I noticed between DOMDocument and JS is that DOMDocument can't create floating node groups. In other words, a node must be bound to the document tree before you can append child or sibling nodes to it. It is a minor annoyance to have to work around that difference.

I'm glad to say that I'm finally starting to see potential in XML. I added a feature to my site yesterday that was a convenient use of XML.


I'm not sure I see the difference you're talking about...

$elementParent = $doc->createElement('parent');
$elementChild = $doc->createElement('child');
$elementParent->appendChild($elementChild);
$doc->appendChild($elementParent);

Works flawlessly, and I never had to add $elementParent to the tree before assigning $elementChild to it. Could you give an example?

The thing to get about namespaces is that they're Identifiers... exactly what the "I" in URI stands for. Think of them as a GUID, a globally unique identifier. They ensure that there won't be naming collisions with other XML dialects, should you choose to integrate them into your dialect or let your dialect be integrated into others for whatever reason.

For example, RSS's "link" element has different semantics than XHTML's "link" element. In theory, RSS could've let XHTML be directly embedded into its description or something, and still let you target one kind of element or the other. In practice, it was left out, possibly because namespaces are not exactly supported in DTD, and XML Schema validators are not as widely supported... if there's an XML Schema variation that allows it, people are not using it, because they too see namespaces as something more complicated than they are.

It's NOTHING MORE than an identifier. The resource at the URI, if any, has ABSOLUTELY no value in the process. The URIs "http://w3schools.com" and "http://w3schools.com/" are two different URIs, even though they point to the same location (note the "/").
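
A tiny sketch of the "nothing more than an identifier" point, using the two URIs from above. The one-element documents are made up for illustration. Nothing here fetches anything; the namespace is just the string.

```php
<?php
// Two hypothetical one-element documents whose namespaces differ
// only by a trailing slash.
$a = new DOMDocument();
$a->loadXML('<root xmlns="http://w3schools.com"/>');

$b = new DOMDocument();
$b->loadXML('<root xmlns="http://w3schools.com/"/>');

// namespaceURI is exposed as a plain string, exactly as written in xmlns.
$nsA = $a->documentElement->namespaceURI;
$nsB = $b->documentElement->namespaceURI;

// Plain string comparison: the trailing "/" makes them different namespaces.
$sameNamespace = ($nsA === $nsB);

var_dump($nsA, $nsB, $sameNamespace);
```

The same goes for DOMXPath::registerNamespace(): the URI you register has to match the document's xmlns value character for character.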


I think I was doing this:

$elementParent = $doc->createElement('parent')->appendChild($doc->createTextNode('value'));

However, I stopped doing that when I found out that the createElement method accepts the value of the node in the second parameter:

$elementParent = $doc->createElement('parent', 'value');
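
For anyone hitting the same thing: a likely explanation, sketched below, is that appendChild() returns the node that was appended, so the chained one-liner stores the text node in the variable rather than the new element. The variable names are made up for illustration.

```php
<?php
$doc = new DOMDocument();

// Chained form: the result of appendChild() is the appended child,
// so $chained ends up holding the text node, not the <parent> element.
$chained = $doc->createElement('parent')->appendChild($doc->createTextNode('value'));

// Keeping a reference to the element first avoids the mix-up.
$parent = $doc->createElement('parent');
$parent->appendChild($doc->createTextNode('value'));
$doc->appendChild($parent);

$serialized = $doc->saveXML($parent);

var_dump($chained instanceof DOMText, $serialized);
```

Passing the text as the second argument to createElement() sidesteps the issue because the call then returns the element itself.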

Namespaces make sense to me now. What's the difference between the URL and URI?


Boy have I been confused about (I guess) a lot of nothing. This incredibly obvious entry at Wikipedia provided this useful quote (my emphases):

. . . the namespace specification does not require nor suggest that the namespace URI be used to retrieve information; it is simply treated by an XML parser as a string. For example, the document at http://www.w3.org/1999/xhtml itself does not contain any code. It simply describes the XHTML namespace to human readers. Using a URI (such as "http://www.w3.org/1999/xhtml") to identify a namespace, rather than a simple string (such as "xhtml"), reduces the possibility of different namespaces using duplicate identifiers.
Don't ask me why I didn't check there first. The W3 spec was a lot less clear.

A wise SGML fan previously explained it to me right here on these forums, but I can't really find the post anymore...

To understand the difference, you have to be introduced to URNs too. Basically...

URI - Uniquely Identifies a resource in any fashion.
URL - Uniquely specifies the Location of a resource. In theory, this location can point to any resource. (non internet example: "Give me what's in the box on the left")
URN - Uniquely specifies the Name of a resource. In theory, this resource can be located anywhere. (non internet example: "Give me the big hammer")

All URLs and URNs are also URIs, with some URLs also being valid URNs and vice-versa. The graphic at the Wikipedia article on URI shows the relation best.


Archived

This topic is now archived and is closed to further replies.
