Any fast way to do this?

Guest FirefoxRocks

I had a collection of recipes stored in Microsoft Word format for some time, and then thought that storing them as XHTML would save loads of space (at that time, I had floppies). So I tried converting some of them, and each recipe came to only 2-3 KB instead of 20-30 KB. What a huge amount of space saved!
Now that I have read about XML, I think it would be more appropriate to store my recipes in XML format instead of XHTML. The problem is that I have a whole lot of XHTML files and I want to convert them to XML fast.
I still have a few Microsoft Word documents that I plan to convert to XML manually, but is there a way to XMLize the XHTML documents quickly? I'm guessing not, because XHTML elements are predefined, while XML elements are made up and then defined in a DTD, Schema, or whatever.
I'm guessing that if it is possible, I will use the XML DOM.


XHTML is XML. You can use the same tools on XHTML that you use for XML.
I mean, just imagine that you define your own XML markup language that looks like XHTML, but with your own DTD rather than the W3C one. If you've covered everything, you'll notice that your language is XHTML.
If you mean HTML, then you can convert it to XHTML using tools like Tidy. Batch files or the PHP bindings may help you do that on many files at once.

Guest FirefoxRocks

Ok, I thought that XML was used for storing, organizing and generally containing information while XHTML is used to present the information to people?


XHTML is a form of XML (XHTML lol). XML is a generalized class of languages that use markup (with <tags>) to structure documents. There are many other markup languages besides XHTML that are used to present information, for example SVG and SMIL. So there is no need to convert your XHTML files to XML, because they already are XML :)
You could, however, as Boen_Robot hinted at, create a new markup language with a Document Type Definition (DTD) that is better suited to the type of data you are storing (which is why XML is called extensible). Then, I suppose, you could write a script in any programming language that has an XML parser (such as Java or PHP) to convert your XHTML documents to your new custom markup language. What sort of information do you have stored currently?

Guest FirefoxRocks

That is exactly what I need: "create a new markup language with a Document Type Definition (DTD) that is more suited to the type of data you are storing".
I will give an example of what I want to do. Here are my current documents:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
	<title>Tea Biscuits</title>
	<link rel="stylesheet" type="text/css" href="css.css" />
</head>
<body>
	<h1>Tea Biscuits</h1>
	<p class="intro">I have this on some of my XHTML documents, but not all of them.</p>
	<h3>Ingredients</h3>
	<ul>
		<li>2 eggs</li>
		<li>250 mL (1 cup) sweet cream</li>
		<li>2.5 mL (½ tsp) salt</li>
		<li>10 mL (2 tsps) baking powder</li>
		<li>330 mL (1 ¾ cups) all-purpose flour</li>
	</ul>
	<h3>Procedure</h3>
	<ol>
		<li>Put unbeaten eggs in mixing bowl.</li>
		<li>Add cream, salt, then flour which has been sifted with baking powder.</li>
		<li>Stir well and drop on greased cookie sheet or bake in square 8x8” pan.</li>
		<li>For shortcake: bake in pie plate.</li>
	</ol>
	<p>Bake Time: 20 minutes at 375<sup>o</sup>F</p>
</body>
</html>

What I sort of want is this:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE recipe SYSTEM "recipe.dtd">
<recipe>
	<name>Tea Biscuits</name>
	<type>Biscuits</type>
	<category>-</category>
	<intro>I have this on some of my XML documents, but not all of them.</intro>
	<ingredient>
		<amount>2</amount>
		eggs
	</ingredient>
	<ingredient>
		<amount>250 mL (1 cup)</amount>
		sweet cream
	</ingredient>
	<ingredient>
		<amount>2.5 mL (½ tsp)</amount>
		salt
	</ingredient>
	<ingredient>
		<amount>10 mL (2 tsps)</amount>
		baking powder
	</ingredient>
	<ingredient>
		<amount>330 mL (1 ¾ cups)</amount>
		all-purpose flour
	</ingredient>
	<procedure>
		<step>Put unbeaten eggs in mixing bowl.</step>
		<step>Add cream, salt, then flour which has been sifted with baking powder.</step>
		<step>Stir well and drop on greased cookie sheet or bake in square 8x8" pan.</step>
		<step>For shortcake: bake in pie plate.</step>
	</procedure>
	<baketime>Bake Time: 20 minutes at 375<sup>o</sup>F</baketime>
</recipe>

Then I will use XSLT to transform the document to make it look nicer than the tree view in browsers :)
So is there a fast way to do this, or no?


There is: more XSLT or DOM, the same tools you'll use for the XML to XHTML transformation. However, if the final goal is to display it in the browser, the point is kind of lost, unless you want the data to be formatted differently (i.e. to have different markup).
Also, the XHTML generated by Word is not semantically rich enough for you to get your output (XML) that easily. Not with XSLT 1.0 anyway. The other way around would have been a breeze, but this is going to be tricky and quirky, especially if your recipes are not formatted consistently (e.g. if you have "2 loaves of bread" in one spot and "loaves of bread - 2" in another).
BTW, a DTD is not needed unless you want to use entities or are going to let others enter recipes (for the second case, Schema is probably a better solution).
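
As a rough illustration of the DOM route, here is a minimal sketch in PHP (7.1 or later). It assumes the XHTML layout from the sample recipe above: an <h1> title, an optional <p class="intro">, a <ul> of ingredients and an <ol> of steps. The function names convertRecipe() and splitAmount() are made up for this example, the amount/ingredient split is only a heuristic, and <type>, <category> and <baketime> are left out because the XHTML does not mark them up in a way that can be picked out reliably.

```php
<?php
// Hypothetical helper: split "250 mL (1 cup) sweet cream" into amount and ingredient.
// Heuristic only: leading number, optional metric unit, optional parenthesised part.
function splitAmount(string $text): array
{
    $pattern = '/^([\d.,\x{00BC}-\x{00BE}\s]+(?:(?:mL|L|g|kg)\b)?(?:\s*\([^)]*\))?)\s+(.+)$/u';
    if (preg_match($pattern, trim($text), $m)) {
        return [trim($m[1]), trim($m[2])];
    }
    return ['', trim($text)]; // no recognisable amount
}

// Hypothetical helper: build a <recipe> document from one XHTML recipe string.
function convertRecipe(string $xhtml): DOMDocument
{
    $in = new DOMDocument();
    $in->loadXML($xhtml);
    $xp = new DOMXPath($in);
    $xp->registerNamespace('h', 'http://www.w3.org/1999/xhtml');

    $out = new DOMDocument('1.0', 'utf-8');
    $out->formatOutput = true;
    $recipe = $out->appendChild($out->createElement('recipe'));

    // Append <name>text</name> to a parent, letting DOM escape the text.
    $add = function (DOMNode $parent, string $name, string $text) use ($out): DOMElement {
        $el = $out->createElement($name);
        $el->appendChild($out->createTextNode($text));
        $parent->appendChild($el);
        return $el;
    };

    $add($recipe, 'name', trim($xp->evaluate('string(//h:h1)')));
    $intro = trim($xp->evaluate('string(//h:p[@class="intro"])'));
    if ($intro !== '') {
        $add($recipe, 'intro', $intro);
    }
    foreach ($xp->query('//h:ul/h:li') as $li) {
        [$amount, $what] = splitAmount($li->textContent);
        $ingredient = $recipe->appendChild($out->createElement('ingredient'));
        if ($amount !== '') {
            $add($ingredient, 'amount', $amount);
        }
        $ingredient->appendChild($out->createTextNode($what));
    }
    $procedure = $recipe->appendChild($out->createElement('procedure'));
    foreach ($xp->query('//h:ol/h:li') as $li) {
        $add($procedure, 'step', trim($li->textContent));
    }
    return $out;
}

$xhtml = <<<XML
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Tea Biscuits</title></head>
<body><h1>Tea Biscuits</h1>
<ul><li>2 eggs</li><li>250 mL (1 cup) sweet cream</li></ul>
<ol><li>Put unbeaten eggs in mixing bowl.</li></ol></body></html>
XML;
echo convertRecipe($xhtml)->saveXML();
```

Anything the heuristic cannot split ends up as an <ingredient> with no <amount>, so inconsistent entries are easy to spot and fix by hand afterwards.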

Guest FirefoxRocks
> There is: more XSLT or DOM, the same tools you'll use for the XML to XHTML transformation. However, if the final goal is to display it in the browser, the point is kind of lost, unless you want the data to be formatted differently (i.e. to have different markup).
The final goal was originally to print more than one recipe on a sheet of paper, but I don't know if that is possible with an XSLT transformation. It was also to organize the information so I can find stuff more easily.
> Also, the XHTML generated by Word is not semantically rich enough for you to get your output (XML) that easily. Not with XSLT 1.0 anyway. The other way around would have been a breeze, but this is going to be tricky and quirky, especially if your recipes are not formatted consistently (e.g. if you have "2 loaves of bread" in one spot and "loaves of bread - 2" in another).
I didn't use Microsoft Word to generate my XHTML, I wrote it manually. I think I have my recipes formatted consistently; I try to, anyway.
> BTW, a DTD is not needed unless you want to use entities or are going to let others enter recipes (for the second case, Schema is probably a better solution).
I need to use entities for fractions and a few other things. I have already finished the DTD portion of it.

> The final goal was originally to print more than one recipe on a sheet of paper, but I don't know if that is possible with an XSLT transformation. It was also to organize the information so I can find stuff more easily.
OK. If that's the case, you can easily do that by manually creating, or better yet generating, a list of your recipes and their respective files (a sort of sitemap) as an XML file and passing that as input to XSLT. From then on, using XSLT's document() function, you can easily pull just the contents of each <body/> into a new tree.
> I didn't use Microsoft Word to generate my XHTML, I wrote it manually. I think I have my recipes formatted consistently; I try to, anyway.
Great. All the better.
> I need to use entities for fractions and a few other things. I have already finished the DTD portion of it.
Again good. Just be sure to keep the DTD locally (instead of referencing W3C's entities list directly), as you'll otherwise slow the process down a lot.

Guest FirefoxRocks
> OK. If that's the case, you can easily do that by manually creating, or better yet generating, a list of your recipes and their respective files (a sort of sitemap) as an XML file and passing that as input to XSLT. From then on, using XSLT's document() function, you can easily pull just the contents of each <body/> into a new tree.
Sorry, I don't understand. What do you mean by getting the contents of each <body/> into a new tree?
> Great. All the better.
So does this mean I have to manually convert everything into XML?
> Again good. Just be sure to keep the DTD locally (instead of referencing W3C's entities list directly), as you'll otherwise slow the process down a lot.
My DTD looks like this, is this ok?
<!ELEMENT recipe (name,type,category?,intro?,ingredient+,procedure,baketime?,servings?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT type (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT intro (#PCDATA)>
<!ELEMENT ingredient (#PCDATA|amount)*>
<!ELEMENT amount (#PCDATA)>
<!ELEMENT procedure (step+)>
<!ELEMENT step (#PCDATA)>
<!ENTITY frac12 "½">
<!ENTITY frac34 "¾">
<!ENTITY deg "º">

> My DTD looks like this, is this ok?
I have no idea. I don't know much DTD.
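
One way to check a DTD without knowing much DTD syntax is to let the parser judge it: validate a small sample recipe against it and read the errors. The sketch below assumes PHP 7.4 or later with the DOM extension. validateSample() is a made-up helper name, and the DTD is embedded as an internal subset only to keep the example self-contained. The DTD posted above mentions baketime and servings in the recipe content model but never declares them, so a sample that actually contains <baketime> should fail until those declarations are added. Declaring them as plain text is an assumption here; the sample recipe nests <sup> inside <baketime>, which would need a mixed content model instead.

```php
<?php
// Hypothetical helper: validate $recipeXml against a DTD given as an internal subset.
// Returns [bool $valid, string[] $messages].
function validateSample(string $dtdBody, string $recipeXml): array
{
    libxml_use_internal_errors(true);   // collect errors instead of emitting warnings
    libxml_clear_errors();
    $doc = new DOMDocument();
    $doc->loadXML("<!DOCTYPE recipe [\n" . $dtdBody . "\n]>\n" . $recipeXml);
    $ok = $doc->validate();
    $messages = array_map(fn($e) => trim($e->message), libxml_get_errors());
    libxml_clear_errors();
    return [$ok, $messages];
}

$dtd = <<<'DTD'
<!ELEMENT recipe (name,type,category?,intro?,ingredient+,procedure,baketime?,servings?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT type (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT intro (#PCDATA)>
<!ELEMENT ingredient (#PCDATA|amount)*>
<!ELEMENT amount (#PCDATA)>
<!ELEMENT procedure (step+)>
<!ELEMENT step (#PCDATA)>
DTD;

$sample = '<recipe><name>Tea Biscuits</name><type>Biscuits</type>'
        . '<ingredient><amount>2</amount> eggs</ingredient>'
        . '<procedure><step>Mix.</step></procedure>'
        . '<baketime>20 minutes</baketime></recipe>';

// As posted: <baketime> is used in the sample but has no declaration.
[$ok, $messages] = validateSample($dtd, $sample);
var_dump($ok);
print_r($messages);

// Assumption: the two missing elements were meant to hold plain text.
$fixedDtd = $dtd . "\n<!ELEMENT baketime (#PCDATA)>\n<!ELEMENT servings (#PCDATA)>";
[$okFixed] = validateSample($fixedDtd, $sample);
var_dump($okFixed);
```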
> So does this mean I have to manually convert everything into XML?
No, as it is already XML. That's what I was trying to say with the first post.
> Sorry, I don't understand. What do you mean by getting the contents of each <body/> into a new tree?
OK. Too much theory, not much code. Sorry about that.
If you have a single XML file of your own that represents a list of your recipes, say like this:

<recipies>
	<recipe>recipe1.xhtml</recipe>
	<recipe>recipe2.xhtml</recipe>
	<!--All other recipies -->
</recipies>

You could then create an XSLT that would fetch the contents of the <body/> of each of those items in the list and assemble all of them into a new XHTML document (tree) that will be presented to the user. A simple template for that would be:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml" xmlns:h="http://www.w3.org/1999/xhtml" exclude-result-prefixes="h">
	<xsl:template match="/recipies">
		<html>
			<head>
				<title>Recipes</title>
			</head>
			<body>
				<h1>My recipes</h1>
				<xsl:for-each select="recipe">
					<div>
						<h2>
							<xsl:value-of select="document(.)//h:title"/>
						</h2>
						<div>
							<xsl:copy-of select="document(.)//h:body/*|document(.)//h:body/text()"/>
						</div>
					</div>
				</xsl:for-each>
			</body>
		</html>
	</xsl:template>
</xsl:stylesheet>

You could easily generate a list of your files with PHP 5 and DOM. Here's a simple wrap-up for the above sample, which assumes all of your recipes are in a single folder called "recipies" at your document root:

<?php
// scandir() also returns "." and "..", so drop those two entries.
$recipiesFolder = array_diff(scandir($_SERVER['DOCUMENT_ROOT'] . '/recipies'), array('.', '..'));
$dom = new DOMDocument('1.0', 'utf-8');
$dom->appendChild($dom->createElement('recipies'));
$recipies = $dom->documentElement;
foreach ($recipiesFolder as $recipe) {
	$recipies->appendChild($dom->createElement('recipe', $recipe));
}
/* This is only if you want to see the result as generated. If you want, you could also do the XSLT transformation in PHP, as long as you have the XSL extension */
header('Content-type: application/xml');
echo $dom->saveXML();
?>

Guest FirefoxRocks
> If you have a single XML file of your own that represents a list of your recipes, say like this:
> <recipies>
> 	<recipe file="recipe1.xhtml"/>
> 	<recipe file="recipe2.xhtml"/>
> 	<!--All other recipies -->
> </recipies>
Do you seriously mean I have to create that?!?
> You could then create an XSLT that would fetch the contents of the <body/> of each of those items in the list and assemble all of them into a new XHTML document (tree) that will be presented to the user. A simple template for that would be:
> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml" xmlns:h="http://www.w3.org/1999/xhtml" exclude-result-prefixes="h">
> 	<xsl:template match="/recipies">
> 		<html>
> 			<head>
> 				<title>Recipes</title>
> 			</head>
> 			<body>
> 				<h1>My recipes</h1>
> 				<xsl:for-each select="recipe">
> 					<div>
> 						<h2>
> 							<xsl:value-of select="document(@file)//h:title"/>
> 						</h2>
> 						<xsl:copy-of select="document(@file)//h:body/*|document(@file)//h:body/text()"/>
> 					</div>
> 				</xsl:for-each>
> 			</body>
> 		</html>
> 	</xsl:template>
> </xsl:stylesheet>
> You could easily generate a list of your files with PHP and DOM.
Ok, it looks simple enough, but what is the h: namespace for? I don't understand much about that part. Oh, and by the way, you do realize that these files are on a USB drive/computer, not on a server with PHP.


It's a namespace declaration. The "h" prefix is bound to the XHTML namespace URI, which is "http://www.w3.org/1999/xhtml". The prefix is needed because XPath 1.0 treats unprefixed names as being in no namespace, so a plain //title would not match XHTML's <title>. You can see the actual binding in the first line of the XSLT:

xmlns:h="http://www.w3.org/1999/xhtml"

The other part

xmlns="http://www.w3.org/1999/xhtml"

sets the default namespace for the output tree, which is again XHTML's namespace URI. The last part

exclude-result-prefixes="h"

removes the declared prefix from the output XML file. Otherwise, the output XHTML file would needlessly (at least in this case) keep the "h" namespace binding.
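
The same prefix rule applies when querying XHTML from PHP's DOM extension, which makes it easy to try out. A small sketch, where the inline document and the choice of "h" as the prefix are only for illustration:

```php
<?php
// Sketch: the same XPath with and without a prefix bound to the XHTML namespace.
$xhtml = '<html xmlns="http://www.w3.org/1999/xhtml">'
       . '<head><title>Tea Biscuits</title></head><body><p>Hi</p></body></html>';

$doc = new DOMDocument();
$doc->loadXML($xhtml);
$xpath = new DOMXPath($doc);

// Unprefixed names in XPath 1.0 mean "no namespace", so this matches nothing.
$plain = $xpath->query('//title')->length;

// The prefix itself is arbitrary; only the URI it is bound to has to match.
$xpath->registerNamespace('h', 'http://www.w3.org/1999/xhtml');
$prefixed = $xpath->query('//h:title')->length;

printf("unprefixed: %d, prefixed: %d\n", $plain, $prefixed); // unprefixed: 0, prefixed: 1
```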

> Oh, and by the way, you do realize that these files are on a USB drive/computer, not on a server with PHP.
Crap. Well, I'm afraid you'll have to put them on a server with PHP then. Either that, or compile yourself a program in Java or C++. The choice is yours, and I think I know what you'll choose.

Guest FirefoxRocks
> Crap. Well, I'm afraid you'll have to put them on a server with PHP then. Either that, or compile yourself a program in Java or C++. The choice is yours, and I think I know what you'll choose.
1. I can install PHP on a Windows 98 computer and do it from there, but do I need Apache then?
2. Is there a very fast tutorial on compiling this sort of program using Visual Basic 2005 Express?
3. In MS-DOS, I can do a "dir *.html /p" command and it will list all of the HTML files in the specified directory. Can I use this to generate the XML file that I need?

> 1. I can install PHP on a Windows 98 computer and do it from there, but do I need Apache then?
> 2. Is there a very fast tutorial on compiling this sort of program using Visual Basic 2005 Express?
> 3. In MS-DOS, I can do a "dir *.html /p" command and it will list all of the HTML files in the specified directory. Can I use this to generate the XML file that I need?
1. No. You could also use PHP from the command line. Compatibility with Windows 98, however, is questionable.
2. Nope. At least I don't know of one.
3. The lists it generates are too verbose. You need only the names. From then on, it would be easy to place tags around them.
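
For what it's worth, "dir *.html /b" prints bare file names only, and the "placing tags around them" step is short once the names are available. A minimal sketch, assuming PHP with the DOM extension is run from the command line. buildList() is a made-up helper, and the names are given inline here rather than read from a saved listing:

```php
<?php
// Hypothetical helper: turn a plain list of file names into the <recipies> list document.
function buildList(array $fileNames): string
{
    $dom = new DOMDocument('1.0', 'utf-8');
    $dom->formatOutput = true;
    $root = $dom->appendChild($dom->createElement('recipies'));
    foreach ($fileNames as $fileName) {
        $fileName = trim($fileName);
        if ($fileName === '') {
            continue; // skip blank lines from the listing
        }
        $item = $root->appendChild($dom->createElement('recipe'));
        // createTextNode() escapes characters such as "&" in file names.
        $item->appendChild($dom->createTextNode($fileName));
    }
    return $dom->saveXML();
}

// Names as they might come from a bare directory listing saved to a text file
// (for instance via file('list.txt')); given inline here.
echo buildList(["recipe1.xhtml", "recipe2.xhtml", ""]);
```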


XHTML is not part of the XML family. XHTML stands for Extensible Hypertext Markup Language. You can think of it as a standard for HTML markup tags that follows all the well-formedness rules of XML.
Note that if you have access to Office 2007, its format is OpenXML. I have not tried it myself, but maybe you can open the files in Word 2007 and save them in Word 2007 (XML) format.
Otherwise, you should use XSLT / XSL-FO as explained above to make the transformation.
Generally I would also recommend using XML Schema instead of DTD or Relax NG, since it is richer, though perhaps more difficult to learn.
In addition to the W3Schools tutorials on the XML family of technologies, here is a Norwegian paper explaining some of the members of the XML family. Even if you do not understand Norwegian, the figures and code examples (orange links) may be of value to you. Related to these technologies, especially XLink, there is an important concept, transclusion, that you should be aware of.
"In computer science, transclusion is the inclusion of part of a document into another document by reference. It is a feature of substitution templates."

> XHTML is not part of the XML family. XHTML stands for Extensible Hypertext Markup Language. You can think of it as a standard for HTML markup tags that follows all the well-formedness rules of XML.
Any markup language that can be processed as XML is essentially XML based, and thus could be called part of the XML family.
> Note that if you have access to Office 2007, its format is OpenXML. I have not tried it myself, but maybe you can open the files in Word 2007 and save them in Word 2007 (XML) format.
This has almost nothing to do with the topic at hand. OpenXML still stores a lot of presentational data, which greatly increases file size. In addition, processing it is extra hard, since it's not pure XHTML to start with and sits in an archive you need to extract before processing.
> Otherwise, you should use XSLT / XSL-FO as explained above to make the transformation.
Custom formats should only be used for new problems. You should use standardised vocabularies when available.

> This has almost nothing to do with the topic at hand. OpenXML still stores a lot of presentational data, which greatly increases file size. In addition, processing it is extra hard, since it's not pure XHTML to start with and sits in an archive you need to extract before processing.
Note, the original post's heading was: "Any fast way to do this?, I need to convert XHTML to XML, or "XMLization"" (my bolding).
Let us assume that the file can be imported into Office 2007 and saved as an XML document. Would that solve the problem? (If it is possible; as explained, I have not tried, since I do not have Office 2007.)
If further processing is needed, let us call the saved file test.xml. Note that XPath (at least version 1.0) cannot select an arbitrary string that crosses several nodes, but XPointer can.
Then $test=simplexml_load_file(test.xml); loads the file / document into memory (XMLReader is a faster PHP stream parser).
Example: (string)$test->xpointer(string-range(//*,'tagname')); will grab every element node with the name tagname as a string that can be manipulated using PHP's string functions like stripos, PHP's XML parsers like SimpleXML, or the more advanced DOM API. Maybe that has been explained already; if so, I am sorry for repeating it.
Hopefully the fast solution is that FirefoxRocks (side note: have you downloaded the latest version of Opera and hit view + style, and what about browser security and mobility?) has access to Office 2007 and that this solves the problem as explained above.

> Note, the original post's heading was: "Any fast way to do this?, I need to convert XHTML to XML, or "XMLization"" (my bolding).
> Let us assume that the file can be imported into Office 2007 and saved as an XML document. Would that solve the problem? (If it is possible; as explained, I have not tried, since I do not have Office 2007.)
Right. Remember that there are many files to be processed, not a single one. What you propose would have sounded good if the file were a *.doc and there were only one. There is no need for any conversion when it's XHTML. Again, XHTML is an XML vocabulary.
> If further processing is needed, let us call the saved file test.xml. Note that XPath (at least version 1.0) cannot select an arbitrary string that crosses several nodes, but XPointer can.
> Then $test=simplexml_load_file(test.xml); loads the file / document into memory (XMLReader is a faster PHP stream parser).
> Example: (string)$test->xpointer(string-range(//*,'tagname')); will grab every element node with the name tagname as a string that can be manipulated using PHP's string functions like stripos, PHP's XML parsers like SimpleXML, or the more advanced DOM API. Maybe that has been explained already; if so, I am sorry for repeating it.
XPath 1.0 can't do THAT? Says who? I mean... in SimpleXML maybe, but I know it's possible in DOM like so:

$doc = new DOMDocument;
$doc->load('test.xml');
$xpath = new DOMXPath($doc);
$xpath->query('//*[.="tagname"]');

DOMXPath::query() returns a DOMNodeList containing all matching nodes. You can then use DOMNodeList::item() to fetch whichever one you want, and then use DOMElement::nodeValue to get the text and manipulate it with PHP's functions.

> Hopefully the fast solution is that (s)he has access to Office 2007 and that this solves the problem as explained above.
The Windows 98 thing above means Office 2007 is not a viable solution, as Office 2007 runs only on Windows XP and Vista.

> Note that XPath (at least version 1.0) cannot select an arbitrary string that crosses several nodes, but XPointer can.
My bolding (that is, before any PHP parsing. Make it simple, as simple as possible, but no simpler).
> XPath 1.0 can't do THAT? Says who?
An Australian professor and an IT Dr from Switzerland, page 4. They have written a related book, "XPath, XLink, XPointer, and XML: A Practical Guide to Web Hyperlinking and Transclusion". I agree with you that this is not proof, though, even if they ought to know what they are writing about.

> My bolding (that is, before any PHP parsing. Make it simple, as simple as possible, but no simpler).
> An Australian professor and an IT Dr from Switzerland, page 4. They have written a related book, "XPath, XLink, XPointer, and XML: A Practical Guide to Web Hyperlinking and Transclusion". I agree with you that this is not proof, though, even if they ought to know what they are writing about.
Ahh.... Now I understand what they really mean: all nodes that contain a certain string, not nodes whose content is exactly that string. Well, again, this is possible in pure XPath 1.0:

//para[contains(.,'tagname')]

Substituting it for the query above will have the same effect as the XPointer you (and they) show. The first expression I showed won't cut it, but this one will.
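
To make the difference between the two expressions concrete, here is a small sketch with a made-up three-paragraph document. It only compares exact matching with contains() in XPath 1.0 and says nothing about what XPointer's string-range would return:

```php
<?php
// Sketch: exact string match versus substring match in XPath 1.0.
$doc = new DOMDocument();
$doc->loadXML(
    '<doc><para>tagname</para><para>a tagname inside text</para><para>other</para></doc>'
);
$xpath = new DOMXPath($doc);

// Matches only elements whose whole string value equals "tagname".
$exact = $xpath->query('//para[.="tagname"]');

// Matches every element whose string value contains "tagname" anywhere.
$partial = $xpath->query('//para[contains(., "tagname")]');

printf("exact: %d, contains: %d\n", $exact->length, $partial->length); // exact: 1, contains: 2
foreach ($partial as $node) {
    echo $node->nodeValue, "\n";
}
```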

Guest FirefoxRocks

Ok, well first of all, I would like to post this:
I have 6 more .doc documents. In total, there are approximately 57 Microsoft Word pages (rounding up). Obviously more than one recipe is stored in each .doc file, especially the one with 21 pages.
I have a trial version of Microsoft Office Home and Student 2007 running. If I were to just save them in Office XML format, that would be no use; I want every recipe separate in its own XHTML document. Also, the Office XML format still contains tons of presentational markup, to "ensure that when reopened in Microsoft Word, it still looks the same".
I plan to manually convert my documents to XHTML, which will take about 2.5 more months, or 2 if I work faster. Also, I have a template set up, which will hopefully speed things up.
Now secondly, I have PHP set up, and I am ready to do a test drive of this thing.
I have generateXML.php (the code above), recipes.xml (the list of recipes that I have right now), and recipes.xsl (the XSLT file copied from above). What do I do now?

> What do I do now?
Buy Robert Richards (2006 or later): "Pro PHP XML and Web Services", ISBN 1-59059-633-1.
That book explains XML validation using DTD, Relax NG or XML Schema in detail. In addition, it describes the different XML parsers. Note that tree-based parsers like SimpleXML and DOM, which load the whole document into memory as a node tree, use far more memory than stream-based parsers like the Simple API for XML (SAX, a push parser) and XMLReader (a pull parser, developed by R. Richards). You will see a huge difference in memory usage between the extensions. Richards lists an example in table 11-2, page 411, with the following outcome of his test:
Internal memory usage by extension:
DOM        SimpleXML   SAX      XMLReader
85.6 MB    85.6 MB     26 KB    177 KB
Surprised? So if memory usage is important, use a stream-based parser if you can. The DOM parser is the most advanced, since it can traverse the node tree in all directions and insert and delete elements. The other parsers lack some of the functionality and flexibility that the DOM parser has in manipulating the data structure represented by the XML document.
The following information about who developed the XML modules and extensions for PHP 5.2 should be enough to convince you of how important his book is:
- LIBXML (Christian Stocker, Rob Richards, Marcus Boerger, Wez Furlong, Shane Caraveo)
- DOM (Christian Stocker, Rob Richards, Marcus Boerger)
- SimpleXML (Sterling Hughes, Marcus Boerger, Rob Richards)
- XMLReader (Robert Richards)
- XMLWriter (Rob Richards, Pierre-Alain Joye)
- XSL (Christian Stocker, Rob Richards)
If you want a lighter introduction that also explains how to make an XML-driven CMS and an XML-driven site search engine (which it seems you may need for your project), I can recommend Thomas Myer (latest edition): "No Nonsense XML Web Development With PHP". If you are a fast reader, you can read most of what you need in one month. Usable code comes with both books.
Minimalistic solution: read the XML tutorials at W3Schools and ask on this forum, as you have already done. Sometimes you need to read to get an overview. XML is a fairly large family of technologies. In a nutshell, pure XML is a data description (meta) language, while (X)HTML are presentational languages. That may be a misuse of the term language.
> Now secondly, I have PHP set up, and I am ready to do a test drive of this thing.
> I have generateXML.php (the code above), recipes.xml (the list of recipes that I have right now), and recipes.xsl (the XSLT file copied from above). What do I do now?
Be sure that your host has PHP 5.x installed. So much is happening, especially around the PHP XML modules and extensions, that the later the stable version your host has, the better.
You should also be aware of the XSLT extensions at EXSLT.
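
For comparison with the DOM examples earlier in the thread, this is roughly what the stream-based style looks like with XMLReader. It is a minimal sketch over a made-up inline document. For a file on disk, $reader->open($path) would take the place of $reader->XML($xml).

```php
<?php
// Sketch: pull recipe names out of a document with XMLReader, one node at a time,
// without building the whole tree in memory.
$xml = '<recipes><recipe><name>Tea Biscuits</name></recipe>'
     . '<recipe><name>Shortcake</name></recipe></recipes>';

$reader = new XMLReader();
$reader->XML($xml);

$names = [];
while ($reader->read()) {
    // Only react to opening <name> tags; everything else streams past.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'name') {
        $names[] = $reader->readString();
    }
}
$reader->close();

print_r($names);
```

For a few dozen small recipe files the memory difference is unlikely to matter, so the simpler DOM code is probably the better fit here.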


<Digression>"But after a while, most programmers realize that this means that a program is equipped with a safety net: many errors that programmers make when they construct programs are caught by this net before they lead to unpleasant effects. An example: A very expensive American space rocket crashed on its way to Venus a few years ago, because of an extremely trivial error in a FORTRAN program. A comma had been written as a point, and, as a consequence of that, the start of a special kind of repeat imperative was mistakenly read as an assignment imperative assigning a value to an undeclared variable. Had it been required to declare every variable in FORTRAN programs, the compiler would have discovered that the variable was undeclared and the error would have been caught much earlier than in the Atlantic Ocean."
Professor Bjørn Kirkerud (1989): "Object Oriented Programming With Simula". Addison Wesley Publishing Company, ISBN 0 201 17574 6. Pages 31-32.
Note that PHP is a C-based language, so there is a great difference between the assignment operator = and the equality operator ==. Code that is written using = where == is correct may compile and run fine until ...</Digression>

> Ahh.... Now I understand what they really mean: all nodes that contain a certain string, not nodes whose content is exactly that string. Well, again, this is possible in pure XPath 1.0:
> //para[contains(.,'tagname')]
> Substituting it for the query above will have the same effect as the XPointer you (and they) show. The first expression I showed won't cut it, but this one will.

Great if you are correct. Note that identical is a very strict term in mathematics. Have you tried an example and used var_dump(Object) to see whether the output is identical before you start parsing / processing? But identical is perhaps not needed here, as long as the job can be done. If two pieces of code do the same job in the same language, I prefer, ceteris paribus, the fastest (sometimes the one with less overhead). Code is Queen, refactoring to better code is King.


Efficiency of a single process is not a priority here. By "fast", FirefoxRocks meant "that will require the least effort from me" (FirefoxRocks, feel free to correct me if I'm wrong here).
OK. What to do from here. First, you don't need the XML list file. It was just a sample of the XML you need to generate with PHP and then transform.
If your desired result is a single combined file you can then do something with, first open php.ini (you have renamed one of the sample php.ini files to just php.ini, right?) and uncomment the line "extension=php_xsl.dll". Also, find the extension_dir directive and set it to the folder PHP is installed in + "\ext". This is all in order to enable the XSL extension, which is needed to perform the XSLT transformation and give you the output XML file. Look at the XSLT FAQ if you don't get what I just said.
After that, the actual PHP file you need is more like this:

<?php
// Here are all of the things that you may need to change.
$recipiesFolder = 'recipies';
$xsltFile = 'recipes.xsl';
$output = 'result.html';

$dom = new DOMDocument('1.0', 'utf-8');
$dom->appendChild($dom->createElement('recipies'));
$recipies = $dom->documentElement;
foreach (scandir($recipiesFolder) as $recipe) {
	// scandir() also returns ".", ".." and any subfolders; keep files only.
	if (!is_file($recipiesFolder . '/' . $recipe)) {
		continue;
	}
	// Store the folder along with the name, so the path is not just a bare file name.
	$recipies->appendChild($dom->createElement('recipe', $recipiesFolder . '/' . $recipe));
}
$xsl = new DOMDocument;
if (!$xsl->load($xsltFile)) {
	die('Error in XSLT file. The file may not be well formed or it may not exist. The exact error message returned by LibXML is: ' . libxml_get_last_error()->message);
}
$proc = new XSLTProcessor;
$proc->importStylesheet($xsl);
// transformToURI() returns the number of bytes written, or false on failure.
$bytes = $proc->transformToURI($dom, $output);
if ($bytes === false) {
	die('Error during transformation. The exact error message returned by LibXML is: ' . libxml_get_last_error()->message);
}
echo 'File "' . $output . '" was created successfully. ' . $bytes . ' bytes were written.';
?>

Once you have that PHP file, first notice the things at the top. As the comment says, those are the things you might need to edit. The first is the folder with all of your recipes. Use some test folder if you're not sure about this, but still use XML files. The second, as should be obvious, is the XSLT file, in case you decide to rename it. Note that in this case, the XSLT is assumed to be in the same folder. Edit the path there if you wish to keep it separate. And last, but definitely not least, is the resulting file. By default, a new file called result.html will be created (or overwritten if it exists) in the same folder as the PHP file.
If all the parameters and paths seem OK, simply type on the command line (while in PHP's folder)

php -f "path/to/the/php/file"

So if you've saved the PHP file as, say, "D:\test\generateXML.php", you'll type

php -f "D:\test\generateXML.php"

Once started, the final message "File * was created successfully." should appear shortly and the file should be ready.
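
Before pointing the script at real files, it may be worth checking that the XSL extension is actually enabled. The sketch below is a throwaway test, assuming nothing beyond the DOM and XSL extensions. The inline list and stylesheet are made up for the check and are not part of the recipe setup.

```php
<?php
// Sketch: confirm the XSL extension works by running a tiny in-memory transformation.
if (!extension_loaded('xsl')) {
    die("The XSL extension is not loaded; check extension_dir and php.ini.\n");
}

$xml = new DOMDocument();
$xml->loadXML('<recipies><recipe>a.xhtml</recipe><recipe>b.xhtml</recipe></recipies>');

$xsl = new DOMDocument();
$xsl->loadXML(
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
       <xsl:output method="text"/>
       <xsl:template match="/recipies">
         <xsl:value-of select="count(recipe)"/>
       </xsl:template>
     </xsl:stylesheet>'
);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$result = trim($proc->transformToXml($xml)); // the stylesheet outputs the number of <recipe> items

echo "Recipes listed: $result\n";
```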

Guest FirefoxRocks

Ok I will do that once I finish converting all my recipes to XHTML. Hopefully I get it done before Christmas. :)
