rahn Posted December 23, 2006
Hi folks, thanks for the assistance. I have a couple thousand XML docs, each in its own directory, and I am not sure about the best way to set up a simple keyword search through all of them. Can someone point me in the right direction? I have a typical LAMP setup, so I was thinking that the way to go would be to set up an XQuery, stick the results in an array, recurse through all the directories, and then pump it out with PHP. Or should I just do a regex search through all the docs, store references to the matched files in an array, and then call up each XML file with an XSLT? Or should I be a bad boy and stick all the XML docs into MySQL, search through the full text, and then call up the XML file with XSLT? I'm confused, but want to do this the right way! Thanks. R.
boen_robot Posted December 23, 2006
XQuery sounds like the best option to me, but I don't think PHP has any support for it just yet. It only supports XPath 1.0 and DOM. That could be enough if you use it wisely. I never really thought about it before, but now that I do, here's the logic you'd need to implement in PHP:
1. Scan the directory for the targeted files. I think the scandir() function might be a good start, but I'm not sure how to keep only the XML files and/or filter those files even further (by directory, for example).
2. Loop through the array and load each file from it.
3. Run the desired XPath expression over the current file.
4. Generate an XML fragment containing each result (and optionally some related data).
5. Append the fragment to a single, newly created DOM tree and go to the next result. Each subsequent result is added to this same DOM tree.
6. Take the new DOM tree as XML input and use an XSLT file to create the result page.
7. Show the user the transformation.
Tough? Maybe not as much as it sounds, but it's not a walk in the park, for me at least. When XPath 2.0 and XQuery 1.0 are supported in PHP, things will get easier. By using the standard doc() function, you'll be able to generate the tree without creating a new DOM instance for every document.
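The steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, using only the standard library; the function names (`find_xml_files`, `search_documents`) and the `<results>`/`<match>` element names are made up for the example, and a plain substring check over element text stands in for whatever XPath expression you would actually run. The final XSLT step is left out because the Python standard library has no XSLT processor.

```python
import os
import xml.etree.ElementTree as ET


def find_xml_files(root_dir):
    """Yield the path of every .xml file under root_dir, recursing into subdirectories."""
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if name.lower().endswith(".xml"):
                yield os.path.join(dirpath, name)


def search_documents(root_dir, keyword):
    """Return a <results> element holding one <match> per element whose text contains keyword.

    The result element names are hypothetical; pick whatever your stylesheet expects.
    """
    results = ET.Element("results", {"keyword": keyword})
    needle = keyword.lower()
    for path in find_xml_files(root_dir):
        try:
            tree = ET.parse(path)
        except ET.ParseError:
            continue  # skip malformed files instead of aborting the whole search
        for element in tree.iter():
            text = (element.text or "").strip()
            if needle in text.lower():
                # One fragment per hit, all collected in the same result tree.
                match = ET.SubElement(
                    results, "match", {"file": path, "element": element.tag}
                )
                match.text = text
    return results
```

Calling `ET.tostring(search_documents("docs", "keyword"))` would give you the combined result document as XML, which is what you would then hand to the stylesheet. Note that every search reparses every file, so the cost grows with the size of the collection.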
rahn Posted January 4, 2007
Well, hmm. I've been thinking of an even cheaper solution: convert all the XML docs to XHTML via XSLT and then set up a standard search through those docs. I could even just do a Google site search then.
boen_robot Posted January 5, 2007
Could the Google search engine be used on newly created files? Wow. If so, can't Google simply search the XML files? Anyhow, that won't be useful if you want to have your own result page. It's also impractical if you're only searching for certain text at a known position.
rahn Posted January 12, 2007
Well, I found that parsing through all the XML docs is really slow, so what I ended up doing was extracting a few of the more important nodes and sticking them in MySQL. A search on a few fields is really fast, and then Sablotron transforms the matching XML doc with an XSLT. What I meant by using Google was to do a Sablotron run on all the XML files to transform them and save them as static HTML. Google would pick them up a few days later, probably. But the current solution seems to be working OK, though Sablotron sometimes chokes on files.
boen_robot Posted January 13, 2007
I see why you didn't like my solution: it would only work with PHP5, and Sablotron is only part of PHP4. PHP5's libxslt is faster.
kvnmck18 Posted January 18, 2007
Wow, I like this, rahn. boen, did I hear PHP4? Haha. God, I need to stop being so worthless.