Query XML doc - display result in RSS format


Brig

First, I'm a total newbie when it comes to XML and XSLT. I've been given the task of finding out how to build a dynamic RSS feed for our website's search function.

The way it should work when it's done: a user can search a database of articles. If the user searches for the words 'web developer', he will get a list of results. Here is where the RSS feed comes in. On the result page there should be an option to subscribe to this search as an RSS feed. (It should work exactly like the RSS button on this search engine.)

I've got the RSS feed working, so I can get a feed of all the articles. I need to find out how to query this RSS feed/XML document so I can build a feed that displays only the articles that contain the words the user searched for.

Could I use XSLT and XPath to accomplish this? I've experimented a bit with XPath and I can query the RSS XML file and get results, but only if the query words exactly match the words in the XML file. The output also needs to be in RSS 2.0 format. I think this is where I would use XSLT?

Sorry for the long post. I've been searching the web for how to do this for over a week now and am starting to get a bit desperate :) Any help or pointers in the right direction would be greatly appreciated. Thanks!

You can use the contains() function to query elements that have a certain word or phrase. For example:

//title[contains(.,'Web developer')]

This selects every title that contains the phrase "Web developer", which could include:

Tools for Web developers
I'm working as a Web developer
Web developers were once only geeks
and so on. Note that the above is case sensitive, so it will not match:
Tools for Web Developers
I'm working as a web Developer
WEB developers were once only geeks
and there isn't a straightforward way for XPath 1.0 to be case insensitive. The only way I can think of with XPath 1.0 is to translate the elements' content to one case and translate the search term to the same case. Comparing the translated strings amounts to a case-insensitive comparison. In code, this would be something like:
//title[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ,translate('Web Developer','ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'))]
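For comparison, here is a minimal sketch of the same test in XPath 2.0, where the lower-case() function removes the need for the two alphabet strings. It assumes an XSLT 2.0 processor is available, which the thread does not confirm; with an XSLT 1.0 processor the translate() form above is the one to use. Note that //title also matches the channel's own title and the image title, not only item titles.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: needs an XSLT 2.0 processor; lower-case() does not exist in XPath 1.0 -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Lowercase both sides so the comparison ignores case -->
    <xsl:for-each select="//title[contains(lower-case(.), lower-case('Web Developer'))]">
      <!-- Print each matching title on its own line -->
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Unlike the translate() version, which only folds the 26 letters listed in its arguments, lower-case() also handles letters outside A to Z.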

Thank you for your reply, boen_robot. It's certainly a good step in the right direction :) If you have the time, I would be very grateful if you could help me with some sample code to get me started. I will supply as much information as I can.

I have a Cold Fusion script that picks up the article data from an SQL server and turns it into an RSS XML file with the correct RSS format. The output looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
<channel>
  <title>some title</title>
  <link>some link</link>
  <description>some description</description>
  <language>en</language>
  <lastbuilddate>Wed, 18 Oct 2006 14:57:32 PST</lastbuilddate>
  <image>
    <title>HN</title>
    <url>some url</url>
    <link>some link</link>
  </image>
  <item>
    <title>Title 1</title>
    <description>Description 1</description>
    <link>Link 1</link>
    <pubdate>Wed, 18 Oct 2006 14:55:48 PST</pubdate>
  </item>
  <item>
    <title>Title 2</title>
    <description>Description 2</description>
    <link>Link 2</link>
    <pubdate>Wed, 18 Oct 2006 14:43:21 PST</pubdate>
  </item>

How would I use the contains function you described with this RSS feed? And where would I put the translate-to-same-case expression you described?

//title[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ,translate('Web Developer','ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'))]

If I could just get the query function working, it would be a big step. The next step after that will be to pick up the search string that the user enters in the search engine and put that string into the contains function:

//title[contains(.,'Whatever the users searched for')]

Would this be possible? Thank you so much for your help!

Since there will be more than one title match in most cases, it's logical to put this expression in a for-each or a template match, and do whatever you want for each match. For example:

<xsl:template match="/rss/channel">
  <ul>
    <xsl:for-each select="item[node()[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),translate('Web Developer','ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'))]]">
      <li>
        <p class="title"><xsl:value-of select="title"/></p>
        <p class="description"><xsl:value-of select="description"/></p>
      </li>
    </xsl:for-each>
  </ul>
</xsl:template>

In order to be able to adjust the search term, you need to make it a parameter. Cold Fusion could then set that parameter at run time and influence the output. You could also put the upper case and lower case alphabets in variables to shorten the XSLT a bit:

<!-- Declared at the top level of the stylesheet (outside any template) so the calling code can set it -->
<xsl:param name="search-term"/>
<xsl:variable name="upper-case" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="lower-case" select="'abcdefghijklmnopqrstuvwxyz'"/>
<xsl:template match="/rss/channel">
  <ul>
    <xsl:for-each select="item[node()[contains(translate(.,$upper-case,$lower-case),translate($search-term,$upper-case,$lower-case))]]">
      <li>
        <p class="title"><xsl:value-of select="title"/></p>
        <p class="description"><xsl:value-of select="description"/></p>
      </li>
    </xsl:for-each>
  </ul>
</xsl:template>

With this, using a special command in Cold Fusion (I don't know exactly what that is in Cold Fusion, though), you can set the parameter to the search term.

However, now that the idea is a bit clearer, I think I should give you an idea of an alternative approach. Using Cold Fusion, create a page with the results in either XHTML or RSS format. Using XSLT, you can transform the one into the other. The $search-term parameter will no longer be needed, since all of the results in the input will already be the desired search results. In fact, that's actually the best approach, because XSLT's performance on large XML files (your whole database) could be too slow for anyone.

Thanks again for your reply! Your first piece of code is brilliant. I'm taking this in small steps, as I can feel I'm in over my head with the little knowledge I have :) With this code I can specify a search word in the code and get the results:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/rss/channel">
  <xsl:for-each select="item[node()[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),translate('sql','ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'))]]">
    <p><xsl:value-of select="title"></xsl:value-of></p>
    <p><xsl:value-of select="description"></xsl:value-of></p>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

This displays all articles that contain the word sql. I've linked the XSL file in the XML document, but this turns the XML document into a list with the title and description content.

<?xml version="1.0" encoding="ISO-8859-1"?><?xml-stylesheet type="text/xsl" href="stylesheet1.xsl"?><rss version="2.0"><channel>

Instead of displaying it as a text list as it does now, could it be displayed as an XML document? What I really want the stylesheet to do is filter the XML document so that it removes every item element, and everything in it, whose title or description doesn't match the keyword in the stylesheet.

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet1.xsl"?>
<rss version="2.0">
<channel>
  <title>Some title</title>
  <link>Some link</link>
  <description>Some articles</description>
  <language>en</language>
  <lastbuilddate>Wed, 18 Oct 2006 14:57:32 PST</lastbuilddate>
  <item>
    <title>First Title</title>
    <description>Description 1</description>
    <link>Link 1</link>
    <pubdate>Wed, 18 Oct 2006 14:55:48 PST</pubdate>
  </item>
  <item>
    <title>Second Title</title>
    <description>Description 2</description>
    <link>Link 2</link>
    <pubdate>Wed, 18 Oct 2006 14:55:48 PST</pubdate>
  </item>
</channel>
</rss>

For example, if the keyword is "First", it should only display the following:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet1.xsl"?>
<rss version="2.0">
<channel>
  <title>Some title</title>
  <link>Some link</link>
  <description>Some articles</description>
  <language>en</language>
  <lastbuilddate>Wed, 18 Oct 2006 14:57:32 PST</lastbuilddate>
  <item>
    <title>First Title</title>
    <description>Description 1</description>
    <link>Link 1</link>
    <pubdate>Wed, 18 Oct 2006 14:55:48 PST</pubdate>
  </item>
</channel>
</rss>

I'm getting quite confused myself, so I'm sorry if this doesn't make sense! :) Thanks again!

The point I'm trying to make is that the RSS input (the file that XSLT will process) is probably going to be your whole database, right? And you want to filter the whole database for the search term. Well, that's certainly not good, because performance will drop considerably with such large documents (supposing your database is huge, of course).

To improve performance, you need to create a search engine in Cold Fusion itself that returns the already filtered results as either XHTML or RSS input. For example, you may set Cold Fusion to return the following XHTML page when the search term is "First":

<html lang="en" xml:lang="en">
<head>
  <title>Some title</title>
  <meta name="description" content="Some articles"/>
</head>
<body>
<ul>
  <li>
    <p>First Title</p>
    <p><a href="Link 1">Description 1</a></p>
  </li>
</ul>
</body>
</html>

Then, taking this as input, XSLT could create an RSS feed:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet1.xsl"?>
<rss version="2.0">
<channel>
  <title>Some title</title>
  <link>Some link</link>
  <description>Some articles</description>
  <language>en</language>
  <lastbuilddate>Wed, 18 Oct 2006 14:57:32 PST</lastbuilddate>
  <item>
    <title>First Title</title>
    <description>Description 1</description>
    <link>Link 1</link>
    <pubdate>Wed, 18 Oct 2006 14:55:48 PST</pubdate>
  </item>
</channel>
</rss>

And the code for that gets simpler, because you won't need complex filters. A simple:

<xsl:template match="ul">
  <xsl:for-each select="li">
    <item>
      <title><xsl:value-of select="p[1]"/></title>
      <description><xsl:value-of select="p[2]"/></description>
      <link><xsl:value-of select="p[2]/a/@href"/></link>
    </item>
  </xsl:for-each>
</xsl:template>

is all you need to generate the items.

This last stylesheet also answers your other question. Yes, the output could be in any XML-based format, including XSLT itself.
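To show how the item template might sit inside a complete stylesheet, here is a minimal sketch that turns the XHTML results page into a full RSS channel. It makes two assumptions that are not in the posts above: the input has no XHTML namespace declaration (as in the sample page), and the channel link is passed in through a hypothetical feed-link parameter, since the sample page does not contain a feed URL.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: the input page carries no feed URL, so the caller supplies one -->
  <xsl:param name="feed-link" select="'Some link'"/>

  <!-- Build the channel from the page's head, then hand the result list to the ul template -->
  <xsl:template match="/html">
    <rss version="2.0">
      <channel>
        <title><xsl:value-of select="head/title"/></title>
        <link><xsl:value-of select="$feed-link"/></link>
        <description><xsl:value-of select="head/meta[@name='description']/@content"/></description>
        <xsl:apply-templates select="body/ul"/>
      </channel>
    </rss>
  </xsl:template>

  <!-- One RSS item per list entry: first paragraph is the title, second holds the link and description -->
  <xsl:template match="ul">
    <xsl:for-each select="li">
      <item>
        <title><xsl:value-of select="p[1]"/></title>
        <description><xsl:value-of select="p[2]"/></description>
        <link><xsl:value-of select="p[2]/a/@href"/></link>
      </item>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If the real page declares xmlns="http://www.w3.org/1999/xhtml", the match patterns and paths would need a prefix bound to that namespace (for example h:html, h:ul), because unprefixed names in XPath 1.0 only match elements in no namespace.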

Thanks so much for your reply. I will go through what you've written and see how it goes :) The XML will not be that large. The content of the RSS feed will only be vacancy listings. We have an RSS feed for each department, so they never get that large, as jobs are being taken off the database as well as put in.

Really grateful for all the help! Will let you know how it goes! :) Thanks!

Hi again, boen_robot. I'm hoping you could answer one question I have. I have this XSL stylesheet:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/rss/channel">
<rss version="2.0">
<channel>
  <title>Some title</title>
  <link>Some link</link>
  <description>Some articles</description>
  <language>en</language>
  <lastbuilddate>Wed, 18 Oct 2006 14:57:32 PST</lastbuilddate>
  <xsl:for-each select="item[node()[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),translate('java','ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'))]]">
    <xsl:value-of select="title"></xsl:value-of>
    <xsl:value-of select="description"></xsl:value-of>
    <xsl:value-of select="link"></xsl:value-of>
  </xsl:for-each>
</channel>
</rss>
</xsl:template>
</xsl:stylesheet>

I have linked it in this XML file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="stylesheet.xsl"?>
<rss version="2.0">
<channel>
  <title>title</title>
  <link>Link</link>
  <description>Latest jobs</description>
  <language>en</language>
  <lastbuilddate>Wed, 18 Oct 2006 14:57:32 PST</lastbuilddate>
  <image>
    <title>HN</title>
    <url>Image</url>
    <link>Link</link>
  </image>
  ........
</channel>
</rss>

The actual query is working fine and it only displays what I want it to, but it only displays it as unformatted text. I want the result to be displayed as structured XML code. Do you know how I can do this, please?

Btw, I'm still looking into the other solution you gave me, creating a search engine in Cold Fusion. Thanks a lot again!

Everything that's not in XSLT's namespace is copied to the output as written, so it can be anything you want. So, instead of using

<xsl:value-of select="title"></xsl:value-of><xsl:value-of select="description"></xsl:value-of><xsl:value-of select="link"></xsl:value-of>

you could simply add the desired elements around those. And for goodness' sake, write empty elements with "/>", will you?

<title><xsl:value-of select="title"/></title>
<description><xsl:value-of select="description"/></description>
<link><xsl:value-of select="link"/></link>

<xsl:value-of/> just outputs the value. Or should I say "text". It doesn't copy the elements and their attributes. If you want that, you can simply use a single xsl:copy-of instead, like this:

<xsl:for-each select="item[node()[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz') ,translate('java','ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'))]]"><xsl:copy-of select="."/></xsl:for-each>
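Putting the pieces from this thread together, a complete filtering stylesheet might look like the sketch below. It is one possible arrangement, not a definitive one. It assumes the feed structure shown earlier (rss/channel/item), declares search-term as a top-level parameter so the calling code can set it ('java' is only a placeholder default), and, unlike the node() test above, checks only title and description so that a term appearing in a link or date does not count as a match.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Top-level parameter so the caller can pass the search term; 'java' is only a default -->
  <xsl:param name="search-term" select="'java'"/>
  <xsl:variable name="upper-case" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lower-case" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <!-- Lowercase the search term once instead of inside every comparison -->
  <xsl:variable name="term" select="translate($search-term, $upper-case, $lower-case)"/>

  <xsl:template match="/rss">
    <rss version="2.0">
      <xsl:apply-templates select="channel"/>
    </rss>
  </xsl:template>

  <xsl:template match="channel">
    <channel>
      <!-- Copy the feed metadata (title, link, description, image, ...) unchanged -->
      <xsl:copy-of select="*[not(self::item)]"/>
      <!-- Copy only the items whose title or description contains the term, ignoring case -->
      <xsl:copy-of select="item[contains(translate(title, $upper-case, $lower-case), $term) or contains(translate(description, $upper-case, $lower-case), $term)]"/>
    </channel>
  </xsl:template>
</xsl:stylesheet>
```

Two things to keep in mind with this sketch: an empty search term matches every item, because contains() is true for an empty string, and a browser that applies the stylesheet through an xml-stylesheet instruction will generally render the result rather than show it as XML source, so running the transformation on the server is probably the more reliable way to hand subscribers an actual RSS document.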
