
RSS Parsing


MinusMyThoughts


I've got a question that I'm hoping is really simple, but I'm having a little difficulty finding documentation that's clear.

I'm trying to parse an RSS feed for a client to allow him to post a few recent Flickr photos on his homepage with a link back to his Flickr page.

After some initial trouble, I got PHP5 running with SimpleXML, and I'm having no trouble getting the link back to work. What's getting me is that I want to use the thumbnails, not the full images, and I'm having a difficult time connecting to that particular tag.

Here's one item from the feed:

<item>
    <title>Embry Rucker Photograph for Men's Health</title>
    <link>http://www.flickr.com/photos/thenategreenexperience/2195398252/</link>
    <description><p><a href="http://www.flickr.com/people/thenategreenexperience/">nategreen03</a> posted a photo:</p><p><a href="http://www.flickr.com/photos/thenategreenexperience/2195398252/" title="Embry Rucker Photograph for Men's Health"><img src="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_m.jpg" width="240" height="160" alt="Embry Rucker Photograph for Men's Health" /></a></p></description>
    <pubDate>Tue, 15 Jan 2008 07:17:21 -0800</pubDate>
    <dc:date.Taken>2008-01-15T07:17:21-08:00</dc:date.Taken>
    <author>nobody@flickr.com (nategreen03)</author>
    <guid isPermaLink="false">tag:flickr.com,2004:/photo/2195398252</guid>
    <media:content url="http://farm3.static.flickr.com/2420/2195398252_5597e6c349_o.jpg"
                   type="image/jpeg"
                   height="534"
                   width="800"/>
    <media:title>Embry Rucker Photograph for Men's Health</media:title>
    <media:text type="html"><p><a href="http://www.flickr.com/people/thenategreenexperience/">nategreen03</a> posted a photo:</p><p><a href="http://www.flickr.com/photos/thenategreenexperience/2195398252/" title="Embry Rucker Photograph for Men's Health"><img src="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_m.jpg" width="240" height="160" alt="Embry Rucker Photograph for Men's Health" /></a></p></media:text>
    <media:thumbnail url="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_s.jpg" height="75" width="75" />
    <media:credit role="photographer">nategreen03</media:credit>
    <media:category scheme="urn:flickr:tags">magazine for health photographs mens embry rucker</media:category>
</item>

The important part I'm trying to pull out is this:

<media:thumbnail url="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_s.jpg" height="75" width="75" />

The code I'm currently using (that doesn't work properly) is:

function flickrParse($numrows) {
    $url = "http://api.flickr.com/services/feeds/photos_public.gne";
    $user = "?id=22729170@N05&lang=en-us&format=rss_200";

    $feed = $url.$user;
    $feedsrc = "http://flickr.com/thenategreenexperience";

    $rss = simplexml_load_file($feed);
    $channel = $rss->channel;
    foreach($channel->item as $item) {
        echo <<<________EOB
        <a href="{$item->link}"><img src="{$item->media->thumbnail->url}" /></a>
        <br />
________EOB;
    }
}

I thought I was being clever, but it's just a dead image.

Has anyone dealt with this before? I'd love some help!

Thanks!
-Jason


object(SimpleXMLElement)#5 (6) {
  ["title"]=>
  string(40) "Embry Rucker Photograph for Men's Health"
  ["link"]=>
  string(63) "http://www.flickr.com/photos/thenategreenexperience/2195398252/"
  ["description"]=>
  string(387) "<p><a href="http://www.flickr.com/people/thenategreenexperience/">nategreen03</a> posted a photo:</p><p><a href="http://www.flickr.com/photos/thenategreenexperience/2195398252/" title="Embry Rucker Photograph for Men's Health"><img src="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_m.jpg" width="240" height="160" alt="Embry Rucker Photograph for Men's Health" /></a></p>"
  ["pubDate"]=>
  string(31) "Tue, 15 Jan 2008 07:17:21 -0800"
  ["author"]=>
  string(31) "nobody@flickr.com (nategreen03)"
  ["guid"]=>
  string(37) "tag:flickr.com,2004:/photo/2195398252"
}

It doesn't give the piece I need as a variable, which sucks.

However, it appears that the only difference between the medium-size thumbnail and the small thumbnail I want to use is that "_m.jpg" at the end of the img src needs to be "_s.jpg" instead.

Ok, sure, but now I need a regular expression to extract that img tag and modify it, right?

I'm definitely not very good at regex, so if anyone would be so kind as to point me in the right direction to formulate an expression that will extract the <img> tag and replace "_m" with "_s" for a pretty little preview sidebar, I'd be awfully grateful!

Thanks!
-Jason
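For the regex route asked about above, here is a minimal sketch. It assumes the description holds an <img> whose src ends in "_m.jpg", as in the sample item; the helper name flickrSmallThumb and the trimmed $description string are made up for illustration.

```php
<?php
// Hypothetical helper: pulls the first <img> src out of an HTML fragment
// and swaps the Flickr "_m" (medium) suffix for "_s" (small square).
function flickrSmallThumb($html) {
    // Capture whatever sits between the quotes of the first <img ... src="...">.
    if (!preg_match('/<img[^>]+src="([^"]+)"/i', $html, $match)) {
        return null; // no image found in the fragment
    }
    // Only rewrite the suffix that sits directly before the extension.
    return preg_replace('/_m\.jpg$/', '_s.jpg', $match[1]);
}

// Trimmed copy of the description from the sample item.
$description = '<p><a href="http://www.flickr.com/photos/thenategreenexperience/2195398252/">'
    . '<img src="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_m.jpg" width="240" height="160" /></a></p>';

echo flickrSmallThumb($description), "\n";
```

This leans on Flickr's file naming staying the same, so reading the media:thumbnail element directly is probably the sturdier option if it can be reached.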


This feed uses namespaces, which means you must either:

1. Use xpath(), but register the "media" namespace first.
2. Use DOM, and the getElementsByTagNameNS() method in particular.

I don't know what the "media" namespace is, as you haven't shown the root element where the namespace is probably declared (look for 'xmlns:media="'); otherwise I could easily write the whole code for you.
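Both options can be sketched roughly as follows. The namespace URI is the one declared on the feed's root element (posted further down the thread); the inline $xml is a trimmed sample so the snippet runs without fetching anything, and the variable names are only illustrative.

```php
<?php
// Trimmed sample of the feed; only the parts needed for the demo.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <link>http://www.flickr.com/photos/thenategreenexperience/2195398252/</link>
      <media:thumbnail url="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_s.jpg" height="75" width="75" />
    </item>
  </channel>
</rss>
XML;

$ns = 'http://search.yahoo.com/mrss/';

// Option 1: SimpleXML + xpath(), registering the prefix first.
$rss = simplexml_load_string($xml);
$rss->registerXPathNamespace('media', $ns);
$viaXpath = array();
foreach ($rss->xpath('//item/media:thumbnail') as $thumb) {
    // url is an attribute without a namespace, so read it via attributes().
    $viaXpath[] = (string) $thumb->attributes()->url;
}

// Option 2: DOM + getElementsByTagNameNS().
$dom = new DOMDocument();
$dom->loadXML($xml);
$viaDom = array();
foreach ($dom->getElementsByTagNameNS($ns, 'thumbnail') as $node) {
    $viaDom[] = $node->getAttribute('url');
}

print_r($viaXpath);
print_r($viaDom);
```

Both lookups go by the namespace URI, not the prefix, so they keep working even if a feed binds that URI to a prefix other than "media".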


I haven't used namespaces before, so you're kind of going over my head. Here's the raw XML, and I did see the 'xmlns:media' you were talking about.

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:media="http://search.yahoo.com/mrss/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    >
    <channel>
        <title>Photos from nategreen03</title>
        <link>http://www.flickr.com/photos/thenategreenexperience/</link>
        <description></description>
        <pubDate>Tue, 15 Jan 2008 07:17:21 -0800</pubDate>
        <lastBuildDate>Tue, 15 Jan 2008 07:17:21 -0800</lastBuildDate>
        <generator>http://www.flickr.com/</generator>
        <image>
            <url>http://l.yimg.com/www.flickr.com/images/buddyicon.jpg#22729170@N05</url>
            <title>Photos from nategreen03</title>
            <link>http://www.flickr.com/photos/thenategreenexperience/</link>
        </image>
        <item>
            <title>Embry Rucker Photograph for Men's Health</title>
            <link>http://www.flickr.com/photos/thenategreenexperience/2195398252/</link>
            <description><p><a href="http://www.flickr.com/people/thenategreenexperience/">nategreen03</a> posted a photo:</p><p><a href="http://www.flickr.com/photos/thenategreenexperience/2195398252/" title="Embry Rucker Photograph for Men's Health"><img src="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_m.jpg" width="240" height="160" alt="Embry Rucker Photograph for Men's Health" /></a></p></description>
            <pubDate>Tue, 15 Jan 2008 07:17:21 -0800</pubDate>
            <dc:date.Taken>2008-01-15T07:17:21-08:00</dc:date.Taken>
            <author>nobody@flickr.com (nategreen03)</author>
            <guid isPermaLink="false">tag:flickr.com,2004:/photo/2195398252</guid>
            <media:content url="http://farm3.static.flickr.com/2420/2195398252_5597e6c349_o.jpg"
                           type="image/jpeg"
                           height="534"
                           width="800"/>
            <media:title>Embry Rucker Photograph for Men's Health</media:title>
            <media:text type="html"><p><a href="http://www.flickr.com/people/thenategreenexperience/">nategreen03</a> posted a photo:</p><p><a href="http://www.flickr.com/photos/thenategreenexperience/2195398252/" title="Embry Rucker Photograph for Men's Health"><img src="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_m.jpg" width="240" height="160" alt="Embry Rucker Photograph for Men's Health" /></a></p></media:text>
            <media:thumbnail url="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_s.jpg" height="75" width="75" />
            <media:credit role="photographer">nategreen03</media:credit>
            <media:category scheme="urn:flickr:tags">magazine for health photographs mens embry rucker</media:category>
        </item>
    </channel>
</rss>

Is there a good reference that's easy to understand that would help me figure out namespaces?

Thanks!
-Jason


So, I found a Zend article on SimpleXML that talks about namespaces, so I followed the example and came up with this code:

$rss = simplexml_load_file($feed);
foreach ($rss->children('http://search.yahoo.com/mrss/') as $entry) {
    $photo .= <<<____________EOD
    <img src="{$entry->thumbnail}" />
____________EOD;
}

However, this isn't working.

I'm sure that there's some little detail I'm not understanding. Could anyone give me a point in the right direction?

Thanks!
-Jason
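A likely reason the loop above yields nothing: children() returns only the direct children in the given namespace, and the root <rss> element has none; the media: elements sit inside each <item>. The URL is also an attribute of <media:thumbnail>, not its text. A small sketch against a trimmed inline sample (the $xml string and variable names are illustrative):

```php
<?php
$xml = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <media:thumbnail url="http://farm3.static.flickr.com/2420/2195398252_2a9bc3e3ff_s.jpg" height="75" width="75" />
    </item>
  </channel>
</rss>
XML;

$ns  = 'http://search.yahoo.com/mrss/';
$rss = simplexml_load_string($xml);

// <rss> has no direct children in the media namespace.
$atRoot = count($rss->children($ns));

// Each <item> does, so ask there instead.
$item   = $rss->channel->item[0];
$atItem = count($item->children($ns));

// The URL is an attribute of <media:thumbnail>, not its text content.
$thumb = $item->children($ns)->thumbnail;
$text  = (string) $thumb;                    // empty: the element has no text
$url   = (string) $thumb->attributes()->url; // the thumbnail URL

echo "root: $atRoot, item: $atItem\n";
echo "text: '$text'\n";
echo "url:  $url\n";
```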


Well, I figured it out (sorta). I still don't entirely understand how or why namespaces work, but I found a working example on Ben Ramsey's blog, and that led me to my solution.

Thanks for starting me in the right direction. Here's the final, working code:

function flickrParse($numrows) {
    $url = "http://example.com";
    $user = "?id=some_numbers&lang=en-us&format=rss_200";

    $feed = $url.$user;

    $rss = simplexml_load_file($feed);
    $photodisp = ''; // start with an empty string so .= has something to append to
    foreach ($rss->channel->item as $item) {
        $link   = (string) $item->link;
        // media: elements live in the Media RSS namespace, so ask for that namespace's children
        $media  = $item->children('http://search.yahoo.com/mrss/');
        // url, width and height are attributes of <media:thumbnail>
        $thumb  = $media->thumbnail->attributes();
        $url    = (string) $thumb['url'];
        $width  = (string) $thumb['width'];
        $height = (string) $thumb['height'];

        $photodisp .= <<<________________EOD
        <div class="flickr_photo">
            <a href="{$link}">
                <img src="{$url}" width="{$width}" height="{$height}" style="border:0;" />
            </a>
        </div>
________________EOD;
    }

    return $photodisp;
}

Hopefully this can save someone else a day's worth of heartache.

-Jason
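One loose end: the function accepts $numrows but never uses it, so every item in the feed is printed. A rough sketch of honoring the limit follows. The name flickrThumbs, the inline $xml with example.com URLs, and the htmlspecialchars() escaping are all additions for illustration, not part of the code above; the XML is passed in as a string so the snippet runs without network access.

```php
<?php
// Hypothetical variant of flickrParse(): stops after $numrows items.
function flickrThumbs($xml, $numrows) {
    $rss  = simplexml_load_string($xml);
    $html = '';
    $seen = 0;
    foreach ($rss->channel->item as $item) {
        if ($seen++ >= $numrows) {
            break; // limit reached
        }
        $thumb = $item->children('http://search.yahoo.com/mrss/')->thumbnail->attributes();
        // Escape values before writing them into HTML attributes (an addition here).
        $link = htmlspecialchars((string) $item->link);
        $src  = htmlspecialchars((string) $thumb['url']);
        $html .= "<a href=\"{$link}\"><img src=\"{$src}\" /></a>\n";
    }
    return $html;
}

// Made-up three-item feed for the demo.
$xml = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item><link>http://example.com/1</link><media:thumbnail url="http://example.com/1_s.jpg" /></item>
    <item><link>http://example.com/2</link><media:thumbnail url="http://example.com/2_s.jpg" /></item>
    <item><link>http://example.com/3</link><media:thumbnail url="http://example.com/3_s.jpg" /></item>
  </channel>
</rss>
XML;

echo flickrThumbs($xml, 2);
```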
