Re: Storing Rss Feed In "Utf-8" Format


deepakbhalla

I am trying to create an application in C# that downloads RSS feeds and saves them to the D drive. I am using the following C# code to save the file to disk:

rssReader = new XmlTextReader(FeedAddress.Text);
rssDoc = new XmlDocument();
// Load the XML contents into an XmlDocument
rssDoc.Load(rssReader);
rssDoc.Save("D://text.xml");

Everything works well, and the file is saved in "utf-8" format, if the link has an .xml extension (e.g. http://msdn.microsoft.com/rss.xml). The problem comes when I try to save a feed with an .rss extension (e.g. http://orangecounty.craigslist.org/apa/index.rss): instead of "utf-8", the feed is saved in "ISO-8859-1" format.

Moreover, when I save the .rss feed using the Save As option of Internet Explorer, it is saved in "utf-8" format, but when I use the C# code to save the same feed, it is saved in "ISO-8859-1" format.

Is there any way to save a feed with an .rss extension in "utf-8" format instead of "ISO-8859-1"?

Thanks,
Deepak

Try to ditch XmlTextReader in favor of a plain old XmlDocument.Load(), like:

rssDoc = new XmlDocument();
// Load the XML contents into an XmlDocument
rssDoc.Load(FeedAddress.Text);
rssDoc.Save("D://text.xml");

Alternatively, try to create a stream to the feed, and pass that stream to XmlDocument.Load().

I haven't worked with C# enough to help you more.
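For what it's worth, the stream idea could be sketched roughly as below. This is only an illustration, not tested against the feeds in question: the class and method names (FeedSaver, LoadFromStream, SaveAsUtf8) are made up for the example. It relies on the documented behaviour that XmlDocument.Save(XmlWriter) writes in the writer's encoding rather than the one named in the XML declaration, so forcing UTF-8 on the writer should produce a UTF-8 file regardless of what the feed declared.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper; the names are illustrative, not from the thread.
public static class FeedSaver
{
    // Loads XML from any stream, e.g. the one returned by WebClient.OpenRead.
    public static XmlDocument LoadFromStream(Stream input)
    {
        var doc = new XmlDocument();
        doc.Load(input);
        return doc;
    }

    // Writes the document to the output stream as UTF-8,
    // whatever encoding the source document declared.
    public static void SaveAsUtf8(XmlDocument doc, Stream output)
    {
        // Update the in-memory declaration too, so it agrees with the output.
        var declaration = doc.FirstChild as XmlDeclaration;
        if (declaration != null)
        {
            declaration.Encoding = "utf-8";
        }

        var settings = new XmlWriterSettings();
        settings.Encoding = new UTF8Encoding(false); // UTF-8 without a byte order mark

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            doc.Save(writer);
        }
    }
}
```

To write to disk, the output stream could be a FileStream, e.g. File.Create("D:/text.xml"), and the input stream could come from new WebClient().OpenRead(FeedAddress.Text).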

I have tried ditching the XmlTextReader, but the file is still saved in "ISO-8859-1" format. I have even tried saving the XML file using the System.Net.WebClient class, but it also saves the file in ISO-8859-1 format.

Does this mean that the server is sending me the file in "ISO-8859-1" format? If that is the case, then why do Internet Explorer and Firefox display the file as "UTF-8"? Is there any way to convert the XML file from ISO-8859-1 to UTF-8?
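One way to check what the feed itself claims is to read the encoding named in its XML declaration. The sketch below is a hypothetical helper (FeedInspector and GetDeclaredEncoding are names invented here). It is relevant because XmlDocument.Save(string) is documented to follow XmlDeclaration.Encoding, so a feed that declares ISO-8859-1 would presumably be written back out in ISO-8859-1.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper; the names are illustrative, not from the thread.
public static class FeedInspector
{
    // Returns the encoding named in the XML declaration,
    // or an empty string if the document declares none.
    public static string GetDeclaredEncoding(Stream input)
    {
        var doc = new XmlDocument();
        doc.Load(input);

        var declaration = doc.FirstChild as XmlDeclaration;
        return declaration == null ? "" : declaration.Encoding;
    }
}
```

If this reports ISO-8859-1 for the .rss feed, the saved file is simply mirroring the feed's own declaration.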

Have you tried inspecting the HTTP headers of the feed? (I suppose System.Net.WebClient has an option for that.) If the encoding declared in the XML file and the one in the HTTP headers don't match, the results may vary. In particular, check the "Content-Type" header. It should be "application/xml" or at least "application/rss+xml". If it's "text/xml", then perhaps that's the problem, as HTTP/1.1 defines ISO-8859-1 as the default charset for "text/*" types (i.e. when the header carries no explicit charset parameter).

Either way, I don't know of any C# library to convert one encoding into another.
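A rough sketch of the header check, assuming WebClient.ResponseHeaders is populated once the request has been made: FeedHeaders and its method names are invented for this example, and System.Net.Mime.ContentType is used only as a convenient parser for the header value. The fetch itself is not something I have run against the feeds above.

```csharp
using System.Net;
using System.Net.Mime;

// Hypothetical helper; the names are illustrative, not from the thread.
public static class FeedHeaders
{
    // Returns the charset parameter of a Content-Type value,
    // or an empty string if the header carries none.
    public static string GetCharset(string contentTypeHeader)
    {
        var contentType = new ContentType(contentTypeHeader);
        return contentType.CharSet ?? "";
    }

    // Returns the media type part, e.g. "text/xml".
    public static string GetMediaType(string contentTypeHeader)
    {
        return new ContentType(contentTypeHeader).MediaType;
    }

    // Requests the feed and returns the raw Content-Type header the server sent.
    public static string FetchContentType(string feedUrl)
    {
        using (var client = new WebClient())
        using (client.OpenRead(feedUrl))
        {
            return client.ResponseHeaders["Content-Type"] ?? "";
        }
    }
}
```

Calling FetchContentType on the feed URL and passing the result to GetMediaType and GetCharset should show whether the server sends "text/xml" and whether it names a charset at all.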
