Gokuu Posted April 19, 2007

First of all, let me apologize if this has already been posted, but I can't perform a search on "XML". Also, sorry for the long post :)

Now, to the actual problem I'm having...

Many of you must already know of World of Warcraft's Armory (http://armory.wow-europe.com). The data is made of an XML file transformed with an XSL stylesheet, and I'd like to get access to the raw XML, but have been unable to do so up to now.

What baffles me the most is that if I view the source in either Firefox or IE, or even open the file in Notepad directly from the Web, I get the raw XML I need, which makes me believe that, even with PHP, the XSL is applied on the "client" side.

I've tried using the DOM and the filesystem functions, with the same results. I even tried running a packet sniffer to capture the packets sent to the server by Notepad, and sent the same packets to a socket I had opened. But that didn't work either.

The closest I got was when I tried using wget with a different user-agent (a possible solution I saw somewhere else).
But I get an incomplete result:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="/layout/character-skills.xsl"?>
    <page globalSearch="0" lang="en_gb" requestUrl="/character-skills.xml">
      <characterInfo>
        <character battleGroup="Reckoning" charUrl="r=Bronzebeard&n=Zethus" class="Hunter" classId="3" faction="Alliance" factionId="0" gender="Male" genderId="0" guildName="The Conclave" guildUrl="r=Bronzebeard&n=The+Conclave&p=1" lastModified="19 April 2007" level="49" name="Zethus" race="Night Elf" raceId="4" realm="Bronzebeard" title=""/>
        <skillTab>
          <skillCategory key="professions" name="Professions">
            <skill key="leatherworking" max="300" name="Leatherworking" value="257"/>
            <skill key="skinning" max="375" name="Skinning" value="300"/>
          </skillCategory>
          <skillCategory key="secondaryskills" name="Secondary Skills">
            <skill key="cooking" max="300" name="Cooking" value="265"/>
            <skill key="firstaid" max="300" name="First Aid" value="260"/>
            <skill key="fishing" max="225" name="Fishing" value="170"/>
            <skill key="riding" max="75" name="Riding" value="75"/>
          </skillCategory>
          <skillCategory key="weaponskills" name="Weapon Skills">
            <skill key="axes" max="245" name="Axes" value="127"/>
            <skill key="bows" max="245" name="Bows" value="194"/>
            <skill key="crossbows" max="245" name="Crossbows" value="1"/>
            <skill key="daggers" max="245" name="Daggers" value="94"/>
            <skill key="defense" max="245" name="Defense" value="209"/>
            <skill key="guns" max="245" name="Guns" value="243"/>
            <skill key="polearms" max="245" name="Polearms" value="213"/>
            <skill key="staves" max="245" name="Staves" value="1"/>
            <skill key="swords" max="245" name="Swords" value="123"/>
            <skill key="thrown" max="245" name="Thrown" value="1"/>
            <skill key="two-handedswords" max="245" name="Two-Handed Swords" value="1"/>
            <skill key="unarmed" max="245" name="Unarmed" value="40"/>
          </skillCategory>
          <skillCategory key="classskills" name="Class Skills">
            <skill key="beastmastery" max="0" name="Beast Mastery" value="245"/>
            <skill key="marksmanship" max="0" name="Marksmanship" value="245"/>
            <skill key="survival" max="0" name="Survival" value="245"/>
          </skillCategory>
          <skillCategory key="armorproficiencies" name="Armor Proficiencies">
            <skill key="cloth" max="0" name="Cloth" value="1"/>
            <skill key="leather" max="0" name="Leather" value="1"/>
            <skill key="mail" max="0" name="Mail" value="1"/>
          </skillCategory>
          <skillCategory key="languages" name="Languages">
            <skill key="language:common" max="300" name="Language: Common" value="300"/>
            <skill key="language:darnassian" max="300" name="Language: Darnassian" value="300"/>
          </skillCategory>
        </skillTab>
      </characterInfo>
    </page>

I ran wget with the following command:

    wget -O guild.xml --user-agent='Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)' 'http://armory.wow-europe.com/character-skills.xml?r=Bronzebeard&n=Zethus'

PS: Just for the sake of it, I'm putting the code I used with fsockopen, without success (code changed to increase readability):

    $fp = fsockopen("armory.wow-europe.com", 80, $errno, $errstr) or die("$errno: $errstr");
    $get = <<<EOF
    GET /character-skills.xml?r=Bronzebeard&n=Zethus HTTP/1.1
    Accept: */*
    UA-CPU: x86
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    Host: armory.wow-europe.com
    Connection: Keep-Alive
    Cache-Control: no-cache
    EOF;
    fputs($fp, $get);
    $end = false;
    while (!$end) {
        $line = fgets($fp, 2048);
        if (trim($line) == "") $end = true;
        else echo $line;
    }

Does anyone have any idea on how to do this? Or should I abandon my idea altogether?

Mr_CHISOL Posted April 19, 2007

Hi!

Try to change this:

    while (!$end) {
        $line = fgets($fp, 2048);
        if (trim($line) == "") $end = true;
        else echo $line;
    }

to

    while (!feof($fp)) {
        $line = fgets($fp, 2048);
        echo $line;
    }

Your code will stop at the first empty line, which isn't good, as there can be empty lines in the XML. It's better to check for EOF instead.

You could also use file_get_contents() to get the file:

    $text = file_get_contents("http://armory.wow-europe.com/character-skills.xml?r=Bronzebeard&n=Zethus");

It could, but shouldn't, make a difference whether it's the default User-Agent or Mozilla...

Other than that, I can't see why you would get just a part of the XML file.

Note: I visited the site, and it wasn't until after some clicks and a URL like this: http://armory.wow-europe.com/#search.xml?s...Type=characters that I got a character list, and that one you can't get through view source (as it uses Ajax). But if you know which file generates the list, you should be able to go to that one to get the XML...

Good Luck and Don't Panic!

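To illustrate the point about reading until EOF rather than stopping at the first empty line, here is a minimal sketch, assuming PHP 7 or later. In an HTTP response the first empty line only marks the end of the headers, so a loop that stops there never reaches the body. The helper names `readAll` and `splitHttpResponse` are made up for this example, and the demo reads from an in-memory stream instead of a real socket.

```php
<?php
// Hypothetical helper: read everything from an open stream until EOF.
function readAll($fp): string
{
    $raw = '';
    while (!feof($fp)) {
        $chunk = fgets($fp, 2048);
        if ($chunk === false) {
            break; // nothing more to read
        }
        $raw .= $chunk;
    }
    return $raw;
}

// Hypothetical helper: split a raw HTTP response into header block and body.
function splitHttpResponse(string $raw): array
{
    // Headers end at the first empty line (CRLF CRLF); the rest is the body.
    $parts = explode("\r\n\r\n", $raw, 2);
    return [
        'headers' => $parts[0],
        'body'    => $parts[1] ?? '',
    ];
}

// Demo on an in-memory stream, so no network access is needed.
$fp = fopen('php://memory', 'w+');
fwrite($fp, "HTTP/1.1 200 OK\r\nContent-Type: text/xml\r\n\r\n<page>\n\n<characterInfo/>\n</page>");
rewind($fp);
$response = splitHttpResponse(readAll($fp));
fclose($fp);

echo $response['body'];
```

On a real socket, a loop like this only ends when the server closes the connection, so it is usually paired with a `Connection: close` request header rather than `Keep-Alive`.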
Gokuu Posted April 19, 2007

Thanks, with that small change I discovered that I was sending a Bad Request (that's what you get for copying and pasting code from the web without actually thinking about it), so I investigated a little more and am now able to send a correct header to the socket, but I still get the XML after it has been transformed.

The current header I'm sending is as follows:

    GET /character-skills.xml?r=Bronzebeard&n=Zethus HTTP/1.1
    Host: armory.wow-europe.com
    Content-Type: text/plain
    Content-Length: 22
    Connection: close

    r=Bronzebeard&n=Zethus

with each of the lines ending with \r\n.

Do you know of any way I can change the header I send to make it reply with raw XML?

Oh, and by the way, if you access the page with http://armory.wow-europe.com/character-ski...rd&n=Zethus it will take you directly to the page I'm testing with.

Mr_CHISOL Posted April 19, 2007

Add the user agent to the header again:

    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

I discovered while using wget with the default UA that I got it as HTML, but when I used that UA string in the headers I got it as XML.

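The request discussed above could be assembled roughly as follows. This is only a sketch, assuming PHP 7 or later; `buildGetRequest` is a hypothetical helper, not something from the thread. It leaves out `Content-Type`, `Content-Length` and the body, on the assumption that they are unnecessary for a GET whose parameters are already in the query string.

```php
<?php
// Hypothetical helper: assemble a raw HTTP/1.1 GET request string.
function buildGetRequest(string $host, string $path, array $headers = []): string
{
    $lines = ["GET $path HTTP/1.1", "Host: $host"];
    foreach ($headers as $name => $value) {
        $lines[] = "$name: $value";
    }
    // Every line ends with CRLF; one extra empty line terminates the headers.
    return implode("\r\n", $lines) . "\r\n\r\n";
}

$request = buildGetRequest(
    'armory.wow-europe.com',
    '/character-skills.xml?r=Bronzebeard&n=Zethus',
    [
        'User-Agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
        'Connection' => 'close',
    ]
);

echo $request;
```

The resulting string would then be written to the socket opened with fsockopen, for example with `fwrite($fp, $request)`.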
Gokuu Posted April 19, 2007

Hehe. I came here to say I had discovered the solution by adding

    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

to the header, and I saw you had already posted it :)

One strange thing I noticed... This user agent doesn't parse the XML, but a Lynx browser user agent does...

Well, it's solved. Thanks for your help, Mr. CHISOL.

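For completeness, the same User-Agent trick can probably be applied without raw sockets. The sketch below assumes PHP 7 or later with the SimpleXML extension enabled and `allow_url_fopen` turned on. `fetchWithUserAgent` and `extractSkills` are hypothetical names; the fetch helper is defined but not called, so the example runs offline against a trimmed copy of the XML posted earlier in the thread.

```php
<?php
// Hypothetical helper: fetch a URL while presenting a browser-like User-Agent.
// Defined for illustration only; it is not called in this example.
function fetchWithUserAgent(string $url, string $userAgent)
{
    $context = stream_context_create([
        'http' => ['user_agent' => $userAgent],
    ]);
    return file_get_contents($url, false, $context); // string, or false on failure
}

// Hypothetical helper: map skill names to values from the armory XML.
function extractSkills(string $xml): array
{
    $page = simplexml_load_string($xml);
    if ($page === false) {
        return []; // not parseable as XML
    }
    $skills = [];
    foreach ($page->characterInfo->skillTab->skillCategory as $category) {
        foreach ($category->skill as $skill) {
            $skills[(string) $skill['name']] = (int) $skill['value'];
        }
    }
    return $skills;
}

// Trimmed sample taken from the XML posted earlier in the thread.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<page globalSearch="0" lang="en_gb" requestUrl="/character-skills.xml">
 <characterInfo>
  <skillTab>
   <skillCategory key="professions" name="Professions">
    <skill key="leatherworking" max="300" name="Leatherworking" value="257"/>
    <skill key="skinning" max="375" name="Skinning" value="300"/>
   </skillCategory>
  </skillTab>
 </characterInfo>
</page>
XML;

print_r(extractSkills($sample));
```

Against the live site, the output of `fetchWithUserAgent()` called with the URL and the User-Agent string from this thread would be passed to `extractSkills()` instead of `$sample`, after checking that the fetch did not return false.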
Mr_CHISOL Posted April 19, 2007

"One strange thing I noticed... This user agent doesn't parse the XML, but a Lynx browser user agent does... Well, it's solved. Thanks for your help Mr. CHISOL"

The reason for that is probably that Lynx doesn't support XSL, whereas IE, Firefox, etc. do. So the server generates the transformed source itself, so that it can work with Lynx too.

justsomeguy Posted April 19, 2007

"I even tried running a packet sniffer to sniff the packets sent to the server by Notepad"

What?? Notepad is sending packets? Good god man, scan your system!
