
Php Explode Help


destrugter


Hi, it's been a while since I needed help with PHP. I came across a very helpful function called explode.

For my purpose, I need to grab a webpage and use explode to split it into strings on the newline character, "\n". I got that far, and I got the exact result I wanted, BUT it's far too slow. My page loads in 31 seconds using explode, versus 0.0006 seconds for most of my other pages, which are based on my databases.

Does anyone know a faster alternative to doing this? Or is PHP just not ready to handle an external webpage of 300+ lines within seconds?

If there is a faster alternative, please tell me. I'll post my current code.

Function Find_Minimum($Item_Number)
{
  $txt = file_get_contents('http://itemdb-rs.runescape.com/viewitem.ws?obj='. $Item_Number);
  $data = explode("\n", $txt);
  $ind = 1;
  foreach($data as $line)
  {
    set_time_limit(0);
    If($ind == 220)
    {
      $pattern = "<b>Minimum price:</b>";
      $Item_Minimum_Price = Cut_String($pattern, $line);
    }
    set_time_limit(0);
    $ind++;
  }
  Return $Item_Minimum_Price;
}


Most of the time is probably taken up by the fetching of the external page, not the actual explode routine. Especially if you are calling that function multiple times. Benchmark each section!


I tried that, and in terms of ease and code clean-up it works great. I tested it out, and the page took 37.408 seconds to load.

When I save a particular webpage as an MHT file, where the "obj" value is 2, the resulting file is 364 KB on disk. So if it has to sort through a 364 KB file and then spit out information, I can see why this would cause the 30-second load times.

I do not think it would be possible to get this to work in under 30 seconds. To do so, I would have to make PHP skip the first couple hundred lines of the page and then read only that information into my variable.

[EDIT] Here was my code after making it work with file():

Function Find_Minimum($Item_Number)
{
  $txt = file('http://itemdb-rs.runescape.com/viewitem.ws?obj='. $Item_Number);
  $pattern = "<b>Minimum price:</b> ";
  $Item_Minimum_Price = Cut_String($pattern, $txt[219]);
  Return $Item_Minimum_Price;
}
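For what it's worth, skipping straight to one line can be sketched with fgets, which reads a stream one line at a time and lets you stop as soon as the wanted line is reached. This is only a minimal sketch: read_line_number is a hypothetical helper name, the demo uses an in-memory stream instead of the real page, and it will not help much if most of the time is spent waiting for the remote server rather than reading lines.

```php
<?php
// Sketch: read a stream line by line and stop at the wanted line,
// instead of loading every line into an array first.
// read_line_number is a hypothetical helper, not a built-in function.

function read_line_number($handle, int $lineNumber): ?string
{
    $current = 0;
    while (($line = fgets($handle)) !== false) {
        $current++;
        if ($current === $lineNumber) {
            // Strip the trailing line break before returning.
            return rtrim($line, "\r\n");
        }
    }
    return null; // stream ended before reaching the requested line
}

// Demo on an in-memory stream. For a real page the handle would come from
// fopen($url, 'r'), which requires allow_url_fopen to be enabled.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "first\nsecond\nthird\n");
rewind($handle);

$line = read_line_number($handle, 2);
fclose($handle);
var_dump($line);
```

With the page from the post, line number 220 would correspond to $txt[219] in the file() version.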


It takes milliseconds to explode several hundred KB of text, not seconds. Explode is not your problem. Since you have your page request in the function:

$txt = file('http://itemdb-rs.runescape.com/viewitem.ws?obj='. $Item_Number);

if you call that function 10 times to get 10 different things, it's going to send 10 requests to the server. That's where the time is coming from: waiting for the runescape.com server to return several different responses. Ideally you would only get the page once and then search through that for whatever you're looking for.

Also, if they ever change the format of their page, even just adding one more line, it will break your code. It's better to use a regular expression for this.
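A rough sketch of the regular expression approach, working on a page that has already been downloaded once. The helper name extract_labelled_value, the sample HTML, and the "Maximum price" label are assumptions for illustration; only the "<b>Minimum price:</b>" pattern comes from the code above, and the real page markup may differ.

```php
<?php
// Sketch: pull labelled values out of HTML that was fetched once.
// Assumes the markup looks like "<b>Label:</b> value"; adjust to the real page.

function extract_labelled_value(string $html, string $label): ?string
{
    // preg_quote escapes any regex metacharacters in the label.
    $pattern = '~<b>' . preg_quote($label, '~') . ':</b>\s*([^<\r\n]+)~i';
    if (preg_match($pattern, $html, $matches) === 1) {
        return trim($matches[1]);
    }
    return null; // label not found, e.g. because the page layout changed
}

// Hypothetical sample standing in for the downloaded page.
$html = "<div>\n<b>Minimum price:</b> 1,234\n<br>\n<b>Maximum price:</b> 1,500\n</div>";

// One download, several lookups on the same string.
$minimum = extract_labelled_value($html, 'Minimum price');
$maximum = extract_labelled_value($html, 'Maximum price');
var_dump($minimum, $maximum);
```

Because the match is anchored on the label rather than on a line number, an extra line added elsewhere on the page should not break it, and a missing label shows up as null instead of a wrong value.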


Yes, I understand this. For my example, though, I am only calling the function 1 time for 1 specific thing. I made this as a test to see the load times, so I grabbed just 1 line from the page. I understand that I could do better later by loading it once, but for testing purposes I only read 1 part of the file.

I understand that if they change it, that would mess up my code. I am prepared for this as well. Like I said, this was just a test to see if everything would check out fine, and the only problem so far is the load time.


It's easy enough to add timers to see how long everything takes. The microtime function returns the current Unix timestamp with microsecond precision; called as microtime(true), it returns that as a float in seconds. e.g.:

Function Find_Minimum($Item_Number)
{
  $start = microtime(true);
  $txt = file('http://itemdb-rs.runescape.com/viewitem.ws?obj='. $Item_Number);
  echo 'got file in ' . (microtime(true) - $start) . ' seconds<br>';

  $pattern = "<b>Minimum price:</b> ";

  $start = microtime(true);
  $Item_Minimum_Price = Cut_String($pattern, $txt[219]);
  echo 'cut string in ' . (microtime(true) - $start) . ' seconds<br>';

  Return $Item_Minimum_Price;
}

You can add timers like that inside the Cut_String function to see how long the various steps there take as well.


Maybe so. They might have an API they want people to use instead of just scraping the HTML pages. They might also be checking the user agent string. You might be able to use something like the cURL library to send a request and specify the user agent so it thinks you're a regular browser instead of a bot.
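Roughly, that could look like the sketch below. It assumes the cURL extension is installed; fetch_page is a hypothetical helper name, and the timeout values and the user agent string in the usage comment are made-up examples rather than anything the site is known to require.

```php
<?php
// Sketch: fetch a page with cURL and send a caller-supplied User-Agent header.
// Requires the cURL extension. fetch_page is a hypothetical helper name.

function fetch_page(string $url, string $userAgent, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,            // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent,      // value of the User-Agent request header
        CURLOPT_FOLLOWLOCATION => true,            // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 5,               // stop trying to connect after 5 seconds
        CURLOPT_TIMEOUT        => $timeoutSeconds, // overall time limit for the request
    ]);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Example usage (not executed here, since it needs network access):
// $html = fetch_page(
//     'http://itemdb-rs.runescape.com/viewitem.ws?obj=2',
//     'Mozilla/5.0 (compatible; ExampleFetcher/1.0)'
// );
```

Setting an explicit timeout also means a slow response fails after a bounded wait instead of holding the page for half a minute.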


Wow, that's pretty clever.

I have a question, though: say I go ahead and look at the cURL library as you are explaining. Is any part of what I'm doing illegal in any way? All I'm doing is connecting to the website and grabbing 3 pieces of information so that the people who visit my site have the most accurate information possible about that item.


