
Want to create a web robot.


virak
Dear Everyone, My name is Sopheak Virak, and I am posting from Cambodia. I really need your help to guide me or give me some clues on how to create a web crawler. I want to create "Orkun2uBot", a bot that will visit all websites published under the Cambodian country domains (.com.kh, .gov.kh, .org.kh), just like Googlebot, MSNBot, Yahoo! Slurp, methabot, etc. Please give me some clues. I would appreciate your help. Best regards, Virak


I can try with PHP, sir.
I believe a lot of those robots were written in C and C++. PHP may be a bit limited.

I believe a lot of those robots were written in C and C++. PHP may be a bit limited.
Yes, that is correct, sir. Most robots are written in C or C++, but I am more familiar with PHP than with other programming languages. Could you kindly give me some clues? Best regards, Virak

If you know PHP then try Perl, it is very powerful and similar to PHP syntax.
Dear Sir, I tried some Perl scripts that deal with web crawling, but I still can't see the robot come out when I run the code. Can you explain more about that? I know that I still misunderstand how to create a robot. Best regards, Virak

What code did you try?

A crawler basically:

  • loads a page from a queue
  • stores its contents, and whatever other data is necessary for the crawler's purpose
  • finds all links and adds them to the queue
  • loads the next page on the queue
  • et cetera

PHP is not ideal because it is not designed to run a single script for an extended period of time.
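
The loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not how Googlebot or any other real robot works. The bot name and the .kh suffixes come from the first post; `USER_AGENT`, `LinkParser`, `in_scope`, `fetch`, `crawl`, and the `max_pages` limit are hypothetical names and choices made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import Request, urlopen

# Hypothetical settings, based on the bot name and domains in the first post.
USER_AGENT = "Orkun2uBot/0.1"
ALLOWED_SUFFIXES = (".com.kh", ".gov.kh", ".org.kh")


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def in_scope(url):
    """True if the URL is http(s) and its host ends in an allowed suffix."""
    parts = urlparse(url)
    host = parts.hostname or ""
    return parts.scheme in ("http", "https") and host.endswith(ALLOWED_SUFFIXES)


def fetch(url):
    """Download a page and return its text; errors are left to the caller."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seeds, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl; returns a dict mapping each visited URL to its HTML."""
    queue = deque(seeds)
    seen = set(seeds)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()              # load a page from the queue
        try:
            html = fetch_page(url)
        except Exception:
            continue                       # skip pages that fail to load
        pages[url] = html                  # store its contents
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:          # find all links, add them to the queue
            link, _fragment = urldefrag(urljoin(url, href))
            if in_scope(link) and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Calling `crawl(["http://example.com.kh/"])` (a placeholder seed URL) would return the pages it managed to load, up to `max_pages`. A robot meant for real use would also need to read each site's robots.txt (Python's standard library has `urllib.robotparser` for that) and pause between requests, which this sketch leaves out.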
