
Make My Own Site Crawler/spider


carefree


Hi, this is just a hobby for me. I want to build my own search engine. I have submissions and URLs for sites; I'm just missing the crawler. How can I go about building my own spider? I'll just need to scan the index page of each site and insert the meta tags, keywords, title, etc. into my database. I have good SQL/PHP/HTML knowledge, so if someone points me in the right direction I will have no problems. I'm assuming spiders run on cURL; if so, I won't have too much trouble.
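For anyone reading along, here is a rough sketch of the "scan the index page and store the title and meta tags" step described above. It is written in Python rather than PHP, and it uses SQLite as a stand-in for whatever database you actually run. The names `IndexPageParser`, `extract_page_info`, `save_page_info` and the `pages` table are made up for illustration, so treat this as a starting point under those assumptions, not a finished spider.

```python
import sqlite3
from html.parser import HTMLParser


class IndexPageParser(HTMLParser):
    """Collects the <title> text and the keywords/description <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that appears between <title> and </title> is kept.
        if self._in_title:
            self.title += data


def extract_page_info(html):
    """Return the title, keywords and description found in one HTML page."""
    parser = IndexPageParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "keywords": parser.meta.get("keywords", ""),
        "description": parser.meta.get("description", ""),
    }


def save_page_info(conn, url, info):
    """Store one page in a hypothetical `pages` table (SQLite stand-in)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, keywords TEXT, description TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO pages VALUES (?, ?, ?, ?)",
        (url, info["title"], info["keywords"], info["description"]),
    )
    conn.commit()
```

The HTML itself still has to be downloaded first (cURL in PHP, or `urllib.request.urlopen` in Python); the sketch deliberately starts from a string so the parsing and storage can be tried without network access.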


cURL is one way to do it. Besides fetching pages, you need some clever regular expressions to find the links to other pages. There's also the robots.txt file, which you should follow if you're going to build a crawler. Most search engines now don't look at things like meta tags; they look at the actual body content and get the keywords from that.
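To make the link-finding and robots.txt points concrete, here is a small sketch in Python. Note that it swaps the regular expressions for the standard library's HTML parser, which tends to cope better with messy markup, and it checks robots.txt rules with `urllib.robotparser`. The helper names `LinkParser`, `extract_links`, `allowed_links` and the user agent string `HobbyCrawler` are assumptions for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.robotparser import RobotFileParser


class LinkParser(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL and drop "#fragment".
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        is_web_link = absolute.startswith(("http://", "https://"))
        if is_web_link and absolute not in self.links:
            self.links.append(absolute)


def extract_links(html, base_url):
    """Return the unique absolute links found in one HTML page."""
    parser = LinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def allowed_links(links, robots_txt, user_agent="HobbyCrawler"):
    """Keep only the links that the given robots.txt text permits.

    robots_txt must come from the same host as the links, because
    robots.txt rules only apply to the site that serves them.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [link for link in links if rules.can_fetch(user_agent, link)]
```

In a real crawler you would fetch `/robots.txt` once per host, cache the parsed rules, and run every candidate URL through the check before requesting it.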


Truthfully, it will take me a month of building and testing the spider. I need to find an open-source spider engine to build on. I also got a few quotes from programmers to knock one up for me, but the average quote is $1,200 to $4,000, which is not worth it for a hobby.


Archived

This topic is now archived and is closed to further replies.