NamePros

gattoplano (thread starter), 11-02-2005, 04:44 AM, #1

Protecting ourselves from malicious spiders


Hi,
I'm building a directory containing images. I want to prevent this directory from being crawled by malicious spiders that scrape all of its contents.

At the same time, I want search engine spiders to crawl it deeply and find all of the images and content.

I was thinking about limiting the number of image views per day for ordinary users while setting no limit for search engines. But that sounds like a bad idea :\

Should I use some other cloaking technique? Is there a good article about this issue?

puzzlebox, 11-02-2005, 10:17 AM, #2

Try searching for "robots.txt". It's a file that tells spiders which parts of a site they may crawl, although a malicious spider is unlikely to respect any such limits, whether you use cloaking or not.
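
To illustrate why robots.txt only restrains cooperative crawlers, here is a minimal Python sketch of the check a well-behaved spider performs before fetching a URL. The robots.txt content, the `/images/` path, the example.com URLs, and the `is_allowed` helper are all assumptions made for illustration; nothing forces a malicious spider to run this check at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the image directory.
# The path and the choice of crawler name are assumptions for this sketch.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /images/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules above permit this agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeScraper"):
        print(agent, is_allowed(agent, "http://example.com/images/cat.jpg"))
```

A polite crawler makes this kind of check before each request. A scraper can simply skip it, so the file is better treated as guidance for search engines than as protection.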

stscac, 11-02-2005, 11:10 AM, #3

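You can forbid direct folder access by using chmod to set the directory's permissions to something more restrictive than 755, for example 711, which lets other users reach files by name but not list the directory.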

-Steve
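
As a rough sketch of the permission change suggested above, the Python snippet below applies a mode more restrictive than 755 to a directory and reads the result back. The mode 711 and the `restrict_listing` helper are assumptions for illustration, and the snippet runs against a throwaway temporary directory on a POSIX system. Whether this actually blocks listings over the web depends on which user the web server runs as.

```python
import os
import stat
import tempfile


def restrict_listing(directory: str) -> int:
    """Set the directory to mode 711 and return the resulting permission bits."""
    # 7 = owner may read, write, and traverse; 1 = group and others may only traverse.
    os.chmod(directory, 0o711)
    return stat.S_IMODE(os.stat(directory).st_mode)


# Demonstrate on a temporary directory rather than a real web root.
demo_dir = tempfile.mkdtemp()
mode = restrict_listing(demo_dir)
print(oct(mode))
os.rmdir(demo_dir)
```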

gattoplano (thread starter), 11-02-2005, 01:28 PM, #4

I can't, because my directory is structured so that ordinary users must be able to reach any part of it at any time.

I just want to limit the amount of content that a single visitor can view. I would like to assume that a human cannot carefully open and read more than 10 directories in a minute, or view more than 25 images, and then limit access for content scrapers while leaving it open to search engines.
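
The limit described here could be sketched as a per-client sliding-window counter. This is only one possible approach, not a tested production design: the `SlidingWindowLimiter` class, the use of an IP address as the client key, and the `exempt` flag for trusted search engines are all hypothetical names introduced for illustration. The thresholds of 25 images and 10 directories per minute come from the post above.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_hits requests per client within window_seconds."""

    def __init__(self, max_hits: int, window_seconds: float):
        self.max_hits = max_hits
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client id -> timestamps of recent hits

    def allow(self, client_id: str, now: float = None, exempt: bool = False) -> bool:
        # Trusted clients (for example, search engines you have verified) skip the limit.
        if exempt:
            return True
        if now is None:
            now = time.monotonic()
        recent = self.hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False
        recent.append(now)
        return True


# Thresholds taken from the post: 25 images or 10 directories per minute.
image_limiter = SlidingWindowLimiter(max_hits=25, window_seconds=60)
directory_limiter = SlidingWindowLimiter(max_hits=10, window_seconds=60)
```

A request handler would call `image_limiter.allow(client_ip)` before serving each image and return an error page when it yields `False`. The state is held in memory, so this sketch assumes a single server process.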

Spider Ninja, 11-07-2005, 06:01 AM, #5

Try mod_rewrite:

http://httpd.apache.org/docs/2.0/misc/rewriteguide.html

and search the web for "spider user agent". That's a way to start.
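
The user-agent idea can be sketched in Python as a simple substring check on the request's User-Agent header. The token list and the `claims_to_be_search_engine` helper are assumptions for illustration, not an authoritative registry of crawlers, and this is not the mod_rewrite configuration itself.

```python
# Substrings that some search engine crawlers include in their User-Agent header.
# This list is an illustrative assumption; real deployments would maintain their own.
KNOWN_CRAWLER_TOKENS = ("googlebot", "msnbot", "slurp")


def claims_to_be_search_engine(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_CRAWLER_TOKENS)


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Mozilla/5.0 (Windows NT 5.1) Firefox/1.5",
    ]
    for ua in samples:
        print(claims_to_be_search_engine(ua), ua)
```

The header is supplied by the client, so a scraper can forge it. A check like this only tells you what a visitor claims to be and is best treated as a first filter rather than proof.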
All times are GMT -7.
