Old 04-10-2008, 06:46 AM   · #1
abdulbasituae
NamePros Regular
 
Trader Rating: (1)
Join Date: Sep 2007
Posts: 280
robots.txt help needed

Hello everyone,

I run a forum and have placed its robots.txt at http://www.funwadi.com/robots.txt. Google is reading it successfully and staying out of the paths I have disallowed.

Now I want to block the member profile pages, which Google is indexing at a rapid pace. The problem is that the profile URLs follow this format:

http://www.funwadi.com/forum/member2.html
http://www.funwadi.com/forum/member3.html
http://www.funwadi.com/forum/member4.html
http://www.funwadi.com/forum/member5.html

and so on. I have over 44,000 registered users and I don't want Google to index their profile pages. What should I enter in robots.txt so that Google won't crawl the member pages above?

Thanks a bunch in advance

AbdulBasit Makrani


Old 04-10-2008, 11:53 AM   · #2
enlytend
NamePros Regular
 
Location: USA
Trader Rating: (6)
Join Date: Aug 2006
Posts: 405
Google, Yahoo and MSN support two wildcards in robots.txt: *, which matches any string of characters, and $, which anchors the pattern to the end of the URL.

This should do it ...
Disallow: /forum/member*.html$
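
If you want to sanity-check a pattern like this offline, here is a rough Python sketch. It only approximates the * and $ behaviour described above and is not the crawlers' actual matching code; the function names are made up for illustration.

```python
import re

def robots_pattern_to_regex(pattern):
    """Turn a robots.txt path pattern using * and $ into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "any characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Rules always match from the start of the path; $ pins the end as well.
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path, disallow_pattern):
    """True if the path would be matched by the given Disallow pattern."""
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None

if __name__ == "__main__":
    rule = "/forum/member*.html$"
    for path in ("/forum/member2.html", "/forum/member5.html", "/forum/index.html"):
        print(path, "blocked" if is_blocked(path, rule) else "allowed")
```

Note that, under this reading, the rule also catches any other URL that starts with /forum/member and ends in .html, so check that nothing you want indexed fits that shape.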
Old 04-11-2008, 11:51 PM   · #3
abdulbasituae
Originally Posted by enlytend
Google, Yahoo and MSN support two wildcards in robots.txt: *, which matches any string of characters, and $, which anchors the pattern to the end of the URL.

This should do it ...
Disallow: /forum/member*.html$



Thanks for the reply. I added the Disallow line above to my robots.txt file and tested it with the Google Webmaster Tools checker, which reports whether a specific URL is blocked. It shows every member page as allowed!

Any idea what to do now?
Old 04-12-2008, 02:51 AM   · #4
enlytend
??? Should have worked. But here are two other suggestions:

.htaccess - deny the spider user agents access to those files. This is the most foolproof method. (Sorry, I don't have time to figure out the syntax and write it out.)

or

modify the profile page code so that there's a robots meta tag in the <head>:
<meta name="robots" content="noindex" />
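
For the first suggestion, the idea is simply "if the visitor is a crawler and the URL is a member page, refuse it". The sketch below shows that logic in Python as a WSGI wrapper rather than in .htaccess syntax, so treat it as an illustration of the rule and not as a drop-in for an Apache-hosted forum. The crawler token list, the member URL pattern and the function names are all assumptions.

```python
import re

# Assumed crawler tokens; real user-agent strings vary, so this list is illustrative.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")

# Assumed URL shape, based on the member2.html, member3.html examples in this thread.
MEMBER_PAGE = re.compile(r"^/forum/member\d+\.html$")

def should_deny(user_agent, path):
    """True when a known crawler asks for a member profile page."""
    ua = user_agent.lower()
    is_crawler = any(token in ua for token in CRAWLER_TOKENS)
    return is_crawler and MEMBER_PAGE.match(path) is not None

def block_crawlers(app):
    """WSGI middleware: answer 403 to crawlers on member pages, pass the rest through."""
    def wrapper(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        path = environ.get("PATH_INFO", "")
        if should_deny(user_agent, path):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return wrapper
```

Ordinary visitors still see the profile pages because only requests whose user agent contains one of the listed tokens are refused.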
Old 04-12-2008, 02:59 AM   · #5
weblord
www.1weblord.com
 
Name: William R. Nabaza - williamrnabaza.com
Location: Philippines - www.Nabaza.com
Trader Rating: (234)
Join Date: Dec 2005
Posts: 19,207
robots.txt is currently blocking Googlebot.
See the proof under "restricted with robots.txt":
http://www.xml-sitemaps.com/se-bot-...ot&submit=Start


Originally Posted by enlytend
Google, Yahoo and MSN support two wildcards in robots.txt: *, which matches any string of characters, and $, which anchors the pattern to the end of the URL.

This should do it ...
Disallow: /forum/member*.html$

Old 04-12-2008, 10:17 AM   · #6
abdulbasituae
Originally Posted by weblord
robots.txt is currently blocking googlebot
see the proof under "restricted with robots.txt"
http://www.xml-sitemaps.com/se-bot-...ot&submit=Start



Oh yeah, it has started working. Awesome!

Thanks a lot

Originally Posted by enlytend
??? Should have worked. But here are two other suggestions:

.htaccess - deny the spider user agents access to those files. This is the most foolproof method. (Sorry, I don't have time to figure out the syntax and write it out.)

or

modify the profile page code so that there's a robots meta tag in the <head>:
<meta name="robots" content="noindex" />



Thanks a lot. The code you gave me has actually started blocking Googlebot from indexing the profile pages.
Once again, thank you very much.
Old 04-20-2008, 03:11 AM   · #7
abdulbasituae
Rep added, enlytend.