robots.txt help needed

AbdulBasit.com

Hello everyone,

I have a forum, and I have placed its robots.txt at http://www.funwadi.com/robots.txt. Google is reading it successfully and is staying away from the paths I have disallowed.

Now I want to block the members' profile pages, which Google is indexing at a rapid pace. The problem is that profile pages use this URL format:

http://www.funwadi.com/forum/member2.html
http://www.funwadi.com/forum/member3.html
http://www.funwadi.com/forum/member4.html
http://www.funwadi.com/forum/member5.html

and so on. I have over 44,000 registered users. What should I enter in the robots.txt file so that Google won't crawl the member pages above?

Thanks a bunch in advance

AbdulBasit Makrani
 
The views expressed on this page by users and staff are their own, not those of NamePros.
Google, Yahoo and MSN support two wildcards in robots.txt: *, which matches any string of characters, and $, which anchors the pattern to the end of the URL.

This should do it ...

Disallow: /forum/member*.html$
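
To see why that rule covers every profile page, here is a minimal Python sketch of how the * and $ wildcards described above can be interpreted. It is only an illustration of the matching idea, not any search engine's actual implementation. The helper names `robots_pattern_to_regex` and `is_disallowed` are made up for this example, and the `memberlist.php` and `?tab=posts` paths are hypothetical URLs added to show what the rule does not match.

```python
import re

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern using * and $ into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str, pattern: str) -> bool:
    # re.match anchors at the start of the path, like a robots.txt rule.
    return robots_pattern_to_regex(pattern).match(path) is not None

RULE = "/forum/member*.html$"

for path in [
    "/forum/member2.html",
    "/forum/member44000.html",
    "/forum/memberlist.php",          # hypothetical path
    "/forum/member2.html?tab=posts",  # hypothetical path
]:
    print(path, is_disallowed(path, RULE))
```

The two profile URLs match the rule, while the two hypothetical paths do not: one does not end in .html, and the other has a query string after .html, which the $ anchor rejects.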
 
enlytend said:
Google, Yahoo and MSN support two wildcards in robots.txt: *, which matches any string of characters, and $, which anchors the pattern to the end of the URL.

This should do it ...

Disallow: /forum/member*.html$

Thanks for the reply. I added the Disallow line above to my robots.txt file and tested it with the Google Webmasters tool that checks whether a specific URL is blocked, but it reports every member page as allowed!

Any idea what to do now? :(
 
??? Should have worked. But here are two other suggestions:

.htaccess - deny the spider user agents access to those files. This is the most foolproof method. (Sorry, I don't have time to figure out the syntax and write it out.)

or

modify the profile code so that there's a robots meta in the header:
<meta name="robots" content="noindex" />
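
If you go the meta tag route, a quick way to confirm that the profile template really emits the tag is to parse the page and look for it. The following is a rough Python sketch using only the standard library. The class name `RobotsMetaFinder`, the function `has_noindex`, and the sample HTML are all made up for illustration, and fetching the real page is left out.

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]

def has_noindex(html: str) -> bool:
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives

# Sample markup, assumed for this example only.
sample = '<html><head><meta name="robots" content="noindex" /></head></html>'
print(has_noindex(sample))
```

Keep in mind that a crawler has to fetch the page to see this tag, so it only takes effect on URLs that robots.txt does not block.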
 
weblord said:
robots.txt is currently blocking googlebot
see the proof under "restricted with robots.txt"
http://www.xml-sitemaps.com/se-bot-.../forum/member2.html&se=googlebot&submit=Start

Oh yeah, that has started working. Awesome :) :)

Thanks a lot

enlytend said:
??? Should have worked. But here are two other suggestions:

.htaccess - deny the spider user agents access to those files. This is the most foolproof method. (Sorry, I don't have time to figure out the syntax and write it out.)

or

modify the profile code so that there's a robots meta in the header:
<meta name="robots" content="noindex" />

Thanks a lot. The code you gave me has actually started blocking Googlebot from indexing profile pages.
Once again thank you very much :)
 
Rep added enlytend :)
 