Tech question(s) about bots?

Chris2412

I am a n00b developer and I would like to talk about bots and how they crawl a web page.

I tried searching for the keyword “bot” on the forum but got nil, just a bunch of random results. I’m sure there is a thread covering this, so a link will do just fine.

So, Google has “bots” that crawl your page. They look for keywords, phrases, etc. for cataloging purposes. Which is good, because you want your website cataloged in their search engine.

But there are other kinds of bots, too. Yes?

Some of these bots are evil pawns sent out to do what, exactly?

Eat your bandwidth?

I read a simple PHP snippet a year or two ago that basically makes bots sleep (or time out).

I am not even close to HTML 5 yet so perhaps I am getting way ahead of myself.

There’s no harm in asking. Perhaps I can bump this thread as I get more knowledge and have additional questions regarding coding.
 
1
•••
And I do understand what Beezy is saying.

I'm basically asking "what can I do to grow this plant" without giving you any information about geography or any other variables. Even if I did, would you advise me for free?

So I do understand that, I guess you could say I am asking hypothetically because I don't have specifics for you.

I do appreciate everyone's input, even if you want to razzle with me, it's all fine and good.
 
0
•••
I think you're overthinking the whole matter, mate.
 
0
•••
I think you're overthinking the whole matter, mate.

I am a bit high-strung. Nothing positive ever comes of fretting.

I just want to do things correctly.

What's the saying; anything worth learning is worth learning right :rolleyes:
 
0
•••
Right, and we're telling you that this really isn't worth learning.
 
0
•••
I appreciate all the input in this thread.

Perhaps when I am more educated I can ask the "hard questions", instead of running around like a rookie designer with his head chopped off.
 
0
•••
If you're trying to be a designer, then you definitely don't need to know this!
 
1
•••
I could ask you, "why not"... but then I would get scolded again.
 
0
•••
A designer would typically learn front end technologies. Javascript and CSS mostly.
 
0
•••
Knowledge is power, I am a bit of a nut.

When I took courses in college I was too busy staring at the brunette two terminals down to actually learn anything (I'm not really a blonde gal type of guy).

The whole education system is bollocks, anyways. If you want something, you go out and learn it for yourself.

I know nothing about bots and I wanted to talk about it. No harm, intended.

I will never pretend to be something I'm not.

Worst case scenario you all make fun of me, so what? At least I am not purchasing WP templates like most of these people (nothing wrong with that, just saying).

I want to create something for myself. I want to be an architect.

It's a bit daunting starting from a completely blank page. But, to know I can create a decent layout from scratch, well that's at least getting the ball rolling.

I want to continue to grow. If that means making a fool of myself, then so be it.

That being said... after all this, I still have very little knowledge about bots.
 
0
•••
Ok, either you're just really weird (no offense)... or you are actually trying to do something malicious with bots, because otherwise your obsession makes zero sense.

Seriously, it's not a thing you should even think about.
 
0
•••
I feel like we are beating a dead horse, Beezy.

I do appreciate your time and efforts.

I certainly am not trying to program a malicious bot. Quite the contrary. I wanted to learn about bots, their purpose, if/how/when to make them sleep, etc.

I'm currently learning spry elements. So yeah, I'm a n00b. Go easy.
 
0
•••
Rest assured no one in this forum will laugh at your ignorance, we all have our shortcomings.

You wanna be a developer, start reading. You are worried about bots, just pick strong passwords for now. Down the road you'll learn how to block certain bots. A bot is not malicious by default.
 
1
•••
Why don't you google it? lol
 
1
•••
Chris2412 said:
If NP forum is not a community to help, advise, and educate in web development... then you are correct, I am in the wrong place.
You're posting in the right place and it's absolutely fine to ask :). Just saying that we don't have many posts on the subject because the overwhelming interest in NP is in buying and selling domains :).

iowadawg said:
Worse offender, once they find your blog/site?
Baidu!

Baidu is the #1 search engine in China and the #2 search engine worldwide (and expanding its reach through various internet acquisitions), but it does spider aggressively. It respects robots.txt, so if China is not your audience you can block it there. I think you still have to register with their webmaster tools to change the crawl frequency. Here's their FAQ: http://help.baidu.com/question?prod_en=master&class=Baiduspider&id=1000973
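
To make the robots.txt idea concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URL are made up for illustration, and the check uses the short crawler token `Baiduspider` rather than a full user-agent string. Keep in mind that robots.txt is only a request: well-behaved crawlers honor it, rogue ones ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Baiduspider, allow everyone else.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler that follows the rules would run the same check before fetching.
baidu_allowed = parser.can_fetch("Baiduspider", "https://example.com/page")
google_allowed = parser.can_fetch("Googlebot", "https://example.com/page")
print("Baiduspider allowed:", baidu_allowed)
print("Googlebot allowed:", google_allowed)
```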

And just like with Googlebot, there are rogue bots that use the Baidu user agent to try to get past your malicious-bot blocking strategy.
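
One common way to catch a spoofed user agent is a reverse DNS lookup on the visitor's IP followed by a forward lookup to confirm it. Below is a rough sketch of that check, not a hardened implementation. The hostname suffixes are the ones Google publishes for Googlebot; other crawlers need their own suffixes. The function name, the fake DNS tables, and the documentation-range IPs are all hypothetical, there only so the demo runs without touching the network.

```python
import socket
from typing import Callable, Sequence


def _ptr_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> hostname."""
    return socket.gethostbyaddr(ip)[0]


def _a_lookup(host: str) -> Sequence[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_crawler(
    ip: str,
    allowed_suffixes: Sequence[str] = (".googlebot.com", ".google.com"),
    reverse_lookup: Callable[[str], str] = _ptr_lookup,
    forward_lookup: Callable[[str], Sequence[str]] = _a_lookup,
) -> bool:
    """True only if the IP's hostname has an allowed suffix AND resolves back to the IP."""
    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(tuple(allowed_suffixes)):
            return False
        return ip in forward_lookup(host)
    except OSError:  # covers socket.herror and socket.gaierror (lookup failed)
        return False


# Offline demo with made-up DNS data (illustrative only, not real records).
FAKE_PTR = {
    "192.0.2.10": "crawl-192-0-2-10.googlebot.com",  # consistent both ways
    "192.0.2.66": "crawl-192-0-2-66.googlebot.com",  # forged reverse record
    "192.0.2.99": "host99.example.net",              # unrelated host
}
FAKE_A = {
    "crawl-192-0-2-10.googlebot.com": ["192.0.2.10"],
    "crawl-192-0-2-66.googlebot.com": ["198.51.100.7"],
}


def check(ip: str) -> bool:
    """Run the verification against the fake tables instead of real DNS."""
    return is_verified_crawler(
        ip,
        reverse_lookup=FAKE_PTR.__getitem__,
        forward_lookup=lambda host: FAKE_A.get(host, []),
    )


for address in FAKE_PTR:
    print(address, "verified" if check(address) else "not verified")
```

The forward lookup is what matters here: anyone who controls the reverse DNS for their own IP range can claim any hostname, but they cannot make the real domain resolve back to their address.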

Why someone would send them to eat bandwidth?
Most of them don't do it deliberately. Some might, as in a Denial of Service (DoS) attack.

Why people are engineering bots instead of... basically anything else they could be doing?

Because it scales tasks which would be impossible to do manually.

Most of what hits your site is not new bots that somebody constructed; it's software they're running to accomplish a task. Malicious reasons include spamming comments to get backlinks to their site, scraping content so they can use your content without writing their own, harvesting email addresses so they can send spam, scanning for vulnerabilities so they can get confidential data like passwords, using your site to host illegal downloads, or installing their bots on your server so they have a bigger network of server power performing their tasks ...

Why some bots are "good" and others are evil?
Software isn't evil; intent is up to the party who runs it.
 
2
•••
If you use Cloudflare on your site, they will do a great job of keeping offending bots away.

Here's another good resource: http://www.distilnetworks.com/

Features:
  • THEFT BOTS: Block bots from siphoning away your data & revenue
  • FORM FRAUD: Submitting fake forms. Your forms are being flooded with fake information and clogging your database with bad leads.
  • CLICK FRAUD: Clicking on paid ads. Your daily ad budget is maxing out because of bots, not potential buyers.
  • COMMENT SPAM: Interrupting your users. Spend your time moderating your actual visitors, not bots
 
0
•••
Baidu uses so many different Chinese IPs to send its bot army out
that using robots.txt would mean getting all those IPs and listing them.
And the list is LONG.
I have not seen this myself, but others are now saying Baidu is using IPs that are not Chinese, to avoid being blocked.
They are aggressive and are worse than Google!
 
0
•••
No, using robots.txt means listing the user agent, not the IPs. I think there are about 3 user agents.

Someone probably has an .htaccess list of the IPs if you want to go that route. There are lists to block entire countries, so I'm sure someone has a Baidu list.
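
If you do go the IP route, the check itself is simple. Here is a minimal sketch with Python's standard `ipaddress` module, shown in Python rather than .htaccess syntax just to illustrate the logic. The CIDR ranges are reserved documentation addresses, not a real Baidu or country list, and `is_blocked` is a made-up helper name.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges,
# standing in for whatever published list you decide to trust.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(ip: str) -> bool:
    """Return True if the visitor's IP falls inside any blocked range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in BLOCKED_NETWORKS)


for visitor in ("192.0.2.45", "203.0.113.8"):
    print(visitor, "blocked" if is_blocked(visitor) else "allowed")
```

The catch is upkeep: ranges change, so a list like this goes stale unless someone keeps refreshing it.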

Google is easier to control.
 
0
•••
I use a WordPress plugin to block ALL of China!
End of story.
 
1
•••
Sorry, I've been meaning to reply to this thread for a while, but I've had a lot of work to do and ended up getting a nasty cold earlier this week.

This is certainly a great place to ask about bots, because bots are an important part of SEO, and SEO is a big part of domaining. Plus, many of us have web development experience.

Also, knowledge of bots is definitely relevant to web designers--often more so to them than anyone else. Modern search engine bots, like Googlebot, care quite a bit about design and layout, and will penalize a website if they dislike the design. It's also important for designers to understand how bots interpret content within different types of layouts, and how to make the key content stand out.

I'll write up some information about different types of bots and post it here shortly.
 
3
•••