NamePros
05-11-2006, 07:36 AM   #1   col (thread starter)

Gathering stats from Google


I've written a small PHP script that fetches some link stats from Google. The script works fine, but Google doesn't seem to like it... I get the stats by parsing the page returned by file_get_contents() (PHP). After a while the script stopped working, and when I checked on it I found that my automated searches returned a page looking like this: <link>
Has anyone had the same problem? Any suggestions for alternative solutions?
__________________
The more I think
the more confused I get...

05-11-2006, 07:54 AM   #2   BillyConnite

I'd say they have most likely blocked your site's IP, as what you were doing was probably against their TOS.

They block a lot of PR predictors too, but most of those now fetch the data from Google through various proxies to spread the work and lessen the risk of being banned.

05-11-2006, 08:21 AM   #3   Noobie

Hi,

I got that error yesterday too, out of the blue.
It had nothing to do with Google stats; I was just visiting Google to search for something.

I typed in the required characters and I haven't been asked since.
__________________
Goldkey.com is a scam
What's your BMI? | Timestamp Generator

05-11-2006, 11:05 AM   #4   col (thread starter)

Well, I guess I'll just have to skip their stats for the moment.

05-11-2006, 02:58 PM   #5   mikesherov

Well, the way to do it is to build or use a proxy sniffer to get a list of valid proxies, and then write a PHP function that uses those proxies to fetch the information.

I'll leave finding valid proxies as an exercise for the reader, but what kind of guy would I be if I didn't provide a function that works just like file_get_contents but goes through a proxy?
PHP Code:
function proxy_url($server, $port, $proxy_url)
{
    $proxy_name = $server;
    $proxy_port = $port;
    $proxy_cont = '';

    // Connect to the proxy itself, not to the target site.
    $proxy_fp = fsockopen($proxy_name, $proxy_port);
    if (!$proxy_fp) { return false; }

    // Put the full target URL in the request line so the proxy forwards it.
    fputs($proxy_fp, "GET $proxy_url HTTP/1.0\r\nHost: $proxy_name\r\n\r\n");
    while (!feof($proxy_fp)) { $proxy_cont .= fread($proxy_fp, 4096); }
    fclose($proxy_fp);

    // Drop the response headers; the body starts after the first blank line.
    $proxy_cont = substr($proxy_cont, strpos($proxy_cont, "\r\n\r\n") + 4);
    return $proxy_cont;
}

All you do is provide the proxy server name or IP address, the proxy port, and the page you want to get.
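
As a rough sketch of how a function like this might be driven from a list of proxies, the helper below tries each entry in turn until one returns a response. The name fetch_via_proxies, the "host:port" list format, and the $fetcher parameter are assumptions made for illustration and are not from the post; $fetcher stands in for any function with the same signature as proxy_url.

```php
<?php
/**
 * Try each proxy in turn until one returns a non-empty response.
 *
 * Hypothetical helper: $fetcher is any callable that takes
 * ($server, $port, $url) and returns a string, or false on failure.
 */
function fetch_via_proxies(array $proxies, $url, callable $fetcher)
{
    foreach ($proxies as $proxy) {
        // Each entry is assumed to look like "host:port".
        $parts = explode(':', $proxy, 2);
        if (count($parts) !== 2 || !ctype_digit($parts[1])) {
            continue; // skip malformed entries
        }

        $result = $fetcher($parts[0], (int) $parts[1], $url);
        if ($result !== false && $result !== '') {
            return $result; // first proxy that answers wins
        }
    }

    return false; // every proxy failed or the list was empty
}
```

With proxy_url defined, a call might look like fetch_via_proxies($my_proxy_list, $url, 'proxy_url'), where $my_proxy_list is whatever list of proxies you have gathered.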

05-11-2006, 04:31 PM   #6   col (thread starter)

Thanks a lot! Reputation added.

05-12-2006, 01:45 PM   #7   sacx13

Use the Google API (registration and blah) if you don't need more than 1,000 queries per day.
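
If you go the API route, it may help to count queries so the script stops before it hits the daily cap. The class below is only a sketch: the name DailyQuota and its method are made up for illustration, the counter lives in memory only, and the 1,000 figure is simply the limit mentioned above.

```php
<?php
/**
 * Hypothetical in-memory counter for a per-day query budget.
 * Persistence (file, database) is left out to keep the sketch short.
 */
class DailyQuota
{
    private $limit;
    private $day = '';
    private $used = 0;

    public function __construct($limit = 1000)
    {
        $this->limit = $limit;
    }

    /**
     * Returns true and counts one query if budget remains for $today,
     * otherwise returns false. $today is a 'Y-m-d' string; it defaults
     * to the current date.
     */
    public function tryConsume($today = null)
    {
        if ($today === null) {
            $today = date('Y-m-d');
        }
        if ($today !== $this->day) {
            // A new day has started: reset the counter.
            $this->day = $today;
            $this->used = 0;
        }
        if ($this->used >= $this->limit) {
            return false;
        }
        $this->used++;
        return true;
    }
}
```

The script would call tryConsume() before each API request and skip the request when it returns false.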

Regards

05-12-2006, 04:52 PM   #8   Peter

Yes, as sacx13 states, use the API if you do not have too many queries. The problem with using proxies is that they will soon be blocked as well if too many requests are sent through them.

I have come across that page myself many times when searching manually on Google.
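
One way to reduce the number of requests going through any single proxy is to enforce a minimum gap between uses. This is only a sketch: the function name and the idea of tracking a last-used timestamp per proxy are assumptions, and nothing in this thread says what request rate, if any, avoids a block.

```php
<?php
/**
 * Hypothetical helper: seconds to wait before a proxy may be used again.
 *
 * $lastUsed is the Unix timestamp of the previous request through the
 * proxy (null if it has never been used), $now is the current timestamp,
 * and $minGap is the minimum number of seconds between requests.
 */
function seconds_until_allowed($lastUsed, $now, $minGap)
{
    if ($lastUsed === null) {
        return 0; // never used, so no wait is needed
    }
    $elapsed = $now - $lastUsed;
    return $elapsed >= $minGap ? 0 : $minGap - $elapsed;
}
```

A script could pass the result to sleep() before each request and record time() afterwards as the proxy's new last-used value.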
All times are GMT -7.