Most modern analytics platforms, including Google Analytics, rely on a script that loads when users visit your site. Several years ago, this was an effective way to discern real people from automated crawlers: crawlers didn't use full web browsers, so they never ran the script, and their visits went unrecorded.
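To illustrate, a bare HTTP client downloads the page's HTML, analytics snippet included, but never executes any of it, so no pageview is ever reported. A minimal sketch (the URL is a placeholder):

```python
import requests

# Fetch a page the way an old-school crawler would: raw HTTP, no browser.
resp = requests.get("https://example.com/")  # placeholder URL

# The analytics snippet arrives as inert text inside the HTML...
has_snippet = "google-analytics.com" in resp.text or "gtag(" in resp.text
print(f"Analytics snippet present in HTML: {has_snippet}")

# ...but nothing here runs JavaScript, so the tracking script never
# fires and the visit is never recorded by the analytics platform.
```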
That's changed. As more websites require JavaScript to function, crawlers have had to switch to full browsers. They don't just download the code and pick it apart to find the content; they load webpages just like a human does, which makes it difficult to tell them apart from real people. Projects like Selenium have made this possible.
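For comparison, here's a minimal Selenium sketch driving a real headless Chrome. Because the browser executes JavaScript, the analytics script runs exactly as it would for a human visitor (the URL is a placeholder; assumes chromedriver is installed):

```python
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")  # no visible window needed

driver = webdriver.Chrome(options=options)
try:
    # A full page load: HTML, CSS, and JavaScript, including any
    # analytics snippet, which fires just as it would for a real person.
    driver.get("https://example.com/")  # placeholder URL
    print(driver.title)
finally:
    driver.quit()
```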
Overall, this is great news: it means search engines are getting smarter. Tools like Selenium are also great for automated testing, meaning that web developers can test their sites in a variety of web browsers for every change they make. At NamePros, we use New Relic Synthetics to test various parts of our site at set intervals. The automated monitor can log in, edit its own post, and alert us of any problems. It'll even take screenshots of any errors it encounters.
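New Relic Synthetics scripted monitors are actually written in JavaScript, but the idea translates directly to Selenium. A rough Python analogue of what such a monitor does; all URLs, selectors, and credentials here are hypothetical:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)

try:
    # Log in with a dedicated test account (selectors are hypothetical).
    driver.get("https://example.com/login")
    driver.find_element(By.NAME, "username").send_keys("monitor-bot")
    driver.find_element(By.NAME, "password").send_keys("secret")
    driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()

    # Exercise a critical path, e.g. editing the monitor's own post.
    driver.get("https://example.com/posts/12345/edit")
    driver.find_element(By.NAME, "body").send_keys(" (monitor check)")
    driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
except Exception:
    # On any failure, capture a screenshot for debugging, then re-raise
    # so the scheduler can alert on the failed run.
    driver.save_screenshot("monitor-error.png")
    raise
finally:
    driver.quit()
```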
The problem is that these tools aren't always used for good. Recently, we've seen an influx of malicious crawlers that use Selenium or similar technologies. Google Analytics typically thinks these are real people, and it can throw off our metrics quite a bit.
We keep track of what sort of technologies our visitors use, and we plan our features accordingly. For example, if we have a lot of visitors using iPhones, but not many using iPads, we'll spend more time planning for, developing for, and testing on iPhones. One technology metric that we track is screen resolution, and we do this with the help of Google Analytics.
When I looked at our screen metrics earlier today, I noticed something unusual: the resolution 800x600 had made its way into the top 10 resolutions for the past week. 800x600 means that the screen is 800 pixels wide by 600 pixels tall. That used to be one of the most popular resolutions, two decades ago. Today, it's rarely seen. This immediately raised red flags. Google Analytics was telling me I should be testing on 800x600 screens, but my experience told me that didn't make any sense.
Upon closer inspection, the metrics became even more suspicious: apparently, nearly 17,000 users over the past 7 days had a screen resolution of 800x600, were using the exact same version of Chrome on Linux, had a bounce rate of 99.61%, and spent 2 seconds on the site each time they visited.
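The combination of identical resolution, identical user agent, and a near-total bounce rate is a strong fingerprint. As a sketch, given a session export with those fields (the CSV layout is an assumption for illustration, not GA's actual export format), flagging the cluster is a simple tally:

```python
import csv
from collections import Counter

# Count sessions per (resolution, user agent) pair from an exported
# session log. The column names here are assumptions.
fingerprints = Counter()
with open("sessions.csv", newline="") as f:
    for row in csv.DictReader(f):
        fingerprints[(row["screen_resolution"], row["user_agent"])] += 1

# A huge cluster of "unique users" sharing one exact fingerprint is far
# more likely to be a bot farm than a coincidence.
for (resolution, user_agent), count in fingerprints.most_common(5):
    print(f"{count:>6}  {resolution}  {user_agent}")
```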
Fortunately, we have our own analytics system that's much more detailed than Google Analytics. We mostly use it for detecting and preventing fraud, but it also comes in handy for troubleshooting technical issues. According to our analytics, this unusual traffic started on June 20th and proceeded at a rate of 400 to 600 visits per hour. They were clearly automated requests, not real people. Google was treating each and every one of them as a unique user.
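Spotting that kind of steady drumbeat is straightforward if you have raw request timestamps. A sketch that buckets hits by hour and flags sustained elevated rates (the log format and threshold are assumptions):

```python
from collections import Counter
from datetime import datetime

THRESHOLD = 400  # hits/hour; tune to your site's baseline

# Bucket suspicious requests by hour. Assumes one ISO-8601 timestamp
# per line; adjust the parsing to match your actual log format.
per_hour = Counter()
with open("suspect-requests.log") as f:
    for line in f:
        ts = datetime.fromisoformat(line.strip())
        per_hour[ts.replace(minute=0, second=0, microsecond=0)] += 1

# Human traffic spikes rise and fall; automation tends to hold a
# steady rate, like the 400-600 visits per hour we saw here.
for hour, count in sorted(per_hour.items()):
    flag = "  <-- elevated" if count >= THRESHOLD else ""
    print(f"{hour:%Y-%m-%d %H:00}  {count}{flag}")
```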