Blog stats tell us who comes to our blogs and what they do while they are there. Or, at least, that’s the idea. When I first started this blog, like a lot of people, I thought the built-in stats app would suit my needs, but I quickly realized otherwise. The stats plugin built into Jetpack is neither accurate nor detailed enough. So I set out to find something better. This is how that turned out.
Blog Stats with Google Analytics
Google Analytics was the first and most obvious choice. It’s easy to set up and quite thorough. Google Analytics is especially great if you are using AdSense or AdWords (I don’t), because it integrates with both effortlessly.
There are several plugins that can insert the analytics code into your blog automatically, but I didn’t have luck with them. I ended up just sticking it into my theme header myself, which worked fine. The one plugin I found that I did like is Analytics360 from MailChimp. It doesn’t put the code in your pages, but what it can do is pull your analytics data into your WordPress dashboard. It also has a cool graph that shows your recent blog activity and how it relates to your visitor stats. You have to sign up with MailChimp to use it, but the free account is all you need.
Things were going fine with Google Analytics until I started using CloudFlare, and then I started to question the Analytics data.
CloudFlare is not a stats tool, not really. It’s a proxy service: you point your domain’s DNS at CloudFlare, and your blog traffic is routed through their servers. This lets them do a couple of things. First, they can cache your static content, such as images and scripts, and serve it from a cookie-free domain, which allows it to load faster. Second, they can monitor the traffic and detect if there is an attack on the site. If they detect an attack, they can then block that traffic.
The fact that all of your traffic goes through CloudFlare means that they are in a position to see everyone and everything that hits your site. Their free service offers some limited statistics in their analytics dashboard, which includes page views, unique visitors, bots, and threats. What surprised me was that CloudFlare was showing significantly higher page views and unique visitors than Google, as in 25x higher. CloudFlare offered the following explanation as to why:
After doing some research, I decided to install another analytics program, Piwik.
Blog Statistics with Piwik
I installed Piwik and compared it with CloudFlare and Google Analytics for a few days. What it told me was basically the same thing Google Analytics told me: no one is coming here (so if you’re reading this, thanks).
So, what’s the deal? Where are all these extra visitors that CloudFlare sees coming from? To find the answer, I went into the raw visitor logs.
Ah ha! Oh, no…
The Apache visitor logs told the tale. Once I looked at them (something I should have done sooner), it was really clear where the discrepancy was. So, who was right? CloudFlare? Google? Piwik? The answer is: All of the above.
CloudFlare was correct that there were dozens, at times even hundreds, of “visitors” to my blog with unique IP addresses. What it was missing, however, was that most of them, 99% of them, were spammers, hackers, and bots.
Spammers, Hackers, and Bots! Oh my!
CloudFlare recognizes bots and doesn’t include them in their visitor data, but only certain bots, the nice bots, bots like Googlebot and Bingbot. Most of the mysterious site traffic came from other bots. Some were bad bots and some were just bots that CloudFlare doesn’t sort out, like the Wayback Machine’s crawler. Since these crawlers come in from various IP addresses, they were registering as unique visitors.
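If you want to run the same kind of check on your own logs, here is a minimal Python sketch. It assumes your server writes the standard Apache "combined" log format, and it guesses at bots using a few user-agent substrings that I picked for illustration; the sample lines and the `ExampleCrawler` agent are made up, not taken from my logs.

```python
import re

# Apache "combined" log format:
# IP identity user [time] "request" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative hints only (an assumption); real crawler lists are much longer,
# and bad bots often fake a browser user agent.
BOT_HINTS = ("bot", "crawl", "spider")


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def unique_ips_by_kind(lines):
    """Count distinct IP addresses, split into likely bots and everything else."""
    ips = {"bot": set(), "other": set()}
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in combined format
        agent = entry["agent"].lower()
        kind = "bot" if any(hint in agent for hint in BOT_HINTS) else "other"
        ips[kind].add(entry["ip"])
    return {kind: len(addresses) for kind, addresses in ips.items()}


# Made-up sample lines using documentation-only IP ranges.
SAMPLE_LINES = [
    '192.0.2.10 - - [10/Oct/2013:13:55:36 -0700] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.11 - - [10/Oct/2013:13:56:02 -0700] "GET /about/ HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; ExampleCrawler/1.0)"',
    '198.51.100.7 - - [10/Oct/2013:13:57:40 -0700] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0 Safari/537.36"',
    '192.0.2.10 - - [10/Oct/2013:13:58:11 -0700] "GET /feed/ HTTP/1.1" 200 900 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

if __name__ == "__main__":
    print(unique_ips_by_kind(SAMPLE_LINES))
```

A substring match like this only catches crawlers that identify themselves, so treat the "other" count as an upper bound on humans rather than a head count.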
Along with the bots, the Apache logs revealed numerous hacking attempts. There were several brute-force attempts to break my WordPress password, all unsuccessful. There were also a lot of 404 errors from hackers probing for known vulnerabilities. In theory, CloudFlare should be blocking these, since that is sort of its purpose, but instead of blocking them, it was registering them as unique visitors.
There were also comment spammers and pingback spammers in the mix.
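Here is a rough Python sketch of how this kind of traffic can be tallied from the same logs. It assumes the Apache "combined" log format and the stock WordPress endpoints `wp-login.php`, `xmlrpc.php`, and `wp-comments-post.php`. The category labels, the probe paths, and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Only the fields needed here: client IP, HTTP method, path, and status code.
REQUEST_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def classify(method, path, status):
    """Return a label for requests worth a second look, else None."""
    if method == "POST" and path.startswith("/wp-login.php"):
        return "login attempt"
    if method == "POST" and path.startswith("/xmlrpc.php"):
        return "xmlrpc/pingback"
    if method == "POST" and path.startswith("/wp-comments-post.php"):
        return "comment post"
    if status == "404":
        return "not found"
    return None


def tally_suspicious(lines):
    """Count labeled requests per (ip, label) pair."""
    counts = Counter()
    for line in lines:
        match = REQUEST_PATTERN.match(line)
        if not match:
            continue  # skip malformed or non-matching lines
        label = classify(match["method"], match["path"], match["status"])
        if label:
            counts[(match["ip"], label)] += 1
    return counts


# Made-up sample lines using documentation-only IP ranges.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Oct/2013:14:01:02 -0700] "POST /wp-login.php HTTP/1.1" 200 1530 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2013:14:01:03 -0700] "POST /wp-login.php HTTP/1.1" 200 1530 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2013:14:01:04 -0700] "POST /wp-login.php HTTP/1.1" 200 1530 "-" "Mozilla/5.0"',
    '198.51.100.23 - - [10/Oct/2013:14:05:10 -0700] "GET /phpmyadmin/index.php HTTP/1.1" 404 210 "-" "-"',
    '198.51.100.23 - - [10/Oct/2013:14:05:11 -0700] "GET /old-plugin/readme.txt HTTP/1.1" 404 210 "-" "-"',
    '192.0.2.44 - - [10/Oct/2013:14:09:30 -0700] "POST /wp-comments-post.php HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2013:14:10:00 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for (ip, label), hits in tally_suspicious(SAMPLE_LINES).most_common():
        print(ip, label, hits)
```

Counting this way does not tell you whether a login attempt succeeded or whether a comment was spam; it only shows which addresses keep hitting the endpoints that attackers and spammers tend to target.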
Seeing all those hackers and spammers was a bit disconcerting, but I have taken some measures to protect the site in response. That, however, is a topic for another post.
Google Analytics and Piwik were both accurate in telling me how often actual humans were visiting my site. CloudFlare was accurate in telling me how many hits my site got from various sources, many nefarious.
I like Piwik. I like that I control it. I like that it honors Do Not Track. I like the way it lays out the information. It takes a little more effort to set up, since you need to install it on your server, but it is worth the effort. I plan to continue using it.
Google Analytics is fine, but I am planning on removing it from my site. There’s just no point in having both Piwik and Google Analytics. If I used AdWords or AdSense I might feel differently, but I don’t.
CloudFlare Analytics is not necessarily useless, but it doesn’t really give you useful blog stats. It’s fine for seeing how much traffic you are getting from all sources, but it doesn’t help you figure out how many human people are reading your posts.
So, when it comes to blog stats, my recommendation is Piwik. I’ll have more to say about blog security at a later time.