Web metrics and analytics
Thu, Aug 11 2005
In past issues of eBusiness News I've talked a lot about emarketing, search engines, customer service, contact management and related topics, but I haven't said much about measuring user behavior and site performance. However, one of the most powerful things about running a website is that everything is measurable, and understanding what to do with the data available to you is a critical factor in the success of your site.
When anyone starts talking about site performance everyone else's eyes start to glaze over because it inevitably ends up in a discussion about statistics, and pretty much everyone I know finds statistics both boring and confusing.
But it doesn't have to be. When you start to see the kind of information you can extract from your website it can become addictive: you may find yourself obsessively tracking visitor numbers and session statistics day by day, watching as changes in your marketing approach are reflected in near-real-time on your website.
So what I'm going to do over the next little while is spend more time focusing on the two core components of website traffic statistics: metrics and analytics. I'll talk about real-world situations and show you how to understand what's really going on with your website, looking under the hood to watch what visitors *really* do rather than what you may think they do.
"Metrics" are defined units of measure, the yardstick by which we can judge how your site is performing. A metric can be a raw figure like "number of page views per day", or it can be a derived figure like "conversion rate" which is the number of sales as a percentage of number of visitors. Deciding what metrics you need and then making sure the necessary data is being captured is the first step to understanding the performance of your site.
"Analytics" is the next step, processing the raw data to gain a deeper understanding of performance by measuring or discovering trends that may not be immediately obvious. Analytics can be either a process of discovery to uncover trends you weren't aware of, or a process of measurement to track existing trends perhaps in the context of external influences that may or may not be under your control.
Both those definitions may sound a bit airy-fairy at the moment but I'll follow up in coming weeks with very specific examples of how they apply in the real world.
But first, time for a basic question: where does the data come from?
Mostly from webserver logfiles. Every time someone views a page on your site, the server records that fact. Each log entry includes a variety of data: a timestamp, the network address of the user, the type of browser and operating system they are using, which object (a page, an image, etc) they accessed, and the page they came from (the referrer).
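To make that a bit more concrete, here is a rough sketch of pulling those fields out of a single log entry. It assumes the server writes the widely used "combined" log format; your server may be configured differently, and the sample line, the addresses in it and the parse_line name are all made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: the common "combined" log format.
# host ident user [timestamp] "request" status size "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Turn the timestamp text into a timezone-aware datetime.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

# A made-up log line using reserved example addresses.
sample = (
    '203.0.113.7 - - [11/Aug/2005:09:15:02 +1000] '
    '"GET /products/index.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/search?q=widgets" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/1.0.6"'
)

entry = parse_line(sample)
print(entry["host"], entry["path"], entry["referrer"])
```

The browser and operating system are both buried in the "agent" field, which is why most people leave that part to a log analysis package rather than decoding it by hand.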
That data alone is enough to yield a huge amount of information. For example, the network address of the user allows a report to be generated showing which countries your visitors are located in, and the object name allows a report to show which are the most and least popular pages on your site. Other reports derived from a raw server log include daily visit counts, visits by time of day, users by operating system, users by browser type, top referrers (sites that send traffic to you), and what search terms people are using to find you.
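As a small illustration of how simple some of these reports are underneath, the sketch below counts popular pages, page views by hour and top referrers. The entries are invented and assumed to be already parsed out of the logfile, and the is_page rule is just one crude way of separating pages from images and other supporting files.

```python
from collections import Counter

# Hypothetical, already-parsed log entries (invented data for illustration).
entries = [
    {"path": "/index.html", "hour": 9, "referrer": "http://www.example.com/"},
    {"path": "/products.html", "hour": 9, "referrer": "http://search.example.org/?q=widgets"},
    {"path": "/index.html", "hour": 14, "referrer": "-"},
    {"path": "/logo.png", "hour": 14, "referrer": "http://www.example.net/links.html"},
    {"path": "/index.html", "hour": 20, "referrer": "http://www.example.com/"},
]

def is_page(path):
    """Treat anything that isn't an obvious image, stylesheet or script as a page."""
    return not path.lower().endswith((".png", ".gif", ".jpg", ".css", ".js"))

# Only count real pages, not the images and other files they pull in.
pages = [e for e in entries if is_page(e["path"])]

popular_pages = Counter(e["path"] for e in pages)
views_by_hour = Counter(e["hour"] for e in pages)
# A referrer of "-" means the browser didn't say where the visitor came from.
top_referrers = Counter(e["referrer"] for e in pages if e["referrer"] != "-")

print(popular_pages.most_common(3))
print(sorted(views_by_hour.items()))
print(top_referrers.most_common(3))
```

Reports like visitors by country need an extra lookup table that maps network addresses to locations, so they aren't shown here.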
Those raw figures can be supplemented by additional data stored by the CMS (Content Management System) that runs your site. For example, one critical data point if you have an online store is the number of sales you've made. That sort of information can't generally be found in the raw server logfile, because the server doesn't know what constitutes a sale and what doesn't. All it knows about is what pages and images people have viewed. Your CMS software, on the other hand, can track a bunch of additional information such as the number of sales and their value, the number of items purchased by each user and their shipping destination, the number of people signed up to your e-newsletter, and which articles in your online support knowledgebase are read most often.
It's when that additional data is combined with the data from your raw server logfiles, and some basic analysis is performed, that you can generate very useful reports such as changes in conversion rates over time.
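Here is a minimal sketch of that kind of combination, using conversion rate (sales as a percentage of visitors) as the example. The daily figures are invented: the visitor counts are assumed to come from your server logs and the sales counts from your CMS, and the daily_conversion_rates name is just something I've made up for this example.

```python
# Hypothetical daily figures (invented for illustration).
visitors_by_day = {"2005-08-08": 1200, "2005-08-09": 1500, "2005-08-10": 1000}  # from server logs
sales_by_day = {"2005-08-08": 30, "2005-08-09": 45}  # from the CMS; no sales on the 10th

def daily_conversion_rates(visitors, sales):
    """Return {day: sales as a percentage of visitors}, in date order."""
    rates = {}
    for day in sorted(visitors):
        count = visitors[day]
        sold = sales.get(day, 0)  # a day missing from the CMS data means zero sales
        # Avoid dividing by zero on a day with no visitors at all.
        rates[day] = 100.0 * sold / count if count else 0.0
    return rates

for day, rate in daily_conversion_rates(visitors_by_day, sales_by_day).items():
    print(f"{day}  {rate:.1f}%")
```

Lined up day by day like this, a change in your marketing shows up as a change in the rate rather than just a change in raw traffic.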
Those reports in turn will allow you to tune your approach and help you get the most from your site, so stay tuned for next week when I'll explain some of the basic metrics you can obtain from your server logfiles.