TV SMITH's Dua Sen: Figuring Figures Part 1
TV Smith's Dua Sen. The politically incorrect irregular columnist combines his idiosyncratic observations and tangential commentary into a blog...


by TV Smith

TV Smith dwells on the imperfect science of web traffic analysis and hits you with fuzzy math in another geeky 2-parter...

As a blogger, you are essentially a webmaster. You design, operate and maintain a web site. As the webmaster, you naturally want to know who arrived at your site, where they came from and what they were looking at. All of this is possible because most browsers snitch on their clueless owners, big time.

Your IP is recorded by the Special Branch as:

If I wanted to, the above information alone would let me track your entire stay on this site without even opening the cookie jar. Such data is usually collected by the server where the blog or site is hosted, by a tracking service or by the ISP which handles the traffic; all courtesy of a foxy browser programmed to spill the beans.
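To make the snitching concrete, here is a minimal sketch of what a server-side script could jot down from a single request. It assumes a WSGI-style `environ` dictionary with the standard CGI variable names; the function name `visitor_record` and the sample values are made up for illustration. Strictly speaking, the IP address comes from the connection itself, while the referrer and browser details are volunteered by the browser as headers.

```python
def visitor_record(environ):
    """Collect what a single request gives away about the visitor.

    `environ` is assumed to be a WSGI-style dict; `visitor_record` is a
    hypothetical helper, not part of any tracking service.
    """
    return {
        "ip": environ.get("REMOTE_ADDR", "unknown"),
        # The header really is spelled "Referer" in HTTP.
        "referrer": environ.get("HTTP_REFERER", ""),
        "user_agent": environ.get("HTTP_USER_AGENT", ""),
        "page": environ.get("PATH_INFO", "/"),
    }


# Made-up request data using documentation-only addresses and hosts.
sample_environ = {
    "REMOTE_ADDR": "203.0.113.7",
    "HTTP_REFERER": "http://example.com/links.html",
    "HTTP_USER_AGENT": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "PATH_INFO": "/index.html",
}

record = visitor_record(sample_environ)
print(record)
```

No cookie is involved anywhere in this sketch; everything in the record arrives with the request.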

Unless you have access to the server logs, the easiest way to collect this data is to plant a so-called 'single pixel' informer such as Sitemeter on your pages. The measurement is done on the fly and the data captured is quite comprehensive. The embedded meter or counter includes a few lines of JavaScript and a tiny logo (for free versions) as seen on the bottom right of this page. Except for the logo, most of the action behind the scenes is invisible to the visitor.

Whether you keep the logo and script near the top or bottom of your page may be a matter of aesthetic or practical considerations. If you place it at the top, it is more accurate since it loads first. However, if for some reason the meter fails to call home, it can also stall or delay the loading of the rest of your page. Placing it at the bottom eliminates this hindrance but presents another problem altogether. Impatient visitors may stop the page loading or click on another link before the meter loads, so your traffic gets undercounted.

Server logs, on the other hand, are much more accurate if you can get your hands on them. They are also mean, hungry and ugly. They take up plenty of space, are quite tasteless eaten raw, and require a third-party analysis plug-in or program. The data is usually processed in batches and not in real time like Sitemeter. Both methods, however, harvest quite similar end results after processing, sorting and charting. The next challenge is understanding and interpreting the results.

For this demo, I started with a clean slate and logged my server's activities from 6:43pm on 31st Jan to 7:47am on 2nd Feb 2005. This 2-day experiment produced a 15MB log file that contains some 72,000 lines (one line per hit). The data was collected 'live' via an SSH session, then parsed and processed offline by Weblog Expert software. Gaps during Jaring Wireless's persistent signal breaks and the FT holiday reduced reported traffic considerably, thankfully.

Hits were a big thing during the dotcom boom days when eyeballs meant everything to ignorant investors and sneaky upstarts. A hit is a browser request for any one file, such as an HTML page, graphic, JavaScript file or other resource required to display the page. One page view, as you can see, can generate any number of hits depending on what's inside the page. Hits by themselves are therefore quite misleading, meaningless and rightfully untrusted.

Cached Requests tell you a lot. They mean your visitors are returning visitors, which is good. They have been to your site before, and your page is not loaded again because an unchanged copy is still in their browser's cache. It also means you have not updated your site, which is bad.

Failed Requests are images and pages that did not show up when requested. A likely reason is that your images are too large and therefore slow to load. Or that your visitors are mostly using dial-up or pseudo-broadband access.

Page Views. Successful requests for a specific URL or page. A visit to this page is counted as one page view even though 7 hits may be recorded.
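The four counters described so far can be sketched in a few lines of Python. This is a rough approximation and not how Weblog Expert does it: it assumes the log is in Apache/NCSA Combined Log Format (the format is not stated above), treats status 304 as a cached request, anything from 400 up as failed, and counts a page view only for a successful request whose path ends in `/`, `.html` or `.htm`. The sample lines are invented.

```python
import re
from collections import Counter

# Assumption: Apache/NCSA Combined Log Format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumption: what counts as a "page" rather than a supporting file.
PAGE_SUFFIXES = ("/", ".html", ".htm")


def classify(lines):
    """Tally hits, cached requests, failed requests and page views."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        status = int(match.group("status"))
        path = match.group("path").split("?", 1)[0]  # drop any query string
        totals["hits"] += 1  # every logged request is one hit
        if status == 304:
            totals["cached"] += 1
        elif status >= 400:
            totals["failed"] += 1
        elif path.endswith(PAGE_SUFFIXES):
            # Simplification: a cached (304) page is not counted as a view.
            totals["page_views"] += 1
    return totals


SAMPLE = [
    '203.0.113.7 - - [31/Jan/2005:18:43:10 +0800] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/4.0"',
    '203.0.113.7 - - [31/Jan/2005:18:43:11 +0800] "GET /logo.gif HTTP/1.1" 200 2048 "http://example.com/index.html" "Mozilla/4.0"',
    '203.0.113.7 - - [31/Jan/2005:18:43:11 +0800] "GET /style.css HTTP/1.1" 304 - "http://example.com/index.html" "Mozilla/4.0"',
    '198.51.100.9 - - [31/Jan/2005:18:50:02 +0800] "GET /missing.jpg HTTP/1.1" 404 0 "-" "Mozilla/4.0"',
]

totals = classify(SAMPLE)
print(dict(totals))
```

In the sample, four hits boil down to a single page view, which is exactly why hits alone flatter the webmaster.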

Visitors: People hanging around the site, usually counted in 30-minute sessions. Visitor and page view statistics reveal more than hits.

Unique IPs: Imperfect way of separating new visitors from old.
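Here is one way the 30-minute rule and the unique IP count could be worked out, as a sketch only. It assumes each request has already been reduced to an (IP, timestamp) pair, identifies a visitor by IP address alone (the imperfect method just mentioned), and starts a new visit whenever the same IP has been quiet for more than 30 minutes. The name `count_visits` and the sample data are made up.

```python
from datetime import datetime, timedelta

# Assumption: a visit ends after 30 minutes of silence from the same IP.
SESSION_GAP = timedelta(minutes=30)


def count_visits(requests):
    """Return (visits, unique_ips) for an iterable of (ip, datetime) pairs."""
    last_seen = {}  # ip -> time of that IP's most recent request
    visits = 0
    for ip, when in sorted(requests, key=lambda pair: pair[1]):
        previous = last_seen.get(ip)
        if previous is None or when - previous > SESSION_GAP:
            visits += 1  # first request ever, or back after a long gap
        last_seen[ip] = when
    return visits, len(last_seen)


sample = [
    ("203.0.113.7", datetime(2005, 1, 31, 18, 43)),
    ("203.0.113.7", datetime(2005, 1, 31, 18, 50)),   # same visit, 7 minutes later
    ("198.51.100.9", datetime(2005, 1, 31, 19, 0)),   # a different IP
    ("203.0.113.7", datetime(2005, 1, 31, 19, 40)),   # 50 minutes later: new visit
]

visits, unique_ips = count_visits(sample)
print(visits, unique_ips)
```

Two people sharing one office proxy would show up here as a single IP, and one person on a dial-up line that hands out a new address each time would show up as several. That is the imperfection.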

Bandwidth: Pay close attention if your web host sets a cap and charges you when you exceed its limit. If you are a forummer especially, please note that by hot-linking to someone's image without permission, you are not only stealing the image but bandwidth as well.

Referral: The resource or URL that sent the visitor to your doorstep. Contrary to what you were always told, "No Referrers" are good. The more you get, the merrier. It usually means that most of your visitors came in by typing the URL directly or by bookmarks. Repeat customers are always good. However, in Sitemeter, if your stats show no referrals at all, it may mean you inserted the wrong code! Hang in there...
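Bandwidth and referrals can be totted up in the same spirit. The sketch below assumes each request has been reduced to a (referrer, bytes sent) pair, and that a referrer recorded as `-` or left empty means the visitor typed the URL or used a bookmark. The function name `summarise` and the sample figures are invented for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def summarise(requests):
    """Return (referrer counts by host, total bytes) for (referrer, size) pairs."""
    referrers = Counter()
    total_bytes = 0
    for referrer, size in requests:
        total_bytes += size
        if referrer in ("", "-"):
            # Assumption: no referrer means a typed URL or a bookmark.
            referrers["(no referrer)"] += 1
        else:
            # Group by host so different pages on one site count together.
            referrers[urlparse(referrer).hostname or referrer] += 1
    return referrers, total_bytes


sample = [
    ("-", 5120),
    ("http://www.google.com/search?q=dua+sen", 5120),
    ("-", 2048),
]

referrers, total_bytes = summarise(sample)
print(referrers.most_common(), total_bytes)
```

Comparing `total_bytes` for the month against your host's cap tells you how close you are to an extra bill.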

Coming up: PART 2: Making Sense of Sitemeter Statistics

© 2005 TV SMITH