Generating and Viewing Awstats Reports

Awstats in action

The first article in this series, Installing awstats on Linux, taught you how to install and configure awstats for your site. In this article, we will explain how to generate reports from the command line. This is a simple approach that creates static html pages to display your web traffic.

Building a report: The script

This is a simple script that is included with awstats and combines the generation of several reports into one step. It updates the stats and generates several standard reports by using the main script behind the scenes. For a closer look at what reports this script will build, check the awstats online documentation.

The command will look like this in our example:

sudo /usr/local/awstats/tools/ -update -dir=/home/demo/public_html/ -awstatsprog=/usr/local/awstats/wwwroot/cgi-bin/

The script itself

The command is long, so we will break it down and explain what each script option does:


This part is the script we are running:

If you installed awstats to a location other than "/usr/local/awstats", adjust the path to match the location of the script on your machine.

The -update option

Including "-update" at the beginning of the options tells the script to update the stats analysis before generating the reports.

The -config option

The "config" value should be the main domain name for the site. Note that this domain matches up with the name of the config file you created in the first part of this series. The name of your config file should have "awstats." before the main domain name, and ".conf" after it, since that is what this script will be looking for. Replace "" with your main domain name.

The -dir option

The "-dir" option refers to the directory where you want awstats to create its reports. That directory should contain an "awstatsicons" directory containing awstats' standard image files.

The -awstatsprog option

For "-awstatsprog" you want the value to be the location of the "" script, which is the main awstats script. If you installed awstats someplace other than "/usr/local/awstats", adjust accordingly.
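Putting the pieces together: awstats ships with a report-building script named "awstats_buildstaticpages.pl" and a main script named "awstats.pl", so with those standard names filled in and a hypothetical domain of "example.com", the assembled command would look something like this (adjust every path and the domain to your own setup):

```shell
# Hypothetical example using awstats' standard script names and
# "example.com" as the config domain. Adjust paths to your install.
sudo /usr/local/awstats/tools/awstats_buildstaticpages.pl -update \
    -config=example.com \
    -dir=/home/demo/public_html/ \
    -awstatsprog=/usr/local/awstats/wwwroot/cgi-bin/awstats.pl
```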

Script results

Run the following script (all one line):

sudo /usr/local/awstats/tools/ -update -dir=/home/demo/public_html/ -awstatsprog=/usr/local/awstats/wwwroot/cgi-bin/

You will see that the script launches the awstats update process, then tells you about each of the 20 reports it is generating. If the script encounters an error, it will print troubleshooting advice, such as verifying that you used the right "config" identifier.

NOTE: The last line, the "Main HTML page" line, gives the main page of the report.

In your reports directory you should now see a lot of html files:

$ ls /home/demo/public_html/
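The exact filenames depend on your config name. With a config of "example.com", the static pages the script generates typically follow a pattern along these lines (an illustrative sketch, not an exact listing):

```
awstats.example.com.html
awstats.example.com.alldomains.html
awstats.example.com.allhosts.html
awstats.example.com.refererpages.html
awstats.example.com.errors404.html
...
```

The first file, "awstats.example.com.html", is the main page of the report that the script pointed you to.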

View the report

To see the report results, point your browser at the main html file identified in the last line of the script's output.

It is important to work out the address you will use to view the reports. If you discover that you created the reports in a directory you cannot see from a browser, you may want to make a new reports directory. Edit your awstats config file accordingly, then run the report generation again to make sure it works with the new directory.

If all goes well you'll see something like:

You might see less than a day's worth of traffic in this initial report, or perhaps a week, depending on how often your web logs are rotated. So not a lot that's interesting just yet, but enough to make sure the reports were generated properly.

Visits, hits, pages and bandwidth

There are several available reports linked at the top of the main report page. The main statistics you will see at the beginning of the report bear some quick explanation:

Unique visitors

Tracks the number of different visitors your site received. For awstats, this means the number of unique IP addresses in your web logs. This number is not completely accurate since visitors behind proxy servers and home routers only appear in the web logs under the IP addresses belonging to the proxies or routers.

Number of visits

Tracks how many times visitors return to the site. A "visit" encompasses all page hits from a visitor within an hour or so of each other. If the same IP address appears in the web logs the next day, that would count as a second visit.

Pages

In web traffic terms, a "page" is the main page of a visited URL. This would be the HTML or PHP file that was requested by the visitor. If a page includes the contents of other HTML files, only the main page is counted as a "page" in the traffic stats.

Hits

Everything a web browser requests from a site is a "hit." The main page, headers and footers, images, videos — everything the browser has to ask for is a "hit." A complex site will produce a fair number of hits per page visit.

Bandwidth

In the combined log format the web server records the size of all the requests and responses that get sent between the browser and the server. The total of all the outgoing response sizes is the "bandwidth" statistic in awstats. This is not necessarily the total bandwidth used by the site — it's just the total bandwidth that got recorded in your web server's access logs.
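To make these definitions concrete, here is a small sketch that computes raw hits, unique visitor IPs, and total bandwidth from a tiny made-up access log in the combined log format. Awstats does considerably more (sessionizing visits, filtering robots, and so on), but the raw counts come from the same log fields:

```shell
# Create a tiny sample access log in combined log format (made-up data).
cat > /tmp/sample_access.log <<'EOF'
203.0.113.5 - - [10/Oct/2015:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2015:13:55:37 +0000] "GET /logo.png HTTP/1.1" 200 20480 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2015:14:02:11 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
EOF

# "Hits": every request line in the log counts as one hit.
echo "hits: $(wc -l < /tmp/sample_access.log)"

# "Unique visitors": distinct client IP addresses (field 1).
echo "unique visitors: $(awk '{print $1}' /tmp/sample_access.log | sort -u | wc -l)"

# "Bandwidth": sum of the response sizes (field 10 in combined format).
awk '{sum += $10} END {print "bandwidth bytes: " sum}' /tmp/sample_access.log
```

Here the two requests from 203.0.113.5 are three hits total with the third line, but only two unique visitors.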

A note about referer spam

You may notice that the "referer" information in your reports contains links to referring web sites. This is useful for checking out sites that are linking to you but there's a potential drawback to putting this information on a web page, and that's "referer spam."

There's a school of thought among less-reputable web admins that encourages doing whatever you can to increase your search engine ratings. One of those tactics involves finding a site with a publicly-accessible web stats page and then running a script that visits the site a bunch of times using their web site as a referrer. The theory is that search engines will count the stats page as another site linking to their site.

In practice it doesn't work that well (most major search engines are wise to the practice and account for it), but that doesn't mean we should encourage the inconsiderate jerks to keep trying it.

The preferred way to keep the stats pages from being used for spamming purposes is to protect the stats directory from unauthorized access. You can do that by password-protecting that part of the site, or by restricting access to that directory to just localhost and using ssh tunneling to view your stats.
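If you are running Apache and already have a password file created with htpasswd, a minimal sketch of the password-protection approach is an ".htaccess" file in the stats directory along these lines (the password file path, the "webstats" directory, and the realm name are all assumptions — adjust them to your server, and note that "AllowOverride AuthConfig" must be enabled for the directory):

```
# Hypothetical /home/demo/public_html/webstats/.htaccess
AuthType Basic
AuthName "Web stats"
AuthUserFile /home/demo/.htpasswd
Require valid-user
```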

If you want to keep your stats public you should at least modify your site's "robots.txt" file to tell the major search engines not to index your stats pages. If you don't have a robots.txt file in the document root of your site this is a good time to create one.

Inside the robots.txt file you just need to add a "Disallow" rule for the web stats directory. If you don't have a robots.txt file already, you can use something like the following:

User-agent: *
Disallow: /webstats/

That would tell any robot that complies with the robots.txt file not to index the "webstats" part of the site. That way your stats site won't show up on major search engines at all, defeating the purpose of any efforts to manipulate your referers report.

If you want your web stats to show up on search engines for some reason, then at least tell robots not to index the referer page report. The static referer report awstats generates is named "awstats." plus your config name plus ".refererpages.html", so for a config of "example.com" the rule would look like:

User-agent: *
Disallow: /webstats/awstats.example.com.refererpages.html


We've got awstats installed and configured, and now we can run it to generate web traffic reports. There's just one problem: you'll have to run awstats again when you want to update the report pages.

The third article in this series, Scheduling Awstats Report Generation, will show you how to save time and effort by automating the report generation process.

-- Jered

© 2015 Rackspace US, Inc.

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License

See license specifics and DISCLAIMER