Five Tips to Help Optimise Urchin Software Installations

Below are some useful tips for getting the most out of your Urchin software installation. Although originally developed in 2005 for Urchin 5, they are still relevant. To implement any of these techniques, the software should be installed and working on your server, and you should have a good understanding of server administration.

All Urchin Software tips have been tested on our own servers. However, please note that GA-Experts.com accepts no responsibility for any issues arising from the use of our advice below.

Urchin sales discontinued from March 28th 2012
Please ensure you upgrade to the latest version (Urchin 7) before the deadline. There is no time limit on any Urchin license and all software will continue to run. However, new sales are discontinued from this date (read official Google announcement). We will support your install as long as you need us!

Contents:

  1. Recommended logformat for Apache
  2. Log file rotation - How to
  3. How many people add my site to Favourites?
  4. Auto email reports to clients - Urchin 5 specific
  5. Reporting on internal search - Urchin 5 specific

1. Recommended logformat for Apache

Ensure UTM is installed on all pages and use the full NCSA logformat in httpd.conf:

LogFormat "%h %v %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\"" combined

NOTE: ALL QUOTES MUST BE DOUBLE QUOTES. The quotes inside the format string are escaped with a backslash and the whole string is wrapped in double quotes, so it ends with:

\"%{Cookie}i\""
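
To illustrate the field order this format produces, here is a minimal Python sketch that splits one log line into named fields. The sample line, the cookie value and the function name are made up for illustration, and the pattern assumes no escaped quotes appear inside the quoted fields:

```python
import re

# One named group per directive in the recommended LogFormat:
# %h %v %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" "%{Cookie}i"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<vhost>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" "(?P<cookie>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line, not taken from a real server.
sample = (
    '192.0.2.10 www.example.com - [01/Mar/2005:00:01:02 +0000] '
    '"GET /index.html HTTP/1.0" 200 5120 '
    '"http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0)" '
    '"__utma=1.2.3.4.5.6"'
)
fields = parse_line(sample)
print(fields["vhost"], fields["status"], fields["cookie"])
```

The virtual host (%v) and cookie fields are the ones that a plain 'combined' format would not include.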

2. Log file rotation - How to

A "Howto" document for web server log file rotation with compression - Unix and Windows examples. Download as a log file rotation PDF.

Why Rotate?

Urchin stores aggregated logfile information in its own database, enabling the end user to build 'real-time' visitor reports. With its own 'Log Tracker' keeping track of how far into your server logfiles it has processed, you could say logfile rotation can be ignored. However, this isn't recommended for the following reasons:

Logfile rotation is achieved quite simply on Unix machines using crontab and logrotate. We describe a separate method below for Windows.

System

This example was developed and tested on:

Unix: RedHat 6.x and 7.1/2/3 using Apache v1.3.9-29.
Windows: NT4/SP6, Windows 2000 SP1

Schematic Unix Example

If you wish to rotate/compress files each and every time Urchin is scheduled to run, you can do this within Urchin's Log Manager. However, this can create a large number of logfiles, especially if you have multiple logfiles/hosts.

Our logfile rotation results in the following monthly files being created:

In this example, Apache is using the 'combined' log format defined in httpd.conf, e.g.

# The following directives define some format nicknames for use
# with a CustomLog directive (see below).
#
LogFormat "%h %v %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\"" combined
...
CustomLog /home/httpd/logs_dir/httpd_log combined

Note the LogFormat directive may be slightly different from what you see in some installations of Apache. Ours is recommended as it allows you to simply use the default Urchin format: 'Log Format = auto'.

Unix Method

Each minute, the cron daemon checks which jobs are due to run. Scheduling is set in the /etc/crontab file:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
# column headings - thanks Toby
# mins, hr, date, month, day, command
# run-parts
# Min Hr Date Month Day Owner Command File
01 * * * * root run-parts /etc/cron.hourly
02 1 * * * root run-parts /etc/cron.daily
50 23 * * 0 root run-parts /etc/cron.weekly
01 00 1 * * root run-parts /etc/cron.monthly

The jobs to be run are the scripts placed in the directory named on each line, for example /etc/cron.monthly. In the above example, the scripts in /etc/cron.monthly are run at 00:01 on the first day of every month.
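
The five schedule fields can be read as a simple match against the current time. Here is a minimal Python sketch of that idea, assuming each field is either * or a plain number (real cron also accepts ranges, lists and steps, and treats day-of-month and day-of-week specially when both are restricted). The function names are our own, for illustration only:

```python
from datetime import datetime


def field_matches(spec, value):
    """Return True if one cron field ('*' or a plain number) matches value."""
    return spec == "*" or int(spec) == value


def is_due(schedule, when):
    """Check a 'min hr date month day' schedule against a datetime."""
    minute, hour, day, month, weekday = schedule.split()
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


# The cron.monthly line: 00:01 on the first day of every month.
print(is_due("01 00 1 * *", datetime(2005, 3, 1, 0, 1)))
# The same time one day later does not match.
print(is_due("01 00 1 * *", datetime(2005, 3, 2, 0, 1)))
```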

My /etc/cron.monthly directory contains a file logrotate, contents of which are:

#!/bin/sh
/usr/sbin/logrotate /etc/logrotate.conf -f

The first line is required and simply tells the operating system to use the system shell to run the next line (the command). Line 2 does the rotation, using the program located at /usr/sbin/logrotate and the configuration file /etc/logrotate.conf (the -f flag forces the rotation). My logrotate.conf contains:

# system-specific logs may be configured here
#########################################################
# #
# MONTHLY rotations #
# #
#########################################################
# rotate apache log files:
/home/httpd/logs/*_log {
ifempty
copytruncate
rotate 12
monthly
compress
}

Note the first part of this file (up to # system-specific logs may be configured here) contains the default parameters and is omitted here for clarity. Below this comment, parameters over-ride the defaults.

[One caveat of the default parameters is:

# send errors to root
errors your@emailaddress

This does not work, as /etc/crontab has MAILTO=root, which over-rides any address set in logrotate.conf.]

The part that does the rotating/compressing (you can even e-mail the rotated file) follows the comment:

# system-specific logs may be configured here
/home/httpd/logs/*_log {

defines which files are to be rotated and must end in a single closing brace '}'.

ifempty

defines that rotation will continue even if the file is empty.

copytruncate

defines that a copy of the logfile is created first and then the original's contents are removed (instead of simply creating a new file). This is required because Apache cannot be told to close (release) a logfile without stopping the service. With this method the Apache server does not have to be restarted.

rotate 12

Over-rides the default (4) by keeping 12 previous files.

monthly

Over-rides the default (weekly) by performing rotations monthly.

compress

Compresses the logfile, usually by as much as 20:1. Auto-compression is probably logrotate's most powerful feature - something Windows struggles with! See below.

Read man logrotate for details concerning what other options may be useful to you.
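
Taken together, the directives above amount to the following sequence of steps. This is a rough Python sketch of the idea, not logrotate's actual implementation. The function name is our own, and the numbered .N.gz naming is an assumption modelled on logrotate's usual output:

```python
import gzip
import os
import shutil


def rotate(log_path, keep=12):
    """Copy-truncate log_path, compress the copy and keep `keep` old files."""
    # rotate 12: drop the oldest archive, then shift the rest up by one
    # (.11.gz becomes .12.gz, and so on down to .1.gz becoming .2.gz).
    oldest = f"{log_path}.{keep}.gz"
    if os.path.exists(oldest):
        os.remove(oldest)
    for n in range(keep - 1, 0, -1):
        src = f"{log_path}.{n}.gz"
        if os.path.exists(src):
            os.rename(src, f"{log_path}.{n + 1}.gz")

    # copytruncate + compress: copy the live log into a gzip file.
    # ifempty: this step runs even when the live log has no content.
    with open(log_path, "rb") as live, gzip.open(f"{log_path}.1.gz", "wb") as out:
        shutil.copyfileobj(live, out)

    # Empty the original in place, so the web server keeps writing
    # to the same open file and does not need a restart.
    with open(log_path, "r+b") as live:
        live.truncate(0)
```

As the logrotate man page warns for copytruncate, any lines written between the copy and the truncate are lost.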

[A caveat: the man logrotate page appears to indicate that the order of the commands is unimportant. For instance, in the /var/log/news/* example, nocompress appears after endscript. However, changing this to compress will not work; it must come above postrotate. That is, nocompress is actually ignored there, but no compression occurs anyway as it is the default action.]

Windows Schematic Method

This is very similar to the Unix method above. Setup is much easier than for Unix (simply select a radio button in the IIS control panel). However, there is little flexibility and compression is more complicated to achieve. The differences between the Windows and Unix schematics are:

Windows Method

The following discusses how you can add compression functionality. It assumes you have already selected the required logfile rotation frequency in the IIS control panel, and have installed WinZip and the WinZip command line add-on on your Windows machine in their respective default directories.

Create a file ziplogs.bat with your text editor containing the following:

ren "c:\Inetpub\weblog\w3svc1\monthly.zip" "monthly-old.zip"
"c:\program files\winzip\wzzip" -exomT "c:\Inetpub\weblog\w3svc1\monthly.zip" "c:\weblog\w3svc1\*.log"

The first line renames the previous zip file. The second line calls the wzzip program (actually the winzip commandline add-on) with options = exomT, and compresses all logfiles from the default IIS logfile location into monthly.zip.

[Note, *.log is used here because Windows names its log files using a unique timestamp. However, following the initial run of the script, there should in fact only be one logfile available - the last one just rotated.]

Options:

ex = Set the compression level to maximum
o = Change the zip file's date to the same as the newest file in the Zip file
m = Move files into the zip file
T = Include files older than the current date (if no date specified)

The most important option is 'T' (case sensitive). 'T' ensures only the last logfile is included in the compression, not the newly created one. Use the Windows scheduler to run your ziplogs.bat batch file at 01:00 on the day in question i.e. just after the rotation.

As you will have noticed, the Windows method only gives you two backups of your logfiles (monthly.zip and monthly-old.zip). A more advanced batch file is required if you wish to keep further (numbered) copies.
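
As one possible way to keep numbered copies, here is a Python sketch that uses the standard zipfile module in place of WinZip. It imitates the 'm' (move) and 'T' (older than today) options described above. The function names, the monthly-N.zip naming and the default of six copies are all assumptions for illustration:

```python
import os
import zipfile
from datetime import datetime


def shift_archives(zip_path, keep):
    """Rename monthly.zip to monthly-1.zip, monthly-1.zip to monthly-2.zip, etc."""
    base, ext = os.path.splitext(zip_path)
    oldest = f"{base}-{keep}{ext}"
    if os.path.exists(oldest):
        os.remove(oldest)
    for n in range(keep - 1, 0, -1):
        src = f"{base}-{n}{ext}"
        if os.path.exists(src):
            os.rename(src, f"{base}-{n + 1}{ext}")
    if os.path.exists(zip_path):
        os.rename(zip_path, f"{base}-1{ext}")


def archive_logs(log_dir, zip_path, keep=6, cutoff=None):
    """Move *.log files last modified before `cutoff` into zip_path."""
    if cutoff is None:
        # Like 'T' with no date: only files older than the start of today.
        midnight = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
        cutoff = midnight.timestamp()
    old_logs = [
        os.path.join(log_dir, name)
        for name in sorted(os.listdir(log_dir))
        if name.endswith(".log")
        and os.path.getmtime(os.path.join(log_dir, name)) < cutoff
    ]
    if not old_logs:
        return []
    shift_archives(zip_path, keep)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in old_logs:
            archive.write(path, arcname=os.path.basename(path))
    # Like 'm': remove the originals once they are safely in the zip file.
    for path in old_logs:
        os.remove(path)
    return old_logs
```

Scheduled in the same way as ziplogs.bat, each run would leave the newest archive in monthly.zip and the older ones in monthly-1.zip, monthly-2.zip and so on.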

3. How many people add my site to Favourites?

In MS Internet Explorer v5 and above, when a visitor adds your site to their favourites, IE requests a small icon file. Tracking this download allows you to monitor this.

As Admin, within Reporting, append ico to the 'Downloads Match' field, e.g.

pdf,zip,exe,sh,tar,gz,dmg,pkg,doc,xls,ppt,ico

Firefox Update (March 2005):

Firefox is gaining momentum as a new alternative to MS IE. Although Firefox usage is still low (~5% of all visitors), an issue has come to our attention in the way it requests the favicon.ico file. Firefox will load the favicon.ico file for every page viewed (subject to cache settings), even if not bookmarked. As you might guess, this can have a dramatic effect on the perceived number of people adding your site to favourites. It is thought other Mozilla type browsers (e.g. Netscape) behave in a similar way.

To differentiate IE from Firefox requests, you can make the following change in your apache.conf file (a similar hack can be used for IIS):

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !.+MSIE.+
RewriteRule ^/favicon\.ico$ /favicon2.ico

In English this reads, "if the browser is not MS IE and requests the favicon.ico file, change the request to favicon2.ico".

Of course you will need to upload the favicon2.ico file into your web root. Note, this does not tell you how many Firefox users (or other Mozilla type browsers) are adding your site to their favourites, but it will keep the MS IE figure accurate.
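
For readers who want to check the logic before touching the server, here is a small Python sketch of the same decision. It is only an illustration of the rule, not part of Apache; the function name and the sample User-Agent strings are our own:

```python
import re

# Same patterns as the rewrite directives (with the dot in the
# filename escaped so it matches a literal '.').
MSIE = re.compile(r".+MSIE.+")
FAVICON = re.compile(r"^/favicon\.ico$")


def rewrite(path, user_agent):
    """Return the path the request is served from after the rule."""
    # Apache regular expressions match anywhere in the string, hence search().
    if FAVICON.search(path) and not MSIE.search(user_agent):
        return "/favicon2.ico"
    return path


ie = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
firefox = "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0"

print(rewrite("/favicon.ico", ie))
print(rewrite("/favicon.ico", firefox))
```

Only requests that still resolve to /favicon.ico are then counted by the 'Downloads Match' setting as favourites.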

4. Auto email Urchin reports to clients

Tested on Linux, this Perl script is a customisation of the original Urchin-supplied u5data_extractor.pl script. It is set to run on the 1st of each month, emailing a custom defined report to you and the client. In this example, the report is Search Engine visitors.

A cron job is set as follows:

/usr/local/urchin/util/u5data_extractor.pl --profile <<client profile>> | mutt -x -s "Last month's SE visitors" -c <<your@email.addr>> <<client@email.addr>>

where <<client profile>> is the profile name you set in the Urchin Admin console.

Rename the existing u5data_extractor.pl script and install this modified one, u5data_extractor.txt (save and rename to .pl). Ensure the permissions are set correctly to execute it:

chmod ugo+x u5data_extractor.pl

The file is fully commented and has two main differences from the original:

5. Reporting on internal search

Having an internal Search Engine on your website to help visitors quickly find information is a powerful way for you to understand what your visitors are looking for. Studies have shown that no matter how good your navigation scheme is, visitors to large sites will always prefer to search rather than drill down. Analysing visitor searches is therefore valuable.

Such a feature is standard in Urchin software - see the Pages & Files/Page Query Terms section. However, what if you have two search options? For example, a real estate agent may have separate search boxes for users to specify property type and location:

[Screenshot: search form with separate property type and location fields]

To combine these two variables into a single search term requires an advanced filter. Below is the example, assuming the names of the form fields above are 'area' and 'type' respectively.

[Screenshot: advanced filter configuration]

Then, within the Page Query Terms report, you will see a 'combinedQuery' entry - a report of how often each combined query was used.

Note that implementing this filter will overwrite your entire query for any request that contains the variables 'area' or 'type' in the query stem. If there is other information in your query terms that you might want to save for later analysis, you cannot accomplish this while simultaneously combining your keywords. Also, ensure that there are no other pages that use the variables 'area' or 'type', because that page's entire query will be overwritten by this filter.
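
The effect of such a filter, including the overwrite caveat, can be sketched in Python. This is not the Urchin filter itself. The URLs, the function name and the '-' separator between the two values are assumptions for illustration:

```python
from urllib.parse import parse_qs, urlsplit


def combined_query(url, fields=("area", "type")):
    """Return the replacement query string, or None if the filter does not apply."""
    query = parse_qs(urlsplit(url).query)
    # The filter applies to any request carrying either variable.
    if not any(name in query for name in fields):
        return None
    values = [query.get(name, [""])[0] for name in fields]
    # The whole original query is replaced, so any other variables are lost.
    return "combinedQuery=" + "-".join(values)


# Hypothetical requests, not taken from a real site.
print(combined_query("/search.php?area=london&type=flat&page=2"))
print(combined_query("/about.html?id=3"))
```

In the first request the 'page' variable disappears from the result, which is exactly the loss of information described above.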
