Documentation for Adobe ColdFusion and BlueDragon application servers

cf_siteInsight

real-time visitor tracking
Copyright 2004-2006 ESWsoftware
Visit us at www.eswsoftware.com

Version: 2.0 (Release notes)
Date: 20060216
Required: Adobe ColdFusion 5.0 or greater, or
BlueDragon 6.1 or greater
See System requirements for more detail
Recommended: Adobe ColdFusion MX 6.0 or greater, or
BlueDragon 6.1 or greater
Recent changes:
20060209 * added historyLength and autoscroll
20060208 * added ignore attribute (takes a regex to identify URLs to ignore)
20060208 * added now attribute (current timestamp); must be set for both the tracking and reporting tags!
20060129 * Added browser icons, added block attribute (takes a regex - "^Googlebot$")
20060110 * Trimmed whitespace.
20060109 * made browser data available in request variable.
20060109 * applied latest GeoLite Country database from MaxMind.com
20060105 * changed display - removed IP, added robot indicator, added pages viewed
20060104 * added bots
20060104 * now aggregates bots as they often use multiple IPs
20050916 * logDatasource was causing errors when not set (thanks Trevor Cole)
20050823 * added "noindex,nofollow" search engine directive
20050823 * smarter handling of known robots and spiders
20050630 * added static page (HTML) tracking facility via an embedded image

Contents

  1. What is it?
  2. What's new in version 2.0?
  3. Installation
  4. Usage
  5. Setting your resourcePath
  6. Logging
  7. Displaying countries and flags
  8. Identifying logged-in users
  9. Adding host name lookup
  10. Security
  11. Tracking static or non-CFM pages
  12. The tag
  13. Attributes

1. What is it?

cf_siteInsight helps you find out who is visiting your site in real time. Discover information about each visitor such as IP address, country, browser details, time spent on the site, number of pages visited, and the page being viewed currently. You can even see who is currently logged in to your site and see a complete page history for each visitor. You can log session data for later analysis, and you can access this data from your application so you can present different content based on country or browser.

cf_siteInsight is not intended to replace your existing statistics application. Rather, it is designed as a real-time supplement. cf_siteInsight can give you a very clear qualitative indication of the traffic on your site, and allow you to monitor the pathways real users take through your site. You will see how users from different countries arrive at different times of day, and you will have a clear idea of the load various bots and spiders place on your site.

Note that cf_siteInsight is designed for small to medium websites and may not be appropriate for long-term use on heavily trafficked sites (i.e. sites receiving tens of thousands of unique user sessions per hour).


2. What's new in version 2.0?

  • Better bot handling
    Bots are now indicated with an icon so you can easily see which traffic is human and which non-human. Bots often use a range of IP addresses and fail to maintain session variables. Hits from recognised bots are now aggregated properly.

  • Track static pages
    You may now track static HTML pages by embedding a tracker image in your pages.

  • Autoscroll
    When the cf_siteInsight report reloads itself, it scrolls automatically back to the point you were viewing.

  • Exclude irrelevant pages
    Use a regular expression to specify pages (such as administration screens, for example) you want to ignore.

  • Block bots of your choice
    Block bots or browsers based on user agent string. This is useful where a bot ignores the robots.txt file and creates excess server load.

  • Leverage visitor data for your application
    Data such as visitor country, browser, and platform are now available to your application. You could, for example, display different information for local versus overseas visitors.

  • Minor changes
    Minor changes to report layout. Trimmed whitespace generation in report. Applied latest IP address data to the bundled database. Added more bots and spiders. Fixed a bug with logDatasource. Added "noindex,nofollow" search engine directive. Added browser icons. Added a historyLength setting to limit the amount of history displayed. The now attribute lets you adjust tracking and reporting for your time zone independently of the server's time zone.

NOTE: If you are currently running cf_siteInsight 1.2 on your server, you will need to reset your cached language resource bundle after installation. Do this by viewing your page in a web browser as usual, but add the following variable to your page's URL: resetResourceBundle=true. For example, if your page is index.cfm then you'll need to visit the URL: index.cfm?resetResourceBundle=true . You only need to perform this procedure once.


3. Installation

If you are not familiar with installation of custom tags, please read this brief article first: Installing and using custom tags. It will save you time and frustration. You may also like to try the Installation Wizard which will provide a custom recommendation for your environment.

Install the siteinsight.cfm, resourcebundle.cfm and log.cfm files in your server's custom tags folder or another preferred location.

Place the resources folder anywhere under your website root. This folder contains images and stylesheets that must be accessed by client browsers.


4. Usage

cf_siteInsight must be called every time a page on your site is viewed. We will call this instance the "tracking tag." The best location to call the tag from is generally Application.cfm or Application.cfc in the root of your site:

<cf_siteInsight
  trackerId="mySite"
>

You can also use cfmodule or, in CFMX, use cfimport. See the following page for general information about calling custom tags: Installing and using custom tags.
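
As a sketch of those two alternatives, the calls below invoke the same tracking tag through cfmodule and through cfimport. The name value assumes siteinsight.cfm sits directly in your server's custom tags folder, and the /customtags/ folder and the si prefix are hypothetical; substitute the location where you installed the tag. Use one form or the other, not both, otherwise each page view is tracked twice.

```cfml
<!--- Option 1: cfmodule, resolving siteinsight.cfm from the custom tags folder --->
<cfmodule
  name="siteinsight"
  trackerId="mySite"
>

<!--- Option 2 (CFMX): cfimport the folder that holds siteinsight.cfm,
      then call the tag through the chosen prefix.
      "/customtags/" and "si" are assumptions for illustration. --->
<cfimport taglib="/customtags/" prefix="si">
<si:siteinsight trackerId="mySite">
```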

If you have never used Application.cfm or Application.cfc before, note that Application.cfc is only available for ColdFusion 7 or later, and requires an understanding of CFCs. Conversely, Application.cfm works on all versions and is conceptually simpler. See these pages for more information: Adobe LiveDocs: Using an Application.cfm page, Adobe LiveDocs: Application.CFC Reference.

The trackerId attribute is used to identify your application uniquely. If you place the cf_siteInsight tag after your cfapplication tag, then you don't need to use trackerId.

In order to view the data that cf_siteInsight is now gathering, create another page and call the tag as follows (once again, the trackerId is not required if you have used cfapplication):

<cf_siteInsight 
  trackerId="mySite"
  mode="report" 
>

The mode attribute tells cf_siteInsight not to track the visit, but to generate a report instead. We will call this instance the "reporting tag."

The cf_siteInsight tag has many additional options, but it is important to get the basic tag working first.


5. Setting your resourcePath

It is important that this attribute is set correctly in your reporting tag so that the style sheets and images for your report are used. If your report table appears plain and unstyled then this path is incorrect.

Make sure you have copied the resources folder to a web-accessible location on your web site. Make sure the supplied image test.gif is found in the resources folder. Open your browser and type in the URL of the resources folder on your web site. Add /test.gif on the end. If a large green check mark appears along with a message, then you have typed the address correctly. If not, keep trying.

Once you see the message, strip the http:// and your domain name off the beginning of the URL but leave a leading slash. Strip the test.gif off the end but leave a trailing slash. What remains is the correct value for your resourcePath attribute!
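
As a worked example, suppose (hypothetically) that the test image loads at http://www.example.com/siteinsight/resources/test.gif. Stripping http://www.example.com from the front and test.gif from the end leaves /siteinsight/resources/, and the reporting tag would then be called roughly as follows. The domain and folder names are assumptions for illustration, not paths shipped with the product.

```cfml
<!--- resourcePath derived from the hypothetical URL
      http://www.example.com/siteinsight/resources/test.gif --->
<cf_siteInsight
  trackerId="mySite"
  mode="report"
  resourcePath="/siteinsight/resources/"
>
```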


6. Logging

You may log your visitor session data to a physical file or to a ColdFusion datasource.

Logging to a datasource

First, create the database and associated datasource in ColdFusion Administrator. Next, add the dbms and logDatasource attributes to your tracking tag in Application.cfm. logDatasource is the name of your datasource, and dbms indicates the name of the database server you are using. It may be set to access, default, mysql, postgresql, or sqlserver. (If you want to use another DBMS, either create the database table manually or create an appropriate file in the folder {cf_siteInsight Install Directory}/siteinsight/datatypes/ .) For example:

<cf_siteInsight
  trackerId="mySite"
  dbms="sqlserver"
  logDatasource="#request.dsn#"
>

Logging to a file

Add the log attribute to your tracking tag in Application.cfm, containing the file name of the log file you would like to use. If you use a simple name without extension or path, then a standard ColdFusion log will be used. You may view this log through the ColdFusion Administrator. For example:

<cf_siteInsight
  trackerId="mySite"
  log="sessions"
>

If you use an absolute path, then a custom log will be generated using the supplied cf_log tag. The cf_log tag must be found in the same folder as cf_siteInsight. The advantages of using cf_log are that your log may be stored anywhere (which is useful if you do not have CF Administrator access), and that data will be stored in separate columns for easier analysis. You can open the log in software such as Microsoft Excel. For example:

<cf_siteInsight 
  trackerId="mySite"
  log="#expandPath('sessions.csv')#"
>

7. Displaying countries and flags

cf_siteInsight can use your visitor's IP address to determine their country of origin. To do this, it looks up the address in a database of IP ranges.

You will need to either set up a datasource for the included database siteinsight.mdb, or import the two tables it contains into your existing database.

Locate the tracking tag in Application.cfm and add the datasource attribute, where the datasource corresponds to the database containing the two siteInsight tables. For example:

<cf_siteInsight
  trackerId="mySite"
  datasource="#request.dsn#"
>

Locate the reporting tag and add the resourcePath attribute. See the above section, "Setting your resourcePath."


8. Identifying logged-in users

If your visitor is logged in, cf_siteInsight can tell you who the visitor is. By default, the tag assumes it can find the value using the built-in ColdFusion function getAuthUser(). This works if you are using the cflogin tag that is part of ColdFusion. However, if you are using a custom login technique, cf_siteInsight will not be able to identify your user on its own. In that case, pass a value that represents the user to your tracking tag. For example:

<cf_siteInsight
  trackerId="mySite"
  login="#session.userId#"
>

9. Adding host name lookup

Reverse DNS lookups, which translate IP addresses into host names, can be added to cf_siteInsight by installing another custom tag. Here's how:

  1. Download and install Ben Forta's CFX_GetIPHostName custom tag. Instructions are included in the download. You must have ColdFusion Administrator access.
  2. Test that the tag is working properly. Try the following sample:
    <CFX_GetIPHostName ADDRESS="216.104.212.88" DISPLAY>
    You can of course try any IP address. Don't proceed until this works.
  3. Add the following attribute to your cf_siteInsight tracking tag:
    getHostName="yes"
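
Putting step 3 together with the basic tracking call, the tag in Application.cfm might look like the following sketch. It assumes CFX_GetIPHostName has already been installed and tested as described in steps 1 and 2; trackerId="mySite" is the same placeholder used in the earlier examples.

```cfml
<!--- Tracking tag with reverse DNS lookup enabled.
      Assumes the CFX_GetIPHostName tag is installed and working. --->
<cf_siteInsight
  trackerId="mySite"
  getHostName="yes"
>
```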

10. Security

cf_siteInsight exposes a wealth of sensitive information about your visitors. It is important that the report it generates be located in a folder that is not publicly accessible. cf_siteInsight includes a privacy attribute that is primarily intended for demonstration purposes. Set this to true to hide the most sensitive information.
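
For a demonstration page, a reporting call with privacy switched on might look like this minimal sketch. Note that privacy only hides some of the displayed data; it does not restrict access to the page, so the report still needs to be protected by your own means.

```cfml
<!--- Reporting tag for demonstration use: hides the most sensitive visitor data --->
<cf_siteInsight
  trackerId="mySite"
  mode="report"
  privacy="yes"
>
```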


11. Tracking static or non-CFM pages

Instead of including the tracking custom tag, you may embed a tracking image. This works not only in CFM pages, but in static HTML or non-CFM pages as well. To do so, simply embed the following markup in your HTML:

<img src="siteinsight4html.cfm">

Ensure siteinsight4html.cfm is located inside your web root, and insert the correct path to this file in the src attribute. Alternatively, you may use JavaScript as follows:

<script>
  // Requesting the image triggers the tracker; nothing is displayed.
  var tracker = new Image();
  tracker.src = "siteinsight4html.cfm";
</script>

12. The tag

When tracking:

<cf_siteInsight
  block="regex"
  cachedWithin="timespan"            [DEFAULT: 20 minutes]
  datasource="datasource"
  dbms="dbms"                        [DEFAULT: default]
  getHostName="yes" or "no"          [DEFAULT: no]
  historyLength="length"             [DEFAULT: 500]
  ignore="regex"
  log="log_name_or_absolute_path"
  logDatasource="datasource"
  login="user_identifier"            [DEFAULT: #getAuthUser()#]
  now="timestamp"                    [DEFAULT: #now()#]
  sessiontimeout="minutes"           [DEFAULT: 20]
  trackerId="id"                     [REQUIRED if cf_siteInsight
                                      is used before a
                                      cfapplication tag.]
  visitorIdMethod="cookie" or "ip"   [DEFAULT: "sessionid"]
    or "jsessionid" or "sessionid"
    or "urltoken" or custom value
>

When reporting:

<cf_siteInsight
  mode="report"                      [REQUIRED]
  now="timestamp"                    [DEFAULT: #now()#]
  privacy="yes" or "no"              [DEFAULT: "no"]
  resourcepath="web_path"            [DEFAULT: "/resources/"]
  skin="skin"                        [DEFAULT: "charcoal"]
  thisPage="web_path"                [DEFAULT: "#cgi.script_name#?"]
  trackerId="id"                     [REQUIRED if cf_siteInsight
                                      is used before a
                                      cfapplication tag.]
>

13. Attributes

block="regex"
Use this attribute to block access to certain bots or browsers. This is intended as a last resort, for use where bots produce excessive load and ignore robots.txt directives. cf_siteInsight will look for a match for the regular expression within the bot or browser name as reported by cf_siteInsight. To match the entire name, add ^ to the beginning of your regular expression and $ to the end. For example, block="Google" would block all Google bots or agents, including Googlebot, Mediapartners-Google, and Google Desktop. For more information about robots.txt see robotstxt.org.
cachedWithin="timespan" [DEFAULT: 20 minutes]
If a datasource has been set up so that cf_siteInsight may look up the country codes corresponding to IP addresses, then this attribute controls the caching of that query. Set this attribute as you would a cfquery cachedWithin attribute. That is, supply a timespan as follows:
cachedWithin="#createTimeSpan(0, 0, 20, 0)#"
datasource="datasource"
This attribute is required for cf_siteInsight to translate IP addresses into country names. The datasource must either be set up to point to the supplied Access database, or the database tables must be imported into your existing database.
dbms="dbms" [DEFAULT: default]
This attribute is only used where you are logging to a datasource using the logDatasource attribute. Use it to specify the type of database server you are using. The available options are: access, default, mysql, postgresql, or sqlserver. If you want to use another DBMS, either create the database table manually or create an appropriate file in the folder {cf_siteInsight Install Directory}/siteinsight/datatypes/ (see readme.html in that folder).
getHostName="Yes" or "No" [DEFAULT: "no"]
Set this attribute to add hostname lookup info to your report. Note that you must install the CFX_GetIPHostName tag first. See section 9 above.
historyLength="length" [DEFAULT: 500]
Limits the amount of history stored for each visitor. Set to 0 for full history.
ignore="regex"
Ignore certain URLs using this attribute. cf_siteInsight looks for a match for this regular expression within the current URL. To match the entire URL, add ^ to the beginning of your regular expression and $ to the end. To match several different URLs, use standard regular expression pipe (|) notation. For example, ignore="^/(admin|login)/" would ignore any visits to the /admin/ or /login/ folders.
log="log_name_or_absolute_path"
This attribute is required in order to enable logging of session data. If you set it to a simple file name without path or extension, then a standard ColdFusion log will be generated that is accessible via CF Administrator. If you specify a full path, then a log will be generated using the cf_log custom tag. The cf_log custom tag must be situated in the same folder as cf_siteInsight. A custom log generated with cf_log will contain more information, but will not be accessible via CF Administrator.
logDatasource="datasource"
Set this attribute if you would like to log sessions to a database. If the log attribute is also set, then this attribute will be ignored. Set this attribute to the name of an existing ColdFusion datasource. If you specify the dbms attribute as well then the database table will be created automatically as needed.
login="user_identifier" [DEFAULT: #getAuthUser()#]
cf_siteInsight can track logged in users. By default, it inspects the value returned by getAuthUser(). You may pass in any other value that represents your logged in user. ColdFusion 5 only: this field defaults to an empty string.
mode="track" or "report" [DEFAULT: "track"]
The mode attribute lets cf_siteInsight know whether it should track the visitor details or generate and display a report.
now="timestamp" [DEFAULT: #now()#]
Use this to adjust the displayed time if your server time does not reflect your actual time. It is left to you to supply the correct time via this attribute. If you use this attribute, ensure you use it for both your tracking and reporting tags.
privacy="Yes" or "no" [DEFAULT: "no"]
If set, the privacy attribute hides some confidential data about site visitors, including: IP address, referrer, full URL, and login details.
resourcepath="web_path" [DEFAULT: "/resources/"]
This attribute tells cf_siteInsight where to find the stylesheets and images it needs. See the above section, "Setting your resourcePath."
sessiontimeout="minutes" [DEFAULT: 20]
This attribute tells cf_siteInsight how long to track a visitor session.
skin="skin" [DEFAULT: "charcoal"]
The skin controls the visual appearance of this application. cf_siteInsight currently comes with only one skin, although you may add your own.
thisPage="web_path" [DEFAULT: "#cgi.script_name#?"]
This is the web path of the page that contains the reporting tag, that is, the path the report uses to link back to itself. It is required so that the report buttons will work. Format it as a web path, as described for resourcePath above, but remember that cf_siteInsight will attempt to add variables to the end of the URL, so it must end with a ?.
trackerId="id" [REQUIRED if cf_siteInsight is used before a cfapplication tag.]
The trackerId must be a unique ID representing your application on the server. If two applications on the same server share trackerIds then they will share stats. If possible, place your cf_siteInsight tag after a cfapplication tag. Then you don't need to use trackerId at all.
visitorIdMethod="cookie", "ip", "jsessionid", "sessionid", "urltoken" or a custom value [DEFAULT: varies]
This attribute tells cf_siteInsight how to identify a unique visitor. If you use ip (IP address), then visitors sharing a network may appear as one visitor. If you use cookie, some clients, as well as spiders and bots, will reject cookies, causing each page view to appear as a distinct session. In addition, due to the limited space for cookie storage, setting cookies may have unexpected effects such as the removal of some other cookie, even ColdFusion's session-management cookies. Use of sessionid is recommended. If your application has session management enabled, then this attribute defaults to sessionid. If not, then it defaults to cookie.
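
To show how several of these attributes combine, here is a sketch of a tracking tag and its matching reporting tag. The specific values are assumptions chosen for illustration only: ExampleBot is a hypothetical bot name, the two-hour offset passed to dateAdd() stands in for whatever difference exists between your server time and your local time, and the 100-entry history limit is not a recommendation from cf_siteInsight.

```cfml
<!--- Tracking tag, e.g. in Application.cfm --->
<cf_siteInsight
  trackerId="mySite"
  block="^ExampleBot$"
  ignore="^/(admin|login)/"
  historyLength="100"
  now="#dateAdd('h', 2, now())#"
>

<!--- Reporting tag, on a separate page that is not publicly accessible.
      The now value must be supplied here as well, matching the tracking tag. --->
<cf_siteInsight
  trackerId="mySite"
  mode="report"
  now="#dateAdd('h', 2, now())#"
>
```

Because block and ignore take regular expressions, the ^ and $ anchors above restrict each match to the whole bot name and to the start of the URL respectively; without them, any name or URL containing the pattern would match.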

Flag icons provided by flags.blogpotato.de.

Country IP data is the GeoLite Country Database supplied by MaxMind.com. Download updates in CSV format from this address: www.maxmind.com/app/geoip_country.

OPEN DATA LICENSE (GeoLite Free Country Database)

Copyright (c) 2005 MaxMind LLC.  All Rights Reserved.

All advertising materials and documentation mentioning features or use of
this database must display the following acknowledgment:
"This product includes GeoLite data created by MaxMind, available from
http://maxmind.com/"

Redistribution and use with or without modification, are permitted provided
that the following conditions are met:
1. Redistributions must retain the above copyright notice, this list of
conditions and the following disclaimer in the documentation and/or other
materials provided with the distribution. 
2. All advertising materials and documentation mentioning features or use of
this database must display the following acknowledgement:
"This product includes GeoLite data created by MaxMind, available from
http://maxmind.com/"
3. "MaxMind" may not be used to endorse or promote products derived from this
database without specific prior written permission.

THIS DATABASE IS PROVIDED BY MAXMIND.COM ``AS IS'' AND ANY 
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 
DISCLAIMED. IN NO EVENT SHALL MAXMIND.COM BE LIABLE FOR ANY 
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 
DATABASE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Some parts of this software distribution are derived from the APNIC, ARIN and
RIPE databases (copyright details below). The author of this module makes no
claims of ownership on those parts.

APNIC conditions of use:

The files are freely available for download and use on the condition that APNIC
will not be held responsible for any loss or damage arising from the application
of the information contained in these reports.

APNIC endeavours to the best of its ability to ensure the accuracy of these
reports; however, APNIC makes no guarantee in this regard.

In particular, it should be noted that these reports seek to indicate the
country where resources were first allocated or assigned. It is not intended
that these reports be considered as an authoritative statement of the location
in which any specific resource may currently be in use.

ARIN database copyright:

Copyright (c) American Registry for Internet Numbers. All rights reserved.

RIPE database copyright:

The information in the RIPE Database is available to the public for agreed
Internet operation purposes, but is under copyright. The copyright statement is:

"Except for agreed Internet operational purposes, no part of this publication
may be reproduced, stored in a retrieval system, or transmitted, in any form or
by any means, electronic, mechanical, recording, or otherwise, without prior
permission of the RIPE NCC on behalf of the copyright holders. Any use of this
material to target advertising or similar activities is explicitly forbidden and
may be prosecuted. The RIPE NCC requests to be notified of any such activities
or suspicions thereof."