Over the last few months, there has been a great deal of discussion at the most senior levels of the US Government about the use of persistent cookies on federal agency web sites. If you are a Federal Web Manager or Web Analyst, you are painfully aware of the constraints on using cookies and the limits they place on defining unique visitors accurately and on using some of the more advanced segmentation features available in web analytics solutions.

As part of a greater effort to open up policy discussions, the White House Office of Science and Technology Policy is taking public comment on the Federal Policy on Cookies as well as other issues impacting use of the Web.

If you've been doing web analytics within the Federal government, I'd suggest you weigh in on this. To comment, go to: http://blog.ostp.gov/category/requests-for-comment/ (you'll need to register before commenting)

My comments are on the site and printed in their entirety below. My recommendations for a new policy are fairly straightforward.

  • Allow the use of first party, persistent cookies for Web site measurement.
  • Prominently disclose how Web site measurement is used and how the data is collected and analyzed.
  • Provide instructions for how users may delete persistent cookies from their browser settings.
  • Combination of PII and unique visitor ID (persistent cookie ID) will not be used for analysis.

What is your take? This is a highly critical time regarding the use of cookies. If you read my post below, you'll see that there are those who are actually suggesting that information that has been used historically, such as IP addresses, be disallowed or severely constrained.

Privacy advocates have a very strong voice in this debate, and while I'm all for privacy protections (see my series of posts on privacy), I'm very concerned that the focus on privacy could hamper the effectiveness of Federal web site analysis, even while allowing for the use of persistent cookies. You can read a few of the positions being advocated at these sites:

Make your own determination, and use this opportunity to voice your opinion: http://blog.ostp.gov/2009/06/16/enhancing-online-citizen-participation-through-policy/.

Phil Kemelor's Comments:

The issue of whether and how to permit the use of persistent cookies on Federal agency Web sites is one that tends to be charged with emotion primarily around the concern over personal privacy as it relates to Web site data collection.

However, I believe this concern is based on misperceptions about persistent cookies, the nature of Personally Identifiable Information (PII) and how this data may be used.

A persistent cookie is a unique string of text that is placed on the visitor’s computer during his or her first visit and recorded by the site measurement tool in subsequent visits. Because the cookie is unique, all visits from that same computer are attributed to a single “unique visitor.” If the visitor registers on the site and provides personal information, that information can be associated with the specific computer. The visitor will be recognized as unique on all subsequent visits. However, unless the user voluntarily provides personal information, the persistent cookie simply records activity from the computer. For example, if a person has a computer at work and at home, and goes to a particular Web site from both, a separate cookie will be issued for each computer. There is nothing to link that individual with activity on those Web sites.
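As a rough illustration of the mechanism described above, here is a minimal Python sketch of how a measurement tool might issue and later recognize a persistent cookie. The cookie name `visitor_id`, the one-year lifetime and the function `identify_visitor` are my own assumptions for the example and do not describe any particular analytics product.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"      # hypothetical cookie name
ONE_YEAR = 365 * 24 * 60 * 60   # assumed lifetime, in seconds


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None when the browser already sent the cookie,
    i.e. this computer has been seen before.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        # Returning computer: reuse the ID it already carries.
        return jar[COOKIE_NAME].value, None

    # First visit from this computer: issue a random, meaningless ID.
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out[COOKIE_NAME] = visitor_id
    out[COOKIE_NAME]["max-age"] = ONE_YEAR
    out[COOKIE_NAME]["path"] = "/"
    return visitor_id, out[COOKIE_NAME].OutputString()
```

The ID is a random string and carries no information about the person. A second computer that arrives without the cookie simply receives a different random ID, which is why a work computer and a home computer count as two unique visitors.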

Cookies, in and of themselves, do not breach a user’s privacy, nor do they pose any security risks. Persistent cookies are not a software program, nor an executable file. They are not spyware and cannot take control of your computer in any way, shape or form. For example:

  • They don’t steal a user’s e-mail address.

  • They don’t gather data from a visitor’s computer.

  • They simply enable the tracking of a visitor’s behavior on the site from which the cookie is issued.

Visitors are anonymous until they register on the site. However, even should someone register on the site, there would need to be a deliberate effort to link the unique ID provided by the cookie with the personal information.

Persistent cookies when used in the practice of Web analytics can provide insights that result in Web sites that are relevant, cost effective and provide a better user experience. For example, knowing that a group of visitors that comes to the site often through a search engine and leaves the site after viewing only one page could indicate that the search engine optimization works, but that the content pages being viewed do not convey the information that is relevant to this group of frequent search engine users. Knowing this could direct staff efforts on changing and testing how the page was written. This would provide management with direction on how to deploy staff resources effectively. Once the pages were changed, the behavior of the segment of frequent search engine users could be reviewed over the course of several months to see if they read more than one page and started to interact with the site more frequently.
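The segment in this example could be computed along the following lines. This is only a sketch: the record layout `(visitor_id, referrer_type, pages_viewed)`, the function name and the threshold of three visits for "frequent" are assumptions I have made for illustration.

```python
from collections import defaultdict


def single_page_rate_for_frequent_searchers(visits, min_visits=3):
    """Share of single-page visits among frequent search-engine visitors.

    visits: iterable of (visitor_id, referrer_type, pages_viewed) tuples,
            where visitor_id comes from the persistent cookie.
    Returns None when no visitor meets the min_visits threshold.
    """
    pages_by_visitor = defaultdict(list)
    for visitor_id, referrer_type, pages_viewed in visits:
        if referrer_type == "search":
            pages_by_visitor[visitor_id].append(pages_viewed)

    # Keep only visitors who arrived via search at least min_visits times.
    segment = [
        pages
        for visit_pages in pages_by_visitor.values()
        if len(visit_pages) >= min_visits
        for pages in visit_pages
    ]
    if not segment:
        return None
    return sum(1 for pages in segment if pages == 1) / len(segment)
```

Without a persistent ID, the grouping by `visitor_id` is not possible, and each visit can only be looked at in isolation.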

The persistent cookie provides the added value of tracking activity from a single computer over the course of time, and being able to compare that activity over time. There is nothing to suggest in the preceding example that a visitor’s privacy is compromised.

This level of analysis has been used in the private and non-profit sectors for years, but has not been available to federal agencies. As a result, federal agencies have been unable to accrue insights into visitor behavior that could lead to the development of Web sites that serve their constituents more effectively. In addition, federal Web managers have not had analysis that would help them more effectively deploy staff on Web projects, thereby resulting in less than optimal use of resources.

There have been proposals made suggesting user opt in/opt out of data collection, limits to the capture of IP data and limitations on the availability of data to federal Web managers.

Opt in/opt out: In the private sector, it is common practice to instruct site visitors how to delete persistent cookies through their browser settings. This is not something that is done on every page through a pop up. While many people do delete their cookies, many more people do not.

Federal agency Web sites should have a clearly written privacy policy that informs users that a persistent cookie is being used with a Web analytics tool to track their behavior for site analysis purposes and provide instructions on how to disable the persistent cookie within the browser.

Note that this covers the use of persistent cookies and does not give a choice on whether to have visit- or session-based data collected. Session-based data collection is the current status quo among federal agencies, and there is no reason to disallow this method, as this would increase the inaccuracy of Web analysis. A pop-up method of opt in/opt out would be considered distracting and annoying and would likely diminish usage of federal agency Web sites.

Limited data retention: Suggested 90-day limits on data retention severely constrain any opportunity to do in-depth analysis that may provide insight into content and application usage over time. One of the benefits in using unique visitor data is in the creation of behavioral segments that may be used over different time periods. For example, I’d like to be able to compare the number of articles downloaded by a group of visitors from a particular state last month to the same month a year ago in order to determine whether I should retire that content, or add more content. If I must delete the unique visitor data after 90 days, I cannot do this analysis.
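To make the effect concrete, here is a small Python sketch of a 90-day purge applied before a year-over-year comparison. The record layout `(day, downloads)`, the function names and the counts in the usage note are hypothetical and are not drawn from any real agency data.

```python
from datetime import date, timedelta


def apply_retention(records, today, retention_days):
    """Drop records older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [(day, downloads) for day, downloads in records if day >= cutoff]


def downloads_in_month(records, year, month):
    """Total downloads by the segment in the given calendar month."""
    return sum(
        downloads
        for day, downloads in records
        if day.year == year and day.month == month
    )
```

With hypothetical records such as `[(date(2008, 6, 10), 40), (date(2009, 6, 12), 25)]` and a purge run on July 1, 2009, the June 2009 total is still available but the June 2008 record has already been deleted, so there is nothing left to compare against.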

Federal agencies should have no data retention limits on Web analytics data.

There are no current limitations on data retention and there is nothing to suggest that this has caused an issue. It should be noted that Federal agencies retain PII for many years without constraint. It is unclear why Web site data that contains no PII should have a limitation on retention.

Limit cross-session measurement: As discussed earlier, cross-session measurement and the associated segmentation provide an effective method for understanding specific use of content and offer insight into creating a more effective user experience. Limiting measurement to a single session is the status quo, and one of the reasons that federal agencies lag the private sector, non-profits and non-US government Web sites in the use of analytics to guide Web site strategy and tactics.

(See Tapping the Potential of Web Analytics for Public Sector and Non-Profit Sites May 2009 http://www.Webanalyticsassociation.org/waaWebcastseries/membersonly/WC20090604_WAA-Public-Sector-Survey-Analysis.pdf)

Obscuring or deleting IP addresses: It should be noted that using persistent cookies does not provide 100 percent accuracy in obtaining unique visitor information. There are more and more ways available for users to reject or delete cookies. It is debatable how many people use these methods. The following list summarizes these methods:

  • Visitors set their browsers not to accept cookies.

  • Visitors can use anti-spyware software.

  • Visitors can delete cookies.

  • Anti-spyware programs and features embedded in browsers give users more ability to reject or not accept cookies.

Because of this, collection of IP addresses is an important facet of Web analysis. Currently, federal agencies collect this data. There is nothing to suggest that this data has been used inappropriately. Requiring agencies to obscure or delete IP addresses would be considered burdensome by Web analytics vendors, and may result in additional expense to the federal agencies. For those agencies that host Web analytics solutions “on site,” this requirement would add workload for staff.

It should also be noted that IP addresses enable the association of geo-location, which is considered an important analytics requirement for federal agencies to understand the success of promotion campaigns. In addition, as most IP addresses are dynamically assigned by Internet Service Providers or organizations with many users, there is no way to associate an IP address with an individual.
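For readers unfamiliar with what "obscuring" an IP address might involve, one common interpretation is to zero out the trailing bits of the address before it is stored. The proposals under discussion do not specify a scheme, so the 24-bit prefix and the function name `obscure_ip` below are assumptions of mine, shown only as a sketch.

```python
import ipaddress


def obscure_ip(address, prefix_len=24):
    """Keep only the leading prefix_len bits of an IP address.

    With the default of 24, the last octet of an IPv4 address is zeroed.
    """
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address)
```

After this step, every address in the same block is stored as the same value, so the addresses can no longer help distinguish visitors who reject or delete cookies, and any geo-location derived from them is only as precise as the block allows.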

I’d suggest that the focus on data collection constraints may result in Web sites that are considered cumbersome by users and make it even more difficult for federal Web site managers to understand site usage. Federal Web managers are ever more aware that they are “competing” with sites in the private sector, as well as those in the academic and non-profit world. The question before us is not so much how to regulate the use of persistent cookies, as it is: How do we enable Federal agencies to create relevant and cost effective Web sites that are focused on optimizing a user’s experience?

To give federal Web managers the information they need to fulfill this goal, the Web site measurement policy should be simple and straightforward:

  • Allow the use of first party, persistent cookies for Web site measurement.

  • Prominently disclose how Web site measurement is used and how the data is collected and analyzed.

  • Provide instructions for how users may delete persistent cookies from their browser settings.

  • Combination of PII and unique visitor ID (persistent cookie ID) will not be used for analysis.

It should be noted that non-US governments have addressed the issue of persistent cookies, privacy and web analytics. In the Web Analytics Association survey referenced above, over 50 percent of the non-US government respondents indicated that they used persistent cookies. Please reference the Privacy Policy at the Australian government’s Department of Agriculture, Fisheries and Forestry for an example of how this issue is addressed: http://www.daff.gov.au/about/privacy

Phil Kemelor
Vice President, Strategic Analytics - Semphonic
Lead Analyst - CMS Watch Web Analytics Report
Author - The Executive's Guide to Web Site Measurement and Testing
Web Analytics Trends: http://www.cmswatch.com/Analytics/Trends/
Web Analytics Management: http://wam.typepad.com