Privacy Framework Proposal

A Credential Lifecycle Approach

 

 

Created: July 2, 2010

Modified: September 29, 2010

Status: Draft – Submitted to P3WG

Editor: J. Trent Adams

Contributors: Mark Lizar

Christine Runnegar

Colin Soutar

Robin Wilton

 

 

This document describes a Privacy Framework, yet to be created, that can be used in conjunction with a standard credential management lifecycle.  The framework defines, at a summary level, a methodology for organizations to manage personal data in a privacy-respecting manner.

 

 

 

 

 

 

 

 

NOTE: This draft document was developed for contribution to the Kantara Initiative Privacy and Public Policy Working Group (P3WG) and is covered by the Intellectual Property Rights agreement governing participation in and usage of the materials produced by the group.

 

 


Table of Contents


Introduction

Terminology

Existing Work

Scope of this Work

Models of Data Control

Managing Personal Data Transactions

Applicable Levels of Assurance

Credential Management Lifecycle

Regulatory Considerations

Framework Components

Engagement

Credential Management

Resource Access Management

Logfile Management

Appendix A: Document Change Log


 

 

 

Introduction

 

This document describes a Privacy Framework, yet to be created, that can be used in conjunction with a standard credential lifecycle.  The framework defines, at a summary level, a methodology for organizations to manage personal data in a privacy-respecting manner.

 

In many cases, an organization that handles personal data will be doing so in conjunction with an account created for the associated individual.  For organizations with an existing methodology for effectively handling their credential lifecycle, this proposed framework is designed to be easily applied.

 

By modeling this framework on common best practices in managing credentials, it can be applied in tandem with existing organizational processes and technologies.  This framework can apply to all personal data collected and managed for account-holders within the organization, and most steps in the credential lifecycle have logical analogs to the privacy expected in the handling of the related data.

 

NOTE: This document is a whitepaper calling for the development of the framework described herein.  It is not the framework itself, but rather an outline of the aspects the framework will need to cover.

 

Terminology

 

ED: Rather than being a comprehensive glossary, the intent of this section is to provide the definition for terms as they are used in this document, specifically when the terms used may have multiple meanings.  This section should be extended as appropriate through the writing of this document.

 

  • Anonymity – The characteristic that a partial identity cannot be linked to a true identity.  Recent work has proven that anonymity is difficult to maintain over time because, given enough observed activity for a given “anonym” (i.e. a single identifier associated with a partial identity for an unknown identity), a partial identity can be statistically linked to a true identity.  See also: Pseudonymity.
  • Credential Lifecycle – The series of steps through which a person is identified, their identity is verified (a.k.a. “proofed”), access credentials are granted and then managed on an ongoing basis, and the credentials finally expire and/or are revoked.
  • Identifier – The representation that is used to reference a person in order to maintain consistency of interaction over time.  An identifier can reference an anonym, pseudonym, or other partial identity.  See also: opaque identifier.
  • Identity (a.k.a. “true identity”) – The sum of the characteristics that define a natural person.  These characteristics include who the person is (i.e. their biology) as well as what he or she does.  The important aspect is that a person's identity is driven by their observable interactions.  This is supported by the definition common within the Identity Management (IdM) technology community, in which an identity is defined by the combination of what you are (i.e. biology), what you know (e.g. password), and what you have (e.g. physical access card).  See also: Partial Identity.
  • Identity Proofing (a.k.a. “proofing”) – The process by which the organization verifies the identity of a natural person.
  • Individual Data – Data contributed by and under the control of the individual (e.g. mailing address).  It may or may not be Personal Data or Sensitive Data.
  • Levels of Assurance (LOA) – The four levels of identity assurance as defined by OMB 04-04 [ii] and NIST SP 800-63 [iii], with LOA-1 being little to no assurance of an individual's true (or natural person) identity, and LOA-4 being the highest assurance.
  • Ongoing Service – A relationship between parties during which multiple transactions are expected to take place (e.g. repeated billing).
  • Opaque Identifier – An identifier that references the partial identity of a user within a given system.  These identifiers cannot be easily associated with the partial identity (or true identity) of the user without access to a reference table linking the opaque identifier with a record containing identifiable data.  For example, if a system assigns a user the identifier “TruP2br7”, this cannot be easily linked to someone's name (or other details) without a corresponding reference table.  NOTE: While an email address may be an opaque identifier (e.g. “TruP2br7@emailservice.com”) they often contain more self-identifying characteristics (e.g. “firstname.lastname@emailservice.com”).  Further, any opaque identifier used in association with enough services can begin to build relationships that can be statistically linked to a specific user.
  • Organization Data – Data created by and under the control of the organization (e.g. account number).  It may or may not be Personal Data or Sensitive Data.
  • Partial Identity – The representation of an identity that is presented within a given context.  Few, if any, contexts will have access to a person's complete, true identity.  In practical terms, all interactions take place with one or more of a person's partial identities.  A partial identity can be created by the individual (a.k.a. a “persona”) or created by an entity observing the individual's actions (a.k.a. a “profile”).
  • Personal Data – Data that is related to a specific individual (e.g. username).  It may or may not be Individual Data, Organization Data, or Sensitive Data.  In some cases this data is also referred to as identity “assertions” or “claims”.
  • Proofing – See “Identity Proofing”.
  • Pseudonymity – The characteristic that a partial identity is difficult (or prevented by regulation) to be linked to a true identity.  In common practice, pseudonymity is replacing anonymity as the minimal achievable state of separation, one in which a partial identity has only a weak link to a user's true identity.
  • Sanitized Data – Data that is maintained (e.g. for audit logging), but has been modified in a way that makes it very difficult (if not impossible) to reconstruct its original form (e.g. by applying a one-way hashing algorithm).
  • Sensitive Data – Personal data that is deemed to be sensitive in nature, and around which there need to be special privacy-protection considerations (e.g. passwords).  It may or may not be Individual Data or Organization Data.
  • Single Transaction – A transaction taking place between two or more parties that is expected to only happen once (e.g. shipping a package).
  • Shared Data – Data jointly created by and/or under the control of both the individual and organization (e.g. usage activity).  It may or may not be Personal Data or Sensitive Data.
  • Statistically Identifiable – The property that, given enough observed discrete anonymous or pseudonymous actions, it has been proven possible to reconstruct, with varying statistical probability, a linkage between the activities and the identity (or partial identity) that performed them. [i]
  • Third Party Data – Data about a person that is not contributed by the person and is not otherwise under their control.
  • Transaction – The set of data that is required to move between two or more parties in order to service a defined task (e.g. an individual provides a mailing address to an organization so they can ship a product).  A single transaction may be composed of a series of more discrete transactions between multiple parties (e.g. an individual may need to transfer various data sets to multiple parties in order to complete an online purchase).
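The definitions of Opaque Identifier and Sanitized Data above can be illustrated with a minimal Python sketch. Everything named here (`reference_table`, `assign_opaque_identifier`, `sanitize`, the sample record) is hypothetical and not part of the proposed framework. The keyed hash (HMAC-SHA-256) is one possible choice of one-way function, assumed here because a plain unkeyed hash of a low-entropy value such as an email address can often be guessed by hashing candidate values.

```python
import hashlib
import hmac
import secrets

# Hypothetical reference table: opaque identifier -> identifiable record.
# Without access to this table, the identifier reveals nothing about the user.
reference_table = {}

def assign_opaque_identifier(record):
    """Create a random identifier that carries no data from the record."""
    opaque_id = secrets.token_urlsafe(6)  # 6 random bytes -> 8 URL-safe characters
    reference_table[opaque_id] = record
    return opaque_id

def sanitize(value, key):
    """One-way keyed hash: equal inputs still match, the input is not recoverable."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Assumption: the key is stored separately from the logs it protects.
log_key = secrets.token_bytes(32)

user_id = assign_opaque_identifier(
    {"name": "Alice Example", "email": "alice@example.org"}
)
audit_entry = {
    "user": user_id,
    "email_digest": sanitize("alice@example.org", log_key),
}
```

As the definition of Opaque Identifier notes, an identifier like this can still become statistically linkable if it is reused across enough services, so the sketch addresses only what the identifier itself reveals.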

 

NOTE: The terms in this section are defined in the context of this document.  More generalized definitions, and definitions for other contexts, can be found in the following sources:

 

  • Allan Milgate, The Identity Dictionary
    http://identityaccessman.blogspot.com/2006/08/identity-dictionary.html
  • Andreas Pfitzmann and Marit Hansen, “A terminology for talking about privacy by data minimization: Anonymity, Unlinkability, Undetectability, Unobservability, Pseudonymity, and Identity Management”
    http://dud.inf.tu-dresden.de/literatur/Anon_Terminology_v0.34.pdf
  • ISO/ITU X.911 Information Technology — Open Distributed Processing — Reference Model — Enterprise Language - definitions in section 6.5
    http://www.joaquin.net/ODP/DIS_15414_X.911.pdf
  • “Modinis IDM, Common Terminological Framework for Interoperable Electronic Identity Management”
    https://www.cosic.esat.kuleuven.be/modinis-idm/twiki/bin/view.cgi/Main/GlossaryDoc?code=nldsv13294
  • Roger Clarke, “A Sufficiently Rich Model of (Id)entity, Authentication and Authorisation Glossary of Terms”
    http://www.rogerclarke.com/ID/IdModel-Gloss-1002.html
  • SAML 2.0 Glossary
    http://docs.oasis-open.org/security/saml/v2.0/saml-glossary-2.0-os.pdf
  • The Identity Gang's Lexicon
    http://wiki.idcommons.net/Lexicon
  • Wikipedia on Digital Identity
    http://en.wikipedia.org/wiki/Digital_identity

 

ARCHIVE NOTE: URLs referenced in this section are archived by WebCite.  If the direct URLs are unreachable when reading this document, you can find cached copies of them by visiting: http://www.webcitation.org/query

 

ED: This may not be the right location in the document for this section.  It may benefit by being moved later in the document when finalizing it.

 

 

 

Existing Work

 

ED: This section will briefly describe existing work relating to privacy, including other privacy frameworks and related methodologies.

 

 

 

Scope of this Work

 

The intent of the framework is to provide a common methodology for organizations to easily couple their handling of personal data with well-understood best practices related to the management of user access credentials.  This framework is applicable to organizations managing personal data associated with clients, customers, employees, and/or users of their service whose credentials the organization manages.  It is not directly applicable to the handling of personal data that is collected through means other than in association with an account.

 

 

Models of Data Control

 

This document addresses issues of privacy related to various models of control over personal data.  In some cases personal data will be provided to the organization by the individual, and remain under the individual's control (e.g. mailing address).  In other cases personal data will be generated by the organization about the individual and generally remain under the organization's control (e.g. account numbers).  In some cases, the personal data can be viewed as being under joint control of both the individual and the organization (e.g. usage activity).

 

  • Individual Control
  • Organization Control
  • Joint Control

 

Each of the three models carries its own privacy concerns, which are addressed by applicable controls within the framework.

 

ED: This section would benefit from examples of each model and their privacy issues, specifically relating them to the Credential Management Lifecycle.

 

 

Managing Personal Data Transactions

 

There are two primary types of transactions that are covered by the framework.  In some cases, personal data will be moved between parties (e.g. an individual providing a mailing address to an organization) for the purpose of satisfying a single transaction (e.g. shipping a package).  In this case, once the transaction is verified as complete, the data transferred is no longer necessary and can be deleted or otherwise sanitized.  In other cases, personal data may be stored for longer periods of time to service a continuing relationship representing ongoing transactions (e.g. repeated billing).

 

  • Single Transaction Mode
  • Ongoing Transaction Mode
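The difference between the two modes can be sketched in Python as a retention rule applied when a transaction completes. This is only an illustration under stated assumptions: the `Transaction` class, the `mark_complete` function, and the sample data are hypothetical names, and deletion is shown where an implementer might instead sanitize the data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class TransactionMode(Enum):
    SINGLE = "single"    # e.g. shipping a package
    ONGOING = "ongoing"  # e.g. repeated billing

@dataclass
class Transaction:
    mode: TransactionMode
    personal_data: Optional[dict]
    complete: bool = False

def mark_complete(tx):
    """Mark the transaction complete and drop data that is no longer needed."""
    tx.complete = True
    if tx.mode is TransactionMode.SINGLE:
        # The data has served its purpose; delete it (or sanitize it instead).
        tx.personal_data = None
    return tx

shipment = mark_complete(
    Transaction(TransactionMode.SINGLE, {"mailing_address": "1 Example Street"})
)
billing = mark_complete(
    Transaction(TransactionMode.ONGOING, {"billing_address": "1 Example Street"})
)
```

In this sketch the ongoing transaction keeps its data after a completed billing cycle, because the relationship it serves continues.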

 

ED: This section would benefit from examples of each transaction mode and their related privacy issues.

 

Applicable Levels of Assurance

 

With the publication of “E-Authentication Guidance for Federal Agencies” [ii] in 2003, the United States Office of Management and Budget (OMB) outlined specific guidelines for how Federal Government digital services can be accessed.  The memo included a general discussion regarding four “Levels of Assurance” (LOA) relating to the “degree of certainty that the user has presented an identifier (a credential in this context) that refers to his or her identity”.  In 2006, NIST added technical guidance for each LOA, published in SP 800-63 titled “Electronic Authentication Guideline”. [iii]   While other jurisdictions around the world have similar concepts that map to these levels, this document refers to the LOA as defined by OMB 04-04 and NIST SP 800-63.

 

NOTE: As of this writing, an updated version of SP 800-63 is in “draft” and includes additional guidance. [iv]

 

ED: We may want to consider a more generalized description of LOA that is more globally applicable, but for this writing it may simplify this draft to rely on these two publications.

 

This document is applicable to LOA 1, though it is more likely to be of use to organizations relying on higher LOA.  Specifically, the framework called for in this document pairs privacy-respecting steps with the steps of the Credential Management Lifecycle, and a more direct mapping is likely to be found at LOA 2 and above.

 

Credential Management Lifecycle

 

The privacy framework called for in this document maps specific privacy considerations to the steps likely to be in place within an organization already handling credentials that provide access to users of their systems.  The management of these credentials generally follows a methodology termed a “Credential Management Lifecycle”, the common steps of which include:

 

  1. Identity Proofing – The process by which the organization verifies the identity of a natural person who will be given access to a system.
  2. Credential Creation – The point at which the credential(s) used to access a system are created.  This can include the insertion of the user's identifier into a database along with an associated password, and/or the creation of other similar unique mechanisms (e.g. a physical token) required to access a system.  Further, some credentials (e.g. physical access cards) may display or embed data that may impact privacy.
  3. Credential Distribution – The method by which the credentials are provided to the user of the system.
  4. Credential Maintenance – The ongoing steps required to ensure that a credential remains valid and the access rights of the associated user remain accurate.  In some cases this may include the management of personal data (a.k.a. “assertions”) associated with the credential (e.g. name, address, etc.).
  5. Credential Expiration – The process by which an existing credential is modified to be invalid or otherwise identifiably no longer valid for use.  In some cases credentials will automatically expire (e.g. after a specified date or number of uses) while in other cases credentials will need to be manually expired (e.g. when an employee is terminated).  NOTE: The framework discussed in this document differentiates between the expiration of a “credential” and of “access” to a specified resource that leaves the user's credential unchanged. While there are applicable privacy issues relating to a change in access, they are not currently addressed in this document.
  6. Credential Revocation – The process by which an existing credential is verified as being de-activated within the system.  This process can be automatically triggered by the credential expiring, or invoked manually to remove it from use.  In physical systems this step would require the user to return the physical credential (or physical component of a multi-factor authentication system) to verify that it is no longer in use.  Electronic systems have no physical counterpart, and this step is primarily in place as a verification that the credential is no longer in use.
  7. Credential Deletion – The process by which credentials are removed from the system.  In some cases a de-activated credential may remain in the system without being deleted (e.g. when the credentials for a user expire or are revoked but the user is expected to regain access).  In other cases, when the credentials are not expected to be used again, they are removed from the system.
  8. Credential Management Logging – While not a discrete step performed in isolation, logging of each activity is common and expected.  When a change is performed (e.g. credentials added, issued, changed, expired, etc.) the activity is generally logged, often in a multi-tiered process in which detailed logs are stored for a period of time, after which the details are deleted and summary logs are maintained for posterity.
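The lifecycle steps listed above can be sketched as a small state machine in Python. The `ALLOWED` transition table is an assumption made for illustration, not something this document prescribes: it lets a revoked credential return to maintenance (a user expected to regain access) or move on to deletion, and it records each step in a simple log to mirror step 8. All class and variable names are hypothetical.

```python
from enum import Enum

class Step(Enum):
    PROOFING = 1
    CREATION = 2
    DISTRIBUTION = 3
    MAINTENANCE = 4
    EXPIRATION = 5
    REVOCATION = 6
    DELETION = 7

# Assumed transitions between steps; a real deployment may differ.
ALLOWED = {
    Step.PROOFING: {Step.CREATION},
    Step.CREATION: {Step.DISTRIBUTION},
    Step.DISTRIBUTION: {Step.MAINTENANCE},
    Step.MAINTENANCE: {Step.MAINTENANCE, Step.EXPIRATION, Step.REVOCATION},
    Step.EXPIRATION: {Step.REVOCATION},
    Step.REVOCATION: {Step.MAINTENANCE, Step.DELETION},
    Step.DELETION: set(),
}

class Credential:
    def __init__(self, identifier):
        self.identifier = identifier
        self.step = Step.PROOFING
        self.log = [Step.PROOFING.name]  # step 8: every activity is logged

    def advance(self, next_step):
        """Move to the next step, rejecting transitions the table does not allow."""
        if next_step not in ALLOWED[self.step]:
            raise ValueError(f"{self.step.name} -> {next_step.name} is not allowed")
        self.step = next_step
        self.log.append(next_step.name)

credential = Credential("TruP2br7")
for step in (Step.CREATION, Step.DISTRIBUTION, Step.MAINTENANCE,
             Step.EXPIRATION, Step.REVOCATION, Step.DELETION):
    credential.advance(step)
```

Modeling the steps explicitly gives each privacy question in the framework a concrete point in the process to attach to.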

 

Regulatory Considerations

 

ED: This section will talk about various regulatory issues relating, in general, to privacy and more specifically to the handling of credentials.  It is likely that there will be only general regulatory guidance and specific legislation relating to identified verticals (e.g. healthcare).  Research into this area needs to be done before this section can be drafted.

 

 

Framework Components

 

This section identifies, at a high level, specific privacy considerations that are expected to be addressed by the proposed framework as it relates to the Credential Management Lifecycle.  The components discussed are meant primarily to illustrate some privacy concepts and are not expected to be thoroughly comprehensive.  The questions posed as related to each component are intended to lead to proposed answers in the eventual framework which can be used by implementers of the framework to assess and, hopefully, improve their privacy protection.

 

ED: The components in this section would benefit from a rudimentary privacy leakage risk analysis.  Some components undoubtedly come with higher risk than others, and the proposed framework should help guide eventual implementers toward areas of highest concern.

 

Engagement

 

Often overlooked as part of the Credential Management Lifecycle, engagement with the credentialed (or potentially credentialed) users is an important point at which privacy issues enter the process.  While most systems start with the trigger to begin the steps of credential creation, these steps follow the initial engagement.  Further, engagement with the user continues during the ongoing management of credentials.  The following, then, should be considered as part of the proposed privacy framework.

 

  • Notice Publishing – When users first engage with the entity that will provide them with the necessary credentials, they must be presented with all applicable information relating to their credentials (e.g. what information will be required for their credential to be issued, any terms relating to the issuance of the credentials, etc.).  As such, the privacy framework must consider the following:

         Is the applicable information made available to anonymous, pseudonymous, or identified users?

         Are there logs that record that the pre-credentialed user has read the applicable information?

         If the logs are to be valuable, they must eventually link to the credentialed user:

         Do the log records assign an identifier that uniquely identifies the person, even if they are not eventually credentialed?

         Or do the logs assign an anonymous identifier that is later linked to their credential, or deleted if they are not credentialed?

  • Informed Consent – When a user reads the applicable notices, they may be required to indicate they understand them.  In this case, a partial identity for the user will need to be identified and recorded by the system.  The framework must consider the following:

         What is the method used by the system to create an identifier that references the user's applicable partial identity?

         At this point, does the user need to present an existing partial identity (e.g. as asserted by another entity), can the user remain pseudonymous at this stage, or is this the start of the proofing process?

         Is the data being transferred between the parties encrypted or otherwise protected from third party observation?

         Are the logs for incomplete transactions (e.g. the user decides not to continue, there is an interruption of the process, etc.) deleted?

         If so, after how long?

         If not, what happens to them?

  • Ongoing Disclosure – Throughout the Credential Management Lifecycle, the entity providing credentials will likely need to remain in contact with the credential-holder.  This may include providing notice to the user when terms of their interaction with the credential provider change, as well as when the user is notified that they will need to update or otherwise make changes to their associated records.

         When usage policies change, are issues addressed that are similar to those described in the “Notice Publishing” and “Informed Consent” sections?

         If so, how do they differ?

         If not, why are they different?

         NOTE: Cross-reference with the section on “Credential Management: Maintenance” for a related discussion about managing required changes to the user's record (e.g. password changes).
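One of the options raised under Notice Publishing, an anonymous log identifier that is later linked to a credential or deleted, can be sketched in Python as follows. This is a hypothetical illustration only: `notice_log`, `record_notice_read`, `link_to_credential`, and `discard` are invented names, and an in-memory dictionary stands in for whatever log store an organization actually uses.

```python
import secrets

# Hypothetical log store: anonymous identifier -> log entry.
notice_log = {}

def record_notice_read(notice_version):
    """Log that a notice was read, under a random identifier with no user data."""
    anon_id = secrets.token_hex(8)
    notice_log[anon_id] = {"notice": notice_version, "credential": None}
    return anon_id

def link_to_credential(anon_id, credential_id):
    """Attach the earlier log entry to the credential once one is issued."""
    notice_log[anon_id]["credential"] = credential_id

def discard(anon_id):
    """Delete the entry when the person is never credentialed."""
    notice_log.pop(anon_id, None)

completed = record_notice_read("terms-v1")
link_to_credential(completed, "TruP2br7")

abandoned = record_notice_read("terms-v1")
discard(abandoned)
```

How long an unlinked entry may remain before `discard` is called is exactly the kind of retention question the framework is expected to answer.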

 

Credential Management

 

When entering the standard steps associated with Credential Management, issues related to privacy can be more easily mapped to common practice.  These steps each include clear points which need to be considered as they relate to user privacy.

 

  • Proofing – When verifying the identity of a user to be credentialed, there are a number of key components that need to be considered as they relate to the privacy of the potential user (i.e. prior to the issuance of credentials) such as:

         Does the credential provider need to know the specific identity of the credentialed user?

         If so, the questions in this section about proofing are largely applicable.

         If not, the questions in this section may need to be slightly modified as appropriate to the circumstance.

         Has the credential provider specified the LOA to which the potential user is being verified?

         If so, what is it?

         If not, is there a rationale that drives the decision not to specify a LOA?

         What methods of proofing the identity are applicable for the specified LOA?  For example, does the verification process:

         rely on mechanisms within the control of the user (i.e. self-assertion)?

         leverage “chains of trust” or other “third party” verification methods?

         include in-person verification?

         Is the potential user asked only for the minimum data required to verify their identity to the applicable LOA?

         If yes, what data are they asked to provide, and is there documentation that indicates the reason the data is relevant?

         If no, what data is required beyond the necessary minimum to reach the stated LOA, and what is the reason for the additional collection?

         How is the data handled during the verification process?

         If the verification process is handled electronically, what mechanisms are in place to ensure the information is transmitted securely?

         If the proofing process takes place over a period of time, what mechanisms are in place to ensure the temporary data (i.e. data not required to be maintained after proofing) is deleted?

         What data is stored after verification?

         What data is only used for verification and is deleted?

         Is there a method for handling associated log files such that the sensitive data is also deleted from them?

         Is the data shared with other parties?  If so, what protections are in place for ensuring they handle it appropriately?

         What steps are taken to ensure that users aren't providing more information than is necessary to proof to the appropriate LOA?

  • Creation – When credentials are created, some consideration should be given to the information they contain and leverage to function.

         Are the credentials linked to the user by a recognizable name or referenced by an identifier known only to the system?

         Do the credentials themselves, regardless of the associated access they represent, present personally identifiable information?

         If there are physical components involved (e.g. physical badges, one-time keys, etc.):

         what information do they contain (e.g. printed on the face, within the meta-data, embedded in an RFID chip)?

         what information can they display in the clear (i.e. without access control to the information)?

         what information can be retrieved from them using some form of access control (e.g. when queried, can the credential itself respond when the requesting agent meets a specific requirement)?

         If an expiration date is set for the credential, is this date selected in a way to avoid inadvertently revealing personal information (e.g. birthday) when not necessary?

  • Distribution – When credentials are distributed to the user, it should be done so in a process that remains privacy-respecting.

         If distributed physically (e.g. via postal mail), what information about the user or credentialing entity is displayed in public view?

         What mechanism is in place to validate that the credentials reach their intended recipient?

         For systems that require a password, does the user select their own password during the initial engagement or proofing process, or is it generated for them?

         If it is automatically generated, is it a one-time password or intended for long-time use?

         If it is only to be used once, is there a mechanism in place to ensure it is only used once?

         When using the credentials for the first time, is the user forced to change their password immediately?

  • Maintenance – During the ongoing process required to maintain credentials, there are points at which privacy issues should be addressed.

         What mechanisms are in place for users to update their credentials (e.g. name, password, etc.)?

         For systems that provide users with an automated method for changing their data, what mechanisms are in place to minimize the risk of exposing sensitive information?

         For systems that include a “password reminder” function, what mechanisms are in place to prevent inadvertent leakage of personal data (e.g. mother's maiden name)?

         For systems that provide human operator assistance:

         what mechanisms are in place to prevent the operator from accessing more information about the user than is necessary for them to perform their tasks?

         what process is in place to verify the caller is allowed to access and/or update the information?

         When credentials are updated, what mechanisms are in place to log the activity in a privacy-respecting manner?

         When credentials are updated, are the records overwritten, or is a new entry created leaving the previous record intact (e.g. for version control, roll-back support, or auditing)?

  • Expiration – The process for expiring a credential is largely a mechanical one that won't generally have much impact on user privacy.  One case where it may be important is in the case where manual expiration is coupled with notations recorded within the system.

         When manually expiring a credential requires a notation entered to indicate the reason for the expiration, what steps are in place to prevent the inadvertent leaking of personal information?

  • Revocation – When credentials are revoked, whether as part of an automated process or being managed manually, the revocation notice needs to be handled in a way to protect user privacy.

         What is the mechanism by which users are notified that their credential has been revoked?

         What process is in place to prevent the notification message from inadvertently revealing more information than is necessary?

         If the user attempts to use a credential after it has been revoked, what message do they see?  What care is taken to ensure that this message does not inadvertently reveal more information than is necessary?  (For example, if someone finds a revoked key card with a revoked access number, the system shouldn't display the name of the credential holder when notifying them it is invalid.)

  • Logging – Cross-reference to the specific section about Logfile Management.
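
The revoked key card question above can be illustrated with a minimal sketch.  It assumes a hypothetical in-memory credential store and a hypothetical check_credential function, neither of which is defined by this framework.  The point it shows is that an unknown credential and a revoked credential produce the same message, and that the message never includes the holder's name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    access_number: str
    holder_name: str  # personal data; must never appear in denial messages
    revoked: bool


# Hypothetical in-memory store standing in for a real credential system.
_CREDENTIALS = {
    "1001": Credential("1001", "Alice Example", revoked=False),
    "1002": Credential("1002", "Bob Example", revoked=True),
}

# One generic message for every failure case (an illustrative wording).
GENERIC_DENIAL = "This credential is not valid. Please contact the issuer."


def check_credential(access_number: str) -> str:
    """Return the message shown to whoever presents the credential."""
    credential = _CREDENTIALS.get(access_number)
    if credential is None or credential.revoked:
        # Unknown and revoked credentials are indistinguishable to the
        # presenter, so a found card reveals nothing about its holder.
        return GENERIC_DENIAL
    return "Access granted."
```

Returning one message for both failure cases is a design choice made for this sketch; a deployment may need to give the legitimate holder more detail through a separate, authenticated channel.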

 

Resource Access Management

 

While not strictly part of the Credential Management Lifecycle, the granting and management of access to resources for a credentialed user is a tightly coupled process.  In the simplest incarnation, a system may grant a credentialed user access to all of its resources (e.g. an email service).  In more complex systems, such as those modeled as “federations”, a credential in-and-of itself may not provide access to all resources and is supplemented by access grants specific to additional resources, whether within that system or made available by federated partners.

 

The privacy framework called for in this document must at minimum address the privacy issues relating to the following as they interface with the Credential Management Lifecycle:

 

  • Provisioning – When access to a resource is provisioned for use by the credentialed user, a linkage occurs between the resource and the credentialed user.  Privacy issues similar to those addressed in the “Credential Proofing” and “Credential Creation” processes need to be addressed such as:

         What information about the user does the system running the resource being accessed require to function?

         What mechanisms are in place to minimize the leakage of sensitive information between the credential system and the resource system?

         If the relationship is severed between the credential system and resource system, what information about the credentialed user is maintained?

  • Maintenance – During the ongoing process required to maintain information relating to the access to resources, there are points at which privacy issues should be addressed.

         If a credentialed user with access to a resource changes credential details (e.g. name, password, etc.), is this change available to the resource system?

         If access changes for a credentialed user, what details from the resource system are available to the credential system?

         When changes are made, are the records updated or are new records inserted with the new information allowing the previous record to continue to exist, even if de-referenced?

  • De-provisioning – When access to a resource is removed from a credentialed user, privacy-related issues similar to those identified in the maintenance section should be addressed.
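
The provisioning questions above can be made concrete with a small sketch.  It assumes each resource declares the user attributes it needs in order to function, and that the credential system releases only those attributes.  The resource names, attribute names, and the attributes_for_resource function are hypothetical and used for illustration only.

```python
from typing import Dict, FrozenSet

# Hypothetical declaration of what each resource system needs to function.
RESOURCE_REQUIREMENTS: Dict[str, FrozenSet[str]] = {
    "email_service": frozenset({"opaque_id", "display_name"}),
    "door_access": frozenset({"opaque_id"}),
}


def attributes_for_resource(resource: str, user_record: Dict[str, str]) -> Dict[str, str]:
    """Return only the attributes the named resource has declared it needs."""
    # A resource with no declaration receives nothing by default.
    required = RESOURCE_REQUIREMENTS.get(resource, frozenset())
    return {key: value for key, value in user_record.items() if key in required}


# Usage: the door system never sees the user's name or date of birth.
record = {
    "opaque_id": "u-7f3a",
    "display_name": "Alice Example",
    "date_of_birth": "1980-01-01",
}
door_view = attributes_for_resource("door_access", record)
```

Defaulting to an empty release for undeclared resources is an assumption of this sketch; it keeps an unregistered resource system from receiving personal data by accident.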

 

NOTE: The privacy framework called for in this document is focused primarily on access management as it interfaces with the core data associated with the credential itself.  Additional privacy considerations are compounded by the interconnected systems relating to complex resource access, and are out of scope for this work.

 

 

Logfile Management

 

As has been touched upon throughout this document, the effective management of logfiles is a key area that should be covered in a successful privacy framework.  Systems need to record log entries for legitimate reasons, including security, technical management, and regulatory oversight.  Each of these purposes needs to be carefully weighed against privacy, adhering to the best practice of logging only the minimum data necessary to perform the task at hand.

 

NOTE: A well-understood security concern that relates to privacy is that many systems in operation today include logfiles with data that can be easily reconstructed outside the standard security constraints of the operational system itself.  It has been shown that these logs (and their backups) pose a significant risk of leaking sensitive personal data, and care needs to be taken in their handling.

 

  • Encrypted Log Data – All aspects of logging discussed in this section assume a basic question about encrypting the logged data.  Specifically, any time that a log is written, care should be taken to encrypt any logged data that could reasonably be assumed to contain sensitive personal data.  Such considerations should include:

         Does the logged data include sensitive personal data?

         Could the logged data be used, even when it does not in-and-of itself include sensitive personal data, in aggregation with other logged data to build a profile that can statistically identify a user?

         Is the encryption/decryption process (e.g. key management) handled in a way that minimizes the ability of unauthorized agents to access the data in the logs?

  • Access Logging – When credentialed users access the system, recording their entry to and exit from it (to the best of the system's ability) is often important for security as well as regulatory auditing.

         Are entries time-stamped according to a global time zone (e.g. UTC), system clock, or user's time zone?

         Do entries include the credential system's opaque identifier for the user, or something more transparent?

         Do entries include any other characteristics (e.g. accessing IP address, client system information, etc.)?

  • Transaction Logging – In addition to logging access to the system, records of specific activity are also often necessary.  When recording transactions (e.g. changes to data relating to user credentials such as name or password), the system should log only what is necessary.  In addition to the same issues relating to access logs:

         Do entries include references to snapshots of data prior to the change?

         Do entries include the actual data record prior to the change (e.g. the previous name)?

         Do entries include the new data after the change (e.g. the new name)?

  • Log Sanitization / Deleting – Understanding that there are legitimate reasons (e.g. security, regulatory auditing) to record detailed logs for some activities, their utility for the stated purpose often changes as time passes.  In the case where the details in specific logs are no longer relevant, sanitizing the records maintained in the logs is a useful mechanism for improving privacy protection.

         Is there a process in place for removing details from logs that are no longer needed?  If so:

         what details are removed?

         what details remain?

         how frequently does the process run?

         If the sanitization process includes aggregating statistics in support of system analytics (e.g. average access and usage reports), is there a mechanism or process in place for ensuring the aggregate data doesn't inadvertently leak sensitive data?

         When deleting records entirely, what process is in place to ensure that they are actually deleted, and not left in the system and simply de-referenced?

         When a credentialed user asks to be removed from the system, is there a mechanism in place to delete their associated records from the logfiles as well as from the operational system?

  • Log Archiving – After the effective life of logs is reached as part of the operational system, they are often archived or otherwise backed up for storage off-line (or otherwise not connected to the operational system).  While minimizing the likelihood of inadvertent leakage of information via the operational system, care needs to be taken to ensure the archives are handled in a secure manner.  NOTE: This is primarily an area of consideration for system and physical security of the credentialing entity.
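
Several of the logging questions above (time-stamping, opaque identifiers, and sanitization) can be sketched together.  The example below is illustrative only: the key, the 90-day retention period, the field names, and the choice of which fields to drop are all assumptions, not requirements of this framework.  Note also that the keyed hash shown here is pseudonymization rather than encryption; encrypting log data would need a cryptographic library and key management that are out of scope for this sketch.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

# Hypothetical secret; a real deployment would fetch it from key management.
PSEUDONYM_KEY = b"example-key-not-for-production"
# Assumed retention period after which entry details are no longer needed.
RETENTION = timedelta(days=90)


def opaque_id(user_id: str) -> str:
    """Derive a stable, opaque identifier so the log never holds the raw ID."""
    digest = hmac.new(PSEUDONYM_KEY, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def access_entry(user_id: str, event: str, ip_address: str,
                 now: Optional[datetime] = None) -> Dict[str, str]:
    """Build an access log entry time-stamped in UTC."""
    now = now or datetime.now(timezone.utc)
    return {
        "timestamp": now.isoformat(),
        "user": opaque_id(user_id),
        "event": event,
        "ip": ip_address,
    }


def sanitize(entry: Dict[str, str], now: Optional[datetime] = None) -> Dict[str, str]:
    """Strip user-linked details from entries older than the retention period."""
    now = now or datetime.now(timezone.utc)
    age = now - datetime.fromisoformat(entry["timestamp"])
    if age <= RETENTION:
        return dict(entry)
    # Keep only what aggregate reporting needs: when and what, not who or where.
    return {"timestamp": entry["timestamp"], "event": entry["event"]}
```

A scheduled job could apply sanitize to every stored entry, which answers the "how frequently does the process run?" question with whatever interval the deployment chooses.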

 

Appendix A: Document Change Log

 

Date – Description

September 22, 2010 – First complete rough draft for broader circulation to the Kantara Initiative Privacy and Public Policy Working Group (P3WG)

September 25, 2010 – Included final comments from the subcommittee members to be incorporated into the draft to be submitted to the P3WG.

September 29, 2010 – Addressed final subcommittee comments: Added more detail to privacy issues related to information available within credentials and clarified the definition of “identity” to more clearly correspond with the one commonly used within IdM.  Also inserted a discussion within the Proofing sections to allow for deployments that may not require highly proofed credentials.

 


[i] Ohm, Paul. Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization (Pre-Print Draft). University of Colorado Law Legal Studies Research Paper No. 09-12. 2009-08-13. URL:http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID1450006_code487663.pdf?abstractid=1450006&mirid=5. Accessed: 2010-09-22. (Archived by WebCite® at http://www.webcitation.org/5swJgTuFA)

[ii] Bolten, Joshua B. M-04-04 E-Authentication Guidance for Federal Agencies. Executive Office of the President, Office of Management and Budget, Washington, D.C. 20503. 2003. URL:http://www.whitehouse.gov/sites/default/files/omb/memoranda/fy04/m04-04.pdf. Accessed: 2010-08-30. (Archived by WebCite® at http://www.webcitation.org/5sNVW3ase)

[iii] Burr, William; Polk, Timothy; Dodson, Donna. SP 800-63 Electronic Authentication Guideline. United States National Institute of Standards and Technology. April, 2006. URL:http://csrc.nist.gov/publications/nistpubs/800-63/SP800-63V1_0_2.pdf. Accessed: 2010-08-30. (Archived by WebCite® at http://www.webcitation.org/5sNWgu0N4)

[iv] Burr, William; Dodson, Donna; Perlner, Ray; Polk, Timothy; Gupta, Sarbari; Nabbus, Emad. SP 800-63 Electronic Authentication Guideline. United States National Institute of Standards and Technology. December, 2008. URL:http://csrc.nist.gov/publications/drafts/800-63-rev1/SP800-63-Rev1_Dec2008.pdf. Accessed: 2010-08-30. (Archived by WebCite® at http://www.webcitation.org/5sNYELGwn)