Data minimization can be a powerful – and seemingly simple – data security measure. The term refers to retaining the least amount of personal information necessary in order for an organization to function. Less information means that there is less that the organization needs to protect, and less opportunity for information to be lost or stolen.

In practice, data minimization requires organizations to fully understand where they collect information, why they collect it, and where it is stored. It also requires difficult decisions about what information the organization is likely to need in the future from a business perspective, and about how limited consumer or employee records could affect any legal disputes that arise. For example, an organization that implements a 30-day or 60-day automatic “roll off” policy for employee email may be unable to identify email exchanges between an employee and a vendor that relate to a contract dispute arising months later. The following provides a snapshot of information concerning document retention:

> 8,000 emails

Average size of employee inbox. [1]

6.5 million

Number of pages of Word data files that could be on a 100GB hard drive. [2]

18 months [3]

Length of time search history is kept by Yahoo.

“The indiscriminate collection of data violates the First Commandment of data hygiene: Thou shall not collect and hold onto personal information unnecessary to an identified purpose. Keeping data on the off-chance that it might prove useful is not consistent with privacy best practices.”

- FTC Chairwoman Edith Ramirez [4]

What to think about when designing a retention policy:

  1. Do you systematically track all of the data fields that your organization collects from consumers and employees?
  2. Do you systematically apply retention periods to each data field that you collect?
  3. Do those retention periods reflect the current business needs, or estimates as to possible future business needs?
  4. For a particular data field, what time period is typical in your industry and for the type of data at issue?
  5. Should you attempt to anonymize (sometimes called de-identify) data after a certain amount of time?
  6. If you do anonymize data, is your organization’s process of anonymization legally sufficient?
  7. What data and documents are you legally required to retain, and for how long must they be retained?
  8. If you decide to retain other data and documents, how does that increase, or decrease, your legal risk?
  9. What additional data is your organization likely to need in the next 12 months?
  10. What steps are taken to irrevocably destroy data that is no longer needed?
  11. Are there any contractual obligations that require you to keep data for a certain duration?
  12. Does the retention policy include an annual review process?
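
Several of the questions above (tracking each data field, assigning it a retention period, and deciding whether to delete or anonymize it once that period lapses) can be expressed as a simple schedule. The following is a minimal sketch in Python, not a definitive implementation: the field names, retention periods, and the `RetentionRule`, `SCHEDULE`, and `action_due` names are all hypothetical and chosen only for illustration. The right periods depend on an organization's legal, contractual, and business requirements.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    """One row of a (hypothetical) retention schedule."""
    field_name: str
    retention_days: Optional[int]  # None = no automatic expiry (e.g. a legal or contractual hold)
    action: str                    # what to do once the period lapses: "delete" or "anonymize"


# Illustrative values only; real periods must come from legal and business review.
SCHEDULE = {
    "employee_email": RetentionRule("employee_email", 60, "delete"),
    "search_history": RetentionRule("search_history", 548, "anonymize"),
    "signed_contract": RetentionRule("signed_contract", None, "delete"),
}


def action_due(field_name: str, collected_on: date, today: date) -> str:
    """Return "delete", "anonymize", "retain", or "untracked" for one record."""
    rule = SCHEDULE.get(field_name)
    if rule is None:
        # The field is collected but has no retention period assigned.
        return "untracked"
    if rule.retention_days is None:
        # Held indefinitely until the schedule is next reviewed.
        return "retain"
    if today - collected_on > timedelta(days=rule.retention_days):
        return rule.action
    return "retain"
```

For example, `action_due("employee_email", date(2024, 1, 1), date(2024, 4, 1))` returns `"delete"` under this hypothetical schedule, while a field missing from the schedule returns `"untracked"`, flagging data that is being collected without an assigned retention period.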

1. Dave Troy, The Truth About Email: What’s A Normal Inbox? (April 5, 2013)

2. See netdocuments, File Sizes and Types

3. Yahoo, Data Storage and Anonymization FAQ

4. Edith Ramirez, The Privacy Challenges of Big Data: A View From the Lifeguard’s Chair, Keynote Address, Technology Policy Institute Aspen Forum (August 19, 2013)