The California Consumer Privacy Act of 2018 (CCPA) is the first-of-its-kind U.S. law that gives greater privacy rights to consumers who reside in the state. Borrowing many of the core principles of the European Union’s General Data Protection Regulation (GDPR), the Act enshrines significant rights for consumers by granting them unprecedented control over their personal information.
Set to go into effect in less than nine months on January 1, 2020, the Act forces companies to understand how it will impact the way they collect and process consumers’ personal information. Because of the Act’s broad-reaching provisions, confusion exists regarding many of the details, with much of it rooted in a foundational provision: what constitutes personal information (PI) and how that differs from personally identifiable information (PII).
Understanding the difference between the two is critical for preparing to meet the requirements of CCPA, as well as proposed state and federal regulations that are likely to take a similar view of personal information. Companies that fail to understand the distinction will be at significantly heightened risk of fines and class action civil litigation.
What is Personal Information vs. Personally Identifiable Information?
The starting point for understanding the difference between PI and PII lies in the definition of Personal Information according to the CCPA:
“Personal Information” as defined in section 1798.140 of CCPA:
(o) (1) “Personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.
The key phrases here are “capable of being associated with” and “could reasonably be linked, directly or indirectly, with a particular consumer or household.” This definition opens the door to an extremely broad legal interpretation of what constitutes personal information: any data that could be linked with a California individual or household. This goes well beyond data that is obviously associated with an identity, such as name, birth date, or Social Security number, which is traditionally regarded as PII. It is ultimately this “indirect” information, such as product preferences or geolocation data, that matters most, since it is much more difficult to identify and connect with a person than well-structured personally identifiable information.
As companies increase touch points with their customers across more channels, they are collecting petabytes of data about individuals at a breakneck pace. Personal data of all sorts, ranging from highly identifiable to indirect, is being collected across a range of applications and data stores, creating personal data sprawl. Since this massive volume of data resides across a combination of structured and unstructured data stores in the data center and the cloud, it is difficult for organizations to have an accurate picture of whose data they really have, where it resides, and how it is being used. CCPA’s broad definition of the personal information that belongs to a data subject creates perhaps the biggest challenge for organizations, both in complying on day one and in continuing to comply as data volume and complexity increase over time. Protecting data subject rights at scale requires a different approach to discovering and correlating data than the one companies have traditionally used.
Purpose-Built PI Data Discovery for CCPA
Protecting personal data rights under CCPA means accounting for every individual’s data, both PI and PII. But traditional, classification-based data discovery tools can’t correlate or associate data back to an individual. They can tell you what kind of data you have but not whose data you have. Traditional data discovery tools rely on regular-expression-based classifiers to find well-structured types of data such as sixteen-digit payment card numbers. They were not designed to identify personal data based on its connection to an identity. As a result, these tools lack the capability to look beyond well-formed types of classic PII, rendering them inadequate and obsolete for discovering and classifying PI under CCPA.
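To make the limitation concrete, here is a minimal sketch of a pattern-based classifier of the kind described above. The sample records, field names, and the single card-number pattern are assumptions made up for illustration; they are not taken from any particular product.

```python
import re

# Hypothetical sample records; sources and contents are illustrative only.
RECORDS = [
    {"source": "orders_db", "text": "card 4111 1111 1111 1111 charged"},
    {"source": "web_logs", "text": "device at lat 37.7749, lon -122.4194 viewed hiking boots"},
]

# Sixteen digits, optionally grouped in fours by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def classify(text):
    """Return the set of data types this pattern-based classifier recognizes."""
    found = set()
    if CARD_PATTERN.search(text):
        found.add("payment_card")
    return found


def scan(records):
    """Label each source with the data types found in its text."""
    return {rec["source"]: classify(rec["text"]) for rec in records}


if __name__ == "__main__":
    for source, labels in scan(RECORDS).items():
        print(source, sorted(labels))
```

The classifier flags the card number in the first record but returns nothing for the second, even though geolocation and product preference could be linked to a consumer or household. It also says nothing about whose card number it found.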
BigID’s data privacy platform is purpose-built for the advanced discovery of PI and PII across structured, unstructured, Big Data, and cloud data stores, whether they reside in the data center or the cloud. BigID takes a modern approach to data discovery, harnessing the power of machine learning to uncover hard-to-find PI. This approach gives companies a leg up in meeting the specific challenge of discovering and correlating all personal information as defined by CCPA by:
• Discovering PI and PII using ML-based “identity intelligence” to measure the identifiability of data and map how each PI attribute connects to other data related to the same identity across the enterprise
• Correlating personal information to an individual by indexing data by person to preserve and protect data subject rights
• Looking across all enterprise data sources for personal information
• Identifying PI at petabyte scale
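As a rough illustration of what indexing data by person can mean, the sketch below groups records from different stores whenever they share an identifier. It is a toy example under stated assumptions, not a description of BigID's implementation: the records, the store names, and the choice of `email` and `device_id` as linking keys are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from different stores; all names and fields are made up.
RECORDS = [
    {"store": "crm", "email": "ana@example.com", "name": "Ana Diaz", "state": "CA"},
    {"store": "web_logs", "device_id": "dev-42", "geo": "37.77,-122.42"},
    {"store": "app_accounts", "email": "ana@example.com", "device_id": "dev-42"},
    {"store": "orders", "email": "bo@example.com", "preference": "hiking boots"},
]

# Assumed linking attributes: records sharing a value are treated as one person.
LINK_KEYS = ("email", "device_id")


def index_by_person(records):
    """Group store names by person, linking records that share an identifier."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    def link_values(rec):
        return [(key, rec[key]) for key in LINK_KEYS if key in rec]

    # Pass 1: identifiers that appear in the same record belong together.
    for rec in records:
        values = link_values(rec)
        for other in values[1:]:
            union(values[0], other)

    # Pass 2: file each record under the group its identifiers belong to.
    index = defaultdict(list)
    for rec in records:
        values = link_values(rec)
        if values:
            index[find(values[0])].append(rec["store"])
    return list(index.values())


if __name__ == "__main__":
    for stores in index_by_person(RECORDS):
        print(stores)
```

In this sample, the geolocation record in `web_logs` carries no name or email, yet it ends up in the same group as the CRM record because `app_accounts` ties the device to the email address. That is the kind of indirect link the CCPA definition covers.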
If we learned anything from GDPR, it is that companies need to prepare as early as possible to be ready for the deadline. With that in mind, here are the most important considerations to prioritize:
1. Ensure that your team has a shared understanding of the definition of personal information under CCPA
2. Broaden data governance to include PI, not just PII. Organizations must map their data estates, identify all personal information rather than only the directly identifying attributes that are the current standard, and inventory data by person and state of residence.
3. Effectively manage consent and monitor processing. To prove compliance and build trust with consumers, businesses should examine the controls that govern downstream uses of PI, and be able to monitor and verify that consent has been obtained and that each use of PI is appropriate.
To learn more about how BigID can help you get ready for CCPA, visit BigID.com/demo.