Know Your Customer, Know Their Data

Consumer data matters to the consumers who share it, the companies that collect and process it, and the governments that oversee its protection and use. Opinions on fair use of that data are far from universal. March alone saw New York State pass a new regulation to enhance consumer data protection in financial services, while Congress voted to roll back new FCC protections around data usage consent. The thread that connects these decisions, along with new international regulations like China’s Cybersecurity Law and the EU’s GDPR, is that personal data is increasingly a battleground, and organizations will have to navigate competing and sometimes divergent consumer, business and government priorities on a global scale.

Accountability Through Accounting

For every organization, customers are the lifeblood of the business: no customers, no business. Increasingly, the relationship between company and consumer is built on data. Data is the currency that defines how the consumer interacts and transacts with an organization. Not surprisingly, then, consumers see organizations as custodians of their data, responsible for its safekeeping and fair use. This view of the company as data custodian is only reinforced by the changing regulatory landscape, which requires companies to be more accountable for data protection and privacy to their customers and to the regulators that defend them. But data accountability is impossible without data accounting.

To know a customer in today’s business world means to know their data. For most organizations this remains a difficult ask. Data is collected across many applications and processed in ways that are not always obvious. Traditional approaches to data discovery rely on imprecise questionnaires or dated scanners, neither of which provides a comprehensive data asset inventory or mapping. Knowing that you have a nine-digit number in a relational database is not the same as knowing an individual’s data. To know a customer or “data subject” means knowing all their data content and also the context of that data’s use: where it is resident, who is accessing it, where it is flowing, what consent has been captured for it and so on. Data accountability is impossible without data accounting, and data accounting requires a means to discover both an individual’s data and the usage context of that data.
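The gap between content and context can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not how any particular product works: the nine-digit pattern, the `Finding` record and all of its field names and sample values are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern: any standalone run of exactly nine digits.
NINE_DIGITS = re.compile(r"\b\d{9}\b")


def pattern_only(text: str) -> list:
    """Content-only discovery: returns matches with no idea whose they are."""
    return NINE_DIGITS.findall(text)


@dataclass
class Finding:
    """A discovered value together with its usage context (illustrative fields)."""
    value: str                 # content: what was found
    data_subject: str          # whose data it is
    location: str              # where it is resident
    accessed_by: list = field(default_factory=list)  # who is accessing it
    consent: str = "unknown"   # what consent has been captured


row = "order 123456789 shipped, customer ssn 987654321"

# The pattern cannot tell an order number from a social security number.
matches = pattern_only(row)
print(matches)

# Context is what turns a match into knowledge about an individual.
finding = Finding(
    value=matches[1],
    data_subject="subject-42",          # hypothetical identifier
    location="eu-west/crm/customers",   # hypothetical location label
    accessed_by=["billing-app"],
    consent="marketing: no",
)
print(finding.data_subject, finding.location, finding.consent)
```

In this sketch both numbers match the pattern equally well; only the context fields distinguish a customer’s identifier from an unrelated order number.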

Content & Context

Next-generation data discovery tools like BigID go beyond just uncovering social security numbers. Finding sensitive data still matters, but in today’s online business world it’s important to accommodate a broader definition of sensitive content while also understanding the context of that data in order to meet security, privacy and data governance priorities. Take the EU GDPR, for instance: the citizen data rights it enshrines require organizations to know what data they collect on every individual, what consent they hold for that data, where the data is resident, how identifiable it is and the purpose of its use. Doing this accurately and at scale requires new approaches not only to find the data but also to inventory and map it. Tools like BigID effectively help organizations create an atlas of their data so they can zoom in and out on specific characteristics and relationships. This helps them both meet emerging regulations and enhance their customer data knowledge.

Data Quality Amidst Data Quantity

As with any map, there will always be macro details and micro details when it comes to customer data knowledge. Data stewardship requires a big-picture view of how data comes into an organization, how it gets processed and how it is ultimately disposed of. It also benefits from a more detailed inventory that can be sliced and diced by data type, data subject, consent, calling application, system, country or even applicable regulation. But to truly make the data valuable, it’s also important to understand the detailed inter-relationships between the various data attributes collected and processed. This requires a granular way to zoom in on how, for example, a cookie is connected to a social security number.
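As a rough illustration of what slicing and dicing an inventory might look like, the following Python sketch groups a handful of inventory rows by an arbitrary dimension. The rows, field names and values are all hypothetical; a real inventory would be far larger and would come from a discovery process rather than a hand-written list.

```python
from collections import defaultdict

# Hypothetical inventory rows; every field name and value is illustrative.
inventory = [
    {"data_type": "email", "subject": "subject-1", "country": "DE", "system": "crm"},
    {"data_type": "cookie_id", "subject": "subject-1", "country": "DE", "system": "web"},
    {"data_type": "email", "subject": "subject-2", "country": "US", "system": "crm"},
]


def slice_by(rows, key):
    """Group inventory rows by one dimension, such as country or data type."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


# The same inventory viewed along two different dimensions.
by_country = slice_by(inventory, "country")
by_subject = slice_by(inventory, "subject")

print({country: len(rows) for country, rows in by_country.items()})
print([row["data_type"] for row in by_subject["subject-1"]])
```

Grouping by data subject is what connects, in this toy data, a cookie identifier to the other attributes held about the same person.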

Having that granular view also helps answer common data quality questions in identity data. Are two data table entities that share some common attributes different people or the same person? Can an IP address be mapped back to an individual? Is a de-identified data set used for analytics re-identifiable?
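The first of these questions can be sketched in a few lines of Python. This is a deliberately naive heuristic, assuming that two records sharing at least two identical attribute values probably describe the same person; the threshold, the function names and the sample records are all hypothetical, and production identity resolution typically weighs attributes differently rather than simply counting them.

```python
def shared_attributes(a: dict, b: dict) -> set:
    """Attribute names on which both records hold the same non-empty value."""
    return {k for k in a.keys() & b.keys() if a[k] and a[k] == b[k]}


def likely_same_person(a: dict, b: dict, min_shared: int = 2) -> bool:
    """Naive rule: enough identical attributes suggests the same individual.

    The min_shared threshold is an arbitrary assumption for illustration.
    """
    return len(shared_attributes(a, b)) >= min_shared


# Hypothetical records from three different tables.
rec_a = {"email": "pat@example.com", "phone": "555-0100", "ip": "203.0.113.7"}
rec_b = {"email": "pat@example.com", "phone": "555-0100", "name": "P. Doe"}
rec_c = {"email": "sam@example.com", "ip": "203.0.113.7"}

print(likely_same_person(rec_a, rec_b))  # share email and phone
print(likely_same_person(rec_a, rec_c))  # share only an IP address
```

The second comparison shows why an IP address alone is weak evidence: a shared address links the records, but not strongly enough under this rule to treat them as one person.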

Data quality depends on detailed data knowledge. Tools like BigID give organizations a way to shine a light on customer information at the business level, the asset inventory level and the fine-grained data table and field level. Equally important, they are designed from the ground up to do so at global scale, spanning petabytes of data across regionally distributed (and governed) data centers and stores, while also accommodating an ever-expanding definition of identity data. Moreover, data-knowledge-driven protection and privacy tools like BigID give organizations a new way to analyze the data and also reuse it in a modern, service-centric way across the organization.