Seeing Through The Data Fog (In The Cloud)

IDC estimates that more than half of all IT spending will be on cloud technology by 2018. Companies now store multiple exabytes natively in the cloud, and as the cost to migrate more data continues to come down, the cloud is becoming the primary repository for many organizations.

But even as the cloud becomes the default store for enterprise and personal data, data security is struggling to keep pace. A growing number of recently publicized cloud data breaches involved data that was either stored insecurely or protected by incorrect access controls. In fact, it can be argued that by removing legacy frictions around procuring, provisioning and using storage in the enterprise, the cloud has weakened traditional data protection approaches or even made them obsolete.

More Data, More Problems

The seemingly limitless ability of the cloud to absorb data is a double-edged sword for organizations. Yes, the cost and effort of storing more data drop precipitously in the cloud, enabling unprecedented scale and efficiency. But simplicity of provisioning and deployment can bring more relaxed governance and remove the historical IT constraints and limits on the creation and management of data. For companies dealing with sensitive information, which in this day and age means most organizations, this can represent a serious security, compliance and governance challenge.

Take, for example, data localization. In the age of corporate data centers it was pretty simple to guarantee data sovereignty compliance: data was stored on systems inside the data center, and the data center was a four-walled building grounded in a specific location. Regulators wanting proof of data sovereignty could be satisfied in the knowledge that their citizens' data resided in a building they could physically visit. In the cloud, companies depend on a provider's assurance, which may or may not satisfy a regulator. Moreover, where previously a team standing up an application would have to go through an elaborate sign-off process to provision infrastructure in a specific location, today any team with a credit card can provision data systems at will.
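To make the localization problem concrete, here is a minimal sketch of the kind of check a governance team might run over an inventory of cloud data stores. Everything in it is an assumption made for illustration: the DataStore record, the residency_violations helper, the store names and the set of permitted regions are hypothetical, and a real inventory would be pulled from each cloud provider rather than typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """Hypothetical inventory record for one cloud data store."""
    name: str
    region: str
    holds_personal_data: bool


def residency_violations(stores, allowed_regions):
    """Return stores that hold personal data outside the allowed regions."""
    return [
        store for store in stores
        if store.holds_personal_data and store.region not in allowed_regions
    ]


# Illustrative inventory; names and regions are made up for this sketch.
INVENTORY = [
    DataStore("customer-db", "eu-west-1", True),
    DataStore("marketing-export", "us-east-1", True),
    DataStore("build-cache", "us-east-1", False),
]

# Assumed policy: personal data must stay in these regions.
ALLOWED_REGIONS = {"eu-west-1", "eu-central-1"}

violations = residency_violations(INVENTORY, ALLOWED_REGIONS)
for store in violations:
    print(f"{store.name} holds personal data in {store.region}")
```

The check is only as good as the inventory behind it, which is why a store provisioned outside the usual sign-off process is the hard case.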

For industries dealing with regulated data, the ease with which data can be created and proliferated can prove very troubling. Take identity data. Personal data is by most measures regarded as sensitive, and most states and countries today have statutes that regulate its collection, storage and usage. The changing face of identity data protection regulation is perhaps nowhere better exemplified than in the EU General Data Protection Regulation, or GDPR for short.

GDPR, which comes into effect in May 2018, is a new EU regulation that governs how enterprises collect and process personal data. Failure to meet its strict data accounting and reporting requirements could result in penalties of up to 4% of global revenue for companies that handle European citizen data. Among the 99 articles enshrined by GDPR are new data rights for individuals and strict record keeping for enterprises. Taken together, these new requirements oblige companies to accurately find, inventory and map their user data across the enterprise. For many companies this already proves to be a tall order in the traditional data center, where at least there exist some logical and physical boundaries as to where data can be kept and searched. In the cloud, the challenge is compounded by the lowered constraints around where personal data can be created or stored, and the picture of where identity data lives can quickly become foggy.

Seeing Through The Data Fog

Finding and mapping personal data in the cloud is essential to meet new data protection and privacy regulations like GDPR. For enterprises, doing this in the cloud requires the ability to automatically inventory the data sources belonging to an organization and then scan a mix of structured, unstructured and semi-structured data across IaaS and SaaS at cloud scale. Older-generation data discovery tools pre-date the cloud and typically lack cloud-native scale and automation. Moreover, they struggle to inventory and map data by data subject, a prerequisite for meeting regulations like GDPR.
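As a rough illustration of what mapping data by data subject means, the sketch below scans a small, made-up mix of structured, semi-structured and unstructured sources for email addresses and records which sources mention each one. It is a toy under stated assumptions and not a description of how any particular product works: the source names, the map_by_data_subject helper and the deliberately simplified email pattern are all hypothetical, and an email address stands in for the much wider range of identifiers a real scan would need to handle.

```python
import re
from collections import defaultdict

# Simplified pattern for illustration only; it will miss or over-match
# some real-world addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def iter_text_values(item):
    """Yield every string found in a nested dict/list/str structure."""
    if isinstance(item, str):
        yield item
    elif isinstance(item, dict):
        for value in item.values():
            yield from iter_text_values(value)
    elif isinstance(item, (list, tuple)):
        for value in item:
            yield from iter_text_values(value)


def map_by_data_subject(sources):
    """Return {email: sorted list of source names in which it appears}."""
    index = defaultdict(set)
    for source_name, items in sources.items():
        for item in items:
            for text in iter_text_values(item):
                for email in EMAIL_RE.findall(text):
                    index[email.lower()].add(source_name)
    return {email: sorted(names) for email, names in index.items()}


# Hypothetical inventory: a table row, free text, and a nested log event.
SAMPLE_SOURCES = {
    "crm_table": [{"name": "Ana", "email": "ana@example.com"}],
    "support_tickets": ["Customer ana@example.com asked about billing."],
    "app_logs": [{"event": "login", "meta": {"user": "Bo@Example.com"}}],
}

subject_map = map_by_data_subject(SAMPLE_SOURCES)
for email, names in sorted(subject_map.items()):
    print(email, "->", ", ".join(names))
```

Indexing by the person rather than by the system is the point of the exercise: it is what lets an organization answer a request about one individual without searching every store by hand.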

BigID is a leader in cloud-native tools that operate at Big Data scale to help organizations find, inventory and map their data across the cloud and the data center. Using BigID, companies can automate the finding and monitoring of personal data across a mix of cloud environments like AWS, Azure and Salesforce at petabyte scale.

Protecting data starts with knowing your data. In the cloud that can be hard, owing to the seemingly limitless capacity for data to grow. Tools like BigID give enterprises a way to find, track and monitor data with precision and at scale, and to peer through the fog of the cloud.