What Is Active Metadata?
What happens when you apply machine learning (ML) to metadata so that it can be used to make decisions and trigger actions? Applying ML to metadata transforms metadata into ‘active metadata’, meaning that the metadata is actionable.
Metadata is data that describes data. For a reminder of metadata types and descriptions, see Metadata Management 101: Know Your Data. Active metadata is ML-augmented metadata that can be used to make decisions or take action.
To be useful for action, active metadata needs to be insightful, and it needs to be stored and made available in a way that enables operational use. Metadata management platforms apply ML to create insight about the metadata and then act on it, either by triggering a workflow or by taking an action automatically.
Who Uses Active Metadata?
Business analysts make business decisions by analyzing the physical data.
- Physical Data ML Analytic Insight: Customer sales data shows that the demand for a product is increasing.
- Resulting Business Action: The company may increase supply to meet the increased demand.
Data management teams make data management decisions by analyzing the metadata.
- Metadata ML Analytic Insight: Customer sales data contains personal information.
- Resulting Data Management Action: The data team will need to apply a workflow or automation to protect personal information.
Why Does Active Metadata Matter?
This article uses three examples to illustrate the value of active metadata: privacy and regulatory compliance, proactive data quality, and improved data context.
Active Metadata for Privacy and Regulation Compliance
A dataset that includes personal information can carry metadata that identifies it as containing personal information. That metadata is active because it can be used to take action to protect the data. In an automated platform, the metadata could trigger a role-based dynamic masking policy that shows or hides data depending on whether the user is provisioned to see certain classifications and types of data.
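The masking behavior described above can be sketched in a few lines of Python. This is an illustration only, not how any particular platform implements it: the `ColumnMetadata` class, the `ROLE_CLEARANCES` table, the role names, and the `personal_information` classification label are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Hypothetical metadata record: a column name plus its classifications."""
    name: str
    classifications: frozenset = frozenset()


# Hypothetical policy table: the classifications each role may see unmasked.
ROLE_CLEARANCES = {
    "privacy_officer": {"personal_information"},
    "analyst": set(),
}


def mask_value(value):
    """Replace every character of the value with an asterisk."""
    return "*" * len(str(value))


def apply_masking(row, columns, role):
    """Return a copy of the row, masking columns the role is not cleared for."""
    allowed = ROLE_CLEARANCES.get(role, set())  # unknown roles get no clearance
    result = {}
    for col in columns:
        value = row[col.name]
        # Show the value only if every classification on the column is allowed.
        if col.classifications <= allowed:
            result[col.name] = value
        else:
            result[col.name] = mask_value(value)
    return result


columns = [
    ColumnMetadata("email", frozenset({"personal_information"})),
    ColumnMetadata("order_total"),
]
row = {"email": "a@b.co", "order_total": 42}

print(apply_masking(row, columns, "analyst"))
print(apply_masking(row, columns, "privacy_officer"))
```

The point of the sketch is that the policy never inspects the data itself. It reads only the classification metadata, which is what makes that metadata active.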
Active Metadata for Proactive Data Quality
Systems that can analyze data and evaluate data quality may include data quality information as active metadata.
For example, suppose the percentage of nulls or outliers in a column exceeds the acceptable threshold. The metadata will show that the dataset has a quality issue. Because that metadata is active, it can be used in a workflow or trigger an automated alert that the dataset contains a quality issue. The data owner can then proactively correct or remove the dataset and prevent that data from being used for analysis.
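As a rough sketch of this idea, the Python below profiles a column, records the result as metadata, and raises an alert when the null ratio crosses a threshold. The function names, the metadata keys, and the 10% default threshold are assumptions made for illustration, and outlier detection is left out to keep the example short.

```python
def null_ratio(values):
    """Fraction of values that are None (0.0 for an empty column)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def profile_column(name, values, max_null_ratio=0.1):
    """Build a hypothetical quality-metadata record for one column."""
    ratio = null_ratio(values)
    return {
        "column": name,
        "null_ratio": ratio,
        "quality_issue": ratio > max_null_ratio,
    }


def alerts_for(profiles):
    """Turn quality metadata into alert messages for flagged columns only."""
    return [
        f"Quality issue in column '{p['column']}': {p['null_ratio']:.0%} nulls"
        for p in profiles
        if p["quality_issue"]
    ]


profiles = [
    profile_column("phone", [None, None, "555", "556"]),
    profile_column("id", [1, 2, 3, 4]),
]
for message in alerts_for(profiles):
    print(message)
```

In a real deployment the alert would more likely open a ticket or notify the data owner than print to a console, but the flow is the same: the profile is stored as metadata, and the metadata drives the action.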
Active Metadata for Improved Data Context
Data rarely arrives in perfectly labeled and defined columns. Sometimes column names are so obscure that they don’t look or sound anything like the data in the underlying column or data asset.
For example, an organization has a list of social security numbers in a column with a name that gives no indication that the column contains social security numbers. A data intelligence platform can scan the data, determine that the column contains social security numbers, and assign an appropriate name or tag as metadata. Now the dataset has metadata that identifies its content and can be acted on. A workflow can alert a data steward, or an automated action can assign a friendly name. Either way, users gain the context to understand what the data is, and the data can then be identified as sensitive and protected.
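A minimal sketch of this scan-and-tag step is shown below. It uses a simple regular expression as a stand-in for the ML-based classification the article describes, so it only recognizes values written as `NNN-NN-NNNN`. The function names, the tag names, and the 90% match threshold are assumptions for illustration.

```python
import re

# Simplified stand-in for a classifier: matches only the NNN-NN-NNNN layout.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def classify_column(values, min_match_ratio=0.9):
    """Return hypothetical tags for a column based on its values, not its name."""
    non_empty = [str(v) for v in values if v not in (None, "")]
    if not non_empty:
        return set()
    matches = sum(1 for v in non_empty if SSN_PATTERN.fullmatch(v))
    if matches / len(non_empty) >= min_match_ratio:
        return {"ssn", "sensitive"}
    return set()


def tag_columns(table):
    """Scan every column of a table (dict of name -> values) and attach tags."""
    return {name: {"tags": classify_column(values)} for name, values in table.items()}


table = {
    "col_17": ["123-45-6789", "987-65-4321", None],
    "notes": ["called customer", "left voicemail"],
}
metadata = tag_columns(table)
print(metadata["col_17"]["tags"])
print(metadata["notes"]["tags"])
```

The obscure name `col_17` plays no part in the decision. The tags come from the content, and downstream policies can then treat the column as sensitive.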
Active Metadata Exchanges for Platform-to-Platform Orchestration
Data environments are complex and rely on multiple tools and platforms. A platform that generates active metadata is even more valuable if it can use the metadata to enhance and interact with other connected systems in a metadata exchange for platform-to-platform orchestration. BigID is a representative vendor in Gartner’s inaugural Market Guide for Active Metadata Management, which states that “Active metadata management is an emerging set of capabilities across multiple data management markets resulting from continuous metadata management innovation.” Gartner recommends that data management platforms use active metadata for interoperability with third-party systems.
Active metadata management platforms use metadata to interoperate in three ways:
- Export active metadata to be used as insight or to create action in another connected tool or platform.
- Import “foreign” metadata to build further insight and optimize data strategies.
- Use active metadata generated by the platform to prompt an action or a workflow in a connected tool or platform.
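The three exchange patterns above can be sketched as follows. This is a toy model under stated assumptions: metadata is a plain dictionary of column names to tag lists, JSON is the exchange format, and the connected platform is represented by a callback. None of this reflects a specific vendor API.

```python
import json


def export_metadata(catalog):
    """Export: serialize the catalog so another platform can consume it."""
    return json.dumps(catalog, sort_keys=True)


def import_metadata(catalog, payload):
    """Import: merge "foreign" tags into a copy of the local catalog."""
    foreign = json.loads(payload)
    merged = {col: dict(meta) for col, meta in catalog.items()}
    for col, meta in foreign.items():
        entry = merged.setdefault(col, {"tags": []})
        entry["tags"] = sorted(set(entry.get("tags", [])) | set(meta.get("tags", [])))
    return merged


def actions_for(catalog, handlers):
    """Trigger: call the handler registered for each tag found in the catalog."""
    triggered = []
    for col, meta in catalog.items():
        for tag in meta.get("tags", []):
            if tag in handlers:
                triggered.append(handlers[tag](col))
    return triggered


local = {"col_17": {"tags": ["ssn"]}}
payload = export_metadata({
    "col_17": {"tags": ["retention_7y"]},
    "email": {"tags": ["personal_information"]},
})
merged = import_metadata(local, payload)

# Hypothetical handler standing in for a request to a connected platform.
handlers = {"ssn": lambda col: f"mask:{col}"}
print(merged)
print(actions_for(merged, handlers))
```

Merging into a copy keeps the local catalog unchanged until the imported metadata has been reviewed, which is one reasonable design choice among several.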
Benefits of BigID Active Metadata
Active metadata is foundational to a modern data governance practice. Savvy data teams know that metadata is essential to describe and manage data, and active metadata is the next evolution of metadata for data governance. Creating metadata that is actionable, and storing it in a way that enables action, makes metadata more powerful and lets teams benefit from emerging data governance capabilities.
The BigID Data Intelligence Platform applies ML to analyze data at scale and create active metadata. The platform generates metadata that adds context, including classifiers, attributes, and policies that are used to take action on data. Some actions, like applying policy information, are taken automatically. Other actions, like collaboration and approvals on glossary terms, trigger a workflow because they require human interaction. The active metadata created by BigID is intended to help maximize data value and minimize data risk for organizations that want to gain context and add automation to manage a data environment.