According to Gartner, the future of active metadata is promising, as organizations increasingly rely on data to drive business outcomes. They predict that by 2025, 60% of organizations will actively incorporate metadata management into their overall data management initiatives, up from just 10% in 2017.
Active metadata is becoming increasingly important in the modern data landscape, and is expected to play a critical role in enabling organizations to extract maximum value from their data assets.
What is active metadata?
What happens when you apply machine learning (ML) to metadata so that it can be used to make decisions and trigger actions? Applying ML to metadata transforms metadata into ‘active metadata’, meaning that the metadata is actionable.
Metadata is the data that describes data. For a reminder of metadata types and descriptions, see Metadata Management 101: Know Your Data. Active metadata is ML-augmented metadata that can be used to take action or make decisions based on the metadata.
Active metadata needs to be insightful to be useful for action, and it needs to be stored and made available in a way that enables operational use. Metadata management platforms apply ML to create insight about the metadata and take action on it. The action can occur through a triggered workflow, or the platform may act automatically.
Why is active metadata important?
Active metadata management is crucial for effective data governance. Without it, data management processes can become inefficient and ineffective, leading to poor decision-making, increased risk, and wasted resources. Here are some reasons why active metadata management is important for data governance:
- Comprehensive and Up-to-Date View of Data: Active metadata management provides a comprehensive and up-to-date view of data, including its lineage, quality, and context. This can help organizations understand the origin and history of their data, and ensure that it is being used appropriately and in compliance with regulatory requirements.
- Real-Time Monitoring of Data Quality: Active metadata management enables real-time monitoring of data quality, using metrics such as completeness, accuracy, and consistency. This can help organizations identify and resolve data quality issues before they have a negative impact on business operations or decision-making.
- Enforcement of Data Governance Policies and Standards: Active metadata management can be used to enforce data governance policies and standards, such as access controls, data retention, and data classification. This can help organizations ensure that data is being used in compliance with regulatory requirements, and that sensitive data is being protected from unauthorized access or misuse.
- Improved Analytics and Decision-Making: Active metadata management can enhance analytics and decision-making by providing additional context and insights into the data. This can help organizations identify patterns, trends, and correlations in their data, and make more informed decisions based on that information.
By continuously managing metadata, organizations can help ensure that their data is accurate, reliable, and secure. This can improve decision-making, reduce risk, and increase the efficiency of data management processes.
Who uses it?
Business analysts make business decisions by analyzing the physical data.
- Physical data ML analytic insight: Customer sales data shows that the demand for a product is increasing.
- Resulting business action: The company may increase supply to meet the increased demand.
Data management teams make data management decisions by analyzing the metadata.
- Metadata ML analytic insight: Customer sales data contains personal information.
- Resulting data management action: The data team will need to apply a workflow or automation to protect personal information.
Examples of active metadata
This article describes three examples that illustrate how active metadata is used for privacy and regulation compliance, proactive data quality, and improved data context.
1. Active metadata for privacy and regulation compliance
A dataset that includes personal information could have associated metadata identifying that the data includes personal information. The metadata is active because it can be used to take action to protect that data. In an automated platform, the metadata could trigger a role-based dynamic masking policy to show or hide data depending on whether the user is provisioned to see certain classifications and types of data.
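The masking behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not any platform's actual API: the `ColumnMetadata` class, the `mask_row` function, the `personal_information` classification label, and the sample data are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ColumnMetadata:
    """Hypothetical metadata record: a column name plus its classifications."""
    name: str
    classifications: Set[str] = field(default_factory=set)


def mask_row(row, columns, user_clearances):
    """Return a copy of `row` with values hidden for columns whose
    classifications the user is not provisioned to see."""
    masked = {}
    for col in columns:
        # A column is visible only if the user holds every classification on it.
        allowed = col.classifications <= user_clearances
        masked[col.name] = row[col.name] if allowed else "****"
    return masked


# Assumed sample data: one unclassified column, one tagged as personal information.
columns = [
    ColumnMetadata("order_id"),
    ColumnMetadata("email", {"personal_information"}),
]
row = {"order_id": 1001, "email": "pat@example.com"}

analyst_view = mask_row(row, columns, user_clearances=set())
privacy_officer_view = mask_row(
    row, columns, user_clearances={"personal_information"}
)
```

Here the classification on the `email` column is what drives the decision: the same row is returned masked for one user and in full for another, with no change to the underlying data.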
2. Active metadata for proactive data quality
Systems that can analyze data and evaluate data quality may include data quality information as active metadata.
For example, suppose the percentage of nulls or outliers in a column exceeds the acceptable threshold. The metadata will show that the dataset has a quality issue. That metadata is active because it can be used in a workflow or can trigger an automated alert that the dataset contains a quality issue. The data owner can then proactively take measures to correct or remove the dataset and prevent that data from being used for analysis.
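As a rough sketch of this idea, the Python below computes a null ratio for a column, records the result as metadata, and raises an alert when the ratio exceeds a threshold. The function names, the 10% default threshold, and the sample values are assumptions made for illustration. Outlier detection is left out to keep the example short.

```python
def null_ratio(values):
    """Fraction of entries in `values` that are None."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def evaluate_column(name, values, max_null_ratio=0.10):
    """Produce a quality metadata record for one column.
    The 10% default threshold is an assumed example value."""
    ratio = null_ratio(values)
    return {
        "column": name,
        "null_ratio": ratio,
        "quality_issue": ratio > max_null_ratio,
    }


def alert_if_needed(metadata, notify):
    """Call `notify` with a message when the metadata flags a quality issue."""
    if metadata["quality_issue"]:
        notify(
            f"Column {metadata['column']!r}: "
            f"{metadata['null_ratio']:.0%} nulls exceeds threshold"
        )
        return True
    return False


# Assumed sample column with two of four values missing.
alerts = []
phone_meta = evaluate_column("phone", ["555-0100", None, None, "555-0101"])
alert_if_needed(phone_meta, alerts.append)

# A fully populated column produces metadata but no alert.
id_meta = evaluate_column("id", [1, 2, 3])
alert_if_needed(id_meta, alerts.append)
```

The `notify` callback stands in for whatever a real deployment would use, such as a ticketing workflow or a message to the data owner.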
3. Active metadata for improved data context
Data rarely arrives in perfectly labeled and defined columns. Sometimes column names are so obscure that they give no hint of the data in the underlying column or data asset.
For example, an organization has a list of social security numbers in a column with a name that gives no indication that the column contains social security numbers. A data intelligence platform can scan the data, determine that the column contains social security numbers, and assign an appropriate name or tag as metadata. Now the dataset has metadata to recognize what the content is and take action on it. A workflow can alert a data steward or take automated action to assign a friendly name. This adds context so that users understand what the data is, and it allows further action to identify the data as sensitive and in need of protection.
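The scan-and-tag step might look something like the following Python sketch. It uses a simple regular expression as a stand-in for the ML-based classification a real platform would apply, so it only catches values in the common `NNN-NN-NNNN` format. The column names, tag names, and 80% match threshold are hypothetical.

```python
import re

# Simplified stand-in for a trained classifier: matches NNN-NN-NNNN only.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def classify_column(values, min_match_ratio=0.8):
    """Return suggested metadata if most non-empty values look like SSNs,
    otherwise an empty dict. The 0.8 threshold is an assumed example value."""
    non_empty = [str(v) for v in values if v not in (None, "")]
    if not non_empty:
        return {}
    matches = sum(1 for v in non_empty if SSN_PATTERN.fullmatch(v))
    if matches / len(non_empty) >= min_match_ratio:
        return {
            "friendly_name": "Social Security Number",
            "tags": ["sensitive", "personal_information"],
        }
    return {}


# Assumed sample columns with obscure names.
catalog = {}
catalog["fld_0192"] = classify_column(
    ["123-45-6789", "987-65-4321", None, "078-05-1120"]
)
catalog["fld_0193"] = classify_column(["red", "green", "blue"])
```

After the scan, `fld_0192` carries a friendly name and sensitivity tags that downstream workflows can act on, while `fld_0193` is left untagged.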
Active metadata vs passive metadata
Active metadata refers to metadata that is automatically generated and updated by a system or application. This metadata is typically used to manage data within the system, track changes, and ensure data quality. Examples of active metadata include database schemas, data dictionaries, and data lineage.
Passive metadata, on the other hand, refers to metadata that is manually created and managed by humans. This metadata is typically used to provide additional context and meaning to the data, and to make it easier for humans to understand and use the data. Examples of passive metadata include data descriptions, tags, and annotations.
In data governance, both active and passive metadata are important for ensuring data quality, managing data effectively, and providing meaningful insights to stakeholders. While active metadata is essential for maintaining the integrity of data within a system, passive metadata is necessary for making that data accessible and understandable to humans.
Selecting the “right” platform
Data environments are complex and rely on multiple tools and platforms. A platform that generates active metadata is even more valuable if it can use the metadata to enhance and interact with other connected systems in a metadata exchange for platform-to-platform orchestration. BigID is a representative vendor in Gartner’s inaugural Market Guide for Active Metadata Management, which states that “Active metadata management is an emerging set of capabilities across multiple data management markets resulting from continuous metadata management innovation.” Gartner recommends that data management platforms use active metadata for interoperability with third-party systems.
Three ways that Active Metadata Management Platforms will use metadata to interoperate are to:
- Export the active metadata to be used as insight or to create action in another connected tool or platform.
- Import “foreign” metadata to build further insight and optimize data strategies.
- Use active metadata generated by the platform to prompt action or a workflow in a connected tool or platform.
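The three interoperation patterns above can be illustrated with a small Python sketch that treats a metadata catalog as a dictionary and uses JSON as the exchange format. This is only an assumed, simplified model: real platforms expose their own APIs and formats, and every function name, asset name, and tag below is hypothetical.

```python
import json


def export_metadata(catalog):
    """Serialize the catalog so another tool can consume it."""
    return json.dumps(catalog, sort_keys=True)


def import_metadata(catalog, foreign_json):
    """Merge tags from another system's export into the local catalog."""
    for asset, meta in json.loads(foreign_json).items():
        entry = catalog.setdefault(asset, {"tags": []})
        for tag in meta.get("tags", []):
            if tag not in entry["tags"]:
                entry["tags"].append(tag)
    return catalog


def assets_needing_action(catalog, tag):
    """List assets carrying `tag`, e.g. to hand to a connected tool's workflow."""
    return sorted(asset for asset, meta in catalog.items() if tag in meta["tags"])


# Assumed local catalog and an assumed export from a second system.
local = {"customers.email": {"tags": ["personal_information"]}}
foreign = json.dumps({
    "customers.email": {"tags": ["retention_7y"]},
    "orders.total": {"tags": ["financial"]},
})

import_metadata(local, foreign)          # import "foreign" metadata
payload = export_metadata(local)         # export for a connected tool
to_mask = assets_needing_action(local, "personal_information")  # prompt action
```

In this toy model, the imported tags enrich the local view, the exported payload carries the combined view to another system, and the filtered list is what a connected tool would receive to start a masking workflow.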
Enhance Active Metadata Management with BigID
Active metadata is foundational to a modern data governance practice. Savvy data teams know that metadata is essential to describe and manage data, and active metadata is the next evolution of metadata for data governance. Creating metadata that is actionable, and storing it in a way that enables action, makes metadata even more powerful and lets organizations benefit from emerging data governance capabilities.
The BigID Data Intelligence Platform applies ML to analyze data at scale and create active metadata. The platform generates metadata to add context, including classifiers, attributes, and policies that are used to take action on data. Some actions, like applying policy information, are taken automatically. Other actions, like collaboration and approvals on glossary terms, trigger a workflow because they require human interaction. The active metadata created by BigID is designed to help maximize data value and minimize data risk for any organization wanting to gain context and add automation to manage a data environment.