BigID for Confluent and Kafka

Discover and identify sensitive personal data in Confluent and Kafka streams

How BigID Works for Confluent and Kafka

Confluent is the commercial provider of Apache Kafka, the open source data streaming technology. Kafka has changed how organizations stream data for rapid analytics and next-generation event management. An estimated 33% of Fortune 500 companies use Kafka to process data or communicate events across large, distributed systems that require very low latency.

Kafka’s popularity in Big Data, Analytics, IoT, and more has increased organizations’ need to identify the sensitive data contained in Kafka streams, especially personal information that could be subject to new privacy or security regulations like CCPA or the New York SHIELD Act.

BigID has developed capabilities that preserve data security and privacy for customers using Kafka and Confluent. Customers can process data streams at wire speed, subscribe to specific topics, identify sensitive personal information in a stream before it enters a data lake or analytics platform, and let developers deploy and manage BigID through its APIs.

Technical Benefits

  • Monitor Kafka streams without adding latency
  • Scan for specific types of personal information defined in BigID
  • Identify contextually sensitive data using BigID’s correlation
  • Filter data entering or leaving a data lake or analytics platform
  • Allow developers to deploy via BigID APIs
  • Use Confluent Schema Registry
  • Leverage Kafka Connect for additional data sources
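
As a rough illustration of the scan-and-filter idea in the list above, the following Python sketch checks each record against a small set of patterns and passes along only records with no findings. It is not BigID's implementation or API: the `CLASSIFIERS` table, `scan_record`, and `filter_stream` are hypothetical names invented for this example, the two regular expressions are deliberately simplistic, and the records are plain dictionaries standing in for messages read from a Kafka topic.

```python
import re
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical classifiers for illustration only. In the product, the types
# of personal information to scan for are defined in BigID, not in code.
CLASSIFIERS: Dict[str, "re.Pattern[str]"] = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_record(record: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return a (field, classifier) pair for every pattern a field matches."""
    findings: List[Tuple[str, str]] = []
    for field, value in record.items():
        for name, pattern in CLASSIFIERS.items():
            if pattern.search(str(value)):
                findings.append((field, name))
    return findings


def filter_stream(records: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield only the records in which no classifier found a match."""
    for record in records:
        if not scan_record(record):
            yield record


# Stand-in for messages consumed from a topic (assumed, already-decoded values).
events = [
    {"user": "a1", "note": "order shipped"},
    {"user": "a2", "note": "contact jane@example.com"},
]
clean = list(filter_stream(events))  # keeps only the first event
```

Because `filter_stream` is a generator, records are examined one at a time as they arrive rather than being collected first, which is the shape a stream filter placed in front of a data lake would need.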

Business Benefits

  • Certified to work with the Confluent Kafka platform
  • No additional infrastructure cost
  • Flexible deployment patterns

Resources