An Industry First: Tuning Classification for Increased Accuracy

Organizations implementing data security, privacy, and governance programs often face their first challenge in knowing what data they have. Data teams need to classify data, identify which data is sensitive, and locate their most valuable data before they can apply rules for security, privacy, and governance.

BigID uses advanced ML to automatically classify data at scale, but even the best ML can benefit from training to increase accuracy. ML models may return false positives, and data stewards want both to remove those false positives from the results and to adjust the models so that the same false positives do not come back again and again. Before now, there was no way to interact with the results and easily adjust the classification models without complex coding. BigID solves this problem with a new feature: Classifier Tuning.

Classifier Tuning combines human interaction with ML to guide automated engines toward increased accuracy. BigID provides an intuitive, user-friendly interface for working with automated classifiers, so users can accept or reject a classifier for specific data objects without complex coding.

How it works:

Data stewards and data owners working in BigID preview a sample of the classified results in any structured or unstructured data asset and confirm that the data matches the assigned classifier.

They then tune classifiers by taking one of three actions:

  • Validate: If the classifier is generating accurate results, validate the classifier and add a verification date
  • Tune & Modify: If the classifier is mostly accurate but has some false positives, tune the classifier by teaching the model which phrases it should ignore
  • Delete: If the classifier produces so many false positives that it is too noisy to generate useful results, delete the classifier
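
For readers who think in code, the three actions above can be sketched roughly as follows. This is an illustrative Python model only, not BigID's implementation or API: the `Classifier` class, the regex standing in for an ML model, and the `validate`, `tune`, and `delete` helpers are all hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Classifier:
    """Hypothetical stand-in for a classifier; not a BigID object."""
    name: str
    pattern: str  # a regex stands in for the real ML model (assumption)
    ignored_phrases: set = field(default_factory=set)
    verified_on: Optional[date] = None

    def findings(self, text: str) -> list:
        """Return matches, skipping phrases marked as false positives."""
        matches = re.findall(self.pattern, text)
        return [m for m in matches if m.lower() not in self.ignored_phrases]


def validate(classifier: Classifier, when: Optional[date] = None) -> None:
    """Validate: record a verification date on an accurate classifier."""
    classifier.verified_on = when or date.today()


def tune(classifier: Classifier, false_positives: list) -> None:
    """Tune & Modify: teach the classifier which phrases to ignore."""
    classifier.ignored_phrases.update(p.lower() for p in false_positives)


def delete(registry: dict, name: str) -> None:
    """Delete: remove a classifier that is too noisy to be useful."""
    registry.pop(name, None)


# Hypothetical registry with one classifier and one sample document.
registry = {"order_id": Classifier("order_id", r"\b[A-Z]{2}-\d{4}\b")}
sample = "Orders AB-1234 and CD-5678 shipped; see form TX-2020 for tax details."

clf = registry["order_id"]
before = clf.findings(sample)   # includes the form number, a false positive
tune(clf, ["TX-2020"])          # a steward rejects that phrase
after = clf.findings(sample)    # the rejected phrase no longer appears
validate(clf)                   # results now look accurate, so mark verified
```

In this sketch, tuning only filters out rejected phrases after matching; a real system might instead retrain or reweight the underlying model, which the description above does not specify.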

Data teams are now able to review findings from auto-classification and preview data assets to easily validate that the classifiers are generating useful results. Data stewards can tune classifiers that are generating too many false positives by teaching the model which phrases to ignore. Stakeholders benefit from more accurate classification results with reduced false positives for security, privacy, and governance initiatives.

Schedule a 1:1 demo with BigID to see how to combine ML automation with human interaction for classification results with higher accuracy.