Privacy is all about data management. You need to know which type of information you are dealing with and what actually happens to it on the backend. Data catalogs can help you see and visualize how your data is really being handled.
For PII identification, the only task where NLU methods outperformed good old regex was Named Entity Recognition (such as identifying people’s names). It is no surprise that professional tools like BigID use regular expressions as the first method for classifying private information; clustering and other fancy AI come next.
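To make the regex-first idea concrete, here is a minimal Python sketch of a pattern-based PII scanner. The patterns, the labels, and the `find_pii` helper are all hypothetical and chosen for illustration; they are not the rules any commercial tool actually uses, and a real classifier would need broader, locale-aware patterns plus validation.

```python
import re

# Illustrative patterns only (assumptions, not a production rule set).
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def find_pii(text):
    """Return a dict mapping each PII label to the matches found in text."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com, SSN 123-45-6789, host 10.0.0.1"
    for label, values in find_pii(sample).items():
        print(label, values)
```

A scanner like this is cheap to run over every column or document first. Names of people rarely follow a fixed pattern, which is why that one task is better left to an NER model.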