There's a lot of hype around the use of AI in all aspects of life, cybersecurity included. As you may have gathered, some of this hype is overblown; in the data protection space, AI is often pitched as a “magic bullet” that will solve an organization’s content analysis problems.
With this in mind, I thought I’d highlight how artificial intelligence is actually used within Fortra’s Data Classification Suite (DCS) of products, and drill down into the details where applicable.
First though, what's the problem that our AI engine is addressing? Recent research has shown that 72% of Americans have little to no understanding of privacy laws. Personal data? Private data? End users simply don’t know what data should be considered “private” or “sensitive” by law.
On top of that, we've seen that individuals are up to 40% more likely to miss personal data in a long message than in a shorter one. If that's the case, then longer documents are at a high risk of being misidentified. Similarly, we've seen that individuals are up to 14% more likely to overlook health information terms like “diabetes” as personal data than terms like “username” and “password.”
Fortra's Data Detection Engine (DDE), the most robust data detection solution available today, analyzes content and returns results based on AI technology. These results can be used to set classification values on content automatically, or to prompt the end user with suggested classification values, saving them a great deal of time in the process.
DDE actually has a number of AI capabilities, including:
- SmartRegex: The ability to run regular expressions with additional intelligence that will, for example, recognize that any term containing an alphabetic character is not a credit card number. Ruling out such terms early helps with performance.
- Topic identification: Our AI engine is able to recognize up to 10 separate topics such as job applications, invoices, or employee benefits content. These topic types likely need more protection.
- Named Entity Recognition (NER): The ability to identify “names” such as personal names and postal addresses. Examples are “Alice Smith” or “10 Acacia Avenue,” which could be Personally Identifiable Information (PII).
- Co-reference: The ability to assign context to a name in a phrase such as “Alice Smith is a financial advisor at our company. Unfortunately, she was recently diagnosed with multiple sclerosis.” Here, “she” refers back to Alice Smith, so the health information in the second sentence is tied to a named individual.
- Identification of privacy data: Examples include home addresses, phone numbers, credit card numbers, personal health information, passport or social security numbers, usernames and passwords, etc.
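To make the SmartRegex idea from the list above more concrete, here is a minimal Python sketch. It is not Fortra's implementation: the function names, the card-number pattern, and the Luhn checksum step are all assumptions chosen for illustration. It shows how a cheap check for alphabetic characters can rule a term out before any pattern matching runs.

```python
import re

# Hypothetical pattern (an assumption for this sketch): 13 to 19 digits,
# optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"^\d(?:[ -]?\d){12,18}$")


def luhn_valid(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def looks_like_card_number(term: str) -> bool:
    """Illustrative check only; not the DDE's actual logic."""
    # Cheap pre-filter: any alphabetic character rules the term out
    # before the regular expression is evaluated.
    if any(ch.isalpha() for ch in term):
        return False
    if not CARD_PATTERN.match(term):
        return False
    # Extra validation (an assumption): card numbers carry a Luhn check digit.
    return luhn_valid(re.sub(r"[ -]", "", term))


print(looks_like_card_number("4111 1111 1111 1111"))     # True (a common test number)
print(looks_like_card_number("ORDER-4111111111111111"))  # False (contains letters)
```

The pre-filter costs one pass over the characters, so terms such as order references or product codes are discarded without paying for the pattern match or the checksum.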
These AI technologies include machine learning with statistical analysis, deep learning with neural networks, and natural language processing. The combination of these technologies provides a powerful means to identify sensitive content within the applications that your end users typically use.
Fortra's DCS for Windows (using the Outlook and Office desktop products), DCS One for M365 (using Outlook and Office in a browser), and DCS for DaR (scanning data-at-rest) are all able to call DDE to provide AI-driven results.
Looking for more information on how Fortra can detect data? See why Fortra's Data Detection Engine is flexible enough to meet the needs of organizations today and can grow to satisfy evolving requirements.