Hi there! SafetyDetectives had the pleasure of engaging in a thought-provoking conversation with Kyle from Ohalo, a company making waves in the data privacy and security space. Our conversation covered the core of Ohalo’s Data X-Ray product as well as the challenges and trends shaping the industry.
Thank you for taking the time to speak with me today, Kyle. Could you start by telling me about Ohalo?
We kicked off Ohalo back in 2017 after identifying significant progress in NLP deployability in the enterprise as well as a need for enterprises to comply with new privacy regulations like GDPR and its progeny.
These regulations essentially require enterprises to understand the type of data they have, who has access to that data, why they have the data, and how long they should retain the data. There’s also the need to understand, often at the word level within a document, why we have specific information. Doing this manually is simply not feasible, as it would require hundreds, if not thousands, of people solely dedicated to data compliance.
Can you tell me a little more about Data X-Ray?
Data X-Ray has four essential capabilities: file discovery, classification, activity monitoring, and remediation.
- Discovery: At its core, the Data X-Ray excels at understanding metadata—those vital breadcrumbs that illuminate a file’s journey. It uncovers file entitlements and access controls, revealing who holds the keys to sensitive information.
- Classification: Understanding the contents of files. Is there personal data in it? Is it PCI data? Is it sensitive for a corporate confidentiality reason? The Data X-Ray unveils it all.
- Monitoring: Observing data’s ever-changing landscape. Data X-Ray tracks transformations, records who’s downloading files, and even keeps tabs on the creation of global share links.
- Remediation: Finally, once you understand your data, you need to do something about it. Remediation allows you to redact, archive, and delete data at scale.
When combined, these capabilities provide an end-to-end solution for the data lifecycle within a company, helping it understand what its data means at each point.
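To make the classification capability described above a little more concrete, here is a minimal sketch of content-based labelling. It is an illustration only, not how Data X-Ray works: the label names, the two regular expressions, and the `classify` and `luhn_valid` functions are all assumptions made for this example, and a real classifier would rely on far richer signals than pattern matching.

```python
import re

# Hypothetical patterns for illustration only (assumptions, not Ohalo's rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def classify(text: str) -> set[str]:
    """Attach coarse sensitivity labels to a piece of text."""
    labels = set()
    if EMAIL_RE.search(text):
        labels.add("personal_data")
    # Only count digit runs that also pass the card checksum.
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        labels.add("pci")
    return labels
```

Calling `classify` on the text of each discovered file would yield a set of labels that later steps, such as monitoring or remediation, could act on. The Luhn check is there to cut down on false positives from arbitrary long numbers.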
How do data governance tools enable business users to actively participate in data governance processes?
Imagine a scenario where an enterprise has data spread across various sources—Windows servers, S3 buckets, Salesforce, and more. Each of these sources might contain hundreds of millions of files.
Now, think about your own experience with files on your computer. You open around 20 files a day, but do you recall what you opened this morning? The scale that enterprises operate at is beyond human capacity, which is why you need to rely on machines.
This is where data governance tools step in. They wade through this sea of data, determining the most critical items that need human attention.
For example, one of our clients, a large global bank, was divesting a subsidiary to another bank last year. The first phase involved examining approximately 19 million files of sensitive data, including anti-money laundering reports and regulatory correspondence. One Big Four consulting firm said it was impossible to complete this by the divestiture deadline because it involved reviewing 19 million files in just under two months.
We took up the challenge and succeeded. We narrowed down the 19 million files to around 5,000 files that needed manual review, which were the most sensitive types, and sped regulatory approval for the divestiture. This is a successful example of how tools like ours can handle enormous scale and make it manageable for humans.
With data privacy and data security becoming major concerns these days, how do you help organizations address these challenges and secure their data and documents?
We operate within the data source layer, constructing metadata about files. We not only ingest existing metadata but also create our own. This information feeds into our workflows, enabling actions like file redaction and, soon, archiving or encrypting files. Moreover, we collaborate seamlessly with other systems. This means metadata can be shared with data catalogs, data security tools, and SIEM systems for a comprehensive security strategy.
What trends do you see coming up in the data security and privacy landscape in the next several years?
The landscape of data privacy is evolving rapidly. A prominent trend has been the shift of responsibility from legal teams to IT or security teams. It’s not just about interpreting laws; it’s about ensuring data compliance in practice. Bridging the gap between policies and real-world data is a trend that will likely persist.
Looking ahead, we anticipate innovation in data security. Managing data at scale is complex, and the rise of generative AI and large language models presents exciting possibilities. However, using these technologies safely is paramount. For example, if you’re training an LLM on your corporate data, you wouldn’t want a junior business analyst to be able to query the LLM for the CEO’s emails. Building safety around that will be a significant data security trend in the coming years. As we navigate this new terrain, it’s clear that while AI and data-driven solutions are valuable, securing their implementation within enterprises is the next big challenge.
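One way to picture the safeguard described above is to filter documents by the asker's entitlements before any text reaches the model. The sketch below is a hedged illustration, not Ohalo's approach. It covers retrieval-time filtering only and does not address what a model may have absorbed during training. The `Document` class, the group names, and both functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # groups entitled to read this document


def visible_documents(docs, user_groups):
    """Keep only documents that share at least one group with the user."""
    groups = set(user_groups)
    return [d for d in docs if d.allowed_groups & groups]


def build_context(docs, user_groups, query):
    """Collect text for an LLM prompt from permitted documents only.

    Matching is a naive substring check, used purely for illustration.
    """
    permitted = visible_documents(docs, user_groups)
    return [d.text for d in permitted if query.lower() in d.text.lower()]
```

With this shape, a junior analyst whose groups do not include the one attached to the CEO's mailbox would never have those emails placed in the model's context, however the question is phrased.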
As we peer ahead, it’s evident that the journey is far from over – new challenges and solutions will continue to shape this dynamic landscape, and Ohalo is poised to be at the forefront of these transformative shifts.