How Big Data and AI Saved the Day

Critical IP Almost Walked Out the Door

If you’re a big data lover (spoiler alert, we are!), Hortonworks’ DataWorks Summit is a must-see. We had a fantastic time sitting in on sessions by some of the top big data experts, having conversations about real-world big data challenges, and, of course, sharing our own perspective on how to leverage big data for cybersecurity.

In my session, “How Big Data and AI Saved the Day,” I had the opportunity to show attendees several examples of how businesses can radically accelerate threat analysis using big data for AI-enabled security analytics.

For example, one of our case studies featuring a $20 billion manufacturer showcased how a big data and machine learning approach for insider threat detection was more effective than one year and $1 million spent on a “traditional” security vendor. Within two weeks, Interset found two engineers exfiltrating data and another 11 individuals stealing data in North America and in China.

You can check out the presentation slides below.

Attendees had some fantastic questions. Let’s take a look.

Q: You explained that Interset uses machine learning to identify abnormal behavior for an individual that might represent a threat. If someone is only ever exhibiting bad behavior, how does Interset identify an abnormality that could indicate a threat?

A: This is a good question. Interset uses unsupervised machine learning to determine a “unique normal” baseline for every user (as well as other entities, such as machines, printers, servers, etc.), but it also looks at other users’ behavior. If a user is consistently behaving “badly,” Interset can identify this risk by comparing his or her behavior to that of other users.
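To make the idea concrete, here is a minimal sketch of comparing a user against both their own baseline and their peers. It is an illustration only, not Interset's actual models: the data, the user names, the `z_score` and `score_user` helpers, and the choice of a simple standard score are all assumptions made for this example.

```python
from statistics import mean, pstdev

def z_score(value, history):
    """Standard score of value against history; 0.0 if history has no spread."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0
    return (value - mu) / sigma

# Hypothetical data: files copied to removable media per day, per user.
daily_counts = {
    "alice": [2, 3, 2, 4, 3],
    "bob": [3, 2, 3, 3, 2],
    "carol": [1, 2, 2, 3, 2],
    "mallory": [40, 42, 39, 41, 40],  # consistently high, so "normal" for mallory
}

def score_user(user, today, counts):
    """Return (deviation from own baseline, deviation from peers' typical level)."""
    self_z = z_score(today, counts[user])
    peer_means = [mean(h) for u, h in counts.items() if u != user]
    peer_z = z_score(today, peer_means)
    return self_z, peer_z

self_z, peer_z = score_user("mallory", 41, daily_counts)
print(f"vs. own baseline: {self_z:.2f}, vs. peers: {peer_z:.2f}")
```

In this toy data, the day looks ordinary against the user's own history but extreme against everyone else, which is the situation the question describes.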

Q: How does Interset take into account temporal activity (i.e., a user’s typical behavior at 9:00 a.m. on a Monday versus 3:00 a.m. on a Saturday)?

A: We use time-based models that account specifically for temporal activity. Ultimately, though, we rely on the accumulation of patterns across everything you do, all the time. This allows us to build up a distribution of your “usual” activity regardless of time.
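As a rough illustration of what a time-based model could look like, the sketch below buckets a user's past activity by hour of the week and asks how common a new event's bucket has been. This is an assumed, simplified stand-in and not Interset's implementation; the `hour_of_week`, `build_profile`, and `bucket_frequency` helpers and the sample timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

def hour_of_week(ts):
    """Map a timestamp to one of 168 buckets (Monday 00:00 is bucket 0)."""
    return ts.weekday() * 24 + ts.hour

def build_profile(timestamps):
    """Count how often the user was active in each hour-of-week bucket."""
    return Counter(hour_of_week(ts) for ts in timestamps)

def bucket_frequency(profile, ts):
    """Fraction of past activity that fell in the same bucket as ts."""
    total = sum(profile.values())
    if total == 0:
        return 0.0
    return profile[hour_of_week(ts)] / total

# Hypothetical history: four Mondays at 9 a.m. and two Tuesdays at 2 p.m.
history = [datetime(2018, 6, d, 9, 0) for d in (4, 11, 18, 25)]
history += [datetime(2018, 6, d, 14, 0) for d in (5, 12)]
profile = build_profile(history)

monday_morning = datetime(2018, 7, 2, 9, 0)    # a Monday at 9:00 a.m.
saturday_night = datetime(2018, 6, 30, 3, 0)   # a Saturday at 3:00 a.m.
print(bucket_frequency(profile, monday_morning))
print(bucket_frequency(profile, saturday_night))
```

A bucket the user has never been active in scores zero, which on its own is only a weak signal. That is one reason to combine it with a time-independent view of overall activity, as the answer describes.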

Q: You shared specific case studies that demonstrated how different companies leveraged Interset to detect threats. Can you apply lessons learned from these case studies to adjust and evolve the models in your platform?

A: Every deployment is different: different environment, different data sources, different threat criteria, and so on. We work with a very diverse group of customers, and every implementation influences the continued development of our models and platform. Our product’s architecture is designed from the ground up to be adaptable and flexible. For example, many of our customers use our APIs to integrate our products with theirs.

For adjusting and evolving models specifically, there are multiple paths. Models can be tuned with a “more of this/less of that” type of capability. New models can also be created by mixing and matching existing models through our model builder capability. And, of course, we have a data science team that is always adding new models. For more information on how models are adjusted, please contact us.

Roy Wilds is a principal data scientist at Interset.