Will the Real AI Please Stand Up?

Webinar Recap with Interset CTO Stephan Jou and VP of Products Mario Daigle


Thank you to everyone who joined us on Tuesday, June 12, for the webinar, “Will the Real AI Please Stand Up?” We discussed the differences between the main types of machine learning (supervised and unsupervised), showed a short demo of how Interset’s AI accelerates detection of insider threats, and answered several questions on the topic of AI and machine learning.

You can view the webinar slides online or watch the webinar in full on demand. The transcript is also available here.

Prior to and during the webinar, we received several questions from attendees, which are listed below.

Q&A

Q: By 2020, what are some use cases where (genuine) AI could be applied to security needs within the enterprise?

A: This depends on your definition of “genuine” AI. Is your definition of AI a completely automated threat detection system in which level 1, 2, and 3 SOC analysts are no longer needed? Complete replacement of SOC analysis is unlikely because of the constant innovation in cybercrime and digital attack vectors. By 2020 (only two years from now), Interset believes that AI will be instrumental in providing guided threat hunting and automated analytics that greatly accelerate the current responsibilities of SOC analysts. As mentioned during the demo, our system is designed to constantly monitor and analyze data, because a machine never sleeps. Enterprises that start with an implementation today already benefit from this aspect of AI and, by 2020, broad integration between different systems such as threat detection and security automation and orchestration will be common.

Q: How can you tell when it’s AI and when it’s not?

A: This requires investigation and asking questions about how it works. We hope that the content from this webinar helps you interpret the answers you get! In summary, AI should not be rules-based. It should be based on algorithms that learn from data. Ask for information about the data that is being used to train the AI. Where does it come from? Is it learning by example (supervised), or is it learning by observation (unsupervised)? Asking about the algorithms and understanding the mathematical logic behind the AI will show whether it is real AI.
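To make the distinction between learning by example and learning by observation concrete, here is a minimal sketch in Python. It is only an illustration, not a description of any vendor's implementation: the function names, the feature values, and the threshold of 3 standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, pstdev

# Supervised: learn from labeled examples (feature value -> label).
def train_centroids(examples):
    """examples: list of (value, label). Returns the mean value per label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def classify(centroids, value):
    """Assign the label whose learned mean is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Unsupervised: learn what is normal from unlabeled observations.
def fit_baseline(observations):
    """Return the (mean, standard deviation) of the observed values."""
    return mean(observations), pstdev(observations)

def is_anomalous(baseline, value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical labeled data: a file entropy score tagged by an analyst.
labeled = [(7.8, "malware"), (7.5, "malware"), (4.1, "benign"), (4.6, "benign")]
centroids = train_centroids(labeled)
print(classify(centroids, 7.2))

# Hypothetical unlabeled data: daily login counts for one account.
logins = [10, 12, 11, 9, 10, 11, 12, 10]
baseline = fit_baseline(logins)
print(is_anomalous(baseline, 40))
```

In both halves the behavior comes from the data rather than from a hand-written rule. That is the property worth asking a vendor about.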

Q: What happens when an attacking AI meets your defensive AI?

A: There is an entire area of research devoted to answering exactly this question. It’s called “Adversarial Machine Learning” or “Adversarial AI,” and you can find information on it in research papers and on the Internet. The short summary is that there are techniques that can actually be used to make the defensive AI stronger by learning from the attacking AI and vice versa. It’s a fascinating reflection of the importance of AI as a new battleground.

Q: What are the top three cyber machine learning use cases?

A: The top three machine learning use cases are malware, compromised accounts, and insider threats.

Malware detection is currently the most prolific use case for machine learning. Because there is a large pool of existing data, malware detection is able to leverage supervised machine learning (learning by labels) to detect threats.

Compromised accounts are a very popular attack vector and a very common use case for machine learning. By understanding the behavior of accounts on a network, machine learning can alert you when an account behaves in ways that suggest that someone other than its legitimate owner is acting.

Insider threat remains one of the attack vectors with the highest risk, simply because insiders have access rights to the data of interest. Using machine learning to baseline each person’s “unique normal” can help SOCs identify potential risks and mitigate them before they become incidents.
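One simple way to picture a per-person “unique normal” is a separate statistical baseline for every user. The sketch below uses the median and the median absolute deviation (MAD) of each user's own history. It is a generic illustration under assumed data (the user names, the download volumes in megabytes, and the function names are hypothetical), not a description of Interset's models.

```python
from collections import defaultdict
from statistics import median

def build_baselines(events):
    """events: iterable of (user, value) pairs from a training window.

    Returns {user: (median, median absolute deviation)} for each user.
    """
    per_user = defaultdict(list)
    for user, value in events:
        per_user[user].append(value)
    baselines = {}
    for user, values in per_user.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        baselines[user] = (med, mad)
    return baselines

def deviation(baselines, user, value):
    """How many MADs the value sits from this user's own median."""
    med, mad = baselines[user]
    return abs(value - med) / max(mad, 1e-9)  # guard against a zero MAD

# Hypothetical history: megabytes downloaded per day.
history = [
    ("alice", 20), ("alice", 25), ("alice", 22), ("alice", 23),
    ("bob", 900), ("bob", 1100), ("bob", 1000), ("bob", 950),
]
baselines = build_baselines(history)

# The same 1000 MB download is routine for bob but far outside alice's normal.
print(deviation(baselines, "bob", 1000))
print(deviation(baselines, "alice", 1000))
```

A single global threshold would either miss alice's unusual download or raise an alert on bob every day. Scoring each user against their own history avoids both.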

Q: How far away are we from an epiphany moment where machine learning and AI-driven infosec solutions are considered clearly more agile, accessible, and capable than the more traditional, manually micromanaged infosec approaches?

A: We are clearly on the path towards more agile, accessible machine learning and AI-driven infosec solutions—simply out of necessity. We don’t expect the transformation to happen overnight, but as mentioned during the recent Gartner Security and Risk Management Summit, big data is now an inherent part of security. From that, analytics, machine learning, and AI must be the next steps for processing all of the security data coming in, because manually micromanaged infosec approaches are becoming ineffective. Over the next few years, existing processes and products will adapt to leverage automated analysis and automated responses.

Q: How do modern AI solutions detect malware that uses AI?

A: The old-school way is based on signatures: looking for sequences of bytes or indicators of compromise (IOCs). The more modern approach is based on behavioral analytics. Some behaviors are detectable via endpoints, such as anomalies in process dependencies. For example, it would be unusual for notepad.exe to start loading wininet.dll; this is not normal and is probably malware. Other examples of behavioral analytics are the ratio of inbound to outbound communications, the shape of the network traffic associated with a particular process, or the port and protocol combination for a particular machine. In the case of a fileless attack, knowing whether that user typically uses system tools to perform a PowerShell login provides an indication of abnormal behavior. These are all behavioral qualities that can be detected by AI but not by the old-school signature-based methods.
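The process dependency idea can be sketched in a few lines of Python: record which modules each process loads during a baseline period, then flag combinations that were never (or rarely) seen. The telemetry format, the process and module names, and the `min_seen` parameter are assumptions for illustration only. A real endpoint product would work from far richer data.

```python
from collections import Counter

def learn_pairs(load_events):
    """Count how often each (process, module) pair appears in baseline telemetry."""
    return Counter(load_events)

def unusual_loads(baseline_counts, new_events, min_seen=1):
    """Return new (process, module) pairs seen fewer than min_seen times in the baseline."""
    return [pair for pair in new_events if baseline_counts[pair] < min_seen]

# Hypothetical baseline telemetry: which modules each process normally loads.
baseline = learn_pairs([
    ("notepad.exe", "kernel32.dll"),
    ("notepad.exe", "user32.dll"),
    ("iexplore.exe", "wininet.dll"),
    ("iexplore.exe", "kernel32.dll"),
])

# Hypothetical new activity: a text editor suddenly loads a networking module.
today = [("notepad.exe", "user32.dll"), ("notepad.exe", "wininet.dll")]
print(unusual_loads(baseline, today))
```

No byte signature is involved. The module itself is legitimate, and only the combination of process and module is unusual, which is what a signature-based method cannot express.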

Q: It is highly important to business leaders and stakeholders for AI to make factual its products for rewarding and trusted results, especially in the area of cybersecurity.

A: We could not agree more. This is critical when investigating a technology. It’s important to ask a lot of questions. If a vendor refuses to tell you about the “secret sauce” around its algorithms, be very skeptical. Great cybersecurity does not need to invent new machine learning methods or AI. There are plenty of algorithms out there and, ultimately, we should create a community where knowledge is shared in an effort to find the most effective approaches. We need the good guys to collaborate more.

Q: Why are neural networks considered both supervised and unsupervised machine learning?

A: There is not a straight line between the algorithms and their application to supervised or unsupervised machine learning, which is why neural networks can be applied to both methods. One can implement neural networks in different ways. For example, neural networks are often used for supervised machine learning in image recognition (e.g., a picture of a tree with the label “tree,” a second picture of a tree with the label “tree,” a third picture of a tree with the label “tree,” and then a picture of a dog with the label “dog”). Neural networks are also often used for clustering applications, where the goal is to group things that are similar to each other. This is similar to what Interset does when we find a cluster of machines or users that are similar to each other. For example, there is a different kind of neural network called a “self-organizing map” that is used for clustering. In conclusion, the type of machine learning doesn’t always dictate which kinds of algorithms can be used.
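For readers curious about what a self-organizing map looks like, below is a deliberately tiny one-dimensional version in pure Python. It is a teaching sketch under simplifying assumptions (scalar inputs, two nodes, a fixed neighbor influence of 0.1, evenly spaced starting weights) and not how any production system is built. The data values are hypothetical.

```python
def train_som(data, n_nodes=2, epochs=20, lr=0.5):
    """Train a tiny one-dimensional self-organizing map on scalar inputs.

    Assumes n_nodes >= 2. No labels are used at any point.
    """
    lo, hi = min(data), max(data)
    # Spread the initial node weights evenly across the data range (deterministic).
    weights = [lo + (hi - lo) * i / (n_nodes - 1) for i in range(n_nodes)]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # learning rate decays each epoch
        for x in data:
            # The winner is the node whose weight is closest to the input.
            winner = min(range(n_nodes), key=lambda i: abs(weights[i] - x))
            for i in range(n_nodes):
                if i == winner:
                    influence = 1.0
                elif abs(i - winner) == 1:
                    influence = 0.1  # direct neighbors move too, but less
                else:
                    influence = 0.0
                weights[i] += rate * influence * (x - weights[i])
    return weights

def assign(weights, x):
    """Return the index of the node (cluster) closest to x."""
    return min(range(len(weights)), key=lambda i: abs(weights[i] - x))

# Hypothetical feature: average logins per hour for six machines.
machines = [1.0, 1.2, 0.9, 9.8, 10.1, 10.3]
weights = train_som(machines)
print([assign(weights, x) for x in machines])
```

The map groups the three quiet machines under one node and the three busy machines under the other without ever being told which is which. That makes it an unsupervised use of a neural network.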

Q: Is there a feedback mechanism to improve or adjust the accuracy of the models?

A: Yes, absolutely. The Interset platform supports human-in-the-loop analysis and reinforcement learning. This capability is built with open APIs so that analysts can go in and make a quick change. We have a slider that can be used to increase or reduce the effect of any behavior on risk.
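As a rough mental model of how a slider could change the effect of a behavior on risk, consider a weighted combination of per-behavior anomaly scores. The sketch below is hypothetical: the behavior names, the 0 to 1 score range, and the weighted-average formula are assumptions for illustration and do not describe the actual Interset scoring model or API.

```python
def risk_score(anomaly_scores, weights):
    """Combine per-behavior anomaly scores (0 to 1) into a 0 to 100 risk score.

    weights holds the slider position for each behavior; behaviors without
    an entry default to a weight of 1.0.
    """
    total = sum(score * weights.get(name, 1.0)
                for name, score in anomaly_scores.items())
    max_total = sum(weights.get(name, 1.0) for name in anomaly_scores)
    if max_total == 0:
        return 0.0
    return round(100 * total / max_total, 1)

# Hypothetical anomaly scores for one user.
scores = {"unusual_login_hour": 0.9, "large_file_copy": 0.2}

# Default sliders: both behaviors count equally.
print(risk_score(scores, {}))

# An analyst who knows this user works night shifts turns that behavior down to zero.
print(risk_score(scores, {"unusual_login_hour": 0.0}))
```

The analyst's feedback changes how much a behavior counts, not whether it is detected. The underlying anomaly score is still available for investigation.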

Q: In machine learning, we believe the past is a good estimator of the future, meaning we can predict the future based on patterns we saw before. Does this hold for the cybersecurity field? Consider ransomware as an example: we never thought it could be dangerous.

A: This is a really good question, and the person who asked it is exactly right. If you haven’t seen something in the past, you can’t expect a machine learning algorithm to see it in the future. This applies to black swan attacks, for example. The first time we saw Spectre attacks, we didn’t know what to look for. The key is understanding what you have seen before. Now that we know the behaviors to look for, we can detect zero-day attacks. 99% of zero-day attacks are variations on a theme.

When it comes to behavioral analytics, what you are looking for is precisely what you have not seen before. When a system, user, machine, or printer starts exhibiting behaviors that are not similar to what has been seen before, then you know something is wrong. It is also suspicious when a process like real time.exe that normally doesn’t access the internet suddenly does, or when a machine that normally sends DNS traffic starts sending very large packets. This means behavioral analytics is able to detect zero-day attacks or patterns that have never been seen before. Learn more about how Interset detects zero-day attacks here.

Q: An observation: Your product seems to do a great job of explaining the context and reasoning for why an alert was triggered. This is an area where I see most machine learning cyber products come up short. Please explain why explainability matters in machine learning and your approach to it?

A: Explainability is a big area of AI research today. Not all algorithms are amenable to explanation. For example, with deep learning, it is hard to build an explanation that lets a human understand why the machine decided what it did. At Interset, we have focused on this from the start, when we went to market with the anomalies in our alert user interface (UI). There are two concepts that we focus on. First, we provide the information necessary to understand the alert and show what happened. Then we provide details to validate the analytics. When you investigate threats, you need to quickly be able to explain your reasoning in order to escalate and build a case to expend resources to rebuild a machine, investigate an insider threat, or the like. Explainability was always baked into how we build the application, not just to know what matters most, but to understand why something scored high and which underlying events justified that score.

Q: Which is a more practical start: Chief AI Officer or an AI-first strategy?

A: It depends on your situation. Should you hire a dedicated person to lead AI or have a strategy from the board level first? There is no question that in the industry today (or any industry), there are a lot of amazing things that can be done with AI. An AI renaissance is taking place, and it behooves us to utilize AI techniques to do analysis. If your company is sufficiently large and has a board-level or C-level mandate, then it makes sense to make AI someone’s sole responsibility. It’s not easy to jump into. There’s a lot of snake oil out there, and having someone dedicated can be very beneficial. But if your company is small, then maybe not. Get some early wins in small projects and solve one problem at a time to learn.

Q: Interset recently announced zero-day, endpoint threat detection capabilities. Are you an endpoint company now?

A: No, we augment existing security tools. We help you get more out of your existing endpoint detection and response (EDR) solution by reducing the volume of noise that comes out of the system.

Q: Does Interset security analytics replace my SIEM?

A: No, Interset security analytics augments and optimizes your SIEM, making it smarter at what it does.

This Q&A was authored by Interset CTO Stephan Jou and VP of Products Mario Daigle.