Cybersecurity and Advanced Analytics Will Have a Love Affair in 2018

Five reasons why advanced analytics, powered by unsupervised machine learning, will be your new partner for effective cybersecurity defense


Cybersecurity and analytics have been entwined with each other from the time the very first SIEM was created. After all, a SIEM’s job was to combine relevant data into a single pane of glass for security practitioners.

But times have changed. The analytics of that era cannot keep pace with the volume of data now being generated. Advanced analytics, powered by unsupervised machine learning, will be the new partner for effective cybersecurity defense.

“Why,” you ask? Let’s start with the basics.

Reason #1: Analytics Need Data
There are no analytics if there is no data. Analytics love cybersecurity because there’s lots of data.

Reason #2: Extracting Value From a Lot of Data Requires Advanced Analytics
Cybersecurity log files contain a lot of data. This is evident from the two most popular log-management tools, Splunk and ELK. These log-collection systems gather all the relevant data for human operators to review, analyze, and mine for insights.

That makes log-management tools a match made in heaven for analytics. A vast amount of data is useless, unless you extract insights out of it.

In the case of log data, there are many different types of data, each a rich source of information. Consider the variety of log files available to security teams: authentication logs, directory service logs, endpoint logs, operating system logs, file share logs, VPN logs, resource logs, IP repository logs, printer logs, NetFlow logs. Each machine and software program produces logs. Unsupervised machine learning, an advanced method of analyzing them, is the only scalable option.

Each log tells a story about what a machine or program has done, where it has been, how it is doing, or whom it has talked to or connected with. Measuring and understanding the unique patterns of each type of log file allows security and IT teams to know if things are running normally, or if they require additional attention.

A contextual story that goes beyond one type of log file can be created by analyzing multiple types of log files together. This is where scalable security analytics begin to give defenders a real chance against modern hackers who launch stealthy, multi-vector attacks that live quietly under the radar of existing rigid, rule- and threshold-based systems. Such attacks can result in slow, constant, unauthorized insider access.
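As a rough illustration of analyzing multiple log types together, the sketch below groups weak signals by user and surfaces only the users who show up in several independent log sources. The signal tuples, the source names, and the `min_sources` cutoff are all hypothetical; a real pipeline would parse them from actual logs and would likely also consider timing.

```python
from collections import defaultdict

# Hypothetical, already-parsed weak signals: (log_source, user).
# None of these would trigger an alert on its own.
signals = [
    ("vpn", "bob"),         # VPN login from an unusual location
    ("auth", "bob"),        # burst of failed logins
    ("fileshare", "bob"),   # first-time access to a sensitive share
    ("auth", "carol"),      # one failed login
    ("printer", "dave"),    # large print job
]

def correlate(signals, min_sources=3):
    """Return users whose signals span at least `min_sources` log types."""
    sources_by_user = defaultdict(set)
    for source, user in signals:
        sources_by_user[user].add(source)
    return {
        user: sorted(sources)
        for user, sources in sources_by_user.items()
        if len(sources) >= min_sources
    }

suspicious = correlate(signals)
print(suspicious)
```

Here only `bob` is surfaced, because his activity appears across three different log types; `carol` and `dave` each appear in just one.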

Cybersecurity log files can be a treasure trove of information for threat-hunting, if you know what you’re looking for. That’s the crux of it: simply having lots of data isn’t enough. It’s the insights extracted from data that matter. Those insights require unsupervised machine learning and artificial intelligence (AI) to account for the fact that unique entities (both human and machine) have different individual usage patterns. If those patterns are not interpreted correctly, the result is either too many false positives or too many unsurfaced threats.
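To make the point about individual usage patterns concrete, here is a minimal sketch that scores each entity against its own history rather than against one global threshold. It uses a simple median and median-absolute-deviation score as a stand-in for a real unsupervised model; the entity names, the counts, and the cutoff of 5 are illustrative assumptions.

```python
from statistics import median

def robust_scores(history, today):
    """Score today's count for each entity against that entity's own baseline."""
    scores = {}
    for entity, values in history.items():
        med = median(values)
        # Median absolute deviation: a typical distance from the entity's norm.
        mad = median(abs(v - med) for v in values)
        scale = mad if mad > 0 else 1.0  # avoid dividing by zero
        scores[entity] = abs(today[entity] - med) / scale
    return scores

# Hypothetical daily login counts for a human user and a service account.
history = {
    "alice": [4, 5, 6, 5, 4, 5, 6],
    "backup-svc": [480, 510, 495, 500, 505, 490, 520],
}
today = {"alice": 60, "backup-svc": 515}

scores = robust_scores(history, today)
flagged = [entity for entity, score in scores.items() if score > 5]
print(flagged)
```

A single global rule such as "alert above 100 logins per day" would flag `backup-svc` every day (a false positive) and miss `alice` entirely (an unsurfaced threat). Scoring each entity against its own baseline flags only `alice`.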

Cybersecurity log files need self-learning, advanced analytics (unsupervised machine learning) to provide meaningful insights. In 2018, we finally have both the technology (data science utilizing Apache Hadoop, Spark, NiFi, and more) and the hardware power to process billions of pieces of log-file information to extract actionable security insights.

Reason #3: Zero-Trust Networks Require Contextual Insights
Modern enterprises must operate on the premise of zero-trust networks, which need contextual analytics because perimeter-based security is no longer effective. Such networks require more than just an incredibly large lake of security-log data.

This now includes endpoint data, which can tell a story of data loss, encryption (or lack thereof), application whitelists or blacklists, data classification, endpoints, and more. It also includes business-application data, HR system data, and operational-systems data. All IT systems that help run a business, combined with security data, create a complete, contextual picture of enterprise risk. And to avoid false positives, you need business context. After all, both sets of data are KPIs for how your business is doing.

Analytics about business data tell enterprises how to optimize their business. Security data tells enterprises if everything is running as expected. Anomalies in either tell enterprises something is wrong.

Combining security data with business data enables analytics to tell you if everything is operating according to plan. Conversely, they can tell you if there are anomalies that cannot be attributed to business changes, and are therefore compromising security posture. These contextual analytics bring together even more data than is normally available through standard log collection, for which Splunk and ELK are used.
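One way to picture this combination is as a filter: security anomalies that a known business change can explain are set aside, and only the unexplained ones remain. The sketch below assumes a hypothetical HR feed of business changes and a hypothetical mapping from anomaly kinds to the changes that would explain them; neither comes from any particular product.

```python
# Hypothetical security anomalies surfaced by log analytics.
anomalies = [
    {"entity": "erin", "kind": "new_file_share_access"},
    {"entity": "frank", "kind": "new_file_share_access"},
]

# Hypothetical business context, e.g. from an HR system: (entity, change).
business_changes = {("erin", "role_change")}

# Assumed mapping: which business change would explain which anomaly kind.
EXPLAINED_BY = {"new_file_share_access": "role_change"}

def unexplained(anomalies, business_changes):
    """Keep only anomalies that no known business change accounts for."""
    remaining = []
    for anomaly in anomalies:
        reason = EXPLAINED_BY.get(anomaly["kind"])
        if (anomaly["entity"], reason) not in business_changes:
            remaining.append(anomaly)
    return remaining

to_review = unexplained(anomalies, business_changes)
print([a["entity"] for a in to_review])
```

Erin's new access lines up with her role change, so it drops out; Frank's identical behavior has no business explanation and stays on the list for review.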

For more information, read Forrester’s Zero Trust Security: A CIO’s Guide to Defending Their Business From Cyberattacks by Martha Bennett, with Stephanie Balaouras and Michael Glenn.

Reason #4: Cybersecurity Needs Automation. And Automation Requires Analytics.
Security teams cannot scale human resources to match the number of attack vectors. From CIOs and CISOs down to security analysts, it is apparent that manual efforts cannot keep up with the demands of the “Neverending SOC Cycle.” According to Gartner’s Predicts 2018: Security Solutions (by Dale Gardner, Deborah Kish, Avivah Litan, Lawrence Pingree, and Eric Ahlm), a “high-level trend” has been the “growing demand for automation of tasks performed primarily by individuals in the past (enabled by the advancement of predictive and prescriptive analytics).”

Reason #5: Analytics Are Only Valuable if They Are Meaningful
More often than not, surfaced “insights” are not meaningful, and sometimes they are not even accurate. For example, a SIEM utilizing rules and thresholds sends an alert whenever something crosses one of those rules or thresholds. Yet the sheer number of rules and thresholds results in a sea of alerts.

Ultimately, those alerts are not meaningful to a human being, because the human mind cannot process alerts at that rate and volume. So fine-grained, rule-based accuracy ultimately results in the opposite: a broad wall of noise. Just as your ear naturally tunes out white noise, security teams tune out alerts, even if it is dangerous to do so.

Analytics must be meaningful to the end user. To create meaning, analytics must cancel out unnecessary data or “noise.” Noise-canceling analytics distill the cacophony of less relevant security events and alerts into key risks for security teams to focus on. Because cybersecurity requires the analysis of vast amounts of data, only advanced analytics based on unsupervised machine learning can scale to the job.
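As a simplified picture of what distilling alerts into key risks might look like, the sketch below rolls individual alerts up into a per-entity risk total and returns only the top few entities. The alert tuples, the severity values, and the choice to sum severities are assumptions made for illustration, not a description of any specific product.

```python
from collections import Counter

# Hypothetical alert stream: (entity, severity). Many are low-value noise.
alerts = [
    ("host-a", 1), ("host-a", 1), ("host-b", 8),
    ("host-a", 1), ("host-c", 2), ("host-b", 7),
]

def top_risks(alerts, n=2):
    """Roll alerts up into per-entity risk and return the n riskiest entities."""
    risk = Counter()
    for entity, severity in alerts:
        risk[entity] += severity
    return risk.most_common(n)

print(top_risks(alerts))
```

Six alerts become a two-line list led by `host-b`, which gives an analyst a starting point instead of a stream to scroll through.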