AI & Machine Learning 101-Part 3: A Brief History of Artificial Intelligence

AI has come a long way since its inception as a field of research in the mid-1950s.

AI, a brief history

For some people, the term artificial intelligence (AI) might evoke images of futuristic cities with flying cars and household robots. But AI isn’t a futuristic concept, at least not anymore.

Although it was not referred to as such, the idea of artificial intelligence can be traced back to antiquity (e.g., the Greek god Hephaestus’s talking mechanical handmaidens).¹ Since the 1930s, scientists and mathematicians alike have been eager to explore creating true intelligence separate from humans.

AI’s defining moment in the mid-20th century was a happy confluence of math and biology, with researchers like Norbert Wiener, Claude Shannon, and Alan Turing having already chipped away at the intersection of electrical signals and computation. In 1943, Warren McCulloch and Walter Pitts created a mathematical model of neural networks—a term you might recall from our previous blog. Neural networks paved the way for a brave new world of computing with greater horsepower, and, in 1956, at the Dartmouth workshop, the field of AI research was officially established as an academic discipline.

The latter half of the century was an exciting age for AI research and progress, interrupted occasionally by “AI winters” in the mid-70s and late 80s, periods when AI failed to meet public expectations and investment in the field dried up. But despite the setbacks, new applications for AI and machine learning were appearing left and right. One such application has become a popular parable within the scientific community, speaking effectively to the trials and tribulations of AI research and implementation.

The story goes something like this:

In the 1980s, the Pentagon decided to use a neural network to identify camouflaged tanks. Working with just one mainframe (from the 1980s, keep in mind), the neural net was trained on 200 pictures: 100 of tanks and 100 of trees. Despite the relatively small size of the network (a consequence of 1980s limits on computation and memory), training in the lab resulted in 100% accuracy. With such success, the team decided to give it a go out in the field. The results were not great.

Why did the neural network do so fantastically on the photos in the lab, but fail so completely in the field? It turned out that all the tank photos had been taken on days when the sky was cloudy, while all the pictures of trees had been taken on days when the sun was shining. The neural net had been trained to recognize sunniness, not tanks.
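This failure mode is easy to reproduce. Here is a toy sketch (not the Pentagon’s actual system; the brightness values and threshold below are invented purely for illustration) in which a “classifier” that looks only at average image brightness scores perfectly on a biased training set, then misclassifies a tank photographed in sunshine:

```python
import random

random.seed(0)

# Hypothetical stand-in for the 1980s dataset: each "photo" is reduced
# to a single average-brightness value. Tank photos were shot on cloudy
# days (dim); tree photos were shot on sunny days (bright).
tank_photos = [random.uniform(0.1, 0.4) for _ in range(100)]  # cloudy
tree_photos = [random.uniform(0.6, 0.9) for _ in range(100)]  # sunny

# A trivial "classifier" that thresholds on brightness -- the only
# signal that separates the two classes in this biased training set.
threshold = 0.5

def is_tank(brightness):
    return brightness < threshold

# Perfect accuracy on the lab data...
train_acc = (sum(is_tank(b) for b in tank_photos)
             + sum(not is_tank(b) for b in tree_photos)) / 200
print(train_acc)       # 1.0

# ...but a tank photographed on a sunny day is misclassified.
sunny_tank = 0.8
print(is_tank(sunny_tank))  # False
```

The model never learned anything about tanks; it learned the confounding variable (weather) that happened to separate the two classes in the training data.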

Eventually, though, visual recognition via deep learning—facilitated by neural networks far more complex than the Pentagon’s 1980s mainframe could have handled—became a reality. In 2012, Stanford professor Andrew Ng and Google fellow Jeff Dean built one of the first large-scale deep neural networks using 1,000 computers with 16 cores each. The task: analyze 10 million YouTube videos. The result: it found cats.² Thanks to its “deep learning” algorithm, the network learned to recognize cats over time, and with impressive accuracy.

With the availability of vast computing resources that were undreamed of back in the 1980s, deep neural networks have quickly become a popular area of research. Deep learning gives a system the ability to automatically “learn” from billions of combinations and observations, reducing dependency on human effort. Within the cybersecurity domain, the method has proven particularly promising for detecting malware, a scenario in which we have large datasets with many examples of malware from which a network can learn.

Unfortunately, deep learning methods are currently less effective when it comes to insider threats, because we simply don’t have the right kind of data on these types of attacks in the volumes required. Most often, the information we have on insider threats is anecdotal, which cannot be used efficiently by these types of neural networks. Until we can gather more effective datasets (and reduce the cost and complexity of deep learning systems), deep learning is not the right choice for every use case. And that’s okay. Deep learning is just one of many machine learning algorithms, and other approaches can be just as valuable, if not more so; it all depends on the job at hand.

We’ve seen the immense potential of AI in the six decades since its official “birth,” and we’ve only just scratched the surface, especially in security. In our next blog, we’ll take a deeper dive into the potential for AI and analytics to change the way we identify and respond to security threats.

¹Deborah Levine Gera (2003). Ancient Greek Ideas on Speech, Language, and Civilization.

²It found other objects, too. But cats were among the first!


Stephan Jou is Chief Technology Officer at Interset.