It seems that every vendor in the cybersecurity universe is touting how their products have been enhanced with artificial intelligence (AI) and machine learning (ML) capabilities. While some of these claims are true and backed up by substantial engineering and great design, just as many are exaggerated and poorly implemented. This blog attempts to explain the real progress that AI and ML are making to help address the core tasks of cybersecurity. It will help to demystify AI and ML, as well as look at where these technologies are going. It will also touch on how Zscaler is improving detection accuracy and reducing the impact of advanced threats.
AI and ML offer a new methodology for software development and have particular relevance for cybersecurity. At the same time, companies should recognize which use cases are best for applying AI and ML to their cybersecurity, rather than buying into claims that AI can be (or should be) used for anything and everything.
All software needs algorithms. In conventional programming, humans write algorithms explicitly. The best places to apply ML are where devising the algorithm by hand is especially difficult for humans and where enough data is available for ML models to learn the algorithm. This is particularly true when the problem involves complex pattern recognition, where it is easier for a machine to learn the patterns from data than for a person to write a complete set of rules for identifying them. A great example is malware, where patterns of malicious behavior are hidden in the binary digits of the malware code. Identifying all permutations of these ever-changing patterns is nearly impossible for people, but ML can help unravel such complex and obscure patterns quickly.
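To make the idea concrete, here is a minimal, hypothetical sketch of "learning the algorithm from data" rather than writing rules by hand. It is not Zscaler's implementation: the samples are synthetic, the features are simple byte histograms, and the model is a nearest-centroid classifier, but it shows how a pattern separating benign from malicious bytes can be learned from labeled examples.

```python
# Illustrative sketch only (assumed setup, not a real malware detector):
# learn a byte-level "pattern" from labeled samples instead of hand-writing rules.
from collections import Counter

def byte_histogram(data: bytes, bins: int = 16) -> list:
    """Normalized histogram of byte values, coarsened into `bins` buckets."""
    counts = Counter(b * bins // 256 for b in data)
    total = len(data) or 1
    return [counts.get(i, 0) / total for i in range(bins)]

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Synthetic training data: "benign" samples look like printable text,
# "malicious" samples look like high-entropy packed bytes.
benign = [b"hello world, this is ordinary text" * 4,
          b"configuration file: key=value pairs" * 4]
malicious = [bytes(range(0, 256, 3)) * 4,
             bytes((i * 37) % 256 for i in range(400))]

benign_c = centroid([byte_histogram(s) for s in benign])
malicious_c = centroid([byte_histogram(s) for s in malicious])

def classify(sample: bytes) -> str:
    h = byte_histogram(sample)
    return "malicious" if distance(h, malicious_c) < distance(h, benign_c) else "benign"

print(classify(b"plain readable log output, nothing unusual here"))
```

Real systems use far richer features and models, but the division of labor is the same: humans choose the representation and supply labeled data; the machine derives the decision boundary.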
In addition to pattern recognition, there are two other main areas where AI and ML can be useful in cybersecurity. The first is image recognition, where AI is again tasked with finding patterns in images that human-written rules and heuristics cannot detect quickly enough. The other is content classification for websites. AI and ML can go beyond keywords, perhaps by looking at images, to offer a far richer set of classifications for sites.
Just as data is key to successful AI projects, data can also cause AI projects to fail. At times, ML models don’t have enough data to draw on to become as accurate as they need to be. Another potential problem is that ML models may be vulnerable to adversarial attacks: carefully crafted inputs designed to cause a model to make prediction errors. Take a stop sign, for example. An ML model can be trained to identify a standard stop sign. But small variations on that sign (such as graffiti or stickers), or even a few altered pixels, may fool the model if it has not been trained with robust defenses against adversarial attacks or fed enough images of stop signs with those variations. It’s also important to remember that no AI method is foolproof: things can go wrong, and false positives can be returned.
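A toy example makes the adversarial-attack idea tangible. The sketch below uses a hypothetical four-feature linear classifier (all weights and inputs are invented for illustration) and applies a fast-gradient-style perturbation: each feature is nudged slightly against the sign of its weight, so a change that is small per feature flips the model's decision.

```python
# Illustrative adversarial-example sketch (toy model, assumed parameters --
# not a real stop-sign detector).

def linear_score(x, weights, bias):
    return sum(w * xi for w, xi in zip(weights, x)) + bias

weights = [0.9, -0.4, 0.7, -0.2]   # hypothetical model parameters
bias = -0.05

x = [0.5, 0.6, 0.1, 0.4]           # original input: classified positive
score = linear_score(x, weights, bias)

# Nudge each feature against the sign of its weight by a small epsilon,
# pushing the score across the decision boundary.
epsilon = 0.2
x_adv = [xi - epsilon * (1 if w > 0 else -1) for xi, w in zip(x, weights)]
adv_score = linear_score(x_adv, weights, bias)

print(score > 0, adv_score > 0)    # decision flips despite tiny changes
```

The same principle scales up to deep networks: perturbations imperceptible to a person can be aimed precisely at the model's learned boundary, which is why robustness against such inputs has to be engineered deliberately.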
Despite the exaggerated claims of some vendors and the potential drawbacks if models lack efficacy due to insufficient data, the application of AI and ML to cybersecurity should excite companies. A year ago, Zscaler acquired TrustPath, a leading AI and ML technology startup, and Zscaler is already applying machine learning technology to showcase its larger potential.
One area is malware classification, mentioned earlier. AI and ML speed up analysis and identify patterns that signal possible threats. The traditional way of finding malware relies on sandboxing suspicious files and waiting for illicit behavior to emerge, which introduces delays. AI and ML improve not only the accuracy of pinpointing malware but also the speed at which a threat can be identified. Conventional sandboxes still have their own powerful and unique benefits, but we are augmenting them with AI so that our customers get the best of both technologies.
Another area where Zscaler is using AI and ML is content classification. A person can generally classify at most about 500 websites per day, whereas a machine can classify a near-limitless number. One more example is phishing classification: the scale and volume of potential phishing attacks are far too large for manual review, but AI and ML offer a way to speed up detection.
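To illustrate why a machine can triage at this scale, here is a small, hypothetical sketch (the features are assumptions for illustration, not Zscaler's classifier) of turning a URL into numeric features that a phishing model could score in microseconds.

```python
# Illustrative feature extraction for phishing triage (assumed features,
# not a production classifier).
from urllib.parse import urlparse

def url_features(url: str) -> dict:
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "host_length": len(host),
        "subdomain_count": max(host.count(".") - 1, 0),   # deep nesting is suspicious
        "has_ip_host": host.replace(".", "").isdigit(),   # raw-IP hosts are suspicious
        "has_at_sign": "@" in url,                        # classic obfuscation trick
        "uses_https": parsed.scheme == "https",
        "suspicious_token": any(t in url.lower()
                                for t in ("login", "verify", "secure-")),
    }

print(url_features("http://secure-login.example.com.attacker.net/verify"))
```

Each URL reduces to a fixed-length feature vector, so a trained model can score millions of candidates per day, something no human review team could approach.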
In all cases, AI and ML are helping companies speed up their ability to make cybersecurity decisions while lessening the degree to which such protections inhibit the user experience with delays and latency issues.
As companies look ahead to how they will use AI and ML in the future, it’s important to remember that these technologies are often not going to replace existing security solutions. Instead, AI and ML will make existing cybersecurity efforts better and faster and will add new realms of protection that are currently unimagined.
AI and ML work best in areas where there is a lot of data to help the models mature. In the future, the number and difficulty of the use cases where AI and ML can be applied will continue to grow. AI and ML will improve over time by acquiring more domain-relevant knowledge, which will in turn help cybersecurity experts do their jobs more effectively. AI and ML will not be a panacea for all matters, as some vendors proclaim, but rather a very powerful tool now available in the enterprise cybersecurity toolbox.
Howie Xu is the VP of Machine Learning and AI at Zscaler