What are Signatures as they Relate to Security Threats?
Signature-based threat detection has been central to cyber security from the start. But a brief history of signature-based threat detection across endpoint and network security also shows that the limitations of signature-based methods have consistently led practitioners and vendors to pivot toward behavioral methods. As a result, signature-based detection has often been made redundant by behavioral threat detection.
Today, signature-based detection is the de facto method of detection and response for cloud and cloud-native environments. What might the history of signature-based threat detection tell us about the future of signature-based approaches and the evolution toward behavioral detection and response in the cloud?
Key Dates
What is signature-based threat detection?
History of Signatures in Endpoint Detection Methods
Where signature-based detection began: anti-virus software
Anti-virus to Endpoint Protection Platform (EPP)
From Endpoint Protection Platform (EPP) to Endpoint Detection & Response (EDR)
History of Signatures in Network Security
IDS & IPS
The NGFW
Why is Cloud Security Still Reliant on Signature-Based Threat Detection?
History of Signatures in Cloud Security
Factor 1: Is Behavioral Detection Available for Cloud and Cloud Native Environments?
Factor 2: Cloud Security Attacks
Factor 3: Are Signatures Practical to Use in the Cloud?
It’s Time for the Signature to Behavioral Transition to Happen in Cloud Security
Key Dates:
- 1987: First commercial anti-virus solutions like VirusScan and Anti4us were released.
- 1998: Snort, a signature-based open-source IDS/IPS, was launched.
- Late 1990s: Malware exhibited polymorphism and the amount of malware was exploding.
- 2003: Gartner coined the term "Next Generation Firewall" (NGFW).
- 2006: Gartner predicted IDS/IPS integration.
- 2008: Palo Alto Networks launched its category-defining NGFW.
- 2010: FireEye created the first network sandboxing capability for malware detection.
- 2012: CrowdStrike launched with a signatureless, cloud-based threat intelligence module and a detection and response module.
- 2013: Gartner coined the term "Endpoint Detection and Response" (EDR). Cisco acquired Sourcefire, integrating its IDS/IPS technology into its NGFW.
- 2016: Falco, a signature-based open-source runtime protection tool, was launched. CrowdStrike introduced its Machine Learning engine, emphasizing non-reliance on signatures.
- 2017: The release of the EDR Magic Quadrant from Gartner marked the shift from signature-based to behavioral detection.
- 2023: Gartner predicted that 95% of new applications would use cloud-native technologies like containers and Kubernetes.
- 2024: RAD Security launched a new standard for cloud-native workload fingerprints, enabling behavioral detection in cloud environments.
What is signature-based threat detection?
Signatures have been called many things over the years, including ‘heuristics’ and ‘rules’. The bottom line is that signature-based detection relies on matching. This could mean matching a bit of a known attack, like an IP address or a file. Or it could mean matching a piece of code to known viruses or malware. At the end of the day, signature-based detection tries to match current traffic, behavior or activity to a list of ‘known malicious components.’
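To make the matching idea concrete, here is a minimal sketch in Python. It is illustrative only: the hash list, the IP address, and the `matches_signature` helper are all made up for this example and are not taken from any real product.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known-malicious files
# and IP addresses seen in past attacks. All values are invented.
KNOWN_BAD_HASHES = {hashlib.sha256(b"malicious payload v1").hexdigest()}
KNOWN_BAD_IPS = {"203.0.113.7"}  # address from a documentation-only range


def matches_signature(file_bytes: bytes, source_ip: str) -> bool:
    """Return True if the file hash or the source IP is on a known-bad list."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return digest in KNOWN_BAD_HASHES or source_ip in KNOWN_BAD_IPS


# An exact copy of a known sample matches...
print(matches_signature(b"malicious payload v1", "198.51.100.1"))
# ...but a one-character variant of the same payload does not.
print(matches_signature(b"malicious payload v2", "198.51.100.1"))
```

The second call shows the core weakness: change a single byte and the hash no longer matches anything on the list.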
Signatures everywhere!
From the inception of IT security to today, signature-based detection has made a significant mark. Starting with anti-virus software, and then evolving to IPS/IDS as the internet took hold, the signature-based detection method has been widely adopted across multiple fields of security, including network security, endpoint security, and now the burgeoning field of cloud security.
History of Signatures in Endpoint Detection Methods
Perhaps unfairly, signatures are best known (and also most underappreciated) for their role in endpoint security technology, starting with the anti-virus in the 1980s.
Where signature-based detection began: anti-virus software
The world’s first commercial anti-virus software was released by GDATA in 1987 for Atari ST computers, the same year that McAfee was founded and came out with its first product, VirusScan.
In 1987, two other anti-virus solutions came out, FluShot Plus (gotta love that name!) and Anti4us. Although anti-virus solutions probably have the strongest association with signature-based methods, even these early solutions ran into the limits of the signature-based methodology. Anti4us and FluShot Plus both purported to detect novel attacks using partial ‘heuristic’ comparisons.
While the heuristics method was much more prone to false positives, it could supposedly catch novel viruses based only on fragments of known viruses, getting around a critical limitation of signatures.
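A rough sketch of what fragment-based matching could look like is below. This is an assumption-laden illustration, not a description of how Anti4us or FluShot Plus actually worked; the fragments, the scoring function, and the threshold are all hypothetical.

```python
# Hypothetical byte fragments lifted from known viruses (invented values).
KNOWN_FRAGMENTS = [b"\xde\xad\xbe\xef", b"format c:", b"infect_boot_sector"]


def heuristic_score(sample: bytes) -> float:
    """Fraction of known fragments that appear anywhere in the sample."""
    hits = sum(1 for fragment in KNOWN_FRAGMENTS if fragment in sample)
    return hits / len(KNOWN_FRAGMENTS)


def is_suspicious(sample: bytes, threshold: float = 0.3) -> bool:
    """Flag the sample when enough fragments match.

    A lower threshold catches more variants but raises false positives.
    """
    return heuristic_score(sample) >= threshold


# A never-before-seen sample that reuses one known fragment is flagged.
print(is_suspicious(b"brand new virus \xde\xad\xbe\xef body"))
# So is a harmless help file that happens to contain a fragment.
print(is_suspicious(b"To wipe the disk, type format c: at the prompt"))
```

The second call is the false-positive problem in miniature: partial matching cannot tell a reused virus fragment from innocent text that looks the same.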
Credit: https://dosdays.co.uk/topics/Software/mcafee.php
Anti-virus to Endpoint Protection Platform (EPP)
By the late 1990s, malware had evolved to exhibit polymorphism and use packed and encrypted files, and the amount of malware was exploding. It wasn’t efficient to keep large databases of signatures up to date across enterprise endpoints, and ‘heuristic’ techniques were pushing false positives through the roof.
Something had to change, and as the internet took off and endpoints (including phones, laptops, desktop computers, and even printers) became a more critical part of the attack surface, the EPP category rushed in with a bundled solution that included other techniques such as data encryption, IDP, data-loss prevention and anti-virus.
To address the issues of signature-based detection in anti-virus solutions, a new category of EPP providers was built, using the ‘heuristics’ model from early anti-virus solutions and creating malware ‘families’ to detect new types of malware based on pieces of older malware samples.
From Endpoint Protection Platform (EPP) to Endpoint Detection & Response (EDR)
The evolution from EPP to EDR was driven in large part by the need to move even further away from signature-based detection, and the shift from signature-based to behavioral detection took center stage. In the face of new, emerging threats that didn’t install any software, signature-based detection was becoming an even clearer liability than before. Phishing was also a way to evade any signature-based solution, for example with a Word document that could deliver malicious instructions without containing a malicious file, per se.
In 2013, Gartner coined the term ‘EDR’ to describe the class of solutions required on endpoints. This closely followed the launch of CrowdStrike, which had its Series A investment in 2012, touting its signature-less, cloud-based threat intelligence module and detection and response module.
In 2016, CrowdStrike directly challenged the ability of signature-based technology to detect these kinds of emerging threats in the press release for its Machine Learning engine:
“Another key unique feature of the CrowdStrike engine is that it does not rely on signatures, giving users immediate insight into any suspicious file that is uploaded to VirusTotal even if the threat is currently unknown to the anti-virus (AV) industry.”
They continued to further pick on signatures, calling the current anti-virus products “legacy and bloated AV agents” while touting new methods of “machine learning/artificial intelligence, behavioral-based Indicators-of-Attack (IoAs).”
With the release of the EDR Magic Quadrant from Gartner in 2017, signature-based detection was officially ‘out,’ and behavioral detection was ‘in’.
But was signature-based detection still ‘in’ for another category?
History of Signatures in Network Security
Signatures were present in a large portion of the evolving network security market, though the evolution of firewalls, IDS and IPS solutions was perhaps not as well known or publicized. In 1994, Check Point launched the best-known commercial firewall (though technically not the first), FireWall-1. Early firewalls could block traffic based on port, protocol and IP address. IDS solutions became popular as a complementary tool in the 2000s to address new attack types like SQL injection and cross-site scripting (XSS). At its core, an IDS solution detects exploits against applications in traditional, on-prem networks. An IPS (classically speaking), in contrast, sits inline with the firewall traffic, the goal being to actually stop malicious traffic.
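Filtering on port, protocol and IP address can be sketched in a few lines of Python. The `Rule` class, the `decide` function and the sample rules below are hypothetical and only meant to show the shape of first-match packet filtering, not any vendor's implementation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    action: str            # "allow" or "deny"
    protocol: str          # "tcp", "udp", or "any"
    source: str            # CIDR block, e.g. "10.0.0.0/8"
    port: Optional[int]    # destination port; None matches any port


def decide(rules: List[Rule], protocol: str, source_ip: str, port: int,
           default: str = "deny") -> str:
    """Return the action of the first rule that matches, else the default."""
    for rule in rules:
        if rule.protocol not in ("any", protocol):
            continue
        if ip_address(source_ip) not in ip_network(rule.source):
            continue
        if rule.port is not None and rule.port != port:
            continue
        return rule.action
    return default


RULES = [
    Rule("deny", "any", "203.0.113.0/24", None),  # block a known-bad range
    Rule("allow", "tcp", "0.0.0.0/0", 443),       # allow HTTPS from anywhere
]

print(decide(RULES, "tcp", "198.51.100.5", 443))  # allowed by the second rule
print(decide(RULES, "tcp", "203.0.113.9", 443))   # denied by the first rule
```

Notice that `decide` never looks at the payload. A SQL injection sent over an allowed port passes straight through, which is the gap IDS and IPS tools stepped in to fill.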
IDS and IPS
The original IDS and IPS solutions took a pure, signature-based approach. For example, Snort, an open-source IDS and IPS that was launched in 1998, was purely signature-based. IDS solutions relied in large part on signatures to detect all the ways an attacker might exploit a vulnerability, so for every vulnerability, there were hundreds of signatures that needed to be written, and vendors would usually brag about how large their signature databases were compared to the competition. This set-up was acceptable because the IDS sat next to the firewall traffic and inspected it, rather than stopping any traffic in mid-flight.
Imagine the impact on throughput for a firewall, sitting inline with the network, checking each packet that comes through against hundreds or thousands of signatures for every possible exploit of any vulnerability? Yikes. To address this, the IPS vendors began to cover each vulnerability with only one signature, instead of using numerous signatures against any possible exploit. But IDS and IPS solutions remained very much in the realm of ‘signature-based’ methods.
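The difference between many exploit signatures and one vulnerability signature can be illustrated with a toy example. Everything here is assumed for illustration: the vulnerable "USER" command, the 64-byte limit, and the patterns do not describe any real service or any real IDS/IPS rule set.

```python
import re

# Hypothetical vulnerability: a service overflows a buffer when the argument
# of a "USER" command is longer than 64 bytes.
MAX_USER_LEN = 64

# Exploit-specific signatures: one pattern per known exploit payload.
EXPLOIT_SIGNATURES = [
    re.compile(rb"USER A{100,}"),      # a padding-based exploit seen before
    re.compile(rb"USER \x90{16,}"),    # a NOP-sled-based exploit seen before
]


def exploit_match(packet: bytes) -> bool:
    """IDS style: check the packet against every known exploit pattern."""
    return any(sig.search(packet) for sig in EXPLOIT_SIGNATURES)


def vulnerability_match(packet: bytes) -> bool:
    """IPS style: one check that describes the vulnerable condition itself."""
    m = re.match(rb"USER ([^\r\n]*)", packet)
    return m is not None and len(m.group(1)) > MAX_USER_LEN


known = b"USER " + b"A" * 200 + b"\r\n"
novel = b"USER " + b"B" * 200 + b"\r\n"   # same bug, different padding byte

print(exploit_match(known), vulnerability_match(known))
print(exploit_match(novel), vulnerability_match(novel))
```

The single vulnerability check is cheaper to evaluate inline and still catches the variant payload, but it is still a signature: it only covers a vulnerability someone has already written a rule for.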
It is unclear exactly when, but over a decade-long period, IDS and IPS solutions did eventually expand beyond signature-based solutions. And while many equate IPS solutions with anomaly detection and IDS with signature-based methods, the truth is that both solutions had a mix of behavioral and signature-based methods by ~2005.
The NGFW
Signature-based methods were also overtaken by behavioral tools as the networking industry evolved toward the Next Generation Firewall (NGFW). The NGFW primarily evolved to address the complexity of connected networks, which continued to increase exponentially, requiring stateful inspection as well as application and identity awareness. From a consolidation perspective, additional IDS/IPS capabilities were also added into the picture. Iterative changes to the NGFW occurred over the course of a decade or so. Gartner coined the term ‘Next Generation Firewall’ in 2003 and predicted firewall integration with IDS/IPS in 2006, and Palo Alto Networks launched its famous category-defining NGFW in 2008. Cisco Systems bought Sourcefire (the owner of Snort, which was created in 1998) and its market-leading IDS/IPS in 2013, and quickly thereafter added the technology to make its own NGFW.
Beyond the IDS/IPS capabilities, the early NGFW had no answer to the malware involved in the 2011 RSA breach, which got into the network via a phishing campaign and a malicious document. Companies like FireEye had come up with the first network sandboxing capability for this kind of malware in 2010, before the RSA breach had even taken place, but the attack prompted NGFW vendors to start adding network sandboxing capabilities for files taken from network traffic or email, making up for gaps in the signature-based IDS and IPS solutions.
In the NGFW’s full suite of capabilities, signatures became less and less front and center over time to adapt to the new threat landscape, from the firewall itself moving to stateful blocking, to the next-generation IDS/IPS capabilities and then the behavioral sandboxing innovation.
Sounds familiar at this point, doesn’t it?
The NGFW moved to behavior-based detection along with endpoint security.
So is there maybe another category where signatures are still ‘in’? Yes - in Cloud Security.
Why is Cloud Security Still Reliant on Signature-Based Threat Detection?
Signatures weren’t good enough for the endpoint market. They weren’t good enough for the network security market. Why would they be good enough for the cloud security market, which is newer, and has brought a whole new element of complexity to security teams’ responsibilities?
Based on the history of signatures and the current state of signature-based detection in cloud security, it might be reasonable to conclude that signatures will soon be replaced by behavioral detection in the cloud. What will it take for this transition to occur? According to history, the key criteria are:
- Innovation: Behavioral detection must be available in a way that is appropriate for cloud and cloud native environments
- New Attacker Tactics: There needs to be an acute awareness of the new tactics and techniques used by attackers targeting the cloud
- Usability: The reality of using signatures must become impractical on a day to day basis
An examination of the history of the cloud security market will show whether those factors are or aren’t in place.
History of Signatures in Cloud Security
Amazon Web Services launched in 2006, and IT has never looked back. Today, 60% of a company’s data is in the cloud, and a full 45% of breaches are cloud-based. The IT landscape that EDR tools and the NGFW grew up in has fundamentally changed, and includes the proliferation of cloud native technologies. In 2023, Gartner predicted that 95% of new applications would be delivered using cloud native technologies like containers and Kubernetes, because those tools make it faster to develop and ship new features.
In this new paradigm, signature-based detection and response is not just an option, it’s the rule. In fact, it’s thriving!
To understand signature-based methodologies for detection and response in cloud security today, a few key concepts are important:
SDLC: Each part of the software development lifecycle requires different security components; the development stage where you need to scan code for vulnerabilities, the deployment stage where you need to make sure your cloud and Kubernetes infrastructure is hardened, and the runtime stage where containers are actively running.
This runtime stage is where signature-based security comes into play in the classic sense. Technically signatures also could be construed as admission control rules or other kinds of hardening techniques, but for our purposes here we are sticking to signatures as a method for detection versus hardening.
eBPF: The extended Berkeley Packet Filter is a way to run programs in the Linux kernel and was created in 2014. It is the technological foundation for networking, observability, and runtime protection in the cloud native world where containers run on Linux in the cloud. It’s primarily used from a runtime perspective, to observe and control the Linux kernel in its actual, running environment.
Falco: Apart from being one of the most popular open source security tools of all time (7000 stars on GitHub!), Falco, an open source runtime protection tool launched in 2016, is a 100% signature-based technology. It has libraries of rules (aka signatures) that are deployed alongside the tool itself (hint: heavy overhead there), and can perform the observation and blocking functions.
Now, putting it all together, the state of signature-based detection in cloud security today is the following:
Nearly all cloud security vendors use a signature-based approach very similar to that of Falco, using eBPF technology, to provide runtime protection.
Factor 1: Is Behavioral Detection Available for Cloud and Cloud Native Environments?
Yes, yes it is. In January 2024, RAD Security launched a new standard for cloud native workload fingerprints, in the form of an open source catalog. Don’t have millions of samples of cloud attacks? Don’t have millions of signatures from armies of security engineers working around the clock? Don’t need them.
Screenshot of the open source RAD fingerprint catalog
Instead, with this standard, using eBPF, you can fingerprint the behavior of the cloud native workload at runtime, in terms of the container processes, programs and files. This gives you a baseline for what is normal in that unique environment. Then, anything that diverges from that behavior is marked as suspicious. The approach is 100% signature-less, and can detect something like the XZ Backdoor, or any new attack, by profiling the environment before the attack happens. For example, had the XZ Backdoor been exploited, you would have seen the drift in sshd.
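The baseline-and-drift idea can be sketched as follows. This is a simplified illustration under stated assumptions, not RAD Security's fingerprint format or implementation: the event tuples are hard-coded here where a real system would collect them via eBPF, and `build_fingerprint` and `drift` are hypothetical helper names.

```python
def build_fingerprint(events):
    """Build a baseline from (kind, value) events observed during normal runs.

    kind is "process" or "file"; value is a process name or file path.
    """
    fingerprint = {"process": set(), "file": set()}
    for kind, value in events:
        fingerprint[kind].add(value)
    return fingerprint


def drift(baseline, observed_events):
    """Return every observed event that is absent from the baseline."""
    return [(kind, value) for kind, value in observed_events
            if value not in baseline[kind]]


# Baseline recorded while the workload behaves normally (invented values).
baseline = build_fingerprint([
    ("process", "sshd"),
    ("file", "/etc/ssh/sshd_config"),
])

# Later observation: the same workload suddenly spawns a shell and
# touches a file it has never touched before.
observed = [
    ("process", "sshd"),
    ("process", "sh"),
    ("file", "/tmp/payload"),
]

print(drift(baseline, observed))
```

No list of known-bad items appears anywhere in this sketch. The only reference point is what the workload normally does, so a brand-new attack shows up as drift without anyone having written a rule for it.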
Most containers run 80% or more of the same processes across versions and environments, so fingerprints can be versioned and tracked over time, or even shifted left into the SDLC, to the CI/CD process, to cryptographically verify behavior early on.
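One way to picture versioning and verification is to reduce each fingerprint to a stable digest and measure how much two versions overlap. The sketch below assumes a fingerprint is simply a list of process names and uses a plain SHA-256 digest; the actual catalog format and verification mechanism are not described here, so treat `fingerprint_digest` and `overlap` as hypothetical.

```python
import hashlib
import json


def fingerprint_digest(processes) -> str:
    """Digest of a canonical (sorted, de-duplicated) process list."""
    canonical = json.dumps(sorted(set(processes)), separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def overlap(old, new) -> float:
    """Share of processes common to both versions (Jaccard similarity)."""
    old, new = set(old), set(new)
    union = old | new
    return len(old & new) / len(union) if union else 1.0


# Invented process lists for two versions of the same container image.
v1 = ["nginx", "sh", "envsubst", "find", "sort"]
v2 = ["nginx", "sh", "envsubst", "find", "sort", "curl"]

print(fingerprint_digest(v1) == fingerprint_digest(list(reversed(v1))))
print(fingerprint_digest(v1) == fingerprint_digest(v2))
print(round(overlap(v1, v2), 2))
```

In a CI/CD pipeline, a check like this could compare the digest recorded for a release against the behavior observed in a test run, and fail the build when a new process such as `curl` appears unexpectedly.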
RAD enriches the drift data with real-time context from cloud native identity and infrastructure - and the context does NOT come directly from eBPF, removing the metadata overhead that you would get with, say, Falco or a similar approach.
Note that this is very different from the approach others might take and call “behavioral”, with an approach that relies on volumes of cloud attack scenarios. Since these attacks don’t yet exist, this methodology isn’t really a viable option. See Omer Singer’s incredibly insightful article about recent forays into AI-based detection in the cloud space and why they might have failed.
Factor 2: Cloud Security Attacks
Can signatures protect against the kinds of attacks targeting cloud environments today? The most recent zero day to target cloud native environments was the XZ Backdoor. In this attack, a malicious actor, over the course of two years, managed to become the maintainer of popular open source software, XZ Utils, and successfully packaged a remote execution backdoor into a software update.
For this kind of an attack, a signature might have helped after the fact. But a signature-based detection approach would not have done anything to prevent the initial exploitation, and would not have provided any kind of behavioral verification of the software itself. This is just one example, but it’s the same story for nearly every novel Kubernetes attack over the last two years.
Is the awareness of these attacks acute? 85% of CISOs claim that cloud security is their biggest challenge, so yes.
Factor 3: Are Signatures Practical to Use in the Cloud?
Because signatures are so heavily used in runtime security, practitioners today have no shortage of examples of their drawbacks. Warning: these are VERY similar to the drawbacks experienced in other industries!
Where does the library run:
Falco asks you to run its library, significantly impacting efficiency and operating speed for your production workloads
My signature library is bigger than yours:
Companies publish blogs about new signatures on a weekly basis, based on the work of their security research teams, and claim that whoever has the biggest signature library has the best chance of keeping you secure.
What about context?
There are some things that should never be extracted from the kernel, if you want to have acceptable performance, and one of those things is metadata. But many eBPF programs will extract this metadata from the kernel, which is hugely inefficient. The data from the kernel is in real-time as well, so you’re getting data on two different timelines from the same place, making remediation guidance even more difficult. Today, even with this metadata, stateless alerts are the status quo; woe to whoever wants to combine this with other data!
It’s Time for the Signature to Behavioral Transition to Happen in Cloud Security
The innovation is there, signatures are becoming unmanageable to use, and security in the cloud is a CISO’s biggest challenge. It’s time to declare signatures history in cloud security and continue forging the path for behavioral threat detection in the cloud.
Sign the petition to declare signatures history, and get started with behavioral fingerprints of your environment today.