Kubernetes Security Blog | RAD Security

How Detection & Response Tools Fall Short in Kubernetes Security

Written by Story Tweedie-Yates | Apr 4, 2023 3:00:00 PM

Detection and Response (nDR) tooling has become a standard part of the security practitioner’s toolkit. Whether monitoring activity on the endpoint, network, or cloud services, these tools are designed to spot patterns of behavior that could indicate a compromise. More effective than virus/malware scans and more sophisticated than older intrusion detection/prevention tooling, nDR tooling often looks across multiple data sources and incorporates some degree of machine learning or data science to weigh the likelihood that any given event is malicious. When it comes to Kubernetes and container runtime threat detection, however, nDR tools have blind spots that make false positive findings more likely.

Visibility Blind Spots for nDR in Kubernetes

Let’s take a step back and remind ourselves what Kubernetes is and how it might intersect with the scope of different types of nDR tooling. Kubernetes is a workload orchestration platform, providing an API for deploying and managing containers, networking them together, and exposing them to other resources or the outside world. It abstracts away the underlying infrastructure, whether a single physical device or thousands of nodes scattered around the world, allowing flexible options for deploying, upgrading, and distributing your workloads.

Compare that to the roles of various nDR tools. EDR, Endpoint Detection and Response, detects signs of malicious activity on a particular endpoint. As a result, its viewpoint is too narrow to fully understand what is happening in a Kubernetes cluster. Because it can only see the activity on one node, it is missing the critical context needed to determine a baseline of “normal” or “expected” behavior across the cluster as a whole. It’s like listening in on one side of an extremely complicated phone call: you may get bits and pieces, but you have a big blind spot.
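
To make the single-node blind spot concrete, here is a minimal sketch in Python. It assumes a pod list shaped like the output of `kubectl get pods -A -o json` and reads only `metadata.name`, `metadata.ownerReferences`, and `spec.nodeName`. The helper names (`node_coverage`, `_pod`) and the sample data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def node_coverage(pod_list, node):
    """For each workload owner, return (pods on `node`, total pods in the cluster)."""
    totals = defaultdict(int)
    on_node = defaultdict(int)
    for pod in pod_list["items"]:
        owners = pod["metadata"].get("ownerReferences") or []
        # Fall back to the pod's own name when it has no owning controller.
        owner = owners[0]["name"] if owners else pod["metadata"]["name"]
        totals[owner] += 1
        if pod["spec"].get("nodeName") == node:
            on_node[owner] += 1
    return {owner: (on_node[owner], total) for owner, total in totals.items()}


def _pod(name, owner, node):
    """Build a minimal, hypothetical pod object for the example."""
    return {
        "metadata": {"name": name, "ownerReferences": [{"name": owner}]},
        "spec": {"nodeName": node},
    }


# Hypothetical sample: one ReplicaSet spread over three nodes, plus a database pod.
SAMPLE_PODS = {
    "items": [
        _pod("web-7d9f-abc", "web-7d9f", "node-a"),
        _pod("web-7d9f-def", "web-7d9f", "node-b"),
        _pod("web-7d9f-ghi", "web-7d9f", "node-c"),
        _pod("db-0", "db", "node-b"),
    ]
}

for owner, (seen, total) in node_coverage(SAMPLE_PODS, "node-a").items():
    print(f"{owner}: an agent on node-a sees {seen} of {total} pods")
```

In this made-up cluster, an agent on node-a observes only one of the three web replicas and none of the database activity, which is why a per-node baseline can differ sharply from what is normal for the cluster as a whole.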

On the other hand, consider NDR, Network Detection and Response. Because it is looking at the traffic across the entire internal network of the cluster, you might think that it approximates the visibility you need to detect malicious activity in the cluster. However, Kubernetes is designed to abstract away the underlying infrastructure, and as a result the underlying network topology is hidden from your workloads. This means that workloads may not (and arguably should not) know whether they are communicating with a service on the same node or one in a totally different location. Examining traffic at the node-network level misses this abstraction, and as a result may again be blind to large portions of the activity on the cluster. As an added complication: if you are using a service mesh to proxy connections between workloads on your cluster, your East-West traffic may all appear as encrypted traffic (typically mutual TLS), revealing only the endpoints of each connection. That may yield some valuable information if you understand enough about your workloads to spot a connection that shouldn’t exist, but it’s very likely to appear as a wall of noise in your SOC with very little signal.
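
One way to recover some of that missing context is to join network flow records against pod metadata from the Kubernetes API. The sketch below is illustrative only. It assumes flows are plain (source IP, destination IP) pairs and that pod data is shaped like `kubectl get pods -A -o json`. The function names, the use of an `app` label as the workload identity, and the sample addresses are all assumptions.

```python
def build_ip_index(pod_list):
    """Map each pod IP to a 'namespace/app' identity string."""
    index = {}
    for pod in pod_list["items"]:
        ip = pod.get("status", {}).get("podIP")
        if not ip:
            continue  # pod has not been assigned an IP yet
        meta = pod["metadata"]
        # Assumption: workloads carry an `app` label; otherwise use the pod name.
        app = meta.get("labels", {}).get("app", meta["name"])
        index[ip] = f'{meta["namespace"]}/{app}'
    return index


def label_flows(flows, ip_index):
    """Replace raw IPs with workload identities wherever they are known."""
    return [(ip_index.get(src, src), ip_index.get(dst, dst)) for src, dst in flows]


# Hypothetical sample data.
SAMPLE_PODS = {
    "items": [
        {
            "metadata": {"name": "frontend-1", "namespace": "shop", "labels": {"app": "frontend"}},
            "status": {"podIP": "10.244.1.5"},
        },
        {
            "metadata": {"name": "payments-1", "namespace": "shop", "labels": {"app": "payments"}},
            "status": {"podIP": "10.244.2.9"},
        },
    ]
}
SAMPLE_FLOWS = [("10.244.1.5", "10.244.2.9"), ("10.244.2.9", "203.0.113.7")]

for src, dst in label_flows(SAMPLE_FLOWS, build_ip_index(SAMPLE_PODS)):
    print(f"{src} -> {dst}")
```

With workload names attached, an unexpected connection (here, the payments workload talking to an address outside the cluster) is easier to pick out than a pair of raw IPs. Pod IPs are reused over time, so a real join would also need to match on timestamps.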

What you need to adequately monitor and detect malicious activity in Kubernetes is tooling designed specifically for Kubernetes and therefore capable of operating at the right level of abstraction. KSOC does exactly this, using the Kubernetes API itself as its source of truth about what is happening on your cluster, giving it a lens with just the right level of focus.

Likely Sources of False Positives for nDR in Kubernetes

Because the focus of most nDR tooling isn’t properly set to understand activity in a Kubernetes cluster, there is an increased likelihood of false positives, especially for EDR tools. Here are a few common sources of false positives to be aware of:

  • Networking Primitives on Nodes: To create its virtualized network across all workloads, Kubernetes needs to run certain networking primitives on every Node. Exactly what these are will vary depending on which CNI plugin you are using, but they’ll need to do things like maintaining iptables rules, doing local DNS lookups, opening and closing ports on nodes, creating load balancers and ingress points, etc. EDR and XDR (eXtended Detection and Response, which adds additional data streams from network and cloud platform providers) are known to pick up on this activity as a likely indicator of intrusion, sometimes even going so far as to kill the process as malicious.
  • Containers Running as Root: This is increasingly considered a classic container runtime security bad practice for most workloads, but it will at times be necessary to run certain containers/workloads as root, or even in privileged mode. When this is done, all the activity in that container will appear to your EDR to be coming from the root user on the host. Depending on the workload and what it does, this may look highly suspicious to the EDR agents and could result in false positive detections or blocks/prevention, leading to the embarrassing situation of your own security tooling killing your own workloads. Worse, because Kubernetes is designed to maintain the desired state of the cluster, when this happens Kubernetes will often just try to redeploy the workload. This can result in a virtual ping-pong match between Kubernetes and your EDR (but not nearly as fun for your SOC team and whoever is wondering why their workload keeps dying).
  • Internal Container Activity: Similar EDR detections/actions may also occur for non-root container activity, but this is likely to be a bit less suspicious looking because of the lower level of privilege. Also, keep in mind that many nDR tools have some machine-learning/data science informed detection algorithms. This means that the more common the workload you are running (whether across your cluster or across all the customers they serve), the more “baselined” its activity will be and the less likely it is to cause false positives.
  • Developer/Admin Activity: Even if the activity of your running workloads is baselined enough for your EDR not to complain, there will be times when developers or admins hop into the cluster (or even a particular container) to troubleshoot something. Once there, they are likely to run commands that are unexpected, such as elevating privileges, opening new ingresses or ports, or even piping data out to new locations so they can analyze and debug. There is nothing nefarious about any of this development and troubleshooting behavior, but it can easily be mistaken by an nDR agent for signs of an intrusion in the works. One easy way to spot these kinds of false positives: developers repeatedly trying variations on the same commands because they are unexpectedly failing as a result of silent blocking by the EDR agent.
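
Several of these false positives are easier to triage when an alert is enriched with what the pod itself declares. The following is a rough sketch, assuming a pod object shaped like the output of `kubectl get pod <name> -o json`. It reads the `securityContext` fields `privileged` and `runAsUser` and the `restartCount` from `status.containerStatuses`. The function name, the returned keys, and the sample pod are hypothetical, and this is not meant to describe how any particular EDR or Kubernetes-native product works.

```python
def container_risk_context(pod):
    """Return per-container context that may help triage an alert for this pod."""
    pod_sc = pod["spec"].get("securityContext") or {}
    restarts = {
        status["name"]: status.get("restartCount", 0)
        for status in pod.get("status", {}).get("containerStatuses", [])
    }
    context = {}
    for container in pod["spec"]["containers"]:
        sc = container.get("securityContext") or {}
        # A container-level runAsUser takes precedence over the pod-level value.
        run_as_user = sc.get("runAsUser", pod_sc.get("runAsUser"))
        context[container["name"]] = {
            "privileged": bool(sc.get("privileged", False)),
            "explicit_root": run_as_user == 0,
            # When unset, the image's default user applies, which may be root.
            "user_unspecified": run_as_user is None,
            # A high count can hint at the EDR-versus-Kubernetes restart loop.
            "restarts": restarts.get(container["name"], 0),
        }
    return context


# Hypothetical sample pod: a privileged networking agent plus an ordinary sidecar.
SAMPLE_POD = {
    "spec": {
        "securityContext": {"runAsUser": 1000},
        "containers": [
            {"name": "cni-agent", "securityContext": {"privileged": True, "runAsUser": 0}},
            {"name": "sidecar"},
        ],
    },
    "status": {
        "containerStatuses": [
            {"name": "cni-agent", "restartCount": 7},
            {"name": "sidecar", "restartCount": 0},
        ]
    },
}

for name, info in container_risk_context(SAMPLE_POD).items():
    print(name, info)
```

A container that is declared privileged and keeps restarting is a reasonable candidate for the ping-pong scenario described above, and knowing that before opening the alert can save a round of investigation.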

TLDR

The moral of this story is not that you should stop using nDR tooling in your cluster. These tools can and will provide a valuable layer of protection, but they should be approached with caution. Do not assume that all findings are genuine in a Kubernetes environment, and do not assume that all your bases are covered. Instead, you’ll need to tune your nDR tooling to be more Kubernetes-aware and augment it with monitoring and detection tools that are Kubernetes-native. KSOC, built by some of the leading Kubernetes security experts specifically for this purpose, is a great way to get the real-time visibility you need to secure your Kubernetes environment without the misleading noise of an nDR. Check out this demo showing how our real-time capabilities provide the context to spot an attacker trying to get in and out of your clusters between runs of point-in-time Kubernetes posture scanning tools.