Kubernetes Security Blog | RAD Security

From RAGs to Riches: Supercharging Detection Engineering with Retrieval Augmented Generation

Written by Jimmy Mesta | Nov 26, 2024 11:25:56 PM

As detection engineers, we're constantly battling alert fatigue, coverage gaps, and the endless task of writing and tuning detection rules. While we've got solid tools like Sigma, Elastic Rules, and Splunk's SPL in our arsenal, there's still a lot of manual work involved. At RAD Security, we decided to shake things up by leveraging Retrieval Augmented Generation (RAG) to build a smarter detection engineering workflow—and learned a few lessons along the way.

The Detection Engineer's Dilemma

If you're like most detection engineers, you're probably dealing with:

  • Multiple detection formats (Sigma → Elastic → SPL conversions, anyone?)
  • Duplicate detections across platforms
  • The endless "has someone already written this?" question
  • MITRE ATT&CK mapping maintenance challenges

Sound familiar? We thought so too. That's why we turned to RAG to see how it could assist us.

RAG: Beyond the Buzzword

RAG isn't just another AI term—it's a practical approach to augmenting detection engineering workflows. Here's the gist:

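  • Vector embeddings capture the semantic meaning of detection logic, so similar rules sit close together even when their syntax differs.
  • Large Language Models (LLMs) interpret questions and generate answers grounded in the retrieved detections.
  • Knowledge bases handle retrieval, connecting the stored embeddings to the LLM.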

The Tech Stack

In our implementation, we used:

  • AWS Bedrock – for LLM and embedding generation.
  • Pinecone – as a serverless vector database for detections.
  • Python – for glue code and processing.

Choosing the Right Embeddings Model

One of the critical steps in our journey was selecting the appropriate embeddings model in AWS Bedrock's knowledge base. It wasn't as straightforward as we'd hoped. AWS offers several embeddings models, each with different vector dimensions and performance characteristics.

We discovered that choosing the wrong model, or mismatching vector dimensions, led to cryptic error messages. For example, when the embeddings model's output dimensions didn't match the dimension configured on the Pinecone index, AWS Bedrock threw errors that gave little hint of the actual cause. It felt like navigating a beta product, which is somewhat expected but a stark contrast to mature services like EKS and S3.

Pro Tip: Double-check the vector dimensions of your embeddings model and ensure they match the configuration in your vector database. Don't expect AWS Bedrock to hold your hand here—attention to detail is key.
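
A minimal sketch of that double-check in Python, assuming you already know the dimension your index was created with. The function name `check_dimensions` and the example sizes are hypothetical and not part of any Bedrock or Pinecone API; the idea is simply to fail early with a readable message before anything is sent to the vector database.

```python
def check_dimensions(embedding: list[float], index_dimension: int) -> None:
    """Raise a readable error if the embedding size does not match the index."""
    actual = len(embedding)
    if actual != index_dimension:
        raise ValueError(
            f"Embedding has {actual} dimensions but the index expects "
            f"{index_dimension}. Recreate the index or switch embedding models."
        )


# Hypothetical sizes: a 1024-dimension vector against a 1536-dimension index.
vector = [0.0] * 1024
try:
    check_dimensions(vector, index_dimension=1536)
except ValueError as err:
    print(err)
```

Running a check like this once per ingestion batch is cheap and puts the mismatch in your own logs, in your own words.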

Implementation Deep Dive

Detection Processing Pipeline

Here's the high-level flow we designed:

Detection Rules → Embeddings → Vector Database → Knowledge Base → LLM

Key components:

  1. Embedding Model: Converts detection logic into vector representations.
  2. Vector Database: Stores detection embeddings along with metadata.
  3. Knowledge Base: Manages retrieval and provides context.
  4. LLM: Offers intelligent responses and suggestions.
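
To make the flow concrete, here is a toy, self-contained sketch of the same stages. It is an illustration only, not the implementation described in this post: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, `InMemoryVectorStore` stands in for the vector database, `build_prompt` plays the role of the knowledge base handing context to the LLM, and the sample rules are made up.

```python
import hashlib
import math
from dataclasses import dataclass, field

DIMENSION = 256  # toy size; a real embedding model fixes this for you


def toy_embed(text: str) -> list[float]:
    """Hash each token into a bucket, then L2-normalize (stand-in for a real model)."""
    vec = [0.0] * DIMENSION
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIMENSION
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class InMemoryVectorStore:
    """Stand-in for the vector database: keeps (id, vector, metadata) records."""
    records: list = field(default_factory=list)

    def upsert(self, rule_id: str, vector: list[float], metadata: dict) -> None:
        self.records.append((rule_id, vector, metadata))

    def query(self, vector: list[float], top_k: int = 3) -> list:
        # Vectors are normalized, so the dot product is the cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(vector, vec)), rule_id, meta)
            for rule_id, vec, meta in self.records
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


def build_prompt(question: str, hits: list) -> str:
    """Assemble retrieved detections into context for the LLM call."""
    context = "\n".join(
        f"- [{meta['platform']}] {meta['title']}" for _, _, meta in hits
    )
    return f"Use only these detections as context:\n{context}\n\nQuestion: {question}"


# Hypothetical detection rules, described in plain text for embedding.
rules = [
    ("rule-1", "powershell process makes outbound network connection",
     {"platform": "sigma", "title": "PowerShell Network Connection"}),
    ("rule-2", "scheduled task created by schtasks",
     {"platform": "splunk", "title": "Scheduled Task Creation"}),
]

store = InMemoryVectorStore()
for rule_id, text, metadata in rules:
    store.upsert(rule_id, toy_embed(text), metadata)

question = "powershell network connection"
hits = store.query(toy_embed(question), top_k=1)
prompt = build_prompt(question, hits)
print(prompt)  # this string is what would be sent to the LLM
```

Swapping the toy pieces for managed services changes the calls but not the shape: embed, store with metadata, retrieve, then prompt.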

Pinecone vs. Other Vector Databases

We chose Pinecone as our vector database for several reasons:

  • Serverless Architecture: Pinecone offers a fully managed, serverless experience. This means we don't have to worry about provisioning or scaling infrastructure—it automatically adjusts to our workload.
  • Performance: Pinecone is optimized for similarity search, providing low-latency, high-throughput operations.
  • Ease of Use: With a simple API and robust documentation, integrating Pinecone was straightforward.

In contrast, other vector databases might require manual scaling or lack the serverless convenience. In the context of vector databases, "serverless" means you can focus on your application logic without managing the underlying servers or scaling policies. The database scales automatically with your data and query load, which is a huge plus when dealing with large volumes of embeddings.

Real-World Use Cases

1. Detection Development

Example query:

"Show me all detections for PowerShell execution with network connections."

  • Retrieves semantically similar detections.
  • Provides platform-specific implementations.
  • Suggests relevant MITRE ATT&CK mappings.

2. Coverage Analysis

Example coverage check:

"What detection gaps exist for T1055 process injection?"

  • Identifies missing detections in our arsenal.
  • Suggests areas for improvement.
  • Maps coverage across different platforms.
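
The platform-mapping part of a coverage check can also be done deterministically from detection metadata, without involving the LLM. The sketch below assumes each stored detection carries a `platform` and a list of ATT&CK `techniques` in its metadata; those field names, the function `coverage_gaps`, and the sample records are hypothetical.

```python
from collections import defaultdict


def coverage_gaps(detections: list[dict], techniques: list[str],
                  platforms: list[str]) -> dict[str, list[str]]:
    """Return, per technique, the platforms that have no detection for it."""
    covered = defaultdict(set)
    for det in detections:
        for technique in det["techniques"]:
            covered[technique].add(det["platform"])

    gaps = {}
    for technique in techniques:
        missing = sorted(set(platforms) - covered[technique])
        if missing:
            gaps[technique] = missing
    return gaps


# Hypothetical metadata pulled from the vector database.
detections = [
    {"id": "rule-1", "platform": "sigma", "techniques": ["T1055"]},
    {"id": "rule-2", "platform": "elastic", "techniques": ["T1055", "T1059.001"]},
    {"id": "rule-3", "platform": "splunk", "techniques": ["T1059.001"]},
]

gaps = coverage_gaps(detections, ["T1055", "T1059.001"],
                     ["sigma", "elastic", "splunk"])
print(gaps)  # {'T1055': ['splunk'], 'T1059.001': ['sigma']}
```

A table like this gives the LLM hard facts to reason over, so it explains the gaps instead of guessing at them.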

3. Detection Tuning

Example tuning query:

"How are other detections handling potential false positives in Windows Task Scheduler?"

  • Learns from existing implementations.
  • Suggests filter conditions.
  • Provides platform-specific optimizations.

Limitations (The Real Talk)

While RAG has been beneficial, it's important to acknowledge its limitations.

Technical Limitations

1. Embedding Challenges

  • Complex regex patterns don't embed effectively.
  • Platform-specific syntax can confuse the model.
  • Some technical nuances may be lost in translation.

2. LLM Quirks

  • May generate incorrect technical details.
  • Lacks knowledge of the latest tactics, techniques, and procedures (TTPs) due to training cutoffs.
  • Can suggest invalid syntax.

3. AWS Bedrock Hurdles

Navigating AWS Bedrock wasn't all smooth sailing. The platform felt a bit rough around the edges.

  • Cryptic Error Messages: When things went wrong—like mismatched vector dimensions—the error messages were less than helpful.
  • Documentation Gaps: Some features lacked thorough documentation, making troubleshooting a challenge.
  • Beta Feel: Overall, Bedrock feels like it's still in beta, which can be frustrating when you're trying to build production-grade solutions.

Operational Issues

1. Data Dependencies

  • The quality of input data is crucial.
  • Requires regular updates to stay current with new threats.
  • Ongoing quality control is essential.

2. Resource Overhead

  • Vector database costs increase with detection volume.
  • API usage can become expensive.
  • Real-time use demands significant compute resources.

Future Roadmap

At RAD Security, we're continuously looking to enhance our implementation.

Near-Term Improvements

1. Enhanced Context

  • Integrating direct threat intelligence.
  • Understanding attack chains more comprehensively.
  • Improving technical context awareness.

2. Automation

  • Auto-generating detection variants.
  • Facilitating cross-platform detection conversion.
  • Automating performance impact analysis.

Long-Term Vision

1. Advanced Capabilities

  • Predictive detection generation.
  • Developing self-evolving detection frameworks.
  • Automating coverage optimization.

2. Integration Goals

  • Native integration with SIEM and EDR platforms.
  • Real-time detection adaptation.
  • Linking automated incident response actions.

Building It Better

For fellow detection engineers considering a similar path, here are some insights from our journey.

1. Architecture Considerations

Key design principles we followed:

  • Modular Components: For ease of updates and maintenance.
  • Scalable and Cost-Effective Vector Storage: Leveraging serverless solutions like Pinecone to handle growth seamlessly. Be careful when provisioning a vector database on AWS through OpenSearch: the cost of running even the bare-minimum infrastructure may shock you.
  • Efficient Retrieval Methods: Optimizing for low-latency searches.
  • Platform-Agnostic Core: Ensuring flexibility across different detection platforms.

2. Quality Control

Essential practices:

  • Establishing a detection validation pipeline.
  • Regularly measuring effectiveness.
  • Conducting cross-platform testing.
  • Monitoring for false positives and negatives.
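
As a starting point for a validation pipeline, a structural check on LLM-suggested rules catches the cheapest class of errors before anything reaches a SIEM. The sketch below assumes a Sigma-style rule that has already been parsed into a Python dict. It checks only a small subset of what a full validator would: the function name `validate_rule`, the sample rule, and the choice of checks are illustrative.

```python
import re

REQUIRED_FIELDS = ("title", "logsource", "detection")
# Sigma-style ATT&CK technique tags look like "attack.t1055" or "attack.t1053.005".
TECHNIQUE_TAG = re.compile(r"^attack\.t\d{4}(\.\d{3})?$")


def validate_rule(rule: dict) -> list[str]:
    """Return a list of structural problems; an empty list means the rule passed."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in rule
    ]
    if "detection" in rule and "condition" not in rule["detection"]:
        problems.append("detection has no condition")
    if not any(TECHNIQUE_TAG.match(tag) for tag in rule.get("tags", [])):
        problems.append("no ATT&CK technique tag")
    return problems


# Hypothetical LLM suggestion with several structural gaps.
suggested = {
    "title": "Suspicious Scheduled Task",
    "detection": {"selection": {"Image|endswith": "\\schtasks.exe"}},
}

for problem in validate_rule(suggested):
    print(problem)
```

Structural checks like these do not prove a rule is effective; they only filter out suggestions that could never run, which leaves the human reviewer more time for the logic itself.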

3. Future-Proofing

Planning for growth:

  • Designing a flexible schema.
  • Implementing API versioning.
  • Using modular embeddings.
  • Supporting extensible platforms.
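
One way to read "flexible schema" and "modular embeddings" together is to version the metadata record and store which embedding model produced each vector, so a model swap or schema change can be handled record by record. The sketch below is one possible shape under that assumption; `DetectionRecord`, `upgrade`, the field names, and the model name `example-embed-v1` are all hypothetical.

```python
from dataclasses import asdict, dataclass, field

SCHEMA_VERSION = 2


@dataclass
class DetectionRecord:
    rule_id: str
    platform: str
    title: str
    techniques: list[str]
    embedding_model: str       # which model produced the stored vector
    embedding_dimension: int   # must match the index dimension
    schema_version: int = SCHEMA_VERSION
    extra: dict = field(default_factory=dict)  # room for platform-specific fields


def upgrade(raw: dict) -> DetectionRecord:
    """Convert a stored record of any known version to the current schema."""
    data = dict(raw)
    version = data.pop("schema_version", 1)
    if version == 1:
        # Version 1 stored a single technique string instead of a list.
        data["techniques"] = [data.pop("technique")]
    return DetectionRecord(**data)


# A hypothetical version 1 record as it might sit in the vector database.
old = {
    "rule_id": "rule-1",
    "platform": "sigma",
    "title": "Process Injection via CreateRemoteThread",
    "technique": "T1055",
    "embedding_model": "example-embed-v1",
    "embedding_dimension": 1024,
}

record = upgrade(old)
print(asdict(record))
```

Keeping the model name and dimension next to each vector also makes it obvious which records need re-embedding after a model change.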

To RAG or not to RAG

RAG isn't poised to replace detection engineers—and that's a good thing. It's a powerful addition to our toolkit that, when used thoughtfully, can enhance our capabilities. The key lies in understanding both its strengths and limitations, and integrating it in a way that complements our existing workflows.

For those looking to explore similar implementations, focus on:

  1. Quality of your detection data: Ensure your data is accurate and relevant.
  2. Scalability of your architecture: Build with future growth in mind.
  3. Practical use cases for your team: Identify where RAG can have the most impact.
  4. Integration with existing workflows: Seamless integration is vital for adoption.

The future of detection engineering is about augmenting our expertise with AI, not replacing it. By embracing these technologies today, we're paving the way for more efficient and effective detection strategies tomorrow.