BLOG

Dreadnode Updates

Research

PentestJudge: Judging Agent Behavior Against Operational Requirements

Evals are simple, but penetration testing is complicated. Using human-made rubrics, we compare LLMs and humans at judging the performance of a penetration testing agent.
Aug 2025
Shane Caldwell
News

Dreadnode’s Offensive AI Capabilities on Display in Vegas During Black Hat, Def Con, and the AI Security Forum

Welcome to the Offensive AI party. Learn where to find the Dreadnode crew in Vegas for Black Hat, Def Con, and the AI Security Forum.
Aug 2025
Dreadnode Crew
Research

Evaluating Offensive Cyber Agents: Kerberoasting

In this blog, we break down a Kerberoasting agent eval, covering its design, its implementation in Dreadnode’s Strikes SDK and Platform, and the performance of various LLMs tested against it.
Aug 2025
Michael Kouremetis
AI Policy

Five Takeaways from the AI Action Plan

The AI community has been buzzing since the AI Action Plan's release last week, and for good reason: it reads like our policy wishlist from six months ago. Here's what we’re most excited to see implemented.
Jul 2025
Daria Bahrami
Research

Evals: The Foundation for Autonomous Offensive Security

Learn how to build robust evaluations for autonomous red team agents that can perform Windows Active Directory operations. This blog covers action space design, programmatic verification, and measuring model performance using GOAD.
Jul 2025
Shane Caldwell
AI Policy

From Compute to Congress: Setting the Global Standard for AI Security

Daria explores how the TEST AI Act and red teaming standards can establish American leadership in AI security: a winning policy roadmap from Critical Effect DC 2025.
Jun 2025
Daria Bahrami
Research

AI Red Teaming Case Study: Claude 3.7 Sonnet Solves the Turtle Challenge

See how Claude solved a notoriously difficult AI/ML CTF challenge, going beyond pattern matching to genuine problem-solving under adversarial conditions.
Jun 2025
Ads Dawson
Research

Do LLM Agents Have AI Red Team Capabilities? We Built a Benchmark to Find Out

We're excited to introduce AIRTBench, an AI red teaming benchmark that tests LLMs against AI/ML black-box CTF challenges to see how they perform when attacking other AI systems.
Jun 2025
Ads Dawson
AI Policy

Dreadnode Response to the 2025 National AI R&D Strategic Plan

Dreadnode’s response prioritizes initiatives that strengthen AI security through data science and adversarial testing.
Jun 2025
Daria Bahrami
AI Policy

From Compute to Congress: Decoding AI Policy

Read “From Compute to Congress: Decoding AI Policy,” a blog series where we break down cyber and AI policy updates through the lens of security engineers and researchers.
May 2025
Daria Bahrami
Research

The Automation Advantage in AI Red Teaming

Read this data analysis for a large-scale, quantitative comparison between manual and automated attack approaches against Large Language Models (LLMs).
Apr 2025
Rob Mulla
AI Policy

Dreadnode’s Policy Recommendations for the U.S. AI Action Plan

ā€Dreadnode’s AI policy recommendations center on the integration and advancement of AI tools to strengthen America's national security apparatus.
Mar 2025
Daria Bahrami
News

Offensive AI Con Announced: First Conference Dedicated to the Use of AI in Offensive Security

Offensive AI Con (OAIC) is the world's first conference exclusively focused on the intersection of artificial intelligence and offensive cybersecurity, organized by Dreadnode, Remote Threat, and DevSec.
Mar 2025
Dreadnode Crew
News

Dreadnode Secures $14M to Build AI Systems that Advance the State of Offensive Security

Dreadnode announces $14M in Series A funding and releases offensive AI tools to advance the state of offensive security, enabling more effective evaluation, testing, and deployment of AI systems.
Feb 2025
Dreadnode Crew