Five Takeaways from the AI Action Plan

TOC
This is some text inside of a div block.

The AI community has been buzzing since the AI Action Plan's release last week - and for good reason. It reads like our policy wishlist from six months ago. Here's what we’re most excited to see implemented.

1. AI Evaluation and Testing Infrastructure

Dreadnode: In line with our team's release of the AI Red Teaming benchmark, Dreadnode has been a strong proponent of establishing comprehensive AI evaluation standards and testbeds. We have advocated for the creation of software testbeds that simulate real-world operating conditions, building on the TEST AI Act of 2025 for NIST-led evaluation programs, and developing a consortium centered on data quality assurance.

AI Action Plan: In the spirit of furthering the development of high-performing and reliable AI systems, one of the main recommendations is the creation of an AI Evaluations Ecosystem. Easier said than done, but this goal becomes more realistic with investments in the Department of Energy and the National Science Foundation, particularly to establish "AI testbeds for piloting AI systems in secure, real-world settings." Further, the Action Plan recognizes the importance of leveraging a NIST AI Consortium to empower collaborative research and convergence around evaluations and benchmarks.

Why It Matters: The best way to harness AI systems is to understand and measure their capabilities, so large language models (LLMs) or agents can be leveraged and applied to whatever use case(s) they're best suited for. We cannot expect any LLM or agent to be universally competent, so that is where baseline measurements of capabilities—ranging from math skills to language proficiency to resilience against cyberattacks—is particularly important. But the same way that standardized testing doesn't always reliably measure human intelligence, evaluations are subject to similar inadequacies and we will need to manage our expectations accordingly. That is why thoughtful, comprehensive research will be critical to future evaluation success.

2. Dataset Quality and Contamination Detection

Dreadnode: Dreadnode supports quantifiable benchmarks for dataset contamination and bias detection, provides tools for auditing training and fine-tuning corpora, and enables the evaluation of adversarial data insertion and backdoor attacks.

AI Action Plan: The prioritization of building world-class scientific datasets acknowledges that "high-quality data has become a national strategic asset as governments pursue AI innovation goals." The plan coordinates across a range of authorities—including the Office of Management and Budget, the National Science Foundation, and the National Science and Technology Council, specifically the Machine Learning and AI Subcommittee—to tackle this challenge. This multi-agency approach recognizes that different use cases require different solutions, avoiding a silver bullet mentality.

Why It Matters: Any extended amount of research into evaluations will quickly expose the importance of data quality and integrity. In simple terms, this means assessing whether the data driving AI behavior is comprehensive, resistant to manipulation, and capable of mitigating its own biases.

3. Automated Vulnerability Discovery and Remediation

Dreadnode: Having had the distinct opportunity to support DARPA's Artificial Intelligence Cyber Challenge, Dreadnode has good reason to advocate for increased investments in automated vulnerability discovery and remediation (AVDR). We highlighted the importance of transitioning the winning AIxCC systems and harnessing AVDR solutions for national security and cybersecurity posture.

AI Action Plan: The AI Action Plan highlights the importance of streamlining and enabling AI adoption by removing unnecessary process burdens to harness AI's full potential. Complementary recommendations include establishing regulatory sandboxes or AI Centers of Excellence, particularly to develop and distribute AI-enabled cybersecurity tools and defensive capabilities.

Why It Matters: Vulnerability management is an especially onerous process. The level of effort required to identify, assess, prioritize, remediate, and report on vulnerabilities can take anywhere from a few days to six months. During this window, adversaries can exploit unpatched vulnerabilities to gain unauthorized access to enterprise software for malware deployment, sensitive data exfiltration, espionage, and other adversarial goals. Augmenting security teams with automated—dare I say, agentic—solutions is critical to speeding up vulnerability discovery and time to remediation.

4. AI Red Teaming and Adversarial Testing

Dreadnode: As an offensive AI company, Dreadnode has spent a lot of time on red teaming efforts and the emulation of adversarial intent. We cannot overstate the importance of these research applications in AI security testing, which is why we've advocated for an AI-enabled red teaming consortium that can test for goal obfuscation, adversarial goal hacking, agentic manipulation, and AI performance under degraded operational conditions.

AI Action Plan: The dedicated attention to AI interpretability, control, and robustness coupled with a call for investment is particularly encouraging. The AI Action Plan goes so far as to say that the Department of Defense, Department of Energy, the Center for AI Standards and Innovation, the Department of Homeland Security, and the National Science Foundation should work with academic partners to coordinate an AI hackathon. This initiative would be designed "to solicit the best and brightest from U.S. academia to test AI systems for transparency, effectiveness, use control, and security vulnerabilities."

Why It Matters: Amidst mounting reports about AI-enabled cyber threats, red teaming offers additional value by grounding those discussions in reality. When red team efforts are able to successfully emulate adversarial behavior to identify vulnerabilities, misconfigurations, or exploitable weaknesses in a digital ecosystem, it's a lot easier to determine an incident response plan and to harden enterprise perimeters.‍

5. Federal Procurement and Standards

Dreadnode: At minimum, federal contracts should require that AI systems demonstrate superior performance on American-designed evaluations and pass our red team assessments. Procurement-driven standards create instant market pressure, bypassing lengthy regulatory debates while positioning the U.S. to maintain its lead in the global race for AI dominance.

AI Action Plan: By advocating for an AI procurement toolbox, this administration has recognized the value in using executive orders to enable agencies to “choose among multiple models in a manner compliant with relevant privacy, data governance, and transparency laws.”

Why It Matters: Policymakers have an opportunity to ensure that federal agencies and their critical infrastructure partners can only invest in AI systems that meet rigorous American security standards. When the federal government only buys secure, rigorously evaluated AI systems, the entire market shifts. Private sector companies will build to federal standards because that's where the revenue is. The next 12 months are critical. Agencies must implement these procurement standards before the next political transition threatens continuity.

Part 2: What Still Needs to Happen

But alignment on paper isn't implementation. Here's what still needs to happen:

The Implementation Challenge

For these goals to materialize, the federal government will need to follow through with concrete implementation plans. These plans should include:

Investments and staffing support for key agencies:

Department of Defense, Department of Energy, Department of Commerce
NIST: Center for AI Standards and Innovation
Department of Homeland Security, National Science Foundation

Enablement of interagency coordination through Memoranda of Understanding, appropriations funding, and regulatory pathways. The TEST AI Act is an excellent example of Congressional policy that could leverage existing software testbeds for AI evaluation and benchmark development.

Prioritization of accessibility through testbed accessibility, small business engagement, and open source solutions that don't lock out smaller players from the AI security ecosystem.

The Meta-Problem: Politicization is Killing Progress

Six months ago, these recommendations seemed aspirational. Today, they're federal policy. But as we celebrate this alignment, a different challenge has emerged, and it's not technical—it's political. Every transition period costs us months of progress when AI-enabled cyber threats evolve in days.

We've already lost critical time. While policymakers debate ideological frameworks, adversaries are developing AI-powered attack capabilities. While we argue about DEI in algorithms, state actors are building AI weapons. While we fight over power grid sustainability talking points, we're falling behind on the infrastructure needed to support AI innovation.

Technology development shouldn't stop for political cycles. AI security threats don't pause for elections or partisan transitions. Adversaries aren't waiting for us to resolve our political differences before advancing their AI capabilities.

Bipartisan consensus on AI security is possible and necessary. Republicans and Democrats both want secure systems, competitive advantages, and protection from adversaries. The technical requirements for AI security don't change based on who's in office.

Examples where politicization has already cost us: Transition periods inevitably disrupt continuity of operations, and we risk losing critical momentum. Recent budget and staffing cuts have impacted AI-specific grants and review panels at the National Science Foundation, AI research priorities at the Department of Education, and the cyber workforce within the Department of Homeland Security, particularly CISA.

Meanwhile, our competitors aren't pausing. China announced a $1.4 trillion AI investment plan while we reorganize our agencies. Our adversaries are building capabilities unconstrained by political transitions. Institutional knowledge and information transfer cannot be dismissed as bureaucratic overhead—continuity of operations is what will make us stronger as we face urgent cyber threats.

The Path Forward

Success requires investments in R&D, transition pathways to test and operationalize emerging tech, interagency coordination to share resources, leveraging of federal procurement power, and the priority of secure-by-design. This doesn't mean we get it right the first time, but that we learn from past mistakes, build in safeguards where we can, and create tech that is adaptable to an evolving threat landscape.

Complex issues like sustainable power infrastructure and bias mitigation deserve comprehensive solutions, not political soundbites. These challenges require the least divisive approaches possible because the stakes are too high for partisan gridlock.

The bottom line: We know what needs to be done. The AI Action Plan proves we can build consensus around the right policies. Now we need to depoliticize implementation and get back to work.

Let's focus on what works: rigorous evaluation, red teaming, and procurement standards that evolve with technical advances, not electoral cycles.

Audio content more your speed? I was invited on CyberScoop's Safe Mode podcast to discuss the AI Action Plan this week. Listen here or below.

‍