How AI is Transforming Federal Cybersecurity: Proven Use Cases from NR Labs

Foreword

As agencies navigate the growing complexity of cybersecurity threats and compliance demands, Artificial Intelligence (AI) offers a powerful lever to modernize and scale defenses. At NR Labs, our Cyber Innovation practice has developed a robust framework of AI use cases aligned to federal cybersecurity missions. Our approach is grounded in applied research and built on three (3) pillars:

  • Governance, Risk, and Compliance (GRC)/Risk Management Framework (RMF) Automation - AI-powered support for RMF Step6 (Monitor), Authority to Operate (ATO) acceleration, and continuous risk posture assessments for cloud vendors.
  • Security Operations Center (SOC) Enablement - Enhanced threat triage, automated incident response (IR), and intelligence SOC orchestration.
  • Red Teaming & Model Testing - Use of AI to augment adversarial emulation, vulnerability assessments, bias and fair ness evaluations, and safety/alignment testing of AI systems.

We work alongside federal cyber practitioners in across-sector, multi-disciplinary environments. Through a network of partnerships with cutting-edge AI-enabled software platforms, we’re helping our clients test and adopt AI use cases at speed. This blog outlines our journey, findings, and practical applications for federal agencies looking to safely integrate AI into their cybersecurity programs.

Introduction

By now, everyone has heard the hype surrounding AI. But while headlines tout revolutionary breakthroughs, federal cybersecurity programs have seen limited progress in turning promise into practice.

That’s why, in May 2024, NR Labs launched a hands-on initiative to explore how AI can directly support cybersecurity missions. We stood up a Secure AI Testbed in Amazon Bedrock, where we tested adversarial agents andAI-enabled system defenders to simulate real-world cyber scenarios. This included identifying a novel technique that could bypass safety controls present in Amazon Guardrails and then designing mitigations to strengthen AI defenses against this technique through system prompting engineering.

These experiments made one thing clear: securing AI systems is only half the equation. The real value lies in AI-enabled cybersecurity: augmenting the work of cybersecurity professionals, accelerating control assessments, improving triage, and automating low-value tasks, so humans can focus on what matters most.

Background

This initiative is driven by recent federal mandates accelerating the adoption of AI across government. In January 2025, the Trump Administration issued Executive Order (EO) 14179, “Removing Barriers to American Leadership in Artificial Intelligence,” which revoked restrictive prior policies and directed agencies to submit AI action plans within 180 days.The EO emphasizes AI development that is free from ideological bias and prioritizes American-made technologies.

To operationalize the EO, the Office of Management and Budget (OMB) released two (2) key memoranda in April 2025. OMB M-25-21, “Accelerating Federal Use of AI,” requires agencies to appoint Chief AI Officers and establish AI Governance Boards. Agencies must publish AI strategies, manage risks tied to high-impact AI systems, and promote open-source sharing, while aligning with a “Buy American” approach.

OMB M-25-22, “Driving Efficient Acquisition of AI,” outlines lifecycle guidance for procuring AI, emphasizing data protection, intellectual property (IP) rights, and performance-based contracting. It also discourages vendor lock-in and reinforces the preference for US-developed AI solutions.

Together, these policies create a clear mandate for agencies to embed AI into mission workflows with speed, transparency, and accountability.

Use Cases

Automating RMF Step 6: AI in Continuous Monitoring

The RMF Step 6 “Monitor” phase governs the continuous assessment and authorization of systems operating under an active ATO. Across the federal government, over 12,000 Federal Information Security Modernization Act (FISMA)-reportable systems fall under this requirement, each subject to ongoing evaluations involving dozens, if not hundreds, of security controls. The administrative workload to maintain ATO compliance across this portfolio is immense, often demanding hundreds of hours per month per system from assessors and Information System Security Officers (ISSO).

To address this challenge, NR Labs conducted rigorous testing across leading large language model (LLM) platforms, including ChatGPT, Claude, Grok, Gemini, and LLaMA, to evaluate the role of agentic AI in supporting this continuous monitoring process. The findings revealed significant efficiency gains for both independent assessors and system owners.

For assessors, AI can interpret and validate control implementation statements, analyze supporting documentation, flag potential findings, and generate Security Assessment Reports (SAR) in formats tailored to each agency’s requirements. These tasks, which previously consumed substantial analyst hours, can now be reliably automated or accelerated. Notably, while human review remains necessary to verify AI-generated control assessments, AI dramatically reduces the manual burden by highlighting areas of concern, enabling assessors to focus attention where it is most needed.

For ISSOs and system owners (SO), AI streamlines high-frequency, time-consuming tasks such as updating the System Security Plan (SSP), drafting implementation statements, assembling response packages, and managing artifact versioning. These are precisely the areas where human bandwidth is often constrained, delaying compliance and increasing audit risk. Efficiency gains are further amplified when AI is prompted with clearly defined parameters, including precise control identifiers and authoritative references (e.g.,National Institute of Standards and Technology (NIST) Special Publication (SP)800-53 Rev. 5 rather than general mentions of NIST 800-53). Moreover, results improve significantly when input queries provide examples and clearly stated output expectations, a direct reflection of the "Input =Output" principle that governs natural language processing effectiveness.

The efficiency gains are compelling. ISSOs report more than 70% of the time saved per control update, while assessors estimate savings of 200 to 300 hours per month for systems with 100 or more controls. When scaled across multiple systems, the aggregate time savings and cost avoidance are substantial, resulting in reduced operational overhead and faster compliance cycles.

AI offers a clear opportunity to modernize RMF execution at scale, improving both the speed and quality of federal cybersecurity oversight.

Cloud Vendor Risk Management

As federal agencies increasingly rely on cloud services, managing risk across third-party providers is critical. Cloud Vendor Risk Management (CVRM) involves reviewing Federal Risk and Authorization Management Program (FedRAMP) authorization packages, assessing documentation like SSPs and IR Plans, validating control alignment, and ensuring timely updates; all to maintain compliance and continuous authorization.

With over 300 FedRAMP-authorized services supporting federal workloads, including sensitive data like Controlled Unclassified Information (CUI), the scale and frequency of these reviews demand a more efficient approach. NR Labs leverages AI to streamline CVRM by automating several key tasks.

AI verifies document completeness, flags early risks through cross-artifact consistency checks, and validates control alignment using natural language processing (NLP). It also maps content to agency-specific checklists and tracks timeliness, identifying outdated artifacts and testing delays. Importantly, AI’s responsiveness to user feedback, particularly on format, depth, and accuracy of output, makes it a powerful tool in dynamic review cycles. Teams can iteratively refine results through targeted prompts, improving the quality and relevance of AI-generated outputs over time.

However, achieving desired outcomes is an iterative process. Explicit instruction is essential, especially when cross-referencing multiple documents or validating control evidence across diverse artifact types. For instance, reviewers must clearly direct AI to align content between SSPs, IR Plans, and Plans of Action and Milestones (POA&M). Additionally, different tasks may benefit from different models; some are more effective at handling certain file formats or types of reasoning, making it critical to assess and select the right platform for each use case.

These capabilities reduce manual review time by approximately 50%, allowing assessment teams to focus on high-impact analysis while accelerating review cycles. AI-enabled CVRM offers a scalable solution for managing cloud risk in a rapidly expanding federal Information Technology (IT) environment; elevating speed, precision, and adaptability in oversight processes.

Threat Intelligence Triage

Federal SOCs face an overwhelming and growing volume of alerts often exceeding 100,000 events per day in larger agencies. Each alert must be triaged, validated, and either escalated or dismissed, putting immense pressure on analysts and leading to alert fatigue, delayed responses, and potential oversight of critical threats.

NR Labs is applying AI to streamline the threat intelligence triage process in live SOC environments, enabling analysts to respond faster and more effectively. AI reduces the manual burden by automating indicator of compromise (IOC) triage and ticket creation, which shortens the time from threat detection to mitigation and helps agencies respond before issues escalate.

AI’s strength in processing machine-readable data formats, such as JavaScript Object Notation (JSON) and Comma-Separated Values (CSV), allows it to outperform traditional tools when analyzing structured telemetry, log data, and endpoint status (e.g., from platforms like CrowdStrike). This data-centric approach allows for high-fidelity pattern recognition, risk scoring, and alert correlation. In contrast, while AI can generate outputs in formats like Microsoft (MS) Word, PDF, or MS Excel, its decision-making is far more accurate when working with structured inputs.

Moreover, AI enhances threat triage accuracy by correlating contextual data such as geolocation, endpoint posture, and infrastructure relationships to deliver smarter scoring and actionable recommendations. AI-generated insights include embedded mitigation steps, helping analysts act rather than just react. Crucially, these contextual decisions are refined overtime: the more examples and feedback the model receives, the more accurately it can reflect an agency’s real-world threat environment and playbooks.

While our models are trained to align with agency policies and standard operating procedures (SOP), we emphasize the value of AI’s risk-based recommendations in supporting human decision-making. Analysts and SOC leaders remain the ultimate arbiters, but AI adds a layer of informed guidance to help prioritize and respond with greater speed and precision.

By automating the initial analysis and reducing repetitive manual tasks, for instance summarizing long reports into a paragraph that IT teams can action, AI significantly lowers analyst fatigue and improves IR Service Level Agreements (SLA). On average, we anticipate agencies can save approximately 30 hours per analyst monthly, allowing teams to redirect effort toward higher-order threat hunting and incident resolution.

As federal agencies work to modernize their cyber defenses under increasing demand and scrutiny, AI-enabled triage stands out as an immediate force multiplier for SOC automation, performance, and resilience.

Conclusion

AI-enabled cybersecurity is no longer an emerging concept for federal cybersecurity. It is an operational asset reshaping how agencies assess risk, manage controls, and respond to threats. From accelerating RMF monitoring to scaling cloud vendor reviews and augmenting SOC triage, our AI-driven use cases at NR Labs are demonstrating real-world impact today.

But unlocking AI’s full potential requires more than just deploying powerful models. It requires skilled prompt engineering, clear process scaffolding, and iterative refinement. We’ve learned that short, simple prompts, delivered in sequence rather than all at once, consistently outperform complex, one-shot instructions. Designing AI prompts with AI's assistance further improves clarity, retention, and result quality, especially when cross-referencing multiple systems or navigating nuanced compliance artifacts.

This shift also marks the rise of prompt engineering as a core competency. Just as Security+ and Certified Information Systems Security Professional (CISSP) became baseline certifications for cybersecurity practitioners, we expect prompt engineering credentials and expertise with agentic model architecture to become table stakes for those working with AI in federal environments. These skills not only optimize output accuracy but also help improve AI efficiency and reduce overall processing cost yielding measurable performance gains.

Looking ahead, agencies must treat AI not just as a tool, but as a collaborator; one that learns, improves, and adapts with each interaction. The agencies that invest in AI literacy, safe implementation practices, and mission-aligned prompting strategies will be best positioned to lead.

At NR Labs, our Cyber Innovation practice will continue developing and sharing applied research in this space; focusing on real-world impact, transparency, and resilience. If your agency is ready to explore AI’s role in securing systems and advancing mission outcomes, we’re ready to help.

Ready to leverage AI for your cybersecurity mission? Contact NR Labs to learn more about our Secure AI solutions: https://www.nrlabs.com/contact-us