AI Safety: where is the money now?

In our AI safety market deck, you will find everything you need to understand the market
SUMMARY
AI Safety: where is the money now? The money is now flowing into the parts of AI safety that make AI usable in production, especially security, evaluation, agent control, observability, interpretability, and red teaming.
The biggest pattern is that “AI safety” is no longer one clean market. It has split into a stack of practical control layers that appear when AI starts touching users, workflows, data, decisions, and money.
The strongest funding signals sit closest to deployment pain. LMArena, Braintrust, Goodfire, Noma, WitnessAI, Arcade.dev, and Gray Swan all map to concrete buyer questions: can we test this, monitor this, secure this, explain this, and prove control?
AI security looks like the broadest money pool because it plugs into an existing budget. CISOs already buy platforms, already understand attack surfaces, and can treat AI risks as an extension of security rather than a new philosophical category.
AI evaluation is becoming infrastructure because both model builders and enterprises need it. Labs need public and private benchmarks to prove progress, while companies need workflow-specific evals every time a model, prompt, retrieval layer, or tool stack changes.
Agent security is the freshest signal because the risk has moved from speech to action. A chatbot can say something wrong, but an agent can access systems, call tools, move data, approve tasks, and trigger workflows.
Observability is a surprisingly buyer-friendly form of AI safety. It does not always sound like safety, but it gives companies a practical way to catch hallucinations, regressions, broken agents, drift, and customer-facing failures after launch.
Interpretability is the highest-conviction technical bet in the stack. Goodfire’s move from a $50 million Series A to a $150 million Series B at a $1.25 billion valuation shows that investors are paying for a chance to inspect and steer models before behavior appears.
Guardrails are not disappearing, but they are being absorbed. Prompt injection defense, jailbreak protection, policy enforcement, and data leakage controls look increasingly valuable inside larger AI security platforms rather than as narrow standalone middleware.
Red teaming is investable when it becomes repeatable software infrastructure. The market is less exciting as manual consulting, but much more strategic when it continuously attacks, measures, and hardens models and agents.
Governance, synthetic media detection, and model risk management are real categories, but the money signal is more selective. They look strongest when attached to regulated pain, fraud prevention, identity protection, compliance automation, or broader control-plane infrastructure.
The clear conclusion is that AI safety money is following deployability. The winning categories are not the ones with the broadest ethical language, but the ones that help companies ship AI systems without losing control.

This market map, featured in our AI safety market deck, highlights top companies and startups in the AI safety market
What are the company categories in AI safety?
AI safety is not one market anymore. It is now a stack of tools companies need when AI stops being a demo and starts touching users, data, workflows, decisions, and money.
That matters because the money is not flowing evenly.
First, let’s try to understand what the different categories in the market are.
| Category | What it does | Example companies |
|---|---|---|
| AI evaluation and benchmarking | Tests model quality, reliability, safety, and task performance before deployment | LMArena, Braintrust, Patronus AI, Giskard |
| AI observability and production monitoring | Watches AI systems after launch: traces, regressions, hallucinations, drift, and agent behavior | Braintrust, Fiddler AI, Arize AI, WhyLabs |
| AI interpretability and model transparency | Tries to understand what models are doing internally, rather than only judging the final answer | Goodfire, Anthropic interpretability ecosystem, Neuronpedia-style tools |
| AI security platforms | Secures AI models, applications, prompts, data flows, agents, and enterprise AI usage | Noma Security, WitnessAI, HiddenLayer, Protect AI, Pangea, Lakera |
| AI agent security and authorization | Controls what autonomous agents can access, execute, approve, or trigger inside enterprise systems | Arcade.dev, Noma Security, WitnessAI, Lasso Security |
| Guardrails and prompt-layer protection | Blocks prompt injection, jailbreaks, unsafe outputs, data leakage, and policy violations | Lakera, Pangea, Prompt Security, Guardrails AI, Enkrypt AI |
| AI red teaming and adversarial testing | Stress-tests models and agents to find failure modes before attackers or users do | Gray Swan, Haize Labs, Dreadnode |
| AI governance, risk, and compliance | Helps enterprises inventory AI systems, enforce policies, document controls, and prove compliance | Credo AI, Holistic AI, Norm Ai, ValidMind, Fairly AI |
| Synthetic media and deepfake detection | Detects fake audio, video, images, identity attacks, and AI-generated fraud | Reality Defender, GetReal Security, Pindrop, Truepic |
| Model risk management for regulated industries | Validates AI and ML models for banks, insurers, healthcare, and other regulated buyers | ValidMind, Credo AI, Fiddler AI, Arthur AI |
Is money flowing into AI evaluation and benchmarking right now?
Yes. AI evaluation is one of the cleanest places where money is flowing in AI safety right now.
The best signal is LMArena. In January 2026, it raised $150 million at a $1.7 billion post-money valuation. The more interesting part is the speed: that valuation was nearly triple where it stood after its seed round in May 2025. That is an unusually fast repricing for a company that started as a public research benchmark, and it tells us investors are treating evaluation as a core layer of the AI economy, not as a side project.
There is another useful signal in the round structure. Felicis and UC Investments led the Series A, but the cap table also included a16z, Kleiner Perkins, Lightspeed, The House Fund, LDVP, and Laude Ventures. That mix matters because it combines AI-native investors, university-linked capital, and classic venture firms. When that many different investor types crowd into the same evaluation layer, it usually means the category is becoming legible to everyone at once.
Braintrust points in the same direction from the enterprise side. In February 2026, it raised $80 million in Series B funding led by ICONIQ, with a16z, Greylock, Elad Gil, and others coming back in. The follow-on signal is important here. If early investors were only excited by the 2023–2024 AI tooling hype, they would have had reasons to slow down by now. Instead, they doubled down.
What makes this category strong today is that evals are needed by both model builders and model buyers. Frontier labs need benchmarks to prove model progress. Enterprises need private evals to know whether an AI workflow still works after the model, prompt, retrieval layer, or tool stack changes. That is why the category is moving from “nice research leaderboard” to “deployment infrastructure.”
If you want more recent data on this point, please see our latest AI safety market report.

As this chart shows, and as featured in our AI safety market deck, search interest in AI safety has been growing steadily
Is money flowing into AI observability and production monitoring right now?
Yes. AI observability is getting funded because companies are realizing that testing once before launch is not enough.
Braintrust is again the strongest signal. Its $80 million Series B in February 2026 was framed around becoming the observability layer for production AI. That phrase matters because the buyer pain is changing. Teams no longer only ask, “Is this model good?” but rather “What happened when this agent failed yesterday, which prompt changed, which tool call broke, and how do we stop it from happening again?”
Fiddler AI gives a second signal, and it is useful because the investor base is different. In January 2026, Fiddler raised a $30 million Series C with existing investors like Lightspeed, Lux, Insight, Capgemini Ventures, Dallas VC, Dentsu Ventures, and Mozilla Ventures, plus strategic investors like LG Technology Ventures and Benhamou Global Ventures. That is not just venture money chasing a theme. It includes corporate and services-linked investors that care about real enterprise AI deployment.
The third signal is category convergence. Observability is starting to absorb pieces of governance, evaluation, compliance, and security. In normal software, logs and traces tell you what the system did. In AI systems, the observability layer also needs to explain why the model answered, which context it used, which agent step it took, and whether that behavior violated policy. That makes the budget easier to justify because several teams can care at once: engineering, product, legal, compliance, and security.
The interesting point is that observability does not sound like “AI safety” at first. But in practice, it is one of the most buyer-friendly forms of AI safety. Companies will buy an actual way to catch model regressions, hallucinations, broken agents, and customer-facing failures before they become incidents.
All things considered, observability is a strong money zone right now, especially when it is tied to production AI and agent workflows.
Is money flowing into AI interpretability right now?
Yes. Interpretability is one of the most surprising places where capital has become very serious.
Goodfire is the signal to watch. In April 2025, it raised a $50 million Series A. Then in February 2026, it raised a $150 million Series B at a $1.25 billion valuation. That means the company went from major early-stage funding to unicorn territory in less than a year. For a category that used to feel highly academic, that is a real change.
The investor list makes the signal stronger. Goodfire’s Series B included B Capital, Juniper Ventures, DFJ Growth, Salesforce Ventures, Menlo Ventures, Lightspeed, South Park Commons, Wing, and Eric Schmidt. Earlier participation from Anthropic also matters because frontier labs understand the technical problem better than almost anyone. If this were just a vague “trustworthy AI” story, that kind of technical credibility would be harder to explain.
The valuation also says something. A $1.25 billion valuation on a $150 million round is not cheap, and interpretability does not have the simplest enterprise sales motion today. That is the point. Investors are paying for a chance that interpretability becomes a control layer for how models are designed, inspected, steered, and trusted.
The non-obvious signal is that interpretability is moving upstream. Evaluation asks whether the model gave the right answer. Observability asks what happened in production. Interpretability asks what is happening inside the system before the behavior appears. If that works, even partially, it becomes relevant to model design, safety testing, fine-tuning, governance, and eventually regulation.
So it looks like interpretability is a high-conviction but high-risk money zone.
If you want more recent data on this point, please see our latest AI safety market report.

This chart, featured in our AI safety market deck, shows annual venture capital investment in AI safety startups
Is money flowing into AI security platforms right now?
Yes. AI security is probably the broadest and most validated money pool in AI safety right now.
Noma Security is the clearest startup signal. In July 2025, it raised a $100 million Series B less than a year after its previous round. That kind of timing matters. A big round is one signal; a big round quickly after the last one is stronger because it means the company was able to create enough momentum for investors to move again before a normal fundraising cycle.
WitnessAI adds another strong signal from January 2026. It raised $58 million in strategic funding, bringing total funding above $85 million. The investor mix is important: Sound Ventures led, but Qualcomm Ventures and Samsung Ventures also participated. That tells us AI security is not only a cloud software problem. It is also relevant to devices, edge AI, enterprise hardware, and distributed AI systems.
The biggest validation, though, comes from M&A. Palo Alto Networks completed its acquisition of Protect AI in July 2025. CrowdStrike moved to acquire Pangea in September 2025 to build AI Detection and Response. Check Point also agreed to acquire Lakera to build an end-to-end AI security stack. Three major cybersecurity platforms moving into AI security within the same window is a much stronger signal than one startup round.
This is where the market becomes easy to understand. CISOs already have budget. They already buy platforms. They already understand attack surfaces. If AI creates new risks around prompts, models, agents, data leakage, and tool use, then AI safety can plug into an existing security budget instead of begging for a new one.
At the end of the day, AI security is where “AI safety” becomes easiest to buy. That is why the money is currently flowing so clearly.
Is money flowing into AI agent security and authorization right now?
Yes. Agent security is probably the freshest money signal in the whole AI safety market right now.
Arcade.dev is the most recent proof. In June 2026, it raised a $60 million Series A led by SYN Ventures, with participation from Morgan Stanley and Wipro. That is a very current signal, and the category focus is narrow in a useful way: agent authorization. Arcade is not just saying “secure AI.” It is saying agents need a secure action layer before they can access apps, databases, APIs, and enterprise tools.
The seed-to-Series-A jump also matters. Arcade had raised a $12 million seed in 2025, then moved to a $60 million Series A in 2026. That tells us investors are quickly repricing the agent control layer as agents move from chat interfaces to systems that actually do things. The company also sits close to MCP and A2A-style infrastructure, which makes it more relevant as enterprises start connecting agents to real tools.
Noma and WitnessAI reinforce the same point. Noma’s $100 million Series B was explicitly about AI and agent security. WitnessAI’s $58 million strategic round came with expanded agentic AI security and governance capabilities. When several well-funded companies start using the same “agent security” language at the same time, it usually means buyers are beginning to describe the problem that way too.
The reason this category feels hot now is simple: the risk moved from speech to action. A chatbot can say something wrong. An agent can open a system, call a tool, move data, approve a task, or trigger a workflow. That turns safety into access control, policy enforcement, runtime monitoring, and audit trails.
So we can conclude that agent security is becoming its own money pocket, not just a feature inside AI security. It is still early, but the signal is unusually fresh and unusually practical.
If you want more recent data on this point, please see our latest AI safety market report.

This chart, featured in our AI safety market deck, shows how HiddenLayer is positioned in AI safety
Is money flowing into guardrails and prompt-layer protection right now?
Yes, but the money is starting to look more like consolidation than pure standalone company creation.
Prompt Security raised an $18 million Series A led by Jump Capital, with participation from Okta and F5. That investor mix is more interesting than the round size. Okta points toward identity and access control. F5 points toward application security and traffic infrastructure. In other words, prompt-layer protection is already being pulled toward the broader enterprise security stack.
Lakera is another important signal. Check Point agreed to acquire it in September 2025 to build an end-to-end AI security platform for enterprises, especially around agentic AI applications. That tells us guardrail technology is valuable, but maybe not always as a standalone end market. It may be more valuable when bundled into the larger AI security stack.
Pangea makes the same point from another angle. CrowdStrike moved to acquire it to launch AI Detection and Response across data, models, agents, identities, infrastructure, and interactions. That is a broader frame than “prompt guardrails,” but prompt-layer protection is one of the pieces inside it.
Guardrails are not dying. They are being absorbed. Buyers still need prompt injection defense, jailbreak protection, data leakage controls, unsafe-output filtering, and policy enforcement. But many enterprises will probably buy these capabilities from a larger security platform rather than as a standalone middleware tool.
Is money flowing into AI red teaming and adversarial testing right now?
Yes. Red teaming has become much more investable now that it looks like software infrastructure, not just expert services.
Gray Swan is the strongest recent signal. In May 2026, it raised a $40 million Series A, and the company says its platform is used by every major frontier lab. The investor list also matters: Wing, Madrona, Obvious Ventures, Snowflake Ventures, Hudson River Trading, and Samsung Next. That is a mix of venture, data infrastructure, trading, and strategic technology capital. It suggests buyers see red teaming as part of serious AI deployment, not just a compliance checkbox.
The second signal is the company’s positioning. Gray Swan is not only selling “we can test your model.” It is building adversarial testing infrastructure for models and agents. That distinction matters because manual red teaming does not scale well. If every model update, prompt change, tool connection, and agent workflow needs safety testing, the market needs repeatable attack systems.
Haize Labs adds another useful proof point, even though its round is older. It reportedly attracted strong investor interest around a $100 million valuation and works around automated LLM stress testing. The important thing here is the buyer pattern: frontier labs and AI-heavy enterprises need faster ways to find failure modes before release.
Also, red teaming is becoming continuous. In older security workflows, penetration testing was often periodic. With AI, the model, prompt, retrieval corpus, tools, and agent policy can all change quickly. That pushes red teaming closer to CI/CD and production monitoring.
Finally, we can say red teaming is a strong category when it becomes infrastructure. It is less exciting as a consulting market, but much more exciting when it helps teams continuously attack, measure, and harden AI systems.
If you want more recent data on this point, please see our latest AI safety market report.

This chart, featured in our AI safety market deck, shows annual funding in AI safety startups
Is money flowing into AI governance, risk, and compliance right now?
Yes, but the money signal is more selective than in AI security or evaluation.
Norm Ai is the strongest recent proof. In March 2025, it announced $48 million in funding from Coatue, Craft, Vanguard, Blackstone, Bain Capital, New York Life Ventures, Citi Ventures, TIAA Ventures, and Marc Benioff, bringing total funding to $87 million over 18 months. That is a very specific investor list. It is full of regulated-industry, financial, insurance, and compliance-adjacent names. The money is coming from people who feel regulatory pain directly.
The second signal is the product angle. Norm Ai is not just selling a dashboard for AI policies. It is turning regulations into compliance AI agents. That matters because governance buyers are tired of static documentation. They need systems that can read rules, check behavior, produce evidence, and help teams move faster without losing control.
Credo AI and ValidMind show the broader buyer pull. Credo is positioned around AI governance, risk, compliance, inventories, policy enforcement, and proof for frameworks like the EU AI Act and NIST. ValidMind is more focused on regulated industries like banking and insurance, and now frames itself around agentic AI governance as well as model risk management.
The weaker point is funding intensity. Compared with Goodfire, LMArena, Noma, or Braintrust, governance rounds are not exploding at the same pace. That tells us the category is real, but the market may be more regulation-led, more vertical, and slower to scale.
So it looks like governance is a solid money zone, but not the hottest one.
Is money flowing into synthetic media and deepfake detection right now?
Yes, but the money is flowing into fraud and identity protection more than broad truth detection.
GetReal Security is one signal. In March 2025, it raised a $17.5 million Series A led by Forgepoint, with Ballistic Ventures, Evolution Equity, In-Q-Tel, Cisco Investments, and Capital One Ventures participating. That mix is useful: cyber investors, government-adjacent capital, and enterprise strategic money all showed up.
Reality Defender adds another signal. It expanded its Series A to $33 million and positions itself around real-time detection across audio, video, imagery, and text. The company also talks about call-center deployment, which is important because that is where deepfakes become a measurable business problem rather than a vague social threat.
Pindrop is the strongest adoption signal. It surpassed $100 million in annual recurring revenue, and its voice authentication and deepfake detection technology is tied to fraud prevention in banks, contact centers, and enterprise voice workflows. That is a more mature signal than a startup round because ARR tells us buyers are already paying at scale.
The category is also being pulled by live fraud data. Retailers and large enterprises are now dealing with AI-generated calls, executive impersonation, and synthetic identity attacks. When one large retailer can see more than 1,000 AI-generated calls per day, the problem becomes operational, not theoretical.
All things considered, deepfake detection is investable where the buyer has money at risk.

This chart, featured in our AI safety market deck, compares the main business model options for AI alignment research labs
Is money flowing into model risk management for regulated industries right now?
Some money is flowing, but this is not where the hottest AI safety capital is going right now.
ValidMind is the cleanest example. It raised $8.1 million in seed funding to help financial institutions with model risk management and AI governance, bringing total funding to around $11 million. That is meaningful, but it is a very different scale from the $50 million to $150 million rounds we are seeing in interpretability, evaluation, and AI security.
Banks, insurers, and healthcare companies absolutely need model validation, audit trails, risk tiers, approvals, and board-ready reporting. The issue is that the market is more specialized. Sales cycles are slower, buyers are more regulated, and the product often needs to fit into existing risk processes.
The more interesting signal is that this category is being pulled into agentic AI governance. ValidMind now talks about governing agents through risk frameworks, policy-as-code, real-time hooks, immutable audit trails, reasoning traces, tool-call logs, and policy evaluation records. That is a much stronger angle than traditional model documentation.
So the category is alive, but the money is more cautious. Model risk management looks strongest when it becomes part of a broader AI control plane for regulated industries. As a narrow standalone category, it is less explosive.
So where is the money in AI safety right now?
Right now, the money is flowing into the parts of AI safety that make AI deployable.
The center of gravity is not “ethics tooling” in a broad sense. It is practical control infrastructure: evaluation, security, observability, agent authorization, interpretability, red teaming, and compliance automation. Investors are backing the layers that help companies answer five basic questions: did we test the AI, can we monitor it, can we secure it, can we explain it, and can we prove we controlled it?
The strongest pattern is that the freshest money goes where AI systems start acting like production software or autonomous workers.
That is why LMArena, Braintrust, Goodfire, Noma, WitnessAI, Arcade.dev, and Gray Swan stand out. They sit close to deployment pain, and deployment pain is where budget appears.
| Rank | Category | Signals that prove money is flowing |
|---|---|---|
| 1 | AI security platforms | Noma raised $100M less than a year after its prior round; WitnessAI raised $58M with Qualcomm and Samsung; Palo Alto bought Protect AI; CrowdStrike moved on Pangea; Check Point moved on Lakera |
| 2 | AI evaluation and benchmarking | LMArena raised $150M at $1.7B, nearly triple its seed valuation; Braintrust raised $80M with strong follow-on investors; evals are becoming required before and after deployment |
| 3 | AI agent security and authorization | Arcade.dev raised $60M in June 2026; Noma and WitnessAI now speak directly to agent security; MCP/A2A/tool use makes agent permissions an urgent problem |
| 4 | AI interpretability | Goodfire raised $150M at $1.25B after a $50M Series A less than a year earlier; Anthropic-linked credibility; investors are paying for technical scarcity |
| 5 | AI observability and production monitoring | Braintrust and Fiddler both raised in early 2026; observability is merging with evals, governance, agent traces, and production control |
| 6 | AI red teaming and adversarial testing | Gray Swan raised $40M in May 2026 and is tied to frontier-lab testing; Haize Labs shows demand for automated stress testing; red teaming is becoming continuous infrastructure |
| 7 | Guardrails and prompt-layer protection | Prompt Security raised from Jump, Okta, and F5; Lakera and Pangea were pulled into major cyber platforms; guardrails are becoming a layer inside AI security |
| 8 | AI governance, risk, and compliance | Norm Ai raised $48M from regulated-industry investors; Credo and ValidMind show enterprise pull; strongest angle is active compliance automation |
| 9 | Synthetic media and deepfake detection | GetReal raised $17.5M; Reality Defender expanded to $33M; Pindrop passed $100M ARR; strongest buyer pull is fraud, identity, and call-center protection |
| 10 | Model risk management for regulated industries | ValidMind and others solve real regulated-enterprise pain, but disclosed funding is smaller and sales cycles look slower than in security or evals |
If you want more recent data on this point, please see our latest AI safety market report.

This chart, featured in our AI safety market deck, shows revenue breakdown by customer segment in the AI safety market
OUR METHODOLOGY
This analysis tests where money is flowing in AI safety based on the evidence available today. We compare the main categories through funding rounds, valuation changes, acquisitions, strategic investors, revenue proof, buyer urgency, and product positioning.
We broke the market into practical analytical dimensions: evaluation, observability, interpretability, security, agent control, guardrails, red teaming, governance, synthetic media detection, and model risk management.
We gave more weight to signals that show direct deployment pain or clear budget ownership. That matters because those signals make a category easier to buy, not just easier to discuss.
Funding size was not the only signal. We also looked at speed of repricing, follow-on investor behavior, corporate and strategic investor participation, M&A by major cybersecurity platforms, ARR proof, and whether the product maps to a budget owner.
The ranking reflects where the strongest current evidence clusters today. It is not a permanent view of which AI safety categories matter most technically; it is a comparison of where capital, strategic activity, and buyer demand appear most visible right now.
Key sources used for this analysis include: PR Newswire on LMArena’s $150 million Series A and $1.7 billion valuation, TechCrunch on LMArena’s valuation and company context, Braintrust on its $80 million Series B, Axios on Braintrust’s AI observability round, Goodfire on its $150 million Series B and $1.25 billion valuation, Noma Security on its $100 million Series B, PR Newswire on WitnessAI’s $58 million strategic funding, Palo Alto Networks on its Protect AI acquisition, CrowdStrike on its Pangea acquisition, Check Point on its Lakera acquisition, Yahoo Finance on Arcade.dev’s $60 million Series A, The Wall Street Journal on Arcade.dev and agent security, Gray Swan on its $40 million Series A, Forbes on Gray Swan and frontier-lab pressure testing, Prompt Security on its $18 million Series A, Norm Ai on its $48 million funding round, PR Newswire on GetReal Security’s $17.5 million Series A, PR Newswire on Reality Defender’s expanded $33 million Series A, PR Newswire on Pindrop surpassing $100 million ARR, and ValidMind on its $8.1 million seed funding.

This chart, featured in our AI safety market deck, shows how prompt injection defense platform technology has evolved over time
Related blog posts
- How strong is fundraising in the AI safety market right now?
- Which startups have raised the most funding in the AI safety market?
- Which startups are the most valued in the AI safety market?
Who is the author of this content?
NEW MARKET PITCH TEAM
We track new markets so founders and investors can move fasterWe build living "market pitch" documents for emerging markets: AI, synthetic biology, new proteins, and more. Instead of outdated PDFs or hallucinated LLM answers, our clients get a clean, visual, always-updated view of what's really happening: key players, deals, regulations, and signals that matter. Learn more about us.