AI Safety: what are the top startups now?

Last updated: 17 June 2026

SUMMARY

The top AI safety startups right now are Anthropic, Safe Superintelligence, LMArena, Braintrust, Goodfire, Noma Security, WitnessAI, Gray Swan, Cinder, and Credo AI.

The category is no longer one vague “AI safety” bucket. It is becoming a stack of control points around specific failure modes: model comparison, production evaluation, agent security, red-teaming, interpretability, governance, trust and safety, and deepfake defense.

Anthropic and Safe Superintelligence still dominate the frontier-lab conversation, but for opposite reasons. Anthropic has scale, customers, and production impact, while SSI has founder signal and pure safe-superintelligence ambition.

The more interesting startup action is happening below the frontier-lab layer. Companies like Braintrust, Noma, WitnessAI, Goodfire, Gray Swan, Cinder, and LMArena are turning safety into infrastructure that buyers can understand and budget for.

Evaluation is splitting into two markets. LMArena is becoming the public model scoreboard, while Braintrust is becoming the workflow layer for teams that need to know whether their AI product got worse after a prompt, model, tool, or retrieval change.

Agent security looks like one of the strongest near-term markets because it maps directly to enterprise risk. Noma leads on capital velocity, WitnessAI leads on disclosed adoption momentum, and Vijil is carving out a sharper agent-resilience role.

Red-teaming has moved from research theater to commercial infrastructure. Gray Swan stands out because it combines a recent $40 million Series A, frontier-lab usage claims, and a large hacker network built to pressure-test real frontier systems.

Interpretability is the deeper technical bet. Goodfire’s jump to a $1.25 billion valuation suggests investors now believe model internals may become a fundable product category, not just a frontier-lab research discipline.

Runtime AI security is real, but the market is already consolidating. The acquisitions of Lakera, CalypsoAI, and Protect AI show demand, while also suggesting that first-wave guardrail APIs may become features inside larger cybersecurity platforms.

Governance matters, but the startup evidence is weaker right now. Credo AI still leads, while Holistic AI and Saidot remain credible challengers, yet the category has less fresh funding and customer proof than evals, security, red-teaming, or interpretability.

Trust and safety is being rewritten by synthetic content. Cinder looks especially strong because it is positioned around stopping AI-generated abuse before it scales, rather than only moderating harmful content after publication.

The clearest pattern is that the best AI safety startups are not selling abstract responsibility. They are owning concrete moments where AI systems fail: compare the model, monitor the app, secure the agent, red-team the behavior, inspect the model, govern the workflow, and stop synthetic abuse.

This market map, featured in our AI safety market deck, highlights top companies and startups in the AI safety market

Which AI safety startups are already too obvious to ignore?

Anthropic and Safe Superintelligence are still the two obvious frontier-level names, but they are obvious for completely different reasons.

Anthropic is the scaled safety-native lab. By February 2026, it had announced a $30 billion Series G at a $380 billion post-money valuation. At that level, it barely behaves like a startup anymore, but it still defines the category because many AI safety debates are really debates about whether frontier labs can build more capable systems without losing control of reliability, misuse, and autonomy risks.

Safe Superintelligence is the purer safety moonshot. It reportedly raised about $2 billion in 2025 at roughly a $32 billion valuation, despite having no public product. That does not prove revenue, but it does show that investors are willing to price “safe superintelligence from Ilya Sutskever’s team” like a once-in-a-cycle technical bet.

The hierarchy is pretty clear. Anthropic is the leader if we care about scale, customers, and production impact. SSI is the leader if we care about founder signal and frontier-safety ambition. For the rest of this article, the more useful question is which smaller startups are turning AI safety into infrastructure people actually buy.

If you want more recent data on this point, please see our latest AI safety market report.

Which startup is becoming the scoreboard for AI models?

LMArena is the clear standout here.

LMArena raised $150 million in January 2026 at a $1.7 billion post-money valuation, nearly triple its May 2025 valuation. That is the strongest recent funding signal in model evaluation. It also tells us something about the market: investors are not treating model ranking as a side project anymore. Instead, they are treating it like infrastructure.

The reason LMArena beats most benchmark startups right now is distribution and legitimacy. A normal eval company sells testing tools to enterprises. LMArena sits closer to the public scoreboard layer, where researchers, builders, and users compare frontier models through human preference data. That is a stronger position because the market constantly asks the same simple question: “Which model is actually better this month?”

Braintrust is better for production teams, Patronus is more enterprise-eval focused, and Gray Swan is stronger for adversarial safety testing. But if the question is “who owns the model-comparison conversation today?”, LMArena is the name that comes first.

As this chart shows, and as featured in our AI safety market deck, search interest in AI safety has been growing steadily

Which startups are making AI evaluation useful inside companies?

Braintrust leads this category, with Patronus AI and Vijil behind it in more specialized roles.

Braintrust’s edge is that it moved evaluation from a pre-launch checklist into the production workflow. In February 2026, it raised an $80 million Series B at an $800 million valuation. That round came after the company had already positioned itself around traces, evals, prompt changes, model swaps, and regressions in live AI products.

That makes Braintrust stronger than a pure benchmark company for enterprise teams. A benchmark tells you whether a model looks good in a controlled setting. Braintrust helps teams answer a more painful question: “Did our AI app get worse after we changed the prompt, the model, the tool, or the retrieval layer?”
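
To make the regression question concrete, here is a minimal sketch of a before/after comparison on a fixed set of eval cases. It is not Braintrust's API or method: the function names, the case names, the scores, and the 0.02 tolerance are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def detect_regressions(baseline, candidate, tolerance=0.02):
    """Return the eval cases whose score dropped by more than `tolerance`.

    `baseline` and `candidate` map a case name to a score in [0, 1].
    Cases missing from `candidate` are skipped rather than counted.
    """
    regressed = []
    for case, old_score in baseline.items():
        new_score = candidate.get(case)
        if new_score is not None and old_score - new_score > tolerance:
            regressed.append(case)
    return sorted(regressed)


def summarize(baseline, candidate, tolerance=0.02):
    """Compare two eval runs over the cases they share."""
    shared = [case for case in baseline if case in candidate]
    delta = mean(candidate[c] for c in shared) - mean(baseline[c] for c in shared)
    return {
        "mean_delta": round(delta, 4),
        "regressed_cases": detect_regressions(baseline, candidate, tolerance),
    }


# Hypothetical scores before and after a prompt change.
baseline = {"refund_policy": 0.90, "order_status": 0.80, "escalation": 0.70}
candidate = {"refund_policy": 0.92, "order_status": 0.60, "escalation": 0.71}

report = summarize(baseline, candidate)
print(report)
```

In this made-up run, two cases improve slightly while one drops sharply, so the per-case view flags a regression that a quick glance at individual improvements could miss. That is the kind of signal a production eval workflow is meant to surface after a prompt, model, tool, or retrieval change.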

Patronus AI deserves a place because it is focused directly on LLM evaluation, security, and adversarial testing for enterprises. But its last disclosed $17 million Series A is much smaller than Braintrust’s latest round, so the public scale signal is weaker.

Vijil is not trying to own all AI observability. Its sharper angle is agent resilience. The company raised $17 million in November 2025 and said SmartRecruiters used its platform to cut “time-to-trust” by 75%. That is a better customer outcome than most young safety startups disclose, but it still places Vijil as a focused agent-testing company rather than the broad eval platform leader.

So the ranking is simple: Braintrust first for production AI observability, Patronus for enterprise LLM testing, Vijil for agent resilience, and LMArena for the public model scoreboard.

Which startups are best positioned to secure AI agents?

Noma Security and WitnessAI are the current leaders, with Vijil and Gray Swan as the strongest emerging challengers.

Noma has the biggest capital signal in agent security. In July 2025, it raised a $100 million Series B less than a year after a $32 million Series A. Some reports put the valuation around $400 million. That matters because agent security is still early; a round of that size means investors already see a large enterprise budget forming around AI agents.

WitnessAI has the stronger usage-growth story. In January 2026, it raised $58 million from investors including Sound Ventures, Qualcomm Ventures, Samsung Ventures, Fin Capital, and Forgepoint. More importantly, it reported more than 500% ARR growth over 12 months, hundreds of thousands of enterprise employees and applications protected, and a fivefold headcount increase.

That comparison is useful. Noma looks stronger on fundraising velocity and platform ambition. WitnessAI looks stronger on disclosed adoption momentum. If we had to pick one “default enterprise agent-security platform” today, Noma probably has the capital advantage. If we care more about customer expansion proof, WitnessAI is just as serious.

Vijil is smaller but worth watching because it gives buyers a specific operational promise: build, test, deploy, and harden agents while reducing time-to-trust. Gray Swan is less of a deployment-control platform and more of a serious adversarial testing company. That makes it especially relevant when the agent is powerful enough to need real red-teaming, not just policy checks.

This chart, featured in our AI safety market deck, shows annual venture capital investment in AI safety startups

Which startups are doing the serious red-team work?

Gray Swan is the hottest name here, with Apollo Research and METR also important for frontier credibility.

Gray Swan raised a $40 million Series A in May 2026. For a young AI safety company, that is already a strong signal. But the more important proof is qualitative and very specific: the Carnegie Mellon spinout says its platform is used by Anthropic, OpenAI, and Meta, and Forbes reported that it works with an army of 15,000 hackers to pressure-test systems like Claude, GPT-5, and Gemini.

That makes Gray Swan different from a normal enterprise security startup. It is building around adversarial testing of actual frontier systems and agents. In a market full of vague safety claims, that gives it unusually strong credibility.

Apollo Research is different. It sits closer to frontier-risk research, scheming evals, and oversight. In January 2026, Apollo said it was becoming a public benefit corporation and setting up a product team focused first on AI agent monitoring. That is early commercially, but it matters because Apollo is trying to turn deep safety research into monitoring products.

METR remains one of the most respected independent evaluation groups for frontier models. It is not the obvious venture-backed startup story, so it is harder to rank beside Gray Swan. But if the question is “who has technical credibility in frontier model evaluation?”, METR still belongs in the conversation.

The current order is Gray Swan first for commercial red-teaming momentum, Apollo for agent-monitoring ambition rooted in frontier-safety research, and METR for independent eval credibility.

Which startups are closest to real interpretability breakthroughs?

Goodfire is the clear venture-backed leader, while Transluce is the independent oversight wildcard.

Goodfire is now the interpretability company to watch. It raised a $50 million Series A in April 2025, then a $150 million Series B in February 2026 at a $1.25 billion valuation. That jump matters because interpretability has historically looked more like frontier-lab research than a standalone company. Goodfire is the first startup making it look like a fundable product category at serious scale.

Compared with most AI safety startups, Goodfire is playing a deeper technical game. Guardrail companies try to block bad outputs. Eval companies try to test behavior. Goodfire is trying to understand and steer the model’s internal mechanisms. If that works, it gives model builders a better way to debug and design systems, not just monitor them from the outside.

Transluce is much smaller and less commercial, but it is interesting for another reason: it is building open oversight tools and helped release AEF-1, a standard for independent third-party AI evaluations. It also says it has served as a contractor to the EU AI Office. That gives Transluce a policy and evaluator-ecosystem angle that Goodfire does not emphasize as much.

Today, Goodfire is clearly first on venture scale and technical-commercial ambition. Transluce is not trying to beat Goodfire on funding; it is trying to shape how independent oversight works.

This chart, featured in our AI safety market deck, shows how HiddenLayer is positioned in AI safety

Which startups are strongest in AI runtime security?

Noma Security, WitnessAI, HiddenLayer, Vijil, and Enkrypt AI stand out, but the old guardrail category is already being absorbed by larger cybersecurity companies.

Noma and WitnessAI lead the independent pack because they are closest to where budgets are moving now: agents, enterprise AI access, data flows, and runtime controls. Their advantage over older guardrail startups is that they are actually asking “what can this AI system access, what can it do, and how do we stop it before it causes damage?”

HiddenLayer is still important because it focuses on AI model security across discovery, supply-chain risk, attack simulation, and runtime defense. Its last major disclosed round was a $50 million Series A in 2023, so it is not the freshest growth story. But it remains one of the more established names in AI asset protection.

Vijil is more focused on agent hardening and runtime resilience. Enkrypt AI is earlier and has less public proof, but it fits the direction of the market: enterprises want a control layer that can detect vulnerabilities, enforce policy, and monitor model behavior continuously.

The important market signal is consolidation. Lakera was acquired by Check Point in 2025. CalypsoAI was acquired by F5 in 2025. Protect AI was acquired by Palo Alto Networks in 2025. That tells us runtime AI security is real, but it also tells us a lot of the first-wave guardrail tooling is becoming a feature inside bigger security platforms.

Which startups are actually winning AI governance?

Credo AI still looks like the leader, but this is a weaker category on fresh startup evidence than security, evals, or interpretability.

Credo AI is the established governance name. It raised $21 million in July 2024, bringing total funding to about $41.3 million. The company has spent years building around AI policy management, risk controls, documentation, and governance workflows.

Holistic AI, Saidot, Fairly/Asenion, Monitaur, Singulr AI, and CTGT are all worth tracking, especially as the EU AI Act pushes companies toward inventories, audits, model documentation, and risk classifications. But the public signals are thinner. We do not see the same recent funding velocity as Noma, Braintrust, Goodfire, LMArena, or Gray Swan.

That does not make governance unimportant. It means the best startup proof today is weaker. Governance is a board-level problem, but buyers are still figuring out whether they want a dedicated governance platform, a GRC extension, a cloud feature, or a security-platform module.

So the honest ranking is Credo first, Holistic AI and Saidot as credible challengers, and the rest in watchlist mode until we see fresher funding, customer, or revenue evidence.

Chart showing the projected CAGR of the AI safety market

This chart, featured in our AI safety market deck, shows annual funding in AI safety startups

Which startups are protecting platforms from AI-generated abuse?

Cinder is the strongest recent standout.

Cinder raised a $41 million Series B in 2026 led by Radical Ventures. The company also says it protects more than 3 billion end users, processes hundreds of millions of events daily, and automates a large share of human review. Those are unusually concrete scale signals for a trust-and-safety startup.

Cinder also has the right customer direction. Its platform is used by companies dealing with AI-generated abuse, fraud, CSAM, NCII, and platform manipulation. The Synthesia partnership is especially telling because the safety check happens before avatar content is generated. That is where the market is going: stop dangerous synthetic content before it exists, instead of moderating it only after publication.

ActiveFence, now Alice, is still relevant as a more mature trust-and-safety platform. But Cinder is the fresher AI-era story. It has the recent round, the scale claims, and the customer pattern that fits today’s problem: bad actors can generate abuse faster than human moderators can review it.

Which startups matter most in deepfake and synthetic-media defense?

Pindrop leads on commercial scale, while Reality Defender is the sharper multimodal detection specialist.

Pindrop is the stronger business today. In April 2025, it said it had passed $100 million in annual recurring revenue, driven by voice authentication, deepfake detection, and fraud prevention. That is a different level of proof from most synthetic-media startups. It shows customers are already paying real money because fake voices and deepfake calls create direct financial risk.

Reality Defender is more focused on detecting AI-generated audio, video, images, and text across formats. It has strong category visibility, but it has not disclosed the same revenue scale as Pindrop. That matters for ranking. In this category, commercial proof beats general buzz.

GetReal is also relevant, especially around high-stakes synthetic media detection, but the public evidence is not strong enough to put it above Pindrop or Reality Defender today.

The current read is straightforward: Pindrop wins if we rank by revenue proof, and Reality Defender wins if we rank by dedicated multimodal synthetic-media positioning.

This chart, featured in our AI safety market deck, compares the main business model options for AI alignment research labs

Which startups are rising fastest right now?

Gray Swan, Goodfire, Cinder, WitnessAI, Vijil, and LMArena are the freshest emerging names.

Gray Swan has the cleanest “new serious player” signal: a $40 million Series A in May 2026, frontier-lab usage claims, and a red-team model that fits the agent era. Goodfire has the strongest technical moonshot signal among infrastructure startups, with a $1.25 billion valuation in interpretability. Cinder has the best trust-and-safety momentum because its numbers speak to real platform scale.

WitnessAI is rising fast because it combines fresh capital with disclosed adoption growth. Vijil is smaller, but its time-to-trust claim gives it a customer-outcome proof point many early safety startups lack. LMArena is already valued like a major infrastructure company, but it still belongs here because its commercial rise happened extremely quickly.

The common thread is not “AI safety” as a vague brand. Each of these companies is attached to a failure mode that buyers now understand: models are hard to compare, agents need control, AI apps regress, platforms face AI-generated abuse, frontier systems need red-teaming, and black-box models are hard to govern.

Which AI safety categories look less exciting right now?

Pure AI governance and first-wave guardrail APIs look less exciting than agent security, evals, red-teaming, and interpretability.

Governance is important, but the recent startup evidence is thinner than in other categories. The need is obvious: companies need AI inventories, policy mapping, audit trails, and regulatory workflows. The problem is that the buyer still has several options. They can buy a dedicated governance tool, extend their GRC stack, wait for cloud vendors, or use security platforms that add AI governance features.

Classic guardrail APIs also feel less like the frontier of the market. Several strong names have already been acquired by large security vendors, which validates demand but reduces the number of independent breakout candidates. The exciting part has moved closer to agents, runtime behavior, model-level security, and production observability.

Content moderation is changing too. The old model was “review bad content after users post it.” The new model is “stop AI-generated abuse before it scales.” That is why Cinder looks more interesting than a generic moderation tool right now.

This chart, featured in our AI safety market deck, shows revenue breakdown by customer segment in the AI safety market

So, who are the top AI safety startups now?

The top AI safety startups right now are Anthropic, Safe Superintelligence, LMArena, Braintrust, Goodfire, Noma Security, WitnessAI, Gray Swan, Cinder, and Credo AI.

This is the group that shows up most often when we look across recent funding, technical credibility, customer proof, and category control.

Anthropic remains the scaled safety-native lab. SSI remains the pure frontier-safety moonshot. LMArena is becoming the model scoreboard. Braintrust is becoming the production AI observability layer. Goodfire is the leading interpretability bet. Noma and WitnessAI are the two strongest agent-security companies. Gray Swan is the hottest red-team company. Cinder is the clearest trust-and-safety breakout. Credo AI still leads governance, even though governance has weaker recent startup momentum than the other categories.

The strongest emerging names just below that top group are Vijil, Patronus AI, Apollo Research, Transluce, HiddenLayer, Pindrop, and Reality Defender. They are not all smaller in importance; Pindrop, for example, has much stronger revenue proof than most AI safety startups. But they are more specialist, older, less freshly funded, or less central to the broad “AI safety infrastructure” stack.

AI safety is becoming an actual stack. The winning startups are building control points around real failure modes: evaluate the model, monitor the AI app, secure the agent, red-team dangerous behavior, explain the model when possible, govern the workflow, and stop synthetic abuse before it spreads.

| Category | Startups selected and why |
| --- | --- |
| Frontier safety labs | Anthropic leads on scale and production impact after its 2026 mega-round. SSI leads on pure frontier-safety ambition and founder signal, though public product proof is still absent. |
| Model scoreboard | LMArena leads because its $1.7 billion valuation came from owning a public model-comparison layer, not just selling private eval software. |
| Production evaluation | Braintrust leads because it tests and monitors live AI products. Patronus AI is narrower in LLM testing. Vijil is strongest for agent resilience rather than broad observability. |
| Agent security | Noma leads on capital velocity and platform ambition. WitnessAI is almost level because its adoption-growth claims are stronger. Vijil and Gray Swan are the sharper emerging challengers. |
| Red-teaming | Gray Swan leads commercially because it combines a 2026 Series A with frontier-lab usage claims. Apollo and METR matter more for technical and frontier-risk credibility. |
| Interpretability | Goodfire leads on venture scale and technical ambition. Transluce is the independent oversight wildcard, especially around evaluator access and policy-facing standards. |
| Runtime AI security | Noma, WitnessAI, HiddenLayer, Vijil, and Enkrypt AI are the independent names to watch. Lakera, CalypsoAI, and Protect AI validate the category through acquisitions. |
| Governance | Credo AI remains the leader. Holistic AI and Saidot are credible challengers, but the whole governance category has weaker fresh proof than security or evals. |
| Trust and safety | Cinder is the freshest breakout because it combines recent funding, large protected-user claims, and AI-generated-abuse customer use cases. |
| Deepfake defense | Pindrop leads on revenue proof. Reality Defender leads as a dedicated multimodal detection specialist. GetReal remains relevant but less publicly proven. |
| Fastest emerging startups | Gray Swan, Goodfire, Cinder, WitnessAI, Vijil, and LMArena have the best recent mix of capital, product timing, customer proof, and technical credibility. |

OUR METHODOLOGY

We approached this as a market-clarity question, not a reputation ranking.

“AI safety startup” can mean many different things: frontier labs, model evaluators, agent-security platforms, red-teamers, interpretability companies, governance tools, trust-and-safety infrastructure, or deepfake-defense providers. Instead of treating the category as one vague bucket, we broke it into the main areas where AI safety is becoming actual infrastructure.

For each area, we looked for recent, concrete signals: funding rounds, valuations, revenue claims, customer adoption, frontier-lab usage, product positioning, acquisition activity, and disclosed deployment outcomes. We gave more weight to fresh signals from 2025 and 2026 when they showed current market momentum more clearly than older reputation.

We did not use a single metric. Large funding rounds helped identify investor conviction, but they were not enough on their own. Revenue proof, customer traction, technical credibility, category ownership, and timing all mattered depending on the maturity of the segment.

That is why some companies lead on scale, some on technical ambition, some on customer proof, and others on becoming the clearest control point for a specific failure mode. The goal was to separate companies with real evidence of momentum from companies that are simply well-known in the AI safety conversation.

This structured aggregation is what makes the final ranking more defensible. The answer becomes clearer once the market is broken into the problems buyers actually need solved: evaluating models, monitoring AI products, securing agents, red-teaming frontier systems, understanding model behavior, governing AI workflows, and stopping synthetic abuse before it scales.

Key sources used for this analysis include: Anthropic on its Series G funding and $380 billion post-money valuation, GIC on Anthropic’s Series G, Calcalist on Safe Superintelligence’s reported raise and valuation, LMArena on its $150 million raise, TechCrunch on LMArena’s valuation and launch momentum, Axios on Braintrust’s $80 million Series B and $800 million valuation, Patronus AI on its $17 million Series A, Business Wire on Vijil’s $17 million raise and time-to-trust claim, Noma Security on its $100 million Series B, WitnessAI on its $58 million raise and growth signals, Gray Swan on its Series A, Forbes on Gray Swan’s hacker red-team model, Goodfire on its $150 million Series B, AEF-1 on independent third-party AI evaluations, Transluce on AEF-1 and its EU AI Office contractor role, Credo AI on its $21 million funding round, Cinder on its $41 million Series B, Pindrop on surpassing $100 million ARR, Reality Defender on its multimodal detection platform, Check Point on its Lakera acquisition, F5 on its CalypsoAI acquisition, and Palo Alto Networks on its Protect AI acquisition.

This chart, featured in our AI safety market deck, shows how prompt injection defense platform technology has evolved over time

Who is the author of this content?

NEW MARKET PITCH TEAM

We track new markets so founders and investors can move faster

We build living "market pitch" documents for emerging markets: AI, synthetic biology, new proteins, and more. Instead of outdated PDFs or hallucinated LLM answers, our clients get a clean, visual, always-updated view of what's really happening: key players, deals, regulations, and signals that matter.
