The hidden security risks of open source AI

Open source AI is gaining momentum among major players. DeepSeek recently announced plans to share parts of its model architecture and code with the community. Alibaba followed suit with the release of a new open source multimodal model aimed at enabling cost-effective AI agents. Meta’s Llama 4 models, described as “semi-open,” are among the most powerful publicly available AI systems.

The growing openness of AI models fosters transparency, collaboration, and faster iteration across the AI community. But those benefits come with familiar risks. AI models are still software – often bundled with extensive codebases, dependencies, and data pipelines. Like any open source project, they can harbour vulnerabilities, outdated components, or even hidden backdoors that scale with adoption.

AI models are, at their core, still code – just with additional layers of complexity. Validating traditional software components is like reviewing a blueprint: intricate, but knowable. AI models are black boxes built from massive, opaque datasets and hard-to-trace training processes. Even when datasets or tuning parameters are available, they’re often too large to audit. Malicious behaviours can be trained in, intentionally or not, and the non-deterministic nature of AI makes exhaustive testing impossible. What makes AI powerful also makes it unpredictable – and risky.

Bias is one of the most subtle and dangerous risks. Skewed or incomplete training data bakes in systemic flaws. Opaque models make bias hard to detect – and nearly impossible to fix. If a biased model is used in hiring, lending, or healthcare, it can quietly reinforce harmful patterns under the guise of objectivity. This is where the black-box nature of AI becomes a liability. Enterprises are deploying powerful models without fully understanding how they work or how their outputs could impact real people.

These aren’t just theoretical risks. You can’t inspect every line of training data or test every possible output. Unlike traditional software, there’s no definitive way to prove that an AI model is safe, reliable, or free from unintended consequences.

Since you can’t fully test AI models or easily mitigate the downstream impacts of their behaviour, the only thing left is trust. But trust doesn’t come from hope; it comes from governance. Organisations must implement clear oversight to ensure models are vetted, provenance is tracked, and behaviour is monitored over time. This isn’t just technical; it’s strategic. Until businesses treat open source AI with the same scrutiny and discipline as any other part of the software supply chain, they’ll be exposed to risks they can’t see and consequences they can’t control.

Securing open source AI: A call to action

Businesses should treat open source AI with the same rigour as the rest of the software supply chain – and more. These models introduce new risks that can’t be fully tested or inspected, so proactive oversight is essential.

1. Establish visibility into AI usage:

Many organisations don’t yet have the tools or processes to detect where AI models are being used in their software. Without visibility into model adoption – whether embedded in applications, pipelines, or APIs – governance is impossible. You can’t manage what you can’t see.
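A starting point can be as simple as an inventory script. The sketch below, in Python, walks a repository and lists files whose extensions suggest serialised model artifacts; the extension list and the idea of scanning a local checkout are assumptions for illustration, not a complete detection method.

```python
# Minimal sketch: inventory model artifacts in a repository.
# The extension list is an assumption for illustration; real detection
# would also need to cover API clients, embedded weights, and pipelines.
from pathlib import Path

MODEL_EXTENSIONS = {".safetensors", ".pt", ".onnx", ".gguf", ".h5"}

def find_model_artifacts(repo_root: str) -> list[Path]:
    """Return files under repo_root whose extension suggests a model artifact."""
    root = Path(repo_root)
    return [p for p in root.rglob("*")
            if p.is_file() and p.suffix.lower() in MODEL_EXTENSIONS]

if __name__ == "__main__":
    for artifact in find_model_artifacts("."):
        print(artifact)
```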

2. Adopt software supply chain best practices:

Treat AI models like any other critical software component. That means scanning for known vulnerabilities, validating training data sources, and carefully managing updates to prevent regressions or new risks.
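One concrete way to treat a model as a managed dependency is to pin and verify the artifact itself. The sketch below is a minimal illustration in Python: it checks a downloaded model file against a SHA-256 digest recorded when the model was approved. The file path and digest are placeholders, and checksum verification is only one piece of the practices described above.

```python
# Minimal sketch: verify a model artifact against a pinned SHA-256 digest
# before loading it, as you would pin and verify any other dependency.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file and return its hex SHA-256 digest."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model(path: Path, expected_digest: str) -> None:
    """Raise if the artifact does not match the digest recorded at approval time."""
    actual = sha256_of(path)
    if actual != expected_digest:
        raise RuntimeError(
            f"Checksum mismatch for {path}: expected {expected_digest}, got {actual}"
        )

# Example usage with placeholder values:
# verify_model(Path("models/example-model.safetensors"), "<pinned digest>")
```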

3. Implement governance and oversight:

Many organisations have mature policies for traditional open source use, and AI models deserve the same scrutiny. Establish governance frameworks that include model approval processes, dependency tracking, and internal standards for safe and compliant AI usage.
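What an approval process might look like at the code level is sketched below: a simple allowlist check that refuses to load any model not recorded in an internal registry. The registry format and model identifiers are hypothetical; real governance would sit alongside dependency tracking and policy tooling.

```python
# Minimal sketch: a model approval gate backed by an internal allowlist.
# The JSON registry format and the model identifiers are illustrative only.
import json
from pathlib import Path

def load_allowlist(registry_path: str) -> set[tuple[str, str]]:
    """Read approved (name, version) pairs from a JSON registry file."""
    entries = json.loads(Path(registry_path).read_text())
    return {(entry["name"], entry["version"]) for entry in entries}

def require_approved(name: str, version: str, allowlist: set[tuple[str, str]]) -> None:
    """Raise if the requested model has not been through the approval process."""
    if (name, version) not in allowlist:
        raise PermissionError(f"Model {name}:{version} is not on the approved list")

# Example usage with a hypothetical registry file:
# allowlist = load_allowlist("approved_models.json")
# require_approved("example-model", "1.2.0", allowlist)
```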

4. Push for transparency:

AI doesn’t have to be a black box. Businesses should demand transparency around model lineage: who built it, what data it was trained on, how it’s been modified, and where it came from. Documentation should be the norm, not the exception.
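As a minimal illustration of what that documentation could capture, the sketch below defines a simple provenance record covering the lineage questions above: who built the model, what data it was trained on, how it has been modified, and where it came from. The field names are assumptions; established formats such as model cards serve the same purpose.

```python
# Minimal sketch: a provenance record for a model. Field names are
# illustrative; the point is that lineage is written down and shipped with
# the model rather than reconstructed after the fact.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelProvenance:
    name: str
    version: str
    builder: str                          # who built it
    training_data_sources: list[str]      # what data it was trained on
    modifications: list[str] = field(default_factory=list)  # fine-tunes, quantisation, etc.
    origin_url: str = ""                  # where it came from

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

# Example with placeholder values:
# record = ModelProvenance(
#     name="example-model", version="1.0", builder="Example Labs",
#     training_data_sources=["public web corpus"], origin_url="https://example.org/models",
# )
# print(record.to_json())
```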

5. Invest in continuous monitoring:

AI risk doesn’t end at deployment. Threat actors are already experimenting with prompt injection, model manipulation, and adversarial exploits. Real-time monitoring and anomaly detection can help surface issues before they cascade into broader failures.
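What runtime monitoring can look like in its simplest form is sketched below: flag prompts that match common injection phrasings and responses that are anomalously long for the workload. The patterns and threshold are assumptions for illustration, not a complete defence against the attacks described above.

```python
# Minimal sketch: flag suspicious prompts and anomalous responses at runtime.
# The patterns and the length threshold are illustrative assumptions.
import re

INJECTION_PATTERNS = [
    re.compile(r"ignore (all|any|previous) instructions", re.IGNORECASE),
    re.compile(r"reveal (the|your) system prompt", re.IGNORECASE),
]
MAX_EXPECTED_OUTPUT_CHARS = 10_000  # assumed baseline for this workload

def flag_prompt(prompt: str) -> bool:
    """Return True if the prompt matches a known injection phrasing."""
    return any(pattern.search(prompt) for pattern in INJECTION_PATTERNS)

def flag_response(response: str) -> bool:
    """Return True if the response is unusually long for this workload."""
    return len(response) > MAX_EXPECTED_OUTPUT_CHARS

# Example usage with hypothetical variables and alerting hook:
# if flag_prompt(user_prompt) or flag_response(model_output):
#     alert_security_team(user_prompt, model_output)
```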

DeepSeek’s decision to share elements of its model code reflects a broader trend: major players are starting to engage more with the open source AI community, even if full transparency remains elusive. For enterprises consuming these models, this growing accessibility is both an opportunity and a responsibility. The fact that a model is available doesn’t mean it’s trustworthy by default. Security, oversight, and governance must be applied downstream to ensure these tools are safe, compliant, and aligned with business objectives.

In the race to deploy AI, trust is the foundation. And trust requires visibility, accountability, and governance every step of the way.

Brian Fox is co-founder and chief technology officer at Sonatype, a software supply chain security company.