OWASP Top 10 for LLM Applications
the ranked LLM-application risk list, 2025
The OWASP Top 10 for Large Language Model Applications is the OWASP-published ranked list of the most consequential risks for applications that build on large language models. The list is community-curated by the OWASP GenAI Security Project and refreshed on a regular cadence; the 2025 release covers LLM01 prompt injection, LLM02 sensitive information disclosure, LLM03 supply chain, LLM04 data and model poisoning, LLM05 improper output handling, LLM06 excessive agency, LLM07 system prompt leakage, LLM08 vector and embedding weaknesses, LLM09 misinformation, and LLM10 unbounded consumption. SecPortal operates an LLM Top 10 verification engagement as one structured record across the AI application codebase, the model and version under test, the inference endpoint, the retrieval index, the tool catalogue, the backend the AI feature sits inside, and the verification deliverable.
No credit card required. Free plan available forever.
The OWASP Top 10 for LLM Applications: a ranked risk list for LLM-application security
The OWASP Top 10 for Large Language Model Applications is the OWASP-published list of the ten most common and consequential risks for applications that build on large language models. The list is community-curated by the OWASP GenAI Security Project and refreshed on a regular cadence; the 2025 version is the current release. Where OWASP AISVS is a verification standard with control chapters and verification levels, the LLM Top 10 is the ranked risk list the AISVS controls map into, and the list teams cite when they need a shared shorthand for the LLM-application risk shape. The list sits in the OWASP family next to the original OWASP Top 10 for web applications and the OWASP API Security Top 10, and is intentionally shaped to be the LLM-application sibling of those lists.
For internal AppSec teams running LLM-application security, product security teams owning a generative AI feature surface, AI and ML engineering teams shipping LLM and agent applications, AI safety and red-team functions, GRC and compliance teams mapping to the ISO/IEC 42001 AI management system, the NIST AI Risk Management Framework, the EU AI Act, or the CISA Secure by Design pledge, and CISOs carrying the audit-committee read against the AI programme, the LLM Top 10 is the entry point the LLM-specific verification reads against. The value is not the headline names of the ten risks; the value is the shared vocabulary the engagement, the report, the audit, and the procurement diligence all use.
This page covers the ten LLM Top 10 risks for 2025, the per-finding evidence model the engagement uses, the LLM Top 10 audit and verification expectations, where the LLM Top 10 sits next to OWASP AISVS, OWASP ASVS, the OWASP API Top 10, NIST AI RMF, ISO/IEC 42001, MITRE ATLAS, the EU AI Act, CISA Secure by Design, and NIST SSDF, the buyer and operator read for AppSec, product security, AI engineering, and AI governance, and how SecPortal operates an LLM Top 10 engagement as one structured record.
The ten LLM Top 10 risks for 2025
Each entry below is a one-paragraph operating summary. The risk descriptions are the engagement shorthand; the operating depth lives in the per-vulnerability detail pages, the AISVS control chapters the risk maps into, and the test plan the engagement runs against. For the longer explainer that introduces the framework in plain language, see the OWASP Top 10 for LLM Applications explainer.
Prompt Injection
User-supplied or retrieved input modifies the model behaviour in ways the application did not intend, by mixing instructions with data inside the same prompt context. Direct prompt injection arrives in the user message; indirect prompt injection arrives through retrieved documents, tool outputs, uploaded files, page titles, email bodies, image alt text, or any other content the model is asked to summarise or act on. The control surface is the segregation of system instructions from data, the trust boundary around retrieved content, and the parsing of agent tool outputs.
Sensitive Information Disclosure
The model reveals data the application did not intend to expose: training data fragments, system prompts, user prompts from earlier in the session, tenant data the model retrieved from a vector store the requestor should not see, secrets in retrieved documents, PII the model was told to mask. The control surface is the data segregation at the retrieval layer, the output filtering before display, the privacy review of the training corpus, and the per-user access policy on the inference endpoint.
Supply Chain
The model supply chain is compromised: a foundation model with poisoned weights, a fine-tuning dataset with deliberate backdoors, an LoRA adapter from an untrusted source, a Hugging Face model with embedded executables, a model server with vulnerable dependencies, an inference SDK with known CVEs, an embedding model the application did not pin a version of. The control surface is the AI bill of materials, the source attestation, the dependency inventory, the model registry signing, and the runtime integrity check before loading.
Data and Model Poisoning
Training data, fine-tuning data, embedding inputs, or RAG sources are tainted so the model produces attacker-chosen output for attacker-chosen triggers. Poisoning may be a deliberate adversarial campaign or accidental contamination through unreviewed crowdsourced data. The control surface is the data provenance record, the canary detection during evaluation, the integrity check before ingestion, and the eviction discipline for sources that fail review.
Improper Output Handling
Downstream systems treat model output as trusted code, trusted SQL, trusted shell, trusted markup, or trusted tool arguments without validation. A model that emits HTML rendered into the browser, SQL passed to the database, shell run on the server, tool calls executed without policy review, or file paths read from disk inherits every classical web vulnerability the model output sink touches. The control surface is sink-aware output validation, the same input-validation discipline applied to model output, and parameterisation at the call site.
Excessive Agency
The agentic application gives the model more capability, more permission, or more autonomy than the use case requires. Symptoms include tool catalogues the model can call without per-call confirmation, write-scope credentials on tools that only need read, agent loops without budget limits, destructive actions without human-in-the-loop, transitive tool composition the model can exploit to escalate scope. The control surface is the per-tool allow-list, the per-call permission, the human-confirmation step on destructive tools, and the agent loop budget.
System Prompt Leakage
System prompts disclose intent and constraints the application relies on for security, privacy, or business-logic enforcement (allow-lists, deny-lists, tool descriptions, retrieval keys, prompt-engineered jailbreak defences, customer-specific rules, internal pricing or routing logic). When the system prompt is treated as a security boundary, leakage breaks the boundary; the cure is to remove the dependence on system-prompt secrecy and move the enforcement to the application layer where it is testable.
Vector and Embedding Weaknesses
Vector stores and embedding pipelines introduce risks the application designer often inherits without realising it: cross-tenant retrieval if the namespace partitioning is loose, embedding inversion that recovers near-original text from the vector, RAG injection through documents the search returns, vector store credentials with broader scope than required, similarity thresholds that pull in adversarial documents, missing access-control checks on the retrieval API. The control surface is the per-tenant partition, the access-control enforcement at the retrieval layer, the embedding-pipeline review, and the retrieval-quality regression test.
Misinformation
The model produces plausible-sounding but false content the application or the user acts on. Risks include hallucinated APIs, hallucinated libraries, hallucinated legal citations, hallucinated medical advice, hallucinated financial figures, hallucinated code that compiles but corrupts data. The control surface is the application-level grounding policy, the retrieval-augmented generation discipline, the citation requirement at the output layer, the confidence threshold, and the human-review checkpoint for high-impact decisions.
Unbounded Consumption
The inference surface lacks per-identity, per-tenant, per-conversation, or per-tool consumption limits, so an attacker can drain the operating budget through aggressive token usage, exhaust the rate the model can serve, or run model-extraction attacks that approximate the model by issuing high volumes of inference requests. The control surface is the per-identity quota, the per-tenant budget, the rate-limit at the inference gateway, the abuse-detection on usage patterns, and the cost cap with breakglass.
For per-risk technical depth, the vulnerability pages cover the most-tested attack chains. The prompt injection vulnerability page and the indirect prompt injection via RAG vulnerability page cover LLM01. The system prompt leakage vulnerability page covers LLM07. The improper output handling in LLM applications vulnerability page covers LLM05. The excessive agency in LLM applications vulnerability page covers LLM06. The data and model poisoning vulnerability page covers LLM04. The unbounded consumption vulnerability page covers LLM10. The model extraction attack vulnerability page covers the LLM10 confidentiality dimension the framework explicitly names as one failure mode. Each maps back to one or more LLM Top 10 categories and reads as the per-finding evidence the verification engagement records.
Per-finding evidence model: what the LLM Top 10 engagement records
An LLM Top 10 engagement is only as useful as the per-finding structure the engagement records against. The list below names the minimum fields each finding carries so the report reads as a structured record rather than a free-text narrative. The shape is identical for an engagement run by the internal AppSec team, an engagement run by an external pentest firm, and an engagement run by a specialist AI security consultancy; the structure is what keeps the finding portable across audiences.
- Per-finding LLM Top 10 category: LLM01 through LLM10, plus an optional secondary category when the finding touches two risks (an indirect prompt injection that exfiltrates training data through tool output covers LLM01, LLM02, and LLM05).
- Per-finding CVSS 3.1 vector and base score for the severity claim, plus the application-specific impact note that makes the severity defensible to a reviewer who was not in the test session.
- Per-finding CWE identifier where one applies (CWE-77 OS command injection, CWE-79 cross-site scripting, CWE-89 SQL injection, CWE-200 information exposure, CWE-94 code injection, CWE-77 command injection, CWE-918 SSRF, CWE-89 SQL injection in the LLM output handling case).
- Per-finding OWASP ASVS reference for the wider application surface, OWASP API Security Top 10 reference where the LLM surface is an API, and OWASP AISVS chapter reference for the control the finding exercises against.
- Per-finding asset reference naming the AI application identifier, the model and version under test, the inference endpoint, the retrieval index, the tool catalogue, and the backend the AI feature sits inside.
- Per-finding evidence inline: prompt sequence, retrieved-context excerpt, tool-call log, agent action trace, request and response, screenshot, or short clip, so a reviewer can validate the result without asking for the working file.
- Per-finding remediation guidance written for the engineering audience that will fix the finding (AI engineering, AppSec, ML engineering), citing the LLM Top 10 risk, the AISVS control, and the platform AI API or framework feature, rather than restating the vulnerability description.
- Per-finding override path for the residual risk the engagement closes without remediation, with a named approver, a documented scope, a hard expiry, and a compensating control, so the deferral is evidenced rather than silent.
LLM Top 10 audit and verification expectations
An LLM Top 10 verification is a structured deliverable, not a wrap-up summary. The expectations below cover what the report needs to carry so the verification claim reads defensibly against an audit, a procurement review, or a follow-on engagement two quarters later. The list is the minimum; many engagements add a safety-evaluation pack, a fairness review, and a model-card refresh alongside, but the LLM Top 10 verification itself stops at the security boundary.
- A scoping statement that names the AI application identifier, the model and version in scope, the inference endpoint, the retrieval index, the tool catalogue, the backend API surface the AI feature sits inside, and the LLM Top 10 categories the engagement covers explicitly, with any LLM Top 10 categories out of scope named explicitly so the report cannot be misread as a stronger claim than was tested.
- A test plan that covers direct prompt injection across the user input boundary (LLM01), indirect prompt injection through retrieval documents and tool outputs (LLM01), sensitive-information disclosure across training data, system prompt, user prompt, and tenant data (LLM02), supply-chain integrity across the model, the inference SDK, the embedding model, and the LoRA adapter set (LLM03), data and model poisoning across the training, fine-tuning, and embedding pipelines (LLM04), improper output handling across the HTML, SQL, shell, tool-call, and file-path sinks (LLM05), excessive agency across the per-tool allow-list, the destructive-action confirmation, and the agent loop budget (LLM06), system prompt leakage across direct and indirect extraction techniques (LLM07), vector and embedding weaknesses across cross-tenant retrieval, embedding inversion, RAG injection, and retrieval access control (LLM08), misinformation across hallucinated APIs, libraries, legal citations, and financial figures (LLM09), and unbounded consumption across per-identity, per-tenant, per-conversation, and per-tool limits (LLM10).
- A findings record per LLM Top 10 category exercised, with the per-finding structure above. Categories that were exercised and produced no finding are documented as covered-no-finding rather than silently omitted, so the absence of evidence is not read as the absence of testing.
- A coverage and exclusion statement listing the AI application surface the engagement did not exercise (the backend API surface tested under ASVS rather than the LLM Top 10, the mobile application binary tested under MASVS rather than the LLM Top 10, the model evaluation pack run by the AI safety team rather than the security team), so the cross-discipline read is reconcilable.
- A remediation roadmap that pairs each finding to the remediation owner, the named due date, the verification method, and the residual risk position once the finding is closed, so the engagement deliverable carries the closure plan rather than only the risk statement.
- A retest record per finding once the fix lands, with the verification evidence inline and the regression evaluation run against the LLM Top 10 test plan so a regression introduced by the fix is caught at retest rather than at the next engagement.
- A closure record covering the original finding, the proposed fix, the retest evidence on a clean build, the regression result, and the final outcome, so the LLM Top 10 verification reads as a record rather than a PDF attached to an email.
For the broader pentest-report and verification-report shape the LLM Top 10 engagement inherits, see how to write a pentest report and the matching penetration testing report template. The AI report generation feature composes the executive summary, technical body, per-category coverage statement, and remediation roadmap from the live engagement and findings, citing the LLM Top 10 categories that were exercised rather than starting from a blank template.
How the LLM Top 10 sits next to AISVS, ASVS, the API Top 10, NIST AI RMF, ISO 42001, and MITRE ATLAS
The LLM Top 10 is rarely used in isolation. It is the ranked risk list that other documents cite, run against, or wrap. The contrast below is a working view, not a buyer comparison: the practitioner question is which standards to pair the LLM Top 10 with, not which to pick instead of it.
OWASP LLM Top 10 vs OWASP AISVS
The OWASP Top 10 for LLM Applications is a ranked risk list. OWASP AISVS is a verification standard with control chapters and three verification levels (L1 essential, L2 defence in depth, L3 high assurance). The two compose: an LLM Top 10 finding maps to one or more AISVS controls, and an AISVS verification covers the broader requirement set including controls the LLM Top 10 does not enumerate (training-data governance, model lineage, conversation memory privacy, human oversight). A pentest scoped to the LLM Top 10 covers the headline risks; an AISVS engagement covers the requirement set and produces a verification report rather than a finding list.
OWASP LLM Top 10 vs OWASP Top 10 (web)
The OWASP Top 10 (the original) covers web application risks. The LLM Top 10 covers LLM-application-specific risks. A LLM-enabled web application is in scope for both: the backend API surface is tested against the OWASP Top 10 (or OWASP ASVS), and the LLM surface is tested against the LLM Top 10 (or OWASP AISVS). Engagement scoping should declare both in the rules of engagement so the test plan and the report carry both reads, with findings tagged to the right category.
OWASP LLM Top 10 vs OWASP API Security Top 10
The OWASP API Security Top 10 covers the API-tier risks (broken object-level authorization, broken authentication, excessive data exposure). The LLM Top 10 covers the model-tier risks. An LLM-as-API surface is in scope for both: the API authentication, authorization, and rate-limiting are tested against the API Security Top 10, and the model behaviour, prompt injection, output handling, and tool use are tested against the LLM Top 10. The two read together at the same engagement.
OWASP LLM Top 10 vs NIST AI RMF
NIST AI RMF is the AI Risk Management Framework with GOVERN, MAP, MEASURE, and MANAGE functions. The LLM Top 10 reads as one input to the MEASURE function for LLM-specific systems: the LLM Top 10 test plan supplies the measurement evidence that the trustworthy characteristics secure-and-resilient and accountable-and-transparent depend on. Programmes operating against the AI RMF use the LLM Top 10 as the LLM-application slice of the MEASURE evidence pack.
OWASP LLM Top 10 vs ISO/IEC 42001
ISO/IEC 42001 is the AI management system standard that names the policies, the processes, and the management discipline an AI-producing organisation operates against. The LLM Top 10 reads beneath as one of the technical risk lists the management system covers: LLM Top 10 engagements supply the technical evidence the ISO/IEC 42001 Annex A controls covering AI system testing reference. Programmes certifying against ISO/IEC 42001 typically operate the LLM Top 10 alongside AISVS as the application-side verification companion.
OWASP LLM Top 10 vs MITRE ATLAS
MITRE ATLAS is the Adversarial Threat Landscape for AI Systems, a knowledge base of adversary tactics and techniques targeting AI systems. The LLM Top 10 reads as the OWASP-curated risk list; ATLAS reads as the MITRE-curated adversary-technique reference. The two compose: an LLM Top 10 finding often maps to one or more ATLAS techniques (LLM01 prompt injection covers ATLAS Prompt Injection, LLM04 data and model poisoning covers ATLAS Poison Training Data and ATLAS Backdoor ML Model, LLM03 supply chain covers ATLAS Acquire Public ML Artefacts and ATLAS Verify Attack), and an ATLAS technique mapping makes the LLM Top 10 finding easier to communicate to a threat-modelling audience.
Buyers procuring an LLM Top 10 engagement under a regulated framework should pair the LLM Top 10 with OWASP AISVS for the control-level verification standard, with OWASP ASVS for the wider backend tier, with the OWASP API Security Top 10 for the API surface, with ISO/IEC 42001 for the AI management system the engagement operates inside, with NIST AI RMF for the framework-level outcome model, and with MITRE ATT&CK plus the MITRE ATLAS adversary-technique reference for the threat-modelling audience.
Adjacent regimes: where the LLM Top 10 lands in compliance and procurement
The LLM Top 10 reads alongside several regulatory and procurement-driven regimes. Each one covers a different slice of the AI security operating model; the LLM Top 10 supplies the application-side risk evidence those regimes cite.
LLM Top 10 and the EU AI Act
The EU AI Act classifies AI systems by risk and imposes obligations proportionate to the classification. For high-risk AI systems, Article 9 (risk-management system), Article 14 (human oversight), and Article 15 (accuracy, robustness, and cybersecurity) call for documented technical evidence the LLM Top 10 verification supplies on the LLM tier. For general-purpose AI models, the obligations under Article 51 onwards similarly read against LLM-specific risk coverage including evaluation of systemic risk and cybersecurity protection.
LLM Top 10 and CISA Secure by Design
CISA Secure by Design is the voluntary pledge framework for software producers committing to default-secure software products. CISA has extended the secure-by-design principles into AI-specific guidance covering safe AI product development. The LLM Top 10 reads beneath as the technical risk list the secure-by-design AI claims are evidenced against, particularly the commitments around prompt injection defence, data and model poisoning prevention, and secure output handling.
LLM Top 10 and NIST SSDF
NIST SSDF (Secure Software Development Framework, SP 800-218) is the secure-development practices framework. The LLM Top 10 reads as the verification list the Verify practice category (PW.8 verify third-party components, PS.3 verify intentional behaviour) exercises for LLM-application code paths. The SSDF AI Profile (NIST SP 800-218A) explicitly extends the SSDF to AI software and references AI-specific verification expectations the LLM Top 10 fulfils on the application side.
LLM Top 10 and OWASP Machine Learning Security Top 10
The OWASP Machine Learning Security Top 10 is the ranked list of the most common machine-learning-specific risks (model adversarial attacks, data poisoning, model inversion, membership inference, model extraction, transfer-learning attacks, model denial of service, supply chain compromise, insecure online learning, output manipulation). It reads alongside the LLM Top 10: the ML Top 10 covers classical ML model risks; the LLM Top 10 covers LLM-application risks. Programmes operating across classical ML and LLM applications use both lists together.
The LLM Top 10 for AppSec, product security, AI engineering, and AI governance
The LLM Top 10 is read differently depending on which side of the engagement you sit on. AppSec teams use the list as the verification scope for the LLM tier of the wider application, and as the shared vocabulary that lets prompt-injection results, output- handling reviews, agent-action audits, and dependency findings be communicated in one comparable picture. Product security teams use the list as the security gate the generative-AI feature passes through before launch, with the verification report driving the feature go-live decision and the residual-risk position. AI engineering teams use the list as the threat model the model and prompt infrastructure is built against, especially LLM01 prompt injection, LLM02 sensitive-information disclosure, LLM05 improper output handling, LLM06 excessive agency, and LLM10 unbounded consumption. AI governance committees use the list as the technical risk evidence layer their high-impact AI decisions read against under ISO/IEC 42001 and the EU AI Act.
The persona-specific entry points are SecPortal for AppSec teams, SecPortal for product security teams, SecPortal for application security programme leads, SecPortal for security engineering teams, and SecPortal for GRC and compliance teams. Each anchors a different view of the same LLM Top 10 engagement record.
For firms that specialise in AI and ML security assessments rather than running AI as one of several practices, the SecPortal for AI and ML security consultancies page covers the operating model that fits a specialist AI security practice, including finding evidence fields tuned to LLM Top 10 categories and the retest workflow that pairs verification to the original finding across each new model release.
Where the LLM Top 10 sits next to other framework pages
The LLM Top 10 is the LLM-application risk list; the adjacent framework pages cover the regimes the LLM Top 10 verification reads alongside. The pages below are the ones LLM Top 10 programmes most often read together.
- The OWASP AISVS framework page covers the AI-application verification standard the LLM Top 10 findings map into at the control level.
- The OWASP ASVS framework page covers the verification standard for the wider web and API surface the LLM application sits inside.
- The OWASP API Security Top 10 framework page covers the API-tier risk list the LLM-as-API surface is also in scope for.
- The OWASP Top 10 (web) framework page covers the original web application risk list the LLM-enabled web application surface is tested against alongside the LLM Top 10.
- The ISO/IEC 42001 framework page covers the AI management system standard the LLM Top 10 supplies the technical risk evidence for.
- The NIST AI RMF framework page covers the AI Risk Management Framework the LLM Top 10 supplies MEASURE-function evidence for.
- The NIST SSDF framework page covers the secure-development practices framework that, through the SP 800-218A AI Profile, reads against LLM-application verification including the LLM Top 10.
- The CISA Secure by Design framework page covers the default-secure software commitment the LLM Top 10 is evidenced against for AI features.
- The MITRE ATT&CK framework page covers the wider adversary technique reference the MITRE ATLAS knowledge base for AI systems extends.
Where SecPortal fits in an LLM Top 10 engagement
SecPortal is the operating layer for an LLM Top 10 engagement. The platform handles scope, category coverage, backend authenticated DAST evidence, AI application codebase SAST and SCA evidence, prompt-injection and indirect-injection evaluation results, agent-action audit findings, output-handling findings, sensitive-information-disclosure findings, supply chain findings, finding triage, retests, and the final deliverable, so the engagement runs as a single workflow rather than a long email thread with attachments and screenshot zips. For consultancies running LLM and AI security engagements on behalf of multiple clients, the AI and ML security consultancies workspace bundles that with branded client portals.
- Engagement management captures the LLM Top 10 scope, the AI application identifier, the model and version under test, the inference endpoint, the retrieval index, the tool catalogue, the backend API surface in scope, the rules of engagement, the testing window, and the agreed retest scope as a structured record, so the engagement scaffold is the workflow rather than a contract attachment.
- Findings management stores each finding with a CVSS 3.1 vector, severity, evidence, owner, OWASP LLM Top 10 category, OWASP ASVS reference, OWASP API Top 10 reference, OWASP AISVS chapter reference, and CWE identifier, so the LLM Top 10 verification report writes itself from the underlying records.
- Code scanning runs Semgrep SAST and dependency analysis against the AI application codebase through the connected GitHub, GitLab, or Bitbucket repository, so the LLM03 supply-chain evidence (AI SDK hygiene, model-library inventory, dependency CVE coverage) is captured against the same engagement as the runtime testing.
- Authenticated scanning runs the DAST module pack against the backend the AI feature sits inside, with credentials stored encrypted at rest under AES-256-GCM, so the LLM02 sensitive-information disclosure and LLM05 improper output handling evidence covers the inference surface and the backend API surface together.
- External scanning runs the unauthenticated module pack against the inference endpoint and the supporting infrastructure, so the LLM10 unbounded-consumption evidence covers the externally observable rate-limit, error-handling, and abuse-detection posture.
- Bulk finding import accepts the prompt-injection evaluation outputs, the jailbreak-resistance evaluation results, the agent-action red-team results, the model-extraction probe results, and the safety evaluation pack as CSV intake with column mapping, so the LLM Top 10 evidence pack reads against the same finding ledger as the security findings.
- AI-assisted report generation composes the executive summary, the technical body, the per-LLM-Top-10-category coverage statement, and the remediation roadmap from the live engagement and findings, citing the LLM Top 10 categories that were exercised rather than starting from a blank template.
- Compliance tracking lets one LLM Top 10 engagement satisfy framework mappings to OWASP AISVS chapters C3 input handling, C4 output handling, C5 authentication and rate-limit, C6 RAG, C7 agent action boundaries, and C9 monitoring, to OWASP ASVS V2 V3 V5 V7 V8 for the wider backend surface, to NIST AI RMF MEASURE 2 evaluation evidence, to ISO/IEC 42001 Annex A controls covering AI system testing, to the EU AI Act Article 9 risk-management system, Article 14 human oversight, and Article 15 accuracy, robustness, and cybersecurity, and to MITRE ATLAS techniques across the inference, training, and supply-chain stages.
- Retesting workflows pair each LLM Top 10 finding to a verification step on a clean build, with the regression record attached, so the closure record carries the full evidence chain.
- Activity log with CSV export captures every state change to the engagement, the findings, the override decisions, and the retests, so the auditor or regulator can reconstruct the LLM Top 10 verification operating record without a multi-team excavation.
- Document management for the AI application model card, the system card, the safety policy, the prompt-template inventory, the tool inventory, the AI bill of materials, the data sheet, the data-protection-impact-assessment record, and the LLM Top 10 test plan, with version history per artefact and named custodian per file.
- Team management with role-based access (owner, admin, member, viewer, billing) that keeps AI engineering, AppSec, ML engineering, AI safety, data protection, GRC and compliance, security operations leaders, security architects, and the AI governance committee on the same workspace with appropriate scoping per surface.
- Multi-factor authentication enforcement at workspace level for the LLM Top 10 operating records, so the identity assurance applies at access time as well as evidence time.
- Finding overrides for the residual gaps that cannot be closed within one engagement (named approver, scope, cited reason, hard expiry, compensating control, refresh trigger), so deferrals are evidenced rather than silent.
What SecPortal does not do
The LLM Top 10 is the application-side risk list and depends on the organisation operating AI engineering, AI safety, AppSec, product security, GRC and compliance, and AI governance as coordinated functions. SecPortal is the operating record for the LLM Top 10 verification engagement, not the AI development platform, not the model-evaluation framework, not the AI red team service provider, not the inference gateway, and not the AI governance platform. The honest scope below reads against the LLM Top 10 boundary so the platform commitment is unambiguous.
- SecPortal is not an AI red team service. The platform does not author the prompt-injection corpus, the jailbreak attack scripts, the agent-action attack chains, or the multi-turn safety eval prompts. The red-team workstream is run by the AI security team or by an external AI red-team provider; SecPortal carries the engagement record, the findings, the retests, and the verification report the red team produces.
- SecPortal is not a model-evaluation framework. The platform does not run lm-eval-harness, HELM, BIG-bench, MLPerf, Inspect AI, AI Verify, AISI Inspect, OWASP GenAI Red Teaming Guide tooling, or any other model-evaluation harness. The evaluation runs where the eval infrastructure runs; SecPortal accepts the eval result file via bulk finding import as structured findings on the workspace.
- SecPortal is not an inference gateway, an LLM proxy, or an output guardrail. The platform does not sit on the request path, does not filter prompts at runtime, does not apply output policies at runtime, does not enforce token budgets, and does not rate-limit inference. The runtime controls run where the inference runs; SecPortal carries the verification evidence the runtime controls generate.
- SecPortal does not host AI models or connect to inference providers. The platform does not run inference, does not connect to OpenAI, Anthropic, AWS Bedrock, Google Vertex, Azure OpenAI Service, Hugging Face Inference Endpoints, Cohere, Mistral, or any other inference provider. The inference runs where it runs; SecPortal carries the verification evidence the inference surface generates.
- SecPortal is not an MLOps platform. The platform does not connect to MLflow, Weights & Biases, Comet, ClearML, Neptune, DVC, Kubeflow, SageMaker, Vertex AI, or Azure Machine Learning. The ML lifecycle is operated where the training and serving infrastructure is; SecPortal carries the LLM Top 10 verification record alongside the MLOps record rather than replacing it.
- SecPortal does not replace the AI governance committee, the accountable AI owner, the AI safety lead, or the data protection officer. The platform carries the operating record so the verification programme is durable rather than tribal; the named owner runs the programme, the safety lead authors the safety policy, the AI governance committee approves the decision impact tier, and the DPO signs off the DPIA.
- SecPortal does not issue or verify OWASP certifications, ISO/IEC 42001 certifications, EU AI Act conformity assessments, or any other third-party certification. Certification is performed by an accredited certification body. SecPortal carries the operating record the auditors and the conformity assessors read against, with the artefact pack the audit fieldwork reads continuously rather than reconstructed at examination time.
- SecPortal does not ship packaged connectors into Jira, ServiceNow, Slack, Microsoft Teams, SIEM, SOAR, GRC platforms, AI-governance platforms, or any LLM-gateway management plane. The LLM Top 10 findings live on the SecPortal workspace and the wider operational ticketing remains in the systems where the rest of the work is tracked.
The operational workstreams the LLM Top 10 programme reads against already exist as named use cases on SecPortal. The security onboarding for new applications workflow covers the per-application onboarding that the LLM Top 10 engagement attaches to. The code review use case covers the SAST and SCA evidence the LLM03 supply-chain risk reads against. The cross-framework control mapping workflow reads the same LLM Top 10 evidence pack across OWASP AISVS, ISO/IEC 42001, NIST AI RMF, EU AI Act, and CISA Secure by Design so the cross-regime read is reconcilable rather than reconciled per audit. The security leadership reporting workflow turns the LLM Top 10 evidence pack into the leadership read for the AI governance committee and the audit committee.
For deeper reading on the disciplines this risk list reads against, the OWASP Top 10 for LLM Applications explainer is the longer plain-language introduction. The AI bill of materials guide covers the supply-chain artefact LLM03 expects. The secure code review for AI-generated code guide covers the developer-facing review discipline that flows into LLM Top 10 evidence on the code path. The MLSecOps implementation guide and the AI security posture management explainer cover the operating model the LLM Top 10 engagement runs inside.
Key control areas
SecPortal helps you track and manage compliance across these domains.
LLM01:2025 Prompt Injection
User-supplied or retrieved input modifies model behaviour in ways the application did not intend, by mixing instructions with data inside the same prompt context. Direct injection arrives in user messages; indirect injection arrives through retrieved documents, tool outputs, uploaded files, or any other content the model summarises or acts on. The control surface is the segregation of system instructions from data, the trust boundary around retrieved content, and the parsing of agent tool outputs.
LLM02:2025 Sensitive Information Disclosure
The model reveals data the application did not intend to expose: training data fragments, system prompts, prior user prompts, tenant data, secrets in retrieved documents, PII the model was told to mask. The control surface is the data segregation at retrieval, the output filtering before display, the privacy review of the training corpus, and the per-user access policy on the inference endpoint.
LLM03:2025 Supply Chain
The model supply chain is compromised: a foundation model with poisoned weights, a fine-tuning dataset with deliberate backdoors, an untrusted LoRA adapter, a model server with vulnerable dependencies, an inference SDK with known CVEs, an embedding model without a pinned version. The control surface is the AI bill of materials, the source attestation, the dependency inventory, the model registry signing, and the runtime integrity check before loading.
LLM04:2025 Data and Model Poisoning
Training data, fine-tuning data, embedding inputs, or RAG sources are tainted so the model produces attacker-chosen output for attacker-chosen triggers. Poisoning may be deliberate adversarial campaigns or accidental contamination through unreviewed crowdsourced data. The control surface is the data provenance record, the canary detection during evaluation, the integrity check before ingestion, and the eviction discipline for sources that fail review.
LLM05:2025 Improper Output Handling
Downstream systems treat model output as trusted code, trusted SQL, trusted shell, trusted markup, or trusted tool arguments without validation. A model output that renders into the browser, runs against the database, executes on the server, calls tools without policy review, or reads file paths inherits every classical web vulnerability the output sink touches. The control surface is sink-aware output validation, input-validation discipline applied to model output, and parameterisation at the call site.
LLM06:2025 Excessive Agency
The agentic application gives the model more capability, permission, or autonomy than the use case requires. Symptoms include tool catalogues callable without per-call confirmation, write-scope credentials on tools that only need read, agent loops without budget limits, destructive actions without human-in-the-loop, transitive tool composition the model can exploit. The control surface is the per-tool allow-list, per-call permission, human-confirmation step on destructive tools, and the agent loop budget.
LLM07:2025 System Prompt Leakage
System prompts disclose intent and constraints the application relies on for security, privacy, or business-logic enforcement (allow-lists, deny-lists, tool descriptions, retrieval keys, prompt-engineered jailbreak defences, customer-specific rules, internal pricing or routing logic). When the system prompt is treated as a security boundary, leakage breaks the boundary; the cure is to remove the dependence on system-prompt secrecy and enforce the boundary at the application layer where it is testable.
LLM08:2025 Vector and Embedding Weaknesses
Vector stores and embedding pipelines introduce risks the application designer often inherits without realising it: cross-tenant retrieval if namespace partitioning is loose, embedding inversion that recovers near-original text from the vector, RAG injection through documents the search returns, vector-store credentials with broader scope than required, similarity thresholds that pull in adversarial documents, missing access-control checks on the retrieval API. The control surface is per-tenant partitioning, access-control enforcement at retrieval, embedding-pipeline review, and retrieval-quality regression testing.
LLM09:2025 Misinformation
The model produces plausible-sounding but false content the application or user acts on. Risks include hallucinated APIs, hallucinated libraries, hallucinated legal citations, hallucinated medical advice, hallucinated financial figures, hallucinated code that compiles but corrupts data. The control surface is the application-level grounding policy, the retrieval-augmented generation discipline, the citation requirement at the output layer, the confidence threshold, and the human-review checkpoint for high-impact decisions.
LLM10:2025 Unbounded Consumption
The inference surface lacks per-identity, per-tenant, per-conversation, or per-tool consumption limits, so an attacker can drain the operating budget through aggressive token usage, exhaust the rate the model can serve, or run model-extraction attacks that approximate the model through high inference volumes. The control surface is the per-identity quota, the per-tenant budget, the rate-limit at the inference gateway, the abuse-detection on usage patterns, and the cost cap with breakglass.
Related features
Orchestrate every security engagement from start to finish
Vulnerability management software that tracks every finding
Finding overrides that survive every scan cycle
Test web apps behind the login
Vulnerability scanning tools that map your attack surface
Find vulnerabilities before they ship
Bulk finding import bring your scanner data with you
Document management for every security engagement
AI-powered reports in seconds, not days
Compliance tracking without a full GRC platform
Verify fixes and track reopens on the same finding record
Every action recorded across the workspace
Collaborate across your entire team
Multi-factor authentication on every workspace
Run an LLM Top 10 engagement on one workspace
Anchor each finding to one of the ten LLM Top 10 categories, a CVSS 3.1 vector, an OWASP AISVS control chapter, an OWASP ASVS reference, a CWE identifier, and an evidence record. Carry the same evidence pack across OWASP AISVS, ISO/IEC 42001, NIST AI RMF, EU AI Act, CISA Secure by Design, NIST SSDF, and MITRE ATLAS without rebuilding it per audit. Start free.
No credit card required. Free plan available forever.