Vector and Embedding Weaknesses
detect, understand, remediate
Vector and embedding weaknesses (OWASP LLM08:2025) are the class of flaws in the retrieval layer that almost every production LLM application now depends on. The threat surface is the corpus, the vector store, the ingestion pipeline, the embedding model, the retrieval call, and the per-document access-control envelope. Findings include cross-tenant retrieval that bypasses authorisation, embedding inversion that recovers source text from stored vectors, ingestion that admits unreviewed documents, retrieval-side denial of service, and deletion paths that leave vectors behind.
No credit card required. Free plan available forever.
What are vector and embedding weaknesses?
Vector and embedding weaknesses are the class of security flaws that live in the retrieval layer that almost every production LLM application now depends on. A retrieval-augmented generation (RAG) feature converts documents into numeric vectors with an embedding model, stores those vectors in a vector database (Pinecone, Weaviate, Qdrant, Milvus, Chroma, pgvector, OpenSearch with k-NN, Elasticsearch with dense vectors, Vespa, Redis with vector search), and on every user query embeds the question, retrieves the top-k nearest documents, and stuffs the retrieved content into the model context. The OWASP GenAI Security Project lists the class as LLM08:2025 Vector and Embedding Weaknesses in the 2025 Top 10 for LLM Applications. The attack surface is not the model. The attack surface is the corpus, the vector store, the ingestion pipeline, the embedding model, the retrieval call, and the access-control envelope around every retrieved document.
LLM08 sits beside the other LLM Top 10 disclosure and integrity classes and is often confused with them. The indirect prompt injection via RAG page (LLM01 variant) covers the case where an attacker writes instructions into a document so the retrieval step smuggles attacker instructions into the model context; the threat is the instruction content. LLM08 covers the case where the threat is the security properties of the retrieval system itself: who can read which document, can the vector store leak the underlying text, can an attacker tilt retrieval to favour chosen documents, and can an attacker exhaust the retrieval budget. The data and model poisoning page (LLM04) covers training and fine-tune corpus contamination; LLM08 covers retrieval-index contamination at serving time. The sensitive information disclosure page (LLM02) covers the inference-side egress; LLM08 covers the storage-side disclosure pathway through cross-tenant retrieval and embedding inversion. The LLM supply chain page (LLM03) covers the artefact integrity of the embedding model itself; LLM08 covers what the artefact then writes into and reads from the vector store.
For internal AppSec, product security, AI platform, MLOps, data platform, vulnerability management, and GRC teams, the operating reality is that the vector store inherits every classical data-store security obligation and adds four AI-specific ones on top. The four are: per-document authorisation enforced on every retrieval call (not at ingestion time only), integrity controls over what enters the index (so an unreviewed document cannot reach production retrieval), confidentiality of the embedding vectors themselves (because vectors can be inverted back to source text), and quality controls over retrieval behaviour (so an attacker cannot tilt the top-k toward chosen documents through embedding manipulation). Each of those is a finding category with named detection paths and named remediation paths.
The class shows up in regulator inquiries the moment an AI feature with retrieval ships to customers or to a regulated workload. EU AI Act Article 10 data governance and Article 15 cybersecurity read directly against retrieval-corpus governance and retrieval-side access control. NIST AI RMF Map, Measure, and Manage all read against the corpus inventory, the per-document authorisation control, the integrity envelope, and the evidence the team can produce on each retrieval. ISO/IEC 42001 Annex A controls cover AI lifecycle data management, third-party data risk, and accountability for the data the AI system reads. OWASP AISVS Data and Knowledge Base chapters cover retrieval-side ACL enforcement, embedding-store integrity, and ingestion validation. GDPR Article 5 purpose limitation and Article 32 security of processing read against the retrieval ACL and the encryption envelope around the vector store. Auditors will ask three questions: what documents are in the index, what control enforces the user's right to read each document on every retrieval, and what evidence shows the control actually ran. The answers live on the finding, not in a slide deck.
The retrieval surface
Source corpus and ingestion pipeline
The documents that flow into the index: knowledge-base articles, support tickets, internal wiki pages, code, customer records, contracts, emails, chat history, SharePoint and Google Drive shares, S3 buckets, customer-uploaded files, and crawled web pages. The ingestion job reads each document, splits it into chunks, calls the embedding model, and writes the vectors plus the source text and metadata into the vector store. Each stage is a control point: who is allowed to push into the queue, what document classification the ingestion enforces, what per-document ACL it carries through, and whether ingestion logs are auditable.
Embedding model and chunking policy
The model that turns text into vectors (OpenAI text-embedding-3, Cohere embed, Voyage AI, Anthropic via Claude, open-weights bge, gte, e5, mxbai, jina, Nomic, all-MiniLM). The chunking strategy decides how a document is split (fixed-size, recursive, semantic, parent-child, late-chunking). Both decisions affect what an inversion attack can recover, what poisoning needs, and what the retrieval ACL has to express. A change of embedding model or chunking is a release event with security consequences, not a routine performance tuning.
Vector store and metadata index
The database that holds the vectors plus the chunk text plus the metadata. The metadata typically carries source identifier, tenant identifier, document classification, owner, ingestion timestamp, source-system pointer, and the per-document ACL needed at retrieval time. The store enforces (or fails to enforce) per-tenant isolation, encryption at rest, encryption in transit, network access control, administrative access control, audit logging, and backup integrity. Many production failures are vector store admin-plane mistakes rather than retrieval-call mistakes.
Retrieval call and the authorisation context
The retrieval call assembles the question, embeds it, applies metadata filters (tenant identifier, document classification, owner), retrieves top-k nearest vectors, and returns the chunks to the orchestration layer. The authorisation context (who is asking, what tenant, what role, what document scope) has to reach the metadata filter as a non-bypassable predicate. A retrieval call that runs as a service identity with no per-request user predicate is the most common LLM08 failure pattern.
Reranker, fusion, and post-retrieval pipeline
Many production pipelines retrieve a wider candidate set, then run a cross-encoder reranker (Cohere Rerank, BGE Reranker, custom cross-encoder) over the candidates, then fuse with a keyword retrieval result (hybrid BM25 + vector), then deduplicate, then optionally summarise. Each stage can leak content that the original retrieval would have hidden, can drop ACL metadata, or can introduce a new ranking surface an attacker can game.
Model context assembly and the system prompt boundary
The orchestration layer assembles the retrieved chunks into the model prompt: which chunks, in which order, with what separator, with what citation rendering, and with what reminder of the system prompt and the tool permissions. A weak prompt boundary lets retrieved content override the system prompt; a strong boundary keeps the retrieved content quoted and labelled. The assembly is part of the LLM08 surface even though it executes between retrieval and the model.
Observability, audit log, and replay
The retrieval call carries who asked, what was asked, what the embedded query vector looked like, which document IDs were returned, what reranker scores were applied, what the final prompt looked like, and what the model returned. The log is the evidence the auditor reads to reconstruct a cross-tenant disclosure incident, a poisoning incident, or an inversion attempt. Without the log there is no incident response and no audit defence.
Index lifecycle and deletion semantics
Index lifecycle covers re-embedding (when the source document changes or the embedding model changes), partial deletion (when a tenant offboards or a record is purged for GDPR Article 17 right to erasure), reindex (when the schema or chunking changes), backup, and disaster recovery. A store that silently retains vectors after a delete-by-document call leaves an inversion footprint long after the source text is gone. A backup that survives the deletion is the same finding under a different name.
How it goes wrong
Cross-tenant retrieval without per-document ACL on the call
The retrieval call runs against the whole index. The metadata filter applies the tenant identifier only at ingestion (or not at all) rather than as a non-bypassable predicate on every query. A user asks a question. The retrieval returns chunks from a different tenant whose vectors happened to score high against the query embedding. The chunks land in the model context, the model paraphrases them, and the response contains another customer record.
Service-identity retrieval that bypasses the user authorisation context
The application calls the retrieval API with its own service identity rather than propagating the authenticated user. The retrieval layer trusts the service identity and skips the per-user ACL filter. Every user of the application can retrieve every document the service identity can read. A junior employee question retrieves an executive compensation document, a finance forecast, or a security incident record because the service identity has blanket index access.
Embedding inversion recovers sensitive source text from stored vectors
A vector store ships embeddings to a metric monitoring tool, a debug log, a development environment, a third-party analytics platform, or a backup the wider team can read. Published inversion attacks (Pan et al., Morris et al., Vec2Text) reconstruct paragraph-level text from public embeddings with surprising fidelity for many embedding models. The vectors are no longer a privacy boundary; they are the source data in a different encoding.
Embedding poisoning tilts retrieval toward attacker-controlled documents
An attacker controls a document the team will index (a public web page the crawler reads, a support ticket the customer creates, a comment field in a third-party system, a wiki page the attacker can edit). The attacker shapes the text to maximise embedding similarity against expected queries (adversarial-suffix techniques, instruction-stuffing, keyword-stuffing, semantic mirror tactics). On the next query the poisoned document wins the top-k slot and reaches the model context, where its content then drives the answer.
Ingestion accepts unreviewed documents into production retrieval
The ingestion job has no integrity gate. Any document any system pushes lands in the production index. A compromised upstream source, a misconfigured webhook, a typo in the source pointer, a malicious customer upload, or an accidentally public folder becomes part of the corpus the model reads from. The team finds out when the model surfaces an answer the team did not intend to ship.
Source text stored in the vector store alongside the vectors without classification
The vector store holds the raw chunk text next to the vector and the metadata. Operators with admin access to the store can list every document the AI feature has ever indexed. The store is treated as cache or as an internal lookup, not as a privileged data store, so the access list, the audit log, the encryption envelope, and the network policy are weaker than the source system the documents came from.
Embedding model change without rebuild leaves stale vectors that fail ACL evolution
The team upgrades the embedding model (from text-embedding-ada-002 to text-embedding-3-large, from a smaller open-weights model to a larger one). The index is partially re-embedded. Old vectors with old metadata schemas remain alongside new vectors with new metadata. A retrieval call hits both surfaces and the ACL field the team added in the new schema is missing on the old rows. Cross-document leakage follows.
Deletion of a source document leaves the vector and the chunk text behind
A user exercises a right-to-erasure request. A customer offboards. A document is reclassified and the team deletes it from the source system. The vector store still holds the vector and the chunk text. The model retrieves the chunk on the next query and surfaces information the team and the customer believe is gone. The same pattern shows up after retention-policy expiry, after legal-hold release, and after re-organisation moves.
Embedding denial-of-service through expensive query crafting
An attacker submits queries that force expensive embedding work (very long inputs that hit the per-token cost ceiling), expensive retrieval (queries that match deep into a heavy reranker), or excessive top-k requests. The retrieval budget is exhausted, the per-tenant cost ceiling is breached, the rate limit cascades into a serving outage, or the model provider quota is consumed before legitimate users reach the system. The class chains with classical missing-rate-limiting findings.
Reranker, hybrid retrieval, or post-processing leaks content past the ACL
The metadata filter runs on the initial retrieval but not on the BM25 hybrid leg, not on the reranker, not on the deduplication pass, not on the summarisation pass, and not on the citation rendering. A document filtered out at retrieval re-enters the pipeline through a parallel path. The ACL appears to enforce but the leak surface is one stage later than the team is checking.
Embedding cache or feature store ships embeddings to systems with weaker access control
A feature store or a model-serving cache holds pre-computed embeddings for performance. The cache lives in a system with weaker access control than the vector store. Operators, analytics jobs, debug consoles, and offline training pipelines read the cache freely. The cache becomes a parallel disclosure path with no audit chain back to the original document and the original ACL.
Retrieval call assembled into the prompt without source attribution and ACL trace
The orchestration layer concatenates the retrieved chunks into the prompt without preserving the source document identifier, the per-chunk ACL trace, or the citation pointer. A downstream incident review cannot reconstruct which document produced which answer. A right-to-explanation request cannot answer where the model got the claim. A cross-tenant leak cannot be triaged because the audit cannot say which tenant the leaked chunk came from.
Common causes
Treating the vector store as cache rather than a privileged data store
The team built the retrieval feature against a vector store with default credentials, a permissive admin role, weak network policy, no audit logging, and no encryption-at-rest review. The store inherited the security posture of an internal cache, but the records inside are derived from privileged source documents and the chunk text recovers the underlying content. The mismatch between perceived sensitivity and actual sensitivity drives the rest of the class.
Per-document ACL implemented at ingestion only, not at retrieval
The ingestion job copies the ACL from the source system into the metadata at write time. The retrieval call assumes the ingestion did the work and does not re-check on every read. ACL changes on the source after ingestion (revocation, reclassification, ownership transfer) do not propagate. A user whose access was revoked yesterday still retrieves the document today.
Single service identity for all retrieval calls instead of propagated user context
The application calls retrieval with a single service token. The retrieval layer cannot enforce per-user ACL because the per-user context is not present at retrieval. The platform team writes ACL logic in the application instead, where it is harder to audit and easier to skip. The retrieval becomes the platform-equivalent of running every database query as the application superuser.
Storing raw chunk text alongside vectors without redaction policy
The chunk text sits next to the vector and the metadata in the vector store. The team treats the chunk as needed for citation rendering but never audits which classification of content reaches the store. PII, secrets, legal-hold material, classified-document content, and confidential customer data all land in the same store with the same access control.
No audit log for retrieval, ranker, or context-assembly stages
The team logs the model call but not the retrieval, the rerank, or the context assembly. An incident reviewer can see the model output but cannot say which documents the model read, which user the retrieval served, or which ranker pushed which chunk to the top. The class becomes invisible to incident response and audit.
Embedding model swap, schema change, or chunking change without index rebuild
The team upgraded the embedding model, added a new ACL field to the metadata, or changed chunking strategy. The index was partially migrated. Old rows and new rows coexist. The retrieval call assumes the new schema. The old rows leak past the new ACL because the new predicate does not match the old field. The release went out without a rebuild plan.
How to detect it
Automated detection
- SecPortal code scanning runs Semgrep SAST and dependency analysis across connected GitHub, GitLab, and Bitbucket repositories. Findings surface at retrieval call sites that build the metadata filter without the authenticated user predicate, ingestion pipelines that write into the vector store without an ACL field, embedding-model wrappers that ship vectors out to analytics or logging sinks, and prompt-assembly code that drops the source attribution before the model call.
- External scanning enumerates exposed vector-store admin endpoints, public dashboards on hosted vector databases, leaked embedding API keys, accidentally public buckets that hold chunk text, and debug routes that disclose the embedding model name or vector store version. A publicly visible retrieval gap lands as a finding before an external researcher writes the writeup.
- Authenticated scanning drives the deployed AI feature with cross-tenant probes: requests under one user that ask questions whose answers should only exist in another tenant's documents, requests that target documents the user lost access to yesterday, requests crafted to maximise similarity against another tenant's known content, and requests that exercise the reranker and hybrid retrieval legs independently to detect ACL drift between stages.
- Continuous monitoring re-runs the retrieval-side ACL probe on the configured cadence. An embedding model swap, a metadata schema change, a chunking change, a new ingestion source, a new reranker, or a deployment manifest change that re-opens a previously closed retrieval ACL finding surfaces against the baseline rather than waiting for the next audit cycle.
- Bulk finding import accepts CSV intake from dedicated AI security scanners, vector-store posture tools, embedding inversion testers, and AI-aware DAST tools so external scanner results land on the same engagement record as the SecPortal probes with one CVSS 3.1 calibration applied across the LLM08 finding chain.
Manual testing
- Inventory the retrieval feature: the embedding model and version, the chunking policy, the vector store and tier, the metadata schema, the ACL field, the ingestion sources, the reranker (if any), the hybrid retrieval (if any), the post-processing pipeline, the prompt assembly, and the audit log shape. Record source identifier, classification, owner, and per-document ACL field per source.
- Walk the retrieval call and confirm a per-user authorisation predicate is applied as a non-bypassable metadata filter on every retrieval, not at ingestion only. Construct two test users in two tenants with overlapping topic coverage and confirm neither retrieves the other's documents under matched queries.
- Revoke a user's access to a known document in the source system. Confirm the next retrieval from the same user no longer returns the document, on retrieval, on the reranker leg, on the hybrid BM25 leg, and through any cached embedding path.
- Pull a small sample of vectors from the production vector store. Run a published embedding-inversion approach against the sample (Vec2Text or similar). If recoverable text is identifiable, the vectors carry confidentiality the access control did not enforce.
- Craft a poisoning candidate: a document with embedding-similarity-maximising text against expected queries and a chosen payload (instruction smuggling, factual override, citation injection). Push the candidate through every ingestion path the team supports (manual upload, crawler, webhook, partner sync). Confirm whether the integrity gate blocks the candidate or whether the candidate reaches production retrieval.
- Delete a known document through the official deletion path and verify the vector store no longer returns the chunk text and no longer returns a vector close to the deleted content. Repeat the check against the backup, the analytics pipeline, the feature store cache, and any embedding mirror the team operates.
- Run a denial-of-service probe: submit queries with long inputs, with top-k well beyond the documented ceiling, with token-stuffing patterns, and with rerank-expensive payloads. Confirm the per-tenant cost ceiling, the rate limit, and the back-pressure controls hold before serving capacity is exhausted.
How to fix it
Enforce per-document authorisation on every retrieval as a non-bypassable predicate
The retrieval call carries the authenticated user, the tenant identifier, and the role context. The metadata filter applies a per-document ACL predicate as a non-bypassable parameter on the vector store query (filter-by-metadata in Pinecone, where-clause in Qdrant, filter expression in Weaviate, k-NN filter in OpenSearch). The retrieval layer refuses calls without the user context. The platform team treats the ACL field as part of the schema, not as an optional metadata field.
Propagate the user identity into retrieval, not the service identity
The application passes the authenticated user, the tenant identifier, the document scope, and the role context into the retrieval call. The retrieval layer does not accept calls authenticated only with a service token. The audit log records the user, the tenant, the query, the filter, the returned document IDs, and the rerank scores per call so the call is replayable for incident response.
Treat the vector store as a privileged data store with full access control discipline
Encrypt vectors at rest and in transit. Restrict admin-plane access to named operators with MFA. Apply network policy that constrains the store to the production tenancy boundary. Enable per-operation audit logging on the admin plane. Rotate API keys and signing keys on a defined cadence. Apply the same backup-integrity, deletion-correctness, and disaster-recovery discipline the source system the documents came from runs under.
Apply integrity controls at ingestion before documents reach production retrieval
Each ingestion path runs through an integrity gate: source allow-list, document classification, ACL extraction, PII and secrets scan, content-policy check, and named-approver review for sources outside the allow-list. Unreviewed documents land in a staging index a human reviews before promotion. The integrity gate is a CI step on the ingestion pipeline, not a guideline.
Re-embed on ACL evolution, schema change, embedding model change, and chunking change
A change to the per-document ACL, the metadata schema, the embedding model, the chunking policy, or the reranker behaviour triggers an index rebuild or a staged migration with an explicit cutover plan. The old rows do not coexist silently with the new rows. The release pipeline blocks until the rebuild reports clean against the post-change schema.
Delete completely on right-to-erasure, retention expiry, and offboarding
A document deletion call removes the vector, the chunk text, the metadata row, the cached embedding, the backup that survives the policy retention window, and the analytics mirror. The deletion is auditable: an evidence record on the finding shows the vector store query returns zero results for the deleted document and the cache and backup are empty.
Apply ACL across reranker, hybrid retrieval, post-processing, and citation rendering
Every stage after the initial retrieval re-applies the per-user authorisation predicate. The BM25 hybrid leg honours the same filter. The reranker scoring does not promote documents the user cannot read. The deduplication and the summariser see only documents the user can read. The citation renderer surfaces document identifiers the user is authorised to see.
Protect embeddings as confidential data because vectors can be inverted
Embeddings do not leave the vector store. Analytics pipelines that ship vectors out have a security review before they ship. Feature stores and model-serving caches that hold embeddings carry the same access control as the vector store. Debug environments and development environments do not receive production embeddings. The inversion risk reads as a confidentiality control rather than as a performance control.
Constrain retrieval cost and back-pressure to prevent retrieval denial of service
Per-tenant quotas on retrieval calls per minute, per query token budget, per top-k ceiling, per reranker invocation budget, and per cost ceiling are enforced in the retrieval layer. Cost monitoring treats sudden retrieval cost spikes as a security signal, not as a finance metric. The class chains with the wider missing-rate-limiting discipline and reads against the same controls.
Log retrieval, rerank, context assembly, and model call as a single audit chain
Each user request emits a chained audit record: the question, the embedded query, the metadata filter, the retrieved document IDs, the rerank scores, the post-processing decisions, the prompt assembled, the model response, and the citations rendered. The chain is replayable. The audit identifies which user asked, which tenant the call served, which documents the model read, and which response the user received. The chain is the evidence pack for incident response and audit.
Run the retrieval-side regression on every release event
A new embedding model version, a chunking change, a metadata schema change, a reranker swap, a new ingestion source, a hybrid retrieval introduction, a prompt assembly refactor, or a deployment manifest change is a release event that re-runs the cross-tenant ACL probe, the embedding inversion test against the production sample, the poisoning candidate ingestion test, the deletion correctness test, and the retrieval cost probe. The release pipeline blocks until the regression passes.
What this looks like in SecPortal
Finding with the retrieval surface and the failed predicate
The finding captures the retrieval surface (embedding model and version, vector store identity, metadata schema, ACL field, reranker if present), the failed predicate (no per-user ACL, propagated service identity, missing reranker filter, deletion left a vector, vector mirrored to analytics, etc.), the observed disclosure or poisoning evidence, the affected user and tenant scope, and the deployment context. AppSec, product security, AI platform, data, vendor risk, and GRC read the same record the engineering team uses to ship the fix.
Code scanning across retrieval, ingestion, and prompt assembly
Code scanning runs Semgrep SAST and dependency analysis against connected GitHub, GitLab, and Bitbucket repositories. Findings surface at retrieval call sites missing the user predicate, ingestion pipelines writing without an ACL field, embedding wrappers shipping vectors to analytics, prompt assembly dropping source attribution, and vector-store client configurations that bypass the metadata filter.
External scanning across exposed AI infrastructure
External scanning enumerates exposed vector-store admin endpoints, public dashboards on hosted vector databases, leaked embedding API keys, accidentally public buckets that hold chunk text, and debug routes that disclose the embedding model name or vector store version. A publicly visible retrieval gap lands as a finding before an external researcher writes the writeup.
Authenticated scanning against cross-tenant retrieval
Authenticated scanning drives the deployed AI feature with cross-tenant probes under a real session: requests under one user that ask questions whose answers should only exist in another tenant's documents, requests that target documents the user lost access to, requests that exercise the reranker and hybrid retrieval legs independently to detect ACL drift between stages. Each finding ties the response to the retrieval surface that allowed the call.
Continuous monitoring against retrieval drift
Continuous monitoring re-runs the cross-tenant retrieval probe and the deletion-correctness check on the configured cadence. An embedding model swap, a metadata schema change, a chunking change, a new ingestion source, a new reranker, or a deployment manifest change that re-opens a previously closed retrieval ACL finding surfaces against the baseline rather than waiting for the next audit.
Bulk import for vector-store and AI security scanners
Bulk finding import accepts CSV intake from dedicated AI security scanners, vector-store posture tools, embedding inversion testers, and AI-aware DAST output. External scanner results land on the same engagement record as the SecPortal probes with one CVSS 3.1 calibration applied across the LLM08 chain.
Document management for the canonical retrieval-feature inventory
Document management stores the retrieval-feature inventory, the per-feature ingestion source list, the metadata schema and ACL field documentation, the embedding model and chunking policy record, the reranker and hybrid retrieval configuration, the prompt assembly contract, the deletion-correctness procedure, and the per-framework control mapping. Each artefact attaches to the finding so the auditor reads the operating record the engineering programme runs against.
Retest after the remediation ships
Once the fix deploys (the retrieval call carries the user predicate, the reranker honours the same filter, the embedding mirror is removed, the deletion path purges vectors and cache and backup, the rate limit enforces per-tenant retrieval ceilings, or the integrity gate blocks the unreviewed source), a targeted retest replays the cross-tenant probe, the deletion check, the inversion check, and the cost probe, and records the post-fix result on the finding. The finding closes against the evidence rather than against a developer assertion.
AI-assisted writeups within verified scope
AI reports generate the writeup, the executive summary, and the developer-facing reproduction steps from the finding record. The narrative stays within the verified evidence on the finding (the retrieval surface, the failed predicate, the disclosure or poisoning evidence, the affected scope, the deployment context, the regulatory mapping) and does not invent ACL services, vector-store features, or runtime tooling the product does not have.
Compliance tracking pairs the fix to control evidence
Compliance tracking maps LLM08 findings to the controls that read against them: OWASP LLM Top 10 LLM08, OWASP AISVS Data and Knowledge Base chapters, NIST AI RMF Map and Manage, ISO/IEC 42001 AI lifecycle and data management, EU AI Act Article 10 data governance and Article 15 cybersecurity, GDPR Article 5 purpose limitation and Article 32 security of processing, NIST SSDF for AI extensions, ISO 27001 Annex A 5.34 privacy and protection of PII and A.8.2 to A.8.5 access control, and SOC 2 CC6.1 logical access.
What SecPortal does not do
SecPortal is the operating record where LLM08 vector and embedding findings, the retrieval-surface inventory, the failed-predicate evidence, the cross-tenant probe results, the deletion-correctness results, and the regulatory mapping land alongside the rest of the security backlog. The product does not host a vector store, does not host an embedding model, does not enforce retrieval ACLs at runtime, does not act as a retrieval proxy, does not perform embedding inversion as a managed service, does not generate a retrieval-corpus inventory automatically, and does not act as a vector-store posture management engine.
SecPortal does not provide a managed RAG service, a hosted reranker, a hosted hybrid retrieval pipeline, an inline AI gateway, an inline retrieval firewall, or a managed embedding cache. The product does not ship packaged connectors into Pinecone, Weaviate, Qdrant, Milvus, Chroma, pgvector, OpenSearch, Elasticsearch, Vespa, Redis Stack, Cohere, OpenAI Embeddings, Voyage AI, Jina, Hugging Face Inference, Jira, ServiceNow, Slack, SIEM, SOAR, or external CMDB systems. The discipline is the engineering practice on top of the operating record: AppSec, product security, AI platform, MLOps, data platform, vendor risk, and GRC teams write the retrieval-feature inventory, the per-document ACL predicate, the embedding-confidentiality policy, the ingestion integrity gate, the deletion-correctness procedure, the retrieval cost ceiling, and the CI gate that re-runs the retrieval regression on every release.
Related tools and reading
Vulnerability
Indirect prompt injection via RAG (LLM01 variant)
The instruction-content companion to LLM08. Where LLM08 covers retrieval-side security properties (who can read which document, can vectors leak, can retrieval be tilted), the RAG injection page covers the case where an attacker writes instructions into a document so the retrieval step smuggles them into the model context. The two findings often pair on the same engagement.
Vulnerability
Data and model poisoning (LLM04)
The training-time companion to LLM08 retrieval-index poisoning. LLM04 covers contamination of training, fine-tune, and offline-build corpora; LLM08 covers contamination of the live retrieval index at serving time. Defences read across.
Vulnerability
Sensitive information disclosure (LLM02)
The inference-side egress class. LLM08 covers the storage-side disclosure pathway through cross-tenant retrieval and embedding inversion; LLM02 covers the model-output disclosure once the chunk reaches the prompt. The two findings often chain.
Vulnerability
LLM supply chain vulnerabilities (LLM03)
The artefact-integrity companion. LLM03 covers the integrity envelope of the embedding model and the inference SDK; LLM08 covers what the embedding model then writes into and reads from the vector store. The two disciplines share the artefact identity but apply different controls.
Vulnerability
Broken access control
The classical OWASP A01 page. LLM08 is broken access control reframed for the retrieval layer: a missing per-user predicate on every retrieval call is the same finding pattern as a missing authorisation check on a classical API endpoint. The defences read directly across.
Vulnerability
Missing rate limiting
The classical denial-of-service control. LLM08 embedding-denial-of-service through expensive query crafting inherits the same control surface and reads against the same per-tenant quota and back-pressure discipline.
Blog
OWASP Top 10 for LLM Applications explained
The 2025 LLM Top 10 read in operating context, with LLM08 Vector and Embedding Weaknesses framed alongside LLM01 Prompt Injection, LLM02 Sensitive Information Disclosure, LLM03 Supply Chain, LLM04 Data and Model Poisoning, and the rest of the application risk catalogue.
Blog
AI security posture management (AI-SPM)
The wider AI-SPM frame the LLM08 finding lives inside: AI feature inventory, RAG corpus governance, retrieval-side ACL evidence, embedding-confidentiality policy, and the operating cadence that runs the retrieval regression on every release.
Blog
MLSecOps implementation guide
The MLSecOps programme the LLM08 finding sits inside: model lifecycle controls, embedding-model governance, vector-store hardening, ingestion integrity, and the supply chain regression that runs on every release.
Blog
AI Bill of Materials guide
The AIBOM rollout guide that produces the inventory the LLM08 retrieval-feature record reads against. CycloneDX ML-BOM, SPDX 3.0 AI Profile, embedding-model identity, and the vector-store identity pairing the retrieval feature depends on.
Framework
OWASP AISVS
The AI Security Verification Standard. The Data and Knowledge Base chapter and the Trust and Boundary chapter read directly against retrieval-side ACL enforcement, embedding-store integrity, and ingestion validation on the LLM08 finding record.
Framework
NIST AI Risk Management Framework
The Map, Measure, and Manage functions read directly against the corpus inventory, the per-document authorisation control, the integrity envelope, and the evidence the team can produce on each retrieval.
Framework
ISO/IEC 42001 AI management system
The AI management system standard. Annex A controls cover AI system lifecycle data management, third-party data risk, and accountability for the data the AI system reads. LLM08 findings land in the operating record those controls read against.
For
SecPortal for AppSec teams
The day-to-day workspace where AppSec engineers run the retrieval-feature inventory, the per-user ACL probe, the embedding-confidentiality review, and the LLM08 regression for every AI feature shipping in the product.
For
SecPortal for product security teams
The workspace where product security owns the retrieval-side authorisation envelope across releases, with the per-user ACL discipline, the embedding-confidentiality review, the deletion-correctness procedure, and the retrieval regression wired into the release process.
Feature
Code scanning
Semgrep-backed SAST and dependency analysis across connected GitHub, GitLab, and Bitbucket repositories. Findings surface at retrieval call sites missing the user predicate, ingestion paths missing the ACL field, embedding wrappers shipping vectors to analytics, and prompt assembly dropping source attribution.
Feature
Authenticated scanning
Seventeen authenticated modules behind stored credentials. The cross-tenant retrieval probe runs as a real user under a real session and tests the deployed AI feature for ACL drift across the retrieval, the reranker, and the hybrid retrieval legs.
Compliance impact
OWASP Top 10 for LLM Apps
LLM08:2025 - Vector and Embedding Weaknesses (retrieval-side ACL, embedding confidentiality, ingestion integrity, retrieval-side denial of service, deletion correctness)
OWASP AISVS
Data and Knowledge Base chapter, Trust and Boundary chapter - retrieval-side ACL enforcement, embedding-store integrity, ingestion validation
NIST AI RMF
Map, Measure, Manage; Govern - corpus inventory, per-document authorisation, integrity envelope, retrieval evidence
ISO/IEC 42001
AI management system - AI lifecycle data management, third-party data risk, accountability for the data the AI system reads
ISO 27001
Annex A 5.34 Privacy and protection of PII; A 8.2 Privileged access rights; A 8.3 Information access restriction; A 8.4 Access to source code; A 8.5 Secure authentication; A 5.12 Classification of information
SOC 2
CC6.1 Logical and physical access controls; CC6.2 User registration and authorisation; CC6.3 Change of user access; CC7.1 Detection and monitoring; CC8.1 Change management
NIST 800-53
AC-3 Access Enforcement; AC-4 Information Flow Enforcement; AC-6 Least Privilege; AU-2 Event Logging; SI-4 System Monitoring; CM-3 Configuration Change Control
NIST CSF 2.0
PR.AA Identity Management and Access Control; PR.DS Data Security; DE.CM Continuous Monitoring; ID.RA Risk Assessment; GV.SC Cybersecurity Supply Chain Risk Management
NIST SSDF
PS.1 protect each release integrity; PW.4 reuse existing well-secured software; PO.5 implement supporting toolchains; RV.1 identify and confirm vulnerabilities; RV.2 assess, prioritise, remediate vulnerabilities
PCI DSS
Requirement 6.2 Custom software developed securely; Requirement 7 Restrict access to system components and cardholder data by business need to know; Requirement 8 Identify users and authenticate access to system components; Requirement 10 Log and monitor all access
Related vulnerabilities
Indirect Prompt Injection via RAG
Data and Model Poisoning in LLM Applications
Sensitive Information Disclosure in LLM Applications
LLM Supply Chain Vulnerabilities
Improper Output Handling in LLM Applications
Excessive Agency in LLM Applications
System Prompt Leakage in LLM Applications
Broken Access Control
Broken Object Level Authorization (BOLA)
Missing Rate Limiting
Sensitive Data Exposure
Information Disclosure
Related features
Find vulnerabilities before they ship
Vulnerability scanning tools that map your attack surface
Test web apps behind the login
Vulnerability management software that tracks every finding
Monitor continuously catch regressions early
Bulk finding import bring your scanner data with you
Document management for every security engagement
Finding overrides that survive every scan cycle
Verify fixes and track reopens on the same finding record
Compliance tracking without a full GRC platform
Every action recorded across the workspace
Repository connections for SAST and SCA
AI-powered reports in seconds, not days
Track LLM08 vector and embedding findings against every AI feature
SecPortal records LLM08 findings against the AI feature, attaches the retrieval surface (embedding model, vector store, metadata schema, ACL field), captures the cross-tenant or inversion or poisoning evidence, generates AI-assisted writeups, and tracks the fix through retest. Start for free.
No credit card required. Free plan available forever.