Vulnerability-Lookup

GHSA-4R2X-XPJR-7CVV

Vulnerability from github – Published: 2026-02-02 17:43 – Updated: 2026-02-03 16:12

Summary

vLLM has RCE In Video Processing

Details

Summary

A chain of vulnerabilities in vLLM allows Remote Code Execution (RCE):

1. Info Leak - PIL error messages expose memory addresses, bypassing ASLR
2. Heap Overflow - JPEG2000 decoder in OpenCV/FFmpeg has a heap overflow that lets us hijack code execution

Result: Send a malicious video URL to vLLM Completions or Invocations for a video model -> Execute arbitrary commands on the server

A completely default vLLM instance, installed directly from pip or Docker, has no authentication, so no privileges ("None") are required. Even with the non-default API-key configuration enabled, the exploit remains feasible through the invocations route, which allows the payload to execute pre-auth.

An example heap target is provided; other heap targets can be exploited as well to achieve RCE. The leak allows a simple ASLR bypass. Leak + heap overflow achieves RCE on versions prior to 0.14.1.

Deployments not serving a video model are not affected.

1. Vulnerability Overview

1.1 The Bug: JPEG2000 cdef Box Heap Overflow

The JPEG2000 decoder used by OpenCV (cv2) honors a cdef box that can remap color channels. When Y (luma) is mapped into the U (chroma) plane buffer, the decoder writes a large Y plane into the smaller U buffer, causing a heap overflow.

Root Cause
- cdef allows channel remapping (e.g., Y→U, U→Y).
- Y plane size: W×H; U plane size: (W/2)×(H/2).
- Overflow size = W×H - (W/2×H/2) = 0.75 × W × H bytes.

Example (150×64)
- Y plane: 150×64 = 9,600 bytes
- U plane: 75×32 = 2,400 bytes
- Overflow: 7,200 bytes past the U buffer
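
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the advisory's formula: it assumes one byte per sample and even dimensions, and the function names `plane_sizes` and `overflow_bytes` are made up for illustration. A real decoder may round odd dimensions differently.

```python
from typing import Tuple


def plane_sizes(width: int, height: int) -> Tuple[int, int]:
    """Return (Y bytes, U bytes) using the advisory's W*H and (W/2)*(H/2)."""
    y_bytes = width * height
    u_bytes = (width // 2) * (height // 2)
    return y_bytes, u_bytes


def overflow_bytes(width: int, height: int) -> int:
    """Bytes written past the U buffer when the full Y plane lands in it."""
    y_bytes, u_bytes = plane_sizes(width, height)
    return y_bytes - u_bytes


print(plane_sizes(150, 64))     # (9600, 2400)
print(overflow_bytes(150, 64))  # 7200
```

Because the overflow is 0.75 × W × H, it scales with the pixel count of the frame.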

1.2 Malicious cdef Box

Offset  Size  Field           Value
0       4     Box Length      0x00000016 (22 bytes)
4       4     Box Type        'cdef'
8       2     N (channels)    0x0003
10      2     Channel 0 Cn    0x0000 (Y channel)
12      2     Channel 0 Typ   0x0000 (color)
14      2     Channel 0 Asoc  0x0002 (→ maps Y into U plane)
16      2     Channel 1 Cn    0x0001 (U channel)
18      2     Channel 1 Typ   0x0000 (color)
20      2     Channel 1 Asoc  0x0001 (→ maps U into Y plane)
22      2     Channel 2 Cn    0x0002 (V channel)
24      2     Channel 2 Typ   0x0000 (color)
26      2     Channel 2 Asoc  0x0003 (→ maps V plane)

Key control: Asoc=2 for channel 0 forces Y data into the U buffer, triggering the overflow.
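
From a defensive angle, the table can be read back with a small parser that flags color channels whose association is not the identity mapping (Asoc = Cn + 1). The sketch below is an illustration, not code from vLLM, OpenCV, or FFmpeg, and the helper names are hypothetical. It assumes the payload starts right after the 8-byte box header and that all fields are big-endian 16-bit integers, as laid out above (the listed fields span 28 bytes: an 8-byte header, a 2-byte count, and three 6-byte entries), so it does not rely on the Box Length value. A non-identity mapping can appear in legitimate files, so the result is a heuristic signal only.

```python
import struct
from typing import List, Tuple

Entry = Tuple[int, int, int]  # (Cn, Typ, Asoc)


def parse_cdef_payload(payload: bytes) -> List[Entry]:
    """Parse N followed by N x (Cn, Typ, Asoc), all big-endian uint16."""
    if len(payload) < 2:
        raise ValueError("cdef payload too short for channel count")
    (count,) = struct.unpack_from(">H", payload, 0)
    if len(payload) < 2 + 6 * count:
        raise ValueError("cdef payload shorter than declared channel count")
    return [struct.unpack_from(">HHH", payload, 2 + 6 * i) for i in range(count)]


def remapped_color_channels(entries: List[Entry]) -> List[Entry]:
    """Return color entries (Typ == 0) whose Asoc is not Cn + 1."""
    return [(cn, typ, asoc) for cn, typ, asoc in entries
            if typ == 0 and asoc != cn + 1]


# Identity mapping: channel 0 -> color 1, channel 1 -> color 2, channel 2 -> color 3.
identity = struct.pack(">H", 3) + b"".join(
    struct.pack(">HHH", cn, 0, cn + 1) for cn in range(3)
)
print(remapped_color_channels(parse_cdef_payload(identity)))  # []
```

Fed the values from the table, the same check reports channels 0 and 1 as remapped.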

Vulnerable Code Chain

1) Entry: vLLM accepts a remote `video_url` and downloads raw bytes

vLLM’s OpenAI-compatible API supports a video_url content part:

class VideoURL(TypedDict, total=False):
    url: Required[str]

class ChatCompletionContentPartVideoParam(TypedDict, total=False):
    video_url: Required[VideoURL]
    type: Required[Literal["video_url"]]

Source: src/vllm/entrypoints/chat_utils.py.
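
For orientation, a content part matching these types is just a nested dictionary. The sketch below redeclares simplified stand-ins for the two classes so that it runs on its own: it uses plain `TypedDict` classes in which every key is required, which is equivalent in effect to `total=False` with every key marked `Required`. The class name `VideoContentPart` and the example URL are placeholders, not taken from vLLM.

```python
from typing import Literal, TypedDict


class VideoURL(TypedDict):
    url: str


class VideoContentPart(TypedDict):
    # Simplified stand-in for ChatCompletionContentPartVideoParam.
    video_url: VideoURL
    type: Literal["video_url"]


# Placeholder URL; any HTTP(S) URL supplied here is what the server will fetch.
part: VideoContentPart = {
    "type": "video_url",
    "video_url": {"url": "https://example.com/clip.mp4"},
}
print(part["video_url"]["url"])
```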

When the URL is HTTP(S), vLLM downloads it as raw bytes and passes the bytes into the modality loader:

if url_spec.scheme.startswith("http"):
    data = connection.get_bytes(url, timeout=fetch_timeout, allow_redirects=...)
    return media_io.load_bytes(data)

Source: src/vllm/multimodal/utils.py (MediaConnector.load_from_url).

2) Decode: vLLM uses OpenCV (cv2) VideoCapture on an in-memory byte stream

The default video backend is OpenCV, and it constructs cv2.VideoCapture over a BytesIO buffer containing the downloaded bytes:

backend = cls().get_cv2_video_api()
cap = cv2.VideoCapture(BytesIO(data), backend, [])
if not cap.isOpened():
    raise ValueError("Could not open video stream")

Source: src/vllm/multimodal/video.py (OpenCVVideoBackend.load_bytes).

The backend is selected from OpenCV’s stream-buffered backends registry:

import cv2.videoio_registry as vr
for backend in vr.getStreamBufferedBackends():
    if vr.hasBackend(backend) and ...:
        api_pref = backend
        break
return api_pref

Source: src/vllm/multimodal/video.py (OpenCVVideoBackend.get_cv2_video_api).

Implication: vLLM is delegating container parsing + codec decode to OpenCV’s Video I/O stack (which, in typical builds, is backed by FFmpeg for MOV/MP4 and codecs like JPEG2000).

3) The actual overflow: Y (full-res) written into U (quarter-res)

When the decoder honors the remap and writes Y into the U-plane buffer, it writes too many bytes:

- Y plane bytes: W × H
- U plane bytes: (W/2) × (H/2)
- Overflow bytes: W × H - (W/2 × H/2) = 0.75 × W × H

Concrete example tried (150×64):

- Y: 150 × 64 = 9600 bytes
- U: 75 × 32 = 2400 bytes
- Overflow: 9600 - 2400 = 7200 bytes past the end of the U allocation

This is a heap buffer overflow into whatever allocations follow the U-plane buffer in the decoder’s heap layout (structures, metadata, other buffers, etc.). The exact victims depend on build + runtime allocator layout.

The Exploit Chain

Vuln 1: PIL BytesIO Address Leak (ASLR Bypass)

When you send an invalid image to vLLM's multimodal endpoint, PIL throws an error like:

cannot identify image file <_io.BytesIO object at 0x7a95e299e750>
                                                   ^^^^^^^^^^^^^^^^
                                                   LEAKED ADDRESS!

vLLM returns this error to the client, leaking a heap address. This address is ~10.33 GB before libc in memory. With this leak, we reduce ASLR from 4 billion guesses to ~8 guesses.

Vuln 2: JPEG2000 cdef Heap Overflow (RCE)

vLLM uses OpenCV (cv2) to decode videos. OpenCV bundles FFmpeg 5.1.x, which has a heap overflow in the JPEG2000 decoder. Because OpenCV handles video decoding, a video built from JPEG2000 frames reaches the vulnerable code:

vLLM API Request to Completions/Invocation
     ↓
OpenCV cv2.VideoCapture()
     ↓
FFmpeg 5.1 (bundled in OpenCV)
     ↓
JPEG2000 decoder (libopenjp2)
     ↓
HEAP OVERFLOW via malicious "cdef" box
     ↓
Overwrite function pointer → RCE!

How the overflow works:
- JPEG2000 has a cdef box that remaps color channels
- We remap Y (luma) into the U (chroma) buffer
- Y plane = 9,600 bytes, U plane = 2,400 bytes
- On a small geometry such as a 150x64 pixel image, we get a 7,200-byte overflow past the U buffer. The overflow grows with image area (0.75 × W × H bytes), so larger images overflow further.
- This overwrites an AVBuffer structure containing a free() function pointer. Any other function pointer or heap target could serve instead.
- We set free = system() and opaque = "command string"
- When the buffer is freed → system("our command") executes

vLLM Attack Surface

Affected Endpoints

Both multimodal endpoints are vulnerable:

POST /v1/chat/completions     (with video_url in content)
POST /v1/invocations          (with video_url in content)

Request Flow

1. Attacker sends request with video_url pointing to malicious .mov file
2. vLLM fetches the video from the URL
3. vLLM passes video bytes to cv2.VideoCapture()
4. OpenCV's bundled FFmpeg decodes JPEG2000 frames
5. Malicious cdef box triggers heap overflow
6. AVBuffer.free pointer overwritten with system()
7. When buffer is released → system("attacker command") executes

Versions Affected

Component	Version	Notes
vLLM	>= 0.8.3, < 0.14.1	Default config vulnerable when serving a video model
OpenCV (cv2)	4.x with FFmpeg bundle	Bundled FFmpeg is vulnerable
FFmpeg	5.1.x (bundled)	JPEG2000 cdef overflow
libopenjp2	2.x	Honors malicious cdef box
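
To check an installed release against the vLLM range in the table, a plain tuple comparison is enough. This is a minimal sketch with hypothetical helper names. It assumes simple dotted numeric versions such as "0.14.1" and raises `ValueError` on suffixes like ".post1" or "rc1", which a full version parser would handle.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.14.1' into (0, 14, 1); numeric dotted versions only."""
    return tuple(int(part) for part in version.split("."))


def is_affected(version: str) -> bool:
    """True when 0.8.3 <= version < 0.14.1, the range given in the table."""
    return (0, 8, 3) <= parse_version(version) < (0, 14, 1)


for candidate in ("0.8.2", "0.8.3", "0.14.0", "0.14.1"):
    print(candidate, is_affected(candidate))
# 0.8.2 False
# 0.8.3 True
# 0.14.0 True
# 0.14.1 False
```

The version alone is not the whole picture: per the advisory, deployments not serving a video model are not affected.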

Fixes

https://github.com/vllm-project/vllm/pull/31987
https://github.com/vllm-project/vllm/pull/32319
https://github.com/vllm-project/vllm/pull/32668

Severity

9.8 (Critical)

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "vllm"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.8.3"
            },
            {
              "fixed": "0.14.1"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-22778"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-122",
      "CWE-532"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-02-02T17:43:45Z",
    "nvd_published_at": "2026-02-02T23:16:06Z",
    "severity": "CRITICAL"
  },
  "details": "## Summary\n\n**A chain of vulnerabilities in vLLM allow Remote Code Execution (RCE):**\n\n1. **Info Leak** - PIL error messages expose memory addresses, bypassing ASLR\n2. **Heap Overflow** - JPEG2000 decoder in OpenCV/FFmpeg has a heap overflow that lets us hijack code execution\n\n**Result:** Send a malicious video URL to vLLM Completions or Invocations **for a video model** -\u003e Execute arbitrary commands on the server\n\nCompletely default vLLM instance directly from pip, or docker, does not have authentication so \"None\" privileges are required, but even with non-default api-key enabled configuration this exploit is feasible through invocations route that allows payload to execute pre-auth. \n\nExample heap target is provided, other heap targets can be exploited as well to achieve rce. Leak allows for simple ASLR bypass. Leak + heap overflow achieves RCE on versions prior to 0.14.1. \n\nDeployments not serving a video model are not affected.\n\n---\n\n\n## 1. Vulnerability Overview\n\n### 1.1 The Bug: JPEG2000 cdef Box Heap Overflow\nThe JPEG2000 decoder used by OpenCV (cv2) honors a `cdef` box that can remap color channels. 
When Y (luma) is mapped into the U (chroma) plane buffer, the decoder writes a large Y plane into the smaller U buffer, causing a heap overflow.\n\n**Root Cause**\n- `cdef` allows channel remapping (e.g., Y\u2192U, U\u2192Y).\n- Y plane size: `W\u00d7H`; U plane size: `(W/2)\u00d7(H/2)`.\n- Overflow size = `W\u00d7H - (W/2\u00d7H/2)` = `0.75 \u00d7 W \u00d7 H` bytes.\n\n**Example (150\u00d764)**\n- Y plane: 150\u00d764 = 9,600 bytes  \n- U plane: 75\u00d732 = 2,400 bytes  \n- Overflow: 7,200 bytes past the U buffer\n\n### 1.2 Malicious cdef Box\n```\nOffset  Size  Field           Value\n0       4     Box Length      0x00000016 (22 bytes)\n4       4     Box Type        \u0027cdef\u0027\n8       2     N (channels)    0x0003\n10      2     Channel 0 Cn    0x0000 (Y channel)\n12      2     Channel 0 Typ   0x0000 (color)\n14      2     Channel 0 Asoc  0x0002 (\u2192 maps Y into U plane)\n16      2     Channel 1 Cn    0x0001 (U channel)\n18      2     Channel 1 Typ   0x0000 (color)\n20      2     Channel 1 Asoc  0x0001 (\u2192 maps U into Y plane)\n22      2     Channel 2 Cn    0x0002 (V channel)\n24      2     Channel 2 Typ   0x0000 (color)\n26      2     Channel 2 Asoc  0x0003 (\u2192 maps V plane)\n```\nKey control: `Asoc=2` for channel 0 forces Y data into the U buffer, triggering the overflow.\n\n---\n\n## Vulnerable Code Chain\n\n### 1) Entry: vLLM accepts a remote `video_url` and downloads raw bytes\n\nvLLM\u2019s OpenAI-compatible API supports a `video_url` content part:\n\n```python\nclass VideoURL(TypedDict, total=False):\n    url: Required[str]\n\nclass ChatCompletionContentPartVideoParam(TypedDict, total=False):\n    video_url: Required[VideoURL]\n    type: Required[Literal[\"video_url\"]]\n```\n\nSource: `src/vllm/entrypoints/chat_utils.py`.\n\nWhen the URL is HTTP(S), vLLM downloads it as **raw bytes** and passes the bytes into the modality loader:\n\n```python\nif url_spec.scheme.startswith(\"http\"):\n    data = connection.get_bytes(url, 
timeout=fetch_timeout, allow_redirects=...)\n    return media_io.load_bytes(data)\n```\n\nSource: `src/vllm/multimodal/utils.py` (`MediaConnector.load_from_url`).\n\n---\n\n### 2) Decode: vLLM uses OpenCV (cv2) VideoCapture on an in-memory byte stream\n\nThe default video backend is OpenCV, and it constructs `cv2.VideoCapture` over a `BytesIO` buffer containing the downloaded bytes:\n\n```python\nbackend = cls().get_cv2_video_api()\ncap = cv2.VideoCapture(BytesIO(data), backend, [])\nif not cap.isOpened():\n    raise ValueError(\"Could not open video stream\")\n```\n\nSource: `src/vllm/multimodal/video.py` (`OpenCVVideoBackend.load_bytes`).\n\nThe backend is selected from OpenCV\u2019s stream-buffered backends registry:\n\n```python\nimport cv2.videoio_registry as vr\nfor backend in vr.getStreamBufferedBackends():\n    if vr.hasBackend(backend) and ...:\n        api_pref = backend\n        break\nreturn api_pref\n```\n\nSource: `src/vllm/multimodal/video.py` (`OpenCVVideoBackend.get_cv2_video_api`).\n\n**Implication**: vLLM is delegating container parsing + codec decode to OpenCV\u2019s Video I/O stack (which, in typical builds, is backed by FFmpeg for MOV/MP4 and codecs like JPEG2000).\n\n---\n\n### 3) The actual overflow: Y (full-res) written into U (quarter-res)\n\nWhen the decoder honors the remap and writes Y into the U-plane buffer, it writes **too many bytes**:\n\n- Y plane bytes: \\(W \\times H\\)\n- U plane bytes: \\((W/2) \\times (H/2)\\)\n- Overflow bytes: \\(W \\times H - (W/2 \\times H/2) = 0.75 \\times W \\times H\\)\n\nConcrete example tried (150\u00d764):\n\n- **Y**: \\(150 \\times 64 = 9600\\) bytes  \n- **U**: \\(75 \\times 32 = 2400\\) bytes  \n- **Overflow**: \\(9600 - 2400 = 7200\\) bytes past the end of the U allocation\n\nThis is a **heap buffer overflow** into whatever allocations follow the U-plane buffer in the decoder\u2019s heap layout (structures, metadata, other buffers, etc.). 
The exact victims depend on build + runtime allocator layout.\n\n---\n\n## The Exploit Chain \n\n### Vuln 1: PIL BytesIO Address Leak (ASLR Bypass)\n\nWhen you send an **invalid image** to vLLM\u0027s multimodal endpoint, PIL throws an error like:\n\n```\ncannot identify image file \u003c_io.BytesIO object at 0x7a95e299e750\u003e\n                                                   ^^^^^^^^^^^^^^^^\n                                                   LEAKED ADDRESS!\n```\n\nvLLM returns this error to the client, **leaking a heap address**. This address is ~10.33 GB before `libc` in memory. With this leak, we reduce ASLR from **4 billion guesses to ~8 guesses**.\n\n### Vuln 2: JPEG2000 cdef Heap Overflow (RCE)\n\nvLLM uses **OpenCV (cv2)** to decode videos. OpenCV bundles **FFmpeg 5.1.x** which has a heap overflow in the JPEG2000 decoder. The OpenCV is used for video decoding so if we build a video from JPEG2000 frames it will reach the vuln:\n\n```\nvLLM API Request to Completions/Invocation\n     \u2193\nOpenCV cv2.VideoCapture()\n     \u2193\nFFmpeg 5.1 (bundled in OpenCV)\n     \u2193\nJPEG2000 decoder (libopenjp2)\n     \u2193\nHEAP OVERFLOW via malicious \"cdef\" box\n     \u2193\nOverwrite function pointer \u2192 RCE!\n```\n\n**How the overflow works:**\n- JPEG2000 has a `cdef` box that remaps color channels\n- We remap Y (luma) into the U (chroma) buffer\n- Y plane = 9,600 bytes, U plane = 2,400 bytes\n- On small geometry like 150x64 pixel image we get **7,200 bytes overflow** past the U buffer. We can grow that exponentially by making bigger images. \n- This overwrites an `AVBuffer` structure containing a `free()` function pointer. This could be any function pointer or other targets. 
\n- We set `free = system()` and `opaque = \"command string\"`\n- When the buffer is freed \u2192 `system(\"our command\")` executes\n\n---\n\n## vLLM Attack Surface\n\n### Affected Endpoints\n\nBoth multimodal endpoints are vulnerable:\n\n```\nPOST /v1/chat/completions     (with video_url in content)\nPOST /v1/invocations          (with video_url in content)\n```\n\n### Request Flow\n\n```\n1. Attacker sends request with video_url pointing to malicious .mov file\n2. vLLM fetches the video from the URL\n3. vLLM passes video bytes to cv2.VideoCapture()\n4. OpenCV\u0027s bundled FFmpeg decodes JPEG2000 frames\n5. Malicious cdef box triggers heap overflow\n6. AVBuffer.free pointer overwritten with system()\n7. When buffer is released \u2192 system(\"attacker command\") executes\n```\n\n---\n\n## Versions Affected\n\n| Component | Version | Notes |\n|-----------|---------|-------|\n| vLLM | \u003e= 0.8.3, \u003c 0.14.1 | Default config vulnerable when serving a video model |\n| OpenCV (cv2) | 4.x with FFmpeg bundle | Bundled FFmpeg is vulnerable |\n| FFmpeg | 5.1.x (bundled) | JPEG2000 cdef overflow |\n| libopenjp2 | 2.x | Honors malicious cdef box |\n\n---\n\n## Fixes\n\n* https://github.com/vllm-project/vllm/pull/31987\n* https://github.com/vllm-project/vllm/pull/32319\n* https://github.com/vllm-project/vllm/pull/32668",
  "id": "GHSA-4r2x-xpjr-7cvv",
  "modified": "2026-02-03T16:12:12Z",
  "published": "2026-02-02T17:43:45Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-4r2x-xpjr-7cvv"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-22778"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/pull/31987"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/pull/32319"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/vllm-project/vllm"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/releases/tag/v0.14.1"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "vLLM has RCE In Video Processing"
}

CVE-2026-22778 (GCVE-0-2026-22778)

Vulnerability from cvelistv5 – Published: 2026-02-02 21:09 – Updated: 2026-02-03 15:42

Title

vLLM leaks a heap address when PIL throws an error

Summary

vLLM is an inference and serving engine for large language models (LLMs). From 0.8.3 to before 0.14.1, when an invalid image is sent to vLLM's multimodal endpoint, PIL throws an error. vLLM returns this error to the client, leaking a heap address. With this leak, we reduce ASLR from 4 billion guesses to ~8 guesses. This vulnerability can be chained with a heap overflow in the JPEG2000 decoder in OpenCV/FFmpeg to achieve remote code execution. This vulnerability is fixed in 0.14.1.

Severity

9.8 (Critical)

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

CWE

CWE-532 - Insertion of Sensitive Information into Log File

Assigner

GitHub_M

Vendor	Product	Version
vllm-project	vllm	Affected: >= 0.8.3, < 0.14.1

