GHSA-7H4P-RFFG-7823

Vulnerability from github – Published: 2026-06-17 14:02 – Updated: 2026-06-17 14:02
Summary
vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels
Details

Summary

All temperature validation gates rely on comparison operators (<, >). Under Python's IEEE 754 float semantics, every ordered comparison involving NaN evaluates to False, and the specific checks used here (an upper-bound test and a negativity test) also evaluate to False for positive Infinity. Neither value triggers a rejection, so both propagate to the GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: -Infinity is correctly caught by the negativity check.

Root Cause

sampling_params.py:384:

if 0 < self.temperature < _MAX_TEMP:  # NaN → False; +Inf → False

sampling_params.py:462:

if self.temperature < 0.0:            # NaN → False; +Inf → False
    raise VLLMValidationError(...)

No math.isnan() or math.isinf() check exists anywhere in sampling_params.py.

Python semantics (verified): float('nan') < 0.0 → False, float('inf') < 0.0 → False.
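
The comparison behavior described above can be reproduced in plain Python. The following is a minimal sketch, not vLLM code: `guard_results` is a hypothetical helper, and the value given to `_MAX_TEMP` is an assumption, since the advisory does not state the real constant.

```python
import math

# Assumed placeholder value; the real _MAX_TEMP in vLLM is not given in the advisory.
_MAX_TEMP = 100.0


def guard_results(temperature: float) -> dict:
    """Evaluate the two quoted comparison guards, plus math.isfinite, for one value."""
    return {
        "0 < t < _MAX_TEMP": 0 < temperature < _MAX_TEMP,
        "t < 0.0": temperature < 0.0,
        "math.isfinite(t)": math.isfinite(temperature),
    }


for value in (float("nan"), float("inf"), float("-inf"), 0.7):
    print(value, guard_results(value))
```

For NaN and +Infinity both comparison guards come out False, so neither raises; only -Infinity makes `t < 0.0` True. `math.isfinite` is False for all three non-finite values.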

Impact

A NaN or Infinity temperature that reaches the softmax input of a GPU sampling kernel can crash the inference worker, degrading service for all concurrent users.
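
To illustrate how a non-finite temperature contaminates the sampling distribution, here is a minimal CPU-only sketch of temperature-scaled softmax in plain Python. `softmax_with_temperature` is a hypothetical helper, not vLLM's kernel, and it shows only IEEE 754 arithmetic; what the actual GPU kernels do with such inputs is not reproduced here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Numerically stable softmax over logits divided by temperature (illustrative only)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 2.0, 3.0]
print(softmax_with_temperature(logits, 1.0))           # an ordinary distribution
print(softmax_with_temperature(logits, float("nan")))  # every probability is NaN
print(softmax_with_temperature(logits, float("inf")))  # finite logits collapse to equal weights
```

In this sketch a NaN temperature turns every probability into NaN, which is not a valid distribution to sample from. With finite logits, a +Infinity temperature scales every logit to zero; the advisory reports that on the real GPU path both values can lead to undefined behavior or CUDA errors.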

Remediation

Add a math.isfinite(self.temperature) check in _verify_args(), and reject non-finite float values with a 400 error.
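
A minimal sketch of that kind of check follows. It is not the merged patch: `ValidationError` is a hypothetical stand-in for VLLMValidationError, `verify_temperature` is a hypothetical standalone function rather than the real `_verify_args()` method, and translating the exception into an HTTP 400 response is assumed to happen in the serving layer.

```python
import math


class ValidationError(ValueError):
    """Hypothetical stand-in for vLLM's VLLMValidationError."""


def verify_temperature(temperature: float) -> float:
    """Reject non-finite and negative temperatures; return the value if valid."""
    # math.isfinite is False for NaN, +Infinity and -Infinity, so this single
    # check covers the cases that the comparison-only guards let through.
    if not math.isfinite(temperature):
        raise ValidationError(
            f"temperature must be a finite number, got {temperature!r}"
        )
    if temperature < 0.0:
        raise ValidationError(
            f"temperature must be non-negative, got {temperature!r}"
        )
    return temperature
```

Running the finiteness test before any range comparison matters: once NaN is excluded, the remaining `<` checks behave as ordinary ordered comparisons.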

Fix

A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116


{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "vllm"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "last_affected": "0.23.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-54235"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-1287"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-06-17T14:02:22Z",
    "nvd_published_at": null,
    "severity": "MODERATE"
  },
  "details": "## Summary\n\nAll temperature validation gates use comparison operators (`\u003c`, `\u003e`), which silently evaluate to `False` for `NaN` and for positive `Infinity` in Python\u0027s IEEE 754 float semantics. Both values pass every guard and propagate to GPU sampling kernels, where they produce undefined behavior or CUDA errors that can crash the inference worker. Note: `-Infinity` is correctly caught.\n\n## Root Cause\n\n`sampling_params.py:384`:\n```python\nif 0 \u003c self.temperature \u003c _MAX_TEMP:  # NaN \u2192 False; +Inf \u2192 False\n```\n\n`sampling_params.py:462`:\n```python\nif self.temperature \u003c 0.0:            # NaN \u2192 False; +Inf \u2192 False\n    raise VLLMValidationError(...)\n```\n\nNo `math.isnan()` or `math.isinf()` check exists anywhere in `sampling_params.py`.\n\nPython semantics (verified): `float(\u0027nan\u0027) \u003c 0.0` \u2192 `False`, `float(\u0027inf\u0027) \u003c 0.0` \u2192 `False`.\n\n\n## Impact\n\nCrash of inference worker on GPU kernel execution with NaN/Inf softmax input, degrading service for all concurrent users.\n\n## Remediation\n\nAdd `math.isfinite(self.temperature)` check in `_verify_args()`. Reject non-finite float values with a 400 error.\n\n## Fix\n\nA fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/45116",
  "id": "GHSA-7h4p-rffg-7823",
  "modified": "2026-06-17T14:02:22Z",
  "published": "2026-06-17T14:02:22Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-7h4p-rffg-7823"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/pull/45116"
    },
    {
      "type": "WEB",
      "url": "https://github.com/vllm-project/vllm/commit/d598d239737cfa37bcfcb98886ec3f3557fc7198"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/vllm-project/vllm"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:N/VA:L/SC:N/SI:N/SA:N",
      "type": "CVSS_V4"
    }
  ],
  "summary": "vLLM: temperature=NaN and temperature=Infinity bypass validation and propagate to GPU kernels"
}

