Vulnerability-Lookup

FKIE_CVE-2026-44223

Vulnerability from fkie_nvd - Published: 2026-05-12 20:16 - Updated: 2026-06-22 22:16

Severity

6.5 (Medium) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
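
The 6.5 base score can be reproduced from the vector string with the CVSS v3.1 base-score formulas. The sketch below is a minimal calculator for base metrics only (no temporal or environmental metrics); the function names `roundup` and `base_score` are illustrative and not part of any tool referenced in this record.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a vector string."""
    # Skip the leading "CVSS:3.1" prefix and map metric name -> value.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    scope = metrics["S"]

    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

For this vector the impact sub-score is about 3.6 and the exploitability sub-score about 2.8, which matches the `impactScore` and `exploitabilityScore` fields in the JSON record.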

Summary

vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition_penalty": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0.
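
As a rough triage aid, the affected range stated above (0.18.0 inclusive to 0.20.0 exclusive) can be checked against an installed version string. This is a minimal sketch: `parse_version` and `is_affected` are hypothetical helper names, only the leading numeric components are compared, and pre-release or local suffixes (for example `rc1`) are ignored, so such builds should be verified by hand.

```python
import re

# Range taken from the advisory text: >= 0.18.0, < 0.20.0.
AFFECTED_START = (0, 18, 0)
FIXED_IN = (0, 20, 0)


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip().lstrip("v"))
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(piece) for piece in match.group(0).split("."))
    # Pad short versions such as "0.19" to three components.
    return parts + (0,) * (3 - len(parts))


def is_affected(version):
    """True if the version falls inside the affected range of CVE-2026-44223."""
    return AFFECTED_START <= parse_version(version) < FIXED_IN


for candidate in ("0.17.9", "0.18.0", "0.19.1", "0.20.0"):
    print(candidate, is_affected(candidate))
```

In practice the version string could come from `importlib.metadata.version("vllm")` on the host being checked.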

References

Source	URL	Tags
security-advisories@github.com	https://github.com/vllm-project/vllm/pull/38610	Issue Tracking, Patch
security-advisories@github.com	https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw	Mitigation, Vendor Advisory
134c704f-9b21-4f2e-91b3-4a467353bcc0	https://github.com/vllm-project/vllm/pull/38610	Issue Tracking, Patch
134c704f-9b21-4f2e-91b3-4a467353bcc0	https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw	Mitigation, Vendor Advisory

Impacted products

Vendor	Product	Version
vllm	vllm	*

JSON

{
  "affected": [
    {
      "affectedData": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.18.0, \u003c 0.20.0"
            }
          ]
        }
      ],
      "source": "security-advisories@github.com"
    }
  ],
  "configurations": [
    {
      "nodes": [
        {
          "cpeMatch": [
            {
              "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*",
              "matchCriteriaId": "443F32C2-B323-4FA0-AB97-F6445C91C89E",
              "versionEndExcluding": "0.20.0",
              "versionStartIncluding": "0.18.0",
              "vulnerable": true
            }
          ],
          "negate": false,
          "operator": "OR"
        }
      ]
    }
  ],
  "cveTags": [],
  "descriptions": [
    {
      "lang": "en",
      "value": "vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., \"repetition_penalty\": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0."
    }
  ],
  "id": "CVE-2026-44223",
  "lastModified": "2026-06-22T22:16:45.507",
  "metrics": {
    "cvssMetricV31": [
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "NETWORK",
          "availabilityImpact": "HIGH",
          "baseScore": 6.5,
          "baseSeverity": "MEDIUM",
          "confidentialityImpact": "NONE",
          "integrityImpact": "NONE",
          "privilegesRequired": "LOW",
          "scope": "UNCHANGED",
          "userInteraction": "NONE",
          "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
          "version": "3.1"
        },
        "exploitabilityScore": 2.8,
        "impactScore": 3.6,
        "source": "security-advisories@github.com",
        "type": "Secondary"
      }
    ],
    "ssvcV203": [
      {
        "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
        "ssvcData": {
          "id": "CVE-2026-44223",
          "options": [
            {
              "exploitation": "poc"
            },
            {
              "automatable": "no"
            },
            {
              "technicalImpact": "partial"
            }
          ],
          "role": "CISA Coordinator",
          "timestamp": "2026-05-15T14:44:05.012494Z",
          "version": "2.0.3"
        }
      }
    ]
  },
  "published": "2026-05-12T20:16:43.293",
  "references": [
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Issue Tracking",
        "Patch"
      ],
      "url": "https://github.com/vllm-project/vllm/pull/38610"
    },
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Mitigation",
        "Vendor Advisory"
      ],
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw"
    },
    {
      "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
      "tags": [
        "Issue Tracking",
        "Patch"
      ],
      "url": "https://github.com/vllm-project/vllm/pull/38610"
    },
    {
      "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
      "tags": [
        "Mitigation",
        "Vendor Advisory"
      ],
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw"
    }
  ],
  "sourceIdentifier": "security-advisories@github.com",
  "vulnStatus": "Modified",
  "weaknesses": [
    {
      "description": [
        {
          "lang": "en",
          "value": "CWE-131"
        },
        {
          "lang": "en",
          "value": "CWE-704"
        }
      ],
      "source": "security-advisories@github.com",
      "type": "Secondary"
    }
  ]
}

CVE-2026-44223 (GCVE-0-2026-44223)

Vulnerability from cvelistv5 – Published: 2026-05-12 19:58 – Updated: 2026-06-22 21:49

Title

vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters
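
Because the trigger is tied to three named sampling parameters, a request body can be inspected for them before it reaches an affected server. The sketch below only detects the parameters; `penalty_params_in` is a hypothetical helper, the payload is assumed to be an already-parsed JSON object, and the record does not say whether particular values are required, so any explicitly set value is flagged.

```python
# Parameter names as listed in the CVE description.
PENALTY_PARAMS = ("repetition_penalty", "frequency_penalty", "presence_penalty")


def penalty_params_in(payload):
    """Return the penalty parameter names explicitly set in a request payload."""
    return [name for name in PENALTY_PARAMS if payload.get(name) is not None]


request_body = {"prompt": "Hello", "max_tokens": 16, "repetition_penalty": 1.1}
print(penalty_params_in(request_body))  # ['repetition_penalty']
```

Such a check is only useful for logging or triage on deployments that use the extract_hidden_states proposer; the fix named in the record is upgrading to 0.20.0.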

Severity

6.5 (Medium) - CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

SSVC

Exploitation: poc
Automatable: no
Technical Impact: partial

CISA Coordinator (v2.0.3)

CWE

CWE-131 - Incorrect Calculation of Buffer Size
CWE-704 - Incorrect Type Conversion or Cast

Assigner

GitHub_M

References

2 references

URL	Tags
https://github.com/vllm-project/vllm/security/adv…	x_refsource_CONFIRM
https://github.com/vllm-project/vllm/pull/38610	x_refsource_MISC

Impacted products

1 product

Vendor	Product	Version
vllm-project	vllm	Affected: >= 0.18.0, < 0.20.0

JSON

{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2026-44223",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-05-15T14:44:05.012494Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-05-15T14:46:25.695Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "references": [
          {
            "tags": [
              "exploit"
            ],
            "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw"
          },
          {
            "tags": [
              "exploit"
            ],
            "url": "https://github.com/vllm-project/vllm/pull/38610"
          }
        ],
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.18.0, \u003c 0.20.0"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., \"repetition_penalty\": 1.1) is sufficient to crash the server. This vulnerability is fixed in 0.20.0."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "HIGH",
            "baseScore": 6.5,
            "baseSeverity": "MEDIUM",
            "confidentialityImpact": "NONE",
            "integrityImpact": "NONE",
            "privilegesRequired": "LOW",
            "scope": "UNCHANGED",
            "userInteraction": "NONE",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-131",
              "description": "CWE-131: Incorrect Calculation of Buffer Size",
              "lang": "en",
              "type": "CWE"
            }
          ]
        },
        {
          "descriptions": [
            {
              "cweId": "CWE-704",
              "description": "CWE-704: Incorrect Type Conversion or Cast",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-06-22T21:49:24.277Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-83vm-p52w-f9pw"
        },
        {
          "name": "https://github.com/vllm-project/vllm/pull/38610",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/pull/38610"
        }
      ],
      "source": {
        "advisory": "GHSA-83vm-p52w-f9pw",
        "discovery": "UNKNOWN"
      },
      "title": "vLLM: extract_hidden_states speculative decoding crashes server on any request with penalty parameters"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2026-44223",
    "datePublished": "2026-05-12T19:58:40.862Z",
    "dateReserved": "2026-05-05T15:42:40.518Z",
    "dateUpdated": "2026-06-22T21:49:24.277Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

Sightings

Author	Source	Type	Date	Other

Nomenclature

Seen: The vulnerability was mentioned, discussed, or observed by the user.
Confirmed: The vulnerability has been validated from an analyst's perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
Not confirmed: The user expressed doubt about the validity of the vulnerability.
Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.

Detection rules are retrieved from Rulezet.
