Vulnerability-Lookup

FKIE_CVE-2026-53923

Vulnerability from fkie_nvd - Published: 2026-06-22 23:16 - Updated: 2026-06-24 16:51

Severity

7.5 (High) - CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

Summary

vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.

References

URL	Tags
security-advisories@github.com	https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e	Patch
security-advisories@github.com	https://github.com/vllm-project/vllm/pull/44971	Issue Tracking
security-advisories@github.com	https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4	Third Party Advisory

Impacted products

	Vendor	Product	Version
	vllm	vllm	*

JSON

To clipboard

{
  "affected": [
    {
      "affectedData": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.5.5, \u003c 0.23.1rc0"
            }
          ]
        }
      ],
      "source": "security-advisories@github.com"
    }
  ],
  "configurations": [
    {
      "nodes": [
        {
          "cpeMatch": [
            {
              "criteria": "cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*",
              "matchCriteriaId": "EC2E4E13-D3B7-4A9F-AF31-A9CD7753B6F4",
              "versionEndExcluding": "0.23.1",
              "versionStartIncluding": "0.5.5",
              "vulnerable": true
            }
          ],
          "negate": false,
          "operator": "OR"
        }
      ]
    }
  ],
  "cveTags": [],
  "descriptions": [
    {
      "lang": "en",
      "value": "vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM\u0027s GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users\u0027 inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0."
    }
  ],
  "id": "CVE-2026-53923",
  "lastModified": "2026-06-24T16:51:00.307",
  "metrics": {
    "cvssMetricV31": [
      {
        "cvssData": {
          "attackComplexity": "LOW",
          "attackVector": "NETWORK",
          "availabilityImpact": "NONE",
          "baseScore": 7.5,
          "baseSeverity": "HIGH",
          "confidentialityImpact": "HIGH",
          "integrityImpact": "NONE",
          "privilegesRequired": "NONE",
          "scope": "UNCHANGED",
          "userInteraction": "NONE",
          "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
          "version": "3.1"
        },
        "exploitabilityScore": 3.9,
        "impactScore": 3.6,
        "source": "nvd@nist.gov",
        "type": "Primary"
      }
    ],
    "cvssMetricV40": [
      {
        "cvssData": {
          "Automatable": "NOT_DEFINED",
          "Recovery": "NOT_DEFINED",
          "Safety": "NOT_DEFINED",
          "attackComplexity": "LOW",
          "attackRequirements": "NONE",
          "attackVector": "NETWORK",
          "availabilityRequirement": "NOT_DEFINED",
          "baseScore": 5.3,
          "baseSeverity": "MEDIUM",
          "confidentialityRequirement": "NOT_DEFINED",
          "exploitMaturity": "NOT_DEFINED",
          "integrityRequirement": "NOT_DEFINED",
          "modifiedAttackComplexity": "NOT_DEFINED",
          "modifiedAttackRequirements": "NOT_DEFINED",
          "modifiedAttackVector": "NOT_DEFINED",
          "modifiedPrivilegesRequired": "NOT_DEFINED",
          "modifiedSubAvailabilityImpact": "NOT_DEFINED",
          "modifiedSubConfidentialityImpact": "NOT_DEFINED",
          "modifiedSubIntegrityImpact": "NOT_DEFINED",
          "modifiedUserInteraction": "NOT_DEFINED",
          "modifiedVulnAvailabilityImpact": "NOT_DEFINED",
          "modifiedVulnConfidentialityImpact": "NOT_DEFINED",
          "modifiedVulnIntegrityImpact": "NOT_DEFINED",
          "privilegesRequired": "NONE",
          "providerUrgency": "NOT_DEFINED",
          "subAvailabilityImpact": "NONE",
          "subConfidentialityImpact": "NONE",
          "subIntegrityImpact": "NONE",
          "userInteraction": "PASSIVE",
          "valueDensity": "NOT_DEFINED",
          "vectorString": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:L/VI:L/VA:N/SC:N/SI:N/SA:N/E:X/CR:X/IR:X/AR:X/MAV:X/MAC:X/MAT:X/MPR:X/MUI:X/MVC:X/MVI:X/MVA:X/MSC:X/MSI:X/MSA:X/S:X/AU:X/R:X/V:X/RE:X/U:X",
          "version": "4.0",
          "vulnAvailabilityImpact": "NONE",
          "vulnConfidentialityImpact": "LOW",
          "vulnIntegrityImpact": "LOW",
          "vulnerabilityResponseEffort": "NOT_DEFINED"
        },
        "source": "security-advisories@github.com",
        "type": "Secondary"
      }
    ],
    "ssvcV203": [
      {
        "source": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
        "ssvcData": {
          "id": "CVE-2026-53923",
          "options": [
            {
              "exploitation": "none"
            },
            {
              "automatable": "no"
            },
            {
              "technicalImpact": "partial"
            }
          ],
          "role": "CISA Coordinator",
          "timestamp": "2026-06-23T15:04:15.555317Z",
          "version": "2.0.3"
        }
      }
    ]
  },
  "published": "2026-06-22T23:16:30.737",
  "references": [
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Patch"
      ],
      "url": "https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e"
    },
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Issue Tracking"
      ],
      "url": "https://github.com/vllm-project/vllm/pull/44971"
    },
    {
      "source": "security-advisories@github.com",
      "tags": [
        "Third Party Advisory"
      ],
      "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4"
    }
  ],
  "sourceIdentifier": "security-advisories@github.com",
  "vulnStatus": "Analyzed",
  "weaknesses": [
    {
      "description": [
        {
          "lang": "en",
          "value": "CWE-200"
        },
        {
          "lang": "en",
          "value": "CWE-681"
        }
      ],
      "source": "security-advisories@github.com",
      "type": "Secondary"
    }
  ]
}

CVE-2026-53923 (GCVE-0-2026-53923)

Vulnerability from cvelistv5 – Published: 2026-06-22 21:55 – Updated: 2026-06-23 15:05

Title

vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow

Summary

Severity

5.3 (Medium)


                        
                          CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:L/VI:L/VA:N/SC:N/SI:N/SA:N

SSVC

Exploitation: none Automatable: no Technical Impact: partial

CISA Coordinator (v2.0.3)

CWE

CWE-681 - Incorrect Conversion between Numeric Types
CWE-200 - Exposure of Sensitive Information to an Unauthorized Actor

Assigner

GitHub_M

References

3 references

URL	Tags
https://github.com/vllm-project/vllm/security/adv…	x_refsource_CONFIRM
https://github.com/vllm-project/vllm/pull/44971	x_refsource_MISC
https://github.com/vllm-project/vllm/commit/f2197…	x_refsource_MISC

Impacted products

1 product

Vendor	Product	Version
vllm-project	vllm	Affected: >= 0.5.5, < 0.23.1rc0

Show details on NVD website

JSON

To clipboard

{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2026-53923",
                "options": [
                  {
                    "Exploitation": "none"
                  },
                  {
                    "Automatable": "no"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-06-23T15:04:15.555317Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-06-23T15:05:21.711Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "title": "CISA ADP Vulnrichment"
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "vllm",
          "vendor": "vllm-project",
          "versions": [
            {
              "status": "affected",
              "version": "\u003e= 0.5.5, \u003c 0.23.1rc0"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM\u0027s GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users\u0027 inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0."
        }
      ],
      "metrics": [
        {
          "cvssV4_0": {
            "attackComplexity": "LOW",
            "attackRequirements": "NONE",
            "attackVector": "NETWORK",
            "baseScore": 5.3,
            "baseSeverity": "MEDIUM",
            "privilegesRequired": "NONE",
            "subAvailabilityImpact": "NONE",
            "subConfidentialityImpact": "NONE",
            "subIntegrityImpact": "NONE",
            "userInteraction": "PASSIVE",
            "vectorString": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:P/VC:L/VI:L/VA:N/SC:N/SI:N/SA:N",
            "version": "4.0",
            "vulnAvailabilityImpact": "NONE",
            "vulnConfidentialityImpact": "LOW",
            "vulnIntegrityImpact": "LOW"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-681",
              "description": "CWE-681: Incorrect Conversion between Numeric Types",
              "lang": "en",
              "type": "CWE"
            }
          ]
        },
        {
          "descriptions": [
            {
              "cweId": "CWE-200",
              "description": "CWE-200: Exposure of Sensitive Information to an Unauthorized Actor",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-06-22T21:55:42.001Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4"
        },
        {
          "name": "https://github.com/vllm-project/vllm/pull/44971",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/pull/44971"
        },
        {
          "name": "https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e"
        }
      ],
      "source": {
        "advisory": "GHSA-5jv2-g5wq-cmr4",
        "discovery": "UNKNOWN"
      },
      "title": "vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2026-53923",
    "datePublished": "2026-06-22T21:55:42.001Z",
    "dateReserved": "2026-06-11T15:46:12.316Z",
    "dateUpdated": "2026-06-23T15:05:21.711Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

Sightings

Author	Source	Type	Date	Other

Nomenclature

Seen: The vulnerability was mentioned, discussed, or observed by the user.
Confirmed: The vulnerability has been validated from an analyst's perspective.
Published Proof of Concept: A public proof of concept is available for this vulnerability.
Exploited: The vulnerability was observed as exploited by the user who reported the sighting.
Patched: The vulnerability was observed as successfully patched by the user who reported the sighting.
Not exploited: The vulnerability was not observed as exploited by the user who reported the sighting.
Not confirmed: The user expressed doubt about the validity of the vulnerability.
Not patched: The vulnerability was not observed as successfully patched by the user who reported the sighting.

Detection rules are retrieved from Rulezet.

Action not permitted

FKIE_CVE-2026-53923

CVE-2026-53923 (GCVE-0-2026-53923)

Tags

Sightings

Nomenclature