CVE-2026-34760 (GCVE-0-2026-34760)
Vulnerability from cvelistv5 – Published: 2026-04-02 18:59 – Updated: 2026-04-03 14:42
Title
vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models
Summary
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.5.5 up to, but not including, 0.18.0, audio is downmixed to mono with Librosa, whose to_mono function defaults to numpy.mean (an unweighted average of the channels), whereas the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. As a result, the audio a human hears (e.g., through headphones or regular speakers) can differ from the audio an AI model processes when its infrastructure downmixes via Librosa (such as vLLM and Transformers). This issue has been patched in version 0.18.0.
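The gap between an unweighted mean and a weighted downmix can be illustrated with a small NumPy sketch. This is not vLLM's or Librosa's code: the five-channel layout (L, R, C, Ls, Rs), the function names, and the weight values are assumptions chosen for illustration, so consult ITU-R BS.775-4 for the normative coefficients.

```python
import numpy as np

# Hypothetical channel order: L, R, C, Ls, Rs.
# Illustrative weights only; check ITU-R BS.775-4 for the normative values.
ILLUSTRATIVE_WEIGHTS = [0.7071, 0.7071, 1.0, 0.5, 0.5]


def mean_downmix(audio):
    """Unweighted average over channels; audio has shape (channels, samples)."""
    return np.mean(audio, axis=0)


def weighted_downmix(audio, weights):
    """Weighted sum over channels using one coefficient per channel."""
    w = np.asarray(weights, dtype=audio.dtype)
    return w @ audio


# Crafted signal: the centre channel is cancelled by the two surround
# channels under a plain mean, but not under the weighted mix.
audio = np.zeros((5, 8))
audio[2] = 1.0   # C
audio[3] = -0.5  # Ls
audio[4] = -0.5  # Rs

mean_mix = mean_downmix(audio)
weighted_mix = weighted_downmix(audio, ILLUSTRATIVE_WEIGHTS)

print("mean downmix    :", mean_mix[:3])
print("weighted downmix:", weighted_mix[:3])
```

In this toy case the mean path yields silence while the weighted path does not, which is the kind of divergence between the model's input and a listener's playback that the summary describes.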
Severity
5.9 (Medium)
CWE
- CWE-20 - Improper Input Validation
Assigner
GitHub_M
References
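| URL | Tags |
|---|---|
| https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8 | x_refsource_CONFIRM |
| https://github.com/vllm-project/vllm/pull/37058 | x_refsource_MISC |
| https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4 | x_refsource_MISC |
| https://github.com/vllm-project/vllm/releases/tag/v0.18.0 | x_refsource_MISC |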
Impacted products
| Vendor | Product | Version |
|---|---|---|
| vllm-project | vllm | Affected: >= 0.5.5, < 0.18.0 |
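A deployment can be checked against the affected range with a few lines of Python. This is a minimal sketch: the helper names are hypothetical, and it only handles plain numeric release strings such as `0.17.1` (pre-release or local suffixes would need a full version parser).

```python
AFFECTED_FROM = (0, 5, 5)   # first affected release, inclusive
FIXED_IN = (0, 18, 0)       # first patched release, exclusive bound


def parse_version(text):
    """Turn '0.17.1' into (0, 17, 1), padding short versions with zeros."""
    parts = tuple(int(part) for part in text.split("."))
    return parts + (0,) * (3 - len(parts))


def is_affected(version):
    """Return True when version falls in >= 0.5.5, < 0.18.0."""
    return AFFECTED_FROM <= parse_version(version) < FIXED_IN


for candidate in ("0.5.4", "0.5.5", "0.17.1", "0.18.0"):
    print(candidate, "affected" if is_affected(candidate) else "not affected")
```

For anything beyond simple release numbers, a dedicated version-parsing library is likely the safer choice.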
{
"containers": {
"adp": [
{
"metrics": [
{
"other": {
"content": {
"id": "CVE-2026-34760",
"options": [
{
"Exploitation": "none"
},
{
"Automatable": "no"
},
{
"Technical Impact": "partial"
}
],
"role": "CISA Coordinator",
"timestamp": "2026-04-03T14:42:25.211772Z",
"version": "2.0.3"
},
"type": "ssvc"
}
}
],
"providerMetadata": {
"dateUpdated": "2026-04-03T14:42:34.842Z",
"orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
"shortName": "CISA-ADP"
},
"title": "CISA ADP Vulnrichment"
}
],
"cna": {
"affected": [
{
"product": "vllm",
"vendor": "vllm-project",
"versions": [
{
"status": "affected",
"version": "\u003e= 0.5.5, \u003c 0.18.0"
}
]
}
],
"descriptions": [
{
"lang": "en",
"value": "vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing (to_mono), while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy results in inconsistency between audio heard by humans (e.g., through headphones/regular speakers) and audio processed by AI models (Which infra via Librosa, such as vllm, transformer). This issue has been patched in version 0.18.0."
}
],
"metrics": [
{
"cvssV3_1": {
"attackComplexity": "HIGH",
"attackVector": "NETWORK",
"availabilityImpact": "LOW",
"baseScore": 5.9,
"baseSeverity": "MEDIUM",
"confidentialityImpact": "NONE",
"integrityImpact": "HIGH",
"privilegesRequired": "LOW",
"scope": "UNCHANGED",
"userInteraction": "NONE",
"vectorString": "CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L",
"version": "3.1"
}
}
],
"problemTypes": [
{
"descriptions": [
{
"cweId": "CWE-20",
"description": "CWE-20: Improper Input Validation",
"lang": "en",
"type": "CWE"
}
]
}
],
"providerMetadata": {
"dateUpdated": "2026-04-02T18:59:49.638Z",
"orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
"shortName": "GitHub_M"
},
"references": [
{
"name": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8",
"tags": [
"x_refsource_CONFIRM"
],
"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8"
},
{
"name": "https://github.com/vllm-project/vllm/pull/37058",
"tags": [
"x_refsource_MISC"
],
"url": "https://github.com/vllm-project/vllm/pull/37058"
},
{
"name": "https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4",
"tags": [
"x_refsource_MISC"
],
"url": "https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4"
},
{
"name": "https://github.com/vllm-project/vllm/releases/tag/v0.18.0",
"tags": [
"x_refsource_MISC"
],
"url": "https://github.com/vllm-project/vllm/releases/tag/v0.18.0"
}
],
"source": {
"advisory": "GHSA-6c4r-fmh3-7rh8",
"discovery": "UNKNOWN"
},
"title": "vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models"
}
},
"cveMetadata": {
"assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
"assignerShortName": "GitHub_M",
"cveId": "CVE-2026-34760",
"datePublished": "2026-04-02T18:59:49.638Z",
"dateReserved": "2026-03-30T19:17:10.225Z",
"dateUpdated": "2026-04-03T14:42:34.842Z",
"state": "PUBLISHED"
},
"dataType": "CVE_RECORD",
"dataVersion": "5.2",
"vulnerability-lookup:meta": {
"nvd": "{\"cve\":{\"id\":\"CVE-2026-34760\",\"sourceIdentifier\":\"security-advisories@github.com\",\"published\":\"2026-04-02T20:16:25.437\",\"lastModified\":\"2026-04-03T16:10:23.730\",\"vulnStatus\":\"Undergoing Analysis\",\"cveTags\":[],\"descriptions\":[{\"lang\":\"en\",\"value\":\"vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing (to_mono), while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy results in inconsistency between audio heard by humans (e.g., through headphones/regular speakers) and audio processed by AI models (Which infra via Librosa, such as vllm, transformer). This issue has been patched in version 0.18.0.\"}],\"metrics\":{\"cvssMetricV31\":[{\"source\":\"security-advisories@github.com\",\"type\":\"Secondary\",\"cvssData\":{\"version\":\"3.1\",\"vectorString\":\"CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L\",\"baseScore\":5.9,\"baseSeverity\":\"MEDIUM\",\"attackVector\":\"NETWORK\",\"attackComplexity\":\"HIGH\",\"privilegesRequired\":\"LOW\",\"userInteraction\":\"NONE\",\"scope\":\"UNCHANGED\",\"confidentialityImpact\":\"NONE\",\"integrityImpact\":\"HIGH\",\"availabilityImpact\":\"LOW\"},\"exploitabilityScore\":1.6,\"impactScore\":4.2}]},\"weaknesses\":[{\"source\":\"security-advisories@github.com\",\"type\":\"Primary\",\"description\":[{\"lang\":\"en\",\"value\":\"CWE-20\"}]}],\"references\":[{\"url\":\"https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/vllm-project/vllm/pull/37058\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/vllm-project/vllm/releases/tag/v0.18.0\",\"source\":\"security-advisories@github.com\"},{\"url\":\"https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8\",\"source\":\"security-advisories@github.com\"}]}}",
"vulnrichment": {
"containers": "{\"adp\": [{\"title\": \"CISA ADP Vulnrichment\", \"metrics\": [{\"other\": {\"type\": \"ssvc\", \"content\": {\"id\": \"CVE-2026-34760\", \"role\": \"CISA Coordinator\", \"options\": [{\"Exploitation\": \"none\"}, {\"Automatable\": \"no\"}, {\"Technical Impact\": \"partial\"}], \"version\": \"2.0.3\", \"timestamp\": \"2026-04-03T14:42:25.211772Z\"}}}], \"providerMetadata\": {\"orgId\": \"134c704f-9b21-4f2e-91b3-4a467353bcc0\", \"shortName\": \"CISA-ADP\", \"dateUpdated\": \"2026-04-03T14:42:31.132Z\"}}], \"cna\": {\"title\": \"vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models\", \"source\": {\"advisory\": \"GHSA-6c4r-fmh3-7rh8\", \"discovery\": \"UNKNOWN\"}, \"metrics\": [{\"cvssV3_1\": {\"scope\": \"UNCHANGED\", \"version\": \"3.1\", \"baseScore\": 5.9, \"attackVector\": \"NETWORK\", \"baseSeverity\": \"MEDIUM\", \"vectorString\": \"CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L\", \"integrityImpact\": \"HIGH\", \"userInteraction\": \"NONE\", \"attackComplexity\": \"HIGH\", \"availabilityImpact\": \"LOW\", \"privilegesRequired\": \"LOW\", \"confidentialityImpact\": \"NONE\"}}], \"affected\": [{\"vendor\": \"vllm-project\", \"product\": \"vllm\", \"versions\": [{\"status\": \"affected\", \"version\": \"\u003e= 0.5.5, \u003c 0.18.0\"}]}], \"references\": [{\"url\": \"https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8\", \"name\": \"https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8\", \"tags\": [\"x_refsource_CONFIRM\"]}, {\"url\": \"https://github.com/vllm-project/vllm/pull/37058\", \"name\": \"https://github.com/vllm-project/vllm/pull/37058\", \"tags\": [\"x_refsource_MISC\"]}, {\"url\": \"https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4\", \"name\": \"https://github.com/vllm-project/vllm/commit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4\", \"tags\": [\"x_refsource_MISC\"]}, {\"url\": \"https://github.com/vllm-project/vllm/releases/tag/v0.18.0\", \"name\": \"https://github.com/vllm-project/vllm/releases/tag/v0.18.0\", \"tags\": [\"x_refsource_MISC\"]}], \"descriptions\": [{\"lang\": \"en\", \"value\": \"vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 to before version 0.18.0, Librosa defaults to using numpy.mean for mono downmixing (to_mono), while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy results in inconsistency between audio heard by humans (e.g., through headphones/regular speakers) and audio processed by AI models (Which infra via Librosa, such as vllm, transformer). This issue has been patched in version 0.18.0.\"}], \"problemTypes\": [{\"descriptions\": [{\"lang\": \"en\", \"type\": \"CWE\", \"cweId\": \"CWE-20\", \"description\": \"CWE-20: Improper Input Validation\"}]}], \"providerMetadata\": {\"orgId\": \"a0819718-46f1-4df5-94e2-005712e83aaa\", \"shortName\": \"GitHub_M\", \"dateUpdated\": \"2026-04-02T18:59:49.638Z\"}}}",
"cveMetadata": "{\"cveId\": \"CVE-2026-34760\", \"state\": \"PUBLISHED\", \"dateUpdated\": \"2026-04-03T14:42:34.842Z\", \"dateReserved\": \"2026-03-30T19:17:10.225Z\", \"assignerOrgId\": \"a0819718-46f1-4df5-94e2-005712e83aaa\", \"datePublished\": \"2026-04-02T18:59:49.638Z\", \"assignerShortName\": \"GitHub_M\"}",
"dataType": "CVE_RECORD",
"dataVersion": "5.2"
}
}
}
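The 5.9 base score in the record can be reproduced from its vector string, CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L. The sketch below applies the CVSS v3.1 base-score equations for the scope-unchanged case only. The function names are illustrative, and the metric weights are the published v3.1 constants as best recalled here, so verify them against the specification before relying on the result.

```python
import math

# CVSS v3.1 metric weights (scope-unchanged values for PR).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:N/I:H/A:L")
print(score)
```

The intermediate values come out at roughly 1.6 for exploitability and 4.2 for impact, in line with the `exploitabilityScore` and `impactScore` fields in the embedded NVD data.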