GHSA-2QV6-9WX5-CWV4
Vulnerability from github – Published: 2026-05-27 00:09 – Updated: 2026-05-27 00:09
Summary
The strip_html filter in liquidjs is intended to remove HTML tags from a string before rendering, and is widely used as an XSS sanitizer. The implementation uses a regex whose catch-all branch (<.*?>) does not match line terminators, so any HTML tag containing a \n or \r character passes through unmodified. An attacker who can place a newline inside a tag (e.g. <img\nsrc=x\nonerror=alert(1)>) bypasses sanitization entirely, since browsers treat newlines as whitespace within a tag and execute the resulting onerror/onload/etc. handler. This results in stored or reflected XSS in any application that relies on strip_html to neutralize untrusted HTML.
Details
The vulnerable code is in src/filters/html.ts:
// src/filters/html.ts:45-49
export function strip_html (this: FilterImpl, v: string) {
  const str = stringify(v)
  this.context.memoryLimit.use(str.length)
  return str.replace(/<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>|<.*?>|<!--[\s\S]*?-->/g, '')
}
The regex has four alternations:
1. <script[\s\S]*?<\/script> — uses [\s\S], matches across newlines.
2. <style[\s\S]*?<\/style> — uses [\s\S], matches across newlines.
3. <.*?> — uses ., which in JavaScript does not match \n or \r (no s/dotAll flag set).
4. <!--[\s\S]*?--> — uses [\s\S], matches across newlines.
Branch 3 is the catch-all for "any other tag." Because . excludes line terminators, a tag containing a newline does not match any alternative. The literal characters of the tag are passed through to the output.
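The difference between . and [\s\S] can be checked in isolation. The sketch below is plain JavaScript with no liquidjs dependency; it exercises only the catch-all branch, and the variable names are illustrative rather than taken from the project.

```javascript
// Illustrative check: `.` skips line terminators unless the `s` (dotAll) flag is set.
const tag = '<img\nsrc=x\nonerror=alert(1)>';

const dotBranch = /<.*?>/g;         // catch-all branch as written in strip_html
const classBranch = /<[\s\S]*?>/g;  // same branch using [\s\S]
const dotAllBranch = /<.*?>/gs;     // same branch with the dotAll flag

const results = {
  dot: tag.replace(dotBranch, ''),         // no match, tag survives
  charClass: tag.replace(classBranch, ''), // tag removed
  dotAll: tag.replace(dotAllBranch, ''),   // tag removed
};

console.log(JSON.stringify(results));
// → {"dot":"<img\nsrc=x\nonerror=alert(1)>","charClass":"","dotAll":""}
```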
Browsers, however, parse HTML tag content with whitespace tolerance: per the HTML spec, attribute names and values may be separated by ASCII whitespace, which includes \n and \r. So <img\nsrc=x\nonerror=alert(1)> is parsed as a valid img element with an onerror handler.
liquidjs' default rendering pipeline does not auto-escape filter output (the outputEscape engine option is undefined by default — see src/liquid-options.ts), so the unescaped HTML is delivered verbatim to the consumer's HTML response.
Trust path:
- Application receives untrusted input (e.g. user comment field).
- Developer renders it as {{ comment | strip_html }} to "safely" embed user content as plaintext.
- Attacker submits <img\u000Asrc=x\u000Aonerror=alert(document.cookie)>.
- strip_html returns the input unchanged.
- Output is written into the HTML response with no further escaping.
- Victim's browser executes the attacker's JavaScript in the application's origin.
This is an inconsistency within a single regex: the <script>, <style>, and comment branches use [\s\S] and so handle multi-line content, while the catch-all branch reverts to . and does not.
PoC
Reproduces against current HEAD (10.25.7) using the published dist/liquid.node.js build:
node -e "
const { Liquid } = require('./dist/liquid.node.js');
const engine = new Liquid();
engine.parseAndRender(
  'Safe output: {{ input | strip_html }}',
  { input: '<img\nsrc=x\nonerror=\"alert(document.cookie)\">' }
).then(r => console.log(JSON.stringify(r)));
"
Verified output:
"Safe output: <img\nsrc=x\nonerror=\"alert(document.cookie)\">"
The <img ... onerror=...> tag is delivered to the output completely unmodified. When this string is placed into an HTML document and parsed by a browser, the onerror handler executes.
The same bypass works with \r (carriage return), \r\n, or any combination of CR/LF inside the tag. It is not specific to <img>: other vectors (<svg\nonload=alert(1)>, <body\nonload=alert(1)>, <iframe\nsrc="javascript:alert(1)">, etc.) work as well.
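The variants listed above can be enumerated without a liquidjs build. In this sketch, stripHtmlVulnerable is a hypothetical stand-in that applies only the regex copied from the vulnerable filter, without the filter context or memory accounting; the payload builders mirror the vectors named in the text.

```javascript
// Regex copied from the vulnerable strip_html implementation quoted in this advisory.
const VULNERABLE = /<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>|<.*?>|<!--[\s\S]*?-->/g;

// Hypothetical stand-in for the filter body (no liquidjs context or memory limit).
function stripHtmlVulnerable(input) {
  return String(input).replace(VULNERABLE, '');
}

const separators = { LF: '\n', CR: '\r', CRLF: '\r\n' };
const vectors = [
  (ws) => `<img${ws}src=x${ws}onerror=alert(1)>`,
  (ws) => `<svg${ws}onload=alert(1)>`,
  (ws) => `<body${ws}onload=alert(1)>`,
  (ws) => `<iframe${ws}src="javascript:alert(1)">`,
];

// Collect every payload that comes out of the filter byte-for-byte unchanged.
const survivors = [];
for (const [name, ws] of Object.entries(separators)) {
  for (const build of vectors) {
    const payload = build(ws);
    if (stripHtmlVulnerable(payload) === payload) {
      survivors.push({ separator: name, payload });
    }
  }
}

const total = Object.keys(separators).length * vectors.length;
console.log(`${survivors.length} of ${total} payloads pass through unchanged`);
// → 12 of 12 payloads pass through unchanged
```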
For comparison, the same input without a newline is correctly stripped:
node -e "
const { Liquid } = require('./dist/liquid.node.js');
const engine = new Liquid();
engine.parseAndRender(
  'Safe output: {{ input | strip_html }}',
  { input: '<img src=x onerror=\"alert(1)\">' }
).then(r => console.log(JSON.stringify(r)));
"
# → "Safe output: "
This confirms strip_html is intended to remove tags of this shape, and the newline form is a sanitizer bypass rather than expected behavior.
Impact
Any application using liquidjs is vulnerable to stored or reflected XSS if it:
1. Renders attacker-controlled strings via {{ x | strip_html }} to defend against HTML injection, and
2. Does not separately HTML-escape that output (the default behavior, since outputEscape is unset by default).
The attacker can execute arbitrary JavaScript in the victim's browser in the application's origin, enabling session theft, account takeover, CSRF with origin-scoped credentials, and arbitrary actions in the victim's authenticated session. The XSS is triggered with simple, well-known event-handler payloads: no exotic encoding, no character set tricks, just a literal newline inside the tag.
The blast radius matches the deployment of liquidjs as a server-side template engine: liquidjs is one of the most popular Liquid implementations on npm (millions of downloads/week) and strip_html is documented as the sanitization filter for HTML stripping, so the vulnerable pattern ({{ user | strip_html }}) is the natural and recommended use of the filter.
Recommended Fix
Replace <.*?> with <[\s\S]*?> (or apply the s/dotAll flag to the entire regex) so the catch-all branch matches across line terminators, consistent with the other branches:
// src/filters/html.ts
export function strip_html (this: FilterImpl, v: string) {
  const str = stringify(v)
  this.context.memoryLimit.use(str.length)
  return str.replace(/<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>|<[\s\S]*?>|<!--[\s\S]*?-->/g, '')
}
Equivalent fix using the dotAll flag (requires ES2018+, which liquidjs already targets):
return str.replace(/<script.*?<\/script>|<style.*?<\/style>|<.*?>|<!--.*?-->/gs, '')
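One way to gain confidence that the two proposed regexes behave the same is to run both over a handful of inputs. This is a minimal sketch rather than a project test: the sample strings are made up for illustration, and only the two regexes come from the fix above.

```javascript
// The two proposed replacements for the strip_html regex.
const FIX_CHAR_CLASS = /<script[\s\S]*?<\/script>|<style[\s\S]*?<\/style>|<[\s\S]*?>|<!--[\s\S]*?-->/g;
const FIX_DOT_ALL = /<script.*?<\/script>|<style.*?<\/style>|<.*?>|<!--.*?-->/gs;

// Illustrative inputs: the PoC payload, its single-line form, and multi-line content.
const samples = [
  '<img\nsrc=x\nonerror="alert(document.cookie)">',
  '<img src=x onerror="alert(1)">',
  'before<script>\nalert(1)\n</script>after',
  'a<!-- multi\nline -->b',
  '<p>hello</p>\r\n<p>world</p>',
  'plain text, no tags',
];

const rows = samples.map((input) => ({
  input,
  charClass: input.replace(FIX_CHAR_CLASS, ''),
  dotAll: input.replace(FIX_DOT_ALL, ''),
}));

const mismatches = rows.filter((row) => row.charClass !== row.dotAll);
console.log(`mismatches: ${mismatches.length}`);
console.log(JSON.stringify(rows.map((row) => row.charClass)));
```

A side effect of either form is that a stray < is now paired with the next > even when that > is on a later line, so text such as "1 < 2\nand 3 > 2" loses the span between the two characters. The original regex already did this within a single line.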
After the fix, the PoC input is correctly reduced to an empty string. Note that strip_html should still not be relied on as a primary XSS defense — the project README/documentation should recommend HTML-escaping (escape filter) for untrusted content rendered into HTML contexts. A brief security note in the filter's documentation would help users who currently treat strip_html as a sanitizer.
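To illustrate why escaping is the more robust defense, the sketch below escapes the PoC payload instead of stripping it. escapeHtml is a hand-rolled helper written for this example, not the implementation of liquidjs's escape filter; in a template the corresponding step would be {{ comment | escape }}.

```javascript
// Hypothetical helper for illustration: maps HTML-significant characters to entities.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

const payload = '<img\nsrc=x\nonerror="alert(document.cookie)">';
const escaped = escapeHtml(payload);

// Newlines are irrelevant here: with no literal < or > left, no tag can be formed.
console.log(JSON.stringify(escaped));
// → "&lt;img\nsrc=x\nonerror=&quot;alert(document.cookie)&quot;&gt;"
```

Escaping does not depend on recognizing tag boundaries, so it is not affected by how whitespace appears inside the payload.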
{
  "affected": [
    {
      "package": {
        "ecosystem": "npm",
        "name": "liquidjs"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "last_affected": "10.25.7"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-44644"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-79"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-05-27T00:09:12Z",
    "nvd_published_at": null,
    "severity": "MODERATE"
  },
  "id": "GHSA-2qv6-9wx5-cwv4",
  "modified": "2026-05-27T00:09:12Z",
  "published": "2026-05-27T00:09:12Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/harttle/liquidjs/security/advisories/GHSA-2qv6-9wx5-cwv4"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/harttle/liquidjs"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "LiquidJS\u0027s strip_html filter bypass via newline characters in HTML tags enables XSS"
}
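The CVSS vector in the record can be converted to a numeric base score to see how it relates to the MODERATE severity label. The sketch below follows the base-score equations and metric weights of the CVSS v3.1 specification as I understand them; baseScore and roundUp are names chosen for this example, and only base metrics are handled.

```javascript
// Metric weights from the CVSS v3.1 specification (base metrics only).
const AV = { N: 0.85, A: 0.62, L: 0.55, P: 0.2 };
const AC = { L: 0.77, H: 0.44 };
const UI = { N: 0.85, R: 0.62 };
const CIA = { H: 0.56, L: 0.22, N: 0 };
// Privileges Required weights depend on whether scope is Unchanged (U) or Changed (C).
const PR = { U: { N: 0.85, L: 0.62, H: 0.27 }, C: { N: 0.85, L: 0.68, H: 0.5 } };

// Round up to one decimal place, using the integer-based method from the v3.1 spec.
function roundUp(x) {
  const i = Math.round(x * 100000);
  return i % 10000 === 0 ? i / 100000 : (Math.floor(i / 10000) + 1) / 10;
}

function baseScore(vector) {
  // Drop the "CVSS:3.1" prefix and turn "AV:N/AC:L/..." into { AV: 'N', AC: 'L', ... }.
  const m = Object.fromEntries(vector.split('/').slice(1).map((part) => part.split(':')));
  const iss = 1 - (1 - CIA[m.C]) * (1 - CIA[m.I]) * (1 - CIA[m.A]);
  const changed = m.S === 'C';
  const impact = changed
    ? 7.52 * (iss - 0.029) - 3.25 * Math.pow(iss - 0.02, 15)
    : 6.42 * iss;
  const exploitability = 8.22 * AV[m.AV] * AC[m.AC] * PR[m.S][m.PR] * UI[m.UI];
  if (impact <= 0) return 0;
  const raw = changed ? 1.08 * (impact + exploitability) : impact + exploitability;
  return roundUp(Math.min(raw, 10));
}

const score = baseScore('CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N');
console.log(score);
// → 6.1
```

A base score of 6.1 falls in the 4.0 to 6.9 band that CVSS v3.1 rates as Medium, which is consistent with the MODERATE label in the record.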