Skip to content

Workshop fixes from a first-time-attendee dry-run#47

Merged
sanchezpaco merged 9 commits into
unicrons:mainfrom
sanchezpaco:workshop-fixes-from-dryrun
May 1, 2026
Merged

Workshop fixes from a first-time-attendee dry-run#47
sanchezpaco merged 9 commits into
unicrons:mainfrom
sanchezpaco:workshop-fixes-from-dryrun

Conversation

@sanchezpaco

Copy link
Copy Markdown
Collaborator

Context

I walked through the workshop end-to-end as a first-time attendee a few weeks ago, recorded the friction, and turned each blocker / silent-pass into a concrete fix. The dry-run notes (per-module status, repro steps, trade-offs) live at the linked report. This MR bundles the eight commits that came out of it.

📝 Dry-run report: https://github.com/sanchezpaco/secure-pipeline-workshop/blob/workshop-improvements/docs/dry-run/report.md

The commits are independent and each one should make sense on its own — happy to split into smaller PRs if that's easier to review.

What's fixed, by group

🟢 Quick wins — workflow flag/wording fixes (commits 1–4)

  • `fix(pipeline_scan): disable zizmor online audits to keep module runnable`
    zizmor v1.24.x's online audits hard-crash the run on any GitHub API HTTP error. With `github/codeql-action` SHA pins this is deterministic — the compare API returns 404 for cross-era refs and zizmor retries until rate-limited (~17 min, ~2500 calls per invocation). Setting `online-audits: "false"` keeps the module runnable; we lose four online checks (repojacking, known-vulnerable-actions, impostor-commit, artipacked) but keep the bulk (unpinned-uses, template-injection, excessive-permissions, etc.). Tracking upstream: CLI: improve error message on rate limiting from GitHub zizmorcore/zizmor#1252, #1874.

  • `fix(secrets_scan): include unverified results in trufflehog so didactic AKIA fails`
    The didactic AKIA in `code/src/simple-app.js` is fake → AWS rejects it as invalid → trufflehog's default `--results=verified,unknown` discards it as a false positive → exit 0, silent green. Adding `unverified` makes the AKIA trigger a fail as the workshop intends.

  • `fix(container_scan): align scan-result wording with severity-cutoff`
    Both grype and trivy use `severity-cutoff: high`, but the failure message says "critical vulnerabilities detected". Attendees go looking for criticals that aren't there. Reword to "vulnerabilities at or above the configured severity threshold detected".

  • `fix(dast): replace dead-end placeholder with explicit 'coming soon' message`
    Today `.github/workflows/dast.yml` instructs attendees to "Copy from `workshop/dast/{tool}/workflow.yml`", but that path does not exist. Replace the misleading placeholder with an explicit no-op until the DAST module is shipped.

🟡 SCA module — make it actually work without external setup (commits 5–6)

  • `feat(code_scan): add osv-scanner as a key-free SCA alternative to Dependency Check`
    Since 2024 NVD aggressively rate-limits unauthenticated clients, so OWASP Dependency Check effectively returns 0 findings without an `NVD_API_KEY` — silent false-pass. Add `workshop/code_scan/sca/osv-scanner/workflow.yml` (queries OSV.dev, no external setup) as a second alternative. README updated to split SAST/SCA sections, document the heads-up about NVD, and explain the two DB-source philosophies (open-standard OSV.dev vs. NVD-direct).

  • `feat(code): swap lodash for axios as the didactic SCA target`
    `lodash@4.17.x` is no longer maintained — newer CVEs (CVE-2025-13465, CVE-2026-2950, CVE-2026-4800) affect the entire 4.x line with no patched release, so the standard SCA pedagogy "scan reports vuln → bump to patched version → re-scan green" silently breaks. Swap to `axios@1.6.0` which has a clean bump-to-fix path (CVE-2024-39338 SSRF and CVE-2025-27152 URL bypass, fixed by `axios@1.8.2+`). Also wire axios into `simple-app.js` with a `/fetch` endpoint that proxies a user-controlled URL — amplifying the lib-level CVE into an exploitable SSRF and reinforcing the cross-module lesson "lib bug + missing validation = exploitable".

🔵 Deeper module rewrites (commits 7–8)

  • `docs(container_scan): document the 2-step fix path (base image + bundled deps)`
    In 2026, "just bump the base image" is not enough to clean the container scan: `node:25-alpine` still ships a HIGH in `picomatch` bundled inside the npm CLI. Document the 2-step lesson in the module README — bump base, then `RUN npm install -g npm@11.13.0 && npm cache clean --force` to refresh the bundled tree. Verified working combos: `node:20-alpine + npm@11.13.0`, `node:25-alpine + npm@11.13.0`.

  • `fix(code_scan): rewrite CodeQL placeholder to drive analysis via CLI`
    The previous CodeQL placeholder had two bugs that produced silent green on PR triggers: (1) it counted findings via `gh api .../code-scanning/alerts?pr=` which returns [] without a baseline on main; (2) even after switching to the SARIF that `codeql-action/analyze` writes, the action itself applies a diff-aware filter on `pull_request` events, stripping results whose source file isn't in the PR diff. Rewrite to drive analysis manually: use `init` only to install the toolchain, then `codeql database create` + `codeql database analyze` directly. Also count findings via jq from the unfiltered SARIF and pretty-print rule id + message + location. Module philosophy clarified at top of file: CodeQL focuses on data-flow vulns; expect to surface `js/command-line-injection` at `code/src/simple-app.js:41`. Hardcoded secret detection is the secrets_scan module's job.

Verification

Each fix was verified end-to-end by substituting the workshop placeholder into a fork branch and running the orchestrator. The dry-run report links to the per-module CI runs.

I deliberately did not touch:

  • `workshop/ai_scan/` — the dry-run flagged it for a deeper revamp (multi-provider, prompt-injection guardrails, ollama fallback) that is out of scope for this MR. Happy to open a separate issue/MR for it.
  • The `upload` step on the codeql-action — this MR keeps SARIF being uploaded to GitHub Advanced Security so the Security tab continues to work.

Happy to iterate or split if any of these conflict with the maintainers' direction.

zizmor v1.24.1's online audits (impostor-commit, repojacking,
known-vulnerable-actions, artipacked) hard-crash the whole run on any
GitHub API HTTP error. With github/codeql-action SHA pins the failure is
deterministic — the compare API returns 404 for cross-era refs and zizmor
retries until rate-limited (~17 min, ~2500 calls per invocation).

Without this flag, attendees who run pipeline_scan after enabling
code_scan (or any module that pins github/codeql-action) hit a hard
'fatal: no audit was performed' and cannot proceed. Trade-off is losing
the four online checks listed above; the bulk of zizmor's value
(unpinned-uses, template-injection, excessive-permissions,
hardcoded-container-credentials) comes from offline audits and is
unaffected.

Tracking upstream: zizmor#1252, #1874.
…ic AKIA fails

The workshop intentionally hardcodes AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
in code/src/simple-app.js so attendees can practice fixing a real-looking
secret leak. Those credentials are fake — AWS responds 'invalid access
key', and trufflehog's --results=verified,unknown silently discards the
match as a false positive, returning exit 0 and a green ✅.

Reproduction:
  trufflehog filesystem . --results=verified,unknown --fail
  → 0 results, exit 0
  trufflehog filesystem . --results=verified,unknown,unverified --fail
  → 1 unverified result, exit 183

Adding `unverified` lets the workshop's pass→fail→fix loop work as
intended for attendees who run only TruffleHog (skipping gitleaks).
The grype and trivy jobs use severity-cutoff: high, so a failing run
means a high or critical was detected — not necessarily a critical.
The current 'critical vulnerabilities detected' message sends attendees
looking for critical findings that often aren't there, only highs.

Reword both pass and fail messages to match the actual threshold the
job enforces. Cosmetic but eliminates a recurring source of confusion
during the container_scan module.
…essage

The orchestrator's dast.yml told attendees to 'Copy from
workshop/dast/{tool}/workflow.yml', but that path does not exist —
there is no workshop/dast/ directory, no DAST README, and no DAST
mention in the root or workshop READMEs. An attendee following the
orchestrator flow hits a dead end at this step.

Until the DAST module is actually shipped, replace the misleading
'workshop placeholder' wording with an explicit no-op that prints
'coming soon' and exits 0 so the orchestrator stays green.

This is a stop-gap; either ship the DAST module (e.g. OWASP ZAP /
nuclei against the deployed app, which would also need the runtime
infrastructure currently AWS-stubbed in deploy-application) or keep
the explicit placeholder so attendees aren't confused.
…endency Check

Today the SCA module only offers OWASP Dependency Check, which since
2024 has been effectively unusable in CI without an NVD_API_KEY: NVD's
anonymous rate-limit lets the scan finish quickly with 0 findings, a
silent false-pass that defeats the module's pedagogical intent.

Add Google's osv-scanner as a second alternative that works out of the
box. It queries OSV.dev, an open-standard advisory aggregator (GitHub
Advisories, RustSec, PyPA, GoVulnDB, etc.), is lockfile-based, and
needs no external keys or DB downloads.

Update workshop/code_scan/README.md to split the tool listing into
SAST/SCA sections and explain that the two SCA tools represent two
different DB-source philosophies (OSV.dev open standard vs. NVD-direct).
Keep Dependency Check as a documented alternative with a heads-up about
the NVD_API_KEY requirement.

Net effect: the workshop's SCA module now runs end-to-end on a fresh
fork with zero secrets configured, while still teaching the OWASP
veteran option for attendees who want it.
Replace the workshop's intentionally-vulnerable lodash@4.17.15 (now
abandoned upstream — newer CVEs like CVE-2025-13465, CVE-2026-2950
and CVE-2026-4800 affect the entire 4.x line with no patched release)
with axios@1.6.0, which has a clean "bump to fix" remediation path:

  - CVE-2024-39338 (SSRF in URL handling)         → fixed in 1.7.4
  - CVE-2025-27152 (absolute URL bypass)          → fixed in 1.8.2

This restores the standard SCA pedagogy ("scan reports vuln → bump to
patched version → re-scan green") that lodash 4.x had silently broken.

Also wire axios into simple-app.js with a /fetch endpoint that takes
a user-controlled URL and proxies it via axios.get(). The endpoint
intentionally lacks an allow-list so it amplifies the lib-level CVE
into an exploitable SSRF, reinforcing the cross-module lesson:

  - SCA detects: axios 1.6.0 has a known SSRF CVE
  - SAST will detect: code passes user input to axios.get() unchecked
  - Combined: lib bug + missing app-level validation = exploitable

The endpoint imitates the existing /readfile pattern (command
injection TODO) so the file stays consistent in style and in the
"intentional teaching material" spirit documented in the dry-run
audit.
…led deps)

The workshop's code/Dockerfile ships with FROM node:16.14.0-alpine, and
the obvious instinct on first scan is "bump the base image and the
findings will go away". In 2026 that is necessary but not sufficient:

  - node:16.14.0-alpine            → ~147 findings
  - node:25-alpine (bump only)     → 1 residual HIGH in picomatch
  - node:25-alpine + npm@11.13.0   → green

The residual HIGH is in picomatch 4.0.3, which is bundled inside the
npm CLI shipped with modern base images. npm 11.13.0 ships
picomatch >= 4.0.4, so reinstalling npm globally inside the image
refreshes that subtree and clears the finding.

Add a Tip callout in the module README explaining the 2-step pattern
and showing the canonical RUN line, so attendees who try "just bump
the base" and find the scan still red understand why and what to do
next, instead of getting stuck wondering if they did something wrong.
The previous CodeQL placeholder had two bugs that made the module
silently green when run on a pull_request:

1. It counted findings from `gh api /code-scanning/alerts?pr=<n>`
   which only returns alerts that the PR introduces relative to a
   baseline on the default branch — first-time workshop attendees
   never have such a baseline, so the API always returned [].
2. Even after rewriting it to count findings directly from the SARIF
   that `codeql-action/analyze` writes, the action ALSO applies a
   diff-aware filter on pull_request events: it strips results whose
   source file isn't part of the PR diff. The workshop PR that
   substitutes the placeholder doesn't touch code/src/simple-app.js,
   so the cmd-injection finding gets filtered out post-analyze.

Rewrite the placeholder to:

- Use `github/codeql-action/init` only to install the CodeQL toolchain
  (puts `codeql` on PATH and prepares the standard query packs).
- Build the database manually via `codeql database create`.
- Run `codeql database analyze` directly with the
  javascript-security-extended suite. This produces an unfiltered
  SARIF — the same output the CLI gives locally, regardless of the
  PR diff.
- Count findings from the SARIF with the same jq pattern used by
  the osv-scanner placeholder, and pretty-print rule id + message
  + location for each finding.
- Still upload the SARIF to GitHub Advanced Security so the Security
  tab keeps working.

Module philosophy clarification (also noted at the top of the file):
CodeQL focuses on data-flow / taint vulnerabilities — its strength.
Hardcoded secret detection is the secrets_scan module's job, not
this one. Expect this scan to fail with one finding,
`js/command-line-injection` at code/src/simple-app.js:41, where the
user-controlled `filePath` from a JSON request body flows into
`child_process.exec`. The fix is signposted by the existing TODO
comment one line above the sink: switch from `exec()` to
`fs.readFile()`.
…atten /fetch

Cleanup pass on the previous commits in this MR:

- workshop/code_scan/sast/codeQL/workflow.yml: drop the meta paragraphs
  about taint focus and diff-aware filtering (those are MR-review
  context, not workshop content), and replace the verbose binary-locate
  step with a one-liner using `find -printf -quit`.
- workshop/code_scan/sca/osv-scanner/workflow.yml: drop the
  "no NVD / no API keys / lockfile-based" line — implementation detail
  the workshop user doesn't need to read.
- workshop/code_scan/README.md: trim the SCA tool descriptions to
  one-liners matching the SAST style; keep the NVD_API_KEY heads-up
  but make it shorter.
- workshop/container_scan/README.md: tighten the 2-step TIP — drop
  hard-coded version examples (which rot) and the meta-prose;
  keep the principle "fixes can have layers, re-run the scanner".
- workshop/pipeline_scan/zizmor/workflow.yml: trim the online-audits
  comment to a single line pointing at the upstream issues.
- workshop/secrets_scan/trufflehog/workflow.yml: drop the verbose
  rationale comment for the `unverified` flag.
- code/src/simple-app.js: refactor the /fetch endpoint added earlier
  in this MR — flatten nested callbacks, switch from .then/.catch to
  async/await, use early returns and split the JSON-parse and HTTP
  error handling into separate try/catch blocks. The intentional SSRF
  pattern is preserved (axios.get(targetUrl) with a TODO for the
  allow-list) so SAST detection is unaffected.
@sanchezpaco sanchezpaco merged commit 505952d into unicrons:main May 1, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants