Skip to content

security: pre-commit hook rejecting unencrypted private keys#26

Open
rdhyee wants to merge 4 commits into
masterfrom
security/pre-commit-private-key-hook
Open

security: pre-commit hook rejecting unencrypted private keys#26
rdhyee wants to merge 4 commits into
masterfrom
security/pre-commit-private-key-hook

Conversation

@rdhyee
Copy link
Copy Markdown
Contributor

@rdhyee rdhyee commented Apr 21, 2026

Part of the remediation for EbookFoundation/security-private#10 — the April 8 wildcard TLS private key exposure.

Why

A Claude Code session on April 8 committed an unencrypted wildcard *.unglue.it + unglue.it private key to this public repo. The key was exposed for ~13 days before detection. See security-private#10 for the full incident report, forensics, and post-mortem.

The compromised cert has been revoked (LE accepted keyCompromise) and the plaintext key removed from the working tree. This PR adds layered prevention so the same class of leak is caught locally and in CI going forward.

What changed

Local layer — pre-commit hooks

  • .pre-commit-config.yaml — pre-commit framework config with three hooks:

    • no-plaintext-private-keys — custom hook rejecting PEM private keys unless file is $ANSIBLE_VAULT-prefixed
    • pre-commit-hooks/detect-private-key — built-in redundant check
    • gitleaks/gitleaks — broader scanning (AWS keys, GitHub tokens, Stripe keys, generic API keys, private keys in any PEM format, etc.)
  • .githooks/check-no-plaintext-keys.sh — the custom hook logic. Scans each staged file for PEM private key headers (RSA, EC, DSA, OPENSSH, ENCRYPTED). Rejects if found and file is not ansible-vault encrypted.

  • .gitleaks.toml — repo-specific gitleaks config. Extends the default ruleset with an allowlist for three files containing Ansible template placeholder defaults (012345678901234... and abcdef... patterns meant to be overridden by vault.yml at deploy time). Without this allowlist, gitleaks produces known false positives.

CI layer — GitHub Actions

  • .github/workflows/gitleaks.yml — runs gitleaks on every PR and every push to master. Downloads the gitleaks binary directly (MIT-licensed) rather than using the official gitleaks-action@v2 (which now requires a paid license for GitHub organizations). Catches secrets that slip past the pre-commit hook when a contributor hasn't installed it or uses --no-verify.

Why the belt-and-suspenders approach

GitHub's built-in Secret Scanning + Push Protection was enabled on this repo today as part of the incident response. Empirical testing on 2026-04-21 revealed it does NOT reliably block generic PEM private keys (our specific threat model). The free tier validates vendor tokens (real AWS keys, real GitHub PATs) with the issuer; generic PEM detection requires paid GitHub Advanced Security. See the correction comment on security-private#10.

gitleaks uses pure pattern matching (no issuer validation) so it catches generic keys reliably. This PR closes that coverage gap.

Verified

Test Expected Actual
Custom hook: plaintext PEM key file REJECT ✅ REJECT
Custom hook: $ANSIBLE_VAULT file ALLOW ✅ ALLOW
Custom hook: all 10 existing vault-encrypted keys all PASS ✅ all PASS
gitleaks on current tree 0 leaks (allowlist filters placeholders) ✅ 0 leaks
gitleaks on full git history 1 leak (the d474401 April 8 commit itself, now revoked) ✅ 1 leak (expected historical)
gitleaks on fresh synthetic RSA key DETECT ✅ DETECT
CI workflow on this branch success ✅ success

Layered controls stack after this PR merges

Layer Where Bypassable? Catches generic PEM? Catches vendor tokens?
Pre-commit hook (custom) Local --no-verify ❌ (by design)
Pre-commit hook (gitleaks) Local --no-verify
GitHub Push Protection Server Org bypass only ❌ (needs paid GHAS)
GitHub Actions gitleaks Server No

No single layer is sufficient. Together: defense in depth.

Contributor setup

After this PR merges, collaborators should install the pre-commit hook once:

pip install pre-commit
pre-commit install

Then git commit runs all three hooks automatically. Bypass with --no-verify for exceptional cases (discouraged — CI will still catch you).

Test plan

  • Custom hook rejects fake plaintext keys; allows vault-encrypted files
  • gitleaks scans run cleanly on current tree (allowlist working)
  • gitleaks catches synthetic new keys
  • CI workflow runs successfully on branch push
  • After merge, mark "Secret scan (gitleaks)" as a required status check in branch protection

Not in this PR

Tracked in security-private#10:

  • Fix roles/regluit_prod/tasks/certs.yml:138 to use decrypted-at-runtime path
  • Remove force: yes from acme_certificate task
  • ADR documenting per-host certs as default pattern
  • Collaborator list pruning (33 push-access accounts)
  • Branch protection rules on master

rdhyee and others added 3 commits April 21, 2026 10:20
Part of the remediation for EbookFoundation/security-private#10 where
an unencrypted wildcard TLS private key was committed to this public
repo and exposed for ~13 days before detection.

This change adds a local layer of defense:

- .pre-commit-config.yaml: pre-commit framework config with two hooks
  - no-plaintext-private-keys: custom hook in .githooks/ rejecting PEM
    private keys unless file is $ANSIBLE_VAULT-prefixed
  - pre-commit-hooks/detect-private-key: built-in redundant check

- .githooks/check-no-plaintext-keys.sh: the custom hook logic. Scans
  each staged file for PEM private key headers (RSA, EC, DSA, OPENSSH,
  ENCRYPTED). Rejects if found and file is not ansible-vault encrypted.
  Includes remediation instructions and link to the incident issue.

Tested against:
  - plaintext PEM key: correctly REJECTED (exit 1)
  - $ANSIBLE_VAULT file: correctly ALLOWED (exit 0)
  - normal file: correctly ALLOWED (exit 0)
  - all existing vault-encrypted keys in private/: all PASS

## Layered controls

Pre-commit hook is one layer. The other layers, now also in place:

1. GitHub Secret Scanning + Push Protection (server-side, not bypassable)
   now enabled on this repo and on Gluejar/regluit. This is the
   strongest single control.

2. Pre-commit hook (client-side, bypassable with --no-verify). Catches
   accidents locally before they reach GitHub.

3. Directory / automation conventions (deferred to a follow-up): fix
   roles/regluit_prod/tasks/certs.yml:138 to use the
   decrypted-at-runtime path rather than the plaintext top-level path.

## Contributor setup

After pulling this change, install the hook:

  pip install pre-commit
  pre-commit install

The hook then runs automatically on `git commit`. Bypass with
--no-verify if you have a genuine reason (discouraged; server-side
push protection will still catch most cases).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extends the secret-prevention work in this PR with gitleaks — a FOSS
secret-scanning tool that catches generic PEM private keys (the April
2026 leak scenario), API tokens, AWS keys, etc.

Context: GitHub's free-tier push protection only blocks tokens it can
validate with the issuer (AWS, GitHub, Stripe, etc.). It does NOT
reliably block generic PEM private keys — the exact pattern from
security-private#10. Verified via push tests on 2026-04-21: real RSA
keys in both PKCS#1 and PKCS#8 formats pushed through without blocks.
Coverage for generic PEM requires paid GitHub Advanced Security.

Gitleaks is the industry-standard FOSS alternative. It uses pattern
matching (no issuer validation), so it reliably catches generic keys.

Files added:

- .gitleaks.toml — repo-specific config extending default ruleset with
  an allowlist for three files that contain Ansible template placeholder
  defaults ("012345678901234" and "abcdef..." patterns meant to be
  overridden by vault.yml at deploy time). Without this allowlist,
  gitleaks produces 2 expected false positives on every scan.

- .github/workflows/gitleaks.yml — GitHub Actions workflow running
  gitleaks on every PR and every push to master. This catches secrets
  that slip past the pre-commit hook (--no-verify, contributors without
  pre-commit installed). Should be marked as a required status check
  in branch protection.

- .pre-commit-config.yaml — added gitleaks hook after the existing
  custom hook and pre-commit-hooks library.

Verified:
  - Current tree scan: 0 leaks (allowlist correctly filters placeholders)
  - Full history scan: 1 leak — the d474401 private key commit that
    triggered security-private#10 (historical; cert already revoked)
  - Synthetic test: fresh plaintext RSA key still detected ✓

Note on the historical leak: PR/push scans via gitleaks-action only
examine the diff, not full history, so regular CI runs will be clean.
Only manual `gitleaks detect` runs on history surface the d474401 entry,
which is expected and documented in security-private#10.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The official gitleaks/gitleaks-action@v2 added a mandatory paid license
requirement for GitHub organizations in a recent breaking change (error:
"[EbookFoundation] is an organization. License key is required").

The gitleaks BINARY remains MIT-licensed and free. This workflow now
downloads the official release tarball and runs the binary directly,
bypassing the Action wrapper's license check.

Functionally identical to what the Action did:
  - Downloads gitleaks 8.30.1 from official GitHub releases
  - Runs detect with --log-opts scoped to the diff
    (base..head on PRs, before..sha on push)
  - Uses .gitleaks.toml from repo root for allowlist config
  - --redact to avoid leaking detected secrets in action logs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rdhyee
Copy link
Copy Markdown
Contributor Author

rdhyee commented Apr 21, 2026

Review findings:

  1. .gitleaks.toml:20 allowlists the entire roles/regluit_common/defaults/main.yml, common.py.j2, and host.py.j2 files, not just the known placeholder literals. That creates a real blind spot: any future hardcoded API key, password, or private key added anywhere in those files will be ignored by both local pre-commit gitleaks and the CI workflow. For a PR whose goal is leak prevention, this materially weakens coverage in exactly the config/template surfaces where secrets are likely to appear. The allowlist should be narrowed to the specific placeholder values or rules causing the false positives rather than path-wide suppression.

  2. .github/workflows/gitleaks.yml:31 downloads and executes the gitleaks binary straight from a release URL without verifying a checksum or signature. That turns the new secret-scanning control into an unverified supply-chain dependency in CI: if the asset or download path is ever tampered with, the workflow will happily install and run attacker-controlled code. Since this is a security remediation PR, the install step should pin and verify the published checksum before execution, or use a vetted package source.

Addresses two findings from Codex review on PR #26:

1. Path-wide allowlist was too broad. The previous .gitleaks.toml
   allowlisted roles/regluit_common/defaults/main.yml and two template
   files entirely. Any future real secret committed to those files would
   have been silently ignored. Replaced with content-based stopwords
   that match the specific placeholder substrings ("012345678901234",
   "abcdef1234567890", and the documented Cloudflare Turnstile test key).
   A real PEM key or high-entropy secret in those files is now caught.

   Verified:
   - Current tree scan: 0 leaks (placeholders still filtered)
   - Injected real RSA key into main.yml: DETECTED (was previously hidden
     by path allowlist)

2. CI workflow downloaded the gitleaks binary without integrity check.
   Added SHA-256 verification against a pinned hash (taken from the
   official gitleaks_8.30.1_checksums.txt). The tarball is verified with
   sha256sum --strict before extraction or execution; a mismatch aborts
   the workflow before any attacker-controlled code could run.

   Pinned hash: 551f6fc83ea457d62a0d98237cbad105af8d557003051f41f3e7ca7b3f2470eb
   (for gitleaks_8.30.1_linux_x64.tar.gz)

   Bumping the gitleaks version in the future requires updating both
   GITLEAKS_VERSION and GITLEAKS_SHA256 together.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rdhyee
Copy link
Copy Markdown
Contributor Author

rdhyee commented Apr 21, 2026

Re-reviewed the current head (0e6c0b5). I did not find any additional issues in this revision.

The earlier concerns appear resolved:

  • .gitleaks.toml now narrows suppression to specific placeholder stopwords instead of allowlisting whole files, so future real secrets in those templates/defaults should still be caught.
  • .github/workflows/gitleaks.yml now verifies the pinned SHA-256 of the downloaded gitleaks release tarball before extraction/execution.

Residual gap: this was a static re-review from the patch; I did not run the hooks or the GitHub Actions workflow from this environment.

Signed: Codex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant