CI/infra hardening from the package audit: cache-poisoning fix, two-sided ratchet, MSPLIM, pin upgrades (PR B) by tap · Pull Request #26 · tap/SampleRateTap

tap · 2026-06-12T22:40:49Z

Audit fix series, part B — the CI/infrastructure findings. Composes cleanly with #25 (part A): this PR deliberately avoids the hexagon ctest -E line and the bare-metal filter string that #25 touches.

The two Highs

Hexagon toolchain cache-poisoning hole closed. The icount-ratchet and compare.yml jobs downloaded the toolchain unverified yet saved it under the same -verified- cache key the hexagon-qemu job trusts — first unverified writer poisoned everyone's cache. Now: the SHA256 hard pin is verified in all three download paths, and the cache key is the pinned hash (hexagon-toolchain-<sha256>-1), so an unverified artifact can never occupy a trusted key.
The icount ratchet is now two-sided. Improvement beyond tolerance fails with "run icount.py --update and commit baselines.json" — unclaimed improvements can no longer create a stale-baseline dead zone that absorbs future regressions. Demonstrated end-to-end: inflated baseline → exit 1 with the message; --update now also prunes stale scenario keys (verified bit-identical values, zero README diff).
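The two-sided check can be sketched as follows. This is a minimal illustration, not the actual icount.py code: the function name `check_ratchet`, the 1% default tolerance, and the exact message wording are assumptions made for the example. It also folds in the zero-baseline guard mentioned among the script guards.

```python
def check_ratchet(measured: int, baseline: int, tolerance: float = 0.01):
    """Return (ok, message) for one scenario's instruction count.

    Hypothetical helper: the real icount.py may structure this differently,
    and the 1% tolerance is an assumed value.
    """
    if baseline <= 0:
        # Zero-baseline guard: a relative delta is undefined.
        return False, f"invalid baseline {baseline}"
    delta = (measured - baseline) / baseline
    if delta > tolerance:
        # Classic one-sided ratchet: regressions fail.
        return False, f"regression: {delta:+.2%} vs baseline"
    if delta < -tolerance:
        # Second side: an unclaimed improvement also fails, so the
        # baseline cannot go stale and absorb later regressions.
        return False, (f"improvement {delta:+.2%} beyond tolerance: "
                       "run icount.py --update and commit baselines.json")
    return True, f"ok: {delta:+.2%}"
```

With these assumed numbers, a baseline of 1000 and a measurement of 900 fails with the update instruction, and so does a measurement of 1020; only counts within the band pass.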

The rest

Bare-metal empty-run guard: a filter typo matching zero tests now fails instead of passing green. One spec deviation, empirically forced: gtest applies filters inside RUN_ALL_TESTS() (test_to_run_count() reads 0 before it), so the guard sits after the run — proven by building the as-specified version and watching it fail on target.
MSPLIM armed on M33 and M55 (first instruction of Reset_Handler; __stack_limit symbols in both linker scripts, verified by nm: M33 0x383f0000, M55 0x20000000) plus a dedicated HardFault handler — stack overflow now faults instead of silently corrupting the heap. M33 one-shot suite passed under QEMU with MSPLIM armed (78.9 s). Baselines deliberately unchanged (+2..26 insns one-time, +0.00%).
compare-smoke job: per-push build-only smoke of srt_bench_compare (host) and cmp_icount_lsr_medium (M55 cross) — the comparison infrastructure can no longer bit-rot invisibly between manual dispatches.
ci-arm64 failures get an audience: if: failure() opens/updates a "ci-arm64 weekly run failing" issue; stale header comment corrected (macos-latest already covers per-push arm64; this workflow's unique value is TSan-on-arm64).
Pin upgrades: qemu-plugin.h fetched by commit SHA + SHA256 check (self-tested: plugin built from the pinned URL); googletest/benchmark FetchContent moved from movable tags to commit SHAs (fresh configure verified to clone the pinned SHA).
Script guards: icount.py per-binary 600 s timeout naming the binary, zero-baseline guard, corrected usage; update_perf_docs.py refuses to write an empty table (both failure cases unit-tested). clang-format gate extended to bench/compare, the QEMU plugin, and platform/*.c (reformat included; minimal churn).
Known-debt ledger in PERFORMANCE.md (MSVC /W4 triage, missing tail-latency bench) referenced from the MSVC gate comment.
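The pin-and-verify pattern behind the toolchain fix and the qemu-plugin.h pin can be sketched in a few lines. The workflows presumably do this in shell steps; the Python below is only an illustration, and the helper names `sha256_hex`, `verify_pinned`, and `cache_key` are hypothetical. Only the key shape `hexagon-toolchain-<sha256>-1` comes from the PR itself.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA256 digest of a downloaded artifact."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(data: bytes, pinned_sha256: str) -> None:
    """Raise ValueError unless the artifact matches the hard pin."""
    actual = sha256_hex(data)
    if actual != pinned_sha256.lower():
        raise ValueError(
            f"SHA256 mismatch: expected {pinned_sha256}, got {actual}")


def cache_key(pinned_sha256: str, revision: int = 1) -> str:
    """Derive the cache key from the pin, not from the download.

    Every job that uses this key must call verify_pinned() before saving,
    so an unverified artifact never lands under a trusted key.
    """
    return f"hexagon-toolchain-{pinned_sha256}-{revision}"
```

Keying the cache on the pinned digest also means that changing the pin automatically invalidates the old cache entry.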

Verified locally: all workflows YAML-parse; host build + fast ctest green on the new pins; M55 icount all 7 scenarios pass vs committed baselines; M33 QEMU suite green with MSPLIM. Not verifiable here: the Hexagon legs, actual cache/issue-creation behavior, MSVC/macOS — first CI run on this PR covers most of that.

https://claude.ai/code/session_01HuAFfoeD5a5Xe5aGNA16M9

Generated by Claude Code

- Hexagon toolchain cache poisoning: icount-ratchet (ci.yml) and Measure Hexagon (compare.yml) downloaded the toolchain unverified on cache miss while sharing the hexagon-qemu job's trusted cache key. Both paths now verify the existing HEXAGON_TOOLCHAIN_SHA256 hard pin, and all three cache keys are keyed on the pinned digest itself.
- Two-sided ratchet: icount.py now fails on improvement beyond tolerance (stale slack would hide later regressions) with instructions to run --update and commit baselines.json; docstring and PERFORMANCE.md updated. Also: zero-baseline guard, 600 s per-binary QEMU timeout with a named-binary error, usage line lists all three targets, and --update rewrites the target entry to exactly the measured scenarios (prunes stale keys). Committed baselines unchanged; README regen is diff-free.
- Bare-metal empty-run guard: bare_metal_main.cpp fails with SRT_TESTS_COMPLETE rc=1 if fewer than 15 tests were selected, so a filter typo cannot pass green. Checked after RUN_ALL_TESTS because gtest applies the filter inside it (the count reads 0 beforehand; verified on target). Filter string itself untouched.
- MSPLIM: __stack_limit added to both linker scripts (M55: DTCM base, the stack owns the region; M33: __heap_end__) and written to msplim first thing in Reset_Handler (Armv8-M Mainline only; both targets are). Dedicated HardFault_Handler (bkpt + park) replaces the Default_Handler alias. Verified: M33 one-shot suite passes under QEMU; M55 icount workloads still complete with counts within 0.01% of baselines.
- compare-smoke job: per-push build-only check of srt_bench_compare (host) and cmp_icount_lsr_medium (M55 cross) so compare.yml's manual-only paths cannot bit-rot.
- ci-arm64.yml: on failure, opens or comments on a "ci-arm64 weekly run failing" issue (scheduled runs have no PR audience); header comment reworded: macos-latest already covers arm64 per push, this workflow's unique value is TSan-on-arm64.
- qemu-plugin.h pinned to the commit v8.2.2 points at (11aa0b1ff115b86160c4d37e7c37e6a6b13b77ea) with sha256 verification in both workflows' plugin-build steps.
- FetchContent pins: googletest f8d7d77c (v1.14.0), benchmark c58e6d07 (v1.9.1), as commit SHAs instead of movable tags.
- update_perf_docs.py exits nonzero on empty/items_per_second-less benchmark output; clang-format gate extended to bench/compare, tools/qemu_insn_plugin and platform C sources (only churn: comment realignment in armv8m_startup.c's vector table).
- Known-debt ledger in PERFORMANCE.md (MSVC /W4 triage, missing tail-latency benchmark); MSVC matrix comment references it.

https://claude.ai/code/session_01HuAFfoeD5a5Xe5aGNA16M9
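The pruning behavior of --update described in the commit message can be illustrated with a small sketch. It assumes baselines.json maps each target name to a dictionary of scenario name to instruction count; that layout, the function name `update_baselines`, and the scenario names in the usage note are assumptions for the example, not taken from the repository.

```python
import copy


def update_baselines(baselines: dict, target: str, measured: dict) -> dict:
    """Return a copy of baselines with one target's entry replaced.

    Replacing the whole entry (rather than merging into it) is what drops
    scenario keys that are no longer measured. Other targets are untouched.
    """
    updated = copy.deepcopy(baselines)
    updated[target] = dict(measured)
    return updated
```

For instance, if the assumed entry for one target held `{"lsr_medium": 1000, "old_scenario": 50}` and only `lsr_medium` was measured, the rewritten entry contains `lsr_medium` alone.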

tap merged commit 14a9329 into main Jun 12, 2026
26 checks passed

tap mentioned this pull request Jun 12, 2026: Docs truth sweep from the package audit (PR C) #27 (Merged)

tap merged 1 commit into main from claude/infra-hardening
