CI/infra hardening from the package audit: cache-poisoning fix, two-sided ratchet, MSPLIM, pin upgrades (PR B) #26
Merged
- Hexagon toolchain cache poisoning: icount-ratchet (ci.yml) and Measure Hexagon (compare.yml) downloaded the toolchain unverified on cache miss while sharing the hexagon-qemu job's trusted cache key. Both paths now verify the existing HEXAGON_TOOLCHAIN_SHA256 hard pin, and all three cache keys are keyed on the pinned digest itself.
- Two-sided ratchet: icount.py now fails on improvement beyond tolerance (stale slack would hide later regressions) with instructions to run --update and commit baselines.json; docstring and PERFORMANCE.md updated. Also: zero-baseline guard, 600 s per-binary QEMU timeout with a named-binary error, usage line lists all three targets, and --update rewrites the target entry to exactly the measured scenarios (prunes stale keys). Committed baselines unchanged; README regen is diff-free.
- Bare-metal empty-run guard: bare_metal_main.cpp fails with SRT_TESTS_COMPLETE rc=1 if fewer than 15 tests were selected, so a filter typo cannot pass green. Checked after RUN_ALL_TESTS because gtest applies the filter inside it (the count reads 0 beforehand; verified on target). Filter string itself untouched.
- MSPLIM: __stack_limit added to both linker scripts (M55: DTCM base, the stack owns the region; M33: __heap_end__) and written to msplim first thing in Reset_Handler (Armv8-M Mainline only; both targets are). Dedicated HardFault_Handler (bkpt + park) replaces the Default_Handler alias. Verified: M33 one-shot suite passes under QEMU; M55 icount workloads still complete with counts within 0.01% of baselines.
- compare-smoke job: per-push build-only check of srt_bench_compare (host) and cmp_icount_lsr_medium (M55 cross) so compare.yml's manual-only paths cannot bit-rot.
- ci-arm64.yml: on failure, opens or comments on a "ci-arm64 weekly run failing" issue (scheduled runs have no PR audience); header comment reworded: macos-latest already covers arm64 per push, so this workflow's unique value is TSan-on-arm64.
- qemu-plugin.h pinned to the commit v8.2.2 points at (11aa0b1ff115b86160c4d37e7c37e6a6b13b77ea) with sha256 verification in both workflows' plugin-build steps.
- FetchContent pins: googletest f8d7d77c (v1.14.0), benchmark c58e6d07 (v1.9.1); commit SHAs instead of movable tags.
- update_perf_docs.py exits nonzero on empty/items_per_second-less benchmark output; clang-format gate extended to bench/compare, tools/qemu_insn_plugin and platform C sources (only churn: comment realignment in armv8m_startup.c's vector table).
- Known-debt ledger in PERFORMANCE.md (MSVC /W4 triage, missing tail-latency benchmark); MSVC matrix comment references it.

https://claude.ai/code/session_01HuAFfoeD5a5Xe5aGNA16M9
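The cache-poisoning fix rests on two ideas: verify every download against the pinned digest, and derive the cache key from that same digest. A minimal Python sketch of the pattern follows. It is not the workflow code from this PR: the function names are hypothetical, and only the key shape hexagon-toolchain-<sha256>-1 is taken from the PR description.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of an in-memory archive."""
    return hashlib.sha256(data).hexdigest()


def cache_key(pinned_sha256: str, revision: int = 1) -> str:
    """Build a cache key from the pinned digest (hypothetical helper).

    Because the key contains the pin, changing the pin changes the key,
    and an archive fetched under a different pin cannot share the entry.
    """
    return f"hexagon-toolchain-{pinned_sha256.lower()}-{revision}"


def verify_archive(data: bytes, pinned_sha256: str) -> None:
    """Raise ValueError unless the archive matches the pinned digest.

    Intended to run on every cache miss, before anything is saved
    under the trusted key.
    """
    actual = sha256_hex(data)
    if not hmac.compare_digest(actual, pinned_sha256.lower()):
        raise ValueError(
            f"toolchain digest mismatch: expected {pinned_sha256}, got {actual}"
        )
```

Every download path calls verify_archive before populating the cache, so the first writer under a given key is always a verified one.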
Audit fix series, part B: the CI/infrastructure findings. Composes cleanly with #25 (part A): this PR deliberately avoids the hexagon ctest -E line and the bare-metal filter string that #25 touches.

The two Highs
- Hexagon toolchain cache poisoning: icount-ratchet and Measure Hexagon downloaded the toolchain unverified on cache miss into the same …-verified-cache key the hexagon-qemu job trusts, so the first unverified writer poisoned everyone's cache. Now: the SHA256 hard pin is verified in all three download paths, and the cache key is the pinned hash (hexagon-toolchain-<sha256>-1), so an unverified artifact can never occupy a trusted key.
- Two-sided ratchet: icount.py now fails on improvement beyond tolerance with instructions to "run icount.py --update and commit baselines.json", so unclaimed improvements can no longer create a stale-baseline dead zone that absorbs future regressions. Demonstrated end-to-end: inflated baseline → exit 1 with the message; --update now also prunes stale scenario keys (verified bit-identical values, zero README diff).

The rest
- Bare-metal empty-run guard: gtest applies the filter inside RUN_ALL_TESTS() (test_to_run_count() reads 0 before it), so the guard sits after the run. This was proven by building the as-specified version and watching it fail on target.
- MSPLIM: armed in Reset_Handler (__stack_limit symbols in both linker scripts, verified by nm: M33 0x383f0000, M55 0x20000000) plus a dedicated HardFault handler, so a stack overflow now faults instead of silently corrupting the heap. M33 one-shot suite passed under QEMU with MSPLIM armed (78.9 s). Baselines deliberately unchanged (+2..26 insns one-time, +0.00%).
- compare-smoke job: per-push build-only check of srt_bench_compare (host) and cmp_icount_lsr_medium (M55 cross), so the comparison infrastructure can no longer bit-rot invisibly between manual dispatches.
- ci-arm64.yml: if: failure() opens/updates a "ci-arm64 weekly run failing" issue; stale header comment corrected (macos-latest already covers per-push arm64; this workflow's unique value is TSan-on-arm64).

Verified locally: all workflows YAML-parse; host build + fast ctest green on the new pins; M55 icount all 7 scenarios pass vs committed baselines; M33 QEMU suite green with MSPLIM. Not verifiable here: the Hexagon legs, actual cache/issue-creation behavior, MSVC/macOS. The first CI run on this PR covers most of that.
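For readers unfamiliar with a two-sided ratchet, the check can be sketched roughly as below. This is an illustration under assumptions, not the code in icount.py: the function names, the exception types, and the 1% tolerance used in the usage note are all hypothetical. Only the behavior (fail on regression, fail on unclaimed improvement, reject a zero baseline, prune stale keys on update) comes from the PR text.

```python
def check_ratchet(measured: int, baseline: int, tolerance: float) -> None:
    """Raise if measured drifts from baseline by more than tolerance, either way.

    tolerance is a fraction of the baseline; the value is an assumption here.
    """
    if baseline <= 0:
        # Zero-baseline guard: a relative delta is undefined without a baseline.
        raise ValueError("baseline must be positive")
    delta = (measured - baseline) / baseline
    if delta > tolerance:
        raise RuntimeError(f"regression: {delta:+.2%} vs baseline")
    if delta < -tolerance:
        # An unclaimed improvement leaves slack that could hide a later regression.
        raise RuntimeError(
            f"improvement {delta:+.2%} beyond tolerance: "
            "run icount.py --update and commit baselines.json"
        )


def apply_update(baselines: dict, target: str, measured: dict) -> None:
    """Replace the target entry with exactly the measured scenarios.

    Replacing rather than merging is what prunes stale scenario keys.
    """
    baselines[target] = dict(measured)
```

With a 1% tolerance, a baseline of 1000 accepts any count from 990 to 1010. A count of 900 fails until the baseline is updated, which keeps the slack from quietly absorbing a later regression back toward 1000.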
https://claude.ai/code/session_01HuAFfoeD5a5Xe5aGNA16M9
Generated by Claude Code