RP2350 dual-core deployment example: one clock domain per core, self-validating#24

Merged: tap merged 1 commit into main from claude/pico2-dualcore on Jun 12, 2026
@tap (Owner) commented Jun 12, 2026

The second software add-on: standalone RP2350 firmware that demonstrates and self-validates the dual-core deployment shape the README prescribes. The converter's two ends run on the two cores: push() on core0 only, at the input clock, and pull() on core1 only, at the output clock. This is exactly the library's SPSC two-agent contract with cores in place of threads (stated as such in the code).

What it runs (each phase ~30 s, self-judging PASS/FAIL over USB serial)

  • Phase A: Q15 stereo balanced() @ 48 kHz, the configuration the README calls "tight on one core". core0 busy-paces pushes at rate × (1+200 ppm) using exact integer-rational due times, so printf stalls cause catch-up, not drift. core1 pulls at the nominal rate and times every pull() with its own CYCCNT. Asserts: lock within 2 s, |ppm−200| < 5 after 10 s, and zero underruns/overruns/resyncs after lock. Also reports mean/p99/max cycles per block and % of a 150 MHz core.
  • Phase B: Q15 12-channel @ 16 kHz (the reference-mic/AVB shape, config scaled per the 16 kHz rule). Deliberately not 48 kHz: at 10,027 insns/frame the 12-channel pipeline needs more than 3× one core's 48 kHz budget (3,125 cycles per frame at 150 MHz), and a single instance's pull() is one consumer by contract. Dual-core buys one clock domain per core, not more datapath than one core has. Cycles/block is rate-independent, so this phase still yields the real-silicon counterpart of the pipeline12_q15 baseline.
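The "exact integer-rational due times" idea from Phase A can be illustrated with a small sketch. It is not taken from the firmware: the function name `due_us` and its parameters are hypothetical, and it assumes due times are measured in microseconds from the start of a phase. Each due time is computed from the block index alone, so rounding error stays below 1 µs per block and never accumulates.

```cpp
#include <cstdint>

// Hypothetical helper (not the firmware's code): due time, in microseconds
// since the phase started, of block number `n` when blocks of
// `frames_per_block` frames are pushed at rate_hz * (1 + ppm / 1e6).
//
// Exact value: n * frames / (rate * (1e6 + ppm) / 1e6) seconds
//            = n * frames * 1e12 / (rate * (1e6 + ppm)) microseconds.
//
// The numerator fits in 64 bits for n up to roughly 576,000 blocks of 32
// frames, far more than a ~30 s phase needs.
constexpr std::uint64_t due_us(std::uint64_t n,
                               std::uint32_t frames_per_block,
                               std::uint32_t rate_hz,
                               std::uint32_t ppm) {
    const std::uint64_t num = n * frames_per_block * 1'000'000'000'000ULL;
    const std::uint64_t den =
        static_cast<std::uint64_t>(rate_hz) * (1'000'000ULL + ppm);
    return num / den;  // floor of the exact rational due time
}
```

A producer loop built on this would push block `n` whenever the elapsed time on a monotonic microsecond clock (the Pico SDK's `time_us_64()` is one candidate) has reached `due_us(n, 32, 48000, 200)`, then increment `n`. After a stall, several blocks are already due, so the loop catches up instead of shifting the whole schedule.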

Cross-core design (the part worth reviewing)

The converter is handed off between cores via release-store/acquire-load. Consumer cycle stats cross via a seqlock over 32-bit relaxed atomics, because 64-bit std::atomic is not lock-free on the M33 (the same constraint as the library's own telemetry). Teardown orders core1's final pull before destruction. DWT access was verified per core in the SDK headers (CYCCNT is enabled on core1 by core1), and a runtime NOCYCCNT guard stays in place because the SVD-derived reset value is Arm's generic template, not silicon truth.
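For reviewers less familiar with the pattern, a seqlock over 32-bit atomics can be sketched as follows. This is a generic single-writer seqlock under assumed names, not the firmware's implementation: `PullStats`, `StatsSeqlock`, and the field layout (including the split 64-bit sum) are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// The sketch relies on 32-bit atomics being lock-free.
static_assert(std::atomic<std::uint32_t>::is_always_lock_free,
              "sketch assumes lock-free 32-bit atomics");

// Hypothetical stats record; field names are illustrative only.
struct PullStats {
    std::uint32_t blocks;      // pull() calls timed so far
    std::uint32_t max_cycles;  // worst-case cycles for one pull()
    std::uint32_t sum_lo;      // 64-bit cycle sum, low half
    std::uint32_t sum_hi;      // 64-bit cycle sum, high half
};

class StatsSeqlock {
public:
    // Writer side: exactly one writer (the consumer core in this sketch).
    void write(const PullStats& s) {
        const std::uint32_t seq = seq_.load(std::memory_order_relaxed);
        seq_.store(seq + 1, std::memory_order_relaxed);  // odd: write in progress
        // Keep the odd marker ordered before the payload stores.
        std::atomic_thread_fence(std::memory_order_release);
        blocks_.store(s.blocks, std::memory_order_relaxed);
        max_cycles_.store(s.max_cycles, std::memory_order_relaxed);
        sum_lo_.store(s.sum_lo, std::memory_order_relaxed);
        sum_hi_.store(s.sum_hi, std::memory_order_relaxed);
        seq_.store(seq + 2, std::memory_order_release);  // even: published
    }

    // Reader side: returns false if a write overlapped; the caller retries.
    bool try_read(PullStats& out) const {
        const std::uint32_t before = seq_.load(std::memory_order_acquire);
        if (before & 1u) return false;  // writer is mid-update
        out.blocks = blocks_.load(std::memory_order_relaxed);
        out.max_cycles = max_cycles_.load(std::memory_order_relaxed);
        out.sum_lo = sum_lo_.load(std::memory_order_relaxed);
        out.sum_hi = sum_hi_.load(std::memory_order_relaxed);
        // Keep the payload loads ordered before the sequence re-check.
        std::atomic_thread_fence(std::memory_order_acquire);
        return seq_.load(std::memory_order_relaxed) == before;
    }

private:
    std::atomic<std::uint32_t> seq_{0};
    std::atomic<std::uint32_t> blocks_{0};
    std::atomic<std::uint32_t> max_cycles_{0};
    std::atomic<std::uint32_t> sum_lo_{0};
    std::atomic<std::uint32_t> sum_hi_{0};
};
```

The reader never blocks the writer, which suits a consumer core that must not be delayed by telemetry. The cost is that a reader may have to retry if it races a write.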

Verified

Builds to pico2_dualcore.uf2 (135.7 KB; flash image 67.4 KB, static RAM 27.8 KB; the 12-channel phase peaks at ~175 KB of heap out of 520 KB SRAM) with zero warnings, clang-format clean, using the same pinned-SDK pattern as pico2_cyccnt. Not run on hardware: that needs your Pico 2. The firmware prints its own verdicts (SUMMARY lines, then SRT_PICO2_DUALCORE_DONE).

https://claude.ai/code/session_01HuAFfoeD5a5Xe5aGNA16M9


Generated by Claude Code

The dual-core deliverable of docs/HARDWARE_TESTING.md Setup 2: the
converter's two ends on the RP2350's two Cortex-M33 cores, one core per
clock domain — the deployment the README prescribes for configurations
that are tight on one 150 MHz core.

- core0 = producer: push(32) busy-paced at rate*(1+200e-6) on the shared
  microsecond timebase (absolute integer-rational due times, so USB
  telemetry stalls are followed by catch-up, not schedule slip), plus
  1 Hz telemetry over USB stdio.
- core1 = consumer: pull(32) paced at the exact nominal rate, every call
  timed with its own DWT.CYCCNT (the PPB is core-local on the RP2350, so
  the counter is enabled on core1; the SVD's NOCYCCNT=1 reset value is
  Arm template junk — the runtime check stays the gate).
- Cross-core: the library's SPSC push/pull contract is satisfied by
  cores exactly as by threads; phase handoff is a release/acquire atomic
  pointer, consumer stats cross in a seqlock of 32-bit atomics (64-bit
  atomics are not lock-free on the M33).
- Phase A: Q15 stereo balanced() at 48 kHz. Phase B: Q15 12-channel at
  16 kHz with band edges and servo scaled 16/48 per the README — the
  QEMU baseline (pipeline12_q15, 10,027 insns/frame) puts the 48 kHz
  12-channel datapath >3x over one core's 3,125-cycle frame budget, and
  a single instance's pull() cannot be split across cores.
- Each ~30 s phase ends in a PASS/FAIL summary (lock within 2 s/6 s,
  |ppm-200|<5 after settling, zero underruns/overruns/resyncs after
  lock, pull cycles mean/p99/max and % of core), then
  SRT_PICO2_DUALCORE_DONE.
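
The frame-budget arithmetic behind the Phase B rate choice can be reproduced in a few lines. This is a sketch only: the helper names are hypothetical, and it compares the QEMU instruction count against core cycles one-to-one, as the text above does, which real silicon may not match exactly.

```cpp
#include <cstdint>

// Hypothetical helper: core cycles available per audio frame.
constexpr std::uint32_t frame_budget_cycles(std::uint32_t core_hz,
                                            std::uint32_t rate_hz) {
    return core_hz / rate_hz;
}

// Hypothetical helper: per-frame cost as a multiple of the budget,
// in tenths and rounded down (32 means about 3.2x).
constexpr std::uint32_t budget_ratio_x10(std::uint32_t cost_per_frame,
                                         std::uint32_t budget) {
    return cost_per_frame * 10u / budget;
}

// 150 MHz core, 48 kHz frames: 3,125 cycles per frame.
constexpr std::uint32_t kBudget48k = frame_budget_cycles(150'000'000u, 48'000u);

// pipeline12_q15 baseline of 10,027 insns/frame against that budget:
// 32 tenths, i.e. a little over 3x one core.
constexpr std::uint32_t kRatio48k = budget_ratio_x10(10'027u, kBudget48k);
```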

Same standalone SDK pinning as pico2_cyccnt (Pico SDK 2.1.1 FetchContent
with lib/tinyusb only, PICO_BOARD=pico2, exceptions on) plus
pico_multicore. Verified: compiles to pico2_dualcore.uf2 with
arm-none-eabi-gcc 13.2.1 (flash image 67,412 B: .text 52,284 + .rodata
3,328 + .data 10,788; static RAM 27.8 KB = .data 10,788 + .bss 17,008
incl. 8 KB histogram + 4 KB core1 stack; 12-channel phase peaks ~175 KB
heap of 520 KB). Not run — no RP2350 hardware available here.

https://claude.ai/code/session_01HuAFfoeD5a5Xe5aGNA16M9
@tap tap merged commit 045de5d into main Jun 12, 2026
24 checks passed