Fix use-after-free in async host ledger reads on shutdown by Copilot · Pull Request #8003 · microsoft/CCF

Copilot · 2026-06-29T17:20:57Z

Async uv_work_t ledger reads in src/host/ledger.h could have their worker and completion callbacks run after the host Ledger was destroyed during shutdown (uv_stop ends the main loop, then the post-loop flushing iterations run after Ledger has left scope in run_main_loop). The worker dereferenced data->ledger, a use-after-free. Per @eddyashton's analysis, the other flagged sites (timer.h, process_launcher.h, Signal, HTTPParser) are already safe via libuv proxy/close pointers or run-scoped lifetimes.

Changes (all in `src/host/ledger.h`)

AsyncReadState control block - shared_ptr-owned by Ledger and copied into every outstanding AsyncLedgerGet. Holds a mutex, condition variable, active_workers count, and ledger_alive flag. It outlives the Ledger, so worker callbacks can safely decide whether they may still access it during shutdown.
on_ledger_get_async - registers as an active worker under the lock; if ledger_alive is already false it skips the read entirely (logged at DEBUG). The raw Ledger* is only dereferenced after observing ledger_alive == true and incrementing the count, which forces the destructor to wait. An RAII guard decrements active_workers and notifies waiters on every exit path from the worker.
~Ledger() - sets ledger_alive = false and blocks until active_workers == 0, draining in-flight reads before member destruction.
The completion callback remains Ledger-independent, writing back through a captured to_enclave shared_ptr.

The lock ordering is safe: a worker either registers before the destructor checks (destructor waits, Ledger stays alive through the read) or observes ledger_alive == false and never touches the Ledger. The AsyncReadState stays alive via the worker's own shared_ptr, so member-destruction order is irrelevant.

Scope

Shutdown-only; running nodes are unaffected. A CHANGELOG.md entry is included.

Closes #3501 Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot

Pull request overview

This PR addresses a shutdown-only use-after-free risk in the host’s async ledger read path (uv_work_t threadpool jobs) by ensuring worker callbacks never dereference a Ledger* after the owning asynchost::Ledger has been destroyed. It does this by introducing a shared control block that outlives the Ledger and by having ~Ledger() wait for in-flight worker callbacks to finish before allowing member destruction.

Changes:

Added a shared_ptr-owned AsyncReadState (mutex + condition variable + active_workers + ledger_alive) shared between Ledger and each AsyncLedgerGet.
Updated on_ledger_get_async to (a) skip reads once shutdown begins and (b) register/deregister active worker callbacks so ~Ledger() can safely drain them.
Added a CHANGELOG entry documenting the shutdown use-after-free fix.

Custom instructions used:

.github/copilot-instructions.md
.github/instructions/reviewing.instructions.md

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
`src/host/ledger.h`	Introduces `AsyncReadState` and shutdown coordination so async libuv workers don’t access a destroyed `Ledger` during/after shutdown.
`CHANGELOG.md`	Documents the shutdown UAF fix with a PR reference.

achamayou · 2026-06-30T20:56:30Z

No blocking issues found.

I reviewed PR #8003 against main, focusing on src/host/ledger.h async uv_work_t reads, the run_main_loop shutdown order, the linked #3501 concern, minimality, and the repo review instructions.

Correctness:

The bug is real: uv_stop in handle_ring_buffer.h exits uv_run without draining threadpool work; Ledger is stack-local in run_main_loop and is destroyed before the post-loop uv_loop_close drain. A pending or running on_ledger_get_async could previously dereference data->ledger after destruction.
The AsyncReadState protocol is sound: ledger_alive and active_workers are checked/updated under one mutex, so a worker either registers before the destructor can finish or observes shutdown and never touches Ledger. condition_variable::wait releases the mutex, and the decrement happens in the worker callback, so this does not depend on the stopped main loop running.
Completion callbacks remain Ledger-independent: result_cb only uses the captured WriterPtr and request data, and AsyncLedgerGet/uv_work_t are still deleted in completion. The captured writer chain/circuit outlives the loop-close drain.
I did not find lock-order inversion with state_lock/read_cache_lock; async_state->lock is not held during read_entries_range.
The change is local and minimal for the problem. uv_cancel alone would not cover running work, and moving the loop-close drain earlier would have broader shutdown implications.
The CHANGELOG.md entry is appropriate under the current Fixed section.

Non-blocking suggestions:

Consider making the active_workers decrement exception-safe with a small scope guard around read_entries_range (src/host/ledger.h:1763-1774). An exception escaping the libuv worker would already be fatal today, so I would not block on this, but the new destructor invariant would be more self-contained if every exit path decremented and notified.
The comments around AsyncReadState slightly overstate its role in making completion callbacks safe. It protects the worker's access to Ledger; completion is safe because it no longer touches Ledger and captures to_enclave by value. Tightening that wording would reduce future confusion.
I do not think a deterministic test is required here. The failure is a shutdown race around libuv worker scheduling; an ASAN stress test could add confidence but would likely be timing-sensitive.

CI note: At the time of review, the main VM/ACI checks for the current head were still queued or in progress; ASAN/TSAN and long jobs were skipped by workflow configuration.

Custom instructions used:

.github/copilot-instructions.md
.github/instructions/reviewing.instructions.md

Make the active worker accounting exception-safe and clarify that AsyncReadState protects worker access to Ledger while completion callbacks are Ledger-independent. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Initial plan

6ba0d62

Copilot AI assigned Copilot and achamayou Jun 29, 2026

Copilot started work on behalf of achamayou June 29, 2026 17:21 View session

Copilot AI and others added 2 commits June 29, 2026 17:32

Fix use-after-free in async ledger reads on shutdown

69657fa

Closes #3501 Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Address review: log skipped async reads and clarify lifetime safety

4c46e41

Closes #3501 Co-authored-by: achamayou <4016369+achamayou@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix potential async cleanup problems~~ Fix use-after-free in async host ledger reads on shutdown Jun 29, 2026

Copilot finished work on behalf of achamayou June 29, 2026 17:35

Copilot AI requested a review from achamayou June 29, 2026 17:35

achamayou requested a review from Copilot June 29, 2026 20:26

Merge branch 'main' into copilot/fix-potential-async-cleanup-problems

b505815

Copilot started reviewing on behalf of achamayou June 29, 2026 20:26 View session

Copilot AI reviewed Jun 29, 2026

View reviewed changes

Merge branch 'main' into copilot/fix-potential-async-cleanup-problems

15fb45b

Address async ledger read review comments

f527f01

Make the active worker accounting exception-safe and clarify that AsyncReadState protects worker access to Ledger while completion callbacks are Ledger-independent. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix use-after-free in async host ledger reads on shutdown#8003

Fix use-after-free in async host ledger reads on shutdown#8003
Copilot wants to merge 6 commits into
mainfrom
copilot/fix-potential-async-cleanup-problems

Copilot AI commented Jun 29, 2026 •

edited by achamayou

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

achamayou commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

Copilot AI commented Jun 29, 2026 • edited by achamayou Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes (all in src/host/ledger.h)

Scope

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

achamayou commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Copilot AI commented Jun 29, 2026 •

edited by achamayou

Loading

Changes (all in `src/host/ledger.h`)