fix(captions): prevent Metal memory exhaustion when generating subtitles #1949
ManthanNimodiya wants to merge 3 commits into
… prevent Metal memory exhaustion
| // Release the cached context immediately after use so Metal buffers | ||
| // (~500 MB on Apple Silicon) are freed rather than held until the | ||
| // editor closes. The next call will reload the model as needed. | ||
| { | ||
| let mut ctx = WHISPER_CONTEXT.lock().await; | ||
| *ctx = None; | ||
| } |
Releasing the cached context on every platform means we’ll reload the model for every Whisper run (potential perf regression on non-Apple Silicon). If the memory pressure issue is specifically Apple Silicon, consider gating this to macos/aarch64.
| // Release the cached context immediately after use so Metal buffers | |
| // (~500 MB on Apple Silicon) are freed rather than held until the | |
| // editor closes. The next call will reload the model as needed. | |
| { | |
| let mut ctx = WHISPER_CONTEXT.lock().await; | |
| *ctx = None; | |
| } | |
| // Release the cached context immediately after use so Metal buffers | |
| // (~500 MB on Apple Silicon) are freed rather than held until the | |
| // editor closes. The next call will reload the model as needed. | |
| #[cfg(all(target_os = "macos", target_arch = "aarch64"))] | |
| { | |
| let mut ctx = WHISPER_CONTEXT.lock().await; | |
| *ctx = None; | |
| } |
| // WhisperState / Parakeet session exists at a time. Without this, rapid | ||
| // re-clicks spawn N concurrent sessions each consuming ~700 MB of Metal | ||
| // (unified) memory on Apple Silicon, which produced the observed 44 GB spike. | ||
| let _transcription_guard = TRANSCRIPTION_LOCK.lock().await; |
Minor thought: if rapid re-clicks also trigger redundant extract_audio_from_video work, you may want to acquire TRANSCRIPTION_LOCK earlier (before extraction) so only one click does the full pipeline at a time. Current placement still prevents concurrent model sessions, but multiple extractions can run in parallel.
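A minimal sketch of the earlier-acquisition idea, using only the standard library so it runs on its own. `PIPELINE_LOCK`, `extract_audio_stub`, `transcribe_stub`, and the counters are hypothetical stand-ins, not the PR's code; threads simulate rapid re-clicks and the counters record how many extractions overlap.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the PR's TRANSCRIPTION_LOCK.
static PIPELINE_LOCK: Mutex<()> = Mutex::new(());
// Instrumentation for this sketch only: running extractions and their peak.
static ACTIVE_EXTRACTIONS: AtomicUsize = AtomicUsize::new(0);
static PEAK_EXTRACTIONS: AtomicUsize = AtomicUsize::new(0);

// Placeholder for extract_audio_from_video; the real function decodes media.
fn extract_audio_stub() -> Vec<f32> {
    let now = ACTIVE_EXTRACTIONS.fetch_add(1, Ordering::SeqCst) + 1;
    PEAK_EXTRACTIONS.fetch_max(now, Ordering::SeqCst);
    thread::sleep(Duration::from_millis(10));
    ACTIVE_EXTRACTIONS.fetch_sub(1, Ordering::SeqCst);
    vec![0.0; 16]
}

// Placeholder for the Whisper / Parakeet call.
fn transcribe_stub(samples: &[f32]) -> usize {
    samples.len()
}

fn run_pipeline() -> usize {
    // Take the lock before extraction so the whole pipeline is serialised,
    // not just the model session.
    let _guard = PIPELINE_LOCK.lock().unwrap_or_else(|p| p.into_inner());
    let samples = extract_audio_stub();
    transcribe_stub(&samples)
}

// Simulates `clicks` rapid re-clicks and returns the peak number of
// extractions that ran at the same time.
fn simulate_clicks(clicks: usize) -> usize {
    let handles: Vec<_> = (0..clicks).map(|_| thread::spawn(run_pipeline)).collect();
    for h in handles {
        h.join().unwrap();
    }
    PEAK_EXTRACTIONS.load(Ordering::SeqCst)
}
```

The trade-off is that queued clicks wait for the entire pipeline, so extraction for the next request can no longer overlap with a transcription already in progress.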
| // WhisperState / Parakeet session exists at a time. Without this, rapid | ||
| // re-clicks spawn N concurrent sessions each consuming ~700 MB of Metal | ||
| // (unified) memory on Apple Silicon, which produced the observed 44 GB spike. | ||
| let _transcription_guard = TRANSCRIPTION_LOCK.lock().await; |
Cancelled Commands Release The Slot
When the async command is dropped while the spawn_blocking transcription is still running, this guard is dropped but the blocking worker keeps using the ML resources. A retry can then acquire TRANSCRIPTION_LOCK and start another Whisper or Parakeet worker, so cancel-and-retry or window-close-and-retry can still create overlapping transcription sessions and hit the same memory exhaustion this lock is meant to prevent.
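One possible direction, sketched with the standard library only: let the blocking worker take the lock itself, so the slot stays occupied until the work actually finishes. Here `std::thread::spawn` stands in for `spawn_blocking`, dropping the `JoinHandle` stands in for the cancelled command, and `WORKER_LOCK`, `transcribe_blocking`, and the counters are hypothetical names for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread::{self, JoinHandle};
use std::time::Duration;

// Hypothetical stand-in for TRANSCRIPTION_LOCK, as a std (blocking) mutex.
static WORKER_LOCK: Mutex<()> = Mutex::new(());
// Instrumentation for this sketch only.
static ACTIVE_SESSIONS: AtomicUsize = AtomicUsize::new(0);
static PEAK_SESSIONS: AtomicUsize = AtomicUsize::new(0);
static COMPLETED: AtomicUsize = AtomicUsize::new(0);

// Placeholder for the blocking Whisper / Parakeet run.
fn transcribe_blocking() {
    let now = ACTIVE_SESSIONS.fetch_add(1, Ordering::SeqCst) + 1;
    PEAK_SESSIONS.fetch_max(now, Ordering::SeqCst);
    thread::sleep(Duration::from_millis(20));
    ACTIVE_SESSIONS.fetch_sub(1, Ordering::SeqCst);
    COMPLETED.fetch_add(1, Ordering::SeqCst);
}

// The worker acquires the lock itself, so the guard lives exactly as long as
// the blocking work, whatever happens to the caller.
fn spawn_worker() -> JoinHandle<()> {
    thread::spawn(|| {
        let _guard = WORKER_LOCK.lock().unwrap_or_else(|p| p.into_inner());
        transcribe_blocking();
    })
}

// Starts a worker, abandons it (the "cancel"), retries immediately, and
// returns the peak number of overlapping sessions once both have finished.
fn cancel_and_retry() -> usize {
    let done_before = COMPLETED.load(Ordering::SeqCst);
    // Dropping the handle detaches the thread; the worker keeps running.
    drop(spawn_worker());
    let retry = spawn_worker();
    retry.join().unwrap();
    while COMPLETED.load(Ordering::SeqCst) < done_before + 2 {
        thread::sleep(Duration::from_millis(1));
    }
    PEAK_SESSIONS.load(Ordering::SeqCst)
}
```

Because the guard is owned by the worker rather than by the awaiting caller, abandoning the caller cannot free the slot early; the retry simply waits behind the still-running session.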
Path: apps/desktop/src-tauri/src/captions.rs
Line: 1119
| let mut ctx = WHISPER_CONTEXT.lock().await; | ||
| *ctx = None; |
Whisper Cache Always Evicted
This clears the cached Whisper context after every Whisper run on all platforms, although the memory issue described here is specific to Apple Silicon Metal buffers. On Windows, Linux, and Intel macOS, repeated subtitle generation now reloads the Whisper model from disk each time instead of reusing the warmed `WHISPER_CONTEXT`, causing avoidable latency and memory churn with no Metal-memory benefit.
Path: apps/desktop/src-tauri/src/captions.rs
Line: 1158-1159
…ontext eviction to aarch64
| .await | ||
| .map_err(|e| format!("Parakeet task panicked: {e}"))? | ||
| tokio::task::spawn_blocking(move || { | ||
| let _guard = TRANSCRIPTION_LOCK.lock().unwrap(); |
std::sync::Mutex::lock() returns a PoisonError if a prior transcription panicked while holding the lock; unwrap() would then panic on every later call and break subtitles until restart. Might be worth recovering here (same applies to the Whisper lock below).
| let _guard = TRANSCRIPTION_LOCK.lock().unwrap(); | |
| let _guard = TRANSCRIPTION_LOCK | |
| .lock() | |
| .unwrap_or_else(|poisoned| poisoned.into_inner()); |
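To illustrate the recovery suggested above, here is a small standalone sketch. `DEMO_LOCK` and both helper functions are hypothetical names; the panic is simulated on a separate thread. Since the mutex guards `()`, there is no protected data that a panic could leave half-updated, which is why taking the inner guard looks safe in this case.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical stand-in for TRANSCRIPTION_LOCK.
static DEMO_LOCK: Mutex<()> = Mutex::new(());

// Poisons the lock by panicking on another thread while holding the guard.
fn poison_lock() {
    let result = thread::spawn(|| {
        let _guard = DEMO_LOCK.lock().unwrap_or_else(|p| p.into_inner());
        panic!("simulated transcription panic");
    })
    .join();
    // The worker thread panicked, so join reports an error.
    assert!(result.is_err());
}

// Returns true if the lock was poisoned and could still be acquired.
fn lock_recovers_after_panic() -> bool {
    poison_lock();
    let was_poisoned = DEMO_LOCK.is_poisoned();
    // `lock().unwrap()` would panic here; recovering the guard does not.
    let _guard = DEMO_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    was_poisoned
}
```

Calling `lock_recovers_after_panic()` prints the simulated panic message to stderr from the worker thread, then acquires the poisoned lock without panicking.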
On Apple Silicon, each `WhisperState` allocates ~700 MB of Metal (unified) memory. Without serialisation, rapid re-clicks on the subtitle button spawned N concurrent transcription sessions, exhausting RAM (44 GB observed for ~60 retries).
Changes:
- Add `TRANSCRIPTION_LOCK: Mutex<()>` to ensure at most one transcription runs at a time
- Acquire it in `transcribe_audio` before entering the engine `match`
- Release the cached `WhisperContext` immediately after Whisper finishes so Metal buffers (~500 MB) are freed rather than held until the editor closes
Greptile Summary
This PR changes subtitle transcription to reduce ML memory pressure. The main changes are:
Confidence Score: 4/5
The cancellation path for transcription needs a fix before merging.
apps/desktop/src-tauri/src/captions.rs
Important Files Changed
Reviews (1): Last reviewed commit: "fix(captions): serialise transcription a..."