[ExecuTorch][WebGPU] Add view_copy op (aten.view_copy.default) #20360
JulianCloudNTH wants to merge 3 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20360
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures, 3 Unrelated Failures
As of commit 867bd73 with merge base 0e65ba6:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
@claude review
Claude finished @JulianCloudNTH's task in 1m 14s. Code Review: WebGPU
Stack from ghstack (oldest at bottom):
Adds `aten.view_copy.default` to the WebGPU delegate. A contiguous reshape on a dense row-major buffer backend is a flat copy `output[i] = input[i]`, so the op is a single 1D-dispatch copy kernel.

Composition (single compute dispatch):

- `runtime/ops/view_copy/view_copy.h`: declares `add_flat_copy(graph, in_id, out_id)`, which applies fail-loud guards (both arguments are tensors, fp32, numel-preserving) and dispatches `view_copy.wgsl` over `compute_1d_workgroup_count(num_elements)` with `override wg_size`; mirrors Vulkan `add_view_copy_node`.
- `runtime/ops/view_copy/ViewCopy.cpp`: reads `args = [self, size, out]`, ignores the AOT-fixed `size` value-id (the output shape comes from `out_tensor.dims`), and calls `add_flat_copy`.
- `runtime/ops/view_copy/view_copy.wgsl`: guards `idx >= num_elements`, writes `output[idx] = input[idx]`.

`add_flat_copy` is factored into the header so the stacked squeeze/unsqueeze ops reuse it without a new kernel.

@exported-using-ghexport
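For reviewers who want to reason about the dispatch on the CPU, here is a minimal reference sketch of the flat-copy semantics described above. It is not the code in this PR: `workgroup_count_1d`, `flat_copy_reference`, and the default workgroup size of 64 are hypothetical names and values chosen for illustration, and the real `compute_1d_workgroup_count` and `add_flat_copy` may differ in signature and behavior. It only shows the ceiling-division workgroup count, the numel-preserving guard, and the `idx >= num_elements` bounds check.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (assumption, not the PR's compute_1d_workgroup_count):
// number of workgroups needed so that groups * wg_size >= num_elements.
// Written as divide-then-round-up to avoid overflow in num_elements + wg_size.
inline uint32_t workgroup_count_1d(uint32_t num_elements, uint32_t wg_size) {
  if (wg_size == 0) {
    throw std::invalid_argument("wg_size must be nonzero");
  }
  return num_elements / wg_size + (num_elements % wg_size != 0 ? 1u : 0u);
}

// Hypothetical CPU emulation of a 1D flat-copy dispatch. Each (group, local)
// pair plays the role of one shader invocation; fp32 is modeled with float.
inline void flat_copy_reference(const std::vector<float>& input,
                                std::vector<float>& output,
                                uint32_t wg_size = 64) {
  // Fail-loud guard: a view must preserve the element count.
  if (input.size() != output.size()) {
    throw std::invalid_argument("view_copy must preserve numel");
  }
  const uint32_t num_elements = static_cast<uint32_t>(input.size());
  const uint32_t groups = workgroup_count_1d(num_elements, wg_size);

  for (uint32_t group = 0; group < groups; ++group) {
    for (uint32_t local = 0; local < wg_size; ++local) {
      const uint32_t idx = group * wg_size + local;
      // Bounds guard: the last workgroup may have invocations past the end.
      if (idx >= num_elements) {
        continue;
      }
      output[idx] = input[idx];
    }
  }
}
```

Because the copy never consults the tensor shapes, the same routine would serve any contiguous reshape, which is the property that lets squeeze and unsqueeze share the kernel.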
Differential Revision: D108793164