Add L2 score mod distributed attention shape by vcherepanov-nv · Pull Request #3147 · NVIDIA/TransformerEngine

vcherepanov-nv · 2026-06-25T19:38:16Z

Description

Add L2 score mod distributed attention shape

Fixes # (issue)

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

Please list the changes introduced in this PR:

Add L2 shape to fix L2 tests

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

greptile-apps · 2026-06-25T19:40:07Z

Greptile Summary

This PR adds the "L2" key to DISTRIBUTED_SCORE_MOD_DATA_SHAPES in the distributed fused-attention test file. Before this change, running the test suite with NVTE_JAX_UNITTEST_LEVEL=L2 raised a ValueError because the key was absent; the addition prevents that crash.

The new entry is "L2": [], an empty list, which resolves to zero parametrized test cases at L2. The ValueError is avoided, but no score-mod distributed-attention tests actually run at that level.
Every other shape dictionary in this file (DISTRIBUTED_SELF_ATTN_DATA_SHAPES, DISTRIBUTED_CROSS_ATTN_DATA_SHAPES) provides a distinct, non-empty shape tuple for L2, suggesting at least one concrete shape should be supplied here as well.
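
The before/after behaviour described in this summary can be sketched as follows. This is a minimal illustration, not the repository's helper: `lookup_shapes_for_level` and `describe` are hypothetical names standing in for the level lookup, and only the dictionary contents and the "Unsupported test level" error text are taken from this review.

```python
# Hypothetical stand-in for the level lookup; not the repository's code.
SHAPES_BEFORE_PR = {"L0": [], "L1": [(4, 16, 4, 64)]}
SHAPES_AFTER_PR = {**SHAPES_BEFORE_PR, "L2": []}


def lookup_shapes_for_level(shape_table, level):
    """Return the shapes registered for a test level, or fail loudly."""
    if level not in shape_table:
        raise ValueError(f"Unsupported test level: {level}")
    return shape_table[level]


def describe(shape_table, level):
    """Summarise what a run at the given level would parametrize."""
    try:
        shapes = lookup_shapes_for_level(shape_table, level)
    except ValueError as err:
        return f"error: {err}"
    return f"{len(shapes)} case(s)"


print(describe(SHAPES_BEFORE_PR, "L2"))  # error: Unsupported test level: L2
print(describe(SHAPES_AFTER_PR, "L2"))   # 0 case(s)
print(describe(SHAPES_AFTER_PR, "L1"))   # 1 case(s)
```

The missing key and the empty list are different failure modes: the first stops the run with an exception, while the second lets the run proceed with nothing to execute.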

Confidence Score: 4/5

The change prevents a crash when running at L2 level, but leaves L2 with no actual test coverage for the score-mod attention path.

The single-line fix resolves the ValueError from the missing L2 key, but "L2": [] means zero test cases are parametrized when the suite runs at L2. The stated goal — fixing L2 tests — is not achieved: no score-mod distributed-attention tests execute at that level.

tests/jax/test_distributed_fused_attn.py — the L2 entry in DISTRIBUTED_SCORE_MOD_DATA_SHAPES needs at least one shape tuple.

Important Files Changed

Filename	Overview
tests/jax/test_distributed_fused_attn.py	Adds "L2": [] to DISTRIBUTED_SCORE_MOD_DATA_SHAPES; while this prevents the ValueError that occurred when L2 was missing from the dict, the empty list means no score-mod tests are parametrized or executed at L2 level.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["Test run starts\n(NVTE_JAX_UNITTEST_LEVEL=L2)"] --> B["pytest_parametrize_wrapper\ncalled with DISTRIBUTED_SCORE_MOD_DATA_SHAPES"]
    B --> C["get_parameters_for_test_level\nlooks up 'L2' key"]
    C --> D{"Key exists?"}
    D -- "Before PR\n(key missing)" --> E["ValueError:\nUnsupported test level"]
    D -- "After PR\n(key = [])" --> F["returns empty list []"]
    F --> G["pytest.mark.parametrize\n('data_shape', [])"]
    G --> H["0 test cases collected\nor pytest warning/skip"]
    H --> I["No score-mod L2 tests run"]
    style E fill:#f88,stroke:#c00
    style I fill:#fa8,stroke:#c60

Reviews (2): Last reviewed commit: "Add L2 score mod distributed attention s..."

KshitijLakhani · 2026-06-25T22:14:06Z

 DISTRIBUTED_SCORE_MOD_DATA_SHAPES = {
    "L0": [],
    "L1": [(4, 16, 4, 64)],
+    "L2": [(4, 16, 4, 64)],


I think it should be (assuming you want this to run as an L1 test):

DISTRIBUTED_SCORE_MOD_DATA_SHAPES = {
    "L0": [],
    "L1": [(4, 16, 4, 64)],
    "L2": [],
}

What you have will run the same tests for L1 and L2, thereby duplicating effort.

Please urgently launch a pipeline with a JAX build manually for L0, L1 and L2 levels and confirm that it runs successfully before merging
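
The duplication concern raised in this comment can be checked mechanically. The sketch below is illustrative only: `shapes_shared_across_levels` is a hypothetical helper, and it assumes that a shape listed under two levels is executed once per level when both levels are run.

```python
from collections import defaultdict


def shapes_shared_across_levels(shape_table):
    """Map each shape to the levels listing it, keeping only shared shapes."""
    levels_by_shape = defaultdict(list)
    for level, shapes in shape_table.items():
        for shape in shapes:
            levels_by_shape[shape].append(level)
    return {s: lv for s, lv in levels_by_shape.items() if len(lv) > 1}


# Dictionary as first pushed in this PR, and as suggested in the comment.
as_first_pushed = {"L0": [], "L1": [(4, 16, 4, 64)], "L2": [(4, 16, 4, 64)]}
as_suggested = {"L0": [], "L1": [(4, 16, 4, 64)], "L2": []}

print(shapes_shared_across_levels(as_first_pushed))  # {(4, 16, 4, 64): ['L1', 'L2']}
print(shapes_shared_across_levels(as_suggested))     # {}
```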

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

greptile-apps · 2026-06-25T22:45:59Z

 DISTRIBUTED_SCORE_MOD_DATA_SHAPES = {
    "L0": [],
    "L1": [(4, 16, 4, 64)],
+    "L2": [],


L2 level resolves to zero test cases

"L2": [] is passed to pytest_parametrize_wrapper, which calls get_parameters_for_test_level and returns the empty list. That list is forwarded directly to pytest.mark.parametrize("data_shape", []). With an empty parameter set, pytest skips the test by default; the empty_parameter_set_mark ini option can instead mark it xfail or raise an error at collection (fail_at_collect). Either way, when NVTE_JAX_UNITTEST_LEVEL=L2 is used in CI, no TestDistributedScoreModSelfAttn cases will execute. The PR description says this change "fixes L2 tests", but the fix needs at least one concrete shape tuple, following the pattern used by DISTRIBUTED_SELF_ATTN_DATA_SHAPES, where L2 carries [(32, 512, 12, 64)].

vcherepanov-nv requested a review from KshitijLakhani June 25, 2026 19:38

vcherepanov-nv added the 2.17 label Jun 25, 2026

greptile-apps Bot reviewed Jun 25, 2026

Comment thread tests/jax/test_distributed_fused_attn.py Outdated

KshitijLakhani requested changes Jun 25, 2026

Add L2 score mod distributed attention shape

6e9fb06

Signed-off-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

vcherepanov-nv force-pushed the fix-jax-l2-tests branch from 286f9be to 6e9fb06 Compare June 25, 2026 22:43

greptile-apps Bot reviewed Jun 25, 2026

vcherepanov-nv wants to merge 1 commit into NVIDIA:main from vcherepanov-nv:fix-jax-l2-tests

2 participants
