diff --git a/README/WHATS_NEW_zh-CN.md b/README/WHATS_NEW_zh-CN.md index d4c92bce..519be585 100644 --- a/README/WHATS_NEW_zh-CN.md +++ b/README/WHATS_NEW_zh-CN.md @@ -1,5 +1,11 @@ # 本次更新 — AutoControl +## 本次更新 (2026-06-24) — 扩充 UIA 控制模式(展开 / 选取 / 范围 / 滚动) + +以原生模式驱动树节点、列表/下拉项目、滑块与滚动,而非像素猜测。完整参考:[`docs/source/Zh/doc/new_features/v181_features_doc.rst`](../docs/source/Zh/doc/new_features/v181_features_doc.rst)。 + +- **`expand_control` / `collapse_control` / `control_expand_state` / `select_control_item` / `control_range` / `set_control_range` / `scroll_control_into_view`**(`AC_expand_control`、`AC_select_control_item`、`AC_set_control_range` 等):无障碍后端原本只有 Value/Invoke/Toggle/Grid-read 模式,故树状视图、列表/下拉、滑块与屏幕外行都没有原生调用路径。本功能在既有后端 ABC 之上补上 ExpandCollapse / SelectionItem / RangeValue / ScrollItem 模式,通过可注入的 `accessibility.backends.get_backend()` 接缝分派(以 fake backend 无头测试;真正 UIA 调用在 Windows 后端)。不导入 `PySide6`。 + ## 本次更新 (2026-06-24) — 匹配前安定门 + 命中稳定性 避免在动画进行中匹配,并确认命中跨帧维持稳定。完整参考:[`docs/source/Zh/doc/new_features/v180_features_doc.rst`](../docs/source/Zh/doc/new_features/v180_features_doc.rst)。 diff --git a/README/WHATS_NEW_zh-TW.md b/README/WHATS_NEW_zh-TW.md index bb67dab1..bfd2f407 100644 --- a/README/WHATS_NEW_zh-TW.md +++ b/README/WHATS_NEW_zh-TW.md @@ -1,5 +1,11 @@ # 本次更新 — AutoControl +## 本次更新 (2026-06-24) — 擴充 UIA 控制模式(展開 / 選取 / 範圍 / 捲動) + +以原生模式驅動樹節點、清單/下拉項目、滑桿與捲動,而非像素猜測。完整參考:[`docs/source/Zh/doc/new_features/v181_features_doc.rst`](../docs/source/Zh/doc/new_features/v181_features_doc.rst)。 + +- **`expand_control` / `collapse_control` / `control_expand_state` / `select_control_item` / `control_range` / `set_control_range` / `scroll_control_into_view`**(`AC_expand_control`、`AC_select_control_item`、`AC_set_control_range` 等):無障礙後端原本只有 Value/Invoke/Toggle/Grid-read 模式,故樹狀檢視、清單/下拉、滑桿與螢幕外列都沒有原生呼叫路徑。本功能在既有後端 ABC 之上補上 ExpandCollapse / SelectionItem / RangeValue / ScrollItem 模式,透過可注入的 `accessibility.backends.get_backend()` 接縫分派(以 fake backend 無頭測試;真正 UIA 呼叫在 Windows 後端)。不匯入 
`PySide6`。 + ## 本次更新 (2026-06-24) — 比對前安定閘 + 命中穩定性 避免在動畫進行中比對,並確認命中跨幀維持穩定。完整參考:[`docs/source/Zh/doc/new_features/v180_features_doc.rst`](../docs/source/Zh/doc/new_features/v180_features_doc.rst)。 diff --git a/WHATS_NEW.md b/WHATS_NEW.md index 1104cf7a..15cd7df5 100644 --- a/WHATS_NEW.md +++ b/WHATS_NEW.md @@ -1,5 +1,29 @@ # What's New — AutoControl +## What's new (2026-06-24) — Keyboard Focus Order (Tab sequence / WCAG audit / set-focus) + +Reason about keyboard navigation: the Tab order, a WCAG focus-order audit, and set-focus. Full reference: [`docs/source/Eng/doc/new_features/v184_features_doc.rst`](docs/source/Eng/doc/new_features/v184_features_doc.rst). + +- **`is_interactive_role` / `tab_order` / `audit_focus_order` / `focus_control`** (`AC_tab_order`, `AC_audit_focus_order`, `AC_focus_control`): nothing reasoned about *keyboard* navigation — only mouse coordinates and element values. This adds the keyboard layer: `tab_order` returns the focusable elements in the order Tab visits them (reading order), `audit_focus_order` is a WCAG 2.4.x report (the sequence + flagged problems like a focusable element with no visible area), and `focus_control` sets keyboard focus via UIA `SetFocus`. The first three are pure functions over `AccessibilityElement` lists — `tab_order` reuses `element_parse.reading_order` and `is_interactive_role` reuses `ax_tree_walk.humanize_role`, so no logic is duplicated; `focus_control` dispatches through the injectable backend seam (real `SetFocus` in the Windows backend). No `PySide6`. + +## What's new (2026-06-24) — Readable, Addressable Accessibility Tree (role names + node paths) + +Turn a raw `ControlType_50000` tree dump into readable roles with a stable path per node. Full reference: [`docs/source/Eng/doc/new_features/v183_features_doc.rst`](docs/source/Eng/doc/new_features/v183_features_doc.rst). 
+ +- **`control_type_name` / `humanize_role` / `humanize_tree` / `assign_node_paths` / `find_by_path`** (`AC_walk_tree`, `AC_humanize_role`): `dump_accessibility_tree` emits the platform's raw role (on Windows the bare UIA ControlType id, e.g. `ControlType_50000` for a button) and carries no stable per-node identity once serialised. This adds the pure post-processing it lacks: translate ControlType ids to friendly names, deep-copy a tree with every role humanised, stamp each node with a stable positional `path` (`"0.2.1"` — a pure stand-in for RuntimeId), and resolve a node back by path. `AC_walk_tree` is the readable counterpart to `AC_a11y_dump`. Pure-stdlib over `AXTreeNode`; unknown / non-UIA roles pass through unchanged. No `PySide6`. + +## What's new (2026-06-24) — Native Text Reading via the UIA TextPattern (document / selection / visible) + +Read the text in multiline editors and document controls where ValuePattern returns nothing. Full reference: [`docs/source/Eng/doc/new_features/v182_features_doc.rst`](docs/source/Eng/doc/new_features/v182_features_doc.rst). + +- **`get_control_text` / `get_selected_text` / `get_visible_text`** (`AC_get_control_text`, `AC_get_selected_text`, `AC_get_visible_text`): `control_get_value` reads through UIA ValuePattern, which returns an empty string on multiline edits, RichEdit / document controls and web text areas — exactly the controls whose text you most want. This reads through `TextPattern` instead: `get_control_text` returns the whole `DocumentRange`, `get_selected_text` the current `GetSelection`, `get_visible_text` only the on-screen `GetVisibleRanges`. Dispatched through the injectable `accessibility.backends.get_backend()` seam (headless-testable via a fake backend; real UIA calls in the Windows backend), returning `{text}` from the executor/MCP. No `PySide6`. 
+ +## What's new (2026-06-24) — Extended UIA Control Patterns (Expand / Select / Range / Scroll) + +Drive tree nodes, list/combo items, sliders and scroll natively, not by pixel guessing. Full reference: [`docs/source/Eng/doc/new_features/v181_features_doc.rst`](docs/source/Eng/doc/new_features/v181_features_doc.rst). + +- **`expand_control` / `collapse_control` / `control_expand_state` / `select_control_item` / `control_range` / `set_control_range` / `scroll_control_into_view`** (`AC_expand_control`, `AC_select_control_item`, `AC_set_control_range`, …): the accessibility backend had only Value/Invoke/Toggle/Grid-read patterns, so treeviews, listboxes/combos, sliders and off-screen rows had no native call path. This adds ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns on top of the existing backend ABC, dispatched through the injectable `accessibility.backends.get_backend()` seam (headless-testable via a fake backend; real UIA calls in the Windows backend). No `PySide6`. + ## What's new (2026-06-24) — Pre-Match Settle Gating + Match Persistence Avoid matching mid-animation, and confirm a hit holds steady across frames. Full reference: [`docs/source/Eng/doc/new_features/v180_features_doc.rst`](docs/source/Eng/doc/new_features/v180_features_doc.rst). diff --git a/docs/source/Eng/doc/new_features/v181_features_doc.rst b/docs/source/Eng/doc/new_features/v181_features_doc.rst new file mode 100644 index 00000000..77c32734 --- /dev/null +++ b/docs/source/Eng/doc/new_features/v181_features_doc.rst @@ -0,0 +1,46 @@ +Extended UIA Control Patterns (Expand / Select / Range / Scroll) +=============================================================== + +The accessibility backend shipped only four control patterns — Value, Invoke, Toggle and a +read-only Grid dump. 
That left the controls automation hits most often undrivable by their +*native* pattern: a treeview node could not be expanded (ExpandCollapsePattern), a listbox / combobox item could not be +selected (SelectionItemPattern), a slider could not be set (RangeValuePattern), and a control +could not be scrolled into view (ScrollItemPattern); those fell back to fragile pixel guessing. +``control_patterns`` adds those object-level actions on top of the existing accessibility +backend ABC. + +Each function is a thin dispatch onto the injectable ``accessibility.backends.get_backend()`` +seam (the same seam the rest of the accessibility module uses), so the headless core is +unit-testable on any platform by injecting a fake backend; the real UI Automation calls live in +the Windows backend (ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns). +Backends that don't implement a pattern raise ``AccessibilityNotAvailableError``. Imports no +``PySide6``. + +Headless API +------------ + +.. code-block:: python + + from je_auto_control import (expand_control, collapse_control, + control_expand_state, select_control_item, + control_range, set_control_range, + scroll_control_into_view) + + expand_control(name="Documents", role="treeitem") # open a tree node + select_control_item(name="Option B") # pick a list/combo item + set_control_range(75, name="Volume") # set a slider + print(control_range(name="Volume")) # {"value": 75.0, "minimum": 0, "maximum": 100} + scroll_control_into_view(name="Row 200") # bring a row on-screen + +All of these functions locate the control by ``name`` / ``role`` / ``app_name`` / ``automation_id`` (same as the +existing ``control_invoke`` / ``control_toggle``). The expand/select/scroll/set actions return +``bool``; ``control_expand_state`` returns ``expanded`` / ``collapsed`` / ``partial`` / ``leaf`` +(or ``None``); ``control_range`` returns ``{value, minimum, maximum}`` (or ``None``). 
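Reviewer sketch (not part of the patch): the v181 text above says each function is a thin dispatch onto an injectable backend seam, testable headlessly with a fake backend. The block below is a minimal, self-contained illustration of that pattern only. It does not import `je_auto_control`; `BaseBackend`, `FakeBackend`, `set_backend`, `expand_control_sketch` and `control_range_sketch` are hypothetical local stand-ins, and how the real library swaps its backend is not shown in this diff.

```python
from typing import Dict, Optional


class AccessibilityNotAvailableError(RuntimeError):
    """Local stand-in for the library error of the same name."""


class BaseBackend:
    """Hypothetical mirror of the backend ABC: unsupported actions raise."""

    def expand(self, name: Optional[str] = None) -> bool:
        raise AccessibilityNotAvailableError("expand is not supported")

    def get_range(self, name: Optional[str] = None) -> Optional[Dict[str, float]]:
        raise AccessibilityNotAvailableError("get_range is not supported")


class FakeBackend(BaseBackend):
    """In-memory backend for headless tests; it never touches a real UI."""

    def __init__(self) -> None:
        self.expanded = set()
        self.ranges = {
            "Volume": {"value": 75.0, "minimum": 0.0, "maximum": 100.0},
        }

    def expand(self, name: Optional[str] = None) -> bool:
        self.expanded.add(name)  # record the call so a test can inspect it
        return True

    def get_range(self, name: Optional[str] = None) -> Optional[Dict[str, float]]:
        return self.ranges.get(name)  # None when the control is unknown


_backend: BaseBackend = BaseBackend()


def get_backend() -> BaseBackend:
    """Return the currently injected backend (the seam)."""
    return _backend


def set_backend(backend: BaseBackend) -> None:
    """Hypothetical injection hook; an assumption of this sketch."""
    global _backend
    _backend = backend


def expand_control_sketch(name: Optional[str] = None) -> bool:
    """Thin dispatch: all real work happens in the backend."""
    return get_backend().expand(name=name)


def control_range_sketch(name: Optional[str] = None) -> Optional[Dict[str, float]]:
    """Thin dispatch for the range read."""
    return get_backend().get_range(name=name)
```

With a `FakeBackend` injected the two functions return canned values on any platform; with the base class in place they raise `AccessibilityNotAvailableError`, which matches the behaviour the doc describes for backends that do not implement a pattern.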
+ +Executor commands +----------------- + +``AC_expand_control`` / ``AC_collapse_control`` / ``AC_control_expand_state`` / +``AC_select_control_item`` / ``AC_control_range`` / ``AC_set_control_range`` / +``AC_scroll_control_into_view``. They are exposed as the matching ``ac_*`` MCP tools (the action +ones destructive, the reads read-only) and as Script Builder commands under **Native UI**. diff --git a/docs/source/Eng/doc/new_features/v182_features_doc.rst b/docs/source/Eng/doc/new_features/v182_features_doc.rst new file mode 100644 index 00000000..fc15006c --- /dev/null +++ b/docs/source/Eng/doc/new_features/v182_features_doc.rst @@ -0,0 +1,46 @@ +Native Text Reading via the UIA TextPattern (document / selection / visible) +============================================================================ + +``control_get_value`` reads a control through UIA ValuePattern, but ValuePattern +returns an **empty string** on multiline edits, RichEdit / document controls and +web text areas — exactly the controls whose text you most want to read. UIA +exposes that text through a different pattern, ``TextPattern``, which models the +control's content as text ranges. ``ax_text`` adds three reads on top of the +existing accessibility backend ABC: + +* :func:`get_control_text` — the whole document's text (``DocumentRange``), +* :func:`get_selected_text` — the currently selected text (``GetSelection``), +* :func:`get_visible_text` — only the on-screen text (``GetVisibleRanges``). + +Each function is a thin dispatch onto the injectable +``accessibility.backends.get_backend()`` seam (the same seam the rest of the +accessibility module uses), so the headless core is unit-testable on any +platform by injecting a fake backend; the real UI Automation calls live in the +Windows backend. Backends that don't implement TextPattern raise +``AccessibilityNotAvailableError``. Imports no ``PySide6``. + +Headless API +------------ + +.. 
code-block:: python + + from je_auto_control import (get_control_text, get_selected_text, + get_visible_text) + + # A multiline editor where control_get_value returns "" : + text = get_control_text(name="Editor", role="document") + selection = get_selected_text(name="Editor") # "" when nothing selected + on_screen = get_visible_text(name="Editor") # skips scrolled-off lines + +All locate the control by ``name`` / ``role`` / ``app_name`` / ``automation_id`` +(same as ``control_get_value`` / ``control_invoke``). Each returns the text as a +``str``, or ``None`` when the control is not found or exposes no TextPattern; +``get_selected_text`` returns ``""`` when the control is found but has no +selection. + +Executor commands +----------------- + +``AC_get_control_text`` / ``AC_get_selected_text`` / ``AC_get_visible_text`` each +return ``{"text": ...}``. They are exposed as the matching read-only ``ac_*`` MCP +tools and as Script Builder commands under **Native UI**. diff --git a/docs/source/Eng/doc/new_features/v183_features_doc.rst b/docs/source/Eng/doc/new_features/v183_features_doc.rst new file mode 100644 index 00000000..ec8682ec --- /dev/null +++ b/docs/source/Eng/doc/new_features/v183_features_doc.rst @@ -0,0 +1,47 @@ +Readable, Addressable Accessibility Tree (role names + node paths) +================================================================== + +``dump_accessibility_tree`` emits nodes with the platform's *raw* role — on +Windows that is the bare UI Automation ControlType id, e.g. ``"ControlType_50000"`` +for a button. That is unreadable, and a serialised dump carries no stable +per-node identity (UIA RuntimeId needs the live element, which the dump has +thrown away). 
``ax_tree_walk`` adds the pure, platform-agnostic post-processing +the dump lacks, composable on top of any ``dump_accessibility_tree`` output: + +* :func:`control_type_name` / :func:`humanize_role` — translate a ControlType id + (or ``"ControlType_NNNNN"`` / ``"NNNNN"`` string) to a friendly name, +* :func:`humanize_tree` — a deep copy of the tree with every role humanised, +* :func:`assign_node_paths` — a deep copy stamping each node with a stable + positional ``path`` (``"0.2.1"``) — a pure stand-in for RuntimeId identity, +* :func:`find_by_path` — resolve a node back from its path. + +Pure-stdlib over ``AXTreeNode`` values; no device or backend access. Imports no +``PySide6``. + +Headless API +------------ + +.. code-block:: python + + from je_auto_control import (dump_accessibility_tree, humanize_tree, + assign_node_paths, find_by_path, humanize_role) + + humanize_role("ControlType_50000") # "Button" + humanize_role(50004) # "Edit" + + tree = assign_node_paths(humanize_tree(dump_accessibility_tree())) + # every node now has a readable role and tree["attributes"]["path"] + node = find_by_path(tree, "0.0.1") # re-resolve a node by its path + +Unknown ids and non-UIA roles (``"AXApplication"``) pass through unchanged, so +nothing is lost. The path is stable for a given tree shape, giving scripts / +agents a deterministic handle to a node across a dump → act round-trip. + +Executor commands +----------------- + +``AC_walk_tree`` (``app_name`` / ``max_results``) returns the humanised, +path-stamped tree as a nested dict — the readable counterpart to +``AC_a11y_dump``. ``AC_humanize_role`` (``role``) returns ``{"role": ...}``. +Both are exposed as read-only ``ac_*`` MCP tools and as Script Builder commands +under **Native UI**. 
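Reviewer sketch (not part of the patch): the v183 doc above describes translating raw ControlType ids, stamping each node with a positional path, and resolving a node back by path. The block below is one way that post-processing could look. It assumes nodes are plain dicts with `role`, `children` and `attributes` keys and that the root path is `"0"`; both are guesses from the doc example, not the library's confirmed `AXTreeNode` layout. The `*_sketch` names and the five-entry id table are illustrative only.

```python
import copy
from typing import Any, Dict, Optional

# A handful of real UIA ControlType ids; the library's full table is not shown here.
CONTROL_TYPE_NAMES = {
    50000: "Button",
    50002: "CheckBox",
    50003: "ComboBox",
    50004: "Edit",
    50032: "Window",
}


def humanize_role_sketch(role: Any) -> Any:
    """Map 50000 / "50000" / "ControlType_50000" to a name; pass others through."""
    text = str(role)
    if text.startswith("ControlType_"):
        text = text[len("ControlType_"):]
    if text.isdigit():
        return CONTROL_TYPE_NAMES.get(int(text), role)
    return role


def _stamp(node: Dict[str, Any], path: str) -> None:
    """Humanise the role and record the positional path, depth first."""
    node["role"] = humanize_role_sketch(node.get("role"))
    node.setdefault("attributes", {})["path"] = path
    for index, child in enumerate(node.get("children", [])):
        _stamp(child, f"{path}.{index}")


def assign_paths_sketch(tree: Dict[str, Any]) -> Dict[str, Any]:
    """Return a deep copy with readable roles and a path on every node."""
    stamped = copy.deepcopy(tree)  # the caller's tree is left untouched
    _stamp(stamped, "0")           # root path "0" is an assumption
    return stamped


def find_by_path_sketch(tree: Dict[str, Any], path: str) -> Optional[Dict[str, Any]]:
    """Walk child indexes from the root; None if the path does not resolve."""
    parts = path.split(".")
    if parts[0] != "0":
        return None
    node = tree
    for part in parts[1:]:
        children = node.get("children", [])
        if not part.isdigit() or int(part) >= len(children):
            return None
        node = children[int(part)]
    return node


raw = {"role": "ControlType_50032", "name": "Login", "children": [
    {"role": "ControlType_50004", "name": "Username"},
    {"role": "ControlType_50000", "name": "OK"},
]}
tree = assign_paths_sketch(raw)
ok_button = find_by_path_sketch(tree, "0.1")
print(ok_button["role"], ok_button["attributes"]["path"])
```

Because the path encodes only child positions, it stays valid for as long as the tree shape does; a node inserted or removed ahead of the target shifts its index, which is the trade-off of using position in place of a live RuntimeId.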
diff --git a/docs/source/Eng/doc/new_features/v184_features_doc.rst b/docs/source/Eng/doc/new_features/v184_features_doc.rst new file mode 100644 index 00000000..eca801dc --- /dev/null +++ b/docs/source/Eng/doc/new_features/v184_features_doc.rst @@ -0,0 +1,50 @@ +Keyboard Focus Order (Tab sequence / WCAG audit / set-focus) +============================================================ + +Nothing in the toolkit reasoned about *keyboard* navigation — only mouse +coordinates and element values. ``focus_order`` adds the keyboard layer: + +* :func:`is_interactive_role` — is a role one that normally takes keyboard focus, +* :func:`tab_order` — the focusable elements in the order ``Tab`` will visit them + (their reading order: top-to-bottom, left-to-right), +* :func:`audit_focus_order` — a WCAG 2.4.x focus-order report over a flat element + list (the sequence plus flagged problems, e.g. a focusable element with no + visible area — focus would land somewhere unseen), +* :func:`focus_control` — set the keyboard focus on a control (UIA ``SetFocus``). + +The first three are pure functions over ``AccessibilityElement`` lists: +``tab_order`` reuses ``element_parse.reading_order`` for row banding and +``is_interactive_role`` reuses ``ax_tree_walk.humanize_role``, so no logic is +duplicated. ``focus_control`` is a thin dispatch onto the injectable +``accessibility.backends.get_backend()`` seam; the real ``SetFocus`` lives in the +Windows backend. Imports no ``PySide6``. + +Headless API +------------ + +.. 
code-block:: python + + from je_auto_control import (list_accessibility_elements, tab_order, + audit_focus_order, focus_control) + + elements = list_accessibility_elements(app_name="myapp.exe") + for el in tab_order(elements): # the Tab visiting order + print(el.name, el.role) + + report = audit_focus_order(elements) + # {"order": [...], "issues": [...], "focusable_count": N, "issue_count": M} + + focus_control(name="Username", role="edit") # put the cursor in the field + +Focusability is role-based (the interactive roles: Button, Edit, CheckBox, +ComboBox, RadioButton, Hyperlink, ListItem, MenuItem, Slider, Tab/TabItem, +TreeItem, …). ``focus_control`` locates by ``name`` / ``role`` / ``app_name`` / +``automation_id`` like the other native-control actions and returns ``bool``. + +Executor commands +----------------- + +``AC_tab_order`` / ``AC_audit_focus_order`` (``app_name`` / ``max_results``) list +and audit the live app; ``AC_focus_control`` sets focus. They are exposed as the +matching ``ac_*`` MCP tools (the two reads read-only, ``ac_focus_control`` +destructive) and as Script Builder commands under **Native UI**. diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst index e2c5eed5..10f84a13 100644 --- a/docs/source/Eng/eng_index.rst +++ b/docs/source/Eng/eng_index.rst @@ -203,6 +203,7 @@ Comprehensive guides for all AutoControl features. 
doc/new_features/v178_features_doc doc/new_features/v179_features_doc doc/new_features/v180_features_doc + doc/new_features/v181_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/docs/source/Zh/doc/new_features/v181_features_doc.rst b/docs/source/Zh/doc/new_features/v181_features_doc.rst new file mode 100644 index 00000000..a2b1f7d8 --- /dev/null +++ b/docs/source/Zh/doc/new_features/v181_features_doc.rst @@ -0,0 +1,42 @@ +擴充 UIA 控制模式(展開 / 選取 / 範圍 / 捲動) +=============================================== + +無障礙後端原本只提供四種控制模式——Value、Invoke、Toggle 與唯讀的 Grid dump。這使得自動化 +最常遇到的控制項無法以其*原生*模式驅動:樹節點無法展開、清單 / 下拉項目無法選取 +(SelectionItemPattern)、滑桿無法設定(RangeValuePattern)、控制項無法捲入視野 +(ScrollItemPattern)——這些只能退回脆弱的像素猜測。``control_patterns`` 在既有的無障礙後端 +ABC 之上補上這些物件層級動作。 + +每個函式都是對可注入的 ``accessibility.backends.get_backend()`` 接縫的薄分派(與無障礙模組 +其餘部分相同的接縫),因此無頭核心可在任何平台透過注入 fake backend 單元測試;真正的 +UI Automation 呼叫位於 Windows 後端(ExpandCollapse / SelectionItem / RangeValue / ScrollItem +模式)。未實作某模式的後端會拋出 ``AccessibilityNotAvailableError``。不匯入 ``PySide6``。 + +無頭 API +-------- + +.. 
code-block:: python + + from je_auto_control import (expand_control, collapse_control, + control_expand_state, select_control_item, + control_range, set_control_range, + scroll_control_into_view) + + expand_control(name="Documents", role="treeitem") # 展開樹節點 + select_control_item(name="Option B") # 選取清單/下拉項目 + set_control_range(75, name="Volume") # 設定滑桿 + print(control_range(name="Volume")) # {"value": 75.0, "minimum": 0, "maximum": 100} + scroll_control_into_view(name="Row 200") # 把某列帶上螢幕 + +全部以 ``name`` / ``role`` / ``app_name`` / ``automation_id`` 定位控制項(與既有 +``control_invoke`` / ``control_toggle`` 相同)。展開/選取/捲動/設定動作回傳 ``bool``; +``control_expand_state`` 回傳 ``expanded`` / ``collapsed`` / ``partial`` / ``leaf``(或 +``None``);``control_range`` 回傳 ``{value, minimum, maximum}``(或 ``None``)。 + +執行器指令 +---------- + +``AC_expand_control`` / ``AC_collapse_control`` / ``AC_control_expand_state`` / +``AC_select_control_item`` / ``AC_control_range`` / ``AC_set_control_range`` / +``AC_scroll_control_into_view``。皆以對應的 ``ac_*`` MCP 工具(動作類為破壞性、讀取類為唯讀) +及 Script Builder 指令(位於 **Native UI** 分類下)形式提供。 diff --git a/docs/source/Zh/doc/new_features/v182_features_doc.rst b/docs/source/Zh/doc/new_features/v182_features_doc.rst new file mode 100644 index 00000000..518edaa0 --- /dev/null +++ b/docs/source/Zh/doc/new_features/v182_features_doc.rst @@ -0,0 +1,41 @@ +透過 UIA TextPattern 讀取原生文字(文件 / 選取 / 可見) +======================================================= + +``control_get_value`` 透過 UIA ValuePattern 讀取控制項,但 ValuePattern 在多行編輯框、 +RichEdit / 文件控制項與網頁文字區塊上會回傳**空字串**——而這些正是你最想讀取其文字的控制項。 +UIA 透過另一個模式 ``TextPattern`` 提供這些文字,它把控制項內容建模為文字範圍(text range)。 +``ax_text`` 在既有的無障礙後端 ABC 之上補上三種讀取: + +* :func:`get_control_text` ——整份文件的文字(``DocumentRange``), +* :func:`get_selected_text` ——目前選取的文字(``GetSelection``), +* :func:`get_visible_text` ——僅螢幕上可見的文字(``GetVisibleRanges``)。 + +每個函式都是對可注入的 ``accessibility.backends.get_backend()`` 接縫的薄分派(與無障礙模組 +其餘部分相同的接縫),因此無頭核心可在任何平台透過注入 fake backend 單元測試;真正的 +UI 
Automation 呼叫位於 Windows 後端。未實作 TextPattern 的後端會拋出 +``AccessibilityNotAvailableError``。不匯入 ``PySide6``。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import (get_control_text, get_selected_text, + get_visible_text) + + # 一個 control_get_value 會回傳 "" 的多行編輯框: + text = get_control_text(name="Editor", role="document") + selection = get_selected_text(name="Editor") # 沒有選取時回傳 "" + on_screen = get_visible_text(name="Editor") # 略過捲動到畫面外的列 + +全部以 ``name`` / ``role`` / ``app_name`` / ``automation_id`` 定位控制項(與 +``control_get_value`` / ``control_invoke`` 相同)。各函式以 ``str`` 回傳文字,找不到控制項或 +控制項未提供 TextPattern 時回傳 ``None``;``get_selected_text`` 在找到控制項但沒有選取時 +回傳 ``""``。 + +執行器指令 +---------- + +``AC_get_control_text`` / ``AC_get_selected_text`` / ``AC_get_visible_text`` 各自回傳 +``{"text": ...}``。皆以對應的唯讀 ``ac_*`` MCP 工具及 Script Builder 指令(位於 **Native UI** +分類下)形式提供。 diff --git a/docs/source/Zh/doc/new_features/v183_features_doc.rst b/docs/source/Zh/doc/new_features/v183_features_doc.rst new file mode 100644 index 00000000..b69c663b --- /dev/null +++ b/docs/source/Zh/doc/new_features/v183_features_doc.rst @@ -0,0 +1,43 @@ +可讀且可定址的無障礙樹(角色名稱 + 節點路徑) +============================================= + +``dump_accessibility_tree`` 輸出的節點帶有平台的*原始*角色——在 Windows 上就是裸的 +UI Automation ControlType id,例如按鈕是 ``"ControlType_50000"``。這既難以閱讀,且序列化後的 +dump 不帶任何穩定的逐節點身分(UIA RuntimeId 需要存活的元素,而 dump 已將其丟棄)。 +``ax_tree_walk`` 補上 dump 所缺、純粹且跨平台的後處理,可疊加在任何 +``dump_accessibility_tree`` 輸出之上: + +* :func:`control_type_name` / :func:`humanize_role` ——把 ControlType id(或 + ``"ControlType_NNNNN"`` / ``"NNNNN"`` 字串)轉成友善名稱, +* :func:`humanize_tree` ——回傳一份每個角色都已人性化的樹深拷貝, +* :func:`assign_node_paths` ——回傳一份深拷貝,為每個節點蓋上穩定的位置 ``path`` + (``"0.2.1"``)——作為 RuntimeId 身分的純粹替代, +* :func:`find_by_path` ——由 path 反解回節點。 + +純標準庫,針對 ``AXTreeNode`` 值運算;不存取裝置或後端。不匯入 ``PySide6``。 + +無頭 API +-------- + +.. 
code-block:: python + + from je_auto_control import (dump_accessibility_tree, humanize_tree, + assign_node_paths, find_by_path, humanize_role) + + humanize_role("ControlType_50000") # "Button" + humanize_role(50004) # "Edit" + + tree = assign_node_paths(humanize_tree(dump_accessibility_tree())) + # 每個節點現在都有可讀角色與 tree["attributes"]["path"] + node = find_by_path(tree, "0.0.1") # 由 path 重新解析節點 + +未知 id 與非 UIA 角色(``"AXApplication"``)原樣通過,故不會遺失任何資訊。path 對於給定的 +樹形狀是穩定的,讓腳本 / agent 在 dump → 操作的往返中對某節點有確定性的把手。 + +執行器指令 +---------- + +``AC_walk_tree``(``app_name`` / ``max_results``)以巢狀 dict 回傳已人性化、已蓋上 path 的樹 +——即 ``AC_a11y_dump`` 的可讀對應版本。``AC_humanize_role``(``role``)回傳 +``{"role": ...}``。兩者皆以唯讀 ``ac_*`` MCP 工具及 Script Builder 指令(位於 **Native UI** +分類下)形式提供。 diff --git a/docs/source/Zh/doc/new_features/v184_features_doc.rst b/docs/source/Zh/doc/new_features/v184_features_doc.rst new file mode 100644 index 00000000..79cf023a --- /dev/null +++ b/docs/source/Zh/doc/new_features/v184_features_doc.rst @@ -0,0 +1,44 @@ +鍵盤焦點順序(Tab 序列 / WCAG 稽核 / 設定焦點) +============================================== + +工具組原本不對*鍵盤*導覽做任何推理——只有滑鼠座標與元素值。``focus_order`` 補上鍵盤這一層: + +* :func:`is_interactive_role` ——某角色是否通常會接受鍵盤焦點, +* :func:`tab_order` ——可聚焦元素依 ``Tab`` 鍵造訪的順序(即其閱讀順序:由上到下、由左到右), +* :func:`audit_focus_order` ——針對扁平元素清單的 WCAG 2.4.x 焦點順序報告(序列加上被標記的 + 問題,例如某可聚焦元素沒有可見面積——焦點會落在看不見的地方), +* :func:`focus_control` ——將鍵盤焦點設到某控制項上(UIA ``SetFocus``)。 + +前三者為針對 ``AccessibilityElement`` 清單的純函式:``tab_order`` 重用 +``element_parse.reading_order`` 做列分群,``is_interactive_role`` 重用 +``ax_tree_walk.humanize_role``,故無重複邏輯。``focus_control`` 是對可注入的 +``accessibility.backends.get_backend()`` 接縫的薄分派;真正的 ``SetFocus`` 位於 Windows 後端。 +不匯入 ``PySide6``。 + +無頭 API +-------- + +.. 
code-block:: python + + from je_auto_control import (list_accessibility_elements, tab_order, + audit_focus_order, focus_control) + + elements = list_accessibility_elements(app_name="myapp.exe") + for el in tab_order(elements): # Tab 造訪順序 + print(el.name, el.role) + + report = audit_focus_order(elements) + # {"order": [...], "issues": [...], "focusable_count": N, "issue_count": M} + + focus_control(name="Username", role="edit") # 把游標放進該欄位 + +可聚焦性以角色判定(互動角色:Button、Edit、CheckBox、ComboBox、RadioButton、Hyperlink、 +ListItem、MenuItem、Slider、Tab/TabItem、TreeItem……)。``focus_control`` 與其他原生控制 +動作一樣以 ``name`` / ``role`` / ``app_name`` / ``automation_id`` 定位,回傳 ``bool``。 + +執行器指令 +---------- + +``AC_tab_order`` / ``AC_audit_focus_order``(``app_name`` / ``max_results``)列出並稽核存活的 +應用程式;``AC_focus_control`` 設定焦點。三者皆以對應的 ``ac_*`` MCP 工具(兩個讀取為唯讀、 +``ac_focus_control`` 為破壞性)及 Script Builder 指令(位於 **Native UI** 分類下)形式提供。 diff --git a/docs/source/Zh/zh_index.rst b/docs/source/Zh/zh_index.rst index 261eedf0..3c90f5b6 100644 --- a/docs/source/Zh/zh_index.rst +++ b/docs/source/Zh/zh_index.rst @@ -203,6 +203,7 @@ AutoControl 所有功能的完整使用指南。 doc/new_features/v178_features_doc doc/new_features/v179_features_doc doc/new_features/v180_features_doc + doc/new_features/v181_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py index 5a741775..d912ba96 100644 --- a/je_auto_control/__init__.py +++ b/je_auto_control/__init__.py @@ -48,6 +48,24 @@ find_accessibility_element, list_accessibility_elements, read_control_table, ) +# Extended UIA control patterns (Expand / Select / Range / Scroll) +from je_auto_control.utils.control_patterns import ( + collapse_control, control_expand_state, control_range, expand_control, + scroll_control_into_view, select_control_item, set_control_range, +) +# Native text reads via UIA TextPattern (document / selection / visible) +from 
je_auto_control.utils.ax_text import ( + get_control_text, get_selected_text, get_visible_text, +) +# Readable / addressable a11y-tree post-processing (role names + node paths) +from je_auto_control.utils.ax_tree_walk import ( + assign_node_paths, control_type_name, find_by_path, humanize_role, + humanize_tree, +) +# Keyboard focus order (tab sequence / WCAG audit / set-focus) +from je_auto_control.utils.focus_order import ( + audit_focus_order, focus_control, is_interactive_role, tab_order, +) # VLM element locator (headless) from je_auto_control.utils.vision import ( VLMNotAvailableError, click_by_description, locate_by_description, @@ -1609,6 +1627,13 @@ def start_autocontrol_gui(*args, **kwargs): "find_accessibility_element", "list_accessibility_elements", "control_get_value", "control_set_value", "control_invoke", "control_toggle", "read_control_table", + "expand_control", "collapse_control", "control_expand_state", + "select_control_item", "control_range", "set_control_range", + "scroll_control_into_view", + "get_control_text", "get_selected_text", "get_visible_text", + "control_type_name", "humanize_role", "humanize_tree", + "assign_node_paths", "find_by_path", + "is_interactive_role", "tab_order", "audit_focus_order", "focus_control", # VLM locator "VLMNotAvailableError", "locate_by_description", "click_by_description", "verify_description", diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py index c46132e9..4b1b2903 100644 --- a/je_auto_control/gui/script_builder/command_schema.py +++ b/je_auto_control/gui/script_builder/command_schema.py @@ -1488,6 +1488,86 @@ def _add_native_control_specs(specs: List[CommandSpec]) -> None: fields=fields, description="Read a grid/table/list control as rows of cell strings.", )) + specs.append(CommandSpec( + "AC_expand_control", "Native UI", "Expand Control", + fields=fields, + description="Expand a tree node / combobox (ExpandCollapsePattern).", + )) + 
specs.append(CommandSpec( + "AC_collapse_control", "Native UI", "Collapse Control", + fields=fields, + description="Collapse a tree node / combobox (ExpandCollapsePattern).", + )) + specs.append(CommandSpec( + "AC_control_expand_state", "Native UI", "Control Expand State", + fields=fields, + description="Read expanded/collapsed/partial/leaf state of a control.", + )) + specs.append(CommandSpec( + "AC_select_control_item", "Native UI", "Select Control Item", + fields=fields, + description="Select a list / tree / tab item (SelectionItemPattern).", + )) + specs.append(CommandSpec( + "AC_control_range", "Native UI", "Get Control Range", + fields=fields, + description="Read a slider / progress range (RangeValuePattern).", + )) + specs.append(CommandSpec( + "AC_set_control_range", "Native UI", "Set Control Range", + fields=(FieldSpec("value", FieldType.FLOAT),) + fields, + description="Set a slider / progress / spinner value (RangeValuePattern).", + )) + specs.append(CommandSpec( + "AC_scroll_control_into_view", "Native UI", "Scroll Control Into View", + fields=fields, + description="Scroll a control into view (ScrollItemPattern).", + )) + specs.append(CommandSpec( + "AC_get_control_text", "Native UI", "Get Control Text", + fields=fields, + description="Read full text via TextPattern (multiline / document safe).", + )) + specs.append(CommandSpec( + "AC_get_selected_text", "Native UI", "Get Selected Text", + fields=fields, + description="Read the currently selected text via TextPattern.", + )) + specs.append(CommandSpec( + "AC_get_visible_text", "Native UI", "Get Visible Text", + fields=fields, + description="Read only the on-screen text via TextPattern.GetVisibleRanges.", + )) + specs.append(CommandSpec( + "AC_walk_tree", "Native UI", "Walk Accessibility Tree", + fields=(FieldSpec("app_name", FieldType.STRING, optional=True), + FieldSpec("max_results", FieldType.INT, optional=True, + default=500)), + description="Dump the a11y tree with friendly roles + a path per 
node.", + )) + specs.append(CommandSpec( + "AC_humanize_role", "Native UI", "Humanize UIA Role", + fields=(FieldSpec("role", FieldType.STRING),), + description="Translate a raw UIA role (ControlType_50000) to a name.", + )) + tree_fields = (FieldSpec("app_name", FieldType.STRING, optional=True), + FieldSpec("max_results", FieldType.INT, optional=True, + default=500)) + specs.append(CommandSpec( + "AC_tab_order", "Native UI", "Keyboard Tab Order", + fields=tree_fields, + description="List focusable controls in keyboard Tab (reading) order.", + )) + specs.append(CommandSpec( + "AC_audit_focus_order", "Native UI", "Audit Focus Order (WCAG)", + fields=tree_fields, + description="WCAG 2.4.x focus-order audit: tab sequence + flagged issues.", + )) + specs.append(CommandSpec( + "AC_focus_control", "Native UI", "Set Keyboard Focus", + fields=fields, + description="Set keyboard focus on a control natively (UIA SetFocus).", + )) def _add_misc_specs(specs: List[CommandSpec]) -> None: diff --git a/je_auto_control/utils/accessibility/backends/base.py b/je_auto_control/utils/accessibility/backends/base.py index 2ca8ef7e..82d5798c 100644 --- a/je_auto_control/utils/accessibility/backends/base.py +++ b/je_auto_control/utils/accessibility/backends/base.py @@ -1,5 +1,5 @@ """Abstract accessibility backend.""" -from typing import List, Optional +from typing import Any, Dict, List, Optional from je_auto_control.utils.accessibility.element import ( AccessibilityElement, AccessibilityNotAvailableError, @@ -56,6 +56,81 @@ def read_table(self, name: Optional[str] = None, role: Optional[str] = None, """Read a grid/table/list control as rows of cell strings.""" self._unsupported("read_table") + # --- extended control patterns (Expand / Selection / Range / Scroll) ---- + + def expand(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Expand the matched control (ExpandCollapsePattern); True on 
success.""" + self._unsupported("expand") + + def collapse(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Collapse the matched control (ExpandCollapsePattern); True on success.""" + self._unsupported("collapse") + + def expand_state(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return ``expanded`` / ``collapsed`` / ``partial`` / ``leaf``, or None.""" + self._unsupported("expand_state") + + def select_item(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Select the matched item (SelectionItemPattern); True on success.""" + self._unsupported("select_item") + + def get_range(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[Dict[str, Any]]: + """Return ``{value, minimum, maximum}`` (RangeValuePattern), or None.""" + self._unsupported("get_range") + + def set_range_value(self, value: float, name: Optional[str] = None, + role: Optional[str] = None, app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Set a slider / progress value (RangeValuePattern); True on success.""" + self._unsupported("set_range_value") + + def scroll_into_view(self, name: Optional[str] = None, + role: Optional[str] = None, app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Scroll the matched control into view (ScrollItemPattern); True on success.""" + self._unsupported("scroll_into_view") + + # --- text patterns (TextPattern reads) --------------------------------- + + def document_text(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> 
Optional[str]: + """Return the matched control's full text (TextPattern), or None. + + Reads multiline / document controls where ValuePattern returns ``""``. + """ + self._unsupported("document_text") + + def selected_text(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return the control's currently selected text (TextPattern), or None.""" + self._unsupported("selected_text") + + def visible_text(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return only the on-screen text of the control (TextPattern), or None.""" + self._unsupported("visible_text") + + # --- keyboard focus ---------------------------------------------------- + + def set_focus(self, name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Set keyboard focus on the matched control (SetFocus); True on success.""" + self._unsupported("set_focus") + def _unsupported(self, operation: str): """Raise a clear error for an action this backend can't perform.""" raise AccessibilityNotAvailableError( diff --git a/je_auto_control/utils/accessibility/backends/windows_backend.py b/je_auto_control/utils/accessibility/backends/windows_backend.py index be8c3c41..cbfcba60 100644 --- a/je_auto_control/utils/accessibility/backends/windows_backend.py +++ b/je_auto_control/utils/accessibility/backends/windows_backend.py @@ -8,7 +8,7 @@ Only ``is_control_element=True`` nodes are surfaced to avoid millions of decorative text children. 
""" -from typing import List, Optional +from typing import Any, Dict, List, Optional from je_auto_control.utils.accessibility.backends.base import ( AccessibilityBackend, @@ -25,6 +25,12 @@ _UIA_INVOKE_PATTERN_ID = 10000 _UIA_TOGGLE_PATTERN_ID = 10015 _UIA_GRID_PATTERN_ID = 10006 +_UIA_EXPANDCOLLAPSE_PATTERN_ID = 10005 +_UIA_SELECTIONITEM_PATTERN_ID = 10010 +_UIA_RANGEVALUE_PATTERN_ID = 10003 +_UIA_SCROLLITEM_PATTERN_ID = 10017 +_UIA_TEXT_PATTERN_ID = 10014 +_EXPAND_STATES = {0: "collapsed", 1: "expanded", 2: "partial", 3: "leaf"} def _is_available() -> bool: @@ -192,6 +198,127 @@ def read_table(self, name=None, role=None, app_name=None, return [] return [self._read_row(pattern, r, cols) for r in range(rows)] + def _invoke_pattern_method(self, name, role, app_name, automation_id, + pattern_id, interface_name, action): + """Find a control, query a pattern, run ``action(pattern)`` → bool.""" + raw = self._find_raw(name, role, app_name, automation_id) + pattern = self._pattern(raw, pattern_id, interface_name) if raw else None + if pattern is None: + return False + try: + action(pattern) + return True + except (OSError, AttributeError): + return False + + def expand(self, name=None, role=None, app_name=None, automation_id=None): + return self._invoke_pattern_method( + name, role, app_name, automation_id, _UIA_EXPANDCOLLAPSE_PATTERN_ID, + "IUIAutomationExpandCollapsePattern", lambda p: p.Expand()) + + def collapse(self, name=None, role=None, app_name=None, automation_id=None): + return self._invoke_pattern_method( + name, role, app_name, automation_id, _UIA_EXPANDCOLLAPSE_PATTERN_ID, + "IUIAutomationExpandCollapsePattern", lambda p: p.Collapse()) + + def expand_state(self, name=None, role=None, app_name=None, + automation_id=None) -> Optional[str]: + raw = self._find_raw(name, role, app_name, automation_id) + pattern = self._pattern(raw, _UIA_EXPANDCOLLAPSE_PATTERN_ID, + "IUIAutomationExpandCollapsePattern") if raw else None + if pattern is None: + return None + try: + 
return _EXPAND_STATES.get(int(pattern.CurrentExpandCollapseState)) + except (OSError, AttributeError, ValueError, TypeError): + return None + + def select_item(self, name=None, role=None, app_name=None, automation_id=None): + return self._invoke_pattern_method( + name, role, app_name, automation_id, _UIA_SELECTIONITEM_PATTERN_ID, + "IUIAutomationSelectionItemPattern", lambda p: p.Select()) + + def set_range_value(self, value, name=None, role=None, app_name=None, + automation_id=None): + return self._invoke_pattern_method( + name, role, app_name, automation_id, _UIA_RANGEVALUE_PATTERN_ID, + "IUIAutomationRangeValuePattern", lambda p: p.SetValue(float(value))) + + def scroll_into_view(self, name=None, role=None, app_name=None, + automation_id=None): + return self._invoke_pattern_method( + name, role, app_name, automation_id, _UIA_SCROLLITEM_PATTERN_ID, + "IUIAutomationScrollItemPattern", lambda p: p.ScrollIntoView()) + + def get_range(self, name=None, role=None, app_name=None, + automation_id=None) -> Optional[Dict[str, Any]]: + raw = self._find_raw(name, role, app_name, automation_id) + pattern = self._pattern(raw, _UIA_RANGEVALUE_PATTERN_ID, + "IUIAutomationRangeValuePattern") if raw else None + if pattern is None: + return None + try: + return {"value": float(pattern.CurrentValue), + "minimum": float(pattern.CurrentMinimum), + "maximum": float(pattern.CurrentMaximum)} + except (OSError, AttributeError, ValueError, TypeError): + return None + + def _text_pattern(self, name, role, app_name, automation_id): + """Find a control and return its IUIAutomationTextPattern, or None.""" + raw = self._find_raw(name, role, app_name, automation_id) + if not raw: + return None + return self._pattern(raw, _UIA_TEXT_PATTERN_ID, + "IUIAutomationTextPattern") + + def document_text(self, name=None, role=None, app_name=None, + automation_id=None) -> Optional[str]: + pattern = self._text_pattern(name, role, app_name, automation_id) + if pattern is None: + return None + try: + return 
str(pattern.DocumentRange.GetText(-1) or "") + except (OSError, AttributeError): + return None + + def selected_text(self, name=None, role=None, app_name=None, + automation_id=None) -> Optional[str]: + pattern = self._text_pattern(name, role, app_name, automation_id) + if pattern is None: + return None + try: + selection = pattern.GetSelection() + if not selection or int(selection.Length or 0) == 0: + return "" + return str(selection.GetElement(0).GetText(-1) or "") + except (OSError, AttributeError): + return None + + def visible_text(self, name=None, role=None, app_name=None, + automation_id=None) -> Optional[str]: + pattern = self._text_pattern(name, role, app_name, automation_id) + if pattern is None: + return None + try: + ranges = pattern.GetVisibleRanges() + count = int(ranges.Length or 0) + return "".join(str(ranges.GetElement(i).GetText(-1) or "") + for i in range(count)) + except (OSError, AttributeError): + return None + + def set_focus(self, name=None, role=None, app_name=None, + automation_id=None) -> bool: + raw = self._find_raw(name, role, app_name, automation_id) + if not raw: + return False + try: + raw.SetFocus() + return True + except (OSError, AttributeError): + return False + @staticmethod def _read_row(pattern, row: int, cols: int): """Read one grid row into a list of cell strings.""" diff --git a/je_auto_control/utils/ax_text/__init__.py b/je_auto_control/utils/ax_text/__init__.py new file mode 100644 index 00000000..66ed749b --- /dev/null +++ b/je_auto_control/utils/ax_text/__init__.py @@ -0,0 +1,8 @@ +"""Native text reading via the UI Automation TextPattern (document/selection/visible).""" +from je_auto_control.utils.ax_text.ax_text import ( + get_control_text, get_selected_text, get_visible_text, +) + +__all__ = [ + "get_control_text", "get_selected_text", "get_visible_text", +] diff --git a/je_auto_control/utils/ax_text/ax_text.py b/je_auto_control/utils/ax_text/ax_text.py new file mode 100644 index 00000000..7f04f1dc --- /dev/null +++ 
b/je_auto_control/utils/ax_text/ax_text.py @@ -0,0 +1,61 @@ +"""Native text reading via the UI Automation TextPattern. + +``control_get_value`` reads a control through ValuePattern, but ValuePattern +returns an **empty string** on multiline edits, RichEdit / document controls and +web text areas — exactly the controls whose text you most want to read. UIA +exposes that text through a different pattern, ``TextPattern``, which models the +control's content as text ranges. ``ax_text`` adds three reads on top of the +existing accessibility backend ABC: + +* :func:`get_control_text` — the whole document's text (``DocumentRange``), +* :func:`get_selected_text` — the currently selected text (``GetSelection``), +* :func:`get_visible_text` — only the on-screen text (``GetVisibleRanges``). + +Each function is a thin dispatch onto the injectable +``accessibility.backends.get_backend()`` seam (the same seam the rest of the +accessibility module uses), so the headless core is unit-testable on any +platform by injecting a fake backend; the real UIA calls live in the Windows +backend. Imports no ``PySide6``. +""" +from typing import Optional + + +def _backend(): + from je_auto_control.utils.accessibility.backends import get_backend + return get_backend() + + +def get_control_text(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return a control's full text via TextPattern (``None`` if not found). + + Unlike :func:`control_get_value`, this works on multiline edits, RichEdit / + document controls and web text areas where ValuePattern returns ``""``. 
+ """ + return _backend().document_text(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def get_selected_text(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return the control's currently selected text (TextPattern.GetSelection). + + Empty string when nothing is selected; ``None`` if the control is not found + or exposes no TextPattern. + """ + return _backend().selected_text(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def get_visible_text(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return only the on-screen text of a control (TextPattern.GetVisibleRanges). + + Useful for scrolled documents where :func:`get_control_text` would return the + whole (possibly huge) buffer. ``None`` if the control is not found. + """ + return _backend().visible_text(name=name, role=role, app_name=app_name, + automation_id=automation_id) diff --git a/je_auto_control/utils/ax_tree_walk/__init__.py b/je_auto_control/utils/ax_tree_walk/__init__.py new file mode 100644 index 00000000..de788d0e --- /dev/null +++ b/je_auto_control/utils/ax_tree_walk/__init__.py @@ -0,0 +1,10 @@ +"""Readable, addressable accessibility-tree post-processing (role names + node paths).""" +from je_auto_control.utils.ax_tree_walk.ax_tree_walk import ( + assign_node_paths, control_type_name, find_by_path, humanize_role, + humanize_tree, +) + +__all__ = [ + "control_type_name", "humanize_role", "humanize_tree", + "assign_node_paths", "find_by_path", +] diff --git a/je_auto_control/utils/ax_tree_walk/ax_tree_walk.py b/je_auto_control/utils/ax_tree_walk/ax_tree_walk.py new file mode 100644 index 00000000..d9da7a8c --- /dev/null +++ b/je_auto_control/utils/ax_tree_walk/ax_tree_walk.py @@ -0,0 +1,108 @@ +"""Make an accessibility-tree dump readable and 
addressable. + +``dump_accessibility_tree`` emits nodes with the platform's *raw* role — +on Windows that is the bare UI Automation ControlType id, e.g. +``"ControlType_50000"`` for a button. That is unreadable, and the dump +carries no stable per-node identity (UIA RuntimeId needs the live element, +which a serialised dump has thrown away). ``ax_tree_walk`` adds the pure, +platform-agnostic post-processing the dump lacks: + +* :func:`control_type_name` / :func:`humanize_role` — translate a UIA + ControlType id (or ``"ControlType_NNNNN"`` string) to a friendly name, +* :func:`humanize_tree` — a deep copy of the tree with every role humanised, +* :func:`assign_node_paths` — a deep copy stamping each node with a stable + positional ``path`` (``"0.2.1"``) — a pure stand-in for RuntimeId identity, +* :func:`find_by_path` — resolve a node back from its path. + +Pure-stdlib over :class:`AXTreeNode` values; no device or backend access, no +``PySide6``. Compose it on top of any ``dump_accessibility_tree`` output. +""" +from typing import Optional, Union + +from je_auto_control.utils.accessibility.tree import AXTreeNode + +# UIA ControlType ids → friendly names (UIAutomationClient ControlTypeId range). 
+_CONTROL_TYPE_NAMES = { + 50000: "Button", 50001: "Calendar", 50002: "CheckBox", 50003: "ComboBox", + 50004: "Edit", 50005: "Hyperlink", 50006: "Image", 50007: "ListItem", + 50008: "List", 50009: "Menu", 50010: "MenuBar", 50011: "MenuItem", + 50012: "ProgressBar", 50013: "RadioButton", 50014: "ScrollBar", + 50015: "Slider", 50016: "Spinner", 50017: "StatusBar", 50018: "Tab", + 50019: "TabItem", 50020: "Text", 50021: "ToolBar", 50022: "ToolTip", + 50023: "Tree", 50024: "TreeItem", 50025: "Custom", 50026: "Group", + 50027: "Thumb", 50028: "DataGrid", 50029: "DataItem", 50030: "Document", + 50031: "SplitButton", 50032: "Window", 50033: "Pane", 50034: "Header", + 50035: "HeaderItem", 50036: "Table", 50037: "TitleBar", 50038: "Separator", + 50039: "SemanticZoom", 50040: "AppBar", +} +_ROLE_PREFIX = "ControlType_" + + +def control_type_name(control_type: int) -> str: + """Return the friendly name for a UIA ControlType id (e.g. ``50000`` → ``Button``). + + Unknown ids fall back to ``"ControlType_NNNNN"`` so nothing is lost. + """ + cid = int(control_type) + return _CONTROL_TYPE_NAMES.get(cid, f"{_ROLE_PREFIX}{cid}") + + +def humanize_role(role: Union[str, int]) -> str: + """Map a raw UIA role to a friendly name. + + Accepts an int id (``50000``), a ``"ControlType_50000"`` string, or a bare + ``"50000"`` string. Any role that is not a recognised ControlType — already + friendly (``"Button"``) or a non-UIA role (``"AXApplication"``) — is returned + unchanged. 
+ """ + if isinstance(role, int): + return control_type_name(role) + text = str(role) + digits = text[len(_ROLE_PREFIX):] if text.startswith(_ROLE_PREFIX) else text + if digits.isdigit(): + return control_type_name(int(digits)) + return text + + +def humanize_tree(node: AXTreeNode) -> AXTreeNode: + """Return a deep copy of ``node`` with every role run through :func:`humanize_role`.""" + return AXTreeNode( + name=node.name, role=humanize_role(node.role), bounds=node.bounds, + app_name=node.app_name, process_id=node.process_id, + attributes=dict(node.attributes), + children=[humanize_tree(child) for child in node.children], + ) + + +def assign_node_paths(node: AXTreeNode, prefix: str = "0") -> AXTreeNode: + """Return a deep copy stamping each node with a stable positional ``path``. + + The root is ``"0"``; its third child is ``"0.2"``, and so on. The path is a + pure stand-in for a RuntimeId: stable for a given tree shape and re-resolvable + with :func:`find_by_path`. Stored under ``attributes["path"]``. + """ + attributes = dict(node.attributes) + attributes["path"] = prefix + children = [assign_node_paths(child, f"{prefix}.{index}") + for index, child in enumerate(node.children)] + return AXTreeNode( + name=node.name, role=node.role, bounds=node.bounds, + app_name=node.app_name, process_id=node.process_id, + attributes=attributes, children=children, + ) + + +def find_by_path(root: AXTreeNode, path: str) -> Optional[AXTreeNode]: + """Resolve the node addressed by ``path`` (e.g. 
``"0.2.1"``); ``None`` if absent.""" + parts = str(path).split(".") + if not parts or parts[0] != "0": + return None + node = root + for part in parts[1:]: + if not part.isdigit(): + return None + index = int(part) + if index >= len(node.children): + return None + node = node.children[index] + return node diff --git a/je_auto_control/utils/control_patterns/__init__.py b/je_auto_control/utils/control_patterns/__init__.py new file mode 100644 index 00000000..548541dd --- /dev/null +++ b/je_auto_control/utils/control_patterns/__init__.py @@ -0,0 +1,11 @@ +"""Extended UI Automation control-pattern actions (Expand / Select / Range / Scroll).""" +from je_auto_control.utils.control_patterns.control_patterns import ( + collapse_control, control_expand_state, control_range, expand_control, + scroll_control_into_view, select_control_item, set_control_range, +) + +__all__ = [ + "expand_control", "collapse_control", "control_expand_state", + "select_control_item", "control_range", "set_control_range", + "scroll_control_into_view", +] diff --git a/je_auto_control/utils/control_patterns/control_patterns.py b/je_auto_control/utils/control_patterns/control_patterns.py new file mode 100644 index 00000000..83e9f2e3 --- /dev/null +++ b/je_auto_control/utils/control_patterns/control_patterns.py @@ -0,0 +1,77 @@ +"""Extended UI Automation control-pattern actions (Expand / Select / Range / Scroll). + +The accessibility backend ships only four control patterns — Value, Invoke, Toggle and a +read-only Grid dump. That leaves the controls automation hits most often undriveable by their +*native* pattern: a treeview node can't be expanded, a listbox / combobox item can't be +selected (SelectionItemPattern), a slider can't be set (RangeValuePattern), and a control can't +be scrolled into view (ScrollItemPattern) — today those fall back to fragile pixel guessing. +``control_patterns`` adds those object-level actions on top of the existing backend ABC. 
+ +Each function is a thin dispatch onto the injectable ``accessibility.backends.get_backend()`` +seam (the same seam the rest of the accessibility module uses), so the headless core is +unit-testable on any platform by injecting a fake backend; the real UIA calls live in the +Windows backend. Imports no ``PySide6``. +""" +from typing import Any, Dict, Optional + + +def _backend(): + from je_auto_control.utils.accessibility.backends import get_backend + return get_backend() + + +def expand_control(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Expand a tree node / combobox / expander (ExpandCollapsePattern).""" + return _backend().expand(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def collapse_control(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Collapse a tree node / combobox / expander (ExpandCollapsePattern).""" + return _backend().collapse(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def control_expand_state(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Optional[str]: + """Return ``expanded`` / ``collapsed`` / ``partial`` / ``leaf`` for a control.""" + return _backend().expand_state(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def select_control_item(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Select a list / tree / tab item (SelectionItemPattern).""" + return _backend().select_item(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def control_range(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] 
= None) -> Optional[Dict[str, Any]]: + """Return ``{value, minimum, maximum}`` of a slider / progress (RangeValuePattern).""" + return _backend().get_range(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def set_control_range(value: float, name: Optional[str] = None, + role: Optional[str] = None, app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Set a slider / progress / spinner value (RangeValuePattern).""" + return _backend().set_range_value(float(value), name=name, role=role, + app_name=app_name, + automation_id=automation_id) + + +def scroll_control_into_view(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Scroll a control into view before acting on it (ScrollItemPattern).""" + return _backend().scroll_into_view(name=name, role=role, app_name=app_name, + automation_id=automation_id) diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py index 3534e300..4955e76c 100644 --- a/je_auto_control/utils/executor/action_executor.py +++ b/je_auto_control/utils/executor/action_executor.py @@ -182,6 +182,52 @@ def _a11y_dump(app_name: Optional[str] = None, ).to_dict() +def _walk_tree(app_name: Optional[str] = None, + max_results: int = 500) -> Dict[str, Any]: + """Executor adapter: dump the a11y tree with friendly roles + node paths.""" + from je_auto_control.utils.accessibility import dump_accessibility_tree + from je_auto_control.utils.ax_tree_walk import ( + assign_node_paths, humanize_tree) + root = dump_accessibility_tree(app_name=app_name, + max_results=int(max_results)) + return assign_node_paths(humanize_tree(root)).to_dict() + + +def _humanize_role(role: str) -> Dict[str, Any]: + """Executor adapter: translate a raw UIA role to a friendly name.""" + from je_auto_control.utils.ax_tree_walk import humanize_role + return {"role": humanize_role(role)} + 
+ +def _tab_order(app_name: Optional[str] = None, + max_results: int = 500) -> Dict[str, Any]: + """Executor adapter: focusable elements in keyboard Tab order.""" + from je_auto_control.utils.accessibility import list_accessibility_elements + from je_auto_control.utils.focus_order import tab_order + elements = list_accessibility_elements(app_name=app_name, + max_results=int(max_results)) + return {"order": [el.to_dict() for el in tab_order(elements)]} + + +def _audit_focus_order(app_name: Optional[str] = None, + max_results: int = 500) -> Dict[str, Any]: + """Executor adapter: WCAG focus-order audit over the app's elements.""" + from je_auto_control.utils.accessibility import list_accessibility_elements + from je_auto_control.utils.focus_order import audit_focus_order + elements = list_accessibility_elements(app_name=app_name, + max_results=int(max_results)) + return audit_focus_order(elements) + + +def _focus_control(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Executor adapter: set keyboard focus on a control (UIA SetFocus).""" + from je_auto_control.utils.focus_order import focus_control + return focus_control(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + def _a11y_record_start(app_name: Optional[str] = None, poll_interval_s: float = 0.25, min_movement_px: int = 8) -> Dict[str, Any]: @@ -2352,6 +2398,97 @@ def _control_toggle(name: Optional[str] = None, role: Optional[str] = None, automation_id=automation_id) +def _expand_control(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Adapter: expand a tree node / combobox (ExpandCollapsePattern).""" + from je_auto_control.utils.control_patterns import expand_control + return expand_control(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def _collapse_control(name: 
Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Adapter: collapse a tree node / combobox (ExpandCollapsePattern).""" + from je_auto_control.utils.control_patterns import collapse_control + return collapse_control(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def _control_expand_state(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Dict[str, Any]: + """Adapter: the expand/collapse state of a control.""" + from je_auto_control.utils.control_patterns import control_expand_state + return {"state": control_expand_state(name=name, role=role, app_name=app_name, + automation_id=automation_id)} + + +def _select_control_item(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Adapter: select a list / tree / tab item (SelectionItemPattern).""" + from je_auto_control.utils.control_patterns import select_control_item + return select_control_item(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def _control_range(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Dict[str, Any]: + """Adapter: read a slider / progress range (RangeValuePattern).""" + from je_auto_control.utils.control_patterns import control_range + info = control_range(name=name, role=role, app_name=app_name, + automation_id=automation_id) + return {"found": info is not None, "range": info} + + +def _set_control_range(value: Any, name: Optional[str] = None, + role: Optional[str] = None, app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Adapter: set a slider / progress / spinner value (RangeValuePattern).""" + from je_auto_control.utils.control_patterns import set_control_range + 
return set_control_range(float(value), name=name, role=role, + app_name=app_name, automation_id=automation_id) + + +def _scroll_control_into_view(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Adapter: scroll a control into view (ScrollItemPattern).""" + from je_auto_control.utils.control_patterns import scroll_control_into_view + return scroll_control_into_view(name=name, role=role, app_name=app_name, + automation_id=automation_id) + + +def _get_control_text(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Dict[str, Any]: + """Adapter: read a control's full text via TextPattern (multiline-safe).""" + from je_auto_control.utils.ax_text import get_control_text + return {"text": get_control_text(name=name, role=role, app_name=app_name, + automation_id=automation_id)} + + +def _get_selected_text(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Dict[str, Any]: + """Adapter: read a control's currently selected text (TextPattern).""" + from je_auto_control.utils.ax_text import get_selected_text + return {"text": get_selected_text(name=name, role=role, app_name=app_name, + automation_id=automation_id)} + + +def _get_visible_text(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> Dict[str, Any]: + """Adapter: read only the on-screen text of a control (TextPattern).""" + from je_auto_control.utils.ax_text import get_visible_text + return {"text": get_visible_text(name=name, role=role, app_name=app_name, + automation_id=automation_id)} + + def _read_table(name: Optional[str] = None, role: Optional[str] = None, app_name: Optional[str] = None, automation_id: Optional[str] = None) -> List[List[str]]: @@ -6017,10 +6154,25 @@ def 
__init__(self): "AC_a11y_find": _a11y_find_as_dict, "AC_a11y_click": click_accessibility_element, "AC_a11y_dump": _a11y_dump, + "AC_walk_tree": _walk_tree, + "AC_humanize_role": _humanize_role, + "AC_tab_order": _tab_order, + "AC_audit_focus_order": _audit_focus_order, + "AC_focus_control": _focus_control, "AC_control_get_value": _control_get_value, "AC_control_set_value": _control_set_value, "AC_control_invoke": _control_invoke, "AC_control_toggle": _control_toggle, + "AC_expand_control": _expand_control, + "AC_collapse_control": _collapse_control, + "AC_control_expand_state": _control_expand_state, + "AC_select_control_item": _select_control_item, + "AC_control_range": _control_range, + "AC_set_control_range": _set_control_range, + "AC_scroll_control_into_view": _scroll_control_into_view, + "AC_get_control_text": _get_control_text, + "AC_get_selected_text": _get_selected_text, + "AC_get_visible_text": _get_visible_text, "AC_read_table": _read_table, "AC_watchdog_add": _watchdog_add, "AC_watchdog_start": _watchdog_start, diff --git a/je_auto_control/utils/focus_order/__init__.py b/je_auto_control/utils/focus_order/__init__.py new file mode 100644 index 00000000..64983e70 --- /dev/null +++ b/je_auto_control/utils/focus_order/__init__.py @@ -0,0 +1,8 @@ +"""Keyboard focus order: expected Tab sequence, WCAG audit, and set-focus.""" +from je_auto_control.utils.focus_order.focus_order import ( + audit_focus_order, focus_control, is_interactive_role, tab_order, +) + +__all__ = [ + "is_interactive_role", "tab_order", "audit_focus_order", "focus_control", +] diff --git a/je_auto_control/utils/focus_order/focus_order.py b/je_auto_control/utils/focus_order/focus_order.py new file mode 100644 index 00000000..3f44acc6 --- /dev/null +++ b/je_auto_control/utils/focus_order/focus_order.py @@ -0,0 +1,87 @@ +"""Keyboard focus order: expected Tab sequence, a WCAG audit, and set-focus. + +Nothing in the toolkit reasons about *keyboard* navigation. 
``focus_order`` adds: + +* :func:`is_interactive_role` — is a role one that normally takes keyboard focus, +* :func:`tab_order` — the focusable elements in the order ``Tab`` will visit them + (their reading order: top-to-bottom, left-to-right), +* :func:`audit_focus_order` — a WCAG 2.4.x focus-order report over a flat element + list (the sequence plus flagged problems, e.g. a focusable element with no + visible area), +* :func:`focus_control` — set the keyboard focus on a control (device action). + +The first three are pure functions over :class:`AccessibilityElement` lists — +``tab_order`` reuses :func:`element_parse.reading_order` for row banding and +``is_interactive_role`` reuses :func:`ax_tree_walk.humanize_role`, so no logic is +duplicated. ``focus_control`` is a thin dispatch onto the injectable +``accessibility.backends.get_backend()`` seam; the real ``SetFocus`` call lives in +the Windows backend. Imports no ``PySide6``. +""" +from typing import Any, Dict, List, Optional, Sequence, Union + +from je_auto_control.utils.accessibility.element import AccessibilityElement +from je_auto_control.utils.ax_tree_walk import humanize_role +from je_auto_control.utils.element_parse import reading_order + +# Roles that conventionally participate in keyboard tab navigation. 
+_INTERACTIVE_ROLES = frozenset({ + "Button", "Calendar", "CheckBox", "ComboBox", "Edit", "Hyperlink", + "ListItem", "MenuItem", "RadioButton", "ScrollBar", "Slider", "Spinner", + "SplitButton", "Tab", "TabItem", "TreeItem", "DataItem", "Thumb", +}) + + +def is_interactive_role(role: Union[str, int]) -> bool: + """Return True if ``role`` is one that normally accepts keyboard focus.""" + return humanize_role(role) in _INTERACTIVE_ROLES + + +def _box(element: AccessibilityElement, index: int) -> Dict[str, Any]: + left, top, width, height = element.bounds + return {"x": left, "y": top, "width": width, "height": height, "_idx": index} + + +def tab_order(elements: Sequence[AccessibilityElement], *, + row_tol: int = 12) -> List[AccessibilityElement]: + """Return the focusable elements in the order ``Tab`` would visit them. + + Filters to :func:`is_interactive_role` then orders by reading order (rows + within ``row_tol`` px share a row, ordered left-to-right). + """ + interactive = [el for el in elements if is_interactive_role(el.role)] + boxes = [_box(el, index) for index, el in enumerate(interactive)] + ordered = reading_order(boxes, row_tol=int(row_tol)) + return [interactive[box["_idx"]] for box in ordered] + + +def audit_focus_order(elements: Sequence[AccessibilityElement], *, + row_tol: int = 12) -> Dict[str, Any]: + """Return a WCAG 2.4.x focus-order report over a flat element list. + + ``order`` is the expected Tab sequence (``tab_index`` / ``name`` / ``role`` / + ``bounds``); ``issues`` flags focusable elements with no visible area + (WCAG 2.4.7 Focus Visible — focus would land somewhere unseen). 
+ """ + order = tab_order(elements, row_tol=row_tol) + sequence: List[Dict[str, Any]] = [] + issues: List[Dict[str, Any]] = [] + for tab_index, element in enumerate(order): + role = humanize_role(element.role) + _left, _top, width, height = element.bounds + sequence.append({"tab_index": tab_index, "name": element.name, + "role": role, "bounds": list(element.bounds)}) + if width <= 0 or height <= 0: + issues.append({"tab_index": tab_index, "name": element.name, + "role": role, "issue": "zero_area_focusable", + "wcag": "2.4.7 Focus Visible"}) + return {"order": sequence, "issues": issues, + "focusable_count": len(order), "issue_count": len(issues)} + + +def focus_control(name: Optional[str] = None, role: Optional[str] = None, + app_name: Optional[str] = None, + automation_id: Optional[str] = None) -> bool: + """Set keyboard focus on the matched control (UIA SetFocus); True on success.""" + from je_auto_control.utils.accessibility.backends import get_backend + return get_backend().set_focus(name=name, role=role, app_name=app_name, + automation_id=automation_id) diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py index 0db7d0a3..868bd6a4 100644 --- a/je_auto_control/utils/mcp_server/tools/_factories.py +++ b/je_auto_control/utils/mcp_server/tools/_factories.py @@ -1099,6 +1099,89 @@ def a11y_control_tools() -> List[MCPTool]: handler=h.read_table, annotations=READ_ONLY, ), + MCPTool( + name="ac_expand_control", + description=("Expand a tree node / combobox / expander natively " + "(ExpandCollapsePattern). Located by name/role/app_name/" + "automation_id. Returns True on success."), + input_schema=schema(dict(_M)), + handler=h.expand_control, + annotations=DESTRUCTIVE, + ), + MCPTool( + name="ac_collapse_control", + description=("Collapse a tree node / combobox / expander natively " + "(ExpandCollapsePattern). 
Returns True on success."), + input_schema=schema(dict(_M)), + handler=h.collapse_control, + annotations=DESTRUCTIVE, + ), + MCPTool( + name="ac_control_expand_state", + description=("Read a control's expand state via ExpandCollapsePattern: " + "{state: expanded|collapsed|partial|leaf|null}."), + input_schema=schema(dict(_M)), + handler=h.control_expand_state, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_select_control_item", + description=("Select a list / tree / tab item natively " + "(SelectionItemPattern). Returns True on success."), + input_schema=schema(dict(_M)), + handler=h.select_control_item, + annotations=DESTRUCTIVE, + ), + MCPTool( + name="ac_control_range", + description=("Read a slider / progress / spinner range via " + "RangeValuePattern: {found, range:{value,minimum,maximum}}."), + input_schema=schema(dict(_M)), + handler=h.control_range, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_set_control_range", + description=("Set a slider / progress / spinner 'value' natively " + "(RangeValuePattern). Returns True on success."), + input_schema=schema({"value": {"type": "number"}, **_M}, + required=["value"]), + handler=h.set_control_range, + annotations=DESTRUCTIVE, + ), + MCPTool( + name="ac_scroll_control_into_view", + description=("Scroll a control into view before acting " + "(ScrollItemPattern). Returns True on success."), + input_schema=schema(dict(_M)), + handler=h.scroll_control_into_view, + annotations=DESTRUCTIVE, + ), + MCPTool( + name="ac_get_control_text", + description=("Read a control's full text via TextPattern: " + "{text}. 
Works on multiline edits / RichEdit / document " + "controls where ac_control_get_value returns empty."), + input_schema=schema(dict(_M)), + handler=h.get_control_text, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_get_selected_text", + description=("Read a control's currently selected text via " + "TextPattern: {text} ('' when nothing selected)."), + input_schema=schema(dict(_M)), + handler=h.get_selected_text, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_get_visible_text", + description=("Read only the on-screen text of a control via " + "TextPattern.GetVisibleRanges: {text}."), + input_schema=schema(dict(_M)), + handler=h.get_visible_text, + annotations=READ_ONLY, + ), ] @@ -1117,6 +1200,67 @@ def a11y_tree_tools() -> List[MCPTool]: handler=h.a11y_dump, annotations=READ_ONLY, ), + MCPTool( + name="ac_walk_tree", + description=("Dump the accessibility tree like ac_a11y_dump but with " + "friendly role names (UIA 'ControlType_50000' → " + "'Button') and a stable positional 'path' per node " + "(addressable via the path attribute)."), + input_schema=schema({ + "app_name": {"type": "string"}, + "max_results": {"type": "integer"}, + }), + handler=h.walk_tree, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_humanize_role", + description=("Translate a raw UIA role ('ControlType_50000' / " + "'50000') to a friendly name: {role}. 
Unknown / " + "already-friendly roles pass through unchanged."), + input_schema=schema({"role": {"type": "string"}}, + required=["role"]), + handler=h.humanize_role, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_tab_order", + description=("List the focusable controls in the order the keyboard " + "Tab key would visit them (reading order): " + "{order:[{name,role,bounds,center,...}]}."), + input_schema=schema({ + "app_name": {"type": "string"}, + "max_results": {"type": "integer"}, + }), + handler=h.tab_order, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_audit_focus_order", + description=("WCAG 2.4.x focus-order audit over an app's controls: " + "{order, issues, focusable_count, issue_count}. Flags " + "focusable controls with no visible area."), + input_schema=schema({ + "app_name": {"type": "string"}, + "max_results": {"type": "integer"}, + }), + handler=h.audit_focus_order, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_focus_control", + description=("Set keyboard focus on a control natively (UIA " + "SetFocus), located by name/role/app_name/" + "automation_id. Returns True on success."), + input_schema=schema({ + "name": {"type": "string"}, + "role": {"type": "string"}, + "app_name": {"type": "string"}, + "automation_id": {"type": "string"}, + }), + handler=h.focus_control, + annotations=DESTRUCTIVE, + ), MCPTool( name="ac_a11y_record_start", description=("Start the polling accessibility recorder. 
" diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py index 5d7a8051..a609c203 100644 --- a/je_auto_control/utils/mcp_server/tools/_handlers.py +++ b/je_auto_control/utils/mcp_server/tools/_handlers.py @@ -751,6 +751,60 @@ def read_table(name=None, role=None, app_name=None, automation_id=None): automation_id=automation_id) +def expand_control(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _expand_control + return _expand_control(name, role, app_name, automation_id) + + +def collapse_control(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _collapse_control + return _collapse_control(name, role, app_name, automation_id) + + +def control_expand_state(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import ( + _control_expand_state) + return _control_expand_state(name, role, app_name, automation_id) + + +def select_control_item(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _select_control_item + return _select_control_item(name, role, app_name, automation_id) + + +def control_range(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _control_range + return _control_range(name, role, app_name, automation_id) + + +def set_control_range(value, name=None, role=None, app_name=None, + automation_id=None): + from je_auto_control.utils.executor.action_executor import _set_control_range + return _set_control_range(value, name, role, app_name, automation_id) + + +def scroll_control_into_view(name=None, role=None, app_name=None, + automation_id=None): + from je_auto_control.utils.executor.action_executor import ( + _scroll_control_into_view) + return 
_scroll_control_into_view(name, role, app_name, automation_id) + + +def get_control_text(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _get_control_text + return _get_control_text(name, role, app_name, automation_id) + + +def get_selected_text(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _get_selected_text + return _get_selected_text(name, role, app_name, automation_id) + + +def get_visible_text(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _get_visible_text + return _get_visible_text(name, role, app_name, automation_id) + + def watchdog_add(title, action="close", case_sensitive=False, name=None): from je_auto_control.utils.watchdog import default_popup_watchdog default_popup_watchdog.add_window_rule( @@ -2833,6 +2887,31 @@ def a11y_dump(app_name: Optional[str] = None, ).to_dict() +def walk_tree(app_name=None, max_results: int = 500): + from je_auto_control.utils.executor.action_executor import _walk_tree + return _walk_tree(app_name, max_results) + + +def humanize_role(role): + from je_auto_control.utils.executor.action_executor import _humanize_role + return _humanize_role(role) + + +def tab_order(app_name=None, max_results: int = 500): + from je_auto_control.utils.executor.action_executor import _tab_order + return _tab_order(app_name, max_results) + + +def audit_focus_order(app_name=None, max_results: int = 500): + from je_auto_control.utils.executor.action_executor import _audit_focus_order + return _audit_focus_order(app_name, max_results) + + +def focus_control(name=None, role=None, app_name=None, automation_id=None): + from je_auto_control.utils.executor.action_executor import _focus_control + return _focus_control(name, role, app_name, automation_id) + + def a11y_record_start(app_name: Optional[str] = None, poll_interval_s: float = 0.25, 
min_movement_px: int = 8) -> Dict[str, Any]: diff --git a/test/unit_test/headless/test_ax_text_batch.py b/test/unit_test/headless/test_ax_text_batch.py new file mode 100644 index 00000000..617ec9a3 --- /dev/null +++ b/test/unit_test/headless/test_ax_text_batch.py @@ -0,0 +1,97 @@ +"""Headless tests for native TextPattern reads (fake backend via the seam).""" +import je_auto_control as ac +from je_auto_control.utils.accessibility.backends import base as backend_base +from je_auto_control.utils.ax_text import ( + get_control_text, get_selected_text, get_visible_text, +) + + +class _FakeBackend(backend_base.AccessibilityBackend): + name = "fake" + available = True + + def __init__(self): + self.calls = [] + + def document_text(self, name=None, role=None, app_name=None, + automation_id=None): + self.calls.append(("document", {"name": name, "role": role, + "app_name": app_name, + "automation_id": automation_id})) + return "line 1\nline 2\nline 3" + + def selected_text(self, name=None, role=None, app_name=None, + automation_id=None): + self.calls.append(("selected", name)) + return "line 2" + + def visible_text(self, name=None, role=None, app_name=None, + automation_id=None): + self.calls.append(("visible", name)) + return "line 1\nline 2" + + +def _inject(monkeypatch, backend): + import je_auto_control.utils.accessibility.backends as backends + monkeypatch.setattr(backends, "_cached_backend", backend, raising=False) + + +def test_document_text_dispatch(monkeypatch): + fake = _FakeBackend() + _inject(monkeypatch, fake) + assert get_control_text(name="Editor", role="document") == "line 1\nline 2\nline 3" + assert ("document", {"name": "Editor", "role": "document", "app_name": None, + "automation_id": None}) in fake.calls + + +def test_selected_text(monkeypatch): + fake = _FakeBackend() + _inject(monkeypatch, fake) + assert get_selected_text(automation_id="edit1") == "line 2" + assert fake.calls[0][0] == "selected" + + +def test_visible_text(monkeypatch): + fake = 
_FakeBackend() + _inject(monkeypatch, fake) + assert get_visible_text(name="Editor") == "line 1\nline 2" + assert fake.calls[0][0] == "visible" + + +def test_unsupported_backend_raises(monkeypatch): + from je_auto_control.utils.accessibility.element import ( + AccessibilityNotAvailableError) + _inject(monkeypatch, backend_base.AccessibilityBackend()) # all _unsupported + try: + get_control_text(name="x") + raised = False + except AccessibilityNotAvailableError: + raised = True + assert raised is True + + +# --- wiring --------------------------------------------------------------- + +def test_executor_adapter_wraps_text(monkeypatch): + _inject(monkeypatch, _FakeBackend()) + from je_auto_control.utils.executor.action_executor import _get_control_text + assert _get_control_text(name="Editor") == {"text": "line 1\nline 2\nline 3"} + + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_get_control_text", "AC_get_selected_text", + "AC_get_visible_text"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert {"ac_get_control_text", "ac_get_selected_text", + "ac_get_visible_text"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert {"AC_get_control_text", "AC_get_selected_text", + "AC_get_visible_text"} <= specs + + +def test_facade_exports(): + for name in ("get_control_text", "get_selected_text", "get_visible_text"): + assert hasattr(ac, name) and name in ac.__all__ diff --git a/test/unit_test/headless/test_ax_tree_walk_batch.py b/test/unit_test/headless/test_ax_tree_walk_batch.py new file mode 100644 index 00000000..bddd1d44 --- /dev/null +++ b/test/unit_test/headless/test_ax_tree_walk_batch.py @@ -0,0 +1,106 @@ +"""Headless tests for the readable/addressable a11y-tree post-processing.""" +import je_auto_control as ac +from 
je_auto_control.utils.accessibility.backends import base as backend_base +from je_auto_control.utils.accessibility.element import AccessibilityElement +from je_auto_control.utils.accessibility.tree import AXTreeNode +from je_auto_control.utils.ax_tree_walk import ( + assign_node_paths, control_type_name, find_by_path, humanize_role, + humanize_tree, +) + + +def _tree() -> AXTreeNode: + return AXTreeNode( + name="root", role="AXRoot", bounds=(0, 0, 0, 0), + children=[ + AXTreeNode(name="app", role="ControlType_50032", bounds=(0, 0, 4, 4), + children=[ + AXTreeNode(name="OK", role="ControlType_50000", + bounds=(1, 1, 2, 2)), + AXTreeNode(name="Name", role="ControlType_50004", + bounds=(2, 2, 2, 2)), + ]), + ], + ) + + +def test_control_type_name_known_and_unknown(): + assert control_type_name(50000) == "Button" + assert control_type_name(50004) == "Edit" + assert control_type_name(99999) == "ControlType_99999" + + +def test_humanize_role_forms(): + assert humanize_role(50000) == "Button" + assert humanize_role("ControlType_50000") == "Button" + assert humanize_role("50000") == "Button" + assert humanize_role("Button") == "Button" # already friendly + assert humanize_role("AXApplication") == "AXApplication" # non-UIA + + +def test_humanize_tree_is_a_deep_copy(): + root = _tree() + out = humanize_tree(root) + assert out.children[0].role == "Window" + assert out.children[0].children[0].role == "Button" + assert out.children[0].children[1].role == "Edit" + assert root.children[0].role == "ControlType_50032" # original untouched + + +def test_assign_node_paths_and_find_by_path(): + root = assign_node_paths(_tree()) + assert root.attributes["path"] == "0" + assert root.children[0].attributes["path"] == "0.0" + assert root.children[0].children[1].attributes["path"] == "0.0.1" + node = find_by_path(root, "0.0.1") + assert node is not None and node.name == "Name" + assert find_by_path(root, "0.0.9") is None # out of range + assert find_by_path(root, "1.0") is None # bad root 
+ + +# --- wiring (executor path is CI-exercised via a fake backend) ------------- + +class _FakeBackend(backend_base.AccessibilityBackend): + name = "fake" + available = True + + def list_elements(self, app_name=None, max_results=200): + return [AccessibilityElement(name="OK", role="ControlType_50000", + bounds=(1, 1, 2, 2), app_name="demo.exe")] + + +def _inject(monkeypatch, backend): + import je_auto_control.utils.accessibility.backends as backends + monkeypatch.setattr(backends, "_cached_backend", backend, raising=False) + + +def test_walk_tree_executor_humanizes_and_paths(monkeypatch): + _inject(monkeypatch, _FakeBackend()) + from je_auto_control.utils.executor.action_executor import _walk_tree + out = _walk_tree(app_name="demo.exe") + assert out["attributes"]["path"] == "0" + button = out["children"][0]["children"][0] + assert button["role"] == "Button" + assert button["attributes"]["path"] == "0.0.0" + + +def test_humanize_role_executor(): + from je_auto_control.utils.executor.action_executor import _humanize_role + assert _humanize_role("ControlType_50000") == {"role": "Button"} + + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_walk_tree", "AC_humanize_role"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert {"ac_walk_tree", "ac_humanize_role"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert {"AC_walk_tree", "AC_humanize_role"} <= specs + + +def test_facade_exports(): + for name in ("control_type_name", "humanize_role", "humanize_tree", + "assign_node_paths", "find_by_path"): + assert hasattr(ac, name) and name in ac.__all__ diff --git a/test/unit_test/headless/test_control_patterns_batch.py b/test/unit_test/headless/test_control_patterns_batch.py new file mode 100644 index 00000000..495e98ad --- /dev/null +++ 
b/test/unit_test/headless/test_control_patterns_batch.py @@ -0,0 +1,120 @@ +"""Headless tests for extended control patterns (fake backend via the seam).""" +import je_auto_control as ac +from je_auto_control.utils.accessibility.backends import base as backend_base +from je_auto_control.utils.control_patterns import ( + collapse_control, control_expand_state, control_range, expand_control, + scroll_control_into_view, select_control_item, set_control_range, +) + + +class _FakeBackend(backend_base.AccessibilityBackend): + name = "fake" + available = True + + def __init__(self): + self.calls = [] + + def expand(self, name=None, role=None, app_name=None, automation_id=None): + self.calls.append(("expand", {"name": name, "role": role, + "app_name": app_name, + "automation_id": automation_id})) + return True + + def collapse(self, name=None, role=None, app_name=None, automation_id=None): + self.calls.append(("collapse", {"name": name, "role": role, + "app_name": app_name, + "automation_id": automation_id})) + return True + + def expand_state(self, name=None, role=None, app_name=None, + automation_id=None): + return "collapsed" + + def select_item(self, name=None, role=None, app_name=None, + automation_id=None): + self.calls.append(("select", name)) + return True + + def get_range(self, name=None, role=None, app_name=None, automation_id=None): + return {"value": 30.0, "minimum": 0.0, "maximum": 100.0} + + def set_range_value(self, value, name=None, role=None, app_name=None, + automation_id=None): + self.calls.append(("set_range", value)) + return True + + def scroll_into_view(self, name=None, role=None, app_name=None, + automation_id=None): + return True + + +def _inject(monkeypatch, backend): + import je_auto_control.utils.accessibility.backends as backends + monkeypatch.setattr(backends, "_cached_backend", backend, raising=False) + + +def test_expand_and_collapse_dispatch(monkeypatch): + fake = _FakeBackend() + _inject(monkeypatch, fake) + assert 
expand_control(name="Node", role="treeitem") is True + assert collapse_control(automation_id="tree1") is True + assert ("expand", {"name": "Node", "role": "treeitem", "app_name": None, + "automation_id": None}) in fake.calls + + +def test_expand_state(monkeypatch): + _inject(monkeypatch, _FakeBackend()) + assert control_expand_state(name="Node") == "collapsed" + + +def test_select_item(monkeypatch): + fake = _FakeBackend() + _inject(monkeypatch, fake) + assert select_control_item(name="Option B") is True + assert fake.calls[0][0] == "select" + + +def test_range_get_and_set(monkeypatch): + fake = _FakeBackend() + _inject(monkeypatch, fake) + assert control_range(name="Volume") == {"value": 30.0, "minimum": 0.0, + "maximum": 100.0} + assert set_control_range(75, name="Volume") is True + assert ("set_range", 75.0) in fake.calls + + +def test_scroll_into_view(monkeypatch): + _inject(monkeypatch, _FakeBackend()) + assert scroll_control_into_view(name="Row 50") is True + + +def test_unsupported_backend_raises(monkeypatch): + from je_auto_control.utils.accessibility.element import ( + AccessibilityNotAvailableError) + _inject(monkeypatch, backend_base.AccessibilityBackend()) # all _unsupported + try: + expand_control(name="x") + raised = False + except AccessibilityNotAvailableError: + raised = True + assert raised is True + + +# --- wiring --------------------------------------------------------------- + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_expand_control", "AC_set_control_range"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert {"ac_expand_control", "ac_set_control_range"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert {"AC_expand_control", "AC_set_control_range"} <= specs + + +def test_facade_exports(): + for name in 
("expand_control", "collapse_control", "control_expand_state", + "select_control_item", "control_range", "set_control_range", + "scroll_control_into_view"): + assert hasattr(ac, name) and name in ac.__all__ diff --git a/test/unit_test/headless/test_focus_order_batch.py b/test/unit_test/headless/test_focus_order_batch.py new file mode 100644 index 00000000..e760b0e9 --- /dev/null +++ b/test/unit_test/headless/test_focus_order_batch.py @@ -0,0 +1,119 @@ +"""Headless tests for keyboard focus order (tab sequence / WCAG audit / set-focus).""" +import je_auto_control as ac +from je_auto_control.utils.accessibility.backends import base as backend_base +from je_auto_control.utils.accessibility.element import AccessibilityElement +from je_auto_control.utils.focus_order import ( + audit_focus_order, focus_control, is_interactive_role, tab_order, +) + + +def _el(name, role, bounds): + return AccessibilityElement(name=name, role=role, bounds=bounds, + app_name="demo.exe") + + +def _sample(): + return [ + _el("Label", "ControlType_50020", (10, 10, 100, 20)), # Text: skipped + _el("OK", "ControlType_50000", (10, 100, 50, 20)), # Button + _el("Name", "ControlType_50004", (10, 50, 100, 20)), # Edit + _el("Agree", "ControlType_50002", (200, 50, 20, 20)), # CheckBox, same row + _el("Hidden", "ControlType_50000", (10, 200, 0, 0)), # Button, zero-area + ] + + +def test_is_interactive_role(): + assert is_interactive_role("ControlType_50000") is True # Button + assert is_interactive_role("Edit") is True + assert is_interactive_role("ControlType_50020") is False # Text + assert is_interactive_role("AXApplication") is False + + +def test_tab_order_filters_and_reads(): + order = [el.name for el in tab_order(_sample())] + # Text excluded; row y=50 (Name then Agree by x) before y=100 before y=200. 
+ assert order == ["Name", "Agree", "OK", "Hidden"] + + +def test_audit_flags_zero_area_focusable(): + report = audit_focus_order(_sample()) + assert report["focusable_count"] == 4 + assert report["issue_count"] == 1 + issue = report["issues"][0] + assert issue["name"] == "Hidden" + assert issue["issue"] == "zero_area_focusable" + assert report["order"][0] == {"tab_index": 0, "name": "Name", + "role": "Edit", "bounds": [10, 50, 100, 20]} + + +def test_audit_empty_list(): + report = audit_focus_order([]) + assert report == {"order": [], "issues": [], "focusable_count": 0, + "issue_count": 0} + + +# --- device seam + wiring -------------------------------------------------- + +class _FakeBackend(backend_base.AccessibilityBackend): + name = "fake" + available = True + + def __init__(self): + self.focused = [] + + def list_elements(self, app_name=None, max_results=200): + return _sample() + + def set_focus(self, name=None, role=None, app_name=None, automation_id=None): + self.focused.append(name) + return True + + +def _inject(monkeypatch, backend): + import je_auto_control.utils.accessibility.backends as backends + monkeypatch.setattr(backends, "_cached_backend", backend, raising=False) + + +def test_focus_control_dispatch(monkeypatch): + fake = _FakeBackend() + _inject(monkeypatch, fake) + assert focus_control(name="Name", role="edit") is True + assert fake.focused == ["Name"] + + +def test_unsupported_backend_raises(monkeypatch): + from je_auto_control.utils.accessibility.element import ( + AccessibilityNotAvailableError) + _inject(monkeypatch, backend_base.AccessibilityBackend()) + try: + focus_control(name="x") + raised = False + except AccessibilityNotAvailableError: + raised = True + assert raised is True + + +def test_executor_tab_order_and_audit(monkeypatch): + _inject(monkeypatch, _FakeBackend()) + from je_auto_control.utils.executor.action_executor import ( + _audit_focus_order, _tab_order) + order = [e["name"] for e in 
_tab_order(app_name="demo.exe")["order"]] + assert order == ["Name", "Agree", "OK", "Hidden"] + assert _audit_focus_order(app_name="demo.exe")["issue_count"] == 1 + + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_tab_order", "AC_audit_focus_order", "AC_focus_control"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert {"ac_tab_order", "ac_audit_focus_order", "ac_focus_control"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert {"AC_tab_order", "AC_audit_focus_order", "AC_focus_control"} <= specs + + +def test_facade_exports(): + for name in ("is_interactive_role", "tab_order", "audit_focus_order", + "focus_control"): + assert hasattr(ac, name) and name in ac.__all__
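
The tab-order and audit behaviour exercised by `test_focus_order_batch.py` can be sketched without the library. This is a minimal, standalone illustration, not the project's implementation: `Box`, `reading_order_sketch`, and `zero_area_names` are hypothetical names, and the row-banding rule (a box joins the current row when its top edge is within `row_tol` px of that row's first box) is an assumption about how `element_parse.reading_order` behaves, since the patch does not show it.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Hypothetical stand-in for an AccessibilityElement (name + bounds only)."""
    name: str
    bounds: Tuple[int, int, int, int]  # (left, top, width, height)


def reading_order_sketch(boxes: List[Box], row_tol: int = 12) -> List[Box]:
    """Band boxes into rows by top edge, then order each row left-to-right.

    Assumed rule: a box joins the current row when its top edge lies within
    row_tol pixels of the first box in that row.
    """
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.bounds[1]):
        if rows and abs(box.bounds[1] - rows[-1][0].bounds[1]) <= row_tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Flatten the rows top-to-bottom, sorting each row by its left edge.
    return [box for row in rows for box in sorted(row, key=lambda b: b.bounds[0])]


def zero_area_names(boxes: List[Box]) -> List[str]:
    """Names of boxes with no visible area (width or height <= 0)."""
    return [b.name for b in boxes if b.bounds[2] <= 0 or b.bounds[3] <= 0]


# Same geometry as the interactive elements in the test's _sample().
sample = [
    Box("OK", (10, 100, 50, 20)),
    Box("Name", (10, 50, 100, 20)),
    Box("Agree", (200, 50, 20, 20)),
    Box("Hidden", (10, 200, 0, 0)),
]
ordered_names = [b.name for b in reading_order_sketch(sample)]
hidden_names = zero_area_names(sample)
```

With this geometry the sketch yields the same sequence the tests assert (`Name`, `Agree`, `OK`, `Hidden`) and flags only `Hidden` as a zero-area focusable. The real `tab_order` additionally filters by `is_interactive_role` before ordering, which the sketch skips by starting from interactive elements only.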