6 changes: 6 additions & 0 deletions README/WHATS_NEW_zh-CN.md
@@ -1,5 +1,11 @@
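# What's New: AutoControl

## What's new (2026-06-24): Extended UIA Control Patterns (Expand / Select / Range / Scroll)

Drive tree nodes, list/combo items, sliders and scrolling through native patterns rather than pixel guessing. Full reference: [`docs/source/Zh/doc/new_features/v181_features_doc.rst`](../docs/source/Zh/doc/new_features/v181_features_doc.rst).

- **`expand_control` / `collapse_control` / `control_expand_state` / `select_control_item` / `control_range` / `set_control_range` / `scroll_control_into_view`** (`AC_expand_control`, `AC_select_control_item`, `AC_set_control_range`, etc.): the accessibility backend previously had only the Value/Invoke/Toggle/Grid-read patterns, so tree views, lists/combos, sliders and off-screen rows had no native call path. This feature adds the ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns on top of the existing backend ABC, dispatched through the injectable `accessibility.backends.get_backend()` seam (tested headlessly with a fake backend; the real UIA calls live in the Windows backend). Does not import `PySide6`.

## What's new (2026-06-24): Pre-Match Settle Gating + Match Persistence

Avoid matching while an animation is in progress, and confirm that a hit stays stable across frames. Full reference: [`docs/source/Zh/doc/new_features/v180_features_doc.rst`](../docs/source/Zh/doc/new_features/v180_features_doc.rst).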
6 changes: 6 additions & 0 deletions README/WHATS_NEW_zh-TW.md
@@ -1,5 +1,11 @@
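# What's New: AutoControl

## What's new (2026-06-24): Extended UIA Control Patterns (Expand / Select / Range / Scroll)

Drive tree nodes, list/combo items, sliders and scrolling through native patterns rather than pixel guessing. Full reference: [`docs/source/Zh/doc/new_features/v181_features_doc.rst`](../docs/source/Zh/doc/new_features/v181_features_doc.rst).

- **`expand_control` / `collapse_control` / `control_expand_state` / `select_control_item` / `control_range` / `set_control_range` / `scroll_control_into_view`** (`AC_expand_control`, `AC_select_control_item`, `AC_set_control_range`, etc.): the accessibility backend previously had only the Value/Invoke/Toggle/Grid-read patterns, so tree views, lists/combos, sliders and off-screen rows had no native call path. This feature adds the ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns on top of the existing backend ABC, dispatched through the injectable `accessibility.backends.get_backend()` seam (tested headlessly with a fake backend; the real UIA calls live in the Windows backend). Does not import `PySide6`.

## What's new (2026-06-24): Pre-Match Settle Gating + Match Persistence

Avoid matching while an animation is in progress, and confirm that a hit stays stable across frames. Full reference: [`docs/source/Zh/doc/new_features/v180_features_doc.rst`](../docs/source/Zh/doc/new_features/v180_features_doc.rst).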
24 changes: 24 additions & 0 deletions WHATS_NEW.md
@@ -1,5 +1,29 @@
# What's New — AutoControl

## What's new (2026-06-24) — Keyboard Focus Order (Tab sequence / WCAG audit / set-focus)

Reason about keyboard navigation: the Tab order, a WCAG focus-order audit, and set-focus. Full reference: [`docs/source/Eng/doc/new_features/v184_features_doc.rst`](docs/source/Eng/doc/new_features/v184_features_doc.rst).

- **`is_interactive_role` / `tab_order` / `audit_focus_order` / `focus_control`** (`AC_tab_order`, `AC_audit_focus_order`, `AC_focus_control`): until now nothing reasoned about *keyboard* navigation, only mouse coordinates and element values. This adds the keyboard layer: `tab_order` returns the focusable elements in the order Tab visits them (reading order), `audit_focus_order` produces a WCAG 2.4.x report (the sequence plus flagged problems such as a focusable element with no visible area), and `focus_control` sets keyboard focus via UIA `SetFocus`. The first three are pure functions over `AccessibilityElement` lists: `tab_order` reuses `element_parse.reading_order` and `is_interactive_role` reuses `ax_tree_walk.humanize_role`, so no logic is duplicated. `focus_control` dispatches through the injectable backend seam (real `SetFocus` in the Windows backend). No `PySide6`.

## What's new (2026-06-24) — Readable, Addressable Accessibility Tree (role names + node paths)

Turn a raw `ControlType_50000` tree dump into readable roles with a stable path per node. Full reference: [`docs/source/Eng/doc/new_features/v183_features_doc.rst`](docs/source/Eng/doc/new_features/v183_features_doc.rst).

- **`control_type_name` / `humanize_role` / `humanize_tree` / `assign_node_paths` / `find_by_path`** (`AC_walk_tree`, `AC_humanize_role`): `dump_accessibility_tree` emits the platform's raw role (on Windows the bare UIA ControlType id, e.g. `ControlType_50000` for a button) and carries no stable per-node identity once serialised. This adds the pure post-processing it lacks: translate ControlType ids to friendly names, deep-copy a tree with every role humanised, stamp each node with a stable positional `path` (`"0.2.1"` — a pure stand-in for RuntimeId), and resolve a node back by path. `AC_walk_tree` is the readable counterpart to `AC_a11y_dump`. Pure-stdlib over `AXTreeNode`; unknown / non-UIA roles pass through unchanged. No `PySide6`.

## What's new (2026-06-24) — Native Text Reading via the UIA TextPattern (document / selection / visible)

Read the text in multiline editors and document controls where ValuePattern returns nothing. Full reference: [`docs/source/Eng/doc/new_features/v182_features_doc.rst`](docs/source/Eng/doc/new_features/v182_features_doc.rst).

- **`get_control_text` / `get_selected_text` / `get_visible_text`** (`AC_get_control_text`, `AC_get_selected_text`, `AC_get_visible_text`): `control_get_value` reads through UIA ValuePattern, which returns an empty string on multiline edits, RichEdit / document controls and web text areas — exactly the controls whose text you most want. This reads through `TextPattern` instead: `get_control_text` returns the whole `DocumentRange`, `get_selected_text` the current `GetSelection`, `get_visible_text` only the on-screen `GetVisibleRanges`. Dispatched through the injectable `accessibility.backends.get_backend()` seam (headless-testable via a fake backend; real UIA calls in the Windows backend), returning `{text}` from the executor/MCP. No `PySide6`.

## What's new (2026-06-24) — Extended UIA Control Patterns (Expand / Select / Range / Scroll)

Drive tree nodes, list/combo items, sliders and scroll natively, not by pixel guessing. Full reference: [`docs/source/Eng/doc/new_features/v181_features_doc.rst`](docs/source/Eng/doc/new_features/v181_features_doc.rst).

- **`expand_control` / `collapse_control` / `control_expand_state` / `select_control_item` / `control_range` / `set_control_range` / `scroll_control_into_view`** (`AC_expand_control`, `AC_select_control_item`, `AC_set_control_range`, …): the accessibility backend had only Value/Invoke/Toggle/Grid-read patterns, so treeviews, listboxes/combos, sliders and off-screen rows had no native call path. This adds ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns on top of the existing backend ABC, dispatched through the injectable `accessibility.backends.get_backend()` seam (headless-testable via a fake backend; real UIA calls in the Windows backend). No `PySide6`.

## What's new (2026-06-24) — Pre-Match Settle Gating + Match Persistence

Avoid matching mid-animation, and confirm a hit holds steady across frames. Full reference: [`docs/source/Eng/doc/new_features/v180_features_doc.rst`](docs/source/Eng/doc/new_features/v180_features_doc.rst).
46 changes: 46 additions & 0 deletions docs/source/Eng/doc/new_features/v181_features_doc.rst
@@ -0,0 +1,46 @@
Extended UIA Control Patterns (Expand / Select / Range / Scroll)
===============================================================

The accessibility backend shipped only four control patterns: Value, Invoke, Toggle and a
read-only Grid dump. That left the controls automation hits most often impossible to drive by
their *native* pattern: a treeview node could not be expanded (ExpandCollapsePattern), a listbox /
combobox item could not be selected (SelectionItemPattern), a slider could not be set
(RangeValuePattern), and a control could not be scrolled into view (ScrollItemPattern), so
scripts fell back to fragile pixel guessing. ``control_patterns`` adds those object-level actions
on top of the existing accessibility backend ABC.

Each function is a thin dispatch onto the injectable ``accessibility.backends.get_backend()``
seam (the same seam the rest of the accessibility module uses), so the headless core is
unit-testable on any platform by injecting a fake backend; the real UI Automation calls live in
the Windows backend (ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns).
Backends that don't implement a pattern raise ``AccessibilityNotAvailableError``. Imports no
``PySide6``.
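
The seam described above can be pictured with a small, self-contained sketch. Everything in it (``Backend``, ``FakeBackend``, ``set_backend``, the ``expand`` method name and the locally defined ``AccessibilityNotAvailableError``) is a hypothetical stand-in for illustration, not the library's actual classes or signatures; only the idea of dispatching through ``get_backend()`` and raising when a pattern is missing comes from the text.

```python
from abc import ABC
from typing import Optional


class AccessibilityNotAvailableError(RuntimeError):
    """Local stand-in for the library's error of the same name."""


class Backend(ABC):
    """Hypothetical backend base: a pattern raises unless a backend overrides it."""

    def expand(self, name: str) -> bool:
        raise AccessibilityNotAvailableError("ExpandCollapse not implemented")


class FakeBackend(Backend):
    """Test double that records calls instead of touching a real UI."""

    def __init__(self) -> None:
        self.expanded: list[str] = []

    def expand(self, name: str) -> bool:
        self.expanded.append(name)
        return True


_backend: Optional[Backend] = None


def set_backend(backend: Backend) -> None:
    """Inject a backend (hypothetical helper, used here for testing)."""
    global _backend
    _backend = backend


def get_backend() -> Backend:
    if _backend is None:
        raise AccessibilityNotAvailableError("no backend injected")
    return _backend


def expand_control(name: str) -> bool:
    """Thin dispatch: the public function only forwards to the backend."""
    return get_backend().expand(name)


fake = FakeBackend()
set_backend(fake)
result = expand_control("Documents")
```

Because the public function holds no platform logic of its own, a test can assert on what the fake recorded without a Windows desktop being present.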

Headless API
------------

.. code-block:: python

    from je_auto_control import (expand_control, collapse_control,
                                 control_expand_state, select_control_item,
                                 control_range, set_control_range,
                                 scroll_control_into_view)

    expand_control(name="Documents", role="treeitem")   # open a tree node
    select_control_item(name="Option B")                # pick a list/combo item
    set_control_range(75, name="Volume")                # set a slider
    print(control_range(name="Volume"))  # {"value": 75.0, "minimum": 0, "maximum": 100}
    scroll_control_into_view(name="Row 200")            # bring a row on-screen

All of these functions locate the control by ``name`` / ``role`` / ``app_name`` /
``automation_id`` (the same way as the existing ``control_invoke`` / ``control_toggle``). The
expand/select/scroll/set actions return ``bool``; ``control_expand_state`` returns ``expanded`` /
``collapsed`` / ``partial`` / ``leaf`` (or ``None``); ``control_range`` returns
``{value, minimum, maximum}`` (or ``None``).
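
One way a script might use the ``{value, minimum, maximum}`` shape is to clamp a requested slider value before setting it. The helper below is a hypothetical sketch (``clamp_to_range`` is not part of the library, and the source does not say whether ``set_control_range`` clamps or rejects out-of-range values itself); it operates on a plain dict so it runs without any backend.

```python
from typing import Optional


def clamp_to_range(target: float, range_info: Optional[dict]) -> Optional[float]:
    """Clamp target into [minimum, maximum]; None if no range was reported."""
    if range_info is None:
        return None
    low = float(range_info["minimum"])
    high = float(range_info["maximum"])
    return max(low, min(high, float(target)))


# Same shape as the control_range() result shown in the example above.
volume = {"value": 75.0, "minimum": 0, "maximum": 100}
safe = clamp_to_range(130, volume)
```

The clamped value could then be passed on as ``set_control_range(safe, name="Volume")``.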

Executor commands
-----------------

``AC_expand_control`` / ``AC_collapse_control`` / ``AC_control_expand_state`` /
``AC_select_control_item`` / ``AC_control_range`` / ``AC_set_control_range`` /
``AC_scroll_control_into_view``. They are exposed as the matching ``ac_*`` MCP tools (the action
ones destructive, the reads read-only) and as Script Builder commands under **Native UI**.
46 changes: 46 additions & 0 deletions docs/source/Eng/doc/new_features/v182_features_doc.rst
@@ -0,0 +1,46 @@
Native Text Reading via the UIA TextPattern (document / selection / visible)
============================================================================

``control_get_value`` reads a control through UIA ValuePattern, but ValuePattern
returns an **empty string** on multiline edits, RichEdit / document controls and
web text areas — exactly the controls whose text you most want to read. UIA
exposes that text through a different pattern, ``TextPattern``, which models the
control's content as text ranges. ``ax_text`` adds three reads on top of the
existing accessibility backend ABC:

* :func:`get_control_text` — the whole document's text (``DocumentRange``),
* :func:`get_selected_text` — the currently selected text (``GetSelection``),
* :func:`get_visible_text` — only the on-screen text (``GetVisibleRanges``).
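
The relationship between the two reads can be sketched as a fallback: try the value read first and use the text read only when the value comes back empty. This is an illustrative assumption about how a caller might combine them, not behaviour the library is stated to provide; ``read_text_with_fallback`` is a hypothetical helper, and the two lambdas stand in for ``control_get_value`` and ``get_control_text``.

```python
from typing import Callable, Optional

Reader = Callable[[], Optional[str]]


def read_text_with_fallback(read_value: Reader, read_text: Reader) -> Optional[str]:
    """Prefer the value read; fall back to the text read when it is empty."""
    value = read_value()
    if value:  # non-empty string: the value read was enough
        return value
    text = read_text()
    # Keep the value read's result if the text read found nothing at all.
    return text if text is not None else value


# Stand-ins for control_get_value / get_control_text on a multiline editor.
result = read_text_with_fallback(lambda: "", lambda: "line 1\nline 2")
```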

Each function is a thin dispatch onto the injectable
``accessibility.backends.get_backend()`` seam (the same seam the rest of the
accessibility module uses), so the headless core is unit-testable on any
platform by injecting a fake backend; the real UI Automation calls live in the
Windows backend. Backends that don't implement TextPattern raise
``AccessibilityNotAvailableError``. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import (get_control_text, get_selected_text,
                                 get_visible_text)

    # A multiline editor where control_get_value returns "":
    text = get_control_text(name="Editor", role="document")
    selection = get_selected_text(name="Editor")   # "" when nothing selected
    on_screen = get_visible_text(name="Editor")    # skips scrolled-off lines

All locate the control by ``name`` / ``role`` / ``app_name`` / ``automation_id``
(same as ``control_get_value`` / ``control_invoke``). Each returns the text as a
``str``, or ``None`` when the control is not found or exposes no TextPattern;
``get_selected_text`` returns ``""`` when the control is found but has no
selection.

Executor commands
-----------------

``AC_get_control_text`` / ``AC_get_selected_text`` / ``AC_get_visible_text`` each
return ``{"text": ...}``. They are exposed as the matching read-only ``ac_*`` MCP
tools and as Script Builder commands under **Native UI**.
47 changes: 47 additions & 0 deletions docs/source/Eng/doc/new_features/v183_features_doc.rst
@@ -0,0 +1,47 @@
Readable, Addressable Accessibility Tree (role names + node paths)
==================================================================

``dump_accessibility_tree`` emits nodes with the platform's *raw* role — on
Windows that is the bare UI Automation ControlType id, e.g. ``"ControlType_50000"``
for a button. That is unreadable, and a serialised dump carries no stable
per-node identity (UIA RuntimeId needs the live element, which the dump has
thrown away). ``ax_tree_walk`` adds the pure, platform-agnostic post-processing
the dump lacks, composable on top of any ``dump_accessibility_tree`` output:

* :func:`control_type_name` / :func:`humanize_role` — translate a ControlType id
(or ``"ControlType_NNNNN"`` / ``"NNNNN"`` string) to a friendly name,
* :func:`humanize_tree` — a deep copy of the tree with every role humanised,
* :func:`assign_node_paths` — a deep copy stamping each node with a stable
positional ``path`` (``"0.2.1"``) — a pure stand-in for RuntimeId identity,
* :func:`find_by_path` — resolve a node back from its path.

Pure-stdlib over ``AXTreeNode`` values; no device or backend access. Imports no
``PySide6``.
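
Positional paths of the kind listed above can be sketched over plain nested dicts. This is not the library's implementation: ``assign_paths`` and ``find_path`` are hypothetical names, the ``children`` / ``attributes`` keys and the root path ``"0"`` are assumptions inferred from the examples in this document, and real ``AXTreeNode`` values may be shaped differently.

```python
import copy
from typing import Optional


def assign_paths(node: dict, path: str = "0") -> dict:
    """Return a deep copy with a positional path stored on every node."""
    stamped = copy.deepcopy(node)
    _stamp(stamped, path)
    return stamped


def _stamp(node: dict, path: str) -> None:
    node.setdefault("attributes", {})["path"] = path
    for index, child in enumerate(node.get("children", [])):
        _stamp(child, f"{path}.{index}")


def find_path(node: dict, path: str) -> Optional[dict]:
    """Walk child indexes from the root; None if the path does not exist."""
    parts = path.split(".")
    if parts[0] != "0":
        return None
    current = node
    for part in parts[1:]:
        children = current.get("children", [])
        if not part.isdecimal() or int(part) >= len(children):
            return None
        current = children[int(part)]
    return current


raw = {"role": "Window", "children": [
    {"role": "Pane", "children": [{"role": "Button"}, {"role": "Edit"}]},
]}
tree = assign_paths(raw)
node = find_path(tree, "0.0.1")
```

Working on a deep copy leaves the original dump untouched, which matches the "deep copy" wording used for the real functions.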

Headless API
------------

.. code-block:: python

    from je_auto_control import (dump_accessibility_tree, humanize_tree,
                                 assign_node_paths, find_by_path, humanize_role)

    humanize_role("ControlType_50000")      # "Button"
    humanize_role(50004)                    # "Edit"

    tree = assign_node_paths(humanize_tree(dump_accessibility_tree()))
    # every node now has a readable role and tree["attributes"]["path"]
    node = find_by_path(tree, "0.0.1")      # re-resolve a node by its path

Unknown ids and non-UIA roles (``"AXApplication"``) pass through unchanged, so
nothing is lost. The path is stable for a given tree shape, giving scripts /
agents a deterministic handle to a node across a dump → act round-trip.
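
The pass-through behaviour can be illustrated with a tiny lookup. This is a sketch under stated assumptions rather than the real ``humanize_role``: the helper name ``humanize`` is hypothetical, and the table holds only the two ids quoted in this document (50000 for Button, 50004 for Edit), whereas a real table would cover the full set of UIA ControlType ids.

```python
from typing import Union

# Two ids taken from the examples above; a real table would cover many more.
_CONTROL_TYPE_NAMES = {50000: "Button", 50004: "Edit"}
_PREFIX = "ControlType_"


def humanize(role: Union[int, str]) -> Union[int, str]:
    """Map a ControlType id to a name; return anything unknown unchanged."""
    text = str(role)
    if text.startswith(_PREFIX):
        text = text[len(_PREFIX):]
    if text.isdecimal():
        return _CONTROL_TYPE_NAMES.get(int(text), role)
    return role


names = [humanize(r) for r in
         ("ControlType_50000", 50004, "50004", "AXApplication", 99999)]
```

Returning the input itself on a miss is what keeps unknown ids and non-UIA roles intact.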

Executor commands
-----------------

``AC_walk_tree`` (``app_name`` / ``max_results``) returns the humanised,
path-stamped tree as a nested dict — the readable counterpart to
``AC_a11y_dump``. ``AC_humanize_role`` (``role``) returns ``{"role": ...}``.
Both are exposed as read-only ``ac_*`` MCP tools and as Script Builder commands
under **Native UI**.
50 changes: 50 additions & 0 deletions docs/source/Eng/doc/new_features/v184_features_doc.rst
@@ -0,0 +1,50 @@
Keyboard Focus Order (Tab sequence / WCAG audit / set-focus)
============================================================

Nothing in the toolkit reasoned about *keyboard* navigation — only mouse
coordinates and element values. ``focus_order`` adds the keyboard layer:

* :func:`is_interactive_role`: whether a role is one that normally takes keyboard focus,
* :func:`tab_order`: the focusable elements in the order ``Tab`` will visit them
  (their reading order: top-to-bottom, left-to-right),
* :func:`audit_focus_order`: a WCAG 2.4.x focus-order report over a flat element
  list (the sequence plus flagged problems, e.g. a focusable element with no
  visible area, where focus would land somewhere unseen),
* :func:`focus_control`: set the keyboard focus on a control (UIA ``SetFocus``).

The first three are pure functions over ``AccessibilityElement`` lists:
``tab_order`` reuses ``element_parse.reading_order`` for row banding and
``is_interactive_role`` reuses ``ax_tree_walk.humanize_role``, so no logic is
duplicated. ``focus_control`` is a thin dispatch onto the injectable
``accessibility.backends.get_backend()`` seam; the real ``SetFocus`` lives in the
Windows backend. Imports no ``PySide6``.
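
Reading order with row banding can be sketched as follows. This is an illustrative approximation, not the library's ``element_parse.reading_order``: the ``Element`` dataclass, the ``band`` tolerance of 10 pixels and the partial ``INTERACTIVE`` set are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for an accessibility element."""
    name: str
    role: str
    x: int
    y: int
    width: int
    height: int


INTERACTIVE = {"Button", "Edit", "CheckBox", "ComboBox"}  # partial, for illustration


def tab_order_sketch(elements: list[Element], band: int = 10) -> list[Element]:
    """Order focusable elements top-to-bottom, then left-to-right per row."""
    focusable = sorted((e for e in elements if e.role in INTERACTIVE),
                       key=lambda e: e.y)
    rows: list[list[Element]] = []
    for element in focusable:
        # Same row if its top edge is within `band` pixels of the row's first element.
        if rows and abs(element.y - rows[-1][0].y) <= band:
            rows[-1].append(element)
        else:
            rows.append([element])
    return [e for row in rows for e in sorted(row, key=lambda e: e.x)]


elements = [
    Element("OK", "Button", 200, 104, 80, 24),
    Element("Title", "Text", 0, 0, 300, 20),
    Element("Password", "Edit", 10, 100, 150, 24),
    Element("Username", "Edit", 10, 50, 150, 24),
]
order = [e.name for e in tab_order_sketch(elements)]
```

Banding is what lets "Password" and "OK" count as one row even though their top edges differ by a few pixels.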

Headless API
------------

.. code-block:: python

    from je_auto_control import (list_accessibility_elements, tab_order,
                                 audit_focus_order, focus_control)

    elements = list_accessibility_elements(app_name="myapp.exe")
    for el in tab_order(elements):            # the Tab visiting order
        print(el.name, el.role)

    report = audit_focus_order(elements)
    # {"order": [...], "issues": [...], "focusable_count": N, "issue_count": M}

    focus_control(name="Username", role="edit")   # put the cursor in the field

Focusability is role-based (the interactive roles: Button, Edit, CheckBox,
ComboBox, RadioButton, Hyperlink, ListItem, MenuItem, Slider, Tab/TabItem,
TreeItem, …). ``focus_control`` locates by ``name`` / ``role`` / ``app_name`` /
``automation_id`` like the other native-control actions and returns ``bool``.
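
A role-based audit of this kind might look like the sketch below. It is hypothetical throughout: ``audit_sketch`` is not the library function, elements are plain dicts rather than ``AccessibilityElement`` objects, the issue wording is invented, and the order here simply follows the input list instead of reading order. Only the role names and the four report keys are taken from this document.

```python
INTERACTIVE_ROLES = {"Button", "Edit", "CheckBox", "ComboBox", "RadioButton",
                     "Hyperlink", "ListItem", "MenuItem", "Slider", "Tab",
                     "TabItem", "TreeItem"}


def audit_sketch(elements: list[dict]) -> dict:
    """Flag focusable elements that have no visible area."""
    focusable = [e for e in elements if e["role"] in INTERACTIVE_ROLES]
    issues = [
        {"name": e["name"], "problem": "no visible area"}
        for e in focusable
        if e["width"] <= 0 or e["height"] <= 0
    ]
    return {
        "order": [e["name"] for e in focusable],
        "issues": issues,
        "focusable_count": len(focusable),
        "issue_count": len(issues),
    }


report = audit_sketch([
    {"name": "Username", "role": "Edit", "width": 150, "height": 24},
    {"name": "Hidden link", "role": "Hyperlink", "width": 0, "height": 0},
    {"name": "Caption", "role": "Text", "width": 80, "height": 16},
])
```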

Executor commands
-----------------

``AC_tab_order`` / ``AC_audit_focus_order`` (``app_name`` / ``max_results``) list
and audit the live app; ``AC_focus_control`` sets focus. They are exposed as the
matching ``ac_*`` MCP tools (the two reads read-only, ``ac_focus_control``
destructive) and as Script Builder commands under **Native UI**.
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -203,6 +203,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v178_features_doc
doc/new_features/v179_features_doc
doc/new_features/v180_features_doc
doc/new_features/v181_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
42 changes: 42 additions & 0 deletions docs/source/Zh/doc/new_features/v181_features_doc.rst
@@ -0,0 +1,42 @@
Extended UIA Control Patterns (Expand / Select / Range / Scroll)
================================================================

The accessibility backend originally provided only four control patterns: Value, Invoke, Toggle
and a read-only Grid dump. As a result, the controls automation encounters most often could not be
driven by their *native* pattern: a tree node could not be expanded, a list / combo item could not
be selected (SelectionItemPattern), a slider could not be set (RangeValuePattern), and a control
could not be scrolled into view (ScrollItemPattern), so these fell back to fragile pixel guessing.
``control_patterns`` adds these object-level actions on top of the existing accessibility backend
ABC.

Each function is a thin dispatch onto the injectable ``accessibility.backends.get_backend()`` seam
(the same seam the rest of the accessibility module uses), so the headless core can be unit-tested
on any platform by injecting a fake backend; the real UI Automation calls live in the Windows
backend (ExpandCollapse / SelectionItem / RangeValue / ScrollItem patterns). Backends that do not
implement a pattern raise ``AccessibilityNotAvailableError``. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import (expand_control, collapse_control,
                                 control_expand_state, select_control_item,
                                 control_range, set_control_range,
                                 scroll_control_into_view)

    expand_control(name="Documents", role="treeitem")   # expand a tree node
    select_control_item(name="Option B")                # select a list/combo item
    set_control_range(75, name="Volume")                # set a slider
    print(control_range(name="Volume"))  # {"value": 75.0, "minimum": 0, "maximum": 100}
    scroll_control_into_view(name="Row 200")            # bring a row on-screen

All of them locate the control by ``name`` / ``role`` / ``app_name`` / ``automation_id`` (the same
as the existing ``control_invoke`` / ``control_toggle``). The expand/select/scroll/set actions
return ``bool``; ``control_expand_state`` returns ``expanded`` / ``collapsed`` / ``partial`` /
``leaf`` (or ``None``); ``control_range`` returns ``{value, minimum, maximum}`` (or ``None``).

Executor commands
-----------------

``AC_expand_control`` / ``AC_collapse_control`` / ``AC_control_expand_state`` /
``AC_select_control_item`` / ``AC_control_range`` / ``AC_set_control_range`` /
``AC_scroll_control_into_view``. They are exposed as the matching ``ac_*`` MCP tools (the action
ones destructive, the reads read-only) and as Script Builder commands under **Native UI**.