diff --git a/WHATS_NEW.md b/WHATS_NEW.md index a0be99cc..4c39e190 100644 --- a/WHATS_NEW.md +++ b/WHATS_NEW.md @@ -1,5 +1,23 @@ # What's New — AutoControl +## What's new (2026-06-24) — Visual Saliency (where to look — spectral-residual) + +Find the region that stands out, with no template / colour / text. Full reference: [`docs/source/Eng/doc/new_features/v190_features_doc.rst`](docs/source/Eng/doc/new_features/v190_features_doc.rst). + +- **`saliency_map` / `salient_regions` / `most_salient`** (`AC_salient_regions`, `AC_most_salient`): when there's no template, colour or text to key on, an agent still needs a cue for *where to look*. This computes the spectral-residual saliency map (Hou & Zhang 2007 — log amplitude minus its local average, reconstructed through the phase) and turns it into ranked salient boxes in source pixel coordinates. The transform is a pure numpy FFT (`cv2.saliency` is in the forbidden opencv-contrib package, so it's re-implemented over base opencv); it reuses `visual_match`'s grayscale loader and `cv2_utils.blobs.connected_boxes`. Regions are thresholded at `mean + 2·std` by default. Saliency is a coarse attention cue, used to *narrow* where a template / OCR pass then looks. No `PySide6`. + +## What's new (2026-06-24) — Display-Scale / Visual-DPI Detection + +Infer which display scale (DPI) a template renders at — and how confidently. Full reference: [`docs/source/Eng/doc/new_features/v189_features_doc.rst`](docs/source/Eng/doc/new_features/v189_features_doc.rst). + +- **`detect_scale` / `scale_sweep`** (`AC_detect_scale`, `AC_scale_sweep`): a template cropped at 100% scale won't match on a 150%-DPI machine, and `match_template` returns only the single best match — discarding the per-scale scores. This keeps the whole profile: `scale_sweep` scores the template at every scale, and `detect_scale` reports the winning scale as a DPI inference (`scale_percent`) with a confidence `margin` (how far it beats the runner-up).
Reuses `visual_match._score_map` per scale; source is any ndarray / path / PIL image (or the live screen); scales default to the common Windows values. cv2/numpy lazily imported. No `PySide6`. + +## What's new (2026-06-24) — Image Quality Scoring (sharpness / contrast / brightness gate) + +Refuse to OCR a blurry or washed-out frame — score quality and gate before recognition. Full reference: [`docs/source/Eng/doc/new_features/v188_features_doc.rst`](docs/source/Eng/doc/new_features/v188_features_doc.rst). + +- **`image_quality` / `is_blurry` / `quality_gate`** (`AC_image_quality`, `AC_quality_gate`): OCR and template matching quietly fail on a blurry, washed-out or too-dark capture, and the caller can't tell a *missing* element from an *unreadable* one. This measures sharpness (variance of the Laplacian), contrast (grayscale stddev) and brightness (mean 0–255); `quality_gate` turns them into `{passed, issues}` flagging `blurry` / `low_contrast` / `too_dark` / `too_bright` so a script can pre-process or re-capture before OCR. Reuses `visual_match`'s grayscale loader (any ndarray / path / PIL image, or the live screen); cv2/numpy lazily imported. No `PySide6`. + ## What's new (2026-06-24) — Drop Files onto a Window (WM_DROPFILES) Complete a drag-and-drop programmatically — drop files onto a target window. Full reference: [`docs/source/Eng/doc/new_features/v187_features_doc.rst`](docs/source/Eng/doc/new_features/v187_features_doc.rst). 
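Reviewer note on the Image Quality entry above: the gate's pass / fail logic can be exercised without cv2 by feeding it precomputed metrics. The sketch below mirrors the checks in `quality_gate` from this diff, with the same default thresholds. `grade_metrics` is a hypothetical helper name used only for illustration; it is not part of `je_auto_control`.

```python
from typing import Dict, List, Tuple


def grade_metrics(metrics: Dict[str, float],
                  min_sharpness: float = 100.0,
                  min_contrast: float = 12.0,
                  brightness_range: Tuple[float, float] = (40.0, 220.0),
                  ) -> Dict[str, object]:
    """Hypothetical helper: apply the quality_gate checks to given metrics."""
    low, high = brightness_range
    issues: List[str] = []
    if metrics["sharpness"] < min_sharpness:
        issues.append("blurry")
    if metrics["contrast"] < min_contrast:
        issues.append("low_contrast")
    # Brightness is checked with if / elif, so only one of the two can fire.
    if metrics["brightness"] < low:
        issues.append("too_dark")
    elif metrics["brightness"] > high:
        issues.append("too_bright")
    return {**metrics, "passed": not issues, "issues": issues}


# Metrics for a sharp, well-lit frame and for a dim, soft one (made-up values).
sharp = grade_metrics({"sharpness": 842.1, "contrast": 58.3, "brightness": 131.0})
dim = grade_metrics({"sharpness": 35.0, "contrast": 20.0, "brightness": 22.0})
print(sharp["passed"], sharp["issues"])  # True []
print(dim["passed"], dim["issues"])      # False ['blurry', 'too_dark']
```

Separating the verdict from the capture this way makes the thresholds easy to unit-test; the real `quality_gate` computes the metrics itself through `image_quality`.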
diff --git a/docs/source/Eng/doc/new_features/v188_features_doc.rst b/docs/source/Eng/doc/new_features/v188_features_doc.rst new file mode 100644 index 00000000..f50e7471 --- /dev/null +++ b/docs/source/Eng/doc/new_features/v188_features_doc.rst @@ -0,0 +1,47 @@ +Image Quality Scoring (sharpness / contrast / brightness gate) +============================================================== + +OCR and template matching quietly fail on a blurry, washed-out or too-dark +capture — the locate returns nothing and the caller can't tell a *missing* +element from an *unreadable* one. ``image_quality`` measures the three things +that wreck recognition and gates on them: + +* **sharpness** — variance of the Laplacian (low = blurry / out of focus), +* **contrast** — standard deviation of the grayscale (low = washed out), +* **brightness** — mean grayscale 0–255 (too low = dark, too high = blown out). + +:func:`image_quality` returns the raw metrics, :func:`is_blurry` is the common +one-liner, and :func:`quality_gate` turns the metrics into a pass / fail verdict +with named issues, so a script can refuse to OCR a bad frame (or pre-process it +first). It reuses ``visual_match``'s grayscale loader, so the source is any +ndarray / path / PIL image (or the live screen when omitted); cv2 / numpy are +lazily imported. Imports no ``PySide6``. + +Headless API +------------ + +.. code-block:: python + + from je_auto_control import image_quality, is_blurry, quality_gate + + image_quality("frame.png") + # {"sharpness": 842.1, "contrast": 58.3, "brightness": 131.0} + + if is_blurry("frame.png", threshold=100): + ... # capture again / sharpen before OCR + + gate = quality_gate("frame.png", min_sharpness=100, min_contrast=12) + # {"sharpness": 842.1, "contrast": 58.3, "brightness": 131.0, + # "passed": True, "issues": []} + +``quality_gate`` flags ``blurry`` / ``low_contrast`` / ``too_dark`` / +``too_bright``; ``passed`` is True only when no issue fires.
``region`` applies to +a live-screen grab (omit ``source`` to grade the screen). Thresholds are tunable; +the defaults suit typical UI screenshots. + +Executor commands +----------------- + +``AC_image_quality`` (``source`` / ``region``) and ``AC_quality_gate`` (plus +``min_sharpness`` / ``min_contrast``). They are exposed as read-only ``ac_*`` MCP +tools and as Script Builder commands under **Image**. diff --git a/docs/source/Eng/doc/new_features/v189_features_doc.rst b/docs/source/Eng/doc/new_features/v189_features_doc.rst new file mode 100644 index 00000000..95f050ba --- /dev/null +++ b/docs/source/Eng/doc/new_features/v189_features_doc.rst @@ -0,0 +1,47 @@ +Display-Scale / Visual-DPI Detection +==================================== + +A template cropped at 100% display scale will not match pixel-for-pixel on a +machine running at 150% DPI — everything is 1.5x bigger. +``visual_match.match_template`` *can* sweep scales, but it returns only the +single best match's location and throws the per-scale scores away. ``scale_detect`` +keeps the whole profile: it scores the template against the haystack at a range +of scales and reports **which scale wins, by how much**, so an automation can +infer the effective UI scale / DPI and how confident that inference is. + +* :func:`scale_sweep` — the per-scale score profile (every scale's best match), +* :func:`detect_scale` — the winning scale as a DPI inference with a confidence + margin. + +It reuses ``visual_match._score_map`` (the full ``matchTemplate`` surface, +oriented higher = better) for each scale, so the source is any ndarray / path / +PIL image (or the live screen). cv2 / numpy are lazily imported. Imports no +``PySide6``. + +Headless API +------------ + +..
code-block:: python + + from je_auto_control import detect_scale, scale_sweep + + detect_scale("button.png", "screen.png") + # {"scale": 1.5, "scale_percent": 150, "score": 0.98, "center": [...], + # "margin": 0.62, "candidates": [...]} + + scale_sweep("button.png", scales=[1.0, 1.25, 1.5, 1.75, 2.0]) + # [{"scale": 1.0, "score": .., "center": [..]}, {"scale": 1.25, ...}, ...] + +``scales`` defaults to the common Windows display scales +``(1.0, 1.25, 1.5, 1.75, 2.0)``. ``margin`` is how far the winning scale beats the +runner-up — a low margin means the inference is ambiguous. Scales at which the +template is larger than the haystack are skipped; ``detect_scale`` returns +``None`` when none fit. Omit ``haystack`` to match against the live screen +(``region`` applies to that grab). + +Executor commands +----------------- + +``AC_detect_scale`` and ``AC_scale_sweep`` (``template`` / ``haystack`` / +``region`` / ``scales`` / ``method``). They are exposed as read-only ``ac_*`` MCP +tools and as Script Builder commands under **Image**. diff --git a/docs/source/Eng/doc/new_features/v190_features_doc.rst b/docs/source/Eng/doc/new_features/v190_features_doc.rst new file mode 100644 index 00000000..4ebf33a0 --- /dev/null +++ b/docs/source/Eng/doc/new_features/v190_features_doc.rst @@ -0,0 +1,49 @@ +Visual Saliency (where to look — spectral-residual) +=================================================== + +When there is no template, no known colour and no text to OCR, an agent still +needs a cue for *where to look* — the region that stands out from its +surroundings (a popup, a badge, a highlighted row). ``saliency`` computes the +spectral-residual saliency map (Hou & Zhang 2007) — ``log`` amplitude minus its +local average, reconstructed through the phase — and turns it into ranked salient +boxes. 
+ +* :func:`saliency_map` — the normalised (0–1) saliency map as an ndarray, +* :func:`salient_regions` — ranked salient boxes ``{x, y, width, height, center, + score}`` in source pixel coordinates, +* :func:`most_salient` — the single most salient region (the first place to look). + +The transform is a pure ``numpy`` FFT — ``cv2.saliency`` lives in the forbidden +opencv-contrib package, so it is re-implemented over base opencv only. It reuses +``visual_match``'s grayscale loader (any ndarray / path / PIL image, or the live +screen) and ``cv2_utils.blobs.connected_boxes`` for region extraction. cv2 / +numpy are lazily imported. Imports no ``PySide6``. + +Headless API +------------ + +.. code-block:: python + + from je_auto_control import saliency_map, salient_regions, most_salient + + most_salient("screen.png") + # {"x": 612, "y": 40, "width": 180, "height": 36, "center": [702, 58], + # "score": 0.82} + + for region in salient_regions("screen.png"): # most-salient first + ... + + sal = saliency_map("screen.png") # (64, 64) float32 in 0..1 + +Regions are thresholded at ``mean + 2·std`` of the saliency map by default (pass +``threshold`` to override), extracted with ``connected_boxes`` and scaled back to +the source's pixel coordinates. ``size`` is the (small) resolution the saliency is +computed at. Saliency is a coarse attention cue, not a precise detector — use it +to *narrow* where a template / OCR pass then looks. + +Executor commands +----------------- + +``AC_salient_regions`` and ``AC_most_salient`` (``source`` / ``region`` / ``size`` +/ ``threshold`` / ``min_area``). They are exposed as read-only ``ac_*`` MCP tools +and as Script Builder commands under **Image**. 
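Reviewer note on the `mean + 2·std` default described above: a minimal pure-Python sketch of that cut, assuming a population standard deviation (numpy's default) and a strict `>` comparison. Neither detail is confirmed by the part of `saliency.py` visible in this diff, and `salient_cells` is a hypothetical helper, not the package's API.

```python
from statistics import fmean, pstdev
from typing import List, Tuple


def salient_cells(grid: List[List[float]], k: float = 2.0
                  ) -> Tuple[float, List[Tuple[int, int]]]:
    """Hypothetical helper: (row, col) cells whose value exceeds mean + k*std."""
    flat = [value for row in grid for value in row]
    # Population std (pstdev), the same convention as numpy's default std().
    cut = fmean(flat) + k * pstdev(flat)
    hits = [(r, c) for r, row in enumerate(grid)
            for c, value in enumerate(row) if value > cut]
    return cut, hits


# A flat 4x4 stand-in for a saliency map, with one cell that stands out.
grid = [[0.1] * 4 for _ in range(4)]
grid[1][2] = 1.0
cut, hits = salient_cells(grid)
print(hits)  # [(1, 2)]
```

In the real pipeline the cells above the cut are grouped into boxes by `connected_boxes` and scaled back to source pixels; this sketch stops at the thresholding step.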
diff --git a/docs/source/Zh/doc/new_features/v188_features_doc.rst b/docs/source/Zh/doc/new_features/v188_features_doc.rst new file mode 100644 index 00000000..6c7b062d --- /dev/null +++ b/docs/source/Zh/doc/new_features/v188_features_doc.rst @@ -0,0 +1,42 @@ +影像品質評分(銳利度 / 對比 / 亮度門檻) +======================================= + +OCR 與模板比對在模糊、褪色或太暗的擷取畫面上會悄悄失敗——定位回傳空值,呼叫端無法分辨是元素 +*不存在*還是畫面*無法辨識*。``image_quality`` 量測三項會破壞辨識的指標並據以把關: + +* **sharpness(銳利度)**——Laplacian 的變異數(低 = 模糊 / 失焦), +* **contrast(對比)**——灰階的標準差(低 = 褪色), +* **brightness(亮度)**——灰階平均 0–255(太低 = 太暗,太高 = 過曝)。 + +:func:`image_quality` 回傳原始指標,:func:`is_blurry` 是常用的一行式,:func:`quality_gate` 把 +指標轉成通過 / 失敗的判定並附上具名問題,讓腳本可以拒絕對壞畫面做 OCR(或先做前處理)。它重用 +``visual_match`` 的灰階載入器,因此來源可為任何 ndarray / 路徑 / PIL 影像(省略時則為存活螢幕); +cv2 / numpy 為延遲匯入。不匯入 ``PySide6``。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import image_quality, is_blurry, quality_gate + + image_quality("frame.png") + # {"sharpness": 842.1, "contrast": 58.3, "brightness": 131.0} + + if is_blurry("frame.png", threshold=100): + ... 
# 在 OCR 前重新擷取 / 銳化 + + gate = quality_gate("frame.png", min_sharpness=100, min_contrast=12) + # {"sharpness": 842.1, "contrast": 58.3, "brightness": 131.0, + # "passed": True, "issues": []} + +``quality_gate`` 會標記 ``blurry`` / ``low_contrast`` / ``too_dark`` / +``too_bright``;只有在沒有任何問題時 ``passed`` 才為 True。``region`` 套用於存活螢幕擷取(省略 +``source`` 即評分螢幕)。門檻可調整;預設值適合一般 UI 截圖。 + +執行器指令 +---------- + +``AC_image_quality``(``source`` / ``region``)與 ``AC_quality_gate``(另加 +``min_sharpness`` / ``min_contrast``)。皆以唯讀 ``ac_*`` MCP 工具及 Script Builder 指令 +(位於 **Image** 分類下)形式提供。 diff --git a/docs/source/Zh/doc/new_features/v189_features_doc.rst b/docs/source/Zh/doc/new_features/v189_features_doc.rst new file mode 100644 index 00000000..e38de475 --- /dev/null +++ b/docs/source/Zh/doc/new_features/v189_features_doc.rst @@ -0,0 +1,40 @@ +顯示縮放 / 視覺 DPI 偵測 +======================= + +在 100% 顯示縮放下裁切的模板,在 150% DPI 的機器上不會逐像素吻合——一切都放大了 1.5 倍。 +``visual_match.match_template`` *可以* 掃過多個縮放,但它只回傳單一最佳吻合的位置,並把各縮放的 +分數丟棄。``scale_detect`` 保留整個剖面:它在一系列縮放下對 haystack 評分模板,並回報**哪個縮放 +勝出、勝出多少**,讓自動化能推測有效的 UI 縮放 / DPI,以及該推測的信心。 + +* :func:`scale_sweep` ——逐縮放的分數剖面(每個縮放的最佳吻合), +* :func:`detect_scale` ——勝出的縮放作為 DPI 推測,並附信心 margin。 + +它對每個縮放重用 ``visual_match._score_map``(完整的 ``matchTemplate`` 表面,方向為越高越好), +因此來源可為任何 ndarray / 路徑 / PIL 影像(或存活螢幕)。cv2 / numpy 為延遲匯入。不匯入 +``PySide6``。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import detect_scale, scale_sweep + + detect_scale("button.png", "screen.png") + # {"scale": 1.5, "scale_percent": 150, "score": 0.98, "center": [...], + # "margin": 0.62, "candidates": [...]} + + scale_sweep("button.png", scales=[1.0, 1.25, 1.5, 1.75, 2.0]) + # [{"scale": 1.0, "score": .., "center": [..]}, {"scale": 1.25, ...}, ...]
+ +``scales`` 預設為常見的 Windows 顯示縮放 ``(1.0, 1.25, 1.5, 1.75, 2.0)``。``margin`` 是勝出縮放 +領先次佳者的幅度——margin 低代表推測模稜兩可。模板大於 haystack 的縮放會被略過;當沒有任何縮放 +吻合時 ``detect_scale`` 回傳 ``None``。省略 ``haystack`` 即對存活螢幕比對(``region`` 套用於該 +擷取)。 + +執行器指令 +---------- + +``AC_detect_scale`` 與 ``AC_scale_sweep``(``template`` / ``haystack`` / ``region`` / +``scales`` / ``method``)。皆以唯讀 ``ac_*`` MCP 工具及 Script Builder 指令(位於 **Image** +分類下)形式提供。 diff --git a/docs/source/Zh/doc/new_features/v190_features_doc.rst b/docs/source/Zh/doc/new_features/v190_features_doc.rst new file mode 100644 index 00000000..6167c75c --- /dev/null +++ b/docs/source/Zh/doc/new_features/v190_features_doc.rst @@ -0,0 +1,42 @@ +視覺顯著度(該看哪裡——spectral-residual) +========================================== + +當沒有模板、沒有已知顏色、也沒有文字可 OCR 時,agent 仍需要一個*該看哪裡*的線索——也就是從 +周遭凸顯出來的區域(彈出視窗、徽章、被反白的列)。``saliency`` 計算 spectral-residual 顯著度圖 +(Hou & Zhang 2007)——``log`` 振幅減去其區域平均,再透過相位重建——並轉成排序後的顯著方框。 + +* :func:`saliency_map` ——正規化(0–1)的顯著度圖(ndarray), +* :func:`salient_regions` ——排序後的顯著方框 ``{x, y, width, height, center, score}`` + (以來源像素座標表示), +* :func:`most_salient` ——單一最顯著的區域(第一個該看的地方)。 + +此轉換為純 ``numpy`` FFT——``cv2.saliency`` 位於被禁用的 opencv-contrib 套件,故在 base opencv +上重新實作。它重用 ``visual_match`` 的灰階載入器(任何 ndarray / 路徑 / PIL 影像,或存活螢幕)與 +``cv2_utils.blobs.connected_boxes`` 做區域擷取。cv2 / numpy 為延遲匯入。不匯入 ``PySide6``。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import saliency_map, salient_regions, most_salient + + most_salient("screen.png") + # {"x": 612, "y": 40, "width": 180, "height": 36, "center": [702, 58], + # "score": 0.82} + + for region in salient_regions("screen.png"): # 最顯著者在前 + ... 
+ + sal = saliency_map("screen.png") # (64, 64) float32,範圍 0..1 + +區域預設以顯著度圖的 ``mean + 2·std`` 為門檻(可傳 ``threshold`` 覆寫),以 ``connected_boxes`` +擷取,並縮放回來源的像素座標。``size`` 是計算顯著度所用的(較小)解析度。顯著度是粗略的注意力 +線索,而非精確偵測器——用它來*縮小*接著由模板 / OCR 比對的範圍。 + +執行器指令 +---------- + +``AC_salient_regions`` 與 ``AC_most_salient``(``source`` / ``region`` / ``size`` / +``threshold`` / ``min_area``)。皆以唯讀 ``ac_*`` MCP 工具及 Script Builder 指令(位於 **Image** +分類下)形式提供。 diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py index 895a182d..0cd404e9 100644 --- a/je_auto_control/__init__.py +++ b/je_auto_control/__init__.py @@ -78,6 +78,16 @@ ) # Drop files onto a window (WM_DROPFILES sender) from je_auto_control.utils.file_drop import drop_files, plan_file_drop +# Image quality scoring (sharpness / contrast / brightness gate before OCR) +from je_auto_control.utils.image_quality import ( + image_quality, is_blurry, quality_gate, +) +# Display-scale / visual-DPI detection (per-scale match profile) +from je_auto_control.utils.scale_detect import detect_scale, scale_sweep +# Spectral-residual visual saliency (where to look — map + salient regions) +from je_auto_control.utils.saliency import ( + most_salient, salient_regions, saliency_map, +) # VLM element locator (headless) from je_auto_control.utils.vision import ( VLMNotAvailableError, click_by_description, locate_by_description, @@ -1652,6 +1662,9 @@ def start_autocontrol_gui(*args, **kwargs): "classify_format", "classify_formats", "diff_formats", "list_clipboard_formats", "clipboard_formats", "plan_file_drop", "drop_files", + "image_quality", "is_blurry", "quality_gate", + "detect_scale", "scale_sweep", + "saliency_map", "salient_regions", "most_salient", # VLM locator "VLMNotAvailableError", "locate_by_description", "click_by_description", "verify_description", diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py index 60844736..8019fb23 100644 --- 
a/je_auto_control/gui/script_builder/command_schema.py +++ b/je_auto_control/gui/script_builder/command_schema.py @@ -741,6 +741,70 @@ def _add_image_specs(specs: List[CommandSpec]) -> None: ), description="Detect a palette/view change vs a reference (illumination-robust).", )) + specs.append(CommandSpec( + "AC_image_quality", "Image", "Image Quality", + fields=( + FieldSpec("source", FieldType.FILE_PATH, optional=True), + FieldSpec("region", FieldType.STRING, optional=True, + placeholder=_REGION_PLACEHOLDER), + ), + description="Sharpness / contrast / brightness of an image or the screen.", + )) + specs.append(CommandSpec( + "AC_quality_gate", "Image", "Quality Gate (OCR-ready?)", + fields=( + FieldSpec("source", FieldType.FILE_PATH, optional=True), + FieldSpec("region", FieldType.STRING, optional=True, + placeholder=_REGION_PLACEHOLDER), + FieldSpec("min_sharpness", FieldType.FLOAT, optional=True, + default=100.0), + FieldSpec("min_contrast", FieldType.FLOAT, optional=True, + default=12.0), + ), + description="Pass/fail an image for OCR readability with named issues.", + )) + specs.append(CommandSpec( + "AC_detect_scale", "Image", "Detect Display Scale (DPI)", + fields=( + FieldSpec("template", FieldType.FILE_PATH), + FieldSpec("haystack", FieldType.FILE_PATH, optional=True), + FieldSpec("region", FieldType.STRING, optional=True, + placeholder=_REGION_PLACEHOLDER), + FieldSpec("scales", FieldType.STRING, optional=True, + placeholder="[1.0, 1.25, 1.5, 1.75, 2.0]"), + ), + description="Infer the display scale a template renders at (visual DPI).", + )) + specs.append(CommandSpec( + "AC_scale_sweep", "Image", "Scale Sweep (per-scale scores)", + fields=( + FieldSpec("template", FieldType.FILE_PATH), + FieldSpec("haystack", FieldType.FILE_PATH, optional=True), + FieldSpec("region", FieldType.STRING, optional=True, + placeholder=_REGION_PLACEHOLDER), + FieldSpec("scales", FieldType.STRING, optional=True, + placeholder="[1.0, 1.25, 1.5, 1.75, 2.0]"), + ), + 
description="Per-scale match-score profile of a template.", + )) + saliency_fields = ( + FieldSpec("source", FieldType.FILE_PATH, optional=True), + FieldSpec("region", FieldType.STRING, optional=True, + placeholder=_REGION_PLACEHOLDER), + FieldSpec("size", FieldType.INT, optional=True, default=64), + FieldSpec("threshold", FieldType.FLOAT, optional=True), + FieldSpec("min_area", FieldType.INT, optional=True, default=4), + ) + specs.append(CommandSpec( + "AC_salient_regions", "Image", "Salient Regions", + fields=saliency_fields, + description="Visually salient regions (spectral-residual; where to look).", + )) + specs.append(CommandSpec( + "AC_most_salient", "Image", "Most Salient Region", + fields=saliency_fields, + description="The single most visually salient region of an image/screen.", + )) specs.append(CommandSpec( "AC_changed_regions", "Image", "Changed Regions (motion)", fields=( diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py index dff38212..39a99780 100644 --- a/je_auto_control/utils/executor/action_executor.py +++ b/je_auto_control/utils/executor/action_executor.py @@ -4274,6 +4274,80 @@ def _drop_files(hwnd: Any, paths: Any, point: Any = None) -> Dict[str, Any]: return {"dropped": bool(dropped), "count": len(coerced)} +def _coerce_region(region: Any): + """Normalise a region argument (JSON '[x,y,w,h]' string / list / None).""" + import json + if isinstance(region, str): + return json.loads(region) if region.strip() else None + return region + + +def _image_quality(source: Any = None, region: Any = None) -> Dict[str, Any]: + """Adapter: sharpness / contrast / brightness of an image or the screen.""" + from je_auto_control.utils.image_quality import image_quality + return image_quality(source, region=_coerce_region(region)) + + +def _quality_gate(source: Any = None, region: Any = None, + min_sharpness: Any = 100.0, + min_contrast: Any = 12.0) -> Dict[str, Any]: + """Adapter: pass / fail an 
image for OCR readability with named issues.""" + from je_auto_control.utils.image_quality import quality_gate + return quality_gate(source, region=_coerce_region(region), + min_sharpness=float(min_sharpness), + min_contrast=float(min_contrast)) + + +def _coerce_scales(scales: Any): + """Normalise a scales argument (JSON '[1.0,1.5]' string / list / None).""" + import json + if isinstance(scales, str): + return json.loads(scales) if scales.strip() else None + return scales + + +def _detect_scale(template: Any, haystack: Any = None, region: Any = None, + scales: Any = None, + method: str = "ccoeff_normed") -> Dict[str, Any]: + """Adapter: infer the display scale a template renders at (visual DPI).""" + from je_auto_control.utils.scale_detect import detect_scale + result = detect_scale(template, haystack, region=_coerce_region(region), + scales=_coerce_scales(scales), method=str(method)) + return {"found": result is not None, "result": result} + + +def _scale_sweep(template: Any, haystack: Any = None, region: Any = None, + scales: Any = None, + method: str = "ccoeff_normed") -> Dict[str, Any]: + """Adapter: per-scale match-score profile of a template.""" + from je_auto_control.utils.scale_detect import scale_sweep + return {"sweep": scale_sweep(template, haystack, + region=_coerce_region(region), + scales=_coerce_scales(scales), + method=str(method))} + + +def _salient_regions(source: Any = None, region: Any = None, size: Any = 64, + threshold: Any = None, min_area: Any = 4) -> Dict[str, Any]: + """Adapter: ranked visually-salient regions of an image / the screen.""" + from je_auto_control.utils.saliency import salient_regions + cut = float(threshold) if threshold not in (None, "") else None + regions = salient_regions(source, region=_coerce_region(region), + size=int(size), threshold=cut, + min_area=int(min_area)) + return {"regions": regions, "count": len(regions)} + + +def _most_salient(source: Any = None, region: Any = None, size: Any = 64, + threshold: Any = 
None, min_area: Any = 4) -> Dict[str, Any]: + """Adapter: the single most visually-salient region (where to look).""" + from je_auto_control.utils.saliency import most_salient + cut = float(threshold) if threshold not in (None, "") else None + result = most_salient(source, region=_coerce_region(region), + size=int(size), threshold=cut, min_area=int(min_area)) + return {"found": result is not None, "region": result} + + def _image_histogram(source: Any = None, bins: Any = 32, space: str = "hsv", region: Any = None) -> Dict[str, Any]: """Adapter: per-channel colour histogram of an image / the screen.""" @@ -6496,6 +6570,12 @@ def __init__(self): "AC_diff_formats": _diff_formats, "AC_plan_file_drop": _plan_file_drop, "AC_drop_files": _drop_files, + "AC_image_quality": _image_quality, + "AC_quality_gate": _quality_gate, + "AC_detect_scale": _detect_scale, + "AC_scale_sweep": _scale_sweep, + "AC_salient_regions": _salient_regions, + "AC_most_salient": _most_salient, "AC_image_histogram": _image_histogram, "AC_histogram_changed": _histogram_changed, "AC_changed_regions": _changed_regions, diff --git a/je_auto_control/utils/image_quality/__init__.py b/je_auto_control/utils/image_quality/__init__.py new file mode 100644 index 00000000..279676f0 --- /dev/null +++ b/je_auto_control/utils/image_quality/__init__.py @@ -0,0 +1,6 @@ +"""Score image quality (sharpness / contrast / brightness) before OCR / matching.""" +from je_auto_control.utils.image_quality.image_quality import ( + image_quality, is_blurry, quality_gate, +) + +__all__ = ["image_quality", "is_blurry", "quality_gate"] diff --git a/je_auto_control/utils/image_quality/image_quality.py b/je_auto_control/utils/image_quality/image_quality.py new file mode 100644 index 00000000..086d4eb8 --- /dev/null +++ b/je_auto_control/utils/image_quality/image_quality.py @@ -0,0 +1,71 @@ +"""Score image quality before OCR / template matching. 
+ +OCR and template matching quietly fail on a blurry, washed-out or too-dark +capture — the locate returns nothing and the caller can't tell a *missing* +element from an *unreadable* one. ``image_quality`` measures the three things +that wreck recognition and gates on them: + +* **sharpness** — variance of the Laplacian (low = blurry / out of focus), +* **contrast** — standard deviation of the grayscale (low = washed out), +* **brightness** — mean grayscale 0–255 (too low = dark, too high = blown out). + +:func:`image_quality` returns the raw metrics, :func:`is_blurry` is the common +one-liner, and :func:`quality_gate` turns the metrics into a pass / fail verdict +with named issues so a script can refuse to OCR a bad frame (or pre-process it +first). It reuses ``visual_match``'s grayscale loader, so the source is any +ndarray / path / PIL image (or the live screen when omitted); ``region`` applies +to a live-screen grab. cv2 / numpy are lazily imported. Imports no ``PySide6``. +""" +from typing import Any, Dict, Optional, Sequence, Tuple + +ImageSource = Any + + +def _gray(source: Optional[ImageSource], region: Optional[Sequence[int]]): + from je_auto_control.utils.visual_match.visual_match import _haystack_gray + return _haystack_gray(source, region) + + +def image_quality(source: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None) -> Dict[str, float]: + """Return ``{sharpness, contrast, brightness}`` for an image (or live screen). + + ``sharpness`` is the variance of the Laplacian, ``contrast`` the grayscale + standard deviation, ``brightness`` the mean grayscale (0–255). 
+ """ + import cv2 + gray = _gray(source, region) + return {"sharpness": float(cv2.Laplacian(gray, cv2.CV_64F).var()), + "contrast": float(gray.std()), + "brightness": float(gray.mean())} + + +def is_blurry(source: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, + threshold: float = 100.0) -> bool: + """Return True if the image's Laplacian variance is below ``threshold``.""" + return image_quality(source, region=region)["sharpness"] < float(threshold) + + +def quality_gate(source: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, + min_sharpness: float = 100.0, min_contrast: float = 12.0, + brightness_range: Tuple[float, float] = (40.0, 220.0), + ) -> Dict[str, Any]: + """Grade an image for OCR readability: ``{..., passed, issues}``. + + ``issues`` flags ``blurry`` / ``low_contrast`` / ``too_dark`` / ``too_bright``; + ``passed`` is True only when no issue fires. + """ + metrics = image_quality(source, region=region) + low, high = brightness_range + issues = [] + if metrics["sharpness"] < float(min_sharpness): + issues.append("blurry") + if metrics["contrast"] < float(min_contrast): + issues.append("low_contrast") + if metrics["brightness"] < float(low): + issues.append("too_dark") + elif metrics["brightness"] > float(high): + issues.append("too_bright") + return {**metrics, "passed": not issues, "issues": issues} diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py index c3e2401d..b84715a1 100644 --- a/je_auto_control/utils/mcp_server/tools/_factories.py +++ b/je_auto_control/utils/mcp_server/tools/_factories.py @@ -3385,6 +3385,93 @@ def img_histogram_tools() -> List[MCPTool]: handler=h.histogram_changed, annotations=READ_ONLY, ), + MCPTool( + name="ac_image_quality", + description=("Measure image quality of 'source' (image path; default " + "screen grab of 'region'): {sharpness (Laplacian " + "variance — low=blurry), contrast (grayscale stddev), 
" + "brightness (mean 0-255)}."), + input_schema=schema({ + "source": {"type": "string"}, + "region": {"type": "array", "items": {"type": "integer"}}}), + handler=h.image_quality, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_quality_gate", + description=("Grade 'source' for OCR readability: {sharpness, " + "contrast, brightness, passed, issues}. 'issues' flags " + "blurry / low_contrast / too_dark / too_bright. Tune with " + "'min_sharpness' / 'min_contrast'."), + input_schema=schema({ + "source": {"type": "string"}, + "region": {"type": "array", "items": {"type": "integer"}}, + "min_sharpness": {"type": "number"}, + "min_contrast": {"type": "number"}}), + handler=h.quality_gate, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_detect_scale", + description=("Infer the display scale a 'template' renders at (visual " + "DPI) by scoring it against 'haystack' (default screen) " + "across 'scales'. Returns {found, result:{scale, " + "scale_percent, score, center, margin, candidates}}."), + input_schema=schema({ + "template": {"type": "string"}, + "haystack": {"type": "string"}, + "region": {"type": "array", "items": {"type": "integer"}}, + "scales": {"type": "array", "items": {"type": "number"}}, + "method": {"type": "string"}}, + required=["template"]), + handler=h.detect_scale, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_scale_sweep", + description=("Per-scale match-score profile of a 'template' against " + "'haystack' (default screen): {sweep:[{scale, score, x, " + "y, width, height, center}]} — the raw scores match_" + "template discards."), + input_schema=schema({ + "template": {"type": "string"}, + "haystack": {"type": "string"}, + "region": {"type": "array", "items": {"type": "integer"}}, + "scales": {"type": "array", "items": {"type": "number"}}, + "method": {"type": "string"}}, + required=["template"]), + handler=h.scale_sweep, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_salient_regions", + description=("Visually salient regions of 
'source' (image path; " + "default screen grab of 'region') via spectral-residual " + "saliency — where to look with no template/text. Returns " + "{regions:[{x,y,width,height,center,score}], count}."), + input_schema=schema({ + "source": {"type": "string"}, + "region": {"type": "array", "items": {"type": "integer"}}, + "size": {"type": "integer"}, + "threshold": {"type": "number"}, + "min_area": {"type": "integer"}}), + handler=h.salient_regions, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_most_salient", + description=("The single most visually salient region of 'source' " + "(default screen): {found, region:{x,y,width,height," + "center,score}}. The first place to look."), + input_schema=schema({ + "source": {"type": "string"}, + "region": {"type": "array", "items": {"type": "integer"}}, + "size": {"type": "integer"}, + "threshold": {"type": "number"}, + "min_area": {"type": "integer"}}), + handler=h.most_salient, + annotations=READ_ONLY, + ), ] diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py index 2bb64e51..c00840fe 100644 --- a/je_auto_control/utils/mcp_server/tools/_handlers.py +++ b/je_auto_control/utils/mcp_server/tools/_handlers.py @@ -2509,6 +2509,40 @@ def drop_files(hwnd, paths, point=None): return _drop_files(hwnd, paths, point) +def image_quality(source=None, region=None): + from je_auto_control.utils.executor.action_executor import _image_quality + return _image_quality(source, region) + + +def quality_gate(source=None, region=None, min_sharpness=100.0, + min_contrast=12.0): + from je_auto_control.utils.executor.action_executor import _quality_gate + return _quality_gate(source, region, min_sharpness, min_contrast) + + +def detect_scale(template, haystack=None, region=None, scales=None, + method="ccoeff_normed"): + from je_auto_control.utils.executor.action_executor import _detect_scale + return _detect_scale(template, haystack, region, scales, method) + + +def 
scale_sweep(template, haystack=None, region=None, scales=None, + method="ccoeff_normed"): + from je_auto_control.utils.executor.action_executor import _scale_sweep + return _scale_sweep(template, haystack, region, scales, method) + + +def salient_regions(source=None, region=None, size=64, threshold=None, + min_area=4): + from je_auto_control.utils.executor.action_executor import _salient_regions + return _salient_regions(source, region, size, threshold, min_area) + + +def most_salient(source=None, region=None, size=64, threshold=None, min_area=4): + from je_auto_control.utils.executor.action_executor import _most_salient + return _most_salient(source, region, size, threshold, min_area) + + def image_histogram(source=None, bins=32, space="hsv", region=None): from je_auto_control.utils.executor.action_executor import _image_histogram return _image_histogram(source, bins, space, region) diff --git a/je_auto_control/utils/saliency/__init__.py b/je_auto_control/utils/saliency/__init__.py new file mode 100644 index 00000000..396700d2 --- /dev/null +++ b/je_auto_control/utils/saliency/__init__.py @@ -0,0 +1,6 @@ +"""Spectral-residual visual saliency: map + ranked salient regions (numpy FFT).""" +from je_auto_control.utils.saliency.saliency import ( + most_salient, salient_regions, saliency_map, +) + +__all__ = ["saliency_map", "salient_regions", "most_salient"] diff --git a/je_auto_control/utils/saliency/saliency.py b/je_auto_control/utils/saliency/saliency.py new file mode 100644 index 00000000..a4dd92cd --- /dev/null +++ b/je_auto_control/utils/saliency/saliency.py @@ -0,0 +1,101 @@ +"""Find the visually salient regions of a frame (spectral-residual saliency). + +When there is no template, no known colour and no text to OCR, an agent still +needs a cue for *where to look* — the region that stands out from its +surroundings (a popup, a badge, a highlighted row). 
``saliency`` computes the +spectral-residual saliency map (Hou & Zhang 2007) — ``log`` amplitude minus its +local average, reconstructed through the phase — and turns it into ranked salient +boxes. + +The transform is a pure ``numpy`` FFT (``cv2.saliency`` lives in the forbidden +opencv-contrib package, so it is re-implemented here over base opencv only). It +reuses ``visual_match``'s grayscale loader for the source (any ndarray / path / +PIL image, or the live screen) and ``cv2_utils.blobs.connected_boxes`` for the +region extraction. cv2 / numpy are lazily imported. Imports no ``PySide6``. +""" +from typing import Any, Dict, List, Optional, Sequence, Tuple + +ImageSource = Any + + +def _gray(source: Optional[ImageSource], region: Optional[Sequence[int]]): + from je_auto_control.utils.visual_match.visual_match import _haystack_gray + return _haystack_gray(source, region) + + +def _saliency_from_gray(gray, size: int): + import cv2 + import numpy as np + small = cv2.resize(gray, (size, size), + interpolation=cv2.INTER_AREA).astype(np.float32) + fft = np.fft.fft2(small) + log_amplitude = np.log(np.abs(fft) + 1e-8) + residual = log_amplitude - cv2.blur(log_amplitude, (3, 3)) + recon = np.fft.ifft2(np.exp(residual + 1j * np.angle(fft))) + smoothed = cv2.GaussianBlur(np.abs(recon) ** 2, (0, 0), sigmaX=3.0) + peak = float(smoothed.max()) + if peak > 0: + smoothed = smoothed / peak + return smoothed.astype(np.float32) + + +def saliency_map(source: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, size: int = 64): + """Return the normalised (0–1) spectral-residual saliency map as an ndarray. + + The map is computed at ``size`` x ``size`` (the algorithm's native low + resolution); higher = more salient. 
+ """ + return _saliency_from_gray(_gray(source, region), int(size)) + + +def _regions_from_saliency(saliency, orig_shape: Tuple[int, int], + threshold: Optional[float], min_area: int, + size: int) -> List[Dict[str, Any]]: + from je_auto_control.utils.cv2_utils.blobs import connected_boxes + if threshold is not None: + cut = float(threshold) + else: # scale-invariant: regions standing 2 std above the mean saliency + cut = float(saliency.mean()) + 2.0 * float(saliency.std()) + mask = (saliency >= cut).astype("uint8") * 255 + orig_height, orig_width = int(orig_shape[0]), int(orig_shape[1]) + scale_x, scale_y = orig_width / float(size), orig_height / float(size) + regions: List[Dict[str, Any]] = [] + for box in connected_boxes(mask, min_area=min_area): + x, y = int(box["x"] * scale_x), int(box["y"] * scale_y) + width = max(1, int(box["width"] * scale_x)) + height = max(1, int(box["height"] * scale_y)) + patch = saliency[box["y"]:box["y"] + box["height"], + box["x"]:box["x"] + box["width"]] + score = float(patch.mean()) if patch.size else 0.0 + regions.append({"x": x, "y": y, "width": width, "height": height, + "center": [x + width // 2, y + height // 2], + "score": score}) + regions.sort(key=lambda region: region["score"], reverse=True) + return regions + + +def salient_regions(source: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, size: int = 64, + threshold: Optional[float] = None, + min_area: int = 4) -> List[Dict[str, Any]]: + """Return salient regions as ``[{x, y, width, height, center, score}]``. + + Boxes are thresholded from the saliency map (default cut = ``mean + 2 * std`` + of the map when ``threshold`` is None), extracted with ``connected_boxes`` and + scaled back to the source's pixel coordinates, ranked most-salient first. 
+ """ + gray = _gray(source, region) + saliency = _saliency_from_gray(gray, int(size)) + return _regions_from_saliency(saliency, gray.shape[:2], threshold, + int(min_area), int(size)) + + +def most_salient(source: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, size: int = 64, + threshold: Optional[float] = None, + min_area: int = 4) -> Optional[Dict[str, Any]]: + """Return the single most salient region, or ``None`` if none stand out.""" + regions = salient_regions(source, region=region, size=size, + threshold=threshold, min_area=min_area) + return regions[0] if regions else None diff --git a/je_auto_control/utils/scale_detect/__init__.py b/je_auto_control/utils/scale_detect/__init__.py new file mode 100644 index 00000000..666f9725 --- /dev/null +++ b/je_auto_control/utils/scale_detect/__init__.py @@ -0,0 +1,6 @@ +"""Detect the display scale / visual DPI a template renders at (per-scale profile).""" +from je_auto_control.utils.scale_detect.scale_detect import ( + detect_scale, scale_sweep, +) + +__all__ = ["detect_scale", "scale_sweep"] diff --git a/je_auto_control/utils/scale_detect/scale_detect.py b/je_auto_control/utils/scale_detect/scale_detect.py new file mode 100644 index 00000000..4e63fbd1 --- /dev/null +++ b/je_auto_control/utils/scale_detect/scale_detect.py @@ -0,0 +1,78 @@ +"""Detect the display scale a template renders at (visual DPI). + +A template cropped at 100% display scale will not match pixel-for-pixel on a +machine running at 150% DPI — everything is 1.5x bigger. ``visual_match. +match_template`` *can* sweep scales, but it returns only the single best match's +location and throws the per-scale scores away. ``scale_detect`` keeps the whole +profile: it scores the template against the haystack at a range of scales and +reports *which scale wins, by how much*, so an automation can infer the effective +UI scale / DPI and how confident that inference is. 
+ +It reuses ``visual_match._score_map`` (the full ``matchTemplate`` surface, +oriented higher = better) for each scale, so the source is any ndarray / path / +PIL image (or the live screen). cv2 / numpy are lazily imported. Imports no +``PySide6``. +""" +from typing import Any, Dict, List, Optional, Sequence + +ImageSource = Any +# Common Windows display scales (100% / 125% / 150% / 175% / 200%). +_DEFAULT_SCALES = (1.0, 1.25, 1.5, 1.75, 2.0) + + +def _score_at(template: ImageSource, haystack: Optional[ImageSource], + region: Optional[Sequence[int]], method: str, + scale: float) -> Optional[Dict[str, Any]]: + import cv2 + from je_auto_control.utils.visual_match.visual_match import _score_map + score_map, tmpl = _score_map(template, haystack, region=region, + method=method, scale=scale) + if score_map is None: + return None # template larger than haystack at this scale + _min_v, max_v, _min_loc, max_loc = cv2.minMaxLoc(score_map) + height, width = int(tmpl.shape[0]), int(tmpl.shape[1]) + x, y = int(max_loc[0]), int(max_loc[1]) + return {"scale": float(scale), "score": float(max_v), "x": x, "y": y, + "width": width, "height": height, + "center": [x + width // 2, y + height // 2]} + + +def scale_sweep(template: ImageSource, haystack: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, + scales: Optional[Sequence[float]] = None, + method: str = "ccoeff_normed") -> List[Dict[str, Any]]: + """Score ``template`` against the haystack at each scale. + + Returns ``[{scale, score, x, y, width, height, center}]`` (best match per + scale), skipping scales at which the template is larger than the haystack. 
+ """ + chosen = tuple(scales) if scales else _DEFAULT_SCALES + results = [] + for scale in chosen: + entry = _score_at(template, haystack, region, method, float(scale)) + if entry is not None: + results.append(entry) + return results + + +def detect_scale(template: ImageSource, haystack: Optional[ImageSource] = None, *, + region: Optional[Sequence[int]] = None, + scales: Optional[Sequence[float]] = None, + method: str = "ccoeff_normed") -> Optional[Dict[str, Any]]: + """Infer the display scale ``template`` renders at (its visual DPI). + + Returns ``{scale, scale_percent, score, center, margin, candidates}`` — the + winning scale, its percentage, the match score, and ``margin`` (how far it + beats the runner-up: a confidence in the inference). ``None`` if no scale + matched (template never fit the haystack). + """ + sweep = scale_sweep(template, haystack, region=region, scales=scales, + method=method) + if not sweep: + return None + ranked = sorted(sweep, key=lambda entry: entry["score"], reverse=True) + best = ranked[0] + margin = best["score"] - ranked[1]["score"] if len(ranked) > 1 else best["score"] + return {"scale": best["scale"], "scale_percent": round(best["scale"] * 100), + "score": best["score"], "center": best["center"], + "margin": float(margin), "candidates": sweep} diff --git a/test/unit_test/headless/test_image_quality_batch.py b/test/unit_test/headless/test_image_quality_batch.py new file mode 100644 index 00000000..1b43165e --- /dev/null +++ b/test/unit_test/headless/test_image_quality_batch.py @@ -0,0 +1,79 @@ +"""Headless tests for image-quality scoring (cv2 synthetic frames).""" +import pytest + +import je_auto_control as ac + +np = pytest.importorskip("numpy") +cv2 = pytest.importorskip("cv2") + +from je_auto_control.utils.image_quality import ( # noqa: E402 + image_quality, is_blurry, quality_gate, +) + + +def _sharp(): + rng = np.random.default_rng(0) + return rng.integers(0, 256, (120, 120)).astype("uint8") # noisy = high Laplacian var + + 
+def _blurry(): + return cv2.GaussianBlur(_sharp(), (0, 0), 8) # heavy blur = low var + + +def test_metrics_present_and_typed(): + metrics = image_quality(_sharp()) + assert set(metrics) == {"sharpness", "contrast", "brightness"} + assert all(isinstance(v, float) for v in metrics.values()) + assert 0.0 <= metrics["brightness"] <= 255.0 + + +def test_sharp_is_sharper_than_blurry(): + assert image_quality(_sharp())["sharpness"] > image_quality(_blurry())["sharpness"] + assert is_blurry(_sharp(), threshold=100.0) is False + assert is_blurry(_blurry(), threshold=100.0) is True + + +def test_quality_gate_pass_and_fail(): + good = quality_gate(_sharp()) + assert good["passed"] is True and good["issues"] == [] + bad = quality_gate(_blurry()) + assert bad["passed"] is False + assert "blurry" in bad["issues"] + + +def test_quality_gate_flags_dark(): + dark = np.full((80, 80), 5, "uint8") + report = quality_gate(dark) + assert report["passed"] is False + assert "too_dark" in report["issues"] + + +def test_quality_gate_brightness_range_tunable(): + mid = np.full((40, 40), 130, "uint8") + # a flat frame is blurry+low-contrast, but brightness must not be flagged + issues = quality_gate(mid, brightness_range=(40.0, 220.0))["issues"] + assert "too_dark" not in issues and "too_bright" not in issues + + +# --- wiring --------------------------------------------------------------- + +def test_executor_pure_path(): + from je_auto_control.utils.executor.action_executor import _quality_gate + report = _quality_gate(_blurry()) + assert report["passed"] is False and "blurry" in report["issues"] + + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_image_quality", "AC_quality_gate"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert {"ac_image_quality", "ac_quality_gate"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs 
+ specs = {s.command for s in _build_specs()} + assert {"AC_image_quality", "AC_quality_gate"} <= specs + + +def test_facade_exports(): + for name in ("image_quality", "is_blurry", "quality_gate"): + assert hasattr(ac, name) and name in ac.__all__ diff --git a/test/unit_test/headless/test_saliency_batch.py b/test/unit_test/headless/test_saliency_batch.py new file mode 100644 index 00000000..ccd9a964 --- /dev/null +++ b/test/unit_test/headless/test_saliency_batch.py @@ -0,0 +1,79 @@ +"""Headless tests for spectral-residual saliency (cv2/numpy synthetic frames).""" +import pytest + +import je_auto_control as ac + +np = pytest.importorskip("numpy") +pytest.importorskip("cv2") + +from je_auto_control.utils.saliency import ( # noqa: E402 + most_salient, salient_regions, saliency_map, +) + + +def _structured(): + """A dark frame with three bright blocks.""" + img = np.full((240, 320, 3), 20, "uint8") + img[40:80, 40:80] = 240 + img[150:190, 200:240] = 230 + img[100:130, 140:175] = 255 + return img + + +def test_saliency_map_shape_and_range(): + sal_map = saliency_map(_structured()) + assert sal_map.shape == (64, 64) + assert sal_map.dtype == np.float32 + assert float(sal_map.min()) >= 0.0 + assert float(sal_map.max()) <= 1.0 + + +def test_size_parameter_changes_map_resolution(): + assert saliency_map(_structured(), size=32).shape == (32, 32) + + +def test_salient_regions_in_bounds_and_ranked(): + regions = salient_regions(_structured()) + assert len(regions) >= 1 + for region in regions: + assert 0 <= region["x"] and region["x"] + region["width"] <= 320 + assert 0 <= region["y"] and region["y"] + region["height"] <= 240 + assert 0.0 <= region["score"] <= 1.0 + scores = [r["score"] for r in regions] + assert scores == sorted(scores, reverse=True) + + +def test_most_salient_matches_top_region(): + img = _structured() + top = most_salient(img) + assert top is not None and top == salient_regions(img)[0] + + +def test_high_threshold_yields_nothing(): + img = _structured() + 
assert salient_regions(img, threshold=2.0) == [] # above the normalised max + assert most_salient(img, threshold=2.0) is None + + +# --- wiring --------------------------------------------------------------- + +def test_executor_pure_path(): + from je_auto_control.utils.executor.action_executor import _salient_regions + out = _salient_regions(_structured()) + assert isinstance(out["regions"], list) and len(out["regions"]) >= 1 + + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_salient_regions", "AC_most_salient"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert {"ac_salient_regions", "ac_most_salient"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert {"AC_salient_regions", "AC_most_salient"} <= specs + + +def test_facade_exports(): + for name in ("saliency_map", "salient_regions", "most_salient"): + assert hasattr(ac, name) and name in ac.__all__ diff --git a/test/unit_test/headless/test_scale_detect_batch.py b/test/unit_test/headless/test_scale_detect_batch.py new file mode 100644 index 00000000..2dd18ba1 --- /dev/null +++ b/test/unit_test/headless/test_scale_detect_batch.py @@ -0,0 +1,84 @@ +"""Headless tests for display-scale / visual-DPI detection (cv2 synthetic frames).""" +import pytest + +import je_auto_control as ac + +np = pytest.importorskip("numpy") +cv2 = pytest.importorskip("cv2") + +from je_auto_control.utils.scale_detect import detect_scale, scale_sweep # noqa: E402 + + +def _template(): + rng = np.random.default_rng(1) + return rng.integers(0, 256, (40, 40, 3)).astype("uint8") + + +def _haystack_at(template, factor): + """Embed the template resized by ``factor`` into a blank canvas.""" + size = int(40 * factor) + big = cv2.resize(template, (size, size), interpolation=cv2.INTER_LINEAR) + canvas = np.zeros((260, 260, 3), 
"uint8") + canvas[60:60 + size, 50:50 + size] = big + return canvas + + +def test_detect_scale_finds_the_rendering_scale(): + template = _template() + result = detect_scale(template, _haystack_at(template, 1.5), + scales=(1.0, 1.25, 1.5, 1.75, 2.0)) + assert result["scale"] == pytest.approx(1.5) + assert result["scale_percent"] == 150 + assert result["margin"] > 0.3 # clearly beats the runner-up + # centre near the embedded 60x60 block at (50,60) + cx, cy = result["center"] + assert 70 <= cx <= 110 and 80 <= cy <= 120 + + +def test_detect_scale_at_unity(): + template = _template() + result = detect_scale(template, _haystack_at(template, 1.0)) + assert result["scale"] == pytest.approx(1.0) + assert result["scale_percent"] == 100 + + +def test_scale_sweep_returns_full_profile(): + template = _template() + sweep = scale_sweep(template, _haystack_at(template, 1.25), + scales=(1.0, 1.25, 1.5)) + assert [round(c["scale"], 2) for c in sweep] == [1.0, 1.25, 1.5] + best = max(sweep, key=lambda c: c["score"]) + assert best["scale"] == pytest.approx(1.25) + assert {"scale", "score", "x", "y", "width", "height", "center"} <= set(sweep[0]) + + +def test_detect_scale_none_when_template_too_big(): + template = _template() + assert detect_scale(template, np.zeros((10, 10, 3), "uint8")) is None + assert scale_sweep(template, np.zeros((10, 10, 3), "uint8")) == [] + + +# --- wiring --------------------------------------------------------------- + +def test_executor_pure_path(): + template = _template() + from je_auto_control.utils.executor.action_executor import _detect_scale + # pass ndarrays straight through (executor coerces only str args) + out = _detect_scale(template, _haystack_at(template, 2.0)) + assert out["found"] is True and out["result"]["scale_percent"] == 200 + + +def test_wiring(): + known = set(ac.executor.known_commands()) + assert {"AC_detect_scale", "AC_scale_sweep"} <= known + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = 
{t.name for t in build_default_tool_registry()} + assert {"ac_detect_scale", "ac_scale_sweep"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert {"AC_detect_scale", "AC_scale_sweep"} <= specs + + +def test_facade_exports(): + for name in ("detect_scale", "scale_sweep"): + assert hasattr(ac, name) and name in ac.__all__
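
Review note on `detect_scale`: the ranking and `margin` logic in this diff can be sketched in isolation, without cv2 or a real haystack. The helper name `summarise_sweep` and the scores below are hypothetical and only illustrate the arithmetic; the entries stand in for what `scale_sweep` returns.

```python
from typing import Any, Dict, List, Optional


def summarise_sweep(sweep: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Pick the best-scoring scale and report its lead over the runner-up."""
    if not sweep:
        return None  # no scale fit the haystack
    ranked = sorted(sweep, key=lambda entry: entry["score"], reverse=True)
    best = ranked[0]
    # With a single candidate there is no runner-up, so the margin falls back
    # to the score itself, as in the diff's detect_scale.
    if len(ranked) > 1:
        margin = best["score"] - ranked[1]["score"]
    else:
        margin = best["score"]
    return {"scale": best["scale"],
            "scale_percent": round(best["scale"] * 100),
            "score": best["score"],
            "margin": margin}


# Made-up scores, for illustration only.
example = [
    {"scale": 1.0, "score": 0.25},
    {"scale": 1.5, "score": 0.75},
    {"scale": 2.0, "score": 0.5},
]
summary = summarise_sweep(example)
print(summary)
```

One point a caller may want to keep in mind: when only one scale fits the haystack, `margin` equals the raw score rather than a lead over a competitor, so a threshold tuned on multi-scale margins may read that case as more confident than it is.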