Release: vision lane — image quality + DPI detection + saliency (v188–v190) #409

Merged
JE-Chen merged 6 commits into main from dev
Jun 24, 2026

@JE-Chen JE-Chen commented Jun 24, 2026

Member

Release: vision lane (v188–v190)

Bundles the completed vision lane: three image-analysis features (image_quality, scale_detect, saliency) that reuse visual_match's loaders. Each was merged to dev with CI green (Codacy 0 issues, SonarCloud OK, all matrices and Docker passing).

All three keep the metric / inference / transform logic headless-testable (cv2 via importorskip); cv2/numpy are lazily imported. EN/Zh docs (v188–v190) + WHATS_NEW entries included.

JE-Chen added 6 commits June 24, 2026 14:25
OCR and template matching quietly fail on a blurry, washed-out or
too-dark capture, and the caller can't tell a missing element from an
unreadable one. Measure sharpness (variance of the Laplacian), contrast
(grayscale stddev) and brightness (mean), and gate on them with named
issues (blurry / low_contrast / too_dark / too_bright) so a script can
pre-process or re-capture before OCR. Reuses visual_match's grayscale
loader; cv2/numpy lazily imported.
…y-batch

Add image_quality: sharpness/contrast/brightness gate before OCR
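
A minimal sketch of the gate described in this commit message, not the module's actual code. The function name `assess_quality`, the threshold values and the returned dict layout are all assumptions made for illustration. The Laplacian is computed with plain numpy slicing on interior pixels only, so the sharpness value can differ slightly from an OpenCV-based measurement.

```python
import numpy as np


def laplacian_variance(gray):
    """Sharpness proxy: variance of a 4-neighbour Laplacian over interior pixels."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def assess_quality(gray, *, min_sharpness=100.0, min_contrast=20.0,
                   min_brightness=40.0, max_brightness=220.0):
    """Measure a grayscale image and list named issues.

    All thresholds here are hypothetical defaults, not the project's.
    """
    g = np.asarray(gray, dtype=np.float64)
    sharpness = laplacian_variance(g)   # variance of the Laplacian
    contrast = float(g.std())           # grayscale standard deviation
    brightness = float(g.mean())        # grayscale mean

    issues = []
    if sharpness < min_sharpness:
        issues.append("blurry")
    if contrast < min_contrast:
        issues.append("low_contrast")
    if brightness < min_brightness:
        issues.append("too_dark")
    if brightness > max_brightness:
        issues.append("too_bright")

    return {
        "sharpness": sharpness,
        "contrast": contrast,
        "brightness": brightness,
        "issues": issues,
        "ok": not issues,
    }
```

A caller could check `report["ok"]` before running OCR and, if it is false, use `report["issues"]` to decide between re-capturing and pre-processing (for example, contrast stretching for `low_contrast`).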
A template cropped at 100% scale won't match on a 150%-DPI machine, and
match_template returns only the single best match, discarding the
per-scale scores. scale_sweep keeps the whole profile (every scale's
best match) and detect_scale reports the winning scale as a DPI
inference with a confidence margin (how far it beats the runner-up).
Reuses visual_match._score_map per scale; cv2/numpy lazily imported.
…-batch

Add scale_detect: infer display scale / visual DPI from a template
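
The inference step described in this commit message can be sketched as follows, assuming the per-scale profile has already been computed and is available as a mapping from scale factor to best match score. The name `infer_scale_from_profile` and the result layout are hypothetical; this is not the signature of the project's `detect_scale`.

```python
def infer_scale_from_profile(profile):
    """Pick the winning scale from a {scale: best_score} profile.

    The margin is how far the winner's score beats the runner-up's.
    With a single scale there is no runner-up, so the margin is
    reported as 0.0 (an assumption of this sketch).
    """
    if not profile:
        raise ValueError("profile must contain at least one scale")

    # Highest score first; ties keep their original insertion order.
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    best_scale, best_score = ranked[0]
    margin = best_score - ranked[1][1] if len(ranked) > 1 else 0.0

    return {"scale": best_scale, "score": best_score, "margin": margin}
```

A small margin means two scales matched almost equally well, so a caller may prefer to treat the inferred scale as unreliable rather than act on it.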
When there's no template, colour or text to key on, an agent still
needs a cue for where to look. Compute the spectral-residual saliency
map (Hou & Zhang 2007) and rank salient boxes in source coordinates.
Pure numpy FFT (cv2.saliency is opencv-contrib, forbidden), reusing
visual_match's grayscale loader and cv2_utils.blobs.connected_boxes;
regions threshold at mean+2*std by default. A coarse attention cue to
narrow where a template / OCR pass then looks.
Add saliency: spectral-residual visual saliency (where to look)
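
A rough numpy-only sketch of the spectral-residual idea named in this commit message, under stated assumptions. The function names are hypothetical, a box blur stands in for the Gaussian smoothing usually applied to the final map, and the connected-box extraction step is omitted. Only the mean+2*std default threshold comes from the commit message.

```python
import numpy as np


def box_blur(a, k=3):
    """Mean filter with an odd k x k window; borders replicate edge values."""
    pad = k // 2
    padded = np.pad(a, pad, mode="edge")
    h, w = a.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def spectral_residual_saliency(gray, eps=1e-8):
    """Return a saliency map scaled to [0, 1], same shape as the input."""
    g = np.asarray(gray, dtype=np.float64)
    spectrum = np.fft.fft2(g)
    log_amp = np.log(np.abs(spectrum) + eps)   # eps avoids log(0)
    phase = np.angle(spectrum)

    # Spectral residual: log amplitude minus its local 3x3 average.
    residual = log_amp - box_blur(log_amp, 3)

    # Back to the image domain, keeping the original phase.
    recon = np.fft.ifft2(np.exp(residual + 1j * phase))

    # Squared magnitude, smoothed (box blur here instead of a Gaussian).
    sal = box_blur(np.abs(recon) ** 2, 5)
    peak = sal.max()
    return sal / peak if peak > 0 else sal


def salient_mask(sal, k=2.0):
    """Boolean mask of pixels above mean + k * std of the saliency map."""
    return sal > sal.mean() + k * sal.std()
```

The mask from `salient_mask` would then be grouped into boxes and ranked. That grouping is left out here because it depends on the project's own `connected_boxes` helper.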
@codacy-production


Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity 77 · duplication 0


@JE-Chen JE-Chen merged commit 2ba8465 into main Jun 24, 2026
31 checks passed