OCR and template matching quietly fail on a blurry, washed-out or too-dark capture, and the caller can't tell a missing element from an unreadable one. Measure sharpness (variance of the Laplacian), contrast (grayscale stddev) and brightness (mean), and gate on them with named issues (blurry / low_contrast / too_dark / too_bright) so a script can pre-process or re-capture before OCR. Reuses visual_match's grayscale loader; cv2/numpy lazily imported.
…y-batch Add image_quality: sharpness/contrast/brightness gate before OCR
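The gate described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the thresholds and the function names `measure` and `quality_issues` are hypothetical, and a plain numpy 4-neighbour Laplacian stands in for the cv2 call so the sketch runs without OpenCV.

```python
import numpy as np

# Hypothetical thresholds for illustration; the PR does not state its defaults.
MIN_SHARPNESS = 100.0    # variance of the Laplacian
MIN_CONTRAST = 20.0      # grayscale standard deviation
MIN_BRIGHTNESS = 40.0    # grayscale mean on a 0-255 scale
MAX_BRIGHTNESS = 215.0


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the image interior."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def measure(gray):
    """Return the three metrics for a 2-D grayscale array."""
    return {
        "sharpness": laplacian_variance(gray),
        "contrast": float(gray.std()),
        "brightness": float(gray.mean()),
    }


def quality_issues(gray):
    """Return the named issues for a frame; an empty list means it passes."""
    m = measure(gray)
    issues = []
    if m["sharpness"] < MIN_SHARPNESS:
        issues.append("blurry")
    if m["contrast"] < MIN_CONTRAST:
        issues.append("low_contrast")
    if m["brightness"] < MIN_BRIGHTNESS:
        issues.append("too_dark")
    elif m["brightness"] > MAX_BRIGHTNESS:
        issues.append("too_bright")
    return issues
```

A caller would run OCR only when `quality_issues(frame)` is empty, and otherwise pre-process or re-capture based on the issue names.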
A template cropped at 100% scale won't match on a 150%-DPI machine, and match_template returns only the single best match, discarding the per-scale scores. scale_sweep keeps the whole profile (every scale's best match) and detect_scale reports the winning scale as a DPI inference with a confidence margin (how far it beats the runner-up). Reuses visual_match._score_map per scale; cv2/numpy lazily imported.
…-batch Add scale_detect: infer display scale / visual DPI from a template
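The winner-plus-margin step might look like the sketch below. The signatures are assumptions, not the PR's API: `score_at` is a hypothetical callback standing in for the per-scale template match, and the 96-DPI-at-100% baseline is an assumed convention for turning a scale into a DPI figure.

```python
BASE_DPI = 96  # assumed baseline: 100% scale corresponds to 96 DPI


def scale_sweep(score_at, scales):
    """Keep the whole profile: the best match score at every scale."""
    return {scale: float(score_at(scale)) for scale in scales}


def detect_scale(profile):
    """Pick the winning scale and report how far it beats the runner-up."""
    if not profile:
        raise ValueError("profile must contain at least one scale")
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    best_scale, best_score = ranked[0]
    # With a single scale there is no runner-up; compare against 0.0.
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    return {
        "scale": best_scale,
        "score": best_score,
        "margin": best_score - runner_up,
        "dpi": round(BASE_DPI * best_scale),
    }
```

A small margin means two scales matched almost equally well, so the inference should be treated as weak even when the top score is high.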
When there's no template, colour or text to key on, an agent still needs a cue for where to look. Compute the spectral-residual saliency map (Hou & Zhang 2007) and rank salient boxes in source coordinates. Pure numpy FFT (cv2.saliency is opencv-contrib, forbidden), reusing visual_match's grayscale loader and cv2_utils.blobs.connected_boxes; regions threshold at mean+2*std by default. A coarse attention cue to narrow where a template / OCR pass then looks.
Add saliency: spectral-residual visual saliency (where to look)
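For orientation, a numpy-only sketch of the spectral-residual map and the mean+2*std threshold is given below. It is an illustration under assumptions rather than the PR's code: the function names are hypothetical, a box blur replaces the Gaussian smoothing of the original method, the input is not downscaled first, and box extraction (done in the PR via cv2_utils.blobs.connected_boxes) is left out.

```python
import numpy as np


def box_blur(a, k=3):
    """Mean filter of odd size k with edge padding, in plain numpy."""
    pad = k // 2
    padded = np.pad(a, pad, mode="edge")
    out = np.zeros(a.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (k * k)


def spectral_residual_saliency(gray):
    """Saliency map scaled to [0, 1] for a 2-D grayscale array."""
    spectrum = np.fft.fft2(gray.astype(np.float64))
    log_amp = np.log(np.abs(spectrum) + 1e-9)   # epsilon avoids log(0)
    phase = np.angle(spectrum)
    # Spectral residual: log amplitude minus its local average.
    residual = log_amp - box_blur(log_amp, 3)
    # Back to image space, keeping the original phase.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = box_blur(sal, 5)   # smoothing; a box blur stands in for a Gaussian
    peak = sal.max()
    return sal / peak if peak > 0 else sal


def salient_mask(sal, k=2.0):
    """Pixels above mean + k*std of the saliency map."""
    return sal > sal.mean() + k * sal.std()
```

Connected regions of the mask would then be boxed and ranked, for example by summed saliency, before a template or OCR pass is pointed at them.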
Up to standards ✅🟢 Issues
| Metric | Results |
|---|---|
| Complexity | 77 |
| Duplication | 0 |



Release: vision lane (v188–v190)
Bundles the completed vision lane: three image-analysis features that reuse visual_match's loaders, each merged to dev CI-green (Codacy 0 / SonarCloud OK / all matrices + Docker).

- image_quality (v188, #406): sharpness / contrast / brightness metrics plus a quality_gate (blurry / low_contrast / too_dark / too_bright) to refuse OCR on a bad frame.
- scale_detect (v189, #407): infers the display scale / visual DPI a template renders at, with the per-scale score profile and confidence margin that match_template discards.
- saliency (v190, #408): spectral-residual visual saliency (pure numpy FFT) → ranked salient regions: where to look with no template / colour / text.

All three keep the metric / inference / transform logic headless-testable (cv2 via importorskip); cv2/numpy are lazily imported. EN/Zh docs (v188–v190) + WHATS_NEW entries included.