Add ask_vlm method for cloud VLM alert verification #442
Draft
srnangi wants to merge 7 commits into
Add Groundlight.ask_vlm(images, query, model_id), which verifies one or two images against a natural-language query by calling POST /v1/vlm-queries. Returns a VLMVerificationResult dataclass with verdict (YES/NO/UNSURE), confidence, reasoning, and token cost.
- Accepts a single image or [full_frame, roi] for the dual-image strategy, reusing parse_supported_image_types for encoding.
- Moves the requests import to module level.
- Exports VLMVerificationResult from the package.
- Unit tests with mocked HTTP.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- POST query and model_id as multipart form fields (data=) instead of query-string params, matching the updated endpoint and keeping long prompts out of URLs and access logs.
- model_id is now a friendly alias (e.g. "gpt-5.4", "claude-sonnet-4.5") resolved server-side, not a raw Bedrock model ID.
- Tests updated to assert form-field transport.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the gpt-5.4 example (OpenAI models on Bedrock are text-only and cannot do image verification); use claude-sonnet-4.5 / nova-pro instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Match the generalized endpoint: param images -> media, multipart field 'media', guard raised from 2 to 8. The query should describe each media item (server makes no frame/ROI assumption). Docstring + tests updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Endpoint renamed server-side from vlm-queries to vlm-verifications. Update the SDK POST path and test fixtures accordingly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
sanitize_endpoint_url() strips the trailing slash from self.endpoint, so joining without "/" produced ".../device-apiv1/vlm-verifications" instead of ".../device-api/v1/vlm-verifications". Added test_url_has_correct_path to pin the correct URL shape. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
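The path bug described in this commit can be reproduced with plain string joining. This is a minimal sketch, not the SDK code: `strip_trailing_slash` is a hypothetical stand-in for `sanitize_endpoint_url()` that models only the trailing-slash behavior, and the host name is a placeholder.

```python
def strip_trailing_slash(url: str) -> str:
    # Hypothetical stand-in for sanitize_endpoint_url(); only the
    # trailing-slash stripping mentioned in the commit is modeled.
    return url.rstrip("/")


# Placeholder host; only the "/device-api" suffix comes from the commit text.
endpoint = strip_trailing_slash("https://example.invalid/device-api/")

# Joining without a separator fuses "device-api" and "v1".
broken_url = f"{endpoint}v1/vlm-verifications"

# A leading slash on the path restores the intended URL shape.
fixed_url = f"{endpoint}/v1/vlm-verifications"

print(broken_url)  # https://example.invalid/device-apiv1/vlm-verifications
print(fixed_url)   # https://example.invalid/device-api/v1/vlm-verifications
```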
b2b0755 to 263808d
What
Adds `Groundlight.ask_vlm(media, query, model_id)` for VLM-based alert verification. Calls `POST /v1/vlm-verifications` on the Groundlight cloud (AWS Bedrock) and returns a `VLMVerificationResult` with `verdict` (YES/NO/UNSURE), `confidence`, `reasoning`, and token cost fields.
Pairs with the janzu PR (zuuul#6519). No local inference; the VLM runs entirely in the cloud.
How
- `media` accepts 1–8 images (a single image or a list). Supported inputs are numpy BGR arrays, PIL Images, bytes, BytesIO/BufferedReader, or filename strings, encoded via the existing `parse_supported_image_types` utility.
- Sends `media` parts as `image/jpeg` files, and `query` and `model_id` as form fields (not URL params, so the prompt never leaks into access logs).
- `model_id` is a friendly alias (e.g. `"claude-sonnet-4.5"`, `"nova-pro"`); the server maps it to the real Bedrock model ID. When omitted, the server-configured default is used.
- Exports `VLMVerificationResult` from the package root.
Usage
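A minimal sketch of a call site, assuming the `ask_vlm(media, query, model_id)` signature and the `verdict`, `confidence`, and `reasoning` fields named in this PR. `FakeClient`, `confirm_alert`, the 0.8 threshold, the 0-to-1 confidence scale, and the file name are hypothetical; the fake client only exists so the example runs without a live endpoint, and a real `Groundlight` instance would take its place.

```python
from dataclasses import dataclass


@dataclass
class VLMVerificationResult:
    # Stand-in mirroring the fields named in the PR; token cost fields omitted.
    verdict: str       # "YES", "NO" or "UNSURE"
    confidence: float  # assumed to be on a 0-to-1 scale
    reasoning: str


class FakeClient:
    """Hypothetical stand-in for a Groundlight client (no network calls)."""

    def ask_vlm(self, media, query, model_id=None):
        return VLMVerificationResult("YES", 0.93, "A person is visible near the door.")


def confirm_alert(client, media, query, min_confidence=0.8):
    """Return True only when the VLM answers YES with enough confidence."""
    result = client.ask_vlm(media, query, model_id="claude-sonnet-4.5")
    return result.verdict == "YES" and result.confidence >= min_confidence


client = FakeClient()
confirmed = confirm_alert(client, "frame.jpg", "Is there a person at the loading dock door?")
print(confirmed)  # True
```

Treating `NO` and `UNSURE` the same way keeps the alert path conservative; a caller that wants to escalate `UNSURE` results for human review would branch on `result.verdict` instead.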
Changes
- `src/groundlight/client.py`: `ask_vlm` method + `VLMVerificationResult` dataclass
- `src/groundlight/__init__.py`: exports `VLMVerificationResult`
- `test/unit/test_ask_vlm.py`: 8 unit tests with mocked HTTP
Bug fix included
`sanitize_endpoint_url` strips the trailing slash from `self.endpoint`, so the original code produced `.../device-apiv1/vlm-verifications`. Fixed to `/v1/vlm-verifications` (with leading slash). Regression test added.
Testing
8 unit tests (mocked HTTP, no live server):
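- Images sent as `media` parts
- `query`/`model_id` sent as form fields, not URL params
- Omitting `model_id` omits the field entirely
- Exceeding the media limit raises `ValueError`
- URL has the correct path (`/device-api/v1/vlm-verifications`)
🤖 Generated with Claude Code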
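The transport checks listed under Testing could look roughly like the sketch below. It is not the PR's test file: `post_verification` is a hypothetical helper that takes the HTTP `post` callable as an argument so a `Mock` can stand in for it, the `files` tuples follow the requests-style multipart convention, and rejecting an empty list is an assumption (the PR only states the upper bound of 8).

```python
from unittest.mock import Mock

MAX_MEDIA = 8  # upper bound stated in the PR


def post_verification(post, base_url, media, query, model_id=None):
    """Hypothetical helper: send JPEG bytes plus form fields via `post`."""
    # Assumption: an empty list is rejected along with more than 8 items.
    if not 1 <= len(media) <= MAX_MEDIA:
        raise ValueError(f"media must contain 1 to {MAX_MEDIA} items")
    files = [
        ("media", (f"media_{i}.jpg", blob, "image/jpeg"))
        for i, blob in enumerate(media)
    ]
    data = {"query": query}
    if model_id is not None:
        data["model_id"] = model_id  # left out entirely when not given
    return post(f"{base_url}/v1/vlm-verifications", files=files, data=data)


post = Mock()
base = "https://example.invalid/device-api"  # placeholder host
post_verification(post, base, [b"jpeg-bytes"], "Is the gate open?")

args, kwargs = post.call_args
assert kwargs["data"] == {"query": "Is the gate open?"}  # form field, no model_id
assert "params" not in kwargs                            # nothing in the query string
assert [name for name, _ in kwargs["files"]] == ["media"]

try:
    post_verification(post, base, [b"x"] * 9, "too many items")
except ValueError as exc:
    print(exc)
```

Passing `post` in as a parameter is only a convenience for the sketch; patching the HTTP library with `unittest.mock.patch` would exercise the same assertions against the real method.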