static_instruction + instruction pattern for context caching produces a permanently unstable fingerprint

## 🔴 Required Information

**Describe the Bug:**

`GeminiContextCacheManager` contains a method, `_find_count_of_contents_to_cache`, that appears designed to exclude the dynamic `instruction_provider` content from the cache fingerprint, but it is **never called**. As a result, the documented `static_instruction` + `instruction` pattern for context caching produces a fingerprint that changes on every turn, so the cache never hits.

**Root cause (code walkthrough):**

`static_instruction` + `instruction` is the ADK-recommended pattern for context caching: the static part goes to `system_instruction` (stable, fingerprinted), while the `instruction` provider result is appended to `llm_request.contents` as a user-role `Content` (dynamic, should be excluded from the fingerprint).

The intended mechanism for excluding this dynamic content exists in `gemini_context_cache_manager.py`:

```python
def _find_count_of_contents_to_cache(self, contents):
    """Find the number of contents to cache based on user content strategy.
    Strategy: Find the last continuous batch of user contents and cache
    all contents before them.
    """
    last_user_batch_start = len(contents)
    for i in range(len(contents) - 1, -1, -1):
        if contents[i].role == "user":
            last_user_batch_start = i
        else:
            break
    return last_user_batch_start
```

At turn 1, with `instruction_provider` appending a user-role block at the end, all contents are user-role, so this function would return **N=0**. The fingerprint would then be `hash(system_instruction + tools)` only, which is stable across all turns.
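To make the N=0 claim concrete, here is a minimal sketch that runs the same scan on stand-in data. `FakeContent` is a hypothetical placeholder (not an ADK type), and the function body is copied from the method quoted above as a free function.

```python
from dataclasses import dataclass


@dataclass
class FakeContent:
    """Hypothetical stand-in for a Content object; only `role` matters here."""
    role: str
    text: str = ""


def find_count_of_contents_to_cache(contents):
    """Same trailing-user-batch scan as the method quoted above."""
    last_user_batch_start = len(contents)
    for i in range(len(contents) - 1, -1, -1):
        if contents[i].role == "user":
            last_user_batch_start = i
        else:
            break
    return last_user_batch_start


# Turn 1: the user message and the appended dynamic block are both user-role.
turn_1 = [FakeContent("user", "hi"), FakeContent("user", "<dynamic t1>")]

# Turn 2: history now has a model response before the new user batch.
turn_2 = [
    FakeContent("user", "hi"),
    FakeContent("model", "hello"),
    FakeContent("user", "next question"),
    FakeContent("user", "<dynamic t2>"),
]

count_turn_1 = find_count_of_contents_to_cache(turn_1)
count_turn_2 = find_count_of_contents_to_cache(turn_2)
print(count_turn_1, count_turn_2)  # prints: 0 2
```

In both cases the trailing user batch, which contains the dynamic block, falls outside the counted prefix.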

However, in `handle_context_caching`, the actual fingerprint count is computed as:

```python
# No existing cache metadata - return fingerprint-only metadata
total_contents_count = len(llm_request.contents)  # ← bug: should use _find_count_of_contents_to_cache
fingerprint = self._generate_cache_fingerprint(llm_request, total_contents_count)
return CacheMetadata(fingerprint=fingerprint, contents_count=total_contents_count)
```

`_find_count_of_contents_to_cache` is **defined but never called anywhere in the codebase.**

**Why this breaks turn-by-turn:**

- Turn 1 contents (after `instruction_provider` appends): `[user_msg_1, dynamic_ctx_t1]` → N=2, fingerprint covers both
- Turn 2 contents (first N=2): `[user_msg_1, model_resp_1]`; the model response now occupies the slot where `dynamic_ctx_t1` was
- Fingerprint mismatch → N is reset to 4 (the total number of contents at turn 2) → the same problem repeats every turn
- Cache is **never created**
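The displacement described in the list above can be checked without any ADK code. This sketch assumes a simplified layout (prior history, the new user message, then the dynamic block last); `build_contents` and the `(role, text)` tuples are illustrative, not ADK structures.

```python
def build_contents(turn):
    """Return (role, text) pairs sent at 1-based `turn`.

    Hypothetical layout: prior history, the new user message, then the
    dynamic block appended last.
    """
    contents = []
    for t in range(1, turn):
        contents.append(("user", f"user_msg_{t}"))
        contents.append(("model", f"model_resp_{t}"))
    contents.append(("user", f"user_msg_{turn}"))
    contents.append(("user", f"dynamic_ctx_t{turn}"))
    return contents


window_is_stable = []
for turn in (1, 2, 3):
    recorded = build_contents(turn)
    n = len(recorded)  # count recorded when every content is included
    next_window = build_contents(turn + 1)[:n]  # first N contents one turn later
    window_is_stable.append(recorded == next_window)
    print(f"turn {turn}: N={n}, recorded tail={recorded[-1][1]}, "
          f"next tail={next_window[-1][1]}")

print(window_is_stable)  # prints: [False, False, False]
```

On every turn the last slot of the recorded window holds a dynamic block, and one turn later the same slot holds a model response, so the first-N window never matches.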

**Steps to Reproduce:**

1. Create an `LlmAgent` with `static_instruction` (stable string) and `instruction` (dynamic provider returning session-dependent content)
2. Enable `ContextCacheConfig` on the `App`
3. Run a multi-turn conversation
4. Set `GOOGLE_ADK_LOG_LEVEL=DEBUG` and observe the logs

**Expected Behavior:**

The `instruction_provider` content (user-role, appended at end of `contents`) is excluded from the cache fingerprint. The fingerprint covers only `system_instruction + tools`, which is stable across turns. The cache is created on turn 2 and reused on subsequent turns as long as `system_instruction` and `tools` do not change.

**Observed Behavior:**

The fingerprint includes the `instruction_provider` content (via `len(llm_request.contents)`). Since that content changes each turn (or is displaced by the model's response in the first-N window), the fingerprint changes on every turn. Debug logs show:

```
Cache content fingerprint mismatch
Fingerprints don't match, returning fingerprint-only metadata
```

The cache is never created. `cache_hit_pct = 0%`.

**Proposed Fix:**

In `handle_context_caching`, replace `len(llm_request.contents)` with the existing (but uncalled) `_find_count_of_contents_to_cache`:

```python
# Before (buggy):
total_contents_count = len(llm_request.contents)

# After (fix):
total_contents_count = self._find_count_of_contents_to_cache(llm_request.contents)
```

This aligns the implementation with the documented `static_instruction` + `instruction` pattern and with the evident design intent of `_find_count_of_contents_to_cache`.
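As a rough sanity check of the proposed change, the sketch below compares the two counting strategies in a simplified model of the flow described in this issue. It is not ADK's actual code: `fingerprint`, `simulate`, and `build_contents` are hypothetical helpers, tools are omitted, and a match is assumed to leave the stored metadata unchanged.

```python
import hashlib

SYSTEM_INSTRUCTION = "static prompt"  # placeholder for the stable system instruction


def build_contents(turn):
    """(role, text) pairs for a 1-based turn; the dynamic block comes last."""
    contents = []
    for t in range(1, turn):
        contents += [("user", f"user_msg_{t}"), ("model", f"model_resp_{t}")]
    contents += [("user", f"user_msg_{turn}"), ("user", f"dynamic_ctx_t{turn}")]
    return contents


def trailing_user_batch_start(contents):
    """Index where the last continuous run of user contents begins."""
    start = len(contents)
    for i in range(len(contents) - 1, -1, -1):
        if contents[i][0] != "user":
            break
        start = i
    return start


def fingerprint(contents, count):
    """Hash the system instruction plus the first `count` contents."""
    digest = hashlib.sha256(SYSTEM_INSTRUCTION.encode())
    for role, text in contents[:count]:
        digest.update(f"{role}:{text}".encode())
    return digest.hexdigest()


def simulate(count_fn, turns=4):
    """Return, for turns 2..turns, whether the stored fingerprint matched."""
    stored = None  # (fingerprint, contents_count) from an earlier turn
    matches = []
    for turn in range(1, turns + 1):
        contents = build_contents(turn)
        matched = False
        if stored is not None:
            stored_fp, stored_count = stored
            matched = fingerprint(contents, stored_count) == stored_fp
            matches.append(matched)
        if not matched:
            # First turn or mismatch: record fresh fingerprint-only metadata.
            count = count_fn(contents)
            stored = (fingerprint(contents, count), count)
    return matches


current = simulate(len)                         # counts every content
proposed = simulate(trailing_user_batch_start)  # excludes the trailing user batch
print(current, proposed)
```

Under these assumptions, counting all contents mismatches on every later turn, while excluding the trailing user batch matches on every later turn.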

**Environment Details:**

- ADK Library Version: `google-adk==1.32.0`
- Desktop OS: macOS (Darwin 24.6.0)
- Python Version: 3.13.11

**Model Information:**

- LiteLLM: No
- Model: `gemini-2.0-flash-lite` (Gemini API)

---

## 🟡 Optional Information

**Minimal Reproduction Code:**

```python
from google.adk.agents import LlmAgent
from google.adk.apps.app import App
from google.adk.agents.context_cache_config import ContextCacheConfig
from google.adk.agents.readonly_context import ReadonlyContext
from google.adk.models import Gemini

_STATIC_PROMPT = "You are a helpful assistant. " * 300  # must push the request (with tools) above min_tokens=4096; raise the multiplier if the request is smaller

def dynamic_instruction(context: ReadonlyContext) -> str:
    # Simulates per-turn dynamic content (e.g. session state)
    return f"<session_state>turn_data={context.state.get('turn', 0)}</session_state>"

agent = LlmAgent(
    name="test_agent",
    model=Gemini(model="gemini-2.0-flash-lite"),
    static_instruction=_STATIC_PROMPT,
    instruction=dynamic_instruction,
)

app = App(
    name="test",
    root_agent=agent,
    context_cache_config=ContextCacheConfig(ttl_seconds=1800, min_tokens=4096),
)
# Run multi-turn: observe "fingerprint mismatch" in DEBUG logs on every turn
```

**How often has this issue occurred?:** Always (100%)

**Additional Context:**

The current workaround is to inject dynamic content via a `before_model_callback` that calls `llm_request.contents.insert(0, ...)` instead of using `instruction_provider`. Because the dynamic block is then at position 0 on every turn, the first-N fingerprint window consistently starts with it, and the fingerprint is stable as long as the dynamic content itself doesn't change. This achieves the caching behavior intended for `instruction_provider`, although the dynamic block sits at the start of `contents` rather than the end, and it should not be necessary.
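A minimal sketch of that workaround's shape follows. To stay runnable without ADK installed it uses hypothetical stand-ins (`FakeContent`, `FakeLlmRequest`, and a plain dict in place of the callback context); in a real agent the callback would receive ADK's callback context and `LlmRequest` and would build a real `Content` object.

```python
from dataclasses import dataclass, field


@dataclass
class FakeContent:
    """Hypothetical stand-in for a user-role Content object."""
    role: str
    text: str


@dataclass
class FakeLlmRequest:
    """Hypothetical stand-in exposing only the `contents` list."""
    contents: list = field(default_factory=list)


def make_dynamic_context_callback(render):
    """Build a before_model_callback-style function that prepends dynamic context.

    `render` maps the callback context to the dynamic text for this turn.
    """
    def before_model_callback(callback_context, llm_request):
        block = FakeContent(role="user", text=render(callback_context))
        llm_request.contents.insert(0, block)  # dynamic block always at index 0
        return None  # None lets the model call proceed as usual
    return before_model_callback


callback = make_dynamic_context_callback(
    lambda ctx: f"<session_state>turn_data={ctx['turn']}</session_state>"
)

request = FakeLlmRequest(contents=[FakeContent("user", "hello")])
result = callback({"turn": 3}, request)
print([c.text for c in request.contents])
```

The callback would then be passed to the agent through its `before_model_callback` parameter in place of the dynamic `instruction` provider.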
