What?
Previously, KV Cache export was not supported, mainly for two reasons.
(1) torch.export doesn't allow input mutation (e.g. in torch 2.7)
- This is resolved in torch 2.10
(2) ONE CIRCLE doesn't allow in-memory buffer updates.
- TBD (Will they provide support?)
If the above limitations are lifted, we could directly convert the KV Cache into the corresponding Circle.
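A minimal sketch of the export side of (1) and (2), using a toy module with hypothetical names and shapes (not the project's code): the KV cache write is an in-place `index_copy_`, and once `torch.export` accepts that mutation it is recorded in the graph signature, which is exactly the in-memory state a Circle backend would have to update between decode steps.

```python
import torch


class TinyKVCache(torch.nn.Module):
    """Toy stand-in for a single-layer KV cache (hypothetical shapes)."""

    def __init__(self, max_len=8, dim=4):
        super().__init__()
        # Pre-allocated cache tensor, overwritten in place at each decode step.
        self.register_buffer("keys", torch.zeros(1, 1, max_len, dim))

    def forward(self, key_states, cache_position):
        # Same write pattern as the HF cache update: in-place index_copy_
        # along the sequence dimension.
        self.keys.index_copy_(2, cache_position, key_states)
        return self.keys


ep = torch.export.export(
    TinyKVCache(), (torch.randn(1, 1, 1, 4), torch.tensor([0]))
)
# The buffer mutation is tracked in the graph signature; this is the state
# a Circle runtime would need to update in memory between decode steps.
print(ep.graph_signature.buffers_to_mutate)
```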
Importance
- ASR/Llama/VLM models include self-attention/cross-attention, which implies handling multiple KV caches.
- speculative decoding
KV Cache's aten origins
ERROR:tico.utils.convert:NOT SUPPORTED OPERATOR
(op) index_put.default
(trace) File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/_dynamo/functional_export.py", line 216, in forward
res = self._export_root(*args, **kwargs)
File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/transformers/models/whisper/modeling_whisper.py", line 337, in forward
key_states, value_states = past_key_values.update(
File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/transformers/cache_utils.py", line 783, in update
keys, values = self.layers[layer_idx].update(key_states, value_states, cache_kwargs)
File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/transformers/cache_utils.py", line 340, in update
self.values.index_copy_(2, cache_position, value_states)
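A hedged reproduction sketch (hypothetical shapes and class name, not the Whisper model itself) of where the `index_put.default` above comes from: the cache update in `transformers/cache_utils.py` is an in-place `index_copy_` on the cache layer's `values` tensor, and after export plus core-ATen decomposition that write surfaces as an out-of-place index_put/index_copy node, which is the operator `tico.utils.convert` rejects.

```python
import torch


class CacheWrite(torch.nn.Module):
    def __init__(self, heads=2, max_len=16, head_dim=8):
        super().__init__()
        # Stand-in for a cache layer's values: (batch, heads, max_len, head_dim).
        self.register_buffer("values", torch.zeros(1, heads, max_len, head_dim))

    def forward(self, cache_position, value_states):
        # Same call as cache_utils.py in the trace above.
        self.values.index_copy_(2, cache_position, value_states)
        return self.values


ep = torch.export.export(
    CacheWrite(), (torch.tensor([3]), torch.randn(1, 2, 1, 8))
).run_decompositions()

# List the ATen ops in the decomposed graph; the cache write shows up as
# aten.index_put.default (or aten.index_copy.default, depending on the
# decomposition table of the installed torch version).
print({n.target for n in ep.graph.nodes if n.op == "call_function"})
```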