This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit 7da0cf5

Merge branch 'hengguo/h2o' of https://github.com/intel/intel-extension-for-transformers into hengguo/h2o
2 parents 3723158 + 0c547c5 commit 7da0cf5

examples/huggingface/pytorch/text-generation/h2o/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

-**Heavy-Hitter Oracal (H2O)** is a novel approach for implementing the KV cache wihich significantly reduces memory footprint.
+**Heavy-Hitter Oracal (H2O)** is a novel approach for implementing the KV cache which significantly reduces memory footprint.

 This methods base on the fact that the accumulated attention scores of all tokens in attention blocks adhere to a power-law distribution. It suggests that there exists a small set of influential tokens that are critical during generation, named heavy-hitters (H2). H2 provides an opportunity to step away from the combinatorial search problem and identify an eviction policy that maintains accuracy.
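The eviction idea described in the README paragraph can be illustrated with a small, library-free sketch. It is not the code in this repository: the function names `accumulate_scores` and `select_kept_tokens` and the `heavy_budget` / `recent_budget` parameters are hypothetical, and keeping a window of the most recent tokens alongside the heavy-hitters is an assumption taken from the H2O paper's description, not from this README. The sketch sums the attention each cached token has received so far and keeps only the highest-scoring older tokens plus the recent window.

```python
from typing import List, Sequence


def accumulate_scores(attn_rows: Sequence[Sequence[float]]) -> List[float]:
    """Sum, per cached token, the attention weight it received at every decoding step.

    attn_rows[t] holds the (causal) attention weights of step t over tokens 0..t.
    """
    n = max(len(row) for row in attn_rows)
    acc = [0.0] * n
    for row in attn_rows:
        for j, w in enumerate(row):
            acc[j] += w
    return acc


def select_kept_tokens(
    acc_scores: Sequence[float], heavy_budget: int, recent_budget: int
) -> List[int]:
    """Return the sorted indices of KV-cache entries to keep (hypothetical helper).

    Keeps the `recent_budget` newest tokens (assumption, see lead-in) and the
    `heavy_budget` older tokens with the largest accumulated attention score.
    """
    n = len(acc_scores)
    if heavy_budget + recent_budget >= n:
        return list(range(n))  # cache is within budget, nothing to evict
    recent_start = n - recent_budget
    recent = list(range(recent_start, n))
    older = range(recent_start)
    heavy = sorted(older, key=lambda j: acc_scores[j], reverse=True)[:heavy_budget]
    return sorted(heavy) + recent


# Toy example: five decoding steps, each row is one step's attention weights.
rows = [
    [1.0],
    [0.7, 0.3],
    [0.6, 0.1, 0.3],
    [0.5, 0.05, 0.25, 0.2],
    [0.4, 0.05, 0.3, 0.05, 0.2],
]
acc = accumulate_scores(rows)
print(select_kept_tokens(acc, heavy_budget=2, recent_budget=1))  # [0, 2, 4]
```

In this toy run, tokens 0 and 2 collect most of the accumulated attention, so they survive as heavy-hitters while tokens 1 and 3 are evicted; token 4 is kept only because it falls inside the recent window. A real implementation would apply this per attention head and per layer on tensors rather than Python lists.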
