Fix HellaSwag #4982

ikawrakow · 2024-01-16T16:57:24Z

HellaSwag is broken on current master (see #4980). It is related to KV cache handling.

Instead of trying to sort that out, I changed the evaluation to process the full sequence (context + ending together) for each of the 4 endings.
The performance hit is surprisingly low: 400 tasks run in 55 seconds with a 7B model (fp16, not quantized), versus 49 seconds before this change. (Where is the time going? At least twice as many tokens are being evaluated.) The result for LLaMA-v2 is now 77.00, as we had before, versus 53.00 on master.

Left the existing version commented out for now.
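
For readers unfamiliar with the approach, here is a minimal sketch of scoring each ending by evaluating context + ending as one sequence, so nothing is reused from a shared KV cache. It is not the code from this PR: `Token`, `LogProbFn`, `toy_eval`, and `pick_ending` are hypothetical names, the model is a stand-in, and averaging the log-probability over the ending tokens is an assumption about the scoring rule.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

using Token = int;

// Hypothetical model interface: for a full token sequence, return one value
// per position, where result[i] = log P(seq[i] | seq[0..i-1]) and result[0]
// is unused.
using LogProbFn = std::function<std::vector<double>(const std::vector<Token> &)>;

// Toy stand-in for a real model (assumption, for illustration only):
// a token is "likely" (log-prob 0) when it equals the previous token + 1,
// otherwise "unlikely" (log-prob -1).
std::vector<double> toy_eval(const std::vector<Token> & seq) {
    std::vector<double> lp(seq.size(), 0.0);
    for (std::size_t i = 1; i < seq.size(); ++i) {
        lp[i] = (seq[i] == seq[i - 1] + 1) ? 0.0 : -1.0;
    }
    return lp;
}

// Evaluate context + ending as one independent sequence per ending and
// return the index of the ending with the highest average log-probability
// over its own tokens. Ties go to the earliest ending.
std::size_t pick_ending(const std::vector<Token> & context,
                        const std::vector<std::vector<Token>> & endings,
                        const LogProbFn & eval) {
    assert(!context.empty() && !endings.empty());

    std::size_t best = 0;
    double best_score = -std::numeric_limits<double>::infinity();

    for (std::size_t e = 0; e < endings.size(); ++e) {
        assert(!endings[e].empty());

        // Build the full sequence: the context is re-evaluated for every ending.
        std::vector<Token> seq = context;
        seq.insert(seq.end(), endings[e].begin(), endings[e].end());

        const std::vector<double> lp = eval(seq);
        assert(lp.size() == seq.size());

        // Only the ending tokens contribute to the score.
        double sum = 0.0;
        for (std::size_t i = context.size(); i < seq.size(); ++i) {
            sum += lp[i];
        }
        const double score = sum / static_cast<double>(endings[e].size());

        if (score > best_score) {
            best_score = score;
            best = e;
        }
    }
    return best;
}
```

Calling `pick_ending(context, endings, toy_eval)` with the tokenized context and the 4 tokenized endings returns the index of the predicted ending. Because the context is re-evaluated once per ending, the token count grows accordingly, which is the extra work mentioned above.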

ikawrakow · 2024-01-16T17:18:27Z

Closing in favor of #4981

Fix HellaSwag

d34472c

ikawrakow requested a review from ggerganov January 16, 2024 16:57

ikawrakow mentioned this pull request Jan 16, 2024

perplexity : fix kv cache handling for hellaswag #4981

Merged

ikawrakow closed this Jan 16, 2024
