Is GenerationConfig.repetitionPenalty used during generation? #84

I am testing the code using the [Core ML version of Llama 2](https://huggingface.co/coreml-projects/Llama-2-7b-chat-coreml).

Setting `GenerationConfig.maxLength` to something larger than the default, e.g., `64`, produces the correct number of output tokens, but the output tends to repeat tokens towards the end of generation. Adjusting `repetitionPenalty` doesn't seem to have any effect.

Looking into `Generation.swift`, I see that the code references `maxLength`, `eosTokenId`, `temperature`, and others, but not `repetitionPenalty`. Could this be why the output is repetitive?
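For reference, this is roughly the step I expected to find in the generation loop. It is only a minimal sketch of the commonly used repetition penalty (the same rule the Python `transformers` library applies in its `RepetitionPenaltyLogitsProcessor`): the logit of every token that has already been generated is divided by the penalty if it is positive and multiplied by it if it is negative. The function name `applyRepetitionPenalty` and its parameters are hypothetical and do not exist in this repository.

```swift
// Hypothetical helper, not part of this repository.
// Returns a copy of `logits` in which previously generated tokens are penalized.
func applyRepetitionPenalty(
    to logits: [Float],
    generatedTokens: [Int],
    penalty: Float
) -> [Float] {
    // A penalty of 1.0 leaves the logits unchanged.
    guard penalty != 1.0 else { return logits }

    var adjusted = logits
    // Penalize each distinct token once, no matter how often it has appeared.
    // Token ids outside the vocabulary range are skipped.
    for token in Set(generatedTokens) where adjusted.indices.contains(token) {
        let score = adjusted[token]
        // Dividing a positive logit or multiplying a negative one both make
        // the token less likely when penalty > 1.
        adjusted[token] = score < 0 ? score * penalty : score / penalty
    }
    return adjusted
}
```

I would expect something like this to run on the logits of the last position before the temperature and sampling steps, with `generatedTokens` being the prompt plus everything produced so far.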
