This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Description
Description:
When using chat templates in Hugging Face, the Beginning-OfSentence (BOS) token is often already included in the template. However, Llama.cpp also automatically adds the BOS token, resulting in a duplicate BOS token.
Expected behavior:
The system should automatically detect and remove any duplicate BOS tokens in the chat template. This would ensure proper functioning of the chat system without causing errors due to redundant tokens.
Additional context:
This issue may cause unexpected behavior or errors in the chat system. It is recommended that Cortex checks for and deduplicates the BOS token if it is present in the user's template to maintain a consistent and error-free chat experience.