This repository was archived by the owner on Sep 23, 2025. It is now read-only.

Conversation


@harborn harborn commented Jul 5, 2024

  1. Fix the default tokenize function to enable padding.
  2. Instantiate the training arguments before loading the dataset and tokenizer, to work around a DeepSpeed training bug.
  3. Enable more training arguments on HPU, such as attn_softmax_bf16, use_flash_attention, and pipelining_fwd_bwd.
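The first fix concerns padding in the tokenize step: batched training requires every example in a batch to have the same length, so the tokenize function must pad shorter sequences. A minimal, self-contained sketch of that idea (all names here — `tokenize_fn`, `PAD_ID`, `MAX_LEN` — are illustrative, not taken from the PR's actual code):

```python
# Hypothetical sketch of a padding tokenize function, as fix (1) describes.
# Real fine-tuning code would call a tokenizer with padding="max_length";
# this stands in for that behavior with plain lists of token ids.

PAD_ID = 0   # assumed pad token id
MAX_LEN = 8  # assumed fixed sequence length

def tokenize_fn(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Truncate, then right-pad a token-id list so every example in a
    batch has the same length; the attention mask marks real tokens."""
    ids = token_ids[:max_len]
    attention_mask = [1] * len(ids) + [0] * (max_len - len(ids))
    ids = ids + [pad_id] * (max_len - len(ids))
    return {"input_ids": ids, "attention_mask": attention_mask}

example = tokenize_fn([5, 6, 7])
print(example)
# → {'input_ids': [5, 6, 7, 0, 0, 0, 0, 0],
#    'attention_mask': [1, 1, 1, 0, 0, 0, 0, 0]}
```

Without such padding, variable-length examples cannot be stacked into a single tensor, which is the kind of failure fix (1) addresses.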

@harborn harborn force-pushed the fix-fine-tuning-bugs branch from e1b0418 to fd4d31f on July 17, 2024 03:00