TIL: fix trash output for code llama for long prompt #24

sfc-gh-zhwang · 2023-10-08T02:06:57Z

When the prompt is very long, the kernel's shared memory requirement grows beyond 48 KB, which is the default per-block limit in CUDA. Above that size, we need to raise the shared memory limit for the kernel manually.

https://leimao.github.io/blog/CUDA-Shared-Memory-Capacity/#:~:text=However%2C%20CUDA%20shared%20memory%20has,is%2048%20KB%20by%20default.
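For readers unfamiliar with the opt-in, here is a minimal sketch of the general technique, not the actual change in this PR. The kernel `scale_kernel`, the helper names, and the one-float-per-token sizing are all hypothetical. Only `cudaFuncSetAttribute`, `cudaFuncAttributeMaxDynamicSharedMemorySize`, and `cudaDevAttrMaxSharedMemoryPerBlockOptin` are real CUDA runtime names. A launch that requests more than 48 KB of dynamic shared memory without the opt-in fails, and if that error goes unchecked the output buffer is never written, which is presumably where the trash output came from.

```cuda
#include <cassert>
#include <cstddef>
#include <cuda_runtime.h>

// Default per-block limit for dynamic shared memory when no opt-in is made.
constexpr size_t kDefaultSmemLimit = 48 * 1024;

// Hypothetical kernel: stages one float per prompt token in dynamic shared memory.
__global__ void scale_kernel(const float* in, float* out, int n)
{
    extern __shared__ float buf[];
    for (int i = threadIdx.x; i < n; i += blockDim.x) buf[i] = in[i];
    __syncthreads();
    for (int i = threadIdx.x; i < n; i += blockDim.x) out[i] = 2.0f * buf[i];
}

// Host-only helper: true when a launch needs the explicit opt-in.
inline bool exceeds_default_limit(size_t smem_bytes)
{
    return smem_bytes > kDefaultSmemLimit;
}

// Hypothetical launch wrapper: raises the kernel's limit only when required
// and returns any CUDA error instead of silently ignoring it.
cudaError_t launch_scale(const float* d_in, float* d_out, int n, cudaStream_t stream)
{
    assert(n > 0);
    const size_t smem_bytes = static_cast<size_t>(n) * sizeof(float);

    if (exceeds_default_limit(smem_bytes)) {
        // The opt-in ceiling is device specific, so query it before asking.
        int device = 0, max_optin = 0;
        cudaError_t err = cudaGetDevice(&device);
        if (err != cudaSuccess) return err;
        err = cudaDeviceGetAttribute(&max_optin, cudaDevAttrMaxSharedMemoryPerBlockOptin, device);
        if (err != cudaSuccess) return err;
        if (smem_bytes > static_cast<size_t>(max_optin)) return cudaErrorInvalidValue;

        err = cudaFuncSetAttribute(scale_kernel,
                                   cudaFuncAttributeMaxDynamicSharedMemorySize,
                                   static_cast<int>(smem_bytes));
        if (err != cudaSuccess) return err;
    }

    scale_kernel<<<1, 256, smem_bytes, stream>>>(d_in, d_out, n);
    return cudaGetLastError();  // surfaces launch failures such as an oversized request
}
```

Checking the return value of the launch is what makes an oversized request show up as an error rather than as garbage output.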

cc @sfc-gh-vichan @neevaco/corvo

sfc-gh-zhwang added 5 commits October 7, 2023 19:06

commit 94a24da
commit 404cc95
commit 63002c8
commit c3f9b0f
commit 94cd065

sfc-gh-zhwang changed the title ~~Zhwang/fix code llama long prompt~~ TIL: fix trash output for code llama for long prompt Oct 8, 2023

sfc-gh-zhwang added 2 commits October 7, 2023 21:42

commit 783b025
commit cc7b81c

sfc-gh-zhwang merged commit 26835b7 into corvo Oct 8, 2023

sfc-gh-zhwang mentioned this pull request Oct 8, 2023

Update ft commit to fix code llama long prompt issue Snowflake-Labs/fastertransformer_backend#26

Merged
