
Conversation

@DannyDaemonic
Contributor

This code just checks whether we've consumed the entire given prompt but still have leftover tokens from the cache. In that case it leaves the last token in embd so it will be evaluated, refreshing the logits.
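The idea can be sketched as a self-contained snippet. This is simplified and renamed for illustration; `prepare_eval` and its parameters are hypothetical stand-ins, not the actual llama.cpp code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the cache guard: if the session cache already covers the
// whole prompt, back up one position so the final prompt token is
// re-evaluated. The cache stores tokens, not logits, so without this
// there would be nothing to sample from. Returns the adjusted n_past;
// any tokens still needing evaluation are appended to embd.
size_t prepare_eval(const std::vector<int>& embd_inp,
                    size_t n_matching_session_tokens,
                    std::vector<int>& embd) {
    size_t n_past = n_matching_session_tokens;
    if (n_past >= embd_inp.size() && !embd_inp.empty()) {
        // Entire prompt was cached: re-evaluate just the last token.
        n_past = embd_inp.size() - 1;
    }
    for (size_t i = n_past; i < embd_inp.size(); ++i) {
        embd.push_back(embd_inp[i]);
    }
    return n_past;
}
```

With a fully cached 3-token prompt this leaves one token to evaluate; with a partial match it leaves the uncached tail, so the normal path is unaffected.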

Evaluating one token isn't the end of the world, but it would be nice if it were faster. What I really wanted to do was something like this:

    llama_eval(ctx, embd.data(), 0, n_past - 1, params.n_threads);

But you can't eval 0 tokens.

I tried to edit llama_eval_internal to skip over the input calculations when the number of new tokens (N) is 0, but it's far more tightly integrated than I expected. Perhaps there was something to #1281 after all. I think it would be nicest if llama_eval simply supported 0 tokens, but it would also be nice to have a separate API call that evaluates the logits at a given position n_past.

This API call could also be used for implementing the mockup in #1608 where you can click on a token and see alternatives.
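For that alternatives use case, the post-processing side is generic: once you have the logits for a position, picking the top-k candidate tokens needs nothing from the llama API. A hypothetical sketch (`top_k_alternatives` is not a real llama.cpp function):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Given the logits for one position, return (token_id, logit) pairs
// for the k highest-scoring candidates, best first. Illustrative
// only; how the logits are obtained is up to the hypothetical API.
std::vector<std::pair<int, float>> top_k_alternatives(
        const std::vector<float>& logits, size_t k) {
    std::vector<std::pair<int, float>> scored;
    scored.reserve(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        scored.emplace_back(static_cast<int>(i), logits[i]);
    }
    k = std::min(k, scored.size());
    // Only the first k entries need to be sorted.
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end(),
                      [](const std::pair<int, float>& a,
                         const std::pair<int, float>& b) {
                          return a.second > b.second;
                      });
    scored.resize(k);
    return scored;
}
```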


@github-actions github-actions bot left a comment


clang-tidy made some suggestions

@KerfuffleV2
Contributor

I can no longer reproduce the problem using the steps described here: #1550 (comment)

Also, as far as I can see prompt caching still works.

I don't know enough about the code to actually review this pull request, but I can confirm that from a user's perspective it seems to fix the problem without breaking anything else.

@SlyEcho
Contributor

SlyEcho commented May 29, 2023

Yeah, I discovered the same "fix" for #1570

@SlyEcho
Contributor

SlyEcho commented May 29, 2023

Merge in the latest master to fix the build errors.

@DannyDaemonic
Contributor Author

@SlyEcho Thanks, that fixed the CI errors. For some reason I thought I could just wait until things were fixed and then rerun the checks.

@DannyDaemonic DannyDaemonic merged commit 2483676 into ggml-org:master May 29, 2023
@DannyDaemonic DannyDaemonic deleted the cached-logits-bandaid branch May 29, 2023 12:13