quic-sanising
Contributor

@quic-sanising quic-sanising commented Jun 18, 2025

This PR adds the following Unit Tests for On Device Sampling:

  1. test_sampler_transform: Test whether SamplerTransform adds nodes at the output of a QEffForCausalLM model so that next tokens are sampled on the device (instead of on the host), and whether the transformed model returns the next tokens and/or probability distributions.
  2. test_greedy_sampling: Test greedy sampling with QPC compiled with and without On Device Sampling.
  3. test_random_sampling: Test random sampling with QPC compiled with and without On Device Sampling.
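
The difference between the greedy and random cases in tests 2 and 3 can be sketched in plain Python. This is only an illustration under stated assumptions, not the QEfficient implementation: the helper names (`softmax`, `greedy_sample`, `random_sample`) and the logits are hypothetical, and the real tests compare outputs of compiled QPCs rather than Python lists.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_sample(logits):
    """Deterministic: pick the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def random_sample(logits, temperature=1.0, rng=None):
    """Stochastic: draw an index from the softmax distribution."""
    if rng is None:
        rng = random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits for a four-token vocabulary.
logits = [0.5, 2.0, -1.0, 1.5]
greedy_token = greedy_sample(logits)

# Two generators with the same seed give the same draw, which is the kind of
# property a "with vs. without on-device sampling" comparison can rely on.
host_token = random_sample(logits, rng=random.Random(0))
device_token = random_sample(logits, rng=random.Random(0))
```

Greedy sampling needs no random state, so its outputs can be compared exactly; a random-sampling comparison only makes sense when both paths consume the same random numbers.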

Signed-off-by: quic-sanising <[email protected]>
@quic-rishinr
Contributor

@quic-sanising can you add a small feature description under /docs/source/quick_start.md supported feature section? also provide the example script link in the description

@quic-sanising
Contributor Author

@quic-sanising can you add a small feature description under /docs/source/quick_start.md supported feature section? also provide the example script link in the description

Done

Contributor

@quic-amitraj quic-amitraj left a comment

Please fix lint error.

sanising added 3 commits July 3, 2025 13:44
@quic-sanising quic-sanising marked this pull request as ready for review July 3, 2025 19:08
Signed-off-by: sanising <[email protected]>
@quic-sanising
Contributor Author

quic-sanising commented Jul 3, 2025

Please fix lint error.

@quic-amitraj The lint failures occur because the CI linter installs ruff v0.12.2, whereas the .pre-commit-config.yaml file pins the older v0.5.2.

To fix the errors, we need to either install ruff v0.5.2 in the linter or update the .pre-commit-config.yaml file to version v0.12.2.
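
A mismatch like this can be caught early with a small check. The sketch below is an assumption-laden illustration, not part of this PR: the function names are hypothetical, the config is an inline sample rather than the repository's file, and the installed version is passed in as a string instead of being queried from ruff.

```python
import re


def pinned_ruff_rev(config_text):
    """Return the `rev` pinned for ruff-pre-commit, or None if absent."""
    pattern = re.compile(
        r"repo:\s*https://github\.com/astral-sh/ruff-pre-commit\s*\n"
        r"(?:\s*#.*\n)*"  # optional comment lines between repo and rev
        r"\s*rev:\s*(\S+)"
    )
    match = pattern.search(config_text)
    return match.group(1) if match else None


def versions_match(pinned_rev, installed_version):
    """Compare versions while ignoring a leading 'v' on either side."""
    return pinned_rev.lstrip("v") == installed_version.lstrip("v")


# Inline sample mirroring the shape of a pre-commit config (hypothetical).
sample_config = """\
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version.
    rev: v0.5.2
"""

pinned = pinned_ruff_rev(sample_config)
mismatch = not versions_match(pinned, "0.12.2")
```

With the sample above, `pinned` is the old revision and `mismatch` is true, which corresponds to the situation described in this comment.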

Signed-off-by: sanising <[email protected]>
@@ -1,7 +1,7 @@
 repos:
 - repo: https://github.com/astral-sh/ruff-pre-commit
   # Ruff version.
-  rev: v0.5.2
+  rev: v0.12.7
Contributor

why?

Contributor Author

Refer to my comment above: #463 (comment)

@@ -231,7 +231,6 @@ def main(
 tokenizer,
 prompts=prompt,
 device_id=device_group,
-prompt=prompt,
Contributor

why?

Contributor Author

@quic-sanising quic-sanising Sep 4, 2025

Some of the inference tests in the later stages of CI were failing. The logs show that prompt was being passed twice to the same function: once as an explicit keyword argument and once through kwargs. I am not sure why this does not cause problems on the main branch.

prompts = 'My name is', device_id = None, runtime_ai100 = True
kwargs = {'prompt': 'My name is', 'prompts_txt_file_path': 'examples/prompts.txt'}

>           return QEfficient.cloud_ai_100_exec_kv(
                tokenizer,
                self.qpc_path,
                prompt=prompts,
                device_id=device_id,
                generation_len=generation_len,
                is_tlm=self.is_tlm,
                **kwargs,
            )
E           TypeError: QEfficient.generation.text_generation_inference.cloud_ai_100_exec_kv() got multiple values for keyword argument 'prompt'

@quic-rishinr please feel free to comment.
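
For reference, the failure mode in the traceback above can be reproduced in isolation. This is a minimal sketch with hypothetical stand-in functions (`exec_kv`, `generate`), not the actual QEfficient call chain:

```python
def exec_kv(tokenizer, qpc_path, prompt=None, **kwargs):
    """Hypothetical stand-in for the inference entry point."""
    return prompt


def generate(tokenizer, prompts, **kwargs):
    # `prompts` is forwarded as `prompt=`. If the caller also placed
    # `prompt` in kwargs, the same keyword is supplied twice.
    return exec_kv(tokenizer, "qpc", prompt=prompts, **kwargs)


try:
    # Passing both `prompts` and `prompt` mirrors the call before the fix.
    generate("tok", prompts="My name is", prompt="My name is")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Dropping the redundant `prompt` keyword avoids the collision.
ok_result = generate("tok", prompts="My name is")
```

The first call raises a `TypeError` about multiple values for the keyword argument `prompt`; the second call, with the redundant keyword removed, goes through.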

Contributor

are the print statements intentional here?

Contributor Author

@quic-sanising quic-sanising Sep 4, 2025

Yes, they are intentional. But let me know if you want any of them removed.

@quic-hemagnih
Contributor

Anything pending for this PR? @ochougul @quic-sanising

@quic-sanising
Contributor Author

Anything pending for this PR? @ochougul @quic-sanising

Nothing from my end. Please go ahead with the merge if the CI has passed.
