
Conversation


@CYHSM (Contributor) commented Aug 18, 2025

What does this PR do?

This PR makes the inference output look a bit nicer and also adds:

  • A loop so the user can test more prompts after running the first one
  • An option to specify one or several temperatures (comma-separated, e.g. 0, 0.4, 0.8), which then evaluates the prompt for each temperature
  • An optional system prompt, which can be read from a .txt file whose path is specified in the YAML config (see the composition sketch after the config below):
text_inference_component:
  component_key: inference_component
  variant_key: text
  config:
    device: ${settings.device}
    model:
      instance_key: checkpointed_model
      pass_type: BY_REFERENCE
    tokenizer:
      component_key: tokenizer
      variant_key: pretrained_sp_tokenizer
      config:
        tokenizer_model_file: /raid/s3/opengptx/mfrey/3.73T-Tokens/tokenizer/eurolingua_tokenizer.model
    sequence_length: ${settings.sequence_length}
    eod_token: <|endoftext|>
    prompt_template: "{prompt_input}" # "<instruction> Du bist Moody, ein LLM welches Menschen helfen soll. user: {prompt_input}"
    system_prompt_path: "/home/markus_frey/Github/modalities/tutorials/instruct_teuken/configs/system_prompt.txt"
    chat_template: "System:\n{system_prompt}\nUser:{user_prompt}\nAssistant:\n"
    temperature: 1
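
For illustration, here is a minimal sketch of how these pieces compose into the final prompt at inference time. The system_prompt.txt path and contents are hypothetical; prompt_template and chat_template are the values from the config above:

from pathlib import Path

# Read the optional system prompt from the file named by system_prompt_path
# (hypothetical path and contents, e.g. "You are a helpful assistant.").
system_prompt = Path("configs/system_prompt.txt").read_text().strip()

prompt_template = "{prompt_input}"
chat_template = "System:\n{system_prompt}\nUser:{user_prompt}\nAssistant:\n"

# The user's raw input is first wrapped by prompt_template, then the chat
# template stitches the system prompt and user prompt together.
user_prompt = prompt_template.format(prompt_input="What is the capital of France?")
full_prompt = chat_template.format(system_prompt=system_prompt, user_prompt=user_prompt)
print(full_prompt)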

I have found no existing inference tests, so I am not sure whether there are none or I am simply not seeing them. If someone points me towards an existing test, I can extend it for this PR or create a full inference test.

Breaking Changes

  • The system prompt is optional, so there should be no breaking changes.

Checklist before submitting final PR

  • My PR is minimal and addresses one issue in isolation
  • I have merged the latest version of the target branch into this feature branch
  • I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
  • I have run a sample config for model training
  • I have checked that all tests run through (python tests/tests.py)
    Not all tests run through, but the failures appear to be unrelated to this change (sh scripts/run_checkpoint_conversion.sh)
  • I have updated the internal changelog (CHANGELOG_DEV.md)

@behzadshomali

Thanks for your commit, it was really helpful. As a quick-to-implement suggestion: in the run() function, you can add another try/except statement that lets the user interrupt the model's generation while the loop keeps running. This comes in handy when sequence_length has been set to a high value and the model goes off track, so you don't have to wait for it to finish generating nonsense text:

def run(self):
    print("\n" + "🚀 Modalities Chat Interface ".center(60, "="))
    print("=" * 60)

    while True:
        try:
            user_prompt = self._get_prompt(self.prompt_template)
            full_prompt = self.chat_template.format(system_prompt=self.system_prompt, user_prompt=user_prompt)

            temp_input = input("\n🌡️  Enter temperatures (comma-separated) or press Enter for default [0.8]: ")

            if not temp_input.strip():
                temperatures = [0.8]
                print("Using default temperature: 0.8")
            else:
                try:
                    temperatures = [float(t.strip()) for t in temp_input.split(",")]
                    if not temperatures:
                        raise ValueError("No temperatures provided.")
                except ValueError:
                    print("\n❌ Invalid input. Please enter comma-separated numbers or press Enter for default.\n")
                    continue

            for i, temp in enumerate(temperatures):
                # Build the banner first so that .center() applies to the whole string.
                if len(temperatures) > 1:
                    header = f"🎯 GENERATION {i + 1} (Temperature: {temp})"
                else:
                    header = f"🎯 GENERATING (Temperature: {temp})"
                print(f"\n\n{header.center(60, '=')}")
                try:
                    self.temperature = temp
                    self.generate_tokens(context=full_prompt)
                except KeyboardInterrupt:
                    # Ctrl+C here aborts only the current generation and moves on.
                    print("\n⚠️  Generation interrupted.")
                    continue

            print("\n\n" + "🏁 ALL GENERATIONS COMPLETE".center(60, "="))
            print("=" * 60)
        except KeyboardInterrupt:
            # Ctrl+C at an input prompt exits the app.
            print("\n\n👋 Closing app... Goodbye!")
            break
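
With this structure the interrupt is scoped: Ctrl+C while generate_tokens is running aborts only the current generation and moves on to the next temperature, while Ctrl+C at one of the input prompts reaches the outer handler and closes the app.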

@CYHSM (Contributor, Author) commented Sep 1, 2025

Thanks @behzadshomali, I integrated your suggestion and also added tests for general inference (an illustrative sketch of the first test follows the list):

  • Checks that greedy decoding produces the same output across runs
  • Checks that different temperatures produce different outputs
  • Checks the new run method with and without a system prompt
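
For orientation, a minimal sketch of what the greedy-decoding check could look like. The fixture is hypothetical and skips by default, and the sketch assumes generate_tokens prints the generated text, as the run() loop above suggests; if it returns text instead, compare return values directly:

import pytest

@pytest.fixture
def inference_component():
    # Hypothetical fixture: construct the text inference component from a
    # test config here. Skipped because the wiring is repo-specific.
    pytest.skip("construct the inference component from a test config")

def test_greedy_decoding_is_deterministic(inference_component, capsys):
    # With temperature 0 (greedy decoding), two runs on the same prompt
    # should print identical output.
    inference_component.temperature = 0
    inference_component.generate_tokens(context="Hello")
    first = capsys.readouterr().out
    inference_component.generate_tokens(context="Hello")
    second = capsys.readouterr().out
    assert first == second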

@rrutmann this is ready from my side

@rrutmann self-requested a review September 4, 2025 12:25

@rrutmann (Collaborator) left a comment


At first glance the code looks good. I want to run it locally to check that everything works. Could you please add an inference config that uses the newly added system_prompt_path variable, as well as an example file for such a system prompt, to the repo? I would suggest putting them into config_files/text_inference.
