Great work! I would like to know whether this framework can integrate image tokens into the rollout trajectory, so that the base model can use them to generate its reasoning for the next step, i.e., "think with images". This has already been implemented here and here.