fix: provide tolerance for apply_patch tool #993

bolinfest · 2025-05-17T18:03:55Z

As explained in detail in the doc comment for ParseMode::Lenient, we have observed that GPT-4.1 does not always generate a valid invocation of apply_patch. Fortunately, the error is predictable, so we introduce some new logic to the codex-apply-patch crate to recover from this error.

Because we would like to avoid this becoming a de facto standard (as it would be incompatible if apply_patch were provided as an actual executable, unless we also introduced the lenient behavior in the executable, as well), we require passing ParseMode::Lenient to parse_patch_text() to make it clear that the caller is opting into supporting this special case.

Note the analogous change to the TypeScript CLI was #930. In addition to changing the accepted input to apply_patch, it also introduced additional instructions for the model, which we include in this PR.

Note that apply-patch does not depend on either regex or regex-lite, so some of the checks are slightly more verbose to avoid introducing this dependency.

That said, this PR does not leverage the existing extract_heredoc_body_from_apply_patch_command(), which depends on tree-sitter and tree-sitter-bash:

codex/codex-rs/apply-patch/src/lib.rs

Lines 191 to 246 in 5a5aa89

    
           /// Attempts to extract a heredoc_body object from a string bash command like: 
        
           /// Optimistically 
        
           /// 
        
           /// ```bash 
        
           /// bash -lc 'apply_patch <<EOF\n***Begin Patch\n...EOF' 
        
           /// ``` 
        
           /// 
        
           /// # Arguments 
        
           /// 
        
           /// * `src` - A string slice that holds the full command 
        
           /// 
        
           /// # Returns 
        
           /// 
        
           /// This function returns a `Result` which is: 
        
           /// 
        
           /// * `Ok(String)` - The heredoc body if the extraction is successful. 
        
           /// * `Err(anyhow::Error)` - An error if the extraction fails. 
        
           /// 
        
           fn extract_heredoc_body_from_apply_patch_command( 
        
               src: &str, 
        
           ) -> std::result::Result<String, ExtractHeredocError> { 
        
               if !src.trim_start().starts_with("apply_patch") { 
        
                   return Err(ExtractHeredocError::CommandDidNotStartWithApplyPatch); 
        
               } 
        
               let lang = BASH.into(); 
        
               let mut parser = Parser::new(); 
        
               parser 
        
                   .set_language(&lang) 
        
                   .map_err(ExtractHeredocError::FailedToLoadBashGrammar)?; 
        
               let tree = parser 
        
                   .parse(src, None) 
        
                   .ok_or(ExtractHeredocError::FailedToParsePatchIntoAst)?; 
        
               let bytes = src.as_bytes(); 
        
               let mut c = tree.root_node().walk(); 
        
               loop { 
        
                   let node = c.node(); 
        
                   if node.kind() == "heredoc_body" { 
        
                       let text = node 
        
                           .utf8_text(bytes) 
        
                           .map_err(ExtractHeredocError::HeredocNotUtf8)?; 
        
                       return Ok(text.trim_end_matches('\n').to_owned()); 
        
                   } 
        
                   if c.goto_first_child() { 
        
                       continue; 
        
                   } 
        
                   while !c.goto_next_sibling() { 
        
                       if !c.goto_parent() { 
        
                           return Err(ExtractHeredocError::FailedToFindHeredocBody); 
        
                       } 
        
                   } 
        
               } 
        
           }

though perhaps it should.

github-actions · 2025-06-03T06:46:30Z

Summary

Adds lenient heredoc handling to apply_patch (for GPT-4.1), ships detailed tool instructions, and plumbs model-specific guidance through ModelClient; includes new tests and minor visibility/derive tweaks.

Notes

parser.rs: new ParseMode::Lenient, heredoc stripping, PARSE_IN_STRICT_MODE flag, many tests.
apply_patch_tool_instructions.md + APPLY_PATCH_TOOL_INSTRUCTIONS constant for reuse.
core::client: concatenates BASE_INSTRUCTIONS + optional model-specific block + user text.
Minor visibility (BASE_INSTRUCTIONS), derives, and doc additions.

Review

Looks clean and well-tested—heredoc tolerance should unblock GPT-4.1 quickly.

Suggestions

Consider making PARSE_IN_STRICT_MODE a runtime conf/env var instead of a compile-time const.
ModelClient now duplicates some of Prompt::get_full_instructions; maybe refactor to avoid drift.

Nice work!

View workflow run

github-actions · 2025-06-03T15:53:21Z

PR summary

Adds detailed apply_patch instructions for GPT-4.1, introduces a lenient heredoc-aware parser in apply-patch, and threads the model name so prompts can conditionally include the new instructions.

Review

Looks great—clear docs and well-contained code changes.

Pros

Robust heredoc handling with strict/lenient modes and thorough tests.
Clean inclusion of instructions via include_str!, only for GPT-4.1.
Minimal touch points: just passes model where needed.

Nit-picks

Consider gating PARSE_IN_STRICT_MODE behind a feature flag or env var.
A brief comment on why heredoc stripping is safe could help future readers.

View workflow run

bolinfest force-pushed the pr993 branch 2 times, most recently from d690965 to 4053788 Compare May 17, 2025 19:28

yuan-alex mentioned this pull request May 29, 2025

fix: remove extraneous reasoning parameter on non-supported models #1153

Closed

bolinfest force-pushed the pr993 branch 4 times, most recently from bdb0bd4 to 1f0f39c Compare June 3, 2025 06:38

bolinfest added the code-review Issues relating to code reviews performed by codex label Jun 3, 2025

github-actions bot added codex-review-in-progress and removed code-review Issues relating to code reviews performed by codex labels Jun 3, 2025

github-actions bot added codex-review-completed and removed codex-review-in-progress labels Jun 3, 2025

bolinfest force-pushed the pr993 branch 2 times, most recently from 29686a7 to 0cb4f46 Compare June 3, 2025 15:49

fix: provide tolerance for apply_patch tool

fae9ff8

bolinfest force-pushed the pr993 branch from 0cb4f46 to fae9ff8 Compare June 3, 2025 15:50

bolinfest marked this pull request as ready for review June 3, 2025 15:51

bolinfest added code-review Issues relating to code reviews performed by codex and removed codex-review-completed labels Jun 3, 2025

github-actions bot added codex-review-in-progress and removed code-review Issues relating to code reviews performed by codex labels Jun 3, 2025

github-actions bot added codex-review-completed and removed codex-review-in-progress labels Jun 3, 2025

bolinfest merged commit 6fcc528 into main Jun 3, 2025
19 checks passed

bolinfest deleted the pr993 branch June 3, 2025 16:06

github-actions bot locked and limited conversation to collaborators Jun 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: provide tolerance for apply_patch tool #993

fix: provide tolerance for apply_patch tool #993

Uh oh!

bolinfest commented May 17, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Jun 3, 2025

Uh oh!

github-actions bot commented Jun 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	/// Attempts to extract a heredoc_body object from a string bash command like:
	/// Optimistically
	///
	/// ```bash
	/// bash -lc 'apply_patch <<EOF\n***Begin Patch\n...EOF'
	/// ```
	///
	/// # Arguments
	///
	/// * `src` - A string slice that holds the full command
	///
	/// # Returns
	///
	/// This function returns a `Result` which is:
	///
	/// * `Ok(String)` - The heredoc body if the extraction is successful.
	/// * `Err(anyhow::Error)` - An error if the extraction fails.
	///
	fn extract_heredoc_body_from_apply_patch_command(
	src: &str,
	) -> std::result::Result<String, ExtractHeredocError> {
	if !src.trim_start().starts_with("apply_patch") {
	return Err(ExtractHeredocError::CommandDidNotStartWithApplyPatch);
	}

	let lang = BASH.into();
	let mut parser = Parser::new();
	parser
	.set_language(&lang)
	.map_err(ExtractHeredocError::FailedToLoadBashGrammar)?;
	let tree = parser
	.parse(src, None)
	.ok_or(ExtractHeredocError::FailedToParsePatchIntoAst)?;

	let bytes = src.as_bytes();
	let mut c = tree.root_node().walk();

	loop {
	let node = c.node();
	if node.kind() == "heredoc_body" {
	let text = node
	.utf8_text(bytes)
	.map_err(ExtractHeredocError::HeredocNotUtf8)?;
	return Ok(text.trim_end_matches('\n').to_owned());
	}

	if c.goto_first_child() {
	continue;
	}
	while !c.goto_next_sibling() {
	if !c.goto_parent() {
	return Err(ExtractHeredocError::FailedToFindHeredocBody);
	}
	}
	}
	}

fix: provide tolerance for apply_patch tool #993

fix: provide tolerance for apply_patch tool #993

Uh oh!

Conversation

bolinfest commented May 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Jun 3, 2025

Summary

Notes

Review

Uh oh!

github-actions bot commented Jun 3, 2025

PR summary

Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bolinfest commented May 17, 2025 •

edited

Loading