63 changes: 61 additions & 2 deletions regex-syntax/src/hir/mod.rs
@@ -3047,7 +3047,7 @@ fn lift_common_prefix(hirs: Vec<Hir>) -> Result<Hir, Vec<Hir>> {
.count();
prefix = &prefix[..common_len];
if prefix.is_empty() {
-return Err(hirs);
+return lift_common_suffix(hirs).map(Hir::concat);
}
}
let len = prefix.len();
@@ -3068,10 +3068,69 @@ fn lift_common_prefix(hirs: Vec<Hir>) -> Result<Hir, Vec<Hir>> {
}
}
let mut concat = prefix_concat;
-concat.push(Hir::alternation(suffix_alts));
+match lift_common_suffix(suffix_alts) {
+    Ok(suffix_concat) => {
+        concat.extend(suffix_concat);
+    }
+    Err(suffix_alts) => {
+        concat.push(Hir::alternation(suffix_alts));
+    }
+}
Ok(Hir::concat(concat))
}

+#[allow(clippy::inline_always)]
+#[inline(always)] // prevents blowing the stack
Member

This comment is worrying me. I can't see why this is supposed to prevent stack overflow. Can you say more about why you have this?

Author @mmirate, Oct 9, 2025

There is a mutual recursion here: `Hir::alternation` calls `lift_common_prefix`, which calls `lift_common_suffix`, which calls `Hir::alternation` (last line before the happy-path return). If I remember correctly, inlining `lift_common_suffix` reduced the rate of stack growth enough that one of these 3 functions would hit a base case before the stack ran out of space. Thinking back, it would probably be a good idea to also inline `lift_common_prefix`.

(To be clear, I don't know whether `inline(always)` is supposed to affect intra-crate call sites this way. All I know is that in my full-blown use case I got stack overflows from this mutual recursion until I added `inline(always)` here.)
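
For readers following the thread, the call cycle described above can be modelled with a small toy. This is only a sketch under loud assumptions: the three functions below are hypothetical stand-ins that share names with the cycle but contain no regex-syntax logic, and the `inlined` flag simply pretends that an inlined `lift_suffix` shares its caller's frame. It counts frames, not bytes, so it says nothing about how large each real frame is or what the compiler actually does with `inline(always)`.

```rust
/// Tracks how many toy "frames" are live and the deepest point reached.
struct Frames {
    live: usize,
    deepest: usize,
}

impl Frames {
    fn enter(&mut self) {
        self.live += 1;
        self.deepest = self.deepest.max(self.live);
    }
    fn leave(&mut self) {
        self.live -= 1;
    }
}

// Hypothetical stand-in for the first step of the cycle; `levels` is the
// nesting depth still to be processed and 0 is the base case.
fn alternation(levels: usize, inlined: bool, f: &mut Frames) {
    f.enter();
    if levels > 0 {
        lift_prefix(levels, inlined, f);
    }
    f.leave();
}

// Hypothetical stand-in for the second step of the cycle.
fn lift_prefix(levels: usize, inlined: bool, f: &mut Frames) {
    f.enter();
    lift_suffix(levels, inlined, f);
    f.leave();
}

// Hypothetical stand-in for the third step, which closes the cycle.
fn lift_suffix(levels: usize, inlined: bool, f: &mut Frames) {
    // Assumption: when "inlined", this body shares the caller's frame,
    // so no extra frame is counted.
    if !inlined {
        f.enter();
    }
    alternation(levels - 1, inlined, f);
    if !inlined {
        f.leave();
    }
}

/// Deepest frame count reached for a given nesting depth.
fn deepest_frames(levels: usize, inlined: bool) -> usize {
    let mut f = Frames { live: 0, deepest: 0 };
    alternation(levels, inlined, &mut f);
    f.deepest
}

fn main() {
    println!("not inlined: {}", deepest_frames(100, false));
    println!("inlined:     {}", deepest_frames(100, true));
}
```

In this model each level of nesting costs three frames without inlining and two with it, which matches the shape of the observation above (slower stack growth per cycle) without confirming the cause.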

+fn lift_common_suffix(hirs: Vec<Hir>) -> Result<Vec<Hir>, Vec<Hir>> {
+    if hirs.len() <= 1 {
+        return Err(hirs);
+    }
+    let mut suffix = match hirs.last().unwrap().kind() {
+        HirKind::Concat(ref xs) => &**xs,
+        _ => return Err(hirs),
+    };
+    if suffix.is_empty() {
+        return Err(hirs);
+    }
+    for h in hirs.iter().rev().skip(1) {
+        let concat = match h.kind() {
+            HirKind::Concat(ref xs) => xs,
+            _ => return Err(hirs),
+        };
+        let common_len = suffix
+            .iter()
+            .rev()
+            .zip(concat.iter().rev())
+            .take_while(|(x, y)| x == y)
+            .count();
+        suffix = &suffix[suffix.len() - common_len..];
+        if suffix.is_empty() {
+            return Err(hirs);
+        }
+    }
+    let len = suffix.len();
+    assert_ne!(0, len);
+    let mut suffix_concat = vec![];
+    let mut prefix_alts = vec![];
+    for h in hirs {
+        let mut concat = match h.into_kind() {
+            HirKind::Concat(xs) => xs,
+            // We required all sub-expressions to be
+            // concats above, so we're only here if we
+            // have a concat.
+            _ => unreachable!(),
+        };
+        let suffix = concat.split_off(concat.len() - len);
+        prefix_alts.push(Hir::concat(concat));
+        if suffix_concat.is_empty() {
+            suffix_concat = suffix;
+        }
+    }
+    let mut concat = suffix_concat;
+    concat.insert(0, Hir::alternation(prefix_alts));
+    Ok(concat)
+}
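
To illustrate what the function above is factoring out, here is a minimal standalone sketch of the same idea on plain character sequences. It is not the PR's code: `lift_common_suffix_toy` and `render` are hypothetical names, each `Vec<char>` stands in for an `Hir` concat, and the toy returns the prefixes and shared suffix as a pair rather than building an alternation. It also takes the minimum match length against the last branch instead of shrinking a slice, which should give the same length.

```rust
/// Toy stand-in for suffix lifting: each branch is a token sequence.
/// Returns (per-branch prefixes, shared suffix), or hands the input back
/// unchanged when there is nothing to factor.
fn lift_common_suffix_toy(
    branches: Vec<Vec<char>>,
) -> Result<(Vec<Vec<char>>, Vec<char>), Vec<Vec<char>>> {
    if branches.len() <= 1 {
        return Err(branches);
    }
    // Start from the whole last branch and shrink to the longest tail
    // shared by every other branch.
    let last = &branches[branches.len() - 1];
    let mut common_len = last.len();
    for b in branches.iter().rev().skip(1) {
        let n = last
            .iter()
            .rev()
            .zip(b.iter().rev())
            .take_while(|(x, y)| x == y)
            .count();
        common_len = common_len.min(n);
    }
    if common_len == 0 {
        return Err(branches);
    }
    // Split the shared tail off every branch; keep one copy of it.
    let mut suffix = Vec::new();
    let mut prefixes = Vec::with_capacity(branches.len());
    for mut b in branches {
        let tail = b.split_off(b.len() - common_len);
        if suffix.is_empty() {
            suffix = tail;
        }
        prefixes.push(b);
    }
    Ok((prefixes, suffix))
}

/// Renders the factored form as regex-like text, for display only.
fn render(prefixes: &[Vec<char>], suffix: &[char]) -> String {
    let alts = prefixes
        .iter()
        .map(|p| p.iter().collect::<String>())
        .collect::<Vec<_>>()
        .join("|");
    format!("(?:{}){}", alts, suffix.iter().collect::<String>())
}

fn main() {
    let branches: Vec<Vec<char>> = ["foobar", "quxbar", "zbar"]
        .iter()
        .map(|s| s.chars().collect())
        .collect();
    match lift_common_suffix_toy(branches) {
        Ok((prefixes, suffix)) => println!("{}", render(&prefixes, &suffix)),
        Err(_) => println!("no common suffix"),
    }
}
```

With the three branches in `main`, the shared tail is `bar` and the rendered form is `(?:foo|qux|z)bar`.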

#[cfg(test)]
mod tests {
use super::*;