Inconsistent behavior with zero-width matches on empty strings #1163

@rootCircle

What version of regex are you using?

v1.10.3

Describe the bug at a high level.

replace_all in the regex crate handles zero-width (empty) matches differently from Python's standard-library re module. Specifically, the Rust crate does not report the empty match that immediately follows a non-empty match, whereas Python does.

What are the steps to reproduce the behavior?

  1. Create a Regex object with the pattern r"x*" (matches zero or more "x"s).
  2. Apply replace_all to the string "abxd" with a hyphen as the replacement string.
  3. Observed output (Rust): "-a-b-d-"
  4. Expected output (Python): "-a-b--d-"

Rust Code

use regex::Regex;

fn main() {
    let re = Regex::new(r"x*").unwrap();
    let hay = "abxd";

    println!("{:?}", re.replace_all(hay, "-"));
}

Equivalent Python Code:

import re

regex = r"x*"
test_str = "abxd"
subst = "-"

result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)
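
To see where the extra hyphen in the Python result comes from, it can help to list the individual matches rather than the substituted string. The following is a minimal sketch, assuming Python 3.7 or later (where empty matches adjacent to a previous non-empty match are reported); the variable names are illustrative only.

```python
import re

pattern = re.compile(r"x*")
hay = "abxd"

# Collect the (start, end) offsets of every match that Python reports.
spans = [m.span() for m in pattern.finditer(hay)]

for start, end in spans:
    kind = "empty" if start == end else "non-empty"
    print(f"{start}..{end} {kind} {hay[start:end]!r}")
```

The span (3, 3) is the empty match directly after the matched "x" and before "d"; it accounts for the second of the two adjacent hyphens in "-a-b--d-".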

What is the actual behavior?

In Rust, replace_all replaces the empty strings before "a" and "b" and at the end of the input, but not the empty string before "d", which immediately follows the matched "x".
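
The observed Rust output is consistent with a rule that drops any empty match starting exactly where the previous match ended. The sketch below emulates that rule in Python for comparison. It is an assumption inferred from the outputs in this report, not the regex crate's actual implementation, and the helper name sub_skipping_adjacent_empty is hypothetical.

```python
import re

def sub_skipping_adjacent_empty(pattern, repl, hay):
    """Replace matches, ignoring an empty match that begins exactly
    where the previous match ended (assumed rule, see above).
    `repl` is inserted literally; backreferences are not expanded."""
    out = []
    last_end = 0      # end offset of the last replaced match
    prev_end = None   # None until the first match has been accepted
    for m in re.finditer(pattern, hay):
        start, end = m.span()
        if start == end and prev_end is not None and start == prev_end:
            continue  # empty match adjacent to the previous match: skip
        out.append(hay[last_end:start])
        out.append(repl)
        last_end = end
        prev_end = end
    out.append(hay[last_end:])
    return "".join(out)

emulated = sub_skipping_adjacent_empty(r"x*", "-", "abxd")
python_result = re.sub(r"x*", "-", "abxd")
print(emulated)
print(python_result)
```

Under this assumption, the only difference between the two results is the skipped empty match at offset 3.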

What is the expected behavior?

Both empty strings should be replaced, resulting in "-a-b--d-".

I am not sure whether this is an intentional difference or a bug.
