All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Lars Kellogg-Stedman <lars@oddbit.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v4] line-range: fix infinite loop bug with '$' regex
Date: Mon, 19 Dec 2022 23:55:41 +0100	[thread overview]
Message-ID: <221219.86edsvxank.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <20221219224850.2703967-1-lars@oddbit.com>


On Mon, Dec 19 2022, Lars Kellogg-Stedman wrote:

> When the -L argument to "git log" is passed the zero-width regular
> expression "$" (as in "-L :$:line-range.c"), this results in an
> infinite loop in find_funcname_matching_regexp().
>
> Modify find_funcname_matching_regexp to correctly match the entire line
> instead of the zero-width match at eol and update the loop condition to
> prevent an infinite loop in the event of other undiscovered corner cases.
>
> The primary change is that we pre-decrement the beginning-of-line marker
> ('bol') before comparing it to '\n'. In the case of '$', where we match the
> '\n' at the end of the line and start the loop with bol == eol, this
> ensures that bol will find the beginning of the line on which the match
> occurred.
>
> Originally reported in <https://stackoverflow.com/q/74690545/147356>.
>
> Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
> ---
>  line-range.c        |  7 ++++---
>  t/t4211-line-log.sh | 22 ++++++++++++++++++++++
>  2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/line-range.c b/line-range.c
> index 955a8a9535..47bf0d6f1a 100644
> --- a/line-range.c
> +++ b/line-range.c
> @@ -135,7 +135,7 @@ static const char *find_funcname_matching_regexp(xdemitconf_t *xecfg, const char
>  {
>  	int reg_error;
>  	regmatch_t match[1];
> -	while (1) {
> +	while (*start) {
>  		const char *bol, *eol;
>  		reg_error = regexec(regexp, start, 1, match, 0);
>  		if (reg_error == REG_NOMATCH)
> @@ -148,8 +148,8 @@ static const char *find_funcname_matching_regexp(xdemitconf_t *xecfg, const char
>  		/* determine extent of line matched */
>  		bol = start+match[0].rm_so;
>  		eol = start+match[0].rm_eo;
> -		while (bol > start && *bol != '\n')
> -			bol--;
> +		while (bol > start && *--bol != '\n')
> +			; /* nothing */
>  		if (*bol == '\n')
>  			bol++;
>  		while (*eol && *eol != '\n')
> @@ -161,6 +161,7 @@ static const char *find_funcname_matching_regexp(xdemitconf_t *xecfg, const char
>  			return bol;
>  		start = eol;
>  	}
> +	return NULL;
>  }
>  
>  static const char *parse_range_funcname(
> diff --git a/t/t4211-line-log.sh b/t/t4211-line-log.sh
> index ac9e4d0928..c6540e822f 100755
> --- a/t/t4211-line-log.sh
> +++ b/t/t4211-line-log.sh
> @@ -315,4 +315,26 @@ test_expect_success 'line-log with --before' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'setup tests for zero-width regular expressions' '
> +	cat >expect <<-EOF
> +	Modify func1() in file.c
> +	Add func1() and func2() in file.c
> +	EOF
> +'
> +
> +test_expect_success 'zero-width regex $ matches any function name' '
> +	git log --format="%s" --no-patch "-L:$:file.c" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'zero-width regex ^ matches any function name' '
> +	git log --format="%s" --no-patch "-L:^:file.c" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'zero-width regex .* matches any function name' '
> +	git log --format="%s" --no-patch "-L:.*:file.c" >actual &&
> +	test_cmp expect actual
> +'
> +

Untested, but as most of this is repeated & would fail if the "setup"
test is skipped, a one-off function would be better, e.g.:
	
	test_zerowidth_regex () {
		local rx="$1" &&
		
		test_expect_success "zero-width regex '$rx' matches any function name" '
			cat >expect <<-\EOF &&
			Modify func1() in file.c
			Add func1() and func2() in file.c
			EOF
			git log --format="%s" --no-patch -L:"$rx":file.c >actual &&
			test_cmp expect actual
		'
	}
	
	test_zerowidth_regex '$'
	test_zerowidth_regex '^'
	test_zerowidth_regex '.*'

The change of direction for the fix itself looks good to me (re my
earlier feedback on a previous round), i.e. that we should fix our own
code, not forbid '$' in regexes.

  reply	other threads:[~2022-12-19 22:59 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-05 19:36 [PATCH v2] line-range: Fix infinite loop bug with degenerate regex Lars Kellogg-Stedman
2022-12-07  4:33 ` Eric Sunshine
2022-12-07  4:52 ` Ævar Arnfjörð Bjarmason
2022-12-07  5:29   ` Junio C Hamano
2022-12-07 20:30   ` SZEDER Gábor
2022-12-09 19:16   ` Lars Kellogg-Stedman
2022-12-11  1:53 ` [PATCH v3] line-range: Fix infinite loop bug with degenerate '$' regex Lars Kellogg-Stedman
2022-12-11  3:32   ` Junio C Hamano
2022-12-11  3:34     ` Junio C Hamano
2022-12-14 14:53     ` Lars Kellogg-Stedman
2022-12-18  1:33       ` Junio C Hamano
2022-12-19 22:48 ` [PATCH v4] line-range: fix infinite loop bug with " Lars Kellogg-Stedman
2022-12-19 22:55   ` Ævar Arnfjörð Bjarmason [this message]
2022-12-20  0:00     ` Eric Sunshine
2022-12-20  1:07       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=221219.86edsvxank.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=lars@oddbit.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.