[PATCH v2 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage

public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage
@ 2026-03-11 13:20 Deveshi Dwivedi
  2026-03-11 13:20 ` [PATCH v2 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Deveshi Dwivedi @ 2026-03-11 13:20 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Deveshi Dwivedi

Junio's "do not overuse strbuf_split*()" series calls out remaining
uses of strbuf_split*() as leftover bits for others to continue.
This series picks up two of them.

write_worktree_linking_files() takes two struct strbuf parameters by
value even though it only needs plain path strings.

parse_combine_filter() in list-objects-filter-options.c uses
strbuf_split_str() to split a combine: filter spec at '+'.  An array
of strbufs is unnecessary; walking the string directly with
strchrnul() is simpler and cleaner.

Changes since v1:

 * Patch 1/2: no changes.

 * Patch 2/2: Incorporate review feedback from Jeff King.
   - Use strchrnul() instead of strchr() so the loop needs no
     separate "found separator?" branch or strlen() fallback.
   - Drop the unnecessary (size_t) cast on the length expression.
   - Always exclude '+' from the sub-spec via end - p, removing the
     incorrect conditional stripping.  A trailing '+' now causes the
     while (*p) condition to fail cleanly on the next iteration
     rather than passing an empty string to the parser.
   - Remove the test that expected an error on trailing '+', since
     that behavior was incorrect.

Deveshi Dwivedi (2):
  worktree: do not pass strbuf by value
  list-objects-filter-options: avoid strbuf_split_str()

 builtin/worktree.c                  |  2 +-
 list-objects-filter-options.c       | 35 +++++++++++++----------------
 t/t6112-rev-list-filters-objects.sh |  4 ----
 worktree.c                          | 22 +++++++++---------
 worktree.h                          |  2 +-
 5 files changed, 28 insertions(+), 37 deletions(-)

Range-diff against v1:
1:  ee6b7d1e6a = 1:  ee6b7d1e6a worktree: do not pass strbuf by value
2:  386aed0adf ! 2:  c04ddaeb95 list-objects-filter-options: avoid strbuf_split_str()
    @@ Commit message
         parse_combine_subfilter(), only read the string content of the strbuf
         they receive.
     
    -    Walk the input string directly with strchr() to find each '+'.  Copy
    -    each sub-spec into a temporary buffer and strip the '+' only when
    -    another sub-spec follows.  Change the helpers to take const char *
    -    instead of struct strbuf *.
    +    Walk the input string directly with strchrnul() to find each '+'.
    +    strchrnul() returns a pointer to the terminating '\0' when the
    +    delimiter is not found, so no separate "found separator?" branch is
    +    needed.  Copy each sub-spec into a temporary buffer using end - p,
    +    which naturally excludes the '+', so the separator is always stripped
    +    cleanly.  A trailing '+' causes the outer while (*p) test to fail on
    +    the next iteration rather than passing an empty string to the parser.
    +    Change the helpers to take const char * instead of struct strbuf *.
    +
    +    The test that expected an error on a trailing '+' is removed, since
    +    that behavior was incorrect.
     
         Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
     
    @@ list-objects-filter-options.c: static int parse_combine_filter(
     -		result = parse_combine_subfilter(
     -			filter_options, subspecs[sub], errbuf);
     +	while (*p && !result) {
    -+		const char *sep = strchr(p, '+');
    -+		size_t len = sep ? (size_t)(sep - p + 1) : strlen(p);
    -+		char *sub = xmemdupz(p, len);
    -+
    -+		/* strip '+' separator, but only when more sub-specs follow */
    -+		if (sep && *(sep + 1))
    -+			sub[len - 1] = '\0';
    ++		const char *end = strchrnul(p, '+');
    ++		char *sub = xmemdupz(p, end - p);
     +
     +		result = parse_combine_subfilter(filter_options, sub, errbuf);
     +		free(sub);
    -+		if (!sep)
    ++		if (!*end)
     +			break;
    -+		p = sep + 1;
    ++		p = end + 1;
      	}
      
      	filter_options->choice = LOFC_COMBINE;
    @@ list-objects-filter-options.c: static int parse_combine_filter(
      	if (result)
      		list_objects_filter_release(filter_options);
      	return result;
    +
    + ## t/t6112-rev-list-filters-objects.sh ##
    +@@ t/t6112-rev-list-filters-objects.sh: test_expect_success 'combine:... with non-encoded reserved chars' '
    + 		"must escape char in sub-filter-spec: .\~."
    + '
    + 
    +-test_expect_success 'validate err msg for "combine:<valid-filter>+"' '
    +-	expect_invalid_filter_spec combine:tree:2+ "expected .tree:<depth>."
    +-'
    +-
    + test_expect_success 'combine:... with edge-case hex digits: Ff Aa 0 9' '
    + 	git -C r3 rev-list --objects --filter="combine:tree:2+bl%6Fb:n%6fne" \
    + 		HEAD >actual &&
-- 
2.52.0.230.gd8af7cadaa


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/2] worktree: do not pass strbuf by value
  2026-03-11 13:20 [PATCH v2 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
@ 2026-03-11 13:20 ` Deveshi Dwivedi
  2026-03-11 13:20 ` [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
  2026-03-11 17:33 ` [PATCH v3 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
  2 siblings, 0 replies; 11+ messages in thread
From: Deveshi Dwivedi @ 2026-03-11 13:20 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Deveshi Dwivedi

write_worktree_linking_files() takes two struct strbuf parameters by
value, even though it only reads path strings from them.

Passing a strbuf by value is misleading and dangerous. The structure
carries a pointer to its underlying character array; caller and callee
end up sharing that storage.  If the callee ever causes the strbuf to
be reallocated, the caller's copy becomes a dangling pointer, which
results in a double-free when the caller does strbuf_release().

The function only needs the string values, not the strbuf machinery.
Switch it to take const char * and update all callers to pass .buf.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
---
 builtin/worktree.c |  2 +-
 worktree.c         | 22 +++++++++++-----------
 worktree.h         |  2 +-
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index bc2d0d645b..4035b1cb06 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -539,7 +539,7 @@ static int add_worktree(const char *path, const char *refname,
 
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s/gitdir", sb_repo.buf);
-	write_worktree_linking_files(sb_git, sb, opts->relative_paths);
+	write_worktree_linking_files(sb_git.buf, sb.buf, opts->relative_paths);
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
diff --git a/worktree.c b/worktree.c
index 6e2f0f7828..7eba12c6ed 100644
--- a/worktree.c
+++ b/worktree.c
@@ -445,7 +445,7 @@ void update_worktree_location(struct worktree *wt, const char *path_,
 	strbuf_realpath(&path, path_, 1);
 	strbuf_addf(&dotgit, "%s/.git", path.buf);
 	if (fspathcmp(wt->path, path.buf)) {
-		write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+		write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
 
 		free(wt->path);
 		wt->path = strbuf_detach(&path, NULL);
@@ -684,7 +684,7 @@ static void repair_gitfile(struct worktree *wt,
 
 	if (repair) {
 		fn(0, wt->path, repair, cb_data);
-		write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+		write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
 	}
 
 done:
@@ -742,7 +742,7 @@ void repair_worktree_after_gitdir_move(struct worktree *wt, const char *old_path
 	if (!file_exists(dotgit.buf))
 		goto done;
 
-	write_worktree_linking_files(dotgit, gitdir, is_relative_path);
+	write_worktree_linking_files(dotgit.buf, gitdir.buf, is_relative_path);
 done:
 	strbuf_release(&gitdir);
 	strbuf_release(&dotgit);
@@ -913,7 +913,7 @@ void repair_worktree_at_path(const char *path,
 
 	if (repair) {
 		fn(0, gitdir.buf, repair, cb_data);
-		write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+		write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
 	}
 done:
 	free(dotgit_contents);
@@ -1087,17 +1087,17 @@ int init_worktree_config(struct repository *r)
 	return res;
 }
 
-void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
+void write_worktree_linking_files(const char *dotgit, const char *gitdir,
 				  int use_relative_paths)
 {
 	struct strbuf path = STRBUF_INIT;
 	struct strbuf repo = STRBUF_INIT;
 	struct strbuf tmp = STRBUF_INIT;
 
-	strbuf_addbuf(&path, &dotgit);
+	strbuf_addstr(&path, dotgit);
 	strbuf_strip_suffix(&path, "/.git");
 	strbuf_realpath(&path, path.buf, 1);
-	strbuf_addbuf(&repo, &gitdir);
+	strbuf_addstr(&repo, gitdir);
 	strbuf_strip_suffix(&repo, "/gitdir");
 	strbuf_realpath(&repo, repo.buf, 1);
 
@@ -1110,11 +1110,11 @@ void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
 	}
 
 	if (use_relative_paths) {
-		write_file(gitdir.buf, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
-		write_file(dotgit.buf, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
+		write_file(gitdir, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
+		write_file(dotgit, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
 	} else {
-		write_file(gitdir.buf, "%s/.git", path.buf);
-		write_file(dotgit.buf, "gitdir: %s", repo.buf);
+		write_file(gitdir, "%s/.git", path.buf);
+		write_file(dotgit, "gitdir: %s", repo.buf);
 	}
 
 	strbuf_release(&path);
diff --git a/worktree.h b/worktree.h
index 06efe26b83..f4e46be385 100644
--- a/worktree.h
+++ b/worktree.h
@@ -240,7 +240,7 @@ int init_worktree_config(struct repository *r);
  *  dotgit: "/path/to/foo/.git"
  *  gitdir: "/path/to/repo/worktrees/foo/gitdir"
  */
-void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
+void write_worktree_linking_files(const char *dotgit, const char *gitdir,
 				  int use_relative_paths);
 
 #endif
-- 
2.52.0.230.gd8af7cadaa


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 13:20 [PATCH v2 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
  2026-03-11 13:20 ` [PATCH v2 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
@ 2026-03-11 13:20 ` Deveshi Dwivedi
  2026-03-11 16:28   ` Junio C Hamano
  2026-03-11 17:33 ` [PATCH v3 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
  2 siblings, 1 reply; 11+ messages in thread
From: Deveshi Dwivedi @ 2026-03-11 13:20 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Deveshi Dwivedi

parse_combine_filter() splits a combine: filter spec at '+' using
strbuf_split_str(), which yields an array of strbufs with the
delimiter left at the end of each non-final piece.  The code then
mutates each non-final piece to strip the trailing '+' before parsing.

Allocating an array of strbufs is unnecessary.  The function processes
one sub-spec at a time and does not use strbuf editing on the pieces.
The two helpers it calls, has_reserved_character() and
parse_combine_subfilter(), only read the string content of the strbuf
they receive.

Walk the input string directly with strchrnul() to find each '+'.
strchrnul() returns a pointer to the terminating '\0' when the
delimiter is not found, so no separate "found separator?" branch is
needed.  Copy each sub-spec into a temporary buffer using end - p,
which naturally excludes the '+', so the separator is always stripped
cleanly.  A trailing '+' causes the outer while (*p) test to fail on
the next iteration rather than passing an empty string to the parser.
Change the helpers to take const char * instead of struct strbuf *.

The test that expected an error on a trailing '+' is removed, since
that behavior was incorrect.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
---
 list-objects-filter-options.c       | 35 +++++++++++++----------------
 t/t6112-rev-list-filters-objects.sh |  4 ----
 2 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7f3e7b8f50..616c6c7faa 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -125,9 +125,9 @@ int gently_parse_list_objects_filter(
 static const char *RESERVED_NON_WS = "~`!@#$^&*()[]{}\\;'\",<>?";
 
 static int has_reserved_character(
-	struct strbuf *sub_spec, struct strbuf *errbuf)
+	const char *sub_spec, struct strbuf *errbuf)
 {
-	const char *c = sub_spec->buf;
+	const char *c = sub_spec;
 	while (*c) {
 		if (*c <= ' ' || strchr(RESERVED_NON_WS, *c)) {
 			strbuf_addf(
@@ -144,7 +144,7 @@ static int has_reserved_character(
 
 static int parse_combine_subfilter(
 	struct list_objects_filter_options *filter_options,
-	struct strbuf *subspec,
+	const char *subspec,
 	struct strbuf *errbuf)
 {
 	size_t new_index = filter_options->sub_nr;
@@ -155,7 +155,7 @@ static int parse_combine_subfilter(
 		      filter_options->sub_alloc);
 	list_objects_filter_init(&filter_options->sub[new_index]);
 
-	decoded = url_percent_decode(subspec->buf);
+	decoded = url_percent_decode(subspec);
 
 	result = has_reserved_character(subspec, errbuf);
 	if (result)
@@ -182,34 +182,29 @@ static int parse_combine_filter(
 	const char *arg,
 	struct strbuf *errbuf)
 {
-	struct strbuf **subspecs = strbuf_split_str(arg, '+', 0);
-	size_t sub;
+	const char *p = arg;
 	int result = 0;
 
-	if (!subspecs[0]) {
+	if (!*p) {
 		strbuf_addstr(errbuf, _("expected something after combine:"));
 		result = 1;
 		goto cleanup;
 	}
 
-	for (sub = 0; subspecs[sub] && !result; sub++) {
-		if (subspecs[sub + 1]) {
-			/*
-			 * This is not the last subspec. Remove trailing "+" so
-			 * we can parse it.
-			 */
-			size_t last = subspecs[sub]->len - 1;
-			assert(subspecs[sub]->buf[last] == '+');
-			strbuf_remove(subspecs[sub], last, 1);
-		}
-		result = parse_combine_subfilter(
-			filter_options, subspecs[sub], errbuf);
+	while (*p && !result) {
+		const char *end = strchrnul(p, '+');
+		char *sub = xmemdupz(p, end - p);
+
+		result = parse_combine_subfilter(filter_options, sub, errbuf);
+		free(sub);
+		if (!*end)
+			break;
+		p = end + 1;
 	}
 
 	filter_options->choice = LOFC_COMBINE;
 
 cleanup:
-	strbuf_list_free(subspecs);
 	if (result)
 		list_objects_filter_release(filter_options);
 	return result;
diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
index 0387f35a32..39211ef989 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -483,10 +483,6 @@ test_expect_success 'combine:... with non-encoded reserved chars' '
 		"must escape char in sub-filter-spec: .\~."
 '
 
-test_expect_success 'validate err msg for "combine:<valid-filter>+"' '
-	expect_invalid_filter_spec combine:tree:2+ "expected .tree:<depth>."
-'
-
 test_expect_success 'combine:... with edge-case hex digits: Ff Aa 0 9' '
 	git -C r3 rev-list --objects --filter="combine:tree:2+bl%6Fb:n%6fne" \
 		HEAD >actual &&
-- 
2.52.0.230.gd8af7cadaa


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 13:20 ` [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
@ 2026-03-11 16:28   ` Junio C Hamano
  2026-03-11 17:45     ` Jeff King
  0 siblings, 1 reply; 11+ messages in thread
From: Junio C Hamano @ 2026-03-11 16:28 UTC (permalink / raw)
  To: Deveshi Dwivedi; +Cc: git, peff

Deveshi Dwivedi <deveshigurgaon@gmail.com> writes:

> Walk the input string directly with strchrnul() to find each '+'.
> strchrnul() returns a pointer to the terminating '\0' when the
> delimiter is not found, so no separate "found separator?" branch is
> needed.  Copy each sub-spec into a temporary buffer using end - p,
> which naturally excludes the '+', so the separator is always stripped
> cleanly.  A trailing '+' causes the outer while (*p) test to fail on
> the next iteration rather than passing an empty string to the parser.

OK.  The above description is a bit more excessively focused on the
implementation (which readers can find in the patch text) than the
level of detail we usually aim for, but let's let it pass.

> +	if (!*p) {
>  		strbuf_addstr(errbuf, _("expected something after combine:"));
>  		result = 1;
>  		goto cleanup;
>  	}

This complains when "combine:" is not followed by anything.

> +	while (*p && !result) {
> +		const char *end = strchrnul(p, '+');
> +		char *sub = xmemdupz(p, end - p);
> +
> +		result = parse_combine_subfilter(filter_options, sub, errbuf);
> +		free(sub);
> +		if (!*end)
> +			break;
> +		p = end + 1;
>  	}

The usual "process up to the next '+' and skip over that '+' to find the
next piece" pattern is here.  There is nothing surprising.  Good.

The "trailing '+' problem" the proposed log message talks about
happens when the input is "combine:foo+" (this loop starts scanning
from 'f').  We find the '+' at the end of the string in "end", make
a temporary copy of 'foo' and feed it to parse_combine_subfilter(),
and move on to the NUL after '+', at which the loop control notices
that we are at the end.

It is curious what would happen when the input were "combine:foo++",
though.  What happens is that the loop begins with p pointing at 'f'
in the initial iteration, "end" points at the first '+', and a
temporary copy of 'foo' is fed to parse_combine_subfilter(), and we
move on to the second '+'.  Then the second iteration finds NUL
after that '+' in "end", and we end up calling the helper function
with a temporary copy of '+'; gently_parse_list_objects_fiter() will
reject it as an invalid filter-spec.

Logically, "foo+" would be a combination of "foo" and "" (an empty
string) and we ignore the empty string, and "foo++" would be a
combination of "foo", "" and "" (two empty strings), but we barf at
the empty string if it appears in the middle.  And recall that "" we
saw earlier at the beginning of this function was also triggered an
error.

Admittedly the original wasn't much better.  It ignored an empty
string in the middle (e.g., "foo++bar" would have fed 'foo', '', and
'bar' to parse_combine_subfilter() and an empty string would have
become a no-op) but barfed at the trailing one "foo+".  This new
implementation swaps where it barfs, complaining an empty string in
the middle and ignoring an empty string at the end.

In any case, the error behaviour against an empty filter-spec feels
a bit uneven.

Tightening to reject empty string in the middle may appear to
existing users as a regression if they are using "combine:foo++bar"
as they are forced to update it to lose the extra '+'.

By the way, instead of making a temporary copy and discarding it
repeatedly in a loop, it might be cheaper to reuse an allocated
temporary with the common pattern:

	struct strbuf temp = STRBUF_INIT;
	while (... loop ...) {
		const char *end = ...;
		strbuf_reset(&temp);
		strbuf_add(&temp, p, end - p);
		... use temp.buf ...
	}
	strbuf_release(&temp);

because _reset() only resets the len member of the strbuf without
releasing the resource, if the next piece of memory you need a
temporary copy for is shorter than the pieces you have ever used the
strbuf for, you can make the copy without a new allocation.

> -test_expect_success 'validate err msg for "combine:<valid-filter>+"' '
> -	expect_invalid_filter_spec combine:tree:2+ "expected .tree:<depth>."
> -'

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage
  2026-03-11 13:20 [PATCH v2 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
  2026-03-11 13:20 ` [PATCH v2 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
  2026-03-11 13:20 ` [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
@ 2026-03-11 17:33 ` Deveshi Dwivedi
  2026-03-11 17:33   ` [PATCH v3 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
  2026-03-11 17:33   ` [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
  2 siblings, 2 replies; 11+ messages in thread
From: Deveshi Dwivedi @ 2026-03-11 17:33 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Deveshi Dwivedi

Junio's "do not overuse strbuf_split*()" series calls out remaining
uses of strbuf_split*() as leftover bits for others to continue.
This series picks up two of them.

write_worktree_linking_files() takes two struct strbuf parameters by
value even though it only needs plain path strings.

parse_combine_filter() in list-objects-filter-options.c uses
strbuf_split_str() to split a combine: filter spec at '+'.  An array
of strbufs is unnecessary; walking the string directly with
strchrnul() is simpler and cleaner.

Changes since v2:

 * Patch 1/2: no changes.
 * Patch 2/2: Incorporate review feedback from Junio C Hamano.
   - Reuse a single strbuf instead of xmemdupz()/free() per
     iteration to avoid repeated allocations.
   - Skip empty sub-specs consistently (e.g. "foo++bar" or "foo+")
     instead of only handling the trailing case.
   - Trim commit message to focus less on implementation details.

Deveshi Dwivedi (2):
  worktree: do not pass strbuf by value
  list-objects-filter-options: avoid strbuf_split_str()

 builtin/worktree.c                  |  2 +-
 list-objects-filter-options.c       | 40 ++++++++++++++---------------
 t/t6112-rev-list-filters-objects.sh |  4 ---
 worktree.c                          | 22 ++++++++--------
 worktree.h                          |  2 +-
 5 files changed, 33 insertions(+), 37 deletions(-)

Range-diff against v2:
1:  ee6b7d1e6a = 1:  ee6b7d1e6a worktree: do not pass strbuf by value
2:  c04ddaeb95 ! 2:  9f8690b9c7 list-objects-filter-options: avoid strbuf_split_str()
    @@ Commit message
         parse_combine_subfilter(), only read the string content of the strbuf
         they receive.
     
    -    Walk the input string directly with strchrnul() to find each '+'.
    -    strchrnul() returns a pointer to the terminating '\0' when the
    -    delimiter is not found, so no separate "found separator?" branch is
    -    needed.  Copy each sub-spec into a temporary buffer using end - p,
    -    which naturally excludes the '+', so the separator is always stripped
    -    cleanly.  A trailing '+' causes the outer while (*p) test to fail on
    -    the next iteration rather than passing an empty string to the parser.
    -    Change the helpers to take const char * instead of struct strbuf *.
    +    Walk the input string directly with strchrnul() to find each '+',
    +    copying each sub-spec into a reusable temporary buffer.  The '+'
    +    delimiter is naturally excluded.  Empty sub-specs (e.g. from a
    +    trailing '+') are silently skipped for consistency.  Change the
    +    helpers to take const char * instead of struct strbuf *.
     
         The test that expected an error on a trailing '+' is removed, since
         that behavior was incorrect.
    @@ list-objects-filter-options.c: static int parse_combine_filter(
     -	struct strbuf **subspecs = strbuf_split_str(arg, '+', 0);
     -	size_t sub;
     +	const char *p = arg;
    ++	struct strbuf sub = STRBUF_INIT;
      	int result = 0;
      
     -	if (!subspecs[0]) {
    @@ list-objects-filter-options.c: static int parse_combine_filter(
     -			filter_options, subspecs[sub], errbuf);
     +	while (*p && !result) {
     +		const char *end = strchrnul(p, '+');
    -+		char *sub = xmemdupz(p, end - p);
     +
    -+		result = parse_combine_subfilter(filter_options, sub, errbuf);
    -+		free(sub);
    ++		strbuf_reset(&sub);
    ++		strbuf_add(&sub, p, end - p);
    ++
    ++		if (sub.len)
    ++			result = parse_combine_subfilter(filter_options, sub.buf, errbuf);
    ++
     +		if (!*end)
     +			break;
     +		p = end + 1;
      	}
    ++	strbuf_release(&sub);
      
      	filter_options->choice = LOFC_COMBINE;
      
-- 
2.52.0.230.gd8af7cadaa


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3 1/2] worktree: do not pass strbuf by value
  2026-03-11 17:33 ` [PATCH v3 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
@ 2026-03-11 17:33   ` Deveshi Dwivedi
  2026-03-11 17:33   ` [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
  1 sibling, 0 replies; 11+ messages in thread
From: Deveshi Dwivedi @ 2026-03-11 17:33 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Deveshi Dwivedi

write_worktree_linking_files() takes two struct strbuf parameters by
value, even though it only reads path strings from them.

Passing a strbuf by value is misleading and dangerous. The structure
carries a pointer to its underlying character array; caller and callee
end up sharing that storage.  If the callee ever causes the strbuf to
be reallocated, the caller's copy becomes a dangling pointer, which
results in a double-free when the caller does strbuf_release().

The function only needs the string values, not the strbuf machinery.
Switch it to take const char * and update all callers to pass .buf.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
---
 builtin/worktree.c |  2 +-
 worktree.c         | 22 +++++++++++-----------
 worktree.h         |  2 +-
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index bc2d0d645b..4035b1cb06 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -539,7 +539,7 @@ static int add_worktree(const char *path, const char *refname,
 
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s/gitdir", sb_repo.buf);
-	write_worktree_linking_files(sb_git, sb, opts->relative_paths);
+	write_worktree_linking_files(sb_git.buf, sb.buf, opts->relative_paths);
 	strbuf_reset(&sb);
 	strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
 	write_file(sb.buf, "../..");
diff --git a/worktree.c b/worktree.c
index 6e2f0f7828..7eba12c6ed 100644
--- a/worktree.c
+++ b/worktree.c
@@ -445,7 +445,7 @@ void update_worktree_location(struct worktree *wt, const char *path_,
 	strbuf_realpath(&path, path_, 1);
 	strbuf_addf(&dotgit, "%s/.git", path.buf);
 	if (fspathcmp(wt->path, path.buf)) {
-		write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+		write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
 
 		free(wt->path);
 		wt->path = strbuf_detach(&path, NULL);
@@ -684,7 +684,7 @@ static void repair_gitfile(struct worktree *wt,
 
 	if (repair) {
 		fn(0, wt->path, repair, cb_data);
-		write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+		write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
 	}
 
 done:
@@ -742,7 +742,7 @@ void repair_worktree_after_gitdir_move(struct worktree *wt, const char *old_path
 	if (!file_exists(dotgit.buf))
 		goto done;
 
-	write_worktree_linking_files(dotgit, gitdir, is_relative_path);
+	write_worktree_linking_files(dotgit.buf, gitdir.buf, is_relative_path);
 done:
 	strbuf_release(&gitdir);
 	strbuf_release(&dotgit);
@@ -913,7 +913,7 @@ void repair_worktree_at_path(const char *path,
 
 	if (repair) {
 		fn(0, gitdir.buf, repair, cb_data);
-		write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+		write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
 	}
 done:
 	free(dotgit_contents);
@@ -1087,17 +1087,17 @@ int init_worktree_config(struct repository *r)
 	return res;
 }
 
-void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
+void write_worktree_linking_files(const char *dotgit, const char *gitdir,
 				  int use_relative_paths)
 {
 	struct strbuf path = STRBUF_INIT;
 	struct strbuf repo = STRBUF_INIT;
 	struct strbuf tmp = STRBUF_INIT;
 
-	strbuf_addbuf(&path, &dotgit);
+	strbuf_addstr(&path, dotgit);
 	strbuf_strip_suffix(&path, "/.git");
 	strbuf_realpath(&path, path.buf, 1);
-	strbuf_addbuf(&repo, &gitdir);
+	strbuf_addstr(&repo, gitdir);
 	strbuf_strip_suffix(&repo, "/gitdir");
 	strbuf_realpath(&repo, repo.buf, 1);
 
@@ -1110,11 +1110,11 @@ void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
 	}
 
 	if (use_relative_paths) {
-		write_file(gitdir.buf, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
-		write_file(dotgit.buf, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
+		write_file(gitdir, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
+		write_file(dotgit, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
 	} else {
-		write_file(gitdir.buf, "%s/.git", path.buf);
-		write_file(dotgit.buf, "gitdir: %s", repo.buf);
+		write_file(gitdir, "%s/.git", path.buf);
+		write_file(dotgit, "gitdir: %s", repo.buf);
 	}
 
 	strbuf_release(&path);
diff --git a/worktree.h b/worktree.h
index 06efe26b83..f4e46be385 100644
--- a/worktree.h
+++ b/worktree.h
@@ -240,7 +240,7 @@ int init_worktree_config(struct repository *r);
  *  dotgit: "/path/to/foo/.git"
  *  gitdir: "/path/to/repo/worktrees/foo/gitdir"
  */
-void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
+void write_worktree_linking_files(const char *dotgit, const char *gitdir,
 				  int use_relative_paths);
 
 #endif
-- 
2.52.0.230.gd8af7cadaa


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 17:33 ` [PATCH v3 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
  2026-03-11 17:33   ` [PATCH v3 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
@ 2026-03-11 17:33   ` Deveshi Dwivedi
  2026-03-11 17:48     ` Jeff King
  1 sibling, 1 reply; 11+ messages in thread
From: Deveshi Dwivedi @ 2026-03-11 17:33 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Deveshi Dwivedi

parse_combine_filter() splits a combine: filter spec at '+' using
strbuf_split_str(), which yields an array of strbufs with the
delimiter left at the end of each non-final piece.  The code then
mutates each non-final piece to strip the trailing '+' before parsing.

Allocating an array of strbufs is unnecessary.  The function processes
one sub-spec at a time and does not use strbuf editing on the pieces.
The two helpers it calls, has_reserved_character() and
parse_combine_subfilter(), only read the string content of the strbuf
they receive.

Walk the input string directly with strchrnul() to find each '+',
copying each sub-spec into a reusable temporary buffer.  The '+'
delimiter is naturally excluded.  Empty sub-specs (e.g. from a
trailing '+') are silently skipped for consistency.  Change the
helpers to take const char * instead of struct strbuf *.

The test that expected an error on a trailing '+' is removed, since
that behavior was incorrect.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
---
 list-objects-filter-options.c       | 40 ++++++++++++++---------------
 t/t6112-rev-list-filters-objects.sh |  4 ---
 2 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7f3e7b8f50..cef67e5919 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -125,9 +125,9 @@ int gently_parse_list_objects_filter(
 static const char *RESERVED_NON_WS = "~`!@#$^&*()[]{}\\;'\",<>?";
 
 static int has_reserved_character(
-	struct strbuf *sub_spec, struct strbuf *errbuf)
+	const char *sub_spec, struct strbuf *errbuf)
 {
-	const char *c = sub_spec->buf;
+	const char *c = sub_spec;
 	while (*c) {
 		if (*c <= ' ' || strchr(RESERVED_NON_WS, *c)) {
 			strbuf_addf(
@@ -144,7 +144,7 @@ static int has_reserved_character(
 
 static int parse_combine_subfilter(
 	struct list_objects_filter_options *filter_options,
-	struct strbuf *subspec,
+	const char *subspec,
 	struct strbuf *errbuf)
 {
 	size_t new_index = filter_options->sub_nr;
@@ -155,7 +155,7 @@ static int parse_combine_subfilter(
 		      filter_options->sub_alloc);
 	list_objects_filter_init(&filter_options->sub[new_index]);
 
-	decoded = url_percent_decode(subspec->buf);
+	decoded = url_percent_decode(subspec);
 
 	result = has_reserved_character(subspec, errbuf);
 	if (result)
@@ -182,34 +182,34 @@ static int parse_combine_filter(
 	const char *arg,
 	struct strbuf *errbuf)
 {
-	struct strbuf **subspecs = strbuf_split_str(arg, '+', 0);
-	size_t sub;
+	const char *p = arg;
+	struct strbuf sub = STRBUF_INIT;
 	int result = 0;
 
-	if (!subspecs[0]) {
+	if (!*p) {
 		strbuf_addstr(errbuf, _("expected something after combine:"));
 		result = 1;
 		goto cleanup;
 	}
 
-	for (sub = 0; subspecs[sub] && !result; sub++) {
-		if (subspecs[sub + 1]) {
-			/*
-			 * This is not the last subspec. Remove trailing "+" so
-			 * we can parse it.
-			 */
-			size_t last = subspecs[sub]->len - 1;
-			assert(subspecs[sub]->buf[last] == '+');
-			strbuf_remove(subspecs[sub], last, 1);
-		}
-		result = parse_combine_subfilter(
-			filter_options, subspecs[sub], errbuf);
+	while (*p && !result) {
+		const char *end = strchrnul(p, '+');
+
+		strbuf_reset(&sub);
+		strbuf_add(&sub, p, end - p);
+
+		if (sub.len)
+			result = parse_combine_subfilter(filter_options, sub.buf, errbuf);
+
+		if (!*end)
+			break;
+		p = end + 1;
 	}
+	strbuf_release(&sub);
 
 	filter_options->choice = LOFC_COMBINE;
 
 cleanup:
-	strbuf_list_free(subspecs);
 	if (result)
 		list_objects_filter_release(filter_options);
 	return result;
diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
index 0387f35a32..39211ef989 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -483,10 +483,6 @@ test_expect_success 'combine:... with non-encoded reserved chars' '
 		"must escape char in sub-filter-spec: .\~."
 '
 
-test_expect_success 'validate err msg for "combine:<valid-filter>+"' '
-	expect_invalid_filter_spec combine:tree:2+ "expected .tree:<depth>."
-'
-
 test_expect_success 'combine:... with edge-case hex digits: Ff Aa 0 9' '
 	git -C r3 rev-list --objects --filter="combine:tree:2+bl%6Fb:n%6fne" \
 		HEAD >actual &&
-- 
2.52.0.230.gd8af7cadaa


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 16:28   ` Junio C Hamano
@ 2026-03-11 17:45     ` Jeff King
  2026-03-11 18:07       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2026-03-11 17:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Deveshi Dwivedi, git

On Wed, Mar 11, 2026 at 09:28:21AM -0700, Junio C Hamano wrote:

> > +	while (*p && !result) {
> > +		const char *end = strchrnul(p, '+');
> > +		char *sub = xmemdupz(p, end - p);
> > +
> > +		result = parse_combine_subfilter(filter_options, sub, errbuf);
> > +		free(sub);
> > +		if (!*end)
> > +			break;
> > +		p = end + 1;
> >  	}
> [...]
> It is curious what would happen when the input were "combine:foo++",
> though.  What happens is that the loop begins with p pointing at 'f'
> in the initial iteration, "end" points at the first '+', and a
> temporary copy of 'foo' is fed to parse_combine_subfilter(), and we
> move on to the second '+'.  Then the second iteration finds NUL
> after that '+' in "end", and we end up calling the helper function
> with a temporary copy of '+'; gently_parse_list_objects_fiter() will
> reject it as an invalid filter-spec.

I don't think this is quite right. After we skip the first "+" and "p"
points to the second one, then strchrnul() will find that second "+",
not NUL. And so we have a 0-length spec, and feed the empty string to
parse_combine_subfilter(), which complains.

It is the same behavior when we see "++" in the middle of the string.

> Logically, "foo+" would be a combination of "foo" and "" (an empty
> string) and we ignore the empty string, and "foo++" would be a
> combination of "foo", "" and "" (two empty strings), but we barf at
> the empty string if it appears in the middle.  And recall that "" we
> saw earlier at the beginning of this function was also triggered an
> error.
> 
> Admittedly the original wasn't much better.  It ignored an empty
> string in the middle (e.g., "foo++bar" would have fed 'foo', '', and
> 'bar' to parse_combine_subfilter() and an empty string would have
> become a no-op) but barfed at the trailing one "foo+".  This new
> implementation swaps where it barfs, complaining an empty string in
> the middle and ignoring an empty string at the end.
> 
> In any case, the error behaviour against an empty filter-spec feels
> a bit uneven.
> 
> Tightening to reject empty string in the middle may appear to
> existing users as a regression if they are using "combine:foo++bar"
> as they are forced to update it to lose the extra '+'.

But yeah, it is somewhat inconsistent that we complain about an empty
spec in the middle, but not at the end. If we were starting from
scratch, I'd probably forbid it everywhere. But since we allow it in
some cases now, it may be worth being more permissive.

It is easy to check in the loop, or even just teach the helper to make
empty specs a noop:

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 616c6c7faa..56e1c651f6 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -151,6 +151,9 @@ static int parse_combine_subfilter(
 	char *decoded;
 	int result;
 
+	if (!*subspec)
+		return 0;
+
 	ALLOC_GROW_BY(filter_options->sub, filter_options->sub_nr, 1,
 		      filter_options->sub_alloc);
 	list_objects_filter_init(&filter_options->sub[new_index]);

> By the way, instead of making a temporary copy and discarding it
> repeatedly in a loop, it might be cheaper to reuse an allocated
> temporary with the common pattern:
> 
> 	struct strbuf temp = STRBUF_INIT;
> 	while (... loop ...) {
> 		const char *end = ...;
> 		strbuf_reset(&temp);
> 		strbuf_add(&temp, p, end - p);
> 		... use temp.buf ...
> 	}
> 	strbuf_release(&temp);
> 
> because _reset() only resets the len member of the strbuf without
> releasing the resource, if the next piece of memory you need a
> temporary copy for is shorter than the pieces you have ever used the
> strbuf for, you can make the copy without a new allocation.

As a general strategy, I agree this is a good one. But I'd be quite
surprised if it ever made a measurable difference for this loop, which
we'd expect to trigger a handful of times (and which allocates in the
sub-function anyway).

-Peff

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 17:33   ` [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
@ 2026-03-11 17:48     ` Jeff King
  2026-03-11 18:13       ` Junio C Hamano
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff King @ 2026-03-11 17:48 UTC (permalink / raw)
  To: Deveshi Dwivedi; +Cc: git, gitster

On Wed, Mar 11, 2026 at 05:33:36PM +0000, Deveshi Dwivedi wrote:

> +	while (*p && !result) {
> +		const char *end = strchrnul(p, '+');
> +
> +		strbuf_reset(&sub);
> +		strbuf_add(&sub, p, end - p);
> +
> +		if (sub.len)
> +			result = parse_combine_subfilter(filter_options, sub.buf, errbuf);
> +
> +		if (!*end)
> +			break;
> +		p = end + 1;
>  	}
> +	strbuf_release(&sub);

This version of the loop looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 17:45     ` Jeff King
@ 2026-03-11 18:07       ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2026-03-11 18:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Deveshi Dwivedi, git

Jeff King <peff@peff.net> writes:

> I don't think this is quite right. After we skip the first "+" and "p"
> points to the second one, then strchrnul() will find that second "+",
> not NUL. And so we have a 0-length spec, and feed the empty string to
> parse_combine_subfilter(), which complains.

Ah, I misread gently_parse_list_objects_fiter(), which makes a NULL
arg a silent no-op, but fully complains on an empty string.  Thanks.

> But yeah, it is somewhat inconsistent that we complain about an empty
> spec in the middle, but not at the end. If we were starting from
> scratch, I'd probably forbid it everywhere. But since we allow it in
> some cases now, it may be worth being more permissive.
>
> It is easy to check in the loop, or even just teach the helper to make
> empty specs a noop:

Yup.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str()
  2026-03-11 17:48     ` Jeff King
@ 2026-03-11 18:13       ` Junio C Hamano
  0 siblings, 0 replies; 11+ messages in thread
From: Junio C Hamano @ 2026-03-11 18:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Deveshi Dwivedi, git

Jeff King <peff@peff.net> writes:

> On Wed, Mar 11, 2026 at 05:33:36PM +0000, Deveshi Dwivedi wrote:
>
>> +	while (*p && !result) {
>> +		const char *end = strchrnul(p, '+');
>> +
>> +		strbuf_reset(&sub);
>> +		strbuf_add(&sub, p, end - p);
>> +
>> +		if (sub.len)
>> +			result = parse_combine_subfilter(filter_options, sub.buf, errbuf);
>> +
>> +		if (!*end)
>> +			break;
>> +		p = end + 1;
>>  	}
>> +	strbuf_release(&sub);
>
> This version of the loop looks good to me.
>
> -Peff

Thanks, both.  Will queue.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-03-11 18:13 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-11 13:20 [PATCH v2 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
2026-03-11 13:20 ` [PATCH v2 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
2026-03-11 13:20 ` [PATCH v2 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
2026-03-11 16:28   ` Junio C Hamano
2026-03-11 17:45     ` Jeff King
2026-03-11 18:07       ` Junio C Hamano
2026-03-11 17:33 ` [PATCH v3 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
2026-03-11 17:33   ` [PATCH v3 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
2026-03-11 17:33   ` [PATCH v3 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
2026-03-11 17:48     ` Jeff King
2026-03-11 18:13       ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox