* [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage
@ 2026-03-08 18:03 Deveshi Dwivedi
2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
0 siblings, 2 replies; 8+ messages in thread
From: Deveshi Dwivedi @ 2026-03-08 18:03 UTC (permalink / raw)
To: git; +Cc: gitster, peff, Deveshi Dwivedi
Junio's "do not overuse strbuf_split*()" series [1] calls out
remaining uses of strbuf_split*() as leftover bits for others to
continue. This series picks up two of them.
write_worktree_linking_files() takes two struct strbuf parameters by
value even though it only needs plain path strings. This is the same
class of problem fixed for builtin/clean.c in that series.
parse_combine_filter() in list-objects-filter-options.c uses
strbuf_split_str() to split a combine: filter spec at '+', then
mutates each non-final piece to strip the trailing delimiter. An
array of strbufs is unnecessary; the function processes one sub-spec
at a time and does not use strbuf editing on the pieces.
[1]: https://lore.kernel.org/git/20250731225433.4028872-1-gitster@pobox.com/
Deveshi Dwivedi (2):
worktree: do not pass strbuf by value
list-objects-filter-options: avoid strbuf_split_str()
builtin/worktree.c | 2 +-
list-objects-filter-options.c | 40 +++++++++++++++++------------------
worktree.c | 22 +++++++++----------
worktree.h | 2 +-
4 files changed, 33 insertions(+), 33 deletions(-)
--
2.52.0.230.gd8af7cadaa
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v1 1/2] worktree: do not pass strbuf by value
2026-03-08 18:03 [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
@ 2026-03-08 18:03 ` Deveshi Dwivedi
2026-03-09 14:48 ` Junio C Hamano
2026-03-09 19:26 ` coccinelle to catch pass-by-value?, was: " Jeff King
2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
1 sibling, 2 replies; 8+ messages in thread
From: Deveshi Dwivedi @ 2026-03-08 18:03 UTC (permalink / raw)
To: git; +Cc: gitster, peff, Deveshi Dwivedi
write_worktree_linking_files() takes two struct strbuf parameters by
value, even though it only reads path strings from them.
Passing a strbuf by value is misleading and dangerous. The structure
carries a pointer to its underlying character array; caller and callee
end up sharing that storage. If the callee ever causes the strbuf to
be reallocated, the caller's copy becomes a dangling pointer, which
results in a double-free when the caller does strbuf_release().
The function only needs the string values, not the strbuf machinery.
Switch it to take const char * and update all callers to pass .buf.
Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
---
builtin/worktree.c | 2 +-
worktree.c | 22 +++++++++++-----------
worktree.h | 2 +-
3 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/builtin/worktree.c b/builtin/worktree.c
index bc2d0d645b..4035b1cb06 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -539,7 +539,7 @@ static int add_worktree(const char *path, const char *refname,
strbuf_reset(&sb);
strbuf_addf(&sb, "%s/gitdir", sb_repo.buf);
- write_worktree_linking_files(sb_git, sb, opts->relative_paths);
+ write_worktree_linking_files(sb_git.buf, sb.buf, opts->relative_paths);
strbuf_reset(&sb);
strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
write_file(sb.buf, "../..");
diff --git a/worktree.c b/worktree.c
index 6e2f0f7828..7eba12c6ed 100644
--- a/worktree.c
+++ b/worktree.c
@@ -445,7 +445,7 @@ void update_worktree_location(struct worktree *wt, const char *path_,
strbuf_realpath(&path, path_, 1);
strbuf_addf(&dotgit, "%s/.git", path.buf);
if (fspathcmp(wt->path, path.buf)) {
- write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+ write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
free(wt->path);
wt->path = strbuf_detach(&path, NULL);
@@ -684,7 +684,7 @@ static void repair_gitfile(struct worktree *wt,
if (repair) {
fn(0, wt->path, repair, cb_data);
- write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+ write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
}
done:
@@ -742,7 +742,7 @@ void repair_worktree_after_gitdir_move(struct worktree *wt, const char *old_path
if (!file_exists(dotgit.buf))
goto done;
- write_worktree_linking_files(dotgit, gitdir, is_relative_path);
+ write_worktree_linking_files(dotgit.buf, gitdir.buf, is_relative_path);
done:
strbuf_release(&gitdir);
strbuf_release(&dotgit);
@@ -913,7 +913,7 @@ void repair_worktree_at_path(const char *path,
if (repair) {
fn(0, gitdir.buf, repair, cb_data);
- write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
+ write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
}
done:
free(dotgit_contents);
@@ -1087,17 +1087,17 @@ int init_worktree_config(struct repository *r)
return res;
}
-void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
+void write_worktree_linking_files(const char *dotgit, const char *gitdir,
int use_relative_paths)
{
struct strbuf path = STRBUF_INIT;
struct strbuf repo = STRBUF_INIT;
struct strbuf tmp = STRBUF_INIT;
- strbuf_addbuf(&path, &dotgit);
+ strbuf_addstr(&path, dotgit);
strbuf_strip_suffix(&path, "/.git");
strbuf_realpath(&path, path.buf, 1);
- strbuf_addbuf(&repo, &gitdir);
+ strbuf_addstr(&repo, gitdir);
strbuf_strip_suffix(&repo, "/gitdir");
strbuf_realpath(&repo, repo.buf, 1);
@@ -1110,11 +1110,11 @@ void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
}
if (use_relative_paths) {
- write_file(gitdir.buf, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
- write_file(dotgit.buf, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
+ write_file(gitdir, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
+ write_file(dotgit, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
} else {
- write_file(gitdir.buf, "%s/.git", path.buf);
- write_file(dotgit.buf, "gitdir: %s", repo.buf);
+ write_file(gitdir, "%s/.git", path.buf);
+ write_file(dotgit, "gitdir: %s", repo.buf);
}
strbuf_release(&path);
diff --git a/worktree.h b/worktree.h
index 06efe26b83..f4e46be385 100644
--- a/worktree.h
+++ b/worktree.h
@@ -240,7 +240,7 @@ int init_worktree_config(struct repository *r);
* dotgit: "/path/to/foo/.git"
* gitdir: "/path/to/repo/worktrees/foo/gitdir"
*/
-void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
+void write_worktree_linking_files(const char *dotgit, const char *gitdir,
int use_relative_paths);
#endif
--
2.52.0.230.gd8af7cadaa
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str()
2026-03-08 18:03 [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
@ 2026-03-08 18:03 ` Deveshi Dwivedi
2026-03-09 15:38 ` Junio C Hamano
2026-03-09 19:08 ` Jeff King
1 sibling, 2 replies; 8+ messages in thread
From: Deveshi Dwivedi @ 2026-03-08 18:03 UTC (permalink / raw)
To: git; +Cc: gitster, peff, Deveshi Dwivedi
parse_combine_filter() splits a combine: filter spec at '+' using
strbuf_split_str(), which yields an array of strbufs with the
delimiter left at the end of each non-final piece. The code then
mutates each non-final piece to strip the trailing '+' before parsing.
Allocating an array of strbufs is unnecessary. The function processes
one sub-spec at a time and does not use strbuf editing on the pieces.
The two helpers it calls, has_reserved_character() and
parse_combine_subfilter(), only read the string content of the strbuf
they receive.
Walk the input string directly with strchr() to find each '+'. Copy
each sub-spec into a temporary buffer and strip the '+' only when
another sub-spec follows. Change the helpers to take const char *
instead of struct strbuf *.
Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
---
list-objects-filter-options.c | 40 +++++++++++++++++------------------
1 file changed, 20 insertions(+), 20 deletions(-)
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 7f3e7b8f50..f536085a7c 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -125,9 +125,9 @@ int gently_parse_list_objects_filter(
static const char *RESERVED_NON_WS = "~`!@#$^&*()[]{}\\;'\",<>?";
static int has_reserved_character(
- struct strbuf *sub_spec, struct strbuf *errbuf)
+ const char *sub_spec, struct strbuf *errbuf)
{
- const char *c = sub_spec->buf;
+ const char *c = sub_spec;
while (*c) {
if (*c <= ' ' || strchr(RESERVED_NON_WS, *c)) {
strbuf_addf(
@@ -144,7 +144,7 @@ static int has_reserved_character(
static int parse_combine_subfilter(
struct list_objects_filter_options *filter_options,
- struct strbuf *subspec,
+ const char *subspec,
struct strbuf *errbuf)
{
size_t new_index = filter_options->sub_nr;
@@ -155,7 +155,7 @@ static int parse_combine_subfilter(
filter_options->sub_alloc);
list_objects_filter_init(&filter_options->sub[new_index]);
- decoded = url_percent_decode(subspec->buf);
+ decoded = url_percent_decode(subspec);
result = has_reserved_character(subspec, errbuf);
if (result)
@@ -182,34 +182,34 @@ static int parse_combine_filter(
const char *arg,
struct strbuf *errbuf)
{
- struct strbuf **subspecs = strbuf_split_str(arg, '+', 0);
- size_t sub;
+ const char *p = arg;
int result = 0;
- if (!subspecs[0]) {
+ if (!*p) {
strbuf_addstr(errbuf, _("expected something after combine:"));
result = 1;
goto cleanup;
}
- for (sub = 0; subspecs[sub] && !result; sub++) {
- if (subspecs[sub + 1]) {
- /*
- * This is not the last subspec. Remove trailing "+" so
- * we can parse it.
- */
- size_t last = subspecs[sub]->len - 1;
- assert(subspecs[sub]->buf[last] == '+');
- strbuf_remove(subspecs[sub], last, 1);
- }
- result = parse_combine_subfilter(
- filter_options, subspecs[sub], errbuf);
+ while (*p && !result) {
+ const char *sep = strchr(p, '+');
+ size_t len = sep ? (size_t)(sep - p + 1) : strlen(p);
+ char *sub = xmemdupz(p, len);
+
+ /* strip '+' separator, but only when more sub-specs follow */
+ if (sep && *(sep + 1))
+ sub[len - 1] = '\0';
+
+ result = parse_combine_subfilter(filter_options, sub, errbuf);
+ free(sub);
+ if (!sep)
+ break;
+ p = sep + 1;
}
filter_options->choice = LOFC_COMBINE;
cleanup:
- strbuf_list_free(subspecs);
if (result)
list_objects_filter_release(filter_options);
return result;
--
2.52.0.230.gd8af7cadaa
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v1 1/2] worktree: do not pass strbuf by value
2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
@ 2026-03-09 14:48 ` Junio C Hamano
2026-03-09 19:26 ` coccinelle to catch pass-by-value?, was: " Jeff King
1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2026-03-09 14:48 UTC (permalink / raw)
To: Deveshi Dwivedi; +Cc: git, peff
Deveshi Dwivedi <deveshigurgaon@gmail.com> writes:
> The function only needs the string values, not the strbuf machinery.
> Switch it to take const char * and update all callers to pass .buf.
Makes perfect sense. Thanks for noticing and fixing these.
> - write_worktree_linking_files(dotgit, gitdir, use_relative_paths);
> + write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths);
> ...
> -void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir,
> +void write_worktree_linking_files(const char *dotgit, const char *gitdir,
> int use_relative_paths)
> {
This updated function signature makes it plenty clear that the
function does not modify anything in gitdir, so the caller shouldn't
be affected. Nice.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str()
2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
@ 2026-03-09 15:38 ` Junio C Hamano
2026-03-09 19:01 ` Jeff King
2026-03-09 19:08 ` Jeff King
1 sibling, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2026-03-09 15:38 UTC (permalink / raw)
To: Deveshi Dwivedi; +Cc: git, peff
Deveshi Dwivedi <deveshigurgaon@gmail.com> writes:
> parse_combine_filter() splits a combine: filter spec at '+' using
> strbuf_split_str(), which yields an array of strbufs with the
> delimiter left at the end of each non-final piece. The code then
> mutates each non-final piece to strip the trailing '+' before parsing.
>
> Allocating an array of strbufs is unnecessary. The function processes
> one sub-spec at a time and does not use strbuf editing on the pieces.
> The two helpers it calls, has_reserved_character() and
> parse_combine_subfilter(), only read the string content of the strbuf
> they receive.
>
> Walk the input string directly with strchr() to find each '+'. Copy
> each sub-spec into a temporary buffer and strip the '+' only when
> another sub-spec follows. Change the helpers to take const char *
> instead of struct strbuf *.
Makes sense. Instead of finding '+' and making many small copies
piecemeal, you could make a single copy of "const char *arg" once,
walk that string using strchr() looking for the next '+', and
replace '+' with '\0' before processing the current piece and
iterate, which may reduce the need for many small allocations and
deallocations, but I do not know if it is worth it. Benchmarking
it would not yield measurable difference, I suspect.
> + while (*p && !result) {
> + const char *sep = strchr(p, '+');
> + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p);
> + char *sub = xmemdupz(p, len);
> +
> + /* strip '+' separator, but only when more sub-specs follow */
> + if (sep && *(sep + 1))
> + sub[len - 1] = '\0';
> +
> + result = parse_combine_subfilter(filter_options, sub, errbuf);
> + free(sub);
> + if (!sep)
> + break;
> + p = sep + 1;
> }
Hmph, would this loop handle a trailing '+' the same way as before,
e.g., "combine:tree:2+"? The original would have split the string
into ["tree:2+", ""] and the last call to parse_combine_subfilter()
would have been made with an empty string. The new code does not
make that last call with an empty string. Perhaps the differences
do not matter? I dunno.
Other than that, nice to see one fewer use of "splitting into an
array of strbuf" pattern.
Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str()
2026-03-09 15:38 ` Junio C Hamano
@ 2026-03-09 19:01 ` Jeff King
0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2026-03-09 19:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Deveshi Dwivedi, git
On Mon, Mar 09, 2026 at 08:38:16AM -0700, Junio C Hamano wrote:
> > + while (*p && !result) {
> > + const char *sep = strchr(p, '+');
> > + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p);
> > + char *sub = xmemdupz(p, len);
> > +
> > + /* strip '+' separator, but only when more sub-specs follow */
> > + if (sep && *(sep + 1))
> > + sub[len - 1] = '\0';
> > +
> > + result = parse_combine_subfilter(filter_options, sub, errbuf);
> > + free(sub);
> > + if (!sep)
> > + break;
> > + p = sep + 1;
> > }
>
> Hmph, would this loop handle a trailing '+' the same way as before,
> e.g., "combine:tree:2+"? The original would have split the string
> into ["tree:2+", ""] and the last call to parse_combine_subfilter()
> would have been made with an empty string. The new code does not
> make that last call with an empty string. Perhaps the differences
> do not matter? I dunno.
I think the original was wrong in its parsing. The first entry should be
"tree:2", without the trailing "+". And that would almost always result
in rejecting the string, because the "+" doesn't make any sense for most
filters. The exception is:
blob=$(echo foo | git hash-object -w --stdin)
git tag foo+ $blob
git rev-list --filter=combine:sparse:oid=foo+
which happens to work. But it is wrong according to the documentation,
which says that "+" needs to be escaped if you want it passed along to
the sub-filter.
So flagging a trailing "+" as an error would probably be OK, and match
what happens now. But quietly ignoring it is perhaps friendlier (or less
friendly, if you think it might let a typo'd input go unnoticed).
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str()
2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
2026-03-09 15:38 ` Junio C Hamano
@ 2026-03-09 19:08 ` Jeff King
1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2026-03-09 19:08 UTC (permalink / raw)
To: Deveshi Dwivedi; +Cc: git, gitster
On Sun, Mar 08, 2026 at 06:03:59PM +0000, Deveshi Dwivedi wrote:
> + while (*p && !result) {
> + const char *sep = strchr(p, '+');
> + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p);
> + char *sub = xmemdupz(p, len);
The cast to size_t made me look twice to see if something tricky was
going on. We don't usually bother explicitly casting from a ptrdiff_t
into a size_t.
However, might this all be simpler with strchrnul? Something like:
const char *end = strchrnul(p, '+');
char *sub = xmemdupz(p, end - p);
...parse sub...
if (!*end)
break; /* found NUL at end of string */
p = end + 1;
Notice I cut off the "+" when we find it, because I think...
> + /* strip '+' separator, but only when more sub-specs follow */
> + if (sep && *(sep + 1))
> + sub[len - 1] = '\0';
...this is wrong. I know you are matching what the current code does,
but it does not match the documentation, and does not actually make any
sense in practice.
Other than that, this looks nice, and I am happy to see more
strbuf_split() calls going away.
I think you could in theory drop the xmemdupz() here, too, and feed the
ptr/len combo into parse_combine_subfilter(), which then percent-decodes
into a newly allocated buffer. But it is probably not worth trying to
squeeze out one extra allocation here. It is not like people have huge
lists of combined filters; we'd expect to see a couple at most.
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
* coccinelle to catch pass-by-value?, was: [PATCH v1 1/2] worktree: do not pass strbuf by value
2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
2026-03-09 14:48 ` Junio C Hamano
@ 2026-03-09 19:26 ` Jeff King
1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2026-03-09 19:26 UTC (permalink / raw)
To: Deveshi Dwivedi; +Cc: git, gitster
On Sun, Mar 08, 2026 at 06:03:58PM +0000, Deveshi Dwivedi wrote:
> The function only needs the string values, not the strbuf machinery.
> Switch it to take const char * and update all callers to pass .buf.
Nice catch. I wonder if we can get the compiler or other static analysis
to complain about this mistake. The best I could come up with is:
diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 5f06105df6..665f56d070 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -60,3 +60,10 @@ expression E1, E2;
@@
- strbuf_addstr(E1, real_path(E2));
+ strbuf_add_real_path(E1, E2);
+
+@@
+expression F, ARG1, ARG2;
+struct strbuf SB;
+@@
+- F(ARG1, SB, ARG2)
++ F(ARG1, &SB, ARG2)
It rewrites a non-pointer argument into a pointer. That's not enough to
actually make the code work, but it would alert a developer that they
needed to follow-through on the rest of it. Or maybe it would just
confuse them without further hints.
I think there may be a way to get coccinelle to just emit an error
message describing the situation, but it relies on python extensions,
which I'm not sure we currently require.
Anyway, your patch is obviously good and anything further we do would
want to come on top of it.
-Peff
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-03-09 19:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-08 18:03 [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi
2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi
2026-03-09 14:48 ` Junio C Hamano
2026-03-09 19:26 ` coccinelle to catch pass-by-value?, was: " Jeff King
2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi
2026-03-09 15:38 ` Junio C Hamano
2026-03-09 19:01 ` Jeff King
2026-03-09 19:08 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox