* [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage @ 2026-03-08 18:03 Deveshi Dwivedi 2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi 2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi 0 siblings, 2 replies; 8+ messages in thread From: Deveshi Dwivedi @ 2026-03-08 18:03 UTC (permalink / raw) To: git; +Cc: gitster, peff, Deveshi Dwivedi Junio's "do not overuse strbuf_split*()" series [1] calls out remaining uses of strbuf_split*() as leftover bits for others to continue. This series picks up two of them. write_worktree_linking_files() takes two struct strbuf parameters by value even though it only needs plain path strings. This is the same class of problem fixed for builtin/clean.c in that series. parse_combine_filter() in list-objects-filter-options.c uses strbuf_split_str() to split a combine: filter spec at '+', then mutates each non-final piece to strip the trailing delimiter. An array of strbufs is unnecessary; the function processes one sub-spec at a time and does not use strbuf editing on the pieces. [1]: https://lore.kernel.org/git/20250731225433.4028872-1-gitster@pobox.com/ Deveshi Dwivedi (2): worktree: do not pass strbuf by value list-objects-filter-options: avoid strbuf_split_str() builtin/worktree.c | 2 +- list-objects-filter-options.c | 40 +++++++++++++++++------------------ worktree.c | 22 +++++++++---------- worktree.h | 2 +- 4 files changed, 33 insertions(+), 33 deletions(-) -- 2.52.0.230.gd8af7cadaa ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v1 1/2] worktree: do not pass strbuf by value 2026-03-08 18:03 [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi @ 2026-03-08 18:03 ` Deveshi Dwivedi 2026-03-09 14:48 ` Junio C Hamano 2026-03-09 19:26 ` coccinelle to catch pass-by-value?, was: " Jeff King 2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi 1 sibling, 2 replies; 8+ messages in thread From: Deveshi Dwivedi @ 2026-03-08 18:03 UTC (permalink / raw) To: git; +Cc: gitster, peff, Deveshi Dwivedi write_worktree_linking_files() takes two struct strbuf parameters by value, even though it only reads path strings from them. Passing a strbuf by value is misleading and dangerous. The structure carries a pointer to its underlying character array; caller and callee end up sharing that storage. If the callee ever causes the strbuf to be reallocated, the caller's copy becomes a dangling pointer, which results in a double-free when the caller does strbuf_release(). The function only needs the string values, not the strbuf machinery. Switch it to take const char * and update all callers to pass .buf. Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com> --- builtin/worktree.c | 2 +- worktree.c | 22 +++++++++++----------- worktree.h | 2 +- 3 files changed, 13 insertions(+), 13 deletions(-) diff --git a/builtin/worktree.c b/builtin/worktree.c index bc2d0d645b..4035b1cb06 100644 --- a/builtin/worktree.c +++ b/builtin/worktree.c @@ -539,7 +539,7 @@ static int add_worktree(const char *path, const char *refname, strbuf_reset(&sb); strbuf_addf(&sb, "%s/gitdir", sb_repo.buf); - write_worktree_linking_files(sb_git, sb, opts->relative_paths); + write_worktree_linking_files(sb_git.buf, sb.buf, opts->relative_paths); strbuf_reset(&sb); strbuf_addf(&sb, "%s/commondir", sb_repo.buf); write_file(sb.buf, "../.."); diff --git a/worktree.c b/worktree.c index 6e2f0f7828..7eba12c6ed 100644 --- a/worktree.c +++ b/worktree.c @@ -445,7 +445,7 @@ void update_worktree_location(struct worktree *wt, const char *path_, strbuf_realpath(&path, path_, 1); strbuf_addf(&dotgit, "%s/.git", path.buf); if (fspathcmp(wt->path, path.buf)) { - write_worktree_linking_files(dotgit, gitdir, use_relative_paths); + write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths); free(wt->path); wt->path = strbuf_detach(&path, NULL); @@ -684,7 +684,7 @@ static void repair_gitfile(struct worktree *wt, if (repair) { fn(0, wt->path, repair, cb_data); - write_worktree_linking_files(dotgit, gitdir, use_relative_paths); + write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths); } done: @@ -742,7 +742,7 @@ void repair_worktree_after_gitdir_move(struct worktree *wt, const char *old_path if (!file_exists(dotgit.buf)) goto done; - write_worktree_linking_files(dotgit, gitdir, is_relative_path); + write_worktree_linking_files(dotgit.buf, gitdir.buf, is_relative_path); done: strbuf_release(&gitdir); strbuf_release(&dotgit); @@ -913,7 +913,7 @@ void repair_worktree_at_path(const char *path, if (repair) { fn(0, gitdir.buf, repair, cb_data); - write_worktree_linking_files(dotgit, gitdir, use_relative_paths); + write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths); } done: free(dotgit_contents); @@ -1087,17 +1087,17 @@ int init_worktree_config(struct repository *r) return res; } -void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir, +void write_worktree_linking_files(const char *dotgit, const char *gitdir, int use_relative_paths) { struct strbuf path = STRBUF_INIT; struct strbuf repo = STRBUF_INIT; struct strbuf tmp = STRBUF_INIT; - strbuf_addbuf(&path, &dotgit); + strbuf_addstr(&path, dotgit); strbuf_strip_suffix(&path, "/.git"); strbuf_realpath(&path, path.buf, 1); - strbuf_addbuf(&repo, &gitdir); + strbuf_addstr(&repo, gitdir); strbuf_strip_suffix(&repo, "/gitdir"); strbuf_realpath(&repo, repo.buf, 1); @@ -1110,11 +1110,11 @@ void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir, } if (use_relative_paths) { - write_file(gitdir.buf, "%s/.git", relative_path(path.buf, repo.buf, &tmp)); - write_file(dotgit.buf, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp)); + write_file(gitdir, "%s/.git", relative_path(path.buf, repo.buf, &tmp)); + write_file(dotgit, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp)); } else { - write_file(gitdir.buf, "%s/.git", path.buf); - write_file(dotgit.buf, "gitdir: %s", repo.buf); + write_file(gitdir, "%s/.git", path.buf); + write_file(dotgit, "gitdir: %s", repo.buf); } strbuf_release(&path); diff --git a/worktree.h b/worktree.h index 06efe26b83..f4e46be385 100644 --- a/worktree.h +++ b/worktree.h @@ -240,7 +240,7 @@ int init_worktree_config(struct repository *r); * dotgit: "/path/to/foo/.git" * gitdir: "/path/to/repo/worktrees/foo/gitdir" */ -void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir, +void write_worktree_linking_files(const char *dotgit, const char *gitdir, int use_relative_paths); #endif -- 2.52.0.230.gd8af7cadaa ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v1 1/2] worktree: do not pass strbuf by value 2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi @ 2026-03-09 14:48 ` Junio C Hamano 2026-03-09 19:26 ` coccinelle to catch pass-by-value?, was: " Jeff King 1 sibling, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2026-03-09 14:48 UTC (permalink / raw) To: Deveshi Dwivedi; +Cc: git, peff Deveshi Dwivedi <deveshigurgaon@gmail.com> writes: > The function only needs the string values, not the strbuf machinery. > Switch it to take const char * and update all callers to pass .buf. Makes perfect sense. Thanks for noticing and fixing these. > - write_worktree_linking_files(dotgit, gitdir, use_relative_paths); > + write_worktree_linking_files(dotgit.buf, gitdir.buf, use_relative_paths); > ... > -void write_worktree_linking_files(struct strbuf dotgit, struct strbuf gitdir, > +void write_worktree_linking_files(const char *dotgit, const char *gitdir, > int use_relative_paths) > { This updated function signature makes it plenty clear that the function does not modify anything in gitdir, so the caller shouldn't be affected. Nice. ^ permalink raw reply [flat|nested] 8+ messages in thread
* coccinelle to catch pass-by-value?, was: [PATCH v1 1/2] worktree: do not pass strbuf by value 2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi 2026-03-09 14:48 ` Junio C Hamano @ 2026-03-09 19:26 ` Jeff King 1 sibling, 0 replies; 8+ messages in thread From: Jeff King @ 2026-03-09 19:26 UTC (permalink / raw) To: Deveshi Dwivedi; +Cc: git, gitster On Sun, Mar 08, 2026 at 06:03:58PM +0000, Deveshi Dwivedi wrote: > The function only needs the string values, not the strbuf machinery. > Switch it to take const char * and update all callers to pass .buf. Nice catch. I wonder if we can get the compiler or other static analysis to complain about this mistake. The best I could come up with is: diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci index 5f06105df6..665f56d070 100644 --- a/contrib/coccinelle/strbuf.cocci +++ b/contrib/coccinelle/strbuf.cocci @@ -60,3 +60,10 @@ expression E1, E2; @@ - strbuf_addstr(E1, real_path(E2)); + strbuf_add_real_path(E1, E2); + +@@ +expression F, ARG1, ARG2; +struct strbuf SB; +@@ +- F(ARG1, SB, ARG2) ++ F(ARG1, &SB, ARG2) It rewrites a non-pointer argument into a pointer. That's not enough to actually make the code work, but it would alert a developer that they needed to follow-through on the rest of it. Or maybe it would just confuse them without further hints. I think there may be a way to get coccinelle to just emit an error message describing the situation, but it relies on python extensions, which I'm not sure we currently require. Anyway, your patch is obviously good and anything further we do would want to come on top of it. -Peff ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() 2026-03-08 18:03 [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi 2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi @ 2026-03-08 18:03 ` Deveshi Dwivedi 2026-03-09 15:38 ` Junio C Hamano 2026-03-09 19:08 ` Jeff King 1 sibling, 2 replies; 8+ messages in thread From: Deveshi Dwivedi @ 2026-03-08 18:03 UTC (permalink / raw) To: git; +Cc: gitster, peff, Deveshi Dwivedi parse_combine_filter() splits a combine: filter spec at '+' using strbuf_split_str(), which yields an array of strbufs with the delimiter left at the end of each non-final piece. The code then mutates each non-final piece to strip the trailing '+' before parsing. Allocating an array of strbufs is unnecessary. The function processes one sub-spec at a time and does not use strbuf editing on the pieces. The two helpers it calls, has_reserved_character() and parse_combine_subfilter(), only read the string content of the strbuf they receive. Walk the input string directly with strchr() to find each '+'. Copy each sub-spec into a temporary buffer and strip the '+' only when another sub-spec follows. Change the helpers to take const char * instead of struct strbuf *. Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com> --- list-objects-filter-options.c | 40 +++++++++++++++++------------------ 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 7f3e7b8f50..f536085a7c 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -125,9 +125,9 @@ int gently_parse_list_objects_filter( static const char *RESERVED_NON_WS = "~`!@#$^&*()[]{}\\;'\",<>?"; static int has_reserved_character( - struct strbuf *sub_spec, struct strbuf *errbuf) + const char *sub_spec, struct strbuf *errbuf) { - const char *c = sub_spec->buf; + const char *c = sub_spec; while (*c) { if (*c <= ' ' || strchr(RESERVED_NON_WS, *c)) { strbuf_addf( @@ -144,7 +144,7 @@ static int has_reserved_character( static int parse_combine_subfilter( struct list_objects_filter_options *filter_options, - struct strbuf *subspec, + const char *subspec, struct strbuf *errbuf) { size_t new_index = filter_options->sub_nr; @@ -155,7 +155,7 @@ static int parse_combine_subfilter( filter_options->sub_alloc); list_objects_filter_init(&filter_options->sub[new_index]); - decoded = url_percent_decode(subspec->buf); + decoded = url_percent_decode(subspec); result = has_reserved_character(subspec, errbuf); if (result) @@ -182,34 +182,34 @@ static int parse_combine_filter( const char *arg, struct strbuf *errbuf) { - struct strbuf **subspecs = strbuf_split_str(arg, '+', 0); - size_t sub; + const char *p = arg; int result = 0; - if (!subspecs[0]) { + if (!*p) { strbuf_addstr(errbuf, _("expected something after combine:")); result = 1; goto cleanup; } - for (sub = 0; subspecs[sub] && !result; sub++) { - if (subspecs[sub + 1]) { - /* - * This is not the last subspec. Remove trailing "+" so - * we can parse it. - */ - size_t last = subspecs[sub]->len - 1; - assert(subspecs[sub]->buf[last] == '+'); - strbuf_remove(subspecs[sub], last, 1); - } - result = parse_combine_subfilter( - filter_options, subspecs[sub], errbuf); + while (*p && !result) { + const char *sep = strchr(p, '+'); + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p); + char *sub = xmemdupz(p, len); + + /* strip '+' separator, but only when more sub-specs follow */ + if (sep && *(sep + 1)) + sub[len - 1] = '\0'; + + result = parse_combine_subfilter(filter_options, sub, errbuf); + free(sub); + if (!sep) + break; + p = sep + 1; } filter_options->choice = LOFC_COMBINE; cleanup: - strbuf_list_free(subspecs); if (result) list_objects_filter_release(filter_options); return result; -- 2.52.0.230.gd8af7cadaa ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() 2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi @ 2026-03-09 15:38 ` Junio C Hamano 2026-03-09 19:01 ` Jeff King 2026-03-09 19:08 ` Jeff King 1 sibling, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2026-03-09 15:38 UTC (permalink / raw) To: Deveshi Dwivedi; +Cc: git, peff Deveshi Dwivedi <deveshigurgaon@gmail.com> writes: > parse_combine_filter() splits a combine: filter spec at '+' using > strbuf_split_str(), which yields an array of strbufs with the > delimiter left at the end of each non-final piece. The code then > mutates each non-final piece to strip the trailing '+' before parsing. > > Allocating an array of strbufs is unnecessary. The function processes > one sub-spec at a time and does not use strbuf editing on the pieces. > The two helpers it calls, has_reserved_character() and > parse_combine_subfilter(), only read the string content of the strbuf > they receive. > > Walk the input string directly with strchr() to find each '+'. Copy > each sub-spec into a temporary buffer and strip the '+' only when > another sub-spec follows. Change the helpers to take const char * > instead of struct strbuf *. Makes sense. Instead of finding '+' and making many small copies piecemeal, you could make a single copy of "const char *arg" once, walk that string using strchr() looking for the next '+', and replace '+' with '\0' before processing the current piece and iterate, which may reduce the need for many small allocations and deallocations, but I do not know if it is worth it. Benchmarking it would not yield measurable difference, I suspect. > + while (*p && !result) { > + const char *sep = strchr(p, '+'); > + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p); > + char *sub = xmemdupz(p, len); > + > + /* strip '+' separator, but only when more sub-specs follow */ > + if (sep && *(sep + 1)) > + sub[len - 1] = '\0'; > + > + result = parse_combine_subfilter(filter_options, sub, errbuf); > + free(sub); > + if (!sep) > + break; > + p = sep + 1; > } Hmph, would this loop handle a trailing '+' the same way as before, e.g., "combine:tree:2+"? The original would have split the string into ["tree:2+", ""] and the last call to parse_combine_subfilter() would have been made with an empty string. The new code does not make that last call with an empty string. Perhaps the differences do not matter? I dunno. Other than that, nice to see one fewer use of "splitting into an array of strbuf" pattern. Thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() 2026-03-09 15:38 ` Junio C Hamano @ 2026-03-09 19:01 ` Jeff King 0 siblings, 0 replies; 8+ messages in thread From: Jeff King @ 2026-03-09 19:01 UTC (permalink / raw) To: Junio C Hamano; +Cc: Deveshi Dwivedi, git On Mon, Mar 09, 2026 at 08:38:16AM -0700, Junio C Hamano wrote: > > + while (*p && !result) { > > + const char *sep = strchr(p, '+'); > > + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p); > > + char *sub = xmemdupz(p, len); > > + > > + /* strip '+' separator, but only when more sub-specs follow */ > > + if (sep && *(sep + 1)) > > + sub[len - 1] = '\0'; > > + > > + result = parse_combine_subfilter(filter_options, sub, errbuf); > > + free(sub); > > + if (!sep) > > + break; > > + p = sep + 1; > > } > > Hmph, would this loop handle a trailing '+' the same way as before, > e.g., "combine:tree:2+"? The original would have split the string > into ["tree:2+", ""] and the last call to parse_combine_subfilter() > would have been made with an empty string. The new code does not > make that last call with an empty string. Perhaps the differences > do not matter? I dunno. I think the original was wrong in its parsing. The first entry should be "tree:2", without the trailing "+". And that would almost always result in rejecting the string, because the "+" doesn't make any sense for most filters. The exception is: blob=$(echo foo | git hash-object -w --stdin) git tag foo+ $blob git rev-list --filter=combine:sparse:oid=foo+ which happens to work. But it is wrong according to the documentation, which says that "+" needs to be escaped if you want it passed along to the sub-filter. So flagging a trailing "+" as an error would probably be OK, and match what happens now. But quietly ignoring it is perhaps friendlier (or less friendly, if you think it might let a typo'd input go unnoticed). -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() 2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi 2026-03-09 15:38 ` Junio C Hamano @ 2026-03-09 19:08 ` Jeff King 1 sibling, 0 replies; 8+ messages in thread From: Jeff King @ 2026-03-09 19:08 UTC (permalink / raw) To: Deveshi Dwivedi; +Cc: git, gitster On Sun, Mar 08, 2026 at 06:03:59PM +0000, Deveshi Dwivedi wrote: > + while (*p && !result) { > + const char *sep = strchr(p, '+'); > + size_t len = sep ? (size_t)(sep - p + 1) : strlen(p); > + char *sub = xmemdupz(p, len); The cast to size_t made me look twice to see if something tricky was going on. We don't usually bother explicitly casting from a ptrdiff_t into a size_t. However, might this all be simpler with strchrnul? Something like: const char *end = strchrnul(p, '+'); char *sub = xmemdupz(p, end - p); ...parse sub... if (!*end) break; /* found NUL at end of string */ p = end + 1; Notice I cut off the "+" when we find it, because I think... > + /* strip '+' separator, but only when more sub-specs follow */ > + if (sep && *(sep + 1)) > + sub[len - 1] = '\0'; ...this is wrong. I know you are matching what the current code does, but it does not match the documentation, and does not actually make any sense in practice. Other than that, this looks nice, and I am happy to see more strbuf_split() calls going away. I think you could in theory drop the xmemdupz() here, too, and feed the ptr/len combo into parse_combine_subfilter(), which then percent-decodes into a newly allocated buffer. But it is probably not worth trying to squeeze out one extra allocation here. It is not like people have huge lists of combined filters; we'd expect to see a couple at most. -Peff ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2026-03-09 19:26 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-08 18:03 [PATCH v1 0/2] avoid unnecessary strbuf_split*() and strbuf-by-value usage Deveshi Dwivedi 2026-03-08 18:03 ` [PATCH v1 1/2] worktree: do not pass strbuf by value Deveshi Dwivedi 2026-03-09 14:48 ` Junio C Hamano 2026-03-09 19:26 ` coccinelle to catch pass-by-value?, was: " Jeff King 2026-03-08 18:03 ` [PATCH v1 2/2] list-objects-filter-options: avoid strbuf_split_str() Deveshi Dwivedi 2026-03-09 15:38 ` Junio C Hamano 2026-03-09 19:01 ` Jeff King 2026-03-09 19:08 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox