From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH v2 00/11] do not overuse strbuf_split*()
Date: Thu, 31 Jul 2025 15:54:22 -0700 [thread overview]
Message-ID: <20250731225433.4028872-1-gitster@pobox.com> (raw)
In-Reply-To: <20250731074154.2835370-1-gitster@pobox.com>
strbuf is a very good data structure to work with string data
without having to worry about running past the end of the string.
But an array of strbuf is often a wrong data structure. You rarely
have need to be able to edit multiple strings represented by such an
array simultaneously. And strbuf_split*() that produces result in
such a shape is a misdesigned API function.
The most common use case of strbuf_split*() family of functions
seems to be to trim away the whitespaces around each piece of split
string. With modern string_list_split*(), it is often no longer
necessary.
This series builds on top of the other series that extends string
list API to allow string_list_split() to take more than one delimiter
bytes, and to optionally trim the resulting string pieces.
I do not plan to eradicate all the uses of strbuf_split*() myself,
not because I found some valid use cases in the existing code (I
haven't yet), but these patches would give interested others enough
material to study and mimic to continue the effort and I can safely
leave it as #leftoverbits to rewrite them.
Junio C Hamano (11):
wt-status: avoid strbuf_split*()
clean: do not pass strbuf by value
clean: do not use strbuf_split*() [part 1]
clean: do not use strbuf_split*() [part 2]
merge-tree: do not use strbuf_split*()
notes: do not use strbuf_split*()
config: do not use strbuf_split()
environment: do not use strbuf_split*()
sub-process: do not use strbuf_split*()
trace2: trim_trailing_newline followed by trim is a no-op
trace2: do not use strbuf_split*()
builtin/clean.c | 74 ++++++++++++++++++++--------------------
builtin/merge-tree.c | 30 +++++++++--------
builtin/notes.c | 23 +++++++------
config.c | 23 ++++++-------
environment.c | 19 +++++++----
sub-process.c | 15 ++++-----
trace2/tr2_cfg.c | 80 +++++++++++++++-----------------------------
wt-status.c | 31 ++++++-----------
8 files changed, 129 insertions(+), 166 deletions(-)
1: 79164fa6de = 1: e254c0b462 wt-status: avoid strbuf_split*()
2: 85006a11ff = 2: 07306ec99d clean: do not pass strbuf by value
3: 633dd871f7 = 3: 1dc849eb0b clean: do not use strbuf_split*() [part 1]
4: 60dd58af7e = 4: 69b885a579 clean: do not use strbuf_split*() [part 2]
5: 49a6606d76 = 5: ac90eb1b57 merge-tree: do not use strbuf_split*()
6: 3813cc0e4d ! 6: 68dd2be7ae notes: do not use strbuf_split*()
@@ Metadata
## Commit message ##
notes: do not use strbuf_split*()
- When reading the copy instruction from the standard input, the
- program reads a line, splits it into tokens at whitespace, and trims
- each of the tokens before using. We no longer need to use strbuf
- just to be able to trimming, as string_list_split*() family now can
- trim while splitting a string.
+ When reading copy instructions from the standard input, the program
+ reads a line, splits it into tokens at whitespace, and trims each of
+ the tokens before using. We no longer need to use strbuf just to be
+ able to trim, as string_list_split*() family now can trim while
+ splitting a string.
- Retire the use of strbuf_split().
+ Retire the use of strbuf_split() from this code path.
Note that this loop is a bit sloppy in that it ensures at least
there are two tokens on each line, but ignores if there are extra
7: 2bd08bde36 ! 7: 1f8c86ad0a config: do not use strbuf_split()
@@ Commit message
config: do not use strbuf_split()
When parsing an old-style GIT_CONFIG_PARAMETERS environment
- variable, the code parses the key=value pair by spliting them at '='
- into an array of strbuf's. As strbuf_split() leafes the delimiter
+ variable, the code parses key=value pairs by splitting them at '='
+ into an array of strbuf's. As strbuf_split() leaves the delimiter
at the end of the split piece, the code has to manually trim it.
If we split with string_list_split(), that becomes unnecessary.
- Retire the use of strbuf_split().
+ Retire the use of strbuf_split() from this code path.
Note that the max parameter of string_list_split() is of
an ergonomically iffy design---it specifies the maximum number of
8: ed35491fb1 = 8: 26c782fd7e environment: do not use strbuf_split*()
9: 5b8fe54684 ! 9: 95316b62fe sub-process: do not use strbuf_split*()
@@ Commit message
Signed-off-by: Junio C Hamano <gitster@pobox.com>
## sub-process.c ##
-@@
- #include "sub-process.h"
- #include "sigchain.h"
- #include "pkt-line.h"
-+#include "string-list.h"
-
- int cmd2process_cmp(const void *cmp_data UNUSED,
- const struct hashmap_entry *eptr,
@@ sub-process.c: struct subprocess_entry *subprocess_find_entry(struct hashmap *hashmap, const ch
int subprocess_read_status(int fd, struct strbuf *status)
-: ---------- > 10: 09e83741d2 trace2: trim_trailing_newline followed by trim is a no-op
-: ---------- > 11: be9c9cb420 trace2: do not use strbuf_split*()
next prev parent reply other threads:[~2025-07-31 22:54 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-31 7:41 [PATCH 0/9] do not overuse strbuf_split*() Junio C Hamano
2025-07-31 7:41 ` [PATCH 1/9] wt-status: avoid strbuf_split*() Junio C Hamano
2025-07-31 7:41 ` [PATCH 2/9] clean: do not pass strbuf by value Junio C Hamano
2025-07-31 7:41 ` [PATCH 3/9] clean: do not use strbuf_split*() [part 1] Junio C Hamano
2025-07-31 7:41 ` [PATCH 4/9] clean: do not use strbuf_split*() [part 2] Junio C Hamano
2025-07-31 7:41 ` [PATCH 5/9] merge-tree: do not use strbuf_split*() Junio C Hamano
2025-07-31 7:41 ` [PATCH 6/9] notes: " Junio C Hamano
2025-07-31 20:14 ` Eric Sunshine
2025-07-31 7:41 ` [PATCH 7/9] config: do not use strbuf_split() Junio C Hamano
2025-07-31 20:15 ` Eric Sunshine
2025-07-31 7:41 ` [PATCH 8/9] environment: do not use strbuf_split*() Junio C Hamano
2025-07-31 7:41 ` [PATCH 9/9] sub-process: " Junio C Hamano
2025-07-31 8:50 ` Christian Couder
2025-07-31 14:30 ` Junio C Hamano
2025-07-31 22:54 ` Junio C Hamano [this message]
2025-07-31 22:54 ` [PATCH v2 01/11] wt-status: avoid strbuf_split*() Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 02/11] clean: do not pass strbuf by value Junio C Hamano
2025-08-02 8:38 ` Jeff King
2025-08-02 16:44 ` Junio C Hamano
2025-08-02 18:40 ` Jeff King
2025-07-31 22:54 ` [PATCH v2 03/11] clean: do not use strbuf_split*() [part 1] Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 04/11] clean: do not use strbuf_split*() [part 2] Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 05/11] merge-tree: do not use strbuf_split*() Junio C Hamano
2025-08-02 8:55 ` Jeff King
2025-07-31 22:54 ` [PATCH v2 06/11] notes: " Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 07/11] config: do not use strbuf_split() Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 08/11] environment: do not use strbuf_split*() Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 09/11] sub-process: " Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 10/11] trace2: trim_trailing_newline followed by trim is a no-op Junio C Hamano
2025-07-31 22:54 ` [PATCH v2 11/11] trace2: do not use strbuf_split*() Junio C Hamano
2025-08-02 9:08 ` [PATCH v2 00/11] do not overuse strbuf_split*() Jeff King
2025-08-02 17:09 ` Junio C Hamano
2025-08-02 18:47 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250731225433.4028872-1-gitster@pobox.com \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.