All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH v2 00/11] do not overuse strbuf_split*()
Date: Thu, 31 Jul 2025 15:54:22 -0700	[thread overview]
Message-ID: <20250731225433.4028872-1-gitster@pobox.com> (raw)
In-Reply-To: <20250731074154.2835370-1-gitster@pobox.com>

strbuf is a very good data structure to work with string data
without having to worry about running past the end of the string.

But an array of strbuf is often a wrong data structure.  You rarely
have need to be able to edit multiple strings represented by such an
array simultaneously.  And strbuf_split*() that produces result in
such a shape is a misdesigned API function.

The most common use case of strbuf_split*() family of functions
seems to be to trim away the whitespaces around each piece of split
string.  With modern string_list_split*(), it is often no longer
necessary.

This series builds on top of the other series that extends string
list API to allow string_list_split() to take more than one delimiter
bytes, and to optionally trim the resulting string pieces.

I do not plan to eradicate all the uses of strbuf_split*() myself,
not because I found some valid use cases in the existing code (I
haven't yet), but these patches would give interested others enough
material to study and mimic to continue the effort and I can safely
leave it as #leftoverbits to rewrite them.

Junio C Hamano (11):
  wt-status: avoid strbuf_split*()
  clean: do not pass strbuf by value
  clean: do not use strbuf_split*() [part 1]
  clean: do not use strbuf_split*() [part 2]
  merge-tree: do not use strbuf_split*()
  notes: do not use strbuf_split*()
  config: do not use strbuf_split()
  environment: do not use strbuf_split*()
  sub-process: do not use strbuf_split*()
  trace2: trim_trailing_newline followed by trim is a no-op
  trace2: do not use strbuf_split*()

 builtin/clean.c      | 74 ++++++++++++++++++++--------------------
 builtin/merge-tree.c | 30 +++++++++--------
 builtin/notes.c      | 23 +++++++------
 config.c             | 23 ++++++-------
 environment.c        | 19 +++++++----
 sub-process.c        | 15 ++++-----
 trace2/tr2_cfg.c     | 80 +++++++++++++++-----------------------------
 wt-status.c          | 31 ++++++-----------
 8 files changed, 129 insertions(+), 166 deletions(-)

 1:  79164fa6de =  1:  e254c0b462 wt-status: avoid strbuf_split*()
 2:  85006a11ff =  2:  07306ec99d clean: do not pass strbuf by value
 3:  633dd871f7 =  3:  1dc849eb0b clean: do not use strbuf_split*() [part 1]
 4:  60dd58af7e =  4:  69b885a579 clean: do not use strbuf_split*() [part 2]
 5:  49a6606d76 =  5:  ac90eb1b57 merge-tree: do not use strbuf_split*()
 6:  3813cc0e4d !  6:  68dd2be7ae notes: do not use strbuf_split*()
    @@ Metadata
      ## Commit message ##
         notes: do not use strbuf_split*()
     
    -    When reading the copy instruction from the standard input, the
    -    program reads a line, splits it into tokens at whitespace, and trims
    -    each of the tokens before using.  We no longer need to use strbuf
    -    just to be able to trimming, as string_list_split*() family now can
    -    trim while splitting a string.
    +    When reading copy instructions from the standard input, the program
    +    reads a line, splits it into tokens at whitespace, and trims each of
    +    the tokens before using.  We no longer need to use strbuf just to be
    +    able to trim, as string_list_split*() family now can trim while
    +    splitting a string.
     
    -    Retire the use of strbuf_split().
    +    Retire the use of strbuf_split() from this code path.
     
         Note that this loop is a bit sloppy in that it ensures at least
         there are two tokens on each line, but ignores if there are extra
 7:  2bd08bde36 !  7:  1f8c86ad0a config: do not use strbuf_split()
    @@ Commit message
         config: do not use strbuf_split()
     
         When parsing an old-style GIT_CONFIG_PARAMETERS environment
    -    variable, the code parses the key=value pair by spliting them at '='
    -    into an array of strbuf's.  As strbuf_split() leafes the delimiter
    +    variable, the code parses key=value pairs by splitting them at '='
    +    into an array of strbuf's.  As strbuf_split() leaves the delimiter
         at the end of the split piece, the code has to manually trim it.
     
         If we split with string_list_split(), that becomes unnecessary.
    -    Retire the use of strbuf_split().
    +    Retire the use of strbuf_split() from this code path.
     
         Note that the max parameter of string_list_split() is of
         an ergonomically iffy design---it specifies the maximum number of
 8:  ed35491fb1 =  8:  26c782fd7e environment: do not use strbuf_split*()
 9:  5b8fe54684 !  9:  95316b62fe sub-process: do not use strbuf_split*()
    @@ Commit message
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
      ## sub-process.c ##
    -@@
    - #include "sub-process.h"
    - #include "sigchain.h"
    - #include "pkt-line.h"
    -+#include "string-list.h"
    - 
    - int cmd2process_cmp(const void *cmp_data UNUSED,
    - 		    const struct hashmap_entry *eptr,
     @@ sub-process.c: struct subprocess_entry *subprocess_find_entry(struct hashmap *hashmap, const ch
      
      int subprocess_read_status(int fd, struct strbuf *status)
 -:  ---------- > 10:  09e83741d2 trace2: trim_trailing_newline followed by trim is a no-op
 -:  ---------- > 11:  be9c9cb420 trace2: do not use strbuf_split*()

  parent reply	other threads:[~2025-07-31 22:54 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-31  7:41 [PATCH 0/9] do not overuse strbuf_split*() Junio C Hamano
2025-07-31  7:41 ` [PATCH 1/9] wt-status: avoid strbuf_split*() Junio C Hamano
2025-07-31  7:41 ` [PATCH 2/9] clean: do not pass strbuf by value Junio C Hamano
2025-07-31  7:41 ` [PATCH 3/9] clean: do not use strbuf_split*() [part 1] Junio C Hamano
2025-07-31  7:41 ` [PATCH 4/9] clean: do not use strbuf_split*() [part 2] Junio C Hamano
2025-07-31  7:41 ` [PATCH 5/9] merge-tree: do not use strbuf_split*() Junio C Hamano
2025-07-31  7:41 ` [PATCH 6/9] notes: " Junio C Hamano
2025-07-31 20:14   ` Eric Sunshine
2025-07-31  7:41 ` [PATCH 7/9] config: do not use strbuf_split() Junio C Hamano
2025-07-31 20:15   ` Eric Sunshine
2025-07-31  7:41 ` [PATCH 8/9] environment: do not use strbuf_split*() Junio C Hamano
2025-07-31  7:41 ` [PATCH 9/9] sub-process: " Junio C Hamano
2025-07-31  8:50   ` Christian Couder
2025-07-31 14:30     ` Junio C Hamano
2025-07-31 22:54 ` Junio C Hamano [this message]
2025-07-31 22:54   ` [PATCH v2 01/11] wt-status: avoid strbuf_split*() Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 02/11] clean: do not pass strbuf by value Junio C Hamano
2025-08-02  8:38     ` Jeff King
2025-08-02 16:44       ` Junio C Hamano
2025-08-02 18:40         ` Jeff King
2025-07-31 22:54   ` [PATCH v2 03/11] clean: do not use strbuf_split*() [part 1] Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 04/11] clean: do not use strbuf_split*() [part 2] Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 05/11] merge-tree: do not use strbuf_split*() Junio C Hamano
2025-08-02  8:55     ` Jeff King
2025-07-31 22:54   ` [PATCH v2 06/11] notes: " Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 07/11] config: do not use strbuf_split() Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 08/11] environment: do not use strbuf_split*() Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 09/11] sub-process: " Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 10/11] trace2: trim_trailing_newline followed by trim is a no-op Junio C Hamano
2025-07-31 22:54   ` [PATCH v2 11/11] trace2: do not use strbuf_split*() Junio C Hamano
2025-08-02  9:08   ` [PATCH v2 00/11] do not overuse strbuf_split*() Jeff King
2025-08-02 17:09     ` Junio C Hamano
2025-08-02 18:47       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250731225433.4028872-1-gitster@pobox.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.