From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH v4 0/7] string_list_split*() updates
Date: Sat,  2 Aug 2025 23:52:16 -0700
Message-ID: <20250803065223.3325111-1-gitster@pobox.com>
In-Reply-To: <20250801220423.1230969-1-gitster@pobox.com>

Two related string-list API functions, string_list_split() and
string_list_split_in_place(), more or less duplicate each other's
implementation.  Both take a single string, split it at the
delimiter, and stuff the resulting pieces into a string list.

However, there is one subtle and unnecessary difference.  The non
"in-place" variant only allows a single byte value as delimiter,
while the "in-place" variant can take multiple delimiters (e.g.,
"split at either a comma or a space").

This series first updates string_list_split() to allow multiple
delimiters like string_list_split_in_place() does, by unifying their
implementations into one.  This refactoring makes it easier to give
new features to both functions.
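
To illustrate what "multiple delimiters" means here, the following is
a minimal standalone sketch, not the code from this series.  The
names copy_piece() and split_any() are hypothetical, and the sketch
assumes that each byte in the delimiter string is treated as a
possible split point, which is how strpbrk() behaves.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate the first n bytes of s as a C string. */
static char *copy_piece(const char *s, size_t n)
{
	char *p = malloc(n + 1);

	if (!p)
		abort();
	memcpy(p, s, n);
	p[n] = '\0';
	return p;
}

/*
 * Hypothetical sketch: split s at any byte found in delim and store
 * newly allocated copies of the pieces in out[], which has room for
 * cap entries.  When only one slot is left, the rest of the input
 * goes into it unsplit.  Returns the number of pieces stored; the
 * caller is expected to free() each of them.
 */
static size_t split_any(const char *s, const char *delim,
			char **out, size_t cap)
{
	size_t n = 0;

	while (n < cap) {
		const char *end = strpbrk(s, delim);

		if (!end || n + 1 == cap) {
			out[n++] = copy_piece(s, strlen(s));
			break;
		}
		out[n++] = copy_piece(s, (size_t)(end - s));
		s = end + 1;
	}
	return n;
}
```

With a delimiter string of ", ", an input such as "a,b c" would be
cut at both the comma and the space.  Note that adjacent delimiters
still produce an empty piece between them in this sketch.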

Then these functions learn to optionally

 - trim the split string pieces before placing them in the resulting
   string list.

 - omit empty string pieces from the resulting string list.

An existing caller of string_list_split() in diff.c trims the
elements in the resulting string list before it uses them; that
caller is simplified by taking advantage of the new trimming feature.

A handful of code paths call string_list_split*(), immediately
followed by string_list_remove_empty_items().  They are simplified
by not placing empty items in the list in the first place.
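
The two optional behaviours can be pictured with a minimal standalone
sketch.  This is not the code from this series: the SKETCH_* flag
names and the add_piece() and split_with_flags() functions are
hypothetical stand-ins, and the sketch assumes that trimming is
applied before a piece is checked for emptiness.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the flag bits described above. */
enum {
	SKETCH_SPLIT_TRIM = (1 << 0),     /* strip whitespace around a piece */
	SKETCH_SPLIT_NONEMPTY = (1 << 1), /* do not store empty pieces */
};

/*
 * Store a copy of the piece [s, s + len) in out[*n], honoring flags.
 * Returns 1 if the piece was stored, 0 if it was omitted.
 */
static int add_piece(const char *s, size_t len, unsigned flags,
		     char **out, size_t *n)
{
	char *p;

	if (flags & SKETCH_SPLIT_TRIM) {
		while (len && isspace((unsigned char)s[len - 1]))
			len--;
		while (len && isspace((unsigned char)*s)) {
			s++;
			len--;
		}
	}
	if (!len && (flags & SKETCH_SPLIT_NONEMPTY))
		return 0;

	p = malloc(len + 1);
	if (!p)
		abort();
	memcpy(p, s, len);
	p[len] = '\0';
	out[(*n)++] = p;
	return 1;
}

/*
 * Split s at any byte in delim, storing at most cap pieces in out[].
 * Input left over once out[] is full is silently dropped in this
 * sketch.  Returns the number of pieces stored.
 */
static size_t split_with_flags(const char *s, const char *delim,
			       unsigned flags, char **out, size_t cap)
{
	size_t n = 0;

	while (n < cap) {
		const char *end = strpbrk(s, delim);

		if (!end) {
			add_piece(s, strlen(s), flags, out, &n);
			break;
		}
		add_piece(s, (size_t)(end - s), flags, out, &n);
		s = end + 1;
	}
	return n;
}
```

Under these assumptions, splitting " a , , b " at "," with both flags
set leaves only "a" and "b" in the result, so a caller no longer
needs a separate pass to trim the pieces or to remove empty items.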


Relative to the v3 iteration, the v4 iteration explains the history
behind string_list_split_in_place() in a bit more detail, and
expands the in-code comment to clarify what the verb "trim" means in
the context of STRING_LIST_SPLIT_TRIM.

Junio C Hamano (7):
  string-list: report programming error with BUG
  string-list: align string_list_split() with its _in_place()
    counterpart
  string-list: unify string_list_split* functions
  string-list: optionally trim string pieces split by
    string_list_split*()
  diff: simplify parsing of diff.colormovedws
  string-list: optionally omit empty string pieces in
    string_list_split*()
  string-list: split-then-remove-empty can be done while splitting

 builtin/blame.c              |   2 +-
 builtin/merge.c              |   2 +-
 builtin/var.c                |   2 +-
 connect.c                    |   2 +-
 diff.c                       |  20 ++----
 fetch-pack.c                 |   2 +-
 notes.c                      |   6 +-
 parse-options.c              |   2 +-
 pathspec.c                   |   3 +-
 protocol.c                   |   2 +-
 ref-filter.c                 |   4 +-
 setup.c                      |   3 +-
 string-list.c                | 120 ++++++++++++++++++++++++-----------
 string-list.h                |  33 +++++++---
 t/helper/test-hashmap.c      |   4 +-
 t/helper/test-json-writer.c  |   4 +-
 t/helper/test-path-utils.c   |   3 +-
 t/helper/test-ref-store.c    |   2 +-
 t/unit-tests/u-string-list.c |  95 ++++++++++++++++++++++++---
 transport.c                  |   2 +-
 upload-pack.c                |   2 +-
 21 files changed, 225 insertions(+), 90 deletions(-)

Range-diff against v3:
1:  442ed679bb = 1:  4f9c8d8963 string-list: report programming error with BUG
2:  cc80bac8c2 ! 2:  9f6dfe43c8 string-list: align string_list_split() with its _in_place() counterpart
    @@ Metadata
      ## Commit message ##
         string-list: align string_list_split() with its _in_place() counterpart
     
    -    For some unknown reason, unlike string_list_split_in_place(),
    -    string_list_split() took only a single character as a field
    -    delimiter.  Before giving both functions more features in future
    -    commits, allow string_list_split() to take more than one delimiter
    -    characters to make them closer to each other.
    +    The string_list_split_in_place() function was updated by 52acddf3
    +    (string-list: multi-delimiter `string_list_split_in_place()`,
    +    2023-04-24) to take more than one delimiter characters, hoping that
    +    we can later use it to replace our uses of strtok().  We however did
    +    not make a matching change to the string_list_split() function,
    +    which is very similar.
    +
    +    Before giving both functions more features in future commits, allow
    +    string_list_split() to also take more than one delimiter characters
    +    to make them closer to each other.
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
3:  c7922b3e14 = 3:  527535fcdd string-list: unify string_list_split* functions
4:  9d7d22e8ef ! 4:  5764549741 string-list: optionally trim string pieces split by string_list_split*()
    @@ string-list.h: int string_list_split(struct string_list *list, const char *strin
      int string_list_split_in_place(struct string_list *list, char *string,
      			       const char *delim, int maxsplit);
     +
    -+/* flag bits for split_f and split_in_place_f functions */
    ++/* Flag bits for split_f and split_in_place_f functions */
     +enum {
    -+	/* trim() resulting string piece before adding it to the list */
    ++	/*
    ++	 * trim whitespaces around resulting string piece before adding
    ++	 * it to the list
    ++	 */
     +	STRING_LIST_SPLIT_TRIM = (1 << 0),
     +};
     +
5:  ad8b425bc5 = 5:  f3a303aef0 diff: simplify parsing of diff.colormovedws
6:  d03f443878 ! 6:  27531efa41 string-list: optionally omit empty string pieces in string_list_split*()
    @@ string-list.c: static int append_one(struct string_list *list,
      		string_list_append(list, p);
     
      ## string-list.h ##
    -@@ string-list.h: int string_list_split_in_place(struct string_list *list, char *string,
    - enum {
    - 	/* trim() resulting string piece before adding it to the list */
    +@@ string-list.h: enum {
    + 	 * it to the list
    + 	 */
      	STRING_LIST_SPLIT_TRIM = (1 << 0),
     +	/* omit adding empty string piece to the resulting list */
     +	STRING_LIST_SPLIT_NONEMPTY = (1 << 1),
7:  9eb8d87d62 = 7:  2ab2aac73d string-list: split-then-remove-empty can be done while splitting
-- 
2.50.1-633-g69dfdd50af

