* [PATCH 0/5] string_list_split*() updates
@ 2025-07-31  6:39 Junio C Hamano
  2025-07-31  6:39 ` [PATCH 1/5] string-list: report programming error with BUG Junio C Hamano
                   ` (5 more replies)
  0 siblings, 6 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31  6:39 UTC
  To: git

Two related string-list API functions, string_list_split() and
string_list_split_in_place(), more or less duplicate each other's
implementation.  They both take a single string, split it at the
delimiter, and stuff the resulting pieces into a string list.

However, there is one subtle and unnecessary difference.  The non
"in-place" variant only allows a single byte value as delimiter,
while the "in-place" variant can take multiple delimiters (e.g.,
"split at either a comma or a space").

This series first updates string_list_split() to allow multiple
delimiters like string_list_split_in_place() does, and then unifies
their implementations into one.  This refactoring allows us to give
new features to these two functions with a single change.
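
To illustrate what "multiple delimiters" means here, the following is a
minimal sketch, not code from this series.  count_pieces() is a
hypothetical helper that only counts how many pieces a split on any
byte in `delims` would produce, using strpbrk() the way the updated
string_list_split() does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): count the pieces that a
 * split on any byte in `delims` would produce, without allocating.
 */
static size_t count_pieces(const char *s, const char *delims)
{
	size_t count = 1;

	assert(s && delims);
	/* strpbrk() finds the first byte of s that appears in delims */
	while ((s = strpbrk(s, delims)) != NULL) {
		count++;
		s++;	/* step over the delimiter just found */
	}
	return count;
}
```

With "," as the only delimiter, "a,b c" counts as two pieces; with
", " (comma or space) it counts as three.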

Then these functions learn to optionally trim the split string
pieces before placing them in the resulting string list.

An existing caller of string_list_split() in diff.c trims the
elements in the resulting string list before it uses them, which is
simplified by taking advantage of this new feature.

Junio C Hamano (5):
  string-list: report programming error with BUG
  string-list: align string_list_split() with its _in_place()
    counterpart
  string-list: unify string_list_split* functions
  string-list: optionally trim string pieces split by
    string_list_split()
  diff: simplify parsing of diff.colormovedws

 builtin/blame.c              |   2 +-
 builtin/merge.c              |   2 +-
 builtin/var.c                |   2 +-
 connect.c                    |   2 +-
 diff.c                       |  20 +++----
 fetch-pack.c                 |   2 +-
 notes.c                      |   2 +-
 parse-options.c              |   2 +-
 pathspec.c                   |   2 +-
 protocol.c                   |   2 +-
 ref-filter.c                 |   4 +-
 setup.c                      |   3 +-
 string-list.c                | 113 +++++++++++++++++++++++------------
 string-list.h                |  26 +++++---
 t/helper/test-path-utils.c   |   3 +-
 t/helper/test-ref-store.c    |   2 +-
 t/unit-tests/u-string-list.c |  80 ++++++++++++++++++++++---
 transport.c                  |   2 +-
 upload-pack.c                |   2 +-
 19 files changed, 190 insertions(+), 83 deletions(-)

-- 
2.50.1-612-g4756c59422



* [PATCH 1/5] string-list: report programming error with BUG
  2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
@ 2025-07-31  6:39 ` Junio C Hamano
  2025-07-31 19:33   ` Eric Sunshine
  2025-07-31  6:39 ` [PATCH 2/5] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31  6:39 UTC
  To: git

Passing a string list that has .strdup_strings bit unset to
string_list_split(), or one that has .strdup_strings bit set to
string_list_split_in_place(), is a programmer error.  Do not use
die() to abort the execution.  Use BUG() instead.

As a developer-facing message, the message string itself should
be a lot more concise, but let's keep the original one for now.
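
As a rough sketch of the distinction, and not Git's actual
implementation: a programmer-error report typically names the source
location and aborts, rather than exiting with a user-facing message.
SKETCH_BUG(), struct sketch_list, precondition_ok() and
check_split_args() below are all hypothetical names made up for this
illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a programmer-error report: say where, then abort(). */
#define SKETCH_BUG(msg) do { \
	fprintf(stderr, "BUG: %s:%d: %s\n", __FILE__, __LINE__, (msg)); \
	abort(); \
} while (0)

/* Hypothetical list type that carries only the bit under discussion. */
struct sketch_list {
	unsigned strdup_strings : 1;
};

/* The copying variant needs the bit set; the in-place one needs it unset. */
static int precondition_ok(const struct sketch_list *list, int in_place)
{
	assert(list);
	return in_place ? !list->strdup_strings : list->strdup_strings;
}

static void check_split_args(const struct sketch_list *list, int in_place)
{
	if (!precondition_ok(list, in_place))
		SKETCH_BUG("strdup_strings does not match the split variant");
}
```

A caller that violates the precondition has a bug that no user input
can fix, which is why aborting with a source location fits better than
a plain fatal error.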

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/string-list.c b/string-list.c
index 53faaa8420..0cb920e9b0 100644
--- a/string-list.c
+++ b/string-list.c
@@ -283,7 +283,7 @@ int string_list_split(struct string_list *list, const char *string,
 	const char *p = string, *end;
 
 	if (!list->strdup_strings)
-		die("internal error in string_list_split(): "
+		BUG("internal error in string_list_split(): "
 		    "list->strdup_strings must be set");
 	for (;;) {
 		count++;
@@ -309,7 +309,7 @@ int string_list_split_in_place(struct string_list *list, char *string,
 	char *p = string, *end;
 
 	if (list->strdup_strings)
-		die("internal error in string_list_split_in_place(): "
+		BUG("internal error in string_list_split_in_place(): "
 		    "list->strdup_strings must not be set");
 	for (;;) {
 		count++;
-- 
2.50.1-612-g4756c59422



* [PATCH 2/5] string-list: align string_list_split() with its _in_place() counterpart
  2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
  2025-07-31  6:39 ` [PATCH 1/5] string-list: report programming error with BUG Junio C Hamano
@ 2025-07-31  6:39 ` Junio C Hamano
  2025-07-31 19:36   ` Eric Sunshine
  2025-07-31  6:39 ` [PATCH 3/5] string-list: unify string_list_split* functions Junio C Hamano
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31  6:39 UTC
  To: git

For some unknown reason, unlike string_list_split_in_place(),
string_list_split() took only a single character as a field
delimiter.  Before giving both functions more features in future
commits, allow string_list_split() to take more than one delimiter
character to make them closer to each other.
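
Callers that used to pass a character constant now need a
NUL-terminated string.  A minimal sketch of that conversion follows;
SKETCH_PATH_SEP and first_sep() are hypothetical names used only for
this illustration, with ':' assumed as the separator.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PATH_SEP ':'	/* hypothetical stand-in for a platform separator */

/* Return a pointer to the first separator in `s`, or NULL if none. */
static const char *first_sep(const char *s)
{
	/* a one-byte delimiter set built from a character constant */
	static const char sep[] = { SKETCH_PATH_SEP, '\0' };

	assert(s);
	return strpbrk(s, sep);
}
```

A string literal such as ":" works directly for callers with a fixed
delimiter; the array initializer is only needed when the delimiter
comes from a macro that expands to a character constant.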

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/blame.c              |  2 +-
 builtin/merge.c              |  2 +-
 builtin/var.c                |  2 +-
 connect.c                    |  2 +-
 diff.c                       |  2 +-
 fetch-pack.c                 |  2 +-
 notes.c                      |  2 +-
 parse-options.c              |  2 +-
 pathspec.c                   |  2 +-
 protocol.c                   |  2 +-
 ref-filter.c                 |  4 ++--
 setup.c                      |  3 ++-
 string-list.c                |  4 ++--
 string-list.h                | 16 ++++++++--------
 t/helper/test-path-utils.c   |  3 ++-
 t/helper/test-ref-store.c    |  2 +-
 t/unit-tests/u-string-list.c | 16 ++++++++--------
 transport.c                  |  2 +-
 upload-pack.c                |  2 +-
 19 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 91586e6852..70a6460401 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -420,7 +420,7 @@ static void parse_color_fields(const char *s)
 	colorfield_nr = 0;
 
 	/* Ideally this would be stripped and split at the same time? */
-	string_list_split(&l, s, ',', -1);
+	string_list_split(&l, s, ",", -1);
 	ALLOC_GROW(colorfield, colorfield_nr + 1, colorfield_alloc);
 
 	for_each_string_list_item(item, &l) {
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..893f8950bf 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -875,7 +875,7 @@ static void add_strategies(const char *string, unsigned attr)
 	if (string) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		struct string_list_item *item;
-		string_list_split(&list, string, ' ', -1);
+		string_list_split(&list, string, " ", -1);
 		for_each_string_list_item(item, &list)
 			append_strategy(get_strategy(item->string));
 		string_list_clear(&list, 0);
diff --git a/builtin/var.c b/builtin/var.c
index ada642a9fe..4ae7af0eff 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -181,7 +181,7 @@ static void list_vars(void)
 			if (ptr->multivalued && *val) {
 				struct string_list list = STRING_LIST_INIT_DUP;
 
-				string_list_split(&list, val, '\n', -1);
+				string_list_split(&list, val, "\n", -1);
 				for (size_t i = 0; i < list.nr; i++)
 					printf("%s=%s\n", ptr->name, list.items[i].string);
 				string_list_clear(&list, 0);
diff --git a/connect.c b/connect.c
index e77287f426..867b12bde5 100644
--- a/connect.c
+++ b/connect.c
@@ -407,7 +407,7 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
 	 * name.  Subsequent fields (symref-target and peeled) are optional and
 	 * don't have a particular order.
 	 */
-	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+	if (string_list_split(&line_sections, line, " ", -1) < 2) {
 		ret = 0;
 		goto out;
 	}
diff --git a/diff.c b/diff.c
index dca87e164f..a81949a422 100644
--- a/diff.c
+++ b/diff.c
@@ -327,7 +327,7 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ',', -1);
+	string_list_split(&l, arg, ",", -1);
 
 	for_each_string_list_item(i, &l) {
 		struct strbuf sb = STRBUF_INIT;
diff --git a/fetch-pack.c b/fetch-pack.c
index c1be9b76eb..9866270696 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1914,7 +1914,7 @@ static void fetch_pack_config(void)
 		char *str;
 
 		if (!git_config_get_string("fetch.uriprotocols", &str) && str) {
-			string_list_split(&uri_protocols, str, ',', -1);
+			string_list_split(&uri_protocols, str, ",", -1);
 			free(str);
 		}
 	}
diff --git a/notes.c b/notes.c
index 97b995f3f2..6afcf088b9 100644
--- a/notes.c
+++ b/notes.c
@@ -892,7 +892,7 @@ static int string_list_add_note_lines(struct string_list *list,
 	 * later, along with any empty strings that came from empty
 	 * lines within the file.
 	 */
-	string_list_split(list, data, '\n', -1);
+	string_list_split(list, data, "\n", -1);
 	free(data);
 	return 0;
 }
diff --git a/parse-options.c b/parse-options.c
index 5224203ffe..9e7cb75192 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1338,7 +1338,7 @@ static enum parse_opt_result usage_with_options_internal(struct parse_opt_ctx_t
 		if (!saw_empty_line && !*str)
 			saw_empty_line = 1;
 
-		string_list_split(&list, str, '\n', -1);
+		string_list_split(&list, str, "\n", -1);
 		for (j = 0; j < list.nr; j++) {
 			const char *line = list.items[j].string;
 
diff --git a/pathspec.c b/pathspec.c
index a3ddd701c7..de325f7ef9 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,7 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, ' ', -1);
+	string_list_split(&list, value, " ", -1);
 	string_list_remove_empty_items(&list, 0);
 
 	item->attr_check = attr_check_alloc();
diff --git a/protocol.c b/protocol.c
index bae7226ff4..54b9f49c01 100644
--- a/protocol.c
+++ b/protocol.c
@@ -61,7 +61,7 @@ enum protocol_version determine_protocol_version_server(void)
 	if (git_protocol) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		const struct string_list_item *item;
-		string_list_split(&list, git_protocol, ':', -1);
+		string_list_split(&list, git_protocol, ":", -1);
 
 		for_each_string_list_item(item, &list) {
 			const char *value;
diff --git a/ref-filter.c b/ref-filter.c
index f9f2c512a8..4edfb9c83b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -435,7 +435,7 @@ static int remote_ref_atom_parser(struct ref_format *format UNUSED,
 	}
 
 	atom->u.remote_ref.nobracket = 0;
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
@@ -831,7 +831,7 @@ static int align_atom_parser(struct ref_format *format UNUSED,
 
 	align->position = ALIGN_LEFT;
 
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
 		int position;
diff --git a/setup.c b/setup.c
index 6f52dab64c..b9f5eb8b51 100644
--- a/setup.c
+++ b/setup.c
@@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 
 	if (env_ceiling_dirs) {
 		int empty_entry_found = 0;
+		static const char path_sep[] = { PATH_SEP, '\0' };
 
-		string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, env_ceiling_dirs, path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   canonicalize_ceiling_entry, &empty_entry_found);
 		ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);
diff --git a/string-list.c b/string-list.c
index 0cb920e9b0..2284a009cb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -277,7 +277,7 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 }
 
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit)
+		      const char *delim, int maxsplit)
 {
 	int count = 0;
 	const char *p = string, *end;
@@ -291,7 +291,7 @@ int string_list_split(struct string_list *list, const char *string,
 			string_list_append(list, p);
 			return count;
 		}
-		end = strchr(p, delim);
+		end = strpbrk(p, delim);
 		if (end) {
 			string_list_append_nodup(list, xmemdupz(p, end - p));
 			p = end + 1;
diff --git a/string-list.h b/string-list.h
index 122b318641..6c8650efde 100644
--- a/string-list.h
+++ b/string-list.h
@@ -254,7 +254,7 @@ struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
 
 /**
- * Split string into substrings on character `delim` and append the
+ * Split string into substrings on characters in `delim` and append the
  * substrings to `list`.  The input string is not modified.
  * list->strdup_strings must be set, as new memory needs to be
  * allocated to hold the substrings.  If maxsplit is non-negative,
@@ -262,15 +262,15 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  * appended to list.
  *
  * Examples:
- *   string_list_split(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 0) -> ["foo:bar:baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
- *   string_list_split(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
- *   string_list_split(l, "", ':', -1) -> [""]
- *   string_list_split(l, ":", ':', -1) -> ["", ""]
+ *   string_list_split(l, "foo:bar:baz", ":", -1) -> ["foo", "bar", "baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 0) -> ["foo:bar:baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 1) -> ["foo", "bar:baz"]
+ *   string_list_split(l, "foo:bar:", ":", -1) -> ["foo", "bar", ""]
+ *   string_list_split(l, "", ":", -1) -> [""]
+ *   string_list_split(l, ":", ":", -1) -> ["", ""]
  */
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit);
+		      const char *delim, int maxsplit);
 
 /*
  * Like string_list_split(), except that string is split in-place: the
diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
index 086238c826..f5f33751da 100644
--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -348,6 +348,7 @@ int cmd__path_utils(int argc, const char **argv)
 	if (argc == 4 && !strcmp(argv[1], "longest_ancestor_length")) {
 		int len;
 		struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
+		const char path_sep[] = { PATH_SEP, '\0' };
 		char *path = xstrdup(argv[2]);
 
 		/*
@@ -362,7 +363,7 @@ int cmd__path_utils(int argc, const char **argv)
 		 */
 		if (normalize_path_copy(path, path))
 			die("Path \"%s\" could not be normalized", argv[2]);
-		string_list_split(&ceiling_dirs, argv[3], PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, argv[3], path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   normalize_ceiling_entry, NULL);
 		len = longest_ancestor_length(path, &ceiling_dirs);
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 8d9a271845..aa1cb9b4ac 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -29,7 +29,7 @@ static unsigned int parse_flags(const char *str, struct flag_definition *defs)
 	if (!strcmp(str, "0"))
 		return 0;
 
-	string_list_split(&masks, str, ',', 64);
+	string_list_split(&masks, str, ",", 64);
 	for (size_t i = 0; i < masks.nr; i++) {
 		const char *name = masks.items[i].string;
 		struct flag_definition *def = defs;
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index d4ba5f9fa5..150a5f505f 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -43,7 +43,7 @@ static void t_string_list_equal(struct string_list *list,
 				  expected_strings->items[i].string);
 }
 
-static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
+static void t_string_list_split(const char *data, const char *delim, int maxsplit, ...)
 {
 	struct string_list expected_strings = STRING_LIST_INIT_DUP;
 	struct string_list list = STRING_LIST_INIT_DUP;
@@ -65,13 +65,13 @@ static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
 
 void test_string_list__split(void)
 {
-	t_string_list_split("foo:bar:baz", ':', -1, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 0, "foo:bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 1, "foo", "bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 2, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:", ':', -1, "foo", "bar", "", NULL);
-	t_string_list_split("", ':', -1, "", NULL);
-	t_string_list_split(":", ':', -1, "", "", NULL);
+	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 0, "foo:bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 1, "foo", "bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 2, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:", ":", -1, "foo", "bar", "", NULL);
+	t_string_list_split("", ":", -1, "", NULL);
+	t_string_list_split(":", ":", -1, "", "", NULL);
 }
 
 static void t_string_list_split_in_place(const char *data, const char *delim,
diff --git a/transport.c b/transport.c
index c123ac1e38..76487b5453 100644
--- a/transport.c
+++ b/transport.c
@@ -1042,7 +1042,7 @@ static const struct string_list *protocol_allow_list(void)
 	if (enabled < 0) {
 		const char *v = getenv("GIT_ALLOW_PROTOCOL");
 		if (v) {
-			string_list_split(&allowed, v, ':', -1);
+			string_list_split(&allowed, v, ":", -1);
 			string_list_sort(&allowed);
 			enabled = 1;
 		} else {
diff --git a/upload-pack.c b/upload-pack.c
index 4f26f6afc7..91fcdcad9b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1685,7 +1685,7 @@ static void process_args(struct packet_reader *request,
 			if (data->uri_protocols.nr)
 				send_err_and_die(data,
 						 "multiple packfile-uris lines forbidden");
-			string_list_split(&data->uri_protocols, p, ',', -1);
+			string_list_split(&data->uri_protocols, p, ",", -1);
 			continue;
 		}
 
-- 
2.50.1-612-g4756c59422



* [PATCH 3/5] string-list: unify string_list_split* functions
  2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
  2025-07-31  6:39 ` [PATCH 1/5] string-list: report programming error with BUG Junio C Hamano
  2025-07-31  6:39 ` [PATCH 2/5] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-07-31  6:39 ` Junio C Hamano
  2025-07-31  6:39 ` [PATCH 4/5] string-list: optionally trim string pieces split by string_list_split() Junio C Hamano
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31  6:39 UTC
  To: git

Thanks to the previous step, the only difference between these two
related functions is that string_list_split() works on a string
without modifying its contents (i.e. taking "const char *") and the
resulting pieces of strings are their own copies in a string list,
while string_list_split_in_place() works on a mutable string and the
resulting pieces of strings come from the original string.

Consolidate their implementations into a single helper function, and
make both of them thin wrappers around it.  We can later add an extra
flags parameter to extend both of these functions by updating only
the internal helper function.
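
The shape of that consolidation can be sketched as below.  This is an
illustration under simplifying assumptions, not the code in the patch:
struct sketch_list is a hypothetical fixed-size list, and the worker
takes a mutable pointer with the cast placed in the copying wrapper,
which is the opposite of where the patch puts it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX 16

/* Hypothetical fixed-size list; `owned` mimics a "strings are copies" bit. */
struct sketch_list {
	char *items[SKETCH_MAX];
	size_t nr;
	int owned;
};

static char *copy_range(const char *p, size_t len)
{
	char *copy = malloc(len + 1);

	assert(copy);
	memcpy(copy, p, len);
	copy[len] = '\0';
	return copy;
}

/* Shared worker: in_place decides between NUL-terminating and copying. */
static int sketch_split(struct sketch_list *list, char *string,
			const char *delim, int maxsplit, int in_place)
{
	int count = 0;
	char *p = string;

	assert(!!in_place == !list->owned);
	for (;;) {
		char *end = NULL;

		count++;
		if (maxsplit < 0 || count <= maxsplit)
			end = strpbrk(p, delim);
		assert(list->nr < SKETCH_MAX);
		if (in_place) {
			if (end)
				*end = '\0';	/* cut the piece off in the input */
			list->items[list->nr++] = p;
		} else {
			size_t len = end ? (size_t)(end - p) : strlen(p);
			list->items[list->nr++] = copy_range(p, len);
		}
		if (!end)
			return count;
		p = end + 1;
	}
}

static int sketch_split_copy(struct sketch_list *list, const char *string,
			     const char *delim, int maxsplit)
{
	/* the worker never writes through the pointer when in_place is 0 */
	return sketch_split(list, (char *)string, delim, maxsplit, 0);
}

static int sketch_split_in_place(struct sketch_list *list, char *string,
				 const char *delim, int maxsplit)
{
	return sketch_split(list, string, delim, maxsplit, 1);
}
```

Either way, one of the two wrappers has to bridge the constness gap;
keeping the cast in a single private place is what lets the public
signatures stay type-safe.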

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 90 +++++++++++++++++++++++++++++----------------------
 1 file changed, 51 insertions(+), 39 deletions(-)

diff --git a/string-list.c b/string-list.c
index 2284a009cb..893e82be49 100644
--- a/string-list.c
+++ b/string-list.c
@@ -276,55 +276,67 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 	list->nr--;
 }
 
-int string_list_split(struct string_list *list, const char *string,
-		      const char *delim, int maxsplit)
+static void append_one(struct string_list *list,
+		       const char *p, const char *end,
+		       int in_place)
+{
+	if (!end)
+		end = p + strlen(p);
+
+	if (in_place) {
+		*((char *)end) = '\0';
+		string_list_append(list, p);
+	} else {
+		string_list_append_nodup(list, xmemdupz(p, end - p));
+	}
+}
+
+/*
+ * Unfortunately this cannot become a public interface, as _in_place()
+ * wants to have "char *string" while the other variant wants to
+ * have "const char *string" for type safety.
+ *
+ * This accepts "const char *string" to allow both wrappers to use it;
+ * it internally casts away the constness when in_place is true by
+ * taking advantage of strpbrk() that takes a "const char *" arg and
+ * returns "char *" pointer into that const string.  Yucky but works ;-).
+ */
+static int split_string(struct string_list *list, const char *string, const char *delim,
+			int maxsplit, int in_place)
 {
 	int count = 0;
-	const char *p = string, *end;
+	const char *p = string;
+
+	if (in_place && list->strdup_strings)
+		BUG("string_list_split_in_place() called with strdup_strings");
+	else if (!in_place && !list->strdup_strings)
+		BUG("string_list_split() called without strdup_strings");
 
-	if (!list->strdup_strings)
-		BUG("internal error in string_list_split(): "
-		    "list->strdup_strings must be set");
 	for (;;) {
+		char *end;
+
 		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			string_list_append_nodup(list, xmemdupz(p, end - p));
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
+		if (maxsplit >= 0 && count > maxsplit)
+			end = NULL;
+		else
+			end = strpbrk(p, delim);
+
+		append_one(list, p, end, in_place);
+
+		if (!end)
 			return count;
-		}
+		p = end + 1;
 	}
 }
 
+int string_list_split(struct string_list *list, const char *string,
+		      const char *delim, int maxsplit)
+{
+	return split_string(list, string, delim, maxsplit, 0);
+}
+
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	int count = 0;
-	char *p = string, *end;
-
-	if (list->strdup_strings)
-		BUG("internal error in string_list_split_in_place(): "
-		    "list->strdup_strings must not be set");
-	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			*end = '\0';
-			string_list_append(list, p);
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
-			return count;
-		}
-	}
+	return split_string(list, string, delim, maxsplit, 1);
 }
-- 
2.50.1-612-g4756c59422



* [PATCH 4/5] string-list: optionally trim string pieces split by string_list_split()
  2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
                   ` (2 preceding siblings ...)
  2025-07-31  6:39 ` [PATCH 3/5] string-list: unify string_list_split* functions Junio C Hamano
@ 2025-07-31  6:39 ` Junio C Hamano
  2025-07-31  6:39 ` [PATCH 5/5] diff: simplify parsing of diff.colormovedws Junio C Hamano
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
  5 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31  6:39 UTC
  To: git

Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.
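
What trimming a piece amounts to can be sketched without any list at
all.  trim_range() and trimmed_length() below are hypothetical helpers
for illustration; they narrow a [start, end) range and never copy or
write.  Note one difference from the patch: the patch skips leading
whitespace before it searches for the next delimiter, so results can
differ from this sketch when whitespace is itself a delimiter.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: shrink the range [*p, *end) so that it has no
 * leading or trailing whitespace.  Nothing is copied or written.
 */
static void trim_range(const char **p, const char **end)
{
	assert(*p <= *end);
	/* ltrim: advance the start past leading whitespace */
	while (*p < *end && isspace((unsigned char)**p))
		(*p)++;
	/* rtrim: pull the end back over trailing whitespace */
	while (*p < *end && isspace((unsigned char)(*end)[-1]))
		(*end)--;
}

/* Length of the trimmed piece that starts at s and is len bytes long. */
static size_t trimmed_length(const char *s, size_t len)
{
	const char *p = s, *end = s + len;

	trim_range(&p, &end);
	return (size_t)(end - p);
}
```

A piece made only of whitespace trims down to the empty string rather
than disappearing, so the number of pieces is unchanged by trimming.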

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c                | 35 +++++++++++++++++---
 string-list.h                | 10 ++++++
 t/unit-tests/u-string-list.c | 64 ++++++++++++++++++++++++++++++++++++
 3 files changed, 104 insertions(+), 5 deletions(-)

diff --git a/string-list.c b/string-list.c
index 893e82be49..c6a3afb15a 100644
--- a/string-list.c
+++ b/string-list.c
@@ -278,11 +278,18 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 
 static void append_one(struct string_list *list,
 		       const char *p, const char *end,
-		       int in_place)
+		       int in_place, unsigned flags)
 {
 	if (!end)
 		end = p + strlen(p);
 
+	if ((flags & STRING_LIST_SPLIT_TRIM)) {
+		/* rtrim */
+		for (; p < end; end--)
+			if (!isspace(end[-1]))
+				break;
+	}
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
@@ -302,7 +309,7 @@ static void append_one(struct string_list *list,
  * returns "char *" pointer into that const string.  Yucky but works ;-).
  */
 static int split_string(struct string_list *list, const char *string, const char *delim,
-			int maxsplit, int in_place)
+			int maxsplit, int in_place, unsigned flags)
 {
 	int count = 0;
 	const char *p = string;
@@ -315,13 +322,19 @@ static int split_string(struct string_list *list, const char *string, const char
 	for (;;) {
 		char *end;
 
+		if (flags & STRING_LIST_SPLIT_TRIM) {
+			/* ltrim */
+			while (*p && isspace(*p))
+				p++;
+		}
+
 		count++;
 		if (maxsplit >= 0 && count > maxsplit)
 			end = NULL;
 		else
 			end = strpbrk(p, delim);
 
-		append_one(list, p, end, in_place);
+		append_one(list, p, end, in_place, flags);
 
 		if (!end)
 			return count;
@@ -332,11 +345,23 @@ static int split_string(struct string_list *list, const char *string, const char
 int string_list_split(struct string_list *list, const char *string,
 		      const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 0);
+	return split_string(list, string, delim, maxsplit, 0, 0);
 }
 
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 1);
+	return split_string(list, string, delim, maxsplit, 1, 0);
+}
+
+int string_list_split_f(struct string_list *list, const char *string,
+			const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 0, flags);
+}
+
+int string_list_split_in_place_f(struct string_list *list, char *string,
+			       const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 1, flags);
 }
diff --git a/string-list.h b/string-list.h
index 6c8650efde..ee9922af67 100644
--- a/string-list.h
+++ b/string-list.h
@@ -281,4 +281,14 @@ int string_list_split(struct string_list *list, const char *string,
  */
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit);
+
+/* trim() resulting string piece before adding it to the list */
+#define STRING_LIST_SPLIT_TRIM 01
+
+int string_list_split_f(struct string_list *, const char *string,
+			const char *delim, int maxsplit, unsigned flags);
+
+int string_list_split_in_place_f(struct string_list *, char *string,
+				 const char *delim, int maxsplit, unsigned flags);
+
 #endif /* STRING_LIST_H */
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index 150a5f505f..daa9307e45 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -63,6 +63,70 @@ static void t_string_list_split(const char *data, const char *delim, int maxspli
 	string_list_clear(&list, 0);
 }
 
+static void t_string_list_split_f(const char *data, const char *delim,
+				  int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_DUP;
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_f(void)
+{
+	t_string_list_split_f("::foo:bar:baz:", ":", -1, 0,
+			      "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+			      "a", "b c", NULL);
+}
+
+static void t_string_list_split_in_place_f(const char *data_, const char *delim,
+					   int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *data = xstrdup(data_);
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_in_place_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	free(data);
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_in_place_f(void)
+{
+	t_string_list_split_in_place_f("::foo:bar:baz:", ":", -1, 0,
+				       "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_in_place_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+				       "a", "b c", NULL);
+}
+
 void test_string_list__split(void)
 {
 	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
-- 
2.50.1-612-g4756c59422



* [PATCH 5/5] diff: simplify parsing of diff.colormovedws
  2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
                   ` (3 preceding siblings ...)
  2025-07-31  6:39 ` [PATCH 4/5] string-list: optionally trim string pieces split by string_list_split() Junio C Hamano
@ 2025-07-31  6:39 ` Junio C Hamano
  2025-07-31 19:45   ` Eric Sunshine
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
  5 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31  6:39 UTC
  To: git

The code to parse this configuration variable, whose value is a
comma-separated list of known tokens like "ignore-space-change" and
"ignore-all-space", uses string_list_split() to split the value into
pieces, and then places each piece of string in a strbuf to trim,
before comparing the result with the list of known tokens.

Thanks to the previous steps, string_list_split_f() now knows how to
trim the resulting pieces in the string list.  Use it to simplify the
code.
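
The overall parse (split at commas, trim each piece, compare against
known tokens) can be sketched in a self-contained form as follows.
This is an illustration only: the SK_* flag values and
sketch_parse_ws() are hypothetical, only a subset of the tokens is
handled, and no string list is involved.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; the real values live in Git's headers. */
#define SK_IGNORE_SPACE_CHANGE 01
#define SK_IGNORE_ALL_SPACE    02
#define SK_ERROR               04

/* Does the len-byte piece at p spell exactly the token tok? */
static int piece_is(const char *p, size_t len, const char *tok)
{
	return strlen(tok) == len && !memcmp(p, tok, len);
}

static unsigned sketch_parse_ws(const char *arg)
{
	unsigned ret = 0;

	assert(arg);
	for (;;) {
		const char *end = strchr(arg, ',');
		const char *stop = end ? end : arg + strlen(arg);
		size_t len;

		/* trim both sides of the piece before comparing */
		while (arg < stop && isspace((unsigned char)*arg))
			arg++;
		while (arg < stop && isspace((unsigned char)stop[-1]))
			stop--;
		len = (size_t)(stop - arg);

		if (piece_is(arg, len, "no"))
			ret = 0;	/* "no" resets whatever came before */
		else if (piece_is(arg, len, "ignore-space-change"))
			ret |= SK_IGNORE_SPACE_CHANGE;
		else if (piece_is(arg, len, "ignore-all-space"))
			ret |= SK_IGNORE_ALL_SPACE;
		else
			ret |= SK_ERROR;

		if (!end)
			return ret;
		arg = end + 1;
	}
}
```

Because trimming happens before the comparison, a value written as
"ignore-space-change , ignore-all-space" is treated the same as one
written without the spaces.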

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index a81949a422..70666ad2cd 100644
--- a/diff.c
+++ b/diff.c
@@ -327,29 +327,23 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ",", -1);
+	string_list_split_f(&l, arg, ",", -1, STRING_LIST_SPLIT_TRIM);
 
 	for_each_string_list_item(i, &l) {
-		struct strbuf sb = STRBUF_INIT;
-		strbuf_addstr(&sb, i->string);
-		strbuf_trim(&sb);
-
-		if (!strcmp(sb.buf, "no"))
+		if (!strcmp(i->string, "no"))
 			ret = 0;
-		else if (!strcmp(sb.buf, "ignore-space-change"))
+		else if (!strcmp(i->string, "ignore-space-change"))
 			ret |= XDF_IGNORE_WHITESPACE_CHANGE;
-		else if (!strcmp(sb.buf, "ignore-space-at-eol"))
+		else if (!strcmp(i->string, "ignore-space-at-eol"))
 			ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
-		else if (!strcmp(sb.buf, "ignore-all-space"))
+		else if (!strcmp(i->string, "ignore-all-space"))
 			ret |= XDF_IGNORE_WHITESPACE;
-		else if (!strcmp(sb.buf, "allow-indentation-change"))
+		else if (!strcmp(i->string, "allow-indentation-change"))
 			ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
 		else {
 			ret |= COLOR_MOVED_WS_ERROR;
-			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), sb.buf);
+			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), i->string);
 		}
-
-		strbuf_release(&sb);
 	}
 
 	if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
-- 
2.50.1-612-g4756c59422



* Re: [PATCH 1/5] string-list: report programming error with BUG
  2025-07-31  6:39 ` [PATCH 1/5] string-list: report programming error with BUG Junio C Hamano
@ 2025-07-31 19:33   ` Eric Sunshine
  2025-07-31 22:16     ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Eric Sunshine @ 2025-07-31 19:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 2:40 AM Junio C Hamano <gitster@pobox.com> wrote:
> Passing a string list that has .strdup_strings bit unset to
> string_list_split(), orone that has .strdup_strings bit set to
> string_list_split_in_place(), is a programmer error.  Do not use
> die() to abort the execution.  Use BUG() instead.

s/orone/or one/

> As a developer-facing message, the message string itself should
> be a lot more concise, but let's keep the original one for now.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>


* Re: [PATCH 2/5] string-list: align string_list_split() with its _in_place() counterpart
  2025-07-31  6:39 ` [PATCH 2/5] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-07-31 19:36   ` Eric Sunshine
  0 siblings, 0 replies; 72+ messages in thread
From: Eric Sunshine @ 2025-07-31 19:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 2:40 AM Junio C Hamano <gitster@pobox.com> wrote:
> For some unknown reason, unlike string_list_split_in_place(),
> string_list_split() took only a single character as a field
> delimiter.  Before giving both functions more features in future
> commits, allow stirng_list_split() to take more than one delimiter
> characters to make them closer to each other.

s/stirng/string/

> Signed-off-by: Junio C Hamano <gitster@pobox.com>


* Re: [PATCH 5/5] diff: simplify parsing of diff.colormovedws
  2025-07-31  6:39 ` [PATCH 5/5] diff: simplify parsing of diff.colormovedws Junio C Hamano
@ 2025-07-31 19:45   ` Eric Sunshine
  0 siblings, 0 replies; 72+ messages in thread
From: Eric Sunshine @ 2025-07-31 19:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 2:40 AM Junio C Hamano <gitster@pobox.com> wrote:
> The code to parse this configuration variable, whose value is a
> comma separated known tokens like "ignore-space-change" and
> "ignore-all-space", uses string_list_split() to split the value int
> pieces, and then places each piece of string in a strbuf to trim,
> before comparing the result with the list of known tokens.

s/int/into/

> Thanks to the previous steps, now string_list_split() knows to trim
> the resulting pieces in the string list.  Use it to simplify the
> code.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> diff --git a/diff.c b/diff.c
> @@ -327,29 +327,23 @@ static unsigned parse_color_moved_ws(const char *arg)
> -       string_list_split(&l, arg, ",", -1);
> +       string_list_split_f(&l, arg, ",", -1, STRING_LIST_SPLIT_TRIM);
>
>         for_each_string_list_item(i, &l) {
> -               struct strbuf sb = STRBUF_INIT;
> -               strbuf_addstr(&sb, i->string);
> -               strbuf_trim(&sb);
> -
> -               if (!strcmp(sb.buf, "no"))
> +               if (!strcmp(i->string, "no"))
>                         ret = 0;
> -               else if (!strcmp(sb.buf, "ignore-space-change"))
> +               else if (!strcmp(i->string, "ignore-space-change"))
>                         ret |= XDF_IGNORE_WHITESPACE_CHANGE;
> -               else if (!strcmp(sb.buf, "ignore-space-at-eol"))
> +               else if (!strcmp(i->string, "ignore-space-at-eol"))
>                         ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
> -               else if (!strcmp(sb.buf, "ignore-all-space"))
> +               else if (!strcmp(i->string, "ignore-all-space"))
>                         ret |= XDF_IGNORE_WHITESPACE;
> -               else if (!strcmp(sb.buf, "allow-indentation-change"))
> +               else if (!strcmp(i->string, "allow-indentation-change"))
>                         ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
>                 else {
>                         ret |= COLOR_MOVED_WS_ERROR;
> -                       error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), sb.buf);
> +                       error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), i->string);
>                 }
> -
> -               strbuf_release(&sb);
>         }

An unfortunately noisy diff, but it can't be helped. The end result is
a pleasant improvement.


* Re: [PATCH 1/5] string-list: report programming error with BUG
  2025-07-31 19:33   ` Eric Sunshine
@ 2025-07-31 22:16     ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:16 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Thu, Jul 31, 2025 at 2:40 AM Junio C Hamano <gitster@pobox.com> wrote:
>> Passing a string list that has .strdup_strings bit unset to
>> string_list_split(), orone that has .strdup_strings bit set to
>> string_list_split_in_place(), is a programmer error.  Do not use
>> die() to abort the execution.  Use BUG() instead.
>
> s/orone/or one/

Thanks, as always, for typofixes.  Not just this step but for other
steps in the series.  Will use them when I update them.



* [PATCH v2 0/7] string_list_split*() updates
  2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
                   ` (4 preceding siblings ...)
  2025-07-31  6:39 ` [PATCH 5/5] diff: simplify parsing of diff.colormovedws Junio C Hamano
@ 2025-07-31 22:45 ` Junio C Hamano
  2025-07-31 22:46   ` [PATCH v2 1/7] string-list: report programming error with BUG Junio C Hamano
                     ` (7 more replies)
  5 siblings, 8 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:45 UTC (permalink / raw)
  To: git

Two related string-list API functions, string_list_split() and
string_list_split_in_place(), more or less duplicate each other's
implementations.  They both take a single string, split the
string at the delimiter, and stuff the result into a string list.

However, there is one subtle and unnecessary difference.  The non
"in-place" variant only allows a single byte value as delimiter,
while the "in-place" variant can take multiple delimiters (e.g.,
"split at either a comma or a space").

This series first updates the string_list_split() to allow multiple
delimiters like string_list_split_in_place() does, by unifying their
implementations into one.  This refactoring allows us to give new
features to these two functions more easily.

Then these functions learn to optionally

 - trim the split string pieces before placing them in the resulting
   string list.

 - omit empty string pieces from the resulting string list.

An existing caller of string_list_split() in diff.c trims the
elements in the resulting string list before it uses them, which is
simplified by taking advantage of this new feature.

A handful of code paths call string_list_split*(), immediately
followed by string_list_remove_empty_items().  They are simplified
by not placing empty items in the list in the first place.
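
To illustrate the two optional behaviours described above, here is a
minimal standalone sketch.  It is not the code in this series: the
flag names SKETCH_TRIM and SKETCH_OMIT_EMPTY, the function
split_with_flags(), and the fixed-size output array are all
hypothetical.  The sketch also assumes that a piece that becomes empty
after trimming counts as empty.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag bits for this sketch only. */
#define SKETCH_TRIM       (1u << 0)
#define SKETCH_OMIT_EMPTY (1u << 1)

/*
 * Split `string` at any byte in `delim` and store up to `cap`
 * malloc'ed pieces in `out`.  Pieces beyond `cap` are dropped.
 * Returns the number of pieces stored.
 */
static size_t split_with_flags(const char *string, const char *delim,
			       unsigned flags, char **out, size_t cap)
{
	size_t count = 0;
	const char *p = string;

	for (;;) {
		const char *end = strpbrk(p, delim);
		const char *next = end ? end + 1 : NULL;
		const char *q = p;
		size_t len;

		if (!end)
			end = p + strlen(p);
		if (flags & SKETCH_TRIM) {
			while (q < end && isspace((unsigned char)*q))
				q++;
			while (q < end && isspace((unsigned char)end[-1]))
				end--;
		}
		len = (size_t)(end - q);

		/* skip empty pieces only when asked to */
		if ((len || !(flags & SKETCH_OMIT_EMPTY)) && count < cap) {
			char *piece = malloc(len + 1);

			if (!piece)
				return count;
			memcpy(piece, q, len);
			piece[len] = '\0';
			out[count++] = piece;
		}
		if (!next)
			return count;
		p = next;
	}
}
```

Splitting "a, ,b," at "," with both flags set yields two pieces, "a"
and "b"; with no flags it yields four, including " " and "".  The
caller owns the returned pieces and is expected to free() them.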

Junio C Hamano (7):
  string-list: report programming error with BUG
  string-list: align string_list_split() with its _in_place()
    counterpart
  string-list: unify string_list_split* functions
  string-list: optionally trim string pieces split by
    string_list_split*()
  diff: simplify parsing of diff.colormovedws
  string-list: optionally omit empty string pieces in
    string_list_split*()
  string-list: split-then-remove-empty can be done while splitting

 builtin/blame.c              |   2 +-
 builtin/merge.c              |   2 +-
 builtin/var.c                |   2 +-
 connect.c                    |   2 +-
 diff.c                       |  20 ++----
 fetch-pack.c                 |   2 +-
 notes.c                      |   6 +-
 parse-options.c              |   2 +-
 pathspec.c                   |   3 +-
 protocol.c                   |   2 +-
 ref-filter.c                 |   4 +-
 setup.c                      |   3 +-
 string-list.c                | 120 ++++++++++++++++++++++++-----------
 string-list.h                |  29 ++++++---
 t/helper/test-hashmap.c      |   4 +-
 t/helper/test-json-writer.c  |   4 +-
 t/helper/test-path-utils.c   |   3 +-
 t/helper/test-ref-store.c    |   2 +-
 t/unit-tests/u-string-list.c |  95 ++++++++++++++++++++++++---
 transport.c                  |   2 +-
 upload-pack.c                |   2 +-
 21 files changed, 221 insertions(+), 90 deletions(-)


1:  e56dc89249 ! 1:  1c2b222eec string-list: report programming error with BUG
    @@ Commit message
         string-list: report programming error with BUG
     
         Passing a string list that has .strdup_strings bit unset to
    -    string_list_split(), orone that has .strdup_strings bit set to
    +    string_list_split(), or one that has .strdup_strings bit set to
         string_list_split_in_place(), is a programmer error.  Do not use
         die() to abort the execution.  Use BUG() instead.
     
2:  1bd3506fad ! 2:  a7e07b94ef string-list: align string_list_split() with its _in_place() counterpart
    @@ Commit message
         For some unknown reason, unlike string_list_split_in_place(),
         string_list_split() took only a single character as a field
         delimiter.  Before giving both functions more features in future
    -    commits, allow stirng_list_split() to take more than one delimiter
    +    commits, allow string_list_split() to take more than one delimiter
         characters to make them closer to each other.
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
3:  52c3b694d2 ! 3:  b7a7fbb975 string-list: unify string_list_split* functions
    @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i
      
     -int string_list_split(struct string_list *list, const char *string,
     -		      const char *delim, int maxsplit)
    -+static void append_one(struct string_list *list,
    -+		       const char *p, const char *end,
    -+		       int in_place)
    ++/*
    ++ * append a substring [p..end] to list; return number of things it
    ++ * appended to the list.
    ++ */
    ++static int append_one(struct string_list *list,
    ++		      const char *p, const char *end,
    ++		      int in_place)
     +{
     +	if (!end)
     +		end = p + strlen(p);
    @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i
     +	} else {
     +		string_list_append_nodup(list, xmemdupz(p, end - p));
     +	}
    ++	return 1;
     +}
     +
     +/*
    @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i
     -		BUG("internal error in string_list_split(): "
     -		    "list->strdup_strings must be set");
      	for (;;) {
    -+		char *end;
    -+
    - 		count++;
    +-		count++;
     -		if (maxsplit >= 0 && count > maxsplit) {
     -			string_list_append(list, p);
     -			return count;
    @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i
     -			p = end + 1;
     -		} else {
     -			string_list_append(list, p);
    -+		if (maxsplit >= 0 && count > maxsplit)
    ++		char *end;
    ++
    ++		if (0 <= maxsplit && maxsplit <= count)
     +			end = NULL;
     +		else
     +			end = strpbrk(p, delim);
     +
    -+		append_one(list, p, end, in_place);
    ++		count += append_one(list, p, end, in_place);
     +
     +		if (!end)
      			return count;
4:  13e3d9fbaf ! 4:  c566d88c28 string-list: optionally trim string pieces split by string_list_split()
    @@ Metadata
     Author: Junio C Hamano <gitster@pobox.com>
     
      ## Commit message ##
    -    string-list: optionally trim string pieces split by string_list_split()
    +    string-list: optionally trim string pieces split by string_list_split*()
     
         Teach the unified split_string() to take an optional "flags" word,
         and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
    @@ Commit message
     
      ## string-list.c ##
     @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
    - 
    - static void append_one(struct string_list *list,
    - 		       const char *p, const char *end,
    --		       int in_place)
    -+		       int in_place, unsigned flags)
    +  */
    + static int append_one(struct string_list *list,
    + 		      const char *p, const char *end,
    +-		      int in_place)
    ++		      int in_place, unsigned flags)
      {
      	if (!end)
      		end = p + strlen(p);
    @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i
      	if (in_place) {
      		*((char *)end) = '\0';
      		string_list_append(list, p);
    -@@ string-list.c: static void append_one(struct string_list *list,
    +@@ string-list.c: static int append_one(struct string_list *list,
       * returns "char *" pointer into that const string.  Yucky but works ;-).
       */
      static int split_string(struct string_list *list, const char *string, const char *delim,
    @@ string-list.c: static int split_string(struct string_list *list, const char *str
     +				p++;
     +		}
     +
    - 		count++;
    - 		if (maxsplit >= 0 && count > maxsplit)
    + 		if (0 <= maxsplit && maxsplit <= count)
      			end = NULL;
      		else
      			end = strpbrk(p, delim);
      
    --		append_one(list, p, end, in_place);
    -+		append_one(list, p, end, in_place, flags);
    +-		count += append_one(list, p, end, in_place);
    ++		count += append_one(list, p, end, in_place, flags);
      
      		if (!end)
      			return count;
5:  912c6ee193 ! 5:  eb272e0f22 diff: simplify parsing of diff.colormovedws
    @@ Commit message
     
         The code to parse this configuration variable, whose value is a
         comma separated known tokens like "ignore-space-change" and
    -    "ignore-all-space", uses string_list_split() to split the value int
    +    "ignore-all-space", uses string_list_split() to split the value into
         pieces, and then places each piece of string in a strbuf to trim,
         before comparing the result with the list of known tokens.
     
    -    Thanks to the previous steps, now string_list_split() knows to trim
    -    the resulting pieces in the string list.  Use it to simplify the
    -    code.
    +    Thanks to the previous steps, now string_list_split() can trim the
    +    resulting pieces before it places them in the string list.  Use it
    +    to simplify the code.
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
-:  ---------- > 6:  d418078a84 string-list: optionally omit empty string pieces in string_list_split*()
-:  ---------- > 7:  12c1189a08 string-list: split-then-remove-empty can be done while splitting


* [PATCH v2 1/7] string-list: report programming error with BUG
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-07-31 22:46   ` [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

Passing a string list that has .strdup_strings bit unset to
string_list_split(), or one that has .strdup_strings bit set to
string_list_split_in_place(), is a programmer error.  Do not use
die() to abort the execution.  Use BUG() instead.

As a developer-facing message, the message string itself should
be a lot more concise, but let's keep the original one for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/string-list.c b/string-list.c
index 53faaa8420..0cb920e9b0 100644
--- a/string-list.c
+++ b/string-list.c
@@ -283,7 +283,7 @@ int string_list_split(struct string_list *list, const char *string,
 	const char *p = string, *end;
 
 	if (!list->strdup_strings)
-		die("internal error in string_list_split(): "
+		BUG("internal error in string_list_split(): "
 		    "list->strdup_strings must be set");
 	for (;;) {
 		count++;
@@ -309,7 +309,7 @@ int string_list_split_in_place(struct string_list *list, char *string,
 	char *p = string, *end;
 
 	if (list->strdup_strings)
-		die("internal error in string_list_split_in_place(): "
+		BUG("internal error in string_list_split_in_place(): "
 		    "list->strdup_strings must not be set");
 	for (;;) {
 		count++;
-- 
2.50.1-618-g45d530d26b



* [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
  2025-07-31 22:46   ` [PATCH v2 1/7] string-list: report programming error with BUG Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-08-01  2:33     ` shejialuo
  2025-07-31 22:46   ` [PATCH v2 3/7] string-list: unify string_list_split* functions Junio C Hamano
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

For some unknown reason, unlike string_list_split_in_place(),
string_list_split() took only a single character as a field
delimiter.  Before giving both functions more features in future
commits, allow string_list_split() to take more than one delimiter
character to make them closer to each other.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
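A standalone sketch of the behaviour change, for illustration only.
It relies on nothing but standard C: strchr() looks for one byte,
while strpbrk() looks for the first byte that appears anywhere in a
delimiter set.  The names dup_piece() and split_any() and the
fixed-size output array are hypothetical and not part of Git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy [p, p + len) into a fresh NUL-terminated string. */
static char *dup_piece(const char *p, size_t len)
{
	char *s = malloc(len + 1);

	assert(s);
	memcpy(s, p, len);
	s[len] = '\0';
	return s;
}

/*
 * Split `string` at any byte found in `delim`, storing up to `cap`
 * malloc'ed pieces in `out`.  A non-negative `maxsplit` limits the
 * number of splits; the remainder goes into the last piece.  Pieces
 * beyond `cap` are dropped.  Returns the number of pieces stored.
 */
static size_t split_any(const char *string, const char *delim, int maxsplit,
			char **out, size_t cap)
{
	size_t count = 0;
	const char *p = string;

	while (count < cap) {
		const char *end = NULL;

		if (maxsplit < 0 || count < (size_t)maxsplit)
			end = strpbrk(p, delim);
		if (!end) {
			/* no more delimiters, or split budget used up */
			out[count++] = dup_piece(p, strlen(p));
			break;
		}
		out[count++] = dup_piece(p, (size_t)(end - p));
		p = end + 1;
	}
	return count;
}
```

With this sketch, splitting "a,b c" at ", " gives "a", "b" and "c",
which a single-byte delimiter cannot express in one call.  Splitting
"foo:bar:baz" at ":" with maxsplit 1 gives "foo" and "bar:baz",
matching the examples documented in string-list.h.
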
 builtin/blame.c              |  2 +-
 builtin/merge.c              |  2 +-
 builtin/var.c                |  2 +-
 connect.c                    |  2 +-
 diff.c                       |  2 +-
 fetch-pack.c                 |  2 +-
 notes.c                      |  2 +-
 parse-options.c              |  2 +-
 pathspec.c                   |  2 +-
 protocol.c                   |  2 +-
 ref-filter.c                 |  4 ++--
 setup.c                      |  3 ++-
 string-list.c                |  4 ++--
 string-list.h                | 16 ++++++++--------
 t/helper/test-path-utils.c   |  3 ++-
 t/helper/test-ref-store.c    |  2 +-
 t/unit-tests/u-string-list.c | 16 ++++++++--------
 transport.c                  |  2 +-
 upload-pack.c                |  2 +-
 19 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 91586e6852..70a6460401 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -420,7 +420,7 @@ static void parse_color_fields(const char *s)
 	colorfield_nr = 0;
 
 	/* Ideally this would be stripped and split at the same time? */
-	string_list_split(&l, s, ',', -1);
+	string_list_split(&l, s, ",", -1);
 	ALLOC_GROW(colorfield, colorfield_nr + 1, colorfield_alloc);
 
 	for_each_string_list_item(item, &l) {
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..893f8950bf 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -875,7 +875,7 @@ static void add_strategies(const char *string, unsigned attr)
 	if (string) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		struct string_list_item *item;
-		string_list_split(&list, string, ' ', -1);
+		string_list_split(&list, string, " ", -1);
 		for_each_string_list_item(item, &list)
 			append_strategy(get_strategy(item->string));
 		string_list_clear(&list, 0);
diff --git a/builtin/var.c b/builtin/var.c
index ada642a9fe..4ae7af0eff 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -181,7 +181,7 @@ static void list_vars(void)
 			if (ptr->multivalued && *val) {
 				struct string_list list = STRING_LIST_INIT_DUP;
 
-				string_list_split(&list, val, '\n', -1);
+				string_list_split(&list, val, "\n", -1);
 				for (size_t i = 0; i < list.nr; i++)
 					printf("%s=%s\n", ptr->name, list.items[i].string);
 				string_list_clear(&list, 0);
diff --git a/connect.c b/connect.c
index e77287f426..867b12bde5 100644
--- a/connect.c
+++ b/connect.c
@@ -407,7 +407,7 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
 	 * name.  Subsequent fields (symref-target and peeled) are optional and
 	 * don't have a particular order.
 	 */
-	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+	if (string_list_split(&line_sections, line, " ", -1) < 2) {
 		ret = 0;
 		goto out;
 	}
diff --git a/diff.c b/diff.c
index dca87e164f..a81949a422 100644
--- a/diff.c
+++ b/diff.c
@@ -327,7 +327,7 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ',', -1);
+	string_list_split(&l, arg, ",", -1);
 
 	for_each_string_list_item(i, &l) {
 		struct strbuf sb = STRBUF_INIT;
diff --git a/fetch-pack.c b/fetch-pack.c
index c1be9b76eb..9866270696 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1914,7 +1914,7 @@ static void fetch_pack_config(void)
 		char *str;
 
 		if (!git_config_get_string("fetch.uriprotocols", &str) && str) {
-			string_list_split(&uri_protocols, str, ',', -1);
+			string_list_split(&uri_protocols, str, ",", -1);
 			free(str);
 		}
 	}
diff --git a/notes.c b/notes.c
index 97b995f3f2..6afcf088b9 100644
--- a/notes.c
+++ b/notes.c
@@ -892,7 +892,7 @@ static int string_list_add_note_lines(struct string_list *list,
 	 * later, along with any empty strings that came from empty
 	 * lines within the file.
 	 */
-	string_list_split(list, data, '\n', -1);
+	string_list_split(list, data, "\n", -1);
 	free(data);
 	return 0;
 }
diff --git a/parse-options.c b/parse-options.c
index 5224203ffe..9e7cb75192 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1338,7 +1338,7 @@ static enum parse_opt_result usage_with_options_internal(struct parse_opt_ctx_t
 		if (!saw_empty_line && !*str)
 			saw_empty_line = 1;
 
-		string_list_split(&list, str, '\n', -1);
+		string_list_split(&list, str, "\n", -1);
 		for (j = 0; j < list.nr; j++) {
 			const char *line = list.items[j].string;
 
diff --git a/pathspec.c b/pathspec.c
index a3ddd701c7..de325f7ef9 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,7 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, ' ', -1);
+	string_list_split(&list, value, " ", -1);
 	string_list_remove_empty_items(&list, 0);
 
 	item->attr_check = attr_check_alloc();
diff --git a/protocol.c b/protocol.c
index bae7226ff4..54b9f49c01 100644
--- a/protocol.c
+++ b/protocol.c
@@ -61,7 +61,7 @@ enum protocol_version determine_protocol_version_server(void)
 	if (git_protocol) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		const struct string_list_item *item;
-		string_list_split(&list, git_protocol, ':', -1);
+		string_list_split(&list, git_protocol, ":", -1);
 
 		for_each_string_list_item(item, &list) {
 			const char *value;
diff --git a/ref-filter.c b/ref-filter.c
index f9f2c512a8..4edfb9c83b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -435,7 +435,7 @@ static int remote_ref_atom_parser(struct ref_format *format UNUSED,
 	}
 
 	atom->u.remote_ref.nobracket = 0;
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
@@ -831,7 +831,7 @@ static int align_atom_parser(struct ref_format *format UNUSED,
 
 	align->position = ALIGN_LEFT;
 
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
 		int position;
diff --git a/setup.c b/setup.c
index 6f52dab64c..b9f5eb8b51 100644
--- a/setup.c
+++ b/setup.c
@@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 
 	if (env_ceiling_dirs) {
 		int empty_entry_found = 0;
+		static const char path_sep[] = { PATH_SEP, '\0' };
 
-		string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, env_ceiling_dirs, path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   canonicalize_ceiling_entry, &empty_entry_found);
 		ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);
diff --git a/string-list.c b/string-list.c
index 0cb920e9b0..2284a009cb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -277,7 +277,7 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 }
 
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit)
+		      const char *delim, int maxsplit)
 {
 	int count = 0;
 	const char *p = string, *end;
@@ -291,7 +291,7 @@ int string_list_split(struct string_list *list, const char *string,
 			string_list_append(list, p);
 			return count;
 		}
-		end = strchr(p, delim);
+		end = strpbrk(p, delim);
 		if (end) {
 			string_list_append_nodup(list, xmemdupz(p, end - p));
 			p = end + 1;
diff --git a/string-list.h b/string-list.h
index 122b318641..6c8650efde 100644
--- a/string-list.h
+++ b/string-list.h
@@ -254,7 +254,7 @@ struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
 
 /**
- * Split string into substrings on character `delim` and append the
+ * Split string into substrings on characters in `delim` and append the
  * substrings to `list`.  The input string is not modified.
  * list->strdup_strings must be set, as new memory needs to be
  * allocated to hold the substrings.  If maxsplit is non-negative,
@@ -262,15 +262,15 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  * appended to list.
  *
  * Examples:
- *   string_list_split(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 0) -> ["foo:bar:baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
- *   string_list_split(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
- *   string_list_split(l, "", ':', -1) -> [""]
- *   string_list_split(l, ":", ':', -1) -> ["", ""]
+ *   string_list_split(l, "foo:bar:baz", ":", -1) -> ["foo", "bar", "baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 0) -> ["foo:bar:baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 1) -> ["foo", "bar:baz"]
+ *   string_list_split(l, "foo:bar:", ":", -1) -> ["foo", "bar", ""]
+ *   string_list_split(l, "", ":", -1) -> [""]
+ *   string_list_split(l, ":", ":", -1) -> ["", ""]
  */
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit);
+		      const char *delim, int maxsplit);
 
 /*
  * Like string_list_split(), except that string is split in-place: the
diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
index 086238c826..f5f33751da 100644
--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -348,6 +348,7 @@ int cmd__path_utils(int argc, const char **argv)
 	if (argc == 4 && !strcmp(argv[1], "longest_ancestor_length")) {
 		int len;
 		struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
+		const char path_sep[] = { PATH_SEP, '\0' };
 		char *path = xstrdup(argv[2]);
 
 		/*
@@ -362,7 +363,7 @@ int cmd__path_utils(int argc, const char **argv)
 		 */
 		if (normalize_path_copy(path, path))
 			die("Path \"%s\" could not be normalized", argv[2]);
-		string_list_split(&ceiling_dirs, argv[3], PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, argv[3], path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   normalize_ceiling_entry, NULL);
 		len = longest_ancestor_length(path, &ceiling_dirs);
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 8d9a271845..aa1cb9b4ac 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -29,7 +29,7 @@ static unsigned int parse_flags(const char *str, struct flag_definition *defs)
 	if (!strcmp(str, "0"))
 		return 0;
 
-	string_list_split(&masks, str, ',', 64);
+	string_list_split(&masks, str, ",", 64);
 	for (size_t i = 0; i < masks.nr; i++) {
 		const char *name = masks.items[i].string;
 		struct flag_definition *def = defs;
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index d4ba5f9fa5..150a5f505f 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -43,7 +43,7 @@ static void t_string_list_equal(struct string_list *list,
 				  expected_strings->items[i].string);
 }
 
-static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
+static void t_string_list_split(const char *data, const char *delim, int maxsplit, ...)
 {
 	struct string_list expected_strings = STRING_LIST_INIT_DUP;
 	struct string_list list = STRING_LIST_INIT_DUP;
@@ -65,13 +65,13 @@ static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
 
 void test_string_list__split(void)
 {
-	t_string_list_split("foo:bar:baz", ':', -1, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 0, "foo:bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 1, "foo", "bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 2, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:", ':', -1, "foo", "bar", "", NULL);
-	t_string_list_split("", ':', -1, "", NULL);
-	t_string_list_split(":", ':', -1, "", "", NULL);
+	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 0, "foo:bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 1, "foo", "bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 2, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:", ":", -1, "foo", "bar", "", NULL);
+	t_string_list_split("", ":", -1, "", NULL);
+	t_string_list_split(":", ":", -1, "", "", NULL);
 }
 
 static void t_string_list_split_in_place(const char *data, const char *delim,
diff --git a/transport.c b/transport.c
index c123ac1e38..76487b5453 100644
--- a/transport.c
+++ b/transport.c
@@ -1042,7 +1042,7 @@ static const struct string_list *protocol_allow_list(void)
 	if (enabled < 0) {
 		const char *v = getenv("GIT_ALLOW_PROTOCOL");
 		if (v) {
-			string_list_split(&allowed, v, ':', -1);
+			string_list_split(&allowed, v, ":", -1);
 			string_list_sort(&allowed);
 			enabled = 1;
 		} else {
diff --git a/upload-pack.c b/upload-pack.c
index 4f26f6afc7..91fcdcad9b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1685,7 +1685,7 @@ static void process_args(struct packet_reader *request,
 			if (data->uri_protocols.nr)
 				send_err_and_die(data,
 						 "multiple packfile-uris lines forbidden");
-			string_list_split(&data->uri_protocols, p, ',', -1);
+			string_list_split(&data->uri_protocols, p, ",", -1);
 			continue;
 		}
 
-- 
2.50.1-618-g45d530d26b


* [PATCH v2 3/7] string-list: unify string_list_split* functions
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
  2025-07-31 22:46   ` [PATCH v2 1/7] string-list: report programming error with BUG Junio C Hamano
  2025-07-31 22:46   ` [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-08-01  3:00     ` shejialuo
  2025-07-31 22:46   ` [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
                     ` (4 subsequent siblings)
  7 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

Thanks to the previous step, the only difference between these two
related functions is that string_list_split() works on a string
without modifying its contents (i.e. taking "const char *") and the
resulting pieces of strings are their own copies in a string list,
while string_list_split_in_place() works on a mutable string and the
resulting pieces of strings come from the original string.

Consolidate their implementations into a single helper function, and
make them thin wrappers around it.  We can later add an extra flags
parameter to extend both of these functions by updating only the
internal helper function.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
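
For readers without the tree at hand, the difference between the two
variants can be modelled with nothing but the C library.  This is only
a sketch under stated assumptions: "struct pieces", MAX_PIECES and
split_sketch() are made-up names that do not exist in string-list.[ch],
and the fixed-size array merely stands in for a real string list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PIECES 16	/* arbitrary cap for this sketch */

struct pieces {
	char *item[MAX_PIECES];
	int nr;
};

/*
 * Split "string" at any byte in "delim", at most "maxsplit" times
 * (a negative maxsplit means "no limit").  With in_place set, the
 * caller promises a writable buffer and the pieces point into it;
 * otherwise each piece is a malloc'ed copy owned by the caller.
 */
static int split_sketch(struct pieces *out, const char *string,
			const char *delim, int maxsplit, int in_place)
{
	const char *p = string;
	int count = 0;

	assert(out && string && delim);
	for (;;) {
		const char *end = NULL;
		size_t len;

		if (maxsplit < 0 || count < maxsplit)
			end = strpbrk(p, delim);
		len = end ? (size_t)(end - p) : strlen(p);

		if (out->nr >= MAX_PIECES)
			return count;
		if (in_place) {
			/* overwrite the delimiter (or the final NUL) */
			((char *)p)[len] = '\0';
			out->item[out->nr++] = (char *)p;
		} else {
			char *copy = malloc(len + 1);
			if (!copy)
				return count;
			memcpy(copy, p, len);
			copy[len] = '\0';
			out->item[out->nr++] = copy;
		}
		count++;
		if (!end)
			return count;
		p = end + 1;
	}
}
```

Calling it with in_place set on a writable "foo:bar:baz" and maxsplit 1
leaves two pieces, "foo" and "bar:baz", both pointing into the caller's
buffer; with in_place clear the input may be a string literal and the
caller has to free() the copies.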
 string-list.c | 96 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 56 insertions(+), 40 deletions(-)

diff --git a/string-list.c b/string-list.c
index 2284a009cb..65b6ceb259 100644
--- a/string-list.c
+++ b/string-list.c
@@ -276,55 +276,71 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 	list->nr--;
 }
 
-int string_list_split(struct string_list *list, const char *string,
-		      const char *delim, int maxsplit)
+/*
+ * append a substring [p..end) to list; return number of things it
+ * appended to the list.
+ */
+static int append_one(struct string_list *list,
+		      const char *p, const char *end,
+		      int in_place)
+{
+	if (!end)
+		end = p + strlen(p);
+
+	if (in_place) {
+		*((char *)end) = '\0';
+		string_list_append(list, p);
+	} else {
+		string_list_append_nodup(list, xmemdupz(p, end - p));
+	}
+	return 1;
+}
+
+/*
+ * Unfortunately this cannot become a public interface, as _in_place()
+ * wants to have "char *string" while the other variant wants to
+ * have "const char *string" for type safety.
+ *
+ * This accepts "const char *string" to allow both wrappers to use it;
+ * it internally casts away the constness when in_place is true by
+ * taking advantage of strpbrk() that takes a "const char *" arg and
+ * returns "char *" pointer into that const string.  Yucky but works ;-).
+ */
+static int split_string(struct string_list *list, const char *string, const char *delim,
+			int maxsplit, int in_place)
 {
 	int count = 0;
-	const char *p = string, *end;
+	const char *p = string;
+
+	if (in_place && list->strdup_strings)
+		BUG("string_list_split_in_place() called with strdup_strings");
+	else if (!in_place && !list->strdup_strings)
+		BUG("string_list_split() called without strdup_strings");
 
-	if (!list->strdup_strings)
-		BUG("internal error in string_list_split(): "
-		    "list->strdup_strings must be set");
 	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			string_list_append_nodup(list, xmemdupz(p, end - p));
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
+		char *end;
+
+		if (0 <= maxsplit && maxsplit <= count)
+			end = NULL;
+		else
+			end = strpbrk(p, delim);
+
+		count += append_one(list, p, end, in_place);
+
+		if (!end)
 			return count;
-		}
+		p = end + 1;
 	}
 }
 
+int string_list_split(struct string_list *list, const char *string,
+		      const char *delim, int maxsplit)
+{
+	return split_string(list, string, delim, maxsplit, 0);
+}
+
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	int count = 0;
-	char *p = string, *end;
-
-	if (list->strdup_strings)
-		BUG("internal error in string_list_split_in_place(): "
-		    "list->strdup_strings must not be set");
-	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			*end = '\0';
-			string_list_append(list, p);
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
-			return count;
-		}
-	}
+	return split_string(list, string, delim, maxsplit, 1);
 }
-- 
2.50.1-618-g45d530d26b


* [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
                     ` (2 preceding siblings ...)
  2025-07-31 22:46   ` [PATCH v2 3/7] string-list: unify string_list_split* functions Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-08-01  3:18     ` shejialuo
  2025-08-01  8:47     ` Patrick Steinhardt
  2025-07-31 22:46   ` [PATCH v2 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
                     ` (3 subsequent siblings)
  7 siblings, 2 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
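
What "trimmed" means for a single piece can be sketched in isolation.
The helper below is hypothetical (trim_range() is not a name used by
this series) and works on an already delimited half-open range, whereas
the patch does the ltrim before it looks for the next delimiter.  It
also uses the standard isspace(), which needs the "unsigned char" cast;
git, as far as I recall, carries its own locale-independent
replacement.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Narrow the half-open range [*p, *end) so that it neither starts
 * nor ends with whitespace.  Returns the length that remains.
 */
static size_t trim_range(const char **p, const char **end)
{
	assert(p && end && *p && *end && *p <= *end);

	/* ltrim: advance the start past leading whitespace */
	while (*p < *end && isspace((unsigned char)**p))
		(*p)++;
	/* rtrim: pull the end back over trailing whitespace */
	while (*p < *end && isspace((unsigned char)(*end)[-1]))
		(*end)--;
	return (size_t)(*end - *p);
}
```

A range that consists only of whitespace collapses to an empty one,
which is why a trimmed piece can still end up in the list as "".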
 string-list.c                | 35 +++++++++++++++++---
 string-list.h                | 10 ++++++
 t/unit-tests/u-string-list.c | 64 ++++++++++++++++++++++++++++++++++++
 3 files changed, 104 insertions(+), 5 deletions(-)

diff --git a/string-list.c b/string-list.c
index 65b6ceb259..86a309f8fb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -282,11 +282,18 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  */
 static int append_one(struct string_list *list,
 		      const char *p, const char *end,
-		      int in_place)
+		      int in_place, unsigned flags)
 {
 	if (!end)
 		end = p + strlen(p);
 
+	if ((flags & STRING_LIST_SPLIT_TRIM)) {
+		/* rtrim */
+		for (; p < end; end--)
+			if (!isspace(end[-1]))
+				break;
+	}
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
@@ -307,7 +314,7 @@ static int append_one(struct string_list *list,
  * returns "char *" pointer into that const string.  Yucky but works ;-).
  */
 static int split_string(struct string_list *list, const char *string, const char *delim,
-			int maxsplit, int in_place)
+			int maxsplit, int in_place, unsigned flags)
 {
 	int count = 0;
 	const char *p = string;
@@ -320,12 +327,18 @@ static int split_string(struct string_list *list, const char *string, const char
 	for (;;) {
 		char *end;
 
+		if (flags & STRING_LIST_SPLIT_TRIM) {
+			/* ltrim */
+			while (*p && isspace(*p))
+				p++;
+		}
+
 		if (0 <= maxsplit && maxsplit <= count)
 			end = NULL;
 		else
 			end = strpbrk(p, delim);
 
-		count += append_one(list, p, end, in_place);
+		count += append_one(list, p, end, in_place, flags);
 
 		if (!end)
 			return count;
@@ -336,11 +349,23 @@ static int split_string(struct string_list *list, const char *string, const char
 int string_list_split(struct string_list *list, const char *string,
 		      const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 0);
+	return split_string(list, string, delim, maxsplit, 0, 0);
 }
 
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 1);
+	return split_string(list, string, delim, maxsplit, 1, 0);
+}
+
+int string_list_split_f(struct string_list *list, const char *string,
+			const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 0, flags);
+}
+
+int string_list_split_in_place_f(struct string_list *list, char *string,
+			       const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 1, flags);
 }
diff --git a/string-list.h b/string-list.h
index 6c8650efde..ee9922af67 100644
--- a/string-list.h
+++ b/string-list.h
@@ -281,4 +281,14 @@ int string_list_split(struct string_list *list, const char *string,
  */
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit);
+
+/* trim() resulting string piece before adding it to the list */
+#define STRING_LIST_SPLIT_TRIM 01
+
+int string_list_split_f(struct string_list *, const char *string,
+			const char *delim, int maxsplit, unsigned flags);
+
+int string_list_split_in_place_f(struct string_list *, char *string,
+				 const char *delim, int maxsplit, unsigned flags);
+
 #endif /* STRING_LIST_H */
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index 150a5f505f..daa9307e45 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -63,6 +63,70 @@ static void t_string_list_split(const char *data, const char *delim, int maxspli
 	string_list_clear(&list, 0);
 }
 
+static void t_string_list_split_f(const char *data, const char *delim,
+				  int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_DUP;
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_f(void)
+{
+	t_string_list_split_f("::foo:bar:baz:", ":", -1, 0,
+			      "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+			      "a", "b c", NULL);
+}
+
+static void t_string_list_split_in_place_f(const char *data_, const char *delim,
+					   int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *data = xstrdup(data_);
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_in_place_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	free(data);
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_in_place_f(void)
+{
+	t_string_list_split_in_place_f("::foo:bar:baz:", ":", -1, 0,
+				       "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_in_place_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+				       "a", "b c", NULL);
+}
+
 void test_string_list__split(void)
 {
 	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
-- 
2.50.1-618-g45d530d26b


* [PATCH v2 5/7] diff: simplify parsing of diff.colormovedws
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
                     ` (3 preceding siblings ...)
  2025-07-31 22:46   ` [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-08-01  8:47     ` Patrick Steinhardt
  2025-07-31 22:46   ` [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

The code to parse this configuration variable, whose value is a
comma-separated list of known tokens like "ignore-space-change" and
"ignore-all-space", uses string_list_split() to split the value into
pieces, and then places each piece of string in a strbuf to trim,
before comparing the result with the list of known tokens.

Thanks to the previous steps, now string_list_split() can trim the
resulting pieces before it places them in the string list.  Use it
to simplify the code.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
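
The overall shape of such a parser, split at commas, trim each piece,
then look the token up, can be sketched without any git helpers.
Everything here is an assumption made for illustration: the WS_* bit
values are invented (the real XDF_* and COLOR_MOVED_WS_* constants
differ), and the "no" and "allow-indentation-change" handling of the
real function is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical bit values, not the ones git uses. */
#define WS_CHANGE (1u << 0)
#define WS_AT_EOL (1u << 1)
#define WS_ALL    (1u << 2)
#define WS_ERROR  (1u << 3)

/* Map one trimmed token of length "len" to its bit. */
static unsigned token_bit(const char *tok, size_t len)
{
	static const struct {
		const char *name;
		unsigned bit;
	} known[] = {
		{ "ignore-space-change", WS_CHANGE },
		{ "ignore-space-at-eol", WS_AT_EOL },
		{ "ignore-all-space", WS_ALL },
	};
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++)
		if (strlen(known[i].name) == len &&
		    !memcmp(known[i].name, tok, len))
			return known[i].bit;
	return WS_ERROR;
}

/* Split "arg" at commas, trim each piece, and OR the bits together. */
static unsigned parse_ws_modes(const char *arg)
{
	unsigned ret = 0;

	assert(arg);
	for (;;) {
		size_t len = strcspn(arg, ",");
		const char *p = arg, *end = arg + len;

		while (p < end && isspace((unsigned char)*p))
			p++;
		while (p < end && isspace((unsigned char)end[-1]))
			end--;
		ret |= token_bit(p, (size_t)(end - p));

		if (!arg[len])
			return ret;
		arg += len + 1;
	}
}
```

With the trimming folded into the splitting step, the comparison loop
only ever sees clean tokens, which is the simplification this patch is
after.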
 diff.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index a81949a422..70666ad2cd 100644
--- a/diff.c
+++ b/diff.c
@@ -327,29 +327,23 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ",", -1);
+	string_list_split_f(&l, arg, ",", -1, STRING_LIST_SPLIT_TRIM);
 
 	for_each_string_list_item(i, &l) {
-		struct strbuf sb = STRBUF_INIT;
-		strbuf_addstr(&sb, i->string);
-		strbuf_trim(&sb);
-
-		if (!strcmp(sb.buf, "no"))
+		if (!strcmp(i->string, "no"))
 			ret = 0;
-		else if (!strcmp(sb.buf, "ignore-space-change"))
+		else if (!strcmp(i->string, "ignore-space-change"))
 			ret |= XDF_IGNORE_WHITESPACE_CHANGE;
-		else if (!strcmp(sb.buf, "ignore-space-at-eol"))
+		else if (!strcmp(i->string, "ignore-space-at-eol"))
 			ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
-		else if (!strcmp(sb.buf, "ignore-all-space"))
+		else if (!strcmp(i->string, "ignore-all-space"))
 			ret |= XDF_IGNORE_WHITESPACE;
-		else if (!strcmp(sb.buf, "allow-indentation-change"))
+		else if (!strcmp(i->string, "allow-indentation-change"))
 			ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
 		else {
 			ret |= COLOR_MOVED_WS_ERROR;
-			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), sb.buf);
+			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), i->string);
 		}
-
-		strbuf_release(&sb);
 	}
 
 	if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
-- 
2.50.1-618-g45d530d26b


* [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
                     ` (4 preceding siblings ...)
  2025-07-31 22:46   ` [PATCH v2 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-07-31 22:54     ` Eric Sunshine
                       ` (2 more replies)
  2025-07-31 22:46   ` [PATCH v2 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
  7 siblings, 3 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

Teach the unified split_string() machinery a new flag bit,
STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces omitted from
the resulting string list.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
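
How the two flag bits interact, in particular that the emptiness check
sees the piece after it has been trimmed, can be modelled as below.
This is a sketch, not the implementation in this series: SKETCH_TRIM,
SKETCH_NONEMPTY, "struct piece" and split_flags_sketch() are
hypothetical names, there is no maxsplit, and trimming happens after
the delimiter search, so it only matches the real behaviour when the
delimiter itself is not whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define SKETCH_TRIM     01	/* stand-ins for the STRING_LIST_SPLIT_* bits */
#define SKETCH_NONEMPTY 02

struct piece {
	const char *start;
	size_t len;
};

/*
 * Record up to "cap" pieces of "s", split at any byte in "delim",
 * and return how many were recorded.
 */
static int split_flags_sketch(struct piece *out, int cap, const char *s,
			      const char *delim, unsigned flags)
{
	int nr = 0;

	assert(out && s && delim);
	for (;;) {
		size_t len = strcspn(s, delim);
		const char *p = s, *end = s + len;

		if (flags & SKETCH_TRIM) {
			while (p < end && isspace((unsigned char)*p))
				p++;
			while (p < end && isspace((unsigned char)end[-1]))
				end--;
		}

		/* the emptiness check runs on the trimmed range */
		if (!((flags & SKETCH_NONEMPTY) && p == end) && nr < cap) {
			out[nr].start = p;
			out[nr].len = (size_t)(end - p);
			nr++;
		}

		if (!s[len])
			return nr;
		s += len + 1;
	}
}
```

Splitting "foo :: : baz" at ":" with both bits set records only "foo"
and "baz": the piece " " becomes empty once trimmed and is then
dropped together with the genuinely empty one.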
 string-list.c                |  3 +++
 string-list.h                |  3 +++
 t/unit-tests/u-string-list.c | 15 +++++++++++++++
 3 files changed, 21 insertions(+)

diff --git a/string-list.c b/string-list.c
index 86a309f8fb..343cf1ca90 100644
--- a/string-list.c
+++ b/string-list.c
@@ -294,6 +294,9 @@ static int append_one(struct string_list *list,
 				break;
 	}
 
+	if ((flags & STRING_LIST_SPLIT_NONEMPTY) && (end <= p))
+		return 0;
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
diff --git a/string-list.h b/string-list.h
index ee9922af67..0f73064fd1 100644
--- a/string-list.h
+++ b/string-list.h
@@ -285,6 +285,9 @@ int string_list_split_in_place(struct string_list *list, char *string,
 /* trim() resulting string piece before adding it to the list */
 #define STRING_LIST_SPLIT_TRIM 01
 
+/* omit adding empty string piece to the resulting list */
+#define STRING_LIST_SPLIT_NONEMPTY 02
+
 int string_list_split_f(struct string_list *, const char *string,
 			const char *delim, int maxsplit, unsigned flags);
 
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index daa9307e45..a2457d7b1e 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -92,6 +92,13 @@ void test_string_list__split_f(void)
 			      "foo", "bar", "baz", NULL);
 	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
 			      "a", "b c", NULL);
+	t_string_list_split_f("::foo::bar:baz:", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("foo:baz", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+			      "foo", "baz", NULL);
+	t_string_list_split_f("foo :: : baz", ":", -1,
+			      STRING_LIST_SPLIT_NONEMPTY | STRING_LIST_SPLIT_TRIM,
+			      "foo", "baz", NULL);
 }
 
 static void t_string_list_split_in_place_f(const char *data_, const char *delim,
@@ -125,6 +132,14 @@ void test_string_list__split_in_place_f(void)
 				       "foo", "bar", "baz", NULL);
 	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
 				       "a", "b c", NULL);
+	t_string_list_split_in_place_f("::foo::bar:baz:", ":", -1,
+				       STRING_LIST_SPLIT_NONEMPTY,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("foo:baz", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+				       "foo", "baz", NULL);
+	t_string_list_split_in_place_f("foo :: : baz", ":", -1,
+				       STRING_LIST_SPLIT_NONEMPTY | STRING_LIST_SPLIT_TRIM,
+				       "foo", "baz", NULL);
 }
 
 void test_string_list__split(void)
-- 
2.50.1-618-g45d530d26b


* [PATCH v2 7/7] string-list: split-then-remove-empty can be done while splitting
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
                     ` (5 preceding siblings ...)
  2025-07-31 22:46   ` [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
@ 2025-07-31 22:46   ` Junio C Hamano
  2025-08-01  8:47     ` Patrick Steinhardt
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
  7 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-07-31 22:46 UTC (permalink / raw)
  To: git

Thanks to the new STRING_LIST_SPLIT_NONEMPTY flag, a common pattern
to split a string into a string list and then remove empty items in
the resulting list is no longer needed.  Instead, just tell the
string_list_split*() to omit empty ones while splitting.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 notes.c                     | 4 ++--
 pathspec.c                  | 3 +--
 t/helper/test-hashmap.c     | 4 ++--
 t/helper/test-json-writer.c | 4 ++--
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/notes.c b/notes.c
index 6afcf088b9..3603c4a42b 100644
--- a/notes.c
+++ b/notes.c
@@ -970,8 +970,8 @@ void string_list_add_refs_from_colon_sep(struct string_list *list,
 	char *globs_copy = xstrdup(globs);
 	int i;
 
-	string_list_split_in_place(&split, globs_copy, ":", -1);
-	string_list_remove_empty_items(&split, 0);
+	string_list_split_in_place_f(&split, globs_copy, ":", -1,
+				     STRING_LIST_SPLIT_NONEMPTY);
 
 	for (i = 0; i < split.nr; i++)
 		string_list_add_refs_by_glob(list, split.items[i].string);
diff --git a/pathspec.c b/pathspec.c
index de325f7ef9..5993c4afa0 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,8 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, " ", -1);
-	string_list_remove_empty_items(&list, 0);
+	string_list_split_f(&list, value, " ", -1, STRING_LIST_SPLIT_NONEMPTY);
 
 	item->attr_check = attr_check_alloc();
 	CALLOC_ARRAY(item->attr_match, list.nr);
diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c
index 7782ae585e..e4dc02bd7a 100644
--- a/t/helper/test-hashmap.c
+++ b/t/helper/test-hashmap.c
@@ -149,8 +149,8 @@ int cmd__hashmap(int argc UNUSED, const char **argv UNUSED)
 
 		/* break line into command and up to two parameters */
 		string_list_setlen(&parts, 0);
-		string_list_split_in_place(&parts, line.buf, DELIM, 2);
-		string_list_remove_empty_items(&parts, 0);
+		string_list_split_in_place_f(&parts, line.buf, DELIM, 2,
+					     STRING_LIST_SPLIT_NONEMPTY);
 
 		/* ignore empty lines */
 		if (!parts.nr)
diff --git a/t/helper/test-json-writer.c b/t/helper/test-json-writer.c
index a288069b04..f8316a7d29 100644
--- a/t/helper/test-json-writer.c
+++ b/t/helper/test-json-writer.c
@@ -492,8 +492,8 @@ static int scripted(void)
 
 		/* break line into command and zero or more tokens */
 		string_list_setlen(&parts, 0);
-		string_list_split_in_place(&parts, line, " ", -1);
-		string_list_remove_empty_items(&parts, 0);
+		string_list_split_in_place_f(&parts, line, " ", -1,
+					     STRING_LIST_SPLIT_NONEMPTY);
 
 		/* ignore empty lines */
 		if (!parts.nr || !*parts.items[0].string)
-- 
2.50.1-618-g45d530d26b


* Re: [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-07-31 22:46   ` [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
@ 2025-07-31 22:54     ` Eric Sunshine
  2025-08-01  3:33     ` shejialuo
  2025-08-01  8:47     ` Patrick Steinhardt
  2 siblings, 0 replies; 72+ messages in thread
From: Eric Sunshine @ 2025-07-31 22:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 6:46 PM Junio C Hamano <gitster@pobox.com> wrote:
> Teach the unified split_string() machinery a new flag bit,
> STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces omitted from
> the resulting string list.

s/pieces/& to be/

> Signed-off-by: Junio C Hamano <gitster@pobox.com>

* Re: [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-07-31 22:46   ` [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-08-01  2:33     ` shejialuo
  2025-08-01  3:43       ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: shejialuo @ 2025-08-01  2:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:01PM -0700, Junio C Hamano wrote:
> diff --git a/setup.c b/setup.c
> index 6f52dab64c..b9f5eb8b51 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>  
>  	if (env_ceiling_dirs) {
>  		int empty_entry_found = 0;
> +		static const char path_sep[] = { PATH_SEP, '\0' };
>  

I am a little confused about why we need `static` here. Would this
function be called many times?

And I have a design question: by using "PATH_SEP", we need to convert
this character to a string. Should we create a new variable named
"PATH_SEP_STR" or whatever to do that?

> -		string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
> +		string_list_split(&ceiling_dirs, env_ceiling_dirs, path_sep, -1);
>  		filter_string_list(&ceiling_dirs, 0,
>  				   canonicalize_ceiling_entry, &empty_entry_found);
>  		ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);

Thanks,
Jialuo

* Re: [PATCH v2 3/7] string-list: unify string_list_split* functions
  2025-07-31 22:46   ` [PATCH v2 3/7] string-list: unify string_list_split* functions Junio C Hamano
@ 2025-08-01  3:00     ` shejialuo
  0 siblings, 0 replies; 72+ messages in thread
From: shejialuo @ 2025-08-01  3:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:02PM -0700, Junio C Hamano wrote:

[snip]

> +/*
> + * append a substring [p..end] to list; return number of things it
> + * appended to the list.
> + */

In the following function, we would always return 1. So, I guess in the
following commits, there would be a case where we won't append the
string. And indeed, in [PATCH v2 6/7], we simply skip the piece and
return 0.

And I have a design question: should we make "append_one" pure, so that
it simply appends the string that starts at `p` and ends at `end`? Let's
see in the later patches whether we could do this.

> +static int append_one(struct string_list *list,
> +		      const char *p, const char *end,
> +		      int in_place)
> +{
> +	if (!end)
> +		end = p + strlen(p);
> +
> +	if (in_place) {
> +		*((char *)end) = '\0';
> +		string_list_append(list, p);
> +	} else {
> +		string_list_append_nodup(list, xmemdupz(p, end - p));
> +	}
> +	return 1;

Thanks,
Jialuo

* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-07-31 22:46   ` [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
@ 2025-08-01  3:18     ` shejialuo
  2025-08-01  3:47       ` Junio C Hamano
  2025-08-01  8:47     ` Patrick Steinhardt
  1 sibling, 1 reply; 72+ messages in thread
From: shejialuo @ 2025-08-01  3:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:03PM -0700, Junio C Hamano wrote:
>  static int split_string(struct string_list *list, const char *string, const char *delim,
> -			int maxsplit, int in_place)
> +			int maxsplit, int in_place, unsigned flags)
>  {
>  	int count = 0;
>  	const char *p = string;
> @@ -320,12 +327,18 @@ static int split_string(struct string_list *list, const char *string, const char
>  	for (;;) {
>  		char *end;
>  
> +		if (flags & STRING_LIST_SPLIT_TRIM) {
> +			/* ltrim */
> +			while (*p && isspace(*p))
> +				p++;
> +		}
> +
>  		if (0 <= maxsplit && maxsplit <= count)
>  			end = NULL;
>  		else
>  			end = strpbrk(p, delim);
>  

In `append_one`, we check whether `end` is NULL. It feels strange to me
that we need to do that in `append_one`. Should we just set `end` to
`p + strlen(p)` when `end` is NULL? Then we could do the rtrim inside
this function instead of in `append_one`, which would avoid passing
"flags" to `append_one`.

> -		count += append_one(list, p, end, in_place);
> +		count += append_one(list, p, end, in_place, flags);
>  
>  		if (!end)
>  			return count;

Thanks,
Jialuo

* Re: [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-07-31 22:46   ` [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
  2025-07-31 22:54     ` Eric Sunshine
@ 2025-08-01  3:33     ` shejialuo
  2025-08-01  8:47     ` Patrick Steinhardt
  2 siblings, 0 replies; 72+ messages in thread
From: shejialuo @ 2025-08-01  3:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:05PM -0700, Junio C Hamano wrote:
> diff --git a/string-list.c b/string-list.c
> index 86a309f8fb..343cf1ca90 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -294,6 +294,9 @@ static int append_one(struct string_list *list,
>  				break;
>  	}
>  
> +	if ((flags & STRING_LIST_SPLIT_NONEMPTY) && (end <= p))
> +		return 0;
> +

I think we should do this directly in the `split_string` function.
Also, should we use `end == p`?

Thanks,
Jialuo

* Re: [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-01  2:33     ` shejialuo
@ 2025-08-01  3:43       ` Junio C Hamano
  2025-08-01  3:55         ` shejialuo
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01  3:43 UTC (permalink / raw)
  To: shejialuo; +Cc: git

shejialuo <shejialuo@gmail.com> writes:

> On Thu, Jul 31, 2025 at 03:46:01PM -0700, Junio C Hamano wrote:
>> diff --git a/setup.c b/setup.c
>> index 6f52dab64c..b9f5eb8b51 100644
>> --- a/setup.c
>> +++ b/setup.c
>> @@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>>  
>>  	if (env_ceiling_dirs) {
>>  		int empty_entry_found = 0;
>> +		static const char path_sep[] = { PATH_SEP, '\0' };
>>  
>
> I am a little confused why we need to use `static`? Would this function
> be called many times?

I actually am confused why you would want anything other than static
here.  Writing it this way allows the compiler to realize that the
array can be prepared at compile time, without needing to do anything
at runtime.  If you made it non-static, the runtime code would
allocate two bytes' worth of memory on the stack, and stuff these two
byte values there, each time this block is entered, which would be
at least once.
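
To illustrate the point outside of setup.c, here is a minimal sketch.
PATH_SEP_SKETCH and count_separators() are made up for this example,
and ':' is only an assumed value for the separator; an optimizing
compiler may well turn the non-static variant into the same code, so
take this as a statement about what the source asks for, not about
what every compiler emits.

```c
#include <assert.h>
#include <string.h>

#define PATH_SEP_SKETCH ':'	/* stand-in for PATH_SEP; assumed value */

/* Count how many separator bytes appear in "dirs". */
static size_t count_separators(const char *dirs)
{
	/*
	 * Static storage duration: the two bytes are initialized
	 * before the program starts, so entering this function does
	 * not have to build the array again.
	 */
	static const char path_sep[] = { PATH_SEP_SKETCH, '\0' };
	size_t n = 0;

	assert(dirs);
	for (dirs = strpbrk(dirs, path_sep);
	     dirs;
	     dirs = strpbrk(dirs + 1, path_sep))
		n++;
	return n;
}
```

The array is a NUL-terminated one-byte string, which is exactly what a
"const char *delim" parameter such as the one string_list_split() now
takes expects.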

> And I have a design question: by using "PATH_SEP", we need to convert
> this character to be string. Should we create a new variable named
> "PATH_SEP_STR" or whatever to do that?

Sorry, but I do not understand the question.  You want to see
something like

	#define PATH_SEP_STR "/"

you mean?  I do not offhand see why anybody would want to do so.

>> -		string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
>> +		string_list_split(&ceiling_dirs, env_ceiling_dirs, path_sep, -1);
>>  		filter_string_list(&ceiling_dirs, 0,
>>  				   canonicalize_ceiling_entry, &empty_entry_found);
>>  		ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);
>
> Thanks,
> Jialuo

* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01  3:18     ` shejialuo
@ 2025-08-01  3:47       ` Junio C Hamano
  2025-08-01  4:04         ` shejialuo
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01  3:47 UTC (permalink / raw)
  To: shejialuo; +Cc: git

shejialuo <shejialuo@gmail.com> writes:

> On Thu, Jul 31, 2025 at 03:46:03PM -0700, Junio C Hamano wrote:
>>  static int split_string(struct string_list *list, const char *string, const char *delim,
>> -			int maxsplit, int in_place)
>> +			int maxsplit, int in_place, unsigned flags)
>>  {
>>  	int count = 0;
>>  	const char *p = string;
>> @@ -320,12 +327,18 @@ static int split_string(struct string_list *list, const char *string, const char
>>  	for (;;) {
>>  		char *end;
>>  
>> +		if (flags & STRING_LIST_SPLIT_TRIM) {
>> +			/* ltrim */
>> +			while (*p && isspace(*p))
>> +				p++;
>> +		}
>> +
>>  		if (0 <= maxsplit && maxsplit <= count)
>>  			end = NULL;
>>  		else
>>  			end = strpbrk(p, delim);
>>  
>
> In `append_one`, we would tell whether `end` is NULL. I somehow feel
> strange why we need to do that in `append_one`. Should we just set `end`
> to be `p + strlen(p)` when `end` is NULL. And then we could do rtrim
> inside this function instead of `append_one` to avoid passing "flags" to
> `append_one`.

Sorry, but I do not see why such an alternative design is a better
idea.  The helper function's purpose is to stuff the substring at
[p..end), possibly after rtrimming, to the list.  You could compute
rtrim in the caller, but that would make the logic here more complex
(at least, you'd need to introduce yet another variable similar to
"end" that points at the real tail of the string, and you cannot
reuse "end" for it, because of the exit condition you see below).

>> -		count += append_one(list, p, end, in_place);
>> +		count += append_one(list, p, end, in_place, flags);
>>  
>>  		if (!end)
>>  			return count;
>
> Thanks,
> Jialuo

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-01  3:43       ` Junio C Hamano
@ 2025-08-01  3:55         ` shejialuo
  2025-08-01 23:10           ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: shejialuo @ 2025-08-01  3:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 08:43:24PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > On Thu, Jul 31, 2025 at 03:46:01PM -0700, Junio C Hamano wrote:
> >> diff --git a/setup.c b/setup.c
> >> index 6f52dab64c..b9f5eb8b51 100644
> >> --- a/setup.c
> >> +++ b/setup.c
> >> @@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
> >>  
> >>  	if (env_ceiling_dirs) {
> >>  		int empty_entry_found = 0;
> >> +		static const char path_sep[] = { PATH_SEP, '\0' };
> >>  
> >
> > I am a little confused why we need to use `static`? Would this function
> > be called many times?
> 
> I actually am confused why you would want anything other than static
> here.  Writing this way would allow the compiler to realize that the
> array can be prepared at compile time, without need to do anything
> at runtime.  If you made it non static, the runtime code would
> allocate two bytes worth of memory on stack, and stuff these two
> byte values there, each time this block is entered, which would be
> at least once.
> 

Sorry for the confusion. Some other changes in this series do not use
`static`, which is the main reason I asked this question.

--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -348,6 +348,7 @@ int cmd__path_utils(int argc, const char **argv)
 	if (argc == 4 && !strcmp(argv[1], "longest_ancestor_length")) {
 		int len;
 		struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
+		const char path_sep[] = { PATH_SEP, '\0' };
 		char *path = xstrdup(argv[2]);

> > And I have a design question: by using "PATH_SEP", we need to convert
> > this character to be string. Should we create a new variable named
> > "PATH_SEP_STR" or whatever to do that?
> 
> Sorry, but I do not understand the question.  You want to see
> something like
> 
> 	#define PATH_SEP_STR "/"
> 
> you mean?  I do not offhand see why anybody would want to do so.
> 

Yes, that's my question. I see that we would define a `path_sep` array
in many places, so I wonder whether we could use such a macro instead.
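
For illustration only, here is a minimal sketch of the `static const`
array pattern under discussion. It assumes PATH_SEP is a single
character constant (':' here); count_path_pieces() is a hypothetical
helper and not part of the patch.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for git's PATH_SEP; assumed to be a single character. */
#define PATH_SEP ':'

/*
 * Hypothetical helper: count the pieces that splitting "s" at
 * PATH_SEP would produce.
 */
static int count_path_pieces(const char *s)
{
	/*
	 * A static const array can be laid out at compile time, so
	 * entering this function does not need to fill two bytes on
	 * the stack each time.
	 */
	static const char path_sep[] = { PATH_SEP, '\0' };
	int count = 1;

	assert(s);
	while ((s = strpbrk(s, path_sep)) != NULL) {
		count++;
		s++;	/* step over the delimiter just found */
	}
	return count;
}
```

Note that the preprocessor cannot turn a character constant such as
':' into the string literal ":", so a PATH_SEP_STR macro would have to
be defined separately wherever PATH_SEP is defined.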

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01  3:47       ` Junio C Hamano
@ 2025-08-01  4:04         ` shejialuo
  2025-08-01 23:09           ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: shejialuo @ 2025-08-01  4:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 08:47:10PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > On Thu, Jul 31, 2025 at 03:46:03PM -0700, Junio C Hamano wrote:
> >>  static int split_string(struct string_list *list, const char *string, const char *delim,
> >> -			int maxsplit, int in_place)
> >> +			int maxsplit, int in_place, unsigned flags)
> >>  {
> >>  	int count = 0;
> >>  	const char *p = string;
> >> @@ -320,12 +327,18 @@ static int split_string(struct string_list *list, const char *string, const char
> >>  	for (;;) {
> >>  		char *end;
> >>  
> >> +		if (flags & STRING_LIST_SPLIT_TRIM) {
> >> +			/* ltrim */
> >> +			while (*p && isspace(*p))
> >> +				p++;
> >> +		}
> >> +
> >>  		if (0 <= maxsplit && maxsplit <= count)
> >>  			end = NULL;
> >>  		else
> >>  			end = strpbrk(p, delim);
> >>  
> >
> > In `append_one`, we would tell whether `end` is NULL. I somehow feel
> > strange why we need to do that in `append_one`. Should we just set `end`
> > to be `p + strlen(p)` when `end` is NULL. And then we could do rtrim
> > inside this function instead of `append_one` to avoid passing "flags" to
> > `append_one`.
> 
> Sorry, but I do not see why such an alternative design is a better
> idea.  The helper function's purpose is to stuff the substring at
> [p..end), possibly after rtrimming, to the list.  You could compute
> rtrim in the caller, but that would make the logic here more complex
> (at least, you'd need to introduce yet another variable similar to
> "end" that points at the real tail of the string, and you cannot
> reuse "end" for it, because of the exit condition you see below).
> 

I agree with you that we would need to introduce another variable.
However, what I dislike is that we do ltrim inside the current function
and rtrim inside `append_one`.

My thinking is that we should handle the [p, end) string in one place,
where we can decide either to drop the string or to change it. Right
now the logic is spread over two different places, which is my concern.
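
To make the two places concrete, here is a self-contained sketch of the
shape being discussed: ltrim in the splitting loop, rtrim in the helper
that appends [p, end). All sketch_* names and the fixed-size list are
hypothetical stand-ins, not the code in string-list.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical stand-in for STRING_LIST_SPLIT_TRIM */
#define SKETCH_SPLIT_TRIM (1u << 0)
#define SKETCH_MAX_PIECES 16

struct sketch_list {
	char *items[SKETCH_MAX_PIECES];	/* malloc'ed copies, owned by the caller */
	int nr;
};

/*
 * Append a copy of [p, end) to the list, rtrimming it first when
 * asked.  end == NULL means "up to the end of the string".
 */
static int sketch_append_one(struct sketch_list *list, const char *p,
			     const char *end, unsigned flags)
{
	size_t len;
	char *copy;

	if (!end)
		end = p + strlen(p);
	if (flags & SKETCH_SPLIT_TRIM) {
		/* rtrim: only the local copy of "end" moves */
		while (p < end && isspace((unsigned char)end[-1]))
			end--;
	}
	assert(list->nr < SKETCH_MAX_PIECES);
	len = (size_t)(end - p);
	copy = malloc(len + 1);
	assert(copy);
	memcpy(copy, p, len);
	copy[len] = '\0';
	list->items[list->nr++] = copy;
	return 1;
}

static int sketch_split(struct sketch_list *list, const char *string,
			const char *delim, int maxsplit, unsigned flags)
{
	int count = 0;
	const char *p = string;

	for (;;) {
		const char *end;

		if (flags & SKETCH_SPLIT_TRIM) {
			/* ltrim */
			while (*p && isspace((unsigned char)*p))
				p++;
		}

		if (0 <= maxsplit && maxsplit <= count)
			end = NULL;
		else
			end = strpbrk(p, delim);

		count += sketch_append_one(list, p, end, flags);

		/* "end" still points at the delimiter (or is NULL) here */
		if (!end)
			return count;
		p = end + 1;
	}
}
```

Because the helper rtrims its own copy of "end", the caller's "end"
keeps pointing at the delimiter and can still drive both the exit
condition and the advance of "p". Moving rtrim into the loop would need
a second pointer for the trimmed tail.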

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-07-31 22:46   ` [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
  2025-08-01  3:18     ` shejialuo
@ 2025-08-01  8:47     ` Patrick Steinhardt
  2025-08-01 16:26       ` Junio C Hamano
  1 sibling, 1 reply; 72+ messages in thread
From: Patrick Steinhardt @ 2025-08-01  8:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:03PM -0700, Junio C Hamano wrote:
> diff --git a/string-list.c b/string-list.c
> index 65b6ceb259..86a309f8fb 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -336,11 +349,23 @@ static int split_string(struct string_list *list, const char *string, const char
>  int string_list_split(struct string_list *list, const char *string,
>  		      const char *delim, int maxsplit)
>  {
> -	return split_string(list, string, delim, maxsplit, 0);
> +	return split_string(list, string, delim, maxsplit, 0, 0);
>  }
>  
>  int string_list_split_in_place(struct string_list *list, char *string,
>  			       const char *delim, int maxsplit)
>  {
> -	return split_string(list, string, delim, maxsplit, 1);
> +	return split_string(list, string, delim, maxsplit, 1, 0);
> +}
> +
> +int string_list_split_f(struct string_list *list, const char *string,
> +			const char *delim, int maxsplit, unsigned flags)
> +{
> +	return split_string(list, string, delim, maxsplit, 0, flags);
> +}
> +
> +int string_list_split_in_place_f(struct string_list *list, char *string,
> +			       const char *delim, int maxsplit, unsigned flags)
> +{
> +	return split_string(list, string, delim, maxsplit, 1, flags);
>  }

One issue I have with the `_f` suffix is that I immediately jumped
to "formatting string". I think in other places we use `_ext` as a
suffix.

> diff --git a/string-list.h b/string-list.h
> index 6c8650efde..ee9922af67 100644
> --- a/string-list.h
> +++ b/string-list.h
> @@ -281,4 +281,14 @@ int string_list_split(struct string_list *list, const char *string,
>   */
>  int string_list_split_in_place(struct string_list *list, char *string,
>  			       const char *delim, int maxsplit);
> +
> +/* trim() resulting string piece before adding it to the list */
> +#define STRING_LIST_SPLIT_TRIM 01

Another nit: I think nowadays we more often use enums to introduce such
flags, where the benefit is improved grouping. Also, I think having
`(1 << 0)` as value is slightly more readable.
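
A minimal sketch of what that could look like, with hypothetical
SKETCH_* names standing in for the real flag names:

```c
#include <assert.h>

/* flag bits grouped in one enum; names are hypothetical stand-ins */
enum sketch_split_flags {
	SKETCH_SPLIT_TRIM = (1 << 0),		/* same value as octal 01 */
	SKETCH_SPLIT_NONEMPTY = (1 << 1),	/* same value as octal 02 */
};

/* Test one flag bit in a combined "unsigned flags" word. */
static int sketch_has_flag(unsigned flags, unsigned bit)
{
	/* "bit" is expected to have exactly one bit set */
	assert(bit && !(bit & (bit - 1)));
	return !!(flags & bit);
}
```

The values are unchanged from the `#define` form; the enum only groups
the bits, and the shift makes the bit position explicit.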

Patrick

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 5/7] diff: simplify parsing of diff.colormovedws
  2025-07-31 22:46   ` [PATCH v2 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
@ 2025-08-01  8:47     ` Patrick Steinhardt
  0 siblings, 0 replies; 72+ messages in thread
From: Patrick Steinhardt @ 2025-08-01  8:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:04PM -0700, Junio C Hamano wrote:
> The code to parse this configuration variable, whose value is a
> comma separated known tokens like "ignore-space-change" and

Should this read "comma-separated list of known tokens"?

> "ignore-all-space", uses string_list_split() to split the value into
> pieces, and then places each piece of string in a strbuf to trim,
> before comparing the result with the list of known tokens.
> 
> Thanks to the previous steps, now string_list_split() can trim the
> resulting pieces before it places them in the string list.  Use it
> to simplify the code.

The change itself makes sense.

Patrick

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-07-31 22:46   ` [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
  2025-07-31 22:54     ` Eric Sunshine
  2025-08-01  3:33     ` shejialuo
@ 2025-08-01  8:47     ` Patrick Steinhardt
  2025-08-01 16:38       ` Junio C Hamano
  2 siblings, 1 reply; 72+ messages in thread
From: Patrick Steinhardt @ 2025-08-01  8:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:05PM -0700, Junio C Hamano wrote:
> Teach the unified split_string() machinery a new flag bit,
> STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces omitted from

s/omitted/to be &/

> the resulting string list.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  string-list.c                |  3 +++
>  string-list.h                |  3 +++
>  t/unit-tests/u-string-list.c | 15 +++++++++++++++
>  3 files changed, 21 insertions(+)
> 
> diff --git a/string-list.c b/string-list.c
> index 86a309f8fb..343cf1ca90 100644
> --- a/string-list.c
> +++ b/string-list.c
> @@ -294,6 +294,9 @@ static int append_one(struct string_list *list,
>  				break;
>  	}
>  
> +	if ((flags & STRING_LIST_SPLIT_NONEMPTY) && (end <= p))
> +		return 0;

Okay, this is where the return value of `append_one()` starts to make
sense.

The condition for `end <= p` is probably overly defensive, as it
shouldn't ever happen that `end < p`. We could make that a `BUG()`, but
I'm not sure that's really worth it.
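
For reference, a small self-contained sketch of how a 0-or-1 return
value feeds the piece count. The sketch_* names are hypothetical and
the helper only counts, it does not build a list.

```c
#include <assert.h>
#include <string.h>

/* hypothetical stand-in for STRING_LIST_SPLIT_NONEMPTY */
#define SKETCH_SPLIT_NONEMPTY (1u << 1)

/* Return how many items appending [p, end) would add: 0 or 1. */
static int sketch_piece_count(const char *p, const char *end, unsigned flags)
{
	assert(p && end);
	/* "end < p" is not expected; "<=" just stays on the safe side */
	if ((flags & SKETCH_SPLIT_NONEMPTY) && (end <= p))
		return 0;
	return 1;
}

/* Count the pieces that splitting "string" at any byte in "delim" yields. */
static int sketch_count_pieces(const char *string, const char *delim,
			       unsigned flags)
{
	int count = 0;
	const char *p = string;

	for (;;) {
		const char *end = strpbrk(p, delim);
		const char *tail = end ? end : p + strlen(p);

		count += sketch_piece_count(p, tail, flags);
		if (!end)
			return count;
		p = end + 1;
	}
}
```

With this shape, splitting "foo::bar:" at ":" counts four pieces
without the flag and two with it.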

Patrick

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 7/7] string-list: split-then-remove-empty can be done while splitting
  2025-07-31 22:46   ` [PATCH v2 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
@ 2025-08-01  8:47     ` Patrick Steinhardt
  0 siblings, 0 replies; 72+ messages in thread
From: Patrick Steinhardt @ 2025-08-01  8:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Jul 31, 2025 at 03:46:06PM -0700, Junio C Hamano wrote:
> Thanks to the new STRING_LIST_SPLIT_NONEMPTY flag, a common pattern
> to split a string into a string list and then remove empty items in
> the resulting list is no longer needed.  Instead, just tell the
> string_list_split*() to omit empty ones while splitting.

Neat.

Patrick

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01  8:47     ` Patrick Steinhardt
@ 2025-08-01 16:26       ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 16:26 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

> One issue I have with the `_f` suffix is that I immediately jumped
> to "formatting string". I think in other places we use `_ext` as a
> suffix.

It stands for "with flags".  I've seen _with_options and _extended
also used.  This is like oid_object_info_extended() that has a
variant oid_object_info() that is a simpler and less capable wrapper
for common use cases.  None of these overly long names are my
favourites X-<.

From a brief inspection, many _ext() in midx.c are more like helpers
that deal with a class of files with .$ext for various extensions;
they are not the primary interface to external callers, and many are
extern only because the code is spread across midx.c and midx-write.c
instead of being in a single compilation unit.

>> diff --git a/string-list.h b/string-list.h
>> index 6c8650efde..ee9922af67 100644
>> --- a/string-list.h
>> +++ b/string-list.h
>> @@ -281,4 +281,14 @@ int string_list_split(struct string_list *list, const char *string,
>>   */
>>  int string_list_split_in_place(struct string_list *list, char *string,
>>  			       const char *delim, int maxsplit);
>> +
>> +/* trim() resulting string piece before adding it to the list */
>> +#define STRING_LIST_SPLIT_TRIM 01
>
> Another nit: I think nowadays we more often use enums to introduce such
> flags, where the benefit is improved grouping. Also, I think having
> `(1 << 0)` as value is slightly more readable.

OK, let's update that.  Thanks.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-08-01  8:47     ` Patrick Steinhardt
@ 2025-08-01 16:38       ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 16:38 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

>> +	if ((flags & STRING_LIST_SPLIT_NONEMPTY) && (end <= p))
>> +		return 0;
>
> Okay, this is where the return value of `append_one()` starts to make
> sense.
>
> The condition for `end <= p` is probably overly defensive, as it
> shouldn't ever happen that `end < p`. We could make that a `BUG()`, but
> I'm not sure that's really worth it.

Correct.  I'd leave the check defensive but without an overly
pessimistic BUG(), as this is a leaf function that, once carefully
vetted, is unlikely to become buggy (which is famous last words).

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v3 0/7] string_list_split*() updates
  2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
                     ` (6 preceding siblings ...)
  2025-07-31 22:46   ` [PATCH v2 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
@ 2025-08-01 22:04   ` Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 1/7] string-list: report programming error with BUG Junio C Hamano
                       ` (8 more replies)
  7 siblings, 9 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

Two related string-list API functions, string_list_split() and
string_list_split_in_place(), more or less duplicate each other's
implementation.  They both take a single string, split the string
at the delimiter, and stuff the result into a string list.

However, there is one subtle and unnecessary difference.  The non
"in-place" variant only allows a single byte value as delimiter,
while the "in-place" variant can take multiple delimiters (e.g.,
"split at either a comma or a space").
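
As a rough illustration of that difference (hypothetical helper names,
not the actual string-list code), a single-byte delimiter maps to
strchr() while a set of delimiter bytes maps to strpbrk():

```c
#include <assert.h>
#include <string.h>

/* Length of the first piece when splitting at one delimiter byte. */
static size_t first_piece_len_single(const char *s, int delim)
{
	const char *end = strchr(s, delim);

	return end ? (size_t)(end - s) : strlen(s);
}

/* Length of the first piece when splitting at any byte in "delims". */
static size_t first_piece_len_any(const char *s, const char *delims)
{
	const char *end;

	assert(s && delims);
	end = strpbrk(s, delims);
	return end ? (size_t)(end - s) : strlen(s);
}
```

For the input "a b,c", splitting at ',' alone gives a first piece of
length 3 ("a b"), while splitting at ", " (comma or space) gives a
first piece of length 1 ("a").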

This series first updates the string_list_split() to allow multiple
delimiters like string_list_split_in_place() does, by unifying their
implementations into one.  This refactoring allows us to give new
features to these two functions more easily.

Then these functions learn to optionally

 - trim the split string pieces before placing them in the resulting
   string list.

 - omit empty string pieces from the resulting string list.

An existing caller of string_list_split() in diff.c trims the
elements in the resulting string list before it uses them, which is
simplified by taking advantage of this new feature.

A handful of code paths call string_list_split*(), immediately
followed by string_list_remove_empty_items().  They are simplified
by not placing empty items in the list in the first place.



Relative to the v2 iteration, the v3 iteration switches from CPP
macros to enum for flag bits, and corrects a handful of typos.

Junio C Hamano (7):
  string-list: report programming error with BUG
  string-list: align string_list_split() with its _in_place()
    counterpart
  string-list: unify string_list_split* functions
  string-list: optionally trim string pieces split by
    string_list_split*()
  diff: simplify parsing of diff.colormovedws
  string-list: optionally omit empty string pieces in
    string_list_split*()
  string-list: split-then-remove-empty can be done while splitting

 builtin/blame.c              |   2 +-
 builtin/merge.c              |   2 +-
 builtin/var.c                |   2 +-
 connect.c                    |   2 +-
 diff.c                       |  20 ++----
 fetch-pack.c                 |   2 +-
 notes.c                      |   6 +-
 parse-options.c              |   2 +-
 pathspec.c                   |   3 +-
 protocol.c                   |   2 +-
 ref-filter.c                 |   4 +-
 setup.c                      |   3 +-
 string-list.c                | 120 ++++++++++++++++++++++++-----------
 string-list.h                |  30 ++++++---
 t/helper/test-hashmap.c      |   4 +-
 t/helper/test-json-writer.c  |   4 +-
 t/helper/test-path-utils.c   |   3 +-
 t/helper/test-ref-store.c    |   2 +-
 t/unit-tests/u-string-list.c |  95 ++++++++++++++++++++++++---
 transport.c                  |   2 +-
 upload-pack.c                |   2 +-
 21 files changed, 222 insertions(+), 90 deletions(-)

Range-diff against v2:
1:  1c2b222eec = 1:  442ed679bb string-list: report programming error with BUG
2:  a7e07b94ef = 2:  cc80bac8c2 string-list: align string_list_split() with its _in_place() counterpart
3:  b7a7fbb975 = 3:  c7922b3e14 string-list: unify string_list_split* functions
4:  c566d88c28 ! 4:  9d7d22e8ef string-list: optionally trim string pieces split by string_list_split*()
    @@ string-list.h: int string_list_split(struct string_list *list, const char *strin
      int string_list_split_in_place(struct string_list *list, char *string,
      			       const char *delim, int maxsplit);
     +
    -+/* trim() resulting string piece before adding it to the list */
    -+#define STRING_LIST_SPLIT_TRIM 01
    ++/* flag bits for split_f and split_in_place_f functions */
    ++enum {
    ++	/* trim() resulting string piece before adding it to the list */
    ++	STRING_LIST_SPLIT_TRIM = (1 << 0),
    ++};
     +
     +int string_list_split_f(struct string_list *, const char *string,
     +			const char *delim, int maxsplit, unsigned flags);
     +
     +int string_list_split_in_place_f(struct string_list *, char *string,
     +				 const char *delim, int maxsplit, unsigned flags);
    -+
      #endif /* STRING_LIST_H */
     
      ## t/unit-tests/u-string-list.c ##
5:  eb272e0f22 ! 5:  ad8b425bc5 diff: simplify parsing of diff.colormovedws
    @@ Commit message
         diff: simplify parsing of diff.colormovedws
     
         The code to parse this configuration variable, whose value is a
    -    comma separated known tokens like "ignore-space-change" and
    +    comma-separated list of known tokens like "ignore-space-change" and
         "ignore-all-space", uses string_list_split() to split the value into
         pieces, and then places each piece of string in a strbuf to trim,
         before comparing the result with the list of known tokens.
6:  d418078a84 ! 6:  d03f443878 string-list: optionally omit empty string pieces in string_list_split*()
    @@ Commit message
         string-list: optionally omit empty string pieces in string_list_split*()
     
         Teach the unified split_string() machinery a new flag bit,
    -    STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces omitted from
    -    the resulting string list.
    +    STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces to be
    +    omitted from the resulting string list.
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
    @@ string-list.c: static int append_one(struct string_list *list,
     
      ## string-list.h ##
     @@ string-list.h: int string_list_split_in_place(struct string_list *list, char *string,
    - /* trim() resulting string piece before adding it to the list */
    - #define STRING_LIST_SPLIT_TRIM 01
    + enum {
    + 	/* trim() resulting string piece before adding it to the list */
    + 	STRING_LIST_SPLIT_TRIM = (1 << 0),
    ++	/* omit adding empty string piece to the resulting list */
    ++	STRING_LIST_SPLIT_NONEMPTY = (1 << 1),
    + };
      
    -+/* omit adding empty string piece to the resulting list */
    -+#define STRING_LIST_SPLIT_NONEMPTY 02
    -+
      int string_list_split_f(struct string_list *, const char *string,
    - 			const char *delim, int maxsplit, unsigned flags);
    - 
     
      ## t/unit-tests/u-string-list.c ##
     @@ t/unit-tests/u-string-list.c: void test_string_list__split_f(void)
7:  12c1189a08 = 7:  9eb8d87d62 string-list: split-then-remove-empty can be done while splitting
-- 
2.50.1-633-g85c5610de3


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v3 1/7] string-list: report programming error with BUG
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
                       ` (7 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

Passing a string list that has .strdup_strings bit unset to
string_list_split(), or one that has .strdup_strings bit set to
string_list_split_in_place(), is a programmer error.  Do not use
die() to abort the execution.  Use BUG() instead.

As a developer-facing message, the message string itself should
be a lot more concise, but let's keep the original one for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/string-list.c b/string-list.c
index 53faaa8420..0cb920e9b0 100644
--- a/string-list.c
+++ b/string-list.c
@@ -283,7 +283,7 @@ int string_list_split(struct string_list *list, const char *string,
 	const char *p = string, *end;
 
 	if (!list->strdup_strings)
-		die("internal error in string_list_split(): "
+		BUG("internal error in string_list_split(): "
 		    "list->strdup_strings must be set");
 	for (;;) {
 		count++;
@@ -309,7 +309,7 @@ int string_list_split_in_place(struct string_list *list, char *string,
 	char *p = string, *end;
 
 	if (list->strdup_strings)
-		die("internal error in string_list_split_in_place(): "
+		BUG("internal error in string_list_split_in_place(): "
 		    "list->strdup_strings must not be set");
 	for (;;) {
 		count++;
-- 
2.50.1-633-g85c5610de3


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 1/7] string-list: report programming error with BUG Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-02  8:22       ` Jeff King
  2025-08-01 22:04     ` [PATCH v3 3/7] string-list: unify string_list_split* functions Junio C Hamano
                       ` (6 subsequent siblings)
  8 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

For some unknown reason, unlike string_list_split_in_place(),
string_list_split() took only a single character as a field
delimiter.  Before giving both functions more features in future
commits, allow string_list_split() to take more than one delimiter
character to make them closer to each other.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/blame.c              |  2 +-
 builtin/merge.c              |  2 +-
 builtin/var.c                |  2 +-
 connect.c                    |  2 +-
 diff.c                       |  2 +-
 fetch-pack.c                 |  2 +-
 notes.c                      |  2 +-
 parse-options.c              |  2 +-
 pathspec.c                   |  2 +-
 protocol.c                   |  2 +-
 ref-filter.c                 |  4 ++--
 setup.c                      |  3 ++-
 string-list.c                |  4 ++--
 string-list.h                | 16 ++++++++--------
 t/helper/test-path-utils.c   |  3 ++-
 t/helper/test-ref-store.c    |  2 +-
 t/unit-tests/u-string-list.c | 16 ++++++++--------
 transport.c                  |  2 +-
 upload-pack.c                |  2 +-
 19 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 91586e6852..70a6460401 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -420,7 +420,7 @@ static void parse_color_fields(const char *s)
 	colorfield_nr = 0;
 
 	/* Ideally this would be stripped and split at the same time? */
-	string_list_split(&l, s, ',', -1);
+	string_list_split(&l, s, ",", -1);
 	ALLOC_GROW(colorfield, colorfield_nr + 1, colorfield_alloc);
 
 	for_each_string_list_item(item, &l) {
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..893f8950bf 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -875,7 +875,7 @@ static void add_strategies(const char *string, unsigned attr)
 	if (string) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		struct string_list_item *item;
-		string_list_split(&list, string, ' ', -1);
+		string_list_split(&list, string, " ", -1);
 		for_each_string_list_item(item, &list)
 			append_strategy(get_strategy(item->string));
 		string_list_clear(&list, 0);
diff --git a/builtin/var.c b/builtin/var.c
index ada642a9fe..4ae7af0eff 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -181,7 +181,7 @@ static void list_vars(void)
 			if (ptr->multivalued && *val) {
 				struct string_list list = STRING_LIST_INIT_DUP;
 
-				string_list_split(&list, val, '\n', -1);
+				string_list_split(&list, val, "\n", -1);
 				for (size_t i = 0; i < list.nr; i++)
 					printf("%s=%s\n", ptr->name, list.items[i].string);
 				string_list_clear(&list, 0);
diff --git a/connect.c b/connect.c
index e77287f426..867b12bde5 100644
--- a/connect.c
+++ b/connect.c
@@ -407,7 +407,7 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
 	 * name.  Subsequent fields (symref-target and peeled) are optional and
 	 * don't have a particular order.
 	 */
-	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+	if (string_list_split(&line_sections, line, " ", -1) < 2) {
 		ret = 0;
 		goto out;
 	}
diff --git a/diff.c b/diff.c
index dca87e164f..a81949a422 100644
--- a/diff.c
+++ b/diff.c
@@ -327,7 +327,7 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ',', -1);
+	string_list_split(&l, arg, ",", -1);
 
 	for_each_string_list_item(i, &l) {
 		struct strbuf sb = STRBUF_INIT;
diff --git a/fetch-pack.c b/fetch-pack.c
index c1be9b76eb..9866270696 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1914,7 +1914,7 @@ static void fetch_pack_config(void)
 		char *str;
 
 		if (!git_config_get_string("fetch.uriprotocols", &str) && str) {
-			string_list_split(&uri_protocols, str, ',', -1);
+			string_list_split(&uri_protocols, str, ",", -1);
 			free(str);
 		}
 	}
diff --git a/notes.c b/notes.c
index 97b995f3f2..6afcf088b9 100644
--- a/notes.c
+++ b/notes.c
@@ -892,7 +892,7 @@ static int string_list_add_note_lines(struct string_list *list,
 	 * later, along with any empty strings that came from empty
 	 * lines within the file.
 	 */
-	string_list_split(list, data, '\n', -1);
+	string_list_split(list, data, "\n", -1);
 	free(data);
 	return 0;
 }
diff --git a/parse-options.c b/parse-options.c
index 5224203ffe..9e7cb75192 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1338,7 +1338,7 @@ static enum parse_opt_result usage_with_options_internal(struct parse_opt_ctx_t
 		if (!saw_empty_line && !*str)
 			saw_empty_line = 1;
 
-		string_list_split(&list, str, '\n', -1);
+		string_list_split(&list, str, "\n", -1);
 		for (j = 0; j < list.nr; j++) {
 			const char *line = list.items[j].string;
 
diff --git a/pathspec.c b/pathspec.c
index a3ddd701c7..de325f7ef9 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,7 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, ' ', -1);
+	string_list_split(&list, value, " ", -1);
 	string_list_remove_empty_items(&list, 0);
 
 	item->attr_check = attr_check_alloc();
diff --git a/protocol.c b/protocol.c
index bae7226ff4..54b9f49c01 100644
--- a/protocol.c
+++ b/protocol.c
@@ -61,7 +61,7 @@ enum protocol_version determine_protocol_version_server(void)
 	if (git_protocol) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		const struct string_list_item *item;
-		string_list_split(&list, git_protocol, ':', -1);
+		string_list_split(&list, git_protocol, ":", -1);
 
 		for_each_string_list_item(item, &list) {
 			const char *value;
diff --git a/ref-filter.c b/ref-filter.c
index f9f2c512a8..4edfb9c83b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -435,7 +435,7 @@ static int remote_ref_atom_parser(struct ref_format *format UNUSED,
 	}
 
 	atom->u.remote_ref.nobracket = 0;
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
@@ -831,7 +831,7 @@ static int align_atom_parser(struct ref_format *format UNUSED,
 
 	align->position = ALIGN_LEFT;
 
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
 		int position;
diff --git a/setup.c b/setup.c
index 6f52dab64c..b9f5eb8b51 100644
--- a/setup.c
+++ b/setup.c
@@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 
 	if (env_ceiling_dirs) {
 		int empty_entry_found = 0;
+		static const char path_sep[] = { PATH_SEP, '\0' };
 
-		string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, env_ceiling_dirs, path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   canonicalize_ceiling_entry, &empty_entry_found);
 		ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);
diff --git a/string-list.c b/string-list.c
index 0cb920e9b0..2284a009cb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -277,7 +277,7 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 }
 
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit)
+		      const char *delim, int maxsplit)
 {
 	int count = 0;
 	const char *p = string, *end;
@@ -291,7 +291,7 @@ int string_list_split(struct string_list *list, const char *string,
 			string_list_append(list, p);
 			return count;
 		}
-		end = strchr(p, delim);
+		end = strpbrk(p, delim);
 		if (end) {
 			string_list_append_nodup(list, xmemdupz(p, end - p));
 			p = end + 1;
diff --git a/string-list.h b/string-list.h
index 122b318641..6c8650efde 100644
--- a/string-list.h
+++ b/string-list.h
@@ -254,7 +254,7 @@ struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
 
 /**
- * Split string into substrings on character `delim` and append the
+ * Split string into substrings on characters in `delim` and append the
  * substrings to `list`.  The input string is not modified.
  * list->strdup_strings must be set, as new memory needs to be
  * allocated to hold the substrings.  If maxsplit is non-negative,
@@ -262,15 +262,15 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  * appended to list.
  *
  * Examples:
- *   string_list_split(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 0) -> ["foo:bar:baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
- *   string_list_split(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
- *   string_list_split(l, "", ':', -1) -> [""]
- *   string_list_split(l, ":", ':', -1) -> ["", ""]
+ *   string_list_split(l, "foo:bar:baz", ":", -1) -> ["foo", "bar", "baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 0) -> ["foo:bar:baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 1) -> ["foo", "bar:baz"]
+ *   string_list_split(l, "foo:bar:", ":", -1) -> ["foo", "bar", ""]
+ *   string_list_split(l, "", ":", -1) -> [""]
+ *   string_list_split(l, ":", ":", -1) -> ["", ""]
  */
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit);
+		      const char *delim, int maxsplit);
 
 /*
  * Like string_list_split(), except that string is split in-place: the
diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
index 086238c826..f5f33751da 100644
--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -348,6 +348,7 @@ int cmd__path_utils(int argc, const char **argv)
 	if (argc == 4 && !strcmp(argv[1], "longest_ancestor_length")) {
 		int len;
 		struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
+		const char path_sep[] = { PATH_SEP, '\0' };
 		char *path = xstrdup(argv[2]);
 
 		/*
@@ -362,7 +363,7 @@ int cmd__path_utils(int argc, const char **argv)
 		 */
 		if (normalize_path_copy(path, path))
 			die("Path \"%s\" could not be normalized", argv[2]);
-		string_list_split(&ceiling_dirs, argv[3], PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, argv[3], path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   normalize_ceiling_entry, NULL);
 		len = longest_ancestor_length(path, &ceiling_dirs);
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 8d9a271845..aa1cb9b4ac 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -29,7 +29,7 @@ static unsigned int parse_flags(const char *str, struct flag_definition *defs)
 	if (!strcmp(str, "0"))
 		return 0;
 
-	string_list_split(&masks, str, ',', 64);
+	string_list_split(&masks, str, ",", 64);
 	for (size_t i = 0; i < masks.nr; i++) {
 		const char *name = masks.items[i].string;
 		struct flag_definition *def = defs;
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index d4ba5f9fa5..150a5f505f 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -43,7 +43,7 @@ static void t_string_list_equal(struct string_list *list,
 				  expected_strings->items[i].string);
 }
 
-static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
+static void t_string_list_split(const char *data, const char *delim, int maxsplit, ...)
 {
 	struct string_list expected_strings = STRING_LIST_INIT_DUP;
 	struct string_list list = STRING_LIST_INIT_DUP;
@@ -65,13 +65,13 @@ static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
 
 void test_string_list__split(void)
 {
-	t_string_list_split("foo:bar:baz", ':', -1, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 0, "foo:bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 1, "foo", "bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 2, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:", ':', -1, "foo", "bar", "", NULL);
-	t_string_list_split("", ':', -1, "", NULL);
-	t_string_list_split(":", ':', -1, "", "", NULL);
+	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 0, "foo:bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 1, "foo", "bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 2, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:", ":", -1, "foo", "bar", "", NULL);
+	t_string_list_split("", ":", -1, "", NULL);
+	t_string_list_split(":", ":", -1, "", "", NULL);
 }
 
 static void t_string_list_split_in_place(const char *data, const char *delim,
diff --git a/transport.c b/transport.c
index c123ac1e38..76487b5453 100644
--- a/transport.c
+++ b/transport.c
@@ -1042,7 +1042,7 @@ static const struct string_list *protocol_allow_list(void)
 	if (enabled < 0) {
 		const char *v = getenv("GIT_ALLOW_PROTOCOL");
 		if (v) {
-			string_list_split(&allowed, v, ':', -1);
+			string_list_split(&allowed, v, ":", -1);
 			string_list_sort(&allowed);
 			enabled = 1;
 		} else {
diff --git a/upload-pack.c b/upload-pack.c
index 4f26f6afc7..91fcdcad9b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1685,7 +1685,7 @@ static void process_args(struct packet_reader *request,
 			if (data->uri_protocols.nr)
 				send_err_and_die(data,
 						 "multiple packfile-uris lines forbidden");
-			string_list_split(&data->uri_protocols, p, ',', -1);
+			string_list_split(&data->uri_protocols, p, ",", -1);
 			continue;
 		}
 
-- 
2.50.1-633-g85c5610de3



* [PATCH v3 3/7] string-list: unify string_list_split* functions
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 1/7] string-list: report programming error with BUG Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
                       ` (5 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

Thanks to the previous step, the only difference between these two
related functions is that string_list_split() works on a string
without modifying its contents (i.e. taking "const char *") and the
resulting pieces of strings are their own copies in a string list,
while string_list_split_in_place() works on a mutable string and the
resulting pieces of strings come from the original string.

Consolidate their implementations into a single helper function, and
make them a thin wrapper around it.  We can later add an extra flags
parameter to extend both of these functions by updating only the
internal helper function.
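
As an illustration only, the "two thin wrappers around one helper" shape
can be sketched outside Git as below.  Everything named toy_* is
hypothetical and is not part of string-list.[ch]; the fixed-size array
stands in for a real string list, and copies made by the non-in-place
variant are left for the caller to free.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 16

/* Hypothetical stand-in for struct string_list; not Git's type. */
struct toy_list {
	const char *items[TOY_MAX];
	int nr;
};

/* Duplicate len bytes starting at p into a fresh NUL-terminated string. */
static char *toy_memdupz(const char *p, size_t len)
{
	char *copy = malloc(len + 1);

	if (!copy)
		abort();
	memcpy(copy, p, len);
	copy[len] = '\0';
	return copy;
}

/* Shared helper: the only place that knows how to walk the string. */
static int toy_split_1(struct toy_list *list, const char *string,
		       const char *delim, int maxsplit, int in_place)
{
	int count = 0;
	const char *p = string;

	while (list->nr < TOY_MAX) {
		/* written through only when in_place is true */
		char *end = NULL;

		if (maxsplit < 0 || count < maxsplit)
			end = (char *)strpbrk(p, delim);

		if (in_place) {
			if (end)
				*end = '\0';
			list->items[list->nr++] = p;
		} else {
			size_t len = end ? (size_t)(end - p) : strlen(p);
			list->items[list->nr++] = toy_memdupz(p, len);
		}
		count++;

		if (!end)
			break;
		p = end + 1;
	}
	return count;
}

/* Copying variant: the input stays untouched. */
static int toy_split(struct toy_list *list, const char *string,
		     const char *delim, int maxsplit)
{
	return toy_split_1(list, string, delim, maxsplit, 0);
}

/* In-place variant: pieces point into the (mutable) input. */
static int toy_split_in_place(struct toy_list *list, char *string,
			      const char *delim, int maxsplit)
{
	return toy_split_1(list, string, delim, maxsplit, 1);
}
```

The public signatures keep "const char *" and "char *" apart, so a
caller cannot hand a read-only string to the in-place variant by
accident; only the helper blurs the distinction.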

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 96 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 56 insertions(+), 40 deletions(-)

diff --git a/string-list.c b/string-list.c
index 2284a009cb..65b6ceb259 100644
--- a/string-list.c
+++ b/string-list.c
@@ -276,55 +276,71 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 	list->nr--;
 }
 
-int string_list_split(struct string_list *list, const char *string,
-		      const char *delim, int maxsplit)
+/*
+ * append a substring [p..end] to list; return number of things it
+ * appended to the list.
+ */
+static int append_one(struct string_list *list,
+		      const char *p, const char *end,
+		      int in_place)
+{
+	if (!end)
+		end = p + strlen(p);
+
+	if (in_place) {
+		*((char *)end) = '\0';
+		string_list_append(list, p);
+	} else {
+		string_list_append_nodup(list, xmemdupz(p, end - p));
+	}
+	return 1;
+}
+
+/*
+ * Unfortunately this cannot become a public interface, as _in_place()
+ * wants to have "char *string" while the other variant wants to
+ * have "const char *string" for type safety.
+ *
+ * This accepts "const char *string" to allow both wrappers to use it;
+ * it internally casts away the constness when in_place is true by
+ * taking advantage of strpbrk() that takes a "const char *" arg and
+ * returns "char *" pointer into that const string.  Yucky but works ;-).
+ */
+static int split_string(struct string_list *list, const char *string, const char *delim,
+			int maxsplit, int in_place)
 {
 	int count = 0;
-	const char *p = string, *end;
+	const char *p = string;
+
+	if (in_place && list->strdup_strings)
+		BUG("string_list_split_in_place() called with strdup_strings");
+	else if (!in_place && !list->strdup_strings)
+		BUG("string_list_split() called without strdup_strings");
 
-	if (!list->strdup_strings)
-		BUG("internal error in string_list_split(): "
-		    "list->strdup_strings must be set");
 	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			string_list_append_nodup(list, xmemdupz(p, end - p));
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
+		char *end;
+
+		if (0 <= maxsplit && maxsplit <= count)
+			end = NULL;
+		else
+			end = strpbrk(p, delim);
+
+		count += append_one(list, p, end, in_place);
+
+		if (!end)
 			return count;
-		}
+		p = end + 1;
 	}
 }
 
+int string_list_split(struct string_list *list, const char *string,
+		      const char *delim, int maxsplit)
+{
+	return split_string(list, string, delim, maxsplit, 0);
+}
+
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	int count = 0;
-	char *p = string, *end;
-
-	if (list->strdup_strings)
-		BUG("internal error in string_list_split_in_place(): "
-		    "list->strdup_strings must not be set");
-	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			*end = '\0';
-			string_list_append(list, p);
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
-			return count;
-		}
-	}
+	return split_string(list, string, delim, maxsplit, 1);
 }
-- 
2.50.1-633-g85c5610de3



* [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
                       ` (2 preceding siblings ...)
  2025-08-01 22:04     ` [PATCH v3 3/7] string-list: unify string_list_split* functions Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-02  8:26       ` Jeff King
  2025-08-01 22:04     ` [PATCH v3 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
                       ` (4 subsequent siblings)
  8 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.
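
A rough standalone sketch of the trimming order (skip leading
whitespace, then look for the delimiter, then drop trailing whitespace)
might look like the following.  toy_split_trim() and TOY_PIECE_MAX are
made-up names for illustration; the sketch copies pieces into
fixed-size buffers, truncating long ones, rather than using a string
list.

```c
#include <ctype.h>
#include <string.h>

#define TOY_PIECE_MAX 32

/*
 * Split "string" at any byte in "delim", trimming whitespace around
 * each piece, and copy at most max_out pieces into out[].  Returns the
 * number of pieces stored.
 */
static int toy_split_trim(char out[][TOY_PIECE_MAX], int max_out,
			  const char *string, const char *delim,
			  int maxsplit)
{
	int count = 0;
	const char *p = string;

	while (count < max_out) {
		const char *end, *stop;
		size_t len;

		/* ltrim before looking for the delimiter */
		while (*p && isspace((unsigned char)*p))
			p++;

		if (0 <= maxsplit && maxsplit <= count)
			end = NULL;
		else
			end = strpbrk(p, delim);

		/* rtrim the range [p, stop) */
		stop = end ? end : p + strlen(p);
		while (p < stop && isspace((unsigned char)stop[-1]))
			stop--;

		len = (size_t)(stop - p);
		if (len >= TOY_PIECE_MAX)
			len = TOY_PIECE_MAX - 1;
		memcpy(out[count], p, len);
		out[count][len] = '\0';
		count++;

		if (!end)
			break;
		p = end + 1;
	}
	return count;
}
```

With a space as the delimiter and maxsplit set to 1, the input
"  a  b c  " comes out as "a" and "b c", matching the expectation in
the unit test added by this patch.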

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c                | 35 +++++++++++++++++---
 string-list.h                | 12 +++++++
 t/unit-tests/u-string-list.c | 64 ++++++++++++++++++++++++++++++++++++
 3 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/string-list.c b/string-list.c
index 65b6ceb259..86a309f8fb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -282,11 +282,18 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  */
 static int append_one(struct string_list *list,
 		      const char *p, const char *end,
-		      int in_place)
+		      int in_place, unsigned flags)
 {
 	if (!end)
 		end = p + strlen(p);
 
+	if ((flags & STRING_LIST_SPLIT_TRIM)) {
+		/* rtrim */
+		for (; p < end; end--)
+			if (!isspace(end[-1]))
+				break;
+	}
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
@@ -307,7 +314,7 @@ static int append_one(struct string_list *list,
  * returns "char *" pointer into that const string.  Yucky but works ;-).
  */
 static int split_string(struct string_list *list, const char *string, const char *delim,
-			int maxsplit, int in_place)
+			int maxsplit, int in_place, unsigned flags)
 {
 	int count = 0;
 	const char *p = string;
@@ -320,12 +327,18 @@ static int split_string(struct string_list *list, const char *string, const char
 	for (;;) {
 		char *end;
 
+		if (flags & STRING_LIST_SPLIT_TRIM) {
+			/* ltrim */
+			while (*p && isspace(*p))
+				p++;
+		}
+
 		if (0 <= maxsplit && maxsplit <= count)
 			end = NULL;
 		else
 			end = strpbrk(p, delim);
 
-		count += append_one(list, p, end, in_place);
+		count += append_one(list, p, end, in_place, flags);
 
 		if (!end)
 			return count;
@@ -336,11 +349,23 @@ static int split_string(struct string_list *list, const char *string, const char
 int string_list_split(struct string_list *list, const char *string,
 		      const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 0);
+	return split_string(list, string, delim, maxsplit, 0, 0);
 }
 
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 1);
+	return split_string(list, string, delim, maxsplit, 1, 0);
+}
+
+int string_list_split_f(struct string_list *list, const char *string,
+			const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 0, flags);
+}
+
+int string_list_split_in_place_f(struct string_list *list, char *string,
+			       const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 1, flags);
 }
diff --git a/string-list.h b/string-list.h
index 6c8650efde..87ccc5f1e6 100644
--- a/string-list.h
+++ b/string-list.h
@@ -281,4 +281,16 @@ int string_list_split(struct string_list *list, const char *string,
  */
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit);
+
+/* flag bits for split_f and split_in_place_f functions */
+enum {
+	/* trim() resulting string piece before adding it to the list */
+	STRING_LIST_SPLIT_TRIM = (1 << 0),
+};
+
+int string_list_split_f(struct string_list *, const char *string,
+			const char *delim, int maxsplit, unsigned flags);
+
+int string_list_split_in_place_f(struct string_list *, char *string,
+				 const char *delim, int maxsplit, unsigned flags);
 #endif /* STRING_LIST_H */
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index 150a5f505f..daa9307e45 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -63,6 +63,70 @@ static void t_string_list_split(const char *data, const char *delim, int maxspli
 	string_list_clear(&list, 0);
 }
 
+static void t_string_list_split_f(const char *data, const char *delim,
+				  int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_DUP;
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_f(void)
+{
+	t_string_list_split_f("::foo:bar:baz:", ":", -1, 0,
+			      "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+			      "a", "b c", NULL);
+}
+
+static void t_string_list_split_in_place_f(const char *data_, const char *delim,
+					   int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *data = xstrdup(data_);
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_in_place_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	free(data);
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_in_place_f(void)
+{
+	t_string_list_split_in_place_f("::foo:bar:baz:", ":", -1, 0,
+				       "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_in_place_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+				       "a", "b c", NULL);
+}
+
 void test_string_list__split(void)
 {
 	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
-- 
2.50.1-633-g85c5610de3



* [PATCH v3 5/7] diff: simplify parsing of diff.colormovedws
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
                       ` (3 preceding siblings ...)
  2025-08-01 22:04     ` [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
                       ` (3 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

The code to parse this configuration variable, whose value is a
comma-separated list of known tokens like "ignore-space-change" and
"ignore-all-space", uses string_list_split() to split the value into
pieces, and then places each piece of string in a strbuf to trim,
before comparing the result with the list of known tokens.

Thanks to the previous steps, now string_list_split() can trim the
resulting pieces before it places them in the string list.  Use it
to simplify the code.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index a81949a422..70666ad2cd 100644
--- a/diff.c
+++ b/diff.c
@@ -327,29 +327,23 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ",", -1);
+	string_list_split_f(&l, arg, ",", -1, STRING_LIST_SPLIT_TRIM);
 
 	for_each_string_list_item(i, &l) {
-		struct strbuf sb = STRBUF_INIT;
-		strbuf_addstr(&sb, i->string);
-		strbuf_trim(&sb);
-
-		if (!strcmp(sb.buf, "no"))
+		if (!strcmp(i->string, "no"))
 			ret = 0;
-		else if (!strcmp(sb.buf, "ignore-space-change"))
+		else if (!strcmp(i->string, "ignore-space-change"))
 			ret |= XDF_IGNORE_WHITESPACE_CHANGE;
-		else if (!strcmp(sb.buf, "ignore-space-at-eol"))
+		else if (!strcmp(i->string, "ignore-space-at-eol"))
 			ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
-		else if (!strcmp(sb.buf, "ignore-all-space"))
+		else if (!strcmp(i->string, "ignore-all-space"))
 			ret |= XDF_IGNORE_WHITESPACE;
-		else if (!strcmp(sb.buf, "allow-indentation-change"))
+		else if (!strcmp(i->string, "allow-indentation-change"))
 			ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
 		else {
 			ret |= COLOR_MOVED_WS_ERROR;
-			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), sb.buf);
+			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), i->string);
 		}
-
-		strbuf_release(&sb);
 	}
 
 	if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
-- 
2.50.1-633-g85c5610de3



* [PATCH v3 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
                       ` (4 preceding siblings ...)
  2025-08-01 22:04     ` [PATCH v3 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-01 22:04     ` [PATCH v3 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
                       ` (2 subsequent siblings)
  8 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

Teach the unified split_string() machinery a new flag bit,
STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces to be
omitted from the resulting string list.
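
How the two flag bits interact can be sketched in isolation as below,
assuming an in-place split that stores pointers into a caller-supplied
array.  The TOY_* flags and toy_* functions are hypothetical and only
mimic the structure of the patch: the per-piece helper returns how many
items it appended (0 or 1), and the loop adds that to its count.

```c
#include <ctype.h>
#include <string.h>

enum {
	TOY_SPLIT_TRIM = (1 << 0),	/* trim each piece */
	TOY_SPLIT_NONEMPTY = (1 << 1),	/* drop empty pieces */
};

/* Store [p, end) in out[]; return the number of items stored. */
static int toy_append_one(char **out, int *nr, int max_out,
			  char *p, char *end, unsigned flags)
{
	if (!end)
		end = p + strlen(p);

	if (flags & TOY_SPLIT_TRIM)
		while (p < end && isspace((unsigned char)end[-1]))
			end--;

	if ((flags & TOY_SPLIT_NONEMPTY) && end <= p)
		return 0;
	if (*nr >= max_out)
		return 0;

	*end = '\0';
	out[(*nr)++] = p;
	return 1;
}

static int toy_split_in_place_f(char **out, int max_out, char *string,
				const char *delim, int maxsplit,
				unsigned flags)
{
	int count = 0, nr = 0;
	char *p = string;

	for (;;) {
		char *end;

		if (flags & TOY_SPLIT_TRIM)
			while (*p && isspace((unsigned char)*p))
				p++;

		if (0 <= maxsplit && maxsplit <= count)
			end = NULL;
		else
			end = strpbrk(p, delim);

		/* omitted pieces do not count toward maxsplit */
		count += toy_append_one(out, &nr, max_out, p, end, flags);

		if (!end)
			return count;
		p = end + 1;
	}
}
```

Because the trim happens before the emptiness check, a piece made only
of whitespace is dropped when both flags are given, as in the
"foo :: : baz" test case.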

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c                |  3 +++
 string-list.h                |  2 ++
 t/unit-tests/u-string-list.c | 15 +++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/string-list.c b/string-list.c
index 86a309f8fb..343cf1ca90 100644
--- a/string-list.c
+++ b/string-list.c
@@ -294,6 +294,9 @@ static int append_one(struct string_list *list,
 				break;
 	}
 
+	if ((flags & STRING_LIST_SPLIT_NONEMPTY) && (end <= p))
+		return 0;
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
diff --git a/string-list.h b/string-list.h
index 87ccc5f1e6..c25f28e9a3 100644
--- a/string-list.h
+++ b/string-list.h
@@ -286,6 +286,8 @@ int string_list_split_in_place(struct string_list *list, char *string,
 enum {
 	/* trim() resulting string piece before adding it to the list */
 	STRING_LIST_SPLIT_TRIM = (1 << 0),
+	/* omit adding empty string piece to the resulting list */
+	STRING_LIST_SPLIT_NONEMPTY = (1 << 1),
 };
 
 int string_list_split_f(struct string_list *, const char *string,
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index daa9307e45..a2457d7b1e 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -92,6 +92,13 @@ void test_string_list__split_f(void)
 			      "foo", "bar", "baz", NULL);
 	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
 			      "a", "b c", NULL);
+	t_string_list_split_f("::foo::bar:baz:", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("foo:baz", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+			      "foo", "baz", NULL);
+	t_string_list_split_f("foo :: : baz", ":", -1,
+			      STRING_LIST_SPLIT_NONEMPTY | STRING_LIST_SPLIT_TRIM,
+			      "foo", "baz", NULL);
 }
 
 static void t_string_list_split_in_place_f(const char *data_, const char *delim,
@@ -125,6 +132,14 @@ void test_string_list__split_in_place_f(void)
 				       "foo", "bar", "baz", NULL);
 	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
 				       "a", "b c", NULL);
+	t_string_list_split_in_place_f("::foo::bar:baz:", ":", -1,
+				       STRING_LIST_SPLIT_NONEMPTY,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("foo:baz", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+				       "foo", "baz", NULL);
+	t_string_list_split_in_place_f("foo :: : baz", ":", -1,
+				       STRING_LIST_SPLIT_NONEMPTY | STRING_LIST_SPLIT_TRIM,
+				       "foo", "baz", NULL);
 }
 
 void test_string_list__split(void)
-- 
2.50.1-633-g85c5610de3



* [PATCH v3 7/7] string-list: split-then-remove-empty can be done while splitting
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
                       ` (5 preceding siblings ...)
  2025-08-01 22:04     ` [PATCH v3 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
@ 2025-08-01 22:04     ` Junio C Hamano
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
  8 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 22:04 UTC (permalink / raw)
  To: git

Thanks to the new STRING_LIST_SPLIT_NONEMPTY flag, a common pattern
to split a string into a string list and then remove empty items in
the resulting list is no longer needed.  Instead, just tell the
string_list_split*() to omit empty ones while splitting.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 notes.c                     | 4 ++--
 pathspec.c                  | 3 +--
 t/helper/test-hashmap.c     | 4 ++--
 t/helper/test-json-writer.c | 4 ++--
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/notes.c b/notes.c
index 6afcf088b9..3603c4a42b 100644
--- a/notes.c
+++ b/notes.c
@@ -970,8 +970,8 @@ void string_list_add_refs_from_colon_sep(struct string_list *list,
 	char *globs_copy = xstrdup(globs);
 	int i;
 
-	string_list_split_in_place(&split, globs_copy, ":", -1);
-	string_list_remove_empty_items(&split, 0);
+	string_list_split_in_place_f(&split, globs_copy, ":", -1,
+				     STRING_LIST_SPLIT_NONEMPTY);
 
 	for (i = 0; i < split.nr; i++)
 		string_list_add_refs_by_glob(list, split.items[i].string);
diff --git a/pathspec.c b/pathspec.c
index de325f7ef9..5993c4afa0 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,8 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, " ", -1);
-	string_list_remove_empty_items(&list, 0);
+	string_list_split_f(&list, value, " ", -1, STRING_LIST_SPLIT_NONEMPTY);
 
 	item->attr_check = attr_check_alloc();
 	CALLOC_ARRAY(item->attr_match, list.nr);
diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c
index 7782ae585e..e4dc02bd7a 100644
--- a/t/helper/test-hashmap.c
+++ b/t/helper/test-hashmap.c
@@ -149,8 +149,8 @@ int cmd__hashmap(int argc UNUSED, const char **argv UNUSED)
 
 		/* break line into command and up to two parameters */
 		string_list_setlen(&parts, 0);
-		string_list_split_in_place(&parts, line.buf, DELIM, 2);
-		string_list_remove_empty_items(&parts, 0);
+		string_list_split_in_place_f(&parts, line.buf, DELIM, 2,
+					     STRING_LIST_SPLIT_NONEMPTY);
 
 		/* ignore empty lines */
 		if (!parts.nr)
diff --git a/t/helper/test-json-writer.c b/t/helper/test-json-writer.c
index a288069b04..f8316a7d29 100644
--- a/t/helper/test-json-writer.c
+++ b/t/helper/test-json-writer.c
@@ -492,8 +492,8 @@ static int scripted(void)
 
 		/* break line into command and zero or more tokens */
 		string_list_setlen(&parts, 0);
-		string_list_split_in_place(&parts, line, " ", -1);
-		string_list_remove_empty_items(&parts, 0);
+		string_list_split_in_place_f(&parts, line, " ", -1,
+					     STRING_LIST_SPLIT_NONEMPTY);
 
 		/* ignore empty lines */
 		if (!parts.nr || !*parts.items[0].string)
-- 
2.50.1-633-g85c5610de3



* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01  4:04         ` shejialuo
@ 2025-08-01 23:09           ` Junio C Hamano
  2025-08-02  1:51             ` shejialuo
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 23:09 UTC (permalink / raw)
  To: shejialuo; +Cc: git

shejialuo <shejialuo@gmail.com> writes:

> I agree with you that we would introduce another variable. However, the
> thing I quite dislike is that we do ltrim inside the current function
> and we do rtrim inside `append_one`.

Your "quite dislike" does not matter unless backed by a reason why
it is not good, and for that, you need to think a bit deeper.  Then
you will hopefully appreciate why the current arrangement is more
optimal ;-)

There is a clear separation of tasks between this caller-callee
pair.  The caller is responsible for finding where the current token
ends, and the callee is responsible for massaging the current token
into the resulting list.

But ltrim needs to be done in the caller for this to be efficient.

Imagine the case where you want to allow both non-empty and trim
behaviour at the same time, and use a whitespace character as
delimiter.  If your token has leading whitespaces, instead of
chopping them into zero-length ranges and feeding it to append_one()
one by one, only to have them discarded (due to non-empty being
set), ltrimming in the caller before it decides where the next token
(i.e. "end") starts is far more efficient.  Doing both trims in the
callee may be conceptually cleaner, but cleanliness is more subjective ;-)
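
To put a number on the efficiency argument, here is a small standalone
sketch that only counts how many candidate ranges the splitting loop
would hand to its callee, with and without the ltrim in the caller.
toy_count_candidates() is a made-up function for this illustration and
does not exist in Git.

```c
#include <ctype.h>
#include <string.h>

/*
 * Walk "string" the way the splitting loop does and return how many
 * [p, end) ranges would be passed down to the per-piece helper.
 */
static int toy_count_candidates(const char *string, const char *delim,
				int ltrim_in_caller)
{
	int calls = 0;
	const char *p = string;

	for (;;) {
		const char *end;

		if (ltrim_in_caller)
			while (*p && isspace((unsigned char)*p))
				p++;

		end = strpbrk(p, delim);
		calls++;	/* one call to the callee per range */

		if (!end)
			return calls;
		p = end + 1;
	}
}
```

For the input "   a b" split at a space, the loop makes 2 calls when it
ltrims first and 5 calls when it does not; the 3 extra calls are
zero-length ranges that a non-empty filter would throw away anyway.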

> My thinking is that we should handle the [p, end) string in the same
> place.

Again this sounds no more than a subjective "quite dislike".  Is
there a reason why anybody would want to insist these two things be
done at the same location?

You could satisfy the subjective "same place requirement" by
inlining the helper function into its sole caller and still keep the
current arrangement to ltrim before finding "end", of course.  But
at that point, I would have to say that it is a tail wagging the
dog.  You are making the code worse by destroying the caller-callee
division of responsibilities, only to satisfy a subjetive "quite
dislike" criteria.



* Re: [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-01  3:55         ` shejialuo
@ 2025-08-01 23:10           ` Junio C Hamano
  0 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-01 23:10 UTC (permalink / raw)
  To: shejialuo; +Cc: git

shejialuo <shejialuo@gmail.com> writes:

> Sorry to make you confused. Because there are some other changes where
> you don't use `static`. That's the main reason why I ask this question.

Hmph, are there new things that are not static but have good reasons
to be?  I am not aware of them offhand.

> --- a/t/helper/test-path-utils.c
> +++ b/t/helper/test-path-utils.c
> @@ -348,6 +348,7 @@ int cmd__path_utils(int argc, const char **argv)
>  	if (argc == 4 && !strcmp(argv[1], "longest_ancestor_length")) {
>  		int len;
>  		struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
> +		const char path_sep[] = { PATH_SEP, '\0' };
>  		char *path = xstrdup(argv[2]);
>
>> > And I have a design question: by using "PATH_SEP", we need to convert
>> > this character to be string. Should we create a new variable named
>> > "PATH_SEP_STR" or whatever to do that?
>> 
>> Sorry, but I do not understand the question.  You want to see
>> something like
>> 
>> 	#define PATH_SEP_STR "/"
>> 
>> you mean?  I do not offhand see why anybody would want to do so.
>> 
>
> Yes, that's my question. Because I see that we would define `path_sep`
> array in many place, so I wonder whether we could use such macro.

Why would we define path_sep[] in many places?  There is only one
here, and no existing code outside this helper needs one.

If it becomes necessary, I suspect that a global variable, i.e.

    === in some header file, perhaps git-compat-util.h ===
    extern const char PATH_SEP_STR[];

    === in a C source file, perhaps dir.c ===
    const char PATH_SEP_STR[] = { PATH_SEP, '\0' };

would be more efficient than having such a #define and its use sites
spread across the codebase.

Using PATH_SEP_STR defined as a literal string "/" and sprinkling
many copies of it in the source files everywhere gives compilers
opportunity to consolidate those that appear in each file into one
copy (as these are literals and constant, they can be shared), but
would not give opportunity to do so across compilation units so
easily.  With an explicit singleton global instance, that is
trivial.

If we end up using enough of them for it to matter, that is.  As
I said above, we have no evidence that would be the case, and that
is why I didn't make such a change in the first place.

Also, if many code paths need to split a path into pieces, it is
much more likely for us to give them a single well-designed helper
function for doing exactly that, than to have PATH_SEP_STR[] that
can be used by them from everywhere and have them use it to split
their paths individually.

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01 23:09           ` Junio C Hamano
@ 2025-08-02  1:51             ` shejialuo
  0 siblings, 0 replies; 72+ messages in thread
From: shejialuo @ 2025-08-02  1:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Aug 01, 2025 at 04:09:02PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > I agree with you that we would introduce another variable. However, the
> > thing I quite dislike is that we do ltrim inside the current function
> > and we do rtrim inside `append_one`.
> 
> Your "quite dislike" does not matter unless backed by a reason why
> it is not good, and for that, you need to think a bit deeper.  Then
> you will hopefully appreciate why the current arrangement is more
> optimal ;-)
> 
> There is a clear separation of tasks between this caller-callee
> pair.  The caller is responsible for finding where the current token
> ends, and the callee is responsible for massaging the current token
> into the resulting list.
> 
> But ltrim needs to be done in the caller for this to be efficient.
> 
> Imagine the case where you want to allow both non-empty and trim
> behaviour at the same time, and use a whitespace character as
> delimiter.  If your token has leading whitespaces, instead of
> chopping them into zero-length ranges and feeding it to append_one()
> one by one, only to have them discarded (due to non-empty being
> set), ltrimming in the caller before it decides where the next token
> (i.e. "end") starts is far more efficient.  It may be more
> conceptually cleaner, but cleanliness is more subjective ;-)
> 

Yes, I agree. Thanks for the wonderful explanation.
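
To make sure I understand the division of labour, here is a minimal
sketch of the arrangement described above.  It is not the code from
the patch: the sketch_* names, the fixed-size list and the "calls"
counter are all made up for illustration.  The caller does the ltrim
before it looks for "end", and the callee does the rtrim and the
non-empty check.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-ins for the flag bits discussed in this thread. */
#define SKETCH_TRIM     (1u << 0)
#define SKETCH_NONEMPTY (1u << 1)
#define SKETCH_MAX 16
#define SKETCH_LEN 64

struct sketch_list {
	char piece[SKETCH_MAX][SKETCH_LEN];
	int nr;
	int calls;	/* how many times the callee was invoked */
};

/* Callee: massage the token [p, end) and add it to the list. */
static void sketch_append_one(struct sketch_list *list, const char *p,
			      const char *end, unsigned flags)
{
	size_t len;

	list->calls++;
	if (flags & SKETCH_TRIM)
		while (p < end && isspace((unsigned char)end[-1]))
			end--;	/* rtrim */
	if ((flags & SKETCH_NONEMPTY) && p == end)
		return;		/* drop the empty piece */
	len = (size_t)(end - p);
	assert(list->nr < SKETCH_MAX && len < SKETCH_LEN);
	memcpy(list->piece[list->nr], p, len);
	list->piece[list->nr][len] = '\0';
	list->nr++;
}

/* Caller: find where each token ends; ltrim before looking for "end". */
static int sketch_split(struct sketch_list *list, const char *string,
			const char *delim, unsigned flags)
{
	const char *p = string;

	for (;;) {
		const char *end;

		if (flags & SKETCH_TRIM)
			while (*p && isspace((unsigned char)*p))
				p++;	/* ltrim */
		end = strpbrk(p, delim);
		if (!end) {
			sketch_append_one(list, p, p + strlen(p), flags);
			return list->nr;
		}
		sketch_append_one(list, p, end, flags);
		p = end + 1;
	}
}
```

With a space as the delimiter and both bits set, the leading spaces
in "   foo bar" are skipped by the caller in one go, so the callee
runs only once per real token.  With only the non-empty bit set, the
callee is called for every zero-length range between the spaces and
has to discard each of them.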

> > My thinking is that we should handle the [p, end) string in the same
> > place.
> 
> Again this sounds no more than a subjective "quite dislike".  Is
> there a reason why anybody would want to insist these two things be
> done at the same location?
> 
> You could satisfy the subjective "same place requirement" by
> inlining the helper function into its sole caller and still keep the
> current arrangement to ltrim before finding "end", of course.  But
> at that point, I would have to say that it is a tail wagging the
> dog.  You are making the code worse by destroying the caller-callee
> division of responsibilities, only to satisfy a subjective "quite
> dislike" criterion.
> 

I get your point. In later reviews, I should avoid commenting based on
a purely subjective impression. Thanks a lot for your suggestion.

Thanks,

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-01 22:04     ` [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-08-02  8:22       ` Jeff King
  2025-08-02 16:34         ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Jeff King @ 2025-08-02  8:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Aug 01, 2025 at 03:04:18PM -0700, Junio C Hamano wrote:

> For some unknown reason, unlike string_list_split_in_place(),
> string_list_split() took only a single character as a field
> delimiter.  Before giving both functions more features in future
> commits, allow string_list_split() to take more than one delimiter
> characters to make them closer to each other.

You must know by now that writing "some unknown reason" in a commit
message is the best way to nerd-snipe me. ;)

It looks like 52acddf36c (string-list: multi-delimiter
`string_list_split_in_place()`, 2023-04-24) modified the in-place
variant, but left the original alone. It was needed for the in-place one
to replace strtok(). Probably the original _should_ have been updated
then too for consistency, but wasn't. The motivation isn't given there,
but I'd assume it was some combination of "didn't think of it",
laziness, and not wanting to update all of the callers.

I don't think there's a need to re-roll for this. Mostly I was curious
and thought I'd share my finding on the list.

-Peff

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-01 22:04     ` [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
@ 2025-08-02  8:26       ` Jeff King
  2025-08-02 16:38         ` Junio C Hamano
  0 siblings, 1 reply; 72+ messages in thread
From: Jeff King @ 2025-08-02  8:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Aug 01, 2025 at 03:04:20PM -0700, Junio C Hamano wrote:

> +/* flag bits for split_f and split_in_place_f functions */
> +enum {
> +	/* trim() resulting string piece before adding it to the list */
> +	STRING_LIST_SPLIT_TRIM = (1 << 0),
> +};

It might be worth defining here what "trim" means. I can think of two
obvious definitions:

  1. trim whitespace from each split piece

  2. trim excess delimiters from each split piece (which in turn depends
     on how we handle multiple delimiters; do we make empty pieces, or
     do we collapse them? I think the former, which would make this type
     of trimming impossible?).

It looks like the patch does (1).
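
To spell out what I mean by (1), here is a rough sketch.  The
trim_piece() helper is hypothetical and is not taken from the patch;
it only illustrates that whitespace is removed from both ends of a
piece while delimiter bytes are left alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: narrow [*p, *p + len) so that it has no
 * leading or trailing whitespace, and return the new length.
 * Only isspace() bytes are removed; delimiter bytes are left alone.
 */
static size_t trim_piece(const char **p, size_t len)
{
	const char *start, *end;

	assert(p && *p);
	start = *p;
	end = start + len;
	while (start < end && isspace((unsigned char)*start))
		start++;
	while (start < end && isspace((unsigned char)end[-1]))
		end--;
	*p = start;
	return (size_t)(end - start);
}
```

Under this definition a piece like ",,x,," comes back unchanged,
which is exactly where it differs from definition (2).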

-Peff

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-02  8:22       ` Jeff King
@ 2025-08-02 16:34         ` Junio C Hamano
  2025-08-02 18:38           ` Jeff King
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-02 16:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> You must know by now that writing "some unknown reason" in a commit
> message is the best way to nerd-snipe me. ;)
>
> It looks like 52acddf36c (string-list: multi-delimiter
> `string_list_split_in_place()`, 2023-04-24) modified the in-place
> variant, but left the original alone. It was needed for the in-place one
> to replace strtok(). Probably the original _should_ have been updated
> then too for consistency, but wasn't. The motivation isn't given there,
> but I'd assume it was some combination of "didn't think of it",
> laziness, and not wanting to update all of the callers.

IOW, it wasn't done for some reason that is still unknown ;-).

> I don't think there's a need to re-roll for this. Mostly I was curious
> and thought I'd share my finding on the list.

Thanks.  IIRC, the multi-delimiter capability was used only in one
unit test or test helper that wanted to split at any whitespace.  As
it did not look so important, I could have unified them in the other
direction to only support a single delimiter (i.e. reverting that
commit you found), but I think the end result of this series would
give us a more ergonomic API that we can apply in more places than
before, so I am happy I didn't.


^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-02  8:26       ` Jeff King
@ 2025-08-02 16:38         ` Junio C Hamano
  2025-08-02 18:39           ` Jeff King
  0 siblings, 1 reply; 72+ messages in thread
From: Junio C Hamano @ 2025-08-02 16:38 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> On Fri, Aug 01, 2025 at 03:04:20PM -0700, Junio C Hamano wrote:
>
>> +/* flag bits for split_f and split_in_place_f functions */
>> +enum {
>> +	/* trim() resulting string piece before adding it to the list */
>> +	STRING_LIST_SPLIT_TRIM = (1 << 0),
>> +};
>
> It might be worth defining here what "trim" means. I can think of two
> obvious definitions:
>
>   1. trim whitespace from each split piece
>
>   2. trim excess delimiters from each split piece (which in turn depends
>      on how we handle multiple delimiters; do we make empty pieces, or
>      do we collapse them? I think the former, which would make this type
>      of trimming impossible?).
>
> It looks like the patch does (1).

True.  "nm git | grep trim" tells us that we most of the time use
the word to mean removing whitespaces, but there are exceptions.

It certainly is a good idea to rewrite "trim()" in that comment to
"trim whitespaces around" or something like that.

Thanks.



^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-02 16:34         ` Junio C Hamano
@ 2025-08-02 18:38           ` Jeff King
  0 siblings, 0 replies; 72+ messages in thread
From: Jeff King @ 2025-08-02 18:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Aug 02, 2025 at 09:34:08AM -0700, Junio C Hamano wrote:

> > I don't think there's a need to re-roll for this. Mostly I was curious
> > and thought I'd share my finding on the list.
> 
> Thanks.  IIRC, the multi-delimiter capability was used only in one
> unit test or test helper that wanted to split at any whitespace.  As
> it did not look so important, I could have unified them in the other
> direction to only support a single delimiter (i.e. reverting that
> commit you found), but I think the end result of this series would
> give us more ergonomic API that we can apply at more places than
> before, so I am happy I didn't.

Agreed.

The commit I mentioned earlier said it was trying to get rid of
strtok(), so I assumed there were some real uses. But running "log
-Sstrtok", all of those strtok calls were in test helpers!

Well, at least it speaks well of us that we did not let strtok creep
into actual git code. ;)

-Peff

^ permalink raw reply	[flat|nested] 72+ messages in thread

* Re: [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-02 16:38         ` Junio C Hamano
@ 2025-08-02 18:39           ` Jeff King
  0 siblings, 0 replies; 72+ messages in thread
From: Jeff King @ 2025-08-02 18:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Aug 02, 2025 at 09:38:52AM -0700, Junio C Hamano wrote:

> > It might be worth defining here what "trim" means. I can think of two
> > obvious definitions:
> >
> >   1. trim whitespace from each split piece
> >
> >   2. trim excess delimiters from each split piece (which in turn depends
> >      on how we handle multiple delimiters; do we make empty pieces, or
> >      do we collapse them? I think the former, which would make this type
> >      of trimming impossible?).
> >
> > It looks like the patch does (1).
> 
> True.  "nm git | grep trim" tells us that we most of the time use
> the word to mean removing whitespaces, but there are exceptions.
> 
> It certainly is a good idea to rewrite "trim()" in that comment to
> "trim whitespaces around" or something like that.

Yep, ordinarily I'd assume it means whitespace. But since the function
takes a different delimiter, a hint of doubt crept into my mind. The
text you suggest would have made that go away.

-Peff

^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 0/7] string_list_split*() updates
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
                       ` (6 preceding siblings ...)
  2025-08-01 22:04     ` [PATCH v3 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
@ 2025-08-03  6:52     ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 1/7] string-list: report programming error with BUG Junio C Hamano
                         ` (7 more replies)
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
  8 siblings, 8 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

Two related string-list API functions, string_list_split() and
string_list_split_in_place(), more or less duplicate each other's
implementation.  They both take a single string, split the
string at the delimiter, and stuff the result into a string list.

However, there is one subtle and unnecessary difference.  The non
"in-place" variant only allows a single byte value as delimiter,
while the "in-place" variant can take multiple delimiters (e.g.,
"split at either a comma or a space").
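
As a rough illustration of the difference (a sketch only; the two
helpers below are made up for this message and are not in
string-list.c), finding the end of a piece with strpbrk() accepts a
set of delimiter bytes, where strchr() can only look for one:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of string-list.c: return the length
 * of the first piece of "s" when split at any byte listed in "delim".
 */
static size_t first_piece_len(const char *s, const char *delim)
{
	const char *end;

	assert(s && delim);
	end = strpbrk(s, delim);
	return end ? (size_t)(end - s) : strlen(s);
}

/* Count the pieces an unlimited split (maxsplit == -1) would yield. */
static int count_pieces(const char *s, const char *delim)
{
	int count = 1;

	assert(s && delim);
	for (s = strpbrk(s, delim); s; s = strpbrk(s + 1, delim))
		count++;
	return count;
}
```

With delim set to ", " the string "foo,bar baz" has three pieces;
with delim set to "," it has two, which is all that the single-byte
interface could express.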

This series first updates the string_list_split() to allow multiple
delimiters like string_list_split_in_place() does, by unifying their
implementations into one.  This refactoring allows us to give new
features to these two functions more easily.

Then these functions learn to optionally

 - trim the split string pieces before placing them in the resulting
   string list.

 - omit empty string pieces from the resulting string list.

An existing caller of string_list_split() in diff.c trims the
elements in the resulting string list before it uses them, which is
simplified by taking advantage of this new feature.

A handful of code paths call string_list_split*(), immediately
followed by string_list_remove_empty_items().  They are simplified
by not placing empty items in the list in the first place.


Relative to the v3 iteration, the v4 iteration explains the history
behind string_list_split_in_place() in a bit more detail, and
expands the in-code comment to clarify what the verb "trim" means in
the context of STRING_LIST_SPLIT_TRIM.

Junio C Hamano (7):
  string-list: report programming error with BUG
  string-list: align string_list_split() with its _in_place()
    counterpart
  string-list: unify string_list_split* functions
  string-list: optionally trim string pieces split by
    string_list_split*()
  diff: simplify parsing of diff.colormovedws
  string-list: optionally omit empty string pieces in
    string_list_split*()
  string-list: split-then-remove-empty can be done while splitting

 builtin/blame.c              |   2 +-
 builtin/merge.c              |   2 +-
 builtin/var.c                |   2 +-
 connect.c                    |   2 +-
 diff.c                       |  20 ++----
 fetch-pack.c                 |   2 +-
 notes.c                      |   6 +-
 parse-options.c              |   2 +-
 pathspec.c                   |   3 +-
 protocol.c                   |   2 +-
 ref-filter.c                 |   4 +-
 setup.c                      |   3 +-
 string-list.c                | 120 ++++++++++++++++++++++++-----------
 string-list.h                |  33 +++++++---
 t/helper/test-hashmap.c      |   4 +-
 t/helper/test-json-writer.c  |   4 +-
 t/helper/test-path-utils.c   |   3 +-
 t/helper/test-ref-store.c    |   2 +-
 t/unit-tests/u-string-list.c |  95 ++++++++++++++++++++++++---
 transport.c                  |   2 +-
 upload-pack.c                |   2 +-
 21 files changed, 225 insertions(+), 90 deletions(-)

Range-diff against v3:
1:  442ed679bb = 1:  4f9c8d8963 string-list: report programming error with BUG
2:  cc80bac8c2 ! 2:  9f6dfe43c8 string-list: align string_list_split() with its _in_place() counterpart
    @@ Metadata
      ## Commit message ##
         string-list: align string_list_split() with its _in_place() counterpart
     
    -    For some unknown reason, unlike string_list_split_in_place(),
    -    string_list_split() took only a single character as a field
    -    delimiter.  Before giving both functions more features in future
    -    commits, allow string_list_split() to take more than one delimiter
    -    characters to make them closer to each other.
    +    The string_list_split_in_place() function was updated by 52acddf3
    +    (string-list: multi-delimiter `string_list_split_in_place()`,
    +    2023-04-24) to take more than one delimiter character, hoping that
    +    we can later use it to replace our uses of strtok().  We however did
    +    not make a matching change to the string_list_split() function,
    +    which is very similar.
    +
    +    Before giving both functions more features in future commits, allow
    +    string_list_split() to also take more than one delimiter character
    +    to make them closer to each other.
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
3:  c7922b3e14 = 3:  527535fcdd string-list: unify string_list_split* functions
4:  9d7d22e8ef ! 4:  5764549741 string-list: optionally trim string pieces split by string_list_split*()
    @@ string-list.h: int string_list_split(struct string_list *list, const char *strin
      int string_list_split_in_place(struct string_list *list, char *string,
      			       const char *delim, int maxsplit);
     +
    -+/* flag bits for split_f and split_in_place_f functions */
    ++/* Flag bits for split_f and split_in_place_f functions */
     +enum {
    -+	/* trim() resulting string piece before adding it to the list */
    ++	/*
    ++	 * trim whitespaces around resulting string piece before adding
    ++	 * it to the list
    ++	 */
     +	STRING_LIST_SPLIT_TRIM = (1 << 0),
     +};
     +
5:  ad8b425bc5 = 5:  f3a303aef0 diff: simplify parsing of diff.colormovedws
6:  d03f443878 ! 6:  27531efa41 string-list: optionally omit empty string pieces in string_list_split*()
    @@ string-list.c: static int append_one(struct string_list *list,
      		string_list_append(list, p);
     
      ## string-list.h ##
    -@@ string-list.h: int string_list_split_in_place(struct string_list *list, char *string,
    - enum {
    - 	/* trim() resulting string piece before adding it to the list */
    +@@ string-list.h: enum {
    + 	 * it to the list
    + 	 */
      	STRING_LIST_SPLIT_TRIM = (1 << 0),
     +	/* omit adding empty string piece to the resulting list */
     +	STRING_LIST_SPLIT_NONEMPTY = (1 << 1),
7:  9eb8d87d62 = 7:  2ab2aac73d string-list: split-then-remove-empty can be done while splitting
-- 
2.50.1-633-g69dfdd50af


^ permalink raw reply	[flat|nested] 72+ messages in thread

* [PATCH v4 1/7] string-list: report programming error with BUG
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

Passing a string list that has .strdup_strings bit unset to
string_list_split(), or one that has .strdup_strings bit set to
string_list_split_in_place(), is a programmer error.  Do not use
die() to abort the execution.  Use BUG() instead.

As a developer-facing message, the message string itself should
be a lot more concise, but let's keep the original one for now.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/string-list.c b/string-list.c
index 53faaa8420..0cb920e9b0 100644
--- a/string-list.c
+++ b/string-list.c
@@ -283,7 +283,7 @@ int string_list_split(struct string_list *list, const char *string,
 	const char *p = string, *end;
 
 	if (!list->strdup_strings)
-		die("internal error in string_list_split(): "
+		BUG("internal error in string_list_split(): "
 		    "list->strdup_strings must be set");
 	for (;;) {
 		count++;
@@ -309,7 +309,7 @@ int string_list_split_in_place(struct string_list *list, char *string,
 	char *p = string, *end;
 
 	if (list->strdup_strings)
-		die("internal error in string_list_split_in_place(): "
+		BUG("internal error in string_list_split_in_place(): "
 		    "list->strdup_strings must not be set");
 	for (;;) {
 		count++;
-- 
2.50.1-633-g69dfdd50af


^ permalink raw reply related	[flat|nested] 72+ messages in thread

* [PATCH v4 2/7] string-list: align string_list_split() with its _in_place() counterpart
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 1/7] string-list: report programming error with BUG Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 3/7] string-list: unify string_list_split* functions Junio C Hamano
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

The string_list_split_in_place() function was updated by 52acddf3
(string-list: multi-delimiter `string_list_split_in_place()`,
2023-04-24) to take more than one delimiter character, hoping that
we can later use it to replace our uses of strtok().  We however did
not make a matching change to the string_list_split() function,
which is very similar.

Before giving both functions more features in future commits, allow
string_list_split() to also take more than one delimiter character
to make them closer to each other.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/blame.c              |  2 +-
 builtin/merge.c              |  2 +-
 builtin/var.c                |  2 +-
 connect.c                    |  2 +-
 diff.c                       |  2 +-
 fetch-pack.c                 |  2 +-
 notes.c                      |  2 +-
 parse-options.c              |  2 +-
 pathspec.c                   |  2 +-
 protocol.c                   |  2 +-
 ref-filter.c                 |  4 ++--
 setup.c                      |  3 ++-
 string-list.c                |  4 ++--
 string-list.h                | 16 ++++++++--------
 t/helper/test-path-utils.c   |  3 ++-
 t/helper/test-ref-store.c    |  2 +-
 t/unit-tests/u-string-list.c | 16 ++++++++--------
 transport.c                  |  2 +-
 upload-pack.c                |  2 +-
 19 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index 91586e6852..70a6460401 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -420,7 +420,7 @@ static void parse_color_fields(const char *s)
 	colorfield_nr = 0;
 
 	/* Ideally this would be stripped and split at the same time? */
-	string_list_split(&l, s, ',', -1);
+	string_list_split(&l, s, ",", -1);
 	ALLOC_GROW(colorfield, colorfield_nr + 1, colorfield_alloc);
 
 	for_each_string_list_item(item, &l) {
diff --git a/builtin/merge.c b/builtin/merge.c
index 18b22c0a26..893f8950bf 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -875,7 +875,7 @@ static void add_strategies(const char *string, unsigned attr)
 	if (string) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		struct string_list_item *item;
-		string_list_split(&list, string, ' ', -1);
+		string_list_split(&list, string, " ", -1);
 		for_each_string_list_item(item, &list)
 			append_strategy(get_strategy(item->string));
 		string_list_clear(&list, 0);
diff --git a/builtin/var.c b/builtin/var.c
index ada642a9fe..4ae7af0eff 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -181,7 +181,7 @@ static void list_vars(void)
 			if (ptr->multivalued && *val) {
 				struct string_list list = STRING_LIST_INIT_DUP;
 
-				string_list_split(&list, val, '\n', -1);
+				string_list_split(&list, val, "\n", -1);
 				for (size_t i = 0; i < list.nr; i++)
 					printf("%s=%s\n", ptr->name, list.items[i].string);
 				string_list_clear(&list, 0);
diff --git a/connect.c b/connect.c
index e77287f426..867b12bde5 100644
--- a/connect.c
+++ b/connect.c
@@ -407,7 +407,7 @@ static int process_ref_v2(struct packet_reader *reader, struct ref ***list,
 	 * name.  Subsequent fields (symref-target and peeled) are optional and
 	 * don't have a particular order.
 	 */
-	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+	if (string_list_split(&line_sections, line, " ", -1) < 2) {
 		ret = 0;
 		goto out;
 	}
diff --git a/diff.c b/diff.c
index dca87e164f..a81949a422 100644
--- a/diff.c
+++ b/diff.c
@@ -327,7 +327,7 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ',', -1);
+	string_list_split(&l, arg, ",", -1);
 
 	for_each_string_list_item(i, &l) {
 		struct strbuf sb = STRBUF_INIT;
diff --git a/fetch-pack.c b/fetch-pack.c
index c1be9b76eb..9866270696 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1914,7 +1914,7 @@ static void fetch_pack_config(void)
 		char *str;
 
 		if (!git_config_get_string("fetch.uriprotocols", &str) && str) {
-			string_list_split(&uri_protocols, str, ',', -1);
+			string_list_split(&uri_protocols, str, ",", -1);
 			free(str);
 		}
 	}
diff --git a/notes.c b/notes.c
index 97b995f3f2..6afcf088b9 100644
--- a/notes.c
+++ b/notes.c
@@ -892,7 +892,7 @@ static int string_list_add_note_lines(struct string_list *list,
 	 * later, along with any empty strings that came from empty
 	 * lines within the file.
 	 */
-	string_list_split(list, data, '\n', -1);
+	string_list_split(list, data, "\n", -1);
 	free(data);
 	return 0;
 }
diff --git a/parse-options.c b/parse-options.c
index 5224203ffe..9e7cb75192 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -1338,7 +1338,7 @@ static enum parse_opt_result usage_with_options_internal(struct parse_opt_ctx_t
 		if (!saw_empty_line && !*str)
 			saw_empty_line = 1;
 
-		string_list_split(&list, str, '\n', -1);
+		string_list_split(&list, str, "\n", -1);
 		for (j = 0; j < list.nr; j++) {
 			const char *line = list.items[j].string;
 
diff --git a/pathspec.c b/pathspec.c
index a3ddd701c7..de325f7ef9 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,7 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, ' ', -1);
+	string_list_split(&list, value, " ", -1);
 	string_list_remove_empty_items(&list, 0);
 
 	item->attr_check = attr_check_alloc();
diff --git a/protocol.c b/protocol.c
index bae7226ff4..54b9f49c01 100644
--- a/protocol.c
+++ b/protocol.c
@@ -61,7 +61,7 @@ enum protocol_version determine_protocol_version_server(void)
 	if (git_protocol) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		const struct string_list_item *item;
-		string_list_split(&list, git_protocol, ':', -1);
+		string_list_split(&list, git_protocol, ":", -1);
 
 		for_each_string_list_item(item, &list) {
 			const char *value;
diff --git a/ref-filter.c b/ref-filter.c
index f9f2c512a8..4edfb9c83b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -435,7 +435,7 @@ static int remote_ref_atom_parser(struct ref_format *format UNUSED,
 	}
 
 	atom->u.remote_ref.nobracket = 0;
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
@@ -831,7 +831,7 @@ static int align_atom_parser(struct ref_format *format UNUSED,
 
 	align->position = ALIGN_LEFT;
 
-	string_list_split(&params, arg, ',', -1);
+	string_list_split(&params, arg, ",", -1);
 	for (i = 0; i < params.nr; i++) {
 		const char *s = params.items[i].string;
 		int position;
diff --git a/setup.c b/setup.c
index 6f52dab64c..b9f5eb8b51 100644
--- a/setup.c
+++ b/setup.c
@@ -1460,8 +1460,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 
 	if (env_ceiling_dirs) {
 		int empty_entry_found = 0;
+		static const char path_sep[] = { PATH_SEP, '\0' };
 
-		string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, env_ceiling_dirs, path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   canonicalize_ceiling_entry, &empty_entry_found);
 		ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);
diff --git a/string-list.c b/string-list.c
index 0cb920e9b0..2284a009cb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -277,7 +277,7 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 }
 
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit)
+		      const char *delim, int maxsplit)
 {
 	int count = 0;
 	const char *p = string, *end;
@@ -291,7 +291,7 @@ int string_list_split(struct string_list *list, const char *string,
 			string_list_append(list, p);
 			return count;
 		}
-		end = strchr(p, delim);
+		end = strpbrk(p, delim);
 		if (end) {
 			string_list_append_nodup(list, xmemdupz(p, end - p));
 			p = end + 1;
diff --git a/string-list.h b/string-list.h
index 122b318641..6c8650efde 100644
--- a/string-list.h
+++ b/string-list.h
@@ -254,7 +254,7 @@ struct string_list_item *unsorted_string_list_lookup(struct string_list *list,
 void unsorted_string_list_delete_item(struct string_list *list, int i, int free_util);
 
 /**
- * Split string into substrings on character `delim` and append the
+ * Split string into substrings on characters in `delim` and append the
  * substrings to `list`.  The input string is not modified.
  * list->strdup_strings must be set, as new memory needs to be
  * allocated to hold the substrings.  If maxsplit is non-negative,
@@ -262,15 +262,15 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  * appended to list.
  *
  * Examples:
- *   string_list_split(l, "foo:bar:baz", ':', -1) -> ["foo", "bar", "baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 0) -> ["foo:bar:baz"]
- *   string_list_split(l, "foo:bar:baz", ':', 1) -> ["foo", "bar:baz"]
- *   string_list_split(l, "foo:bar:", ':', -1) -> ["foo", "bar", ""]
- *   string_list_split(l, "", ':', -1) -> [""]
- *   string_list_split(l, ":", ':', -1) -> ["", ""]
+ *   string_list_split(l, "foo:bar:baz", ":", -1) -> ["foo", "bar", "baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 0) -> ["foo:bar:baz"]
+ *   string_list_split(l, "foo:bar:baz", ":", 1) -> ["foo", "bar:baz"]
+ *   string_list_split(l, "foo:bar:", ":", -1) -> ["foo", "bar", ""]
+ *   string_list_split(l, "", ":", -1) -> [""]
+ *   string_list_split(l, ":", ":", -1) -> ["", ""]
  */
 int string_list_split(struct string_list *list, const char *string,
-		      int delim, int maxsplit);
+		      const char *delim, int maxsplit);
 
 /*
  * Like string_list_split(), except that string is split in-place: the
diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
index 086238c826..f5f33751da 100644
--- a/t/helper/test-path-utils.c
+++ b/t/helper/test-path-utils.c
@@ -348,6 +348,7 @@ int cmd__path_utils(int argc, const char **argv)
 	if (argc == 4 && !strcmp(argv[1], "longest_ancestor_length")) {
 		int len;
 		struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
+		const char path_sep[] = { PATH_SEP, '\0' };
 		char *path = xstrdup(argv[2]);
 
 		/*
@@ -362,7 +363,7 @@ int cmd__path_utils(int argc, const char **argv)
 		 */
 		if (normalize_path_copy(path, path))
 			die("Path \"%s\" could not be normalized", argv[2]);
-		string_list_split(&ceiling_dirs, argv[3], PATH_SEP, -1);
+		string_list_split(&ceiling_dirs, argv[3], path_sep, -1);
 		filter_string_list(&ceiling_dirs, 0,
 				   normalize_ceiling_entry, NULL);
 		len = longest_ancestor_length(path, &ceiling_dirs);
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 8d9a271845..aa1cb9b4ac 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -29,7 +29,7 @@ static unsigned int parse_flags(const char *str, struct flag_definition *defs)
 	if (!strcmp(str, "0"))
 		return 0;
 
-	string_list_split(&masks, str, ',', 64);
+	string_list_split(&masks, str, ",", 64);
 	for (size_t i = 0; i < masks.nr; i++) {
 		const char *name = masks.items[i].string;
 		struct flag_definition *def = defs;
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index d4ba5f9fa5..150a5f505f 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -43,7 +43,7 @@ static void t_string_list_equal(struct string_list *list,
 				  expected_strings->items[i].string);
 }
 
-static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
+static void t_string_list_split(const char *data, const char *delim, int maxsplit, ...)
 {
 	struct string_list expected_strings = STRING_LIST_INIT_DUP;
 	struct string_list list = STRING_LIST_INIT_DUP;
@@ -65,13 +65,13 @@ static void t_string_list_split(const char *data, int delim, int maxsplit, ...)
 
 void test_string_list__split(void)
 {
-	t_string_list_split("foo:bar:baz", ':', -1, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 0, "foo:bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 1, "foo", "bar:baz", NULL);
-	t_string_list_split("foo:bar:baz", ':', 2, "foo", "bar", "baz", NULL);
-	t_string_list_split("foo:bar:", ':', -1, "foo", "bar", "", NULL);
-	t_string_list_split("", ':', -1, "", NULL);
-	t_string_list_split(":", ':', -1, "", "", NULL);
+	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 0, "foo:bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 1, "foo", "bar:baz", NULL);
+	t_string_list_split("foo:bar:baz", ":", 2, "foo", "bar", "baz", NULL);
+	t_string_list_split("foo:bar:", ":", -1, "foo", "bar", "", NULL);
+	t_string_list_split("", ":", -1, "", NULL);
+	t_string_list_split(":", ":", -1, "", "", NULL);
 }
 
 static void t_string_list_split_in_place(const char *data, const char *delim,
diff --git a/transport.c b/transport.c
index c123ac1e38..76487b5453 100644
--- a/transport.c
+++ b/transport.c
@@ -1042,7 +1042,7 @@ static const struct string_list *protocol_allow_list(void)
 	if (enabled < 0) {
 		const char *v = getenv("GIT_ALLOW_PROTOCOL");
 		if (v) {
-			string_list_split(&allowed, v, ':', -1);
+			string_list_split(&allowed, v, ":", -1);
 			string_list_sort(&allowed);
 			enabled = 1;
 		} else {
diff --git a/upload-pack.c b/upload-pack.c
index 4f26f6afc7..91fcdcad9b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1685,7 +1685,7 @@ static void process_args(struct packet_reader *request,
 			if (data->uri_protocols.nr)
 				send_err_and_die(data,
 						 "multiple packfile-uris lines forbidden");
-			string_list_split(&data->uri_protocols, p, ',', -1);
+			string_list_split(&data->uri_protocols, p, ",", -1);
 			continue;
 		}
 
-- 
2.50.1-633-g69dfdd50af



* [PATCH v4 3/7] string-list: unify string_list_split* functions
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 1/7] string-list: report programming error with BUG Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

Thanks to the previous step, the only difference between these two
related functions is that string_list_split() works on a string
without modifying its contents (i.e. taking "const char *") and the
resulting pieces of strings are their own copies in a string list,
while string_list_split_in_place() works on a mutable string and the
resulting pieces of strings come from the original string.

Consolidate their implementations into a single helper function, and
make both of them thin wrappers around it.  We can later add an extra
flags parameter to extend both of these functions by updating only
the internal helper function.
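
As a rough, standalone illustration of the shape described above (not
the code in this patch), the sketch below funnels a copying splitter
and an in-place splitter through one worker.  The names piece_list,
split_worker(), split_copy() and split_in_place() are hypothetical
stand-ins invented for this example; the real API works on struct
string_list.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's string_list; not the real structure. */
struct piece_list {
	char **items;
	size_t nr;
};

static void piece_list_push(struct piece_list *l, char *s)
{
	l->items = realloc(l->items, (l->nr + 1) * sizeof(*l->items));
	if (!l->items)
		abort();
	l->items[l->nr++] = s;
}

/* Copy the bytes in [p, end) into a fresh NUL-terminated buffer. */
static char *dup_range(const char *p, const char *end)
{
	size_t len = (size_t)(end - p);
	char *copy = malloc(len + 1);

	if (!copy)
		abort();
	memcpy(copy, p, len);
	copy[len] = '\0';
	return copy;
}

/*
 * Shared worker.  With in_place set, delimiters in the caller's buffer
 * are overwritten with NUL and the pieces point into that buffer;
 * otherwise each piece is an independent heap copy.
 */
static int split_worker(struct piece_list *l, const char *string,
			const char *delim, int maxsplit, int in_place)
{
	int count = 0;
	const char *p = string;

	for (;;) {
		char *end = NULL;

		/* stop looking for delimiters once maxsplit splits are done */
		if (maxsplit < 0 || count < maxsplit)
			end = (char *)strpbrk(p, delim);

		if (in_place) {
			if (end)
				*end = '\0';
			piece_list_push(l, (char *)p);
		} else {
			piece_list_push(l, dup_range(p, end ? end : p + strlen(p)));
		}
		count++;

		if (!end)
			return count;
		p = end + 1;
	}
}

/* Thin wrappers: only the constness of "string" and the mode differ. */
int split_copy(struct piece_list *l, const char *string,
	       const char *delim, int maxsplit)
{
	return split_worker(l, string, delim, maxsplit, 0);
}

int split_in_place(struct piece_list *l, char *string,
		   const char *delim, int maxsplit)
{
	return split_worker(l, string, delim, maxsplit, 1);
}
```

With this layout, a later behavior change only has to touch
split_worker(); the two wrappers keep their type-safe signatures.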

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c | 96 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 56 insertions(+), 40 deletions(-)

diff --git a/string-list.c b/string-list.c
index 2284a009cb..65b6ceb259 100644
--- a/string-list.c
+++ b/string-list.c
@@ -276,55 +276,71 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
 	list->nr--;
 }
 
-int string_list_split(struct string_list *list, const char *string,
-		      const char *delim, int maxsplit)
+/*
+ * append a substring [p..end] to list; return number of things it
+ * appended to the list.
+ */
+static int append_one(struct string_list *list,
+		      const char *p, const char *end,
+		      int in_place)
+{
+	if (!end)
+		end = p + strlen(p);
+
+	if (in_place) {
+		*((char *)end) = '\0';
+		string_list_append(list, p);
+	} else {
+		string_list_append_nodup(list, xmemdupz(p, end - p));
+	}
+	return 1;
+}
+
+/*
+ * Unfortunately this cannot become a public interface, as _in_place()
+ * wants to have "char *string" while the other variant wants to
+ * have "const char *string" for type safety.
+ *
+ * This accepts "const char *string" to allow both wrappers to use it;
+ * it internally casts away the constness when in_place is true by
+ * taking advantage of strpbrk() that takes a "const char *" arg and
+ * returns "char *" pointer into that const string.  Yucky but works ;-).
+ */
+static int split_string(struct string_list *list, const char *string, const char *delim,
+			int maxsplit, int in_place)
 {
 	int count = 0;
-	const char *p = string, *end;
+	const char *p = string;
+
+	if (in_place && list->strdup_strings)
+		BUG("string_list_split_in_place() called with strdup_strings");
+	else if (!in_place && !list->strdup_strings)
+		BUG("string_list_split() called without strdup_strings");
 
-	if (!list->strdup_strings)
-		BUG("internal error in string_list_split(): "
-		    "list->strdup_strings must be set");
 	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			string_list_append_nodup(list, xmemdupz(p, end - p));
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
+		char *end;
+
+		if (0 <= maxsplit && maxsplit <= count)
+			end = NULL;
+		else
+			end = strpbrk(p, delim);
+
+		count += append_one(list, p, end, in_place);
+
+		if (!end)
 			return count;
-		}
+		p = end + 1;
 	}
 }
 
+int string_list_split(struct string_list *list, const char *string,
+		      const char *delim, int maxsplit)
+{
+	return split_string(list, string, delim, maxsplit, 0);
+}
+
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	int count = 0;
-	char *p = string, *end;
-
-	if (list->strdup_strings)
-		BUG("internal error in string_list_split_in_place(): "
-		    "list->strdup_strings must not be set");
-	for (;;) {
-		count++;
-		if (maxsplit >= 0 && count > maxsplit) {
-			string_list_append(list, p);
-			return count;
-		}
-		end = strpbrk(p, delim);
-		if (end) {
-			*end = '\0';
-			string_list_append(list, p);
-			p = end + 1;
-		} else {
-			string_list_append(list, p);
-			return count;
-		}
-	}
+	return split_string(list, string, delim, maxsplit, 1);
 }
-- 
2.50.1-633-g69dfdd50af



* [PATCH v4 4/7] string-list: optionally trim string pieces split by string_list_split*()
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
                         ` (2 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v4 3/7] string-list: unify string_list_split* functions Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

Teach the unified split_string() to take an optional "flags" word,
and define the first flag STRING_LIST_SPLIT_TRIM to cause the split
pieces to be trimmed before they are placed in the string list.
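
To make the order of operations concrete, here is a minimal
standalone sketch of trimming while splitting.  It is not the patch's
code: split_trimmed() and MAX_PIECE are names made up for this
example, and the fixed-size output array is a simplification.  The
point it tries to show is that leading whitespace is skipped before
the delimiter search, and trailing whitespace is dropped afterwards.

```c
#include <ctype.h>
#include <string.h>

#define MAX_PIECE 64	/* sketch-only limit on the length of one piece */

/*
 * Split "string" at any byte in "delim", trimming whitespace around
 * each piece, and copy the pieces into out[].  Returns the number of
 * pieces stored.  A negative maxsplit means "no limit".
 */
size_t split_trimmed(const char *string, const char *delim, int maxsplit,
		     char out[][MAX_PIECE], size_t max_out)
{
	size_t n = 0;
	const char *p = string;

	while (n < max_out) {
		const char *end, *next;
		size_t len;

		/*
		 * ltrim before searching, so a whitespace delimiter in
		 * front of a piece is not mistaken for a split point
		 */
		while (*p && isspace((unsigned char)*p))
			p++;

		if (maxsplit >= 0 && n >= (size_t)maxsplit)
			end = p + strlen(p);	/* split budget used up */
		else
			end = p + strcspn(p, delim);
		next = *end ? end + 1 : NULL;

		/* rtrim */
		while (p < end && isspace((unsigned char)end[-1]))
			end--;

		len = (size_t)(end - p);
		if (len >= MAX_PIECE)
			len = MAX_PIECE - 1;	/* truncate; sketch only */
		memcpy(out[n], p, len);
		out[n][len] = '\0';
		n++;

		if (!next)
			return n;
		p = next;
	}
	return n;
}
```

Because the left trim happens first, splitting "  a  b c  " at " "
with maxsplit 1 yields "a" and "b c" rather than a run of empty
pieces.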

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c                | 35 +++++++++++++++++---
 string-list.h                | 15 +++++++++
 t/unit-tests/u-string-list.c | 64 ++++++++++++++++++++++++++++++++++++
 3 files changed, 109 insertions(+), 5 deletions(-)

diff --git a/string-list.c b/string-list.c
index 65b6ceb259..86a309f8fb 100644
--- a/string-list.c
+++ b/string-list.c
@@ -282,11 +282,18 @@ void unsorted_string_list_delete_item(struct string_list *list, int i, int free_
  */
 static int append_one(struct string_list *list,
 		      const char *p, const char *end,
-		      int in_place)
+		      int in_place, unsigned flags)
 {
 	if (!end)
 		end = p + strlen(p);
 
+	if ((flags & STRING_LIST_SPLIT_TRIM)) {
+		/* rtrim */
+		for (; p < end; end--)
+			if (!isspace(end[-1]))
+				break;
+	}
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
@@ -307,7 +314,7 @@ static int append_one(struct string_list *list,
  * returns "char *" pointer into that const string.  Yucky but works ;-).
  */
 static int split_string(struct string_list *list, const char *string, const char *delim,
-			int maxsplit, int in_place)
+			int maxsplit, int in_place, unsigned flags)
 {
 	int count = 0;
 	const char *p = string;
@@ -320,12 +327,18 @@ static int split_string(struct string_list *list, const char *string, const char
 	for (;;) {
 		char *end;
 
+		if (flags & STRING_LIST_SPLIT_TRIM) {
+			/* ltrim */
+			while (*p && isspace(*p))
+				p++;
+		}
+
 		if (0 <= maxsplit && maxsplit <= count)
 			end = NULL;
 		else
 			end = strpbrk(p, delim);
 
-		count += append_one(list, p, end, in_place);
+		count += append_one(list, p, end, in_place, flags);
 
 		if (!end)
 			return count;
@@ -336,11 +349,23 @@ static int split_string(struct string_list *list, const char *string, const char
 int string_list_split(struct string_list *list, const char *string,
 		      const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 0);
+	return split_string(list, string, delim, maxsplit, 0, 0);
 }
 
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit)
 {
-	return split_string(list, string, delim, maxsplit, 1);
+	return split_string(list, string, delim, maxsplit, 1, 0);
+}
+
+int string_list_split_f(struct string_list *list, const char *string,
+			const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 0, flags);
+}
+
+int string_list_split_in_place_f(struct string_list *list, char *string,
+			       const char *delim, int maxsplit, unsigned flags)
+{
+	return split_string(list, string, delim, maxsplit, 1, flags);
 }
diff --git a/string-list.h b/string-list.h
index 6c8650efde..40e148712d 100644
--- a/string-list.h
+++ b/string-list.h
@@ -281,4 +281,19 @@ int string_list_split(struct string_list *list, const char *string,
  */
 int string_list_split_in_place(struct string_list *list, char *string,
 			       const char *delim, int maxsplit);
+
+/* Flag bits for split_f and split_in_place_f functions */
+enum {
+	/*
+	 * trim whitespaces around resulting string piece before adding
+	 * it to the list
+	 */
+	STRING_LIST_SPLIT_TRIM = (1 << 0),
+};
+
+int string_list_split_f(struct string_list *, const char *string,
+			const char *delim, int maxsplit, unsigned flags);
+
+int string_list_split_in_place_f(struct string_list *, char *string,
+				 const char *delim, int maxsplit, unsigned flags);
 #endif /* STRING_LIST_H */
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index 150a5f505f..daa9307e45 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -63,6 +63,70 @@ static void t_string_list_split(const char *data, const char *delim, int maxspli
 	string_list_clear(&list, 0);
 }
 
+static void t_string_list_split_f(const char *data, const char *delim,
+				  int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_DUP;
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_f(void)
+{
+	t_string_list_split_f("::foo:bar:baz:", ":", -1, 0,
+			      "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+			      "a", "b c", NULL);
+}
+
+static void t_string_list_split_in_place_f(const char *data_, const char *delim,
+					   int maxsplit, unsigned flags, ...)
+{
+	struct string_list expected_strings = STRING_LIST_INIT_DUP;
+	struct string_list list = STRING_LIST_INIT_NODUP;
+	char *data = xstrdup(data_);
+	va_list ap;
+	int len;
+
+	va_start(ap, flags);
+	t_vcreate_string_list_dup(&expected_strings, 0, ap);
+	va_end(ap);
+
+	string_list_clear(&list, 0);
+	len = string_list_split_in_place_f(&list, data, delim, maxsplit, flags);
+	cl_assert_equal_i(len, expected_strings.nr);
+	t_string_list_equal(&list, &expected_strings);
+
+	free(data);
+	string_list_clear(&expected_strings, 0);
+	string_list_clear(&list, 0);
+}
+
+void test_string_list__split_in_place_f(void)
+{
+	t_string_list_split_in_place_f("::foo:bar:baz:", ":", -1, 0,
+				       "", "", "foo", "bar", "baz", "", NULL);
+	t_string_list_split_in_place_f(" foo:bar : baz", ":", -1, STRING_LIST_SPLIT_TRIM,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
+				       "a", "b c", NULL);
+}
+
 void test_string_list__split(void)
 {
 	t_string_list_split("foo:bar:baz", ":", -1, "foo", "bar", "baz", NULL);
-- 
2.50.1-633-g69dfdd50af



* [PATCH v4 5/7] diff: simplify parsing of diff.colormovedws
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
                         ` (3 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v4 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

The code to parse this configuration variable, whose value is a
comma-separated list of known tokens like "ignore-space-change" and
"ignore-all-space", uses string_list_split() to split the value into
pieces, and then places each piece of string in a strbuf to trim,
before comparing the result with the list of known tokens.

Thanks to the previous steps, now string_list_split() can trim the
resulting pieces before it places them in the string list.  Use it
to simplify the code.
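
The pattern can be sketched outside of git as follows.  This is only
an illustration under stated assumptions: parse_ws_modes() and the
WS_* bits are hypothetical names, not the XDF_* or COLOR_MOVED_WS_*
constants used in diff.c, and the tokens are compared in place
instead of going through a string list.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical flag bits for this sketch only. */
enum {
	WS_IGNORE_CHANGE = 1 << 0,
	WS_IGNORE_AT_EOL = 1 << 1,
	WS_IGNORE_ALL    = 1 << 2,
	WS_ERROR         = 1 << 3
};

/* Does the byte range [p, p + len) spell exactly "name"? */
static int token_is(const char *p, size_t len, const char *name)
{
	return strlen(name) == len && !memcmp(p, name, len);
}

/* Parse a comma-separated list of modes, trimming each token. */
unsigned parse_ws_modes(const char *arg)
{
	unsigned ret = 0;
	const char *p = arg;

	for (;;) {
		const char *end = p + strcspn(p, ",");
		const char *b = p, *e = end;

		/* trim whitespace on both sides of the token */
		while (b < e && isspace((unsigned char)*b))
			b++;
		while (b < e && isspace((unsigned char)e[-1]))
			e--;

		if (token_is(b, (size_t)(e - b), "no"))
			ret = 0;
		else if (token_is(b, (size_t)(e - b), "ignore-space-change"))
			ret |= WS_IGNORE_CHANGE;
		else if (token_is(b, (size_t)(e - b), "ignore-space-at-eol"))
			ret |= WS_IGNORE_AT_EOL;
		else if (token_is(b, (size_t)(e - b), "ignore-all-space"))
			ret |= WS_IGNORE_ALL;
		else
			ret |= WS_ERROR;

		if (!*end)
			return ret;
		p = end + 1;
	}
}
```

Once trimming is part of the split itself, the loop body only has to
compare tokens; no per-token scratch buffer is needed.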

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c | 20 +++++++-------------
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/diff.c b/diff.c
index a81949a422..70666ad2cd 100644
--- a/diff.c
+++ b/diff.c
@@ -327,29 +327,23 @@ static unsigned parse_color_moved_ws(const char *arg)
 	struct string_list l = STRING_LIST_INIT_DUP;
 	struct string_list_item *i;
 
-	string_list_split(&l, arg, ",", -1);
+	string_list_split_f(&l, arg, ",", -1, STRING_LIST_SPLIT_TRIM);
 
 	for_each_string_list_item(i, &l) {
-		struct strbuf sb = STRBUF_INIT;
-		strbuf_addstr(&sb, i->string);
-		strbuf_trim(&sb);
-
-		if (!strcmp(sb.buf, "no"))
+		if (!strcmp(i->string, "no"))
 			ret = 0;
-		else if (!strcmp(sb.buf, "ignore-space-change"))
+		else if (!strcmp(i->string, "ignore-space-change"))
 			ret |= XDF_IGNORE_WHITESPACE_CHANGE;
-		else if (!strcmp(sb.buf, "ignore-space-at-eol"))
+		else if (!strcmp(i->string, "ignore-space-at-eol"))
 			ret |= XDF_IGNORE_WHITESPACE_AT_EOL;
-		else if (!strcmp(sb.buf, "ignore-all-space"))
+		else if (!strcmp(i->string, "ignore-all-space"))
 			ret |= XDF_IGNORE_WHITESPACE;
-		else if (!strcmp(sb.buf, "allow-indentation-change"))
+		else if (!strcmp(i->string, "allow-indentation-change"))
 			ret |= COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE;
 		else {
 			ret |= COLOR_MOVED_WS_ERROR;
-			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), sb.buf);
+			error(_("unknown color-moved-ws mode '%s', possible values are 'ignore-space-change', 'ignore-space-at-eol', 'ignore-all-space', 'allow-indentation-change'"), i->string);
 		}
-
-		strbuf_release(&sb);
 	}
 
 	if ((ret & COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE) &&
-- 
2.50.1-633-g69dfdd50af



* [PATCH v4 6/7] string-list: optionally omit empty string pieces in string_list_split*()
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
                         ` (4 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v4 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v4 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
  2025-08-04  6:24       ` [PATCH v4 0/7] string_list_split*() updates Patrick Steinhardt
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

Teach the unified split_string() machinery a new flag bit,
STRING_LIST_SPLIT_NONEMPTY, to cause empty split pieces to be
omitted from the resulting string list.
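
A small standalone sketch of the idea, assuming a simplified
interface: split_nonempty() and struct span are invented for this
example and record each piece as a pointer and length into the
original string.  The emptiness check runs after the optional
trimming, so a piece made only of whitespace is also dropped when
trimming is requested.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical view of one piece inside the original string. */
struct span {
	const char *start;
	size_t len;
};

/*
 * Split "string" at any byte in "delim" and record only the
 * non-empty pieces in out[].  With "trim" set, whitespace around a
 * piece is removed before the emptiness check.
 */
size_t split_nonempty(const char *string, const char *delim, int trim,
		      struct span *out, size_t max_out)
{
	size_t n = 0;
	const char *p = string;

	for (;;) {
		const char *end, *next;

		if (trim)
			while (*p && isspace((unsigned char)*p))
				p++;

		end = p + strcspn(p, delim);
		next = *end ? end + 1 : NULL;

		if (trim)
			while (p < end && isspace((unsigned char)end[-1]))
				end--;

		if (p < end) {		/* skip empty pieces */
			if (n == max_out)
				return n;
			out[n].start = p;
			out[n].len = (size_t)(end - p);
			n++;
		}

		if (!next)
			return n;
		p = next;
	}
}
```

For "::foo::bar:baz:" split at ":", this records three spans, so a
caller no longer needs a separate pass to remove empty items.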

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 string-list.c                |  3 +++
 string-list.h                |  2 ++
 t/unit-tests/u-string-list.c | 15 +++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/string-list.c b/string-list.c
index 86a309f8fb..343cf1ca90 100644
--- a/string-list.c
+++ b/string-list.c
@@ -294,6 +294,9 @@ static int append_one(struct string_list *list,
 				break;
 	}
 
+	if ((flags & STRING_LIST_SPLIT_NONEMPTY) && (end <= p))
+		return 0;
+
 	if (in_place) {
 		*((char *)end) = '\0';
 		string_list_append(list, p);
diff --git a/string-list.h b/string-list.h
index 40e148712d..2b438c7733 100644
--- a/string-list.h
+++ b/string-list.h
@@ -289,6 +289,8 @@ enum {
 	 * it to the list
 	 */
 	STRING_LIST_SPLIT_TRIM = (1 << 0),
+	/* omit adding empty string piece to the resulting list */
+	STRING_LIST_SPLIT_NONEMPTY = (1 << 1),
 };
 
 int string_list_split_f(struct string_list *, const char *string,
diff --git a/t/unit-tests/u-string-list.c b/t/unit-tests/u-string-list.c
index daa9307e45..a2457d7b1e 100644
--- a/t/unit-tests/u-string-list.c
+++ b/t/unit-tests/u-string-list.c
@@ -92,6 +92,13 @@ void test_string_list__split_f(void)
 			      "foo", "bar", "baz", NULL);
 	t_string_list_split_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
 			      "a", "b c", NULL);
+	t_string_list_split_f("::foo::bar:baz:", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+			      "foo", "bar", "baz", NULL);
+	t_string_list_split_f("foo:baz", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+			      "foo", "baz", NULL);
+	t_string_list_split_f("foo :: : baz", ":", -1,
+			      STRING_LIST_SPLIT_NONEMPTY | STRING_LIST_SPLIT_TRIM,
+			      "foo", "baz", NULL);
 }
 
 static void t_string_list_split_in_place_f(const char *data_, const char *delim,
@@ -125,6 +132,14 @@ void test_string_list__split_in_place_f(void)
 				       "foo", "bar", "baz", NULL);
 	t_string_list_split_in_place_f("  a  b c  ", " ", 1, STRING_LIST_SPLIT_TRIM,
 				       "a", "b c", NULL);
+	t_string_list_split_in_place_f("::foo::bar:baz:", ":", -1,
+				       STRING_LIST_SPLIT_NONEMPTY,
+				       "foo", "bar", "baz", NULL);
+	t_string_list_split_in_place_f("foo:baz", ":", -1, STRING_LIST_SPLIT_NONEMPTY,
+				       "foo", "baz", NULL);
+	t_string_list_split_in_place_f("foo :: : baz", ":", -1,
+				       STRING_LIST_SPLIT_NONEMPTY | STRING_LIST_SPLIT_TRIM,
+				       "foo", "baz", NULL);
 }
 
 void test_string_list__split(void)
-- 
2.50.1-633-g69dfdd50af



* [PATCH v4 7/7] string-list: split-then-remove-empty can be done while splitting
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
                         ` (5 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v4 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-04  6:24       ` [PATCH v4 0/7] string_list_split*() updates Patrick Steinhardt
  7 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

Thanks to the new STRING_LIST_SPLIT_NONEMPTY flag, the common pattern
of splitting a string into a string list and then removing empty
items from the resulting list is no longer needed.  Instead, just
tell string_list_split*() to omit empty pieces while splitting.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 notes.c                     | 4 ++--
 pathspec.c                  | 3 +--
 t/helper/test-hashmap.c     | 4 ++--
 t/helper/test-json-writer.c | 4 ++--
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/notes.c b/notes.c
index 6afcf088b9..3603c4a42b 100644
--- a/notes.c
+++ b/notes.c
@@ -970,8 +970,8 @@ void string_list_add_refs_from_colon_sep(struct string_list *list,
 	char *globs_copy = xstrdup(globs);
 	int i;
 
-	string_list_split_in_place(&split, globs_copy, ":", -1);
-	string_list_remove_empty_items(&split, 0);
+	string_list_split_in_place_f(&split, globs_copy, ":", -1,
+				     STRING_LIST_SPLIT_NONEMPTY);
 
 	for (i = 0; i < split.nr; i++)
 		string_list_add_refs_by_glob(list, split.items[i].string);
diff --git a/pathspec.c b/pathspec.c
index de325f7ef9..5993c4afa0 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -201,8 +201,7 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va
 	if (!value || !*value)
 		die(_("attr spec must not be empty"));
 
-	string_list_split(&list, value, " ", -1);
-	string_list_remove_empty_items(&list, 0);
+	string_list_split_f(&list, value, " ", -1, STRING_LIST_SPLIT_NONEMPTY);
 
 	item->attr_check = attr_check_alloc();
 	CALLOC_ARRAY(item->attr_match, list.nr);
diff --git a/t/helper/test-hashmap.c b/t/helper/test-hashmap.c
index 7782ae585e..e4dc02bd7a 100644
--- a/t/helper/test-hashmap.c
+++ b/t/helper/test-hashmap.c
@@ -149,8 +149,8 @@ int cmd__hashmap(int argc UNUSED, const char **argv UNUSED)
 
 		/* break line into command and up to two parameters */
 		string_list_setlen(&parts, 0);
-		string_list_split_in_place(&parts, line.buf, DELIM, 2);
-		string_list_remove_empty_items(&parts, 0);
+		string_list_split_in_place_f(&parts, line.buf, DELIM, 2,
+					     STRING_LIST_SPLIT_NONEMPTY);
 
 		/* ignore empty lines */
 		if (!parts.nr)
diff --git a/t/helper/test-json-writer.c b/t/helper/test-json-writer.c
index a288069b04..f8316a7d29 100644
--- a/t/helper/test-json-writer.c
+++ b/t/helper/test-json-writer.c
@@ -492,8 +492,8 @@ static int scripted(void)
 
 		/* break line into command and zero or more tokens */
 		string_list_setlen(&parts, 0);
-		string_list_split_in_place(&parts, line, " ", -1);
-		string_list_remove_empty_items(&parts, 0);
+		string_list_split_in_place_f(&parts, line, " ", -1,
+					     STRING_LIST_SPLIT_NONEMPTY);
 
 		/* ignore empty lines */
 		if (!parts.nr || !*parts.items[0].string)
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 00/12] do not overuse strbuf_split*()
  2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
                       ` (7 preceding siblings ...)
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
@ 2025-08-03  6:52     ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 01/12] wt-status: avoid strbuf_split*() Junio C Hamano
                         ` (11 more replies)
  8 siblings, 12 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

strbuf is a very good data structure to work with string data
without having to worry about running past the end of the string.

But an array of strbuf is often the wrong data structure.  You rarely
need to edit multiple strings represented by such an array
simultaneously.  And strbuf_split*(), which produces its result in
such a shape, is a misdesigned API function.

The most common use case of the strbuf_split*() family of functions
seems to be to trim away the whitespace around each piece of the
split string.  With modern string_list_split*(), that is often no
longer necessary.

This series builds on top of the other series that extends the string
list API to allow string_list_split() to take more than one delimiter
byte, and to optionally trim the resulting string pieces.

I do not plan to eradicate all the uses of strbuf_split*() myself,
not because I found some valid use cases in the existing code (I
haven't yet), but because these patches should give interested
others enough material to study and mimic to continue the effort, so
I can safely leave rewriting the rest as #leftoverbits.

Relative to v2, this iteration v3 adds one more clean-up step to
correct a callee that insists on taking a whole strbuf when it can
work with any NUL-terminated string, and comes with a handful of
typofixes.

Junio C Hamano (12):
  wt-status: avoid strbuf_split*()
  clean: do not pass strbuf by value
  clean: do not use strbuf_split*() [part 1]
  clean: do not pass the whole structure when it is not necessary
  clean: do not use strbuf_split*() [part 2]
  merge-tree: do not use strbuf_split*()
  notes: do not use strbuf_split*()
  config: do not use strbuf_split()
  environment: do not use strbuf_split*()
  sub-process: do not use strbuf_split*()
  trace2: trim_trailing_newline followed by trim is a no-op
  trace2: do not use strbuf_split*()

 builtin/clean.c      | 74 ++++++++++++++++++++--------------------
 builtin/merge-tree.c | 30 +++++++++--------
 builtin/notes.c      | 23 +++++++------
 config.c             | 23 ++++++-------
 environment.c        | 19 +++++++----
 sub-process.c        | 15 ++++-----
 trace2/tr2_cfg.c     | 80 +++++++++++++++-----------------------------
 wt-status.c          | 31 ++++++-----------
 8 files changed, 129 insertions(+), 166 deletions(-)

Range-diff against v2:
 1:  27de3d9a92 =  1:  2efe707054 wt-status: avoid strbuf_split*()
 2:  8f096e5a2d =  2:  899ff9c175 clean: do not pass strbuf by value
 3:  768b08907e =  3:  7a4acc3607 clean: do not use strbuf_split*() [part 1]
 -:  ---------- >  4:  4985f72ea5 clean: do not pass the whole structure when it is not necessary
 4:  0f8583e798 =  5:  4f60672f6f clean: do not use strbuf_split*() [part 2]
 5:  cefc2ec9f5 =  6:  d33091220d merge-tree: do not use strbuf_split*()
 6:  1c8ea097f6 !  7:  566e910495 notes: do not use strbuf_split*()
    @@ Metadata
      ## Commit message ##
         notes: do not use strbuf_split*()
     
    -    When reading the copy instruction from the standard input, the
    -    program reads a line, splits it into tokens at whitespace, and trims
    -    each of the tokens before using.  We no longer need to use strbuf
    -    just to be able to trimming, as string_list_split*() family now can
    -    trim while splitting a string.
    +    When reading copy instructions from the standard input, the program
    +    reads a line, splits it into tokens at whitespace, and trims each of
    +    the tokens before using.  We no longer need to use strbuf just to be
    +    able to trim, as string_list_split*() family now can trim while
    +    splitting a string.
     
    -    Retire the use of strbuf_split().
    +    Retire the use of strbuf_split() from this code path.
     
         Note that this loop is a bit sloppy in that it ensures at least
         there are two tokens on each line, but ignores if there are extra
 7:  a472688ec1 !  8:  dcecac2580 config: do not use strbuf_split()
    @@ Commit message
         config: do not use strbuf_split()
     
         When parsing an old-style GIT_CONFIG_PARAMETERS environment
    -    variable, the code parses the key=value pair by spliting them at '='
    -    into an array of strbuf's.  As strbuf_split() leafes the delimiter
    +    variable, the code parses key=value pairs by splitting them at '='
    +    into an array of strbuf's.  As strbuf_split() leaves the delimiter
         at the end of the split piece, the code has to manually trim it.
     
         If we split with string_list_split(), that becomes unnecessary.
    -    Retire the use of strbuf_split().
    +    Retire the use of strbuf_split() from this code path.
     
         Note that the max parameter of string_list_split() is of
         an ergonomically iffy design---it specifies the maximum number of
 8:  2b9957f31c =  9:  b894d4481f environment: do not use strbuf_split*()
 9:  4a5599836d = 10:  d6fd08bd76 sub-process: do not use strbuf_split*()
10:  cf6ecd2090 ! 11:  cb8e82a641 trace2: trim_trailing_newline followed by trim is a no-op
    @@ Commit message
         of a string.  If the code plans to call strbuf_trim() immediately
         after doing so, the code is better off skipping the EOL trimming in
         the first place.  After all, LF/CRLF at the end is a mere special
    -    case of whitespaces at the right end of the string, which will be
    -    removed by strbuf_rtrim().
    +    case of whitespaces at the end of the string, which will be removed
    +    by strbuf_rtrim() anyway.
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
11:  c2578b6b1c ! 12:  838fe56920 trace2: do not use strbuf_split*()
    @@ Metadata
      ## Commit message ##
         trace2: do not use strbuf_split*()
     
    -    tr2_cfg_load_patterns() and tr2_load_env_vars() functions are copied
    -    and pasted pair of functions that each reads an environment
    -    variable, split the value at ',' boundaries and trims the resulting
    -    string pieces into an array of strbufs.  But the code paths that
    -    later use these strbufs take no advantage of the strbuf-ness of the
    -    result (they do not benefit from <ptr,len> representation to avoid
    -    having to run strlne(<ptr>), for example).
    +    tr2_cfg_load_patterns() and tr2_load_env_vars() functions are
    +    functions with very similar structure that each reads an environment
    +    variable, splits its value at the ',' boundaries, and trims the
    +    resulting string pieces into an array of strbufs.
    +
    +    But the code paths that later use these strbufs take no advantage of
    +    the strbuf-ness of the result (they do not benefit from <ptr,len>
    +    representation to avoid having to run strlen(<ptr>), for example).
     
         Simplify the code by teaching these functions to split into a string
    -    list instead.
    +    list instead; even the trimming comes for free ;-).
     
         Signed-off-by: Junio C Hamano <gitster@pobox.com>
     
     
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 01/12] wt-status: avoid strbuf_split*()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 02/12] clean: do not pass strbuf by value Junio C Hamano
                         ` (10 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

strbuf is a very good data structure to work with string data
without having to worry about running past the end of the string,
but strbuf_split() is a wrong API and an array of strbuf that the
function produces is a wrong thing to use in general.  You do not
edit these N strings split out of a single strbuf simultaneously.
Often you are much better off splitting a string into a string_list
and working with the resulting strings.

wt-status.c:abbrev_oid_in_line() takes one line of rebase todo list
(like "pick e813a0200a7121b97fec535f0d0b460b0a33356c title"), and
for instructions that have an object name as the second token on the
line, replaces the object name with its unique abbreviation.  After
splitting these tokens out of a single line, no simultaneous edit on
any of these pieces of string that takes advantage of strbuf API
takes place.  The final string is composed with strbuf API, but
these split pieces are merely used as pieces of strings and there is
no need for them to be stored in individual strbuf.

Instead, split the line into a string_list, and compose the final
string using these pieces.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 wt-status.c | 31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)

diff --git a/wt-status.c b/wt-status.c
index 454601afa1..a34dc144ee 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1351,8 +1351,8 @@ static int split_commit_in_progress(struct wt_status *s)
  */
 static void abbrev_oid_in_line(struct strbuf *line)
 {
-	struct strbuf **split;
-	int i;
+	struct string_list split = STRING_LIST_INIT_DUP;
+	struct object_id oid;
 
 	if (starts_with(line->buf, "exec ") ||
 	    starts_with(line->buf, "x ") ||
@@ -1360,26 +1360,15 @@ static void abbrev_oid_in_line(struct strbuf *line)
 	    starts_with(line->buf, "l "))
 		return;
 
-	split = strbuf_split_max(line, ' ', 3);
-	if (split[0] && split[1]) {
-		struct object_id oid;
-
-		/*
-		 * strbuf_split_max left a space. Trim it and re-add
-		 * it after abbreviation.
-		 */
-		strbuf_trim(split[1]);
-		if (!repo_get_oid(the_repository, split[1]->buf, &oid)) {
-			strbuf_reset(split[1]);
-			strbuf_add_unique_abbrev(split[1], &oid,
-						 DEFAULT_ABBREV);
-			strbuf_addch(split[1], ' ');
-			strbuf_reset(line);
-			for (i = 0; split[i]; i++)
-				strbuf_addbuf(line, split[i]);
-		}
+	if ((2 <= string_list_split(&split, line->buf, " ", 2)) &&
+	    !repo_get_oid(the_repository, split.items[1].string, &oid)) {
+		strbuf_reset(line);
+		strbuf_addf(line, "%s ", split.items[0].string);
+		strbuf_add_unique_abbrev(line, &oid, DEFAULT_ABBREV);
+		for (size_t i = 2; i < split.nr; i++)
+			strbuf_addf(line, " %s", split.items[i].string);
 	}
-	strbuf_list_free(split);
+	string_list_clear(&split, 0);
 }
 
 static int read_rebase_todolist(const char *fname, struct string_list *lines)
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 02/12] clean: do not pass strbuf by value
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 01/12] wt-status: avoid strbuf_split*() Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 03/12] clean: do not use strbuf_split*() [part 1] Junio C Hamano
                         ` (9 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

When you pass a structure by value, the callee can modify the
contents of the structure that was passed in without having to worry
about changing the structure the caller has.  Passing structure by
value sometimes (but not very often) can be a valid way to give
callee a temporary variable it can freely modify.

But not a structure with members that are pointers, like a strbuf.

builtin/clean.c:list_and_choose() reads a line interactively from
the user, and passes the line (in a strbuf) to parse_choice() by
value, which then munges it by replacing ',' with ' ' (to accept both
comma and space separated lists of choices).  But because the strbuf
passed by value still shares the underlying character array buf[],
this ends up munging the caller's strbuf contents.

This is a catastrophe waiting to happen.  If the callee causes the
strbuf to be reallocated, the buf[] the caller has will become
dangling, and when the caller does strbuf_release(), it would result
in double-free.

Stop calling the function with misleading call-by-value with strbuf.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clean.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 053c94fc6b..224551537e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -477,7 +477,7 @@ static int find_unique(const char *choice, struct menu_stuff *menu_stuff)
  */
 static int parse_choice(struct menu_stuff *menu_stuff,
 			int is_single,
-			struct strbuf input,
+			struct strbuf *input,
 			int **chosen)
 {
 	struct strbuf **choice_list, **ptr;
@@ -485,14 +485,14 @@ static int parse_choice(struct menu_stuff *menu_stuff,
 	int i;
 
 	if (is_single) {
-		choice_list = strbuf_split_max(&input, '\n', 0);
+		choice_list = strbuf_split_max(input, '\n', 0);
 	} else {
-		char *p = input.buf;
+		char *p = input->buf;
 		do {
 			if (*p == ',')
 				*p = ' ';
 		} while (*p++);
-		choice_list = strbuf_split_max(&input, ' ', 0);
+		choice_list = strbuf_split_max(input, ' ', 0);
 	}
 
 	for (ptr = choice_list; *ptr; ptr++) {
@@ -630,7 +630,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 
 		nr = parse_choice(stuff,
 				  opts->flags & MENU_OPTS_SINGLETON,
-				  choice,
+				  &choice,
 				  &chosen);
 
 		if (opts->flags & MENU_OPTS_SINGLETON) {
-- 
2.50.1-633-g69dfdd50af
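
The sharing described in the "clean: do not pass strbuf by value"
message above can be reproduced outside git.  What follows is only a
minimal sketch: struct tiny_buf and munge_by_value() are hypothetical
stand-ins, not git's strbuf or parse_choice().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf: a length plus a pointer to storage. */
struct tiny_buf {
	size_t len;
	char *buf;
};

/*
 * Takes the structure BY VALUE.  "input" is a private copy, so a change
 * to input.len is invisible to the caller, but input.buf still points
 * at the caller's character array.
 */
static void munge_by_value(struct tiny_buf input)
{
	assert(input.buf);
	for (char *p = input.buf; *p; p++)
		if (*p == ',')
			*p = ' ';
	input.len = 0; /* only the callee's copy changes */
}
```

Calling munge_by_value() on a tiny_buf whose buf holds "1,2,3" leaves
the caller's len untouched but turns the caller's bytes into "1 2 3".
The sketch never reallocates; with a real strbuf, a reallocation in
the callee is what would leave the caller's pointer dangling.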



* [PATCH v3 03/12] clean: do not use strbuf_split*() [part 1]
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 01/12] wt-status: avoid strbuf_split*() Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 02/12] clean: do not pass strbuf by value Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 04/12] clean: do not pass the whole structure when it is not necessary Junio C Hamano
                         ` (8 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

builtin/clean.c:parse_choice() is fed a single line of input, which
is a space or comma separated list of tokens, and a list of menu
items.  It parses the tokens into number ranges (e.g. 1-3 that means
the first three items) or string prefix (e.g. 's' to choose the menu
item "(s)elect") that specify the elements in the menu item list,
and tells the caller which ones are chosen.

For parsing the input string, it uses strbuf_split() to split it
into a bunch of strbufs.  Instead use string_list_split_in_place(),
for a few reasons.

 * strbuf_split() is a bad API function to use, that yields an array
   of strbuf that is a bad data structure to use in general.

 * string_list_split_in_place() allows you to split with "comma or
   space"; the current code has to preprocess the input string to
   replace comma with space because strbuf_split() does not allow
   this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clean.c | 50 +++++++++++++++++++++++--------------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 224551537e..708cd9344c 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -480,40 +480,36 @@ static int parse_choice(struct menu_stuff *menu_stuff,
 			struct strbuf *input,
 			int **chosen)
 {
-	struct strbuf **choice_list, **ptr;
+	struct string_list choice = STRING_LIST_INIT_NODUP;
+	struct string_list_item *item;
 	int nr = 0;
 	int i;
 
-	if (is_single) {
-		choice_list = strbuf_split_max(input, '\n', 0);
-	} else {
-		char *p = input->buf;
-		do {
-			if (*p == ',')
-				*p = ' ';
-		} while (*p++);
-		choice_list = strbuf_split_max(input, ' ', 0);
-	}
+	string_list_split_in_place_f(&choice, input->buf,
+				     is_single ? "\n" : ", ", -1,
+				     STRING_LIST_SPLIT_TRIM);
 
-	for (ptr = choice_list; *ptr; ptr++) {
-		char *p;
-		int choose = 1;
+	for_each_string_list_item(item, &choice) {
+		const char *string;
+		int choose;
 		int bottom = 0, top = 0;
 		int is_range, is_number;
 
-		strbuf_trim(*ptr);
-		if (!(*ptr)->len)
+		string = item->string;
+		if (!*string)
 			continue;
 
 		/* Input that begins with '-'; unchoose */
-		if (*(*ptr)->buf == '-') {
+		if (string[0] == '-') {
 			choose = 0;
-			strbuf_remove((*ptr), 0, 1);
+			string++;
+		} else {
+			choose = 1;
 		}
 
 		is_range = 0;
 		is_number = 1;
-		for (p = (*ptr)->buf; *p; p++) {
+		for (const char *p = string; *p; p++) {
 			if ('-' == *p) {
 				if (!is_range) {
 					is_range = 1;
@@ -531,27 +527,27 @@ static int parse_choice(struct menu_stuff *menu_stuff,
 		}
 
 		if (is_number) {
-			bottom = atoi((*ptr)->buf);
+			bottom = atoi(string);
 			top = bottom;
 		} else if (is_range) {
-			bottom = atoi((*ptr)->buf);
+			bottom = atoi(string);
 			/* a range can be specified like 5-7 or 5- */
-			if (!*(strchr((*ptr)->buf, '-') + 1))
+			if (!*(strchr(string, '-') + 1))
 				top = menu_stuff->nr;
 			else
-				top = atoi(strchr((*ptr)->buf, '-') + 1);
-		} else if (!strcmp((*ptr)->buf, "*")) {
+				top = atoi(strchr(string, '-') + 1);
+		} else if (!strcmp(string, "*")) {
 			bottom = 1;
 			top = menu_stuff->nr;
 		} else {
-			bottom = find_unique((*ptr)->buf, menu_stuff);
+			bottom = find_unique(string, menu_stuff);
 			top = bottom;
 		}
 
 		if (top <= 0 || bottom <= 0 || top > menu_stuff->nr || bottom > top ||
 		    (is_single && bottom != top)) {
 			clean_print_color(CLEAN_COLOR_ERROR);
-			printf(_("Huh (%s)?\n"), (*ptr)->buf);
+			printf(_("Huh (%s)?\n"), string);
 			clean_print_color(CLEAN_COLOR_RESET);
 			continue;
 		}
@@ -560,7 +556,7 @@ static int parse_choice(struct menu_stuff *menu_stuff,
 			(*chosen)[i-1] = choose;
 	}
 
-	strbuf_list_free(choice_list);
+	string_list_clear(&choice, 0);
 
 	for (i = 0; i < menu_stuff->nr; i++)
 		nr += (*chosen)[i];
-- 
2.50.1-633-g69dfdd50af
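
To illustrate the "comma or space" splitting with trimming that the
message above relies on, here is a standalone sketch.  It is not
git's string_list_split_in_place_f(); split_any_trim() and
trim_piece() are hypothetical helpers that only mimic the idea of
splitting in place at any byte of a delimiter set.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace in place; returns the new start. */
static char *trim_piece(char *s)
{
	char *end;

	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';
	return s;
}

/*
 * Hypothetical helper: split "text" in place at any byte found in
 * "delims", trim each piece, and store up to "cap" pointers in
 * pieces[].  Input beyond "cap" pieces is silently dropped.
 */
static size_t split_any_trim(char *text, const char *delims,
			     char **pieces, size_t cap)
{
	size_t nr = 0;
	char *p = text;

	assert(text && delims && pieces);
	while (p && nr < cap) {
		char *end = strpbrk(p, delims);

		if (end)
			*end++ = '\0';
		pieces[nr++] = trim_piece(p);
		p = end;
	}
	return nr;
}
```

With the input "1-3, 5,s" and delimiters ", ", the sketch yields
"1-3", an empty piece (between the comma and the space), "5" and "s".
That empty piece is why a caller would still want a check like the
"if (!*string) continue;" seen in the patch.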



* [PATCH v3 04/12] clean: do not pass the whole structure when it is not necessary
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (2 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v3 03/12] clean: do not use strbuf_split*() [part 1] Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 05/12] clean: do not use strbuf_split*() [part 2] Junio C Hamano
                         ` (7 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

The callee parse_choice() only needs to access a NUL-terminated
string; instead of insisting on taking a pointer to a strbuf, just
take a pointer to a character array.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clean.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 708cd9344c..9bb920e7fd 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -477,7 +477,7 @@ static int find_unique(const char *choice, struct menu_stuff *menu_stuff)
  */
 static int parse_choice(struct menu_stuff *menu_stuff,
 			int is_single,
-			struct strbuf *input,
+			char *input,
 			int **chosen)
 {
 	struct string_list choice = STRING_LIST_INIT_NODUP;
@@ -485,7 +485,7 @@ static int parse_choice(struct menu_stuff *menu_stuff,
 	int nr = 0;
 	int i;
 
-	string_list_split_in_place_f(&choice, input->buf,
+	string_list_split_in_place_f(&choice, input,
 				     is_single ? "\n" : ", ", -1,
 				     STRING_LIST_SPLIT_TRIM);
 
@@ -626,7 +626,7 @@ static int *list_and_choose(struct menu_opts *opts, struct menu_stuff *stuff)
 
 		nr = parse_choice(stuff,
 				  opts->flags & MENU_OPTS_SINGLETON,
-				  &choice,
+				  choice.buf,
 				  &chosen);
 
 		if (opts->flags & MENU_OPTS_SINGLETON) {
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 05/12] clean: do not use strbuf_split*() [part 2]
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (3 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v3 04/12] clean: do not pass the whole structure when it is not necessary Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 06/12] merge-tree: do not use strbuf_split*() Junio C Hamano
                         ` (6 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

builtin/clean.c:filter_by_patterns_cmd() interactively reads a line
that has exclude patterns from the user and splits the line into a
list of patterns.  It uses strbuf_split() so that each split piece
can then be trimmed.

There is no need to use strbuf anymore, thanks to the recent
enhancement to string_list_split*() family that allows us to trim
the pieces split into a string_list.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/clean.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 9bb920e7fd..38780edc39 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -674,12 +674,13 @@ static int filter_by_patterns_cmd(void)
 {
 	struct dir_struct dir = DIR_INIT;
 	struct strbuf confirm = STRBUF_INIT;
-	struct strbuf **ignore_list;
-	struct string_list_item *item;
 	struct pattern_list *pl;
 	int changed = -1, i;
 
 	for (;;) {
+		struct string_list ignore_list = STRING_LIST_INIT_NODUP;
+		struct string_list_item *item;
+
 		if (!del_list.nr)
 			break;
 
@@ -697,14 +698,15 @@ static int filter_by_patterns_cmd(void)
 			break;
 
 		pl = add_pattern_list(&dir, EXC_CMDL, "manual exclude");
-		ignore_list = strbuf_split_max(&confirm, ' ', 0);
 
-		for (i = 0; ignore_list[i]; i++) {
-			strbuf_trim(ignore_list[i]);
-			if (!ignore_list[i]->len)
-				continue;
+		string_list_split_in_place_f(&ignore_list, confirm.buf, " ", -1,
+					     STRING_LIST_SPLIT_TRIM);
 
-			add_pattern(ignore_list[i]->buf, "", 0, pl, -(i+1));
+		for (i = 0; i < ignore_list.nr; i++) {
+			item = &ignore_list.items[i];
+			if (!*item->string)
+				continue;
+			add_pattern(item->string, "", 0, pl, -(i+1));
 		}
 
 		changed = 0;
@@ -725,7 +727,7 @@ static int filter_by_patterns_cmd(void)
 			clean_print_color(CLEAN_COLOR_RESET);
 		}
 
-		strbuf_list_free(ignore_list);
+		string_list_clear(&ignore_list, 0);
 		dir_clear(&dir);
 	}
 
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 06/12] merge-tree: do not use strbuf_split*()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (4 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v3 05/12] clean: do not use strbuf_split*() [part 2] Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:52       ` [PATCH v3 07/12] notes: " Junio C Hamano
                         ` (5 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

When reading merge instructions from the standard input, the program
reads a line, splits it into tokens at whitespace, and trims each of
them before using.  We no longer need to use strbuf just for
trimming, as the string_list_split*() family can trim while splitting
a string.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/merge-tree.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index cf8b06cadc..70235856d7 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -618,32 +618,34 @@ int cmd_merge_tree(int argc,
 			    "--merge-base", "--stdin");
 		line_termination = '\0';
 		while (strbuf_getline_lf(&buf, stdin) != EOF) {
-			struct strbuf **split;
+			struct string_list split = STRING_LIST_INIT_NODUP;
 			const char *input_merge_base = NULL;
 
-			split = strbuf_split(&buf, ' ');
-			if (!split[0] || !split[1])
+			string_list_split_in_place_f(&split, buf.buf, " ", -1,
+						     STRING_LIST_SPLIT_TRIM);
+
+			if (split.nr < 2)
 				die(_("malformed input line: '%s'."), buf.buf);
-			strbuf_rtrim(split[0]);
-			strbuf_rtrim(split[1]);
 
 			/* parse the merge-base */
-			if (!strcmp(split[1]->buf, "--")) {
-				input_merge_base = split[0]->buf;
+			if (!strcmp(split.items[1].string, "--")) {
+				input_merge_base = split.items[0].string;
 			}
 
-			if (input_merge_base && split[2] && split[3] && !split[4]) {
-				strbuf_rtrim(split[2]);
-				strbuf_rtrim(split[3]);
-				real_merge(&o, input_merge_base, split[2]->buf, split[3]->buf, prefix);
-			} else if (!input_merge_base && !split[2]) {
-				real_merge(&o, NULL, split[0]->buf, split[1]->buf, prefix);
+			if (input_merge_base && split.nr == 4) {
+				real_merge(&o, input_merge_base,
+					   split.items[2].string, split.items[3].string,
+					   prefix);
+			} else if (!input_merge_base && split.nr == 2) {
+				real_merge(&o, NULL,
+					   split.items[0].string, split.items[1].string,
+					   prefix);
 			} else {
 				die(_("malformed input line: '%s'."), buf.buf);
 			}
 			maybe_flush_or_die(stdout, "stdout");
 
-			strbuf_list_free(split);
+			string_list_clear(&split, 0);
 		}
 		strbuf_release(&buf);
 
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 07/12] notes: do not use strbuf_split*()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (5 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v3 06/12] merge-tree: do not use strbuf_split*() Junio C Hamano
@ 2025-08-03  6:52       ` Junio C Hamano
  2025-08-03  6:53       ` [PATCH v3 08/12] config: do not use strbuf_split() Junio C Hamano
                         ` (4 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:52 UTC (permalink / raw)
  To: git

When reading copy instructions from the standard input, the program
reads a line, splits it into tokens at whitespace, and trims each of
the tokens before using.  We no longer need to use strbuf just to be
able to trim, as string_list_split*() family now can trim while
splitting a string.

Retire the use of strbuf_split() from this code path.

Note that this loop is a bit sloppy in that it ensures that there
are at least two tokens on each line, but ignores any extra tokens
on the line.  Tightening it is outside the scope of this series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/notes.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index a9529b1696..4fb36a743c 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -375,18 +375,19 @@ static int notes_copy_from_stdin(int force, const char *rewrite_cmd)
 
 	while (strbuf_getline_lf(&buf, stdin) != EOF) {
 		struct object_id from_obj, to_obj;
-		struct strbuf **split;
+		struct string_list split = STRING_LIST_INIT_NODUP;
 		int err;
 
-		split = strbuf_split(&buf, ' ');
-		if (!split[0] || !split[1])
+		string_list_split_in_place_f(&split, buf.buf, " ", -1,
+					     STRING_LIST_SPLIT_TRIM);
+		if (split.nr < 2)
 			die(_("malformed input line: '%s'."), buf.buf);
-		strbuf_rtrim(split[0]);
-		strbuf_rtrim(split[1]);
-		if (repo_get_oid(the_repository, split[0]->buf, &from_obj))
-			die(_("failed to resolve '%s' as a valid ref."), split[0]->buf);
-		if (repo_get_oid(the_repository, split[1]->buf, &to_obj))
-			die(_("failed to resolve '%s' as a valid ref."), split[1]->buf);
+		if (repo_get_oid(the_repository, split.items[0].string, &from_obj))
+			die(_("failed to resolve '%s' as a valid ref."),
+			    split.items[0].string);
+		if (repo_get_oid(the_repository, split.items[1].string, &to_obj))
+			die(_("failed to resolve '%s' as a valid ref."),
+			    split.items[1].string);
 
 		if (rewrite_cmd)
 			err = copy_note_for_rewrite(c, &from_obj, &to_obj);
@@ -396,11 +397,11 @@ static int notes_copy_from_stdin(int force, const char *rewrite_cmd)
 
 		if (err) {
 			error(_("failed to copy notes from '%s' to '%s'"),
-			      split[0]->buf, split[1]->buf);
+			      split.items[0].string, split.items[1].string);
 			ret = 1;
 		}
 
-		strbuf_list_free(split);
+		string_list_clear(&split, 0);
 	}
 
 	if (!rewrite_cmd) {
-- 
2.50.1-633-g69dfdd50af



* [PATCH v3 08/12] config: do not use strbuf_split()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (6 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v3 07/12] notes: " Junio C Hamano
@ 2025-08-03  6:53       ` Junio C Hamano
  2025-08-03  6:53       ` [PATCH v3 09/12] environment: do not use strbuf_split*() Junio C Hamano
                         ` (3 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:53 UTC (permalink / raw)
  To: git

When parsing an old-style GIT_CONFIG_PARAMETERS environment
variable, the code parses key=value pairs by splitting them at '='
into an array of strbuf's.  As strbuf_split() leaves the delimiter
at the end of the split piece, the code has to manually trim it.

If we split with string_list_split(), that becomes unnecessary.
Retire the use of strbuf_split() from this code path.

Note that the max parameter of string_list_split() is of
an ergonomically iffy design---it specifies the maximum number of
times the function is allowed to split, which means that in order to
split a text into up to 2 pieces, you have to pass 1, not 2.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 config.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/config.c b/config.c
index 8a2d0b7916..1769f15ee3 100644
--- a/config.c
+++ b/config.c
@@ -638,31 +638,28 @@ int git_config_parse_parameter(const char *text,
 			       config_fn_t fn, void *data)
 {
 	const char *value;
-	struct strbuf **pair;
+	struct string_list pair = STRING_LIST_INIT_DUP;
 	int ret;
 	struct key_value_info kvi = KVI_INIT;
 
 	kvi_from_param(&kvi);
 
-	pair = strbuf_split_str(text, '=', 2);
-	if (!pair[0])
+	string_list_split(&pair, text, "=", 1);
+	if (!pair.nr)
 		return error(_("bogus config parameter: %s"), text);
 
-	if (pair[0]->len && pair[0]->buf[pair[0]->len - 1] == '=') {
-		strbuf_setlen(pair[0], pair[0]->len - 1);
-		value = pair[1] ? pair[1]->buf : "";
-	} else {
+	if (pair.nr == 1)
 		value = NULL;
-	}
+	else
+		value = pair.items[1].string;
 
-	strbuf_trim(pair[0]);
-	if (!pair[0]->len) {
-		strbuf_list_free(pair);
+	if (!*pair.items[0].string) {
+		string_list_clear(&pair, 0);
 		return error(_("bogus config parameter: %s"), text);
 	}
 
-	ret = config_parse_pair(pair[0]->buf, value, &kvi, fn, data);
-	strbuf_list_free(pair);
+	ret = config_parse_pair(pair.items[0].string, value, &kvi, fn, data);
+	string_list_clear(&pair, 0);
 	return ret;
 }
 
-- 
2.50.1-633-g69dfdd50af
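
The "pass 1 to get up to 2 pieces" point in the message above is easy
to get wrong, so here is a small standalone sketch of the same
convention.  split_at_most() is a hypothetical helper written for
this illustration, not git's string_list_split(); it only assumes the
"maximum number of splits, negative means unlimited" meaning that the
message describes.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: record the start of each piece of "text" in
 * pieces[], splitting in place at "delim" at most "maxsplit" times
 * (a negative value means no limit).  Returns the number of pieces,
 * which is therefore at most maxsplit + 1.
 */
static size_t split_at_most(char *text, char delim, int maxsplit,
			    char **pieces, size_t cap)
{
	size_t nr = 0;
	char *p = text;

	assert(text && pieces && cap > 0);
	for (;;) {
		char *end;

		pieces[nr++] = p;
		if (nr == cap || maxsplit == 0)
			break;
		end = strchr(p, delim);
		if (!end)
			break;
		*end = '\0';
		p = end + 1;
		if (maxsplit > 0)
			maxsplit--;
	}
	return nr;
}
```

Splitting "core.editor=vi=m" at '=' with maxsplit set to 1 gives two
pieces, "core.editor" and "vi=m"; a text without any '=' gives a
single piece, which corresponds to the "pair.nr == 1" case in the
patch.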



* [PATCH v3 09/12] environment: do not use strbuf_split*()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (7 preceding siblings ...)
  2025-08-03  6:53       ` [PATCH v3 08/12] config: do not use strbuf_split() Junio C Hamano
@ 2025-08-03  6:53       ` Junio C Hamano
  2025-08-03  6:53       ` [PATCH v3 10/12] sub-process: " Junio C Hamano
                         ` (2 subsequent siblings)
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:53 UTC (permalink / raw)
  To: git

environment.c:get_git_namespace() learns the raw namespace from an
environment variable, splits it at "/", and appends each piece after
"refs/namespaces/"; the reason why it splits first is so that an
empty string resulting from double slashes can be omitted.

The split pieces do not need to be edited in any way, so an array of
strbufs is a wrong data structure to use.  Instead split into a
string list and use the pieces from there.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 environment.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/environment.c b/environment.c
index 7c2480b22e..ab3ed08433 100644
--- a/environment.c
+++ b/environment.c
@@ -163,10 +163,10 @@ int have_git_dir(void)
 const char *get_git_namespace(void)
 {
 	static const char *namespace;
-
 	struct strbuf buf = STRBUF_INIT;
-	struct strbuf **components, **c;
 	const char *raw_namespace;
+	struct string_list components = STRING_LIST_INIT_DUP;
+	struct string_list_item *item;
 
 	if (namespace)
 		return namespace;
@@ -178,12 +178,17 @@ const char *get_git_namespace(void)
 	}
 
 	strbuf_addstr(&buf, raw_namespace);
-	components = strbuf_split(&buf, '/');
+
+	string_list_split(&components, buf.buf, "/", -1);
 	strbuf_reset(&buf);
-	for (c = components; *c; c++)
-		if (strcmp((*c)->buf, "/") != 0)
-			strbuf_addf(&buf, "refs/namespaces/%s", (*c)->buf);
-	strbuf_list_free(components);
+
+	for_each_string_list_item(item, &components) {
+		if (item->string[0])
+			strbuf_addf(&buf, "refs/namespaces/%s/", item->string);
+	}
+	string_list_clear(&components, 0);
+
+	strbuf_trim_trailing_dir_sep(&buf);
 	if (check_refname_format(buf.buf, 0))
 		die(_("bad git namespace path \"%s\""), raw_namespace);
 	strbuf_addch(&buf, '/');
-- 
2.50.1-633-g69dfdd50af
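
The namespace expansion described in the message above can be
sketched without strbuf or string_list at all.  expand_namespace() is
a hypothetical function for illustration only; unlike the patch, it
keeps the trailing slash and does not call check_refname_format().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: prefix every non-empty '/'-separated component
 * of "raw" with "refs/namespaces/" and write the result to out[].
 * Empty components (from doubled slashes) are skipped.
 * Returns 0 on success, -1 if out[] is too small.
 */
static int expand_namespace(const char *raw, char *out, size_t outsz)
{
	size_t used = 0;

	assert(raw && out && outsz > 0);
	out[0] = '\0';
	while (*raw) {
		size_t len = strcspn(raw, "/");

		if (len) {
			int n = snprintf(out + used, outsz - used,
					 "refs/namespaces/%.*s/",
					 (int)len, raw);
			if (n < 0 || (size_t)n >= outsz - used)
				return -1;
			used += n;
		}
		raw += len;
		if (*raw == '/')
			raw++;
	}
	return 0;
}
```

Given "foo//bar", the sketch produces
"refs/namespaces/foo/refs/namespaces/bar/"; the empty component
between the two slashes contributes nothing.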



* [PATCH v3 10/12] sub-process: do not use strbuf_split*()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (8 preceding siblings ...)
  2025-08-03  6:53       ` [PATCH v3 09/12] environment: do not use strbuf_split*() Junio C Hamano
@ 2025-08-03  6:53       ` Junio C Hamano
  2025-08-03  6:53       ` [PATCH v3 11/12] trace2: trim_trailing_newline followed by trim is a no-op Junio C Hamano
  2025-08-03  6:53       ` [PATCH v3 12/12] trace2: do not use strbuf_split*() Junio C Hamano
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:53 UTC (permalink / raw)
  To: git

The code to read status from subprocess reads one packet line and
tries to find "status=<foo>".  It is way overkill to split the line
into an array of two strbufs to extract <foo>.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sub-process.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/sub-process.c b/sub-process.c
index 1daf5a9752..83bf0a0e82 100644
--- a/sub-process.c
+++ b/sub-process.c
@@ -30,23 +30,20 @@ struct subprocess_entry *subprocess_find_entry(struct hashmap *hashmap, const ch
 
 int subprocess_read_status(int fd, struct strbuf *status)
 {
-	struct strbuf **pair;
-	char *line;
 	int len;
 
 	for (;;) {
+		char *line;
+		const char *value;
+
 		len = packet_read_line_gently(fd, NULL, &line);
 		if ((len < 0) || !line)
 			break;
-		pair = strbuf_split_str(line, '=', 2);
-		if (pair[0] && pair[0]->len && pair[1]) {
+		if (skip_prefix(line, "status=", &value)) {
 			/* the last "status=<foo>" line wins */
-			if (!strcmp(pair[0]->buf, "status=")) {
-				strbuf_reset(status);
-				strbuf_addbuf(status, pair[1]);
-			}
+			strbuf_reset(status);
+			strbuf_addstr(status, value);
 		}
-		strbuf_list_free(pair);
 	}
 
 	return (len < 0) ? len : 0;
-- 
2.50.1-633-g69dfdd50af
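
For readers unfamiliar with the helper used in the patch above, the
following sketch shows the prefix-skipping idea in isolation.
skip_prefix_sketch() is a stand-in written for this note; it is
assumed to mirror the calling convention of git's skip_prefix(), but
it is not the real implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * Stand-in with the same shape as the helper used in the patch: if
 * "str" begins with "prefix", store a pointer to the remainder in
 * *out and return 1; otherwise leave *out alone and return 0.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	assert(str && out);
	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}
```

For the line "status=success" and the prefix "status=", the remainder
handed back is "success", with no allocation and no array of
temporaries to free afterwards.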



* [PATCH v3 11/12] trace2: trim_trailing_newline followed by trim is a no-op
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (9 preceding siblings ...)
  2025-08-03  6:53       ` [PATCH v3 10/12] sub-process: " Junio C Hamano
@ 2025-08-03  6:53       ` Junio C Hamano
  2025-08-03  6:53       ` [PATCH v3 12/12] trace2: do not use strbuf_split*() Junio C Hamano
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:53 UTC (permalink / raw)
  To: git

strbuf_trim_trailing_newline() removes a LF or a CRLF from the tail
of a string.  If the code plans to call strbuf_trim() immediately
after doing so, the code is better off skipping the EOL trimming in
the first place.  After all, LF/CRLF at the end is a mere special
case of whitespaces at the end of the string, which will be removed
by strbuf_rtrim() anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 trace2/tr2_cfg.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/trace2/tr2_cfg.c b/trace2/tr2_cfg.c
index 22a99a0682..2b7cfcd10c 100644
--- a/trace2/tr2_cfg.c
+++ b/trace2/tr2_cfg.c
@@ -39,7 +39,6 @@ static int tr2_cfg_load_patterns(void)
 
 		if (buf->len && buf->buf[buf->len - 1] == ',')
 			strbuf_setlen(buf, buf->len - 1);
-		strbuf_trim_trailing_newline(*s);
 		strbuf_trim(*s);
 	}
 
@@ -78,7 +77,6 @@ static int tr2_load_env_vars(void)
 
 		if (buf->len && buf->buf[buf->len - 1] == ',')
 			strbuf_setlen(buf, buf->len - 1);
-		strbuf_trim_trailing_newline(*s);
 		strbuf_trim(*s);
 	}
 
-- 
2.50.1-633-g69dfdd50af
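
The reasoning in the message above (a trailing LF or CRLF is just
whitespace at the end) can be checked with a tiny standalone sketch.
rtrim_len() is a hypothetical helper, not strbuf_rtrim(); it assumes
that the right-trim treats CR and LF as whitespace, as isspace() does
in the C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: return the length "s" would have after all
 * trailing whitespace (including CR and LF) is removed.
 */
static size_t rtrim_len(const char *s, size_t len)
{
	assert(s);
	while (len && isspace((unsigned char)s[len - 1]))
		len--;
	return len;
}
```

Both "key \r\n" and "key " come out with the same trimmed length
here, so stripping the end-of-line first would not change the result.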



* [PATCH v3 12/12] trace2: do not use strbuf_split*()
  2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
                         ` (10 preceding siblings ...)
  2025-08-03  6:53       ` [PATCH v3 11/12] trace2: trim_trailing_newline followed by trim is a no-op Junio C Hamano
@ 2025-08-03  6:53       ` Junio C Hamano
  11 siblings, 0 replies; 72+ messages in thread
From: Junio C Hamano @ 2025-08-03  6:53 UTC (permalink / raw)
  To: git

tr2_cfg_load_patterns() and tr2_load_env_vars() functions are
functions with very similar structure that each reads an environment
variable, splits its value at the ',' boundaries, and trims the
resulting string pieces into an array of strbufs.

But the code paths that later use these strbufs take no advantage of
the strbuf-ness of the result (they do not benefit from <ptr,len>
representation to avoid having to run strlen(<ptr>), for example).

Simplify the code by teaching these functions to split into a string
list instead; even the trimming comes for free ;-).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 trace2/tr2_cfg.c | 78 +++++++++++++++++-------------------------------
 1 file changed, 27 insertions(+), 51 deletions(-)

diff --git a/trace2/tr2_cfg.c b/trace2/tr2_cfg.c
index 2b7cfcd10c..bbcfeda60a 100644
--- a/trace2/tr2_cfg.c
+++ b/trace2/tr2_cfg.c
@@ -8,87 +8,65 @@
 #include "trace2/tr2_sysenv.h"
 #include "wildmatch.h"
 
-static struct strbuf **tr2_cfg_patterns;
-static int tr2_cfg_count_patterns;
+static struct string_list tr2_cfg_patterns = STRING_LIST_INIT_DUP;
 static int tr2_cfg_loaded;
 
-static struct strbuf **tr2_cfg_env_vars;
-static int tr2_cfg_env_vars_count;
+static struct string_list tr2_cfg_env_vars = STRING_LIST_INIT_DUP;
 static int tr2_cfg_env_vars_loaded;
 
 /*
  * Parse a string containing a comma-delimited list of config keys
- * or wildcard patterns into a list of strbufs.
+ * or wildcard patterns into a string list.
  */
-static int tr2_cfg_load_patterns(void)
+static size_t tr2_cfg_load_patterns(void)
 {
-	struct strbuf **s;
 	const char *envvar;
 
 	if (tr2_cfg_loaded)
-		return tr2_cfg_count_patterns;
+		return tr2_cfg_patterns.nr;
 	tr2_cfg_loaded = 1;
 
 	envvar = tr2_sysenv_get(TR2_SYSENV_CFG_PARAM);
 	if (!envvar || !*envvar)
-		return tr2_cfg_count_patterns;
+		return tr2_cfg_patterns.nr;
 
-	tr2_cfg_patterns = strbuf_split_buf(envvar, strlen(envvar), ',', -1);
-	for (s = tr2_cfg_patterns; *s; s++) {
-		struct strbuf *buf = *s;
-
-		if (buf->len && buf->buf[buf->len - 1] == ',')
-			strbuf_setlen(buf, buf->len - 1);
-		strbuf_trim(*s);
-	}
-
-	tr2_cfg_count_patterns = s - tr2_cfg_patterns;
-	return tr2_cfg_count_patterns;
+	string_list_split_f(&tr2_cfg_patterns, envvar, ",", -1,
+			    STRING_LIST_SPLIT_TRIM);
+	return tr2_cfg_patterns.nr;
 }
 
 void tr2_cfg_free_patterns(void)
 {
-	if (tr2_cfg_patterns)
-		strbuf_list_free(tr2_cfg_patterns);
-	tr2_cfg_count_patterns = 0;
+	if (tr2_cfg_patterns.nr)
+		string_list_clear(&tr2_cfg_patterns, 0);
 	tr2_cfg_loaded = 0;
 }
 
 /*
  * Parse a string containing a comma-delimited list of environment variable
- * names into a list of strbufs.
+ * names into a string list.
  */
-static int tr2_load_env_vars(void)
+static size_t tr2_load_env_vars(void)
 {
-	struct strbuf **s;
 	const char *varlist;
 
 	if (tr2_cfg_env_vars_loaded)
-		return tr2_cfg_env_vars_count;
+		return tr2_cfg_env_vars.nr;
 	tr2_cfg_env_vars_loaded = 1;
 
 	varlist = tr2_sysenv_get(TR2_SYSENV_ENV_VARS);
 	if (!varlist || !*varlist)
-		return tr2_cfg_env_vars_count;
-
-	tr2_cfg_env_vars = strbuf_split_buf(varlist, strlen(varlist), ',', -1);
-	for (s = tr2_cfg_env_vars; *s; s++) {
-		struct strbuf *buf = *s;
-
-		if (buf->len && buf->buf[buf->len - 1] == ',')
-			strbuf_setlen(buf, buf->len - 1);
-		strbuf_trim(*s);
-	}
+		return tr2_cfg_env_vars.nr;
 
-	tr2_cfg_env_vars_count = s - tr2_cfg_env_vars;
-	return tr2_cfg_env_vars_count;
+	string_list_split_f(&tr2_cfg_env_vars, varlist, ",", -1,
+			    STRING_LIST_SPLIT_TRIM);
+	return tr2_cfg_env_vars.nr;
 }
 
 void tr2_cfg_free_env_vars(void)
 {
-	if (tr2_cfg_env_vars)
-		strbuf_list_free(tr2_cfg_env_vars);
-	tr2_cfg_env_vars_count = 0;
+	if (tr2_cfg_env_vars.nr)
+		string_list_clear(&tr2_cfg_env_vars, 0);
 	tr2_cfg_env_vars_loaded = 0;
 }
 
@@ -103,12 +81,11 @@ struct tr2_cfg_data {
 static int tr2_cfg_cb(const char *key, const char *value,
 		      const struct config_context *ctx, void *d)
 {
-	struct strbuf **s;
+	struct string_list_item *item;
 	struct tr2_cfg_data *data = (struct tr2_cfg_data *)d;
 
-	for (s = tr2_cfg_patterns; *s; s++) {
-		struct strbuf *buf = *s;
-		int wm = wildmatch(buf->buf, key, WM_CASEFOLD);
+	for_each_string_list_item(item, &tr2_cfg_patterns) {
+		int wm = wildmatch(item->string, key, WM_CASEFOLD);
 		if (wm == WM_MATCH) {
 			trace2_def_param_fl(data->file, data->line, key, value,
 					    ctx->kvi);
@@ -130,17 +107,16 @@ void tr2_cfg_list_config_fl(const char *file, int line)
 void tr2_list_env_vars_fl(const char *file, int line)
 {
 	struct key_value_info kvi = KVI_INIT;
-	struct strbuf **s;
+	struct string_list_item *item;
 
 	kvi_from_param(&kvi);
 	if (tr2_load_env_vars() <= 0)
 		return;
 
-	for (s = tr2_cfg_env_vars; *s; s++) {
-		struct strbuf *buf = *s;
-		const char *val = getenv(buf->buf);
+	for_each_string_list_item(item, &tr2_cfg_env_vars) {
+		const char *val = getenv(item->string);
 		if (val && *val)
-			trace2_def_param_fl(file, line, buf->buf, val, &kvi);
+			trace2_def_param_fl(file, line, item->string, val, &kvi);
 	}
 }
 
-- 
2.50.1-633-g69dfdd50af
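
The split-and-trim operation that the commit message above relies on
can be illustrated outside of git with a minimal standalone sketch in
plain C.  This is not git's implementation of string_list_split_f();
the names piece_list, piece_list_append_trimmed(), split_and_trim()
and piece_list_clear() are hypothetical and exist only for this
illustration.  The sketch assumes a single-byte delimiter and that
"trim" means removing leading and trailing whitespace as classified
by isspace().

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string list; not git's struct string_list. */
struct piece_list {
	char **items;
	size_t nr;
};

/*
 * Append a whitespace-trimmed copy of the bytes in [start, end).
 * Returns 0 on success, -1 if an allocation fails.
 */
static int piece_list_append_trimmed(struct piece_list *list,
				     const char *start, const char *end)
{
	char **grown;
	char *copy;
	size_t len;

	while (start < end && isspace((unsigned char)*start))
		start++;
	while (end > start && isspace((unsigned char)end[-1]))
		end--;
	len = (size_t)(end - start);

	copy = malloc(len + 1);
	if (!copy)
		return -1;
	memcpy(copy, start, len);
	copy[len] = '\0';

	grown = realloc(list->items, (list->nr + 1) * sizeof(*grown));
	if (!grown) {
		free(copy);
		return -1;
	}
	list->items = grown;
	list->items[list->nr++] = copy;
	return 0;
}

/* Split text at every occurrence of delim, trimming each piece. */
static int split_and_trim(struct piece_list *list, const char *text,
			  char delim)
{
	const char *start = text;

	assert(list && text);
	for (;;) {
		const char *end = strchr(start, delim);

		if (!end)
			end = start + strlen(start); /* last piece */
		if (piece_list_append_trimmed(list, start, end) < 0)
			return -1;
		if (!*end)
			return 0;
		start = end + 1; /* skip the delimiter */
	}
}

/* Release every piece and reset the list to its empty state. */
static void piece_list_clear(struct piece_list *list)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		free(list->items[i]);
	free(list->items);
	list->items = NULL;
	list->nr = 0;
}
```

Starting from a zero-initialized struct piece_list, calling
split_and_trim(&list, " a , b*,c ", ',') leaves the three pieces
"a", "b*" and "c" in list.items, with list.nr set to 3.  Because each
piece is a plain NUL-terminated string, a caller can pass it directly
to functions such as wildmatch() or getenv(), which is the property
the patch above takes advantage of.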



* Re: [PATCH v4 0/7] string_list_split*() updates
  2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
                         ` (6 preceding siblings ...)
  2025-08-03  6:52       ` [PATCH v4 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
@ 2025-08-04  6:24       ` Patrick Steinhardt
  7 siblings, 0 replies; 72+ messages in thread
From: Patrick Steinhardt @ 2025-08-04  6:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sat, Aug 02, 2025 at 11:52:16PM -0700, Junio C Hamano wrote:
> Relative to the v3 iteration, the v4 iteration explains the history
> behind string_list_split_in_place() in a bit more detail, and
> expands in-code comment to clarify what the verb "trim" means in the
> context of STRING_LIST_SPLIT_TRIM.

Thanks, this version looks good to me.

Patrick


Thread overview: 72+ messages
2025-07-31  6:39 [PATCH 0/5] string_list_split*() updates Junio C Hamano
2025-07-31  6:39 ` [PATCH 1/5] string-list: report programming error with BUG Junio C Hamano
2025-07-31 19:33   ` Eric Sunshine
2025-07-31 22:16     ` Junio C Hamano
2025-07-31  6:39 ` [PATCH 2/5] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
2025-07-31 19:36   ` Eric Sunshine
2025-07-31  6:39 ` [PATCH 3/5] string-list: unify string_list_split* functions Junio C Hamano
2025-07-31  6:39 ` [PATCH 4/5] string-list: optionally trim string pieces split by string_list_split() Junio C Hamano
2025-07-31  6:39 ` [PATCH 5/5] diff: simplify parsing of diff.colormovedws Junio C Hamano
2025-07-31 19:45   ` Eric Sunshine
2025-07-31 22:45 ` [PATCH v2 0/7] string_list_split*() updates Junio C Hamano
2025-07-31 22:46   ` [PATCH v2 1/7] string-list: report programming error with BUG Junio C Hamano
2025-07-31 22:46   ` [PATCH v2 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
2025-08-01  2:33     ` shejialuo
2025-08-01  3:43       ` Junio C Hamano
2025-08-01  3:55         ` shejialuo
2025-08-01 23:10           ` Junio C Hamano
2025-07-31 22:46   ` [PATCH v2 3/7] string-list: unify string_list_split* functions Junio C Hamano
2025-08-01  3:00     ` shejialuo
2025-07-31 22:46   ` [PATCH v2 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
2025-08-01  3:18     ` shejialuo
2025-08-01  3:47       ` Junio C Hamano
2025-08-01  4:04         ` shejialuo
2025-08-01 23:09           ` Junio C Hamano
2025-08-02  1:51             ` shejialuo
2025-08-01  8:47     ` Patrick Steinhardt
2025-08-01 16:26       ` Junio C Hamano
2025-07-31 22:46   ` [PATCH v2 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
2025-08-01  8:47     ` Patrick Steinhardt
2025-07-31 22:46   ` [PATCH v2 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
2025-07-31 22:54     ` Eric Sunshine
2025-08-01  3:33     ` shejialuo
2025-08-01  8:47     ` Patrick Steinhardt
2025-08-01 16:38       ` Junio C Hamano
2025-07-31 22:46   ` [PATCH v2 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
2025-08-01  8:47     ` Patrick Steinhardt
2025-08-01 22:04   ` [PATCH v3 0/7] string_list_split*() updates Junio C Hamano
2025-08-01 22:04     ` [PATCH v3 1/7] string-list: report programming error with BUG Junio C Hamano
2025-08-01 22:04     ` [PATCH v3 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
2025-08-02  8:22       ` Jeff King
2025-08-02 16:34         ` Junio C Hamano
2025-08-02 18:38           ` Jeff King
2025-08-01 22:04     ` [PATCH v3 3/7] string-list: unify string_list_split* functions Junio C Hamano
2025-08-01 22:04     ` [PATCH v3 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
2025-08-02  8:26       ` Jeff King
2025-08-02 16:38         ` Junio C Hamano
2025-08-02 18:39           ` Jeff King
2025-08-01 22:04     ` [PATCH v3 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
2025-08-01 22:04     ` [PATCH v3 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
2025-08-01 22:04     ` [PATCH v3 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
2025-08-03  6:52     ` [PATCH v4 0/7] string_list_split*() updates Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 1/7] string-list: report programming error with BUG Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 2/7] string-list: align string_list_split() with its _in_place() counterpart Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 3/7] string-list: unify string_list_split* functions Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 4/7] string-list: optionally trim string pieces split by string_list_split*() Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 5/7] diff: simplify parsing of diff.colormovedws Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 6/7] string-list: optionally omit empty string pieces in string_list_split*() Junio C Hamano
2025-08-03  6:52       ` [PATCH v4 7/7] string-list: split-then-remove-empty can be done while splitting Junio C Hamano
2025-08-04  6:24       ` [PATCH v4 0/7] string_list_split*() updates Patrick Steinhardt
2025-08-03  6:52     ` [PATCH v3 00/12] do not overuse strbuf_split*() Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 01/12] wt-status: avoid strbuf_split*() Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 02/12] clean: do not pass strbuf by value Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 03/12] clean: do not use strbuf_split*() [part 1] Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 04/12] clean: do not pass the whole structure when it is not necessary Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 05/12] clean: do not use strbuf_split*() [part 2] Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 06/12] merge-tree: do not use strbuf_split*() Junio C Hamano
2025-08-03  6:52       ` [PATCH v3 07/12] notes: " Junio C Hamano
2025-08-03  6:53       ` [PATCH v3 08/12] config: do not use strbuf_split() Junio C Hamano
2025-08-03  6:53       ` [PATCH v3 09/12] environment: do not use strbuf_split*() Junio C Hamano
2025-08-03  6:53       ` [PATCH v3 10/12] sub-process: " Junio C Hamano
2025-08-03  6:53       ` [PATCH v3 11/12] trace2: trim_trailing_newline followed by trim is a no-op Junio C Hamano
2025-08-03  6:53       ` [PATCH v3 12/12] trace2: do not use strbuf_split*() Junio C Hamano
