git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "René Scharfe" <l.s.r@web.de>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [RFC PATCH 3/5] strvec API: add a "STRVEC_INIT_NODUP"
Date: Thu, 15 Dec 2022 10:11:09 +0100
Message-ID: <RFC-patch-3.5-16c20baf5ec-20221215T090226Z-avarab@gmail.com>
In-Reply-To: <RFC-cover-0.5-00000000000-20221215T090226Z-avarab@gmail.com>

We have various tricky cases where we'll leak the contents of a
"struct strvec" even when we call strvec_clear(). These happen because
we call setup_revisions(), parse_options() etc., which munge our "v"
member, so strvec_clear() no longer sees every string we pushed.
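
To illustrate the mechanism described above, here is a minimal,
self-contained sketch. It does not use git's code: "struct leak_vec",
leak_vec_push() and toy_parse() are hypothetical stand-ins, and
toy_parse() only assumes that an option parser may compact argv in
place. It counts how many pushed strings a "free v[0..nr)" loop could
no longer reach after parsing:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct strvec"; not git's definition. */
struct leak_vec {
	char **v;
	size_t nr;
};

/* Push a heap copy of "s", keeping "v" NULL-terminated. */
static void leak_vec_push(struct leak_vec *vec, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	assert(copy);
	memcpy(copy, s, len);
	vec->v = realloc(vec->v, (vec->nr + 2) * sizeof(*vec->v));
	assert(vec->v);
	vec->v[vec->nr++] = copy;
	vec->v[vec->nr] = NULL;
}

/*
 * Hypothetical parser: consumes every element that starts with '-'
 * by compacting argv in place, then returns the new argc.
 */
static int toy_parse(int argc, char **argv)
{
	int i, kept = 0;

	for (i = 0; i < argc; i++)
		if (argv[i][0] != '-')
			argv[kept++] = argv[i];
	argv[kept] = NULL;
	return kept;
}

/*
 * Returns how many pushed strings a "free v[0..nr)" loop could no
 * longer reach after parsing. Everything is freed through a snapshot
 * so that the demo itself stays leak-free.
 */
static size_t count_unreachable_after_parse(void)
{
	struct leak_vec vec = { NULL, 0 };
	char *snapshot[2];
	size_t i, j, lost = 0;

	leak_vec_push(&vec, "--verbose");
	leak_vec_push(&vec, "file.txt");
	for (i = 0; i < vec.nr; i++)
		snapshot[i] = vec.v[i];

	toy_parse((int)vec.nr, vec.v);

	for (i = 0; i < 2; i++) {
		for (j = 0; j < vec.nr; j++)
			if (vec.v[j] == snapshot[i])
				break;
		if (j == vec.nr)
			lost++;
	}
	for (i = 0; i < 2; i++)
		free(snapshot[i]);
	free(vec.v);
	return lost;
}
```

In this sketch the "--verbose" copy is overwritten in "v" by the
parser, so a clear function that only walks "v" would never free it.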

There are various potential ways to deal with that; see the extensive
on-list discussion at [1]. One way would be to pass a flag to ask the
underlying API to free() these, as was done for setup_revisions() in
[2].

But we don't need that complexity for the many common cases that only
push fixed strings to the "struct strvec". Let's instead add a flag
analogous to the "strdup_strings" flag in the "struct string_list". A
subsequent commit will make use of this API.
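
As a rough illustration of the flag's semantics, the following
standalone sketch mimics the behavior with a hypothetical
"struct toy_strvec" and toy_push()/toy_clear() helpers. These are
assumptions made for the example, not git's strvec implementation; the
patch below is authoritative:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of "struct strvec" with the new flag. */
struct toy_strvec {
	const char **v;
	size_t nr;
	size_t alloc;
	unsigned int nodup_strings:1;
};

/* Store "value" as-is in nodup mode, otherwise store a heap copy. */
static const char *toy_push(struct toy_strvec *vec, const char *value)
{
	const char *to_push = value;

	if (!vec->nodup_strings) {
		size_t len = strlen(value) + 1;
		char *copy = malloc(len);

		assert(copy);
		memcpy(copy, value, len);
		to_push = copy;
	}
	/* Keep room for the new element plus the terminating NULL. */
	if (vec->nr + 1 >= vec->alloc) {
		vec->alloc = vec->alloc ? 2 * vec->alloc : 8;
		vec->v = realloc(vec->v, vec->alloc * sizeof(*vec->v));
		assert(vec->v);
	}
	vec->v[vec->nr++] = to_push;
	vec->v[vec->nr] = NULL;
	return to_push;
}

/* Free the array; free the elements only if we own them. */
static void toy_clear(struct toy_strvec *vec)
{
	size_t i;

	for (i = 0; !vec->nodup_strings && i < vec->nr; i++)
		free((char *)vec->v[i]);
	free(vec->v);
	vec->v = NULL;
	vec->nr = vec->alloc = 0;
}
```

In nodup mode the caller must keep the pushed strings alive for as
long as the vector is in use, which holds trivially for string
literals. In exchange, it no longer matters whether a parser has
dropped or reordered elements of "v" before the clear.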

Implementation notes: BUG_unless_dup() is implemented as a macro so
that BUG() reports the line number of the caller. The "nodup_strings"
flag could have been named "strdup_strings" for consistency with the
"struct string_list" API, but to do so we'd have to be confident that
we've spotted all callers that assume that they can memset() a "struct
strvec" to zero. With the inverted name, a zeroed flag keeps the
existing duplicating behavior.
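
The line-number point can be seen in isolation with a small sketch.
TOY_BUG_LINE() is a hypothetical stand-in for a reporter that, like
BUG(), picks up __LINE__ where it is expanded; the guard names are
likewise made up for this example:

```c
#include <assert.h>

/* Hypothetical reporter: evaluates to the line where it is expanded. */
#define TOY_BUG_LINE() (__LINE__)

/* As a function, the reporter always expands on the line below. */
static int guard_as_function(void)
{
	return TOY_BUG_LINE();
}

/* As a macro, the reporter expands at each call site instead. */
#define guard_as_macro() TOY_BUG_LINE()

static void show_difference(void)
{
	int here = __LINE__ + 1; /* line of the next statement */
	assert(guard_as_macro() == here);
	assert(guard_as_function() != here);
}
```

A guard written as a function would thus always blame the helper
itself, while the macro form points at the function that misused the
API.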

1. https://lore.kernel.org/git/221214.86ilie48cv.gmgdl@evledraar.gmail.com/
2. f92dbdbc6a8 (revisions API: don't leak memory on argv elements that
   need free()-ing, 2022-08-02)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 strvec.c | 20 ++++++++++++++++++--
 strvec.h | 30 +++++++++++++++++++++++++++++-
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/strvec.c b/strvec.c
index 61a76ce6cb9..721f8e94a50 100644
--- a/strvec.c
+++ b/strvec.c
@@ -10,6 +10,16 @@ void strvec_init(struct strvec *array)
 	memcpy(array, &blank, sizeof(*array));
 }
 
+void strvec_init_nodup(struct strvec *array)
+{
+	struct strvec blank = STRVEC_INIT_NODUP;
+	memcpy(array, &blank, sizeof(*array));
+}
+
+#define BUG_unless_dup(array, fn) \
+	if ((array)->nodup_strings) \
+		BUG("cannot %s() on a 'STRVEC_INIT_NODUP' strvec", (fn))
+
 static void strvec_push_nodup(struct strvec *array, const char *value)
 {
 	if (array->v == empty_strvec)
@@ -22,7 +32,9 @@ static void strvec_push_nodup(struct strvec *array, const char *value)
 
 const char *strvec_push(struct strvec *array, const char *value)
 {
-	strvec_push_nodup(array, xstrdup(value));
+	const char *to_push = array->nodup_strings ? value : xstrdup(value);
+
+	strvec_push_nodup(array, to_push);
 	return array->v[array->nr - 1];
 }
 
@@ -31,6 +43,8 @@ const char *strvec_pushf(struct strvec *array, const char *fmt, ...)
 	va_list ap;
 	struct strbuf v = STRBUF_INIT;
 
+	BUG_unless_dup(array, "strvec_pushf");
+
 	va_start(ap, fmt);
 	strbuf_vaddf(&v, fmt, ap);
 	va_end(ap);
@@ -67,6 +81,8 @@ void strvec_pop(struct strvec *array)
 
 void strvec_split(struct strvec *array, const char *to_split)
 {
+	BUG_unless_dup(array, "strvec_split");
+
 	while (isspace(*to_split))
 		to_split++;
 	for (;;) {
@@ -89,7 +105,7 @@ void strvec_clear(struct strvec *array)
 {
 	if (array->v != empty_strvec) {
 		int i;
-		for (i = 0; i < array->nr; i++)
+		for (i = 0; !array->nodup_strings && i < array->nr; i++)
 			free((char *)array->v[i]);
 		free(array->v);
 	}
diff --git a/strvec.h b/strvec.h
index 9f55c8766ba..b122b87b369 100644
--- a/strvec.h
+++ b/strvec.h
@@ -26,29 +26,51 @@ extern const char *empty_strvec[];
  * member contains the actual array; the `nr` member contains the
  * number of elements in the array, not including the terminating
  * NULL.
+ *
+ * When using `STRVEC_INIT_NODUP` to initialize it, the `nodup_strings`
+ * member is set, and individual members of the "struct strvec" will
+ * not be free()'d by strvec_clear(). This is for fixed string
+ * arguments to parse_options() and others that might munge the "v"
+ * itself.
  */
 struct strvec {
 	const char **v;
 	size_t nr;
 	size_t alloc;
+	unsigned int nodup_strings:1;
 };
 
 #define STRVEC_INIT { \
 	.v = empty_strvec, \
 }
 
+#define STRVEC_INIT_NODUP { \
+	.v = empty_strvec, \
+	.nodup_strings = 1, \
+}
+
 /**
  * Initialize an array. This is no different than assigning from
  * `STRVEC_INIT`.
  */
 void strvec_init(struct strvec *);
 
+/**
+ * Initialize a "nodup" array. This is no different than assigning from
+ * `STRVEC_INIT_NODUP`.
+ */
+void strvec_init_nodup(struct strvec *);
+
 /* Push a copy of a string onto the end of the array. */
 const char *strvec_push(struct strvec *, const char *);
 
 /**
  * Format a string and push it onto the end of the array. This is a
  * convenience wrapper combining `strbuf_addf` and `strvec_push`.
+ *
+ * This is incompatible with arrays initialized with
+ * `STRVEC_INIT_NODUP`, as pushing the formatted string requires the
+ * equivalent of an xstrfmt().
  */
 __attribute__((format (printf,2,3)))
 const char *strvec_pushf(struct strvec *, const char *fmt, ...);
@@ -70,7 +92,13 @@ void strvec_pushv(struct strvec *, const char **);
  */
 void strvec_pop(struct strvec *);
 
-/* Splits by whitespace; does not handle quoted arguments! */
+/**
+ * Splits by whitespace; does not handle quoted arguments!
+ *
+ * This is incompatible with arrays initialized with
+ * `STRVEC_INIT_NODUP`, as pushing the elements requires an xstrndup()
+ * call.
+ */
 void strvec_split(struct strvec *, const char *);
 
 /**
-- 
2.39.0.rc2.1048.g0e5493b8d5b

