* [PATCH 0/9] ref-filter %(trailer) fixes
@ 2024-09-09 23:07 Jeff King
2024-09-09 23:08 ` [PATCH 1/9] t6300: drop newline from wrapped test title Jeff King
` (9 more replies)
0 siblings, 10 replies; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:07 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
This series fixes two bugs noticed by Brooke:
- parsing trailers from signed tags doesn't work, because the
signature confuses the trailers code
- multiple %(trailers) placeholders share some storage, so their
options may conflict
The fixes for those are in patches 3 and 5, respectively. The other
patches up to there are related cleanups and preparation.
When fixing the second one, I noticed an obvious memory leak, fixed in
patch 6. And then that made me wonder if that made t6300 leak-free. It
didn't, but patches 7-9 get it there.
So 6-9 could be taken as a separate series, but they do textually depend
on what came before.
[1/9]: t6300: drop newline from wrapped test title
[2/9]: ref-filter: avoid extra copies of payload/signature
[3/9]: ref-filter: strip signature when parsing tag trailers
[4/9]: ref-filter: drop useless cast in trailers_atom_parser()
[5/9]: ref-filter: store ref_trailer_buf data per-atom
[6/9]: ref-filter: fix leak of %(trailers) "argbuf"
[7/9]: ref-filter: fix leak with %(describe) arguments
[8/9]: ref-filter: fix leak when formatting %(push:remoteref)
[9/9]: ref-filter: add ref_format_clear() function
builtin/branch.c | 1 +
builtin/for-each-ref.c | 1 +
builtin/tag.c | 1 +
builtin/verify-tag.c | 1 +
ref-filter.c | 90 ++++++++++++++++++++++++++++++-----------
ref-filter.h | 3 ++
remote.c | 8 ++--
remote.h | 2 +-
t/t6300-for-each-ref.sh | 41 ++++++++++++++++++-
9 files changed, 118 insertions(+), 30 deletions(-)
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 1/9] t6300: drop newline from wrapped test title
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
@ 2024-09-09 23:08 ` Jeff King
2024-09-09 23:12 ` [PATCH 2/9] ref-filter: avoid extra copies of payload/signature Jeff King
` (8 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:08 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
We don't usually include newlines in test titles, because you get funny
TAP output like:
ok 417 - show good signature with custom format
ok 418 - show good signature with custom format
with ssh
ok 419 - signature atom with grade option and bad signature
where a TAP parser would ignore the extra line anyway, giving the wrong
title. This comes from 26c9c03f0a (ref-filter: add new "signature" atom,
2023-06-04), and I think it was probably just editor line wrapping.
I checked for other cases with:
git grep "test_expect_success [A-Z_,]* '[^']*$"
git grep 'test_expect_success [A-Z_,]* "[^"]*$'
but this was the only hit.
Signed-off-by: Jeff King <peff@peff.net>
---
Just an annoyance I noticed while running t6300 over and over.
t/t6300-for-each-ref.sh | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 8d15713cc6..7c208e20a6 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -2003,8 +2003,7 @@ test_expect_success GPG 'show good signature with custom format' '
--format="$GRADE_FORMAT" >actual &&
test_cmp expect actual
'
-test_expect_success GPGSSH 'show good signature with custom format
- with ssh' '
+test_expect_success GPGSSH 'show good signature with custom format with ssh' '
test_config gpg.ssh.allowedSignersFile "${GPGSSH_ALLOWED_SIGNERS}" &&
FINGERPRINT=$(ssh-keygen -lf "${GPGSSH_KEY_PRIMARY}" | awk "{print \$2;}") &&
cat >expect.tmpl <<-\EOF &&
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 2/9] ref-filter: avoid extra copies of payload/signature
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
2024-09-09 23:08 ` [PATCH 1/9] t6300: drop newline from wrapped test title Jeff King
@ 2024-09-09 23:12 ` Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-09 23:14 ` [PATCH 3/9] ref-filter: strip signature when parsing tag trailers Jeff King
` (7 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:12 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
When we know we're going to show the subject or body of a tag or commit,
we call find_subpos(), which returns pointers and lengths for the three
parts: subject, body, signature.
Oddly, the function finds the signature twice: once by calling
parse_signature() at the start, which copies the signature into a
separate strbuf, and then again by calling parse_signed_buffer() after
we've parsed past the subject.
This is due to 482c119186 (gpg-interface: improve interface for parsing
tags, 2021-02-11) and 88bce0e24c (ref-filter: hoist signature parsing,
2021-02-11). The idea is that in a multi-hash world, tag signatures may
appear in the header, rather than at the end of the body, in which case
we need to extract them into a separate buffer.
But parse_signature() would never find such a buffer! It only looks for
signature lines (like "-----BEGIN PGP") at the start of each line,
without any header keyword. So this code will never find anything except
the usual in-body signature.
And the extra code has two downsides:
1. We spend time copying the payload and signature into strbufs. That
might even be useful if we ended up with a NUL-terminated copy of
the payload data, but we throw it away immediately. And the
signature, since it comes at the end of the message, is already its
own NUL-terminated buffer.
The overhead isn't huge, but I measured a pretty consistent 1-2%
speedup running "git for-each-ref --format='%(subject)'" with this
patch on a clone of linux.git.
2. The output of find_subpos() is a set of three ptr/len combinations,
but only two of them point into the original buffer. This makes the
interface confusing: you can't do pointer comparisons between them,
and you have to remember to free the signature buffer. Since
there's only one caller, it's not too bad in practice, but it did
bite me while working on the next patch (and simplifying it will
pave the way for that).
In the long run we might have to go back to something like this
approach, if we do have multi-hash header signatures. But I would argue
that the extra buffer should kick in only for a header signature, and be
passed out of find_subpos() separately.
Signed-off-by: Jeff King <peff@peff.net>
---
This should produce no behavior change.
It does seem funny to me that we do this signature parsing for commits
as well as tags, even through the former would have an in-header
signature. So arguably we are wrongly cutting in-body signatures from
commits, and not showing their signatures with %(contents:signature).
(But note that %(signature) is its own thing and does handle commits
correctly).
So I don't know if we'd want to fix that or not, but I left it as-is in
this series. And as I said above, if we did want to go that way, I think
we'd still want to build it on top of this, so that find_subpos()
returns the in-header and in-body signatures separately.
ref-filter.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index b6c6c10127..0f5513ba7e 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1833,16 +1833,10 @@ static void find_subpos(const char *buf,
size_t *nonsiglen,
const char **sig, size_t *siglen)
{
- struct strbuf payload = STRBUF_INIT;
- struct strbuf signature = STRBUF_INIT;
const char *eol;
const char *end = buf + strlen(buf);
const char *sigstart;
- /* parse signature first; we might not even have a subject line */
- parse_signature(buf, end - buf, &payload, &signature);
- strbuf_release(&payload);
-
/* skip past header until we hit empty line */
while (*buf && *buf != '\n') {
eol = strchrnul(buf, '\n');
@@ -1853,8 +1847,10 @@ static void find_subpos(const char *buf,
/* skip any empty lines */
while (*buf == '\n')
buf++;
- *sig = strbuf_detach(&signature, siglen);
+ /* parse signature first; we might not even have a subject line */
sigstart = buf + parse_signed_buffer(buf, strlen(buf));
+ *sig = sigstart;
+ *siglen = end - *sig;
/* subject is first non-empty line */
*sub = buf;
@@ -2021,7 +2017,6 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
v->s = xstrdup(subpos);
}
- free((void *)sigpos);
}
/*
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 3/9] ref-filter: strip signature when parsing tag trailers
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
2024-09-09 23:08 ` [PATCH 1/9] t6300: drop newline from wrapped test title Jeff King
2024-09-09 23:12 ` [PATCH 2/9] ref-filter: avoid extra copies of payload/signature Jeff King
@ 2024-09-09 23:14 ` Jeff King
2024-09-10 6:08 ` Patrick Steinhardt
2024-09-09 23:14 ` [PATCH 4/9] ref-filter: drop useless cast in trailers_atom_parser() Jeff King
` (6 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:14 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
To expand the "%(trailers)" placeholder, we have to feed the commit or
tag body to the trailer API. But that API doesn't know anything about
signatures, and will be confused by a signed tag like this:
the subject
the body
Some-trailer: foo
-----BEGIN PGP SIGNATURE-----
...etc...
because it will start looking for trailers after the signature, and get
stopped walking backwards by the very non-trailer signature lines. So it
thinks there are no trailers.
This problem has existed since %(trailers) was added to the ref-filter
code, but back then trailers on tags weren't something we really
considered (commits don't have the same problem because their signatures
are embedded in the header). But since 066cef7707 (builtin/tag: add
--trailer option, 2024-05-05), we'd generate an object like the above
for "git tag -s --trailer 'Some-trailer: foo' my-tag".
The implementation here is pretty simple: we just make a NUL-terminated
copy of the non-signature part of the tag (which we've already parsed)
and pass it to the trailer API. There are some alternatives I rejected,
at least for now:
- the trailer code already understands skipping past some cruft at the
end of a commit, such as patch dividers. see find_end_of_log_message().
We could teach it to do the same for signatures. But since this is
the only context where we'd want that feature, and since we've already
parsed the object into subject/body/signature here, it seemed easier
to just pass in the truncated message.
- it would be nice if we could just pass in a pointer/len pair to the
trailer API (rather than a NUL-terminated string) to avoid the extra
copy. I think this is possible, since as noted above, the trailer
code already has to deal with ignoring some cruft at the end of the
input. But after an initial attempt at this, it got pretty messy, as
we have to touch a lot of intermediate functions that are also
called in other contexts.
So I went for the simple and stupid thing, at least for now. I don't
think the extra copy overhead will be all that bad. The previous
patch noted that an extra copy seemed to cause about 1-2% slowdown
for something simple like "%(subject)". But here we are only
triggering it for "%(trailers)" (and only when there is a
signature), and the trailer code is a bit allocation-heavy already.
I couldn't measure any difference formatting "%(trailers)" on
linux.git before and after (even though there are not even any
trailers to find).
Reported-by: Brooke Kuhlmann <brooke@alchemists.io>
Signed-off-by: Jeff King <peff@peff.net>
---
ref-filter.c | 10 +++++++++-
t/t6300-for-each-ref.sh | 18 ++++++++++++++++++
2 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/ref-filter.c b/ref-filter.c
index 0f5513ba7e..e39f505a81 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2008,9 +2008,17 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
v->s = strbuf_detach(&s, NULL);
} else if (atom->u.contents.option == C_TRAILERS) {
struct strbuf s = STRBUF_INIT;
+ const char *msg;
+ char *to_free = NULL;
+
+ if (siglen)
+ msg = to_free = xmemdupz(subpos, sigpos - subpos);
+ else
+ msg = subpos;
/* Format the trailer info according to the trailer_opts given */
- format_trailers_from_commit(&atom->u.contents.trailer_opts, subpos, &s);
+ format_trailers_from_commit(&atom->u.contents.trailer_opts, msg, &s);
+ free(to_free);
v->s = strbuf_detach(&s, NULL);
} else if (atom->u.contents.option == C_BARE)
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 7c208e20a6..b830b542c4 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1835,6 +1835,24 @@ sig_crlf="$(printf "%s" "$sig" | append_cr; echo dummy)"
sig_crlf=${sig_crlf%dummy}
test_atom refs/tags/fake-sig-crlf contents:signature "$sig_crlf"
+test_expect_success 'set up tag with signature and trailers' '
+ git tag -F - fake-sig-trailer <<-\EOF
+ this is the subject
+
+ this is the body
+
+ My-Trailer: foo
+ -----BEGIN PGP SIGNATURE-----
+
+ not a real signature, but we just care about the
+ subject/body/trailer parsing.
+ -----END PGP SIGNATURE-----
+ EOF
+'
+
+# use "separator=" here to suppress the terminating newline
+test_atom refs/tags/fake-sig-trailer trailers:separator= 'My-Trailer: foo'
+
test_expect_success 'git for-each-ref --stdin: empty' '
>in &&
git for-each-ref --format="%(refname)" --stdin <in >actual &&
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 4/9] ref-filter: drop useless cast in trailers_atom_parser()
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (2 preceding siblings ...)
2024-09-09 23:14 ` [PATCH 3/9] ref-filter: strip signature when parsing tag trailers Jeff King
@ 2024-09-09 23:14 ` Jeff King
2024-09-09 23:16 ` [PATCH 5/9] ref-filter: store ref_trailer_buf data per-atom Jeff King
` (5 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:14 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
There's no need to cast invalid_arg before freeing it. It is already a
non-const pointer.
Signed-off-by: Jeff King <peff@peff.net>
---
ref-filter.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/ref-filter.c b/ref-filter.c
index e39f505a81..4d0df546da 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -578,7 +578,7 @@ static int trailers_atom_parser(struct ref_format *format UNUSED,
strbuf_addf(err, _("expected %%(trailers:key=<value>)"));
else
strbuf_addf(err, _("unknown %%(trailers) argument: %s"), invalid_arg);
- free((char *)invalid_arg);
+ free(invalid_arg);
return -1;
}
}
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 5/9] ref-filter: store ref_trailer_buf data per-atom
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (3 preceding siblings ...)
2024-09-09 23:14 ` [PATCH 4/9] ref-filter: drop useless cast in trailers_atom_parser() Jeff King
@ 2024-09-09 23:16 ` Jeff King
2024-09-10 6:08 ` Patrick Steinhardt
2024-09-09 23:18 ` [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf" Jeff King
` (4 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:16 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
The trailer API takes options via a trailer_opts struct. Some of those
options point to data structures which require extra storage. Those
structures aren't actually embedded in the options struct, but rather we
pass pointers, and the caller is responsible for managing them. This is
a little convoluted, but makes sense since some of them are not even
concrete (e.g., you can pass a filter function and a void data pointer,
but the trailer code doesn't even know what's in the pointer).
When for-each-ref, etc, parse the %(trailers) placeholder, they stuff
the extra data into a ref_trailer_buf struct. But we only hold a single
static global instance of this struct. So if a format string has
multiple %(trailer) placeholders, they'll stomp on each other: the "key"
list will end up with entries for all of them, and the separator buffers
will use the values from whichever was parsed last.
Instead, we should have a ref_trailer_buf for each instance of the
placeholder, and store it alongside the trailer_opts in the used_atom
structure.
And that's what this patch does. Note that we also have to add code to
clean them up in ref_array_clear(). The original code did not bother
cleaning them up, but it wasn't technically a "leak" since they were
still reachable from the static global instance.
Reported-by: Brooke Kuhlmann <brooke@alchemists.io>
Signed-off-by: Jeff King <peff@peff.net>
---
ref-filter.c | 36 ++++++++++++++++++++++++++++++------
t/t6300-for-each-ref.sh | 19 +++++++++++++++++++
2 files changed, 49 insertions(+), 6 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 4d0df546da..ebddc041c7 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -75,11 +75,11 @@ struct refname_atom {
int lstrip, rstrip;
};
-static struct ref_trailer_buf {
+struct ref_trailer_buf {
struct string_list filter_list;
struct strbuf sepbuf;
struct strbuf kvsepbuf;
-} ref_trailer_buf = {STRING_LIST_INIT_NODUP, STRBUF_INIT, STRBUF_INIT};
+};
static struct expand_data {
struct object_id oid;
@@ -201,6 +201,7 @@ static struct used_atom {
enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH, C_LINES,
C_SIG, C_SUB, C_SUB_SANITIZE, C_TRAILERS } option;
struct process_trailer_options trailer_opts;
+ struct ref_trailer_buf *trailer_buf;
unsigned int nlines;
} contents;
struct {
@@ -568,12 +569,24 @@ static int trailers_atom_parser(struct ref_format *format UNUSED,
if (arg) {
const char *argbuf = xstrfmt("%s)", arg);
char *invalid_arg = NULL;
+ struct ref_trailer_buf *tb;
+
+ /*
+ * Do not inline these directly into the used_atom struct!
+ * When we parse them in format_set_trailers_options(),
+ * we will make pointer references directly to them,
+ * which will not survive a realloc() of the used_atom list.
+ * They must be allocated in a separate, stable struct.
+ */
+ atom->u.contents.trailer_buf = tb = xmalloc(sizeof(*tb));
+ string_list_init_nodup(&tb->filter_list);
+ strbuf_init(&tb->sepbuf, 0);
+ strbuf_init(&tb->kvsepbuf, 0);
if (format_set_trailers_options(&atom->u.contents.trailer_opts,
- &ref_trailer_buf.filter_list,
- &ref_trailer_buf.sepbuf,
- &ref_trailer_buf.kvsepbuf,
- &argbuf, &invalid_arg)) {
+ &tb->filter_list,
+ &tb->sepbuf, &tb->kvsepbuf,
+ &argbuf, &invalid_arg)) {
if (!invalid_arg)
strbuf_addf(err, _("expected %%(trailers:key=<value>)"));
else
@@ -2988,6 +3001,17 @@ void ref_array_clear(struct ref_array *array)
struct used_atom *atom = &used_atom[i];
if (atom->atom_type == ATOM_HEAD)
free(atom->u.head);
+ else if (atom->atom_type == ATOM_TRAILERS ||
+ (atom->atom_type == ATOM_CONTENTS &&
+ atom->u.contents.option == C_TRAILERS)) {
+ struct ref_trailer_buf *tb = atom->u.contents.trailer_buf;
+ if (tb) {
+ string_list_clear(&tb->filter_list, 0);
+ strbuf_release(&tb->sepbuf);
+ strbuf_release(&tb->kvsepbuf);
+ free(tb);
+ }
+ }
free((char *)atom->name);
}
FREE_AND_NULL(used_atom);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index b830b542c4..e8db612f95 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1560,6 +1560,25 @@ test_trailer_option '%(trailers:separator,key_value_separator) changes both sepa
Reviewed-by,A U Thor <author@example.com>,Signed-off-by,A U Thor <author@example.com>
EOF
+test_expect_success 'multiple %(trailers) use their own options' '
+ git tag -F - tag-with-trailers <<-\EOF &&
+ body
+
+ one: foo
+ one: bar
+ two: baz
+ two: qux
+ EOF
+ t1="%(trailers:key=one,key_value_separator=W,separator=X)" &&
+ t2="%(trailers:key=two,key_value_separator=Y,separator=Z)" &&
+ git for-each-ref --format="$t1%0a$t2" refs/tags/tag-with-trailers >actual &&
+ cat >expect <<-\EOF &&
+ oneWfooXoneWbar
+ twoYbazZtwoYqux
+ EOF
+ test_cmp expect actual
+'
+
test_failing_trailer_option () {
title=$1 option=$2
cat >expect
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf"
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (4 preceding siblings ...)
2024-09-09 23:16 ` [PATCH 5/9] ref-filter: store ref_trailer_buf data per-atom Jeff King
@ 2024-09-09 23:18 ` Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-09 23:19 ` [PATCH 7/9] ref-filter: fix leak with %(describe) arguments Jeff King
` (3 subsequent siblings)
9 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:18 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
When we parse a placeholder like "%(trailers:key=foo)", our atom parsing
function is passed just the argument string "key=foo". We duplicate this
into its own string, but never free it, causing a leak.
We do the duplication for two reasons:
1. There's a mismatch with the pretty.c trailer-formatting code that
we rely on. It expects to see a closing paren, like "key=foo)". So
we duplicate the argument string with that extra character to pass
along.
This is probably something we could fix in the long run, but it's
somewhat non-trivial if we want to avoid regressing error cases for
things like "git log --format='%(trailer:oops'". So let's accept
it as a necessity for now.
2. The argument parser expects to store the list of "key" entries
("foo" in this case) in a string-list. It also stores the length of
the string in the string-list "util" field. The original caller in
pretty.c uses this with a "nodup" string list to avoid making extra
copies, which creates a subtle dependency on the lifetime of the
original format string.
We do the same here, which creates that same dependency. So we
can't simply free it as soon as the parsing is done.
There are two possible solutions here. The first is to hold on to the
duplicated "argbuf" string in the used_atom struct, so that it lives as
long as the string_list which references it.
But I think a less-subtle solution, and what this patch does, is to
switch to a duplicating string_list. That makes it self-contained, and
lets us free argbuf immediately. It may involve a few extra allocations,
but this parsing is something that happens once per program, not once
per output ref.
This clears up one case that LSan finds in t6300, but there are more.
Signed-off-by: Jeff King <peff@peff.net>
---
ref-filter.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index ebddc041c7..54c5079dde 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -567,7 +567,8 @@ static int trailers_atom_parser(struct ref_format *format UNUSED,
atom->u.contents.trailer_opts.no_divider = 1;
if (arg) {
- const char *argbuf = xstrfmt("%s)", arg);
+ char *argbuf = xstrfmt("%s)", arg);
+ const char *arg = argbuf;
char *invalid_arg = NULL;
struct ref_trailer_buf *tb;
@@ -579,21 +580,23 @@ static int trailers_atom_parser(struct ref_format *format UNUSED,
* They must be allocated in a separate, stable struct.
*/
atom->u.contents.trailer_buf = tb = xmalloc(sizeof(*tb));
- string_list_init_nodup(&tb->filter_list);
+ string_list_init_dup(&tb->filter_list);
strbuf_init(&tb->sepbuf, 0);
strbuf_init(&tb->kvsepbuf, 0);
if (format_set_trailers_options(&atom->u.contents.trailer_opts,
&tb->filter_list,
&tb->sepbuf, &tb->kvsepbuf,
- &argbuf, &invalid_arg)) {
+ &arg, &invalid_arg)) {
if (!invalid_arg)
strbuf_addf(err, _("expected %%(trailers:key=<value>)"));
else
strbuf_addf(err, _("unknown %%(trailers) argument: %s"), invalid_arg);
free(invalid_arg);
+ free(argbuf);
return -1;
}
+ free(argbuf);
}
atom->u.contents.option = C_TRAILERS;
return 0;
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 7/9] ref-filter: fix leak with %(describe) arguments
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (5 preceding siblings ...)
2024-09-09 23:18 ` [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf" Jeff King
@ 2024-09-09 23:19 ` Jeff King
2024-09-09 23:19 ` [PATCH 8/9] ref-filter: fix leak when formatting %(push:remoteref) Jeff King
` (2 subsequent siblings)
9 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:19 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
When we parse a %(describe) placeholder, we stuff its arguments into a
strvec, which is then detached into the used_atom struct. But later,
when ref_array_clear() frees the atom, we never free the memory.
To solve this, we just need to add the appropriate free() calls. But
it's a little awkward, since we have to free each element of the array,
in addition to the array itself. Instead, let's store the actual strvec,
which lets us do a simple strvec_clear().
This clears up one case that LSan finds in t6300, but there are more.
Signed-off-by: Jeff King <peff@peff.net>
---
ref-filter.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 54c5079dde..370cc5b44a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -233,7 +233,7 @@ static struct used_atom {
enum { S_BARE, S_GRADE, S_SIGNER, S_KEY,
S_FINGERPRINT, S_PRI_KEY_FP, S_TRUST_LEVEL } option;
} signature;
- const char **describe_args;
+ struct strvec describe_args;
struct refname_atom refname;
char *head;
} u;
@@ -693,7 +693,7 @@ static int describe_atom_parser(struct ref_format *format UNUSED,
struct used_atom *atom,
const char *arg, struct strbuf *err)
{
- struct strvec args = STRVEC_INIT;
+ strvec_init(&atom->u.describe_args);
for (;;) {
int found = 0;
@@ -702,13 +702,12 @@ static int describe_atom_parser(struct ref_format *format UNUSED,
if (!arg || !*arg)
break;
- found = describe_atom_option_parser(&args, &arg, err);
+ found = describe_atom_option_parser(&atom->u.describe_args, &arg, err);
if (found < 0)
return found;
if (!found)
return err_bad_arg(err, "describe", bad_arg);
}
- atom->u.describe_args = strvec_detach(&args);
return 0;
}
@@ -1941,7 +1940,7 @@ static void grab_describe_values(struct atom_value *val, int deref,
cmd.git_cmd = 1;
strvec_push(&cmd.args, "describe");
- strvec_pushv(&cmd.args, atom->u.describe_args);
+ strvec_pushv(&cmd.args, atom->u.describe_args.v);
strvec_push(&cmd.args, oid_to_hex(&commit->object.oid));
if (pipe_command(&cmd, NULL, 0, &out, 0, &err, 0) < 0) {
error(_("failed to run 'describe'"));
@@ -3004,6 +3003,8 @@ void ref_array_clear(struct ref_array *array)
struct used_atom *atom = &used_atom[i];
if (atom->atom_type == ATOM_HEAD)
free(atom->u.head);
+ else if (atom->atom_type == ATOM_DESCRIBE)
+ strvec_clear(&atom->u.describe_args);
else if (atom->atom_type == ATOM_TRAILERS ||
(atom->atom_type == ATOM_CONTENTS &&
atom->u.contents.option == C_TRAILERS)) {
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 8/9] ref-filter: fix leak when formatting %(push:remoteref)
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (6 preceding siblings ...)
2024-09-09 23:19 ` [PATCH 7/9] ref-filter: fix leak with %(describe) arguments Jeff King
@ 2024-09-09 23:19 ` Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-09 23:21 ` [PATCH 9/9] ref-filter: add ref_format_clear() function Jeff King
2024-09-10 6:57 ` [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms Patrick Steinhardt
9 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:19 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
When we expand the %(upstream) or %(push) placeholders, we rely on
remote.c's remote_ref_for_branch() to fill in the ":refname" argument.
But that function has confusing memory ownership semantics: it may or
may not return an allocated string, depending on whether we are in
"upstream" mode or "push" mode. The caller in ref-filter.c always
duplicates the result, meaning that we leak the original in the case of
%(push:refname).
To solve this, let's make the return value from remote_ref_for_branch()
consistent, by always returning an allocated pointer. Note that the
switch to returning a non-const pointer has a ripple effect inside the
function, too. We were storing the "dst" result as a const pointer, too,
even though it is always allocated! It is the return value from
apply_refspecs(), which is always a non-const allocated string.
And then on the caller side in ref-filter.c (and this is the only caller
at all), we just need to avoid the extra duplication when the return
value is non-NULL.
This clears up one case that LSan finds in t6300, but there are more.
Signed-off-by: Jeff King <peff@peff.net>
---
ref-filter.c | 2 +-
remote.c | 8 ++++----
remote.h | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index 370cc5b44a..0f51095bbd 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2237,7 +2237,7 @@ static void fill_remote_ref_details(struct used_atom *atom, const char *refname,
const char *merge;
merge = remote_ref_for_branch(branch, atom->u.remote_ref.push);
- *s = xstrdup(merge ? merge : "");
+ *s = merge ? merge : xstrdup("");
} else
BUG("unhandled RR_* enum");
}
diff --git a/remote.c b/remote.c
index 8f3dee1318..539e5ceae3 100644
--- a/remote.c
+++ b/remote.c
@@ -632,19 +632,19 @@ const char *pushremote_for_branch(struct branch *branch, int *explicit)
static struct remote *remotes_remote_get(struct remote_state *remote_state,
const char *name);
-const char *remote_ref_for_branch(struct branch *branch, int for_push)
+char *remote_ref_for_branch(struct branch *branch, int for_push)
{
read_config(the_repository, 0);
die_on_missing_branch(the_repository, branch);
if (branch) {
if (!for_push) {
if (branch->merge_nr) {
- return branch->merge_name[0];
+ return xstrdup(branch->merge_name[0]);
}
} else {
- const char *dst,
- *remote_name = remotes_pushremote_for_branch(
+ char *dst;
+ const char *remote_name = remotes_pushremote_for_branch(
the_repository->remote_state, branch,
NULL);
struct remote *remote = remotes_remote_get(
diff --git a/remote.h b/remote.h
index b901b56746..a58713f20a 100644
--- a/remote.h
+++ b/remote.h
@@ -329,7 +329,7 @@ struct branch {
struct branch *branch_get(const char *name);
const char *remote_for_branch(struct branch *branch, int *explicit);
const char *pushremote_for_branch(struct branch *branch, int *explicit);
-const char *remote_ref_for_branch(struct branch *branch, int for_push);
+char *remote_ref_for_branch(struct branch *branch, int for_push);
/* returns true if the given branch has merge configuration given. */
int branch_has_merge_config(struct branch *branch);
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH 9/9] ref-filter: add ref_format_clear() function
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (7 preceding siblings ...)
2024-09-09 23:19 ` [PATCH 8/9] ref-filter: fix leak when formatting %(push:remoteref) Jeff King
@ 2024-09-09 23:21 ` Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:57 ` [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms Patrick Steinhardt
9 siblings, 1 reply; 27+ messages in thread
From: Jeff King @ 2024-09-09 23:21 UTC (permalink / raw)
To: git; +Cc: Brooke Kuhlmann
After using the ref-filter API, callers should use ref_filter_clear() to
free any used memory. However, there's not a matching function to clear
the ref_format struct.
Traditionally this did not need to be cleaned up, as it was just a way
for the caller to store and pass format options as a single unit. Even
though the parsing step of some placeholders may allocate data, that's
usually inside their "used_atom" structs, which are part of the
ref_filter itself.
But a few placeholders keep data outside of there. The %(ahead-behind)
and %(is-base) parsers both keep a master list of bases, because they
perform a single filtering pass outside of the use of any particular
atom. And since the format parser does not have access to the ref_filter
struct, they store their cross-atom data in the ref_format struct
itself.
And thus when they are finished, the ref_format also needs to be cleaned
up. So let's add a function to do so, and call it from all of the users
of the ref-filter API.
The %(is-base) case is found by running LSan on t6300. After this patch,
the script can now be marked leak-free.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/branch.c | 1 +
builtin/for-each-ref.c | 1 +
builtin/tag.c | 1 +
builtin/verify-tag.c | 1 +
ref-filter.c | 13 +++++++++++++
ref-filter.h | 3 +++
t/t6300-for-each-ref.sh | 1 +
7 files changed, 21 insertions(+)
diff --git a/builtin/branch.c b/builtin/branch.c
index 3f870741bf..c98601c6fe 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -878,6 +878,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
string_list_clear(&output, 0);
ref_sorting_release(sorting);
ref_filter_clear(&filter);
+ ref_format_clear(&format);
return 0;
} else if (edit_description) {
const char *branch_name;
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 5517a4a1c0..c72fa05bcb 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -104,6 +104,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
filter_and_format_refs(&filter, flags, sorting, &format);
ref_filter_clear(&filter);
+ ref_format_clear(&format);
ref_sorting_release(sorting);
strvec_clear(&vec);
return 0;
diff --git a/builtin/tag.c b/builtin/tag.c
index a1fb218512..607e48e311 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -702,6 +702,7 @@ int cmd_tag(int argc, const char **argv, const char *prefix)
cleanup:
ref_sorting_release(sorting);
ref_filter_clear(&filter);
+ ref_format_clear(&format);
strbuf_release(&buf);
strbuf_release(&ref);
strbuf_release(&reflog_msg);
diff --git a/builtin/verify-tag.c b/builtin/verify-tag.c
index c731e2f87b..77becf7e75 100644
--- a/builtin/verify-tag.c
+++ b/builtin/verify-tag.c
@@ -65,5 +65,6 @@ int cmd_verify_tag(int argc, const char **argv, const char *prefix)
if (format.format)
pretty_print_ref(name, &oid, &format);
}
+ ref_format_clear(&format);
return had_error;
}
diff --git a/ref-filter.c b/ref-filter.c
index 0f51095bbd..ce1bcfad85 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3621,3 +3621,16 @@ void ref_filter_clear(struct ref_filter *filter)
free_commit_list(filter->unreachable_from);
ref_filter_init(filter);
}
+
+void ref_format_init(struct ref_format *format)
+{
+ struct ref_format blank = REF_FORMAT_INIT;
+ memcpy(format, &blank, sizeof(blank));
+}
+
+void ref_format_clear(struct ref_format *format)
+{
+ string_list_clear(&format->bases, 0);
+ string_list_clear(&format->is_base_tips, 0);
+ ref_format_init(format);
+}
diff --git a/ref-filter.h b/ref-filter.h
index e794b8a676..754038ab07 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -221,4 +221,7 @@ void filter_is_base(struct repository *r,
void ref_filter_init(struct ref_filter *filter);
void ref_filter_clear(struct ref_filter *filter);
+void ref_format_init(struct ref_format *format);
+void ref_format_clear(struct ref_format *format);
+
#endif /* REF_FILTER_H */
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index e8db612f95..b3163629c5 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -5,6 +5,7 @@
test_description='for-each-ref test'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
GNUPGHOME_NOT_USED=$GNUPGHOME
. "$TEST_DIRECTORY"/lib-gpg.sh
--
2.46.0.867.gf99b2b8e0f
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 3/9] ref-filter: strip signature when parsing tag trailers
2024-09-09 23:14 ` [PATCH 3/9] ref-filter: strip signature when parsing tag trailers Jeff King
@ 2024-09-10 6:08 ` Patrick Steinhardt
2024-09-10 6:28 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:08 UTC (permalink / raw)
To: Jeff King; +Cc: git, Brooke Kuhlmann
On Mon, Sep 09, 2024 at 07:14:45PM -0400, Jeff King wrote:
> The implementation here is pretty simple: we just make a NUL-terminated
> copy of the non-signature part of the tag (which we've already parsed)
> and pass it to the trailer API. There are some alternatives I rejected,
> at least for now:
>
> - the trailer code already understands skipping past some cruft at the
> end of a commit, such as patch dividers. see find_end_of_log_message().
s/./,
> We could teach it to do the same for signatures. But since this is
> the only context where we'd want that feature, and since we've already
> parsed the object into subject/body/signature here, it seemed easier
> to just pass in the truncated message.
>
> - it would be nice if we could just pass in a pointer/len pair to the
s/it/It
> diff --git a/ref-filter.c b/ref-filter.c
> index 0f5513ba7e..e39f505a81 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2008,9 +2008,17 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
> v->s = strbuf_detach(&s, NULL);
> } else if (atom->u.contents.option == C_TRAILERS) {
> struct strbuf s = STRBUF_INIT;
> + const char *msg;
> + char *to_free = NULL;
> +
> + if (siglen)
> + msg = to_free = xmemdupz(subpos, sigpos - subpos);
> + else
> + msg = subpos;
>
> /* Format the trailer info according to the trailer_opts given */
> - format_trailers_from_commit(&atom->u.contents.trailer_opts, subpos, &s);
> + format_trailers_from_commit(&atom->u.contents.trailer_opts, msg, &s);
> + free(to_free);
>
> v->s = strbuf_detach(&s, NULL);
> } else if (atom->u.contents.option == C_BARE)
I've been surprised that we use `subpos` as the starting point here,
which includes the whole message including its subject. I would have
thought that it was sufficient to only pass the message body as input,
which saves allocating some bytes. At least `trailer_info_get()` does
not seem to care about the subject at all.
In any case this would be a micro-optimization anyway, it just left me
scratching my head for a second or two.
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 5/9] ref-filter: store ref_trailer_buf data per-atom
2024-09-09 23:16 ` [PATCH 5/9] ref-filter: store ref_trailer_buf data per-atom Jeff King
@ 2024-09-10 6:08 ` Patrick Steinhardt
0 siblings, 0 replies; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:08 UTC (permalink / raw)
To: Jeff King; +Cc: git, Brooke Kuhlmann
On Mon, Sep 09, 2024 at 07:16:53PM -0400, Jeff King wrote:
> @@ -2988,6 +3001,17 @@ void ref_array_clear(struct ref_array *array)
> struct used_atom *atom = &used_atom[i];
> if (atom->atom_type == ATOM_HEAD)
> free(atom->u.head);
Nit: this branch should grow some braces now, too.
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf"
2024-09-09 23:18 ` [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf" Jeff King
@ 2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:33 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:09 UTC (permalink / raw)
To: Jeff King; +Cc: git, Brooke Kuhlmann
On Mon, Sep 09, 2024 at 07:18:28PM -0400, Jeff King wrote:
> When we parse a placeholder like "%(trailers:key=foo)", our atom parsing
> function is passed just the argument string "key=foo". We duplicate this
> into its own string, but never free it, causing a leak.
>
> We do the duplication for two reasons:
>
> 1. There's a mismatch with the pretty.c trailer-formatting code that
> we rely on. It expects to see a closing paren, like "key=foo)". So
> we duplicate the argument string with that extra character to pass
> along.
>
> This is probably something we could fix in the long run, but it's
> somewhat non-trivial if we want to avoid regressing error cases for
> things like "git log --format='%(trailer:oops'". So let's accept
> it as a necessity for now.
>
> 2. The argument parser expects to store the list of "key" entries
> ("foo" in this case) in a string-list. It also stores the length of
> the string in the string-list "util" field. The original caller in
> pretty.c uses this with a "nodup" string list to avoid making extra
> copies, which creates a subtle dependency on the lifetime of the
> original format string.
>
> We do the same here, which creates that same dependency. So we
> can't simply free it as soon as the parsing is done.
>
> There are two possible solutions here. The first is to hold on to the
> duplicated "argbuf" string in the used_atom struct, so that it lives as
> long as the string_list which references it.
>
> But I think a less-subtle solution, and what this patch does, is to
> switch to a duplicating string_list. That makes it self-contained, and
> lets us free argbuf immediately. It may involve a few extra allocations,
> but this parsing is something that happens once per program, not once
> per output ref.
Sensible. I found that in many cases, the `nodup` variants of string
lists bring more pain than real benefit.
> This clears up one case that LSan finds in t6300, but there are more.
Yeah, there are a bunch of memory leaks around atom parsing in general
exposed by t6300. Thanks for plugging some of them!
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 8/9] ref-filter: fix leak when formatting %(push:remoteref)
2024-09-09 23:19 ` [PATCH 8/9] ref-filter: fix leak when formatting %(push:remoteref) Jeff King
@ 2024-09-10 6:09 ` Patrick Steinhardt
0 siblings, 0 replies; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:09 UTC (permalink / raw)
To: Jeff King; +Cc: git, Brooke Kuhlmann
On Mon, Sep 09, 2024 at 07:19:51PM -0400, Jeff King wrote:
> When we expand the %(upstream) or %(push) placeholders, we rely on
> remote.c's remote_ref_for_branch() to fill in the ":refname" argument.
> But that function has confusing memory ownership semantics: it may or
> may not return an allocated string, depending on whether we are in
> "upstream" mode or "push" mode. The caller in ref-filter.c always
> duplicates the result, meaning that we leak the original in the case of
> %(push:refname).
Ah, I remember this issue, I think I also have it pending somewhere.
Anyway, I'm happy if I can drop one more patch.
The change looks sensible to me.
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 9/9] ref-filter: add ref_format_clear() function
2024-09-09 23:21 ` [PATCH 9/9] ref-filter: add ref_format_clear() function Jeff King
@ 2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:37 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:09 UTC (permalink / raw)
To: Jeff King; +Cc: git, Brooke Kuhlmann
On Mon, Sep 09, 2024 at 07:21:18PM -0400, Jeff King wrote:
> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> index e8db612f95..b3163629c5 100755
> --- a/t/t6300-for-each-ref.sh
> +++ b/t/t6300-for-each-ref.sh
> @@ -5,6 +5,7 @@
>
> test_description='for-each-ref test'
>
> +TEST_PASSES_SANITIZE_LEAK=true
> . ./test-lib.sh
> GNUPGHOME_NOT_USED=$GNUPGHOME
> . "$TEST_DIRECTORY"/lib-gpg.sh
Nice! There's also t6302, which has been failing due to all the memory
leaks in our atom handling, as well. After your series there's a single
memory leak left to make it pass. So we may want to add below patch on
top as a low-hanging fruit.
Patrick
-- >8 --
diff --git a/ref-filter.c b/ref-filter.c
index ce1bcfad857..b06e18a569a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1001,6 +1001,7 @@ struct ref_formatting_stack {
struct ref_formatting_stack *prev;
struct strbuf output;
void (*at_end)(struct ref_formatting_stack **stack);
+ void (*at_end_data_free)(void *data);
void *at_end_data;
};
@@ -1169,6 +1170,8 @@ static void pop_stack_element(struct ref_formatting_stack **stack)
if (prev)
strbuf_addbuf(&prev->output, ¤t->output);
strbuf_release(¤t->output);
+ if (current->at_end_data_free)
+ current->at_end_data_free(current->at_end_data);
free(current);
*stack = prev;
}
@@ -1228,15 +1231,13 @@ static void if_then_else_handler(struct ref_formatting_stack **stack)
}
*stack = cur;
- free(if_then_else);
}
static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
struct strbuf *err UNUSED)
{
struct ref_formatting_stack *new_stack;
- struct if_then_else *if_then_else = xcalloc(1,
- sizeof(struct if_then_else));
+ struct if_then_else *if_then_else = xcalloc(1, sizeof(*if_then_else));
if_then_else->str = atomv->atom->u.if_then_else.str;
if_then_else->cmp_status = atomv->atom->u.if_then_else.cmp_status;
@@ -1245,6 +1246,7 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
new_stack = state->stack;
new_stack->at_end = if_then_else_handler;
new_stack->at_end_data = if_then_else;
+ new_stack->at_end_data_free = free;
return 0;
}
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 163c378cfd1..7f44d3c3f22 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -2,6 +2,7 @@
test_description='test for-each-refs usage of ref-filter APIs'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-gpg.sh
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 2/9] ref-filter: avoid extra copies of payload/signature
2024-09-09 23:12 ` [PATCH 2/9] ref-filter: avoid extra copies of payload/signature Jeff King
@ 2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:26 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:09 UTC (permalink / raw)
To: Jeff King; +Cc: git, Brooke Kuhlmann
On Mon, Sep 09, 2024 at 07:12:28PM -0400, Jeff King wrote:
> When we know we're going to show the subject or body of a tag or commit,
> we call find_subpos(), which returns pointers and lengths for the three
> parts: subject, body, signature.
>
> Oddly, the function finds the signature twice: once by calling
> parse_signature() at the start, which copies the signature into a
> separate strbuf, and then again by calling parse_signed_buffer() after
> we've parsed past the subject.
>
> This is due to 482c119186 (gpg-interface: improve interface for parsing
> tags, 2021-02-11) and 88bce0e24c (ref-filter: hoist signature parsing,
> 2021-02-11). The idea is that in a multi-hash world, tag signatures may
> appear in the header, rather than at the end of the body, in which case
> we need to extract them into a separate buffer.
>
> But parse_signature() would never find such a buffer! It only looks for
> signature lines (like "-----BEGIN PGP") at the start of each line,
> without any header keyword. So this code will never find anything except
> the usual in-body signature.
Okay. So in other words the intent was to parse in-header signatures,
but the code failed to do so correctly and thus this never worked in the
first place?
In any case, `parse_signature()` is only a glorified wrapper around
`parse_signed_buffer()` in the first place, so in the end they would
both parse the buffer in the same way.
Nice cleanup, even though it leaves one wondering why the in-header
signatures have only been wired up partially.
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 2/9] ref-filter: avoid extra copies of payload/signature
2024-09-10 6:09 ` Patrick Steinhardt
@ 2024-09-10 6:26 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-10 6:26 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Brooke Kuhlmann
On Tue, Sep 10, 2024 at 08:09:22AM +0200, Patrick Steinhardt wrote:
> > But parse_signature() would never find such a buffer! It only looks for
> > signature lines (like "-----BEGIN PGP") at the start of each line,
> > without any header keyword. So this code will never find anything except
> > the usual in-body signature.
>
> Okay. So in other words the intent was to parse in-header signatures,
> but the code failed to do so correctly and thus this never worked in the
> first place?
>
> In any case, `parse_signature()` is only a glorified wrapper around
> `parse_signed_buffer()` in the first place, so in the end they would
> both parse the buffer in the same way.
>
> Nice cleanup, even though it leaves one wondering why the in-header
> signatures have only been wired up partially.
I, too, was confused. See this exchange with brian:
https://lore.kernel.org/git/20240908233636.GA4026999@coredump.intra.peff.net/
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 3/9] ref-filter: strip signature when parsing tag trailers
2024-09-10 6:08 ` Patrick Steinhardt
@ 2024-09-10 6:28 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-10 6:28 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Brooke Kuhlmann
On Tue, Sep 10, 2024 at 08:08:55AM +0200, Patrick Steinhardt wrote:
> > diff --git a/ref-filter.c b/ref-filter.c
> > index 0f5513ba7e..e39f505a81 100644
> > --- a/ref-filter.c
> > +++ b/ref-filter.c
> > @@ -2008,9 +2008,17 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
> > v->s = strbuf_detach(&s, NULL);
> > } else if (atom->u.contents.option == C_TRAILERS) {
> > struct strbuf s = STRBUF_INIT;
> > + const char *msg;
> > + char *to_free = NULL;
> > +
> > + if (siglen)
> > + msg = to_free = xmemdupz(subpos, sigpos - subpos);
> > + else
> > + msg = subpos;
> >
> > /* Format the trailer info according to the trailer_opts given */
> > - format_trailers_from_commit(&atom->u.contents.trailer_opts, subpos, &s);
> > + format_trailers_from_commit(&atom->u.contents.trailer_opts, msg, &s);
> > + free(to_free);
> >
> > v->s = strbuf_detach(&s, NULL);
> > } else if (atom->u.contents.option == C_BARE)
>
> I've been surprised that we use `subpos` as the starting point here,
> which includes the whole message including its subject. I would have
> thought that it was sufficient to only pass the message body as input,
> which saves allocating some bytes. At least `trailer_info_get()` does
> not seem to care about the subject at all.
>
> In any case this would be a micro-optimization anyway, it just left me
> scratching my head for a second or two.
Yeah, I suspect it would be fine to start at "bodypos" instead (and then
you could just use "nonsiglen" directly as the length). But there may be
corner cases for instances where there's no subject/body at all (though
having _just_ trailers feels weird).
At any rate, I was just following the existing code, which passed in the
whole contents starting from subpos, to avoid any unexpected changes.
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf"
2024-09-10 6:09 ` Patrick Steinhardt
@ 2024-09-10 6:33 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-10 6:33 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Brooke Kuhlmann
On Tue, Sep 10, 2024 at 08:09:03AM +0200, Patrick Steinhardt wrote:
> > But I think a less-subtle solution, and what this patch does, is to
> > switch to a duplicating string_list. That makes it self-contained, and
> > lets us free argbuf immediately. It may involve a few extra allocations,
> > but this parsing is something that happens once per program, not once
> > per output ref.
>
> Sensible. I found that in many cases, the `nodup` variants of string
> lists bring more pain than real benefit.
I have found that, too. :) I've argued in the past for getting rid of it
(and especially not propagating its world-view to other data structures
like strmap, and so on). But Elijah presented some pretty compelling
numbers that it does help for some of the large merge-ort strmaps.
So I've softened my argument to asking people to use it judiciously. ;)
> > This clears up one case that LSan finds in t6300, but there are more.
>
> Yeah, there are a bunch of memory leaks around atom parsing in general
> exposed by t6300. Thanks for plugging some of them!
Well, you're really going to like the rest of the series, then!
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 9/9] ref-filter: add ref_format_clear() function
2024-09-10 6:09 ` Patrick Steinhardt
@ 2024-09-10 6:37 ` Jeff King
0 siblings, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-10 6:37 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Brooke Kuhlmann
On Tue, Sep 10, 2024 at 08:09:16AM +0200, Patrick Steinhardt wrote:
> On Mon, Sep 09, 2024 at 07:21:18PM -0400, Jeff King wrote:
> > diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> > index e8db612f95..b3163629c5 100755
> > --- a/t/t6300-for-each-ref.sh
> > +++ b/t/t6300-for-each-ref.sh
> > @@ -5,6 +5,7 @@
> >
> > test_description='for-each-ref test'
> >
> > +TEST_PASSES_SANITIZE_LEAK=true
> > . ./test-lib.sh
> > GNUPGHOME_NOT_USED=$GNUPGHOME
> > . "$TEST_DIRECTORY"/lib-gpg.sh
>
> Nice! There's also t6302, which has been failing due to all the memory
> leaks in our atom handling, as well. After your series there's a single
> memory leak left to make it pass. So we may want to add below patch on
> top as a low-hanging fruit.
I was afraid to go looking for other almost-there scripts, knowing what
a rabbit hole it can turn into (which I know you are also familiar
with).
> -- >8 --
> diff --git a/ref-filter.c b/ref-filter.c
> index ce1bcfad857..b06e18a569a 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
This looks plausibly correct to me, but I'm not at all familiar with the
conditional placeholders. I think it would make more sense for you to
wrap it up with a commit message.
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
` (8 preceding siblings ...)
2024-09-09 23:21 ` [PATCH 9/9] ref-filter: add ref_format_clear() function Jeff King
@ 2024-09-10 6:57 ` Patrick Steinhardt
2024-09-10 7:12 ` Jeff King
2024-09-10 16:48 ` Junio C Hamano
9 siblings, 2 replies; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-10 6:57 UTC (permalink / raw)
To: git; +Cc: git, Brooke Kuhlmann
When parsing `%(if)` atoms we expect a few other atoms to exist to
complete it, like `%(then)` and `%(end)`. Whether or not we have seen
these other atoms is tracked in an allocated `if_then_else` structure,
which gets free'd by the `if_then_else_handler()` once we have parsed
the complete conditional expression.
This results in a memory leak when the `%(if)` atom is not terminated
correctly and thus incomplete. We never end up executing its handler and
thus don't end up freeing the structure.
Plug this memory leak by introducing a new `at_end_data_free` callback
function. If set, we'll execute it in `pop_stack_element()` and pass it
the `at_end_data` variable with the intent to free its state. Wire it up
for the `%(if)` atom accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
ref-filter.c | 8 +++++---
t/t6302-for-each-ref-filter.sh | 1 +
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/ref-filter.c b/ref-filter.c
index ce1bcfad857..b06e18a569a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1001,6 +1001,7 @@ struct ref_formatting_stack {
struct ref_formatting_stack *prev;
struct strbuf output;
void (*at_end)(struct ref_formatting_stack **stack);
+ void (*at_end_data_free)(void *data);
void *at_end_data;
};
@@ -1169,6 +1170,8 @@ static void pop_stack_element(struct ref_formatting_stack **stack)
if (prev)
strbuf_addbuf(&prev->output, ¤t->output);
strbuf_release(¤t->output);
+ if (current->at_end_data_free)
+ current->at_end_data_free(current->at_end_data);
free(current);
*stack = prev;
}
@@ -1228,15 +1231,13 @@ static void if_then_else_handler(struct ref_formatting_stack **stack)
}
*stack = cur;
- free(if_then_else);
}
static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
struct strbuf *err UNUSED)
{
struct ref_formatting_stack *new_stack;
- struct if_then_else *if_then_else = xcalloc(1,
- sizeof(struct if_then_else));
+ struct if_then_else *if_then_else = xcalloc(1, sizeof(*if_then_else));
if_then_else->str = atomv->atom->u.if_then_else.str;
if_then_else->cmp_status = atomv->atom->u.if_then_else.cmp_status;
@@ -1245,6 +1246,7 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
new_stack = state->stack;
new_stack->at_end = if_then_else_handler;
new_stack->at_end_data = if_then_else;
+ new_stack->at_end_data_free = free;
return 0;
}
diff --git a/t/t6302-for-each-ref-filter.sh b/t/t6302-for-each-ref-filter.sh
index 163c378cfd1..7f44d3c3f22 100755
--- a/t/t6302-for-each-ref-filter.sh
+++ b/t/t6302-for-each-ref-filter.sh
@@ -2,6 +2,7 @@
test_description='test for-each-refs usage of ref-filter APIs'
+TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-gpg.sh
--
2.46.0.519.g2e7b89e038.dirty
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-10 6:57 ` [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms Patrick Steinhardt
@ 2024-09-10 7:12 ` Jeff King
2024-09-10 16:48 ` Junio C Hamano
1 sibling, 0 replies; 27+ messages in thread
From: Jeff King @ 2024-09-10 7:12 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Brooke Kuhlmann
On Tue, Sep 10, 2024 at 08:57:15AM +0200, Patrick Steinhardt wrote:
> When parsing `%(if)` atoms we expect a few other atoms to exist to
> complete it, like `%(then)` and `%(end)`. Whether or not we have seen
> these other atoms is tracked in an allocated `if_then_else` structure,
> which gets free'd by the `if_then_else_handler()` once we have parsed
> the complete conditional expression.
>
> This results in a memory leak when the `%(if)` atom is not terminated
> correctly and thus incomplete. We never end up executing its handler and
> thus don't end up freeing the structure.
>
> Plug this memory leak by introducing a new `at_end_data_free` callback
> function. If set, we'll execute it in `pop_stack_element()` and pass it
> the `at_end_data` variable with the intent to free its state. Wire it up
> for the `%(if)` atom accordingly.
Ah, thanks for explaining. The patch makes much more sense now. :)
In particular, this:
> @@ -1169,6 +1170,8 @@ static void pop_stack_element(struct ref_formatting_stack **stack)
> if (prev)
> strbuf_addbuf(&prev->output, ¤t->output);
> strbuf_release(¤t->output);
> + if (current->at_end_data_free)
> + current->at_end_data_free(current->at_end_data);
> free(current);
> *stack = prev;
> }
which frees on pop, replaces the manual:
> @@ -1228,15 +1231,13 @@ static void if_then_else_handler(struct ref_formatting_stack **stack)
> }
>
> *stack = cur;
> - free(if_then_else);
> }
free that was happening in the success case.
I think putting this on top of my series makes sense.
-Peff
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-10 6:57 ` [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms Patrick Steinhardt
2024-09-10 7:12 ` Jeff King
@ 2024-09-10 16:48 ` Junio C Hamano
2024-09-12 10:22 ` Patrick Steinhardt
1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2024-09-10 16:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Brooke Kuhlmann
Patrick Steinhardt <ps@pks.im> writes:
> When parsing `%(if)` atoms we expect a few other atoms to exist to
> complete it, like `%(then)` and `%(end)`. Whether or not we have seen
> these other atoms is tracked in an allocated `if_then_else` structure,
> which gets free'd by the `if_then_else_handler()` once we have parsed
> the complete conditional expression.
>
> This results in a memory leak when the `%(if)` atom is not terminated
> correctly and thus incomplete. We never end up executing its handler and
> thus don't end up freeing the structure.
>
> Plug this memory leak by introducing a new `at_end_data_free` callback
> function. If set, we'll execute it in `pop_stack_element()` and pass it
> the `at_end_data` variable with the intent to free its state. Wire it up
> for the `%(if)` atom accordingly.
Sounds good. We diagnose unclosed "%(if)", report mismatch, and
die() soon, so plugging this may more about "let's silence leak
checker so that it can be more effective to help us find real leaks
that matter", not "this is leaking proportionally to the size of the
user data, and must be plugged".
I see this code snippet (not touched by your patch):
if (state.stack->prev) {
pop_stack_element(&state.stack);
return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
}
and wonder how this handles the case where state.stack->prev->prev
is also not NULL. Shouldn't it be looping while .prev is not NULL?
e.g.
diff --git c/ref-filter.c w/ref-filter.c
index b06e18a569..d2040f5047 100644
--- c/ref-filter.c
+++ w/ref-filter.c
@@ -3471,7 +3471,8 @@ int format_ref_array_item(struct ref_array_item *info,
}
}
if (state.stack->prev) {
- pop_stack_element(&state.stack);
+ while (state.stack->prev)
+ pop_stack_element(&state.stack);
return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
}
strbuf_addbuf(final_buf, &state.stack->output);
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-10 16:48 ` Junio C Hamano
@ 2024-09-12 10:22 ` Patrick Steinhardt
2024-09-12 11:18 ` Jeff King
0 siblings, 1 reply; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-12 10:22 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Brooke Kuhlmann
On Tue, Sep 10, 2024 at 09:48:32AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > When parsing `%(if)` atoms we expect a few other atoms to exist to
> > complete it, like `%(then)` and `%(end)`. Whether or not we have seen
> > these other atoms is tracked in an allocated `if_then_else` structure,
> > which gets free'd by the `if_then_else_handler()` once we have parsed
> > the complete conditional expression.
> >
> > This results in a memory leak when the `%(if)` atom is not terminated
> > correctly and thus incomplete. We never end up executing its handler and
> > thus don't end up freeing the structure.
> >
> > Plug this memory leak by introducing a new `at_end_data_free` callback
> > function. If set, we'll execute it in `pop_stack_element()` and pass it
> > the `at_end_data` variable with the intent to free its state. Wire it up
> > for the `%(if)` atom accordingly.
>
> Sounds good. We diagnose unclosed "%(if)", report mismatch, and
> die() soon, so plugging this may more about "let's silence leak
> checker so that it can be more effective to help us find real leaks
> that matter", not "this is leaking proportionally to the size of the
> user data, and must be plugged".
>
> I see this code snippet (not touched by your patch):
>
> if (state.stack->prev) {
> pop_stack_element(&state.stack);
> return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
> }
>
> and wonder how this handles the case where state.stack->prev->prev
> is also not NULL. Shouldn't it be looping while .prev is not NULL?
>
> e.g.
>
> diff --git c/ref-filter.c w/ref-filter.c
> index b06e18a569..d2040f5047 100644
> --- c/ref-filter.c
> +++ w/ref-filter.c
> @@ -3471,7 +3471,8 @@ int format_ref_array_item(struct ref_array_item *info,
> }
> }
> if (state.stack->prev) {
> - pop_stack_element(&state.stack);
> + while (state.stack->prev)
> + pop_stack_element(&state.stack);
> return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
> }
> strbuf_addbuf(final_buf, &state.stack->output);
Hm. It certainly feels like we should do that. I couldn't construct a
test case that fails with the leak sanitizer though. If it's a leak I'm
sure I'll eventually hit it when I continue down the road headed towards
leak-free-ness.
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-12 10:22 ` Patrick Steinhardt
@ 2024-09-12 11:18 ` Jeff King
2024-09-12 11:32 ` Patrick Steinhardt
2024-09-12 20:24 ` Junio C Hamano
0 siblings, 2 replies; 27+ messages in thread
From: Jeff King @ 2024-09-12 11:18 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git, Brooke Kuhlmann
On Thu, Sep 12, 2024 at 12:22:16PM +0200, Patrick Steinhardt wrote:
> > diff --git c/ref-filter.c w/ref-filter.c
> > index b06e18a569..d2040f5047 100644
> > --- c/ref-filter.c
> > +++ w/ref-filter.c
> > @@ -3471,7 +3471,8 @@ int format_ref_array_item(struct ref_array_item *info,
> > }
> > }
> > if (state.stack->prev) {
> > - pop_stack_element(&state.stack);
> > + while (state.stack->prev)
> > + pop_stack_element(&state.stack);
> > return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
> > }
> > strbuf_addbuf(final_buf, &state.stack->output);
>
> Hm. It certainly feels like we should do that. I couldn't construct a
> test case that fails with the leak sanitizer though. If it's a leak I'm
> sure I'll eventually hit it when I continue down the road headed towards
> leak-free-ness.
Hmm. I think just:
./git for-each-ref --format='%(if)%(then)%(if)%(then)%(if)%(then)'
should trigger it, and running it in the debugger I can see that we exit
the function with multiple entries.
Valgrind claims the memory is still reachable, but I don't see how. The
"state" variable is accessible only inside that function. The only thing
we do after returning is die(). I wonder if it is a false negative
because the stack is left undisturbed (especially because the compiler
knows that die() does not return).
At any rate, I think the same would apply to the earlier error returns:
diff --git a/ref-filter.c b/ref-filter.c
index b06e18a569..a339f0ab0f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -3454,7 +3454,8 @@ int format_ref_array_item(struct ref_array_item *info,
pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
if (pos < 0 || get_ref_atom_value(info, pos, &atomv, error_buf) ||
atomv->handler(atomv, &state, error_buf)) {
- pop_stack_element(&state.stack);
+ while (state.stack->prev)
+ pop_stack_element(&state.stack);
return -1;
}
}
@@ -3466,7 +3467,8 @@ int format_ref_array_item(struct ref_array_item *info,
struct atom_value resetv = ATOM_VALUE_INIT;
resetv.s = GIT_COLOR_RESET;
if (append_atom(&resetv, &state, error_buf)) {
- pop_stack_element(&state.stack);
+ while (state.stack->prev)
+ pop_stack_element(&state.stack);
return -1;
}
}
I wasn't sure why the non-error code path wouldn't need the same, but it
looks like there's some popping that happens in the various callbacks?
I'm not very familiar with this code, and it's hard to follow the flow
through the function pointers.
All that said, I am content to leave it for now. Even if it's a real
leak, it's one that happens once per program right before exiting with
an error. Most of the value in cleaning up trivial leaks like that are
to reduce the noise from analyzers so that we can find the much more
important leaks that scale with the input. If the analyzers aren't
complaining and we think it's trivial, it may not be worth spending a
lot of time on.
-Peff
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-12 11:18 ` Jeff King
@ 2024-09-12 11:32 ` Patrick Steinhardt
2024-09-12 20:24 ` Junio C Hamano
1 sibling, 0 replies; 27+ messages in thread
From: Patrick Steinhardt @ 2024-09-12 11:32 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git, Brooke Kuhlmann
On Thu, Sep 12, 2024 at 07:18:58AM -0400, Jeff King wrote:
> All that said, I am content to leave it for now. Even if it's a real
> leak, it's one that happens once per program right before exiting with
> an error. Most of the value in cleaning up trivial leaks like that are
> to reduce the noise from analyzers so that we can find the much more
> important leaks that scale with the input. If the analyzers aren't
> complaining and we think it's trivial, it may not be worth spending a
> lot of time on.
Agreed. Thanks for digging!
Patrick
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms
2024-09-12 11:18 ` Jeff King
2024-09-12 11:32 ` Patrick Steinhardt
@ 2024-09-12 20:24 ` Junio C Hamano
1 sibling, 0 replies; 27+ messages in thread
From: Junio C Hamano @ 2024-09-12 20:24 UTC (permalink / raw)
To: Jeff King; +Cc: Patrick Steinhardt, git, Brooke Kuhlmann
Jeff King <peff@peff.net> writes:
> On Thu, Sep 12, 2024 at 12:22:16PM +0200, Patrick Steinhardt wrote:
>
>> > diff --git c/ref-filter.c w/ref-filter.c
>> > index b06e18a569..d2040f5047 100644
>> > --- c/ref-filter.c
>> > +++ w/ref-filter.c
>> > @@ -3471,7 +3471,8 @@ int format_ref_array_item(struct ref_array_item *info,
>> > }
>> > }
>> > if (state.stack->prev) {
>> > - pop_stack_element(&state.stack);
>> > + while (state.stack->prev)
>> > + pop_stack_element(&state.stack);
>> > return strbuf_addf_ret(error_buf, -1, _("format: %%(end) atom missing"));
>> > }
>> > strbuf_addbuf(final_buf, &state.stack->output);
>>
>> Hm. It certainly feels like we should do that. I couldn't construct a
>> test case that fails with the leak sanitizer though. If it's a leak I'm
>> sure I'll eventually hit it when I continue down the road headed towards
>> leak-free-ness.
>
> Hmm. I think just:
>
> ./git for-each-ref --format='%(if)%(then)%(if)%(then)%(if)%(then)'
>
> should trigger it, and running it in the debugger I can see that we exit
> the function with multiple entries.
>
> Valgrind claims the memory is still reachable, but I don't see how. The
> "state" variable is accessible only inside that function. The only thing
> we do after returning is die(). I wonder if it is a false negative
> because the stack is left undisturbed (especially because the compiler
> knows that die() does not return).
Yup, the reason why I didn't add any test was because the leak
checker failed to notice the apparent leak.
> At any rate, I think the same would apply to the earlier error returns:
> ...
> All that said, I am content to leave it for now. Even if it's a real
> leak, it's one that happens once per program right before exiting with
> an error. Most of the value in cleaning up trivial leaks like that are
> to reduce the noise from analyzers so that we can find the much more
> important leaks that scale with the input. If the analyzers aren't
> complaining and we think it's trivial, it may not be worth spending a
> lot of time on.
That is good to me, too.
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2024-09-12 20:24 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-09 23:07 [PATCH 0/9] ref-filter %(trailer) fixes Jeff King
2024-09-09 23:08 ` [PATCH 1/9] t6300: drop newline from wrapped test title Jeff King
2024-09-09 23:12 ` [PATCH 2/9] ref-filter: avoid extra copies of payload/signature Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:26 ` Jeff King
2024-09-09 23:14 ` [PATCH 3/9] ref-filter: strip signature when parsing tag trailers Jeff King
2024-09-10 6:08 ` Patrick Steinhardt
2024-09-10 6:28 ` Jeff King
2024-09-09 23:14 ` [PATCH 4/9] ref-filter: drop useless cast in trailers_atom_parser() Jeff King
2024-09-09 23:16 ` [PATCH 5/9] ref-filter: store ref_trailer_buf data per-atom Jeff King
2024-09-10 6:08 ` Patrick Steinhardt
2024-09-09 23:18 ` [PATCH 6/9] ref-filter: fix leak of %(trailers) "argbuf" Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:33 ` Jeff King
2024-09-09 23:19 ` [PATCH 7/9] ref-filter: fix leak with %(describe) arguments Jeff King
2024-09-09 23:19 ` [PATCH 8/9] ref-filter: fix leak when formatting %(push:remoteref) Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-09 23:21 ` [PATCH 9/9] ref-filter: add ref_format_clear() function Jeff King
2024-09-10 6:09 ` Patrick Steinhardt
2024-09-10 6:37 ` Jeff King
2024-09-10 6:57 ` [PATCH 10/9] ref-filter: fix leak with unterminated %(if) atoms Patrick Steinhardt
2024-09-10 7:12 ` Jeff King
2024-09-10 16:48 ` Junio C Hamano
2024-09-12 10:22 ` Patrick Steinhardt
2024-09-12 11:18 ` Jeff King
2024-09-12 11:32 ` Patrick Steinhardt
2024-09-12 20:24 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).