From: "Rubén Justo" <rjusto@gmail.com>
To: "Jeff King" <peff@peff.net>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>,
git@vger.kernel.org, Victoria Dye <vdye@github.com>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] Makefile: force -O0 when compiling with SANITIZE=leak
Date: Thu, 20 Oct 2022 20:57:31 +0200 [thread overview]
Message-ID: <939ccb0c-005c-4f98-a6ca-3f8e5cda1641@gmail.com> (raw)
In-Reply-To: <Y0+i1G5ybdhUGph2@coredump.intra.peff.net>
On 19/10/22 9:10, Jeff King wrote:
> Yes, that's a mouthful. If you really really want it, we can make the
> "-O0" behavior conditional. But I remain entirely unconvinced that what
> you're trying to do is useful.
>
>> I'd prefer not to need to do that, because while non-O0 is noisy
>> sometimes, I've also found that it points (at least indirectly) to some
>> useful edge-cases.
>
> I haven't seen any evidence that leak-checking with -O2 produces
> anything of value. And I have seen several examples where it produces
> garbage. I guess you're referring here to the elaboration you give below
> in your message. But I think you're just wrong about it being useful.
>
>> The tl;dr is that I think you use "leak" in the sense that valgrind
>> talks about "still reachable" leaks, which is conceptually where
>> -fsanitize=leak draws the line. I.e. it's not a malloc()/free() tracker,
>> but tries to see what's "in scope".
>
> No, these "still reachable" cases are not at all interesting. If we end
> the program by returning up the stack, then perhaps they could be (but
> IMHO they are not generally, because it's perfectly reasonable to leave
> globals pointing to allocated memory at program exit). But if we exit in
> the middle of a function, and variables are left on the stack, what in
> the world is the use of a leak-checker complaining about that?
>
> It is not hurting anything, because by definition we have exited the
> program and released the memory. And it is near-impossible to remove
> these entirely. In the example we're discussing, there's a local
> variable in git_config_push_env() that _could_ be freed before calling
> die(). But not in the general case:
>
> - what about heap allocations in callers? Imagine a program like this:
>
> void foo(const char *str)
> {
> if (some_func(str))
> die("bad str");
> }
> void bar(void)
> {
> char *str = strdup("some string");
> foo(str);
> free(str);
> }
>
> If foo() calls die(), then bar()'s str is a "still reachable" leak.
> But we can't fix it in any reasonable way. foo() doesn't know about
> freeing the string. And bar() doesn't know about calling die().
>
> - likewise, we may actually need the heap-allocated string to call
> die! Consider this:
>
> void foo(void)
> {
> char *str = strdup("some string");
> if (some_func(str))
> die("bad str: %s", str);
> free(str);
> }
>
> You can't avoid a "still reachable" leak here, because die() would
> need to use the string, then free it, then exit.
>
> So if these "still reachable" leaks are not hurting anything, and if
> they are impossible to get rid of, why do we want to care about them?
> Doing so just causes pain with no benefit.
>
>> This is getting a bit academic, but I don't see how you can both say
>> that the "compilers are allowed to modify the program as long as the
>> observable behavior of that abstract machine is unchanged" *and* claim
>> that e.g. the git_config_push_env() case isn't a real leak.
>
> The words "observable behavior" are doing a lot of heavy lifting there.
> That is the behavior which a conforming C program can observe. And in
> this case, nothing is violated. We did not reach the free() call, so
> there was no need to make sure we had the parameters available. The C
> standard's abstract machine doesn't know about things like "the stack is
> memory that you can look at". Doing that is undefined behavior, and a
> program which cares about that is not conforming.
>
> I agree this is mostly academic. My point in bringing it up was just to
> say that no, this isn't a bug in the optimizer. Clobbering values for
> NORETURN is perfectly reasonable and valid according to the standard.
> And leak-checking, while basically undefined behavior from the
> perspective of the standard, is a useful tool. It's the interaction
> between the two that is ugly, and the most useful thing for us to do is
> avoid using both at the same time.
>
>> Because surely the thing that makes it a "leak" by your definition (and
>> what LSAN strives for) is that it's attributed to a variable that's "in
>> scope", but the compiler is free to re-arrange all of that.
>
> If you mean "not a leak", then yes: there's still an in-scope variable
> that points to it, which is what makes it not a leak.
>
>> Anyway, one reason I wanted to punt on that "git --config-env" issue is
>> because we can entirely avoid the malloc()/free() there. See the "-- >8
>> --" below, but if we just malloc() after we do our assertions we can
>> un-confuse clang.
>>
>> And that seems like a good idea in general, and re whether the "leak" is
>> gone, at that point valgrind will stop reporting it, so we're really not
>> leaking at all, not just in the "still reachable" sense.
>
> I'm not convinced it's a good idea. The resulting code requires an extra
> variable and is (IMHO) slightly harder to understand. And what have we
> accomplished? We silenced one false positive, but we know there are
> others. And the linux-leaks CI job is failing on master _right now_, and
> the patch below doesn't fix that.
>
> That makes the job somewhat worthless. And worse, it makes all of CI
> less useful, because if it says "failed" on every build, people will
> start to ignore it.
>
>> The reason I mention all of that is to try to define the problem here. I
>> haven't seen cases where the compilers get it wrong about there being a
>> leak, it's just that they're mis-categorizing them as "still reachable"
>> or not, re your "abstract machine" summary.
>
> This paragraph makes it sound like we are mixing up a real leak and a
> "still reachable" leak. But it is mixing up "there is no leak and
> nothing to fix" with "there is a still reachable leak".
>
>> Now, the other cases in t1300 are from git.c using exit() instead of
>> retur-ing to our main().
>>
>> Among numerous other leak fixes I have queued up I have a fix for that,
>> which fixes the other t1300 cases that have been reported now:
>> https://github.com/avar/git/tree/avar/git-c-do-not-exit
>
> I said it in the last round of discussion, and I'll say it again: this
> is the wrong direction. These are not real leaks, and we should not be
> twisting our code to try to avoid them. We should be fixing our
> leak-checking so that they don't come up.
>
>> The actual meaningful fix here is that we don't need to do this
>> allocation at all. The only reason it's needed is because there's no
>> variant of "sq_quote_buf()" in quote.c that takes a "len"
>> parameter (but I have one locally).
>>
>> If we had that we could just pass a "key" and "key_len" to
>> git_config_push_split_parameter() instead and avoid the allocation
>> altogether. But in lieu of that better fix (which other API users of
>> quote.c would benefit from) let's defer the allocation, which happens
>> to fix the leak reporting on
>
> So as you probably guessed, I absolutely hate this patch and think we
> should not take it.
>
> If we had a version of sq_quote_buf() that took a "len" parameter, then
> sure, it would be reasonable to use. But it absolutely is not worth
> caring about to silence a leak that is a false positive. There is
> _nothing_ wrong with the code as written.
>
> -Peff
>
"-O2" is what goes public, testing it, directly or indirectly, is useful.
Another thing is the pay-off..
We're losing a few eyeballs on "-O2" and some test performance in the way, in
exchange of avoiding some false positives. And even with "-O0" the compiler is
free to keep playing the whack-a-mole with us.
I can easily find motivations for people to switch to "-O0" and do its local
tests for example, and so we add eyeballs... but not so easily the other way
around, to switch to "-O2".
"False positives" makes me think the leak checker does its best. And the
compiler. This "-O0 leak" is ignored by both, with "-O2":
void func()
{
malloc(1024);
exit(1);
}
UNLEAK(), "-O0" as a /second opinion/ /confirmation/, some attention to the
false positives,... are things that doesn't sound so bad. Maybe there is a
better way. Dunno.
Un saludo.
next prev parent reply other threads:[~2022-10-20 18:57 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-18 18:37 linux-leaks CI failure on master Taylor Blau
2022-10-18 19:40 ` Jeff King
2022-10-18 20:15 ` [PATCH] Makefile: force -O0 when compiling with SANITIZE=leak Jeff King
2022-10-18 21:00 ` Ævar Arnfjörð Bjarmason
2022-10-19 7:10 ` Jeff King
2022-10-20 18:57 ` Rubén Justo [this message]
2022-10-21 6:04 ` Jeff King
2022-10-19 15:36 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=939ccb0c-005c-4f98-a6ca-3f8e5cda1641@gmail.com \
--to=rjusto@gmail.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=me@ttaylorr.com \
--cc=peff@peff.net \
--cc=vdye@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).