* [PATCH 1/6] test-lib: use individual lsan dir for --stress runs
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
@ 2025-01-01 20:12 ` Jeff King
2025-01-01 20:12 ` [PATCH 2/6] Revert "index-pack: spawn threads atomically" Jeff King
` (5 subsequent siblings)
6 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-01 20:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
When storing output in test-results/, we usually give each numbered run
in a --stress set its own output file. But we don't do that for storing
LSan logs, so something like:
./t0003-attributes.sh --stress
will have many scripts simultaneously creating, writing to, and deleting
the test-results/t0003-attributes.leak directory. This can cause logs
from one run to be attributed to another, spurious failures when
creation and deletion race, and so on.
This has always been broken, but nobody noticed because it's rare to do
a --stress run with LSan (since the point is for the code to run quickly
many times in order to hit races). But if you're trying to find a race
in the leak sanitizing code, it makes sense to use these together.
We can fix it by using $TEST_RESULTS_BASE, which already incorporates
the stress job suffix.
Signed-off-by: Jeff King <peff@peff.net>
---
t/test-lib.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 1a67adb207..96f2dfb69d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -331,7 +331,7 @@ TEST_RESULTS_BASE="$TEST_RESULTS_DIR/$TEST_NAME$TEST_STRESS_JOB_SFX"
TEST_RESULTS_SAN_FILE_PFX=trace
TEST_RESULTS_SAN_DIR_SFX=leak
TEST_RESULTS_SAN_FILE=
-TEST_RESULTS_SAN_DIR="$TEST_RESULTS_DIR/$TEST_NAME.$TEST_RESULTS_SAN_DIR_SFX"
+TEST_RESULTS_SAN_DIR="$TEST_RESULTS_BASE.$TEST_RESULTS_SAN_DIR_SFX"
TRASH_DIRECTORY="trash directory.$TEST_NAME$TEST_STRESS_JOB_SFX"
test -n "$root" && TRASH_DIRECTORY="$root/$TRASH_DIRECTORY"
case "$TRASH_DIRECTORY" in
--
2.48.0.rc1.363.g2bf91ec010
^ permalink raw reply related [flat|nested] 37+ messages in thread* [PATCH 2/6] Revert "index-pack: spawn threads atomically"
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
2025-01-01 20:12 ` [PATCH 1/6] test-lib: use individual lsan dir for --stress runs Jeff King
@ 2025-01-01 20:12 ` Jeff King
2025-01-01 20:14 ` [PATCH 3/6] test-lib: rely on logs to detect leaks Jeff King
` (4 subsequent siblings)
6 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-01 20:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
This reverts commit 993d38a0669a8056d496797516e743e26b6b8b54.
That commit was trying to solve a race between LSan setting up the
threads stack and another thread calling exit(), by making sure that all
pthread_create() calls have finished before doing any work that might
trigger the exit().
But that isn't sufficient. The setup code actually runs in the
individual threads themselves, not in the spawning thread's call to
pthread_create(). So while it may have improved the race a bit, you can
still trigger it pretty quickly with:
make SANITIZE=leak
cd t
./t5309-pack-delta-cycles.sh --stress
Let's back out that failed attempt so we can try again.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/index-pack.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index d773809c4c..0b62b2589f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1336,15 +1336,13 @@ static void resolve_deltas(struct pack_idx_option *opts)
base_cache_limit = opts->delta_base_cache_limit * nr_threads;
if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
init_thread();
- work_lock();
for (i = 0; i < nr_threads; i++) {
int ret = pthread_create(&thread_data[i].thread, NULL,
threaded_second_pass, thread_data + i);
if (ret)
die(_("unable to create thread: %s"),
strerror(ret));
}
- work_unlock();
for (i = 0; i < nr_threads; i++)
pthread_join(thread_data[i].thread, NULL);
cleanup_thread();
--
2.48.0.rc1.363.g2bf91ec010
^ permalink raw reply related [flat|nested] 37+ messages in thread* [PATCH 3/6] test-lib: rely on logs to detect leaks
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
2025-01-01 20:12 ` [PATCH 1/6] test-lib: use individual lsan dir for --stress runs Jeff King
2025-01-01 20:12 ` [PATCH 2/6] Revert "index-pack: spawn threads atomically" Jeff King
@ 2025-01-01 20:14 ` Jeff King
2025-01-03 12:05 ` Patrick Steinhardt
2025-01-01 20:17 ` [PATCH 4/6] test-lib: simplify leak-log checking Jeff King
` (3 subsequent siblings)
6 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-01 20:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
When we run with sanitizers, we set abort_on_error=1 so that the tests
themselves can detect problems directly (when the buggy program exits
with SIGABRT). This has one blind spot, though: we don't always check
the exit codes for all programs (e.g., helpers like upload-pack invoked
behind the scenes).
For ASan and UBSan this is mostly fine; they exit as soon as they see an
error, so the unexpected abort of the program causes the test to fail
anyway.
But for LSan, the program runs to completion, since we can only check
for leaks at the end. And in that case we could miss leak reports. And
thus we started checking LSan logs in faececa53f (test-lib: have the
"check" mode for SANITIZE=leak consider leak logs, 2022-07-28).
Originally the logs were optional, but logs are generated (and checked)
always as of 8c1d6691bc (test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by
default, 2024-07-11). And we even check them for each test snippet, as
of cf1464331b (test-lib: check for leak logs after every test,
2024-09-24).
So now aborting on error is superfluous for LSan! We can get everything
we need by checking the logs. And checking the logs is actually
preferable, since it gives us more control over silencing false
positives (something we do not yet do, but will soon).
So let's tell LSan to just exit normally, even if it finds leaks. We can
do so with exitcode=0, which also suppresses the abort_on_error flag.
Signed-off-by: Jeff King <peff@peff.net>
---
t/test-lib.sh | 1 +
1 file changed, 1 insertion(+)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 96f2dfb69d..dd2ba6e6cc 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -80,6 +80,7 @@ prepend_var ASAN_OPTIONS : detect_leaks=0
export ASAN_OPTIONS
prepend_var LSAN_OPTIONS : $GIT_SAN_OPTIONS
+prepend_var LSAN_OPTIONS : exitcode=0
prepend_var LSAN_OPTIONS : fast_unwind_on_malloc=0
export LSAN_OPTIONS
--
2.48.0.rc1.363.g2bf91ec010
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 3/6] test-lib: rely on logs to detect leaks
2025-01-01 20:14 ` [PATCH 3/6] test-lib: rely on logs to detect leaks Jeff King
@ 2025-01-03 12:05 ` Patrick Steinhardt
2025-01-03 20:10 ` Jeff King
0 siblings, 1 reply; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 12:05 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Wed, Jan 01, 2025 at 03:14:44PM -0500, Jeff King wrote:
> When we run with sanitizers, we set abort_on_error=1 so that the tests
> themselves can detect problems directly (when the buggy program exits
> with SIGABRT). This has one blind spot, though: we don't always check
> the exit codes for all programs (e.g., helpers like upload-pack invoked
> behind the scenes).
>
> For ASan and UBSan this is mostly fine; they exit as soon as they see an
> error, so the unexpected abort of the program causes the test to fail
> anyway.
>
> But for LSan, the program runs to completion, since we can only check
> for leaks at the end. And in that case we could miss leak reports. And
> thus we started checking LSan logs in faececa53f (test-lib: have the
> "check" mode for SANITIZE=leak consider leak logs, 2022-07-28).
> Originally the logs were optional, but logs are generated (and checked)
> always as of 8c1d6691bc (test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by
> default, 2024-07-11). And we even check them for each test snippet, as
> of cf1464331b (test-lib: check for leak logs after every test,
> 2024-09-24).
>
> So now aborting on error is superfluous for LSan! We can get everything
> we need by checking the logs. And checking the logs is actually
> preferable, since it gives us more control over silencing false
> positives (something we do not yet do, but will soon).
>
> So let's tell LSan to just exit normally, even if it finds leaks. We can
> do so with exitcode=0, which also suppresses the abort_on_error flag.
The only downside I can think of is that we now run the whole testcase
to completion before checking for leaks, whereas beforehand we most
likely aborted the testcase on hitting the first leak. It follows that
we may now have multiple leak reports, and it is not immediately clear
which of the commands has actually been failing.
I think we're now in a clean-enough state regarding memory leaks that
this isn't a huge issue anymore though.
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 3/6] test-lib: rely on logs to detect leaks
2025-01-03 12:05 ` Patrick Steinhardt
@ 2025-01-03 20:10 ` Jeff King
0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-03 20:10 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
On Fri, Jan 03, 2025 at 01:05:41PM +0100, Patrick Steinhardt wrote:
> > So now aborting on error is superfluous for LSan! We can get everything
> > we need by checking the logs. And checking the logs is actually
> > preferable, since it gives us more control over silencing false
> > positives (something we do not yet do, but will soon).
> >
> > So let's tell LSan to just exit normally, even if it finds leaks. We can
> > do so with exitcode=0, which also suppresses the abort_on_error flag.
>
> The only downside I can think of is that we now run the whole testcase
> to completion before checking for leaks, whereas beforehand we most
> likely aborted the testcase on hitting the first leak. It follows that
> we may now have multiple leak reports, and it is not immediately clear
> which of the commands has actually been failing.
True, I didn't think of that. We do at least check the logs after each
case, so it would have to be multiple leaks in the same snippet. The
LSan output also mentions the process name, though not the arguments
(and some snippets may invoke the same program multiple times).
The other thing I guess we'd miss is that SIGABRT can optionally produce
a core dump. I don't think I've ever found that useful for LSan (since
it's not exiting until the end of process) but occasionally have used it
for ASan (though this patch does not change anything there).
> I think we're now in a clean-enough state regarding memory leaks that
> this isn't a huge issue anymore though.
Yeah, agreed. We can see how much of a problem it is in practice. You
can always switch the flag yourself for a specific run, like:
LSAN_OPTIONS=exitcode=23 ./twhatever -i
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 4/6] test-lib: simplify leak-log checking
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
` (2 preceding siblings ...)
2025-01-01 20:14 ` [PATCH 3/6] test-lib: rely on logs to detect leaks Jeff King
@ 2025-01-01 20:17 ` Jeff King
2025-01-03 12:05 ` Patrick Steinhardt
2025-01-01 20:18 ` [PATCH 5/6] test-lib: check leak logs for presence of DEDUP_TOKEN Jeff King
` (2 subsequent siblings)
6 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-01 20:17 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
We have a function to count the number of leaks found (actually, it is
the number of processes which produced a log file). Once upon a time we
cared about seeing if this number increased between runs. But we
simplified that away in 95c679ad86 (test-lib: stop showing old leak
logs, 2024-09-24), and now we only care if it returns any results or
not.
In preparation for refactoring it further, let's drop the counting
function entirely, and roll it into the "is it empty" check. The outcome
should be the same, but we'll be free to return a boolean "did we find
anything" without worrying about somebody adding a new call to the
counting function.
Signed-off-by: Jeff King <peff@peff.net>
---
We need the extra "!" to invert the exit code of the final grep, which
made my head spin a little at first. Maybe it would be less confusing if
this was reporting non-empty results, as "check_test_results_has_leaks"
or something? We'd have to update the callers, but there are only 2 of
them. I dunno.
t/test-lib.sh | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index dd2ba6e6cc..23eb26bfbe 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -340,17 +340,6 @@ case "$TRASH_DIRECTORY" in
*) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$TRASH_DIRECTORY" ;;
esac
-# Utility functions using $TEST_RESULTS_* variables
-nr_san_dir_leaks_ () {
- # stderr piped to /dev/null because the directory may have
- # been "rmdir"'d already.
- find "$TEST_RESULTS_SAN_DIR" \
- -type f \
- -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
- xargs grep -lv "Unable to get registers from thread" |
- wc -l
-}
-
# If --stress was passed, run this test repeatedly in several parallel loops.
if test "$GIT_TEST_STRESS_STARTED" = "done"
then
@@ -1181,8 +1170,14 @@ test_atexit_handler () {
}
check_test_results_san_file_empty_ () {
- test -z "$TEST_RESULTS_SAN_FILE" ||
- test "$(nr_san_dir_leaks_)" = 0
+ test -z "$TEST_RESULTS_SAN_FILE" && return 0
+
+ # stderr piped to /dev/null because the directory may have
+ # been "rmdir"'d already.
+ ! find "$TEST_RESULTS_SAN_DIR" \
+ -type f \
+ -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
+ xargs grep -qv "Unable to get registers from thread"
}
check_test_results_san_file_ () {
--
2.48.0.rc1.363.g2bf91ec010
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 4/6] test-lib: simplify leak-log checking
2025-01-01 20:17 ` [PATCH 4/6] test-lib: simplify leak-log checking Jeff King
@ 2025-01-03 12:05 ` Patrick Steinhardt
2025-01-03 20:24 ` Jeff King
0 siblings, 1 reply; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 12:05 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Wed, Jan 01, 2025 at 03:17:21PM -0500, Jeff King wrote:
> @@ -1181,8 +1170,14 @@ test_atexit_handler () {
> }
>
> check_test_results_san_file_empty_ () {
> - test -z "$TEST_RESULTS_SAN_FILE" ||
> - test "$(nr_san_dir_leaks_)" = 0
> + test -z "$TEST_RESULTS_SAN_FILE" && return 0
> +
> + # stderr piped to /dev/null because the directory may have
> + # been "rmdir"'d already.
> + ! find "$TEST_RESULTS_SAN_DIR" \
> + -type f \
> + -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> + xargs grep -qv "Unable to get registers from thread"
Can't we use `-exec grep -qv "Unable to get registers from thread" {}
\+` instead of using xargs? Or is that unportable? Might make it a bit
easier to reason about the `!` in the presence of a pipe.
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 4/6] test-lib: simplify leak-log checking
2025-01-03 12:05 ` Patrick Steinhardt
@ 2025-01-03 20:24 ` Jeff King
2025-01-06 7:56 ` Patrick Steinhardt
0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-03 20:24 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
On Fri, Jan 03, 2025 at 01:05:45PM +0100, Patrick Steinhardt wrote:
> On Wed, Jan 01, 2025 at 03:17:21PM -0500, Jeff King wrote:
> > @@ -1181,8 +1170,14 @@ test_atexit_handler () {
> > }
> >
> > check_test_results_san_file_empty_ () {
> > - test -z "$TEST_RESULTS_SAN_FILE" ||
> > - test "$(nr_san_dir_leaks_)" = 0
> > + test -z "$TEST_RESULTS_SAN_FILE" && return 0
> > +
> > + # stderr piped to /dev/null because the directory may have
> > + # been "rmdir"'d already.
> > + ! find "$TEST_RESULTS_SAN_DIR" \
> > + -type f \
> > + -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> > + xargs grep -qv "Unable to get registers from thread"
>
> Can't we use `-exec grep -qv "Unable to get registers from thread" {}
> \+` instead of using xargs? Or is that unportable? Might make it a bit
> easier to reason about the `!` in the presence of a pipe.
I don't think that saves us from negating, though. The "grep" will tell
us if it matched any "real" lines, but we want to report that we found
no real lines.
Plus I don't think "find" propagates the exit code from -exec anyway. I
think you can check the exit status with more find logic, so you'd then
use a conditional -print for each file like:
find ... \
-exec grep -qv "Unable to get registers from thread" \{} \; \
-print
and you have to check whether the output is empty. The easiest way to do
that is with another grep! Which also needs negated. ;)
I think if we really want to drop the negation, we'd be best to flip the
function's return, like:
have_leaks() {
# not leak-checking
test -z "$TEST_RESULTS_SAN_FILE" && return 1
find "$TEST_RESULTS_SAN_DIR" \
-type f \
-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
xargs grep ^DEDUP_TOKEN |
grep -qv sanitizer::GetThreadStackTopAndBottom
}
And then you could switch the initial "grep" to -exec if you want, but
there's no negation to get rid of, so it is only a preference of -exec
versus xargs.
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 4/6] test-lib: simplify leak-log checking
2025-01-03 20:24 ` Jeff King
@ 2025-01-06 7:56 ` Patrick Steinhardt
2025-01-07 7:01 ` Jeff King
0 siblings, 1 reply; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 7:56 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Fri, Jan 03, 2025 at 03:24:10PM -0500, Jeff King wrote:
> On Fri, Jan 03, 2025 at 01:05:45PM +0100, Patrick Steinhardt wrote:
>
> > On Wed, Jan 01, 2025 at 03:17:21PM -0500, Jeff King wrote:
> > > @@ -1181,8 +1170,14 @@ test_atexit_handler () {
> > > }
> > >
> > > check_test_results_san_file_empty_ () {
> > > - test -z "$TEST_RESULTS_SAN_FILE" ||
> > > - test "$(nr_san_dir_leaks_)" = 0
> > > + test -z "$TEST_RESULTS_SAN_FILE" && return 0
> > > +
> > > + # stderr piped to /dev/null because the directory may have
> > > + # been "rmdir"'d already.
> > > + ! find "$TEST_RESULTS_SAN_DIR" \
> > > + -type f \
> > > + -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> > > + xargs grep -qv "Unable to get registers from thread"
> >
> > Can't we use `-exec grep -qv "Unable to get registers from thread" {}
> > \+` instead of using xargs? Or is that unportable? Might make it a bit
> > easier to reason about the `!` in the presence of a pipe.
>
> I don't think that saves us from negating, though. The "grep" will tell
> us if it matched any "real" lines, but we want to report that we found
> no real lines.
>
> Plus I don't think "find" propagates the exit code from -exec anyway. I
> think you can check the exit status with more find logic, so you'd then
> use a conditional -print for each file like:
It should. Quoting find(1):
If any invocation with the `+' form returns a non-zero value as exit
status, then find returns a non-zero exit status.
> find ... \
> -exec grep -qv "Unable to get registers from thread" \{} \; \
> -print
>
> and you have to check whether the output is empty. The easiest way to do
> that is with another grep! Which also needs negated. ;)
Yup, I didn't mean to say that we can drop the negation, but that it
makes it easier to reason about what the negation applies to (the whole
pipe or just the find(1) command)).
> I think if we really want to drop the negation, we'd be best to flip the
> function's return, like:
>
> have_leaks() {
> # not leak-checking
> test -z "$TEST_RESULTS_SAN_FILE" && return 1
>
> find "$TEST_RESULTS_SAN_DIR" \
> -type f \
> -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> xargs grep ^DEDUP_TOKEN |
> grep -qv sanitizer::GetThreadStackTopAndBottom
> }
>
> And then you could switch the initial "grep" to -exec if you want, but
> there's no negation to get rid of, so it is only a preference of -exec
> versus xargs.
Yup.
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 4/6] test-lib: simplify leak-log checking
2025-01-06 7:56 ` Patrick Steinhardt
@ 2025-01-07 7:01 ` Jeff King
0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-07 7:01 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
On Mon, Jan 06, 2025 at 08:56:57AM +0100, Patrick Steinhardt wrote:
> > Plus I don't think "find" propagates the exit code from -exec anyway. I
> > think you can check the exit status with more find logic, so you'd then
> > use a conditional -print for each file like:
>
> It should. Quoting find(1):
>
> If any invocation with the `+' form returns a non-zero value as exit
> status, then find returns a non-zero exit status.
Ah, right. I tried using ';' to look at individual files, and it does
ignore the code. But of course we don't need to know which logs had
leaks, only that there was at least one.
I think we can make it even simpler, though. I'll post patches in a
moment.
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 5/6] test-lib: check leak logs for presence of DEDUP_TOKEN
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
` (3 preceding siblings ...)
2025-01-01 20:17 ` [PATCH 4/6] test-lib: simplify leak-log checking Jeff King
@ 2025-01-01 20:18 ` Jeff King
2025-01-01 20:21 ` [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code Jeff King
2025-01-07 7:04 ` [PATCH 0/3] lsan test-lib readability Jeff King
6 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-01 20:18 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
When we check the leak logs, our original strategy was to check for any
non-empty log file produced by LSan. We later amended that to ignore
noisy lines in 370ef7e40d (test-lib: ignore uninteresting LSan output,
2023-08-28).
This makes it hard to ignore noise which is more than a single line;
we'd have to actually parse the file to determine the meaning of each
line.
But there's an easy line-oriented solution. Because we always pass the
dedup_token_length option, the output will contain a DEDUP_TOKEN line
for each leak that has been found. So if we invert our strategy to stop
ignoring useless lines and only look for useful ones, we can just count
the number of DEDUP_TOKEN lines. If it's non-zero, then we found at
least one leak (it would even give us a count of unique leaks, but we
really only care if it is non-zero).
This should yield the same outcome, but will help us build more false
positive detection on top.
Signed-off-by: Jeff King <peff@peff.net>
---
t/test-lib.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 23eb26bfbe..c9487d0805 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1177,7 +1177,7 @@ check_test_results_san_file_empty_ () {
! find "$TEST_RESULTS_SAN_DIR" \
-type f \
-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
- xargs grep -qv "Unable to get registers from thread"
+ xargs grep -q ^DEDUP_TOKEN
}
check_test_results_san_file_ () {
--
2.48.0.rc1.363.g2bf91ec010
^ permalink raw reply related [flat|nested] 37+ messages in thread* [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
` (4 preceding siblings ...)
2025-01-01 20:18 ` [PATCH 5/6] test-lib: check leak logs for presence of DEDUP_TOKEN Jeff King
@ 2025-01-01 20:21 ` Jeff King
2025-01-03 12:05 ` Patrick Steinhardt
2025-01-07 7:04 ` [PATCH 0/3] lsan test-lib readability Jeff King
6 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-01 20:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
Our CI jobs sometimes see false positive leaks like this:
=================================================================
==3904583==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
#1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
#2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
#3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
#4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
#5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
#6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
#7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
This is not a leak in our code, but appears to be a race between one
thread calling exit() while another one is in LSan's stack setup code.
You can reproduce it easily by running t0003 or t5309 with --stress
(these trigger it because of the threading in git-grep and index-pack
respectively).
This may be a bug in LSan, but regardless of whether it is eventually
fixed, it is useful to work around it so that we stop seeing these false
positives.
We can recognize it by the mention of the sanitizer functions in the
DEDUP_TOKEN line. With this patch, the scripts mentioned above should
run with --stress indefinitely.
Signed-off-by: Jeff King <peff@peff.net>
---
One small downside here is that this just suppresses the "were there any
leaks" check. If there's a real leak _and_ the race was triggered, then
you'd see the racy false positive in the output. I don't think that's a
big deal, since both the race and real leaks should be rare-ish, and
you'd have to encounter both in the same run of a given test script.
t/test-lib.sh | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index c9487d0805..d1f62adbf8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
! find "$TEST_RESULTS_SAN_DIR" \
-type f \
-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
- xargs grep -q ^DEDUP_TOKEN
+ xargs grep ^DEDUP_TOKEN |
+ grep -qv sanitizer::GetThreadStackTopAndBottom
}
check_test_results_san_file_ () {
--
2.48.0.rc1.363.g2bf91ec010
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code
2025-01-01 20:21 ` [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code Jeff King
@ 2025-01-03 12:05 ` Patrick Steinhardt
2025-01-03 20:26 ` Jeff King
0 siblings, 1 reply; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-03 12:05 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Wed, Jan 01, 2025 at 03:21:24PM -0500, Jeff King wrote:
> One small downside here is that this just suppresses the "were there any
> leaks" check. If there's a real leak _and_ the race was triggered, then
> you'd see the racy false positive in the output. I don't think that's a
> big deal, since both the race and real leaks should be rare-ish, and
> you'd have to encounter both in the same run of a given test script.
Yeah, I think that's fine. You'd have to both be unlucky and trigger the
leak _and_ you'd have to be debugging a testcase that hits the race in
the first place. Which together feels unlikely enough to really matter.
> t/test-lib.sh | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index c9487d0805..d1f62adbf8 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
> ! find "$TEST_RESULTS_SAN_DIR" \
> -type f \
> -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> - xargs grep -q ^DEDUP_TOKEN
> + xargs grep ^DEDUP_TOKEN |
> + grep -qv sanitizer::GetThreadStackTopAndBottom
> }
It would be nice to provide some more context here in the form of a
comment so that one doesn't have to blame the commit.
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code
2025-01-03 12:05 ` Patrick Steinhardt
@ 2025-01-03 20:26 ` Jeff King
2025-01-06 7:56 ` Patrick Steinhardt
0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-03 20:26 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
On Fri, Jan 03, 2025 at 01:05:48PM +0100, Patrick Steinhardt wrote:
> > diff --git a/t/test-lib.sh b/t/test-lib.sh
> > index c9487d0805..d1f62adbf8 100644
> > --- a/t/test-lib.sh
> > +++ b/t/test-lib.sh
> > @@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
> > ! find "$TEST_RESULTS_SAN_DIR" \
> > -type f \
> > -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> > - xargs grep -q ^DEDUP_TOKEN
> > + xargs grep ^DEDUP_TOKEN |
> > + grep -qv sanitizer::GetThreadStackTopAndBottom
> > }
>
> It would be nice to provide some more context here in the form of a
> comment so that one doesn't have to blame the commit.
We can add that on top, but I'm not sure what it should say. Do you want
something along the lines of "add false positives to ignore here..." or
are an explanation of why we are ignoring this particular false
positive?
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code
2025-01-03 20:26 ` Jeff King
@ 2025-01-06 7:56 ` Patrick Steinhardt
0 siblings, 0 replies; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-06 7:56 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Fri, Jan 03, 2025 at 03:26:45PM -0500, Jeff King wrote:
> On Fri, Jan 03, 2025 at 01:05:48PM +0100, Patrick Steinhardt wrote:
>
> > > diff --git a/t/test-lib.sh b/t/test-lib.sh
> > > index c9487d0805..d1f62adbf8 100644
> > > --- a/t/test-lib.sh
> > > +++ b/t/test-lib.sh
> > > @@ -1177,7 +1177,8 @@ check_test_results_san_file_empty_ () {
> > > ! find "$TEST_RESULTS_SAN_DIR" \
> > > -type f \
> > > -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> > > - xargs grep -q ^DEDUP_TOKEN
> > > + xargs grep ^DEDUP_TOKEN |
> > > + grep -qv sanitizer::GetThreadStackTopAndBottom
> > > }
> >
> > It would be nice to provide some more context here in the form of a
> > comment so that one doesn't have to blame the commit.
>
> We can add that on top, but I'm not sure what it should say. Do you want
> something along the lines of "add false positives to ignore here..." or
> are an explanation of why we are ignoring this particular false
> positive?
The latter, ideally also with a reference to the upstream issue you have
created. That makes it way easier to discover why this line exists and
may prompt people to remove the line eventually if they discover that
the issue has been fixed for a while.
It doesn't have to be a full paragraph, but a sentence or to go a long
way sometimes. For more context people can still blame the commit
message.
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 0/3] lsan test-lib readability
2025-01-01 20:12 ` [PATCH 0/6] a less-invasive racy-leak fix Jeff King
` (5 preceding siblings ...)
2025-01-01 20:21 ` [PATCH 6/6] test-lib: ignore leaks in the sanitizer's thread code Jeff King
@ 2025-01-07 7:04 ` Jeff King
2025-01-07 7:05 ` [PATCH 1/3] test-lib: invert return value of check_test_results_san_file_empty Jeff King
` (2 more replies)
6 siblings, 3 replies; 37+ messages in thread
From: Jeff King @ 2025-01-07 7:04 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
On Wed, Jan 01, 2025 at 03:12:26PM -0500, Jeff King wrote:
> [1/6]: test-lib: use individual lsan dir for --stress runs
> [2/6]: Revert "index-pack: spawn threads atomically"
> [3/6]: test-lib: rely on logs to detect leaks
> [4/6]: test-lib: simplify leak-log checking
> [5/6]: test-lib: check leak logs for presence of DEDUP_TOKEN
> [6/6]: test-lib: ignore leaks in the sanitizer's thread code
And here are a few cosmetic improvements on top based on the review from
Patrick. They should apply on top of jk/lsan-race-ignore-false-positive,
though that is also in 'master' now.
[1/3]: test-lib: invert return value of check_test_results_san_file_empty
[2/3]: test-lib: simplify lsan results check
[3/3]: test-lib: add a few comments to LSan log checking
t/test-lib-functions.sh | 2 +-
t/test-lib.sh | 20 ++++++++++----------
2 files changed, 11 insertions(+), 11 deletions(-)
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread* [PATCH 1/3] test-lib: invert return value of check_test_results_san_file_empty
2025-01-07 7:04 ` [PATCH 0/3] lsan test-lib readability Jeff King
@ 2025-01-07 7:05 ` Jeff King
2025-01-07 7:07 ` [PATCH 2/3] test-lib: simplify lsan results check Jeff King
2025-01-07 7:08 ` [PATCH 3/3] test-lib: add a few comments to LSan log checking Jeff King
2 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-07 7:05 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:
! find ... |
xargs grep leaks |
grep -v false-positives
where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.
Signed-off-by: Jeff King <peff@peff.net>
---
t/test-lib-functions.sh | 2 +-
t/test-lib.sh | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 78e054ab50..c25cee0ad8 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -927,7 +927,7 @@ test_expect_success () {
test -n "$test_skip_test_preamble" ||
say >&3 "expecting success of $TEST_NUMBER.$test_count '$1': $test_body"
if test_run_ "$test_body" &&
- check_test_results_san_file_empty_
+ ! check_test_results_san_file_has_entries_
then
test_ok_ "$1"
else
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d1f62adbf8..be3553e40e 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1169,20 +1169,20 @@ test_atexit_handler () {
teardown_malloc_check
}
-check_test_results_san_file_empty_ () {
- test -z "$TEST_RESULTS_SAN_FILE" && return 0
+check_test_results_san_file_has_entries_ () {
+ test -z "$TEST_RESULTS_SAN_FILE" && return 1
# stderr piped to /dev/null because the directory may have
# been "rmdir"'d already.
- ! find "$TEST_RESULTS_SAN_DIR" \
+ find "$TEST_RESULTS_SAN_DIR" \
-type f \
-name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
xargs grep ^DEDUP_TOKEN |
grep -qv sanitizer::GetThreadStackTopAndBottom
}
check_test_results_san_file_ () {
- if check_test_results_san_file_empty_
+ if ! check_test_results_san_file_has_entries_
then
return
fi &&
--
2.48.0.rc2.377.g0507c08f68
^ permalink raw reply related [flat|nested] 37+ messages in thread* [PATCH 2/3] test-lib: simplify lsan results check
2025-01-07 7:04 ` [PATCH 0/3] lsan test-lib readability Jeff King
2025-01-07 7:05 ` [PATCH 1/3] test-lib: invert return value of check_test_results_san_file_empty Jeff King
@ 2025-01-07 7:07 ` Jeff King
2025-01-07 7:37 ` Patrick Steinhardt
2025-01-07 16:23 ` Junio C Hamano
2025-01-07 7:08 ` [PATCH 3/3] test-lib: add a few comments to LSan log checking Jeff King
2 siblings, 2 replies; 37+ messages in thread
From: Jeff King @ 2025-01-07 7:07 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:
- it's one fewer process to run
- we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
checked at the beginning of the function, and is the same glob use
to show the logs in check_test_results_san_file_
- this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
space in it. For example doing:
mkdir "/tmp/foo bar"
TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test
would yield a lot of:
grep: /tmp/foo: No such file or directory
grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory
when there are leaks. We could do the same thing with "xargs
--null", but that isn't portable.
We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.
We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.
Signed-off-by: Jeff King <peff@peff.net>
---
I was surprised by the use of "grep -s" in t7407, since it is totally
pointless there. But I think we can take its presence as a positive sign
for portability.
t/test-lib.sh | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index be3553e40e..898c2267b8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1172,12 +1172,7 @@ test_atexit_handler () {
check_test_results_san_file_has_entries_ () {
test -z "$TEST_RESULTS_SAN_FILE" && return 1
- # stderr piped to /dev/null because the directory may have
- # been "rmdir"'d already.
- find "$TEST_RESULTS_SAN_DIR" \
- -type f \
- -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
- xargs grep ^DEDUP_TOKEN |
+ grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* |
grep -qv sanitizer::GetThreadStackTopAndBottom
}
--
2.48.0.rc2.377.g0507c08f68
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 2/3] test-lib: simplify lsan results check
2025-01-07 7:07 ` [PATCH 2/3] test-lib: simplify lsan results check Jeff King
@ 2025-01-07 7:37 ` Patrick Steinhardt
2025-01-09 7:57 ` Jeff King
2025-01-07 16:23 ` Junio C Hamano
1 sibling, 1 reply; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 7:37 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Tue, Jan 07, 2025 at 02:07:52AM -0500, Jeff King wrote:
> We want to know if there are any leaks logged by LSan in the results
> directory, so we run "find" on the containing directory and pipe it to
> xargs. We can accomplish the same thing by just globbing in the shell
> and passing the result to grep, which has a few advantages:
>
> - it's one fewer process to run
>
> - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
> checked at the beginning of the function, and is the same glob use
s/use/used
I'm always a bit thrown off by your style of bulleted lists, where they
feel like sentences but start with a lower-case letter, and sometimes
they do and sometimes they don't end with punctuation. Maybe it's just
me not being a native speaker and it's a natural thing to do in English.
In any case, it's nothing that really matters in the end, but would be
happy to learn if this is indeed something you tend to do in English.
> to show the logs in check_test_results_san_file_
>
> - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
> space in it. For example doing:
>
> mkdir "/tmp/foo bar"
> TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test
>
> would yield a lot of:
>
> grep: /tmp/foo: No such file or directory
> grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory
>
> when there are leaks. We could do the same thing with "xargs
> --null", but that isn't portable.
>
> We are now subject to command-line length limits, but that is also true
> of the globbing cat used to show the logs themselves. This hasn't been a
> problem in practice.
Yup, this also came to my mind immediately. But I agree that it
shouldn't be an issue in general.
> We do need to use "grep -s" for the case that the glob does not expand
> (i.e., there are not any log files at all). This option is in POSIX, and
> has been used in t7407 for several years without anybody complaining.
> This also also naturally handles the case where the surrounding
> directory has already been removed (in which case there are likewise no
> files!), dropping the need to comment about it.
Okay. So in case there are no matching files we don't expand the
globbing string, and "--no-messages" makes us ignore that case. A bit
funny, but I don't see any issue with it.
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I was surprised by the use of "grep -s" in t7407, since it is totally
> pointless there. But I think we can take its presence as a positive sign
> for portability.
Good to know.
> t/test-lib.sh | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index be3553e40e..898c2267b8 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1172,12 +1172,7 @@ test_atexit_handler () {
> check_test_results_san_file_has_entries_ () {
> test -z "$TEST_RESULTS_SAN_FILE" && return 1
>
> - # stderr piped to /dev/null because the directory may have
> - # been "rmdir"'d already.
> - find "$TEST_RESULTS_SAN_DIR" \
> - -type f \
> - -name "$TEST_RESULTS_SAN_FILE_PFX.*" 2>/dev/null |
> - xargs grep ^DEDUP_TOKEN |
> + grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* |
> grep -qv sanitizer::GetThreadStackTopAndBottom
And this nicely simplifies things indeed.
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: [PATCH 2/3] test-lib: simplify lsan results check
2025-01-07 7:37 ` Patrick Steinhardt
@ 2025-01-09 7:57 ` Jeff King
2025-01-09 10:00 ` Patrick Steinhardt
0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-09 7:57 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
On Tue, Jan 07, 2025 at 08:37:33AM +0100, Patrick Steinhardt wrote:
> On Tue, Jan 07, 2025 at 02:07:52AM -0500, Jeff King wrote:
> > We want to know if there are any leaks logged by LSan in the results
> > directory, so we run "find" on the containing directory and pipe it to
> > xargs. We can accomplish the same thing by just globbing in the shell
> > and passing the result to grep, which has a few advantages:
> >
> > - it's one fewer process to run
> >
> > - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
> > checked at the beginning of the function, and is the same glob use
>
> s/use/used
>
> I'm always a bit thrown off by your style of bulleted lists, where they
> feel like sentences but start with a lower-case letter, and sometimes
> they do and sometimes they don't end with punctuation. Maybe it's just
> me not being a native speaker and it's a natural thing to do in English.
> In any case, it's nothing that really matters in the end, but would be
> happy to learn if this is indeed something you tend to do in English.
Heh. Yeah, I've seen you mention them before and I've been tempted to
start a big discussion. But I never felt like it was worth it. But
tonight's your lucky night. ;)
In short: I think it's a style question. I perceive them as
continuations of the sentence that has the ":". Though admittedly I do
not always grammatically continue that sentence. So for example I could:
- have one bullet item that completes the sentence.
- and then another that likewise completes it.
;) I think many style guides would frown on that. Especially with the
periods at the end (you might argue that they should be semicolons).
In the example you quoted above they don't grammatically continue the
sentence, so arguably what I'm saying doesn't even apply. But I also
kind of think of the list items as sentence fragments. That sometimes
happen to make a full sentence. Or need punctuation because that
fragments gets so long it contains multiple sentences.
I dunno. You asked if it is something you tend to do in English. It is
something _I_ tend to do in English, but I think most style guides would
suggest against it (but then, most also suggest against bulleted lists
in the first place). (They probably also suggest against lots of
parentheses). So I wouldn't necessarily copy me.
My general feeling is that unless a commit message is inaccurate or hard
to understand, we should mostly let it pass (even typos). Yes, they are
an artifact that is enshrined in the history. But at some point they are
also just a written communication between developers, and we all have
our own voices and styles. And make mistakes. Polishing them is
something we _can_ do collaboratively, but there are diminishing
returns.
In case it is not clear, I would not say the same for documentation,
error messages, etc. Those are artifacts that hits a wider audience, and
we have a tool for polishing them together: git.
And people should still proofread and correct their own messages before
sending. Believe it or not, I do always take a final pass when sending
out my commits and still manage to have errors. ;) A lot of times I end
up improving clarity and wording on the final pass, but end up
introducing a typo (I'm pretty sure that the use/used above was me
switching last-minute between "the same glob we use" and "the same glob
used").
Bringing it back to the example at hand, my assumption is that the
bullet list capitalization and punctuation is mostly a question of
style, and isn't making the result hard to understand. But if it is, I
can try to adjust. I actually wrote a bulleted list in a commit message
earlier today and capitalized it just for you. :)
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/3] test-lib: simplify lsan results check
2025-01-09 7:57 ` Jeff King
@ 2025-01-09 10:00 ` Patrick Steinhardt
0 siblings, 0 replies; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-09 10:00 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Thu, Jan 09, 2025 at 02:57:50AM -0500, Jeff King wrote:
> On Tue, Jan 07, 2025 at 08:37:33AM +0100, Patrick Steinhardt wrote:
>
> > On Tue, Jan 07, 2025 at 02:07:52AM -0500, Jeff King wrote:
> > > We want to know if there are any leaks logged by LSan in the results
> > > directory, so we run "find" on the containing directory and pipe it to
> > > xargs. We can accomplish the same thing by just globbing in the shell
> > > and passing the result to grep, which has a few advantages:
> > >
> > > - it's one fewer process to run
> > >
> > > - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
> > > checked at the beginning of the function, and is the same glob use
> >
> > s/use/used
> >
> > I'm always a bit thrown off by your style of bulleted lists, where they
> > feel like sentences but start with a lower-case letter, and sometimes
> > they do and sometimes they don't end with punctuation. Maybe it's just
> > me not being a native speaker and it's a natural thing to do in English.
> > In any case, it's nothing that really matters in the end, but would be
> > happy to learn if this is indeed something you tend to do in English.
>
> Heh. Yeah, I've seen you mention them before and I've been tempted to
> start a big discussion. But I never felt like it was worth it. But
> tonight's your lucky night. ;)
>
> In short: I think it's a style question. I perceive them as
> continuations of the sentence that has the ":". Though admittedly I do
> not always grammatically continue that sentence. So for example I could:
>
> - have one bullet item that completes the sentence.
>
> - and then another that likewise completes it.
>
> ;) I think many style guides would frown on that. Especially with the
> periods at the end (you might argue that they should be semicolons).
>
> In the example you quoted above they don't grammatically continue the
> sentence, so arguably what I'm saying doesn't even apply. But I also
> kind of think of the list items as sentence fragments. That sometimes
> happen to make a full sentence. Or need punctuation because that
> fragments gets so long it contains multiple sentences.
>
> I dunno. You asked if it is something you tend to do in English. It is
> something _I_ tend to do in English, but I think most style guides would
> suggest against it (but then, most also suggest against bulleted lists
> in the first place). (They probably also suggest against lots of
> parentheses). So I wouldn't necessarily copy me.
>
> My general feeling is that unless a commit message is inaccurate or hard
> to understand, we should mostly let it pass (even typos). Yes, they are
> an artifact that is enshrined in the history. But at some point they are
> also just a written communication between developers, and we all have
> our own voices and styles. And make mistakes. Polishing them is
> something we _can_ do collaboratively, but there are diminishing
> returns.
Yup, agreed. It's a minor detail and I'm happy to gloss over it in the
future.
> In case it is not clear, I would not say the same for documentation,
> error messages, etc. Those are artifacts that hits a wider audience, and
> we have a tool for polishing them together: git.
>
> And people should still proofread and correct their own messages before
> sending. Believe it or not, I do always take a final pass when sending
> out my commits and still manage to have errors. ;) A lot of times I end
> up improving clarity and wording on the final pass, but end up
> introducing a typo (I'm pretty sure that the use/used above was me
> switching last-minute between "the same glob we use" and "the same glob
> used").
>
> Bringing it back to the example at hand, my assumption is that the
> bullet list capitalization and punctuation is mostly a question of
> style, and isn't making the result hard to understand. But if it is, I
> can try to adjust. I actually wrote a bulleted list in a commit message
> earlier today and capitalized it just for you. :)
Thanks for explaining!
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/3] test-lib: simplify lsan results check
2025-01-07 7:07 ` [PATCH 2/3] test-lib: simplify lsan results check Jeff King
2025-01-07 7:37 ` Patrick Steinhardt
@ 2025-01-07 16:23 ` Junio C Hamano
2025-01-09 7:59 ` Jeff King
1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2025-01-07 16:23 UTC (permalink / raw)
To: Jeff King; +Cc: Patrick Steinhardt, git
Jeff King <peff@peff.net> writes:
> We want to know if there are any leaks logged by LSan in the results
> directory, so we run "find" on the containing directory and pipe it to
> xargs. We can accomplish the same thing by just globbing in the shell
> and passing the result to grep, which has a few advantages:
>
> - it's one fewer process to run
> ...
> We are now subject to command-line length limits, but that is also true
> of the globbing cat used to show the logs themselves. This hasn't been a
> problem in practice.
Nice to see it mentioned here. And the resulting code does become
simpler to reason about.
> We do need to use "grep -s" for the case that the glob does not expand
> (i.e., there are not any log files at all). This option is in POSIX, and
> has been used in t7407 for several years without anybody complaining.
Also since c625bf0e (git-p4: git-p4 tests with p4 triggers,
2017-07-13) t9831 has also been using it. It is not like a stray
error message about unmatched glob would really matter here, though.
We are not doing 2>&1 to let the downstream of the pipe see it, and
unless the test is run under "-v" option, it wouldn't even be seen.
> This also also naturally handles the case where the surrounding
> directory has already been removed (in which case there are likewise no
> files!), dropping the need to comment about it.
Nice.
Thanks.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/3] test-lib: simplify lsan results check
2025-01-07 16:23 ` Junio C Hamano
@ 2025-01-09 7:59 ` Jeff King
0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2025-01-09 7:59 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Patrick Steinhardt, git
On Tue, Jan 07, 2025 at 08:23:34AM -0800, Junio C Hamano wrote:
> > We do need to use "grep -s" for the case that the glob does not expand
> > (i.e., there are not any log files at all). This option is in POSIX, and
> > has been used in t7407 for several years without anybody complaining.
>
> Also since c625bf0e (git-p4: git-p4 tests with p4 triggers,
> 2017-07-13) t9831 has also been using it. It is not like a stray
> error message about unmatched glob would really matter here, though.
> We are not doing 2>&1 to let the downstream of the pipe see it, and
> unless the test is run under "-v" option, it wouldn't even be seen.
Yeah, I saw those. But I don't think they count since hardly anybody
runs the p4 tests. They do run in CI, but on a rather limited set of
platforms. Though come to think of it, this one would only kick in for
LSan, which may also run on a pretty limited set of platforms. :)
-Peff
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 3/3] test-lib: add a few comments to LSan log checking
2025-01-07 7:04 ` [PATCH 0/3] lsan test-lib readability Jeff King
2025-01-07 7:05 ` [PATCH 1/3] test-lib: invert return value of check_test_results_san_file_empty Jeff King
2025-01-07 7:07 ` [PATCH 2/3] test-lib: simplify lsan results check Jeff King
@ 2025-01-07 7:08 ` Jeff King
2025-01-07 7:37 ` Patrick Steinhardt
2 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2025-01-07 7:08 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: Junio C Hamano, git
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).
Signed-off-by: Jeff King <peff@peff.net>
---
t/test-lib.sh | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 898c2267b8..9f27a49995 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1172,6 +1172,11 @@ test_atexit_handler () {
check_test_results_san_file_has_entries_ () {
test -z "$TEST_RESULTS_SAN_FILE" && return 1
+ # Lines marked with DEDUP_TOKEN show unique leaks. We only care that we
+ # found at least one.
+ #
+ # But also suppress any false positives caused by bugs or races in the
+ # sanitizer itself.
grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* |
grep -qv sanitizer::GetThreadStackTopAndBottom
}
--
2.48.0.rc2.377.g0507c08f68
^ permalink raw reply related [flat|nested] 37+ messages in thread* Re: [PATCH 3/3] test-lib: add a few comments to LSan log checking
2025-01-07 7:08 ` [PATCH 3/3] test-lib: add a few comments to LSan log checking Jeff King
@ 2025-01-07 7:37 ` Patrick Steinhardt
0 siblings, 0 replies; 37+ messages in thread
From: Patrick Steinhardt @ 2025-01-07 7:37 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, git
On Tue, Jan 07, 2025 at 02:08:31AM -0500, Jeff King wrote:
> Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
> code, 2025-01-01) added code to suppress a false positive in the leak
> checker. But if you're just reading the code, the obscure grep call is a
> bit of a head-scratcher. Let's add a brief comment explaining what's
> going on (and anybody digging further can find this commit or that one
> for all the details).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> t/test-lib.sh | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 898c2267b8..9f27a49995 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1172,6 +1172,11 @@ test_atexit_handler () {
> check_test_results_san_file_has_entries_ () {
> test -z "$TEST_RESULTS_SAN_FILE" && return 1
>
> + # Lines marked with DEDUP_TOKEN show unique leaks. We only care that we
> + # found at least one.
> + #
> + # But also suppress any false positives caused by bugs or races in the
> + # sanitizer itself.
> grep -s ^DEDUP_TOKEN "$TEST_RESULTS_SAN_FILE".* |
> grep -qv sanitizer::GetThreadStackTopAndBottom
> }
Thanks for adding this comment!
Patrick
^ permalink raw reply [flat|nested] 37+ messages in thread