* [PATCH v14 12/13] fsmonitor: add tests for Linux
2026-04-09 4:59 ` [PATCH v14 " Paul Tarjan via GitGitGadget
@ 2026-04-09 4:59 ` Paul Tarjan via GitGitGadget
2026-04-14 20:20 ` SZEDER Gábor
0 siblings, 1 reply; 5+ messages in thread
From: Paul Tarjan via GitGitGadget @ 2026-04-09 4:59 UTC (permalink / raw)
To: git; +Cc: Patrick Steinhardt, Paul Tarjan, Paul Tarjan, Paul Tarjan
From: Paul Tarjan <github@paulisageek.com>
Add a smoke test that verifies the filesystem actually delivers
inotify events to the daemon. On some configurations (e.g.,
overlayfs with older kernels), inotify watches succeed but events
are never delivered. The daemon cookie wait will time out, but
every subsequent test would fail. Skip the entire test file early
when this is detected.
Add a test that exercises rapid nested directory creation to verify
the daemon correctly handles the EEXIST race between recursive scan
and queued inotify events. When IN_MASK_CREATE is available and a
directory watch is added during recursive registration, the kernel
may also deliver a queued IN_CREATE event for the same directory.
The second inotify_add_watch() returns EEXIST, which must be treated
as harmless. An earlier version of the listener crashed in this
scenario.
Reduce --start-timeout from the default 60 seconds to 10 seconds so
that tests fail promptly when the daemon cannot start.
Harden the test helpers to work in environments without procps
(e.g., Fedora CI): fall back to reading /proc/$pid/stat for the
process group ID when ps is unavailable, guard stop_git() against
an empty pgid, and redirect stderr from kill to /dev/null to avoid
noise when processes have already exited.
Use set -m to enable job control in the submodule-pull test so that
the background git pull gets its own process group, preventing the
shell wait from blocking on the daemon. setsid() in the previous
commit detaches the daemon itself, but the intermediate git pull
process still needs its own process group for the test shell to
manage it correctly.
Signed-off-by: Paul Tarjan <github@paulisageek.com>
---
t/t7527-builtin-fsmonitor.sh | 89 +++++++++++++++++++++++++++++++++---
1 file changed, 82 insertions(+), 7 deletions(-)
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 409cd0cd12..774da5ac60 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -10,9 +10,58 @@ then
test_done
fi
+# Verify that the filesystem delivers events to the daemon.
+# On some configurations (e.g., overlayfs with older kernels),
+# inotify watches succeed but events are never delivered. The
+# cookie wait will time out and the daemon logs a trace message.
+#
+# Use "timeout" (if available) to guard each step against hangs.
+maybe_timeout () {
+ if type timeout >/dev/null 2>&1
+ then
+ timeout "$@"
+ else
+ shift
+ "$@"
+ fi
+}
+verify_fsmonitor_works () {
+ git init test_fsmonitor_smoke || return 1
+
+ GIT_TRACE_FSMONITOR="$PWD/smoke.trace" &&
+ export GIT_TRACE_FSMONITOR &&
+ maybe_timeout 30 \
+ git -C test_fsmonitor_smoke fsmonitor--daemon start \
+ --start-timeout=10
+ ret=$?
+ unset GIT_TRACE_FSMONITOR
+ if test $ret -ne 0
+ then
+ rm -rf test_fsmonitor_smoke smoke.trace
+ return 1
+ fi
+
+ maybe_timeout 10 \
+ test-tool -C test_fsmonitor_smoke fsmonitor-client query \
+ --token 0 >/dev/null 2>&1
+ maybe_timeout 5 \
+ git -C test_fsmonitor_smoke fsmonitor--daemon stop 2>/dev/null
+ ! grep -q "cookie_wait timed out" "$PWD/smoke.trace" 2>/dev/null
+ ret=$?
+ rm -rf test_fsmonitor_smoke smoke.trace
+ return $ret
+}
+
+if ! verify_fsmonitor_works
+then
+ skip_all="filesystem does not deliver fsmonitor events (container/overlayfs?)"
+ test_done
+fi
+
stop_daemon_delete_repo () {
r=$1 &&
- test_might_fail git -C $r fsmonitor--daemon stop &&
+ test_might_fail maybe_timeout 30 \
+ git -C $r fsmonitor--daemon stop 2>/dev/null
rm -rf $1
}
@@ -67,7 +116,7 @@ start_daemon () {
export GIT_TEST_FSMONITOR_TOKEN
fi &&
- git $r fsmonitor--daemon start &&
+ git $r fsmonitor--daemon start --start-timeout=10 &&
git $r fsmonitor--daemon status
)
}
@@ -520,6 +569,28 @@ test_expect_success 'directory changes to a file' '
grep "^event: dir1$" .git/trace
'
+test_expect_success 'rapid nested directory creation' '
+ test_when_finished "git fsmonitor--daemon stop; rm -rf rapid" &&
+
+ start_daemon --tf "$PWD/.git/trace" &&
+
+ # Rapidly create nested directories to exercise race conditions
+ # where directory watches may be added concurrently during
+ # event processing and recursive scanning.
+ for i in $(test_seq 1 20)
+ do
+ mkdir -p "rapid/nested/dir$i/subdir/deep" || return 1
+ done &&
+
+ # Give the daemon time to process all events
+ sleep 1 &&
+
+ test-tool fsmonitor-client query --token 0 &&
+
+ # Verify daemon is still running (did not crash)
+ git fsmonitor--daemon status
+'
+
# The next few test cases exercise the token-resync code. When filesystem
# drops events (because of filesystem velocity or because the daemon isn't
# polling fast enough), we need to discard the cached data (relative to the
@@ -910,7 +981,10 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
start_git_in_background () {
git "$@" &
git_pid=$!
- git_pgid=$(ps -o pgid= -p $git_pid)
+ git_pgid=$(ps -o pgid= -p $git_pid 2>/dev/null ||
+ awk '{print $5}' /proc/$git_pid/stat 2>/dev/null) &&
+ git_pgid="${git_pgid## }" &&
+ git_pgid="${git_pgid%% }"
nr_tries_left=10
while true
do
@@ -921,15 +995,16 @@ start_git_in_background () {
fi
sleep 1
nr_tries_left=$(($nr_tries_left - 1))
- done >/dev/null 2>&1 &
+ done >/dev/null 2>&1 3>&- 4>&- 5>&- 6>&- 7>&- &
watchdog_pid=$!
wait $git_pid
}
stop_git () {
- while kill -0 -- -$git_pgid
+ test -n "$git_pgid" || return 0
+ while kill -0 -- -$git_pgid 2>/dev/null
do
- kill -- -$git_pgid
+ kill -- -$git_pgid 2>/dev/null
sleep 1
done
}
@@ -944,7 +1019,7 @@ stop_watchdog () {
test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
test_atexit "stop_watchdog" &&
- test_when_finished "stop_git; rm -rf cloned super sub" &&
+ test_when_finished "set +m; stop_git; rm -rf cloned super sub" &&
create_super super &&
create_sub sub &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v14 12/13] fsmonitor: add tests for Linux
2026-04-09 4:59 ` [PATCH v14 12/13] fsmonitor: add tests " Paul Tarjan via GitGitGadget
@ 2026-04-14 20:20 ` SZEDER Gábor
2026-04-14 20:40 ` Junio C Hamano
0 siblings, 1 reply; 5+ messages in thread
From: SZEDER Gábor @ 2026-04-14 20:20 UTC (permalink / raw)
To: Paul Tarjan via GitGitGadget
Cc: git, Patrick Steinhardt, Paul Tarjan, Paul Tarjan
On Thu, Apr 09, 2026 at 04:59:34AM +0000, Paul Tarjan via GitGitGadget wrote:
> From: Paul Tarjan <github@paulisageek.com>
>
> Add a smoke test that verifies the filesystem actually delivers
> inotify events to the daemon. On some configurations (e.g.,
> overlayfs with older kernels), inotify watches succeed but events
> are never delivered. The daemon cookie wait will time out, but
> every subsequent test would fail. Skip the entire test file early
> when this is detected.
>
> Add a test that exercises rapid nested directory creation to verify
> the daemon correctly handles the EEXIST race between recursive scan
> and queued inotify events. When IN_MASK_CREATE is available and a
> directory watch is added during recursive registration, the kernel
> may also deliver a queued IN_CREATE event for the same directory.
> The second inotify_add_watch() returns EEXIST, which must be treated
> as harmless. An earlier version of the listener crashed in this
> scenario.
>
> Reduce --start-timeout from the default 60 seconds to 10 seconds so
> that tests fail promptly when the daemon cannot start.
>
> Harden the test helpers to work in environments without procps
> (e.g., Fedora CI): fall back to reading /proc/$pid/stat for the
> process group ID when ps is unavailable, guard stop_git() against
> an empty pgid, and redirect stderr from kill to /dev/null to avoid
> noise when processes have already exited.
>
> Use set -m to enable job control in the submodule-pull test so that
> the background git pull gets its own process group, preventing the
> shell wait from blocking on the daemon. setsid() in the previous
> commit detaches the daemon itself, but the intermediate git pull
> process still needs its own process group for the test shell to
> manage it correctly.
>
> Signed-off-by: Paul Tarjan <github@paulisageek.com>
> ---
> t/t7527-builtin-fsmonitor.sh | 89 +++++++++++++++++++++++++++++++++---
> 1 file changed, 82 insertions(+), 7 deletions(-)
>
> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 409cd0cd12..774da5ac60 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -10,9 +10,58 @@ then
> test_done
> fi
>
> +# Verify that the filesystem delivers events to the daemon.
> +# On some configurations (e.g., overlayfs with older kernels),
> +# inotify watches succeed but events are never delivered. The
> +# cookie wait will time out and the daemon logs a trace message.
> +#
> +# Use "timeout" (if available) to guard each step against hangs.
> +maybe_timeout () {
> + if type timeout >/dev/null 2>&1
> + then
> + timeout "$@"
> + else
> + shift
> + "$@"
> + fi
> +}
> +verify_fsmonitor_works () {
> + git init test_fsmonitor_smoke || return 1
> +
> + GIT_TRACE_FSMONITOR="$PWD/smoke.trace" &&
> + export GIT_TRACE_FSMONITOR &&
> + maybe_timeout 30 \
> + git -C test_fsmonitor_smoke fsmonitor--daemon start \
> + --start-timeout=10
> + ret=$?
> + unset GIT_TRACE_FSMONITOR
> + if test $ret -ne 0
> + then
> + rm -rf test_fsmonitor_smoke smoke.trace
> + return 1
> + fi
> +
> + maybe_timeout 10 \
> + test-tool -C test_fsmonitor_smoke fsmonitor-client query \
> + --token 0 >/dev/null 2>&1
> + maybe_timeout 5 \
> + git -C test_fsmonitor_smoke fsmonitor--daemon stop 2>/dev/null
> + ! grep -q "cookie_wait timed out" "$PWD/smoke.trace" 2>/dev/null
> + ret=$?
> + rm -rf test_fsmonitor_smoke smoke.trace
> + return $ret
> +}
> +
> +if ! verify_fsmonitor_works
> +then
> + skip_all="filesystem does not deliver fsmonitor events (container/overlayfs?)"
> + test_done
> +fi
> +
> stop_daemon_delete_repo () {
> r=$1 &&
> - test_might_fail git -C $r fsmonitor--daemon stop &&
> + test_might_fail maybe_timeout 30 \
> + git -C $r fsmonitor--daemon stop 2>/dev/null
"test_might_fail" only allows a few select commands and functions, and
the "maybe_timeout" helper function introduced in this patch is, of
course, not one of them, so it returns with error and without running
the given command. Consequently, after this test script is finished I
have still two fsmonitor daemon processes running in the background.
Alas, this went unnoticed, because this patch broke the &&-chain and
redirected "test_might_fail"'s
test_must_fail: only 'git' is allowed: maybe_timeout 30 git -C test_explicit fsmonitor--daemon stop
error messages to /dev/null. With the &&-chain restored over 40 test
cases fail because of this.
> rm -rf $1
> }
>
> @@ -67,7 +116,7 @@ start_daemon () {
> export GIT_TEST_FSMONITOR_TOKEN
> fi &&
>
> - git $r fsmonitor--daemon start &&
> + git $r fsmonitor--daemon start --start-timeout=10 &&
> git $r fsmonitor--daemon status
> )
> }
> @@ -520,6 +569,28 @@ test_expect_success 'directory changes to a file' '
> grep "^event: dir1$" .git/trace
> '
>
> +test_expect_success 'rapid nested directory creation' '
> + test_when_finished "git fsmonitor--daemon stop; rm -rf rapid" &&
> +
> + start_daemon --tf "$PWD/.git/trace" &&
> +
> + # Rapidly create nested directories to exercise race conditions
> + # where directory watches may be added concurrently during
> + # event processing and recursive scanning.
> + for i in $(test_seq 1 20)
> + do
> + mkdir -p "rapid/nested/dir$i/subdir/deep" || return 1
> + done &&
> +
> + # Give the daemon time to process all events
> + sleep 1 &&
> +
> + test-tool fsmonitor-client query --token 0 &&
> +
> + # Verify daemon is still running (did not crash)
> + git fsmonitor--daemon status
> +'
> +
> # The next few test cases exercise the token-resync code. When filesystem
> # drops events (because of filesystem velocity or because the daemon isn't
> # polling fast enough), we need to discard the cached data (relative to the
> @@ -910,7 +981,10 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
> start_git_in_background () {
> git "$@" &
> git_pid=$!
> - git_pgid=$(ps -o pgid= -p $git_pid)
> + git_pgid=$(ps -o pgid= -p $git_pid 2>/dev/null ||
> + awk '{print $5}' /proc/$git_pid/stat 2>/dev/null) &&
> + git_pgid="${git_pgid## }" &&
> + git_pgid="${git_pgid%% }"
> nr_tries_left=10
> while true
> do
> @@ -921,15 +995,16 @@ start_git_in_background () {
> fi
> sleep 1
> nr_tries_left=$(($nr_tries_left - 1))
> - done >/dev/null 2>&1 &
> + done >/dev/null 2>&1 3>&- 4>&- 5>&- 6>&- 7>&- &
> watchdog_pid=$!
> wait $git_pid
> }
>
> stop_git () {
> - while kill -0 -- -$git_pgid
> + test -n "$git_pgid" || return 0
> + while kill -0 -- -$git_pgid 2>/dev/null
> do
> - kill -- -$git_pgid
> + kill -- -$git_pgid 2>/dev/null
> sleep 1
> done
> }
> @@ -944,7 +1019,7 @@ stop_watchdog () {
>
> test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
> test_atexit "stop_watchdog" &&
> - test_when_finished "stop_git; rm -rf cloned super sub" &&
> + test_when_finished "set +m; stop_git; rm -rf cloned super sub" &&
>
> create_super super &&
> create_sub sub &&
> --
> gitgitgadget
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v14 12/13] fsmonitor: add tests for Linux
2026-04-14 20:20 ` SZEDER Gábor
@ 2026-04-14 20:40 ` Junio C Hamano
2026-04-14 22:13 ` Jeff King
0 siblings, 1 reply; 5+ messages in thread
From: Junio C Hamano @ 2026-04-14 20:40 UTC (permalink / raw)
To: SZEDER Gábor
Cc: Paul Tarjan via GitGitGadget, git, Patrick Steinhardt,
Paul Tarjan, Paul Tarjan
SZEDER Gábor <szeder.dev@gmail.com> writes:
>> stop_daemon_delete_repo () {
>> r=$1 &&
>> - test_might_fail git -C $r fsmonitor--daemon stop &&
>> + test_might_fail maybe_timeout 30 \
>> + git -C $r fsmonitor--daemon stop 2>/dev/null
>
> "test_might_fail" only allows a few select commands and functions, and
> the "maybe_timeout" helper function introduced in this patch is, of
> course, not one of them, so it returns with error and without running
> the given command. Consequently, after this test script is finished I
> have still two fsmonitor daemon processes running in the background.
>
> Alas, this went unnoticed, because this patch broke the &&-chain and
> redirected "test_might_fail"'s
>
> test_must_fail: only 'git' is allowed: maybe_timeout 30 git -C test_explicit fsmonitor--daemon stop
>
> error messages to /dev/null. With the &&-chain restored over 40 test
> cases fail because of this.
>
>> rm -rf $1
>> }
Yikes. Would it help to apply a patch like this, then?
t/t7527-builtin-fsmonitor.sh | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git c/t/t7527-builtin-fsmonitor.sh w/t/t7527-builtin-fsmonitor.sh
index 774da5ac60..dfa06395f6 100755
--- c/t/t7527-builtin-fsmonitor.sh
+++ w/t/t7527-builtin-fsmonitor.sh
@@ -60,8 +60,7 @@ fi
stop_daemon_delete_repo () {
r=$1 &&
- test_might_fail maybe_timeout 30 \
- git -C $r fsmonitor--daemon stop 2>/dev/null
+ maybe_timeout 30 git -C $r fsmonitor--daemon stop || : &&
rm -rf $1
}
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v14 12/13] fsmonitor: add tests for Linux
2026-04-14 20:40 ` Junio C Hamano
@ 2026-04-14 22:13 ` Jeff King
0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2026-04-14 22:13 UTC (permalink / raw)
To: Junio C Hamano
Cc: SZEDER Gábor, Paul Tarjan via GitGitGadget, git,
Patrick Steinhardt, Paul Tarjan, Paul Tarjan
On Tue, Apr 14, 2026 at 01:40:38PM -0700, Junio C Hamano wrote:
> diff --git c/t/t7527-builtin-fsmonitor.sh w/t/t7527-builtin-fsmonitor.sh
> index 774da5ac60..dfa06395f6 100755
> --- c/t/t7527-builtin-fsmonitor.sh
> +++ w/t/t7527-builtin-fsmonitor.sh
> @@ -60,8 +60,7 @@ fi
>
> stop_daemon_delete_repo () {
> r=$1 &&
> - test_might_fail maybe_timeout 30 \
> - git -C $r fsmonitor--daemon stop 2>/dev/null
> + maybe_timeout 30 git -C $r fsmonitor--daemon stop || : &&
> rm -rf $1
> }
Do we need to put it in curly braces to avoid interfering with the &&
chain? Otherwise a failure of anything before the maybe_timeout will hit
the "||".
I guess in this case it is just "r=$1", which will never fail, but it
feels like we should model best practice.
-Peff
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v14 12/13] fsmonitor: add tests for Linux
[not found] <20260414221335.GA3413665@coredump.intra.peff.net>
@ 2026-04-15 13:26 ` Paul Tarjan
0 siblings, 0 replies; 5+ messages in thread
From: Paul Tarjan @ 2026-04-15 13:26 UTC (permalink / raw)
To: git; +Cc: peff, szeder.dev, gitster, ps, gitgitgadget
Jeff King <peff@peff.net> writes:
> Do we need to put it in curly braces to avoid interfering with the &&
> chain? Otherwise a failure of anything before the maybe_timeout will hit
> the "||".
Good point. Fixed in v15:
stop_daemon_delete_repo () {
r=$1 &&
{ maybe_timeout 30 git -C $r fsmonitor--daemon stop 2>/dev/null || :; } &&
rm -rf $1
}
Thanks for catching this. The broken &&-chain and
2>/dev/null were hiding the test_might_fail error.
Verified that test_might_fail rejects maybe_timeout (exit code 1,
command never runs), while the { ... || :; } version actually
executes the stop. No other instances of test_might_fail maybe_timeout in the
repo.
Paul
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-04-15 13:26 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260414221335.GA3413665@coredump.intra.peff.net>
2026-04-15 13:26 ` [PATCH v14 12/13] fsmonitor: add tests for Linux Paul Tarjan
2026-04-06 17:54 [PATCH v13 00/13] fsmonitor: implement filesystem change listener " Paul Tarjan via GitGitGadget
2026-04-09 4:59 ` [PATCH v14 " Paul Tarjan via GitGitGadget
2026-04-09 4:59 ` [PATCH v14 12/13] fsmonitor: add tests " Paul Tarjan via GitGitGadget
2026-04-14 20:20 ` SZEDER Gábor
2026-04-14 20:40 ` Junio C Hamano
2026-04-14 22:13 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox