git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] fsmonitor OSX: fix hangs for submodules
@ 2024-09-29  2:41 Koji Nakamaru via GitGitGadget
  2024-09-30 17:40 ` Junio C Hamano
  2024-10-01  5:09 ` [PATCH v2] " Koji Nakamaru via GitGitGadget
  0 siblings, 2 replies; 19+ messages in thread
From: Koji Nakamaru via GitGitGadget @ 2024-09-29  2:41 UTC (permalink / raw)
  To: git; +Cc: Koji Nakamaru, Koji Nakamaru

From: Koji Nakamaru <koji.nakamaru@gree.net>

fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
the value with trailing "/." (as repo_get_git_dir(the_repository) on
Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
IS_OUTSIDE_CONE.

In this case, fsevent_callback() doesn't update cookie_list so that
fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
not invoked.

As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
that with_lock__mark_cookies_seen() should unlock, the whole daemon
hangs.

Remove trailing "/." from state->path_gitdir_watch.buf for submodules
and add a corresponding test in t7527-builtin-fsmonitor.sh.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
---
    fsmonitor/darwin: fix hangs for submodules

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1802%2FKojiNakamaru%2Ffix%2Ffsmonitor-darwin-hangs-for-submodules-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1802/KojiNakamaru/fix/fsmonitor-darwin-hangs-for-submodules-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1802

 builtin/fsmonitor--daemon.c  |  1 +
 t/t7527-builtin-fsmonitor.sh | 39 ++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index dce8a3b2482..e1e6b96d09e 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -1314,6 +1314,7 @@ static int fsmonitor_run_daemon(void)
 		strbuf_reset(&state.path_gitdir_watch);
 		strbuf_addstr(&state.path_gitdir_watch,
 			      absolute_path(repo_get_git_dir(the_repository)));
+		strbuf_strip_suffix(&state.path_gitdir_watch, "/.");
 		state.nr_paths_watching = 2;
 	}
 
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 730f3c7f810..7acd074a97f 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -82,6 +82,28 @@ have_t2_data_event () {
 	grep -e '"event":"data".*"category":"'"$c"'".*"key":"'"$k"'"'
 }
 
+start_git_in_background () {
+	git "$@" &
+	git_pid=$!
+	nr_tries_left=10
+	while true
+	do
+		if test $nr_tries_left -eq 0
+		then
+			kill $git_pid
+			exit 1
+		fi
+		sleep 1
+		nr_tries_left=$(($nr_tries_left - 1))
+	done > /dev/null 2>&1 &
+	watchdog_pid=$!
+	wait $git_pid
+}
+
+stop_git_and_watchdog () {
+	kill $git_pid $watchdog_pid
+}
+
 test_expect_success 'explicit daemon start and stop' '
 	test_when_finished "stop_daemon_delete_repo test_explicit" &&
 
@@ -907,6 +929,23 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
 	test_subcommand git fsmonitor--daemon start <super-sub.trace
 '
 
+test_expect_success "submodule implicitly starts daemon by pull" '
+	test_atexit "stop_git_and_watchdog" &&
+	test_when_finished "rm -rf cloned; \
+			    rm -rf super; \
+			    rm -rf sub" &&
+
+	create_super super &&
+	create_sub sub &&
+
+	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
+	git -C super commit -m "add sub" &&
+	git clone --recurse-submodules super cloned &&
+
+	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
+	start_git_in_background -C cloned pull --recurse-submodules
+'
+
 # On a case-insensitive file system, confirm that the daemon
 # notices when the .git directory is moved/renamed/deleted
 # regardless of how it is spelled in the FS event.

base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH] fsmonitor OSX: fix hangs for submodules
  2024-09-29  2:41 [PATCH] fsmonitor OSX: fix hangs for submodules Koji Nakamaru via GitGitGadget
@ 2024-09-30 17:40 ` Junio C Hamano
  2024-10-01  4:11   ` Koji Nakamaru
  2024-10-01  5:09 ` [PATCH v2] " Koji Nakamaru via GitGitGadget
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2024-09-30 17:40 UTC (permalink / raw)
  To: Koji Nakamaru via GitGitGadget; +Cc: git, Koji Nakamaru

"Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Koji Nakamaru <koji.nakamaru@gree.net>
>
> fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> the value with trailing "/." (as repo_get_git_dir(the_repository) on
> Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> IS_OUTSIDE_CONE.
>
> In this case, fsevent_callback() doesn't update cookie_list so that
> fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> not invoked.
>
> As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> that with_lock__mark_cookies_seen() should unlock, the whole daemon
> hangs.

The above very nicely describes the cause, the mechansim that leads
to the end-user observable effect, and the (bad) effect the bug has.

I wish everybody wrote their proposed commit messages like this ;-)

> Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> and add a corresponding test in t7527-builtin-fsmonitor.sh.
>
> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
> ---
>     fsmonitor/darwin: fix hangs for submodules

> diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
> index 730f3c7f810..7acd074a97f 100755
> --- a/t/t7527-builtin-fsmonitor.sh
> +++ b/t/t7527-builtin-fsmonitor.sh
> @@ -82,6 +82,28 @@ have_t2_data_event () {
>  	grep -e '"event":"data".*"category":"'"$c"'".*"key":"'"$k"'"'
>  }
>  
> +start_git_in_background () {
> +	git "$@" &
> +	git_pid=$!
> +	nr_tries_left=10
> +	while true
> +	do
> +		if test $nr_tries_left -eq 0
> +		then
> +			kill $git_pid
> +			exit 1
> +		fi
> +		sleep 1
> +		nr_tries_left=$(($nr_tries_left - 1))
> +	done > /dev/null 2>&1 &

So, the command is allowed to run for 10 seconds and then a signal
is sent to the process (by the way, we do not write the SP between
">" and "/dev/null").

> +	watchdog_pid=$!
> +	wait $git_pid

And the process to ensure the command gets killed in 10 seconds is
called the "watchdog".  We let the command run for completion (and
we'd be happy if it did without watchdog needing to forcibly kill
it).

Which means that even after the test finishes normally (e.g., the
command completes without getting killed by the watchdog, because it
is on a fast box and finishes in 0.5 second), we have leftover
watchdog process hanging around for 10 seconds, which might interfere
with the removal of the $TRASH_DIRECTORY at the end of the test.

There is a helper function to kill both (below), which probably is
used to avoid it.  Let's keep reading.

> +}
> +
> +stop_git_and_watchdog () {
> +	kill $git_pid $watchdog_pid
> +}

This sends a signal and let the process die.  Without waiting to
make sure they indeed died, at which point we can safely remove the
$TRASH_DIRECTORY on filesystems that refuse to remove a directory
when a process still has it as its current working directory.

Shouldn't it loop, like

	for pid in $git_pid $watchdog_pid
	do
                until kill -0 $pid
                do
                        kill $pid
                done
	done

or something?  Or is there a mechanism already to ensure that we
return after they get killed that I am failing to find?

>  test_expect_success 'explicit daemon start and stop' '
>  	test_when_finished "stop_daemon_delete_repo test_explicit" &&
>  
> @@ -907,6 +929,23 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
>  	test_subcommand git fsmonitor--daemon start <super-sub.trace
>  '
>  
> +test_expect_success "submodule implicitly starts daemon by pull" '
> +	test_atexit "stop_git_and_watchdog" &&

Hmph, this is _atexit and not _when_finished because...?

> +	test_when_finished "rm -rf cloned; \
> +			    rm -rf super; \
> +			    rm -rf sub" &&

Makes me wonder why it is not written like so:

	test_when_finished "rm -rf cloned super sub" &&

which is short enough to still fit on a line.  Is there something I
am missing that these directories must be removed separately and in
this order?

> +	create_super super &&
> +	create_sub sub &&
> +
> +	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> +	git -C super commit -m "add sub" &&
> +	git clone --recurse-submodules super cloned &&
> +
> +	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> +	start_git_in_background -C cloned pull --recurse-submodules
> +'

Other than that, very nicely done.

Thanks.

>  # On a case-insensitive file system, confirm that the daemon
>  # notices when the .git directory is moved/renamed/deleted
>  # regardless of how it is spelled in the FS event.
>
> base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] fsmonitor OSX: fix hangs for submodules
  2024-09-30 17:40 ` Junio C Hamano
@ 2024-10-01  4:11   ` Koji Nakamaru
  0 siblings, 0 replies; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-01  4:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git

Thank you very much for carefully checking the patch and suggesting
better ways. I'll later revise it and submit a new one.

> > +}
> > +
> > +stop_git_and_watchdog () {
> > +     kill $git_pid $watchdog_pid
> > +}
>
> This sends a signal and let the process die.  Without waiting to
> make sure they indeed died, at which point we can safely remove the
> $TRASH_DIRECTORY on filesystems that refuse to remove a directory
> when a process still has it as its current working directory.
>
> Shouldn't it loop, like
>
>         for pid in $git_pid $watchdog_pid
>         do
>                 until kill -0 $pid
>                 do
>                         kill $pid
>                 done
>         done
>
> or something?  Or is there a mechanism already to ensure that we
> return after they get killed that I am failing to find?

I agree that we have to wait for pids. I also realized that we should
run git in another process group and kill the group for killing all git
child processes. I'll fix the code.

> >  test_expect_success 'explicit daemon start and stop' '
> >       test_when_finished "stop_daemon_delete_repo test_explicit" &&
> >
> > @@ -907,6 +929,23 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
> >       test_subcommand git fsmonitor--daemon start <super-sub.trace
> >  '
> >
> > +test_expect_success "submodule implicitly starts daemon by pull" '
> > +     test_atexit "stop_git_and_watchdog" &&
>
> Hmph, this is _atexit and not _when_finished because...?

This is because README describes _atexit to run unconditionally to clean
up before the test script exits, e.g. to stop (kill) a daemon. More
appropriately, we should kill git before "rm -rf cloned super sub" in
_when_finished and kill watchdog in _atexit. I'll adjust the code.

> > +     test_when_finished "rm -rf cloned; \
> > +                         rm -rf super; \
> > +                         rm -rf sub" &&
>
> Makes me wonder why it is not written like so:
>
>         test_when_finished "rm -rf cloned super sub" &&
>
> which is short enough to still fit on a line.  Is there something I
> am missing that these directories must be removed separately and in
> this order?

There is no special reason, I simply followed the style used in
t7527-builtin-fsmonitor.sh. I'll fix this part.


Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-09-29  2:41 [PATCH] fsmonitor OSX: fix hangs for submodules Koji Nakamaru via GitGitGadget
  2024-09-30 17:40 ` Junio C Hamano
@ 2024-10-01  5:09 ` Koji Nakamaru via GitGitGadget
  2024-10-01 11:57   ` Junio C Hamano
  2024-10-01 19:29   ` [PATCH v3] " Koji Nakamaru via GitGitGadget
  1 sibling, 2 replies; 19+ messages in thread
From: Koji Nakamaru via GitGitGadget @ 2024-10-01  5:09 UTC (permalink / raw)
  To: git; +Cc: Koji Nakamaru, Koji Nakamaru

From: Koji Nakamaru <koji.nakamaru@gree.net>

fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
the value with trailing "/." (as repo_get_git_dir(the_repository) on
Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
IS_OUTSIDE_CONE.

In this case, fsevent_callback() doesn't update cookie_list so that
fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
not invoked.

As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
that with_lock__mark_cookies_seen() should unlock, the whole daemon
hangs.

Remove trailing "/." from state->path_gitdir_watch.buf for submodules
and add a corresponding test in t7527-builtin-fsmonitor.sh.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
---
    fsmonitor/darwin: fix hangs for submodules

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1802%2FKojiNakamaru%2Ffix%2Ffsmonitor-darwin-hangs-for-submodules-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1802/KojiNakamaru/fix/fsmonitor-darwin-hangs-for-submodules-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1802

Range-diff vs v1:

 1:  1889cbb193d ! 1:  decf68499f7 fsmonitor OSX: fix hangs for submodules
     @@ Commit message
          Remove trailing "/." from state->path_gitdir_watch.buf for submodules
          and add a corresponding test in t7527-builtin-fsmonitor.sh.
      
     -    Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     +    Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
     +    Suggested-by: Junio C Hamano <gitster@pobox.com>
          Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
      
       ## builtin/fsmonitor--daemon.c ##
     @@ builtin/fsmonitor--daemon.c: static int fsmonitor_run_daemon(void)
       
      
       ## t/t7527-builtin-fsmonitor.sh ##
     -@@ t/t7527-builtin-fsmonitor.sh: have_t2_data_event () {
     - 	grep -e '"event":"data".*"category":"'"$c"'".*"key":"'"$k"'"'
     - }
     +@@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
     + 	test_subcommand git fsmonitor--daemon start <super-sub.trace
     + '
       
      +start_git_in_background () {
      +	git "$@" &
      +	git_pid=$!
     ++	git_pgid=$(ps -o pgid= -p $git_pid)
      +	nr_tries_left=10
      +	while true
      +	do
      +		if test $nr_tries_left -eq 0
      +		then
     -+			kill $git_pid
     ++			kill -- -$git_pgid
      +			exit 1
      +		fi
      +		sleep 1
      +		nr_tries_left=$(($nr_tries_left - 1))
     -+	done > /dev/null 2>&1 &
     ++	done >/dev/null 2>&1 &
      +	watchdog_pid=$!
      +	wait $git_pid
      +}
      +
     -+stop_git_and_watchdog () {
     -+	kill $git_pid $watchdog_pid
     ++stop_git () {
     ++	while kill -0 -- -$git_pgid
     ++	do
     ++		kill -- -$git_pgid
     ++		sleep 1
     ++	done
     ++}
     ++
     ++stop_watchdog () {
     ++	while kill -0 $watchdog_pid
     ++	do
     ++		kill $watchdog_pid
     ++		sleep 1
     ++	done
      +}
      +
     - test_expect_success 'explicit daemon start and stop' '
     - 	test_when_finished "stop_daemon_delete_repo test_explicit" &&
     - 
     -@@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
     - 	test_subcommand git fsmonitor--daemon start <super-sub.trace
     - '
     - 
      +test_expect_success "submodule implicitly starts daemon by pull" '
     -+	test_atexit "stop_git_and_watchdog" &&
     -+	test_when_finished "rm -rf cloned; \
     -+			    rm -rf super; \
     -+			    rm -rf sub" &&
     ++	test_atexit "stop_watchdog" &&
     ++	test_when_finished "stop_git && rm -rf cloned super sub" &&
      +
      +	create_super super &&
      +	create_sub sub &&
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
      +	git clone --recurse-submodules super cloned &&
      +
      +	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
     ++	set -m &&
      +	start_git_in_background -C cloned pull --recurse-submodules
      +'
      +


 builtin/fsmonitor--daemon.c  |  1 +
 t/t7527-builtin-fsmonitor.sh | 51 ++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index dce8a3b2482..e1e6b96d09e 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -1314,6 +1314,7 @@ static int fsmonitor_run_daemon(void)
 		strbuf_reset(&state.path_gitdir_watch);
 		strbuf_addstr(&state.path_gitdir_watch,
 			      absolute_path(repo_get_git_dir(the_repository)));
+		strbuf_strip_suffix(&state.path_gitdir_watch, "/.");
 		state.nr_paths_watching = 2;
 	}
 
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 730f3c7f810..2dd1ca1a7b7 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -907,6 +907,57 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
 	test_subcommand git fsmonitor--daemon start <super-sub.trace
 '
 
+start_git_in_background () {
+	git "$@" &
+	git_pid=$!
+	git_pgid=$(ps -o pgid= -p $git_pid)
+	nr_tries_left=10
+	while true
+	do
+		if test $nr_tries_left -eq 0
+		then
+			kill -- -$git_pgid
+			exit 1
+		fi
+		sleep 1
+		nr_tries_left=$(($nr_tries_left - 1))
+	done >/dev/null 2>&1 &
+	watchdog_pid=$!
+	wait $git_pid
+}
+
+stop_git () {
+	while kill -0 -- -$git_pgid
+	do
+		kill -- -$git_pgid
+		sleep 1
+	done
+}
+
+stop_watchdog () {
+	while kill -0 $watchdog_pid
+	do
+		kill $watchdog_pid
+		sleep 1
+	done
+}
+
+test_expect_success "submodule implicitly starts daemon by pull" '
+	test_atexit "stop_watchdog" &&
+	test_when_finished "stop_git && rm -rf cloned super sub" &&
+
+	create_super super &&
+	create_sub sub &&
+
+	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
+	git -C super commit -m "add sub" &&
+	git clone --recurse-submodules super cloned &&
+
+	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
+	set -m &&
+	start_git_in_background -C cloned pull --recurse-submodules
+'
+
 # On a case-insensitive file system, confirm that the daemon
 # notices when the .git directory is moved/renamed/deleted
 # regardless of how it is spelled in the FS event.

base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-01  5:09 ` [PATCH v2] " Koji Nakamaru via GitGitGadget
@ 2024-10-01 11:57   ` Junio C Hamano
  2024-10-01 17:51     ` Koji Nakamaru
  2024-10-01 19:29   ` [PATCH v3] " Koji Nakamaru via GitGitGadget
  1 sibling, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2024-10-01 11:57 UTC (permalink / raw)
  To: Koji Nakamaru via GitGitGadget; +Cc: git, Koji Nakamaru

"Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Koji Nakamaru <koji.nakamaru@gree.net>
>
> fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> the value with trailing "/." (as repo_get_git_dir(the_repository) on
> Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> IS_OUTSIDE_CONE.
>
> In this case, fsevent_callback() doesn't update cookie_list so that
> fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> not invoked.
>
> As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> that with_lock__mark_cookies_seen() should unlock, the whole daemon
> hangs.
>
> Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> and add a corresponding test in t7527-builtin-fsmonitor.sh.
>
> Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Suggested-by: Junio C Hamano <gitster@pobox.com>

In none of the changes described above, I have any input to deserve
such credit, though.

> +start_git_in_background () {
> +	git "$@" &
> +	git_pid=$!
> +	git_pgid=$(ps -o pgid= -p $git_pid)
> +	nr_tries_left=10
> +	while true
> +	do
> +		if test $nr_tries_left -eq 0
> +		then
> +			kill -- -$git_pgid
> +			exit 1
> +		fi
> +		sleep 1
> +		nr_tries_left=$(($nr_tries_left - 1))
> +	done >/dev/null 2>&1 &
> +	watchdog_pid=$!
> +	wait $git_pid
> +}
> +
> +stop_git () {
> +	while kill -0 -- -$git_pgid
> +	do
> +		kill -- -$git_pgid
> +		sleep 1
> +	done
> +}

On the "git" side you use process group because you expect that
"git" would spawn subprocesses and you want to catch all of them,
...

> +stop_watchdog () {
> +	while kill -0 $watchdog_pid
> +	do
> +		kill $watchdog_pid
> +		sleep 1
> +	done
> +}

... but "watchdog" you know is a single process, so you'd only need
a single process id, is that the idea?

What is the motivation behind the change in this iteration to use
process group?  Was it observed that leftover processes hang around
if we killed only the $git_pid, or something?

> +test_expect_success "submodule implicitly starts daemon by pull" '
> +	test_atexit "stop_watchdog" &&
> +	test_when_finished "stop_git && rm -rf cloned super sub" &&

If stop_git ever returns with non-zero status, "rm -rf" will be
skipped, which I am not sure is a good idea.

The whole test_when_finished would fail in such a case, so you would
notice the problem right away, which is a plus, though.

> +	create_super super &&
> +	create_sub sub &&
> +
> +	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> +	git -C super commit -m "add sub" &&
> +	git clone --recurse-submodules super cloned &&
> +
> +	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> +	set -m &&

I have to wonder how portable (and necessary) this is.

POSIX says it shall be supported if the implementation supports the
User Portability Utilities option.  It also says that it was added
to apply only to the UPE because it applies primarily to interactive
use, not shell script applications.  And our test scripts are of
course not interactive.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-01 11:57   ` Junio C Hamano
@ 2024-10-01 17:51     ` Koji Nakamaru
  2024-10-01 18:04       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-01 17:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git

On Tue, Oct 1, 2024 at 8:57 PM Junio C Hamano <gitster@pobox.com> wrote:

> > fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
> > has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
> > the value with trailing "/." (as repo_get_git_dir(the_repository) on
> > Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
> > IS_OUTSIDE_CONE.
> >
> > In this case, fsevent_callback() doesn't update cookie_list so that
> > fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
> > not invoked.
> >
> > As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
> > that with_lock__mark_cookies_seen() should unlock, the whole daemon
> > hangs.
> >
> > Remove trailing "/." from state->path_gitdir_watch.buf for submodules
> > and add a corresponding test in t7527-builtin-fsmonitor.sh.
> >
> > Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > Suggested-by: Junio C Hamano <gitster@pobox.com>
>
> In none of the changes described above, I have any input to deserve
> such credit, though.

Your points are very helpful :)

> > +start_git_in_background () {
> > + git "$@" &
> > + git_pid=$!
> > + git_pgid=$(ps -o pgid= -p $git_pid)
> > + nr_tries_left=10
> > + while true
> > + do
> > + if test $nr_tries_left -eq 0
> > + then
> > + kill -- -$git_pgid
> > + exit 1
> > + fi
> > + sleep 1
> > + nr_tries_left=$(($nr_tries_left - 1))
> > + done >/dev/null 2>&1 &
> > + watchdog_pid=$!
> > + wait $git_pid
> > +}
> > +
> > +stop_git () {
> > + while kill -0 -- -$git_pgid
> > + do
> > + kill -- -$git_pgid
> > + sleep 1
> > + done
> > +}
>
> On the "git" side you use process group because you expect that
> "git" would spawn subprocesses and you want to catch all of them,
> ...
>
> > +stop_watchdog () {
> > + while kill -0 $watchdog_pid
> > + do
> > + kill $watchdog_pid
> > + sleep 1
> > + done
> > +}
>
> ... but "watchdog" you know is a single process, so you'd only need
> a single process id, is that the idea?

Yes, that is the idea.

> What is the motivation behind the change in this iteration to use
> process group?  Was it observed that leftover processes hang around
> if we killed only the $git_pid, or something?

Yes, if the issue occurs, three processes remains:

  git fetch --update-head-ok --recurse-submodules=on

  git fetch --no-prune --no-prune-tags --update-head-ok
    --recurse-submodules --recurse-submodules-default yes
    --submodule-prefix=dir_1/dir_2/sub/

  git fsmonitor--daemon run --detach --ipc-threads=8

If there is no issue, only the fsmonitor process remains.

> > +test_expect_success "submodule implicitly starts daemon by pull" '
> > + test_atexit "stop_watchdog" &&
> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>
> If stop_git ever returns with non-zero status, "rm -rf" will be
> skipped, which I am not sure is a good idea.
>
> The whole test_when_finished would fail in such a case, so you would
> notice the problem right away, which is a plus, though.

t/README discusses that test_when_finished and test_atexit differ about
the "--immediate" option. As git and its subprocesses are the test
target, I moved stop_git to the current place. This might be however
confusing when someone later reads this test. Should we simply put
stop_git and stop_watchdong in test_atexit?

> > + create_super super &&
> > + create_sub sub &&
> > +
> > + git -C super submodule add ../sub ./dir_1/dir_2/sub &&
> > + git -C super commit -m "add sub" &&
> > + git clone --recurse-submodules super cloned &&
> > +
> > + git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
> > + set -m &&
>
> I have to wonder how portable (and necessary) this is.
>
> POSIX says it shall be supported if the implementation supports the
> User Portability Utilities option.  It also says that it was added
> to apply only to the UPE because it applies primarily to interactive
> use, not shell script applications.  And our test scripts are of
> course not interactive.

How about the following modification? It still utilizes $git_pgid to
filter processes, but avoids "set -m".

  diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
  index 2dd1ca1a7b..23d9a7c953 100755
  --- a/t/t7527-builtin-fsmonitor.sh
  +++ b/t/t7527-builtin-fsmonitor.sh
  @@ -916,7 +916,7 @@ start_git_in_background () {
          do
                  if test $nr_tries_left -eq 0
                  then
  -                       kill -- -$git_pgid
  +                       kill $git_pid
                          exit 1
                  fi
                  sleep 1
  @@ -927,10 +927,13 @@ start_git_in_background () {
   }

   stop_git () {
  -       while kill -0 -- -$git_pgid
  +       for p in $(ps -o pgid=,pid=,comm= | grep "^$git_pgid .*git"
| sed 's/^[0-9][0-9]* \([0-9][0-9]*\) .*/\1/')
          do
  -               kill -- -$git_pgid
  -               sleep 1
  +               while kill -0 $p
  +               do
  +                       kill $p
  +                       sleep 1
  +               done
          done
   }

  @@ -954,7 +957,6 @@ test_expect_success "submodule implicitly starts
daemon by pull" '
          git clone --recurse-submodules super cloned &&

          git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
  -       set -m &&
          start_git_in_background -C cloned pull --recurse-submodules
   '


Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-01 17:51     ` Koji Nakamaru
@ 2024-10-01 18:04       ` Junio C Hamano
  2024-10-01 18:42         ` Koji Nakamaru
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2024-10-01 18:04 UTC (permalink / raw)
  To: Koji Nakamaru; +Cc: Koji Nakamaru via GitGitGadget, git

Koji Nakamaru <koji.nakamaru@gree.net> writes:

>> > +test_expect_success "submodule implicitly starts daemon by pull" '
>> > + test_atexit "stop_watchdog" &&
>> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>>
>> If stop_git ever returns with non-zero status, "rm -rf" will be
>> skipped, which I am not sure is a good idea.
>>
>> The whole test_when_finished would fail in such a case, so you would
>> notice the problem right away, which is a plus, though.
>
> t/README discusses that test_when_finished and test_atexit differ about
> the "--immediate" option. As git and its subprocesses are the test
> target, I moved stop_git to the current place. This might be however
> confusing when someone later reads this test. Should we simply put
> stop_git and stop_watchdong in test_atexit?

That is not what I meant.

I was merely questioning the &&-chaining that stops "rm -fr" from
running if stop_git ever fails (and your earlier iteration you had
multiple "rm -fr" ;-chained, not &&-chained---not using && is often
more appropriate in a when_finished handler).

>> > + set -m &&
>>
>> I have to wonder how portable (and necessary) this is.
>>
>> POSIX says it shall be supported if the implementation supports the
>> User Portability Utilities option.  It also says that it was added
>> to apply only to the UPE because it applies primarily to interactive
>> use, not shell script applications.  And our test scripts are of
>> course not interactive.
>
> How about the following modification? It still utilizes $git_pgid to
> filter processes, but avoids "set -m".

Nah, your original reads much better, and the code is grabbing and
using the process group information anyway (and my question about
"-m" was more about "should we be relying on process group features
in this test to kill them all?").

I am OK with the idea that we can assume, at least among the
platforms that support fsmonitor, that sending a signal to a process
group would cause the signal delivered to the member processes just
as we expect.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-01 18:04       ` Junio C Hamano
@ 2024-10-01 18:42         ` Koji Nakamaru
  2024-10-02  6:58           ` Koji Nakamaru
  0 siblings, 1 reply; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-01 18:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git

On Wed, Oct 2, 2024 at 3:04 AM Junio C Hamano <gitster@pobox.com> wrote:
>>> > +test_expect_success "submodule implicitly starts daemon by pull" '
>>> > + test_atexit "stop_watchdog" &&
>>> > + test_when_finished "stop_git && rm -rf cloned super sub" &&
>>>
>>> If stop_git ever returns with non-zero status, "rm -rf" will be
>>> skipped, which I am not sure is a good idea.
>>>
>>> The whole test_when_finished would fail in such a case, so you would
>>> notice the problem right away, which is a plus, though.
>>
>> t/README discusses that test_when_finished and test_atexit differ about
>> the "--immediate" option. As git and its subprocesses are the test
>> target, I moved stop_git to the current place. This might be however
>> confusing when someone later reads this test. Should we simply put
>> stop_git and stop_watchdong in test_atexit?
>
> That is not what I meant.
>
> I was merely questioning the &&-chaining that stops "rm -fr" from
> running if stop_git ever fails (and your earlier iteration you had
> multiple "rm -fr" ;-chained, not &&-chained---not using && is often
> more appropriate in a when_finished handler).

I see. I'll fix this part.

>>> > + set -m &&
>>>
>>> I have to wonder how portable (and necessary) this is.
>>>
>>> POSIX says it shall be supported if the implementation supports the
>>> User Portability Utilities option.  It also says that it was added
>>> to apply only to the UPE because it applies primarily to interactive
>>> use, not shell script applications.  And our test scripts are of
>>> course not interactive.
>>
>> How about the following modification? It still utilizes $git_pgid to
>> filter processes, but avoids "set -m".
>
> Nah, your original reads much better, and the code is grabbing and
> using the process group information anyway (and my question about
> "-m" was more about "should we be relying on process group features
> in this test to kill them all?").
>
> I am OK with the idea that we can assume, at least among the
> platforms that support fsmonitor, that sending a signal to a process
> group would cause the signal delivered to the member processes just
> as we expect.

Thank you for the clarification and the support.

Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3] fsmonitor OSX: fix hangs for submodules
  2024-10-01  5:09 ` [PATCH v2] " Koji Nakamaru via GitGitGadget
  2024-10-01 11:57   ` Junio C Hamano
@ 2024-10-01 19:29   ` Koji Nakamaru via GitGitGadget
  2024-10-04  0:07     ` [PATCH v4] " Koji Nakamaru via GitGitGadget
  1 sibling, 1 reply; 19+ messages in thread
From: Koji Nakamaru via GitGitGadget @ 2024-10-01 19:29 UTC (permalink / raw)
  To: git; +Cc: Koji Nakamaru, Koji Nakamaru

From: Koji Nakamaru <koji.nakamaru@gree.net>

fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
the value with trailing "/." (as repo_get_git_dir(the_repository) on
Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
IS_OUTSIDE_CONE.

In this case, fsevent_callback() doesn't update cookie_list so that
fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
not invoked.

As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
that with_lock__mark_cookies_seen() should unlock, the whole daemon
hangs.

Remove trailing "/." from state->path_gitdir_watch.buf for submodules
and add a corresponding test in t7527-builtin-fsmonitor.sh.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
---
    fsmonitor/darwin: fix hangs for submodules

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1802%2FKojiNakamaru%2Ffix%2Ffsmonitor-darwin-hangs-for-submodules-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1802/KojiNakamaru/fix/fsmonitor-darwin-hangs-for-submodules-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1802

Range-diff vs v2:

 1:  decf68499f7 ! 1:  aabc1c2f6ee fsmonitor OSX: fix hangs for submodules
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
      +
      +test_expect_success "submodule implicitly starts daemon by pull" '
      +	test_atexit "stop_watchdog" &&
     -+	test_when_finished "stop_git && rm -rf cloned super sub" &&
     ++	test_when_finished "stop_git; rm -rf cloned super sub" &&
      +
      +	create_super super &&
      +	create_sub sub &&


 builtin/fsmonitor--daemon.c  |  1 +
 t/t7527-builtin-fsmonitor.sh | 51 ++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index dce8a3b2482..e1e6b96d09e 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -1314,6 +1314,7 @@ static int fsmonitor_run_daemon(void)
 		strbuf_reset(&state.path_gitdir_watch);
 		strbuf_addstr(&state.path_gitdir_watch,
 			      absolute_path(repo_get_git_dir(the_repository)));
+		strbuf_strip_suffix(&state.path_gitdir_watch, "/.");
 		state.nr_paths_watching = 2;
 	}
 
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 730f3c7f810..e6ddc7048c0 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -907,6 +907,57 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
 	test_subcommand git fsmonitor--daemon start <super-sub.trace
 '
 
+start_git_in_background () {
+	git "$@" &
+	git_pid=$!
+	git_pgid=$(ps -o pgid= -p $git_pid)
+	nr_tries_left=10
+	while true
+	do
+		if test $nr_tries_left -eq 0
+		then
+			kill -- -$git_pgid
+			exit 1
+		fi
+		sleep 1
+		nr_tries_left=$(($nr_tries_left - 1))
+	done >/dev/null 2>&1 &
+	watchdog_pid=$!
+	wait $git_pid
+}
+
+stop_git () {
+	while kill -0 -- -$git_pgid
+	do
+		kill -- -$git_pgid
+		sleep 1
+	done
+}
+
+stop_watchdog () {
+	while kill -0 $watchdog_pid
+	do
+		kill $watchdog_pid
+		sleep 1
+	done
+}
+
+test_expect_success "submodule implicitly starts daemon by pull" '
+	test_atexit "stop_watchdog" &&
+	test_when_finished "stop_git; rm -rf cloned super sub" &&
+
+	create_super super &&
+	create_sub sub &&
+
+	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
+	git -C super commit -m "add sub" &&
+	git clone --recurse-submodules super cloned &&
+
+	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
+	set -m &&
+	start_git_in_background -C cloned pull --recurse-submodules
+'
+
 # On a case-insensitive file system, confirm that the daemon
 # notices when the .git directory is moved/renamed/deleted
 # regardless of how it is spelled in the FS event.

base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-01 18:42         ` Koji Nakamaru
@ 2024-10-02  6:58           ` Koji Nakamaru
  2024-10-02 18:14             ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-02  6:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git

Although I submitted [PATCH v3], it was incomplete about the following:

On Wed, Oct 2, 2024 at 3:04 AM Junio C Hamano <gitster@pobox.com> wrote:
> I am OK with the idea that we can assume, at least among the
> platforms that support fsmonitor, that sending a signal to a process
> group would cause the signal delivered to the member processes just
> as we expect.

On windows, there is no process group so the test cannot run
correctly. As hangs corrected with the patch occur only for darwin, I
would like to skip MINGW in the test. Is it okay?

Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-02  6:58           ` Koji Nakamaru
@ 2024-10-02 18:14             ` Junio C Hamano
  2024-10-03  8:54               ` Koji Nakamaru
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2024-10-02 18:14 UTC (permalink / raw)
  To: Koji Nakamaru; +Cc: Koji Nakamaru via GitGitGadget, git

Koji Nakamaru <koji.nakamaru@gree.net> writes:

> On windows, there is no process group so the test cannot run
> correctly. As hangs corrected with the patch occur only for darwin, I
> would like to skip MINGW in the test. Is it okay?

Surely.  But can we do so without spelling MINGW or WINDOWS out?

That is, if your test requires process group features available, can
we come up with a lazy prerequisite and use that to decide if we
skip the test?

Earlier in the discussion, you said who are left behind if we did so
on systems with process groups, but I wonder what happens when we
throw a signal at the top-level "git" process on Windows, and if it
behaves better, perhaps we can implement stop_git differently where
process groups are not available, instead of skipping the tests
altogether?

Thanks.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-02 18:14             ` Junio C Hamano
@ 2024-10-03  8:54               ` Koji Nakamaru
  2024-10-03 17:44                 ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-03  8:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git

On Thu, Oct 3, 2024 at 3:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>> On windows, there is no process group so the test cannot run
>> correctly. As hangs corrected with the patch occur only for darwin, I
>> would like to skip MINGW in the test. Is it okay?
>
> Surely.  But can we do so without spelling MINGW or WINDOWS out?
>
> That is, if your test requires process group features available, can
> we come up with a lazy prerequisite and use that to decide if we
> skip the test?

I tweaked fsm-listen-win32.c to cause hangs and tested on windows. I'm
sorry that simply saying "there is no process group" was not quite
correct. Each mingw process has a process group and its win32
subprocesses can be killed by "kill -- -$pgid"

For example, when a hang occurs, the following processes remain.

      PID    PPID    PGID     WINPID   TTY         UID    STIME
        COMMAND
    56782   40923   56782       9484  pty0     1052296 16:23:22
        /mingw64/bin/git
        # mingw git process
    78100       0       0      12564  ?              0 16:23:23
        C:\git-sdk-64\mingw64\libexec\git-core\git.exe
        # win32 process
    86108       0       0      20572  ?              0 16:23:23
        C:\git-sdk-64\mingw64\libexec\git-core\git.exe
        # win32 subprocess
    73328       0       0       7792  ?              0 16:23:23
        C:\git-sdk-64\mingw64\libexec\git-core\git.exe
        # win32 fsmonitor

> Earlier in the discussion, you said who are left behind if we did so
> on systems with process groups, but I wonder what happens when we
> throw a signal at the top-level "git" process on Windows, and if it
> behaves better, perhaps we can implement stop_git differently where
> process groups are not available, instead of skipping the tests
> altogether?

If we do "kill 56782" or "kill -- -56782" for the above example, most of
processes are terminated except the win32 fsmonitor. This is because the
win32 fsmonitor is detached by FreeConsole().

I also tried "git fsmonitor--daemon stop". It was able to communicate
with the win32 fsmonitor and the internal status of the win32 fsmonitor
changed, but the win32 daemon didn't terminate.

Because it's getting complicated, how about the following:

* specify MINGW
* note in the commit log:
  The test is disabled for MINGW because hangs treated with this patch
  occur only for Darwin and there is no simple way to terminate the
  win32 fsmonitor daemon that hangs.

Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-03  8:54               ` Koji Nakamaru
@ 2024-10-03 17:44                 ` Junio C Hamano
  2024-10-04  2:42                   ` Koji Nakamaru
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2024-10-03 17:44 UTC (permalink / raw)
  To: Koji Nakamaru; +Cc: Koji Nakamaru via GitGitGadget, git

Koji Nakamaru <koji.nakamaru@gree.net> writes:

> Because it's getting complicated, how about the following:
>
> * specify MINGW
> * note in the commit log:
>   The test is disabled for MINGW because hangs treated with this patch
>   occur only for Darwin and there is no simple way to terminate the
>   win32 fsmonitor daemon that hangs.

Sounds good to me.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v4] fsmonitor OSX: fix hangs for submodules
  2024-10-01 19:29   ` [PATCH v3] " Koji Nakamaru via GitGitGadget
@ 2024-10-04  0:07     ` Koji Nakamaru via GitGitGadget
  2024-10-04 17:44       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Koji Nakamaru via GitGitGadget @ 2024-10-04  0:07 UTC (permalink / raw)
  To: git; +Cc: Koji Nakamaru, Koji Nakamaru

From: Koji Nakamaru <koji.nakamaru@gree.net>

fsmonitor_classify_path_absolute() expects state->path_gitdir_watch.buf
has no trailing '/' or '.' For a submodule, fsmonitor_run_daemon() sets
the value with trailing "/." (as repo_get_git_dir(the_repository) on
Darwin returns ".") so that fsmonitor_classify_path_absolute() returns
IS_OUTSIDE_CONE.

In this case, fsevent_callback() doesn't update cookie_list so that
fsmonitor_publish() does nothing and with_lock__mark_cookies_seen() is
not invoked.

As with_lock__wait_for_cookie() infinitely waits for state->cookies_cond
that with_lock__mark_cookies_seen() should unlock, the whole daemon
hangs.

Remove trailing "/." from state->path_gitdir_watch.buf for submodules
and add a corresponding test in t7527-builtin-fsmonitor.sh. The test is
disabled for MINGW because hangs treated with this patch occur only for
Darwin and there is no simple way to terminate the win32 fsmonitor
daemon that hangs.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Koji Nakamaru <koji.nakamaru@gree.net>
---
    fsmonitor/darwin: fix hangs for submodules

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1802%2FKojiNakamaru%2Ffix%2Ffsmonitor-darwin-hangs-for-submodules-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1802/KojiNakamaru/fix/fsmonitor-darwin-hangs-for-submodules-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1802

Range-diff vs v3:

 1:  aabc1c2f6ee ! 1:  7b7224ef718 fsmonitor OSX: fix hangs for submodules
     @@ Commit message
          hangs.
      
          Remove trailing "/." from state->path_gitdir_watch.buf for submodules
     -    and add a corresponding test in t7527-builtin-fsmonitor.sh.
     +    and add a corresponding test in t7527-builtin-fsmonitor.sh. The test is
     +    disabled for MINGW because hangs treated with this patch occur only for
     +    Darwin and there is no simple way to terminate the win32 fsmonitor
     +    daemon that hangs.
      
          Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
          Suggested-by: Junio C Hamano <gitster@pobox.com>
     @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
      +	done
      +}
      +
     -+test_expect_success "submodule implicitly starts daemon by pull" '
     ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
      +	test_atexit "stop_watchdog" &&
      +	test_when_finished "stop_git; rm -rf cloned super sub" &&
      +


 builtin/fsmonitor--daemon.c  |  1 +
 t/t7527-builtin-fsmonitor.sh | 51 ++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index dce8a3b2482..e1e6b96d09e 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -1314,6 +1314,7 @@ static int fsmonitor_run_daemon(void)
 		strbuf_reset(&state.path_gitdir_watch);
 		strbuf_addstr(&state.path_gitdir_watch,
 			      absolute_path(repo_get_git_dir(the_repository)));
+		strbuf_strip_suffix(&state.path_gitdir_watch, "/.");
 		state.nr_paths_watching = 2;
 	}
 
diff --git a/t/t7527-builtin-fsmonitor.sh b/t/t7527-builtin-fsmonitor.sh
index 730f3c7f810..9b15baa02d3 100755
--- a/t/t7527-builtin-fsmonitor.sh
+++ b/t/t7527-builtin-fsmonitor.sh
@@ -907,6 +907,57 @@ test_expect_success "submodule absorbgitdirs implicitly starts daemon" '
 	test_subcommand git fsmonitor--daemon start <super-sub.trace
 '
 
+start_git_in_background () {
+	git "$@" &
+	git_pid=$!
+	git_pgid=$(ps -o pgid= -p $git_pid)
+	nr_tries_left=10
+	while true
+	do
+		if test $nr_tries_left -eq 0
+		then
+			kill -- -$git_pgid
+			exit 1
+		fi
+		sleep 1
+		nr_tries_left=$(($nr_tries_left - 1))
+	done >/dev/null 2>&1 &
+	watchdog_pid=$!
+	wait $git_pid
+}
+
+stop_git () {
+	while kill -0 -- -$git_pgid
+	do
+		kill -- -$git_pgid
+		sleep 1
+	done
+}
+
+stop_watchdog () {
+	while kill -0 $watchdog_pid
+	do
+		kill $watchdog_pid
+		sleep 1
+	done
+}
+
+test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
+	test_atexit "stop_watchdog" &&
+	test_when_finished "stop_git; rm -rf cloned super sub" &&
+
+	create_super super &&
+	create_sub sub &&
+
+	git -C super submodule add ../sub ./dir_1/dir_2/sub &&
+	git -C super commit -m "add sub" &&
+	git clone --recurse-submodules super cloned &&
+
+	git -C cloned/dir_1/dir_2/sub config core.fsmonitor true &&
+	set -m &&
+	start_git_in_background -C cloned pull --recurse-submodules
+'
+
 # On a case-insensitive file system, confirm that the daemon
 # notices when the .git directory is moved/renamed/deleted
 # regardless of how it is spelled in the FS event.

base-commit: 3857aae53f3633b7de63ad640737c657387ae0c6
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2] fsmonitor OSX: fix hangs for submodules
  2024-10-03 17:44                 ` Junio C Hamano
@ 2024-10-04  2:42                   ` Koji Nakamaru
  0 siblings, 0 replies; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-04  2:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git

On Fri, Oct 4, 2024 at 2:44 AM Junio C Hamano <gitster@pobox.com> wrote:
> > Because it's getting complicated, how about the following:
> >
> > * specify MINGW
> > * note in the commit log:
> >   The test is disabled for MINGW because hangs treated with this patch
> >   occur only for Darwin and there is no simple way to terminate the
> >   win32 fsmonitor daemon that hangs.
>
> Sounds good to me.

Thank you. I've submitted [PATCH v4].

Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4] fsmonitor OSX: fix hangs for submodules
  2024-10-04  0:07     ` [PATCH v4] " Koji Nakamaru via GitGitGadget
@ 2024-10-04 17:44       ` Junio C Hamano
  2024-10-04 18:47         ` Ramsay Jones
  2024-10-05  3:12         ` Koji Nakamaru
  0 siblings, 2 replies; 19+ messages in thread
From: Junio C Hamano @ 2024-10-04 17:44 UTC (permalink / raw)
  To: Koji Nakamaru via GitGitGadget; +Cc: git, Koji Nakamaru, Ramsay Jones

"Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:

> ... The test is
> disabled for MINGW because hangs treated with this patch occur only for
> Darwin and there is no simple way to terminate the win32 fsmonitor
> daemon that hangs.
> ...
>      @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
>       +	done
>       +}
>       +
>      -+test_expect_success "submodule implicitly starts daemon by pull" '
>      ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
>       +	test_atexit "stop_watchdog" &&
>       +	test_when_finished "stop_git; rm -rf cloned super sub" &&
>       +

Let me update !MINGW to !WINDOWS while queuing.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4] fsmonitor OSX: fix hangs for submodules
  2024-10-04 17:44       ` Junio C Hamano
@ 2024-10-04 18:47         ` Ramsay Jones
  2024-10-04 19:07           ` Junio C Hamano
  2024-10-05  3:12         ` Koji Nakamaru
  1 sibling, 1 reply; 19+ messages in thread
From: Ramsay Jones @ 2024-10-04 18:47 UTC (permalink / raw)
  To: Junio C Hamano, Koji Nakamaru via GitGitGadget; +Cc: git, Koji Nakamaru



On 04/10/2024 18:44, Junio C Hamano wrote:
> "Koji Nakamaru via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> ... The test is
>> disabled for MINGW because hangs treated with this patch occur only for
>> Darwin and there is no simple way to terminate the win32 fsmonitor
>> daemon that hangs.
>> ...
>>      @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
>>       +	done
>>       +}
>>       +
>>      -+test_expect_success "submodule implicitly starts daemon by pull" '
>>      ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
>>       +	test_atexit "stop_watchdog" &&
>>       +	test_when_finished "stop_git; rm -rf cloned super sub" &&
>>       +
> 
> Let me update !MINGW to !WINDOWS while queuing.
> 

While this won't hurt, this test file is skipped on cygwin:

[23:19:33] t7527-builtin-fsmonitor.sh ......................... skipped: fsmonitor--daemon is not supported on this platform

(my eternal TODO list has an 'fsmonitor on cygwin?' item ...)

Thanks.

ATB,
Ramsay Jones




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4] fsmonitor OSX: fix hangs for submodules
  2024-10-04 18:47         ` Ramsay Jones
@ 2024-10-04 19:07           ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2024-10-04 19:07 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: Koji Nakamaru via GitGitGadget, git, Koji Nakamaru

Ramsay Jones <ramsay@ramsayjones.plus.com> writes:

> While this won't hurt, this test file is skipped on cygwin:
>
> [23:19:33] t7527-builtin-fsmonitor.sh ......................... skipped: fsmonitor--daemon is not supported on this platform
>
> (my eternal TODO list has an 'fsmonitor on cygwin?' item ...)

Thanks, then let's not bother.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4] fsmonitor OSX: fix hangs for submodules
  2024-10-04 17:44       ` Junio C Hamano
  2024-10-04 18:47         ` Ramsay Jones
@ 2024-10-05  3:12         ` Koji Nakamaru
  1 sibling, 0 replies; 19+ messages in thread
From: Koji Nakamaru @ 2024-10-05  3:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Koji Nakamaru via GitGitGadget, git, Ramsay Jones

On Sat, Oct 5, 2024 at 2:44 AM Junio C Hamano <gitster@pobox.com> wrote:
> > ... The test is
> > disabled for MINGW because hangs treated with this patch occur only for
> > Darwin and there is no simple way to terminate the win32 fsmonitor
> > daemon that hangs.
> > ...
> >      @@ t/t7527-builtin-fsmonitor.sh: test_expect_success "submodule absorbgitdirs impli
> >       +       done
> >       +}
> >       +
> >      -+test_expect_success "submodule implicitly starts daemon by pull" '
> >      ++test_expect_success !MINGW "submodule implicitly starts daemon by pull" '
> >       +       test_atexit "stop_watchdog" &&
> >       +       test_when_finished "stop_git; rm -rf cloned super sub" &&
> >       +
>
> Let me update !MINGW to !WINDOWS while queuing.

I see, Thank you.

Koji Nakamaru

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-10-05  3:13 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-29  2:41 [PATCH] fsmonitor OSX: fix hangs for submodules Koji Nakamaru via GitGitGadget
2024-09-30 17:40 ` Junio C Hamano
2024-10-01  4:11   ` Koji Nakamaru
2024-10-01  5:09 ` [PATCH v2] " Koji Nakamaru via GitGitGadget
2024-10-01 11:57   ` Junio C Hamano
2024-10-01 17:51     ` Koji Nakamaru
2024-10-01 18:04       ` Junio C Hamano
2024-10-01 18:42         ` Koji Nakamaru
2024-10-02  6:58           ` Koji Nakamaru
2024-10-02 18:14             ` Junio C Hamano
2024-10-03  8:54               ` Koji Nakamaru
2024-10-03 17:44                 ` Junio C Hamano
2024-10-04  2:42                   ` Koji Nakamaru
2024-10-01 19:29   ` [PATCH v3] " Koji Nakamaru via GitGitGadget
2024-10-04  0:07     ` [PATCH v4] " Koji Nakamaru via GitGitGadget
2024-10-04 17:44       ` Junio C Hamano
2024-10-04 18:47         ` Ramsay Jones
2024-10-04 19:07           ` Junio C Hamano
2024-10-05  3:12         ` Koji Nakamaru

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).