* [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush @ 2023-08-14 6:03 Jürg Billeter 2023-08-14 11:02 ` Miklos Szeredi 0 siblings, 1 reply; 14+ messages in thread From: Jürg Billeter @ 2023-08-14 6:03 UTC (permalink / raw) To: Miklos Szeredi, Eric W. Biederman Cc: Tycho Andersen, linux-fsdevel, linux-kernel, regressions Since v6.3-rc1 commit 5a8bee63b1 ("fuse: in fuse_flush only wait if someone wants the return code") `fput()` is called asynchronously if a file is closed as part of a process exiting, i.e., if there was no explicit `close()` before exit. If the file was open for writing, also `put_write_access()` is called asynchronously as part of the async `fput()`. If that newly written file is an executable, attempting to `execve()` the new file can fail with `ETXTBSY` if it's called after the writer process exited but before the async `fput()` has run. I've confirmed that this issue is absent in v6.2 and reverting 5a8bee63b1 on top of v6.4.10 fixes the regression. #regzbot introduced: 5a8bee63b1 Jürg ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 6:03 [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush Jürg Billeter @ 2023-08-14 11:02 ` Miklos Szeredi 2023-08-14 12:07 ` Bernd Schubert 2023-08-14 14:00 ` Tycho Andersen 0 siblings, 2 replies; 14+ messages in thread From: Miklos Szeredi @ 2023-08-14 11:02 UTC (permalink / raw) To: Jürg Billeter Cc: Eric W. Biederman, Tycho Andersen, linux-fsdevel, linux-kernel, regressions On Mon, 14 Aug 2023 at 08:03, Jürg Billeter <j@bitron.ch> wrote: > > Since v6.3-rc1 commit 5a8bee63b1 ("fuse: in fuse_flush only wait if > someone wants the return code") `fput()` is called asynchronously if a > file is closed as part of a process exiting, i.e., if there was no > explicit `close()` before exit. > > If the file was open for writing, also `put_write_access()` is called > asynchronously as part of the async `fput()`. > > If that newly written file is an executable, attempting to `execve()` > the new file can fail with `ETXTBSY` if it's called after the writer > process exited but before the async `fput()` has run. Thanks for the report. At this point, I think it would be best to revert the original patch, since only v6.4 has it. The original fix was already a workaround, and I don't see a clear path forward in this direction. We need to see if there's better direction. Ideas? Thanks, Miklos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 11:02 ` Miklos Szeredi @ 2023-08-14 12:07 ` Bernd Schubert 2023-08-14 12:28 ` Miklos Szeredi 2023-08-14 14:00 ` Tycho Andersen 1 sibling, 1 reply; 14+ messages in thread From: Bernd Schubert @ 2023-08-14 12:07 UTC (permalink / raw) To: Miklos Szeredi, Jürg Billeter Cc: Eric W. Biederman, Tycho Andersen, linux-fsdevel, linux-kernel, regressions On 8/14/23 13:02, Miklos Szeredi wrote: > On Mon, 14 Aug 2023 at 08:03, Jürg Billeter <j@bitron.ch> wrote: >> >> Since v6.3-rc1 commit 5a8bee63b1 ("fuse: in fuse_flush only wait if >> someone wants the return code") `fput()` is called asynchronously if a >> file is closed as part of a process exiting, i.e., if there was no >> explicit `close()` before exit. >> >> If the file was open for writing, also `put_write_access()` is called >> asynchronously as part of the async `fput()`. >> >> If that newly written file is an executable, attempting to `execve()` >> the new file can fail with `ETXTBSY` if it's called after the writer >> process exited but before the async `fput()` has run. > > Thanks for the report. > > At this point, I think it would be best to revert the original patch, > since only v6.4 has it. > > The original fix was already a workaround, and I don't see a clear > path forward in this direction. We need to see if there's better > direction. > > Ideas? Is there a good reason to flush O_RDONLY? fuse: Avoid flush for O_RDONLY From: Bernd Schubert <bschubert@ddn.com> A file opened in read-only moded does not have data to be flushed, so no need to send flush at all. This also mitigates -EBUSY for executables, which is due to async flush with commit 5a8bee63b1. Fixes: 5a8bee63b1 (unless executable opened in rw) Signed-off-by: Bernd Schubert <bschubert@ddn.com> index 89d97f6188e0..e058a6af6751 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -545,7 +545,8 @@ static int fuse_flush(struct file *file, fl_owner_t id) if (fuse_is_bad(inode)) return -EIO; - if (ff->open_flags & FOPEN_NOFLUSH && !fm->fc->writeback_cache) + if ((ff->open_flags & FOPEN_NOFLUSH && !fm->fc->writeback_cache) || + ((file->f_flags & O_ACCMODE) == O_RDONLY)) return 0; fa = kzalloc(sizeof(*fa), GFP_KERNEL); ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 12:07 ` Bernd Schubert @ 2023-08-14 12:28 ` Miklos Szeredi 2023-08-14 12:38 ` Jürg Billeter 0 siblings, 1 reply; 14+ messages in thread From: Miklos Szeredi @ 2023-08-14 12:28 UTC (permalink / raw) To: Bernd Schubert Cc: Jürg Billeter, Eric W. Biederman, Tycho Andersen, linux-fsdevel, linux-kernel, regressions On Mon, 14 Aug 2023 at 14:07, Bernd Schubert <bernd.schubert@fastmail.fm> wrote: > > > > On 8/14/23 13:02, Miklos Szeredi wrote: > > On Mon, 14 Aug 2023 at 08:03, Jürg Billeter <j@bitron.ch> wrote: > >> > >> Since v6.3-rc1 commit 5a8bee63b1 ("fuse: in fuse_flush only wait if > >> someone wants the return code") `fput()` is called asynchronously if a > >> file is closed as part of a process exiting, i.e., if there was no > >> explicit `close()` before exit. > >> > >> If the file was open for writing, also `put_write_access()` is called > >> asynchronously as part of the async `fput()`. > >> > >> If that newly written file is an executable, attempting to `execve()` > >> the new file can fail with `ETXTBSY` if it's called after the writer > >> process exited but before the async `fput()` has run. > > > > Thanks for the report. > > > > At this point, I think it would be best to revert the original patch, > > since only v6.4 has it. > > > > The original fix was already a workaround, and I don't see a clear > > path forward in this direction. We need to see if there's better > > direction. > > > > Ideas? > > Is there a good reason to flush O_RDONLY? ->flush() is somewhat of a misnomer, it's the callback for explicit close(2) and also for implicit close by exit(2). The reason it's called flush is that nfs/fuse use it to ensure close-to-open cache consistency, which means flushing dirty data on close. On fuse it also used to unlock remote posix locks, and we really know what other "creative" uses were found for this request. So while we could turn off sending FLUSH for read-only case (and use SETLK/F_UNLCK for the remote lock case) but first let's see a use case. Sending FLUSH can already be disabled by returning ENOSYS. > > fuse: Avoid flush for O_RDONLY > > From: Bernd Schubert <bschubert@ddn.com> > > A file opened in read-only moded does not have data to be > flushed, so no need to send flush at all. > > This also mitigates -EBUSY for executables, which is due to > async flush with commit 5a8bee63b1. Does it? If I read the bug report correctly, it's the write case that causes EBUSY. Thanks, Miklos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 12:28 ` Miklos Szeredi @ 2023-08-14 12:38 ` Jürg Billeter 2023-08-14 13:44 ` Bernd Schubert 0 siblings, 1 reply; 14+ messages in thread From: Jürg Billeter @ 2023-08-14 12:38 UTC (permalink / raw) To: Miklos Szeredi, Bernd Schubert Cc: Eric W. Biederman, Tycho Andersen, linux-fsdevel, linux-kernel, regressions On Mon, 2023-08-14 at 14:28 +0200, Miklos Szeredi wrote: > On Mon, 14 Aug 2023 at 14:07, Bernd Schubert > > fuse: Avoid flush for O_RDONLY > > > > From: Bernd Schubert <bschubert@ddn.com> > > > > A file opened in read-only moded does not have data to be > > flushed, so no need to send flush at all. > > > > This also mitigates -EBUSY for executables, which is due to > > async flush with commit 5a8bee63b1. > > Does it? If I read the bug report correctly, it's the write case that > causes EBUSY. Indeed, I see this when trying to execute a file after a process wrote to (created) that file. As far as I can tell, `ETXTBSY` can't happen on exec without the file being opened for writing and thus, I don't see this patch mitigating this bug. Jürg ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 12:38 ` Jürg Billeter @ 2023-08-14 13:44 ` Bernd Schubert 0 siblings, 0 replies; 14+ messages in thread From: Bernd Schubert @ 2023-08-14 13:44 UTC (permalink / raw) To: Jürg Billeter, Miklos Szeredi Cc: Eric W. Biederman, Tycho Andersen, linux-fsdevel, linux-kernel, regressions On 8/14/23 14:38, Jürg Billeter wrote: > On Mon, 2023-08-14 at 14:28 +0200, Miklos Szeredi wrote: >> On Mon, 14 Aug 2023 at 14:07, Bernd Schubert >>> fuse: Avoid flush for O_RDONLY >>> >>> From: Bernd Schubert <bschubert@ddn.com> >>> >>> A file opened in read-only moded does not have data to be >>> flushed, so no need to send flush at all. >>> >>> This also mitigates -EBUSY for executables, which is due to >>> async flush with commit 5a8bee63b1. >> >> Does it? If I read the bug report correctly, it's the write case that >> causes EBUSY. > > Indeed, I see this when trying to execute a file after a process wrote > to (created) that file. As far as I can tell, `ETXTBSY` can't happen on > exec without the file being opened for writing and thus, I don't see > this patch mitigating this bug. Sorry, my fault, I should have read your message/report more carefully. Bernd ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 11:02 ` Miklos Szeredi 2023-08-14 12:07 ` Bernd Schubert @ 2023-08-14 14:00 ` Tycho Andersen 2023-08-14 14:35 ` Miklos Szeredi 2023-08-14 21:34 ` Bernd Schubert 1 sibling, 2 replies; 14+ messages in thread From: Tycho Andersen @ 2023-08-14 14:00 UTC (permalink / raw) To: Miklos Szeredi Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Mon, Aug 14, 2023 at 01:02:35PM +0200, Miklos Szeredi wrote: > On Mon, 14 Aug 2023 at 08:03, Jürg Billeter <j@bitron.ch> wrote: > > > > Since v6.3-rc1 commit 5a8bee63b1 ("fuse: in fuse_flush only wait if > > someone wants the return code") `fput()` is called asynchronously if a > > file is closed as part of a process exiting, i.e., if there was no > > explicit `close()` before exit. > > > > If the file was open for writing, also `put_write_access()` is called > > asynchronously as part of the async `fput()`. > > > > If that newly written file is an executable, attempting to `execve()` > > the new file can fail with `ETXTBSY` if it's called after the writer > > process exited but before the async `fput()` has run. > > Thanks for the report. > > At this point, I think it would be best to revert the original patch, > since only v6.4 has it. I agree. > The original fix was already a workaround, and I don't see a clear > path forward in this direction. We need to see if there's better > direction. > > Ideas? It seems like we really do need to wait here. I guess that means we need some kind of exit-proof wait? Tycho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 14:00 ` Tycho Andersen @ 2023-08-14 14:35 ` Miklos Szeredi 2023-08-14 22:36 ` Tycho Andersen 2023-08-14 21:34 ` Bernd Schubert 1 sibling, 1 reply; 14+ messages in thread From: Miklos Szeredi @ 2023-08-14 14:35 UTC (permalink / raw) To: Tycho Andersen Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Mon, 14 Aug 2023 at 16:00, Tycho Andersen <tycho@tycho.pizza> wrote: > It seems like we really do need to wait here. I guess that means we > need some kind of exit-proof wait? Could you please recap the original problem? Thanks, Miklos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 14:35 ` Miklos Szeredi @ 2023-08-14 22:36 ` Tycho Andersen 2023-08-21 14:24 ` Miklos Szeredi 0 siblings, 1 reply; 14+ messages in thread From: Tycho Andersen @ 2023-08-14 22:36 UTC (permalink / raw) To: Miklos Szeredi Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Mon, Aug 14, 2023 at 04:35:56PM +0200, Miklos Szeredi wrote: > On Mon, 14 Aug 2023 at 16:00, Tycho Andersen <tycho@tycho.pizza> wrote: > > > It seems like we really do need to wait here. I guess that means we > > need some kind of exit-proof wait? > > Could you please recap the original problem? Sure, the symptom is a deadlock, something like: # cat /proc/1528591/stack [<0>] do_wait+0x156/0x2f0 [<0>] kernel_wait4+0x8d/0x140 [<0>] zap_pid_ns_processes+0x104/0x180 [<0>] do_exit+0xa41/0xb80 [<0>] do_group_exit+0x3a/0xa0 [<0>] __x64_sys_exit_group+0x14/0x20 [<0>] do_syscall_64+0x37/0xb0 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae which is stuck waiting for: # cat /proc/1544574/stack [<0>] request_wait_answer+0x12f/0x210 [<0>] fuse_simple_request+0x109/0x2c0 [<0>] fuse_flush+0x16f/0x1b0 [<0>] filp_close+0x27/0x70 [<0>] put_files_struct+0x6b/0xc0 [<0>] do_exit+0x360/0xb80 [<0>] do_group_exit+0x3a/0xa0 [<0>] get_signal+0x140/0x870 [<0>] arch_do_signal_or_restart+0xae/0x7c0 [<0>] exit_to_user_mode_prepare+0x10f/0x1c0 [<0>] syscall_exit_to_user_mode+0x26/0x40 [<0>] do_syscall_64+0x46/0xb0 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae I have a reproducer here: https://github.com/tych0/kernel-utils/blob/master/fuse2/Makefile#L7 The problem is that the second thread has called do_exit() -> exit_signals(), but then tries to request_wait_answer() which uses the core wait primitives that no longer get woken up from signals due to the code in exit_signals(). So when we try to exit the pid ns, the whole cleanup hangs. It seems we really do need to wait in do_exit(), otherwise we get the behavior described in this regression... Tycho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 22:36 ` Tycho Andersen @ 2023-08-21 14:24 ` Miklos Szeredi 2023-08-21 15:02 ` Tycho Andersen 0 siblings, 1 reply; 14+ messages in thread From: Miklos Szeredi @ 2023-08-21 14:24 UTC (permalink / raw) To: Tycho Andersen Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Tue, 15 Aug 2023 at 00:36, Tycho Andersen <tycho@tycho.pizza> wrote: > > On Mon, Aug 14, 2023 at 04:35:56PM +0200, Miklos Szeredi wrote: > > On Mon, 14 Aug 2023 at 16:00, Tycho Andersen <tycho@tycho.pizza> wrote: > > > > > It seems like we really do need to wait here. I guess that means we > > > need some kind of exit-proof wait? > > > > Could you please recap the original problem? > > Sure, the symptom is a deadlock, something like: > > # cat /proc/1528591/stack > [<0>] do_wait+0x156/0x2f0 > [<0>] kernel_wait4+0x8d/0x140 > [<0>] zap_pid_ns_processes+0x104/0x180 > [<0>] do_exit+0xa41/0xb80 > [<0>] do_group_exit+0x3a/0xa0 > [<0>] __x64_sys_exit_group+0x14/0x20 > [<0>] do_syscall_64+0x37/0xb0 > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae > > which is stuck waiting for: > > # cat /proc/1544574/stack > [<0>] request_wait_answer+0x12f/0x210 > [<0>] fuse_simple_request+0x109/0x2c0 > [<0>] fuse_flush+0x16f/0x1b0 > [<0>] filp_close+0x27/0x70 > [<0>] put_files_struct+0x6b/0xc0 > [<0>] do_exit+0x360/0xb80 > [<0>] do_group_exit+0x3a/0xa0 > [<0>] get_signal+0x140/0x870 > [<0>] arch_do_signal_or_restart+0xae/0x7c0 > [<0>] exit_to_user_mode_prepare+0x10f/0x1c0 > [<0>] syscall_exit_to_user_mode+0x26/0x40 > [<0>] do_syscall_64+0x46/0xb0 > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae > > I have a reproducer here: > https://github.com/tych0/kernel-utils/blob/master/fuse2/Makefile#L7 The issue seems to be that the server process is recursing into the filesystem it is serving (nested_fsync()). It's quite easy to deadlock fuse this way, and I'm not sure why this would be needed for any server implementation. Can you explain? Thanks, Miklos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-21 14:24 ` Miklos Szeredi @ 2023-08-21 15:02 ` Tycho Andersen 2023-08-21 15:31 ` Miklos Szeredi 0 siblings, 1 reply; 14+ messages in thread From: Tycho Andersen @ 2023-08-21 15:02 UTC (permalink / raw) To: Miklos Szeredi Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Mon, Aug 21, 2023 at 04:24:00PM +0200, Miklos Szeredi wrote: > On Tue, 15 Aug 2023 at 00:36, Tycho Andersen <tycho@tycho.pizza> wrote: > > > > On Mon, Aug 14, 2023 at 04:35:56PM +0200, Miklos Szeredi wrote: > > > On Mon, 14 Aug 2023 at 16:00, Tycho Andersen <tycho@tycho.pizza> wrote: > > > > > > > It seems like we really do need to wait here. I guess that means we > > > > need some kind of exit-proof wait? > > > > > > Could you please recap the original problem? > > > > Sure, the symptom is a deadlock, something like: > > > > # cat /proc/1528591/stack > > [<0>] do_wait+0x156/0x2f0 > > [<0>] kernel_wait4+0x8d/0x140 > > [<0>] zap_pid_ns_processes+0x104/0x180 > > [<0>] do_exit+0xa41/0xb80 > > [<0>] do_group_exit+0x3a/0xa0 > > [<0>] __x64_sys_exit_group+0x14/0x20 > > [<0>] do_syscall_64+0x37/0xb0 > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > which is stuck waiting for: > > > > # cat /proc/1544574/stack > > [<0>] request_wait_answer+0x12f/0x210 > > [<0>] fuse_simple_request+0x109/0x2c0 > > [<0>] fuse_flush+0x16f/0x1b0 > > [<0>] filp_close+0x27/0x70 > > [<0>] put_files_struct+0x6b/0xc0 > > [<0>] do_exit+0x360/0xb80 > > [<0>] do_group_exit+0x3a/0xa0 > > [<0>] get_signal+0x140/0x870 > > [<0>] arch_do_signal_or_restart+0xae/0x7c0 > > [<0>] exit_to_user_mode_prepare+0x10f/0x1c0 > > [<0>] syscall_exit_to_user_mode+0x26/0x40 > > [<0>] do_syscall_64+0x46/0xb0 > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > I have a reproducer here: > > https://github.com/tych0/kernel-utils/blob/master/fuse2/Makefile#L7 > > The issue seems to be that the server process is recursing into the > filesystem it is serving (nested_fsync()). It's quite easy to > deadlock fuse this way, and I'm not sure why this would be needed for > any server implementation. Can you explain? I think the idea is that they're saving snapshots of their own threads to the fs for debugging purposes. Whether this is a sane thing to do or not, it doesn't seem like it should deadlock pid ns destruction. Tycho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-21 15:02 ` Tycho Andersen @ 2023-08-21 15:31 ` Miklos Szeredi 2023-08-29 17:42 ` Tycho Andersen 0 siblings, 1 reply; 14+ messages in thread From: Miklos Szeredi @ 2023-08-21 15:31 UTC (permalink / raw) To: Tycho Andersen Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Mon, 21 Aug 2023 at 17:02, Tycho Andersen <tycho@tycho.pizza> wrote: > > On Mon, Aug 21, 2023 at 04:24:00PM +0200, Miklos Szeredi wrote: > > On Tue, 15 Aug 2023 at 00:36, Tycho Andersen <tycho@tycho.pizza> wrote: > > > > > > On Mon, Aug 14, 2023 at 04:35:56PM +0200, Miklos Szeredi wrote: > > > > On Mon, 14 Aug 2023 at 16:00, Tycho Andersen <tycho@tycho.pizza> wrote: > > > > > > > > > It seems like we really do need to wait here. I guess that means we > > > > > need some kind of exit-proof wait? > > > > > > > > Could you please recap the original problem? > > > > > > Sure, the symptom is a deadlock, something like: > > > > > > # cat /proc/1528591/stack > > > [<0>] do_wait+0x156/0x2f0 > > > [<0>] kernel_wait4+0x8d/0x140 > > > [<0>] zap_pid_ns_processes+0x104/0x180 > > > [<0>] do_exit+0xa41/0xb80 > > > [<0>] do_group_exit+0x3a/0xa0 > > > [<0>] __x64_sys_exit_group+0x14/0x20 > > > [<0>] do_syscall_64+0x37/0xb0 > > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > > > which is stuck waiting for: > > > > > > # cat /proc/1544574/stack > > > [<0>] request_wait_answer+0x12f/0x210 > > > [<0>] fuse_simple_request+0x109/0x2c0 > > > [<0>] fuse_flush+0x16f/0x1b0 > > > [<0>] filp_close+0x27/0x70 > > > [<0>] put_files_struct+0x6b/0xc0 > > > [<0>] do_exit+0x360/0xb80 > > > [<0>] do_group_exit+0x3a/0xa0 > > > [<0>] get_signal+0x140/0x870 > > > [<0>] arch_do_signal_or_restart+0xae/0x7c0 > > > [<0>] exit_to_user_mode_prepare+0x10f/0x1c0 > > > [<0>] syscall_exit_to_user_mode+0x26/0x40 > > > [<0>] do_syscall_64+0x46/0xb0 > > > [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > > > I have a reproducer here: > > > https://github.com/tych0/kernel-utils/blob/master/fuse2/Makefile#L7 > > > > The issue seems to be that the server process is recursing into the > > filesystem it is serving (nested_fsync()). It's quite easy to > > deadlock fuse this way, and I'm not sure why this would be needed for > > any server implementation. Can you explain? > > I think the idea is that they're saving snapshots of their own threads > to the fs for debugging purposes. This seems a fairly special situation. Have they (whoever they may be) thought about fixing this in their server? > Whether this is a sane thing to do or not, it doesn't seem like it > should deadlock pid ns destruction. True. So the suggested solution is to allow wait_event_killable() to return if a terminal signal is pending in the exiting state and only in that case turn the flush into a background request? That would still allow for regressions like the one reported, but that would be much less likely to happen in real life. Okay, I said this for the original solution as well, so this may turn out to be wrong as well. Anyway, I'd prefer if this was fixed in the server code, as it looks fairly special and adding complexity to the kernel for this case might not be justifiable. But I'm also open to suggestions on fixing this in the kernel in a not too complex manner. Thanks, Miklos ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-21 15:31 ` Miklos Szeredi @ 2023-08-29 17:42 ` Tycho Andersen 0 siblings, 0 replies; 14+ messages in thread From: Tycho Andersen @ 2023-08-29 17:42 UTC (permalink / raw) To: Miklos Szeredi Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On Mon, Aug 21, 2023 at 05:31:48PM +0200, Miklos Szeredi wrote: (Apologies for the delay, I have been away without cell signal for some time.) > > I think the idea is that they're saving snapshots of their own threads > > to the fs for debugging purposes. > > This seems a fairly special situation. Have they (whoever they may > be) thought about fixing this in their server? Sorry, "we" here is some internal team that works for my employer Netflix. We can't use imap clients on our corporate e-mails, whee. > > Whether this is a sane thing to do or not, it doesn't seem like it > > should deadlock pid ns destruction. > > True. So the suggested solution is to allow wait_event_killable() to > return if a terminal signal is pending in the exiting state and only > in that case turn the flush into a background request? That would > still allow for regressions like the one reported, but that would be > much less likely to happen in real life. Okay, I said this for the > original solution as well, so this may turn out to be wrong as well. I wonder if there's room here for a completion that doesn't use the wait primitives. Something like an atomic + queuing in task_work() would both fix this bug and not exhibit this regression, IIUC. > Anyway, I'd prefer if this was fixed in the server code, as it looks > fairly special and adding complexity to the kernel for this case might > not be justifiable. But I'm also open to suggestions on fixing this > in the kernel in a not too complex manner. I don't think this is specific to the server-accessing-its-own-file case. My reproducer uses that because I didn't quite understand the bug fully at the time. I believe that *any* task that is killed with an inflight fuse request will exhibit this. We have seen this fairly rarely on another fuse fs we use throughout the fleet: https://github.com/lxc/lxcfs and it doesn't really do anything strange, and is mounted from the host's pid ns. Tycho ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush 2023-08-14 14:00 ` Tycho Andersen 2023-08-14 14:35 ` Miklos Szeredi @ 2023-08-14 21:34 ` Bernd Schubert 1 sibling, 0 replies; 14+ messages in thread From: Bernd Schubert @ 2023-08-14 21:34 UTC (permalink / raw) To: Tycho Andersen, Miklos Szeredi Cc: Jürg Billeter, Eric W. Biederman, linux-fsdevel, linux-kernel, regressions On 8/14/23 16:00, Tycho Andersen wrote: > On Mon, Aug 14, 2023 at 01:02:35PM +0200, Miklos Szeredi wrote: >> On Mon, 14 Aug 2023 at 08:03, Jürg Billeter <j@bitron.ch> wrote: >>> >>> Since v6.3-rc1 commit 5a8bee63b1 ("fuse: in fuse_flush only wait if >>> someone wants the return code") `fput()` is called asynchronously if a >>> file is closed as part of a process exiting, i.e., if there was no >>> explicit `close()` before exit. >>> >>> If the file was open for writing, also `put_write_access()` is called >>> asynchronously as part of the async `fput()`. >>> >>> If that newly written file is an executable, attempting to `execve()` >>> the new file can fail with `ETXTBSY` if it's called after the writer >>> process exited but before the async `fput()` has run. >> >> Thanks for the report. >> >> At this point, I think it would be best to revert the original patch, >> since only v6.4 has it. > > I agree. > >> The original fix was already a workaround, and I don't see a clear >> path forward in this direction. We need to see if there's better >> direction. >> >> Ideas? > > It seems like we really do need to wait here. I guess that means we > need some kind of exit-proof wait? I'm not sure how hackish it is, if fuse_flush gets converted to queue_work() and with a new work-queue in struct fuse_inode. That work_queue could be flushed through a new inode operation from do_open_execat. Bernd ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2023-08-29 17:42 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-08-14 6:03 [REGRESSION] fuse: execve() fails with ETXTBSY due to async fuse_flush Jürg Billeter 2023-08-14 11:02 ` Miklos Szeredi 2023-08-14 12:07 ` Bernd Schubert 2023-08-14 12:28 ` Miklos Szeredi 2023-08-14 12:38 ` Jürg Billeter 2023-08-14 13:44 ` Bernd Schubert 2023-08-14 14:00 ` Tycho Andersen 2023-08-14 14:35 ` Miklos Szeredi 2023-08-14 22:36 ` Tycho Andersen 2023-08-21 14:24 ` Miklos Szeredi 2023-08-21 15:02 ` Tycho Andersen 2023-08-21 15:31 ` Miklos Szeredi 2023-08-29 17:42 ` Tycho Andersen 2023-08-14 21:34 ` Bernd Schubert
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).