From: Dominique Martinet <asmadeus@codewreck.org>
To: Vasiliy Kovalev <kovalev@altlinux.org>
Cc: Eric Van Hensbergen <ericvh@kernel.org>,
Latchesar Ionkov <lucho@ionkov.net>,
Christian Schoenebeck <linux_oss@crudebyte.com>,
v9fs@lists.linux.dev, linux-kernel@vger.kernel.org,
lvc-project@linuxtesting.org
Subject: Re: [PATCH] net/9p: fix infinite loop in p9_client_rpc on fatal signal
Date: Sun, 21 Jun 2026 22:00:22 +0900 [thread overview]
Message-ID: <ajfgZs23wKUUG6IW@codewreck.org> (raw)
In-Reply-To: <aeFoRIOUNQZWmTG5@codewreck.org>
Dominique Martinet wrote on Fri, Apr 17, 2026 at 07:52:52AM +0900:
> > While the ideal long-term goal is the asynchronous implementation (as
> > seen in your 9p-async-v2 branch [2]), this patch serves as a reliable
> > intermediate solution for a critical regression.
> > [2] https://github.com/martinetd/linux/commits/9p-async-v2
>
> iirc one of the problem with the async branch is that the process would
> quit immediately on, say, ^C, before the IO has completed, but it's
> possible for the server to process the IO (and not the flush) afterwards
> and you'd get something that's not supposed to happen e.g.
>
> p1 p2
>
> write(1)
> ^C/sigkill
> flush sent but process exit without waiting for server ack
> 1 not written yet
> write(2) in same spot
> write(2) done
> write(1) completes
> data isn't 2 as expected after p2 completed
>
>
> So it's quite possible async isn't the way to go, but that there is no
> good solution for this
> (given this is true even without async on sigkill: if we have something
> that works safely, there's no reason to wait only for non-fatal signals...)
Sorry to come back to this after two months but I'm still a bit worried
about this patch, and just came back to it as I'm about to send the PR
to Linus...
And I'm still thinking about the problem above, or rather possible
variants involving cache (e.g. write going through the server, but
client believing it didn't because the response didn't make it in time)
.. But the thing is, I couldn't actually hit the `if
(fatal_signal_pending(current))` you added (adding some print
statement):
- if cache is enabled, the actual I/Os are done by the vfs in the
background, so any kill to user processes won't have any impact (and
thus I guess my main worry about cache is alleviated there)
- with cache=none I'm not sure why I can't hit it, I tried with an
external server, breaking on the write() call while running dd, and
killing dd with SIGKILL a few times but that doesn't appear to be
enough? (task still stuck in write > rpc > flush > rpc, but it doesn't
appear to ever get out of io_wait_event_killable() even when I hammer it
with more signals?)
So, given that my worry with cache is irrelevant (runs in background &
won't ever hit this), I can't seem to hit this with what I consider
to be normal workloads, and assuming it does fix your problems given you
were able to test it... I'll leave it in and send to Linus now but I'd
appreciate clarifications on how to test this more thoroughly as time
permits...
(I honestly probably should drop the patch at this point, but it'll
still be time to revert if I figure something out in the next few weeks
given it's been in -next for almost 2 months already)
Thanks,
--
Dominique Martinet
prev parent reply other threads:[~2026-06-21 13:00 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-15 15:52 [PATCH] net/9p: fix infinite loop in p9_client_rpc on fatal signal Vasiliy Kovalev
2026-04-16 1:41 ` Dominique Martinet
2026-04-16 12:49 ` Vasiliy Kovalev
2026-04-16 22:52 ` Dominique Martinet
2026-04-19 8:22 ` Vasiliy Kovalev
2026-05-19 12:35 ` Dominique Martinet
2026-06-21 13:00 ` Dominique Martinet [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ajfgZs23wKUUG6IW@codewreck.org \
--to=asmadeus@codewreck.org \
--cc=ericvh@kernel.org \
--cc=kovalev@altlinux.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux_oss@crudebyte.com \
--cc=lucho@ionkov.net \
--cc=lvc-project@linuxtesting.org \
--cc=v9fs@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.