From: Dominique Martinet <asmadeus@codewreck.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Eric Van Hensbergen <ericvh@gmail.com>,
Latchesar Ionkov <lucho@ionkov.net>,
Christian Schoenebeck <linux_oss@crudebyte.com>,
linux-fsdevel@vger.kernel.org,
syzbot <syzbot+2349f5067b1772c1d8a5@syzkaller.appspotmail.com>,
syzkaller-bugs@googlegroups.com,
"v9fs-developer@lists.sourceforge.net"
<v9fs-developer@lists.sourceforge.net>
Subject: Re: INFO: task hung in iterate_supers
Date: Thu, 11 Aug 2022 15:53:14 +0900
Message-ID: <YvSnWrfU7kM4Ia9r@codewreck.org>
In-Reply-To: <f00146b5-0a14-ac24-3d7b-3d4deeb96359@I-love.SAKURA.ne.jp>
Hi,
Tetsuo Handa wrote on Thu, Aug 11, 2022 at 03:01:23PM +0900:
> https://syzkaller.appspot.com/text?tag=CrashReport&x=154869fd080000
> suggests that p9_client_rpc() is trapped at infinite retry loop
Would be far from the first one, Dmitry brought this up years ago...
> But why does p9 think that Flush operation worth retrying forever?
I can't answer much more than "it's how it was done"; I started
implementing asynchronous flush back when this was first discussed but
my implementation introduced a regression somewhere and I never had time
to debug it; the main "problem" is that we (currently) have no way of
freeing up resources associated with that request if we leave the
thread.
The first step was adding refcounting to requests, and that is more or
less holding up, so all that's left now is to properly clean things up
if we leave this call.
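A minimal userspace sketch of the refcounting idea, for illustration
only. It assumes two holders per request (the calling thread and the
transport); struct req, req_alloc() and req_put() are hypothetical names
and are not the kernel's p9_req_t API. The point is that the caller can
drop its reference when interrupted, and the request is only freed once
the transport drops the last one.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a request; NOT the kernel's struct p9_req_t. */
struct req {
    atomic_int refs;  /* one reference per holder */
    int tag;          /* request tag, as in 9p */
    bool *freed;      /* illustration hook: set to true when freed */
};

/* Allocate a request held by two parties: the caller and the transport. */
static struct req *req_alloc(int tag, bool *freed)
{
    struct req *r = malloc(sizeof(*r));

    if (!r)
        return NULL;
    atomic_init(&r->refs, 2);
    r->tag = tag;
    r->freed = freed;
    if (freed)
        *freed = false;
    return r;
}

/* Drop one reference; returns true if this call freed the request. */
static bool req_put(struct req *r)
{
    /* atomic_fetch_sub returns the value held before the subtraction. */
    if (atomic_fetch_sub(&r->refs, 1) == 1) {
        if (r->freed)
            *r->freed = true;
        free(r);
        return true;
    }
    return false;
}
```

In this model a thread leaving on a signal calls req_put() once and
returns; whoever later handles the reply (or the flush reply) calls
req_put() again and ends up freeing the request.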
You can find inspiration in my old patches[1] if you'd like to give it a
try:
[1] https://lore.kernel.org/all/20181217110111.GB17466@nautica/T/
Note that there is one point that wasn't discussed back then, but
according to the 9p man page for flush[2], the request should be
considered successful if the original request's reply comes before the
flush reply.
This might be important e.g. for mkdir, create or unlink with caching
enabled, as the 9p client has no notion of cache coherency... So even if
the caller itself will be busy dealing with a signal, at least the cache
should be kept coherent, somehow.
I don't see any way of doing that with the current 9pfs/9pnet layering:
9pnet cannot call back into the vfs.
[2] https://9fans.github.io/plan9port/man/man9/flush.html
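The flush rule described above can be sketched as a tiny state machine,
under the assumption that whichever message arrives first decides the
outcome: if the original request's reply comes before the flush reply,
the request counts as completed, otherwise as flushed. All names here
(flush_state, flush_on_message, the enums) are hypothetical and exist
only for this illustration; this is not how 9pnet is structured today.

```c
/* Hypothetical outcome of a request that had a flush sent for it. */
enum req_outcome {
    REQ_PENDING,    /* neither reply seen yet */
    REQ_COMPLETED,  /* original reply arrived first: honor it */
    REQ_FLUSHED     /* flush reply arrived first: request was cancelled */
};

/* The two messages that can arrive for a flushed request. */
enum msg {
    MSG_REPLY,   /* reply to the original request */
    MSG_RFLUSH   /* reply to the flush */
};

struct flush_state {
    enum req_outcome outcome;
};

/*
 * Feed one incoming message into the state. The first message decides
 * the outcome; anything arriving afterwards does not change it.
 */
static enum req_outcome flush_on_message(struct flush_state *s, enum msg m)
{
    if (s->outcome != REQ_PENDING)
        return s->outcome;

    s->outcome = (m == MSG_REPLY) ? REQ_COMPLETED : REQ_FLUSHED;
    return s->outcome;
}
```

For operations like mkdir or unlink, a REQ_COMPLETED result is the case
where the cache would need updating even though the caller has already
left.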
> The peer side should be able to detect close of file descriptor on local
> side due to process termination via SIGKILL, and the peer side should be
> able to perform appropriate recovery operation even if local side cannot
> receive response for Flush operation.
The peer side (= server in my vocabulary) has no idea about processes or
file descriptors, it's the 9p client's job to do any such cleanup.
The vfs takes care of calling the proper close functions, which end up
clunking fids properly; there was a report of a fid leak recently, but
these are rare enough...
The problem isn't open fids though, but really resources associated with
the request itself; it shouldn't be too hard to do (ignoring any cache
coherency issue), but...
> Thus, why not to give up upon SIGKILL?
... "Nobody has done it yet".
Last year I'd probably have answered that I'm open to funding, but
frankly I don't have the time anyway; I'll be happy to review and lightly
test anything sent my way in my meager free time though.
(And yes, I agree ignoring sigkill is bad user experience)
--
Dominique