From: Oleg Nesterov <oleg@redhat.com>
To: asmadeus@codewreck.org
Cc: syzbot <syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com>,
brauner@kernel.org, dhowells@redhat.com, ericvh@kernel.org,
jack@suse.cz, jlayton@kernel.org, kprateek.nayak@amd.com,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux_oss@crudebyte.com, lucho@ionkov.net, mjguzik@gmail.com,
netfs@lists.linux.dev, swapnil.sapkal@amd.com,
syzkaller-bugs@googlegroups.com, v9fs@lists.linux.dev,
viro@zeniv.linux.org.uk
Subject: Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
Date: Sat, 29 Mar 2025 15:21:39 +0100
Message-ID: <20250329142138.GA9144@redhat.com>
In-Reply-To: <Z-c4B7NbHM3pgQOa@codewreck.org>
First of all, let me remind you that I know nothing about 9p or netfs ;)
And I am not sure that my patch is the right solution.
I am not even sure we need the fix: according to syzbot testing, the
problem goes away with the fixes from David
https://web.git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes
but I didn't even try to read them; this is not my area.
Now, I'll try to answer some of your questions, but I can easily be
wrong.
On 03/29, asmadeus@codewreck.org wrote:
>
> Right, so your patch sounds better than Prateek's initial blowing
> up workaround, but it's a bit weird anyway so let me recap:
> - that syz repro has this unnatural pattern where the replies are all
> written before the requests are sent
Yes.
> - p9_read_work() (read worker) has an optimization that if there is no
> in-flight request then there obviously must be nothing to read (9p is 100%
> client initiated, there's no way the server should send something
> first), so at this point the reader task is idle
Yes. But note that it does kernel_read() -> pipe_read() before it becomes
idle. See below.
> - p9_fd_request() (sending a new request) has another optimization that
> only checks for tx: at this point if another request was already in
> flight then the rx task should have a poll going on for rx, and if there
> were no in flight request yet then there should be no point in checking
> for rx, so p9_fd_request() only kicks in the tx worker if there is room
> to send
Can't comment, but
> - at this point I don't really get the logic that'll wake the rx thread
> up either... p9_pollwake() will trigger p9_poll_workfn()
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Yes, but where can this p9_pollwake() come from? See below.
> - due to the new optimization (aaec5a95d59615 "pipe_read: don't wake up
> the writer if the pipe is still full"), that 'if there is room to send'
> check started failing and tx thread doesn't start?
Again, I can easily be wrong, but no.
With or without the optimization above, it doesn't make sense to start
the tx thread when the pipe is full: p9_fd_poll() can't report EPOLLOUT.
Let's recall that the idle read worker did kernel_read() -> pipe_read().
Before this optimization, pipe_read() did the unnecessary
	wake_up_interruptible_sync_poll(&pipe->wr_wait);
when the pipe was full before the reading _and_ is still full after the
reading.
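To make the before/after difference concrete, here is a toy userspace model
of the wakeup decision. It is only a sketch, not the kernel code: toy_pipe,
TOY_SLOTS, toy_pipe_fill() and toy_pipe_read() are made-up names, and the
"optimized" rule (wake the writer only if a slot was released while the pipe
was full) is a simplified assumption about what the commit does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4	/* hypothetical ring size, not the kernel's */

struct toy_pipe {
	size_t len[TOY_SLOTS];	/* bytes left in each occupied slot */
	unsigned int tail;	/* index of the oldest occupied slot */
	unsigned int used;	/* number of occupied slots */
};

static bool toy_pipe_full(const struct toy_pipe *p)
{
	return p->used == TOY_SLOTS;
}

/* Occupy every slot with bytes_per_slot bytes, so the pipe is full. */
static void toy_pipe_fill(struct toy_pipe *p, size_t bytes_per_slot)
{
	assert(bytes_per_slot > 0);
	for (unsigned int i = 0; i < TOY_SLOTS; i++)
		p->len[i] = bytes_per_slot;
	p->tail = 0;
	p->used = TOY_SLOTS;
}

/*
 * Consume up to 'want' bytes from the tail of the pipe.
 * Returns true if the (modelled) writer would be woken up:
 *   optimized == false: wake whenever the pipe was full before the read;
 *   optimized == true:  wake only if a slot was released while it was full.
 */
static bool toy_pipe_read(struct toy_pipe *p, size_t want, bool optimized)
{
	bool was_full = toy_pipe_full(p);
	bool freed_while_full = false;

	while (want > 0 && p->used > 0) {
		size_t *len = &p->len[p->tail];
		size_t chunk = want < *len ? want : *len;

		*len -= chunk;
		want -= chunk;
		if (*len == 0) {
			/* Slot fully consumed: release it. */
			freed_while_full |= toy_pipe_full(p);
			p->tail = (p->tail + 1) % TOY_SLOTS;
			p->used--;
		}
	}
	return optimized ? freed_while_full : was_full;
}
```

In this model a short read from a full pipe (one that leaves the tail slot
partly filled) reports a wakeup with optimized == false and no wakeup with
optimized == true, which is the case discussed here.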
This wakeup calls p9_pollwake() which kicks p9_poll_workfn().
p9_poll_workfn() calls p9_poll_mux().
p9_poll_mux() does n = p9_fd_poll().
"n & EPOLLOUT" is false, exactly because this wakeup was unnecessary,
so p9_poll_mux() won't do schedule_work(&m->wq); this is fine.
But "n & EPOLLIN" is true, so p9_poll_mux() does schedule_work(&m->rq)
and wakes the rx thread.
p9_read_work() is called again. It reads more data and (I guess) notices
some problem and does p9_conn_cancel(EIO).
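The decision described in the steps above can be sketched as a small
userspace model. Again this is an illustration under assumptions, not the
net/9p code: toy_mux_state, toy_sched, toy_fd_poll(), toy_poll_mux() and the
TOY_EPOLL* bits are hypothetical stand-ins, and the real p9_poll_mux() has
more conditions than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EPOLLIN	0x001u	/* stand-in for EPOLLIN */
#define TOY_EPOLLOUT	0x004u	/* stand-in for EPOLLOUT */

struct toy_mux_state {
	bool rx_has_data;	/* replies are already queued for reading */
	bool tx_has_room;	/* the pipe we write requests to is not full */
};

struct toy_sched {
	bool rq;	/* models schedule_work(&m->rq), the read worker */
	bool wq;	/* models schedule_work(&m->wq), the write worker */
};

/* Models p9_fd_poll(): report which directions are ready. */
static unsigned int toy_fd_poll(const struct toy_mux_state *s)
{
	unsigned int n = 0;

	assert(s != NULL);
	if (s->rx_has_data)
		n |= TOY_EPOLLIN;
	if (s->tx_has_room)
		n |= TOY_EPOLLOUT;
	return n;
}

/* Models the part of p9_poll_mux() that decides which worker to kick. */
static struct toy_sched toy_poll_mux(const struct toy_mux_state *s)
{
	unsigned int n = toy_fd_poll(s);
	struct toy_sched out = { .rq = false, .wq = false };

	if (n & TOY_EPOLLIN)
		out.rq = true;
	if (n & TOY_EPOLLOUT)
		out.wq = true;
	return out;
}
```

With rx_has_data set and tx_has_room clear (the state after the spurious
wakeup), only rq is kicked, which is how the rx thread used to get woken.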
This no longer happens after the optimization. So in some sense the
p9_fd_request() -> p9_poll_mux() hack (which wakes the rx thread in this
case) restores the old behaviour.
But again, again, quite possibly I completely misread this (nontrivial)
code.
Oleg.