public inbox for v9fs@lists.linux.dev
From: Oleg Nesterov <oleg@redhat.com>
To: Dominique Martinet <asmadeus@codewreck.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>,
	Eric Van Hensbergen <ericvh@kernel.org>,
	Latchesar Ionkov <lucho@ionkov.net>,
	Christian Schoenebeck <linux_oss@crudebyte.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	syzbot <syzbot+62262fdc0e01d99573fc@syzkaller.appspotmail.com>,
	brauner@kernel.org, dhowells@redhat.com, jack@suse.cz,
	jlayton@kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org, netfs@lists.linux.dev,
	swapnil.sapkal@amd.com, syzkaller-bugs@googlegroups.com,
	viro@zeniv.linux.org.uk, v9fs@lists.linux.dev
Subject: Re: [syzbot] [netfs?] INFO: task hung in netfs_unbuffered_write_iter
Date: Wed, 26 Mar 2025 13:19:47 +0100
Message-ID: <20250326121946.GC30181@redhat.com>
In-Reply-To: <Z-LEsPFE4e7TTMiY@codewreck.org>

On 03/25, Dominique Martinet wrote:
>
> Thanks for the traces.
>
> w/ revert
> K Prateek Nayak wrote on Tue, Mar 25, 2025 at 08:19:26PM +0530:
> >    kworker/100:1-1803    [100] .....   286.618822: p9_fd_poll: p9_fd_poll rd poll
> >    kworker/100:1-1803    [100] .....   286.618822: p9_fd_poll: p9_fd_request wr poll
> >    kworker/100:1-1803    [100] .....   286.618823: p9_read_work: Data read wait 7
>
> new behavior
> >            repro-4076    [031] .....    95.011394: p9_fd_poll: p9_fd_poll rd poll
> >            repro-4076    [031] .....    95.011394: p9_fd_poll: p9_fd_request wr poll
> >            repro-4076    [031] .....    99.731970: p9_client_rpc: Wait event killable (-512)
>
> For me the problem isn't so much that this gets ERESTARTSYS but that it
> never gets to read the 7 bytes that are available?

Yes...

OK, let's first recall what the commit aaec5a95d59615523 ("pipe_read:
don't wake up the writer if the pipe is still full") does.
It simply removes the unnecessary/spurious wakeups when the writer
can't add more data to the pipe.

See the "stupid test-case" in
https://lore.kernel.org/all/20250120144338.GC7432@redhat.com/

In particular this note:

	As you can see, without this patch pipe_read() wakes the writer up
	4095 times for no reason, the writer burns a bit of CPU and blocks
	again after wakeup until the last read(fd[0], &c, 1).

In this test-case the writer sleeps in pipe_write(), but the same is true
for the task sleeping in poll( { .fd = pipe_fd, .events = POLLOUT}, ...).

Now, after some grepping I have found

	static void p9_conn_create(struct p9_client *client)
	{
		...
	
		init_poll_funcptr(&m->pt, p9_pollwait);

		n = p9_fd_poll(client, &m->pt, NULL);

		...
	}

So, iiuc, in this case p9_fd_poll(&m->pt /* != NULL */) -> p9_pollwait()
paths will add the "dummy" pwait->wait entries with ->func = p9_pollwake
to pipe_inode_info.rd_wait and pipe_inode_info.wr_wait.

Hmm... I don't understand why the 2nd vfs_poll(ts->wr) depends on the
ret from vfs_poll(ts->rd), but I assume this is correct.

This means that every time pipe_read() does wake_up(&pipe->wr_wait)
p9_pollwake() is called. This function kicks p9_poll_workfn() which
calls p9_poll_mux() which calls p9_fd_poll() again with pt == NULL.

In this case the conditional vfs_poll(ts->wr) looks more understandable...

So. Without the commit above, p9_poll_mux()->p9_fd_poll() can be called
much more often and, in particular, can report the "additional" EPOLLIN.

Can this somehow explain the problem?

Oleg.

