From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Manfred Spraul <manfred@colorfullife.com>,
Christian Brauner <brauner@kernel.org>,
David Howells <dhowells@redhat.com>,
WangYuli <wangyuli@uniontech.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: wakeup_pipe_readers/writers() && pipe_poll()
Date: Tue, 7 Jan 2025 18:25:12 +0100
Message-ID: <20250107172512.GB29771@redhat.com>
In-Reply-To: <CAHk-=wh-SxjH7uvADd5XJBuM2ReyPcLPyXKvBbwbiS5kod+3hA@mail.gmail.com>

On 01/06, Linus Torvalds wrote:
>
> On Mon, 6 Jan 2025 at 11:34, Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > 1. pipe_read() says
> >
> > * But when we do wake up writers, we do so using a sync wakeup
> > * (WF_SYNC), because we want them to get going and generate more
> > * data for us.
> >
> > OK, WF_SYNC makes sense if pipe_read() or pipe_write() is going to do wait_event()
> > after wake_up(). But wake_up_interruptible_sync_poll() looks a bit misleading if
> > we are going to wake up the writer or next_reader before return.
>
> This heuristic has always been a bit iffy. And honestly, I think it's
> been driven by benchmarks that aren't necessarily always realistic (ie
> for ping-pong benchmarks, the best behavior is often to stay on the
> same CPU and just schedule between the reader/writer).

Agreed. But my question was not about performance, I just tried to
understand this logic. So in the case of

	wake_up_interruptible_sync_poll(wr_wait);
	wait_event_interruptible_exclusive(rd_wait);

WF_SYNC is understandable, "stay on the same CPU" looks like the right
thing, and "_sync_" matches the comment above.

But if we are going to return, wake_up_interruptible_sync_poll() looks
a bit misleading to me.

> > 2. I can't understand this code in pipe_write()
> >
> > 	if (ret > 0 && sb_start_write_trylock(file_inode(filp)->i_sb)) {
> > 		int err = file_update_time(filp);
> > 		if (err)
> > 			ret = err;
> > 		sb_end_write(file_inode(filp)->i_sb);
> > 	}
> >
> > - it only makes sense in the "fifo" case, right? When
> > i_sb->s_magic != PIPEFS_MAGIC...
>
> I think we've done it for regular pipes too. You can see it with
> 'fstat()', after all.

Ah, indeed, thanks for correcting me...

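FWIW, this can be checked from userspace. A minimal sketch, assuming a
Linux system where a successful write updates the pipe inode's mtime, as
the pipe_write() code quoted above suggests. The helper name
pipe_write_updates_mtime() and the 100ms delay are made up for
illustration:

```c
#define _POSIX_C_SOURCE 200809L	/* for st_mtim and nanosleep() */
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not kernel code.
 * Returns 1 if the pipe's mtime advanced across a write(),
 * 0 if it did not, and -1 if any syscall failed.
 */
static int pipe_write_updates_mtime(void)
{
	int fds[2];
	struct stat before, after;
	/* Arbitrary delay so both timestamps can't share a clock tick. */
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };
	int ret = -1;

	if (pipe(fds) != 0)
		return -1;

	if (fstat(fds[1], &before) != 0)
		goto out;

	nanosleep(&delay, NULL);

	/* One byte is enough to make pipe_write() return ret > 0. */
	if (write(fds[1], "x", 1) != 1)
		goto out;

	if (fstat(fds[1], &after) != 0)
		goto out;

	ret = after.st_mtim.tv_sec > before.st_mtim.tv_sec ||
	      (after.st_mtim.tv_sec == before.st_mtim.tv_sec &&
	       after.st_mtim.tv_nsec > before.st_mtim.tv_nsec);
out:
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

The delay is only there because the timestamps are presumably taken from
a coarse clock, so two back-to-back fstat() calls could otherwise report
the same value.
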
And thanks for your other explanations. Again, it is not that I thought
this needs changes, I was just a bit confused. In particular by

	err = file_update_time();
	if (err)
		ret = err;

which doesn't match the usage of file_accessed() in pipe_read().

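To spell out the asymmetry, here is a userspace model of the two return
paths. It is only a sketch: stub_file_update_time() and
stub_file_accessed() are hypothetical stand-ins for the kernel helpers,
the -EIO value is an arbitrary placeholder, and the
sb_start_write_trylock()/sb_end_write() pair is left out.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical stand-in: like file_update_time(), it can report an error. */
static int stub_file_update_time(int fail)
{
	return fail ? -EIO : 0;	/* -EIO is a placeholder error code */
}

/* Hypothetical stand-in: like file_accessed(), it returns void. */
static void stub_file_accessed(void)
{
}

/* Models the tail of pipe_write(): a timestamp error replaces ret. */
static ssize_t model_write_tail(ssize_t ret, int fail)
{
	if (ret > 0) {
		int err = stub_file_update_time(fail);
		if (err)
			ret = err;
	}
	return ret;
}

/* Models the tail of pipe_read(): there is no error to propagate. */
static ssize_t model_read_tail(ssize_t ret)
{
	if (ret > 0)
		stub_file_accessed();
	return ret;
}
```

With fail != 0 the write-side model returns the error even though ret
bytes were already copied into the pipe, while the read side always
returns the byte count.
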
Oleg.