From: Oleg Nesterov <oleg@redhat.com>
To: Manfred Spraul <manfred@colorfullife.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Christian Brauner <brauner@kernel.org>,
David Howells <dhowells@redhat.com>
Cc: WangYuli <wangyuli@uniontech.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: PATCH? avoid the unnecessary wakeups in pipe_read()
Date: Sun, 29 Dec 2024 14:57:37 +0100
Message-ID: <20241229135737.GA3293@redhat.com>
The previous discussion was very confusing, so let me start another thread.
This is orthogonal to the possible wq_has_sleeper() optimizations in fs/pipe.c
that we discussed before.
Let me quote one of my previous emails. Consider
	#include <unistd.h>

	int main(void)
	{
		int fd[2], cnt;
		char c;

		pipe(fd);

		if (!fork()) {
			// wait until the parent blocks in pipe_write() ->
			// wait_event_interruptible_exclusive(pipe->wr_wait, pipe_writable(pipe));
			sleep(1);

			for (cnt = 0; cnt < 4096; ++cnt)
				read(fd[0], &c, 1);

			return 0;
		}

		// parent
		for (;;)
			write(fd[1], &c, 1);
	}
If I read this code correctly, in this case the child will wake up the parent
4095 times for no reason: pipe_writable() == !pipe_full() will still be false
until the last read(fd[0], &c, 1) does

	if (!buf->len)
		tail = pipe_update_tail(pipe, buf, tail);

and after that the parent can write the next char.
Does the patch below make sense? With this patch pipe_read() wakes the
writer up only when pipe_full() changes from T to F.
Still incomplete, obviously not for inclusion. But is it correct or not?
I am not sure I understand this nontrivial logic...
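
To make the wakeup counts above concrete, here is a rough userspace model, not kernel code. It assumes a pipe with 16 slots of 4096 bytes each that is completely full, and a reader that consumes one byte per read() while the writer never gets to refill. The names model_pipe_full(), count_wakeups(), SLOTS and SLOT_SIZE are made up for this sketch; only the head - tail >= limit check is meant to mirror pipe_full().

```c
#include <stdbool.h>

#define SLOTS     16u    /* assumption: pipe->max_usage of 16 buffers */
#define SLOT_SIZE 4096u  /* assumption: one 4096-byte page per buffer */

/* Hypothetical stand-in for pipe_full(): occupancy is head - tail. */
static bool model_pipe_full(unsigned int head, unsigned int tail,
			    unsigned int limit)
{
	return head - tail >= limit;
}

struct wakeups {
	unsigned int before;	/* writer wakeups with the was_full logic */
	unsigned int after;	/* writer wakeups with the wake_writer logic */
};

/* Simulate `reads` one-byte reads from a pipe that starts out full. */
static struct wakeups count_wakeups(unsigned int reads)
{
	unsigned int head = SLOTS, tail = 0;
	unsigned int len = SLOT_SIZE;	/* bytes left in the buffer at tail */
	struct wakeups w = { 0, 0 };

	for (unsigned int i = 0; i < reads; i++) {
		/* current code: fullness is sampled once, before reading */
		bool was_full = model_pipe_full(head, tail, SLOTS);
		/* patched code: set only when a full pipe releases a slot */
		bool wake_writer = false;

		len--;			/* consume one byte */
		if (!len) {
			wake_writer |= model_pipe_full(head, tail, SLOTS);
			tail++;		/* models pipe_update_tail() */
			len = SLOT_SIZE; /* the next slot is full as well */
		}

		w.before += was_full;
		w.after += wake_writer;
	}
	return w;
}
```

Under these assumptions count_wakeups(4096) gives before == 4096 and after == 1: only the read that empties the first buffer changes pipe_full() from T to F, and the other 4095 wakeups are the unnecessary ones.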
Oleg.
---
diff --git a/fs/pipe.c b/fs/pipe.c
index 12b22c2723b7..27ffb650f131 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -253,7 +253,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 	size_t total_len = iov_iter_count(to);
 	struct file *filp = iocb->ki_filp;
 	struct pipe_inode_info *pipe = filp->private_data;
-	bool was_full, wake_next_reader = false;
+	bool wake_writer = false, wake_next_reader = false;
 	ssize_t ret;
 
 	/* Null read succeeds. */
@@ -271,7 +271,6 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 	 * (WF_SYNC), because we want them to get going and generate more
 	 * data for us.
 	 */
-	was_full = pipe_full(pipe->head, pipe->tail, pipe->max_usage);
 	for (;;) {
 		/* Read ->head with a barrier vs post_one_notification() */
 		unsigned int head = smp_load_acquire(&pipe->head);
@@ -340,8 +339,10 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 				buf->len = 0;
 			}
 
-			if (!buf->len)
+			if (!buf->len) {
+				wake_writer |= pipe_full(head, tail, pipe->max_usage);
 				tail = pipe_update_tail(pipe, buf, tail);
+			}
 			total_len -= chars;
 			if (!total_len)
 				break;	/* common path: read succeeded */
@@ -377,7 +378,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 		 * _very_ unlikely case that the pipe was full, but we got
 		 * no data.
 		 */
-		if (unlikely(was_full))
+		if (unlikely(wake_writer))
 			wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
 		kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
 
@@ -391,14 +392,14 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
 			return -ERESTARTSYS;
 
 		mutex_lock(&pipe->mutex);
-		was_full = pipe_full(pipe->head, pipe->tail, pipe->max_usage);
 		wake_next_reader = true;
+		wake_writer = false;
 	}
 	if (pipe_empty(pipe->head, pipe->tail))
 		wake_next_reader = false;
 	mutex_unlock(&pipe->mutex);
 
-	if (was_full)
+	if (wake_writer)
 		wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
 	if (wake_next_reader)
 		wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);