From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Sapkal, Swapnil" <swapnil.sapkal@amd.com>,
Manfred Spraul <manfred@colorfullife.com>,
Christian Brauner <brauner@kernel.org>,
David Howells <dhowells@redhat.com>,
WangYuli <wangyuli@uniontech.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
K Prateek Nayak <kprateek.nayak@amd.com>,
"Shenoy, Gautham Ranjal" <gautham.shenoy@amd.com>,
Neeraj.Upadhyay@amd.com
Subject: Re: [PATCH] pipe_read: don't wake up the writer if the pipe is still full
Date: Tue, 25 Feb 2025 15:26:33 +0100
Message-ID: <20250225142632.GA29585@redhat.com>
In-Reply-To: <CAHk-=wi+P5__7LfbTX66shvYC1X11G2ZdKcg4psi+k_pD3sO+w@mail.gmail.com>
On 02/24, Linus Torvalds wrote:
>
> However, I see at least one case where this exclusive wakeup seems broken:
>
> /*
> * But because we didn't read anything, at this point we can
> * just return directly with -ERESTARTSYS if we're interrupted,
> * since we've done any required wakeups and there's no need
> * to mark anything accessed. And we've dropped the lock.
> */
> if (wait_event_interruptible_exclusive(pipe->rd_wait,
> pipe_readable(pipe)) < 0)
> return -ERESTARTSYS;
>
> and I'm wondering if the issue is that the *readers* got stuck,
> Because that "return -ERESTARTSYS" path now basically will by-pass the
> logic to wake up the next exclusive waiter.
I think this is fine... Let's denote this reader as R.
> Because that "return -ERESTARTSYS" is *after* the reader has been on
> the rd_wait queue - and possibly gotten the only wakeup that any of
> the readers will ever get - and now it returns without waking up any
> other reader.
I think this can't happen. ___wait_event() does
	init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0); \
	for (;;) { \
		long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state); \
		\
		if (condition) \
			break; \
		\
		if (___wait_is_interruptible(state) && __int) { \
			__ret = __int; \
			goto __out; \
		} \
		\
		cmd; \
	} \
and in this case condition == pipe_readable(pipe), cmd == schedule().
Suppose that R got that only wakeup, and wake_up() races with some signal
so that signal_pending(R) is true.
In this case prepare_to_wait_event() will return -ERESTARTSYS, but
___wait_event() won't return this error code, it will check pipe_readable()
and return 0.
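To make the ordering concrete, here is a minimal userspace sketch of a single
pass through that loop. It is not kernel code: struct model_state,
model_wait_iteration() and MODEL_ERESTARTSYS are hypothetical names introduced
only for illustration, and the return value 1 is an invented stand-in for
"would call schedule()". It only shows that the condition is tested before the
interrupt code is looked at.

```c
#include <stdbool.h>

/* Same numeric value as the kernel's ERESTARTSYS; the name is illustrative. */
#define MODEL_ERESTARTSYS 512

/* Hypothetical snapshot of what R observes on one loop iteration. */
struct model_state {
	bool signal_pending;	/* models signal_pending(R) */
	bool pipe_readable;	/* models pipe_readable(pipe) */
};

/*
 * One pass of the loop body, in the same order as the macro above:
 * prepare first, then test the condition, and only then consult the
 * interrupt code.
 *
 * Returns 0 if the condition holds, -MODEL_ERESTARTSYS if the wait is
 * interrupted while the condition is false, and 1 (an invented value)
 * if the caller would go on to schedule().
 */
static long model_wait_iteration(const struct model_state *s)
{
	/* Models the return value of prepare_to_wait_event(). */
	long intr = s->signal_pending ? -MODEL_ERESTARTSYS : 0;

	if (s->pipe_readable)
		return 0;	/* the condition wins, even with a signal pending */

	if (intr)
		return intr;

	return 1;
}
```

With both signal_pending and pipe_readable set, the sketch returns 0, which is
the case described above: the error code from the prepare step is dropped
because the condition is already true.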
After that R will restart the main loop with wake_next_reader = true,
and whatever it does, it should do wake_up(pipe->rd_wait) before it returns.
Note also that, when a signal is pending, prepare_to_wait_event() removes
the waiter from the wait_queue_head->head list, so another wake_up() can't
pick this task.
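A toy userspace model of that dequeue step, under loud assumptions: struct
model_wq and the model_*() helpers are hypothetical, the queue is a plain
array rather than a list, there is no locking, and the wakeup is assumed to
remove the woken entry from the queue. It only illustrates that once the
interrupted waiter is off the list, the next exclusive wakeup selects someone
else.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_WAITERS 8

/* Hypothetical, simplified wait queue: an ordered array of waiter ids. */
struct model_wq {
	int waiters[MODEL_MAX_WAITERS];
	size_t count;
};

/* Append a waiter at the tail; returns false if the toy queue is full. */
static bool model_enqueue(struct model_wq *wq, int id)
{
	if (wq->count == MODEL_MAX_WAITERS)
		return false;
	wq->waiters[wq->count++] = id;
	return true;
}

/*
 * Remove a waiter if it is queued; a no-op otherwise. Models the removal
 * done on the signal-pending path of the prepare step.
 */
static void model_dequeue(struct model_wq *wq, int id)
{
	for (size_t i = 0; i < wq->count; i++) {
		if (wq->waiters[i] != id)
			continue;
		/* Shift the remaining entries down by one slot. */
		for (size_t j = i + 1; j < wq->count; j++)
			wq->waiters[j - 1] = wq->waiters[j];
		wq->count--;
		return;
	}
}

/*
 * Models one exclusive wakeup: select the first queued waiter and take it
 * off the queue. Returns its id, or -1 if nobody is waiting.
 */
static int model_wake_one(struct model_wq *wq)
{
	if (wq->count == 0)
		return -1;

	int id = wq->waiters[0];
	model_dequeue(wq, id);
	return id;
}
```

For example, with R as id 1 and a second reader as id 2 queued behind it,
calling model_dequeue(&wq, 1) followed by model_wake_one(&wq) selects 2,
not R.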
Can ___wait_event() miss the pipe_readable() event in this case? No,
both wake_up() and prepare_to_wait_event() take the same wq_head->lock.
What if pipe_readable() is actually false? Say, a spurious wakeup or, say,
pipe_write() does wake_up(rd_wait) when another reader has already made
the pipe_readable() condition false? This case looks "obviously fine" too.
So I am still confused.
I will wait for a reply from Sapkal, then I'll try to make a debugging patch.
Oleg.