All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nam Cao <namcao@linutronix.de>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	Shuah Khan <shuah@kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Soheil Hassas Yeganeh <soheil@google.com>,
	Khazhismel Kumykov <khazhy@google.com>,
	Willem de Bruijn <willemb@google.com>,
	Eric Dumazet <edumazet@google.com>, Jens Axboe <axboe@kernel.dk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH 2/2] eventpoll: Fix epoll_wait() report false negative
Date: Wed, 17 Sep 2025 15:41:52 +0200	[thread overview]
Message-ID: <87zfat19i7.fsf@yellow.woof> (raw)
In-Reply-To: <aflo53gea7i6cyy22avn7mqxb3xboakgjwnmj4bqmjp6oafejj@owgv35lly7zq>

Mateusz Guzik <mjguzik@gmail.com> writes:
> I'll say upfront I'm not an epoll person, just looked here out of
> curiosity.

Feedbacks always welcomed.

> I can agree there is a bug. The event is generated before any of the
> threads even exist and they only poll for it, none of them consume it.
>
> However, the commit message fails to explain why the change fixes
> anything and I think your patch de facto reverts e59d3c64cba6 ("epoll:
> eliminate unnecessary lock for zero timeout"). Looking at that diff
> the point was to avoid the expensive lock trip if timeout == 0 and there
> are no events.
>
> As for the bug is, from my reading the ep_start_scan()/ep_done_scan()
> pair transiently disturbs the state checked by ep_events_available(),
> resulting in false-negatives. Then the locked check works because by the
> time you acquire it, the damage is undone.

Exactly so. I can add this into the description.

> Given the commits referenced in Fixes:, I suspect the real fix would be
> to stop destroying that state of course.
>
> But if that's not feasible, I would check if a sequence counter around
> this would do the trick -- then the racy ep_events_available(ep) upfront
> would become safe with smaller overhead than with your proposal for the
> no-event case, but with higher overhead when there is something.
>
> My proposal is trivial to implement, I have no idea if it will get a
> buy-in though.

My question is whether the performance of epoll_wait() with zero
timeout is really that important that we have to complicate
things. If epoll_wait() with zero timeout is called repeatedly in a loop
but there is no event, I'm sure there will be measurabled performance
drop. But sane user would just use timeout in that case.

epoll's data is protected by a lock. Therefore I think the most
straightforward solution is just taking the lock before reading the
data.

Lockless is hard to get right and may cause hard-to-debug problems. So
unless this performance drop somehow bothers someone, I would prefer
"keep it simple, stupid".

Best regards,
Nam

  reply	other threads:[~2025-09-17 13:41 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-18  7:52 [PATCH 0/2] eventpoll: Fix epoll_wait() report false negative Nam Cao
2025-07-18  7:52 ` [PATCH 1/2] selftests/eventpoll: Add test for multiple waiters Nam Cao
2025-07-18  7:52 ` [PATCH 2/2] eventpoll: Fix epoll_wait() report false negative Nam Cao
2025-07-18  8:38   ` Soheil Hassas Yeganeh
2025-07-18  8:59     ` Nam Cao
2026-04-29  6:54       ` Christian Brauner
2026-04-29  7:27         ` Nam Cao
2026-04-29 15:34           ` Mateusz Guzik
2026-05-03 13:24             ` Nam Cao
2026-05-04 12:00         ` David Laight
2025-09-17 12:49   ` Mateusz Guzik
2025-09-17 13:41     ` Nam Cao [this message]
2025-09-17 16:05       ` Mateusz Guzik
2025-09-17 16:08         ` Mateusz Guzik
2025-09-17 18:03           ` Khazhy Kumykov
2025-09-17 22:28             ` Khazhy Kumykov
2025-09-17 22:38               ` Mateusz Guzik
2025-09-22  6:26                 ` Nam Cao
2025-09-20 14:42         ` David Laight
2025-09-20 14:45           ` Mateusz Guzik
2025-09-17  7:27 ` [PATCH 0/2] " Nam Cao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87zfat19i7.fsf@yellow.woof \
    --to=namcao@linutronix.de \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=dave@stgolabs.net \
    --cc=edumazet@google.com \
    --cc=jack@suse.cz \
    --cc=khazhy@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mjguzik@gmail.com \
    --cc=shuah@kernel.org \
    --cc=soheil@google.com \
    --cc=stable@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.