public inbox for linux-fsdevel@vger.kernel.org
* [REGRESSION] osnoise: "eventpoll: Replace rwlock with spinlock" causes ~50µs noise spikes on isolated PREEMPT_RT cores
@ 2026-03-26 14:00 Ionut Nechita (Wind River)
  2026-03-26 14:31 ` Sebastian Andrzej Siewior
  2026-03-26 14:52 ` Greg KH
  0 siblings, 2 replies; 3+ messages in thread
From: Ionut Nechita (Wind River) @ 2026-03-26 14:00 UTC (permalink / raw)
  To: namcao, brauner
  Cc: linux-fsdevel, linux-rt-users, stable, linux-kernel, frederic,
	vschneid, gregkh, chris.friesen, viorel-catalin.rapiteanu,
	iulian.mocanu

Hi,

I'm reporting a regression introduced by commit 0c43094f8cc9
("eventpoll: Replace rwlock with spinlock"), backported to stable 6.12.y.

On a PREEMPT_RT system with nohz_full isolated cores, this commit causes
significant osnoise degradation on the isolated CPUs.

Setup:
  - Kernel: 6.12.78 with PREEMPT_RT
  - Hardware: x86_64, dual-socket (CPUs 0-63)
  - Boot params: nohz_full=1-16,33-48 isolcpus=nohz,domain,managed_irq,1-16,33-48
    rcu_nocbs=1-31,33-63 kthread_cpus=0,32 irqaffinity=17-31,49-63
  - Tool: osnoise tracer (./osnoise -c 1-16,33-48)

With commit applied (spinlock, kernel 6.12.78-vanilla-0):

  CPU    RUNTIME   MAX_NOISE   AVAIL%      NOISE  NMI   IRQ   SIRQ  Thread
  [001]   950000       50163   94.719%        14    0   6864     0    5922
  [004]   950000       50294   94.705%        14    0   6864     0    5920
  [007]   950000       49782   94.759%        14    0   6864     1    5921
  [033]   950000       49528   94.786%        15    0   6864     2    5922
  [016]   950000       48551   94.889%        20    0   6863    19    5942
  [008]   950000       44343   95.332%        14    0   6864     0    5925

With commit reverted (rwlock restored, kernel 6.12.78-vanilla-1):

  CPU    RUNTIME   MAX_NOISE   AVAIL%      NOISE  NMI   IRQ   SIRQ  Thread
  [001]   950000           0   100.000%       0    0      6     0       0
  [004]   950000           0   100.000%       0    0      4     0       0
  [007]   950000           0   100.000%       0    0      4     0       0
  [033]   950000           0   100.000%       0    0      4     0       0
  [016]   950000           0   100.000%       0    0      5     0       0
  [008]   950000           7    99.999%       7    0      5     0       0

Summary across all isolated cores (32 CPUs):

                          With spinlock       With rwlock (reverted)
  MAX noise (ns):         44,343 - 51,869     0 - 10
  IRQ count/sample:       ~6,650 - 6,870      3 - 7
  Thread noise/sample:    ~5,700 - 5,940      0 - 1
  CPU availability:       94.5% - 95.3%       ~100%

The regression is roughly 3 orders of magnitude in noise on isolated
cores. The test was run over many consecutive samples and the pattern
is consistent: with the spinlock, every isolated core sees thousands
of IRQs and ~50µs of noise per 950ms sample window. With the rwlock,
the cores are essentially silent.

Note that CPU 016 occasionally shows SIRQ noise (softirq) with both
kernels, which is a separate known issue with the tick on the first
nohz_full CPU. The eventpoll regression is the dominant noise source.

My understanding of the root cause: the original rwlock allowed
ep_poll_callback() (producer side, running from IRQ context on any CPU)
to use read_lock, which does not cause cross-CPU contention on isolated
cores when no local epoll activity exists. With the spinlock conversion,
on PREEMPT_RT spinlock_t becomes an rt_mutex. This means that even if
the isolated core is not involved in any epoll activity, the lock's
cacheline bouncing and potential PI-boosted wakeups from housekeeping
CPUs can inject noise into the isolated cores via IPI or cache
invalidation traffic.

The commit message acknowledges the throughput regression but argues
real workloads won't notice. However, for RT/latency-sensitive
deployments with CPU isolation, the impact is severe and measurable
even with zero local epoll usage.

I believe this needs either:
  a) A revert of the backport for stable RT trees, or
  b) A fix that avoids the spinlock contention path for isolated CPUs

I can provide the full osnoise trace data if needed.

Tested on:
  Linux system-0 6.12.78-vanilla-{0,1} SMP PREEMPT_RT x86_64
  Linux system-0 6.12.57-vanilla-{0,1} SMP PREEMPT_RT x86_64

Thanks,
Ionut.


* Re: [REGRESSION] osnoise: "eventpoll: Replace rwlock with spinlock" causes ~50µs noise spikes on isolated PREEMPT_RT cores
  2026-03-26 14:00 [REGRESSION] osnoise: "eventpoll: Replace rwlock with spinlock" causes ~50µs noise spikes on isolated PREEMPT_RT cores Ionut Nechita (Wind River)
@ 2026-03-26 14:31 ` Sebastian Andrzej Siewior
  2026-03-26 14:52 ` Greg KH
  1 sibling, 0 replies; 3+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-03-26 14:31 UTC (permalink / raw)
  To: Ionut Nechita (Wind River)
  Cc: namcao, brauner, linux-fsdevel, linux-rt-users, stable,
	linux-kernel, frederic, vschneid, gregkh, chris.friesen,
	viorel-catalin.rapiteanu, iulian.mocanu

On 2026-03-26 16:00:57 [+0200], Ionut Nechita (Wind River) wrote:
> Summary across all isolated cores (32 CPUs):
> 
>                           With spinlock       With rwlock (reverted)
>   MAX noise (ns):         44,343 - 51,869     0 - 10
>   IRQ count/sample:       ~6,650 - 6,870      3 - 7
>   Thread noise/sample:    ~5,700 - 5,940      0 - 1
>   CPU availability:       94.5% - 95.3%       ~100%

Is there some load running, or is the system idle apart from osnoise?

> My understanding of the root cause: the original rwlock allowed
> ep_poll_callback() (producer side, running from IRQ context on any CPU)
> to use read_lock, which does not cause cross-CPU contention on isolated
> cores when no local epoll activity exists. With the spinlock conversion,
> on PREEMPT_RT spinlock_t becomes an rt_mutex. This means that even if
> the isolated core is not involved in any epoll activity, the lock's
> cacheline bouncing and potential PI-boosted wakeups from housekeeping
> CPUs can inject noise into the isolated cores via IPI or cache
> invalidation traffic.

With read_lock() the lock can be acquired by multiple readers at once.
Each read acquisition increments the "reader counter", so there is cache
line activity. But if an isolated CPU does not participate, it does not
participate. With the change to spinlock_t there can be only one lock
holder at a time, so the others have to wait; and again, isolated cores
which don't participate are not affected.

> The commit message acknowledges the throughput regression but argues
> real workloads won't notice. However, for RT/latency-sensitive
> deployments with CPU isolation, the impact is severe and measurable
> even with zero local epoll usage.
> 
> I believe this needs either:
>   a) A revert of the backport for stable RT trees, or

I highly doubt it, since it affected RT loads.

>   b) A fix that avoids the spinlock contention path for isolated CPUs
> 
> I can provide the full osnoise trace data if needed.

So the question is why the isolated cores are affected if they don't
participate in epoll.

> Tested on:
>   Linux system-0 6.12.78-vanilla-{0,1} SMP PREEMPT_RT x86_64
>   Linux system-0 6.12.57-vanilla-{0,1} SMP PREEMPT_RT x86_64
> 
> Thanks,
> Ionut.

Sebastian


* Re: [REGRESSION] osnoise: "eventpoll: Replace rwlock with spinlock" causes ~50µs noise spikes on isolated PREEMPT_RT cores
  2026-03-26 14:00 [REGRESSION] osnoise: "eventpoll: Replace rwlock with spinlock" causes ~50µs noise spikes on isolated PREEMPT_RT cores Ionut Nechita (Wind River)
  2026-03-26 14:31 ` Sebastian Andrzej Siewior
@ 2026-03-26 14:52 ` Greg KH
  1 sibling, 0 replies; 3+ messages in thread
From: Greg KH @ 2026-03-26 14:52 UTC (permalink / raw)
  To: Ionut Nechita (Wind River)
  Cc: namcao, brauner, linux-fsdevel, linux-rt-users, stable,
	linux-kernel, frederic, vschneid, chris.friesen,
	viorel-catalin.rapiteanu, iulian.mocanu

On Thu, Mar 26, 2026 at 04:00:57PM +0200, Ionut Nechita (Wind River) wrote:
> Hi,
> 
> I'm reporting a regression introduced by commit 0c43094f8cc9
> ("eventpoll: Replace rwlock with spinlock"), backported to stable 6.12.y.

Does this regression also show up in the 6.18 release and newer?  If so,
please work to address it there, as that is where it needs to be
handled first.

thanks,

greg k-h

