From: Peter Zijlstra <peterz@infradead.org>
To: Andres Freund <andres@anarazel.de>
Cc: Salvatore Dipietro <dipiets@amazon.it>,
linux-kernel@vger.kernel.org, alisaidi@amazon.com,
blakgeof@amazon.com, abuehaze@amazon.de,
dipietro.salvatore@gmail.com, Thomas Gleixner <tglx@kernel.org>,
Valentin Schneider <vschneid@redhat.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Mark Rutland <mark.rutland@arm.com>
Subject: Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default
Date: Tue, 7 Apr 2026 10:49:13 +0200
Message-ID: <20260407084913.GF3738010@noisy.programming.kicks-ass.net>
In-Reply-To: <yr3inlzesdb45n6i6lpbimwr7b25kqkn37qzlvvzgad5hfd7ut@xv4cihno76wu>

On Sat, Apr 04, 2026 at 01:42:22PM -0400, Andres Freund wrote:
> Hi,
>
> On 2026-04-03 23:32:07 +0200, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 07:19:36PM +0000, Salvatore Dipietro wrote:
> > > We are reporting a throughput and latency regression on PostgreSQL
> > > pgbench (simple-update) on arm64 caused by commit 7dadeaa6e851
> > > ("sched: Further restrict the preemption modes") introduced in
> > > v7.0-rc1.
> > >
> > > The regression manifests as a 0.51x throughput drop on a pgbench
> > > simple-update workload with 1024 clients on a 96-vCPU
> > > (AWS EC2 m8g.24xlarge) Graviton4 arm64 system. Perf profiling
> > > shows 55% of CPU time is consumed spinning in PostgreSQL's
> > > userspace spinlock (s_lock()) under PREEMPT_LAZY:
> > >
> > > |- 56.03% - StartReadBuffer
> > > |- 55.93% - GetVictimBuffer
> > > |- 55.93% - StrategyGetBuffer
> > > |- 55.60% - s_lock <<<< 55% of time
> > > | |- 0.39% - el0t_64_irq
> > > | |- 0.10% - perform_spin_delay
> > > |- 0.08% - LockBufHdr
> > > |- 0.07% - hash_search_with_hash_value
> > > |- 0.40% - WaitReadBuffers
> >
> > The fix here is to make PostgreSQL make use of rseq slice extension:
> >
> > https://lkml.kernel.org/r/20251215155615.870031952@linutronix.de
> >
> > That should limit the exposure to lock holder preemption (unless
> > PostgreSQL is doing seriously egregious things).
>
> Maybe we should, but requiring the use of a new low level facility that was
> introduced in the 7.0 kernel, to address a regression that exists only in
> 7.0+, seems not great.
>
> It's not like it's a completely trivial thing to add support for either, so I
> doubt it'll be the right thing to backpatch it into already released major
> versions of postgres.

Just to clarify my response: all I really saw was 'userspace spinlock'
and we just did the rseq slice ext stuff (with Oracle) for exactly this
type of thing. And even NONE is susceptible to scheduling the lock
holder.
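
For readers who have not run into lock holder preemption, the following is a minimal sketch of a userspace test-and-set spinlock, written under stated assumptions. Every name in it (demo_spinlock, demo_lock, demo_run, and so on) is hypothetical; it is not PostgreSQL's s_lock() and it does not use the rseq slice extension. It only marks the window in which descheduling the lock holder leaves every waiter burning CPU, whatever the preemption model.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical illustration only; not PostgreSQL's s_lock(). */
typedef struct { atomic_flag held; } demo_spinlock;

#define DEMO_SPINS_BEFORE_YIELD 1000
#define DEMO_MAX_THREADS 64

static void demo_lock(demo_spinlock *l)
{
    unsigned spins = 0;

    /* test-and-set returns the previous value: true means the lock is held. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* Waiters make no progress here while the holder is off-CPU. */
        if (++spins >= DEMO_SPINS_BEFORE_YIELD) {
            sched_yield();  /* crude backoff, an assumption of this sketch */
            spins = 0;
        }
    }
}

static void demo_unlock(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct demo_worker_arg {
    demo_spinlock *lock;
    long *counter;
    long iters;
};

static void *demo_worker(void *p)
{
    struct demo_worker_arg *a = p;

    for (long i = 0; i < a->iters; i++) {
        demo_lock(a->lock);
        /* Critical section: if the scheduler preempts this thread here,
         * all other threads spin in demo_lock() until it runs again. */
        (*a->counter)++;
        demo_unlock(a->lock);
    }
    return NULL;
}

/* Runs nthreads workers, each doing iters locked increments, and
 * returns the final counter value. */
long demo_run(int nthreads, long iters)
{
    demo_spinlock lock = { ATOMIC_FLAG_INIT };
    long counter = 0;
    pthread_t tids[DEMO_MAX_THREADS];
    struct demo_worker_arg arg = { &lock, &counter, iters };
    int started = 0;

    assert(nthreads >= 0 && nthreads <= DEMO_MAX_THREADS);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, demo_worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return counter;
}
```

Calling demo_run() with more threads than CPUs makes the problem visible in a profile: the time lands in the spin loop rather than in the one-line critical section. The short critical section is the window that a mechanism such as the slice extension is meant to cover; how a real application would request it is outside what this sketch shows.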

It was also the last email I did on Good Friday and thinking hard really
wasn't high on the list of things :-)

Anyway, IF we revert -- and I think you've already made a fine case for
not doing that -- it will be a very temporary thing, NONE will go away.

As to the kernel version thing: why should people upgrade to the very
latest kernel release and not also be expected to upgrade PostgreSQL to
the very latest?

If they want to use old PostgreSQL, they can use old kernel too, right?
Both have stable releases that should keep them afloat for a while.

Again, not saying we can't do better, but also sometimes you have to
break eggs to make cake :-)