Date: Tue, 7 Apr 2026 10:49:13 +0200
From: Peter Zijlstra
To: Andres Freund
Cc: Salvatore Dipietro, linux-kernel@vger.kernel.org, alisaidi@amazon.com,
    blakgeof@amazon.com, abuehaze@amazon.de, dipietro.salvatore@gmail.com,
    Thomas Gleixner, Valentin Schneider, Sebastian Andrzej Siewior,
    Mark Rutland
Subject: Re: [PATCH 0/1] sched: Restore PREEMPT_NONE as default
Message-ID: <20260407084913.GF3738010@noisy.programming.kicks-ass.net>
References: <20260403191942.21410-1-dipiets@amazon.it>
    <20260403213207.GF2872@noisy.programming.kicks-ass.net>

On Sat, Apr 04, 2026 at 01:42:22PM -0400, Andres Freund wrote:
> Hi,
>
> On 2026-04-03 23:32:07 +0200, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 07:19:36PM +0000, Salvatore Dipietro wrote:
> > > We are reporting a throughput and latency regression on PostgreSQL
> > > pgbench (simple-update) on arm64 caused by commit 7dadeaa6e851
> > > ("sched: Further restrict the preemption modes") introduced in
> > > v7.0-rc1.
> > >
> > > The regression manifests as a 0.51x throughput drop on a pgbench
> > > simple-update workload with 1024 clients on a 96-vCPU
> > > (AWS EC2 m8g.24xlarge) Graviton4 arm64 system. Perf profiling
> > > shows 55% of CPU time is consumed spinning in PostgreSQL's
> > > userspace spinlock (s_lock()) under PREEMPT_LAZY:
> > >
> > > |- 56.03% - StartReadBuffer
> > > |- 55.93% - GetVictimBuffer
> > > |- 55.93% - StrategyGetBuffer
> > > |- 55.60% - s_lock <<<< 55% of time
> > > | |- 0.39% - el0t_64_irq
> > > | |- 0.10% - perform_spin_delay
> > > |- 0.08% - LockBufHdr
> > > |- 0.07% - hash_search_with_hash_value
> > > |- 0.40% - WaitReadBuffers
> >
> > The fix here is to make PostgreSQL make use of rseq slice extension:
> >
> > https://lkml.kernel.org/r/20251215155615.870031952@linutronix.de
> >
> > That should limit the exposure to lock holder preemption (unless
> > PostgreSQL is doing seriously egregious things).
>
> Maybe we should, but requiring the use of a new low level facility that was
> introduced in the 7.0 kernel, to address a regression that exists only in
> 7.0+, seems not great.
>
> It's not like it's a completely trivial thing to add support for either, so I
> doubt it'll be the right thing to backpatch it into already released major
> versions of postgres.

Just to clarify my response: all I really saw was 'userspace spinlock'
and we just did the rseq slice ext stuff (with Oracle) for exactly this
type of thing. And even NONE is susceptible to scheduling the lock
holder.

It was also the last email I did on Good Friday and thinking hard really
wasn't high on the list of things :-)

Anyway, IF we revert -- and I think you've already made a fine case for
not doing that -- it will be a very temporary thing, NONE will go away.

As to kernel version thing; why should people upgrade to the very latest
kernel release and not also be expected to upgrade PostgreSQL to the
very latest? If they want to use old PostgreSQL, they can use old kernel
too, right? Both have stable releases that should keep them afloat for a
while.

Again, not saying we can't do better, but also sometimes you have to
break eggs to make cake :-)
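
For anyone following the thread who has not run into lock holder preemption before, a toy model may help show why a short slice extension matters for a userspace spinlock. The sketch below is only an illustration under stated assumptions: it is not the kernel's rseq interface and not PostgreSQL's s_lock(). It simulates one CPU running threads round-robin, and every name in it (`simulate`, `slice_ticks`, `extension_ticks`, and so on) is hypothetical. A thread that still holds the lock when its slice ends is either preempted, leaving the other threads to spin, or allowed a bounded number of extra ticks to finish the critical section.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    work_left: int      # ticks of lock-free work before the next lock attempt
    cs_left: int = 0    # ticks left inside the critical section


def simulate(n_threads, slice_ticks, work_ticks, cs_ticks, total_ticks,
             extension_ticks=0):
    """Toy single-CPU round-robin model (hypothetical, for illustration only).

    Each thread repeats: work_ticks of lock-free work, then cs_ticks while
    holding one shared spinlock. Returns (completed_sections, spin_ticks).
    """
    threads = [Thread(work_left=work_ticks) for _ in range(n_threads)]
    owner = None        # index of the current lock holder, if any
    current = 0         # index of the thread on the CPU
    used = 0            # ticks the current thread has run in this turn
    completed = 0
    spin = 0

    for _ in range(total_ticks):
        t = threads[current]
        if t.work_left > 0:
            t.work_left -= 1                  # lock-free work
        elif owner is None or owner == current:
            if owner is None:                 # lock is free: take it
                owner = current
                t.cs_left = cs_ticks
            t.cs_left -= 1                    # one tick of critical section
            if t.cs_left == 0:                # done: release the lock
                owner = None
                t.work_left = work_ticks
                completed += 1
        else:
            spin += 1                         # holder is off-CPU: wasted tick

        used += 1
        if used < slice_ticks:
            continue                          # slice not yet exhausted
        if owner == current and used < slice_ticks + extension_ticks:
            continue                          # bounded extension for the holder
        current = (current + 1) % n_threads   # preempt and pick the next thread
        used = 0

    return completed, spin


if __name__ == "__main__":
    # Made-up parameters: 4 threads, 10-tick slice, 7 work + 5 locked ticks.
    print("no extension:  ", simulate(4, 10, 7, 5, 1200))
    print("with extension:", simulate(4, 10, 7, 5, 1200, extension_ticks=3))
```

With these made-up parameters the critical section straddles the end of the slice, so the run without an extension burns ticks spinning while the holder waits for its next turn, and the run with a 3-tick extension never spins. The model says nothing about real magnitudes on a 96-vCPU machine; it only shows the shape of the problem.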