From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman, patches@lists.linux.dev, Srikar Dronamraju, Laurent Dufour, Shrikanth Hegde, Nicholas Piggin, Michael Ellerman, "Nysal Jan K.A"
Subject: [PATCH 6.5 223/241] powerpc/qspinlock: Fix stale propagated yield_cpu
Date: Mon, 23 Oct 2023 12:56:49 +0200
Message-ID: <20231023104839.299887627@linuxfoundation.org>
In-Reply-To: <20231023104833.832874523@linuxfoundation.org>
References: <20231023104833.832874523@linuxfoundation.org>

6.5-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin

commit f9bc9bbe8afdf83412728f0b464979a72a3b9ec2 upstream.

yield_cpu is a sample of a preempted lock holder that gets propagated
back through the queue. Queued waiters use this to yield to the
preempted lock holder without continually sampling the lock word (which
would defeat the purpose of MCS queueing by bouncing the cache line).

The problem is that yield_cpu can become stale. It can take some time to
be passed down the chain, and if any queued waiter gets preempted then
it will cease to propagate the yield_cpu to later waiters.

This can result in yielding to a CPU that no longer holds the lock. That
is bad in general, and particularly so if the CPU is currently in H_CEDE
(idle): it then appears to be preempted, and some hypervisors (PowerVM)
can cause very long H_CONFER latencies waiting for the H_CEDE wakeup.
This results in latency spikes and hard lockups on oversubscribed
partitions with lock contention.

This is a minimal fix. Before yielding to yield_cpu, sample the lock
word to confirm yield_cpu is still the owner, and bail out if it is not.

Thanks to a bunch of people who reported this and tracked down the exact
problem using tracepoints and dispatch trace logs.

Fixes: 28db61e207ea ("powerpc/qspinlock: allow propagation of yield CPU down the queue")
Cc: stable@vger.kernel.org # v6.2+
Reported-by: Srikar Dronamraju
Reported-by: Laurent Dufour
Reported-by: Shrikanth Hegde
Debugged-by: "Nysal Jan K.A"
Signed-off-by: Nicholas Piggin
Tested-by: Shrikanth Hegde
Signed-off-by: Michael Ellerman
Link: https://msgid.link/20231016124305.139923-2-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/lib/qspinlock.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 253620979d0c..6dd2f46bd3ef 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -406,6 +406,9 @@ static __always_inline bool yield_to_prev(struct qspinlock *lock, struct qnode *
 	if ((yield_count & 1) == 0)
 		goto yield_prev; /* owner vcpu is running */
 
+	if (get_owner_cpu(READ_ONCE(lock->val)) != yield_cpu)
+		goto yield_prev; /* re-sample lock owner */
+
 	spin_end();
 
 	preempted = true;
-- 
2.42.0
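
For readers who want to poke at the decision outside the kernel, the two
checks that now guard the yield can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions: the
lock-word layout (bit 0 locked, owner CPU in bits 1 to 15) and every
sk_* name are hypothetical and exist only for this illustration. They
are not the kernel's definitions, and the real yield_to_prev() goes on
to consider the previous queued waiter instead of simply returning.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock-word layout, for this sketch only. */
#define SK_LOCKED      0x1u
#define SK_OWNER_SHIFT 1
#define SK_OWNER_MASK  0x7fffu

/* Build a lock word held by @cpu (illustrative helper). */
static inline uint32_t sk_make_lock(unsigned int cpu)
{
	return SK_LOCKED | ((cpu & SK_OWNER_MASK) << SK_OWNER_SHIFT);
}

/* Extract the owner CPU from a lock word (stand-in for get_owner_cpu()). */
static inline unsigned int sk_owner_cpu(uint32_t val)
{
	return (val >> SK_OWNER_SHIFT) & SK_OWNER_MASK;
}

/*
 * Return true only if a queued waiter should yield to the propagated
 * yield_cpu. A false return corresponds to "goto yield_prev" in the patch.
 */
static inline bool sk_should_yield_to_owner(uint32_t lock_val,
					    unsigned int yield_cpu,
					    uint32_t yield_count)
{
	if ((yield_count & 1) == 0)
		return false;	/* owner vcpu is running */

	if (sk_owner_cpu(lock_val) != yield_cpu)
		return false;	/* propagated yield_cpu is stale */

	return true;
}
```

With a lock word built by sk_make_lock(5), a waiter still carrying
yield_cpu == 3 gets false from sk_should_yield_to_owner() even when the
yield count is odd, which is the stale case the patch filters out.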