From: Dmitry Ilvokhin
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
	Thomas Bogendoerfer, Juergen Gross, Ajay Kaher, Alexey Makhalov,
	Broadcom internal kernel review list, Thomas Gleixner,
	Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin",
	Arnd Bergmann, Dennis Zhou, Tejun Heo, Christoph Lameter,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers
Cc: linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
	virtualization@lists.linux.dev, linux-arch@vger.kernel.org,
	linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
	kernel-team@meta.com, "Paul E. McKenney", Dmitry Ilvokhin
Subject: [PATCH v5 5/7] locking: Add contended_release tracepoint to qspinlock
Date: Thu, 16 Apr 2026 15:05:11 +0000
Message-ID: 
X-Mailer: git-send-email 2.53.0
In-Reply-To: 
References: 
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Use the arch-overridable queued_spin_release(), introduced in the
previous commit, to ensure the tracepoint works correctly across all
architectures, including those with custom unlock implementations
(e.g. x86 paravirt).

When the tracepoint is disabled, the only addition to the hot path is a
single NOP instruction (the static branch). When enabled, the
contention check, trace call, and unlock are combined in an out-of-line
function to minimize hot path impact: this way the compiler does not
need to preserve the lock pointer in a callee-saved register across the
trace call.
Binary size impact (x86_64, defconfig):

  uninlined unlock (common case):   +680 bytes (+0.00%)
  inlined unlock (worst case):    +83659 bytes (+0.21%)

The inlined unlock case could not be reached through Kconfig options on
x86_64, since PREEMPT_BUILD unconditionally selects UNINLINE_SPIN_UNLOCK
there. The UNINLINE_SPIN_UNLOCK guards were manually inverted to force
the unlock path inline and estimate the worst-case binary size increase.
In practice, configurations with UNINLINE_SPIN_UNLOCK=n have already
opted against binary size optimization, so the inlined worst case is
unlikely to be a concern.

Architectures with fully custom qspinlock implementations (e.g.
PowerPC) are not covered by this change.

Signed-off-by: Dmitry Ilvokhin
---
 include/asm-generic/qspinlock.h | 18 ++++++++++++++++++
 kernel/locking/qspinlock.c      |  8 ++++++++
 2 files changed, 26 insertions(+)

diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index df76f34645a0..915a4c2777f6 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -41,6 +41,7 @@
 
 #include
 #include
+#include
 
 #ifndef queued_spin_is_locked
 /**
@@ -129,12 +130,29 @@ static __always_inline void queued_spin_release(struct qspinlock *lock)
 }
 #endif
 
+DECLARE_TRACEPOINT(contended_release);
+
+extern void queued_spin_release_traced(struct qspinlock *lock);
+
 /**
  * queued_spin_unlock - unlock a queued spinlock
  * @lock : Pointer to queued spinlock structure
+ *
+ * Generic tracing wrapper around the arch-overridable
+ * queued_spin_release().
  */
 static __always_inline void queued_spin_unlock(struct qspinlock *lock)
 {
+	/*
+	 * Trace and release are combined in queued_spin_release_traced() so
+	 * the compiler does not need to preserve the lock pointer across the
+	 * function call, avoiding callee-saved register save/restore on the
+	 * hot path.
+	 */
+	if (tracepoint_enabled(contended_release)) {
+		queued_spin_release_traced(lock);
+		return;
+	}
 	queued_spin_release(lock);
 }
 
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index af8d122bb649..c72610980ec7 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -104,6 +104,14 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
 #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
 #endif
 
+void __lockfunc queued_spin_release_traced(struct qspinlock *lock)
+{
+	if (queued_spin_is_contended(lock))
+		trace_contended_release(lock);
+	queued_spin_release(lock);
+}
+EXPORT_SYMBOL(queued_spin_release_traced);
+
 #endif /* _GEN_PV_LOCK_SLOWPATH */
 
 /**
-- 
2.52.0