From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 14 May 2026 12:03:48 -0400
From: Steven Rostedt <rostedt@goodmis.org>
To: Dmitry Ilvokhin
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
 Thomas Bogendoerfer, Juergen Gross, Ajay Kaher, Alexey Makhalov,
 Broadcom internal kernel review list, Thomas Gleixner, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin", Arnd Bergmann,
 Dennis Zhou, Tejun Heo, Christoph Lameter, Masami Hiramatsu,
 Mathieu Desnoyers, linux-kernel@vger.kernel.org,
 linux-mips@vger.kernel.org, virtualization@lists.linux.dev,
 linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org, kernel-team@meta.com,
 "Paul E. McKenney"
Subject: Re: [PATCH v6 5/7] locking: Add contended_release tracepoint to qspinlock
Message-ID: <20260514120348.7a64facc@gandalf.local.home>
References: <5d7ea75ffe74a785e6b234ada9f23c6373d4b4c1.1777999826.git.d@ilvokhin.com>
 <20260513114102.50f4ca68@gandalf.local.home>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII

On Thu, 14 May 2026 14:13:35 +0000
Dmitry Ilvokhin wrote:

> > > +void __lockfunc queued_spin_release_traced(struct qspinlock *lock)
> > > +{
> > > +	if (queued_spin_is_contended(lock))
> > > +		trace_call__contended_release(lock);
> > > +	queued_spin_release(lock);
> >
> > And then
> > remove the duplicate call of "queued_spin_release()" here.
>
> This is the scenario the comment above the static branch describes.
> Here's what it looks like in practice on x86_64 (defconfig, compiled
> with GCC 11).
>
> Current design (trace + unlock combined, with return):
>
> 	endbr64
> 	xchg	%ax,%ax			; NOP (static branch)
> 	movb	$0x0,(%rdi)		; unlock
> 	decl	%gs:__preempt_count
> 	je	preempt
> 	jmp	__x86_return_thunk
> 	call	queued_spin_release_traced	; cold
> 	jmp	preempt_handling		; cold
> 	call	__SCT__preempt_schedule
> 	jmp	__x86_return_thunk
>
> With the trace-only function (no return, unlock after the call):
>
> 	endbr64
> 	push	%rbx			; saves callee-saved rbx (!)
> 	mov	%rdi,%rbx		; preserve lock across call (!)
> 	xchg	%ax,%ax			; NOP (static branch)
> 	movb	$0x0,(%rbx)		; unlock
> 	decl	%gs:__preempt_count
> 	je	preempt
> 	pop	%rbx			; callee-saved restore (!)
> 	jmp	__x86_return_thunk
> 	call	queued_spin_release_traced	; cold
> 	jmp	unlock				; cold
> 	call	__SCT__preempt_schedule
> 	pop	%rbx
> 	jmp	__x86_return_thunk
>
> Three extra instructions marked by "!" on the hot path (push, mov, pop),
> all wasted when the tracepoint is off. That's the main reason for
> combining trace and unlock in the same out-of-line function.

Ah, because the return makes it into two tail calls.

I still don't like the duplication. Perhaps add some more comments about
needing to update the other location if anything changes here? And
perhaps comment that this duplicate code helps the assembly.

-- 
Steve