From: david.laight.linux@gmail.com
To: Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng,
    linux-kernel@vger.kernel.org, Linus Torvalds, Yafang Shao,
    Steven Rostedt
Cc: David Laight
Subject: [PATCH v3 next 0/5] locking/osq_lock: Optimisations to osq_lock code
Date: Fri, 6 Mar 2026 22:51:45 +0000
Message-Id: <20260306225150.93178-1-david.laight.linux@gmail.com>
X-Mailer: git-send-email 2.39.5
X-Mailing-List: linux-kernel@vger.kernel.org

From: David Laight

This is a slightly edited copy of v2 from 2 years ago.
I've re-read the comments (on v1 and v2).

Patch #3 now unconditionally calls decode_cpu() when stabilizing @prev
(I'm not at all sure the cpu number can ever be unchanged.)
Patch #5 now converts almost all the cpu numbers to 'unsigned int'.

For patch #2 I've found a note that:
    kernel test robot noticed a 10.7% improvement of stress-ng.netlink-task.ops_per_sec

Notes from v2:

Patch #1 is the node->locked part of v1's patch #2.

Patch #2 removes the pretty much guaranteed cache line reload getting
the cpu number (from node->prev) for the vcpu_is_preempted() check.
It is (basically) the old #5 with the addition of a READ_ONCE()
and leaving the '+ 1' offset (for patch 3).

Patch #3 ends up removing both node->cpu and node->prev.
This saves issues initialising node->cpu.
Basically node->cpu was only ever read as node->prev->cpu in the unqueue code.
Most of the time it is the value read from lock->tail that was used to
obtain 'prev' in the first place.
The only time it is different is in the unlock race path where 'prev'
is re-read from node->prev - updated right at the bottom of osq_lock().
So the updated node->prev_cpu can be used (and prev obtained from it) without
worrying about only one of node->prev and node->prev_cpu being updated.

Linus did suggest just saving the cpu numbers instead of pointers.
It actually works for 'prev' but not 'next'.

Patch #4 removes the unnecessary node->next = NULL assignment from
the top of osq_lock().

Patch #5 just stops gcc using two separate instructions to decrement
the offset cpu number and then convert it to 64 bits.
Linus got annoyed with it, and I'd spotted it as well.
I don't seem to be able to get gcc to convert __per_cpu_offset[cpu - 1]
to (__per_cpu_offset - 1)[cpu] (cpu is offset by one) but, in any case,
it would still need zero extending in the common case.

David Laight (5):
  Defer clearing node->locked until the slow osq_lock() path.
  Optimise vcpu_is_preempted() check.
  Use node->prev_cpu instead of saving node->prev.
  Optimise decode_cpu() and per_cpu_ptr().
  Avoid writing to node->next in the osq_lock() fast path.

 kernel/locking/osq_lock.c | 56 +++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 29 deletions(-)

-- 
2.39.5