From: Leonardo Bras
To: "Vlastimil Babka (SUSE)"
Cc: Leonardo Bras, Marcelo Tosatti, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Johannes Weiner, Michal Hocko, Roman Gushchin,
 Shakeel Butt, Muchun Song, Andrew Morton, Christoph Lameter,
 Pekka Enberg, David Rientjes, Joonsoo Kim,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, Thomas Gleixner, Waiman Long,
 Boqun Feng, Frederic Weisbecker
Subject: Re: [PATCH v2 2/5] Introducing qpw_lock() and per-cpu queue & flush work
Date: Tue, 10 Mar 2026 21:16:52 -0300
X-Mailer: git-send-email 2.53.0
In-Reply-To: <477bf592-489e-45e6-a386-6b3fdb289c39@kernel.org>
References: <20260302154945.143996316@redhat.com>
 <20260302155105.214878062@redhat.com>
 <682380ba-c8f3-4023-928c-2152e934f8db@kernel.org>
 <477bf592-489e-45e6-a386-6b3fdb289c39@kernel.org>

On Mon, Mar 09, 2026 at 11:14:23AM +0100, Vlastimil Babka (SUSE) wrote:
> On 3/8/26 19:00, Leonardo Bras wrote:
> > On Tue, Mar 03, 2026 at 01:02:13PM -0300, Marcelo Tosatti wrote:
> >> On Tue, Mar 03, 2026 at 01:03:36PM +0100, Vlastimil Babka (SUSE) wrote:
> >> > On 3/2/26 16:49, Marcelo Tosatti wrote:
> >> > > +#define local_qpw_lock(lock)					\
> >> > > +	do {							\
> >> > > +		if (static_branch_maybe(CONFIG_QPW_DEFAULT, &qpw_sl)) { \
> >> > > +			migrate_disable();
> >> >
> >> > Have you considered using migrate_disable() on PREEMPT_RT and
> >> > preempt_disable() on !PREEMPT_RT since it's cheaper? It's what the pcp
> >> > locking in mm/page_alloc.c does, for that reason. It should reduce the
> >> > overhead with qpw=1 on !PREEMPT_RT.
> >>
> >> migrate_disable:
> >> Patched kernel, CONFIG_QPW=y, qpw=1: 192 cycles
> >>
> >> preempt_disable:
> >> [   65.497223] kmalloc_bench: Avg cycles per kmalloc: 184 cycles
> >>
> >> I tried it before, but it was crashing for some reason which I didn't
> >> look into (perhaps PREEMPT_RT was enabled).
> >>
> >> Will change this for the next iteration, thanks.
> >
> > Hi all,
> >
> > That made me remember that the RT spinlock already uses migrate_disable(),
> > and non-RT spinlocks already have preempt_disable().
> >
> > Maybe it's actually worth adding a local_spin_lock() in spinlock{,_rt}.c
> > which would get the per-cpu variable inside the preempt/migrate-disabled
> > area, and making use of it in the qpw code. That way we avoid nesting
> > migrate_disable() or preempt_disable(), further reducing the impact.
>
> That would be nice indeed. But since the nested disable/enable cost should
> be low, and the spinlock code rather complicated, it might be tough to sell.
> It would also be great to have those trylocks inline on all arches.

Fair enough. I will take a look at the spinlock code later; maybe we can
have one in the qpw code that can be used internally without impacting
other users.

> > The alternative is to not have migrate/preempt disable here and instead
> > trust the ones inside the locking primitives. There is a chance of
> > contention, but I don't remember being able to detect it.
>
> So then we could pick the lock on one cpu but then get migrated and
> actually lock it on another cpu. Is contention the only possible downside
> of this, or could it lead to subtle bugs depending on the particular user?
> The paths that don't flush stuff on remote cpus but expect to work with
> the local cpu's structure in a fastpath might get broken. I'd be wary of
> this.

Yeah, that's right. Contention could be really bad for realtime, as rare as
it may be. And you are right about the potential bugs: for user functions
that operate on local per-cpu data with this_cpu_read/write(), it would be
expensive to switch to per_cpu_read/write(), so IIRC Marcelo did not convert
those in functions that always run on the local cpu.

If the cpu migrates before we get the lock, we will still safely operate
remotely on that cpu's data, but any this_cpu_*() in the function will
operate on the local cpu instead of the remote cpu.

So you and Marcelo are correct: we can't have migration/preemption happening
during the routine, which means we need to disable them before we pick the
cpu.

Thanks!
Leo
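P.S.: to make the idea concrete, here is a rough, untested sketch of what a
qpw-internal helper could look like. Only local_qpw_lock(), qpw_sl and
CONFIG_QPW_DEFAULT come from the patch; the qpw_cpu_pin()/unpin() and
local_qpw_spin_lock() names are invented for illustration, and this still
nests a disable inside the one done by the spinlock primitives -- avoiding
that for real would need support inside the spinlock code itself. The point
it shows is the ordering: pin the cpu first, then resolve the per-cpu lock,
so a migration cannot slip in between the lookup and the lock.

```c
/*
 * Sketch only, not a tested implementation.
 *
 * Pick the cheaper primitive on !PREEMPT_RT, mirroring what the pcp
 * locking in mm/page_alloc.c does; on PREEMPT_RT preemption must stay
 * enabled, so migrate_disable() is used instead.
 */
#ifdef CONFIG_PREEMPT_RT
#define qpw_cpu_pin()	migrate_disable()
#define qpw_cpu_unpin()	migrate_enable()
#else
#define qpw_cpu_pin()	preempt_disable()
#define qpw_cpu_unpin()	preempt_enable()
#endif

/*
 * Pin to the current cpu *before* looking up this cpu's lock, so the
 * lookup and the spin_lock() cannot be split by a migration, and any
 * this_cpu_*() done under the lock hits the same cpu's data.
 */
#define local_qpw_spin_lock(pcp_lock)					\
({									\
	spinlock_t *__l;						\
	qpw_cpu_pin();							\
	__l = this_cpu_ptr(pcp_lock);					\
	spin_lock(__l);							\
	__l;								\
})

#define local_qpw_spin_unlock(__l)					\
	do {								\
		spin_unlock(__l);					\
		qpw_cpu_unpin();					\
	} while (0)
```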