From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 2 Apr 2026 12:43:19 +0200
From: Peter Zijlstra
To: "Deng, Pan"
Cc: "mingo@kernel.org", "rostedt@goodmis.org", "linux-kernel@vger.kernel.org",
	"Li, Tianyou", "tim.c.chen@linux.intel.com", "Chen, Yu C"
Subject: Re: [PATCH v2 1/4] sched/rt: Optimize cpupri_vec layout to mitigate cache line contention
Message-ID: <20260402104319.GY3738786@noisy.programming.kicks-ass.net>
References: <24c460fb48d86a5b990acbb42d0d29d91dfc427c.1753076363.git.pan.deng@intel.com>
 <20260320100903.GR3738786@noisy.programming.kicks-ass.net>
 <20260324121146.GC3738010@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Mar 27, 2026 at 10:17:13AM +0000, Deng, Pan wrote:
> > On Tue, Mar 24, 2026 at 09:36:14AM +0000, Deng, Pan wrote:
> > > Regarding this patch, yes, using the cacheline-aligned attribute
> > > could increase memory usage.
> > > After internal discussion, we are considering an alternative that
> > > mitigates the wasted memory: use kmalloc() to allocate the count in
> > > a separate memory area rather than placing the count and the cpumask
> > > together in this structure. The rationale is that writes through the
> > > counter pointer and reads of the cpumask would then land in
> > > different memory, which could reduce the rate of cache false
> > > sharing; besides, the slab/slub allocator may place the objects in
> > > different cache lines, reducing cache contention. The drawback of a
> > > dynamically allocated counter is that we have to manage the
> > > counters' life cycle.
> > > Could you please advise whether the current cacheline-aligned
> > > attribute or the kmalloc() approach is preferred?
> >
> > Well, you'd have to allocate a full cacheline anyway. If you allocate
> > N 4-byte (counter) objects, there's a fair chance they end up in the
> > same cacheline (it's a SLAB after all) and then you're back to having
> > a ton of false sharing.
> >
> > Anyway, for your specific workload, why isn't partitioning a viable
> > solution? It would not need any kernel modifications and would get
> > rid of the contention entirely.
>
> Thank you very much for pointing this out.
>
> We understand cpuset partitioning would eliminate the contention.
> However, in managed container platforms (e.g., Kubernetes), users can
> obtain RT capabilities for their workloads via CAP_SYS_NICE, but they
> don't have host-level privileges to create cpuset partitions.

So because Kubernetes is shit, you're going to patch the kernel? Isn't
that backwards? Should you not instead try and fix this kubernetes
thing?