Date: Thu, 9 Apr 2026 14:46:42 +0200
From: Peter Zijlstra
To: Tim Chen
Cc: Ingo Molnar, K Prateek Nayak, "Gautham R. Shenoy", Vincent Guittot,
	Chen Yu, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
	Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
	Len Brown, Aubrey Li, Zhao Liu, Adam Li, Aaron Lu, Josh Don,
	Gavin Guo, Qais Yousef, Libo Chen, linux-kernel@vger.kernel.org
Subject: Re: [Patch v4 17/22] sched/cache: Avoid cache-aware scheduling for memory-heavy processes
Message-ID: <20260409124642.GC3126523@noisy.programming.kicks-ass.net>
References: <339bb2636c7306e17540268a9295a8e673b92804.1775065312.git.tim.c.chen@linux.intel.com>
In-Reply-To: <339bb2636c7306e17540268a9295a8e673b92804.1775065312.git.tim.c.chen@linux.intel.com>
On Wed, Apr 01, 2026 at 02:52:29PM -0700, Tim Chen wrote:
> From: Chen Yu
>
> Prateek
> and Tingyin reported that memory-intensive workloads (such as
> stream) can saturate memory bandwidth and caches on the preferred LLC
> when sched_cache aggregates too many threads.
>
> To mitigate this, estimate a process's memory footprint by comparing
> its RSS (anonymous and shared pages) to the size of the LLC. If RSS
> exceeds the LLC size, skip cache-aware scheduling.
>
> Note that RSS is only an approximation of the memory footprint.
> By default, the comparison is strict, but a later patch will allow
> users to provide a hint to adjust this threshold.
>
> According to the test from Adam, some systems do not have a shared L3
> but have shared L2 clusters instead. In this case, the L2 becomes the LLC[1].

This is pretty terrible. If you want the LLC size, add it to the
topology information (and ideally integrate with RDT) and make it
proportional to the cpumask size, such that if someone cuts the domain
into pieces, they get a proportional size etc.

Also, if we have NUMA_BALANCING on, that can provide a much better
estimate of the actual size.

Just using RSS seems like a very bad metric here.