From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 377251A9FA0; Tue, 30 Sep 2025 16:02:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759248143; cv=none; b=SswIEvVZVz5K1L2VS68kcmRKbhzmdREbOnt15AGmnZF4FIRqUOWt401eqd2dg1clXOXr40lTpLxLCqwpmXf3RNp6ykqMeWyrHUF4NLPVUM6NFxaBt3Qw/95IrnSOHNsQEpNha+RguPkfRXaG1AIOBnPjS5uhKUflIC1aCxLr9bg= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759248143; c=relaxed/simple; bh=siM+CKFJj/a/G+7V5mMOO5hSCbBHXJMz98nqGsIKQ3o=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=I1I13PZcxmi9L3Qvn2MYMM1aod+YPqYSY4rjiAZi2tQ4MOKnCgxaCcio/MqmF/DO7Q30ECazTiSEBhbpSH6/iLctvlM1wh5nHsjjWPHzXR7NICFM+c3mZJ4LHhlyc7lv7r8Yp/Ei2Ka6Uk4q/UPTqdow32tep5g5crS16OBf9L8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 635591424; Tue, 30 Sep 2025 09:02:12 -0700 (PDT) Received: from e133380.arm.com (e133380.arm.com [10.1.197.68]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6B2D23F66E; Tue, 30 Sep 2025 09:02:18 -0700 (PDT) Date: Tue, 30 Sep 2025 17:02:15 +0100 From: Dave Martin To: "Luck, Tony" Cc: linux-kernel@vger.kernel.org, Reinette Chatre , James Morse , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , Jonathan Corbet , x86@kernel.org, linux-doc@vger.kernel.org Subject: Re: [PATCH] fs/resctrl,x86/resctrl: Factor mba rounding to be per-arch Message-ID: References: <20250902162507.18520-1-Dave.Martin@arm.com> Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Hi Tony, On Mon, Sep 29, 2025 at 09:37:41AM -0700, Luck, Tony wrote: > On Mon, Sep 29, 2025 at 02:56:19PM +0100, Dave Martin wrote: > > Hi Tony, > > > > Thanks for taking at look at this -- comments below. > > > > [...] > > > > On Thu, Sep 25, 2025 at 03:58:35PM -0700, Luck, Tony wrote: > > > On Mon, Sep 22, 2025 at 04:04:40PM +0100, Dave Martin wrote: > > > > [...] > > > > > > Would something like the following work? A read from schemata might > > > > produce something like this: > > > > > > > > MB: 0=50, 1=50 > > > > # MB_HW: 0=32, 1=32 > > > > # MB_MIN: 0=31, 1=31 > > > > # MB_MAX: 0=32, 1=32 > > > > [...] > > > > > > I'd be interested in people's thoughts on it, though. > > > > > > Applying this to Intel upcoming region aware memory bandwidth > > > that supports 255 steps and h/w min/max limits. > > > > Following the MPAM example, would you also expect: > > > > scale: 255 > > unit: 100pc > > > > ...? > > Yes. 255 (or whatever "Q" value is provided in the ACPI table) > corresponds to no throttling, so 100% bandwidth. > > > > > > We would have info files with "min = 1, max = 255" and a schemata > > > file that looks like this to legacy apps: [...] > > > MB: 0=4;1=75 [...] > > > with the legacy line updated with the rounded value of the MB_HW > > > supplied by the user. 10/255 = 3.921% ... so call it "4". > > > > I'm suggesting that this always be rounded up, so that you have a > > guarantee that the steps are no smaller than the reported value. > > Round up, rather than round-to-nearest, make sense. Though perhaps > only cosmetic as I would be surprised if anyone has a mix of tools > looking at the legacy schemata lines while programming using the > direct h/w controls. Ack [...] > > Do you have concerns about existing software choking on the #-prefixed > > lines? > > Do they even need a # prefix? We already mix lines for multiple > resources in the schemata file with a separate prefix for each resource. > The schemata file also allows writes to just update one resource (or > one domain in a single resource). The schemata file started with just > "L3". Then we added "L2", "MB", and "SMBA" with no concern that the > initial "L3" manipulating tools would be confused. The "#" thing is for backwards compatibility with old userspace that might blindly "paste back" unknown entries when writing the schemata file. (See also my reply to Reinette [1].) > > > Then use some other tag naming for other regions. Remote DDR, > > > local CXL, remote CXL are the ones we think are next in the h/w > > > memory sequence. But the "region" concept would allow for other > > > options as other memory technologies come into use. > > > > Would it be reasnable just to have a set of these schema instances, per > > region, so: > > > > MB_HW: ... // implicitly region 0 > > MB_HW_1: ... > > MB_HW_2: ... > > Chen Yu is currently looking at putting the word "TIER" into the > name, since there's some precedent for describing memory in "tiers". > > Whatever naming scheme is used, the important part is how will users > find out what each schemata line actually means/controls. Agreed. That's a problem, but a separate one. [...] > > Or, did you have something else in mind? > > > > My thinking is that we avoid adding complexity in the schemata file if > > we treat mapping these schema instances onto the hardware topology as > > an orthogonal problem. So long as we have unique names in the schemata > > file, we can describe elsewhere what they relate to in the hardware. > > Yes, exactly this. OK, that's reassuring. Cheers ---Dave [1] https://lore.kernel.org/lkml/aNv53UmFGDBL0z3O@e133380.arm.com/