From mboxrd@z Thu Jan 1 00:00:00 1970
From: avanbrunt@nvidia.com (Alexander Van Brunt)
Date: Thu, 29 Oct 2015 23:06:36 +0000
Subject: [PATCH 0/3] Revert arm64 cache geometry
In-Reply-To: <5A68E2B9-2254-42CA-8781-E2FA737C366C@linaro.org>
References: <1446068637-11509-1-git-send-email-avanbrunt@nvidia.com>
 <20151029154346.GL8644@n2100.arm.linux.org.uk>,
 <5A68E2B9-2254-42CA-8781-E2FA737C366C@linaro.org>
Message-ID: <1446160085730.52481@nvidia.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

>> On 29 okt. 2015, at 16:43, Russell King - ARM Linux wrote:
>>
>>> On Thu, Oct 29, 2015 at 12:22:51PM +0900, Ard Biesheuvel wrote:
>>> Fair enough. It is a bit disappointing that we cannot trust these
>>> values, but if the architecture does not mandate their accuracy, we
>>> obviously should not be using them in the way that we are.
>>>
>>> I think we have similar code in the ARM tree, so we should probably
>>> make some changes there as well.
>>
>> I've opposed exporting the cache dimensions to userspace for several
>> reasons:
>>
>
>I agree with the arguments below. However, what I refer to here is kernel code that infers whether a certain VIPT cache is non-aliasing based on the way size, which is calculated from values that are exposed to software for the sole purpose of enumerating cachelines by set/way.

No, CCSIDR_EL1 is for the sole purpose of indicating how to clean / invalidate caches by set and way. The documentation explicitly says it is not for enumerating the cache geometry.

________________________________________
From: Ard Biesheuvel
Sent: Thursday, October 29, 2015 9:28 AM
To: Russell King - ARM Linux
Cc: Alexander Van Brunt; linux-arm-kernel at lists.infradead.org; Will Deacon; Sudeep Holla; Catalin Marinas
Subject: Re: [PATCH 0/3] Revert arm64 cache geometry

> On 29 okt. 2015, at 16:43, Russell King - ARM Linux wrote:
>
>> On Thu, Oct 29, 2015 at 12:22:51PM +0900, Ard Biesheuvel wrote:
>> Fair enough.
>> It is a bit disappointing that we cannot trust these
>> values, but if the architecture does not mandate their accuracy, we
>> obviously should not be using them in the way that we are.
>>
>> I think we have similar code in the ARM tree, so we should probably
>> make some changes there as well.
>
> I've opposed exporting the cache dimensions to userspace for several
> reasons:
>

I agree with the arguments below. However, what I refer to here is kernel code that infers whether a certain VIPT cache is non-aliasing based on the way size, which is calculated from values that are exposed to software for the sole purpose of enumerating cachelines by set/way.

> * it will become a nightmare with the various different register formats
>   to properly decode these values
> * older CPUs don't have the cache ID registers, so we'd need to augment
>   any export with additional static configuration somehow
> * I don't trust userland with this information to make the right choices,
>   especially when faced with further levels of caches.
>
> The main reason for people wanting the cache dimensions has been "so we
> can select the optimal code for the CPU". Given all the combinations
> of caches out there, I've always said selecting code based on one level
> of cache is totally insane, and userspace is better off doing some
> performance measurement of its implementations and selecting the most
> appropriate version.
>
> There's many things that affect the performance of code paths with CPUs.
> It's not just about cache line size, but instruction latencies, write
> delays, branch prediction and so forth. You can't _say_ "because this
> CPU has a 32K L1 cache, if I optimise my code as X it'll perform
> better everywhere with a 32K L1 cache than optimised Y."
>
> Selecting code based on cache parameters is just wrong.
>
> There may be cases where userspace would like to know the cache line
> size, so it can appropriately align data structures - but that again
> depends on what you're trying to achieve, and what if the L1 cache
> line size is different from the L2 cache line size...
>
> It's a minefield, one which IMHO userspace shouldn't be trusted with.
> Userspace should assume the worst case cache line size seen in ARM
> CPUs and be done with it.
>
> --
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
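
For reference, the inference discussed in this thread (deciding whether a
VIPT cache can alias from its way size) can be sketched roughly as below.
This is only an illustration, not the kernel's code: the struct and
function names are made up, and the field layout assumed is the 32-bit
CCSIDR format (LineSize in bits [2:0], Associativity in bits [12:3],
NumSets in bits [27:13]). As pointed out above, the architecture defines
these fields for set/way maintenance only, so the result is not guaranteed
to describe the real cache.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields; not a kernel type. */
struct ccsidr_fields {
	unsigned int line_bytes;	/* bytes per cache line */
	unsigned int ways;		/* associativity */
	unsigned int sets;		/* number of sets */
};

/* Decode a raw CCSIDR value, assuming the 32-bit field layout. */
static struct ccsidr_fields ccsidr_decode(uint32_t ccsidr)
{
	struct ccsidr_fields f;

	f.line_bytes = 1u << ((ccsidr & 0x7) + 4);	/* log2(bytes) - 4 */
	f.ways = ((ccsidr >> 3) & 0x3ff) + 1;		/* stored as ways - 1 */
	f.sets = ((ccsidr >> 13) & 0x7fff) + 1;		/* stored as sets - 1 */
	return f;
}

/* Way size in bytes: the span of addresses covered by one way. */
static unsigned long ccsidr_way_size(uint32_t ccsidr)
{
	struct ccsidr_fields f = ccsidr_decode(ccsidr);

	return (unsigned long)f.line_bytes * f.sets;
}

/*
 * A VIPT cache is usually treated as aliasing when one way spans more
 * than a page. This is exactly the conclusion that cannot be trusted if
 * the CCSIDR fields do not reflect the real geometry.
 */
static bool ccsidr_suggests_aliasing(uint32_t ccsidr, unsigned long page_size)
{
	return ccsidr_way_size(ccsidr) > page_size;
}
```

As an example, a raw value of 0x700FE01A decodes to 64-byte lines, 4 ways
and 128 sets, i.e. an 8K way size, which this check would report as
aliasing with 4K pages and as non-aliasing with 64K pages.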