Date: Wed, 4 Feb 2026 13:44:47 +0000
From: Jonathan Cameron
To: Linus Walleij
CC: Yushan Wang, SeongJae Park
Subject: Re: [PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC
Message-ID: <20260204134447.00000afd@huawei.com>
In-Reply-To: <20260204134020.00002393@huawei.com>
References: <20260203161843.649417-1-wangyushan12@huawei.com>
 <20260203161843.649417-2-wangyushan12@huawei.com>
 <20260204134020.00002393@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Fixed linux-mm address that got added a few emails back.

On Wed, 4 Feb 2026 13:40:20 +0000
Jonathan Cameron wrote:

> On Wed, 4 Feb 2026 01:10:01 +0100
> Linus Walleij wrote:
>
> > Hi Yushan,
> >
> > thanks for your patch!
> >
> > On Tue, Feb 3, 2026 at 5:18 PM Yushan Wang wrote:
> > >
> > > The driver will create a file of `/dev/hisi_l3c` on init, mmap
> > > operations to it will allocate a memory region that is guaranteed to be
> > > placed in L3 cache.
> > >
> > > The driver also provides unmap() to deallocate the locked memory.
> > >
> > > The driver also provides an ioctl interface for user to get cache lock
> > > information, such as lock restrictions and locked sizes.
> > >
> > > Signed-off-by: Yushan Wang
> >
> > The commit message does not say *why* you are doing this?
> >
> > > +config HISI_SOC_L3C
> > > +	bool "HiSilicon L3 Cache device driver"
> > > +	depends on ACPI
> > > +	depends on ARM64 || COMPILE_TEST
> > > +	help
> > > +	  This driver provides the functions to lock L3 cache entries from
> > > +	  being evicted for better performance.
> >
> > Here is the reason though.
> >
> > Things like this need to be CC to linux-mm@vger.kernel.org.
> >
> > I don't see why userspace would be so well informed as to make decisions
> > about what should be locked in the L3 cache and not?
> >
> > I see the memory hierarchy as any other hardware: a resource that is
> > allocated and arbitrated by the kernel.
> >
> > The MM subsystem knows which memory is most cache hot.
> > Especially when you use DAMON DAMOS, which has the sole
> > purpose of executing actions like that. Here is a good YouTube.
> > https://www.youtube.com/watch?v=xKJO4kLTHOI
> Hi Linus,
>
> This typically isn't about cache hot. If it were, the data would
> be in the cache without this. It's about ensuring something that would
> otherwise be unlikely to be there is in the cache.
>
> Normally that's a latency critical region. In general the kernel
> has no chance of figuring out what those are ahead of time; only
> userspace can know (based on profiling etc.), and that is per workload.
> The first hit matters in these use cases and it's not something
> the prefetchers can help with.
>
> The only thing we could do if this was in kernel would be to
> have userspace pass some hints and then let the kernel actually
> kick off the process. That just boils down to using a different
> interface to do what this driver is doing (and that's the conversation
> this series is trying to get going). It's a finite resource
> and you absolutely need userspace to be able to tell if it
> got what it asked for or not.
>
> DAMON might be useful for that preanalysis, but it can't do
> anything for the infrequent extremely latency sensitive accesses.
> Normally this is fleet wide stuff based on intensive benchmarking
> of a few nodes. Same sort of approach as the original warehouse
> scale computing paper on tuning zswap capacity across a fleet.
> It's an extreme form of profile guided optimization (and not
> currently automatic I think?). If we are putting code in this
> locked region, the program has been carefully recompiled / linked
> to group the critical parts so that we can use the minimum number
> of these locked regions. Data is a little simpler.
>
> It's kind of similar to resctrl but at a sub process granularity.
>
> >
> > Shouldn't the MM subsystem be in charge of determining, locking
> > down and freeing up hot regions in L3 cache?
> >
> > This looks more like userspace is going to determine that but
> > how exactly? By running DAMON? Then it's better to keep the
> > whole mechanism in the kernel where it belongs and let the
> > MM subsystem adapt locked L3 cache to the usage patterns.
>
> I haven't yet come up with any plausible scheme by which the MM
> subsystem could do this.
>
> I think what we need here, Yushan, is more detail on end to end
> use cases for this. Some examples etc. as clearer motivation.
>
> Jonathan
>
> >
> > Yours,
> > Linus Walleij
> >
>
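For anyone following along, here is a rough userspace sketch of the flow
the quoted commit message describes: open `/dev/hisi_l3c`, mmap() it to
get a region held in L3, and unmap it to release the lock. It is only an
illustration under assumptions. The helper names and the struct are
hypothetical, and the PROT_* / MAP_SHARED flags are guesses rather than
anything taken from the patch. The ioctl interface is left out because
its request codes are not given in this thread. The part that matters
for the discussion is the error path: the caller has to be able to tell
whether it got the locked region it asked for.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one locked region (not from the patch). */
struct l3c_region {
	int fd;       /* descriptor for the device node, -1 if unused */
	void *addr;   /* start of the mapping, NULL if unused */
	size_t len;   /* length of the mapping in bytes */
};

/*
 * Hypothetical helper: try to obtain `len` bytes of locked memory by
 * mmap()ing the device node at `path` (e.g. "/dev/hisi_l3c").
 * Returns 0 on success, -1 if the region could not be obtained.
 */
int l3c_lock_region(const char *path, size_t len, struct l3c_region *out)
{
	assert(out != NULL);
	out->fd = -1;
	out->addr = NULL;
	out->len = 0;

	int fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Protection and sharing flags are assumptions, not from the patch. */
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		/* The finite resource was not granted; report it to the caller. */
		close(fd);
		return -1;
	}

	out->fd = fd;
	out->addr = p;
	out->len = len;
	return 0;
}

/*
 * Hypothetical helper: release the region and close the descriptor.
 * Returns the munmap() result, or 0 if there was nothing mapped.
 */
int l3c_unlock_region(struct l3c_region *r)
{
	int ret = 0;

	if (r->addr != NULL)
		ret = munmap(r->addr, r->len);
	if (r->fd >= 0)
		close(r->fd);

	r->fd = -1;
	r->addr = NULL;
	r->len = 0;
	return ret;
}
```

A latency critical application would call l3c_lock_region() once at
startup, place its critical data in the returned region, and fall back
to ordinary memory when the call returns -1.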