From mboxrd@z Thu Jan  1 00:00:00 1970
From: Bo Yan
Subject: Re: [PATCH] arm64: tegra: add topology data for Tegra194 cpu
Date: Mon, 11 Feb 2019 15:34:27 -0800
Message-ID: 
References: <1548959754-3941-1-git-send-email-byan@nvidia.com> <20190131222517.GB13156@mithrandir>
Mime-Version: 1.0
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: quoted-printable
Return-path: 
In-Reply-To: 
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
To: Thierry Reding
Cc: jonathanh@nvidia.com, linux-tegra@vger.kernel.org, mark.rutland@arm.com, robh+dt@kernel.org, linux-kernel@vger.kernel.org
List-Id: linux-tegra@vger.kernel.org

To make this simpler, I think it's best to isolate the cache information
in its own patch. So I will amend this patch to include topology
information only.

On 1/31/19 3:29 PM, Bo Yan wrote:
> 
> On 1/31/19 2:25 PM, Thierry Reding wrote:
>> On Thu, Jan 31, 2019 at 10:35:54AM -0800, Bo Yan wrote:
>>> The Xavier CPU architecture includes 8 CPU cores organized in
>>> 4 clusters. Add cpu-map data for topology initialization, add
>>> cache data for cache node creation in sysfs.
>>>
>>> Signed-off-by: Bo Yan
>>> ---
>>>   arch/arm64/boot/dts/nvidia/tegra194.dtsi | 148 +++++++++++++++++++++++++++++--
>>>   1 file changed, 140 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
>>> index 6dfa1ca..7c2a1fb 100644
>>> --- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
>>> +++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
>>> @@ -870,63 +870,195 @@
>>>           #address-cells = <1>;
>>>           #size-cells = <0>;
> 
>> These don't seem to be well-defined. They are mentioned in a very weird
>> location (Documentation/devicetree/booting-without-of.txt), but there
>> seem to be examples and other device tree files that use them, so maybe
>> those are all valid.
It might be worth mentioning these in other places
>> where people can more easily find them.
> 
> It might be logical to place a reference to this document
> (booting-without-of.txt) in architecture-specific documents, for
> example, arm/cpus.txt. I see the need for improved documentation, but
> this is probably best done in a separate change.
>>
>> According to the above document, {i,d}-cache-line-size are deprecated in
>> favour of {i,d}-cache-block-size.
> 
> Mostly, this seems to be derived from the oddity of PowerPC, which might
> have a different cache-line-size and cache-block-size. I don't know if
> there are other examples. It looks like {i,d}-cache-line-size are
> being used in dts files for almost all architectures; the only exception
> is arch/sh/boot/dts/j2_mimas_v2.dts. On ARM and ARM64, cache-line-size
> is the same as cache-block-size. So I am wondering whether
> booting-without-of.txt should be fixed instead, just to keep it
> consistent among dts files, especially in arm64.
> 
>>
>> I also don't see any mention of {i,d}-cache_sets in the device tree
>> bindings, though riscv/cpus.txt mentions {i,d}-cache-sets (note the
>> hyphen instead of underscore) in the examples. arm/l2c2x0.txt and
>> arm/uniphier/cache-uniphier.txt describe cache-sets, though that's
>> slightly different.
>>
>> Might make sense to document all these in more standard places. Maybe
>> adding them to arm/cpus.txt. For consistency with other properties, I
>> think they should be called {i,d}-cache-sets like for RISC-V.
>>
>>> +            l2-cache = <&l2_0>;
>>
>> This seems to be called next-level-cache everywhere else, though it's
>> only formally described in arm/uniphier/cache-uniphier.txt. So it might
>> also make sense to add this to arm/cpus.txt.
> 
> The improved documentation is certainly desired, I agree.
>>
>>>           };
>>> -        cpu@1 {
>>> +        cl0_1: cpu@1 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x10001>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_0>;
>>>           };
>>> -        cpu@2 {
>>> +        cl1_0: cpu@2 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x100>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_1>;
>>>           };
>>> -        cpu@3 {
>>> +        cl1_1: cpu@3 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x101>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_1>;
>>>           };
>>> -        cpu@4 {
>>> +        cl2_0: cpu@4 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x200>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_2>;
>>>           };
>>> -        cpu@5 {
>>> +        cl2_1: cpu@5 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x201>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_2>;
>>>           };
>>> -        cpu@6 {
>>> +        cl3_0: cpu@6 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x10300>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_3>;
>>>           };
>>> -        cpu@7 {
>>> +        cl3_1: cpu@7 {
>>>               compatible = "nvidia,tegra194-carmel", "arm,armv8";
>>>               device_type = "cpu";
>>>               reg = <0x10301>;
>>>               enable-method = "psci";
>>> +            i-cache-size = <131072>;
>>> +            i-cache-line-size = <64>;
>>> +            i-cache-sets = <512>;
>>> +            d-cache-size = <65536>;
>>> +            d-cache-line-size = <64>;
>>> +            d-cache_sets = <256>;
>>> +            l2-cache = <&l2_3>;
>>>           };
>>>       };
>>> +    l2_0: l2-cache0 {
>>> +        cache-size = <2097152>;
>>> +        cache-line-size = <64>;
>>> +        cache-sets = <2048>;
>>> +        next-level-cache = <&l3>;
>>> +    };
>>
>> Does this need a compatible string? Also, are there controllers behind
>> these caches? I'm just wondering if these also need reg properties and
>> unit-addresses.
> 
> No need for a compatible string, and no reg properties or addresses. These
> will be parsed by drivers/of/base.c and drivers/base/cacheinfo.c; they
> are generic.
>>
>> arm/l2c2x0.txt and arm/uniphier/cache-uniphier.txt describe an
>> additional property that you don't specify here: cache-level. This
>> sounds useful to have so that we don't have to guess the cache level
>> from the name, which may or may not work depending on what people name
>> the nodes.
> 
> The cache level property is implied in the device tree hierarchy, so after
> the system boots up, I can find the cache level in the related sysfs nodes:
> 
>     [root@alarm cache]# cat index*/level
>     1
>     1
>     2
>     3
> 
>>
>> Also, similar to the L1 cache, cache-block-size is preferred over
>> cache-line-size.
>>
>>> +    l3: l3-cache {
>>> +        cache-size = <4194304>;
>>> +        cache-line-size = <64>;
>>> +        cache-sets = <4096>;
>>> +    };
>>
>> The same comments apply as for the L2 caches.
>>
>> Thierry
>>
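
For reference, the topology-only version of this change could look roughly
like the sketch below, using the cpu labels introduced by the hunks above.
This is an illustration, not the amended patch itself: the cl0_0 label for
cpu@0 is an assumption (that hunk is not quoted here), and placement inside
the /cpus node follows the standard cpu-map binding.

```dts
		cpu-map {
			cluster0 {
				core0 {
					cpu = <&cl0_0>;
				};
				core1 {
					cpu = <&cl0_1>;
				};
			};
			cluster1 {
				core0 {
					cpu = <&cl1_0>;
				};
				core1 {
					cpu = <&cl1_1>;
				};
			};
			cluster2 {
				core0 {
					cpu = <&cl2_0>;
				};
				core1 {
					cpu = <&cl2_1>;
				};
			};
			cluster3 {
				core0 {
					cpu = <&cl3_0>;
				};
				core1 {
					cpu = <&cl3_1>;
				};
			};
		};
```

This matches the 8-core/4-cluster layout described in the commit message
and gives the kernel the data it needs to populate
/sys/devices/system/cpu/cpu*/topology.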
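
If the review suggestions are adopted in the follow-up cache patch, one L2
node might look like the sketch below. This is only a hedged example of
what the feedback above implies (explicit cache-level, and cache-block-size
in place of cache-line-size); the cache-unified property is an addition of
mine and was not discussed in this thread:

```dts
	l2_0: l2-cache0 {
		cache-size = <2097152>;
		cache-block-size = <64>;
		cache-sets = <2048>;
		cache-level = <2>;
		cache-unified;
		next-level-cache = <&l3>;
	};
```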