* [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
@ 2025-01-03 18:22 Marc Zyngier
  2025-01-06  9:40 ` Mark Rutland
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2025-01-03 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, Mark Brown, stable

The hwcaps code that exposes SVE features to userspace only
considers ID_AA64ZFR0_EL1, while this is only valid when
ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.

The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
ID_AA64ZFR0_EL1 register is also 0. So far, so good.

Things become a bit more interesting if the HW implements SME.
In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
features. And these fields overlap with their SVE interpretations.
But the architecture says that the SME and SVE feature sets must
match, so we're still hunky-dory.

This goes wrong if the HW implements SME, but not SVE. In this
case, we end up advertising some SVE features to userspace, even
if the HW has none. That's because we never consider whether SVE
is actually implemented. Oh well.

Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
being non-zero.

Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
 arch/arm64/kernel/cpufeature.c | 40 +++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6ce71f444ed84..6874aca5da9df 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3022,6 +3022,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = match,						\
 	}
 
+#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
+	{									\
+		__HWCAP_CAP(#cap, cap_type, cap)				\
+		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
+		.matches = match,						\
+	}
+
 #ifdef CONFIG_ARM64_PTR_AUTH
 static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
 	{
@@ -3050,6 +3057,13 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
 };
 #endif
 
+#ifdef CONFIG_ARM64_SVE
+static bool has_sve_feature(const struct arm64_cpu_capabilities *cap, int scope)
+{
+	return system_supports_sve() && has_user_cpuid_feature(cap, scope);
+}
+#endif
+
 static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(ID_AA64ISAR0_EL1, AES, PMULL, CAP_HWCAP, KERNEL_HWCAP_PMULL),
 	HWCAP_CAP(ID_AA64ISAR0_EL1, AES, AES, CAP_HWCAP, KERNEL_HWCAP_AES),
@@ -3092,19 +3106,19 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
 #ifdef CONFIG_ARM64_SVE
 	HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
+	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
 #endif
 #ifdef CONFIG_ARM64_GCS
 	HWCAP_CAP(ID_AA64PFR1_EL1, GCS, IMP, CAP_HWCAP, KERNEL_HWCAP_GCS),
-- 
2.39.2




* Re: [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  2025-01-03 18:22 [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented Marc Zyngier
@ 2025-01-06  9:40 ` Mark Rutland
  2025-01-06 10:57   ` Catalin Marinas
  2025-01-06 11:12   ` Marc Zyngier
  0 siblings, 2 replies; 6+ messages in thread
From: Mark Rutland @ 2025-01-06  9:40 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Brown,
	stable

On Fri, Jan 03, 2025 at 06:22:55PM +0000, Marc Zyngier wrote:
> The hwcaps code that exposes SVE features to userspace only
> considers ID_AA64ZFR0_EL1, while this is only valid when
> ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> 
> The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> 
> Things become a bit more interesting if the HW implements SME.
> In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> features. And these fields overlap with their SVE interpretations.
> But the architecture says that the SME and SVE feature sets must
> match, so we're still hunky-dory.
> 
> This goes wrong if the HW implements SME, but not SVE. In this
> case, we end up advertising some SVE features to userspace, even
> if the HW has none. That's because we never consider whether SVE
> is actually implemented. Oh well.

Ugh; this is a massive pain. :(

Was this found by inspection, or is some real software going wrong?

> Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> being non-zero.

Unfortunately, I'm not sure this fix is correct+complete.

We expose ID_AA64PFR0_EL1 and ID_AA64ZFR0_EL1 via ID register emulation,
so any userspace software reading ID_AA64ZFR0_EL1 will encounter the
same surprise. If we hide that I'm worried we might hide some SME-only
information that isn't exposed elsewhere, and I'm not sure we can
reasonably hide ID_AA64ZFR0_EL1 emulation for SME-only (more on that
below).

Secondly, all our HWCAP documentation is written in the form:

| HWCAP2_SVEBF16
|     Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.

... so while the architectural behaviour is a surprise, the kernel is
(technically) behaving exactly as documented prior to this patch. Maybe
we need to change that documentation?

Do we have equivalent SME hwcaps for the relevant features?

... looking at:

  https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64ZFR0-EL1--SVE-Feature-ID-Register-0?lang=en

... I see that ID_AA64ZFR0_EL1.B16B16 >= 0b0010 implies the presence of
SME BFMUL and BFSCALE instructions, but I don't see something equivalent
in ID_AA64SMFR0_EL1 per:

  https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64SMFR0-EL1--SME-Feature-ID-Register-0?lang=en

... so I suspect ID_AA64ZFR0_EL1 might be the only source of truth for
this.

It is bizarre that ID_AA64SMFR0_EL1 doesn't follow the same format, and
ID_AA64SMFR0_EL1.B16B16 is a single-bit field that cannot encode the
same values as ID_AA64ZFR0_EL1.B16B16 (which is a 4-bit field).

Even if we change Linux here, someone will need to chase up with the
architects to ensure this isn't made worse in future.

Mark.

> Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")
> Reported-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: stable@vger.kernel.org
> ---
>  arch/arm64/kernel/cpufeature.c | 40 +++++++++++++++++++++++-----------
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 6ce71f444ed84..6874aca5da9df 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -3022,6 +3022,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = match,						\
>  	}
>  
> +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
> +	{									\
> +		__HWCAP_CAP(#cap, cap_type, cap)				\
> +		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
> +		.matches = match,						\
> +	}
> +
>  #ifdef CONFIG_ARM64_PTR_AUTH
>  static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
>  	{
> @@ -3050,6 +3057,13 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
>  };
>  #endif
>  
> +#ifdef CONFIG_ARM64_SVE
> +static bool has_sve_feature(const struct arm64_cpu_capabilities *cap, int scope)
> +{
> +	return system_supports_sve() && has_user_cpuid_feature(cap, scope);
> +}
> +#endif
> +
>  static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
>  	HWCAP_CAP(ID_AA64ISAR0_EL1, AES, PMULL, CAP_HWCAP, KERNEL_HWCAP_PMULL),
>  	HWCAP_CAP(ID_AA64ISAR0_EL1, AES, AES, CAP_HWCAP, KERNEL_HWCAP_AES),
> @@ -3092,19 +3106,19 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
>  	HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
>  #ifdef CONFIG_ARM64_SVE
>  	HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, SM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
> -	HWCAP_CAP(ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, SM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
> +	HWCAP_CAP_MATCH_ID(has_sve_feature, ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
>  #endif
>  #ifdef CONFIG_ARM64_GCS
>  	HWCAP_CAP(ID_AA64PFR1_EL1, GCS, IMP, CAP_HWCAP, KERNEL_HWCAP_GCS),
> -- 
> 2.39.2
> 



* Re: [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  2025-01-06  9:40 ` Mark Rutland
@ 2025-01-06 10:57   ` Catalin Marinas
  2025-01-06 11:12   ` Marc Zyngier
  1 sibling, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2025-01-06 10:57 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marc Zyngier, linux-arm-kernel, Will Deacon, Mark Brown, stable

On Mon, Jan 06, 2025 at 09:40:56AM +0000, Mark Rutland wrote:
> On Fri, Jan 03, 2025 at 06:22:55PM +0000, Marc Zyngier wrote:
> > The hwcaps code that exposes SVE features to userspace only
> > considers ID_AA64ZFR0_EL1, while this is only valid when
> > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> > 
> > The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> > 
> > Things become a bit more interesting if the HW implements SME.
> > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > features. And these fields overlap with their SVE interpretations.
> > But the architecture says that the SME and SVE feature sets must
> > match, so we're still hunky-dory.
> > 
> > This goes wrong if the HW implements SME, but not SVE. In this
> > case, we end up advertising some SVE features to userspace, even
> > if the HW has none. That's because we never consider whether SVE
> > is actually implemented. Oh well.
> 
> Ugh; this is a massive pain. :(
> 
> Was this found by inspection, or is some real software going wrong?

It goes wrong on M4 in a VM. The latest macOS (15.2 I think) enabled
those ID regs for guests and Linux user space started falling apart
(first one reported was a fairly recent JDK getting SIGILL when trying
to use the INCB instruction). Reported initially on the Parallels forum.

> > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > being non-zero.
> 
> Unfortunately, I'm not sure this fix is correct+complete.
> 
> We expose ID_AA64PFR0_EL1 and ID_AA64ZFR0_EL1 via ID register emulation,
> so any userspace software reading ID_AA64ZFR0_EL1 will encounter the
> same surprise. If we hide that I'm worried we might hide some SME-only
> information that isn't exposed elsewhere, and I'm not sure we can
> reasonably hide ID_AA64ZFR0_EL1 emulation for SME-only (more on that
> below).

Good point about the user also accessing these registers through
emulation.

> Secondly, all our HWCAP documentation is written in the form:
> 
> | HWCAP2_SVEBF16
> |     Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.
> 
> ... so while the architectural behaviour is a surprise, the kernel is
> (technically) behaving exactly as documented prior to this patch. Maybe
> we need to change that documentation?

The kernel is also reporting HWCAP2_SVE2 based on ID_AA64ZFR0_EL1.SVEver
which I don't think it should (my reading of the spec). I suspect that's
what's causing JDK failures.
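
For userspace that has to run on kernels without this fix, a defensive
check might look like the sketch below. It assumes the bit positions
from arch/arm64/include/uapi/asm/hwcap.h (HWCAP_SVE is bit 22 of
AT_HWCAP, HWCAP2_SVE2 is bit 1 of AT_HWCAP2); the SKETCH_* names and
the helper are hypothetical and only illustrate the idea of never
trusting an SVE-derived HWCAP2 bit unless HWCAP_SVE is set as well.

```c
#include <stdbool.h>

/* Assumed to mirror HWCAP_SVE and HWCAP2_SVE2 from the arm64 uapi header. */
#define SKETCH_HWCAP_SVE	(1UL << 22)	/* AT_HWCAP, from ID_AA64PFR0_EL1.SVE */
#define SKETCH_HWCAP2_SVE2	(1UL << 1)	/* AT_HWCAP2, from ID_AA64ZFR0_EL1.SVEver */

/*
 * Hypothetical helper: report SVE2 as usable only if the base SVE hwcap
 * is present too. On an SME-only system running an unfixed kernel,
 * HWCAP2_SVE2 may be set while HWCAP_SVE is clear.
 */
static bool sketch_can_use_sve2(unsigned long hwcap, unsigned long hwcap2)
{
	return (hwcap & SKETCH_HWCAP_SVE) && (hwcap2 & SKETCH_HWCAP2_SVE2);
}
```

On Linux the two arguments would typically come from
getauxval(AT_HWCAP) and getauxval(AT_HWCAP2).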

> Do we have equivalent SME hwcaps for the relevant features?
> 
> ... looking at:
> 
>   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64ZFR0-EL1--SVE-Feature-ID-Register-0?lang=en
> 
> ... I see that ID_AA64ZFR0_EL1.B16B16 >= 0b0010 implies the presence of
> SME BFMUL and BFSCALE instructions, but I don't see something equivalent
> in ID_AA64SMFR0_EL1 per:
> 
>   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64SMFR0-EL1--SME-Feature-ID-Register-0?lang=en
> 
> ... so I suspect ID_AA64ZFR0_EL1 might be the only source of truth for
> this.
> 
> It is bizarre that ID_AA64SMFR0_EL1 doesn't follow the same format, and
> ID_AA64SMFR0_EL1.B16B16 is a single-bit field that cannot encode the
> same values as ID_AA64ZFR0_EL1.B16B16 (which is a 4-bit field).

Oh, I'm getting confused now. Do we have this information exposed twice
in the ID regs? I think in the kernel we use ZFR0 for SVE and SMFR0 for
the SME equivalent but the architecture is actually confusing with ZFR0
describing both SME and SVE features available. I guess at some point
the architects thought we couldn't have SME without SVE, but they changed
their mind and we didn't spot it.

-- 
Catalin



* Re: [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  2025-01-06  9:40 ` Mark Rutland
  2025-01-06 10:57   ` Catalin Marinas
@ 2025-01-06 11:12   ` Marc Zyngier
  2025-01-06 12:03     ` Mark Rutland
  1 sibling, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2025-01-06 11:12 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Brown,
	stable

On Mon, 06 Jan 2025 09:40:56 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Fri, Jan 03, 2025 at 06:22:55PM +0000, Marc Zyngier wrote:
> > The hwcaps code that exposes SVE features to userspace only
> > considers ID_AA64ZFR0_EL1, while this is only valid when
> > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> > 
> > The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> > 
> > Things become a bit more interesting if the HW implements SME.
> > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > features. And these fields overlap with their SVE interpretations.
> > But the architecture says that the SME and SVE feature sets must
> > match, so we're still hunky-dory.
> > 
> > This goes wrong if the HW implements SME, but not SVE. In this
> > case, we end up advertising some SVE features to userspace, even
> > if the HW has none. That's because we never consider whether SVE
> > is actually implemented. Oh well.
> 
> Ugh; this is a massive pain. :(
> 
> Was this found by inspection, or is some real software going wrong?

Catalin can comment on that -- I understand that he found existing SW
latching on SVE2 being wrongly advertised as hwcaps.

> > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > being non-zero.
> 
> Unfortunately, I'm not sure this fix is correct+complete.
> 
> We expose ID_AA64PFR0_EL1 and ID_AA64ZFR0_EL1 via ID register emulation,
> so any userspace software reading ID_AA64ZFR0_EL1 will encounter the
> same surprise. If we hide that I'm worried we might hide some SME-only
> information that isn't exposed elsewhere, and I'm not sure we can
> reasonably hide ID_AA64ZFR0_EL1 emulation for SME-only (more on that
> below).

I don't understand where things go wrong. EL0 SW that looks at the ID
registers should perform similar checks, and we are not trying to make
things better on that front (we can't). Unless you invent time travel
and fix the architecture 5 years ago... :-/

The hwcaps are effectively demultiplexing the ID registers, and they
have to be exact, which is what this patch provides (SVE2 doesn't get
wrongly advertised when not present).

> Secondly, all our HWCAP documentation is written in the form:
> 
> | HWCAP2_SVEBF16
> |     Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.
> 
> ... so while the architectural behaviour is a surprise, the kernel is
> (technically) behaving exactly as documented prior to this patch. Maybe
> we need to change that documentation?

Again, I don't see what goes wrong here. BF16 is only implemented for
SVE or SME+FA64, and FA64 requires SVE2. So at least for that one, we
should be good.

>
> Do we have equivalent SME hwcaps for the relevant features?
>
> ... looking at:
> 
>   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64ZFR0-EL1--SVE-Feature-ID-Register-0?lang=en
> 
> ... I see that ID_AA64ZFR0_EL1.B16B16 >= 0b0010 implies the presence of
> SME BFMUL and BFSCALE instructions, but I don't see something equivalent
> in ID_AA64SMFR0_EL1 per:
> 
>   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64SMFR0-EL1--SME-Feature-ID-Register-0?lang=en
> 
> ... so I suspect ID_AA64ZFR0_EL1 might be the only source of truth for
> this.

Indeed, and the SME HWCAPs are not doing the right thing either. Or
rather, we have no way to tell userspace that BFMUL/BFSCALE are
available.

> It is bizarre that ID_AA64SMFR0_EL1 doesn't follow the same format, and
> ID_AA64SMFR0_EL1.B16B16 is a single-bit field that cannot encode the
> same values as ID_AA64ZFR0_EL1.B16B16 (which is a 4-bit field).

*everything* about SME is bizarre.

> Even if we change Linux here, someone will need to chase up with the
> architects to ensure this isn't made worse in future.

Good luck!

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  2025-01-06 11:12   ` Marc Zyngier
@ 2025-01-06 12:03     ` Mark Rutland
  2025-01-06 12:21       ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Rutland @ 2025-01-06 12:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Brown,
	stable

On Mon, Jan 06, 2025 at 11:12:53AM +0000, Marc Zyngier wrote:
> On Mon, 06 Jan 2025 09:40:56 +0000,
> Mark Rutland <mark.rutland@arm.com> wrote:
> > 
> > On Fri, Jan 03, 2025 at 06:22:55PM +0000, Marc Zyngier wrote:
> > > The hwcaps code that exposes SVE features to userspace only
> > > considers ID_AA64ZFR0_EL1, while this is only valid when
> > > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> > > 
> > > The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> > > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> > > 
> > > Things become a bit more interesting if the HW implements SME.
> > > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > > features. And these fields overlap with their SVE interpretations.
> > > But the architecture says that the SME and SVE feature sets must
> > > match, so we're still hunky-dory.
> > > 
> > > This goes wrong if the HW implements SME, but not SVE. In this
> > > case, we end up advertising some SVE features to userspace, even
> > > if the HW has none. That's because we never consider whether SVE
> > > is actually implemented. Oh well.
> > 
> > Ugh; this is a massive pain. :(
> > 
> > Was this found by inspection, or is some real software going wrong?
> 
> Catalin can comment on that -- I understand that he found existing SW
> latching on SVE2 being wrongly advertised as hwcaps.
> 
> > > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > > being non-zero.
> > 
> > Unfortunately, I'm not sure this fix is correct+complete.
> > 
> > We expose ID_AA64PFR0_EL1 and ID_AA64ZFR0_EL1 via ID register emulation,
> > so any userspace software reading ID_AA64ZFR0_EL1 will encounter the
> > same surprise. If we hide that I'm worried we might hide some SME-only
> > information that isn't exposed elsewhere, and I'm not sure we can
> > reasonably hide ID_AA64ZFR0_EL1 emulation for SME-only (more on that
> > below).
> 
> I don't understand where things go wrong. EL0 SW that looks at the ID
> registers should perform similar checks, and we are not trying to make
> things better on that front (we can't). Unless you invent time travel
> and fix the architecture 5 years ago... :-/

Fair enough; if we say software consuming ID_AA64ZFR0_EL1 must check
ID_AA64PFR0_EL1.SVE or ID_AA64PFR1_EL1.SME first, and we leave the
emulation of ID_AA64ZFR0_EL1 as-is, that's fine by me.
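
As a rough illustration of that rule for EL0 software that reads the
emulated ID registers directly, the sketch below decides how
ID_AA64ZFR0_EL1 may be interpreted before any of its fields are looked
at. The field positions (ID_AA64PFR0_EL1.SVE at bits [35:32],
ID_AA64PFR1_EL1.SME at bits [27:24]) are taken from the Arm register
descriptions; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PFR0_SVE_SHIFT	32	/* ID_AA64PFR0_EL1.SVE, bits [35:32] */
#define SKETCH_PFR1_SME_SHIFT	24	/* ID_AA64PFR1_EL1.SME, bits [27:24] */

/* Hypothetical classification of what ID_AA64ZFR0_EL1 describes. */
enum sketch_zfr0_view {
	SKETCH_ZFR0_UNUSED,	/* neither SVE nor SME is implemented */
	SKETCH_ZFR0_SME_ONLY,	/* fields describe SME features only */
	SKETCH_ZFR0_SVE,	/* fields describe SVE features */
};

/* Extract one unsigned 4-bit ID register field. */
static unsigned int sketch_id_field(uint64_t reg, unsigned int shift)
{
	assert(shift <= 60);
	return (unsigned int)((reg >> shift) & 0xf);
}

/* Check the PFR0/PFR1 gates before interpreting ID_AA64ZFR0_EL1. */
static enum sketch_zfr0_view sketch_classify_zfr0(uint64_t pfr0, uint64_t pfr1)
{
	if (sketch_id_field(pfr0, SKETCH_PFR0_SVE_SHIFT) != 0)
		return SKETCH_ZFR0_SVE;
	if (sketch_id_field(pfr1, SKETCH_PFR1_SME_SHIFT) != 0)
		return SKETCH_ZFR0_SME_ONLY;
	return SKETCH_ZFR0_UNUSED;
}
```

Only in the SKETCH_ZFR0_SVE case would a non-zero SVEver field be read
as FEAT_SVE2 or later.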

> The hwcaps are effectively demultiplexing the ID registers, and they
> have to be exact, which is what this patch provides (SVE2 doesn't get
> wrongly advertised when not present).

> > Secondly, all our HWCAP documentation is written in the form:
> > 
> > | HWCAP2_SVEBF16
> > |     Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.
> > 
> > ... so while the architectural behaviour is a surprise, the kernel is
> > (technically) behaving exactly as documented prior to this patch. Maybe
> > we need to change that documentation?
> 
> Again, I don't see what goes wrong here. BF16 is only implemented for
> SVE or SME+FA64, and FA64 requires SVE2. So at least for that one, we
> should be good.

That was probably a bad example. What I was trying to get at is that the
HWCAPs are behaving exactly *as documented*, but that's not what we
actually want them to describe. For example, SVE2 is described as:

| Functionality implied by ID_AA64ZFR0_EL1.SVEver == 0b0001.

... which is exactly what we check today, but that doesn't
architecturally imply FEAT_SVE2 on SME-only HW where it can apparently
be 0b0001 due to FEAT_SME alone.

So to match the code change we'd need to change that to something like:

| Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001 and
| ID_AA64ZFR0_EL1.SVEver == 0b0001

... with similar for other hwcaps.
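
Spelled out as standalone C, the combined condition might be modelled
roughly as below. This is only a sketch of the rule, not the kernel
code: the table type, the names and the returned bit numbering are
hypothetical, and it assumes the usual "field value at least N"
comparison for unsigned ID fields rather than the strict equality used
in the documentation wording.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: a hwcap name plus the ID_AA64ZFR0_EL1 field it needs. */
struct sketch_sve_rule {
	const char *name;
	unsigned int shift;	/* field position in ID_AA64ZFR0_EL1 */
	unsigned int min;	/* minimum field value */
};

static const struct sketch_sve_rule sketch_rules[] = {
	{ "SVE2",     0, 1 },	/* SVEver >= 0b0001 */
	{ "SVE2P1",   0, 2 },	/* SVEver >= 0b0010 */
	{ "SVEAES",   4, 1 },	/* AES    >= 0b0001 */
	{ "SVEPMULL", 4, 2 },	/* AES    >= 0b0010 */
};

/*
 * Return a mask with bit i set when sketch_rules[i] applies. Every rule
 * is gated on ID_AA64PFR0_EL1.SVE (bits [35:32]) being non-zero.
 */
static unsigned int sketch_sve_caps(uint64_t pfr0, uint64_t zfr0)
{
	unsigned int mask = 0;
	size_t i;

	if (((pfr0 >> 32) & 0xf) == 0)
		return 0;	/* no FEAT_SVE: ignore ID_AA64ZFR0_EL1 entirely */

	for (i = 0; i < sizeof(sketch_rules) / sizeof(sketch_rules[0]); i++) {
		const struct sketch_sve_rule *r = &sketch_rules[i];

		assert(r->shift <= 60);
		if (((zfr0 >> r->shift) & 0xf) >= r->min)
			mask |= 1u << i;
	}
	return mask;
}
```

With SVEver and AES both reading 0b0001, the sketch reports SVE2 and
SVEAES when ID_AA64PFR0_EL1.SVE is non-zero, and nothing at all on an
SME-only system.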

> > Do we have equivalent SME hwcaps for the relevant features?
> >
> > ... looking at:
> > 
> >   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64ZFR0-EL1--SVE-Feature-ID-Register-0?lang=en
> > 
> > ... I see that ID_AA64ZFR0_EL1.B16B16 >= 0b0010 implies the presence of
> > SME BFMUL and BFSCALE instructions, but I don't see something equivalent
> > in ID_AA64SMFR0_EL1 per:
> > 
> >   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64SMFR0-EL1--SME-Feature-ID-Register-0?lang=en
> > 
> > ... so I suspect ID_AA64ZFR0_EL1 might be the only source of truth for
> > this.
> 
> Indeed, and the SME HWCAPs are not doing the right thing either. Or
> rather, we have no way to tell userspace that BFMUL/BFSCALE are
> available.

To be clear, I'm happy to punt on adding SME-specific HWCAPs, I just
want to make sure we're agreed as to whether the existing HWCAPs should
be SVE-specific, which it sounds like we are?

Mark.



* Re: [PATCH v2] arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented
  2025-01-06 12:03     ` Mark Rutland
@ 2025-01-06 12:21       ` Marc Zyngier
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2025-01-06 12:21 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Catalin Marinas, Will Deacon, Mark Brown,
	stable

On Mon, 06 Jan 2025 12:03:44 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Mon, Jan 06, 2025 at 11:12:53AM +0000, Marc Zyngier wrote:
> > On Mon, 06 Jan 2025 09:40:56 +0000,
> > Mark Rutland <mark.rutland@arm.com> wrote:
> > > 
> > > On Fri, Jan 03, 2025 at 06:22:55PM +0000, Marc Zyngier wrote:
> > > > The hwcaps code that exposes SVE features to userspace only
> > > > considers ID_AA64ZFR0_EL1, while this is only valid when
> > > > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> > > > 
> > > > The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> > > > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> > > > 
> > > > Things become a bit more interesting if the HW implements SME.
> > > > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > > > features. And these fields overlap with their SVE interpretations.
> > > > But the architecture says that the SME and SVE feature sets must
> > > > match, so we're still hunky-dory.
> > > > 
> > > > This goes wrong if the HW implements SME, but not SVE. In this
> > > > case, we end up advertising some SVE features to userspace, even
> > > > if the HW has none. That's because we never consider whether SVE
> > > > is actually implemented. Oh well.
> > > 
> > > Ugh; this is a massive pain. :(
> > > 
> > > Was this found by inspection, or is some real software going wrong?
> > 
> > Catalin can comment on that -- I understand that he found existing SW
> > latching on SVE2 being wrongly advertised as hwcaps.
> > 
> > > > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > > > being non-zero.
> > > 
> > > Unfortunately, I'm not sure this fix is correct+complete.
> > > 
> > > We expose ID_AA64PFR0_EL1 and ID_AA64ZFR0_EL1 via ID register emulation,
> > > so any userspace software reading ID_AA64ZFR0_EL1 will encounter the
> > > same surprise. If we hide that I'm worried we might hide some SME-only
> > > information that isn't exposed elsewhere, and I'm not sure we can
> > > reasonably hide ID_AA64ZFR0_EL1 emulation for SME-only (more on that
> > > below).
> > 
> > I don't understand where things go wrong. EL0 SW that looks at the ID
> > registers should perform similar checks, and we are not trying to make
> > things better on that front (we can't). Unless you invent time travel
> > and fix the architecture 5 years ago... :-/
> 
> Fair enough; if we say software consuming ID_AA64ZFR0_EL1 must check
> ID_AA64PFR0_EL1.SVE or ID_AA64PFR1_EL1.SME first, and we leave the
> emulation of ID_AA64ZFR0_EL1 as-is, that's fine by me.

I think that's what the architecture forces on us, unfortunately.

> 
> > The hwcaps are effectively demultiplexing the ID registers, and they
> > have to be exact, which is what this patch provides (SVE2 doesn't get
> > wrongly advertised when not present).
> 
> > > Secondly, all our HWCAP documentation is written in the form:
> > > 
> > > | HWCAP2_SVEBF16
> > > |     Functionality implied by ID_AA64ZFR0_EL1.BF16 == 0b0001.
> > > 
> > > ... so while the architectural behaviour is a surprise, the kernel is
> > > (technically) behaving exactly as documented prior to this patch. Maybe
> > > we need to change that documentation?
> > 
> > Again, I don't see what goes wrong here. BF16 is only implemented for
> > SVE or SME+FA64, and FA64 requires SVE2. So at least for that one, we
> > should be good.
> 
> That was probably a bad example. What I was trying to get at is that the
> HWCAPs are behaving exactly *as documented*, but that's not what we
> actually want them to describe. For example, SVE2 is described as:
> 
> | Functionality implied by ID_AA64ZFR0_EL1.SVEver == 0b0001.
> 
> ... which is exactly what we check today, but that doesn't
> architecturally imply FEAT_SVE2 on SME-only HW where it can apparently
> be 0b0001 due to FEAT_SME alone.
> 
> So to match the code change we'd need to change that to something like:
> 
> | Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001 and
> | ID_AA64ZFR0_EL1.SVEver == 0b0001
> 
> ... with similar for other hwcaps.

Yeah, seems like a decent addition. I'll fold that in.

> 
> > > Do we have equivalent SME hwcaps for the relevant features?
> > >
> > > ... looking at:
> > > 
> > >   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64ZFR0-EL1--SVE-Feature-ID-Register-0?lang=en
> > > 
> > > ... I see that ID_AA64ZFR0_EL1.B16B16 >= 0b0010 implies the presence of
> > > SME BFMUL and BFSCALE instructions, but I don't see something equivalent
> > > in ID_AA64SMFR0_EL1 per:
> > > 
> > >   https://developer.arm.com/documentation/ddi0601/2024-12/AArch64-Registers/ID-AA64SMFR0-EL1--SME-Feature-ID-Register-0?lang=en
> > > 
> > > ... so I suspect ID_AA64ZFR0_EL1 might be the only source of truth for
> > > this.
> > 
> > Indeed, and the SME HWCAPs are not doing the right thing either. Or
> > rather, we have no way to tell userspace that BFMUL/BFSCALE are
> > available.
> 
> To be clear, I'm happy to punt on adding SME-specific HWCAPs, I just
> want to make sure we're agreed as to whether the existing HWCAPs should
> be SVE-specific, which it sounds like we are?

I think we're aligned here. I'll respin something shortly, once I've
made some progress on the state of my Inbox... :-/

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

