* [PATCH] arm64/sme: More sensibly define the size for the ZA register set
@ 2022-05-05 22:15 Mark Brown
2022-05-06 2:43 ` Naresh Kamboju
2022-05-06 17:39 ` Catalin Marinas
0 siblings, 2 replies; 3+ messages in thread
From: Mark Brown @ 2022-05-05 22:15 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel, Naresh Kamboju, Mark Brown, Qian Cai
Since the vector length configuration mechanism is identical between SVE
and SME we share large elements of the code including the definition for
the maximum vector length. Unfortunately when we were defining the ABI
for SVE we included not only the actual maximum vector length of 2048
bits but also the value possible if all the bits reserved in the
architecture for expansion of the LEN field were used, 16384 bits.
This starts creating problems if we try to allocate anything for the ZA
matrix based on the maximum possible vector length, as we do for the
regset used with ptrace during the process of generating a core dump.
While the maximum potential size for ZA with the current architecture is
a reasonably managable 64K with the higher reserved limit ZA would be
64M which leads to entirely reasonable complaints from the memory
management code when we try to allocate a buffer of that size. Avoid
these issues by defining the actual maximum vector length for the
architecture and using it for the SME regsets.
Also use the full ZA_PT_SIZE() with the header rather than just the
actual register payload when specifying the size, fixing support for the
largest vector lengths now that we have this new, lower define. With the
SVE maximum this did not cause problems due to the extra headroom we
had.
While we're at it add a comment clarifying why even though ZA is a
single register we tell the regset code that it is a multi-register
regset.
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 12 ++++++++++++
arch/arm64/kernel/ptrace.c | 12 ++++++++++--
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 6fd879fe02da..aa11dbec0d70 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -32,6 +32,18 @@
#define VFP_STATE_SIZE ((32 * 8) + 4)
#endif
+/*
+ * When we defined the maximum SVE vector length we defined the ABI so
+ * that the maximum vector length included all the reserved for future
+ * expansion bits in ZCR rather than those just currently defined by
+ * the architecture. While SME follows a similar pattern the fact that
+ * it includes a square matrix means that any allocations that attempt
+ * to cover the maximum potential vector length (such as happen with
+ * the regset used for ptrace) end up being extremely large. Define
+ * the much lower actual limit for use in such situations.
+ */
+#define SME_VQ_MAX 16
+
struct task_struct;
extern void fpsimd_save_state(struct user_fpsimd_state *state);
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 78085a32df83..140f5a8cc7c4 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1438,7 +1438,7 @@ static const struct user_regset aarch64_regsets[] = {
#ifdef CONFIG_ARM64_SME
[REGSET_SSVE] = { /* Streaming mode SVE */
.core_note_type = NT_ARM_SSVE,
- .n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE),
+ .n = DIV_ROUND_UP(SVE_PT_SIZE(SME_VQ_MAX, SVE_PT_REGS_SVE),
SVE_VQ_BYTES),
.size = SVE_VQ_BYTES,
.align = SVE_VQ_BYTES,
@@ -1447,7 +1447,15 @@ static const struct user_regset aarch64_regsets[] = {
},
[REGSET_ZA] = { /* SME ZA */
.core_note_type = NT_ARM_ZA,
- .n = DIV_ROUND_UP(ZA_PT_ZA_SIZE(SVE_VQ_MAX), SVE_VQ_BYTES),
+ /*
+ * ZA is a single register but it's variably sized and
+ * the ptrace core requires that the size of any data
+ * be an exact multiple of the configured register
+ * size so report as though we had SVE_VQ_BYTES
+ * registers. These values aren't exposed to
+ * userspace.
+ */
+ .n = DIV_ROUND_UP(ZA_PT_SIZE(SME_VQ_MAX), SVE_VQ_BYTES),
.size = SVE_VQ_BYTES,
.align = SVE_VQ_BYTES,
.regset_get = za_get,
--
2.30.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] arm64/sme: More sensibly define the size for the ZA register set
2022-05-05 22:15 [PATCH] arm64/sme: More sensibly define the size for the ZA register set Mark Brown
@ 2022-05-06 2:43 ` Naresh Kamboju
2022-05-06 17:39 ` Catalin Marinas
1 sibling, 0 replies; 3+ messages in thread
From: Naresh Kamboju @ 2022-05-06 2:43 UTC (permalink / raw)
To: Mark Brown; +Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, Qian Cai
Hi Mark
On Fri, 6 May 2022 at 03:45, Mark Brown <broonie@kernel.org> wrote:
>
> Since the vector length configuration mechanism is identical between SVE
> and SME we share large elements of the code including the definition for
> the maximum vector length. Unfortunately when we were defining the ABI
> for SVE we included not only the actual maximum vector length of 2048
> bits but also the value possible if all the bits reserved in the
> architecture for expansion of the LEN field were used, 16384 bits.
>
> This starts creating problems if we try to allocate anything for the ZA
> matrix based on the maximum possible vector length, as we do for the
> regset used with ptrace during the process of generating a core dump.
> While the maximum potential size for ZA with the current architecture is
> a reasonably managable 64K with the higher reserved limit ZA would be
> 64M which leads to entirely reasonable complaints from the memory
> management code when we try to allocate a buffer of that size. Avoid
> these issues by defining the actual maximum vector length for the
> architecture and using it for the SME regsets.
>
> Also use the full ZA_PT_SIZE() with the header rather than just the
> actual register payload when specifying the size, fixing support for the
> largest vector lengths now that we have this new, lower define. With the
> SVE maximum this did not cause problems due to the extra headroom we
> had.
>
> While we're at it add a comment clarifying why even though ZA is a
> single register we tell the regset code that it is a multi-register
> regset.
Thanks for providing the fix patch.
I have tested this patch and the reported warning got fixed.
>
> Reported-by: Qian Cai <quic_qiancai@quicinc.com>
> Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> ---
> arch/arm64/include/asm/fpsimd.h | 12 ++++++++++++
> arch/arm64/kernel/ptrace.c | 12 ++++++++++--
> 2 files changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 6fd879fe02da..aa11dbec0d70 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -32,6 +32,18 @@
> #define VFP_STATE_SIZE ((32 * 8) + 4)
> #endif
>
> +/*
> + * When we defined the maximum SVE vector length we defined the ABI so
> + * that the maximum vector length included all the reserved for future
> + * expansion bits in ZCR rather than those just currently defined by
> + * the architecture. While SME follows a similar pattern the fact that
> + * it includes a square matrix means that any allocations that attempt
> + * to cover the maximum potential vector length (such as happen with
> + * the regset used for ptrace) end up being extremely large. Define
> + * the much lower actual limit for use in such situations.
> + */
> +#define SME_VQ_MAX 16
> +
> struct task_struct;
>
> extern void fpsimd_save_state(struct user_fpsimd_state *state);
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 78085a32df83..140f5a8cc7c4 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -1438,7 +1438,7 @@ static const struct user_regset aarch64_regsets[] = {
> #ifdef CONFIG_ARM64_SME
> [REGSET_SSVE] = { /* Streaming mode SVE */
> .core_note_type = NT_ARM_SSVE,
> - .n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE),
> + .n = DIV_ROUND_UP(SVE_PT_SIZE(SME_VQ_MAX, SVE_PT_REGS_SVE),
> SVE_VQ_BYTES),
> .size = SVE_VQ_BYTES,
> .align = SVE_VQ_BYTES,
> @@ -1447,7 +1447,15 @@ static const struct user_regset aarch64_regsets[] = {
> },
> [REGSET_ZA] = { /* SME ZA */
> .core_note_type = NT_ARM_ZA,
> - .n = DIV_ROUND_UP(ZA_PT_ZA_SIZE(SVE_VQ_MAX), SVE_VQ_BYTES),
> + /*
> + * ZA is a single register but it's variably sized and
> + * the ptrace core requires that the size of any data
> + * be an exact multiple of the configured register
> + * size so report as though we had SVE_VQ_BYTES
> + * registers. These values aren't exposed to
> + * userspace.
> + */
> + .n = DIV_ROUND_UP(ZA_PT_SIZE(SME_VQ_MAX), SVE_VQ_BYTES),
> .size = SVE_VQ_BYTES,
> .align = SVE_VQ_BYTES,
> .regset_get = za_get,
> --
> 2.30.2
>
Test log link,
https://lkft.validation.linaro.org/scheduler/job/4993362
- Naresh
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] arm64/sme: More sensibly define the size for the ZA register set
2022-05-05 22:15 [PATCH] arm64/sme: More sensibly define the size for the ZA register set Mark Brown
2022-05-06 2:43 ` Naresh Kamboju
@ 2022-05-06 17:39 ` Catalin Marinas
1 sibling, 0 replies; 3+ messages in thread
From: Catalin Marinas @ 2022-05-06 17:39 UTC (permalink / raw)
To: Will Deacon, Mark Brown; +Cc: Naresh Kamboju, Qian Cai, linux-arm-kernel
On Thu, 5 May 2022 23:15:17 +0100, Mark Brown wrote:
> Since the vector length configuration mechanism is identical between SVE
> and SME we share large elements of the code including the definition for
> the maximum vector length. Unfortunately when we were defining the ABI
> for SVE we included not only the actual maximum vector length of 2048
> bits but also the value possible if all the bits reserved in the
> architecture for expansion of the LEN field were used, 16384 bits.
>
> [...]
Applied to arm64 (for-next/sme), thanks!
[1/1] arm64/sme: More sensibly define the size for the ZA register set
https://git.kernel.org/arm64/c/d158a0608eb8
--
Catalin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-05-06 17:40 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-05-05 22:15 [PATCH] arm64/sme: More sensibly define the size for the ZA register set Mark Brown
2022-05-06 2:43 ` Naresh Kamboju
2022-05-06 17:39 ` Catalin Marinas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).