Linux Kernel Selftest development
From: Luis Machado <luis.machado@arm.com>
To: Mark Brown <broonie@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Shuah Khan <shuah@kernel.org>
Cc: Alan Hayward <alan.hayward@arm.com>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2 04/21] arm64/sme: Document SME 2 and SME 2.1 ABI
Date: Fri, 11 Nov 2022 10:38:12 +0000
Message-ID: <b8129bae-6917-dd42-7e11-000bacca3669@arm.com>
In-Reply-To: <ac30884b-3c48-4fb9-d649-aaf5863e4505@arm.com>

On 11/11/22 10:17, Luis Machado wrote:
> On 11/1/22 14:33, Mark Brown wrote:
>> As well as a number of simple features which only add new instructions and
>> require corresponding hwcaps SME2 introduces a new register ZT0 for which
>> we must define ABI. Fortunately this is a fixed size 512 bits and therefore
>> much more straightforward than the base SME state, the only wrinkle is that
>> it is only accessible when ZA is accessible.
>>
>> While there is only a single register the architecture is written with a
>> view to extensibility, including a number in the name, so follow this in the
>> ABI.
>>
>> Signed-off-by: Mark Brown <broonie@kernel.org>
>> ---
>>   Documentation/arm64/sme.rst | 52 ++++++++++++++++++++++++++++++-------
>>   1 file changed, 43 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/arm64/sme.rst b/Documentation/arm64/sme.rst
>> index 16d2db4c2e2e..5f7eabee4853 100644
>> --- a/Documentation/arm64/sme.rst
>> +++ b/Documentation/arm64/sme.rst
>> @@ -18,14 +18,19 @@ model features for SME is included in Appendix A.
>>   1.  General
>>   -----------
>> -* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
>> -  register state and TPIDR2_EL0 are tracked per thread.
>> +* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA and (when
>> +  present) ZT0 register state and TPIDR2_EL0 are tracked per thread.
>>   * The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
>>     AT_HWCAP2 entry.  Presence of this flag implies the presence of the SME
>>     instructions and registers, and the Linux-specific system interfaces
>>     described in this document.  SME is reported in /proc/cpuinfo as "sme".
>> +* The presence of SME2 is reported to userspace via HWCAP2_SME in the
> 
> I suppose HWCAP2_SME -> HWCAP2_SME2?
> 
>> +  aux vector AT_HWCAP2 entry.  Presence of this flag implies the presence of
>> +  the SME2 instructions and ZT0, and the Linux-specific system interfaces
>> +  described in this document.  SME2 is reported in /proc/cpuinfo as "sme2".
>> +
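
For reference, this is roughly how I would expect userspace to consume the flag once it is named HWCAP2_SME2. It is only a sketch: it assumes the installed kernel headers already define HWCAP2_SME2 in <asm/hwcap.h>, and it falls back to "not reported" when they do not or when built for another architecture. The helper names hwcap2_has_sme2() and sme2_reported() are made up for illustration.

```c
#include <stdbool.h>
#include <sys/auxv.h>

#if defined(__aarch64__)
#include <asm/hwcap.h>	/* HWCAP2_SME2, if the headers are new enough */
#endif

/* Hypothetical helper: test the SME2 bit in a given AT_HWCAP2 value. */
static bool hwcap2_has_sme2(unsigned long hwcap2)
{
#ifdef HWCAP2_SME2
	return (hwcap2 & HWCAP2_SME2) != 0;
#else
	(void)hwcap2;
	return false;	/* headers predate SME2, or not arm64: treat as absent */
#endif
}

/* Hypothetical helper: query the running process's aux vector. */
static bool sme2_reported(void)
{
	return hwcap2_has_sme2(getauxval(AT_HWCAP2));
}
```

A caller would check sme2_reported() before touching ZT0 or any SME2 instruction.
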
>>   * Support for the execution of SME instructions in userspace can also be
>>     detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
>>     instruction, and checking that the value of the SME field is nonzero. [3]
>> @@ -44,6 +49,7 @@ model features for SME is included in Appendix A.
>>       HWCAP2_SME_B16F32
>>       HWCAP2_SME_F32F32
>>       HWCAP2_SME_FA64
>> +        HWCAP2_SME2
>>     This list may be extended over time as the SME architecture evolves.
>> @@ -52,8 +58,8 @@ model features for SME is included in Appendix A.
>>     cpu-feature-registers.txt for details.
>>   * Debuggers should restrict themselves to interacting with the target via the
>> -  NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets.  The recommended way
>> -  of detecting support for these regsets is to connect to a target process
>> +  NT_ARM_SVE, NT_ARM_SSVE, NT_ARM_ZA and NT_ARM_ZT regsets.  The recommended
>> +  way of detecting support for these regsets is to connect to a target process
>>     first and then attempt a
>>       ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
>> @@ -89,13 +95,13 @@ be zeroed.
>>   -------------------------
>>   * On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
>> -  ZA matrix are preserved.
>> +  ZA matrix and ZT0 (if present) are preserved.
>>   * On syscall PSTATE.SM will be cleared and the SVE registers will be handled
>>     as per the standard SVE ABI.
>> -* Neither the SVE registers nor ZA are used to pass arguments to or receive
>> -  results from any syscall.
>> +* None of the SVE registers, ZA or ZT0 are used to pass arguments to
>> +  or receive results from any syscall.
>>   * On process creation (eg, clone()) the newly created process will have
>>     PSTATE.SM cleared.
>> @@ -134,6 +140,14 @@ be zeroed.
>>     __reserved[] referencing this space.  za_context is then written in the
>>     extra space.  Refer to [1] for further details about this mechanism.
>> +* If ZT is supported and PSTATE.ZA==1 then a signal frame record for ZT will
>> +  be generated.
> 
> I noticed we refer to ZT0 as ZT sometimes. Should we use ZT0 throughout? Or maybe ZT, if it makes more sense?
> 
> Otherwise it can get a bit confusing.
> 

Reading through the rest of the series, I noticed we're leaving room for more ZT registers in the future.

>> +
>> +* The signal record for ZT has magic ZT_MAGIC (0x73d4e827) and consists of a
>> +  standard signal frame header followed by a struct zt_context specifying
>> +  the number of ZT registers supported by the system, then zt_contxt.nregs
> 
> zt_contxt -> zt_context
> 
>> +  blocks of 64 bytes of data per register.
>> +
>>   5.  Signal return
>>   -----------------
>> @@ -151,6 +165,9 @@ When returning from a signal handler:
>>     the signal frame does not match the current vector length, the signal return
>>     attempt is treated as illegal, resulting in a forced SIGSEGV.
>> +* If ZT is not supported or PSTATE.ZA==0 then it is illegal to have a
>> +  signal frame record for ZT, resulting in a forced SIGSEGV.
>> +
>>   6.  prctl extensions
>>   --------------------
>> @@ -214,8 +231,8 @@ prctl(PR_SME_SET_VL, unsigned long arg)
>>         vector length that will be applied at the next execve() by the calling
>>         thread.
>> -    * Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
>> -      Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
>> +    * Changing the vector length causes all of ZA, ZT, P0..P15, FFR and all
>> +      bits of Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
>>         unspecified, including both streaming and non-streaming SVE state.
>>         Calling PR_SME_SET_VL with vl equal to the thread's current vector
>>         length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
>> @@ -317,6 +334,15 @@ The regset data starts with struct user_za_header, containing:
>>   * The effect of writing a partial, incomplete payload is unspecified.
>> +* A new regset NT_ARM_ZT is defined for for access to ZT state via
> 
> typo, double for
> 
>> +  PTRACE_GETREGSET and PTRACE_SETREGSET.
>> +
>> +* The NT_ARM_ZT regset consists of a single 512 bit register.
>> +
>> +* When PSTATE.ZA==0 reads of NT_ARM_ZT will report all bits of ZT as 0.
>> +
>> +* Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
>> +
>>   8.  ELF coredump extensions
>>   ---------------------------
>> @@ -331,6 +357,11 @@ The regset data starts with struct user_za_header, containing:
>>     been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
>>     when the coredump was generated.
>> +* A NT_ARM_ZT note will be added to each coredump for each thread of the
>> +  dumped process.  The contents will be equivalent to the data that would have
>> +  been read if a PTRACE_GETREGSET of NT_ARM_ZT were executed for each thread
>> +  when the coredump was generated.
>> +
>>   * The NT_ARM_TLS note will be extended to two registers, the second register
>>     will contain TPIDR2_EL0 on systems that support SME and will be read as
>>     zero with writes ignored otherwise.
>> @@ -406,6 +437,9 @@ In A64 state, SME adds the following:
>>     For best system performance it is strongly encouraged for software to enable
>>     ZA only when it is actively being used.
>> +* A new ZT0 register is introduced when SME2 is present. This is a 512 bit
>> +  register which is accessible PSTATE.ZA is set, as ZA itself is.
> 
> accessible WHEN?
> 
>> +
>>   * Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
>>     SMSTOP instructions or by access to the SVCR system register:
> 
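
To check my reading of the ZT signal record described above, here is a rough sketch of how a consumer might locate it. The structures are local stand-ins, not the uapi definitions: I am assuming the record starts with the usual magic/size header, that nregs is a 16-bit field followed by padding, and that the records sit in a flat buffer (no extra_context indirection). Only the ZT_MAGIC value and the 64 bytes per register come from the patch text.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ZT_MAGIC_SKETCH	0x73d4e827u	/* value quoted in the patch */
#define ZT_REG_BYTES	64u		/* 512 bits per ZT register */

/* Local stand-ins for the uapi structures; the layout is an assumption. */
struct ctx_head_sketch {
	uint32_t magic;
	uint32_t size;		/* size of the whole record, header included */
};

struct zt_context_sketch {
	struct ctx_head_sketch head;
	uint16_t nregs;
	uint16_t reserved[3];
};

/*
 * Walk the records in buf[0..len) and return a pointer to the ZT register
 * data, or NULL if no well-formed ZT record is found.  On success *nregs
 * receives the register count from the record.
 */
static const uint8_t *find_zt_regs(const uint8_t *buf, size_t len,
				   uint16_t *nregs)
{
	size_t off = 0;

	while (len - off >= sizeof(struct ctx_head_sketch)) {
		struct ctx_head_sketch head;

		memcpy(&head, buf + off, sizeof(head));

		if (head.magic == 0 && head.size == 0)
			return NULL;	/* terminator record */
		if (head.size < sizeof(head) || head.size > len - off)
			return NULL;	/* malformed record */

		if (head.magic == ZT_MAGIC_SKETCH) {
			struct zt_context_sketch zt;

			if (head.size < sizeof(zt))
				return NULL;
			memcpy(&zt, buf + off, sizeof(zt));
			/* The payload must hold nregs blocks of 64 bytes. */
			if (head.size - sizeof(zt) <
			    (size_t)zt.nregs * ZT_REG_BYTES)
				return NULL;
			*nregs = zt.nregs;
			return buf + off + sizeof(zt);
		}
		off += head.size;
	}
	return NULL;
}
```

If the real zt_context differs from this layout, only the stand-in struct and the payload offset should need to change.
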



Thread overview: 27+ messages
2022-11-01 14:33 [PATCH v2 00/21] arm64/sme: Support SME 2 and SME 2.1 Mark Brown
2022-11-01 14:33 ` [PATCH v2 01/21] arm64/sme: Rename za_state to sme_state Mark Brown
2022-11-01 14:33 ` [PATCH v2 02/21] arm64: Document boot requirements for SME 2 Mark Brown
2022-11-01 14:33 ` [PATCH v2 03/21] arm64/sysreg: Update system registers for SME 2 and 2.1 Mark Brown
2022-11-01 14:33 ` [PATCH v2 04/21] arm64/sme: Document SME 2 and SME 2.1 ABI Mark Brown
2022-11-11 10:17   ` Luis Machado
2022-11-11 10:38     ` Luis Machado [this message]
2022-11-11 11:20     ` Mark Brown
2022-11-01 14:33 ` [PATCH v2 05/21] arm64/esr: Document ISS for ZT0 being disabled Mark Brown
2022-11-01 14:33 ` [PATCH v2 06/21] arm64/sme: Manually encode ZT0 load and store instructions Mark Brown
2022-11-01 14:33 ` [PATCH v2 07/21] arm64/sme: Enable host kernel to access ZT0 Mark Brown
2022-11-01 14:33 ` [PATCH v2 08/21] arm64/sme: Add basic enumeration for SME2 Mark Brown
2022-11-01 14:33 ` [PATCH v2 09/21] arm64/sme: Provide storage for ZT0 Mark Brown
2022-11-01 14:33 ` [PATCH v2 10/21] arm64/sme: Implement context switching " Mark Brown
2022-11-01 14:33 ` [PATCH v2 11/21] arm64/sme: Implement signal handling for ZT Mark Brown
2022-11-01 14:33 ` [PATCH v2 12/21] arm64/sme: Implement ZT0 ptrace support Mark Brown
2022-11-11 10:31   ` Luis Machado
2022-11-11 11:25     ` Mark Brown
2022-11-01 14:33 ` [PATCH v2 13/21] arm64/sme: Add hwcaps for SME 2 and 2.1 features Mark Brown
2022-11-01 14:33 ` [PATCH v2 14/21] kselftest/arm64: Add a stress test program for ZT0 Mark Brown
2022-11-01 14:33 ` [PATCH v2 15/21] kselftest/arm64: Cover ZT in the FP stress test Mark Brown
2022-11-01 14:33 ` [PATCH v2 16/21] kselftest/arm64: Enumerate SME2 in the signal test utility code Mark Brown
2022-11-01 14:33 ` [PATCH v2 17/21] kselftest/arm64: Teach the generic signal context validation about ZT Mark Brown
2022-11-01 14:33 ` [PATCH v2 18/21] kselftest/arm64: Add test coverage for ZT register signal frames Mark Brown
2022-11-01 14:33 ` [PATCH v2 19/21] kselftest/arm64: Add SME2 coverage to syscall-abi Mark Brown
2022-11-01 14:33 ` [PATCH v2 20/21] kselftest/arm64: Add coverage of the ZT ptrace regset Mark Brown
2022-11-01 14:33 ` [PATCH v2 21/21] kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps Mark Brown
