From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v4 1/5] linux-user: Implement aarch64 PR_SVE_SET/GET_VL
Date: Tue, 06 Mar 2018 12:28:53 +0000 [thread overview]
Message-ID: <87a7vltmka.fsf@linaro.org> (raw)
In-Reply-To: <20180303143823.27055-2-richard.henderson@linaro.org>
Richard Henderson <richard.henderson@linaro.org> writes:
> As an implementation choice, widening VL has zeroed the
> previously inaccessible portion of the sve registers.
>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> linux-user/aarch64/target_syscall.h | 3 +++
> target/arm/cpu.h | 1 +
> linux-user/syscall.c | 27 ++++++++++++++++++++++++
> target/arm/cpu64.c | 41 +++++++++++++++++++++++++++++++++++++
> 4 files changed, 72 insertions(+)
>
> diff --git a/linux-user/aarch64/target_syscall.h b/linux-user/aarch64/target_syscall.h
> index 604ab99b14..205265e619 100644
> --- a/linux-user/aarch64/target_syscall.h
> +++ b/linux-user/aarch64/target_syscall.h
> @@ -19,4 +19,7 @@ struct target_pt_regs {
> #define TARGET_MLOCKALL_MCL_CURRENT 1
> #define TARGET_MLOCKALL_MCL_FUTURE 2
>
> +#define TARGET_PR_SVE_SET_VL 50
> +#define TARGET_PR_SVE_GET_VL 51
For some reason I thought we might get this from our copy of
linux-headers but it seems we only do that for KVM bits.
> +
> #endif /* AARCH64_TARGET_SYSCALL_H */
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index 8dd6b788df..5f4566f017 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -861,6 +861,7 @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
> #ifdef TARGET_AARCH64
> int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
> int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
> +void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
> #endif
>
> target_ulong do_arm_semihosting(CPUARMState *env);
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index e24f43c4a2..38f40e2692 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -10670,6 +10670,33 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
> break;
> }
> #endif
> +#ifdef TARGET_AARCH64
> + case TARGET_PR_SVE_SET_VL:
> + /* We cannot support either PR_SVE_SET_VL_ONEXEC
> + or PR_SVE_VL_INHERIT. Therefore, anything above
> + ARM_MAX_VQ results in EINVAL. */
> + ret = -TARGET_EINVAL;
> + if (arm_feature(cpu_env, ARM_FEATURE_SVE)
> + && arg2 >= 0 && arg2 <= ARM_MAX_VQ * 16 && !(arg2 & 15)) {
> + CPUARMState *env = cpu_env;
The kernel code splits the arg2 up into VL and flags. We don't seem to
be doing that here.
vl = arg & PR_SVE_VL_LEN_MASK;
flags = arg & ~vl;
I'm not sure what && !(arg2 & 15) is doing but PR_SVE_VL_LEN_MASK is
0xffff, Perhaps some defines would be useful to make it clearer.
> + int old_vq = (env->vfp.zcr_el[1] & 0xf) + 1;
> + int vq = MAX(arg2 / 16, 1);
> +
> + if (vq < old_vq) {
> + aarch64_sve_narrow_vq(env, vq);
> + }
> + env->vfp.zcr_el[1] = vq - 1;
It seems odd not to have setting this inside cpu64.c. Won't a similar
manipulation need to be made for system mode? I'd keep all the logic
together in aarch64_sve_narrow_vq (or maybe call it aarch64_sve_set_vq
and pass it the current exception level).
> + ret = vq * 16;
> + }
> + break;
> + case TARGET_PR_SVE_GET_VL:
> + ret = -TARGET_EINVAL;
> + if (arm_feature(cpu_env, ARM_FEATURE_SVE)) {
> + CPUARMState *env = cpu_env;
> + ret = ((env->vfp.zcr_el[1] & 0xf) + 1) * 16;
> + }
> + break;
> +#endif /* AARCH64 */
> case PR_GET_SECCOMP:
> case PR_SET_SECCOMP:
> /* Disable seccomp to prevent the target disabling syscalls we
> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
> index 4228713b19..74b485b382 100644
> --- a/target/arm/cpu64.c
> +++ b/target/arm/cpu64.c
> @@ -366,3 +366,44 @@ static void aarch64_cpu_register_types(void)
> }
>
> type_init(aarch64_cpu_register_types)
> +
> +/* The manual says that when SVE is enabled and VQ is widened the
> + * implementation is allowed to zero the previously inaccessible
> + * portion of the registers. The corollary to that is that when
> + * SVE is enabled and VQ is narrowed we are also allowed to zero
> + * the now inaccessible portion of the registers.
> + *
> + * The intent of this is that no predicate bit beyond VQ is ever set.
> + * Which means that some operations on predicate registers themselves
> + * may operate on full uint64_t or even unrolled across the maximum
> + * uint64_t[4]. Performing 4 bits of host arithmetic unconditionally
> + * may well be cheaper than conditionals to restrict the operation
> + * to the relevant portion of a uint16_t[16].
> + *
> + * TODO: Need to call this for changes to the real system registers
> + * and EL state changes.
> + */
> +void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
> +{
> + int i, j;
> + uint64_t pmask;
> +
> + assert(vq >= 1 && vq <= ARM_MAX_VQ);
> +
> + /* Zap the high bits of the zregs. */
> + for (i = 0; i < 32; i++) {
> + memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
> + }
> +
> + /* Zap the high bits of the pregs and ffr. */
> + pmask = 0;
> + if (vq & 3) {
> + pmask = ~(-1ULL << (16 * (vq & 3)));
> + }
The kernel defines SVE_VQ_BYTES for clarity, perhaps we should do so to
here.
> + for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
> + for (i = 0; i < 17; ++i) {
> + env->vfp.pregs[i].p[j] &= pmask;
> + }
> + pmask = 0;
> + }
> +}
--
Alex Bennée
next prev parent reply other threads:[~2018-03-06 12:28 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-03 14:38 [Qemu-devel] [PATCH v4 0/5] target/arm linux-user changes for sve Richard Henderson
2018-03-03 14:38 ` [Qemu-devel] [PATCH v4 1/5] linux-user: Implement aarch64 PR_SVE_SET/GET_VL Richard Henderson
2018-03-06 12:28 ` Alex Bennée [this message]
2018-03-06 12:58 ` [Qemu-arm] " Peter Maydell
2018-03-06 12:58 ` [Qemu-devel] [Qemu-arm] " Peter Maydell
2018-03-06 14:03 ` [Qemu-arm] [Qemu-devel] " Alex Bennée
2018-03-06 14:03 ` [Qemu-devel] [Qemu-arm] " Alex Bennée
2018-03-03 14:38 ` [Qemu-devel] [PATCH v4 2/5] aarch64-linux-user: Split out helpers for guest signal handling Richard Henderson
2018-03-06 14:11 ` Alex Bennée
2018-03-03 14:38 ` [Qemu-devel] [PATCH v4 3/5] aarch64-linux-user: Remove struct target_aux_context Richard Henderson
2018-03-06 14:16 ` Alex Bennée
2018-03-03 14:38 ` [Qemu-arm] [PATCH v4 4/5] aarch64-linux-user: Add support for EXTRA signal frame records Richard Henderson
2018-03-03 14:38 ` [Qemu-devel] " Richard Henderson
2018-03-06 14:26 ` [Qemu-arm] " Alex Bennée
2018-03-06 14:26 ` [Qemu-devel] " Alex Bennée
2018-03-03 14:38 ` [Qemu-arm] [PATCH v4 5/5] aarch64-linux-user: Add support for SVE " Richard Henderson
2018-03-03 14:38 ` [Qemu-devel] " Richard Henderson
2018-03-05 15:44 ` [Qemu-arm] " Peter Maydell
2018-03-05 15:44 ` [Qemu-devel] " Peter Maydell
2018-03-06 14:27 ` Alex Bennée
2018-03-06 14:27 ` [Qemu-devel] " Alex Bennée
2018-03-05 15:42 ` [Qemu-arm] [Qemu-devel] [PATCH v4 0/5] target/arm linux-user changes for sve Peter Maydell
2018-03-05 15:42 ` Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87a7vltmka.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=qemu-arm@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.