From: Richard Henderson <richard.henderson@linaro.org>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-devel@nongnu.org, qemu-arm@nongnu.org
Subject: Re: [PATCH v3 26/51] target/arm: Implement SME LD1, ST1
Date: Thu, 23 Jun 2022 13:36:42 -0700
Message-ID: <c6308243-0742-b9dc-a325-da095e687181@linaro.org>
In-Reply-To: <CAFEAcA932Ud1PH4Az=NVW02mR8Q_GNmsH-kvLTRZ_TmCR8=Psg@mail.gmail.com>
On 6/23/22 04:41, Peter Maydell wrote:
>> +/*
>> + * FIXME: The ARMVectorReg elements are stored in host-endian 64-bit units.
>> + * We do not have a defined ordering of the 64-bit units for host-endian
>> + * 128-bit quantities. For now, just leave the host words in little-endian
>> + * order and hope for the best.
>> + */
>
> I don't understand this comment. The architecture ought to specify
> what order the two halves of a 128-bit quantity ought to go in the
> ZArray storage. If we can't guarantee that a host int128_t can be
> handled in a way that does the right thing, then we just can't
> use int128_t.
Indeed. The spec defines the storage as an array of bits, and the way it is indexed is
equivalent to placing the two 64-bit halves in little-endian order in our array. I'll
improve the comment.
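To make the intended ordering concrete, here is a minimal sketch in plain C. Elem128,
elem128_set and elem128_bit are hypothetical names for illustration only, not QEMU's
ARMVectorReg or any existing helper. Each 64-bit unit is an ordinary host-endian
uint64_t; only the order of the two units is fixed, low half first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one 128-bit element of a vector or ZA row. */
typedef struct {
    uint64_t d[2];   /* d[0] holds bits [63:0], d[1] holds bits [127:64] */
} Elem128;

/* Store a 128-bit value supplied as high and low 64-bit halves. */
static inline void elem128_set(Elem128 *e, uint64_t hi, uint64_t lo)
{
    e->d[0] = lo;    /* low half goes in the lower-indexed unit */
    e->d[1] = hi;
}

/* Return bit n (0..127) of the element, counting from the least significant. */
static inline unsigned elem128_bit(const Elem128 *e, unsigned n)
{
    assert(n < 128);
    return (unsigned)((e->d[n / 64] >> (n % 64)) & 1);
}
```

With this layout, bit n of the element is always found in unit n / 64 at position
n % 64, regardless of host byte order.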
>> +static inline QEMU_ALWAYS_INLINE
>> +void sme_ld1(CPUARMState *env, void *za, uint64_t *vg,
>> + const target_ulong addr, uint32_t desc, const uintptr_t ra,
>> + const int esz, uint32_t mtedesc, bool vertical,
>> + sve_ldst1_host_fn *host_fn,
>> + sve_ldst1_tlb_fn *tlb_fn,
>> + ClearFn *clr_fn,
>> + CopyFn *cpy_fn)
>
> We now have several rather long functions that are
> pretty complicated and pretty similar handling the various
> SVE and SME loads and stores. Is there really no hope for
> sharing code ?
I'm not sure. Maybe. The two issues are:
(1) SVE LD4/ST4 -- Arm didn't make z29-z31 illegal, but defined wraparound to z0. So I
pass a Zreg number, not a pointer, to those routines, and they can't be reused
for ZArray without changing that.
(2) SME LD1/ST1 vertical slice, which significantly alters the spacing between the elements.
One possibility for these cases is to perform the load into env->some_scratch_space, then
copy into the final position afterward, and the reverse for stores.
Is that preferable? Do you see another alternative?
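As a rough sketch of the scratch-space idea, in plain C: the load fills a contiguous
buffer, and a separate pass copies each element into its place in a column. Everything
here is an assumption for illustration. The tile is modelled as a simple array of rows
of row_bytes each, which ignores how ZA tiles are really laid out, and
scratch_to_column / column_to_scratch are hypothetical names, not existing helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load direction: scatter n contiguous elements into column 'col' of the tile. */
static void scratch_to_column(uint8_t *tile, size_t row_bytes, size_t col,
                              const uint8_t *scratch, size_t n, size_t esize)
{
    assert((col + 1) * esize <= row_bytes);
    for (size_t i = 0; i < n; i++) {
        /* Element i of the slice lives in row i, at the column's byte offset. */
        memcpy(tile + i * row_bytes + col * esize, scratch + i * esize, esize);
    }
}

/* Store direction: gather column 'col' of the tile into contiguous scratch. */
static void column_to_scratch(uint8_t *scratch, const uint8_t *tile,
                              size_t row_bytes, size_t col,
                              size_t n, size_t esize)
{
    assert((col + 1) * esize <= row_bytes);
    for (size_t i = 0; i < n; i++) {
        memcpy(scratch + i * esize, tile + i * row_bytes + col * esize, esize);
    }
}
```

The cost is one extra copy per access; the gain would be that the memory-access part
only ever sees a contiguous destination.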
>> + t_za = get_tile_rowcol(s, a->esz, a->rs, a->za_imm, a->v);
>> + t_pg = pred_full_reg_ptr(s, a->pg);
>> + addr = tcg_temp_new_i64();
>> +
>> + tcg_gen_shli_i64(addr, cpu_reg(s, a->rn), a->esz);
>> + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rm));
>
> Aren't rm and rn the wrong way around here? The pseudocode
> says that rn is the base (can be SP, not scaled) and rm is
> the offset (can be XZR, scaled by mbytes).
Yep, thanks.
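For reference, the address computation described in the review, modelled on plain
integers rather than TCG ops: rn is the unscaled base and rm is the offset scaled by
the element size. sme_ldst_addr is a hypothetical name for this sketch, and the bound
on esz (0..4, bytes through 128-bit elements) is my assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * base_rn:   value of Xn or SP, used as-is
 * offset_rm: value of Xm (XZR reads as 0), scaled by the element size
 * esz:       log2 of the element size in bytes
 */
static inline uint64_t sme_ldst_addr(uint64_t base_rn, uint64_t offset_rm,
                                     unsigned esz)
{
    assert(esz <= 4);
    return base_rn + (offset_rm << esz);
}
```

In the translator this means the shift applies to rm and the SP-capable read applies
to rn, which is the reverse of the quoted hunk.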
r~