From: Richard Henderson <richard.henderson@linaro.org>
To: Anton Johansson <anjo@rev.ng>
Cc: qemu-devel@nongnu.org, ale@rev.ng, ltaylorsimpson@gmail.com,
bcain@quicinc.com, philmd@linaro.org, alex.bennee@linaro.org
Subject: Re: [RFC PATCH v1 03/43] accel/tcg: Add gvec size changing operations
Date: Tue, 3 Dec 2024 12:57:57 -0600
Message-ID: <e4910c71-8220-404b-bb43-0b885914e183@linaro.org>
In-Reply-To: <v5pkpmxto7vtshg7a5mifaozrzn6n5d7raknvydad3oxk67jeu@i4jydb4wylpb>
On 12/3/24 12:08, Anton Johansson wrote:
> On 22/11/24, Richard Henderson wrote:
>> On 11/20/24 19:49, Anton Johansson wrote:
>>> Adds new functions to the gvec API for truncating, sign- or zero
>>> extending vector elements. Currently implemented as helper functions,
>>> these may be mapped onto host vector instructions in the future.
>>>
>>> For the time being, allows translation of more complicated vector
>>> instructions by helper-to-tcg.
>>>
>>> Signed-off-by: Anton Johansson <anjo@rev.ng>
>>> ---
>>> accel/tcg/tcg-runtime-gvec.c | 41 +++++++++++++++++
>>> accel/tcg/tcg-runtime.h | 22 +++++++++
>>> include/tcg/tcg-op-gvec-common.h | 18 ++++++++
>>> tcg/tcg-op-gvec.c | 78 ++++++++++++++++++++++++++++++++
>>> 4 files changed, 159 insertions(+)
>>>
>>> diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
>>> index afca89baa1..685c991e6a 100644
>>> --- a/accel/tcg/tcg-runtime-gvec.c
>>> +++ b/accel/tcg/tcg-runtime-gvec.c
>>> @@ -1569,3 +1569,44 @@ void HELPER(gvec_bitsel)(void *d, void *a, void *b, void *c, uint32_t desc)
>>> }
>>> clear_high(d, oprsz, desc);
>>> }
>>> +
>>> +#define DO_SZ_OP1(NAME, DSTTY, SRCTY) \
>>> +void HELPER(NAME)(void *d, void *a, uint32_t desc) \
>>> +{ \
>>> + intptr_t oprsz = simd_oprsz(desc); \
>>> + intptr_t elsz = oprsz/sizeof(DSTTY); \
>>> + intptr_t i; \
>>> + \
>>> + for (i = 0; i < elsz; ++i) { \
>>> + SRCTY aa = *((SRCTY *) a + i); \
>>> + *((DSTTY *) d + i) = aa; \
>>> + } \
>>> + clear_high(d, oprsz, desc); \
>>
>> This formulation is not valid.
>>
>> (1) Generic forms must *always* operate strictly on columns. This
>> formulation is either expanding a narrow vector to a wider vector or
>> compressing a wider vector to a narrow vector.
>>
>> (2) This takes no care for byte ordering of the data between columns. This
>> is where sticking strictly to columns helps, in that we can assume that data
>> is host-endian *within the column*, but we cannot assume anything about the
>> element indexing of ptr + i.
>
> Concerning (1) and (2), is this a limitation imposed on generic vector
> ops. to simplify mapping to host vector instructions where
> padding/alignment of elements might differ? From my understanding, the
> helper above should be fine since we can assume contiguous elements?

This is a limitation imposed on generic vector ops, because different target/arch/
represent their vectors in different ways.

For instance, Arm and RISC-V chunk the vector into host-endian uint64_t, with the chunks
indexed little-endian. But PPC puts the entire 128-bit vector in host-endian bit
ordering, so the uint64_t chunks are host-endian.

On a big-endian host, ptr+1 may be addressing element i-1 or i-7 instead of i+1.

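To make the indexing problem concrete, here is a minimal sketch (not code from QEMU or from the patch) of where a logical element ends up in memory for the "host-endian uint64_t chunks, chunks indexed little-endian" layout described above. The helper names chunked_elem_offset and chunked_u32_index are hypothetical, and the host endianness is passed in as a parameter purely so that both cases can be examined on any machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a QEMU function): byte offset, from the start
 * of the vector, of logical element i of size esz bytes, for a layout made
 * of host-endian uint64_t chunks with the chunks indexed little-endian.
 */
static size_t chunked_elem_offset(size_t i, size_t esz, bool host_big_endian)
{
    assert(esz == 1 || esz == 2 || esz == 4 || esz == 8);

    size_t chunk = (i * esz) / 8;   /* which uint64_t chunk holds element i */
    size_t within = (i * esz) % 8;  /* offset inside the chunk, LE host */

    if (host_big_endian) {
        /* On a BE host the least significant element sits at the top. */
        within = 8 - esz - within;
    }
    return chunk * 8 + within;
}

/* Index idx such that (uint32_t *)ptr + idx reaches logical element i. */
static size_t chunked_u32_index(size_t i, bool host_big_endian)
{
    return chunked_elem_offset(i, 4, host_big_endian) / 4;
}
```

Under these assumptions, logical 32-bit elements 0, 1, 2, 3 sit at uint32_t indices 1, 0, 3, 2 on a big-endian host, so a plain ptr + i loop that also changes the element size pairs up the wrong source and destination elements.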
> I see, I don't think we can make this work for Hexagon vector ops., as
> an example consider V6_vadduwsat which performs an unsigned saturated
> add of 32-bit elements, currently we emit
>
> void emit_V6_vadduwsat(intptr_t vec2, intptr_t vec7, intptr_t vec6) {
> VectorMem mem = {0};
> intptr_t vec5 = temp_new_gvec(&mem, 256);
> tcg_gen_gvec_zext(MO_64, MO_32, vec5, vec7, 256, 128, 256);
>
> intptr_t vec1 = temp_new_gvec(&mem, 256);
> tcg_gen_gvec_zext(MO_64, MO_32, vec1, vec6, 256, 128, 256);
>
> tcg_gen_gvec_add(MO_64, vec1, vec1, vec5, 256, 256);
>
> intptr_t vec3 = temp_new_gvec(&mem, 256);
> tcg_gen_gvec_dup_imm(MO_64, vec3, 256, 256, 4294967295ull);
>
> tcg_gen_gvec_umin(MO_64, vec1, vec1, vec3, 256, 256);
>
> tcg_gen_gvec_trunc(MO_32, MO_64, vec2, vec1, 128, 256, 128);
> }
>
> so we really do rely on the size-changing property of zext here, the
> input vectors are 128-byte and we expand them to 256-byte. We could
> expand vector operations within the instruction to the largest vector
> size, but would need to zext and trunc to destination and source
> registers anyway.

Yes, well, this is the output of llvm though, yes?

Did you forget to describe TCG's native saturating operations to the compiler?
tcg_gen_gvec_usadd performs exactly this operation.

And if you'd like to improve llvm, usadd(a, b) equals umin(a, ~b) + b.
Fewer operations without having to change vector sizes.

Similarly for unsigned saturating subtract: ussub(a, b) equals umax(a, b) - b.


r~
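
The two identities above can be checked on scalars. The following is a minimal sketch on uint32_t, with hypothetical function names that are not TCG APIs; usadd32_widening mirrors the zext / add / umin / trunc sequence from the quoted emit_V6_vadduwsat and serves only as a reference.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t umin32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t umax32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* usadd(a, b) = umin(a, ~b) + b; ~b is the headroom left above b. */
static uint32_t usadd32(uint32_t a, uint32_t b)
{
    return umin32(a, ~b) + b;
}

/* ussub(a, b) = umax(a, b) - b; the result is 0 whenever a <= b. */
static uint32_t ussub32(uint32_t a, uint32_t b)
{
    return umax32(a, b) - b;
}

/* Reference: widen to 64 bits, add, clamp to UINT32_MAX, truncate. */
static uint32_t usadd32_widening(uint32_t a, uint32_t b)
{
    uint64_t sum = (uint64_t)a + b;
    return sum > UINT32_MAX ? UINT32_MAX : (uint32_t)sum;
}

/* Check that the same-width form agrees with the widening reference. */
static void check_usadd(uint32_t a, uint32_t b)
{
    assert(usadd32(a, b) == usadd32_widening(a, b));
}
```

Because umin(a, ~b) never exceeds UINT32_MAX - b, the final addition cannot wrap, which is why no wider intermediate type is needed.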