Re: [PATCH 2/2] util/bufferiszero: Add simd acceleration for loongarch64

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Richard Henderson <richard.henderson@linaro.org>
To: maobibo <maobibo@loongson.cn>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Daniel P . Berrangé" <berrange@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH 2/2] util/bufferiszero: Add simd acceleration for loongarch64
Date: Wed, 5 Jun 2024 20:18:35 -0700	[thread overview]
Message-ID: <58ba9ea7-cc45-47d4-a278-3777b496cb44@linaro.org> (raw)
In-Reply-To: <7b4c6909-40e8-def7-03e8-18a3303295f1@loongson.cn>

On 6/5/24 19:30, maobibo wrote:
> 
> 
> On 2024/6/6 上午7:51, Richard Henderson wrote:
>> On 6/5/24 02:32, Bibo Mao wrote:
>>> Different gcc versions have different features, macro CONFIG_LSX_OPT
>>> and CONFIG_LASX_OPT is added here to detect whether gcc supports
>>> built-in lsx/lasx macro.
>>>
>>> Function buffer_zero_lsx() is added for 128bit simd fpu optimization,
>>> and function buffer_zero_lasx() is for 256bit simd fpu optimization.
>>>
>>> Loongarch gcc built-in lsx/lasx macro can be used only when compiler
>>> option -mlsx/-mlasx is added, and there is no separate compiler option
>>> for function only. So it is only in effect when qemu is compiled with
>>> parameter --extra-cflags="-mlasx"
>>>
>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>> ---
>>>   meson.build         |  11 +++++
>>>   util/bufferiszero.c | 103 ++++++++++++++++++++++++++++++++++++++++++++
>>>   2 files changed, 114 insertions(+)
>>>
>>> diff --git a/meson.build b/meson.build
>>> index 6386607144..29bc362d7a 100644
>>> --- a/meson.build
>>> +++ b/meson.build
>>> @@ -2855,6 +2855,17 @@ config_host_data.set('CONFIG_ARM_AES_BUILTIN', cc.compiles('''
>>>       void foo(uint8x16_t *p) { *p = vaesmcq_u8(*p); }
>>>     '''))
>>> +# For Loongarch64, detect if LSX/LASX are available.
>>> + config_host_data.set('CONFIG_LSX_OPT', cc.compiles('''
>>> +    #include "lsxintrin.h"
>>> +    int foo(__m128i v) { return __lsx_bz_v(v); }
>>> +  '''))
>>> +
>>> +config_host_data.set('CONFIG_LASX_OPT', cc.compiles('''
>>> +    #include "lasxintrin.h"
>>> +    int foo(__m256i v) { return __lasx_xbz_v(v); }
>>> +  '''))
>>
>> Both of these are introduced by gcc 14 and llvm 18, so I'm not certain of the utility of 
>> separate tests.  We might simplify this with
>>
>>    config_host_data.set('CONFIG_LSX_LASX_INTRIN_H',
>>      cc.has_header('lsxintrin.h') && cc.has_header('lasxintrin.h'))
>>
>>
>> As you say, these headers require vector instructions to be enabled at compile-time 
>> rather than detecting them at runtime.  This is a point where the compilers could be 
>> improved to support __attribute__((target("xyz"))) and the builtins with that.  The i386 
>> port does this, for instance.
>>
>> In the meantime, it means that you don't need a runtime test.  Similar to aarch64 and 
>> the use of __ARM_NEON as a compile-time test for simd support.  Perhaps
>>
>> #elif defined(CONFIG_LSX_LASX_INTRIN_H) && \
>>        (defined(__loongarch_sx) || defined(__loongarch_asx))
>> # ifdef __loongarch_sx
>>    ...
>> # endif
>> # ifdef __loongarch_asx
>>    ...
>> # endif
> Sure, will do in this way.
> And also there is runtime check coming from hwcap, such this:
> 
> unsigned info = cpuinfo_init();
>    if (info & CPUINFO_LASX)

static biz_accel_fn const accel_table[] = {
     buffer_is_zero_int_ge256,
#ifdef __loongarch_sx
     buffer_is_zero_lsx,
#endif
#ifdef __loongarch_asx
     buffer_is_zero_lasx,
#endif
};

static unsigned best_accel(void)
{
#ifdef __loongarch_asx
     /* lasx may be index 1 or 2, but always last */
     return ARRAY_SIZE(accel_table) - 1;
#else
     /* lsx is always index 1 */
     return 1;
#endif
}


r~

next prev parent reply	other threads:[~2024-06-06  3:19 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-05  9:32 [PATCH 0/2] Add simd optimization with function buffer_is_zero Bibo Mao
2024-06-05  9:32 ` [PATCH 1/2] util: Add lasx cpuinfo for loongarch64 Bibo Mao
2024-06-05 11:53   ` Philippe Mathieu-Daudé
2024-06-06  2:17     ` maobibo
2024-06-05  9:32 ` [PATCH 2/2] util/bufferiszero: Add simd acceleration " Bibo Mao
2024-06-05 23:51   ` Richard Henderson
2024-06-06  2:30     ` maobibo
2024-06-06  3:18       ` Richard Henderson [this message]
2024-06-06  3:27         ` Richard Henderson
2024-06-06  3:36           ` maobibo
2024-06-06  3:42             ` Richard Henderson
2024-06-06  4:00               ` maobibo
2024-06-07  0:25                 ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=58ba9ea7-cc45-47d4-a278-3777b496cb44@linaro.org \
    --to=richard.henderson@linaro.org \
    --cc=berrange@redhat.com \
    --cc=maobibo@loongson.cn \
    --cc=marcandre.lureau@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).