From: Will Deacon <will.deacon@arm.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: liuyun01 <liuyun01@kylinos.cn>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
linux-block@vger.kernel.org
Subject: Re: [PATCH v3 2/2] arm64: crypto: add NEON accelerated XOR implementation
Date: Tue, 27 Nov 2018 18:03:25 +0000
Message-ID: <20181127180325.GA19216@arm.com>
In-Reply-To: <CAKv+Gu_Jh356UqQcfVOQvuBUpw3z2ihTv-gpLbeUaFium-wVPA@mail.gmail.com>

On Tue, Nov 27, 2018 at 01:46:48PM +0100, Ard Biesheuvel wrote:
> (add maintainers back to cc)
>
> On Tue, 27 Nov 2018 at 12:49, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> >
> > On Tue, 27 Nov 2018 at 11:10, Jackie Liu <liuyun01@kylinos.cn> wrote:
> > >
> > > This is a NEON acceleration method that can improve
> > > performance by approximately 20%. I got the following
> > > data from CentOS 7.5 on Huawei's HISI1616 chip:
> > >
> > > [ 93.837726] xor: measuring software checksum speed
> > > [ 93.874039] 8regs : 7123.200 MB/sec
> > > [ 93.914038] 32regs : 7180.300 MB/sec
> > > [ 93.954043] arm64_neon: 9856.000 MB/sec
> >
> > That looks more like 37% to me
> >
> > Note that Cortex-A57 gives me
> >
> > [ 0.111543] xor: measuring software checksum speed
> > [ 0.154874] 8regs : 3782.000 MB/sec
> > [ 0.195069] 32regs : 6095.000 MB/sec
> > [ 0.235145] arm64_neon: 5924.000 MB/sec
> > [ 0.236942] xor: using function: 32regs (6095.000 MB/sec)
> >
> > so we fall back to the scalar code, which is fine.
> >
> > > [ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec)
> > >
> > > I believe this code can bring some optimization for
> > > all arm64 platforms.
> > >
> > > That is patch version 3. Thanks for Ard Biesheuvel's
> > > suggestions.
> > >
> > > Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> >
> > Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >
>
> This goes with v4 of the NEON intrinsics patch.
>
> Jackie: no need to resend these, but next time, please repost the
> series entirely, not just a single patch, and keep the maintainers on
> cc.

Actually, it would be helpful if they were resent since I'm currently CC'd
on a v4 1/1 and a v3 2/2 and don't really know what I'm supposed to do with
them :)

Will
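
The 37% figure and the fallback to 32regs on Cortex-A57 quoted above can be checked by hand. The following is only a rough sketch, not the kernel's calibration code: the helper names speedup_percent and pick_fastest are made up for illustration, and it assumes the benchmark simply keeps whichever routine reports the highest throughput, which is what the "xor: using function" log lines suggest.

```python
# Throughput figures (MB/sec) copied from the two dmesg excerpts in the thread.
hisi1616 = {"8regs": 7123.2, "32regs": 7180.3, "arm64_neon": 9856.0}
cortex_a57 = {"8regs": 3782.0, "32regs": 6095.0, "arm64_neon": 5924.0}


def speedup_percent(baseline, candidate):
    """Relative gain of candidate over baseline, in percent (hypothetical helper)."""
    return (candidate / baseline - 1.0) * 100.0


def pick_fastest(results):
    """Return (name, MB/sec) of the fastest routine.

    Assumption: highest measured throughput wins, matching the
    'xor: using function' lines in the logs.
    """
    return max(results.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # NEON compared against the best scalar routine on the HISI1616.
    best_scalar = max(hisi1616["8regs"], hisi1616["32regs"])
    gain = speedup_percent(best_scalar, hisi1616["arm64_neon"])
    print(f"HISI1616 NEON gain over best scalar: {gain:.1f}%")
    print("HISI1616 picks:", pick_fastest(hisi1616)[0])
    print("Cortex-A57 picks:", pick_fastest(cortex_a57)[0])
```

With these numbers the gain on the HISI1616 works out to a little over 37%, and the NEON routine loses to 32regs on Cortex-A57, consistent with both observations in the quoted mail.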
Thread overview: 26+ messages
2018-11-27 10:08 [PATCH v3 1/2] arm64/neon: add workaround for ambiguous C99 stdint.h types Jackie Liu
2018-11-27 10:08 ` Jackie Liu
2018-11-27 10:08 ` [PATCH v3 2/2] arm64: crypto: add NEON accelerated XOR implementation Jackie Liu
2018-11-27 10:08 ` Jackie Liu
2018-11-27 11:49 ` Ard Biesheuvel
2018-11-27 11:49 ` Ard Biesheuvel
2018-11-27 12:33 ` JackieLiu
2018-11-27 12:33 ` JackieLiu
2018-11-27 12:46 ` Ard Biesheuvel
2018-11-27 12:46 ` Ard Biesheuvel
2018-11-27 12:52 ` JackieLiu
2018-11-27 12:52 ` JackieLiu
2018-11-27 18:03 ` Will Deacon [this message]
2018-11-27 18:03 ` Will Deacon
2018-11-29 17:00 ` Dave Martin
2018-11-29 17:00 ` Dave Martin
2018-11-29 18:09 ` Ard Biesheuvel
2018-11-29 18:09 ` Ard Biesheuvel
2018-11-29 18:20 ` Dave Martin
2018-11-29 18:20 ` Dave Martin
2018-11-30 1:15 ` JackieLiu
2018-11-30 1:15 ` JackieLiu
2018-11-27 11:42 ` [PATCH v3 1/2] arm64/neon: add workaround for ambiguous C99 stdint.h types Ard Biesheuvel
2018-11-27 11:42 ` Ard Biesheuvel
2018-11-29 16:55 ` Dave Martin
2018-11-29 16:55 ` Dave Martin