From: Cyril Bur <cyrilbur@gmail.com>
To: Simon Guo <wei.guo.simon@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org,
David Laight <David.Laight@ACULAB.COM>,
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Subject: Re: [PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision
Date: Tue, 26 Sep 2017 09:59:46 +1000 [thread overview]
Message-ID: <1506383986.2918.4.camel@gmail.com> (raw)
In-Reply-To: <20170923211843.GA10899@simonLocalRHEL7.x64>
On Sun, 2017-09-24 at 05:18 +0800, Simon Guo wrote:
> Hi Cyril,
> On Sat, Sep 23, 2017 at 12:06:48AM +1000, Cyril Bur wrote:
> > On Thu, 2017-09-21 at 07:34 +0800, wei.guo.simon@gmail.com wrote:
> > > From: Simon Guo <wei.guo.simon@gmail.com>
> > >
> > > This patch add VMX primitives to do memcmp() in case the compare size
> > > exceeds 4K bytes.
> > >
> >
> > Hi Simon,
> >
> > Sorry I didn't see this sooner, I've actually been working on a kernel
> > version of glibc commit dec4a7105e (powerpc: Improve memcmp performance
> > for POWER8) unfortunately I've been distracted and it still isn't done.
>
> Thanks for sync with me. Let's consolidate our effort together :)
>
> I have a quick check on glibc commit dec4a7105e.
> Looks the aligned case comparison with VSX is launched without rN size
> limitation, which means it will have a VSX reg load penalty even when the
> length is 9 bytes.
>
This was written for userspace which doesn't have to explicitly enable
VMX in order to use it - we need to be smarter in the kernel.
> It did some optimization when src/dest addrs don't have the same offset
> on 8 bytes alignment boundary. I need to read more closely.
>
> > I wonder if we can consolidate our efforts here. One thing I did come
> > across in my testing is that for memcmp() that will fail early (I
> > haven't narrowed down the the optimal number yet) the cost of enabling
> > VMX actually turns out to be a performance regression, as such I've
> > added a small check of the first 64 bytes to the start before enabling
> > VMX to ensure the penalty is worth taking.
>
> Will there still be a penalty if the 65th byte differs?
>
I haven't benchmarked it exactly, my rationale for 64 bytes was that it
is the stride of the vectorised copy loop so, if we know we'll fail
before even completing one iteration of the vectorized loop there isn't
any point using the vector regs.
> >
> > Also, you should consider doing 4K and greater, KSM (Kernel Samepage
> > Merging) uses PAGE_SIZE which can be as small as 4K.
>
> Currently the VMX will only be applied when size exceeds 4K. Are you
> suggesting a bigger threshold than 4K?
>
Equal to or greater than 4K, KSM will benefit.
> We can sync more offline for v3.
>
> Thanks,
> - Simon
next prev parent reply other threads:[~2017-09-25 23:59 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-20 23:34 [PATCH v2 0/3] powerpc/64: memcmp() optimization wei.guo.simon
2017-09-20 23:34 ` [PATCH v2 1/3] powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp() wei.guo.simon
2017-09-20 23:34 ` [PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision wei.guo.simon
2017-09-21 0:54 ` Simon Guo
2017-09-22 14:06 ` Cyril Bur
2017-09-23 21:18 ` Simon Guo
2017-09-25 23:59 ` Cyril Bur [this message]
2017-09-26 5:34 ` Michael Ellerman
2017-09-26 11:26 ` Segher Boessenkool
2017-09-27 3:38 ` Michael Ellerman
2017-09-27 9:27 ` Segher Boessenkool
2017-09-27 9:43 ` David Laight
2017-09-27 18:33 ` Simon Guo
2017-09-28 9:24 ` David Laight
2017-09-27 16:22 ` Simon Guo
2017-09-20 23:34 ` [PATCH v2 3/3] powerpc:selftest update memcmp_64 selftest for VMX implementation wei.guo.simon
2017-09-25 9:30 ` David Laight
2017-09-24 6:19 ` Simon Guo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1506383986.2918.4.camel@gmail.com \
--to=cyrilbur@gmail.com \
--cc=David.Laight@ACULAB.COM \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=naveen.n.rao@linux.vnet.ibm.com \
--cc=wei.guo.simon@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).