From: Nick Piggin <npiggin@gmail.com>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: David Miller <davem@davemloft.net>,
hooanon05@yahoo.co.jp, npiggin@kernel.dk,
torvalds@linux-foundation.org, linux-arch@vger.kernel.org,
x86@kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: Big git diff speedup by avoiding x86 "fast string" memcmp
Date: Fri, 17 Dec 2010 00:13:47 +1100 [thread overview]
Message-ID: <AANLkTin3v0BLYAuuqTgEzKgSOpVg4GJaJCEds=2ePrBH@mail.gmail.com> (raw)
In-Reply-To: <4D09E185.2040600@panasas.com>
On Thu, Dec 16, 2010 at 8:53 PM, Boaz Harrosh <bharrosh@panasas.com> wrote:
> On 12/15/2010 08:00 PM, David Miller wrote:
>> From: Boaz Harrosh <bharrosh@panasas.com>
>> Date: Wed, 15 Dec 2010 15:15:09 +0200
>>
>>> I agree that the byte-compare or long-compare should give you very close
>>> results in modern pipeline CPUs. But surly 12 increments-and-test should
>>> show up against 3 (or even 2). I would say it must be a better plan.
>>
>> For strings of these lengths the setup code necessary to initialize
>> the inner loop and the tail code to handle the sub-word ending cases
>> eliminate whatever gains there are.
>>
>
> You miss understood me. I'm saying that we know the beggining of the
> string is aligned and Nick offered to pad the last long, so surly
> a shift by 2 (or 3) + the reduction of the 12 dec-and-test to 3
> should give you an optimization?
Masking is still going to take a bit more code, but perhaps. We're talking
a handful of cycles here, so if you add a branch mispredict, or icache
miss, you'll kill your savings.
This is what I've got at the moment, which adds only 8 bytes over the
rep cmp variant, in the __d_lookup_rcu function.
static inline int dentry_memcmp(const unsigned char *cs,
const unsigned char *ct, size_t count)
{
int ret;
do {
ret = (*cs != *ct);
if (ret)
break;
cs++;
ct++;
count--;
} while (count);
return ret;
}
Now if we pad and zero fill the dentry name, then we can compare with
the path string, but we still have to mask that guy (unfortunately, I
didn't consider that earlier) so it's not trivial and adds quite a bit of code
size and branches:
static inline int dentry_memcmp_long(const unsigned char *cs,
const unsigned char *ct, ssize_t count)
{
const unsigned long *ls = (const unsigned long *)cs;
const unsigned long *lt = (const unsigned long *)ct;
int ret;
do {
unsigned long t = *lt;
unsigned long s = *ls;
int shift = 0;
if (count < 8)
shift = (8 - count) * 8;
t = t & (0xffffffffffffffff >> shift);
ret = (s != t);
if (ret)
break;
ls++;
lt++;
count-=8;
} while (count > 0);
return ret;
}
Something like this should work on little endian. You'd have to coax gcc to
generate a cmov to get rid of that branch I think, because it won't be
predictable for small string lengths. But then you have to think about
icache...
next prev parent reply other threads:[~2010-12-16 13:13 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-12-09 7:09 Big git diff speedup by avoiding x86 "fast string" memcmp Nick Piggin
2010-12-09 13:37 ` Borislav Petkov
2010-12-10 2:38 ` Nick Piggin
2010-12-10 4:27 ` Nick Piggin
2010-12-10 14:23 ` J. R. Okajima
2010-12-13 1:45 ` Nick Piggin
2010-12-13 7:29 ` J. R. Okajima
2010-12-13 8:25 ` Nick Piggin
2010-12-14 19:01 ` J. R. Okajima
2010-12-15 4:06 ` Nick Piggin
2010-12-15 5:57 ` J. R. Okajima
2010-12-15 13:15 ` Boaz Harrosh
2010-12-15 18:00 ` David Miller
2010-12-16 9:53 ` Boaz Harrosh
2010-12-16 13:13 ` Nick Piggin [this message]
2010-12-16 14:03 ` Boaz Harrosh
2010-12-16 14:15 ` Nick Piggin
2010-12-16 16:51 ` Linus Torvalds
2010-12-16 17:57 ` David Miller
2010-12-15 4:38 ` Américo Wang
2010-12-15 5:54 ` Nick Piggin
2010-12-15 7:12 ` Linus Torvalds
2010-12-15 23:09 ` Tony Luck
2010-12-16 2:34 ` Nick Piggin
-- strict thread matches above, loose matches on Subject: below --
2010-12-18 22:54 George Spelvin
2010-12-19 14:28 ` Boaz Harrosh
2010-12-19 15:46 ` Nick Piggin
2010-12-19 17:06 ` George Spelvin
2010-12-21 9:26 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='AANLkTin3v0BLYAuuqTgEzKgSOpVg4GJaJCEds=2ePrBH@mail.gmail.com' \
--to=npiggin@gmail.com \
--cc=bharrosh@panasas.com \
--cc=davem@davemloft.net \
--cc=hooanon05@yahoo.co.jp \
--cc=linux-arch@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=npiggin@kernel.dk \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).