linux-fsdevel.vger.kernel.org archive mirror
From: "J. R. Okajima" <hooanon05@yahoo.co.jp>
To: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-arch@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: Big git diff speedup by avoiding x86 "fast string" memcmp
Date: Fri, 10 Dec 2010 23:23:17 +0900
Message-ID: <19324.1291990997@jrobl>
In-Reply-To: <20101209070938.GA3949@amd>


Nick Piggin:
> The standard memcmp function on a Westmere system shows up hot in
> profiles in the `git diff` workload (both parallel and single threaded),
> and it is likely due to the costs associated with trapping into
> microcode, and little opportunity to improve memory access (dentry
> name is not likely to take up more than a cacheline).

Let me make sure I understand.
What you are pointing out is
- asm("repe; cmpsb") may hold the CPU for a long time, and can be a
  hazard for scaling.
- by breaking it into pieces, the chances of scaling well will increase.
Right?
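
For concreteness, a byte-at-a-time loop of the kind being discussed
could look like the following. It is only a sketch and is not copied
from the patch: the name bytewise_differs() is made up here, and the
return value is assumed to be "zero if equal, non-zero otherwise" rather
than the ordering that memcmp() gives.

```c
#include <stddef.h>

/*
 * Hypothetical helper (name invented for illustration): compare two
 * buffers one byte at a time in plain C instead of "repe; cmpsb".
 * Returns 0 when the first 'count' bytes match, 1 at the first mismatch.
 * Unlike memcmp(), it says nothing about which buffer sorts first.
 */
static int bytewise_differs(const unsigned char *cs,
			    const unsigned char *ct, size_t count)
{
	while (count) {
		if (*cs != *ct)
			return 1;
		cs++;
		ct++;
		count--;
	}
	return 0;
}
```

Each iteration is an ordinary load, compare and branch, so the compiler
and the CPU can treat it like any other short loop.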

Anyway, this approach of replacing the smallest code with larger but
faster code is interesting.
How about mixing 'unsigned char *' and 'unsigned long *' when
referencing the given strings?
For example,

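#include <stddef.h>	/* size_t; in-kernel this comes from <linux/types.h> */

/*
 * Returns 0 if the first 'count' bytes are equal, non-zero otherwise.
 * Note: the long-sized loads assume unaligned access is acceptable,
 * which is true on x86 but not on every architecture.
 */
int f(const unsigned char *cs, const unsigned char *ct, size_t count)
{
	int ret;
	/* view the same position either as words or as bytes */
	union {
		const unsigned long *l;
		const unsigned char *c;
	} s, t;

/* this macro is essentially your dentry_memcmp() loop */
#define cmp(s, t, c, step)		      \
	do {				      \
		while ((c) >= (step)) {	      \
			ret = (*(s) != *(t)); \
			if (ret)	      \
				return ret;   \
			(s)++;		      \
			(t)++;		      \
			(c) -= (step);	      \
		}			      \
	} while (0)

	s.c = cs;
	t.c = ct;
	/* first compare as many whole longs as fit ... */
	cmp(s.l, t.l, count, sizeof(*s.l));
	/* ... then finish the remaining tail byte by byte */
	cmp(s.c, t.c, count, sizeof(*s.c));
	return 0;
#undef cmp
}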

What I am thinking here is,
- for a single load and compare, there is probably no difference in
  cost between 'char *' and 'long *'.
- obviously, stepping by sizeof(long) will reduce the number of
  iterations.
- but I am not sure whether the strings are generally longer than
  4 (or 8) bytes or not.
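
To put a number on "reduce the number of iterations", here is a small
sketch that follows the same two-phase idea as f() but counts how many
loop iterations it runs. Everything in it is an assumption made for
illustration: the name wordwise_differs(), the extra 'steps' argument,
and the use of memcpy() for the word loads (chosen so that unaligned
buffers are safe on any architecture, which f() does not attempt).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compare word by word, then finish the tail byte
 * by byte. Returns 0 if equal, 1 otherwise. '*steps' receives the number
 * of loop iterations executed; it exists only for this illustration.
 */
static int wordwise_differs(const unsigned char *cs, const unsigned char *ct,
			    size_t count, size_t *steps)
{
	unsigned long a, b;

	*steps = 0;
	/* phase 1: whole words, loaded via memcpy() to tolerate misalignment */
	while (count >= sizeof(a)) {
		memcpy(&a, cs, sizeof(a));
		memcpy(&b, ct, sizeof(b));
		(*steps)++;
		if (a != b)
			return 1;
		cs += sizeof(a);
		ct += sizeof(a);
		count -= sizeof(a);
	}
	/* phase 2: the remaining count % sizeof(long) bytes */
	while (count) {
		(*steps)++;
		if (*cs != *ct)
			return 1;
		cs++;
		ct++;
		count--;
	}
	return 0;
}
```

For two equal 11-byte strings this runs 1 + 3 = 4 iterations when long
is 8 bytes (2 + 3 = 5 when it is 4 bytes) instead of 11. A string
shorter than sizeof(long) never enters the first loop, so it gains
nothing, which is why the typical length matters.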


J. R. Okajima


Thread overview: 29+ messages
2010-12-09  7:09 Big git diff speedup by avoiding x86 "fast string" memcmp Nick Piggin
2010-12-09 13:37 ` Borislav Petkov
2010-12-10  2:38   ` Nick Piggin
2010-12-10  4:27 ` Nick Piggin
2010-12-10 14:23 ` J. R. Okajima [this message]
2010-12-13  1:45   ` Nick Piggin
2010-12-13  7:29     ` J. R. Okajima
2010-12-13  8:25       ` Nick Piggin
2010-12-14 19:01         ` J. R. Okajima
2010-12-15  4:06           ` Nick Piggin
2010-12-15  5:57             ` J. R. Okajima
2010-12-15 13:15             ` Boaz Harrosh
2010-12-15 18:00               ` David Miller
2010-12-16  9:53                 ` Boaz Harrosh
2010-12-16 13:13                   ` Nick Piggin
2010-12-16 14:03                     ` Boaz Harrosh
2010-12-16 14:15                       ` Nick Piggin
2010-12-16 16:51                   ` Linus Torvalds
2010-12-16 17:57                   ` David Miller
2010-12-15  4:38         ` Américo Wang
2010-12-15  5:54           ` Nick Piggin
2010-12-15  7:12             ` Linus Torvalds
2010-12-15 23:09 ` Tony Luck
2010-12-16  2:34   ` Nick Piggin
  -- strict thread matches above, loose matches on Subject: below --
2010-12-18 22:54 George Spelvin
2010-12-19 14:28 ` Boaz Harrosh
2010-12-19 15:46 ` Nick Piggin
2010-12-19 17:06   ` George Spelvin
2010-12-21  9:26     ` Nick Piggin
