* [PATCH] Replace strcmp_icase with strequal_icase
@ 2013-03-09 8:42 Fredrik Gustafsson
2013-03-09 8:57 ` Fredrik Gustafsson
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Fredrik Gustafsson @ 2013-03-09 8:42 UTC (permalink / raw)
To: gitster; +Cc: git, iveqy, pclouds
To improve performance.
git status before:
user 0m0.020s
user 0m0.024s
user 0m0.024s
user 0m0.020s
user 0m0.024s
user 0m0.028s
user 0m0.024s
user 0m0.024s
user 0m0.016s
user 0m0.028s
git status after:
user 0m0.012s
user 0m0.008s
user 0m0.008s
user 0m0.008s
user 0m0.008s
user 0m0.008s
user 0m0.008s
user 0m0.004s
user 0m0.008s
user 0m0.016s
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
---
dir.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/dir.c b/dir.c
index 57394e4..2b801e8 100644
--- a/dir.c
+++ b/dir.c
@@ -37,6 +37,17 @@ int fnmatch_icase(const char *pattern, const char *string, int flags)
return fnmatch(pattern, string, flags | (ignore_case ? FNM_CASEFOLD : 0));
}
+int strequal_icase(const char *first, const char *second)
+{
+ while (*first && *second) {
+ if( toupper(*first) != toupper(*second))
+ break;
+ first++;
+ second++;
+ }
+ return toupper(*first) == toupper(*second);
+}
+
inline int git_fnmatch(const char *pattern, const char *string,
int flags, int prefix)
{
@@ -626,11 +637,11 @@ int match_basename(const char *basename, int basenamelen,
int flags)
{
if (prefix == patternlen) {
- if (!strcmp_icase(pattern, basename))
+ if (!strequal_icase(pattern, basename))
return 1;
} else if (flags & EXC_FLAG_ENDSWITH) {
if (patternlen - 1 <= basenamelen &&
- !strcmp_icase(pattern + 1,
+ !strequal_icase(pattern + 1,
basename + basenamelen - patternlen + 1))
return 1;
} else {
@@ -663,7 +674,7 @@ int match_pathname(const char *pathname, int pathlen,
*/
if (pathlen < baselen + 1 ||
(baselen && pathname[baselen] != '/') ||
- strncmp_icase(pathname, base, baselen))
+ strequal_icase(pathname, base))
return 0;
namelen = baselen ? pathlen - baselen - 1 : pathlen;
--
1.8.1.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson
@ 2013-03-09 8:57 ` Fredrik Gustafsson
2013-03-09 10:21 ` Duy Nguyen
2013-03-09 10:40 ` Duy Nguyen
2 siblings, 0 replies; 9+ messages in thread
From: Fredrik Gustafsson @ 2013-03-09 8:57 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: gitster, git, pclouds
Please ignore last e-mail. Sorry for the disturbance.
--
Med vänliga hälsningar
Fredrik Gustafsson
tel: 0733-608274
e-post: iveqy@iveqy.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson
2013-03-09 8:57 ` Fredrik Gustafsson
@ 2013-03-09 10:21 ` Duy Nguyen
2013-03-09 10:40 ` Duy Nguyen
2 siblings, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2013-03-09 10:21 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: gitster, git
On Sat, Mar 09, 2013 at 09:42:54AM +0100, Fredrik Gustafsson wrote:
> To improve performance.
> git status before:
> user 0m0.020s
> user 0m0.024s
> user 0m0.024s
> user 0m0.020s
> user 0m0.024s
> user 0m0.028s
> user 0m0.024s
> user 0m0.024s
> user 0m0.016s
> user 0m0.028s
>
> git status after:
> user 0m0.012s
> user 0m0.008s
> user 0m0.008s
> user 0m0.008s
> user 0m0.008s
> user 0m0.008s
> user 0m0.008s
> user 0m0.004s
> user 0m0.008s
> user 0m0.016s
I tested a slightly different version that checks ignore_case, inlines
if possible and replaces one more strncmp_icase call site (the top
call site in webkit.git). The numbers are impressive (well not as
impressive as yours, but I guess it depends on the actual .gitignore
patterns). On top of my 3/3
before after
user 0m0.508s 0m0.392s
user 0m0.511s 0m0.394s
user 0m0.513s 0m0.405s
user 0m0.516s 0m0.407s
user 0m0.516s 0m0.407s
user 0m0.518s 0m0.410s
user 0m0.519s 0m0.412s
user 0m0.524s 0m0.415s
user 0m0.527s 0m0.415s
user 0m0.534s 0m0.417s
I still need to run the test suite. Then maybe reroll my series with
this.
-- 8< --
diff --git a/dir.c b/dir.c
index 2a91d14..6a9b4b7 100644
--- a/dir.c
+++ b/dir.c
@@ -21,6 +21,24 @@ static int read_directory_recursive(struct dir_struct *dir, const char *path, in
int check_only, const struct path_simplify *simplify);
static int get_dtype(struct dirent *de, const char *path, int len);
+static inline strnequal_icase(const char *first, const char *second, int length)
+{
+ if (ignore_case) {
+ while (length && toupper(*first) == toupper(*second)) {
+ first++;
+ second++;
+ length--;
+ }
+ } else {
+ while (length && *first == *second) {
+ first++;
+ second++;
+ length--;
+ }
+ }
+ return length == 0;
+}
+
inline int git_fnmatch(const char *pattern, const char *string,
int flags, int prefix)
{
@@ -611,11 +629,11 @@ int match_basename(const char *basename, int basenamelen,
{
if (prefix == patternlen) {
if (patternlen == basenamelen &&
- !strncmp_icase(pattern, basename, patternlen))
+ strnequal_icase(pattern, basename, patternlen))
return 1;
} else if (flags & EXC_FLAG_ENDSWITH) {
if (patternlen - 1 <= basenamelen &&
- !strncmp_icase(pattern + 1,
+ strnequal_icase(pattern + 1,
basename + basenamelen - patternlen + 1,
patternlen - 1))
return 1;
@@ -649,7 +667,7 @@ int match_pathname(const char *pathname, int pathlen,
*/
if (pathlen < baselen + 1 ||
(baselen && pathname[baselen] != '/') ||
- (baselen && strncmp_icase(pathname, base, baselen)))
+ (baselen && !strnequal_icase(pathname, base, baselen)))
return 0;
namelen = baselen ? pathlen - baselen - 1 : pathlen;
@@ -663,7 +681,7 @@ int match_pathname(const char *pathname, int pathlen,
if (prefix > namelen)
return 0;
- if (strncmp_icase(pattern, name, prefix))
+ if (!strnequal_icase(pattern, name, prefix))
return 0;
pattern += prefix;
name += prefix;
-- 8< --
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson
2013-03-09 8:57 ` Fredrik Gustafsson
2013-03-09 10:21 ` Duy Nguyen
@ 2013-03-09 10:40 ` Duy Nguyen
2013-03-09 10:54 ` Duy Nguyen
2 siblings, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2013-03-09 10:40 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: gitster, git
On Sat, Mar 9, 2013 at 3:42 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> To improve performance.
BTW, by rolling our own string comparison, we may lose certain
optimizations done by C library. In case of glibc, it may choose to
run an sse4.2 version where 16 bytes are compared at a time. Maybe we
encounter "string not equal" much often than "string equal" and such
an optimization is unncessary, I don't know. Measured numbers say it's
unncessary as my cpu supports sse4.2.
--
Duy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 10:40 ` Duy Nguyen
@ 2013-03-09 10:54 ` Duy Nguyen
2013-03-09 11:08 ` Fredrik Gustafsson
0 siblings, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2013-03-09 10:54 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: gitster, git
On Sat, Mar 9, 2013 at 5:40 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Mar 9, 2013 at 3:42 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
>> To improve performance.
>
> BTW, by rolling our own string comparison, we may lose certain
> optimizations done by C library. In case of glibc, it may choose to
> run an sse4.2 version where 16 bytes are compared at a time. Maybe we
> encounter "string not equal" much often than "string equal" and such
> an optimization is unncessary, I don't know. Measured numbers say it's
> unncessary as my cpu supports sse4.2.
Another problem is locale. Git's toupper() does not care about locale,
which should be fine in most cases. strcasecmp is locale-aware, our
new str[n]equal_icase is not. It probably does not matter for
(ascii-based) pathnames, I guess. core.ignorecase users, any comments?
--
Duy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 10:54 ` Duy Nguyen
@ 2013-03-09 11:08 ` Fredrik Gustafsson
2013-03-09 12:05 ` Duy Nguyen
0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Gustafsson @ 2013-03-09 11:08 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Fredrik Gustafsson, gitster, git
On Sat, Mar 09, 2013 at 05:54:45PM +0700, Duy Nguyen wrote:
> On Sat, Mar 9, 2013 at 5:40 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> > On Sat, Mar 9, 2013 at 3:42 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> >> To improve performance.
> >
> > BTW, by rolling our own string comparison, we may lose certain
> > optimizations done by C library. In case of glibc, it may choose to
> > run an sse4.2 version where 16 bytes are compared at a time. Maybe we
> > encounter "string not equal" much often than "string equal" and such
> > an optimization is unncessary, I don't know. Measured numbers say it's
> > unncessary as my cpu supports sse4.2.
>
> Another problem is locale. Git's toupper() does not care about locale,
> which should be fine in most cases. strcasecmp is locale-aware, our
> new str[n]equal_icase is not. It probably does not matter for
> (ascii-based) pathnames, I guess. core.ignorecase users, any comments?
> --
> Duy
Actually when implemented a str[n]equal_icase that actually should work.
I break the test suite when trying to replace
strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't
get any significant improvements.
I like work in this area though, slow commit's are my worst git problem.
I often have to wait 10s. for a commit to be calculated.
--
Med vänliga hälsningar
Fredrik Gustafsson
tel: 0733-608274
e-post: iveqy@iveqy.com
From c5d1f436cdbe7b12c67e81cf1d2904d1fb2e9b6b Mon Sep 17 00:00:00 2001
From: Fredrik Gustafsson <iveqy@iveqy.com>
Date: Sat, 9 Mar 2013 09:27:16 +0100
Subject: [PATCH] Replace strcmp_icase with strequal_icase
To improve performance.
git status before:
user 0m0.020s
user 0m0.024s
user 0m0.024s
user 0m0.020s
user 0m0.024s
user 0m0.028s
user 0m0.024s
user 0m0.024s
user 0m0.016s
user 0m0.028s
git status after:
wip
Tried to replace strncmp_icase on line 711 in dir.c but then failed to
run the testsuite. Did not got any relevant speed improvements of this.
---
dir.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
1 files changed, 47 insertions(+), 2 deletions(-)
diff --git a/dir.c b/dir.c
index 57394e4..aace36a 100644
--- a/dir.c
+++ b/dir.c
@@ -37,6 +37,51 @@ int fnmatch_icase(const char *pattern, const char *string, int flags)
return fnmatch(pattern, string, flags | (ignore_case ? FNM_CASEFOLD : 0));
}
+int strnequal_icase(const char *first, const char *second, size_t count)
+{
+ if (ignore_case) {
+ while (*first && *second && count) {
+ if( toupper(*first) != toupper(*second))
+ break;
+ first++;
+ second++;
+ count--;
+ }
+ return toupper(*first) == toupper(*second);
+ } else {
+ while (*first && *second && count) {
+ if( *first != *second)
+ break;
+ first++;
+ second++;
+ count--;
+ }
+ return *first == *second;
+ }
+
+}
+
+int strequal_icase(const char *first, const char *second)
+{
+ if (ignore_case) {
+ while (*first && *second) {
+ if( toupper(*first) != toupper(*second))
+ break;
+ first++;
+ second++;
+ }
+ return toupper(*first) == toupper(*second);
+ } else {
+ while (*first && *second) {
+ if( *first != *second)
+ break;
+ first++;
+ second++;
+ }
+ return *first == *second;
+ }
+}
+
inline int git_fnmatch(const char *pattern, const char *string,
int flags, int prefix)
{
@@ -626,11 +671,11 @@ int match_basename(const char *basename, int basenamelen,
int flags)
{
if (prefix == patternlen) {
- if (!strcmp_icase(pattern, basename))
+ if (strequal_icase(pattern, basename))
return 1;
} else if (flags & EXC_FLAG_ENDSWITH) {
if (patternlen - 1 <= basenamelen &&
- !strcmp_icase(pattern + 1,
+ strequal_icase(pattern + 1,
basename + basenamelen - patternlen + 1))
return 1;
} else {
--
1.7.2.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 11:08 ` Fredrik Gustafsson
@ 2013-03-09 12:05 ` Duy Nguyen
2013-03-09 12:22 ` Fredrik Gustafsson
0 siblings, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2013-03-09 12:05 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: gitster, git
On Sat, Mar 9, 2013 at 6:08 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> Actually when implemented a str[n]equal_icase that actually should work.
> I break the test suite when trying to replace
> strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't
> get any significant improvements.
Hmm.. mine passed the test suite.
> I like work in this area though, slow commit's are my worst git problem.
> I often have to wait 10s. for a commit to be calculated.
Personally I don't accept any often used git commands taking more than
1 second (in hot cache case). What commands do you use? What's the
size of the repository in terms of tracked/untracked files?
--
Duy
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 12:05 ` Duy Nguyen
@ 2013-03-09 12:22 ` Fredrik Gustafsson
2013-03-09 12:40 ` Duy Nguyen
0 siblings, 1 reply; 9+ messages in thread
From: Fredrik Gustafsson @ 2013-03-09 12:22 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Fredrik Gustafsson, gitster, git
On Sat, Mar 09, 2013 at 07:05:37PM +0700, Duy Nguyen wrote:
> On Sat, Mar 9, 2013 at 6:08 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> > Actually when implemented a str[n]equal_icase that actually should work.
> > I break the test suite when trying to replace
> > strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't
> > get any significant improvements.
>
> Hmm.. mine passed the test suite.
Using my patch or your own code? Maybe I just did something wrong. Could
you see any improvements in speed?
>
> > I like work in this area though, slow commit's are my worst git problem.
> > I often have to wait 10s. for a commit to be calculated.
>
> Personally I don't accept any often used git commands taking more than
> 1 second (in hot cache case). What commands do you use? What's the
> size of the repository in terms of tracked/untracked files?
It's a small repository, 100 MB. However I have a slow hdd which is
almost full. I often add one file and make an one-line change to an
other file and then do a git commit -a. That will make git to look
through the whole repo, which isn't in the kernel RAM cache but needs to
be reed from the hdd.
--
Med vänliga hälsningar
Fredrik Gustafsson
tel: 0733-608274
e-post: iveqy@iveqy.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase
2013-03-09 12:22 ` Fredrik Gustafsson
@ 2013-03-09 12:40 ` Duy Nguyen
0 siblings, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2013-03-09 12:40 UTC (permalink / raw)
To: Fredrik Gustafsson; +Cc: gitster, git
On Sat, Mar 9, 2013 at 7:22 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
> On Sat, Mar 09, 2013 at 07:05:37PM +0700, Duy Nguyen wrote:
>> On Sat, Mar 9, 2013 at 6:08 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote:
>> > Actually when implemented a str[n]equal_icase that actually should work.
>> > I break the test suite when trying to replace
>> > strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't
>> > get any significant improvements.
>>
>> Hmm.. mine passed the test suite.
>
> Using my patch or your own code? Maybe I just did something wrong. Could
> you see any improvements in speed?
It's the one I posted in [1] and yes it improves speed, numbers in [1].
[1] http://thread.gmane.org/gmane.comp.version-control.git/217712/focus=217724
>> > I like work in this area though, slow commit's are my worst git problem.
>> > I often have to wait 10s. for a commit to be calculated.
>>
>> Personally I don't accept any often used git commands taking more than
>> 1 second (in hot cache case). What commands do you use? What's the
>> size of the repository in terms of tracked/untracked files?
>
> It's a small repository, 100 MB. However I have a slow hdd which is
> almost full. I often add one file and make an one-line change to an
> other file and then do a git commit -a. That will make git to look
> through the whole repo, which isn't in the kernel RAM cache but needs to
> be reed from the hdd.
"commit -a" does not run exclude (what I'm improving here). It's
probably stat problem. If you already know what files you have
changed, "git add path..." then commit without -a might help. Or turn
on core.ignorestat (read doc about it first).
--
Duy
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-03-09 12:41 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson
2013-03-09 8:57 ` Fredrik Gustafsson
2013-03-09 10:21 ` Duy Nguyen
2013-03-09 10:40 ` Duy Nguyen
2013-03-09 10:54 ` Duy Nguyen
2013-03-09 11:08 ` Fredrik Gustafsson
2013-03-09 12:05 ` Duy Nguyen
2013-03-09 12:22 ` Fredrik Gustafsson
2013-03-09 12:40 ` Duy Nguyen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).