* [PATCH] Replace strcmp_icase with strequal_icase @ 2013-03-09 8:42 Fredrik Gustafsson 2013-03-09 8:57 ` Fredrik Gustafsson ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Fredrik Gustafsson @ 2013-03-09 8:42 UTC (permalink / raw) To: gitster; +Cc: git, iveqy, pclouds To improve performance. git status before: user 0m0.020s user 0m0.024s user 0m0.024s user 0m0.020s user 0m0.024s user 0m0.028s user 0m0.024s user 0m0.024s user 0m0.016s user 0m0.028s git status after: user 0m0.012s user 0m0.008s user 0m0.008s user 0m0.008s user 0m0.008s user 0m0.008s user 0m0.008s user 0m0.004s user 0m0.008s user 0m0.016s Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com> --- dir.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/dir.c b/dir.c index 57394e4..2b801e8 100644 --- a/dir.c +++ b/dir.c @@ -37,6 +37,17 @@ int fnmatch_icase(const char *pattern, const char *string, int flags) return fnmatch(pattern, string, flags | (ignore_case ? FNM_CASEFOLD : 0)); } +int strequal_icase(const char *first, const char *second) +{ + while (*first && *second) { + if( toupper(*first) != toupper(*second)) + break; + first++; + second++; + } + return toupper(*first) == toupper(*second); +} + inline int git_fnmatch(const char *pattern, const char *string, int flags, int prefix) { @@ -626,11 +637,11 @@ int match_basename(const char *basename, int basenamelen, int flags) { if (prefix == patternlen) { - if (!strcmp_icase(pattern, basename)) + if (!strequal_icase(pattern, basename)) return 1; } else if (flags & EXC_FLAG_ENDSWITH) { if (patternlen - 1 <= basenamelen && - !strcmp_icase(pattern + 1, + !strequal_icase(pattern + 1, basename + basenamelen - patternlen + 1)) return 1; } else { @@ -663,7 +674,7 @@ int match_pathname(const char *pathname, int pathlen, */ if (pathlen < baselen + 1 || (baselen && pathname[baselen] != '/') || - strncmp_icase(pathname, base, baselen)) + strequal_icase(pathname, base)) return 0; namelen = baselen ? pathlen - baselen - 1 : pathlen; -- 1.8.1.5 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson @ 2013-03-09 8:57 ` Fredrik Gustafsson 2013-03-09 10:21 ` Duy Nguyen 2013-03-09 10:40 ` Duy Nguyen 2 siblings, 0 replies; 9+ messages in thread From: Fredrik Gustafsson @ 2013-03-09 8:57 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: gitster, git, pclouds Please ignore last e-mail. Sorry for the disturbance. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson 2013-03-09 8:57 ` Fredrik Gustafsson @ 2013-03-09 10:21 ` Duy Nguyen 2013-03-09 10:40 ` Duy Nguyen 2 siblings, 0 replies; 9+ messages in thread From: Duy Nguyen @ 2013-03-09 10:21 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: gitster, git On Sat, Mar 09, 2013 at 09:42:54AM +0100, Fredrik Gustafsson wrote: > To improve performance. > git status before: > user 0m0.020s > user 0m0.024s > user 0m0.024s > user 0m0.020s > user 0m0.024s > user 0m0.028s > user 0m0.024s > user 0m0.024s > user 0m0.016s > user 0m0.028s > > git status after: > user 0m0.012s > user 0m0.008s > user 0m0.008s > user 0m0.008s > user 0m0.008s > user 0m0.008s > user 0m0.008s > user 0m0.004s > user 0m0.008s > user 0m0.016s I tested a slightly different version that checks ignore_case, inlines if possible and replaces one more strncmp_icase call site (the top call site in webkit.git). The numbers are impressive (well not as impressive as yours, but I guess it depends on the actual .gitignore patterns). On top of my 3/3 before after user 0m0.508s 0m0.392s user 0m0.511s 0m0.394s user 0m0.513s 0m0.405s user 0m0.516s 0m0.407s user 0m0.516s 0m0.407s user 0m0.518s 0m0.410s user 0m0.519s 0m0.412s user 0m0.524s 0m0.415s user 0m0.527s 0m0.415s user 0m0.534s 0m0.417s I still need to run the test suite. Then maybe reroll my series with this. -- 8< -- diff --git a/dir.c b/dir.c index 2a91d14..6a9b4b7 100644 --- a/dir.c +++ b/dir.c @@ -21,6 +21,24 @@ static int read_directory_recursive(struct dir_struct *dir, const char *path, in int check_only, const struct path_simplify *simplify); static int get_dtype(struct dirent *de, const char *path, int len); +static inline strnequal_icase(const char *first, const char *second, int length) +{ + if (ignore_case) { + while (length && toupper(*first) == toupper(*second)) { + first++; + second++; + length--; + } + } else { + while (length && *first == *second) { + first++; + second++; + length--; + } + } + return length == 0; +} + inline int git_fnmatch(const char *pattern, const char *string, int flags, int prefix) { @@ -611,11 +629,11 @@ int match_basename(const char *basename, int basenamelen, { if (prefix == patternlen) { if (patternlen == basenamelen && - !strncmp_icase(pattern, basename, patternlen)) + strnequal_icase(pattern, basename, patternlen)) return 1; } else if (flags & EXC_FLAG_ENDSWITH) { if (patternlen - 1 <= basenamelen && - !strncmp_icase(pattern + 1, + strnequal_icase(pattern + 1, basename + basenamelen - patternlen + 1, patternlen - 1)) return 1; @@ -649,7 +667,7 @@ int match_pathname(const char *pathname, int pathlen, */ if (pathlen < baselen + 1 || (baselen && pathname[baselen] != '/') || - (baselen && strncmp_icase(pathname, base, baselen))) + (baselen && !strnequal_icase(pathname, base, baselen))) return 0; namelen = baselen ? pathlen - baselen - 1 : pathlen; @@ -663,7 +681,7 @@ int match_pathname(const char *pathname, int pathlen, if (prefix > namelen) return 0; - if (strncmp_icase(pattern, name, prefix)) + if (!strnequal_icase(pattern, name, prefix)) return 0; pattern += prefix; name += prefix; -- 8< -- ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson 2013-03-09 8:57 ` Fredrik Gustafsson 2013-03-09 10:21 ` Duy Nguyen @ 2013-03-09 10:40 ` Duy Nguyen 2013-03-09 10:54 ` Duy Nguyen 2 siblings, 1 reply; 9+ messages in thread From: Duy Nguyen @ 2013-03-09 10:40 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: gitster, git On Sat, Mar 9, 2013 at 3:42 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > To improve performance. BTW, by rolling our own string comparison, we may lose certain optimizations done by C library. In case of glibc, it may choose to run an sse4.2 version where 16 bytes are compared at a time. Maybe we encounter "string not equal" much often than "string equal" and such an optimization is unncessary, I don't know. Measured numbers say it's unncessary as my cpu supports sse4.2. -- Duy ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 10:40 ` Duy Nguyen @ 2013-03-09 10:54 ` Duy Nguyen 2013-03-09 11:08 ` Fredrik Gustafsson 0 siblings, 1 reply; 9+ messages in thread From: Duy Nguyen @ 2013-03-09 10:54 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: gitster, git On Sat, Mar 9, 2013 at 5:40 PM, Duy Nguyen <pclouds@gmail.com> wrote: > On Sat, Mar 9, 2013 at 3:42 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: >> To improve performance. > > BTW, by rolling our own string comparison, we may lose certain > optimizations done by C library. In case of glibc, it may choose to > run an sse4.2 version where 16 bytes are compared at a time. Maybe we > encounter "string not equal" much often than "string equal" and such > an optimization is unncessary, I don't know. Measured numbers say it's > unncessary as my cpu supports sse4.2. Another problem is locale. Git's toupper() does not care about locale, which should be fine in most cases. strcasecmp is locale-aware, our new str[n]equal_icase is not. It probably does not matter for (ascii-based) pathnames, I guess. core.ignorecase users, any comments? -- Duy ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 10:54 ` Duy Nguyen @ 2013-03-09 11:08 ` Fredrik Gustafsson 2013-03-09 12:05 ` Duy Nguyen 0 siblings, 1 reply; 9+ messages in thread From: Fredrik Gustafsson @ 2013-03-09 11:08 UTC (permalink / raw) To: Duy Nguyen; +Cc: Fredrik Gustafsson, gitster, git On Sat, Mar 09, 2013 at 05:54:45PM +0700, Duy Nguyen wrote: > On Sat, Mar 9, 2013 at 5:40 PM, Duy Nguyen <pclouds@gmail.com> wrote: > > On Sat, Mar 9, 2013 at 3:42 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > >> To improve performance. > > > > BTW, by rolling our own string comparison, we may lose certain > > optimizations done by C library. In case of glibc, it may choose to > > run an sse4.2 version where 16 bytes are compared at a time. Maybe we > > encounter "string not equal" much often than "string equal" and such > > an optimization is unncessary, I don't know. Measured numbers say it's > > unncessary as my cpu supports sse4.2. > > Another problem is locale. Git's toupper() does not care about locale, > which should be fine in most cases. strcasecmp is locale-aware, our > new str[n]equal_icase is not. It probably does not matter for > (ascii-based) pathnames, I guess. core.ignorecase users, any comments? > -- > Duy Actually when implemented a str[n]equal_icase that actually should work. I break the test suite when trying to replace strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't get any significant improvements. I like work in this area though, slow commit's are my worst git problem. I often have to wait 10s. for a commit to be calculated. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com From c5d1f436cdbe7b12c67e81cf1d2904d1fb2e9b6b Mon Sep 17 00:00:00 2001 From: Fredrik Gustafsson <iveqy@iveqy.com> Date: Sat, 9 Mar 2013 09:27:16 +0100 Subject: [PATCH] Replace strcmp_icase with strequal_icase To improve performance. git status before: user 0m0.020s user 0m0.024s user 0m0.024s user 0m0.020s user 0m0.024s user 0m0.028s user 0m0.024s user 0m0.024s user 0m0.016s user 0m0.028s git status after: wip Tried to replace strncmp_icase on line 711 in dir.c but then failed to run the testsuite. Did not got any relevant speed improvements of this. --- dir.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 files changed, 47 insertions(+), 2 deletions(-) diff --git a/dir.c b/dir.c index 57394e4..aace36a 100644 --- a/dir.c +++ b/dir.c @@ -37,6 +37,51 @@ int fnmatch_icase(const char *pattern, const char *string, int flags) return fnmatch(pattern, string, flags | (ignore_case ? FNM_CASEFOLD : 0)); } +int strnequal_icase(const char *first, const char *second, size_t count) +{ + if (ignore_case) { + while (*first && *second && count) { + if( toupper(*first) != toupper(*second)) + break; + first++; + second++; + count--; + } + return toupper(*first) == toupper(*second); + } else { + while (*first && *second && count) { + if( *first != *second) + break; + first++; + second++; + count--; + } + return *first == *second; + } + +} + +int strequal_icase(const char *first, const char *second) +{ + if (ignore_case) { + while (*first && *second) { + if( toupper(*first) != toupper(*second)) + break; + first++; + second++; + } + return toupper(*first) == toupper(*second); + } else { + while (*first && *second) { + if( *first != *second) + break; + first++; + second++; + } + return *first == *second; + } +} + inline int git_fnmatch(const char *pattern, const char *string, int flags, int prefix) { @@ -626,11 +671,11 @@ int match_basename(const char *basename, int basenamelen, int flags) { if (prefix == patternlen) { - if (!strcmp_icase(pattern, basename)) + if (strequal_icase(pattern, basename)) return 1; } else if (flags & EXC_FLAG_ENDSWITH) { if (patternlen - 1 <= basenamelen && - !strcmp_icase(pattern + 1, + strequal_icase(pattern + 1, basename + basenamelen - patternlen + 1)) return 1; } else { -- 1.7.2.5 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 11:08 ` Fredrik Gustafsson @ 2013-03-09 12:05 ` Duy Nguyen 2013-03-09 12:22 ` Fredrik Gustafsson 0 siblings, 1 reply; 9+ messages in thread From: Duy Nguyen @ 2013-03-09 12:05 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: gitster, git On Sat, Mar 9, 2013 at 6:08 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > Actually when implemented a str[n]equal_icase that actually should work. > I break the test suite when trying to replace > strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't > get any significant improvements. Hmm.. mine passed the test suite. > I like work in this area though, slow commit's are my worst git problem. > I often have to wait 10s. for a commit to be calculated. Personally I don't accept any often used git commands taking more than 1 second (in hot cache case). What commands do you use? What's the size of the repository in terms of tracked/untracked files? -- Duy ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 12:05 ` Duy Nguyen @ 2013-03-09 12:22 ` Fredrik Gustafsson 2013-03-09 12:40 ` Duy Nguyen 0 siblings, 1 reply; 9+ messages in thread From: Fredrik Gustafsson @ 2013-03-09 12:22 UTC (permalink / raw) To: Duy Nguyen; +Cc: Fredrik Gustafsson, gitster, git On Sat, Mar 09, 2013 at 07:05:37PM +0700, Duy Nguyen wrote: > On Sat, Mar 9, 2013 at 6:08 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > > Actually when implemented a str[n]equal_icase that actually should work. > > I break the test suite when trying to replace > > strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't > > get any significant improvements. > > Hmm.. mine passed the test suite. Using my patch or your own code? Maybe I just did something wrong. Could you see any improvements in speed? > > > I like work in this area though, slow commit's are my worst git problem. > > I often have to wait 10s. for a commit to be calculated. > > Personally I don't accept any often used git commands taking more than > 1 second (in hot cache case). What commands do you use? What's the > size of the repository in terms of tracked/untracked files? It's a small repository, 100 MB. However I have a slow hdd which is almost full. I often add one file and make an one-line change to an other file and then do a git commit -a. That will make git to look through the whole repo, which isn't in the kernel RAM cache but needs to be reed from the hdd. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] Replace strcmp_icase with strequal_icase 2013-03-09 12:22 ` Fredrik Gustafsson @ 2013-03-09 12:40 ` Duy Nguyen 0 siblings, 0 replies; 9+ messages in thread From: Duy Nguyen @ 2013-03-09 12:40 UTC (permalink / raw) To: Fredrik Gustafsson; +Cc: gitster, git On Sat, Mar 9, 2013 at 7:22 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > On Sat, Mar 09, 2013 at 07:05:37PM +0700, Duy Nguyen wrote: >> On Sat, Mar 9, 2013 at 6:08 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: >> > Actually when implemented a str[n]equal_icase that actually should work. >> > I break the test suite when trying to replace >> > strncmp_icase(pathname, base, baselen)) on line 711 in dir.c and I don't >> > get any significant improvements. >> >> Hmm.. mine passed the test suite. > > Using my patch or your own code? Maybe I just did something wrong. Could > you see any improvements in speed? It's the one I posted in [1] and yes it improves speed, numbers in [1]. [1] http://thread.gmane.org/gmane.comp.version-control.git/217712/focus=217724 >> > I like work in this area though, slow commit's are my worst git problem. >> > I often have to wait 10s. for a commit to be calculated. >> >> Personally I don't accept any often used git commands taking more than >> 1 second (in hot cache case). What commands do you use? What's the >> size of the repository in terms of tracked/untracked files? > > It's a small repository, 100 MB. However I have a slow hdd which is > almost full. I often add one file and make an one-line change to an > other file and then do a git commit -a. That will make git to look > through the whole repo, which isn't in the kernel RAM cache but needs to > be reed from the hdd. "commit -a" does not run exclude (what I'm improving here). It's probably stat problem. If you already know what files you have changed, "git add path..." then commit without -a might help. Or turn on core.ignorestat (read doc about it first). -- Duy ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-03-09 12:41 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-03-09 8:42 [PATCH] Replace strcmp_icase with strequal_icase Fredrik Gustafsson 2013-03-09 8:57 ` Fredrik Gustafsson 2013-03-09 10:21 ` Duy Nguyen 2013-03-09 10:40 ` Duy Nguyen 2013-03-09 10:54 ` Duy Nguyen 2013-03-09 11:08 ` Fredrik Gustafsson 2013-03-09 12:05 ` Duy Nguyen 2013-03-09 12:22 ` Fredrik Gustafsson 2013-03-09 12:40 ` Duy Nguyen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).