From: Karsten Blees <karsten.blees@gmail.com>
To: pro-logic <pro-logic@optusnet.com.au>, msysgit@googlegroups.com
Cc: Sebastian Schuberth <sschuberth@gmail.com>,
	git@vger.kernel.org,  szager@google.com
Subject: Re: Re: Windows performance / threading file access
Date: Tue, 22 Oct 2013 16:30:39 +0200
Message-ID: <52668C0F.9050702@gmail.com>
In-Reply-To: <49cde110-f3e5-43d9-b399-6b5a6ce59014@googlegroups.com>

Am 22.10.2013 00:58, schrieb pro-logic:
>> The trace_performance functions require manual instrumentation of
>> the code sections you want to measure
> Ahh a case of RTFM :)
> 
>> Could you post details about your test setup? Are you still using
>> WebKit for your tests?
> I'm on Win7 x64, Core i5 M560, WD 7200 Laptop HDD, NTFS, no virus
> scanner, truecrypt, no defragger.
> 

OK, so truecrypt and luafv may screw things up for you (according to my measurements, luafv roughly doubles lstat times on C:).

> I've tried to be a bit smarter with the intent of my code, and this
> is what I came up with.
> 
> diff --git a/cache.h b/cache.h
> index 4bf19e3..2e9fb1f 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -294,7 +294,7 @@ extern void free_name_hash(struct index_state *istate);
>  #define active_cache_changed (the_index.cache_changed)
>  #define active_cache_tree (the_index.cache_tree)
>  
> -#define read_cache() read_index(&the_index)
> +#define read_cache() read_index_preload(&the_index, NULL)
>  #define read_cache_from(path) read_index_from(&the_index, (path))
>  #define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
>  #define is_cache_unborn() is_index_unborn(&the_index)
> diff --git a/read-cache.c b/read-cache.c
> index c3d5e35..5fb2788 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1866,7 +1866,7 @@ int read_index_unmerged(struct index_state *istate)
>  	int i;
>  	int unmerged = 0;
>  
> -	read_index(istate);
> +	read_index_preload(istate, NULL);
>  	for (i = 0; i < istate->cache_nr; i++) {
>  		struct cache_entry *ce = istate->cache[i];
>  		struct cache_entry *new_ce;
> -- 
> 

Ahh, I thought that you had enabled fscache during the entire checkout.

> Interestingly, when I run on a cleanly checked out blink repo my
> changes seem to make performance worse, but on a repo with ignored
> files in it they seem to work better. So, as a point of comparison,
> I decided to run on a repo with ignored files in the working tree,
> in this case msysgit/git after a 'make install'. When I get a few
> hours I'll try to build blink and re-run the numbers on a much, much
> larger repo.
> 
> This comparison is an average of 3 cold cache runs of the
> kb/fscache-v4 [a] vs kb/fscache-v4 with my above changes applied [b],
> with preloadindex and fscache set to true.
> 
> For comparison
> git status -s
> [a] 3.02s
> [b] 2.92s
> 
> git reset --hard head
> [a] 3.67s
> [b] 3.09s
> 

These numbers look far too good, so you don't actually do a fresh checkout, do you? I mean, delete all files except .git; killcache; git reset --hard / git checkout -f? That would also explain your 95% lstat times, if there's nothing to do...

> git add -u
> [a] 2.89s
> [b] 2.08s
> 
> 
> I noticed something interesting. Preload index uses 20 threads to do
> the work. When I was keeping an eye on them in task manager some
> threads will finish quite quickly, while others will run a lot
> longer. The way I understand the code at the moment the threads get
> equal chunks of work to perform. It's quite likely that even more
> performance could be obtained out of preload if the work splitting
> was 'smarter'. My best idea so far would be to use something like
> a lock-free queue to queue up the work and let the threads pull
> work off the queue. That way all threads stay busy for longer.
> A candidate for the implementation would be the libfds [1] queue.
> However my issue with this library and the reason I haven't tried to
> integrate is simply because the code expressly has no license.
> 

As cache/cache_nr are not modified by the threads, you actually don't need a lock-free queue. An atomic counter shared by all threads should suffice (i.e. pthread's equivalent to InterlockedIncrement/InterlockedAdd).
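
For illustration, a minimal sketch of that idea, assuming C11 <stdatomic.h> and pthreads are available (on Windows, InterlockedExchangeAdd would play the role of atomic_fetch_add). The names work_pool, worker, sum_entries, NR_THREADS and BATCH are made up for this example and are not taken from the preload code; summing an int array merely stands in for the per-entry lstat() work.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_THREADS 4	/* hypothetical; the preload code discussed above uses 20 */
#define BATCH 64	/* hypothetical batch size, to limit contention on the counter */

struct work_pool {
	atomic_size_t next;	/* index of the first unclaimed entry */
	size_t nr;		/* number of entries; read-only while threads run */
	const int *entries;	/* stand-in for istate->cache; read-only as well */
	atomic_long total;	/* combined result of all threads */
};

static void *worker(void *arg)
{
	struct work_pool *pool = arg;

	for (;;) {
		/* Claim the next batch; fetch_add returns the previous value. */
		size_t start = atomic_fetch_add(&pool->next, BATCH);
		size_t end, i;
		long local = 0;

		if (start >= pool->nr)
			break;	/* nothing left to claim */
		end = start + BATCH < pool->nr ? start + BATCH : pool->nr;
		for (i = start; i < end; i++)
			local += pool->entries[i];	/* real code would lstat() here */
		atomic_fetch_add(&pool->total, local);
	}
	return NULL;
}

/* Returns the sum of all entries, or -1 if no thread could be started. */
static long sum_entries(const int *entries, size_t nr)
{
	pthread_t threads[NR_THREADS];
	struct work_pool pool;
	int i, started = 0;

	atomic_init(&pool.next, 0);
	atomic_init(&pool.total, 0);
	pool.nr = nr;
	pool.entries = entries;

	for (i = 0; i < NR_THREADS; i++) {
		if (pthread_create(&threads[i], NULL, worker, &pool))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	return started ? atomic_load(&pool.total) : -1;
}
```

A thread that finishes its batch early simply claims the next one, so slow entries no longer pin a fixed chunk to one thread. Each index is handed out exactly once because the counter only ever moves forward.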

Karsten





