From: Karsten Blees <karsten.blees@gmail.com>
To: pro-logic <pro-logic@optusnet.com.au>, msysgit@googlegroups.com
Cc: Sebastian Schuberth <sschuberth@gmail.com>,
	git@vger.kernel.org,  szager@google.com
Subject: Re: Re: Windows performance / threading file access
Date: Tue, 22 Oct 2013 16:30:39 +0200
Message-ID: <52668C0F.9050702@gmail.com>
In-Reply-To: <49cde110-f3e5-43d9-b399-6b5a6ce59014@googlegroups.com>

On 22.10.2013 00:58, pro-logic wrote:
>> The trace_performance functions require manual instrumentation of
>> the code sections you want to measure
> Ahh a case of RTFM :)
> 
>> Could you post details about your test setup? Are you still using
>> WebKit for your tests?
> I'm on Win7 x64, Core i5 M560, WD 7200 Laptop HDD, NTFS, no virus
> scanner, truecrypt, no defragger.
> 

OK, so truecrypt and luafv may screw things up for you (according to my measurements, luafv roughly doubles lstat times on C:).

> I've tried to be a bit smarter with the intent of my code, and this
> is what I came up with.
> 
> diff --git a/cache.h b/cache.h
> index 4bf19e3..2e9fb1f 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -294,7 +294,7 @@ extern void free_name_hash(struct index_state *istate);
>  #define active_cache_changed (the_index.cache_changed)
>  #define active_cache_tree (the_index.cache_tree)
>  
> -#define read_cache() read_index(&the_index)
> +#define read_cache() read_index_preload(&the_index, NULL)
>  #define read_cache_from(path) read_index_from(&the_index, (path))
>  #define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
>  #define is_cache_unborn() is_index_unborn(&the_index)
> diff --git a/read-cache.c b/read-cache.c
> index c3d5e35..5fb2788 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1866,7 +1866,7 @@ int read_index_unmerged(struct index_state *istate)
>  int i;
>  int unmerged = 0;
>  
> -read_index(istate);
> +read_index_preload(istate, NULL);
>  for (i = 0; i < istate->cache_nr; i++) {
>  struct cache_entry *ce = istate->cache[i];
>  struct cache_entry *new_ce;
> -- 
> 

Ahh, I thought that you had enabled fscache during the entire checkout.

> Interestingly, when I run on a cleanly checked out blink repo, my
> changes seem to make performance worse, but on a repo with ignored
> files in it they seem to work better. So as a point of comparison I
> decided to run it on a repo with ignored files in the working tree,
> in this case msysgit/git after a 'make install'. When I get a few
> hours I'll try to build blink and re-run the numbers on a much,
> much larger repo.
> 
> This comparison is an average of 3 cold-cache runs of
> kb/fscache-v4 [a] vs kb/fscache-v4 with my above changes applied [b],
> with preloadindex and fscache set to true.
> 
> For comparison
> git status -s
> [a] 3.02s
> [b] 2.92s
> 
> git reset --hard head
> [a] 3.67s
> [b] 3.09s
> 

These numbers look far too good, so you don't actually do a fresh checkout, do you? I mean, delete all files except .git; killcache; git reset --hard / git checkout -f? That would also explain your 95% lstat times, if there's nothing to do...

> git add -u
> [a] 2.89s
> [b] 2.08s
> 
> 
> I noticed something interesting. Preload index uses 20 threads to do
> the work. When I kept an eye on them in Task Manager, some threads
> finished quite quickly, while others ran a lot longer. The way I
> understand the code at the moment, the threads get equal chunks of
> work to perform. It's quite likely that even more performance could
> be obtained out of preload if the work splitting was 'smarter'. My
> best idea so far would be to use something like a lock-free queue to
> queue up the work and let the threads take work off the queue. That
> way all threads stay busy for longer. A candidate for the
> implementation would be the libfds [1] queue. However, my issue with
> this library, and the reason I haven't tried to integrate it, is
> simply that the code expressly has no license.
> 

As cache/cache_nr are not modified by the threads, you actually don't need a lock-free queue. An atomic counter shared by all threads should suffice (i.e. pthread's equivalent to InterlockedIncrement/InterlockedAdd).
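
To make the atomic counter idea concrete, here is a rough sketch, not the actual preload code. It assumes a compiler with C11 <stdatomic.h> (pthreads itself has no atomic increment; GCC's __sync_fetch_and_add would be another option). The names work_pool, run_pool, entry_fn and mark_entry are made up for illustration, and mark_entry merely stands in for the real per-entry work such as an lstat. Each thread claims the next unprocessed index instead of being handed a fixed chunk, so fast threads simply pick up more entries.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 20

/* Hypothetical per-entry callback; the real work would be e.g. an lstat. */
typedef void (*entry_fn)(size_t index, void *data);

struct work_pool {
	atomic_size_t next;	/* next unclaimed entry, shared by all threads */
	size_t nr;		/* total number of entries, never modified */
	entry_fn fn;
	void *data;
};

static void *worker(void *arg)
{
	struct work_pool *pool = arg;

	for (;;) {
		/* Claim one entry; atomic_fetch_add returns the previous value. */
		size_t i = atomic_fetch_add(&pool->next, 1);
		if (i >= pool->nr)
			break;
		pool->fn(i, pool->data);
	}
	return NULL;
}

/* Run fn over entries 0..nr-1 with nthreads threads; 0 on success. */
static int run_pool(size_t nr, int nthreads, entry_fn fn, void *data)
{
	pthread_t threads[MAX_THREADS];
	struct work_pool pool;
	int i, started = 0, err = 0;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	atomic_init(&pool.next, 0);
	pool.nr = nr;
	pool.fn = fn;
	pool.data = data;

	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&threads[i], NULL, worker, &pool)) {
			err = -1;	/* threads already started still drain the pool */
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	return err;
}

/* Example callback: count how often each entry was visited. */
static void mark_entry(size_t index, void *data)
{
	((unsigned char *)data)[index]++;
}
```

Claiming one entry at a time keeps the load balanced but touches the shared counter once per entry; adding a larger step (say 16) per claim would trade some balance for less contention.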

Karsten




-- 
-- 
*** Please reply-to-all at all times ***
*** (do not pretend to know who is subscribed and who is not) ***
*** Please avoid top-posting. ***
The msysGit Wiki is here: https://github.com/msysgit/msysgit/wiki - Github accounts are free.

You received this message because you are subscribed to the Google
Groups "msysGit" group.
To post to this group, send email to msysgit@googlegroups.com
To unsubscribe from this group, send email to
msysgit+unsubscribe@googlegroups.com
For more options, and view previous threads, visit this group at
http://groups.google.com/group/msysgit?hl=en_US?hl=en

