* Re: Is Git multithreaded ?
2013-06-12 18:28 ` Is Git multithreaded ? Laurent Alebarde
@ 2013-06-12 19:38 ` Jeff King
From: Jeff King @ 2013-06-12 19:38 UTC
To: Laurent Alebarde; +Cc: git
On Wed, Jun 12, 2013 at 08:28:52PM +0200, Laurent Alebarde wrote:
> I wonder if Git is multithreaded ?
A few selected operations are multi-threaded if you compile with
thread support (i.e., do not set NO_PTHREADS when you build).
> For example, during a commit, does it process the files one after another,
> or does it use a set of threads, say 10, to process 10 files in
> parallel ?
Commit is not multi-threaded, for example.
> In the Git_Guide (http://wiki.sourcemage.org/Git_Guide.html), I can
> read this :
>
> "To enable auto-detection for number of threads to use (good for
> multi-CPU or multi-core computers) for packing repositories, use:
But object packing (used during fetch/push, and during git-gc) is
multi-threaded (at least the delta compression portion of it is).
> But it is not a lot explanatory (to me). In particular, if Git is
> multithreaded and can be configured regarding the number of workers, I
> wonder in which operations it uses it ?
There is no master list, and the set of threaded operations changes from
version to version. If you have a clone of the git source code, you can
find the places where threads are used with
    git grep NO_PTHREADS
as every threaded spot also has a single-threaded variant.
The current list is something like:
- finding delta candidates during pack-objects (gc, server side of
fetch, client side of push); controlled by pack.threads, which
defaults to "number of CPUs you have"
- resolving received objects in index-pack via fetch; controlled by
pack.threads
- git grep on a working tree (I do not recall the details, but I think
grepping a commit actually ends up slower when parallel); I do not
think there is config to control this
- when stat()-ing files to refresh the index. This is not about
parallel CPU performance, but about reducing latency on slow
filesystems (e.g., NFS) by pipelining requests; controlled by
core.preloadindex, which defaults to "false"
- git may fork to perform certain asynchronous operations (e.g.,
during a fetch, one process runs pack-objects to create the output,
and the other speaks the git protocol, mostly just passing through
the output to the client). On systems with threads, some of these
operations are performed using a thread rather than fork. This is
not about CPU performance, but about keeping the code simple (and
cannot be controlled with config).
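
For the two config-controlled items in the list above, here is a minimal sketch of how the knobs could be set and read back. It runs in a throwaway repository so nothing real is touched; the scratch directory and the value 4 are only illustrative assumptions, not recommendations.

```shell
# Scratch repository (hypothetical location) so a real project is left alone.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Limit the threaded parts of pack-objects and index-pack to 4 threads.
# A value of 0 asks git to auto-detect the number of CPUs instead.
git config pack.threads 4

# Opt in to the parallel stat() calls used when refreshing the index.
git config core.preloadindex true

# Read the settings back to confirm what this repository will use.
git config --get pack.threads
git config --get core.preloadindex
```

Adding --global to the two "git config" calls that set values would apply them to every repository for your user rather than just this one.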
I hope that helps.
-Peff