From: Linus Torvalds <torvalds@linux-foundation.org>
To: James Pickens <james.e.pickens@intel.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 1/2] Add support for multi threaded checkout
Date: Thu, 18 Dec 2008 13:41:48 -0800 (PST) [thread overview]
Message-ID: <alpine.LFD.2.00.0812181333150.14014@localhost.localdomain> (raw)
In-Reply-To: <1229633811-3877-1-git-send-email-james.e.pickens@intel.com>
On Thu, 18 Dec 2008, James Pickens wrote:
>
> This speeds up operations like 'git clone' on NFS drives tremendously, but
> slows down the same operations on local disks.
>
> Partitioning the work and launching threads is done in unpack-trees.c. The code
> is mostly copied from preload_index.c. The maximum number of threads is set to
> 8, which seemed to give a reasonable tradeoff between performance improvement on
> NFS and degradation on local disks.
Hmm. I don't really like this very much.
Why? Because as your locking shows, we can really only parallelise the
actual write-out anyway, and rather than do any locking there, wouldn't it
be better to have a notion of "queued work" (which would be just the
write-out) to be done in parallel?
So instead of doing all the unpacking etc in parallel (with locking around
it to serialize it), I'd suggest doing ll the unpacking serially since
that isn't the problem anyway (and since you have to protect it with a
lock anyway), and just have a "write out and free the buffer" phase that
is done in the threads.
The alternative would be to actually do what your patch suggests, but
actually try to make the code git SHA1 object handling be thread-safe. At
that point, the ugly locking in write_entry() would go away, and you might
actually improve performance on SMP thanks to doing the CPU part in
parallel.
But as-is, I think the patch is a bit ugly. The reason I liked the index
pre-reading was that it could be done entirely locklessly, so it really
did parallelize it _fully_ (even if the IO latency part was the much
bigger issue), and that was also why it actually ended up helping even on
a local disk (only if you have multiple cores, of course).
Linus
next prev parent reply other threads:[~2008-12-18 21:43 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-18 20:51 [RFC PATCH 0/2] Add support for multi threaded checkout Pickens, James E
2008-12-18 20:56 ` [PATCH 1/2] " James Pickens
2008-12-18 20:56 ` [PATCH 2/2] Add core.threadedcheckout config option James Pickens
2008-12-18 21:41 ` Linus Torvalds [this message]
2008-12-18 23:35 ` [PATCH 1/2] Add support for multi threaded checkout James Pickens
2008-12-19 0:13 ` Linus Torvalds
2008-12-18 21:02 ` [RFC PATCH 0/2] " Nicolas Pitre
2008-12-18 21:13 ` James Pickens
2008-12-18 21:16 ` Nicolas Morey-Chaisemartin
2008-12-18 21:42 ` James Pickens
2008-12-19 1:04 ` James Pickens
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.0812181333150.14014@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=git@vger.kernel.org \
--cc=james.e.pickens@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).