All of lore.kernel.org
 help / color / mirror / Atom feed
From: Karsten Blees <karsten.blees@gmail.com>
To: Stefan Zager <szager@chromium.org>
Cc: "Nguyễn Thái Ngọc" <pclouds@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: Re: [PATCH] Enable index-pack threading in msysgit.
Date: Fri, 21 Mar 2014 21:01:45 +0100	[thread overview]
Message-ID: <532C9AA9.1010102@gmail.com> (raw)
In-Reply-To: <CAHOQ7J-sUt3HGYNE7n=X3ZmV3Q-n+n9hMDAtzLbH3YU8iAqoqA@mail.gmail.com>

Am 20.03.2014 22:56, schrieb Stefan Zager:
> On Thu, Mar 20, 2014 at 2:35 PM, Karsten Blees <karsten.blees@gmail.com> wrote:
>> Am 20.03.2014 17:08, schrieb Stefan Zager:
>>
>>> Going forward, there is still a lot of performance that gets left on
>>> the table when you rule out threaded file access.  There are not so
>>> many calls to read, mmap, and pread in the code; it should be possible
>>> to rationalize them and make them thread-safe -- at least, thread-safe
>>> for posix-compliant systems and msysgit, which covers the great
>>> majority of git users, I would hope.
>>>
>>
>> IMO a "mostly" XSI compliant pread (or even the git_pread() emulation) is still better than forbidding the use of read() entirely. Switching from read to pread everywhere requires that all callers have to keep track of the file position, which means a _lot_ of code changes (read/xread/strbuf_read is used in ~70 places throughout git). And how do you plan to deal with platforms that don't have a thread-safe pread (HP, Cygwin)?
>>
>> Considering all that, Duy's solution of opening separate file descriptors per thread seems to be the best pattern for future multi-threaded work.
> 
> Does that mean you would endorse the (N threads) * (M pack files)
> approach to threading checkout and status?  That seems kind of
> crazy-town to me.  Not to mention that pack windows are not shared, so
> this approach to multi-threading can have the side-effect of blowing
> out memory consumption.  We have already had to dial back settings for
> pack.threads and core.deltaBaseCacheLimit, because threaded index-pack
> was causing OOM errors on 32-bit platforms.
> 

Opening more file descriptors doesn't significantly increase the memory footprint, so it shouldn't matter whether the threads read data via shared or private descriptors.

git-status with core.preloadindex is already multithreaded (at least the first part), and AFAIK doesn't read pack files at all.

I'm still not convinced that multi-threaded git-checkout is a good idea. According to my tests this is actually slower than sequential checkout. You'd have to be very careful to only multi-thread the parts that don't do any IO, such as unpacking / undeltifying.

  parent reply	other threads:[~2014-03-21 20:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-03-19  0:46 [PATCH] Enable index-pack threading in msysgit szager
2014-03-19  7:30 ` Duy Nguyen
2014-03-19  7:50   ` Stefan Zager
2014-03-19 10:28     ` Duy Nguyen
2014-03-19 16:57       ` Stefan Zager
2014-03-19 19:15         ` Stefan Zager
2014-03-19 20:57 ` Junio C Hamano
2014-03-20 13:54 ` Karsten Blees
2014-03-20 16:08   ` Stefan Zager
2014-03-20 21:35     ` Karsten Blees
2014-03-20 21:56       ` Stefan Zager
2014-03-21  1:33         ` Duy Nguyen
2014-03-21 20:01         ` Karsten Blees [this message]
2014-03-21  1:51     ` Duy Nguyen
2014-03-21  5:21       ` Duy Nguyen
2014-03-21  5:35         ` Stefan Zager
2014-03-21 18:55           ` Karsten Blees
2014-03-25 13:41       ` [PATCH] index-pack: work around thread-unsafe pread() Nguyễn Thái Ngọc Duy
2014-03-26  8:35 ` [PATCH] Enable index-pack threading in msysgit Johannes Sixt
  -- strict thread matches above, loose matches on Subject: below --
2014-03-19 21:35 Stefan Zager
2014-03-19 22:23 ` Junio C Hamano
2014-03-20  1:25 ` Duy Nguyen
2014-03-21 18:40   ` Karsten Blees

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=532C9AA9.1010102@gmail.com \
    --to=karsten.blees@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=szager@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.