All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH v2 03/17] index-helper: new daemon for caching index and related stuff
Date: Tue, 29 Mar 2016 17:51:59 -0400	[thread overview]
Message-ID: <1459288319.2976.16.camel@twopensource.com> (raw)
In-Reply-To: <CACsJy8Bk19NNacDwez6BzicThnVUDQEoGe3m+ThiorP8uP9+eA@mail.gmail.com>

On Tue, 2016-03-29 at 09:31 +0700, Duy Nguyen wrote:
> On Sat, Mar 19, 2016 at 8:04 AM, David Turner <
> dturner@twopensource.com> wrote:
> > From: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> > 
> > Instead of reading the index from disk and worrying about disk
> > corruption, the index is cached in memory (memory bit-flips happen
> > too, but hopefully less often). The result is faster read. Read
> > time
> > is reduced by 70%.
> > 
> > The biggest gain is not having to verify the trailing SHA-1, which
> > takes lots of time especially on large index files. But this also
> > opens doors for further optimiztions:
> > 
> >  - we could create an in-memory format that's essentially the
> > memory
> >    dump of the index to eliminate most of parsing/allocation
> >    overhead. The mmap'd memory can be used straight away.
> > Experiment
> >    [1] shows we could reduce read time by 88%.
> 
> This reference [1] is missing (even in my old version). I believe
> it's
> http://thread.gmane.org/gmane.comp.version-control.git/247268/focus=2
> 48771,
> comparing 256.442ms in that mail with v4 number, 2245.113ms in 0/8
> mail from the same thread.
> 
> > Git can poke the daemon via named pipes to tell it to refresh the
> > index cache, or to keep it alive some more minutes. It can't give
> > any
> > real index data directly to the daemon. Real data goes to disk
> > first,
> > then the daemon reads and verifies it from there. Poking only
> > happens
> > for $GIT_DIR/index, not temporary index files.
> 
> I think we should go with unix socket on *nix platform instead of
> named pipe. UNIX named pipe only allows one communication channel at
> a
> time. Windows named pipe is different and allows multiple clients,
> which is the same as unix socket.
> 
> 
> > $GIT_DIR/index-helper.pipe is the named pipe for daemon process.
> > The
> > daemon reads from the pipe and executes commands.  Commands that
> > need
> > replies from the daemon will have to open their own pipe, since a
> > named pipe should only have one reader.  Unix domain sockets don't
> > have this problem, but are less portable.
> 
> Hm..NO_UNIX_SOCKETS is only set for Windows in config.mak.uname and
> Windows will need to be specially tailored anyway, I think unix
> socket
> would be more elegant.

One annoyance with unix sockets is that they must have short paths
(UNIX_PATH_MAX -- about a hundred characters).  This basically means
they should be in $TMPDIR.  I guess we could go back to pid files in
$GIT_DIR, and then have a socket named after the pid.  There's also
some security issues, but it actually looks like there's a simple
enough workaround for them.

I'll change this, but it might take a bit as I'm busy with other things
this week.

> > +static void share_index(struct index_state *istate, struct shm
> > *is)
> > +{
> > +       void *new_mmap;
> > +       if (istate->mmap_size <= 20 ||
> > +           hashcmp(istate->sha1,
> > +                   (unsigned char *)istate->mmap + istate
> > ->mmap_size - 20) ||
> > +           !hashcmp(istate->sha1, is->sha1) ||
> > +           git_shm_map(O_CREAT | O_EXCL | O_RDWR, 0700, istate
> > ->mmap_size,
> > +                       &new_mmap, PROT_READ | PROT_WRITE,
> > MAP_SHARED,
> > +                       "git-index-%s", sha1_to_hex(istate->sha1))
> > < 0)
> > +               return;
> > +
> > +       release_index_shm(is);
> > +       is->size = istate->mmap_size;
> > +       is->shm = new_mmap;
> > +       hashcpy(is->sha1, istate->sha1);
> > +       memcpy(new_mmap, istate->mmap, istate->mmap_size - 20);
> > +
> > +       /*
> > +        * The trailing hash must be written last after everything
> > is
> > +        * written. It's the indication that the shared memory is
> > now
> > +        * ready.
> > +        */
> > +       hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20,
> > is->sha1);
> 
> You commented here [1] a long time ago about memory barrier. I'm not
> entirely sure if compilers dare to reorder function calls, but when
> hashcpy is inlined and memcpy is builtin, I suppose that's
> possible...
> 
> [1] http://article.gmane.org/gmane.comp.version-control.git/280729

Oh, right.  Will fix.

  reply	other threads:[~2016-03-29 21:52 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-19  1:04 [PATCH v2 00/15] index-helper, watchman David Turner
2016-03-19  1:04 ` [PATCH v2 01/17] read-cache.c: fix constness of verify_hdr() David Turner
2016-03-19  1:04 ` [PATCH v2 02/17] read-cache: allow to keep mmap'd memory after reading David Turner
2016-03-19  1:04 ` [PATCH v2 03/17] index-helper: new daemon for caching index and related stuff David Turner
2016-03-29  2:31   ` Duy Nguyen
2016-03-29 21:51     ` David Turner [this message]
2016-03-19  1:04 ` [PATCH v2 04/17] index-helper: add --strict David Turner
2016-03-19  1:04 ` [PATCH v2 05/17] daemonize(): set a flag before exiting the main process David Turner
2016-03-19  1:04 ` [PATCH v2 06/17] index-helper: add --detach David Turner
2016-03-29  2:35   ` Duy Nguyen
2016-03-19  1:04 ` [PATCH v2 07/17] read-cache: add watchman 'WAMA' extension David Turner
2016-03-19  1:04 ` [PATCH v2 08/17] read-cache: invalidate untracked cache data when reading WAMA David Turner
2016-03-29  2:50   ` Duy Nguyen
2016-03-29 21:22     ` David Turner
2016-03-19  1:04 ` [PATCH v2 09/17] Add watchman support to reduce index refresh cost David Turner
2016-03-24 13:47   ` Jeff Hostetler
2016-03-24 17:58     ` David Turner
2016-03-19  1:04 ` [PATCH v2 10/17] index-helper: use watchman to avoid refreshing index with lstat() David Turner
2016-03-19  1:04 ` [PATCH v2 11/17] update-index: enable/disable watchman support David Turner
2016-03-19  1:04 ` [PATCH v2 12/17] unpack-trees: preserve index extensions David Turner
2016-03-19  1:04 ` [PATCH v2 13/17] index-helper: kill mode David Turner
2016-03-19  1:04 ` [PATCH v2 14/17] index-helper: don't run if already running David Turner
2016-03-19  1:04 ` [PATCH v2 15/17] index-helper: autorun mode David Turner
2016-03-19  1:04 ` [PATCH v2 16/17] index-helper: optionally automatically run David Turner
2016-03-19  1:04 ` [PATCH v2 17/17] read-cache: config for waiting for index-helper David Turner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1459288319.2976.16.camel@twopensource.com \
    --to=dturner@twopensource.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.