All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v5 09/15] index-helper: use watchman to avoid refreshing index with lstat()
Date: Tue, 19 Apr 2016 21:01:38 -0400	[thread overview]
Message-ID: <1461114098.5540.158.camel@twopensource.com> (raw)
In-Reply-To: <CACsJy8CM409OH12w3EdVP3UjXoURbWNuqb_coQ=AagdCs+ctaQ@mail.gmail.com>

On Wed, 2016-04-20 at 07:15 +0700, Duy Nguyen wrote:
> Continuing my comment from the --use-watchman patch about watchman
> not
> being supported...
> 
> On Wed, Apr 20, 2016 at 6:28 AM, David Turner <
> dturner@twopensource.com> wrote:
> > +static int poke_and_wait_for_reply(int fd)
> > +{
> > +       struct strbuf buf = STRBUF_INIT;
> > +       struct strbuf reply = STRBUF_INIT;
> > +       int ret = -1;
> > +       fd_set fds;
> > +       struct timeval timeout;
> > +
> > +       timeout.tv_usec = 0;
> > +       timeout.tv_sec = 1;
> > +
> > +       if (fd < 0)
> > +               return -1;
> > +
> > +       strbuf_addf(&buf, "poke %d", getpid());
> > +       if (write_in_full(fd, buf.buf, buf.len + 1) != buf.len + 1)
> > +               goto done_poke;
> > +
> > +       /* Now wait for a reply */
> > +       FD_ZERO(&fds);
> > +       FD_SET(fd, &fds);
> > +       if (select(fd + 1, &fds, NULL, NULL, &timeout) == 0)
> > +               /* No reply, giving up */
> > +               goto done_poke;
> > +
> > +       if (strbuf_getwholeline_fd(&reply, fd, 0))
> > +               goto done_poke;
> > +
> > +       if (!starts_with(reply.buf, "OK"))
> > +               goto done_poke;
> 
> ... while we could simply check USE_WATCHMAN macro and reject in
> update-index, a better solution is sending "poke %d watchman" and
> returning "OK watchman" (vs "OK") when watchman is supported and
> active. If the user already requests watchman and index-helper
> returns
> just "OK" then we can warn the user the reason of possible
> performance
> degradation. It's related to the error reporting, but I don't think
> you can send straight errors over unix socket. It's possible but it's
> a bit more complicated.

Do you mean that we should do this here?  Or in update-index -
-watchman?  If the former, I agree.  If the latter, I'm not sure; maybe
you'll be setting up your index before you've started the index helper?

> > +static void refresh_by_watchman(struct index_state *istate)
> > +{
> > +       void *shm = NULL;
> > +       int length;
> > +       int i;
> > +       struct stat st;
> > +       int fd = -1;
> > +       const char *path = git_path("shm-watchman-%s-%"PRIuMAX,
> > +                                   sha1_to_hex(istate->sha1),
> > +                                   (uintmax_t)getpid());
> > +
> > +       fd = open(path, O_RDONLY);
> > +       if (fd < 0)
> > +               return;
> > +
> > +       /*
> > +        * This watchman data is just for us -- no need to keep it
> > +        * around once we've got it open.
> > +        */
> > +       unlink(path);
> 
> This will not play well when multiple processes read and refresh the
> index at the same time. 

Multiple processes will have different pids, right?  And the pid is
included in the filename.  Am I missing something?

> This is really extra. But if we know in advance that git does not 
> need refresh(), then we should be able to tell index-helper not to 
> waste cycles contacting watchman and preparing shm-watchman-%s-%d 
> (the poke line gets more parameters). Either that, or we decouple 
> watchman requests from read_cache() requests. Only when 
> refresh_index() is called that we send something to request shm-
> watchman-%s-%d. The same for read_directory() (i.e. untracked cache 
> stuff). Hmm?

It's true that we could decouple watchman requests.  I'll look and see
if that's reasonable.

> Now that I think of it, with watchman backing us, we probably should
> just do nothing in update_index_if_able() (or write_locked_index()
> when we know only stat info is changed) when watchman is active. The
> purpose of update_index_if_able() is to avoid costly refresh, but we
> can already avoid that with watchman. And updating big index files is
> always costly (even though it should cost less with split-index). 

That sounds like a change we could make in a separate series.  It's not
a bad idea, but if our goal is to get the basic version out, we should
start there.

> Of
> course this can only be done if watchman (inotify to be precise) can
> cover whole worktree. I'm not sure how watchman behaves when there's
> not enough inotify resource to cover full worktree.

It will detect this case and will either manually recrawl (in the event
of a max_queued_events overflow) or return an error (in the event of
too many watched directories).

  reply	other threads:[~2016-04-20  1:01 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-19 23:27 [PATCH v5 00/15] index-helper/watchman David Turner
2016-04-19 23:27 ` [PATCH v5 01/15] read-cache.c: fix constness of verify_hdr() David Turner
2016-04-19 23:27 ` [PATCH v5 02/15] read-cache: allow to keep mmap'd memory after reading David Turner
2016-04-20  9:01   ` Johannes Schindelin
2016-04-20 19:41     ` David Turner
2016-04-20  9:26   ` Duy Nguyen
2016-04-20 19:43     ` David Turner
2016-04-19 23:27 ` [PATCH v5 03/15] index-helper: new daemon for caching index and related stuff David Turner
2016-04-20 12:17   ` Johannes Schindelin
2016-04-20 12:31     ` Duy Nguyen
2016-04-20 19:38     ` David Turner
2016-04-19 23:27 ` [PATCH v5 04/15] index-helper: add --strict David Turner
2016-04-19 23:27 ` [PATCH v5 05/15] daemonize(): set a flag before exiting the main process David Turner
2016-04-19 23:28 ` [PATCH v5 06/15] index-helper: add --detach David Turner
2016-04-19 23:50   ` Duy Nguyen
2016-04-20  1:04     ` David Turner
2016-04-20  9:33       ` Duy Nguyen
2016-04-25 20:53     ` David Turner
2016-04-19 23:28 ` [PATCH v5 07/15] read-cache: add watchman 'WAMA' extension David Turner
2016-04-19 23:28 ` [PATCH v5 08/15] Add watchman support to reduce index refresh cost David Turner
2016-04-19 23:28 ` [PATCH v5 09/15] index-helper: use watchman to avoid refreshing index with lstat() David Turner
2016-04-20  0:15   ` Duy Nguyen
2016-04-20  1:01     ` David Turner [this message]
2016-04-20  9:36       ` Duy Nguyen
2016-04-19 23:28 ` [PATCH v5 10/15] update-index: enable/disable watchman support David Turner
2016-04-19 23:45   ` Duy Nguyen
2016-04-20 19:50     ` David Turner
2016-04-19 23:28 ` [PATCH v5 11/15] unpack-trees: preserve index extensions David Turner
2016-04-19 23:28 ` [PATCH v5 12/15] index-helper: kill mode David Turner
2016-04-19 23:28 ` [PATCH v5 13/15] index-helper: don't run if already running David Turner
2016-04-19 23:28 ` [PATCH v5 14/15] index-helper: autorun mode David Turner
2016-04-19 23:28 ` [PATCH v5 15/15] index-helper: optionally automatically run David Turner
2016-04-20  9:59   ` [PATCH 16/15] Add tracing to measure where most of the time is spent Nguyễn Thái Ngọc Duy
2016-04-20 12:28     ` Johannes Schindelin
2016-04-20 12:36       ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1461114098.5540.158.camel@twopensource.com \
    --to=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.