All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Rast <trast@inf.ethz.ch>
To: Robert Zeh <robert.allan.zeh@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Git List <git@vger.kernel.org>, Duy Nguyen <pclouds@gmail.com>
Subject: Re: [PATCH] inotify to minimize stat() calls
Date: Thu, 25 Apr 2013 10:18:21 +0200	[thread overview]
Message-ID: <87sj2f6n1u.fsf@linux-k42r.v.cablecom.net> (raw)
In-Reply-To: <51781455.9090600@gmail.com> (Robert Zeh's message of "Wed, 24 Apr 2013 12:20:21 -0500")

Robert Zeh <robert.allan.zeh@gmail.com> writes:

> Here is a patch that creates a daemon that tracks file
> state with inotify, writes it out to a file upon request,
> and changes most of the calls to stat to use said cache.
>
> It has bugs, but I figured it would be smarter to see
> if the approach was acceptable at all before spending the
> time to root the bugs out.

Thanks for tackling this; it's probably about time we got a inotify
support :-(

> I've implemented the communication with a file, and not a socket,
> because I think implementing a socket is going to create
> security issues on multiuser systems.  For example, would a
> socket allow stat information to cross user boundaries?

This ties in with an issue discussed in an earlier thread:

  http://thread.gmane.org/gmane.comp.version-control.git/217817/focus=218307

The conclusion there was that the default limits are set such that it is
not feasible to run one daemon per repository (that would quickly hit
the limits when e.g. iterating all repos in a typical android tree using
repo).

So whatever you use for communication needs to work as a global daemon.

I'd just trust the SSH folks to know about security; on my system
ssh-agent creates

  /tmp/ssh-RANDOMSTRING/agent.PID

where the directory has mode 0700, and the file is a unit socket with
mode 0600.  That should make doubly sure that no other user can open the
socket.

>  filechange-cache.c   | 203
> +++++++++++++++++++++++++++++++++++++++++++++++++++

Is your MUA wrapping the patch?

> +static void watch_directory(int inotify_fd)
> +{
> +	char buf[PATH_MAX];
> +
> +	if (!getcwd(buf, sizeof(buf)))
> +		die_errno("Unable to get current directory");
> +
> +	int i = 0;
> +	struct dir_struct dir;
> +	const char *pathspec[1] = { buf, NULL };
> +
> +	memset(&dir, 0, sizeof(dir));
> +	setup_standard_excludes(&dir);
> +
> +	fill_directory(&dir, pathspec);
> +	for(i = 0; i < dir.nr; i++) {
> +		struct dir_entry *ent = dir.entries[i];
> +		watch_file(inotify_fd, ent->name);
> +		free(ent);
> +	}

I don't get this bit.  The lstat() are run over all files listed in the
index.  So shouldn't your daemon watch exactly those (or rather, all
dirnames of such files)?

The actual directory contents are only needed to find untracked files,
and there would be a lot of complication surrounding that, so I suggest
saving that for later (and for now measuring the speedup with 'git
status -uno'!).

For example, you'd have to actually watch and re-read all .gitignore
files, and the .git/info/exclude, and the core.excludesfile, to see if
your notion of an ignored file became stale.

Also, you seem to call watch_directory() only on the current(?) dir, but
you need to recursively set up watches for all directories in the
repository.

> +	while (1) {
> +		int i = 0;
> +		length = read(inotify_fd, buffer, sizeof(buffer));
> +		for(i = 0; i < length; ) {
> +			struct inotify_event *event =
> +				(struct inotify_event*)(buffer+i);
> +			/* printf("event: %d %x %d %s\n", event->wd, event->mask,
> +			   event->len, event->name); */
> +			if (request_watch_descriptor == event->wd) {
> +				write_stat_cache();
> +			} else if (root_directory_watch_descriptor
> +				   == event->wd) {
> +				printf("root directory died!\n");
> +				exit(0);
> +			} else if (event->mask & IN_Q_OVERFLOW) {
> +				restart();

Good.

> +			} else if (event->mask & IN_MODIFY) {
> +				if (event->len)
> +					update_stat_cache(event->name);
> +			}

So whenever a file changes, you stat() it.  That's good for simplicity
now, but I suspect it will provide some optimization opportunities
later.


On some design aspects, I'd want:

* a toggle to run the test suite with the daemons, or without

* if you go with a user-wide daemon, a way to ensure that the test-suite
  daemon is not the same as my "real" daemon, and make sure it is killed
  after the test runs finish

* a test that triggers IN_Q_OVERFLOW, e.g. by sending SIGSTOP and doing
  a large repository operation

* a test that renames directories

The last one is just based on my personal experience with messing with
inotify; renaming directories is the "hard" case for that API.  We may
already cover this in the test suite, or we may not; but it must be
tested.

Other than that last point, focus your tests not on small tests but on
the test suite.  It would seem rather unlikely to me that you could
manage to pass the entire test suite with this daemon active but broken.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

  parent reply	other threads:[~2013-04-25  8:18 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-08 21:10 inotify to minimize stat() calls Ramkumar Ramachandra
2013-02-08 22:15 ` Junio C Hamano
2013-02-08 22:45   ` Junio C Hamano
2013-02-09  2:10     ` Duy Nguyen
2013-02-09  2:37       ` Junio C Hamano
2013-02-09  2:56     ` Junio C Hamano
2013-02-09  3:36       ` Robert Zeh
2013-02-09 12:05         ` Ramkumar Ramachandra
2013-02-09 12:11           ` Ramkumar Ramachandra
2013-02-09 12:53           ` Ramkumar Ramachandra
2013-02-09 12:59             ` Duy Nguyen
2013-02-09 17:10               ` Ramkumar Ramachandra
2013-02-09 18:56                 ` Ramkumar Ramachandra
2013-02-10  5:24                 ` Duy Nguyen
2013-02-10 11:17                   ` Duy Nguyen
2013-02-10 11:22                     ` Duy Nguyen
2013-02-10 20:16                       ` Junio C Hamano
2013-02-11  2:56                         ` Duy Nguyen
2013-02-11 11:12                           ` Duy Nguyen
2013-03-07 22:16                           ` Torsten Bögershausen
2013-03-08  0:04                             ` Junio C Hamano
2013-03-08  7:01                               ` Torsten Bögershausen
2013-03-08  8:15                                 ` Junio C Hamano
2013-03-08  9:24                                   ` Torsten Bögershausen
2013-03-08 10:53                                   ` Duy Nguyen
2013-03-10  8:23                                     ` Ramkumar Ramachandra
2013-03-13 12:59                                     ` [PATCH] status: hint the user about -uno if read_directory takes too long Nguyễn Thái Ngọc Duy
2013-03-13 15:21                                       ` Torsten Bögershausen
2013-03-13 16:16                                       ` Junio C Hamano
2013-03-14 10:22                                         ` Duy Nguyen
2013-03-14 15:05                                           ` Junio C Hamano
2013-03-15 12:30                                             ` Duy Nguyen
2013-03-15 15:52                                               ` Torsten Bögershausen
2013-03-15 15:57                                                 ` Ramkumar Ramachandra
2013-03-15 16:53                                                 ` Junio C Hamano
2013-03-15 17:41                                                   ` Torsten Bögershausen
2013-03-15 20:06                                                     ` Junio C Hamano
2013-03-15 21:14                                                       ` Torsten Bögershausen
2013-03-15 21:59                                                         ` Junio C Hamano
2013-03-16  7:21                                                           ` Torsten Bögershausen
2013-03-17  4:47                                                             ` Junio C Hamano
2013-03-16  1:51                                           ` Duy Nguyen
2013-02-10 13:26                     ` inotify to minimize stat() calls demerphq
2013-02-10 15:35                       ` Duy Nguyen
2013-02-14 14:36                       ` Magnus Bäck
2013-02-10 16:45                     ` Ramkumar Ramachandra
2013-02-11  3:03                       ` Duy Nguyen
2013-02-10 16:58                     ` Erik Faye-Lund
2013-02-11  3:53                       ` Duy Nguyen
2013-02-12 20:48                         ` Karsten Blees
2013-02-13 10:06                           ` Duy Nguyen
2013-02-13 12:15                           ` Duy Nguyen
2013-02-13 18:18                             ` Jeff King
2013-02-13 19:47                               ` Jeff King
2013-02-13 20:25                               ` Karsten Blees
2013-02-13 22:55                                 ` Jeff King
2013-02-14  0:48                                   ` Karsten Blees
2013-02-27 14:45                                     ` [PATCH] name-hash.c: fix endless loop with core.ignorecase=true Karsten Blees
2013-02-27 16:53                                       ` Junio C Hamano
2013-02-27 21:52                                         ` Karsten Blees
2013-02-27 23:57                                           ` [PATCH v2] " Karsten Blees
2013-02-28  0:27                                             ` Junio C Hamano
2013-02-19  9:49                           ` inotify to minimize stat() calls Ramkumar Ramachandra
2013-02-19 14:25                             ` Karsten Blees
2013-02-19 13:16                   ` Drew Northup
2013-02-19 13:47                     ` Duy Nguyen
2013-02-09 19:35           ` Junio C Hamano
2013-02-10 19:03             ` Robert Zeh
2013-02-10 19:26               ` Martin Fick
2013-02-10 20:18                 ` Robert Zeh
2013-02-11  3:21               ` Duy Nguyen
2013-02-11 14:13                 ` Robert Zeh
2013-02-19  9:57                   ` Ramkumar Ramachandra
2013-04-24 17:20               ` [PATCH] " Robert Zeh
2013-04-24 21:32                 ` Duy Nguyen
2013-04-25 19:44                   ` Robert Zeh
2013-04-25 21:20                     ` Duy Nguyen
2013-04-26 15:35                       ` Robert Zeh
2013-04-25  8:18                 ` Thomas Rast [this message]
2013-04-25 19:37                   ` Robert Zeh
2013-04-25 19:59                     ` Thomas Rast
2013-04-27 13:51                       ` Thomas Rast
2013-04-27 23:56                 ` Duy Nguyen
     [not found]                   ` <CAKXa9=r2A7UeBV2s2H3wVGdPkS1zZ9huNJhtvTC-p0S5Ed12xA@mail.gmail.com>
2013-04-30  0:27                     ` Duy Nguyen
2013-02-09 11:32       ` Ramkumar Ramachandra
2013-02-14 15:16 ` Ævar Arnfjörð Bjarmason
2013-02-14 16:31   ` Junio C Hamano
2013-02-19  9:40   ` Ramkumar Ramachandra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sj2f6n1u.fsf@linux-k42r.v.cablecom.net \
    --to=trast@inf.ethz.ch \
    --cc=artagnon@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=robert.allan.zeh@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.