All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kjetil Barvik <barvik@broadpark.no>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH/RFC 2/4] Use 'lstat_cache()' instead of 'has_symlink_leading_path()'
Date: Tue, 06 Jan 2009 13:50:54 +0100	[thread overview]
Message-ID: <86r63gk381.fsf@broadpark.no> (raw)
In-Reply-To: <7vk598j17u.fsf@gitster.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:

> Kjetil Barvik <barvik@broadpark.no> writes:
>
>> Start using the optimised, faster and more effective symlink/directory
>> cache.  The previously used call:
>>
>>    has_symlink_leading_path(len, name);
>>
>> should be identically with the following call to lstat_cache():
>>
>>    lstat_cache(len, name,
>>                LSTAT_SYMLINK|LSTAT_DIR,
>>                LSTAT_SYMLINK);
>> ...
>
> Care to enlighten why some of callers use the above, but not others?
> Namely, check_removed() in diff-lib.c 

  I though that first it would be a good thing to introduce as little
  changes as possible, and then later on do some cleanups.  Regarding
  the 'check_removed()' function, I though that it could later have been
  written something like this after cleanups:

[....]
static int check_removed(const struct cache_entry *ce, struct stat *st)
{
	unsigned int ret_flags =
		check_lstat_cache(ce_namelen(ce), ce->name,
				  LSTAT_SYMLINK|LSTAT_NOTDIR|LSTAT_DIR);

	if (ret_flags & (LSTAT_SYMLINK|LSTAT_NOTDIR))
		return 1;
	if (ret_flags & LSTAT_LSTATERR)
		return -1;

	if (ret_flags & LSTAT_DIR) {
		unsigned char sub[20];
[....]

  This would have saved one more lstat() call in some cases.  But after
  a test, I now see that it does not work.  The reason is that it does
  not set the 'struct stat *st' parameter and/or that for the moment you
  can not tell the 'lstat_cache()' function to also always test the last
  path component.  It could be extended to do this, if someone ask for
  it and if it would be useful to extend the lstat_cache() for this
  fact.

  I will remove the '|LSTAT_NOTDIR' part from the call to lstat_cache()
  in 'check_removed()' in the next version of the patch.

> and callers in unpack-trees.c care about NOTDIR unlike others, even
> though the original code checked for exactly the same condition.

  Regarding the 'verify_absent()' function in unpack-trees.c, the
  '|LSTAT_NOTDIR' part of the call to lstat_cache() helps to avoid 16047
  lstat() calls for the given test case mentioned in the cover-letter.
  And from the source code:

  [...]
	if (lstat_cache(ce_namelen(ce), ce->name,
			LSTAT_SYMLINK|LSTAT_NOTDIR|LSTAT_DIR,
			LSTAT_SYMLINK|LSTAT_NOTDIR))
		return 0;

	if (!lstat(ce->name, &st)) {
  [...]

  it should be easy to see that if we from the lstat_cache() could
  already spot that some path component of ce->name does not exists,
  then we can avoid the following lstat() call, as it then should known
  to be failing.

  regarding the 'unlink_entry()' function in unpack-trees.c, the
  '|LSTAT_NOTDIR' part of the call to lstat_cache() does not for the
  moment helps to avoid any lstat() calls, as far as I can see.  But,
  again, from the source code:

  [..]
	char *name = ce->name;

	if (lstat_cache(ce_namelen(ce), ce->name,
			LSTAT_SYMLINK|LSTAT_NOTDIR|LSTAT_DIR,
			LSTAT_SYMLINK|LSTAT_NOTDIR))
		return;
	if (unlink(name))
		return;
  [...]

  it should be correct, since if we already know that some path
  component of ce-name does not exist, the call to unlink(name) would
  always fail (with ENOENT).

> Does this mean that some callers of has_symlink_leading_path() checked
> only for leading symlinks when they should also have checked for a leading
> non-directory, and this patch is also a bugfix?

  Yes, as indicated above, has_symlink_leading_path() should have
  checked for leading non-directories when called from for the
  'verify_absent()' function to be able to optimise away some more
  lstat() calls.

  I admit that I do not know the source code good enough to decide if
  this is an indication of a bug somewhere, or just an optimisation.

  -- kjetil

  reply	other threads:[~2009-01-06 12:52 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-05 13:09 [PATCH/RFC 0/4] git checkout: optimise away lots of lstat() calls Kjetil Barvik
2009-01-05 13:09 ` [PATCH/RFC 1/4] Optimised, faster, more effective symlink/directory detection Kjetil Barvik
2009-01-06  8:19   ` Junio C Hamano
2009-01-06  8:56     ` Junio C Hamano
2009-01-06  9:56       ` Kjetil Barvik
2009-01-06 13:45       ` Kjetil Barvik
2009-01-06  9:36     ` Kjetil Barvik
2009-01-05 13:09 ` [PATCH/RFC 2/4] Use 'lstat_cache()' instead of 'has_symlink_leading_path()' Kjetil Barvik
2009-01-06  8:19   ` Junio C Hamano
2009-01-06 12:50     ` Kjetil Barvik [this message]
2009-01-05 13:10 ` [PATCH/RFC 3/4] create_directories() inside entry.c: only check each directory once! Kjetil Barvik
2009-01-05 13:10 ` [PATCH/RFC 4/4] remove the old 'has_symlink_leading_path()' function Kjetil Barvik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86r63gk381.fsf@broadpark.no \
    --to=barvik@broadpark.no \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.