From: Kjetil Barvik <barvik@broadpark.no>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH/RFC 2/4] Use 'lstat_cache()' instead of 'has_symlink_leading_path()'
Date: Tue, 06 Jan 2009 13:50:54 +0100 [thread overview]
Message-ID: <86r63gk381.fsf@broadpark.no> (raw)
In-Reply-To: <7vk598j17u.fsf@gitster.siamese.dyndns.org>
Junio C Hamano <gitster@pobox.com> writes:
> Kjetil Barvik <barvik@broadpark.no> writes:
>
>> Start using the optimised, faster and more effective symlink/directory
>> cache. The previously used call:
>>
>> has_symlink_leading_path(len, name);
>>
>> should be identically with the following call to lstat_cache():
>>
>> lstat_cache(len, name,
>> LSTAT_SYMLINK|LSTAT_DIR,
>> LSTAT_SYMLINK);
>> ...
>
> Care to enlighten why some of callers use the above, but not others?
> Namely, check_removed() in diff-lib.c
I though that first it would be a good thing to introduce as little
changes as possible, and then later on do some cleanups. Regarding
the 'check_removed()' function, I though that it could later have been
written something like this after cleanups:
[....]
static int check_removed(const struct cache_entry *ce, struct stat *st)
{
unsigned int ret_flags =
check_lstat_cache(ce_namelen(ce), ce->name,
LSTAT_SYMLINK|LSTAT_NOTDIR|LSTAT_DIR);
if (ret_flags & (LSTAT_SYMLINK|LSTAT_NOTDIR))
return 1;
if (ret_flags & LSTAT_LSTATERR)
return -1;
if (ret_flags & LSTAT_DIR) {
unsigned char sub[20];
[....]
This would have saved one more lstat() call in some cases. But after
a test, I now see that it does not work. The reason is that it does
not set the 'struct stat *st' parameter and/or that for the moment you
can not tell the 'lstat_cache()' function to also always test the last
path component. It could be extended to do this, if someone ask for
it and if it would be useful to extend the lstat_cache() for this
fact.
I will remove the '|LSTAT_NOTDIR' part from the call to lstat_cache()
in 'check_removed()' in the next version of the patch.
> and callers in unpack-trees.c care about NOTDIR unlike others, even
> though the original code checked for exactly the same condition.
Regarding the 'verify_absent()' function in unpack-trees.c, the
'|LSTAT_NOTDIR' part of the call to lstat_cache() helps to avoid 16047
lstat() calls for the given test case mentioned in the cover-letter.
And from the source code:
[...]
if (lstat_cache(ce_namelen(ce), ce->name,
LSTAT_SYMLINK|LSTAT_NOTDIR|LSTAT_DIR,
LSTAT_SYMLINK|LSTAT_NOTDIR))
return 0;
if (!lstat(ce->name, &st)) {
[...]
it should be easy to see that if we from the lstat_cache() could
already spot that some path component of ce->name does not exists,
then we can avoid the following lstat() call, as it then should known
to be failing.
regarding the 'unlink_entry()' function in unpack-trees.c, the
'|LSTAT_NOTDIR' part of the call to lstat_cache() does not for the
moment helps to avoid any lstat() calls, as far as I can see. But,
again, from the source code:
[..]
char *name = ce->name;
if (lstat_cache(ce_namelen(ce), ce->name,
LSTAT_SYMLINK|LSTAT_NOTDIR|LSTAT_DIR,
LSTAT_SYMLINK|LSTAT_NOTDIR))
return;
if (unlink(name))
return;
[...]
it should be correct, since if we already know that some path
component of ce-name does not exist, the call to unlink(name) would
always fail (with ENOENT).
> Does this mean that some callers of has_symlink_leading_path() checked
> only for leading symlinks when they should also have checked for a leading
> non-directory, and this patch is also a bugfix?
Yes, as indicated above, has_symlink_leading_path() should have
checked for leading non-directories when called from for the
'verify_absent()' function to be able to optimise away some more
lstat() calls.
I admit that I do not know the source code good enough to decide if
this is an indication of a bug somewhere, or just an optimisation.
-- kjetil
next prev parent reply other threads:[~2009-01-06 12:52 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-05 13:09 [PATCH/RFC 0/4] git checkout: optimise away lots of lstat() calls Kjetil Barvik
2009-01-05 13:09 ` [PATCH/RFC 1/4] Optimised, faster, more effective symlink/directory detection Kjetil Barvik
2009-01-06 8:19 ` Junio C Hamano
2009-01-06 8:56 ` Junio C Hamano
2009-01-06 9:56 ` Kjetil Barvik
2009-01-06 13:45 ` Kjetil Barvik
2009-01-06 9:36 ` Kjetil Barvik
2009-01-05 13:09 ` [PATCH/RFC 2/4] Use 'lstat_cache()' instead of 'has_symlink_leading_path()' Kjetil Barvik
2009-01-06 8:19 ` Junio C Hamano
2009-01-06 12:50 ` Kjetil Barvik [this message]
2009-01-05 13:10 ` [PATCH/RFC 3/4] create_directories() inside entry.c: only check each directory once! Kjetil Barvik
2009-01-05 13:10 ` [PATCH/RFC 4/4] remove the old 'has_symlink_leading_path()' function Kjetil Barvik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86r63gk381.fsf@broadpark.no \
--to=barvik@broadpark.no \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.