From: Junio C Hamano <gitster@pobox.com>
To: Kjetil Barvik <barvik@broadpark.no>
Cc: git@vger.kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH/RFC 1/4] Optimised, faster, more effective symlink/directory detection
Date: Tue, 06 Jan 2009 00:56:32 -0800 [thread overview]
Message-ID: <7v3afwizi7.fsf@gitster.siamese.dyndns.org> (raw)
In-Reply-To: <7vprj0j181.fsf@gitster.siamese.dyndns.org> (Junio C. Hamano's message of "Tue, 06 Jan 2009 00:19:26 -0800")
Junio C Hamano <gitster@pobox.com> writes:
> Let me see if I understand the logic behind this caching....
> ...
>> + last_slash = match_len;
>> + cache_path[last_slash] = '\0';
>> +
>> + if (lstat(cache_path, &st)) {
>> + ret_flags = LSTAT_LSTATERR;
>> + if (errno == ENOENT || errno == ENOTDIR)
>> + ret_flags |= LSTAT_NOTDIR;
>
> If you tested "A/B" here and got ENOENT back, you know "A/B" does not
> exist; you cache this knowledge as "A/B is not a directory" (I also think
> you could use it as a cached knowledge that "A exists and is a directory".
> I am not sure if you are taking advantage of that).
>
> What I do not understand about this code is the ENOTDIR case. If you
> tested "A/B" and got ENOTDIR back, what you know is that "A" is not a
> directory (if the path tested this round were deeper like "X/Y/A/B", you
> know "X/Y/A" is not a directory, and you know "X" and "X/Y" are true
> directories; otherwise the loop would have exited before this round when
> you tested "X" or "X/Y" in the earlier rounds).
>
> So as far as I can think of, ENOENT case and ENOTDIR case would give you
> different information (ENOENT would say "A is a dir, A/B is not"; ENOTDIR
> would say "A is not a dir"). I am confused how you can cache the same
> path and same flag between these two cases here.
I take 1/4 of the above back.
If everything is working correctly, I do not think you will ever have a
situation where you check "A/B" here and get ENOTDIR back. lstat("A/B")
would yield ENOTDIR if "A" is not a directory, but:
(1) If you did test "A" in the earlier round in the loop, you would have
already found it is not a directory and would have exited the loop
with LSTAT_ERR. You cannot test "A/B" in such a case;
(2) If you did not test "A" in the loop, that has to be because you had a
cached information for it, and the caching logic would have returned
early if "A" was a non-directory without entering the loop; in other
words, you would test "A/B" here without testing "A" in the loop only
when the cache says "A" was a directory. You cannot get ENOTDIR in
such a case.
In any case, my main puzzlement still stands. I think ENOTDIR case should
behave differently from ENOENT case.
And I think it is an indication of a grave error, either this code is
racing with an external process that is actively mucking with the work
tree to make your cached information unusable, or somebody forgot to call
clear_lstat_cache().
Hmm?
next prev parent reply other threads:[~2009-01-06 8:58 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-05 13:09 [PATCH/RFC 0/4] git checkout: optimise away lots of lstat() calls Kjetil Barvik
2009-01-05 13:09 ` [PATCH/RFC 1/4] Optimised, faster, more effective symlink/directory detection Kjetil Barvik
2009-01-06 8:19 ` Junio C Hamano
2009-01-06 8:56 ` Junio C Hamano [this message]
2009-01-06 9:56 ` Kjetil Barvik
2009-01-06 13:45 ` Kjetil Barvik
2009-01-06 9:36 ` Kjetil Barvik
2009-01-05 13:09 ` [PATCH/RFC 2/4] Use 'lstat_cache()' instead of 'has_symlink_leading_path()' Kjetil Barvik
2009-01-06 8:19 ` Junio C Hamano
2009-01-06 12:50 ` Kjetil Barvik
2009-01-05 13:10 ` [PATCH/RFC 3/4] create_directories() inside entry.c: only check each directory once! Kjetil Barvik
2009-01-05 13:10 ` [PATCH/RFC 4/4] remove the old 'has_symlink_leading_path()' function Kjetil Barvik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7v3afwizi7.fsf@gitster.siamese.dyndns.org \
--to=gitster@pobox.com \
--cc=barvik@broadpark.no \
--cc=git@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).