From: Junio C Hamano <gitster@pobox.com>
To: "René Scharfe" <l.s.r@web.de>
Cc: "Nikolay Avdeev" <avdeev@math.vsu.ru>,
git@vger.kernel.org, "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: Bug report about symlinks
Date: Sun, 03 Aug 2014 10:19:21 -0700 [thread overview]
Message-ID: <xmqqk36ptrs6.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <53DCF14D.8040705@web.de> ("René Scharfe"'s message of "Sat, 02 Aug 2014 16:10:21 +0200")
René Scharfe <l.s.r@web.de> writes:
> How about the patch below? Before it checks if an index entry exists
> in the work tree, it checks if its path includes a symlink.
Honestly, I didn't expect the fix to be in the refresh-index code
path, but doing this there sort of makes sense.
I think you, who dug to find out where to add the check, already
know this, and I am writing this mainly for myself and for the list
archive, but when the knee-jerk "has-syjmlink-leading-path missing?"
reaction came to me, two obvious optimizations also came to my mind:
- When checking a cache entry "a/b/c/d/e" and we find out "a/b/c"
is a symbolic link, we mark it as ~CE_VALID, but at the same
time, we learn "a/b/c/any/thing" are also ~CE_VALID with that
check, so we _could_, after the has_symlink_leading_path once
returns true, so there may be an opportunity to fast-forward the
scan, marking all paths under "a/b/c" as ~CE_VALID.
- After finding out "a/b/c" is a symbolic link to cause anything
underneath marked as ~CE_VALID, we also know "a/" and "a/b/"
exists as real directories, so a later check for "a/b/some/thing"
can start checking at "a/b/some/" without checking "a/" and "a/b/".
The latter is more important as it is a much more common case that
the shape of the working tree not to change.
But I think the implementation of has_symlink_leading_path() already
has these optimizations built inside around the (unfortunately named)
"struct cache_def", so it probably would not give us much boost to
implement such a fast-forwarding of the scan.
> And do we need to use the threaded_ variant of the function here?
Hmmm, this is a tangent, but you comment made me wonder if we also
need to adjust preload_thread() in preload-index.c somehow, but we
do not touch CE_UPTODATE there, so it probably is not necessary.
The caller of refresh_cache_ent() is walking an array of sorted
pathnames aka istate->cache[] in a single-threaded fashion, possibly
with a pathspec to limit the scan. Do you mean that we may want to
make istate->cache[] scanned by multiple threads? I am not sure.
> diff --git a/read-cache.c b/read-cache.c
> index 5d3c8bd..6f0057f 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1064,6 +1064,14 @@ static struct cache_entry *refresh_cache_ent(struct index_state *istate,
> return ce;
> }
>
> + if (has_symlink_leading_path(ce->name, ce_namelen(ce))) {
> + if (ignore_missing)
> + return ce;
> + if (err)
> + *err = ENOENT;
> + return NULL;
> + }
> +
> if (lstat(ce->name, &st) < 0) {
> if (ignore_missing && errno == ENOENT)
> return ce;
next prev parent reply other threads:[~2014-08-03 17:19 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-07-31 19:50 Bug report about symlinks Nikolay Avdeev
2014-07-31 22:04 ` René Scharfe
2014-08-01 16:23 ` Junio C Hamano
2014-08-02 14:10 ` René Scharfe
2014-08-03 17:19 ` Junio C Hamano [this message]
2014-08-03 22:59 ` René Scharfe
2014-08-04 16:34 ` Junio C Hamano
2014-08-09 17:43 ` [PATCH] read-cache: check for leading symlinks when refreshing index René Scharfe
2014-08-04 11:03 ` Bug report about symlinks Duy Nguyen
2014-08-04 16:36 ` Junio C Hamano
-- strict thread matches above, loose matches on Subject: below --
2014-07-30 11:30 NickKolok
2014-08-01 19:10 ` Dennis Kaarsemaker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqk36ptrs6.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox.com \
--cc=avdeev@math.vsu.ru \
--cc=git@vger.kernel.org \
--cc=l.s.r@web.de \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.