From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Kjetil Barvik <barvik@broadpark.no>
Cc: git@vger.kernel.org, Pete Harlan <pgit@pcharlan.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH/RFC v5 1/1] more cache effective symlink/directory detection
Date: Sat, 10 Jan 2009 11:11:12 +0100 [thread overview]
Message-ID: <49687440.5090506@lsrfire.ath.cx> (raw)
In-Reply-To: <1231527954-868-2-git-send-email-barvik@broadpark.no>
Kjetil Barvik schrieb:
> - Also introduce a 'void clear_lstat_cache(void)' function, which
> should be used to clean the cache before usage. If for instance,
> you have changed the types of directories which should be cached,
> the cache could contain a path which was not wanted.
Is it possible to make the cache detect these situations automatically
by saving track_flags along with the cache contents? Not having to
clear the cache manually would be a major feature.
> --- a/cache.h
> +++ b/cache.h
> @@ -719,7 +719,29 @@ struct checkout {
> };
>
> extern int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath);
> -extern int has_symlink_leading_path(int len, const char *name);
> +
> +#define LSTAT_REG (1u << 0)
> +#define LSTAT_DIR (1u << 1)
> +#define LSTAT_NOENT (1u << 2)
> +#define LSTAT_SYMLINK (1u << 3)
> +#define LSTAT_LSTATERR (1u << 4)
> +#define LSTAT_ERR (1u << 5)
> +#define LSTAT_FULLPATH (1u << 6)
> +extern unsigned int lstat_cache(int len, const char *name,
> + unsigned int track_flags, int prefix_len_stat_func);
> +extern void clear_lstat_cache(void);
> +static inline unsigned int has_symlink_leading_path(int len, const char *name)
> +{
> + return lstat_cache(len, name, LSTAT_SYMLINK|LSTAT_DIR, -1) &
> + LSTAT_SYMLINK;
> +}
> +#define clear_symlink_cache() clear_lstat_cache()
> +static inline unsigned int has_symlink_or_noent_leading_path(int len, const char *name)
> +{
> + return lstat_cache(len, name, LSTAT_SYMLINK|LSTAT_NOENT|LSTAT_DIR, -1) &
> + (LSTAT_SYMLINK|LSTAT_NOENT);
> +}
> +#define clear_symlink_or_noent_cache() clear_lstat_cache()
What's the advantage of inlining the wrappers (expressed in units of
space and/or time)? The interface would be much nicer if you exported
the wrappers, only, and not all those constants along with them.
And why define aliases for clear_lstat_cache()?
> diff --git a/entry.c b/entry.c
> index aa2ee46a84033585d8e07a585610c5a697af82c2..293400cf5be63fd66b797a68e17bf953c600fe99 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -8,35 +8,28 @@ static void create_directories(const char *path, const struct checkout *state)
> const char *slash = path;
>
> while ((slash = strchr(slash+1, '/')) != NULL) {
> - struct stat st;
> - int stat_status;
> -
> len = slash - path;
> memcpy(buf, path, len);
> buf[len] = 0;
>
> - if (len <= state->base_dir_len)
> - /*
> - * checkout-index --prefix=<dir>; <dir> is
> - * allowed to be a symlink to an existing
> - * directory.
> - */
> - stat_status = stat(buf, &st);
> - else
> - /*
> - * if there currently is a symlink, we would
> - * want to replace it with a real directory.
> - */
> - stat_status = lstat(buf, &st);
> -
> - if (!stat_status && S_ISDIR(st.st_mode))
> + /* For 'checkout-index --prefix=<dir>', <dir> is
> + * allowed to be a symlink to an existing directory,
> + * therefore we must give 'state->base_dir_len' to the
> + * cache, such that we test path components of the
> + * prefix with stat() instead of lstat()
> + *
> + * We must also tell the cache to test the complete
> + * length of the buffer (the '|LSTAT_FULLPATH' part).
> + */
> + if (lstat_cache(len, buf, LSTAT_DIR|LSTAT_FULLPATH,
> + state->base_dir_len) &
> + LSTAT_DIR)
> continue; /* ok, it is already a directory. */
I'd say this usage is worth another wrapper.
Also, it's probably worth to split this patch up again. First switching
to your improved implementation of has_symlink_leading_path(), then
introducing has_symlink_or_noent_leading_path() and finally adding
LSTAT_FULLPATH and the fourth parameter of lstat_cache() etc. and using
this feature in entry.c seems like a nice incremental progression.
René
next prev parent reply other threads:[~2009-01-10 10:12 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-01-09 19:05 [PATCH/RFC v5 0/1] git checkout: optimise away lots of lstat() calls Kjetil Barvik
2009-01-09 19:05 ` [PATCH/RFC v5 1/1] more cache effective symlink/directory detection Kjetil Barvik
2009-01-10 10:11 ` René Scharfe [this message]
2009-01-11 0:48 ` Junio C Hamano
2009-01-11 8:26 ` Kjetil Barvik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49687440.5090506@lsrfire.ath.cx \
--to=rene.scharfe@lsrfire.ath.cx \
--cc=barvik@broadpark.no \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pgit@pcharlan.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).