git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tuncer Ayaz <tuncer.ayaz@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, git@vger.kernel.org
Subject: Re: [RFC/PATCH] shortstatus v1
Date: Tue, 10 Feb 2009 23:52:47 +0100	[thread overview]
Message-ID: <4ac8254d0902101452u33a1ef0mb9d34182eff5838f@mail.gmail.com> (raw)
In-Reply-To: <7vtz72kjz0.fsf@gitster.siamese.dyndns.org>

On Tue, Feb 10, 2009 at 11:25 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> Warm cache, it runs in .042s on my git repo, about half of which is the
>> untracked files check. It takes about .49s on the kernel repo. The
>> read_directory() bit is not optimized at all, and could probably benefit
>> from an early return (OTOH, the worst case is still going to need to
>> look at every path).
>
> I suspect that with a large tree your have_untracked() would show
> unnecessary overhead from dir_add_name(), because you only want one bit of
> information but there is no way to stop with "ok, we know enough".  This
> toy patch adds a trivial "early return" to read_directory() codepath, but
> there are two sad things about it.
>
>  * In order to cheaply run "is there a single other file", you really
>   should scan the level you have already opened first before digging
>   deeper.  I didn't bother because the primary use of read_directory is
>   the depth first traversal.
>
>  * In a cloned work tree, the tracked files and directories come early in
>   the physical directory and then crufts you created yourself comes at
>   the end in readdir() order.  We tend to read a lot of tracked ones
>   first and the finally hit other files.

I've done some measurements from within .bashrc via  a
custom timer() bash function called before and after git ministatus call.
The box I'm testing on is a Core2Duo with 1.8GHz and 2GB of RAM.
I have faster machines I can test on but do Git coding on this one.

For WebKit.git this is similar to the previous patch. It does also take
at least 8 or 9 seconds. That's an improvement over 8-11secs.

Is it good to test against WebKit.git? I mean it's apparently big.

>  builtin-ministatus.c |    4 +++-
>  dir.c                |   32 +++++++++++++++++++++++++++-----
>  dir.h                |    2 ++
>  3 files changed, 32 insertions(+), 6 deletions(-)
>
> diff --git a/builtin-ministatus.c b/builtin-ministatus.c
> index aff9e5a..4b5a191 100644
> --- a/builtin-ministatus.c
> +++ b/builtin-ministatus.c
> @@ -25,7 +25,7 @@ static int have_untracked(void)
>
>        read_directory(&dir, ".", "", 0, NULL);
>        /* XXX we are probably leaking memory from dir */
> -       for (i = 0; i < dir.nr; i++)
> +       for (i = 0; i < dir.nr; i++) {
>                struct dir_entry *ent = dir.entries[i];
>                if (cache_name_is_other(ent->name, ent->len))
>                        return 1;
> @@ -47,6 +47,8 @@ int cmd_ministatus(int argc, const char **argv, const char *prefix)
>                putchar('*');
>        if (have_untracked())
>                putchar('?');
> +       if (untracked_files_exist())
> +               putchar('%');
>
>        return 0;
>  }
> diff --git a/dir.c b/dir.c
> index cfd1ea5..8d4fcdd 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -16,7 +16,10 @@ struct path_simplify {
>
>  static int read_directory_recursive(struct dir_struct *dir,
>        const char *path, const char *base, int baselen,
> -       int check_only, const struct path_simplify *simplify);
> +       int mode, const struct path_simplify *simplify);
> +#define READ_DIRECTORY_EMPTY_CHECK 1
> +#define READ_DIRECTORY_OTHER_CHECK 2
> +
>  static int get_dtype(struct dirent *de, const char *path);
>
>  int common_prefix(const char **pathspec)
> @@ -505,7 +508,8 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
>        /* This is the "show_other_directories" case */
>        if (!dir->hide_empty_directories)
>                return show_directory;
> -       if (!read_directory_recursive(dir, dirname, dirname, len, 1, simplify))
> +       if (!read_directory_recursive(dir, dirname, dirname, len,
> +                                     READ_DIRECTORY_EMPTY_CHECK, simplify))
>                return ignore_directory;
>        return show_directory;
>  }
> @@ -574,10 +578,12 @@ static int get_dtype(struct dirent *de, const char *path)
>  * Also, we ignore the name ".git" (even if it is not a directory).
>  * That likely will not change.
>  */
> -static int read_directory_recursive(struct dir_struct *dir, const char *path, const char *base, int baselen, int check_only, const struct path_simplify *simplify)
> +static int read_directory_recursive(struct dir_struct *dir, const char *path, const char *base, int baselen, int mode, const struct path_simplify *simplify)
>  {
>        DIR *fdir = opendir(path);
>        int contents = 0;
> +       int empty_check_only = (mode == READ_DIRECTORY_EMPTY_CHECK);
> +       int other_check_only = (mode == READ_DIRECTORY_OTHER_CHECK);
>
>        if (fdir) {
>                struct dirent *de;
> @@ -639,7 +645,7 @@ static int read_directory_recursive(struct dir_struct *dir, const char *path, co
>                                        break;
>                                case recurse_into_directory:
>                                        contents += read_directory_recursive(dir,
> -                                               fullname, fullname, baselen + len, 0, simplify);
> +                                               fullname, fullname, baselen + len, mode, simplify);
>                                        continue;
>                                case ignore_directory:
>                                        continue;
> @@ -650,10 +656,12 @@ static int read_directory_recursive(struct dir_struct *dir, const char *path, co
>                                break;
>                        }
>                        contents++;
> -                       if (check_only)
> +                       if (empty_check_only)
>                                goto exit_early;
>                        else
>                                dir_add_name(dir, fullname, baselen + len);
> +                       if (other_check_only && dir->nr)
> +                               goto exit_early;
>                }
>  exit_early:
>                closedir(fdir);
> @@ -731,6 +739,20 @@ int read_directory(struct dir_struct *dir, const char *path, const char *base, i
>        return dir->nr;
>  }
>
> +int untracked_files_exist(void)
> +{
> +       struct dir_struct dir;
> +       int i;
> +
> +       memset(&dir, 0, sizeof(dir));
> +       setup_standard_excludes(&dir);
> +       read_directory_recursive(&dir, ".", "", 0, READ_DIRECTORY_OTHER_CHECK,
> +                                NULL);
> +       for (i = 0; i < dir.nr; i++)
> +               free(dir.entries[i]);
> +       return !!dir.nr;
> +}
> +
>  int file_exists(const char *f)
>  {
>        struct stat sb;
> diff --git a/dir.h b/dir.h
> index bdc2d47..1f8b575 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -92,4 +92,6 @@ extern int remove_dir_recursively(struct strbuf *path, int only_empty);
>  /* tries to remove the path with empty directories along it, ignores ENOENT */
>  extern int remove_path(const char *path);
>
> +extern int untracked_files_exist(void);
> +
>  #endif
>

  reply	other threads:[~2009-02-10 22:54 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-10  0:51 [RFC/PATCH] shortstatus v1 Tuncer Ayaz
2009-02-10  1:44 ` Junio C Hamano
2009-02-10  3:46   ` Sitaram Chamarty
2009-02-10 10:22     ` Spending time in PS1, was " Johannes Schindelin
2009-02-10 17:31       ` Sitaram Chamarty
2009-02-10 10:11   ` Tuncer Ayaz
2009-02-10 11:03 ` Jeff King
2009-02-10 11:29   ` Michael J Gruber
2009-02-10 11:31     ` Tuncer Ayaz
2009-02-10 11:45     ` Jeff King
2009-02-10 12:36       ` Michael J Gruber
2009-02-10 13:01         ` Jeff King
2009-02-10 15:58   ` Junio C Hamano
2009-02-10 18:10     ` Jeff King
2009-02-10 18:22       ` Jeff King
2009-02-10 19:11       ` Jeff King
2009-02-10 21:21         ` Tuncer Ayaz
2009-02-10 21:36           ` Jeff King
2009-02-10 22:25         ` Junio C Hamano
2009-02-10 22:52           ` Tuncer Ayaz [this message]
2009-02-10 22:55           ` Jeff King
2009-02-10 23:05             ` Junio C Hamano
2009-02-12  0:49               ` Jeff King
2009-02-10 23:52     ` Nanako Shiraishi
2009-02-11 21:24       ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2009-02-10 23:58 [RFC] New command: 'git snapshot' Ulrik Sverdrup
2009-02-11  0:08 ` [RFC/PATCH] shortstatus v1 Nanako Shiraishi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ac8254d0902101452u33a1ef0mb9d34182eff5838f@mail.gmail.com \
    --to=tuncer.ayaz@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).