From: Dmitry Potapov <dpotapov@gmail.com>
To: Eric Blake <ebb9@byu.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 3/3] Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry
Date: Fri, 10 Jul 2009 17:04:07 +0400
Message-ID: <20090710130407.GE19425@dpotapov.dyndns.org>
In-Reply-To: <20090709233024.GD19425@dpotapov.dyndns.org>
On Fri, Jul 10, 2009 at 03:30:24AM +0400, Dmitry Potapov wrote:
>
> But we still use readdir() from Cygwin and that may be source of extra
> syscalls that I observe...
- opendir does an extra 'stat' before opening the directory
- readdir does one more extra 'stat' on the parent directory before
  returning '..'
- open(.gitignore) does one extra 'stat' on the directory where it tries
  to open .gitignore (which did not exist in my tests)
So, the number of 'stat' calls on each directory is 2 plus the number of
subdirectories that it has. Thus, the total number of 'stat' calls for all
directories is 3 times the number of directories in your repo. All of
those 'stat' calls are artifacts of Cygwin. Also, you have 2 opens for each
directory, and one of them is redundant (at least, for Git's purposes).
Overall (including syscalls for .gitignore), you have the following
number of syscalls for each directory in your repo:
5 - QueryOpen (stat)
3 - CreateFile (open)
2 - CloseFile (close)
1 - QueryFileInternalInformationFile
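
The arithmetic behind the 'stat' count can be checked with a small sketch. It assumes a hypothetical toy tree given as a directory-to-parent map; the names `tree`, `stats_received` and `stats_issued` are made up for illustration and come from neither Git nor Cygwin. It counts the extra 'stat' calls two ways: by the directory they land on, and by the directory whose traversal issues them.

```python
from collections import Counter

# Hypothetical repo layout: directory -> parent (None marks the repo root).
tree = {
    "repo": None,
    "repo/dir1": "repo",
    "repo/dir2": "repo",
    "repo/dir1/sub": "repo/dir1",
}

def stats_received(tree):
    """Extra stats landing on each directory: 2 for its own opendir and
    open(.gitignore), plus 1 per subdirectory whose readdir returns '..'."""
    received = Counter({d: 2 for d in tree})
    for parent in tree.values():
        if parent is not None:
            received[parent] += 1
    return received

def stats_issued(tree):
    """Extra stats caused by traversing each directory: opendir, the
    readdir call that returns '..', and open(.gitignore)."""
    return {d: 3 for d in tree}

received = stats_received(tree)
total_received = sum(received.values())
total_issued = sum(stats_issued(tree).values())

print("stats on repo root:", received["repo"])
print("total received inside the tree:", total_received)
print("total issued by traversal:", total_issued)
```

In this toy tree the two totals differ by exactly one, because the '..' stat issued from the repository root lands on its parent, which is outside the tree. Either way the total grows as 3 times the number of directories.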
Here is the detailed listing from testing read_directory_recursive:
=====
opendir(.)
QueryOpen,E:\dpotapov\repo
CreateFile,E:\dpotapov\repo
first readdir call
QueryDirectory,E:\dpotapov\repo
second readdir call that returns '..'
QueryOpen,E:\dpotapov
CreateFile,E:\dpotapov
QueryFileInternalInformationFile,E:\dpotapov
CloseFile,E:\dpotapov
open(.gitignore) -- .gitignore does not exist
QueryOpen,E:\dpotapov\repo\.gitignore
QueryOpen,E:\dpotapov\repo\.gitignore.lnk
QueryOpen,E:\dpotapov\repo
CreateFile,E:\dpotapov\repo\.gitignore
stat for untracked file
QueryOpen,E:\dpotapov\repo\bar
opendir(dir1)
QueryOpen,E:\dpotapov\repo\dir1
CreateFile,E:\dpotapov\repo\dir1
first readdir call
QueryDirectory,E:\dpotapov\repo\dir1
second readdir call that returns '..'
QueryOpen,E:\dpotapov\repo
CreateFile,E:\dpotapov\repo
QueryFileInternalInformationFile,E:\dpotapov\repo
CloseFile,E:\dpotapov\repo
open(.gitignore) -- .gitignore does not exist
QueryOpen,E:\dpotapov\repo\dir1\.gitignore
QueryOpen,E:\dpotapov\repo\dir1\.gitignore.lnk
QueryOpen,E:\dpotapov\repo\dir1
CreateFile,E:\dpotapov\repo\dir1\.gitignore
last readdir call that returns NULL
QueryDirectory,E:\dpotapov\repo\dir1
closedir
CloseFile,E:\dpotapov\repo\dir1
stat for some modified file
QueryOpen,E:\dpotapov\repo\foo
last readdir call that returns NULL
QueryDirectory,E:\dpotapov\repo
closedir
CloseFile,E:\dpotapov\repo
=====
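
As a cross-check, the listing above can be tallied by operation. This is only a sketch: it embeds the trace lines from the listing (annotations dropped) and uses Python's standard library; `TRACE`, `tally`, `DIRECTORIES` and `FILE_STATS` are names invented for the example.

```python
from collections import Counter

# Trace lines copied from the listing above; the annotation lines are omitted.
TRACE = r"""
QueryOpen,E:\dpotapov\repo
CreateFile,E:\dpotapov\repo
QueryDirectory,E:\dpotapov\repo
QueryOpen,E:\dpotapov
CreateFile,E:\dpotapov
QueryFileInternalInformationFile,E:\dpotapov
CloseFile,E:\dpotapov
QueryOpen,E:\dpotapov\repo\.gitignore
QueryOpen,E:\dpotapov\repo\.gitignore.lnk
QueryOpen,E:\dpotapov\repo
CreateFile,E:\dpotapov\repo\.gitignore
QueryOpen,E:\dpotapov\repo\bar
QueryOpen,E:\dpotapov\repo\dir1
CreateFile,E:\dpotapov\repo\dir1
QueryDirectory,E:\dpotapov\repo\dir1
QueryOpen,E:\dpotapov\repo
CreateFile,E:\dpotapov\repo
QueryFileInternalInformationFile,E:\dpotapov\repo
CloseFile,E:\dpotapov\repo
QueryOpen,E:\dpotapov\repo\dir1\.gitignore
QueryOpen,E:\dpotapov\repo\dir1\.gitignore.lnk
QueryOpen,E:\dpotapov\repo\dir1
CreateFile,E:\dpotapov\repo\dir1\.gitignore
QueryDirectory,E:\dpotapov\repo\dir1
CloseFile,E:\dpotapov\repo\dir1
QueryOpen,E:\dpotapov\repo\foo
QueryDirectory,E:\dpotapov\repo
CloseFile,E:\dpotapov\repo
"""

def tally(trace):
    """Count trace lines by operation name (the text before the first comma)."""
    return Counter(
        line.split(",", 1)[0] for line in trace.splitlines() if line.strip()
    )

counts = tally(TRACE)

DIRECTORIES = 2  # repo and repo\dir1
FILE_STATS = 2   # the stats of bar (untracked) and foo (modified)

# Per-directory figures: the two file stats are not directory overhead.
per_dir = {
    "QueryOpen": (counts["QueryOpen"] - FILE_STATS) // DIRECTORIES,
    "CreateFile": counts["CreateFile"] // DIRECTORIES,
    "CloseFile": counts["CloseFile"] // DIRECTORIES,
    "QueryFileInternalInformationFile":
        counts["QueryFileInternalInformationFile"] // DIRECTORIES,
}

for op, n in per_dir.items():
    print(n, "-", op)
```

Run over the two directories in the listing, the tally should agree with the 5/3/2/1 per-directory summary given earlier once the two file stats are set aside.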
Dmitry