From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: Dmitry Potapov <dpotapov@gmail.com>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 3/3] Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry
Date: Thu, 9 Jul 2009 09:59:10 -0700 (PDT) [thread overview]
Message-ID: <alpine.LFD.2.01.0907090954090.3352@localhost.localdomain> (raw)
In-Reply-To: <7vws6h3ji4.fsf@alter.siamese.dyndns.org>
On Thu, 9 Jul 2009, Junio C Hamano wrote:
> > +
> > + /* Try to look it up as a directory */
> > + pos = cache_name_pos(path, len);
> > + if (pos >= 0)
> > + return 0;
>
> How can this find an exact entry for the path? Assuming that the name
> hash cache_name_exists() is not out of sync?
Hopefully it would never trigger. But I'd rather write robust code that
doesn't make any fancy assumptions. Keep it simple - and keep it working
even if surprising things happen.
> > + if (!ce_uptodate(ce))
> > + break; /* continue? */
>
> I think this should be continue, as the directory D you are interested in
> may have two files, one modified, the other uptodate.
The thing is, the directory may have subdirectories, and there may be
tens of thousands of files there. And maybe this gets called by code that
hasn't done any cache preloading at all, so nothing will be up-to-date.
Do we want to loop over thousands of entries? Or do we want to loop as
little as possible, and just say "most of the time the first entry will be
representative".
But I did put the 'continue' in a comment, because it's not a correctness
issue, it's a gut feel.
Linus
next prev parent reply other threads:[~2009-07-09 16:59 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-07 0:05 Too many 'stat' calls by git-status on Windows Dmitry Potapov
2009-07-08 19:49 ` Ramsay Jones
2009-07-09 2:04 ` Linus Torvalds
2009-07-09 2:35 ` Linus Torvalds
2009-07-09 2:40 ` [PATCH 1/3] Add 'fill_directory()' helper function for directory traversal Linus Torvalds
2009-07-09 2:42 ` [PATCH 2/3] Simplify read_directory[_recursive]() arguments Linus Torvalds
2009-07-09 2:43 ` [PATCH 3/3] Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry Linus Torvalds
2009-07-09 8:18 ` Junio C Hamano
2009-07-09 15:52 ` Linus Torvalds
2009-07-09 16:32 ` Junio C Hamano
2009-07-09 16:59 ` Linus Torvalds [this message]
2009-07-09 18:34 ` Junio C Hamano
2009-07-09 17:13 ` Linus Torvalds
2009-07-09 17:18 ` Linus Torvalds
2009-07-09 18:37 ` Junio C Hamano
2009-07-09 18:53 ` Linus Torvalds
2009-07-09 20:44 ` [PATCH 4/3] Avoid using 'lstat()' to figure out directories Linus Torvalds
2009-07-09 20:47 ` [PATCH 5/3] Prepare symlink caching for thread-safety Linus Torvalds
2009-07-09 20:48 ` [PATCH 6/3] Export thread-safe version of 'has_symlink_leading_path()' Linus Torvalds
2009-07-09 20:50 ` [PATCH 7/3] Make index preloading check the whole path to the file Linus Torvalds
2009-07-09 20:56 ` Linus Torvalds
2009-07-10 3:12 ` Junio C Hamano
2009-07-10 3:29 ` Linus Torvalds
2009-07-10 3:40 ` Linus Torvalds
2009-07-11 2:53 ` Junio C Hamano
2009-07-11 3:04 ` Linus Torvalds
2009-07-12 0:09 ` [PATCH 6/3] Export thread-safe version of 'has_symlink_leading_path()' Kjetil Barvik
2009-07-12 21:33 ` Junio C Hamano
2009-07-09 22:36 ` [PATCH 4/3] Avoid using 'lstat()' to figure out directories Paolo Bonzini
2009-07-09 23:26 ` Linus Torvalds
2009-07-09 23:52 ` Linus Torvalds
2009-07-10 0:13 ` Linus Torvalds
2009-07-09 23:37 ` Junio C Hamano
2009-07-09 21:05 ` [PATCH 3/3] Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry Dmitry Potapov
2009-07-09 21:52 ` Eric Blake
2009-07-09 23:30 ` [PATCH 3/3] Avoid doing extra 'lstat()'s for d_type if we have?an " Dmitry Potapov
2009-07-10 13:04 ` Dmitry Potapov
2009-07-09 23:29 ` [PATCH 3/3] Avoid doing extra 'lstat()'s for d_type if we have an " Dmitry Potapov
2009-07-09 13:50 ` Dmitry Potapov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.01.0907090954090.3352@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=dpotapov@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).