From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <junkio@cox.net>
Subject: Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list
Date: Thu, 14 Sep 2006 09:59:03 +0200 [thread overview]
Message-ID: <200609140959.04061.jnareb@gmail.com> (raw)
In-Reply-To: <7v8xkm6gr6.fsf@assigned-by-dhcp.cox.net>
Dnia czwartek 14. września 2006 09:37, napisałeś:
> Jakub Narebski <jnareb@gmail.com> writes:
>
> > + sub wanted {
> > + # skip dot files (hidden files), check only directories
> > + #return if (/^\./);
>
> Leftover comment?
Leftover comment, from copying anonymous 'wanted' subroutine
from git_get_refs_list. But I have realized that for gitweb
for only one project one could have ".git" as a project name,
e.g. by putting $projectroot to be live git repository (working
directory of git repository).
> > + my $subdir = substr($File::Find::name, $pfxlen + 1);
> > + # we check related file in $projectroot
> > + if (-e "$projectroot/$subdir/HEAD") {
> > + push @list, { path => $subdir };
> > + $File::Find::prune = 1;
>
> We might want to do an extra cheap check to make what we found
> is sane, to prevent us getting confused by a random file whose
> name happens to be HEAD.
That is what we did before. Simplest check, also to avoid now to
claim top directory as git repository, and to know when to cut-off
(prune) finding.
It was intended I think to avoid adding '.' and '..' as git
repositories, not stray directories. Well, perhaps index file
if it was used.
> For example, it is a regular file whose contents is a single
> line and begins with "ref: refs/heads/" (16 bytes) or it is a
> symlink and readlink result begins with "refs/heads/" (11
> bytes).
We can do that, but I think it is unnecessary. Let's assume that
$projectroot contains _only_ git repositories, perhaps in subdirs
(directory hierarchy), and perhaps some stray files like not used
now index file.
> If you feel opening and reading the file is an added overhead,
> checking for $project/$subdir/{objects,refs}/ directories might
> be a good alternative.
Probably overkill.
> > + File::Find::find({
> > + no_chdir => 1, # do not change directory
> > + follow_fast => 1, # follow symbolic links
>
> What is the reason behind choosing follow_fast? By saying
> follow_anything, you choose to care about cases where there are
> symlinks under projectroot to point at various projects. If
> that is the case, don't you want to make sure you include the
> same project only once?
First, it is faster. Second, for testing if it works I used copy
of a one "live" git repository I have (git.git repository), by making
second symlink to it.
> > + #follow_skip => 2, # ignore duplicated files and directories
>
> Leftover comment?
Leftover from benchmarking what set of options is faster.
By the way, if we choose to use 'follow' instead of 'follow_fast' we
might want to uncomment it, to not spew errors in the log.
> About these two leftover comments, if you decided you did not
> want them, please do not leave them behind.
O.K.
> If on the other hand you wanted to hint others that you are not
> sure about your decision, it would probably be better to say
> that honestly in the comment, perhaps mark the message as RFC,
> and describe what the issues are, like so:
>
> sub wanted {
> # We might want to also ignore dot files, by
> # saying "return if /^\./;" here, but there is
> # no inherent reason for us to forbid a repository
> # name from starting with a dot.
True.
> # We check only if a directory looks like a git
> # repo and do not care about non directories.
> # Note that this cannot be done with "-d _"
> # because we are using follow_fast and the last
> # stat was done with lstat(); we want to catch a
> # symlink that points at a directory.
> return unless -d $_;
> ...
Not true. Link to directory is both -d $_ and -l $_, so
return unless (-d $_ || (-l $_ && -d readlink($_)));
is not needed.
--
Jakub Narebski
Poland
next prev parent reply other threads:[~2006-09-14 7:58 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-09-14 6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski
2006-09-14 7:37 ` Junio C Hamano
2006-09-14 7:59 ` Jakub Narebski [this message]
2006-09-14 9:23 ` Junio C Hamano
2006-09-14 18:32 ` Junio C Hamano
2006-09-14 18:43 ` Jakub Narebski
2006-09-14 9:40 ` Junio C Hamano
2006-09-14 10:01 ` Jakub Narebski
2006-09-14 16:42 ` Junio C Hamano
2006-09-14 17:02 ` Jakub Narebski
2006-09-14 17:12 ` Randal L. Schwartz
2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski
2006-09-14 19:14 ` Jakub Narebski
2006-09-14 19:51 ` Junio C Hamano
2006-09-14 21:50 ` Randal L. Schwartz
2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski
2006-09-14 20:13 ` Junio C Hamano
2006-09-14 20:18 ` [PATCH (take 4)] " Jakub Narebski
2006-09-14 20:24 ` Jakub Narebski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200609140959.04061.jnareb@gmail.com \
--to=jnareb@gmail.com \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).