Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <junkio@cox.net>
Subject: Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list
Date: Thu, 14 Sep 2006 09:59:03 +0200	[thread overview]
Message-ID: <200609140959.04061.jnareb@gmail.com> (raw)
In-Reply-To: <7v8xkm6gr6.fsf@assigned-by-dhcp.cox.net>

Dnia czwartek 14. września 2006 09:37, napisałeś:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > +		sub wanted {
> > +			# skip dot files (hidden files), check only directories
> > +			#return if (/^\./);
> 
> Leftover comment?

Leftover comment, from copying anonymous 'wanted' subroutine
from git_get_refs_list. But I have realized that for gitweb
for only one project one could have ".git" as a project name,
e.g. by putting $projectroot to be live git repository (working
directory of git repository).

> > +			my $subdir = substr($File::Find::name, $pfxlen + 1);
> > +			# we check related file in $projectroot
> > +			if (-e "$projectroot/$subdir/HEAD") {
> > +				push @list, { path => $subdir };
> > +				$File::Find::prune = 1;
> 
> We might want to do an extra cheap check to make what we found
> is sane, to prevent us getting confused by a random file whose
> name happens to be HEAD.

That is what we did before. Simplest check, also to avoid now to 
claim top directory as git repository, and to know when to cut-off 
(prune) finding.

It was intended I think to avoid adding '.' and '..' as git 
repositories, not stray directories. Well, perhaps index file
if it was used.

> For example, it is a regular file whose contents is a single
> line and begins with "ref: refs/heads/" (16 bytes) or it is a
> symlink and readlink result begins with "refs/heads/" (11
> bytes).

We can do that, but I think it is unnecessary. Let's assume that
$projectroot contains _only_ git repositories, perhaps in subdirs 
(directory hierarchy), and perhaps some stray files like not used
now index file.

> If you feel opening and reading the file is an added overhead,
> checking for $project/$subdir/{objects,refs}/ directories might
> be a good alternative.

Probably overkill.

> > +		File::Find::find({
> > +			no_chdir => 1, # do not change directory
> > +			follow_fast => 1, # follow symbolic links
> 
> What is the reason behind choosing follow_fast?  By saying
> follow_anything, you choose to care about cases where there are
> symlinks under projectroot to point at various projects.  If
> that is the case, don't you want to make sure you include the
> same project only once?

First, it is faster. Second, for testing if it works I used copy
of a one "live" git repository I have (git.git repository), by making
second symlink to it.

> > +			#follow_skip => 2, # ignore duplicated files and directories
> 
> Leftover comment?

Leftover from benchmarking what set of options is faster.

By the way, if we choose to use 'follow' instead of 'follow_fast' we 
might want to uncomment it, to not spew errors in the log.

> About these two leftover comments, if you decided you did not
> want them, please do not leave them behind.

O.K.

> If on the other hand you wanted to hint others that you are not
> sure about your decision, it would probably be better to say
> that honestly in the comment, perhaps mark the message as RFC,
> and describe what the issues are, like so:
> 
> 	sub wanted {
> 		# We might want to also ignore dot files, by
>                 # saying "return if /^\./;" here, but there is
>                 # no inherent reason for us to forbid a repository
>                 # name from starting with a dot.

True.

>                 # We check only if a directory looks like a git
>                 # repo and do not care about non directories.
>                 # Note that this cannot be done with "-d _"
>                 # because we are using follow_fast and the last
>                 # stat was done with lstat(); we want to catch a
>                 # symlink that points at a directory.
>                 return unless -d $_;
>                 ...

Not true. Link to directory is both -d $_ and -l $_, so

	return unless (-d $_ || (-l $_ && -d readlink($_)));

is not needed.
-- 
Jakub Narebski
Poland

next prev parent reply	other threads:[~2006-09-14  7:58 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-09-14  6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski
2006-09-14  7:37 ` Junio C Hamano
2006-09-14  7:59   ` Jakub Narebski [this message]
2006-09-14  9:23     ` Junio C Hamano
2006-09-14 18:32       ` Junio C Hamano
2006-09-14 18:43         ` Jakub Narebski
2006-09-14  9:40     ` Junio C Hamano
2006-09-14 10:01       ` Jakub Narebski
2006-09-14 16:42         ` Junio C Hamano
2006-09-14 17:02           ` Jakub Narebski
2006-09-14 17:12             ` Randal L. Schwartz
2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski
2006-09-14 19:14   ` Jakub Narebski
2006-09-14 19:51     ` Junio C Hamano
2006-09-14 21:50   ` Randal L. Schwartz
2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski
2006-09-14 20:13   ` Junio C Hamano
2006-09-14 20:18     ` [PATCH (take 4)] " Jakub Narebski
2006-09-14 20:24       ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200609140959.04061.jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).