All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <junkio@cox.net>
Subject: Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list
Date: Thu, 14 Sep 2006 09:59:03 +0200	[thread overview]
Message-ID: <200609140959.04061.jnareb@gmail.com> (raw)
In-Reply-To: <7v8xkm6gr6.fsf@assigned-by-dhcp.cox.net>

Dnia czwartek 14. września 2006 09:37, napisałeś:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > +		sub wanted {
> > +			# skip dot files (hidden files), check only directories
> > +			#return if (/^\./);
> 
> Leftover comment?

Leftover comment, from copying anonymous 'wanted' subroutine
from git_get_refs_list. But I have realized that for gitweb
for only one project one could have ".git" as a project name,
e.g. by putting $projectroot to be live git repository (working
directory of git repository).

> > +			my $subdir = substr($File::Find::name, $pfxlen + 1);
> > +			# we check related file in $projectroot
> > +			if (-e "$projectroot/$subdir/HEAD") {
> > +				push @list, { path => $subdir };
> > +				$File::Find::prune = 1;
> 
> We might want to do an extra cheap check to make what we found
> is sane, to prevent us getting confused by a random file whose
> name happens to be HEAD.

That is what we did before. Simplest check, also to avoid now to 
claim top directory as git repository, and to know when to cut-off 
(prune) finding.

It was intended I think to avoid adding '.' and '..' as git 
repositories, not stray directories. Well, perhaps index file
if it was used.

> For example, it is a regular file whose contents is a single
> line and begins with "ref: refs/heads/" (16 bytes) or it is a
> symlink and readlink result begins with "refs/heads/" (11
> bytes).

We can do that, but I think it is unnecessary. Let's assume that
$projectroot contains _only_ git repositories, perhaps in subdirs 
(directory hierarchy), and perhaps some stray files like not used
now index file.

> If you feel opening and reading the file is an added overhead,
> checking for $project/$subdir/{objects,refs}/ directories might
> be a good alternative.

Probably overkill.

> > +		File::Find::find({
> > +			no_chdir => 1, # do not change directory
> > +			follow_fast => 1, # follow symbolic links
> 
> What is the reason behind choosing follow_fast?  By saying
> follow_anything, you choose to care about cases where there are
> symlinks under projectroot to point at various projects.  If
> that is the case, don't you want to make sure you include the
> same project only once?

First, it is faster. Second, for testing if it works I used copy
of a one "live" git repository I have (git.git repository), by making
second symlink to it.

> > +			#follow_skip => 2, # ignore duplicated files and directories
> 
> Leftover comment?

Leftover from benchmarking what set of options is faster.

By the way, if we choose to use 'follow' instead of 'follow_fast' we 
might want to uncomment it, to not spew errors in the log.

> About these two leftover comments, if you decided you did not
> want them, please do not leave them behind.

O.K.

> If on the other hand you wanted to hint others that you are not
> sure about your decision, it would probably be better to say
> that honestly in the comment, perhaps mark the message as RFC,
> and describe what the issues are, like so:
> 
> 	sub wanted {
> 		# We might want to also ignore dot files, by
>                 # saying "return if /^\./;" here, but there is
>                 # no inherent reason for us to forbid a repository
>                 # name from starting with a dot.

True.

>                 # We check only if a directory looks like a git
>                 # repo and do not care about non directories.
>                 # Note that this cannot be done with "-d _"
>                 # because we are using follow_fast and the last
>                 # stat was done with lstat(); we want to catch a
>                 # symlink that points at a directory.
>                 return unless -d $_;
>                 ...

Not true. Link to directory is both -d $_ and -l $_, so

	return unless (-d $_ || (-l $_ && -d readlink($_)));

is not needed.
-- 
Jakub Narebski
Poland

  reply	other threads:[~2006-09-14  7:58 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-09-14  6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski
2006-09-14  7:37 ` Junio C Hamano
2006-09-14  7:59   ` Jakub Narebski [this message]
2006-09-14  9:23     ` Junio C Hamano
2006-09-14 18:32       ` Junio C Hamano
2006-09-14 18:43         ` Jakub Narebski
2006-09-14  9:40     ` Junio C Hamano
2006-09-14 10:01       ` Jakub Narebski
2006-09-14 16:42         ` Junio C Hamano
2006-09-14 17:02           ` Jakub Narebski
2006-09-14 17:12             ` Randal L. Schwartz
2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski
2006-09-14 19:14   ` Jakub Narebski
2006-09-14 19:51     ` Junio C Hamano
2006-09-14 21:50   ` Randal L. Schwartz
2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski
2006-09-14 20:13   ` Junio C Hamano
2006-09-14 20:18     ` [PATCH (take 4)] " Jakub Narebski
2006-09-14 20:24       ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200609140959.04061.jnareb@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.