* [PATCH] gitweb: Use File::Find::find in git_get_projects_list @ 2006-09-14 6:39 Jakub Narebski 2006-09-14 7:37 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 6:39 UTC (permalink / raw) To: git Earlier code to get list of projects when $projects_list is a directory (e.g. when it is equal to $projectroot) had a hardcoded flat (one level) list of directories. Allow for projects to be in subdirectories also for $projects_list being a directory by using File::Find. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- This doesn't add much overhead to "project_list" view, compared to previous version; times are the same within margin of error. gitweb/gitweb.perl | 29 +++++++++++++++++++++-------- 1 files changed, 21 insertions(+), 8 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index c3544dd..470bff2 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -715,16 +715,29 @@ sub git_get_projects_list { if (-d $projects_list) { # search in directory my $dir = $projects_list; - opendir my ($dh), $dir or return undef; - while (my $dir = readdir($dh)) { - if (-e "$projectroot/$dir/HEAD") { - my $pr = { - path => $dir, - }; - push @list, $pr + my $pfxlen = length("$dir"); + + sub wanted { + # skip dot files (hidden files), check only directories + #return if (/^\./); + return unless (-d $_); + + my $subdir = substr($File::Find::name, $pfxlen + 1); + # we check related file in $projectroot + if (-e "$projectroot/$subdir/HEAD") { + push @list, { path => $subdir }; + $File::Find::prune = 1; } } - closedir($dh); + + File::Find::find({ + no_chdir => 1, # do not change directory + follow_fast => 1, # follow symbolic links + #follow_skip => 2, # ignore duplicated files and directories + dangling_symlinks => 0, # ignore dangling symlinks, silently + wanted => \&wanted, + }, "$dir"); + } elsif (-f $projects_list) { # read from file(url-encoded): # 'git%2Fgit.git Linus+Torvalds' -- 1.4.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski @ 2006-09-14 7:37 ` Junio C Hamano 2006-09-14 7:59 ` Jakub Narebski 2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski 2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski 2 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 7:37 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Jakub Narebski <jnareb@gmail.com> writes: > + sub wanted { > + # skip dot files (hidden files), check only directories > + #return if (/^\./); Leftover comment? > + my $subdir = substr($File::Find::name, $pfxlen + 1); > + # we check related file in $projectroot > + if (-e "$projectroot/$subdir/HEAD") { > + push @list, { path => $subdir }; > + $File::Find::prune = 1; We might want to do an extra cheap check to make what we found is sane, to prevent us getting confused by a random file whose name happens to be HEAD. For example, it is a regular file whose contents is a single line and begins with "ref: refs/heads/" (16 bytes) or it is a symlink and readlink result begins with "refs/heads/" (11 bytes). If you feel opening and reading the file is an added overhead, checking for $project/$subdir/{objects,refs}/ directories might be a good alternative. > + File::Find::find({ > + no_chdir => 1, # do not change directory > + follow_fast => 1, # follow symbolic links What is the reason behind choosing follow_fast? By saying follow_anything, you choose to care about cases where there are symlinks under projectroot to point at various projects. If that is the case, don't you want to make sure you include the same project only once? > + #follow_skip => 2, # ignore duplicated files and directories Leftover comment? About these two leftover comments, if you decided you did not want them, please do not leave them behind. If on the other hand you wanted to hint others that you are not sure about your decision, it would probably be better to say that honestly in the comment, perhaps mark the message as RFC, and describe what the issues are, like so: sub wanted { # We might want to also ignore dot files, by # saying "return if /^\./;" here, but there is # no inherent reason for us to forbid a repository # name from starting with a dot. # We check only if a directory looks like a git # repo and do not care about non directories. # Note that this cannot be done with "-d _" # because we are using follow_fast and the last # stat was done with lstat(); we want to catch a # symlink that points at a directory. return unless -d $_; ... ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 7:37 ` Junio C Hamano @ 2006-09-14 7:59 ` Jakub Narebski 2006-09-14 9:23 ` Junio C Hamano 2006-09-14 9:40 ` Junio C Hamano 0 siblings, 2 replies; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 7:59 UTC (permalink / raw) To: git; +Cc: Junio C Hamano Dnia czwartek 14. września 2006 09:37, napisałeś: > Jakub Narebski <jnareb@gmail.com> writes: > > > + sub wanted { > > + # skip dot files (hidden files), check only directories > > + #return if (/^\./); > > Leftover comment? Leftover comment, from copying anonymous 'wanted' subroutine from git_get_refs_list. But I have realized that for gitweb for only one project one could have ".git" as a project name, e.g. by putting $projectroot to be live git repository (working directory of git repository). > > + my $subdir = substr($File::Find::name, $pfxlen + 1); > > + # we check related file in $projectroot > > + if (-e "$projectroot/$subdir/HEAD") { > > + push @list, { path => $subdir }; > > + $File::Find::prune = 1; > > We might want to do an extra cheap check to make what we found > is sane, to prevent us getting confused by a random file whose > name happens to be HEAD. That is what we did before. Simplest check, also to avoid now to claim top directory as git repository, and to know when to cut-off (prune) finding. It was intended I think to avoid adding '.' and '..' as git repositories, not stray directories. Well, perhaps index file if it was used. > For example, it is a regular file whose contents is a single > line and begins with "ref: refs/heads/" (16 bytes) or it is a > symlink and readlink result begins with "refs/heads/" (11 > bytes). We can do that, but I think it is unnecessary. Let's assume that $projectroot contains _only_ git repositories, perhaps in subdirs (directory hierarchy), and perhaps some stray files like not used now index file. > If you feel opening and reading the file is an added overhead, > checking for $project/$subdir/{objects,refs}/ directories might > be a good alternative. Probably overkill. > > + File::Find::find({ > > + no_chdir => 1, # do not change directory > > + follow_fast => 1, # follow symbolic links > > What is the reason behind choosing follow_fast? By saying > follow_anything, you choose to care about cases where there are > symlinks under projectroot to point at various projects. If > that is the case, don't you want to make sure you include the > same project only once? First, it is faster. Second, for testing if it works I used copy of a one "live" git repository I have (git.git repository), by making second symlink to it. > > + #follow_skip => 2, # ignore duplicated files and directories > > Leftover comment? Leftover from benchmarking what set of options is faster. By the way, if we choose to use 'follow' instead of 'follow_fast' we might want to uncomment it, to not spew errors in the log. > About these two leftover comments, if you decided you did not > want them, please do not leave them behind. O.K. > If on the other hand you wanted to hint others that you are not > sure about your decision, it would probably be better to say > that honestly in the comment, perhaps mark the message as RFC, > and describe what the issues are, like so: > > sub wanted { > # We might want to also ignore dot files, by > # saying "return if /^\./;" here, but there is > # no inherent reason for us to forbid a repository > # name from starting with a dot. True. > # We check only if a directory looks like a git > # repo and do not care about non directories. > # Note that this cannot be done with "-d _" > # because we are using follow_fast and the last > # stat was done with lstat(); we want to catch a > # symlink that points at a directory. > return unless -d $_; > ... Not true. Link to directory is both -d $_ and -l $_, so return unless (-d $_ || (-l $_ && -d readlink($_))); is not needed. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 7:59 ` Jakub Narebski @ 2006-09-14 9:23 ` Junio C Hamano 2006-09-14 18:32 ` Junio C Hamano 2006-09-14 9:40 ` Junio C Hamano 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 9:23 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Jakub Narebski <jnareb@gmail.com> writes: > Not true. Link to directory is both -d $_ and -l $_, so > > return unless (-d $_ || (-l $_ && -d readlink($_))); > > is not needed. I think you mis-read what I said. I first wondered why you did not say "return unless -d _" and wrote (seemingly more inefficient) "return unless -d $_". The comment is to clarify why '$' is needed. In other words, after this setup: $ rm -fr d dl $ mkdir d $ ln -s d dl you do not see an output from this: $ perl -e 'lstat "dl"; print "is-dir\n" if -d _;' but you do from this: $ perl -e 'lstat "dl"; print "is-dir\n" if -d "dl";' ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 9:23 ` Junio C Hamano @ 2006-09-14 18:32 ` Junio C Hamano 2006-09-14 18:43 ` Jakub Narebski 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 18:32 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Junio C Hamano <junkio@cox.net> writes: > Jakub Narebski <jnareb@gmail.com> writes: > >> Not true. Link to directory is both -d $_ and -l $_, so >> >> return unless (-d $_ || (-l $_ && -d readlink($_))); >> >> is not needed. > > I think you mis-read what I said. I first wondered why you did > not say "return unless -d _" and wrote (seemingly more > inefficient) "return unless -d $_". The comment is to clarify > why '$' is needed. > > In other words, after this setup: > > $ rm -fr d dl > $ mkdir d > $ ln -s d dl > > you do not see an output from this: > > $ perl -e 'lstat "dl"; print "is-dir\n" if -d _;' > > but you do from this: > > $ perl -e 'lstat "dl"; print "is-dir\n" if -d "dl";' Side note: While return unless -d $_ there is definitely more correct than "return unless -d _" which is not, it is not the most efficient. Because you use fast_xxx, you know the last stat was lstat so "-d _" would be true if the thing you are looking at is a real directory and will be a zero-cost operation. The only case you want to be careful is a symlink pointing at a directory, so return unless ((-d _) || (-l _ && -d $_)) would be more efficient. I have a strange suspicion that Merlyn will soon join us with more expertise if we keep talking about Perl ;-) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 18:32 ` Junio C Hamano @ 2006-09-14 18:43 ` Jakub Narebski 0 siblings, 0 replies; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 18:43 UTC (permalink / raw) To: git Junio C Hamano wrote: > Side note: > > While > > return unless -d $_ > > there is definitely more correct than "return unless -d _" which > is not, it is not the most efficient. Because you use fast_xxx, > you know the last stat was lstat so "-d _" would be true if the > thing you are looking at is a real directory and will be a > zero-cost operation. The only case you want to be careful is a > symlink pointing at a directory, so > > return unless ((-d _) || (-l _ && -d $_)) > > would be more efficient. > > I have a strange suspicion that Merlyn will soon join us with > more expertise if we keep talking about Perl ;-) Truth to be told, I didn't know about '-d _', and File::Find(3pm) does talk only about $_. Besides, I guess that the possible speedup is negligible, and of course depend if for example the whole $projectroot aka. $projects_list is for example populated by symlinks. Then it would be slower. This is the place where more readable should win. -- Jakub Narebski Warsaw, Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 7:59 ` Jakub Narebski 2006-09-14 9:23 ` Junio C Hamano @ 2006-09-14 9:40 ` Junio C Hamano 2006-09-14 10:01 ` Jakub Narebski 1 sibling, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 9:40 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Jakub Narebski <jnareb@gmail.com> writes: > ... But I have realized that for gitweb > for only one project one could have ".git" as a project name, > e.g. by putting $projectroot to be live git repository (working > directory of git repository). I do not understand this comment. I have /git/{git,linux-2.6} and run test gitweb with projectroot set to /git to see it can display them both. Both are with working tree, so there are /git/git/.git and /git/linux-2.6/.git directories. Do you mean to say "project name" is always taken from the last component of the directory name and the above setup would result in ambiguity? If that is the case that sounds awful, but I haven't noticed it. > It was intended I think to avoid adding '.' and '..' as git > repositories, not stray directories. Well, perhaps index file > if it was used. Having and not having index are both valid, so there is no value in checking the index, even if we wanted to be more paranoid. Existence of HEAD, refs/heads, and/or objects/ would be a more meaningful alternative, but again only if we wanted to be more paranoid. >> > + File::Find::find({ >> > + no_chdir => 1, # do not change directory >> > + follow_fast => 1, # follow symbolic links >> >> What is the reason behind choosing follow_fast? By saying >> follow_anything, you choose to care about cases where there are >> symlinks under projectroot to point at various projects. If >> that is the case, don't you want to make sure you include the >> same project only once? > > First, it is faster. Second, for testing if it works I used copy > of a one "live" git repository I have (git.git repository), by making > second symlink to it. That was not what I wanted to ask; slower and correct is always preferred over fast and incorrect. I did not see anything that compensates the duplicates follow_fast might give you in the code, so I wondered there were some other trick you used to avoid it. In other words, "because I have such and such check to avoid duplicates, so I can safely use 'follow_fast', without using slower 'follow'" was the answer I was after. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 9:40 ` Junio C Hamano @ 2006-09-14 10:01 ` Jakub Narebski 2006-09-14 16:42 ` Junio C Hamano 0 siblings, 1 reply; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 10:01 UTC (permalink / raw) To: Junio C Hamano, git Dnia czwartek 14. września 2006 11:40, napisałeś: > Jakub Narebski <jnareb@gmail.com> writes: > > > ... But I have realized that for gitweb > > for only one project one could have ".git" as a project name, > > e.g. by putting $projectroot to be live git repository (working > > directory of git repository). > > I do not understand this comment. I have /git/{git,linux-2.6} > and run test gitweb with projectroot set to /git to see it can > display them both. Both are with working tree, so there are > /git/git/.git and /git/linux-2.6/.git directories. > > Do you mean to say "project name" is always taken from the last > component of the directory name and the above setup would result > in ambiguity? If that is the case that sounds awful, but I > haven't noticed it. Sorry, this comment was leftover from before, when no_chdir was not used. Then $_ was the last part of directory, and 'return if (/^\./);' would not take into consideration git/.git nor linux-2.6/.git as valid project names ($projectroot / $projects_list is stripped from project name). Now the test should read "if (m!^.*/\.!)" if we want to skip dot files and dot directories. > > It was intended I think to avoid adding '.' and '..' as git > > repositories, not stray directories. Well, perhaps index file > > if it was used. > > Having and not having index are both valid, so there is no value > in checking the index, even if we wanted to be more paranoid. > Existence of HEAD, refs/heads, and/or objects/ would be a more > meaningful alternative, but again only if we wanted to be more > paranoid. Sorry for the confusion. By index I meant alternate way of specyfying projects, and up till now the only way to specify hierarchical repository structure, namely $projects_list being file with encoded directories and encoded owners (it was named index.aux or index/index.aux by default). > >> > + File::Find::find({ > >> > + no_chdir => 1, # do not change directory > >> > + follow_fast => 1, # follow symbolic links > >> > >> What is the reason behind choosing follow_fast? By saying > >> follow_anything, you choose to care about cases where there are > >> symlinks under projectroot to point at various projects. If > >> that is the case, don't you want to make sure you include the > >> same project only once? > > > > First, it is faster. Second, for testing if it works I used copy > > of a one "live" git repository I have (git.git repository), by > > making second symlink to it. > > That was not what I wanted to ask; slower and correct is always > preferred over fast and incorrect. I did not see anything that > compensates the duplicates follow_fast might give you in the > code, so I wondered there were some other trick you used to avoid > it. > > In other words, "because I have such and such check to avoid > duplicates, so I can safely use 'follow_fast', without using > slower 'follow'" was the answer I was after. First of all, why duplicated repositories are considered error? It is perhaps error/mistake (besides testing of course) in doing the layout of repositories (projectroot), but it is not that something bad would happen if there are duplicated repositories. Second, follow_fast is faster, especially that in gitweb we would have to set "follow_skip => 2" to not spew errors to web server log when there are duplicated repositories. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 10:01 ` Jakub Narebski @ 2006-09-14 16:42 ` Junio C Hamano 2006-09-14 17:02 ` Jakub Narebski 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 16:42 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Jakub Narebski <jnareb@gmail.com> writes: > Sorry, this comment was leftover from before, when no_chdir was not > used. Then $_ was the last part of directory,... Ah, thanks. I missed that difference. Did you choose to use no_chdir for performance reasons or coding convenience (somehow I had an impression that no_chdir would be slower)? > Sorry for the confusion. By index I meant alternate way of specyfying > projects,... Again, thanks for clarification. I forgot about the handcrafted project index file gitweb uses. > First of all, why duplicated repositories are considered > error? ... it is not that something bad would happen if there > are duplicated repositories. Fair enough, and I agree that that is a sane reasoning. Thanks. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 16:42 ` Junio C Hamano @ 2006-09-14 17:02 ` Jakub Narebski 2006-09-14 17:12 ` Randal L. Schwartz 0 siblings, 1 reply; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 17:02 UTC (permalink / raw) To: git Junio C Hamano wrote: > Jakub Narebski <jnareb@gmail.com> writes: > >> Sorry, this comment was leftover from before, when no_chdir was not >> used. Then $_ was the last part of directory,... > > Ah, thanks. I missed that difference. Did you choose to use > no_chdir for performance reasons or coding convenience (somehow > I had an impression that no_chdir would be slower)? First benchmarks showed that no_chdir was some faster. I have rechecked, and they are the same within the margin of error, perhaps without no_chdir is slightly faster. 447 +/- 11 ms vs. 450 +/- 10 ms according to ApacheBench (ab -n 10). So it is really the matter of convenience. The default (without no_chdir) is I think more convenient. -- Jakub Narebski Warsaw, Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 17:02 ` Jakub Narebski @ 2006-09-14 17:12 ` Randal L. Schwartz 0 siblings, 0 replies; 19+ messages in thread From: Randal L. Schwartz @ 2006-09-14 17:12 UTC (permalink / raw) To: Jakub Narebski; +Cc: git >>>>> "Jakub" == Jakub Narebski <jnareb@gmail.com> writes: Jakub> First benchmarks showed that no_chdir was some faster. I have rechecked, Jakub> and they are the same within the margin of error, perhaps without Jakub> no_chdir is slightly faster. 447 +/- 11 ms vs. 450 +/- 10 ms according Jakub> to ApacheBench (ab -n 10). Any benchmarks will certainly depend on the O/S as well. The advantage to chdir is that the name-to-inode lookups don't need to keep retraversing the upper directories. And keep in mind that caching will affect the benchmarks on that. As for that whole -d thing... If you use _, you have to keep in mind whether the previous call was a stat() or an lstat(). Both lstat() (explictly) and -l use lstat(). Everything else is a stat(). So, "-l and not -d" uses an lstat to determine that we're looking at a symlink, but a stat to determine that the item pointed at is not a directory. NEVER use readlink(), as it very likely will require a lot of flattening for you to simulate what the OS does. (See my article on that at <http://www.stonehenge.com/merlyn/UnixReview/col27.html>.) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH (amend)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski 2006-09-14 7:37 ` Junio C Hamano @ 2006-09-14 17:39 ` Jakub Narebski 2006-09-14 19:14 ` Jakub Narebski 2006-09-14 21:50 ` Randal L. Schwartz 2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski 2 siblings, 2 replies; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 17:39 UTC (permalink / raw) To: git; +Cc: Junio Hamano Earlier code to get list of projects when $projects_list is a directory (e.g. when it is equal to $projectroot) had a hardcoded flat (one level) list of directories. Allow for projects to be in subdirectories also for $projects_list being a directory by using File::Find. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- gitweb/gitweb.perl | 26 ++++++++++++++++++-------- 1 files changed, 18 insertions(+), 8 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index c3544dd..27641a6 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -715,16 +715,26 @@ sub git_get_projects_list { if (-d $projects_list) { # search in directory my $dir = $projects_list; - opendir my ($dh), $dir or return undef; - while (my $dir = readdir($dh)) { - if (-e "$projectroot/$dir/HEAD") { - my $pr = { - path => $dir, - }; - push @list, $pr + my $pfxlen = length("$dir"); + + sub wanted { + # only directories can be git repositories + return unless (-d $_); + + my $subdir = substr($File::Find::name, $pfxlen + 1); + # we check related file in $projectroot + if (-e "$projectroot/$subdir/HEAD") { + push @list, { path => $subdir }; + $File::Find::prune = 1; } } - closedir($dh); + + File::Find::find({ + follow_fast => 1, # follow symbolic links + dangling_symlinks => 0, # ignore dangling symlinks, silently + wanted => \&wanted, + }, "$dir"); + } elsif (-f $projects_list) { # read from file(url-encoded): # 'git%2Fgit.git Linus+Torvalds' -- 1.4.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH (amend)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski @ 2006-09-14 19:14 ` Jakub Narebski 2006-09-14 19:51 ` Junio C Hamano 2006-09-14 21:50 ` Randal L. Schwartz 1 sibling, 1 reply; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 19:14 UTC (permalink / raw) To: git Gaah, the code spews quite a lot of warnings as of now. Variable "$pfxlen" will not stay shared at gitweb.perl line 724. Variable "@list" will not stay shared at gitweb.perl line 727. ... -- Jakub Narebski ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH (amend)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 19:14 ` Jakub Narebski @ 2006-09-14 19:51 ` Junio C Hamano 0 siblings, 0 replies; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 19:51 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Jakub Narebski <jnareb@gmail.com> writes: > Gaah, the code spews quite a lot of warnings as of now. > > Variable "$pfxlen" will not stay shared at gitweb.perl line 724. > Variable "@list" will not stay shared at gitweb.perl line 727. > ... Thanks for an early warning. Will not apply that one and instead will wait for updates. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH (amend)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski 2006-09-14 19:14 ` Jakub Narebski @ 2006-09-14 21:50 ` Randal L. Schwartz 1 sibling, 0 replies; 19+ messages in thread From: Randal L. Schwartz @ 2006-09-14 21:50 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Junio Hamano >>>>> "Jakub" == Jakub Narebski <jnareb@gmail.com> writes: Jakub> + sub wanted { Jakub> + # only directories can be git repositories Jakub> + return unless (-d $_); Jakub> + Jakub> + my $subdir = substr($File::Find::name, $pfxlen + 1); Jakub> + # we check related file in $projectroot Jakub> + if (-e "$projectroot/$subdir/HEAD") { Jakub> + push @list, { path => $subdir }; Jakub> + $File::Find::prune = 1; Jakub> } Jakub> } Jakub> - closedir($dh); Jakub> + Jakub> + File::Find::find({ Jakub> + follow_fast => 1, # follow symbolic links Jakub> + dangling_symlinks => 0, # ignore dangling symlinks, silently Jakub> + wanted => \&wanted, Jakub> + }, "$dir"); Don't define a sub inside a sub. That's the "won't stay shared" problem. Move that like this: find( { follow_fast => 1, dangling_symlinks => 0, wanted => sub { ... everything in &wanted above ... }, $dir); -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training! ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH (take 3)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski 2006-09-14 7:37 ` Junio C Hamano 2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski @ 2006-09-14 19:34 ` Jakub Narebski 2006-09-14 20:13 ` Junio C Hamano 2 siblings, 1 reply; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 19:34 UTC (permalink / raw) To: git; +Cc: Junio Hamano Earlier code to get list of projects when $projects_list is a directory (e.g. when it is equal to $projectroot) had a hardcoded flat (one level) list of directories. Allow for projects to be in subdirectories also for $projects_list being a directory by using File::Find. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- Use anonymous subroutine to avoid Variable "@list" will not stay shared at gitweb.perl line 727. warning. Check for the current directory to avoid substr outside string warning. gitweb/gitweb.perl | 30 ++++++++++++++++++++---------- 1 files changed, 20 insertions(+), 10 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index c3544dd..bea75d3 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -715,16 +715,26 @@ sub git_get_projects_list { if (-d $projects_list) { # search in directory my $dir = $projects_list; - opendir my ($dh), $dir or return undef; - while (my $dir = readdir($dh)) { - if (-e "$projectroot/$dir/HEAD") { - my $pr = { - path => $dir, - }; - push @list, $pr - } - } - closedir($dh); + my $pfxlen = length("$dir"); + + File::Find::find({ + follow_fast => 1, # follow symbolic links + dangling_symlinks => 0, # ignore dangling symlinks, silently + wanted => sub { + # skip current directory + return if (m!^/|.|..$!); + # only directories can be git repositories + return unless (-d $_); + + my $subdir = substr($File::Find::name, $pfxlen + 1); + # we check related file in $projectroot + if (-e "$projectroot/$subdir/HEAD") { + push @list, { path => $subdir }; + $File::Find::prune = 1; + } + }, + }, "$dir"); + } elsif (-f $projects_list) { # read from file(url-encoded): # 'git%2Fgit.git Linus+Torvalds' -- 1.4.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH (take 3)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski @ 2006-09-14 20:13 ` Junio C Hamano 2006-09-14 20:18 ` [PATCH (take 4)] " Jakub Narebski 0 siblings, 1 reply; 19+ messages in thread From: Junio C Hamano @ 2006-09-14 20:13 UTC (permalink / raw) To: Jakub Narebski; +Cc: git Jakub Narebski <jnareb@gmail.com> writes: > + wanted => sub { > + # skip current directory > + return if (m!^/|.|..$!); Huh? (1) Did you mean to say "\." (not any single character but literally dot)? (2) how does the alternatives within m{} construct bind (iow, please be gentle to the readers)? Do you mean return if (/^\/(?:\.|\.\.)$/) in other words, return if (/^\/\.$/ || /^\/\.\.$/) in other words, return if (/^\/\.{1,2}/) ??? ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH (take 4)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 20:13 ` Junio C Hamano @ 2006-09-14 20:18 ` Jakub Narebski 2006-09-14 20:24 ` Jakub Narebski 0 siblings, 1 reply; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 20:18 UTC (permalink / raw) To: git; +Cc: Junio C Hamano Earlier code to get list of projects when $projects_list is a directory (e.g. when it is equal to $projectroot) had a hardcoded flat (one level) list of directories. Allow for projects to be in subdirectories also for $projects_list being a directory by using File::Find. Signed-off-by: Jakub Narebski <jnareb@gmail.com> --- gitweb/gitweb.perl | 30 ++++++++++++++++++++---------- 1 files changed, 20 insertions(+), 10 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index c3544dd..25383bc 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -715,16 +715,26 @@ sub git_get_projects_list { if (-d $projects_list) { # search in directory my $dir = $projects_list; - opendir my ($dh), $dir or return undef; - while (my $dir = readdir($dh)) { - if (-e "$projectroot/$dir/HEAD") { - my $pr = { - path => $dir, - }; - push @list, $pr - } - } - closedir($dh); + my $pfxlen = length("$dir"); + + File::Find::find({ + follow_fast => 1, # follow symbolic links + dangling_symlinks => 0, # ignore dangling symlinks, silently + wanted => sub { + # skip current directory + return if (m!^[/.]$!); + # only directories can be git repositories + return unless (-d $_); + + my $subdir = substr($File::Find::name, $pfxlen + 1); + # we check related file in $projectroot + if (-e "$projectroot/$subdir/HEAD") { + push @list, { path => $subdir }; + $File::Find::prune = 1; + } + }, + }, "$dir"); + } elsif (-f $projects_list) { # read from file(url-encoded): # 'git%2Fgit.git Linus+Torvalds' -- 1.4.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH (take 4)] gitweb: Use File::Find::find in git_get_projects_list 2006-09-14 20:18 ` [PATCH (take 4)] " Jakub Narebski @ 2006-09-14 20:24 ` Jakub Narebski 0 siblings, 0 replies; 19+ messages in thread From: Jakub Narebski @ 2006-09-14 20:24 UTC (permalink / raw) To: git Jakub Narebski wrote: > + # skip current directory Perhaps it would be better to say "skip $projects_list (top) directory". I don't think it is important enough to resend a patch. -- Jakub Narebski Warsaw, Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2006-09-14 21:50 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-09-14 6:39 [PATCH] gitweb: Use File::Find::find in git_get_projects_list Jakub Narebski 2006-09-14 7:37 ` Junio C Hamano 2006-09-14 7:59 ` Jakub Narebski 2006-09-14 9:23 ` Junio C Hamano 2006-09-14 18:32 ` Junio C Hamano 2006-09-14 18:43 ` Jakub Narebski 2006-09-14 9:40 ` Junio C Hamano 2006-09-14 10:01 ` Jakub Narebski 2006-09-14 16:42 ` Junio C Hamano 2006-09-14 17:02 ` Jakub Narebski 2006-09-14 17:12 ` Randal L. Schwartz 2006-09-14 17:39 ` [PATCH (amend)] " Jakub Narebski 2006-09-14 19:14 ` Jakub Narebski 2006-09-14 19:51 ` Junio C Hamano 2006-09-14 21:50 ` Randal L. Schwartz 2006-09-14 19:34 ` [PATCH (take 3)] " Jakub Narebski 2006-09-14 20:13 ` Junio C Hamano 2006-09-14 20:18 ` [PATCH (take 4)] " Jakub Narebski 2006-09-14 20:24 ` Jakub Narebski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).