git.vger.kernel.org archive mirror
* [PATCH] gitweb: speed up project listing on large work trees by limiting find depth
@ 2007-10-17  3:45 Luke Lu
  2007-10-17  4:00 ` Shawn O. Pearce
  0 siblings, 1 reply; 5+ messages in thread
From: Luke Lu @ 2007-10-17  3:45 UTC
  To: git; +Cc: pasky, spearce, Luke Lu

Resubmitting patch after passing gitweb regression tests.

Signed-off-by: Luke Lu <git@vicaya.com>
---
 Makefile                               |    2 ++
 gitweb/gitweb.perl                     |   10 ++++++++++
 t/t9500-gitweb-standalone-no-errors.sh |    1 +
 3 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/Makefile b/Makefile
index 8db4dbe..3e9938e 100644
--- a/Makefile
+++ b/Makefile
@@ -165,6 +165,7 @@ GITWEB_CONFIG = gitweb_config.perl
 GITWEB_HOME_LINK_STR = projects
 GITWEB_SITENAME =
 GITWEB_PROJECTROOT = /pub/git
+GITWEB_PROJECT_MAXDEPTH = 2007
 GITWEB_EXPORT_OK =
 GITWEB_STRICT_EXPORT =
 GITWEB_BASE_URL =
@@ -831,6 +832,7 @@ gitweb/gitweb.cgi: gitweb/gitweb.perl
 	    -e 's|++GITWEB_HOME_LINK_STR++|$(GITWEB_HOME_LINK_STR)|g' \
 	    -e 's|++GITWEB_SITENAME++|$(GITWEB_SITENAME)|g' \
 	    -e 's|++GITWEB_PROJECTROOT++|$(GITWEB_PROJECTROOT)|g' \
+	    -e 's|"++GITWEB_PROJECT_MAXDEPTH++"|$(GITWEB_PROJECT_MAXDEPTH)|g' \
 	    -e 's|++GITWEB_EXPORT_OK++|$(GITWEB_EXPORT_OK)|g' \
 	    -e 's|++GITWEB_STRICT_EXPORT++|$(GITWEB_STRICT_EXPORT)|g' \
 	    -e 's|++GITWEB_BASE_URL++|$(GITWEB_BASE_URL)|g' \
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 3064298..48e21da 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -35,6 +35,10 @@ our $GIT = "++GIT_BINDIR++/git";
 #our $projectroot = "/pub/scm";
 our $projectroot = "++GITWEB_PROJECTROOT++";
 
+# fs traversing limit for getting project list
+# the number is relative to the projectroot
+our $project_maxdepth = "++GITWEB_PROJECT_MAXDEPTH++";
+
 # target of the home link on top of all pages
 our $home_link = $my_uri || "/";
 
@@ -1509,6 +1513,7 @@ sub git_get_projects_list {
 		# remove the trailing "/"
 		$dir =~ s!/+$!!;
 		my $pfxlen = length("$dir");
+		my $pfxdepth = ($dir =~ tr!/!!);
 
 		File::Find::find({
 			follow_fast => 1, # follow symbolic links
@@ -1519,6 +1524,11 @@ sub git_get_projects_list {
 				return if (m!^[/.]$!);
 				# only directories can be git repositories
 				return unless (-d $_);
+				# don't traverse too deep (Find is super slow on os x)
+				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
+					$File::Find::prune = 1;
+					return;
+				}
 
 				my $subdir = substr($File::Find::name, $pfxlen + 1);
 				# we check related file in $projectroot
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 642b836..f7bad5b 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -18,6 +18,7 @@ gitweb_init () {
 our \$version = "current";
 our \$GIT = "git";
 our \$projectroot = "$(pwd)";
+our \$project_maxdepth = 8;
 our \$home_link_str = "projects";
 our \$site_name = "[localhost]";
 our \$site_header = "";
-- 
1.5.3.4
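
The depth limit in the patch above can be tried in isolation. The following is a minimal sketch, not the gitweb code itself: it builds a throwaway directory tree and prunes File::Find once a directory lies more than a hypothetical $maxdepth levels below the root. The temporary tree and the names $maxdepth and @visited are assumptions made only for illustration.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Throwaway tree: <root>/a/b/c/d (hypothetical layout)
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b/c/d");

my $maxdepth = 2;                      # hypothetical limit, relative to $root
my $pfxdepth = ($root =~ tr!/!!);      # number of slashes in the root itself
my @visited;

File::Find::find({
	wanted => sub {
		# only directories are of interest here
		return unless -d $_;
		# depth of the current entry relative to the root
		my $depth = ($File::Find::name =~ tr!/!!) - $pfxdepth;
		if ($depth > $maxdepth) {
			# do not descend any further below this directory
			$File::Find::prune = 1;
			return;
		}
		push @visited, substr($File::Find::name, length($root));
	},
}, $root);

print "visited: '$_'\n" for sort @visited;
```

With this layout, the directory at depth 3 is pruned, so nothing below it is visited at all.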


* Re: [PATCH] gitweb: speed up project listing on large work trees by limiting find depth
  2007-10-17  3:45 [PATCH] gitweb: speed up project listing on large work trees by limiting find depth Luke Lu
@ 2007-10-17  4:00 ` Shawn O. Pearce
  2007-10-17  4:19   ` Luke Lu
  0 siblings, 1 reply; 5+ messages in thread
From: Shawn O. Pearce @ 2007-10-17  4:00 UTC
  To: Luke Lu; +Cc: git, pasky

Luke Lu <git@vicaya.com> wrote:
> Resubmitting patch after passing gitweb regression tests.
...
> @@ -1519,6 +1524,11 @@ sub git_get_projects_list {
>  				return if (m!^[/.]$!);
>  				# only directories can be git repositories
>  				return unless (-d $_);
> +				# don't traverse too deep (Find is super slow on os x)
> +				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
> +					$File::Find::prune = 1;
> +					return;
> +				}

Thanks.  I'm squashing this into your patch.  I'm not sure what
the impact is of altering $File::Find::name in the middle of the
find algorithm and I'm not sure we want to figure that out later.
We found out the hard way today that altering a non-local'd $_
in the function is what was causing the breakage.

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 48e21da..9f47c3f 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1525,7 +1525,8 @@ sub git_get_projects_list {
 				# only directories can be git repositories
 				return unless (-d $_);
 				# don't traverse too deep (Find is super slow on os x)
-				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
+				local $_ = $File::Find::name;
+				if (tr!/!! - $pfxdepth > $project_maxdepth) {
 					$File::Find::prune = 1;
 					return;
 				}
-- 
Shawn.


* Re: [PATCH] gitweb: speed up project listing on large work trees by limiting find depth
  2007-10-17  4:00 ` Shawn O. Pearce
@ 2007-10-17  4:19   ` Luke Lu
  2007-10-17  4:27     ` Shawn O. Pearce
  0 siblings, 1 reply; 5+ messages in thread
From: Luke Lu @ 2007-10-17  4:19 UTC
  To: Shawn O. Pearce; +Cc: git, pasky


On Oct 16, 2007, at 9:00 PM, Shawn O. Pearce wrote:

> Luke Lu <git@vicaya.com> wrote:
>> Resubmitting patch after passing gitweb regression tests.
> ...
>> @@ -1519,6 +1524,11 @@ sub git_get_projects_list {
>>  				return if (m!^[/.]$!);
>>  				# only directories can be git repositories
>>  				return unless (-d $_);
>> +				# don't traverse too deep (Find is super slow on os x)
>> +				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
>> +					$File::Find::prune = 1;
>> +					return;
>> +				}
>
> Thanks.  I'm squashing this into your patch.  I'm not sure what
> the impact is of altering $File::Find::name in the middle of the
> find algorithm and I'm not sure we want to figure that out later.
> We found out the hard way today that altering a non-local'd $_
> in the function is what was causing the breakage.

This is generally good advice. But tr!/!! doesn't alter the string
at all (OK, it replicates it) unless you use the /d option. tr/stuff//
is an idiom for counting stuff. Check perldoc perlop for details. I don't
think the change is necessary.
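
A small sketch of the counting idiom described above, using hypothetical paths chosen only for illustration. With an empty replacement list and no /d modifier, tr returns the number of matching characters and leaves its operand as it was.

```perl
use strict;
use warnings;

# Hypothetical paths, not taken from any real installation
my $root = "/pub/git";
my $name = "/pub/git/a/b/c";

# Empty replacement list, no /d: count the slashes, change nothing
my $pfxdepth = ($root =~ tr!/!!);
my $depth    = ($name =~ tr!/!!);

print "slashes in root: $pfxdepth\n";
print "slashes in name: $depth\n";
print "relative depth:  ", $depth - $pfxdepth, "\n";
print "name afterwards: $name\n";
```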

>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 48e21da..9f47c3f 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -1525,7 +1525,8 @@ sub git_get_projects_list {
>  				# only directories can be git repositories
>  				return unless (-d $_);
>  				# don't traverse too deep (Find is super slow on os x)
> -				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
> +				local $_ = $File::Find::name;
> +				if (tr!/!! - $pfxdepth > $project_maxdepth) {
>  					$File::Find::prune = 1;
>  					return;
>  				}
> -- 
> Shawn.


* Re: [PATCH] gitweb: speed up project listing on large work trees by limiting find depth
  2007-10-17  4:19   ` Luke Lu
@ 2007-10-17  4:27     ` Shawn O. Pearce
       [not found]       ` <562B5254-2BE7-43DF-AB62-499458E360CC@vicaya.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Shawn O. Pearce @ 2007-10-17  4:27 UTC
  To: Luke Lu; +Cc: git, pasky

Luke Lu <git@vicaya.com> wrote:
> On Oct 16, 2007, at 9:00 PM, Shawn O. Pearce wrote:
> >
> >Thanks.  I'm squashing this into your patch.  I'm not sure what
> >the impact is of altering $File::Find::name in the middle of the
> >find algorithm and I'm not sure we want to figure that out later.
> >We found out the hard way today that altering a non-local'd $_
> >in the function is what was causing the breakage.
> 
> This is generally good advice. But tr!/!! doesn't alter the string
> at all (OK, it replicates it) unless you use the /d option. tr/stuff//
> is an idiom for counting stuff. Check perldoc perlop for details. I don't
> think the change is necessary.

Oh.  Yeah, I see what you mean now.  So the bug was really that you
were matching on $_, not $File::Find::name.  But according to perldoc
File::Find, $_ and $File::Find::name are the same when no_chdir => 1,
which your patch also sets.  So I'm really not seeing how the
updated version fixes the bug.
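
The relationship between $_ and $File::Find::name mentioned above can be checked with a short sketch. It assumes a throwaway directory tree created with File::Temp; the names %same and @seen are hypothetical. It runs the same traversal once with the default chdir behaviour and once with no_chdir => 1, and counts how often the two variables agree.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Throwaway tree: <root>/a/b (hypothetical layout)
my $root = tempdir(CLEANUP => 1);
make_path("$root/a/b");

my (%same, %total);
for my $no_chdir (0, 1) {
	my @seen;
	File::Find::find({
		no_chdir => $no_chdir,
		wanted   => sub {
			# record both views of the current entry
			push @seen, [$_, $File::Find::name];
		},
	}, $root);
	$total{$no_chdir} = @seen;
	$same{$no_chdir}  = grep { $_->[0] eq $_->[1] } @seen;
	printf "no_chdir=%d: %d of %d entries have \$_ eq \$File::Find::name\n",
		$no_chdir, $same{$no_chdir}, $total{$no_chdir};
}
```

Without no_chdir, $_ is the name relative to the directory being visited ("." for the top level), which is why a check such as m!^[/.]$! only recognises the top level in that mode.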
 
> >diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> >index 48e21da..9f47c3f 100755
> >--- a/gitweb/gitweb.perl
> >+++ b/gitweb/gitweb.perl
> >@@ -1525,7 +1525,8 @@ sub git_get_projects_list {
> > 				# only directories can be git repositories
> > 				return unless (-d $_);
> > 				# don't traverse too deep (Find is super slow on os x)
> >-				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
> >+				local $_ = $File::Find::name;
> >+				if (tr!/!! - $pfxdepth > $project_maxdepth) {
> > 					$File::Find::prune = 1;
> > 					return;
> > 				}

-- 
Shawn.


* Re: [PATCH] gitweb: speed up project listing on large work trees by limiting find depth
       [not found]       ` <562B5254-2BE7-43DF-AB62-499458E360CC@vicaya.com>
@ 2007-10-17  5:25         ` Shawn O. Pearce
  0 siblings, 0 replies; 5+ messages in thread
From: Shawn O. Pearce @ 2007-10-17  5:25 UTC
  To: Luke Lu; +Cc: git, pasky

Luke Lu <git@vicaya.com> wrote:
> OK, let me try again :) I was using no_chdir => 1 to shorten the tr,
> as well as to save a syscall. However, the code elsewhere (line 1524)
> expects $_ to be relative when it checks for the toplevel, so with
> no_chdir that check failed for the toplevel, which caused substr to
> operate on the toplevel path, which is $pfxlen long. Note that
> $pfxlen + 1 is past the end of the toplevel path, hence the errors,
> though the program still worked correctly: $subdir is undefined in
> this case, which bypasses the rest of the code, and that is logically
> correct. It would probably crash if it were written in C :)
> 
> So I got rid of no_chdir => 1 in the new patch and use
> $File::Find::name directly, as otherwise I'd have to come up with a
> messier regex for checking the toplevel at line 1524.

*light dawns*.  Thank you for the explanation.
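
The substr behaviour described in the quoted explanation can be reproduced in a few lines. This is a sketch under assumed values: the path /pub/git and the repository name project.git are hypothetical, and the warning handler exists only to capture the message for display.

```perl
use strict;
use warnings;

my $dir    = "/pub/git";               # hypothetical project root
my $pfxlen = length($dir);

# Capture warnings instead of letting them go to STDERR
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Top level: the offset is one past the end of the string
my $top = substr($dir, $pfxlen + 1);

# A subdirectory: the offset skips the root and the separating slash
my $sub = substr("$dir/project.git", $pfxlen + 1);

print "toplevel subdir: ", (defined $top ? "'$top'" : "undef"), "\n";
print "nested subdir:   '$sub'\n";
print "warning: $_" for @warnings;
```

The out-of-range call returns undef and emits a "substr outside of string" warning, which is consistent with the errors mentioned above while the listing itself still worked.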

-- 
Shawn.

