git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH/RFC] gitweb: Restructure projects list generation
@ 2011-02-26 22:06 Jakub Narebski
  2011-02-27 10:08 ` Jonathan Nieder
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Narebski @ 2011-02-26 22:06 UTC (permalink / raw)
  To: git; +Cc: Jakub Narebski, admin, John Hawley

Extract filtering out forks (which is done if 'forks' feature is
enabled) into filter_forks_from_projects_list subroutine, and
searching projects (via projects search form, or via content tags)
into search_projects_list subroutine.

Both are now run _before_ displaying projects, and not while printing;
this allow to know upfront if there were any found projects.  Gitweb
now can and do print 'No such projects found' if user searches for
phrase which does not correspond to any project (any repository).
This also would allow splitting projects list into pages, if we so
desire.

Filtering out forks and marking repository (project) as having forks
is now consolidated into one subroutine (special case of handling
forks in git_get_projects_list only for $projects_list being file is
now removed).  Forks handling is also cleaned up and simplified, which
might mean some changes in output.  $pr->{'forks'} now contains
un-filled list of forks; we can now also detect situation where the
way for having forks is prepared, but there are no forks yet.

Sorting projects got also refactored in a very straight way into
sort_projects_list subroutine.

The interaction between forks, content tags and searching is now made
more explicit: searching whether by tag, or via search form turns off
fork filtering (gitweb searches also forks, and will show all
results).  If 'ctags' feature is disabled, then searching by tag is
too.

The t9500 test now includes some basic test for 'forks' and 'ctags'
features.


Generating list of projects by scanning given directory is now also a
bit simplified wrt. handling filtering; it is byproduct of extracting
forks filtering to separate subroutine.

While at it we now detect that there are no projects and respond with
"404 No projects found" also for 'project_index' and 'opml' actions.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
It is marked _currently_ as a RFC, because while it *looks* like it
finds forks correctly (at least when $projects_list is directory), the
algorithm for detecting forks is changed... though IMVHO the code is
cleaner.

 gitweb/gitweb.perl                     |  219 +++++++++++++++++++++-----------
 t/t9500-gitweb-standalone-no-errors.sh |   49 +++++++
 2 files changed, 191 insertions(+), 77 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1b9369d..bc60158 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2643,21 +2643,23 @@ sub git_get_project_url_list {
 }
 
 sub git_get_projects_list {
-	my ($filter) = @_;
+	my $filter = shift || '';
 	my @list;
 
-	$filter ||= '';
 	$filter =~ s/\.git$//;
 
-	my $check_forks = gitweb_check_feature('forks');
-
 	if (-d $projects_list) {
 		# search in directory
-		my $dir = $projects_list . ($filter ? "/$filter" : '');
+		my $dir = $projects_list;
 		# remove the trailing "/"
 		$dir =~ s!/+$!!;
-		my $pfxlen = length("$dir");
-		my $pfxdepth = ($dir =~ tr!/!!);
+		my $pfxlen = length("$projects_list");
+		my $pfxdepth = ($projects_list =~ tr!/!!);
+		# when filtering, search only given subdirectory
+		if ($filter) {
+			$dir .= "/$filter";
+			$dir =~ s!/+$!!;
+		}
 
 		File::Find::find({
 			follow_fast => 1, # follow symbolic links
@@ -2672,14 +2674,14 @@ sub git_get_projects_list {
 				# only directories can be git repositories
 				return unless (-d $_);
 				# don't traverse too deep (Find is super slow on os x)
+				# $project_maxdepth excludes depth of $projectroot
 				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
 					$File::Find::prune = 1;
 					return;
 				}
 
-				my $subdir = substr($File::Find::name, $pfxlen + 1);
+				my $path = substr($File::Find::name, $pfxlen + 1);
 				# we check related file in $projectroot
-				my $path = ($filter ? "$filter/" : '') . $subdir;
 				if (check_export_ok("$projectroot/$path")) {
 					push @list, { path => $path };
 					$File::Find::prune = 1;
@@ -2692,7 +2694,6 @@ sub git_get_projects_list {
 		# 'git%2Fgit.git Linus+Torvalds'
 		# 'libs%2Fklibc%2Fklibc.git H.+Peter+Anvin'
 		# 'linux%2Fhotplug%2Fudev.git Greg+Kroah-Hartman'
-		my %paths;
 		open my $fd, '<', $projects_list or return;
 	PROJECT:
 		while (my $line = <$fd>) {
@@ -2703,32 +2704,9 @@ sub git_get_projects_list {
 			if (!defined $path) {
 				next;
 			}
-			if ($filter ne '') {
-				# looking for forks;
-				my $pfx = substr($path, 0, length($filter));
-				if ($pfx ne $filter) {
-					next PROJECT;
-				}
-				my $sfx = substr($path, length($filter));
-				if ($sfx !~ /^\/.*\.git$/) {
-					next PROJECT;
-				}
-			} elsif ($check_forks) {
-			PATH:
-				foreach my $filter (keys %paths) {
-					# looking for forks;
-					my $pfx = substr($path, 0, length($filter));
-					if ($pfx ne $filter) {
-						next PATH;
-					}
-					my $sfx = substr($path, length($filter));
-					if ($sfx !~ /^\/.*\.git$/) {
-						next PATH;
-					}
-					# is a fork, don't include it in
-					# the list
-					next PROJECT;
-				}
+			# if $filter is rpovided, check if $path begins with $filter
+			if ($filter && $path !~ m!^\Q$filter\E/!) {
+				next;
 			}
 			if (check_export_ok("$projectroot/$path")) {
 				my $pr = {
@@ -2736,8 +2714,6 @@ sub git_get_projects_list {
 					owner => to_utf8($owner),
 				};
 				push @list, $pr;
-				(my $forks_path = $path) =~ s/\.git$//;
-				$paths{$forks_path}++;
 			}
 		}
 		close $fd;
@@ -2745,6 +2721,67 @@ sub git_get_projects_list {
 	return @list;
 }
 
+# this assumes that forks follow immediately forked projects:
+# [ ..., 'repo.git', 'repo/fork.git', ...]; if not, set second
+# parameter to true value to sort projects by path first.
+sub filter_forks_from_projects_list {
+	my ($projects, $need_sorting) = @_;
+
+	@$projects = sort { $a->{'path'} cmp $b->{'path'} } @$projects
+		if ($need_sorting);
+
+	my (@filtered, $last_pr);
+ PROJECT:
+	foreach my $pr (@$projects) {
+		if ($last_pr &&
+		    $pr->{'path'} =~ m!^\Q$last_pr\E/!) {
+			# this is fork, add it to 'forks' and skip it
+			push @{$filtered[-1]->{'forks'}}, $pr;
+			next PROJECT;
+		}
+		($last_pr = $pr->{'path'}) =~ s/\.git$//;
+		if ($last_pr !~ m!/*$! && -d "$projectroot/$last_pr") {
+			# it was not foo/.git, and there can be forks
+			$pr->{'forks'} = [];
+		}
+		push @filtered, $pr;
+	}
+
+	return @filtered;
+}
+
+# note: fill_project_list_info must be run first,
+# for 'descr_long' and 'ctags' to be filled
+sub search_projects_list {
+	my ($projlist, %opts) = @_;
+	my $tagfilter  = $opts{'tagfilter'};
+	my $searchtext = $opts{'searchtext'};
+
+	return @$projlist
+		unless ($tagfilter || $searchtext);
+
+	my @projects;
+ PROJECT:
+	foreach my $pr (@$projlist) {
+
+		if ($tagfilter) {
+			next unless ref($pr->{'ctags'}) eq 'HASH';
+			next unless
+				grep { lc($_) eq lc($tagfilter) } keys %{$pr->{'ctags'}};
+		}
+
+		if ($searchtext) {
+			next unless
+				$pr->{'path'} =~ /$searchtext/ ||
+				$pr->{'descr_long'} =~ /$searchtext/;
+		}
+
+		push @projects, $pr;
+	}
+
+	return @projects;
+}
+
 our $gitweb_project_owner = undef;
 sub git_get_project_list_from_file {
 
@@ -4727,7 +4764,7 @@ sub git_patchset_body {
 # project in the list, removing invalid projects from returned list
 # NOTE: modifies $projlist, but does not remove entries from it
 sub fill_project_list_info {
-	my ($projlist, $check_forks) = @_;
+	my $projlist = shift;
 	my @projects;
 
 	my $show_ctags = gitweb_check_feature('ctags');
@@ -4747,23 +4784,36 @@ sub fill_project_list_info {
 		if (!defined $pr->{'owner'}) {
 			$pr->{'owner'} = git_get_project_owner("$pr->{'path'}") || "";
 		}
-		if ($check_forks) {
-			my $pname = $pr->{'path'};
-			if (($pname =~ s/\.git$//) &&
-			    ($pname !~ /\/$/) &&
-			    (-d "$projectroot/$pname")) {
-				$pr->{'forks'} = "-d $projectroot/$pname";
-			} else {
-				$pr->{'forks'} = 0;
-			}
+		if ($show_ctags) {
+			$pr->{'ctags'} = git_get_project_ctags($pr->{'path'});
 		}
-		$show_ctags and $pr->{'ctags'} = git_get_project_ctags($pr->{'path'});
 		push @projects, $pr;
 	}
 
 	return @projects;
 }
 
+sub sort_projects_list {
+	my ($projlist, $order) = @_;
+	my @projects;
+
+	my %order_info = (
+		project => { key => 'path', type => 'str' },
+		descr => { key => 'descr_long', type => 'str' },
+		owner => { key => 'owner', type => 'str' },
+		age => { key => 'age', type => 'num' }
+	);
+	my $oi = $order_info{$order};
+	return @$projlist unless defined $oi;
+	if ($oi->{'type'} eq 'str') {
+		@projects = sort {$a->{$oi->{'key'}} cmp $b->{$oi->{'key'}}} @$projlist;
+	} else {
+		@projects = sort {$a->{$oi->{'key'}} <=> $b->{$oi->{'key'}}} @$projlist;
+	}
+
+	return @projects;
+}
+
 # print 'sort by' <th> element, generating 'sort by $name' replay link
 # if that order is not selected
 sub print_sort_th {
@@ -4790,28 +4840,39 @@ sub format_sort_th {
 sub git_project_list_body {
 	# actually uses global variable $project
 	my ($projlist, $order, $from, $to, $extra, $no_header) = @_;
+	my @projects = @$projlist;
 
 	my $check_forks = gitweb_check_feature('forks');
-	my @projects = fill_project_list_info($projlist, $check_forks);
+	my $show_ctags  = gitweb_check_feature('ctags');
+	my $tagfilter = $show_ctags ? $cgi->param('by_tag') : undef;
+	$check_forks = undef
+		if ($tagfilter || $searchtext);
+
+	# filtering out forks before filling info allows to do less work
+	@projects = filter_forks_from_projects_list(\@projects)
+		if ($check_forks);
+	@projects = fill_project_list_info(\@projects);
+	# searching projects require filling to be run before it
+	@projects = search_projects_list(\@projects,
+	                                 'searchtext' => $searchtext,
+	                                 'tagfilter'  => $tagfilter)
+		if ($tagfilter || $searchtext);
 
 	$order ||= $default_projects_order;
 	$from = 0 unless defined $from;
 	$to = $#projects if (!defined $to || $#projects < $to);
 
-	my %order_info = (
-		project => { key => 'path', type => 'str' },
-		descr => { key => 'descr_long', type => 'str' },
-		owner => { key => 'owner', type => 'str' },
-		age => { key => 'age', type => 'num' }
-	);
-	my $oi = $order_info{$order};
-	if ($oi->{'type'} eq 'str') {
-		@projects = sort {$a->{$oi->{'key'}} cmp $b->{$oi->{'key'}}} @projects;
-	} else {
-		@projects = sort {$a->{$oi->{'key'}} <=> $b->{$oi->{'key'}}} @projects;
+	# short circuit
+	if ($from > $to) {
+		print "<center>\n".
+		      "<b>No such projects found</b><br />\n".
+		      "Click ".$cgi->a({-href=>href(project=>undef)},"here")." to view all projects<br />\n".
+		      "</center>\n<br />\n";
+		return;
 	}
 
-	my $show_ctags = gitweb_check_feature('ctags');
+	@projects = sort_projects_list(\@projects, $order);
+
 	if ($show_ctags) {
 		my %ctags;
 		foreach my $p (@projects) {
@@ -4837,32 +4898,26 @@ sub git_project_list_body {
 		      "</tr>\n";
 	}
 	my $alternate = 1;
-	my $tagfilter = $cgi->param('by_tag');
 	for (my $i = $from; $i <= $to; $i++) {
 		my $pr = $projects[$i];
 
-		next if $tagfilter and $show_ctags and not grep { lc $_ eq lc $tagfilter } keys %{$pr->{'ctags'}};
-		next if $searchtext and not $pr->{'path'} =~ /$searchtext/
-			and not $pr->{'descr_long'} =~ /$searchtext/;
-		# Weed out forks or non-matching entries of search
-		if ($check_forks) {
-			my $forkbase = $project; $forkbase ||= ''; $forkbase =~ s#\.git$#/#;
-			$forkbase="^$forkbase" if $forkbase;
-			next if not $searchtext and not $tagfilter and $show_ctags
-				and $pr->{'path'} =~ m#$forkbase.*/.*#; # regexp-safe
-		}
-
 		if ($alternate) {
 			print "<tr class=\"dark\">\n";
 		} else {
 			print "<tr class=\"light\">\n";
 		}
 		$alternate ^= 1;
+
 		if ($check_forks) {
 			print "<td>";
 			if ($pr->{'forks'}) {
-				print "<!-- $pr->{'forks'} -->\n";
-				print $cgi->a({-href => href(project=>$pr->{'path'}, action=>"forks")}, "+");
+				my $nforks = scalar @{$pr->{'forks'}};
+				if ($nforks > 0) {
+					print $cgi->a({-href => href(project=>$pr->{'path'}, action=>"forks"),
+					               -title => "$nforks forks"}, "+");
+				} else {
+					print $cgi->span({-title => "$nforks forks"}, "+");
+				}
 			}
 			print "</td>\n";
 		}
@@ -5343,7 +5398,10 @@ sub git_forks {
 }
 
 sub git_project_index {
-	my @projects = git_get_projects_list($project);
+	my @projects = git_get_projects_list();
+	if (!@projects) {
+		die_error(404, "No projects found");
+	}
 
 	print $cgi->header(
 		-type => 'text/plain',
@@ -5385,7 +5443,11 @@ sub git_summary {
 	my $check_forks = gitweb_check_feature('forks');
 
 	if ($check_forks) {
+		# find forks of a project
 		@forklist = git_get_projects_list($project);
+		# filter out forks of forks
+		@forklist = filter_forks_from_projects_list(\@forklist)
+			if (@forklist);
 	}
 
 	git_header_html();
@@ -7305,6 +7367,9 @@ sub git_atom {
 
 sub git_opml {
 	my @list = git_get_projects_list();
+	if (!@list) {
+		die_error(404, "No projects found");
+	}
 
 	print $cgi->header(
 		-type => 'text/xml',
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index 35c151d..0dc573d 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -591,4 +591,53 @@ test_expect_success HIGHLIGHT \
 	 git commit -m "Add test.sh" &&
 	 gitweb_run "p=.git;a=blob;f=test.sh"'
 
+# ----------------------------------------------------------------------
+# forks of projects
+
+cat >>gitweb_config.perl <<-\EOF &&
+$feature{'forks'}{'default'} = [1];
+EOF
+
+test_expect_success \
+	'forks: prepare' \
+	'git init --bare foo.git &&
+	 git --git-dir=foo.git --work-tree=. add file &&
+	 git --git-dir=foo.git --work-tree=. commit -m "Initial commit" &&
+	 echo "foo" > foo.git/description &&
+	 mkdir -p foo &&
+	 (cd foo &&
+	  git clone --shared --bare ../foo.git foo-forked.git &&
+	  echo "fork of foo" > foo-forked.git/description)'
+
+test_expect_success \
+	'forks: projects list' \
+	'gitweb_run'
+
+test_expect_success \
+	'forks: forks action' \
+	'gitweb_run "p=foo.git;a=forks"'
+
+# ----------------------------------------------------------------------
+# content tags (tag cloud)
+
+cat >>gitweb_config.perl <<-\EOF &&
+# we don't test _setting_ content tags, so any true value is good
+$feature{'ctags'}{'default'} = ['ctags_script.cgi'];
+EOF
+
+test_expect_success \
+	'ctags: tag cloud in projects list' \
+	'mkdir .git/ctags &&
+	 echo "2" > .git/ctags/foo &&
+	 echo "1" > .git/ctags/bar &&
+	gitweb_run'
+
+test_expect_success \
+	'ctags: search projects by existing tag' \
+	'gitweb_run "by_tag=foo"'
+
+test_expect_success \
+	'ctags: search projects by non existent tag' \
+	'gitweb_run "by_tag=non-existent"'
+
 test_done

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH/RFC] gitweb: Restructure projects list generation
  2011-02-26 22:06 [PATCH/RFC] gitweb: Restructure projects list generation Jakub Narebski
@ 2011-02-27 10:08 ` Jonathan Nieder
  2011-02-27 15:03   ` Jakub Narebski
  0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Nieder @ 2011-02-27 10:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, admin, John Hawley

Hi,

Jakub Narebski wrote:

> Extract filtering out forks (which is done if 'forks' feature is
> enabled) into filter_forks_from_projects_list subroutine

Context (to remind myself): the git_get_projects_list function decides
which repositories to show on the homepage.  git_project_list_body
takes care of actually showing them in a table; it displays a "+" sign
in front of projects with forks and generally the project list it is
passed omits the forks themselves.

git_get_projects_list needs to be reasonably fast, since it is used
to validate the "project" parameter in every request.

[...]
> Both are now run _before_ displaying projects, and not while printing;
> this allow to know upfront if there were any found projects.  Gitweb
> now can and do print 'No such projects found' if user searches for
> phrase which does not correspond to any project (any repository).

Aren't forks filtered out by git-get-projects-list already?

[...]
> Filtering out forks and marking repository (project) as having forks
> is now consolidated into one subroutine (special case of handling
> forks in git_get_projects_list only for $projects_list being file is
> now removed).

Ah, no, they aren't.  That's a good change for consistency already.

> Forks handling is also cleaned up and simplified, which
> might mean some changes in output.

(Not having read the code yet) this part is not obvious to me.  Maybe
an example could help.

[...]
> The interaction between forks, content tags and searching is now made
> more explicit: searching whether by tag, or via search form turns off
> fork filtering (gitweb searches also forks, and will show all
> results).

Hm, or maybe this is it.

> IMVHO the code is
> cleaner.

So some general code tidying is intermixed.

> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -2646,18 +2646,20 @@ sub git_get_projects_list {
[...]
> -		my $dir = $projects_list . ($filter ? "/$filter" : '');
> +		my $dir = $projects_list;
>  		# remove the trailing "/"
>  		$dir =~ s!/+$!!;
> -		my $pfxlen = length("$dir");
> -		my $pfxdepth = ($dir =~ tr!/!!);
> +		my $pfxlen = length("$projects_list");
> +		my $pfxdepth = ($projects_list =~ tr!/!!);
[...]

$dir is the same as before except that trailing slashes are removed,
and meanwhile the stripped names of directories returned from find()
will include $filter.  Should be almost a no-op (i.e., a nice
simplification).

> @@ -2672,14 +2674,14 @@ sub git_get_projects_list {
>  				# only directories can be git repositories
>  				return unless (-d $_);
>  				# don't traverse too deep (Find is super slow on os x)
> +				# $project_maxdepth excludes depth of $projectroot
>  				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {

A bugfix.  $project_maxdepth is advertised to be relative to the
projectroot but it was relative to $projects_list/$filter.

[...]
> @@ -2703,32 +2704,9 @@ sub git_get_projects_list {
>  			if (!defined $path) {
>  				next;
>  			}
> -			if ($filter ne '') {
> -				# looking for forks;
> -				my $pfx = substr($path, 0, length($filter));
> -				if ($pfx ne $filter) {
> -					next PROJECT;
> -				}
> -				my $sfx = substr($path, length($filter));
> -				if ($sfx !~ /^\/.*\.git$/) {
> -					next PROJECT;
> -				}
[...]
> +			# if $filter is rpovided, check if $path begins with $filter

s/rpovided/provided/ :)

> +			if ($filter && $path !~ m!^\Q$filter\E/!) {
> +				next;

Why did the code check for \Q.git\E$ before?  If I'm not missing
something, I'm happy to see the condition go.

What if $filter is 0?  Silly question, I guess.

[...]
> @@ -2745,6 +2721,67 @@ sub git_get_projects_list {
>  	return @list;
>  }
>  
> +# this assumes that forks follow immediately forked projects:
> +# [ ..., 'repo.git', 'repo/fork.git', ...]; if not, set second
> +# parameter to true value to sort projects by path first.
> +sub filter_forks_from_projects_list {
> +	my ($projects, $need_sorting) = @_;
> +
> +	@$projects = sort { $a->{'path'} cmp $b->{'path'} } @$projects
> +		if ($need_sorting);

What happens in this scenario?

	git.git
	git.related.project.git
	git/platforms.git

[...]
> @@ -4747,23 +4784,36 @@ sub fill_project_list_info {
>  		if (!defined $pr->{'owner'}) {
>  			$pr->{'owner'} = git_get_project_owner("$pr->{'path'}") || "";
>  		}
> -		if ($check_forks) {
> -			my $pname = $pr->{'path'};
> -			if (($pname =~ s/\.git$//) &&
> -			    ($pname !~ /\/$/) &&
> -			    (-d "$projectroot/$pname")) {
> -				$pr->{'forks'} = "-d $projectroot/$pname";

So currently we treat project.git as having forks when the
corresponding project/ directory exists, even when the latter is
empty.  The proposed new rule is that project.git has forks if and
only if there is at least one project/foo.git repository.  Sensible.

Thanks for a pleasant read.

Hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH/RFC] gitweb: Restructure projects list generation
  2011-02-27 10:08 ` Jonathan Nieder
@ 2011-02-27 15:03   ` Jakub Narebski
  2011-03-01  1:01     ` Jakub Narebski
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Narebski @ 2011-02-27 15:03 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, admin, John Hawley

On Sun, 27 Feb 2011, Jonathan Nieder wrote:
> Jakub Narebski wrote:
> 
> > Extract filtering out forks (which is done if 'forks' feature is
> > enabled) into filter_forks_from_projects_list subroutine
> 
> Context (to remind myself): the git_get_projects_list function decides
> which repositories to show on the homepage.

Actually it git_get_projects_list generates list of all exported
repositories, whether they are visible in projects list page (homepage)
or not... at least after this change.

Well... list of directories that look like repositories; it is actually
checked further with git_get_last_activity (running actual git command)
in fill_project_list_info if they are really git repositories.

>                                               git_project_list_body 
> takes care of actually showing them in a table; it displays a "+" sign
> in front of projects with forks and generally the project list it is
> passed omits the forks themselves.

Well, forks are hidden (listed on 'forks' pages linked to by those "+"
signs) only if 'forks' feature is enabled.

> 
> git_get_projects_list needs to be reasonably fast, since it is used
> to validate the "project" parameter in every request.

Only with $strict_export set to true (it defaults to false).

> [...]
> > Both are now run _before_ displaying projects, and not while printing;
> > this allow to know upfront if there were any found projects.  Gitweb
> > now can and do print 'No such projects found' if user searches for
> > phrase which does not correspond to any project (any repository).
> 
> Aren't forks filtered out by git-get-projects-list already?

After this commit they are not, and they should not be.  If forks were
filtered out, then with $strict_export you wouldn't be able to access
pages of repositories which are forks.

> [...]
> > Filtering out forks and marking repository (project) as having forks
> > is now consolidated into one subroutine (special case of handling
> > forks in git_get_projects_list only for $projects_list being file is
> > now removed).
> 
> Ah, no, they aren't.  That's a good change for consistency already.
> 
> > Forks handling is also cleaned up and simplified, which
> > might mean some changes in output.
> 
> (Not having read the code yet) this part is not obvious to me.  Maybe
> an example could help.

Well, the commit message is quite large anyway, so I didn't describe in
detailwhat one can read in patch itself...

But see also comment below about algorithm used.

> [...]
> > The interaction between forks, content tags and searching is now made
> > more explicit: searching whether by tag, or via search form turns off
> > fork filtering (gitweb searches also forks, and will show all
> > results).
> 
> Hm, or maybe this is it.
> 
> > IMVHO the code is
> > cleaner.
> 
> So some general code tidying is intermixed.
> 
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -2646,18 +2646,20 @@ sub git_get_projects_list {
> [...]
> > -		my $dir = $projects_list . ($filter ? "/$filter" : '');
> > +		my $dir = $projects_list;
> >  		# remove the trailing "/"
> >  		$dir =~ s!/+$!!;
> > -		my $pfxlen = length("$dir");
> > -		my $pfxdepth = ($dir =~ tr!/!!);
> > +		my $pfxlen = length("$projects_list");
> > +		my $pfxdepth = ($projects_list =~ tr!/!!);
> [...]
> 
> $dir is the same as before except that trailing slashes are removed,
> and meanwhile the stripped names of directories returned from find()
> will include $filter.  Should be almost a no-op (i.e., a nice
> simplification).

Thanks.

This code could be further simplified if not for the fact that at least
theoretically with $projects_list being directory to scan, $projectroot
can be different from $projects_list.

I'll try to resend fixed version soon.

> > @@ -2672,14 +2674,14 @@ sub git_get_projects_list {
> >  				# only directories can be git repositories
> >  				return unless (-d $_);
> >  				# don't traverse too deep (Find is super slow on os x)
> > +				# $project_maxdepth excludes depth of $projectroot
> >  				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
> 
> A bugfix.  $project_maxdepth is advertised to be relative to the
> projectroot but it was relative to $projects_list/$filter.

I'm not sure if it was advertised in such detail, but I think consistency
is good to have: we don't go beyond some depth regardless whether $filter
is set or not, i.e. whether we are getting list of projects for homepage
('projects_list' action) or for some 'forks' page.

> [...]
> > @@ -2703,32 +2704,9 @@ sub git_get_projects_list {
> >  			if (!defined $path) {
> >  				next;
> >  			}
> > -			if ($filter ne '') {
> > -				# looking for forks;
> > -				my $pfx = substr($path, 0, length($filter));
> > -				if ($pfx ne $filter) {
> > -					next PROJECT;
> > -				}
> > -				my $sfx = substr($path, length($filter));
> > -				if ($sfx !~ /^\/.*\.git$/) {
> > -					next PROJECT;
> > -				}
> [...]
> > +			# if $filter is rpovided, check if $path begins with $filter
> 
> s/rpovided/provided/ :)

I should have run spellchecker...

> > +			if ($filter && $path !~ m!^\Q$filter\E/!) {
> > +				next;
> 
> Why did the code check for \Q.git\E$ before?  If I'm not missing
> something, I'm happy to see the condition go.

This was about looking for forks, which is now moved out of this
subroutine into filter_forks_from_projects_list subroutine.

> What if $filter is 0?  Silly question, I guess.

Not a problem.  \Q and \E are for quoting metacharacters, i.e. treating
$filter as string (as prefix to be matched) rather than as regexp.

> [...]
> > @@ -2745,6 +2721,67 @@ sub git_get_projects_list {
> >  	return @list;
> >  }
> >  
> > +# this assumes that forks follow immediately forked projects:
> > +# [ ..., 'repo.git', 'repo/fork.git', ...]; if not, set second
> > +# parameter to true value to sort projects by path first.
> > +sub filter_forks_from_projects_list {
> > +	my ($projects, $need_sorting) = @_;
> > +
> > +	@$projects = sort { $a->{'path'} cmp $b->{'path'} } @$projects
> > +		if ($need_sorting);
> 
> What happens in this scenario?
> 
> 	git.git
> 	git.related.project.git
> 	git/platforms.git

Thanks for catching this.

It looks like I have oversimplified the algorithm by requiring that 
forked project directly precedes any of its forks (if they exists).
I'd have to redo this part of patch.

> [...]
> > @@ -4747,23 +4784,36 @@ sub fill_project_list_info {
> >  		if (!defined $pr->{'owner'}) {
> >  			$pr->{'owner'} = git_get_project_owner("$pr->{'path'}") || "";
> >  		}
> > -		if ($check_forks) {
> > -			my $pname = $pr->{'path'};
> > -			if (($pname =~ s/\.git$//) &&
> > -			    ($pname !~ /\/$/) &&
> > -			    (-d "$projectroot/$pname")) {
> > -				$pr->{'forks'} = "-d $projectroot/$pname";
> 
> So currently we treat project.git as having forks when the
> corresponding project/ directory exists, even when the latter is
> empty.  The proposed new rule is that project.git has forks if and
> only if there is at least one project/foo.git repository.  Sensible.

Actually it is marked as having 0 forks, which visually results in "+"
sign which is not a hyperlink to 'forks' view.

Though I am not sure if it is worth it: without mentioned feature code
would be slightly simpler.


Note also that previous code put _string_ into 'forks' field, with
contents "-d <path to project>", instead of 1 as one might expect.

Earlier code contained a few leftover debugging things; I am not sure
if I have removed everything, at least wrt. projects list.

> Hope that helps,

Thanks again for your comments.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH/RFC] gitweb: Restructure projects list generation
  2011-02-27 15:03   ` Jakub Narebski
@ 2011-03-01  1:01     ` Jakub Narebski
  2011-03-02 16:11       ` [PATCHv2/RFC] " Jakub Narebski
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Narebski @ 2011-03-01  1:01 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, admin, John Hawley

On Sun, 27 Feb 2011, Jakub Narebski wrote:
> On Sun, 27 Feb 2011, Jonathan Nieder wrote:
> > Jakub Narebski wrote:

> > > +# this assumes that forks follow immediately forked projects:
> > > +# [ ..., 'repo.git', 'repo/fork.git', ...]; if not, set second
> > > +# parameter to true value to sort projects by path first.
> > > +sub filter_forks_from_projects_list {
> > > +	my ($projects, $need_sorting) = @_;
> > > +
> > > +	@$projects = sort { $a->{'path'} cmp $b->{'path'} } @$projects
> > > +		if ($need_sorting);
> > 
> > What happens in this scenario?
> > 
> > 	git.git
> > 	git.related.project.git
> > 	git/platforms.git
> 
> Thanks for catching this.
> 
> It looks like I have oversimplified the algorithm by requiring that 
> forked project directly precedes any of its forks (if they exists).
> I'd have to redo this part of patch.
> 
> > [...]
> > > @@ -4747,23 +4784,36 @@ sub fill_project_list_info {
> > >  		if (!defined $pr->{'owner'}) {
> > >  			$pr->{'owner'} = git_get_project_owner("$pr->{'path'}") || "";
> > >  		}
> > > -		if ($check_forks) {
> > > -			my $pname = $pr->{'path'};
> > > -			if (($pname =~ s/\.git$//) &&
> > > -			    ($pname !~ /\/$/) &&
> > > -			    (-d "$projectroot/$pname")) {
> > > -				$pr->{'forks'} = "-d $projectroot/$pname";

Let's examine what's we had, what this commit does, and what we can have.

Before this commit there were two implementations of detecting and 
filtering-out forks: one based only on directory structure in
fill_project_list_info (it detected forks, but didn't filter them out),
and one in git_get_projects_list, which was based on pathnames and was
run only for $projects_list being [index] file (it filtered-out forks).

This commit replaced both by filter_forks_from_projects_list, which
mainly detected forks based on path, with directory based detecting
if there _can_ be forks... but this solution is based on overly strict
and not true assumption that forks always directly follow forked 
project, which doesn't need to be true e.g. for scanning $projects_list
directory.


Hmmm... it looks for me like filtering out forks worked only for 
$projects_list being file; the code in fill_project_list_info() detected
which projects were forked, but didn't detect which projects were forks
and should be filtered.  The code inside $check_forks conditional that
used to be in git_projects_list_body() was about filtering out for second
time projects that are not forks of current project, for 'forks' action.

Now naive implementation of detecting forks, and also implementation used
to be used in git_get_projects_list() for $projects_list being file is
O(n^2) in number of projects.  With a bit of simplification it can be
O(<projects>*<forked projects>) + O(<projects>)*stat (cost of "-d <dir>").
Withe git.kernel.org having around 1,000 projects, and repo.or.cz having
around 4,000 projects not counting forks, O(n^2) is out of question.

Of those 4,000 projects around 300 (or 7%) has forks, according to 
repo.or.cz projects list page.  But assuming that percentage of forked
repositories remains constant with growth of number of repositories,
this is still quadratic cost... but perhaps acceptable.


We can use more advanced algorithm to find prefixes (to find forks);
for example using trie[1][2] to find forks costs O(<projects>*<depth>).
For git.kernel.org which has quite deep hierarchy of projects number
of path components (depth) is no more than 6.

Synthetic benchmark with 2000 repositories, 5% of which are forked,
with 5 forks per forked repository shows that trie-based solution is
around 25 times faster than original solution from git_get_projects_list
with %paths.

The disadvantage of this solution is that even seriously stripped-down
trie construction and lookup is around 2 pages of code... or using non-core
Tree::Trie module from CPAN.

[1]: http://en.wikipedia.org/wiki/Trie
[2]: http://p3rl.org/Tree::Trie
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCHv2/RFC] gitweb: Restructure projects list generation
  2011-03-01  1:01     ` Jakub Narebski
@ 2011-03-02 16:11       ` Jakub Narebski
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Narebski @ 2011-03-02 16:11 UTC (permalink / raw)
  To: git; +Cc: Jonathan Nieder, admin, John Hawley

Extract filtering out forks (which is done if 'forks' feature is
enabled) into filter_forks_from_projects_list subroutine, and
searching projects (via projects search form, or via content tags)
into search_projects_list subroutine.

Both are now run _before_ displaying projects, and not while printing;
this allow to know upfront if there were any found projects.  Gitweb
now can and do print 'No such projects found' if user searches for
phrase which does not correspond to any project (any repository).
This also would allow splitting projects list into pages, if we so
desire.

Filtering out forks and marking repository (project) as having forks
is now consolidated into one subroutine (special case of handling
forks in git_get_projects_list only for $projects_list being file is
now removed).  Forks handling is also cleaned up and simplified.
$pr->{'forks'} now contains un-filled list of forks; we can now also
detect situation where the way for having forks is prepared, but there
are no forks yet.

Sorting projects got also refactored in a very straight way (just
moving code) into sort_projects_list subroutine.

The interaction between forks, content tags and searching is now made
more explicit: searching whether by tag, or via search form turns off
fork filtering (gitweb searches also forks, and will show all
results).  If 'ctags' feature is disabled, then searching by tag is
too.

The t9500 test now includes some basic test for 'forks' and 'ctags'
features; the t9502 includes test checking if gitweb correctly filters
out forks.


Generating list of projects by scanning given directory is now also a
bit simplified wrt. handling filtering; it is byproduct of extracting
filtering forks to separate subroutine.

While at it we now detect that there are no projects and respond with
"404 No projects found" also for 'project_index' and 'opml' actions.

Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
The main difference from previous version is that newly introduced
filter_forks_from_projects_list() no longer relies on the assumption
that forked project immediately precedes all of its forks... which is
not true when $porjects_list is a directory and we have e.g. such
situation:

  repo.git
  repo.related.git
  repo_other.git
  repo/fork.git

This issue was noticed by Jonathan Nieder.

filter_forks_from_projects_list() uses trie[1] (prefix tree) to find
which repositories are forks of other projects; directories are used
as "letters" in trie.  Algorithm to create trie and find prefix match
was created with some help of code of Tree::Tree[2] Perl module from
CPAN.

[1]: http://en.wikipedia.org/wiki/Trie
[2]: http://p3rl.org/Tree::Trie

This revision includes additionally tests in t9502 that correct
projects are filtered.


Note that before this commit filtering out forks worked only for
$projects_list being a file, it assumed that forks are always after
forked project (not very unreasonable), and used algorithm which is
O(<projects>^2).

Current code filters out projects only where it should, regardless
whether $projects_list is directory or a file, and trie-based
algorithm is O(<projects>) + O(<projects forked>*<max depth>).
Current code also consistently finds shortest prefix (important in
counting forks in case of forks of forks).


This is an RFC because I am not sure if I did tests in t9502 right
and efficient enough.

There is also question whether $pr->{'forks'} should be a list of
forks, or would having $pr->{'forks'} being number of forks be enough.

Refactoring sort_projects_list and fixing handling of $project_maxdepth
could be done in a separate patches.

 gitweb/gitweb.perl                        |  250 ++++++++++++++++++++---------
 t/t9500-gitweb-standalone-no-errors.sh    |   49 ++++++
 t/t9502-gitweb-standalone-parse-output.sh |   74 +++++++++
 3 files changed, 296 insertions(+), 77 deletions(-)

diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 1b9369d..996b647 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -2643,21 +2643,23 @@ sub git_get_project_url_list {
 }
 
 sub git_get_projects_list {
-	my ($filter) = @_;
+	my $filter = shift || '';
 	my @list;
 
-	$filter ||= '';
 	$filter =~ s/\.git$//;
 
-	my $check_forks = gitweb_check_feature('forks');
-
 	if (-d $projects_list) {
 		# search in directory
-		my $dir = $projects_list . ($filter ? "/$filter" : '');
+		my $dir = $projects_list;
 		# remove the trailing "/"
 		$dir =~ s!/+$!!;
-		my $pfxlen = length("$dir");
-		my $pfxdepth = ($dir =~ tr!/!!);
+		my $pfxlen = length("$projects_list");
+		my $pfxdepth = ($projects_list =~ tr!/!!);
+		# when filtering, search only given subdirectory
+		if ($filter) {
+			$dir .= "/$filter";
+			$dir =~ s!/+$!!;
+		}
 
 		File::Find::find({
 			follow_fast => 1, # follow symbolic links
@@ -2672,14 +2674,14 @@ sub git_get_projects_list {
 				# only directories can be git repositories
 				return unless (-d $_);
 				# don't traverse too deep (Find is super slow on os x)
+				# $project_maxdepth excludes depth of $projectroot
 				if (($File::Find::name =~ tr!/!!) - $pfxdepth > $project_maxdepth) {
 					$File::Find::prune = 1;
 					return;
 				}
 
-				my $subdir = substr($File::Find::name, $pfxlen + 1);
+				my $path = substr($File::Find::name, $pfxlen + 1);
 				# we check related file in $projectroot
-				my $path = ($filter ? "$filter/" : '') . $subdir;
 				if (check_export_ok("$projectroot/$path")) {
 					push @list, { path => $path };
 					$File::Find::prune = 1;
@@ -2692,7 +2694,6 @@ sub git_get_projects_list {
 		# 'git%2Fgit.git Linus+Torvalds'
 		# 'libs%2Fklibc%2Fklibc.git H.+Peter+Anvin'
 		# 'linux%2Fhotplug%2Fudev.git Greg+Kroah-Hartman'
-		my %paths;
 		open my $fd, '<', $projects_list or return;
 	PROJECT:
 		while (my $line = <$fd>) {
@@ -2703,32 +2704,9 @@ sub git_get_projects_list {
 			if (!defined $path) {
 				next;
 			}
-			if ($filter ne '') {
-				# looking for forks;
-				my $pfx = substr($path, 0, length($filter));
-				if ($pfx ne $filter) {
-					next PROJECT;
-				}
-				my $sfx = substr($path, length($filter));
-				if ($sfx !~ /^\/.*\.git$/) {
-					next PROJECT;
-				}
-			} elsif ($check_forks) {
-			PATH:
-				foreach my $filter (keys %paths) {
-					# looking for forks;
-					my $pfx = substr($path, 0, length($filter));
-					if ($pfx ne $filter) {
-						next PATH;
-					}
-					my $sfx = substr($path, length($filter));
-					if ($sfx !~ /^\/.*\.git$/) {
-						next PATH;
-					}
-					# is a fork, don't include it in
-					# the list
-					next PROJECT;
-				}
+			# if $filter is rpovided, check if $path begins with $filter
+			if ($filter && $path !~ m!^\Q$filter\E/!) {
+				next;
 			}
 			if (check_export_ok("$projectroot/$path")) {
 				my $pr = {
@@ -2736,8 +2714,6 @@ sub git_get_projects_list {
 					owner => to_utf8($owner),
 				};
 				push @list, $pr;
-				(my $forks_path = $path) =~ s/\.git$//;
-				$paths{$forks_path}++;
 			}
 		}
 		close $fd;
@@ -2745,6 +2721,98 @@ sub git_get_projects_list {
 	return @list;
 }
 
+# written with help of Tree::Trie module (Perl Artistic License, GPL compatibile)
+# as side effects it sets 'forks' field to list of forks for forked projects
+sub filter_forks_from_projects_list {
+	my $projects = shift;
+
+	my %trie; # prefix tree of directories (path components)
+	# generate trie out of those directories that might contain forks
+	foreach my $pr (@$projects) {
+		my $path = $pr->{'path'};
+		$path =~ s/\.git$//;      # forks of 'repo.git' are in 'repo/' directory
+		next if ($path =~ m!/$!); # skip non-bare repositories, e.g. 'repo/.git'
+		next unless ($path);      # skip '.git' repository: tests, git-instaweb
+		next unless (-d $path);   # containing directory exists
+		$pr->{'forks'} = [];      # there can be 0 or more forks of project
+
+		# add to trie
+		my @dirs = split('/', $path);
+		# walk the trie, until either runs out of components or out of trie
+		my $ref = \%trie;
+		while (scalar @dirs &&
+		       exists($ref->{$dirs[0]})) {
+			$ref = $ref->{shift @dirs};
+		}
+		# create rest of trie structure from rest of components
+		foreach my $dir (@dirs) {
+			$ref = $ref->{$dir} = {};
+		}
+		# create end marker, store $pr as a data
+		$ref->{''} = $pr if (!exists $ref->{''});
+	}
+
+	# filter out forks, by finding shortest prefix match for paths
+	my @filtered;
+ PROJECT:
+	foreach my $pr (@$projects) {
+		# trie lookup
+		my $ref = \%trie;
+	DIR:
+		foreach my $dir (split('/', $pr->{'path'})) {
+			if (exists $ref->{''}) {
+				# found [shortest] prefix, is a fork - skip it
+				push @{$ref->{''}{'forks'}}, $pr;
+				next PROJECT;
+			}
+			if (!exists $ref->{$dir}) {
+				# not in trie, cannot have prefix, not a fork
+				push @filtered, $pr;
+				next PROJECT;
+			}
+			# If the dir is there, we just walk one step down the trie.
+			$ref = $ref->{$dir};
+		}
+		# we ran out of trie
+		# (shouldn't happen: it's either no match, or end marker)
+		push @filtered, $pr;
+	}
+
+	return @filtered;
+}
+
+# note: fill_project_list_info must be run first,
+# for 'descr_long' and 'ctags' to be filled
+sub search_projects_list {
+	my ($projlist, %opts) = @_;
+	my $tagfilter  = $opts{'tagfilter'};
+	my $searchtext = $opts{'searchtext'};
+
+	return @$projlist
+		unless ($tagfilter || $searchtext);
+
+	my @projects;
+ PROJECT:
+	foreach my $pr (@$projlist) {
+
+		if ($tagfilter) {
+			next unless ref($pr->{'ctags'}) eq 'HASH';
+			next unless
+				grep { lc($_) eq lc($tagfilter) } keys %{$pr->{'ctags'}};
+		}
+
+		if ($searchtext) {
+			next unless
+				$pr->{'path'} =~ /$searchtext/ ||
+				$pr->{'descr_long'} =~ /$searchtext/;
+		}
+
+		push @projects, $pr;
+	}
+
+	return @projects;
+}
+
 our $gitweb_project_owner = undef;
 sub git_get_project_list_from_file {
 
@@ -4727,7 +4795,7 @@ sub git_patchset_body {
 # project in the list, removing invalid projects from returned list
 # NOTE: modifies $projlist, but does not remove entries from it
 sub fill_project_list_info {
-	my ($projlist, $check_forks) = @_;
+	my $projlist = shift;
 	my @projects;
 
 	my $show_ctags = gitweb_check_feature('ctags');
@@ -4747,23 +4815,36 @@ sub fill_project_list_info {
 		if (!defined $pr->{'owner'}) {
 			$pr->{'owner'} = git_get_project_owner("$pr->{'path'}") || "";
 		}
-		if ($check_forks) {
-			my $pname = $pr->{'path'};
-			if (($pname =~ s/\.git$//) &&
-			    ($pname !~ /\/$/) &&
-			    (-d "$projectroot/$pname")) {
-				$pr->{'forks'} = "-d $projectroot/$pname";
-			} else {
-				$pr->{'forks'} = 0;
-			}
+		if ($show_ctags) {
+			$pr->{'ctags'} = git_get_project_ctags($pr->{'path'});
 		}
-		$show_ctags and $pr->{'ctags'} = git_get_project_ctags($pr->{'path'});
 		push @projects, $pr;
 	}
 
 	return @projects;
 }
 
+sub sort_projects_list {
+	my ($projlist, $order) = @_;
+	my @projects;
+
+	my %order_info = (
+		project => { key => 'path', type => 'str' },
+		descr => { key => 'descr_long', type => 'str' },
+		owner => { key => 'owner', type => 'str' },
+		age => { key => 'age', type => 'num' }
+	);
+	my $oi = $order_info{$order};
+	return @$projlist unless defined $oi;
+	if ($oi->{'type'} eq 'str') {
+		@projects = sort {$a->{$oi->{'key'}} cmp $b->{$oi->{'key'}}} @$projlist;
+	} else {
+		@projects = sort {$a->{$oi->{'key'}} <=> $b->{$oi->{'key'}}} @$projlist;
+	}
+
+	return @projects;
+}
+
 # print 'sort by' <th> element, generating 'sort by $name' replay link
 # if that order is not selected
 sub print_sort_th {
@@ -4790,28 +4871,39 @@ sub format_sort_th {
 sub git_project_list_body {
 	# actually uses global variable $project
 	my ($projlist, $order, $from, $to, $extra, $no_header) = @_;
+	my @projects = @$projlist;
 
 	my $check_forks = gitweb_check_feature('forks');
-	my @projects = fill_project_list_info($projlist, $check_forks);
+	my $show_ctags  = gitweb_check_feature('ctags');
+	my $tagfilter = $show_ctags ? $cgi->param('by_tag') : undef;
+	$check_forks = undef
+		if ($tagfilter || $searchtext);
+
+	# filtering out forks before filling info allows to do less work
+	@projects = filter_forks_from_projects_list(\@projects)
+		if ($check_forks);
+	@projects = fill_project_list_info(\@projects);
+	# searching projects require filling to be run before it
+	@projects = search_projects_list(\@projects,
+	                                 'searchtext' => $searchtext,
+	                                 'tagfilter'  => $tagfilter)
+		if ($tagfilter || $searchtext);
 
 	$order ||= $default_projects_order;
 	$from = 0 unless defined $from;
 	$to = $#projects if (!defined $to || $#projects < $to);
 
-	my %order_info = (
-		project => { key => 'path', type => 'str' },
-		descr => { key => 'descr_long', type => 'str' },
-		owner => { key => 'owner', type => 'str' },
-		age => { key => 'age', type => 'num' }
-	);
-	my $oi = $order_info{$order};
-	if ($oi->{'type'} eq 'str') {
-		@projects = sort {$a->{$oi->{'key'}} cmp $b->{$oi->{'key'}}} @projects;
-	} else {
-		@projects = sort {$a->{$oi->{'key'}} <=> $b->{$oi->{'key'}}} @projects;
+	# short circuit
+	if ($from > $to) {
+		print "<center>\n".
+		      "<b>No such projects found</b><br />\n".
+		      "Click ".$cgi->a({-href=>href(project=>undef)},"here")." to view all projects<br />\n".
+		      "</center>\n<br />\n";
+		return;
 	}
 
-	my $show_ctags = gitweb_check_feature('ctags');
+	@projects = sort_projects_list(\@projects, $order);
+
 	if ($show_ctags) {
 		my %ctags;
 		foreach my $p (@projects) {
@@ -4837,32 +4929,26 @@ sub git_project_list_body {
 		      "</tr>\n";
 	}
 	my $alternate = 1;
-	my $tagfilter = $cgi->param('by_tag');
 	for (my $i = $from; $i <= $to; $i++) {
 		my $pr = $projects[$i];
 
-		next if $tagfilter and $show_ctags and not grep { lc $_ eq lc $tagfilter } keys %{$pr->{'ctags'}};
-		next if $searchtext and not $pr->{'path'} =~ /$searchtext/
-			and not $pr->{'descr_long'} =~ /$searchtext/;
-		# Weed out forks or non-matching entries of search
-		if ($check_forks) {
-			my $forkbase = $project; $forkbase ||= ''; $forkbase =~ s#\.git$#/#;
-			$forkbase="^$forkbase" if $forkbase;
-			next if not $searchtext and not $tagfilter and $show_ctags
-				and $pr->{'path'} =~ m#$forkbase.*/.*#; # regexp-safe
-		}
-
 		if ($alternate) {
 			print "<tr class=\"dark\">\n";
 		} else {
 			print "<tr class=\"light\">\n";
 		}
 		$alternate ^= 1;
+
 		if ($check_forks) {
 			print "<td>";
 			if ($pr->{'forks'}) {
-				print "<!-- $pr->{'forks'} -->\n";
-				print $cgi->a({-href => href(project=>$pr->{'path'}, action=>"forks")}, "+");
+				my $nforks = scalar @{$pr->{'forks'}};
+				if ($nforks > 0) {
+					print $cgi->a({-href => href(project=>$pr->{'path'}, action=>"forks"),
+					               -title => "$nforks forks"}, "+");
+				} else {
+					print $cgi->span({-title => "$nforks forks"}, "+");
+				}
 			}
 			print "</td>\n";
 		}
@@ -5343,7 +5429,10 @@ sub git_forks {
 }
 
 sub git_project_index {
-	my @projects = git_get_projects_list($project);
+	my @projects = git_get_projects_list();
+	if (!@projects) {
+		die_error(404, "No projects found");
+	}
 
 	print $cgi->header(
 		-type => 'text/plain',
@@ -5385,7 +5474,11 @@ sub git_summary {
 	my $check_forks = gitweb_check_feature('forks');
 
 	if ($check_forks) {
+		# find forks of a project
 		@forklist = git_get_projects_list($project);
+		# filter out forks of forks
+		@forklist = filter_forks_from_projects_list(\@forklist)
+			if (@forklist);
 	}
 
 	git_header_html();
@@ -7305,6 +7398,9 @@ sub git_atom {
 
 sub git_opml {
 	my @list = git_get_projects_list();
+	if (!@list) {
+		die_error(404, "No projects found");
+	}
 
 	print $cgi->header(
 		-type => 'text/xml',
diff --git a/t/t9500-gitweb-standalone-no-errors.sh b/t/t9500-gitweb-standalone-no-errors.sh
index afac5b5..71ef0ac 100755
--- a/t/t9500-gitweb-standalone-no-errors.sh
+++ b/t/t9500-gitweb-standalone-no-errors.sh
@@ -595,4 +595,53 @@ test_expect_success HIGHLIGHT \
 	 git commit -m "Add test.sh" &&
 	 gitweb_run "p=.git;a=blob;f=test.sh"'
 
+# ----------------------------------------------------------------------
+# forks of projects
+
+cat >>gitweb_config.perl <<\EOF &&
+$feature{'forks'}{'default'} = [1];
+EOF
+
+test_expect_success \
+	'forks: prepare' \
+	'git init --bare foo.git &&
+	 git --git-dir=foo.git --work-tree=. add file &&
+	 git --git-dir=foo.git --work-tree=. commit -m "Initial commit" &&
+	 echo "foo" > foo.git/description &&
+	 mkdir -p foo &&
+	 (cd foo &&
+	  git clone --shared --bare ../foo.git foo-forked.git &&
+	  echo "fork of foo" > foo-forked.git/description)'
+
+test_expect_success \
+	'forks: projects list' \
+	'gitweb_run'
+
+test_expect_success \
+	'forks: forks action' \
+	'gitweb_run "p=foo.git;a=forks"'
+
+# ----------------------------------------------------------------------
+# content tags (tag cloud)
+
+cat >>gitweb_config.perl <<-\EOF &&
+# we don't test _setting_ content tags, so any true value is good
+$feature{'ctags'}{'default'} = ['ctags_script.cgi'];
+EOF
+
+test_expect_success \
+	'ctags: tag cloud in projects list' \
+	'mkdir .git/ctags &&
+	 echo "2" > .git/ctags/foo &&
+	 echo "1" > .git/ctags/bar &&
+	gitweb_run'
+
+test_expect_success \
+	'ctags: search projects by existing tag' \
+	'gitweb_run "by_tag=foo"'
+
+test_expect_success \
+	'ctags: search projects by non existent tag' \
+	'gitweb_run "by_tag=non-existent"'
+
 test_done
diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh
index dd83890..731e64c 100755
--- a/t/t9502-gitweb-standalone-parse-output.sh
+++ b/t/t9502-gitweb-standalone-parse-output.sh
@@ -112,4 +112,78 @@ test_expect_success 'snapshot: hierarchical branch name (xx/test)' '
 '
 test_debug 'cat gitweb.headers'
 
+# ----------------------------------------------------------------------
+# forks of projects
+
+test_expect_success 'forks: setup' '
+	git init --bare foo.git &&
+	echo file > file &&
+	git --git-dir=foo.git --work-tree=. add file &&
+	git --git-dir=foo.git --work-tree=. commit -m "Initial commit" &&
+	echo "foo" > foo.git/description &&
+	git clone --bare foo.git foo.bar.git &&
+	echo "foo.bar" > foo.bar.git/description &&
+	git clone --bare foo.git foo_baz.git &&
+	echo "foo_baz" > foo_baz.git/description &&
+	rm -fr   foo &&
+	mkdir -p foo &&
+	(
+		cd foo &&
+		git clone --shared --bare ../foo.git foo-forked.git &&
+		echo "fork of foo" > foo-forked.git/description
+	)
+'
+
+test_expect_success 'forks: not skipped unless "forks" feature enabled' '
+	gitweb_run "a=project_list" &&
+	grep -q ">\\.git<"               gitweb.body &&
+	grep -q ">foo\\.git<"            gitweb.body &&
+	grep -q ">foo_baz\\.git<"        gitweb.body &&
+	grep -q ">foo\\.bar\\.git<"      gitweb.body &&
+	grep -q ">foo_baz\\.git<"        gitweb.body &&
+	grep -q ">foo/foo-forked\\.git<" gitweb.body &&
+	grep -q ">fork of .*<"           gitweb.body
+'
+
+cat >>gitweb_config.perl <<\EOF &&
+$feature{'forks'}{'default'} = [1];
+EOF
+
+test_expect_success 'forks: forks skipped if "forks" feature enabled' '
+	gitweb_run "a=project_list" &&
+	grep -q ">\\.git<"               gitweb.body &&
+	grep -q ">foo\\.git<"            gitweb.body &&
+	grep -q ">foo_baz\\.git<"        gitweb.body &&
+	grep -q ">foo\\.bar\\.git<"      gitweb.body &&
+	grep -q ">foo_baz\\.git<"        gitweb.body &&
+	grep -v ">foo/foo-forked\\.git<" gitweb.body &&
+	grep -v ">fork of .*<"           gitweb.body
+'
+
+test_expect_success 'forks: "forks" action for forked repository' '
+	gitweb_run "p=foo.git;a=forks" &&
+	grep -q ">foo/foo-forked\\.git<" gitweb.body &&
+	grep -q ">fork of foo<"          gitweb.body
+'
+
+test_expect_success 'forks: can access forked repository' '
+	gitweb_run "p=foo/foo-forked.git;a=summary" &&
+	grep -q "200 OK"        gitweb.headers &&
+	grep -q ">fork of foo<" gitweb.body
+'
+
+test_expect_success 'forks: project_index lists all projects (incl. forks)' '
+	cat >expected <<-\EOF
+	.git
+	foo.bar.git
+	foo.git
+	foo/foo-forked.git
+	foo_baz.git
+	EOF
+	gitweb_run "a=project_index" &&
+	sed -e "s/ .*//" <gitweb.body | sort >actual &&
+	test_cmp expected actual
+'
+
+
 test_done
-- 
1.7.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-03-02 16:12 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-26 22:06 [PATCH/RFC] gitweb: Restructure projects list generation Jakub Narebski
2011-02-27 10:08 ` Jonathan Nieder
2011-02-27 15:03   ` Jakub Narebski
2011-03-01  1:01     ` Jakub Narebski
2011-03-02 16:11       ` [PATCHv2/RFC] " Jakub Narebski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).