* [PATCH] gitweb: support the rel=vcs microformat @ 2009-01-07 4:25 Joey Hess 2009-01-07 12:30 ` Giuseppe Bilotta 2009-01-09 23:49 ` Jakub Narebski 0 siblings, 2 replies; 22+ messages in thread From: Joey Hess @ 2009-01-07 4:25 UTC (permalink / raw) To: git The rel=vcs microformat allows a web page to indicate the locations of repositories related to it in a machine-parseable manner. (See http://kitenet.net/~joey/rfc/rel-vcs/) Make gitweb use the microformat in the header of pages it generates, if it has been configured with project url information in any of the usual ways. Since getting the urls can require hitting disk, I avoided putting the microformat on *every* page gitweb generates. Just put it on the project summary page, the project list page, and the forks list page. The first of these already looks up the urls, so adding the microformat was free. There is a small overhead in including the microformat on the latter two pages, but getting the project descriptions for those pages already incurs a similar overhead, and the ability to get every repo url in one place seems worthwhile. This changes git_get_project_description() to not check wantarray, and only return in list context -- the only way it is used AFAICS. 
Signed-off-by: Joey Hess <joey@gnu.kitenet.net> --- gitweb/gitweb.perl | 38 ++++++++++++++++++++++++++------------ 1 files changed, 26 insertions(+), 12 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 99f71b4..3f8a228 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -789,6 +789,9 @@ $git_dir = "$projectroot/$project" if $project; our @snapshot_fmts = gitweb_get_feature('snapshot'); @snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts); +# populated later with git urls for the project +our @git_url_list; + # dispatch if (!defined $action) { if (defined $hash) { @@ -2100,17 +2103,22 @@ sub git_show_project_tagcloud { } sub git_get_project_url_list { + # use per project git URL list in $projectroot/$path/cloneurl + # or make project git URL from git base URL and project name my $path = shift; + my @ret; + $git_dir = "$projectroot/$path"; - open my $fd, "$git_dir/cloneurl" - or return wantarray ? - @{ config_to_multi(git_get_project_config('url')) } : - config_to_multi(git_get_project_config('url')); - my @git_project_url_list = map { chomp; $_ } <$fd>; - close $fd; + if (open my $fd, "$git_dir/cloneurl") { + @ret = map { chomp; $_ } <$fd>; + close $fd; + } + else { + @ret = @{ config_to_multi(git_get_project_config('url')) }; + } - return wantarray ? @git_project_url_list : \@git_project_url_list; + return @ret ? @ret : map { "$_/$project" } @git_base_url_list; } sub git_get_projects_list { @@ -2953,6 +2961,10 @@ EOF print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n); } + foreach my $url (@git_url_list) { + print qq{<link rel="vcs" type="git" href="$url" />\n}; + } + print "</head>\n" . 
"<body>\n"; @@ -4380,6 +4392,8 @@ sub git_project_list { die_error(404, "No projects found"); } + @git_url_list = map { git_get_project_url_list($_->{path}) } @list; + git_header_html(); if (-f $home_text) { print "<div class=\"index_include\">\n"; @@ -4400,6 +4414,8 @@ sub git_forks { if (defined $order && $order !~ m/none|project|descr|owner|age/) { die_error(400, "Unknown order parameter"); } + + @git_url_list = map { git_get_project_url_list($_->{path}) } @list; my @list = git_get_projects_list($project); if (!@list) { @@ -4457,6 +4473,8 @@ sub git_summary { @forklist = git_get_projects_list($project); } + @git_url_list = git_get_project_url_list($project); + git_header_html(); git_print_page_nav('summary','', $head); @@ -4468,12 +4486,8 @@ sub git_summary { print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n"; } - # use per project git URL list in $projectroot/$project/cloneurl - # or make project git URL from git base URL and project name my $url_tag = "URL"; - my @url_list = git_get_project_url_list($project); - @url_list = map { "$_/$project" } @git_base_url_list unless @url_list; - foreach my $git_url (@url_list) { + foreach my $git_url (@git_url_list) { next unless $git_url; print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n"; $url_tag = ""; -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH] gitweb: support the rel=vcs microformat 2009-01-07 4:25 [PATCH] gitweb: support the rel=vcs microformat Joey Hess @ 2009-01-07 12:30 ` Giuseppe Bilotta 2009-01-07 15:50 ` Joey Hess 2009-01-09 23:56 ` Jakub Narebski 2009-01-09 23:49 ` Jakub Narebski 1 sibling, 2 replies; 22+ messages in thread From: Giuseppe Bilotta @ 2009-01-07 12:30 UTC (permalink / raw) To: git On Wednesday 07 January 2009 05:25, Joey Hess wrote: > The rel=vcs microformat allows a web page to indicate the locations of > repositories related to it in a machine-parseable manner. > (See http://kitenet.net/~joey/rfc/rel-vcs/) Interesting idea, I like it. However, I see a problem in the proposed implementation versus the spec. According to the spec: """ The "title" is optional, but recommended if there are multiple, different repositories linked to on one page. It is a human-readable description of the repository. [...] If there are multiple repositories listed, without titles, tools should assume they are different repositories. """ In this patch you do NOT add titles to the rel=vcs links, which means that everything works fine only if there is a single URL for each project. If a project has different URLs, it's going to appear multiple times as _different_ projects to a spec-compliant reader. A possible solution would be to make @git_url_list into a map keyed by the project name and having the description and repo URL(s) as values. Since there is the possibility of different projects having the same description (e.g. the default one), the link title could be composed of "$project - $description" rather than simply $description. Note that both in summary and in project list view you already retrieve the description, so there are no additional disk hits. > Make gitweb use the microformat in the header of pages it generates, > if it has been configured with project url information in any of the usual > ways. 
> > Since getting the urls can require hitting disk, I avoided putting the > microformat on *every* page gitweb generates. Just put it on the project > summary page, the project list page, and the forks list page. > The first of these already looks up the urls, so adding the microformat was > free. There is a small overhead in including the microformat on the > latter two pages, but getting the project descriptions for those pages > already incurs a similar overhead, and the ability to get every repo url > in one place seems worthwhile. > > This changes git_get_project_description() to not check wantarray, and only > return in list context -- the only way it is used AFAICS. I assume you mean git_get_project_url_list()? > > Signed-off-by: Joey Hess <joey@gnu.kitenet.net> > --- > gitweb/gitweb.perl | 38 ++++++++++++++++++++++++++------------ > 1 files changed, 26 insertions(+), 12 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 99f71b4..3f8a228 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -789,6 +789,9 @@ $git_dir = "$projectroot/$project" if $project; > our @snapshot_fmts = gitweb_get_feature('snapshot'); > @snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts); > > +# populated later with git urls for the project > +our @git_url_list; > + > # dispatch > if (!defined $action) { > if (defined $hash) { > @@ -2100,17 +2103,22 @@ sub git_show_project_tagcloud { > } > > sub git_get_project_url_list { > + # use per project git URL list in $projectroot/$path/cloneurl > + # or make project git URL from git base URL and project name > my $path = shift; > > + my @ret; > + > $git_dir = "$projectroot/$path"; > - open my $fd, "$git_dir/cloneurl" > - or return wantarray ? 
> - @{ config_to_multi(git_get_project_config('url')) } : > - config_to_multi(git_get_project_config('url')); > - my @git_project_url_list = map { chomp; $_ } <$fd>; > - close $fd; > + if (open my $fd, "$git_dir/cloneurl") { > + @ret = map { chomp; $_ } <$fd>; > + close $fd; > + } > + else { Coding style: } else { > + @ret = @{ config_to_multi(git_get_project_config('url')) }; > + } > > - return wantarray ? @git_project_url_list : \@git_project_url_list; > + return @ret ? @ret : map { "$_/$project" } @git_base_url_list; > } > > sub git_get_projects_list { > @@ -2953,6 +2961,10 @@ EOF > print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n); > } > > + foreach my $url (@git_url_list) { > + print qq{<link rel="vcs" type="git" href="$url" />\n}; > + } > + > print "</head>\n" . > "<body>\n"; > > @@ -4380,6 +4392,8 @@ sub git_project_list { > die_error(404, "No projects found"); > } > > + @git_url_list = map { git_get_project_url_list($_->{path}) } @list; > + > git_header_html(); > if (-f $home_text) { > print "<div class=\"index_include\">\n"; > @@ -4400,6 +4414,8 @@ sub git_forks { > if (defined $order && $order !~ m/none|project|descr|owner|age/) { > die_error(400, "Unknown order parameter"); > } > + > + @git_url_list = map { git_get_project_url_list($_->{path}) } @list; > > my @list = git_get_projects_list($project); > if (!@list) { > @@ -4457,6 +4473,8 @@ sub git_summary { > @forklist = git_get_projects_list($project); > } > > + @git_url_list = git_get_project_url_list($project); > + > git_header_html(); > git_print_page_nav('summary','', $head); > > @@ -4468,12 +4486,8 @@ sub git_summary { > print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n"; > } > > - # use per project git URL list in $projectroot/$project/cloneurl > - # or make project git URL from git base URL and project name > my $url_tag = "URL"; > - my @url_list = git_get_project_url_list($project); > - @url_list = map { "$_/$project" } 
@git_base_url_list unless @url_list; > - foreach my $git_url (@url_list) { > + foreach my $git_url (@git_url_list) { > next unless $git_url; > print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n"; > $url_tag = ""; -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Joey Hess @ 2009-01-07 15:50 UTC
To: Giuseppe Bilotta; +Cc: git

Giuseppe Bilotta wrote:
> In this patch you do NOT add titles to the rel=vcs links, which means that
> everything works fine only if there is a single URL for each project. If a
> project has different URLs, it's going to appear multiple times as _different_
> projects to a spec-compliant reader.
>
> A possible solution would be to make @git_url_list into a map keyed by the
> project name and having the description and repo URL(s) as values.

Yes. I considered doing that, but didn't immediately see a way to get the
project description w/o additional overhead (of looking it up a second
time).

> > This changes git_get_project_description() to not check wantarray, and only
> > return in list context -- the only way it is used AFAICS.
>
> I assume you mean git_get_project_url_list()?

In fact yes.

Thanks for the feedback. There are some changes happening to the
microformat that should make gitweb's job slightly easier, I'll respin
the patch soon.

-- 
see shy jo
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Giuseppe Bilotta @ 2009-01-07 18:03 UTC
To: Joey Hess; +Cc: git

On Wed, Jan 7, 2009 at 4:50 PM, Joey Hess <joey@kitenet.net> wrote:
> Giuseppe Bilotta wrote:
>> In this patch you do NOT add titles to the rel=vcs links, which means that
>> everything works fine only if there is a single URL for each project. If a
>> project has different URLs, it's going to appear multiple times as _different_
>> projects to a spec-compliant reader.
>>
>> A possible solution would be to make @git_url_list into a map keyed by the
>> project name and having the description and repo URL(s) as values.
>
> Yes. I considered doing that, but didn't immediately see a way to get the
> project description w/o additional overhead (of looking it up a second
> time).

The solution I have in mind would be something like this: in summary
or projects list view (which are the views in which we put the links,
and also the views in which we look up the repo URL and the
description anyway), you fill up former @git_url_list (now
%project_metadata) looking up the repo description and URLs. You then
use this information both in the link tag and in the appropriate
places for the visible part of the webpage: you don't have a
significant overhead, because you're just moving the project
description retrieval early on.

You probably want to refactor the code by making a
git_get_project_metadata() sub that extends the current URL retrieval
by retrieving description and URLs. The routine can then be used
either for one or for all the projects, as needed.

> Thanks for the feedback. There are some changes happening to the
> microformat that should make gitweb's job slightly easier, I'll respin
> the patch soon.

Let me know about this too, I very much like the idea of this microformat.

-- 
Giuseppe "Oblomov" Bilotta
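The %project_metadata idea proposed above could look roughly like the following standalone sketch. Everything in it is an assumption made for illustration: the hash layout, the sub names link_title(), vcs_links() and esc_attr(), and the example.org data are hypothetical, and the real gitweb would fill the hash from the description and cloneurl lookups rather than from literals. The tag layout follows the first version of the patch (rel="vcs" type="git"), with the title added as "$project - $description".

```perl
use strict;
use warnings;

# Hypothetical data standing in for what gitweb would read from disk.
my %project_metadata = (
	'foo.git' => {
		descr => 'Foo library',
		urls  => ['git://example.org/foo.git', 'http://example.org/foo.git'],
	},
	'bar.git' => {
		descr => '',
		urls  => ['git://example.org/bar.git'],
	},
);

# Minimal attribute escaping, hand-rolled for this sketch only.
sub esc_attr {
	my $s = shift;
	$s =~ s/&/&amp;/g;
	$s =~ s/</&lt;/g;
	$s =~ s/>/&gt;/g;
	$s =~ s/"/&quot;/g;
	return $s;
}

# "$project - $description", or just the project name if there is
# no usable description.
sub link_title {
	my ($project, $descr) = @_;
	return (defined $descr && length $descr) ? "$project - $descr" : $project;
}

# One <link> per URL; all URLs of a project share the same title, so a
# reader can tell they point at the same repository.
sub vcs_links {
	my $meta = shift;
	my $ret  = '';
	foreach my $project (sort keys %$meta) {
		my $title = esc_attr(link_title($project, $meta->{$project}{descr}));
		foreach my $url (@{ $meta->{$project}{urls} }) {
			$ret .= sprintf qq{<link rel="vcs" type="git" href="%s" title="%s" />\n},
				esc_attr($url), $title;
		}
	}
	return $ret;
}

print vcs_links(\%project_metadata);
```

Because the description is already in the hash when the header is generated, the visible part of the page could reuse it without a second read.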
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Joey Hess @ 2009-01-07 18:41 UTC
To: Giuseppe Bilotta; +Cc: git

Giuseppe Bilotta wrote:
> > Thanks for the feedback. There are some changes happening to the
> > microformat that should make gitweb's job slightly easier, I'll respin
> > the patch soon.
>
> Let me know about this too, I very much like the idea of this microformat.

FYI, I've updated the microformat's page with the changes. The
significant one for gitweb is that it can now be applied to <a> links.
So on the project page, the display of the git URL could be converted to
a link using the microformat, and there's no need to get the info
earlier to put it in the header. Unfortunately, the same can't be done to
the project list page, unless it's changed to have "git" links as seen
on vger.kernel.org's gitweb.

-- 
see shy jo
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Jakub Narebski @ 2009-01-10 0:01 UTC
To: Joey Hess; +Cc: Giuseppe Bilotta, git

Joey Hess <joey@kitenet.net> writes:
> Giuseppe Bilotta wrote:
> > Joey Hess <joey@kitenet.net> writes:
> > > Thanks for the feedback. There are some changes happening to the
> > > microformat that should make gitweb's job slightly easier, I'll respin
> > > the patch soon.
> >
> > Let me know about this too, I very much like the idea of this microformat.
>
> FYI, I've updated the microformat's page with the changes. The
> significant one for gitweb is that it can now be applied to <a> links.
> So on the project page, the display of the git URL could be converted to
> a link using the microformat, and there's no need to get the info
> earlier to put it in the header. Unfortunately, the same can't be done to
> the project list page, unless it's changed to have "git" links as seen
> on vger.kernel.org's gitweb.

I'm not sure if making repository URLs hyperlinks is a good
idea. You cannot (should not) click on those in an ordinary web browser;
they are to be used by git (that is also an additional reason why I am not
so sure about the 'git' link on projects_list page idea).

Besides, LINK elements in the page HEAD are meant mainly for machines; I think
it might be more important to add them for machines there, even if they
also appear as A elements (links) or just plain text URLs somewhere else.

For example we have LINK elements with alternate versions, among
others OPML for projectless pages, and RSS/Atom for project pages,
even though those links are also in the page body.

So I'd rather have them LINKs...

-- 
Jakub Narebski
Poland
ShadeHawk on #git
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Joey Hess @ 2009-01-07 18:45 UTC
To: Giuseppe Bilotta; +Cc: git

Giuseppe Bilotta wrote:
> The solution I have in mind would be something like this: in summary
> or projects list view (which are the views in which we put the links,
> and also the views in which we look up the repo URL and the
> description anyway), you fill up former @git_url_list (now
> %project_metadata) looking up the repo description and URLs. You then
> use this information both in the link tag and in the appropriate
> places for the visible part of the webpage: you don't have a
> significant overhead, because you're just moving the project
> description retrieval early on.
>
> You probably want to refactor the code by making a
> git_get_project_metadata() sub that extends the current URL retrieval
> by retrieving description and URLs. The routine can then be used
> either for one or for all the projects, as needed.

Another approach would be to just memoize git_get_project_description
and git_get_project_url_list.

-- 
see shy jo
* Re: [PATCH] gitweb: support the rel=vcs microformat
From: Joey Hess @ 2009-01-07 19:02 UTC
To: Giuseppe Bilotta; +Cc: git

Joey Hess wrote:
> Another approach would be to just memoize git_get_project_description
> and git_get_project_url_list.

Especially since git_get_project_description is already called more than
once for some pages.

-- 
see shy jo
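One way to memoize a lookup like this without touching its body is the core Memoize module. The sketch below is not what the later patch does (that one uses a hand-written cache hash); get_description() and the $reads counter are hypothetical stand-ins for a sub that reads a file per project.

```perl
use strict;
use warnings;
use Memoize;

my $reads = 0;   # counts simulated disk reads

# Hypothetical stand-in for a sub that reads a file per project.
sub get_description {
	my $path = shift;
	$reads++;
	return "description of $path";
}

# Replace get_description with a caching wrapper keyed on its arguments.
memoize('get_description');

for (1 .. 3) {
	my $descr = get_description('foo.git');   # only the first call reads
}
my $other = get_description('bar.git');       # different key, one more read

print "reads: $reads\n";   # prints "reads: 2"
```

As far as I recall from its documentation, Memoize keeps separate caches for scalar and list context, which would matter for a sub such as git_get_project_url_list that returns a list.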
* [PATCH] gitweb: support the rel=vcs-* microformat 2009-01-07 19:02 ` Joey Hess @ 2009-01-07 23:24 ` Joey Hess 2009-01-08 7:56 ` Giuseppe Bilotta 2009-01-10 0:52 ` Jakub Narebski 2009-01-10 0:03 ` [PATCH] gitweb: support the rel=vcs microformat Jakub Narebski 1 sibling, 2 replies; 22+ messages in thread From: Joey Hess @ 2009-01-07 23:24 UTC (permalink / raw) To: git The rel=vcs-* microformat allows a web page to indicate the locations of repositories related to it in a machine-parseable manner. (See http://kitenet.net/~joey/rfc/rel-vcs/) Make gitweb use the microformat if it has been configured with project url information in any of the usual ways. On the project summary page, the repository URL display is simply marked up using the microformat. On the project list page and forks list page, the microformat is embedded in the header, since the URLs do not appear on the page. The microformat could be included on other pages too, but I've skipped doing so for now, since it would mean reading another file for every page displayed. There is a small overhead in including the microformat on project list and forks list pages, but getting the project descriptions for those pages already incurs a similar overhead, and the ability to get every repo url in one place seems worthwhile. This changes git_get_project_url_list() to not check wantarray, and only return in list context -- the only way it is used AFAICS. It memoizes both that function and git_get_project_description(), to avoid redundant file reads. Signed-off-by: Joey Hess <joey@gnu.kitenet.net> --- gitweb/gitweb.perl | 78 +++++++++++++++++++++++++++++++++++++++++---------- 1 files changed, 62 insertions(+), 16 deletions(-) This incorporates Giuseppe Bilotta's feedback, and uses new features of the microformat. 
You can see this version running at http://git.ikiwiki.info/ diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 99f71b4..c238717 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -2020,9 +2020,14 @@ sub git_get_path_by_hash { ## ...................................................................... ## git utility functions, directly accessing git repository +{ +my %project_descriptions; # cache + sub git_get_project_description { my $path = shift; + return $project_descriptions{$path} if exists $project_descriptions{$path}; + $git_dir = "$projectroot/$path"; open my $fd, "$git_dir/description" or return git_get_project_config('description'); @@ -2031,7 +2036,9 @@ sub git_get_project_description { if (defined $descr) { chomp $descr; } - return $descr; + return $project_descriptions{$path}=$descr; +} + } sub git_get_project_ctags { @@ -2099,18 +2106,30 @@ sub git_show_project_tagcloud { } } +{ +my %project_url_lists; # cache + sub git_get_project_url_list { + # use per project git URL list in $projectroot/$path/cloneurl + # or make project git URL from git base URL and project name my $path = shift; + return @{$project_url_lists{$path}} if exists $project_url_lists{$path}; + + my @ret; $git_dir = "$projectroot/$path"; - open my $fd, "$git_dir/cloneurl" - or return wantarray ? - @{ config_to_multi(git_get_project_config('url')) } : - config_to_multi(git_get_project_config('url')); - my @git_project_url_list = map { chomp; $_ } <$fd>; - close $fd; + if (open my $fd, "$git_dir/cloneurl") { + @ret = map { chomp; $_ } <$fd>; + close $fd; + } else { + @ret = @{ config_to_multi(git_get_project_config('url')) }; + } + @ret=map { "$_/$project" } @git_base_url_list if ! @ret; + + $project_url_lists{$path}=\@ret; + return @ret; +} - return wantarray ? 
@git_project_url_list : \@git_project_url_list; } sub git_get_projects_list { @@ -2856,6 +2875,7 @@ sub blob_contenttype { sub git_header_html { my $status = shift || "200 OK"; my $expires = shift; + my $extraheader = shift; my $title = "$site_name"; if (defined $project) { @@ -2953,6 +2973,8 @@ EOF print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n); } + print $extraheader if defined $extraheader; + print "</head>\n" . "<body>\n"; @@ -4365,6 +4387,26 @@ sub git_search_grep_body { print "</table>\n"; } +sub git_link_title { + my $project=shift; + + my $description=git_get_project_description($project); + return $project.(length $description ? " - $description" : ""); +} + +# generates header with links to the specified projects +sub git_links_header { + my $ret=''; + foreach my $project (@_) { + # rel=vcs-* microformat + my $title=git_link_title($project); + foreach my $url git_get_project_url_list($project) { + $ret.=qq{<link rel="vcs-git" href="$url" title="$title"/>\n} + } + } + return $ret; +} + ## ====================================================================== ## ====================================================================== ## actions @@ -4380,7 +4422,9 @@ sub git_project_list { die_error(404, "No projects found"); } - git_header_html(); + my $extraheader=git_links_header(map { $_->{path} } @list); + + git_header_html(undef, undef, $extraheader); if (-f $home_text) { print "<div class=\"index_include\">\n"; insert_file($home_text); @@ -4405,8 +4449,10 @@ sub git_forks { if (!@list) { die_error(404, "No forks found"); } + + my $extraheader=git_links_header(map { $_->{path} } @list); - git_header_html(); + git_header_html(undef, undef, $extraheader); git_print_page_nav('',''); git_print_header_div('summary', "$project forks"); git_project_list_body(\@list, $order); @@ -4468,14 +4514,14 @@ sub git_summary { print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n"; } - # use per project git URL 
list in $projectroot/$project/cloneurl - # or make project git URL from git base URL and project name my $url_tag = "URL"; - my @url_list = git_get_project_url_list($project); - @url_list = map { "$_/$project" } @git_base_url_list unless @url_list; - foreach my $git_url (@url_list) { + my $title=git_link_title($project); + foreach my $git_url (git_get_project_url_list($project)) { next unless $git_url; - print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n"; + print "<tr class=\"metadata_url\"><td>$url_tag</td><td>". + # rel=vcs-* microformat + "<a rel=\"vcs-git\" href=\"$git_url\" title=\"$title\">$git_url</a>". + "</td></tr>\n"; $url_tag = ""; } -- 1.5.6.5 -- see shy jo ^ permalink raw reply related [flat|nested] 22+ messages in thread
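The lookup order that git_get_project_url_list() follows in this patch (cloneurl file, then the project's configured URLs, then base URL plus project name) can be tried out in isolation with a sketch like the one below. It is an illustration under assumptions, not the gitweb code: %config_urls is a hypothetical stand-in for git_get_project_config('url'), the project names and example.org URLs are made up, caching is left out, and the fallback is built from $path rather than from the global $project that the patch interpolates.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical stand-ins for gitweb's globals; the values are made up.
my $projectroot       = tempdir(CLEANUP => 1);
my @git_base_url_list = ('git://example.org', 'http://example.org/git');

# Hypothetical stand-in for git_get_project_config('url'):
# project path => array ref of configured URLs.
my %config_urls = ('configured.git' => ['git://example.org/from-config.git']);

sub project_url_list {
	my $path = shift;
	my @ret;

	if (open my $fd, '<', "$projectroot/$path/cloneurl") {
		# 1. per-project cloneurl file, one URL per line
		chomp(@ret = <$fd>);
		close $fd;
	} else {
		# 2. URLs from the project's configuration
		@ret = @{ $config_urls{$path} || [] };
	}

	# 3. last resort: every base URL with the project path appended
	@ret = map { "$_/$path" } @git_base_url_list unless @ret;
	return @ret;
}

# Set up one project that has a cloneurl file.
mkdir "$projectroot/withfile.git" or die "mkdir: $!";
open my $out, '>', "$projectroot/withfile.git/cloneurl" or die "open: $!";
print $out "git://example.org/a.git\nhttp://example.org/a.git\n";
close $out;

print join("\n", project_url_list($_)), "\n"
	for qw(withfile.git configured.git plain.git);
```

Using $path in step 3 is my own choice for the sketch; on the project list page the global $project is probably not set, but that is my reading and not something stated in the thread.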
* Re: [PATCH] gitweb: support the rel=vcs-* microformat
From: Giuseppe Bilotta @ 2009-01-08 7:56 UTC
To: git

Hello Joey,

On Thursday 08 January 2009 00:24, Joey Hess wrote:
> The rel=vcs-* microformat allows a web page to indicate the locations of
> repositories related to it in a machine-parseable manner.
> (See http://kitenet.net/~joey/rfc/rel-vcs/)

Have you considered submitting the microformat to microformats.org? That would
make the microformat more official and would be a good first step to have
wider coverage of it, and additional reviews.

> Make gitweb use the microformat if it has been configured with project url
> information in any of the usual ways. On the project summary page, the
> repository URL display is simply marked up using the microformat. On the
> project list page and forks list page, the microformat is embedded in the
> header, since the URLs do not appear on the page.
>
> The microformat could be included on other pages too, but I've skipped
> doing so for now, since it would mean reading another file for every page
> displayed.
>
> There is a small overhead in including the microformat on project list
> and forks list pages, but getting the project descriptions for those pages
> already incurs a similar overhead, and the ability to get every repo url
> in one place seems worthwhile.

I agree with this, although people with very large project lists may
differ ... do we have timings on these?

> This changes git_get_project_url_list() to not check wantarray, and only
> return in list context -- the only way it is used AFAICS. It memoizes
> both that function and git_get_project_description(), to avoid redundant
> file reads.

You may want to consider splitting the patch into three: memoizing of
git_get_project_description(), reworking of git_get_project_url_list(), and
the actual rel=vcs-* insertions.

> Signed-off-by: Joey Hess <joey@gnu.kitenet.net>
> ---
> gitweb/gitweb.perl | 78 +++++++++++++++++++++++++++++++++++++++++----------
> 1 files changed, 62 insertions(+), 16 deletions(-)
>
> This incorporates Giuseppe Bilotta's feedback, and uses new features
> of the microformat. You can see this version running at
> http://git.ikiwiki.info/

Oh, and do consider cc'ing jnareb and paski when submitting patches for gitweb,
as they are the (unofficial?) maintainers. I usually cc gitster (Junio C Hamano)
too.

[ Also cc'ing me for this round would have been a nice idea too, since we had
the review going on ;-) ]

> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 99f71b4..c238717 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -2020,9 +2020,14 @@ sub git_get_path_by_hash {
> ## ......................................................................
> ## git utility functions, directly accessing git repository
>
> +{
> +my %project_descriptions; # cache
> +

Out of curiosity, why the grouping? I would have had

our %project_descriptions;

up above with all the global variables.

> sub git_get_project_description {
> my $path = shift;
>
> + return $project_descriptions{$path} if exists $project_descriptions{$path};
> +

This line is bordering on the 80 characters, so you may want to consider
moving 'my $descr' here, with something such as

my $descr = $project_descriptions{$path};
return $descr if exists $descr;

Also, I'm no perl guru so I'm not sure about exists vs defined here.
> $git_dir = "$projectroot/$path"; > open my $fd, "$git_dir/description" > or return git_get_project_config('description'); > @@ -2031,7 +2036,9 @@ sub git_get_project_description { > if (defined $descr) { > chomp $descr; > } > - return $descr; > + return $project_descriptions{$path}=$descr; > +} > + > } [This is where I would end the first patch] > > sub git_get_project_ctags { > @@ -2099,18 +2106,30 @@ sub git_show_project_tagcloud { > } > } > > +{ > +my %project_url_lists; # cache > + Ditto for this: why not our %project_url_lists; without scoping? > sub git_get_project_url_list { > + # use per project git URL list in $projectroot/$path/cloneurl > + # or make project git URL from git base URL and project name > my $path = shift; > > + return @{$project_url_lists{$path}} if exists $project_url_lists{$path}; > + > + my @ret; > $git_dir = "$projectroot/$path"; > - open my $fd, "$git_dir/cloneurl" > - or return wantarray ? > - @{ config_to_multi(git_get_project_config('url')) } : > - config_to_multi(git_get_project_config('url')); > - my @git_project_url_list = map { chomp; $_ } <$fd>; > - close $fd; > + if (open my $fd, "$git_dir/cloneurl") { > + @ret = map { chomp; $_ } <$fd>; > + close $fd; > + } else { > + @ret = @{ config_to_multi(git_get_project_config('url')) }; > + } > + @ret=map { "$_/$project" } @git_base_url_list if ! @ret; > + > + $project_url_lists{$path}=\@ret; > + return @ret; > +} > > - return wantarray ? @git_project_url_list : \@git_project_url_list; > } [This is where I would end the second patch] > > sub git_get_projects_list { > @@ -2856,6 +2875,7 @@ sub blob_contenttype { > sub git_header_html { > my $status = shift || "200 OK"; > my $expires = shift; > + my $extraheader = shift; > > my $title = "$site_name"; > if (defined $project) { > @@ -2953,6 +2973,8 @@ EOF > print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n); > } > > + print $extraheader if defined $extraheader; > + > print "</head>\n" . 
> "<body>\n"; > > @@ -4365,6 +4387,26 @@ sub git_search_grep_body { > print "</table>\n"; > } > > +sub git_link_title { > + my $project=shift; > + > + my $description=git_get_project_description($project); > + return $project.(length $description ? " - $description" : ""); > +} Nice. > + > +# generates header with links to the specified projects > +sub git_links_header { > + my $ret=''; > + foreach my $project (@_) { > + # rel=vcs-* microformat > + my $title=git_link_title($project); > + foreach my $url git_get_project_url_list($project) { > + $ret.=qq{<link rel="vcs-git" href="$url" title="$title"/>\n} > + } > + } > + return $ret; > +} > + > ## ====================================================================== > ## ====================================================================== > ## actions > @@ -4380,7 +4422,9 @@ sub git_project_list { > die_error(404, "No projects found"); > } > > - git_header_html(); > + my $extraheader=git_links_header(map { $_->{path} } @list); > + > + git_header_html(undef, undef, $extraheader); > if (-f $home_text) { > print "<div class=\"index_include\">\n"; > insert_file($home_text); > @@ -4405,8 +4449,10 @@ sub git_forks { > if (!@list) { > die_error(404, "No forks found"); > } > + > + my $extraheader=git_links_header(map { $_->{path} } @list); > > - git_header_html(); > + git_header_html(undef, undef, $extraheader); This makes me wonder if it would be worth it to turn git_header_html into -param => value style, but I'm not really sure it's worth it. 
> git_print_page_nav('',''); > git_print_header_div('summary', "$project forks"); > git_project_list_body(\@list, $order); > @@ -4468,14 +4514,14 @@ sub git_summary { > print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n"; > } > > - # use per project git URL list in $projectroot/$project/cloneurl > - # or make project git URL from git base URL and project name > my $url_tag = "URL"; > - my @url_list = git_get_project_url_list($project); > - @url_list = map { "$_/$project" } @git_base_url_list unless @url_list; > - foreach my $git_url (@url_list) { > + my $title=git_link_title($project); > + foreach my $git_url (git_get_project_url_list($project)) { > next unless $git_url; > - print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n"; > + print "<tr class=\"metadata_url\"><td>$url_tag</td><td>". > + # rel=vcs-* microformat > + "<a rel=\"vcs-git\" href=\"$git_url\" title=\"$title\">$git_url</a>". > + "</td></tr>\n"; > $url_tag = ""; > } Good. Of course the comment removal (which is actually a due move to git_get_project_url_list) would go in the appropriate patch if you split them 8-) -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 22+ messages in thread
* gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-08 7:56 ` Giuseppe Bilotta @ 2009-01-08 19:54 ` Joey Hess 2009-01-08 23:53 ` J.H. 2009-01-10 1:11 ` Jakub Narebski 2009-01-10 1:04 ` [PATCH] gitweb: support the rel=vcs-* microformat Jakub Narebski 1 sibling, 2 replies; 22+ messages in thread From: Joey Hess @ 2009-01-08 19:54 UTC (permalink / raw) To: git [-- Attachment #1: Type: text/plain, Size: 2147 bytes --] Giuseppe Bilotta wrote: > > There is a small overhead in including the microformat on project list > > and forks list pages, but getting the project descriptions for those pages > > already incurs a similar overhead, and the ability to get every repo url > > in one place seems worthwhile. > > I agree with this, although people with very large project lists may > differ ... do we have timings on these? AFAICS, when displaying the project list, gitweb reads each project's description file, falling back to reading its config file if there is no description file. If performance was a problem here, the thing to do would be to add project descriptions to the $project_list file, and use those in preference to the description files. If a large site has done that, they've not sent in the patch. :-) With my patch, it will read each cloneurl file too. The best way to optimise that for large sites seems to be to add an option that would ignore the cloneurl files and config file and always use @git_base_url_list. I checked the only large site I have access to (git.debian.org) and they use a $project_list file, but I see no other performance tuning. That's a 2 ghz machine; it takes gitweb 28 (!) seconds to generate the nearly 1 MB index web page for 1671 repositories: /srv/git.debian.org/http/cgi-bin/gitweb.cgi 3.04s user 9.24s system 43% cpu 28.515 total Notice that most of the time is spent by child processes. For each repository, gitweb runs git-for-each-ref to determine the time of the last commit. 
If that is removed (say if there were a way to get the info w/o forking), performance improves nicely: ./gitweb.cgi > /dev/null 1.29s user 1.08s system 69% cpu 3.389 total Making it not read description files for each project, as I suggest above, is the next best optimisation: ./gitweb.cgi > /dev/null 1.08s user 0.05s system 96% cpu 1.170 total So, I think it makes sense to optimise gitweb and offer knobs for performance tuning at the expense of the flexibility of description and cloneurl files. But, git-for-each-ref is swamping everything else. -- see shy jo [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
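The knob suggested above (ignore the per-repository cloneurl and config lookups and always derive URLs from @git_base_url_list) could look roughly like this. It is a minimal standalone sketch, not gitweb code: `$prefer_base_urls`, `project_url_list` and the code ref standing in for the per-repository lookup are hypothetical names, and the example URLs are made up.

```perl
use strict;
use warnings;

# Example configuration, mirroring gitweb's @git_base_url_list.
our @git_base_url_list = ('git://example.org', 'http://example.org/git');

# Hypothetical knob: when true, never touch per-repository files.
our $prefer_base_urls = 1;

# $lookup is a code ref standing in for the per-repository lookup
# (cloneurl file, then gitweb.url in the config); it returns a list.
sub project_url_list {
	my ($project, $lookup) = @_;

	my @urls;
	@urls = $lookup->($project) unless $prefer_base_urls;
	# Fall back to (or, with the knob set, always use) the base URLs.
	@urls = map { "$_/$project" } @git_base_url_list unless @urls;
	return @urls;
}

# With the knob set, the (possibly expensive) lookup is never called.
my $disk_hits = 0;
my @urls = project_url_list('foo.git', sub { $disk_hits++; return () });
print "$_\n" for @urls;
print "per-repository lookups: $disk_hits\n";
```

The trade-off is the one described in the message: sites lose per-repository URL overrides in exchange for one fewer file read per project on the index page.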
* Re: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-08 19:54 ` gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) Joey Hess @ 2009-01-08 23:53 ` J.H. 2009-01-09 0:16 ` Miklos Vajna ` (2 more replies) 2009-01-10 1:11 ` Jakub Narebski 1 sibling, 3 replies; 22+ messages in thread From: J.H. @ 2009-01-08 23:53 UTC (permalink / raw) To: Joey Hess; +Cc: git Joey Hess wrote: > Giuseppe Bilotta wrote: > >>> There is a small overhead in including the microformat on project list >>> and forks list pages, but getting the project descriptions for those pages >>> already incurs a similar overhead, and the ability to get every repo url >>> in one place seems worthwhile. >>> >> I agree with this, although people with very large project lists may >> differ ... do we have timings on these? >> > > AFAICS, when displaying the project list, gitweb reads each project's > description file, falling back to reading its config file if there is no > description file. > > If performance was a problem here, the thing to do would be to add > project descriptions to the $project_list file, and use those in > preference to the description files. If a large site has done that, > they've not sent in the patch. :-) > No because all the large sites have pain points and issues elsewhere in the app. Most of the large sites (which I can at least speak for Kernel.org) went and have built in full caching layers into gitweb itself to deal with the problem. This means that we don't have to worry about nickle and dime performance improvements that are specific to one section, but can do a very broad sweep and get dramatically better performance across all of gitweb. Those patches have all made it back out onto the mailing list, but for a number of different reasons none have been accepted into the mainline branch. > With my patch, it will read each cloneurl file too. 
The best way to > optimise that for large sites seems to be to add an option that would > ignore the cloneurl files and config file and always use > @git_base_url_list. > > I checked the only large site I have access to (git.debian.org) and they > use a $project_list file, but I see no other performance tuning. That's > a 2 ghz machine; it takes gitweb 28 (!) seconds to generate the nearly 1 > MB index web page for 1671 repositories: > Look at either Lea's or my caching engines, it will help dramatically on something of that size. > /srv/git.debian.org/http/cgi-bin/gitweb.cgi 3.04s user 9.24s system 43% cpu 28.515 total > > Notice that most of the time is spent by child processes. For each > repository, gitweb runs git-for-each-ref to determine the time of the > last commit. > > If that is removed (say if there were a way to get the info w/o > forking), performance improves nicely: > > ./gitweb.cgi > /dev/null 1.29s user 1.08s system 69% cpu 3.389 total > > Making it not read description files for each project, as I suggest above, > is the next best optimisation: > > ./gitweb.cgi > /dev/null 1.08s user 0.05s system 96% cpu 1.170 total > > So, I think it makes sense to optimise gitweb and offer knobs for performance > tuning at the expense of the flexability of description and cloneurl files. > But, git-for-each-ref is swamping everything else The problem is the knobs are going to be very fine grained, you really are better off looking at one of the caching engines that's available now. Performance options are hard, because it's difficult to relay to anyone the complex tradeoffs, thus keeping knobs like that to a minimum are really a necessity. - John 'Warthog9' Hawley ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-08 23:53 ` J.H. @ 2009-01-09 0:16 ` Miklos Vajna 2009-01-09 0:19 ` Johannes Schindelin 2009-01-10 1:44 ` Jakub Narebski 2 siblings, 0 replies; 22+ messages in thread From: Miklos Vajna @ 2009-01-09 0:16 UTC (permalink / raw) To: J.H., git; +Cc: Joey Hess [-- Attachment #1: Type: text/plain, Size: 433 bytes --] On Thu, Jan 08, 2009 at 03:53:16PM -0800, "J.H." <warthog19@eaglescrag.net> wrote: > Look at either Lea's or my caching engines, it will help dramatically on > something of that size. repo.or.cz uses a single patch for caching the project list only: http://repo.or.cz/w/git/repo.git?a=commit;h=152fb0b22d36c6981ac3c4403b69ad91b27a1bc6 you are probably better off with such a small patch instead of using a gitweb fork. [-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-08 23:53 ` J.H. 2009-01-09 0:16 ` Miklos Vajna @ 2009-01-09 0:19 ` Johannes Schindelin 2009-01-09 0:26 ` J.H. 2009-01-10 1:44 ` Jakub Narebski 2 siblings, 1 reply; 22+ messages in thread From: Johannes Schindelin @ 2009-01-09 0:19 UTC (permalink / raw) To: J.H.; +Cc: Joey Hess, git Hi, On Thu, 8 Jan 2009, J.H. wrote: > Look at either Lea's or my caching engines, it will help dramatically on > something of that size. Speaking of which, do you have any performance comparisons between the two? Ciao, Dscho ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-09 0:19 ` Johannes Schindelin @ 2009-01-09 0:26 ` J.H. 0 siblings, 0 replies; 22+ messages in thread From: J.H. @ 2009-01-09 0:26 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Joey Hess, git Johannes Schindelin wrote: > Hi, > > On Thu, 8 Jan 2009, J.H. wrote: > > >> Look at either Lea's or my caching engines, it will help dramatically on >> something of that size. >> > > Speaking of which, do you have any performance comparisons between the > two? > Lea's got some - I can see if I can dig up my copy (or if she's paying attention maybe she can publish them), though either one is orders of magnitude faster than the normal code. Beyond that it waffles back and forth which one is faster & why mainly because of the approaches we each took on the caching. Generally speaking I would push people more towards Lea's than my work, if nothing else hers is more in line with current gitweb, though I have had some thoughts about undoing my file breakout and getting my code base back up to speed. - John 'Warthog9' Hawley ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-08 23:53 ` J.H. 2009-01-09 0:16 ` Miklos Vajna 2009-01-09 0:19 ` Johannes Schindelin @ 2009-01-10 1:44 ` Jakub Narebski 2 siblings, 0 replies; 22+ messages in thread From: Jakub Narebski @ 2009-01-10 1:44 UTC (permalink / raw) To: J.H.; +Cc: Joey Hess, git, Giuseppe Bilotta "J.H." <warthog19@eaglescrag.net> writes: > Joey Hess wrote: >> Giuseppe Bilotta wrote: >> >>>> There is a small overhead in including the microformat on project list >>>> and forks list pages, but getting the project descriptions for those pages >>>> already incurs a similar overhead, and the ability to get every repo url >>>> in one place seems worthwhile. >>>> >>> I agree with this, although people with very large project lists may >>> differ ... do we have timings on these? >>> >> >> AFAICS, when displaying the project list, gitweb reads each project's >> description file, falling back to reading its config file if there is no >> description file. >> >> If performance was a problem here, the thing to do would be to add >> project descriptions to the $project_list file, and use those in >> preference to the description files. If a large site has done that, >> they've not sent in the patch. :-) > > No because all the large sites have pain points and issues elsewhere > in the app. Most of the large sites (which I can at least speak for > Kernel.org) went and have built in full caching layers into gitweb > itself to deal with the problem. This means that we don't have to > worry about nickle and dime performance improvements that are specific > to one section, but can do a very broad sweep and get dramatically > better performance across all of gitweb. Those patches have all made > it back out onto the mailing list, but for a number of different > reasons none have been accepted into the mainline branch. 
An additional issue is that when you add or delete a repository (project), you have to correct or regenerate the projects_index file. While I think that is quite easy for git hosting sites such as repo.or.cz, it is harder for sites which offer gitweb just like they offer WWW homepages: as a service, with repositories created (and descriptions updated) outside of gitweb's control. -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) 2009-01-08 19:54 ` gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) Joey Hess 2009-01-08 23:53 ` J.H. @ 2009-01-10 1:11 ` Jakub Narebski 1 sibling, 0 replies; 22+ messages in thread From: Jakub Narebski @ 2009-01-10 1:11 UTC (permalink / raw) To: Joey Hess; +Cc: git, Giuseppe Bilotta Joey Hess <joey@kitenet.net> writes: > Giuseppe Bilotta wrote: > > > There is a small overhead in including the microformat on project list > > > and forks list pages, but getting the project descriptions for those pages > > > already incurs a similar overhead, and the ability to get every repo url > > > in one place seems worthwhile. > > > > I agree with this, although people with very large project lists may > > differ ... do we have timings on these? > > AFAICS, when displaying the project list, gitweb reads each project's > description file, falling back to reading its config file if there is no > description file. > > If performance was a problem here, the thing to do would be to add > project descriptions to the $project_list file, and use those in > preference to the description files. If a large site has done that, > they've not sent in the patch. :-) There was such patch sent by me, but IIRC it fall out, also because it was sent IIRC in feature freeze time. I have "gitweb: Extend project_index file format by project description" in my StGit stack. > > With my patch, it will read each cloneurl file too. The best way to > optimise that for large sites seems to be to add an option that would > ignore the cloneurl files and config file and always use > @git_base_url_list. Good idea. > > I checked the only large site I have access to (git.debian.org) and they > use a $project_list file, but I see no other performance tuning. That's > a 2 ghz machine; it takes gitweb 28 (!) 
seconds to generate the nearly 1 > MB index web page for 1671 repositories: > > /srv/git.debian.org/http/cgi-bin/gitweb.cgi 3.04s user 9.24s system 43% cpu 28.515 total > > > Notice that most of the time is spent by child processes. For each > repository, gitweb runs git-for-each-ref to determine the time of the > last commit. > > If that is removed (say if there were a way to get the info w/o > forking), performance improves nicely: > > ./gitweb.cgi > /dev/null 1.29s user 1.08s system 69% cpu 3.389 total > > Making it not read description files for each project, as I suggest above, > is the next best optimisation: > > ./gitweb.cgi > /dev/null 1.08s user 0.05s system 96% cpu 1.170 total > > So, I think it makes sense to optimise gitweb and offer knobs for performance > tuning at the expense of the flexibility of description and cloneurl files. > But, git-for-each-ref is swamping everything else. One solution would be to limit the number of projects displayed on the page, for example to 100 projects, although that would mainly reduce the problem of dealing with a large page on the client side, less so server load, unless we _do not_ sort projects by age. Another solution would be to use caching: repo.or.cz uses one solution (caching only of projects_list action), kernel.org other solution (gitweb caching from GSoC 2008 project). -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] gitweb: support the rel=vcs-* microformat 2009-01-08 7:56 ` Giuseppe Bilotta 2009-01-08 19:54 ` gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) Joey Hess @ 2009-01-10 1:04 ` Jakub Narebski 1 sibling, 0 replies; 22+ messages in thread From: Jakub Narebski @ 2009-01-10 1:04 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Joey Hess Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > On Thursday 08 January 2009 00:24, Joey Hess wrote: > > > The rel=vcs-* microformat allows a web page to indicate the locations of > > repositories related to it in a machine-parseable manner. > > (See http://kitenet.net/~joey/rfc/rel-vcs/) > > Have you considered submitting the microformat to microformats.org? > That would make the microformat more official and would be a good > first step to have wider coverage of it, and additional reviews. Good thinking. BTW, microformats.org is IIRC a wiki (or at least part of it is a wiki), so it should be easy to do... > > > Make gitweb use the microformat if it has been configured with project url > > information in any of the usual ways. On the project summary page, the > > repository URL display is simply marked up using the microformat. On the > > project list page and forks list page, the microformat is embedded in the > > header, since the URLs do not appear on the page. > > > > The microformat could be included on other pages too, but I've skipped > > doing so for now, since it would mean reading another file for every page > > displayed. > > > > There is a small overhead in including the microformat on project list > > and forks list pages, but getting the project descriptions for those pages > > already incurs a similar overhead, and the ability to get every repo url > > in one place seems worthwhile. > > I agree with this, although people with very large project lists may > differ ... do we have timings on these?
I think that while adding this microformat to the 'summary' page is a non-issue, we might want to be able to configure it out so that it is not used for the projects_list page (which might be very large). And what about OPML, RSS and Atom formats? > > > This changes git_get_project_url_list() to not check wantarray, and only > > return in list context -- the only way it is used AFAICS. It memoizes > > both that function and git_get_project_description(), to avoid redundant > > file reads. > > You may want to consider splitting the patch into three: memoizing > of git_get_project_description(), reworking of > git_get_project_url_list(), and the actual rel=vcs-* insertions. Very good idea. Small, single feature patches are nice. [...] > > sub git_get_project_description { > > my $path = shift; > > > > + return $project_descriptions{$path} if exists $project_descriptions{$path}; > > + > > This line is bordering on the 80 characters, so you may want to > consider moving 'my $descr' here, with something such as > > my $descr = $project_descriptions{$path}; > return $descr if exists $descr; > > Also, I'm no perl guru so I'm not sure about exists vs defined here. You might have an undefined value in an existing key, but I guess that we can assume that those are equivalent for this. While 'exists' seems closer to what you check (does the key exist in the hash), you further on rely on the fact that $descr is not undefined. [...]
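To make the exists-versus-defined point concrete, here is a small standalone sketch of the memoization pattern. `read_description` is a hypothetical stand-in for reading the description file, not gitweb's function. As far as I know, `exists` only applies to hash or array elements (or subroutines), so the check has to stay on the hash element rather than on a copied scalar such as `$descr`.

```perl
use strict;
use warnings;

my %cache;
my $reads = 0;

# Hypothetical stand-in for reading $GIT_DIR/description; returns undef
# when the project has no description.
sub read_description {
	my ($path) = @_;
	$reads++;
	return $path eq 'foo.git' ? 'Foo project' : undef;
}

sub cached_description {
	my ($path) = @_;
	# exists is true even when the cached value is undef, so a missing
	# description is looked up only once; a 'defined' test would
	# repeat the lookup on every call for such projects.
	return $cache{$path} if exists $cache{$path};
	return $cache{$path} = read_description($path);
}

cached_description('foo.git') for 1 .. 3;
cached_description('bar.git') for 1 .. 3;
print "lookups performed: $reads\n";
```

With `exists`, the six calls above perform two lookups, one per project; callers still have to cope with an undefined return value for projects without a description.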
> > ## ====================================================================== > > ## ====================================================================== > > ## actions > > @@ -4380,7 +4422,9 @@ sub git_project_list { > > die_error(404, "No projects found"); > > } > > > > - git_header_html(); > > + my $extraheader=git_links_header(map { $_->{path} } @list); > > + > > + git_header_html(undef, undef, $extraheader); > > if (-f $home_text) { > > print "<div class=\"index_include\">\n"; > > insert_file($home_text); > > @@ -4405,8 +4449,10 @@ sub git_forks { > > if (!@list) { > > die_error(404, "No forks found"); > > } > > + > > + my $extraheader=git_links_header(map { $_->{path} } @list); > > > > - git_header_html(); > > + git_header_html(undef, undef, $extraheader); > > This makes me wonder if it would be worth it to turn git_header_html > into -param => value style, but I'm not really sure it's worth it. It is git_header_html(STATUS, EXPIRES, EXTRA) Hmmm... now I have checked we use either git_header_html() in gitweb (which is most common), or git_header_html(STATUS) in die_error, or in a few cases git_header_html(undef, $expires); and now git_header_html(undef, undef, $extra), so named parameters might be a good idea... I don't have opinion here... -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 22+ messages in thread
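For comparison, a named-parameter variant of the header routine might look like the following. This is only a sketch of the calling convention under discussion: `header_html` is a hypothetical name, it returns a string instead of printing, and the `-status`, `-expires` and `-extra` keys are assumptions rather than an existing gitweb interface.

```perl
use strict;
use warnings;

# Hypothetical named-parameter variant of git_header_html(); it returns
# the text it would print so the sketch stays self-contained.
sub header_html {
	# Defaults first, caller-supplied pairs override them.
	my %opts = (-status => '200 OK', @_);

	my $out = 'Status: ' . $opts{-status} . "\n";
	$out .= 'Expires: ' . $opts{-expires} . "\n" if defined $opts{-expires};
	$out .= "\n<head>\n";
	$out .= $opts{-extra} if defined $opts{-extra};
	$out .= "</head>\n";
	return $out;
}

# The common case needs no arguments at all ...
print header_html();
# ... and passing only the extra header no longer needs undef placeholders.
print header_html(-extra => qq{<link rel="vcs-git" href="git://example.org/foo.git" />\n});
```

The gain is mostly at the call sites that today read `git_header_html(undef, undef, $extraheader)`; whether that justifies touching every caller is the open question raised above.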
* Re: [PATCH] gitweb: support the rel=vcs-* microformat 2009-01-07 23:24 ` [PATCH] gitweb: support the rel=vcs-* microformat Joey Hess 2009-01-08 7:56 ` Giuseppe Bilotta @ 2009-01-10 0:52 ` Jakub Narebski 1 sibling, 0 replies; 22+ messages in thread From: Jakub Narebski @ 2009-01-10 0:52 UTC (permalink / raw) To: Joey Hess; +Cc: git Joey Hess <joey@kitenet.net> writes: > The rel=vcs-* microformat allows a web page to indicate the locations of > repositories related to it in a machine-parseable manner. > (See http://kitenet.net/~joey/rfc/rel-vcs/) > > Make gitweb use the microformat if it has been configured with project url > information in any of the usual ways. On the project summary page, the > repository URL display is simply marked up using the microformat. On the > project list page and forks list page, the microformat is embedded in the > header, since the URLs do not appear on the page. I think having LINK elements also for 'summary' page would be a good idea. This microformat is I think mainly for machines, and machines can I guess read better a few LINK elements in fairly small HEAD of page, than scan all of many link (A) elements on the page for those matching vcs-* microformat. Beside I am not sure if for example hyperlinking SCP-style repository URL makes sense at all; I am also not sure if hyperlinking links on which you cannot click on makes good sense (unless you use SPAN or ABBR instead of A to mark repo links...) > > The microformat could be included on other pages too, but I've skipped > doing so for now, since it would mean reading another file for every page > displayed. Also it is not necessary: if some tool want to get repo links for given project, it can get 'summary' page; if some tool want to get list of all repos, it can access one of projects list actions. 
> > There is a small overhead in including the microformat on project list > and forks list pages, but getting the project descriptions for those pages > already incurs a similar overhead, and the ability to get every repo url > in one place seems worthwhile. By the way, do you have any benchmarks for that? > > This changes git_get_project_url_list() to not check wantarray, and only > return in list context -- the only way it is used AFAICS. It memoizes > both that function and git_get_project_description(), to avoid redundant > file reads. I would also add that, from what I understand, you have made git_get_project_url_list() subroutine to be self-sufficient: it now considers both per-repository configuration (gitweb.url in config, cloneurl file in $GIT_DIR) and global gitweb configuration (@git_base_url_list variable). Simplification of code so it always return list and does nto check contents is a side issue, orthogonal to issue mentioned above. > > Signed-off-by: Joey Hess <joey@gnu.kitenet.net> > --- > gitweb/gitweb.perl | 78 +++++++++++++++++++++++++++++++++++++++++---------- > 1 files changed, 62 insertions(+), 16 deletions(-) > > This incorporates Giuseppe Bilotta's feedback, and uses new features > of the microformat. You can see this version running at > http://git.ikiwiki.info/ > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 99f71b4..c238717 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -2020,9 +2020,14 @@ sub git_get_path_by_hash { > ## ...................................................................... > ## git utility functions, directly accessing git repository > > +{ > +my %project_descriptions; # cache > + Won't we get warnings (and perhaps errors) from mod_perl? Shouldn't this be "our %project_descriptions;"? 
> sub git_get_project_description { > my $path = shift; > > + return $project_descriptions{$path} if exists $project_descriptions{$path}; > + > $git_dir = "$projectroot/$path"; > open my $fd, "$git_dir/description" > or return git_get_project_config('description'); > @@ -2031,7 +2036,9 @@ sub git_get_project_description { > if (defined $descr) { > chomp $descr; > } > - return $descr; > + return $project_descriptions{$path}=$descr; > +} > + > } If we use 'title="$project git repository" for 'rel="vcs-git"' links, is it still worth it extra complication to avoid double calculation of project description in the case of 'summary' view for a project? Because IIRC for 'projects_list' view it is already cached in @projects list as 'descr' key... > > sub git_get_project_ctags { > @@ -2099,18 +2106,30 @@ sub git_show_project_tagcloud { > } > } > > +{ > +my %project_url_lists; # cache > + Same question: would it work correctly for mod_perl? > sub git_get_project_url_list { > + # use per project git URL list in $projectroot/$path/cloneurl > + # or make project git URL from git base URL and project name > my $path = shift; > > + return @{$project_url_lists{$path}} if exists $project_url_lists{$path}; > + > + my @ret; > $git_dir = "$projectroot/$path"; > - open my $fd, "$git_dir/cloneurl" > - or return wantarray ? > - @{ config_to_multi(git_get_project_config('url')) } : > - config_to_multi(git_get_project_config('url')); > - my @git_project_url_list = map { chomp; $_ } <$fd>; > - close $fd; > + if (open my $fd, "$git_dir/cloneurl") { > + @ret = map { chomp; $_ } <$fd>; > + close $fd; > + } else { > + @ret = @{ config_to_multi(git_get_project_config('url')) }; > + } > + @ret=map { "$_/$project" } @git_base_url_list if ! @ret; Style: + @ret = map { "$_/$project" } @git_base_url_list if !@ret; or even + @ret = map { "$_/$project" } @git_base_url_list unless @ret; > + > + $project_url_lists{$path}=\@ret; > + return @ret; > +} > > - return wantarray ? 
@git_project_url_list : \@git_project_url_list; > } Again: is it worth caching? It is only for 'summary'; for 'projects_list' it might be better to extend @projects list instead > > sub git_get_projects_list { > @@ -2856,6 +2875,7 @@ sub blob_contenttype { > sub git_header_html { > my $status = shift || "200 OK"; > my $expires = shift; > + my $extraheader = shift; > > my $title = "$site_name"; > if (defined $project) { > @@ -2953,6 +2973,8 @@ EOF > print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n); > } > > + print $extraheader if defined $extraheader; > + > print "</head>\n" . > "<body>\n"; > Good solution, but shouldn't this be better put into separate commit, simply extending git_header_html to allow to add extra data (no need to name it $extraheader I think, $extra would be enough) to the HTML header (HEAD element contents)? > @@ -4365,6 +4387,26 @@ sub git_search_grep_body { > print "</table>\n"; > } > > +sub git_link_title { > + my $project=shift; > + > + my $description=git_get_project_description($project); > + return $project.(length $description ? " - $description" : ""); > +} Style (whitespace around '='), and the fact that IMHO "$project git repository" is better than "$project - $description", also because of "Unnamed repository; edit this file to name it for gitweb." default template > + > +# generates header with links to the specified projects > +sub git_links_header { Good abstraction, but I'm not so sure about subroutine name. > + my $ret=''; > + foreach my $project (@_) { Style: I'd rather use named variables, like "my @projects = @_"; also everywhere else we use spaces around '=' usually. > + # rel=vcs-* microformat > + my $title=git_link_title($project); Good abstraction. 
> + foreach my $url git_get_project_url_list($project) { > + $ret.=qq{<link rel="vcs-git" href="$url" title="$title"/>\n} To be HTML compatibile, it is better to use > + $ret.=qq{<link rel="vcs-git" href="$url" title="$title" />\n} (note the space before "/>"). > + } > + } > + return $ret; > +} > + > ## ====================================================================== > ## ====================================================================== > ## actions > @@ -4380,7 +4422,9 @@ sub git_project_list { > die_error(404, "No projects found"); > } > > - git_header_html(); > + my $extraheader=git_links_header(map { $_->{path} } @list); > + > + git_header_html(undef, undef, $extraheader); > if (-f $home_text) { > print "<div class=\"index_include\">\n"; > insert_file($home_text); > @@ -4405,8 +4449,10 @@ sub git_forks { > if (!@list) { > die_error(404, "No forks found"); > } > + > + my $extraheader=git_links_header(map { $_->{path} } @list); > > - git_header_html(); > + git_header_html(undef, undef, $extraheader); > git_print_page_nav('',''); > git_print_header_div('summary', "$project forks"); > git_project_list_body(\@list, $order); > @@ -4468,14 +4514,14 @@ sub git_summary { > print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n"; > } > > - # use per project git URL list in $projectroot/$project/cloneurl > - # or make project git URL from git base URL and project name > my $url_tag = "URL"; > - my @url_list = git_get_project_url_list($project); > - @url_list = map { "$_/$project" } @git_base_url_list unless @url_list; > - foreach my $git_url (@url_list) { > + my $title=git_link_title($project); > + foreach my $git_url (git_get_project_url_list($project)) { > next unless $git_url; > - print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n"; > + print "<tr class=\"metadata_url\"><td>$url_tag</td><td>". > + # rel=vcs-* microformat > + "<a rel=\"vcs-git\" href=\"$git_url\" title=\"$title\">$git_url</a>". 
> + "</td></tr>\n"; > $url_tag = ""; > } Non clickable hyperlink... hmmm... > > -- > 1.5.6.5 > > > > -- > see shy jo -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 22+ messages in thread
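Pulling the review comments on the link-generating helper together, a standalone sketch could look like this. Assumptions are labeled in the comments: `links_header`, `esc_attr` and the code ref standing in for git_get_project_url_list() are hypothetical names, and the attribute escaping is an addition that was not part of the review. Note that Perl's foreach needs parentheses around the list expression, which the quoted hunk (`foreach my $url git_get_project_url_list($project) {`) lacks.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the patch): escape characters that are
# special inside HTML attribute values.
sub esc_attr {
	my ($str) = @_;
	$str =~ s/&/&amp;/g;
	$str =~ s/</&lt;/g;
	$str =~ s/>/&gt;/g;
	$str =~ s/"/&quot;/g;
	return $str;
}

# $url_list_for is a code ref standing in for git_get_project_url_list().
sub links_header {
	my ($url_list_for, @projects) = @_;

	my $ret = '';
	foreach my $project (@projects) {
		# Title as suggested in the review: "$project git repository".
		my $title = esc_attr("$project git repository");
		foreach my $url ($url_list_for->($project)) {
			next unless $url;
			my $href = esc_attr($url);
			$ret .= qq{<link rel="vcs-git" href="$href" title="$title" />\n};
		}
	}
	return $ret;
}

my %urls = (
	'foo.git' => ['git://example.org/foo.git', 'http://example.org/foo.git'],
);
print links_header(sub { @{ $urls{$_[0]} || [] } }, 'foo.git', 'bar.git');
```

Projects without any URL (bar.git here) simply contribute no link elements.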
* Re: [PATCH] gitweb: support the rel=vcs microformat 2009-01-07 19:02 ` Joey Hess 2009-01-07 23:24 ` [PATCH] gitweb: support the rel=vcs-* microformat Joey Hess @ 2009-01-10 0:03 ` Jakub Narebski 1 sibling, 0 replies; 22+ messages in thread From: Jakub Narebski @ 2009-01-10 0:03 UTC (permalink / raw) To: Joey Hess; +Cc: Giuseppe Bilotta, git Joey Hess <joey@kitenet.net> writes: > Joey Hess wrote: > > Another approach would be to just memoize git_get_project_description > > and git_get_project_url_list. > > Especially since git_get_project_description is already called more than > once for some pages. Hmmm... this is an idea worth checking. -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] gitweb: support the rel=vcs microformat 2009-01-07 12:30 ` Giuseppe Bilotta 2009-01-07 15:50 ` Joey Hess @ 2009-01-09 23:56 ` Jakub Narebski 1 sibling, 0 replies; 22+ messages in thread From: Jakub Narebski @ 2009-01-09 23:56 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git Giuseppe Bilotta <giuseppe.bilotta@gmail.com> writes: > On Wednesday 07 January 2009 05:25, Joey Hess wrote: > > > The rel=vcs microformat allows a web page to indicate the locations of > > repositories related to it in a machine-parseable manner. > > (See http://kitenet.net/~joey/rfc/rel-vcs/) > > Interesting idea, I like it. However, I see a problem in the proposed > implementation versus the spec. According to the spec: > > """ > The "title" is optional, but recommended if there are multiple, different > repositories linked to on one page. It is a human-readable description of the > repository. > [...] > If there are multiple repositories listed, without titles, tools > should assume they are different repositories. > """ Good catch. > > In this patch you do NOT add titles to the rel=vcs links, which means that > everything works fine only if there is a single URL for each project. If a > project has different URLs, it's going to appear multiple times as _different_ > projects to a spec-compliant reader. > > A possible solution would be to make @git_url_list into a map keyed by the > project name and having the description and repo URL(s) as values. > > Since there is the possibility of different projects having the same > description (e.g. the default one), the link title could be composed of > "$project - $description" rather than simply $description. > > Note that both in summary and in project list view you already retrieve the > description, so there are no additional disk hits. Wouldn't "$project git repository" (i.e. do not use description at all) be a simpler, faster and also _better_ solution? 
-- 
Jakub Narebski
Poland
ShadeHawk on #git
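As a rough illustration of the "$project git repository" title suggested above, the sketch below gives every URL of one project the same title, which is how the quoted spec text lets a reader tell that the links point at the same repository. The helpers esc_attr() and vcs_link_elements() are hypothetical names for this example, not gitweb functions, and the URLs are placeholders.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Minimal HTML attribute escaping; a hypothetical helper for this
# sketch only (gitweb has its own escaping routines).
sub esc_attr {
	my $str = shift;
	$str =~ s/&/&amp;/g;
	$str =~ s/</&lt;/g;
	$str =~ s/>/&gt;/g;
	$str =~ s/"/&quot;/g;
	return $str;
}

# Build one <link rel="vcs-git"> element per URL, all sharing the
# title "$project git repository".
sub vcs_link_elements {
	my ($project, @urls) = @_;
	my $title = esc_attr("$project git repository");
	my @links;
	foreach my $url (@urls) {
		my $href = esc_attr($url);
		push @links, qq{<link rel="vcs-git" href="$href" title="$title" />\n};
	}
	return @links;
}

print vcs_link_elements('git/git.git',
	'git://example.org/git/git.git',
	'http://example.org/git/git.git');
```

Using the project name alone avoids reading the description file, so this adds no disk access beyond what is needed to obtain the URLs themselves.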
* Re: [PATCH] gitweb: support the rel=vcs microformat
  2009-01-07  4:25 [PATCH] gitweb: support the rel=vcs microformat Joey Hess
  2009-01-07 12:30 ` Giuseppe Bilotta
@ 2009-01-09 23:49 ` Jakub Narebski
  1 sibling, 0 replies; 22+ messages in thread
From: Jakub Narebski @ 2009-01-09 23:49 UTC
To: Joey Hess; +Cc: git

Joey Hess <joey@kitenet.net> writes:

> The rel=vcs microformat allows a web page to indicate the locations of
> repositories related to it in a machine-parseable manner.
> (See http://kitenet.net/~joey/rfc/rel-vcs/)

Let me put here an example from the above-mentioned page:

  <head>
  <link rel="vcs-git" href="git://example.org/foo.git" title="foo git repository" />
  </head>

  <a rel="vcs-git" href="git://example.org/foo.git" title="git repository">git://example.org/foo.git</a>
  <a rel="vcs-git" href="git://example.org/foo.git">git repository</a>

There is one problem that is not solved by the above microformat, but it
is a problem only for git hosting sites like repo.or.cz or GitHub: it
does not allow one to distinguish between a fetch (read) link and a push
(write, publish) link. This is not a problem for standard (unmodified)
gitweb, as it shows only read-only git repository links.

We also have to decide what to put in the 'title' attribute; I think
the simplest would be to put "$project git repository" or something
similar (for example "git/git.git git repository").

One thing I worry about is that those links (or at least some of those
links) are not meant for the browser to open; there is also the
SCP/SSH-like syntax for the SSH protocol, in the form
'user@host:/path/to/repo.git/', which does not follow URL rules.

>
> Make gitweb use the microformat in the header of pages it generates,
> if it has been configured with project url information in any of the usual
> ways.

There are two somewhat separate issues here: marking existing and future
URLs (the current project fetch URLs, which IIRC are not hyperlinked now;
planned/future 'git' links in the project list page; perhaps also links in
OPML and RSS/Atom feeds) with 'rel="vcs-git"', and adding <link .../>
elements to the page header.

>
> Since getting the urls can require hitting disk, I avoided putting the
> microformat on *every* page gitweb generates. Just put it on the project
> summary page, the project list page, and the forks list page.
>
> The first of these already looks up the urls, so adding the microformat was
> free.

I assume that this patch is only about adding <link ... /> elements to
the head? I think in the case of the 'summary' view for a project it is
an excellent idea (similar to having 'prev' and 'next' link elements in
a chaptered on-line book in HTML), and it would allow for automation
using gitweb as a kind of service announcement.

> There is a small overhead in including the microformat on the latter
> two pages [projects list and list of forks], but getting the project
> descriptions for those pages already incurs a similar overhead, and
> the ability to get every repo url in one place seems worthwhile.

There is also OPML, which might be worth checking.

By the way, for the 'projects_list' and 'forks' actions we have to
decide whether to show _all_ links for each project (there can be more
than one), or whether we show only some main git link (as in the case
of the proposed 'git' link). We also have to decide whether we trust
@git_base_url_list, or take it as the default and examine the
per-repository configuration (more costly).

What is more important: the 'project_list' page is already overly large
when hosting a very large number of repositories (there were some
patches adding pagination for 'project_list', and perhaps they will be
resent). Adding <link .../> elements would only add to its size; and if
it is divided into pages, we would also have to take that into account.

>
> This changes git_get_project_description() to not check wantarray, and only
> return in list context -- the only way it is used AFAICS.

Errr... what? Why do you change the git_get_project_description()
subroutine? I don't think it would be a good source for the 'title'
attribute; perhaps for a 'desc' attribute, and only after sanitizing
"Unnamed repository; edit this file to name it for gitweb."

Errata: ah, it is the git_get_project_url_list() subroutine...

>
> Signed-off-by: Joey Hess <joey@gnu.kitenet.net>
> ---
>  gitweb/gitweb.perl |   38 ++++++++++++++++++++++++++------------
>  1 files changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 99f71b4..3f8a228 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -789,6 +789,9 @@ $git_dir = "$projectroot/$project" if $project;
>  our @snapshot_fmts = gitweb_get_feature('snapshot');
>  @snapshot_fmts = filter_snapshot_fmts(@snapshot_fmts);
>
> +# populated later with git urls for the project
> +our @git_url_list;
> +

I'm not sure why this has to be global, but I assume that you want to
avoid recalculating it in git_header_html.

>  # dispatch
>  if (!defined $action) {
>  	if (defined $hash) {
> @@ -2100,17 +2103,22 @@ sub git_show_project_tagcloud {
>  }
>
>  sub git_get_project_url_list {
> +	# use per project git URL list in $projectroot/$path/cloneurl
> +	# or make project git URL from git base URL and project name

I'd rather use a separate subroutine for the second, I think.

>  	my $path = shift;
>
> +	my @ret;
> +
>  	$git_dir = "$projectroot/$path";
> -	open my $fd, "$git_dir/cloneurl"
> -		or return wantarray ?
> -		@{ config_to_multi(git_get_project_config('url')) } :
> -		   config_to_multi(git_get_project_config('url'));
> -	my @git_project_url_list = map { chomp; $_ } <$fd>;
> -	close $fd;
> +	if (open my $fd, "$git_dir/cloneurl") {
> +		@ret = map { chomp; $_ } <$fd>;
> +		close $fd;
> +	}
> +	else {

Style: "} else {"

> +		@ret = @{ config_to_multi(git_get_project_config('url')) };
> +	}
>
> -	return wantarray ? @git_project_url_list : \@git_project_url_list;
> +	return @ret ? @ret : map { "$_/$project" } @git_base_url_list;
>  }

Hmmm... currently gitweb does it at the caller:

	my @url_list = git_get_project_url_list($project);
	@url_list = map { "$_/$project" } @git_base_url_list unless @url_list;

Why do you want to put this in git_get_project_url_list()? Please
explain (here and in the commit message too; the commit message has to
mention that you change the semantics a bit, and explain why you did so).

>
>  sub git_get_projects_list {
> @@ -2953,6 +2961,10 @@ EOF

Sidenote: this should be

  @@ -2953,6 +2961,10 @@ sub git_header_html {

but I'm not sure if it would be possible to automate...

>  		print qq(<link rel="shortcut icon" href="$favicon" type="image/png" />\n);
>  	}
>
> +	foreach my $url (@git_url_list) {
> +		print qq{<link rel="vcs" type="git" href="$url" />\n};
> +	}
> +

Errr... in the mentioned http://kitenet.net/~joey/rel-vcs/ it is

  <link rel="vcs-git" href="$url" title="$project git repository" />

and not

  <link rel="vcs" type="git" href="$url" />

Besides, the 'type' attribute for A and LINK elements is about the
advisory content type of the document pointed to by the link:

  type = content-type [CI]

    This attribute gives an advisory hint as to the content type of
    the content available at the link target address. It allows user
    agents to opt to use a fallback mechanism rather than fetch the
    content if they are advised that they will get content in a
    content type they do not support.
    Authors who use this attribute take responsibility to manage the
    risk that it may become inconsistent with the content available at
    the link target address.

    For the current list of registered content types, please consult
    [MIMETYPES].

>  	print "</head>\n" .
>  	      "<body>\n";
>
> @@ -4380,6 +4392,8 @@ sub git_project_list {
>  		die_error(404, "No projects found");
>  	}
>
> +	@git_url_list = map { git_get_project_url_list($_->{path}) } @list;
> +
>  	git_header_html();
>  	if (-f $home_text) {
>  		print "<div class=\"index_include\">\n";
> @@ -4400,6 +4414,8 @@ sub git_forks {
>  	if (defined $order && $order !~ m/none|project|descr|owner|age/) {
>  		die_error(400, "Unknown order parameter");
>  	}
> +
> +	@git_url_list = map { git_get_project_url_list($_->{path}) } @list;
>
>  	my @list = git_get_projects_list($project);
>  	if (!@list) {

Those two are pretty straightforward, but please note that the
'project_list' view (action) might _already_ be too large...

> @@ -4457,6 +4473,8 @@ sub git_summary {
>  		@forklist = git_get_projects_list($project);
>  	}
>
> +	@git_url_list = git_get_project_url_list($project);
> +
>  	git_header_html();
>  	git_print_page_nav('summary','', $head);
>
> @@ -4468,12 +4486,8 @@ sub git_summary {
>  		print "<tr id=\"metadata_lchange\"><td>last change</td><td>$cd{'rfc2822'}</td></tr>\n";
>  	}
>
> -	# use per project git URL list in $projectroot/$project/cloneurl
> -	# or make project git URL from git base URL and project name
>  	my $url_tag = "URL";
> -	my @url_list = git_get_project_url_list($project);
> -	@url_list = map { "$_/$project" } @git_base_url_list unless @url_list;
> -	foreach my $git_url (@url_list) {
> +	foreach my $git_url (@git_url_list) {
>  		next unless $git_url;
>  		print "<tr class=\"metadata_url\"><td>$url_tag</td><td>$git_url</td></tr>\n";
>  		$url_tag = "";
> --
> 1.5.6.5

This is also pretty straightforward: it moves the calculation earlier so
that the results can be shared with git_header_html (and uses a global
variable).
-- 
Jakub Narebski
Poland
ShadeHawk on #git
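One way to read the suggestion above about using a separate subroutine for the base-URL fallback is sketched below. This is only an assumption about what such a split might look like: project_default_url_list() and project_url_list_with_fallback() are hypothetical names, and the contents of @git_base_url_list are placeholder values rather than real configuration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed configuration, mirroring gitweb's @git_base_url_list;
# the values are placeholders for this sketch.
our @git_base_url_list = ('git://example.org', 'http://example.org/git');

# Hypothetical helper: default URLs built from each base URL and the
# project name.
sub project_default_url_list {
	my $project = shift;
	return map { "$_/$project" } @git_base_url_list;
}

# Hypothetical helper: per-project URLs take precedence; the defaults
# are used only when no per-project URL was found. Meant to be called
# in list context.
sub project_url_list_with_fallback {
	my ($project, @per_project_urls) = @_;
	return @per_project_urls
		? @per_project_urls
		: project_default_url_list($project);
}

# No per-project URLs: fall back to the base URL list.
print "$_\n" for project_url_list_with_fallback('foo.git');

# Per-project URL present: it is returned unchanged.
print "$_\n" for project_url_list_with_fallback('foo.git',
	'git://mirror.example.org/foo.git');
```

Keeping the fallback in its own subroutine leaves the per-project lookup with its original meaning, so callers that want only explicitly configured URLs could still get them.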
end of thread, other threads:[~2009-01-10  1:46 UTC | newest]

Thread overview: 22+ messages
2009-01-07  4:25 [PATCH] gitweb: support the rel=vcs microformat Joey Hess
2009-01-07 12:30 ` Giuseppe Bilotta
2009-01-07 15:50 ` Joey Hess
2009-01-07 18:03 ` Giuseppe Bilotta
2009-01-07 18:41 ` Joey Hess
2009-01-10  0:01 ` Jakub Narebski
2009-01-07 18:45 ` Joey Hess
2009-01-07 19:02 ` Joey Hess
2009-01-07 23:24 ` [PATCH] gitweb: support the rel=vcs-* microformat Joey Hess
2009-01-08  7:56 ` Giuseppe Bilotta
2009-01-08 19:54 ` gitweb index performance (Re: [PATCH] gitweb: support the rel=vcs-* microformat) Joey Hess
2009-01-08 23:53 ` J.H.
2009-01-09  0:16 ` Miklos Vajna
2009-01-09  0:19 ` Johannes Schindelin
2009-01-09  0:26 ` J.H.
2009-01-10  1:44 ` Jakub Narebski
2009-01-10  1:11 ` Jakub Narebski
2009-01-10  1:04 ` [PATCH] gitweb: support the rel=vcs-* microformat Jakub Narebski
2009-01-10  0:52 ` Jakub Narebski
2009-01-10  0:03 ` [PATCH] gitweb: support the rel=vcs microformat Jakub Narebski
2009-01-09 23:56 ` Jakub Narebski
2009-01-09 23:49 ` Jakub Narebski