* [PATCHv4] gitweb: PATH_INFO support improvements @ 2008-10-02 0:10 Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Giuseppe Bilotta 2008-10-02 8:19 ` [PATCHv4] gitweb: PATH_INFO support improvements Jakub Narebski 0 siblings, 2 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta Fourth version of my gitweb PATH_INFO patchset, whose purpose is to reduce the use of CGI parameters by embedding as many parameters as possible in the URL path itself, provided the pathinfo feature is enabled. The new typical gitweb URL is therefore in the form $project/$action/$parent:$file..$hash:$file (with useless parts stripped). Backwards compatibility for old-style $project/$hash URLs is kept, as long as $hash is not a refname whose name happens to match a git action. The main implementation is provided by paired patches (#1#3, #5#6) that implement parsing and generation of the new style URLs. Patch #2 deals with a refactoring of the input parameters parsing and validation, so that the rest of gitweb can be agnostic wrt to the parameters' origin (CGI vs PATH_INFO vs possible other future inputs such as CLI). Patch #4 is a minor improvement to the URL syntax that allows web documents to be properly browsable in raw mode. Giuseppe Bilotta (6): gitweb: parse project/action/hash_base:filename PATH_INFO gitweb: refactor input parameters parse/validation gitweb: generate project/action/hash URLs gitweb: use_pathinfo filenames start with / gitweb: parse parent..current syntax from pathinfo gitweb: generate parent..current URLs gitweb/gitweb.perl | 392 ++++++++++++++++++++++++++++++++++------------------ 1 files changed, 254 insertions(+), 138 deletions(-) ^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 0:10 [PATCHv4] gitweb: PATH_INFO support improvements Giuseppe Bilotta @ 2008-10-02 0:10 ` Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: refactor input parameters parse/validation Giuseppe Bilotta ` (2 more replies) 2008-10-02 8:19 ` [PATCHv4] gitweb: PATH_INFO support improvements Jakub Narebski 1 sibling, 3 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta This patch enables gitweb to parse URLs with more information embedded in PATH_INFO, reducing the need for CGI parameters. The typical gitweb path is now $project/$action/$hash_base:$file_name or $project/$action/$hash This is mostly backwards compatible with the old-style gitweb paths, except when $project/$branch was used to access a branch whose name matches a gitweb action. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> --- gitweb/gitweb.perl | 90 +++++++++++++++++++++++++++++++--------------------- 1 files changed, 54 insertions(+), 36 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index e7e4d6b..f088681 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -495,6 +495,37 @@ if (defined $searchtext) { $search_regexp = $search_use_regexp ? $searchtext : quotemeta $searchtext; } +# dispatch +my %actions = ( + "blame" => \&git_blame, + "blobdiff" => \&git_blobdiff, + "blobdiff_plain" => \&git_blobdiff_plain, + "blob" => \&git_blob, + "blob_plain" => \&git_blob_plain, + "commitdiff" => \&git_commitdiff, + "commitdiff_plain" => \&git_commitdiff_plain, + "commit" => \&git_commit, + "forks" => \&git_forks, + "heads" => \&git_heads, + "history" => \&git_history, + "log" => \&git_log, + "rss" => \&git_rss, + "atom" => \&git_atom, + "search" => \&git_search, + "search_help" => \&git_search_help, + "shortlog" => \&git_shortlog, + "summary" => \&git_summary, + "tag" => \&git_tag, + "tags" => \&git_tags, + "tree" => \&git_tree, + "snapshot" => \&git_snapshot, + "object" => \&git_object, + # those below don't need $project + "opml" => \&git_opml, + "project_list" => \&git_project_list, + "project_index" => \&git_project_index, +); + # now read PATH_INFO and use it as alternative to parameters sub evaluate_path_info { return if defined $project; @@ -519,9 +550,19 @@ sub evaluate_path_info { # do not change any parameters if an action is given using the query string return if $action; $path_info =~ s,^\Q$project\E/*,,; + + # next comes the action + $action = $path_info; + $action =~ s,/.*$,,; + if (exists $actions{$action}) { + $path_info =~ s,^$action/*,,; + } else { + $action = undef; + } + my ($refname, $pathname) = split(/:/, $path_info, 2); if (defined $pathname) { - # we got "project.git/branch:filename" or "project.git/branch:dir/" + # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" # we could use git_get_type(branch:pathname), but it needs $git_dir $pathname =~ s,^/+,,; if (!$pathname || substr($pathname, -1) eq "/") { @@ -534,8 +575,9 @@ sub evaluate_path_info { $file_name ||= validate_pathname($pathname); } elsif (defined $refname) { # we got "project.git/branch" - $action ||= "shortlog"; - $hash ||= validate_refname($refname); + $action ||= "shortlog"; + $hash ||= validate_refname($refname); + $hash_base ||= validate_refname($refname); } } evaluate_path_info(); @@ -544,37 +586,6 @@ evaluate_path_info(); our $git_dir; $git_dir = "$projectroot/$project" if $project; -# dispatch -my %actions = ( - "blame" => \&git_blame, - "blobdiff" => \&git_blobdiff, - "blobdiff_plain" => \&git_blobdiff_plain, - "blob" => \&git_blob, - "blob_plain" => \&git_blob_plain, - "commitdiff" => \&git_commitdiff, - "commitdiff_plain" => \&git_commitdiff_plain, - "commit" => \&git_commit, - "forks" => \&git_forks, - "heads" => \&git_heads, - "history" => \&git_history, - "log" => \&git_log, - "rss" => \&git_rss, - "atom" => \&git_atom, - "search" => \&git_search, - "search_help" => \&git_search_help, - "shortlog" => \&git_shortlog, - "summary" => \&git_summary, - "tag" => \&git_tag, - "tags" => \&git_tags, - "tree" => \&git_tree, - "snapshot" => \&git_snapshot, - "object" => \&git_object, - # those below don't need $project - "opml" => \&git_opml, - "project_list" => \&git_project_list, - "project_index" => \&git_project_index, -); - if (!defined $action) { if (defined $hash) { $action = git_get_type($hash); @@ -631,8 +642,15 @@ sub href (%) { if ($params{-replay}) { while (my ($name, $symbol) = each %mapping) { if (!exists $params{$name}) { - # to allow for multivalued params we use arrayref form - $params{$name} = [ $cgi->param($symbol) ]; + # the parameter we want to recycle may be either part of the + # list of CGI parameter, or recovered from PATH_INFO + if ($cgi->param($symbol)) { + # to allow for multivalued params we use arrayref form + $params{$name} = [ $cgi->param($symbol) ]; + } else { + no strict 'refs'; + $params{$name} = $$name if $$name; + } } } } -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv4] gitweb: refactor input parameters parse/validation 2008-10-02 0:10 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Giuseppe Bilotta @ 2008-10-02 0:10 ` Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: generate project/action/hash URLs Giuseppe Bilotta 2008-10-03 1:36 ` [PATCHv4] gitweb: refactor input parameters parse/validation Jakub Narebski 2008-10-02 8:59 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Jakub Narebski 2008-10-02 15:34 ` Petr Baudis 2 siblings, 2 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta Since input parameters can now be obtained both from CGI parameters and PATH_INFO, we would like most of the code to be agnostic about the way parameters were retrieved. We thus collect all the parameters into the new %input_params hash, removing the need for ad-hoc kludgy code in href(). As much of the parameters validation code is now shared between CGI and PATH_INFO, although this requires PATH_INFO parsing to be interleaved into the main code instead of being factored out into its own routine. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> --- gitweb/gitweb.perl | 336 ++++++++++++++++++++++++++++------------------------ 1 files changed, 183 insertions(+), 153 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index f088681..ec4326f 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -28,8 +28,10 @@ our $my_url = $cgi->url(); our $my_uri = $cgi->url(-absolute => 1); # if we're called with PATH_INFO, we have to strip that -# from the URL to find our real URL -if (my $path_info = $ENV{"PATH_INFO"}) { +# from the URL to find our real URL. PATH_INFO is kept +# as it's used later on for parameter extraction +my $path_info = $ENV{"PATH_INFO"}; +if ($path_info) { $my_url =~ s,\Q$path_info\E$,,; $my_uri =~ s,\Q$path_info\E$,,; } @@ -390,15 +392,111 @@ $projects_list ||= $projectroot; # ====================================================================== # input validation and dispatch -our $action = $cgi->param('a'); -if (defined $action) { - if ($action =~ m/[^0-9a-zA-Z\.\-_]/) { - die_error(400, "Invalid action parameter"); + +# input parameters can be collected from a variety of sources (presently, CGI +# and PATH_INFO), so we define an %input_params hash that collects them all +# together during validation: this allows subsequent uses (e.g. href()) to be +# agnostic of the parameter origin + +my %input_params = (); + +# input parameters are stored with the long parameter name as key, so we need +# the name -> CGI key mapping here. This will also be used in the href +# subroutine to convert parameters to their CGI equivalent. +# +# XXX: Warning: If you touch this, check the search form for updating, +# too. + +my @cgi_param_mapping = ( + project => "p", + action => "a", + file_name => "f", + file_parent => "fp", + hash => "h", + hash_parent => "hp", + hash_base => "hb", + hash_parent_base => "hpb", + page => "pg", + order => "o", + searchtext => "s", + searchtype => "st", + snapshot_format => "sf", + extra_options => "opt", + search_use_regexp => "sr", +); +my %cgi_param_mapping = @cgi_param_mapping; + +# we will also need to know the possible actions, for validation +my %actions = ( + "blame" => \&git_blame, + "blobdiff" => \&git_blobdiff, + "blobdiff_plain" => \&git_blobdiff_plain, + "blob" => \&git_blob, + "blob_plain" => \&git_blob_plain, + "commitdiff" => \&git_commitdiff, + "commitdiff_plain" => \&git_commitdiff_plain, + "commit" => \&git_commit, + "forks" => \&git_forks, + "heads" => \&git_heads, + "history" => \&git_history, + "log" => \&git_log, + "rss" => \&git_rss, + "atom" => \&git_atom, + "search" => \&git_search, + "search_help" => \&git_search_help, + "shortlog" => \&git_shortlog, + "summary" => \&git_summary, + "tag" => \&git_tag, + "tags" => \&git_tags, + "tree" => \&git_tree, + "snapshot" => \&git_snapshot, + "object" => \&git_object, + # those below don't need $project + "opml" => \&git_opml, + "project_list" => \&git_project_list, + "project_index" => \&git_project_index, +); + +# fill %input_params with the CGI parameters. All values except for 'opt' +# should be single values, but opt can be an array. We should probably +# build an array of parameters that can be multi-valued, but since for the time +# being it's only this one, we just single it out +while (my ($name, $symbol) = each %cgi_param_mapping) { + if ($symbol eq 'opt') { + $input_params{$name} = [$cgi->param($symbol)]; + } else { + $input_params{$name} = $cgi->param($symbol); } } -# parameters which are pathnames -our $project = $cgi->param('p'); +# next, we want to parse PATH_INFO (which was already stored in $path_info at +# the beginning). This is a little hairy because PATH_INFO parsing needs some +# form of parameter validation, so we interleave parsing and validation. +# +# The accepted PATH_INFO syntax is $project/$action/$hash or +# $project/$action/$hash_base:$file_name, where $action may be missing (mostly for +# backwards compatibility), so we need to parse and validate the parameters in +# this same order. + +# clear $path_info of any leading / +$path_info =~ s,^/+,,; + +our $project = $input_params{'project'}; +if ($path_info && !defined $project) { + # if $project was not defined by CGI, we try to extract it from + # $path_info + $project = $path_info; + $project =~ s,/+$,,; + while ($project && !check_head_link("$projectroot/$project")) { + $project =~ s,/*[^/]*$,,; + } + $input_params{'project'} = $project; +} else { + # otherwise, we suppress $path_info parsing altogether + $path_info = undef; +} + +# project name validation if (defined $project) { if (!validate_pathname($project) || !(-d "$projectroot/$project") || @@ -408,16 +506,66 @@ if (defined $project) { undef $project; die_error(404, "No such project"); } + + # we purge the $project name from the $path_info, preparing it for + # subsequent parameters extraction + $path_info =~ s,^\Q$project\E/*,, if defined $path_info; +} else { + # we also suppress $path_info parsing if no project was defined + $path_info = undef; +} + +# action parsing/validation +our $action = $input_params{'action'}; +if ($path_info && !defined $action) { + # next comes the action + $action = $path_info; + $action =~ s,/.*$,,; + if (exists $actions{$action}) { + # remove the action from $path_info, and sync %input_params + $path_info =~ s,^$action/*,,; + $input_params{'action'} = $action; + } else { + $action = undef; + } +} elsif (defined $action && !exists $actions{$action}) { + die_error(400, "Invalid action parameter"); +} + +# we can now parse ref and pathnames in PATH_INFO +if ($path_info) { + my ($refname, $pathname) = split(/:/, $path_info, 2); + if (defined $pathname) { + # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" + # we could use git_get_type(branch:pathname), but it needs $git_dir + $pathname =~ s,^/+,,; + if (!$pathname || substr($pathname, -1) eq "/") { + $input_params{'action'} ||= "tree"; + $pathname =~ s,/$,,; + } else { + $input_params{'action'} ||= "blob_plain"; + } + $input_params{'hash_base'} ||= $refname; + $input_params{'file_name'} ||= $pathname; + } elsif (defined $refname) { + # we got "project.git/branch" + $input_params{'action'} ||= "shortlog"; + $input_params{'hash'} ||= $refname; + $input_params{'hash_base'} ||= $refname; + } } -our $file_name = $cgi->param('f'); +# and now the rest of the validation + +# parameters which are pathnames +our $file_name = $input_params{'file_name'}; if (defined $file_name) { if (!validate_pathname($file_name)) { die_error(400, "Invalid file parameter"); } } -our $file_parent = $cgi->param('fp'); +our $file_parent = $input_params{'file_parent'}; if (defined $file_parent) { if (!validate_pathname($file_parent)) { die_error(400, "Invalid file parent parameter"); @@ -425,21 +573,21 @@ if (defined $file_parent) { } # parameters which are refnames -our $hash = $cgi->param('h'); +our $hash = $input_params{'hash'}; if (defined $hash) { if (!validate_refname($hash)) { die_error(400, "Invalid hash parameter"); } } -our $hash_parent = $cgi->param('hp'); +our $hash_parent = $input_params{'hash_parent'}; if (defined $hash_parent) { if (!validate_refname($hash_parent)) { die_error(400, "Invalid hash parent parameter"); } } -our $hash_base = $cgi->param('hb'); +our $hash_base = $input_params{'hash_base'}; if (defined $hash_base) { if (!validate_refname($hash_base)) { die_error(400, "Invalid hash base parameter"); @@ -450,19 +598,19 @@ my %allowed_options = ( "--no-merges" => [ qw(rss atom log shortlog history) ], ); -our @extra_options = $cgi->param('opt'); -if (defined @extra_options) { - foreach my $opt (@extra_options) { - if (not exists $allowed_options{$opt}) { - die_error(400, "Invalid option parameter"); - } - if (not grep(/^$action$/, @{$allowed_options{$opt}})) { - die_error(400, "Invalid option parameter for this action"); - } +our @extra_options = @{$input_params{'extra_options'}}; +# @extra_options is always defined, since it can only be (currently) set from +# CGI, and $cgi->param() returns the empty array in array context +foreach my $opt (@extra_options) { + if (not exists $allowed_options{$opt}) { + die_error(400, "Invalid option parameter"); + } + if (not grep(/^$action$/, @{$allowed_options{$opt}})) { + die_error(400, "Invalid option parameter for this action"); } } -our $hash_parent_base = $cgi->param('hpb'); +our $hash_parent_base = $input_params{'hash_parent_base'}; if (defined $hash_parent_base) { if (!validate_refname($hash_parent_base)) { die_error(400, "Invalid hash parent base parameter"); @@ -470,23 +618,23 @@ if (defined $hash_parent_base) { } # other parameters -our $page = $cgi->param('pg'); +our $page = $input_params{'page'}; if (defined $page) { if ($page =~ m/[^0-9]/) { die_error(400, "Invalid page parameter"); } } -our $searchtype = $cgi->param('st'); +our $searchtype = $input_params{'searchtype'}; if (defined $searchtype) { if ($searchtype =~ m/[^a-z]/) { die_error(400, "Invalid searchtype parameter"); } } -our $search_use_regexp = $cgi->param('sr'); +our $search_use_regexp = $input_params{'search_use_regexp'}; -our $searchtext = $cgi->param('s'); +our $searchtext = $input_params{'searchtext'}; our $search_regexp; if (defined $searchtext) { if (length($searchtext) < 2) { @@ -495,93 +643,6 @@ if (defined $searchtext) { $search_regexp = $search_use_regexp ? $searchtext : quotemeta $searchtext; } -# dispatch -my %actions = ( - "blame" => \&git_blame, - "blobdiff" => \&git_blobdiff, - "blobdiff_plain" => \&git_blobdiff_plain, - "blob" => \&git_blob, - "blob_plain" => \&git_blob_plain, - "commitdiff" => \&git_commitdiff, - "commitdiff_plain" => \&git_commitdiff_plain, - "commit" => \&git_commit, - "forks" => \&git_forks, - "heads" => \&git_heads, - "history" => \&git_history, - "log" => \&git_log, - "rss" => \&git_rss, - "atom" => \&git_atom, - "search" => \&git_search, - "search_help" => \&git_search_help, - "shortlog" => \&git_shortlog, - "summary" => \&git_summary, - "tag" => \&git_tag, - "tags" => \&git_tags, - "tree" => \&git_tree, - "snapshot" => \&git_snapshot, - "object" => \&git_object, - # those below don't need $project - "opml" => \&git_opml, - "project_list" => \&git_project_list, - "project_index" => \&git_project_index, -); - -# now read PATH_INFO and use it as alternative to parameters -sub evaluate_path_info { - return if defined $project; - my $path_info = $ENV{"PATH_INFO"}; - return if !$path_info; - $path_info =~ s,^/+,,; - return if !$path_info; - # find which part of PATH_INFO is project - $project = $path_info; - $project =~ s,/+$,,; - while ($project && !check_head_link("$projectroot/$project")) { - $project =~ s,/*[^/]*$,,; - } - # validate project - $project = validate_pathname($project); - if (!$project || - ($export_ok && !-e "$projectroot/$project/$export_ok") || - ($strict_export && !project_in_list($project))) { - undef $project; - return; - } - # do not change any parameters if an action is given using the query string - return if $action; - $path_info =~ s,^\Q$project\E/*,,; - - # next comes the action - $action = $path_info; - $action =~ s,/.*$,,; - if (exists $actions{$action}) { - $path_info =~ s,^$action/*,,; - } else { - $action = undef; - } - - my ($refname, $pathname) = split(/:/, $path_info, 2); - if (defined $pathname) { - # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" - # we could use git_get_type(branch:pathname), but it needs $git_dir - $pathname =~ s,^/+,,; - if (!$pathname || substr($pathname, -1) eq "/") { - $action ||= "tree"; - $pathname =~ s,/$,,; - } else { - $action ||= "blob_plain"; - } - $hash_base ||= validate_refname($refname); - $file_name ||= validate_pathname($pathname); - } elsif (defined $refname) { - # we got "project.git/branch" - $action ||= "shortlog"; - $hash ||= validate_refname($refname); - $hash_base ||= validate_refname($refname); - } -} -evaluate_path_info(); - # path to the current git repository our $git_dir; $git_dir = "$projectroot/$project" if $project; @@ -615,43 +676,12 @@ sub href (%) { # default is to use -absolute url() i.e. $my_uri my $href = $params{-full} ? $my_url : $my_uri; - # XXX: Warning: If you touch this, check the search form for updating, - # too. - - my @mapping = ( - project => "p", - action => "a", - file_name => "f", - file_parent => "fp", - hash => "h", - hash_parent => "hp", - hash_base => "hb", - hash_parent_base => "hpb", - page => "pg", - order => "o", - searchtext => "s", - searchtype => "st", - snapshot_format => "sf", - extra_options => "opt", - search_use_regexp => "sr", - ); - my %mapping = @mapping; - $params{'project'} = $project unless exists $params{'project'}; if ($params{-replay}) { - while (my ($name, $symbol) = each %mapping) { - if (!exists $params{$name}) { - # the parameter we want to recycle may be either part of the - # list of CGI parameter, or recovered from PATH_INFO - if ($cgi->param($symbol)) { - # to allow for multivalued params we use arrayref form - $params{$name} = [ $cgi->param($symbol) ]; - } else { - no strict 'refs'; - $params{$name} = $$name if $$name; - } - } + while (my ($name, $val) = each %input_params) { + $params{$name} = $input_params{$name} + unless (exists $params{$name}); } } @@ -669,8 +699,8 @@ sub href (%) { # now encode the parameters explicitly my @result = (); - for (my $i = 0; $i < @mapping; $i += 2) { - my ($name, $symbol) = ($mapping[$i], $mapping[$i+1]); + for (my $i = 0; $i < @cgi_param_mapping; $i += 2) { + my ($name, $symbol) = ($cgi_param_mapping[$i], $cgi_param_mapping[$i+1]); if (defined $params{$name}) { if (ref($params{$name}) eq "ARRAY") { foreach my $par (@{$params{$name}}) { @@ -4006,7 +4036,7 @@ sub git_search_grep_body { ## actions sub git_project_list { - my $order = $cgi->param('o'); + my $order = $input_params{'order'}; if (defined $order && $order !~ m/none|project|descr|owner|age/) { die_error(400, "Unknown order parameter"); } @@ -4029,7 +4059,7 @@ sub git_project_list { } sub git_forks { - my $order = $cgi->param('o'); + my $order = $input_params{'order'}; if (defined $order && $order !~ m/none|project|descr|owner|age/) { die_error(400, "Unknown order parameter"); } @@ -4562,7 +4592,7 @@ sub git_snapshot { my @supported_fmts = gitweb_check_feature('snapshot'); @supported_fmts = filter_snapshot_fmts(@supported_fmts); - my $format = $cgi->param('sf'); + my $format = $input_params{'snapshot_format'}; if (!@supported_fmts) { die_error(403, "Snapshots not allowed"); } -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv4] gitweb: generate project/action/hash URLs 2008-10-02 0:10 ` [PATCHv4] gitweb: refactor input parameters parse/validation Giuseppe Bilotta @ 2008-10-02 0:10 ` Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Giuseppe Bilotta ` (2 more replies) 2008-10-03 1:36 ` [PATCHv4] gitweb: refactor input parameters parse/validation Jakub Narebski 1 sibling, 3 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta When generating path info URLs, reduce the number of CGI parameters by embedding action and hash_parent:filename or hash in the path. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> --- gitweb/gitweb.perl | 32 +++++++++++++++++++++++++++++--- 1 files changed, 29 insertions(+), 3 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index ec4326f..2c380ac 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -687,14 +687,40 @@ sub href (%) { my ($use_pathinfo) = gitweb_check_feature('pathinfo'); if ($use_pathinfo) { - # use PATH_INFO for project name + # try to put as many parameters as possible in PATH_INFO: + # - project name + # - action + # - hash or hash_base:filename + + # Strip any trailing / from $href, or we might get double + # slashes when the script is the DirectoryIndex + # + $href =~ s,/$,,; + + # Then add the project name, if present $href .= "/".esc_url($params{'project'}) if defined $params{'project'}; delete $params{'project'}; - # Summary just uses the project path URL - if (defined $params{'action'} && $params{'action'} eq 'summary') { + # Summary just uses the project path URL, any other action is + # added to the URL + if (defined $params{'action'}) { + $href .= "/".esc_url($params{'action'}) unless $params{'action'} eq 'summary'; delete $params{'action'}; } + + # Finally, we put either hash_base:file_name or hash + if (defined $params{'hash_base'}) { + $href .= "/".esc_url($params{'hash_base'}); + if (defined $params{'file_name'}) { + $href .= ":".esc_url($params{'file_name'}); + delete $params{'file_name'}; + } + delete $params{'hash'}; + delete $params{'hash_base'}; + } elsif (defined $params{'hash'}) { + $href .= "/".esc_url($params{'hash'}); + delete $params{'hash'}; + } } # now encode the parameters explicitly -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv4] gitweb: use_pathinfo filenames start with / 2008-10-02 0:10 ` [PATCHv4] gitweb: generate project/action/hash URLs Giuseppe Bilotta @ 2008-10-02 0:10 ` Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Giuseppe Bilotta 2008-10-03 11:28 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Jakub Narebski 2008-10-03 1:48 ` [PATCHv4] gitweb: generate project/action/hash URLs Jakub Narebski 2008-10-04 1:15 ` Jakub Narebski 2 siblings, 2 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta When using path info, make filenames start with a / (right after the : that separates them from the hash base). This minimal change allows relative navigation to work properly when viewing HTML files. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> --- gitweb/gitweb.perl | 6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 2c380ac..3e5b2b7 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -690,7 +690,7 @@ sub href (%) { # try to put as many parameters as possible in PATH_INFO: # - project name # - action - # - hash or hash_base:filename + # - hash or hash_base:/filename # Strip any trailing / from $href, or we might get double # slashes when the script is the DirectoryIndex @@ -708,11 +708,11 @@ sub href (%) { delete $params{'action'}; } - # Finally, we put either hash_base:file_name or hash + # Finally, we put either hash_base:/file_name or hash if (defined $params{'hash_base'}) { $href .= "/".esc_url($params{'hash_base'}); if (defined $params{'file_name'}) { - $href .= ":".esc_url($params{'file_name'}); + $href .= ":/".esc_url($params{'file_name'}); delete $params{'file_name'}; } delete $params{'hash'}; -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv4] gitweb: parse parent..current syntax from pathinfo 2008-10-02 0:10 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Giuseppe Bilotta @ 2008-10-02 0:10 ` Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: generate parent..current URLs Giuseppe Bilotta 2008-10-04 1:31 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Jakub Narebski 2008-10-03 11:28 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Jakub Narebski 1 sibling, 2 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta This makes it possible to use an URL such as $project/somebranch..otherbranch:/filename to get a diff between different version of a file. Paths like $project/$action/somebranch:/somefile..otherbranch:/otherfile are parsed as well. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> --- gitweb/gitweb.perl | 26 ++++++++++++++++++++++++-- 1 files changed, 24 insertions(+), 2 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 3e5b2b7..89e360f 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -534,7 +534,9 @@ if ($path_info && !defined $action) { # we can now parse ref and pathnames in PATH_INFO if ($path_info) { - my ($refname, $pathname) = split(/:/, $path_info, 2); + $path_info =~ /^((.+?)(:(.+))?\.\.)?(.+?)(:(.+))?$/; + my ($parentrefname, $parentpathname, $refname, $pathname) = ( + $2, $4, $5, $7); if (defined $pathname) { # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" # we could use git_get_type(branch:pathname), but it needs $git_dir @@ -543,7 +545,11 @@ if ($path_info) { $input_params{'action'} ||= "tree"; $pathname =~ s,/$,,; } else { - $input_params{'action'} ||= "blob_plain"; + if ($parentrefname) { + $input_params{'action'} ||= "blobdiff_plain"; + } else { + $input_params{'action'} ||= "blob_plain"; + } } $input_params{'hash_base'} ||= $refname; $input_params{'file_name'} ||= $pathname; @@ -553,6 +559,22 @@ if ($path_info) { $input_params{'hash'} ||= $refname; $input_params{'hash_base'} ||= $refname; } + # the parent part might be missing the pathname, in which case we use the $file_name, if present + if (defined $parentrefname) { + $input_params{'hash_parent_base'} ||= $parentrefname; + if ($parentpathname) { + $parentpathname =~ s,^/+,,; + $parentpathname =~ s,/$,,; + $input_params{'file_parent'} ||= $parentpathname; + } else { + $input_params{'file_parent'} ||= $input_params{'file_name'}; + } + if (defined $input_params{'file_parent'}) { + $input_params{'hash_parent'} ||= git_get_hash_by_path($input_params{'hash_parent_base'}, $input_params{'file_parent'}); + } else { + $input_params{'hash_parent'} ||= $parentrefname; + } + } } # and now the rest of the validation -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCHv4] gitweb: generate parent..current URLs 2008-10-02 0:10 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Giuseppe Bilotta @ 2008-10-02 0:10 ` Giuseppe Bilotta 2008-10-06 0:17 ` Jakub Narebski 2008-10-04 1:31 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Jakub Narebski 1 sibling, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 0:10 UTC (permalink / raw) To: git Cc: Jakub Narebski, Petr Baudis, Junio C Hamano, Shawn O. Pearce, Giuseppe Bilotta If use_pathinfo is enabled, href now creates links that contain paths in the form $project/$action/oldhash:/oldname..newhash:/newname for actions that use hash_parent etc. If any of the filename contains two consecutive dots, it's kept as a CGI parameter since the resulting path would otherwise be ambiguous. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> --- gitweb/gitweb.perl | 30 +++++++++++++++++++++++++----- 1 files changed, 25 insertions(+), 5 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index 89e360f..d863ef7 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -712,7 +712,8 @@ sub href (%) { # try to put as many parameters as possible in PATH_INFO: # - project name # - action - # - hash or hash_base:/filename + # - hash_parent or hash_parent_base:/file_parent + # - hash or hash_base:/file_name # Strip any trailing / from $href, or we might get double # slashes when the script is the DirectoryIndex @@ -730,17 +731,36 @@ sub href (%) { delete $params{'action'}; } - # Finally, we put either hash_base:/file_name or hash + # Next, we put hash_parent_base:/file_parent..hash_base:/file_name, + # stripping nonexistent or useless pieces + $href .= "/" if ($params{'hash_base'} || $params{'hash_parent_base'} + || $params{'hash_parent'} || $params{'hash'}); if (defined $params{'hash_base'}) { - $href .= "/".esc_url($params{'hash_base'}); - if (defined $params{'file_name'}) { + if (defined $params{'hash_parent_base'}) { + $href .= esc_url($params{'hash_parent_base'}); + # skip the file_parent if it's the same as the file_name + delete $params{'file_parent'} if $params{'file_parent'} eq $params{'file_name'}; + if (defined $params{'file_parent'} && $params{'file_parent'} !~ /\.\./) { + $href .= ":/".esc_url($params{'file_parent'}); + delete $params{'file_parent'}; + } + $href .= ".."; + delete $params{'hash_parent'}; + delete $params{'hash_parent_base'}; + } elsif (defined $params{'hash_parent'}) { + $href .= esc_url($params{'hash_parent'}). ".."; + delete $params{'hash_parent'}; + } + + $href .= esc_url($params{'hash_base'}); + if (defined $params{'file_name'} && $params{'file_name'} !~ /\.\./) { $href .= ":/".esc_url($params{'file_name'}); delete $params{'file_name'}; } delete $params{'hash'}; delete $params{'hash_base'}; } elsif (defined $params{'hash'}) { - $href .= "/".esc_url($params{'hash'}); + $href .= esc_url($params{'hash'}); delete $params{'hash'}; } } -- 1.5.6.5 ^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: generate parent..current URLs 2008-10-02 0:10 ` [PATCHv4] gitweb: generate parent..current URLs Giuseppe Bilotta @ 2008-10-06 0:17 ` Jakub Narebski 0 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-06 0:17 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > If use_pathinfo is enabled, href now creates links that contain paths in > the form $project/$action/oldhash:/oldname..newhash:/newname for actions > that use hash_parent etc. [cut] >From the first glance, it looks good. I just worry a bit about complicated issue of hash_parent vs hash_parent_base etc. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse parent..current syntax from pathinfo 2008-10-02 0:10 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: generate parent..current URLs Giuseppe Bilotta @ 2008-10-04 1:31 ` Jakub Narebski 2008-10-04 7:24 ` Giuseppe Bilotta 2008-10-05 8:19 ` Jakub Narebski 1 sibling, 2 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-04 1:31 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > This makes it possible to use an URL such as > $project/somebranch..otherbranch:/filename to get a diff between > different version of a file. Paths like > $project/$action/somebranch:/somefile..otherbranch:/otherfile are parsed > as well. > In short, it allows to have link to '*diff' views using path_info URL, or in general to pass $hash_[parent_]base and $file_parent using path_info. > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> > --- > gitweb/gitweb.perl | 26 ++++++++++++++++++++++++-- > 1 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 3e5b2b7..89e360f 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -534,7 +534,9 @@ if ($path_info && !defined $action) { > > # we can now parse ref and pathnames in PATH_INFO > if ($path_info) { > - my ($refname, $pathname) = split(/:/, $path_info, 2); > + $path_info =~ /^((.+?)(:(.+))?\.\.)?(.+?)(:(.+))?$/; > + my ($parentrefname, $parentpathname, $refname, $pathname) = ( > + $2, $4, $5, $7); Style: I would use (but that is perhaps matter of taste) + my ($parentrefname, $parentpathname, $refname, $pathname) = + ($2, $4, $5, $7); Also it would be I think simpler to use instead non-catching grouping, i.e. (?: xxx ) extended pattern (see perlre(1)), and use ($1, $2, $3, $4), or even simpler 'list = (string =~ regexp)' form. I also think that the situation is more complicated than that, if we want to be more correct. The following path_info layouts with '..' make sense: hpb:fp..hb:f hpb..hb:f == hpb:f..hb:f hp..h And the layout below can be though to make sense, but it is just plain weird. hpb:fp..f == hpb:fp..HEAD:f > if (defined $pathname) { > # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" > # we could use git_get_type(branch:pathname), but it needs $git_dir > @@ -543,7 +545,11 @@ if ($path_info) { > $input_params{'action'} ||= "tree"; > $pathname =~ s,/$,,; > } else { > - $input_params{'action'} ||= "blob_plain"; > + if ($parentrefname) { > + $input_params{'action'} ||= "blobdiff_plain"; > + } else { > + $input_params{'action'} ||= "blob_plain"; > + } Good catch. > } > $input_params{'hash_base'} ||= $refname; > $input_params{'file_name'} ||= $pathname; > @@ -553,6 +559,22 @@ if ($path_info) { > $input_params{'hash'} ||= $refname; > $input_params{'hash_base'} ||= $refname; > } > + # the parent part might be missing the pathname, in which case we use the $file_name, if present > + if (defined $parentrefname) { > + $input_params{'hash_parent_base'} ||= $parentrefname; > + if ($parentpathname) { > + $parentpathname =~ s,^/+,,; > + $parentpathname =~ s,/$,,; > + $input_params{'file_parent'} ||= $parentpathname; > + } else { > + $input_params{'file_parent'} ||= $input_params{'file_name'}; > + } > + if (defined $input_params{'file_parent'}) { > + $input_params{'hash_parent'} ||= git_get_hash_by_path($input_params{'hash_parent_base'}, $input_params{'file_parent'}); This line is bit long, and I think it should be wrapped.. > + } else { > + $input_params{'hash_parent'} ||= $parentrefname; > + } > + } > } -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse parent..current syntax from pathinfo 2008-10-04 1:31 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Jakub Narebski @ 2008-10-04 7:24 ` Giuseppe Bilotta 2008-10-04 7:48 ` Jakub Narebski 2008-10-05 8:19 ` Jakub Narebski 1 sibling, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-04 7:24 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Sat, Oct 4, 2008 at 3:31 AM, Jakub Narebski <jnareb@gmail.com> wrote: > On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > >> This makes it possible to use an URL such as >> $project/somebranch..otherbranch:/filename to get a diff between >> different version of a file. Paths like >> $project/$action/somebranch:/somefile..otherbranch:/otherfile are parsed >> as well. >> > > In short, it allows to have link to '*diff' views using path_info URL, > or in general to pass $hash_[parent_]base and $file_parent using > path_info. Yes, that's probably a better form for the commit message. >> Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> >> --- >> gitweb/gitweb.perl | 26 ++++++++++++++++++++++++-- >> 1 files changed, 24 insertions(+), 2 deletions(-) >> >> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl >> index 3e5b2b7..89e360f 100755 >> --- a/gitweb/gitweb.perl >> +++ b/gitweb/gitweb.perl >> @@ -534,7 +534,9 @@ if ($path_info && !defined $action) { >> >> # we can now parse ref and pathnames in PATH_INFO >> if ($path_info) { >> - my ($refname, $pathname) = split(/:/, $path_info, 2); >> + $path_info =~ /^((.+?)(:(.+))?\.\.)?(.+?)(:(.+))?$/; >> + my ($parentrefname, $parentpathname, $refname, $pathname) = ( >> + $2, $4, $5, $7); > > Style: I would use (but that is perhaps matter of taste) > > + my ($parentrefname, $parentpathname, $refname, $pathname) = > + ($2, $4, $5, $7); Right, I'm not sure why I put the ( on the previous line. > Also it would be I think simpler to use instead non-catching grouping, > i.e. (?: xxx ) extended pattern (see perlre(1)), and use > ($1, $2, $3, $4), or even simpler 'list = (string =~ regexp)' form. Good idea, I'll rework it in that sense. > I also think that the situation is more complicated than that, if we > want to be more correct. > > The following path_info layouts with '..' make sense: > > hpb:fp..hb:f > hpb..hb:f == hpb:f..hb:f > hp..h And these are matched by the above regexp > And the layout below can be though to make sense, but it is just > plain weird. > > hpb:fp..f == hpb:fp..HEAD:f I'm afraid I'm not going to support that, although it's probably easy to support hpb:fp..:f (i.e. accept a missing refname but on condition of having a : in front of the file spec). >> + if (defined $input_params{'file_parent'}) { >> + $input_params{'hash_parent'} ||= git_get_hash_by_path($input_params{'hash_parent_base'}, $input_params{'file_parent'}); > > This line is bit long, and I think it should be wrapped.. By the way, on the first revision of the path_info patchset, you had me discard $hash ||= git_get_hash_by_path($hash_base, $file_name); in the simple case on the basis that it was an extra call to external git. I actually forgot to remove it from this part of the patchset too at the time, so this gets me wondering about this: should I put it back in place in the simple case, or remove it from here too? -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse parent..current syntax from pathinfo 2008-10-04 7:24 ` Giuseppe Bilotta @ 2008-10-04 7:48 ` Jakub Narebski 0 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-04 7:48 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce Giuseppe Bilotta wrote: > On Sat, Oct 4, 2008 at 3:31 AM, Jakub Narebski <jnareb@gmail.com> wrote: >> On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: >> >>> This makes it possible to use an URL such as >>> $project/somebranch..otherbranch:/filename to get a diff between >>> different version of a file. Paths like >>> $project/$action/somebranch:/somefile..otherbranch:/otherfile are parsed >>> as well. >>> >> >> In short, it allows to have link to '*diff' views using path_info URL, >> or in general to pass $hash_[parent_]base and $file_parent using >> path_info. > > Yes, that's probably a better form for the commit message. I have thought about this rather as supplement (addition) to the current commit message (which states explicitly new form of supported path_info URL), than replacing it. >> The following path_info layouts with '..' make sense: >> >> hpb:fp..hb:f >> hpb..hb:f == hpb:f..hb:f >> hp..h > > And these are matched by the above regexp > >> And the layout below can be though to make sense, but it is just >> plain weird. >> >> hpb:fp..f == hpb:fp..HEAD:f > > I'm afraid I'm not going to support that, although it's probably easy > to support hpb:fp..:f (i.e. accept a missing refname but on condition > of having a : in front of the file spec). No, not supporting this form is just fine. >>> + if (defined $input_params{'file_parent'}) { >>> + $input_params{'hash_parent'} ||= git_get_hash_by_path($input_params{'hash_parent_base'}, $input_params{'file_parent'}); >> >> This line is bit long, and I think it should be wrapped.. > > By the way, on the first revision of the path_info patchset, you had me discard > > $hash ||= git_get_hash_by_path($hash_base, $file_name); > > in the simple case on the basis that it was an extra call to external git. > > I actually forgot to remove it from this part of the patchset too at > the time, so this gets me wondering about this: should I put it back > in place in the simple case, or remove it from here too? I think you should remove it here too. IMHO if needed, it should be dealt with (and I think is dealt with) in appropriate action subroutine. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse parent..current syntax from pathinfo 2008-10-04 1:31 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Jakub Narebski 2008-10-04 7:24 ` Giuseppe Bilotta @ 2008-10-05 8:19 ` Jakub Narebski 1 sibling, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-05 8:19 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Sun, 4 Oct 2008, Jakub Narebski wrote: > On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > > index 3e5b2b7..89e360f 100755 > > --- a/gitweb/gitweb.perl > > +++ b/gitweb/gitweb.perl > > @@ -534,7 +534,9 @@ if ($path_info && !defined $action) { > > > > # we can now parse ref and pathnames in PATH_INFO > > if ($path_info) { > > - my ($refname, $pathname) = split(/:/, $path_info, 2); > > + $path_info =~ /^((.+?)(:(.+))?\.\.)?(.+?)(:(.+))?$/; > > + my ($parentrefname, $parentpathname, $refname, $pathname) = ( > > + $2, $4, $5, $7); > > Style: I would use (but that is perhaps matter of taste) > > + my ($parentrefname, $parentpathname, $refname, $pathname) = > + ($2, $4, $5, $7); > > Also it would be I think simpler to use instead non-catching grouping, > i.e. (?: xxx ) extended pattern (see perlre(1)), and use > ($1, $2, $3, $4), or even simpler 'list = (string =~ regexp)' form. Alternate solution would be to use split(..., ..., 2), but I think you got the regular expression right. Just mentioning... -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: use_pathinfo filenames start with / 2008-10-02 0:10 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Giuseppe Bilotta @ 2008-10-03 11:28 ` Jakub Narebski 1 sibling, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 11:28 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > When using path info, make filenames start with a / (right after the : > that separates them from the hash base). This minimal change allows > relative navigation to work properly when viewing HTML files. ...in 'raw' mode/'blob_plain' view. This allows to for example view web site as it was at some revision, following relative links and checking if they are broken. > > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> (Note that it depends on previous patch, and without it doesn't make sense, so this Ack doesn't matter much now!) > --- > gitweb/gitweb.perl | 6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 2c380ac..3e5b2b7 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -690,7 +690,7 @@ sub href (%) { > # try to put as many parameters as possible in PATH_INFO: > # - project name > # - action > - # - hash or hash_base:filename > + # - hash or hash_base:/filename > > # Strip any trailing / from $href, or we might get double > # slashes when the script is the DirectoryIndex > @@ -708,11 +708,11 @@ sub href (%) { > delete $params{'action'}; > } > > - # Finally, we put either hash_base:file_name or hash > + # Finally, we put either hash_base:/file_name or hash > if (defined $params{'hash_base'}) { > $href .= "/".esc_url($params{'hash_base'}); > if (defined $params{'file_name'}) { > - $href .= ":".esc_url($params{'file_name'}); > + $href .= ":/".esc_url($params{'file_name'}); > delete $params{'file_name'}; > } > delete $params{'hash'}; > -- > 1.5.6.5 > > -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: generate project/action/hash URLs 2008-10-02 0:10 ` [PATCHv4] gitweb: generate project/action/hash URLs Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Giuseppe Bilotta @ 2008-10-03 1:48 ` Jakub Narebski [not found] ` <cb7bb73a0810022330l498bdb20h703dec7833a443e@mail.gmail.com> 2008-10-04 1:15 ` Jakub Narebski 2 siblings, 1 reply; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 1:48 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Tue, 2 Oct 2008, Giuseppe Bilotta wrote: > When generating path info URLs, reduce the number of CGI parameters by > embedding action and hash_parent:filename or hash in the path. _Perhaps_ it should be noted that even though gitweb accepted 'project/hash' and 'project/hash_base:file_name' path_info URLs, it generated links with only 'project/' in path_info. > > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> > --- > gitweb/gitweb.perl | 32 +++++++++++++++++++++++++++++--- > 1 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index ec4326f..2c380ac 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -687,14 +687,40 @@ sub href (%) { > > my ($use_pathinfo) = gitweb_check_feature('pathinfo'); > if ($use_pathinfo) { > - # use PATH_INFO for project name > + # try to put as many parameters as possible in PATH_INFO: > + # - project name > + # - action > + # - hash or hash_base:filename > + > + # Strip any trailing / from $href, or we might get double > + # slashes when the script is the DirectoryIndex Perhaps example, like $href='gitweb.example.com/', could be put here. > + # I think that we can lose this empty line comment here. > + $href =~ s,/$,,; > + > + # Then add the project name, if present > $href .= "/".esc_url($params{'project'}) if defined $params{'project'}; > delete $params{'project'}; > > - # Summary just uses the project path URL > - if (defined $params{'action'} && $params{'action'} eq 'summary') { > + # Summary just uses the project path URL, any other action is > + # added to the URL > + if (defined $params{'action'}) { > + $href .= "/".esc_url($params{'action'}) unless $params{'action'} eq 'summary'; > delete $params{'action'}; > } Good. > + > + # Finally, we put either hash_base:file_name or hash > + if (defined $params{'hash_base'}) { > + $href .= "/".esc_url($params{'hash_base'}); > + if (defined $params{'file_name'}) { > + $href .= ":".esc_url($params{'file_name'}); > + delete $params{'file_name'}; > + } > + delete $params{'hash'}; > + delete $params{'hash_base'}; > + } elsif (defined $params{'hash'}) { > + $href .= "/".esc_url($params{'hash'}); > + delete $params{'hash'}; > + } Hmmmm... Shouldn't the code first check for $file_name, then add either "$hash_base:$file_name" (url-escaped), or "$hash" (not "$hash_base")? > } > > # now encode the parameters explicitly Thank you very much for work on improving gitweb. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
[parent not found: <cb7bb73a0810022330l498bdb20h703dec7833a443e@mail.gmail.com>]
* Re: [PATCHv4] gitweb: generate project/action/hash URLs [not found] ` <cb7bb73a0810022330l498bdb20h703dec7833a443e@mail.gmail.com> @ 2008-10-03 11:24 ` Jakub Narebski 0 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 11:24 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git On Fri, 3 October 2008, Giuseppe Bilotta wrote: > On Fri, Oct 3, 2008 at 3:48 AM, Jakub Narebski <jnareb@gmail.com> wrote: >> On Tue, 2 Oct 2008, Giuseppe Bilotta wrote: >>> + # Finally, we put either hash_base:file_name or hash >>> + if (defined $params{'hash_base'}) { >>> + $href .= "/".esc_url($params{'hash_base'}); >>> + if (defined $params{'file_name'}) { >>> + $href .= ":".esc_url($params{'file_name'}); >>> + delete $params{'file_name'}; >>> + } >>> + delete $params{'hash'}; >>> + delete $params{'hash_base'}; >>> + } elsif (defined $params{'hash'}) { >>> + $href .= "/".esc_url($params{'hash'}); >>> + delete $params{'hash'}; >>> + } >> >> Hmmmm... >> >> Shouldn't the code first check for $file_name, then add either >> "$hash_base:$file_name" (url-escaped), or "$hash" (not "$hash_base")? > > Hm, your idea is probably better indeed, if we can ensure that > $file_name is always set for generated links that need $hash_base (I > mean, as opposed to the root tree case we were discussing for the > first patch). Hmm... I'm just worrying here about diluting meaning of params passed via path_info. We had either project/hash, or project/hash_base:file_name; now the hashy parameter in path_info can be hash or hash_base and we don't know which. But perhaps I am worrying over nothing... A short comment though would be nice (that we can have $hash or $hash_base for case without $file_name) -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: generate project/action/hash URLs 2008-10-02 0:10 ` [PATCHv4] gitweb: generate project/action/hash URLs Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Giuseppe Bilotta 2008-10-03 1:48 ` [PATCHv4] gitweb: generate project/action/hash URLs Jakub Narebski @ 2008-10-04 1:15 ` Jakub Narebski 2 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-04 1:15 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > When generating path info URLs, reduce the number of CGI parameters by > embedding action and hash_parent:filename or hash in the path. I think this is good. >--- > + # Finally, we put either hash_base:file_name or hash > + if (defined $params{'hash_base'}) { > + $href .= "/".esc_url($params{'hash_base'}); > + if (defined $params{'file_name'}) { > + $href .= ":".esc_url($params{'file_name'}); > + delete $params{'file_name'}; > + } > + delete $params{'hash'}; > + delete $params{'hash_base'}; > + } elsif (defined $params{'hash'}) { > + $href .= "/".esc_url($params{'hash'}); > + delete $params{'hash'}; > + } > } That I'm not sure about, both the layout of conditional (shouldn't we check $file_name first), and the fact that we remove parameter which is not passed, and can be even not recoverable (for example both 'hash' and 'hash_base' set, 'hash' != 'hash_base', and 'file_name' not set). So the code above probably has some strange corner cases, but I guess it wouldn't be triggered by links generated by gitweb. But I guess that is "good enough", especially that 'tree' and 'history' action links can have 'file_name' unset if they refer to top tree, and they still need 'hash_base'. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: refactor input parameters parse/validation 2008-10-02 0:10 ` [PATCHv4] gitweb: refactor input parameters parse/validation Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: generate project/action/hash URLs Giuseppe Bilotta @ 2008-10-03 1:36 ` Jakub Narebski 2008-10-03 7:24 ` Giuseppe Bilotta 1 sibling, 1 reply; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 1:36 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > Since input parameters can now be obtained both from CGI parameters and > PATH_INFO, we would like most of the code to be agnostic about the way > parameters were retrieved. > I'd prefer that this cleanup/refactoring patch was _first_ patch in the series, as we were able to obtain parameters both from CGI query parameters and from PATH_INFO ($project, $hash or $hash_base+$file_name) _before_ first patch in this series. So it correct not only issue introduced by first patch (and fixed somewhat there), but what was outstanding (but rare because gitweb did not generate such links) issue. > We thus collect all the parameters into the new %input_params hash, > removing the need for ad-hoc kludgy code in href(). Alternate solution would be push data from PATH_INFO in query params data using (for example) $cgi->param('a', $action); or, naming parameters explicitely $cgi->param(-name=>'a', -value=>$action); This avoids need for additional variable, reuses current code, and nicely sidesteps issue whether to use long names for keys ('action', 'file_name') or short ones from CGI query ('a', 'f'). It would probably has to be at least partially written to check which of those two solutions (%input_params or $cgi->param('a', $a))) is better. > As much of the > parameters validation code is now shared between CGI and PATH_INFO, > although this requires PATH_INFO parsing to be interleaved into the main > code instead of being factored out into its own routine. I'm not sure if it is worth it then to unify parameters validation, with such drawback. > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> > --- > gitweb/gitweb.perl | 336 ++++++++++++++++++++++++++++------------------------ > 1 files changed, 183 insertions(+), 153 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index f088681..ec4326f 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -28,8 +28,10 @@ our $my_url = $cgi->url(); > our $my_uri = $cgi->url(-absolute => 1); > > # if we're called with PATH_INFO, we have to strip that > -# from the URL to find our real URL > -if (my $path_info = $ENV{"PATH_INFO"}) { > +# from the URL to find our real URL. PATH_INFO is kept > +# as it's used later on for parameter extraction > +my $path_info = $ENV{"PATH_INFO"}; > +if ($path_info) { > $my_url =~ s,\Q$path_info\E$,,; > $my_uri =~ s,\Q$path_info\E$,,; > } This might be separate patch, if you wanted to increase your commit count ;-) More seriously I think it should be at least briefly mentioned in commit message (make $path_info global). > @@ -390,15 +392,111 @@ $projects_list ||= $projectroot; > > # ====================================================================== > # input validation and dispatch > -our $action = $cgi->param('a'); > -if (defined $action) { > - if ($action =~ m/[^0-9a-zA-Z\.\-_]/) { > - die_error(400, "Invalid action parameter"); > + > +# input parameters can be collected from a variety of sources (presently, CGI > +# and PATH_INFO), so we define an %input_params hash that collects them all > +# together during validation: this allows subsequent uses (e.g. href()) to be > +# agnostic of the parameter origin > + > +my %input_params = (); > + > +# input parameters are stored with the long parameter name as key, so we need > +# the name -> CGI key mapping here. This will also be used in the href > +# subroutine to convert parameters to their CGI equivalent. > +# > +# XXX: Warning: If you touch this, check the search form for updating, > +# too. > + > +my @cgi_param_mapping = ( > + project => "p", > + action => "a", > + file_name => "f", > + file_parent => "fp", > + hash => "h", > + hash_parent => "hp", > + hash_base => "hb", > + hash_parent_base => "hpb", > + page => "pg", > + order => "o", > + searchtext => "s", > + searchtype => "st", > + snapshot_format => "sf", > + extra_options => "opt", > + search_use_regexp => "sr", > +); > +my %cgi_param_mapping = @cgi_param_mapping; > + > +# we will also need to know the possible actions, for validation > +my %actions = ( > + "blame" => \&git_blame, > + "blobdiff" => \&git_blobdiff, > + "blobdiff_plain" => \&git_blobdiff_plain, > + "blob" => \&git_blob, > + "blob_plain" => \&git_blob_plain, > + "commitdiff" => \&git_commitdiff, > + "commitdiff_plain" => \&git_commitdiff_plain, > + "commit" => \&git_commit, > + "forks" => \&git_forks, > + "heads" => \&git_heads, > + "history" => \&git_history, > + "log" => \&git_log, > + "rss" => \&git_rss, > + "atom" => \&git_atom, > + "search" => \&git_search, > + "search_help" => \&git_search_help, > + "shortlog" => \&git_shortlog, > + "summary" => \&git_summary, > + "tag" => \&git_tag, > + "tags" => \&git_tags, > + "tree" => \&git_tree, > + "snapshot" => \&git_snapshot, > + "object" => \&git_object, > + # those below don't need $project > + "opml" => \&git_opml, > + "project_list" => \&git_project_list, > + "project_index" => \&git_project_index, > +); Nice, although I'm not sure if [%@]cgi_param_mapping has to global. If we use long parameters names as keys, I think it has to, somewhat. See also comment below. > + > +# fill %input_params with the CGI parameters. All values except for 'opt' > +# should be single values, but opt can be an array. We should probably > +# build an array of parameters that can be multi-valued, but since for the time > +# being it's only this one, we just single it out > +while (my ($name, $symbol) = each %cgi_param_mapping) { > + if ($symbol eq 'opt') { > + $input_params{$name} = [$cgi->param($symbol)]; > + } else { > + $input_params{$name} = $cgi->param($symbol); > } > } If it was chosen to use short (CGI query) parameter names, the above loop could be replaced by simple %input_params = $cgi->Vars; or to be more exact, if we want to have multi-valued parameters stored as array ref %input_params = map { [ split /\0/ ] if /\0/; } $cgi->Vars; See CGI(3pm): When using this [Vars], the thing you must watch out for are multivalued CGI parameters. Because a hash cannot distinguish between scalar and list con- text, multivalued parameters will be returned as a packed string, separated by the "\0" (null) character. You must split this packed string in order to get at the individual values. > -# parameters which are pathnames > -our $project = $cgi->param('p'); > +# next, we want to parse PATH_INFO (which was already stored in $path_info at > +# the beginning). This is a little hairy because PATH_INFO parsing needs some > +# form of parameter validation, so we interleave parsing and validation. I don't think it is a good idea. In my opinion, for my taste, it would be better to separate evaluating path_info from the rest. We could instead introduce convention that if variable (like $project) is set, then it is assumed to be validated; if it is present only in the %input_params hash, then it has to be validated. On the other hand this ordering, first by parameter, then by method of extraction could be seem quite equally valid. Nevertheless I think that previous flow with separate evaluate_path_info() and what should be evaluate_CGI_query() has better encapsulation. > +# > +# The accepted PATH_INFO syntax is $project/$action/$hash or > +# $project/$action/$hash_base:$file_name, where $action may be missing (mostly for > +# backwards compatibility), so we need to parse and validate the parameters in > +# this same order. > + > +# clear $path_info of any leading / > +$path_info =~ s,^/+,,; > + > +our $project = $input_params{'project'}; > +if ($path_info && !defined $project) { > + # if $project was not defined by CGI, we try to extract it from > + # $path_info > + $project = $path_info; > + $project =~ s,/+$,,; > + while ($project && !check_head_link("$projectroot/$project")) { > + $project =~ s,/*[^/]*$,,; > + } > + $input_params{'project'} = $project; > +} else { > + # otherwise, we suppress $path_info parsing altogether > + $path_info = undef; > +} > + > +# project name validation > if (defined $project) { > if (!validate_pathname($project) || > !(-d "$projectroot/$project") || Note that this code does less checking if $project is in path_info than for the case where it is set by CGI query. Perhaps there should be base fast check in a loop, and more extensive test later. > @@ -408,16 +506,66 @@ if (defined $project) { > undef $project; > die_error(404, "No such project"); > } > + > + # we purge the $project name from the $path_info, preparing it for > + # subsequent parameters extraction > + $path_info =~ s,^\Q$project\E/*,, if defined $path_info; > +} else { > + # we also suppress $path_info parsing if no project was defined > + $path_info = undef; > +} In evaluate_path_info it was simply 'return if...'; here with mentioned interleaving it is harder and a bit hacky. [cut] > @@ -615,43 +676,12 @@ sub href (%) { > # default is to use -absolute url() i.e. $my_uri > my $href = $params{-full} ? $my_url : $my_uri; [cut removed, or rather moved to beginning, @mapping] > $params{'project'} = $project unless exists $params{'project'}; > > if ($params{-replay}) { > - while (my ($name, $symbol) = each %mapping) { > - if (!exists $params{$name}) { > - # the parameter we want to recycle may be either part of the > - # list of CGI parameter, or recovered from PATH_INFO > - if ($cgi->param($symbol)) { > - # to allow for multivalued params we use arrayref form > - $params{$name} = [ $cgi->param($symbol) ]; > - } else { > - no strict 'refs'; > - $params{$name} = $$name if $$name; > - } > - } > + while (my ($name, $val) = each %input_params) { > + $params{$name} = $input_params{$name} > + unless (exists $params{$name}); Very nice, short code. Should be something like that from the very beginning. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: refactor input parameters parse/validation 2008-10-03 1:36 ` [PATCHv4] gitweb: refactor input parameters parse/validation Jakub Narebski @ 2008-10-03 7:24 ` Giuseppe Bilotta 2008-10-03 11:20 ` Jakub Narebski 0 siblings, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-03 7:24 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Fri, Oct 3, 2008 at 3:36 AM, Jakub Narebski <jnareb@gmail.com> wrote: > On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > >> Since input parameters can now be obtained both from CGI parameters and >> PATH_INFO, we would like most of the code to be agnostic about the way >> parameters were retrieved. > > I'd prefer that this cleanup/refactoring patch was _first_ patch in > the series, as we were able to obtain parameters both from CGI query > parameters and from PATH_INFO ($project, $hash or $hash_base+$file_name) > _before_ first patch in this series. So it correct not only issue > introduced by first patch (and fixed somewhat there), but what was > outstanding (but rare because gitweb did not generate such links) > issue. It's true that the patch makes sense regardless of the rest of the path_info patch, indeed. >> We thus collect all the parameters into the new %input_params hash, >> removing the need for ad-hoc kludgy code in href(). > > Alternate solution would be push data from PATH_INFO in query params > data using (for example) > > $cgi->param('a', $action); > > or, naming parameters explicitely > > $cgi->param(-name=>'a', -value=>$action); > > This avoids need for additional variable, reuses current code, and > nicely sidesteps issue whether to use long names for keys ('action', > 'file_name') or short ones from CGI query ('a', 'f'). > > > It would probably has to be at least partially written to check which > of those two solutions (%input_params or $cgi->param('a', $a))) > is better. If we really don't want to care where the parameters came from *at all*, writing on $cgi->param is likely to be the cleanest solution. >> As much of the >> parameters validation code is now shared between CGI and PATH_INFO, >> although this requires PATH_INFO parsing to be interleaved into the main >> code instead of being factored out into its own routine. > > I'm not sure if it is worth it then to unify parameters validation, > with such drawback. Yeah, I don't like it very much either. But it does spare a little on the validation. OTOH, the amount we spare is not extraordinary, so it's probably not worth the spaghettization of the code ... >> # if we're called with PATH_INFO, we have to strip that >> -# from the URL to find our real URL >> -if (my $path_info = $ENV{"PATH_INFO"}) { >> +# from the URL to find our real URL. PATH_INFO is kept >> +# as it's used later on for parameter extraction >> +my $path_info = $ENV{"PATH_INFO"}; >> +if ($path_info) { >> $my_url =~ s,\Q$path_info\E$,,; >> $my_uri =~ s,\Q$path_info\E$,,; >> } > > This might be separate patch, if you wanted to increase your commit > count ;-) More seriously I think it should be at least briefly > mentioned in commit message (make $path_info global). Note however that the path_info evaluation is _destructive_, so after evaluation is complete we don't have much of it left. [snip] > Nice, although I'm not sure if [%@]cgi_param_mapping has to global. > If we use long parameters names as keys, I think it has to, somewhat. > See also comment below. >> +# fill %input_params with the CGI parameters. All values except for 'opt' >> +# should be single values, but opt can be an array. We should probably >> +# build an array of parameters that can be multi-valued, but since for the time >> +# being it's only this one, we just single it out >> +while (my ($name, $symbol) = each %cgi_param_mapping) { >> + if ($symbol eq 'opt') { >> + $input_params{$name} = [$cgi->param($symbol)]; >> + } else { >> + $input_params{$name} = $cgi->param($symbol); >> } >> } > > If it was chosen to use short (CGI query) parameter names, the above > loop could be replaced by simple > > %input_params = $cgi->Vars; > > or to be more exact, if we want to have multi-valued parameters stored > as array ref > > %input_params = map { [ split /\0/ ] if /\0/; } $cgi->Vars; > > > See CGI(3pm): > > When using this [Vars], the thing you must watch out for are multivalued CGI > parameters. Because a hash cannot distinguish between scalar and list con- > text, multivalued parameters will be returned as a packed string, separated > by the "\0" (null) character. You must split this packed string in order > to get at the individual values. Ah, an interesting alternative. This would make parameter copying a one-liner, almost as good as just using $cgi->param for everything :) >> -# parameters which are pathnames >> -our $project = $cgi->param('p'); >> +# next, we want to parse PATH_INFO (which was already stored in $path_info at >> +# the beginning). This is a little hairy because PATH_INFO parsing needs some >> +# form of parameter validation, so we interleave parsing and validation. > > I don't think it is a good idea. In my opinion, for my taste, it would > be better to separate evaluating path_info from the rest. > > We could instead introduce convention that if variable (like $project) > is set, then it is assumed to be validated; if it is present only in > the %input_params hash, then it has to be validated. > > > On the other hand this ordering, first by parameter, then by method > of extraction could be seem quite equally valid. Nevertheless I think > that previous flow with separate evaluate_path_info() and what should > be evaluate_CGI_query() has better encapsulation. > >> +# >> +# The accepted PATH_INFO syntax is $project/$action/$hash or >> +# $project/$action/$hash_base:$file_name, where $action may be missing (mostly for >> +# backwards compatibility), so we need to parse and validate the parameters in >> +# this same order. >> + >> +# clear $path_info of any leading / >> +$path_info =~ s,^/+,,; >> + >> +our $project = $input_params{'project'}; >> +if ($path_info && !defined $project) { >> + # if $project was not defined by CGI, we try to extract it from >> + # $path_info >> + $project = $path_info; >> + $project =~ s,/+$,,; >> + while ($project && !check_head_link("$projectroot/$project")) { >> + $project =~ s,/*[^/]*$,,; >> + } >> + $input_params{'project'} = $project; >> +} else { >> + # otherwise, we suppress $path_info parsing altogether >> + $path_info = undef; >> +} >> + >> +# project name validation >> if (defined $project) { >> if (!validate_pathname($project) || >> !(-d "$projectroot/$project") || > > Note that this code does less checking if $project is in path_info than > for the case where it is set by CGI query. Perhaps there should be base > fast check in a loop, and more extensive test later. Uh ... isn't this exactly what's happening? In the loop we just gobble until we find a git dir. Validation is then done, and it's the _same_ validation for both cases. Why do you say that path_info $project is less checked? >> @@ -408,16 +506,66 @@ if (defined $project) { >> undef $project; >> die_error(404, "No such project"); >> } >> + >> + # we purge the $project name from the $path_info, preparing it for >> + # subsequent parameters extraction >> + $path_info =~ s,^\Q$project\E/*,, if defined $path_info; >> +} else { >> + # we also suppress $path_info parsing if no project was defined >> + $path_info = undef; >> +} > > In evaluate_path_info it was simply 'return if...'; here with mentioned > interleaving it is harder and a bit hacky. I know. >> $params{'project'} = $project unless exists $params{'project'}; >> >> if ($params{-replay}) { >> - while (my ($name, $symbol) = each %mapping) { >> - if (!exists $params{$name}) { >> - # the parameter we want to recycle may be either part of the >> - # list of CGI parameter, or recovered from PATH_INFO >> - if ($cgi->param($symbol)) { >> - # to allow for multivalued params we use arrayref form >> - $params{$name} = [ $cgi->param($symbol) ]; >> - } else { >> - no strict 'refs'; >> - $params{$name} = $$name if $$name; >> - } >> - } >> + while (my ($name, $val) = each %input_params) { >> + $params{$name} = $input_params{$name} >> + unless (exists $params{$name}); > > Very nice, short code. Should be something like that from > the very beginning. Ok, I'll try working up a patch for params merging before any path_info extensions. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: refactor input parameters parse/validation 2008-10-03 7:24 ` Giuseppe Bilotta @ 2008-10-03 11:20 ` Jakub Narebski 0 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 11:20 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Fri, 3 Oct 2008, Giuseppe Bilotta wrote: > On Fri, Oct 3, 2008 at 3:36 AM, Jakub Narebski <jnareb@gmail.com> wrote: >> On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: >>> We thus collect all the parameters into the new %input_params hash, >>> removing the need for ad-hoc kludgy code in href(). >> >> Alternate solution would be push data from PATH_INFO in query params >> data using (for example) >> >> $cgi->param('a', $action); >> >> or, naming parameters explicitely >> >> $cgi->param(-name=>'a', -value=>$action); >> >> This avoids need for additional variable, reuses current code, and >> nicely sidesteps issue whether to use long names for keys ('action', >> 'file_name') or short ones from CGI query ('a', 'f'). >> >> >> It would probably has to be at least partially written to check which >> of those two solutions (%input_params or $cgi->param('a', $a))) >> is better. > > If we really don't want to care where the parameters came from *at > all*, writing on $cgi->param is likely to be the cleanest solution. I now think that %input_params is a better solution, even if we won't be implementing -true_replay / -faithful_replay option, where options gotten in path_info would be passed in path_info, and options passed in query string would be passed as CGI params (even if those params could be encoded in path_info). One of the advantages is that having @cgi_param_mapping near the top provides opportunity to describe short CGI query options. Another advantage is that with %input_params it is easy to find if parameter is multivalued, or is meant to be multivalued: just check ref($input_params{long_name}); well, perhaps not _that_ easy, but... >>> As much of the >>> parameters validation code is now shared between CGI and PATH_INFO, >>> although this requires PATH_INFO parsing to be interleaved into the main >>> code instead of being factored out into its own routine. >> >> I'm not sure if it is worth it then to unify parameters validation, >> with such drawback. > > Yeah, I don't like it very much either. But it does spare a little on > the validation. OTOH, the amount we spare is not extraordinary, so > it's probably not worth the spaghettization of the code ... Avoiding repetition can be done in different ways, for example we can have gitweb assume that if value is already in params variable ($project, $action, $hash,...) it is already validated, and if it is only in %input_params it is not. Add to that for example putting common part of project path checks into validate_project and you have all the advantages (no code repetition, spared validation) without drawback of harder to maintain spaghetti-like code. >>> +# fill %input_params with the CGI parameters. All values except for 'opt' >>> +# should be single values, but opt can be an array. We should probably >>> +# build an array of parameters that can be multi-valued, but since for the time >>> +# being it's only this one, we just single it out >>> +while (my ($name, $symbol) = each %cgi_param_mapping) { >>> + if ($symbol eq 'opt') { >>> + $input_params{$name} = [$cgi->param($symbol)]; >>> + } else { >>> + $input_params{$name} = $cgi->param($symbol); >>> } >>> } >> >> If it was chosen to use short (CGI query) parameter names, the above >> loop could be replaced by simple >> >> %input_params = $cgi->Vars; >> >> or to be more exact, if we want to have multi-valued parameters stored >> as array ref >> >> %input_params = map { [ split /\0/ ] if /\0/; } $cgi->Vars; >> >> >> See CGI(3pm): >> >> When using this [Vars], the thing you must watch out for are multivalued CGI >> parameters. Because a hash cannot distinguish between scalar and list con- >> text, multivalued parameters will be returned as a packed string, separated >> by the "\0" (null) character. You must split this packed string in order >> to get at the individual values. > > Ah, an interesting alternative. This would make parameter copying a > one-liner, almost as good as just using $cgi->param for everything :) Hmmmm... [cut] >> Note that this code does less checking if $project is in path_info than >> for the case where it is set by CGI query. Perhaps there should be base >> fast check in a loop, and more extensive test later. > > Uh ... isn't this exactly what's happening? In the loop we just gobble > until we find a git dir. Validation is then done, and it's the _same_ > validation for both cases. Why do you say that path_info $project is > less checked? For project from query string we now have: !validate_pathname($project) || !(-d "$projectroot/$project") || !check_head_link("$projectroot/$project") || ($export_ok && !(-e "$projectroot/$project/$export_ok")) || ($strict_export && !project_in_list($project)) For project from path_info we now have: while ($project && !check_head_link("$projectroot/$project")) { $project =~ s,/*[^/]*$,,; } ... !validate_pathname($project) || ($export_ok && !-e "$projectroot/$project/$export_ok") || ($strict_export && !project_in_list($project)) It is almost the same; I have thought that they differ more. Perhaps some of above code could be refactored into validate_project(), or project_ok() subroutine. >>> $params{'project'} = $project unless exists $params{'project'}; >>> >>> if ($params{-replay}) { >>> - while (my ($name, $symbol) = each %mapping) { >>> - if (!exists $params{$name}) { >>> - # the parameter we want to recycle may be either part of the >>> - # list of CGI parameter, or recovered from PATH_INFO >>> - if ($cgi->param($symbol)) { >>> - # to allow for multivalued params we use arrayref form >>> - $params{$name} = [ $cgi->param($symbol) ]; >>> - } else { >>> - no strict 'refs'; >>> - $params{$name} = $$name if $$name; >>> - } >>> - } >>> + while (my ($name, $val) = each %input_params) { >>> + $params{$name} = $input_params{$name} >>> + unless (exists $params{$name}); >> >> Very nice, short code. Should be something like that from >> the very beginning. > > Ok, I'll try working up a patch for params merging before any > path_info extensions. Perhaps also put query string handling into evaluate_CGI_query(), or evaluate_query_string(), similar to evaluate_path_info()? -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 0:10 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: refactor input parameters parse/validation Giuseppe Bilotta @ 2008-10-02 8:59 ` Jakub Narebski 2008-10-02 9:43 ` Giuseppe Bilotta 2008-10-02 15:34 ` Petr Baudis 2 siblings, 1 reply; 34+ messages in thread From: Jakub Narebski @ 2008-10-02 8:59 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce Giuseppe Bilotta wrote: > This patch enables gitweb to parse URLs with more information embedded > in PATH_INFO, reducing the need for CGI parameters. The typical gitweb > path is now $project/$action/$hash_base:$file_name or > $project/$action/$hash > > This is mostly backwards compatible with the old-style gitweb paths, > except when $project/$branch was used to access a branch whose name > matches a gitweb action. Nice summary. > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> > --- > gitweb/gitweb.perl | 90 +++++++++++++++++++++++++++++++--------------------- > 1 files changed, 54 insertions(+), 36 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index e7e4d6b..f088681 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -495,6 +495,37 @@ if (defined $searchtext) { > $search_regexp = $search_use_regexp ? $searchtext : quotemeta $searchtext; > } > > +# dispatch > +my %actions = ( > + "blame" => \&git_blame, [...] > +); I'm not sure if the '# dispatch' comment is correct here now that %actions hash is moved away from actual dispatch (selecting action to run) > @@ -519,9 +550,19 @@ sub evaluate_path_info { > # do not change any parameters if an action is given using the query string > return if $action; > $path_info =~ s,^\Q$project\E/*,,; > + > + # next comes the action > + $action = $path_info; > + $action =~ s,/.*$,,; I would use here pattern matching, but your code is also good and doesn't need changing; just for completeness below there is alternate solution: + $path_info =~ m,^(.*?)/,; + $action = $1; > + if (exists $actions{$action}) { > + $path_info =~ s,^$action/*,,; > + } else { > + $action = undef; > + } > + [...] > @@ -534,8 +575,9 @@ sub evaluate_path_info { > $file_name ||= validate_pathname($pathname); > } elsif (defined $refname) { > # we got "project.git/branch" You meant here # we got "project.git/branch" or "project.git/action/branch" > - $action ||= "shortlog"; > - $hash ||= validate_refname($refname); > + $action ||= "shortlog"; > + $hash ||= validate_refname($refname); > + $hash_base ||= validate_refname($refname); > } > } This hunk is IMHO incorrect. First, $refname is _either_ $hash, or $hash_base; it cannot be both. Second, in most cases (like the case of 'shortlog' action, either explicit or implicit) it is simply $hash; I think it can be $hash_base when $file_name is not set only in singular exception case of 'tree' view for the top tree (lack of filename is not an error, but is equivalent to $file_name='/'). > @@ -544,37 +586,6 @@ evaluate_path_info(); > our $git_dir; > $git_dir = "$projectroot/$project" if $project; > > -# dispatch > -my %actions = ( [...] > -); > - > if (!defined $action) { > if (defined $hash) { > $action = git_get_type($hash); I _think_ the '# dispatch' comment should be left here, and not moved with the %actions hash. > @@ -631,8 +642,15 @@ sub href (%) { > if ($params{-replay}) { > while (my ($name, $symbol) = each %mapping) { > if (!exists $params{$name}) { > - # to allow for multivalued params we use arrayref form > - $params{$name} = [ $cgi->param($symbol) ]; > + # the parameter we want to recycle may be either part of the > + # list of CGI parameter, or recovered from PATH_INFO > + if ($cgi->param($symbol)) { > + # to allow for multivalued params we use arrayref form > + $params{$name} = [ $cgi->param($symbol) ]; > + } else { > + no strict 'refs'; > + $params{$name} = $$name if $$name; I would _perhaps_ add here comment that multivalued parameters can come only from CGI query string, so there is no need for something like: + $params{$name} = (ref($$name) ? @$name : $$name) if $$name; > + } > } > } > } This fragment is a bit of ugly code, hopefully corrected in later patch. I think it would be better to have 'refactor parsing/validation of input parameters' to be very fist patch in series; I am not sure but I suspect that is a kind of bugfix for current "$project/$hash" ('shortlog' view) and "$project/$hash_base:$file_name" ('blob_plain' and 'tree' view) path_info. P.S. It is a bit of pity that Mechanize test from Lea Wiemann caching gitweb code is not in the 'master' or at least 'pu'. Using big, single, monolithic patch instead of patch series of small, easy reviewable commits strikes again... ;-( -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 8:59 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Jakub Narebski @ 2008-10-02 9:43 ` Giuseppe Bilotta 2008-10-03 0:48 ` Jakub Narebski 0 siblings, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 9:43 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, Oct 2, 2008 at 10:59 AM, Jakub Narebski <jnareb@gmail.com> wrote: > Giuseppe Bilotta wrote: >> >> +# dispatch >> +my %actions = ( >> + "blame" => \&git_blame, > [...] >> +); > > I'm not sure if the '# dispatch' comment is correct here now that > %actions hash is moved away from actual dispatch (selecting action > to run) Bingo. >> @@ -519,9 +550,19 @@ sub evaluate_path_info { >> # do not change any parameters if an action is given using the query string >> return if $action; >> $path_info =~ s,^\Q$project\E/*,,; >> + >> + # next comes the action >> + $action = $path_info; >> + $action =~ s,/.*$,,; > > I would use here pattern matching, but your code is also good and > doesn't need changing; just for completeness below there is alternate > solution: > > + $path_info =~ m,^(.*?)/,; > + $action = $1; Yeah, I just followed the existing code style. >> @@ -534,8 +575,9 @@ sub evaluate_path_info { >> $file_name ||= validate_pathname($pathname); >> } elsif (defined $refname) { >> # we got "project.git/branch" > > You meant here > > # we got "project.git/branch" or "project.git/action/branch" Yes I do. >> - $action ||= "shortlog"; >> - $hash ||= validate_refname($refname); >> + $action ||= "shortlog"; >> + $hash ||= validate_refname($refname); >> + $hash_base ||= validate_refname($refname); >> } >> } > > This hunk is IMHO incorrect. First, $refname is _either_ $hash, or > $hash_base; it cannot be both. Second, in most cases (like the case > of 'shortlog' action, either explicit or implicit) it is simply $hash; > I think it can be $hash_base when $file_name is not set only in > singular exception case of 'tree' view for the top tree (lack of > filename is not an error, but is equivalent to $file_name='/'). OTOH, while setting both $hash and $hash_base has worked fine for me so far (because the right one is automatically used and apparently setting the other doesn't hurt), choosing which one to set is a much hairier case. Do you have suggestions for a better way to always make it work? >> @@ -544,37 +586,6 @@ evaluate_path_info(); >> our $git_dir; >> $git_dir = "$projectroot/$project" if $project; >> >> -# dispatch >> -my %actions = ( > [...] >> -); >> - >> if (!defined $action) { >> if (defined $hash) { >> $action = git_get_type($hash); > > I _think_ the '# dispatch' comment should be left here, and not moved > with the %actions hash. I agree. >> @@ -631,8 +642,15 @@ sub href (%) { >> if ($params{-replay}) { >> while (my ($name, $symbol) = each %mapping) { >> if (!exists $params{$name}) { >> - # to allow for multivalued params we use arrayref form >> - $params{$name} = [ $cgi->param($symbol) ]; >> + # the parameter we want to recycle may be either part of the >> + # list of CGI parameter, or recovered from PATH_INFO >> + if ($cgi->param($symbol)) { >> + # to allow for multivalued params we use arrayref form >> + $params{$name} = [ $cgi->param($symbol) ]; >> + } else { >> + no strict 'refs'; >> + $params{$name} = $$name if $$name; > > I would _perhaps_ add here comment that multivalued parameters can come > only from CGI query string, so there is no need for something like: > > + $params{$name} = (ref($$name) ? @$name : $$name) if $$name; > >> + } >> } >> } >> } > > This fragment is a bit of ugly code, hopefully corrected in later patch. > I think it would be better to have 'refactor parsing/validation of input > parameters' to be very fist patch in series; I am not sure but I suspect > that is a kind of bugfix for current "$project/$hash" ('shortlog' view) > and "$project/$hash_base:$file_name" ('blob_plain' and 'tree' view) > path_info. But implementing the path_info parsing first makes the input param refactoring SO much nicer that I would rather put a comment here saying "this code sucks: we should rather collect all input parameters" and then clean it up on the subsequent patch. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 9:43 ` Giuseppe Bilotta @ 2008-10-03 0:48 ` Jakub Narebski 2008-10-03 6:04 ` Giuseppe Bilotta 0 siblings, 1 reply; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 0:48 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: > On Thu, Oct 2, 2008 at 10:59 AM, Jakub Narebski <jnareb@gmail.com> wrote: >> Giuseppe Bilotta wrote: >>> - $action ||= "shortlog"; >>> - $hash ||= validate_refname($refname); >>> + $action ||= "shortlog"; >>> + $hash ||= validate_refname($refname); >>> + $hash_base ||= validate_refname($refname); >>> } >>> } >> >> This hunk is IMHO incorrect. First, $refname is _either_ $hash, or >> $hash_base; it cannot be both. Second, in most cases (like the case >> of 'shortlog' action, either explicit or implicit) it is simply $hash; >> I think it can be $hash_base when $file_name is not set only in >> singular exception case of 'tree' view for the top tree (lack of >> filename is not an error, but is equivalent to $file_name='/'). > > OTOH, while setting both $hash and $hash_base has worked fine for me > so far (because the right one is automatically used and apparently > setting the other doesn't hurt), choosing which one to set is a much > hairier case. Do you have suggestions for a better way to always make > it work? Well, it is either checking $action and setting either $hash or $hash_base, or setting both, with some comments on why and when it is needed (as discussed on #git). IIUC $hash_base is needed only for filename-taking tree actions which acts on top-tree, and therefore don't need $file_name, like 'project/tree/branch' or related 'project/history/branch' (the latter is practically almost equivalent to 'project/shortlog/branch' or 'project/branch'). I'm not sure if it wouldn't be better to call validate_refname($refname) once, either as: $hash_base ||= $hash ||= validate_refname($refname); but that might be incorrect in the obscure case of setting $hash via 'h' CGI query parameter, and letting gitweb to set-up $hash_base via path_info, so perhaps ($refname is local to evaluate_path_info, IIRC) $refname = validate_refname($refname); $hash ||= $refname; $hash_base ||= $refname; But that is just nitpicking this fragment of code to death. In short: either check which of $hash and $hash_base to set in this branch of conditional, or explain why setting both $hash and $hash_base is needed, and why it is acceptable, either as comments, or in commit message. >>> @@ -631,8 +642,15 @@ sub href (%) { >>> if ($params{-replay}) { >>> while (my ($name, $symbol) = each %mapping) { >>> if (!exists $params{$name}) { >>> - # to allow for multivalued params we use arrayref form >>> - $params{$name} = [ $cgi->param($symbol) ]; >>> + # the parameter we want to recycle may be either part of the >>> + # list of CGI parameter, or recovered from PATH_INFO >>> + if ($cgi->param($symbol)) { >>> + # to allow for multivalued params we use arrayref form >>> + $params{$name} = [ $cgi->param($symbol) ]; >>> + } else { >>> + no strict 'refs'; >>> + $params{$name} = $$name if $$name; >> >> I would _perhaps_ add here comment that multivalued parameters can come >> only from CGI query string, so there is no need for something like: >> >> + $params{$name} = (ref($$name) ? @$name : $$name) if $$name; >> >>> + } >>> } >>> } >>> } >> >> This fragment is a bit of ugly code, hopefully corrected in later patch. >> I think it would be better to have 'refactor parsing/validation of input >> parameters' to be very fist patch in series; I am not sure but I suspect >> that is a kind of bugfix for current "$project/$hash" ('shortlog' view) >> and "$project/$hash_base:$file_name" ('blob_plain' and 'tree' view) >> path_info. > > But implementing the path_info parsing first makes the input param > refactoring SO much nicer that I would rather put a comment here > saying "this code sucks: we should rather collect all input > parameters" and then clean it up on the subsequent patch. Why not cleanup first? When implementing href(..., -replay=>1) I have forgot that some of gitweb parameters are implicitly passed ($project, because it is needed in most gitweb links), and some can be passed via path_info ($hash and/or $hash_base, $file_name). Your code adds $action to the mix, but it doesn't change the fact that 1.) even before your code -replay case was incorrect for some path_info links (handcrafted, as gitweb generates only $project via path_info); 2.) code you have added is a bit ugly. Besides using variables change a little meaning of -replay, namely in your code gitweb always sets action, even for non-path_name links when we started from "default action" (i.e. without action set) links. I guess this is mainly theoretical issue, as I don't think that default views use many -replay links. P.S. with the idea of pushing parameters obtained not from CGI query string to $cgi->param() via "$cgi->param($name, $value);" or in named params form "$cgi->(-name=>$name, -value=>$value);" you would not need to change (a bit hacky, admittedly) href(...,-replay=>1) code. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-03 0:48 ` Jakub Narebski @ 2008-10-03 6:04 ` Giuseppe Bilotta 2008-10-03 10:31 ` Jakub Narebski 0 siblings, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-03 6:04 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Fri, Oct 3, 2008 at 2:48 AM, Jakub Narebski <jnareb@gmail.com> wrote: > On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: >> OTOH, while setting both $hash and $hash_base has worked fine for me >> so far (because the right one is automatically used and apparently >> setting the other doesn't hurt), choosing which one to set is a much >> hairier case. Do you have suggestions for a better way to always make >> it work? > > Well, it is either checking $action and setting either $hash or > $hash_base, or setting both, with some comments on why and when it is > needed (as discussed on #git). IIUC $hash_base is needed only for > filename-taking tree actions which acts on top-tree, and therefore > don't need $file_name, like 'project/tree/branch' or related > 'project/history/branch' (the latter is practically almost equivalent > to 'project/shortlog/branch' or 'project/branch'). > > I'm not sure if it wouldn't be better to call validate_refname($refname) > once, either as: > > $hash_base ||= $hash ||= validate_refname($refname); > > but that might be incorrect in the obscure case of setting $hash via 'h' > CGI query parameter, and letting gitweb to set-up $hash_base via > path_info, so perhaps ($refname is local to evaluate_path_info, IIRC) > > $refname = validate_refname($refname); > $hash ||= $refname; > $hash_base ||= $refname; I'll go with this. > But that is just nitpicking this fragment of code to death. In short: > either check which of $hash and $hash_base to set in this branch of > conditional, or explain why setting both $hash and $hash_base is needed, > and why it is acceptable, either as comments, or in commit message. Comment is probably better, as long as I remember to move it with the code it belongs to ;) >>>> @@ -631,8 +642,15 @@ sub href (%) { >>>> if ($params{-replay}) { >>>> while (my ($name, $symbol) = each %mapping) { >>>> if (!exists $params{$name}) { >>>> - # to allow for multivalued params we use arrayref form >>>> - $params{$name} = [ $cgi->param($symbol) ]; >>>> + # the parameter we want to recycle may be either part of the >>>> + # list of CGI parameter, or recovered from PATH_INFO >>>> + if ($cgi->param($symbol)) { >>>> + # to allow for multivalued params we use arrayref form >>>> + $params{$name} = [ $cgi->param($symbol) ]; >>>> + } else { >>>> + no strict 'refs'; >>>> + $params{$name} = $$name if $$name; >>> >>> I would _perhaps_ add here comment that multivalued parameters can come >>> only from CGI query string, so there is no need for something like: >>> >>> + $params{$name} = (ref($$name) ? @$name : $$name) if $$name; >>> >>>> + } >>>> } >>>> } >>>> } >>> >>> This fragment is a bit of ugly code, hopefully corrected in later patch. >>> I think it would be better to have 'refactor parsing/validation of input >>> parameters' to be very fist patch in series; I am not sure but I suspect >>> that is a kind of bugfix for current "$project/$hash" ('shortlog' view) >>> and "$project/$hash_base:$file_name" ('blob_plain' and 'tree' view) >>> path_info. >> >> But implementing the path_info parsing first makes the input param >> refactoring SO much nicer that I would rather put a comment here >> saying "this code sucks: we should rather collect all input >> parameters" and then clean it up on the subsequent patch. > > Why not cleanup first? Because cleaning it up depends on the refactoring, and the refactoring is much cleaner when path_info already handles $action too. > When implementing href(..., -replay=>1) I have forgot that some of > gitweb parameters are implicitly passed ($project, because it is needed > in most gitweb links), and some can be passed via path_info ($hash > and/or $hash_base, $file_name). Your code adds $action to the mix, but > it doesn't change the fact that 1.) even before your code -replay case > was incorrect for some path_info links (handcrafted, as gitweb generates > only $project via path_info); 2.) code you have added is a bit ugly. > > Besides using variables change a little meaning of -replay, namely > in your code gitweb always sets action, even for non-path_name links > when we started from "default action" (i.e. without action set) links. > I guess this is mainly theoretical issue, as I don't think that default > views use many -replay links. Ah the issue of the default action is something I hadn't taken into consideration, actually. Now the question is, should replay keep default -> default, or should it go with default -> last incantation? > P.S. with the idea of pushing parameters obtained not from CGI query > string to $cgi->param() via "$cgi->param($name, $value);" or in named > params form "$cgi->(-name=>$name, -value=>$value);" you would not need > to change (a bit hacky, admittedly) href(...,-replay=>1) code. Yes, but it would muddy the waters about 'where did this parameter come from' in case we ever need to know that. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-03 6:04 ` Giuseppe Bilotta @ 2008-10-03 10:31 ` Jakub Narebski 0 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-03 10:31 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Fri, 3 Oct 2008, Giuseppe Bilotta wrote: > On Fri, Oct 3, 2008 at 2:48 AM, Jakub Narebski <jnareb@gmail.com> wrote: >> On Thu, 2 Oct 2008, Giuseppe Bilotta wrote: >>> But implementing the path_info parsing first makes the input param >>> refactoring SO much nicer that I would rather put a comment here >>> saying "this code sucks: we should rather collect all input >>> parameters" and then clean it up on the subsequent patch. >> >> Why not cleanup first? > > Because cleaning it up depends on the refactoring, and the refactoring > is much cleaner when path_info already handles $action too. Hmmm... it looks like after refactoring you don't have to change href() at all when adding $action to parameters gitweb can get from path_info; all that having refactoring (cleanup) upfront changes in/for this patch is a bit different saving action (also in %input_params) and lack of this ugly and I think subtly wrong added code for href(...,-replay=>1). >> When implementing href(..., -replay=>1) I have forgot that some of >> gitweb parameters are implicitly passed ($project, because it is needed >> in most gitweb links), and some can be passed via path_info ($hash >> and/or $hash_base, $file_name). Your code adds $action to the mix, but >> it doesn't change the fact that 1.) even before your code -replay case >> was incorrect for some path_info links (handcrafted, as gitweb generates >> only $project via path_info); 2.) code you have added is a bit ugly. >> >> Besides using variables change a little meaning of -replay, namely >> in your code gitweb always sets action, even for non-path_name links >> when we started from "default action" (i.e. without action set) links. >> I guess this is mainly theoretical issue, as I don't think that default >> views use many -replay links. > > Ah the issue of the default action is something I hadn't taken into > consideration, actually. Now the question is, should replay keep > default -> default, or should it go with default -> last incantation? I think we should use default -> default. Besides I think there can also be an issue ("can" because I am not sure if in practice it is a problem) the fact that gitweb sometimes expand parameters to sha-1, for example setting $head to git_get_head_hash() if it is not set (default param value), and not to 'HEAD'. This perhaps should be changed, but to be on safer side better not to use 'action variables' because some code treats them as temporary variables. >> P.S. with the idea of pushing parameters obtained not from CGI query >> string to $cgi->param() via "$cgi->param($name, $value);" or in named >> params form "$cgi->(-name=>$name, -value=>$value);" you would not need >> to change (a bit hacky, admittedly) href(...,-replay=>1) code. > > Yes, but it would muddy the waters about 'where did this parameter > come from' in case we ever need to know that. True. Like for example implementing -faithful_replay where if parameter was passed through path_info it is replayed through path_info, and if it was passed through query string it is replayed as CGI query param. After thinking a bit about that I think that %input_params idea is superior to both $cgi->params(-name=>..., -value=>...) and to have either no strict refs $$name or name to action variable ref hash. See also my comment for the refactoring patch. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 0:10 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: refactor input parameters parse/validation Giuseppe Bilotta 2008-10-02 8:59 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Jakub Narebski @ 2008-10-02 15:34 ` Petr Baudis 2008-10-02 19:30 ` Giuseppe Bilotta 2 siblings, 1 reply; 34+ messages in thread From: Petr Baudis @ 2008-10-02 15:34 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Jakub Narebski, Junio C Hamano, Shawn O. Pearce On Thu, Oct 02, 2008 at 02:10:29AM +0200, Giuseppe Bilotta wrote: > This patch enables gitweb to parse URLs with more information embedded > in PATH_INFO, reducing the need for CGI parameters. The typical gitweb > path is now $project/$action/$hash_base:$file_name or > $project/$action/$hash > > This is mostly backwards compatible with the old-style gitweb paths, > except when $project/$branch was used to access a branch whose name > matches a gitweb action. > > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Just nits... > --- > gitweb/gitweb.perl | 90 +++++++++++++++++++++++++++++++--------------------- > 1 files changed, 54 insertions(+), 36 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index e7e4d6b..f088681 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -519,9 +550,19 @@ sub evaluate_path_info { > # do not change any parameters if an action is given using the query string > return if $action; > $path_info =~ s,^\Q$project\E/*,,; > + > + # next comes the action > + $action = $path_info; > + $action =~ s,/.*$,,; > + if (exists $actions{$action}) { > + $path_info =~ s,^$action/*,,; > + } else { > + $action = undef; Extra whitespace. > + } > + > my ($refname, $pathname) = split(/:/, $path_info, 2); > if (defined $pathname) { > - # we got "project.git/branch:filename" or "project.git/branch:dir/" > + # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" > # we could use git_get_type(branch:pathname), but it needs $git_dir > $pathname =~ s,^/+,,; > if (!$pathname || substr($pathname, -1) eq "/") { But the old variant is still possible, maybe the comments should indicate that the action/ part is optional. > @@ -534,8 +575,9 @@ sub evaluate_path_info { > $file_name ||= validate_pathname($pathname); > } elsif (defined $refname) { > # we got "project.git/branch" > - $action ||= "shortlog"; > - $hash ||= validate_refname($refname); > + $action ||= "shortlog"; > + $hash ||= validate_refname($refname); > + $hash_base ||= validate_refname($refname); > } > } > evaluate_path_info(); What is this good for? -- Petr "Pasky" Baudis People who take cold baths never have rheumatism, but they have cold baths. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 15:34 ` Petr Baudis @ 2008-10-02 19:30 ` Giuseppe Bilotta 2008-10-02 20:56 ` Petr Baudis 0 siblings, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 19:30 UTC (permalink / raw) To: Petr Baudis; +Cc: git, Jakub Narebski, Junio C Hamano, Shawn O. Pearce On Thu, Oct 2, 2008 at 5:34 PM, Petr Baudis <pasky@suse.cz> wrote: > Just nits... >> + $action = undef; > > Extra whitespace. Right, fixed. >> + } >> + >> my ($refname, $pathname) = split(/:/, $path_info, 2); >> if (defined $pathname) { >> - # we got "project.git/branch:filename" or "project.git/branch:dir/" >> + # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" >> # we could use git_get_type(branch:pathname), but it needs $git_dir >> $pathname =~ s,^/+,,; >> if (!$pathname || substr($pathname, -1) eq "/") { > > But the old variant is still possible, maybe the comments should > indicate that the action/ part is optional. I put the action/ part in square brackets. (e.g. project.git/[action/]branch:filename). Is this good enough? >> @@ -534,8 +575,9 @@ sub evaluate_path_info { >> $file_name ||= validate_pathname($pathname); >> } elsif (defined $refname) { >> # we got "project.git/branch" >> - $action ||= "shortlog"; >> - $hash ||= validate_refname($refname); >> + $action ||= "shortlog"; >> + $hash ||= validate_refname($refname); >> + $hash_base ||= validate_refname($refname); >> } >> } >> evaluate_path_info(); > > What is this good for? The purpose of what? setting both $hash and $hash_base was something that I found was needed in some extreme cases, as discussed with Jakub. Proposals for recommended cleaner but equally fast way to handle it. If you're referring to the whitespace, I was just lining up the entries. Should I do it in a separate patch? -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 19:30 ` Giuseppe Bilotta @ 2008-10-02 20:56 ` Petr Baudis 2008-10-02 21:05 ` Giuseppe Bilotta 0 siblings, 1 reply; 34+ messages in thread From: Petr Baudis @ 2008-10-02 20:56 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Jakub Narebski, Junio C Hamano, Shawn O. Pearce On Thu, Oct 02, 2008 at 09:30:42PM +0200, Giuseppe Bilotta wrote: > On Thu, Oct 2, 2008 at 5:34 PM, Petr Baudis <pasky@suse.cz> wrote: > > > Just nits... > >> + } > >> + > >> my ($refname, $pathname) = split(/:/, $path_info, 2); > >> if (defined $pathname) { > >> - # we got "project.git/branch:filename" or "project.git/branch:dir/" > >> + # we got "project.git/action/branch:filename" or "project.git/action/branch:dir/" > >> # we could use git_get_type(branch:pathname), but it needs $git_dir > >> $pathname =~ s,^/+,,; > >> if (!$pathname || substr($pathname, -1) eq "/") { > > > > But the old variant is still possible, maybe the comments should > > indicate that the action/ part is optional. > > I put the action/ part in square brackets. (e.g. > project.git/[action/]branch:filename). Is this good enough? That's perfect. > >> @@ -534,8 +575,9 @@ sub evaluate_path_info { > >> $file_name ||= validate_pathname($pathname); > >> } elsif (defined $refname) { > >> # we got "project.git/branch" > >> - $action ||= "shortlog"; > >> - $hash ||= validate_refname($refname); > >> + $action ||= "shortlog"; > >> + $hash ||= validate_refname($refname); > >> + $hash_base ||= validate_refname($refname); > >> } > >> } > >> evaluate_path_info(); > > > > What is this good for? > > The purpose of what? setting both $hash and $hash_base was something > that I found was needed in some extreme cases, as discussed with > Jakub. Proposals for recommended cleaner but equally fast way to > handle it. If you're referring to the whitespace, I was just lining up > the entries. Should I do it in a separate patch? I refer to the setting of $hash_base (I'm not huge fan of the whitespace aligning, but I don't really care). What extreme cases are these? I think you should describe that in the code since it's not really obvious. Maybe I could find it in older threads but I should understand the code just from reading it. :-) -- Petr "Pasky" Baudis People who take cold baths never have rheumatism, but they have cold baths. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 20:56 ` Petr Baudis @ 2008-10-02 21:05 ` Giuseppe Bilotta 2008-10-02 22:04 ` Petr Baudis 0 siblings, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 21:05 UTC (permalink / raw) To: Petr Baudis; +Cc: git, Jakub Narebski, Junio C Hamano, Shawn O. Pearce >> >> @@ -534,8 +575,9 @@ sub evaluate_path_info { >> >> $file_name ||= validate_pathname($pathname); >> >> } elsif (defined $refname) { >> >> # we got "project.git/branch" >> >> - $action ||= "shortlog"; >> >> - $hash ||= validate_refname($refname); >> >> + $action ||= "shortlog"; >> >> + $hash ||= validate_refname($refname); >> >> + $hash_base ||= validate_refname($refname); >> >> } >> >> } >> >> evaluate_path_info(); >> > >> > What is this good for? >> >> The purpose of what? setting both $hash and $hash_base was something >> that I found was needed in some extreme cases, as discussed with >> Jakub. Proposals for recommended cleaner but equally fast way to >> handle it. If you're referring to the whitespace, I was just lining up >> the entries. Should I do it in a separate patch? > > I refer to the setting of $hash_base (I'm not huge fan of the whitespace > aligning, but I don't really care). What extreme cases are these? > I think you should describe that in the code since it's not really > obvious. Maybe I could find it in older threads but I should understand > the code just from reading it. :-) In preparing the new patchset, I've put a big comment block explaining why we need to set both $hash and $hash_base in this code-path: # we got "project.git/[action/]branch". in this case # we set both $hash and $hash_base because different actions # need one or the other to be set to behave correctly. # # For example, if $hash_base is not set then the blob and # history links on the page project.git/tree/somebranch will # assume a $hash_base of HEAD instead of the correct # somebranch. # Conversely, not setting $hash will make URLs such as # project.git/shortlog/somebranch display the wrong page. # # Since it also turns out that the unused one is properly # overwritten as needed, setting both is quite safe. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 21:05 ` Giuseppe Bilotta @ 2008-10-02 22:04 ` Petr Baudis 2008-10-02 22:41 ` Jakub Narebski 0 siblings, 1 reply; 34+ messages in thread From: Petr Baudis @ 2008-10-02 22:04 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Jakub Narebski, Junio C Hamano, Shawn O. Pearce On Thu, Oct 02, 2008 at 11:05:18PM +0200, Giuseppe Bilotta wrote: > In preparing the new patchset, I've put a big comment block explaining > why we need to set both $hash and $hash_base in this code-path: > > # we got "project.git/[action/]branch". in this case > # we set both $hash and $hash_base because different actions > # need one or the other to be set to behave correctly. > # > # For example, if $hash_base is not set then the blob and > # history links on the page project.git/tree/somebranch will > # assume a $hash_base of HEAD instead of the correct > # somebranch. > # Conversely, not setting $hash will make URLs such as > # project.git/shortlog/somebranch display the wrong page. > # > # Since it also turns out that the unused one is properly > # overwritten as needed, setting both is quite safe. Ok, but is this related to the pathinfo changes? Or is this fixing a separate bug? (Not that I would want to bother you with splitting this, but you should at least mention it in the commit message... and it's fairly isolated anyway.) -- Petr "Pasky" Baudis People who take cold baths never have rheumatism, but they have cold baths. ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 22:04 ` Petr Baudis @ 2008-10-02 22:41 ` Jakub Narebski 2008-10-03 5:54 ` Giuseppe Bilotta 0 siblings, 1 reply; 34+ messages in thread From: Jakub Narebski @ 2008-10-02 22:41 UTC (permalink / raw) To: Petr Baudis; +Cc: Giuseppe Bilotta, git, Junio C Hamano, Shawn O. Pearce Petr Baudis wrote: > On Thu, Oct 02, 2008 at 11:05:18PM +0200, Giuseppe Bilotta wrote: > > > > In preparing the new patchset, I've put a big comment block explaining > > why we need to set both $hash and $hash_base in this code-path: > > > > # we got "project.git/[action/]branch". in this case > > # we set both $hash and $hash_base because different actions > > # need one or the other to be set to behave correctly. > > # > > # For example, if $hash_base is not set then the blob and 'blob' without $file_name doesn't have sense, but 'tree' does. Probably should be s/blob/tree/ above. > > # history links on the page project.git/tree/somebranch will > > # assume a $hash_base of HEAD instead of the correct > > # somebranch. > > # Conversely, not setting $hash will make URLs such as > > # project.git/shortlog/somebranch display the wrong page. > > # > > # Since it also turns out that the unused one is properly > > # overwritten as needed, setting both is quite safe. > > Ok, but is this related to the pathinfo changes? Or is this fixing a > separate bug? (Not that I would want to bother you with splitting this, > but you should at least mention it in the commit message... and it's > fairly isolated anyway.) Yes, it related to path_info changes. With current code the only way to get to no $file_name branch of evaluate_path_info() code was to have URL which looks like project/branch, for which 'shortlog' action was chosen (if not specified by CGI query parameters), and for 'shortlog' action the correct way was to use $hash. Now there can be project/tree/branch to show top directory of given branch, and this as described required $hash_base set. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO 2008-10-02 22:41 ` Jakub Narebski @ 2008-10-03 5:54 ` Giuseppe Bilotta 0 siblings, 0 replies; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-03 5:54 UTC (permalink / raw) To: Jakub Narebski; +Cc: Petr Baudis, git, Junio C Hamano, Shawn O. Pearce On Fri, Oct 3, 2008 at 12:41 AM, Jakub Narebski <jnareb@gmail.com> wrote: > Petr Baudis wrote: >> On Thu, Oct 02, 2008 at 11:05:18PM +0200, Giuseppe Bilotta wrote: >> > >> > In preparing the new patchset, I've put a big comment block explaining >> > why we need to set both $hash and $hash_base in this code-path: >> > >> > # we got "project.git/[action/]branch". in this case >> > # we set both $hash and $hash_base because different actions >> > # need one or the other to be set to behave correctly. >> > # >> > # For example, if $hash_base is not set then the blob and > > 'blob' without $file_name doesn't have sense, but 'tree' does. > Probably should be s/blob/tree/ above. > >> > # history links on the page project.git/tree/somebranch will >> > # assume a $hash_base of HEAD instead of the correct >> > # somebranch. Note: the blob and history links _in a tree page_ are wrong. The page itself is correct. What this means is that if you go to project.git/tree/somebranch you get the correct list of files and directories for the project at somebranch's HEAD, but all the links IN that page will point to the wrong version, because they will have no $hash_base set and will thus default to HEAD. You can see this effect for example by going to http://repo.or.cz/w/git.git/v1.4.0?a=tree <- this hand-crafted path would never be generated by the old pathinfo code, but it show exactly what would happen with my pahtinfo code on http://repo.or.cz/w/git.git/tree/v1.4.0 if $hash_base wasn't set. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: PATH_INFO support improvements 2008-10-02 0:10 [PATCHv4] gitweb: PATH_INFO support improvements Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Giuseppe Bilotta @ 2008-10-02 8:19 ` Jakub Narebski 2008-10-02 8:49 ` Giuseppe Bilotta 1 sibling, 1 reply; 34+ messages in thread From: Jakub Narebski @ 2008-10-02 8:19 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce Giuseppe Bilotta wrote: > Fourth version of my gitweb PATH_INFO patchset, whose purpose is to > reduce the use of CGI parameters by embedding as many parameters as > possible in the URL path itself, provided the pathinfo feature is > enabled. A nit: when sending longer patch series you should use numbered format in the form of [PATCH m/n] or [PATCH m/n vX] prefix. > > The new typical gitweb URL is therefore in the form > > $project/$action/$parent:$file..$hash:$file > > (with useless parts stripped). Backwards compatibility for old-style > $project/$hash URLs is kept, as long as $hash is not a refname whose > name happens to match a git action. Minor nit: there was also old-style $project/$hash_base:$file_name path_info format. > > The main implementation is provided by paired patches (#1#3, #5#6) > that implement parsing and generation of the new style URLs. > > Patch #2 deals with a refactoring of the input parameters parsing and > validation, so that the rest of gitweb can be agnostic wrt to the > parameters' origin (CGI vs PATH_INFO vs possible other future inputs > such as CLI). > > Patch #4 is a minor improvement to the URL syntax that allows web > documents to be properly browsable in raw mode. Very nice summary of patchset and patch coverage in this cover letter. > > Giuseppe Bilotta (6): > gitweb: parse project/action/hash_base:filename PATH_INFO > gitweb: refactor input parameters parse/validation > gitweb: generate project/action/hash URLs > gitweb: use_pathinfo filenames start with / > gitweb: parse parent..current syntax from pathinfo > gitweb: generate parent..current URLs -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: PATH_INFO support improvements 2008-10-02 8:19 ` [PATCHv4] gitweb: PATH_INFO support improvements Jakub Narebski @ 2008-10-02 8:49 ` Giuseppe Bilotta 2008-10-02 10:16 ` Jakub Narebski 0 siblings, 1 reply; 34+ messages in thread From: Giuseppe Bilotta @ 2008-10-02 8:49 UTC (permalink / raw) To: Jakub Narebski; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce On Thu, Oct 2, 2008 at 10:19 AM, Jakub Narebski <jnareb@gmail.com> wrote: > Giuseppe Bilotta wrote: > >> Fourth version of my gitweb PATH_INFO patchset, whose purpose is to >> reduce the use of CGI parameters by embedding as many parameters as >> possible in the URL path itself, provided the pathinfo feature is >> enabled. > > A nit: when sending longer patch series you should use numbered > format in the form of [PATCH m/n] or [PATCH m/n vX] prefix. W00t, I still manage to get this wrong. Kudos to me 8-/ I wonder why these options are not the default when there is more than one patch, btw? (And yes, I tried looking into the builtin-log.c code but making it automatic is somewhat less trivial than I can dedicate time to.) >> The new typical gitweb URL is therefore in the form >> >> $project/$action/$parent:$file..$hash:$file >> >> (with useless parts stripped). Backwards compatibility for old-style >> $project/$hash URLs is kept, as long as $hash is not a refname whose >> name happens to match a git action. > > Minor nit: there was also old-style $project/$hash_base:$file_name > path_info format. Right, forgot about that. >> The main implementation is provided by paired patches (#1#3, #5#6) >> that implement parsing and generation of the new style URLs. >> >> Patch #2 deals with a refactoring of the input parameters parsing and >> validation, so that the rest of gitweb can be agnostic wrt to the >> parameters' origin (CGI vs PATH_INFO vs possible other future inputs >> such as CLI). >> >> Patch #4 is a minor improvement to the URL syntax that allows web >> documents to be properly browsable in raw mode. > > Very nice summary of patchset and patch coverage in this cover letter. Thanks. At least I'm learning from my past errors. I'll manage to send the perfect patchset sooner or later ;) -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCHv4] gitweb: PATH_INFO support improvements 2008-10-02 8:49 ` Giuseppe Bilotta @ 2008-10-02 10:16 ` Jakub Narebski 0 siblings, 0 replies; 34+ messages in thread From: Jakub Narebski @ 2008-10-02 10:16 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: git, Petr Baudis, Junio C Hamano, Shawn O. Pearce Giuseppe Bilotta wrote: > On Thu, Oct 2, 2008 at 10:19 AM, Jakub Narebski wrote: > > > > A nit: when sending longer patch series you should use numbered > > format in the form of [PATCH m/n] or [PATCH m/n vX] prefix. > > W00t, I still manage to get this wrong. Kudos to me 8-/ > > I wonder why these options are not the default when there is more than > one patch, btw? > > (And yes, I tried looking into the builtin-log.c code but making it > automatic is somewhat less trivial than I can dedicate time to.) Hmmm... I thought that format.numbered config variable is 'auto' by default; I guess it isn't (just so you know where to look to change it ;-)). I have the following in the .git/config / ~/.gitconfig: [format] numbered = auto suffix = .txt -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2008-10-06 0:18 UTC | newest] Thread overview: 34+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-10-02 0:10 [PATCHv4] gitweb: PATH_INFO support improvements Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: refactor input parameters parse/validation Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: generate project/action/hash URLs Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Giuseppe Bilotta 2008-10-02 0:10 ` [PATCHv4] gitweb: generate parent..current URLs Giuseppe Bilotta 2008-10-06 0:17 ` Jakub Narebski 2008-10-04 1:31 ` [PATCHv4] gitweb: parse parent..current syntax from pathinfo Jakub Narebski 2008-10-04 7:24 ` Giuseppe Bilotta 2008-10-04 7:48 ` Jakub Narebski 2008-10-05 8:19 ` Jakub Narebski 2008-10-03 11:28 ` [PATCHv4] gitweb: use_pathinfo filenames start with / Jakub Narebski 2008-10-03 1:48 ` [PATCHv4] gitweb: generate project/action/hash URLs Jakub Narebski [not found] ` <cb7bb73a0810022330l498bdb20h703dec7833a443e@mail.gmail.com> 2008-10-03 11:24 ` Jakub Narebski 2008-10-04 1:15 ` Jakub Narebski 2008-10-03 1:36 ` [PATCHv4] gitweb: refactor input parameters parse/validation Jakub Narebski 2008-10-03 7:24 ` Giuseppe Bilotta 2008-10-03 11:20 ` Jakub Narebski 2008-10-02 8:59 ` [PATCHv4] gitweb: parse project/action/hash_base:filename PATH_INFO Jakub Narebski 2008-10-02 9:43 ` Giuseppe Bilotta 2008-10-03 0:48 ` Jakub Narebski 2008-10-03 6:04 ` Giuseppe Bilotta 2008-10-03 10:31 ` Jakub Narebski 2008-10-02 15:34 ` Petr Baudis 2008-10-02 19:30 ` Giuseppe Bilotta 2008-10-02 20:56 ` Petr Baudis 2008-10-02 21:05 ` Giuseppe Bilotta 2008-10-02 22:04 ` Petr Baudis 2008-10-02 22:41 ` Jakub Narebski 2008-10-03 5:54 ` Giuseppe Bilotta 2008-10-02 8:19 ` [PATCHv4] gitweb: PATH_INFO support improvements Jakub Narebski 2008-10-02 8:49 ` Giuseppe Bilotta 2008-10-02 10:16 ` Jakub Narebski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).