* [RFC/PATCH 2/2] gitweb: check given hash before trying to create snapshot @ 2009-09-10 21:20 Mark Rada 2009-09-11 7:52 ` Jakub Narebski 0 siblings, 1 reply; 4+ messages in thread From: Mark Rada @ 2009-09-10 21:20 UTC (permalink / raw) To: git; +Cc: Jakub Narebski, Junio C Hamano This patch is dual purposed. First, it makes things nicer in cases when you hand craft the snapshot URL but make a typo (e.g. netx instead of next); you will now get an error message instead of a broken tarball. Second, any given treeish will always be translated to the full length, unambiguous, hash id; this will be useful for things like creating unique names for snapshot caches. This patch includes test for t9501 to demonstrate the changed functionality. Signed-off-by: Mark Rada <marada@uwaterloo.ca> --- gitweb/gitweb.perl | 5 +++-- t/t9501-gitweb-standalone-http-status.sh | 26 ++++++++++++++++++++++++++ 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl index d650188..4ae960c 100755 --- a/gitweb/gitweb.perl +++ b/gitweb/gitweb.perl @@ -5197,8 +5197,9 @@ sub git_snapshot { die_error(403, "Unsupported snapshot format"); } - if (!defined $hash) { - $hash = &git_get_hash($project); + my $snapshot = &git_get_hash($project, $hash); + if (!$snapshot) { + die_error(400, "Not a valid hash id: $hash"); } my $name = $project; diff --git a/t/t9501-gitweb-standalone-http-status.sh b/t/t9501-gitweb-standalone-http-status.sh index d0ff21d..4f8f147 100644 --- a/t/t9501-gitweb-standalone-http-status.sh +++ b/t/t9501-gitweb-standalone-http-status.sh @@ -75,4 +75,30 @@ test_expect_success \ test_debug 'cat gitweb.output' +test_expect_success \ + 'snapshots: bad treeish id' \ + 'gitweb_run "p=.git;a=snapshot;h=frizzumFrazzum;sf=tgz" && + grep "400 - Not a valid hash id:" gitweb.output' +test_debug 'cat gitweb.output' + +test_expect_success \ + 'snapshots: good treeish id' \ + 'gitweb_run "p=.git;a=snapshot;h=master;sf=tgz" && + grep "Status: 200 OK" gitweb.output' +test_debug 'cat gitweb.output' + +test_expect_success \ + 'snapshots: good object id' \ + 'ID=`git rev-parse --verify HEAD` && + gitweb_run "p=.git;a=snapshot;h=$ID;sf=tgz" && + grep "Status: 200 OK" gitweb.output' +test_debug 'cat gitweb.output' + +test_expect_success \ + 'snapshots: bad object id' \ + 'gitweb_run "p=.git;a=snapshot;h=abcdef01234;sf=tgz" && + grep "400 - Not a valid hash id:" gitweb.output' +test_debug 'cat gitweb.output' + + test_done -- 1.6.4.2 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [RFC/PATCH 2/2] gitweb: check given hash before trying to create snapshot 2009-09-10 21:20 [RFC/PATCH 2/2] gitweb: check given hash before trying to create snapshot Mark Rada @ 2009-09-11 7:52 ` Jakub Narebski 2009-09-11 15:44 ` Mark Rada 0 siblings, 1 reply; 4+ messages in thread From: Jakub Narebski @ 2009-09-11 7:52 UTC (permalink / raw) To: Mark Rada; +Cc: git, Junio C Hamano On Thu, 10 Sep 2009, Mark Rada wrote: > This patch is dual purposed. > > First, it makes things nicer in cases when you hand craft the snapshot > URL but make a typo (e.g. netx instead of next); you will now get an > error message instead of a broken tarball. This is a very good idea. > > Second, any given treeish will always be translated to the full length, > unambiguous, hash id; this will be useful for things like creating > unique names for snapshot caches. But this is not a good idea, IMHO. First, it introduces feature that nobody uses (at least yet); we can introduce this feature when it is needed instead. Second, I'd rather have better names for snapshots than using full SHA-1. For snapshot of 'v1.5.0' of repository 'repo.git' I'd prefer for snapshot to be named 'repo-v1.5.0', and for snapshot of 'next' branch of the same project to be named for example 'repo-next-20090909', or perhaps 'repo-next-2009-09-10T09:16:18' or 'repo-next-20090909-g5f6b0ff', or 'repo-v1.6.5-rc0-164-g5f6b0ff'. I'm not sure what would be the best name of snapshot of given subdirectory... In short: I'd rather not improve on bad design of using full SHA-1 in snapshot name. > > This patch includes test for t9501 to demonstrate the changed > functionality. > > Signed-off-by: Mark Rada <marada@uwaterloo.ca> > --- > gitweb/gitweb.perl | 5 +++-- > t/t9501-gitweb-standalone-http-status.sh | 26 ++++++++++++++++++++++++++ > 2 files changed, 29 insertions(+), 2 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index d650188..4ae960c 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -5197,8 +5197,9 @@ sub git_snapshot { > die_error(403, "Unsupported snapshot format"); > } > > - if (!defined $hash) { > - $hash = &git_get_hash($project); > + my $snapshot = &git_get_hash($project, $hash); Same comment as for PATCH 1/2: don't use '&' subroutine call if it is not required. > + if (!$snapshot) { > + die_error(400, "Not a valid hash id: $hash"); Note that we don't use user input in _any_ of other error messages; you would probably need to sanitize $hash. By the way, wouldn't 404 (Not Found) be a better error code? > } > > my $name = $project; > diff --git a/t/t9501-gitweb-standalone-http-status.sh b/t/t9501-gitweb-standalone-http-status.sh > index d0ff21d..4f8f147 100644 > --- a/t/t9501-gitweb-standalone-http-status.sh > +++ b/t/t9501-gitweb-standalone-http-status.sh > @@ -75,4 +75,30 @@ test_expect_success \ > test_debug 'cat gitweb.output' > > > +test_expect_success \ > + 'snapshots: bad treeish id' \ > + 'gitweb_run "p=.git;a=snapshot;h=frizzumFrazzum;sf=tgz" && > + grep "400 - Not a valid hash id:" gitweb.output' > +test_debug 'cat gitweb.output' > + > +test_expect_success \ > + 'snapshots: good treeish id' \ > + 'gitweb_run "p=.git;a=snapshot;h=master;sf=tgz" && > + grep "Status: 200 OK" gitweb.output' > +test_debug 'cat gitweb.output' Why you don't check for "Status: 400" too? > + > +test_expect_success \ > + 'snapshots: good object id' \ > + 'ID=`git rev-parse --verify HEAD` && > + gitweb_run "p=.git;a=snapshot;h=$ID;sf=tgz" && > + grep "Status: 200 OK" gitweb.output' > +test_debug 'cat gitweb.output' > + > +test_expect_success \ > + 'snapshots: bad object id' \ > + 'gitweb_run "p=.git;a=snapshot;h=abcdef01234;sf=tgz" && > + grep "400 - Not a valid hash id:" gitweb.output' > +test_debug 'cat gitweb.output' > + > + > test_done > -- > 1.6.4.2 > > -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC/PATCH 2/2] gitweb: check given hash before trying to create snapshot 2009-09-11 7:52 ` Jakub Narebski @ 2009-09-11 15:44 ` Mark Rada 2009-09-11 16:42 ` Jakub Narebski 0 siblings, 1 reply; 4+ messages in thread From: Mark Rada @ 2009-09-11 15:44 UTC (permalink / raw) To: Jakub Narebski; +Cc: Mark Rada, git, Junio C Hamano On 2009-09-11, at 3:52 AM, Jakub Narebski wrote: > Second, I'd rather have better names for snapshots than using full > SHA-1. > For snapshot of 'v1.5.0' of repository 'repo.git' I'd prefer for > snapshot > to be named 'repo-v1.5.0', and for snapshot of 'next' branch of the > same > project to be named for example 'repo-next-20090909', or perhaps > 'repo-next-2009-09-10T09:16:18' or 'repo-next-20090909-g5f6b0ff', > or 'repo-v1.6.5-rc0-164-g5f6b0ff'. Ah, yeah, well, I let $hash still hold the originally passed value, which would be used to create the outputted file name (unless overwritten by the client by using curl -o or something). Then $snapshot holds the full hash. This way, if you were to type something like http://git.kernel.org/?p=git/git.git;a=snapshot;h=next;sf=tgz into your browser window, you would get git.git-next.tar.gz back, but the backend could look up something like git-5f6b0ffff13f5cd762d0a5a4e1c4dede58e8a537.tar.gz using the $snapshot variable in some hypothetical cache (or even without the cache it won't mangle the nicer name $hash might have). Also, right now gitweb will not accept tags for hashes. This seems to be because it passes the --verify option to rev-parse, but the output from using and not using the verify option seems to be the same (other than also accepting all tree-ishes). Could you let me know if there is a good reason not to take off the --verify option? Otherwise, I would like to take it off in the next version of this patch. Your point about adding the short hash to snapshots of branch heads is cool, I'll try that for the next version of the patch. >> } >> >> my $name = $project; >> diff --git a/t/t9501-gitweb-standalone-http-status.sh b/t/t9501- >> gitweb-standalone-http-status.sh >> index d0ff21d..4f8f147 100644 >> --- a/t/t9501-gitweb-standalone-http-status.sh >> +++ b/t/t9501-gitweb-standalone-http-status.sh >> @@ -75,4 +75,30 @@ test_expect_success \ >> test_debug 'cat gitweb.output' >> >> >> +test_expect_success \ >> + 'snapshots: bad treeish id' \ >> + 'gitweb_run "p=.git;a=snapshot;h=frizzumFrazzum;sf=tgz" && >> + grep "400 - Not a valid hash id:" gitweb.output' >> +test_debug 'cat gitweb.output' >> + >> +test_expect_success \ >> + 'snapshots: good treeish id' \ >> + 'gitweb_run "p=.git;a=snapshot;h=master;sf=tgz" && >> + grep "Status: 200 OK" gitweb.output' >> +test_debug 'cat gitweb.output' > > Why you don't check for "Status: 400" too? I'm not sure which test you are referring to (I think the second). The second test is valid and should return a nice .git-master.tar.gz tarball. >> Second, any given treeish will always be translated to the full >> length, >> unambiguous, hash id; this will be useful for things like creating >> unique names for snapshot caches. > > But this is not a good idea, IMHO. > > First, it introduces feature that nobody uses (at least yet); we can > introduce this feature when it is needed instead. Sorry for promoting vapourware, I did originally rip this patch out from something else. I removed the comment from the v2 commit message. -- Mark Rada (ferrous26) marada@uwaterloo.ca ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC/PATCH 2/2] gitweb: check given hash before trying to create snapshot 2009-09-11 15:44 ` Mark Rada @ 2009-09-11 16:42 ` Jakub Narebski 0 siblings, 0 replies; 4+ messages in thread From: Jakub Narebski @ 2009-09-11 16:42 UTC (permalink / raw) To: Mark Rada; +Cc: git, Junio C Hamano [This mail was very strangely wrapped; I fixed this for readability] On Fri, 11 Sep 2009, Mark Rada wrote: > On 2009-09-11, at 3:52 AM, Jakub Narebski wrote: > >> Second, I'd rather have better names for snapshots than using full SHA-1. >> For snapshot of 'v1.5.0' of repository 'repo.git' I'd prefer for snapshot >> to be named 'repo-v1.5.0', and for snapshot of 'next' branch of the >> same project to be named for example 'repo-next-20090909', or perhaps >> 'repo-next-2009-09-10T09:16:18' or 'repo-next-20090909-g5f6b0ff', >> or 'repo-v1.6.5-rc0-164-g5f6b0ff'. > > Ah, yeah, well, I let $hash still hold the originally passed value, > which would be used to create the outputted file name (unless > overwritten by the client by using curl -o or something). Or by specifying different file name than proposed by browser. > Then $snapshot holds the full hash. This way, if you were to type > something like > > http://git.kernel.org/?p=git/git.git;a=snapshot;h=next;sf=tgz > > into your browser window, you would get git.git-next.tar.gz back, but > the backend could look up something like > > git-5f6b0ffff13f5cd762d0a5a4e1c4dede58e8a537.tar.gz > > using the $snapshot variable in some hypothetical cache (or even > without the cache it won't mangle the nicer name $hash might have). O.K. > > Also, right now gitweb will not accept tags for hashes. This seems to be > because it passes the --verify option to rev-parse, but the output > from using and not using the verify option seems to be the same (other > than also accepting all tree-ishes). Could you let me know if there is > a good reason not to take off the --verify option? Otherwise, I would > like to take it off in the next version of this patch. Errr, what? $ 5096:[gitweb/web@git]# git rev-parse --verify v1.5.0 6db027ffe03210324939b3dd655c4223ca023b45 $ git rev-parse --verify refs/tags/v1.5.0 6db027ffe03210324939b3dd655c4223ca023b45 So it works as intended. The problem must be in some other place. The '--verify' option is needed because git-rev-parse would otherwise pass parameters it does not understand 'as is'. Compare $ git rev-parse --verify v9.9.9 2>/dev/null $ git rev-parse v9.9.9 2>/dev/null v9.9.9 > > Your point about adding the short hash to snapshots of branch heads is > cool, I'll try that for the next version of the patch. I think it would be better left for a _separate_ patch, as it is separate feature (and I guess more complicated one). >>> diff --git a/t/t9501-gitweb-standalone-http-status.sh b/t/t9501- >>> gitweb-standalone-http-status.sh >>> index d0ff21d..4f8f147 100644 >>> --- a/t/t9501-gitweb-standalone-http-status.sh >>> +++ b/t/t9501-gitweb-standalone-http-status.sh >>> @@ -75,4 +75,30 @@ test_expect_success \ >>> test_debug 'cat gitweb.output' >>> >>> >>> +test_expect_success \ >>> + 'snapshots: bad treeish id' \ >>> + 'gitweb_run "p=.git;a=snapshot;h=frizzumFrazzum;sf=tgz" && >>> + grep "400 - Not a valid hash id:" gitweb.output' >>> +test_debug 'cat gitweb.output' >>> + >>> +test_expect_success \ >>> + 'snapshots: good treeish id' \ >>> + 'gitweb_run "p=.git;a=snapshot;h=master;sf=tgz" && >>> + grep "Status: 200 OK" gitweb.output' >>> +test_debug 'cat gitweb.output' >> >> Why you don't check for "Status: 400" too? > > I'm not sure which test you are referring to (I think the second). The > second test is valid and should return a nice .git-master.tar.gz > tarball. The output of CGI script like gitweb (and therefore gitweb.output file, as it is generated now) contains both HTTP headers, separated by single empty CRLF delimited line from the proper output of a script. In first test you check that _contents_ contain specific error message, but you do not check if HTTP status code matches it (it should, because of how die_error works). In second test you check HTTP status. If the t/t9501-gitweb-standalone-http-status.sh is to be about status, I guess that you should check HTTP status, and not contents of the page (which is more likely to change, e.g. due to some prettifying). In t9501 tests you need, I think, only the HTTP headers part, unless you want also to check that the contents matches. There was some sed script shown to extract only HTTP headers. >>> Second, any given treeish will always be translated to the full length, >>> unambiguous, hash id; this will be useful for things like creating >>> unique names for snapshot caches. >> >> But this is not a good idea, IMHO. >> >> First, it introduces feature that nobody uses (at least yet); we can >> introduce this feature when it is needed instead. > > Sorry for promoting vapourware, I did originally rip this patch out from > something else. I removed the comment from the v2 commit message. Ah. O.K. then. -- Jakub Narebski Poland ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2009-09-11 16:42 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-09-10 21:20 [RFC/PATCH 2/2] gitweb: check given hash before trying to create snapshot Mark Rada 2009-09-11 7:52 ` Jakub Narebski 2009-09-11 15:44 ` Mark Rada 2009-09-11 16:42 ` Jakub Narebski
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).