* [PATCH] gitweb: Do not parse refs by hand, use git-peek-remote instead
@ 2006-09-14 21:27 Jakub Narebski
2006-09-14 23:09 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2006-09-14 21:27 UTC (permalink / raw)
To: git; +Cc: Junio Hamano
This is response to Linus work on packed refs. Additionally it
makes gitweb work with symrefs, too.
Do not parse refs by hand, using File::Find and reading individual
heads to get hash of reference, but use git-peek-remote output
instead. Assume that the hash for deref (with ^{}) always follows hash
for ref, and that we hav derefs only for tags objects; this removes
call to git_get_type (and git-cat-file -t invocation) for tags, which
speeds "summary" and "tags" views generation, but might slow
generation of "heads" view a bit. As of now we do not save and use the
deref hash.
Remove git_get_hash_by_ref while at it, as git_get_refs_list was the
only place it was used.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
By the way, the previous patch
"gitweb: Use File::Find::find in git_get_projects_list"
was added during working on this feature.
gitweb/gitweb.perl | 39 +++++++++++++++++----------------------
1 files changed, 17 insertions(+), 22 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 25383bc..4e58a6b 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -676,19 +676,6 @@ sub git_get_hash_by_path {
## ......................................................................
## git utility functions, directly accessing git repository
-# assumes that PATH is not symref
-sub git_get_hash_by_ref {
- my $path = shift;
-
- open my $fd, "$projectroot/$path" or return undef;
- my $head = <$fd>;
- close $fd;
- chomp $head;
- if ($head =~ m/^[0-9a-fA-F]{40}$/) {
- return $head;
- }
-}
-
sub git_get_project_description {
my $path = shift;
@@ -1098,17 +1085,25 @@ sub git_get_refs_list {
my @reflist;
my @refs;
- my $pfxlen = length("$projectroot/$project/$ref_dir");
- File::Find::find(sub {
- return if (/^\./);
- if (-f $_) {
- push @refs, substr($File::Find::name, $pfxlen + 1);
+ open my $fd, "-|", $GIT, "peek-remote", "$projectroot/$project/"
+ or return;
+ while (my $line = <$fd>) {
+ chomp $line;
+ if ($line =~ m/^([0-9a-fA-F]{40})\t$ref_dir\/?([^\^]+)$/) {
+ push @refs, { hash => $1, name => $2 };
+ } elsif ($line =~ m/^[0-9a-fA-F]{40}\t$ref_dir\/?.*\^\{\}$/) {
+ # assume that "peeled" ref is always after ref,
+ # and that you "peel" (deref) tags only
+ $refs[$#refs]->{'type'} = "tag";
}
- }, "$projectroot/$project/$ref_dir");
+ }
+ close $fd;
+
+ foreach my $ref (@refs) {
+ my $ref_file = $ref->{'name'};
+ my $ref_id = $ref->{'hash'};
- foreach my $ref_file (@refs) {
- my $ref_id = git_get_hash_by_ref("$project/$ref_dir/$ref_file");
- my $type = git_get_type($ref_id) || next;
+ my $type = $ref->{'type'} || git_get_type($ref_id) || next;
my %ref_item = parse_ref($ref_file, $ref_id, $type);
push @reflist, \%ref_item;
--
1.4.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-14 21:27 [PATCH] gitweb: Do not parse refs by hand, use git-peek-remote instead Jakub Narebski
@ 2006-09-14 23:09 ` Junio C Hamano
2006-09-15 1:43 ` [PATCH (take 2)] " Jakub Narebski
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2006-09-14 23:09 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> Remove git_get_hash_by_ref while at it, as git_get_refs_list was the
> only place it was used.
That's a very good news.
> + open my $fd, "-|", $GIT, "peek-remote", "$projectroot/$project/"
> + or return;
> + while (my $line = <$fd>) {
> + chomp $line;
> + if ($line =~ m/^([0-9a-fA-F]{40})\t$ref_dir\/?([^\^]+)$/) {
> + push @refs, { hash => $1, name => $2 };
> + } elsif ($line =~ m/^[0-9a-fA-F]{40}\t$ref_dir\/?.*\^\{\}$/) {
> + # assume that "peeled" ref is always after ref,
> + # and that you "peel" (deref) tags only
> + $refs[$#refs]->{'type'} = "tag";
> }
The assumption is good; we never show a ref^{} before showing
ref itself. But you probably would want to safeguard yourself
from future changes by not depending on "immediately after".
At least you can check $refs[-1]{'name'} is the same as the
unpeeled ref you are looking at, like this:
if ($line =~ m/^([0-9a-fA-F]{40})\t$ref_dir\/?([^\^]+)$/) {
push @refs, { hash => $1, name => $2 };
} elsif ($line =~ m/^[0-9a-fA-F]{40}\t$ref_dir\/?(.*)\^\{\}$/ &&
$1 eq $refs[-1]{'name'}) {
# most likely a tag is followed by its peeled
# one, and when that happens we know the
# previous one was of type 'tag'.
$refs[$#refs]->{'type'} = "tag";
} ...
> + foreach my $ref (@refs) {
> + my $ref_file = $ref->{'name'};
> + my $ref_id = $ref->{'hash'};
> - my $type = git_get_type($ref_id) || next;
> + my $type = $ref->{'type'} || git_get_type($ref_id) || next;
And this is a good incremental change to reduce number of
external calls.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-14 23:09 ` Junio C Hamano
@ 2006-09-15 1:43 ` Jakub Narebski
2006-09-15 6:15 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2006-09-15 1:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
This is in response to Linus work on packed refs. Additionally it
makes gitweb work with symrefs, too.
Do not parse refs by hand, using File::Find and reading individual
heads to get hash of reference, but use git-peek-remote output
instead. Assume that the hash for deref (with ^{}) always follows hash
for ref, and that we hav derefs only for tags objects; this removes
call to git_get_type (and git-cat-file -t invocation) for tags, which
speeds "summary" and "tags" views generation, but might slow
generation of "heads" view a bit. As of now we do not save and use the
deref hash.
Remove git_get_hash_by_ref while at it, as git_get_refs_list was the
only place it was used.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
gitweb/gitweb.perl | 41 +++++++++++++++++++----------------------
1 files changed, 19 insertions(+), 22 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 25383bc..b4a890b 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -676,19 +676,6 @@ sub git_get_hash_by_path {
## ......................................................................
## git utility functions, directly accessing git repository
-# assumes that PATH is not symref
-sub git_get_hash_by_ref {
- my $path = shift;
-
- open my $fd, "$projectroot/$path" or return undef;
- my $head = <$fd>;
- close $fd;
- chomp $head;
- if ($head =~ m/^[0-9a-fA-F]{40}$/) {
- return $head;
- }
-}
-
sub git_get_project_description {
my $path = shift;
@@ -1098,17 +1085,27 @@ sub git_get_refs_list {
my @reflist;
my @refs;
- my $pfxlen = length("$projectroot/$project/$ref_dir");
- File::Find::find(sub {
- return if (/^\./);
- if (-f $_) {
- push @refs, substr($File::Find::name, $pfxlen + 1);
+ open my $fd, "-|", $GIT, "peek-remote", "$projectroot/$project/"
+ or return;
+ while (my $line = <$fd>) {
+ chomp $line;
+ if ($line =~ m/^([0-9a-fA-F]{40})\t$ref_dir\/?([^\^]+)$/) {
+ push @refs, { hash => $1, name => $2 };
+ } elsif ($line =~ m/^[0-9a-fA-F]{40}\t$ref_dir\/?(.*)\^\{\}$/ &&
+ $1 eq $refs[-1]{'name'}) {
+ # most likely a tag is followed by its peeled
+ # (deref) one, and when that happens we know the
+ # previous one was of type 'tag'.
+ $refs[-1]{'type'} = "tag";
}
- }, "$projectroot/$project/$ref_dir");
+ }
+ close $fd;
+
+ foreach my $ref (@refs) {
+ my $ref_file = $ref->{'name'};
+ my $ref_id = $ref->{'hash'};
- foreach my $ref_file (@refs) {
- my $ref_id = git_get_hash_by_ref("$project/$ref_dir/$ref_file");
- my $type = git_get_type($ref_id) || next;
+ my $type = $ref->{'type'} || git_get_type($ref_id) || next;
my %ref_item = parse_ref($ref_file, $ref_id, $type);
push @reflist, \%ref_item;
--
1.4.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-15 1:43 ` [PATCH (take 2)] " Jakub Narebski
@ 2006-09-15 6:15 ` Junio C Hamano
2006-09-15 7:14 ` Jakub Narebski
2006-09-15 10:49 ` Jakub Narebski
0 siblings, 2 replies; 9+ messages in thread
From: Junio C Hamano @ 2006-09-15 6:15 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> This is in response to Linus work on packed refs. Additionally it
> makes gitweb work with symrefs, too.
>
> Do not parse refs by hand, using File::Find and reading individual
> heads to get hash of reference, but use git-peek-remote output
> instead.
Looks nicer. Will apply.
Now, once we start doing this, it may make sense to rethink how
this function and git_get_references functions are used. I
think
git grep -n -e '^sub ' \
-e git_get_references \
-e git_get_refs_list gitweb/gitweb.perl
would be instructive how wasteful the current code is.
get_refs_list is called _TWICE_ in git_summary and worse yet
very late in the function, after calling git_get_references that
could already have done what the function does (by the way,
git_get_references already knows how to use peek-remote output
but for some reason it uses ls-remote -- I think you can safely
rewrite it to use peek-remote). So you end up doing peek-remote
three times to draw the summary page.
git_get_references are called from almost everywhere that shows
the list of commits, which is understandable because we would
want to see those markers in the list.
I very much suspect that you can use git_get_refs_list to return
a hash and a sorted list at the same time from the same input
and make git_summary to do with just a single call to it, and
get rid of git_get_references with a little bit of restructuring
of the caller.
Hmm?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-15 6:15 ` Junio C Hamano
@ 2006-09-15 7:14 ` Jakub Narebski
2006-09-15 7:48 ` Junio C Hamano
2006-09-15 10:49 ` Jakub Narebski
1 sibling, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2006-09-15 7:14 UTC (permalink / raw)
To: git
Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
>> This is in response to Linus work on packed refs. Additionally it
>> makes gitweb work with symrefs, too.
>>
>> Do not parse refs by hand, using File::Find and reading individual
>> heads to get hash of reference, but use git-peek-remote output
>> instead.
>
> Looks nicer. Will apply.
>
> Now, once we start doing this, it may make sense to rethink how
> this function and git_get_references functions are used.
[...]
> I very much suspect that you can use git_get_refs_list to return
> a hash and a sorted list at the same time from the same input
> and make git_summary to do with just a single call to it, and
> get rid of git_get_references with a little bit of restructuring
> of the caller.
It can be done. Well, we could also collapse git_get_references and
git_get_refs_list into one subroutine, but it would be slightly slower
than git_get_references.
But, if we change git_get_refs_list to do also work of git_get_references,
we should also change git_get_references to not use info/refs file at all
(it can, and usually for unknown reasons is stale) but always use
git-peek-remote, for consistency.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-15 7:14 ` Jakub Narebski
@ 2006-09-15 7:48 ` Junio C Hamano
2006-09-15 8:43 ` Jakub Narebski
0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2006-09-15 7:48 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> But, if we change git_get_refs_list to do also work of git_get_references,
> we should also change git_get_references to not use info/refs file at all
> (it can, and usually for unknown reasons is stale) but always use
> git-peek-remote, for consistency.
Yes that would make sense. A repository served by gitweb does
not necessarily has to serve objects over http transport, so it
is nicer not to require info/refs to even exist or up to date.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-15 7:48 ` Junio C Hamano
@ 2006-09-15 8:43 ` Jakub Narebski
2006-09-15 9:05 ` Junio C Hamano
0 siblings, 1 reply; 9+ messages in thread
From: Jakub Narebski @ 2006-09-15 8:43 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano
Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
> > But, if we change git_get_refs_list to do also work of git_get_references,
> > we should also change git_get_references to not use info/refs file at all
> > (it can, and usually for unknown reasons is stale) but always use
> > git-peek-remote, for consistency.
>
> Yes that would make sense. A repository served by gitweb does
> not necessarily has to serve objects over http transport, so it
> is nicer not to require info/refs to even exist or up to date.
We do not require info/refs, as currently git_get_references falls
back to git-ls-remotes (should be git-peek-remote as it is faster)
if info/refs does not exist. But info/refs is usually stale; I guess
it is updated on pull/fetch/push, but not on commit.
--
Jakub Narebski
Poland
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-15 8:43 ` Jakub Narebski
@ 2006-09-15 9:05 ` Junio C Hamano
0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2006-09-15 9:05 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> writes:
> Junio C Hamano wrote:
>> Jakub Narebski <jnareb@gmail.com> writes:
>>
>> > But, if we change git_get_refs_list to do also work of git_get_references,
>> > we should also change git_get_references to not use info/refs file at all
>> > (it can, and usually for unknown reasons is stale) but always use
>> > git-peek-remote, for consistency.
>>
>> Yes that would make sense. A repository served by gitweb does
>> not necessarily has to serve objects over http transport, so it
>> is nicer not to require info/refs to even exist or up to date.
>
> We do not require info/refs, as currently git_get_references falls
> back to git-ls-remotes (should be git-peek-remote as it is faster)
> if info/refs does not exist. But info/refs is usually stale; I guess
> it is updated on pull/fetch/push, but not on commit.
Yes, we are in agreement and are saying the same thing.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH (take 2)] gitweb: Do not parse refs by hand, use git-peek-remote instead
2006-09-15 6:15 ` Junio C Hamano
2006-09-15 7:14 ` Jakub Narebski
@ 2006-09-15 10:49 ` Jakub Narebski
1 sibling, 0 replies; 9+ messages in thread
From: Jakub Narebski @ 2006-09-15 10:49 UTC (permalink / raw)
To: Junio C Hamano, git
Junio C Hamano wrote:
> Now, once we start doing this, it may make sense to rethink how
> this function and git_get_references functions are used. I
> think
>
> git grep -n -e '^sub ' \
> -e git_get_references \
> -e git_get_refs_list gitweb/gitweb.perl
>
> would be instructive how wasteful the current code is.
>
> get_refs_list is called _TWICE_ in git_summary and worse yet
> very late in the function, after calling git_get_references that
> could already have done what the function does (by the way,
> git_get_references already knows how to use peek-remote output
> but for some reason it uses ls-remote -- I think you can safely
> rewrite it to use peek-remote). So you end up doing peek-remote
> three times to draw the summary page.
>
> git_get_references are called from almost everywhere that shows
> the list of commits, which is understandable because we would
> want to see those markers in the list.
>
> I very much suspect that you can use git_get_refs_list to return
> a hash and a sorted list at the same time from the same input
> and make git_summary to do with just a single call to it, and
> get rid of git_get_references with a little bit of restructuring
> of the caller.
We can easily collapse two calls for git_get_refs_list in gi_summary,
one for tags and one for heads into one call plus some filtering.
Changing git_get_refs_list to do also the job of git_get_references
means that in git_tags and git_heads we do extra the job of
git_get_references. Neither git_tags, nor git_heads use references and
refs marker; using the heads references in git_heads and tags
references in git_tags is repeating the same information twice,
cluttering the output. Unless we want to add yet another subroutine...
But as call to git-peek-remote is what takes most time, we can waste
some time processing references which we would not use for the sake of
clarity. Well, we can get rid of git_get_references too, if we don't
mind slight decrease in performance.
We would then use:
git_summary:
my ($refs, $reflist) = git_get_refs_list();
my @taglist = map { s!^tags/!! } grep { m!^tags/! } @$reflist;
my @headlist = map { s!^heads/!! } grep { m!^heads/! } @$reflist;
git_heads, git_tags:
my (undef, $taglist) = git_get_refs_list("tags");
my (undef, $headlist) = git_get_refs_list("heads");
everywhere else
my ($refs, undef) = git_get_refs_list($type);
--
Jakub Narebski
Poland
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2006-09-15 10:49 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-14 21:27 [PATCH] gitweb: Do not parse refs by hand, use git-peek-remote instead Jakub Narebski
2006-09-14 23:09 ` Junio C Hamano
2006-09-15 1:43 ` [PATCH (take 2)] " Jakub Narebski
2006-09-15 6:15 ` Junio C Hamano
2006-09-15 7:14 ` Jakub Narebski
2006-09-15 7:48 ` Junio C Hamano
2006-09-15 8:43 ` Jakub Narebski
2006-09-15 9:05 ` Junio C Hamano
2006-09-15 10:49 ` Jakub Narebski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).