* .git/info/refs
@ 2007-01-24 7:38 H. Peter Anvin
2007-01-24 9:28 ` .git/info/refs Jakub Narebski
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-24 7:38 UTC (permalink / raw)
To: Git Mailing List
Would it be an incompatible change to add the commit date (and perhaps
the author date) to .git/info/refs? I believe that would make it
possible to dramatically (orders of magnitude) speed up the generation
of the gitweb index page, which is easily the most expensive gitweb page
to generate.
-=hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-24 7:38 .git/info/refs H. Peter Anvin
@ 2007-01-24 9:28 ` Jakub Narebski
2007-01-24 15:55 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Jakub Narebski @ 2007-01-24 9:28 UTC (permalink / raw)
To: git
H. Peter Anvin wrote:
> Would it be an incompatible change to add the commit date (and perhaps
> the author date) to .git/info/refs? I believe that would make it
> possible to dramatically (orders of magnitude) speed up the generation
> of the gitweb index page, which is easily the most expensive gitweb page
> to generate.
With new gitweb and new git it is not that expensive. It is now one call
to git-for-each-ref per repository.
Besides, we can't rely on .git/info/refs being up to date, or even existing.
It is for dumb protocols, not for gitweb.
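For reference, the "one call to git-for-each-ref per repository" approach can be sketched roughly as below. This is only an illustration in Python (gitweb itself is Perl); the helper names are invented here, and the exact arguments gitweb passes may differ.

```python
import subprocess

def last_activity_command(git_dir):
    """Build one git-for-each-ref call that reports the newest branch tip."""
    return [
        "git", "--git-dir=" + git_dir, "for-each-ref",
        "--format=%(committer)",   # prints "Name <email> <epoch> <tz>"
        "--sort=-committerdate",   # newest commit first
        "--count=1",               # only the most recent one is needed
        "refs/heads",
    ]

def parse_committer_epoch(line):
    """Extract the epoch seconds from a '%(committer)' line, or None."""
    fields = line.strip().rsplit(" ", 2)   # [ident, epoch, tz]
    if len(fields) != 3 or not fields[1].isdigit():
        return None
    return int(fields[1])

def last_activity(git_dir):
    """Run the command for one repository: one process per repository."""
    out = subprocess.run(last_activity_command(git_dir),
                         capture_output=True, text=True, check=True).stdout
    return parse_committer_epoch(out)
```

The cost discussed in this thread comes from spawning one such process per repository every time the index page is rendered.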
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: .git/info/refs
2007-01-24 9:28 ` .git/info/refs Jakub Narebski
@ 2007-01-24 15:55 ` H. Peter Anvin
2007-01-24 16:02 ` .git/info/refs Johannes Schindelin
` (2 more replies)
0 siblings, 3 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-24 15:55 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski wrote:
> H. Peter Anvin wrote:
>
>> Would it be an incompatible change to add the commit date (and perhaps
>> the author date) to .git/info/refs? I believe that would make it
>> possible to dramatically (orders of magnitude) speed up the generation
>> of the gitweb index page, which is easily the most expensive gitweb page
>> to generate.
>
> With new gitweb and new git it is not that expensive. It is now one call
> to git-for-each-ref per repository.
That IS hugely expensive. On kernel.org, that is 24175 calls to git.
> Besides, we can't rely on .git/info/refs being up to date, or even existing.
> It is for dumb protocols, not for gitweb.
Well, SOMETHING needs to be done for this page, since it can take 15
minutes or more to generate. Caching doesn't help one iota, since it's
stale before being generated.
-hpa
* Re: .git/info/refs
2007-01-24 15:55 ` .git/info/refs H. Peter Anvin
@ 2007-01-24 16:02 ` Johannes Schindelin
2007-01-24 16:24 ` .git/info/refs H. Peter Anvin
2007-01-24 20:40 ` .git/info/refs Jakub Narebski
2007-01-25 21:28 ` .git/info/refs Junio C Hamano
2 siblings, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-24 16:02 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jakub Narebski, git
Hi,
On Wed, 24 Jan 2007, H. Peter Anvin wrote:
> Jakub Narebski wrote:
> > H. Peter Anvin wrote:
> >
> > > Would it be an incompatible change to add the commit date (and perhaps the
> > > author date) to .git/info/refs? I believe that would make it possible to
> > > dramatically (orders of magnitude) speed up the generation of the gitweb
> > > index page, which is easily the most expensive gitweb page to generate.
> >
> > With new gitweb and new git it is not that expensive. It is now one call
> > to git-for-each-ref per repository.
>
> That IS hugely expensive. On kernel.org, that is 24175 calls to git.
>
> > Besides, we can't rely on .git/info/refs being up to date, or even existing.
> > It is for dumb protocols, not for gitweb.
>
> Well, SOMETHING needs to be done for this page, since it can take 15
> minutes or more to generate. Caching doesn't help one iota, since it's
> stale before being generated.
To me, it seems like it all boils down to caching parsed data structures.
I.e. parse the config, then serialize the parsed data to a file. Don't
reparse the config unless it is 1 hour older than the config.
Likewise, run for-each-ref, and serialize the parsed data into a file.
Don't rerun for-each-ref if that file is younger than 15 minutes.
Maybe the same for the first 200 commits of each branch.
(I made those times up, but you get the idea.)
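The age-based rule described above could look something like this minimal sketch. The function name and the idea of passing `now` explicitly are assumptions made for illustration; the 15-minute figure is the made-up value from the message.

```python
import os
import time

def cache_is_fresh(cache_path, max_age_seconds, now=None):
    """Return True if cache_path exists and is younger than max_age_seconds."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(cache_path)
    except OSError:
        return False            # no cache file yet: caller must regenerate
    return (now - mtime) < max_age_seconds

# Usage idea: only rerun for-each-ref when the serialized result is stale.
# if not cache_is_fresh("for-each-ref.cache", 15 * 60):
#     ...rerun git-for-each-ref and rewrite the cache file...
```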
Ciao,
Dscho
* Re: .git/info/refs
2007-01-24 16:02 ` .git/info/refs Johannes Schindelin
@ 2007-01-24 16:24 ` H. Peter Anvin
2007-01-24 16:38 ` .git/info/refs Johannes Schindelin
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-24 16:24 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
Johannes Schindelin wrote:
>
> To me, it seems like it all boils down to caching parsed data structures.
> I.e. parse the config, then serialize the parsed data to a file. Don't
> reparse the config unless it is 1 hour older than the config.
>
> Likewise, run for-each-ref, and serialize the parsed data into a file.
> Don't rerun for-each-ref if that file is younger than 15 minutes.
>
> Maybe the same for the first 200 commits of each branch.
>
> (I made those times up, but you get the idea.)
>
A much better idea is to have that data structure updated on repository
updates, which is the whole point behind .git/info/refs. On kernel.org,
at least, if you don't keep .git/info/refs up to date you need to get
your fingers whacked anyway, since it damages usability for one
particular class of users.
-hpa
* Re: .git/info/refs
2007-01-24 16:24 ` .git/info/refs H. Peter Anvin
@ 2007-01-24 16:38 ` Johannes Schindelin
2007-01-24 16:41 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-24 16:38 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jakub Narebski, git
Hi,
On Wed, 24 Jan 2007, H. Peter Anvin wrote:
> Johannes Schindelin wrote:
> >
> > To me, it seems like it all boils down to caching parsed data structures. I.e.
> > parse the config, then serialize the parsed data to a file. Don't reparse
> > the config unless it is 1 hour older than the config.
> >
> > Likewise, run for-each-ref, and serialize the parsed data into a file. Don't
> > rerun for-each-ref if that file is younger than 15 minutes.
> >
> > Maybe the same for the first 200 commits of each branch.
> >
> > (I made those times up, but you get the idea.)
> >
>
> A much better idea is to have that data structure updated on repository
> updates, which is the whole point behind .git/info/refs. On kernel.org,
> at least, if you don't keep .git/info/refs up to date you need to get
> your fingers whacked anyway, since it damages usability for one
> particular class of users.
Granted, for some things this might work. However, I would not wreak havoc
by changing the format of .git/info/refs, rather put the details you
wanted into .git/info/refs-details.
However, for other things (like showing a certain number of commits), it
_might_ make sense to cache them (e.g. when literally thousands of people
look at the 100 last commits of linux-2.6.git), but not for others (e.g.
the 100th last to the 200th last commit of git-tools.git).
Having said that, it should be relatively easy to store the (parsed, or at
least easily parseable) 500 last commits of a branch into
.git/info/commits-<branch>.
This would raise the cost of publishing a branch, but ease the
overall load on the server.
Jakub?
Ciao,
Dscho
* Re: .git/info/refs
2007-01-24 16:38 ` .git/info/refs Johannes Schindelin
@ 2007-01-24 16:41 ` H. Peter Anvin
2007-01-24 16:52 ` .git/info/refs Johannes Schindelin
2007-01-24 17:10 ` .git/info/refs Jakub Narebski
0 siblings, 2 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-24 16:41 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
Johannes Schindelin wrote:
>
> Granted, for some things this might work. However, I would not wreak havoc
> by changing the format of .git/info/refs, rather put the details you
> wanted into .git/info/refs-details.
>
It's not clear to me if it would be wreaking havoc. After all, if a
format can't be expanded *at all*, there is something wrong, and adding
things to the end of a line is a common structured way of expansion.
Hence the original query.
> However, for other things (like showing a certain number of commits), it
> _might_ make sense to cache them (e.g. when literally thousands of people
> look at the 100 last commits of linux-2.6.git), but not for others (e.g.
> the 100th last to the 200th last commit of git-tools.git).
Any query that's within a repository is fairly easily cachable
post-generation. The front page (and its RSS variant) is a bit of an
exception, because it involves all repositories at once.
Doesn't mean we couldn't do better, but...
-hpa
* Re: .git/info/refs
2007-01-24 16:41 ` .git/info/refs H. Peter Anvin
@ 2007-01-24 16:52 ` Johannes Schindelin
2007-01-24 17:06 ` .git/info/refs H. Peter Anvin
2007-01-24 17:10 ` .git/info/refs Jakub Narebski
1 sibling, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-24 16:52 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jakub Narebski, git
Hi,
On Wed, 24 Jan 2007, H. Peter Anvin wrote:
> Johannes Schindelin wrote:
> >
> > Granted, for some things this might work. However, I would not wreak
> > havoc by changing the format of .git/info/refs, rather put the details
> > you wanted into .git/info/refs-details.
> >
>
> It's not clear to me if it would be wreaking havoc. After all, if a
> format can't be expanded *at all*, there is something wrong, and adding
> things to the end of a line is a common structured way of expansion.
> Hence the original query.
The idea of .git/info/refs is to let dumb transports fetch somewhat
intelligently. They don't need that information, and frankly, I
don't think they should need to understand it.
I also expect that they interpret everything after the sha1 as refname,
what with our having become quite liberal with refnames (they can contain
spaces, tabs, and even a small amount of special K). So I don't see a way
to upgrade the file format.
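To illustrate the concern: each line of .git/info/refs is an object name, a tab, and a ref name. The sketch below assumes a reader that treats everything after the first tab as the ref name; whether any real dumb-transport client parses it exactly this way is an assumption, and the sample hash and timestamp are made up.

```python
def parse_info_refs_line(line):
    """Split one .git/info/refs line into (sha1, refname).

    Everything after the first tab is taken as the ref name, as a
    simple reader might do.
    """
    sha1, _, refname = line.rstrip("\n").partition("\t")
    return sha1, refname

current = "0123456789abcdef0123456789abcdef01234567\trefs/heads/master\n"
extended = "0123456789abcdef0123456789abcdef01234567\trefs/heads/master\t1169624280\n"

print(parse_info_refs_line(current))    # ref name is refs/heads/master
print(parse_info_refs_line(extended))   # the appended date ends up in the ref name
```

With such a reader, a date appended to the end of the line is silently swallowed into the ref name rather than ignored.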
But as should be clear by now, I'd prefer additional information -- that
is of no interest to dumb transports anyway -- to be put in its own file.
That also opens the possibility of, say .git/info/perl/, which contains
_only_ serialized perl objects! I imagine this could be a performance
booster.
> > However, for other things (like showing a certain number of commits),
> > it _might_ make sense to cache them (e.g. when literally thousands of
> > people look at the 100 last commits of linux-2.6.git), but not for
> > others (e.g. the 100th last to the 200th last commit of
> > git-tools.git).
>
> Any query that's within a repository is fairly easily cachable
> post-generation. The front page (and its RSS variant) is a bit of an
> exception, because it involves all repositories at once.
... and here we have a problem, right? No single update hook can update
the _whole_ information.
Ciao,
Dscho
* Re: .git/info/refs
2007-01-24 16:52 ` .git/info/refs Johannes Schindelin
@ 2007-01-24 17:06 ` H. Peter Anvin
2007-01-24 17:25 ` .git/info/refs Jakub Narebski
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-24 17:06 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
Johannes Schindelin wrote:
> Hi,
>
> On Wed, 24 Jan 2007, H. Peter Anvin wrote:
>
>> Johannes Schindelin wrote:
>>> Granted, for some things this might work. However, I would not wreak
>>> havoc by changing the format of .git/info/refs, rather put the details
>>> you wanted into .git/info/refs-details.
>>>
>> It's not clear to me if it would be wreaking havoc. After all, if a
>> format can't be expanded *at all*, there is something wrong, and adding
>> things to the end of a line is a common structured way of expansion.
>> Hence the original query.
>
> The idea of .git/info/refs is to let dumb transports fetch somewhat
> intelligently. They don't need that information, and frankly, I
> don't think they should need to understand it.
I don't think adding 10 digits to each line is going to be a sizable
impact on anything.
> I also expect that they interpret everything after the sha1 as refname,
> what with our having become quite liberal with refnames (they can contain
> spaces, tabs, and even a small amount of special K). So I don't see a way
> to upgrade the file format.
They can also contain newlines, probably, so escaping is obligatory anyway.
> But as should be clear by now, I'd prefer additional information -- that
> is of no interest to dumb transports anyway -- to be put in its own file.
Yes, but the argument seems to be philosophical.
> That also opens the possibility of, say .git/info/perl/, which contains
> _only_ serialized perl objects! I imagine this could be a performance
> booster.
For certain things, I'm sure.
>>> However, for other things (like showing a certain number of commits),
>>> it _might_ make sense to cache them (e.g. when literally thousands of
>>> people look at the 100 last commits of linux-2.6.git), but not for
>>> others (e.g. the 100th last to the 200th last commit of
>>> git-tools.git).
>> Any query that's within a repository is fairly easily cachable
>> post-generation. The front page (and its RSS variant) is a bit of an
>> exception, because it involves all repositories at once.
>
> ... and here we have a problem, right? No single update hook can update
> the _whole_ information.
I don't see a problem.
-hpa
* Re: .git/info/refs
2007-01-24 16:41 ` .git/info/refs H. Peter Anvin
2007-01-24 16:52 ` .git/info/refs Johannes Schindelin
@ 2007-01-24 17:10 ` Jakub Narebski
2007-01-24 17:20 ` .git/info/refs Johannes Schindelin
2007-01-25 17:13 ` .git/info/refs H. Peter Anvin
1 sibling, 2 replies; 39+ messages in thread
From: Jakub Narebski @ 2007-01-24 17:10 UTC (permalink / raw)
To: git
H. Peter Anvin wrote:
> Johannes Schindelin wrote:
>>
>> Granted, for some things this might work. However, I would not wreak havoc
>> by changing the format of .git/info/refs, rather put the details you
>> wanted into .git/info/refs-details.
>
> It's not clear to me if it would be wreaking havoc. After all, if a
> format can't be expanded *at all*, there is something wrong, and adding
> things to the end of a line is a common structured way of expansion.
> Hence the original query.
I don't think it can be easily expanded. .git/info/refs is meant for
http-fetch, and it mimics git-ls-remote / git-peek-remote output.
BTW. putting the info of git-for-each-ref into .git/info/refs-details
would mean that instead of "24175 calls to git" one would need to
read 24175 files. Perhaps the whole info needed to generate projects
index page should be pre-generated on push (update), instead of per
project (per repository) .git/info/refs-details
>> However, for other things (like showing a certain number of commits), it
>> _might_ make sense to cache them (e.g. when literally thousands of people
>> look at the 100 last commits of linux-2.6.git), but not for others (e.g.
>> the 100th last to the 200th last commit of git-tools.git).
>
> Any query that's within a repository is fairly easily cachable
> post-generation. The front page (and its RSS variant) is a bit of an
> exception, because it involves all repositories at once.
Actually "RSS", or to be more exact OPML variant of front page in its
current invocation is equivalent of project_index page, and it can be
generated once (well, once per adding / removing / renaming a repository).
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: .git/info/refs
2007-01-24 17:10 ` .git/info/refs Jakub Narebski
@ 2007-01-24 17:20 ` Johannes Schindelin
2007-01-25 17:13 ` .git/info/refs H. Peter Anvin
1 sibling, 0 replies; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-24 17:20 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Hi,
On Wed, 24 Jan 2007, Jakub Narebski wrote:
> H. Peter Anvin wrote:
>
> > Johannes Schindelin wrote:
> >>
> >> Granted, for some things this might work. However, I would not wreak havoc
> >> by changing the format of .git/info/refs, rather put the details you
> >> wanted into .git/info/refs-details.
> >
> > It's not clear to me if it would be wreaking havoc. After all, if a
> > format can't be expanded *at all*, there is something wrong, and adding
> > things to the end of a line is a common structured way of expansion.
> > Hence the original query.
>
> I don't think it can be easily expanded. .git/info/refs is meant for
> http-fetch, and it mimics git-ls-remote / git-peek-remote output.
Exactly.
> BTW. putting the info of git-for-each-ref into .git/info/refs-details
> would mean that instead of "24175 calls to git" one would need to
> read 24175 files. Perhaps the whole info needed to generate projects
> index page should be pre-generated on push (update), instead of per
> project (per repository) .git/info/refs-details
You completely lost me there. A push (update) is done as a specific user,
who should not be able to write to a _global_ file!
Nevertheless, "24175 calls to git" is sure as hell more expensive than
"reading 24175 files".
Plus, if we integrate the functionality to write .git/info/refs-details
into update-server-info, you can reduce that further: it is no longer
per-branch, but per-repo.
Ciao,
Dscho
* Re: .git/info/refs
2007-01-24 17:06 ` .git/info/refs H. Peter Anvin
@ 2007-01-24 17:25 ` Jakub Narebski
0 siblings, 0 replies; 39+ messages in thread
From: Jakub Narebski @ 2007-01-24 17:25 UTC (permalink / raw)
To: git
H. Peter Anvin wrote:
>> I also expect that they interpret everything after the sha1 as refname,
>> what with our having become quite liberal with refnames (they can contain
>> spaces, tabs, and even a small amount of special K). So I don't see a way
>> to upgrade the file format.
>
> They can also contain newlines, probably, so escaping is obligatory
> anyway.
No, refnames can not contain newlines.
And .git/info/refs mimics git-ls-remote / git-peek-remote output,
and is meant for dumb transports.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: .git/info/refs
2007-01-24 15:55 ` .git/info/refs H. Peter Anvin
2007-01-24 16:02 ` .git/info/refs Johannes Schindelin
@ 2007-01-24 20:40 ` Jakub Narebski
2007-01-24 20:44 ` .git/info/refs hpa
2007-01-24 20:45 ` .git/info/refs hpa
2007-01-25 21:28 ` .git/info/refs Junio C Hamano
2 siblings, 2 replies; 39+ messages in thread
From: Jakub Narebski @ 2007-01-24 20:40 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
H. Peter Anvin wrote:
> Jakub Narebski wrote:
>> Besides, we can't rely on .git/info/refs being up to date, or even existing.
>> It is for dumb protocols, not for gitweb.
>
> Well, SOMETHING needs to be done for this page, since it can take 15
> minutes or more to generate. Caching doesn't help one iota, since it's
> stale before being generated.
The simple and fast solution would be to make post-update hook contain
the git-for-each-ref with parameters like in git_get_last_activity,
saving e.g. to .git/info/last-committer, and in gitweb read this file
if it exists, run git-for-each-ref otherwise (similar to what we used to
do with .git/info/refs and git-peek-remote in gitweb).
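The read side of this proposal might be sketched as follows. This is Python for illustration only (gitweb is Perl); the function name is hypothetical, and `fallback` stands in for whatever code runs git-for-each-ref when the cached file is absent.

```python
import os

def read_last_activity(git_dir, fallback):
    """Return the cached line from info/last-committer, else call fallback."""
    path = os.path.join(git_dir, "info", "last-committer")
    try:
        with open(path, encoding="utf-8") as f:
            line = f.readline().strip()
        if line:
            return line
    except OSError:
        pass                      # file missing or unreadable
    return fallback(git_dir)      # e.g. a function that runs git-for-each-ref
```

The write side would be the post-update hook redirecting the same git-for-each-ref output into that file after each push.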
--
Jakub Narebski
Poland
* Re: .git/info/refs
2007-01-24 20:40 ` .git/info/refs Jakub Narebski
@ 2007-01-24 20:44 ` hpa
2007-01-25 8:14 ` .git/info/refs Johannes Schindelin
2007-01-24 20:45 ` .git/info/refs hpa
1 sibling, 1 reply; 39+ messages in thread
From: hpa @ 2007-01-24 20:44 UTC (permalink / raw)
To: Jakub Narebski; +Cc: H. Peter Anvin, git
> H. Peter Anvin wrote:
>
>>> Besides, we can't rely on .git/info/refs being up to date, or even
>>> existing.
>>> It is for dumb protocols, not for gitweb.
>>
>> Well, SOMETHING needs to be done for this page, since it can take 15
>> minutes or more to generate. Caching doesn't help one iota, since it's
>> stale before being generated.
>
> The simple and fast solution would be to make post-update hook contain
> the git-for-each-ref with parameters like in git_get_last_activity,
> saving e.g. to .git/info/last-committer, and in gitweb read this file
> if it exists, run git-for-each-ref otherwise (similar to what we used to
> do with .git/info/refs and git-peek-remote in gitweb).
>
Right, this is basically what I'm saying; the question is only whether or
not this fits into .git/info/refs or should be a separate file.
Either way, I think git-update-server-info should generate all these files.
-hpa
* Re: .git/info/refs
2007-01-24 20:40 ` .git/info/refs Jakub Narebski
2007-01-24 20:44 ` .git/info/refs hpa
@ 2007-01-24 20:45 ` hpa
1 sibling, 0 replies; 39+ messages in thread
From: hpa @ 2007-01-24 20:45 UTC (permalink / raw)
To: Jakub Narebski; +Cc: H. Peter Anvin, git
> H. Peter Anvin wrote:
>
>>> Besides, we can't rely on .git/info/refs being up to date, or even
>>> existing.
>>> It is for dumb protocols, not for gitweb.
>>
>> Well, SOMETHING needs to be done for this page, since it can take 15
>> minutes or more to generate. Caching doesn't help one iota, since it's
>> stale before being generated.
>
> The simple and fast solution would be to make post-update hook contain
> the git-for-each-ref with parameters like in git_get_last_activity,
> saving e.g. to .git/info/last-committer, and in gitweb read this file
> if it exists, run git-for-each-ref otherwise (similar to what we used to
> do with .git/info/refs and git-peek-remote in gitweb).
>
Right, this is basically what I'm saying; the question is only whether or
not this fits into .git/info/refs or should be a separate file.
Either way, I think git-update-server-info should generate all these files.
-hpa
* Re: .git/info/refs
2007-01-24 20:44 ` .git/info/refs hpa
@ 2007-01-25 8:14 ` Johannes Schindelin
2007-01-25 16:12 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-25 8:14 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Jakub Narebski, git
Hi,
On Wed, 24 Jan 2007, hpa@zytor.com wrote:
> > H. Peter Anvin wrote:
> >
> >>> Besides, we can't rely on .git/info/refs being up to date, or even
> >>> existing.
> >>> It is for dumb protocols, not for gitweb.
> >>
> >> Well, SOMETHING needs to be done for this page, since it can take 15
> >> minutes or more to generate. Caching doesn't help one iota, since it's
> >> stale before being generated.
> >
> > The simple and fast solution would be to make post-update hook contain
> > the git-for-each-ref with parameters like in git_get_last_activity,
> > saving e.g. to .git/info/last-committer, and in gitweb read this file
> > if it exists, run git-for-each-ref otherwise (similar to what we used to
> > do with .git/info/refs and git-peek-remote in gitweb).
> >
>
> Right, this is basically what I'm saying; the question is only whether or
> not this fits into .git/info/refs or should be a separate file.
>
> Either way, I think git-update-server-info should generate all these files.
Well, no. At least not per default. What you want is _very_ special to
gitweb. It is _only_ needed by gitweb. And .git/info/refs is for _dumb
transports_, _not_ for gitweb.
That said, I think it makes sense _in your setup_ to trigger updating
_another_ file for use in gitweb.
Remember, this is all very, very special for gitweb. So let's separate it
cleanly from all which is not special for gitweb.
I hope I have made it clear why (at least IMHO) it would be wrong, wrong,
wrong to change the format of .git/info/refs _only_ for gitweb, which it
is not meant for to begin with.
So let's introduce another file in .git/info/ especially dedicated to
gitweb.
Then we are free to introduce real cool performance hacks, like using
Storable to store the parsed data structures (I was alluding to this in an
earlier reply, as "serializing"). Then you just retrieve the file -- if it
exists -- or call for-each-ref (like Jakub said).
By separating this gitweb-special thing cleanly, maybe into a hook, we can
have a perl script which writes this file. We can write a simple hash,
which may or may not contain keys, thus being of "extensible format".
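A rough sketch of such an extensible per-repository cache file follows. It uses JSON in Python as a stand-in for Perl's Storable, which is a substitution made purely for illustration; the file name and the keys are hypothetical.

```python
import json
import os
import tempfile

def write_gitweb_cache(path, data):
    """Atomically write a dict as JSON so readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".gitweb-cache-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, sort_keys=True)
        os.replace(tmp, path)     # rename over the old file in one step
    except BaseException:
        os.unlink(tmp)            # do not leave the temporary file behind
        raise

def read_gitweb_cache(path):
    """Return the cached dict, or None if the file is missing or unreadable."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except (OSError, ValueError):
        return None

# A reader picks out the keys it knows and ignores the rest, so new keys
# can be added later without breaking older readers:
# cache = read_gitweb_cache("repo.git/info/gitweb-cache") or {}
# last_change = cache.get("last_change")
```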
By having this perl script, you can -- as root -- run it as the
appropriate user for each repository where it does not exist yet.
Remains the problem: how do we _force_ this hook enabled site-wide, i.e.
in _all_ repos?
But that is too easy: just edit the existing template, and then replace
the update hooks in all repos (possibly verifying that the existing update
hook indeed matches the old template).
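The "replace only if it still matches the old template" step could be sketched like this. The function names, the directory layout, and passing the hook name as a parameter are all assumptions for illustration, not an existing tool.

```python
import hashlib
import os
import shutil

def file_digest(path):
    """SHA-1 of a file's contents, or None if the file does not exist."""
    try:
        with open(path, "rb") as f:
            return hashlib.sha1(f.read()).hexdigest()
    except FileNotFoundError:
        return None

def upgrade_hook(repo_git_dir, hook, old_template, new_template):
    """Replace the named hook only if it is still an unmodified old template."""
    target = os.path.join(repo_git_dir, "hooks", hook)
    current = file_digest(target)
    if current is None or current != file_digest(old_template):
        return False                    # missing or locally customised: leave alone
    shutil.copy(new_template, target)   # copies contents and permission bits
    return True
```

Repositories for which this returns False would need a manual look, since their owners have changed the hook themselves.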
So what problems remain with this approach?
Ciao,
Dscho
* Re: .git/info/refs
2007-01-25 8:14 ` .git/info/refs Johannes Schindelin
@ 2007-01-25 16:12 ` H. Peter Anvin
2007-01-25 16:50 ` .git/info/refs Johannes Schindelin
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-25 16:12 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jakub Narebski, git
Johannes Schindelin wrote:
>>
>> Either way, I think git-update-server-info should generate all these files.
>
> Well, no. At least not per default. What you want is _very_ special to
> gitweb. It is _only_ needed by gitweb. And .git/info/refs is for _dumb
> transports_, _not_ for gitweb.
>
> That said, I think it makes sense _in your setup_ to trigger updating
> _another_ file for use in gitweb.
>
> Remember, this is all very, very special for gitweb. So let's separate it
> cleanly from all which is not special for gitweb.
>
> I hope I have made it clear why (at least IMHO) it would be wrong, wrong,
> wrong to change the format of .git/info/refs _only_ for gitweb, which it
> is not meant for to begin with.
No, you keep using circular reasoning.
-hpa
* Re: .git/info/refs
2007-01-25 16:12 ` .git/info/refs H. Peter Anvin
@ 2007-01-25 16:50 ` Johannes Schindelin
0 siblings, 0 replies; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-25 16:50 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
Hi,
On Thu, 25 Jan 2007, H. Peter Anvin wrote:
> > I hope I have made it clear why (at least IMHO) it would be wrong,
> > wrong, wrong to change the format of .git/info/refs _only_ for gitweb,
> > which it is not meant for to begin with.
>
> No, you keep using circular reasoning.
No. Once again, .git/info/refs is _not_ for gitweb. But I will stop
arguing about that topic, because I don't have enough time for that.
Ciao,
Dscho
* Re: .git/info/refs
2007-01-24 17:10 ` .git/info/refs Jakub Narebski
2007-01-24 17:20 ` .git/info/refs Johannes Schindelin
@ 2007-01-25 17:13 ` H. Peter Anvin
2007-01-26 11:22 ` .git/info/refs Jakub Narebski
2007-01-26 11:41 ` .git/info/refs Junio C Hamano
1 sibling, 2 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-25 17:13 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski wrote:
>
> I don't think it can be easily expanded. .git/info/refs is meant for
> http-fetch, and it mimics git-ls-remote / git-peek-remote output.
For heaven's sake, in computer science we can *NEVER* use the same
feature for *MORE THAN ONE THING*. If it doesn't work format-wise
that's fine, but "it's only supposed to be used by dumb transports" is
ridiculous.
> BTW. putting the info of git-for-each-ref into .git/info/refs-details
> would mean that instead of "24175 calls to git" one would need to
> read 24175 files. Perhaps the whole info needed to generate projects
> index page should be pre-generated on push (update), instead of per
> project (per repository) .git/info/refs-details
No, it should be one file per repository, not one file per ref. Why?
Obviously we don't want 24175 files to be accessed. However, a push can
only affect files for which the repository owner has permission and
which reside in the repository filespace, so it should stay inside that
space.
On kernel.org, this would reduce the load from 24175 calls to git to
reading 250 files. Although the latter is still expensive (and will
probably need post-generation caching) the files should be small and
cacheable by the kernel, and the resulting I/O load should be quite small.
Anyway, as far as git-update-server-info is concerned, I'm *very*
concerned that there be a single command that updates all the cached
information across the repository. Telling everyone to update their
hooks every time we want to add cached information is silly. Right now,
git-update-server-info is the command to update cached information, and
for usability reasons there should be a single entry point.
-hpa
* Re: .git/info/refs
2007-01-24 15:55 ` .git/info/refs H. Peter Anvin
2007-01-24 16:02 ` .git/info/refs Johannes Schindelin
2007-01-24 20:40 ` .git/info/refs Jakub Narebski
@ 2007-01-25 21:28 ` Junio C Hamano
2007-01-25 21:37 ` .git/info/refs H. Peter Anvin
2 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2007-01-25 21:28 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
"H. Peter Anvin" <hpa@zytor.com> writes:
> Jakub Narebski wrote:
>> H. Peter Anvin wrote:
>>
>>> Would it be an incompatible change to add the commit date (and
>>> perhaps the author date) to .git/info/refs? I believe that would
>>> make it possible to dramatically (orders of magnitude) speed up the
>>> generation of the gitweb index page, which is easily the most
>>> expensive gitweb page to generate.
>>
>> With new gitweb and new git it is not that expensive. It is now one call
>> to git-for-each-ref per repository.
>
> That IS hugely expensive. On kernel.org, that is 24175 calls to git.
Do you mean you have 24k _REPOSITORIES_ served by gitweb on
kernel.org?
* Re: .git/info/refs
2007-01-25 21:28 ` .git/info/refs Junio C Hamano
@ 2007-01-25 21:37 ` H. Peter Anvin
2007-01-25 21:51 ` .git/info/refs Junio C Hamano
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-25 21:37 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
>>>
>>>> Would it be an incompatible change to add the commit date (and
>>>> perhaps the author date) to .git/info/refs? I believe that would
>>>> make it possible to dramatically (orders of magnitude) speed up the
>>>> generation of the gitweb index page, which is easily the most
>>>> expensive gitweb page to generate.
>>> With new gitweb and new git it is not that expensive. It is now one call
>>> to git-for-each-ref per repository.
>> That IS hugely expensive. On kernel.org, that is 24175 calls to git.
>
> Do you mean you have 24k _REPOSITORIES_ served by gitweb on
> kernel.org?
>
No, we currently have 250 repositories with a total of 24175 refs.
-hpa
* Re: .git/info/refs
2007-01-25 21:37 ` .git/info/refs H. Peter Anvin
@ 2007-01-25 21:51 ` Junio C Hamano
2007-01-25 22:01 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2007-01-25 21:51 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
"H. Peter Anvin" <hpa@zytor.com> writes:
>>>> With new gitweb and new git it is not that expensive. It is now one call
>>>> to git-for-each-ref per repository.
>>> That IS hugely expensive. On kernel.org, that is 24175 calls to git.
>>
>> Do you mean you have 24k _REPOSITORIES_ served by gitweb on
>> kernel.org?
>
> No, we currently have 250 repositories with a total of 24175 refs.
Then that would mean 250 calls to git-for-each-ref, wouldn't it?
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-25 21:51 ` .git/info/refs Junio C Hamano
@ 2007-01-25 22:01 ` H. Peter Anvin
2007-01-25 23:33 ` .git/info/refs Johannes Schindelin
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-25 22:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
>
>>>>> With new gitweb and new git it is not that expensive. It is now one call
>>>>> to git-for-each-ref per repository.
>>>> That IS hugely expensive. On kernel.org, that is 24175 calls to git.
>>> Do you mean you have 24k _REPOSITORIES_ served by gitweb on
>>> kernel.org?
>> No, we currently have 250 repositories with a total of 24175 refs.
>
> Then that would mean 250 calls to git-for-each-ref, wouldn't it?
>
Well, I think it was Johannes that said once for each ref. But either
which way, it's a totally unacceptable load with resulting unacceptable
latency.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-25 22:01 ` .git/info/refs H. Peter Anvin
@ 2007-01-25 23:33 ` Johannes Schindelin
2007-01-27 22:07 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-25 23:33 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Junio C Hamano, git
Hi,
On Thu, 25 Jan 2007, H. Peter Anvin wrote:
> Junio C Hamano wrote:
>
> > Then that would mean 250 calls to git-for-each-ref, wouldn't it?
>
> Well, I think it was Johannes that said once for each ref. But either
> which way, it's a totally unacceptable load with resulting unacceptable
> latency.
No. I would never say that you have to run for-each-ref for each ref.
That's plain stupid.
BTW I take some satisfaction in that you finally agreed (in another email)
that some post-creation caching is necessary.
I would be even more satisfied if you finally agreed that it is a good
practice to separate conceptually different things, and not continued ad
infinitum (and ad nauseam) arguing that .git/info/refs should serve dumb
transports, and gitweb, and eventually bring peace to everybody on this
planet.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-25 17:13 ` .git/info/refs H. Peter Anvin
@ 2007-01-26 11:22 ` Jakub Narebski
2007-01-26 11:41 ` .git/info/refs Junio C Hamano
1 sibling, 0 replies; 39+ messages in thread
From: Jakub Narebski @ 2007-01-26 11:22 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git, Johannes Schindelin
H. Peter Anvin <hpa@zytor.com> wrote:
> Jakub Narebski wrote:
>>
>> I don't think it can be easily expanded. .git/info/refs is meant for
>> http-fetch, and it mimics git-ls-remote / git-peek-remote output.
>
> For heaven's sake, in computer science we can *NEVER* use the same
> feature for *MORE THAN ONE THING*. If it doesn't work format-wise
> that's fine, but "it's only supposed to be used by dumb transports" is
> ridiculous.
.git/info/refs is for dumb transports, so if we follow "do not use
the same feature for more than one thing" principle we should not
change its format for gitweb.
.git/info/refs is one of auxiliary info files to help dumb servers,
(servers that do not do on-the-fly pack generation), to help
clients discover what references server has. The second auxiliary
info file is .git/objects/info/packs. Both are generated by
git-update-server-info command, usually run from post-update hook.
Because .git/info/refs format is the same as git-ls-remote output
(AFAIK smart servers use git-ls-remote or git-peek-remote; dumb
servers use .git/info/refs) we used and can use it as ''cached''
"git ls-remote ." / "git peek-remote ." / "git show-ref --dereference"
output. For bare repositories where new data arrives only via
'update' (via push or fetch) and always trigger post-update hook,
and not for example via git-commit which does not invoke post-update
hook, the information in .git/info/refs is always fresh.
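
For reference, a minimal sketch of reading that format. It assumes the usual
layout of one "<object id><TAB><ref name>" pair per line, with a peeled "^{}"
entry following each annotated tag. The count_refs and sample_refs helpers
and the object ids are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: summarize an info/refs-style listing read from stdin.
# Assumed line format: "<object id>\t<ref name>"; peeled tags end in "^{}".
count_refs() {
    awk -F '\t' '
        substr($2, length($2) - 2) == "^{}" { peeled++; next }  # dereferenced tag
        index($2, "refs/heads/") == 1       { heads++ }
        index($2, "refs/tags/") == 1        { tags++ }
        END { printf "heads=%d tags=%d peeled=%d\n", heads, tags, peeled }
    '
}

# Made-up sample data, not taken from a real repository.
sample_refs() {
    printf '%s\t%s\n' \
        1111111111111111111111111111111111111111 refs/heads/master \
        2222222222222222222222222222222222222222 refs/tags/v1.0 \
        3333333333333333333333333333333333333333 'refs/tags/v1.0^{}'
}

sample_refs | count_refs
```

Each line carries only an object id and a name, so any extra column (such as
a commit date) would have to be tolerated by every existing reader of the file.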
What I propose as quick solution is to add new (perhaps local)
git-update-gitweb-info command which is to be used in post-update
(and perhaps post-commit for non-bare repos) hook, and which results
we would use in gitweb. See patch at the bottom.
>> BTW. putting the info of git-for-each-ref into .git/info/refs-details
>> would mean that instead of "24175 calls to git" one would need to
>> read 24175 files. Perhaps the whole info needed to generate projects
>> index page should be pre-generated on push (update), instead of per
>> project (per repository) .git/info/refs-details
>
> No, it should be one file per repository, not one file per ref. Why?
> Obviously we don't want 24175 files to be accessed. However, a push can
> only affect files for which the repository owner has permission and
> which resides in the repository filespace, so it should stay inside that
> space.
Gitweb _never_ did one call to git _per ref_, but always one call to git
_per repository_! Old git always used HEAD ref to get "Last Change" info
and used one call to git-rev-list (if I remember correctly), new git
checks all refs to get "Last Change" info but uses _one_ call to
git-for-each-ref. Because we did not want to affect gitweb performance
badly we waited for changing "Last Change" to check all refs and not
only HEAD to have git-for-each-ref to use one call to git command for that.
Historically it was first use of git-for-each-ref in gitweb.
Sidenote: I planned to add a new %feature to gitweb to allow choosing
whether to use all refs for "Last Change" info, the HEAD ref, or some given ref
(for example "master"). But that would perhaps wait for .git/config
parser in Perl.
> On kernel.org, this would reduce the load from 24175 calls to git to
> reading 250 files. Although the latter is still expensive (and will
> probably need post-generation caching) the files should be small and
> cacheable by the kernel, and the resulting I/O load should be quite small.
Oh, so there are around 250 projects, and around 24175 references
together in those projects on kernel.org? I thought it was 24175
_projects_ (repositories)...
Currently, it is 250 calls to git, reading 24175 files (unless refs
are packed, then it would be reading 250 files) to get refs (heads)
info, and reading around 2*250 files (packs + index) to get last
change info. Not "24175 calls to git".
> Anyway, as far as git-update-server-info is concerned, I'm *very*
> concerned that there be a single command that updates all the cached
> information across the repository. Telling everyone to update their
> hooks every time we want to add cached information is silly. Right now,
> git-update-server-info is the command to update cached information, and
> for usability reasons there should be a single entry point.
git-update-server-info is to "update auxiliary info file to help dumb
servers". I propose to use (new) git-update-gitweb-info to help gitweb.
One command for one feature. This would mean unfortunately adding
"exec git-update-gitweb-info" line (if it does not exist) to existing
projects post-update hooks; for new projects it would be I think enough
to modify post-update template (templates/hooks--post-update or
/usr/share/git-core/templates/hooks/post-update).
Below the patches of how it can be done. Does not include corrections
to Makefile to install git-update-gitweb-info. NOT TESTED!
BTW final version of git-update-gitweb-info probably should be a built-in
command, like git-update-server-info, not a script.
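
A note on chaining commands in a hook: "exec" replaces the shell process, so
nothing after a successful "exec" line is run. One way to call several
updaters from a single hook is to invoke them in turn without "exec". A
minimal sketch, where run_updaters is a hypothetical helper and
git-update-gitweb-info is the command proposed above, not an existing one:

```sh
#!/bin/sh
# Hypothetical helper: run each named updater in turn.
# Commands that are not installed are skipped; a failure of one updater
# is remembered but does not stop the remaining ones from running.
run_updaters() {
    result=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            "$cmd" || result=1
        fi
    done
    return $result
}

# In a post-update hook this could be called as (assumed command names):
#   run_updaters git-update-server-info git-update-gitweb-info
```

Whether a failing updater should abort the rest is a design choice; this
sketch keeps going so that one broken cache does not leave the others stale.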
diff --git a/git-update-gitweb-info.sh b/git-update-gitweb-info.sh
new file mode 100755
index 0000000..5bb44df
--- /dev/null
+++ b/git-update-gitweb-info.sh
@@ -0,0 +1,7 @@
+#!/bin/sh
+
+. git-sh-setup
+test -w "$GIT_DIR/info/last-changed" &&
+git-for-each-ref \
+ --format='%(committer)' --sort=-committerdate --count=1 refs/heads \
+ > "$GIT_DIR/info/last-changed"
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 88af2e6..e7874a6 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1150,12 +1150,16 @@ sub git_get_last_activity {
my ($path) = @_;
my $fd;
- $git_dir = "$projectroot/$path";
- open($fd, "-|", git_cmd(), 'for-each-ref',
- '--format=%(committer)',
- '--sort=-committerdate',
- '--count=1',
- 'refs/heads') or return;
+ if (-r "$projectroot/$path/info/last-changed") {
+ open $fd, "$projectroot/$path/info/last-changed";
+ } else {
+ $git_dir = "$projectroot/$path";
+ open($fd, "-|", git_cmd(), 'for-each-ref',
+ '--format=%(committer)',
+ '--sort=-committerdate',
+ '--count=1',
+ 'refs/heads') or return;
+ }
my $most_recent = <$fd>;
close $fd or return;
if ($most_recent =~ / (\d+) [-+][01]\d\d\d$/) {
diff --git a/templates/hooks--post-update b/templates/hooks--post-update
old mode 100644
new mode 100755
index bcba893..b119224
--- a/templates/hooks--post-update
+++ b/templates/hooks--post-update
@@ -6,3 +6,4 @@
# To enable this hook, make this file executable by "chmod +x post-update".
exec git-update-server-info
+exec git-update-gitweb-info
--
Jakub Narebski
Poland
^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-25 17:13 ` .git/info/refs H. Peter Anvin
2007-01-26 11:22 ` .git/info/refs Jakub Narebski
@ 2007-01-26 11:41 ` Junio C Hamano
2007-01-26 16:39 ` .git/info/refs H. Peter Anvin
1 sibling, 1 reply; 39+ messages in thread
From: Junio C Hamano @ 2007-01-26 11:41 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git, Jakub Narebski
"H. Peter Anvin" <hpa@zytor.com> writes:
> For heaven's sake, in computer science we can *NEVER* use the same
> feature for *MORE THAN ONE THING*. If it doesn't work format-wise
> that's fine, but "it's only supposed to be used by dumb transports" is
> ridiculous.
Hmmmm... I am lost here....
> Right
> now, git-update-server-index is the command to update cached
> information, and for usability reasons there should be a single entry
> point.
Modulo s/-index/-info/, I agree that would be a very sensible
position, as long as the cost to generate additional cached
information necessary to help gitweb is reasonably small, I am
not opposed to have it generate another file [*1*].
[*1*]
I've been looking for backward-compatible holes in ls-remote and
its users, hoping we somehow could shoehorn this information in
info/refs, as I do not think its file format is sacred, nor the
file is there _only_ to help dumb transports. As long as the
published way to access that information stays consistent, the
underlying file format is a fair game. However, I do not think
the ls-remote command implementations in the wild have such a
hole I can exploit.
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-26 11:41 ` .git/info/refs Junio C Hamano
@ 2007-01-26 16:39 ` H. Peter Anvin
2007-01-26 17:06 ` .git/info/refs Jakub Narebski
2007-01-26 21:09 ` .git/info/refs Johannes Schindelin
0 siblings, 2 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-26 16:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jakub Narebski
Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
>
>> For heaven's sake, in computer science we can *NEVER* use the same
>> feature for *MORE THAN ONE THING*. If it doesn't work format-wise
>> that's fine, but "it's only supposed to be used by dumb transports" is
>> ridiculous.
>
> Hmmmm... I am lost here....
>
Jakub and Johannes seems to have been arguing that "info/refs is for
dumb transports, therefore it cannot be used for any other purpose." I
find this argument utterly bizarre, since in general, in computer
science, you try to be multipurpose whenever practical.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-26 16:39 ` .git/info/refs H. Peter Anvin
@ 2007-01-26 17:06 ` Jakub Narebski
2007-01-26 21:09 ` .git/info/refs Johannes Schindelin
1 sibling, 0 replies; 39+ messages in thread
From: Jakub Narebski @ 2007-01-26 17:06 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Junio C Hamano, git, Johannes Schindelin
H. Peter Anvin wrote:
> Junio C Hamano wrote:
>> "H. Peter Anvin" <hpa@zytor.com> writes:
>>
>>> For heaven's sake, in computer science we can *NEVER* use the same
>>> feature for *MORE THAN ONE THING*. If it doesn't work format-wise
>>> that's fine, but "it's only supposed to be used by dumb transports" is
>>> ridiculous.
Please, for the future, mark irony if it might be mistaken...
>> Hmmmm... I am lost here....
>
> Jakub and Johannes seems to have been arguing that "info/refs is for
> dumb transports, therefore it cannot be used for any other purpose." I
> find this argument utterly bizarre, since in general, in computer
> science, you try to be multipurpose whenever practical.
First, changing info/refs format _might_ break fetch related scripts,
which rely on git-peek-remote / git-ls-remote / info/refs format.
Second, it is a bit impractical because info/refs contain (and must
contain) also _tags_ information (which is not needed for gitweb
"Last Change" field in projects list) and referenced object for
those tags. Tags need not point to commits, nor dereference
to commits: for example in git.git tags v1.0rc1 to v1.0rc6 point
to other tags, and junio-gpg-pub points to an out-of-tree blob (which
does not have any "commit time" associated). So what to write there
in the "commit time" field? What to write in "commit time" for tags?
--
Jakub Narebski
Poland
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-26 16:39 ` .git/info/refs H. Peter Anvin
2007-01-26 17:06 ` .git/info/refs Jakub Narebski
@ 2007-01-26 21:09 ` Johannes Schindelin
2007-01-26 21:32 ` .git/info/refs H. Peter Anvin
2007-01-26 21:54 ` .git/info/refs H. Peter Anvin
1 sibling, 2 replies; 39+ messages in thread
From: Johannes Schindelin @ 2007-01-26 21:09 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Junio C Hamano, git, Jakub Narebski
Hi,
On Fri, 26 Jan 2007, H. Peter Anvin wrote:
> Junio C Hamano wrote:
> > "H. Peter Anvin" <hpa@zytor.com> writes:
> >
> > > For heaven's sake, in computer science we can *NEVER* use the same
> > > feature for *MORE THAN ONE THING*. If it doesn't work format-wise
> > > that's fine, but "it's only supposed to be used by dumb transports" is
> > > ridiculous.
> >
> > Hmmmm... I am lost here....
> >
>
> Jakub and Johannes seems to have been arguing that "info/refs is for dumb
> transports, therefore it cannot be used for any other purpose." I find this
> argument utterly bizarre, since in general, in computer science, you try to be
> multipurpose whenever practical.
You keep on harping on that issue. I really get tired of it.
You seem to propose that we should stuff things into .git/info/refs, just
because it is already there.
You seem to suggest that computer science is all about breaking things, to
muddle waters by mixing things which are clearly different kinds of
kettle, to "just" add a small thing here and there, all under the guise of
multi-purposity or whatever.
You seem to reason that practicality is more important than good style.
You know, this reasoning brought to us that big crap sh*tpile called
Windows. They also thought: it's not a big deal, let's introduce just a
little thing here and there, and a direct call from this component to that
component cannot hurt, can it?
I am really, really getting tired of that kind of reasoning.
If you don't see how INELEGANT it is to force dumb transports to download
things MEANT FOR GITWEB, and how much NICER it would be to have a file
WHICH IS ONLY MEANT FOR GITWEB TO BEGIN WITH, and which is easily
EXTENSIBLE, since we don't have to CARE about DUMB TRANSPORTS, because
gitweb data is a PURELY LOCAL thing, while dumb transports are NOT, I will
just start to ignore you.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-26 21:09 ` .git/info/refs Johannes Schindelin
@ 2007-01-26 21:32 ` H. Peter Anvin
2007-01-26 21:54 ` .git/info/refs H. Peter Anvin
1 sibling, 0 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-26 21:32 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Junio C Hamano, git, Jakub Narebski
Johannes Schindelin wrote:
[... stuff ...]
I really could care less, as long as a single invocation is used to
update the cache information.
I disagree with your aesthetic argument, but it doesn't matter much to
me either way.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-26 21:09 ` .git/info/refs Johannes Schindelin
2007-01-26 21:32 ` .git/info/refs H. Peter Anvin
@ 2007-01-26 21:54 ` H. Peter Anvin
1 sibling, 0 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-26 21:54 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Junio C Hamano, git, Jakub Narebski
By the way, let me be the first to apologize for the emotional
escalation. What matters to me is that the information is cached and
updated by a common cached information entry point (the existing
git-update-server-info would be preferred, obviously), not where the
information ends up.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-25 23:33 ` .git/info/refs Johannes Schindelin
@ 2007-01-27 22:07 ` H. Peter Anvin
2007-01-31 15:38 ` .git/info/refs Santi Béjar
2007-02-01 14:03 ` .git/info/refs Johannes Schindelin
0 siblings, 2 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-01-27 22:07 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Junio C Hamano, git
Johannes Schindelin wrote:
> No. I would never say that you have to run for-each-ref for each ref.
> That's plain stupid.
I went back and looked at the thread, and I had indeed misread the
original message, which was from Jakub, not you. I think I got in the
"this is surreal" mode as a result of that (invoking for-each-ref 250
times is bad enough, obviously.)
> BTW I take some satisfaction in that you finally agreed (in another email)
> that some post-creation caching is necessary.
I don't believe I have ever disputed that (in fact, I have pushed very
hard for gitweb to do post-creation caching.)
> I would be even more satisfied if you finally agreed that it is a good
> practice to separate conceptually different things, and not continued ad
> infinitum (and ad nauseam) arguing that .git/info/refs should serve dumb
> transports, and gitweb, and eventually bring peace to everybody on this
> planet.
I've already said I think it's an aesthetic argument, but I don't really
care either way, as long as there is only one hook that updates all the
caches. I don't want the user to have to juggle an arbitrary and
increasing number of hooks.
Fair?
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-27 22:07 ` .git/info/refs H. Peter Anvin
@ 2007-01-31 15:38 ` Santi Béjar
2007-02-01 14:03 ` .git/info/refs Johannes Schindelin
1 sibling, 0 replies; 39+ messages in thread
From: Santi Béjar @ 2007-01-31 15:38 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Johannes Schindelin, Junio C Hamano, git
On 1/27/07, H. Peter Anvin <hpa@zytor.com> wrote:
> Johannes Schindelin wrote:
> > No. I would never say that you have to run for-each-ref for each ref.
> > That's plain stupid.
>
> I went back and looked at the thread, and I had indeed misread the
> original message, which was from Jakub, not you. I think I got in the
> "this is surreal" mode as a result of that (invoking for-each-ref 250
> times is bad enough, obviously.)
>
Normally I'm not interested in the "Last Change" column, I just want
to go to the project summary page, and normally I'm not interested in
the last 16 tags (the last three are just enough). For me they should
be shown only when explicitly asked.
Santi
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-01-27 22:07 ` .git/info/refs H. Peter Anvin
2007-01-31 15:38 ` .git/info/refs Santi Béjar
@ 2007-02-01 14:03 ` Johannes Schindelin
2007-02-01 16:16 ` .git/info/refs H. Peter Anvin
1 sibling, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-02-01 14:03 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
Hi,
I just had another idea: why not generate the content of the "cover page"
in a cron job, every minute or so, and save it into a static index.html?
This should take quite a load from the server, since not even Perl has to
be started to serve that page.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-02-01 14:03 ` .git/info/refs Johannes Schindelin
@ 2007-02-01 16:16 ` H. Peter Anvin
2007-02-01 16:52 ` .git/info/refs Johannes Schindelin
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-02-01 16:16 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
Johannes Schindelin wrote:
> Hi,
>
> I just had another idea: why not generate the content of the "cover page"
> in a cron job, every minute or so, and save it into a static index.html?
> This should take quite a load from the server, since not even Perl has to
> be started to serve that page.
>
Ehm... because it often takes longer than that to generate the page?
We can pre-generate the page before the first hit, but that's not a
replacement for update-time caching.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-02-01 16:16 ` .git/info/refs H. Peter Anvin
@ 2007-02-01 16:52 ` Johannes Schindelin
2007-02-01 16:56 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Johannes Schindelin @ 2007-02-01 16:52 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
Hi,
On Thu, 1 Feb 2007, H. Peter Anvin wrote:
> Johannes Schindelin wrote:
> > Hi,
> >
> > I just had another idea: why not generate the content of the "cover page" in
> > a cron job, every minute or so, and save it into a static index.html? This
> > should take quite a load from the server, since not even Perl has to be
> > started to serve that page.
> >
>
> Ehm... because it often takes longer than that to generate the page?
Sorry, I should have been clearer. Plan:
1. echo "Generating" > /htdocs/git/index.html
2. edit crontab to do this every minute:
2.1 gitweb is called _directly_, to generate /htdocs/git/index.html.new
2.2 /htdocs/git/index.html.new is _moved_ into /htdocs/git/index.html,
overwriting the existing one.
Yes, there could be two instances of this task concurrently. No, it does
not matter.
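
The write-then-move step of this plan can be sketched as follows. This is an
illustration only: publish_page is a hypothetical helper, and the generator
command is whatever produces the page on stdout. It relies on a rename within
one filesystem replacing the target in a single step.

```sh
#!/bin/sh
# Hypothetical helper: publish_page <target> <generator command...>
# Writes the generator's stdout to <target>.new and renames it over
# <target> only if the generator succeeded, so readers never see a
# half-written page and a failed run leaves the old page in place.
publish_page() {
    target=$1
    shift
    if "$@" > "$target.new"; then
        mv -f "$target.new" "$target"
    else
        rm -f "$target.new"
        return 1
    fi
}

# Example call (path from the plan above; generate_index is a placeholder):
#   publish_page /htdocs/git/index.html generate_index
```

Two overlapping runs would both write to the same ".new" file here, so this
sketch by itself does not make concurrent instances safe.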
> We can pre-generate the page before the first hit, but that's not a
> replacement for update-time caching.
It was only meant as a quick fix for the horrible workload.
Just a thought, feel free to ignore me,
Dscho
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-02-01 16:52 ` .git/info/refs Johannes Schindelin
@ 2007-02-01 16:56 ` H. Peter Anvin
2007-02-01 17:32 ` .git/info/refs Matthias Lederhofer
0 siblings, 1 reply; 39+ messages in thread
From: H. Peter Anvin @ 2007-02-01 16:56 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
Johannes Schindelin wrote:
>
> Sorry, I should have been clearer. Plan:
>
> 1. echo "Generating" > /htdocs/git/index.html
> 2. edit crontab to do this every minute:
> 2.1 gitweb is called _directly_, to generate /htdocs/git/index.html.new
> 2.2 /htdocs/git/index.html.new is _moved_ into /htdocs/git/index.html,
> overwriting the existing one.
>
> Yes, there could be two instances of this task concurrently. No, it does
> not matter.
>
Yes, it does matter, because it drives the load up further. If you
start having this going on in overlapping instances, then you're soon on
the downhill slope of a cascading failure.
>> We can pre-generate the page before the first hit, but that's not a
>> replacement for update-time caching.
>
> It was only meant as a quick fix for the horrible workload.
And we have already experimented with it. It unfortunately doesn't help
much, it only makes matters worse.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-02-01 16:56 ` .git/info/refs H. Peter Anvin
@ 2007-02-01 17:32 ` Matthias Lederhofer
2007-02-01 17:51 ` .git/info/refs H. Peter Anvin
0 siblings, 1 reply; 39+ messages in thread
From: Matthias Lederhofer @ 2007-02-01 17:32 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: git
H. Peter Anvin <hpa@zytor.com> wrote:
> Yes, it does matter, because it drives the load up further. If you
> start having this going on in overlapping instances, then you're soon on
> the downhill slope of a cascading failure.
Add some other locking mechanism.
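
For example, a directory-based lock would keep a new run from starting while
the previous one is still going. A minimal sketch; with_lock is a hypothetical
helper, the lock path is up to the caller, and the exit status 75 used for
"skipped" is an arbitrary choice:

```sh
#!/bin/sh
# Hypothetical helper: with_lock <lockdir> <command...>
# mkdir either creates the directory or fails, so only one caller at a
# time gets past it; a run that finds the lock held is skipped.
with_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75
    fi
    rc=0
    "$@" || rc=$?
    rmdir "$lockdir"
    return $rc
}

# Example call from cron (lock path and command are placeholders):
#   with_lock /var/tmp/gitweb-index.lock regenerate_index
```

A run that is killed before reaching rmdir leaves a stale lock behind, so a
real setup would also need some way to detect and clear it.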
> And we have already experimented with it. It unfortunately doesn't help
> much, it only makes matters worse.
The gitweb overview page has less than one hit per minute? Otherwise
this should help.
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: .git/info/refs
2007-02-01 17:32 ` .git/info/refs Matthias Lederhofer
@ 2007-02-01 17:51 ` H. Peter Anvin
0 siblings, 0 replies; 39+ messages in thread
From: H. Peter Anvin @ 2007-02-01 17:51 UTC (permalink / raw)
To: H. Peter Anvin, git
Matthias Lederhofer wrote:
> H. Peter Anvin <hpa@zytor.com> wrote:
>> Yes, it does matter, because it drives the load up further. If you
>> start having this going on in overlapping instances, then you're soon on
>> the downhill slope of a cascading failure.
> Add some other locking mechanism.
>
>> And we have already experimented with it. It unfortunately doesn't help
>> much, it only makes matters worse.
> The gitweb overview page has less than one hit per minute? Otherwise
> this should help.
We already cache it with a forced duration of some 15 minutes. The end
result is exactly the same.
-hpa
^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads:[~2007-02-01 17:52 UTC | newest]
Thread overview: 39+ messages
2007-01-24 7:38 .git/info/refs H. Peter Anvin
2007-01-24 9:28 ` .git/info/refs Jakub Narebski
2007-01-24 15:55 ` .git/info/refs H. Peter Anvin
2007-01-24 16:02 ` .git/info/refs Johannes Schindelin
2007-01-24 16:24 ` .git/info/refs H. Peter Anvin
2007-01-24 16:38 ` .git/info/refs Johannes Schindelin
2007-01-24 16:41 ` .git/info/refs H. Peter Anvin
2007-01-24 16:52 ` .git/info/refs Johannes Schindelin
2007-01-24 17:06 ` .git/info/refs H. Peter Anvin
2007-01-24 17:25 ` .git/info/refs Jakub Narebski
2007-01-24 17:10 ` .git/info/refs Jakub Narebski
2007-01-24 17:20 ` .git/info/refs Johannes Schindelin
2007-01-25 17:13 ` .git/info/refs H. Peter Anvin
2007-01-26 11:22 ` .git/info/refs Jakub Narebski
2007-01-26 11:41 ` .git/info/refs Junio C Hamano
2007-01-26 16:39 ` .git/info/refs H. Peter Anvin
2007-01-26 17:06 ` .git/info/refs Jakub Narebski
2007-01-26 21:09 ` .git/info/refs Johannes Schindelin
2007-01-26 21:32 ` .git/info/refs H. Peter Anvin
2007-01-26 21:54 ` .git/info/refs H. Peter Anvin
2007-01-24 20:40 ` .git/info/refs Jakub Narebski
2007-01-24 20:44 ` .git/info/refs hpa
2007-01-25 8:14 ` .git/info/refs Johannes Schindelin
2007-01-25 16:12 ` .git/info/refs H. Peter Anvin
2007-01-25 16:50 ` .git/info/refs Johannes Schindelin
2007-01-24 20:45 ` .git/info/refs hpa
2007-01-25 21:28 ` .git/info/refs Junio C Hamano
2007-01-25 21:37 ` .git/info/refs H. Peter Anvin
2007-01-25 21:51 ` .git/info/refs Junio C Hamano
2007-01-25 22:01 ` .git/info/refs H. Peter Anvin
2007-01-25 23:33 ` .git/info/refs Johannes Schindelin
2007-01-27 22:07 ` .git/info/refs H. Peter Anvin
2007-01-31 15:38 ` .git/info/refs Santi Béjar
2007-02-01 14:03 ` .git/info/refs Johannes Schindelin
2007-02-01 16:16 ` .git/info/refs H. Peter Anvin
2007-02-01 16:52 ` .git/info/refs Johannes Schindelin
2007-02-01 16:56 ` .git/info/refs H. Peter Anvin
2007-02-01 17:32 ` .git/info/refs Matthias Lederhofer
2007-02-01 17:51 ` .git/info/refs H. Peter Anvin