* read-for-fill and caching in gitweb (Re: kernel.org mirroring)
@ 2006-12-28 20:45 Martin Langhoff
2006-12-29 3:21 ` Robert Fitzsimons
0 siblings, 1 reply; 6+ messages in thread
From: Martin Langhoff @ 2006-12-28 20:45 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeff Garzik, H. Peter Anvin, Rogan Dawes, Kernel Org Admin,
Git Mailing List, Jakub Narebski
On 12/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Actually, just looking at the examples, it looks like memcached is
> fundamentally flawed, exactly the same way Apache mod_cache is
> fundamentally flawed.
memcached is really fast internally, but can be rather slow from the
POV of the client code, as it forces a costly
marshalling/unmarshalling of data. For Perl-only situations where it
is OK to have per-server caches, I have been looking at
Cache::FastMmap. I will probably try to implement caching for the
projects, summary & log/shortlog pages using Cache::FastMmap.
And I'll do read-for-fill for it, and see how that goes.
(BTW, in the last week I've had to implement a similar
anti-thundering-herds cache in PHP using memcached and/or eaccelerator
-- a shmem cache -- and I've done a read-for-fill for both of them
that works reasonably well.)
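A minimal sketch of the read-for-fill idea, assuming the behaviour is
"the first reader to hit a missing or stale entry is told to regenerate
it, and everyone else keeps getting the old copy in the meantime". The
class and method names are hypothetical and this is not the PHP or
Cache::FastMmap code mentioned above; it is an in-process toy with no
locking across threads or processes.

```python
import time


class ReadForFillCache:
    """Toy cache: at most one caller is asked to refill a given key."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl            # seconds an entry stays fresh
        self.clock = clock        # injectable clock, handy for testing
        self.entries = {}         # key -> (value, expires_at)
        self.filling = set()      # keys somebody is already regenerating

    def read_for_fill(self, key):
        """Return (value, must_fill).

        must_fill is True for exactly one caller per missing or stale key;
        that caller is expected to call fill(). Other callers get the old
        value (None if there is none yet) with must_fill set to False.
        """
        entry = self.entries.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0], False            # fresh hit
        stale = entry[0] if entry is not None else None
        if key in self.filling:
            return stale, False               # someone else is filling
        self.filling.add(key)
        return stale, True                    # this caller must fill

    def fill(self, key, value):
        """Store a regenerated value and release the fill marker."""
        self.entries[key] = (value, self.clock() + self.ttl)
        self.filling.discard(key)
```

With a shared cache the "is someone already filling?" check would have
to be atomic across processes (for example via memcached's `add`
command, which only succeeds if the key does not exist yet); the set
above only stands in for that.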
cheers,
martin
* Re: read-for-fill and caching in gitweb (Re: kernel.org mirroring)
2006-12-28 20:45 read-for-fill and caching in gitweb (Re: kernel.org mirroring) Martin Langhoff
@ 2006-12-29 3:21 ` Robert Fitzsimons
2006-12-29 10:40 ` Jakub Narebski
0 siblings, 1 reply; 6+ messages in thread
From: Robert Fitzsimons @ 2006-12-29 3:21 UTC (permalink / raw)
To: Martin Langhoff
Cc: Linus Torvalds, Jeff Garzik, H. Peter Anvin, Rogan Dawes,
Kernel Org Admin, Git Mailing List, Jakub Narebski
> I will probably try to implement caching for the
> projects, summary & log/shortlog pages using Cache::FastMmap.
Here are the mean (and standard deviation) in milliseconds for those
pages using a few different versions of gitweb.
                  project_list  summary       shortlog      log
                    mean    sd    mean    sd    mean    sd    mean    sd
v267                 173   1.6    1141   8.8     795   5.0     919   1.9
1.4.4.3              220   2.3     397   2.4     930   4.2    1113  56.9
1.5.0.rc0.g4a4d      226   1.9     292   1.7     352   4.0     491   6.7
1.5.0.rc0.g4a4d       60   1.0     131   0.7     195   1.2     347   3.7
  (mod_perl)
I think there would be a benefit in deploying a more recent version of
gitweb on kernel.org, and an even bigger benefit if it used mod_perl. I
would be happy to help, if I can.
I'll look into the increase in time for the project_list in more recent
versions of gitweb, tomorrow.
Robert
* Re: read-for-fill and caching in gitweb (Re: kernel.org mirroring)
2006-12-29 3:21 ` Robert Fitzsimons
@ 2006-12-29 10:40 ` Jakub Narebski
2006-12-29 11:46 ` Martin Langhoff
2006-12-29 19:31 ` Robert Fitzsimons
0 siblings, 2 replies; 6+ messages in thread
From: Jakub Narebski @ 2006-12-29 10:40 UTC (permalink / raw)
To: Robert Fitzsimons
Cc: Martin Langhoff, Linus Torvalds, Jeff Garzik, H. Peter Anvin,
Rogan Dawes, Kernel Org Admin, Git Mailing List
Robert Fitzsimons wrote:
> Here are the mean (and standard deviation) in milliseconds for those
> pages using a few different versions of gitweb.
>
>                   project_list  summary       shortlog      log
>                     mean    sd    mean    sd    mean    sd    mean    sd
> v267                 173   1.6    1141   8.8     795   5.0     919   1.9
> 1.4.4.3              220   2.3     397   2.4     930   4.2    1113  56.9
> 1.5.0.rc0.g4a4d      226   1.9     292   1.7     352   4.0     491   6.7
> 1.5.0.rc0.g4a4d       60   1.0     131   0.7     195   1.2     347   3.7
>   (mod_perl)
> I'll look into the increase in time for the project_list in more recent
> versions of gitweb, tomorrow.
It is simply the case that new features cost more. Namely, in earlier
versions of gitweb Last Change time was taken from HEAD (from the current
branch); in newer versions we check all branches (using git-for-each-ref).
For a published public repository it might make sense to also pack heads
(make them packed refs).
I was thinking about making this a gitweb %feature, allowing the gitweb
administrator to choose whether Last Change is taken from all branches
(as it is now), from HEAD (as it was before), or from a given branch
(for example master).
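To make the three choices concrete, here is a rough sketch of the
selection logic only. It assumes the per-ref commit times have already
been collected (for example from git-for-each-ref output); the function
name, the mode strings and the arguments are all hypothetical and are
not part of gitweb.

```python
def last_change(ref_times, head_ref, mode="all", branch=None):
    """Pick the Last Change timestamp for a project.

    ref_times: dict mapping full ref names (e.g. "refs/heads/master")
               to commit timestamps in seconds since the epoch.
    head_ref:  the ref that HEAD points to.
    mode:      "all" (newest of all branches), "head", or "branch".
    branch:    short branch name, only used when mode == "branch".
    Returns None when the requested ref is not present.
    """
    if mode == "all":
        return max(ref_times.values(), default=None)
    if mode == "head":
        return ref_times.get(head_ref)
    if mode == "branch":
        return ref_times.get("refs/heads/" + branch)
    raise ValueError("unknown mode: %r" % (mode,))
```

Only the "all" mode needs every branch; the other two need a single
lookup, which is where the time saving would come from.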
Another thing that might have caused a small increase in time is checking
whether a project is to be visible to gitweb ($export_ok and $strict_export).
--
Jakub Narebski
Poland
* Re: read-for-fill and caching in gitweb (Re: kernel.org mirroring)
2006-12-29 10:40 ` Jakub Narebski
@ 2006-12-29 11:46 ` Martin Langhoff
2006-12-29 12:47 ` Jakub Narebski
2006-12-29 19:31 ` Robert Fitzsimons
1 sibling, 1 reply; 6+ messages in thread
From: Martin Langhoff @ 2006-12-29 11:46 UTC (permalink / raw)
To: Jakub Narebski
Cc: Robert Fitzsimons, Linus Torvalds, Jeff Garzik, H. Peter Anvin,
Rogan Dawes, Kernel Org Admin, Git Mailing List
On 12/29/06, Jakub Narebski <jnareb@gmail.com> wrote:
> It is simply the case that new features cost more. Namely, in earlier
> versions of gitweb Last Change time was taken from HEAD (from the current
> branch); in newer versions we check all branches (using git-for-each-ref).
> For a published public repository it might make sense to also pack heads
> (make them packed refs).
I haven't been using packed refs at all, but it sounds like it's a
single file. So we can stat just that file rather than ask questions
about the heads themselves. That makes checking for if-modified-since
cheap as well.
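A sketch of what that check could look like, assuming the mtime of
$GIT_DIR/packed-refs really is a good enough "last changed" signal for
the repository and that no loose refs have been updated since the last
pack. The function name is hypothetical.

```python
import os
from email.utils import mktime_tz, parsedate_tz


def not_modified(git_dir, if_modified_since):
    """Return True if packed-refs is no newer than the header's date.

    git_dir:           path to the repository's GIT_DIR.
    if_modified_since: raw If-Modified-Since header value, or None.
    A missing or unparseable header is treated as "modified".
    """
    if not if_modified_since:
        return False
    parsed = parsedate_tz(if_modified_since)
    if parsed is None:
        return False
    since = mktime_tz(parsed)
    # One stat() call; HTTP dates have whole-second resolution.
    mtime = int(os.stat(os.path.join(git_dir, "packed-refs")).st_mtime)
    return mtime <= since
```

A caller would answer 304 Not Modified when this returns True and
generate the page otherwise.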
> I was thinking about making this a gitweb %feature, allowing the gitweb
> administrator to choose whether Last Change is taken from all branches
> (as it is now), from HEAD (as it was before), or from a given branch
> (for example master).
I think the natural thing is to check all heads (doing it on the cheap
on packed-refs repos) and provide tuning tips, in this case "use
packed refs", which I guess will become the default eventually.
cheers,
martin
* Re: read-for-fill and caching in gitweb (Re: kernel.org mirroring)
2006-12-29 11:46 ` Martin Langhoff
@ 2006-12-29 12:47 ` Jakub Narebski
0 siblings, 0 replies; 6+ messages in thread
From: Jakub Narebski @ 2006-12-29 12:47 UTC (permalink / raw)
To: Martin Langhoff
Cc: Robert Fitzsimons, Linus Torvalds, Jeff Garzik, H. Peter Anvin,
Rogan Dawes, Kernel Org Admin, Git Mailing List
Martin Langhoff wrote:
> On 12/29/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> It is simply the case that new features cost more. Namely, in earlier
>> versions of gitweb Last Change time was taken from HEAD (from the current
>> branch); in newer versions we check all branches (using git-for-each-ref).
>> For a published public repository it might make sense to also pack heads
>> (make them packed refs).
>
> I haven't been using packed refs at all, but it sounds like it's a
> single file. So we can stat just that file rather than ask questions
> about the heads themselves. That makes checking for if-modified-since
> cheap as well.
I think that would work _only_ for a working repository. For a bare
publishing repository that you push into (or that is a mirror of some
other repository), I think stat $GIT_DIR/packed-refs would return the date
of the last push (last mirror), not when the repository was last committed to...
>> I was thinking about making this a gitweb %feature, allowing the gitweb
>> administrator to choose whether Last Change is taken from all branches
>> (as it is now), from HEAD (as it was before), or from a given branch
>> (for example master).
>
> I think the natural thing is to check all heads (doing it on the cheap
> on packed-refs repos) and provide tuning tips, in this case "use
> packed refs", which I guess will become the default eventually.
...but this could be included in the above %feature.
--
Jakub Narebski
Poland
* Re: read-for-fill and caching in gitweb (Re: kernel.org mirroring)
2006-12-29 10:40 ` Jakub Narebski
2006-12-29 11:46 ` Martin Langhoff
@ 2006-12-29 19:31 ` Robert Fitzsimons
1 sibling, 0 replies; 6+ messages in thread
From: Robert Fitzsimons @ 2006-12-29 19:31 UTC (permalink / raw)
To: Jakub Narebski
Cc: Robert Fitzsimons, Martin Langhoff, Linus Torvalds, Jeff Garzik,
H. Peter Anvin, Rogan Dawes, Kernel Org Admin, Git Mailing List
> >                   project_list  summary       shortlog      log
> >                     mean    sd    mean    sd    mean    sd    mean    sd
> > v267                 173   1.6    1141   8.8     795   5.0     919   1.9
> > 1.4.4.3              220   2.3     397   2.4     930   4.2    1113  56.9
> > 1.5.0.rc0.g4a4d      226   1.9     292   1.7     352   4.0     491   6.7
> > 1.5.0.rc0.g4a4d       60   1.0     131   0.7     195   1.2     347   3.7
> >   (mod_perl)
> It is simply the case that new features cost more. Namely, in earlier
> versions of gitweb Last Change time was taken from HEAD (from the current
> branch); in newer versions we check all branches (using git-for-each-ref).
> For a published public repository it might make sense to also pack heads
> (make them packed refs).
>
> I was thinking about making this a gitweb %feature, allowing the gitweb
> administrator to choose whether Last Change is taken from all branches
> (as it is now), from HEAD (as it was before), or from a given branch
> (for example master).
I've sent a separate email with a patch to add this feature.
("[PATCH] gitweb: New feature last_modified_ref."
<20061229185805.GF6558@localhost>).
Here are the new numbers. Notes: I've only got 3 projects in my project
list and I did a 'git gc' on them since yesterday.
                  project_list  summary       shortlog      log
                    mean    sd    mean    sd    mean    sd    mean    sd
v267                 174   1.1     286   2.1     794   3.4     921   3.2
1.4.4.3              207   1.7     383   2.0     921   5.2    1082   3.8
g04509 + patch       213   1.6     297  68.9     341   3.9     484   5.0
g04509 + patch        71  69.9     117   2.5     190   2.1     341   2.7
  (mod_perl)
g04509 + patch       209   1.0     276   1.5     342   3.3     483   6.3
  (HEAD)
g04509 + patch        66  70.1     117   2.6     189   3.4     341   3.8
  (HEAD, mod_perl)
The v267 summary time is wrong: that version of gitweb is not
packed-refs aware.
I think I need a more consistent test setup; I'm seeing some weird
deviations.
Robert