* [PATCH 0/3 v2] gitweb: Support caching projects list
@ 2008-03-17 15:09 Jakub Narebski
2008-03-17 15:09 ` [PATCH 1/3] gitweb: Separate @projects population into git_get_projects_details() Jakub Narebski
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Jakub Narebski @ 2008-03-17 15:09 UTC (permalink / raw)
To: git; +Cc: Petr Baudis, J.H., Frank Lichtenheld, Jakub Narebski
This series of patches is a resend of the patch by Petr 'Pasky' Baudis
with the same subject, which can be found at
Message-ID: <20080313231413.27966.3383.stgit@rover>
http://permalink.gmane.org/gmane.comp.version-control.git/77151
It is split into two patches (so that the exact details of serializing
and caching are separated from an independent code improvement), with
lazy filling of project details added as a third patch.
At the bottom there is an interdiff between Pasky's result and the
result after the first two patches here. Besides a few style changes,
the main difference is that in this version the dump of the @projects
array is done in 'terse' form, so it can be eval'ed directly into
@projects.
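To illustrate the 'terse' round trip, here is a minimal sketch. The
helper names dump_projects() and load_projects() are made up for this
example and are not part of the patch:

```perl
use strict;
use warnings;
use Data::Dumper;

# Serialize a list of project hashes so that eval() returns the
# array reference directly (no "$VAR1 = " prefix in the dump).
sub dump_projects {
    my (@projects) = @_;
    local $Data::Dumper::Terse = 1;
    return Dumper(\@projects);
}

# Restore the list; returns an empty list if the dump is unusable.
sub load_projects {
    my ($dump) = @_;
    my $ref = eval $dump;
    return (ref($ref) eq 'ARRAY') ? @$ref : ();
}

my @projects = ({ path => 'git.git', owner => 'example owner' });
my @restored = load_projects(dump_projects(@projects));
print "$restored[0]{'path'}\n";
```

Note that eval of a dump is only as trustworthy as the file it was
read from.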
Table of contents:
==================
[PATCH 1/3] gitweb: Separate filling projects info
into git_get_projects_details()
[PATCH 2/3] gitweb: Support caching projects list
[PATCH 3/3] gitweb: Fill project details only if project path
mtime changed
Shortlog:
=========
Jakub Narebski (1):
gitweb: Fill project details only if project path mtime changed
Petr Baudis (2):
gitweb: Separate filling projects info into git_get_projects_details()
gitweb: Support caching projects list
Diffstat:
=========
gitweb/gitweb.css | 6 ++++
gitweb/gitweb.perl | 73 ++++++++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 74 insertions(+), 5 deletions(-)
Interdiff:
==========
gitweb/gitweb.perl | 35 ++++++++++++++++++++---------------
1 files changed, 20 insertions(+), 15 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index bee5ec8..5527378 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -123,7 +123,7 @@ our @diff_opts = ('-M'); # taken from git_commit
# index lifetime in minutes
# the cached list version is stored in /tmp and can be tweaked
# by other scripts running with the same uid as gitweb - use this
-# only at secure installations; only single gitweb project root per
+# ONLY at secure installations; only single gitweb project root per
# system is supported!
our $projlist_cache_lifetime = 0;
@@ -3482,6 +3482,8 @@ sub git_patchset_body {
# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+# fill age, description, owner, forks (last one only if $check_forks)
+# for all projects in $projlist reference; fill projects info
sub git_get_projects_details {
my ($projlist, $check_forks) = @_;
@@ -3521,34 +3523,37 @@ sub git_project_list_body {
my ($check_forks) = gitweb_check_feature('forks');
- my $cache_file = '/tmp/gitweb.index.cache';
use File::stat;
+ use POSIX qw(:fcntl_h);
+
+ my $cache_file = '/tmp/gitweb.index.cache';
my @projects;
my $stale = 0;
- if ($cache_lifetime and -f $cache_file
- and stat($cache_file)->mtime + $cache_lifetime * 60 > time()
- and open (my $fd, $cache_file)) {
- $stale = time() - stat($cache_file)->mtime;
- my @dump = <$fd>;
+ my $now = time();
+ if ($cache_lifetime && -f $cache_file &&
+ stat($cache_file)->mtime + $cache_lifetime * 60 > $now &&
+ open(my $fd, '<', $cache_file)) {
+ $stale = $now - stat($cache_file)->mtime;
+ local $/ = undef;
+ my $dump = <$fd>;
close $fd;
- # Hack zone start
- my $VAR1;
- eval join("\n", @dump);
- @projects = @$VAR1;
- # Hack zone end
+ @projects = @{ eval $dump };
} else {
- if ($cache_lifetime and -f $cache_file) {
+ if ($cache_lifetime && -f $cache_file) {
# Postpone timeout by two minutes so that we get
# enough time to do our job.
my $time = time() - $cache_lifetime + 120;
utime $time, $time, $cache_file;
}
@projects = git_get_projects_details($projlist, $check_forks);
- if ($cache_lifetime and open (my $fd, '>'.$cache_file)) {
+ if ($cache_lifetime &&
+ sysopen(my $fd, "$cache_file.lock", O_WRONLY|O_CREAT|O_EXCL, 0600)) {
use Data::Dumper;
+ $Data::Dumper::Terse = 1;
print $fd Dumper(\@projects);
close $fd;
+ rename "$cache_file.lock", $cache_file;
}
}
@@ -3556,7 +3561,7 @@ sub git_project_list_body {
$from = 0 unless defined $from;
$to = $#projects if (!defined $to || $#projects < $to);
- if ($cache_lifetime and $stale) {
+ if ($cache_lifetime && $stale) {
print "<div class=\"stale_info\">Cached version (${stale}s old)</div>\n";
}
* [PATCH 1/3] gitweb: Separate @projects population into git_get_projects_details()
2008-03-17 15:09 [PATCH 0/3 v2] gitweb: Support caching projects list Jakub Narebski
@ 2008-03-17 15:09 ` Jakub Narebski
2008-03-17 15:09 ` [RFC/PATCH 2/3] gitweb: Support caching projects list Jakub Narebski
2008-03-17 15:09 ` [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching Jakub Narebski
2 siblings, 0 replies; 11+ messages in thread
From: Jakub Narebski @ 2008-03-17 15:09 UTC (permalink / raw)
To: git; +Cc: Petr Baudis, J.H., Frank Lichtenheld, Jakub Narebski
From: Petr Baudis <pasky@suse.cz>
For clarity, project scanning and @projects population are separated
into git_get_projects_details().
This would be required if/when implementing in-gitweb caching of
projects list generation.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This is the first part of the patch sent by Petr Baudis; it could be
applied on its own to get cleaner code, even while we are still
discussing _how_ to do caching in gitweb in general, and projects list
caching in particular.
Note: git_get_projects_details() does not do the
return wantarray ? @projects : \@projects
dance.
By the way, it could modify @$projlist directly and simply return
$projlist.
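For reference, the wantarray "dance" looks roughly like this; the
get_details() helper is hypothetical and only shows the idiom:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list in list context and an
# array reference in scalar context.
sub get_details {
    my @projects = map { +{ path => $_ } } @_;
    return wantarray ? @projects : \@projects;
}

my @list = get_details('a.git', 'b.git');   # list context
my $ref  = get_details('a.git', 'b.git');   # scalar context
print scalar(@list), " ", scalar(@$ref), "\n";
```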
gitweb/gitweb.perl | 17 +++++++++++++----
1 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index ec73cb1..90ab894 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -3473,10 +3473,10 @@ sub git_patchset_body {
# . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-sub git_project_list_body {
- my ($projlist, $order, $from, $to, $extra, $no_header) = @_;
-
- my ($check_forks) = gitweb_check_feature('forks');
+# fill age, description, owner, forks (last one only if $check_forks)
+# for all projects in $projlist reference; fill projects info
+sub git_get_projects_details {
+ my ($projlist, $check_forks) = @_;
my @projects;
foreach my $pr (@$projlist) {
@@ -3506,6 +3506,15 @@ sub git_project_list_body {
}
push @projects, $pr;
}
+ return @projects;
+}
+
+sub git_project_list_body {
+ my ($projlist, $order, $from, $to, $extra, $no_header) = @_;
+
+ my ($check_forks) = gitweb_check_feature('forks');
+
+ my @projects = git_get_projects_details($projlist, $check_forks);
$order ||= $default_projects_order;
$from = 0 unless defined $from;
--
1.5.4.3.453.gc1ad83
* [RFC/PATCH 2/3] gitweb: Support caching projects list
2008-03-17 15:09 [PATCH 0/3 v2] gitweb: Support caching projects list Jakub Narebski
2008-03-17 15:09 ` [PATCH 1/3] gitweb: Separate @projects population into git_get_projects_details() Jakub Narebski
@ 2008-03-17 15:09 ` Jakub Narebski
2008-03-17 16:54 ` Frank Lichtenheld
2008-03-17 15:09 ` [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching Jakub Narebski
2 siblings, 1 reply; 11+ messages in thread
From: Jakub Narebski @ 2008-03-17 15:09 UTC (permalink / raw)
To: git; +Cc: Petr Baudis, J.H., Frank Lichtenheld, Jakub Narebski
From: Petr Baudis <pasky@suse.cz>
On repo.or.cz (permanently I/O overloaded and hosting 1050 projects +
forks), the projects list (the default gitweb page) can take more than
a minute to generate. This naive patch adds simple support for caching
the projects list data structure so that all the projects do not need
to get rescanned at every page access.
$projlist_cache_lifetime gitweb configuration variable is introduced,
by default set to zero. If set to non-zero, it describes the number of
minutes for which the cache remains valid. Only single project root
per system can use the cache. Any script running with the same uid as
gitweb can change the cache trivially - this is for secure
installations only.
The cache itself is stored in /tmp/gitweb.index.cache as a
Data::Dumper dump of the perl data structure with the list of project
details. When reusing the cache, the file is simply eval'd back into
@projects.
To prevent contention when multiple accesses coincide with cache
expiration, the timeout is postponed to time()+120 when we start
refreshing. When showing a cached version, a disclaimer is shown
at the top of the projects list.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This is the (slightly changed) second part of Petr Baudis' patch; the
difference (interdiff) between this version and the original can be
found in the cover letter for this series.
The differences, besides a few style changes like using '&&'
instead of 'and', are:
* The current version reads the cache file in full, in 'slurp' mode,
instead of reading it line by line and then concatenating the lines.
* The current version dumps @projects in 'terse' mode, so it can be
eval'ed directly into @projects, without the need for an extra variable.
* The current version writes the cache file atomically by first writing
to a temporary file (here a *.lock file opened in exclusive mode, but
a File::Temp::tempfile() temporary file could be used instead), and
then renaming it. This way we avoid the possibility of reading a
partially written file. Opening the file with O_EXCL should prevent
writers from trampling over one another, and make only one instance of
gitweb fill the cache; on the other hand, if the *.lock file somehow
is not deleted, it would prevent regenerating the cache.
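The atomic write described in the last item can be sketched as a
standalone helper. The name write_cache_atomically() is hypothetical,
and unlike the patch this sketch also removes its own lock file when
writing fails:

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Write $data to $file atomically: create "$file.lock" exclusively,
# write to it, then rename it over the final name.
# Returns true on success, false if the lock file already exists
# or if writing fails.
sub write_cache_atomically {
    my ($file, $data) = @_;
    my $lock = "$file.lock";
    sysopen(my $fd, $lock, O_WRONLY|O_CREAT|O_EXCL, 0600)
        or return 0;
    my $ok = print {$fd} $data;
    $ok = close($fd) && $ok;
    if ($ok && rename($lock, $file)) {
        return 1;
    }
    unlink $lock;    # do not leave our own lock behind on failure
    return 0;
}
```

A lock file left behind by a crashed process would still block
regeneration here; removing locks older than some threshold is one
possible remedy, not shown.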
Note: instead of using Data::Dumper to serialize the data we could use
the Storable module (distributed with Perl, like Data::Dumper). From what
I've checked it has a larger initial cost, but might be better for a
larger number of projects, which is exactly the situation when projects
list caching is needed.
I can send a version using Storable; could you then compare it with
Data::Dumper on the repo.or.cz set of repositories, Pasky?
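A Storable-based variant might look like the following minimal sketch.
The temporary path and the sample data are illustrative only; this is
not the version offered above:

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/gitweb.index.cache";    # illustrative path only

my @projects = ({ path => 'git.git', descr => 'example' });

# store() writes a binary image of the structure;
# retrieve() returns a reference to a copy of it.
store(\@projects, $file) or die "cannot store cache: $!";
my @restored = @{ retrieve($file) };
print "$restored[0]{'path'}\n";
```

Unlike the Data::Dumper approach, reading the cache back does not need
eval.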
gitweb/gitweb.css | 6 ++++++
gitweb/gitweb.perl | 51 ++++++++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 54 insertions(+), 3 deletions(-)
diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
index 446a1c3..1e83896 100644
--- a/gitweb/gitweb.css
+++ b/gitweb/gitweb.css
@@ -85,6 +85,12 @@ div.title, a.title {
color: #000000;
}
+div.stale_info {
+ display: block;
+ text-align: right;
+ font-style: italic;
+}
+
div.readme {
padding: 8px;
}
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 90ab894..5527378 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -118,6 +118,15 @@ our $fallback_encoding = 'latin1';
# - one might want to include '-B' option, e.g. '-B', '-M'
our @diff_opts = ('-M'); # taken from git_commit
+# projects list cache for busy sites with many projects;
+# if you set this to non-zero, it will be used as the cached
+# index lifetime in minutes
+# the cached list version is stored in /tmp and can be tweaked
+# by other scripts running with the same uid as gitweb - use this
+# ONLY at secure installations; only single gitweb project root per
+# system is supported!
+our $projlist_cache_lifetime = 0;
+
# information about snapshot formats that gitweb is capable of serving
our %known_snapshot_formats = (
# name => {
@@ -3510,16 +3519,52 @@ sub git_get_projects_details {
}
sub git_project_list_body {
- my ($projlist, $order, $from, $to, $extra, $no_header) = @_;
+ my ($projlist, $order, $from, $to, $extra, $no_header, $cache_lifetime) = @_;
my ($check_forks) = gitweb_check_feature('forks');
- my @projects = git_get_projects_details($projlist, $check_forks);
+ use File::stat;
+ use POSIX qw(:fcntl_h);
+
+ my $cache_file = '/tmp/gitweb.index.cache';
+
+ my @projects;
+ my $stale = 0;
+ my $now = time();
+ if ($cache_lifetime && -f $cache_file &&
+ stat($cache_file)->mtime + $cache_lifetime * 60 > $now &&
+ open(my $fd, '<', $cache_file)) {
+ $stale = $now - stat($cache_file)->mtime;
+ local $/ = undef;
+ my $dump = <$fd>;
+ close $fd;
+ @projects = @{ eval $dump };
+ } else {
+ if ($cache_lifetime && -f $cache_file) {
+ # Postpone timeout by two minutes so that we get
+ # enough time to do our job.
+ my $time = time() - $cache_lifetime + 120;
+ utime $time, $time, $cache_file;
+ }
+ @projects = git_get_projects_details($projlist, $check_forks);
+ if ($cache_lifetime &&
+ sysopen(my $fd, "$cache_file.lock", O_WRONLY|O_CREAT|O_EXCL, 0600)) {
+ use Data::Dumper;
+ $Data::Dumper::Terse = 1;
+ print $fd Dumper(\@projects);
+ close $fd;
+ rename "$cache_file.lock", $cache_file;
+ }
+ }
$order ||= $default_projects_order;
$from = 0 unless defined $from;
$to = $#projects if (!defined $to || $#projects < $to);
+ if ($cache_lifetime && $stale) {
+ print "<div class=\"stale_info\">Cached version (${stale}s old)</div>\n";
+ }
+
print "<table class=\"project_list\">\n";
unless ($no_header) {
print "<tr>\n";
@@ -3902,7 +3947,7 @@ sub git_project_list {
close $fd;
print "</div>\n";
}
- git_project_list_body(\@list, $order);
+ git_project_list_body(\@list, $order, undef, undef, undef, undef, $projlist_cache_lifetime);
git_footer_html();
}
--
1.5.4.3.453.gc1ad83
* [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching
2008-03-17 15:09 [PATCH 0/3 v2] gitweb: Support caching projects list Jakub Narebski
2008-03-17 15:09 ` [PATCH 1/3] gitweb: Separate @projects population into git_get_projects_details() Jakub Narebski
2008-03-17 15:09 ` [RFC/PATCH 2/3] gitweb: Support caching projects list Jakub Narebski
@ 2008-03-17 15:09 ` Jakub Narebski
2008-03-18 3:14 ` Petr Baudis
2 siblings, 1 reply; 11+ messages in thread
From: Jakub Narebski @ 2008-03-17 15:09 UTC (permalink / raw)
To: git; +Cc: Petr Baudis, J.H., Frank Lichtenheld, Jakub Narebski
If caching is turned on, project details can be filled in from
the cache. When refreshing project details for all projects (when the
cache is stale and has to be refreshed), generate project info only if
the modification time (as returned by lstat()) of the project's
repository gitdir has changed.
This way we can avoid hitting repository refs, object database and
repository config at the cost of an additional lstat.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
This is an idea for further improvement of 'projects list caching'.
Could you please:
1.) comment on whether it is a good idea, or why it works, or why it
couldn't work :),
2.) check whether this change gives any improvement in performance on
real data; note that testing would require updating the repositories
if the test was done on generated data, or gathering statistics over
a longer time period if it was tested on a "live" set.
Thanks in advance.
gitweb/gitweb.perl | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 5527378..1741628 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -3487,8 +3487,14 @@ sub git_patchset_body {
sub git_get_projects_details {
my ($projlist, $check_forks) = @_;
+ use File::stat;
my @projects;
foreach my $pr (@$projlist) {
+ my $mtime;
+ if ($cached && $pr->{'mtime'}) {
+ $mtime = lstat("$projectroot/$pr->{'path'}")->mtime;
+ next if ($mtime <= $pr->{'mtime'});
+ }
my (@aa) = git_get_last_activity($pr->{'path'});
unless (@aa) {
next;
@@ -3513,6 +3519,9 @@ sub git_get_projects_details {
$pr->{'forks'} = 0;
}
}
+ if ($cached) {
+ $pr->{'mtime'} = $mtime;
+ }
push @projects, $pr;
}
return @projects;
--
1.5.4.3.453.gc1ad83
* Re: [RFC/PATCH 2/3] gitweb: Support caching projects list
2008-03-17 15:09 ` [RFC/PATCH 2/3] gitweb: Support caching projects list Jakub Narebski
@ 2008-03-17 16:54 ` Frank Lichtenheld
2008-03-17 18:52 ` Jakub Narebski
0 siblings, 1 reply; 11+ messages in thread
From: Frank Lichtenheld @ 2008-03-17 16:54 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, Petr Baudis, J.H.
On Mon, Mar 17, 2008 at 04:09:29PM +0100, Jakub Narebski wrote:
> From: Petr Baudis <pasky@suse.cz>
> $projlist_cache_lifetime gitweb configuration variable is introduced,
> by default set to zero. If set to non-zero, it describes the number of
> minutes for which the cache remains valid. Only single project root
> per system can use the cache. Any script running with the same uid as
> gitweb can change the cache trivially - this is for secure
> installations only.
The more subtle threat is that anyone with write access to /tmp can
feed gitweb any data they want if the file doesn't exist yet.
At the very least you should:
- Allow overriding /tmp (via $ENV{TMPDIR} or via a configuration
variable)
- Advise people to change that to something that is not world-writable
- Check that the file is owned by the uid gitweb is running under and
is not world-writable.
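A rough sketch of such a check follows. The helper name
cache_file_is_trusted() is made up, and rejecting anything that is not
a regular file (symlinks included) is an extra assumption on top of
the list above:

```perl
use strict;
use warnings;
use File::stat;
use Fcntl qw(:mode);

# Hypothetical helper: trust the cache file only if it is a regular
# file owned by the effective uid and not world-writable.
sub cache_file_is_trusted {
    my ($file) = @_;
    my $st = lstat($file) or return 0;    # lstat: do not follow symlinks
    return 0 unless S_ISREG($st->mode);
    return 0 unless $st->uid == $>;
    return 0 if $st->mode & S_IWOTH;
    return 1;
}

# Cache location taken from TMPDIR, falling back to /tmp.
my $cache_dir  = $ENV{'TMPDIR'} || '/tmp';
my $cache_file = "$cache_dir/gitweb.index.cache";
print cache_file_is_trusted($cache_file) ? "trusted\n" : "not trusted\n";
```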
[...]
> + my @projects;
> + my $stale = 0;
> + my $now = time();
> + if ($cache_lifetime && -f $cache_file &&
> + stat($cache_file)->mtime + $cache_lifetime * 60 > $now &&
> + open(my $fd, '<', $cache_file)) {
> + $stale = $now - stat($cache_file)->mtime;
One stat() call instead of three would be better for performance.
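Something along these lines, as a sketch; fresh_cache_age() is a
hypothetical name, and the lifetime is assumed to be in minutes as in
the patch:

```perl
use strict;
use warnings;
use File::stat;
use Fcntl qw(:mode);

# Return the age of $file in seconds if it is a regular file younger
# than $lifetime_min minutes, or undef otherwise. Uses one stat() call.
sub fresh_cache_age {
    my ($file, $lifetime_min, $now) = @_;
    $now = time() unless defined $now;
    my $st = stat($file) or return;
    return unless S_ISREG($st->mode);
    my $age = $now - $st->mtime;
    return ($age < $lifetime_min * 60) ? $age : undef;
}
```

The same stat object could then also supply the owner and mode for
the permission checks.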
Regards,
--
Frank Lichtenheld <frank@lichtenheld.de>
www: http://www.djpig.de/
* Re: [RFC/PATCH 2/3] gitweb: Support caching projects list
2008-03-17 16:54 ` Frank Lichtenheld
@ 2008-03-17 18:52 ` Jakub Narebski
2008-03-17 19:10 ` Frank Lichtenheld
0 siblings, 1 reply; 11+ messages in thread
From: Jakub Narebski @ 2008-03-17 18:52 UTC (permalink / raw)
To: Frank Lichtenheld; +Cc: git, Petr Baudis, J.H.
On Monday, 17 March 2008 17:54, Frank Lichtenheld wrote:
>At the very least you should:
[...]
> - Check if the file is owned by the uid gitweb is running under and
> not world-writable.
UID ($>) or PID ($$) should be equal to cache owner: stat($file)->uid?
--
Jakub Narebski
Poland
* Re: [RFC/PATCH 2/3] gitweb: Support caching projects list
2008-03-17 18:52 ` Jakub Narebski
@ 2008-03-17 19:10 ` Frank Lichtenheld
2008-03-17 20:25 ` Jakub Narebski
0 siblings, 1 reply; 11+ messages in thread
From: Frank Lichtenheld @ 2008-03-17 19:10 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, Petr Baudis, J.H.
On Mon, Mar 17, 2008 at 07:52:13PM +0100, Jakub Narebski wrote:
> On Monday, 17 March 2008 17:54, Frank Lichtenheld wrote:
>
> >At the very least you should:
> [...]
> > - Check if the file is owned by the uid gitweb is running under and
> > not world-writable.
>
> UID ($>) or PID ($$) should be equal to cache owner: stat($file)->uid?
I'm not sure what the PID has to do with anything here?
But yeah, $> was what I meant.
(Although I actually prefer to use POSIX::geteuid instead, since I can
understand that faster).
Regards,
--
Frank Lichtenheld <frank@lichtenheld.de>
www: http://www.djpig.de/
* Re: [RFC/PATCH 2/3] gitweb: Support caching projects list
2008-03-17 19:10 ` Frank Lichtenheld
@ 2008-03-17 20:25 ` Jakub Narebski
0 siblings, 0 replies; 11+ messages in thread
From: Jakub Narebski @ 2008-03-17 20:25 UTC (permalink / raw)
To: Frank Lichtenheld; +Cc: git, Petr Baudis, J.H.
On Monday, 17 March 2008 20:10, Frank Lichtenheld wrote:
> On Mon, Mar 17, 2008 at 07:52:13PM +0100, Jakub Narebski wrote:
>> On Monday, 17 March 2008 17:54, Frank Lichtenheld wrote:
>>
>>>At the very least you should:
>> [...]
>>> - Check if the file is owned by the uid gitweb is running under and
>>> not world-writable.
>>
>> UID ($>) or PID ($$) should be equal to cache owner: stat($file)->uid?
>
> I'm not sure what the PID has to do with anything here?
> But yeah, $> was what I meant.
> (Although I actually prefer to use POSIX::geteuid instead, since I can
> understand that faster).
Actually what I wanted to ask was UID ($<) vs EUID ($>), or appropriate
POSIX::get*uid functions.
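For reference, the two pairs correspond as in this small sketch:

```perl
use strict;
use warnings;
use POSIX qw(getuid geteuid);

# $< is the real uid and $> the effective uid; the POSIX functions
# return the same values under more descriptive names.
printf "real uid: %d (POSIX::getuid: %d)\n", $<, getuid();
printf "effective uid: %d (POSIX::geteuid: %d)\n", $>, geteuid();
```

The two values differ only when the script runs setuid or has changed
its ids.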
--
Jakub Narebski
Poland
* Re: [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching
2008-03-17 15:09 ` [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching Jakub Narebski
@ 2008-03-18 3:14 ` Petr Baudis
2008-03-18 9:12 ` Jakub Narebski
0 siblings, 1 reply; 11+ messages in thread
From: Petr Baudis @ 2008-03-18 3:14 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, J.H., Frank Lichtenheld
On Mon, Mar 17, 2008 at 04:09:30PM +0100, Jakub Narebski wrote:
> If caching is turned on project details can be filled in already from
> the cache. When refreshing project info details for all project (when
> cache is stale and has to be refreshed) generate projects info only if
> modification time (as returned by lstat()) of projects repository
> gitdir changed.
>
> This way we can avoid hitting repository refs, object database and
> repository config at the cost of additional lstat.
>
> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
> ---
> This is an idea for further improvement of 'projects list caching'.
> Could you please:
>
> 1.) comment if it is a good idea, or why it works, or why it
> couldn't work :),
The idea is nice, but I'm surely missing something obvious again - why
do you use lstat() as opposed to stat()? And more importantly, the mtime
of the project's repository unfortunately reflects almost no changes
per se; you would need to check the mtime of the description file,
the config file and the refs instead.
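A sketch of what such a check could look like follows. The helper
newest_project_mtime() is hypothetical, and including packed-refs is
an assumption added here, since refs may be stored there as well:

```perl
use strict;
use warnings;
use File::Find;
use File::stat;

# Hypothetical helper: newest mtime among the description file,
# the config file, packed-refs and everything under refs/ in $git_dir.
# Returns 0 if none of them exist.
sub newest_project_mtime {
    my ($git_dir) = @_;
    my $newest = 0;
    my $check = sub {
        my $st = stat($_[0]) or return;
        $newest = $st->mtime if $st->mtime > $newest;
    };
    $check->("$git_dir/$_") for qw(description config packed-refs);
    if (-d "$git_dir/refs") {
        find({ no_chdir => 1,
               wanted   => sub { $check->($File::Find::name) } },
             "$git_dir/refs");
    }
    return $newest;
}
```

Walking refs/ costs one stat per ref, so on a loaded server this may
eat much of the saving.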
--
Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it. -- J. W. von Goethe
* Re: [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching
2008-03-18 3:14 ` Petr Baudis
@ 2008-03-18 9:12 ` Jakub Narebski
2008-03-18 9:52 ` Frank Lichtenheld
0 siblings, 1 reply; 11+ messages in thread
From: Jakub Narebski @ 2008-03-18 9:12 UTC (permalink / raw)
To: Petr Baudis; +Cc: git, J.H., Frank Lichtenheld
On Tue, 18 March 2008, Petr Baudis wrote:
> On Mon, Mar 17, 2008 at 04:09:30PM +0100, Jakub Narebski wrote:
>>
>> If caching is turned on project details can be filled in already from
>> the cache. When refreshing project info details for all project (when
>> cache is stale and has to be refreshed) generate projects info only if
>> modification time (as returned by lstat()) of projects repository
>> gitdir changed.
>>
>> This way we can avoid hitting repository refs, object database and
>> repository config at the cost of additional lstat.
>>
>> Signed-off-by: Jakub Narebski <jnareb@gmail.com>
>> ---
>> This is an idea for further improvement of 'projects list caching'.
>> Could you please:
>>
>> 1.) comment if it is a good idea, or why it works, or why it
>> couldn't work :),
>
> The idea is nice, but I'm surely missing something obvious again - why
> do you use lstat() as opposed to stat()?
Because in my home installation of gitweb (for tests) I have
/home/local/scm/git.git symlinked to /home/jnareb/git/.git
and I want to follow changes in the repository; the link itself doesn't
change.
> And more importantly, the mtime
> of projects repository unfortunately does not reflect almost any
> changes per se; you would need to check mtime of description file,
> config file and the refs instead.
Well, I had hoped that because git uses "write to a temporary file, then
rename it to the final name" to get atomic file writes, any change in
a git repository would be reflected in the mtime of the top directory
(GIT_DIR). I checked it superficially... by doing a fetch and a commit.
But while both fetch and commit manipulate files in the top directory
(FETCH_HEAD, ORIG_HEAD, COMMIT_EDITMSG), that is unfortunately not the
case for push. If all pushes resulted in a pack transfer, it would be
enough to watch the GIT_DIR/objects/pack/ directory.
I think that nothing short of inotify or equivalent would work: there
are just too many files/directories to watch for changes... I hope I am
mistaken here...
--
Jakub Narebski
Poland
* Re: [RFC/PATCH 3/3] gitweb: Fill project details lazily when caching
2008-03-18 9:12 ` Jakub Narebski
@ 2008-03-18 9:52 ` Frank Lichtenheld
0 siblings, 0 replies; 11+ messages in thread
From: Frank Lichtenheld @ 2008-03-18 9:52 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Petr Baudis, git, J.H.
On Tue, Mar 18, 2008 at 10:12:09AM +0100, Jakub Narebski wrote:
> On Tue, 18 March 2008, Petr Baudis wrote:
> > The idea is nice, but I'm surely missing something obvious again - why
> > do you use lstat() as opposed to stat()?
>
> Because in my home installation of gitweb (for tests) I have
> /home/local/scm/git.git symlinked to /home/jnareb/git/.git
> And I want to follow changes in repository; link itself doesn't
> change.
Which means you have that backwards, since
"lstat() is identical to stat(), except that if path is a symbolic link,
then the link itself is stat-ed, not the file that it refers to."
(from linux manpage)
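The difference is easy to see on a throwaway symlink. This is a small
sketch using a temporary directory, on a system that supports
symlinks:

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempdir);
use Fcntl qw(:mode);

my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/repo" or die "mkdir: $!";
symlink "$dir/repo", "$dir/link" or die "symlink: $!";

my $via_stat  = stat("$dir/link");     # follows the link: the directory
my $via_lstat = lstat("$dir/link");    # the link itself

print "stat sees a directory\n"      if S_ISDIR($via_stat->mode);
print "lstat sees a symbolic link\n" if S_ISLNK($via_lstat->mode);
```

So to track changes in the linked repository, stat() is the call to
use.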
Regards,
--
Frank Lichtenheld <frank@lichtenheld.de>
www: http://www.djpig.de/