* [RFC] gitweb wishlist and TODO list (part 1)
@ 2006-12-16 23:00 Jakub Narebski
2006-12-17 1:47 ` Junio C Hamano
2006-12-21 3:22 ` Robert Fitzsimons
0 siblings, 2 replies; 4+ messages in thread
From: Jakub Narebski @ 2006-12-16 23:00 UTC
To: git; +Cc: Kernel Org Admin, Petr Baudis
This is yet another series of planned gitweb features. This part
concentrates on improving gitweb rather than on adding new features.
Comments appreciated.
Copy sent to the Kernel.Org admin (who is probably the most interested
in improving gitweb and git performance), and to Petr Baudis, who
maintains the repo.or.cz public git hosting site, which runs the new(est)
version of gitweb.
1. Cleanup
* HTML cleanup.
There are still some places where we use presentational HTML elements
like <i>. I'd rather have them replaced by adding a class to the appropriate
element, and adding a proper rule to the CSS. This would make changing the
style easier, but it would make the generated page larger.
On the other hand, we sometimes use the class attribute where an id attribute
would be better, and sometimes (in the tables) it would be better to
add <col> and <colgroup> elements, and use the headers attribute and select
the CSS style by this attribute's value.
* CSS cleanup.
Use descendant selectors more, and use other selectors that match an
element as a child or grandchild of another, instead of relying on class,
to remove some of the redundancy in the CSS file. This would make adding
similar elements easier (for example, the README for a project currently
has the wrong style), and most probably would make the CSS file smaller.
Perhaps we should reorder the CSS file, and add some comments, to make its
maintenance easier, and make it easier to add styles for new elements.
* Code cleanup.
There is still a bit of code cleanup left, especially in the subroutines
which have not been refactored yet. For example git_search should take
advantage of the git-rev-list / git-log --grep, --author and --committer
options, and split search output into pages. Untabify should perhaps be
moved to esc_html, now that esc_path is separated... or perhaps not.
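A rough sketch of the paged search idea, written in Python rather than
gitweb's Perl only to keep it short and testable. The function name
build_search_command, the page size of 100, the search type names and the
use of --skip for paging are all assumptions made for illustration; none of
them is taken from gitweb.

```python
def build_search_command(search_type, pattern, start="HEAD", page=0, page_size=100):
    """Build a git rev-list argument list for one page of search results."""
    # Map a (hypothetical) search type to the matching rev-list option.
    option_for = {
        "commit": "--grep",
        "author": "--author",
        "committer": "--committer",
    }
    if search_type not in option_for:
        raise ValueError("unsupported search type: %r" % search_type)
    return [
        "git", "rev-list", "--header",
        "%s=%s" % (option_for[search_type], pattern),
        "--skip=%d" % (page * page_size),
        # Ask for one extra commit so the caller can tell whether a
        # "next page" link is needed.
        "--max-count=%d" % (page_size + 1),
        start,
    ]
```

The list form is meant to be passed to a process-spawning call without
going through a shell, so the search pattern needs no quoting.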
2. Performance etc.
* Better support for mod_perl in "CGI mode". Planned support for mod_perl
in handler mode and support for FastCGI via CGI::Fast.
Add support for mod_perl so that gitweb wouldn't need to be run with
+ParseHeaders, and wouldn't need the CGI environment variables to be set.
This can speed up gitweb a bit under mod_perl. The problem is how to do this
while still being able to run gitweb under CGI, FastCGI, mod_perl 1.0 and 2.0.
This includes moving HTTP header generation into a separate
subroutine, at first a thin wrapper around "print $cgi->header(...)".
* Native config reading.
This could speed up gitweb a bit, at least in configurations where
a repository is allowed to override default features (like blame,
snapshot, pickaxe) and sometimes change their options.
It would also allow moving some gitweb configuration, like description,
homepage (in repo.or.cz) and cloneurl, from separate loose files in $GIT_DIR
of a repository to its proper place in the config. I'd rather add
category as a gitweb.category configuration variable than as yet another
file in .git. Without native config reading it would be hard to add
sensibly configurable committags support.
The problem lies in the lack of a formal description of the config file
format, and in git-specific additions which make ready-made modules for INI
config parsing unsuitable. But it is not insurmountable. I think that the
config reader should then be incorporated into the Git.pm module. We for
sure need some tests added for it.
By the way, should we always parse the whole config file, like in
Tie::Memoize? Or read only one variable, if tying it to a hash, then with
Tie::Hash? Or some combination of both?
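To make the "parse the whole file once" option concrete, here is a minimal
sketch in Python (not Perl, and not a proposal for the Git.pm interface).
It handles only a simplified subset of the config syntax: section headers,
quoted subsections, "name = value" lines, bare boolean names and whole-line
comments. Quoting, escapes, line continuations and trailing comments are
deliberately ignored. The names parse_git_config and LazyConfig are made up
for this example.

```python
import re

# [section] or [section "subsection"]
_SECTION = re.compile(r'^\[\s*([A-Za-z0-9.-]+)(?:\s+"([^"]*)")?\s*\]$')

def parse_git_config(text):
    """Parse a simplified subset of git config syntax into a dict of lists."""
    config = {}
    prefix = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or whole-line comment
        m = _SECTION.match(line)
        if m:
            section, subsection = m.group(1).lower(), m.group(2)
            prefix = section if subsection is None else "%s.%s" % (section, subsection)
            continue
        if prefix is None:
            raise ValueError("variable outside of a section: %r" % raw)
        name, sep, value = line.partition("=")
        key = "%s.%s" % (prefix, name.strip().lower())
        # A bare "name" line without "=" means boolean true.
        config.setdefault(key, []).append(value.strip() if sep else "true")
    return config

class LazyConfig:
    """Parse the whole file on the first lookup, then answer from memory."""

    def __init__(self, loader):
        self._loader = loader  # callable returning the config file text
        self._parsed = None

    def get(self, key, default=None):
        if self._parsed is None:
            self._parsed = parse_git_config(self._loader())
        values = self._parsed.get(key)
        # The last occurrence wins for single-valued variables.
        return values[-1] if values else default
```

With this shape, a page that never asks for a config variable never pays for
reading the file, and a page that asks for several pays only once.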
* Cache validation and infinite cache for unchanging pages
By itself cache validation would not bring much of a performance boost (for
gitweb installations with large traffic), but with a reverse proxy
(a.k.a. caching engine, a.k.a. HTTP accelerator) in front of the server this
could help a lot.
This means sending proper Last-Modified: and ETag: (if HTTP/1.1) headers,
and checking the If-Modified-Since: and If-None-Match: headers, replying
with 304 Not Modified. This should work with gitweb called both as a
CGI script and from mod_perl. While at it, we should always return only the
HTTP headers, without generating (if possible) or writing any content,
on HEAD requests (and others which do not need a body).
The idea is to use the query string (current arguments) with all hashes
replaced by their current sha1 values for the ETag, and to use the committer
date for Last-Modified. If possible, we could use stat info: the date of
last modification of the given "loose" ref, or of the packed refs file,
although that might be inaccurate. We cannot always check if the cache is
valid without calling any git command, but usually we would be able to do
this after only a few commands. Implementing cache validation might mean
that we would have to restructure the code a bit.
A separate subroutine for HTTP header generation would help with writing
cache validation.
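A minimal sketch of the validation logic described above, in Python rather
than Perl. It assumes the caller has already resolved every ref in the query
parameters to a sha1 and knows the committer date as seconds since the
epoch. The function names are invented for this example, and weak ETags
(W/"...") are not handled.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime

def make_validators(resolved_params, committer_time):
    """Return (etag, last_modified) header values for a page.

    resolved_params: query parameters with every ref replaced by its sha1.
    committer_time: committer date as seconds since the epoch.
    """
    canonical = ";".join("%s=%s" % kv for kv in sorted(resolved_params.items()))
    etag = '"%s"' % hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return etag, formatdate(committer_time, usegmt=True)

def is_not_modified(request_headers, etag, committer_time):
    """Return True if a 304 Not Modified reply is enough for this request."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return "*" in candidates or etag in candidates
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            return committer_time <= parsedate_to_datetime(since).timestamp()
        except (TypeError, ValueError):
            return False  # unparsable date: send the full response
    return False
```

The check runs before any page content is generated, which is where the
saving would come from: a 304 reply costs only the few git commands needed
to resolve the refs.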
Bundled together with this is using an "infinite" (or at least large:
currently we use +1d) Expires: and/or Cache-Control: max-age (if HTTP/1.1)
for unchanging pages. Although those pages are usually rarely accessed...
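One deliberately simplistic way to decide which pages count as unchanging,
sketched in Python: treat a page as immutable only when every hash-like
parameter present is a full 40-character sha1. The parameter list, the
one-year "infinite" lifetime and the function name are assumptions for this
example. A real rule would also have to look at the action, since some views
show refs or other data that can change even for a fixed commit.

```python
import re

FULL_SHA1 = re.compile(r"^[0-9a-f]{40}$")

# Assumed names of the query parameters that carry object ids or ref names.
HASH_PARAMS = ("h", "hb", "hp", "hpb")

def cache_control_for(params, short_ttl=86400, long_ttl=365 * 86400):
    """Return a Cache-Control value: long-lived only for fully pinned pages."""
    present = [params[name] for name in HASH_PARAMS if name in params]
    # No hash parameter at all (e.g. a summary page) is never immutable.
    immutable = bool(present) and all(FULL_SHA1.match(value) for value in present)
    return "max-age=%d" % (long_ttl if immutable else short_ttl)
```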
Next parts: new features, improving existing views etc. in next email.
--
Jakub Narebski
* Re: [RFC] gitweb wishlist and TODO list (part 1)
From: Junio C Hamano @ 2006-12-17 1:47 UTC
To: Jakub Narebski; +Cc: git, Kernel Org Admin, Petr Baudis
Jakub Narebski <jnareb@gmail.com> writes:
> This is yet another series of planned gitweb features. This part
> concentrates on improving gitweb rather than on adding new features.
> Comments appreciated.
>
> Copy send to Kernel.Org admin (which probably is most interested
> in improving gitweb and git performance), and to Petr Baudis which
> maintains repo.or.cz public git hosting site, which runs new(est)
> version of gitweb.
To be honest, I think the 0th item of this wishlist and TODO
list would be to find a sucker^Wvolunteer who will maintain the
gitweb installation at kernel.org, for the message to be of any
interest to kernel.org admins.
* Re: [RFC] gitweb wishlist and TODO list (part 1)
From: Robert Fitzsimons @ 2006-12-21 3:22 UTC
To: Jakub Narebski; +Cc: git, Kernel Org Admin, Petr Baudis
> * Cache validation and infinite cache for unchanging pages
>
> By itself cache validation would not bring much performance boost (for
> gitweb installations with large traffic), but with the reverse proxy,
> aka. caching engine, aka. HTTP accelerator in front of server this could
> help a lot.
There is no need for extra servers to provide server-side caching.
Apache2 includes suitable modules (mod_cache) which can be configured to
cache the pages generated by gitweb in memory or on disk.
For example, the following apache2.conf entry will set up an 8MB memory
cache which will return cached pages even if the user tries to force a
refresh in their browser. The details are covered in the Apache
documentation: http://httpd.apache.org/docs/2.2/caching.html .
<IfModule mod_cache.c>
    CacheDefaultExpire 60
    CacheIgnoreCacheControl On
    CacheIgnoreNoLastMod On
    <IfModule mod_mem_cache.c>
        CacheEnable mem /git/
        MCacheSize 8192
        MCacheMinObjectSize 512
        MCacheMaxObjectSize 128000
        MCacheRemovalAlgorithm LRU
    </IfModule>
</IfModule>
mod_cache will only cache pages with a query string in the URL if they
have an Expires header. So we can put in a temporary hack using
mod_expires until gitweb sets an appropriate value.
<Location /git/>
    ExpiresActive On
    ExpiresDefault "access plus 1 minutes"
    ...
</Location>
Also, the content type would need to be changed to just return text/html,
or MSIE will do the wrong thing if it's given an application/xhtml+xml
page.
Robert
* Re: [RFC] gitweb wishlist and TODO list (part 1)
From: Jakub Narebski @ 2006-12-21 9:18 UTC
To: Robert Fitzsimons; +Cc: git, Kernel Org Admin, Petr Baudis
Robert Fitzsimons wrote:
>> * Cache validation and infinite cache for unchanging pages
>>
>> By itself cache validation would not bring much performance boost (for
>> gitweb installations with large traffic), but with the reverse proxy,
>> aka. caching engine, aka. HTTP accelerator in front of server this could
>> help a lot.
BTW in mod_perl cache validation is as simple as calling the
meets_conditions() method on the request object after we set at least one
of the validator headers (Last-Modified:, ETag:)... but this would mean
that cache validation would be available only under mod_perl...
> There is no need for extra servers to provide server side caching.
> Apache2 includes suitable modules (mod_cache) which can be configured to
> cache in memory or disk the pages generated by gitweb.
[...]
> mod_cache will only cache pages with a query string in the url if they
> have an expires header. So we can put a temporary hack in using
> mod_expires until gitweb sets an appropriate value.
From the discussion in the
"Re: kernel.org mirroring (Re: [GIT PULL] MMC update)"
http://thread.gmane.org/gmane.comp.version-control.git/33604
thread, it seems that Apache mod_cache doesn't bring much. Perhaps because
of the above... although adding an artificial Expires header seems a bit
like a hack.
> Also the content type would need to be change to just return text/html
> or MSIE will do the wrong think if it's given a application/xhtml+xml
> page.
From gitweb.perl:
# require explicit support from the UA if we are to send the page as
# 'application/xhtml+xml', otherwise send it as plain old 'text/html'.
# we have to do this because MSIE sometimes globs '*/*', pretending to
# support xhtml+xml but choking when it gets what it asked for.
This was added by Alp Toker <alp@atoker.com> in f6801d669ee11:
"gitweb: Send XHTML as 'application/xhtml+xml' where possible"
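For readers who do not have the source at hand, the rule in that comment
can be sketched roughly like this. It is Python, not the actual gitweb code,
and the function name is invented: XHTML is chosen only when the Accept
header names application/xhtml+xml explicitly with a non-zero q value, so a
bare '*/*' never selects it.

```python
def choose_content_type(accept_header):
    """Pick XHTML only when the client lists it explicitly with q > 0."""
    for item in (accept_header or "").split(","):
        media, _, params = item.strip().partition(";")
        if media.strip().lower() != "application/xhtml+xml":
            continue  # wildcards such as */* are intentionally ignored
        q = 1.0  # default quality when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # malformed q: treat as not acceptable
        return "application/xhtml+xml" if q > 0 else "text/html"
    return "text/html"
```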
--
Jakub Narebski
Poland