git.vger.kernel.org archive mirror
From: "Lars Hjemli" <hjemli@gmail.com>
To: "Jakub Narebski" <jnareb@gmail.com>
Cc: warthog19@eaglescrag.net, git@vger.kernel.org,
	"Petr Baudis" <pasky@suse.cz>
Subject: Re: [RFC/PATCH] gitweb: Paginate project list
Date: Mon, 12 May 2008 17:43:06 +0200
Message-ID: <8c5c35580805120843j9b401f8mfa104806880a51c2@mail.gmail.com>
In-Reply-To: <200805120903.25040.jnareb@gmail.com>

On 5/12/08, Jakub Narebski <jnareb@gmail.com> wrote:
> [The original email by Lars didn't get to git mailing list because of
>   lack of quotes around J.H. in "J.H." <warthog19@eaglescrag.net>
>   email address in Cc:]

Gaah, bad gmail...


>  On Sunday, 11 May 2008 08:56, Lars Hjemli wrote:
>
>  > It seems to me that "projectlist in a single file" and "cache results
>  > of filled in @$projlist" are different solutions to the same problem:
>  > rapidly filling a perl datastructure.
>
> Well, yes and no.  "Projectlist in single file" is about _static_ data
>  (which changes only if projects are added, deleted, its description
>  changed; those are usually rare events), and avoiding mainly I/O and
>  not CPU (scanning filesystem for repositories, reading config and
>  description, etc.).
>
>  "Cache data" is about caching _variable_ data, such as "Last changed"
>  information for a project.  Caching data instead of caching output
>  (caching HTML) allows sharing the cache between different presentations
>  of the very same data (e.g. 'history'/'shortlog' vs 'rss').  And for
>  some pages, like project search results, caching HTML output doesn't
>  make much sense, while caching data does.

While I agree that caching search result output almost never makes
sense, I think it's more important that cache hits require minimal
processing. This is why I've chosen to cache the final result instead
of an intermediate state, but both solutions obviously have their pros
and cons.

>  > This used to be expensive in terms of cache size (similar to k.org's
>  > 20G), but current cgit solves this by treating the cache as a hash
>  > table; cgitrc has an option to set the cache size (number of files),
>  > each filename is generated as `hash(url) % cachesize` and each file
>  > contains the full url (to detect hash collisions) followed by the
>  > cached content for that url (see
>  > http://hjemli.net/git/cgit/tree/cache.c for the details).
>
>
> I guess that is the simplest solution, but I don't think it is
>  the best way to get a size-limited cache.  For example, the CPAN Perl
>  module Cache::SizeAwareCache and its derivatives use the following
>  algorithm:
>
>   The default cache size limiting algorithm works by removing cache
>   objects in the following order until the desired limit is reached:
>
>     1) objects that have expired
>     2) objects that are least recently accessed
>     3) objects that expire next
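
The eviction order quoted above can be sketched roughly as follows. This
is not the code of Cache::SizeAwareCache; the names `Entry` and
`pick_evictions` are hypothetical, and treating rule 3 as a tie-breaker
for rule 2 is an assumption made to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    size: int           # bytes occupied by the cached object
    last_access: float  # timestamp of the most recent read
    expires_at: float   # timestamp after which the object is stale


def pick_evictions(entries, limit, now):
    """Return the keys to evict, in order, until total size <= limit."""
    total = sum(e.size for e in entries)
    # Expired objects sort first (False < True), then least recently
    # accessed, then (assumed tie-breaker) those that expire soonest.
    order = sorted(
        entries,
        key=lambda e: (e.expires_at > now, e.last_access, e.expires_at),
    )
    evicted = []
    for e in order:
        if total <= limit:
            break
        evicted.append(e.key)
        total -= e.size
    return evicted
```

Note that every call has to sort all entries, which is the kind of
bookkeeping the fixed-slot approach avoids.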

Again, minimal processing is the goal of cgit's cache implementation,
hence the simple solution.
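
For illustration, here is a minimal sketch of the fixed-slot scheme
described in the quoted text: the slot is `hash(url) % cachesize`, and
each file starts with the full url so that collisions can be detected.
This is not cgit's cache.c; the class name `SlotCache`, the FNV-1a hash
and the newline separator between url and content are all assumptions.

```python
import os
import tempfile


def fnv1a(data: bytes) -> int:
    """32-bit FNV-1a hash (an arbitrary choice for this sketch)."""
    h = 0x811C9DC5
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


class SlotCache:
    """Hypothetical cache holding at most `size` files under `root`."""

    def __init__(self, root: str, size: int) -> None:
        self.root = root
        self.size = size
        os.makedirs(root, exist_ok=True)

    def _path(self, url: str) -> str:
        slot = fnv1a(url.encode("utf-8")) % self.size
        return os.path.join(self.root, f"{slot:08x}")

    def put(self, url: str, content: bytes) -> None:
        # Write to a temporary file and rename it into place, so a
        # reader never sees a half-written slot.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as f:
            f.write(url.encode("utf-8") + b"\n" + content)
        os.replace(tmp, self._path(url))

    def get(self, url: str):
        try:
            with open(self._path(url), "rb") as f:
                stored_url, _, content = f.read().partition(b"\n")
        except FileNotFoundError:
            return None  # slot is empty: cache miss
        if stored_url != url.encode("utf-8"):
            return None  # slot holds another url: hash collision
        return content
```

A hit costs one hash, one open and one read; a colliding url simply
overwrites the slot, so no eviction pass is ever needed.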

>  > Btw: gitweb and cgit seem to acquire the same features these days:
>  > cgit recently got pagination + search on the project list.
>
>
> I haven't checked what features cgit has lately...
>
>  Gitweb development seems a bit stalled; I got no response to the latest
>  round of the gitweb TODO and wishlist...

Well, I for one found the wishlist interesting; I've been pondering
implementing a graphic log in cgit (inspired by git-forest and
git-graph), but I refuse to perform a topo-sort ;-)

Hopefully I can exploit the fact that cgit never uses more than one
commit as starting point for log traversal, combined with heuristics
on commit date, to enable a fast graphic log that will be correct for
all but the most pathological cases.

--
larsh
