git.vger.kernel.org archive mirror
From: "Marco Costalba" <mcostalba@gmail.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: "Peter Eriksen" <s022018@student.dtu.dk>,
	"Nicolas Pitre" <nico@cam.org>,
	git@vger.kernel.org
Subject: Re: Understanding version 4 packs
Date: Mon, 26 Mar 2007 14:16:11 +0200
Message-ID: <e5bfff550703260516q5da5f46et8aab2ebadcd9cceb@mail.gmail.com>
In-Reply-To: <20070325091806.GH25863@spearce.org>

On 3/25/07, Shawn O. Pearce <spearce@spearce.org> wrote:
> Peter Eriksen <s022018@student.dtu.dk> wrote:
> > On Sat, Mar 24, 2007 at 07:24:17PM -0400, Nicolas Pitre wrote:
> > > On Sat, 24 Mar 2007, Peter Eriksen wrote:
> > >
> >
> > The uncompressed file name table contains NR_ENTRIES entries,
> > and looks like this:
> >
> > +------------+------+--------------+------+--------------------+----
> > | NR_ENTRIES | MODE |  Full path 1 | MODE | Full path 2        | ...
> > +------------+------+--------------+------+--------------------+----
> >    4 bytes    2 bytes   n1 bytes    2 bytes     n2 bytes
> >
> > MODE is a network-byte-order integer representing the mode of the path,
> > and the path is a variable length, null-terminated string.
>
> Yes so far.
>
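
For reference, the layout quoted above can be read back with a few
lines of Python. This is only a sketch under assumptions: the quoted
text says MODE is in network byte order, and I assume the same for
NR_ENTRIES; the names build_name_table and parse_name_table are
made up for illustration and are not part of git.

```python
import struct

def build_name_table(entries):
    """Serialize (mode, path_bytes) pairs in the quoted layout."""
    out = struct.pack("!I", len(entries))          # NR_ENTRIES, 4 bytes (byte order assumed)
    for mode, path in entries:
        out += struct.pack("!H", mode)             # MODE, 2 bytes, network byte order
        out += path + b"\0"                        # null-terminated path
    return out

def parse_name_table(data):
    """Parse the table back into a list of (mode, path_bytes) pairs."""
    (nr_entries,) = struct.unpack_from("!I", data, 0)
    offset = 4
    entries = []
    for _ in range(nr_entries):
        (mode,) = struct.unpack_from("!H", data, offset)
        offset += 2
        end = data.index(b"\0", offset)            # find the terminating NUL
        entries.append((mode, data[offset:end]))
        offset = end + 1
    return entries
```

Paths are kept as bytes here, since nothing in the description says
which encoding they use.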

Perhaps this has already been evaluated and my comment is not
pertinent, but anyway...

Experimenting with the file name cache in qgit, I found a big saving
from splitting each path into a directory name and a file name and
indexing both:

drivers\usb\host\ehci.h
drivers\usb\host\ehci-pci.c
drivers\usb\host\ohci-pci.c
kernel\sched.c

becomes:

dir names table

0 drivers\usb\host
1 kernel


file name table

0 ehci.h
1 ehci-pci.c
2 ohci-pci.c
3 sched.c

This gives a big saving when directories are deep in the tree (long
paths) and contain many files. The difference is noticeable even
after compression.
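
A minimal Python sketch of the split described above. It is not the
qgit code; split_paths and its return layout are hypothetical names
chosen for illustration. Each path becomes a pair of indexes, one
into the dir names table and one into the file name table.

```python
def split_paths(paths, sep="\\"):
    """Split each path into directory and file name, indexing both."""
    dir_table, file_table = [], []      # index -> name
    dir_index, file_index = {}, {}      # name -> index
    entries = []                        # one (dir_idx, file_idx) pair per path
    for path in paths:
        dirname, _, filename = path.rpartition(sep)
        d = dir_index.setdefault(dirname, len(dir_table))
        if d == len(dir_table):         # first time we see this directory
            dir_table.append(dirname)
        f = file_index.setdefault(filename, len(file_table))
        if f == len(file_table):        # first time we see this file name
            file_table.append(filename)
        entries.append((d, f))
    return dir_table, file_table, entries


paths = [
    r"drivers\usb\host\ehci.h",
    r"drivers\usb\host\ehci-pci.c",
    r"drivers\usb\host\ohci-pci.c",
    r"kernel\sched.c",
]
dirs, files, entries = split_paths(paths)
```

The long directory string is stored once, and every file below it
only costs a small index plus its own name.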

Regarding the MODE field, one observation is that it is almost always
the same. So one idea is to store a 'default mode' just after
nr_entries and omit the field from an entry unless that path's mode
differs from the default. If this would lead to unaligned entries,
another idea is to store _all_ mode fields together at the beginning
(or at the end), and let deflate remove almost all of the redundancy
more easily.
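
One possible way to picture the 'default mode' idea, as a rough
Python sketch. The layout is purely an assumption made up for this
example (entry count, default mode, exception count, then one
index/mode pair per exception); it is not the pack v4 format, and
encode_modes / decode_modes are hypothetical names.

```python
import struct
from collections import Counter

HEADER = "!IHI"     # nr_entries, default mode, nr_exceptions (assumed layout)
EXCEPTION = "!IH"   # entry index, mode (assumed layout)

def encode_modes(modes):
    """Store the most common mode once, plus only the entries that differ."""
    default = Counter(modes).most_common(1)[0][0] if modes else 0
    exceptions = [(i, m) for i, m in enumerate(modes) if m != default]
    out = struct.pack(HEADER, len(modes), default, len(exceptions))
    for i, m in exceptions:
        out += struct.pack(EXCEPTION, i, m)
    return out

def decode_modes(data):
    """Rebuild the full per-entry mode list from the compact form."""
    nr_entries, default, nr_exceptions = struct.unpack_from(HEADER, data, 0)
    modes = [default] * nr_entries
    offset = struct.calcsize(HEADER)
    for _ in range(nr_exceptions):
        i, m = struct.unpack_from(EXCEPTION, data, offset)
        modes[i] = m
        offset += struct.calcsize(EXCEPTION)
    return modes
```

With this layout the saving only appears when exceptions are rare,
because each exception costs more than a plain 2-byte mode field.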

  Marco
