From: Nicolas Pitre <nico@cam.org>
To: Marco Costalba <mcostalba@gmail.com>
Cc: "Shawn O. Pearce" <spearce@spearce.org>,
	Peter Eriksen <s022018@student.dtu.dk>,
	git@vger.kernel.org
Subject: Re: Understanding version 4 packs
Date: Mon, 26 Mar 2007 10:27:39 -0400 (EDT)
Message-ID: <alpine.LFD.0.83.0703261015110.3041@xanadu.home>
In-Reply-To: <e5bfff550703260516q5da5f46et8aab2ebadcd9cceb@mail.gmail.com>

On Mon, 26 Mar 2007, Marco Costalba wrote:

> Experimenting with file names cache in qgit I have found a big saving
> splitting the paths in base name and file name and indexing both:
> 
> drivers\usb\host\ehci.h
> drivers\usb\host\ehci-pci.c
> drivers\usb\host\ohci-pci.c
> kernel\sched.c
> 
> became:
> 
> dir names table
> 
> 0 drivers\usb\host
> 1 kernel
> 
> 
> file name table
> 
> 0 ehci.h
> 1 ehci-pci.c
> 2 ohci-pci.c
> 
> In this way a big saving is achieved in case of directories deep in
> the tree (long paths) and a lot of files. 

Sure, but if you also consider drivers/usb/Makefile and drivers/Kconfig, 
for example, then you start losing the space saving.  Maybe that makes 
sense for qgit, but it has no advantage in a pack, which contains every 
possible file.
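
To put rough numbers on that trade-off, here is a small Python sketch. It is
not git or qgit code: the function names are made up for illustration, and it
assumes the alternative to the dir/file split is a table that stores each path
component only once. It counts only the bytes of the stored names and ignores
any index overhead.

```python
import posixpath

def split_table_bytes(paths):
    """Name bytes stored by a directory table plus a file-name table,
    each holding unique strings (the dir/file split quoted above)."""
    dirs = {posixpath.dirname(p) for p in paths}
    files = {posixpath.basename(p) for p in paths}
    return sum(len(d) for d in dirs) + sum(len(f) for f in files)

def component_table_bytes(paths):
    """Name bytes stored when every path component is kept once
    (an assumed layout, used here only for comparison)."""
    components = {c for p in paths for c in p.split("/")}
    return sum(len(c) for c in components)

deep_only = [
    "drivers/usb/host/ehci.h",
    "drivers/usb/host/ehci-pci.c",
    "drivers/usb/host/ohci-pci.c",
    "kernel/sched.c",
]
# Add files that live in the intermediate directories.
with_intermediate = deep_only + ["drivers/usb/Makefile", "drivers/Kconfig"]

for label, paths in (("deep only", deep_only),
                     ("with intermediate dirs", with_intermediate)):
    print(label, split_table_bytes(paths), component_table_bytes(paths))
```

In the split layout each intermediate directory repeats its whole prefix
("drivers", "drivers/usb", "drivers/usb/host"), so its total grows faster than
the per-component count as more levels of the tree get populated.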

> Regarding MODE field an observation could be that is almost always the
> same, so an idea could be to store a 'default mode' just after
> nr_entries and do not add the field any more except in case path mode
> is different from default mode.

If the mode is always the same, or most likely similar for many entries, 
then it will compress very well.  In fact, in the current table format 
the byte sequence NULL+16-bit-mode will be quite common and likely 
to deflate accordingly.  It is therefore not worth adding more complex 
handling at runtime for deciding which mode to use, and you'd still have 
to store a flag for each path component to decide whether the default 
mode should be used or not anyway.
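
A quick way to check this is to deflate a synthetic table with and without the
mode field. The Python sketch below is only an approximation: the entry layout
(a NUL byte, a 16-bit big-endian mode, then the name), the generated file
names, and the helper name build_table are all assumptions for illustration,
not the actual pack format.

```python
import struct
import zlib

REGULAR_FILE = 0o100644  # the usual mode of a non-executable file

def build_table(names, with_mode):
    """Concatenate entries as NUL [+ 16-bit big-endian mode] + name.
    Hypothetical layout, used only to measure the cost of the mode."""
    out = bytearray()
    for name in names:
        out += b"\0"
        if with_mode:
            out += struct.pack(">H", REGULAR_FILE)
        out += name.encode()
    return bytes(out)

names = [f"file{i}.c" for i in range(1000)]
plain = build_table(names, with_mode=False)
moded = build_table(names, with_mode=True)

# Cost of carrying the mode in every entry, before and after deflate.
raw_cost = len(moded) - len(plain)
deflated_cost = len(zlib.compress(moded)) - len(zlib.compress(plain))
print(f"mode bytes before deflate: {raw_cost}")
print(f"mode bytes after deflate:  {deflated_cost}")
```

The mode costs two bytes per entry before compression. After deflate, the
repeated NUL+mode sequence should mostly fold into the back-references that
already cover the neighbouring name bytes.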

> In case this could bring to unaligned
> entries another idea could be to store _all_ mode fields at the
> beginning (or at the end and let deflate to remove almost everything
> more easily)

That's worth trying indeed.
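
Something like the following Python sketch could be used for a first
comparison. Again the layouts are assumptions for illustration (NUL-prefixed
names, 16-bit big-endian modes, names containing no NUL byte), the mix of
modes is invented, and the function names are hypothetical. It prints the
deflated size of each layout without presuming which one wins.

```python
import struct
import zlib

def interleaved(entries):
    """NUL + 16-bit mode + name for every entry."""
    return b"".join(b"\0" + struct.pack(">H", mode) + name.encode()
                    for name, mode in entries)

def modes_first(entries):
    """All 16-bit modes up front, followed by the NUL-prefixed names."""
    modes = b"".join(struct.pack(">H", mode) for _, mode in entries)
    names = b"".join(b"\0" + name.encode() for name, _ in entries)
    return modes + names

def parse_modes_first(data, count):
    """Inverse of modes_first(); the entry count must be known."""
    modes = struct.unpack(f">{count}H", data[:2 * count])
    names = data[2 * count:].split(b"\0")[1:]  # drop the leading empty piece
    return [(name.decode(), mode) for name, mode in zip(names, modes)]

# Mostly regular files, with an executable every 50th entry.
entries = [(f"file{i}.c", 0o100755 if i % 50 == 0 else 0o100644)
           for i in range(1000)]

for label, table in (("interleaved", interleaved(entries)),
                     ("modes first", modes_first(entries))):
    print(label, len(table), len(zlib.compress(table)))
```

Both layouts have the same uncompressed size, so any difference in the second
column comes purely from how well deflate handles each arrangement.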


Nicolas
