git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Duy Nguyen <pclouds@gmail.com>
Cc: John Szakmeister <john@szakmeister.net>,
	Git Mailing List <git@vger.kernel.org>,
	Nicolas Pitre <nico@fluxnic.net>
Subject: Re: Zero padded file modes...
Date: Thu, 5 Sep 2013 12:33:18 -0400	[thread overview]
Message-ID: <20130905163318.GA14338@sigill.intra.peff.net> (raw)
In-Reply-To: <CACsJy8C4PN4n1W71ajnoyFjaWCsxQjbXMbT-tcfgpXeoJKyXyA@mail.gmail.com>

On Thu, Sep 05, 2013 at 11:18:24PM +0700, Nguyen Thai Ngoc Duy wrote:

> > There are basically two solutions:
> >
> >   1. Add a single-bit flag for "I am 0-padded in the real data". We
> >      could probably even squeeze it into the same integer.
> >
> >   2. Have a "classic" section of the pack that stores the raw object
> >      bytes. For objects which do not match our expectations, store them
> >      raw instead of in v4 format. They will not get the benefit of v4
> >      optimizations, but if they are the minority of objects, that will
> >      only end up with a slight slow-down.
> 
> 3. Detect this situation and fall back to v2.
> 
> 4. Update v4 to allow storing raw tree entries mixing with v4-encoded
> tree entries. This is something between (1) and (2)

I wouldn't want to do (3). At some point pack v4 may become the standard
format, but there will be some repositories which will never be allowed
to adopt it.

For (4), yes, that could work. But like (1), it only solves problems in
tree entries. What happens if we have a quirky commit object that needs
the same treatment (e.g., a timezone that does not fit into the commit
name dictionary properly)?

> I think (4) fits better in v4 design and probably not hard to do. Nico
> recently added a code to embed a tree entry inline, but the mode must
> be encoded (and can't contain leading zeros). We could have another
> code to store mode in ascii. This also makes me wonder if we might
> have similar problems with timezones, which are also specially encoded
> in v4..

Yeah, that might be more elegant.

> (3) is probably easiest. We need to scan through all tree entries
> first when creating v4 anyway. If we detect any anomalies, just switch
> back to v2 generation. The user will be force to rewrite history in
> order to take full advantage of v4 (they can have a pack of weird
> trees in v2 and the rest in v4 pack, but that's not optimal).

Splitting across two packs isn't great, though. What if v4 eventually
becomes the normal on-the-wire format? I'd rather have some method for
just embedding what are essentially v2 objects into the v4 pack, which
would give us future room for handling these sorts of things.

But like I said, I haven't looked closely yet, so maybe there are
complications with that. In the meantime, I'll defer to the judgement of
people who know what they are talking about. :)

-Peff

  reply	other threads:[~2013-09-05 16:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-05 14:00 Zero padded file modes John Szakmeister
2013-09-05 15:36 ` Jeff King
2013-09-05 16:18   ` Duy Nguyen
2013-09-05 16:33     ` Jeff King [this message]
2013-09-05 16:56       ` Nicolas Pitre
2013-09-05 16:25   ` A Large Angry SCM
2013-09-05 17:09   ` Nicolas Pitre
2013-09-05 19:10     ` Jeff King
2013-09-05 17:13   ` John Szakmeister
2013-09-05 19:35     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130905163318.GA14338@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=john@szakmeister.net \
    --cc=nico@fluxnic.net \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).