From: "Dana How" <danahow@gmail.com>
To: "Junio C Hamano" <junkio@cox.net>
Cc: "Git Mailing List" <git@vger.kernel.org>, danahow@gmail.com
Subject: Re: [PATCH v2] Custom compression levels for objects and packs
Date: Tue, 8 May 2007 17:25:24 -0700
Message-ID: <56b7f5510705081725v655d2ce1j28712507cfa7fa55@mail.gmail.com>
In-Reply-To: <7vk5vi27ko.fsf@assigned-by-dhcp.cox.net>

On 5/8/07, Junio C Hamano <junkio@cox.net> wrote:
> Dana How <danahow@gmail.com> writes:
> > Add config variables pack.compression and core.loosecompression .
> > Loose objects will be compressed using level
> >   isset(core.loosecompression) ? core.loosecompression :
> >   isset(core.compression) ? core.compression : Z_BEST_SPEED
> > and objects in packs will be compressed using level
> >   isset(pack.compression) ? pack.compression :
> >   isset(core.compression) ? core.compression : Z_DEFAULT_COMPRESSION
> > pack-objects also accepts --compression=N which
> > overrides the latter expression.
>
> Do you think the above is readable?
>   Compression level for loose objects is controlled by variable
>   core.loosecompression (or core.compression, if the former is
>   missing), and defaults to best-speed.
> or something like that?
Your phrasing is much better.
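
To make the fallback order concrete, here is a minimal sketch in C. It is
only an illustration, not the code in the patch: pick_level() and the
SKETCH_* constants are hypothetical names, and the two constants simply
mirror the values of Z_BEST_SPEED (1) and Z_DEFAULT_COMPRESSION (-1) from
zlib.h so the sketch builds without zlib.

```c
/* Hypothetical constants mirroring zlib.h values. */
enum {
	SKETCH_BEST_SPEED = 1,           /* same value as Z_BEST_SPEED */
	SKETCH_DEFAULT_COMPRESSION = -1  /* same value as Z_DEFAULT_COMPRESSION */
};

/*
 * Return the specific setting if it was seen in the config,
 * else the generic core.compression if that was seen,
 * else the built-in fallback.
 */
static int pick_level(int specific_seen, int specific_level,
		      int core_seen, int core_level, int fallback)
{
	if (specific_seen)
		return specific_level;
	if (core_seen)
		return core_level;
	return fallback;
}
```

Under these assumptions, the loose-object level would be
pick_level(loose_seen, loose, core_seen, core, SKETCH_BEST_SPEED) and the
pack level would pass SKETCH_DEFAULT_COMPRESSION as the fallback instead.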
> > This applies on top of the git-repack --max-pack-size patchset.
> Hmph, that makes the --max-pack-size patchset take this more
> trivial and straightforward improvements hostage.  In general,
> I'd prefer more elaborate ones based on less questionable
> series.
The max-pack-size and pack.compression patches touch the same lines.
I thought my options were:
* Submit independently and make you merge; or
* Make one precede the other.
Since max-pack-size has been out there since April 4 and
the first acceptable version was May 1 (suggested by 0 comments),
I didn't realize it was a "questionable series".
I think it should be straightforward for me to re-submit this
based on current master.
> > +     /* differing core & pack compression when loose object -> must recompress */
> > +     if (!entry->in_pack && pack_compression_level != zlib_compression_level)
> > +             to_reuse = 0;
> > +     else
> I am not sure if that is worth it, as you do not know if the
> loose object you are looking at were compressed with the current
> settings.
You do not know for certain, that is correct.  However, configuring
unequal compression levels signals that you care differently about
the two cases. (For me, I want the compression investment to
correspond to the expected lifetime of the file.)
Also, *if* we have the knobs we want in the config file,
I don't think we're going to be changing these settings all that often.
If I didn't have this check forcing recompression in the pack,
then in the absence of deltification each object would enter the pack
by being copied (in the preceding code block) and pack.compression
would have little effect.  I actually experienced this the very first
time I imported a large dataset into git (I was trying to achieve the
effect of this patch by changing core.compression dynamically, and
was a bit mystified for a while by the result).
Thus, if core.loosecompression is set to speed up git-add, I should
take the time to recompress the object when packing if pack.compression
is different (of course, the hit of not doing so will be lessened by
deltification, which forces a new compression).
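
The decision I am describing can be sketched as a small predicate. This
is a simplified illustration, assuming only the two inputs visible in the
quoted hunk; can_reuse_compressed() is a hypothetical name, and the
further conditions that follow the "else" in the real code are omitted.

```c
/*
 * Return 1 if the already-compressed bytes may be copied into the pack
 * as-is, 0 if the object must be recompressed.
 *
 * in_pack:                nonzero if the object already lives in a pack
 * pack_compression_level: level configured for packs
 * zlib_compression_level: level configured for loose objects
 */
static int can_reuse_compressed(int in_pack,
				int pack_compression_level,
				int zlib_compression_level)
{
	/* A loose object written under a different level is recompressed. */
	if (!in_pack && pack_compression_level != zlib_compression_level)
		return 0;
	return 1;
}
```

With loose objects at level 1 and packs at level 9, a loose object is
recompressed on its way into the pack, while an object that is already
packed is still eligible for reuse.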
> > diff --git a/cache.h b/cache.h
> > index 8e76152..2b3f359 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -283,6 +283,8 @@ extern int warn_ambiguous_refs;
> >  extern int shared_repository;
> >  extern const char *apply_default_whitespace;
> >  extern int zlib_compression_level;
> > +extern int core_compression_level;
> > +extern int core_compression_seen;
>
> Could we somehow remove _seen?  Perhaps by initializing the
> _level to -1?
>
> > +int core_compression_level;
> > +int core_compression_seen;
>
> Same here.
I agree completely.  But what magic value should I use
to initialize the _level variables so I know they are not set?
All valid settings come from zlib.h through #defines, but
no "invalid" value is defined there.  Maybe I'll use -99.
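
The catch with -1 is that zlib.h defines Z_DEFAULT_COMPRESSION as -1, so
-1 is itself a valid level. A minimal sketch of the sentinel idea follows,
assuming -99 as the marker; LEVEL_UNSET, level_is_set() and
level_is_valid() are hypothetical names, and the SKETCH_* constants just
mirror the zlib.h values so the sketch builds without zlib.

```c
enum {
	SKETCH_DEFAULT_COMPRESSION = -1, /* same value as Z_DEFAULT_COMPRESSION */
	SKETCH_BEST_COMPRESSION = 9,     /* same value as Z_BEST_COMPRESSION */
	LEVEL_UNSET = -99                /* hypothetical marker, outside -1..9 */
};

/* Starts out unset; a config parser would overwrite it when the key is seen. */
static int core_compression_level = LEVEL_UNSET;

/* True once a real value has replaced the marker. */
static int level_is_set(int level)
{
	return level != LEVEL_UNSET;
}

/* True for any level zlib accepts: -1 (default) or 0 through 9. */
static int level_is_valid(int level)
{
	return level >= SKETCH_DEFAULT_COMPRESSION &&
	       level <= SKETCH_BEST_COMPRESSION;
}
```

Because the marker lies outside the valid range, no separate _seen flag
is needed to tell "not configured" apart from "configured as default".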
Thanks,
-- 
Dana L. How  danahow@gmail.com  +1 650 804 5991 cell