git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicolas Pitre <nico@cam.org>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Carl Baldwin <cnb@fc.hp.com>, Junio C Hamano <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: [PATCH] diff-delta: produce optimal pack data
Date: Fri, 24 Feb 2006 15:19:39 -0500 (EST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0602241509050.31162@localhost.localdomain> (raw)
In-Reply-To: <Pine.LNX.4.64.0602241152290.22647@g5.osdl.org>

On Fri, 24 Feb 2006, Linus Torvalds wrote:

> The other thing to look at is "max_size": right now it initializes that to 
> "size / 2 - 20", which just says that we don't ever want a delta that is 
> larger than about half the result (plus the 20 byte overhead for pointing 
> to the thing we delta against). Again, if you feel that normal compression 
> compresses better than half, you could try changing that to
> 
> 	..
> 	max_size = size / 4 - 20;
> 	..

Like I mentioned, max_size should also be caped with the deflated 
undeltified object 
size.  This value is easy to get since plain objects are already 
deflated.

> NOTE! Every SINGLE one of those heuristics are just totally made up by 
> yours truly, and have no testing behind them. They're more of the type 
> "that sounds about right" than "this is how it must be". As mentioned, 
> Nico has already been playing with the heuristics - but he wanted better 
> packs, not better CPU usage, so he went the other way from what you would 
> want to try..

Actually it's a good balance I'm after.

Using 30% more CPU for 10% smaller packs is OK I'd say.

Using 100 times the CPU for 50% saving on only one particular delta is 
not acceptable.

And using more than one hour for 200MB of data with the current window 
default is not acceptable either.


Nicolas

  reply	other threads:[~2006-02-24 20:19 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-02-22  1:45 [PATCH] diff-delta: produce optimal pack data Nicolas Pitre
2006-02-24  8:49 ` Junio C Hamano
2006-02-24 15:37   ` Nicolas Pitre
2006-02-24 23:55     ` Junio C Hamano
2006-02-24 17:44   ` Carl Baldwin
2006-02-24 17:56     ` Nicolas Pitre
2006-02-24 18:35       ` Carl Baldwin
2006-02-24 18:57         ` Nicolas Pitre
2006-02-24 19:23           ` Carl Baldwin
2006-02-24 20:02             ` Nicolas Pitre
2006-02-24 20:40               ` Carl Baldwin
2006-02-24 21:12                 ` Nicolas Pitre
2006-02-24 22:50                   ` Carl Baldwin
2006-02-25  3:53                     ` Nicolas Pitre
2006-02-24 20:02             ` Linus Torvalds
2006-02-24 20:19               ` Nicolas Pitre [this message]
2006-02-24 20:53               ` Junio C Hamano
2006-02-24 21:39                 ` Nicolas Pitre
2006-02-24 21:48                   ` Nicolas Pitre
2006-02-25  0:45                   ` Linus Torvalds
2006-02-25  3:07                     ` Nicolas Pitre
2006-02-25  4:05                       ` Linus Torvalds
2006-02-25  5:10                         ` Nicolas Pitre
2006-02-25  5:35                           ` Nicolas Pitre
2006-03-07 23:48                             ` [RFH] zlib gurus out there? Junio C Hamano
2006-03-08  0:59                               ` Linus Torvalds
2006-03-08  1:22                                 ` Junio C Hamano
2006-03-08  2:00                                   ` Linus Torvalds
2006-03-08  9:46                                     ` Johannes Schindelin
2006-03-08 10:45                               ` [PATCH] write_sha1_file(): Perform Z_FULL_FLUSH between header and data Sergey Vlasov
2006-03-08 11:04                                 ` Junio C Hamano
2006-03-08 14:17                                   ` Sergey Vlasov
2006-02-25 19:18                           ` [PATCH] diff-delta: produce optimal pack data Linus Torvalds
2006-02-24 18:49       ` Carl Baldwin
2006-02-24 19:03         ` Nicolas Pitre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0602241509050.31162@localhost.localdomain \
    --to=nico@cam.org \
    --cc=cnb@fc.hp.com \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).