From: Ingo Molnar <mingo@elte.hu>
To: Paul Jackson <pj@engr.sgi.com>
Cc: torvalds@osdl.org, pasky@ucw.cz, rddunlap@osdl.org,
ross@jose.lug.udel.edu, linux-kernel@vger.kernel.org,
git@vger.kernel.org
Subject: Re: [rfc] git: combo-blobs
Date: Mon, 11 Apr 2005 17:12:04 +0200
Message-ID: <20050411151204.GA5562@elte.hu>
In-Reply-To: <20050411074552.4e2e656b.pj@engr.sgi.com>
* Paul Jackson <pj@engr.sgi.com> wrote:
> Hmmm ... I have this strong sense that I am about 2 hours away from
> smacking my forehead and groaning "Duh - so that's what Ingo meant!"
>
> However, one must play out one's destiny.
>
> Could you provide an example scenario, which results in the creation
> of a combo-blob?
>
> The best I can come up with is the following.
>
> Let's say Nick changes one line in the middle of kernel/sched.c (yeah
> - I know - unlikely scenario - he usually changes more than that -
> nevermind that detail.)
>
> In the days Before Combo Blobs (BCB), git would have been told that
> kernel/sched.c was to be picked up, and would have wrapped it up in a
> zlib'd blob, sha1summed it, seen it was a new sum, and added that blob
> to its objects (or something like this -- I'm still a little fuzzy on
> these git details.)
>
> But Nick just downloaded the latest git 1.5.11.1 which has added
> support for combo blobs, so now, guessing here, instead of wrapping up
> the new sched.c, git instead unwraps the old one, diff's with the new,
> notices a couple of long sequences that are unchanged, wraps up both
> of those sequences as a couple of relatively large blobs, and wraps up
> the new lines that Nick just coded in the middle as a small blob, and
> puts all three in the object store, along with another small
> combo-blob, tying them all together.
actually, git would just include the previous blob by reference.
let's say we had the previous version of sched.c in a blob, ID
cc4ee6107d19f89898a8c89d45810f01710f2ff4. We have the new edit (which is
small, let's say 20 bytes) in blob e010fab710092b19be6e26de1721e249dff2d141.
We'd create the combo-blob representing the new version of sched.c the
following way:
  include cc4ee6107d19f89898a8c89d45810f01710f2ff4 0 54010
  include e010fab710092b19be6e26de1721e249dff2d141 0 20
  include cc4ee6107d19f89898a8c89d45810f01710f2ff4 54030 73061
so we'd include (by reference) most of the previous version, with a
small blob for the extras. Since sched.c compresses down to 36K, we
saved ~32K of bandwidth, and somewhere on the order of 20K of storage.
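A minimal sketch of how such a combo-blob could be turned back into file contents, in Python. The message does not define the two numeric fields; this sketch assumes they are a start and an exclusive end byte offset into the referenced blob, which is the reading under which the three lines add up to a 73061-byte file. The in-memory `store` dict, the function names, and the filler blob contents are all hypothetical, and the toy bytes do not actually hash to the IDs shown.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory object store: blob ID -> uncompressed contents.
ObjectStore = Dict[str, bytes]


def parse_combo(text: str) -> List[Tuple[str, int, int]]:
    """Parse 'include <blob-id> <start> <end>' lines into tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        keyword, blob_id, start, end = line.split()
        if keyword != "include":
            raise ValueError(f"unexpected directive: {keyword!r}")
        entries.append((blob_id, int(start), int(end)))
    return entries


def assemble(combo_text: str, store: ObjectStore) -> bytes:
    """Concatenate the referenced byte ranges in the order listed."""
    parts = []
    for blob_id, start, end in parse_combo(combo_text):
        data = store[blob_id]
        # Assumption: 'end' is exclusive and must lie inside the blob.
        if not 0 <= start <= end <= len(data):
            raise ValueError(f"range {start}..{end} outside blob {blob_id}")
        parts.append(data[start:end])
    return b"".join(parts)


# Filler stand-ins for the two blobs; only the sizes follow the message.
OLD_ID = "cc4ee6107d19f89898a8c89d45810f01710f2ff4"
NEW_ID = "e010fab710092b19be6e26de1721e249dff2d141"
store = {OLD_ID: b"o" * 73061, NEW_ID: b"n" * 20}

combo = "\n".join([
    f"include {OLD_ID} 0 54010",
    f"include {NEW_ID} 0 20",
    f"include {OLD_ID} 54030 73061",
])
new_sched_c = assemble(combo, store)
```

Under that reading, the 20 old bytes at offsets 54010..54030 are dropped and the 20 new bytes take their place, so the reassembled file has the same length as the old one.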
to construct the combo blob later on, we do have to unpack sched.c (and
if it's already a combo-blob that is not cached then we'd have to unpack
all parents until we arrive at some full blob).
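The parent-unpacking step can be sketched as a recursive lookup with a cache, again in Python. The object model here is an assumption for illustration only: a full blob is stored as raw bytes, a combo-blob as a list of (blob ID, start, exclusive end) references, and the short IDs such as `v1` stand in for real SHA1 names.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical object model: a full blob is raw bytes, a combo-blob is a
# list of (blob_id, start, end) references with 'end' exclusive.
Combo = List[Tuple[str, int, int]]
Obj = Union[bytes, Combo]


def resolve(blob_id: str, store: Dict[str, Obj], cache: Dict[str, bytes]) -> bytes:
    """Return the full contents of blob_id, unpacking parents as needed."""
    if blob_id in cache:
        return cache[blob_id]
    obj = store[blob_id]
    if isinstance(obj, bytes):
        data = obj  # reached a full blob, recursion stops here
    else:
        # Each reference may itself point at a combo-blob, so recurse.
        data = b"".join(
            resolve(ref, store, cache)[start:end] for ref, start, end in obj
        )
    cache[blob_id] = data
    return data


store: Dict[str, Obj] = {
    "v1": b"int a;\nint b;\nint c;\n",                   # full blob
    "d2": b"long b;\n",                                  # small delta blob
    "v2": [("v1", 0, 7), ("d2", 0, 8), ("v1", 14, 21)],  # combo on v1
    "d3": b"char c;\n",                                  # small delta blob
    "v3": [("v2", 0, 15), ("d3", 0, 8)],                 # combo on a combo
}
cache: Dict[str, bytes] = {}
text_v3 = resolve("v3", store, cache)
```

Resolving `v3` has to unpack `v2`, which in turn unpacks `v1`. Once the cache is populated, later lookups of any of them are dictionary hits, which is the "not cached" distinction made above.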
> So far, not too bad. Haven't gained anything, and required the
> unpacking of a zlib blob we didn't require before, and the running and
> analyzing of a diff we didn't require before, but the end result is
> only moderately worse - four object blobs instead of one, but of total
> size not much larger (well, total size typically 3 disk blocks worse,
> due to a slight increase in fragmentation from using 4 blocks to store
> what used to be in one.)
we'd have 2 new objects (the 'delta' and the 'combo' blob).
(if # of objects is an issue then we could include new data in the combo
blob itself too, but that's getting too complex i think.)
Ingo