* How Blobs Work ( Blobs Vs. Deltas)
@ 2008-09-30 15:14 Feanil Patel
2008-09-30 15:28 ` Bruce Stephens
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Feanil Patel @ 2008-09-30 15:14 UTC (permalink / raw)
To: git
Hello,
I was reading about git objects in The Git
Book (http://book.git-scm.com/1_the_git_object_model.html), which was
posted on the mailing list a while back and I was wondering something
about blobs and how files are stored in any particular version. If
file A is changed from version one to version two there are two
different blobs that exist for the two versions of the file, is that
correct? The Book was saying Git does not use delta storage so does
this mean that there are two almost identical copies of the file with
the difference being the change that was put in from version one to
version two?
-Feanil
* Re: How Blobs Work ( Blobs Vs. Deltas)
2008-09-30 15:14 How Blobs Work ( Blobs Vs. Deltas) Feanil Patel
@ 2008-09-30 15:28 ` Bruce Stephens
2008-09-30 15:29 ` Johannes Sixt
2008-09-30 18:54 ` Jakub Narebski
2 siblings, 0 replies; 4+ messages in thread
From: Bruce Stephens @ 2008-09-30 15:28 UTC (permalink / raw)
To: git
"Feanil Patel" <feanil@gmail.com> writes:
> I was reading about git objects on The Git
> Book(http://book.git-scm.com/1_the_git_object_model.html) which was
> posted on the mailing list a while back and I was wondering something
> about blobs and how files are stored in any particular version.
[...]
> The Book was saying Git does not use delta storage so does this mean
> that there are two almost identical copies of the file with the
> difference being the change that was put in from version one to
> version two?
There might be. git may also end up using deltas, see
<http://book.git-scm.com/7_the_packfile.html>.
* Re: How Blobs Work ( Blobs Vs. Deltas)
2008-09-30 15:14 How Blobs Work ( Blobs Vs. Deltas) Feanil Patel
2008-09-30 15:28 ` Bruce Stephens
@ 2008-09-30 15:29 ` Johannes Sixt
2008-09-30 18:54 ` Jakub Narebski
2 siblings, 0 replies; 4+ messages in thread
From: Johannes Sixt @ 2008-09-30 15:29 UTC (permalink / raw)
To: Feanil Patel; +Cc: git
Feanil Patel schrieb:
> I was reading about git objects on The Git
> Book(http://book.git-scm.com/1_the_git_object_model.html) which was
> posted on the mailing list a while back and I was wondering something
> about blobs and how files are stored in any particular version. If
> file A is changed from version one to version two there are two
> different blobs that exist for the two versions of the file, is that
> correct? The Book was saying Git does not use delta storage so does
> this mean that there are two almost identical copies of the file with
> the difference being the change that was put in from version one to
> version two?
At the conceptual level, yes. An entire file (== blob) is the smallest
unit that you can address. Even git's internals do not work with smaller
units.
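To make "an entire file == blob" concrete, here is a minimal sketch of how a blob's object name is derived. It assumes the default SHA-1 object format (a repository may be configured differently); the function name blob_id and the two sample contents are made up for illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object name for a blob holding `content`.

    The hash covers a small header ("blob <size>" plus a NUL byte)
    followed by the complete file content, not a diff.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two hypothetical versions of the same file.
version_one = b"hello\n"
version_two = b"hello, world\n"

# Each version is addressed by the hash of its whole content,
# so the two versions are two distinct blobs.
print(blob_id(version_one))
print(blob_id(version_two))
```

The first value should match what "git hash-object" reports for a file containing just "hello" and a newline.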
But there is, of course, a mechanism that stores the database in a more
compact format, the so-called pack files, that basically store differences
between files as much as possible.
-- Hannes
* Re: How Blobs Work ( Blobs Vs. Deltas)
2008-09-30 15:14 How Blobs Work ( Blobs Vs. Deltas) Feanil Patel
2008-09-30 15:28 ` Bruce Stephens
2008-09-30 15:29 ` Johannes Sixt
@ 2008-09-30 18:54 ` Jakub Narebski
2 siblings, 0 replies; 4+ messages in thread
From: Jakub Narebski @ 2008-09-30 18:54 UTC (permalink / raw)
To: Feanil Patel; +Cc: git
"Feanil Patel" <feanil@gmail.com> writes:
> Hello,
>
> I was reading about git objects in "The Git Community Book"
> (http://book.git-scm.com/1_the_git_object_model.html), which was
> posted on the mailing list a while back, and I was wondering something
> about blobs and how files are stored in any particular version. If
> file A is changed from version one to version two there are two
> different blobs that exist for the two versions of the file, is that
> correct? The Book was saying Git does not use delta storage so does
> this mean that there are two almost identical copies of the file with
> the difference being the change that was put in from version one to
> version two?
In Git there are two kinds of storage: loose objects and packs. Each
object generally starts as a loose object; for those it is as you
wrote: if you have two versions of some file, the contents of both
versions are stored as separate objects (blobs). Note that those
'blob' objects are compressed, so they usually don't take more space
than the current version of the file and its backup.
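As a rough sketch of the loose-object case (assuming the default SHA-1 object format; loose_object and the sample contents are hypothetical names chosen for this example), each version is zlib-compressed as a whole and filed under a path derived from its hash:

```python
import hashlib
import zlib
from typing import Tuple


def loose_object(content: bytes) -> Tuple[str, bytes]:
    """Return (path relative to .git, compressed bytes) for a loose blob."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    object_id = hashlib.sha1(store).hexdigest()
    # The first two hex digits name the directory, the rest the file.
    path = "objects/" + object_id[:2] + "/" + object_id[2:]
    return path, zlib.compress(store)


# Two made-up versions of one file; the second adds a single line.
version_one = b"".join(b"line %d\n" % i for i in range(200))
version_two = version_one + b"one more line\n"

for content in (version_one, version_two):
    path, data = loose_object(content)
    print(path, len(content), "bytes raw,", len(data), "bytes compressed")
```

Both versions end up as complete, separately compressed objects; nothing here is stored as a difference against the other version.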
But there is also another kind of storage, namely packs. In the
past you had to pack (repack) objects by invoking "git repack" and
"git prune", and in more modern times by calling "git gc"; nowadays
this should be taken care of by git running "git gc --auto" behind
the scenes. When packing, git tries to find objects with similar
contents, and stores them as a base object plus a binary delta (based
on LibXDiff). So you get the benefits of delta storage, while at the
API and script level you always see single objects.
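The "base object plus binary delta" idea can be illustrated with a toy. This is only a sketch of the concept: it uses Python's difflib rather than LibXDiff, it is not git's pack encoding, and it says nothing about how git picks a base. The names make_delta and apply_delta are invented for the example.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes) -> list:
    """Describe `target` as copy-from-base and insert-literal steps."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))      # offset, length in base
        elif j2 > j1:
            ops.append(("insert", target[j1:j2]))  # bytes not taken from base
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the full target object from the base and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# Two made-up versions of a file that differ in one line.
base = b"".join(b"line %d\n" % i for i in range(50))
target = base.replace(b"line 25\n", b"line twenty-five\n")

delta = make_delta(base, target)
literal = sum(len(op[1]) for op in delta if op[0] == "insert")
print(len(target), "bytes in the new version,", literal, "stored literally")
assert apply_delta(base, delta) == target
```

The caller of apply_delta always gets the complete object back, which mirrors the point above: the delta is a storage detail, and scripts only ever see whole objects.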
Note that explicit repacking allows git to consider not only versions
of the same file to diff against, and to build trees, not only linear
chains, of deltas (think branches); while recency order is preferred,
it is not enforced. Objects and deltas are then compressed individually.
HTH
--
Jakub Narebski
Poland
ShadeHawk on #git