From: Jakub Narebski <jnareb@gmail.com>
To: neubyr <neubyr@gmail.com>
Cc: "Carlos Martín Nieto" <cmn@elego.de>, git@vger.kernel.org
Subject: Re: git repository size / compression
Date: Fri, 09 Sep 2011 07:54:55 -0700 (PDT)
Message-ID: <m339g5u5pm.fsf@localhost.localdomain>
In-Reply-To: <CALFxCvxmPN_O_3xpkrGUYtdkVfz5nr7eaucMrAYQ3uvi820FBg@mail.gmail.com>

neubyr <neubyr@gmail.com> writes:
> On Fri, Sep 9, 2011 at 3:23 AM, Carlos Martín Nieto <cmn@elego.de> wrote:
>> On Thu, 2011-09-08 at 21:37 -0500, neubyr wrote:
>>> I have a test git repository with just two files in it. One of the
>>> files has a set of two lines that is repeated n times.
>>> e.g.:
>>> {{{
>>> $ for i in {1..5}; do cat ./lexico.txt >> lexico1.txt && \
>>>     cat ./lexico.txt >> lexico1.txt && mv ./lexico1.txt ./lexico.txt; done
>>> }}}
>>>
>>
>> So you've just created some data that can be compressed quite
>> efficiently.
>>
>>> I ran the above command a few times and committed after each run. The
>>> disk usage of the repository directory is shown below: 419M is the
>>> working directory size and 2.7M is the git repository/database size.
>>>
>>> {{{
>>> $ du -h -d 1 .
>>> 2.7M ./.git
>>> 419M .
>>>
>>> }}}

Have you tried the same but with

  $ git gc --prune=now

before running `du`?

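
A minimal sketch of that check, assuming a throwaway repository built
the same way as in your test. The seed text, the commit identity and
the temporary directory are made up for illustration; only lexico.txt
and the `git gc --prune=now` call come from this thread.

```shell
#!/bin/sh
# Sketch only: count loose objects before and after `git gc --prune=now`.
set -e

repo=$(mktemp -d)               # throwaway location (assumption)
cd "$repo"
git init -q .
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
git config gc.auto 0            # keep automatic gc out of the measurement

# Seed file with two lines, then double it and commit after each run.
printf 'first line of the pair\nsecond line of the pair\n' > lexico.txt
for i in 1 2 3 4 5 6 7 8; do
    cat lexico.txt lexico.txt > lexico1.txt
    mv lexico1.txt lexico.txt
    git add lexico.txt
    git commit -q -m "run $i"
done

# "count:" is the number of loose objects, "in-pack:" the packed ones.
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')
git gc -q --prune=now
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')

echo "loose objects before gc: $loose_before"
echo "loose objects after gc:  $loose_after (in pack: $packed_after)"
du -sh .git
```

The actual sizes reported by `du` depend on the filesystem, so compare
the numbers on your own machine rather than against the ones above.
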
>>> Is it because of the compression performed by git before storing data
>>> (or before sending a commit)?
>>
>> Yes. Git stores its objects (the commit, the snapshot of the files,
>> etc.) compressed. When these objects are stored in a pack, the size can
>> be further reduced by storing some objects as deltas, each of which
>> describes the difference between that object and some other object in
>> the object database.
>
> Does git store deltas for some files? I thought it uses snapshots
> (exact copies of staged files) only.

Loose objects are stored as whole (zlib-compressed) snapshots, but when
creating a packfile from loose objects (e.g. via `git gc`), Git does
perform delta compression.

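
One way to see the deltas is to list the pack contents after `git gc`.
The following is only a sketch: the repository, seed text and commit
identity are invented for illustration. It relies on `git verify-pack -v`
printing two extra columns (delta depth and base object) for entries
stored as deltas, so those lines have seven fields instead of five.

```shell
#!/bin/sh
# Sketch only: count how many packed objects are stored as deltas.
set -e

repo=$(mktemp -d)               # throwaway location (assumption)
cd "$repo"
git init -q .
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
git config gc.auto 0

# Same kind of history as in the test: double the file, commit, repeat.
printf 'first line of the pair\nsecond line of the pair\n' > lexico.txt
for i in 1 2 3 4 5 6 7 8; do
    cat lexico.txt lexico.txt > lexico1.txt
    mv lexico1.txt lexico.txt
    git add lexico.txt
    git commit -q -m "run $i"
done

git gc -q --prune=now           # packs all loose objects into one pack
idx=$(ls .git/objects/pack/pack-*.idx)

# Object lines have the type in column 2; delta entries have 7 columns.
total=$(git verify-pack -v "$idx" |
    awk '$2 ~ /^(commit|tree|blob|tag)$/ {n++} END {print n+0}')
deltas=$(git verify-pack -v "$idx" |
    awk 'NF == 7 && $2 ~ /^(commit|tree|blob|tag)$/ {n++} END {print n+0}')

echo "objects in pack: $total, stored as deltas: $deltas"
```

Which objects end up as deltas is decided by heuristics inside
`git pack-objects`, so the exact count may differ between Git versions.
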
--
Jakub Narębski