git.vger.kernel.org archive mirror
From: Igor Djordjevic <igor.d.djordjevic@gmail.com>
To: Volodymyr Sendetskyi <volodymyrse@devcom.com>, git@vger.kernel.org
Subject: Re: Binary files
Date: Thu, 20 Jul 2017 20:49:35 +0200
Message-ID: <d4b1b92d-6ab1-7e6f-4afd-6194a5ba8e40@gmail.com>
In-Reply-To: <CAFc9kS_xYVyPsW7qogDxLugxBb1p2vEFAoP=W9Rdnfqs6XtWKQ@mail.gmail.com>

Hi Volodymyr,

On 20/07/2017 09:41, Volodymyr Sendetskyi wrote:
> It is known, that git handles badly storing binary files in its
> repositories at all.
> This is especially about large files: even without any changes to
> these files, their copies are snapshotted on each commit. So even
> repositories with a small amount of code can grove very fast in size
> if they contain some great binary files. Alongside this, the SVN is
> much better about that, because it make changes to the server version
> of file only if some changes were done.

You already got some proposals on what you could try to make handling
large binary files easier, but I just wanted to comment on this
part of your message, as it doesn't seem to be correct.

Even though each repository file is included in each commit (a commit
being a full snapshot of the repository state), big binary files
included, that's just how it looks from an end-user's perspective.

The actual implementation is smarter than that - if a file hasn't
changed between commits, it won't get copied/written to the Git object
database again.

Under the hood, many different commits can point to the same
(unchanged) file, thus repository size _does not_ grow very fast with
each commit if a large binary file remains unchanged.
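
To illustrate the point above, here is a minimal Python sketch of
content-addressed storage. It assumes the SHA-1 object format, where a
blob ID is the hash of a "blob <size>\0" header followed by the file
content. ToyObjectStore and the file names are hypothetical, and real
Git also zlib-compresses objects and records snapshots as tree
objects, which this sketch leaves out.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the ID Git gives to file content (SHA-1 object format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class ToyObjectStore:
    """Hypothetical stand-in for the object database: ID -> content."""

    def __init__(self):
        self.objects = {}

    def write(self, content: bytes) -> str:
        oid = blob_id(content)
        # Identical content maps to the same ID, so it is stored only once.
        self.objects.setdefault(oid, content)
        return oid

    def size(self) -> int:
        return sum(len(c) for c in self.objects.values())


store = ToyObjectStore()
big_binary = bytes(1_000_000)  # 1 MB of zeros stands in for a large binary

commits = []
for i in range(5):
    # Each "commit" is a full snapshot: path -> blob ID.
    snapshot = {
        "assets/big.bin": store.write(big_binary),  # unchanged every time
        "src/main.c": store.write(f"int version = {i};\n".encode()),
    }
    commits.append(snapshot)

print(len(commits), "commits,", len(store.objects), "stored objects,",
      store.size(), "bytes")
```

All five snapshots reference the big file, yet the store holds it only
once; only the small file that actually changed adds new objects.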

Usually, the biggest concern with Git and large files[1], in
comparison to SVN, for example, is something else - the Git model
assumes that each repository clone holds the complete repository
history with all the different file versions included, so you can't
get just some of them, or the last snapshot only, to keep your local
repository small in size.

If the repository you're cloning from is a big one, your locally
cloned repository will be as well, even if you may not really be
interested in the big files at all... but you got some suggestions
for handling that already, as pointed out :)
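
As a rough back-of-the-envelope illustration of the difference between
holding the complete history and holding only the last snapshot, here
is a small Python sketch. All file names, version labels and sizes are
made up, and compression and delta packing are ignored, so treat the
numbers as an upper-bound toy model rather than what Git would
actually report.

```python
MB = 1_000_000

# Hypothetical history: each commit maps path -> (content label, size in bytes).
history = [
    {"video.bin": ("video-v1", 500 * MB), "main.c": ("main-v1", 2_000)},
    {"video.bin": ("video-v1", 500 * MB), "main.c": ("main-v2", 2_100)},
    {"video.bin": ("video-v2", 510 * MB), "main.c": ("main-v2", 2_100)},
    {"video.bin": ("video-v3", 520 * MB), "main.c": ("main-v3", 2_200)},
]


def full_history_size(history):
    """Bytes needed to hold every distinct file version ever committed."""
    unique = {}
    for snapshot in history:
        for label, size in snapshot.values():
            unique[label] = size  # unchanged files are counted once
    return sum(unique.values())


def latest_snapshot_size(history):
    """Bytes needed to hold only the files of the last commit."""
    return sum(size for _, size in history[-1].values())


print("complete history:", full_history_size(history))
print("last snapshot   :", latest_snapshot_size(history))
```

The unchanged video in the second commit costs nothing extra, but each
changed version adds its full size to what a complete clone has to hold.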

Just note that it's not really Git vs SVN here, but more the
distributed vs centralized approach in general, as you can't both have
everything and yet skip something at the same time. Different systems
may have different workarounds for a specific workflow, though.

[1] Besides storing each file version as a full-sized snapshot (at
first, at least, until delta compression during packing occurs).

Regards,
Buga

