From: "brian m. carlson" <sandals@crustytoothpaste.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "João Victor Bonfim" <JoaoVictorBonfim@protonmail.com>,
"git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Fw: Curiosity
Date: Thu, 16 Dec 2021 02:19:47 +0000
Message-ID: <YbqiQ1B9ezF/RPOn@camp.crustytoothpaste.net>
In-Reply-To: <xmqq8rwl91yf.fsf@gitster.g>
On 2021-12-15 at 18:07:20, Junio C Hamano wrote:
> João Victor Bonfim <JoaoVictorBonfim@protonmail.com> writes:
> > Since Git is used for almost everything at this point, is there
> > any intent to provide better support for non-textual file types?
> > Why do I say this? Take this game mod which I follow as an example ->
> > https://github.com/SolariusScorch/XComFiles <- whenever I clone it,
> > Git takes a very long time to download 452 MB
> > of files, some of which, from my perspective, aren't being delta
> > compressed like the text files are (since, whenever reading a log
> > of what changes were made, git creates and undoes modes for all
> > binary files, some of which only changed by a pixel from one
> > colour to another).
>
> Our delta compression does not care whether the contents are text or
> binary, so if it is not compressed well, it can be a sign that
> the contents are not compressible to begin with, at least with the
> xdelta binary-diff-patch engine we use. Improvement designs,
> algorithms and patches are always welcome ;-)
To expand on this, if what you're storing is already compressed, such as
the Ogg Vorbis files and PNGs found in that repository, then it
generally will not delta well. This is also true of things like
Microsoft Office or OpenOffice documents, because they're essentially
Zip files.
The delta algorithm compresses files by looking for similarities between
them. If a file is already compressed using something like Deflate,
which is used in PNGs and Zip files, then even very similar files will
look very different at the byte level, so deltification will generally
be ineffective.
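
As a rough illustration of the point above, here is a small Python
sketch. It is not Git's delta code: it uses zlib's preset-dictionary
feature as a crude stand-in for "encode this file given a similar one",
and the data, the vocabulary, and the helper name delta_cost are all
made up for the example. It compares how cheaply a slightly modified
file can be encoded against its original, first with both stored raw
and then with both stored already Deflate-compressed.

```python
import random
import zlib


def delta_cost(base: bytes, target: bytes) -> int:
    """Bytes needed to encode `target` when `base` is available as a
    preset dictionary. A crude stand-in for a delta, not Git's algorithm."""
    comp = zlib.compressobj(level=9, zdict=base)
    return len(comp.compress(target) + comp.flush())


# Build roughly 20 KB of compressible, made-up "asset" data.
rng = random.Random(0)
letters = b"abcdefghijklmnopqrstuvwxyz"
vocab = [
    bytes(rng.choice(letters) for _ in range(rng.randint(4, 8)))
    for _ in range(256)
]
original = b" ".join(rng.choice(vocab) for _ in range(3000))

# Change one byte every 2000 bytes, like a handful of edited pixels.
edited = bytearray(original)
for pos in range(1000, len(edited), 2000):
    edited[pos] ^= 0x01
modified = bytes(edited)

# Cost of the modified file given the original, raw versus pre-compressed.
raw_delta = delta_cost(original, modified)
packed_delta = delta_cost(zlib.compress(original, 9), zlib.compress(modified, 9))

print("raw size:               ", len(modified))
print("delta between raw files:", raw_delta)
print("delta between deflated: ", packed_delta)
```

The raw pair should encode in a small fraction of the file size, while
the pre-compressed pair should cost about as much as storing the second
compressed file outright. The exact numbers depend on the zlib build.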
There are two main solutions to this. One is to store your data
uncompressed in the repository and compress it as part of a build step.
This makes your checkouts larger, but it makes your repository smaller.
The other is to store the files outside of the repository proper. Some
folks use Git LFS for this, but you could also just store a manifest
with file names and secure hashes, plus a download location on a public
server.
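
The manifest idea could look something like the following Python
sketch. Everything specific here is an assumption for illustration: the
JSON layout, the example.com URL, and the function names are
hypothetical, and SHA-256 is just one reasonable choice of secure hash.
The manifest is what you would commit; the large files themselves stay
on the server.

```python
import hashlib
import json
from pathlib import Path

BASE_URL = "https://example.com/assets/"  # hypothetical download location


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large assets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(asset_dir: Path, base_url: str = BASE_URL) -> dict:
    """Map each file under asset_dir (by relative name) to its SHA-256."""
    files = {}
    for path in sorted(asset_dir.rglob("*")):
        if path.is_file():
            files[path.relative_to(asset_dir).as_posix()] = sha256_of(path)
    return {"base_url": base_url, "files": files}


def verify(asset_dir: Path, manifest: dict) -> list:
    """Return names that are missing locally or do not match the manifest."""
    bad = []
    for name, expected in manifest["files"].items():
        local = asset_dir / name
        if not local.is_file() or sha256_of(local) != expected:
            bad.append(name)
    return bad


def write_manifest(manifest: dict, out_path: Path) -> None:
    """Write the manifest as stable, diff-friendly JSON text."""
    out_path.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n")
```

Because the manifest is small, sorted text, it diffs and deltas well in
Git, and a fetch script can download base_url plus each name and call
verify() to confirm the files match what was committed.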
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA