From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Git List <git@vger.kernel.org>, Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: Is there some script to find un-delta-able objects?
Date: Fri, 05 Oct 2018 18:44:25 +0200
Message-ID: <87bm88gx7a.fsf@evledraar.gmail.com>
In-Reply-To: <20181005161943.GA8816@sigill.intra.peff.net>


On Fri, Oct 05 2018, Jeff King wrote:

> On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I.e. something to generate the .gitattributes file using this format:
>>
>> https://git-scm.com/docs/gitattributes#_packing_objects
>>
>> Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if
>> there's some repo scanner utility to spew this out for a given repo.
>
> I'm not sure what you mean by "un-delta-able" objects. Do you mean ones
> where we're not likely to find a delta? Or ones where Git will not try
> to look for a delta?
>
> If the latter, I think the only rules are the "-delta" attribute and the
> object size. You should be able to use git-check-attr and "git-cat-file"
> to get that info.
>
> If the former, I don't know how you would know. We can only report on
> what isn't a delta _yet_.

Some version of the former: ones where we haven't found any (or many)
useful deltas yet. E.g. say I had a repository with a lot of files
generated by this command at various points in the history:

    dd if=/dev/urandom of=file.binary count=1024 bs=1024

I'm after some script similar to git-sizer which could report that the
10 *.binary files in my history had a 1:1 ratio between their
packed+compressed+delta'd size in .git and the sum of the sizes of each
file as retrieved by "git show" (i.e. uncompressed, un-delta'd).

That doesn't mean that tomorrow I won't commit 10 new objects which
would delta really well against those 10 existing files, bringing the
ratio to ~1:2. But a report for a given repo like:

    <ratio> <extension>

could be fed into .gitattributes to say we shouldn't bother to delta
files with certain extensions.
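
A rough sketch of such a report, in Python, might look like the
following. It is an illustration under assumptions, not an existing
tool: the function names ratio_by_extension and scan are hypothetical,
and it relies on "git rev-list --objects --all" piped into
"git cat-file --batch-check" with the %(objectsize), %(objectsize:disk)
and %(rest) format atoms. The on-disk size of a delta'd object is the
size of its delta, so the per-extension ratio only reflects the deltas
the current pack happens to contain, not what a future repack might
find.

```python
import os
import subprocess
from collections import defaultdict


def ratio_by_extension(lines):
    """Turn '<type> <size> <disk-size> <path>' lines into {extension: ratio}.

    ratio = on-disk bytes / uncompressed bytes, summed over all blobs
    sharing an extension. A value near 1.0 suggests little was gained
    from compression and deltas.
    """
    totals = defaultdict(lambda: [0, 0])  # ext -> [disk bytes, raw bytes]
    for line in lines:
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, tags and malformed lines
        size, disk, path = int(parts[1]), int(parts[2]), parts[3]
        ext = os.path.splitext(path)[1] or "(none)"
        totals[ext][0] += disk
        totals[ext][1] += size
    return {ext: disk / size for ext, (disk, size) in totals.items() if size > 0}


def scan(repo="."):
    """Collect per-extension ratios for every object reachable in `repo`."""
    opts = dict(check=True, capture_output=True,
                encoding="utf-8", errors="surrogateescape")
    # Each output line is "<object id> <path>" (no path for commits).
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"], **opts)
    # %(rest) echoes whatever followed the object id on the input line.
    fmt = "%(objecttype) %(objectsize) %(objectsize:disk) %(rest)"
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file", "--batch-check=" + fmt],
        input=objects.stdout, **opts)
    return ratio_by_extension(sizes.stdout.splitlines())


if __name__ == "__main__":
    # Print "<ratio> <extension>", worst-compressing extensions first.
    for ext, ratio in sorted(scan().items(), key=lambda kv: -kv[1]):
        print(f"{ratio:.2f} {ext}")
```

Extensions whose ratio stays close to 1.00 would be the candidates for
a "-delta" entry in .gitattributes; where to put the cut-off is a
judgment call this sketch does not make.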
