From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Git List" <git@vger.kernel.org>
Subject: Re: Is there some script to find un-delta-able objects?
Date: Fri, 05 Oct 2018 09:47:47 -0700 [thread overview]
Message-ID: <xmqqk1mwwdak.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: 20181005161943.GA8816@sigill.intra.peff.net
Jeff King <peff@peff.net> writes:
> On Fri, Oct 05, 2018 at 04:20:27PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> I.e. something to generate the .gitattributes file using this format:
>>
>> https://git-scm.com/docs/gitattributes#_packing_objects
>>
>> Some stuff is obvious, like "*.gpg binary -delta", but I'm wondering if
>> there's some repo scanner utility to spew this out for a given repo.
>
> I'm not sure what you mean by "un-delta-able" objects. Do you mean ones
> where we're not likely to find a delta? Or ones where Git will not try
> to look for a delta?
>
> If the latter, I think the only rules are the "-delta" attribute and the
> object size. You should be able to use git-check-attr and "git-cat-file"
> to get that info.
>
> If the former, I don't know how you would know. We can only report on
> what isn't a delta _yet_.
I am reasonably sure that the question is about solving the former
so that "-delta" attribute is set appropriately.
Iniitially, I thought that it is likely an undeltifiable object has
higher randomness than deltifiable ones and that can be exploited,
but if you have such a highly random blob A (and no other object
like it) in the repository and then later acquire another blob B
that happens to share most of the data with A, then A and B by
themselves will pass the "highly random" test but still yet each can
be expressed as a delta derived from the other. So your "what isn't
a delta yet" is a reasonable assessment of what mechanically can be
known.
Knowledge/heuristic like "No two '*.gpg' files are expected to be
alike" needs something more than the randomness of individual files,
I guess.
prev parent reply other threads:[~2018-10-05 16:47 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-05 14:20 Is there some script to find un-delta-able objects? Ævar Arnfjörð Bjarmason
2018-10-05 16:19 ` Jeff King
2018-10-05 16:44 ` Ævar Arnfjörð Bjarmason
2018-10-05 16:56 ` Jeff King
2018-10-05 16:47 ` Junio C Hamano [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqk1mwwdak.fsf@gitster-ct.c.googlers.com \
--to=gitster@pobox.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).