From: Junio C Hamano <gitster@pobox.com>
To: Justin Tobler <jltobler@gmail.com>
Cc: git@vger.kernel.org, karthik.188@gmail.com
Subject: Re: [RFC PATCH] diff: add option to report binary files in raw diffs
Date: Mon, 03 Nov 2025 20:44:00 -0800
Message-ID: <xmqqzf92quen.fsf@gitster.g>
In-Reply-To: <xmqqa512sfcj.fsf@gitster.g> (Junio C. Hamano's message of "Mon, 03 Nov 2025 18:26:20 -0800")
Junio C Hamano <gitster@pobox.com> writes:
> Justin Tobler <jltobler@gmail.com> writes:
>
>> I have a usecase where I would like to know exactly which files in a
>> diff pair are considered binary by Git when computing diffs. When
>> computing patch diff output, Git already omits filepair diffs where at
>> least one side is considered binary and prints a "binary files differ"
>> message instead. From this message we cannot discern exactly which files
>> were considered binary by Git though.
>
> I have a usecase where I would like to know exactly which side of a
> diff filepair ends in an incomplete line in a concise format.
>
> Should we add yet another column to the raw output to indicate who
> is complete and who is incomplete?
>
> Where does it lead us and when will it stop?
>
> IOW, yuck ;-).
My point being that it will be a huge mistake to do this only by
singling out a trait that is not so special as if it is very special,
only because you have been thinking about it too long (the "ends in
an incomplete line" trait is what has been on my mind for the past
few days, "this side is binary" may be what you've been thinking
about). There are many other things people would want to learn
concisely in machine readable format, like "where did the file stop
using CRLF line endings and switch to LF line endings", that are
equally plausible as the question you are asking, or the question I
would be asking, "which commit lost the final newline?"
Perhaps an extensible command line option syntax like
$ git log --raw-extended=binary,incomplete,crlf,...
is in order, and the presence of these options would add "tt,ic,cl"
somewhere in the output to signal that both sides are text, preimage
ends in an incomplete line but not postimage, and preimage uses crlf
but postimage uses lf, or something?
Extending beyond 2-way diff is still something we would need to
think about, I guess, but the only thing we need to do may be to
allow N-letter tuples instead of limiting ourselves to 2-letter
pairs, perhaps?
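
To make the N-letter tuple idea concrete, here is a minimal sketch of how
a script might consume such output. Everything about the format is an
assumption for illustration: no --raw-extended option exists, the
comma-separated trait column is placed just before the status letter as
in the RFC's example, the number of leading colons is taken to be the
number of parents, and parse_raw_extended is a hypothetical helper name.
Rename and copy entries (which carry two paths) are not handled.

```python
from typing import NamedTuple


class RawEntry(NamedTuple):
    modes: list[str]
    oids: list[str]
    traits: list[str]  # hypothetical: one tuple per requested trait
    status: str
    path: str


def parse_raw_extended(line: str) -> RawEntry:
    """Parse one raw diff line carrying a hypothetical trait column."""
    # Assumption: N leading colons means N parents, hence N + 1 sides.
    n_parents = len(line) - len(line.lstrip(":"))
    if n_parents == 0:
        raise ValueError("not a raw diff line")
    n_sides = n_parents + 1

    # Raw output separates the path with a TAB; fall back to the last
    # space for lines copied out of an email.
    if "\t" in line:
        meta, _, path = line.partition("\t")
    else:
        meta, _, path = line.rpartition(" ")

    fields = meta.lstrip(":").split()
    if len(fields) != 2 * n_sides + 2:
        raise ValueError("unexpected number of fields")

    modes = fields[:n_sides]
    oids = fields[n_sides:2 * n_sides]
    traits_field, status = fields[2 * n_sides:]

    # Each requested trait contributes one letter per side.
    traits = traits_field.split(",")
    for t in traits:
        if len(t) != n_sides:
            raise ValueError(f"trait tuple {t!r} is not {n_sides} letters")
    return RawEntry(modes, oids, traits, status, path)
```

Under these assumptions a 2-way line carries 2-letter tuples and a
two-parent merge shown with -c carries 3-letter tuples; the parser only
needs the colon count to know which to expect.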
>> In this patch, the raw diff format is extended with a
>> `--report-binary-files` option to explicitly specify which files in the
>> diff pair were considered binary. The output in this form looks
>> something like this:
>>
>> $ git diff-tree --abbrev=8 --report-binary-files HEAD~ HEAD
>> :100644 100644 a1961526 e231acb1 bt M foo
>> :100644 100644 31eedd5c 402a70d7 bb M bar
>>
>> With this format, there is a new column before the status that specifies
>> the binary status for each file. 'b' indicates binary and 't' is used
>> otherwise.
>
> How would this extend beyond 2-way diffs, I wonder.
> Should
>
> $ git show -c --report-binary <a merge>
>
> show [bt]{3} instead of [bt]{2} before the change status letter?