From: <rsbecker@nexbridge.com>
To: "'brian m. carlson'" <sandals@crustytoothpaste.net>,
"'Thomas Braun'" <thomas.braun@virtuell-zuhause.de>
Cc: "'Junio C Hamano'" <gitster@pobox.com>,
"'El_Hoy'" <eloyesp@gmail.com>, <git@vger.kernel.org>
Subject: RE: Making git grep ignore binary the default
Date: Sat, 18 Oct 2025 10:16:52 -0400
Message-ID: <00af01dc4039$dd45e090$97d1a1b0$@nexbridge.com>
In-Reply-To: <aPLkuPgirAVHkERr@fruit.crustytoothpaste.net>

On October 17, 2025 8:52 PM, brian m. carlson wrote:
>On 2025-10-17 at 23:29:22, Thomas Braun wrote:
>> Am 17.10.2025 um 23:29 schrieb Junio C Hamano:
>> > Simply because we have never needed to do something similar to "-a"
>> > and "-I" that we added in early 2006 for the past nearly 20 years.
>> > Also because GNU does not have any such thing to force "-a" or "-I"
>> > as default. The biggest reason is that it would be surprising if
>> > such a change does not break existing scripts that have been written
>> > by people over the years.
>>
>> And what if we only added a config option "grep.ignoreBinary",
>> defaulting to false, without changing the default at all? I always
>> want to ignore binaries when grepping and find it a bit tedious that I
>> have to spell it out every time. And yes, I do have an alias as well,
>> but I usually don't remember to use it.
>
>As Junio said, this could break existing scripts. If I write a command which uses `git
>grep` and expects to find all matching files, it would not work on your system with
>`grep.ignoreBinary` set to true.
>
>For instance, if I am working on a project for a company and must exclude source
>code with a certain vendor's copyright (because we don't have permission to
>distribute their code), then it would be very bad if I accidentally distributed that
>company's binary files due to `git grep -l PATTERN | xargs rm -f` not matching them
>since it would violate the license.
>
>This is just an example, but there are lots of cases where people do really want to
>search every file.
>
>> I'm also curious what people are looking for in binary files with git grep.
>
>It's common to mark PDFs or PostScript files as binary because they often contain
>embedded binary fonts, but they are actually mostly text and can be usefully
>searched with grep. For instance, I once created some awards for a non-profit
>based on combining standalone text-based PostScript code along with output from
>groff, so those independent pieces could end up being source that you might store
>in Git and search, even if many configurations would use `*.ps -text` in a system
>gitattributes file.
>
>Sometimes you also have images or such for a website, which contain XMP
>metadata (a form of XML-serialized RDF). Finding those images which have certain
>author metadata or a certain license URL embedded in them could be valuable.

I agree that this will break scripts. There are quasi-binary files in some SQL
spaces that really benefit from git grep working. Please do not make this the
default.
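
To make the concern concrete, here is a minimal sketch of how `-I` changes
what `git grep -l` reports. The repository, file names, and pattern are made
up for illustration; the only assumption about git is its usual heuristic of
treating a file containing a NUL byte as binary.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# A plain text file and a "quasi-binary" file that both contain the pattern.
printf 'Copyright Example Vendor\n' > notes.txt
# The NUL byte makes git's heuristic classify this file as binary.
printf 'Copyright Example Vendor\n\0\001\002' > blob.bin
git add notes.txt blob.bin

# Default behavior: binary files are searched, and -l lists them too.
default_hits=$(git grep -l 'Example Vendor' | sort | xargs)
# With -I: files detected as binary are skipped entirely.
text_only_hits=$(git grep -I -l 'Example Vendor' | sort | xargs)

echo "default: $default_hits"
echo "with -I: $text_only_hits"
```

In this sketch the default search lists both files, while the `-I` search
lists only notes.txt. A script built on `git grep -l PATTERN | xargs rm -f`
would therefore silently leave blob.bin behind if `-I` became the effective
default.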