From: <rsbecker@nexbridge.com>
To: "'Junio C Hamano'" <gitster@pobox.com>
Cc: <git@vger.kernel.org>
Subject: RE: Question on Clean/Smudge Infrastructure
Date: Tue, 5 May 2026 11:04:28 -0400
Message-ID: <082601dcdca0$7bcb6e80$73624b80$@nexbridge.com>
In-Reply-To: <xmqqzf2em2v8.fsf@gitster.g>

On May 5, 2026 8:56 AM, Junio C Hamano wrote:
> <rsbecker@nexbridge.com> writes:
>
> > Hi Git,
> >
> > I have an edge use case that I would like to ask about.
> >
> > Given a directory with a large number, say 100, text files, and a few
> > scattered binary files - specified in .gitattributes as binary, what
> > does clean smudge do with the binary files if they match the filter
> > specification pattern? Are they ignored or processed? I am not sure
> > that passing binary via stdin is necessarily portable. However, I
> > would like to be able to explicitly ignore the binary files in my
> > clean/smudge filters - either by doing a copy stdin/stdout (as I said,
> > probably not portable), or sending a non-zero exit code, or some other
> > mechanism.
> > The root of the use case is that the directory is subject to
> > significant changes over time, and errors are sneaking in when people
> > forget to update .gitattributes or name the files incorrectly. I would
> > like to make their situation more stable to errors.
> >
> > Thanks,
> > Randall
> >
> > --
> > Brief whoami: NonStop&UNIX developer since approximately
> > UNIX(421664400)
> > NonStop(211288444200000000)
> > -- In real life, I talk too much.
>
>
> * Passing binary via stdin is perfectly normal. Otherwise, it would
> not work to set "exif" as the textconv filter on JPEG image files.
>
> * The "filter" attribute is orthogonal to other attributes like
> "text" or "diff". If "filter" somehow paid attention to
> binary-ness of the payload and refrained from working at all, then
> it would make it impossible to filter binary contents.
>
> * If you want to apply your "filter" attribute to a subset of the
> files you have, you need to sift your files into two classes: the
> ones your filter should be applied to, and the rest. Then give
> your filter attribute only to the former. Perhaps you only want
> *.txt to go through clean/smudge, and then you would have
>
> *.txt filter=mytextfilter
>
> in your .gitattributes, and in your .git/config, you would have
> lines to specify the executable you can use on each system.
>
> [filter "mytextfilter"]
>     clean = ... your system specific command comes here ...
>     smudge = ... your system specific command comes here ...
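
For the case where a binary file still slips through the pattern, the
filter itself can be written defensively. The following is only a
minimal sketch, not anything Git ships: the script name, the "clean"
argument, the NUL-byte check, and the trailing-whitespace stripping are
all assumptions chosen for illustration. It reads stdin and writes
stdout in binary mode, and copies anything that looks binary through
unchanged.

```python
import sys


def looks_binary(data: bytes, probe: int = 8000) -> bool:
    """Assumed heuristic: a NUL byte near the start means binary content."""
    return b"\0" in data[:probe]


def transform_text(data: bytes) -> bytes:
    """Placeholder text transformation: strip trailing blanks on each line."""
    return b"\n".join(line.rstrip(b" \t") for line in data.split(b"\n"))


def clean(data: bytes) -> bytes:
    """Return binary payloads untouched; transform everything else."""
    if looks_binary(data):
        return data
    return transform_text(data)


def main() -> None:
    # Use the underlying byte streams so no newline or encoding
    # translation is applied to the payload on any platform.
    payload = sys.stdin.buffer.read()
    sys.stdout.buffer.write(clean(payload))


# Only act as a filter when invoked with the (hypothetical) "clean" argument.
if __name__ == "__main__" and sys.argv[1:] == ["clean"]:
    main()
```

Assuming the script were saved as passthrough_filter.py, the config
entry might look like this:

    [filter "mytextfilter"]
        clean = python3 /path/to/passthrough_filter.py clean
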
Thanks. I'm going to add a mechanism to auto-add file extensions
for this situation. That will allow me to select by pattern and
improve long-term manageability.
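
One possible shape for that mechanism, as a rough sketch only: the
function name, the "mytextfilter" default, and the idea of deriving one
"*.ext" pattern per file suffix are assumptions, not a description of
an existing tool. Given the paths known to be text and the current
.gitattributes content, it returns the lines that are still missing.

```python
from pathlib import PurePosixPath


def missing_filter_lines(paths, attributes_text, filter_name="mytextfilter"):
    """Return '*.ext filter=NAME' lines not yet present in attributes_text."""
    wanted_attr = f"filter={filter_name}"

    # Collect patterns that already carry the filter attribute.
    covered = set()
    for line in attributes_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if wanted_attr in fields[1:]:
            covered.add(fields[0])

    # Derive one pattern per distinct suffix; files without a suffix
    # (for example "Makefile") are skipped and need a manual entry.
    suffixes = {PurePosixPath(p).suffix for p in paths}
    patterns = sorted(f"*{s}" for s in suffixes if s)

    return [f"{pat} {wanted_attr}" for pat in patterns if pat not in covered]
```

The caller still has to decide which paths count as text; the sketch
only keeps .gitattributes in step with that decision.
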
Regards,
Randall