Git development
* Question on Clean/Smudge Infrastructure
@ 2026-05-04 16:57 rsbecker
  2026-05-05 12:56 ` Junio C Hamano
  0 siblings, 1 reply; 3+ messages in thread
From: rsbecker @ 2026-05-04 16:57 UTC (permalink / raw)
  To: git

Hi Git,

I have an edge use case that I would like to ask about.

Given a directory with a large number of text files, say 100, and a few
scattered binary files (marked as binary in .gitattributes), what does
clean/smudge do with the binary files if they match the filter
specification pattern? Are they ignored or processed? I am not sure that
passing binary via stdin is necessarily portable. However, I would like
to be able to explicitly ignore the binary files in my clean/smudge
filters, either by copying stdin to stdout (as I said, probably not
portable), by returning a non-zero exit code, or by some other mechanism.

The root of the use case is that the directory is subject to significant
changes over time, and errors are sneaking in when people forget to
update .gitattributes or name the files incorrectly. I would like to make
their situation more robust against errors.

Thanks,
Randall

--
Brief whoami: NonStop&UNIX developer since approximately
UNIX(421664400)
NonStop(211288444200000000)
-- In real life, I talk too much.




* Re: Question on Clean/Smudge Infrastructure
  2026-05-04 16:57 Question on Clean/Smudge Infrastructure rsbecker
@ 2026-05-05 12:56 ` Junio C Hamano
  2026-05-05 15:04   ` rsbecker
  0 siblings, 1 reply; 3+ messages in thread
From: Junio C Hamano @ 2026-05-05 12:56 UTC (permalink / raw)
  To: rsbecker; +Cc: git

* Passing binary via stdin is perfectly normal.  Otherwise, it would
  not work to set "exif" as the textconv filter on JPEG image files.

* The "filter" attribute is orthogonal to other attributes like
  "text" or "diff".  If "filter" somehow paid attention to
  binary-ness of the payload and refrained from working at all, then
  it would make it impossible to filter binary contents.

* If you want to apply your "filter" attribute to a subset of the
  files you have, you need to sift your files into two classes: the
  ones your filter should be used on, and the remainder.  Then give
  your filter attribute only to the former.  Perhaps you only want
  *.txt to go through clean/smudge, and then you would have

    *.txt filter=mytextfilter

  in your .gitattributes, and in your .git/config, you would have
  lines to specify the executable you can use on each system.

   [filter "mytextfilter"]
	clean = ... your system specific command comes here ...
	smudge = ... your system specific command comes here ...
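
For illustration only, here is a minimal sketch of the "copy stdin to
stdout" idea raised in the question, written as a clean filter in Python.
It is not part of the original reply. The script name, the NUL-byte
heuristic, the SNIFF_BYTES limit, and the CRLF-to-LF transform_text()
placeholder are all assumptions made for the example; a real filter
would substitute its own detection rule and transformation.

```python
#!/usr/bin/env python3
"""Hypothetical clean filter that passes binary input through untouched."""
import sys

SNIFF_BYTES = 8000  # assumption: how much of the payload to inspect


def looks_binary(data: bytes) -> bool:
    """Heuristic (an assumption): a NUL byte near the start means binary."""
    return b"\0" in data[:SNIFF_BYTES]


def transform_text(data: bytes) -> bytes:
    """Placeholder for the real text transformation (hypothetical)."""
    return data.replace(b"\r\n", b"\n")


def run_filter(data: bytes) -> bytes:
    """Return the bytes the filter should write to stdout."""
    if looks_binary(data):
        return data  # identity copy: the content is stored unchanged
    return transform_text(data)


if __name__ == "__main__":
    # Use the .buffer objects so that no text decoding or newline
    # translation is applied to the payload on any platform.
    payload = sys.stdin.buffer.read()
    sys.stdout.buffer.write(run_filter(payload))
```

Such a script could be named on the "clean =" line above, for example as
"python3 /path/to/passthrough_filter.py" (a hypothetical path). As for
the non-zero exit code option, the gitattributes documentation describes
a filter that fails as a no-op passthrough unless
filter.<driver>.required is set, in which case the failure is an error.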



* RE: Question on Clean/Smudge Infrastructure
  2026-05-05 12:56 ` Junio C Hamano
@ 2026-05-05 15:04   ` rsbecker
  0 siblings, 0 replies; 3+ messages in thread
From: rsbecker @ 2026-05-05 15:04 UTC (permalink / raw)
  To: 'Junio C Hamano'; +Cc: git

Thanks. I'm going to add a mechanism to auto-add file extensions
for this situation. That will allow me to select by pattern and
improve long-term manageability.

Regards,
Randall

