* infelicities in git hash-object --stdin-paths with special characters
@ 2024-12-02 17:41 Joey Hess
  2024-12-05  9:44 ` Patrick Steinhardt
  0 siblings, 1 reply; 2+ messages in thread
From: Joey Hess @ 2024-12-02 17:41 UTC
  To: git

Apparently "Icon\r" is a common filename on OSX, anyway it's a legal
unix filename. It seems that sending a line containing that filename to
git hash-object --stdin-paths triggers some DOS-style CRLF handling.
Here I am running git version 2.45.2 on Linux.

$ touch Icon^M
$ printf 'Icon\r\n' | git hash-object --stdin-paths
fatal: could not open 'Icon' for reading: No such file or directory

$ echo 'wrong file!' > Icon
$ printf 'Icon\r\n' | git hash-object --stdin-paths
1c43b74a7787621318ee7442eb5a36e32476f326

While looking at builtin/hash-object.c to see why it might do this, I quickly
noticed another odd behavior:

$ touch '"foo"'
$ printf '"foo"\n' | git hash-object --stdin-paths
fatal: could not open 'foo' for reading: No such file or directory

$ touch '"foo'
$ printf '"foo\n' | git hash-object --stdin-paths
fatal: line is badly quoted

The documentation does not seem to mention that quoted lines in
--stdin-paths are at all special. Of course, quoting would be one way to
work around the CRLF problem, if it were documented.

It seems that some parts of git that read filenames from stdin use
strbuf_getline_lf and others use strbuf_getdelim_strip_crlf. There does
not seem to be any consistency, and my impression is any user is best
off using -z, when the command supports it, to avoid the mess.
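
To make the difference concrete, here is a rough Python model of the two
line-reading behaviours. It is only a sketch, not git's C code: the
function names are made up for illustration, and it assumes the
CRLF-stripping variant drops a CR only when it sits directly before the
terminating LF.

```python
def read_strip_lf(line: bytes) -> bytes:
    """Model of a reader that removes only the trailing LF."""
    return line[:-1] if line.endswith(b"\n") else line


def read_strip_crlf(line: bytes) -> bytes:
    """Model of a reader that removes a trailing LF and a CR before it."""
    if line.endswith(b"\n"):
        line = line[:-1]
        # The CR is dropped only when it directly preceded the LF.
        if line.endswith(b"\r"):
            line = line[:-1]
    return line


if __name__ == "__main__":
    raw = b"Icon\r\n"
    print(read_strip_lf(raw))    # the CR stays part of the filename
    print(read_strip_crlf(raw))  # the CR is lost, so the wrong file is named
```

With the second reader, "Icon\r" and "Icon" become indistinguishable on
input, which matches the hash-object behaviour shown above.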

Given all that, maybe adding -z to hash-object would be a good "fix".

-- 
see shy jo


* Re: infelicities in git hash-object --stdin-paths with special characters
  2024-12-02 17:41 infelicities in git hash-object --stdin-paths with special characters Joey Hess
@ 2024-12-05  9:44 ` Patrick Steinhardt
  0 siblings, 0 replies; 2+ messages in thread
From: Patrick Steinhardt @ 2024-12-05  9:44 UTC
  To: Joey Hess; +Cc: git

On Mon, Dec 02, 2024 at 01:41:07PM -0400, Joey Hess wrote:
> Apparently "Icon\r" is a common filename on OSX, anyway it's a legal
> unix filename. It seems that sending a line containing that filename to
> git hash-object --stdin-paths triggers some DOS-style CRLF handling.
> Here I am running git version 2.45.2 on Linux.
> 
> $ touch Icon^M
> $ printf 'Icon\r\n' | git hash-object --stdin-paths
> fatal: could not open 'Icon' for reading: No such file or directory
> 
> $ echo 'wrong file!' > Icon
> $ printf 'Icon\r\n' | git hash-object --stdin-paths
> 1c43b74a7787621318ee7442eb5a36e32476f326
> 
> While looking at builtin/hash-object.c to see why it might do this, I quickly
> noticed another odd behavior:
> 
> $ touch '"foo"'
> $ printf '"foo"\n' | git hash-object --stdin-paths
> fatal: could not open 'foo' for reading: No such file or directory
> 
> $ touch '"foo'
> $ printf '"foo\n' | git hash-object --stdin-paths
> fatal: line is badly quoted
> 
> The documentation does not seem to mention that quoted lines in
> --stdin-paths are at all special. Of course, quoting would be one way to
> work around the CRLF problem, if it were documented.

Indeed -- the documentation does not mention quoting at all, but we do
use `unquote_c_style()` to parse paths. So the following works:

    $ echo foobar >"$(printf 'something\n\rsomething')"
    $ printf 'something\n\rsomething' | git hash-object --stdin-paths
    fatal: could not open 'something' for reading: No such file or directory
    $ printf '"something\\n\\rsomething"' | git hash-object --stdin-paths
    323fae03f4606ea9991df8befbb2fca795e648fa

Note that you have to escape both "\n" and "\r", and then Git handles
unquoting for you. This really needs documentation though.
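
For a script that feeds paths to --stdin-paths, the quoting side could
look roughly like the Python sketch below. The helper name
`quote_c_style_path` is hypothetical, and the escape set (named escapes
for common control characters, three-digit octal for other bytes outside
printable ASCII) is an assumption about what the unquoting accepts, since
none of this is documented yet. It always emits a quoted line, on the
assumption that a line starting with a double quote is the one that gets
unquoted.

```python
# Named escapes assumed to be understood by C-style unquoting.
_NAMED_ESCAPES = {
    0x07: b"\\a", 0x08: b"\\b", 0x09: b"\\t", 0x0A: b"\\n",
    0x0B: b"\\v", 0x0C: b"\\f", 0x0D: b"\\r",
    0x22: b'\\"', 0x5C: b"\\\\",
}


def quote_c_style_path(path: bytes) -> bytes:
    """Return `path` wrapped in double quotes with C-style escapes."""
    out = bytearray(b'"')
    for byte in path:
        if byte in _NAMED_ESCAPES:
            out += _NAMED_ESCAPES[byte]
        elif byte < 0x20 or byte >= 0x7F:
            # Any other non-printable or non-ASCII byte: 3-digit octal.
            out += ("\\%03o" % byte).encode("ascii")
        else:
            out.append(byte)
    out += b'"'
    return bytes(out)


if __name__ == "__main__":
    for name in (b"Icon\r", b'"foo"', b"something\n\rsomething"):
        print(quote_c_style_path(name).decode("ascii"))
```

The third name produces the same quoted line as the printf example above.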

> It seems that some parts of git that read filenames from stdin use
> strbuf_getline_lf and others use strbuf_getdelim_strip_crlf. There does
> not seem to be any consistency, and my impression is any user is best
> off using -z, when the command supports it, to avoid the mess.
> 
> Given all that, maybe adding -z to hash-object would be a good "fix".

I think this is a good idea regardless of whether we document the
quoting behaviour or not. It is way easier for programs to embed NUL
characters than having to handle the quoting rules implemented by Git.
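
As an illustration of why NUL framing is simpler for callers, here is a
small Python sketch of both sides. The function names are hypothetical,
and it does not describe an existing hash-object option; it only shows
the general shape of NUL-terminated path lists.

```python
from typing import Iterable, List


def encode_nul_paths(paths: Iterable[bytes]) -> bytes:
    """Join paths into one buffer, terminating each with a NUL byte."""
    chunks = []
    for path in paths:
        if b"\0" in path:
            # NUL cannot appear in a Unix filename, so this is a caller bug.
            raise ValueError("path contains a NUL byte")
        chunks.append(path + b"\0")
    return b"".join(chunks)


def decode_nul_paths(data: bytes) -> List[bytes]:
    """Split a NUL-terminated buffer back into the original paths."""
    parts = data.split(b"\0")
    # A properly terminated buffer leaves one empty trailing element.
    if parts[-1] == b"":
        parts.pop()
    return parts


if __name__ == "__main__":
    names = [b"Icon\r", b'"foo"', b"something\n\rsomething"]
    wire = encode_nul_paths(names)
    print(decode_nul_paths(wire) == names)
```

No escaping or quoting is needed in either direction, because CR, LF and
double quotes are ordinary bytes in this framing.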

Patrick
