git.vger.kernel.org archive mirror
* t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
@ 2025-10-17  1:52 Xi Ruoyao
  2025-10-17  2:06 ` Collin Funk
  0 siblings, 1 reply; 9+ messages in thread
From: Xi Ruoyao @ 2025-10-17  1:52 UTC (permalink / raw)
  To: git

Hi,

When I test git-2.51.1 I hit a test failure in
t7528-signed-commit-ssh.sh.  Running it with -v reveals:

unix_listener_tmp: path "/home/xry111/sources/12.5/git-2.51.1/t/trash directory.t7528-signed-commit-ssh/.ssh/agent/s.fTyCxA5V6V.agent.dX2yNWQUX5" too long for Unix domain socket
main: Couldn't prepare agent socket

So this seems to be an issue in the test harness.  Is it possible to fix it?

-- 
Xi Ruoyao <xry111@xry111.site>

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17  1:52 t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG Xi Ruoyao
@ 2025-10-17  2:06 ` Collin Funk
  2025-10-17  7:09   ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: Collin Funk @ 2025-10-17  2:06 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: git

Hi Xi,

Xi Ruoyao <xry111@xry111.site> writes:

> When I test git-2.51.1 I hit a test failure in
> t7528-signed-commit-ssh.sh.  Running it with -v reveals:
>
> unix_listener_tmp: path "/home/xry111/sources/12.5/git-2.51.1/t/trash directory.t7528-signed-commit-ssh/.ssh/agent/s.fTyCxA5V6V.agent.dX2yNWQUX5" too long for Unix domain socket
> main: Couldn't prepare agent socket
>
> So this seems to be an issue in the test harness.  Is it possible to fix it?

Unix sockets have an unfortunate historical limit of ~100 characters on
most systems. All the derivatives of 4.4BSD have a limit of 104
characters. Linux has a limit of 108 characters [1]. AIX is nice and
supports 1024 characters, but I assume you are not using that.
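
A quick way to see whether a given path fits is sketched below. It assumes
the Linux limit of 108 bytes mentioned above and reserves one byte for the
terminating NUL. The helper name fits_unix_socket is hypothetical, and
${#1} counts characters, which equals bytes only for ASCII paths such as
the one in the report.

```shell
# Assumed limit: sizeof(sun_path) on Linux, as quoted above.
sun_path_max=108

# Hypothetical helper: succeed if the path, plus a terminating NUL,
# fits into sun_path.
fits_unix_socket () {
	test "${#1}" -lt "$sun_path_max"
}

# The socket path from the original report.
sock='/home/xry111/sources/12.5/git-2.51.1/t/trash directory.t7528-signed-commit-ssh/.ssh/agent/s.fTyCxA5V6V.agent.dX2yNWQUX5'

if fits_unix_socket "$sock"
then
	echo "fits"
else
	echo "too long: ${#sock} characters"
fi
```

For the reported path this takes the "too long" branch, while a short path
such as /tmp/agent.sock would fit.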

I guess this test can check for that error. I'll have a look.

Collin

[1] https://github.com/torvalds/linux/blob/98ac9cc4b4452ed7e714eddc8c90ac4ae5da1a09/include/uapi/linux/un.h#L7

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17  2:06 ` Collin Funk
@ 2025-10-17  7:09   ` Jeff King
  2025-10-17  9:52     ` Lauri Tirkkonen
  2025-10-17 17:42     ` Junio C Hamano
  0 siblings, 2 replies; 9+ messages in thread
From: Jeff King @ 2025-10-17  7:09 UTC (permalink / raw)
  To: Collin Funk; +Cc: Xi Ruoyao, git

On Thu, Oct 16, 2025 at 07:06:44PM -0700, Collin Funk wrote:

> Xi Ruoyao <xry111@xry111.site> writes:
> 
> > When I test git-2.51.1 I hit a test failure in
> > t7528-signed-commit-ssh.sh.  Running it with -v reveals:
> >
> > unix_listener_tmp: path "/home/xry111/sources/12.5/git-2.51.1/t/trash directory.t7528-signed-commit-ssh/.ssh/agent/s.fTyCxA5V6V.agent.dX2yNWQUX5" too long for Unix domain socket
> > main: Couldn't prepare agent socket
> >
> > So this seems to be an issue in the test harness.  Is it possible to fix it?
> 
> Unix sockets have an unfortunate historical limit of ~100 characters on
> most systems. All the derivatives of 4.4BSD have a limit of 104
> characters. Linux has a limit of 108 characters [1]. AIX is nice and
> supports 1024 characters, but I assume you are not using that.
> 
> I guess this test can check for that error. I'll have a look.

Git's internal unix-domain socket code checks the name against
sizeof(sa->sun_path) and will temporarily chdir into the surrounding
directory and use a relative path if necessary.

The errors above aren't from Git, so presumably they're from ssh-agent
itself, which is pulling the name from the $HOME we set in test-lib.sh.
So we could probably use the same trick, something like:

diff --git a/t/t7528-signed-commit-ssh.sh b/t/t7528-signed-commit-ssh.sh
index 0f887a3ebe..214908b2eb 100755
--- a/t/t7528-signed-commit-ssh.sh
+++ b/t/t7528-signed-commit-ssh.sh
@@ -82,7 +82,7 @@ test_expect_success GPGSSH 'create signed commits' '
 test_expect_success GPGSSH 'sign commits using literal public keys with ssh-agent' '
 	test_when_finished "test_unconfig commit.gpgsign" &&
 	test_config gpg.format ssh &&
-	eval $(ssh-agent) &&
+	eval $(ssh-agent -a ./agent.sock) &&
 	test_when_finished "kill ${SSH_AGENT_PID}" &&
 	test_when_finished "test_unconfig user.signingkey" &&
 	mkdir tmpdir &&

But that does mean that ssh-agent will produce:

  SSH_AUTH_SOCK=./agent.sock

which is only valid from that directory. If we wanted to protect
ourselves, we'd have to further set SSH_AUTH_SOCK to the full
$PWD/agent.sock. But I'd guess that just pushes the error onto ssh-add
trying to connect later with the full pathname. Using the relative path
does seem to work for me, at least insofar as:

  ./t7528-signed-commit-ssh.sh --run=1-2 -v -x -i \
    --root=/tmp/holy-smokes-this-is-a-really-long-pathname

triggers the length issue before but not after.
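
If the relative SSH_AUTH_SOCK ever became a problem, pinning it to an
absolute path could look roughly like the sketch below. absolute_sock is
a hypothetical helper, not something that exists in test-lib.sh, and as
noted above the resulting long path may just make ssh-add hit the same
length limit when it connects.

```shell
# Hypothetical helper: turn a relative socket path into an absolute one,
# based on the current working directory.
absolute_sock () {
	case "$1" in
	/*) printf '%s\n' "$1" ;;           # already absolute
	*)  printf '%s\n' "$PWD/${1#./}" ;; # strip a leading "./", prefix $PWD
	esac
}

# Intended use after starting the agent with a relative path:
#   eval "$(ssh-agent -a ./agent.sock)" &&
#   SSH_AUTH_SOCK=$(absolute_sock "$SSH_AUTH_SOCK") &&
#   export SSH_AUTH_SOCK
absolute_sock ./agent.sock
```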

But looking at this test, there's something even more funky going on.
Our $HOME will always have a space in it, because no matter where you
set the root, we will create "trash directory.t7528..." to work in. But
AFAICT, ssh-agent does not quote the path in its output. So for example:

  d='/tmp/has spaces'
  mkdir "$d"
  HOME=$d ssh-agent

will produce:

  SSH_AUTH_SOCK=/tmp/has spaces/.ssh/agent/s.IcPuGe26YY.agent.6PtD3uhM4O; export SSH_AUTH_SOCK;

which is nonsense to eval. And indeed, the "working" version of this
test (without a really long root path) produces:

  ./t7528-signed-commit-ssh.sh: 1: eval: directory.t7528-signed-commit-ssh/.ssh/agent/s.IcPuGe26YY.agent.sOzoazWiDc: not found

I expected that would cause ssh-add to fail, since our SSH_AUTH_SOCK
would point to truncated garbage, and we can't talk to the agent. But it
doesn't even do that. The extra space turns that line from a variable
assignment into a one-shot variable attached to a command that fails to
run. And so we're left with the original SSH_AUTH_SOCK from the
environment, the one in my real $HOME outside of the trash directory.
Yikes!
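
The effect can be reproduced without running ssh-agent at all. The sketch
below feeds a simulated agent output line to eval; the socket names and
the "original" value are made up for illustration.

```shell
# Pretend this was inherited from the user's real environment (made-up path).
SSH_AUTH_SOCK=/original/agent.sock
export SSH_AUTH_SOCK

# Simulated, unquoted ssh-agent output for a $HOME containing a space
# (the socket name is made up).
agent_output='SSH_AUTH_SOCK=/tmp/has spaces/.ssh/agent/s.XXXX; export SSH_AUTH_SOCK;'

# The shell parses this as a one-shot assignment (SSH_AUTH_SOCK=/tmp/has)
# attached to the command "spaces/.ssh/agent/s.XXXX", which does not exist.
# The "not found" error is silenced here to keep the output short.
eval $agent_output 2>/dev/null || true

# The one-shot assignment never touched the shell's own variable.
echo "SSH_AUTH_SOCK is still $SSH_AUTH_SOCK"
```

The final echo reports /original/agent.sock, which is why the test ends up
talking to whatever agent the invoking user already had.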

If I unset SSH_AUTH_SOCK in my environment, then the test consistently
fails. But I'm somewhat amazed that nobody has complained about this
before. Surely somebody somewhere (especially CI!) is running t7528
without SSH_AUTH_SOCK set in the environment. Which makes me wonder if I'm
missing something.

-Peff

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17  7:09   ` Jeff King
@ 2025-10-17  9:52     ` Lauri Tirkkonen
  2025-10-17 10:54       ` Jeff King
  2025-10-17 17:42     ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Lauri Tirkkonen @ 2025-10-17  9:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Collin Funk, Xi Ruoyao, git

Hi Jeff,

On Fri, Oct 17 2025 03:09:12 -0400, Jeff King wrote:
> But looking at this test, there's something even more funky going on.
> Our $HOME will always have a space in it, because no matter where you
> set the root, we will create "trash directory.t7528..." to work in. But
> AFAICT, ssh-agent does not quote the path in its output. So for example:
> 
>   d='/tmp/has spaces'
>   mkdir "$d"
>   HOME=$d ssh-agent
> 
> will produce:
> 
>   SSH_AUTH_SOCK=/tmp/has spaces/.ssh/agent/s.IcPuGe26YY.agent.6PtD3uhM4O; export SSH_AUTH_SOCK;
> 
> which is nonsense to eval. And indeed, the "working" version of this
> test (without a really long root path) produces:
> 
>   ./t7528-signed-commit-ssh.sh: 1: eval: directory.t7528-signed-commit-ssh/.ssh/agent/s.IcPuGe26YY.agent.sOzoazWiDc: not found
> 
> I expected that would cause ssh-add to fail, since our SSH_AUTH_SOCK
> would point to truncated garbage, and we can't talk to the agent. But it
> doesn't even do that. The extra space turns that line from a variable
> assignment into a one-shot variable attached to a command that fails to
> run. And so we're left with the original SSH_AUTH_SOCK from the
> environment, the one in my real $HOME outside of the trash directory.
> Yikes!
> 
> If I unset SSH_AUTH_SOCK in my environment, then the test consistently
> fails. But I'm somewhat amazed that nobody has complained about this
> before. Surely somebody somewhere (especially CI!) is running t7528
> without SSH_AUTH_SOCK set in the environment. Which makes me wonder if I'm
> missing something.

I believe the issue surfaced only now because prior to OpenSSH 10.1,
ssh-agent would put its socket in /tmp by default, not under $HOME. See
https://www.openssh.com/txt/release-10.1

We saw this failure in CI on Alpine Linux and worked around by adding -T
to the ssh-agent invocation in this test, but I suppose that won't work
for earlier releases of OpenSSH.
https://gitlab.alpinelinux.org/alpine/aports/-/commit/81a159c8a371c871c1cd0f212881a757160632fb

-- 
Lauri Tirkkonen | lotheac @ IRCnet

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17  9:52     ` Lauri Tirkkonen
@ 2025-10-17 10:54       ` Jeff King
  2025-10-17 19:31         ` brian m. carlson
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff King @ 2025-10-17 10:54 UTC (permalink / raw)
  To: Lauri Tirkkonen; +Cc: Collin Funk, Xi Ruoyao, git

On Fri, Oct 17, 2025 at 06:52:49PM +0900, Lauri Tirkkonen wrote:

> > If I unset SSH_AUTH_SOCK in my environment, then the test consistently
> > fails. But I'm somewhat amazed that nobody has complained about this
> > before. Surely somebody somewhere (especially CI!) is running t7528
> > without SSH_AUTH_SOCK set in the environment. Which makes me wonder if I'm
> > missing something.
> 
> I believe the issue surfaced only now because prior to OpenSSH 10.1,
> ssh-agent would put its socket in /tmp by default, not under $HOME. See
> https://www.openssh.com/txt/release-10.1

Ah, of course. That explains it perfectly, thanks. So we're going to get
lots more reports as people upgrade. :)

> We saw this failure in CI on Alpine Linux and worked around by adding -T
> to the ssh-agent invocation in this test, but I suppose that won't work
> for earlier releases of OpenSSH.

Yeah. We could either do something like "ssh-agent -T || ssh-agent", or
we could go with "ssh-agent -a" (which has been around since 2002, but
does raise the potential relative-path issue).
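
The first option might be sketched as below. start_agent and the
SSH_AGENT_CMD override are hypothetical names introduced only so the
fallback logic can be exercised without a real agent, and the sketch
inherits the weakness that any failure of "ssh-agent -T" is treated as
"-T is unsupported".

```shell
# Hypothetical wrapper: prefer "ssh-agent -T" and fall back to plain
# "ssh-agent" for releases that do not understand -T.
# SSH_AGENT_CMD is an assumed override; by default the real ssh-agent is used.
start_agent () {
	agent=${SSH_AGENT_CMD:-ssh-agent}
	agent_env=$("$agent" -T 2>/dev/null) ||
	agent_env=$("$agent") ||
	return 1
	# Only eval output from an invocation that actually succeeded.
	eval "$agent_env"
}
```

In the test this would replace the bare eval, for example
"start_agent &&", so that a missing or failing agent breaks the &&-chain.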

-Peff

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17  7:09   ` Jeff King
  2025-10-17  9:52     ` Lauri Tirkkonen
@ 2025-10-17 17:42     ` Junio C Hamano
  2025-10-18  9:51       ` Jeff King
  1 sibling, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2025-10-17 17:42 UTC (permalink / raw)
  To: Jeff King; +Cc: Collin Funk, Xi Ruoyao, git

Jeff King <peff@peff.net> writes:

> AFAICT, ssh-agent does not quote the path in its output. So for example:
>
>   d='/tmp/has spaces'
>   mkdir "$d"
>   HOME=$d ssh-agent
>
> will produce:
>
>   SSH_AUTH_SOCK=/tmp/has spaces/.ssh/agent/s.IcPuGe26YY.agent.6PtD3uhM4O; export SSH_AUTH_SOCK;
>
> which is nonsense to eval.

So if $d were

    d='/tmp/has rm -rf in it'

would that produce some interesting side effect?

> I expected that would cause ssh-add to fail, since our SSH_AUTH_SOCK
> would point to truncated garbage, and we can't talk to the agent. But it
> doesn't even do that. The extra space turns that line from a variable
> assignment into a one-shot variable attached to a command that fails to
> run. And so we're left with the original SSH_AUTH_SOCK from the
> environment, the one in my real $HOME outside of the trash directory.
> Yikes!
>
> If I unset SSH_AUTH_SOCK in my environment, then the test consistently
> fails. But I'm somewhat amazed that nobody has complained about this
> before. Surely somebody somewhere (especially CI!) is running t7528
> without SSH_AUTH_SOCK set in the environment. Which makes me wonder if I'm
> missing something.
>
> -Peff

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17 10:54       ` Jeff King
@ 2025-10-17 19:31         ` brian m. carlson
  2025-10-18  9:56           ` Jeff King
  0 siblings, 1 reply; 9+ messages in thread
From: brian m. carlson @ 2025-10-17 19:31 UTC (permalink / raw)
  To: Jeff King; +Cc: Lauri Tirkkonen, Collin Funk, Xi Ruoyao, git

On 2025-10-17 at 10:54:00, Jeff King wrote:
> On Fri, Oct 17, 2025 at 06:52:49PM +0900, Lauri Tirkkonen wrote:
> 
> > > If I unset SSH_AUTH_SOCK in my environment, then the test consistently
> > > fails. But I'm somewhat amazed that nobody has complained about this
> > > before. Surely somebody somewhere (especially CI!) is running t7528
> > > without SSH_AUTH_SOCK set in the environment. Which makes me wonder if I'm
> > > missing something.
> > 
> > I believe the issue surfaced only now because prior to OpenSSH 10.1,
> > ssh-agent would put its socket in /tmp by default, not under $HOME. See
> > https://www.openssh.com/txt/release-10.1
> 
> Ah, of course. That explains it perfectly, thanks. So we're going to get
> lots more reports as people upgrade. :)

I had not had time to properly analyze it in order to say something more
thoughtful than "this is broken", but I can confirm it breaks for me on
Debian unstable:

  ERROR: ld.so: object 'libc_malloc_debug.so.0' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
  ./t7528-signed-commit-ssh.sh: 1: eval: directory.t7528-signed-commit-ssh/.ssh/agent/s.5w4CQ2109U.agent.5l0ixCaX1S: not found
  Agent pid 1429798
  Could not add identity "/home/bmc/checkouts/git/t/trash directory.t7528-signed-commit-ssh/gpghome/ed25519_ssh_signing_key": agent refused operation

Note that OpenSSH in my case is broken because of the space in the
home directory.  I've reported that to Debian and we'll see if it gets
fixed.  (I did mention it breaks the Git testsuite in the hopes that
improves the likelihood of getting it fixed.)

> > We saw this failure in CI on Alpine Linux and worked around by adding -T
> > to the ssh-agent invocation in this test, but I suppose that won't work
> > for earlier releases of OpenSSH.
> 
> Yeah. We could either do something like "ssh-agent -T || ssh-agent", or
> we could go with "ssh-agent -a" (which has been around since 2002, but
> does raise the potential relative-path issue).

I think something like `ssh-agent -T || ssh-agent` would be better because we
$HOME can be very long in our case, whereas $TMPDIR should not be
excessive (since presumably it worked before and other services, such as
tmux, place their sockets there).
-- 
brian m. carlson (they/them)
Toronto, Ontario, CA

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17 17:42     ` Junio C Hamano
@ 2025-10-18  9:51       ` Jeff King
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2025-10-18  9:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Collin Funk, Xi Ruoyao, git

On Fri, Oct 17, 2025 at 10:42:17AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > AFAICT, ssh-agent does not quote the path in its output. So for example:
> >
> >   d='/tmp/has spaces'
> >   mkdir "$d"
> >   HOME=$d ssh-agent
> >
> > will produce:
> >
> >   SSH_AUTH_SOCK=/tmp/has spaces/.ssh/agent/s.IcPuGe26YY.agent.6PtD3uhM4O; export SSH_AUTH_SOCK;
> >
> > which is nonsense to eval.
> 
> So if $d were
> 
>     d='/tmp/has rm -rf in it'
> 
> would that produce some interesting side effect?

Yep. Somewhat terrifying, though I guess if an attacker controls your
$HOME environment variable you probably have bigger worries.
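
A harmless way to see it, with echo standing in for rm and a simulated
agent output line (the path and socket name are made up):

```shell
# Simulated, unquoted ssh-agent output for
# HOME='/tmp/has echo INJECTED in it' (hypothetical path).
agent_output='SSH_AUTH_SOCK=/tmp/has echo INJECTED in it/.ssh/agent/s.XXXX; export SSH_AUTH_SOCK;'

# eval runs "echo INJECTED in it/.ssh/agent/s.XXXX" with a one-shot
# SSH_AUTH_SOCK=/tmp/has in its environment; capture what it prints.
result=$(eval $agent_output)

echo "the embedded command printed: $result"
```

Whatever follows the first space in $HOME is run as a command, with the
rest of the path as its arguments.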

-Peff

* Re: t7528-signed-commit-ssh.sh fails due to ssh-agent fails to start with ENAMETOOLONG
  2025-10-17 19:31         ` brian m. carlson
@ 2025-10-18  9:56           ` Jeff King
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff King @ 2025-10-18  9:56 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Lauri Tirkkonen, Collin Funk, Xi Ruoyao, git

On Fri, Oct 17, 2025 at 07:31:06PM +0000, brian m. carlson wrote:

> I had not had time to properly analyze it in order to say something more
> thoughtful than "this is broken", but I can confirm it breaks for me on
> Debian unstable:
> 
>   ERROR: ld.so: object 'libc_malloc_debug.so.0' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
>   ./t7528-signed-commit-ssh.sh: 1: eval: directory.t7528-signed-commit-ssh/.ssh/agent/s.5w4CQ2109U.agent.5l0ixCaX1S: not found
>   Agent pid 1429798
>   Could not add identity "/home/bmc/checkouts/git/t/trash directory.t7528-signed-commit-ssh/gpghome/ed25519_ssh_signing_key": agent refused operation
> 
> Note that OpenSSH in my case is broken because of the space in the
> home directory.  I've reported that to Debian and we'll see if it gets
> fixed.  (I did mention it breaks the Git testsuite in the hopes that
> improves the likelihood of getting it fixed.)

Thanks, I saw your report and had nothing to add. I agree it would be
nice if ssh-agent shell-quoted the output. I don't think there should be
portability issues.

> > Yeah. We could either do something like "ssh-agent -T || ssh-agent", or
> > we could go with "ssh-agent -a" (which has been around since 2002, but
> > does raise the potential relative-path issue).
> 
> I think something like `ssh-agent -T || ssh-agent` would be better because we
> $HOME can be very long in our case, whereas $TMPDIR should not be
> excessive (since presumably it worked before and other services, such as
> tmux, place their sockets there).

Yeah, I think "-T" would work fine. I just find it a bit hacky to assume
that a failure of "ssh-agent -T" is because of the "-T" option. We also
could do better at detecting errors in general. If you did not have
ssh-agent at all, then:

  eval $(ssh-agent) &&

will not fail the &&-chain since its exit code is eaten in the $()
substitution, and we eval an empty string. So arguably we should pull
this into a prereq or something that makes sure we can actually run
ssh-agent in the first place.
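
To illustrate the swallowed exit code, here is a small sketch in which
missing_agent is a hypothetical stand-in for an ssh-agent that cannot be
run, so the example works on any system:

```shell
# Stand-in for a missing ssh-agent: prints nothing and fails
# (hypothetical function, used so the example runs anywhere).
missing_agent () {
	return 127
}

# The failure is invisible: eval receives no arguments and returns 0.
if eval $(missing_agent)
then
	naive=continued
else
	naive=stopped
fi

# Capturing the output in a plain assignment keeps the exit status.
if agent_env=$(missing_agent)
then
	checked=continued
else
	checked=stopped
fi

echo "naive: $naive, checked: $checked"
```

This prints "naive: continued, checked: stopped"; a prereq built on the
second form would notice that the agent could not be started.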

-Peff
