git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Aaron Schrab <aaron@schrab.com>
Cc: Max Amelchenko <maxamel2002@gmail.com>,
	Taylor Blau <me@ttaylorr.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>,
	git@vger.kernel.org,
	Hideaki Yoshifuji <hideaki.yoshifuji@miraclelinux.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [bug] git clone command leaves orphaned ssh process
Date: Tue, 12 Sep 2023 00:33:45 -0400
Message-ID: <20230912043345.GA1623696@coredump.intra.peff.net>
In-Reply-To: <20230912T004049Z.jiWw7xuK7fiT@pug.qqx.org>

On Mon, Sep 11, 2023 at 08:40:49PM -0400, Aaron Schrab wrote:

> At 13:11 +0300 11 Sep 2023, Max Amelchenko <maxamel2002@gmail.com> wrote:
> > Maybe it's connected also to the underlying infrastructure? We are
> > getting this in AWS lambda jobs and we're hitting a system limit of
> > max processes because of it.
> 
> Running as a lambda, or in a container, could definitely be why you're
> seeing a difference. Normally when a process is orphaned it gets adopted by
> `init` (PID 1), and that will take care of cleaning up after orphaned zombie
> processes.
> 
> But most of the time containers just run the configured process directly,
> without an init process. That leaves nothing to clean orphan processes.

Yeah, that seems like the culprit. If the clone finishes successfully,
we do end up in finish_connect(), where we wait() for the process. But
if we exit early (in this case, ssh bails and we get EOF on the pipe
reading from it), then we may call die() and exit immediately.

We _could_ take special care to add every spawned process to a global
list, set up handlers via atexit() and signal(), and then reap the
processes. But traditionally it's not a big deal to exit with un-reaped
children, and this is the responsibility of init. I'm not sure it makes
sense for Git to basically reimplement that catch-all (and of course we
cannot even do it reliably if we are killed by certain signals).
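
For illustration only, here is a rough sketch of what that bookkeeping
might look like. Every name in it (tracked, spawn_tracked(),
reap_tracked_children(), MAX_TRACKED) is hypothetical and not taken from
Git's code. It covers only the atexit() half; the signal() handlers are
left out, and nothing here helps if the process is killed by SIGKILL.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_TRACKED 64

/* Hypothetical global list of children spawned but not yet reaped. */
pid_t tracked[MAX_TRACKED];
int nr_tracked;

/* atexit() handler: ask each remaining child to stop, then reap it. */
void reap_tracked_children(void)
{
	int i;

	for (i = 0; i < nr_tracked; i++) {
		kill(tracked[i], SIGTERM);
		while (waitpid(tracked[i], NULL, 0) < 0 && errno == EINTR)
			; /* interrupted by a signal; retry */
	}
	nr_tracked = 0;
}

/* Fork and exec argv, remembering the child. Returns its pid, or -1. */
pid_t spawn_tracked(char *const argv[])
{
	static int registered;
	pid_t pid;

	if (nr_tracked == MAX_TRACKED)
		return -1;
	if (!registered) {
		if (atexit(reap_tracked_children))
			return -1;
		registered = 1;
	}
	pid = fork();
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127); /* exec failed; skip the parent's atexit handlers */
	}
	if (pid > 0)
		tracked[nr_tracked++] = pid;
	return pid;
}
```

Even this small version has to decide whether to signal the children or
just wait for them, and it does nothing for the signal case, which is
part of why it is unappealing.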

> Although for that to really be a problem, it would require hitting that max
> process limit inside a single container invocation. Of course since
> containers usually aren't meant to be spawning a lot of processes, that
> limit might be a lot lower than on a normal system.
> 
> I know that Docker provides a way to include an init process in the started
> container (`docker run --init`), but I don't think that AWS Lambda does.

I don't know anything about Lambda, but if you are running arbitrary
commands, then it seems like you could insert something like this:

  https://github.com/krallin/tini

into the mix. I much prefer that to teaching Git to try to do the same
thing in-process.
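
To make the idea concrete, the core of such a wrapper is roughly the
loop below. This is a sketch under my own assumptions, not tini's actual
code, and run_as_init() is a made-up name. It starts the real command
and then reaps every child it is handed until that command exits.
Orphans are only re-parented to it when it runs as PID 1 (or as a child
subreaper), and a real init does more, such as forwarding signals.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv as a child and reap every child that terminates, including
 * adopted orphans, until the main child exits. Returns the main child's
 * exit code, 128+signal if it was killed, or -1 on error.
 */
int run_as_init(char *const argv[])
{
	pid_t main_child = fork();
	int status;

	if (main_child < 0)
		return -1;
	if (main_child == 0) {
		execvp(argv[0], argv);
		_exit(127); /* exec failed */
	}
	for (;;) {
		pid_t pid = waitpid(-1, &status, 0);

		if (pid < 0) {
			if (errno == EINTR)
				continue;
			return -1; /* e.g. ECHILD: nothing left to wait for */
		}
		if (pid != main_child)
			continue; /* reaped an adopted orphan; keep going */
		if (WIFEXITED(status))
			return WEXITSTATUS(status);
		if (WIFSIGNALED(status))
			return 128 + WTERMSIG(status);
	}
}
```

A main() that calls run_as_init(argv + 1) and exits with the result
would be enough to wrap a "git clone" invocation this way.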

-Peff

Thread overview: 9+ messages
2023-09-10  6:38 [bug] git clone command leaves orphaned ssh process Max Amelchenko
2023-09-10  8:50 ` Bagas Sanjaya
2023-09-10  9:47   ` Max Amelchenko
2023-09-10 18:47     ` Taylor Blau
2023-09-11 10:11       ` Max Amelchenko
2023-09-12  0:40         ` Aaron Schrab
2023-09-12  4:33           ` Jeff King [this message]
2023-09-24 10:25             ` Max Amelchenko
2023-09-25 12:29               ` Jeff King
