git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] run-command: factor out child_process_clear()
@ 2015-10-24 12:11 René Scharfe
  2015-10-24 12:23 ` [PATCH 2/2] daemon: plug memory leak René Scharfe
  2015-10-26 18:43 ` [PATCH 1/2] run-command: factor out child_process_clear() Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: René Scharfe @ 2015-10-24 12:11 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano, Jeff King

Avoid duplication by moving the code to release allocated memory for
arguments and environment to its own function, child_process_clear().
Export it to provide a counterpart to child_process_init().

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 Documentation/technical/api-run-command.txt |  7 +++++++
 run-command.c                               | 15 +++++++++------
 run-command.h                               |  1 +
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt
index a9fdb45..8bf3e37 100644
--- a/Documentation/technical/api-run-command.txt
+++ b/Documentation/technical/api-run-command.txt
@@ -46,6 +46,13 @@ Functions
 	The argument dir corresponds the member .dir. The argument env
 	corresponds to the member .env.
 
+`child_process_clear`::
+
+	Release the memory associated with the struct child_process.
+	Most users of the run-command API don't need to call this
+	function explicitly because `start_command` invokes it on
+	failure and `finish_command` calls it automatically already.
+
 The functions above do the following:
 
 . If a system call failed, errno is set and -1 is returned. A diagnostic
diff --git a/run-command.c b/run-command.c
index e17e456..13fa452 100644
--- a/run-command.c
+++ b/run-command.c
@@ -11,6 +11,12 @@ void child_process_init(struct child_process *child)
 	argv_array_init(&child->env_array);
 }
 
+void child_process_clear(struct child_process *child)
+{
+	argv_array_clear(&child->args);
+	argv_array_clear(&child->env_array);
+}
+
 struct child_to_clean {
 	pid_t pid;
 	struct child_to_clean *next;
@@ -327,8 +333,7 @@ int start_command(struct child_process *cmd)
 fail_pipe:
 			error("cannot create %s pipe for %s: %s",
 				str, cmd->argv[0], strerror(failed_errno));
-			argv_array_clear(&cmd->args);
-			argv_array_clear(&cmd->env_array);
+			child_process_clear(cmd);
 			errno = failed_errno;
 			return -1;
 		}
@@ -513,8 +518,7 @@ fail_pipe:
 			close_pair(fderr);
 		else if (cmd->err)
 			close(cmd->err);
-		argv_array_clear(&cmd->args);
-		argv_array_clear(&cmd->env_array);
+		child_process_clear(cmd);
 		errno = failed_errno;
 		return -1;
 	}
@@ -540,8 +544,7 @@ fail_pipe:
 int finish_command(struct child_process *cmd)
 {
 	int ret = wait_or_whine(cmd->pid, cmd->argv[0], 0);
-	argv_array_clear(&cmd->args);
-	argv_array_clear(&cmd->env_array);
+	child_process_clear(cmd);
 	return ret;
 }
 
diff --git a/run-command.h b/run-command.h
index 5428b04..12bb26c 100644
--- a/run-command.h
+++ b/run-command.h
@@ -47,6 +47,7 @@ struct child_process {
 
 #define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT, ARGV_ARRAY_INIT }
 void child_process_init(struct child_process *);
+void child_process_clear(struct child_process *);
 
 int start_command(struct child_process *);
 int finish_command(struct child_process *);
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] daemon: plug memory leak
  2015-10-24 12:11 [PATCH 1/2] run-command: factor out child_process_clear() René Scharfe
@ 2015-10-24 12:23 ` René Scharfe
  2015-10-26 19:47   ` Jeff King
  2015-10-26 18:43 ` [PATCH 1/2] run-command: factor out child_process_clear() Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: René Scharfe @ 2015-10-24 12:23 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano, Jeff King

Call child_process_clear() when a child ends to release the memory
allocated for its environment.  This is necessary because unlike all
other users of start_command() we don't call finish_command(), which
would have taken care of that for us.

This leak was introduced by f063d38b (daemon: use cld->env_array
when re-spawning).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 daemon.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/daemon.c b/daemon.c
index 56679a1..be70cd4 100644
--- a/daemon.c
+++ b/daemon.c
@@ -802,6 +802,7 @@ static void check_dead_children(void)
 			/* remove the child */
 			*cradle = blanket->next;
 			live_children--;
+			child_process_clear(&blanket->cld);
 			free(blanket);
 		} else
 			cradle = &blanket->next;
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] run-command: factor out child_process_clear()
  2015-10-24 12:11 [PATCH 1/2] run-command: factor out child_process_clear() René Scharfe
  2015-10-24 12:23 ` [PATCH 2/2] daemon: plug memory leak René Scharfe
@ 2015-10-26 18:43 ` Junio C Hamano
  2015-10-26 19:23   ` Stefan Beller
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2015-10-26 18:43 UTC (permalink / raw)
  To: René Scharfe, Stefan Beller; +Cc: Git List, Jeff King

René Scharfe <l.s.r@web.de> writes:

> Avoid duplication by moving the code to release allocated memory for
> arguments and environment to its own function, child_process_clear().
> Export it to provide a counterpart to child_process_init().
>
> Signed-off-by: Rene Scharfe <l.s.r@web.de>
> ---

Hmm, is this _deinit() Stefan added to his series recently?

>  Documentation/technical/api-run-command.txt |  7 +++++++
>  run-command.c                               | 15 +++++++++------
>  run-command.h                               |  1 +
>  3 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt
> index a9fdb45..8bf3e37 100644
> --- a/Documentation/technical/api-run-command.txt
> +++ b/Documentation/technical/api-run-command.txt
> @@ -46,6 +46,13 @@ Functions
>  	The argument dir corresponds the member .dir. The argument env
>  	corresponds to the member .env.
>  
> +`child_process_clear`::
> +
> +	Release the memory associated with the struct child_process.
> +	Most users of the run-command API don't need to call this
> +	function explicitly because `start_command` invokes it on
> +	failure and `finish_command` calls it automatically already.
> +
>  The functions above do the following:
>  
>  . If a system call failed, errno is set and -1 is returned. A diagnostic
> diff --git a/run-command.c b/run-command.c
> index e17e456..13fa452 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -11,6 +11,12 @@ void child_process_init(struct child_process *child)
>  	argv_array_init(&child->env_array);
>  }
>  
> +void child_process_clear(struct child_process *child)
> +{
> +	argv_array_clear(&child->args);
> +	argv_array_clear(&child->env_array);
> +}
> +
>  struct child_to_clean {
>  	pid_t pid;
>  	struct child_to_clean *next;
> @@ -327,8 +333,7 @@ int start_command(struct child_process *cmd)
>  fail_pipe:
>  			error("cannot create %s pipe for %s: %s",
>  				str, cmd->argv[0], strerror(failed_errno));
> -			argv_array_clear(&cmd->args);
> -			argv_array_clear(&cmd->env_array);
> +			child_process_clear(cmd);
>  			errno = failed_errno;
>  			return -1;
>  		}
> @@ -513,8 +518,7 @@ fail_pipe:
>  			close_pair(fderr);
>  		else if (cmd->err)
>  			close(cmd->err);
> -		argv_array_clear(&cmd->args);
> -		argv_array_clear(&cmd->env_array);
> +		child_process_clear(cmd);
>  		errno = failed_errno;
>  		return -1;
>  	}
> @@ -540,8 +544,7 @@ fail_pipe:
>  int finish_command(struct child_process *cmd)
>  {
>  	int ret = wait_or_whine(cmd->pid, cmd->argv[0], 0);
> -	argv_array_clear(&cmd->args);
> -	argv_array_clear(&cmd->env_array);
> +	child_process_clear(cmd);
>  	return ret;
>  }
>  
> diff --git a/run-command.h b/run-command.h
> index 5428b04..12bb26c 100644
> --- a/run-command.h
> +++ b/run-command.h
> @@ -47,6 +47,7 @@ struct child_process {
>  
>  #define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT, ARGV_ARRAY_INIT }
>  void child_process_init(struct child_process *);
> +void child_process_clear(struct child_process *);
>  
>  int start_command(struct child_process *);
>  int finish_command(struct child_process *);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] run-command: factor out child_process_clear()
  2015-10-26 18:43 ` [PATCH 1/2] run-command: factor out child_process_clear() Junio C Hamano
@ 2015-10-26 19:23   ` Stefan Beller
  2015-10-27 21:29     ` René Scharfe
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Beller @ 2015-10-26 19:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, Git List, Jeff King

On Mon, Oct 26, 2015 at 11:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
> René Scharfe <l.s.r@web.de> writes:
>
>> Avoid duplication by moving the code to release allocated memory for
>> arguments and environment to its own function, child_process_clear().
>> Export it to provide a counterpart to child_process_init().
>>
>> Signed-off-by: Rene Scharfe <l.s.r@web.de>
>> ---
>
> Hmm, is this _deinit() Stefan added to his series recently?

Yes. Although you (Junio) take credit for actually using it in these
places, too. :)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] daemon: plug memory leak
  2015-10-24 12:23 ` [PATCH 2/2] daemon: plug memory leak René Scharfe
@ 2015-10-26 19:47   ` Jeff King
  0 siblings, 0 replies; 7+ messages in thread
From: Jeff King @ 2015-10-26 19:47 UTC (permalink / raw)
  To: René Scharfe; +Cc: Git List, Junio C Hamano

On Sat, Oct 24, 2015 at 02:23:20PM +0200, René Scharfe wrote:

> Call child_process_clear() when a child ends to release the memory
> allocated for its environment.  This is necessary because unlike all
> other users of start_command() we don't call finish_command(), which
> would have taken care of that for us.
> 
> This leak was introduced by f063d38b (daemon: use cld->env_array
> when re-spawning).

I agree this is a leak we need to address, but I find the solution a
little unsatisfactory (but tl;dr, it's probably the best we can do).

It would be nice if we could call finish_command() here to do the
wait(). But it does not know about WNOHANG, so we would have to
introduce a new helper anyway.

However, the whole check_dead_children function is accidentally
quadratic, and should probably be refactored in general. It loops over
the list of children, calling `waitpid(pid, WNOHANG)` on each. So if `n`
children connect at one time, we make `O(n^2)` waitpid calls.

I have patches to instead loop over waitpid(-1, WNOHANG) until there are
no dead children left (and use a hash to go from pid to each "struct
child").

But the reason I haven't posted them is that I'm somewhat of the opinion
that this child-list can go away altogether. The main use is in
enforcing the max_connections variable. Just enforcing that can be done
with a counter, but we do something much weirder: when we get a new
connection, we try to kill an _existing_ child by finding the first
client with multiple connections, killing that, sleeping for a second
(!), and then if that killed something, giving the slot to the new
connection.

This has a few problems:

  1. Under non-malicious heavy load, you probably want to reject the new
     request rather than killing an existing one. If you kill a fetch
     that is 99% completed, that client is likely to come back and fetch
     again, and you've just thrown away all of the effort (both CPU and
     bandwidth) put into the failed clone. You haven't spent anything on
     the incoming connection, so you'd much rather they go away and come
     back in a moment.

  2. I think the kill_some_child algorithm was meant to enforce
     fairness so that a single IP cannot monopolize all of your slots.
     I'm not sure that's realistic for a few reasons:

       a. Real bad guys will just hit you with a DDoS from many IPs.

       b. Real sites have load-balancing in front of git-daemon, and the
          client IPs may not be meaningful.

       c. The duplicate selection is pretty naive. A bad guy with tons
	  of duplicate connections is no more likely to be killed than a
	  normal client with exactly two connections (it actually
	  depends solely on when the latest connection from either came
	  in). So I suspect a determine attacker with a single IP could
	  still present a problem.

  3. Calling sleep() in the server parent process is a great way to kill
     performance. :)

Of course my view is somewhat skewed by running a really big site with
load balancers and such (and we have long set --max-connections high
enough that it has never been hit). Maybe it's more realistic protection
for a "normal" site.

But my inclination would be to drop kill_some_child entirely in favor of
simply rejecting new connections that are over the limit, and getting
rid of the child-list entirely (and calling waitpid only to reap the
zombie PIDs).

I guess we'd still need something like child_process_clear(), though. We
could just run it immediately after spawning the child, rather than when
we cleaned it up.  So in that sense this series is the first incremental
step toward that future, anyway.

-Peff

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] run-command: factor out child_process_clear()
  2015-10-26 19:23   ` Stefan Beller
@ 2015-10-27 21:29     ` René Scharfe
  2015-10-27 21:31       ` Stefan Beller
  0 siblings, 1 reply; 7+ messages in thread
From: René Scharfe @ 2015-10-27 21:29 UTC (permalink / raw)
  To: Stefan Beller, Junio C Hamano; +Cc: Git List, Jeff King

Am 26.10.2015 um 20:23 schrieb Stefan Beller:
> On Mon, Oct 26, 2015 at 11:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> René Scharfe <l.s.r@web.de> writes:
>>
>>> Avoid duplication by moving the code to release allocated memory for
>>> arguments and environment to its own function, child_process_clear().
>>> Export it to provide a counterpart to child_process_init().
>>>
>>> Signed-off-by: Rene Scharfe <l.s.r@web.de>
>>> ---
>>
>> Hmm, is this _deinit() Stefan added to his series recently?
>
> Yes. Although you (Junio) take credit for actually using it in these
> places, too. :)

Oh, and it's already in next.  I still like _clear() better (as in 
{argv,object,ref,sha1}_array_clear(), or string_list_clear(), or ...) 
than _deinit(), which seems to have been introduced by vcs-svn and isn't 
a real word.  I'll rebase my patches on top of this series, but probably 
not before the weekend.

René

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] run-command: factor out child_process_clear()
  2015-10-27 21:29     ` René Scharfe
@ 2015-10-27 21:31       ` Stefan Beller
  0 siblings, 0 replies; 7+ messages in thread
From: Stefan Beller @ 2015-10-27 21:31 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Git List, Jeff King

On Tue, Oct 27, 2015 at 2:29 PM, René Scharfe <l.s.r@web.de> wrote:
> Am 26.10.2015 um 20:23 schrieb Stefan Beller:
>>
>> On Mon, Oct 26, 2015 at 11:43 AM, Junio C Hamano <gitster@pobox.com>
>> wrote:
>>>
>>> René Scharfe <l.s.r@web.de> writes:
>>>
>>>> Avoid duplication by moving the code to release allocated memory for
>>>> arguments and environment to its own function, child_process_clear().
>>>> Export it to provide a counterpart to child_process_init().
>>>>
>>>> Signed-off-by: Rene Scharfe <l.s.r@web.de>
>>>> ---
>>>
>>>
>>> Hmm, is this _deinit() Stefan added to his series recently?
>>
>>
>> Yes. Although you (Junio) take credit for actually using it in these
>> places, too. :)
>
>
> Oh, and it's already in next.  I still like _clear() better (as in
> {argv,object,ref,sha1}_array_clear(), or string_list_clear(), or ...) than
> _deinit(), which seems to have been introduced by vcs-svn and isn't a real
> word.  I'll rebase my patches on top of this series, but probably not before
> the weekend.
>

Thanks for the coming cleanup. :)
As you know naming things is hard.

> René
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-10-27 21:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-10-24 12:11 [PATCH 1/2] run-command: factor out child_process_clear() René Scharfe
2015-10-24 12:23 ` [PATCH 2/2] daemon: plug memory leak René Scharfe
2015-10-26 19:47   ` Jeff King
2015-10-26 18:43 ` [PATCH 1/2] run-command: factor out child_process_clear() Junio C Hamano
2015-10-26 19:23   ` Stefan Beller
2015-10-27 21:29     ` René Scharfe
2015-10-27 21:31       ` Stefan Beller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).