git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] receive-pack: plug minor memory leak in unpack()
@ 2014-10-11 11:00 René Scharfe
  2014-10-12  1:53 ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2014-10-11 11:00 UTC (permalink / raw)
  To: Git Mailing List; +Cc: Junio C Hamano, Nguyễn Thái Ngọc Duy

The argv_array used in unpack() is never freed.  Instead of adding
explicit calls to argv_array_clear() use the args member of struct
child_process and let run_command() and friends clean up for us.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 builtin/receive-pack.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a51846c..443dd37 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1220,7 +1220,6 @@ static const char *pack_lockfile;
 static const char *unpack(int err_fd, struct shallow_info *si)
 {
 	struct pack_header hdr;
-	struct argv_array av = ARGV_ARRAY_INIT;
 	const char *hdr_err;
 	int status;
 	char hdr_arg[38];
@@ -1243,16 +1242,16 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 
 	if (si->nr_ours || si->nr_theirs) {
 		alt_shallow_file = setup_temporary_shallow(si->shallow);
-		argv_array_pushl(&av, "--shallow-file", alt_shallow_file, NULL);
+		argv_array_push(&child.args, "--shallow-file");
+		argv_array_push(&child.args, alt_shallow_file);
 	}
 
 	if (ntohl(hdr.hdr_entries) < unpack_limit) {
-		argv_array_pushl(&av, "unpack-objects", hdr_arg, NULL);
+		argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
 		if (quiet)
-			argv_array_push(&av, "-q");
+			argv_array_push(&child.args, "-q");
 		if (fsck_objects)
-			argv_array_push(&av, "--strict");
-		child.argv = av.argv;
+			argv_array_push(&child.args, "--strict");
 		child.no_stdout = 1;
 		child.err = err_fd;
 		child.git_cmd = 1;
@@ -1267,13 +1266,12 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 		if (gethostname(keep_arg + s, sizeof(keep_arg) - s))
 			strcpy(keep_arg + s, "localhost");
 
-		argv_array_pushl(&av, "index-pack",
+		argv_array_pushl(&child.args, "index-pack",
 				 "--stdin", hdr_arg, keep_arg, NULL);
 		if (fsck_objects)
-			argv_array_push(&av, "--strict");
+			argv_array_push(&child.args, "--strict");
 		if (fix_thin)
-			argv_array_push(&av, "--fix-thin");
-		child.argv = av.argv;
+			argv_array_push(&child.args, "--fix-thin");
 		child.out = -1;
 		child.err = err_fd;
 		child.git_cmd = 1;
-- 
2.1.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] receive-pack: plug minor memory leak in unpack()
  2014-10-11 11:00 [PATCH] receive-pack: plug minor memory leak in unpack() René Scharfe
@ 2014-10-12  1:53 ` Jeff King
  2014-10-13 19:08   ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2014-10-12  1:53 UTC (permalink / raw)
  To: René Scharfe
  Cc: Git Mailing List, Junio C Hamano,
	Nguyễn Thái Ngọc Duy

On Sat, Oct 11, 2014 at 01:00:16PM +0200, René Scharfe wrote:

> The argv_array used in unpack() is never freed.  Instead of adding
> explicit calls to argv_array_clear() use the args member of struct
> child_process and let run_command() and friends clean up for us.

Looks good. I notice that the recently added prepare_push_cert_sha1 uses
an argv_array to create the child_process.env, and we leak the result. I
wonder if run-command should provide a managed env array similar to the
"args" array.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] receive-pack: plug minor memory leak in unpack()
  2014-10-12  1:53 ` Jeff King
@ 2014-10-13 19:08   ` Junio C Hamano
  2014-10-14  9:16     ` Jeff King
                       ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Junio C Hamano @ 2014-10-13 19:08 UTC (permalink / raw)
  To: Jeff King
  Cc: René Scharfe, Git Mailing List,
	Nguyễn Thái Ngọc Duy

Jeff King <peff@peff.net> writes:

> On Sat, Oct 11, 2014 at 01:00:16PM +0200, René Scharfe wrote:
>
>> The argv_array used in unpack() is never freed.  Instead of adding
>> explicit calls to argv_array_clear() use the args member of struct
>> child_process and let run_command() and friends clean up for us.
>
> Looks good. I notice that the recently added prepare_push_cert_sha1 uses
> an argv_array to create the child_process.env, and we leak the result.

You're right.  finish_command() runs argv_array_clear() on
cmd->args, but does not care about cmd->env so anybody that use the
(optional) "use these environment variable settings while running
the command" would leak unless the caller takes care of it.

> I wonder if run-command should provide a managed env array similar
> to the "args" array.

I took a look at a few of them:

 - setup_pager() uses a static set of environment variables that are
   not built with argv_array(); should be easy to convert, though.

 - wt_status_print_submodule_summary() does use two argv_arrays, for
   env and argv.  It can use the managed "args" today, and with such
   a change it can also use the managed "envs".

 - receive-pack.c::prepare_push_cert_sha1() would benefit from
   managed "envs".

 - http-backend.c::run_service() would benefit from managed "envs".

 - daemon.c::handle() uses a static set of environment variables
   that are not built with argv_array().  Same for argv.

It shouldn't be too hard to catch and convert all of them.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] receive-pack: plug minor memory leak in unpack()
  2014-10-13 19:08   ` Junio C Hamano
@ 2014-10-14  9:16     ` Jeff King
  2014-10-19 11:13       ` René Scharfe
  2014-10-19 11:13     ` [PATCH 1/2] run-command: add env_array, an optional argv_array for env René Scharfe
  2014-10-19 11:14     ` [PATCH 2/2] use env_array member of struct child_process René Scharfe
  2 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2014-10-14  9:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: René Scharfe, Git Mailing List,
	Nguyễn Thái Ngọc Duy

On Mon, Oct 13, 2014 at 12:08:09PM -0700, Junio C Hamano wrote:

> > I wonder if run-command should provide a managed env array similar
> > to the "args" array.
> 
> I took a look at a few of them:

I took a brief look, too.

I had hoped we could just all it "env" and everybody would be happy
using it instead of bare pointers. But quite a few callers assign
"local_repo_env" to to child_process.env. We could copy it into an
argv_array, of course, but that really feels like working around the
interface. So I think we would prefer to keep both forms available.

That raises the question: what should it be called? The "argv_array"
form of "argv" is called "args". The more I see it, the more I hate that
name, as the two are easily confused. We could have:

  const char **argv;
  struct argv_array argv_array;
  const char **env;
  struct argv_array env_array;

though "argv_array" is a lot to type when you have a string of
argv_array_pushf() calls (which are already by themselves kind of
verbose). Maybe that's not too big a deal, though.

We could flip it to give the managed version the short name (and calling
the unmanaged version "env_ptr" or something). That would require
munging the existing callers, but the tweak would be simple.

>  - daemon.c::handle() uses a static set of environment variables
>    that are not built with argv_array().  Same for argv.

Most of the callers you mentioned are good candidates. This one is
tricky.

The argv array gets malloc'd and set up by the parent git-daemon
process. Then each time we get a client, we create a new struct
child_process that references it. So using the managed argv-array would
actually be a bit more complicated (and not save any memory; we just
always point to the single copy for each child).

For the environment, we build it in a function-local buffer, point the
child_process's env field at it, start the child, and then copy the
child_process into our global list of children. When we notice a child
is dead (by linearly going through the list with waitpid), we free the
list entry. So there are a few potentially bad things here:

  1. We memcpy the child_process to put it on the list. Which does work,
     though it feels a little like we are violating the abstraction
     barrier.

  2. The child_process in the list points to the local "env" buffer that
     is no longer valid. There's no bug because we don't ever look at
     it. Moving to a managed env would fix that. But I have to wonder if
     we even want to be keeping the "struct child_process" around in the
     first place (all we really care about is the pid).

  3. If we do move to a managed env, then we expect it to get cleaned up
     in finish_command. But we never call that; we just free the list
     memory containing the child_process. We would want to call
     finish_command. Except that we will have reaped the process already
     with our call to waitpid() from check_dead_children. So we'd need a
     new call to do just the cleanup without the wait, I guess.

  4. For every loop on the listen socket, we call waitpid() for each
     living child, which is a bit wasteful. We'd probably do better to
     call waitpid(0, &status, WNOHANG), and then look up the resulting
     pid in a hashmap or sorted list when we actually see something that
     died. I don't know that this is a huge problem in practice. We use
     git-daemon pretty heavily on our backend servers at GitHub, and it
     seems to use about 5% of a CPU constantly on each machine. Which is
     kind of lame, given that it isn't doing anything at all, but is
     hardly earth-shattering.

So I'm not sure if it is worth converting to a managed env. There's a
bit of ugliness, but none of it is causing any problems, and doing so
opens a can of worms. The most interesting thing to fix (to me, anyway)
is number 4, but that has nothing to do with the env in the first place.
:)

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] receive-pack: plug minor memory leak in unpack()
  2014-10-14  9:16     ` Jeff King
@ 2014-10-19 11:13       ` René Scharfe
  2014-10-20  9:19         ` Jeff King
  0 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2014-10-19 11:13 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano
  Cc: Git Mailing List, Nguyễn Thái Ngọc Duy

Am 14.10.2014 um 11:16 schrieb Jeff King:
> On Mon, Oct 13, 2014 at 12:08:09PM -0700, Junio C Hamano wrote:
>
>>> I wonder if run-command should provide a managed env array similar
>>> to the "args" array.

That's a good idea.

>>
>> I took a look at a few of them:
>
> I took a brief look, too.
>
> I had hoped we could just all it "env" and everybody would be happy
> using it instead of bare pointers. But quite a few callers assign
> "local_repo_env" to to child_process.env. We could copy it into an
> argv_array, of course, but that really feels like working around the
> interface. So I think we would prefer to keep both forms available.

We could add a flag (clear_local_repo_env?) and reference local_repo_env 
only in run-command.c for these cases.  But some other cases remain that 
are better off providing their own array, like in daemon.c.

> That raises the question: what should it be called? The "argv_array"
> form of "argv" is called "args". The more I see it, the more I hate that
> name, as the two are easily confused. We could have:
>
>    const char **argv;
>    struct argv_array argv_array;
>    const char **env;
>    struct argv_array env_array;
>
> though "argv_array" is a lot to type when you have a string of
> argv_array_pushf() calls (which are already by themselves kind of
> verbose). Maybe that's not too big a deal, though.

I actually like args and argv. :)  Mixing them up is noticed by the 
compiler, so any confusion is cleared up rather quickly.

> We could flip it to give the managed version the short name (and calling
> the unmanaged version "env_ptr" or something). That would require
> munging the existing callers, but the tweak would be simple.

Perhaps, but I'm a but skeptical of big renames.  Let's start small and 
add env_array, and see how far we get with that.

>>   - daemon.c::handle() uses a static set of environment variables
>>     that are not built with argv_array().  Same for argv.
>
> Most of the callers you mentioned are good candidates. This one is
> tricky.
>
> The argv array gets malloc'd and set up by the parent git-daemon
> process. Then each time we get a client, we create a new struct
> child_process that references it. So using the managed argv-array would
> actually be a bit more complicated (and not save any memory; we just
> always point to the single copy for each child).
>
> For the environment, we build it in a function-local buffer, point the
> child_process's env field at it, start the child, and then copy the
> child_process into our global list of children. When we notice a child
> is dead (by linearly going through the list with waitpid), we free the
> list entry. So there are a few potentially bad things here:
>
>    1. We memcpy the child_process to put it on the list. Which does work,
>       though it feels a little like we are violating the abstraction
>       barrier.
>
>    2. The child_process in the list points to the local "env" buffer that
>       is no longer valid. There's no bug because we don't ever look at
>       it. Moving to a managed env would fix that. But I have to wonder if
>       we even want to be keeping the "struct child_process" around in the
>       first place (all we really care about is the pid).
>
>    3. If we do move to a managed env, then we expect it to get cleaned up
>       in finish_command. But we never call that; we just free the list
>       memory containing the child_process. We would want to call
>       finish_command. Except that we will have reaped the process already
>       with our call to waitpid() from check_dead_children. So we'd need a
>       new call to do just the cleanup without the wait, I guess.
>
>    4. For every loop on the listen socket, we call waitpid() for each
>       living child, which is a bit wasteful. We'd probably do better to
>       call waitpid(0, &status, WNOHANG), and then look up the resulting
>       pid in a hashmap or sorted list when we actually see something that
>       died. I don't know that this is a huge problem in practice. We use
>       git-daemon pretty heavily on our backend servers at GitHub, and it
>       seems to use about 5% of a CPU constantly on each machine. Which is
>       kind of lame, given that it isn't doing anything at all, but is
>       hardly earth-shattering.
>
> So I'm not sure if it is worth converting to a managed env. There's a
> bit of ugliness, but none of it is causing any problems, and doing so
> opens a can of worms. The most interesting thing to fix (to me, anyway)
> is number 4, but that has nothing to do with the env in the first place.
> :)

Trickiness makes me nervous, especially in daemon.c.  And 5% CPU usage 
just for waiting sounds awful.  Using waitpid(0, ...) is not supported 
by the current implementation in compat/mingw.c, however.

I agree that env handling should only be changed after the wait loop has 
been improved.

By the way, does getaddrinfo(3) show up in your profiles much?  Recently 
I looked into calling it only on demand instead of for all incoming 
connections because doing that unconditional with a user-supplied 
("tainted") hostname just felt wrong.  The resulting patch series turned 
out to be not very pretty and I didn't see any performance improvements 
in my very limited tests, however; not sure if it's worth it.

René

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] run-command: add env_array, an optional argv_array for env
  2014-10-13 19:08   ` Junio C Hamano
  2014-10-14  9:16     ` Jeff King
@ 2014-10-19 11:13     ` René Scharfe
  2014-10-19 11:14     ` [PATCH 2/2] use env_array member of struct child_process René Scharfe
  2 siblings, 0 replies; 10+ messages in thread
From: René Scharfe @ 2014-10-19 11:13 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Git Mailing List, Nguyễn Thái Ngọc Duy

Similar to args, add a struct argv_array member to struct child_process
that simplifies specifying the environment for children.  It is freed
automatically by finish_command() or if start_command() encounters an
error.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 Documentation/technical/api-run-command.txt | 5 +++++
 run-command.c                               | 6 ++++++
 run-command.h                               | 3 ++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt
index 842b838..3f12fcd 100644
--- a/Documentation/technical/api-run-command.txt
+++ b/Documentation/technical/api-run-command.txt
@@ -169,6 +169,11 @@ string pointers (NULL terminated) in .env:
 . If the string does not contain '=', it names an environment
   variable that will be removed from the child process's environment.
 
+If the .env member is NULL, `start_command` will point it at the
+.env_array `argv_array` (so you may use one or the other, but not both).
+The memory in .env_array will be cleaned up automatically during
+`finish_command` (or during `start_command` when it is unsuccessful).
+
 To specify a new initial working directory for the sub-process,
 specify it in the .dir member.
 
diff --git a/run-command.c b/run-command.c
index 761f0fd..46be513 100644
--- a/run-command.c
+++ b/run-command.c
@@ -12,6 +12,7 @@ void child_process_init(struct child_process *child)
 {
 	memset(child, 0, sizeof(*child));
 	argv_array_init(&child->args);
+	argv_array_init(&child->env_array);
 }
 
 struct child_to_clean {
@@ -287,6 +288,8 @@ int start_command(struct child_process *cmd)
 
 	if (!cmd->argv)
 		cmd->argv = cmd->args.argv;
+	if (!cmd->env)
+		cmd->env = cmd->env_array.argv;
 
 	/*
 	 * In case of errors we must keep the promise to close FDs
@@ -338,6 +341,7 @@ fail_pipe:
 			error("cannot create %s pipe for %s: %s",
 				str, cmd->argv[0], strerror(failed_errno));
 			argv_array_clear(&cmd->args);
+			argv_array_clear(&cmd->env_array);
 			errno = failed_errno;
 			return -1;
 		}
@@ -524,6 +528,7 @@ fail_pipe:
 		else if (cmd->err)
 			close(cmd->err);
 		argv_array_clear(&cmd->args);
+		argv_array_clear(&cmd->env_array);
 		errno = failed_errno;
 		return -1;
 	}
@@ -550,6 +555,7 @@ int finish_command(struct child_process *cmd)
 {
 	int ret = wait_or_whine(cmd->pid, cmd->argv[0]);
 	argv_array_clear(&cmd->args);
+	argv_array_clear(&cmd->env_array);
 	return ret;
 }
 
diff --git a/run-command.h b/run-command.h
index 1b135d1..2137315 100644
--- a/run-command.h
+++ b/run-command.h
@@ -10,6 +10,7 @@
 struct child_process {
 	const char **argv;
 	struct argv_array args;
+	struct argv_array env_array;
 	pid_t pid;
 	/*
 	 * Using .in, .out, .err:
@@ -44,7 +45,7 @@ struct child_process {
 	unsigned clean_on_exit:1;
 };
 
-#define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT }
+#define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT, ARGV_ARRAY_INIT }
 void child_process_init(struct child_process *);
 
 int start_command(struct child_process *);
-- 
2.1.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] use env_array member of struct child_process
  2014-10-13 19:08   ` Junio C Hamano
  2014-10-14  9:16     ` Jeff King
  2014-10-19 11:13     ` [PATCH 1/2] run-command: add env_array, an optional argv_array for env René Scharfe
@ 2014-10-19 11:14     ` René Scharfe
  2014-10-20  9:19       ` Jeff King
  2 siblings, 1 reply; 10+ messages in thread
From: René Scharfe @ 2014-10-19 11:14 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King
  Cc: Git Mailing List, Nguyễn Thái Ngọc Duy

Convert users of struct child_process to using the managed env_array for
specifying environment variables instead of supplying an array on the
stack or bringing their own argv_array.  This shortens and simplifies
the code and ensures automatically that the allocated memory is freed
after use.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 builtin/receive-pack.c | 23 ++++++++++++++---------
 http-backend.c         |  9 +++------
 pager.c                | 15 ++++-----------
 transport-helper.c     | 10 ++--------
 wt-status.c            |  6 ++----
 5 files changed, 25 insertions(+), 38 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f2f6c67..7593861 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -453,7 +453,6 @@ leave:
 static void prepare_push_cert_sha1(struct child_process *proc)
 {
 	static int already_done;
-	struct argv_array env = ARGV_ARRAY_INIT;
 
 	if (!push_cert.len)
 		return;
@@ -487,20 +486,26 @@ static void prepare_push_cert_sha1(struct child_process *proc)
 		nonce_status = check_nonce(push_cert.buf, bogs);
 	}
 	if (!is_null_sha1(push_cert_sha1)) {
-		argv_array_pushf(&env, "GIT_PUSH_CERT=%s", sha1_to_hex(push_cert_sha1));
-		argv_array_pushf(&env, "GIT_PUSH_CERT_SIGNER=%s",
+		argv_array_pushf(&proc->env_array, "GIT_PUSH_CERT=%s",
+				 sha1_to_hex(push_cert_sha1));
+		argv_array_pushf(&proc->env_array, "GIT_PUSH_CERT_SIGNER=%s",
 				 sigcheck.signer ? sigcheck.signer : "");
-		argv_array_pushf(&env, "GIT_PUSH_CERT_KEY=%s",
+		argv_array_pushf(&proc->env_array, "GIT_PUSH_CERT_KEY=%s",
 				 sigcheck.key ? sigcheck.key : "");
-		argv_array_pushf(&env, "GIT_PUSH_CERT_STATUS=%c", sigcheck.result);
+		argv_array_pushf(&proc->env_array, "GIT_PUSH_CERT_STATUS=%c",
+				 sigcheck.result);
 		if (push_cert_nonce) {
-			argv_array_pushf(&env, "GIT_PUSH_CERT_NONCE=%s", push_cert_nonce);
-			argv_array_pushf(&env, "GIT_PUSH_CERT_NONCE_STATUS=%s", nonce_status);
+			argv_array_pushf(&proc->env_array,
+					 "GIT_PUSH_CERT_NONCE=%s",
+					 push_cert_nonce);
+			argv_array_pushf(&proc->env_array,
+					 "GIT_PUSH_CERT_NONCE_STATUS=%s",
+					 nonce_status);
 			if (nonce_status == NONCE_SLOP)
-				argv_array_pushf(&env, "GIT_PUSH_CERT_NONCE_SLOP=%ld",
+				argv_array_pushf(&proc->env_array,
+						 "GIT_PUSH_CERT_NONCE_SLOP=%ld",
 						 nonce_stamp_slop);
 		}
-		proc->env = env.argv;
 	}
 }
 
diff --git a/http-backend.c b/http-backend.c
index 404e682..e3e0dda 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -314,7 +314,6 @@ static void run_service(const char **argv)
 	const char *encoding = getenv("HTTP_CONTENT_ENCODING");
 	const char *user = getenv("REMOTE_USER");
 	const char *host = getenv("REMOTE_ADDR");
-	struct argv_array env = ARGV_ARRAY_INIT;
 	int gzipped_request = 0;
 	struct child_process cld = CHILD_PROCESS_INIT;
 
@@ -329,13 +328,12 @@ static void run_service(const char **argv)
 		host = "(none)";
 
 	if (!getenv("GIT_COMMITTER_NAME"))
-		argv_array_pushf(&env, "GIT_COMMITTER_NAME=%s", user);
+		argv_array_pushf(&cld.env_array, "GIT_COMMITTER_NAME=%s", user);
 	if (!getenv("GIT_COMMITTER_EMAIL"))
-		argv_array_pushf(&env, "GIT_COMMITTER_EMAIL=%s@http.%s",
-				 user, host);
+		argv_array_pushf(&cld.env_array,
+				 "GIT_COMMITTER_EMAIL=%s@http.%s", user, host);
 
 	cld.argv = argv;
-	cld.env = env.argv;
 	if (gzipped_request)
 		cld.in = -1;
 	cld.git_cmd = 1;
@@ -350,7 +348,6 @@ static void run_service(const char **argv)
 
 	if (finish_command(&cld))
 		exit(1);
-	argv_array_clear(&env);
 }
 
 static int show_text_ref(const char *name, const unsigned char *sha1,
diff --git a/pager.c b/pager.c
index b2b805a..f6e8c33 100644
--- a/pager.c
+++ b/pager.c
@@ -74,17 +74,10 @@ void setup_pager(void)
 	pager_process.use_shell = 1;
 	pager_process.argv = pager_argv;
 	pager_process.in = -1;
-	if (!getenv("LESS") || !getenv("LV")) {
-		static const char *env[3];
-		int i = 0;
-
-		if (!getenv("LESS"))
-			env[i++] = "LESS=FRX";
-		if (!getenv("LV"))
-			env[i++] = "LV=-c";
-		env[i] = NULL;
-		pager_process.env = env;
-	}
+	if (!getenv("LESS"))
+		argv_array_push(&pager_process.env_array, "LESS=FRX");
+	if (!getenv("LV"))
+		argv_array_push(&pager_process.env_array, "LV=-c");
 	if (start_command(&pager_process))
 		return;
 
diff --git a/transport-helper.c b/transport-helper.c
index 2b24d51..033a228 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -108,12 +108,6 @@ static struct child_process *get_helper(struct transport *transport)
 	int refspec_alloc = 0;
 	int duped;
 	int code;
-	char git_dir_buf[sizeof(GIT_DIR_ENVIRONMENT) + PATH_MAX + 1];
-	const char *helper_env[] = {
-		git_dir_buf,
-		NULL
-	};
-
 
 	if (data->helper)
 		return data->helper;
@@ -129,8 +123,8 @@ static struct child_process *get_helper(struct transport *transport)
 	helper->git_cmd = 0;
 	helper->silent_exec_failure = 1;
 
-	snprintf(git_dir_buf, sizeof(git_dir_buf), "%s=%s", GIT_DIR_ENVIRONMENT, get_git_dir());
-	helper->env = helper_env;
+	argv_array_pushf(&helper->env_array, "%s=%s", GIT_DIR_ENVIRONMENT,
+			 get_git_dir());
 
 	code = start_command(helper);
 	if (code < 0 && errno == ENOENT)
diff --git a/wt-status.c b/wt-status.c
index 1bf5d72..66d267c 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -726,14 +726,14 @@ static void wt_status_print_changed(struct wt_status *s)
 static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitted)
 {
 	struct child_process sm_summary = CHILD_PROCESS_INIT;
-	struct argv_array env = ARGV_ARRAY_INIT;
 	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct strbuf cmd_stdout = STRBUF_INIT;
 	struct strbuf summary = STRBUF_INIT;
 	char *summary_content;
 	size_t len;
 
-	argv_array_pushf(&env, "GIT_INDEX_FILE=%s", s->index_file);
+	argv_array_pushf(&sm_summary.env_array, "GIT_INDEX_FILE=%s",
+			 s->index_file);
 
 	argv_array_push(&argv, "submodule");
 	argv_array_push(&argv, "summary");
@@ -745,14 +745,12 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt
 		argv_array_push(&argv, s->amend ? "HEAD^" : "HEAD");
 
 	sm_summary.argv = argv.argv;
-	sm_summary.env = env.argv;
 	sm_summary.git_cmd = 1;
 	sm_summary.no_stdin = 1;
 	fflush(s->fp);
 	sm_summary.out = -1;
 
 	run_command(&sm_summary);
-	argv_array_clear(&env);
 	argv_array_clear(&argv);
 
 	len = strbuf_read(&cmd_stdout, sm_summary.out, 1024);
-- 
2.1.2

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] receive-pack: plug minor memory leak in unpack()
  2014-10-19 11:13       ` René Scharfe
@ 2014-10-20  9:19         ` Jeff King
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff King @ 2014-10-20  9:19 UTC (permalink / raw)
  To: René Scharfe
  Cc: Junio C Hamano, Git Mailing List,
	Nguyễn Thái Ngọc Duy

On Sun, Oct 19, 2014 at 01:13:30PM +0200, René Scharfe wrote:

> >We could flip it to give the managed version the short name (and calling
> >the unmanaged version "env_ptr" or something). That would require
> >munging the existing callers, but the tweak would be simple.
> 
> Perhaps, but I'm a but skeptical of big renames.  Let's start small and add
> env_array, and see how far we get with that.

Yeah, having basically implemented patches similar to yours, I think
that is a good first step. Both of your patches looked good to me.

> Trickiness makes me nervous, especially in daemon.c.  And 5% CPU usage just
> for waiting sounds awful.  Using waitpid(0, ...) is not supported by the
> current implementation in compat/mingw.c, however.

I guess you could use wait() and a counter that you increment whenever
you get SIGCLD, but that feels a bit hacky. I wonder how bad a real
waitpid would be for mingw.

> By the way, does getaddrinfo(3) show up in your profiles much?  Recently I
> looked into calling it only on demand instead of for all incoming
> connections because doing that unconditional with a user-supplied
> ("tainted") hostname just felt wrong.  The resulting patch series turned out
> to be not very pretty and I didn't see any performance improvements in my
> very limited tests, however; not sure if it's worth it.

It shows up in the child, not the parent, so it wasn't part of the
profiling I did recently. I did look at it just now, and it does
introduce some latency into each request (though not a lot of CPU;
mainly it's the DNS request). Like you, I'm nervous about convincing
git-daemon to do lookups on random hosts. By itself it's not horrible
(except for tying up git-daemon with absurdly long chains of glueless
references), but I worry that it could exacerbate other problems
(overflows or other bugs in DNS resolvers, as part of a cache-poisoning
scheme, orbeing used for DoS amplification).

I think doing it on demand would be a lot more sensible. We do not need
to do a lookup at all unless the %H, %CH, or %IP interpolated path
features are used. And we do not need to do hostname canonicalization
unless %CH is used.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] use env_array member of struct child_process
  2014-10-19 11:14     ` [PATCH 2/2] use env_array member of struct child_process René Scharfe
@ 2014-10-20  9:19       ` Jeff King
  2014-11-09 13:49         ` René Scharfe
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff King @ 2014-10-20  9:19 UTC (permalink / raw)
  To: René Scharfe
  Cc: Junio C Hamano, Git Mailing List,
	Nguyễn Thái Ngọc Duy

On Sun, Oct 19, 2014 at 01:14:20PM +0200, René Scharfe wrote:

> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -726,14 +726,14 @@ static void wt_status_print_changed(struct wt_status *s)
>  static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitted)
>  {
>  	struct child_process sm_summary = CHILD_PROCESS_INIT;
> -	struct argv_array env = ARGV_ARRAY_INIT;
>  	struct argv_array argv = ARGV_ARRAY_INIT;

I don't think it belongs in this patch, but a follow-on 3/2 might be to
give argv the same treatment.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] use env_array member of struct child_process
  2014-10-20  9:19       ` Jeff King
@ 2014-11-09 13:49         ` René Scharfe
  0 siblings, 0 replies; 10+ messages in thread
From: René Scharfe @ 2014-11-09 13:49 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Git Mailing List,
	Nguyễn Thái Ngọc Duy

Am 20.10.2014 um 11:19 schrieb Jeff King:
> On Sun, Oct 19, 2014 at 01:14:20PM +0200, René Scharfe wrote:
> 
>> --- a/wt-status.c
>> +++ b/wt-status.c
>> @@ -726,14 +726,14 @@ static void wt_status_print_changed(struct wt_status *s)
>>   static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitted)
>>   {
>>   	struct child_process sm_summary = CHILD_PROCESS_INIT;
>> -	struct argv_array env = ARGV_ARRAY_INIT;
>>   	struct argv_array argv = ARGV_ARRAY_INIT;
> 
> I don't think it belongs in this patch, but a follow-on 3/2 might be to
> give argv the same treatment.

Yes, good idea.

-- >8 --
Subject: [PATCH] use args member of struct child_process

Convert users of struct child_process to using the managed argv_array
args instead of providing their own.  This shortens the code a bit and
ensures that the allocated memory is released automatically after use.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 builtin/repack.c | 47 ++++++++++++++++++++++-------------------------
 wt-status.c      | 17 +++++++----------
 2 files changed, 29 insertions(+), 35 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 2845620..83e91c7 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -135,7 +135,6 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	};
 	struct child_process cmd = CHILD_PROCESS_INIT;
 	struct string_list_item *item;
-	struct argv_array cmd_args = ARGV_ARRAY_INIT;
 	struct string_list names = STRING_LIST_INIT_DUP;
 	struct string_list rollback = STRING_LIST_INIT_NODUP;
 	struct string_list existing_packs = STRING_LIST_INIT_DUP;
@@ -202,56 +201,55 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 
 	sigchain_push_common(remove_pack_on_signal);
 
-	argv_array_push(&cmd_args, "pack-objects");
-	argv_array_push(&cmd_args, "--keep-true-parents");
+	argv_array_push(&cmd.args, "pack-objects");
+	argv_array_push(&cmd.args, "--keep-true-parents");
 	if (!pack_kept_objects)
-		argv_array_push(&cmd_args, "--honor-pack-keep");
-	argv_array_push(&cmd_args, "--non-empty");
-	argv_array_push(&cmd_args, "--all");
-	argv_array_push(&cmd_args, "--reflog");
-	argv_array_push(&cmd_args, "--indexed-objects");
+		argv_array_push(&cmd.args, "--honor-pack-keep");
+	argv_array_push(&cmd.args, "--non-empty");
+	argv_array_push(&cmd.args, "--all");
+	argv_array_push(&cmd.args, "--reflog");
+	argv_array_push(&cmd.args, "--indexed-objects");
 	if (window)
-		argv_array_pushf(&cmd_args, "--window=%s", window);
+		argv_array_pushf(&cmd.args, "--window=%s", window);
 	if (window_memory)
-		argv_array_pushf(&cmd_args, "--window-memory=%s", window_memory);
+		argv_array_pushf(&cmd.args, "--window-memory=%s", window_memory);
 	if (depth)
-		argv_array_pushf(&cmd_args, "--depth=%s", depth);
+		argv_array_pushf(&cmd.args, "--depth=%s", depth);
 	if (max_pack_size)
-		argv_array_pushf(&cmd_args, "--max-pack-size=%s", max_pack_size);
+		argv_array_pushf(&cmd.args, "--max-pack-size=%s", max_pack_size);
 	if (no_reuse_delta)
-		argv_array_pushf(&cmd_args, "--no-reuse-delta");
+		argv_array_pushf(&cmd.args, "--no-reuse-delta");
 	if (no_reuse_object)
-		argv_array_pushf(&cmd_args, "--no-reuse-object");
+		argv_array_pushf(&cmd.args, "--no-reuse-object");
 	if (write_bitmaps)
-		argv_array_push(&cmd_args, "--write-bitmap-index");
+		argv_array_push(&cmd.args, "--write-bitmap-index");
 
 	if (pack_everything & ALL_INTO_ONE) {
 		get_non_kept_pack_filenames(&existing_packs);
 
 		if (existing_packs.nr && delete_redundant) {
 			if (unpack_unreachable)
-				argv_array_pushf(&cmd_args,
+				argv_array_pushf(&cmd.args,
 						"--unpack-unreachable=%s",
 						unpack_unreachable);
 			else if (pack_everything & LOOSEN_UNREACHABLE)
-				argv_array_push(&cmd_args,
+				argv_array_push(&cmd.args,
 						"--unpack-unreachable");
 		}
 	} else {
-		argv_array_push(&cmd_args, "--unpacked");
-		argv_array_push(&cmd_args, "--incremental");
+		argv_array_push(&cmd.args, "--unpacked");
+		argv_array_push(&cmd.args, "--incremental");
 	}
 
 	if (local)
-		argv_array_push(&cmd_args,  "--local");
+		argv_array_push(&cmd.args,  "--local");
 	if (quiet)
-		argv_array_push(&cmd_args,  "--quiet");
+		argv_array_push(&cmd.args,  "--quiet");
 	if (delta_base_offset)
-		argv_array_push(&cmd_args,  "--delta-base-offset");
+		argv_array_push(&cmd.args,  "--delta-base-offset");
 
-	argv_array_push(&cmd_args, packtmp);
+	argv_array_push(&cmd.args, packtmp);
 
-	cmd.argv = cmd_args.argv;
 	cmd.git_cmd = 1;
 	cmd.out = -1;
 	cmd.no_stdin = 1;
@@ -270,7 +268,6 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	ret = finish_command(&cmd);
 	if (ret)
 		return ret;
-	argv_array_clear(&cmd_args);
 
 	if (!names.nr && !quiet)
 		printf("Nothing new to pack.\n");
diff --git a/wt-status.c b/wt-status.c
index cdbc8d7..b54eac5 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -726,7 +726,6 @@ static void wt_status_print_changed(struct wt_status *s)
 static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitted)
 {
 	struct child_process sm_summary = CHILD_PROCESS_INIT;
-	struct argv_array argv = ARGV_ARRAY_INIT;
 	struct strbuf cmd_stdout = STRBUF_INIT;
 	struct strbuf summary = STRBUF_INIT;
 	char *summary_content;
@@ -735,23 +734,21 @@ static void wt_status_print_submodule_summary(struct wt_status *s, int uncommitt
 	argv_array_pushf(&sm_summary.env_array, "GIT_INDEX_FILE=%s",
 			 s->index_file);
 
-	argv_array_push(&argv, "submodule");
-	argv_array_push(&argv, "summary");
-	argv_array_push(&argv, uncommitted ? "--files" : "--cached");
-	argv_array_push(&argv, "--for-status");
-	argv_array_push(&argv, "--summary-limit");
-	argv_array_pushf(&argv, "%d", s->submodule_summary);
+	argv_array_push(&sm_summary.args, "submodule");
+	argv_array_push(&sm_summary.args, "summary");
+	argv_array_push(&sm_summary.args, uncommitted ? "--files" : "--cached");
+	argv_array_push(&sm_summary.args, "--for-status");
+	argv_array_push(&sm_summary.args, "--summary-limit");
+	argv_array_pushf(&sm_summary.args, "%d", s->submodule_summary);
 	if (!uncommitted)
-		argv_array_push(&argv, s->amend ? "HEAD^" : "HEAD");
+		argv_array_push(&sm_summary.args, s->amend ? "HEAD^" : "HEAD");
 
-	sm_summary.argv = argv.argv;
 	sm_summary.git_cmd = 1;
 	sm_summary.no_stdin = 1;
 	fflush(s->fp);
 	sm_summary.out = -1;
 
 	run_command(&sm_summary);
-	argv_array_clear(&argv);
 
 	len = strbuf_read(&cmd_stdout, sm_summary.out, 1024);
 
-- 
2.1.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-11-09 13:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-10-11 11:00 [PATCH] receive-pack: plug minor memory leak in unpack() René Scharfe
2014-10-12  1:53 ` Jeff King
2014-10-13 19:08   ` Junio C Hamano
2014-10-14  9:16     ` Jeff King
2014-10-19 11:13       ` René Scharfe
2014-10-20  9:19         ` Jeff King
2014-10-19 11:13     ` [PATCH 1/2] run-command: add env_array, an optional argv_array for env René Scharfe
2014-10-19 11:14     ` [PATCH 2/2] use env_array member of struct child_process René Scharfe
2014-10-20  9:19       ` Jeff King
2014-11-09 13:49         ` René Scharfe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).