From: "Jakub Narębski" <jnareb@gmail.com>
To: Lars Schneider <larsxschneider@gmail.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Torsten Bögershausen" <tboegi@web.de>, git <git@vger.kernel.org>,
"Jeff King" <peff@peff.net>, "Stefan Beller" <sbeller@google.com>,
"Martin-Louis Bright" <mlbright@gmail.com>,
"Ramsay Jones" <ramsay@ramsayjones.plus.com>
Subject: Re: [PATCH v8 00/11] Git filter protocol
Date: Tue, 4 Oct 2016 21:04:55 +0200
Message-ID: <f7f9ca4c-229c-390a-beb0-a58e0d3d66b3@gmail.com>
In-Reply-To: <E9946E9F-6EE5-492B-B122-9078CEB88044@gmail.com>
On 03.10.2016 at 19:13, Lars Schneider wrote:
>> On 01 Oct 2016, at 22:48, Jakub Narębski <jnareb@gmail.com> wrote:
>> On 01.10.2016 at 20:59, Lars Schneider wrote:
>>> On 29 Sep 2016, at 23:27, Junio C Hamano <gitster@pobox.com> wrote:
>>>> Lars Schneider <larsxschneider@gmail.com> writes:
>>>>
>>>> If the filter process refuses to die forever when Git told it to
>>>> shutdown (by closing the pipe to it, for example), that filter
>>>> process is simply buggy. I think we want users to become aware of
>>>> that, instead of Git leaving it behind, which essentially is to
>>>> sweep the problem under the rug.
>>
>> Well, it would be good to tell users _why_ Git is hanging, see below.
>
> Agreed. Do you think it is OK to write the message to stderr?

On the other hand, this is what GIT_TRACE (and GIT_TRACE_PERFORMANCE)
was invented for. We do not signal trouble with single-shot filters,
so I guess doing it for multi-file filters is not needed either.

>>>> I agree with what Peff said elsewhere in the thread; if a filter
>>>> process wants to take time to clean things up while letting Git
>>>> proceed, it can do its own process management, but I think it is
>>>> sensible for Git to wait the filter process it directly spawned.
>>>
>>> To realize the approach above I prototyped the run-command patch below:
>>>
>>> I added an "exit_timeout" variable to the "child_process" struct.
>>> On exit, Git will close the pipe to the process and wait "exit_timeout"
>>> seconds until it kills the child process. If "exit_timeout" is negative
>>> then Git will wait until the process is done.
>>
>> That might be good approach. Probably the default would be to wait.
>
> I think I would prefer a 2sec timeout or something as default. This way
> we can ensure Git would not wait indefinitely for a buggy filter by default.

Actually, this waiting for a multi-file filter is only about waiting for
the filter's shutdown. The filter could still hang while processing a
file, and then git would hang too, if I understand it correctly.
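
To illustrate what such a shutdown wait amounts to, here is a minimal
standalone sketch. It is not the code from the patch: the names
wait_or_kill() and now_sec() are made up for illustration, and it uses
SIGKILL (rather than the signal cleanup_children() is given) so that the
sketch itself cannot hang. Note that it only bounds the wait after stdin
has been closed; it does nothing about a filter that hangs mid-file.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in seconds, so wall-clock jumps do not affect the wait. */
static double now_sec(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: wait up to timeout_sec seconds for child `pid`.
 * Returns 1 if it exited on its own, 0 if it had to be killed,
 * -1 if waitpid() failed for a reason other than EINTR.
 */
static int wait_or_kill(pid_t pid, int timeout_sec)
{
	const struct timespec nap = { 0, 10 * 1000 * 1000 }; /* 10 ms */
	double deadline = now_sec() + timeout_sec;
	int status;

	for (;;) {
		pid_t ret = waitpid(pid, &status, WNOHANG);
		if (ret == pid)
			return 1;	/* exited within the timeout */
		if (ret < 0 && errno != EINTR)
			return -1;	/* e.g. ECHILD: not our child */
		if (now_sec() >= deadline)
			break;
		nanosleep(&nap, NULL);
	}
	kill(pid, SIGKILL);	/* assumption: sketch uses SIGKILL */
	while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
		; /* reap the killed child */
	return 0;
}
```

A caller would close the pipe to the filter first and then call
wait_or_kill(pid, 2) for the proposed 2 second default.
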
[...]
>> Also, how would one set default value of timeout for all process
>> based filters?
>
> I think we don't need that because a timeout is always specific
> to a filter (if the 2sec default is not sufficient).

All right (assuming that timeouts are a good idea).

>>>
>>> + while ((waitpid(p->pid, &status, 0)) < 0 && errno == EINTR)
>>> + ; /* nothing */
>>
>> Ah, this loop is here because waiting on waitpid() can be interrupted
>> by the delivery of a signal to the calling process; though the result
>> is -1, not just any < 0.
>
> "< 0" is also used in wait_or_whine()

O.K. (though that doesn't necessarily mean that it is correct, there
is another point in favor of using "< 0").
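
For reference, the retry idiom under discussion can be written out as a
small helper. This is only a sketch; waitpid_retry() is a name invented
here, not a function in run-command.c. POSIX documents -1 as the error
return of waitpid(), so testing "== -1" and testing "< 0" behave the same
in practice.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: like waitpid(pid, status, 0), but restart the
 * call when it is interrupted by a signal delivered to the caller.
 */
static pid_t waitpid_retry(pid_t pid, int *status)
{
	pid_t ret;

	do {
		ret = waitpid(pid, status, 0);
	} while (ret == -1 && errno == EINTR);

	return ret;	/* pid on success, -1 on a real error */
}
```
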
[...]
>> There is also another complication: there can be more than one
>> long-running filter driver used. With this implementation we
>> wait for each of one in sequence, e.g. 10s + 10s + 10s.
>
> Good idea, I fixed that in the version below!
>
[...]
> [...] this function is also used with the async struct...

Hmmm... now I wonder if it is a good idea (giving the same treatment to
the single-file async-invoked filter and to the multi-file pkt-line filter).

For the single-file one-shot filter (correct me if I am wrong):

 - git sends contents to the filter, signals end with EOF
   (after the process is started)
 - in an async process:
   - process is started
   - git reads contents from the filter, until EOF
   - if the process did not end, it is killed

For the multi-file pkt-line based filter (simplified):

 - process is started
 - handshake
 - for each file
   - file is sent to the filter process over pkt-line,
     end signalled with a flush packet
   - git reads from the filter over pkt-line, until flush
 - ...

See how the single-shot filter is sent EOF, though in a different part
of the code. We need to signal to the multi-file filter that no more files
will be coming. The simplest solution is to send EOF to the filter (we
could send "command=shutdown" instead, for example...), and wait for EOF
from the filter (or for "status=finished" and EOF).

We could kill the multi-file filter after sending the last file and
receiving the full response... but I think the single-shot filter gets
killed only because that allows for very simple filters, and for reusing
existing commands as filters.
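
The "send EOF, then wait for EOF" variant could look roughly like the
sketch below. It is a toy under stated assumptions, not the convert.c
code: spawn_toy_filter() and shutdown_filter() are hypothetical names,
and the "filter" is just a forked child that reads until EOF and exits.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy stand-in for a long-running filter: read until EOF, then exit. */
static pid_t spawn_toy_filter(int *to_filter, int *from_filter)
{
	int in[2], out[2];
	pid_t pid;

	if (pipe(in) < 0)
		return -1;
	if (pipe(out) < 0) {
		close(in[0]);
		close(in[1]);
		return -1;
	}
	pid = fork();
	if (pid == 0) {
		char buf[256];
		close(in[1]);	/* otherwise the child never sees EOF */
		close(out[0]);
		while (read(in[0], buf, sizeof(buf)) > 0)
			; /* a real filter would handle requests here */
		_exit(0);	/* exiting closes out[1]: EOF for the parent */
	}
	close(in[0]);
	close(out[1]);
	if (pid < 0) {
		close(in[1]);
		close(out[0]);
		return -1;
	}
	*to_filter = in[1];
	*from_filter = out[0];
	return pid;
}

/* Signal "no more files" with EOF, wait for EOF back, reap the child. */
static int shutdown_filter(pid_t pid, int to_filter, int from_filter)
{
	char buf[256];
	ssize_t n;
	pid_t ret;
	int status;

	close(to_filter);	/* EOF to the filter */
	do {
		n = read(from_filter, buf, sizeof(buf));
	} while (n > 0 || (n < 0 && errno == EINTR));
	close(from_filter);

	do {
		ret = waitpid(pid, &status, 0);
	} while (ret < 0 && errno == EINTR);
	if (ret < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape a well-behaved filter gets the chance to finish its own
cleanup before exiting, and Git only has to decide how long it is willing
to block in the final read and waitpid().
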
[...]
> diff --git a/run-command.c b/run-command.c
> index 3269362..ca0feef 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -21,6 +21,9 @@ void child_process_clear(struct child_process *child)
>
> struct child_to_clean {
> pid_t pid;
> + char *name;
I guess it is here for output purposes?
Should we store the full command here, or just the name of the <driver>?
> + int stdin;
I guess the name `stdin` for file _descriptor_ is something
used in other parts of convert.c code, isn't it?
> + int timeout;
Hmmm... we assume that the timeout is in seconds, not milliseconds or
some other unit, don't we? timeout_sec would perhaps be unnecessarily long.
> struct child_to_clean *next;
> };
> static struct child_to_clean *children_to_clean;
> @@ -28,12 +31,53 @@ static int installed_child_cleanup_handler;
>
> static void cleanup_children(int sig, int in_signal)
> {
> + int status;
> + struct timeval tv;
> + time_t secs;
> + struct child_to_clean *p = children_to_clean;
> +
> + // Send EOF to children as indicator that Git will exit soon
> + while (p) {
> + if (p->timeout != 0) {
Here we use timeout == 0 as a special case, a special indicator
(IIUC for the single-shot filter case, where it is closed already).
This is not documented. Somebody setting the timeout to "0" would
be surprised, wouldn't they?
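
One way to make the three cases explicit, as I read the patch (0 kills
right away, a negative value waits indefinitely, a positive value waits
that many seconds), is to decode the value once. The enum and function
names below are hypothetical, purely for illustration:

```c
/* Hypothetical names; the patch itself only stores a plain int. */
enum cleanup_mode {
	CLEANUP_KILL_NOW,	/* timeout == 0: kill without waiting */
	CLEANUP_WAIT_FOREVER,	/* timeout < 0: close stdin, wait until done */
	CLEANUP_WAIT_TIMEOUT	/* timeout > 0: close stdin, wait N seconds */
};

static enum cleanup_mode cleanup_mode_for(int timeout)
{
	if (timeout == 0)
		return CLEANUP_KILL_NOW;
	if (timeout < 0)
		return CLEANUP_WAIT_FOREVER;
	return CLEANUP_WAIT_TIMEOUT;
}
```
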
> + if (p->stdin > 0)
> + close(p->stdin);
> + }
> + p = p->next;
> + }
> +
> while (children_to_clean) {
> - struct child_to_clean *p = children_to_clean;
> + p = children_to_clean;
> children_to_clean = p->next;
> +
> + if (p->timeout != 0) {
> + fprintf(stderr, _("Waiting for '%s' to finish..."), p->name);
> + if (p->timeout < 0) {
> + // No timeout given - wait indefinitely
> + while ((waitpid(p->pid, &status, 0)) < 0 && errno == EINTR)
> + ; /* nothing */
> + } else {
> + // Wait until timeout
> + gettimeofday(&tv, NULL);
> + secs = tv.tv_sec;
> + while (!waitpid(p->pid, &status, WNOHANG) &&
> + tv.tv_sec - secs < p->timeout) {
> + fprintf(stderr, _(" \rWaiting %lds for '%s' to finish..."),
> + p->timeout - tv.tv_sec + secs - 1, p->name);
> + gettimeofday(&tv, NULL);
> + sleep_millisec(10);
> + }
> + }
I wonder if we have some progress-printing code we can borrow
from, or just plain use (like progress report for long checkout).
> + if (waitpid(p->pid, &status, WNOHANG))
> + fprintf(stderr, _("done!\n"));
> + else
> + fprintf(stderr, _("timeout. Killing...\n"));
> + }
> +
> kill(p->pid, sig);
> - if (!in_signal)
> + if (!in_signal) {
> + free(p->name);
> free(p);
> + }
> }
> }
>
> @@ -49,10 +93,18 @@ static void cleanup_children_on_exit(void)
> cleanup_children(SIGTERM, 0);
> }
>
> -static void mark_child_for_cleanup(pid_t pid)
> +static void mark_child_for_cleanup_with_timeout(pid_t pid, const char *name, int stdin, int timeout)
> {
> struct child_to_clean *p = xmalloc(sizeof(*p));
> p->pid = pid;
> + p->timeout = timeout;
> + p->stdin = stdin;
> + if (name) {
> + p->name = xmalloc(strlen(name) + 1);
> + strcpy(p->name, name);
Don't we have xstrdup() for that, or am I mistaken?
> + } else {
> + p->name = "process";
Hmmmm...
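
The problem with the literal is that cleanup_children() later calls
free(p->name), and freeing a string literal is undefined behaviour.
Duplicating the fallback as well would avoid that. A standalone sketch
follows; xstrdup_sketch() merely stands in for Git's xstrdup() so that
the example compiles on its own, and child_name_or_default() is a
made-up name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for Git's xstrdup(): duplicate a string or die trying. */
static char *xstrdup_sketch(const char *s)
{
	char *copy = strdup(s);

	if (!copy) {
		fputs("fatal: out of memory\n", stderr);
		exit(128);
	}
	return copy;
}

/* Always return heap memory, so a later free() on the name is valid. */
static char *child_name_or_default(const char *name)
{
	return xstrdup_sketch(name ? name : "process");
}
```
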
> + }
> p->next = children_to_clean;
> children_to_clean = p;
>
> @@ -63,6 +115,13 @@ static void mark_child_for_cleanup(pid_t pid)
> }
> }
>
> +#ifdef NO_PTHREADS
> +static void mark_child_for_cleanup(pid_t pid, const char *name, int timeout, int stdin)
> +{
> + mark_child_for_cleanup_with_timeout(pid, NULL, 0, 0);
> +}
> +#endif
Uh?
> +
> static void clear_child_for_cleanup(pid_t pid)
> {
> struct child_to_clean **pp;
> @@ -422,7 +481,8 @@ int start_command(struct child_process *cmd)
> if (cmd->pid < 0)
> error_errno("cannot fork() for %s", cmd->argv[0]);
> else if (cmd->clean_on_exit)
> - mark_child_for_cleanup(cmd->pid);
> + mark_child_for_cleanup_with_timeout(
> + cmd->pid, cmd->argv[0], cmd->in, cmd->clean_on_exit_timeout);
All right, nice abstraction.
>
> /*
> * Wait for child's execvp. If the execvp succeeds (or if fork()
> @@ -483,7 +543,8 @@ int start_command(struct child_process *cmd)
> if (cmd->pid < 0 && (!cmd->silent_exec_failure || errno != ENOENT))
> error_errno("cannot spawn %s", cmd->argv[0]);
> if (cmd->clean_on_exit && cmd->pid >= 0)
> - mark_child_for_cleanup(cmd->pid);
> + mark_child_for_cleanup_with_timeout(
> + cmd->pid, cmd->argv[0], cmd->in, cmd->clean_on_exit_timeout);
>
> argv_array_clear(&nargv);
> cmd->argv = sargv;
> diff --git a/run-command.h b/run-command.h
> index cf29a31..4c1c1f4 100644
> --- a/run-command.h
> +++ b/run-command.h
> @@ -43,6 +43,16 @@ struct child_process {
> unsigned stdout_to_stderr:1;
> unsigned use_shell:1;
> unsigned clean_on_exit:1;
> + /*
> + * clean_on_exit_timeout is only considered if clean_on_exit is set.
> + * - Specify 0 to kill the child on Git exit (default)
> + * - Specify a negative value to close the child's stdin on Git exit
> + * and wait indefinitely for the child's termination.
> + * - Specify a positive value to close the child's stdin on Git exit
> + * and wait clean_on_exit_timeout seconds for the child's
> + * termination.
All right, so here is this documentation...
> + */
> + int clean_on_exit_timeout;
> };
>
> #define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT, ARGV_ARRAY_INIT }
>
>

For the full patch, you would also need to update Documentation/config.txt.

Best,
--
Jakub Narębski