git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Fredrik Kuivinen <frekui@gmail.com>
To: Johannes Sixt <j6t@kdbg.org>
Cc: Junio C Hamano <gitster@pobox.com>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	git@vger.kernel.org
Subject: Re: [PATCH 7/6] Enable threaded async procedures whenever pthreads is  available
Date: Tue, 23 Mar 2010 09:15:45 +0100	[thread overview]
Message-ID: <4c8ef71003230115y64d36094y178fcfe6576e9c66@mail.gmail.com> (raw)
In-Reply-To: <201003172228.18939.j6t@kdbg.org>

On Wed, Mar 17, 2010 at 22:28, Johannes Sixt <j6t@kdbg.org> wrote:
> On Mittwoch, 10. März 2010, Junio C Hamano wrote:
>> Johannes Sixt <j6t@kdbg.org> writes:
>> > On Samstag, 6. März 2010, Shawn O. Pearce wrote:
>> >> I'm in favor of that.  If we have threaded delta search enabled,
>> >> we probably can also run these async procedures in a POSIX thread
>> >> rather than forking off a child.
>> >
>> > OK. The patch could look like this.
>>
>> Will queue in 'pu', but as Shawn said, we should probably give another
>> closer look at the callees that are started with this interface before
>> moving forward.
>
> So here is my analysis.
>
> There are currently these users of the async infrastructure:
>
> builtin-fetch-pack.c
> builtin-receive-pack.c
> builtin-send-pack.c
> convert.c
> remote-curl.c
> upload_pack.c
>
> The list below shows all functions that read or write potentially global state
> that are called from the async procedure (except for upload_pack.c, see
> below). I ignored all functions that do not depend on global state, like
> strcmp, memcpy, sprintf, strerror, etc.
>

> ----------
> convert.c:filter_buffer()
>  pipe
>  close
>  getenv
>  fork
>  read
>  write
>  fprintf
>  waitpid
>
> if GIT_TRACE is set:
>  xrealloc
>  die
>  free
>
> This one is less trivial. It calls start_command+finish_command from the async
> procedure. The parent calls
>  read()
>  xrealloc() (via strbuf_read())
>  error()
> (and nothing else) between start_async() and finish_async().
>
> As long as GIT_TRACE is not set, we are safe, because the async procedure
> carefully releases all global resources that it allocated, and the parent is
> looking in the other direction while it does os. Even if GIT_TRACE is set,
> xrealloc() in the parent and the async procedure cannot be called
> simultanously because the procedure calls it only before it begins writing
> output, and the parent only after it received this output.

Maybe I'm missing something but, isn't it possible that xrealloc is
called simultaneously from the two threads if GIT_TRACE is set?

Immediately after start_async the parent calls strbuf_read. We then
get the call chain
strbuf_read -> strbuf_grow -> ALLOG_GROW -> xrealloc, so xrealloc is
called before we read any data in the parent.

In the child we have start_command -> trace_argv_printf -> strbuf_grow -> ...

That xmalloc and xrealloc aren't thread-safe feels a bit fragile.
Maybe we should try to fix that.

> ----------
> upload_pack:create_pack_file(): This user is different because it calls into
> the revision walker in the async procedure, which definitely affects global
> state. Therefore, here is the list of functions called by the parent until it
> exits:
>  pipe
>  close
>  getenv
>  fork
>  read
>  write
>  fprintf
>  waitpid
>  alarm
>  die
>  die_errno
>  pthread_join / waitpid

sha1_to_hex is also called by the parent and the current
implementation of that function is not thread-safe. sha1_to_hex is
also called by some paths in the revision machinery, but I don't know
if it will ever be called in this particular case.

Maybe it would be a good idea to create wrappers for getenv, setenv,
unsetenv, and putenv to make them thread-safe as well. Then won't have
to worry about them in the future.

- Fredrik

  parent reply	other threads:[~2010-03-23  8:15 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-06 15:40 [PATCH 0/6] Pass t5530 on Windows Johannes Sixt
2010-03-06 20:12 ` Junio C Hamano
2010-03-06 21:50   ` Shawn O. Pearce
2010-03-09 20:00     ` [PATCH 7/6] Enable threaded async procedures whenever pthreads is available Johannes Sixt
2010-03-09 23:43       ` Shawn O. Pearce
2010-03-10 22:28       ` Junio C Hamano
2010-03-11 19:53         ` Johannes Sixt
2010-03-12  5:56           ` Junio C Hamano
2010-03-17 21:28         ` Johannes Sixt
2010-03-17 22:19           ` Junio C Hamano
2010-03-23  8:15           ` Fredrik Kuivinen [this message]
2010-03-23 20:19             ` Johannes Sixt
2010-03-23 20:25               ` Johannes Sixt
2010-03-23 20:44                 ` Junio C Hamano
2010-03-23 21:09                   ` Johannes Sixt
2010-03-23 21:42               ` Fredrik Kuivinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c8ef71003230115y64d36094y178fcfe6576e9c66@mail.gmail.com \
    --to=frekui@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).