From: Karl Wiberg <kha@virtutech.com>
To: Catalin Marinas <catalin.marinas@gmail.com>
Cc: mandolaerik@gmail.com, git@vger.kernel.org
Subject: Re: [PATCH] Work around performance bug in subprocess.Popen.communicate()
Date: Fri, 14 Aug 2009 08:21:49 +0200 [thread overview]
Message-ID: <20090814062149.GA23179@lux.e.vtech> (raw)
In-Reply-To: <b0943d9e0908131518i3ac18331leb4c0c76313b0780@mail.gmail.com>
On 2009-08-13 23:18:59 +0100, Catalin Marinas wrote:
> 2009/8/4 Karl Wiberg <kha@virtutech.com>:
>
> > On 2009-07-31 12:27:53 +0100, Catalin Marinas wrote:
> >
> > > But can this not lead to a deadlock if the __indata is big? The
> > > stdout of the created process is only processed once the whole
> > > __indata is written. I thought communicate() was created to
> > > avoid this issue.
> >
> > I don't think there's a problem. write() isn't supposed to have a
> > limit on the amount of data it will accept in one call, as far as
> > I'm aware. Plus, it works just fine with Erik's test case---which
> > in my case was about 7 MB. If it can handle 7 MB, I doubt there's
> > a limit we'll hit anytime soon.
>
> write() itself doesn't have a limit, it's mainly what the
> application receiving the data can handle. In the Git case, I think
> it takes all the input as it isn't a filtering tool (things may be
> different with tools like sed etc.).
>
> > Oh, and we still call communicate()---we just don't pass it any
> > additional bytes to write to stdin.
>
> Yes, but if write() is blocked, communicate() won't be called.
Ah, so that's what you were worrying about. Yeah, if the process we're
writing to was going to block us (== not read input) for long periods
of time, we'd have to handle that case. And if it was going to decide
to stop reading input half-way through and give us a sigpipe, we'd
have to handle that. But ...
> Since we are only using Git, I'll merge this patch
Exactly. We really don't expect git to do any of those things to us.
We know it reads all available input before producing the output.
(Well, almost always. I have patches in my experimental branch that
use git cat-file --batch and git diff-tree --stdin, where there can be
many rounds of talking back and forth, but when driving those I'm
being very careful to not deadlock.)
> (and maybe add a comment).
Sure.
--
Karl Wiberg, Virtutech
next prev parent reply other threads:[~2009-08-14 6:22 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-31 9:36 [PATCH] Work around performance bug in subprocess.Popen.communicate() Karl Wiberg
2009-07-31 11:27 ` Catalin Marinas
2009-08-04 8:51 ` Karl Wiberg
2009-08-13 22:18 ` Catalin Marinas
2009-08-14 6:21 ` Karl Wiberg [this message]
2009-08-14 7:26 ` Erik Sandberg
2009-08-14 8:12 ` Karl Wiberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090814062149.GA23179@lux.e.vtech \
--to=kha@virtutech.com \
--cc=catalin.marinas@gmail.com \
--cc=git@vger.kernel.org \
--cc=mandolaerik@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).