From: Paolo Bonzini <pbonzini@redhat.com>
To: Junio C Hamano <gitster@pobox.com>,
"Randall S. Becker" <rsbecker@nexbridge.com>
Cc: git@vger.kernel.org
Subject: Re: [Bug/Solution] Git hangs in compat/poll/poll.c on HPE NonStop
Date: Fri, 29 Sep 2017 10:41:23 +0200 [thread overview]
Message-ID: <d934af29-9cee-ea74-40c8-e63f7592f8aa@redhat.com> (raw)
In-Reply-To: <xmqqo9puzbii.fsf@gitster.mtv.corp.google.com>
On 29/09/2017 00:47, Junio C Hamano wrote:
> [jch: a patch that hopefully is applicable is attached at the end;
> I'd appreciate input from Paolo, the contributor of the original
> upstream]
I wasn't involved in upstream commit d42461c3, but Linux is also always
overwriting the revents output with zero.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo
> "Randall S. Becker" <rsbecker@nexbridge.com> writes:
>
>> After a whole lot of investigating, we (it is a large "we") have discovered
>> the reason for the hang we occasionally get in git-upload-pack on HPE
>> NonStop servers - reported here well over a year ago. This resulted from a
>> subtle check that the operating system does on file descriptors. When it
>> sees random values in pfd[i].revents,...
>
> What do you mean by "random values"? Where do these random values
> come from? The original code the patch touches is _not_ setting
> anything, so it is leaving it uninitialized and that is seen by the
> system? If that is the case perhaps should we clear it before we
> call into this codepath?
>
>> .... There is a non-destructive fix
>> that I would like to propose for this that I have already tested.
>
> I am not sure in what sense this is "non-destructive"; we left the
> value as it was when we didn't notice anything happening in
> compute_revents(). Now we explicitly destroy the value there was
> (just like we do in the case where the corresponding file descriptor
> was negative). Maybe overwriting with 0 is the right thing, but I
> wouldn't call it "non-destructive", though. Puzzling...
>
>> --- a/compat/poll/poll.c
>> +++ b/compat/poll/poll.c
>> @@ -438,6 +438,10 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout)
>> pfd[i].revents = happened;
>> rc++;
>> }
>> + else
>> + {
>> + pfd[i].revents = 0;
>> + }
>> }
>>
>> return rc;
>
> FYI, the upstream gnulib rewrites this part of the code with
> d42461c3 ("poll: fixes for large fds", 2015-02-20) quite a bit, and
> it reads like this:
>
> + {
> + pfd[i].revents = (pfd[i].fd < 0
> + ? 0
> + : compute_revents (pfd[i].fd, pfd[i].events,
> + &rfds, &wfds, &efds));
> + rc += pfd[i].revents != 0;
> + }
>
> The code before your fix (and before d42461c3) used to assign to
> pfd[i].revents only when compute_revents() gave us non-zero value,
> but with d42461c3 in the upstream, it pfd[i].revents is assigned
> the return value from compute_revents() even if the value returned
> is 0. And rc is incremented only when that value is non-zero.
>
> The result of applying your patch is equivalent, so after all, I
> tend to think that your patch is the right fix in the context of the
> code we have (i.e. the older code we borrowed from them).
>
> I wonder if we want to refresh the borrowed copy of poll.c to a
> newer snapshot someday, but that is totally a separate topic. At
> least, I now know that your fix in this patch will not be broken due
> to d42461c3 updating the code in a slightly different way ;-)
>
> Randall, I forged your Sign-off in the patch below, but please say
> it is OK, after reading Documentation/SubmittingPatches.
>
> Thanks.
>
> -- >8 --
> From: Randall S. Becker <rsbecker@nexbridge.com>
> Subject: poll.c: always set revents, even if to zero.
>
> Match what happens to pfd[i].revents when compute_revents() returns
> 0 to the upstream gnulib's commit d42461c3 ("poll: fixes for large
> fds", 2015-02-20). The revents field is set to 0, without
> incrementing the value rc to be returned from the function.
>
> This fixes occasional hangs in git-upload-pack on NPE NonStop.
>
> Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> compat/poll/poll.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/compat/poll/poll.c b/compat/poll/poll.c
> index b10adc780f..ae03b74a6f 100644
> --- a/compat/poll/poll.c
> +++ b/compat/poll/poll.c
> @@ -438,6 +438,10 @@ poll (struct pollfd *pfd, nfds_t nfd, int timeout)
> pfd[i].revents = happened;
> rc++;
> }
> + else
> + {
> + pfd[i].revents = 0;
> + }
> }
>
> return rc;
>
prev parent reply other threads:[~2017-09-29 8:41 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-28 21:01 [Bug/Solution] Git hangs in compat/poll/poll.c on HPE NonStop Randall S. Becker
2017-09-28 22:47 ` Junio C Hamano
2017-09-29 8:41 ` Paolo Bonzini [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d934af29-9cee-ea74-40c8-e63f7592f8aa@redhat.com \
--to=pbonzini@redhat.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=rsbecker@nexbridge.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).