From: Dave Gordon <david.s.gordon@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
Daniel Vetter <daniel@ffwll.ch>,
intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH igt] core/sighelper: Interrupt everyone in the process group
Date: Mon, 11 Jan 2016 13:29:12 +0000
Message-ID: <5693AE28.8040905@intel.com>
In-Reply-To: <20160111123435.GU652@nuc-i3427.alporthouse.com>

On 11/01/16 12:34, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 12:25:07PM +0000, Dave Gordon wrote:
>> On 11/01/16 09:06, Daniel Vetter wrote:
>>> On Mon, Jan 11, 2016 at 08:54:59AM +0000, Chris Wilson wrote:
>>>> On Mon, Jan 11, 2016 at 08:57:33AM +0100, Daniel Vetter wrote:
>>>>> On Fri, Jan 08, 2016 at 08:44:29AM +0000, Chris Wilson wrote:
>>>>>> Some stress tests create both the signal helper and a lot of competing
>>>>>> processes. In these tests, the parent is just waiting upon the children,
>>>>>> and the intention is not to keep waking up the waiting parent, but to
>>>>>> keep interrupting the children (as we hope to trigger races in our
>>>>>> kernel code). kill(-pid) sends the signal to all members of the process
>>>>>> group, not just the target pid.
>>>>>
>>>>> I don't really have any clue about unix pgroups, but the -pid disappeared
>>>>> compared to the previous version.
>>>>
>>>> -getppid().
>>>>
>>>> I felt it was clearer to pass along the "negative pid = process group"
>>>> after setting up the process group.
>>>
>>> Oh, I was blind ... Yeah looks better, but please add a bigger comment
>>> around that code explaining why we need a group and why we use SIGCONT.
>>> With that acked-by: me.
>>>
>>> Cheers, Daniel
>>>
>>>>>> We also switch from using SIGUSR1 to SIGCONT to paper over a race
>>>>>> condition when forking children that saw the default signal action being
>>>>>> run (and thus killing the child).
>>>>>
>>>>> I thought I fixed that race by first installing the new signal handler,
>>>>> then forking. Ok, rechecked and it's the SYS_getpid stuff, so another
>>>>> race. Still I thought signal handlers would survive a fork?
>>>>
>>>> So did irc. They didn't appear to as the children would sporadically
>>>> die with SIGUSR1.
>>>
>>> Could be that libc is doing something funny, iirc they have piles of fork
>>> helpers to make fork more reliable (breaking locks and stuff like that),
>>> but then in turn break the abstraction.
>>> -Daniel
>>
>> You could use killpg(pgrp, sig) rather than kill(), just to make it
>> clearer that the target is a process group, rather than people
>> having to know about the "negative pid" semantics.
>>
>> I don't think SIGCHLD is a good idea; it has kernel-defined
>> semantics beyond just sending a signal. And it may not be delivered
>> at all, if the disposition is not "caught". SIGUSR1 was the right
>> thing, really; so it would be better to work out how to make that
>> work properly, rather than change to a different one.
>
> SIGCONT not SIGCHLD. And the disposition is supposed to be fully under
> our control anyway.

Oops, yes, I meant that SIGCONT has kernel-defined semantics, etc ...

Catching SIGCONT is ... unusual. Because you can't catch SIGSTOP, you
don't normally have any reason to catch SIGCONT.

Actually, SIGCONT is even more bizarre than SIGCHLD, as sending a SIGCONT
to a stopped process can result in a SIGCHLD being sent to its parent.

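On the killpg() suggestion quoted above, here is a minimal sketch using a
throwaway child in its own process group rather than the real sighelper.
The name signal_child_group() is made up for illustration and is not in
IGT. The signal passed in is assumed to have the default action of
terminating the process (SIGUSR1, say), which is what makes the result
observable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo, not the IGT code: put a child into its own process
 * group, signal that group with killpg(), and return the number of the
 * signal that terminated the child (or -1 on error).
 */
static int signal_child_group(int sig)
{
	int status;
	pid_t pid;

	assert(sig > 0);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		setpgid(0, 0);		/* child: become leader of a new group */
		for (;;)
			pause();	/* sit here until signalled */
	}

	/*
	 * The parent sets the group as well, so the group exists no matter
	 * which side runs first. killpg(pgrp, sig) then does what
	 * kill(-pgrp, sig) would, but says so in its name.
	 */
	if (setpgid(pid, pid) < 0 || killpg(pid, sig) < 0) {
		kill(pid, SIGKILL);
		waitpid(pid, &status, 0);
		return -1;
	}

	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling signal_child_group(SIGUSR1) should report SIGUSR1 as the signal
that ended the child, since nothing in the child catches it.
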
>> Signal handlers are (supposed to be) inherited across fork(); signal
>> disposition is also inherited, and the set of pending signals of a
>> new process is (supposed to be) empty. OTOH a signal can be
>> delivered to the child before it returns from the fork(), which may
>> be a bit surprising.
>>
>> I think the safest way to avoid unexpected signals around a fork() is:
>>
>> parent calls sigprocmask() to block all interesting signals
>> parent calls fork() --> child inherits mask
>> parent calls sigprocmask() to restore the previous mask
>
> I tried that.
> -Chris
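
For reference, the block/fork/restore sequence quoted above might look
something like this. It is only a sketch: fork_with_signal_blocked() and
child_starts_with_signal_blocked() are names invented here, not IGT
functions, and it assumes the child unblocks the signal itself once its
handlers are in the state it wants.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork() with `sig` blocked, so neither side can
 * take that signal while the fork is in progress. The child returns
 * with `sig` still blocked; the parent gets its previous mask back.
 */
static pid_t fork_with_signal_blocked(int sig)
{
	sigset_t block, prev;
	int saved_errno;
	pid_t pid;

	assert(sig > 0);

	sigemptyset(&block);
	sigaddset(&block, sig);

	/* 1. parent blocks the interesting signal */
	if (sigprocmask(SIG_BLOCK, &block, &prev) < 0)
		return -1;

	/* 2. fork: the child inherits the blocking mask */
	pid = fork();
	saved_errno = errno;

	/* 3. parent (only) restores its previous mask */
	if (pid != 0)
		sigprocmask(SIG_SETMASK, &prev, NULL);

	errno = saved_errno;
	return pid;
}

/*
 * Hypothetical check: returns 1 if the child found `sig` blocked right
 * after the fork, 0 if it did not, -1 on error.
 */
static int child_starts_with_signal_blocked(int sig)
{
	sigset_t cur;
	int status;
	pid_t pid = fork_with_signal_blocked(sig);

	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* A NULL set only queries the current mask. */
		sigprocmask(SIG_BLOCK, NULL, &cur);
		_exit(sigismember(&cur, sig) == 1 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0)
		return -1;

	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A signal sent while it is blocked stays pending for the process it was
sent to, and the child starts with an empty pending set, so the child
only sees signals sent after it exists and only takes them once it
unblocks.
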

Are we using signal(2) to install the handlers? 'Cos that's archaic and
has known unfixable race conditions. The kernel's signal() system call
supplies SysV semantics, which means the disposition gets reset to the
default before the handler is called, so a second signal arriving before
the handler has reinstalled itself takes the default action and (for
SIGUSR1) kills the program. The glibc signal(3) wrapper provides BSD
semantics, which are slightly less problematic; but libc5 signal(3)
implements SysV.

The proper answer is usually to use sigaction(2) instead. Then the race
conditions don't (shouldn't) occur, at least if the user code gets all
the options right.
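
A minimal sketch of installing a handler with sigaction(2), for
comparison. The names count_wakeup(), wakeups and
install_persistent_handler() are invented for this example, and leaving
out SA_RESTART is an assumption on my part that the point of the helper
is for interrupted system calls to return EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical counter, only touched from the handler and the caller. */
static volatile sig_atomic_t wakeups;

static void count_wakeup(int sig)
{
	(void)sig;
	wakeups++;
}

/*
 * Hypothetical helper: install a handler that stays installed.
 * Without SA_RESETHAND the disposition is not reset when the handler
 * runs, so a second signal calls the handler again instead of taking
 * the default action. Without SA_RESTART (assumed to be what we want
 * here) blocking system calls fail with EINTR rather than restarting.
 */
static int install_persistent_handler(int sig)
{
	struct sigaction sa;

	assert(sig > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = count_wakeup;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;

	return sigaction(sig, &sa, NULL);
}
```

After install_persistent_handler(SIGUSR1), two raise(SIGUSR1) calls in a
row should both reach count_wakeup(), which is exactly the case that
goes wrong with SysV signal() semantics.
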
.Dave.