From: ebiederm@xmission.com (Eric W. Biederman)
To: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Cc: "Michel Dänzer" <michel@daenzer.net>,
linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org,
dri-devel@lists.freedesktop.org, David.Panariti@amd.com,
oleg@redhat.com, Alexander.Deucher@amd.com,
akpm@linux-foundation.org, Christian.Koenig@amd.com
Subject: Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
Date: Tue, 24 Apr 2018 17:11:44 -0500
Message-ID: <87a7ts2r1b.fsf@xmission.com>
In-Reply-To: <27d7d15b-f7c3-2a0a-af85-eb243526ac88@amd.com> (Andrey Grodzovsky's message of "Tue, 24 Apr 2018 17:37:08 -0400")
Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
> On 04/24/2018 05:21 PM, Eric W. Biederman wrote:
>> Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> writes:
>>
>>> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
>>>> On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
>>>>> Adding the dri-devel list, since this is driver independent code.
>>>>>
>>>>>
>>>>> On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
>>>>>> Avoid calling wait_event_killable when you are possibly being called
>>>>>> from get_signal routine since in that case you end up in a deadlock
>>>>>> where you are alreay blocked in singla processing any trying to wait
>>>>> Multiple typos here, "[...] already blocked in signal processing and [...]"?
>>>>>
>>>>>
>>>>>> on a new signal.
>>>>>>
>>>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>> ---
>>>>>> drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
>>>>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>> index 088ff2b..09fd258 100644
>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>>>>>> return;
>>>>>> /**
>>>>>> * The client will not queue more IBs during this fini, consume existing
>>>>>> - * queued IBs or discard them on SIGKILL
>>>>>> + * queued IBs or discard them when in death signal state since
>>>>>> + * wait_event_killable can't receive signals in that state.
>>>>>> */
>>>>>> - if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
>>>>>> + if (current->flags & PF_SIGNALED)
>>>> You want fatal_signal_pending() here, instead of inventing your own broken
>>>> version.
>>> I rely on current->flags & PF_SIGNALED because this being set from
>>> within get_signal,
>> It doesn't mean that. Unless you are called by do_coredump (you
>> aren't).
>
> Looking in latest code here
> https://elixir.bootlin.com/linux/v4.17-rc2/source/kernel/signal.c#L2449
> i see that current->flags |= PF_SIGNALED; is out side of
> if (sig_kernel_coredump(signr)) {...} scope
In small words. You showed me the backtrace and I have read
the code.
PF_SIGNALED means you got killed by a signal.
get_signal
   do_coredump
   do_group_exit
      do_exit
         exit_signals
            sets PF_EXITING
         exit_mm
            calls fput on mmaps
               calls sched_task_work
         exit_files
            calls fput on open files
               calls sched_task_work
         exit_task_work
            task_work_run
               /* you are here */
So while strictly speaking you are inside of get_signal, it is not
meaningful to speak of yourself as within get_signal.
I am a little surprised to see task_work_run called so early.
I was mostly expecting it to happen when the dead task was
scheduling away, like normally happens.
Testing for PF_SIGNALED does not give you anything at all
that testing for PF_EXITING (the flag that says signal handling
is shut down) does not get you.
There is no point in distinguishing PF_SIGNALED from any other
path to do_exit. do_exit never returns.
The task is dead.
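To make the point about the two flags concrete, here is a minimal userspace sketch, not kernel code. The flag values are meant to mirror include/linux/sched.h; struct toy_task and every toy_* function are hypothetical names made up for illustration. It only models the claim above: every path into do_exit sets PF_EXITING, so a check on PF_EXITING already covers the killed-by-signal case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values intended to match include/linux/sched.h; the rest is a toy model. */
#define PF_EXITING  0x00000004u
#define PF_SIGNALED 0x00000400u

/* Hypothetical stand-in for the relevant part of struct task_struct. */
struct toy_task {
	unsigned int flags;
};

/* Models do_exit(): exit_signals() sets PF_EXITING before any task work runs. */
static void toy_do_exit(struct toy_task *t)
{
	assert(t != NULL);
	t->flags |= PF_EXITING;
}

/* Models the fatal-signal path: get_signal() marks the task, then exits it. */
static void toy_killed_by_signal(struct toy_task *t)
{
	t->flags |= PF_SIGNALED;
	toy_do_exit(t);
}

/* Models a voluntary exit, which never sets PF_SIGNALED. */
static void toy_normal_exit(struct toy_task *t)
{
	toy_do_exit(t);
}

/* The only question the release path needs answered: is this task on its way out? */
static bool toy_must_not_block(const struct toy_task *t)
{
	return (t->flags & PF_EXITING) != 0;
}
```

In this model a task killed by a signal and a task that exited voluntarily both answer true, while a live task answers false. Testing PF_SIGNALED instead would miss the voluntary exit and would not catch anything that PF_EXITING does not.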
Blocking indefinitely while shutting down a task is a bad idea.
Blocking indefinitely while closing a file descriptor is a bad idea.
The task has been killed; it can't get more dead. SIGKILL is meaningless
at this point.
So you need a timeout, or not to wait at all.
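As a rough illustration of those two options, here is a userspace sketch under stated assumptions, not the scheduler's code. struct toy_entity, the hw_hung flag, the tick-based budget and all toy_* functions are hypothetical. In the kernel, the bounded wait would presumably be built on something like wait_event_timeout rather than a loop like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler entity with queued jobs. */
struct toy_entity {
	unsigned int queued_jobs; /* jobs not yet consumed by the hardware */
	bool hw_hung;             /* when true, the hardware makes no progress */
};

/* One unit of simulated time: healthy hardware consumes one job per tick. */
static void toy_hw_tick(struct toy_entity *e)
{
	if (!e->hw_hung && e->queued_jobs > 0)
		e->queued_jobs--;
}

/* Bounded wait: returns true if the queue drained within timeout_ticks. */
static bool toy_wait_idle_timeout(struct toy_entity *e, unsigned int timeout_ticks)
{
	for (unsigned int tick = 0; tick < timeout_ticks; tick++) {
		if (e->queued_jobs == 0)
			return true;
		toy_hw_tick(e);
	}
	return e->queued_jobs == 0;
}

/*
 * Release path that can never block forever. A timeout of 0 means
 * "do not wait at all". Returns the number of jobs that were discarded.
 */
static unsigned int toy_entity_release(struct toy_entity *e, unsigned int timeout_ticks)
{
	unsigned int discarded;

	assert(e != NULL);
	if (toy_wait_idle_timeout(e, timeout_ticks))
		return 0;

	discarded = e->queued_jobs;
	e->queued_jobs = 0;
	return discarded;
}
```

With a healthy queue and a generous budget nothing is discarded. With a hung queue, or with a budget of 0, the remaining jobs are dropped and the exiting task moves on instead of sleeping indefinitely.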
Eric