From: oleg@redhat.com (Oleg Nesterov)
To: linux-arm-kernel@lists.infradead.org
Subject: [BISECTED] rcu_sched self-detected stall since 3.17
Date: Tue, 15 Dec 2015 17:56:50 +0100 [thread overview]
Message-ID: <20151215165650.GA13604@redhat.com> (raw)
In-Reply-To: <20151201130404.GL3816@twins.programming.kicks-ass.net>
Sorry again for the huge delay.
And all I can say is that I am all confused.
On 12/01, Peter Zijlstra wrote:
>
> On Fri, Nov 20, 2015 at 03:35:38PM +0000, Vladimir Murzin wrote:
> > commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
> > Author: NeilBrown <neilb@suse.de>
> > Date: Mon Jul 7 15:16:04 2014 +1000
That patch still looks correct to me.
> > and if I apply following diff I don't see stalls anymore.
> >
> > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > index a104879..2d68cdb 100644
> > --- a/kernel/sched/wait.c
> > +++ b/kernel/sched/wait.c
> > @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
> >
> > __sched int bit_wait_io(void *word)
> > {
> > + io_schedule();
> > +
> > if (signal_pending_state(current->state, current))
> > return 1;
> > - io_schedule();
> > return 0;
> > }
> > EXPORT_SYMBOL(bit_wait_io);
I can't understand why this change helps. But note that it actually removes
the signal_pending_state() check from bit_wait_io(), current->state is always
TASK_RUNNING after return from schedule(), signal_pending_state() will always
return zero.
This means that after this change wait_on_page_bit_killable() will spin in a
busy-wait loop if the caller is killed.
> The reason this is broken is that schedule() will no-op when there is a
> pending signal, while raising a signal will also issue a wakeup.
But why this is wrong? We should notice signal_pending_state() on the next
iteration.
> Thus the right thing to do is check for the signal state after,
I think this check should work on both sides. The only difference is that
you obviously can't use current->state after schedule().
I still can't understand the problem.
Oleg.
WARNING: multiple messages have this Message-ID (diff)
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>,
linux-kernel@vger.kernel.org, neilb@suse.de,
mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org,
linux-mm@kvack.org
Subject: Re: [BISECTED] rcu_sched self-detected stall since 3.17
Date: Tue, 15 Dec 2015 17:56:50 +0100 [thread overview]
Message-ID: <20151215165650.GA13604@redhat.com> (raw)
In-Reply-To: <20151201130404.GL3816@twins.programming.kicks-ass.net>
Sorry again for the huge delay.
And all I can say is that I am all confused.
On 12/01, Peter Zijlstra wrote:
>
> On Fri, Nov 20, 2015 at 03:35:38PM +0000, Vladimir Murzin wrote:
> > commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
> > Author: NeilBrown <neilb@suse.de>
> > Date: Mon Jul 7 15:16:04 2014 +1000
That patch still looks correct to me.
> > and if I apply following diff I don't see stalls anymore.
> >
> > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > index a104879..2d68cdb 100644
> > --- a/kernel/sched/wait.c
> > +++ b/kernel/sched/wait.c
> > @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
> >
> > __sched int bit_wait_io(void *word)
> > {
> > + io_schedule();
> > +
> > if (signal_pending_state(current->state, current))
> > return 1;
> > - io_schedule();
> > return 0;
> > }
> > EXPORT_SYMBOL(bit_wait_io);
I can't understand why this change helps. But note that it actually removes
the signal_pending_state() check from bit_wait_io(), current->state is always
TASK_RUNNING after return from schedule(), signal_pending_state() will always
return zero.
This means that after this change wait_on_page_bit_killable() will spin in a
busy-wait loop if the caller is killed.
> The reason this is broken is that schedule() will no-op when there is a
> pending signal, while raising a signal will also issue a wakeup.
But why this is wrong? We should notice signal_pending_state() on the next
iteration.
> Thus the right thing to do is check for the signal state after,
I think this check should work on both sides. The only difference is that
you obviously can't use current->state after schedule().
I still can't understand the problem.
Oleg.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>,
linux-kernel@vger.kernel.org, neilb@suse.de,
mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org,
linux-mm@kvack.org
Subject: Re: [BISECTED] rcu_sched self-detected stall since 3.17
Date: Tue, 15 Dec 2015 17:56:50 +0100 [thread overview]
Message-ID: <20151215165650.GA13604@redhat.com> (raw)
In-Reply-To: <20151201130404.GL3816@twins.programming.kicks-ass.net>
Sorry again for the huge delay.
And all I can say is that I am all confused.
On 12/01, Peter Zijlstra wrote:
>
> On Fri, Nov 20, 2015 at 03:35:38PM +0000, Vladimir Murzin wrote:
> > commit 743162013d40ca612b4cb53d3a200dff2d9ab26e
> > Author: NeilBrown <neilb@suse.de>
> > Date: Mon Jul 7 15:16:04 2014 +1000
That patch still looks correct to me.
> > and if I apply following diff I don't see stalls anymore.
> >
> > diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
> > index a104879..2d68cdb 100644
> > --- a/kernel/sched/wait.c
> > +++ b/kernel/sched/wait.c
> > @@ -514,9 +514,10 @@ EXPORT_SYMBOL(bit_wait);
> >
> > __sched int bit_wait_io(void *word)
> > {
> > + io_schedule();
> > +
> > if (signal_pending_state(current->state, current))
> > return 1;
> > - io_schedule();
> > return 0;
> > }
> > EXPORT_SYMBOL(bit_wait_io);
I can't understand why this change helps. But note that it actually removes
the signal_pending_state() check from bit_wait_io(), current->state is always
TASK_RUNNING after return from schedule(), signal_pending_state() will always
return zero.
This means that after this change wait_on_page_bit_killable() will spin in a
busy-wait loop if the caller is killed.
> The reason this is broken is that schedule() will no-op when there is a
> pending signal, while raising a signal will also issue a wakeup.
But why this is wrong? We should notice signal_pending_state() on the next
iteration.
> Thus the right thing to do is check for the signal state after,
I think this check should work on both sides. The only difference is that
you obviously can't use current->state after schedule().
I still can't understand the problem.
Oleg.
next prev parent reply other threads:[~2015-12-15 16:56 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-20 15:35 [BISECTED] rcu_sched self-detected stall since 3.17 Vladimir Murzin
2015-11-20 15:35 ` Vladimir Murzin
2015-11-20 15:35 ` Vladimir Murzin
2015-12-01 11:50 ` Vladimir Murzin
2015-12-01 11:50 ` Vladimir Murzin
2015-12-01 11:50 ` Vladimir Murzin
2015-12-01 13:04 ` Peter Zijlstra
2015-12-01 13:04 ` Peter Zijlstra
2015-12-01 13:04 ` Peter Zijlstra
2015-12-02 9:04 ` Vladimir Murzin
2015-12-02 9:04 ` Vladimir Murzin
2015-12-02 9:04 ` Vladimir Murzin
2015-12-04 11:52 ` [tip:locking/core] sched/wait: Fix signal handling in bit wait helpers tip-bot for Peter Zijlstra
2015-12-08 10:47 ` Peter Zijlstra
2015-12-09 1:06 ` NeilBrown
2015-12-09 7:40 ` Peter Zijlstra
2015-12-09 21:30 ` NeilBrown
2015-12-10 13:09 ` Peter Zijlstra
2015-12-11 11:30 ` Paul Turner
2015-12-11 11:39 ` Peter Zijlstra
2015-12-11 11:53 ` Vladimir Murzin
2015-12-11 13:08 ` Jan Stancek
2015-12-11 13:22 ` Peter Zijlstra
2015-12-11 17:57 ` Vladimir Murzin
2015-12-15 18:16 ` Oleg Nesterov
2015-12-15 19:01 ` Oleg Nesterov
2015-12-15 16:56 ` Oleg Nesterov [this message]
2015-12-15 16:56 ` [BISECTED] rcu_sched self-detected stall since 3.17 Oleg Nesterov
2015-12-15 16:56 ` Oleg Nesterov
-- strict thread matches above, loose matches on Subject: below --
2015-12-02 6:53 Ross Green
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151215165650.GA13604@redhat.com \
--to=oleg@redhat.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.