stable.vger.kernel.org archive mirror
* Request for backport 35a2897c2a to 4.9 branch
@ 2018-08-02 19:08 David Chen
  2018-08-03 19:39 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 2+ messages in thread
From: David Chen @ 2018-08-02 19:08 UTC
  To: stable@vger.kernel.org
  Cc: Paul E. McKenney, Peter Zijlstra, Greg Kroah-Hartman

Hi all,

We'd like to have the following commit backported to the 4.9 branch to fix an
issue we are seeing.

35a2897c2a306cca344ca5c0b43416707018f434
    sched/wait: Remove the lockless swait_active() check in swake_up*()

In the 4.9 branch, we hit an issue in RCU where the NOCB follower list is not
reclaimed, which eventually causes an OOM.

In discussion with Paul, we figured out that the problem is a missed wake-up,
resulting from the lack of a proper memory barrier between setting the
wake-up condition and calling swake_up().

nocb_leader_wait()
{
		*tail = rdp->nocb_gp_head;
		smp_mb__after_atomic(); /* Store *tail before wakeup. */
		if (rdp != my_rdp && tail == &rdp->nocb_follower_head) {
			swake_up(&rdp->nocb_wq);

Note that smp_mb__after_atomic() is only a compiler barrier on x86.
Originally I was going to change the barrier to smp_mb(), but then I found that
master has the above-mentioned patch, which solves the same class of problem by
removing the lockless swait_active() check inside swake_up().
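
To illustrate the two remedies mentioned above, here is a minimal user-space
sketch. It is a toy model under stated assumptions, not the kernel's swait
code: every toy_* name is hypothetical, a pthread mutex stands in for the
queue lock, and atomic_thread_fence(memory_order_seq_cst) plays the role of
smp_mb().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a simple wait queue; not kernel code. */
struct toy_wq {
	pthread_mutex_t lock;
	atomic_int nr_waiters;	/* models "the waiter list is non-empty" */
	atomic_bool cond;	/* the wake-up condition */
	int wakeups;		/* wake-ups delivered, protected by lock */
};

/* Waiter: enqueue under the lock, then check the condition. True = must sleep. */
static bool toy_prepare_to_wait(struct toy_wq *q)
{
	pthread_mutex_lock(&q->lock);
	atomic_fetch_add_explicit(&q->nr_waiters, 1, memory_order_relaxed);
	pthread_mutex_unlock(&q->lock);
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&q->cond, memory_order_relaxed);
}

/* Dequeue and "wake" one waiter under the lock. True = somebody was woken. */
static bool toy_wake_locked(struct toy_wq *q)
{
	bool woke = false;

	pthread_mutex_lock(&q->lock);
	if (atomic_load_explicit(&q->nr_waiters, memory_order_relaxed) > 0) {
		atomic_fetch_sub_explicit(&q->nr_waiters, 1, memory_order_relaxed);
		q->wakeups++;
		woke = true;
	}
	pthread_mutex_unlock(&q->lock);
	return woke;
}

/* Option 1: keep the lockless check, but put a full fence before it. */
static bool toy_wake_with_fence(struct toy_wq *q)
{
	atomic_store_explicit(&q->cond, true, memory_order_relaxed);
	/* Full fence: orders the store above before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&q->nr_waiters, memory_order_relaxed) == 0)
		return false;
	return toy_wake_locked(q);
}

/* Option 2: no lockless check; the lock provides the ordering. */
static bool toy_wake_always_lock(struct toy_wq *q)
{
	atomic_store_explicit(&q->cond, true, memory_order_relaxed);
	return toy_wake_locked(q);
}

/* Single-threaded walk through both paths; returns the wake-ups delivered. */
static int toy_demo(void)
{
	struct toy_wq q = { .lock = PTHREAD_MUTEX_INITIALIZER };

	assert(toy_prepare_to_wait(&q));	/* condition not set: must sleep */
	assert(toy_wake_always_lock(&q));	/* waiter is queued: woken */
	assert(!toy_wake_with_fence(&q));	/* queue is empty: nothing to do */
	assert(!toy_prepare_to_wait(&q));	/* condition already set: no sleep */
	return q.wakeups;
}
```

toy_demo() is single-threaded, so it only walks the control flow of both
options; it does not reproduce the race itself.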

So I'm wondering if we can backport this patch to the 4.9 branch to solve this
issue, and possibly other potential missed wake-up issues as well.

Thanks,
David


* Re: Request for backport 35a2897c2a to 4.9 branch
  2018-08-02 19:08 Request for backport 35a2897c2a to 4.9 branch David Chen
@ 2018-08-03 19:39 ` Greg Kroah-Hartman
  0 siblings, 0 replies; 2+ messages in thread
From: Greg Kroah-Hartman @ 2018-08-03 19:39 UTC
  To: David Chen; +Cc: stable@vger.kernel.org, Paul E. McKenney, Peter Zijlstra

On Thu, Aug 02, 2018 at 07:08:41PM +0000, David Chen wrote:
> Hi all,
> 
> We'd like to have the following commit backported to the 4.9 branch to fix an
> issue we are seeing.
> 
> 35a2897c2a306cca344ca5c0b43416707018f434
>     sched/wait: Remove the lockless swait_active() check in swake_up*()
> 
> In the 4.9 branch, we hit an issue in RCU where the NOCB follower list is not
> reclaimed, which eventually causes an OOM.
> 
> In discussion with Paul, we figured out that the problem is a missed wake-up,
> resulting from the lack of a proper memory barrier between setting the
> wake-up condition and calling swake_up().
> 
> nocb_leader_wait()
> {
> 		*tail = rdp->nocb_gp_head;
> 		smp_mb__after_atomic(); /* Store *tail before wakeup. */
> 		if (rdp != my_rdp && tail == &rdp->nocb_follower_head) {
> 			swake_up(&rdp->nocb_wq);
> 
> Note that smp_mb__after_atomic() is only a compiler barrier on x86.
> Originally I was going to change the barrier to smp_mb(), but then I found that
> master has the above-mentioned patch, which solves the same class of problem by
> removing the lockless swait_active() check inside swake_up().
> 
> So I'm wondering if we can backport this patch to the 4.9 branch to solve this
> issue, and possibly other potential missed wake-up issues as well.

Now applied, thanks.

greg k-h

