From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Paul E. McKenney" <paul.mckenney@linaro.org>,
	linuxppc-dev@lists.ozlabs.org, Hugh Dickins <hughd@google.com>,
	linux-kernel@vger.kernel.org
Subject: Re: linux-next ppc64: RCU mods cause __might_sleep BUGs
Date: Tue, 1 May 2012 06:39:00 -0700
Message-ID: <20120501133900.GA4462@linux.vnet.ibm.com>
In-Reply-To: <1335832418.20866.95.camel@pasglop>

On Tue, May 01, 2012 at 10:33:38AM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2012-04-30 at 15:37 -0700, Hugh Dickins wrote:
> > 
> > BUG: sleeping function called from invalid context at include/linux/pagemap.h:354
> > in_atomic(): 0, irqs_disabled(): 0, pid: 6886, name: cc1
> 
> Hrm ... in_atomic and irqs_disabled are both 0 ... so yeah it smells
> like a preempt count problem... odd.

All of the preempt-count patches are now in mainline.  :-(

The CONFIG_PROVE_RCU checks verify that either the task is new or it
is the same task that was last context-switched to on this CPU.  So
the most likely suspect is a newly created task that starts running
without schedule_tail() being invoked on the path from parent task
to child task.  If so, the fix would be to invoke rcu_switch_from()
and rcu_switch_to() on that code path.
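
As a rough illustration of the check described above, here is a minimal
userspace sketch in C.  It is a toy model, not kernel code: struct
model_task, model_last[], model_switch_to(), model_check_ok() and
model_demo() are all hypothetical names, and the assumption that a task
stops counting as "new" once it has been switched to is made only for the
sake of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2

/* Hypothetical task record; not the kernel's task_struct. */
struct model_task {
	int pid;
	bool is_new;	/* assumption: cleared once the task is switched to */
};

/* Per-CPU record of the task most recently switched to. */
static struct model_task *model_last[MODEL_NR_CPUS];

/* Stand-in for the bookkeeping done on the normal context-switch path. */
static void model_switch_to(int cpu, struct model_task *next)
{
	model_last[cpu] = next;
	next->is_new = false;
}

/* The check as described: the task is new, or it is the recorded task. */
static bool model_check_ok(int cpu, const struct model_task *curr)
{
	return curr->is_new || model_last[cpu] == curr;
}

/* Walk through the cases on CPU 0. */
static void model_demo(void)
{
	struct model_task parent = { .pid = 1, .is_new = false };
	struct model_task child  = { .pid = 2, .is_new = true };
	struct model_task stray  = { .pid = 3, .is_new = false };

	model_switch_to(0, &parent);
	assert(model_check_ok(0, &parent));	/* recorded task passes */
	assert(model_check_ok(0, &child));	/* new task passes */
	assert(!model_check_ok(0, &stray));	/* bypassed bookkeeping: fails */

	model_switch_to(0, &stray);		/* bookkeeping now done */
	assert(model_check_ok(0, &stray));
	assert(!model_check_ok(0, &parent));	/* parent is no longer current */
}
```

In this model, "stray" stands for a task that began running on a path that
skipped the switch bookkeeping, which is the situation suspected here.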

So, does Power have a way of switching to a new task without involving
schedule_tail()?  I convinced myself that my old bugbear, usermode helpers,
aren't causing this problem, but I could easily be missing something.

> Did you get a specific bisect target yet ?

On this one, Hugh is close enough.  ;-)

							Thanx, Paul

> Cheers,
> Ben.
> 
> > Call Trace:
> > [c0000001a99f78e0] [c00000000000f34c] .show_stack+0x6c/0x16c (unreliable)
> > [c0000001a99f7990] [c000000000077b40] .__might_sleep+0x11c/0x134
> > [c0000001a99f7a10] [c0000000000c6228] .filemap_fault+0x1fc/0x494
> > [c0000001a99f7af0] [c0000000000e7c9c] .__do_fault+0x120/0x684
> > [c0000001a99f7c00] [c000000000025790] .do_page_fault+0x458/0x664
> > [c0000001a99f7e30] [c000000000005868] handle_page_fault+0x10/0x30
> > 
> > I've plenty more examples, most of them from page faults or from kswapd;
> > but I don't think there's any more useful information in them.
> > 
> > Anything I can try later on? 
> 
