public inbox for linux-ext4@vger.kernel.org
* Re: CRASH : RCU detected stall
       [not found] <CAKEzZ8wRAFxXOcgxzu2KkntvNz8QvGocWpH_xDbRxRg=7LgbPA@mail.gmail.com>
@ 2017-06-13 17:30 ` Theodore Ts'o
       [not found]   ` <CAKEzZ8yb_n+iCy8+_fYujc1LAbijjjd97+dEh9qrwNfx0GAqUw@mail.gmail.com>
  0 siblings, 1 reply; 2+ messages in thread
From: Theodore Ts'o @ 2017-06-13 17:30 UTC
  To: Ramin Farajpour Cami; +Cc: adilger.kernel, linux-ext4, linux-kernel, syzkaller

On Tue, Jun 13, 2017 at 07:35:37PM +0430, Ramin Farajpour Cami wrote:
> Hi,
> 
> I've got the following error report while fuzzing the kernel with syzkaller
> version (4.12-rc5)
> 
> https://groups.google.com/forum/#!topic/syzkaller/4e8MkNnRFRQ

Can you reliably reproduce this failure?  If so, please give details.
Unfortunately this sort of RCU self-stall can be caused by any number
of things.

				- Ted

* Re: CRASH : RCU detected stall
       [not found]   ` <CAKEzZ8yb_n+iCy8+_fYujc1LAbijjjd97+dEh9qrwNfx0GAqUw@mail.gmail.com>
@ 2017-06-14 12:55     ` Theodore Ts'o
  0 siblings, 0 replies; 2+ messages in thread
From: Theodore Ts'o @ 2017-06-14 12:55 UTC
  To: Ramin Farajpour Cami; +Cc: adilger.kernel, linux-ext4, linux-kernel, syzkaller

On Wed, Jun 14, 2017 at 10:02:00AM +0430, Ramin Farajpour Cami wrote:
> 
> Unfortunately it's not reproducible. do you have idea about it?

Nope.  Note that this isn't necessarily an ext4 bug.  We have two
complaints: one about the rcu_sched kthread getting starved, and one
about an NMI handler taking too long to run:

rcu_sched kthread starved for 22270 jiffies! g2951 c2950 f0x0
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 2.762 msecs
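
For anyone triaging similar reports, here is a minimal sketch (in
Python) that pulls the numbers out of the first message.  It rests on
assumptions: the field names are illustrative labels, with g and c
taken to be the current and last-completed grace-period numbers and f
the grace-period flags, and the `hz` default of 1000 is a guess, since
the reporter's CONFIG_HZ is not known.  `parse_starved` is a
hypothetical helper, not an existing tool.

```python
import re

# Matches e.g. "rcu_sched kthread starved for 22270 jiffies! g2951 c2950 f0x0"
STARVED_RE = re.compile(
    r"(?P<flavor>\w+) kthread starved for (?P<jiffies>\d+) jiffies! "
    r"g(?P<g>\d+) c(?P<c>\d+) f(?P<f>0x[0-9a-fA-F]+)"
)


def parse_starved(line, hz=1000):
    """Parse an RCU 'kthread starved' line; return a dict, or None if no match.

    hz is an ASSUMPTION: the kernel's CONFIG_HZ must be supplied by the
    caller for the seconds figure to be meaningful.
    """
    m = STARVED_RE.search(line)
    if m is None:
        return None
    jiffies = int(m.group("jiffies"))
    return {
        "flavor": m.group("flavor"),
        "jiffies": jiffies,
        "seconds": jiffies / hz,           # only as good as the hz guess
        "gp_current": int(m.group("g")),   # assumed meaning of the g field
        "gp_completed": int(m.group("c")), # assumed meaning of the c field
        "gp_flags": int(m.group("f"), 16), # assumed meaning of the f field
    }
```

If hz really were 1000, the line above would correspond to roughly 22
seconds of starvation; with a different CONFIG_HZ the figure changes
accordingly.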

We also happened to be doing writeback on another CPU, and the ext4
thread was doing a slab allocation.

What ultimately caused the RCU starvation is not at all clear.  Were
we spinning inside the slab allocator, or not?  Getting some
magic-sysrq triggers to see if the PC was always in the slab allocator
would be useful.  And if that's the case, it's not clear what might
have caused us to spin in the slab allocator.  It could be due to
some slab state getting corrupted by a previous system call, and ext4
was just unlucky enough to do the slab allocation which caused it to
go for a loop.
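
One way to collect those magic-sysrq samples is sketched below in
Python.  It assumes the machine is still responsive enough to run a
root process, and it writes the 'l' command (backtrace of all active
CPUs) to /proc/sysrq-trigger a few times; the backtraces land in the
kernel log, not in the script's output.  The function name, the sample
count, and the interval are hypothetical choices for illustration.

```python
import time


def sample_backtraces(count=5, interval=1.0, trigger_path="/proc/sysrq-trigger"):
    """Request `count` all-CPU backtraces, `interval` seconds apart.

    Each write of 'l' asks the kernel to dump a stack backtrace for all
    active CPUs into the kernel log.  Writing to the real trigger file
    needs root.  Returns the number of requests issued.
    """
    issued = 0
    for i in range(count):
        with open(trigger_path, "w") as trigger:
            trigger.write("l")
        issued += 1
        if i < count - 1:
            time.sleep(interval)  # space the samples out
    return issued
```

Comparing the program counter across the resulting dumps (for example
with dmesg) would show whether the CPU stays inside the slab allocator
or moves around between samples.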

We just don't have enough information to do any kind of useful
investigation.

					- Ted
