* inode->i_wb_list corruption.
@ 2012-03-06 18:51 Dave Jones
  2012-03-06 21:03 ` Jan Kara
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Jones @ 2012-03-06 18:51 UTC
  To: Linux Kernel; +Cc: Fedora Kernel Team, viro

We've recently had three separate reports against 3.2.x where the linked-list
debugging trips because prev->next is NULL instead of pointing to the current
list entry while walking i_wb_list.

Call traces are slightly different each time, but all end up walking i_wb_list
in dput -> d_kill -> iput -> evict -> inode_wb_list_del.

What protects that list? It looks to be just bdi->wb.list_lock?
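
For readers unfamiliar with the check being tripped, here is a minimal userspace sketch of a list deletion guarded by a sanity test in the spirit of the kernel's list debugging. It is not the kernel's code: `list_del_checked`, `list_del_entry_valid` and `demo_detects_zeroed_neighbour` are hypothetical names, and the stray write is only simulated with `memset`. It assumes nothing about locking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/*
 * Sanity check in the spirit of the kernel's list debugging: both
 * neighbours must still point back at the entry being removed.
 */
static bool list_del_entry_valid(const struct list_head *entry)
{
    const struct list_head *prev = entry->prev;
    const struct list_head *next = entry->next;

    if (prev == NULL || next == NULL)
        return false;
    return prev->next == entry && next->prev == entry;
}

/* Unlink entry; returns false and leaves the list alone on corruption. */
static bool list_del_checked(struct list_head *entry)
{
    if (!list_del_entry_valid(entry))
        return false;
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = entry->prev = NULL;
    return true;
}

/* Hypothetical demo: zero one node, then try to delete its successor. */
static bool demo_detects_zeroed_neighbour(void)
{
    struct list_head head, a, b, c;

    list_init(&head);
    list_add_tail(&a, &head);
    list_add_tail(&b, &head);
    list_add_tail(&c, &head);

    memset(&b, 0, sizeof(b));      /* simulate the stray write */
    return !list_del_checked(&c);  /* c.prev == &b, but b.next == NULL */
}
```

In the demo, the entry being deleted is intact; it is the zeroed neighbour that makes prev->next come back NULL, which matches the symptom in the reports.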


full reports at:
https://bugzilla.redhat.com/show_bug.cgi?id=784741
https://bugzilla.redhat.com/show_bug.cgi?id=799229
https://bugzilla.redhat.com/show_bug.cgi?id=799692

	Dave


* Re: inode->i_wb_list corruption.
@ 2012-03-15 14:08 Petr Tesařík
  2012-03-15 14:22 ` Dave Airlie
  2012-03-15 14:49 ` Dave Jones
  0 siblings, 2 replies; 19+ messages in thread
From: Petr Tesařík @ 2012-03-15 14:08 UTC
  To: Dave Jones
  Cc: Yang Bai, Fengguang Wu, Linux Kernel, Fedora Kernel Team, kernel

On Saturday, 10 March 2012 at 02:00:15, Dave Jones wrote:
> (trimmed cc)
> 
> On Sat, Mar 10, 2012 at 12:14:37AM +0800, Yang Bai wrote:
>  > On Fri, Mar 9, 2012 at 11:19 PM, Dave Jones <davej@redhat.com> wrote:
>  > > And with that, this arrived..
>  > > https://bugzilla.redhat.com/show_bug.cgi?id=788433#c3
>  > > 
>  > > I'm leaning strongly towards believing this is yet another case of
>  > > i915 corrupting memory on resume.
>  > 
>  > Nice catch. I am wondering
>  > 1) why all lists are being affected and
>  > 2) why every list_head's prev is being set to NULL.
>  > 
>  > Any ideas?
> 
> This is probably the same bug:
> https://bugzilla.kernel.org/show_bug.cgi?id=37142
> Petr noticed that the corruption is 32 bytes getting zeroed at the
> beginning of a page.
> 
> I think this may be responsible for a lot of different bugs that we've
> had reported.
> 
> i915_drm_thaw is a deep nest of functions though, so this is going to be
> hard to track down where that write is coming from. Because the corruption
> seems to happen to pages that are already allocated, we probably can't
> even rely on DEBUG_PAGEALLOC, though it might be worth trying.
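
The "32 bytes zeroed at the beginning of a page" signature can be checked mechanically on a memory snapshot. The following is only a sketch under stated assumptions: 4 KiB pages, a snapshot already available as a flat buffer, and hypothetical helper names (`page_head_is_zeroed`, `find_zeroed_heads`). Many healthy pages legitimately start with zeros, so a hit is a candidate to inspect, not proof of corruption.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES   4096u /* assumption: 4 KiB pages */
#define ZEROED_HEAD_BYTES 32u   /* length reported in the thread */

/* True if the first 32 bytes of the page are all zero. */
static bool page_head_is_zeroed(const unsigned char *page)
{
    for (size_t i = 0; i < ZEROED_HEAD_BYTES; i++) {
        if (page[i] != 0)
            return false;
    }
    return true;
}

/*
 * Scan npages consecutive pages in buf. Stores the indices of up to
 * max_hits matching pages in hits and returns the total number of matches.
 */
static size_t find_zeroed_heads(const unsigned char *buf, size_t npages,
                                size_t *hits, size_t max_hits)
{
    size_t found = 0;

    for (size_t i = 0; i < npages; i++) {
        if (!page_head_is_zeroed(buf + i * PAGE_SIZE_BYTES))
            continue;
        if (found < max_hits)
            hits[found] = i;
        found++;
    }
    return found;
}
```

Comparing the hit list from a snapshot taken before suspend with one taken after resume would narrow the candidates to pages whose head changed across the resume.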

If you believe it could be written by the CPU, I can try to catch the
instruction that writes to this memory. My plan is as follows:

Set up all the hardware debug registers to trap writes to the pages that are
likely to get corrupted. Remember, I've always seen the corruption happen in
roughly the same physical memory area.

I know there are only 4 registers I can use, and the potential corruption
area is much larger than 4 pages, but with enough reboots the chance is quite
high that I'll get lucky.
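
The "enough reboots" argument can be put in rough numbers. The sketch below assumes a simplified model that is not from the thread: each boot corrupts exactly one page, chosen uniformly at random from a fixed set of candidate pages, independently of earlier boots. Under that model the chance of at least one catch in k boots is 1 - (1 - watched/candidates)^k. The function names are hypothetical.

```c
/*
 * Probability of trapping the write at least once in `boots` attempts,
 * assuming each boot corrupts exactly one of `candidates` pages chosen
 * uniformly at random and `watched` of them are covered by a debug
 * register.
 */
static double catch_probability(unsigned watched, unsigned candidates,
                                unsigned boots)
{
    double miss_once, miss_all = 1.0;

    if (candidates == 0)
        return 0.0;
    if (watched >= candidates)
        return boots > 0 ? 1.0 : 0.0;

    miss_once = 1.0 - (double)watched / (double)candidates;
    for (unsigned i = 0; i < boots; i++)
        miss_all *= miss_once;
    return 1.0 - miss_all;
}

/*
 * Smallest number of boots whose catch probability reaches `target`.
 * Returns 0 if the target is not reached within `max_boots`.
 */
static unsigned boots_needed(unsigned watched, unsigned candidates,
                             double target, unsigned max_boots)
{
    for (unsigned k = 1; k <= max_boots; k++) {
        if (catch_probability(watched, candidates, k) >= target)
            return k;
    }
    return 0;
}
```

As a toy case, with 4 of 8 candidate pages watched, `boots_needed(4, 8, 0.9, 100)` returns 4. The real candidate area is much larger, so the count grows accordingly, and the model says nothing if the write comes from a device rather than the CPU.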

I haven't gone for that plan yet, because I thought the area was in fact 
written to by someone else on the PCI bus, not the CPU. If nothing else, I can 
verify that. ;-)

Dave, do you think the result of such testing would help you resolve the bug?

Petr


Thread overview: 19+ messages
2012-03-06 18:51 inode->i_wb_list corruption Dave Jones
2012-03-06 21:03 ` Jan Kara
2012-03-07  7:26   ` Fengguang Wu
2012-03-07 10:42     ` Jan Kara
2012-03-09  8:34       ` Yang Bai
2012-03-09 14:57         ` Dave Jones
2012-03-09 15:19           ` Dave Jones
2012-03-09 16:14             ` Yang Bai
2012-03-09 18:00               ` Dave Jones
2012-03-09 20:08                 ` Keith Packard
2012-03-09 20:19                   ` Josh Boyer
2012-03-09 22:44                     ` Keith Packard
2012-03-12 21:13                       ` Josh Boyer
2012-03-12 21:27                         ` David Woodhouse
2012-03-12 23:26                   ` Dave Jones
2012-03-13  0:06                     ` Keith Packard
  -- strict thread matches above, loose matches on Subject: below --
2012-03-15 14:08 Petr Tesařík
2012-03-15 14:22 ` Dave Airlie
2012-03-15 14:49 ` Dave Jones
