linux-fsdevel.vger.kernel.org archive mirror
* [Was: 2.6.25-mm1]
       [not found] <20080418014757.52fb4a4f.akpm@linux-foundation.org>
@ 2008-04-21  8:31 ` Jiri Slaby
  2008-04-21  9:06   ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Jiri Slaby @ 2008-04-21  8:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Al Viro, linux-fsdevel

On 04/18/2008 10:47 AM, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25/2.6.25-mm1/ 

Hi,

$ ls /usr/share/man/cat3readlin
Segmentation fault

[the file doesn't exist.]
This is probably the same bug as in -rc8-mm2 I reported here:
http://www.opensubscriber.com/message/linux-kernel@vger.kernel.org/9008289.html

general protection fault: 0000 [1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/net/eth0/statistics/collisions
CPU 0
Modules linked in: test ipv6 tun bitrev arc4 ecb crypto_blkcipher cryptomgr 
crypto_algapi ath5k mac80211 crc32 sr_mod usbhid ohci1394 rtc_cmos hid rtc_core 
cfg80211 ieee1394 cdrom ehci_hcd rtc_lib ff_memless floppy evdev
Pid: 24838, comm: man Not tainted 2.6.25-mm1_64 #403
RIP: 0010:[<ffffffff802aca27>]  [<ffffffff802aca27>] __d_lookup+0x97/0x160
RSP: 0018:ffff8100337d1b98  EFLAGS: 00010206
RAX: 00f0000000000000 RBX: 00f0000000000000 RCX: 0000000000000012
RDX: ffff8100200830e0 RSI: ffff8100337d1ca8 RDI: ffff810079195708
RBP: ffff8100337d1bf8 R08: ffff8100337d1ca8 R09: 0000000000000000
R10: 000000000000013d R11: 0000000000000246 R12: ffff8100200830c8
R13: 00000000198eaed5 R14: ffff810079195708 R15: ffff8100337d1bc8
FS:  00007f447b5c06f0(0000) GS:ffffffff80664000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000001484f88 CR3: 000000005fac4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process man (pid: 24838, threadinfo ffff8100337d0000, task ffff810034418000)
Stack:  ffff8100337d1ca8 000000000000000b ffff810079195710 0000000b792561a0
  ffff81003136600f ffffffff802f9073 00f0000000000000 0000000000000001
  ffff8100337d1e48 ffff8100337d1e48 ffff8100337d1ca8 ffff8100337d1cb8
Call Trace:
  [<ffffffff802f9073>] ? ext3_lookup+0xc3/0x100
  [<ffffffff802a1e85>] do_lookup+0x35/0x220
  [<ffffffff802a22c2>] __link_path_walk+0x252/0x1010
  [<ffffffff802b20ba>] ? mntput_no_expire+0x2a/0x140
  [<ffffffff802a30ee>] path_walk+0x6e/0xe0
  [<ffffffff802a33b2>] do_path_lookup+0xa2/0x240
  [<ffffffff802a38b7>] __path_lookup_intent_open+0x67/0xd0
  [<ffffffff802a392c>] path_lookup_open+0xc/0x10
  [<ffffffff802a487a>] do_filp_open+0xaa/0x990
  [<ffffffff8024f8b4>] ? up+0x34/0x50
  [<ffffffff804fd619>] ? unlock_kernel+0x29/0x30
  [<ffffffff802b20ba>] ? mntput_no_expire+0x2a/0x140
  [<ffffffff80295ddc>] ? get_unused_fd_flags+0x8c/0x140
  [<ffffffff80295f06>] do_sys_open+0x76/0x110
  [<ffffffff80295fcb>] sys_open+0x1b/0x20
  [<ffffffff8020b91b>] system_call_after_swapgs+0x7b/0x80


Code: 48 89 c3 48 8b 55 d0 8b 45 bc 48 85 d2 48 89 45 a8 75 18 eb 5f 0f 1f 80 00 
00 00 00 48 8b 1b 48 89 5d d0 49 8b 07 48 85 c0 74 49 <48> 8b 03 4c 8d 63 e8 0f 
18 08 45 39 6c 24 30 75 e0 4d 39 74 24
RIP  [<ffffffff802aca27>] __d_lookup+0x97/0x160
  RSP <ffff8100337d1b98>
---[ end trace cb4ec4895332b217 ]---


* Re: [Was: 2.6.25-mm1]
  2008-04-21  8:31 ` [Was: 2.6.25-mm1] Jiri Slaby
@ 2008-04-21  9:06   ` Al Viro
  2008-04-21  9:37     ` fault in __d_lookup " Jiri Slaby
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2008-04-21  9:06 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, linux-kernel, linux-fsdevel

On Mon, Apr 21, 2008 at 10:31:40AM +0200, Jiri Slaby wrote:

        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
                struct qstr *qstr;

                if (dentry->d_name.hash != hash)
                        continue;

walking into node == (struct hlist_node *)0x00f0000000000000...
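
A note on why this shows up as a general protection fault rather than an ordinary page fault. Decoding the oops "Code:" bytes by hand, the marked instruction <48> 8b 03 appears to be mov (%rbx),%rax, and RBX holds 00f0000000000000. Assuming x86-64 with 48-bit virtual addresses, that value is non-canonical (bits 63..47 are neither all zero nor all one), and dereferencing a non-canonical address raises #GP. A minimal userspace sketch of that check follows; is_canonical_48() is a hypothetical helper, not a kernel function:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: x86-64 with 48-bit virtual addresses (no 5-level paging).
 * An address is canonical when bits 63..47 are all 0 or all 1.
 * is_canonical_48() is a hypothetical helper for illustration only.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffff;
}
```

With this helper, is_canonical_48(0x00f0000000000000ULL) is false, while kernel addresses from the register dump (for example R12, ffff8100200830c8) pass.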



* fault in __d_lookup [Was: 2.6.25-mm1]
  2008-04-21  9:06   ` Al Viro
@ 2008-04-21  9:37     ` Jiri Slaby
  2008-04-21  9:45       ` Al Viro
  0 siblings, 1 reply; 7+ messages in thread
From: Jiri Slaby @ 2008-04-21  9:37 UTC (permalink / raw)
  To: Al Viro; +Cc: Andrew Morton, linux-kernel, linux-fsdevel

On 04/21/2008 11:06 AM, Al Viro wrote:
> On Mon, Apr 21, 2008 at 10:31:40AM +0200, Jiri Slaby wrote:
> 
>         hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
>                 struct qstr *qstr;
> 
>                 if (dentry->d_name.hash != hash)
>                         continue;
> 
> walking into node == (struct hlist_node *)0x00f0000000000000...

Yup, true. In the last oops it got stuck on memcmp a few lines below.

BTW, it's 100% reproducible after it happens once, but fixable by a reboot. Any
tests I should run (memtest, some printks stuck in anywhere)?


* Re: fault in __d_lookup [Was: 2.6.25-mm1]
  2008-04-21  9:37     ` fault in __d_lookup " Jiri Slaby
@ 2008-04-21  9:45       ` Al Viro
  2008-04-21  9:59         ` Jiri Slaby
  0 siblings, 1 reply; 7+ messages in thread
From: Al Viro @ 2008-04-21  9:45 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, linux-kernel, linux-fsdevel

On Mon, Apr 21, 2008 at 11:37:40AM +0200, Jiri Slaby wrote:
> On 04/21/2008 11:06 AM, Al Viro wrote:
> >On Mon, Apr 21, 2008 at 10:31:40AM +0200, Jiri Slaby wrote:
> >
> >        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
> >                struct qstr *qstr;
> >
> >                if (dentry->d_name.hash != hash)
> >                        continue;
> >
> >walking into node == (struct hlist_node *)0x00f0000000000000...
> 
> Yup, true. In the last oops it got stuck on memcmp a few lines below.
> 
> BTW, it's 100% reproducible after it happens once, but fixable by a reboot.
> Any tests I should run (memtest, some printks stuck in anywhere)?

Well, if the list has such a turd in it, you'll certainly hit it every time
you walk that list, so 100% reproducible is not surprising.

How well is it reproducible from a fresh boot?
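
The point about hitting it on every walk can be illustrated in userspace. This is a simplified sketch, not the dcache code: struct node and chain_lookup() are hypothetical names, and a plain singly linked chain stands in for the dentry hash bucket. A lookup that misses must follow every link in the chain, so any lookup hashing to the damaged bucket (such as the ls of a nonexistent file in the original report) reaches the bad link unless it matches an earlier entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry in a hash bucket chain. */
struct node {
	struct node *next;
	const char *name;
};

/*
 * Walk the chain looking for 'name'. Returns how many nodes were
 * visited and stores the match (or NULL on a miss) in *found.
 */
static size_t chain_lookup(const struct node *first, const char *name,
			   const struct node **found)
{
	size_t visited = 0;
	const struct node *n;

	assert(name != NULL && found != NULL);
	*found = NULL;
	for (n = first; n != NULL; n = n->next) {
		visited++;
		if (strcmp(n->name, name) == 0) {
			*found = n;
			break;
		}
	}
	return visited;
}
```

On a three-entry chain, a lookup for the first entry visits one node, while a lookup for a name that is not present visits all three, and so would follow every next pointer in the chain, including a corrupted one.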


* Re: fault in __d_lookup [Was: 2.6.25-mm1]
  2008-04-21  9:45       ` Al Viro
@ 2008-04-21  9:59         ` Jiri Slaby
  2008-04-21 13:42           ` Rafael J. Wysocki
  2008-04-21 17:23           ` Matthew Wilcox
  0 siblings, 2 replies; 7+ messages in thread
From: Jiri Slaby @ 2008-04-21  9:59 UTC (permalink / raw)
  To: Al Viro; +Cc: Andrew Morton, linux-kernel, linux-fsdevel

On 04/21/2008 11:45 AM, Al Viro wrote:
> On Mon, Apr 21, 2008 at 11:37:40AM +0200, Jiri Slaby wrote:
>> On 04/21/2008 11:06 AM, Al Viro wrote:
>>> On Mon, Apr 21, 2008 at 10:31:40AM +0200, Jiri Slaby wrote:
>>>
>>>        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
>>>                struct qstr *qstr;
>>>
>>>                if (dentry->d_name.hash != hash)
>>>                        continue;
>>>
>>> walking into node == (struct hlist_node *)0x00f0000000000000...
>> Yup, true. In the last oops it got stuck on memcmp a few lines below.
>>
>> BTW, it's 100% reproducible after it happens once, but fixable by a reboot.
>> Any tests I should run (memtest, some printks stuck in anywhere)?
> 
> Well, if the list has such a turd in it, you'll certainly hit it every time
> you walk that list, so 100% reproducible is not surprising.
> 
> How well is it reproducible from a fresh boot?

A few days with suspend/resume cycles. This one was booted 12 hours ago, with
one suspend/resume. I will keep an eye on it and keep you informed.


* Re: fault in __d_lookup [Was: 2.6.25-mm1]
  2008-04-21  9:59         ` Jiri Slaby
@ 2008-04-21 13:42           ` Rafael J. Wysocki
  2008-04-21 17:23           ` Matthew Wilcox
  1 sibling, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2008-04-21 13:42 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Al Viro, Andrew Morton, linux-kernel, linux-fsdevel

On Monday, 21 of April 2008, Jiri Slaby wrote:
> On 04/21/2008 11:45 AM, Al Viro wrote:
> > On Mon, Apr 21, 2008 at 11:37:40AM +0200, Jiri Slaby wrote:
> >> On 04/21/2008 11:06 AM, Al Viro wrote:
> >>> On Mon, Apr 21, 2008 at 10:31:40AM +0200, Jiri Slaby wrote:
> >>>
> >>>        hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
> >>>                struct qstr *qstr;
> >>>
> >>>                if (dentry->d_name.hash != hash)
> >>>                        continue;
> >>>
> >>> walking into node == (struct hlist_node *)0x00f0000000000000...
> >> Yup, true. In the last oops it got stuck on memcmp a few lines below.
> >>
> >> BTW, it's 100% reproducible after it happens once, but fixable by a reboot.
> >> Any tests I should run (memtest, some printks stuck in anywhere)?
> > 
> > Well, if the list has such a turd in it, you'll certainly hit it every time
> > you walk that list, so 100% reproducible is not surprising.
> > 
> > How well is it reproducible from a fresh boot?
> 
> A few days with suspend/resume cycles. This one was booted 12 hours ago, with
> one suspend/resume. I will keep an eye on it and keep you informed.

I think that's exactly the same problem I reported here:
http://lkml.org/lkml/2008/4/20/182
for 2.6.25-git2, so it hit the mainline and seems to be related to RCU.

Thanks,
Rafael


* Re: fault in __d_lookup [Was: 2.6.25-mm1]
  2008-04-21  9:59         ` Jiri Slaby
  2008-04-21 13:42           ` Rafael J. Wysocki
@ 2008-04-21 17:23           ` Matthew Wilcox
  1 sibling, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2008-04-21 17:23 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Al Viro, Andrew Morton, linux-kernel, linux-fsdevel

On Mon, Apr 21, 2008 at 11:59:51AM +0200, Jiri Slaby wrote:
> On 04/21/2008 11:45 AM, Al Viro wrote:
> >Well, if the list has such a turd in it, you'll certainly hit it every time
> >you walk that list, so 100% reproducible is not surprising.
> >
> >How well is it reproducible from a fresh boot?
> 
> A few days with suspend/resume cycles. This one was booted 12 hours ago, with
> one suspend/resume. I will keep an eye on it and keep you informed.

Shall we see if we can catch it earlier?  I have no idea if this will
help ... I haven't even booted it on a test machine yet ;-)  If I got
something wrong, it'll BUG() pretty early.

diff --git a/include/linux/list.h b/include/linux/list.h
index 75ce2cb..238ca1e 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -724,10 +724,17 @@ static inline int hlist_empty(const struct hlist_head *h)
 	return !h->first;
 }
 
+#ifdef CONFIG_DEBUG_LIST
+extern void hlist_check(struct hlist_node *n);
+#else
+#define hlist_check(n)		do { } while (0)
+#endif
+
 static inline void __hlist_del(struct hlist_node *n)
 {
 	struct hlist_node *next = n->next;
 	struct hlist_node **pprev = n->pprev;
+	hlist_check(n);
 	*pprev = next;
 	if (next)
 		next->pprev = pprev;
@@ -785,6 +792,7 @@ static inline void hlist_replace_rcu(struct hlist_node *old,
 {
 	struct hlist_node *next = old->next;
 
+	hlist_check(old);
 	new->next = next;
 	new->pprev = old->pprev;
 	smp_wmb();
@@ -840,6 +848,7 @@ static inline void hlist_add_head_rcu(struct hlist_node *n,
 static inline void hlist_add_before(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	hlist_check(next);
 	n->pprev = next->pprev;
 	n->next = next;
 	next->pprev = &n->next;
@@ -849,6 +858,7 @@ static inline void hlist_add_before(struct hlist_node *n,
 static inline void hlist_add_after(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	hlist_check(n);
 	next->next = n->next;
 	n->next = next;
 	next->pprev = &n->next;
@@ -878,6 +888,7 @@ static inline void hlist_add_after(struct hlist_node *n,
 static inline void hlist_add_before_rcu(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	hlist_check(next);
 	n->pprev = next->pprev;
 	n->next = next;
 	smp_wmb();
@@ -906,6 +917,7 @@ static inline void hlist_add_before_rcu(struct hlist_node *n,
 static inline void hlist_add_after_rcu(struct hlist_node *prev,
 				       struct hlist_node *n)
 {
+	hlist_check(prev);
 	n->next = prev->next;
 	n->pprev = &prev->next;
 	smp_wmb();
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 4350ba9..00b56bf 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -1,5 +1,8 @@
 /*
  * Copyright 2006, Red Hat, Inc., Dave Jones
+ * Copyright 2008 Intel Corporation
+ * Author: Matthew Wilcox <willy@linux.intel.com>
+ *
  * Released under the General Public License (GPL).
  *
  * This file contains the linked list implementations for
@@ -76,3 +79,18 @@ void list_del(struct list_head *entry)
 	entry->prev = LIST_POISON2;
 }
 EXPORT_SYMBOL(list_del);
+
+void hlist_check(struct hlist_node *n)
+{
+	if (unlikely(*n->pprev != n)) {
+		printk(KERN_ERR "hlist corruption. *pprev should be %p, "
+				"but was %p\n", n, *n->pprev);
+		BUG();
+	}
+	if (unlikely(n->next != NULL && n->next->pprev != &n->next)) {
+		printk(KERN_ERR "hlist corruption. n->next->pprev should be "
+				"%p, but was %p\n", &n->next, n->next->pprev);
+		BUG();
+	}
+}
+EXPORT_SYMBOL(hlist_check);

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
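
For experimenting with the checks outside the kernel, here is a minimal userspace sketch, assuming stand-in definitions of struct hlist_node and struct hlist_head with the same fields as the kernel's. hlist_node_ok() is a hypothetical non-fatal variant of the patch's hlist_check(): it returns false where the patch would printk() and BUG(). Note that both checks dereference n->pprev and n->next, so a wild pointer like the one in the oops would likely fault inside the check itself; the checks seem best suited to catching an inconsistent list at modification time, before a later lookup trips over it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's hlist types (same fields). */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Same insertion steps as the kernel's hlist_add_head(). */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Hypothetical non-fatal variant of the patch's hlist_check():
 * returns false where the patch would printk() and BUG().
 */
static bool hlist_node_ok(struct hlist_node *n)
{
	assert(n != NULL);
	/* Whatever points at n (head or previous node) must point at n. */
	if (*n->pprev != n)
		return false;
	/* The next node's back-pointer must refer to n's next field. */
	if (n->next != NULL && n->next->pprev != &n->next)
		return false;
	return true;
}
```

For a list h -> a -> b built with hlist_add_head(), both nodes pass. After overwriting b.pprev with a stale value such as &h.first, both hlist_node_ok(&a) and hlist_node_ok(&b) return false.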


