linux-kernel.vger.kernel.org archive mirror
* 2.6.38.8 kernel bug in XFS or megaraid driver with heavy I/O load
@ 2011-10-11  9:17 Anders Ossowicki
  2011-10-11 13:34 ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Anders Ossowicki @ 2011-10-11  9:17 UTC (permalink / raw)
  To: linux-kernel; +Cc: aradford

We seem to have hit a bug on our brand-new disk array, which holds an XFS
filesystem, on the 2.6.38.8 kernel. The array consists of two Dell MD1220
enclosures with Intel SSDs, daisy-chained behind an LSI MegaRAID SAS 9285-8e
RAID controller. It was under heavy I/O load, 1-200 MB/s r/w from postgres,
for about a week before the bug showed up. The system itself is a Dell
PowerEdge R815 with 32 CPU cores and 256G of memory.

Support for the 9285-8e controller was introduced as part of a series of
patches for drivers/scsi/megaraid in 2.6.38 (0d49016b..cd50ba8e). Since the
megaraid driver's support for this controller is so new, it might be the real
source of the issue, but this is pure speculation on my part. Any suggestions
would be most welcome.

The full dmesg is available at
http://dev.exherbo.org/~arkanoid/kat-dmesg-2011-10.txt

BUG: unable to handle kernel paging request at 000000000040403c
IP: [<ffffffff810f8d71>] find_get_pages+0x61/0x110
PGD 0 
Oops: 0000 [#1] SMP  
last sysfs file: /sys/devices/system/cpu/cpu31/cache/index2/shared_cpu_map
CPU 11 
Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs
 minix ntfs vfat msdos fat jfs xfs reiserfs nfsd exportfs nfs lockd nfs_acl
 auth_rpcgss sunrpc autofs4 psmouse serio_raw joydev ixgbe lp amd64_edac_mod
 i2c_piix4 dca parport edac_core bnx2 power_meter dcdbas mdio edac_mce_amd ses
 enclosure usbhid hid ahci mpt2sas libahci scsi_transport_sas megaraid_sas
 raid_class 

Pid: 27512, comm: flush-8:32 Tainted: G        W   2.6.38.8 #1 Dell Inc.
 PowerEdge R815/04Y8PT
RIP: 0010:[<ffffffff810f8d71>]  [<ffffffff810f8d71>] find_get_pages+0x61/0x110
RSP: 0018:ffff881fdee55800  EFLAGS: 00010246
RAX: ffff8814a66d7000 RBX: ffff881fdee558c0 RCX: 000000000000000e
RDX: 0000000000000005 RSI: 0000000000000001 RDI: 0000000000404034
RBP: ffff881fdee55850 R08: 0000000000000001 R09: 0000000000000002
R10: ffffea00a0ff7788 R11: ffff88129306ac88 R12: 0000000000031535
R13: 000000000000000e R14: ffff881fdee558e8 R15: 0000000000000005
FS:  00007fec9ce13720(0000) GS:ffff88181fc80000(0000) knlGS:00000000f744d6d0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000040403c CR3: 0000000001a03000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-8:32 (pid: 27512, threadinfo ffff881fdee54000, task ffff881fdf4adb80)
Stack:
 0000000000000000 0000000000000000 0000000000000000 ffff8832e7edf6e0
 0000000000000000 ffff881fdee558b0 ffffea008b443c18 0000000000031535
 ffff8832e7edf590 ffff881fdee55d20 ffff881fdee55870 ffffffff81101f92
Call Trace:
 [<ffffffff81101f92>] pagevec_lookup+0x22/0x30
 [<ffffffffa033e00d>] xfs_cluster_write+0xad/0x180 [xfs]
 [<ffffffffa033e4f4>] xfs_vm_writepage+0x414/0x4f0 [xfs]
 [<ffffffff810ffb77>] __writepage+0x17/0x40 
 [<ffffffff81100d95>] write_cache_pages+0x1c5/0x4a0
 [<ffffffff810ffb60>] ? __writepage+0x0/0x40
 [<ffffffff81101094>] generic_writepages+0x24/0x30
 [<ffffffffa033d5dd>] xfs_vm_writepages+0x5d/0x80 [xfs]
 [<ffffffff811010c1>] do_writepages+0x21/0x40
 [<ffffffff811730bf>] writeback_single_inode+0x9f/0x250
 [<ffffffff8117370b>] writeback_sb_inodes+0xcb/0x170
 [<ffffffff81174174>] writeback_inodes_wb+0xa4/0x170
 [<ffffffff8117450b>] wb_writeback+0x2cb/0x440
 [<ffffffff81035bb9>] ? default_spin_lock_flags+0x9/0x10
 [<ffffffff8158b3af>] ? _raw_spin_lock_irqsave+0x2f/0x40
 [<ffffffff811748ac>] wb_do_writeback+0x22c/0x280
 [<ffffffff811749aa>] bdi_writeback_thread+0xaa/0x260
 [<ffffffff81174900>] ? bdi_writeback_thread+0x0/0x260
 [<ffffffff81081b76>] kthread+0x96/0xa0
 [<ffffffff8100cda4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81081ae0>] ? kthread+0x0/0xa0
 [<ffffffff8100cda0>] ? kernel_thread_helper+0x0/0x10
Code: 4e 1c 00 85 c0 89 c1 0f 84 a7 00 00 00 49 89 de 45 31 ff 31 d2 0f 1f 44
 00 00 49 8b 06 48 8b 38 48 85 ff 74 3d 40 f6 c7 01 75 54 <44> 8b 47 08 4c 8d 57
 08 45 85 c0 74 e5 45 8d 48 01 44 89 c0 f0  
RIP  [<ffffffff810f8d71>] find_get_pages+0x61/0x110
RSP <ffff881fdee55800>
CR2: 000000000040403c
---[ end trace 84193c2a431ae14b ]--- 
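
As a reading aid for the oops above, here is a minimal sketch that checks how
the faulting address relates to the register dump. It decodes the four bytes
starting at the <44> marker in the Code: line as a load with an 8-bit
displacement and confirms that RDI plus that displacement equals CR2. The
helper name decode_mov_disp8 is hypothetical and handles only this one
encoding; the register numbering follows the standard x86-64 convention
(RDI = 7).

```python
# Values copied verbatim from the oops above.
RDI = 0x0000000000404034
CR2 = 0x000000000040403C
FAULTING_BYTES = bytes.fromhex("448b4708")  # bytes starting at the <44> marker

REG_RDI = 7  # x86-64 register number for RDI


def decode_mov_disp8(insn: bytes):
    """Decode 'REX 8B /r disp8' (MOV r32, [base + disp8]).

    Hypothetical helper for illustration only: it rejects every other
    encoding. Returns (base_register_number, signed_displacement).
    """
    rex, opcode, modrm, disp = insn[0], insn[1], insn[2], insn[3]
    if rex & 0xF0 != 0x40:
        raise ValueError("expected a REX prefix")
    if opcode != 0x8B:
        raise ValueError("expected opcode 8B (MOV r32, r/m32)")
    mod, rm = modrm >> 6, modrm & 0x7
    if mod != 0b01:
        raise ValueError("expected mod=01 (8-bit displacement)")
    if rm == 0b100:
        raise ValueError("SIB addressing is not handled here")
    base = rm | ((rex & 0x1) << 3)  # REX.B extends the base register number
    if disp >= 0x80:                # disp8 is a signed byte
        disp -= 0x100
    return base, disp


base, disp = decode_mov_disp8(FAULTING_BYTES)
print("base is RDI:", base == REG_RDI)
print("displacement:", disp)
print("RDI + disp == CR2:", RDI + disp == CR2)
```

The load goes through RDI, and 0x404034 + 8 gives the reported fault address
0x40403c. That value is far below the kernel address range seen in the other
registers, which would be consistent with a bad pointer being dereferenced in
find_get_pages; where that pointer came from is not something the sketch can
tell.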
-- 
Anders Ossowicki



end of thread, other threads:[~2011-10-24 16:56 UTC | newest]

Thread overview: 9+ messages
2011-10-11  9:17 2.6.38.8 kernel bug in XFS or megaraid driver with heavy I/O load Anders Ossowicki
2011-10-11 13:34 ` Christoph Hellwig
2011-10-11 14:13   ` Anders Ossowicki
2011-10-11 16:07     ` Jesper Krogh
2011-10-12  0:35     ` Dave Chinner
2011-10-12  4:13       ` Stan Hoeppner
2011-10-12 12:29       ` Anders Ossowicki
2011-10-17 12:40   ` jesper
2011-10-24 16:45     ` Michael Monnerie
