* assert in xfs_log_commit_cil
From: Ben Myers @ 2014-01-24 19:37 UTC (permalink / raw)
To: xfs
Hi Folks,
I hit this assertion on one of my test boxes today:
[1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
[1167966.162659] ------------[ cut here ]------------
[1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
[1167966.168026] invalid opcode: 0000 [#4] SMP
[1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
[1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF D IO 3.13.0-rc2-0.9-default #28
[1167966.168103] Hardware name: SGI.COM SUMMIT/S2600GZ, BIOS SE5C600.86B.01.06.0002.110120121539 11/01/2012
[1167966.168139] Workqueue: xfs-data/sdb1 xfs_end_io [xfs]
[1167966.168141] task: ffff88065a0f0450 ti: ffff88009f2a8000 task.ti: ffff88009f2a8000
[1167966.168174] RIP: 0010:[<ffffffffa06cd39d>] [<ffffffffa06cd39d>] assfail+0x1d/0x30 [xfs]
[1167966.168176] RSP: 0018:ffff88009f2a9ce8 EFLAGS: 00010292
[1167966.168177] RAX: 0000000000000061 RBX: ffff88070085f180 RCX: 0000000000000000
[1167966.168179] RDX: ffff88083f68ed68 RSI: ffff88083f68d248 RDI: ffff88083f68d248
[1167966.168180] RBP: ffff88009f2a9ce8 R08: 00000000000226fb R09: 000000000000000a
[1167966.168182] R10: 00000000000226fb R11: 0000000000000006 R12: 0000000000000000
[1167966.168183] R13: ffff88070085f180 R14: ffff880829c21000 R15: ffff88067f367400
[1167966.168186] FS: 0000000000000000(0000) GS:ffff88083f680000(0000) knlGS:0000000000000000
[1167966.168187] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1167966.168189] CR2: 00007fd326883160 CR3: 00000003623a0000 CR4: 00000000000407e0
[1167966.168190] Stack:
[1167966.168196] ffff88009f2a9d38 ffffffffa072c3f0 ffff88009f2a9d38 ffff88009f2a9d58
[1167966.168200] ffff88070085f1c0 ffff88073d3d71c0 0000000000000000 0000000000000000
[1167966.168205] ffff880829c21000 0000000000000041 ffff88009f2a9d88 ffffffffa06d585c
[1167966.168206] Call Trace:
[1167966.168248] [<ffffffffa072c3f0>] xfs_log_commit_cil+0x170/0x180 [xfs]
[1167966.168281] [<ffffffffa06d585c>] xfs_trans_commit+0x15c/0x2a0 [xfs]
[1167966.168305] [<ffffffffa06b404b>] xfs_setfilesize+0x12b/0x130 [xfs]
[1167966.168327] [<ffffffffa06b4775>] xfs_end_io+0x75/0xf0 [xfs]
[1167966.168338] [<ffffffff81065d6c>] process_one_work+0x17c/0x3d0
[1167966.168343] [<ffffffff8106715b>] worker_thread+0x12b/0x410
[1167966.168349] [<ffffffff81067030>] ? manage_workers+0x1a0/0x1a0
[1167966.168356] [<ffffffff8106cbdb>] kthread+0xdb/0xf0
[1167966.168361] [<ffffffff8106cb00>] ? kthread_freezable_should_stop+0x70/0x70
[1167966.168369] [<ffffffff814c4c3c>] ret_from_fork+0x7c/0xb0
[1167966.168374] [<ffffffff8106cb00>] ? kthread_freezable_should_stop+0x70/0x70
[1167966.168401] Code: 00 00 00 48 89 45 c8 e8 42 fc ff ff c9 c3 55 41 89 d0 48 89 f1 48 89 fa 48 c7 c6 58 4e 75 a0 31 ff 48 89 e5 31 c0 e8 93 ff ff ff <0f> 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b
[1167966.168430] RIP [<ffffffffa06cd39d>] assfail+0x1d/0x30 [xfs]
[1167966.168431] RSP <ffff88009f2a9ce8>
[1167966.218543] ---[ end trace c4c3ac02d344970e ]---
That machine was running xfs/109 at the time, at commit bf3964c1.
-Ben
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: assert in xfs_log_commit_cil
From: Dave Chinner @ 2014-01-24 22:20 UTC (permalink / raw)
To: Ben Myers; +Cc: xfs
On Fri, Jan 24, 2014 at 01:37:02PM -0600, Ben Myers wrote:
> Hi Folks,
>
> I hit this assertion on one of my test boxes today:
>
> [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
I suppose that can happen if we are committing a transaction that
has no dirty objects in it. But that can't happen from
xfs_setfilesize(). That implies memory corruption or that someone has
busted rwsem behaviour.
> [1167966.162659] ------------[ cut here ]------------
> [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> [1167966.168026] invalid opcode: 0000 [#4] SMP
> [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF D IO 3.13.0-rc2-0.9-default #28
That's a rather heavily tainted kernel you are testing there. It's
got forced module loads, TAINT_DIE which means this isn't the first
oops the kernel has had, TAINT_FIRMWARE_WORKAROUND which means the
hardware has bios/errata issues that need fixing, and you're
building and using out-of-tree modules that are force loaded so
there's no guarantee that all kernel/module ABIs match precisely....
The key one is that TAINT_DIE is already set. Something has already
panicked on the machine, and once that happens all bets are off. Can
you reproduce this on a clean, untainted kernel?
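For readers decoding the `Tainted:` field in the quoted log line: each letter is one taint flag. The following is a minimal Python sketch of that lookup, assuming the letter meanings given in the kernel's tainted-kernels documentation. The `TAINT_FLAGS` table and the `decode_taint` helper are illustrative names, not kernel or xfsprogs interfaces, and only a subset of the flags is listed.

```python
# Hypothetical helper: maps the letters of a "Tainted:" string to their meaning.
# Only a subset of the kernel's taint letters is listed (assumption: meanings
# as described in the kernel's tainted-kernels documentation).
TAINT_FLAGS = {
    "P": "proprietary module loaded",
    "G": "no proprietary module loaded (placeholder for P)",
    "F": "module was force loaded (TAINT_FORCED_MODULE)",
    "D": "kernel has already oopsed or hit a BUG (TAINT_DIE)",
    "W": "a kernel warning has been issued (TAINT_WARN)",
    "I": "working around a firmware/BIOS bug (TAINT_FIRMWARE_WORKAROUND)",
    "O": "out-of-tree module loaded (TAINT_OOT_MODULE)",
}


def decode_taint(flags):
    """Return one description per non-blank letter in a taint string."""
    return [
        TAINT_FLAGS.get(ch, "flag %r not in this table" % ch)
        for ch in flags
        if not ch.isspace()
    ]


if __name__ == "__main__":
    # Taint string taken from the oops in the original report.
    for line in decode_taint("GF D IO"):
        print(line)
```

Run against the string from the report, `decode_taint("GF D IO")` returns five entries, one of which is the TAINT_DIE description. The `[#4]` in the `invalid opcode` line is the kernel's running oops count, which is consistent with TAINT_DIE already being set.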
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: assert in xfs_log_commit_cil
From: Ben Myers @ 2014-01-24 22:39 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
On Sat, Jan 25, 2014 at 09:20:17AM +1100, Dave Chinner wrote:
> On Fri, Jan 24, 2014 at 01:37:02PM -0600, Ben Myers wrote:
> > Hi Folks,
> >
> > I hit this assertion on one of my test boxes today:
> >
> > [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
>
> I suppose that can happen if we are committing a transaction that
> has no dirty objects in it. But that can't happen from
> xfs_setfilesize(). That implies memory corruption or that someone has
> busted rwsem behaviour.
>
> > [1167966.162659] ------------[ cut here ]------------
> > [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> > [1167966.168026] invalid opcode: 0000 [#4] SMP
> > [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> > [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF D IO 3.13.0-rc2-0.9-default #28
>
> That's a rather heavily tainted kernel you are testing there. It's
> got forced module loads, TAINT_DIE which means this isn't the first
> oops the kernel has had, TAINT_FIRMWARE_WORKAROUND which means the
> hardware has bios/errata issues that need fixing, and you're
> building and using out-of-tree modules that are force loaded so
> there's no guarantee that all kernel/module ABIs match precisely....
>
> The key one is that TAINT_DIE is already set. Something has already
> panicked on the machine, and once that happens all bets are off. Can
> you reproduce this on a clean, untainted kernel?
I'll give it a try...
* Re: assert in xfs_log_commit_cil
From: Andre Noll @ 2014-07-19 21:02 UTC (permalink / raw)
To: Dave Chinner; +Cc: Ben Myers, xfs
On Sat, Jan 25, 09:20, Dave Chinner wrote:
> > I hit this assertion on one of my test boxes today:
> >
> > [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
>
> I suppose that can happen if we are committing a transaction that
> has no dirty objects in it. But that can't happen from
> xfs_setfilesize(). That implies memory corruption or that someone has
> busted rwsem behaviour.
>
> > [1167966.162659] ------------[ cut here ]------------
> > [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> > [1167966.168026] invalid opcode: 0000 [#4] SMP
> > [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> > [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF D IO 3.13.0-rc2-0.9-default #28
>
> That's a rather heavily tainted kernel you are testing there.
FWIW, I'm also seeing this on an untainted 3.14.11 kernel:
[95004.073063] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: fs/xfs/xfs_log_cil.c, line: 647
[95004.073068] ------------[ cut here ]------------
[95004.073079] WARNING: CPU: 5 PID: 13368 at fs/xfs/xfs_message.c:99 xfs_log_commit_cil+0x371/0x5a0()
[95004.073081] Modules linked in: af_packet
[95004.073087] CPU: 5 PID: 13368 Comm: kworker/5:4 Not tainted 3.14.11 #18
[95004.073088] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b 03/01/2012
[95004.073094] Workqueue: xfs-data/dm-1 xfs_end_io
[95004.073096] 0000000000000000 ffffffff81760b6c ffffffff815b37a1 0000000000000000
[95004.073098] ffffffff8103c3f2 ffff880fe098b900 ffff881e6fcb0d00 ffff880fe098b900
[95004.073100] ffff881e6fcb0dd8 ffff8823bc512600 ffffffff81262db1 0000000000000000
[95004.073103] Call Trace:
[95004.073110] [<ffffffff815b37a1>] ? dump_stack+0x41/0x51
[95004.073114] [<ffffffff8103c3f2>] ? warn_slowpath_common+0x82/0xb0
[95004.073117] [<ffffffff81262db1>] ? xfs_log_commit_cil+0x371/0x5a0
[95004.073120] [<ffffffff8121687b>] ? xfs_trans_commit+0xcb/0x2c0
[95004.073123] [<ffffffff811f8c9c>] ? xfs_end_io+0x6c/0xe0
[95004.073126] [<ffffffff8105138e>] ? process_one_work+0x13e/0x3b0
[95004.073129] [<ffffffff81051e39>] ? worker_thread+0x109/0x350
[95004.073131] [<ffffffff81051d30>] ? manage_workers.isra.28+0x2c0/0x2c0
[95004.073134] [<ffffffff81057f0c>] ? kthread+0xbc/0xe0
[95004.073136] [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
[95004.073139] [<ffffffff815b92fc>] ? ret_from_fork+0x7c/0xb0
[95004.073141] [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
[95004.073142] ---[ end trace b591fe6842af909e ]---
Any hints?
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
* Re: assert in xfs_log_commit_cil
From: Dave Chinner @ 2014-07-21 0:04 UTC (permalink / raw)
To: Andre Noll; +Cc: Ben Myers, xfs
On Sat, Jul 19, 2014 at 11:02:45PM +0200, Andre Noll wrote:
> On Sat, Jan 25, 09:20, Dave Chinner wrote:
> > > I hit this assertion on one of my test boxes today:
> > >
> > > [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
> >
> > I suppose that can happen if we are committing a transaction that
> > has no dirty objects in it. But that can't happen from
> > xfs_setfilesize(). That implies memory corruption or that someone has
> > busted rwsem behaviour.
> >
> > > [1167966.162659] ------------[ cut here ]------------
> > > [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> > > [1167966.168026] invalid opcode: 0000 [#4] SMP
> > > [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> > > [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF D IO 3.13.0-rc2-0.9-default #28
> >
> > That's a rather heavily tainted kernel you are testing there.
>
> FWIW, I'm also seeing this on an untainted 3.14.11 kernel:
>
> [95004.073063] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: fs/xfs/xfs_log_cil.c, line: 647
> [95004.073068] ------------[ cut here ]------------
> [95004.073079] WARNING: CPU: 5 PID: 13368 at fs/xfs/xfs_message.c:99 xfs_log_commit_cil+0x371/0x5a0()
> [95004.073081] Modules linked in: af_packet
> [95004.073087] CPU: 5 PID: 13368 Comm: kworker/5:4 Not tainted 3.14.11 #18
> [95004.073088] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b 03/01/2012
> [95004.073094] Workqueue: xfs-data/dm-1 xfs_end_io
> [95004.073096] 0000000000000000 ffffffff81760b6c ffffffff815b37a1 0000000000000000
> [95004.073098] ffffffff8103c3f2 ffff880fe098b900 ffff881e6fcb0d00 ffff880fe098b900
> [95004.073100] ffff881e6fcb0dd8 ffff8823bc512600 ffffffff81262db1 0000000000000000
> [95004.073103] Call Trace:
> [95004.073110] [<ffffffff815b37a1>] ? dump_stack+0x41/0x51
> [95004.073114] [<ffffffff8103c3f2>] ? warn_slowpath_common+0x82/0xb0
> [95004.073117] [<ffffffff81262db1>] ? xfs_log_commit_cil+0x371/0x5a0
> [95004.073120] [<ffffffff8121687b>] ? xfs_trans_commit+0xcb/0x2c0
> [95004.073123] [<ffffffff811f8c9c>] ? xfs_end_io+0x6c/0xe0
> [95004.073126] [<ffffffff8105138e>] ? process_one_work+0x13e/0x3b0
> [95004.073129] [<ffffffff81051e39>] ? worker_thread+0x109/0x350
> [95004.073131] [<ffffffff81051d30>] ? manage_workers.isra.28+0x2c0/0x2c0
> [95004.073134] [<ffffffff81057f0c>] ? kthread+0xbc/0xe0
> [95004.073136] [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> [95004.073139] [<ffffffff815b92fc>] ? ret_from_fork+0x7c/0xb0
> [95004.073141] [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> [95004.073142] ---[ end trace b591fe6842af909e ]---
>
> Any hints?
More information required.
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: assert in xfs_log_commit_cil
From: Andre Noll @ 2014-07-21 7:40 UTC (permalink / raw)
To: Dave Chinner; +Cc: Ben Myers, xfs
On Mon, Jul 21, 10:04, Dave Chinner wrote:
> > FWIW, I'm also seeing this on an untainted 3.14.11 kernel:
> >
> > [95004.073063] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: fs/xfs/xfs_log_cil.c, line: 647
> > [95004.073068] ------------[ cut here ]------------
> > [95004.073079] WARNING: CPU: 5 PID: 13368 at fs/xfs/xfs_message.c:99 xfs_log_commit_cil+0x371/0x5a0()
> > [95004.073081] Modules linked in: af_packet
> > [95004.073087] CPU: 5 PID: 13368 Comm: kworker/5:4 Not tainted 3.14.11 #18
> > [95004.073088] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b 03/01/2012
> > [95004.073094] Workqueue: xfs-data/dm-1 xfs_end_io
> > [95004.073096] 0000000000000000 ffffffff81760b6c ffffffff815b37a1 0000000000000000
> > [95004.073098] ffffffff8103c3f2 ffff880fe098b900 ffff881e6fcb0d00 ffff880fe098b900
> > [95004.073100] ffff881e6fcb0dd8 ffff8823bc512600 ffffffff81262db1 0000000000000000
> > [95004.073103] Call Trace:
> > [95004.073110] [<ffffffff815b37a1>] ? dump_stack+0x41/0x51
> > [95004.073114] [<ffffffff8103c3f2>] ? warn_slowpath_common+0x82/0xb0
> > [95004.073117] [<ffffffff81262db1>] ? xfs_log_commit_cil+0x371/0x5a0
> > [95004.073120] [<ffffffff8121687b>] ? xfs_trans_commit+0xcb/0x2c0
> > [95004.073123] [<ffffffff811f8c9c>] ? xfs_end_io+0x6c/0xe0
> > [95004.073126] [<ffffffff8105138e>] ? process_one_work+0x13e/0x3b0
> > [95004.073129] [<ffffffff81051e39>] ? worker_thread+0x109/0x350
> > [95004.073131] [<ffffffff81051d30>] ? manage_workers.isra.28+0x2c0/0x2c0
> > [95004.073134] [<ffffffff81057f0c>] ? kthread+0xbc/0xe0
> > [95004.073136] [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> > [95004.073139] [<ffffffff815b92fc>] ? ret_from_fork+0x7c/0xb0
> > [95004.073141] [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> > [95004.073142] ---[ end trace b591fe6842af909e ]---
> >
> > Any hints?
>
> More information required.
Sure.
* xfsprogs version 3.1.7 from Ubuntu Precise
* x86_64, 2-way system, 16 AMD CPUs
* 256G RAM, /proc/meminfo is below
* ~250T storage on three XFS file systems, contents of /proc/mounts
and /proc/partitions below
* 7 x LSI HW Raid over 12x4T SATA disks
* 3 + 3 + 1 of these HW Raid arrays are combined with LVM into 3 VGs,
see pvs, vgs output below
* Hitachi/HGST 4T SATA HDS
* write cache enabled, even with bad BBU (system is connected
to UPS and Diesel emergency power)
* above backtrace indicates the problem is related to the LV dm-1,
xfs_info output of this 105T fs below
* the machine is an NFS server, connected are ~15 clients via 10GBit
ethernet (using sync mounts). These clients were heavily writing
to the fs when the problem occurred.
* no drive failures
* fs was grown twice
* user and project quotas enabled
Thanks
Andre
---
cat /proc/meminfo
~~~~~~~~~~~~~~~~~
MemTotal: 264144968 kB
MemFree: 1839520 kB
MemAvailable: 261512400 kB
Buffers: 241684 kB
Cached: 250252204 kB
SwapCached: 0 kB
Active: 96525128 kB
Inactive: 153982780 kB
Active(anon): 10140 kB
Inactive(anon): 14564 kB
Active(file): 96514988 kB
Inactive(file): 153968216 kB
Unevictable: 8052 kB
Mlocked: 0 kB
SwapTotal: 10485756 kB
SwapFree: 10485756 kB
Dirty: 31688 kB
Writeback: 16 kB
AnonPages: 24692 kB
Mapped: 7156 kB
Shmem: 12 kB
Slab: 9951456 kB
SReclaimable: 9433372 kB
SUnreclaim: 518084 kB
KernelStack: 2600 kB
PageTables: 3032 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 142558240 kB
Committed_AS: 199388 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 692260 kB
VmallocChunk: 34156662148 kB
DirectMap4k: 8704 kB
DirectMap2M: 2070528 kB
DirectMap1G: 266338304 kB
cat /proc/meminfo /proc/mounts
MemTotal: 264144968 kB
MemFree: 1521196 kB
MemAvailable: 261519256 kB
Buffers: 241696 kB
Cached: 250576284 kB
SwapCached: 0 kB
Active: 96549616 kB
Inactive: 154283584 kB
Active(anon): 10140 kB
Inactive(anon): 14564 kB
Active(file): 96539476 kB
Inactive(file): 154269020 kB
Unevictable: 8052 kB
Mlocked: 0 kB
SwapTotal: 10485756 kB
SwapFree: 10485756 kB
Dirty: 4 kB
Writeback: 0 kB
AnonPages: 24692 kB
Mapped: 7156 kB
Shmem: 12 kB
Slab: 9954412 kB
SReclaimable: 9433260 kB
SUnreclaim: 521152 kB
KernelStack: 2552 kB
PageTables: 3032 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 142558240 kB
Committed_AS: 199388 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 692260 kB
VmallocChunk: 34156662148 kB
DirectMap4k: 8704 kB
DirectMap2M: 2070528 kB
DirectMap1G: 266338304 kB
cat /proc/mounts
~~~~~~~~~~~~~~~~
rootfs / rootfs rw 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
/dev/mapper/toto-root / ext4 rw,relatime,data=ordered 0 0
devpts /dev/pts devpts rw,relatime,mode=600 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
none /dev/shm tmpfs rw,relatime 0 0
/dev/md0 /boot ext3 rw,relatime,data=ordered 0 0
/dev/mapper/toto-tmp /tmp ext4 rw,noatime,data=writeback 0 0
/dev/mapper/wizo-abt6_projects7 /ebio/abt6_projects7 xfs rw,noatime,attr2,inode64,usrquota,prjquota 0 0
/dev/mapper/zoff-abt6_projects8 /ebio/abt6_projects8 xfs rw,noatime,attr2,inode64,usrquota,prjquota 0 0
/dev/mapper/styx-abt6_sra /ebio/abt6_sra xfs rw,noatime,attr2,inode64,usrquota,prjquota 0 0
abt6-zserve.eb.local:/ebio/abt6/Users /ebio/abt6 nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.18.3.229,mountvers=3,mountport=683,mountproto=tcp,local_lock=none,addr=172.18.3.229 0 0
ohm:/ebio/abt6_ga2 /ebio/abt6_ga2 nfs rw,sync,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.18.3.247,mountvers=3,mountport=52911,mountproto=tcp,local_lock=none,addr=172.18.3.247 0 0
cat /proc/partitions
~~~~~~~~~~~~~~~~~~~~
major minor      #blocks name
    8     0  39062497280 sda
    8    32  39062497280 sdc
    8    16  39062497280 sdb
    8    48  39062497280 sdd
    8    64  39062497280 sde
    8    80  39062497280 sdf
    8    96  39062497280 sdg
    8   112    146523384 sdh
    8   113      1959898 sdh1
    8   114    144560902 sdh2
    8   128    146523384 sdi
    8   129      1959898 sdi1
    8   130    144560902 sdi2
    9     0      1959808 md0
    9     1    144560832 md1
  253     0  39062495232 dm-0
  253     1 112742891520 dm-1
  253     2     31457280 dm-2
  253     3     10485760 dm-3
  253     4     31457280 dm-4
  253     5 112742891520 dm-5
pvs
~~~
PV         VG   Fmt  Attr PSize   PFree
/dev/md1   toto lvm2 a-   137.86g 67.86g
/dev/sda   wizo lvm2 a-    36.38t     0
/dev/sdb   zoff lvm2 a-    36.38t     0
/dev/sdc   wizo lvm2 a-    36.38t  4.14t
/dev/sdd   zoff lvm2 a-    36.38t     0
/dev/sde   styx lvm2 a-    36.38t     0
/dev/sdf   zoff lvm2 a-    36.38t  4.14t
/dev/sdg   wizo lvm2 a-    36.38t     0
vgs
~~~
VG   #PV #LV #SN Attr   VSize   VFree
styx   1   1   0 wz--n-  36.38t     0
toto   1   3   0 wz--n- 137.86g 67.86g
wizo   3   1   0 wz--n- 109.14t  4.14t
zoff   3   1   0 wz--n- 109.14t  4.14t
xfs_info /ebio/abt6_projects8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
meta-data=/dev/mapper/zoff-abt6_projects8 isize=256    agcount=106, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=28185722880, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
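
As a cross-check of the "105T" figure mentioned in the list above, the sizes reported by /proc/partitions and xfs_info can be compared directly. This is a minimal Python sketch, assuming that /proc/partitions reports #blocks in 1 KiB units and that dm-1 is the device backing this filesystem (as the workqueue name in the backtrace suggests). The variable names are illustrative only.

```python
# Figures copied from the /proc/partitions and xfs_info output in this message.
TIB = 2 ** 40

dm1_kib_blocks = 112742891520  # "#blocks" for dm-1, assumed to be 1 KiB units
xfs_data_blocks = 28185722880  # "blocks=" on the xfs_info data line
xfs_block_size = 4096          # "bsize=" on the xfs_info data line

# Size of the block device and of the filesystem's data section, in bytes.
device_bytes = dm1_kib_blocks * 1024
fs_bytes = xfs_data_blocks * xfs_block_size

if __name__ == "__main__":
    print("device size: %d bytes (%.2f TiB)" % (device_bytes, device_bytes / TIB))
    print("fs size:     %d bytes (%.2f TiB)" % (fs_bytes, fs_bytes / TIB))
    print("sizes match:", device_bytes == fs_bytes)
```

Under those assumptions both values come out at 105 TiB, so the filesystem fills the logical volume after the two grow operations.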
--
The only person who always got his work done by Friday was Robinson Crusoe