linux-xfs.vger.kernel.org archive mirror
* assert in xfs_log_commit_cil
@ 2014-01-24 19:37 Ben Myers
  2014-01-24 22:20 ` Dave Chinner
  0 siblings, 1 reply; 6+ messages in thread
From: Ben Myers @ 2014-01-24 19:37 UTC
  To: xfs

Hi Folks,

I hit this assertion on one of my test boxes today:

[1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
[1167966.162659] ------------[ cut here ]------------
[1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
[1167966.168026] invalid opcode: 0000 [#4] SMP
[1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
[1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF     D   IO 3.13.0-rc2-0.9-default #28
[1167966.168103] Hardware name: SGI.COM SUMMIT/S2600GZ, BIOS SE5C600.86B.01.06.0002.110120121539 11/01/2012
[1167966.168139] Workqueue: xfs-data/sdb1 xfs_end_io [xfs]
[1167966.168141] task: ffff88065a0f0450 ti: ffff88009f2a8000 task.ti: ffff88009f2a8000
[1167966.168174] RIP: 0010:[<ffffffffa06cd39d>]  [<ffffffffa06cd39d>] assfail+0x1d/0x30 [xfs]
[1167966.168176] RSP: 0018:ffff88009f2a9ce8  EFLAGS: 00010292
[1167966.168177] RAX: 0000000000000061 RBX: ffff88070085f180 RCX: 0000000000000000
[1167966.168179] RDX: ffff88083f68ed68 RSI: ffff88083f68d248 RDI: ffff88083f68d248
[1167966.168180] RBP: ffff88009f2a9ce8 R08: 00000000000226fb R09: 000000000000000a
[1167966.168182] R10: 00000000000226fb R11: 0000000000000006 R12: 0000000000000000
[1167966.168183] R13: ffff88070085f180 R14: ffff880829c21000 R15: ffff88067f367400
[1167966.168186] FS:  0000000000000000(0000) GS:ffff88083f680000(0000) knlGS:0000000000000000
[1167966.168187] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1167966.168189] CR2: 00007fd326883160 CR3: 00000003623a0000 CR4: 00000000000407e0
[1167966.168190] Stack:
[1167966.168196]  ffff88009f2a9d38 ffffffffa072c3f0 ffff88009f2a9d38 ffff88009f2a9d58
[1167966.168200]  ffff88070085f1c0 ffff88073d3d71c0 0000000000000000 0000000000000000
[1167966.168205]  ffff880829c21000 0000000000000041 ffff88009f2a9d88 ffffffffa06d585c
[1167966.168206] Call Trace:
[1167966.168248]  [<ffffffffa072c3f0>] xfs_log_commit_cil+0x170/0x180 [xfs]
[1167966.168281]  [<ffffffffa06d585c>] xfs_trans_commit+0x15c/0x2a0 [xfs]
[1167966.168305]  [<ffffffffa06b404b>] xfs_setfilesize+0x12b/0x130 [xfs]
[1167966.168327]  [<ffffffffa06b4775>] xfs_end_io+0x75/0xf0 [xfs]
[1167966.168338]  [<ffffffff81065d6c>] process_one_work+0x17c/0x3d0
[1167966.168343]  [<ffffffff8106715b>] worker_thread+0x12b/0x410
[1167966.168349]  [<ffffffff81067030>] ? manage_workers+0x1a0/0x1a0
[1167966.168356]  [<ffffffff8106cbdb>] kthread+0xdb/0xf0
[1167966.168361]  [<ffffffff8106cb00>] ? kthread_freezable_should_stop+0x70/0x70
[1167966.168369]  [<ffffffff814c4c3c>] ret_from_fork+0x7c/0xb0
[1167966.168374]  [<ffffffff8106cb00>] ? kthread_freezable_should_stop+0x70/0x70
[1167966.168401] Code: 00 00 00 48 89 45 c8 e8 42 fc ff ff c9 c3 55 41 89 d0 48 89 f1 48 89 fa 48 c7 c6 58 4e 75 a0 31 ff 48 89 e5 31 c0 e8 93 ff ff ff <0f> 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b
[1167966.168430] RIP  [<ffffffffa06cd39d>] assfail+0x1d/0x30 [xfs]
[1167966.168431]  RSP <ffff88009f2a9ce8>
[1167966.218543] ---[ end trace c4c3ac02d344970e ]---

That machine was running xfs/109 at the time, at commit bf3964c1.

-Ben 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: assert in xfs_log_commit_cil
  2014-01-24 19:37 assert in xfs_log_commit_cil Ben Myers
@ 2014-01-24 22:20 ` Dave Chinner
  2014-01-24 22:39   ` Ben Myers
  2014-07-19 21:02   ` Andre Noll
  0 siblings, 2 replies; 6+ messages in thread
From: Dave Chinner @ 2014-01-24 22:20 UTC
  To: Ben Myers; +Cc: xfs

On Fri, Jan 24, 2014 at 01:37:02PM -0600, Ben Myers wrote:
> Hi Folks,
> 
> I hit this assertion on one of my test boxes today:
> 
> [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636

I suppose that can happen if we are committing a transaction that
has no dirty objects in it. But that can't happen from
xfs_setfilesize(). That implies memory corruption or that someone has
busted rwsem behaviour.

> [1167966.162659] ------------[ cut here ]------------
> [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> [1167966.168026] invalid opcode: 0000 [#4] SMP
> [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF     D   IO 3.13.0-rc2-0.9-default #28

That's a rather heavily tainted kernel you are testing there. It has
forced module loads; TAINT_DIE, which means this isn't the first
oops the kernel has had; TAINT_FIRMWARE_WORKAROUND, which means the
hardware has BIOS/errata issues that need fixing; and you're
building and using out-of-tree modules that are force-loaded, so
there's no guarantee that all kernel/module ABIs match precisely.

The key one is that TAINT_DIE is already set. Something has already
panicked on the machine, and once that happens all bets are off. Can
you reproduce this on a clean, untainted kernel?
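
For anyone decoding the "Tainted:" field of the oops by hand, here is a
minimal sketch in Python. It covers only the flags discussed in this
mail; TAINT_LETTERS and decode_taint() are hypothetical helper names
for illustration, not a kernel interface, and the table is not a
complete list of taint flags.

```python
# Hypothetical helper: maps the taint letters discussed in this mail
# to a short description. Not a complete list of kernel taint flags.
TAINT_LETTERS = {
    "F": "module was force loaded",
    "D": "kernel has already oopsed (TAINT_DIE)",
    "I": "firmware workaround applied (TAINT_FIRMWARE_WORKAROUND)",
    "O": "out-of-tree module loaded",
}


def decode_taint(flags):
    """Return descriptions for the known letters in a Tainted: string."""
    return [TAINT_LETTERS[ch] for ch in flags if ch in TAINT_LETTERS]


if __name__ == "__main__":
    # Flag string copied from the oops: "Tainted: GF     D   IO"
    for description in decode_taint("GF     D   IO"):
        print(description)
```

Letters that are not in the table (such as the leading "G") are
skipped rather than guessed at.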

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: assert in xfs_log_commit_cil
  2014-01-24 22:20 ` Dave Chinner
@ 2014-01-24 22:39   ` Ben Myers
  2014-07-19 21:02   ` Andre Noll
  1 sibling, 0 replies; 6+ messages in thread
From: Ben Myers @ 2014-01-24 22:39 UTC
  To: Dave Chinner; +Cc: xfs

On Sat, Jan 25, 2014 at 09:20:17AM +1100, Dave Chinner wrote:
> On Fri, Jan 24, 2014 at 01:37:02PM -0600, Ben Myers wrote:
> > Hi Folks,
> > 
> > I hit this assertion on one of my test boxes today:
> > 
> > [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
> 
> I suppose that can happen if we are committing a transaction that
> has no dirty objects in it. But that can't happen from
> xfs_setfilesize(). That implies memory corruption or that someone has
> busted rwsem behaviour.
> 
> > [1167966.162659] ------------[ cut here ]------------
> > [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> > [1167966.168026] invalid opcode: 0000 [#4] SMP
> > [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> > [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF     D   IO 3.13.0-rc2-0.9-default #28
> 
> That's a rather heavily tainted kernel you are testing there. It has
> forced module loads; TAINT_DIE, which means this isn't the first
> oops the kernel has had; TAINT_FIRMWARE_WORKAROUND, which means the
> hardware has BIOS/errata issues that need fixing; and you're
> building and using out-of-tree modules that are force-loaded, so
> there's no guarantee that all kernel/module ABIs match precisely.
> 
> The key one is that TAINT_DIE is already set. Something has already
> panicked on the machine, and once that happens all bets are off. Can
> you reproduce this on a clean, untainted kernel?

I'll give it a try...

* Re: assert in xfs_log_commit_cil
  2014-01-24 22:20 ` Dave Chinner
  2014-01-24 22:39   ` Ben Myers
@ 2014-07-19 21:02   ` Andre Noll
  2014-07-21  0:04     ` Dave Chinner
  1 sibling, 1 reply; 6+ messages in thread
From: Andre Noll @ 2014-07-19 21:02 UTC
  To: Dave Chinner; +Cc: Ben Myers, xfs


On Sat, Jan 25, 09:20, Dave Chinner wrote:
> > I hit this assertion on one of my test boxes today:
> > 
> > [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
> 
> I suppose that can happen if we are committing a transaction that
> has no dirty objects in it. But that can't happen from
> xfs_setfilesize(). That implies memory corruption or that someone has
> busted rwsem behaviour.
> 
> > [1167966.162659] ------------[ cut here ]------------
> > [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> > [1167966.168026] invalid opcode: 0000 [#4] SMP
> > [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> > [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF     D   IO 3.13.0-rc2-0.9-default #28
> 
> That's a rather heavily tainted kernel you are testing there.

FWIW, I'm also seeing this on an untainted 3.14.11 kernel:

[95004.073063] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: fs/xfs/xfs_log_cil.c, line: 647
[95004.073068] ------------[ cut here ]------------
[95004.073079] WARNING: CPU: 5 PID: 13368 at fs/xfs/xfs_message.c:99 xfs_log_commit_cil+0x371/0x5a0()
[95004.073081] Modules linked in: af_packet
[95004.073087] CPU: 5 PID: 13368 Comm: kworker/5:4 Not tainted 3.14.11 #18
[95004.073088] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b       03/01/2012
[95004.073094] Workqueue: xfs-data/dm-1 xfs_end_io
[95004.073096]  0000000000000000 ffffffff81760b6c ffffffff815b37a1 0000000000000000
[95004.073098]  ffffffff8103c3f2 ffff880fe098b900 ffff881e6fcb0d00 ffff880fe098b900
[95004.073100]  ffff881e6fcb0dd8 ffff8823bc512600 ffffffff81262db1 0000000000000000
[95004.073103] Call Trace:
[95004.073110]  [<ffffffff815b37a1>] ? dump_stack+0x41/0x51
[95004.073114]  [<ffffffff8103c3f2>] ? warn_slowpath_common+0x82/0xb0
[95004.073117]  [<ffffffff81262db1>] ? xfs_log_commit_cil+0x371/0x5a0
[95004.073120]  [<ffffffff8121687b>] ? xfs_trans_commit+0xcb/0x2c0
[95004.073123]  [<ffffffff811f8c9c>] ? xfs_end_io+0x6c/0xe0
[95004.073126]  [<ffffffff8105138e>] ? process_one_work+0x13e/0x3b0
[95004.073129]  [<ffffffff81051e39>] ? worker_thread+0x109/0x350
[95004.073131]  [<ffffffff81051d30>] ? manage_workers.isra.28+0x2c0/0x2c0
[95004.073134]  [<ffffffff81057f0c>] ? kthread+0xbc/0xe0
[95004.073136]  [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
[95004.073139]  [<ffffffff815b92fc>] ? ret_from_fork+0x7c/0xb0
[95004.073141]  [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
[95004.073142] ---[ end trace b591fe6842af909e ]---

Any hints?
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

* Re: assert in xfs_log_commit_cil
  2014-07-19 21:02   ` Andre Noll
@ 2014-07-21  0:04     ` Dave Chinner
  2014-07-21  7:40       ` Andre Noll
  0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2014-07-21  0:04 UTC
  To: Andre Noll; +Cc: Ben Myers, xfs

On Sat, Jul 19, 2014 at 11:02:45PM +0200, Andre Noll wrote:
> On Sat, Jan 25, 09:20, Dave Chinner wrote:
> > > I hit this assertion on one of my test boxes today:
> > > 
> > > [1167966.151275] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: /root/xfs/fs/xfs/xfs_log_cil.c, line: 636
> > 
> > I suppose that can happen if we are committing a transaction that
> > has no dirty objects in it. But that can't happen from
> > xfs_setfilesize(). That implies memory corruption or that someone has
> > busted rwsem behaviour.
> > 
> > > [1167966.162659] ------------[ cut here ]------------
> > > [1167966.168021] kernel BUG at /root/xfs/fs/xfs/xfs_message.c:107!
> > > [1167966.168026] invalid opcode: 0000 [#4] SMP
> > > [1167966.168081] Modules linked in: xfs(OF) ext2(F) dm_flakey(F) crc32c(F) libcrc32c(F) autofs4(F) cpufreq_conservative(F) cpufreq_userspace(F) cpufreq_powersave(F) microcode(F) fuse(F) loop(F) dm_mod(F) joydev(F) hid_generic(F) usbhid(F) hid(F) ehci_pci(F) ehci_hcd(F) iTCO_wdt(F) iTCO_vendor_support(F) ipv6(F) usbcore(F) sg(F) igb(F) isci(F) sr_mod(F) pcspkr(F) mptctl(F) cdrom(F) libsas(F) usb_common(F) ioatdma(F) ptp(F) i2c_i801(F) lpc_ich(F) mfd_core(F) pps_core(F) dca(F) rtc_cmos(F) acpi_cpufreq(F) wmi(F) button(F) mgag200(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) mpt2sas(F) raid_class(F) scsi_dh_emc(F) scsi_dh_rdac(F) scsi_dh_alua(F) scsi_dh_hp_sw(F) scsi_dh(F) thermal(F) sata_nv(F) processor(F) piix(F) mptsas(F) mptscsih(F) scsi_transport_sas(F) mptbase(F) megaraid_sas(F) ide_generic(F) ide_core(F) fan(F) thermal_sys(F) hwmon(F) ext3(F) jbd(F) mbcache(F) edd(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: scsi_debug]
> > > [1167966.168102] CPU: 10 PID: 13005 Comm: kworker/10:3 Tainted: GF     D   IO 3.13.0-rc2-0.9-default #28
> > 
> > That's a rather heavily tainted kernel you are testing there.
> 
> FWIW, I'm also seeing this on an untainted 3.14.11 kernel:
> 
> [95004.073063] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: fs/xfs/xfs_log_cil.c, line: 647
> [95004.073068] ------------[ cut here ]------------
> [95004.073079] WARNING: CPU: 5 PID: 13368 at fs/xfs/xfs_message.c:99 xfs_log_commit_cil+0x371/0x5a0()
> [95004.073081] Modules linked in: af_packet
> [95004.073087] CPU: 5 PID: 13368 Comm: kworker/5:4 Not tainted 3.14.11 #18
> [95004.073088] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b       03/01/2012
> [95004.073094] Workqueue: xfs-data/dm-1 xfs_end_io
> [95004.073096]  0000000000000000 ffffffff81760b6c ffffffff815b37a1 0000000000000000
> [95004.073098]  ffffffff8103c3f2 ffff880fe098b900 ffff881e6fcb0d00 ffff880fe098b900
> [95004.073100]  ffff881e6fcb0dd8 ffff8823bc512600 ffffffff81262db1 0000000000000000
> [95004.073103] Call Trace:
> [95004.073110]  [<ffffffff815b37a1>] ? dump_stack+0x41/0x51
> [95004.073114]  [<ffffffff8103c3f2>] ? warn_slowpath_common+0x82/0xb0
> [95004.073117]  [<ffffffff81262db1>] ? xfs_log_commit_cil+0x371/0x5a0
> [95004.073120]  [<ffffffff8121687b>] ? xfs_trans_commit+0xcb/0x2c0
> [95004.073123]  [<ffffffff811f8c9c>] ? xfs_end_io+0x6c/0xe0
> [95004.073126]  [<ffffffff8105138e>] ? process_one_work+0x13e/0x3b0
> [95004.073129]  [<ffffffff81051e39>] ? worker_thread+0x109/0x350
> [95004.073131]  [<ffffffff81051d30>] ? manage_workers.isra.28+0x2c0/0x2c0
> [95004.073134]  [<ffffffff81057f0c>] ? kthread+0xbc/0xe0
> [95004.073136]  [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> [95004.073139]  [<ffffffff815b92fc>] ? ret_from_fork+0x7c/0xb0
> [95004.073141]  [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> [95004.073142] ---[ end trace b591fe6842af909e ]---
> 
> Any hints?

More information required.

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: assert in xfs_log_commit_cil
  2014-07-21  0:04     ` Dave Chinner
@ 2014-07-21  7:40       ` Andre Noll
  0 siblings, 0 replies; 6+ messages in thread
From: Andre Noll @ 2014-07-21  7:40 UTC
  To: Dave Chinner; +Cc: Ben Myers, xfs


On Mon, Jul 21, 10:04, Dave Chinner wrote:
> > FWIW, I'm also seeing this on an untainted 3.14.11 kernel:
> > 
> > [95004.073063] XFS: Assertion failed: !list_empty(&cil->xc_cil), file: fs/xfs/xfs_log_cil.c, line: 647
> > [95004.073068] ------------[ cut here ]------------
> > [95004.073079] WARNING: CPU: 5 PID: 13368 at fs/xfs/xfs_message.c:99 xfs_log_commit_cil+0x371/0x5a0()
> > [95004.073081] Modules linked in: af_packet
> > [95004.073087] CPU: 5 PID: 13368 Comm: kworker/5:4 Not tainted 3.14.11 #18
> > [95004.073088] Hardware name: Supermicro H8DG6/H8DGi/H8DG6/H8DGi, BIOS 2.0b       03/01/2012
> > [95004.073094] Workqueue: xfs-data/dm-1 xfs_end_io
> > [95004.073096]  0000000000000000 ffffffff81760b6c ffffffff815b37a1 0000000000000000
> > [95004.073098]  ffffffff8103c3f2 ffff880fe098b900 ffff881e6fcb0d00 ffff880fe098b900
> > [95004.073100]  ffff881e6fcb0dd8 ffff8823bc512600 ffffffff81262db1 0000000000000000
> > [95004.073103] Call Trace:
> > [95004.073110]  [<ffffffff815b37a1>] ? dump_stack+0x41/0x51
> > [95004.073114]  [<ffffffff8103c3f2>] ? warn_slowpath_common+0x82/0xb0
> > [95004.073117]  [<ffffffff81262db1>] ? xfs_log_commit_cil+0x371/0x5a0
> > [95004.073120]  [<ffffffff8121687b>] ? xfs_trans_commit+0xcb/0x2c0
> > [95004.073123]  [<ffffffff811f8c9c>] ? xfs_end_io+0x6c/0xe0
> > [95004.073126]  [<ffffffff8105138e>] ? process_one_work+0x13e/0x3b0
> > [95004.073129]  [<ffffffff81051e39>] ? worker_thread+0x109/0x350
> > [95004.073131]  [<ffffffff81051d30>] ? manage_workers.isra.28+0x2c0/0x2c0
> > [95004.073134]  [<ffffffff81057f0c>] ? kthread+0xbc/0xe0
> > [95004.073136]  [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> > [95004.073139]  [<ffffffff815b92fc>] ? ret_from_fork+0x7c/0xb0
> > [95004.073141]  [<ffffffff81057e50>] ? kthread_freezable_should_stop+0x60/0x60
> > [95004.073142] ---[ end trace b591fe6842af909e ]---
> > 
> > Any hints?
> 
> More information required.

Sure.

* xfsprogs version 3.1.7 from Ubuntu Precise
* x86_64, 2-way system, 16 AMD CPUs
* 256G RAM, /proc/meminfo is below
* ~250T storage on three XFS file systems, contents of /proc/mounts
  and /proc/partitions below
* 7 x LSI HW Raid over 12x4T SATA disks
* 3 + 3 + 1 of these HW Raid arrays are combined with LVM into 3 VGs,
  see pvs, vgs output below
* Hitachi/HGST 4T SATA HDS
* write cache enabled, even with bad BBU (system is connected
  to UPS and Diesel emergency power)
* above backtrace indicates the problem is related to the LV dm-1,
  xfs_info of this 105T fs below
* the machine is an NFS server; ~15 clients are connected via 10GBit
  ethernet (using sync mounts). These clients were writing heavily
  to the fs when the problem occurred.
* no drive failures
* fs was grown twice
* user and project quotas enabled

Thanks
Andre
---
cat /proc/meminfo
~~~~~~~~~~~~~~~~~
MemTotal:       264144968 kB
MemFree:         1839520 kB
MemAvailable:   261512400 kB
Buffers:          241684 kB
Cached:         250252204 kB
SwapCached:            0 kB
Active:         96525128 kB
Inactive:       153982780 kB
Active(anon):      10140 kB
Inactive(anon):    14564 kB
Active(file):   96514988 kB
Inactive(file): 153968216 kB
Unevictable:        8052 kB
Mlocked:               0 kB
SwapTotal:      10485756 kB
SwapFree:       10485756 kB
Dirty:             31688 kB
Writeback:            16 kB
AnonPages:         24692 kB
Mapped:             7156 kB
Shmem:                12 kB
Slab:            9951456 kB
SReclaimable:    9433372 kB
SUnreclaim:       518084 kB
KernelStack:        2600 kB
PageTables:         3032 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    142558240 kB
Committed_AS:     199388 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      692260 kB
VmallocChunk:   34156662148 kB
DirectMap4k:        8704 kB
DirectMap2M:     2070528 kB
DirectMap1G:    266338304 kB
cat /proc/meminfo /proc/mounts
MemTotal:       264144968 kB
MemFree:         1521196 kB
MemAvailable:   261519256 kB
Buffers:          241696 kB
Cached:         250576284 kB
SwapCached:            0 kB
Active:         96549616 kB
Inactive:       154283584 kB
Active(anon):      10140 kB
Inactive(anon):    14564 kB
Active(file):   96539476 kB
Inactive(file): 154269020 kB
Unevictable:        8052 kB
Mlocked:               0 kB
SwapTotal:      10485756 kB
SwapFree:       10485756 kB
Dirty:                 4 kB
Writeback:             0 kB
AnonPages:         24692 kB
Mapped:             7156 kB
Shmem:                12 kB
Slab:            9954412 kB
SReclaimable:    9433260 kB
SUnreclaim:       521152 kB
KernelStack:        2552 kB
PageTables:         3032 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    142558240 kB
Committed_AS:     199388 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      692260 kB
VmallocChunk:   34156662148 kB
DirectMap4k:        8704 kB
DirectMap2M:     2070528 kB
DirectMap1G:    266338304 kB

cat /proc/mounts
~~~~~~~~~~~~~~~~
rootfs / rootfs rw 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
/dev/mapper/toto-root / ext4 rw,relatime,data=ordered 0 0
devpts /dev/pts devpts rw,relatime,mode=600 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
none /dev/shm tmpfs rw,relatime 0 0
/dev/md0 /boot ext3 rw,relatime,data=ordered 0 0
/dev/mapper/toto-tmp /tmp ext4 rw,noatime,data=writeback 0 0
/dev/mapper/wizo-abt6_projects7 /ebio/abt6_projects7 xfs rw,noatime,attr2,inode64,usrquota,prjquota 0 0
/dev/mapper/zoff-abt6_projects8 /ebio/abt6_projects8 xfs rw,noatime,attr2,inode64,usrquota,prjquota 0 0
/dev/mapper/styx-abt6_sra /ebio/abt6_sra xfs rw,noatime,attr2,inode64,usrquota,prjquota 0 0
abt6-zserve.eb.local:/ebio/abt6/Users /ebio/abt6 nfs rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.18.3.229,mountvers=3,mountport=683,mountproto=tcp,local_lock=none,addr=172.18.3.229 0 0
ohm:/ebio/abt6_ga2 /ebio/abt6_ga2 nfs rw,sync,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.18.3.247,mountvers=3,mountport=52911,mountproto=tcp,local_lock=none,addr=172.18.3.247 0 0

cat /proc/partitions
~~~~~~~~~~~~~~~~~~~~
major minor  #blocks  name

   8        0 39062497280 sda
   8       32 39062497280 sdc
   8       16 39062497280 sdb
   8       48 39062497280 sdd
   8       64 39062497280 sde
   8       80 39062497280 sdf
   8       96 39062497280 sdg
   8      112  146523384 sdh
   8      113    1959898 sdh1
   8      114  144560902 sdh2
   8      128  146523384 sdi
   8      129    1959898 sdi1
   8      130  144560902 sdi2
   9        0    1959808 md0
   9        1  144560832 md1
 253        0 39062495232 dm-0
 253        1 112742891520 dm-1
 253        2   31457280 dm-2
 253        3   10485760 dm-3
 253        4   31457280 dm-4
 253        5 112742891520 dm-5

pvs
~~~
  PV         VG   Fmt  Attr PSize   PFree 
  /dev/md1   toto lvm2 a-   137.86g 67.86g
  /dev/sda   wizo lvm2 a-    36.38t     0 
  /dev/sdb   zoff lvm2 a-    36.38t     0 
  /dev/sdc   wizo lvm2 a-    36.38t  4.14t
  /dev/sdd   zoff lvm2 a-    36.38t     0 
  /dev/sde   styx lvm2 a-    36.38t     0 
  /dev/sdf   zoff lvm2 a-    36.38t  4.14t
  /dev/sdg   wizo lvm2 a-    36.38t     0 

vgs
~~~
  VG   #PV #LV #SN Attr   VSize   VFree 
  styx   1   1   0 wz--n-  36.38t     0 
  toto   1   3   0 wz--n- 137.86g 67.86g
  wizo   3   1   0 wz--n- 109.14t  4.14t
  zoff   3   1   0 wz--n- 109.14t  4.14t

xfs_info /ebio/abt6_projects8
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
meta-data=/dev/mapper/zoff-abt6_projects8 isize=256    agcount=106, agsize=268435455 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=28185722880, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
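
As a sanity check on the numbers above, the following Python sketch
redoes the size arithmetic from the xfs_info and /proc/partitions
output. The variable names are made up for illustration, and it assumes
that /proc/partitions reports sizes in 1 KiB blocks and that "T" means
TiB.

```python
# Values copied from the xfs_info and /proc/partitions output above.
fs_blocks = 28185722880      # data blocks
block_size = 4096            # bsize, in bytes
ag_size = 268435455          # agsize, in blocks
dm1_kib = 112742891520       # dm-1 size; assumed to be in 1 KiB units

TIB = 2 ** 40

fs_tib = fs_blocks * block_size / TIB
dm1_tib = dm1_kib * 1024 / TIB
# Allocation groups needed to cover the data blocks (ceiling division).
ag_count = -(-fs_blocks // ag_size)

print(f"filesystem: {fs_tib:.2f} TiB")
print(f"dm-1:       {dm1_tib:.2f} TiB")
print(f"AGs:        {ag_count}")
```

Under those assumptions both sizes come out at 105 TiB, in line with
the "105T" figure, and the computed allocation group count agrees with
agcount=106.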
-- 
The only person who always got his work done by Friday was Robinson Crusoe
