Hi, yesterday while compiling LibreOffice on my amd64 box running Linux 3.0.0, those bugs happened: [692827.021788] BUG at fs/jfs/jfs_txnmgr.c:940 assert(mp->nohomeok > 0) [692827.021804] ------------[ cut here ]------------ [692827.021821] kernel BUG at fs/jfs/jfs_txnmgr.c:940! [692827.021832] invalid opcode: 0000 [#1] SMP [692827.021846] CPU 0 [692827.021851] Modules linked in: xt_state xt_dscp xt_hashlimit xt_conntrack xt_DSCP ath9k i915 mac80211 ath9k_common ath9k_hw drm_kms_helper ath cfg80211 cfbcopyarea e1000e video rfkill backlight cfbimgblt cfbfillrect [692827.021934] [692827.021940] Pid: 697, comm: jfsCommit Not tainted 3.0.0-OSS4.2 #1 /DQ45CB [692827.021967] RIP: 0010:[] [] txUnlock+0xc2/0x21a [692827.021991] RSP: 0018:ffff880231901e20 EFLAGS: 00010286 [692827.022003] RAX: 000000000000004e RBX: ffffc900080aef00 RCX: 000000000000ef51 [692827.022021] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff81ef0e60 [692827.022038] RBP: ffff880231901e60 R08: 0000000000000046 R09: 00000000fffffffe [692827.022054] R10: ffff8802b1901d3f R11: 0000000000000000 R12: ffff8801a8fca380 [692827.022070] R13: ffff88022e05b600 R14: ffffc900081db3e0 R15: 000000005063059c [692827.022087] FS: 0000000000000000(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000 [692827.022105] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [692827.022118] CR2: 00007f98337ce000 CR3: 0000000003f36000 CR4: 00000000000406f0 [692827.022135] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [692827.022152] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [692827.022168] Process jfsCommit (pid: 697, threadinfo ffff880231900000, task ffff880232746140) [692827.022185] Stack: [692827.022191] 2e84880231901e60 ffff88022e05b718 ffffffff816a41f3 ffff880232746140 [692827.022214] ffff880231901e80 ffff88023332b300 cccccccccccccccd ffffc900080aef00 [692827.022238] ffff880231901ee0 ffffffff81254de7 0000000000010bc0 ffff88022e05b600 [692827.022261] Call Trace: [692827.022274] [] ? _raw_spin_lock_irqsave+0xd/0x2a [692827.022290] [] jfs_lazycommit+0x13d/0x24e [692827.022305] [] ? try_to_wake_up+0x1c5/0x1c5 [692827.022318] [] ? txCommit+0x33b/0x33b [692827.022332] [] kthread+0x7d/0x85 [692827.022346] [] kernel_thread_helper+0x4/0x10 [692827.022361] [] ? kthread_worker_fn+0x137/0x137 [692827.022376] [] ? gs_change+0xb/0xb [692827.022387] Code: ff 41 8b 44 24 68 85 c0 7f 23 48 c7 c1 df 5a b1 81 ba ac 03 00 00 48 c7 c6 cb 5a b1 81 48 c7 c7 3a 4c b1 81 31 c0 e8 69 fd 44 00 <0f> 0b ff c8 41 89 44 24 68 85 c0 75 0a 49 8b 7c 24 58 e8 9b 68 [692827.022519] RIP [] txUnlock+0xc2/0x21a [692827.022533] RSP [692827.073901] ---[ end trace b4cb55fb5d49efbd ]--- [693433.340865] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366922] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366925] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366931] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366933] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366935] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366937] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366939] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366941] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366943] ERROR: (device md1): XT_GETPAGE: xtree page corrupt [693433.366945] BUG at fs/jfs/jfs_metapage.c:752 assert(mp->count) [693433.366955] ------------[ cut here ]------------ [693433.366970] kernel BUG at fs/jfs/jfs_metapage.c:752! [693433.366981] invalid opcode: 0000 [#2] SMP [693433.366993] CPU 2 [693433.366998] Modules linked in: xt_state xt_dscp xt_hashlimit xt_conntrack xt_DSCP ath9k i915 mac80211 ath9k_common ath9k_hw drm_kms_helper ath cfg80211 cfbcopyarea e1000e video rfkill backlight cfbimgblt cfbfillrect [693433.367069] [693433.367075] Pid: 10230, comm: shutdown Tainted: G D 3.0.0-OSS4.2 #1 /DQ45CB [693433.367098] RIP: 0010:[] [] release_metapage+0x5f/0xf7 [693433.367122] RSP: 0018:ffff8801a7a41ae8 EFLAGS: 00010296 [693433.367134] RAX: 0000000000000049 RBX: ffff8801a8fca380 RCX: 0000000000000aa3 [693433.367149] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff81ef0e60 [693433.367165] RBP: ffff8801a7a41af8 R08: 0000000000000046 R09: 00000000fffffffe [693433.367180] R10: ffff880227a41a07 R11: 0000000000000000 R12: ffffea0007051da8 [693433.367196] R13: ffff88022e901000 R14: 0000000000000001 R15: 000000000000034b [693433.367212] FS: 00007fcbd4f3f700(0000) GS:ffff88023bd00000(0000) knlGS:0000000000000000 [693433.367229] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [693433.367242] CR2: 00007fcbd4abb980 CR3: 00000001fbe6d000 CR4: 00000000000406e0 [693433.367258] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [693433.367273] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [693433.367289] Process shutdown (pid: 10230, threadinfo ffff8801a7a40000, task ffff88023328e140) [693433.367307] Stack: [693433.367312] ffff8801a8fca380 0000000000000000 ffff8801a7a41b98 ffffffff8124dcb2 [693433.367334] ffff8801a7a41b78 ffffffff81253dbc ffff880232cdbdc0 0000000200000002 [693433.367356] ffff880231483400 00000001812b07bf ffff8801a7a41cc0 0000402000000000 [693433.367378] Call Trace: [693433.367386] [] dtInsert+0x18c/0x19e [693433.367399] [] ? txLock+0x1bd/0x5ab [693433.367411] [] jfs_create+0x153/0x27f [693433.367425] [] ? do_lookup+0x46/0x28f [693433.367439] [] ? _raw_spin_lock+0x9/0xb [693433.367452] [] ? __d_lookup+0x91/0xcb [693433.367465] [] vfs_create+0x6b/0x8d [693433.367478] [] do_last+0x223/0x46b [693433.367491] [] path_openat+0xc2/0x334 [693433.367503] [] do_filp_open+0x38/0x84 [693433.367517] [] ? strncpy_from_user+0x34/0x37 [693433.367532] [] ? alloc_fd+0x6f/0x11a [693433.367545] [] do_sys_open+0x6f/0x101 [693433.367557] [] sys_open+0x1b/0x1d [693433.367570] [] system_call_fastpath+0x16/0x1b [693433.367583] Code: 56 de ff 48 8b 43 28 48 85 c0 75 21 48 c7 c1 d6 59 b1 81 ba f0 02 00 00 48 c7 c6 33 59 b1 81 48 c7 c7 3a 4c b1 81 e8 e8 25 45 00 <0f> 0b 48 ff c8 48 89 43 28 48 85 c0 75 75 83 7b 68 00 75 6f 48 [693433.367707] RIP [] release_metapage+0x5f/0xf7 [693433.367723] RSP [693433.373041] ---[ end trace b4cb55fb5d49efbe ]--- The system was in an unusable state afterwards - new network connections couldn't be established while already running ones continued to work fine. Also, neither reboot (segfaulted the first time, won't return the second time) or init 6 did work and I was forced to do a hard shutdown/reboot. I'm pretty sure that the system would have locked up completely after some time if I hadn't rebooted after the incident, as happened several times, but I've never been able to get a trace of what happened so far. As this bug happened at pretty high disk load (whilst compiling), I guess JFS has a few really nasty bugs. This is actually not the first bug I encounter with JFS. I've noticed, that: - performance degrades a lot with files > 50GB for some reason. This is especially annoying if copying full disk images. It starts up fine with maximum speed and drops to a few MB/s as the file gets bigger and bigger. - on older Kernels I've seen the jfsCommit threads causing a *lot* CPU load when working with big files. It got worse and worse the bigger the file became until the system hung completely (even AFTER the file was transferred successfully and a sync called.) It started with as low as 6% CPU load and increased steadily, even on multiple jfsCommit threads. I haven't tested this on 3.0.0 yet though, but every older version I ran had this kind of problem. All those problems may or may not be linked together. Something seems to be really wrong. Best regards, Mihai