* / is no longer Reiser4 :(
From: John Gilmore @ 2005-11-19 15:15 UTC
To: reiserfs-list

Following Hans's comment about the deleterious effects of 6% fragmentation, I
attempted a manual defrag of my hard disk.

While restoring the .tar file, I had nothing better to do than watch it. And a
good thing too! It got a recurring oops. About every other minute or so, it
would stop with a long kernel message that mostly scrolled off of the
screen... I thought those were supposed to show up in a log file somewhere
if possible, but I can't find it. And it should have been possible, as the
computer continued to run just fine.

These oopses caused some sort of data corruption - root wouldn't boot properly
afterwards. So I reformatted as ext3 and untarred my root again. That worked
fine, so I know it wasn't corruption of the tar file.

I took a photograph, and I'll try to type in some of it. Just looking at the
names of the procedures, it looks like memory pressure made reiser4 flush,
and then some of the lower level functions tried to allocate memory and
failed. But since I don't have the top of the oops message, I can't tell.

Wait - I could've stopped the scrolling with ^S, scrolled back with ^pageup,
and photoed the whole thing! Aaaargghh....

Well, I'm not redoing it right now, I need to be getting to bed.

I may try it again later - but then maybe I'll update to 2.6.14-mm2 with patch
from namesys first...

Here's the (tail end of the) oops message, sans addresses and offsets because
I'm feeling lazy and I'm in a hurry:

mempool_alloc+0x3a/0xe0
__split_bio+0x128/0x190
in_drive_list
dm_request
generic_make_request
submit_bio
do_IRQ
reiser4_clear_page_dirty
write_jnodes_to_disk_extent
write_jnode_list
write_fq
flush_current_atom
flush_some_atom
writeout
reiser4_sync_inodes
writeback_inodes
background_writeout
pdflush
__pdflush
pdflush
background_writeout
kthread
kthread
kernel_thread_helper

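A hand-typed trace like the one above mixes frames that carry an offset (`symbol+0xOFF/0xSIZE`) with frames that are bare symbol names. As a reading aid only, here is a minimal Python sketch that splits such text into frames and prints the call chain from the outermost caller inward. The helper name `parse_frames`, the regular expression, and the shortened `TRACE` excerpt are illustrative assumptions, not part of any kernel tool.

```python
import re

# Matches "symbol+0xOFF/0xSIZE" or a bare "symbol" (most frames in the
# hand-typed trace have no offset). This pattern is an assumption about
# the typed format, not an official oops grammar.
FRAME_RE = re.compile(
    r"^(?P<sym>[A-Za-z_]\w*)(?:\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+))?$"
)


def parse_frames(text):
    """Return (symbol, offset, size) tuples in the order they appear."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skip blank or unrecognised lines
        off = int(match.group("off"), 16) if match.group("off") else None
        size = int(match.group("size"), 16) if match.group("size") else None
        frames.append((match.group("sym"), off, size))
    return frames


# Shortened excerpt of the trace from the message above (innermost frame first).
TRACE = """\
mempool_alloc+0x3a/0xe0
__split_bio+0x128/0x190
dm_request
generic_make_request
submit_bio
write_jnodes_to_disk_extent
write_jnode_list
write_fq
flush_current_atom
"""

frames = parse_frames(TRACE)

# Print the outermost caller first so the chain reads top-down.
for sym, off, size in reversed(frames):
    print(sym if off is None else f"{sym} (+{off:#x} of {size:#x} bytes)")
```

Read top-down, the excerpt shows the reiser4 flush path handing a request to the block layer and device-mapper, which matches the poster's reading that the flush ended in a memory allocation.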
* Re: / is no longer Reiser4 :(
From: Hans Reiser @ 2005-11-21 17:57 UTC
To: Alexander Zarochentsev
Cc: John Gilmore, reiserfs-list

zam, please look into this.

Hans

John Gilmore wrote:

>Following Hans's comment about the deleterious effects of 6% fragmentation, I
>attempted a manual defrag of my hard disk.
>
>While restoring the .tar file, I had nothing better to do than watch it. And a
>good thing too! It got a recurring oops. About every other minute or so, it
>would stop with a long kernel message that mostly scrolled off of the
>screen... I thought those were supposed to show up in a log file somewhere
>if possible, but I can't find it. And it should have been possible, as the
>computer continued to run just fine.
>
>These oopses caused some sort of data corruption - root wouldn't boot properly
>afterwards. So I reformatted as ext3 and untarred my root again. That worked
>fine, so I know it wasn't corruption of the tar file.
>
>I took a photograph, and I'll try to type in some of it. Just looking at the
>names of the procedures, it looks like memory pressure made reiser4 flush,
>and then some of the lower level functions tried to allocate memory and
>failed. But since I don't have the top of the oops message, I can't tell.
>
>Wait - I could've stopped the scrolling with ^S, scrolled back with ^pageup,
>and photoed the whole thing! Aaaargghh....
>
>Well, I'm not redoing it right now, I need to be getting to bed.
>
>I may try it again later - but then maybe I'll update to 2.6.14-mm2 with patch
>from namesys first...
>
>Here's the (tail end of the) oops message, sans addresses and offsets because
>I'm feeling lazy and I'm in a hurry:
>
>mempool_alloc+0x3a/0xe0
>__split_bio+0x128/0x190
>in_drive_list
>dm_request
>generic_make_request
>submit_bio
>do_IRQ
>reiser4_clear_page_dirty
>write_jnodes_to_disk_extent
>write_jnode_list
>write_fq
>flush_current_atom
>flush_some_atom
>writeout
>reiser4_sync_inodes
>writeback_inodes
>background_writeout
>pdflush
>__pdflush
>pdflush
>background_writeout
>kthread
>kthread
>kernel_thread_helper

* Re: / is no longer Reiser4 :(
From: Alexander Zarochentsev @ 2005-11-21 19:15 UTC
To: Hans Reiser
Cc: John Gilmore, reiserfs-list

Hi

On Monday 21 November 2005 20:57, Hans Reiser wrote:
> zam, please look into this.
>
> Hans
>
> John Gilmore wrote:
> >Following Hans's comment about the deleterious effects of 6% fragmentation,
> >I attempted a manual defrag of my hard disk.
> >
> >While restoring the .tar file, I had nothing better to do than watch it.
> >And a good thing too! It got a recurring oops. About every other minute
> >or so, it would stop with a long kernel message that mostly scrolled off
> >of the screen... I thought those were supposed to show up in a log file
> >somewhere if possible, but I can't find it. And it should have been
> >possible, as the computer continued to run just fine.
> >
> >These oopses caused some sort of data corruption - root wouldn't boot

one bug responsible for fs corruption was fixed recently.
the fix is in 2.6.14-mm2 already.

> >properly afterwards. So I reformatted as ext3 and untarred my root again.
> >That worked fine, so I know it wasn't corruption of the tar file.
> >
> >I took a photograph, and I'll try to type in some of it. Just looking at
> >the names of the procedures, it looks like memory pressure made reiser4
> >flush, and then some of the lower level functions tried to allocate
> >memory and failed. But since I don't have the top of the oops message, I
> >can't tell.
> >
> >Wait - I could've stopped the scrolling with ^S, scrolled back with
> >^pageup, and photoed the whole thing! Aaaargghh....
> >
> >Well, I'm not redoing it right now, I need to be getting to bed.
> >
> >I may try it again later - but then maybe I'll update to 2.6.14-mm2 with
> >patch from namesys first...
> >
> >Here's the (tail end of the) oops message, sans addresses and offsets
> >because I'm feeling lazy and I'm in a hurry:
> >
> >mempool_alloc+0x3a/0xe0
> >__split_bio+0x128/0x190
> >in_drive_list
> >dm_request
> >generic_make_request
> >submit_bio
> >do_IRQ
> >reiser4_clear_page_dirty
> >write_jnodes_to_disk_extent
> >write_jnode_list
> >write_fq
> >flush_current_atom
> >flush_some_atom
> >writeout
> >reiser4_sync_inodes
> >writeback_inodes
> >background_writeout
> >pdflush
> >__pdflush
> >pdflush
> >background_writeout
> >kthread
> >kthread
> >kernel_thread_helper

--
Alex.

* Re: / is no longer Reiser4 :(
From: Jake Maciejewski @ 2005-11-21 19:23 UTC
To: Alexander Zarochentsev
Cc: Hans Reiser, John Gilmore, reiserfs-list

On Mon, 2005-11-21 at 22:15 +0300, Alexander Zarochentsev wrote:
> Hi
>
> On Monday 21 November 2005 20:57, Hans Reiser wrote:
> > zam, please look into this.
> >
> > Hans
> >
> > John Gilmore wrote:
> > >Following Hans's comment about the deleterious effects of 6% fragmentation,
> > >I attempted a manual defrag of my hard disk.
> > >
> > >While restoring the .tar file, I had nothing better to do than watch it.
> > >And a good thing too! It got a recurring oops. About every other minute
> > >or so, it would stop with a long kernel message that mostly scrolled off
> > >of the screen... I thought those were supposed to show up in a log file
> > >somewhere if possible, but I can't find it. And it should have been
> > >possible, as the computer continued to run just fine.
> > >
> > >These oopses caused some sort of data corruption - root wouldn't boot
>
> one bug responsible for fs corruption was fixed recently.
> the fix is in 2.6.14-mm2 already.

Can we get a fix for vanilla? I haven't had problems yet, but I don't
want to run mm unless absolutely necessary, and lately I've lost
confidence in the "apply mm patches to vanilla and hope it works"
approach.

> > >properly afterwards. So I reformatted as ext3 and untarred my root again.
> > >That worked fine, so I know it wasn't corruption of the tar file.
> > >
> > >I took a photograph, and I'll try to type in some of it. Just looking at
> > >the names of the procedures, it looks like memory pressure made reiser4
> > >flush, and then some of the lower level functions tried to allocate
> > >memory and failed. But since I don't have the top of the oops message, I
> > >can't tell.
> > >
> > >Wait - I could've stopped the scrolling with ^S, scrolled back with
> > >^pageup, and photoed the whole thing! Aaaargghh....
> > >
> > >Well, I'm not redoing it right now, I need to be getting to bed.
> > >
> > >I may try it again later - but then maybe I'll update to 2.6.14-mm2 with
> > >patch from namesys first...
> > >
> > >Here's the (tail end of the) oops message, sans addresses and offsets
> > >because I'm feeling lazy and I'm in a hurry:
> > >
> > >mempool_alloc+0x3a/0xe0
> > >__split_bio+0x128/0x190
> > >in_drive_list
> > >dm_request
> > >generic_make_request
> > >submit_bio
> > >do_IRQ
> > >reiser4_clear_page_dirty
> > >write_jnodes_to_disk_extent
> > >write_jnode_list
> > >write_fq
> > >flush_current_atom
> > >flush_some_atom
> > >writeout
> > >reiser4_sync_inodes
> > >writeback_inodes
> > >background_writeout
> > >pdflush
> > >__pdflush
> > >pdflush
> > >background_writeout
> > >kthread
> > >kthread
> > >kernel_thread_helper

--
Jake Maciejewski <maciejej@msoe.edu>

* Re: / is no longer Reiser4 :(
From: Alexander Zarochentsev @ 2005-11-21 19:56 UTC
To: Jake Maciejewski
Cc: Hans Reiser, John Gilmore, reiserfs-list

On Monday 21 November 2005 22:23, Jake Maciejewski wrote:
> On Mon, 2005-11-21 at 22:15 +0300, Alexander Zarochentsev wrote:
> > Hi
> >
> > On Monday 21 November 2005 20:57, Hans Reiser wrote:
> > > zam, please look into this.
> > >
> > > Hans
> > >
> > > John Gilmore wrote:
> > > >Following Hans's comment about the deleterious effects of 6%
> > > >fragmentation, I attempted a manual defrag of my hard disk.
> > > >
> > > >While restoring the .tar file, I had nothing better to do than watch
> > > >it. And a good thing too! It got a recurring oops. About every other
> > > >minute or so, it would stop with a long kernel message that mostly
> > > >scrolled off of the screen... I thought those were supposed to show
> > > >up in a log file somewhere if possible, but I can't find it. And it
> > > >should have been possible, as the computer continued to run just
> > > >fine.
> > > >
> > > >These oopses caused some sort of data corruption - root wouldn't boot
> >
> > one bug responsible for fs corruption was fixed recently.
> > the fix is in 2.6.14-mm2 already.
>
> Can we get a fix for vanilla? I haven't had problems yet, but I don't
> want to run mm unless absolutely necessary, and lately I've lost
> confidence in the "apply mm patches to vanilla and hope it works"
> approach.

reiser4-for-2.6.14-1.patch.gz contains the fix as well; the initial fix was:

--- a/as_ops.c
+++ b/as_ops.c
@@ -229,7 +229,7 @@ int reiser4_invalidatepage(struct page *
 		node = jprivate(page);
 		spin_lock_jnode(node);
 		if (!JF_ISSET(node, JNODE_DIRTY) && !JF_ISSET(node, JNODE_FLUSH_QUEUED) &&
-		    !JF_ISSET(node, JNODE_WRITEBACK)) {
+		    !JF_ISSET(node, JNODE_WRITEBACK) && !JF_ISSET(node, JNODE_OVRWR)) {
 			/* there is not need to capture */
 			jref(node);
 			JF_SET(node, JNODE_HEARD_BANSHEE);

our git repo shows that the bug was added on 16 August.

> > > >properly afterwards. So I reformatted as ext3 and untarred my root
> > > >again. That worked fine, so I know it wasn't corruption of the tar
> > > >file.
> > > >
> > > >I took a photograph, and I'll try to type in some of it. Just looking
> > > >at the names of the procedures, it looks like memory pressure made
> > > >reiser4 flush, and then some of the lower level functions tried to
> > > >allocate memory and failed. But since I don't have the top of the
> > > >oops message, I can't tell.
> > > >
> > > >Wait - I could've stopped the scrolling with ^S, scrolled back with
> > > >^pageup, and photoed the whole thing! Aaaargghh....
> > > >
> > > >Well, I'm not redoing it right now, I need to be getting to bed.
> > > >
> > > >I may try it again later - but then maybe I'll update to 2.6.14-mm2
> > > >with patch from namesys first...
> > > >
> > > >Here's the (tail end of the) oops message, sans addresses and offsets
> > > >because I'm feeling lazy and I'm in a hurry:
> > > >
> > > >mempool_alloc+0x3a/0xe0
> > > >__split_bio+0x128/0x190
> > > >in_drive_list
> > > >dm_request
> > > >generic_make_request
> > > >submit_bio
> > > >do_IRQ
> > > >reiser4_clear_page_dirty
> > > >write_jnodes_to_disk_extent
> > > >write_jnode_list
> > > >write_fq
> > > >flush_current_atom
> > > >flush_some_atom
> > > >writeout
> > > >reiser4_sync_inodes
> > > >writeback_inodes
> > > >background_writeout
> > > >pdflush
> > > >__pdflush
> > > >pdflush
> > > >background_writeout
> > > >kthread
> > > >kthread
> > > >kernel_thread_helper

--
Alex.

* Re: / is no longer Reiser4 :(
From: Hans Reiser @ 2005-11-21 23:17 UTC
To: Alexander Zarochentsev
Cc: John Gilmore, reiserfs-list

Alexander Zarochentsev wrote:

>one bug responsible for fs corruption was fixed recently.
>the fix is in 2.6.14-mm2 already.

Then send an email titled something like "Data corruption bug was fixed,
be sure to upgrade!" to our list.

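For readers who want to see what the one-line fix posted earlier in the thread changes, here is a small Python model of the condition in `reiser4_invalidatepage`. It is only a sketch: the flag names mirror the `JNODE_*` names in the patch, but the bit values, the `JnodeFlag` class, and the two helper functions are hypothetical, and the thread does not explain what each flag means inside reiser4.

```python
from enum import Flag, auto


class JnodeFlag(Flag):
    # Names mirror the JNODE_* flags that appear in the patch;
    # the bit values here are arbitrary (hypothetical).
    DIRTY = auto()
    FLUSH_QUEUED = auto()
    WRITEBACK = auto()
    OVRWR = auto()


def skips_capture_before(state):
    """Pre-fix test: true when none of the three original flags is set."""
    blocking = JnodeFlag.DIRTY | JnodeFlag.FLUSH_QUEUED | JnodeFlag.WRITEBACK
    return not (state & blocking)


def skips_capture_after(state):
    """Post-fix test: JNODE_OVRWR now also rules out the shortcut branch."""
    blocking = (JnodeFlag.DIRTY | JnodeFlag.FLUSH_QUEUED
                | JnodeFlag.WRITEBACK | JnodeFlag.OVRWR)
    return not (state & blocking)


# A node that has only OVRWR set is the case the patch treats differently.
state = JnodeFlag.OVRWR
print("before fix, takes shortcut:", skips_capture_before(state))
print("after fix, takes shortcut: ", skips_capture_after(state))
```

In this model the only state whose outcome differs between the two versions is one where `OVRWR` is set and the other three flags are clear, which is exactly the case the added `!JF_ISSET(node, JNODE_OVRWR)` term excludes from the "no need to capture" branch.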
* Re: / is no longer Reiser4 :(
From: John Gilmore @ 2005-11-23 9:34 UTC
To: reiserfs-list

On Monday 21 November 2005 23:17, Hans Reiser wrote:
> Alexander Zarochentsev wrote:
> >one bug responsible for fs corruption was fixed recently.
> >the fix is in 2.6.14-mm2 already.
>
> Then send an email titled something like "Data corruption bug was fixed,
> be sure to upgrade!" to our list.

I tried it again and got the complete oops text. It's a "soft lockup detected
on CPU#0" message, which leads me to believe that it's a side effect of the
sync taking a long time. I've got 1.5 gigs of memory and a very slow hard
disk. hdparm -tT gives ~4.5 MB/s, or up to 8 MB/s if I have everything turned
on that I can. I can't enable DMA, because hdparm refuses to do so, and I
haven't figured out which parameters to pass to which modules to make it so
that I can. It's also possible that my hardware is buggy, and the driver knows
that and is thus refusing to enable DMA and corrupt data.

I've got 2.6.14-mm2 with the latest reiser4 patch, but it's giving me loads of
garbage like:

*** Warning: "plugin_set_compression" [fs/reiser4/plugin/compress/compress_plugins.ko] undefined!

I think that maybe the source was corrupted in the restore process (I'll have
to do it again---later).

I'm moving on Friday/Saturday, and I don't have arrangements for internet
access at the new digs yet, so if you've got questions, ask them now...

BUG: soft lockup detected on CPU#0

Pid: 4582, comm: pdflush
EIP: 0060:[<f892da3b>] CPU: 0
EIP is at ide_pio_sector+0xcb/0x120 [ide_core]
 EFLAGS: 00000282    Not tainted  (2.6.14-mm1)
EAX: ec5bc000 EBX: eb531000 ECX: 00000000 EDX: 000001f0
ESI: 00000004 EDI: f893f120 EBP: 00000282 DS: 007b ES: 007b
CR0: 8005003b CR2: 0812b008 CR3: 298b8000 CR4: 000006d0
 [<f892dadd>] ide_pio_multi+0x4d/0x70 [ide_core]
 [<f892de61>] task_out_intr+0x101/0x140 [ide_core]
 [<f892836d>] ide_intr+0x7d/0x180 [ide_core]
 [<f892dd60>] task_out_intr+0x0/0x140 [ide_core]
 [<c013c22d>] handle_IRQ_event+0x3d/0x70
 [<c013c2c3>] __do_IRQ+0x63/0xc0
 [<c0105379>] do_IRQ+0x19/0x30
 [<c0103b1a>] common_interrupt+0x1a/0x20
 [<c013d6ca>] unlock_page+0xa/0x30
 [<f8a4c130>] write_jnodes_to_disk_extent+0x1b0/0x2c0 [reiser4]
 [<f8a4c4c9>] write_jnode_list+0xa9/0x110 [reiser4]
 [<f8a51483>] write_fq+0x53/0x70 [reiser4]
 [<f9a47d19>] write_prepped_nodes+0x39/0x40 [reiser4]
 [<f8a48f0c>] squeeze_right_twig+0x10c/0x160 [reiser4]
 [<f8a49156>] squeeze_right_twig_and_advance_coord+0x26/0x80 [reiser4]
 [<f8a49a84>] handle_pos_end_of_twig+0xd4/0x290 [reiser4]
 [<f8a810d5>] item_length_by_coord+0x15/0x20 [reiser4]
 [<f8a49f28>] squalloc+0x28/0x60 [reiser4]
 [<f8a4829f>] jnode_flush+0x2cf/0x340 [reiser4]
 [<f8a48519>] flush_current_atom+0xf9/0x250 [reiser4]
 [<f8a457cf>] flush_some_atom+0xaf/0x2c0 [reiser4]
 [<f8a565a4>] writeout+0x124/0x200 [reiser4]
 [<f8a52c04>] reiser4_sync_inodes+0x64/0xf0 [reiser4]
 [<c0184b0d>] writeback_inodes+0x4d/0xb0
 [<c0144118>] background_writeout+0x98/0xe0
 [<c0144cb0>] pdflush+0x0/0x30
 [<c0144c0d>] __pdflush+0xbd/0x160
 [<c0144cd6>] pdflush+0x26/0x30
 [<c0144080>] background_writeout+0x0/0xe0
 [<c0144080>] background_writeout+0x0/0xe0
 [<c012f1e6>] kthread+0xb6/0xc0
 [<c012f130>] kthread+0x0/0xc0
 [<c0101369>] kernel_thread_helper+0x5/0xc

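To put the poster's "sync taking a long time" suspicion into numbers, here is a back-of-the-envelope Python sketch using the two throughput figures he reports (~4.5 MB/s and up to 8 MB/s). The amounts of dirty data are hypothetical, chosen only for illustration; the thread does not say how much data was waiting to be written, and the calculation ignores seek time and CPU cost of PIO transfers.

```python
def writeout_seconds(dirty_mb, throughput_mb_s):
    """Time to write dirty_mb megabytes at a sustained throughput_mb_s."""
    return dirty_mb / throughput_mb_s


# Throughput figures quoted in the message above (hdparm -tT results).
RATES_MB_S = (4.5, 8.0)

# Hypothetical amounts of dirty data; the machine has 1.5 GB of RAM,
# so these are plausible orders of magnitude, not measured values.
DIRTY_MB = (64, 256, 1024)

for dirty in DIRTY_MB:
    for rate in RATES_MB_S:
        seconds = writeout_seconds(dirty, rate)
        print(f"{dirty:5d} MB at {rate:3.1f} MB/s: about {seconds:4.0f} s")
```

Even under these simplified assumptions, flushing a few hundred megabytes through a disk running without DMA takes on the order of a minute, which is at least consistent with long stretches spent in the IDE PIO path shown at the top of the trace.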