* [2.4.1] system goes glacial, Reiser on /usr doesn't sync @ 2001-02-20 10:16 ` Kevin Turner 2001-02-20 10:30 ` Keith Owens ` (3 more replies) 0 siblings, 4 replies; 8+ messages in thread From: Kevin Turner @ 2001-02-20 10:16 UTC (permalink / raw) To: linux-kernel Version: Linux version 2.4.1-pre12 (gcc version 2.95.3 20010125 (prerelease)) Possible suspect players: dpkg seems to trigger the bug ReiserFS is the partition that doesn't sync binfmt_misc shows up in the call traces. Symptoms: The system assumes glacial speeds. If you're *lucky*, you'll see one widget re-paint in X before the next ice age. Ctrl-alt-delete is unresponsive, as are attempts to start proccesses via the network or joystick port. Keypresses to programs such as getty are not echoed. All program output to console and network is stopped dead. If you leave for a several-hour-long coffee break and come back to it, there's still no evidence that you banged on the keyboard. What does work: If not in X, I may switch between VTs. Keypresses are echoed on VTs without running processes. And the machine still pings. Occasionally (a few times a minute at most), you'll see the HDD LED blink. When it happens: Sometimes, not always, when accessing the package database or updating installed packages with the Debian's "dpkg" tool. When it *didn't* happen: before I switched to kernel 2.4.x. Debugging information: After I had about enough of this, I recompiled the kernel with CONFIG_REISERFS_CHECK and Magic SysRq. ReiserFS gives no debugging messages of any sort during these episodes. Magic SysRq, however, does work. Magic-tErminate clears everything up and the system resumes normal operation. Magic-Sync appears not to work, as "Ok" and "Done" don't appear. But after Magic-E, they do show up. Logs show that one partition actually sync'd before termination, the others didn't. The partition where it stalled just so happens to be a ReiserFS partition (3.5.x format). kernel: SysRq: Emergency Sync kernel: Syncing device 03:01 ... OK kernel: Syncing device 16:02 ... <6>SysRq: Terminate All Tasks kernel: OK kernel: Syncing device 16:01 ... OK kernel: Syncing device 03:03 ... OK 03:01 - ext2 / 16:02 - ReiFS /usr (note: /var is a symlink to /usr/root-var) 16:01 - ext2 /home 03:03 - ext2 /other In ReiserFS's defense, /usr and /var are the patitions that dpkg was most likely to have open files on, so perhaps that was it. When I asked magic about running processes, it said that, among other things, there was one apt-get, one dpkg, and three instances of dpkg-deb. Of those, dpkg was the only one in 'R' state, the others were 'S'. Here's the call trace: dpkg R 00000000 0 9278 9152 9285 (NOTLB) Call Trace: [__alloc_pages+372/720] [grow_buffers+62/364] [refill_freelist+28/48] [refill_freelist+41/48] [getblk+242/264] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-621596/96] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-970133/96] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-750728/96] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-976628/96] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-918150/96] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-976370/96] [binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-1080871/96] [permission+42/48] [vfs_link+141/192] [sys_link+202/300] Why binfmt_misc? I'll be burned if I know. It is true that one of the packages I was installing this run was related to binfmt_misc ("binfmt-support"), but that wasn't the case the previous times this bug has happened. On the other hand, I don't have call traces for those times. Linux version 2.4.1-pre12 (gcc version 2.95.3 20010125 (prerelease)) dpkg version 1.3.8.1 arch is i586 (GenuineIntel Pentium MMX) disks are IDE 48 MB RAM swap > 200 MB Motherboard: ABIT-TX5 Host bridge: Intel Corporation 430TX - 82439TX MTXC (rev 01) IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01) More logs are posted at: http://olug.cs.oberlin.edu/~kturner/bugs/glacial-2.4.1.html The kernel's configuration is at: http://olug.cs.oberlin.edu/~kturner/bugs/config-2.4.1-pre12 And yes, I will try upgrading to a newer kernel. But since I can't quite reproduce this on demand yet, I thought I'd report it now while the logbits are fresh. -- Kevin Turner <acapnotic@users.sourceforge.net> | OpenPGP encryption welcome here I usually subscribe to lists I post to, but for LKML I'm making an exception =) Please feel free to CC me on all follow-ups. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-20 10:16 ` [2.4.1] system goes glacial, Reiser on /usr doesn't sync Kevin Turner @ 2001-02-20 10:30 ` Keith Owens 2001-02-20 11:33 ` David ` (2 subsequent siblings) 3 siblings, 0 replies; 8+ messages in thread From: Keith Owens @ 2001-02-20 10:30 UTC (permalink / raw) To: He-Who-Is-Not-Subscribed-to-LKML; +Cc: linux-kernel On Tue, 20 Feb 2001 02:16:09 -0800, Kevin Turner <acapnotic@users.sourceforge.net> wrote: >[binfmt_misc:__insmod_binfmt_misc_O/lib/modules/2.4.1-pre12/kernel/fs/bi+-621596/96] > >Why binfmt_misc? I'll be burned if I know. Because klogd conversion of addresses to symbols is a pile of crud. Turn off klogd symbol conversion (klogd -x) and run the raw addresses through ksymoops. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-20 10:16 ` [2.4.1] system goes glacial, Reiser on /usr doesn't sync Kevin Turner 2001-02-20 10:30 ` Keith Owens @ 2001-02-20 11:33 ` David 2001-02-20 12:24 ` Kevin Turner 2001-02-20 16:25 ` Chris Mason 2001-02-20 12:34 ` Kevin Turner 2001-02-23 23:35 ` Dwayne C. Litzenberger 3 siblings, 2 replies; 8+ messages in thread From: David @ 2001-02-20 11:33 UTC (permalink / raw) To: He-Who-Is-Not-Subscribed-to-LKML; +Cc: linux-kernel Kevin Turner wrote: > Version: > Linux version 2.4.1-pre12 (gcc version 2.95.3 20010125 (prerelease)) > > Possible suspect players: > dpkg seems to trigger the bug > ReiserFS is the partition that doesn't sync > binfmt_misc shows up in the call traces. > > Symptoms: > > The system assumes glacial speeds. If you're *lucky*, you'll see one > widget re-paint in X before the next ice age. Ctrl-alt-delete is > unresponsive, as are attempts to start proccesses via the network or > joystick port. Keypresses to programs such as getty are not echoed. > All program output to console and network is stopped dead. If you leave > for a several-hour-long coffee break and come back to it, there's still > no evidence that you banged on the keyboard. Wild shot in the dark....I'd lay odds that you had about 6-7 Megs free in your buffers/cache line, yes? -d ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-20 11:33 ` David @ 2001-02-20 12:24 ` Kevin Turner 2001-02-20 16:25 ` Chris Mason 1 sibling, 0 replies; 8+ messages in thread From: Kevin Turner @ 2001-02-20 12:24 UTC (permalink / raw) To: linux-kernel On Tue, Feb 20, 2001 at 03:33:33AM -0800, David wrote: > Wild shot in the dark....I'd lay odds that you had about 6-7 Megs free > in your buffers/cache line, yes? David! You're psychic! SysRq: Show Memory Mem-info: Free pages: 712kB ( 0kB HighMem) ( Active: 1779, inactive_dirty: 1507, inactive_clean: 0, free: 178 (192 384 576) ) 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB = 512kB) 0*4kB 1*8kB 0*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB = 200kB) = 0kB) Swap cache: add 73994, delete 73947, find 19040/117035 Free swap: 206664kB 12288 pages of RAM 0 pages of HIGHMEM 653 reserved pages 4740 pages shared 47 pages swap cached 0 pages in page table cache Buffer memory: 6028kB ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-20 11:33 ` David 2001-02-20 12:24 ` Kevin Turner @ 2001-02-20 16:25 ` Chris Mason 1 sibling, 0 replies; 8+ messages in thread From: Chris Mason @ 2001-02-20 16:25 UTC (permalink / raw) To: David, He-Who-Is-Not-Subscribed-to-LKML; +Cc: linux-kernel On Tuesday, February 20, 2001 03:33:33 AM -0800 David <david@blue-labs.org> wrote: > Kevin Turner wrote: > >> Version: >> Linux version 2.4.1-pre12 (gcc version 2.95.3 20010125 (prerelease)) >> >> Possible suspect players: >> dpkg seems to trigger the bug >> ReiserFS is the partition that doesn't sync >> binfmt_misc shows up in the call traces. >> >> Symptoms: >> >> The system assumes glacial speeds. If you're *lucky*, you'll see one >> widget re-paint in X before the next ice age. Ctrl-alt-delete is >> unresponsive, as are attempts to start proccesses via the network or >> joystick port. Keypresses to programs such as getty are not echoed. >> All program output to console and network is stopped dead. If you leave >> for a several-hour-long coffee break and come back to it, there's still >> no evidence that you banged on the keyboard. > > > Wild shot in the dark....I'd lay odds that you had about 6-7 Megs free in > your buffers/cache line, yes? > David, have any of Rik's patches helped here? -chris ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-20 10:16 ` [2.4.1] system goes glacial, Reiser on /usr doesn't sync Kevin Turner 2001-02-20 10:30 ` Keith Owens 2001-02-20 11:33 ` David @ 2001-02-20 12:34 ` Kevin Turner 2001-02-23 23:35 ` Dwayne C. Litzenberger 3 siblings, 0 replies; 8+ messages in thread From: Kevin Turner @ 2001-02-20 12:34 UTC (permalink / raw) To: linux-kernel Thanks to Keith who pointed out that "klogd conversion of addresses to symbols is a pile of crud." Here's what I'm getting out of ksymoops now. It's not as much, since it's just the parts I copied down by hand... (I'll get it next time. Whenever that happens to be. Installing the ksymoops package didn't trigger it.) from magic-showPc: EIP: 0010:[<c0126533>] CPU: 0 EFLAGS: 00000207 Using defaults from ksymoops -t elf32-i386 -a i386 EAX: c0206f98 EBX: c10af1e0 ECX: c10af1fc EDX: c1091200 ESI: c10af1fc EDI: 00000000 EBP: 000005ac DS: 0018 ES: 0018 CR0: 8005003b CR2: 08052beb CR3: 00839000 CR4: 00000010 Call Trace: [<c0126dec>] [<c0126f66>] [<c0127c58>] [<c0127d0c>] [<c01283b1>] [<c011e0ea>] [<c011e140>] [<c011e485>] [<c010f83c>] [<c010f704>] [<c0110336>] [<c0127d5e>] [<c0127d84>] [<c0139cea>] [<c0138fe8>] [<c0108ee4>] [<c013a3e4>] [<c0108da3>] Warning (Oops_read): Code line not seen, dumping what data is available >>EIP; c0126533 <page_launder+2d3/8a0> <===== Trace; c0126dec <do_try_to_free_pages+34/7c> Trace; c0126f66 <try_to_free_pages+22/2c> Trace; c0127c58 <__alloc_pages+230/2d0> Trace; c0127d0c <__get_free_pages+14/20> Trace; c01283b1 <read_swap_cache_async+31/a0> Trace; c011e0ea <swapin_readahead+8e/c4> Trace; c011e140 <do_swap_page+20/114> Trace; c011e485 <handle_mm_fault+fd/154> Trace; c010f83c <do_page_fault+138/3fc> Trace; c010f704 <do_page_fault+0/3fc> Trace; c0110336 <schedule+26a/394> Trace; c0127d5e <__free_pages+1a/1c> Trace; c0127d84 <free_pages+24/28> Trace; c0139cea <poll_freewait+3a/44> Trace; c0138fe8 <do_fcntl+14c/204> Trace; c0108ee4 <error_code+34/40> Trace; c013a3e4 <sys_select+3bc/494> Trace; c0108da3 <system_call+33/40> The running dpkg process: Warning (Oops_read): Code line not seen, dumping what data is available Trace; c0127b9c <__alloc_pages+174/2d0> Trace; c012f932 <grow_buffers+3e/16c> Trace; c012dad4 <refill_freelist+1c/30> Trace; c012dae1 <refill_freelist+29/30> Trace; c012dec2 <getblk+f2/108> Trace; c38b7378 <[reiserfs].bss.end+45979/af661> Trace; c386226b <[reiserfs]do_journal_end+633/ab4> Trace; c3897b78 <[reiserfs].bss.end+26179/af661> Trace; c386090c <[reiserfs]do_journal_begin_r+188/258> Trace; c012c0bc <filp_close+5c/64> Trace; c0108da3 <system_call+33/40> The running dpkg process, several minutes later is the same, but the "bss.end" line above the 'do_journal_end' call reads: Trace; c38b73e4 <[reiserfs].bss.end+459e5/af661> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-20 10:16 ` [2.4.1] system goes glacial, Reiser on /usr doesn't sync Kevin Turner ` (2 preceding siblings ...) 2001-02-20 12:34 ` Kevin Turner @ 2001-02-23 23:35 ` Dwayne C. Litzenberger 2001-02-24 2:21 ` Dwayne C. Litzenberger 3 siblings, 1 reply; 8+ messages in thread From: Dwayne C. Litzenberger @ 2001-02-23 23:35 UTC (permalink / raw) To: Kevin Turner; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 460 bytes --] I have the same problem with 2.4.1 (and 2.4.2). Two processes that are actively using the disk (multiple files) seem to deadlock the system. Killing the right process (SysRq-K) seems to fix things. I'm kind of new to kernel debugging. Anyone want to guide me through it? -- Dwayne C. Litzenberger - dlitz@dlitz.net - Please always Cc to me when replying to me on the lists. - See the mail headers for GPG/advertising/homepage information. [-- Attachment #2: Type: application/pgp-signature, Size: 240 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [2.4.1] system goes glacial, Reiser on /usr doesn't sync 2001-02-23 23:35 ` Dwayne C. Litzenberger @ 2001-02-24 2:21 ` Dwayne C. Litzenberger 0 siblings, 0 replies; 8+ messages in thread From: Dwayne C. Litzenberger @ 2001-02-24 2:21 UTC (permalink / raw) To: Kevin Turner; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 373 bytes --] I think Alan fixed it. I've been running 2.4.2-ac3 under heavy load for about a half-hour now under heavy disk usage (apt-get + 2 kernel builds + Netscape/X11), and it hasn't locked up yet. -- Dwayne C. Litzenberger - dlitz@dlitz.net - Please always Cc to me when replying to me on the lists. - See the mail headers for GPG/advertising/homepage information. [-- Attachment #2: Type: application/pgp-signature, Size: 240 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2001-02-24 2:21 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20010220011536.A10778@troglodyte.menefee>
2001-02-20 10:16 ` [2.4.1] system goes glacial, Reiser on /usr doesn't sync Kevin Turner
2001-02-20 10:30 ` Keith Owens
2001-02-20 11:33 ` David
2001-02-20 12:24 ` Kevin Turner
2001-02-20 16:25 ` Chris Mason
2001-02-20 12:34 ` Kevin Turner
2001-02-23 23:35 ` Dwayne C. Litzenberger
2001-02-24 2:21 ` Dwayne C. Litzenberger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox