From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751176AbWBZCjf (ORCPT ); Sat, 25 Feb 2006 21:39:35 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751187AbWBZCjf (ORCPT ); Sat, 25 Feb 2006 21:39:35 -0500
Received: from central.webforum.de ([194.126.158.21]:61316 "EHLO server.webforum.de") by vger.kernel.org with ESMTP id S1751176AbWBZCje (ORCPT ); Sat, 25 Feb 2006 21:39:34 -0500
Subject: Oops in 2.6.15 with OpenVZ 025stab014 with high IO usage
From: Friedrich Schaeuffelhut
To: linux-kernel@vger.kernel.org
Content-Type: text/plain
Date: Sun, 26 Feb 2006 03:12:55 +0100
Message-Id: <1140919976.8790.39.camel@localhost>
Mime-Version: 1.0
X-Mailer: Evolution 2.4.2.1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hi there,

I'm using Linux 2.6.15 with OpenVZ 025stab014 from
http://openvz.org/download/beta/kernel/

Under high IO usage this kernel oopses. I'm not a kernel developer, so I
hoped someone could explain these Oopses to me. I'm also in contact with
the OpenVZ developers.

The Oops is reproducible by running a command that stresses file IO,
such as:

  stress -t 36000 -c 1 -i 1 -m 2 --vm-hang 5 -d 2 --hdd-bytes 800M

The problem does not appear in a vanilla 2.6.15.4 kernel.

The system setup is:

CPU:        Intel(R) Pentium(R) 4 CPU 2.60GHz
RAM:        1GB (48 runs of a mem test didn't show any errors)
Board:      Tyan 5102, 82801EB/ER (ICH5/ICH5R)
SCSI:       Adaptec AHA-3960D / AIC-7899A U160/m
            2 x Vendor: SEAGATE Model: ST336607LC
SATA:       1 x Vendor: ATA Model: Maxtor 6V300F0
USB-Disk:   1 x Vendor: TinyDisk Model: 2005-12-08
Filesystem: EXT3 on software raid 1 on SCSI Disks
            SWAP on software raid 1 on SCSI Disks

The machine was rebooted after each Oops.

Please CC: me (fries@desert.lnp.org) any responses to this post.

Thank you for your time
Friedrich

Here are two of the kernel Oopses:

Oops: 0000 [#1]
CPU: 1
EIP: 0060:[] Not tainted VLI
EFLAGS: 00010002 (2.6.15-025stab014)
eax: f02667cc ebx: 51377c51 ecx: fffffffa edx: 0002eec0
esi: 000200d2 edi: f02667c8 ebp: 0002eec0 esp: d96efd34
ds: 007b es: 007b ss: 0068
Stack: 00000000 000200d2 d96efd74 bfbc66ae c013a3bf f02667c8 0002eec0 c149731c
       00001000 00000000 0002eec0 00001000 f026671c c02dd820 f02667c8 f61a4500
       c14479c0 000c02ee d96efe88 3f26671c 0000000a 00000000 c1249384 c12493a8
Call Trace:
 [] generic_file_buffered_write+0x141/0x535
 [] __generic_file_aio_write_nolock+0x430/0x45e
 [] scsi_prep_fn+0x16e/0x1c7
 [] generic_file_aio_write+0x66/0xb7
 [] ext3_file_write+0x26/0x94
 [] do_sync_write+0xb8/0xeb
 [] scsi_io_completion+0x1c5/0x3a5
 [] sd_rw_intr+0x21c/0x224
 [] autoremove_wake_function+0x0/0x3a
 [] scsi_finish_command+0x13/0x83
 [] inotify_dentry_parent_queue_event+0x29/0x8f
 [] vfs_write+0x89/0x120
 [] sys_write+0x3b/0x63
 [] sysenter_past_esp+0x54/0x75
Code: fb 89 d0 5b c3 55 57 56 53 8b 7c 24 14 8b 6c 24 18 8d 47 10 e8 86 bf 15 00 55 8d 47 04 50 e8 51 4d 08 00 89 c3 58 85 db 5a 74 56 <8b> 03 89 da f6 c4 40 74 03 8b 53 0c f0 ff 42 04 f0 0f ba 2b 00

>>EIP; c0138c32 <=====
>>eax; f02667cc
>>ebx; 51377c51
>>ecx; fffffffa <__kernel_rt_sigreturn+1bba/????>
>>edi; f02667c8
>>esp; d96efd34

Trace; c013a3bf
Trace; c013abe3 <__generic_file_aio_write_nolock+430/45e>
Trace; c0206528
Trace; c013ae43
Trace; c018a78c
Trace; c01525b9
Trace; c0205f85
Trace; c0225eb5
Trace; c012b5fe
Trace; c020236b
Trace; c0173f90
Trace; c0152675
Trace; c01527aa
Trace; c01027eb

This architecture has variable length instructions, decoding before eip
is unreliable, take these instructions with a pinch of salt.

Code;  c0138c07
00000000 <_EIP>:
Code;  c0138c07    0:   fb                  sti
Code;  c0138c08    1:   89 d0               mov    %edx,%eax
Code;  c0138c0a    3:   5b                  pop    %ebx
Code;  c0138c0b    4:   c3                  ret
Code;  c0138c0c    5:   55                  push   %ebp
Code;  c0138c0d    6:   57                  push   %edi
Code;  c0138c0e    7:   56                  push   %esi
Code;  c0138c0f    8:   53                  push   %ebx
Code;  c0138c10    9:   8b 7c 24 14         mov    0x14(%esp),%edi
Code;  c0138c14    d:   8b 6c 24 18         mov    0x18(%esp),%ebp
Code;  c0138c18   11:   8d 47 10            lea    0x10(%edi),%eax
Code;  c0138c1b   14:   e8 86 bf 15 00      call   15bf9f <_EIP+0x15bf9f>
Code;  c0138c20   19:   55                  push   %ebp
Code;  c0138c21   1a:   8d 47 04            lea    0x4(%edi),%eax
Code;  c0138c24   1d:   50                  push   %eax
Code;  c0138c25   1e:   e8 51 4d 08 00      call   84d74 <_EIP+0x84d74>
Code;  c0138c2a   23:   89 c3               mov    %eax,%ebx
Code;  c0138c2c   25:   58                  pop    %eax
Code;  c0138c2d   26:   85 db               test   %ebx,%ebx
Code;  c0138c2f   28:   5a                  pop    %edx
Code;  c0138c30   29:   74 56               je     81 <_EIP+0x81>

This decode from eip onwards should be reliable

Code;  c0138c32
00000000 <_EIP>:
Code;  c0138c32 <=====    0:   8b 03               mov    (%ebx),%eax <=====
Code;  c0138c34    2:   89 da               mov    %ebx,%edx
Code;  c0138c36    4:   f6 c4 40            test   $0x40,%ah
Code;  c0138c39    7:   74 03               je     c <_EIP+0xc>
Code;  c0138c3b    9:   8b 53 0c            mov    0xc(%ebx),%edx
Code;  c0138c3e    c:   f0 ff 42 04         lock incl 0x4(%edx)
Code;  c0138c42   10:   f0 0f ba 2b 00      lock btsl $0x0,(%ebx)

Oops: 0002 [#1]
CPU: 0
EIP: 0060:[] Not tainted VLI
EFLAGS: 00010206 (2.6.15-025stab014)
eax: 00104029 ebx: 4324772d ecx: c81fb2a8 edx: e77de4e4
esi: c81fb2a8 edi: c16194c4 ebp: e77de4e4 esp: c1bf9dd4
ds: 007b es: 007b ss: 0068
Stack: e77de4e4 c81fb2a8 c019af85 c81fb2a8 f6ff0400 e77de4e4 00000000 c16194c4
       f6ff0400 f0f7ea60 c1bf9f34 c018cf4b f6ff0400 c16194c4 000000d0 f0f7ea60
       c16194c4 c014244c c16194c4 000000d0 00000001 00000001 00000011 00000000
Call Trace:
 [] journal_try_to_free_buffers+0x88/0xc3
 [] ext3_releasepage+0x59/0x63
 [] shrink_list+0x259/0x3ab
 [] shrink_cache+0xff/0x24b
 [] shrink_zone+0xab/0xc4
 [] balance_pgdat+0x209/0x35f
 [] kswapd+0xee/0xf3
 [] autoremove_wake_function+0x0/0x3a
 [] ret_from_fork+0x6/0x14
 [] autoremove_wake_function+0x0/0x3a
 [] kswapd+0x0/0xf3
 [] kernel_thread_helper+0x5/0xb
Code: a9 00 00 20 00 75 08 0f 0b 36 00 f9 4b 2a c0 f0 0f ba 33 15 5b c3 56 53 8b 74 24 0c 8b 1e eb 0b f3 90 8b 03 a9 00 00 20 00 75 f5 0f ba 2b 15 19 c0 85 c0 75 ec 83 7e 04 00 7f 29 68 f3 9d 2a

>>EIP; c019fba5 <=====
>>eax; 00104029
>>ebx; 4324772d
>>ecx; c81fb2a8
>>edx; e77de4e4
>>esi; c81fb2a8
>>edi; c16194c4
>>ebp; e77de4e4
>>esp; c1bf9dd4

Trace; c019af85
Trace; c018cf4b
Trace; c014244c
Trace; c014273e
Trace; c0142d8d
Trace; c01432b2
Trace; c01434f6
Trace; c012b5fe
Trace; c010274e
Trace; c012b5fe
Trace; c0143408
Trace; c0100f35

This architecture has variable length instructions, decoding before eip
is unreliable, take these instructions with a pinch of salt.

Code;  c019fb7a
00000000 <_EIP>:
Code;  c019fb7a    0:   a9 00 00 20 00      test   $0x200000,%eax
Code;  c019fb7f    5:   75 08               jne    f <_EIP+0xf>
Code;  c019fb81    7:   0f 0b               ud2a
Code;  c019fb83    9:   36                  ss
Code;  c019fb84    a:   00 f9               add    %bh,%cl
Code;  c019fb86    c:   4b                  dec    %ebx
Code;  c019fb87    d:   2a c0               sub    %al,%al
Code;  c019fb89    f:   f0 0f ba 33 15      lock btrl $0x15,(%ebx)
Code;  c019fb8e   14:   5b                  pop    %ebx
Code;  c019fb8f   15:   c3                  ret
Code;  c019fb90   16:   56                  push   %esi
Code;  c019fb91   17:   53                  push   %ebx
Code;  c019fb92   18:   8b 74 24 0c         mov    0xc(%esp),%esi
Code;  c019fb96   1c:   8b 1e               mov    (%esi),%ebx
Code;  c019fb98   1e:   eb 0b               jmp    2b <_EIP+0x2b>
Code;  c019fb9a   20:   f3 90               pause
Code;  c019fb9c   22:   8b 03               mov    (%ebx),%eax
Code;  c019fb9e   24:   a9 00 00 20 00      test   $0x200000,%eax
Code;  c019fba3   29:   75 f5               jne    20 <_EIP+0x20>

This decode from eip onwards should be reliable

Code;  c019fba5
00000000 <_EIP>:
Code;  c019fba5 <=====    0:   f0 0f ba 2b 15      lock btsl $0x15,(%ebx) <=====
Code;  c019fbaa    5:   19 c0               sbb    %eax,%eax
Code;  c019fbac    7:   85 c0               test   %eax,%eax
Code;  c019fbae    9:   75 ec               jne    fffffff7 <_EIP+0xfffffff7>
Code;  c019fbb0    b:   83 7e 04 00         cmpl   $0x0,0x4(%esi)
Code;  c019fbb4    f:   7f 29               jg     3a <_EIP+0x3a>
Code;  c019fbb6   11:   68                  .byte 0x68
Code;  c019fbb7   12:   f3 9d               repz popf
Code;  c019fbb9   14:   2a                  .byte 0x2a

Kernel panic - not syncing: Fatal exception
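
For anyone reading the two reports above, here is a minimal sketch of how the number after "Oops:" can be decoded, assuming it is the i386 page-fault error code (low three bits: present, write, user). It also checks whether the %ebx value dereferenced by the faulting instruction in each report lies in kernel address space, assuming the default 3G/1G split with PAGE_OFFSET at 0xC0000000 (the kernel addresses in the traces are consistent with that, but it is not stated in the report). The helper names are hypothetical, and this is an aid for reading the dumps, not a diagnosis.

```python
# Assumption: default 3G/1G split on 32-bit x86; not stated in the report.
PAGE_OFFSET = 0xC0000000


def decode_oops_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = fault raised in kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


def is_kernel_address(addr):
    """True if addr falls in the kernel half under the assumed split."""
    return addr >= PAGE_OFFSET


# (label, error code, %ebx at fault time), copied from the two reports.
reports = [
    ("Oops: 0000", 0x0000, 0x51377C51),  # faulting insn: mov (%ebx),%eax
    ("Oops: 0002", 0x0002, 0x4324772D),  # faulting insn: lock btsl $0x15,(%ebx)
]

for label, code, ebx in reports:
    info = decode_oops_code(code)
    where = "kernel" if is_kernel_address(ebx) else "non-kernel"
    print(f"{label}: {info['access']} in {info['mode']} mode, "
          f"{info['cause']}; ebx={ebx:#010x} is a {where} address")
```

Under these assumptions both faults are kernel-mode accesses through a pointer in %ebx that does not look like a kernel address, one a read and one a write.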