* 2.4.20pre11aa1
From: Andrea Arcangeli @ 2002-10-16 16:51 UTC
To: linux-kernel; +Cc: Srihari Vijayaraghavan

Srihari, I would like if you could try to reproduce with this new one
with CONFIG_SOUND=n. Thanks!

URL:

        http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.20pre11aa1.gz
        http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.20pre11aa1/

Only in 2.4.20pre10aa1: 00_extraversion-10
Only in 2.4.20pre11aa1: 00_extraversion-11
Only in 2.4.20pre10aa1: 00_max_bytes-5
Only in 2.4.20pre11aa1: 00_max_bytes-6
Only in 2.4.20pre10aa1: 60_pagecache-atomic-6
Only in 2.4.20pre11aa1: 60_pagecache-atomic-7
Only in 2.4.20pre10aa1: 70_intermezzo-junk-1
Only in 2.4.20pre11aa1: 70_intermezzo-junk-2

        Rediffed.

Only in 2.4.20pre11aa1: 00_fcntl_getfl-largefile-1

        Clear the implicit O_LARGEFILE with 64bit archs.

Only in 2.4.20pre11aa1: 00_o_direct-read-overflow-write-locking-xfs-2

        fix xfs compilation (from Christoph).

Only in 2.4.20pre10aa1: 20_sched-o1-fixes-4
Only in 2.4.20pre11aa1: 20_sched-o1-fixes-5

        Take the expired queue into account in sched_yield; sched_yield is
        still a cpu-local operation, unlike in 2.4 mainline. Fix idle
        rescheduling so we don't waste 80% of the cpu power of some
        big irons. Fixed a race that could explain some instability (in
        my tree only).

Only in 2.4.20pre10aa1: 86_x86_64-tsc-hpet-pit-1

        Dropped temporarily.

Only in 2.4.20pre10aa1: 9900_aio-11.gz
Only in 2.4.20pre11aa1: 9900_aio-12.gz

        Unplug the queue properly in the next_chunk passes too.
        (from Chris Mason)

Andrea
* Re: 2.4.20pre11aa1
From: Srihari Vijayaraghavan @ 2002-10-17 12:04 UTC
To: Andrea Arcangeli; +Cc: linux-kernel

Hello Andrea,

> Srihari, I would like if you could try to reproduce with this new one
> with CONFIG_SOUND=n. Thanks!

No worries!

I tried it without sound and unfortunately it crashed few times. The good news
is that it is very stable without agpgart and radeon (module or not) support.

These are the three oops with agpgart and radeon as modules:
------------------------------------------------------------------------------------------
ksymoops 2.4.5 on i686 2.4.20-pre11aa1. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre11aa1/ (default)
     -m /boot/System.map-2.4.20-pre11aa1 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Oct 17 20:27:24 localhost kernel: Unable to handle kernel paging request at virtual address c68b8008 Oct 17 20:27:24 localhost kernel: c01180ae Oct 17 20:27:24 localhost kernel: *pde = 068001e3 Oct 17 20:27:24 localhost kernel: Oops: 0000 2.4.20-pre11aa1 #3 Thu Oct 17 20:18:58 EST 2002 Oct 17 20:27:24 localhost kernel: CPU: 0 Oct 17 20:27:24 localhost kernel: EIP: 0010:[<c01180ae>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 Oct 17 20:27:24 localhost kernel: EFLAGS: 00013206 Oct 17 20:27:24 localhost kernel: eax: bfffec7c ebx: c68b8000 ecx: c020de0c edx: 00000018 Oct 17 20:27:24 localhost kernel: esi: 00000100 edi: bfffec7c ebp: ffffffff esp: c58f5f78 Oct 17 20:27:24 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 17 20:27:24 localhost kernel: Process modprobe (pid: 888, stackpage=c58f5000) Oct 17 20:27:24 localhost kernel: Stack: dff82e04 00000000 00001000 00000000 00000000 ffffffea c020dda0 bfffec7c Oct 17 20:27:24 localhost kernel: 080640e8 c01188a4 080640e8 00000100 bfffec7c 00000004 c58f4000 00000100 Oct 17 20:27:24 localhost kernel: bfffec7c bfffeca8 c01074ff 00000000 00000001 080640e8 00000100 bfffec7c Oct 17 20:27:24 localhost kernel: Call Trace: [<c01188a4>] [<c01074ff>] Oct 17 20:27:24 localhost kernel: Code: 8b 7b 08 89 e9 31 c0 f2 ae f7 d1 49 8d 79 01 39 f7 77 7f 8b >>EIP; c01180ae <qm_modules+2e/140> <===== >>eax; bfffec7c Before first symbol >>ebx; c68b8000 <[agpgart].bss.end+2c031e5/1c0a3265> >>ecx; c020de0c <modlist_lock+0/0> >>edi; bfffec7c Before first symbol >>ebp; ffffffff <END_OF_CODE+202a3a58/????> >>esp; c58f5f78 <[agpgart].bss.end+1c4115d/1c0a3265> Trace; c01188a4 <sys_query_module+d4/1b0> Trace; c01074ff <system_call+33/38> Code; c01180ae <qm_modules+2e/140> 00000000 <_EIP>: Code; c01180ae <qm_modules+2e/140> <===== 0: 8b 7b 08 mov 0x8(%ebx),%edi <===== Code; c01180b1 <qm_modules+31/140> 3: 89 e9 mov %ebp,%ecx Code; c01180b3 <qm_modules+33/140> 5: 31 c0 xor %eax,%eax Code; c01180b5 <qm_modules+35/140> 7: f2 ae repnz scas 
%es:(%edi),%al Code; c01180b7 <qm_modules+37/140> 9: f7 d1 not %ecx Code; c01180b9 <qm_modules+39/140> b: 49 dec %ecx Code; c01180ba <qm_modules+3a/140> c: 8d 79 01 lea 0x1(%ecx),%edi Code; c01180bd <qm_modules+3d/140> f: 39 f7 cmp %esi,%edi Code; c01180bf <qm_modules+3f/140> 11: 77 7f ja 92 <_EIP+0x92> Code; c01180c1 <qm_modules+41/140> 13: 8b 00 mov (%eax),%eax Oct 17 20:27:24 localhost kernel: <1>Unable to handle kernel paging request at virtual address c56ac098 Oct 17 20:27:24 localhost kernel: c0119dd0 Oct 17 20:27:24 localhost kernel: *pde = 054001e3 Oct 17 20:27:24 localhost kernel: Oops: 0000 2.4.20-pre11aa1 #3 Thu Oct 17 20:18:58 EST 2002 Oct 17 20:27:24 localhost kernel: CPU: 0 Oct 17 20:27:24 localhost kernel: EIP: 0010:[<c0119dd0>] Not tainted Oct 17 20:27:24 localhost kernel: EFLAGS: 00013206 Oct 17 20:27:24 localhost kernel: eax: 00000000 ebx: c56ac000 ecx: c4ad9000 edx: 00000000 Oct 17 20:27:24 localhost kernel: esi: c58f4000 edi: 000000b8 ebp: 0000000b esp: c58f5e2c Oct 17 20:27:24 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 17 20:27:24 localhost kernel: Process modprobe (pid: 888, stackpage=c58f5000) Oct 17 20:27:24 localhost kernel: Stack: c1587bb8 c4ad9ac0 c58f4000 00000000 c58f4000 000000b8 0000000b c011a2c0 Oct 17 20:27:24 localhost kernel: c58f4000 c16f1880 c58f5f44 00000000 000000b8 c58f4000 c0107bef 0000000b Oct 17 20:27:24 localhost kernel: c01f1e2a 00000000 00000000 c01125a4 c01f1e2a c58f5f44 00000000 dff82e00 Oct 17 20:27:24 localhost kernel: Call Trace: [<c011a2c0>] [<c0107bef>] [<c01125a4>] [<c0126aaa>] [<c01314e5>] Oct 17 20:27:24 localhost kernel: [<c0126dde>] [<c011244a>] [<c01276dc>] [<c01122a0>] [<c01075f0>] [<c01180ae>] Oct 17 20:27:24 localhost kernel: [<c01188a4>] [<c01074ff>] Oct 17 20:27:24 localhost kernel: Code: 39 b3 98 00 00 00 0f 84 85 02 00 00 8b 5b 50 81 fb 00 a0 21 >>EIP; c0119dd0 <exit_notify+20/300> <===== >>ebx; c56ac000 <[agpgart].bss.end+19f71e5/1c0a3265> >>ecx; c4ad9000 <[agpgart].bss.end+e241e5/1c0a3265> 
>>esi; c58f4000 <[agpgart].bss.end+1c3f1e5/1c0a3265> >>esp; c58f5e2c <[agpgart].bss.end+1c41011/1c0a3265> Trace; c011a2c0 <do_exit+210/260> Trace; c0107bef <die+7f/80> Trace; c01125a4 <do_page_fault+304/5a0> Trace; c0126aaa <do_no_page+8a/1c0> Trace; c01314e5 <lru_cache_add+65/70> Trace; c0126dde <handle_mm_fault+8e/160> Trace; c011244a <do_page_fault+1aa/5a0> Trace; c01276dc <zap_pmd_range+7c/80> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 <error_code+34/3c> Trace; c01180ae <qm_modules+2e/140> Trace; c01188a4 <sys_query_module+d4/1b0> Trace; c01074ff <system_call+33/38> Code; c0119dd0 <exit_notify+20/300> 00000000 <_EIP>: Code; c0119dd0 <exit_notify+20/300> <===== 0: 39 b3 98 00 00 00 cmp %esi,0x98(%ebx) <===== Code; c0119dd6 <exit_notify+26/300> 6: 0f 84 85 02 00 00 je 291 <_EIP+0x291> Code; c0119ddc <exit_notify+2c/300> c: 8b 5b 50 mov 0x50(%ebx),%ebx Code; c0119ddf <exit_notify+2f/300> f: 81 fb 00 a0 21 00 cmp $0x21a000,%ebx Oct 17 20:27:24 localhost kernel: <1>Unable to handle kernel paging request at virtual address c4db8098 Oct 17 20:27:24 localhost kernel: c0119dd0 Oct 17 20:27:24 localhost kernel: *pde = 04c001e3 Oct 17 20:27:24 localhost kernel: Oops: 0000 2.4.20-pre11aa1 #3 Thu Oct 17 20:18:58 EST 2002 Oct 17 20:27:24 localhost kernel: CPU: 0 Oct 17 20:27:24 localhost kernel: EIP: 0010:[<c0119dd0>] Not tainted Oct 17 20:27:24 localhost kernel: EFLAGS: 00013206 Oct 17 20:27:24 localhost kernel: eax: 00000000 ebx: c4db8000 ecx: 00000000 edx: 00000000 Oct 17 20:27:24 localhost kernel: esi: c58f4000 edi: 000002ac ebp: 0000000b esp: c58f5ce0 Oct 17 20:27:24 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 17 20:27:24 localhost kernel: Process modprobe (pid: 888, stackpage=c58f5000) Oct 17 20:27:24 localhost kernel: Stack: 00000020 00000400 c58f4000 00000000 c58f4000 000002ac 0000000b c011a2c0 Oct 17 20:27:24 localhost kernel: c58f4000 00000000 c58f5df8 00000000 000002ac c58f4000 c0107bef 0000000b Oct 17 20:27:24 localhost kernel: c01f1e2a 00000000 
00000000 c01125a4 c01f1e2a c58f5df8 00000000 33323130 Oct 17 20:27:24 localhost kernel: Call Trace: [<c011a2c0>] [<c0107bef>] [<c01125a4>] [<c0131577>] [<c01278e8>] Oct 17 20:27:24 localhost kernel: [<c01122a0>] [<c01276dc>] [<c01122a0>] [<c01075f0>] [<c0119dd0>] [<c011a2c0>] Oct 17 20:27:24 localhost kernel: [<c0107bef>] [<c01125a4>] [<c0126aaa>] [<c01314e5>] [<c0126dde>] [<c011244a>] Oct 17 20:27:24 localhost kernel: [<c01276dc>] [<c01122a0>] [<c01075f0>] [<c01180ae>] [<c01188a4>] [<c01074ff>] Oct 17 20:27:24 localhost kernel: Code: 39 b3 98 00 00 00 0f 84 85 02 00 00 8b 5b 50 81 fb 00 a0 21 >>EIP; c0119dd0 <exit_notify+20/300> <===== >>ebx; c4db8000 <[agpgart].bss.end+11031e5/1c0a3265> >>esi; c58f4000 <[agpgart].bss.end+1c3f1e5/1c0a3265> >>esp; c58f5ce0 <[agpgart].bss.end+1c40ec5/1c0a3265> Trace; c011a2c0 <do_exit+210/260> Trace; c0107bef <die+7f/80> Trace; c01125a4 <do_page_fault+304/5a0> Trace; c0131577 <__lru_cache_del+87/90> Trace; c01278e8 <zap_pte_range+f8/150> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01276dc <zap_pmd_range+7c/80> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 <error_code+34/3c> Trace; c0119dd0 <exit_notify+20/300> Trace; c011a2c0 <do_exit+210/260> Trace; c0107bef <die+7f/80> Trace; c01125a4 <do_page_fault+304/5a0> Trace; c0126aaa <do_no_page+8a/1c0> Trace; c01314e5 <lru_cache_add+65/70> Trace; c0126dde <handle_mm_fault+8e/160> Trace; c011244a <do_page_fault+1aa/5a0> Trace; c01276dc <zap_pmd_range+7c/80> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 <error_code+34/3c> Trace; c01180ae <qm_modules+2e/140> Trace; c01188a4 <sys_query_module+d4/1b0> Trace; c01074ff <system_call+33/38> Code; c0119dd0 <exit_notify+20/300> 00000000 <_EIP>: Code; c0119dd0 <exit_notify+20/300> <===== 0: 39 b3 98 00 00 00 cmp %esi,0x98(%ebx) <===== Code; c0119dd6 <exit_notify+26/300> 6: 0f 84 85 02 00 00 je 291 <_EIP+0x291> Code; c0119ddc <exit_notify+2c/300> c: 8b 5b 50 mov 0x50(%ebx),%ebx Code; c0119ddf <exit_notify+2f/300> f: 81 fb 00 a0 21 00 
cmp $0x21a000,%ebx

1 warning issued. Results may not be reliable.

These are the two oops with agpgart and radeon built into the kernel:
------------------------------------------------------------------------------------------------
ksymoops 2.4.5 on i686 2.4.20-pre11aa1-agpdrm. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre11aa1-agpdrm/ (default)
     -m /boot/System.map-2.4.20-pre11aa1-agpdrm (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Oct 17 21:22:29 localhost kernel: Unable to handle kernel paging request at virtual address c72b4034
Oct 17 21:22:29 localhost kernel: c0112b57
Oct 17 21:22:29 localhost kernel: *pde = 070001e3
Oct 17 21:22:29 localhost kernel: Oops: 0000 2.4.20-pre11aa1-agpdrm #6 Thu Oct 17 21:11:50 EST 2002
Oct 17 21:22:29 localhost kernel: CPU: 0
Oct 17 21:22:29 localhost kernel: EIP: 0010:[<c0112b57>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Oct 17 21:22:29 localhost kernel: EFLAGS: 00013086
Oct 17 21:22:29 localhost kernel: eax: 00000000 ebx: c8aaa000 ecx: c72b4000 edx: c8aabe78
Oct 17 21:22:29 localhost kernel: esi: 00000002 edi: c01f5f22 ebp: 00003246 esp: c8aabd9c
Oct 17 21:22:29 localhost kernel: ds: 0018 es: 0018 ss: 0018
Oct 17 21:22:29 localhost kernel: Process modprobe (pid: 1036, stackpage=c8aab000)
Oct 17 21:22:29 localhost kernel: Stack: c8aaa000 00000002 c6e3c000 c8aaa000 c01124c2 c01f5f22 c8aaa000 00000000
Oct 17 21:22:29 localhost kernel: c6270f8e c110eb5c c8aaa000 c8aabfc4 0001ff9d c022326f c6270000 c110eb5c
Oct 17 21:22:29 localhost kernel: c2d94000 00000000 c0223360 c8aabfc8 c0141c50
c8aabdfc c8aabf6c c8aabdfc Oct 17 21:22:29 localhost kernel: Call Trace: [<c01124c2>] [<c01f5f22>] [<c0141c50>] [<c01122a0>] [<c01075f0>] Oct 17 21:22:29 localhost kernel: [<c01f5f22>] [<c01269b2>] [<c0126dde>] [<c011244a>] [<c01286df>] [<c0128a37>] Oct 17 21:22:29 localhost kernel: [<c0128ab4>] [<c01122a0>] [<c01075f0>] Oct 17 21:22:29 localhost kernel: Code: 8b 51 34 85 d2 74 3f f7 41 14 41 00 00 00 74 36 8b 71 38 89 >>EIP; c0112b57 <search_exception_table+17/80> <===== >>ebx; c8aaa000 <[sr_mod].bss.end+1da61a9/1902c229> >>ecx; c72b4000 <[sr_mod].bss.end+5b01a9/1902c229> >>edx; c8aabe78 <[sr_mod].bss.end+1da8021/1902c229> >>edi; c01f5f22 <fast_clear_page+12/50> >>ebp; 00003246 Before first symbol >>esp; c8aabd9c <[sr_mod].bss.end+1da7f45/1902c229> Trace; c01124c2 <do_page_fault+222/5a0> Trace; c01f5f22 <fast_clear_page+12/50> Trace; c0141c50 <do_execve+180/220> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 <error_code+34/3c> Trace; c01f5f22 <fast_clear_page+12/50> Trace; c01269b2 <do_anonymous_page+a2/110> Trace; c0126dde <handle_mm_fault+8e/160> Trace; c011244a <do_page_fault+1aa/5a0> Trace; c01286df <unmap_fixup+12f/140> Trace; c0128a37 <do_munmap+297/2d0> Trace; c0128ab4 <sys_munmap+44/80> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 <error_code+34/3c> Code; c0112b57 <search_exception_table+17/80> 00000000 <_EIP>: Code; c0112b57 <search_exception_table+17/80> <===== 0: 8b 51 34 mov 0x34(%ecx),%edx <===== Code; c0112b5a <search_exception_table+1a/80> 3: 85 d2 test %edx,%edx Code; c0112b5c <search_exception_table+1c/80> 5: 74 3f je 46 <_EIP+0x46> Code; c0112b5e <search_exception_table+1e/80> 7: f7 41 14 41 00 00 00 testl $0x41,0x14(%ecx) Code; c0112b65 <search_exception_table+25/80> e: 74 36 je 46 <_EIP+0x46> Code; c0112b67 <search_exception_table+27/80> 10: 8b 71 38 mov 0x38(%ecx),%esi Code; c0112b6a <search_exception_table+2a/80> 13: 89 00 mov %eax,(%eax) Oct 17 21:22:29 localhost kernel: <1>Unable to handle kernel paging request at virtual 
address c77340c4 Oct 17 21:22:29 localhost kernel: c0139b5e Oct 17 21:22:29 localhost kernel: *pde = 07769163 Oct 17 21:22:29 localhost kernel: Oops: 0003 2.4.20-pre11aa1-agpdrm #6 Thu Oct 17 21:11:50 EST 2002 Oct 17 21:22:29 localhost kernel: CPU: 0 Oct 17 21:22:29 localhost kernel: EIP: 0010:[<c0139b5e>] Not tainted Oct 17 21:22:29 localhost kernel: EFLAGS: 00013246 Oct 17 21:22:29 localhost kernel: eax: c27e7340 ebx: c779cdc0 ecx: 00000000 edx: c77340c0 Oct 17 21:22:29 localhost kernel: esi: c158e380 edi: c1689dc0 ebp: c1ac8540 esp: c8aabc20 Oct 17 21:22:29 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 17 21:22:29 localhost kernel: Process modprobe (pid: 1036, stackpage=c8aab000) Oct 17 21:22:29 localhost kernel: Stack: c1689dc0 c779cdc0 c1c338c0 00001000 dfe572c0 08060000 c0128e85 dfe572c0 Oct 17 21:22:29 localhost kernel: 08060000 00001000 c1c33940 dfe572c0 c8aaa000 000002b4 0000000b c0115076 Oct 17 21:22:29 localhost kernel: dfe572c0 00003202 dfe572c0 c011a137 dfe572c0 00000000 c8aabd68 00000000 Oct 17 21:22:29 localhost kernel: Call Trace: [<c0128e85>] [<c0115076>] [<c011a137>] [<c0107bef>] [<c01125a4>] Oct 17 21:22:29 localhost kernel: [<c014322b>] [<c01122a0>] [<c01075f0>] [<c01f5f22>] [<c0112b57>] [<c01124c2>] Oct 17 21:22:29 localhost kernel: [<c01f5f22>] [<c0141c50>] [<c01122a0>] [<c01075f0>] [<c01f5f22>] [<c01269b2>] Oct 17 21:22:29 localhost kernel: [<c0126dde>] [<c011244a>] [<c01286df>] [<c0128a37>] [<c0128ab4>] [<c01122a0>] Oct 17 21:22:29 localhost kernel: [<c01075f0>] Oct 17 21:22:29 localhost kernel: Code: 89 42 04 c7 03 00 00 00 00 a1 b4 3e 22 c0 89 58 04 89 03 89 >>EIP; c0139b5e <fput+9e/120> <===== >>eax; c27e7340 <[floppy].bss.end+599905/4ab2645> >>ebx; c779cdc0 <[sr_mod].bss.end+a98f69/1902c229> >>edx; c77340c0 <[sr_mod].bss.end+a30269/1902c229> >>esi; c158e380 <_end+12f1d10/15aaa10> >>edi; c1689dc0 <_end+13ed750/15aaa10> >>ebp; c1ac8540 <[md].bss.end+25a861/3123a1> >>esp; c8aabc20 <[sr_mod].bss.end+1da7dc9/1902c229> Trace; c0128e85 
<exit_mmap+125/140>
Trace; c0115076 <mmput+56/d0>
Trace; c011a137 <do_exit+87/260>
Trace; c0107bef <die+7f/80>
Trace; c01125a4 <do_page_fault+304/5a0>
Trace; c014322b <cached_lookup+1b/70>
Trace; c01122a0 <do_page_fault+0/5a0>
Trace; c01075f0 <error_code+34/3c>
Trace; c01f5f22 <fast_clear_page+12/50>
Trace; c0112b57 <search_exception_table+17/80>
Trace; c01124c2 <do_page_fault+222/5a0>
Trace; c01f5f22 <fast_clear_page+12/50>
Trace; c0141c50 <do_execve+180/220>
Trace; c01122a0 <do_page_fault+0/5a0>
Trace; c01075f0 <error_code+34/3c>
Trace; c01f5f22 <fast_clear_page+12/50>
Trace; c01269b2 <do_anonymous_page+a2/110>
Trace; c0126dde <handle_mm_fault+8e/160>
Trace; c011244a <do_page_fault+1aa/5a0>
Trace; c01286df <unmap_fixup+12f/140>
Trace; c0128a37 <do_munmap+297/2d0>
Trace; c0128ab4 <sys_munmap+44/80>
Trace; c01122a0 <do_page_fault+0/5a0>
Trace; c01075f0 <error_code+34/3c>
Code; c0139b5e <fput+9e/120>
00000000 <_EIP>:
Code; c0139b5e <fput+9e/120> <=====
   0: 89 42 04 mov %eax,0x4(%edx) <=====
Code; c0139b61 <fput+a1/120>
   3: c7 03 00 00 00 00 movl $0x0,(%ebx)
Code; c0139b67 <fput+a7/120>
   9: a1 b4 3e 22 c0 mov 0xc0223eb4,%eax
Code; c0139b6c <fput+ac/120>
   e: 89 58 04 mov %ebx,0x4(%eax)
Code; c0139b6f <fput+af/120>
  11: 89 03 mov %eax,(%ebx)
Code; c0139b71 <fput+b1/120>
  13: 89 00 mov %eax,(%eax)

1 warning issued. Results may not be reliable.

The mainline (2.4.20-pre11) is fine with agpgart and radeon as modules. I
haven't tested it with agpgart and radeon built into the kernel.

I am trying to find out whether any of my friends has a different Radeon card
(mine is Radeon VE QY) or any video card that has DRM support in the official
kernel tree. If I find one I will try it and see if --aa works fine with that.

Thanks for your help.
-- 
Hari
harisri@bigpond.com
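The decoded instructions in this report can be cross-checked against the reported fault addresses. In the first oops the instruction at EIP is `mov 0x8(%ebx),%edi` with ebx = c68b8000, in the second it is `cmp %esi,0x98(%ebx)` with ebx = c56ac000, and in the last it is `mov %eax,0x4(%edx)` with edx = c77340c0. The following is only a small sketch of that arithmetic; `fault_addr` is a hypothetical helper name, not part of ksymoops.

```sh
# fault_addr BASE DISP
# Adds a displacement to a base register value. Both arguments are hex
# digits without a leading 0x, as they appear in the oops text.
fault_addr() {
    printf '%08x\n' $(( 0x$1 + 0x$2 ))
}

fault_addr c68b8000 8     # first oops:  mov 0x8(%ebx),%edi
fault_addr c56ac000 98    # second oops: cmp %esi,0x98(%ebx)
fault_addr c77340c0 4     # last oops:   mov %eax,0x4(%edx)
```

The three results can be compared with the "virtual address" lines of the corresponding oopses (c68b8008, c56ac098 and c77340c4) to confirm that the disassembly and the register dump belong together.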
* Re: 2.4.20pre11aa1
From: Andrea Arcangeli @ 2002-10-17 12:10 UTC
To: Srihari Vijayaraghavan; +Cc: linux-kernel

On Thu, Oct 17, 2002 at 10:04:50PM +1000, Srihari Vijayaraghavan wrote:
> Hello Andrea,
>
> > Srihari, I would like if you could try to reproduce with this new one
> > with CONFIG_SOUND=n. Thanks!
>
> No worries!
>
> I tried it without sound and unfortunately it crashed few times. The good news
> is that it is very stable without agpgart and radeon (module or not) support.

I've no idea what could be wrong with the graphics drivers, there are no
changes there.

> ffffffff esp: c58f5f78
> Oct 17 20:27:24 localhost kernel: ds: 0018 es: 0018 ss: 0018
> Oct 17 20:27:24 localhost kernel: Process modprobe (pid: 888,

please try to find which is this module, replace modprobe with a script
that does:

#!/bin/sh
echo $@ >>/tmp/log
sync
modprobe.orig $@

then look at log after the crash.

You said in your last email that the gart code wasn't the culprit. If it
isn't the sound drivers I've no clue what it is. What does it mean that
without agpgart it is very stable? That it crashes less frequently? (I
recall it crashed even without those modules)

It doesn't make any sense that 2.4.20-pre11 works and my tree doesn't,
there are no changes to those sound and graphics drivers. Can you make
sure that modversions is enabled, and please send me your .config.

Andrea
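The wrapper suggested above could be written a little more defensively. The following is a minimal sketch, not Andrea's exact script: it quotes the arguments so they reach the real binary unchanged, and it reads the log path and the name of the real binary from two variables, MODPROBE_LOG and MODPROBE_REAL, which are names invented for this example so the script can be tried without replacing the real modprobe.

```sh
#!/bin/sh
# Sketch of a logging wrapper to install in place of modprobe.
# MODPROBE_LOG and MODPROBE_REAL are illustrative names chosen for this
# example; they are not settings read by modutils itself.

log_and_run() {
    log="${MODPROBE_LOG:-/tmp/log}"
    real="${MODPROBE_REAL:-modprobe.orig}"

    # Record the arguments first, so the entry exists even if the load oopses.
    printf '%s\n' "$*" >>"$log"
    # Push the log to disk before the module gets a chance to crash the box.
    sync
    # Run the real binary with the arguments preserved word for word.
    "$real" "$@"
}

# Only act when called with arguments, as the kernel or a user would call it.
if [ "$#" -gt 0 ]; then
    log_and_run "$@"
fi
```

After a crash, the last line of the log names the module that was being loaded when the machine went down.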
* Re: 2.4.20pre11aa1
From: Keith Owens @ 2002-10-17 13:01 UTC
To: Andrea Arcangeli; +Cc: Srihari Vijayaraghavan, linux-kernel

On Thu, 17 Oct 2002 14:10:05 +0200,
Andrea Arcangeli <andrea@suse.de> wrote:
>please try to find which is this module, replace modprobe with a script
>that does:
>
>#!/bin/sh
>echo $@ >>/tmp/log
>sync
>modprobe.orig $@

You don't need that, just mkdir /var/log/ksymoops. modprobe/insmod
will create a daily log file and snapshot a copy of lsmod and
/proc/ksyms for every module loaded or unloaded. All with sync in the
right places.
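As a sketch of how the snapshots Keith describes might be inspected after a crash: once /var/log/ksymoops exists (created as root with mkdir), the most recently written files there are the ones of interest. The helper below, with the hypothetical name `latest_snapshots`, simply lists the newest entries in that directory. The exact file names depend on the modutils version and are not shown in this thread, so the helper makes no assumption about them.

```sh
# latest_snapshots [DIR] [COUNT]
# Lists the COUNT most recently modified files in DIR, newest first.
# DIR defaults to /var/log/ksymoops, COUNT to 5.
latest_snapshots() {
    dir="${1:-/var/log/ksymoops}"
    count="${2:-5}"
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    ls -t "$dir" | head -n "$count"
}
```

Calling `latest_snapshots` with no arguments right after rebooting from a crash should show the log and snapshot files written for the last module operation.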
* Re: 2.4.20pre11aa1
From: Srihari Vijayaraghavan @ 2002-10-17 15:26 UTC
To: Keith Owens, Andrea Arcangeli; +Cc: linux-kernel

Hello Keith,

> You don't need that, just mkdir /var/log/ksymoops. modprobe/insmod
> will create a daily log file and snapshot a copy of lsmod and
> /proc/ksyms for every module loaded or unloaded. All with sync in the
> right places.

Thanks, and that works fine.

Hello Andrea,

1. To simplify and to prove that the crashes are associated with agpgart and/or
radeon I have compiled the kernel with _only_ agpgart and radeon as modules and
nothing else.

$ cat /lib/modules/2.4.20-pre11aa1/modules.dep
/lib/modules/2.4.20-pre11aa1/kernel/drivers/char/agp/agpgart.o:

/lib/modules/2.4.20-pre11aa1/kernel/drivers/char/drm/radeon.o:

This is some decoded output of the oopses that appeared in the system logs:
------------------------------------------------------------------------------------------------------
ksymoops 2.4.5 on i686 2.4.20-pre11aa1. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre11aa1/ (default)
     -m /boot/System.map-2.4.20-pre11aa1 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Oct 18 00:29:02 localhost kernel: Unable to handle kernel paging request at virtual address c73ae000 Oct 18 00:29:02 localhost kernel: c0210ee2 Oct 18 00:29:02 localhost kernel: *pde = 070001e3 Oct 18 00:29:02 localhost kernel: Oops: 0002 2.4.20-pre11aa1 #9 Fri Oct 18 00:06:42 EST 2002 Oct 18 00:29:02 localhost kernel: CPU: 0 Oct 18 00:29:02 localhost kernel: EIP: 0010:[<c0210ee2>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 Oct 18 00:29:02 localhost kernel: EFLAGS: 00013246 Oct 18 00:29:02 localhost kernel: eax: 0000003f ebx: c73ae000 ecx: c7c8c000 edx: 00000000 Oct 18 00:29:02 localhost kernel: esi: c2daffe0 edi: 00000fe0 ebp: c113e204 esp: c7c8deac Oct 18 00:29:02 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 18 00:29:02 localhost kernel: Process modprobe (pid: 944, stackpage=c7c8d000) Oct 18 00:29:03 localhost kernel: Stack: 00104025 c01269b2 c73ae000 c7534bfc bfff8e50 c2d9f480 c4e97a40 c0126dde Oct 18 00:29:03 localhost kernel: c2d9f480 c4e97a40 c2daffe0 c7534bfc 00000001 bfff8e50 c7c8df24 c2d9f480 Oct 18 00:29:03 localhost kernel: c4e97a40 bfff8e50 c7c8c000 c011244a c2d9f480 c4e97a40 bfff8e50 00000001 Oct 18 00:29:03 localhost kernel: Call Trace: [<c01269b2>] [<c0126dde>] [<c011244a>] [<c0127bb6>] [<c0128cc7>] Oct 18 00:29:03 localhost kernel: [<c0127ab1>] [<c01122a0>] [<c01075f0>] Oct 18 00:29:03 localhost kernel: Code: 0f e7 03 0f e7 43 08 0f e7 43 10 0f e7 43 18 0f e7 43 20 0f >>EIP; c0210ee2 <fast_clear_page+12/50> <===== >>ebx; c73ae000 <END_OF_CODE+35e90a5/????> >>ecx; c7c8c000 <END_OF_CODE+3ec70a5/????> >>esi; c2daffe0 <_end+2ad7c48/3a47ce8> >>edi; 00000fe0 Before first symbol >>ebp; c113e204 <_end+e65e6c/3a47ce8> >>esp; c7c8deac <END_OF_CODE+3ec8f51/????> Trace; c01269b2 <do_anonymous_page+a2/110> Trace; c0126dde <handle_mm_fault+8e/160> Trace; c011244a <do_page_fault+1aa/5a0> Trace; c0127bb6 <__vma_link+56/d0> Trace; c0128cc7 <do_brk+1d7/210> Trace; c0127ab1 <sys_brk+f1/130> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 
<error_code+34/3c> Code; c0210ee2 <fast_clear_page+12/50> 00000000 <_EIP>: Code; c0210ee2 <fast_clear_page+12/50> <===== 0: 0f e7 03 movntq %mm0,(%ebx) <===== Code; c0210ee5 <fast_clear_page+15/50> 3: 0f e7 43 08 movntq %mm0,0x8(%ebx) Code; c0210ee9 <fast_clear_page+19/50> 7: 0f e7 43 10 movntq %mm0,0x10(%ebx) Code; c0210eed <fast_clear_page+1d/50> b: 0f e7 43 18 movntq %mm0,0x18(%ebx) Code; c0210ef1 <fast_clear_page+21/50> f: 0f e7 43 20 movntq %mm0,0x20(%ebx) Code; c0210ef5 <fast_clear_page+25/50> 13: 0f 00 00 sldtl (%eax) Oct 18 00:29:03 localhost kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000044 Oct 18 00:29:03 localhost kernel: c014ca41 Oct 18 00:29:03 localhost kernel: *pde = 0752b067 Oct 18 00:29:03 localhost kernel: Oops: 0000 2.4.20-pre11aa1 #9 Fri Oct 18 00:06:42 EST 2002 Oct 18 00:29:03 localhost kernel: CPU: 0 Oct 18 00:29:03 localhost kernel: EIP: 0010:[<c014ca41>] Not tainted Oct 18 00:29:03 localhost kernel: EFLAGS: 00013217 Oct 18 00:29:03 localhost kernel: eax: dff32cf8 ebx: 00000010 ecx: 00000010 edx: dff00000 Oct 18 00:29:03 localhost kernel: esi: 00000000 edi: 00000000 ebp: 0003b0c1 esp: c64d9d74 Oct 18 00:29:03 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 18 00:29:03 localhost kernel: Process X (pid: 945, stackpage=c64d9000) Oct 18 00:29:03 localhost kernel: Stack: 00000000 00000000 00000000 00000000 00000000 dff32cf8 dfe66005 00000002 Oct 18 00:29:03 localhost kernel: dfe66005 dfe66007 00000000 c64d9e14 c014322b c16d7540 c64d9dd4 dfe66005 Oct 18 00:29:03 localhost kernel: c0143854 c16d7540 c64d9dd4 00000000 00000009 00000000 c16c29c0 00000000 Oct 18 00:29:03 localhost kernel: Call Trace: [<c014322b>] [<c0143854>] [<c0143d37>] [<c0141187>] [<c0141af7>] Oct 18 00:29:03 localhost kernel: [<c0132ecf>] [<c01314e5>] [<c0126510>] [<c0126e69>] [<c011244a>] [<c0142fd7>] Oct 18 00:29:03 localhost kernel: [<c0105c90>] [<c01074ff>] Oct 18 00:29:03 localhost kernel: Code: 39 6e 44 8b 1b 75 e8 8b 7c 24 34 39 7e 0c 
75 df 8b 57 4c 85 >>EIP; c014ca41 <d_lookup+61/110> <===== >>eax; dff32cf8 <END_OF_CODE+1c16dd9d/????> >>edx; dff00000 <END_OF_CODE+1c13b0a5/????> >>ebp; 0003b0c1 Before first symbol >>esp; c64d9d74 <END_OF_CODE+2714e19/????> Trace; c014322b <cached_lookup+1b/70> Trace; c0143854 <link_path_walk+3c4/6f0> Trace; c0143d37 <path_lookup+37/40> Trace; c0141187 <open_exec+27/e0> Trace; c0141af7 <do_execve+27/220> Trace; c0132ecf <__alloc_pages+5f/280> Trace; c01314e5 <lru_cache_add+65/70> Trace; c0126510 <do_wp_page+140/1f0> Trace; c0126e69 <handle_mm_fault+119/160> Trace; c011244a <do_page_fault+1aa/5a0> Trace; c0142fd7 <getname+97/d0> Trace; c0105c90 <sys_execve+50/80> Trace; c01074ff <system_call+33/38> Code; c014ca41 <d_lookup+61/110> 00000000 <_EIP>: Code; c014ca41 <d_lookup+61/110> <===== 0: 39 6e 44 cmp %ebp,0x44(%esi) <===== Code; c014ca44 <d_lookup+64/110> 3: 8b 1b mov (%ebx),%ebx Code; c014ca46 <d_lookup+66/110> 5: 75 e8 jne ffffffef <_EIP+0xffffffef> Code; c014ca48 <d_lookup+68/110> 7: 8b 7c 24 34 mov 0x34(%esp,1),%edi Code; c014ca4c <d_lookup+6c/110> b: 39 7e 0c cmp %edi,0xc(%esi) Code; c014ca4f <d_lookup+6f/110> e: 75 df jne ffffffef <_EIP+0xffffffef> Code; c014ca51 <d_lookup+71/110> 10: 8b 57 4c mov 0x4c(%edi),%edx Code; c014ca54 <d_lookup+74/110> 13: 85 00 test %eax,(%eax) Oct 18 00:29:04 localhost kernel: <1>Unable to handle kernel paging request at virtual address c6b917c4 Oct 18 00:29:04 localhost kernel: c0139920 Oct 18 00:29:04 localhost kernel: *pde = 0748a163 Oct 18 00:29:04 localhost kernel: Oops: 0003 2.4.20-pre11aa1 #9 Fri Oct 18 00:06:42 EST 2002 Oct 18 00:29:04 localhost kernel: CPU: 0 Oct 18 00:29:04 localhost kernel: EIP: 0010:[<c0139920>] Not tainted Oct 18 00:29:04 localhost kernel: EFLAGS: 00010216 Oct 18 00:29:04 localhost kernel: eax: c6b917c0 ebx: c4a132c0 ecx: 00000004 edx: c0251474 Oct 18 00:29:04 localhost kernel: esi: 00000000 edi: ffffffe9 ebp: c158e380 esp: c8bb7f44 Oct 18 00:29:04 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 
18 00:29:04 localhost kernel: Process sh (pid: 950, stackpage=c8bb7000) Oct 18 00:29:04 localhost kernel: Stack: c167e440 00000004 c57acbe4 00000000 c0137e29 00000004 c16d77c0 00000000 Oct 18 00:29:04 localhost kernel: c1be5000 4001edcd bfffeb68 c0137e07 c16d77c0 c158e380 00000000 c8bb7f84 Oct 18 00:29:04 localhost kernel: c16d77c0 c158e380 c1be5000 c2dbc61c 00000003 00000001 00000001 4001edcd Oct 18 00:29:04 localhost kernel: Call Trace: [<c0137e29>] [<c0137e07>] [<c01381e3>] [<c01074ff>] Oct 18 00:29:04 localhost kernel: Code: 89 50 04 89 02 c7 43 04 00 00 00 00 c7 03 00 00 00 00 ff 0d >>EIP; c0139920 <get_empty_filp+20/130> <===== >>eax; c6b917c0 <END_OF_CODE+2dcc865/????> >>ebx; c4a132c0 <END_OF_CODE+c4e365/????> >>edx; c0251474 <free_list+0/8> >>edi; ffffffe9 <END_OF_CODE+3c23b08e/????> >>ebp; c158e380 <_end+12b5fe8/3a47ce8> >>esp; c8bb7f44 <END_OF_CODE+4df2fe9/????> Trace; c0137e29 <dentry_open+19/210> Trace; c0137e07 <filp_open+67/70> Trace; c01381e3 <sys_open+53/a0> Trace; c01074ff <system_call+33/38> Code; c0139920 <get_empty_filp+20/130> 00000000 <_EIP>: Code; c0139920 <get_empty_filp+20/130> <===== 0: 89 50 04 mov %edx,0x4(%eax) <===== Code; c0139923 <get_empty_filp+23/130> 3: 89 02 mov %eax,(%edx) Code; c0139925 <get_empty_filp+25/130> 5: c7 43 04 00 00 00 00 movl $0x0,0x4(%ebx) Code; c013992c <get_empty_filp+2c/130> c: c7 03 00 00 00 00 movl $0x0,(%ebx) Code; c0139932 <get_empty_filp+32/130> 12: ff 0d 00 00 00 00 decl 0x0 Oct 18 00:29:10 localhost kernel: <1>Unable to handle kernel paging request at virtual address c6895b44 Oct 18 00:29:10 localhost kernel: c0139920 Oct 18 00:29:10 localhost kernel: *pde = 0748a163 Oct 18 00:29:10 localhost kernel: Oops: 0003 2.4.20-pre11aa1 #9 Fri Oct 18 00:06:42 EST 2002 Oct 18 00:29:10 localhost kernel: CPU: 0 Oct 18 00:29:10 localhost kernel: EIP: 0010:[<c0139920>] Not tainted Oct 18 00:29:10 localhost kernel: EFLAGS: 00010216 Warning (Oops_read): Code line not seen, dumping what data is available >>EIP; c0139920 
<get_empty_filp+20/130> <=====

2 warnings issued. Results may not be reliable.

2. Then I compiled the kernel with one and only one module, i.e. radeon, and
nothing else.

$ cat /lib/modules/2.4.20-pre11aa1/modules.dep
/lib/modules/2.4.20-pre11aa1/kernel/drivers/char/drm/radeon.o:

Here is the decoded output of the oops that appeared in the system logs:
----------------------------------------------------------------------------------------------------
ksymoops 2.4.5 on i686 2.4.20-pre11aa1. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre11aa1/ (default)
     -m /boot/System.map-2.4.20-pre11aa1 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Oct 18 01:00:26 localhost kernel: Unable to handle kernel paging request at virtual address c3d50000 Oct 18 01:00:26 localhost kernel: c021389a Oct 18 01:00:26 localhost kernel: *pde = 03c001e3 Oct 18 01:00:26 localhost kernel: Oops: 0002 2.4.20-pre11aa1 #10 Fri Oct 18 00:39:27 EST 2002 Oct 18 01:00:26 localhost kernel: CPU: 0 Oct 18 01:00:26 localhost kernel: EIP: 0010:[<c021389a>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 Oct 18 01:00:26 localhost kernel: EFLAGS: 00013246 Oct 18 01:00:26 localhost kernel: eax: 0000003a ebx: c1730000 ecx: c3df4000 edx: 00000000 Oct 18 01:00:26 localhost kernel: esi: c3d50000 edi: 01730025 ebp: c10a89dc esp: c3df5e9c Oct 18 01:00:26 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 18 01:00:26 localhost kernel: Process modprobe (pid: 712, stackpage=c3df5000) Oct 18 01:00:26 localhost kernel: Stack: c103fc5c c3fad498 c01264ce c3d50000 c1730000 dfe1ce00 c10a89dc c4c99420 Oct 18 01:00:26 localhost kernel: 42126000 dfe1ce00 c164ed40 c0126e69 dfe1ce00 c164ed40 42126000 c3fad498 Oct 18 01:00:26 localhost kernel: c4c99420 01730025 c164e5c0 dfe1ce00 c164ed40 42126000 c3df4000 c011244a Oct 18 01:00:26 localhost kernel: Call Trace: [<c01264ce>] [<c0126e69>] [<c011244a>] [<c01276dc>] [<c0139b8c>] Oct 18 01:00:26 localhost kernel: [<c01286df>] [<c0128a37>] [<c0128ab4>] [<c01122a0>] [<c01075f0>] Oct 18 01:00:26 localhost kernel: Code: 0f e7 06 0f 6f 4b 08 0f e7 4e 08 0f 6f 53 10 0f e7 56 10 0f >>EIP; c021389a <fast_copy_page+3a/e0> <===== >>ebx; c1730000 <_end+1455ba8/3a85c28> >>ecx; c3df4000 <END_OF_CODE+7ea89/????> >>esi; c3d50000 <_end+3a75ba8/3a85c28> >>edi; 01730025 Before first symbol >>ebp; c10a89dc <_end+dce584/3a85c28> >>esp; c3df5e9c <END_OF_CODE+80925/????> Trace; c01264ce <do_wp_page+fe/1f0> Trace; c0126e69 <handle_mm_fault+119/160> Trace; c011244a <do_page_fault+1aa/5a0> Trace; c01276dc <zap_pmd_range+7c/80> Trace; c0139b8c <fput+cc/120> Trace; c01286df <unmap_fixup+12f/140> Trace; c0128a37 <do_munmap+297/2d0> 
Trace; c0128ab4 <sys_munmap+44/80> Trace; c01122a0 <do_page_fault+0/5a0> Trace; c01075f0 <error_code+34/3c> Code; c021389a <fast_copy_page+3a/e0> 00000000 <_EIP>: Code; c021389a <fast_copy_page+3a/e0> <===== 0: 0f e7 06 movntq %mm0,(%esi) <===== Code; c021389d <fast_copy_page+3d/e0> 3: 0f 6f 4b 08 movq 0x8(%ebx),%mm1 Code; c02138a1 <fast_copy_page+41/e0> 7: 0f e7 4e 08 movntq %mm1,0x8(%esi) Code; c02138a5 <fast_copy_page+45/e0> b: 0f 6f 53 10 movq 0x10(%ebx),%mm2 Code; c02138a9 <fast_copy_page+49/e0> f: 0f e7 56 10 movntq %mm2,0x10(%esi) Code; c02138ad <fast_copy_page+4d/e0> 13: 0f 00 00 sldtl (%eax) 1 warning issued. Results may not be reliable. I can provide .config upon request, but it is basically the same as the previous one except I have deselected the whole Netfilter stuff. Thanks. -- Hari harisri@bigpond.com ^ permalink raw reply [flat|nested] 25+ messages in thread
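The "Oops:" values in the reports above (0003 and 0002) are the i386 page-fault error code. A minimal C sketch of how its low three bits decode, assuming the 2.4-era i386 layout; the struct and function names are illustrative and are not taken from the kernel or from ksymoops:

```c
#include <assert.h>
#include <stdio.h>

/* Low bits of the i386 page-fault error code shown after "Oops:". */
#define PF_PROT  0x1u   /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u   /* 0: read access,      1: write access         */
#define PF_USER  0x4u   /* 0: kernel mode,      1: user mode            */

/* Illustrative type, not a kernel structure. */
struct oops_code {
    int protection;
    int write;
    int user;
};

/* Split the numeric code into its three flags. */
struct oops_code decode_oops_code(unsigned int code)
{
    struct oops_code d;

    d.protection = (code & PF_PROT) != 0;
    d.write      = (code & PF_WRITE) != 0;
    d.user       = (code & PF_USER) != 0;
    return d;
}

/* Print a one-line description of the code. */
void print_oops_code(unsigned int code)
{
    struct oops_code d = decode_oops_code(code);

    assert(code <= 0x7u);   /* only the three low bits are modelled here */
    printf("%04x: %s, %s, %s mode\n", code,
           d.write ? "write" : "read",
           d.protection ? "protection fault" : "page not present",
           d.user ? "user" : "kernel");
}
```

Read this way, 0003 is a kernel-mode write that hit a protection fault, and 0002 is a kernel-mode write to a page that was not present.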
* Re: 2.4.20pre11aa1 2002-10-17 15:26 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-17 16:27 ` Andrea Arcangeli [not found] ` <200210190014.19357.harisri@bigpond.com> 0 siblings, 1 reply; 25+ messages in thread
From: Andrea Arcangeli @ 2002-10-17 16:27 UTC (permalink / raw)
To: Srihari Vijayaraghavan; +Cc: Keith Owens, linux-kernel

On Fri, Oct 18, 2002 at 01:26:36AM +1000, Srihari Vijayaraghavan wrote:
> Hello Keith,
>
> > You don't need that, just mkdir /var/log/ksymoops. modprobe/insmod
> > will create a daily log file and snapshot a copy of lsmod and
> > /proc/ksyms for every module loaded or unloaded. All with sync in the
> > right places.
>
> Thanks, and that works fine.

if you enabled it before getting the new oopses, the interesting part is
for you to send me a tarball of /var/log/ksymoops, so that I can resolve
those module addresses too (please also send me your agpgart.o and your
radeon.o modules, all from the same kernel: the .o files, the ksymoops
logs and the oopses below).

thanks,

Andrea

^ permalink raw reply [flat|nested] 25+ messages in thread
[parent not found: <200210190014.19357.harisri@bigpond.com>]
* Re: 2.4.20pre11aa1 [not found] ` <200210190014.19357.harisri@bigpond.com> @ 2002-10-18 14:52 ` Andrea Arcangeli 2002-10-18 15:21 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 2002-10-18 15:34 ` 2.4.20pre11aa1 Keith Owens 0 siblings, 2 replies; 25+ messages in thread
From: Andrea Arcangeli @ 2002-10-18 14:52 UTC (permalink / raw)
To: Srihari Vijayaraghavan; +Cc: linux-kernel

On Sat, Oct 19, 2002 at 12:14:19AM +1000, Srihari Vijayaraghavan wrote:
> Oct 18 23:40:42 localhost kernel: Process modprobe (pid: 957,

modprobe was running at 234042, now in the log I see:

20021018 234001 start /sbin/modprobe -s -k -- char-major-14 safemode=1
20021018 234001 probe ended
20021018 234004 start /sbin/modprobe -s -k -- char-major-10-134 safemode=1
20021018 234004 probe ended
20021018 234014 start /sbin/modprobe -s -k -- char-major-10-134 safemode=1
20021018 234014 probe ended
20021018 234021 start /sbin/modprobe -s -k -- char-major-14 safemode=1
20021018 234021 probe ended
20021018 234022 start /sbin/modprobe -s -k -- ide-cd safemode=1
20021018 234022 probe ended
20021018 234022 start /sbin/modprobe -s -k -- ide-cd safemode=1
20021018 234022 probe ended
20021018 234040 start /sbin/modprobe -s -k -- char-major-14 safemode=1
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20021018 234040 probe ended
20021018 234051 start /sbin/modprobe -s -k -- binfmt-ffff safemode=1
20021018 234051 probe ended
20021018 234051 start /sbin/modprobe -s -k -- binfmt-ffff safemode=1
20021018 234051 probe ended

I don't see any modprobe in the logs at 234042 and the one at 234040 is
writing "probe ended" at 234040. maybe it was another modprobe that
crashed before it could write into the logs? or maybe it was the
underlined one that crashed after writing "probe ended"? But anyways it
looks like modprobe is innocent if it didn't write into the log any new
module loaded. Do you agree Keith?

if you still have the .config used to build the kernel please send it
too, thanks!

I've no idea why radeon or agpgart could generate corruption in my tree
and not in mainline and I can't reproduce. the best would be if you
could do a binary search on all the patches applied (first applying all
the [012]* and see if you can reproduce, and so on)

Andrea

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-18 14:52 ` 2.4.20pre11aa1 Andrea Arcangeli @ 2002-10-18 15:21 ` Srihari Vijayaraghavan 2002-10-18 15:34 ` 2.4.20pre11aa1 Keith Owens 1 sibling, 0 replies; 25+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-18 15:21 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel

Hello,

On Saturday 19 October 2002 00:52, Andrea Arcangeli wrote:
> if you still have the .config used to build the kernel please send it
> too, thanks!

CONFIG_X86=y
CONFIG_UID16=y
CONFIG_MODULES=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y
CONFIG_MK7=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG8=y
CONFIG_X86_HAS_TSC=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_USE_3DNOW=y
CONFIG_X86_PGE=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_F00F_WORKS_OK=y
CONFIG_X86_MCE=y
CONFIG_NOHIGHMEM=y
CONFIG_1GB=y
CONFIG_MTRR=y
CONFIG_X86_TSC=y
CONFIG_NET=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_PM=y
CONFIG_BLK_DEV_FD=y
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_RAID0=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_BLK_DEV_ADMA=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_IDEDMA_AUTO=y
CONFIG_BLK_DEV_IDE_MODES=y
CONFIG_NETDEVICES=y
CONFIG_PPP=y
CONFIG_PPP_ASYNC=y
CONFIG_PPP_DEFLATE=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_SERIAL_EXTENDED=y
CONFIG_UNIX98_PTYS=y
CONFIG_MOUSE=y
CONFIG_PSMOUSE=y
CONFIG_RTC=y
CONFIG_AGP=m
CONFIG_AGP_AMD=y
CONFIG_DRM=y
CONFIG_DRM_NEW=y
CONFIG_DRM_RADEON=m
CONFIG_EXT3_FS=y
CONFIG_JBD=y
CONFIG_RAMFS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_VGA_CONSOLE=y
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y

> I've no idea why radeon or agpgart could generate corruption in my tree
> and not in mainline and I can't reproduce. the best would be if you
> could do a binary search on all the patches applied (first applying all
> the [012]* and see if you can reproduce, and so on)

I will try that.

Thanks
--
Hari
harisri@bigpond.com

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-18 14:52 ` 2.4.20pre11aa1 Andrea Arcangeli 2002-10-18 15:21 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-18 15:34 ` Keith Owens 2002-10-18 16:00 ` 2.4.20pre11aa1 Andrea Arcangeli 1 sibling, 1 reply; 25+ messages in thread
From: Keith Owens @ 2002-10-18 15:34 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Srihari Vijayaraghavan, linux-kernel

On Fri, 18 Oct 2002 16:52:04 +0200, Andrea Arcangeli <andrea@suse.de> wrote:
>On Sat, Oct 19, 2002 at 12:14:19AM +1000, Srihari Vijayaraghavan wrote:
>> Oct 18 23:40:42 localhost kernel: Process modprobe (pid: 957,
>
>modprobe was running at 234042, now in the log I see:
>
>20021018 234001 start /sbin/modprobe -s -k -- char-major-14 safemode=1
>20021018 234001 probe ended
>20021018 234004 start /sbin/modprobe -s -k -- char-major-10-134 safemode=1
>20021018 234004 probe ended
>20021018 234014 start /sbin/modprobe -s -k -- char-major-10-134 safemode=1
>20021018 234014 probe ended
>20021018 234021 start /sbin/modprobe -s -k -- char-major-14 safemode=1
>20021018 234021 probe ended
>20021018 234022 start /sbin/modprobe -s -k -- ide-cd safemode=1
>20021018 234022 probe ended
>20021018 234022 start /sbin/modprobe -s -k -- ide-cd safemode=1
>20021018 234022 probe ended
>20021018 234040 start /sbin/modprobe -s -k -- char-major-14 safemode=1
>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>20021018 234040 probe ended
>20021018 234051 start /sbin/modprobe -s -k -- binfmt-ffff safemode=1
>20021018 234051 probe ended
>20021018 234051 start /sbin/modprobe -s -k -- binfmt-ffff safemode=1
>20021018 234051 probe ended
>
>I don't see any modprobe in the logs at 234042 and the one at 234040 is
>writing "probe ended" at 234040. maybe it was another modprobe that
>crashed before it could write into the logs? or maybe it was the
>underlined one that crashed after writing "probe ended"? But anyways it
>looks like modprobe is innocent if it didn't write into the log any new
>module loaded. Do you agree Keith?

modprobe appends to the log for all operations that might change the
module state. The data is flushed before changing module state, with

    snap_shot_log()
    fprintf(log, "\n");
    fflush(log);
    fdatasync(fileno(log));
    fclose(log);

so the log should always be valid, even if modprobe then crashes.
There is no system code after modprobe writes 'probe ended', crashes
after writing 'probe ended' should not be possible.

Three possibilities :-

(a) The modprobe at 234040 completed the load successfully then the
    oops occurred before the modprobe task was completely purged. IOW, the
    module loaded, module_init() ran, modprobe returned to user space then
    the module died handling some event.

(b) The failing modprobe at 234042 is real, but is performing an
    operation that will not change module state. For example, it is
    doing modprobe -n, this will not log but will still invoke some module
    syscalls. The oops is then caused by corrupt module tables.

(c) modprobe is not being run as root so it cannot log. Although it
    cannot actually change module state, it will do part of the work in
    extracting existing module symbols. Again, the oops is caused by
    corrupt module tables.

^ permalink raw reply [flat|nested] 25+ messages in thread
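The flush sequence Keith quotes (fprintf, fflush, fdatasync, fclose) can be sketched as a small standalone helper. This is a minimal illustration assuming a POSIX system; `append_log_durably` is a hypothetical name and this is not the modutils source:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Append one line to a log file and push it to stable storage before
 * returning, so the entry survives a crash of the caller.
 * Returns 0 on success, -1 on any failure.
 * Hypothetical helper for illustration only.
 */
int append_log_durably(const char *path, const char *line)
{
    FILE *log;
    int ok;

    assert(path != NULL && line != NULL);

    log = fopen(path, "a");
    if (log == NULL)
        return -1;

    ok = fprintf(log, "%s\n", line) >= 0;      /* lands in the stdio buffer */
    ok = ok && fflush(log) == 0;               /* stdio buffer -> kernel    */
    ok = ok && fdatasync(fileno(log)) == 0;    /* kernel cache -> disk      */
    if (fclose(log) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

The order matters: fflush() only hands the data to the kernel, and fdatasync() is the call that asks the kernel to write it out, which is why a log written this way should still be readable if the process dies immediately afterwards.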
* Re: 2.4.20pre11aa1 2002-10-18 15:34 ` 2.4.20pre11aa1 Keith Owens @ 2002-10-18 16:00 ` Andrea Arcangeli 2002-10-19 1:21 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread
From: Andrea Arcangeli @ 2002-10-18 16:00 UTC (permalink / raw)
To: Keith Owens; +Cc: Srihari Vijayaraghavan, linux-kernel

On Sat, Oct 19, 2002 at 01:34:06AM +1000, Keith Owens wrote:
> Three possibilities :-
>
> (a) The modprobe at 234040 completed the load successfully then the
> oops occurred before the modprobe task was completely purged. IOW, the
> module loaded, module_init() ran, modprobe returned to user space then
> the module died handling some event.
>
> (b) The failing modprobe at 234042 is real, but is performing an
> operation that will not change module state. For example, it is
> doing modprobe -n, this will not log but will still invoke some module
> syscalls. The oops is then caused by corrupt module tables.
>
> (c) modprobe is not being run as root so it cannot log. Although it
> cannot actually change module state, it will do part of the work in
> extracting existing module symbols. Again, the oops is caused by
> corrupt module tables.

thanks for the help.

the corrupted module tables rings a bell. I fixed the wrong locking in
the module code that could corrupt these tables (they were relying on
the bkl but the bkl means nothing if you copy_user in the middle of the
loop like the module code does, so I replaced the bkl with a semaphore
and that should fix things), but I wonder if I broke something else
with these fixes.

Here's the patch that I'm talking about, you may want to start the
binary search backing this out and see if the problem goes away.
if it goes away I clearly need to double check it ;) diff -urNp x-ref/kernel/module.c x/kernel/module.c --- x-ref/kernel/module.c Tue Jan 22 18:56:00 2002 +++ x/kernel/module.c Thu Oct 10 23:47:20 2002 @@ -78,6 +78,8 @@ static int kmalloc_failed; spinlock_t modlist_lock = SPIN_LOCK_UNLOCKED; +static DECLARE_MUTEX(module_mutex); + /** * inter_module_register - register a new set of inter module data. * @im_name: an arbitrary string to identify the data, must be unique @@ -298,7 +300,7 @@ sys_create_module(const char *name_user, if (!capable(CAP_SYS_MODULE)) return -EPERM; - lock_kernel(); + down(&module_mutex); if ((namelen = get_mod_name(name_user, &name)) < 0) { error = namelen; goto err0; @@ -334,7 +336,7 @@ sys_create_module(const char *name_user, err1: put_mod_name(name); err0: - unlock_kernel(); + up(&module_mutex); return error; } @@ -353,7 +355,7 @@ sys_init_module(const char *name_user, s if (!capable(CAP_SYS_MODULE)) return -EPERM; - lock_kernel(); + down(&module_mutex); if ((namelen = get_mod_name(name_user, &name)) < 0) { error = namelen; goto err0; @@ -549,13 +551,16 @@ sys_init_module(const char *name_user, s /* Initialize the module. */ atomic_set(&mod->uc.usecount,1); mod->flags |= MOD_INITIALIZING; + up(&module_mutex); if (mod->init && (error = mod->init()) != 0) { + down(&module_mutex); atomic_set(&mod->uc.usecount,0); mod->flags &= ~MOD_INITIALIZING; if (error > 0) /* Buggy module */ error = -EBUSY; goto err0; } + down(&module_mutex); atomic_dec(&mod->uc.usecount); /* And set it running. 
*/ @@ -571,7 +576,7 @@ err2: err1: put_mod_name(name); err0: - unlock_kernel(); + up(&module_mutex); kfree(name_tmp); return error; } @@ -602,7 +607,7 @@ sys_delete_module(const char *name_user) if (!capable(CAP_SYS_MODULE)) return -EPERM; - lock_kernel(); + down(&module_mutex); if (name_user) { if ((error = get_mod_name(name_user, &name)) < 0) goto out; @@ -664,7 +669,7 @@ restart: error = 0; out: - unlock_kernel(); + up(&module_mutex); return error; } @@ -887,7 +892,7 @@ sys_query_module(const char *name_user, struct module *mod; int err; - lock_kernel(); + down(&module_mutex); if (name_user == NULL) mod = &kernel_module; else { @@ -937,7 +942,7 @@ sys_query_module(const char *name_user, atomic_dec(&mod->uc.usecount); out: - unlock_kernel(); + up(&module_mutex); return err; } @@ -956,7 +961,7 @@ sys_get_kernel_syms(struct kernel_sym *t int i; struct kernel_sym ksym; - lock_kernel(); + down(&module_mutex); for (mod = module_list, i = 0; mod; mod = mod->next) { /* include the count for the module name! 
*/ i += mod->nsyms + 1; @@ -999,7 +1004,7 @@ sys_get_kernel_syms(struct kernel_sym *t } } out: - unlock_kernel(); + up(&module_mutex); return i; } @@ -1037,8 +1042,11 @@ free_module(struct module *mod, int tag_ if (mod->flags & MOD_RUNNING) { - if(mod->cleanup) + if(mod->cleanup) { + up(&module_mutex); mod->cleanup(); + down(&module_mutex); + } mod->flags &= ~MOD_RUNNING; } @@ -1082,6 +1090,7 @@ int get_module_list(char *p) char tmpstr[64]; struct module_ref *ref; + down(&module_mutex); for (mod = module_list; mod != &kernel_module; mod = mod->next) { long len; const char *q; @@ -1150,6 +1159,7 @@ int get_module_list(char *p) } fini: + up(&module_mutex); return PAGE_SIZE - left; } @@ -1172,7 +1182,7 @@ static void *s_start(struct seq_file *m, if (!p) return ERR_PTR(-ENOMEM); - lock_kernel(); + down(&module_mutex); for (v = module_list, n = *pos; v; n -= v->nsyms, v = v->next) { if (n < v->nsyms) { p->mod = v; @@ -1180,7 +1190,7 @@ static void *s_start(struct seq_file *m, return p; } } - unlock_kernel(); + up(&module_mutex); kfree(p); return NULL; } @@ -1193,7 +1203,7 @@ static void *s_next(struct seq_file *m, do { v->mod = v->mod->next; if (!v->mod) { - unlock_kernel(); + up(&module_mutex); kfree(p); return NULL; } @@ -1206,7 +1216,7 @@ static void *s_next(struct seq_file *m, static void s_stop(struct seq_file *m, void *p) { if (p && !IS_ERR(p)) { - unlock_kernel(); + up(&module_mutex); kfree(p); } } Andrea ^ permalink raw reply [flat|nested] 25+ messages in thread
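The patch above releases module_mutex before calling mod->init() and mod->cleanup(), and takes it again afterwards. A userspace sketch of that pattern, assuming POSIX threads with an error-checking mutex standing in for the kernel semaphore. All names are hypothetical, and the re-entrant callback only illustrates what a held non-recursive lock would block; it is not a reason stated in the thread:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Stand-in for module_mutex. Error-checking, so a second lock attempt by
 * the owning thread fails with EDEADLK instead of hanging. */
static pthread_mutex_t registry_mutex;
static int registry_count;          /* protected by registry_mutex */

void registry_setup(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&registry_mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    registry_count = 0;
}

/* Any reader takes the lock. Returns the entry count, or -1 if the lock
 * could not be taken (here: the caller already holds it). */
int registry_query(void)
{
    int n;

    if (pthread_mutex_lock(&registry_mutex) != 0)
        return -1;
    n = registry_count;
    pthread_mutex_unlock(&registry_mutex);
    return n;
}

/* Add an entry and run its init callback. With drop_lock set, the lock is
 * released around the callback, mirroring up()/down() around mod->init(). */
int registry_add(int (*init)(void), int drop_lock)
{
    int err;

    assert(init != NULL);
    pthread_mutex_lock(&registry_mutex);
    registry_count++;                       /* entry visible, initializing */

    if (drop_lock)
        pthread_mutex_unlock(&registry_mutex);
    err = init();                           /* may call registry_query() */
    if (drop_lock)
        pthread_mutex_lock(&registry_mutex);

    if (err != 0)
        registry_count--;                   /* roll back a failed init */
    pthread_mutex_unlock(&registry_mutex);
    return err;
}

/* Example callback that re-enters the registry. */
int init_that_queries(void)
{
    return registry_query() < 0 ? -1 : 0;
}
```

After registry_setup(), registry_add(init_that_queries, 1) succeeds, while the same call with drop_lock set to 0 fails inside the callback because the lock is still held.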
* Re: 2.4.20pre11aa1 2002-10-18 16:00 ` 2.4.20pre11aa1 Andrea Arcangeli @ 2002-10-19 1:21 ` Srihari Vijayaraghavan 2002-10-19 1:25 ` 2.4.20pre11aa1 Andrea Arcangeli 0 siblings, 1 reply; 25+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-19 1:21 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel, Keith Owens

Hello Andrea,

On Saturday 19 October 2002 02:00, Andrea Arcangeli wrote:
> the corrupted module tables rings a bell. I fixed the wrong locking in
> the module code that could corrupt these tables (they were relying on
> the bkl but the bkl means nothing if you copy_user in the middle of the
> loop like the module code does, so I replaced the bkl with a semaphore
> and that should fix things), but I wonder if I broke something else
> with these fixes.
>
> Here's the patch that I'm talking about, you may want to start the
> binary search backing this out and see if the problem goes away. if it
> goes away I clearly need to double check it ;)

Unfortunately removing that change from kernel/module.c did not help.

I may be wrong but considering in my case the kernel is crashing whether
agpgart/radeon are compiled as modules or built-in, I suspect that this issue
is larger than just modules sub-system.

Anyway I will start applying the patches from 00* on-wards from your tree to
see if I can reliably prove where the problem is.

Thanks.
--
Hari
harisri@bigpond.com

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-19 1:21 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-19 1:25 ` Andrea Arcangeli 2002-10-22 10:48 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread
From: Andrea Arcangeli @ 2002-10-19 1:25 UTC (permalink / raw)
To: Srihari Vijayaraghavan; +Cc: linux-kernel, Keith Owens

On Sat, Oct 19, 2002 at 11:21:19AM +1000, Srihari Vijayaraghavan wrote:
> I may be wrong but considering in my case the kernel is crashing whether
> agpgart/radeon are compiled as modules or built-in, I suspect that this issue
> is larger than just modules sub-system.

agreed. the oops in modprobe sounds more like a coincidence now.

> Anyway I will start applying the patches from 00* on-wards from your tree to
> see if I can reliably prove where the problem is.

that will help a lot, thanks!

Andrea

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-19 1:25 ` 2.4.20pre11aa1 Andrea Arcangeli @ 2002-10-22 10:48 ` Srihari Vijayaraghavan 2002-10-22 14:55 ` 2.4.20pre11aa1 Andrea Arcangeli 0 siblings, 1 reply; 25+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-22 10:48 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel

Hello Andrea,

On Saturday 19 October 2002 11:25, Andrea Arcangeli wrote:
> that will help a lot, thanks!

Is there a quick HOWTO on how to apply the individual patches?

Do I apply 00*gz patches after applying 00* patches?

When I tried the above procedure there were a lot of hunks and it did not
compile bzImage and agpgart.o etc..

Thanks
--
Hari
harisri@bigpond.com

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-22 10:48 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-22 14:55 ` Andrea Arcangeli 2002-10-23 12:27 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread
From: Andrea Arcangeli @ 2002-10-22 14:55 UTC (permalink / raw)
To: Srihari Vijayaraghavan; +Cc: linux-kernel

On Tue, Oct 22, 2002 at 08:48:05PM +1000, Srihari Vijayaraghavan wrote:
> Hello Andrea,
>
> On Saturday 19 October 2002 11:25, Andrea Arcangeli wrote:
> > that will help a lot, thanks!
>
> Is there a quick HOWTO on how to apply the individual patches?
>
> Do I apply 00*gz patches after applying 00* patches?

gz doesn't matter, the `ls` ordering is the only thing that matters. You
can gzip -d * and then apply [0123]* and see if it still breaks.

> When I tried the above procedure there were a lot of hunks and it did not
> compile bzImage and agpgart.o etc..

something like this will apply cleanly, if every patch is self contained
as it should be, it will compile correctly too:

	rm ../2.4.20pre11aa1/*.bz2
	gzip -d ../2.4.20pre11aa1/*.gz
	for i in ../2.4.20pre11aa1/[0123]*; do patch -p1 < "$i"; done

Andrea

^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-22 14:55 ` 2.4.20pre11aa1 Andrea Arcangeli @ 2002-10-23 12:27 ` Srihari Vijayaraghavan 2002-10-23 12:46 ` 2.4.20pre11aa1 Andrea Arcangeli 0 siblings, 1 reply; 25+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-23 12:27 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-kernel

Hello Andrea,

On Wednesday 23 October 2002 00:55, Andrea Arcangeli wrote:
> something like this will apply cleanly, if every patch is self contained
> as it should be, it will compile correctly too:
>
> 	rm ../2.4.20pre11aa1/*.bz2
> 	gzip -d ../2.4.20pre11aa1/*.gz
> 	for i in ../2.4.20pre11aa1/[0123]*; do patch -p1 < "$i"; done

Thanks that is neat.

I was able to trigger a few oopses with the [0123]* patches.

ksymoops 2.4.5 on i686 2.4.20-pre11aa1-0123. Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre11aa1-0123/ (default)
     -m /boot/System.map-2.4.20-pre11aa1-0123 (default)

Warning: You did not tell me where to find symbol information. I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc. ksymoops -h explains the options.

Oct 23 21:23:22 localhost kernel: Unable to handle kernel paging request at virtual address c463b440 Oct 23 21:23:22 localhost kernel: c01485d1 Oct 23 21:23:22 localhost kernel: *pde = 045fe163 Oct 23 21:23:22 localhost kernel: Oops: 0003 Oct 23 21:23:22 localhost kernel: CPU: 0 Oct 23 21:23:22 localhost kernel: EIP: 0010:[<c01485d1>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 Oct 23 21:23:22 localhost kernel: EFLAGS: 00010282 Oct 23 21:23:22 localhost kernel: eax: c463b440 ebx: c463b440 ecx: c5938080 edx: 00000296 Oct 23 21:23:22 localhost kernel: esi: c6a6f6c8 edi: c6a6f680 ebp: 00000001 esp: c85d1f18 Oct 23 21:23:22 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 23 21:23:22 localhost kernel: Process bonobo-activati (pid: 795, stackpage=c85d1000) Oct 23 21:23:22 localhost kernel: Stack: c01a492d c158f2dc 00000000 c6a6f6c8 c01de68b c463b440 00000000 00000217 Oct 23 21:23:22 localhost kernel: c158e480 c463b440 c5706b50 c158e200 c5706a40 c6cf4140 c01de987 c6a6f680 Oct 23 21:23:22 localhost kernel: 00000000 c01a236c c5706b50 c5706a40 c01a2949 c5706b50 c641f3c0 00000000 Oct 23 21:23:22 localhost kernel: Call Trace: [<c01a492d>] [<c01de68b>] [<c01de987>] [<c01a236c>] [<c01a2949>] Oct 23 21:23:22 localhost kernel: [<c0136782>] [<c0134e6d>] [<c0134eee>] [<c010737f>] Oct 23 21:23:22 localhost kernel: Code: ff 0b 0f 94 c0 84 c0 0f 84 8f 00 00 00 8d 73 18 39 73 18 74 >>EIP; c01485d1 <dput+11/110> <===== >>eax; c463b440 <END_OF_CODE+66e505/????> >>ebx; c463b440 <END_OF_CODE+66e505/????> >>ecx; c5938080 <END_OF_CODE+196b145/????> >>esi; c6a6f6c8 <END_OF_CODE+2aa278d/????> >>edi; c6a6f680 <END_OF_CODE+2aa2745/????> >>esp; c85d1f18 <END_OF_CODE+4604fdd/????> Trace; c01a492d <sk_free+2d/60> Trace; c01de68b <unix_release_sock+11b/1d0> Trace; c01de987 <unix_release+27/30> Trace; c01a236c <sock_release+5c/60> Trace; c01a2949 <sock_close+39/60> Trace; c0136782 <fput+102/130> Trace; c0134e6d <filp_close+4d/80> Trace; c0134eee <sys_close+4e/60> Trace; c010737f 
<system_call+33/38> Code; c01485d1 <dput+11/110> 00000000 <_EIP>: Code; c01485d1 <dput+11/110> <===== 0: ff 0b decl (%ebx) <===== Code; c01485d3 <dput+13/110> 2: 0f 94 c0 sete %al Code; c01485d6 <dput+16/110> 5: 84 c0 test %al,%al Code; c01485d8 <dput+18/110> 7: 0f 84 8f 00 00 00 je 9c <_EIP+0x9c> Code; c01485de <dput+1e/110> d: 8d 73 18 lea 0x18(%ebx),%esi Code; c01485e1 <dput+21/110> 10: 39 73 18 cmp %esi,0x18(%ebx) Code; c01485e4 <dput+24/110> 13: 74 00 je 15 <_EIP+0x15> Oct 23 21:23:22 localhost kernel: <1>Unable to handle kernel paging request at virtual address c4c6a360 Oct 23 21:23:22 localhost kernel: c0137103 Oct 23 21:23:22 localhost kernel: *pde = 04c001e3 Oct 23 21:23:22 localhost kernel: Oops: 0002 Oct 23 21:23:22 localhost kernel: CPU: 0 Oct 23 21:23:22 localhost kernel: EIP: 0010:[<c0137103>] Not tainted Oct 23 21:23:22 localhost kernel: EFLAGS: 00013286 Oct 23 21:23:22 localhost kernel: eax: c4c6a340 ebx: 00000000 ecx: c916b940 edx: c025ec44 Oct 23 21:23:22 localhost kernel: esi: c916b940 edi: c1ee3930 ebp: c1ee3cc0 esp: c1c11e54 Oct 23 21:23:22 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 23 21:23:22 localhost kernel: Process kjournald (pid: 136, stackpage=c1c11000) Oct 23 21:23:22 localhost kernel: Stack: 00000000 c01379e8 c916b940 00000000 c916b940 c1ee3450 c0169b7e c916b940 Oct 23 21:23:22 localhost kernel: 0000002d c1c11ea8 000002fa ffffffff c1c10000 dffceaf4 00000000 00000000 Oct 23 21:23:22 localhost kernel: 00000000 00000000 c1ca2c40 c1b72540 000002fa c90e1640 c90e15c0 c8576a40 Oct 23 21:23:22 localhost kernel: Call Trace: [<c01379e8>] [<c0169b7e>] [<c011350b>] [<c016bf5c>] [<c016be00>] Oct 23 21:23:22 localhost kernel: [<c010576e>] [<c016be20>] Oct 23 21:23:22 localhost kernel: Code: 89 48 20 8b 02 89 48 24 ff 04 9d 50 ec 25 c0 0f b7 41 08 01 >>EIP; c0137103 <__insert_into_lru_list+43/60> <===== >>eax; c4c6a340 <END_OF_CODE+c9d405/????> >>ecx; c916b940 <END_OF_CODE+519ea05/????> >>edx; c025ec44 <lru_list+0/c> >>esi; c916b940 
<END_OF_CODE+519ea05/????> >>edi; c1ee3930 <[md].bss.end+216dd1/2273521> >>ebp; c1ee3cc0 <[md].bss.end+217161/2273521> >>esp; c1c11e54 <_end+1997e04/1a32030> Trace; c01379e8 <__refile_buffer+58/70> Trace; c0169b7e <journal_commit_transaction+105e/11c0> Trace; c011350b <schedule+15b/240> Trace; c016bf5c <kjournald+13c/1d0> Trace; c016be00 <commit_timeout+0/10> Trace; c010576e <kernel_thread+2e/40> Trace; c016be20 <kjournald+0/1d0> Code; c0137103 <__insert_into_lru_list+43/60> 00000000 <_EIP>: Code; c0137103 <__insert_into_lru_list+43/60> <===== 0: 89 48 20 mov %ecx,0x20(%eax) <===== Code; c0137106 <__insert_into_lru_list+46/60> 3: 8b 02 mov (%edx),%eax Code; c0137108 <__insert_into_lru_list+48/60> 5: 89 48 24 mov %ecx,0x24(%eax) Code; c013710b <__insert_into_lru_list+4b/60> 8: ff 04 9d 50 ec 25 c0 incl 0xc025ec50(,%ebx,4) Code; c0137112 <__insert_into_lru_list+52/60> f: 0f b7 41 08 movzwl 0x8(%ecx),%eax Code; c0137116 <__insert_into_lru_list+56/60> 13: 01 00 add %eax,(%eax) Oct 23 21:23:22 localhost kernel: <1>Unable to handle kernel paging request at virtual address c51c0098 Oct 23 21:23:22 localhost kernel: c0119a10 Oct 23 21:23:22 localhost kernel: *pde = 050001e3 Oct 23 21:23:22 localhost kernel: Oops: 0000 Oct 23 21:23:22 localhost kernel: CPU: 0 Oct 23 21:23:22 localhost kernel: EIP: 0010:[<c0119a10>] Not tainted Oct 23 21:23:22 localhost kernel: EFLAGS: 00013206 Oct 23 21:23:22 localhost kernel: eax: 00000000 ebx: c51c0000 ecx: c193f000 edx: 00000000 Oct 23 21:23:22 localhost kernel: esi: c1c10000 edi: 0000006a ebp: 0000000b esp: c1c11d08 Oct 23 21:23:22 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 23 21:23:22 localhost kernel: Process kjournald (pid: 136, stackpage=c1c11000) Oct 23 21:23:22 localhost kernel: Stack: c1587bb8 c193f040 c1c10000 00000000 c1c10000 0000006a 0000000b c0119f00 Oct 23 21:23:22 localhost kernel: c1c10000 00000002 c1c11e20 00000002 0000006a c1c10000 c01079f2 0000000b Oct 23 21:23:22 localhost kernel: c01edc4a 00000002 4942412e 
c01123c4 c01edc4a c1c11e20 00000002 c0276784 Oct 23 21:23:22 localhost kernel: Call Trace: [<c0119f00>] [<c01079f2>] [<c01123c4>] [<c019bc12>] [<c0137cab>] Oct 23 21:23:22 localhost kernel: [<c018f4ec>] [<c018f8d5>] [<c018fac5>] [<c01120b0>] [<c0107470>] [<c0137103>] Oct 23 21:23:22 localhost kernel: [<c01379e8>] [<c0169b7e>] [<c011350b>] [<c016bf5c>] [<c016be00>] [<c010576e>] Oct 23 21:23:22 localhost kernel: [<c016be20>] Oct 23 21:23:22 localhost kernel: Code: 39 b3 98 00 00 00 0f 84 85 02 00 00 8b 5b 50 81 fb 00 80 21 >>EIP; c0119a10 <exit_notify+20/300> <===== >>ebx; c51c0000 <END_OF_CODE+11f30c5/????> >>ecx; c193f000 <_end+16c4fb0/1a32030> >>esi; c1c10000 <_end+1995fb0/1a32030> >>esp; c1c11d08 <_end+1997cb8/1a32030> Trace; c0119f00 <do_exit+210/260> Trace; c01079f2 <die+72/80> Trace; c01123c4 <do_page_fault+314/5d0> Trace; c019bc12 <do_rw_disk+4b2/5c0> Trace; c0137cab <create_buffers+6b/e0> Trace; c018f4ec <ide_wait_stat+bc/130> Trace; c018f8d5 <start_request+1b5/250> Trace; c018fac5 <ide_do_request+c5/1c0> Trace; c01120b0 <do_page_fault+0/5d0> Trace; c0107470 <error_code+34/3c> Trace; c0137103 <__insert_into_lru_list+43/60> Trace; c01379e8 <__refile_buffer+58/70> Trace; c0169b7e <journal_commit_transaction+105e/11c0> Trace; c011350b <schedule+15b/240> Trace; c016bf5c <kjournald+13c/1d0> Trace; c016be00 <commit_timeout+0/10> Trace; c010576e <kernel_thread+2e/40> Trace; c016be20 <kjournald+0/1d0> Code; c0119a10 <exit_notify+20/300> 00000000 <_EIP>: Code; c0119a10 <exit_notify+20/300> <===== 0: 39 b3 98 00 00 00 cmp %esi,0x98(%ebx) <===== Code; c0119a16 <exit_notify+26/300> 6: 0f 84 85 02 00 00 je 291 <_EIP+0x291> Code; c0119a1c <exit_notify+2c/300> c: 8b 5b 50 mov 0x50(%ebx),%ebx Code; c0119a1f <exit_notify+2f/300> f: 81 fb 00 80 21 00 cmp $0x218000,%ebx Oct 23 21:23:22 localhost kernel: <1>Unable to handle kernel paging request at virtual address c54bc098 Oct 23 21:23:22 localhost kernel: c0119a10 Oct 23 21:23:22 localhost kernel: *pde = 054001e3 Oct 23 
21:23:22 localhost kernel: Oops: 0000 Oct 23 21:23:22 localhost kernel: CPU: 0 Oct 23 21:23:22 localhost kernel: EIP: 0010:[<c0119a10>] Not tainted Oct 23 21:23:23 localhost kernel: EFLAGS: 00013206 Oct 23 21:23:23 localhost kernel: eax: 00000000 ebx: c54bc000 ecx: 00000000 edx: 00000000 Oct 23 21:23:23 localhost kernel: esi: c1c10000 edi: 000001c0 ebp: 0000000b esp: c1c11bbc Oct 23 21:23:23 localhost kernel: ds: 0018 es: 0018 ss: 0018 Oct 23 21:23:23 localhost kernel: Process kjournald (pid: 136, stackpage=c1c11000) Oct 23 21:23:23 localhost kernel: Stack: 00000020 00000400 c1c10000 00000000 c1c10000 000001c0 0000000b c0119f00 Oct 23 21:23:23 localhost kernel: c1c10000 00000000 c1c11cd4 00000000 000001c0 c1c10000 c01079f2 0000000b Oct 23 21:23:23 localhost kernel: c01edc4a 00000000 24548924 c01123c4 c01edc4a c1c11cd4 00000000 33323130 Oct 23 21:23:23 localhost kernel: Call Trace: [<c0119f00>] [<c01079f2>] [<c01123c4>] [<c0185ba9>] [<c0185ba9>] Oct 23 21:23:23 localhost kernel: [<c0185ba9>] [<c01167bf>] [<c0185ba9>] [<c0185ba9>] [<c01120b0>] [<c0107470>] Oct 23 21:23:23 localhost kernel: [<c0119a10>] [<c0119f00>] [<c01079f2>] [<c01123c4>] [<c019bc12>] [<c0137cab>] Oct 23 21:23:23 localhost kernel: [<c018f4ec>] [<c018f8d5>] [<c018fac5>] [<c01120b0>] [<c0107470>] [<c0137103>] Oct 23 21:23:23 localhost kernel: [<c01379e8>] [<c0169b7e>] [<c011350b>] [<c016bf5c>] [<c016be00>] [<c010576e>] Oct 23 21:23:23 localhost kernel: [<c016be20>] Oct 23 21:23:23 localhost kernel: Code: 39 b3 98 00 00 00 0f 84 85 02 00 00 8b 5b 50 81 fb 00 80 21 >>EIP; c0119a10 <exit_notify+20/300> <===== >>ebx; c54bc000 <END_OF_CODE+14ef0c5/????> >>esi; c1c10000 <_end+1995fb0/1a32030> >>esp; c1c11bbc <_end+1997b6c/1a32030> Trace; c0119f00 <do_exit+210/260> Trace; c01079f2 <die+72/80> Trace; c01123c4 <do_page_fault+314/5d0> Trace; c0185ba9 <vt_console_print+59/310> Trace; c0185ba9 <vt_console_print+59/310> Trace; c0185ba9 <vt_console_print+59/310> Trace; c01167bf <__call_console_drivers+5f/70> 
Trace; c0185ba9 <vt_console_print+59/310> Trace; c0185ba9 <vt_console_print+59/310> Trace; c01120b0 <do_page_fault+0/5d0> Trace; c0107470 <error_code+34/3c> Trace; c0119a10 <exit_notify+20/300> Trace; c0119f00 <do_exit+210/260> Trace; c01079f2 <die+72/80> Trace; c01123c4 <do_page_fault+314/5d0> Trace; c019bc12 <do_rw_disk+4b2/5c0> Trace; c0137cab <create_buffers+6b/e0> Trace; c018f4ec <ide_wait_stat+bc/130> Trace; c018f8d5 <start_request+1b5/250> Trace; c018fac5 <ide_do_request+c5/1c0> Trace; c01120b0 <do_page_fault+0/5d0> Trace; c0107470 <error_code+34/3c> Trace; c0137103 <__insert_into_lru_list+43/60> Trace; c01379e8 <__refile_buffer+58/70> Trace; c0169b7e <journal_commit_transaction+105e/11c0> Trace; c011350b <schedule+15b/240> Trace; c016bf5c <kjournald+13c/1d0> Trace; c016be00 <commit_timeout+0/10> Trace; c010576e <kernel_thread+2e/40> Trace; c016be20 <kjournald+0/1d0> Code; c0119a10 <exit_notify+20/300> 00000000 <_EIP>: Code; c0119a10 <exit_notify+20/300> <===== 0: 39 b3 98 00 00 00 cmp %esi,0x98(%ebx) <===== Code; c0119a16 <exit_notify+26/300> 6: 0f 84 85 02 00 00 je 291 <_EIP+0x291> Code; c0119a1c <exit_notify+2c/300> c: 8b 5b 50 mov 0x50(%ebx),%ebx Code; c0119a1f <exit_notify+2f/300> f: 81 fb 00 80 21 00 cmp $0x218000,%ebx 1 warning issued. Results may not be reliable. When I tried to see if I can trigger the oops with only 0* patches, I couldn't compile the kernel. Here is the standard error stream of 'make dep clean ; make bzImage' : module.c:7:28: linux/rcupdate.h: No such file or directory module.c: In function `free_module': module.c:1082: warning: implicit declaration of function `synchronize_kernel' make[2]: *** [module.o] Error 1 make[1]: *** [first_rule] Error 2 make: *** [_dir_kernel] Error 2 BTW I heard DaveM mentioning about AMD only bugs appearing during 2.4.20-pre series, I am not sure about -aa series though. I thought of testing the -aa/radeon/agpgart on my friend's computer which is an Intel P-III/VIA Chipset mother board. 
Thanks for your help. -- Hari harisri@bigpond.com
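The decoded trace above pairs each raw address with a `symbol+offset/size` label. As a rough illustration of how such a lookup can work (a sketch only, not ksymoops's actual implementation), the Python below searches a sorted symbol table for the last symbol starting at or below an address. The two start addresses are inferred from the trace itself (`exit_notify+20/300` at c0119a10, `do_exit+210/260` at c0119f00); the table, the `END_OF_TABLE` bound, and the function name `resolve` are hypothetical.

```python
import bisect

# Hypothetical two-entry excerpt of a symbol table. The start addresses are
# derived from the trace: c0119a10 - 0x20 = c01199f0 (exit_notify) and
# c0119f00 - 0x210 = c0119cf0 (do_exit). A real System.map has thousands
# of entries.
SYMBOLS = [
    (0xC01199F0, "exit_notify"),
    (0xC0119CF0, "do_exit"),
]
# Assumed end of the last symbol: do_exit start + 0x260, as sized in the trace.
END_OF_TABLE = 0xC0119F50


def resolve(addr, symbols=SYMBOLS, end=END_OF_TABLE):
    """Return 'name+offset/size' for addr, or None if addr is outside the table."""
    starts = [start for start, _ in symbols]
    # Index of the last symbol whose start address is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0 or addr >= end:
        return None
    start, name = symbols[i]
    # A symbol's size is the distance to the next symbol (or to the table end).
    next_start = symbols[i + 1][0] if i + 1 < len(symbols) else end
    return f"{name}+{addr - start:x}/{next_start - start:x}"
```

With this table, the faulting EIP c0119a10 resolves to `exit_notify+20/300`, while the bad pointer c54bc098 lies outside every known symbol and yields `None`, which mirrors the `????` size shown for `ebx` in the decoded output.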
* Re: 2.4.20pre11aa1 2002-10-23 12:27 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-23 12:46 ` Andrea Arcangeli 2002-10-23 14:26 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread From: Andrea Arcangeli @ 2002-10-23 12:46 UTC (permalink / raw) To: Srihari Vijayaraghavan; +Cc: linux-kernel On Wed, Oct 23, 2002 at 10:27:47PM +1000, Srihari Vijayaraghavan wrote: > module.c:7:28: linux/rcupdate.h: No such file or directory > module.c: In function `free_module': > module.c:1082: warning: implicit declaration of function `synchronize_kernel' > make[2]: *** [module.o] Error 1 > make[1]: *** [first_rule] Error 2 > make: *** [_dir_kernel] Error 2 OK, please try to back out 2.4.20pre11aa1/00_reduce-module-races-1. I just moved it into the 20 series. That should fix this bit. Andrea
* Re: 2.4.20pre11aa1 2002-10-23 12:46 ` 2.4.20pre11aa1 Andrea Arcangeli @ 2002-10-23 14:26 ` Srihari Vijayaraghavan 2002-10-23 14:35 ` 2.4.20pre11aa1 Andrea Arcangeli 0 siblings, 1 reply; 25+ messages in thread From: Srihari Vijayaraghavan @ 2002-10-23 14:26 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Hello Andrea, On Wednesday 23 October 2002 22:46, Andrea Arcangeli wrote: > OK, please try to back out 2.4.20pre11aa1/00_reduce-module-races-1. > I just moved it into the 20 series. That should fix this bit. Yes, I did that. I renamed it to _00_reduce-module-races-1, and did the patching again. But that did not help. Here is the current std_err: exit.c: In function `release_task': exit.c:44: warning: implicit declaration of function `sched_exit' shmem.c: In function `shmem_getpage_locked': shmem.c:560: warning: unused variable `flags' {standard input}: Assembler messages: {standard input}:1014: Warning: indirect lcall without `*' {standard input}:1091: Warning: indirect lcall without `*' {standard input}:1176: Warning: indirect lcall without `*' {standard input}:1255: Warning: indirect lcall without `*' {standard input}:1271: Warning: indirect lcall without `*' {standard input}:1281: Warning: indirect lcall without `*' {standard input}:1349: Warning: indirect lcall without `*' {standard input}:1364: Warning: indirect lcall without `*' {standard input}:1375: Warning: indirect lcall without `*' {standard input}:1874: Warning: indirect lcall without `*' {standard input}:1960: Warning: indirect lcall without `*' init_task.c:3:34: linux/sched_runqueue.h: No such file or directory make[1]: *** [init_task.o] Error 1 make: *** [_dir_arch/i386/kernel] Error 2 Thanks. -- Hari harisri@bigpond.com
* Re: 2.4.20pre11aa1 2002-10-23 14:26 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-23 14:35 ` Andrea Arcangeli 2002-10-25 14:03 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread From: Andrea Arcangeli @ 2002-10-23 14:35 UTC (permalink / raw) To: Srihari Vijayaraghavan; +Cc: linux-kernel On Thu, Oct 24, 2002 at 12:26:36AM +1000, Srihari Vijayaraghavan wrote: > Hello Andrea, > > On Wednesday 23 October 2002 22:46, Andrea Arcangeli wrote: > > OK, please try to back out 2.4.20pre11aa1/00_reduce-module-races-1. > > I just moved it into the 20 series. That should fix this bit. > > Yes I did that. I renamed it to _00_reduce-module-races-1, and did the > patching again. > > But that did not help. Here is the current std_err: > > exit.c: In function `release_task': > exit.c:44: warning: implicit declaration of function `sched_exit' > shmem.c: In function `shmem_getpage_locked': > shmem.c:560: warning: unused variable `flags' > {standard input}: Assembler messages: > {standard input}:1014: Warning: indirect lcall without `*' > {standard input}:1091: Warning: indirect lcall without `*' > {standard input}:1176: Warning: indirect lcall without `*' > {standard input}:1255: Warning: indirect lcall without `*' > {standard input}:1271: Warning: indirect lcall without `*' > {standard input}:1281: Warning: indirect lcall without `*' > {standard input}:1349: Warning: indirect lcall without `*' > {standard input}:1364: Warning: indirect lcall without `*' > {standard input}:1375: Warning: indirect lcall without `*' > {standard input}:1874: Warning: indirect lcall without `*' > {standard input}:1960: Warning: indirect lcall without `*' > init_task.c:3:34: linux/sched_runqueue.h: No such file or directory > make[1]: *** [init_task.o] Error 1 > make: *** [_dir_arch/i386/kernel] Error 2 Try to apply all the scheduler-related patches: 10_sched-o1-hyperthreading-3 20_apm-o1-sched-1 20_sched-o1-fixes-5 21_o1-A4-aa-1 20_rcu-poll-7 Andrea
* Re: 2.4.20pre11aa1 2002-10-23 14:35 ` 2.4.20pre11aa1 Andrea Arcangeli @ 2002-10-25 14:03 ` Srihari Vijayaraghavan 2002-10-31 10:47 ` 2.4.20pre11aa1 Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread From: Srihari Vijayaraghavan @ 2002-10-25 14:03 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Hello Andrea, [I tried to post the reply through groups.google.com, and it looks like it didn't get to lkml. :( ] > try to apply all the scheduler related patches: > > 10_sched-o1-hyperthreading-3 20_apm-o1-sched-1 20_sched-o1-fixes-5 > 21_o1-A4-aa-1 20_rcu-poll-7 OK. I have applied the patches 0* and the following patches in this order: 10_sched-o1-hyperthreading-3 20_apm-o1-sched-1 20_rcu-poll-7 20_sched-o1-fixes-5 21_o1-A4-aa-1 The resulting kernel is very stable and it does not crash. Then I tried patches [01]* and the extra patches (20_apm-o1-sched-1, 20_rcu-poll-7, 20_sched-o1-fixes-5, 21_o1-A4-aa-1), I couldn't compile the kernel. Here is the current std_err: inode.c:1468: warning: initialization from incompatible pointer type In file included from ide.c:149: /usr/src/01/include/linux/ide.h:333:16: warning: ISO C requires whitespace after the macro name ide.c: In function `init_hwif_data': ide.c:270: `ide_disk' undeclared (first use in this function) ide.c:270: (Each undeclared identifier is reported only once ide.c:270: for each function it appears in.) 
ide.c: In function `ide_geninit': ide.c:639: `ide_disk' undeclared (first use in this function) ide.c: In function `do_reset1': ide.c:791: `ide_disk' undeclared (first use in this function) ide.c: In function `ide_dump_status': ide.c:973: `ide_disk' undeclared (first use in this function) ide.c: In function `try_to_flush_leftover_data': ide.c:1034: `ide_disk' undeclared (first use in this function) ide.c: In function `ide_error': ide.c:1071: `ide_disk' undeclared (first use in this function) ide.c: In function `start_request': ide.c:1373: `ide_disk' undeclared (first use in this function) ide.c: In function `ide_open': ide.c:2119: `ide_disk' undeclared (first use in this function) ide.c: In function `ide_reinit_drive': ide.c:2768: `ide_disk' undeclared (first use in this function) ide.c: In function `ide_ioctl': ide.c:2842: `ide_disk' undeclared (first use in this function) ide.c: In function `ide_setup': ide.c:3383: `ide_disk' undeclared (first use in this function) make[3]: *** [ide.o] Error 1 make[2]: *** [first_rule] Error 2 make[1]: *** [_subdir_ide] Error 2 make: *** [_dir_drivers] Error 2 make: *** Waiting for unfinished jobs.... {standard input}: Assembler messages: {standard input}:1014: Warning: indirect lcall without `*' {standard input}:1091: Warning: indirect lcall without `*' {standard input}:1176: Warning: indirect lcall without `*' {standard input}:1255: Warning: indirect lcall without `*' {standard input}:1271: Warning: indirect lcall without `*' {standard input}:1281: Warning: indirect lcall without `*' {standard input}:1349: Warning: indirect lcall without `*' {standard input}:1364: Warning: indirect lcall without `*' {standard input}:1375: Warning: indirect lcall without `*' {standard input}:1874: Warning: indirect lcall without `*' {standard input}:1960: Warning: indirect lcall without `*' Thanks. -- Hari harisri@bigpond.com ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.4.20pre11aa1 2002-10-25 14:03 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-31 10:47 ` Srihari Vijayaraghavan 2002-11-09 9:34 ` Solved 2.4.20pre11aa1/2.4.20rc1aa1 Agpgart/Radeon crash. [was: Re: 2.4.20pre11aa1] Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread From: Srihari Vijayaraghavan @ 2002-10-31 10:47 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Hello Andrea, On Saturday 26 October 2002 00:03, Srihari Vijayaraghavan wrote: > The resulting kernel is very stable and it does not crash. > > Then I tried patches [01]* and the extra patches (20_apm-o1-sched-1, > 20_rcu-poll-7, 20_sched-o1-fixes-5, 21_o1-A4-aa-1), I couldn't compile > the kernel. The current status is: [0]*: compiles fine, works fine; [01]*: couldn't compile; [012]*: compiles fine, crashes. So I believe either 1* or 2* patches are introducing the issue. In the meantime I had an opportunity to test -aa on a nice IBM NetVista computer, whose configuration is as follows: 00:00.0 Host bridge: Intel Corp. 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02) 00:02.0 VGA compatible controller: Intel Corp. 82815 CGC [Chipset Graphics Controller] (rev 02) 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 02) 00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 02) 00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 02) 00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 02) 00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 02) 00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio (rev 02) 01:08.0 Ethernet controller: Intel Corp. 82801BA/BAM/CA/CAM Ethernet Controller (rev 01) I can easily reproduce the same issue on that computer too (of course I am using CONFIG_AGP_I810 for agpgart support and CONFIG_DRM_I810 for i810 display card support). I think this rules out the Radeon DRM driver (or the i810 one, for that matter) as the culprit; the issue appears to be specific to agpgart in general.
Anyway, I think we are very close to the problem. If someone can help me compile -aa with the [01]* patches, I suspect we can pinpoint the issue. Thanks for your help and support. -- Hari harisri@bigpond.com
* Solved 2.4.20pre11aa1/2.4.20rc1aa1 Agpgart/Radeon crash. [was: Re: 2.4.20pre11aa1] 2002-10-31 10:47 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-11-09 9:34 ` Srihari Vijayaraghavan 2002-11-10 2:50 ` Andrea Arcangeli 0 siblings, 1 reply; 25+ messages in thread From: Srihari Vijayaraghavan @ 2002-11-09 9:34 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Hello Andrea, > So I believe either 1* or 2* patches are introducing the issue. Got it. The 10_x86-fast-pte2 patch is introducing the instability. I tested this on 2.4.20rc1aa1, though; backing out that patch alone solves the instability. I can give the .config and ksymoops of 2.4.20rc1aa1 if needed. > In the meantime I had an opportunity to test -aa on a nice IBM NetVista > computer, whose configuration is as follows: I will verify this finding on that computer too, perhaps on Monday. Thanks for your help. -- Hari harisri@bigpond.com
* Re: Solved 2.4.20pre11aa1/2.4.20rc1aa1 Agpgart/Radeon crash. [was: Re: 2.4.20pre11aa1] 2002-11-09 9:34 ` Solved 2.4.20pre11aa1/2.4.20rc1aa1 Agpgart/Radeon crash. [was: Re: 2.4.20pre11aa1] Srihari Vijayaraghavan @ 2002-11-10 2:50 ` Andrea Arcangeli 2002-11-10 3:24 ` Srihari Vijayaraghavan 0 siblings, 1 reply; 25+ messages in thread From: Andrea Arcangeli @ 2002-11-10 2:50 UTC (permalink / raw) To: Srihari Vijayaraghavan; +Cc: linux-kernel On Sat, Nov 09, 2002 at 08:34:39PM +1100, Srihari Vijayaraghavan wrote: > Hello Andrea, > > > So I believe either 1* or 2* patches are introducing the issue. > > Got it. The 10_x86-fast-pte2 patch is introducing the instability. Great job! Many thanks! This reduces the bug a whole lot. I will think on Monday about what could be going wrong with that patch; in the meantime, just try to run (slower ;) with it backed out, to be sure it's really that one. Nevertheless, if I had to guess right now, I would say it is most likely triggering a bug somewhere else. It is unlikely that the patch itself contains a bug: it is kind of obviously correct, and it sits in such a heavily stressed codepath that everybody would reproduce the problem if that were the case. That is one of the reasons I could never have guessed this patch was the interesting one for your case without your useful binary search. Andrea
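The "binary search" credited above can be sketched in a few lines. This is a minimal illustration, not the procedure actually followed in the thread (which tested whole groups of patches by prefix, [0]*, [01]*, [012]*). It assumes the patches apply in list order, that the kernel with no patches is good and with all patches is bad, and that every prefix builds; the function name `first_bad_patch` and the `crashes_with` callback are hypothetical.

```python
def first_bad_patch(patches, crashes_with):
    """Return the first patch whose inclusion makes the kernel crash.

    patches:      non-empty list of patch names in apply order.
    crashes_with: callable taking a prefix of `patches` and returning True
                  if a kernel built from that prefix crashes (in practice:
                  patch, build, boot, and test by hand).
    Assumes the empty prefix is good and the full list is bad.
    """
    lo, hi = 0, len(patches)  # patches[:lo] is known good, patches[:hi] known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes_with(patches[:mid]):
            hi = mid
        else:
            lo = mid
    return patches[hi - 1]


# Illustrative run: patch names are taken from this thread, but the ordering
# and the predicate are stand-ins for real build-and-boot tests.
series = [
    "00_max_bytes-6",
    "10_sched-o1-hyperthreading-3",
    "10_x86-fast-pte2",
    "20_rcu-poll-7",
    "20_sched-o1-fixes-5",
]
culprit = first_bad_patch(series, lambda prefix: "10_x86-fast-pte2" in prefix)
```

Note that the thread hit a case this sketch does not handle: some prefixes did not compile at all, so they could be classified as neither good nor bad and the dependent patches had to be pulled in by hand.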
* Re: Solved 2.4.20pre11aa1/2.4.20rc1aa1 Agpgart/Radeon crash. [was: Re: 2.4.20pre11aa1] 2002-11-10 2:50 ` Andrea Arcangeli @ 2002-11-10 3:24 ` Srihari Vijayaraghavan 0 siblings, 0 replies; 25+ messages in thread From: Srihari Vijayaraghavan @ 2002-11-10 3:24 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Hello Andrea, On Sunday 10 November 2002 13:50, Andrea Arcangeli wrote: > Great job! Many thanks! This reduces the bug a whole lot. I will think > on Monday what could be going wrong with that patch, in the meantime > just try to run (slower ;) with it backed out, to be sure it's really I am running complete 2.4.20rc1aa1 minus 10_x86-fast-pte-2 at present. It has been as stable as mainline and as snappy as -aa :). On a related note, I had to apply 20_rcu-poll-7 to compile the 10* patch(es) (even for the 10_ext3-o_direct-2 patch), so would it be a good idea to make it the earliest 10* patch? Thanks. -- Hari harisri@bigpond.com
* Re: 2.4.20pre11aa1 2002-10-17 12:10 ` 2.4.20pre11aa1 Andrea Arcangeli 2002-10-17 13:01 ` 2.4.20pre11aa1 Keith Owens @ 2002-10-17 13:02 ` Srihari Vijayaraghavan 2002-10-17 13:00 ` 2.4.20pre11aa1 Andrea Arcangeli 1 sibling, 1 reply; 25+ messages in thread From: Srihari Vijayaraghavan @ 2002-10-17 13:02 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel Hello, > please try to find which module this is, replace modprobe with a script > that does: > > #!/bin/sh > echo "$@" >>/tmp/log > sync > modprobe.orig "$@" I will try that. > then look at log after the crash. You said in your last email that the > gart code wasn't the culprit. If it isn't the sound drivers I've no > clue what it is. What does it mean that without agpgart it is very > stable? That it crashes less frequently? (I recalled it crashed even > without those modules) Sorry if it was not clear. The -aa kernel crashes _only_ when I have agpgart and radeon support (either as modules or built into the kernel). If there is no agpgart and radeon support enabled, it does not crash. > It doesn't make any sense that 2.4.20-pre11 works and my tree doesn't, > there are no changes to those sound and graphics drivers. Can you make > sure that modversions is enabled, and please send me your .config. Here is my current .config. While this one doesn't have modversions enabled, I have seen crashes even when it is enabled. 
CONFIG_X86=y CONFIG_UID16=y CONFIG_MODULES=y CONFIG_KMOD=y CONFIG_MK7=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_CMPXCHG8=y CONFIG_X86_HAS_TSC=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_USE_3DNOW=y CONFIG_X86_PGE=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_F00F_WORKS_OK=y CONFIG_X86_MCE=y CONFIG_NOHIGHMEM=y CONFIG_1GB=y CONFIG_MTRR=y CONFIG_X86_TSC=y CONFIG_NET=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_NAMES=y CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_KCORE_ELF=y CONFIG_BINFMT_AOUT=y CONFIG_BINFMT_ELF=y CONFIG_BINFMT_MISC=y CONFIG_PM=y CONFIG_BLK_DEV_FD=m CONFIG_MD=y CONFIG_BLK_DEV_MD=m CONFIG_MD_RAID0=m CONFIG_PACKET=m CONFIG_NETFILTER=y CONFIG_UNIX=m CONFIG_INET=y CONFIG_IP_NF_CONNTRACK=m CONFIG_IP_NF_FTP=m CONFIG_IP_NF_IPTABLES=m CONFIG_IP_NF_MATCH_TOS=m CONFIG_IP_NF_MATCH_STATE=m CONFIG_IP_NF_FILTER=m CONFIG_IP_NF_TARGET_REJECT=m CONFIG_IP_NF_NAT=m CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=m CONFIG_IP_NF_TARGET_REDIRECT=m CONFIG_IP_NF_NAT_FTP=m CONFIG_IP_NF_TARGET_LOG=m CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECD=m CONFIG_BLK_DEV_IDESCSI=m CONFIG_BLK_DEV_IDEPCI=y CONFIG_IDEPCI_SHARE_IRQ=y CONFIG_BLK_DEV_IDEDMA_PCI=y CONFIG_IDEDMA_PCI_AUTO=y CONFIG_BLK_DEV_IDEDMA=y CONFIG_BLK_DEV_ADMA=y CONFIG_BLK_DEV_VIA82CXXX=y CONFIG_IDEDMA_AUTO=y CONFIG_BLK_DEV_IDE_MODES=y CONFIG_SCSI=m CONFIG_BLK_DEV_SR=m CONFIG_CHR_DEV_SG=m CONFIG_SCSI_DEBUG_QUEUES=y CONFIG_SCSI_MULTI_LUN=y CONFIG_SCSI_CONSTANTS=y CONFIG_NETDEVICES=y CONFIG_PPP=m CONFIG_PPP_ASYNC=m CONFIG_PPP_DEFLATE=m CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_SERIAL=m CONFIG_SERIAL_EXTENDED=y CONFIG_UNIX98_PTYS=y CONFIG_MOUSE=m CONFIG_PSMOUSE=y CONFIG_RTC=m CONFIG_AGP=y CONFIG_AGP_AMD=y CONFIG_DRM=y CONFIG_DRM_NEW=y CONFIG_DRM_RADEON=y CONFIG_EXT3_FS=y CONFIG_JBD=y CONFIG_RAMFS=y CONFIG_ISO9660_FS=m CONFIG_JOLIET=y 
CONFIG_PROC_FS=y CONFIG_DEVPTS_FS=y CONFIG_MSDOS_PARTITION=y CONFIG_NLS=y CONFIG_VGA_CONSOLE=y CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y CONFIG_ZLIB_INFLATE=m CONFIG_ZLIB_DEFLATE=m Thanks -- Hari harisri@bigpond.com
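The modprobe wrapper quoted at the top of this message records each invocation and flushes it to disk before the module loads, so the last log entry survives a crash. The same idea can be sketched in Python, purely as an illustration: the original suggestion is a shell script, and the function names below are hypothetical. `os.fsync` is used here as a narrower, per-file counterpart to the script's `sync`.

```python
import os
import subprocess


def log_invocation(args, log_path):
    """Append the argument list to log_path and force it to disk."""
    with open(log_path, "a") as log:
        log.write(" ".join(args) + "\n")
        log.flush()
        # Push the data to disk before the (possibly crashing) module load.
        os.fsync(log.fileno())


def run_logged(real_cmd, args, log_path="/tmp/log"):
    """Log args, then run the real command and return its exit status.

    real_cmd would be the renamed original binary, e.g. 'modprobe.orig'
    as in the quoted script.
    """
    log_invocation(args, log_path)
    return subprocess.call([real_cmd, *args])
```

After a crash, the last line of the log names the module that was being loaded when the machine went down.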
* Re: 2.4.20pre11aa1 2002-10-17 13:02 ` 2.4.20pre11aa1 Srihari Vijayaraghavan @ 2002-10-17 13:00 ` Andrea Arcangeli 0 siblings, 0 replies; 25+ messages in thread From: Andrea Arcangeli @ 2002-10-17 13:00 UTC (permalink / raw) To: Srihari Vijayaraghavan; +Cc: linux-kernel On Thu, Oct 17, 2002 at 11:02:24PM +1000, Srihari Vijayaraghavan wrote: > Sorry if it was not clear. The -aa kernel crashes _only_ when I have agpgart > and radeon support (either as modules or as built-in the kernel). If there is > no agpgart and radeon support enabled, it does not crash. OK. So the mystery is why it crashes only with my tree. There are no changes to the graphics/gart drivers as far as I can tell. Now I even wonder about a collision of DMA with the sound driver or something weird like that ;) > > It doesn't make any sense that 2.4.20-pre11 works and my tree doesn't, > > there are no changes to those sound and graphics driver. Can you make > > sure that modversions is enabled, and please send me your .config. > > Here is my current .config. While this one doesn't have modversions enabled I > have seen crashes even when it is enabled. OK, but you can leave modversions enabled; I do it myself too ;) Andrea
end of thread
Thread overview: 25+ messages
2002-10-16 16:51 2.4.20pre11aa1 Andrea Arcangeli
2002-10-17 12:04 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-17 12:10 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-17 13:01 ` 2.4.20pre11aa1 Keith Owens
2002-10-17 15:26 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-17 16:27 ` 2.4.20pre11aa1 Andrea Arcangeli
[not found] ` <200210190014.19357.harisri@bigpond.com>
2002-10-18 14:52 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-18 15:21 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-18 15:34 ` 2.4.20pre11aa1 Keith Owens
2002-10-18 16:00 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-19 1:21 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-19 1:25 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-22 10:48 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-22 14:55 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-23 12:27 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-23 12:46 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-23 14:26 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-23 14:35 ` 2.4.20pre11aa1 Andrea Arcangeli
2002-10-25 14:03 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-31 10:47 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-11-09 9:34 ` Solved 2.4.20pre11aa1/2.4.20rc1aa1 Agpgart/Radeon crash. [was: Re: 2.4.20pre11aa1] Srihari Vijayaraghavan
2002-11-10 2:50 ` Andrea Arcangeli
2002-11-10 3:24 ` Srihari Vijayaraghavan
2002-10-17 13:02 ` 2.4.20pre11aa1 Srihari Vijayaraghavan
2002-10-17 13:00 ` 2.4.20pre11aa1 Andrea Arcangeli