* [PATCH] Fix Oops in crash_shrink_memory
@ 2010-06-07  7:28 Pavan Naregundi
  2010-06-08  7:07 ` Pavan Naregundi
  0 siblings, 1 reply; 11+ messages in thread
From: Pavan Naregundi @ 2010-06-07  7:28 UTC
To: linux-kernel

Hi Everyone,

Please add me to CC in your reply.

When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
triggers an Oops in the kernel. The Oops message and other details are
below:

# cat /proc/cmdline
ro LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=hvc0
rhgb root=UUID=eafd9874-010c-46f9-a0c6-ea8db7c61ac3
# uname -a
Linux XXXXXXXX 2.6.35-rc1 #1 SMP Mon Jun 7 18:04:53 IST 2010 ppc64 ppc64
ppc64 GNU/Linux
# cd /sys/kernel/
# ls
debug               kexec_loaded  profiling      uevent_seqnum
kexec_crash_loaded  mm            security       vmcoreinfo
kexec_crash_size    notes         uevent_helper
# cat kexec_crash_loaded
0
# cat kexec_loaded
0
# cat kexec_crash_size
1
# echo 0 > kexec_crash_size
Unable to handle kernel paging request for data at address 0x00000030
Faulting instruction address: 0xc0000000000930b4
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
last sysfs file: /sys/kernel/kexec_crash_size
Modules linked in: sunrpc ipv6 ext3 jbd dm_mirror dm_region_hash dm_log
dm_multipath dm_mod uinput sg ehea sr_mod ibmveth cdrom ext4 jbd2
mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt [last
unloaded: scsi_wait_scan]
NIP: c0000000000930b4 LR: c0000000000930ac CTR: c0000000000b7ce0
REGS: c0000000b7803750 TRAP: 0300   Not tainted  (2.6.35-rc1)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 28242482  XER: 20000000
DAR: 0000000000000030, DSISR: 0000000040000000
TASK = c0000000b9cc3dc0[1381] 'bash' THREAD: c0000000b7800000 CPU: 12
GPR00: c0000000000930ac c0000000b78039d0 c000000000e885a8 c000000000f42950
GPR04: c0000000b7803af0 0000000000000008 0000000000000002 c0000000005c6438
GPR08: 0000000000000000 000000008000000c 0000000000000000 0000000000000000
GPR12: 0000000040242448 c000000007441e00 00000000100f6210 0000000000000000
GPR16: 00000000100f4a38 00000000100cfb98 00000000100f4bdc 00000000100f4b4c
GPR20: 00000000103f4de8 0000000000000000 0000000000000000 00000000100f0000
GPR24: c0000000005c65c0 0000000000000000 c000000000da83e0 c0000000bc67f780
GPR28: c0000000bc684ae0 c000000000f42950 c000000000e1e3f8 c000000000da8408
NIP [c0000000000930b4] .release_resource+0x34/0xe0
LR [c0000000000930ac] .release_resource+0x2c/0xe0
Call Trace:
[c0000000b78039d0] [c0000000000930ac] .release_resource+0x2c/0xe0 (unreliable)
[c0000000b7803a60] [c0000000000d4fc8] .crash_shrink_memory+0x1c8/0x1f0
[c0000000b7803b30] [c0000000000b7d38] .kexec_crash_size_store+0x58/0x90
[c0000000b7803bc0] [c0000000002b0bb4] .kobj_attr_store+0x34/0x50
[c0000000b7803c30] [c000000000226d5c] .sysfs_write_file+0xec/0x1f0
[c0000000b7803ce0] [c00000000019e0bc] .vfs_write+0xec/0x1f0
[c0000000b7803d80] [c00000000019e2e8] .SyS_write+0x58/0xb0
[c0000000b7803e30] [c00000000000852c] syscall_exit+0x0/0x40
Instruction dump:
fba1ffe8 fbc1fff0 fbe1fff8 ebc2b228 7c7f1b78 f8010010 f821ff71 ebbe8000
7fa3eb78 484e35d9 60000000 e97f0020 <e92b0030> 2fa90000 419e002c 7fbf4800
---[ end trace afbc780462c9bf4e ]---

When crashkernel is not enabled, the crashk_res resource has not been
reserved, so crashk_res.parent is NULL.

Attaching a simple patch for this problem. The patch is tested and
resolves the bug.

Thanks,
Pavan

[-- Attachment #2: fix-crash_shrink_memory.patch --]

diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-07 18:45:55.050000000 +0530
+++ b/kernel/kexec.c	2010-06-07 18:50:28.070000004 +0530
@@ -1134,7 +1134,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
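
The failure mode reported above can be modelled in user space. In the instruction dump, the word before the marked one, e97f0020, decodes as "ld r11,32(r31)" and the marked word e92b0030 as "ld r9,48(r11)". That is consistent with loading a NULL crashk_res.parent and then reading a field 0x30 bytes into it, which matches the faulting address 0x30. The sketch below is only an illustration under stated assumptions: struct res, release_res(), insert_res() and shrink_to_zero() are hypothetical stand-ins, not the kernel's struct resource or its functions, and the guard mirrors the condition added by the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct res {
	unsigned long start, end;		/* inclusive bounds */
	struct res *parent, *sibling, *child;
};

/* Link r as the first child of root (no overlap checks in this sketch). */
static void insert_res(struct res *root, struct res *r)
{
	r->parent = root;
	r->sibling = root->child;
	root->child = r;
}

/*
 * Unlink old from its parent's child list. Like the code path in the
 * Oops, this walks old->parent->child, so the caller must make sure
 * old->parent is not NULL. Returns 0 on success, -1 if old is not found.
 */
static int release_res(struct res *old)
{
	struct res **p = &old->parent->child;

	while (*p != NULL) {
		if (*p == old) {
			*p = old->sibling;
			old->parent = NULL;
			return 0;
		}
		p = &(*p)->sibling;
	}
	return -1;
}

/*
 * Models the shrink-to-zero tail of crash_shrink_memory() with the
 * patch applied: release only a resource that was actually inserted.
 * Returns 1 if the resource was released, 0 if the release was skipped.
 */
static int shrink_to_zero(struct res *r)
{
	int released = 0;

	if (r->parent != NULL)
		released = (release_res(r) == 0);
	r->end = r->start - 1;	/* empty inclusive range */
	return released;
}
```

Calling shrink_to_zero() on a zero-initialised struct res (the "crashkernel not enabled" case) skips the release, while a resource previously linked with insert_res() is unlinked from its parent as before.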

* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-08  7:07 UTC
To: linux-kernel; +Cc: vgoyal, hbabu, kexec

Adding CCs.

On Mon, 2010-06-07 at 12:58 +0530, Pavan Naregundi wrote:
> Hi Everyone,
>
> Please add me to CC in your reply.
>
> When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
> triggers an Oops in the kernel.
>
> [...]
>
> When crashkernel is not enabled, the crashk_res resource has not been
> reserved, so crashk_res.parent is NULL.
>
> Attaching a simple patch for this problem. The patch is tested and
> resolves the bug.
>
> Thanks,
> Pavan

* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Américo Wang @ 2010-06-08  7:59 UTC
To: Pavan Naregundi; +Cc: linux-kernel, vgoyal, hbabu, kexec

On Tue, Jun 08, 2010 at 12:37:37PM +0530, Pavan Naregundi wrote:
>On Mon, 2010-06-07 at 12:58 +0530, Pavan Naregundi wrote:
>> When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
>> triggers an Oops in the kernel.
>>
>> [...]
>> # cat kexec_crash_size
>> 1
>> # echo 0 > kexec_crash_size
>> Unable to handle kernel paging request for data at address 0x00000030
>> [...]
>>
>> When crashkernel is not enabled, the crashk_res resource has not been
>> reserved, so crashk_res.parent is NULL.
>>
>> Attaching a simple patch for this problem. The patch is tested and
>> resolves the bug.

Ouch...

The patch indeed addresses the problem; it looks good to me.
Please add your Signed-off-by and my:

Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>

Another problem is that you should get 0 instead of 1 when you don't
reserve any memory.

Thanks!

* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-08  8:40 UTC
To: Américo Wang; +Cc: linux-kernel, vgoyal, hbabu, kexec

On Tue, 2010-06-08 at 15:59 +0800, Américo Wang wrote:
> [...]
>
> The patch indeed addresses the problem, looks good to me.
> Please add your Signed-off-by and my:
>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
>
> Another problem is that you should get 0 instead of 1 when you don't
> reserve any memory.

We get 1 here because crash_get_memory_size() adds 1, as below:

size = crashk_res.end - crashk_res.start + 1;

We can't remove this addition, as it is required to display the correct
size when the crash memory is reserved.

Coming back to the issue: attaching the patch again.

Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
---

[-- Attachment #2: fix-crash_shrink_memory.patch --]

diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-07 18:45:55.050000000 +0530
+++ b/kernel/kexec.c	2010-06-07 18:50:28.070000004 +0530
@@ -1134,7 +1134,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
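
The "+ 1" in the message above follows from the bounds being inclusive: a range covering bytes start..end holds end - start + 1 bytes. A minimal sketch of that arithmetic, where struct span and span_size() are hypothetical names used only for illustration and the sample addresses are made up:

```c
#include <stddef.h>

/* Hypothetical stand-in for the start/end fields of crashk_res. */
struct span {
	unsigned long start;
	unsigned long end;	/* inclusive: the last byte of the range */
};

/* Same arithmetic as the line quoted from crash_get_memory_size(). */
static size_t span_size(const struct span *s)
{
	return s->end - s->start + 1;
}
```

For a range with start 0x2000000 and end 0x2000000 + 0x8000000 - 1 this yields 0x8000000, the intended size. For a never-reserved, zero-initialised range (start 0, end 0) it yields 1, which is the value kexec_crash_size printed in the report.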

* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Américo Wang @ 2010-06-08  8:54 UTC
To: Pavan Naregundi; +Cc: Américo Wang, linux-kernel, vgoyal, hbabu, kexec

On Tue, Jun 08, 2010 at 02:10:05PM +0530, Pavan Naregundi wrote:
>On Tue, 2010-06-08 at 15:59 +0800, Américo Wang wrote:
>> [...]
>>
>> Another problem is that you should get 0 instead of 1 when you don't
>> reserve any memory.
>
>We get 1 here because crash_get_memory_size() adds 1, as below:
>
>size = crashk_res.end - crashk_res.start + 1;
>
>We can't remove this addition, as it is required to display the correct
>size when the crash memory is reserved.

Yeah, but 0 is a special case, isn't it?

>
>Coming back to the issue: attaching the patch again.
>
>Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
>Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>

Thanks!

>---
>[...]

* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Pavan Naregundi @ 2010-06-08  9:41 UTC
To: Américo Wang; +Cc: linux-kernel, vgoyal, hbabu, kexec

On Tue, 2010-06-08 at 16:54 +0800, Américo Wang wrote:
> On Tue, Jun 08, 2010 at 02:10:05PM +0530, Pavan Naregundi wrote:
> >[...]
> >
> >We get 1 here because crash_get_memory_size() adds 1, as below:
> >
> >size = crashk_res.end - crashk_res.start + 1;
> >
> >We can't remove this addition, as it is required to display the correct
> >size when the crash memory is reserved.
>
> Yeah, but 0 is a special case, isn't it?

Yes, it is a special case.

I have prepared a new patch which solves both of these issues:

1. The Oops in crash_shrink_memory().
2. Make crash_get_memory_size() return the correct size when crash
   memory is not reserved.

The patch is tested.

Thank you.

Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
---

[-- Attachment #2: fix-kexec.patch --]

diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
+++ b/kernel/kexec.c	2010-06-08 21:19:26.190000043 +0530
@@ -1089,9 +1089,10 @@
 
 size_t crash_get_memory_size(void)
 {
-	size_t size;
+	size_t size = 0;
 	mutex_lock(&kexec_mutex);
-	size = crashk_res.end - crashk_res.start + 1;
+	if(crashk_res.end != crashk_res.start)
+		size = crashk_res.end - crashk_res.start + 1;
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
@@ -1134,7 +1135,7 @@
 
 	free_reserved_phys_range(end, crashk_res.end);
 
-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;
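
The special case added to crash_get_memory_size() in the patch above can be exercised in isolation. The sketch below is a user-space approximation with hypothetical names (struct crash_range, crash_range_size()); it leaves out kexec_mutex and is not the kernel function itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for crashk_res; end is an inclusive bound. */
struct crash_range {
	unsigned long start;
	unsigned long end;
};

/* Mirrors the patched logic: report 0 when start and end coincide. */
static size_t crash_range_size(const struct crash_range *r)
{
	size_t size = 0;

	if (r->end != r->start)
		size = r->end - r->start + 1;
	return size;
}
```

With this check a zero-initialised range reports 0 instead of 1. A range shrunk to nothing by "crashk_res.end = end - 1" with end equal to start (assuming the local start equals crashk_res.start) has end == start - 1, so the unsigned arithmetic already wraps to 0 without the special case. One caveat, offered as an observation rather than a claim about the kernel: under inclusive bounds, start == end would normally describe a one-byte range, and this check reports it as 0 as well; a real crash reservation is far larger, so this is assumed not to matter in practice.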

* Re: [PATCH] Fix Oops in crash_shrink_memory
From: Simon Horman @ 2010-06-09  3:44 UTC
To: Pavan Naregundi; +Cc: Américo Wang, linux-kernel, vgoyal, hbabu, kexec

On Tue, Jun 08, 2010 at 03:11:47PM +0530, Pavan Naregundi wrote:
> [...]
>
> I have prepared a new patch which solves both of these issues:
>
> 1. The Oops in crash_shrink_memory().
> 2. Make crash_get_memory_size() return the correct size when crash
>    memory is not reserved.
>
> The patch is tested.
>
> Thank you.
>
> Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> ---
>
> diff -Naur a/kernel/kexec.c b/kernel/kexec.c
> --- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
> +++ b/kernel/kexec.c	2010-06-08 21:19:26.190000043 +0530
> @@ -1089,9 +1089,10 @@
>
>  size_t crash_get_memory_size(void)
>  {
> -	size_t size;
> +	size_t size = 0;
>  	mutex_lock(&kexec_mutex);
> -	size = crashk_res.end - crashk_res.start + 1;
> +	if(crashk_res.end != crashk_res.start)
> +		size = crashk_res.end - crashk_res.start + 1;

Minor style issue: there should be a space between "if" and "(".

>  	mutex_unlock(&kexec_mutex);
>  	return size;
>  }
> @@ -1134,7 +1135,7 @@
>
>  	free_reserved_phys_range(end, crashk_res.end);
>
> -	if (start == end)
> +	if ((start == end) && (crashk_res.parent != NULL))
>  		release_resource(&crashk_res);
>  	crashk_res.end = end - 1;
* Re: [PATCH] Fix Oops in crash_shrink_memory
  2010-06-09 3:44 ` Simon Horman
@ 2010-06-09 6:27 ` Pavan Naregundi
  2010-06-09 14:05 ` Vivek Goyal
  2010-06-10 21:26 ` Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Pavan Naregundi @ 2010-06-09 6:27 UTC (permalink / raw)
  To: Simon Horman; +Cc: Américo Wang, linux-kernel, vgoyal, hbabu, kexec

[-- Attachment #1: Type: text/plain, Size: 5923 bytes --]

On Wed, 2010-06-09 at 12:44 +0900, Simon Horman wrote:
> On Tue, Jun 08, 2010 at 03:11:47PM +0530, Pavan Naregundi wrote:

[...]

> > Prepared a new patch which solves both of this issues.
> >
> > 1. OOPs in crash_shrink_memory
> > 2. make crash_get_memory_size to return correct size, in case of crash
> > memory not reserved.
> >
> > Patch is tested.
> >
> > +	if(crashk_res.end != crashk_res.start)
> > +		size = crashk_res.end - crashk_res.start + 1;
>
> Minor style-issue: there should be a space between if and (.
>

Sorry for that.

Resending the patch with fixed style issues.

Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
--

[-- Attachment #2: fix-kexec.patch --]
[-- Type: text/x-patch, Size: 684 bytes --]

diff -Naur a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
+++ b/kernel/kexec.c	2010-06-09 18:01:37.590007921 +0530
@@ -1089,9 +1089,10 @@

 size_t crash_get_memory_size(void)
 {
-	size_t size;
+	size_t size = 0;
 	mutex_lock(&kexec_mutex);
-	size = crashk_res.end - crashk_res.start + 1;
+	if (crashk_res.end != crashk_res.start)
+		size = crashk_res.end - crashk_res.start + 1;
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
@@ -1134,7 +1135,7 @@

 	free_reserved_phys_range(end, crashk_res.end);

-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;


^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Fix Oops in crash_shrink_memory
  2010-06-09 6:27 ` Pavan Naregundi
@ 2010-06-09 14:05 ` Vivek Goyal
  2010-06-10 21:26 ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Vivek Goyal @ 2010-06-09 14:05 UTC (permalink / raw)
  To: Pavan Naregundi
  Cc: Simon Horman, Américo Wang, linux-kernel, hbabu, kexec

On Wed, Jun 09, 2010 at 11:57:14AM +0530, Pavan Naregundi wrote:
> On Wed, 2010-06-09 at 12:44 +0900, Simon Horman wrote:

[...]

> > > +	if(crashk_res.end != crashk_res.start)
> > > +		size = crashk_res.end - crashk_res.start + 1;
> >
> > Minor style-issue: there should be a space between if and (.
> >
>
> Sorry for that.
>
> Resending the patch with fixed style issues.
>

Looks good to me.

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Thanks
Vivek

> Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> --
>
>

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Fix Oops in crash_shrink_memory
  2010-06-09 6:27 ` Pavan Naregundi
  2010-06-09 14:05 ` Vivek Goyal
@ 2010-06-10 21:26 ` Andrew Morton
  2010-06-11 7:30 ` Pavan Naregundi
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2010-06-10 21:26 UTC (permalink / raw)
  To: Pavan Naregundi
  Cc: Simon Horman, Américo Wang, linux-kernel, vgoyal, hbabu, kexec

On Wed, 09 Jun 2010 11:57:14 +0530
Pavan Naregundi <pavan@linux.vnet.ibm.com> wrote:

> Resending the patch with fixed style issues.
>
> Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> --
>
>
>
>
> [fix-kexec.patch text/x-patch (685B)]
> diff -Naur a/kernel/kexec.c b/kernel/kexec.c
> --- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
> +++ b/kernel/kexec.c	2010-06-09 18:01:37.590007921 +0530
> @@ -1089,9 +1089,10 @@
>
>  size_t crash_get_memory_size(void)
>  {
> -	size_t size;
> +	size_t size = 0;
>  	mutex_lock(&kexec_mutex);
> -	size = crashk_res.end - crashk_res.start + 1;
> +	if (crashk_res.end != crashk_res.start)
> +		size = crashk_res.end - crashk_res.start + 1;
>  	mutex_unlock(&kexec_mutex);
>  	return size;
>  }
> @@ -1134,7 +1135,7 @@
>
>  	free_reserved_phys_range(end, crashk_res.end);
>
> -	if (start == end)
> +	if ((start == end) && (crashk_res.parent != NULL))
>  		release_resource(&crashk_res);
>  	crashk_res.end = end - 1;

The patch doesn't have a changelog and I'd prefer not to have to crawl
through the email thread and write one myself.

Please resend, including a full description of the bug and of its fix.

^ permalink raw reply [flat|nested] 11+ messages in thread
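
The fault address in the oops reported earlier in this thread (DAR 0x30) looks consistent with a NULL crashk_res.parent. The sketch below is a user-space check, not kernel code: `struct resource_model`, `parent_offset()` and `child_offset()` are hypothetical names, and the struct assumes the field order of struct resource in kernels of that period (start, end, name, flags, then parent, sibling, child) on a 64-bit target with a 64-bit resource_size_t.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space copy of struct resource. The field order is an
 * assumption based on kernels of the 2.6.35 era; it is not taken from a
 * kernel header.
 */
struct resource_model {
	uint64_t start;		/* resource_size_t, assumed 64-bit */
	uint64_t end;
	const char *name;
	unsigned long flags;
	struct resource_model *parent, *sibling, *child;
};

/* Byte offset of the parent pointer inside the modelled resource. */
size_t parent_offset(void)
{
	return offsetof(struct resource_model, parent);
}

/* Byte offset of the child pointer, i.e. where parent->child is read. */
size_t child_offset(void)
{
	return offsetof(struct resource_model, child);
}
```

On a target with 8-byte pointers the two offsets come out as 0x20 and 0x30. Read that way, the `e97f0020 <e92b0030>` pair in the instruction dump would be a load of the parent pointer followed by a load of parent->child, and the second load touches address 0x30 when parent is NULL.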
* Re: [PATCH] Fix Oops in crash_shrink_memory
  2010-06-10 21:26 ` Andrew Morton
@ 2010-06-11 7:30 ` Pavan Naregundi
  0 siblings, 0 replies; 11+ messages in thread
From: Pavan Naregundi @ 2010-06-11 7:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Simon Horman, Américo Wang, linux-kernel, vgoyal, hbabu, kexec

[-- Attachment #1: Type: text/plain, Size: 2147 bytes --]

On Thu, 2010-06-10 at 14:26 -0700, Andrew Morton wrote:
> On Wed, 09 Jun 2010 11:57:14 +0530
> Pavan Naregundi <pavan@linux.vnet.ibm.com> wrote:
>
> > Resending the patch with fixed style issues.
> >
> > Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
> > Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
> > --
> >
> >
> >
> >
> > [fix-kexec.patch text/x-patch (685B)]
> > diff -Naur a/kernel/kexec.c b/kernel/kexec.c
> > --- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
> > +++ b/kernel/kexec.c	2010-06-09 18:01:37.590007921 +0530
> > @@ -1089,9 +1089,10 @@
> >
> >  size_t crash_get_memory_size(void)
> >  {
> > -	size_t size;
> > +	size_t size = 0;
> >  	mutex_lock(&kexec_mutex);
> > -	size = crashk_res.end - crashk_res.start + 1;
> > +	if (crashk_res.end != crashk_res.start)
> > +		size = crashk_res.end - crashk_res.start + 1;
> >  	mutex_unlock(&kexec_mutex);
> >  	return size;
> >  }
> > @@ -1134,7 +1135,7 @@
> >
> >  	free_reserved_phys_range(end, crashk_res.end);
> >
> > -	if (start == end)
> > +	if ((start == end) && (crashk_res.parent != NULL))
> >  		release_resource(&crashk_res);
> >  	crashk_res.end = end - 1;
>
> The patch doesn't have a changelog and I'd prefer not to have to crawl
> through the email thread and write one myself.
>
> Please resend, including a full description of the bug and of its fix.

Subject: kexec: fix Oops in crash_shrink_memory()
From: Pavan Naregundi <pavan@linux.vnet.ibm.com>

When crashkernel is not enabled, "echo 0 > /sys/kernel/kexec_crash_size"
OOPSes the kernel in crash_shrink_memory(). This happens because
crash_shrink_memory() tries to release the 'crashk_res' resource, which
was never reserved. In addition, "/sys/kernel/kexec_crash_size" reads as
1 when it should read as 0.

This patch fixes the OOPS in crash_shrink_memory() and makes
"/sys/kernel/kexec_crash_size" read as 0 when crash kernel memory is not
reserved.

Signed-off-by: Pavan Naregundi <pavan@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

[-- Attachment #2: kexec-fix-oops-in-crash_shrink_memory.patch --]
[-- Type: text/x-patch, Size: 764 bytes --]

diff -uprN a/kernel/kexec.c b/kernel/kexec.c
--- a/kernel/kexec.c	2010-06-08 21:17:21.850000033 +0530
+++ b/kernel/kexec.c	2010-06-09 18:01:37.590007921 +0530
@@ -1089,9 +1089,10 @@ void crash_kexec(struct pt_regs *regs)

 size_t crash_get_memory_size(void)
 {
-	size_t size;
+	size_t size = 0;
 	mutex_lock(&kexec_mutex);
-	size = crashk_res.end - crashk_res.start + 1;
+	if (crashk_res.end != crashk_res.start)
+		size = crashk_res.end - crashk_res.start + 1;
 	mutex_unlock(&kexec_mutex);
 	return size;
 }
@@ -1134,7 +1135,7 @@ int crash_shrink_memory(unsigned long ne

 	free_reserved_phys_range(end, crashk_res.end);

-	if (start == end)
+	if ((start == end) && (crashk_res.parent != NULL))
 		release_resource(&crashk_res);
 	crashk_res.end = end - 1;


^ permalink raw reply [flat|nested] 11+ messages in thread
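
Both hunks of the patch can be exercised outside the kernel. What follows is a minimal user-space sketch under stated assumptions, not the kernel implementation: `struct res_model`, `get_size_model()`, `release_model()` and `shrink_to_zero_model()` are hypothetical names, the locking done with kexec_mutex and the freeing of pages are left out, and `release_model()` only imitates how a resource is unlinked from its parent's child list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct resource. */
struct res_model {
	uint64_t start;
	uint64_t end;			/* inclusive end address */
	struct res_model *parent;	/* NULL while not registered */
	struct res_model *sibling;
	struct res_model *child;
};

/* First hunk: report 0 instead of 1 when start == end. */
size_t get_size_model(const struct res_model *r)
{
	size_t size = 0;

	if (r->end != r->start)
		size = (size_t)(r->end - r->start + 1);
	return size;
}

/*
 * Unlink old from its parent's child list. Like the code path in the
 * oops, this uses old->parent without checking it, so the caller must
 * make sure the pointer is set.
 */
int release_model(struct res_model *old)
{
	struct res_model **p = &old->parent->child;

	while (*p) {
		if (*p == old) {
			*p = old->sibling;
			old->parent = NULL;
			return 0;
		}
		p = &(*p)->sibling;
	}
	return -1;
}

/*
 * Second hunk: when shrinking to zero, release the resource only if it
 * was registered. Returns true if a release took place.
 */
bool shrink_to_zero_model(struct res_model *r)
{
	if (r->parent != NULL)
		return release_model(r) == 0;
	return false;
}
```

With an unregistered resource (all fields zero), `get_size_model()` returns 0 and `shrink_to_zero_model()` returns without touching the parent pointer, which is the behaviour the changelog describes for a system booted without crashkernel.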
end of thread, other threads:[~2010-06-11 7:30 UTC | newest]

Thread overview: 11+ messages
2010-06-07 7:28 [PATCH] Fix Oops in crash_shrink_memory Pavan Naregundi
2010-06-08 7:07 ` Pavan Naregundi
2010-06-08 7:59   ` Américo Wang
2010-06-08 8:40     ` Pavan Naregundi
2010-06-08 8:54       ` Américo Wang
2010-06-08 9:41         ` Pavan Naregundi
2010-06-09 3:44           ` Simon Horman
2010-06-09 6:27             ` Pavan Naregundi
2010-06-09 14:05               ` Vivek Goyal
2010-06-10 21:26               ` Andrew Morton
2010-06-11 7:30                 ` Pavan Naregundi