* [PATCH] IA64: kexec allocates too few memory for kdump kernel itself @ 2008-09-03 21:01 Jay Lan 2008-09-04 0:01 ` Simon Horman 0 siblings, 1 reply; 8+ messages in thread From: Jay Lan @ 2008-09-03 21:01 UTC (permalink / raw) To: kexec [-- Attachment #1: Type: text/plain, Size: 802 bytes --] Sometimes the kexec would allocate not enough memory for kdump kernel itself on IA64 and caused kdump kernel to panic at boot. When it happens, the /proc/iomem would show a kernel RAM segment like this: 3014000000-3015294fff : System RAM 3014000000-3014823ccf : Kernel code 3014823cd0-3014dee8ef : Kernel data 3014dee8f0-301529448f : Kernel bss 3015295000-307bffdfff : System RAM 3018000000-3037ffffff : Crash kernel But kexec would allocate memory 3018000000-3019290000 for the kernel, which is 0x5000 smaller than the regular kernel. In my cases, the physical_node_map and kern_memmap of the kdump kernel overlaped and caused data corruption. This patch fixes the problem. The patch was generated against kexec-tools 2.0.0 and tested in 2.6.27-rc4. Signed-off-by: Jay Lan <jlan@sgi.com> [-- Attachment #2: bootmem-fix --] [-- Type: text/plain, Size: 1030 bytes --] --- kexec/arch/ia64/crashdump-ia64.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) Index: kexec-tools/kexec/arch/ia64/crashdump-ia64.c =================================================================== --- kexec-tools.orig/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:24:14.289758063 -0700 +++ kexec-tools/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:29:34.095833316 -0700 @@ -90,15 +90,15 @@ static void add_loaded_segments_info(str phdr = &ehdr->e_phdr[i]; if (phdr->p_type != PT_LOAD) break; - if (loaded_segments[loaded_segments_num].end != - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) - break; + if (loaded_segments[loaded_segments_num].end < + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) + loaded_segments[loaded_segments_num].end + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); loaded_segments[loaded_segments_num].end += (phdr->p_memsz + ELF_PAGE_SIZE - 1) & ~(ELF_PAGE_SIZE - 1); i++; } - loaded_segments_num++; } } [-- Attachment #3: Type: text/plain, Size: 143 bytes --] _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-03 21:01 [PATCH] IA64: kexec allocates too few memory for kdump kernel itself Jay Lan @ 2008-09-04 0:01 ` Simon Horman 2008-09-04 1:37 ` Jay Lan 0 siblings, 1 reply; 8+ messages in thread From: Simon Horman @ 2008-09-04 0:01 UTC (permalink / raw) To: Jay Lan; +Cc: kexec On Wed, Sep 03, 2008 at 02:01:59PM -0700, Jay Lan wrote: > Sometimes the kexec would allocate not enough memory for kdump kernel > itself on IA64 and caused kdump kernel to panic at boot. > > When it happens, the /proc/iomem would show a kernel RAM segment like > this: > 3014000000-3015294fff : System RAM > 3014000000-3014823ccf : Kernel code > 3014823cd0-3014dee8ef : Kernel data > 3014dee8f0-301529448f : Kernel bss > 3015295000-307bffdfff : System RAM > 3018000000-3037ffffff : Crash kernel > > But kexec would allocate memory 3018000000-3019290000 for the kernel, > which is 0x5000 smaller than the regular kernel. In my cases, the > physical_node_map and kern_memmap of the kdump kernel overlaped and > caused data corruption. > > This patch fixes the problem. The patch was generated against > kexec-tools 2.0.0 and tested in 2.6.27-rc4. Hi Jay, I am unclear about why this underallocation occurs. > > Signed-off-by: Jay Lan <jlan@sgi.com> > > --- > kexec/arch/ia64/crashdump-ia64.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > Index: kexec-tools/kexec/arch/ia64/crashdump-ia64.c > =================================================================== > --- kexec-tools.orig/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:24:14.289758063 -0700 > +++ kexec-tools/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:29:34.095833316 -0700 > @@ -90,15 +90,15 @@ static void add_loaded_segments_info(str > phdr = &ehdr->e_phdr[i]; > if (phdr->p_type != PT_LOAD) > break; > - if (loaded_segments[loaded_segments_num].end != > - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) > - break; > + if (loaded_segments[loaded_segments_num].end < > + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) > + loaded_segments[loaded_segments_num].end > + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); > loaded_segments[loaded_segments_num].end += > (phdr->p_memsz + ELF_PAGE_SIZE - 1) & > ~(ELF_PAGE_SIZE - 1); > i++; > } > - > loaded_segments_num++; > } > } > _______________________________________________ > kexec mailing list > kexec@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kexec _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-04 0:01 ` Simon Horman @ 2008-09-04 1:37 ` Jay Lan 2008-09-04 18:28 ` Jay Lan 0 siblings, 1 reply; 8+ messages in thread From: Jay Lan @ 2008-09-04 1:37 UTC (permalink / raw) To: Simon Horman; +Cc: kexec Simon Horman wrote: > On Wed, Sep 03, 2008 at 02:01:59PM -0700, Jay Lan wrote: >> Sometimes the kexec would allocate not enough memory for kdump kernel >> itself on IA64 and caused kdump kernel to panic at boot. >> >> When it happens, the /proc/iomem would show a kernel RAM segment like >> this: >> 3014000000-3015294fff : System RAM >> 3014000000-3014823ccf : Kernel code >> 3014823cd0-3014dee8ef : Kernel data >> 3014dee8f0-301529448f : Kernel bss >> 3015295000-307bffdfff : System RAM >> 3018000000-3037ffffff : Crash kernel >> >> But kexec would allocate memory 3018000000-3019290000 for the kernel, >> which is 0x5000 smaller than the regular kernel. In my cases, the >> physical_node_map and kern_memmap of the kdump kernel overlaped and >> caused data corruption. >> >> This patch fixes the problem. The patch was generated against >> kexec-tools 2.0.0 and tested in 2.6.27-rc4. > > Hi Jay, > > I am unclear about why this underallocation occurs. Hi Simon, The routine add_loaded_segments_info() set up "loaded_segment" array that is needed by purgatory code, based on data stored in the mem_ehdr array passed in as the second parameter. Upon entrance of the routine, the crash_memory_range[] contains information about the regular kernel: crash_memory_range[ 0]: start= 3000080000, end= 30003fffff crash_memory_range[ 1]: start= 3003000000, end= 3005ffffff crash_memory_range[ 2]: start= 3006000000, end= 3013ffffff crash_memory_range[ 3]: start= 3014000000, end= 3015294fff The #3 entry is the kernel memory segment. And the mem_ehdr array would contain data as such: i=0, p_paddr=3018000000, p_memsz=d04480, p_offset=10000, p_type=1 i=1, p_paddr=3018d20000, p_memsz=9620, p_offset=d20000, p_type=1 i=2, p_paddr=3018d30000, p_memsz=564490, p_offset=d30000, p_type=1 i=3, p_paddr=0, p_memsz=0, p_offset=0, p_type=4 The code wants the new loaded_segments contain starting address all aligned at page boundary, which is 0x10000 in IA64. Note that the p_memsz of mem_ehdr does not match to entries in /proc/iomem: 3014000000-3015294fff : System RAM 3014000000-3014823ccf : Kernel code 3014823cd0-3014dee8ef : Kernel data 3014dee8f0-301529448f : Kernel bss The original code of add_loaded_segments_info() would go through the mem_ehdr array and use the p_paddr of the first entry (the beginning of the reserved memory) as the start address, add the p_memsz of three entries to calculate the end address of the kernel segment. But the p_paddr of i=0 plus p_memsz of i=0 should result in 3018d10000 as the p_paddr of i=1 entry, but actually the p_paddr of i=1 is 3018d20000. The logic of that routine can not explain the discrepency. So, where the data of mem_ehdr array come from? add_loaded_segments_info <- load_crashdump_segments <- elf_ia64_load <- file_type[i].load <- my_load The elf_ia64_load set up mem_ehdr, probabaly based on data pointed by *buf, which i think comes from vmlinuz. So, i failed to find out how the p_memsz were set up initially. But, i think we did it the way too complicated, IMHO. The crash_memory_range[] array showed the kernel segment consumed 0x1295000 bytes of memory and we only need to tell the purgatory code to reserve that amount of memory. The logic in add_loaded_segments_info() came out with 0x1290000 and caused the crashkernel to panic on boot. Hmmm, as i types now, i may not consider the situation where the crashkernel is not the same as the first kernel... Note that the underallocation does not _ALWAYS_ happen! It depends on the vmlinux we build. Honestly i do not understand some part of the kexec-tools code well enough to make major surgery to the code. So, i just compare the end address after calculation of i=0 entry of mem_ehdr array with the start address of the second entry. If it is too small, i just bring it up to align with the start address of the second entry. I am happy to allocate one extra page, may not be needed in some cases, of memory than to panic. Yes, my patch is a work-around. If you can find the true cause of the problem and fix it, it would be great and appreciated! Regards, - jay > >> Signed-off-by: Jay Lan <jlan@sgi.com> >> > >> --- >> kexec/arch/ia64/crashdump-ia64.c | 8 ++++---- >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> Index: kexec-tools/kexec/arch/ia64/crashdump-ia64.c >> =================================================================== >> --- kexec-tools.orig/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:24:14.289758063 -0700 >> +++ kexec-tools/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:29:34.095833316 -0700 >> @@ -90,15 +90,15 @@ static void add_loaded_segments_info(str >> phdr = &ehdr->e_phdr[i]; >> if (phdr->p_type != PT_LOAD) >> break; >> - if (loaded_segments[loaded_segments_num].end != >> - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) >> - break; >> + if (loaded_segments[loaded_segments_num].end < >> + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) >> + loaded_segments[loaded_segments_num].end >> + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); >> loaded_segments[loaded_segments_num].end += >> (phdr->p_memsz + ELF_PAGE_SIZE - 1) & >> ~(ELF_PAGE_SIZE - 1); >> i++; >> } >> - >> loaded_segments_num++; >> } >> } > _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-04 1:37 ` Jay Lan @ 2008-09-04 18:28 ` Jay Lan 2008-09-12 23:27 ` Simon Horman 2008-09-15 6:58 ` Simon Horman 0 siblings, 2 replies; 8+ messages in thread From: Jay Lan @ 2008-09-04 18:28 UTC (permalink / raw) To: Simon Horman; +Cc: kexec Jay Lan wrote: > Simon Horman wrote: >> On Wed, Sep 03, 2008 at 02:01:59PM -0700, Jay Lan wrote: >>> Sometimes the kexec would allocate not enough memory for kdump kernel >>> itself on IA64 and caused kdump kernel to panic at boot. >>> >>> When it happens, the /proc/iomem would show a kernel RAM segment like >>> this: >>> 3014000000-3015294fff : System RAM >>> 3014000000-3014823ccf : Kernel code >>> 3014823cd0-3014dee8ef : Kernel data >>> 3014dee8f0-301529448f : Kernel bss >>> 3015295000-307bffdfff : System RAM >>> 3018000000-3037ffffff : Crash kernel >>> >>> But kexec would allocate memory 3018000000-3019290000 for the kernel, >>> which is 0x5000 smaller than the regular kernel. In my cases, the >>> physical_node_map and kern_memmap of the kdump kernel overlaped and >>> caused data corruption. >>> >>> This patch fixes the problem. The patch was generated against >>> kexec-tools 2.0.0 and tested in 2.6.27-rc4. >> Hi Jay, >> >> I am unclear about why this underallocation occurs. > > Hi Simon, > > The routine add_loaded_segments_info() set up "loaded_segment" array > that is needed by purgatory code, based on data stored in the > mem_ehdr array passed in as the second parameter. > > Upon entrance of the routine, the crash_memory_range[] contains > information about the regular kernel: > crash_memory_range[ 0]: start= 3000080000, end= 30003fffff > crash_memory_range[ 1]: start= 3003000000, end= 3005ffffff > crash_memory_range[ 2]: start= 3006000000, end= 3013ffffff > crash_memory_range[ 3]: start= 3014000000, end= 3015294fff > > The #3 entry is the kernel memory segment. > > And the mem_ehdr array would contain data as such: Hi, It should be mem_phdr, got it from mem_ehdr->e_phdr. > i=0, p_paddr=3018000000, p_memsz=d04480, p_offset=10000, p_type=1 > i=1, p_paddr=3018d20000, p_memsz=9620, p_offset=d20000, p_type=1 > i=2, p_paddr=3018d30000, p_memsz=564490, p_offset=d30000, p_type=1 > i=3, p_paddr=0, p_memsz=0, p_offset=0, p_type=4 Does anyone understand how the array were created and why there was a gap between i=0 and i=1 entries? I think this is the problem but i do not know how to fix it, so tried to work around it. The statement my patch replaced was totally broken: - if (loaded_segments[loaded_segments_num].end != - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) - break; + if (loaded_segments[loaded_segments_num].end < + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) + loaded_segments[loaded_segments_num].end + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); My debugging showed that when "loaded_segments[loaded_segments_num].end" != "phdr->p_paddr & ~(ELF_PAGE_SIZE-1)", they were treated as equal and continue to next statement. However, if i assign both expression to local variables and do comparison, the 'break' statement is executed correctly when two values are not the same. Unfortunately, consequently the kdump kernel would _alawys_ hang. I believe the intent of the original statement is to ensure there is no gap between entries of mem_phdr array. But if there is a gap, kexec should simply exit with failure. The 'break' statement just created a loaded_segment[] array that broke the kernel memory segment into multiple entries and resulted in the kdump kernel hang in find_memory(). The IA64 (at least 2.6.27-rc4) kdump kernel works in some cases today are simply out of luck. I believe the real fix is to fix the contents of the mem_phdr array. Since i do not know how to fix it, my patch would close up the gap where there is the a gap between entries of the mem_phdr array. Does it make more sense to you now, Simon? Regards, - jay > > The code wants the new loaded_segments contain starting address > all aligned at page boundary, which is 0x10000 in IA64. > > Note that the p_memsz of mem_ehdr does not match to entries in > /proc/iomem: > 3014000000-3015294fff : System RAM > 3014000000-3014823ccf : Kernel code > 3014823cd0-3014dee8ef : Kernel data > 3014dee8f0-301529448f : Kernel bss > > The original code of add_loaded_segments_info() would go through > the mem_ehdr array and use the p_paddr of the first entry (the > beginning of the reserved memory) as the start address, add > the p_memsz of three entries to calculate the end address of > the kernel segment. > > But the p_paddr of i=0 plus p_memsz of i=0 should result in > 3018d10000 as the p_paddr of i=1 entry, but actually the > p_paddr of i=1 is 3018d20000. The logic of that routine > can not explain the discrepency. > > So, where the data of mem_ehdr array come from? > > add_loaded_segments_info > <- load_crashdump_segments > <- elf_ia64_load > <- file_type[i].load > <- my_load > > The elf_ia64_load set up mem_ehdr, probabaly based on data > pointed by *buf, which i think comes from vmlinuz. > > So, i failed to find out how the p_memsz were set up initially. > But, i think we did it the way too complicated, IMHO. > > The crash_memory_range[] array showed the kernel segment consumed > 0x1295000 bytes of memory and we only need to tell the purgatory > code to reserve that amount of memory. The logic in > add_loaded_segments_info() came out with 0x1290000 and caused the > crashkernel to panic on boot. > > Hmmm, as i types now, i may not consider the situation where > the crashkernel is not the same as the first kernel... > > Note that the underallocation does not _ALWAYS_ happen! It depends > on the vmlinux we build. Honestly i do not understand some part of > the kexec-tools code well enough to make major surgery to the code. > So, i just compare the end address after calculation of i=0 entry > of mem_ehdr array with the start address of the second entry. If it > is too small, i just bring it up to align with the start address of > the second entry. I am happy to allocate one extra page, may not be > needed in some cases, of memory than to panic. Yes, my patch is > a work-around. > > If you can find the true cause of the problem and fix it, it > would be great and appreciated! > > Regards, > - jay > > >>> Signed-off-by: Jay Lan <jlan@sgi.com> >>> >>> --- >>> kexec/arch/ia64/crashdump-ia64.c | 8 ++++---- >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> Index: kexec-tools/kexec/arch/ia64/crashdump-ia64.c >>> =================================================================== >>> --- kexec-tools.orig/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:24:14.289758063 -0700 >>> +++ kexec-tools/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:29:34.095833316 -0700 >>> @@ -90,15 +90,15 @@ static void add_loaded_segments_info(str >>> phdr = &ehdr->e_phdr[i]; >>> if (phdr->p_type != PT_LOAD) >>> break; >>> - if (loaded_segments[loaded_segments_num].end != >>> - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) >>> - break; >>> + if (loaded_segments[loaded_segments_num].end < >>> + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) >>> + loaded_segments[loaded_segments_num].end >>> + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); >>> loaded_segments[loaded_segments_num].end += >>> (phdr->p_memsz + ELF_PAGE_SIZE - 1) & >>> ~(ELF_PAGE_SIZE - 1); >>> i++; >>> } >>> - >>> loaded_segments_num++; >>> } >>> } > > _______________________________________________ > kexec mailing list > kexec@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kexec _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-04 18:28 ` Jay Lan @ 2008-09-12 23:27 ` Simon Horman 2008-09-13 0:38 ` Jay Lan 2008-09-15 6:58 ` Simon Horman 1 sibling, 1 reply; 8+ messages in thread From: Simon Horman @ 2008-09-12 23:27 UTC (permalink / raw) To: Jay Lan; +Cc: kexec On Thu, Sep 04, 2008 at 11:28:32AM -0700, Jay Lan wrote: > Jay Lan wrote: > > Simon Horman wrote: > >> On Wed, Sep 03, 2008 at 02:01:59PM -0700, Jay Lan wrote: > >>> Sometimes the kexec would allocate not enough memory for kdump kernel > >>> itself on IA64 and caused kdump kernel to panic at boot. > >>> > >>> When it happens, the /proc/iomem would show a kernel RAM segment like > >>> this: > >>> 3014000000-3015294fff : System RAM > >>> 3014000000-3014823ccf : Kernel code > >>> 3014823cd0-3014dee8ef : Kernel data > >>> 3014dee8f0-301529448f : Kernel bss > >>> 3015295000-307bffdfff : System RAM > >>> 3018000000-3037ffffff : Crash kernel > >>> > >>> But kexec would allocate memory 3018000000-3019290000 for the kernel, > >>> which is 0x5000 smaller than the regular kernel. In my cases, the > >>> physical_node_map and kern_memmap of the kdump kernel overlaped and > >>> caused data corruption. > >>> > >>> This patch fixes the problem. The patch was generated against > >>> kexec-tools 2.0.0 and tested in 2.6.27-rc4. > >> Hi Jay, > >> > >> I am unclear about why this underallocation occurs. > > > > Hi Simon, > > > > The routine add_loaded_segments_info() set up "loaded_segment" array > > that is needed by purgatory code, based on data stored in the > > mem_ehdr array passed in as the second parameter. > > > > Upon entrance of the routine, the crash_memory_range[] contains > > information about the regular kernel: > > crash_memory_range[ 0]: start= 3000080000, end= 30003fffff > > crash_memory_range[ 1]: start= 3003000000, end= 3005ffffff > > crash_memory_range[ 2]: start= 3006000000, end= 3013ffffff > > crash_memory_range[ 3]: start= 3014000000, end= 3015294fff > > > > The #3 entry is the kernel memory segment. > > > > And the mem_ehdr array would contain data as such: > > Hi, > > It should be mem_phdr, got it from mem_ehdr->e_phdr. > > > i=0, p_paddr=3018000000, p_memsz=d04480, p_offset=10000, p_type=1 > > i=1, p_paddr=3018d20000, p_memsz=9620, p_offset=d20000, p_type=1 > > i=2, p_paddr=3018d30000, p_memsz=564490, p_offset=d30000, p_type=1 > > i=3, p_paddr=0, p_memsz=0, p_offset=0, p_type=4 > > Does anyone understand how the array were created and why there > was a gap between i=0 and i=1 entries? I think this is the problem > but i do not know how to fix it, so tried to work around it. > > The statement my patch replaced was totally broken: > - if (loaded_segments[loaded_segments_num].end != > - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) > - break; > + if (loaded_segments[loaded_segments_num].end < > + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) > + loaded_segments[loaded_segments_num].end > + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); > > My debugging showed that when "loaded_segments[loaded_segments_num].end" > != "phdr->p_paddr & ~(ELF_PAGE_SIZE-1)", they were treated as equal > and continue to next statement. However, if i assign both expression > to local variables and do comparison, the 'break' statement is > executed correctly when two values are not the same. Unfortunately, > consequently the kdump kernel would _alawys_ hang. > > I believe the intent of the original statement is to ensure there is > no gap between entries of mem_phdr array. But if there is a gap, > kexec should simply exit with failure. The 'break' statement just > created a loaded_segment[] array that broke the kernel memory segment > into multiple entries and resulted in the kdump kernel hang in > find_memory(). The IA64 (at least 2.6.27-rc4) kdump kernel works in > some cases today are simply out of luck. > > I believe the real fix is to fix the contents of the mem_phdr array. > Since i do not know how to fix it, my patch would close up the > gap where there is the a gap between entries of the mem_phdr array. > > Does it make more sense to you now, Simon? Hi Jay, yes that does make sense. I'd like to poke around and see if mem_phdr can be fixed. > > Regards, > - jay > > > > > > > The code wants the new loaded_segments contain starting address > > all aligned at page boundary, which is 0x10000 in IA64. > > > > Note that the p_memsz of mem_ehdr does not match to entries in > > /proc/iomem: > > 3014000000-3015294fff : System RAM > > 3014000000-3014823ccf : Kernel code > > 3014823cd0-3014dee8ef : Kernel data > > 3014dee8f0-301529448f : Kernel bss > > > > The original code of add_loaded_segments_info() would go through > > the mem_ehdr array and use the p_paddr of the first entry (the > > beginning of the reserved memory) as the start address, add > > the p_memsz of three entries to calculate the end address of > > the kernel segment. > > > > But the p_paddr of i=0 plus p_memsz of i=0 should result in > > 3018d10000 as the p_paddr of i=1 entry, but actually the > > p_paddr of i=1 is 3018d20000. The logic of that routine > > can not explain the discrepency. > > > > So, where the data of mem_ehdr array come from? > > > > add_loaded_segments_info > > <- load_crashdump_segments > > <- elf_ia64_load > > <- file_type[i].load > > <- my_load > > > > The elf_ia64_load set up mem_ehdr, probabaly based on data > > pointed by *buf, which i think comes from vmlinuz. > > > > So, i failed to find out how the p_memsz were set up initially. > > But, i think we did it the way too complicated, IMHO. > > > > The crash_memory_range[] array showed the kernel segment consumed > > 0x1295000 bytes of memory and we only need to tell the purgatory > > code to reserve that amount of memory. The logic in > > add_loaded_segments_info() came out with 0x1290000 and caused the > > crashkernel to panic on boot. > > > > Hmmm, as i types now, i may not consider the situation where > > the crashkernel is not the same as the first kernel... > > > > Note that the underallocation does not _ALWAYS_ happen! It depends > > on the vmlinux we build. Honestly i do not understand some part of > > the kexec-tools code well enough to make major surgery to the code. > > So, i just compare the end address after calculation of i=0 entry > > of mem_ehdr array with the start address of the second entry. If it > > is too small, i just bring it up to align with the start address of > > the second entry. I am happy to allocate one extra page, may not be > > needed in some cases, of memory than to panic. Yes, my patch is > > a work-around. > > > > If you can find the true cause of the problem and fix it, it > > would be great and appreciated! > > > > Regards, > > - jay > > > > > >>> Signed-off-by: Jay Lan <jlan@sgi.com> > >>> > >>> --- > >>> kexec/arch/ia64/crashdump-ia64.c | 8 ++++---- > >>> 1 file changed, 4 insertions(+), 4 deletions(-) > >>> > >>> Index: kexec-tools/kexec/arch/ia64/crashdump-ia64.c > >>> =================================================================== > >>> --- kexec-tools.orig/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:24:14.289758063 -0700 > >>> +++ kexec-tools/kexec/arch/ia64/crashdump-ia64.c 2008-09-03 11:29:34.095833316 -0700 > >>> @@ -90,15 +90,15 @@ static void add_loaded_segments_info(str > >>> phdr = &ehdr->e_phdr[i]; > >>> if (phdr->p_type != PT_LOAD) > >>> break; > >>> - if (loaded_segments[loaded_segments_num].end != > >>> - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) > >>> - break; > >>> + if (loaded_segments[loaded_segments_num].end < > >>> + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) > >>> + loaded_segments[loaded_segments_num].end > >>> + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); > >>> loaded_segments[loaded_segments_num].end += > >>> (phdr->p_memsz + ELF_PAGE_SIZE - 1) & > >>> ~(ELF_PAGE_SIZE - 1); > >>> i++; > >>> } > >>> - > >>> loaded_segments_num++; > >>> } > >>> } > > > > _______________________________________________ > > kexec mailing list > > kexec@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/kexec -- Simon Horman VA Linux Systems Japan K.K., Sydney, Australia Satellite Office H: www.vergenet.net/~horms/ W: www.valinux.co.jp/en _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-12 23:27 ` Simon Horman @ 2008-09-13 0:38 ` Jay Lan 2008-09-15 5:47 ` Simon Horman 0 siblings, 1 reply; 8+ messages in thread From: Jay Lan @ 2008-09-13 0:38 UTC (permalink / raw) To: Simon Horman; +Cc: kexec Simon Horman wrote: >> Hi, >> >> It should be mem_phdr, got it from mem_ehdr->e_phdr. >> >>> i=0, p_paddr=3018000000, p_memsz=d04480, p_offset=10000, p_type=1 >>> i=1, p_paddr=3018d20000, p_memsz=9620, p_offset=d20000, p_type=1 >>> i=2, p_paddr=3018d30000, p_memsz=564490, p_offset=d30000, p_type=1 >>> i=3, p_paddr=0, p_memsz=0, p_offset=0, p_type=4 >> Does anyone understand how the array were created and why there >> was a gap between i=0 and i=1 entries? I think this is the problem >> but i do not know how to fix it, so tried to work around it. >> >> The statement my patch replaced was totally broken: >> - if (loaded_segments[loaded_segments_num].end != >> - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) >> - break; >> + if (loaded_segments[loaded_segments_num].end < >> + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) >> + loaded_segments[loaded_segments_num].end >> + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); >> >> My debugging showed that when "loaded_segments[loaded_segments_num].end" >> != "phdr->p_paddr & ~(ELF_PAGE_SIZE-1)", they were treated as equal >> and continue to next statement. However, if i assign both expression >> to local variables and do comparison, the 'break' statement is >> executed correctly when two values are not the same. Unfortunately, >> consequently the kdump kernel would _alawys_ hang. >> >> I believe the intent of the original statement is to ensure there is >> no gap between entries of mem_phdr array. But if there is a gap, >> kexec should simply exit with failure. The 'break' statement just >> created a loaded_segment[] array that broke the kernel memory segment >> into multiple entries and resulted in the kdump kernel hang in >> find_memory(). The IA64 (at least 2.6.27-rc4) kdump kernel works in >> some cases today are simply out of luck. >> >> I believe the real fix is to fix the contents of the mem_phdr array. >> Since i do not know how to fix it, my patch would close up the >> gap where there is the a gap between entries of the mem_phdr array. >> >> Does it make more sense to you now, Simon? > > Hi Jay, > > yes that does make sense. I'd like to poke around and see > if mem_phdr can be fixed. I think the whole ehdr is read from the kernel binary in slurp_decompress_file. Bernhard reported a kdump kernel boot problem caused by a patch regarding per-cpu variables access in early boot code: http://article.gmane.org/gmane.linux.ports.ia64/19380 I backed out the offending patch and i was no longer able to reproduce this problem. So, it is safe to say the problem was due to how we process data from the vmlinuz. The code i tried to change: - if (loaded_segments[loaded_segments_num].end != - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) - break; has two problems: 1) '!=' operation takes precedence over '&'. If the code is to do what it intends to do, the statement should be: if (loaded_segments[loaded_segments_num].end != (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) break; 2) When the 'break' is really executed, you breaks the kernel segment into multiple segments. The code needs fix even if the problem i saw was a result of a bug in the kernel. Thanks, jay > _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-13 0:38 ` Jay Lan @ 2008-09-15 5:47 ` Simon Horman 0 siblings, 0 replies; 8+ messages in thread From: Simon Horman @ 2008-09-15 5:47 UTC (permalink / raw) To: Jay Lan; +Cc: kexec On Fri, Sep 12, 2008 at 05:38:09PM -0700, Jay Lan wrote: > Simon Horman wrote: > >> Hi, > >> > >> It should be mem_phdr, got it from mem_ehdr->e_phdr. > >> > >>> i=0, p_paddr=3018000000, p_memsz=d04480, p_offset=10000, p_type=1 > >>> i=1, p_paddr=3018d20000, p_memsz=9620, p_offset=d20000, p_type=1 > >>> i=2, p_paddr=3018d30000, p_memsz=564490, p_offset=d30000, p_type=1 > >>> i=3, p_paddr=0, p_memsz=0, p_offset=0, p_type=4 > >> Does anyone understand how the array were created and why there > >> was a gap between i=0 and i=1 entries? I think this is the problem > >> but i do not know how to fix it, so tried to work around it. > >> > >> The statement my patch replaced was totally broken: > >> - if (loaded_segments[loaded_segments_num].end != > >> - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) > >> - break; > >> + if (loaded_segments[loaded_segments_num].end < > >> + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) > >> + loaded_segments[loaded_segments_num].end > >> + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); > >> > >> My debugging showed that when "loaded_segments[loaded_segments_num].end" > >> != "phdr->p_paddr & ~(ELF_PAGE_SIZE-1)", they were treated as equal > >> and continue to next statement. However, if i assign both expression > >> to local variables and do comparison, the 'break' statement is > >> executed correctly when two values are not the same. Unfortunately, > >> consequently the kdump kernel would _alawys_ hang. > >> > >> I believe the intent of the original statement is to ensure there is > >> no gap between entries of mem_phdr array. But if there is a gap, > >> kexec should simply exit with failure. The 'break' statement just > >> created a loaded_segment[] array that broke the kernel memory segment > >> into multiple entries and resulted in the kdump kernel hang in > >> find_memory(). The IA64 (at least 2.6.27-rc4) kdump kernel works in > >> some cases today are simply out of luck. > >> > >> I believe the real fix is to fix the contents of the mem_phdr array. > >> Since i do not know how to fix it, my patch would close up the > >> gap where there is the a gap between entries of the mem_phdr array. > >> > >> Does it make more sense to you now, Simon? > > > > Hi Jay, > > > > yes that does make sense. I'd like to poke around and see > > if mem_phdr can be fixed. > > I think the whole ehdr is read from the kernel binary in > slurp_decompress_file. > > Bernhard reported a kdump kernel boot problem caused by a patch > regarding per-cpu variables access in early boot code: > http://article.gmane.org/gmane.linux.ports.ia64/19380 > > I backed out the offending patch and i was no longer able to > reproduce this problem. > > So, it is safe to say the problem was due to how we process > data from the vmlinuz. > > The code i tried to change: > - if (loaded_segments[loaded_segments_num].end != > - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) > - break; > has two problems: > 1) '!=' operation takes precedence over '&'. If the code is to > do what it intends to do, the statement should be: > if (loaded_segments[loaded_segments_num].end != > (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) > break; Good point. That is definately a problem. > 2) When the 'break' is really executed, you breaks the kernel > segment into multiple segments. I also agree that is a problem. I just wonder what is the best thing to do about it. > The code needs fix even if the problem i saw was a result of > a bug in the kernel. Agreed. For now can you make a patch that just fixes precedence logic bug? Lets merge that and then tackle the more complex broken segment issue. -- Simon Horman VA Linux Systems Japan K.K., Sydney, Australia Satellite Office H: www.vergenet.net/~horms/ W: www.valinux.co.jp/en _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] IA64: kexec allocates too few memory for kdump kernel itself 2008-09-04 18:28 ` Jay Lan 2008-09-12 23:27 ` Simon Horman @ 2008-09-15 6:58 ` Simon Horman 1 sibling, 0 replies; 8+ messages in thread From: Simon Horman @ 2008-09-15 6:58 UTC (permalink / raw) To: Jay Lan; +Cc: kexec On Thu, Sep 04, 2008 at 11:28:32AM -0700, Jay Lan wrote: > Jay Lan wrote: > > Simon Horman wrote: > >> On Wed, Sep 03, 2008 at 02:01:59PM -0700, Jay Lan wrote: > >>> Sometimes the kexec would allocate not enough memory for kdump kernel > >>> itself on IA64 and caused kdump kernel to panic at boot. > >>> > >>> When it happens, the /proc/iomem would show a kernel RAM segment like > >>> this: > >>> 3014000000-3015294fff : System RAM > >>> 3014000000-3014823ccf : Kernel code > >>> 3014823cd0-3014dee8ef : Kernel data > >>> 3014dee8f0-301529448f : Kernel bss > >>> 3015295000-307bffdfff : System RAM > >>> 3018000000-3037ffffff : Crash kernel > >>> > >>> But kexec would allocate memory 3018000000-3019290000 for the kernel, > >>> which is 0x5000 smaller than the regular kernel. In my cases, the > >>> physical_node_map and kern_memmap of the kdump kernel overlaped and > >>> caused data corruption. > >>> > >>> This patch fixes the problem. The patch was generated against > >>> kexec-tools 2.0.0 and tested in 2.6.27-rc4. > >> Hi Jay, > >> > >> I am unclear about why this underallocation occurs. > > > > Hi Simon, > > > > The routine add_loaded_segments_info() set up "loaded_segment" array > > that is needed by purgatory code, based on data stored in the > > mem_ehdr array passed in as the second parameter. > > > > Upon entrance of the routine, the crash_memory_range[] contains > > information about the regular kernel: > > crash_memory_range[ 0]: start= 3000080000, end= 30003fffff > > crash_memory_range[ 1]: start= 3003000000, end= 3005ffffff > > crash_memory_range[ 2]: start= 3006000000, end= 3013ffffff > > crash_memory_range[ 3]: start= 3014000000, end= 3015294fff > > > > The #3 entry is the kernel memory segment. > > > > And the mem_ehdr array would contain data as such: > > Hi, > > It should be mem_phdr, got it from mem_ehdr->e_phdr. > > > i=0, p_paddr=3018000000, p_memsz=d04480, p_offset=10000, p_type=1 > > i=1, p_paddr=3018d20000, p_memsz=9620, p_offset=d20000, p_type=1 > > i=2, p_paddr=3018d30000, p_memsz=564490, p_offset=d30000, p_type=1 > > i=3, p_paddr=0, p_memsz=0, p_offset=0, p_type=4 > > Does anyone understand how the array were created and why there > was a gap between i=0 and i=1 entries? I think this is the problem > but i do not know how to fix it, so tried to work around it. > > The statement my patch replaced was totally broken: > - if (loaded_segments[loaded_segments_num].end != > - phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) > - break; > + if (loaded_segments[loaded_segments_num].end < > + (phdr->p_paddr & ~(ELF_PAGE_SIZE-1)) ) > + loaded_segments[loaded_segments_num].end > + = phdr->p_paddr & ~(ELF_PAGE_SIZE-1); > > My debugging showed that when "loaded_segments[loaded_segments_num].end" > != "phdr->p_paddr & ~(ELF_PAGE_SIZE-1)", they were treated as equal > and continue to next statement. However, if i assign both expression > to local variables and do comparison, the 'break' statement is > executed correctly when two values are not the same. Unfortunately, > consequently the kdump kernel would _alawys_ hang. > > I believe the intent of the original statement is to ensure there is > no gap between entries of mem_phdr array. But if there is a gap, > kexec should simply exit with failure. The 'break' statement just > created a loaded_segment[] array that broke the kernel memory segment > into multiple entries and resulted in the kdump kernel hang in > find_memory(). The IA64 (at least 2.6.27-rc4) kdump kernel works in > some cases today are simply out of luck. > > I believe the real fix is to fix the contents of the mem_phdr array. > Since i do not know how to fix it, my patch would close up the > gap where there is the a gap between entries of the mem_phdr array. > > Does it make more sense to you now, Simon? > > Regards, > - jay > > > > > > > The code wants the new loaded_segments contain starting address > > all aligned at page boundary, which is 0x10000 in IA64. > > > > Note that the p_memsz of mem_ehdr does not match to entries in > > /proc/iomem: > > 3014000000-3015294fff : System RAM > > 3014000000-3014823ccf : Kernel code > > 3014823cd0-3014dee8ef : Kernel data > > 3014dee8f0-301529448f : Kernel bss > > > > The original code of add_loaded_segments_info() would go through > > the mem_ehdr array and use the p_paddr of the first entry (the > > beginning of the reserved memory) as the start address, add > > the p_memsz of three entries to calculate the end address of > > the kernel segment. > > > > But the p_paddr of i=0 plus p_memsz of i=0 should result in > > 3018d10000 as the p_paddr of i=1 entry, but actually the > > p_paddr of i=1 is 3018d20000. The logic of that routine > > can not explain the discrepency. > > > > So, where the data of mem_ehdr array come from? > > > > add_loaded_segments_info > > <- load_crashdump_segments > > <- elf_ia64_load > > <- file_type[i].load > > <- my_load > > > > The elf_ia64_load set up mem_ehdr, probabaly based on data > > pointed by *buf, which i think comes from vmlinuz. Yes, I believe that is the case too. > > So, i failed to find out how the p_memsz were set up initially. > > But, i think we did it the way too complicated, IMHO. I think that the relevant code path is: build_mem_elf64_phdr() called by: build_mem_phdrs() called by: build_elf_info() called by: build_elf_exec_info() called by: elf_ia64_load(), before calling load_crashdump_segments() As the PT_LOAD segments in mem_ehdr should correlate with data read from vmlinux, perhaps you can see something interesting by running readelf -l vmlinux > > The crash_memory_range[] array showed the kernel segment consumed > > 0x1295000 bytes of memory and we only need to tell the purgatory > > code to reserve that amount of memory. The logic in > > add_loaded_segments_info() came out with 0x1290000 and caused the > > crashkernel to panic on boot. > > > > Hmmm, as i types now, i may not consider the situation where > > the crashkernel is not the same as the first kernel... > > > > Note that the underallocation does not _ALWAYS_ happen! It depends > > on the vmlinux we build. Honestly i do not understand some part of > > the kexec-tools code well enough to make major surgery to the code. > > So, i just compare the end address after calculation of i=0 entry > > of mem_ehdr array with the start address of the second entry. If it > > is too small, i just bring it up to align with the start address of > > the second entry. I am happy to allocate one extra page, may not be > > needed in some cases, of memory than to panic. Yes, my patch is > > a work-around. > > > > If you can find the true cause of the problem and fix it, it > > would be great and appreciated! -- Simon Horman VA Linux Systems Japan K.K., Sydney, Australia Satellite Office H: www.vergenet.net/~horms/ W: www.valinux.co.jp/en _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-09-15 6:58 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-09-03 21:01 [PATCH] IA64: kexec allocates too few memory for kdump kernel itself Jay Lan 2008-09-04 0:01 ` Simon Horman 2008-09-04 1:37 ` Jay Lan 2008-09-04 18:28 ` Jay Lan 2008-09-12 23:27 ` Simon Horman 2008-09-13 0:38 ` Jay Lan 2008-09-15 5:47 ` Simon Horman 2008-09-15 6:58 ` Simon Horman
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.