* [PATCH 0/3] makedumpfile: Fix handling of ELF segments with p_memsz > p_filesz
@ 2016-02-10 7:47 Petr Tesarik
2016-02-10 7:49 ` [PATCH 1/3] makedumpfile: Keep segment memory size when re-filtering ELF dumps Petr Tesarik
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Petr Tesarik @ 2016-02-10 7:47 UTC (permalink / raw)
To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list
Hello Ivan & Atsushi,
after spending some time on this, I think I have a complete working
solution. I chose to rewrite readpage_elf from scratch. Unfortunately,
I'm unable to verify that it still works the same (sans bugs) because
of a regression in commit d18796d090623d18f46c8dc608dcad9960fbbe9b
(reported on the mailing list).
I was able to verify that --dump-dmesg works on a variety of weird ELF
files that I created manually (and the output hasn't changed). In my
opinion, this is enough to verify that the read routine works, but the
final decision is up to you.
Petr Tesarik
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH 1/3] makedumpfile: Keep segment memory size when re-filtering ELF dumps 2016-02-10 7:47 [PATCH 0/3] makedumpfile: Fix handling of ELF segments with p_memsz > p_filesz Petr Tesarik @ 2016-02-10 7:49 ` Petr Tesarik 2016-02-10 7:49 ` [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered Petr Tesarik 2016-02-10 7:50 ` [PATCH 3/3] makedumpfile: Rewrite readpage_elf Petr Tesarik 2 siblings, 0 replies; 15+ messages in thread From: Petr Tesarik @ 2016-02-10 7:49 UTC (permalink / raw) To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list Originally, makedumpfile was designed to read from /proc/vmcore, where each segment's p_memsz is equal to its p_filesz (see below). However, makedumpfile can also be used to re-filter an already filtered ELF dump file, where memory size may be larger than file size. In that case the memory size should be used as the size of the segment. This affects: 1. max_mapnr This value is computed as the highest phys_end. If the last segment was filtered, max_mapnr may be too small, and the crash utility will refuse to read that memory (even with --zero_excluded). 2. p_memsz field in ELF dumps The resulting ELF segment p_memsz will be capped to original file's p_filesz, ignoring the original p_memsz. 3. memory holes in KDUMP dumps Pages excluded in the original ELF dump will be appear as memory holes in the resulting KDUMP file's first bitmap. 4. vtop translation Virtual addresses that were filtered out in the original ELF file cannot be translated to physical addresses using the generic vtop translation. My fix uses p_memsz to set physical and virtual extents of ELF segments, because this is their actual size. However, file size is important when accessing page data, so it must be stored separately and checked when translating a physical address to a file offset. Signed-off-by: Petr Tesarik <ptesarik@suse.com> --- elf_info.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) --- a/elf_info.c +++ b/elf_info.c @@ -38,6 +38,7 @@ struct pt_load_segment { off_t file_offset; + off_t file_size; unsigned long long phys_start; unsigned long long phys_end; unsigned long long virt_start; @@ -162,10 +163,11 @@ dump_Elf_load(Elf64_Phdr *prog, int num_ pls = &pt_loads[num_load]; pls->phys_start = prog->p_paddr; - pls->phys_end = pls->phys_start + prog->p_filesz; + pls->phys_end = pls->phys_start + prog->p_memsz; pls->virt_start = prog->p_vaddr; - pls->virt_end = pls->virt_start + prog->p_filesz; + pls->virt_end = pls->virt_start + prog->p_memsz; pls->file_offset = prog->p_offset; + pls->file_size = prog->p_filesz; DEBUG_MSG("LOAD (%d)\n", num_load); DEBUG_MSG(" phys_start : %llx\n", pls->phys_start); @@ -462,7 +464,7 @@ paddr_to_offset(unsigned long long paddr for (i = offset = 0; i < num_pt_loads; i++) { pls = &pt_loads[i]; if ((paddr >= pls->phys_start) - && (paddr < pls->phys_end)) { + && (paddr < pls->phys_start + pls->file_size)) { offset = (off_t)(paddr - pls->phys_start) + pls->file_offset; break; @@ -480,16 +482,14 @@ paddr_to_offset2(unsigned long long padd { int i; off_t offset; - unsigned long long len; struct pt_load_segment *pls; for (i = offset = 0; i < num_pt_loads; i++) { pls = &pt_loads[i]; - len = pls->phys_end - pls->phys_start; if ((paddr >= pls->phys_start) - && (paddr < pls->phys_end) + && (paddr < pls->phys_start + pls->file_size) && (hint >= pls->file_offset) - && (hint < pls->file_offset + len)) { + && (hint < pls->file_offset + pls->file_size)) { offset = (off_t)(paddr - pls->phys_start) + pls->file_offset; break; _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered 2016-02-10 7:47 [PATCH 0/3] makedumpfile: Fix handling of ELF segments with p_memsz > p_filesz Petr Tesarik 2016-02-10 7:49 ` [PATCH 1/3] makedumpfile: Keep segment memory size when re-filtering ELF dumps Petr Tesarik @ 2016-02-10 7:49 ` Petr Tesarik 2016-05-18 7:44 ` Atsushi Kumagai 2016-02-10 7:50 ` [PATCH 3/3] makedumpfile: Rewrite readpage_elf Petr Tesarik 2 siblings, 1 reply; 15+ messages in thread From: Petr Tesarik @ 2016-02-10 7:49 UTC (permalink / raw) To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list Pages that were filtered in the original dump file should also be filtered in the destination file, i.e. pages between p_filesz and p_memsz must have their corresponding bit cleared in the 2nd bitmap. Note that in theory, such ELF segments might not refer to filtered pages, because the crash kernel copies the program headers verbatim from elfcorehdr=. However, all common sources of /proc/vmcore program headers use the same value for each p_memsz and p_filesz: a. kexec(8) Program headers are created in two places, but in both cases the value of p_memsz is equal to p_filesz. Reference: kexec/crashdump-elf.c in kexec-tools b. kexec_file_load(2) Conceptually the same code as kexec(8): two places, both set p_filesz and p_memsz to the same value. Reference: kexec/crashdump-elf.c in Linux kernel c. created in the crash kernel (s390) On s390, program headers are created in the crash kernel, but both p_filesz and p_memsz are set to "end - start", i.e. the same value. Reference: arch/s390/kernel/crash_dump.c in Linux kernel The above means that p_memsz > p_filesz only in ELF files that were processed by makedumpfile, hence it can be safely assumed that pages between p_filesz and p_memsz were filtered out. Signed-off-by: Petr Tesarik <ptesarik@suse.com> --- elf_info.c | 26 ++++++++++++++++++++++++++ elf_info.h | 5 +++++ makedumpfile.c | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 63 insertions(+) --- a/elf_info.c +++ b/elf_info.c @@ -1096,6 +1096,32 @@ get_pt_load(int idx, return TRUE; } +int +get_pt_load_extents(int idx, + unsigned long long *phys_start, + unsigned long long *phys_end, + off_t *file_offset, + off_t *file_size) +{ + struct pt_load_segment *pls; + + if (num_pt_loads <= idx) + return FALSE; + + pls = &pt_loads[idx]; + + if (phys_start) + *phys_start = pls->phys_start; + if (phys_end) + *phys_end = pls->phys_end; + if (file_offset) + *file_offset = pls->file_offset; + if (file_size) + *file_size = pls->file_size; + + return TRUE; +} + unsigned int get_num_pt_loads(void) { --- a/elf_info.h +++ b/elf_info.h @@ -61,6 +61,11 @@ int get_pt_load(int idx, unsigned long long *phys_end, unsigned long long *virt_start, unsigned long long *virt_end); +int get_pt_load_extents(int idx, + unsigned long long *phys_start, + unsigned long long *phys_end, + off_t *file_offset, + off_t *file_size); unsigned int get_num_pt_loads(void); void set_nr_cpus(int num); --- a/makedumpfile.c +++ b/makedumpfile.c @@ -4395,6 +4395,32 @@ read_start_flat_header(void) return TRUE; } +static void +exclude_nodata_pages(struct cycle *cycle) +{ + int i; + unsigned long long phys_start, phys_end; + off_t file_size; + + i = 0; + while (get_pt_load_extents(i, &phys_start, &phys_end, + NULL, &file_size)) { + unsigned long long pfn, pfn_end; + + pfn = paddr_to_pfn(phys_start + file_size); + pfn_end = paddr_to_pfn(phys_end); + if (pfn < cycle->start_pfn) + pfn = cycle->start_pfn; + if (pfn_end >= cycle->end_pfn) + pfn_end = cycle->end_pfn - 1; + while (pfn <= pfn_end) { + clear_bit_on_2nd_bitmap(pfn, cycle); + ++pfn; + } + ++i; + } +} + int read_flat_data_header(struct makedumpfile_data_header *fdh) { @@ -6087,6 +6113,12 @@ create_2nd_bitmap(struct cycle *cycle) } /* + * If re-filtering ELF dump, exclude pages that were already + * excluded in the original file. + */ + exclude_nodata_pages(cycle); + + /* * Exclude cache pages, cache private pages, user data pages, * and hwpoison pages. */ _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered 2016-02-10 7:49 ` [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered Petr Tesarik @ 2016-05-18 7:44 ` Atsushi Kumagai 2016-06-07 4:18 ` Atsushi Kumagai 0 siblings, 1 reply; 15+ messages in thread From: Atsushi Kumagai @ 2016-05-18 7:44 UTC (permalink / raw) To: Petr Tesarik; +Cc: kexec mailing list Hello Petr, Sorry for my late comment: >Pages that were filtered in the original dump file should also be filtered >in the destination file, i.e. pages between p_filesz and p_memsz must have >their corresponding bit cleared in the 2nd bitmap. Note that in theory, >such ELF segments might not refer to filtered pages, because the crash >kernel copies the program headers verbatim from elfcorehdr=. However, all >common sources of /proc/vmcore program headers use the same value for each >p_memsz and p_filesz: > >a. kexec(8) > Program headers are created in two places, but in both cases the value > of p_memsz is equal to p_filesz. > Reference: kexec/crashdump-elf.c in kexec-tools > >b. kexec_file_load(2) > Conceptually the same code as kexec(8): two places, both set p_filesz > and p_memsz to the same value. > Reference: kexec/crashdump-elf.c in Linux kernel > >c. created in the crash kernel (s390) > On s390, program headers are created in the crash kernel, but both > p_filesz and p_memsz are set to "end - start", i.e. the same value. > Reference: arch/s390/kernel/crash_dump.c in Linux kernel > >The above means that p_memsz > p_filesz only in ELF files that were >processed by makedumpfile, hence it can be safely assumed that pages >between p_filesz and p_memsz were filtered out. > >Signed-off-by: Petr Tesarik <ptesarik@suse.com> > >--- > elf_info.c | 26 ++++++++++++++++++++++++++ > elf_info.h | 5 +++++ > makedumpfile.c | 32 ++++++++++++++++++++++++++++++++ > 3 files changed, 63 insertions(+) > >--- a/elf_info.c >+++ b/elf_info.c >@@ -1096,6 +1096,32 @@ get_pt_load(int idx, > return TRUE; > } > >+int >+get_pt_load_extents(int idx, >+ unsigned long long *phys_start, >+ unsigned long long *phys_end, >+ off_t *file_offset, >+ off_t *file_size) >+{ >+ struct pt_load_segment *pls; >+ >+ if (num_pt_loads <= idx) >+ return FALSE; >+ >+ pls = &pt_loads[idx]; >+ >+ if (phys_start) >+ *phys_start = pls->phys_start; >+ if (phys_end) >+ *phys_end = pls->phys_end; >+ if (file_offset) >+ *file_offset = pls->file_offset; >+ if (file_size) >+ *file_size = pls->file_size; >+ >+ return TRUE; >+} >+ > unsigned int > get_num_pt_loads(void) > { >--- a/elf_info.h >+++ b/elf_info.h >@@ -61,6 +61,11 @@ int get_pt_load(int idx, > unsigned long long *phys_end, > unsigned long long *virt_start, > unsigned long long *virt_end); >+int get_pt_load_extents(int idx, >+ unsigned long long *phys_start, >+ unsigned long long *phys_end, >+ off_t *file_offset, >+ off_t *file_size); > unsigned int get_num_pt_loads(void); > > void set_nr_cpus(int num); >--- a/makedumpfile.c >+++ b/makedumpfile.c >@@ -4395,6 +4395,32 @@ read_start_flat_header(void) > return TRUE; > } > >+static void >+exclude_nodata_pages(struct cycle *cycle) >+{ >+ int i; >+ unsigned long long phys_start, phys_end; >+ off_t file_size; >+ >+ i = 0; >+ while (get_pt_load_extents(i, &phys_start, &phys_end, >+ NULL, &file_size)) { >+ unsigned long long pfn, pfn_end; >+ >+ pfn = paddr_to_pfn(phys_start + file_size); >+ pfn_end = paddr_to_pfn(phys_end); Does this code exclude the first pfn of out of PT_LOAD even if there is no unstored pages ? pfn and pfn_end will point at the next pfn to the last pfn of PT_LOAD. This will be problem for the environments which have a overlapped PT_LOAD segment like ia64. >+ if (pfn < cycle->start_pfn) >+ pfn = cycle->start_pfn; >+ if (pfn_end >= cycle->end_pfn) >+ pfn_end = cycle->end_pfn - 1; >+ while (pfn <= pfn_end) { Should we change this condition to "pfn < pfn_end" ? >+ clear_bit_on_2nd_bitmap(pfn, cycle); >+ ++pfn; >+ } >+ ++i; >+ } >+} Thanks, Atsushi Kumagai >+ > int > read_flat_data_header(struct makedumpfile_data_header *fdh) > { >@@ -6087,6 +6113,12 @@ create_2nd_bitmap(struct cycle *cycle) > } > > /* >+ * If re-filtering ELF dump, exclude pages that were already >+ * excluded in the original file. >+ */ >+ exclude_nodata_pages(cycle); >+ >+ /* > * Exclude cache pages, cache private pages, user data pages, > * and hwpoison pages. > */ > >_______________________________________________ >kexec mailing list >kexec@lists.infradead.org >http://lists.infradead.org/mailman/listinfo/kexec _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered 2016-05-18 7:44 ` Atsushi Kumagai @ 2016-06-07 4:18 ` Atsushi Kumagai 2016-07-12 21:38 ` Petr Tesarik 0 siblings, 1 reply; 15+ messages in thread From: Atsushi Kumagai @ 2016-06-07 4:18 UTC (permalink / raw) To: Petr Tesarik; +Cc: kexec mailing list >>+static void >>+exclude_nodata_pages(struct cycle *cycle) >>+{ >>+ int i; >>+ unsigned long long phys_start, phys_end; >>+ off_t file_size; >>+ >>+ i = 0; >>+ while (get_pt_load_extents(i, &phys_start, &phys_end, >>+ NULL, &file_size)) { >>+ unsigned long long pfn, pfn_end; >>+ >>+ pfn = paddr_to_pfn(phys_start + file_size); >>+ pfn_end = paddr_to_pfn(phys_end); > >Does this code exclude the first pfn of out of PT_LOAD even if there is >no unstored pages ? pfn and pfn_end will point at the next pfn to the >last pfn of PT_LOAD. >This will be problem for the environments which have a overlapped PT_LOAD >segment like ia64. > >>+ if (pfn < cycle->start_pfn) >>+ pfn = cycle->start_pfn; >>+ if (pfn_end >= cycle->end_pfn) >>+ pfn_end = cycle->end_pfn - 1; >>+ while (pfn <= pfn_end) { > >Should we change this condition to "pfn < pfn_end" ? > >>+ clear_bit_on_2nd_bitmap(pfn, cycle); >>+ ++pfn; >>+ } >>+ ++i; >>+ } >>+} I fixed this as below for v1.6.0. Of course your comment would still be helpful. -- Author: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> Date: Thu May 26 15:17:25 2016 +0900 [PATCH] Fix boundary value bug for checking memory holes Now the next pfn to the last pfn of PT_LOAD is always excluded by boundary value bug. --------- PT_LOAD ----------| ----+-----------------------+---------------------+---- | pfn:N | pfn:N+1 | ... ----+-----------------------+---------------------+---- ^ this pfn This will be problem in the environments which have a overlapped PT_LOAD segment like ia64 because the pfn excluded by this bug can be included in another PT_LOAD. Signed-off-by: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> diff --git a/makedumpfile.c b/makedumpfile.c index 4f17686..8a80976 100644 --- a/makedumpfile.c +++ b/makedumpfile.c @@ -4449,7 +4449,7 @@ exclude_nodata_pages(struct cycle *cycle) pfn = cycle->start_pfn; if (pfn_end >= cycle->end_pfn) pfn_end = cycle->end_pfn - 1; - while (pfn <= pfn_end) { + while (pfn < pfn_end) { clear_bit_on_2nd_bitmap(pfn, cycle); ++pfn; } Thanks, Atsushi Kumagai > >Thanks, >Atsushi Kumagai > >>+ >> int >> read_flat_data_header(struct makedumpfile_data_header *fdh) >> { >>@@ -6087,6 +6113,12 @@ create_2nd_bitmap(struct cycle *cycle) >> } >> >> /* >>+ * If re-filtering ELF dump, exclude pages that were already >>+ * excluded in the original file. >>+ */ >>+ exclude_nodata_pages(cycle); >>+ >>+ /* >> * Exclude cache pages, cache private pages, user data pages, >> * and hwpoison pages. >> */ >> >>_______________________________________________ >>kexec mailing list >>kexec@lists.infradead.org >>http://lists.infradead.org/mailman/listinfo/kexec > >_______________________________________________ >kexec mailing list >kexec@lists.infradead.org >http://lists.infradead.org/mailman/listinfo/kexec _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered 2016-06-07 4:18 ` Atsushi Kumagai @ 2016-07-12 21:38 ` Petr Tesarik 2016-07-13 7:43 ` Atsushi Kumagai 0 siblings, 1 reply; 15+ messages in thread From: Petr Tesarik @ 2016-07-12 21:38 UTC (permalink / raw) To: Atsushi Kumagai; +Cc: kexec mailing list On Tue, 7 Jun 2016 04:18:48 +0000 Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> wrote: > >>+static void > >>+exclude_nodata_pages(struct cycle *cycle) > >>+{ > >>+ int i; > >>+ unsigned long long phys_start, phys_end; > >>+ off_t file_size; > >>+ > >>+ i = 0; > >>+ while (get_pt_load_extents(i, &phys_start, &phys_end, > >>+ NULL, &file_size)) { > >>+ unsigned long long pfn, pfn_end; > >>+ > >>+ pfn = paddr_to_pfn(phys_start + file_size); > >>+ pfn_end = paddr_to_pfn(phys_end); > > > >Does this code exclude the first pfn of out of PT_LOAD even if there is > >no unstored pages ? pfn and pfn_end will point at the next pfn to the > >last pfn of PT_LOAD. Indeed, phys_end is in fact the first physical address that is _not_ contained in the segment. Good catch! > >This will be problem for the environments which have a overlapped PT_LOAD > >segment like ia64. I think it's quite the opposite. Partial pages on IA64 are the only ones that were handled correctly with the old code. > > > >>+ if (pfn < cycle->start_pfn) > >>+ pfn = cycle->start_pfn; > >>+ if (pfn_end >= cycle->end_pfn) > >>+ pfn_end = cycle->end_pfn - 1; > >>+ while (pfn <= pfn_end) { > > > >Should we change this condition to "pfn < pfn_end" ? Better than nothing, but if the last page of a LOAD segment is not partial, it will not be excluded. Except for partial pages, end_pfn refers to the first PFN _after_ the current segment. > > > >>+ clear_bit_on_2nd_bitmap(pfn, cycle); > >>+ ++pfn; > >>+ } > >>+ ++i; > >>+ } > >>+} > > I fixed this as below for v1.6.0. > Of course your comment would still be helpful. It comes too late, I'm afraid. In all ia64 dumps with partial pages I have seen, the full content of the partial page is indeed stored in another segment, so the page would be marked incorrectly. OTOH I think the off-by-one bug you fixed affected all segments, even those without any partial pages; it would incorrectly mark the first PFN after the segment as filtered. The perfect solution is to round the starting PFN up instead of down to preserve partial pages, but I'll have to think about it some more before I can send a patch. Thank you for your fix! Petr T _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered 2016-07-12 21:38 ` Petr Tesarik @ 2016-07-13 7:43 ` Atsushi Kumagai 0 siblings, 0 replies; 15+ messages in thread From: Atsushi Kumagai @ 2016-07-13 7:43 UTC (permalink / raw) To: ptesarik@suse.cz; +Cc: kexec@lists.infradead.org Hello Petr, I'm happy to get your reply. >On Tue, 7 Jun 2016 04:18:48 +0000 >Atsushi Kumagai <ats-kumagai@wm.jp.nec.com> wrote: > >> >>+static void >> >>+exclude_nodata_pages(struct cycle *cycle) >> >>+{ >> >>+ int i; >> >>+ unsigned long long phys_start, phys_end; >> >>+ off_t file_size; >> >>+ >> >>+ i = 0; >> >>+ while (get_pt_load_extents(i, &phys_start, &phys_end, >> >>+ NULL, &file_size)) { >> >>+ unsigned long long pfn, pfn_end; >> >>+ >> >>+ pfn = paddr_to_pfn(phys_start + file_size); >> >>+ pfn_end = paddr_to_pfn(phys_end); >> > >> >Does this code exclude the first pfn of out of PT_LOAD even if there is >> >no unstored pages ? pfn and pfn_end will point at the next pfn to the >> >last pfn of PT_LOAD. > >Indeed, phys_end is in fact the first physical address that is _not_ >contained in the segment. > >Good catch! > >> >This will be problem for the environments which have a overlapped PT_LOAD >> >segment like ia64. > >I think it's quite the opposite. Partial pages on IA64 are the only >ones that were handled correctly with the old code. I didn't take care partial pages, just considered the case below: pfn 1 2 3 4 5 6 7 ------------------------------------------------ PT_LAOD(1) |---+---+---+---| PT_LAOD(2) |---+---+---+---| During checking PT_LOAD(1), the bit of pfn:5 will be dropped if mem_size equals file_size, but practically the pfn is an used page as PT_LAOD(2) shows. If there is only PT_LOAD(1), pfn:5 must be on hole and the actual harm doesn't occur. >> > >> >>+ if (pfn < cycle->start_pfn) >> >>+ pfn = cycle->start_pfn; >> >>+ if (pfn_end >= cycle->end_pfn) >> >>+ pfn_end = cycle->end_pfn - 1; >> >>+ while (pfn <= pfn_end) { >> > >> >Should we change this condition to "pfn < pfn_end" ? > >Better than nothing, but if the last page of a LOAD segment is not >partial, it will not be excluded. Except for partial pages, end_pfn >refers to the first PFN _after_ the current segment. If mem_size equals file_size, the last page of a LOAD also shouldn't be removed. However, the condition of "pfn <= pfn_end" can remove it since end_pfn refers to the first PFN _after_ the current segment. I just fixed it. >> > >> >>+ clear_bit_on_2nd_bitmap(pfn, cycle); >> >>+ ++pfn; >> >>+ } >> >>+ ++i; >> >>+ } >> >>+} >> >> I fixed this as below for v1.6.0. >> Of course your comment would still be helpful. > >It comes too late, I'm afraid. > >In all ia64 dumps with partial pages I have seen, the full content of >the partial page is indeed stored in another segment, so the page would >be marked incorrectly. OTOH I think the off-by-one bug you fixed >affected all segments, even those without any partial pages; it would >incorrectly mark the first PFN after the segment as filtered. > >The perfect solution is to round the starting PFN up instead of down to >preserve partial pages, but I'll have to think about it some more >before I can send a patch. Again, I didn't consider partial page cases, I also should reconsider this. Thank you for pointing it out. Regards, Atsushi Kumagai _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/3] makedumpfile: Rewrite readpage_elf 2016-02-10 7:47 [PATCH 0/3] makedumpfile: Fix handling of ELF segments with p_memsz > p_filesz Petr Tesarik 2016-02-10 7:49 ` [PATCH 1/3] makedumpfile: Keep segment memory size when re-filtering ELF dumps Petr Tesarik 2016-02-10 7:49 ` [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered Petr Tesarik @ 2016-02-10 7:50 ` Petr Tesarik 2016-02-10 22:18 ` Ivan Delalande 2016-02-29 18:55 ` Petr Tesarik 2 siblings, 2 replies; 15+ messages in thread From: Petr Tesarik @ 2016-02-10 7:50 UTC (permalink / raw) To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list The current code in readpage_elf (and readpage_elf_parallel) is extremely hard to follow. Additionally, it still does not cover all possible cases. For example, attempts to read outside of any ELF segment will end up with phys_start being 0, frac_head a negative number, interpreted as a large positive number by memset() and write past buffer end. Instead of trying to handle even more "corner cases", I rewrote the algorithm from scratch. The basic idea is simple: set a goal to fill the page buffer with data, then work towards that goal by: - filling holes with zeroes (see Note below), - p_filesz portions with file data and - remaining p_memsz portions again with zeroes. Repeat this method for each LOAD until the goal is achieved, or an error occurs. In most cases, the loop runs only once. Note: A "hole" is data at a physical address that is not covered by any ELF LOAD program header. In other words, the ELF file does not specify any data for such a hole (not even zeroes). So, why does makedumpfile fill them with zeroes? It's because makedumpfile works with page granularity (the compressed format does not even have a way to store a partial page), so if only part of a page is stored, a complete page must be provided to make this partial data accessible. Credits to Ivan Delalande <colona@arista.com> who first found the problem and wrote the original fix. Signed-off-by: Petr Tesarik <ptesarik@suse.com> --- elf_info.c | 28 +++++++ elf_info.h | 1 makedumpfile.c | 208 ++++++++++++++++++++++----------------------------------- 3 files changed, 110 insertions(+), 127 deletions(-) --- a/elf_info.c +++ b/elf_info.c @@ -691,6 +691,34 @@ get_max_paddr(void) return max_paddr; } +/* + * Find the LOAD segment which is closest to the requested + * physical address within a given distance. + * If there is no such segment, return a negative number. + */ +int +closest_pt_load(unsigned long long paddr, unsigned long distance) +{ + int i, bestidx; + struct pt_load_segment *pls; + unsigned long bestdist; + + bestdist = distance; + bestidx = -1; + for (i = 0; i < num_pt_loads; ++i) { + pls = &pt_loads[i]; + if (paddr >= pls->phys_end) + continue; + if (paddr >= pls->phys_start) + return i; /* Exact match */ + if (bestdist > pls->phys_start - paddr) { + bestdist = pls->phys_start - paddr; + bestidx = i; + } + } + return bestidx; +} + int get_elf64_ehdr(int fd, char *filename, Elf64_Ehdr *ehdr) { --- a/elf_info.h +++ b/elf_info.h @@ -39,6 +39,7 @@ off_t offset_to_pt_load_end(off_t offset unsigned long long vaddr_to_paddr_general(unsigned long long vaddr); off_t vaddr_to_offset_slow(int fd, char *filename, unsigned long long vaddr); unsigned long long get_max_paddr(void); +int closest_pt_load(unsigned long long paddr, unsigned long distance); int page_is_fractional(off_t page_offset); --- a/makedumpfile.c +++ b/makedumpfile.c @@ -648,73 +648,51 @@ read_from_vmcore_parallel(int fd_memory, static int readpage_elf(unsigned long long paddr, void *bufptr) { - off_t offset1, offset2; - size_t size1, size2; - unsigned long long phys_start, phys_end, frac_head = 0; - - offset1 = paddr_to_offset(paddr); - offset2 = paddr_to_offset(paddr + info->page_size); - phys_start = paddr; - phys_end = paddr + info->page_size; - - /* - * Check the case phys_start isn't aligned by page size like below: - * - * phys_start - * = 0x40ffda7000 - * |<-- frac_head -->|------------- PT_LOAD ------------- - * ----+-----------------------+---------------------+---- - * | pfn:N | pfn:N+1 | ... - * ----+-----------------------+---------------------+---- - * | - * pfn_to_paddr(pfn:N) # page size = 16k - * = 0x40ffda4000 - */ - if (!offset1) { - phys_start = page_head_to_phys_start(paddr); - offset1 = paddr_to_offset(phys_start); - frac_head = phys_start - paddr; - memset(bufptr, 0, frac_head); - } - - /* - * Check the case phys_end isn't aligned by page size like the - * phys_start's case. - */ - if (!offset2) { - phys_end = page_head_to_phys_end(paddr); - offset2 = paddr_to_offset(phys_end); - memset(bufptr + (phys_end - paddr), 0, info->page_size - (phys_end - paddr)); - } + int idx; + off_t offset, size; + void *p, *endp; + unsigned long long phys_start, phys_end; + + p = bufptr; + endp = p + info->page_size; + while (p < endp) { + idx = closest_pt_load(paddr + (p - bufptr), endp - p); + if (idx < 0) + break; + + get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); + if (phys_start > paddr + (p - bufptr)) { + memset(p, 0, phys_start - paddr); + p += phys_start - paddr; + } - /* - * Check the separated page on different PT_LOAD segments. - */ - if (offset1 + (phys_end - phys_start) == offset2) { - size1 = phys_end - phys_start; - } else { - for (size1 = 1; size1 < info->page_size - frac_head; size1++) { - offset2 = paddr_to_offset(phys_start + size1); - if (offset1 + size1 != offset2) - break; + offset += paddr - phys_start; + if (size > paddr - phys_start) { + size -= paddr - phys_start; + if (size > endp - p) + size = endp - p; + if (!read_from_vmcore(offset, p, size)) { + ERRMSG("Can't read the dump memory(%s).\n", + info->name_memory); + return FALSE; + } + p += size; + } + if (p < endp) { + size = phys_end - paddr; + if (size > endp - p) + size = endp - p; + memset(p, 0, size); + p += size; } } - if(!read_from_vmcore(offset1, bufptr + frac_head, size1)) { - ERRMSG("Can't read the dump memory(%s).\n", - info->name_memory); + if (p == bufptr) { + ERRMSG("Attempt to read non-existent page at 0x%llx.\n", + paddr); return FALSE; - } - - if (size1 + frac_head != info->page_size) { - size2 = phys_end - (phys_start + size1); - - if(!read_from_vmcore(offset2, bufptr + frac_head + size1, size2)) { - ERRMSG("Can't read the dump memory(%s).\n", - info->name_memory); - return FALSE; - } - } + } else if (p < bufptr) + memset(p, 0, endp - p); return TRUE; } @@ -722,76 +700,52 @@ readpage_elf(unsigned long long paddr, v static int readpage_elf_parallel(int fd_memory, unsigned long long paddr, void *bufptr) { - off_t offset1, offset2; - size_t size1, size2; - unsigned long long phys_start, phys_end, frac_head = 0; - - offset1 = paddr_to_offset(paddr); - offset2 = paddr_to_offset(paddr + info->page_size); - phys_start = paddr; - phys_end = paddr + info->page_size; - - /* - * Check the case phys_start isn't aligned by page size like below: - * - * phys_start - * = 0x40ffda7000 - * |<-- frac_head -->|------------- PT_LOAD ------------- - * ----+-----------------------+---------------------+---- - * | pfn:N | pfn:N+1 | ... - * ----+-----------------------+---------------------+---- - * | - * pfn_to_paddr(pfn:N) # page size = 16k - * = 0x40ffda4000 - */ - if (!offset1) { - phys_start = page_head_to_phys_start(paddr); - offset1 = paddr_to_offset(phys_start); - frac_head = phys_start - paddr; - memset(bufptr, 0, frac_head); - } - - /* - * Check the case phys_end isn't aligned by page size like the - * phys_start's case. - */ - if (!offset2) { - phys_end = page_head_to_phys_end(paddr); - offset2 = paddr_to_offset(phys_end); - memset(bufptr + (phys_end - paddr), 0, info->page_size - - (phys_end - paddr)); - } + int idx; + off_t offset, size; + void *p, *endp; + unsigned long long phys_start, phys_end; + + p = bufptr; + endp = p + info->page_size; + while (p < endp) { + idx = closest_pt_load(paddr + (p - bufptr), endp - p); + if (idx < 0) + break; + + get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); + if (phys_start > paddr + (p - bufptr)) { + memset(p, 0, phys_start - paddr); + p += phys_start - paddr; + } - /* - * Check the separated page on different PT_LOAD segments. - */ - if (offset1 + (phys_end - phys_start) == offset2) { - size1 = phys_end - phys_start; - } else { - for (size1 = 1; size1 < info->page_size - frac_head; size1++) { - offset2 = paddr_to_offset(phys_start + size1); - if (offset1 + size1 != offset2) - break; + offset += paddr - phys_start; + if (size > paddr - phys_start) { + size -= paddr - phys_start; + if (size > endp - p) + size = endp - p; + if (!read_from_vmcore_parallel(fd_memory, offset, bufptr, + size)) { + ERRMSG("Can't read the dump memory(%s).\n", + info->name_memory); + return FALSE; + } + p += size; + } + if (p < endp) { + size = phys_end - paddr; + if (size > endp - p) + size = endp - p; + memset(p, 0, size); + p += size; } } - if(!read_from_vmcore_parallel(fd_memory, offset1, bufptr + frac_head, - size1)) { - ERRMSG("Can't read the dump memory(%s).\n", - info->name_memory); + if (p == bufptr) { + ERRMSG("Attempt to read non-existent page at 0x%llx.\n", + paddr); return FALSE; - } - - if (size1 + frac_head != info->page_size) { - size2 = phys_end - (phys_start + size1); - - if(!read_from_vmcore_parallel(fd_memory, offset2, - bufptr + frac_head + size1, size2)) { - ERRMSG("Can't read the dump memory(%s).\n", - info->name_memory); - return FALSE; - } - } + } else if (p < bufptr) + memset(p, 0, endp - p); return TRUE; } _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] makedumpfile: Rewrite readpage_elf 2016-02-10 7:50 ` [PATCH 3/3] makedumpfile: Rewrite readpage_elf Petr Tesarik @ 2016-02-10 22:18 ` Ivan Delalande 2016-02-17 7:58 ` Atsushi Kumagai 2016-02-29 18:55 ` Petr Tesarik 1 sibling, 1 reply; 15+ messages in thread From: Ivan Delalande @ 2016-02-10 22:18 UTC (permalink / raw) To: Petr Tesarik; +Cc: Atsushi Kumagai, kexec mailing list On Wed, Feb 10, 2016 at 08:50:09AM +0100, Petr Tesarik wrote: > The current code in readpage_elf (and readpage_elf_parallel) is extremely > hard to follow. Additionally, it still does not cover all possible cases. > For example, attempts to read outside of any ELF segment will end up with > phys_start being 0, frac_head a negative number, interpreted as a large > positive number by memset() and write past buffer end. > > Instead of trying to handle even more "corner cases", I rewrote the > algorithm from scratch. The basic idea is simple: set a goal to fill the > page buffer with data, then work towards that goal by: > > - filling holes with zeroes (see Note below), > - p_filesz portions with file data and > - remaining p_memsz portions again with zeroes. > > Repeat this method for each LOAD until the goal is achieved, or an error > occurs. In most cases, the loop runs only once. > > Note: A "hole" is data at a physical address that is not covered by any > ELF LOAD program header. In other words, the ELF file does not specify > any data for such a hole (not even zeroes). So, why does makedumpfile > fill them with zeroes? It's because makedumpfile works with page > granularity (the compressed format does not even have a way to store > a partial page), so if only part of a page is stored, a complete page > must be provided to make this partial data accessible. > > Credits to Ivan Delalande <colona@arista.com> who first found the > problem and wrote the original fix. > > Signed-off-by: Petr Tesarik <ptesarik@suse.com> Tested-by: Ivan Delalande <colona@arista.com> Dump-dmesg works well and gives the expected results with our various setups (x86_64 only). Thanks for your work Petr! -- Ivan "Colona" Delalande Arista Networks _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH 3/3] makedumpfile: Rewrite readpage_elf 2016-02-10 22:18 ` Ivan Delalande @ 2016-02-17 7:58 ` Atsushi Kumagai 0 siblings, 0 replies; 15+ messages in thread From: Atsushi Kumagai @ 2016-02-17 7:58 UTC (permalink / raw) To: Ivan Delalande, Petr Tesarik; +Cc: kexec mailing list >On Wed, Feb 10, 2016 at 08:50:09AM +0100, Petr Tesarik wrote: >> The current code in readpage_elf (and readpage_elf_parallel) is extremely >> hard to follow. Additionally, it still does not cover all possible cases. >> For example, attempts to read outside of any ELF segment will end up with >> phys_start being 0, frac_head a negative number, interpreted as a large >> positive number by memset() and write past buffer end. >> >> Instead of trying to handle even more "corner cases", I rewrote the >> algorithm from scratch. The basic idea is simple: set a goal to fill the >> page buffer with data, then work towards that goal by: >> >> - filling holes with zeroes (see Note below), >> - p_filesz portions with file data and >> - remaining p_memsz portions again with zeroes. >> >> Repeat this method for each LOAD until the goal is achieved, or an error >> occurs. In most cases, the loop runs only once. >> >> Note: A "hole" is data at a physical address that is not covered by any >> ELF LOAD program header. In other words, the ELF file does not specify >> any data for such a hole (not even zeroes). So, why does makedumpfile >> fill them with zeroes? It's because makedumpfile works with page >> granularity (the compressed format does not even have a way to store >> a partial page), so if only part of a page is stored, a complete page >> must be provided to make this partial data accessible. >> >> Credits to Ivan Delalande <colona@arista.com> who first found the >> problem and wrote the original fix. >> >> Signed-off-by: Petr Tesarik <ptesarik@suse.com> > >Tested-by: Ivan Delalande <colona@arista.com> > >Dump-dmesg works well and gives the expected results with our various >setups (x86_64 only). Thanks for your work Petr! Thanks for your great work, Petr and Ivan. I'll merge this patch set into v1.6.0. Thanks, Atsushi Kumagai _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] makedumpfile: Rewrite readpage_elf 2016-02-10 7:50 ` [PATCH 3/3] makedumpfile: Rewrite readpage_elf Petr Tesarik 2016-02-10 22:18 ` Ivan Delalande @ 2016-02-29 18:55 ` Petr Tesarik 2016-02-29 18:59 ` [PATCH] makedumpfile: When reading partial ELF pages, check final pointer against buffer end Petr Tesarik 1 sibling, 1 reply; 15+ messages in thread From: Petr Tesarik @ 2016-02-29 18:55 UTC (permalink / raw) To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list Hi, I just found a silly typo in my patch, see below. On Wed, 10 Feb 2016 08:50:09 +0100 Petr Tesarik <ptesarik@suse.cz> wrote: > The current code in readpage_elf (and readpage_elf_parallel) is extremely > hard to follow. Additionally, it still does not cover all possible cases. > For example, attempts to read outside of any ELF segment will end up with > phys_start being 0, frac_head a negative number, interpreted as a large > positive number by memset() and write past buffer end. > > Instead of trying to handle even more "corner cases", I rewrote the > algorithm from scratch. The basic idea is simple: set a goal to fill the > page buffer with data, then work towards that goal by: > > - filling holes with zeroes (see Note below), > - p_filesz portions with file data and > - remaining p_memsz portions again with zeroes. > > Repeat this method for each LOAD until the goal is achieved, or an error > occurs. In most cases, the loop runs only once. > > Note: A "hole" is data at a physical address that is not covered by any > ELF LOAD program header. In other words, the ELF file does not specify > any data for such a hole (not even zeroes). So, why does makedumpfile > fill them with zeroes? It's because makedumpfile works with page > granularity (the compressed format does not even have a way to store > a partial page), so if only part of a page is stored, a complete page > must be provided to make this partial data accessible. > > Credits to Ivan Delalande <colona@arista.com> who first found the > problem and wrote the original fix. > > Signed-off-by: Petr Tesarik <ptesarik@suse.com> > > --- > elf_info.c | 28 +++++++ > elf_info.h | 1 > makedumpfile.c | 208 ++++++++++++++++++++++----------------------------------- > 3 files changed, 110 insertions(+), 127 deletions(-) > >[...] > --- a/makedumpfile.c > +++ b/makedumpfile.c > @@ -648,73 +648,51 @@ read_from_vmcore_parallel(int fd_memory, > static int > readpage_elf(unsigned long long paddr, void *bufptr) > { > - off_t offset1, offset2; > - size_t size1, size2; > - unsigned long long phys_start, phys_end, frac_head = 0; > - > - offset1 = paddr_to_offset(paddr); > - offset2 = paddr_to_offset(paddr + info->page_size); > - phys_start = paddr; > - phys_end = paddr + info->page_size; > - > - /* > - * Check the case phys_start isn't aligned by page size like below: > - * > - * phys_start > - * = 0x40ffda7000 > - * |<-- frac_head -->|------------- PT_LOAD ------------- > - * ----+-----------------------+---------------------+---- > - * | pfn:N | pfn:N+1 | ... > - * ----+-----------------------+---------------------+---- > - * | > - * pfn_to_paddr(pfn:N) # page size = 16k > - * = 0x40ffda4000 > - */ > - if (!offset1) { > - phys_start = page_head_to_phys_start(paddr); > - offset1 = paddr_to_offset(phys_start); > - frac_head = phys_start - paddr; > - memset(bufptr, 0, frac_head); > - } > - > - /* > - * Check the case phys_end isn't aligned by page size like the > - * phys_start's case. > - */ > - if (!offset2) { > - phys_end = page_head_to_phys_end(paddr); > - offset2 = paddr_to_offset(phys_end); > - memset(bufptr + (phys_end - paddr), 0, info->page_size - (phys_end - paddr)); > - } > + int idx; > + off_t offset, size; > + void *p, *endp; > + unsigned long long phys_start, phys_end; > + > + p = bufptr; > + endp = p + info->page_size; > + while (p < endp) { > + idx = closest_pt_load(paddr + (p - bufptr), endp - p); > + if (idx < 0) > + break; > + > + get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); > + if (phys_start > paddr + (p - bufptr)) { > + memset(p, 0, phys_start - paddr); > + p += phys_start - paddr; > + } > > - /* > - * Check the separated page on different PT_LOAD segments. > - */ > - if (offset1 + (phys_end - phys_start) == offset2) { > - size1 = phys_end - phys_start; > - } else { > - for (size1 = 1; size1 < info->page_size - frac_head; size1++) { > - offset2 = paddr_to_offset(phys_start + size1); > - if (offset1 + size1 != offset2) > - break; > + offset += paddr - phys_start; > + if (size > paddr - phys_start) { > + size -= paddr - phys_start; > + if (size > endp - p) > + size = endp - p; > + if (!read_from_vmcore(offset, p, size)) { > + ERRMSG("Can't read the dump memory(%s).\n", > + info->name_memory); > + return FALSE; > + } > + p += size; > + } > + if (p < endp) { > + size = phys_end - paddr; > + if (size > endp - p) > + size = endp - p; > + memset(p, 0, size); > + p += size; > } > } > > - if(!read_from_vmcore(offset1, bufptr + frac_head, size1)) { > - ERRMSG("Can't read the dump memory(%s).\n", > - info->name_memory); > + if (p == bufptr) { > + ERRMSG("Attempt to read non-existent page at 0x%llx.\n", > + paddr); > return FALSE; > - } > - > - if (size1 + frac_head != info->page_size) { > - size2 = phys_end - (phys_start + size1); > - > - if(!read_from_vmcore(offset2, bufptr + frac_head + size1, size2)) { > - ERRMSG("Can't read the dump memory(%s).\n", > - info->name_memory); > - return FALSE; > - } > - } > + } else if (p < bufptr) Here, of course, it should be: } else if (p < endp) > + memset(p, 0, endp - p); > > return TRUE; > } And same in the second hunk. I'm sending a patch in a minute. Petr Tesarik _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] makedumpfile: When reading partial ELF pages, check final pointer against buffer end 2016-02-29 18:55 ` Petr Tesarik @ 2016-02-29 18:59 ` Petr Tesarik 2016-02-29 19:40 ` Petr Tesarik 0 siblings, 1 reply; 15+ messages in thread From: Petr Tesarik @ 2016-02-29 18:59 UTC (permalink / raw) To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list If the last part of a page is not present in the ELF file, it should be replaced with zeroes. However, the check is incorrect. Signed-off-by: Petr Tesarik <ptesarik@suse.com> diff --git a/makedumpfile.c b/makedumpfile.c index 867b953..138ddec 100644 --- a/makedumpfile.c +++ b/makedumpfile.c @@ -691,7 +691,7 @@ readpage_elf(unsigned long long paddr, void *bufptr) ERRMSG("Attempt to read non-existent page at 0x%llx.\n", paddr); return FALSE; - } else if (p < bufptr) + } else if (p < endp) memset(p, 0, endp - p); return TRUE; @@ -744,7 +744,7 @@ readpage_elf_parallel(int fd_memory, unsigned long long paddr, void *bufptr) ERRMSG("Attempt to read non-existent page at 0x%llx.\n", paddr); return FALSE; - } else if (p < bufptr) + } else if (p < endp) memset(p, 0, endp - p); return TRUE; _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] makedumpfile: When reading partial ELF pages, check final pointer against buffer end 2016-02-29 18:59 ` [PATCH] makedumpfile: When reading partial ELF pages, check final pointer against buffer end Petr Tesarik @ 2016-02-29 19:40 ` Petr Tesarik 2016-03-08 11:58 ` [PATCH] makedumpfile: Fix several issues with reading ELF pages Petr Tesarik 0 siblings, 1 reply; 15+ messages in thread From: Petr Tesarik @ 2016-02-29 19:40 UTC (permalink / raw) To: Atsushi Kumagai, Ivan Delalande; +Cc: kexec mailing list Hi again, On Mon, 29 Feb 2016 19:59:52 +0100 Petr Tesarik <ptesarik@suse.cz> wrote: > If the last part of a page is not present in the ELF file, it should > be replaced with zeroes. However, the check is incorrect. While this fix is correct, it seems there are a few more errors in the logic. I found this issues while adapting the code for libkdumpfile (https://github.com/ptesarik/libkdumpfile). This project includes a test suite, so let's postpone fixing the code in makedumpfile until I'm done with writing a good set of tests for this other project and making sure that all of them pass. Stay tuned, Petr Tesarik > Signed-off-by: Petr Tesarik <ptesarik@suse.com> > > diff --git a/makedumpfile.c b/makedumpfile.c > index 867b953..138ddec 100644 > --- a/makedumpfile.c > +++ b/makedumpfile.c > @@ -691,7 +691,7 @@ readpage_elf(unsigned long long paddr, void > *bufptr) ERRMSG("Attempt to read non-existent page at 0x%llx.\n", > paddr); > return FALSE; > - } else if (p < bufptr) > + } else if (p < endp) > memset(p, 0, endp - p); > > return TRUE; > @@ -744,7 +744,7 @@ readpage_elf_parallel(int fd_memory, unsigned > long long paddr, void *bufptr) ERRMSG("Attempt to read non-existent > page at 0x%llx.\n", paddr); > return FALSE; > - } else if (p < bufptr) > + } else if (p < endp) > memset(p, 0, endp - p); > > return TRUE; _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] makedumpfile: Fix several issues with reading ELF pages 2016-02-29 19:40 ` Petr Tesarik @ 2016-03-08 11:58 ` Petr Tesarik 2016-03-11 9:16 ` Atsushi Kumagai 0 siblings, 1 reply; 15+ messages in thread From: Petr Tesarik @ 2016-03-08 11:58 UTC (permalink / raw) To: Atsushi Kumagai; +Cc: kexec mailing list While adopting the algorithm for libkdumpfile, several corner cases were found by a test case: 1. If the last part of a page is not present in the ELF file, it should be replaced with zeroes. However, the check is incorrect. 2. If the beginning of a page is not present, following data is read from an incorrect file offset. 3. If the page is split among several segments, the current position in the buffer is not always taken into account. Case 1 is a simple typo/braino (writing bufptr instead of endp). To fix cases 2 and 3, it is best to update the paddr variable so that it always corresponds to the current read position in the buffer. I have tested the new code with a specially crafted ELF dump where one page had the following (artificial) layout: # PAGE RANGE STORED IN DATA -------------------------------------------------- 1 0x000-0x007 nowhere fake zero 2 0x008-0x067 LOAD #1 read from file 3 0x068-0x06f nowhere fake zero 4 0x070-0x13f LOAD #2 read from file 5 0x140-0x147 LOAD #2 zero (memsz > filesz) 6 0x148-0xff7 LOAD #3 read from file 7 0xff8-0xfff nowhere fake zero Case 1 tests the conditional right after get_pt_load_extents(). Case 2 tests file read after missing data. Case 3 tests the conditional from case 1 with non-zero buffer position. Case 5 tests the last conditional in the loop (p < endp). Case 6 tests exact match in get_pt_load_extents(). Case 7 tests the final conditional after the loop. Signed-off-by: Petr Tesarik <ptesarik@suse.com> --- makedumpfile.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) --- a/makedumpfile.c +++ b/makedumpfile.c @@ -656,14 +656,15 @@ readpage_elf(unsigned long long paddr, v p = bufptr; endp = p + info->page_size; while (p < endp) { - idx = closest_pt_load(paddr + (p - bufptr), endp - p); + idx = closest_pt_load(paddr, endp - p); if (idx < 0) break; get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); - if (phys_start > paddr + (p - bufptr)) { + if (phys_start > paddr) { memset(p, 0, phys_start - paddr); p += phys_start - paddr; + paddr = phys_start; } offset += paddr - phys_start; @@ -677,6 +678,7 @@ readpage_elf(unsigned long long paddr, v return FALSE; } p += size; + paddr += size; } if (p < endp) { size = phys_end - paddr; @@ -684,6 +686,7 @@ readpage_elf(unsigned long long paddr, v size = endp - p; memset(p, 0, size); p += size; + paddr += size; } } @@ -691,7 +694,7 @@ readpage_elf(unsigned long long paddr, v ERRMSG("Attempt to read non-existent page at 0x%llx.\n", paddr); return FALSE; - } else if (p < bufptr) + } else if (p < endp) memset(p, 0, endp - p); return TRUE; @@ -708,14 +711,15 @@ readpage_elf_parallel(int fd_memory, uns p = bufptr; endp = p + info->page_size; while (p < endp) { - idx = closest_pt_load(paddr + (p - bufptr), endp - p); + idx = closest_pt_load(paddr, endp - p); if (idx < 0) break; get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); - if (phys_start > paddr + (p - bufptr)) { + if (phys_start > paddr) { memset(p, 0, phys_start - paddr); p += phys_start - paddr; + paddr = phys_start; } offset += paddr - phys_start; @@ -730,6 +734,7 @@ readpage_elf_parallel(int fd_memory, uns return FALSE; } p += size; + paddr += size; } if (p < endp) { size = phys_end - paddr; @@ -737,6 +742,7 @@ readpage_elf_parallel(int fd_memory, uns size = endp - p; memset(p, 0, size); p += size; + paddr += size; } } @@ -744,7 +750,7 @@ readpage_elf_parallel(int fd_memory, uns ERRMSG("Attempt to read non-existent page at 0x%llx.\n", paddr); return FALSE; - } else if (p < bufptr) + } else if (p < endp) memset(p, 0, endp - p); return TRUE; _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
* RE: [PATCH] makedumpfile: Fix several issues with reading ELF pages 2016-03-08 11:58 ` [PATCH] makedumpfile: Fix several issues with reading ELF pages Petr Tesarik @ 2016-03-11 9:16 ` Atsushi Kumagai 0 siblings, 0 replies; 15+ messages in thread From: Atsushi Kumagai @ 2016-03-11 9:16 UTC (permalink / raw) To: Petr Tesarik; +Cc: kexec mailing list Hello Petr, Thanks for your investigation and fixing the issue. I'll merge this into v1.6.0. Regards, Atsushi Kumagai >While adopting the algorithm for libkdumpfile, several corner cases >were found by a test case: > > 1. If the last part of a page is not present in the ELF file, it > should be replaced with zeroes. However, the check is incorrect. > > 2. If the beginning of a page is not present, following data is > read from an incorrect file offset. > > 3. If the page is split among several segments, the current > position in the buffer is not always taken into account. > >Case 1 is a simple typo/braino (writing bufptr instead of endp). > >To fix cases 2 and 3, it is best to update the paddr variable so that >it always corresponds to the current read position in the buffer. > >I have tested the new code with a specially crafted ELF dump where one >page had the following (artificial) layout: > > # PAGE RANGE STORED IN DATA > -------------------------------------------------- > 1 0x000-0x007 nowhere fake zero > 2 0x008-0x067 LOAD #1 read from file > 3 0x068-0x06f nowhere fake zero > 4 0x070-0x13f LOAD #2 read from file > 5 0x140-0x147 LOAD #2 zero (memsz > filesz) > 6 0x148-0xff7 LOAD #3 read from file > 7 0xff8-0xfff nowhere fake zero > >Case 1 tests the conditional right after get_pt_load_extents(). >Case 2 tests file read after missing data. >Case 3 tests the conditional from case 1 with non-zero buffer position. >Case 5 tests the last conditional in the loop (p < endp). >Case 6 tests exact match in get_pt_load_extents(). >Case 7 tests the final conditional after the loop. > >Signed-off-by: Petr Tesarik <ptesarik@suse.com> > >--- > makedumpfile.c | 18 ++++++++++++------ > 1 file changed, 12 insertions(+), 6 deletions(-) > >--- a/makedumpfile.c >+++ b/makedumpfile.c >@@ -656,14 +656,15 @@ readpage_elf(unsigned long long paddr, v > p = bufptr; > endp = p + info->page_size; > while (p < endp) { >- idx = closest_pt_load(paddr + (p - bufptr), endp - p); >+ idx = closest_pt_load(paddr, endp - p); > if (idx < 0) > break; > > get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); >- if (phys_start > paddr + (p - bufptr)) { >+ if (phys_start > paddr) { > memset(p, 0, phys_start - paddr); > p += phys_start - paddr; >+ paddr = phys_start; > } > > offset += paddr - phys_start; >@@ -677,6 +678,7 @@ readpage_elf(unsigned long long paddr, v > return FALSE; > } > p += size; >+ paddr += size; > } > if (p < endp) { > size = phys_end - paddr; >@@ -684,6 +686,7 @@ readpage_elf(unsigned long long paddr, v > size = endp - p; > memset(p, 0, size); > p += size; >+ paddr += size; > } > } > >@@ -691,7 +694,7 @@ readpage_elf(unsigned long long paddr, v > ERRMSG("Attempt to read non-existent page at 0x%llx.\n", > paddr); > return FALSE; >- } else if (p < bufptr) >+ } else if (p < endp) > memset(p, 0, endp - p); > > return TRUE; >@@ -708,14 +711,15 @@ readpage_elf_parallel(int fd_memory, uns > p = bufptr; > endp = p + info->page_size; > while (p < endp) { >- idx = closest_pt_load(paddr + (p - bufptr), endp - p); >+ idx = closest_pt_load(paddr, endp - p); > if (idx < 0) > break; > > get_pt_load_extents(idx, &phys_start, &phys_end, &offset, &size); >- if (phys_start > paddr + (p - bufptr)) { >+ if (phys_start > paddr) { > memset(p, 0, phys_start - paddr); > p += phys_start - paddr; >+ paddr = phys_start; > } > > offset += paddr - phys_start; >@@ -730,6 +734,7 @@ readpage_elf_parallel(int fd_memory, uns > return FALSE; > } > p += size; >+ paddr += size; > } > if (p < endp) { > size = phys_end - paddr; >@@ -737,6 +742,7 @@ readpage_elf_parallel(int fd_memory, uns > size = endp - p; > memset(p, 0, size); > p += size; >+ paddr += size; > } > } > >@@ -744,7 +750,7 @@ readpage_elf_parallel(int fd_memory, uns > ERRMSG("Attempt to read non-existent page at 0x%llx.\n", > paddr); > return FALSE; >- } else if (p < bufptr) >+ } else if (p < endp) > memset(p, 0, endp - p); > > return TRUE; _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-07-13 8:07 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-02-10 7:47 [PATCH 0/3] makedumpfile: Fix handling of ELF segments with p_memsz > p_filesz Petr Tesarik 2016-02-10 7:49 ` [PATCH 1/3] makedumpfile: Keep segment memory size when re-filtering ELF dumps Petr Tesarik 2016-02-10 7:49 ` [PATCH 2/3] makedumpfile: Mark unstored ELF pages as filtered Petr Tesarik 2016-05-18 7:44 ` Atsushi Kumagai 2016-06-07 4:18 ` Atsushi Kumagai 2016-07-12 21:38 ` Petr Tesarik 2016-07-13 7:43 ` Atsushi Kumagai 2016-02-10 7:50 ` [PATCH 3/3] makedumpfile: Rewrite readpage_elf Petr Tesarik 2016-02-10 22:18 ` Ivan Delalande 2016-02-17 7:58 ` Atsushi Kumagai 2016-02-29 18:55 ` Petr Tesarik 2016-02-29 18:59 ` [PATCH] makedumpfile: When reading partial ELF pages, check final pointer against buffer end Petr Tesarik 2016-02-29 19:40 ` Petr Tesarik 2016-03-08 11:58 ` [PATCH] makedumpfile: Fix several issues with reading ELF pages Petr Tesarik 2016-03-11 9:16 ` Atsushi Kumagai
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox