* [PATCH] kdump: force page alignment for per-CPU crash notes. @ 2012-02-29 17:21 Eugene Surovegin 2012-03-01 1:18 ` Simon Horman 2012-03-01 15:09 ` Vivek Goyal 0 siblings, 2 replies; 12+ messages in thread From: Eugene Surovegin @ 2012-02-29 17:21 UTC (permalink / raw) To: linux-kernel; +Cc: Eugene Surovegin, kexec-list, Eric Biederman, Vivek Goyal Per-CPU allocations are not guaranteed to be physically contiguous. However, kdump kernel and user-space code assumes that per-CPU memory, used for saving CPU registers on crash, is. This can cause corrupted /proc/vmcore in some cases - the main symptom being huge ELF note section. Force page alignment for note_buf_t to ensure that this assumption holds. Signed-off-by: Eugene Surovegin <surovegin@google.com> CC: Eric Biederman <ebiederm@xmission.com> CC: Vivek Goyal <vgoyal@redhat.com> CC: kexec-list <kexec@lists.infradead.org> --- kernel/kexec.c | 9 +++++++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/kernel/kexec.c b/kernel/kexec.c index 7b08867..e641b5c 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -1232,8 +1232,13 @@ void crash_save_cpu(struct pt_regs *regs, int cpu) static int __init crash_notes_memory_init(void) { - /* Allocate memory for saving cpu registers. */ - crash_notes = alloc_percpu(note_buf_t); + /* Allocate memory for saving cpu registers. + * Force page alignment to avoid crossing physical page boundary - + * kexec-tools and kernel /proc/vmcore handler assume these per-CPU + * chunks are physically contiguous. + */ + crash_notes = (note_buf_t __percpu *)__alloc_percpu(sizeof(note_buf_t), + PAGE_SIZE); if (!crash_notes) { printk("Kexec: Memory allocation for saving cpu register" " states failed\n"); -- 1.7.9.1 _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-02-29 17:21 [PATCH] kdump: force page alignment for per-CPU crash notes Eugene Surovegin @ 2012-03-01 1:18 ` Simon Horman 2012-03-01 1:23 ` Eugene Surovegin 2012-03-01 15:09 ` Vivek Goyal 1 sibling, 1 reply; 12+ messages in thread From: Simon Horman @ 2012-03-01 1:18 UTC (permalink / raw) To: Eugene Surovegin; +Cc: kexec-list, linux-kernel, Vivek Goyal, Eric Biederman On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > Per-CPU allocations are not guaranteed to be physically contiguous. > However, kdump kernel and user-space code assumes that per-CPU > memory, used for saving CPU registers on crash, is. > This can cause corrupted /proc/vmcore in some cases - the main > symptom being huge ELF note section. > > Force page alignment for note_buf_t to ensure that this assumption holds. Ouch. I'm surprised there is an allocation on crash, perhaps it could at least be done earlier? And am I right in thinking that this change increases the likely hood that the allocation could fail? > > Signed-off-by: Eugene Surovegin <surovegin@google.com> > CC: Eric Biederman <ebiederm@xmission.com> > CC: Vivek Goyal <vgoyal@redhat.com> > CC: kexec-list <kexec@lists.infradead.org> > --- > kernel/kexec.c | 9 +++++++-- > 1 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/kernel/kexec.c b/kernel/kexec.c > index 7b08867..e641b5c 100644 > --- a/kernel/kexec.c > +++ b/kernel/kexec.c > @@ -1232,8 +1232,13 @@ void crash_save_cpu(struct pt_regs *regs, int cpu) > > static int __init crash_notes_memory_init(void) > { > - /* Allocate memory for saving cpu registers. */ > - crash_notes = alloc_percpu(note_buf_t); > + /* Allocate memory for saving cpu registers. > + * Force page alignment to avoid crossing physical page boundary - > + * kexec-tools and kernel /proc/vmcore handler assume these per-CPU > + * chunks are physically contiguous. > + */ > + crash_notes = (note_buf_t __percpu *)__alloc_percpu(sizeof(note_buf_t), > + PAGE_SIZE); > if (!crash_notes) { > printk("Kexec: Memory allocation for saving cpu register" > " states failed\n"); > -- > 1.7.9.1 > > > _______________________________________________ > kexec mailing list > kexec@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/kexec > _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:18 ` Simon Horman @ 2012-03-01 1:23 ` Eugene Surovegin 2012-03-01 1:32 ` Simon Horman 0 siblings, 1 reply; 12+ messages in thread From: Eugene Surovegin @ 2012-03-01 1:23 UTC (permalink / raw) To: Simon Horman; +Cc: kexec-list, linux-kernel, Vivek Goyal, Eric Biederman [-- Attachment #1.1: Type: text/plain, Size: 930 bytes --] On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > > Per-CPU allocations are not guaranteed to be physically contiguous. > > However, kdump kernel and user-space code assumes that per-CPU > > memory, used for saving CPU registers on crash, is. > > This can cause corrupted /proc/vmcore in some cases - the main > > symptom being huge ELF note section. > > > > Force page alignment for note_buf_t to ensure that this assumption holds. > > Ouch. I'm surprised there is an allocation on crash, perhaps > it could at least be done earlier? And am I right in thinking > that this change increases the likely hood that the allocation > could fail? > I'm not following. This allocation is done on start-up, not on crash. If you cannot allocate this much memory on system boot, I'm not sure what else you can do on this system.... -- Eugene [-- Attachment #1.2: Type: text/html, Size: 1366 bytes --] [-- Attachment #2: Type: text/plain, Size: 143 bytes --] _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:23 ` Eugene Surovegin @ 2012-03-01 1:32 ` Simon Horman 2012-03-01 1:39 ` Eugene Surovegin 0 siblings, 1 reply; 12+ messages in thread From: Simon Horman @ 2012-03-01 1:32 UTC (permalink / raw) To: Eugene Surovegin; +Cc: kexec-list, linux-kernel, Vivek Goyal, Eric Biederman On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote: > On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: > > > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > > > Per-CPU allocations are not guaranteed to be physically contiguous. > > > However, kdump kernel and user-space code assumes that per-CPU > > > memory, used for saving CPU registers on crash, is. > > > This can cause corrupted /proc/vmcore in some cases - the main > > > symptom being huge ELF note section. > > > > > > Force page alignment for note_buf_t to ensure that this assumption holds. > > > > Ouch. I'm surprised there is an allocation on crash, perhaps > > it could at least be done earlier? And am I right in thinking > > that this change increases the likely hood that the allocation > > could fail? > > > > I'm not following. This allocation is done on start-up, not on crash. > If you cannot allocate this much memory on system boot, I'm not sure what > else you can do on this system.... Sorry, my eyes deceived me. You are correct and I agree. Is it the case that note_buf_t is never larger than PAGE_SIZE? If so I your patch looks good to me. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:32 ` Simon Horman @ 2012-03-01 1:39 ` Eugene Surovegin 2012-03-01 1:51 ` HATAYAMA Daisuke 2012-03-01 2:53 ` Simon Horman 0 siblings, 2 replies; 12+ messages in thread From: Eugene Surovegin @ 2012-03-01 1:39 UTC (permalink / raw) To: Simon Horman; +Cc: kexec-list, linux-kernel, Vivek Goyal, Eric Biederman On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote: > On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote: >> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: >> >> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: >> > > Per-CPU allocations are not guaranteed to be physically contiguous. >> > > However, kdump kernel and user-space code assumes that per-CPU >> > > memory, used for saving CPU registers on crash, is. >> > > This can cause corrupted /proc/vmcore in some cases - the main >> > > symptom being huge ELF note section. >> > > >> > > Force page alignment for note_buf_t to ensure that this assumption holds. >> > >> > Ouch. I'm surprised there is an allocation on crash, perhaps >> > it could at least be done earlier? And am I right in thinking >> > that this change increases the likely hood that the allocation >> > could fail? >> > >> >> I'm not following. This allocation is done on start-up, not on crash. >> If you cannot allocate this much memory on system boot, I'm not sure what >> else you can do on this system.... > > Sorry, my eyes deceived me. You are correct and I agree. > > Is it the case that note_buf_t is never larger than PAGE_SIZE? > If so I your patch looks good to me. Currently, maximum note size is hardcoded in kexec-tools to 1024 (MAX_NOTE_BYTES). Usually it's way less. IIRC on x86_64 it's 336 bytes. -- Eugene _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:39 ` Eugene Surovegin @ 2012-03-01 1:51 ` HATAYAMA Daisuke 2012-03-01 1:56 ` Eugene Surovegin 2012-03-01 2:50 ` Simon Horman 2012-03-01 2:53 ` Simon Horman 1 sibling, 2 replies; 12+ messages in thread From: HATAYAMA Daisuke @ 2012-03-01 1:51 UTC (permalink / raw) To: surovegin; +Cc: horms, kexec, linux-kernel, vgoyal, ebiederm From: Eugene Surovegin <surovegin@google.com> Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes. Date: Wed, 29 Feb 2012 17:39:55 -0800 > On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote: >> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote: >>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: >>> >>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: >>> > > Per-CPU allocations are not guaranteed to be physically contiguous. >>> > > However, kdump kernel and user-space code assumes that per-CPU >>> > > memory, used for saving CPU registers on crash, is. >>> > > This can cause corrupted /proc/vmcore in some cases - the main >>> > > symptom being huge ELF note section. >>> > > >>> > > Force page alignment for note_buf_t to ensure that this assumption holds. >>> > >>> > Ouch. I'm surprised there is an allocation on crash, perhaps >>> > it could at least be done earlier? And am I right in thinking >>> > that this change increases the likely hood that the allocation >>> > could fail? >>> > >>> >>> I'm not following. This allocation is done on start-up, not on crash. >>> If you cannot allocate this much memory on system boot, I'm not sure what >>> else you can do on this system.... >> >> Sorry, my eyes deceived me. You are correct and I agree. >> >> Is it the case that note_buf_t is never larger than PAGE_SIZE? >> If so I your patch looks good to me. > > Currently, maximum note size is hardcoded in kexec-tools to 1024 > (MAX_NOTE_BYTES). > Usually it's way less. IIRC on x86_64 it's 336 bytes. > This is elf_prstatus and I guess it's mostly equal to registers. crash> p sizeof(struct elf_prstatus) $3 = 336 crash> ptype struct elf_prstatus type = struct elf_prstatus { struct elf_siginfo pr_info; short int pr_cursig; long unsigned int pr_sigpend; long unsigned int pr_sighold; pid_t pr_pid; pid_t pr_ppid; pid_t pr_pgrp; pid_t pr_sid; struct timeval pr_utime; struct timeval pr_stime; struct timeval pr_cutime; struct timeval pr_cstime; elf_gregset_t pr_reg; <-- this int pr_fpvalid; } What kinds of architecture does have so many registers? It's just my interest. Or possibly other kinds of notes is written here? Thanks. HATAYAMA, Daisuke _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:51 ` HATAYAMA Daisuke @ 2012-03-01 1:56 ` Eugene Surovegin 2012-03-01 2:50 ` Simon Horman 1 sibling, 0 replies; 12+ messages in thread From: Eugene Surovegin @ 2012-03-01 1:56 UTC (permalink / raw) To: HATAYAMA Daisuke; +Cc: horms, kexec, linux-kernel, vgoyal, ebiederm On Wed, Feb 29, 2012 at 5:51 PM, HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote: > From: Eugene Surovegin <surovegin@google.com> > Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes. > Date: Wed, 29 Feb 2012 17:39:55 -0800 > >> On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote: >>> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote: >>>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: >>>> >>>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: >>>> > > Per-CPU allocations are not guaranteed to be physically contiguous. >>>> > > However, kdump kernel and user-space code assumes that per-CPU >>>> > > memory, used for saving CPU registers on crash, is. >>>> > > This can cause corrupted /proc/vmcore in some cases - the main >>>> > > symptom being huge ELF note section. >>>> > > >>>> > > Force page alignment for note_buf_t to ensure that this assumption holds. >>>> > >>>> > Ouch. I'm surprised there is an allocation on crash, perhaps >>>> > it could at least be done earlier? And am I right in thinking >>>> > that this change increases the likely hood that the allocation >>>> > could fail? >>>> > >>>> >>>> I'm not following. This allocation is done on start-up, not on crash. >>>> If you cannot allocate this much memory on system boot, I'm not sure what >>>> else you can do on this system.... >>> >>> Sorry, my eyes deceived me. You are correct and I agree. >>> >>> Is it the case that note_buf_t is never larger than PAGE_SIZE? >>> If so I your patch looks good to me. >> >> Currently, maximum note size is hardcoded in kexec-tools to 1024 >> (MAX_NOTE_BYTES). >> Usually it's way less. IIRC on x86_64 it's 336 bytes. >> > > This is elf_prstatus and I guess it's mostly equal to registers. > > crash> p sizeof(struct elf_prstatus) > $3 = 336 > crash> ptype struct elf_prstatus > type = struct elf_prstatus { > struct elf_siginfo pr_info; > short int pr_cursig; > long unsigned int pr_sigpend; > long unsigned int pr_sighold; > pid_t pr_pid; > pid_t pr_ppid; > pid_t pr_pgrp; > pid_t pr_sid; > struct timeval pr_utime; > struct timeval pr_stime; > struct timeval pr_cutime; > struct timeval pr_cstime; > elf_gregset_t pr_reg; <-- this > int pr_fpvalid; > } > > What kinds of architecture does have so many registers? It's just my > interest. Or possibly other kinds of notes is written here? I'm not sure about other archs, but we don't write there anything except for 'elf_prstatus' and sentinel "final" note. -- Eugene _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:51 ` HATAYAMA Daisuke 2012-03-01 1:56 ` Eugene Surovegin @ 2012-03-01 2:50 ` Simon Horman 1 sibling, 0 replies; 12+ messages in thread From: Simon Horman @ 2012-03-01 2:50 UTC (permalink / raw) To: HATAYAMA Daisuke; +Cc: surovegin, kexec, linux-kernel, vgoyal, ebiederm On Thu, Mar 01, 2012 at 10:51:26AM +0900, HATAYAMA Daisuke wrote: > From: Eugene Surovegin <surovegin@google.com> > Subject: Re: [PATCH] kdump: force page alignment for per-CPU crash notes. > Date: Wed, 29 Feb 2012 17:39:55 -0800 > > > On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote: > >> On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote: > >>> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: > >>> > >>> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > >>> > > Per-CPU allocations are not guaranteed to be physically contiguous. > >>> > > However, kdump kernel and user-space code assumes that per-CPU > >>> > > memory, used for saving CPU registers on crash, is. > >>> > > This can cause corrupted /proc/vmcore in some cases - the main > >>> > > symptom being huge ELF note section. > >>> > > > >>> > > Force page alignment for note_buf_t to ensure that this assumption holds. > >>> > > >>> > Ouch. I'm surprised there is an allocation on crash, perhaps > >>> > it could at least be done earlier? And am I right in thinking > >>> > that this change increases the likely hood that the allocation > >>> > could fail? > >>> > > >>> > >>> I'm not following. This allocation is done on start-up, not on crash. > >>> If you cannot allocate this much memory on system boot, I'm not sure what > >>> else you can do on this system.... > >> > >> Sorry, my eyes deceived me. You are correct and I agree. > >> > >> Is it the case that note_buf_t is never larger than PAGE_SIZE? > >> If so I your patch looks good to me. > > > > Currently, maximum note size is hardcoded in kexec-tools to 1024 > > (MAX_NOTE_BYTES). > > Usually it's way less. IIRC on x86_64 it's 336 bytes. I presume that MAX_NOTE_BYTES was chosen to be large to minimise the chance that it would ever need to be changed. > This is elf_prstatus and I guess it's mostly equal to registers. > > crash> p sizeof(struct elf_prstatus) > $3 = 336 > crash> ptype struct elf_prstatus > type = struct elf_prstatus { > struct elf_siginfo pr_info; > short int pr_cursig; > long unsigned int pr_sigpend; > long unsigned int pr_sighold; > pid_t pr_pid; > pid_t pr_ppid; > pid_t pr_pgrp; > pid_t pr_sid; > struct timeval pr_utime; > struct timeval pr_stime; > struct timeval pr_cutime; > struct timeval pr_cstime; > elf_gregset_t pr_reg; <-- this > int pr_fpvalid; > } > > What kinds of architecture does have so many registers? It's just my > interest. Or possibly other kinds of notes is written here? The winner seems to be ia64 with 128. _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 1:39 ` Eugene Surovegin 2012-03-01 1:51 ` HATAYAMA Daisuke @ 2012-03-01 2:53 ` Simon Horman 1 sibling, 0 replies; 12+ messages in thread From: Simon Horman @ 2012-03-01 2:53 UTC (permalink / raw) To: Eugene Surovegin; +Cc: kexec-list, linux-kernel, Vivek Goyal, Eric Biederman On Wed, Feb 29, 2012 at 05:39:55PM -0800, Eugene Surovegin wrote: > On Wed, Feb 29, 2012 at 5:32 PM, Simon Horman <horms@verge.net.au> wrote: > > On Wed, Feb 29, 2012 at 05:23:10PM -0800, Eugene Surovegin wrote: > >> On Wed, Feb 29, 2012 at 5:18 PM, Simon Horman <horms@verge.net.au> wrote: > >> > >> > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > >> > > Per-CPU allocations are not guaranteed to be physically contiguous. > >> > > However, kdump kernel and user-space code assumes that per-CPU > >> > > memory, used for saving CPU registers on crash, is. > >> > > This can cause corrupted /proc/vmcore in some cases - the main > >> > > symptom being huge ELF note section. > >> > > > >> > > Force page alignment for note_buf_t to ensure that this assumption holds. > >> > > >> > Ouch. I'm surprised there is an allocation on crash, perhaps > >> > it could at least be done earlier? And am I right in thinking > >> > that this change increases the likely hood that the allocation > >> > could fail? > >> > > >> > >> I'm not following. This allocation is done on start-up, not on crash. > >> If you cannot allocate this much memory on system boot, I'm not sure what > >> else you can do on this system.... > > > > Sorry, my eyes deceived me. You are correct and I agree. > > > > Is it the case that note_buf_t is never larger than PAGE_SIZE? > > If so I your patch looks good to me. > > Currently, maximum note size is hardcoded in kexec-tools to 1024 > (MAX_NOTE_BYTES). > Usually it's way less. IIRC on x86_64 it's 336 bytes. Ok, understood. Reviewed-by: Simon Horman <horms@verge.net.au> _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-02-29 17:21 [PATCH] kdump: force page alignment for per-CPU crash notes Eugene Surovegin 2012-03-01 1:18 ` Simon Horman @ 2012-03-01 15:09 ` Vivek Goyal 2012-03-01 17:04 ` Eugene Surovegin 1 sibling, 1 reply; 12+ messages in thread From: Vivek Goyal @ 2012-03-01 15:09 UTC (permalink / raw) To: Eugene Surovegin; +Cc: kexec-list, linux-kernel, Eric Biederman On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > Per-CPU allocations are not guaranteed to be physically contiguous. > However, kdump kernel and user-space code assumes that per-CPU > memory, used for saving CPU registers on crash, is. > This can cause corrupted /proc/vmcore in some cases - the main > symptom being huge ELF note section. > > Force page alignment for note_buf_t to ensure that this assumption holds. > Hi Eugene, Where do we make assumption that crash_notes address is page aligned? In first kernel we save notes using virtual addresses as returned by per_cpu_ptr(). So first kernel should be fine (crash_save_cpu()). In second kernel, fs/proc/vmcore.c code does not seem to be assuming that notes are stored as page aligned address. merge_note_headers_elf64() read_from_oldmem() offset = (unsigned long)(*ppos % PAGE_SIZE); pfn = (unsigned long)(*ppos / PAGE_SIZE); We see to calculate pfn and offset inside the page. Map the pfn and then access offset. So as long as whole of the note section is stored on a single physical page it is not a problem. Are you referring to the fact that note could be stored on two different physical pages and then kexec-tools does not know about the physical address of second page, hence second kernel does not know about it and we never read that data. That sounds like a problem. So it make sense to force the page size alignment and if notes ever grow beyond 1 page, we need to come up with more complex ways of communicating more than 1 physical address to second kernel. Thanks Vivek > Signed-off-by: Eugene Surovegin <surovegin@google.com> > CC: Eric Biederman <ebiederm@xmission.com> > CC: Vivek Goyal <vgoyal@redhat.com> > CC: kexec-list <kexec@lists.infradead.org> > --- > kernel/kexec.c | 9 +++++++-- > 1 files changed, 7 insertions(+), 2 deletions(-) > > diff --git a/kernel/kexec.c b/kernel/kexec.c > index 7b08867..e641b5c 100644 > --- a/kernel/kexec.c > +++ b/kernel/kexec.c > @@ -1232,8 +1232,13 @@ void crash_save_cpu(struct pt_regs *regs, int cpu) > > static int __init crash_notes_memory_init(void) > { > - /* Allocate memory for saving cpu registers. */ > - crash_notes = alloc_percpu(note_buf_t); > + /* Allocate memory for saving cpu registers. > + * Force page alignment to avoid crossing physical page boundary - > + * kexec-tools and kernel /proc/vmcore handler assume these per-CPU > + * chunks are physically contiguous. > + */ > + crash_notes = (note_buf_t __percpu *)__alloc_percpu(sizeof(note_buf_t), > + PAGE_SIZE); > if (!crash_notes) { > printk("Kexec: Memory allocation for saving cpu register" > " states failed\n"); > -- > 1.7.9.1 _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 15:09 ` Vivek Goyal @ 2012-03-01 17:04 ` Eugene Surovegin 2012-03-01 18:31 ` Vivek Goyal 0 siblings, 1 reply; 12+ messages in thread From: Eugene Surovegin @ 2012-03-01 17:04 UTC (permalink / raw) To: Vivek Goyal; +Cc: kexec-list, linux-kernel, Eric Biederman On Thu, Mar 1, 2012 at 7:09 AM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: >> Per-CPU allocations are not guaranteed to be physically contiguous. >> However, kdump kernel and user-space code assumes that per-CPU >> memory, used for saving CPU registers on crash, is. >> This can cause corrupted /proc/vmcore in some cases - the main >> symptom being huge ELF note section. >> >> Force page alignment for note_buf_t to ensure that this assumption holds. >> > > Hi Eugene, > > Where do we make assumption that crash_notes address is page aligned? I never said that. I said that assumption was that note chunks were physically contiguous. [snip] > Are you referring to the fact that note could be stored on two different > physical pages and then kexec-tools does not know about the physical > address of second page, hence second kernel does not know about it > and we never read that data. That sounds like a problem. Yes. This is exactly what happens in some cases. > So it make sense to force the page size alignment and if notes ever > grow beyond 1 page, we need to come up with more complex ways of > communicating more than 1 physical address to second kernel. Yes. Thankfully notes so far are smaller than a page. And if they do grow larger, than kexec-tools have to be modified anyways - currently maximum note size (1K) is hardcoded there. If this happens, I think instead of coming up with a way to communicate phys to virt mapping, it'd be easier just to change th way notes are allocated in the kernel so they are phys contiguous. -- Eugene _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] kdump: force page alignment for per-CPU crash notes. 2012-03-01 17:04 ` Eugene Surovegin @ 2012-03-01 18:31 ` Vivek Goyal 0 siblings, 0 replies; 12+ messages in thread From: Vivek Goyal @ 2012-03-01 18:31 UTC (permalink / raw) To: Eugene Surovegin; +Cc: kexec-list, linux-kernel, Eric Biederman On Thu, Mar 01, 2012 at 09:04:32AM -0800, Eugene Surovegin wrote: > On Thu, Mar 1, 2012 at 7:09 AM, Vivek Goyal <vgoyal@redhat.com> wrote: > > On Wed, Feb 29, 2012 at 09:21:23AM -0800, Eugene Surovegin wrote: > >> Per-CPU allocations are not guaranteed to be physically contiguous. > >> However, kdump kernel and user-space code assumes that per-CPU > >> memory, used for saving CPU registers on crash, is. > >> This can cause corrupted /proc/vmcore in some cases - the main > >> symptom being huge ELF note section. > >> > >> Force page alignment for note_buf_t to ensure that this assumption holds. > >> > > > > Hi Eugene, > > > > Where do we make assumption that crash_notes address is page aligned? > > I never said that. I said that assumption was that note chunks were > physically contiguous. > > [snip] > > > Are you referring to the fact that note could be stored on two different > > physical pages and then kexec-tools does not know about the physical > > address of second page, hence second kernel does not know about it > > and we never read that data. That sounds like a problem. > > Yes. This is exactly what happens in some cases. Ok. Thanks for clarifying. Patch looks good to me. Acked-by: Vivek Goyal <vgoyal@redhat.com> Vivek _______________________________________________ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2012-03-01 18:56 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-02-29 17:21 [PATCH] kdump: force page alignment for per-CPU crash notes Eugene Surovegin 2012-03-01 1:18 ` Simon Horman 2012-03-01 1:23 ` Eugene Surovegin 2012-03-01 1:32 ` Simon Horman 2012-03-01 1:39 ` Eugene Surovegin 2012-03-01 1:51 ` HATAYAMA Daisuke 2012-03-01 1:56 ` Eugene Surovegin 2012-03-01 2:50 ` Simon Horman 2012-03-01 2:53 ` Simon Horman 2012-03-01 15:09 ` Vivek Goyal 2012-03-01 17:04 ` Eugene Surovegin 2012-03-01 18:31 ` Vivek Goyal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox