* [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation
@ 2016-08-01 17:07 Thomas Garnier
2016-08-01 17:07 ` [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping Thomas Garnier
` (2 more replies)
0 siblings, 3 replies; 30+ messages in thread
From: Thomas Garnier @ 2016-08-01 17:07 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook,
Thomas Garnier, Yinghai Lu, Rafael J . Wysocki, Pavel Machek
Cc: x86, linux-kernel, linux-pm, kernel-hardening
***Background:
KASLR memory randomization for x86_64 was added at a time when KASLR and hibernation could not be used together. Now that hibernation is supported alongside KASLR, some changes are needed.
***Problems that needed solving:
Hibernation was failing on reboot with a GP fault when CONFIG_RANDOMIZE_MEMORY was enabled. Two issues were identified.
The original fault was due to a wrong physical address being loaded into cr3. The problem was introduced when __PAGE_OFFSET became a global variable under randomization. The fix uses a define that selects either the global variable or the immediate value, depending on the config settings.
The second issue was that the temporary page table mapping did not support virtual addresses that are not aligned on the PGD level. KASLR memory randomization generates a random address aligned only on the PUD level. The fix correctly calculates the offset at all levels of the temporary page table.
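To illustrate the second issue, here is a minimal user-space sketch (plain C, not kernel code; the direct-mapping base and the physical address below are made-up example values). It shows that once the base is only PUD (1GB) aligned, the PUD indices of a physical address and of its kernel virtual address no longer match, so a single PGD-level offset cannot reproduce the kernel mapping:

#include <stdio.h>

#define PGDIR_SHIFT     39
#define PUD_SHIFT       30
#define PMD_SHIFT       21

/* index of an address within one 512-entry page-table level */
static unsigned long idx(unsigned long addr, int shift)
{
        return (addr >> shift) & 511;
}

int main(void)
{
        /* hypothetical randomized direct-mapping base, aligned only on the PUD (1GB) level */
        unsigned long page_offset = 0xffff888040000000UL;
        unsigned long phys = 0x100000UL;                /* example physical address */
        unsigned long kva  = phys + page_offset;        /* its kernel virtual address */

        /*
         * The PGD indices differ, but so do the PUD indices, so adding a
         * single pgd_index() offset cannot reproduce the kernel mapping
         * when the base is not PGD (512GB) aligned.
         */
        printf("pgd: phys %lu  kva %lu\n", idx(phys, PGDIR_SHIFT), idx(kva, PGDIR_SHIFT));
        printf("pud: phys %lu  kva %lu\n", idx(phys, PUD_SHIFT),   idx(kva, PUD_SHIFT));
        printf("pmd: phys %lu  kva %lu\n", idx(phys, PMD_SHIFT),   idx(kva, PMD_SHIFT));
        return 0;
}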
***Parts:
- 01/02: Support unaligned addresses (second issue)
- 02/02: Fix __PAGE_OFFSET usage on assembly (first issue)
^ permalink raw reply [flat|nested] 30+ messages in thread* [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-01 17:07 [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation Thomas Garnier @ 2016-08-01 17:07 ` Thomas Garnier 2016-08-02 0:36 ` Rafael J. Wysocki 2016-08-02 17:36 ` Yinghai Lu 2016-08-01 17:08 ` [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore Thomas Garnier 2016-08-01 23:48 ` [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation Rafael J. Wysocki 2 siblings, 2 replies; 30+ messages in thread From: Thomas Garnier @ 2016-08-01 17:07 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Thomas Garnier, Yinghai Lu, Rafael J . Wysocki, Pavel Machek Cc: x86, linux-kernel, linux-pm, kernel-hardening Correctly setup the temporary mapping for hibernation. Previous implementation assumed the address was aligned on the PGD level. With KASLR memory randomization enabled, the address is randomized on the PUD level. This change supports unaligned address up to PMD. Signed-off-by: Thomas Garnier <thgarnie@google.com> --- arch/x86/mm/ident_map.c | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c index ec21796..ea1ebf1 100644 --- a/arch/x86/mm/ident_map.c +++ b/arch/x86/mm/ident_map.c @@ -3,15 +3,16 @@ * included by both the compressed kernel and the regular kernel. */ -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, unsigned long addr, unsigned long end) { - addr &= PMD_MASK; - for (; addr < end; addr += PMD_SIZE) { - pmd_t *pmd = pmd_page + pmd_index(addr); + int off = info->kernel_mapping ? pmd_index(__PAGE_OFFSET) : 0; + + for (addr &= PMD_MASK; addr < end; addr += PMD_SIZE) { + pmd_t *pmd = pmd_page + pmd_index(addr) + off; if (!pmd_present(*pmd)) - set_pmd(pmd, __pmd(addr | pmd_flag)); + set_pmd(pmd, __pmd(addr | info->pmd_flag)); } } @@ -19,9 +20,10 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, unsigned long addr, unsigned long end) { unsigned long next; + int off = info->kernel_mapping ? pud_index(__PAGE_OFFSET) : 0; for (; addr < end; addr = next) { - pud_t *pud = pud_page + pud_index(addr); + pud_t *pud = pud_page + pud_index(addr) + off; pmd_t *pmd; next = (addr & PUD_MASK) + PUD_SIZE; @@ -30,13 +32,13 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, if (pud_present(*pud)) { pmd = pmd_offset(pud, 0); - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, addr, next); continue; } pmd = (pmd_t *)info->alloc_pgt_page(info->context); if (!pmd) return -ENOMEM; - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, addr, next); set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); } -- 2.8.0.rc3.226.g39d4020 ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-01 17:07 ` [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping Thomas Garnier @ 2016-08-02 0:36 ` Rafael J. Wysocki 2016-08-02 18:01 ` Yinghai Lu 2016-08-02 17:36 ` Yinghai Lu 1 sibling, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-02 0:36 UTC (permalink / raw) To: Thomas Garnier, Ingo Molnar Cc: Thomas Gleixner, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, x86, linux-kernel, linux-pm, kernel-hardening On Monday, August 01, 2016 10:07:59 AM Thomas Garnier wrote: > Correctly setup the temporary mapping for hibernation. Previous > implementation assumed the address was aligned on the PGD level. With > KASLR memory randomization enabled, the address is randomized on the PUD > level. This change supports unaligned address up to PMD. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> This code is shared with kexec AFAICS, so it likely is better to push it through tip rather than through the PM tree. > --- > arch/x86/mm/ident_map.c | 18 ++++++++++-------- > 1 file changed, 10 insertions(+), 8 deletions(-) > > diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c > index ec21796..ea1ebf1 100644 > --- a/arch/x86/mm/ident_map.c > +++ b/arch/x86/mm/ident_map.c > @@ -3,15 +3,16 @@ > * included by both the compressed kernel and the regular kernel. > */ > > -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, > +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, > unsigned long addr, unsigned long end) > { > - addr &= PMD_MASK; > - for (; addr < end; addr += PMD_SIZE) { > - pmd_t *pmd = pmd_page + pmd_index(addr); > + int off = info->kernel_mapping ? pmd_index(__PAGE_OFFSET) : 0; > + > + for (addr &= PMD_MASK; addr < end; addr += PMD_SIZE) { > + pmd_t *pmd = pmd_page + pmd_index(addr) + off; > > if (!pmd_present(*pmd)) > - set_pmd(pmd, __pmd(addr | pmd_flag)); > + set_pmd(pmd, __pmd(addr | info->pmd_flag)); > } > } > > @@ -19,9 +20,10 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, > unsigned long addr, unsigned long end) > { > unsigned long next; > + int off = info->kernel_mapping ? pud_index(__PAGE_OFFSET) : 0; > > for (; addr < end; addr = next) { > - pud_t *pud = pud_page + pud_index(addr); > + pud_t *pud = pud_page + pud_index(addr) + off; > pmd_t *pmd; > > next = (addr & PUD_MASK) + PUD_SIZE; > @@ -30,13 +32,13 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, > > if (pud_present(*pud)) { > pmd = pmd_offset(pud, 0); > - ident_pmd_init(info->pmd_flag, pmd, addr, next); > + ident_pmd_init(info, pmd, addr, next); > continue; > } > pmd = (pmd_t *)info->alloc_pgt_page(info->context); > if (!pmd) > return -ENOMEM; > - ident_pmd_init(info->pmd_flag, pmd, addr, next); > + ident_pmd_init(info, pmd, addr, next); > set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); > } > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-02 0:36 ` Rafael J. Wysocki @ 2016-08-02 18:01 ` Yinghai Lu 0 siblings, 0 replies; 30+ messages in thread From: Yinghai Lu @ 2016-08-02 18:01 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Thomas Garnier, Ingo Molnar, Thomas Gleixner, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Mon, Aug 1, 2016 at 5:36 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Monday, August 01, 2016 10:07:59 AM Thomas Garnier wrote: >> Correctly setup the temporary mapping for hibernation. Previous >> implementation assumed the address was aligned on the PGD level. With >> KASLR memory randomization enabled, the address is randomized on the PUD >> level. This change supports unaligned address up to PMD. > > This code is shared with kexec AFAICS, so it likely is better to push it > through tip rather than through the PM tree. Only calling path via arch/x86/power/hibernate_64.c have kernel_mapping = true; other two paths: arch/x86/boot/compressed/pagetable.c and arch/x86/kernel/machine_kexec_64.c all have kernel_mapping as false. maybe that path need simplified kernel_physical_mapping_init() instead? Thanks Yinghai ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-01 17:07 ` [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping Thomas Garnier 2016-08-02 0:36 ` Rafael J. Wysocki @ 2016-08-02 17:36 ` Yinghai Lu 2016-08-02 17:48 ` Thomas Garnier 1 sibling, 1 reply; 30+ messages in thread From: Yinghai Lu @ 2016-08-02 17:36 UTC (permalink / raw) To: Thomas Garnier Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Rafael J . Wysocki, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Mon, Aug 1, 2016 at 10:07 AM, Thomas Garnier <thgarnie@google.com> wrote: > Correctly setup the temporary mapping for hibernation. Previous > implementation assumed the address was aligned on the PGD level. With > KASLR memory randomization enabled, the address is randomized on the PUD > level. This change supports unaligned address up to PMD. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> > --- > arch/x86/mm/ident_map.c | 18 ++++++++++-------- > 1 file changed, 10 insertions(+), 8 deletions(-) > > diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c > index ec21796..ea1ebf1 100644 > --- a/arch/x86/mm/ident_map.c > +++ b/arch/x86/mm/ident_map.c > @@ -3,15 +3,16 @@ > * included by both the compressed kernel and the regular kernel. > */ > > -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, > +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, > unsigned long addr, unsigned long end) > { > - addr &= PMD_MASK; > - for (; addr < end; addr += PMD_SIZE) { > - pmd_t *pmd = pmd_page + pmd_index(addr); > + int off = info->kernel_mapping ? pmd_index(__PAGE_OFFSET) : 0; > + > + for (addr &= PMD_MASK; addr < end; addr += PMD_SIZE) { > + pmd_t *pmd = pmd_page + pmd_index(addr) + off; > > if (!pmd_present(*pmd)) > - set_pmd(pmd, __pmd(addr | pmd_flag)); > + set_pmd(pmd, __pmd(addr | info->pmd_flag)); > } > } > > @@ -19,9 +20,10 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, > unsigned long addr, unsigned long end) > { > unsigned long next; > + int off = info->kernel_mapping ? pud_index(__PAGE_OFFSET) : 0; > > for (; addr < end; addr = next) { > - pud_t *pud = pud_page + pud_index(addr); > + pud_t *pud = pud_page + pud_index(addr) + off; > pmd_t *pmd; > > next = (addr & PUD_MASK) + PUD_SIZE; Is there any chance for (pud_index(addr) + off) or (pmd_index(addr) + off) bigger than 512? Looks like we need to change the loop from phys address to virtual address instead. to avoid the overflow. Thanks Yinghai ^ permalink raw reply [flat|nested] 30+ messages in thread
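A small sketch of the overflow being asked about (the index values are hypothetical, chosen only to show the arithmetic): with a nonzero off, pud_index(addr) + off can land past the last slot of the 512-entry PUD page, which is the motivation for walking virtual addresses instead.

#include <stdio.h>

#define PUD_SHIFT       30
#define PTRS_PER_PUD    512

static unsigned long pud_index(unsigned long addr)
{
        return (addr >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
}

int main(void)
{
        /* hypothetical __PAGE_OFFSET whose PUD index is near the top of the table */
        unsigned long off  = 510;
        unsigned long addr = 3UL << PUD_SHIFT;  /* pud_index(addr) == 3 */

        /* the v1 patch computes: pud_page + pud_index(addr) + off */
        printf("index %lu, last valid slot %d\n",
               pud_index(addr) + off, PTRS_PER_PUD - 1);        /* 513 > 511 */
        return 0;
}

The same reasoning applies to the PMD-level offset added in the v1 patch.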
* Re: [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-02 17:36 ` Yinghai Lu @ 2016-08-02 17:48 ` Thomas Garnier 2016-08-02 19:55 ` Yinghai Lu 0 siblings, 1 reply; 30+ messages in thread From: Thomas Garnier @ 2016-08-02 17:48 UTC (permalink / raw) To: Yinghai Lu Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Rafael J . Wysocki, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Tue, Aug 2, 2016 at 10:36 AM, Yinghai Lu <yinghai@kernel.org> wrote: > On Mon, Aug 1, 2016 at 10:07 AM, Thomas Garnier <thgarnie@google.com> wrote: >> Correctly setup the temporary mapping for hibernation. Previous >> implementation assumed the address was aligned on the PGD level. With >> KASLR memory randomization enabled, the address is randomized on the PUD >> level. This change supports unaligned address up to PMD. >> >> Signed-off-by: Thomas Garnier <thgarnie@google.com> >> --- >> arch/x86/mm/ident_map.c | 18 ++++++++++-------- >> 1 file changed, 10 insertions(+), 8 deletions(-) >> >> diff --git a/arch/x86/mm/ident_map.c b/arch/x86/mm/ident_map.c >> index ec21796..ea1ebf1 100644 >> --- a/arch/x86/mm/ident_map.c >> +++ b/arch/x86/mm/ident_map.c >> @@ -3,15 +3,16 @@ >> * included by both the compressed kernel and the regular kernel. >> */ >> >> -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, >> +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, >> unsigned long addr, unsigned long end) >> { >> - addr &= PMD_MASK; >> - for (; addr < end; addr += PMD_SIZE) { >> - pmd_t *pmd = pmd_page + pmd_index(addr); >> + int off = info->kernel_mapping ? pmd_index(__PAGE_OFFSET) : 0; >> + >> + for (addr &= PMD_MASK; addr < end; addr += PMD_SIZE) { >> + pmd_t *pmd = pmd_page + pmd_index(addr) + off; >> >> if (!pmd_present(*pmd)) >> - set_pmd(pmd, __pmd(addr | pmd_flag)); >> + set_pmd(pmd, __pmd(addr | info->pmd_flag)); >> } >> } >> >> @@ -19,9 +20,10 @@ static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, >> unsigned long addr, unsigned long end) >> { >> unsigned long next; >> + int off = info->kernel_mapping ? pud_index(__PAGE_OFFSET) : 0; >> >> for (; addr < end; addr = next) { >> - pud_t *pud = pud_page + pud_index(addr); >> + pud_t *pud = pud_page + pud_index(addr) + off; >> pmd_t *pmd; >> >> next = (addr & PUD_MASK) + PUD_SIZE; > > Is there any chance for (pud_index(addr) + off) or (pmd_index(addr) + off) > bigger than 512? > > Looks like we need to change the loop from phys address to virtual > address instead. > to avoid the overflow. > That's a good point. I will take a look at it. > Thanks > > Yinghai ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-02 17:48 ` Thomas Garnier @ 2016-08-02 19:55 ` Yinghai Lu 2016-08-03 15:29 ` Thomas Garnier 2016-08-03 18:23 ` [PATCH v2] " Yinghai Lu 0 siblings, 2 replies; 30+ messages in thread From: Yinghai Lu @ 2016-08-02 19:55 UTC (permalink / raw) To: Thomas Garnier Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Rafael J . Wysocki, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com [-- Attachment #1: Type: text/plain, Size: 296 bytes --] On Tue, Aug 2, 2016 at 10:48 AM, Thomas Garnier <thgarnie@google.com> wrote: > On Tue, Aug 2, 2016 at 10:36 AM, Yinghai Lu <yinghai@kernel.org> wrote: >> >> Looks like we need to change the loop from phys address to virtual >> address instead. >> to avoid the overflow. something like attached. [-- Attachment #2: fix_ident_off.patch --] [-- Type: text/x-patch, Size: 3570 bytes --] --- arch/x86/mm/ident_map.c | 54 ++++++++++++++++++++++++++++-------------------- 1 file changed, 32 insertions(+), 22 deletions(-) Index: linux-2.6/arch/x86/mm/ident_map.c =================================================================== --- linux-2.6.orig/arch/x86/mm/ident_map.c +++ linux-2.6/arch/x86/mm/ident_map.c @@ -3,40 +3,47 @@ * included by both the compressed kernel and the regular kernel. */ -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, unsigned long addr, unsigned long end) { - addr &= PMD_MASK; - for (; addr < end; addr += PMD_SIZE) { - pmd_t *pmd = pmd_page + pmd_index(addr); + unsigned long off = info->kernel_mapping ? __PAGE_OFFSET : 0; + unsigned long vaddr = addr + off; + unsigned long vend = end + off; + + vaddr &= PMD_MASK; + for (; vaddr < vend; vaddr += PMD_SIZE) { + pmd_t *pmd = pmd_page + pmd_index(vaddr); if (!pmd_present(*pmd)) - set_pmd(pmd, __pmd(addr | pmd_flag)); + set_pmd(pmd, __pmd(vaddr - off | info->pmd_flag)); } } static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, unsigned long addr, unsigned long end) { - unsigned long next; + unsigned long off = info->kernel_mapping ? __PAGE_OFFSET : 0; + unsigned long vaddr = addr + off; + unsigned long vend = end + off; + unsigned long vnext; - for (; addr < end; addr = next) { - pud_t *pud = pud_page + pud_index(addr); + for (; vaddr < vend; vaddr = vnext) { + pud_t *pud = pud_page + pud_index(vaddr); pmd_t *pmd; - next = (addr & PUD_MASK) + PUD_SIZE; - if (next > end) - next = end; + vnext = (vaddr & PUD_MASK) + PUD_SIZE; + if (vnext > vend) + vnext = vend; if (pud_present(*pud)) { pmd = pmd_offset(pud, 0); - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, vaddr - off, vnext - off); continue; } pmd = (pmd_t *)info->alloc_pgt_page(info->context); if (!pmd) return -ENOMEM; - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, vaddr - off, vnext - off); set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); } @@ -46,21 +53,24 @@ static int ident_pud_init(struct x86_map int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, unsigned long addr, unsigned long end) { - unsigned long next; int result; - int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0; + unsigned long off = info->kernel_mapping ? 
__PAGE_OFFSET : 0; + unsigned long vaddr = addr + off; + unsigned long vend = end + off; + unsigned long vnext; - for (; addr < end; addr = next) { - pgd_t *pgd = pgd_page + pgd_index(addr) + off; + for (; vaddr < vend; vaddr = vnext) { + pgd_t *pgd = pgd_page + pgd_index(vaddr); pud_t *pud; - next = (addr & PGDIR_MASK) + PGDIR_SIZE; - if (next > end) - next = end; + vnext = (vaddr & PGDIR_MASK) + PGDIR_SIZE; + if (vnext > vend) + vnext = vend; if (pgd_present(*pgd)) { pud = pud_offset(pgd, 0); - result = ident_pud_init(info, pud, addr, next); + result = ident_pud_init(info, pud, vaddr - off, + vnext - off); if (result) return result; continue; @@ -69,7 +79,7 @@ int kernel_ident_mapping_init(struct x86 pud = (pud_t *)info->alloc_pgt_page(info->context); if (!pud) return -ENOMEM; - result = ident_pud_init(info, pud, addr, next); + result = ident_pud_init(info, pud, vaddr - off, vnext - off); if (result) return result; set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE)); ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-02 19:55 ` Yinghai Lu @ 2016-08-03 15:29 ` Thomas Garnier 2016-08-03 18:23 ` [PATCH v2] " Yinghai Lu 1 sibling, 0 replies; 30+ messages in thread From: Thomas Garnier @ 2016-08-03 15:29 UTC (permalink / raw) To: Yinghai Lu Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Rafael J . Wysocki, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Tue, Aug 2, 2016 at 12:55 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Tue, Aug 2, 2016 at 10:48 AM, Thomas Garnier <thgarnie@google.com> wrote: >> On Tue, Aug 2, 2016 at 10:36 AM, Yinghai Lu <yinghai@kernel.org> wrote: >>> >>> Looks like we need to change the loop from phys address to virtual >>> address instead. >>> to avoid the overflow. > > something like attached. I tested it and it worked well. I just got this warning on build: In file included from arch/x86/mm/init_64.c:60:0: arch/x86/mm/ident_map.c: In function ‘ident_pmd_init’: arch/x86/mm/ident_map.c:18:29: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses] set_pmd(pmd, __pmd(vaddr - off | info->pmd_flag)); Do you want to resend your version for integration? ^ permalink raw reply [flat|nested] 30+ messages in thread
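For reference, the warning is only about operator precedence: in C the additive '-' binds tighter than bitwise '|', so the unparenthesized expression already groups as intended, and the parentheses added later merely silence -Wparentheses. A tiny stand-alone check (the operand values are arbitrary):

#include <assert.h>

int main(void)
{
        unsigned long vaddr = 0x40200000UL, off = 0x40000000UL, flag = 0xe3UL;

        /* '-' has higher precedence than '|', so both spellings are equivalent */
        assert((vaddr - off | flag) == ((vaddr - off) | flag));
        return 0;
}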
* [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-02 19:55 ` Yinghai Lu 2016-08-03 15:29 ` Thomas Garnier @ 2016-08-03 18:23 ` Yinghai Lu 2016-08-03 21:28 ` Rafael J. Wysocki 1 sibling, 1 reply; 30+ messages in thread From: Yinghai Lu @ 2016-08-03 18:23 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Rafael J . Wysocki, Pavel Machek Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening, Thomas Garnier, Yinghai Lu From: Thomas Garnier <thgarnie@google.com> Correctly setup the temporary mapping for hibernation. Previous implementation assumed the offset between KVA and PA was aligned on the PGD level. With KASLR memory randomization enabled, the offset is randomized on the PUD level. This change supports unaligned up to PMD. Signed-off-by: Thomas Garnier <thgarnie@google.com> [yinghai: change loop to virtual address] Signed-off-by: Yinghai Lu <yinghai@kernel.org> --- arch/x86/mm/ident_map.c | 54 ++++++++++++++++++++++++++++-------------------- 1 file changed, 32 insertions(+), 22 deletions(-) Index: linux-2.6/arch/x86/mm/ident_map.c =================================================================== --- linux-2.6.orig/arch/x86/mm/ident_map.c +++ linux-2.6/arch/x86/mm/ident_map.c @@ -3,40 +3,47 @@ * included by both the compressed kernel and the regular kernel. */ -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, unsigned long addr, unsigned long end) { - addr &= PMD_MASK; - for (; addr < end; addr += PMD_SIZE) { - pmd_t *pmd = pmd_page + pmd_index(addr); + unsigned long off = info->kernel_mapping ? __PAGE_OFFSET : 0; + unsigned long vaddr = addr + off; + unsigned long vend = end + off; + + vaddr &= PMD_MASK; + for (; vaddr < vend; vaddr += PMD_SIZE) { + pmd_t *pmd = pmd_page + pmd_index(vaddr); if (!pmd_present(*pmd)) - set_pmd(pmd, __pmd(addr | pmd_flag)); + set_pmd(pmd, __pmd((vaddr - off) | info->pmd_flag)); } } static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, unsigned long addr, unsigned long end) { - unsigned long next; + unsigned long off = info->kernel_mapping ? __PAGE_OFFSET : 0; + unsigned long vaddr = addr + off; + unsigned long vend = end + off; + unsigned long vnext; - for (; addr < end; addr = next) { - pud_t *pud = pud_page + pud_index(addr); + for (; vaddr < vend; vaddr = vnext) { + pud_t *pud = pud_page + pud_index(vaddr); pmd_t *pmd; - next = (addr & PUD_MASK) + PUD_SIZE; - if (next > end) - next = end; + vnext = (vaddr & PUD_MASK) + PUD_SIZE; + if (vnext > vend) + vnext = vend; if (pud_present(*pud)) { pmd = pmd_offset(pud, 0); - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, vaddr - off, vnext - off); continue; } pmd = (pmd_t *)info->alloc_pgt_page(info->context); if (!pmd) return -ENOMEM; - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, vaddr - off, vnext - off); set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); } @@ -46,21 +53,24 @@ static int ident_pud_init(struct x86_map int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, unsigned long addr, unsigned long end) { - unsigned long next; int result; - int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0; + unsigned long off = info->kernel_mapping ? 
__PAGE_OFFSET : 0; + unsigned long vaddr = addr + off; + unsigned long vend = end + off; + unsigned long vnext; - for (; addr < end; addr = next) { - pgd_t *pgd = pgd_page + pgd_index(addr) + off; + for (; vaddr < vend; vaddr = vnext) { + pgd_t *pgd = pgd_page + pgd_index(vaddr); pud_t *pud; - next = (addr & PGDIR_MASK) + PGDIR_SIZE; - if (next > end) - next = end; + vnext = (vaddr & PGDIR_MASK) + PGDIR_SIZE; + if (vnext > vend) + vnext = vend; if (pgd_present(*pgd)) { pud = pud_offset(pgd, 0); - result = ident_pud_init(info, pud, addr, next); + result = ident_pud_init(info, pud, vaddr - off, + vnext - off); if (result) return result; continue; @@ -69,7 +79,7 @@ int kernel_ident_mapping_init(struct x86 pud = (pud_t *)info->alloc_pgt_page(info->context); if (!pud) return -ENOMEM; - result = ident_pud_init(info, pud, addr, next); + result = ident_pud_init(info, pud, vaddr - off, vnext - off); if (result) return result; set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE)); ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-03 18:23 ` [PATCH v2] " Yinghai Lu @ 2016-08-03 21:28 ` Rafael J. Wysocki 2016-08-07 1:03 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-03 21:28 UTC (permalink / raw) To: Yinghai Lu Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Rafael J . Wysocki, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening, Thomas Garnier On Wed, Aug 3, 2016 at 8:23 PM, Yinghai Lu <yinghai@kernel.org> wrote: > From: Thomas Garnier <thgarnie@google.com> > > Correctly setup the temporary mapping for hibernation. Previous > implementation assumed the offset between KVA and PA was aligned on the PGD level. > With KASLR memory randomization enabled, the offset is randomized on the PUD > level. This change supports unaligned up to PMD. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> > [yinghai: change loop to virtual address] > Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > arch/x86/mm/ident_map.c | 54 ++++++++++++++++++++++++++++-------------------- > 1 file changed, 32 insertions(+), 22 deletions(-) > > Index: linux-2.6/arch/x86/mm/ident_map.c > =================================================================== > --- linux-2.6.orig/arch/x86/mm/ident_map.c > +++ linux-2.6/arch/x86/mm/ident_map.c > @@ -3,40 +3,47 @@ > * included by both the compressed kernel and the regular kernel. > */ > > -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, > +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, > unsigned long addr, unsigned long end) > { > - addr &= PMD_MASK; > - for (; addr < end; addr += PMD_SIZE) { > - pmd_t *pmd = pmd_page + pmd_index(addr); > + unsigned long off = info->kernel_mapping ? __PAGE_OFFSET : 0; > + unsigned long vaddr = addr + off; > + unsigned long vend = end + off; > + > + vaddr &= PMD_MASK; > + for (; vaddr < vend; vaddr += PMD_SIZE) { > + pmd_t *pmd = pmd_page + pmd_index(vaddr); > > if (!pmd_present(*pmd)) > - set_pmd(pmd, __pmd(addr | pmd_flag)); > + set_pmd(pmd, __pmd((vaddr - off) | info->pmd_flag)); > } > } > > static int ident_pud_init(struct x86_mapping_info *info, pud_t *pud_page, > unsigned long addr, unsigned long end) > { > - unsigned long next; > + unsigned long off = info->kernel_mapping ? 
__PAGE_OFFSET : 0; > + unsigned long vaddr = addr + off; > + unsigned long vend = end + off; > + unsigned long vnext; > > - for (; addr < end; addr = next) { > - pud_t *pud = pud_page + pud_index(addr); > + for (; vaddr < vend; vaddr = vnext) { > + pud_t *pud = pud_page + pud_index(vaddr); > pmd_t *pmd; > > - next = (addr & PUD_MASK) + PUD_SIZE; > - if (next > end) > - next = end; > + vnext = (vaddr & PUD_MASK) + PUD_SIZE; > + if (vnext > vend) > + vnext = vend; > > if (pud_present(*pud)) { > pmd = pmd_offset(pud, 0); > - ident_pmd_init(info->pmd_flag, pmd, addr, next); > + ident_pmd_init(info, pmd, vaddr - off, vnext - off); > continue; > } > pmd = (pmd_t *)info->alloc_pgt_page(info->context); > if (!pmd) > return -ENOMEM; > - ident_pmd_init(info->pmd_flag, pmd, addr, next); > + ident_pmd_init(info, pmd, vaddr - off, vnext - off); > set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); > } > > @@ -46,21 +53,24 @@ static int ident_pud_init(struct x86_map > int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, > unsigned long addr, unsigned long end) > { > - unsigned long next; > int result; > - int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0; > + unsigned long off = info->kernel_mapping ? __PAGE_OFFSET : 0; > + unsigned long vaddr = addr + off; > + unsigned long vend = end + off; > + unsigned long vnext; > > - for (; addr < end; addr = next) { > - pgd_t *pgd = pgd_page + pgd_index(addr) + off; > + for (; vaddr < vend; vaddr = vnext) { > + pgd_t *pgd = pgd_page + pgd_index(vaddr); > pud_t *pud; > > - next = (addr & PGDIR_MASK) + PGDIR_SIZE; > - if (next > end) > - next = end; > + vnext = (vaddr & PGDIR_MASK) + PGDIR_SIZE; > + if (vnext > vend) > + vnext = vend; > > if (pgd_present(*pgd)) { > pud = pud_offset(pgd, 0); > - result = ident_pud_init(info, pud, addr, next); > + result = ident_pud_init(info, pud, vaddr - off, > + vnext - off); > if (result) > return result; > continue; > @@ -69,7 +79,7 @@ int kernel_ident_mapping_init(struct x86 > pud = (pud_t *)info->alloc_pgt_page(info->context); > if (!pud) > return -ENOMEM; > - result = ident_pud_init(info, pud, addr, next); > + result = ident_pud_init(info, pud, vaddr - off, vnext - off); > if (result) > return result; > set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE)); > -- > To unsubscribe from this list: send the line "unsubscribe linux-pm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-03 21:28 ` Rafael J. Wysocki @ 2016-08-07 1:03 ` Rafael J. Wysocki 2016-08-07 4:53 ` Yinghai Lu 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-07 1:03 UTC (permalink / raw) To: Yinghai Lu, Thomas Garnier Cc: Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening On Wednesday, August 03, 2016 11:28:48 PM Rafael J. Wysocki wrote: > On Wed, Aug 3, 2016 at 8:23 PM, Yinghai Lu <yinghai@kernel.org> wrote: > > From: Thomas Garnier <thgarnie@google.com> > > > > Correctly setup the temporary mapping for hibernation. Previous > > implementation assumed the offset between KVA and PA was aligned on the PGD level. > > With KASLR memory randomization enabled, the offset is randomized on the PUD > > level. This change supports unaligned up to PMD. > > > > Signed-off-by: Thomas Garnier <thgarnie@google.com> > > [yinghai: change loop to virtual address] > > Signed-off-by: Yinghai Lu <yinghai@kernel.org> > > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> On a second thought, it seems to be better to follow your suggestion to simply provide a special version of kernel_ident_mapping_init() for hibernation, because it is sufficiently distinct from the other users of the code in ident_map.c. The patch below does just that (lightly tested). Thomas, can you please test this one too? Thanks, Rafael --- From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Subject: [PATCH] x86/power/64: Always create temporary identity mapping correctly The low-level resume-from-hibernation code on x86-64 uses kernel_ident_mapping_init() to create the temoprary identity mapping, but that function assumes that the offset between kernel virtual addresses and physical addresses is aligned on the PGD level. However, with a randomized identity mapping base, it may be aligned on the PUD level and if that happens, the temporary identity mapping created by set_up_temporary_mappings() will not reflect the actual kernel identity mapping and the image restoration will fail as a result (leading to a kernel panic most of the time). To fix this problem, provide simplified routines for creating the temporary identity mapping during resume from hibernation on x86-64 that support unaligned offsets between KVA and PA up to the PMD level. Although kernel_ident_mapping_init() might be made work in that case too, using hibernation-specific code for that is way simpler. Reported-by: Thomas Garnier <thgarnie@google.com> Suggested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> --- arch/x86/power/hibernate_64.c | 61 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 53 insertions(+), 8 deletions(-) Index: linux-pm/arch/x86/power/hibernate_64.c =================================================================== --- linux-pm.orig/arch/x86/power/hibernate_64.c +++ linux-pm/arch/x86/power/hibernate_64.c @@ -77,18 +77,63 @@ static int set_up_temporary_text_mapping return 0; } -static void *alloc_pgt_page(void *context) +static void ident_pmd_init(pmd_t *pmd, unsigned long addr, unsigned long end) { - return (void *)get_safe_page(GFP_ATOMIC); + for (; addr < end; addr += PMD_SIZE) + set_pmd(pmd + pmd_index(addr), + __pmd((addr - __PAGE_OFFSET) | __PAGE_KERNEL_LARGE_EXEC)); +} + +static int ident_pud_init(pud_t *pud, unsigned long addr, unsigned long end) +{ + unsigned long next; + + for (; addr < end; addr = next) { + pmd_t *pmd; + + pmd = (pmd_t *)get_safe_page(GFP_ATOMIC); + if (!pmd) + return -ENOMEM; + + next = (addr & PUD_MASK) + PUD_SIZE; + if (next > end) + next = end; + + ident_pmd_init(pmd, addr & PMD_MASK, next); + set_pud(pud + pud_index(addr), __pud(__pa(pmd) | _KERNPG_TABLE)); + } + return 0; +} + +static int ident_mapping_init(pgd_t *pgd, unsigned long mstart, unsigned long mend) +{ + unsigned long addr = mstart + __PAGE_OFFSET; + unsigned long end = mend + __PAGE_OFFSET; + unsigned long next; + + for (; addr < end; addr = next) { + pud_t *pud; + int result; + + pud = (pud_t *)get_safe_page(GFP_ATOMIC); + if (!pud) + return -ENOMEM; + + next = (addr & PGDIR_MASK) + PGDIR_SIZE; + if (next > end) + next = end; + + result = ident_pud_init(pud, addr, next); + if (result) + return result; + + set_pgd(pgd + pgd_index(addr), __pgd(__pa(pud) | _KERNPG_TABLE)); + } + return 0; } static int set_up_temporary_mappings(void) { - struct x86_mapping_info info = { - .alloc_pgt_page = alloc_pgt_page, - .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, - .kernel_mapping = true, - }; unsigned long mstart, mend; pgd_t *pgd; int result; @@ -108,7 +153,7 @@ static int set_up_temporary_mappings(voi mstart = pfn_mapped[i].start << PAGE_SHIFT; mend = pfn_mapped[i].end << PAGE_SHIFT; - result = kernel_ident_mapping_init(&info, pgd, mstart, mend); + result = ident_mapping_init(pgd, mstart, mend); if (result) return result; } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-07 1:03 ` Rafael J. Wysocki @ 2016-08-07 4:53 ` Yinghai Lu 2016-08-07 23:23 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Yinghai Lu @ 2016-08-07 4:53 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Thomas Garnier, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Sat, Aug 6, 2016 at 6:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Wednesday, August 03, 2016 11:28:48 PM Rafael J. Wysocki wrote: > > On a second thought, it seems to be better to follow your suggestion to simply > provide a special version of kernel_ident_mapping_init() for hibernation, > because it is sufficiently distinct from the other users of the code in > ident_map.c. > > The patch below does just that (lightly tested). > > > --- > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Subject: [PATCH] x86/power/64: Always create temporary identity mapping correctly > > The low-level resume-from-hibernation code on x86-64 uses > kernel_ident_mapping_init() to create the temoprary identity mapping, > but that function assumes that the offset between kernel virtual > addresses and physical addresses is aligned on the PGD level. > > However, with a randomized identity mapping base, it may be aligned > on the PUD level and if that happens, the temporary identity mapping > created by set_up_temporary_mappings() will not reflect the actual > kernel identity mapping and the image restoration will fail as a > result (leading to a kernel panic most of the time). > > To fix this problem, provide simplified routines for creating the > temporary identity mapping during resume from hibernation on x86-64 > that support unaligned offsets between KVA and PA up to the PMD > level. > > Although kernel_ident_mapping_init() might be made work in that > case too, using hibernation-specific code for that is way simpler. > > Reported-by: Thomas Garnier <thgarnie@google.com> > Suggested-by: Yinghai Lu <yinghai@kernel.org> > Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> > --- > arch/x86/power/hibernate_64.c | 61 ++++++++++++++++++++++++++++++++++++------ > 1 file changed, 53 insertions(+), 8 deletions(-) > > Index: linux-pm/arch/x86/power/hibernate_64.c > =================================================================== > --- linux-pm.orig/arch/x86/power/hibernate_64.c > +++ linux-pm/arch/x86/power/hibernate_64.c > @@ -77,18 +77,63 @@ static int set_up_temporary_text_mapping > return 0; > } > > -static void *alloc_pgt_page(void *context) > +static void ident_pmd_init(pmd_t *pmd, unsigned long addr, unsigned long end) > { > - return (void *)get_safe_page(GFP_ATOMIC); > + for (; addr < end; addr += PMD_SIZE) > + set_pmd(pmd + pmd_index(addr), > + __pmd((addr - __PAGE_OFFSET) | __PAGE_KERNEL_LARGE_EXEC)); > +} > + > +static int ident_pud_init(pud_t *pud, unsigned long addr, unsigned long end) > +{ > + unsigned long next; > + > + for (; addr < end; addr = next) { > + pmd_t *pmd; > + > + pmd = (pmd_t *)get_safe_page(GFP_ATOMIC); > + if (!pmd) > + return -ENOMEM; > + > + next = (addr & PUD_MASK) + PUD_SIZE; > + if (next > end) > + next = end; > + > + ident_pmd_init(pmd, addr & PMD_MASK, next); > + set_pud(pud + pud_index(addr), __pud(__pa(pmd) | _KERNPG_TABLE)); > + } > + return 0; > +} > + > +static int ident_mapping_init(pgd_t *pgd, unsigned long mstart, unsigned long mend) > +{ > + unsigned long addr = mstart + __PAGE_OFFSET; > + unsigned long end = mend + __PAGE_OFFSET; > + unsigned long next; > + > + for (; addr < end; addr = next) { > + pud_t *pud; > + int result; > + > + pud = (pud_t *)get_safe_page(GFP_ATOMIC); > + if (!pud) > + return -ENOMEM; > + > + next = (addr & PGDIR_MASK) + PGDIR_SIZE; > + if (next > end) > + next = end; > + > + result = ident_pud_init(pud, addr, next); > + if (result) > + return result; > + > + set_pgd(pgd + pgd_index(addr), __pgd(__pa(pud) | _KERNPG_TABLE)); > + } > + return 0; > } > > static int set_up_temporary_mappings(void) > { > - struct x86_mapping_info info = { > - .alloc_pgt_page = alloc_pgt_page, > - .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, > - .kernel_mapping = true, > - }; > unsigned long mstart, mend; > pgd_t *pgd; > int result; > @@ -108,7 +153,7 @@ static int set_up_temporary_mappings(voi > mstart = pfn_mapped[i].start << PAGE_SHIFT; > mend = pfn_mapped[i].end << PAGE_SHIFT; > > - result = kernel_ident_mapping_init(&info, pgd, mstart, mend); > + result = ident_mapping_init(pgd, mstart, mend); > if (result) > return result; > } > Hi Rafael, Your version seems not considering different pfn_mapped range could share same PGD (512G) or even PUD(1G), or even same PMD (2M) range. so just keep on using kernel_ident_mapping_init() for that. At the same time, set_up_temporary_text_mapping could be replaced with kernel_ident_mapping_init() too if restore_jump_address is KVA for jump_address_phys. Thanks Yinghai ^ permalink raw reply [flat|nested] 30+ messages in thread
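A short sketch of the sharing problem raised here (the base address and the two ranges are made-up examples): two pfn_mapped ranges that both sit below 512GB fall into the same PGD slot, so a routine that unconditionally allocates and installs a fresh PUD page for each range would clobber the entry created for the first one.

#include <stdio.h>

#define PGDIR_SHIFT     39

static unsigned long pgd_index(unsigned long addr)
{
        return (addr >> PGDIR_SHIFT) & 511;
}

int main(void)
{
        unsigned long page_offset = 0xffff880000000000UL;       /* example base */
        /* two example pfn_mapped ranges, both below 512GB */
        unsigned long r1 = 0x0000000000100000UL + page_offset;
        unsigned long r2 = 0x0000000100000000UL + page_offset;

        /*
         * Same PGD slot for both ranges: the second call must reuse the
         * PUD page installed by the first, not allocate and set a new one.
         */
        printf("pgd slot: range1 %lu, range2 %lu\n", pgd_index(r1), pgd_index(r2));
        return 0;
}

This is why kernel_ident_mapping_init() checks pgd_present()/pud_present() before allocating a new lower-level table.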
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-07 4:53 ` Yinghai Lu @ 2016-08-07 23:23 ` Rafael J. Wysocki 2016-08-08 7:06 ` Yinghai Lu 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-07 23:23 UTC (permalink / raw) To: Yinghai Lu, Thomas Garnier Cc: Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Saturday, August 06, 2016 09:53:50 PM Yinghai Lu wrote: > On Sat, Aug 6, 2016 at 6:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Wednesday, August 03, 2016 11:28:48 PM Rafael J. Wysocki wrote: > > > > On a second thought, it seems to be better to follow your suggestion to simply > > provide a special version of kernel_ident_mapping_init() for hibernation, > > because it is sufficiently distinct from the other users of the code in > > ident_map.c. > > > > The patch below does just that (lightly tested). > > > > > > --- > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Subject: [PATCH] x86/power/64: Always create temporary identity mapping correctly > > > > The low-level resume-from-hibernation code on x86-64 uses > > kernel_ident_mapping_init() to create the temoprary identity mapping, > > but that function assumes that the offset between kernel virtual > > addresses and physical addresses is aligned on the PGD level. > > > > However, with a randomized identity mapping base, it may be aligned > > on the PUD level and if that happens, the temporary identity mapping > > created by set_up_temporary_mappings() will not reflect the actual > > kernel identity mapping and the image restoration will fail as a > > result (leading to a kernel panic most of the time). > > > > To fix this problem, provide simplified routines for creating the > > temporary identity mapping during resume from hibernation on x86-64 > > that support unaligned offsets between KVA and PA up to the PMD > > level. > > > > Although kernel_ident_mapping_init() might be made work in that > > case too, using hibernation-specific code for that is way simpler. > > > > Reported-by: Thomas Garnier <thgarnie@google.com> > > Suggested-by: Yinghai Lu <yinghai@kernel.org> > > Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> > > --- > > arch/x86/power/hibernate_64.c | 61 ++++++++++++++++++++++++++++++++++++------ > > 1 file changed, 53 insertions(+), 8 deletions(-) > > > > Index: linux-pm/arch/x86/power/hibernate_64.c > > =================================================================== > > --- linux-pm.orig/arch/x86/power/hibernate_64.c > > +++ linux-pm/arch/x86/power/hibernate_64.c > > @@ -77,18 +77,63 @@ static int set_up_temporary_text_mapping > > return 0; > > } > > > > -static void *alloc_pgt_page(void *context) > > +static void ident_pmd_init(pmd_t *pmd, unsigned long addr, unsigned long end) > > { > > - return (void *)get_safe_page(GFP_ATOMIC); > > + for (; addr < end; addr += PMD_SIZE) > > + set_pmd(pmd + pmd_index(addr), > > + __pmd((addr - __PAGE_OFFSET) | __PAGE_KERNEL_LARGE_EXEC)); > > +} > > + > > +static int ident_pud_init(pud_t *pud, unsigned long addr, unsigned long end) > > +{ > > + unsigned long next; > > + > > + for (; addr < end; addr = next) { > > + pmd_t *pmd; > > + > > + pmd = (pmd_t *)get_safe_page(GFP_ATOMIC); > > + if (!pmd) > > + return -ENOMEM; > > + > > + next = (addr & PUD_MASK) + PUD_SIZE; > > + if (next > end) > > + next = end; > > + > > + ident_pmd_init(pmd, addr & PMD_MASK, next); > > + set_pud(pud + pud_index(addr), __pud(__pa(pmd) | _KERNPG_TABLE)); > > + } > > + return 0; > > +} > > + > > +static int ident_mapping_init(pgd_t *pgd, unsigned long mstart, unsigned long mend) > > +{ > > + unsigned long addr = mstart + __PAGE_OFFSET; > > + unsigned long end = mend + __PAGE_OFFSET; > > + unsigned long next; > > + > > + for (; addr < end; addr = next) { > > + pud_t *pud; > > + int result; > > + > > + pud = (pud_t *)get_safe_page(GFP_ATOMIC); > > + if (!pud) > > + return -ENOMEM; > > + > > + next = (addr & PGDIR_MASK) + PGDIR_SIZE; > > + if (next > end) > > + next = end; > > + > > + result = ident_pud_init(pud, addr, next); > > + if (result) > > + return result; > > + > > + set_pgd(pgd + pgd_index(addr), __pgd(__pa(pud) | _KERNPG_TABLE)); > > + } > > + return 0; > > } > > > > static int set_up_temporary_mappings(void) > > { > > - struct x86_mapping_info info = { > > - .alloc_pgt_page = alloc_pgt_page, > > - .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, > > - .kernel_mapping = true, > > - }; > > unsigned long mstart, mend; > > pgd_t *pgd; > > int result; > > @@ -108,7 +153,7 @@ static int set_up_temporary_mappings(voi > > mstart = pfn_mapped[i].start << PAGE_SHIFT; > > mend = pfn_mapped[i].end << PAGE_SHIFT; > > > > - result = kernel_ident_mapping_init(&info, pgd, mstart, mend); > > + result = ident_mapping_init(pgd, mstart, mend); > > if (result) > > return result; > > } > > > > Hi Rafael, > > Your version seems not considering different pfn_mapped range could > share same PGD (512G) or even PUD(1G), or even same PMD (2M) range. Good point! > so just keep on using kernel_ident_mapping_init() for that. But then playing with offsets in ident_pud_init() is not necessary, because that function works on virtual addresses only, so the appended patch should be sufficient to fix the problem, shouldn't it? > At the same time, set_up_temporary_text_mapping could be replaced with > kernel_ident_mapping_init() too if restore_jump_address is KVA for > jump_address_phys. I see no reason to do that. First, it is not guaranteed that restore_jump_address will always be a KVA for jump_address_phys and second, it really is only necessary to map one PMD in there. Thanks, Rafael --- From: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> Subject: [PATCH v2] x86/power/64: Always create temporary identity mapping correctly The low-level resume-from-hibernation code on x86-64 uses kernel_ident_mapping_init() to create the temoprary identity mapping, but that function assumes that the offset between kernel virtual addresses and physical addresses is aligned on the PGD level. However, with a randomized identity mapping base, it may be aligned on the PUD level and if that happens, the temporary identity mapping created by set_up_temporary_mappings() will not reflect the actual kernel identity mapping and the image restoration will fail as a result (leading to a kernel panic most of the time). To fix this problem, rework kernel_ident_mapping_init() to support unaligned offsets between KVA and PA up to the PMD level and make set_up_temporary_mappings() use it as approprtiate. Reported-by: Thomas Garnier <thgarnie@google.com> Suggested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- arch/x86/include/asm/init.h | 4 ++-- arch/x86/mm/ident_map.c | 19 +++++++++++-------- arch/x86/power/hibernate_64.c | 2 +- 3 files changed, 14 insertions(+), 11 deletions(-) Index: linux-pm/arch/x86/include/asm/init.h =================================================================== --- linux-pm.orig/arch/x86/include/asm/init.h +++ linux-pm/arch/x86/include/asm/init.h @@ -5,10 +5,10 @@ struct x86_mapping_info { void *(*alloc_pgt_page)(void *); /* allocate buf for page table */ void *context; /* context for alloc_pgt_page */ unsigned long pmd_flag; /* page flag for PMD entry */ - bool kernel_mapping; /* kernel mapping or ident mapping */ + unsigned long offset; /* ident mapping offset */ }; int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, - unsigned long addr, unsigned long end); + unsigned long pstart, unsigned long pend); #endif /* _ASM_X86_INIT_H */ Index: linux-pm/arch/x86/mm/ident_map.c =================================================================== --- linux-pm.orig/arch/x86/mm/ident_map.c +++ linux-pm/arch/x86/mm/ident_map.c @@ -3,15 +3,17 @@ * included by both the compressed kernel and the regular kernel. */ -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, unsigned long addr, unsigned long end) { addr &= PMD_MASK; for (; addr < end; addr += PMD_SIZE) { pmd_t *pmd = pmd_page + pmd_index(addr); - if (!pmd_present(*pmd)) - set_pmd(pmd, __pmd(addr | pmd_flag)); + if (pmd_present(*pmd)) + continue; + + set_pmd(pmd, __pmd((addr - info->offset) | info->pmd_flag)); } } @@ -30,13 +32,13 @@ static int ident_pud_init(struct x86_map if (pud_present(*pud)) { pmd = pmd_offset(pud, 0); - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, addr, next); continue; } pmd = (pmd_t *)info->alloc_pgt_page(info->context); if (!pmd) return -ENOMEM; - ident_pmd_init(info->pmd_flag, pmd, addr, next); + ident_pmd_init(info, pmd, addr, next); set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); } @@ -44,14 +46,15 @@ static int ident_pud_init(struct x86_map } int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, - unsigned long addr, unsigned long end) + unsigned long pstart, unsigned long pend) { + unsigned long addr = pstart + info->offset; + unsigned long end = pend + info->offset; unsigned long next; int result; - int off = info->kernel_mapping ? 
pgd_index(__PAGE_OFFSET) : 0; for (; addr < end; addr = next) { - pgd_t *pgd = pgd_page + pgd_index(addr) + off; + pgd_t *pgd = pgd_page + pgd_index(addr); pud_t *pud; next = (addr & PGDIR_MASK) + PGDIR_SIZE; Index: linux-pm/arch/x86/power/hibernate_64.c =================================================================== --- linux-pm.orig/arch/x86/power/hibernate_64.c +++ linux-pm/arch/x86/power/hibernate_64.c @@ -87,7 +87,7 @@ static int set_up_temporary_mappings(voi struct x86_mapping_info info = { .alloc_pgt_page = alloc_pgt_page, .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, - .kernel_mapping = true, + .offset = __PAGE_OFFSET, }; unsigned long mstart, mend; pgd_t *pgd; ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-07 23:23 ` Rafael J. Wysocki @ 2016-08-08 7:06 ` Yinghai Lu 2016-08-08 7:23 ` Yinghai Lu 0 siblings, 1 reply; 30+ messages in thread From: Yinghai Lu @ 2016-08-08 7:06 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Thomas Garnier, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Sun, Aug 7, 2016 at 4:23 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Saturday, August 06, 2016 09:53:50 PM Yinghai Lu wrote: >> On Sat, Aug 6, 2016 at 6:03 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> > On Wednesday, August 03, 2016 11:28:48 PM Rafael J. Wysocki wrote: >> > >> > On a second thought, it seems to be better to follow your suggestion to simply >> > provide a special version of kernel_ident_mapping_init() for hibernation, >> > because it is sufficiently distinct from the other users of the code in >> > ident_map.c. >> > >> > The patch below does just that (lightly tested). >> > >> > >> > --- >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> > Subject: [PATCH] x86/power/64: Always create temporary identity mapping correctly >> > >> > The low-level resume-from-hibernation code on x86-64 uses >> > kernel_ident_mapping_init() to create the temoprary identity mapping, >> > but that function assumes that the offset between kernel virtual >> > addresses and physical addresses is aligned on the PGD level. >> > >> > However, with a randomized identity mapping base, it may be aligned >> > on the PUD level and if that happens, the temporary identity mapping >> > created by set_up_temporary_mappings() will not reflect the actual >> > kernel identity mapping and the image restoration will fail as a >> > result (leading to a kernel panic most of the time). >> > >> > To fix this problem, provide simplified routines for creating the >> > temporary identity mapping during resume from hibernation on x86-64 >> > that support unaligned offsets between KVA and PA up to the PMD >> > level. >> > >> > Although kernel_ident_mapping_init() might be made work in that >> > case too, using hibernation-specific code for that is way simpler. >> > >> > Reported-by: Thomas Garnier <thgarnie@google.com> >> > Suggested-by: Yinghai Lu <yinghai@kernel.org> >> > Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> >> > --- >> > arch/x86/power/hibernate_64.c | 61 ++++++++++++++++++++++++++++++++++++------ >> > 1 file changed, 53 insertions(+), 8 deletions(-) >> > >> > Index: linux-pm/arch/x86/power/hibernate_64.c >> > =================================================================== >> > --- linux-pm.orig/arch/x86/power/hibernate_64.c >> > +++ linux-pm/arch/x86/power/hibernate_64.c >> > @@ -77,18 +77,63 @@ static int set_up_temporary_text_mapping >> > return 0; >> > } >> > >> > -static void *alloc_pgt_page(void *context) >> > +static void ident_pmd_init(pmd_t *pmd, unsigned long addr, unsigned long end) >> > { >> > - return (void *)get_safe_page(GFP_ATOMIC); >> > + for (; addr < end; addr += PMD_SIZE) >> > + set_pmd(pmd + pmd_index(addr), >> > + __pmd((addr - __PAGE_OFFSET) | __PAGE_KERNEL_LARGE_EXEC)); >> > +} >> > + >> > +static int ident_pud_init(pud_t *pud, unsigned long addr, unsigned long end) >> > +{ >> > + unsigned long next; >> > + >> > + for (; addr < end; addr = next) { >> > + pmd_t *pmd; >> > + >> > + pmd = (pmd_t *)get_safe_page(GFP_ATOMIC); >> > + if (!pmd) >> > + return -ENOMEM; >> > + >> > + next = (addr & PUD_MASK) + PUD_SIZE; >> > + if (next > end) >> > + next = end; >> > + >> > + ident_pmd_init(pmd, addr & PMD_MASK, next); >> > + set_pud(pud + pud_index(addr), __pud(__pa(pmd) | _KERNPG_TABLE)); >> > + } >> > + return 0; >> > +} >> > + >> > +static int ident_mapping_init(pgd_t *pgd, unsigned long mstart, unsigned long mend) >> > +{ >> > + unsigned long addr = mstart + __PAGE_OFFSET; >> > + unsigned long end = mend + __PAGE_OFFSET; >> > + unsigned long next; >> > + >> > + for (; addr < end; addr = next) { >> > + pud_t *pud; >> > + int result; >> > + >> > + pud = (pud_t *)get_safe_page(GFP_ATOMIC); >> > + if (!pud) >> > + return -ENOMEM; >> > + >> > + next = (addr & PGDIR_MASK) + PGDIR_SIZE; >> > + if (next > end) >> > + next = end; >> > + >> > + result = ident_pud_init(pud, addr, next); >> > + if (result) >> > + return result; >> > + >> > + set_pgd(pgd + pgd_index(addr), __pgd(__pa(pud) | _KERNPG_TABLE)); >> > + } >> > + return 0; >> > } >> > >> > static int set_up_temporary_mappings(void) >> > { >> > - struct x86_mapping_info info = { >> > - .alloc_pgt_page = alloc_pgt_page, >> > - .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, >> > - .kernel_mapping = true, >> > - }; >> > unsigned long mstart, mend; >> > pgd_t *pgd; >> > int result; >> > @@ -108,7 +153,7 @@ static int set_up_temporary_mappings(voi >> > mstart = pfn_mapped[i].start << PAGE_SHIFT; >> > mend = pfn_mapped[i].end << PAGE_SHIFT; >> > >> > - result = kernel_ident_mapping_init(&info, pgd, mstart, mend); >> > + result = ident_mapping_init(pgd, mstart, mend); >> > if (result) >> > return result; >> > } >> > >> >> Hi Rafael, >> >> Your version seems not considering different pfn_mapped range could >> share same PGD (512G) or even PUD(1G), or even same PMD (2M) range. > > Good point! > >> so just keep on using kernel_ident_mapping_init() for that. > > But then playing with offsets in ident_pud_init() is not necessary, > because that function works on virtual addresses only, so the appended patch > should be sufficient to fix the problem, shouldn't it? I agree. > >> At the same time, set_up_temporary_text_mapping could be replaced with >> kernel_ident_mapping_init() too if restore_jump_address is KVA for >> jump_address_phys. > > I see no reason to do that. 
> > First, it is not guaranteed that restore_jump_address will always be a KVA for > jump_address_phys and second, it really is only necessary to map one PMD in > there. With your v2 version, you could pass difference between restore_jump_address and jump_address_phys as info->off ? With that, we can kill more lines if replace with set_up_temporary_text_mapping with kernel_ident_mapping_init() and make code more readable. But just keep that in separated patch after your v2 patch. > --- > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Subject: [PATCH v2] x86/power/64: Always create temporary identity mapping correctly > > The low-level resume-from-hibernation code on x86-64 uses > kernel_ident_mapping_init() to create the temoprary identity mapping, > but that function assumes that the offset between kernel virtual > addresses and physical addresses is aligned on the PGD level. > > However, with a randomized identity mapping base, it may be aligned > on the PUD level and if that happens, the temporary identity mapping > created by set_up_temporary_mappings() will not reflect the actual > kernel identity mapping and the image restoration will fail as a > result (leading to a kernel panic most of the time). > > To fix this problem, rework kernel_ident_mapping_init() to support > unaligned offsets between KVA and PA up to the PMD level and make > set_up_temporary_mappings() use it as approprtiate. > > Reported-by: Thomas Garnier <thgarnie@google.com> > Suggested-by: Yinghai Lu <yinghai@kernel.org> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> it should replace [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping Acked-by: Yinghai Lu <yinghai@kernel.org> > --- > arch/x86/include/asm/init.h | 4 ++-- > arch/x86/mm/ident_map.c | 19 +++++++++++-------- > arch/x86/power/hibernate_64.c | 2 +- > 3 files changed, 14 insertions(+), 11 deletions(-) > > Index: linux-pm/arch/x86/include/asm/init.h > =================================================================== > --- linux-pm.orig/arch/x86/include/asm/init.h > +++ linux-pm/arch/x86/include/asm/init.h > @@ -5,10 +5,10 @@ struct x86_mapping_info { > void *(*alloc_pgt_page)(void *); /* allocate buf for page table */ > void *context; /* context for alloc_pgt_page */ > unsigned long pmd_flag; /* page flag for PMD entry */ > - bool kernel_mapping; /* kernel mapping or ident mapping */ > + unsigned long offset; /* ident mapping offset */ > }; > > int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, > - unsigned long addr, unsigned long end); > + unsigned long pstart, unsigned long pend); > > #endif /* _ASM_X86_INIT_H */ > Index: linux-pm/arch/x86/mm/ident_map.c > =================================================================== > --- linux-pm.orig/arch/x86/mm/ident_map.c > +++ linux-pm/arch/x86/mm/ident_map.c > @@ -3,15 +3,17 @@ > * included by both the compressed kernel and the regular kernel. 
> */ > > -static void ident_pmd_init(unsigned long pmd_flag, pmd_t *pmd_page, > +static void ident_pmd_init(struct x86_mapping_info *info, pmd_t *pmd_page, > unsigned long addr, unsigned long end) > { > addr &= PMD_MASK; > for (; addr < end; addr += PMD_SIZE) { > pmd_t *pmd = pmd_page + pmd_index(addr); > > - if (!pmd_present(*pmd)) > - set_pmd(pmd, __pmd(addr | pmd_flag)); > + if (pmd_present(*pmd)) > + continue; > + > + set_pmd(pmd, __pmd((addr - info->offset) | info->pmd_flag)); > } > } > > @@ -30,13 +32,13 @@ static int ident_pud_init(struct x86_map > > if (pud_present(*pud)) { > pmd = pmd_offset(pud, 0); > - ident_pmd_init(info->pmd_flag, pmd, addr, next); > + ident_pmd_init(info, pmd, addr, next); > continue; > } > pmd = (pmd_t *)info->alloc_pgt_page(info->context); > if (!pmd) > return -ENOMEM; > - ident_pmd_init(info->pmd_flag, pmd, addr, next); > + ident_pmd_init(info, pmd, addr, next); > set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE)); > } > > @@ -44,14 +46,15 @@ static int ident_pud_init(struct x86_map > } > > int kernel_ident_mapping_init(struct x86_mapping_info *info, pgd_t *pgd_page, > - unsigned long addr, unsigned long end) > + unsigned long pstart, unsigned long pend) > { > + unsigned long addr = pstart + info->offset; > + unsigned long end = pend + info->offset; > unsigned long next; > int result; > - int off = info->kernel_mapping ? pgd_index(__PAGE_OFFSET) : 0; > > for (; addr < end; addr = next) { > - pgd_t *pgd = pgd_page + pgd_index(addr) + off; > + pgd_t *pgd = pgd_page + pgd_index(addr); > pud_t *pud; > > next = (addr & PGDIR_MASK) + PGDIR_SIZE; > Index: linux-pm/arch/x86/power/hibernate_64.c > =================================================================== > --- linux-pm.orig/arch/x86/power/hibernate_64.c > +++ linux-pm/arch/x86/power/hibernate_64.c > @@ -87,7 +87,7 @@ static int set_up_temporary_mappings(voi > struct x86_mapping_info info = { > .alloc_pgt_page = alloc_pgt_page, > .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, > - .kernel_mapping = true, > + .offset = __PAGE_OFFSET, > }; > unsigned long mstart, mend; > pgd_t *pgd; > ^ permalink raw reply [flat|nested] 30+ messages in thread
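The alignment issue being discussed comes down to how x86-64 splits a virtual address across the page-table levels: with 4-level paging, each PGD entry covers 512 GB, each PUD entry 1 GB and each PMD entry 2 MB. The code removed in the patch above only adjusted the PGD index (by adding pgd_index(__PAGE_OFFSET)) and reused the PUD/PMD indices of the physical address, which is only correct when the virtual-to-physical offset is aligned on the PGD level. The user-space sketch below, with made-up addresses, shows how a PUD-aligned but not PGD-aligned offset breaks that assumption; the shift values are the standard x86-64 ones, everything else is purely illustrative.

#include <stdio.h>

/* Standard x86-64 4-level paging parameters (illustration only). */
#define PGDIR_SHIFT	39	/* one PGD entry maps 512 GB */
#define PUD_SHIFT	30	/* one PUD entry maps 1 GB */
#define PMD_SHIFT	21	/* one PMD entry maps 2 MB */
#define PTRS_PER_LEVEL	512

static unsigned long tbl_index(unsigned long addr, int shift)
{
	return (addr >> shift) & (PTRS_PER_LEVEL - 1);
}

int main(void)
{
	unsigned long phys = 0x12345000UL;		/* hypothetical physical address */
	unsigned long offset = 0xffff884240000000UL;	/* PUD-aligned, not PGD-aligned */
	unsigned long virt = phys + offset;

	/*
	 * Indexing with the real virtual address, which is what the patch
	 * above does by adding info->offset before walking the tables:
	 */
	printf("virt: pgd=%lu pud=%lu pmd=%lu\n",
	       tbl_index(virt, PGDIR_SHIFT),
	       tbl_index(virt, PUD_SHIFT),
	       tbl_index(virt, PMD_SHIFT));

	/*
	 * Indexing the way the old kernel_mapping path did it: PGD index of
	 * the physical address plus PGD index of the offset, PUD/PMD indices
	 * taken from the physical address.  With this offset the PUD index
	 * no longer matches, so the mapping lands in the wrong 1 GB slot.
	 */
	printf("old:  pgd=%lu pud=%lu pmd=%lu\n",
	       tbl_index(phys, PGDIR_SHIFT) + tbl_index(offset, PGDIR_SHIFT),
	       tbl_index(phys, PUD_SHIFT),
	       tbl_index(phys, PMD_SHIFT));

	return 0;
}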
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-08 7:06 ` Yinghai Lu @ 2016-08-08 7:23 ` Yinghai Lu 2016-08-08 13:16 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Yinghai Lu @ 2016-08-08 7:23 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Thomas Garnier, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Mon, Aug 8, 2016 at 12:06 AM, Yinghai Lu <yinghai@kernel.org> wrote: >> >>> At the same time, set_up_temporary_text_mapping could be replaced with >>> kernel_ident_mapping_init() too if restore_jump_address is KVA for >>> jump_address_phys. >> >> I see no reason to do that. >> >> First, it is not guaranteed that restore_jump_address will always be a KVA for >> jump_address_phys and second, it really is only necessary to map one PMD in >> there. > > With your v2 version, you could pass difference between restore_jump_address and > jump_address_phys as info->off ? > With that, we can kill more lines if replace with > set_up_temporary_text_mapping with > kernel_ident_mapping_init() and make code more readable. > > But just keep that in separated patch after your v2 patch. like: --- arch/x86/power/hibernate_64.c | 55 ++++++++++++------------------------------ 1 file changed, 17 insertions(+), 38 deletions(-) Index: linux-2.6/arch/x86/power/hibernate_64.c =================================================================== --- linux-2.6.orig/arch/x86/power/hibernate_64.c +++ linux-2.6/arch/x86/power/hibernate_64.c @@ -41,42 +41,6 @@ unsigned long temp_level4_pgt __visible; unsigned long relocated_restore_code __visible; -static int set_up_temporary_text_mapping(pgd_t *pgd) -{ - pmd_t *pmd; - pud_t *pud; - - /* - * The new mapping only has to cover the page containing the image - * kernel's entry point (jump_address_phys), because the switch over to - * it is carried out by relocated code running from a page allocated - * specifically for this purpose and covered by the identity mapping, so - * the temporary kernel text mapping is only needed for the final jump. - * Moreover, in that mapping the virtual address of the image kernel's - * entry point must be the same as its virtual address in the image - * kernel (restore_jump_address), so the image kernel's - * restore_registers() code doesn't find itself in a different area of - * the virtual address space after switching over to the original page - * tables used by the image kernel. 
- */ - pud = (pud_t *)get_safe_page(GFP_ATOMIC); - if (!pud) - return -ENOMEM; - - pmd = (pmd_t *)get_safe_page(GFP_ATOMIC); - if (!pmd) - return -ENOMEM; - - set_pmd(pmd + pmd_index(restore_jump_address), - __pmd((jump_address_phys & PMD_MASK) | __PAGE_KERNEL_LARGE_EXEC)); - set_pud(pud + pud_index(restore_jump_address), - __pud(__pa(pmd) | _KERNPG_TABLE)); - set_pgd(pgd + pgd_index(restore_jump_address), - __pgd(__pa(pud) | _KERNPG_TABLE)); - - return 0; -} - static void *alloc_pgt_page(void *context) { return (void *)get_safe_page(GFP_ATOMIC); @@ -87,7 +51,6 @@ static int set_up_temporary_mappings(voi struct x86_mapping_info info = { .alloc_pgt_page = alloc_pgt_page, .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, - .offset = __PAGE_OFFSET, }; unsigned long mstart, mend; pgd_t *pgd; @@ -99,11 +62,27 @@ static int set_up_temporary_mappings(voi return -ENOMEM; /* Prepare a temporary mapping for the kernel text */ - result = set_up_temporary_text_mapping(pgd); + /* + * The new mapping only has to cover the page containing the image + * kernel's entry point (jump_address_phys), because the switch over to + * it is carried out by relocated code running from a page allocated + * specifically for this purpose and covered by the identity mapping, so + * the temporary kernel text mapping is only needed for the final jump. + * Moreover, in that mapping the virtual address of the image kernel's + * entry point must be the same as its virtual address in the image + * kernel (restore_jump_address), so the image kernel's + * restore_registers() code doesn't find itself in a different area of + * the virtual address space after switching over to the original page + * tables used by the image kernel. + */ + info.offset = restore_jump_address - jump_address_phys; + result = kernel_ident_mapping_init(&info, pgd, jump_address_phys, + jump_address_phys + PMD_SIZE); if (result) return result; /* Set up the direct mapping from scratch */ + info.offset = __PAGE_OFFSET; for (i = 0; i < nr_pfn_mapped; i++) { mstart = pfn_mapped[i].start << PAGE_SHIFT; mend = pfn_mapped[i].end << PAGE_SHIFT; ^ permalink raw reply [flat|nested] 30+ messages in thread
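Taken together with the v2 rework of kernel_ident_mapping_init(), the simplification above makes both temporary mappings go through one helper, with only info.offset differing. The following is a condensed sketch of how set_up_temporary_mappings() would end up looking with these changes applied; it is not the actual committed code, comments are trimmed, and the CR3-ready temp_level4_pgt assignment from the __PAGE_OFFSET patch elsewhere in the thread is folded in.

static void *alloc_pgt_page(void *context)
{
	return (void *)get_safe_page(GFP_ATOMIC);
}

static int set_up_temporary_mappings(void)
{
	struct x86_mapping_info info = {
		.alloc_pgt_page	= alloc_pgt_page,
		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
	};
	unsigned long mstart, mend;
	pgd_t *pgd;
	int result;
	int i;

	pgd = (pgd_t *)get_safe_page(GFP_ATOMIC);
	if (!pgd)
		return -ENOMEM;

	/*
	 * Kernel text: map the PMD containing the image kernel's entry point
	 * at the virtual address the image kernel expects, by choosing the
	 * offset as the distance between the two.
	 */
	info.offset = restore_jump_address - jump_address_phys;
	result = kernel_ident_mapping_init(&info, pgd, jump_address_phys,
					   jump_address_phys + PMD_SIZE);
	if (result)
		return result;

	/* Direct mapping: physical ranges show up at phys + __PAGE_OFFSET. */
	info.offset = __PAGE_OFFSET;
	for (i = 0; i < nr_pfn_mapped; i++) {
		mstart = pfn_mapped[i].start << PAGE_SHIFT;
		mend = pfn_mapped[i].end << PAGE_SHIFT;

		result = kernel_ident_mapping_init(&info, pgd, mstart, mend);
		if (result)
			return result;
	}

	temp_level4_pgt = (unsigned long)pgd - __PAGE_OFFSET;
	return 0;
}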
* Re: [PATCH v2] x86/power/64: Support unaligned addresses for temporary mapping 2016-08-08 7:23 ` Yinghai Lu @ 2016-08-08 13:16 ` Rafael J. Wysocki 0 siblings, 0 replies; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-08 13:16 UTC (permalink / raw) To: Yinghai Lu Cc: Thomas Garnier, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Pavel Machek, the arch/x86 maintainers, Linux Kernel Mailing List, Linux PM list, kernel-hardening@lists.openwall.com On Monday, August 08, 2016 12:23:33 AM Yinghai Lu wrote: > On Mon, Aug 8, 2016 at 12:06 AM, Yinghai Lu <yinghai@kernel.org> wrote: > >> > >>> At the same time, set_up_temporary_text_mapping could be replaced with > >>> kernel_ident_mapping_init() too if restore_jump_address is KVA for > >>> jump_address_phys. > >> > >> I see no reason to do that. > >> > >> First, it is not guaranteed that restore_jump_address will always be a KVA for > >> jump_address_phys and second, it really is only necessary to map one PMD in > >> there. > > > > With your v2 version, you could pass difference between restore_jump_address and > > jump_address_phys as info->off ? > > With that, we can kill more lines if replace with > > set_up_temporary_text_mapping with > > kernel_ident_mapping_init() and make code more readable. > > > > But just keep that in separated patch after your v2 patch. > > like: > > --- > arch/x86/power/hibernate_64.c | 55 ++++++++++++------------------------------ > 1 file changed, 17 insertions(+), 38 deletions(-) > > Index: linux-2.6/arch/x86/power/hibernate_64.c > =================================================================== > --- linux-2.6.orig/arch/x86/power/hibernate_64.c > +++ linux-2.6/arch/x86/power/hibernate_64.c > @@ -41,42 +41,6 @@ unsigned long temp_level4_pgt __visible; > > unsigned long relocated_restore_code __visible; > > -static int set_up_temporary_text_mapping(pgd_t *pgd) > -{ > - pmd_t *pmd; > - pud_t *pud; > - > - /* > - * The new mapping only has to cover the page containing the image > - * kernel's entry point (jump_address_phys), because the switch over to > - * it is carried out by relocated code running from a page allocated > - * specifically for this purpose and covered by the identity mapping, so > - * the temporary kernel text mapping is only needed for the final jump. > - * Moreover, in that mapping the virtual address of the image kernel's > - * entry point must be the same as its virtual address in the image > - * kernel (restore_jump_address), so the image kernel's > - * restore_registers() code doesn't find itself in a different area of > - * the virtual address space after switching over to the original page > - * tables used by the image kernel. 
> - */ > - pud = (pud_t *)get_safe_page(GFP_ATOMIC); > - if (!pud) > - return -ENOMEM; > - > - pmd = (pmd_t *)get_safe_page(GFP_ATOMIC); > - if (!pmd) > - return -ENOMEM; > - > - set_pmd(pmd + pmd_index(restore_jump_address), > - __pmd((jump_address_phys & PMD_MASK) | __PAGE_KERNEL_LARGE_EXEC)); > - set_pud(pud + pud_index(restore_jump_address), > - __pud(__pa(pmd) | _KERNPG_TABLE)); > - set_pgd(pgd + pgd_index(restore_jump_address), > - __pgd(__pa(pud) | _KERNPG_TABLE)); > - > - return 0; > -} > - > static void *alloc_pgt_page(void *context) > { > return (void *)get_safe_page(GFP_ATOMIC); > @@ -87,7 +51,6 @@ static int set_up_temporary_mappings(voi > struct x86_mapping_info info = { > .alloc_pgt_page = alloc_pgt_page, > .pmd_flag = __PAGE_KERNEL_LARGE_EXEC, > - .offset = __PAGE_OFFSET, > }; > unsigned long mstart, mend; > pgd_t *pgd; > @@ -99,11 +62,27 @@ static int set_up_temporary_mappings(voi > return -ENOMEM; > > /* Prepare a temporary mapping for the kernel text */ > - result = set_up_temporary_text_mapping(pgd); > + /* > + * The new mapping only has to cover the page containing the image > + * kernel's entry point (jump_address_phys), because the switch over to > + * it is carried out by relocated code running from a page allocated > + * specifically for this purpose and covered by the identity mapping, so > + * the temporary kernel text mapping is only needed for the final jump. > + * Moreover, in that mapping the virtual address of the image kernel's > + * entry point must be the same as its virtual address in the image > + * kernel (restore_jump_address), so the image kernel's > + * restore_registers() code doesn't find itself in a different area of > + * the virtual address space after switching over to the original page > + * tables used by the image kernel. > + */ > + info.offset = restore_jump_address - jump_address_phys; > + result = kernel_ident_mapping_init(&info, pgd, jump_address_phys, > + jump_address_phys + PMD_SIZE); > if (result) > return result; > > /* Set up the direct mapping from scratch */ > + info.offset = __PAGE_OFFSET; > for (i = 0; i < nr_pfn_mapped; i++) { > mstart = pfn_mapped[i].start << PAGE_SHIFT; > mend = pfn_mapped[i].end << PAGE_SHIFT; OK, I see what you mean. Looks like a good idea to me. Thanks, Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore 2016-08-01 17:07 [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation Thomas Garnier 2016-08-01 17:07 ` [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping Thomas Garnier @ 2016-08-01 17:08 ` Thomas Garnier 2016-08-02 0:38 ` Rafael J. Wysocki 2016-08-01 23:48 ` [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation Rafael J. Wysocki 2 siblings, 1 reply; 30+ messages in thread From: Thomas Garnier @ 2016-08-01 17:08 UTC (permalink / raw) To: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Thomas Garnier, Yinghai Lu, Rafael J . Wysocki, Pavel Machek Cc: x86, linux-kernel, linux-pm, kernel-hardening When KASLR memory randomization is used, __PAGE_OFFSET is a global variable changed during boot. The assembly code was using the variable as an immediate value to calculate the cr3 physical address. The physical address was incorrect resulting to a GP fault. Signed-off-by: Thomas Garnier <thgarnie@google.com> --- arch/x86/power/hibernate_asm_64.S | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S index 8eee0e9..8db4905 100644 --- a/arch/x86/power/hibernate_asm_64.S +++ b/arch/x86/power/hibernate_asm_64.S @@ -23,6 +23,16 @@ #include <asm/processor-flags.h> #include <asm/frame.h> +/* + * A global variable holds the page_offset when KASLR memory randomization + * is enabled. + */ +#ifdef CONFIG_RANDOMIZE_MEMORY +#define __PAGE_OFFSET_REF __PAGE_OFFSET +#else +#define __PAGE_OFFSET_REF $__PAGE_OFFSET +#endif + ENTRY(swsusp_arch_suspend) movq $saved_context, %rax movq %rsp, pt_regs_sp(%rax) @@ -72,7 +82,7 @@ ENTRY(restore_image) /* code below has been relocated to a safe page */ ENTRY(core_restore_code) /* switch to temporary page tables */ - movq $__PAGE_OFFSET, %rcx + movq __PAGE_OFFSET_REF, %rcx subq %rcx, %rax movq %rax, %cr3 /* flush TLB */ -- 2.8.0.rc3.226.g39d4020 ^ permalink raw reply related [flat|nested] 30+ messages in thread
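The mechanism behind the fault is worth spelling out. With CONFIG_RANDOMIZE_MEMORY, the x86-64 headers turn __PAGE_OFFSET into a variable that is filled in during boot rather than a compile-time constant, roughly as sketched below (the exact definitions are quoted from memory of the v4.8-era headers and should be treated as an approximation, not a verbatim copy).

/*
 * Rough sketch of how __PAGE_OFFSET is defined once KASLR memory
 * randomization is enabled (approximation, see above):
 */
#ifdef CONFIG_RANDOMIZE_MEMORY
extern unsigned long page_offset_base;		/* set during boot by the KASLR code */
#define __PAGE_OFFSET	page_offset_base	/* a variable, not a constant */
#else
#define __PAGE_OFFSET	__PAGE_OFFSET_BASE	/* compile-time constant */
#endif

C code that subtracts __PAGE_OFFSET keeps working either way, but in assembly "movq $__PAGE_OFFSET, %rcx" then expands to "movq $page_offset_base, %rcx", which loads the address of the variable rather than its value, so the computed cr3 is garbage and the restore path faults. The __PAGE_OFFSET_REF define in the patch drops the '$' in the randomized case so that the value is loaded from memory instead.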
* Re: [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore 2016-08-01 17:08 ` [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore Thomas Garnier @ 2016-08-02 0:38 ` Rafael J. Wysocki 2016-08-02 14:34 ` Thomas Garnier 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-02 0:38 UTC (permalink / raw) To: Thomas Garnier Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, x86, linux-kernel, linux-pm, kernel-hardening On Monday, August 01, 2016 10:08:00 AM Thomas Garnier wrote: > When KASLR memory randomization is used, __PAGE_OFFSET is a global > variable changed during boot. The assembly code was using the variable > as an immediate value to calculate the cr3 physical address. The > physical address was incorrect resulting to a GP fault. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> > --- > arch/x86/power/hibernate_asm_64.S | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S > index 8eee0e9..8db4905 100644 > --- a/arch/x86/power/hibernate_asm_64.S > +++ b/arch/x86/power/hibernate_asm_64.S > @@ -23,6 +23,16 @@ > #include <asm/processor-flags.h> > #include <asm/frame.h> > > +/* > + * A global variable holds the page_offset when KASLR memory randomization > + * is enabled. > + */ > +#ifdef CONFIG_RANDOMIZE_MEMORY > +#define __PAGE_OFFSET_REF __PAGE_OFFSET > +#else > +#define __PAGE_OFFSET_REF $__PAGE_OFFSET > +#endif > + > ENTRY(swsusp_arch_suspend) > movq $saved_context, %rax > movq %rsp, pt_regs_sp(%rax) > @@ -72,7 +82,7 @@ ENTRY(restore_image) > /* code below has been relocated to a safe page */ > ENTRY(core_restore_code) > /* switch to temporary page tables */ > - movq $__PAGE_OFFSET, %rcx > + movq __PAGE_OFFSET_REF, %rcx > subq %rcx, %rax > movq %rax, %cr3 > /* flush TLB */ > I'm not particularly liking the #ifdefs and they won't be really necessary if the subtraction is carried out by the C code IMO. What about the patch below instead? 
--- arch/x86/power/hibernate_64.c | 18 +++++++++--------- arch/x86/power/hibernate_asm_64.S | 2 -- 2 files changed, 9 insertions(+), 11 deletions(-) Index: linux-pm/arch/x86/power/hibernate_asm_64.S =================================================================== --- linux-pm.orig/arch/x86/power/hibernate_asm_64.S +++ linux-pm/arch/x86/power/hibernate_asm_64.S @@ -72,8 +72,6 @@ ENTRY(restore_image) /* code below has been relocated to a safe page */ ENTRY(core_restore_code) /* switch to temporary page tables */ - movq $__PAGE_OFFSET, %rcx - subq %rcx, %rax movq %rax, %cr3 /* flush TLB */ movq %rbx, %rcx Index: linux-pm/arch/x86/power/hibernate_64.c =================================================================== --- linux-pm.orig/arch/x86/power/hibernate_64.c +++ linux-pm/arch/x86/power/hibernate_64.c @@ -37,11 +37,11 @@ unsigned long jump_address_phys; */ unsigned long restore_cr3 __visible; -pgd_t *temp_level4_pgt __visible; +unsigned long temp_level4_pgt __visible; unsigned long relocated_restore_code __visible; -static int set_up_temporary_text_mapping(void) +static int set_up_temporary_text_mapping(pgd_t *pgd) { pmd_t *pmd; pud_t *pud; @@ -71,7 +71,7 @@ static int set_up_temporary_text_mapping __pmd((jump_address_phys & PMD_MASK) | __PAGE_KERNEL_LARGE_EXEC)); set_pud(pud + pud_index(restore_jump_address), __pud(__pa(pmd) | _KERNPG_TABLE)); - set_pgd(temp_level4_pgt + pgd_index(restore_jump_address), + set_pgd(pgd + pgd_index(restore_jump_address), __pgd(__pa(pud) | _KERNPG_TABLE)); return 0; @@ -90,15 +90,16 @@ static int set_up_temporary_mappings(voi .kernel_mapping = true, }; unsigned long mstart, mend; + pgd_t *pgd; int result; int i; - temp_level4_pgt = (pgd_t *)get_safe_page(GFP_ATOMIC); - if (!temp_level4_pgt) + pgd = (pgd_t *)get_safe_page(GFP_ATOMIC); + if (!pgd) return -ENOMEM; /* Prepare a temporary mapping for the kernel text */ - result = set_up_temporary_text_mapping(); + result = set_up_temporary_text_mapping(pgd); if (result) return result; @@ -107,13 +108,12 @@ static int set_up_temporary_mappings(voi mstart = pfn_mapped[i].start << PAGE_SHIFT; mend = pfn_mapped[i].end << PAGE_SHIFT; - result = kernel_ident_mapping_init(&info, temp_level4_pgt, - mstart, mend); - + result = kernel_ident_mapping_init(&info, pgd, mstart, mend); if (result) return result; } + temp_level4_pgt = (unsigned long)pgd - __PAGE_OFFSET; return 0; } ^ permalink raw reply [flat|nested] 30+ messages in thread
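The key point of the alternative above is that the virtual-to-physical conversion of the temporary PGD can be done once in C, where it makes no difference whether __PAGE_OFFSET is a constant or a boot-time variable, leaving the relocated assembly with nothing to do but write the value into CR3. A minimal sketch of that idea, with a hypothetical helper name, not the patch itself:

/*
 * Illustration only: convert the temporary PGD to a physical address in C
 * and hand the assembly a value that can go into CR3 as-is.
 */
unsigned long temp_level4_pgt;	/* read by the relocated restore code */

static void store_temp_pgt(pgd_t *pgd)	/* hypothetical helper */
{
	/* For a direct-mapping address this is exactly __pa(pgd). */
	temp_level4_pgt = (unsigned long)pgd - __PAGE_OFFSET;
}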
* Re: [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore 2016-08-02 0:38 ` Rafael J. Wysocki @ 2016-08-02 14:34 ` Thomas Garnier 2016-08-02 20:47 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Thomas Garnier @ 2016-08-02 14:34 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, x86, LKML, Linux PM list, kernel-hardening On Mon, Aug 1, 2016 at 5:38 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Monday, August 01, 2016 10:08:00 AM Thomas Garnier wrote: >> When KASLR memory randomization is used, __PAGE_OFFSET is a global >> variable changed during boot. The assembly code was using the variable >> as an immediate value to calculate the cr3 physical address. The >> physical address was incorrect resulting to a GP fault. >> >> Signed-off-by: Thomas Garnier <thgarnie@google.com> >> --- >> arch/x86/power/hibernate_asm_64.S | 12 +++++++++++- >> 1 file changed, 11 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S >> index 8eee0e9..8db4905 100644 >> --- a/arch/x86/power/hibernate_asm_64.S >> +++ b/arch/x86/power/hibernate_asm_64.S >> @@ -23,6 +23,16 @@ >> #include <asm/processor-flags.h> >> #include <asm/frame.h> >> >> +/* >> + * A global variable holds the page_offset when KASLR memory randomization >> + * is enabled. >> + */ >> +#ifdef CONFIG_RANDOMIZE_MEMORY >> +#define __PAGE_OFFSET_REF __PAGE_OFFSET >> +#else >> +#define __PAGE_OFFSET_REF $__PAGE_OFFSET >> +#endif >> + >> ENTRY(swsusp_arch_suspend) >> movq $saved_context, %rax >> movq %rsp, pt_regs_sp(%rax) >> @@ -72,7 +82,7 @@ ENTRY(restore_image) >> /* code below has been relocated to a safe page */ >> ENTRY(core_restore_code) >> /* switch to temporary page tables */ >> - movq $__PAGE_OFFSET, %rcx >> + movq __PAGE_OFFSET_REF, %rcx >> subq %rcx, %rax >> movq %rax, %cr3 >> /* flush TLB */ >> > > I'm not particularly liking the #ifdefs and they won't be really > necessary if the subtraction is carried out by the C code IMO. > > What about the patch below instead? > Yes, I think that's a good idea. I will test it and send PATCH v2. Thanks for the quick feedback. 
> --- > arch/x86/power/hibernate_64.c | 18 +++++++++--------- > arch/x86/power/hibernate_asm_64.S | 2 -- > 2 files changed, 9 insertions(+), 11 deletions(-) > > Index: linux-pm/arch/x86/power/hibernate_asm_64.S > =================================================================== > --- linux-pm.orig/arch/x86/power/hibernate_asm_64.S > +++ linux-pm/arch/x86/power/hibernate_asm_64.S > @@ -72,8 +72,6 @@ ENTRY(restore_image) > /* code below has been relocated to a safe page */ > ENTRY(core_restore_code) > /* switch to temporary page tables */ > - movq $__PAGE_OFFSET, %rcx > - subq %rcx, %rax > movq %rax, %cr3 > /* flush TLB */ > movq %rbx, %rcx > Index: linux-pm/arch/x86/power/hibernate_64.c > =================================================================== > --- linux-pm.orig/arch/x86/power/hibernate_64.c > +++ linux-pm/arch/x86/power/hibernate_64.c > @@ -37,11 +37,11 @@ unsigned long jump_address_phys; > */ > unsigned long restore_cr3 __visible; > > -pgd_t *temp_level4_pgt __visible; > +unsigned long temp_level4_pgt __visible; > > unsigned long relocated_restore_code __visible; > > -static int set_up_temporary_text_mapping(void) > +static int set_up_temporary_text_mapping(pgd_t *pgd) > { > pmd_t *pmd; > pud_t *pud; > @@ -71,7 +71,7 @@ static int set_up_temporary_text_mapping > __pmd((jump_address_phys & PMD_MASK) | __PAGE_KERNEL_LARGE_EXEC)); > set_pud(pud + pud_index(restore_jump_address), > __pud(__pa(pmd) | _KERNPG_TABLE)); > - set_pgd(temp_level4_pgt + pgd_index(restore_jump_address), > + set_pgd(pgd + pgd_index(restore_jump_address), > __pgd(__pa(pud) | _KERNPG_TABLE)); > > return 0; > @@ -90,15 +90,16 @@ static int set_up_temporary_mappings(voi > .kernel_mapping = true, > }; > unsigned long mstart, mend; > + pgd_t *pgd; > int result; > int i; > > - temp_level4_pgt = (pgd_t *)get_safe_page(GFP_ATOMIC); > - if (!temp_level4_pgt) > + pgd = (pgd_t *)get_safe_page(GFP_ATOMIC); > + if (!pgd) > return -ENOMEM; > > /* Prepare a temporary mapping for the kernel text */ > - result = set_up_temporary_text_mapping(); > + result = set_up_temporary_text_mapping(pgd); > if (result) > return result; > > @@ -107,13 +108,12 @@ static int set_up_temporary_mappings(voi > mstart = pfn_mapped[i].start << PAGE_SHIFT; > mend = pfn_mapped[i].end << PAGE_SHIFT; > > - result = kernel_ident_mapping_init(&info, temp_level4_pgt, > - mstart, mend); > - > + result = kernel_ident_mapping_init(&info, pgd, mstart, mend); > if (result) > return result; > } > > + temp_level4_pgt = (unsigned long)pgd - __PAGE_OFFSET; > return 0; > } > > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore 2016-08-02 14:34 ` Thomas Garnier @ 2016-08-02 20:47 ` Rafael J. Wysocki 2016-08-02 20:59 ` Thomas Garnier 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-02 20:47 UTC (permalink / raw) To: Thomas Garnier Cc: Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, the arch/x86 maintainers, LKML, Linux PM list, kernel-hardening On Tue, Aug 2, 2016 at 4:34 PM, Thomas Garnier <thgarnie@google.com> wrote: > On Mon, Aug 1, 2016 at 5:38 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> On Monday, August 01, 2016 10:08:00 AM Thomas Garnier wrote: >>> When KASLR memory randomization is used, __PAGE_OFFSET is a global >>> variable changed during boot. The assembly code was using the variable >>> as an immediate value to calculate the cr3 physical address. The >>> physical address was incorrect resulting to a GP fault. >>> >>> Signed-off-by: Thomas Garnier <thgarnie@google.com> >>> --- >>> arch/x86/power/hibernate_asm_64.S | 12 +++++++++++- >>> 1 file changed, 11 insertions(+), 1 deletion(-) >>> >>> diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S >>> index 8eee0e9..8db4905 100644 >>> --- a/arch/x86/power/hibernate_asm_64.S >>> +++ b/arch/x86/power/hibernate_asm_64.S >>> @@ -23,6 +23,16 @@ >>> #include <asm/processor-flags.h> >>> #include <asm/frame.h> >>> >>> +/* >>> + * A global variable holds the page_offset when KASLR memory randomization >>> + * is enabled. >>> + */ >>> +#ifdef CONFIG_RANDOMIZE_MEMORY >>> +#define __PAGE_OFFSET_REF __PAGE_OFFSET >>> +#else >>> +#define __PAGE_OFFSET_REF $__PAGE_OFFSET >>> +#endif >>> + >>> ENTRY(swsusp_arch_suspend) >>> movq $saved_context, %rax >>> movq %rsp, pt_regs_sp(%rax) >>> @@ -72,7 +82,7 @@ ENTRY(restore_image) >>> /* code below has been relocated to a safe page */ >>> ENTRY(core_restore_code) >>> /* switch to temporary page tables */ >>> - movq $__PAGE_OFFSET, %rcx >>> + movq __PAGE_OFFSET_REF, %rcx >>> subq %rcx, %rax >>> movq %rax, %cr3 >>> /* flush TLB */ >>> >> >> I'm not particularly liking the #ifdefs and they won't be really >> necessary if the subtraction is carried out by the C code IMO. >> >> What about the patch below instead? >> > > Yes, I think that's a good idea. I will test it and send PATCH v2. No need to send this patch again. Please just let me know if it works for you. :-) Thanks, Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore 2016-08-02 20:47 ` Rafael J. Wysocki @ 2016-08-02 20:59 ` Thomas Garnier 2016-08-02 21:08 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Thomas Garnier @ 2016-08-02 20:59 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, the arch/x86 maintainers, LKML, Linux PM list, kernel-hardening On Tue, Aug 2, 2016 at 1:47 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Tue, Aug 2, 2016 at 4:34 PM, Thomas Garnier <thgarnie@google.com> wrote: >> On Mon, Aug 1, 2016 at 5:38 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>> On Monday, August 01, 2016 10:08:00 AM Thomas Garnier wrote: >>>> When KASLR memory randomization is used, __PAGE_OFFSET is a global >>>> variable changed during boot. The assembly code was using the variable >>>> as an immediate value to calculate the cr3 physical address. The >>>> physical address was incorrect resulting to a GP fault. >>>> >>>> Signed-off-by: Thomas Garnier <thgarnie@google.com> >>>> --- >>>> arch/x86/power/hibernate_asm_64.S | 12 +++++++++++- >>>> 1 file changed, 11 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S >>>> index 8eee0e9..8db4905 100644 >>>> --- a/arch/x86/power/hibernate_asm_64.S >>>> +++ b/arch/x86/power/hibernate_asm_64.S >>>> @@ -23,6 +23,16 @@ >>>> #include <asm/processor-flags.h> >>>> #include <asm/frame.h> >>>> >>>> +/* >>>> + * A global variable holds the page_offset when KASLR memory randomization >>>> + * is enabled. >>>> + */ >>>> +#ifdef CONFIG_RANDOMIZE_MEMORY >>>> +#define __PAGE_OFFSET_REF __PAGE_OFFSET >>>> +#else >>>> +#define __PAGE_OFFSET_REF $__PAGE_OFFSET >>>> +#endif >>>> + >>>> ENTRY(swsusp_arch_suspend) >>>> movq $saved_context, %rax >>>> movq %rsp, pt_regs_sp(%rax) >>>> @@ -72,7 +82,7 @@ ENTRY(restore_image) >>>> /* code below has been relocated to a safe page */ >>>> ENTRY(core_restore_code) >>>> /* switch to temporary page tables */ >>>> - movq $__PAGE_OFFSET, %rcx >>>> + movq __PAGE_OFFSET_REF, %rcx >>>> subq %rcx, %rax >>>> movq %rax, %cr3 >>>> /* flush TLB */ >>>> >>> >>> I'm not particularly liking the #ifdefs and they won't be really >>> necessary if the subtraction is carried out by the C code IMO. >>> >>> What about the patch below instead? >>> >> >> Yes, I think that's a good idea. I will test it and send PATCH v2. > > No need to send this patch again. Please just let me know if it works > for you. :-) > It worked well when I tested it and I agree that's a better approach. > Thanks, > Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore 2016-08-02 20:59 ` Thomas Garnier @ 2016-08-02 21:08 ` Rafael J. Wysocki 2016-08-02 23:19 ` [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-02 21:08 UTC (permalink / raw) To: Thomas Garnier Cc: Rafael J. Wysocki, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, the arch/x86 maintainers, LKML, Linux PM list, kernel-hardening On Tue, Aug 2, 2016 at 10:59 PM, Thomas Garnier <thgarnie@google.com> wrote: > On Tue, Aug 2, 2016 at 1:47 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: >> On Tue, Aug 2, 2016 at 4:34 PM, Thomas Garnier <thgarnie@google.com> wrote: >>> On Mon, Aug 1, 2016 at 5:38 PM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >>>> On Monday, August 01, 2016 10:08:00 AM Thomas Garnier wrote: >>>>> When KASLR memory randomization is used, __PAGE_OFFSET is a global >>>>> variable changed during boot. The assembly code was using the variable >>>>> as an immediate value to calculate the cr3 physical address. The >>>>> physical address was incorrect resulting to a GP fault. >>>>> >>>>> Signed-off-by: Thomas Garnier <thgarnie@google.com> >>>>> --- >>>>> arch/x86/power/hibernate_asm_64.S | 12 +++++++++++- >>>>> 1 file changed, 11 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S >>>>> index 8eee0e9..8db4905 100644 >>>>> --- a/arch/x86/power/hibernate_asm_64.S >>>>> +++ b/arch/x86/power/hibernate_asm_64.S >>>>> @@ -23,6 +23,16 @@ >>>>> #include <asm/processor-flags.h> >>>>> #include <asm/frame.h> >>>>> >>>>> +/* >>>>> + * A global variable holds the page_offset when KASLR memory randomization >>>>> + * is enabled. >>>>> + */ >>>>> +#ifdef CONFIG_RANDOMIZE_MEMORY >>>>> +#define __PAGE_OFFSET_REF __PAGE_OFFSET >>>>> +#else >>>>> +#define __PAGE_OFFSET_REF $__PAGE_OFFSET >>>>> +#endif >>>>> + >>>>> ENTRY(swsusp_arch_suspend) >>>>> movq $saved_context, %rax >>>>> movq %rsp, pt_regs_sp(%rax) >>>>> @@ -72,7 +82,7 @@ ENTRY(restore_image) >>>>> /* code below has been relocated to a safe page */ >>>>> ENTRY(core_restore_code) >>>>> /* switch to temporary page tables */ >>>>> - movq $__PAGE_OFFSET, %rcx >>>>> + movq __PAGE_OFFSET_REF, %rcx >>>>> subq %rcx, %rax >>>>> movq %rax, %cr3 >>>>> /* flush TLB */ >>>>> >>>> >>>> I'm not particularly liking the #ifdefs and they won't be really >>>> necessary if the subtraction is carried out by the C code IMO. >>>> >>>> What about the patch below instead? >>>> >>> >>> Yes, I think that's a good idea. I will test it and send PATCH v2. >> >> No need to send this patch again. Please just let me know if it works >> for you. :-) >> > > It worked well when I tested it and I agree that's a better approach. OK, thanks! Let me add a changelog to it then. ^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code 2016-08-02 21:08 ` Rafael J. Wysocki @ 2016-08-02 23:19 ` Rafael J. Wysocki 2016-08-05 10:37 ` Pavel Machek 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-02 23:19 UTC (permalink / raw) To: the arch/x86 maintainers, Linux PM list Cc: Rafael J. Wysocki, Thomas Garnier, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, LKML, kernel-hardening From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes a variable and using it as a symbol in the image memory restoration assembly code under core_restore_code is not correct any more. To avoid that problem, modify set_up_temporary_mappings() to compute the physical address of the temporary page tables and store it in temp_level4_pgt, so that the value of that variable is ready to be written into CR3. Then, the assembly code doesn't have to worry about converting that value into a physical address and things work regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set. Reported-and-tested-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- I'm going to queue this up, so if there are any objections/concerns about it, please let me know ASAP. Thanks, Rafael --- arch/x86/power/hibernate_64.c | 18 +++++++++--------- arch/x86/power/hibernate_asm_64.S | 2 -- 2 files changed, 9 insertions(+), 11 deletions(-) Index: linux-pm/arch/x86/power/hibernate_asm_64.S =================================================================== --- linux-pm.orig/arch/x86/power/hibernate_asm_64.S +++ linux-pm/arch/x86/power/hibernate_asm_64.S @@ -72,8 +72,6 @@ ENTRY(restore_image) /* code below has been relocated to a safe page */ ENTRY(core_restore_code) /* switch to temporary page tables */ - movq $__PAGE_OFFSET, %rcx - subq %rcx, %rax movq %rax, %cr3 /* flush TLB */ movq %rbx, %rcx Index: linux-pm/arch/x86/power/hibernate_64.c =================================================================== --- linux-pm.orig/arch/x86/power/hibernate_64.c +++ linux-pm/arch/x86/power/hibernate_64.c @@ -37,11 +37,11 @@ unsigned long jump_address_phys; */ unsigned long restore_cr3 __visible; -pgd_t *temp_level4_pgt __visible; +unsigned long temp_level4_pgt __visible; unsigned long relocated_restore_code __visible; -static int set_up_temporary_text_mapping(void) +static int set_up_temporary_text_mapping(pgd_t *pgd) { pmd_t *pmd; pud_t *pud; @@ -71,7 +71,7 @@ static int set_up_temporary_text_mapping __pmd((jump_address_phys & PMD_MASK) | __PAGE_KERNEL_LARGE_EXEC)); set_pud(pud + pud_index(restore_jump_address), __pud(__pa(pmd) | _KERNPG_TABLE)); - set_pgd(temp_level4_pgt + pgd_index(restore_jump_address), + set_pgd(pgd + pgd_index(restore_jump_address), __pgd(__pa(pud) | _KERNPG_TABLE)); return 0; @@ -90,15 +90,16 @@ static int set_up_temporary_mappings(voi .kernel_mapping = true, }; unsigned long mstart, mend; + pgd_t *pgd; int result; int i; - temp_level4_pgt = (pgd_t *)get_safe_page(GFP_ATOMIC); - if (!temp_level4_pgt) + pgd = (pgd_t *)get_safe_page(GFP_ATOMIC); + if (!pgd) return -ENOMEM; /* Prepare a temporary mapping for the kernel text */ - result = set_up_temporary_text_mapping(); + result = set_up_temporary_text_mapping(pgd); if (result) return result; @@ -107,13 +108,12 @@ static int set_up_temporary_mappings(voi mstart = pfn_mapped[i].start << PAGE_SHIFT; mend = pfn_mapped[i].end << PAGE_SHIFT; - result = 
kernel_ident_mapping_init(&info, temp_level4_pgt, - mstart, mend); - + result = kernel_ident_mapping_init(&info, pgd, mstart, mend); if (result) return result; } + temp_level4_pgt = (unsigned long)pgd - __PAGE_OFFSET; return 0; } ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code 2016-08-02 23:19 ` [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code Rafael J. Wysocki @ 2016-08-05 10:37 ` Pavel Machek 2016-08-05 14:44 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Pavel Machek @ 2016-08-05 10:37 UTC (permalink / raw) To: Rafael J. Wysocki Cc: the arch/x86 maintainers, Linux PM list, Rafael J. Wysocki, Thomas Garnier, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, LKML, kernel-hardening On Wed 2016-08-03 01:19:26, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes > a variable and using it as a symbol in the image memory restoration > assembly code under core_restore_code is not correct any more. On a related note... we should really have page_offset variable in such case, and use that -- having __FOO_BAR not being a constant is ugly/confusing/dangerous. > To avoid that problem, modify set_up_temporary_mappings() to compute > the physical address of the temporary page tables and store it in > temp_level4_pgt, so that the value of that variable is ready to be > written into CR3. Then, the assembly code doesn't have to worry > about converting that value into a physical address and things work > regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set. > > Reported-and-tested-by: Thomas Garnier <thgarnie@google.com> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Pavel Machek <pavel@ucw.cz> Is similar patch needed for i386? Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code 2016-08-05 10:37 ` Pavel Machek @ 2016-08-05 14:44 ` Rafael J. Wysocki 2016-08-05 15:21 ` Thomas Garnier 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-05 14:44 UTC (permalink / raw) To: Pavel Machek Cc: the arch/x86 maintainers, Linux PM list, Rafael J. Wysocki, Thomas Garnier, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, LKML, kernel-hardening On Friday, August 05, 2016 12:37:13 PM Pavel Machek wrote: > On Wed 2016-08-03 01:19:26, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes > > a variable and using it as a symbol in the image memory restoration > > assembly code under core_restore_code is not correct any more. > > On a related note... we should really have page_offset variable in > such case, and use that -- having __FOO_BAR not being a constant is > ugly/confusing/dangerous. > > > To avoid that problem, modify set_up_temporary_mappings() to compute > > the physical address of the temporary page tables and store it in > > temp_level4_pgt, so that the value of that variable is ready to be > > written into CR3. Then, the assembly code doesn't have to worry > > about converting that value into a physical address and things work > > regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set. > > > > Reported-and-tested-by: Thomas Garnier <thgarnie@google.com> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Acked-by: Pavel Machek <pavel@ucw.cz> > > Is similar patch needed for i386? Yes, it is, in general, for i386 hibernation to work with ASLR. But it doesn't work with it for other reasons ATM, AFAICS. Unfortunately, I won't really have the time to take care of this any time soon. Thanks, Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code 2016-08-05 14:44 ` Rafael J. Wysocki @ 2016-08-05 15:21 ` Thomas Garnier 2016-08-05 23:12 ` Rafael J. Wysocki 0 siblings, 1 reply; 30+ messages in thread From: Thomas Garnier @ 2016-08-05 15:21 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pavel Machek, the arch/x86 maintainers, Linux PM list, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, LKML, kernel-hardening On Fri, Aug 5, 2016 at 7:44 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Friday, August 05, 2016 12:37:13 PM Pavel Machek wrote: >> On Wed 2016-08-03 01:19:26, Rafael J. Wysocki wrote: >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> > >> > When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes >> > a variable and using it as a symbol in the image memory restoration >> > assembly code under core_restore_code is not correct any more. >> >> On a related note... we should really have page_offset variable in >> such case, and use that -- having __FOO_BAR not being a constant is >> ugly/confusing/dangerous. >> >> > To avoid that problem, modify set_up_temporary_mappings() to compute >> > the physical address of the temporary page tables and store it in >> > temp_level4_pgt, so that the value of that variable is ready to be >> > written into CR3. Then, the assembly code doesn't have to worry >> > about converting that value into a physical address and things work >> > regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set. >> > >> > Reported-and-tested-by: Thomas Garnier <thgarnie@google.com> >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> Acked-by: Pavel Machek <pavel@ucw.cz> >> >> Is similar patch needed for i386? > > Yes, it is, in general, for i386 hibernation to work with ASLR. > > But it doesn't work with it for other reasons ATM, AFAICS. > > Unfortunately, I won't really have the time to take care of this any time > soon. > KASLR memory randomization is only available for x64 right now. I plan on porting to 32bit eventually and will test/adapt hibernation as part of it. > Thanks, > Rafael > ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code 2016-08-05 15:21 ` Thomas Garnier @ 2016-08-05 23:12 ` Rafael J. Wysocki 2016-08-06 19:41 ` Pavel Machek 0 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-05 23:12 UTC (permalink / raw) To: Thomas Garnier Cc: Pavel Machek, the arch/x86 maintainers, Linux PM list, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, LKML, kernel-hardening On Friday, August 05, 2016 08:21:31 AM Thomas Garnier wrote: > On Fri, Aug 5, 2016 at 7:44 AM, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > > On Friday, August 05, 2016 12:37:13 PM Pavel Machek wrote: > >> On Wed 2016-08-03 01:19:26, Rafael J. Wysocki wrote: > >> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >> > > >> > When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes > >> > a variable and using it as a symbol in the image memory restoration > >> > assembly code under core_restore_code is not correct any more. > >> > >> On a related note... we should really have page_offset variable in > >> such case, and use that -- having __FOO_BAR not being a constant is > >> ugly/confusing/dangerous. > >> > >> > To avoid that problem, modify set_up_temporary_mappings() to compute > >> > the physical address of the temporary page tables and store it in > >> > temp_level4_pgt, so that the value of that variable is ready to be > >> > written into CR3. Then, the assembly code doesn't have to worry > >> > about converting that value into a physical address and things work > >> > regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set. > >> > > >> > Reported-and-tested-by: Thomas Garnier <thgarnie@google.com> > >> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > >> > >> Acked-by: Pavel Machek <pavel@ucw.cz> > >> > >> Is similar patch needed for i386? > > > > Yes, it is, in general, for i386 hibernation to work with ASLR. > > > > But it doesn't work with it for other reasons ATM, AFAICS. > > > > Unfortunately, I won't really have the time to take care of this any time > > soon. > > > > KASLR memory randomization is only available for x64 right now. I plan > on porting to 32bit eventually and will test/adapt hibernation as part > of it. Great to hear that, but you need to be aware that the i386 hibernate code has not been touched for a long time and it makes some heavy assumptions that are not made on x86-64. Please keep me and Pavel in the loop, though. Thanks, Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH] x86/power/64: Do not refer to __PAGE_OFFSET from assembly code 2016-08-05 23:12 ` Rafael J. Wysocki @ 2016-08-06 19:41 ` Pavel Machek 0 siblings, 0 replies; 30+ messages in thread From: Pavel Machek @ 2016-08-06 19:41 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Thomas Garnier, the arch/x86 maintainers, Linux PM list, Rafael J. Wysocki, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, LKML, kernel-hardening Hi! > > >> Is similar patch needed for i386? > > > > > > Yes, it is, in general, for i386 hibernation to work with ASLR. > > > > > > But it doesn't work with it for other reasons ATM, AFAICS. > > > > > > Unfortunately, I won't really have the time to take care of this any time > > > soon. > > > > > > > KASLR memory randomization is only available for x64 right now. I plan > > on porting to 32bit eventually and will test/adapt hibernation as part > > of it. > > Great to hear that, but you need to be aware that the i386 hibernate code has > not been touched for a long time and it makes some heavy assumptions that > are not made on x86-64. Yes, we did pretty bad job keeping i386 and x86-64 in sync. This should bring them closer together. (My original motivation was to enable hibernation and resume using differnet kernel versions. That worked. Merge with v4.7 changes was not trivial, but it still appears to work, probably doing some stuff that is not neccessary on 32-bit.) Signed-off-by: Pavel Machek <pavel@ucw.cz> (but cleanup before applying :-)) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 3a9add5..b5c48f1 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2236,7 +2236,7 @@ menu "Power management and ACPI options" config ARCH_HIBERNATION_HEADER def_bool y - depends on X86_64 && HIBERNATION + depends on HIBERNATION source "kernel/power/Kconfig" diff --git a/arch/x86/include/asm/suspend_32.h b/arch/x86/include/asm/suspend_32.h index 8e9dbe7..81c5bfc 100644 --- a/arch/x86/include/asm/suspend_32.h +++ b/arch/x86/include/asm/suspend_32.h @@ -25,4 +25,7 @@ struct saved_context { unsigned long return_address; } __attribute__((packed)); +extern char core_restore_code; +extern char restore_registers; + #endif /* _ASM_X86_SUSPEND_32_H */ diff --git a/arch/x86/power/hibernate.c b/arch/x86/power/hibernate.c new file mode 100644 index 0000000..c0b0572 --- /dev/null +++ b/arch/x86/power/hibernate.c @@ -0,0 +1,49 @@ +int reallocate_restore_code(void) +{ + relocated_restore_code = (void *)get_safe_page(GFP_ATOMIC); + if (!relocated_restore_code) + return -ENOMEM; + memcpy(relocated_restore_code, &core_restore_code, + &restore_registers - &core_restore_code); + return 0; +} + +struct restore_data_record { + unsigned long jump_address; + unsigned long jump_address_phys; + unsigned long cr3; + unsigned long magic; +}; + +/** + * arch_hibernation_header_save - populate the architecture specific part + * of a hibernation image header + * @addr: address to save the data at + */ +int arch_hibernation_header_save(void *addr, unsigned int max_size) +{ + struct restore_data_record *rdr = addr; + + if (max_size < sizeof(struct restore_data_record)) + return -EOVERFLOW; + rdr->jump_address = (unsigned long)&restore_registers; + rdr->jump_address_phys = __pa_symbol(&restore_registers); + rdr->cr3 = restore_cr3; + rdr->magic = RESTORE_MAGIC; + return 0; +} + +/** + * arch_hibernation_header_restore - read the architecture specific data + * from the hibernation image header + * @addr: address to read the data from + */ +int arch_hibernation_header_restore(void 
*addr) +{ + struct restore_data_record *rdr = addr; + + restore_jump_address = rdr->jump_address; + jump_address_phys = rdr->jump_address_phys; + restore_cr3 = rdr->cr3; + return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL; +} diff --git a/arch/x86/power/hibernate_32.c b/arch/x86/power/hibernate_32.c index 9f14bd3..784e6c7 100644 --- a/arch/x86/power/hibernate_32.c +++ b/arch/x86/power/hibernate_32.c @@ -4,6 +4,7 @@ * Distribute under GPLv2 * * Copyright (c) 2006 Rafael J. Wysocki <rjw@sisk.pl> + * Copyright (c) 2015 Pavel Machek <pavel@ucw.cz> */ #include <linux/gfp.h> @@ -14,13 +15,30 @@ #include <asm/pgtable.h> #include <asm/mmzone.h> #include <asm/sections.h> +#include <asm/suspend.h> +#include <asm/tlbflush.h> /* Defined in hibernate_asm_32.S */ extern int restore_image(void); +/* + * Address to jump to in the last phase of restore in order to get to the image + * kernel's text (this value is passed in the image header). + */ +unsigned long restore_jump_address __visible; +unsigned long jump_address_phys; + +/* + * Value of the cr3 register from before the hibernation (this value is passed + * in the image header). + */ +unsigned long restore_cr3 __visible; + /* Pointer to the temporary resume page tables */ pgd_t *resume_pg_dir; +void *relocated_restore_code __visible; + /* The following three functions are based on the analogous code in * arch/x86/mm/init_32.c */ @@ -142,6 +160,9 @@ static inline void resume_init_first_level_page_table(pgd_t *pg_dir) #endif } +#define RESTORE_MAGIC 0x1bea1e0UL +#include "hibernate.c" + int swsusp_arch_resume(void) { int error; @@ -155,6 +176,10 @@ int swsusp_arch_resume(void) if (error) return error; + error = reallocate_restore_code(); + if (error) + return error; + /* We have got enough memory and from now on we cannot recover */ restore_image(); return 0; diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c index f2b5e6a..8aea0a1 100644 --- a/arch/x86/power/hibernate_64.c +++ b/arch/x86/power/hibernate_64.c @@ -117,6 +117,9 @@ static int set_up_temporary_mappings(void) return 0; } +#define RESTORE_MAGIC 0x123456789ABCDEF0UL +#include "hibernate.c" + static int relocate_restore_code(void) { pgd_t *pgd; @@ -177,44 +180,3 @@ int pfn_is_nosave(unsigned long pfn) return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); } -struct restore_data_record { - unsigned long jump_address; - unsigned long jump_address_phys; - unsigned long cr3; - unsigned long magic; -}; - -#define RESTORE_MAGIC 0x123456789ABCDEF0UL - -/** - * arch_hibernation_header_save - populate the architecture specific part - * of a hibernation image header - * @addr: address to save the data at - */ -int arch_hibernation_header_save(void *addr, unsigned int max_size) -{ - struct restore_data_record *rdr = addr; - - if (max_size < sizeof(struct restore_data_record)) - return -EOVERFLOW; - rdr->jump_address = (unsigned long)&restore_registers; - rdr->jump_address_phys = __pa_symbol(&restore_registers); - rdr->cr3 = restore_cr3; - rdr->magic = RESTORE_MAGIC; - return 0; -} - -/** - * arch_hibernation_header_restore - read the architecture specific data - * from the hibernation image header - * @addr: address to read the data from - */ -int arch_hibernation_header_restore(void *addr) -{ - struct restore_data_record *rdr = addr; - - restore_jump_address = rdr->jump_address; - jump_address_phys = rdr->jump_address_phys; - restore_cr3 = rdr->cr3; - return (rdr->magic == RESTORE_MAGIC) ? 
0 : -EINVAL; -} diff --git a/arch/x86/power/hibernate_asm_32.S b/arch/x86/power/hibernate_asm_32.S index 1d0fa0e..f7e62d3 100644 --- a/arch/x86/power/hibernate_asm_32.S +++ b/arch/x86/power/hibernate_asm_32.S @@ -1,5 +1,14 @@ /* - * This may not use any stack, nor any variable that is not "NoSave": + * Hibernation support for i386 + * + * Distribute under GPLv2. + * + * Copyright 2007 Rafael J. Wysocki <rjw@sisk.pl> + * Copyright 2005 Andi Kleen <ak@suse.de> + * Copyright 2004, 2015 Pavel Machek <pavel@ucw.cz> + * + * swsusp_arch_resume must not use any stack or any nonlocal variables while + * copying pages: * * Its rewriting one kernel image with another. What is stack in "old" * image could very well be data page in "new" image, and overwriting @@ -23,6 +32,10 @@ ENTRY(swsusp_arch_suspend) pushfl popl saved_context_eflags + /* save cr3 */ + movl %cr3, %eax + movl %eax, restore_cr3 + call swsusp_save ret @@ -38,9 +51,27 @@ ENTRY(restore_image) movl %cr3, %eax; # flush TLB movl %eax, %cr3 1: + + /* prepare to jump to the image kernel */ + movl restore_jump_address, %eax + movl restore_cr3, %ebx + +#if 0 + FIXME + /* prepare to switch to temporary page tables */ + movq temp_level4_pgt(%rip), %rax + movq mmu_cr4_features(%rip), %rbx +#endif + + /* prepare to copy image data to their original locations */ movl restore_pblist, %edx + + /* jump to relocated restore code */ + movl relocated_restore_code, %ecx + jmpl *%ecx .p2align 4,,7 +ENTRY(core_restore_code) copy_loop: testl %edx, %edx jz done @@ -48,7 +79,7 @@ copy_loop: movl pbe_address(%edx), %esi movl pbe_orig_address(%edx), %edi - movl $1024, %ecx + movl $(PAGE_SIZE >> 2), %ecx rep movsl @@ -57,6 +88,20 @@ copy_loop: .p2align 4,,7 done: + /* jump to the restore_registers address from the image header */ + jmpl *%eax + /* + * NOTE: This assumes that the boot kernel's text mapping covers the + * image kernel's page containing restore_registers and the address of + * this page is the same as in the image kernel's text mapping (it + * should always be true, because the text mapping is linear, starting + * from 0, and is supposed to cover the entire kernel text for every + * kernel). + * + * code below belongs to the image kernel + */ + +ENTRY(restore_registers) /* go back to the original page tables */ movl $swapper_pg_dir, %eax subl $__PAGE_OFFSET, %eax @@ -81,4 +126,7 @@ done: xorl %eax, %eax + /* tell the hibernation core that we've just restored the memory */ + movl %eax, in_suspend + ret diff --git a/tools/testing/selftests/power/sleep b/tools/testing/selftests/power/sleep new file mode 100755 index 0000000..277d59d --- /dev/null +++ b/tools/testing/selftests/power/sleep @@ -0,0 +1,5 @@ +#!/bin/bash +echo 0 > /sys/class/rtc/rtc0/wakealarm +echo `date '+%s' -d '+ 1 minutes'` > /sys/class/rtc/rtc0/wakealarm +echo mem > /sys/power/state + -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation 2016-08-01 17:07 [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation Thomas Garnier 2016-08-01 17:07 ` [PATCH v1 1/2] x86/power/64: Support unaligned addresses for temporary mapping Thomas Garnier 2016-08-01 17:08 ` [PATCH v1 2/2] x86/power/64: Fix __PAGE_OFFSET usage on restore Thomas Garnier @ 2016-08-01 23:48 ` Rafael J. Wysocki 2016-08-02 0:47 ` Rafael J. Wysocki 2 siblings, 1 reply; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-01 23:48 UTC (permalink / raw) To: Thomas Garnier Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, x86, linux-kernel, linux-pm, kernel-hardening On Monday, August 01, 2016 10:07:58 AM Thomas Garnier wrote: > ***Background: > KASLR memory randomization for x86_64 was added when KASLR did not support hibernation. Now that it does, some changes are needed. > > ***Problems that needed solving: > Hibernation was failing on reboot with a GP fault when CONFIG_RANDOMIZE_MEMORY was enabled. Two issues were identified. > > The original fault was due to a wrong physical address assigned to cr3. The problem was introduced with __PAGE_OFFSET becoming a global variable when randomized. The fix uses a define to use the glbobal or immediate value based on config settings. > > The second isssue was that the temporary page table mapping did not support virtual addresses not aligned on PGD level. KASLR memory randomization will generated a random address aligned on PUD level. The fix correctly calculates the offset on all levels of the temporary page table. > > ***Parts: > - 01/02: Support unaligned addresses (second issue) > - 02/02: Fix __PAGE_OFFSET usage on assembly (first issue) Thanks a lot for taking care of this! Patch [2/2] looks good to me, but I'd to the [1/2] differently (more details will follow). Thanks, Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation 2016-08-01 23:48 ` [PATCH v1 0/2] x86/power/64: Make KASLR memory randomization compatible with hibernation Rafael J. Wysocki @ 2016-08-02 0:47 ` Rafael J. Wysocki 0 siblings, 0 replies; 30+ messages in thread From: Rafael J. Wysocki @ 2016-08-02 0:47 UTC (permalink / raw) To: Thomas Garnier Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Kees Cook, Yinghai Lu, Pavel Machek, x86, linux-kernel, linux-pm, kernel-hardening On Tuesday, August 02, 2016 01:48:13 AM Rafael J. Wysocki wrote: > On Monday, August 01, 2016 10:07:58 AM Thomas Garnier wrote: > > ***Background: > > KASLR memory randomization for x86_64 was added when KASLR did not support hibernation. Now that it does, some changes are needed. > > > > ***Problems that needed solving: > > Hibernation was failing on reboot with a GP fault when CONFIG_RANDOMIZE_MEMORY was enabled. Two issues were identified. > > > > The original fault was due to a wrong physical address assigned to cr3. The problem was introduced with __PAGE_OFFSET becoming a global variable when randomized. The fix uses a define to use the glbobal or immediate value based on config settings. > > > > The second isssue was that the temporary page table mapping did not support virtual addresses not aligned on PGD level. KASLR memory randomization will generated a random address aligned on PUD level. The fix correctly calculates the offset on all levels of the temporary page table. > > > > ***Parts: > > - 01/02: Support unaligned addresses (second issue) > > - 02/02: Fix __PAGE_OFFSET usage on assembly (first issue) > > Thanks a lot for taking care of this! > > Patch [2/2] looks good to me, but I'd to the [1/2] differently (more details > will follow). Well, I got this the other way around, sorry. I've just sent an ACK for the [1/2] and an alternative patch for the [2/2]. Thanks, Rafael ^ permalink raw reply [flat|nested] 30+ messages in thread