* [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization
@ 2019-01-31 19:24 Thomas Garnier
2019-01-31 19:24 ` [PATCH v6 19/27] kvm: Adapt assembly for PIE support Thomas Garnier
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Thomas Garnier @ 2019-01-31 19:24 UTC (permalink / raw)
To: kernel-hardening
Cc: Jan Kiszka, Pavel Machek, Andrey Ryabinin, Christoph Lameter,
Rafael Ávila de Espíndola, linux-arch, Andi Kleen,
Michael Ellerman, linux-sparse, xen-devel, Alexander Popov,
Len Brown, linux-pm, Nicholas Piggin, Cao jin, Mike Rapoport,
Andy Lutomirski, Dennis Zhou, Thomas Gleixner, nixiaoming,
Michal Marek, Greg Kroah-Hartman, Nick Desaulniers, linux-kernel
There has been no major concern raised in the latest iterations. I am
interested in what would be the best way to gradually integrate this patchset
upstream.
Changes:
- patch v6:
- Rebase on latest changes in jump tables and crypto.
- Fix wording on a couple of commits.
- Revisit checkpatch warnings.
- Move to the @chromium.org address.
- patch v5:
- Adapt new crypto modules for PIE.
- Improve per-cpu commit message.
- Fix xen 32-bit build error with .quad.
- Remove extra code for ftrace.
- patch v4:
- Simplify early boot by removing global variables.
- Modify the mcount location script to use __mcount_loc instead of the address
read in the ftrace implementation.
- Edit commit description to explain better where the kernel can be located.
- Streamline the testing done on each patch proposal: always test hibernation,
suspend, ftrace and kprobes to ensure no regressions.
- patch v3:
- Update the message to describe the longer-term PIE goal.
- Minor change to the ftrace if condition.
- Changed code using xchgq.
- patch v2:
- Adapt patch to work post-KPTI and compiler changes.
- Redo all performance testing with latest configs and compilers.
- Simplify mov macro on PIE (MOVABS now).
- Reduce GOT footprint.
- patch v1:
- Simplify ftrace implementation.
- Use gcc -mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce dynamic relocation space on
mapped memory. It also simplifies the relocation process.
- Move the start of the module section next to the kernel. Remove the need for
-mcmodel=large on modules. Extends the module space from 1G to 2G maximum.
- Support for XEN PVH as 32-bit relocations can be ignored with
--emit-relocs.
- Support for GOT relocations previously done automatically with -pie.
- Remove need for dynamic PLT in modules.
- Support dynamic GOT for modules.
- rfc v2:
- Add support for a global stack cookie, as the compiler defaults to an
fs-based canary when mcmodel=kernel is not used (see the sketch just after
this changelog).
- Change patch 7 to correctly jump out of the identity mapping on kexec load
preserve.
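To make the global stack cookie item above concrete, here is a minimal,
hypothetical sketch (not lifted from patch 20; it assumes a compiler that
supports -mstack-protector-guard=global, which makes gcc reference the
__stack_chk_guard symbol instead of a TLS slot):

#include <linux/cache.h>
#include <linux/init.h>
#include <linux/random.h>

/* Global canary compared in every instrumented function epilogue. */
unsigned long __stack_chk_guard __read_mostly;

static void __init stack_guard_init(void)
{
        /* Seed once at early boot. */
        __stack_chk_guard = get_random_long();
}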
These patches make the changes necessary to build the kernel as a Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space, which optionally extends the KASLR
randomization range from 1G to 3G. The chosen range is the one currently
available; future changes will allow kernel modules to have a wider
randomization range.
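To illustrate the constraint (an example constructed for this cover letter,
not taken from the diffs): with -mcmodel=kernel the compiler can embed a
symbol address as a 32-bit sign-extended immediate, which only resolves if
the kernel is linked in the top 2G, while PIE code must compute addresses
relative to the instruction pointer:

/* 'some_kernel_symbol' is a hypothetical global used for illustration. */
extern unsigned long some_kernel_symbol;

static unsigned long *addr_mcmodel_kernel(void)
{
        unsigned long *p;

        /* 32-bit absolute immediate (R_X86_64_32S): top-2G placement only. */
        asm ("movq $some_kernel_symbol, %0" : "=r" (p));
        return p;
}

static unsigned long *addr_pie(void)
{
        unsigned long *p;

        /* RIP-relative (R_X86_64_PC32): valid at any load address. */
        asm ("leaq some_kernel_symbol(%%rip), %0" : "=r" (p));
        return p;
}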
Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
changes, PIE support and KASLR in general. Thanks to Roland McGrath for his
feedback on using -pie versus --emit-relocs and for details on compiler code
generation.
The patches:
- 1-2, 4-13, 18-19: Change in assembly code to be PIE compliant.
- 3: Add a new _ASM_MOVABS macro to fetch a symbol address generically (a
sketch follows this list).
- 14: Adapt percpu design to work correctly when PIE is enabled.
- 15: Provide an option to default symbol visibility to hidden except for key
symbols. This removes errors between compilation units.
- 16: Add PROVIDE_HIDDEN replacement on the linker script for weak symbols to
reduce GOT footprint.
- 17: Adapt relocation tool to handle PIE binary correctly.
- 20: Add support for a global stack cookie.
- 21: Support ftrace with PIE (used in the Ubuntu config).
- 22: Add option to move the module section just after the kernel.
- 23: Adapt module loading to support PIE with dynamic GOT.
- 24: Make the GOT read-only.
- 25: Add the CONFIG_X86_PIE option (off by default).
- 26: Adapt relocation tool to generate a 64-bit relocation table.
- 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
from 1G to 3G (off by default).
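As a rough sketch of the patch 3 macro mentioned above (the authoritative
definition is in the patch itself; this reconstruction assumes the existing
__ASM_SEL() helper from arch/x86/include/asm/asm.h, which picks its first
form on 32-bit and its second on 64-bit):

/*
 * Sketch: load a symbol address into a register on both 32- and 64-bit.
 * On 64-bit, movabs carries a full 64-bit absolute immediate, so the
 * reference stays valid wherever the kernel is relocated (the relocation
 * tool fixes up the 64-bit slot); a plain mov suffices on 32-bit.
 */
#define _ASM_MOVABS     __ASM_SEL(movl, movabsq)

/* Hypothetical usage in inline assembly, mirroring patch 19 below: */
/*      _ASM_MOVABS " $label, %%" _ASM_AX       */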
Performance/Size impact:
Size of vmlinux (Default configuration):
File size:
- PIE disabled: +0.18%
- PIE enabled: -1.977% (fewer relocations)
.text section:
- PIE disabled: same
- PIE enabled: same
Size of vmlinux (Ubuntu configuration):
File size:
- PIE disabled: +0.21%
- PIE enabled: +10%
.text section:
- PIE disabled: same
- PIE enabled: +0.001%
The size increase is mainly due to not having access to the 32-bit signed
relocation that can be used with mcmodel=kernel (illustrated just below). A
small part is due to reduced optimization for PIE code. A gcc bug [1] was
opened to improve code generation for kernel PIE.
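To show where that cost comes from (example constructed here, not taken from
the series): mcmodel=kernel can use a symbol directly as the absolute base of
an indexed access, while PIE must first materialize the base, costing an
extra instruction and a scratch register:

/*
 * Indexed access to a hypothetical global array, in both models:
 *
 * mcmodel=kernel -- one instruction, symbol as 32-bit absolute base:
 *
 *      movq array(,%rdi,8), %rax       # R_X86_64_32S on 'array'
 *
 * PIE -- no absolute 32-bit base exists, so form it RIP-relatively:
 *
 *      leaq array(%rip), %rax          # R_X86_64_PC32
 *      movq (%rax,%rdi,8), %rax
 *
 * The same transformation is visible in the kvm.c hunk of patch 19 below.
 */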
Hackbench (50% and 1600% on thread/process for pipe/sockets):
- PIE disabled: no significant change (avg -/+ 0.5% on latest test).
- PIE enabled: between -1% and +1% on average (default and Ubuntu configs).
Kernbench (average of 10 Half and Optimal runs):
Elapsed Time:
- PIE disabled: no significant change (avg -0.5%)
- PIE enabled: average -0.5% to +0.5%
System Time:
- PIE disabled: no significant change (avg -0.1%)
- PIE enabled: average -0.4% to +0.4%.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
diffstat:
Documentation/x86/x86_64/mm.txt | 3
Makefile | 3
arch/x86/Kconfig | 45 ++++++
arch/x86/Makefile | 58 ++++++++
arch/x86/boot/boot.h | 2
arch/x86/boot/compressed/Makefile | 5
arch/x86/boot/compressed/misc.c | 10 +
arch/x86/crypto/aegis128-aesni-asm.S | 6
arch/x86/crypto/aegis128l-aesni-asm.S | 8 -
arch/x86/crypto/aegis256-aesni-asm.S | 6
arch/x86/crypto/aes-x86_64-asm_64.S | 45 ++++--
arch/x86/crypto/aesni-intel_asm.S | 8 -
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 42 +++---
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 +++---
arch/x86/crypto/camellia-x86_64-asm_64.S | 8 -
arch/x86/crypto/cast5-avx-x86_64-asm_64.S | 50 ++++---
arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++---
arch/x86/crypto/des3_ede-asm_64.S | 96 +++++++++-----
arch/x86/crypto/ghash-clmulni-intel_asm.S | 4
arch/x86/crypto/glue_helper-asm-avx.S | 4
arch/x86/crypto/glue_helper-asm-avx2.S | 6
arch/x86/crypto/morus1280-avx2-asm.S | 4
arch/x86/crypto/morus1280-sse2-asm.S | 8 -
arch/x86/crypto/morus640-sse2-asm.S | 6
arch/x86/crypto/sha256-avx2-asm.S | 23 ++-
arch/x86/entry/calling.h | 2
arch/x86/entry/entry_32.S | 3
arch/x86/entry/entry_64.S | 23 ++-
arch/x86/include/asm/alternative.h | 6
arch/x86/include/asm/asm.h | 1
arch/x86/include/asm/jump_label.h | 8 -
arch/x86/include/asm/kvm_host.h | 8 -
arch/x86/include/asm/module.h | 11 +
arch/x86/include/asm/page_64_types.h | 10 +
arch/x86/include/asm/paravirt_types.h | 12 +
arch/x86/include/asm/percpu.h | 25 ++-
arch/x86/include/asm/pgtable_64_types.h | 6
arch/x86/include/asm/pm-trace.h | 2
arch/x86/include/asm/processor.h | 13 +
arch/x86/include/asm/sections.h | 4
arch/x86/include/asm/setup.h | 2
arch/x86/include/asm/stackprotector.h | 19 ++
arch/x86/kernel/Makefile | 6
arch/x86/kernel/acpi/wakeup_64.S | 31 ++--
arch/x86/kernel/asm-offsets.c | 3
arch/x86/kernel/asm-offsets_32.c | 3
arch/x86/kernel/asm-offsets_64.c | 3
arch/x86/kernel/cpu/common.c | 3
arch/x86/kernel/cpu/microcode/core.c | 4
arch/x86/kernel/ftrace.c | 51 +++++++
arch/x86/kernel/head64.c | 23 ++-
arch/x86/kernel/head_32.S | 3
arch/x86/kernel/head_64.S | 31 +++-
arch/x86/kernel/kvm.c | 6
arch/x86/kernel/module.c | 181 ++++++++++++++++++++++++++-
arch/x86/kernel/module.lds | 3
arch/x86/kernel/process.c | 5
arch/x86/kernel/relocate_kernel_64.S | 2
arch/x86/kernel/setup_percpu.c | 5
arch/x86/kernel/vmlinux.lds.S | 13 +
arch/x86/kvm/svm.c | 4
arch/x86/kvm/vmx/vmx.c | 2
arch/x86/lib/cmpxchg16b_emu.S | 8 -
arch/x86/mm/dump_pagetables.c | 3
arch/x86/platform/pvh/head.S | 14 +-
arch/x86/power/hibernate_asm_64.S | 4
arch/x86/tools/relocs.c | 173 +++++++++++++++++++++++--
arch/x86/tools/relocs.h | 4
arch/x86/tools/relocs_common.c | 15 +-
arch/x86/xen/xen-asm.S | 12 -
arch/x86/xen/xen-head.S | 11 -
drivers/base/firmware_loader/main.c | 4
include/asm-generic/sections.h | 6
include/asm-generic/vmlinux.lds.h | 12 +
include/linux/compiler.h | 7 +
init/Kconfig | 16 ++
kernel/kallsyms.c | 16 +-
kernel/trace/trace.h | 4
lib/dynamic_debug.c | 4
scripts/link-vmlinux.sh | 14 ++
scripts/recordmcount.c | 78 +++++++----
81 files changed, 1130 insertions(+), 350 deletions(-)
Patchset is based on next-20190130.
* [PATCH v6 19/27] kvm: Adapt assembly for PIE support
  2019-01-31 19:24 [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization Thomas Garnier
@ 2019-01-31 19:24 ` Thomas Garnier
  2019-02-06 19:56   ` Sean Christopherson
  2019-01-31 19:59 ` [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization Kees Cook
  2019-01-31 21:40 ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Garnier @ 2019-01-31 19:24 UTC (permalink / raw)
  To: kernel-hardening
  Cc: kristen, Thomas Garnier, Paolo Bonzini, Radim Krčmář,
      Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
      x86, Joerg Roedel, kvm, linux-kernel

Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible. The new __ASM_MOVABS macro is used to
get the address of a symbol on both 32 and 64-bit with PIE support.

Position Independent Executable (PIE) support will allow to extend the
KASLR randomization range below 0xffffffff80000000.

Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
---
 arch/x86/include/asm/kvm_host.h | 8 ++++++--
 arch/x86/kernel/kvm.c           | 6 ++++--
 arch/x86/kvm/svm.c              | 4 ++--
 arch/x86/kvm/vmx/vmx.c          | 2 +-
 4 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4660ce90de7f..fdb3307d5fe1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1498,9 +1498,13 @@ asmlinkage void kvm_spurious_fault(void);
 	".pushsection .fixup, \"ax\" \n" \
 	"667: \n\t" \
 	cleanup_insn "\n\t" \
-	"cmpb $0, kvm_rebooting \n\t" \
+	"cmpb $0, kvm_rebooting" __ASM_SEL(, (%%rip)) " \n\t" \
 	"jne 668b \n\t" \
-	__ASM_SIZE(push) " $666b \n\t" \
+	__ASM_SIZE(push) "$0 \n\t" \
+	__ASM_SIZE(push) "%%" _ASM_AX " \n\t" \
+	_ASM_MOVABS " $666b, %%" _ASM_AX "\n\t" \
+	_ASM_MOV " %%" _ASM_AX ", " __ASM_SEL(4, 8) "(%%" _ASM_SP ") \n\t" \
+	__ASM_SIZE(pop) "%%" _ASM_AX " \n\t" \
 	"jmp kvm_spurious_fault \n\t" \
 	".popsection \n\t" \
 	_ASM_EXTABLE(666b, 667b)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 5c93a65ee1e5..f6eb02004e43 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -826,8 +826,10 @@ asm(
 ".global __raw_callee_save___kvm_vcpu_is_preempted;"
 ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
 "__raw_callee_save___kvm_vcpu_is_preempted:"
-"movq __per_cpu_offset(,%rdi,8), %rax;"
-"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
+"leaq __per_cpu_offset(%rip), %rax;"
+"movq (%rax,%rdi,8), %rax;"
+"addq " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"
+"cmpb $0, (%rax);"
 "setne %al;"
 "ret;"
 ".popsection");
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f13a3a24d360..26abb82b1b67 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -706,12 +706,12 @@ static u32 svm_msrpm_offset(u32 msr)
 
 static inline void clgi(void)
 {
-	asm volatile (__ex("clgi"));
+	asm volatile (__ex("clgi") : :);
 }
 
 static inline void stgi(void)
 {
-	asm volatile (__ex("stgi"));
+	asm volatile (__ex("stgi") : :);
 }
 
 static inline void invlpga(unsigned long addr, u32 asid)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4341175339f3..3275761a7375 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2161,7 +2161,7 @@ static void vmclear_local_loaded_vmcss(void)
  */
 static void kvm_cpu_vmxoff(void)
 {
-	asm volatile (__ex("vmxoff"));
+	asm volatile (__ex("vmxoff") :::);
 
 	intel_pt_handle_vmx(0);
 	cr4_clear_bits(X86_CR4_VMXE);
-- 
2.20.1.495.gaa96b0ce6b-goog
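Why the label push in the kvm_host.h hunk above must change under PIE (a
sketch of the 64-bit expansion; the instruction sequence mirrors the hunk,
the annotations are explanatory):

/*
 *	pushq $666b			# encodes the label as a sign-extended
 *					# imm32 -> needs an absolute
 *					# R_X86_64_32S relocation, which a
 *					# kernel below the top 2G cannot use
 *
 * The PIE-compatible sequence materializes the full 64-bit address in a
 * register and stores it into a reserved stack slot instead:
 *
 *	pushq $0			# reserve the slot
 *	pushq %rax			# preserve the scratch register
 *	movabsq $666b, %rax		# 64-bit absolute, fixed up by the
 *					# kernel relocation tool
 *	movq %rax, 8(%rsp)		# fill the reserved slot
 *	popq %rax			# restore the scratch register
 */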
* Re: [PATCH v6 19/27] kvm: Adapt assembly for PIE support
  2019-01-31 19:24 ` [PATCH v6 19/27] kvm: Adapt assembly for PIE support Thomas Garnier
@ 2019-02-06 19:56   ` Sean Christopherson
  2019-02-06 21:23     ` Thomas Garnier
  0 siblings, 1 reply; 7+ messages in thread
From: Sean Christopherson @ 2019-02-06 19:56 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, Paolo Bonzini, Radim Krčmář,
      Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
      x86, Joerg Roedel, kvm, linux-kernel

On Thu, Jan 31, 2019 at 11:24:26AM -0800, Thomas Garnier wrote:
> Change the assembly code to use only relative references of symbols for the
> kernel to be PIE compatible. The new __ASM_MOVABS macro is used to
> get the address of a symbol on both 32 and 64-bit with PIE support.
>
> Position Independent Executable (PIE) support will allow to extend the
> KASLR randomization range below 0xffffffff80000000.
>
> Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> ---
>  arch/x86/include/asm/kvm_host.h | 8 ++++++--
>  arch/x86/kernel/kvm.c           | 6 ++++--
>  arch/x86/kvm/svm.c              | 4 ++--
>  arch/x86/kvm/vmx/vmx.c          | 2 +-
>  4 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4660ce90de7f..fdb3307d5fe1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1498,9 +1498,13 @@ asmlinkage void kvm_spurious_fault(void);
> 	".pushsection .fixup, \"ax\" \n" \
> 	"667: \n\t" \
> 	cleanup_insn "\n\t" \
> -	"cmpb $0, kvm_rebooting \n\t" \
> +	"cmpb $0, kvm_rebooting" __ASM_SEL(, (%%rip)) " \n\t" \
> 	"jne 668b \n\t" \
> -	__ASM_SIZE(push) " $666b \n\t" \
> +	__ASM_SIZE(push) "$0 \n\t" \
> +	__ASM_SIZE(push) "%%" _ASM_AX " \n\t" \
> +	_ASM_MOVABS " $666b, %%" _ASM_AX "\n\t" \
> +	_ASM_MOV " %%" _ASM_AX ", " __ASM_SEL(4, 8) "(%%" _ASM_SP ") \n\t" \
> +	__ASM_SIZE(pop) "%%" _ASM_AX " \n\t" \

This blob isn't very intuitive to begin with, and the extra stack
shenanigans are a bit much when PIE is disabled. What about breaking
out the behavior to separate helper macros to keep the simpler code
for non-PIE and to make the code somewhat self-documenting? E.g.:

#ifndef CONFIG_X86_PIE
#define KVM_PUSH_FAULTING_INSN_RIP	__ASM_SIZE(push) " $666b \n\t"
#else
#define KVM_PUSH_FAULTING_INSN_RIP \
	__ASM_SIZE(push) "$0 \n\t" \
	__ASM_SIZE(push) "%%" _ASM_AX " \n\t" \
	_ASM_MOVABS " $666b, %%" _ASM_AX "\n\t" \
	_ASM_MOV " %%" _ASM_AX ", " __ASM_SEL(4, 8) "(%%" _ASM_SP ") \n\t" \
	__ASM_SIZE(pop) "%%" _ASM_AX " \n\t"
#endif

#define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \
	"666: " insn "\n\t" \
	"668: \n\t" \
	".pushsection .fixup, \"ax\" \n" \
	"667: \n\t" \
	cleanup_insn "\n\t" \
	"cmpb $0, kvm_rebooting" __ASM_SEL(, (%%rip)) " \n\t" \
	"jne 668b \n\t" \
	KVM_PUSH_FAULTING_INSN_RIP \
	"jmp kvm_spurious_fault \n\t" \
	".popsection \n\t" \
	_ASM_EXTABLE(666b, 667b)

> 	"jmp kvm_spurious_fault \n\t" \
> 	".popsection \n\t" \
> 	_ASM_EXTABLE(666b, 667b)
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 5c93a65ee1e5..f6eb02004e43 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c

This change to arch/x86/kernel/kvm.c should be done in a separate patch
as it affects the kernel itself when running as a guest under KVM,
whereas arch/x86/kvm/**/* and arch/x86/include/asm/kvm_host.h affect
KVM as a host, i.e. the KVM module. Case in point, the below bug causes
a kernel panic when running as a KVM guest but has no impact on the KVM
module.

> @@ -826,8 +826,10 @@ asm(
> ".global __raw_callee_save___kvm_vcpu_is_preempted;"
> ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
> "__raw_callee_save___kvm_vcpu_is_preempted:"
> -"movq __per_cpu_offset(,%rdi,8), %rax;"
> -"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
> +"leaq __per_cpu_offset(%rip), %rax;"
> +"movq (%rax,%rdi,8), %rax;"
> +"addq " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"

This is wrong, it's directly accessing the per-cpu offset of 'steal_time'
as a virtual address, e.g. without PIE enabled:

  0xffffffff8104820b <+11>: add 0x7efccffe(%rip),%rax  # 0x15210 <steal_time+16>

This results in kernel panics due to unhandled page faults:

  [ 0.001453] BUG: unable to handle kernel paging request at 0000000000015210
  [ 0.001453] #PF error: [normal kernel read fault]

I think you want something like the following, except that the whole
point of handcoded assembly is to avoid modifying registers other than
RAX, i.e. modifying RDI is a no-no.

  "leaq " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rdi;"
  "cmpb $0, (%rax,%rdi,1);"

And similar to the comment on ____kvm_handle_fault_on_reboot(), what
about wrapping the PIE-specific version in an ifdef?

> +"cmpb $0, (%rax);"
> "setne %al;"
> "ret;"
> ".popsection");
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index f13a3a24d360..26abb82b1b67 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -706,12 +706,12 @@ static u32 svm_msrpm_offset(u32 msr)
>
>  static inline void clgi(void)
>  {
> -	asm volatile (__ex("clgi"));
> +	asm volatile (__ex("clgi") : :);
>  }
>
>  static inline void stgi(void)
>  {
> -	asm volatile (__ex("stgi"));
> +	asm volatile (__ex("stgi") : :);
>  }
>
>  static inline void invlpga(unsigned long addr, u32 asid)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 4341175339f3..3275761a7375 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2161,7 +2161,7 @@ static void vmclear_local_loaded_vmcss(void)
>   */
>  static void kvm_cpu_vmxoff(void)
>  {
> -	asm volatile (__ex("vmxoff"));
> +	asm volatile (__ex("vmxoff") :::);
>
>  	intel_pt_handle_vmx(0);
>  	cr4_clear_bits(X86_CR4_VMXE);
> --
> 2.20.1.495.gaa96b0ce6b-goog
* Re: [PATCH v6 19/27] kvm: Adapt assembly for PIE support
  2019-02-06 19:56   ` Sean Christopherson
@ 2019-02-06 21:23     ` Thomas Garnier
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Garnier @ 2019-02-06 21:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Kernel Hardening, Kristen Carlson Accardi, Paolo Bonzini,
      Radim Krčmář, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      H. Peter Anvin, the arch/x86 maintainers, Joerg Roedel, kvm list,
      LKML

On Wed, Feb 6, 2019 at 11:56 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Thu, Jan 31, 2019 at 11:24:26AM -0800, Thomas Garnier wrote:
> > Change the assembly code to use only relative references of symbols for the
> > kernel to be PIE compatible. The new __ASM_MOVABS macro is used to
> > get the address of a symbol on both 32 and 64-bit with PIE support.
> >
> > Position Independent Executable (PIE) support will allow to extend the
> > KASLR randomization range below 0xffffffff80000000.
> >
> > Signed-off-by: Thomas Garnier <thgarnie@chromium.org>
> > ---
> >  arch/x86/include/asm/kvm_host.h | 8 ++++++--
> >  arch/x86/kernel/kvm.c           | 6 ++++--
> >  arch/x86/kvm/svm.c              | 4 ++--
> >  arch/x86/kvm/vmx/vmx.c          | 2 +-
> >  4 files changed, 13 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 4660ce90de7f..fdb3307d5fe1 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1498,9 +1498,13 @@ asmlinkage void kvm_spurious_fault(void);
> >       ".pushsection .fixup, \"ax\" \n" \
> >       "667: \n\t" \
> >       cleanup_insn "\n\t" \
> > -     "cmpb $0, kvm_rebooting \n\t" \
> > +     "cmpb $0, kvm_rebooting" __ASM_SEL(, (%%rip)) " \n\t" \
> >       "jne 668b \n\t" \
> > -     __ASM_SIZE(push) " $666b \n\t" \
> > +     __ASM_SIZE(push) "$0 \n\t" \
> > +     __ASM_SIZE(push) "%%" _ASM_AX " \n\t" \
> > +     _ASM_MOVABS " $666b, %%" _ASM_AX "\n\t" \
> > +     _ASM_MOV " %%" _ASM_AX ", " __ASM_SEL(4, 8) "(%%" _ASM_SP ") \n\t" \
> > +     __ASM_SIZE(pop) "%%" _ASM_AX " \n\t" \
>
> This blob isn't very intuitive to begin with, and the extra stack
> shenanigans are a bit much when PIE is disabled. What about breaking
> out the behavior to separate helper macros to keep the simpler code
> for non-PIE and to make the code somewhat self-documenting? E.g.:
>
> #ifndef CONFIG_X86_PIE
> #define KVM_PUSH_FAULTING_INSN_RIP  __ASM_SIZE(push) " $666b \n\t"
> #else
> #define KVM_PUSH_FAULTING_INSN_RIP \
>         __ASM_SIZE(push) "$0 \n\t" \
>         __ASM_SIZE(push) "%%" _ASM_AX " \n\t" \
>         _ASM_MOVABS " $666b, %%" _ASM_AX "\n\t" \
>         _ASM_MOV " %%" _ASM_AX ", " __ASM_SEL(4, 8) "(%%" _ASM_SP ") \n\t" \
>         __ASM_SIZE(pop) "%%" _ASM_AX " \n\t"
> #endif
>
> #define ____kvm_handle_fault_on_reboot(insn, cleanup_insn) \
>         "666: " insn "\n\t" \
>         "668: \n\t" \
>         ".pushsection .fixup, \"ax\" \n" \
>         "667: \n\t" \
>         cleanup_insn "\n\t" \
>         "cmpb $0, kvm_rebooting" __ASM_SEL(, (%%rip)) " \n\t" \
>         "jne 668b \n\t" \
>         KVM_PUSH_FAULTING_INSN_RIP \
>         "jmp kvm_spurious_fault \n\t" \
>         ".popsection \n\t" \
>         _ASM_EXTABLE(666b, 667b)
>
> > "jmp kvm_spurious_fault \n\t" \
> > ".popsection \n\t" \
> > _ASM_EXTABLE(666b, 667b)
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 5c93a65ee1e5..f6eb02004e43 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
>
> This change to arch/x86/kernel/kvm.c should be done in a separate patch
> as it affects the kernel itself when running as a guest under KVM,
> whereas arch/x86/kvm/**/* and arch/x86/include/asm/kvm_host.h affect
> KVM as a host, i.e. the KVM module. Case in point, the below bug causes
> a kernel panic when running as a KVM guest but has no impact on the KVM
> module.

Got it, will split in next iteration.

> > @@ -826,8 +826,10 @@ asm(
> > ".global __raw_callee_save___kvm_vcpu_is_preempted;"
> > ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
> > "__raw_callee_save___kvm_vcpu_is_preempted:"
> > -"movq __per_cpu_offset(,%rdi,8), %rax;"
> > -"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
> > +"leaq __per_cpu_offset(%rip), %rax;"
> > +"movq (%rax,%rdi,8), %rax;"
> > +"addq " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rax;"
>
> This is wrong, it's directly accessing the per-cpu offset of 'steal_time'
> as a virtual address, e.g. without PIE enabled:
>
>   0xffffffff8104820b <+11>: add 0x7efccffe(%rip),%rax  # 0x15210 <steal_time+16>
>
> This results in kernel panics due to unhandled page faults:
>
>   [ 0.001453] BUG: unable to handle kernel paging request at 0000000000015210
>   [ 0.001453] #PF error: [normal kernel read fault]

Yes, I think something went wrong in rebasing. Thanks for pointing it
out.

> I think you want something like the following, except that the whole
> point of handcoded assembly is to avoid modifying registers other than
> RAX, i.e. modifying RDI is a no-no.
>
>   "leaq " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rip), %rdi;"
>   "cmpb $0, (%rax,%rdi,1);"
>
> And similar to the comment on ____kvm_handle_fault_on_reboot(), what
> about wrapping the PIE-specific version in an ifdef?

I will look into this and try your approach.

> > +"cmpb $0, (%rax);"
> > "setne %al;"
> > "ret;"
> > ".popsection");
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index f13a3a24d360..26abb82b1b67 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -706,12 +706,12 @@ static u32 svm_msrpm_offset(u32 msr)
> >
> >  static inline void clgi(void)
> >  {
> > -       asm volatile (__ex("clgi"));
> > +       asm volatile (__ex("clgi") : :);
> >  }
> >
> >  static inline void stgi(void)
> >  {
> > -       asm volatile (__ex("stgi"));
> > +       asm volatile (__ex("stgi") : :);
> >  }
> >
> >  static inline void invlpga(unsigned long addr, u32 asid)
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 4341175339f3..3275761a7375 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -2161,7 +2161,7 @@ static void vmclear_local_loaded_vmcss(void)
> >   */
> >  static void kvm_cpu_vmxoff(void)
> >  {
> > -       asm volatile (__ex("vmxoff"));
> > +       asm volatile (__ex("vmxoff") :::);
> >
> >       intel_pt_handle_vmx(0);
> >       cr4_clear_bits(X86_CR4_VMXE);
> > --
> > 2.20.1.495.gaa96b0ce6b-goog
> >

Thanks for the feedback.
* Re: [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization
  2019-01-31 19:24 [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization Thomas Garnier
  2019-01-31 19:24 ` [PATCH v6 19/27] kvm: Adapt assembly for PIE support Thomas Garnier
@ 2019-01-31 19:59 ` Kees Cook
  2019-01-31 21:40 ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 7+ messages in thread
From: Kees Cook @ 2019-01-31 19:59 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Kernel Hardening, Kristen Carlson Accardi, Thomas Gleixner,
      Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML,
      Jonathan Corbet, Masahiro Yamada, Michal Marek, Herbert Xu,
      David S. Miller, Andy Lutomirski, Paolo Bonzini,
      Radim Krčmář, Juergen Gross, Alok Kataria, Dennis Zhou,
      Tejun Heo, Christoph Lameter, Rafael J. Wysocki, Len Brown,
      Pavel Machek

On Fri, Feb 1, 2019 at 8:28 AM Thomas Garnier <thgarnie@chromium.org> wrote:
> These patches make the changes necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
> the top 2G of the virtual address space. It allows to optionally extend the
> KASLR randomization range from 1G to 3G. The chosen range is the one currently
> available, future changes will allow the kernel module to have a wider
> randomization range.

This also lays the groundwork for doing compilation-unit-granularity
KASLR, as Kristen has been working on. With PIE working, the
relocations are more sane and boot-time reordering becomes possible
(or at least, it becomes the same logically as doing the work on
modules, etc).

-- 
Kees Cook
* Re: [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization
  2019-01-31 19:24 [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization Thomas Garnier
  2019-01-31 19:24 ` [PATCH v6 19/27] kvm: Adapt assembly for PIE support Thomas Garnier
  2019-01-31 19:59 ` [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization Kees Cook
@ 2019-01-31 21:40 ` Konrad Rzeszutek Wilk
  2019-01-31 22:42   ` Thomas Garnier
  2 siblings, 1 reply; 7+ messages in thread
From: Konrad Rzeszutek Wilk @ 2019-01-31 21:40 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: kernel-hardening, kristen, Thomas Gleixner, Ingo Molnar,
      Borislav Petkov, H. Peter Anvin, x86, Jonathan Corbet,
      Masahiro Yamada, Michal Marek, Herbert Xu, David S. Miller,
      Andy Lutomirski, Paolo Bonzini, Radim Krčmář, Juergen Gross,
      Alok Kataria, Dennis Zhou, Tejun Heo, Christoph Lameter,
      Rafael J. Wysocki, Len Brown

On Thu, Jan 31, 2019 at 11:24:07AM -0800, Thomas Garnier wrote:
> There has been no major concern in the latest iterations. I am interested on
> what would be the best way to slowly integrate this patchset upstream.

One question that I was somehow expected in this cover letter - what
about all those lovely speculative bugs? As in say some one hasn't
updated their machine with the Spectre v3a microcode - wouldn't they
be able to get the kernel virtual address space?

In effect rendering all this hard-work not needed?
* Re: [PATCH v6 00/27] x86: PIE support and option to extend KASLR randomization
  2019-01-31 21:40 ` Konrad Rzeszutek Wilk
@ 2019-01-31 22:42   ` Thomas Garnier
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Garnier @ 2019-01-31 22:42 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Kernel Hardening, kristen, Thomas Gleixner, Ingo Molnar,
      Borislav Petkov, H. Peter Anvin, the arch/x86 maintainers,
      Jonathan Corbet, Masahiro Yamada, Michal Marek, Herbert Xu,
      David S. Miller, Andy Lutomirski, Paolo Bonzini,
      Radim Krčmář, Juergen Gross, Alok Kataria, Dennis Zhou,
      Tejun Heo, Christoph Lameter, Rafael J. Wysocki, Len Brown,
      Pavel Machek

On Thu, Jan 31, 2019 at 1:41 PM Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
>
> On Thu, Jan 31, 2019 at 11:24:07AM -0800, Thomas Garnier wrote:
> > There has been no major concern in the latest iterations. I am interested on
> > what would be the best way to slowly integrate this patchset upstream.
>
> One question that I was somehow expected in this cover letter - what
> about all those lovely speculative bugs? As in say some one hasn't
> updated their machine with the Spectre v3a microcode - wouldn't they
> be able to get the kernel virtual address space?

Yes they would be.

> In effect rendering all this hard-work not needed?

Only if we think Spectre bugs will never be fixed.