* CFP WEBIST 2018 - 14th Int.l Conf. on Web Information Systems and Technologies (Seville/Spain)
From: webist @ 2018-03-12 18:14 UTC (permalink / raw)
To: virtualization
SUBMISSION DEADLINE
14th International Conference on Web Information Systems and Technologies
Submission Deadline: May 2, 2018
http://www.webist.org/
September 18 - 20, 2018
Seville, Spain.
WEBIST is organized in 5 major tracks:
- Internet Technology
- Mobile and NLP Information Systems
- Service Based Information Systems, Platforms and Eco-Systems
- Web Intelligence
- Web Interfaces
A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
Should you have any question please don’t hesitate contacting me.
Kind regards,
WEBIST Secretariat
Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 100 033
Fax: +351 265 520 186
Web: http://www.webist.org/
e-mail: webist.secretariat@insticc.org
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* CFP WINSYS 2018 - 15th Int.l Conf. on Wireless Networks and Mobile Systems (Porto/Portugal)
From: icete @ 2018-03-12 18:57 UTC (permalink / raw)
To: virtualization
SUBMISSION DEADLINE
15th International Conference on Wireless Networks and Mobile Systems
Submission Deadline: April 3, 2018
http://www.winsys.icete.org/
July 26 - 28, 2018
Porto, Portugal.
WINSYS is organized in 3 major tracks:
- Sensor Networks and Ad Hoc Communications
- Wireless and Mobile Technologies
- Mobile Software and Services
Technically Co-sponsored by IEEE Systems Council.
With the presence of internationally distinguished keynote speakers:
George Karagiannidis , Aristotle University of Thessaloniki, Greece
Zeno Geradts, Netherlands Forensic Institute, Netherlands
Jan Mendling, Vienna University of Economics and Business, Austria
A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
Should you have any question please don’t hesitate contacting me.
Kind regards,
WINSYS Secretariat
Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 185
Fax: +351 265 520 186
Web: http://www.winsys.icete.org/
e-mail: winsys.secretariat@insticc.org
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* CFP SECRYPT 2018 - 15th Int.l Conf. on Security and Cryptography (Porto/Portugal)
From: icete @ 2018-03-12 18:57 UTC (permalink / raw)
To: virtualization
SUBMISSION DEADLINE
15th International Conference on Security and Cryptography
Submission Deadline: April 3, 2018
http://www.secrypt.icete.org/
July 26 - 28, 2018
Porto, Portugal.
Technically Co-sponsored by IEEE Systems Council.
With the presence of internationally distinguished keynote speakers:
George Karagiannidis , Aristotle University of Thessaloniki, Greece
Zeno Geradts, Netherlands Forensic Institute, Netherlands
Jan Mendling, Vienna University of Economics and Business, Austria
A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
Should you have any question please don’t hesitate contacting me.
Kind regards,
SECRYPT Secretariat
Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 185
Fax: +351 265 520 186
Web: http://www.secrypt.icete.org/
e-mail: secrypt.secretariat@insticc.org
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* CFP PECCS 2018 - 8th Int.l Joint Conf. on Pervasive and Embedded Computing and Communication Systems (Porto/Portugal)
From: peccs @ 2018-03-12 18:57 UTC (permalink / raw)
To: virtualization
SUBMISSION DEADLINE
8th International Joint Conference on Pervasive and Embedded Computing and Communication Systems
Submission Deadline: April 3, 2018
http://www.peccs.org/
July 29 - 30, 2018
Porto, Portugal.
In Cooperation with EUROMICRO.
With the presence of internationally distinguished keynote speakers:
Stephan Weiss, University of Strathclyde, United Kingdom
A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
Should you have any question please don’t hesitate contacting me.
Kind regards,
PECCS Secretariat
Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 184
Fax: +351 265 520 186
Web: http://www.peccs.org/
e-mail: peccs.secretariat@insticc.org
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* CFP DCNET 2018 - 9th Int.l Conf. on Data Communication Networking (Porto/Portugal)
From: icete @ 2018-03-12 18:57 UTC (permalink / raw)
To: virtualization
SUBMISSION DEADLINE
9th International Conference on Data Communication Networking
Submission Deadline: April 3, 2018
http://www.dcnet.icete.org/
July 26 - 28, 2018
Porto, Portugal.
Technically Co-sponsored by IEEE Systems Council.
With the presence of internationally distinguished keynote speakers:
George Karagiannidis , Aristotle University of Thessaloniki, Greece
Zeno Geradts, Netherlands Forensic Institute, Netherlands
Jan Mendling, Vienna University of Economics and Business, Austria
A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
Should you have any question please don’t hesitate contacting me.
Kind regards,
DCNET Secretariat
Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 185
Fax: +351 265 520 186
Web: http://www.dcnet.icete.org/
e-mail: dcnet.secretariat@insticc.org
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* CFP ICETE 2018 - 15th Int.l Joint Conf. on e-Business and Telecommunications (Porto/Portugal)
From: icete @ 2018-03-12 18:58 UTC (permalink / raw)
To: virtualization
SUBMISSION DEADLINE
15th International Joint Conference on e-Business and Telecommunications
Submission Deadline: April 3, 2018
http://www.icete.org/
July 26 - 28, 2018
Porto, Portugal.
Technically Co-sponsored by IEEE Systems Council.
In Cooperation with EOS, Photonics21, ACM SIGMM.
With the presence of internationally distinguished keynote speakers:
George Karagiannidis , Aristotle University of Thessaloniki, Greece
Zeno Geradts, Netherlands Forensic Institute, Netherlands
Jan Mendling, Vienna University of Economics and Business, Austria
A short list of presented papers will be selected so that revised and extended versions of these papers will be published by Springer.
All papers presented at the congress venue will also be available at the SCITEPRESS Digital Library (http://www.scitepress.org/DigitalLibrary/).
Should you have any question please don’t hesitate contacting me.
Kind regards,
ICETE Secretariat
Address: Av. D. Manuel I, 27A, 2º esq.
2910-595 Setubal, Portugal
Tel: +351 265 520 185
Fax: +351 265 520 186
Web: http://www.icete.org/
e-mail: icete.secretariat@insticc.org
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply
* Re: virtio-gpu: Hang on shutdown after suspend/resume with virtio
From: Christian Borntraeger @ 2018-03-13 13:32 UTC (permalink / raw)
To: Gerd Hoffmann, kvm list, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org
In-Reply-To: <92144051-e8ac-de5d-3fac-87dfd0e8270b@de.ibm.com>
On 03/13/2018 01:41 PM, Christian Borntraeger wrote:
> Gerd,
>
> another thing with virtio-gpu.
>
> I can successfully do suspend/resume (echo disk > /sys/power/state) on my system. As soon as I have a
> virtio-gpu the system hangs on reboot/shutdown:
>
> e.g.
>
> crash> bt 1
> PID: 1 TASK: 6bef0000 CPU: 4 COMMAND: "systemd"
> #0 [6beef4c0] __schedule at a55d60
> #1 [6beef530] schedule at a563f2
> #2 [6beef548] virtio_gpu_queue_ctrl_buffer_locked at 7ea708
> #3 [6beef650] virtio_gpu_queue_ctrl_buffer at 7ea830
> #4 [6beef680] virtio_gpu_primary_plane_update at 7ed5ee
> #5 [6beef710] drm_atomic_helper_commit_planes at 7a2b46
> #6 [6beef760] vgdev_atomic_commit_tail at 7e9dbc
> #7 [6beef790] commit_tail at 7a7126
> #8 [6beef7c0] drm_atomic_helper_commit at 7a73fe
> #9 [6beef800] restore_fbdev_mode_atomic at 7aa38a
> #10 [6beef8a0] drm_fb_helper_pan_display at 7ab7aa
> #11 [6beef8f0] fb_pan_display at 74a248
> #12 [6beef928] bit_update_start at 7573c0
> #13 [6beef958] fbcon_switch at 7542b4
> #14 [6beefa50] redraw_screen at 77b2b8
> #15 [6beefaa0] csi_J at 77b476
> #16 [6beefad8] do_con_trol at 77ee12
> #17 [6beefb40] do_con_write at 77f80c
> #18 [6beefc10] con_write at 78003c
> #19 [6beefc40] n_tty_write at 7662d4
> #20 [6beefce0] tty_write at 761734
> #21 [6beefd48] __vfs_write at 3483cc
> #22 [6beefe00] vfs_write at 3486e8
> #23 [6beefe60] sys_write at 3489de
> #24 [6beefea8] system_call at a5b6a8
> PSW: 0705100180000000 000003ff8111251c (user space)
> GPRS: 0000000000000000 000003ff00000002 000000000000000e 000003ff80deaef0
> 000000000000000a 0000000000000000 000000000000000f 00000000000003ef
> 0000000000000000 000000000000000e 000003ff80deaef0 000000000000000a
> 000003ff81228f58 000000000000000e 000003ff81112510 000003fffdefedf8
>
>
>
>
> Turns out that the virtio-gpu console (via vnc) prints the image loading but then
> stops and show no login prompt. (but agetty is running on tty1) So I assume it
> already hangs after resume (and I am working on the serial sclp console).
>
> Does suspend resume works for you with virtio-gpu on x86? If yes, then I would
> not to look into the s390 code (virtio-blk and net seems to work fine). Otherwise
now
> it would be good if you could have a look.
I just checked. It seems that virtio-gpu has no freeze/restore callbacks. Is there
some work planned to implement the power callbacks for virtio-gpu?
^ permalink raw reply
* Re: virtio-gpu: Hang on shutdown after suspend/resume with virtio
From: Christian Borntraeger @ 2018-03-13 14:27 UTC (permalink / raw)
To: Gerd Hoffmann, kvm list, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, David Airlie,
DRI Development
In-Reply-To: <01fcd61b-1883-1baa-b0c5-c79b4942ed35@de.ibm.com>
Also add David and dri list.....
Short summary, suspend/resume breaks with virtio-gpu as there are no freeze/restore callbacks
that put the device back into working state.
On 03/13/2018 02:32 PM, Christian Borntraeger wrote:
>
>
> On 03/13/2018 01:41 PM, Christian Borntraeger wrote:
>> Gerd,
>>
>> another thing with virtio-gpu.
>>
>> I can successfully do suspend/resume (echo disk > /sys/power/state) on my system. As soon as I have a
>> virtio-gpu the system hangs on reboot/shutdown:
>>
>> e.g.
>>
>> crash> bt 1
>> PID: 1 TASK: 6bef0000 CPU: 4 COMMAND: "systemd"
>> #0 [6beef4c0] __schedule at a55d60
>> #1 [6beef530] schedule at a563f2
>> #2 [6beef548] virtio_gpu_queue_ctrl_buffer_locked at 7ea708
>> #3 [6beef650] virtio_gpu_queue_ctrl_buffer at 7ea830
>> #4 [6beef680] virtio_gpu_primary_plane_update at 7ed5ee
>> #5 [6beef710] drm_atomic_helper_commit_planes at 7a2b46
>> #6 [6beef760] vgdev_atomic_commit_tail at 7e9dbc
>> #7 [6beef790] commit_tail at 7a7126
>> #8 [6beef7c0] drm_atomic_helper_commit at 7a73fe
>> #9 [6beef800] restore_fbdev_mode_atomic at 7aa38a
>> #10 [6beef8a0] drm_fb_helper_pan_display at 7ab7aa
>> #11 [6beef8f0] fb_pan_display at 74a248
>> #12 [6beef928] bit_update_start at 7573c0
>> #13 [6beef958] fbcon_switch at 7542b4
>> #14 [6beefa50] redraw_screen at 77b2b8
>> #15 [6beefaa0] csi_J at 77b476
>> #16 [6beefad8] do_con_trol at 77ee12
>> #17 [6beefb40] do_con_write at 77f80c
>> #18 [6beefc10] con_write at 78003c
>> #19 [6beefc40] n_tty_write at 7662d4
>> #20 [6beefce0] tty_write at 761734
>> #21 [6beefd48] __vfs_write at 3483cc
>> #22 [6beefe00] vfs_write at 3486e8
>> #23 [6beefe60] sys_write at 3489de
>> #24 [6beefea8] system_call at a5b6a8
>> PSW: 0705100180000000 000003ff8111251c (user space)
>> GPRS: 0000000000000000 000003ff00000002 000000000000000e 000003ff80deaef0
>> 000000000000000a 0000000000000000 000000000000000f 00000000000003ef
>> 0000000000000000 000000000000000e 000003ff80deaef0 000000000000000a
>> 000003ff81228f58 000000000000000e 000003ff81112510 000003fffdefedf8
>>
>>
>>
>>
>> Turns out that the virtio-gpu console (via vnc) prints the image loading but then
>> stops and show no login prompt. (but agetty is running on tty1) So I assume it
>> already hangs after resume (and I am working on the serial sclp console).
>>
>> Does suspend resume works for you with virtio-gpu on x86? If yes, then I would
>> now look into the s390 code (virtio-blk and net seems to work fine). Otherwise
>> it would be good if you could have a look.
>
> I just checked. It seems that virtio-gpu has no freeze/restore callbacks. Is there
> some work planned to implement the power callbacks for virtio-gpu?
>
^ permalink raw reply
* [PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
Changes:
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce dynamic relocation space on
mapped memory. It also simplifies the relocation process.
- Move the start the module section next to the kernel. Remove the need for
-mcmodel=large on modules. Extends module space from 1 to 2G maximum.
- Support for XEN PVH as 32-bit relocations can be ignored with
--emit-relocs.
- Support for GOT relocations previously done automatically with -pie.
- Remove need for dynamic PLT in modules.
- Support dymamic GOT for modules.
- rfc v2:
- Add support for global stack cookie while compiler default to fs without
mcmodel=kernel
- Change patch 7 to correctly jump out of the identity mapping on kexec load
preserve.
These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. It allows to optionally extend the
KASLR randomization range from 1G to 3G.
Thanks a lot to Ard Biesheuvel & Kees Cook on their feedback on compiler
changes, PIE support and KASLR in general. Thanks to Roland McGrath on his
feedback for using -pie versus --emit-relocs and details on compiler code
generation.
The patches:
- 1-3, 5-13, 18-19: Change in assembly code to be PIE compliant.
- 4: Add a new _ASM_GET_PTR macro to fetch a symbol address generically.
- 14: Adapt percpu design to work correctly when PIE is enabled.
- 15: Provide an option to default visibility to hidden except for key symbols.
It removes errors between compilation units.
- 16: Add PROVIDE_HIDDEN replacement on the linker script for weak symbols to
reduce GOT footprint.
- 17: Adapt relocation tool to handle PIE binary correctly.
- 20: Add support for global cookie.
- 21: Support ftrace with PIE (used on Ubuntu config).
- 22: Add option to move the module section just after the kernel.
- 23: Adapt module loading to support PIE with dynamic GOT.
- 24: Make the GOT read-only.
- 25: Add the CONFIG_X86_PIE option (off by default).
- 26: Adapt relocation tool to generate a 64-bit relocation table.
- 27: Add the CONFIG_RANDOMIZE_BASE_LARGE option to increase relocation range
from 1G to 3G (off by default).
Performance/Size impact:
Size of vmlinux (Default configuration):
File size:
- PIE disabled: +0.18%
- PIE enabled: -1.977% (less relocations)
.text section:
- PIE disabled: same
- PIE enabled: same
Size of vmlinux (Ubuntu configuration):
File size:
- PIE disabled: +0.21%
- PIE enabled: +10%
.text section:
- PIE disabled: same
- PIE enabled: +0.001%
The size increase is mainly due to not having access to the 32-bit signed
relocation that can be used with mcmodel=kernel. A small part is due to reduced
optimization for PIE code. This bug [1] was opened with gcc to provide a better
code generation for kernel PIE.
Hackbench (50% and 1600% on thread/process for pipe/sockets):
- PIE disabled: no significant change (avg -/+ 0.5% on latest test).
- PIE enabled: between -1% to +1% in average (default and Ubuntu config).
Kernbench (average of 10 Half and Optimal runs):
Elapsed Time:
- PIE disabled: no significant change (avg -0.5%)
- PIE enabled: average -0.5% to +0.5%
System Time:
- PIE disabled: no significant change (avg -0.1%)
- PIE enabled: average -0.4% to +0.4%.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82303
diffstat:
Documentation/x86/x86_64/mm.txt | 3
arch/x86/Kconfig | 45 ++++++
arch/x86/Makefile | 58 ++++++++
arch/x86/boot/boot.h | 2
arch/x86/boot/compressed/Makefile | 5
arch/x86/boot/compressed/misc.c | 10 +
arch/x86/crypto/aes-x86_64-asm_64.S | 45 ++++--
arch/x86/crypto/aesni-intel_asm.S | 8 -
arch/x86/crypto/aesni-intel_avx-x86_64.S | 6
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 42 +++---
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 +++---
arch/x86/crypto/camellia-x86_64-asm_64.S | 8 -
arch/x86/crypto/cast5-avx-x86_64-asm_64.S | 50 ++++---
arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++---
arch/x86/crypto/des3_ede-asm_64.S | 96 +++++++++-----
arch/x86/crypto/ghash-clmulni-intel_asm.S | 4
arch/x86/crypto/glue_helper-asm-avx.S | 4
arch/x86/crypto/glue_helper-asm-avx2.S | 6
arch/x86/crypto/sha256-avx2-asm.S | 23 ++-
arch/x86/entry/calling.h | 2
arch/x86/entry/entry_32.S | 3
arch/x86/entry/entry_64.S | 23 ++-
arch/x86/include/asm/asm.h | 1
arch/x86/include/asm/bug.h | 2
arch/x86/include/asm/ftrace.h | 6
arch/x86/include/asm/jump_label.h | 8 -
arch/x86/include/asm/kvm_host.h | 6
arch/x86/include/asm/module.h | 11 +
arch/x86/include/asm/page_64_types.h | 9 +
arch/x86/include/asm/paravirt_types.h | 12 +
arch/x86/include/asm/percpu.h | 25 ++-
arch/x86/include/asm/pgtable_64_types.h | 6
arch/x86/include/asm/pm-trace.h | 2
arch/x86/include/asm/processor.h | 12 +
arch/x86/include/asm/sections.h | 8 +
arch/x86/include/asm/setup.h | 2
arch/x86/include/asm/stackprotector.h | 19 ++
arch/x86/kernel/Makefile | 6
arch/x86/kernel/acpi/wakeup_64.S | 31 ++--
arch/x86/kernel/asm-offsets.c | 3
arch/x86/kernel/asm-offsets_32.c | 3
arch/x86/kernel/asm-offsets_64.c | 3
arch/x86/kernel/cpu/common.c | 7 -
arch/x86/kernel/cpu/microcode/core.c | 4
arch/x86/kernel/ftrace.c | 42 +++++-
arch/x86/kernel/head64.c | 23 ++-
arch/x86/kernel/head_32.S | 3
arch/x86/kernel/head_64.S | 41 +++++-
arch/x86/kernel/kvm.c | 6
arch/x86/kernel/module.c | 181 ++++++++++++++++++++++++++-
arch/x86/kernel/module.lds | 3
arch/x86/kernel/process.c | 5
arch/x86/kernel/relocate_kernel_64.S | 16 +-
arch/x86/kernel/setup_percpu.c | 2
arch/x86/kernel/vmlinux.lds.S | 13 +
arch/x86/kvm/svm.c | 4
arch/x86/lib/cmpxchg16b_emu.S | 8 -
arch/x86/mm/dump_pagetables.c | 3
arch/x86/power/hibernate_asm_64.S | 4
arch/x86/tools/relocs.c | 169 +++++++++++++++++++++++--
arch/x86/tools/relocs.h | 4
arch/x86/tools/relocs_common.c | 15 +-
arch/x86/xen/xen-asm.S | 12 -
arch/x86/xen/xen-head.S | 11 -
arch/x86/xen/xen-pvh.S | 13 +
drivers/base/firmware_class.c | 4
include/asm-generic/sections.h | 6
include/asm-generic/vmlinux.lds.h | 12 +
include/linux/compiler.h | 7 +
init/Kconfig | 16 ++
kernel/kallsyms.c | 16 +-
kernel/trace/trace.h | 4
lib/dynamic_debug.c | 4
scripts/link-vmlinux.sh | 14 ++
74 files changed, 1063 insertions(+), 315 deletions(-)
^ permalink raw reply
* [PATCH v2 01/27] x86/crypto: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/crypto/aes-x86_64-asm_64.S | 45 +++++----
arch/x86/crypto/aesni-intel_asm.S | 8 +-
arch/x86/crypto/aesni-intel_avx-x86_64.S | 6 +-
arch/x86/crypto/camellia-aesni-avx-asm_64.S | 42 ++++-----
arch/x86/crypto/camellia-aesni-avx2-asm_64.S | 44 ++++-----
arch/x86/crypto/camellia-x86_64-asm_64.S | 8 +-
arch/x86/crypto/cast5-avx-x86_64-asm_64.S | 50 +++++-----
arch/x86/crypto/cast6-avx-x86_64-asm_64.S | 44 +++++----
arch/x86/crypto/des3_ede-asm_64.S | 96 +++++++++++++-------
arch/x86/crypto/ghash-clmulni-intel_asm.S | 4 +-
arch/x86/crypto/glue_helper-asm-avx.S | 4 +-
arch/x86/crypto/glue_helper-asm-avx2.S | 6 +-
arch/x86/crypto/sha256-avx2-asm.S | 23 +++--
13 files changed, 221 insertions(+), 159 deletions(-)
diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
index 8739cf7795de..86fa068e5e81 100644
--- a/arch/x86/crypto/aes-x86_64-asm_64.S
+++ b/arch/x86/crypto/aes-x86_64-asm_64.S
@@ -48,8 +48,12 @@
#define R10 %r10
#define R11 %r11
+/* Hold global for PIE suport */
+#define RBASE %r12
+
#define prologue(FUNC,KEY,B128,B192,r1,r2,r5,r6,r7,r8,r9,r10,r11) \
ENTRY(FUNC); \
+ pushq RBASE; \
movq r1,r2; \
leaq KEY+48(r8),r9; \
movq r10,r11; \
@@ -74,54 +78,63 @@
movl r6 ## E,4(r9); \
movl r7 ## E,8(r9); \
movl r8 ## E,12(r9); \
+ popq RBASE; \
ret; \
ENDPROC(FUNC);
+#define round_mov(tab_off, reg_i, reg_o) \
+ leaq tab_off(%rip), RBASE; \
+ movl (RBASE,reg_i,4), reg_o;
+
+#define round_xor(tab_off, reg_i, reg_o) \
+ leaq tab_off(%rip), RBASE; \
+ xorl (RBASE,reg_i,4), reg_o;
+
#define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
movzbl r2 ## H,r5 ## E; \
movzbl r2 ## L,r6 ## E; \
- movl TAB+1024(,r5,4),r5 ## E;\
+ round_mov(TAB+1024, r5, r5 ## E)\
movw r4 ## X,r2 ## X; \
- movl TAB(,r6,4),r6 ## E; \
+ round_mov(TAB, r6, r6 ## E) \
roll $16,r2 ## E; \
shrl $16,r4 ## E; \
movzbl r4 ## L,r7 ## E; \
movzbl r4 ## H,r4 ## E; \
xorl OFFSET(r8),ra ## E; \
xorl OFFSET+4(r8),rb ## E; \
- xorl TAB+3072(,r4,4),r5 ## E;\
- xorl TAB+2048(,r7,4),r6 ## E;\
+ round_xor(TAB+3072, r4, r5 ## E)\
+ round_xor(TAB+2048, r7, r6 ## E)\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r4 ## E; \
- movl TAB+1024(,r4,4),r4 ## E;\
+ round_mov(TAB+1024, r4, r4 ## E)\
movw r3 ## X,r1 ## X; \
roll $16,r1 ## E; \
shrl $16,r3 ## E; \
- xorl TAB(,r7,4),r5 ## E; \
+ round_xor(TAB, r7, r5 ## E) \
movzbl r3 ## L,r7 ## E; \
movzbl r3 ## H,r3 ## E; \
- xorl TAB+3072(,r3,4),r4 ## E;\
- xorl TAB+2048(,r7,4),r5 ## E;\
+ round_xor(TAB+3072, r3, r4 ## E)\
+ round_xor(TAB+2048, r7, r5 ## E)\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r3 ## E; \
shrl $16,r1 ## E; \
- xorl TAB+3072(,r3,4),r6 ## E;\
- movl TAB+2048(,r7,4),r3 ## E;\
+ round_xor(TAB+3072, r3, r6 ## E)\
+ round_mov(TAB+2048, r7, r3 ## E)\
movzbl r1 ## L,r7 ## E; \
movzbl r1 ## H,r1 ## E; \
- xorl TAB+1024(,r1,4),r6 ## E;\
- xorl TAB(,r7,4),r3 ## E; \
+ round_xor(TAB+1024, r1, r6 ## E)\
+ round_xor(TAB, r7, r3 ## E) \
movzbl r2 ## H,r1 ## E; \
movzbl r2 ## L,r7 ## E; \
shrl $16,r2 ## E; \
- xorl TAB+3072(,r1,4),r3 ## E;\
- xorl TAB+2048(,r7,4),r4 ## E;\
+ round_xor(TAB+3072, r1, r3 ## E)\
+ round_xor(TAB+2048, r7, r4 ## E)\
movzbl r2 ## H,r1 ## E; \
movzbl r2 ## L,r2 ## E; \
xorl OFFSET+8(r8),rc ## E; \
xorl OFFSET+12(r8),rd ## E; \
- xorl TAB+1024(,r1,4),r3 ## E;\
- xorl TAB(,r2,4),r4 ## E;
+ round_xor(TAB+1024, r1, r3 ## E)\
+ round_xor(TAB, r2, r4 ## E)
#define move_regs(r1,r2,r3,r4) \
movl r3 ## E,r1 ## E; \
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index e762ef417562..4df029aa5fc1 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2610,7 +2610,7 @@ ENDPROC(aesni_cbc_dec)
*/
.align 4
_aesni_inc_init:
- movaps .Lbswap_mask, BSWAP_MASK
+ movaps .Lbswap_mask(%rip), BSWAP_MASK
movaps IV, CTR
PSHUFB_XMM BSWAP_MASK CTR
mov $1, TCTR_LOW
@@ -2738,12 +2738,12 @@ ENTRY(aesni_xts_crypt8)
cmpb $0, %cl
movl $0, %ecx
movl $240, %r10d
- leaq _aesni_enc4, %r11
- leaq _aesni_dec4, %rax
+ leaq _aesni_enc4(%rip), %r11
+ leaq _aesni_dec4(%rip), %rax
cmovel %r10d, %ecx
cmoveq %rax, %r11
- movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
+ movdqa .Lgf128mul_x_ble_mask(%rip), GF128MUL_MASK
movups (IVP), IV
mov 480(KEYP), KLEN
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index faecb1518bf8..488605b19fe8 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -454,7 +454,8 @@ _get_AAD_rest0\@:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \T1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \T1
vpshufb \T1, reg_i, reg_i
_get_AAD_rest_final\@:
vpshufb SHUF_MASK(%rip), reg_i, reg_i
@@ -1761,7 +1762,8 @@ _get_AAD_rest0\@:
vpshufb and an array of shuffle masks */
movq %r12, %r11
salq $4, %r11
- movdqu aad_shift_arr(%r11), \T1
+ leaq aad_shift_arr(%rip), %rax
+ movdqu (%rax,%r11,), \T1
vpshufb \T1, reg_i, reg_i
_get_AAD_rest_final\@:
vpshufb SHUF_MASK(%rip), reg_i, reg_i
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index a14af6eb09cb..f94ec9a5552b 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -53,10 +53,10 @@
/* \
* S-function with AES subbytes \
*/ \
- vmovdqa .Linv_shift_row, t4; \
- vbroadcastss .L0f0f0f0f, t7; \
- vmovdqa .Lpre_tf_lo_s1, t0; \
- vmovdqa .Lpre_tf_hi_s1, t1; \
+ vmovdqa .Linv_shift_row(%rip), t4; \
+ vbroadcastss .L0f0f0f0f(%rip), t7; \
+ vmovdqa .Lpre_tf_lo_s1(%rip), t0; \
+ vmovdqa .Lpre_tf_hi_s1(%rip), t1; \
\
/* AES inverse shift rows */ \
vpshufb t4, x0, x0; \
@@ -69,8 +69,8 @@
vpshufb t4, x6, x6; \
\
/* prefilter sboxes 1, 2 and 3 */ \
- vmovdqa .Lpre_tf_lo_s4, t2; \
- vmovdqa .Lpre_tf_hi_s4, t3; \
+ vmovdqa .Lpre_tf_lo_s4(%rip), t2; \
+ vmovdqa .Lpre_tf_hi_s4(%rip), t3; \
filter_8bit(x0, t0, t1, t7, t6); \
filter_8bit(x7, t0, t1, t7, t6); \
filter_8bit(x1, t0, t1, t7, t6); \
@@ -84,8 +84,8 @@
filter_8bit(x6, t2, t3, t7, t6); \
\
/* AES subbytes + AES shift rows */ \
- vmovdqa .Lpost_tf_lo_s1, t0; \
- vmovdqa .Lpost_tf_hi_s1, t1; \
+ vmovdqa .Lpost_tf_lo_s1(%rip), t0; \
+ vmovdqa .Lpost_tf_hi_s1(%rip), t1; \
vaesenclast t4, x0, x0; \
vaesenclast t4, x7, x7; \
vaesenclast t4, x1, x1; \
@@ -96,16 +96,16 @@
vaesenclast t4, x6, x6; \
\
/* postfilter sboxes 1 and 4 */ \
- vmovdqa .Lpost_tf_lo_s3, t2; \
- vmovdqa .Lpost_tf_hi_s3, t3; \
+ vmovdqa .Lpost_tf_lo_s3(%rip), t2; \
+ vmovdqa .Lpost_tf_hi_s3(%rip), t3; \
filter_8bit(x0, t0, t1, t7, t6); \
filter_8bit(x7, t0, t1, t7, t6); \
filter_8bit(x3, t0, t1, t7, t6); \
filter_8bit(x6, t0, t1, t7, t6); \
\
/* postfilter sbox 3 */ \
- vmovdqa .Lpost_tf_lo_s2, t4; \
- vmovdqa .Lpost_tf_hi_s2, t5; \
+ vmovdqa .Lpost_tf_lo_s2(%rip), t4; \
+ vmovdqa .Lpost_tf_hi_s2(%rip), t5; \
filter_8bit(x2, t2, t3, t7, t6); \
filter_8bit(x5, t2, t3, t7, t6); \
\
@@ -444,7 +444,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
transpose_4x4(c0, c1, c2, c3, a0, a1); \
transpose_4x4(d0, d1, d2, d3, a0, a1); \
\
- vmovdqu .Lshufb_16x16b, a0; \
+ vmovdqu .Lshufb_16x16b(%rip), a0; \
vmovdqu st1, a1; \
vpshufb a0, a2, a2; \
vpshufb a0, a3, a3; \
@@ -483,7 +483,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
#define inpack16_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
y6, y7, rio, key) \
vmovq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor 0 * 16(rio), x0, y7; \
vpxor 1 * 16(rio), x0, y6; \
@@ -534,7 +534,7 @@ ENDPROC(roundsm16_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
vmovdqu x0, stack_tmp0; \
\
vmovq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor x0, y7, y7; \
vpxor x0, y6, y6; \
@@ -1017,7 +1017,7 @@ ENTRY(camellia_ctr_16way)
subq $(16 * 16), %rsp;
movq %rsp, %rax;
- vmovdqa .Lbswap128_mask, %xmm14;
+ vmovdqa .Lbswap128_mask(%rip), %xmm14;
/* load IV and byteswap */
vmovdqu (%rcx), %xmm0;
@@ -1066,7 +1066,7 @@ ENTRY(camellia_ctr_16way)
/* inpack16_pre: */
vmovq (key_table)(CTX), %xmm15;
- vpshufb .Lpack_bswap, %xmm15, %xmm15;
+ vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
vpxor %xmm0, %xmm15, %xmm0;
vpxor %xmm1, %xmm15, %xmm1;
vpxor %xmm2, %xmm15, %xmm2;
@@ -1134,7 +1134,7 @@ camellia_xts_crypt_16way:
subq $(16 * 16), %rsp;
movq %rsp, %rax;
- vmovdqa .Lxts_gf128mul_and_shl1_mask, %xmm14;
+ vmovdqa .Lxts_gf128mul_and_shl1_mask(%rip), %xmm14;
/* load IV */
vmovdqu (%rcx), %xmm0;
@@ -1210,7 +1210,7 @@ camellia_xts_crypt_16way:
/* inpack16_pre: */
vmovq (key_table)(CTX, %r8, 8), %xmm15;
- vpshufb .Lpack_bswap, %xmm15, %xmm15;
+ vpshufb .Lpack_bswap(%rip), %xmm15, %xmm15;
vpxor 0 * 16(%rax), %xmm15, %xmm0;
vpxor %xmm1, %xmm15, %xmm1;
vpxor %xmm2, %xmm15, %xmm2;
@@ -1265,7 +1265,7 @@ ENTRY(camellia_xts_enc_16way)
*/
xorl %r8d, %r8d; /* input whitening key, 0 for enc */
- leaq __camellia_enc_blk16, %r9;
+ leaq __camellia_enc_blk16(%rip), %r9;
jmp camellia_xts_crypt_16way;
ENDPROC(camellia_xts_enc_16way)
@@ -1283,7 +1283,7 @@ ENTRY(camellia_xts_dec_16way)
movl $24, %eax;
cmovel %eax, %r8d; /* input whitening key, last for dec */
- leaq __camellia_dec_blk16, %r9;
+ leaq __camellia_dec_blk16(%rip), %r9;
jmp camellia_xts_crypt_16way;
ENDPROC(camellia_xts_dec_16way)
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index b66bbfa62f50..11bbaa1cd4a7 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -70,12 +70,12 @@
/* \
* S-function with AES subbytes \
*/ \
- vbroadcasti128 .Linv_shift_row, t4; \
- vpbroadcastd .L0f0f0f0f, t7; \
- vbroadcasti128 .Lpre_tf_lo_s1, t5; \
- vbroadcasti128 .Lpre_tf_hi_s1, t6; \
- vbroadcasti128 .Lpre_tf_lo_s4, t2; \
- vbroadcasti128 .Lpre_tf_hi_s4, t3; \
+ vbroadcasti128 .Linv_shift_row(%rip), t4; \
+ vpbroadcastd .L0f0f0f0f(%rip), t7; \
+ vbroadcasti128 .Lpre_tf_lo_s1(%rip), t5; \
+ vbroadcasti128 .Lpre_tf_hi_s1(%rip), t6; \
+ vbroadcasti128 .Lpre_tf_lo_s4(%rip), t2; \
+ vbroadcasti128 .Lpre_tf_hi_s4(%rip), t3; \
\
/* AES inverse shift rows */ \
vpshufb t4, x0, x0; \
@@ -121,8 +121,8 @@
vinserti128 $1, t2##_x, x6, x6; \
vextracti128 $1, x1, t3##_x; \
vextracti128 $1, x4, t2##_x; \
- vbroadcasti128 .Lpost_tf_lo_s1, t0; \
- vbroadcasti128 .Lpost_tf_hi_s1, t1; \
+ vbroadcasti128 .Lpost_tf_lo_s1(%rip), t0; \
+ vbroadcasti128 .Lpost_tf_hi_s1(%rip), t1; \
vaesenclast t4##_x, x2##_x, x2##_x; \
vaesenclast t4##_x, t6##_x, t6##_x; \
vinserti128 $1, t6##_x, x2, x2; \
@@ -137,16 +137,16 @@
vinserti128 $1, t2##_x, x4, x4; \
\
/* postfilter sboxes 1 and 4 */ \
- vbroadcasti128 .Lpost_tf_lo_s3, t2; \
- vbroadcasti128 .Lpost_tf_hi_s3, t3; \
+ vbroadcasti128 .Lpost_tf_lo_s3(%rip), t2; \
+ vbroadcasti128 .Lpost_tf_hi_s3(%rip), t3; \
filter_8bit(x0, t0, t1, t7, t6); \
filter_8bit(x7, t0, t1, t7, t6); \
filter_8bit(x3, t0, t1, t7, t6); \
filter_8bit(x6, t0, t1, t7, t6); \
\
/* postfilter sbox 3 */ \
- vbroadcasti128 .Lpost_tf_lo_s2, t4; \
- vbroadcasti128 .Lpost_tf_hi_s2, t5; \
+ vbroadcasti128 .Lpost_tf_lo_s2(%rip), t4; \
+ vbroadcasti128 .Lpost_tf_hi_s2(%rip), t5; \
filter_8bit(x2, t2, t3, t7, t6); \
filter_8bit(x5, t2, t3, t7, t6); \
\
@@ -483,7 +483,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
transpose_4x4(c0, c1, c2, c3, a0, a1); \
transpose_4x4(d0, d1, d2, d3, a0, a1); \
\
- vbroadcasti128 .Lshufb_16x16b, a0; \
+ vbroadcasti128 .Lshufb_16x16b(%rip), a0; \
vmovdqu st1, a1; \
vpshufb a0, a2, a2; \
vpshufb a0, a3, a3; \
@@ -522,7 +522,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
#define inpack32_pre(x0, x1, x2, x3, x4, x5, x6, x7, y0, y1, y2, y3, y4, y5, \
y6, y7, rio, key) \
vpbroadcastq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor 0 * 32(rio), x0, y7; \
vpxor 1 * 32(rio), x0, y6; \
@@ -573,7 +573,7 @@ ENDPROC(roundsm32_x4_x5_x6_x7_x0_x1_x2_x3_y4_y5_y6_y7_y0_y1_y2_y3_ab)
vmovdqu x0, stack_tmp0; \
\
vpbroadcastq key, x0; \
- vpshufb .Lpack_bswap, x0, x0; \
+ vpshufb .Lpack_bswap(%rip), x0, x0; \
\
vpxor x0, y7, y7; \
vpxor x0, y6, y6; \
@@ -1113,7 +1113,7 @@ ENTRY(camellia_ctr_32way)
vmovdqu (%rcx), %xmm0;
vmovdqa %xmm0, %xmm1;
inc_le128(%xmm0, %xmm15, %xmm14);
- vbroadcasti128 .Lbswap128_mask, %ymm14;
+ vbroadcasti128 .Lbswap128_mask(%rip), %ymm14;
vinserti128 $1, %xmm0, %ymm1, %ymm0;
vpshufb %ymm14, %ymm0, %ymm13;
vmovdqu %ymm13, 15 * 32(%rax);
@@ -1159,7 +1159,7 @@ ENTRY(camellia_ctr_32way)
/* inpack32_pre: */
vpbroadcastq (key_table)(CTX), %ymm15;
- vpshufb .Lpack_bswap, %ymm15, %ymm15;
+ vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
vpxor %ymm0, %ymm15, %ymm0;
vpxor %ymm1, %ymm15, %ymm1;
vpxor %ymm2, %ymm15, %ymm2;
@@ -1243,13 +1243,13 @@ camellia_xts_crypt_32way:
subq $(16 * 32), %rsp;
movq %rsp, %rax;
- vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0, %ymm12;
+ vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_0(%rip), %ymm12;
/* load IV and construct second IV */
vmovdqu (%rcx), %xmm0;
vmovdqa %xmm0, %xmm15;
gf128mul_x_ble(%xmm0, %xmm12, %xmm13);
- vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1, %ymm13;
+ vbroadcasti128 .Lxts_gf128mul_and_shl1_mask_1(%rip), %ymm13;
vinserti128 $1, %xmm0, %ymm15, %ymm0;
vpxor 0 * 32(%rdx), %ymm0, %ymm15;
vmovdqu %ymm15, 15 * 32(%rax);
@@ -1326,7 +1326,7 @@ camellia_xts_crypt_32way:
/* inpack32_pre: */
vpbroadcastq (key_table)(CTX, %r8, 8), %ymm15;
- vpshufb .Lpack_bswap, %ymm15, %ymm15;
+ vpshufb .Lpack_bswap(%rip), %ymm15, %ymm15;
vpxor 0 * 32(%rax), %ymm15, %ymm0;
vpxor %ymm1, %ymm15, %ymm1;
vpxor %ymm2, %ymm15, %ymm2;
@@ -1384,7 +1384,7 @@ ENTRY(camellia_xts_enc_32way)
xorl %r8d, %r8d; /* input whitening key, 0 for enc */
- leaq __camellia_enc_blk32, %r9;
+ leaq __camellia_enc_blk32(%rip), %r9;
jmp camellia_xts_crypt_32way;
ENDPROC(camellia_xts_enc_32way)
@@ -1402,7 +1402,7 @@ ENTRY(camellia_xts_dec_32way)
movl $24, %eax;
cmovel %eax, %r8d; /* input whitening key, last for dec */
- leaq __camellia_dec_blk32, %r9;
+ leaq __camellia_dec_blk32(%rip), %r9;
jmp camellia_xts_crypt_32way;
ENDPROC(camellia_xts_dec_32way)
diff --git a/arch/x86/crypto/camellia-x86_64-asm_64.S b/arch/x86/crypto/camellia-x86_64-asm_64.S
index 95ba6956a7f6..ef1137406959 100644
--- a/arch/x86/crypto/camellia-x86_64-asm_64.S
+++ b/arch/x86/crypto/camellia-x86_64-asm_64.S
@@ -92,11 +92,13 @@
#define RXORbl %r9b
#define xor2ror16(T0, T1, tmp1, tmp2, ab, dst) \
+ leaq T0(%rip), tmp1; \
movzbl ab ## bl, tmp2 ## d; \
+ xorq (tmp1, tmp2, 8), dst; \
+ leaq T1(%rip), tmp2; \
movzbl ab ## bh, tmp1 ## d; \
- rorq $16, ab; \
- xorq T0(, tmp2, 8), dst; \
- xorq T1(, tmp1, 8), dst;
+ xorq (tmp2, tmp1, 8), dst; \
+ rorq $16, ab;
/**********************************************************************
1-way camellia
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index 86107c961bb4..64eb5c87d04a 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -98,16 +98,20 @@
#define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- shrq $16, src; \
- movl s1(, RID1, 4), dst ## d; \
- op1 s2(, RID2, 4), dst ## d; \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- interleave_op(il_reg); \
- op2 s3(, RID1, 4), dst ## d; \
- op3 s4(, RID2, 4), dst ## d;
+ movzbl src ## bh, RID1d; \
+ leaq s1(%rip), RID2; \
+ movl (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s2(%rip), RID1; \
+ op1 (RID1, RID2, 4), dst ## d; \
+ shrq $16, src; \
+ movzbl src ## bh, RID1d; \
+ leaq s3(%rip), RID2; \
+ op2 (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s4(%rip), RID1; \
+ op3 (RID1, RID2, 4), dst ## d; \
+ interleave_op(il_reg);
#define dummy(d) /* do nothing */
@@ -166,15 +170,15 @@
subround(l ## 3, r ## 3, l ## 4, r ## 4, f);
#define enc_preload_rkr() \
- vbroadcastss .L16_mask, RKR; \
+ vbroadcastss .L16_mask(%rip), RKR; \
/* add 16-bit rotation to key rotations (mod 32) */ \
vpxor kr(CTX), RKR, RKR;
#define dec_preload_rkr() \
- vbroadcastss .L16_mask, RKR; \
+ vbroadcastss .L16_mask(%rip), RKR; \
/* add 16-bit rotation to key rotations (mod 32) */ \
vpxor kr(CTX), RKR, RKR; \
- vpshufb .Lbswap128_mask, RKR, RKR;
+ vpshufb .Lbswap128_mask(%rip), RKR, RKR;
#define transpose_2x4(x0, x1, t0, t1) \
vpunpckldq x1, x0, t0; \
@@ -251,9 +255,9 @@ __cast5_enc_blk16:
movq %rdi, CTX;
- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;
enc_preload_rkr();
inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -287,7 +291,7 @@ __cast5_enc_blk16:
popq %rbx;
popq %r15;
- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;
outunpack_blocks(RR1, RL1, RTMP, RX, RKM);
outunpack_blocks(RR2, RL2, RTMP, RX, RKM);
@@ -325,9 +329,9 @@ __cast5_dec_blk16:
movq %rdi, CTX;
- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;
dec_preload_rkr();
inpack_blocks(RL1, RR1, RTMP, RX, RKM);
@@ -358,7 +362,7 @@ __cast5_dec_blk16:
round(RL, RR, 1, 2);
round(RR, RL, 0, 1);
- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;
popq %rbx;
popq %r15;
@@ -521,8 +525,8 @@ ENTRY(cast5_ctr_16way)
vpcmpeqd RKR, RKR, RKR;
vpaddq RKR, RKR, RKR; /* low: -2, high: -2 */
- vmovdqa .Lbswap_iv_mask, R1ST;
- vmovdqa .Lbswap128_mask, RKM;
+ vmovdqa .Lbswap_iv_mask(%rip), R1ST;
+ vmovdqa .Lbswap128_mask(%rip), RKM;
/* load IV and byteswap */
vmovq (%rcx), RX;
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 7f30b6f0d72c..da1b7e4a23e4 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -98,16 +98,20 @@
#define lookup_32bit(src, dst, op1, op2, op3, interleave_op, il_reg) \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- shrq $16, src; \
- movl s1(, RID1, 4), dst ## d; \
- op1 s2(, RID2, 4), dst ## d; \
- movzbl src ## bh, RID1d; \
- movzbl src ## bl, RID2d; \
- interleave_op(il_reg); \
- op2 s3(, RID1, 4), dst ## d; \
- op3 s4(, RID2, 4), dst ## d;
+ movzbl src ## bh, RID1d; \
+ leaq s1(%rip), RID2; \
+ movl (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s2(%rip), RID1; \
+ op1 (RID1, RID2, 4), dst ## d; \
+ shrq $16, src; \
+ movzbl src ## bh, RID1d; \
+ leaq s3(%rip), RID2; \
+ op2 (RID2, RID1, 4), dst ## d; \
+ movzbl src ## bl, RID2d; \
+ leaq s4(%rip), RID1; \
+ op3 (RID1, RID2, 4), dst ## d; \
+ interleave_op(il_reg);
#define dummy(d) /* do nothing */
@@ -190,10 +194,10 @@
qop(RD, RC, 1);
#define shuffle(mask) \
- vpshufb mask, RKR, RKR;
+ vpshufb mask(%rip), RKR, RKR;
#define preload_rkr(n, do_mask, mask) \
- vbroadcastss .L16_mask, RKR; \
+ vbroadcastss .L16_mask(%rip), RKR; \
/* add 16-bit rotation to key rotations (mod 32) */ \
vpxor (kr+n*16)(CTX), RKR, RKR; \
do_mask(mask);
@@ -275,9 +279,9 @@ __cast6_enc_blk8:
movq %rdi, CTX;
- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;
inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -301,7 +305,7 @@ __cast6_enc_blk8:
popq %rbx;
popq %r15;
- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;
outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -323,9 +327,9 @@ __cast6_dec_blk8:
movq %rdi, CTX;
- vmovdqa .Lbswap_mask, RKM;
- vmovd .Lfirst_mask, R1ST;
- vmovd .L32_mask, R32;
+ vmovdqa .Lbswap_mask(%rip), RKM;
+ vmovd .Lfirst_mask(%rip), R1ST;
+ vmovd .L32_mask(%rip), R32;
inpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
inpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
@@ -349,7 +353,7 @@ __cast6_dec_blk8:
popq %rbx;
popq %r15;
- vmovdqa .Lbswap_mask, RKM;
+ vmovdqa .Lbswap_mask(%rip), RKM;
outunpack_blocks(RA1, RB1, RC1, RD1, RTMP, RX, RKRF, RKM);
outunpack_blocks(RA2, RB2, RC2, RD2, RTMP, RX, RKRF, RKM);
diff --git a/arch/x86/crypto/des3_ede-asm_64.S b/arch/x86/crypto/des3_ede-asm_64.S
index 8e49ce117494..4bbd3ec78df5 100644
--- a/arch/x86/crypto/des3_ede-asm_64.S
+++ b/arch/x86/crypto/des3_ede-asm_64.S
@@ -138,21 +138,29 @@
movzbl RW0bl, RT2d; \
movzbl RW0bh, RT3d; \
shrq $16, RW0; \
- movq s8(, RT0, 8), RT0; \
- xorq s6(, RT1, 8), to; \
+ leaq s8(%rip), RW1; \
+ movq (RW1, RT0, 8), RT0; \
+ leaq s6(%rip), RW1; \
+ xorq (RW1, RT1, 8), to; \
movzbl RW0bl, RL1d; \
movzbl RW0bh, RT1d; \
shrl $16, RW0d; \
- xorq s4(, RT2, 8), RT0; \
- xorq s2(, RT3, 8), to; \
+ leaq s4(%rip), RW1; \
+ xorq (RW1, RT2, 8), RT0; \
+ leaq s2(%rip), RW1; \
+ xorq (RW1, RT3, 8), to; \
movzbl RW0bl, RT2d; \
movzbl RW0bh, RT3d; \
- xorq s7(, RL1, 8), RT0; \
- xorq s5(, RT1, 8), to; \
- xorq s3(, RT2, 8), RT0; \
+ leaq s7(%rip), RW1; \
+ xorq (RW1, RL1, 8), RT0; \
+ leaq s5(%rip), RW1; \
+ xorq (RW1, RT1, 8), to; \
+ leaq s3(%rip), RW1; \
+ xorq (RW1, RT2, 8), RT0; \
load_next_key(n, RW0); \
xorq RT0, to; \
- xorq s1(, RT3, 8), to; \
+ leaq s1(%rip), RW1; \
+ xorq (RW1, RT3, 8), to; \
#define load_next_key(n, RWx) \
movq (((n) + 1) * 8)(CTX), RWx;
@@ -364,65 +372,89 @@ ENDPROC(des3_ede_x86_64_crypt_blk)
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
shrq $16, RW0; \
- xorq s8(, RT3, 8), to##0; \
- xorq s6(, RT1, 8), to##0; \
+ leaq s8(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s6(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
shrq $16, RW0; \
- xorq s4(, RT3, 8), to##0; \
- xorq s2(, RT1, 8), to##0; \
+ leaq s4(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s2(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
shrl $16, RW0d; \
- xorq s7(, RT3, 8), to##0; \
- xorq s5(, RT1, 8), to##0; \
+ leaq s7(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s5(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
movzbl RW0bl, RT3d; \
movzbl RW0bh, RT1d; \
load_next_key(n, RW0); \
- xorq s3(, RT3, 8), to##0; \
- xorq s1(, RT1, 8), to##0; \
+ leaq s3(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##0; \
+ leaq s1(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##0; \
xorq from##1, RW1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
shrq $16, RW1; \
- xorq s8(, RT3, 8), to##1; \
- xorq s6(, RT1, 8), to##1; \
+ leaq s8(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s6(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
shrq $16, RW1; \
- xorq s4(, RT3, 8), to##1; \
- xorq s2(, RT1, 8), to##1; \
+ leaq s4(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s2(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
shrl $16, RW1d; \
- xorq s7(, RT3, 8), to##1; \
- xorq s5(, RT1, 8), to##1; \
+ leaq s7(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s5(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
movzbl RW1bl, RT3d; \
movzbl RW1bh, RT1d; \
do_movq(RW0, RW1); \
- xorq s3(, RT3, 8), to##1; \
- xorq s1(, RT1, 8), to##1; \
+ leaq s3(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##1; \
+ leaq s1(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##1; \
xorq from##2, RW2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
shrq $16, RW2; \
- xorq s8(, RT3, 8), to##2; \
- xorq s6(, RT1, 8), to##2; \
+ leaq s8(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s6(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
shrq $16, RW2; \
- xorq s4(, RT3, 8), to##2; \
- xorq s2(, RT1, 8), to##2; \
+ leaq s4(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s2(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
shrl $16, RW2d; \
- xorq s7(, RT3, 8), to##2; \
- xorq s5(, RT1, 8), to##2; \
+ leaq s7(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s5(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2; \
movzbl RW2bl, RT3d; \
movzbl RW2bh, RT1d; \
do_movq(RW0, RW2); \
- xorq s3(, RT3, 8), to##2; \
- xorq s1(, RT1, 8), to##2;
+ leaq s3(%rip), RT2; \
+ xorq (RT2, RT3, 8), to##2; \
+ leaq s1(%rip), RT2; \
+ xorq (RT2, RT1, 8), to##2;
#define __movq(src, dst) \
movq src, dst;
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index f94375a8dcd1..d56a281221fb 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -97,7 +97,7 @@ ENTRY(clmul_ghash_mul)
FRAME_BEGIN
movups (%rdi), DATA
movups (%rsi), SHASH
- movaps .Lbswap_mask, BSWAP
+ movaps .Lbswap_mask(%rip), BSWAP
PSHUFB_XMM BSWAP DATA
call __clmul_gf128mul_ble
PSHUFB_XMM BSWAP DATA
@@ -114,7 +114,7 @@ ENTRY(clmul_ghash_update)
FRAME_BEGIN
cmp $16, %rdx
jb .Lupdate_just_ret # check length
- movaps .Lbswap_mask, BSWAP
+ movaps .Lbswap_mask(%rip), BSWAP
movups (%rdi), DATA
movups (%rcx), SHASH
PSHUFB_XMM BSWAP DATA
diff --git a/arch/x86/crypto/glue_helper-asm-avx.S b/arch/x86/crypto/glue_helper-asm-avx.S
index 02ee2308fb38..8a49ab1699ef 100644
--- a/arch/x86/crypto/glue_helper-asm-avx.S
+++ b/arch/x86/crypto/glue_helper-asm-avx.S
@@ -54,7 +54,7 @@
#define load_ctr_8way(iv, bswap, x0, x1, x2, x3, x4, x5, x6, x7, t0, t1, t2) \
vpcmpeqd t0, t0, t0; \
vpsrldq $8, t0, t0; /* low: -1, high: 0 */ \
- vmovdqa bswap, t1; \
+ vmovdqa bswap(%rip), t1; \
\
/* load IV and byteswap */ \
vmovdqu (iv), x7; \
@@ -99,7 +99,7 @@
#define load_xts_8way(iv, src, dst, x0, x1, x2, x3, x4, x5, x6, x7, tiv, t0, \
t1, xts_gf128mul_and_shl1_mask) \
- vmovdqa xts_gf128mul_and_shl1_mask, t0; \
+ vmovdqa xts_gf128mul_and_shl1_mask(%rip), t0; \
\
/* load IV */ \
vmovdqu (iv), tiv; \
diff --git a/arch/x86/crypto/glue_helper-asm-avx2.S b/arch/x86/crypto/glue_helper-asm-avx2.S
index a53ac11dd385..e04c80467bd2 100644
--- a/arch/x86/crypto/glue_helper-asm-avx2.S
+++ b/arch/x86/crypto/glue_helper-asm-avx2.S
@@ -67,7 +67,7 @@
vmovdqu (iv), t2x; \
vmovdqa t2x, t3x; \
inc_le128(t2x, t0x, t1x); \
- vbroadcasti128 bswap, t1; \
+ vbroadcasti128 bswap(%rip), t1; \
vinserti128 $1, t2x, t3, t2; /* ab: le0 ; cd: le1 */ \
vpshufb t1, t2, x0; \
\
@@ -124,13 +124,13 @@
tivx, t0, t0x, t1, t1x, t2, t2x, t3, \
xts_gf128mul_and_shl1_mask_0, \
xts_gf128mul_and_shl1_mask_1) \
- vbroadcasti128 xts_gf128mul_and_shl1_mask_0, t1; \
+ vbroadcasti128 xts_gf128mul_and_shl1_mask_0(%rip), t1; \
\
/* load IV and construct second IV */ \
vmovdqu (iv), tivx; \
vmovdqa tivx, t0x; \
gf128mul_x_ble(tivx, t1x, t2x); \
- vbroadcasti128 xts_gf128mul_and_shl1_mask_1, t2; \
+ vbroadcasti128 xts_gf128mul_and_shl1_mask_1(%rip), t2; \
vinserti128 $1, tivx, t0, tiv; \
vpxor (0*32)(src), tiv, x0; \
vmovdqu tiv, (0*32)(dst); \
diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
index 1420db15dcdd..2ced4b2f6c76 100644
--- a/arch/x86/crypto/sha256-avx2-asm.S
+++ b/arch/x86/crypto/sha256-avx2-asm.S
@@ -588,37 +588,42 @@ last_block_enter:
mov INP, _INP(%rsp)
## schedule 48 input dwords, by doing 3 rounds of 12 each
- xor SRND, SRND
+ leaq K256(%rip), SRND
+ ## loop1 upper bound
+ leaq K256+3*4*32(%rip), INP
.align 16
loop1:
- vpaddd K256+0*32(SRND), X0, XFER
+ vpaddd 0*32(SRND), X0, XFER
vmovdqa XFER, 0*32+_XFER(%rsp, SRND)
FOUR_ROUNDS_AND_SCHED _XFER + 0*32
- vpaddd K256+1*32(SRND), X0, XFER
+ vpaddd 1*32(SRND), X0, XFER
vmovdqa XFER, 1*32+_XFER(%rsp, SRND)
FOUR_ROUNDS_AND_SCHED _XFER + 1*32
- vpaddd K256+2*32(SRND), X0, XFER
+ vpaddd 2*32(SRND), X0, XFER
vmovdqa XFER, 2*32+_XFER(%rsp, SRND)
FOUR_ROUNDS_AND_SCHED _XFER + 2*32
- vpaddd K256+3*32(SRND), X0, XFER
+ vpaddd 3*32(SRND), X0, XFER
vmovdqa XFER, 3*32+_XFER(%rsp, SRND)
FOUR_ROUNDS_AND_SCHED _XFER + 3*32
add $4*32, SRND
- cmp $3*4*32, SRND
+ cmp INP, SRND
jb loop1
+ ## loop2 upper bound
+ leaq K256+4*4*32(%rip), INP
+
loop2:
## Do last 16 rounds with no scheduling
- vpaddd K256+0*32(SRND), X0, XFER
+ vpaddd 0*32(SRND), X0, XFER
vmovdqa XFER, 0*32+_XFER(%rsp, SRND)
DO_4ROUNDS _XFER + 0*32
- vpaddd K256+1*32(SRND), X1, XFER
+ vpaddd 1*32(SRND), X1, XFER
vmovdqa XFER, 1*32+_XFER(%rsp, SRND)
DO_4ROUNDS _XFER + 1*32
add $2*32, SRND
@@ -626,7 +631,7 @@ loop2:
vmovdqa X2, X0
vmovdqa X3, X1
- cmp $4*4*32, SRND
+ cmp INP, SRND
jb loop2
mov _CTX(%rsp), CTX
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 02/27] x86: Use symbol name on bug table for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Replace the %c constraint with %P. The %c is incompatible with PIE
because it implies an immediate value whereas %P reference a symbol.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/include/asm/bug.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 6804d6642767..3d690a4abf50 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -35,7 +35,7 @@ do { \
asm volatile("1:\t" ins "\n" \
".pushsection __bug_table,\"aw\"\n" \
"2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \
- "\t" __BUG_REL(%c0) "\t# bug_entry::file\n" \
+ "\t" __BUG_REL(%P0) "\t# bug_entry::file\n" \
"\t.word %c1" "\t# bug_entry::line\n" \
"\t.word %c2" "\t# bug_entry::flags\n" \
"\t.org 2b+%c3\n" \
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 03/27] x86: Use symbol name in jump table for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Replace the %c constraint with %P. The %c is incompatible with PIE
because it implies an immediate value whereas %P reference a symbol.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/include/asm/jump_label.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/jump_label.h b/arch/x86/include/asm/jump_label.h
index 8c0de4282659..dfdcdc39604a 100644
--- a/arch/x86/include/asm/jump_label.h
+++ b/arch/x86/include/asm/jump_label.h
@@ -37,9 +37,9 @@ static __always_inline bool arch_static_branch(struct static_key *key, bool bran
".byte " __stringify(STATIC_KEY_INIT_NOP) "\n\t"
".pushsection __jump_table, \"aw\" \n\t"
_ASM_ALIGN "\n\t"
- _ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t"
+ _ASM_PTR "1b, %l[l_yes], %P0 \n\t"
".popsection \n\t"
- : : "i" (key), "i" (branch) : : l_yes);
+ : : "X" (&((char *)key)[branch]) : : l_yes);
return false;
l_yes:
@@ -53,9 +53,9 @@ static __always_inline bool arch_static_branch_jump(struct static_key *key, bool
"2:\n\t"
".pushsection __jump_table, \"aw\" \n\t"
_ASM_ALIGN "\n\t"
- _ASM_PTR "1b, %l[l_yes], %c0 + %c1 \n\t"
+ _ASM_PTR "1b, %l[l_yes], %P0 \n\t"
".popsection \n\t"
- : : "i" (key), "i" (branch) : : l_yes);
+ : : "X" (&((char *)key)[branch]) : : l_yes);
return false;
l_yes:
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 04/27] x86: Add macro to get symbol address for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Add a new _ASM_MOVABS macro to fetch a symbol address. It will be used
to replace "_ASM_MOV $<symbol>, %dst" code construct that are not compatible
with PIE.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/include/asm/asm.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 386a6900e206..653d8b82f015 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -30,6 +30,7 @@
#define _ASM_ALIGN __ASM_SEL(.balign 4, .balign 8)
#define _ASM_MOV __ASM_SIZE(mov)
+#define _ASM_MOVABS __ASM_SEL(movl, movabsq)
#define _ASM_INC __ASM_SIZE(inc)
#define _ASM_DEC __ASM_SIZE(dec)
#define _ASM_ADD __ASM_SIZE(add)
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 05/27] x86: relocate_kernel - Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/kernel/relocate_kernel_64.S | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 11eda21eb697..a7227dfe1a2b 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -208,9 +208,11 @@ identity_mapped:
movq %rax, %cr3
lea PAGE_SIZE(%r8), %rsp
call swap_pages
- movq $virtual_mapped, %rax
- pushq %rax
- ret
+ jmp *virtual_mapped_addr(%rip)
+
+ /* Absolute value for PIE support */
+virtual_mapped_addr:
+ .quad virtual_mapped
virtual_mapped:
movq RSP(%r8), %rsp
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 06/27] x86/entry/64: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/entry/entry_64.S | 16 ++++++++++------
arch/x86/kernel/relocate_kernel_64.S | 8 +++-----
2 files changed, 13 insertions(+), 11 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index bd53c57617e6..c53123468364 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -191,7 +191,7 @@ ENTRY(entry_SYSCALL_64_trampoline)
* spill RDI and restore it in a second-stage trampoline.
*/
pushq %rdi
- movq $entry_SYSCALL_64_stage2, %rdi
+ movabsq $entry_SYSCALL_64_stage2, %rdi
JMP_NOSPEC %rdi
END(entry_SYSCALL_64_trampoline)
@@ -1275,7 +1275,8 @@ ENTRY(error_entry)
movl %ecx, %eax /* zero extend */
cmpq %rax, RIP+8(%rsp)
je .Lbstep_iret
- cmpq $.Lgs_change, RIP+8(%rsp)
+ leaq .Lgs_change(%rip), %rcx
+ cmpq %rcx, RIP+8(%rsp)
jne .Lerror_entry_done
/*
@@ -1480,10 +1481,10 @@ ENTRY(nmi)
* resume the outer NMI.
*/
- movq $repeat_nmi, %rdx
+ leaq repeat_nmi(%rip), %rdx
cmpq 8(%rsp), %rdx
ja 1f
- movq $end_repeat_nmi, %rdx
+ leaq end_repeat_nmi(%rip), %rdx
cmpq 8(%rsp), %rdx
ja nested_nmi_out
1:
@@ -1537,7 +1538,8 @@ nested_nmi:
pushq %rdx
pushfq
pushq $__KERNEL_CS
- pushq $repeat_nmi
+ leaq repeat_nmi(%rip), %rdx
+ pushq %rdx
/* Put stack back */
addq $(6*8), %rsp
@@ -1576,7 +1578,9 @@ first_nmi:
addq $8, (%rsp) /* Fix up RSP */
pushfq /* RFLAGS */
pushq $__KERNEL_CS /* CS */
- pushq $1f /* RIP */
+ pushq %rax /* Support Position Independent Code */
+ leaq 1f(%rip), %rax /* RIP */
+ xchgq %rax, (%rsp) /* Restore RAX, put 1f */
iretq /* continues at repeat_nmi below */
UNWIND_HINT_IRET_REGS
1:
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index a7227dfe1a2b..0c0fc259a4e2 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -208,11 +208,9 @@ identity_mapped:
movq %rax, %cr3
lea PAGE_SIZE(%r8), %rsp
call swap_pages
- jmp *virtual_mapped_addr(%rip)
-
- /* Absolute value for PIE support */
-virtual_mapped_addr:
- .quad virtual_mapped
+ movabsq $virtual_mapped, %rax
+ pushq %rax
+ ret
virtual_mapped:
movq RSP(%r8), %rsp
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 07/27] x86: pm-trace - Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change assembly to use the new _ASM_MOVABS macro instead of _ASM_MOV for
the assembly to be PIE compatible.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/include/asm/pm-trace.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/pm-trace.h b/arch/x86/include/asm/pm-trace.h
index bfa32aa428e5..972070806ce9 100644
--- a/arch/x86/include/asm/pm-trace.h
+++ b/arch/x86/include/asm/pm-trace.h
@@ -8,7 +8,7 @@
do { \
if (pm_trace_enabled) { \
const void *tracedata; \
- asm volatile(_ASM_MOV " $1f,%0\n" \
+ asm volatile(_ASM_MOVABS " $1f,%0\n" \
".section .tracedata,\"a\"\n" \
"1:\t.word %c1\n\t" \
_ASM_PTR " %c2\n" \
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 08/27] x86/CPU: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible. Use the new _ASM_MOVABS macro instead of
the 'mov $symbol, %dst' construct.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/include/asm/processor.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b0ccd4847a58..1b9488b1018a 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -50,7 +50,7 @@ static inline void *current_text_addr(void)
{
void *pc;
- asm volatile("mov $1f, %0; 1:":"=r" (pc));
+ asm volatile(_ASM_MOVABS " $1f, %0; 1:":"=r" (pc));
return pc;
}
@@ -710,6 +710,7 @@ static inline void sync_core(void)
: ASM_CALL_CONSTRAINT : : "memory");
#else
unsigned int tmp;
+ unsigned long tmp2;
asm volatile (
UNWIND_HINT_SAVE
@@ -720,11 +721,13 @@ static inline void sync_core(void)
"pushfq\n\t"
"mov %%cs, %0\n\t"
"pushq %q0\n\t"
- "pushq $1f\n\t"
+ "leaq 1f(%%rip), %1\n\t"
+ "pushq %1\n\t"
"iretq\n\t"
UNWIND_HINT_RESTORE
"1:"
- : "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory");
+ : "=&r" (tmp), "=&r" (tmp2), ASM_CALL_CONSTRAINT
+ : : "cc", "memory");
#endif
}
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 09/27] x86/acpi: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/kernel/acpi/wakeup_64.S | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 50b8ed0317a3..472659c0f811 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -14,7 +14,7 @@
* Hooray, we are in Long 64-bit mode (but still running in low memory)
*/
ENTRY(wakeup_long64)
- movq saved_magic, %rax
+ movq saved_magic(%rip), %rax
movq $0x123456789abcdef0, %rdx
cmpq %rdx, %rax
jne bogus_64_magic
@@ -25,14 +25,14 @@ ENTRY(wakeup_long64)
movw %ax, %es
movw %ax, %fs
movw %ax, %gs
- movq saved_rsp, %rsp
+ movq saved_rsp(%rip), %rsp
- movq saved_rbx, %rbx
- movq saved_rdi, %rdi
- movq saved_rsi, %rsi
- movq saved_rbp, %rbp
+ movq saved_rbx(%rip), %rbx
+ movq saved_rdi(%rip), %rdi
+ movq saved_rsi(%rip), %rsi
+ movq saved_rbp(%rip), %rbp
- movq saved_rip, %rax
+ movq saved_rip(%rip), %rax
jmp *%rax
ENDPROC(wakeup_long64)
@@ -45,7 +45,7 @@ ENTRY(do_suspend_lowlevel)
xorl %eax, %eax
call save_processor_state
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq %rsp, pt_regs_sp(%rax)
movq %rbp, pt_regs_bp(%rax)
movq %rsi, pt_regs_si(%rax)
@@ -64,13 +64,14 @@ ENTRY(do_suspend_lowlevel)
pushfq
popq pt_regs_flags(%rax)
- movq $.Lresume_point, saved_rip(%rip)
+ leaq .Lresume_point(%rip), %rax
+ movq %rax, saved_rip(%rip)
- movq %rsp, saved_rsp
- movq %rbp, saved_rbp
- movq %rbx, saved_rbx
- movq %rdi, saved_rdi
- movq %rsi, saved_rsi
+ movq %rsp, saved_rsp(%rip)
+ movq %rbp, saved_rbp(%rip)
+ movq %rbx, saved_rbx(%rip)
+ movq %rdi, saved_rdi(%rip)
+ movq %rsi, saved_rsi(%rip)
addq $8, %rsp
movl $3, %edi
@@ -82,7 +83,7 @@ ENTRY(do_suspend_lowlevel)
.align 4
.Lresume_point:
/* We don't restore %rax, it must be 0 anyway */
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq saved_context_cr4(%rax), %rbx
movq %rbx, %cr4
movq saved_context_cr3(%rax), %rbx
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 10/27] x86/boot/64: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.
Early at boot, the kernel is mapped at a temporary address while preparing
the page table. To know the changes needed for the page table with KASLR,
the boot code calculate the difference between the expected address of the
kernel and the one chosen by KASLR. It does not work with PIE because all
symbols in code are relatives. Instead of getting the future relocated
virtual address, you will get the current temporary mapping. The solution
is using global variables that will be relocated as expected.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/kernel/head_64.S | 26 ++++++++++++++++++++------
1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 48385c1074a5..48652f3ec46a 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -89,8 +89,9 @@ startup_64:
popq %rsi
/* Form the CR3 value being sure to include the CR3 modifier */
- addq $(early_top_pgt - __START_KERNEL_map), %rax
+ addq _early_top_pgt_offset(%rip), %rax
jmp 1f
+
ENTRY(secondary_startup_64)
UNWIND_HINT_EMPTY
/*
@@ -119,7 +120,7 @@ ENTRY(secondary_startup_64)
popq %rsi
/* Form the CR3 value being sure to include the CR3 modifier */
- addq $(init_top_pgt - __START_KERNEL_map), %rax
+ addq _init_top_offset(%rip), %rax
1:
/* Enable PAE mode, PGE and LA57 */
@@ -137,7 +138,7 @@ ENTRY(secondary_startup_64)
movq %rax, %cr3
/* Ensure I am executing from virtual addresses */
- movq $1f, %rax
+ movabs $1f, %rax
ANNOTATE_RETPOLINE_SAFE
jmp *%rax
1:
@@ -234,11 +235,12 @@ ENTRY(secondary_startup_64)
* REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
* address given in m16:64.
*/
- pushq $.Lafter_lret # put return address on stack for unwinder
+ leaq .Lafter_lret(%rip), %rax
+ pushq %rax # put return address on stack for unwinder
xorq %rbp, %rbp # clear frame pointer
- movq initial_code(%rip), %rax
+ leaq initial_code(%rip), %rax
pushq $__KERNEL_CS # set correct cs
- pushq %rax # target address in negative space
+ pushq (%rax) # target address in negative space
lretq
.Lafter_lret:
END(secondary_startup_64)
@@ -342,6 +344,18 @@ END(early_idt_handler_common)
GLOBAL(early_recursion_flag)
.long 0
+ /*
+ * Position Independent Code takes only relative references in code
+ * meaning a global variable address is relative to RIP and not its
+ * future virtual address. Global variables can be used instead as they
+ * are still relocated on the expected kernel mapping address.
+ */
+ .align 8
+_early_top_pgt_offset:
+ .quad early_top_pgt - __START_KERNEL_map
+_init_top_offset:
+ .quad init_top_pgt - __START_KERNEL_map
+
#define NEXT_PAGE(name) \
.balign PAGE_SIZE; \
GLOBAL(name)
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 11/27] x86/power/64: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Change the assembly code to use only relative references of symbols for the
kernel to be PIE compatible.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/power/hibernate_asm_64.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index ce8da3a0412c..6fdd7bbc3c33 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -24,7 +24,7 @@
#include <asm/frame.h>
ENTRY(swsusp_arch_suspend)
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq %rsp, pt_regs_sp(%rax)
movq %rbp, pt_regs_bp(%rax)
movq %rsi, pt_regs_si(%rax)
@@ -115,7 +115,7 @@ ENTRY(restore_registers)
movq %rax, %cr4; # turn PGE back on
/* We don't restore %rax, it must be 0 anyway */
- movq $saved_context, %rax
+ leaq saved_context(%rip), %rax
movq pt_regs_sp(%rax), %rsp
movq pt_regs_bp(%rax), %rbp
movq pt_regs_si(%rax), %rsi
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 12/27] x86/paravirt: Adapt assembly for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
if PIE is enabled, switch the paravirt assembly constraints to be
compatible. The %c/i constrains generate smaller code so is kept by
default.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/include/asm/paravirt_types.h | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 180bc0bff0fb..140747a98d94 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -337,9 +337,17 @@ extern struct pv_lock_ops pv_lock_ops;
#define PARAVIRT_PATCH(x) \
(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
+#ifdef CONFIG_X86_PIE
+#define paravirt_opptr_call "a"
+#define paravirt_opptr_type "p"
+#else
+#define paravirt_opptr_call "c"
+#define paravirt_opptr_type "i"
+#endif
+
#define paravirt_type(op) \
[paravirt_typenum] "i" (PARAVIRT_PATCH(op)), \
- [paravirt_opptr] "i" (&(op))
+ [paravirt_opptr] paravirt_opptr_type (&(op))
#define paravirt_clobber(clobber) \
[paravirt_clobber] "i" (clobber)
@@ -395,7 +403,7 @@ int paravirt_disable_iospace(void);
*/
#define PARAVIRT_CALL \
ANNOTATE_RETPOLINE_SAFE \
- "call *%c[paravirt_opptr];"
+ "call *%" paravirt_opptr_call "[paravirt_opptr];"
/*
* These macros are intended to wrap calls through one of the paravirt
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 13/27] x86/boot/64: Build head64.c as mcmodel large when PIE is enabled
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
The __startup_64 function assumes all symbols have relocated addresses
instead of the current boot virtual address. PIE generated code favor
relative addresses making all virtual and physical address math incorrect.
If PIE is enabled, build head64.c as mcmodel large instead to ensure absolute
references on all memory access. Add a global __force_order variable required
when using a large model with read_cr* functions.
To build head64.c as mcmodel=large, disable the retpoline gcc flags.
This code is used at early boot and removed later, it doesn't need
retpoline mitigation.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/kernel/Makefile | 6 ++++++
arch/x86/kernel/head64.c | 3 +++
2 files changed, 9 insertions(+)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 29786c87e864..1ff6be34de66 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -22,6 +22,12 @@ CFLAGS_REMOVE_early_printk.o = -pg
CFLAGS_REMOVE_head64.o = -pg
endif
+ifdef CONFIG_X86_PIE
+# Remove PIE and retpoline flags that are incompatible with mcmodel=large
+CFLAGS_REMOVE_head64.o += -fPIE -mindirect-branch=thunk-extern -mindirect-branch-register
+CFLAGS_head64.o = -mcmodel=large
+endif
+
KASAN_SANITIZE_head$(BITS).o := n
KASAN_SANITIZE_dumpstack.o := n
KASAN_SANITIZE_dumpstack_$(BITS).o := n
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 0c855deee165..2fe60e661227 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -64,6 +64,9 @@ EXPORT_SYMBOL(vmemmap_base);
#define __head __section(.head.text)
+/* Required for read_cr3 when building as PIE */
+unsigned long __force_order;
+
static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
{
return ptr - (void *)_text + (void *)physaddr;
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 14/27] x86/percpu: Adapt percpu for PIE support
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Perpcu uses a clever design where the .percu ELF section has a virtual
address of zero and the relocation code avoid relocating specific
symbols. It makes the code simple and easily adaptable with or without
SMP support.
This design is incompatible with PIE because generated code always try to
access the zero virtual address relative to the default mapping address.
It becomes impossible when KASLR is configured to go below -2G. This
patch solves this problem by removing the zero mapping and adapting the GS
base to be relative to the expected address. These changes are done only
when PIE is enabled. The original implementation is kept as-is
by default.
The assembly and PER_CPU macros are changed to use relative references
when PIE is enabled.
The KALLSYMS_ABSOLUTE_PERCPU configuration is disabled with PIE given
percpu symbols are not absolute in this case.
Position Independent Executable (PIE) support will allow to extended the
KASLR randomization range below the -2G memory limit.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/entry/calling.h | 2 +-
arch/x86/entry/entry_64.S | 4 ++--
arch/x86/include/asm/percpu.h | 25 +++++++++++++++++++------
arch/x86/kernel/cpu/common.c | 4 +++-
arch/x86/kernel/head_64.S | 4 ++++
arch/x86/kernel/setup_percpu.c | 2 +-
arch/x86/kernel/vmlinux.lds.S | 13 +++++++++++--
arch/x86/lib/cmpxchg16b_emu.S | 8 ++++----
arch/x86/xen/xen-asm.S | 12 ++++++------
init/Kconfig | 2 +-
10 files changed, 52 insertions(+), 24 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index be63330c5511..7b48d97111b5 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -216,7 +216,7 @@ For 32-bit we have the following conventions - kernel is built with
.endm
#define THIS_CPU_user_pcid_flush_mask \
- PER_CPU_VAR(cpu_tlbstate) + TLB_STATE_user_pcid_flush_mask
+ PER_CPU_VAR(cpu_tlbstate + TLB_STATE_user_pcid_flush_mask)
.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index c53123468364..d34794368e20 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -358,7 +358,7 @@ ENTRY(__switch_to_asm)
#ifdef CONFIG_CC_STACKPROTECTOR
movq TASK_stack_canary(%rsi), %rbx
- movq %rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
+ movq %rbx, PER_CPU_VAR(irq_stack_union + stack_canary_offset)
#endif
#ifdef CONFIG_RETPOLINE
@@ -896,7 +896,7 @@ apicinterrupt IRQ_WORK_VECTOR irq_work_interrupt smp_irq_work_interrupt
/*
* Exception entry points.
*/
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8)
+#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw + (TSS_ist + ((x) - 1) * 8))
.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
ENTRY(\sym)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index a06b07399d17..7d1271b536ea 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -5,9 +5,11 @@
#ifdef CONFIG_X86_64
#define __percpu_seg gs
#define __percpu_mov_op movq
+#define __percpu_rel (%rip)
#else
#define __percpu_seg fs
#define __percpu_mov_op movl
+#define __percpu_rel
#endif
#ifdef __ASSEMBLY__
@@ -28,10 +30,14 @@
#define PER_CPU(var, reg) \
__percpu_mov_op %__percpu_seg:this_cpu_off, reg; \
lea var(reg), reg
-#define PER_CPU_VAR(var) %__percpu_seg:var
+/* Compatible with Position Independent Code */
+#define PER_CPU_VAR(var) %__percpu_seg:(var)##__percpu_rel
+/* Rare absolute reference */
+#define PER_CPU_VAR_ABS(var) %__percpu_seg:var
#else /* ! SMP */
#define PER_CPU(var, reg) __percpu_mov_op $var, reg
-#define PER_CPU_VAR(var) var
+#define PER_CPU_VAR(var) (var)##__percpu_rel
+#define PER_CPU_VAR_ABS(var) var
#endif /* SMP */
#ifdef CONFIG_X86_64_SMP
@@ -209,27 +215,34 @@ do { \
pfo_ret__; \
})
+/* Position Independent code uses relative addresses only */
+#ifdef CONFIG_X86_PIE
+#define __percpu_stable_arg __percpu_arg(a1)
+#else
+#define __percpu_stable_arg __percpu_arg(P1)
+#endif
+
#define percpu_stable_op(op, var) \
({ \
typeof(var) pfo_ret__; \
switch (sizeof(var)) { \
case 1: \
- asm(op "b "__percpu_arg(P1)",%0" \
+ asm(op "b "__percpu_stable_arg ",%0" \
: "=q" (pfo_ret__) \
: "p" (&(var))); \
break; \
case 2: \
- asm(op "w "__percpu_arg(P1)",%0" \
+ asm(op "w "__percpu_stable_arg ",%0" \
: "=r" (pfo_ret__) \
: "p" (&(var))); \
break; \
case 4: \
- asm(op "l "__percpu_arg(P1)",%0" \
+ asm(op "l "__percpu_stable_arg ",%0" \
: "=r" (pfo_ret__) \
: "p" (&(var))); \
break; \
case 8: \
- asm(op "q "__percpu_arg(P1)",%0" \
+ asm(op "q "__percpu_stable_arg ",%0" \
: "=r" (pfo_ret__) \
: "p" (&(var))); \
break; \
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 348cf4821240..0e9a87a34f76 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -487,7 +487,9 @@ void load_percpu_segment(int cpu)
loadsegment(fs, __KERNEL_PERCPU);
#else
__loadsegment_simple(gs, 0);
- wrmsrl(MSR_GS_BASE, (unsigned long)per_cpu(irq_stack_union.gs_base, cpu));
+ wrmsrl(MSR_GS_BASE,
+ (unsigned long)per_cpu(irq_stack_union.gs_base, cpu) -
+ (unsigned long)__per_cpu_start);
#endif
load_stack_canary_segment();
}
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 48652f3ec46a..87624e1fe22f 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -266,7 +266,11 @@ ENDPROC(start_cpu0)
GLOBAL(initial_code)
.quad x86_64_start_kernel
GLOBAL(initial_gs)
+#ifdef CONFIG_X86_PIE
+ .quad 0
+#else
.quad INIT_PER_CPU_VAR(irq_stack_union)
+#endif
GLOBAL(initial_stack)
/*
* The SIZEOF_PTREGS gap is a convention which helps the in-kernel
diff --git a/arch/x86/kernel/setup_percpu.c b/arch/x86/kernel/setup_percpu.c
index ea554f812ee1..f7655979f19a 100644
--- a/arch/x86/kernel/setup_percpu.c
+++ b/arch/x86/kernel/setup_percpu.c
@@ -26,7 +26,7 @@
DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);
EXPORT_PER_CPU_SYMBOL(cpu_number);
-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && !defined(CONFIG_X86_PIE)
#define BOOT_PERCPU_OFFSET ((unsigned long)__per_cpu_load)
#else
#define BOOT_PERCPU_OFFSET 0
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1c43a2e839fa..4a5957900dcc 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -209,9 +209,14 @@ SECTIONS
/*
* percpu offsets are zero-based on SMP. PERCPU_VADDR() changes the
* output PHDR, so the next output section - .init.text - should
- * start another segment - init.
+ * start another segment - init. For Position Independent Code, the
+ * per-cpu section cannot be zero-based because everything is relative.
*/
+#ifdef CONFIG_X86_PIE
+ PERCPU_SECTION(INTERNODE_CACHE_BYTES)
+#else
PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
+#endif
ASSERT(SIZEOF(.data..percpu) < CONFIG_PHYSICAL_START,
"per-CPU data too large - increase CONFIG_PHYSICAL_START")
#endif
@@ -387,7 +392,11 @@ SECTIONS
* Per-cpu symbols which need to be offset from __per_cpu_load
* for the boot processor.
*/
+#ifdef CONFIG_X86_PIE
+#define INIT_PER_CPU(x) init_per_cpu__##x = x
+#else
#define INIT_PER_CPU(x) init_per_cpu__##x = x + __per_cpu_load
+#endif
INIT_PER_CPU(gdt_page);
INIT_PER_CPU(irq_stack_union);
@@ -397,7 +406,7 @@ INIT_PER_CPU(irq_stack_union);
. = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
"kernel image bigger than KERNEL_IMAGE_SIZE");
-#ifdef CONFIG_SMP
+#if defined(CONFIG_SMP) && !defined(CONFIG_X86_PIE)
. = ASSERT((irq_stack_union == 0),
"irq_stack_union is not at start of per-cpu area");
#endif
diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S
index 9b330242e740..254950604ae4 100644
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -33,13 +33,13 @@ ENTRY(this_cpu_cmpxchg16b_emu)
pushfq
cli
- cmpq PER_CPU_VAR((%rsi)), %rax
+ cmpq PER_CPU_VAR_ABS((%rsi)), %rax
jne .Lnot_same
- cmpq PER_CPU_VAR(8(%rsi)), %rdx
+ cmpq PER_CPU_VAR_ABS(8(%rsi)), %rdx
jne .Lnot_same
- movq %rbx, PER_CPU_VAR((%rsi))
- movq %rcx, PER_CPU_VAR(8(%rsi))
+ movq %rbx, PER_CPU_VAR_ABS((%rsi))
+ movq %rcx, PER_CPU_VAR_ABS(8(%rsi))
popfq
mov $1, %al
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 8019edd0125c..a5d73d3218be 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -21,7 +21,7 @@
ENTRY(xen_irq_enable_direct)
FRAME_BEGIN
/* Unmask events */
- movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ movb $0, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
/*
* Preempt here doesn't matter because that will deal with any
@@ -30,7 +30,7 @@ ENTRY(xen_irq_enable_direct)
*/
/* Test for pending */
- testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+ testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
jz 1f
call check_events
@@ -45,7 +45,7 @@ ENTRY(xen_irq_enable_direct)
* non-zero.
*/
ENTRY(xen_irq_disable_direct)
- movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ movb $1, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
ret
ENDPROC(xen_irq_disable_direct)
@@ -59,7 +59,7 @@ ENDPROC(xen_irq_disable_direct)
* x86 use opposite senses (mask vs enable).
*/
ENTRY(xen_save_fl_direct)
- testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
setz %ah
addb %ah, %ah
ret
@@ -80,7 +80,7 @@ ENTRY(xen_restore_fl_direct)
#else
testb $X86_EFLAGS_IF>>8, %ah
#endif
- setz PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ setz PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
/*
* Preempt here doesn't matter because that will deal with any
* pending interrupts. The pending check may end up being run
@@ -88,7 +88,7 @@ ENTRY(xen_restore_fl_direct)
*/
/* check for unmasked and pending */
- cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+ cmpw $0x0001, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
jnz 1f
call check_events
1:
diff --git a/init/Kconfig b/init/Kconfig
index 221cac95044f..acc9087546ac 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1365,7 +1365,7 @@ config KALLSYMS_ALL
config KALLSYMS_ABSOLUTE_PERCPU
bool
depends on KALLSYMS
- default X86_64 && SMP
+ default X86_64 && SMP && !X86_PIE
config KALLSYMS_BASE_RELATIVE
bool
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 15/27] compiler: Option to default to hidden symbols
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Provide an option to default visibility to hidden except for key
symbols. This option is disabled by default and will be used by x86_64
PIE support to remove errors between compilation units.
The default visibility is also enabled for external symbols that are
compared as they maybe equals (start/end of sections). In this case,
older versions of GCC will remove the comparison if the symbols are
hidden. This issue exists at least on gcc 4.9 and before.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
arch/x86/boot/boot.h | 2 +-
arch/x86/include/asm/setup.h | 2 +-
arch/x86/kernel/cpu/microcode/core.c | 4 ++--
drivers/base/firmware_class.c | 4 ++--
include/asm-generic/sections.h | 6 ++++++
include/linux/compiler.h | 7 +++++++
init/Kconfig | 7 +++++++
kernel/kallsyms.c | 16 ++++++++--------
kernel/trace/trace.h | 4 ++--
lib/dynamic_debug.c | 4 ++--
10 files changed, 38 insertions(+), 18 deletions(-)
diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index ef5a9cc66fb8..d726c35bdd96 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -193,7 +193,7 @@ static inline bool memcmp_gs(const void *s1, addr_t s2, size_t len)
}
/* Heap -- available for dynamic lists. */
-extern char _end[];
+extern char _end[] __default_visibility;
extern char *HEAP;
extern char *heap_end;
#define RESET_HEAP() ((void *)( HEAP = _end ))
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 3108e297d87d..dfba64fe1c7e 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -70,7 +70,7 @@ static inline void x86_ce4100_early_setup(void) { }
* This is set up by the setup-routine at boot-time
*/
extern struct boot_params boot_params;
-extern char _text[];
+extern char _text[] __default_visibility;
static inline bool kaslr_enabled(void)
{
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index aa1b9a422f2b..ed5675db6e82 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -141,8 +141,8 @@ static bool __init check_loader_disabled_bsp(void)
return *res;
}
-extern struct builtin_fw __start_builtin_fw[];
-extern struct builtin_fw __end_builtin_fw[];
+extern struct builtin_fw __start_builtin_fw[] __default_visibility;
+extern struct builtin_fw __end_builtin_fw[] __default_visibility;
bool get_builtin_firmware(struct cpio_data *cd, const char *name)
{
diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
index 7dd36ace6152..939a1952d0ab 100644
--- a/drivers/base/firmware_class.c
+++ b/drivers/base/firmware_class.c
@@ -136,8 +136,8 @@ static struct firmware_cache fw_cache;
#ifdef CONFIG_FW_LOADER
-extern struct builtin_fw __start_builtin_fw[];
-extern struct builtin_fw __end_builtin_fw[];
+extern struct builtin_fw __start_builtin_fw[] __default_visibility;
+extern struct builtin_fw __end_builtin_fw[] __default_visibility;
static void fw_copy_to_prealloc_buf(struct firmware *fw,
void *buf, size_t size)
diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index 849cd8eb5ca0..0a0e23405ddd 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -32,6 +32,9 @@
* __softirqentry_text_start, __softirqentry_text_end
* __start_opd, __end_opd
*/
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility push(default)
+#endif
extern char _text[], _stext[], _etext[];
extern char _data[], _sdata[], _edata[];
extern char __bss_start[], __bss_stop[];
@@ -49,6 +52,9 @@ extern char __start_once[], __end_once[];
/* Start and end of .ctors section - used for constructor calls. */
extern char __ctors_start[], __ctors_end[];
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility pop
+#endif
/* Start and end of .opd section - used for function descriptors. */
extern char __start_opd[], __end_opd[];
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index ab4711c63601..a9ac84e37af9 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -278,6 +278,13 @@ unsigned long read_word_at_a_time(const void *addr)
__u.__val; \
})
+#ifdef CONFIG_DEFAULT_HIDDEN
+#pragma GCC visibility push(hidden)
+#define __default_visibility __attribute__((visibility ("default")))
+#else
+#define __default_visibility
+#endif
+
#endif /* __KERNEL__ */
#endif /* __ASSEMBLY__ */
diff --git a/init/Kconfig b/init/Kconfig
index acc9087546ac..c924babc6d47 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1667,6 +1667,13 @@ config PROFILING
config TRACEPOINTS
bool
+#
+# Default to hidden visibility for all symbols.
+# Useful for Position Independent Code to reduce global references.
+#
+config DEFAULT_HIDDEN
+ bool
+
source "arch/Kconfig"
endmenu # General setup
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index a23e21ada81b..f4e58b7a6daf 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -29,24 +29,24 @@
* These will be re-linked against their real values
* during the second link stage.
*/
-extern const unsigned long kallsyms_addresses[] __weak;
-extern const int kallsyms_offsets[] __weak;
-extern const u8 kallsyms_names[] __weak;
+extern const unsigned long kallsyms_addresses[] __weak __default_visibility;
+extern const int kallsyms_offsets[] __weak __default_visibility;
+extern const u8 kallsyms_names[] __weak __default_visibility;
/*
* Tell the compiler that the count isn't in the small data section if the arch
* has one (eg: FRV).
*/
extern const unsigned long kallsyms_num_syms
-__attribute__((weak, section(".rodata")));
+__attribute__((weak, section(".rodata"))) __default_visibility;
extern const unsigned long kallsyms_relative_base
-__attribute__((weak, section(".rodata")));
+__attribute__((weak, section(".rodata"))) __default_visibility;
-extern const u8 kallsyms_token_table[] __weak;
-extern const u16 kallsyms_token_index[] __weak;
+extern const u8 kallsyms_token_table[] __weak __default_visibility;
+extern const u16 kallsyms_token_index[] __weak __default_visibility;
-extern const unsigned long kallsyms_markers[] __weak;
+extern const unsigned long kallsyms_markers[] __weak __default_visibility;
/*
* Expand a compressed symbol data into the resulting uncompressed string,
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 2a6d0325a761..5aebb0dcecba 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1741,8 +1741,8 @@ extern int trace_event_enable_disable(struct trace_event_file *file,
int enable, int soft_disable);
extern int tracing_alloc_snapshot(void);
-extern const char *__start___trace_bprintk_fmt[];
-extern const char *__stop___trace_bprintk_fmt[];
+extern const char *__start___trace_bprintk_fmt[] __default_visibility;
+extern const char *__stop___trace_bprintk_fmt[] __default_visibility;
extern const char *__start___tracepoint_str[];
extern const char *__stop___tracepoint_str[];
diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
index c7c96bc7654a..40b752b53627 100644
--- a/lib/dynamic_debug.c
+++ b/lib/dynamic_debug.c
@@ -37,8 +37,8 @@
#include <linux/device.h>
#include <linux/netdevice.h>
-extern struct _ddebug __start___verbose[];
-extern struct _ddebug __stop___verbose[];
+extern struct _ddebug __start___verbose[] __default_visibility;
+extern struct _ddebug __stop___verbose[] __default_visibility;
struct ddebug_table {
struct list_head link;
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
* [PATCH v2 16/27] compiler: Option to add PROVIDE_HIDDEN replacement for weak symbols
From: Thomas Garnier via Virtualization @ 2018-03-13 20:59 UTC (permalink / raw)
To: Herbert Xu, David S . Miller, Thomas Gleixner, Ingo Molnar,
H . Peter Anvin, Peter Zijlstra, Josh Poimboeuf,
Greg Kroah-Hartman, Kate Stewart, Thomas Garnier, Arnd Bergmann,
Philippe Ombredanne, Arnaldo Carvalho de Melo, Andrey Ryabinin,
Matthias Kaehlcke, Kees Cook, Tom Lendacky, Kirill A . Shutemov,
Andy Lutomirski, Dominik Brodowski, Borislav Petkov,
Borislav Petkov, Raf
Cc: linux-arch, kvm, linux-pm, x86, linux-doc, linux-kernel,
virtualization, linux-sparse, linux-crypto, kernel-hardening,
xen-devel
In-Reply-To: <20180313205945.245105-1-thgarnie@google.com>
Provide an option to have a PROVIDE_HIDDEN (linker script) entry for
each weak symbol. This option solve an error in x86_64 where the linker
optimizes pie generate code to be non-pie because --emit-relocs was used
instead of -pie (to reduce dynamic relocations).
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
init/Kconfig | 7 +++++++
scripts/link-vmlinux.sh | 14 ++++++++++++++
2 files changed, 21 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index c924babc6d47..fe9f9ada4db0 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1927,6 +1927,13 @@ config ASN1
inform it as to what tags are to be expected in a stream and what
functions to call on what tags.
+config WEAK_PROVIDE_HIDDEN
+ bool
+ help
+ Generate linker script PROVIDE_HIDDEN entries for all weak symbols. It
+ allows to prevent non-pie code being replaced by the linker if the
+ emit-relocs option is used instead of pie (useful for x86_64 pie).
+
source "kernel/Kconfig.locks"
config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 08ca08e9105c..c015f5142ecf 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -146,6 +146,17 @@ kallsyms()
${CC} ${aflags} -c -o ${2} ${afile}
}
+gen_weak_provide_hidden()
+{
+ if [ -n "${CONFIG_WEAK_PROVIDE_HIDDEN}" ]; then
+ local pattern="s/^\s\+ w \(\w\+\)$/PROVIDE_HIDDEN(\1 = .);/gp"
+ echo -e "SECTIONS {\n. = _end;" > .tmp_vmlinux_hiddenld
+ ${NM} ${1} | sed -n "${pattern}" >> .tmp_vmlinux_hiddenld
+ echo "}" >> .tmp_vmlinux_hiddenld
+ LDFLAGS_vmlinux="${LDFLAGS_vmlinux} -T .tmp_vmlinux_hiddenld"
+ fi
+}
+
# Create map file with all symbols from ${1}
# See mksymap for additional details
mksysmap()
@@ -230,6 +241,9 @@ modpost_link vmlinux.o
# modpost vmlinux.o to check for section mismatches
${MAKE} -f "${srctree}/scripts/Makefile.modpost" vmlinux.o
+# Generate weak linker script
+gen_weak_provide_hidden vmlinux.o
+
kallsymso=""
kallsyms_vmlinux=""
if [ -n "${CONFIG_KALLSYMS}" ]; then
--
2.16.2.660.g709887971b-goog
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox