From: Matthias Kaehlcke <mka@chromium.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: "Andrey Ryabinin" <ryabinin.a.a@gmail.com>,
"Chris J Arges" <chris.j.arges@canonical.com>,
"Borislav Petkov" <bp@suse.de>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Ingo Molnar" <mingo@redhat.com>,
"H . Peter Anvin" <hpa@zytor.com>,
"x86@kernel.org" <x86@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
"Douglas Anderson" <dianders@chromium.org>,
"Michael Davidson" <md@google.com>,
"Greg Hackmann" <ghackmann@google.com>,
"Nick Desaulniers" <ndesaulniers@google.com>,
"Stephen Hines" <srhines@google.com>,
"Kees Cook" <keescook@chromium.org>,
"Arnd Bergmann" <arnd@arndb.de>,
"Bernhard Rosenkränzer" <Bernhard.Rosenkranzer@linaro.org>
Subject: Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm"
Date: Fri, 28 Jul 2017 18:06:32 -0700 [thread overview]
Message-ID: <20170729010632.GE84665@google.com> (raw)
In-Reply-To: <20170729005521.2lhekao5fvb452oy@treble>
El Fri, Jul 28, 2017 at 07:55:21PM -0500 Josh Poimboeuf ha dit:
> On Fri, Jul 28, 2017 at 05:38:52PM -0700, Matthias Kaehlcke wrote:
> > El Thu, Jul 20, 2017 at 03:56:52PM -0500 Josh Poimboeuf ha dit:
> >
> > > On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote:
> > > > FWIW bellow is my understanding of what's going on.
> > > >
> > > > It seems clang treats local named register almost the same as ordinary
> > > > local variables.
> > > > The only difference is that before reading the register variable clang
> > > > puts variable's value into the specified register.
> > > >
> > > > So clang just assigns stack slot for the variable __sp where it's
> > > > going to keep variable's value.
> > > > But since __sp is unitialized (we haven't assign anything to it), the
> > > > value of the __sp is some garbage from stack.
> > > > inline asm specifies __sp as input, so clang assumes that it have to
> > > > load __sp into 'rsp' because inline asm is going to use
> > > > it. And it just loads garbage from stack into 'rsp'
> > > >
> > > > In fact, such behavior (I mean storing the value on stack and loading
> > > > into reg before the use) is very useful.
> > > > Clang's behavior allows to keep the value assigned to the
> > > > call-clobbered register across the function calls.
> > > >
> > > > Unlike clang, gcc assigns value to the register right away and doesn't
> > > > store the value anywhere else. So if the reg is
> > > > call clobbered register you have to be absolutely sure that there is
> > > > no subsequent function call that might clobber the register.
> > > >
> > > > E.g. see some real examples
> > > > https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm:
> > > > Fix improper register assignment").
> > > > These bugs shouldn't happen with clang.
> > > >
> > > > But the global named register works slightly differently in clang. For
> > > > the global, the value is just the value of the register itself,
> > > > whatever it is. Read/write from global named register is just like
> > > > direct read/write to the register
> > >
> > > Thanks, that clears up a lot of the confusion for me.
> > >
> > > Still, unfortunately, I don't think that's going to work for GCC.
> > > Changing the '__sp' register variable to global in the header file
> > > causes it to make a *bunch* of changes across the kernel, even in
> > > functions which don't do inline asm. It seems to be disabling some
> > > optimizations across the board.
> > >
> > > I do have another idea, which is to replace all uses of
> > >
> > > asm(" ... call foo ... " : outputs : inputs : clobbers);
> > >
> > > with a new ASM_CALL macro:
> > >
> > > ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers);
> > >
> > > Then the compiler differences can be abstracted out, with GCC adding
> > > "sp" as an output constraint and clang doing nothing (for now).
> >
> > The idea sounds interesting, however I see two issues with ASM_CALL():
> >
> > Inline assembly expects the individual elements of outputs, inputs and
> > clobbers to be comma separated, and so does the macro for it's
> > parameters.
>
> I think this isn't a problem, see the (untested and unfinished) patch
> below for an idea of how the arguments can be handled.
Good point!
> > The assembler template can refer to the position of output and input
> > operands, adding "sp" as output changes the positions of the inputs
> > wrt to the clang version.
>
> True, though I think we could convert all ASM_CALL users to use named
> operands instead of the (bug-prone) numbered ones. It would be an
> improvement regardless.
Agreed, that sounds like a good solution and a desirable improvement.
> > Not sure how to move forward from here. Not even using an ugly #ifdef
> > seems to be a halfway reasonable solution, since get_user() isn't the
> > only place using this construct and #ifdefs would result in highly
> > redundant macros in multiple places.
>
> I still think ASM_CALL might work. The below patch is what I came up
> with last week before I got pulled into some other work. I'll be out on
> vacation next week but I hope to finish the patch after that.
Thanks for working on this.
Enjoy your vacation!
Matthias
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index 1b020381ab38..5da4bcbeebab 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -216,14 +216,14 @@ static inline int alternatives_text_reserved(void *start, void *end)
> * Otherwise, old function is used.
> */
> #define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2, \
> - output, input...) \
> + output, input, clobbers...) \
> { \
> - register void *__sp asm(_ASM_SP); \
> - asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
> - "call %P[new2]", feature2) \
> - : output, "+r" (__sp) \
> - : [old] "i" (oldfunc), [new1] "i" (newfunc1), \
> - [new2] "i" (newfunc2), ## input); \
> + ASM_CALL(ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1, \
> + "call %P[new2]", feature2), \
> + ARGS(output), \
> + ARGS([old] "i" (oldfunc), [new1] "i" (newfunc1), \
> + [new2] "i" (newfunc2), ARGS(input)), \
> + ## clobbers); \
> }
>
> /*
> diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
> index b4a0d43248cf..21adcc8e739f 100644
> --- a/arch/x86/include/asm/page_64.h
> +++ b/arch/x86/include/asm/page_64.h
> @@ -45,8 +45,8 @@ static inline void clear_page(void *page)
> clear_page_rep, X86_FEATURE_REP_GOOD,
> clear_page_erms, X86_FEATURE_ERMS,
> "=D" (page),
> - "0" (page)
> - : "memory", "rax", "rcx");
> + "0" (page),
> + ARGS("memory", "rax", "rcx"));
> }
>
> void copy_page(void *to, void *from);
> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
> index cb976bab6299..2a585b799a3e 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -471,8 +471,7 @@ int paravirt_disable_iospace(void);
> */
> #ifdef CONFIG_X86_32
> #define PVOP_VCALL_ARGS \
> - unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx; \
> - register void *__sp asm("esp")
> + unsigned long __eax = __eax, __edx = __edx, __ecx = __ecx;
> #define PVOP_CALL_ARGS PVOP_VCALL_ARGS
>
> #define PVOP_CALL_ARG1(x) "a" ((unsigned long)(x))
> @@ -492,8 +491,7 @@ int paravirt_disable_iospace(void);
> /* [re]ax isn't an arg, but the return val */
> #define PVOP_VCALL_ARGS \
> unsigned long __edi = __edi, __esi = __esi, \
> - __edx = __edx, __ecx = __ecx, __eax = __eax; \
> - register void *__sp asm("rsp")
> + __edx = __edx, __ecx = __ecx, __eax = __eax;
> #define PVOP_CALL_ARGS PVOP_VCALL_ARGS
>
> #define PVOP_CALL_ARG1(x) "D" ((unsigned long)(x))
> @@ -541,24 +539,24 @@ int paravirt_disable_iospace(void);
> /* This is 32-bit specific, but is okay in 64-bit */ \
> /* since this condition will never hold */ \
> if (sizeof(rettype) > sizeof(unsigned long)) { \
> - asm volatile(pre \
> - paravirt_alt(PARAVIRT_CALL) \
> - post \
> - : call_clbr, "+r" (__sp) \
> - : paravirt_type(op), \
> - paravirt_clobber(clbr), \
> - ##__VA_ARGS__ \
> - : "memory", "cc" extra_clbr); \
> + ASM_CALL(pre \
> + paravirt_alt(PARAVIRT_CALL) \
> + post, \
> + ARGS(call_clbr), \
> + ARGS(paravirt_type(op), \
> + paravirt_clobber(clbr), \
> + ##__VA_ARGS__), \
> + ARGS("memory", "cc" extra_clbr)); \
> __ret = (rettype)((((u64)__edx) << 32) | __eax); \
> } else { \
> - asm volatile(pre \
> - paravirt_alt(PARAVIRT_CALL) \
> - post \
> - : call_clbr, "+r" (__sp) \
> - : paravirt_type(op), \
> - paravirt_clobber(clbr), \
> - ##__VA_ARGS__ \
> - : "memory", "cc" extra_clbr); \
> + ASM_CALL(pre \
> + paravirt_alt(PARAVIRT_CALL) \
> + post, \
> + ARGS(call_clbr), \
> + ARGS(paravirt_type(op), \
> + paravirt_clobber(clbr), \
> + ##__VA_ARGS__), \
> + ARGS("memory", "cc" extra_clbr)); \
> __ret = (rettype)(__eax & PVOP_RETMASK(rettype)); \
> } \
> __ret; \
> @@ -578,14 +576,14 @@ int paravirt_disable_iospace(void);
> ({ \
> PVOP_VCALL_ARGS; \
> PVOP_TEST_NULL(op); \
> - asm volatile(pre \
> - paravirt_alt(PARAVIRT_CALL) \
> - post \
> - : call_clbr, "+r" (__sp) \
> - : paravirt_type(op), \
> - paravirt_clobber(clbr), \
> - ##__VA_ARGS__ \
> - : "memory", "cc" extra_clbr); \
> + ASM_CALL(pre \
> + paravirt_alt(PARAVIRT_CALL) \
> + post, \
> + ARGS(call_clbr), \
> + ARGS(paravirt_type(op), \
> + paravirt_clobber(clbr), \
> + ##__VA_ARGS__), \
> + ARGS("memory", "cc" extra_clbr)); \
> })
>
> #define __PVOP_VCALL(op, pre, post, ...) \
> diff --git a/arch/x86/include/asm/preempt.h b/arch/x86/include/asm/preempt.h
> index ec1f3c651150..19a034cbde37 100644
> --- a/arch/x86/include/asm/preempt.h
> +++ b/arch/x86/include/asm/preempt.h
> @@ -102,16 +102,14 @@ static __always_inline bool should_resched(int preempt_offset)
> extern asmlinkage void ___preempt_schedule(void);
> # define __preempt_schedule() \
> ({ \
> - register void *__sp asm(_ASM_SP); \
> - asm volatile ("call ___preempt_schedule" : "+r"(__sp)); \
> + ASM_CALL("call ___preempt_schedule",,); \
> })
>
> extern asmlinkage void preempt_schedule(void);
> extern asmlinkage void ___preempt_schedule_notrace(void);
> # define __preempt_schedule_notrace() \
> ({ \
> - register void *__sp asm(_ASM_SP); \
> - asm volatile ("call ___preempt_schedule_notrace" : "+r"(__sp)); \
> + ASM_CALL("call ___preempt_schedule_notrace",,); \
> })
> extern asmlinkage void preempt_schedule_notrace(void);
> #endif
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 550bc4ab0df4..2bbfb777c8bb 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -677,20 +677,19 @@ static inline void sync_core(void)
> * Like all of Linux's memory ordering operations, this is a
> * compiler barrier as well.
> */
> - register void *__sp asm(_ASM_SP);
>
> #ifdef CONFIG_X86_32
> - asm volatile (
> + ASM_CALL(
> "pushfl\n\t"
> "pushl %%cs\n\t"
> "pushl $1f\n\t"
> "iret\n\t"
> - "1:"
> - : "+r" (__sp) : : "memory");
> + "1:",
> + , , "memory");
> #else
> unsigned int tmp;
>
> - asm volatile (
> + ASM_CALL(
> UNWIND_HINT_SAVE
> "mov %%ss, %0\n\t"
> "pushq %q0\n\t"
> @@ -702,8 +701,8 @@ static inline void sync_core(void)
> "pushq $1f\n\t"
> "iretq\n\t"
> UNWIND_HINT_RESTORE
> - "1:"
> - : "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
> + "1:",
> + "=&r" (tmp), , ARGS("cc", "memory"));
> #endif
> }
>
> diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
> index a34e0d4b957d..9d20f4fb5d7c 100644
> --- a/arch/x86/include/asm/rwsem.h
> +++ b/arch/x86/include/asm/rwsem.h
> @@ -99,25 +99,24 @@ static inline bool __down_read_trylock(struct rw_semaphore *sem)
> /*
> * lock for writing
> */
> -#define ____down_write(sem, slow_path) \
> -({ \
> - long tmp; \
> - struct rw_semaphore* ret; \
> - register void *__sp asm(_ASM_SP); \
> - \
> - asm volatile("# beginning down_write\n\t" \
> - LOCK_PREFIX " xadd %1,(%4)\n\t" \
> - /* adds 0xffff0001, returns the old value */ \
> - " test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \
> - /* was the active mask 0 before? */\
> - " jz 1f\n" \
> - " call " slow_path "\n" \
> - "1:\n" \
> - "# ending down_write" \
> - : "+m" (sem->count), "=d" (tmp), "=a" (ret), "+r" (__sp) \
> - : "a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS) \
> - : "memory", "cc"); \
> - ret; \
> +#define ____down_write(sem, slow_path) \
> +({ \
> + long tmp; \
> + struct rw_semaphore* ret; \
> + \
> + ASM_CALL("# beginning down_write\n\t" \
> + LOCK_PREFIX " xadd %1,(%3)\n\t" \
> + /* adds 0xffff0001, returns the old value */ \
> + " test " __ASM_SEL(%w1,%k1) "," __ASM_SEL(%w1,%k1) "\n\t" \
> + /* was the active mask 0 before? */ \
> + " jz 1f\n" \
> + " call " slow_path "\n" \
> + "1:\n" \
> + "# ending down_write", \
> + ARGS("+m" (sem->count), "=d" (tmp), "=a" (ret)), \
> + ARGS("a" (sem), "1" (RWSEM_ACTIVE_WRITE_BIAS)), \
> + ARGS("memory", "cc")); \
> + ret; \
> })
>
> static inline void __down_write(struct rw_semaphore *sem)
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index 184eb9894dae..c7be076ee703 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -166,12 +166,11 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
> ({ \
> int __ret_gu; \
> register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \
> - register void *__sp asm(_ASM_SP); \
> __chk_user_ptr(ptr); \
> might_fault(); \
> - asm volatile("call __get_user_%P4" \
> - : "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \
> - : "0" (ptr), "i" (sizeof(*(ptr)))); \
> + ASM_CALL("call __get_user_%P3", \
> + ARGS("=a" (__ret_gu), "=r" (__val_gu)), \
> + ARGS("0" (ptr), "i" (sizeof(*(ptr))))); \
> (x) = (__force __typeof__(*(ptr))) __val_gu; \
> __builtin_expect(__ret_gu, 0); \
> })
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index b16f6a1d8b26..8b63e2cb1eaf 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -38,10 +38,9 @@ copy_user_generic(void *to, const void *from, unsigned len)
> X86_FEATURE_REP_GOOD,
> copy_user_enhanced_fast_string,
> X86_FEATURE_ERMS,
> - ASM_OUTPUT2("=a" (ret), "=D" (to), "=S" (from),
> - "=d" (len)),
> - "1" (to), "2" (from), "3" (len)
> - : "memory", "rcx", "r8", "r9", "r10", "r11");
> + ARGS("=a" (ret), "=D" (to), "=S" (from), "=d" (len)),
> + ARGS("1" (to), "2" (from), "3" (len)),
> + ARGS("memory", "rcx", "r8", "r9", "r10", "r11"));
> return ret;
> }
>
> diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
> index 11071fcd630e..2feddeeec1d1 100644
> --- a/arch/x86/include/asm/xen/hypercall.h
> +++ b/arch/x86/include/asm/xen/hypercall.h
> @@ -114,9 +114,8 @@ extern struct { char _entry[32]; } hypercall_page[];
> register unsigned long __arg3 asm(__HYPERCALL_ARG3REG) = __arg3; \
> register unsigned long __arg4 asm(__HYPERCALL_ARG4REG) = __arg4; \
> register unsigned long __arg5 asm(__HYPERCALL_ARG5REG) = __arg5; \
> - register void *__sp asm(_ASM_SP);
>
> -#define __HYPERCALL_0PARAM "=r" (__res), "+r" (__sp)
> +#define __HYPERCALL_0PARAM "=r" (__res)
> #define __HYPERCALL_1PARAM __HYPERCALL_0PARAM, "+r" (__arg1)
> #define __HYPERCALL_2PARAM __HYPERCALL_1PARAM, "+r" (__arg2)
> #define __HYPERCALL_3PARAM __HYPERCALL_2PARAM, "+r" (__arg3)
> @@ -146,10 +145,10 @@ extern struct { char _entry[32]; } hypercall_page[];
> ({ \
> __HYPERCALL_DECLS; \
> __HYPERCALL_0ARG(); \
> - asm volatile (__HYPERCALL \
> - : __HYPERCALL_0PARAM \
> - : __HYPERCALL_ENTRY(name) \
> - : __HYPERCALL_CLOBBER0); \
> + ASM_CALL(__HYPERCALL, \
> + __HYPERCALL_0PARAM, \
> + __HYPERCALL_ENTRY(name), \
> + __HYPERCALL_CLOBBER0); \
> (type)__res; \
> })
>
> @@ -157,10 +156,10 @@ extern struct { char _entry[32]; } hypercall_page[];
> ({ \
> __HYPERCALL_DECLS; \
> __HYPERCALL_1ARG(a1); \
> - asm volatile (__HYPERCALL \
> - : __HYPERCALL_1PARAM \
> - : __HYPERCALL_ENTRY(name) \
> - : __HYPERCALL_CLOBBER1); \
> + ASM_CALL(__HYPERCALL, \
> + __HYPERCALL_1PARAM, \
> + __HYPERCALL_ENTRY(name), \
> + __HYPERCALL_CLOBBER1); \
> (type)__res; \
> })
>
> @@ -168,10 +167,10 @@ extern struct { char _entry[32]; } hypercall_page[];
> ({ \
> __HYPERCALL_DECLS; \
> __HYPERCALL_2ARG(a1, a2); \
> - asm volatile (__HYPERCALL \
> - : __HYPERCALL_2PARAM \
> - : __HYPERCALL_ENTRY(name) \
> - : __HYPERCALL_CLOBBER2); \
> + ASM_CALL(__HYPERCALL, \
> + __HYPERCALL_2PARAM, \
> + __HYPERCALL_ENTRY(name), \
> + __HYPERCALL_CLOBBER2); \
> (type)__res; \
> })
>
> @@ -179,10 +178,10 @@ extern struct { char _entry[32]; } hypercall_page[];
> ({ \
> __HYPERCALL_DECLS; \
> __HYPERCALL_3ARG(a1, a2, a3); \
> - asm volatile (__HYPERCALL \
> - : __HYPERCALL_3PARAM \
> - : __HYPERCALL_ENTRY(name) \
> - : __HYPERCALL_CLOBBER3); \
> + ASM_CALL(__HYPERCALL, \
> + __HYPERCALL_3PARAM, \
> + __HYPERCALL_ENTRY(name), \
> + __HYPERCALL_CLOBBER3); \
> (type)__res; \
> })
>
> @@ -190,10 +189,10 @@ extern struct { char _entry[32]; } hypercall_page[];
> ({ \
> __HYPERCALL_DECLS; \
> __HYPERCALL_4ARG(a1, a2, a3, a4); \
> - asm volatile (__HYPERCALL \
> - : __HYPERCALL_4PARAM \
> - : __HYPERCALL_ENTRY(name) \
> - : __HYPERCALL_CLOBBER4); \
> + ASM_CALL(__HYPERCALL, \
> + __HYPERCALL_4PARAM, \
> + __HYPERCALL_ENTRY(name), \
> + __HYPERCALL_CLOBBER4); \
> (type)__res; \
> })
>
> @@ -201,10 +200,10 @@ extern struct { char _entry[32]; } hypercall_page[];
> ({ \
> __HYPERCALL_DECLS; \
> __HYPERCALL_5ARG(a1, a2, a3, a4, a5); \
> - asm volatile (__HYPERCALL \
> - : __HYPERCALL_5PARAM \
> - : __HYPERCALL_ENTRY(name) \
> - : __HYPERCALL_CLOBBER5); \
> + ASM_CALL(__HYPERCALL, \
> + __HYPERCALL_5PARAM, \
> + __HYPERCALL_ENTRY(name), \
> + __HYPERCALL_CLOBBER5); \
> (type)__res; \
> })
>
> @@ -218,10 +217,10 @@ privcmd_call(unsigned call,
> __HYPERCALL_5ARG(a1, a2, a3, a4, a5);
>
> stac();
> - asm volatile("call *%[call]"
> - : __HYPERCALL_5PARAM
> - : [call] "a" (&hypercall_page[call])
> - : __HYPERCALL_CLOBBER5);
> + ASM_CALL("call *%[call]",
> + __HYPERCALL_5PARAM,
> + [call] "a" (&hypercall_page[call]),
> + __HYPERCALL_CLOBBER5);
> clac();
>
> return (long)__res;
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index fb0055953fbc..4d0d326029a7 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -5284,16 +5284,15 @@ static void fetch_possible_mmx_operand(struct x86_emulate_ctxt *ctxt,
>
> static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
> {
> - register void *__sp asm(_ASM_SP);
> ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;
>
> if (!(ctxt->d & ByteOp))
> fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
>
> - asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
> - : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
> - [fastop]"+S"(fop), "+r"(__sp)
> - : "c"(ctxt->src2.val));
> + ASM_CALL("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n",
> + ARGS("+a"(ctxt->dst.val), "+d"(ctxt->src.val),
> + [flags]"+D"(flags), [fastop]"+S"(fop)),
> + ARGS("c"(ctxt->src2.val)));
>
> ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
> if (!fop) /* exception is returned in fop variable */
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index ffd469ecad57..d42806ad3c9c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -8671,7 +8671,6 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
> static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
> {
> u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
> - register void *__sp asm(_ASM_SP);
>
> if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK))
> == (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) {
> @@ -8686,7 +8685,7 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
> vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
> desc = (gate_desc *)vmx->host_idt_base + vector;
> entry = gate_offset(*desc);
> - asm volatile(
> + ASM_CALL(
> #ifdef CONFIG_X86_64
> "mov %%" _ASM_SP ", %[sp]\n\t"
> "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
> @@ -8695,16 +8694,14 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
> #endif
> "pushf\n\t"
> __ASM_SIZE(push) " $%c[cs]\n\t"
> - "call *%[entry]\n\t"
> - :
> + "call *%[entry]\n\t",
> #ifdef CONFIG_X86_64
> - [sp]"=&r"(tmp),
> + [sp]"=&r"(tmp)
> #endif
> - "+r"(__sp)
> - :
> - [entry]"r"(entry),
> - [ss]"i"(__KERNEL_DS),
> - [cs]"i"(__KERNEL_CS)
> + ,
> + ARGS([entry]"r"(entry),
> + [ss]"i"(__KERNEL_DS),
> + [cs]"i"(__KERNEL_CS))
> );
> }
> }
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 2a1fa10c6a98..ca642ac3a390 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -802,7 +802,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
> if (is_vmalloc_addr((void *)address) &&
> (((unsigned long)tsk->stack - 1 - address < PAGE_SIZE) ||
> address - ((unsigned long)tsk->stack + THREAD_SIZE) < PAGE_SIZE)) {
> - register void *__sp asm("rsp");
> unsigned long stack = this_cpu_read(orig_ist.ist[DOUBLEFAULT_STACK]) - sizeof(void *);
> /*
> * We're likely to be running with very little stack space
> @@ -814,13 +813,12 @@ no_context(struct pt_regs *regs, unsigned long error_code,
> * and then double-fault, though, because we're likely to
> * break the console driver and lose most of the stack dump.
> */
> - asm volatile ("movq %[stack], %%rsp\n\t"
> - "call handle_stack_overflow\n\t"
> - "1: jmp 1b"
> - : "+r" (__sp)
> - : "D" ("kernel stack overflow (page fault)"),
> - "S" (regs), "d" (address),
> - [stack] "rm" (stack));
> + ASM_CALL("movq %[stack], %%rsp\n\t"
> + "call handle_stack_overflow\n\t"
> + "1: jmp 1b",
> + ,
> + ARGS("D" ("kernel stack overflow (page fault)"),
> + "S" (regs), "d" (address), [stack] "rm" (stack)));
> unreachable();
> }
> #endif
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 3f8c88e29a46..3a57f32a0bfd 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -589,4 +589,18 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
> (_________p1); \
> })
>
> +#define ARGS(args...) args
> +
> +#ifdef CONFIG_FRAME_POINTER
> +
> +#define ASM_CALL(str, outputs, inputs, clobbers...) \
> + asm volatile(str : outputs : inputs : "sp", ## clobbers)
> +
> +#else
> +
> +#define ASM_CALL(str, outputs, inputs, clobbers...) \
> + asm volatile(str : outputs : inputs : clobbers);
> +
> +#endif
> +
> #endif /* __LINUX_COMPILER_H */
> diff --git a/tools/objtool/Documentation/stack-validation.txt b/tools/objtool/Documentation/stack-validation.txt
> index 6a1af43862df..766a00ebf80c 100644
> --- a/tools/objtool/Documentation/stack-validation.txt
> +++ b/tools/objtool/Documentation/stack-validation.txt
> @@ -193,13 +193,8 @@ they mean, and suggestions for how to fix them.
>
> If it's a GCC-compiled .c file, the error may be because the function
> uses an inline asm() statement which has a "call" instruction. An
> - asm() statement with a call instruction must declare the use of the
> - stack pointer in its output operand. For example, on x86_64:
> -
> - register void *__sp asm("rsp");
> - asm volatile("call func" : "+r" (__sp));
> -
> - Otherwise the stack frame may not get created before the call.
> + asm() statement with a call instruction must use the ASM_CALL macro,
> + which forces the frame pointer to be saved before the call.
>
>
> 2. file.o: warning: objtool: .text+0x53: unreachable instruction
> @@ -221,7 +216,7 @@ they mean, and suggestions for how to fix them.
> change ENDPROC to END.
>
>
> -4. file.o: warning: objtool: func(): can't find starting instruction
> +3. file.o: warning: objtool: func(): can't find starting instruction
> or
> file.o: warning: objtool: func()+0x11dd: can't decode instruction
>
> @@ -230,7 +225,7 @@ they mean, and suggestions for how to fix them.
> section like .data or .rodata.
>
>
> -5. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function
> +4. file.o: warning: objtool: func()+0x6: unsupported instruction in callable function
>
> This is a kernel entry/exit instruction like sysenter or iret. Such
> instructions aren't allowed in a callable function, and are most
> @@ -239,7 +234,7 @@ they mean, and suggestions for how to fix them.
> annotated with the unwind hint macros in asm/unwind_hints.h.
>
>
> -6. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame
> +5. file.o: warning: objtool: func()+0x26: sibling call from callable instruction with modified stack frame
>
> This is a dynamic jump or a jump to an undefined symbol. Objtool
> assumed it's a sibling call and detected that the frame pointer
> @@ -253,7 +248,7 @@ they mean, and suggestions for how to fix them.
> the unwind hint macros in asm/unwind_hints.h.
>
>
> -7. file: warning: objtool: func()+0x5c: stack state mismatch
> +6. file: warning: objtool: func()+0x5c: stack state mismatch
>
> The instruction's frame pointer state is inconsistent, depending on
> which execution path was taken to reach the instruction.
> @@ -270,7 +265,7 @@ they mean, and suggestions for how to fix them.
> asm/unwind_hints.h.
>
>
> -8. file.o: warning: objtool: funcA() falls through to next function funcB()
> +7. file.o: warning: objtool: funcA() falls through to next function funcB()
>
> This means that funcA() doesn't end with a return instruction or an
> unconditional jump, and that objtool has determined that the function
next prev parent reply other threads:[~2017-07-29 1:06 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-12 21:27 [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm" Matthias Kaehlcke
2017-07-12 22:12 ` Josh Poimboeuf
2017-07-12 22:20 ` Matthias Kaehlcke
2017-07-12 22:35 ` Josh Poimboeuf
2017-07-12 22:36 ` Josh Poimboeuf
2017-07-12 23:22 ` Matthias Kaehlcke
2017-07-13 18:00 ` Josh Poimboeuf
2017-07-13 18:47 ` Matthias Kaehlcke
2017-07-13 19:25 ` Josh Poimboeuf
2017-07-13 19:38 ` Michael Davidson
2017-07-13 20:18 ` Josh Poimboeuf
2017-07-13 20:20 ` Andrey Rybainin
2017-07-13 20:34 ` Josh Poimboeuf
2017-07-13 21:12 ` Matthias Kaehlcke
2017-07-13 21:34 ` Josh Poimboeuf
2017-07-13 21:57 ` Matthias Kaehlcke
2017-07-19 17:46 ` Josh Poimboeuf
2017-07-19 21:50 ` Matthias Kaehlcke
2017-07-20 10:01 ` Andrey Ryabinin
2017-07-20 15:18 ` Josh Poimboeuf
2017-07-20 15:30 ` Andrey Ryabinin
2017-07-20 20:56 ` Josh Poimboeuf
2017-07-21 9:13 ` Andrey Ryabinin
2017-07-21 13:24 ` Josh Poimboeuf
2017-07-29 0:38 ` Matthias Kaehlcke
2017-07-29 0:55 ` Josh Poimboeuf
2017-07-29 0:58 ` Josh Poimboeuf
2017-07-29 1:06 ` Matthias Kaehlcke [this message]
2017-07-13 21:14 ` Matthias Kaehlcke
2017-07-13 21:25 ` Andrey Rybainin
2017-07-13 21:43 ` Matthias Kaehlcke
2017-07-13 21:52 ` Josh Poimboeuf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170729010632.GE84665@google.com \
--to=mka@chromium.org \
--cc=Bernhard.Rosenkranzer@linaro.org \
--cc=arnd@arndb.de \
--cc=bp@suse.de \
--cc=chris.j.arges@canonical.com \
--cc=dianders@chromium.org \
--cc=ghackmann@google.com \
--cc=hpa@zytor.com \
--cc=jpoimboe@redhat.com \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=md@google.com \
--cc=mingo@redhat.com \
--cc=ndesaulniers@google.com \
--cc=ryabinin.a.a@gmail.com \
--cc=srhines@google.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.