Re: [PATCH] x86: fix ordering constraints on crX read/writes

From: Zachary Amsden <zamsden@redhat.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Glauber Costa <glommer@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>, Avi Kivity <avi@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] x86: fix ordering constraints on crX read/writes
Date: Wed, 14 Jul 2010 14:28:39 -1000
Message-ID: <4C3E5637.4010300@redhat.com>
In-Reply-To: <4C3E363B.7060804@goop.org>

On 07/14/2010 12:12 PM, Jeremy Fitzhardinge wrote:
> Change d3ca901f94b3299 introduces a new mechanism for sequencing
> accesses to the control registers using a variable to act as a
> dependency.  (However, the patch description only says it is unifying
> parts of system.h and makes no mention of this new code.)
>
> This sequencing implementation is flawed in two ways:
>   - Firstly, it gets the input/outputs for __force_order wrong on
>     the asm statements.  __force_order is a proxy for the control
>     registers themselves, so a write_crX should write __force_order,
>     and conversely for read_crX.  The original code got this backwards,
>     making write_crX read from __force_order, which in principle would
>     allow gcc to eliminate a "redundant" write_crX.
>
>   - Secondly, writes to control registers can have drastic
>     side-effects on all memory operations (write to cr3 changes the
>     current pagetable and redefines the meaning of all addresses,
>     for example), so they must clobber "memory" in order to be
>     ordered with respect to memory operations.
>
> We seem to have been saved by the implicit ordering that "asm volatile"
> gives us.
>
> Signed-off-by: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>
> Cc: Glauber de Oliveira Costa<gcosta@redhat.com>
> Cc: Ingo Molnar<mingo@elte.hu>
> Cc: "H. Peter Anvin"<hpa@zytor.com>
> Cc: Thomas Gleixner<tglx@linutronix.de>
> Cc: Avi Kivity<avi@redhat.com>
> Cc: Zachary Amsden<zamsden@redhat.com>
>
> diff --git a/arch/x86/include/asm/system.h b/arch/x86/include/asm/system.h
> index e7f4d33..b782af2 100644
> --- a/arch/x86/include/asm/system.h
> +++ b/arch/x86/include/asm/system.h
> @@ -212,53 +212,68 @@ static inline void native_clts(void)
>
>   /*
>    * Volatile isn't enough to prevent the compiler from reordering the
> - * read/write functions for the control registers and messing everything up.
> - * A memory clobber would solve the problem, but would prevent reordering of
> - * all loads stores around it, which can hurt performance. Solution is to
> - * use a variable and mimic reads and writes to it to enforce serialization
> + * read/write functions for the control registers and messing
> + * everything up.  A memory clobber would solve the problem, but would
> + * prevent reordering of all loads stores around it, which can hurt
> + * performance (however, control register writes can have drastic
> + * effects on memory accesses - like switching pagetables and thereby
> + * redefining what an address means - so they still need to clobber
> + * memory).
> + *
> + * Solution is to use a variable and mimic reads and writes to it to
> + * enforce serialization.  __force_order acts as a proxy for the
> + * control registers, so a read_crX reads __force_order, and write_crX
> + * writes it (actually both reads and writes it to indicate that
> + * write-over-write can't be "optimised" away).
> + *
> + * This assumes there's no global optimisation between object files,
> + * so using a static per-file "__force_order" is OK.  (In theory we
> + * don't need __force_order to be instantiated at all, since it is
> + * never actually read or written to.  But gcc might decide to
> + * generate a reference to it anyway, so we need to keep it around.)
>    */
>   static unsigned long __force_order;
>
>   static inline unsigned long native_read_cr0(void)
>   {
>   	unsigned long val;
> -	asm volatile("mov %%cr0,%0\n\t" : "=r" (val), "=m" (__force_order));
> +	asm volatile("mov %%cr0,%0\n\t" : "=r" (val) : "m" (__force_order));
>   	return val;
>   }
>
>   static inline void native_write_cr0(unsigned long val)
>   {
> -	asm volatile("mov %0,%%cr0": : "r" (val), "m" (__force_order));
> +	asm volatile("mov %1,%%cr0": "+m" (__force_order) : "r" (val) : "memory");
>   }
>
>   static inline unsigned long native_read_cr2(void)
>   {
>   	unsigned long val;
> -	asm volatile("mov %%cr2,%0\n\t" : "=r" (val), "=m" (__force_order));
> +	asm volatile("mov %%cr2,%0\n\t" : "=r" (val) : "m" (__force_order));
>   	return val;
>   }
>
>   static inline void native_write_cr2(unsigned long val)
>   {
> -	asm volatile("mov %0,%%cr2": : "r" (val), "m" (__force_order));
> +	asm volatile("mov %1,%%cr2": "+m" (__force_order) : "r" (val) : "memory");
>   }
>    

You don't need the memory clobber on write_cr2.  Technically, write_cr2
should never be used, however.

>
>   static inline unsigned long native_read_cr3(void)
>   {
>   	unsigned long val;
> -	asm volatile("mov %%cr3,%0\n\t" : "=r" (val), "=m" (__force_order));
> +	asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : "m" (__force_order));
>   	return val;
>   }
>
>   static inline void native_write_cr3(unsigned long val)
>   {
> -	asm volatile("mov %0,%%cr3": : "r" (val), "m" (__force_order));
> +	asm volatile("mov %1,%%cr3": "+m" (__force_order) : "r" (val) : "memory");
>   }
>
>   static inline unsigned long native_read_cr4(void)
>   {
>   	unsigned long val;
> -	asm volatile("mov %%cr4,%0\n\t" : "=r" (val), "=m" (__force_order));
> +	asm volatile("mov %%cr4,%0\n\t" : "=r" (val) : "m" (__force_order));
>   	return val;
>   }
>
> @@ -271,7 +286,7 @@ static inline unsigned long native_read_cr4_safe(void)
>   	asm volatile("1: mov %%cr4, %0\n"
>   		     "2:\n"
>   		     _ASM_EXTABLE(1b, 2b)
> -		     : "=r" (val), "=m" (__force_order) : "0" (0));
> +		     : "=r" (val) : "m" (__force_order), "0" (0));
>   #else
>   	val = native_read_cr4();
>   #endif
> @@ -280,7 +295,7 @@ static inline unsigned long native_read_cr4_safe(void)
>
>   static inline void native_write_cr4(unsigned long val)
>   {
> -	asm volatile("mov %0,%%cr4": : "r" (val), "m" (__force_order));
> +	asm volatile("mov %1,%%cr4": "+m" (__force_order) : "r" (val) : "memory");
>   }
>
>   #ifdef CONFIG_X86_64
>
>
>    

Looks good.  I really hope __force_order gets pruned, however.  Does it
actually?
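
For anyone following along, here is a rough user-space sketch of the
constraint pattern discussed above.  It is not the kernel code: the asm
templates are deliberately empty so it builds with GCC or Clang on any
target, and fake_cr, force_order and the sketch_* names are hypothetical
stand-ins (an ordinary variable plays the role of the control register).
Only the operand and clobber lists are meant to mirror the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for a control register (assumption, not kernel code). */
static unsigned long fake_cr;

/* Proxy variable: the asm statements only claim to access it. */
static unsigned long force_order;

/* Read side: the proxy is an "m" input, so the asm claims to read it. */
static inline unsigned long sketch_read_cr(void)
{
	unsigned long val = fake_cr;

	/* Empty template; real code would be "mov %%crN,%0". */
	__asm__ __volatile__("" : "+r" (val) : "m" (force_order));
	return val;
}

/*
 * Write side as in the patch: "+m" claims a read-modify-write of the
 * proxy, and the "memory" clobber orders the statement against
 * ordinary loads and stores.
 */
static inline void sketch_write_cr(unsigned long val)
{
	/* Empty template; real code would be "mov %1,%%crN". */
	__asm__ __volatile__("" : "+m" (force_order) : "r" (val) : "memory");
	fake_cr = val;
}

/*
 * Variant without the "memory" clobber, roughly what dropping it for
 * write_cr2 would look like: still ordered against the other sketch_*
 * accessors through the proxy, but not against unrelated memory
 * operations.
 */
static inline void sketch_write_cr_noclobber(unsigned long val)
{
	__asm__ __volatile__("" : "+m" (force_order) : "r" (val));
	fake_cr = val;
}
```

Since the templates are empty, this only shows what the compiler is
told about each statement; it says nothing about what the CPU does when
a real control register is written.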
