From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linutronix.de (146.0.238.70:993) by crypto-ml.lab.linutronix.de with IMAP4-SSL for ; 14 Jan 2019 19:20:10 -0000 Received: from mga07.intel.com ([134.134.136.100]) by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1gj7mK-0003hl-O0 for speck@linutronix.de; Mon, 14 Jan 2019 20:20:09 +0100 Subject: [MODERATED] Re: [PATCH v4 05/28] MDSv4 10 References: <021c5ba2a9fdae326058dd16785b30c31546cd0f.1547256470.git.ak@linux.intel.com> From: Dave Hansen Message-ID: <1ce79648-6263-d726-bb69-da54131535a2@intel.com> Date: Mon, 14 Jan 2019 11:20:02 -0800 MIME-Version: 1.0 In-Reply-To: <021c5ba2a9fdae326058dd16785b30c31546cd0f.1547256470.git.ak@linux.intel.com> Content-Type: multipart/mixed; boundary="7OGI42AkB7AxZBUWrbS9X3j2Y6mtJBg3f"; protected-headers="v1" To: speck@linutronix.de List-ID: This is an OpenPGP/MIME encrypted message (RFC 4880 and 3156) --7OGI42AkB7AxZBUWrbS9X3j2Y6mtJBg3f Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable On 1/11/19 5:29 PM, speck for Andi Kleen wrote: > When entering idle the internal state of the current CPU might > become visible to the thread sibling because the CPU "frees" some > internal resources. Is there some documentation somewhere about what "idle" means here? It looks like MWAIT and HLT certainly count, but is there anything else? I'm just trying to figure out how we make sure we catch all of the call-sites for these. This sprinkles quite a few of them around, and I'm wondering how you found these, how we know if we missed any, and how we keep folks from reintroducing new call-sites that would make us vulnerable again. I did a quick "objdump | grep mwait" and this patch appears to catch all the functions that I encountered. > +/* > + * Clear CPU buffers before going idle, so that no state is leaked to = SMT > + * siblings taking over thread resources. > + * Out of line to avoid include hell. > + * > + * Assumes that interrupts are disabled and only get reenabled > + * before idle, otherwise the data from a racing interrupt might not > + * get cleared. There are some callers who violate this, > + * but they are only used in unattackable cases.> + */ Can we please document the unattackable cases, along with the reasons they are unattackable? This property also keeps us from being able to annotate this site with lockdep checks for interrupts being off, which is a bit unfortunate. > +static inline void clear_cpu_idle(void) > +{ > + if (sched_smt_active()) { > + clear_thread_flag(TIF_CLEAR_CPU); > + clear_cpu(); > + } > +} =2E.. > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idl= e.c > index b2131c4ea124..0342daa122fe 100644 > --- a/drivers/acpi/processor_idle.c > +++ b/drivers/acpi/processor_idle.c > @@ -33,6 +33,7 @@ > #include > #include > #include > +#include > =20 > /* > * Include the apic definitions for x86 to have the APIC timer related= defines > @@ -120,6 +121,7 @@ static const struct dmi_system_id processor_power_d= mi_table[] =3D { > */ > static void __cpuidle acpi_safe_halt(void) > { > + clear_cpu_idle(); > if (!tif_need_resched()) { > safe_halt(); > local_irq_disable(); Why is this one outside the if()? Seems like it could be safely inside next to safe_halt(). > @@ -681,6 +683,7 @@ static int acpi_idle_play_dead(struct cpuidle_devic= e *dev, int index) > =20 > ACPI_FLUSH_CPU_CACHE(); > =20 > + clear_cpu_idle(); > while (1) { > =20 > if (cx->entry_method =3D=3D ACPI_CSTATE_HALT) At the risk of bike-shedding... Why don't we just catch all these *play_dead() sites inside play_dead() itself, or at arch_cpu_idle_dead()?= > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c > index 8b5d85c91e9d..ddaa7603d53a 100644 > --- a/drivers/idle/intel_idle.c > +++ b/drivers/idle/intel_idle.c > @@ -65,6 +65,7 @@ > #include > #include > #include > +#include > =20 > #define INTEL_IDLE_VERSION "0.4.1" > =20 > @@ -933,6 +934,8 @@ static __cpuidle int intel_idle(struct cpuidle_devi= ce *dev, > } > } > =20 > + clear_cpu_idle(); > + > mwait_idle_with_hints(eax, ecx); And my like bikeshed: It seems like this would be a much smaller patch, and be less likely to have future code add vulnerabilities if we just patched mwait_idle_with_hints(). --7OGI42AkB7AxZBUWrbS9X3j2Y6mtJBg3f--