From: Andy Lutomirski
Subject: Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration
Date: Tue, 14 Nov 2017 08:16:09 -0800
References: <20171110211249.10742-1-mathieu.desnoyers@efficios.com>
 <885227610.13045.1510351034488.JavaMail.zimbra@efficios.com>
 <617343212.13932.1510592207202.JavaMail.zimbra@efficios.com>
 <4d47fbb8-8f99-19d3-a9cf-66841aeffac3@scylladb.com>
 <4431530.14831.1510672632887.JavaMail.zimbra@efficios.com>
 <20171114160541.GC3165@worktop.lehotels.local>
To: Thomas Gleixner
Cc: Andy Lutomirski, Peter Zijlstra, Mathieu Desnoyers, Avi Kivity,
 Linus Torvalds, linux-kernel, linux-api, "Paul E. McKenney",
 Boqun Feng, Andrew Hunter, maged michael, Benjamin Herrenschmidt,
 Paul Mackerras, Michael Ellerman, Dave Watson, Ingo Molnar,
 "H. Peter Anvin", Andrea Parri, "Russell King, ARM Linux"
List-Id: linux-api@vger.kernel.org

On Tue, Nov 14, 2017 at 8:13 AM, Thomas Gleixner wrote:
> On Tue, 14 Nov 2017, Andy Lutomirski wrote:
>> On Tue, Nov 14, 2017 at 8:05 AM, Peter Zijlstra wrote:
>> > On Tue, Nov 14, 2017 at 03:17:12PM +0000, Mathieu Desnoyers wrote:
>> >> I've tried to create a small single-threaded self-modifying loop in
>> >> user-space to trigger a trace cache or speculative execution quirk,
>> >> but I have not succeeded yet. I suspect that I would need to know
>> >> more about the internals of the processor architecture to create the
>> >> right stalls that would allow speculative execution to move further
>> >> ahead, and trigger an incoherent execution flow. Ideas on how to
>> >> trigger this would be welcome.
>> >
>> > I thought the whole problem was per definition multi-threaded.
>> >
>> > Single-threaded stuff can't get out of sync with itself; you'll always
>> > observe your own stores.
>> >
>> > And ISTR the JIT scenario being something like the JIT overwriting
>> > previously executed but supposedly no longer used code. And in this
>> > scenario you'd want to guarantee all CPUs observe the new code before
>> > jumping into it.
>> >
>> > The current approach is using mprotect(), except that on a number of
>> > platforms the TLB invalidate from that is not guaranteed to be strong
>> > enough to sync for code changes.
>> >
>> > On x86 the mprotect() should work just fine, since we broadcast IPIs for
>> > the TLB invalidate and the IRET from those will get the things synced up
>> > again (if nothing else; very likely we'll have done a MOV-CR3 which will
>> > of course also have sufficient syncness on it).
>> >
>> > But PowerPC, s390, ARM et al that do TLB invalidates without interrupts
>> > and don't guarantee their TLB invalidate sync against execution units
>> > are left broken by this scheme.
>> >
>>
>> On x86 single-thread, you can still get in trouble, I think. Do a
>> store, get migrated, execute the stored code. There's no actual
>> guarantee that the new CPU does a CR3 load due to laziness.
>
> The migration IPI will probably prevent that.

What guarantees that there's an IPI?  Do we never do a syscall, get
migrated during syscall processing (due to cond_resched(), for
example), and land on another CPU that just happened to already be
scheduling?

--Andy
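
For concreteness, here is a minimal sketch of the mprotect()-based
code-patching idiom Peter refers to above. The function name, helper
layout, and error handling are illustrative assumptions, not taken from
the thread or from any particular JIT; the buffer is assumed to have been
mmap()ed earlier as read+exec. The point it demonstrates: the permission
flip forces a TLB invalidate, which on x86 is delivered to the other CPUs
running this mm by IPI, and the IRET (or CR3 write) on that path
serializes them before they can fetch the new code. On PowerPC, s390, or
ARM the invalidate need not interrupt the other CPUs at all, so this
scheme alone does not guarantee instruction-fetch coherence there.

/*
 * Illustrative sketch only: a JIT overwrites previously executed code
 * in place, then relies on the mprotect() TLB shootdown to sync other
 * CPUs before the new code is run (sufficient on x86, not guaranteed
 * on PowerPC/s390/ARM, as discussed in this thread).
 */
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef int (*jit_fn_t)(void);

static int patch_and_call(void *buf, size_t buf_len,
                          const uint8_t *code, size_t code_len)
{
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)buf & ~((uintptr_t)page - 1);
    size_t span = ((uintptr_t)buf + buf_len) - start;

    /* Drop exec, gain write, and overwrite the stale code. */
    if (mprotect((void *)start, span, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(buf, code, code_len);

    /*
     * Flip back to read+exec.  The permission change forces a TLB
     * invalidate; on x86 that invalidate is broadcast by IPI and the
     * IRET on the remote CPUs serializes them before they can execute
     * the new code.  On architectures whose TLB invalidates do not
     * interrupt the other CPUs, nothing here syncs instruction fetch.
     */
    if (mprotect((void *)start, span, PROT_READ | PROT_EXEC) != 0)
        return -1;

    return ((jit_fn_t)buf)();
}

Andy's closing question concerns the single-threaded variant of the same
pattern (store the code, get migrated, jump to it): it is only safe if the
destination CPU executes something core-serializing, such as IRET or a CR3
write, before returning to userspace, and the scheduler path does not
obviously guarantee that. Adding that guarantee on migration is what the
RFC patch in the subject line is about.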