From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60627)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1Y1aJH-0003jq-5y
	for qemu-devel@nongnu.org; Thu, 18 Dec 2014 07:36:07 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1Y1aJA-00085f-9j
	for qemu-devel@nongnu.org; Thu, 18 Dec 2014 07:36:03 -0500
Received: from mx1.redhat.com ([209.132.183.28]:57430)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1Y1aJA-00085W-1p
	for qemu-devel@nongnu.org; Thu, 18 Dec 2014 07:35:56 -0500
Date: Thu, 18 Dec 2014 12:35:43 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Message-ID: <20141218123543.GE4744@work-vm>
References: <8B6B4BF9-3400-4125-8571-F4EF9F12AA89@greensocs.com>
	<5491666A.7060001@suse.de> <54916829.3020200@redhat.com>
	<CAFEAcA9hW2JeTmCLE9DK8n4VwHaXYPSu5Lt-NnDtTXgL5VfRYA@mail.gmail.com>
	<60A11491-8466-4EBC-9877-22E341688DD9@greensocs.com>
	<CAFEAcA_MVtXGKOjz+ExhWPzB69LvJqHLC=wcs-XngYjXCdkp5A@mail.gmail.com>
	<6B541656-15EA-47CB-8043-AE3B18FC60D4@greensocs.com>
	<CAFEAcA_8U0zsiZ4e43tmi4mQDgMYmK21d3CouZpN8aNHmBR-pg@mail.gmail.com>
	<F8D0B38D-5900-45B5-BFE2-2807498F9B33@greensocs.com>
	<5492C798.8070503@suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5492C798.8070503@suse.de>
Subject: Re: [Qemu-devel] [RFC PATCH] target-arm: protect cpu_exclusive_*.
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Alexander Graf <agraf@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>, Mark Burton <mark.burton@greensocs.com>, QEMU Developers <qemu-devel@nongnu.org>, Paolo Bonzini <pbonzini@redhat.com>, Llu?s Vilanova <vilanova@ac.upc.edu>, KONRAD Fr?d?ric <fred.konrad@greensocs.com>

* Alexander Graf (agraf@suse.de) wrote:
> 
> 
> On 18.12.14 10:12, Mark Burton wrote:
> > 
> >> On 17 Dec 2014, at 17:39, Peter Maydell <peter.maydell@linaro.org> wrote:
> >>
> >> On 17 December 2014 at 16:29, Mark Burton <mark.burton@greensocs.com> wrote:
> >>>> On 17 Dec 2014, at 17:27, Peter Maydell <peter.maydell@linaro.org> wrote:
> >>>> I think a mutex is fine, personally -- I just don't want
> >>>> to see fifteen hand-hacked mutexes in the target-* code.
> >>>>
> >>>
> >>> Which would seem to favour the helper function approach?
> >>> Or am I missing something?
> >>
> >> You need at least some support from QEMU core -- consider
> >> what happens with this patch if the ldrex takes a data
> >> abort, for instance.
> >>
> >> And if you need the "stop all other CPUs while I do this???
> > 
> > It looks like a corner case, but working this through - the ???simple??? put a mutex around the atomic instructions approach would indeed need to ensure that no other core was doing anything - that just happens to be true for qemu today (or - we would have to put a mutex around all writes); in order to ensure the case where a store exclusive could potential fail if a non-atomic instruction wrote (a different value) to the same address. This is currently guarantee by the implementation in Qemu - how useful it is I dont know, but if we break it, we run the risk that something will fail (at the least, we could not claim to have kept things the same).
> > 
> > This also has implications for the idea of adding TCG ops I think...
> > The ideal scenario is that we could ???fallback??? on the same semantics that are there today - allowing specific target/host combinations to be optimised (and to improve their functionality). 
> > But that means, from within the TCG Op, we would need to have a mechanism, to cause other TCG???s to take an exit???. etc etc??? In the end, I???m sure it???s possible, but it feels so awkward.
> 
> That's the nice thing about transactions - they guarantee that no other
> CPU accesses the same cache line at the same time. So you're safe
> against other vcpus even without blocking them manually.
> 
> For the non-transactional implementation we probably would need an "IPI
> others and halt them until we're done with the critical section"
> approach. But I really wouldn't concentrate on making things fast on old
> CPUs.

Hang on; 99% of the worlds CPUs don't have (working) transactional memory;
so it's a bit excessive to lump them all under old CPUs.

> Also keep in mind that for the UP case we can always omit all the magic
> - we only need to detect when we move into an SMP case (linux-user clone
> or -smp on system).

Depends on the architecture to depend if IO breaks those type of ops.

Dave

> 
> > 
> > To re-cap where we are (for my own benefit if nobody else):
> > We have several propositions in terms of implementing Atomic instructions
> > 
> > 1/ We wrap the atomic instructions in a mutex using helper functions (this is the approach others have taken, it???s simple, but it is not clean, as stated above).
> 
> This is horrible. Imagine you have this split approach with a load
> exclusive and then store whereas the load starts mutex usage and the
> store stop is. At that point if the store creates a segfault you'll be
> left with a dangling mutex.
> 
> This stuff really belongs into the TCG core.
> 
> > 
> > 1.5/ We add a mechanism to ensure that when the mutex is taken, all other cores are ???stopped???.
> > 
> > 2/ We add some TCG ops to effectively do the same thing, but this would give us the benefit of being able to provide better implementations. This is attractive, but we would end up needing ops to cover at least exclusive load/store and atomic compare exchange. To me this looks less than elegant (being pulled close to the target, rather than being able to generalise), but it???s not clear how we would implement the operations as we would like, with a machine instruction, unless we did split them out along these lines. This approach also (probably) requires the 1.5 mechanism above.
> 
> I'm still in favor of just forcing the semantics of transactions onto
> this. If the host doesn't implement transactions, tough luck - do the
> "halt all others" IPI.
> 
> > 
> > 3/ We have discussed a ???h/w??? approach to the problem. In this case, all atomic instructions are forced to take the slow path - and a additional flags are added to the memory API. We then deal with the issue closer to the memory where we can record who has a lock on a memory address. For this to work - we would also either
> > a) need to add a mprotect type approach to ensure no ???non atomic??? writes occur - or
> > b) need to force all cores to mark the page with the exclusive memory as IO or similar to ensure that all write accesses followed the slow path.
> > 
> > 4/ There is an option to implement exclusive operations within the TCG using mprotect (and signal handlers). I have some concerns on this : would we need have to have support for each host O/S???. I also think we might end up the a lot of protected regions causing a lot of SIGSEGV???s because an errant guest doesn???t behave well - basically we will need to see the impact on performance - finally - this will be really painful to deal with for cases where the exclusive memory is held in what Qemu considers IO space !!!
> > 	In other words - putting the mprotect inside TCG looks to me like it???s mutually exclusive to supporting a memory-based scheme like (3).
> 
> Again, I don't think it's worth caring about legacy host systems too
> much. In a few years from now transactional memory will be commodity,
> just like KVM is today.
> 
> 
> Alex
> 
> > My personal preference is for 3b) it  is ???safe??? - its where the hardware is.
> > 3a is an optimization of that.
> > to me, (2) is an optimisation again. We are effectively saying, if you are able to do this directly, then you dont need to pass via the slow path. Otherwise, you always have the option of reverting to the slow path.
> > 
> > Frankly - 1 and 1.5 are hacks - they are not optimisations, they are just dirty hacks. However - their saving grace is that they are hacks that exist and ???work???. I dislike patching the hack, but it did seem to offer the fastest solution to get around this problem - at least for now. I am no longer convinced.
> > 
> > 4/ is something I???d like other peoples views on too??? Is it a better approach? What about the slow path?
> > 
> > I increasingly begin to feel that we should really approach this from the other end, and provide a ???correct??? solution using the memory - then worry about making that faster???
> > 
> > Cheers
> > 
> > Mark.
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >> semantics linux-user currently uses then that definitely needs
> >> core code support. (Maybe linux-user is being over-zealous
> >> there; I haven't thought about it.)
> >>
> >> -- PMM
> > 
> > 
> > 	 +44 (0)20 7100 3485 x 210
> >  +33 (0)5 33 52 01 77x 210
> > 
> > 	+33 (0)603762104
> > 	mark.burton
> > 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK