[Qemu-devel] Atomic Instructions

* [Qemu-devel] Atomic Instructions - comments please
@ 2014-12-15 12:56 Mark Burton
  2014-12-15 13:16 ` Paolo Bonzini
  2014-12-15 13:28 ` Peter Maydell
  0 siblings, 2 replies; 14+ messages in thread
From: Mark Burton @ 2014-12-15 12:56 UTC (permalink / raw)
  To: mttcg; +Cc: Developers qemu-devel

Comments please….

Choices for Atomic instructions:
The current approach (for ARM at least) to Ld and St exclusive inside QEMU simply records the address that an exclusive load reads from and the value it read. When an exclusive store happens, QEMU checks that the address and the value in memory are still the same; otherwise the store fails.

This just doesn’t match the architecturally defined functionality. For instance, an exclusive store will still succeed if another thread intervenes with a write to the same address that leaves the same (old) value there.

However - for most people, most of the time, these semantics seem to work.
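
A minimal sketch of the address/value scheme described above, and of the case it gets wrong. This is an illustration only, not QEMU's code: ExclState, load_exclusive(), store_exclusive() and aba_demo() are hypothetical names, and guest memory is modelled as a plain host variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU state; not QEMU's actual structure. */
typedef struct {
    uint32_t *excl_addr;   /* address of the last load-exclusive, or NULL */
    uint32_t  excl_val;    /* value that load-exclusive observed */
} ExclState;

/* Load-exclusive: remember where we read from and what we saw. */
static uint32_t load_exclusive(ExclState *s, uint32_t *addr)
{
    assert(addr != NULL);
    s->excl_addr = addr;
    s->excl_val = *addr;
    return s->excl_val;
}

/*
 * Store-exclusive: succeeds only if the address matches and memory still
 * holds the recorded value. Returns 0 on success and 1 on failure,
 * following the ARM STREX status convention.
 */
static int store_exclusive(ExclState *s, uint32_t *addr, uint32_t newval)
{
    bool ok = (s->excl_addr == addr) && (*addr == s->excl_val);

    s->excl_addr = NULL;   /* the reservation is consumed either way */
    if (ok) {
        *addr = newval;
    }
    return ok ? 0 : 1;
}

/*
 * The flaw: another CPU writes a new value and then the old one again.
 * Architecturally the store-exclusive should fail; here it succeeds.
 */
static int aba_demo(void)
{
    uint32_t mem = 5;
    ExclState cpu0 = { NULL, 0 };

    load_exclusive(&cpu0, &mem);
    mem = 7;   /* intervening write by "another CPU" */
    mem = 5;   /* ... which then restores the old value */
    return store_exclusive(&cpu0, &mem, 6);
}
```

In this model aba_demo() returns 0 (success), which is the mismatch with the architecture noted above. An intervening write that leaves a different value is still caught.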

One proposal is ‘simply’ to add a mutex around this code, such that multi-threaded TCG will correctly update/read these saved address/values.
This _should_ maintain the status quo. Things that were broken before will remain broken, and nothing new should break. The concern is that TCG having previously been single-threaded MAY be masking problems with this code that we are not taking into account.
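
A rough sketch of what the mutex proposal could look like, assuming a single global lock and the same address/value bookkeeping as today. All names here (ExclState, excl_lock, load_exclusive_locked(), store_exclusive_locked(), guest_atomic_inc()) are made up for illustration. Only the exclusive accesses take the lock; ordinary guest stores would not, so they are still not noticed unless they change the value.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU exclusive state; not QEMU's actual structure. */
typedef struct {
    uint32_t *excl_addr;
    uint32_t  excl_val;
} ExclState;

/* One global lock serialising every exclusive load and store. */
static pthread_mutex_t excl_lock = PTHREAD_MUTEX_INITIALIZER;

static uint32_t load_exclusive_locked(ExclState *s, uint32_t *addr)
{
    uint32_t val;

    assert(addr != NULL);
    pthread_mutex_lock(&excl_lock);
    val = *addr;
    s->excl_addr = addr;
    s->excl_val = val;
    pthread_mutex_unlock(&excl_lock);
    return val;
}

/* Returns 0 on success, 1 on failure (ARM STREX convention). */
static int store_exclusive_locked(ExclState *s, uint32_t *addr,
                                  uint32_t newval)
{
    int status = 1;

    pthread_mutex_lock(&excl_lock);
    /* The compare and the store happen as one step under the lock. */
    if (s->excl_addr == addr && *addr == s->excl_val) {
        *addr = newval;
        status = 0;
    }
    s->excl_addr = NULL;
    pthread_mutex_unlock(&excl_lock);
    return status;
}

/* The usual guest retry loop, written against the two helpers above. */
static uint32_t guest_atomic_inc(ExclState *s, uint32_t *addr)
{
    uint32_t old;

    do {
        old = load_exclusive_locked(s, addr);
    } while (store_exclusive_locked(s, addr, old + 1) != 0);
    return old + 1;
}
```

With every exclusive access going through the same lock, two CPUs running guest_atomic_inc() on one location cannot both succeed against the same old value, which is the property the typical guest sequences depend on.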

A second proposal is to somehow re-use the dirty bit mechanism.
However - the dirty bit approach seems to be too coarse-grained (per page), and (I think) only covers writes, which seems like a limitation that is not ideal….

A third proposal is to mark pages as IO when a ld/st ex is performed to them. Then to augment the memory API to indicate that a ld/st ex is in hand, allowing the memory chain to decide if it should allow the write or cause a fault. This would seem to be the closest approach to the real H/W. However, it means marking pages as IO (which could cause us issues if there was code on that page, or could slow things down), and it means adding to the memory API.

Cheers

Mark.

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 12:56 [Qemu-devel] Atomic Instructions - comments please Mark Burton
@ 2014-12-15 13:16 ` Paolo Bonzini
  2014-12-15 13:23   ` Peter Maydell
  2014-12-15 14:09   ` Peter Maydell
  2014-12-15 13:28 ` Peter Maydell
  1 sibling, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-15 13:16 UTC (permalink / raw)
  To: Mark Burton, mttcg; +Cc: Developers qemu-devel



On 15/12/2014 13:56, Mark Burton wrote:
> A third proposal is to mark pages as IO when a ld/st ex is performed to
> them. Then to augment the memory API to indicate that a ld/st ex is in
> hand, allowing the memory chain to decide if it should allow the write
> or cause a fault. This would seem to be the closest approach to the real
> H/W. However it means marking pages as IO (which could cause us issues
> if there was code on that page?, or slow things down) - and it means
> adding to the memory API.

What are the intended semantics for reads/writes from the outside world?
 Should they cause the ll/sc pair to retry or not?

If not, it should not need any change to the memory API; you can do it
entirely within cputlb.c, roughly the same as the handling of
TLB_NOTDIRTY.  It also marks pages as I/O, but only internally within TCG.

Paolo

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:16 ` Paolo Bonzini
@ 2014-12-15 13:23   ` Peter Maydell
  2014-12-15 13:28     ` Paolo Bonzini
  2014-12-15 14:09   ` Peter Maydell
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Maydell @ 2014-12-15 13:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mttcg, Mark Burton, Developers qemu-devel

On 15 December 2014 at 13:16, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 15/12/2014 13:56, Mark Burton wrote:
>> A third proposal is to mark pages as IO when a ld/st ex is performed to
>> them. Then to augment the memory API to indicate that a ld/st ex is in
>> hand, allowing the memory chain to decide if it should allow the write
>> or cause a fault. This would seem to be the closest approach to the real
>> H/W. However it means marking pages as IO (which could cause us issues
>> if there was code on that page?, or slow things down) - and it means
>> adding to the memory API.
>
> What are the intended semantics for reads/writes from the outside world?
>  Should they cause the ll/sc pair to retry or not?

Assuming that by "outside world" you mean "some other CPU in this
cluster" [in ARM-speak, some other observer in the same shareability
domain]: writes should. Reads should not.
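
As a rough model of that rule (a sketch, not the ARM pseudocode and not QEMU code): each CPU has a monitor that is armed by a load-exclusive, cleared by a store from any other CPU to the marked address, and left alone by loads. The names Monitor, cpu_load(), cpu_store(), cpu_load_exclusive() and cpu_store_exclusive() are hypothetical, and the model tracks one exact address rather than a reservation granule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NCPUS 2

/* Hypothetical per-CPU exclusive monitor. */
typedef struct {
    bool      armed;
    uintptr_t addr;    /* address marked by the last load-exclusive */
} Monitor;

static Monitor monitors[NCPUS];

/* Ordinary load: never disturbs anyone's monitor. */
static uint32_t cpu_load(int cpu, const uint32_t *p)
{
    assert(cpu >= 0 && cpu < NCPUS);
    return *p;
}

/* Ordinary store: clears every other CPU's monitor on this address. */
static void cpu_store(int cpu, uint32_t *p, uint32_t val)
{
    assert(cpu >= 0 && cpu < NCPUS);
    *p = val;
    for (int i = 0; i < NCPUS; i++) {
        if (i != cpu && monitors[i].armed &&
            monitors[i].addr == (uintptr_t)p) {
            monitors[i].armed = false;
        }
    }
}

static uint32_t cpu_load_exclusive(int cpu, uint32_t *p)
{
    assert(cpu >= 0 && cpu < NCPUS);
    monitors[cpu].armed = true;
    monitors[cpu].addr = (uintptr_t)p;
    return *p;
}

/* Returns 0 on success, 1 on failure (ARM STREX convention). */
static int cpu_store_exclusive(int cpu, uint32_t *p, uint32_t val)
{
    assert(cpu >= 0 && cpu < NCPUS);
    Monitor *m = &monitors[cpu];
    bool ok = m->armed && m->addr == (uintptr_t)p;

    m->armed = false;
    if (!ok) {
        return 1;
    }
    cpu_store(cpu, p, val);   /* also clears the other CPUs' monitors */
    return 0;
}
```

Unlike a value comparison, this model fails the store-exclusive even when the intervening write stored the value that was already there.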

-- PMM

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 12:56 [Qemu-devel] Atomic Instructions - comments please Mark Burton
  2014-12-15 13:16 ` Paolo Bonzini
@ 2014-12-15 13:28 ` Peter Maydell
  2014-12-15 13:32   ` Paolo Bonzini
  2014-12-15 13:35   ` Mark Burton
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Maydell @ 2014-12-15 13:28 UTC (permalink / raw)
  To: Mark Burton; +Cc: mttcg, Developers qemu-devel

On 15 December 2014 at 12:56, Mark Burton <mark.burton@greensocs.com> wrote:
> One proposal is ‘simply’ to add a mutex around this code, such
> that multi-threaded TCG will correctly update/read these saved
> address/values.
> This _should_ maintain the status-quo. Things that were broken
> before will remain broken, nothing new should break. The concern
> is that the fact that the TCG was previously uni-threaded MAY be
> masking problems with this code that we are not taking into account.

Personally I would start out with this approach. We're going to
need a "do this whole sequence atomically wrt other guest CPUs"
mechanism anyway, so it's not implementing something we wouldn't
otherwise need. And it's the simple thing to do. It's certainly
possible to do a more architecturally correct ld/st exclusive
implementation along the lines of how we manage TB invalidation
with the dirty bitmap, but if we can do without it then we
should try to keep the scope of this project constrained: it's
a big enough job as it is.

-- PMM

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:23   ` Peter Maydell
@ 2014-12-15 13:28     ` Paolo Bonzini
  2014-12-15 13:36       ` Peter Maydell
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-15 13:28 UTC (permalink / raw)
  To: Peter Maydell; +Cc: mttcg, Mark Burton, Developers qemu-devel



On 15/12/2014 14:23, Peter Maydell wrote:
> > What are the intended semantics for reads/writes from the outside world?
> >  Should they cause the ll/sc pair to retry or not?
> Assuming that by "outside world" you mean "some other CPU in this
> cluster" [in ARM-speak, some other observer in the same shareability
> domain]: writes should. Reads should not.

No, I mean from DMA.  Other CPUs of course need to abort reads/writes.

The CPU that executes LL can invalidate the physical address on all
other CPUs' TLBs.  Attempts to write to that address then would fill the
TLB entry with the new TLB_LL bit.
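
A toy model of that idea, under several assumptions: each CPU has a one-entry write TLB, addresses are plain integers, the reservation is tracked per page, and the flag values are invented. TLB_LL is the proposed (not existing) bit; do_ll(), do_sc(), do_store() and the local tlb_fill() are hypothetical names that only model the bookkeeping and do not touch memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS   12
#define PAGE_MASK   (~(((uintptr_t)1 << PAGE_BITS) - 1))
#define TLB_INVALID ((uintptr_t)1 << 0)   /* illustrative flag values */
#define TLB_LL      ((uintptr_t)1 << 1)
#define NCPUS       2

/* One-entry "TLB" per CPU: page address plus low-order flag bits. */
static uintptr_t addr_write[NCPUS];
static bool      ll_active;   /* is there a live reservation? */
static int       ll_owner;    /* CPU that executed the LL */
static uintptr_t ll_page;     /* page containing the reserved address */

static void tlb_reset(void)
{
    for (int i = 0; i < NCPUS; i++) {
        addr_write[i] = TLB_INVALID;
    }
    ll_active = false;
}

/* Refill: entries for the reserved page get TLB_LL on non-owner CPUs. */
static void tlb_fill(int cpu, uintptr_t addr)
{
    uintptr_t e = addr & PAGE_MASK;

    if (ll_active && e == ll_page && cpu != ll_owner) {
        e |= TLB_LL;
    }
    addr_write[cpu] = e;
}

/* LL: record the reservation and drop the page from other CPUs' TLBs. */
static void do_ll(int cpu, uintptr_t addr)
{
    assert(cpu >= 0 && cpu < NCPUS);
    ll_active = true;
    ll_owner = cpu;
    ll_page = addr & PAGE_MASK;
    for (int i = 0; i < NCPUS; i++) {
        if (i != cpu && (addr_write[i] & PAGE_MASK) == ll_page) {
            addr_write[i] = TLB_INVALID;
        }
    }
}

/* Ordinary store; returns true if it took the fast path. */
static bool do_store(int cpu, uintptr_t addr)
{
    assert(cpu >= 0 && cpu < NCPUS);
    if (addr_write[cpu] == (addr & PAGE_MASK)) {
        return true;              /* no flag bits set: fast path */
    }
    if ((addr_write[cpu] & TLB_INVALID) ||
        (addr_write[cpu] & PAGE_MASK) != (addr & PAGE_MASK)) {
        tlb_fill(cpu, addr);
    }
    if (addr_write[cpu] & TLB_LL) {
        ll_active = false;        /* slow path breaks the reservation */
        addr_write[cpu] &= ~TLB_LL;
    }
    return false;
}

/* SC: returns 0 on success, 1 if the reservation was lost. */
static int do_sc(int cpu, uintptr_t addr)
{
    bool ok = ll_active && ll_owner == cpu &&
              (addr & PAGE_MASK) == ll_page;

    ll_active = false;
    return ok ? 0 : 1;
}
```

The point of the sketch is that stores on other CPUs pay nothing until an LL targets their page; after that, the first store to the page is forced onto the slow path, where it can break the reservation.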

Paolo

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:28 ` Peter Maydell
@ 2014-12-15 13:32   ` Paolo Bonzini
  2014-12-15 13:37     ` Mark Burton
  2014-12-15 13:39     ` Peter Maydell
  2014-12-15 13:35   ` Mark Burton
  1 sibling, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-15 13:32 UTC (permalink / raw)
  To: Peter Maydell, Mark Burton; +Cc: mttcg, Developers qemu-devel



On 15/12/2014 14:28, Peter Maydell wrote:
> Personally I would start out with this approach. We're going to
> need a "do this whole sequence atomically wrt other guest CPUs"
> mechanism anyway, so it's not implementing something we wouldn't
> otherwise need. And it's the simple thing to do. It's certainly
> possible to do a more architecturally correct ld/st exclusive
> implementation along the lines of how we manage TB invalidation
> with the dirty bitmap, but if we can do without it then we
> should try to keep the scope of this project constrained: it's
> a big enough job as it is.

How would "add a mutex" work unless you add a mutex or CAS to each and
every qemu_st operation?

Paolo

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:28 ` Peter Maydell
  2014-12-15 13:32   ` Paolo Bonzini
@ 2014-12-15 13:35   ` Mark Burton
  2014-12-15 18:15     ` Lluís Vilanova
  1 sibling, 1 reply; 14+ messages in thread
From: Mark Burton @ 2014-12-15 13:35 UTC (permalink / raw)
  To: Peter Maydell; +Cc: mttcg, Developers qemu-devel

We will roll a patch for this approach shortly.
For the ‘better’ approach - I think it’s something we will consider doing, but as you say, one thing at a time.
I don’t think it will be too bad to implement, given what already exists in the TLBs (except, I think, if we have to protect, for some architecture or other, against non-atomic reads from an address marked atomic).  I think we can treat this independently (unless we discover that it won’t work without :-) )


Cheers

Mark.

> On 15 Dec 2014, at 14:28, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On 15 December 2014 at 12:56, Mark Burton <mark.burton@greensocs.com> wrote:
>> One proposal is ‘simply’ to add a mutex around this code, such
>> that multi-threaded TCG will correctly update/read these saved
>> address/values.
>> This _should_ maintain the status-quo. Things that were broken
>> before will remain broken, nothing new should break. The concern
>> is that the fact that the TCG was previously uni-threaded MAY be
>> masking problems with this code that we are not taking into account.
> 
> Personally I would start out with this approach. We're going to
> need a "do this whole sequence atomically wrt other guest CPUs"
> mechanism anyway, so it's not implementing something we wouldn't
> otherwise need. And it's the simple thing to do. It's certainly
> possible to do a more architecturally correct ld/st exclusive
> implementation along the lines of how we manage TB invalidation
> with the dirty bitmap, but if we can do without it then we
> should try to keep the scope of this project constrained: it's
> a big enough job as it is.
> 
> -- PMM


* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:28     ` Paolo Bonzini
@ 2014-12-15 13:36       ` Peter Maydell
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2014-12-15 13:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mttcg, Mark Burton, Developers qemu-devel

On 15 December 2014 at 13:28, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 15/12/2014 14:23, Peter Maydell wrote:
>> > What are the intended semantics for reads/writes from the outside world?
>> >  Should they cause the ll/sc pair to retry or not?
>> Assuming that by "outside world" you mean "some other CPU in this
>> cluster" [in ARM-speak, some other observer in the same shareability
>> domain]: writes should. Reads should not.
>
> No, I mean from DMA.  Other CPU of course need to abort reads/writes.

Depends whether the DMAing device is in the same shareability domain.
Sufficiently clever DMA devices (probably ones which do cache-coherent
DMA; GPUs spring to mind as a possibility) probably would; plain old
DMA devices not. In practice anybody trying to do DMA onto an ll/sc
mutex location is out of their mind and we can ignore this possibility.

-- PMM

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:32   ` Paolo Bonzini
@ 2014-12-15 13:37     ` Mark Burton
  2014-12-15 13:39     ` Peter Maydell
  1 sibling, 0 replies; 14+ messages in thread
From: Mark Burton @ 2014-12-15 13:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mttcg, Peter Maydell, Developers qemu-devel

(note the address of the mttcg list server)

> On 15 Dec 2014, at 14:32, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> 
> 
> On 15/12/2014 14:28, Peter Maydell wrote:
>> Personally I would start out with this approach. We're going to
>> need a "do this whole sequence atomically wrt other guest CPUs"
>> mechanism anyway, so it's not implementing something we wouldn't
>> otherwise need. And it's the simple thing to do. It's certainly
>> possible to do a more architecturally correct ld/st exclusive
>> implementation along the lines of how we manage TB invalidation
>> with the dirty bitmap, but if we can do without it then we
>> should try to keep the scope of this project constrained: it's
>> a big enough job as it is.
> 
> How would "add a mutex" work unless you add a mutex or CAS to each and
> every qemu_st operation?

It’ll do what it does now, i.e. it won’t ‘work’ the way the architecture is defined, but it will work as well as it does now.

Cheers

Mark

> 
> Paolo


* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:32   ` Paolo Bonzini
  2014-12-15 13:37     ` Mark Burton
@ 2014-12-15 13:39     ` Peter Maydell
  2014-12-15 13:43       ` Mark Burton
  2014-12-15 13:46       ` Paolo Bonzini
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Maydell @ 2014-12-15 13:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Mark Burton, Developers qemu-devel

[I'm getting bounces from mttcg@greensocs.com so have taken
them off cc:
550 5.1.1 <mttcg@greensocs.com>: Recipient address rejected: User
unknown in virtual mailbox table]

On 15 December 2014 at 13:32, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 15/12/2014 14:28, Peter Maydell wrote:
>> Personally I would start out with this approach. We're going to
>> need a "do this whole sequence atomically wrt other guest CPUs"
>> mechanism anyway, so it's not implementing something we wouldn't
>> otherwise need. And it's the simple thing to do. It's certainly
>> possible to do a more architecturally correct ld/st exclusive
>> implementation along the lines of how we manage TB invalidation
>> with the dirty bitmap, but if we can do without it then we
>> should try to keep the scope of this project constrained: it's
>> a big enough job as it is.
>
> How would "add a mutex" work unless you add a mutex or CAS to each and
> every qemu_st operation?

Same way our current approach works -- we simply don't implement
"stores interrupt ll/sc operations": only a store-conditional
can break a load-locked's lock. In practice this works ok
because the stereotypical sequences that guests use don't rely
on this part of the spec.

-- PMM

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:39     ` Peter Maydell
@ 2014-12-15 13:43       ` Mark Burton
  2014-12-15 13:46       ` Paolo Bonzini
  1 sibling, 0 replies; 14+ messages in thread
From: Mark Burton @ 2014-12-15 13:43 UTC (permalink / raw)
  To: Peter Maydell, mttcg; +Cc: Paolo Bonzini, Developers qemu-devel


> On 15 Dec 2014, at 14:39, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> [I'm getting bounces from mttcg@greensocs.com so have taken
> them off cc:
> 550 5.1.1 <mttcg@greensocs.com>: Recipient address rejected: User
> unknown in virtual mailbox table]
> 

The address should be: mttcg@listserver.greensocs.com
Not sure who introduced the other address.

Cheers
Mark.



> On 15 December 2014 at 13:32, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> 
>> 
>> On 15/12/2014 14:28, Peter Maydell wrote:
>>> Personally I would start out with this approach. We're going to
>>> need a "do this whole sequence atomically wrt other guest CPUs"
>>> mechanism anyway, so it's not implementing something we wouldn't
>>> otherwise need. And it's the simple thing to do. It's certainly
>>> possible to do a more architecturally correct ld/st exclusive
>>> implementation along the lines of how we manage TB invalidation
>>> with the dirty bitmap, but if we can do without it then we
>>> should try to keep the scope of this project constrained: it's
>>> a big enough job as it is.
>> 
>> How would "add a mutex" work unless you add a mutex or CAS to each and
>> every qemu_st operation?
> 
> Same way our current approach works -- we simply don't implement
> "stores interrupt ll/sc operations": only a store-conditional
> can break a load-locked's lock. In practice this works ok
> because the stereotypical sequences that guests use don't rely
> on this part of the spec.
> 
> -- PMM


* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:39     ` Peter Maydell
  2014-12-15 13:43       ` Mark Burton
@ 2014-12-15 13:46       ` Paolo Bonzini
  1 sibling, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2014-12-15 13:46 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Mark Burton, Developers qemu-devel



On 15/12/2014 14:39, Peter Maydell wrote:
>> >
>> > How would "add a mutex" work unless you add a mutex or CAS to each and
>> > every qemu_st operation?
> Same way our current approach works -- we simply don't implement
> "stores interrupt ll/sc operations": only a store-conditional
> can break a load-locked's lock. In practice this works ok
> because the stereotypical sequences that guests use don't rely
> on this part of the spec.

Makes sense, thanks!

Paolo

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:16 ` Paolo Bonzini
  2014-12-15 13:23   ` Peter Maydell
@ 2014-12-15 14:09   ` Peter Maydell
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2014-12-15 14:09 UTC (permalink / raw)
  To: mttcg, QEMU Developers; +Cc: Paolo Bonzini, Mark Burton

On 15 December 2014 at 13:16, Paolo Bonzini <pbonzini@redhat.com> wrote:
> If not, it should not need any change to the memory API; you can do it
> entirely within cputlb.c, roughly the same as the handling of
> TLB_NOTDIRTY.  It also marks pages as I/O, but only internally within TCG.

Speaking of TLB_NOTDIRTY, I just wrote up a summary of how that
works for a private email, so I figured I might as well send it
here too so it's in the qemu-devel mail archives; it's probably
not new information to anybody involved in this immediate
conversation.

How we arrange to throw away cached translations when the guest
writes to that part of memory:

 * we have two data structures effectively tracking dirty status:
   (1) there are a set of bitmaps which track different kinds of
   dirtiness (the DIRTY_MEMORY_*); the functions for manipulating
   these are mostly in ram_addr.h. One of the bitmaps is for
   DIRTY_MEMORY_CODE.
   (2) where we have an entry in the QEMU TLB for a page which
   is backed by host RAM, we may set the TLB_NOTDIRTY bit in
   the addr_write TLB entry field (TLB_NOTDIRTY is one of
   several low order bits that can be set in what is otherwise
   a page-aligned virtual address in the TLB structure. TLB_MMIO
   is another, indicating that the entry is not RAM at all.)
   TLB entries come and go, but the bitmaps cover all of physical
   RAM. When a TLB entry is present then the NOTDIRTY flag should
   be just a cache for "at least one of the dirty bitmaps says
   this page is not dirty".
 * when we generate code we call tlb_protect_code() (from
   tb_alloc_page()): this calls cpu_physical_memory_reset_dirty(),
   which both updates the dirty bitmap data structure (marking
   the region as clean in the DIRTY_MEMORY_CODE bitmap) and also
   calls cpu_tlb_reset_dirty_all() to OR in the TLB_NOTDIRTY
   flag for any present TLB entries in the range
 * when we add an entry to the TLB, tlb_set_page() will OR in
   the TLB_NOTDIRTY bit if the bitmap says this is clean memory,
   so the two structures stay in sync
 * tlb_set_page() also calls memory_region_section_get_iotlb()
   to get an iotlb entry for this RAM, which is what will be
   used on the slow path. For RAM this will be io_mem_notdirty.
 * if the guest attempts a read, we don't do anything special
   because this uses addr_read, not addr_write
 * for a guest write, the generated code will look at addr_write;
   it takes the fast path if the low order bits are clear
   (indicating dirty host RAM). Otherwise we take the slow
   path (clean RAM, MMIO, nothing present, etc etc).
 * we then follow the slow path without special casing RAM,
   which means we'll use the iotlb entry set up when the TLB
   entry was populated, which is io_mem_notdirty.
 * notdirty_mem_write() will invalidate the cached TBs if
   the DIRTY_MEMORY_CODE bitmap says this memory is clean,
   and do the access the slow way. We then mark the TLB entry
   as dirty by calling tlb_set_dirty, so next time we'll take
   the fast path. (There's an optimisation wrinkle here:
   tb_invalidate_phys_page_fast() is complicated because it
   tries to avoid simply nuking every TB in the page. So it
   might need to keep accesses on the slow path. It only calls
   tlb_unprotect_code_phys() to update the DIRTY_MEMORY_CODE
   bitmap if every TB on the page has been invalidated. This
   is why notdirty_mem_write()'s call to tlb_set_dirty() is
   conditional.)
 * writes to already-dirty memory can take the fast path,
   which just writes to the host RAM without calling out
   or checking any dirty bits.
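
The list above can be condensed into a small standalone model. This is a sketch under simplifying assumptions, not QEMU's code: the TLB is direct-mapped with an identity mapping, one bool per page stands in for the DIRTY_MEMORY_CODE bitmap, all TBs on a page are invalidated at once (so the partial-invalidation wrinkle is ignored), and every model_* name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS    12
#define PAGE_SIZE    (1u << PAGE_BITS)
#define PAGE_MASK    (~(uint32_t)(PAGE_SIZE - 1))
#define NPAGES       4
#define TLB_NOTDIRTY (1u << 0)            /* illustrative bit position */

static uint8_t  ram[NPAGES * PAGE_SIZE];
static bool     code_dirty[NPAGES];       /* stand-in for DIRTY_MEMORY_CODE */
static bool     has_tb[NPAGES];           /* page has cached translations */
static uint32_t tlb_addr_write[NPAGES];   /* page address | flag bits */
static int      slow_writes, tb_invalidations;

static void model_init(void)
{
    for (uint32_t i = 0; i < NPAGES; i++) {
        code_dirty[i] = true;
        has_tb[i] = false;
        tlb_addr_write[i] = i << PAGE_BITS;
    }
    slow_writes = 0;
    tb_invalidations = 0;
}

/* Translating code from a page: mark it clean and flag the TLB entry. */
static void model_protect_code(uint32_t page)
{
    assert(page < NPAGES);
    has_tb[page] = true;
    code_dirty[page] = false;
    tlb_addr_write[page] |= TLB_NOTDIRTY;
}

/* Slow path: drop stale translations, do the write, re-mark as dirty. */
static void model_notdirty_write(uint32_t addr, uint8_t val)
{
    uint32_t page = addr >> PAGE_BITS;

    slow_writes++;
    if (!code_dirty[page]) {
        if (has_tb[page]) {
            has_tb[page] = false;
            tb_invalidations++;
        }
        code_dirty[page] = true;
    }
    ram[addr] = val;
    tlb_addr_write[page] &= ~TLB_NOTDIRTY;   /* next write is fast */
}

/* Guest store: fast path only when no low-order flag bits are set. */
static void model_store8(uint32_t addr, uint8_t val)
{
    uint32_t page = addr >> PAGE_BITS;

    assert(page < NPAGES);
    if (tlb_addr_write[page] == (addr & PAGE_MASK)) {
        ram[addr] = val;
        return;
    }
    model_notdirty_write(addr, val);
}
```

In this model the first store to a page after model_protect_code() takes the slow path and invalidates the translations; later stores to the same page go straight to RAM.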

Note that for linux-user mode the mechanism is totally
different, because we don't have a softmmu TLB data structure;
instead we use mprotect to write-protect the page, and then
in the SIGSEGV handler we may throw away cached TBs before
un-write-protecting it.
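
A minimal, Linux-flavoured sketch of that mprotect scheme, assuming a single guest page and a counter in place of real TB invalidation. The names guest_page, tb_flushes, protect_code_page() and guest_store8() are hypothetical. Calling mprotect() from a signal handler is not formally async-signal-safe, although it works in practice for this kind of write-fault handling.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static uint8_t *guest_page;                 /* page holding translated code */
static size_t page_size;
static volatile sig_atomic_t tb_flushes;    /* stand-in for TB invalidation */

/* Write fault on the protected page: drop "TBs", then allow writes. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    uint8_t *fault = (uint8_t *)info->si_addr;

    (void)sig;
    (void)ctx;
    if (guest_page && fault >= guest_page &&
        fault < guest_page + page_size) {
        tb_flushes++;
        mprotect(guest_page, page_size, PROT_READ | PROT_WRITE);
        return;                 /* the faulting store is restarted */
    }
    _exit(1);                   /* unrelated fault: give up */
}

/* Map one page, install the handler, and write-protect the page. */
static int protect_code_page(void)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    guest_page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guest_page == MAP_FAILED) {
        return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    /* Some hosts report write-protection faults as SIGBUS. */
    if (sigaction(SIGSEGV, &sa, NULL) != 0 ||
        sigaction(SIGBUS, &sa, NULL) != 0) {
        return -1;
    }
    return mprotect(guest_page, page_size, PROT_READ);
}

/* Guest store; volatile so the compiler really emits the write. */
static void guest_store8(size_t off, uint8_t val)
{
    assert(off < page_size);
    ((volatile uint8_t *)guest_page)[off] = val;
}
```

The first guest_store8() after protect_code_page() faults once and is then retried successfully; later stores to the page do not fault until it is protected again.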

-- PMM

* Re: [Qemu-devel] Atomic Instructions - comments please
  2014-12-15 13:35   ` Mark Burton
@ 2014-12-15 18:15     ` Lluís Vilanova
  0 siblings, 0 replies; 14+ messages in thread
From: Lluís Vilanova @ 2014-12-15 18:15 UTC (permalink / raw)
  To: Mark Burton; +Cc: mttcg, Peter Maydell, Developers qemu-devel

Mark Burton writes:

> We will roll a patch for this approach shortly.
> For the ‘better’ approach - I think it’s something we will consider doing…. but as you say, one thing at a time.
> I dont think it will be too bad to implement, given what already exists in the
> tlb’s - (except if we have to protect (for some architecture or other) against
> non-atomic reads from an address marked atomic, I think).  I think we can treat
> this independently (unless we discover that it won’t work without :-) )

I think something similar was discussed before, but you can also hybridize the
current approach with mprotect. The main problem I see is that every CPU TLB
will have to mmap+mprotect a host page (a different host page for every guest
CPU page table, but not necessarily every guest CPU).

On the bright side, that will avoid locks in the fast path. Instead of locking,
the fast path can rely on mprotect and only check if the target page contains
data or non-translated code in the case of writes. Otherwise you must resort to
locking, but that can be moved to a non-faulting slow path.

This should leave a fast path with the same complexity as today's, but I'm not
sure how much overhead this would add to other paths (or how frequently they
would be taken).


Best,
  Lluis




> Cheers

> Mark.

>> On 15 Dec 2014, at 14:28, Peter Maydell <peter.maydell@linaro.org> wrote:
>> 
>> On 15 December 2014 at 12:56, Mark Burton <mark.burton@greensocs.com> wrote:
>>> One proposal is ‘simply’ to add a mutex around this code, such
>>> that multi-threaded TCG will correctly update/read these saved
>>> address/values.
>>> This _should_ maintain the status-quo. Things that were broken
>>> before will remain broken, nothing new should break. The concern
>>> is that the fact that the TCG was previously uni-threaded MAY be
>>> masking problems with this code that we are not taking into account.
>> 
>> Personally I would start out with this approach. We're going to
>> need a "do this whole sequence atomically wrt other guest CPUs"
>> mechanism anyway, so it's not implementing something we wouldn't
>> otherwise need. And it's the simple thing to do. It's certainly
>> possible to do a more architecturally correct ld/st exclusive
>> implementation along the lines of how we manage TB invalidation
>> with the dirty bitmap, but if we can do without it then we
>> should try to keep the scope of this project constrained: it's
>> a big enough job as it is.
>> 
>> -- PMM


-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth

Thread overview: 14+ messages
2014-12-15 12:56 [Qemu-devel] Atomic Instructions - comments please Mark Burton
2014-12-15 13:16 ` Paolo Bonzini
2014-12-15 13:23   ` Peter Maydell
2014-12-15 13:28     ` Paolo Bonzini
2014-12-15 13:36       ` Peter Maydell
2014-12-15 14:09   ` Peter Maydell
2014-12-15 13:28 ` Peter Maydell
2014-12-15 13:32   ` Paolo Bonzini
2014-12-15 13:37     ` Mark Burton
2014-12-15 13:39     ` Peter Maydell
2014-12-15 13:43       ` Mark Burton
2014-12-15 13:46       ` Paolo Bonzini
2014-12-15 13:35   ` Mark Burton
2014-12-15 18:15     ` Lluís Vilanova
