qemu-devel.nongnu.org archive mirror
* Question about atomics
@ 2022-03-08  4:18 Warner Losh
  2022-03-08  5:00 ` Richard Henderson
  0 siblings, 1 reply; 16+ messages in thread
From: Warner Losh @ 2022-03-08  4:18 UTC (permalink / raw)
  To: QEMU Developers


I have a question related to the user-mode emulation and atomics. I asked
on IRC, but thinking about it, I think it may be too complex to discuss in
that medium...

In FreeBSD we have a system call that uses host atomic operations to
interact with memory that userland also accesses using atomic operations.

In bsd-user we call the kernel with a special flag for dealing with 32-bit
processes running on a 64-bit kernel. In this case, we use 32-bit-sized
atomics to set variables in the address space of the bsd-user guest. This
is used when running armv7 binaries on amd64 hosts.

First question: Is this expected to work? I know I'm a bit vague, so as a
followup question: If there are restrictions on this, what might they be? Do
some classes of atomic operations work, while others may fail or need
additional cooperation? Are there any conformance tests I could compile for
FreeBSD/armv7 to test the hypothesis that atomic operations are misbehaving?

I'm asking because I'm seeing a rare, but not rare enough, race that's
corrupting state in ways that only appear to be possible when pthread
mutexes aren't working (which only break when atomic operations are
broken). So far my efforts to narrow this down have been unsuccessful, and
I'm looking both to understand qemu/tcg better and to reduce the
problem space to search...

Thanks for any help you might be able to give.

Warner


* Re: Question about atomics
  2022-03-08  4:18 Question about atomics Warner Losh
@ 2022-03-08  5:00 ` Richard Henderson
  2022-03-08 14:09   ` Warner Losh
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-08  5:00 UTC (permalink / raw)
  To: Warner Losh, QEMU Developers

On 3/7/22 18:18, Warner Losh wrote:
> I have a question related to the user-mode emulation and atomics. I asked on IRC, but 
> thinking about it, I think it may be too complex to discuss in that medium...
> 
> In FreeBSD we have a system call that uses host atomic operations to interact memory that 
> userland also interacts with using atomic operations.
> 
> In bsd-user we call the kernel with a special flag for dealing with 32-bit processes 
> running on a 64-bit kernel. In this case, we use 32-bit-sized atomics to set variables in 
> the address space of the bsd-user guest. This is used when running armv7 binaries on amd64 
> hosts.
> 
> First question: Is this expected to work? I know I'm a bit vague, so as a followup 
> question: If there's restrictions on this, what might they be? Do some classes of atomic 
> operations work, while others may fail or need additional cooperation? Are there any 
> conformance tests I could compile for FreeBSD/armv7 to test the hypothesis that atomic 
> operations are misbehaving?

Yes, qatomic_foo is expected to work.  It's what we use across threads, and it is expected 
to work "in kernel mode", i.e. within cpu_loop().

There are compile-time restrictions on the set of atomic operations, mostly based on what 
the host supports.  But anything that actually compiles is expected to work (there are a 
set of ifdefs if you need something more than the default).

Beyond that, there is start_exclusive() / end_exclusive() which will stop-the-world and 
make sure that the current thread is the only one running.

> Thanks for any help you might be able to give.

Show the code in question?


r~



* Re: Question about atomics
  2022-03-08  5:00 ` Richard Henderson
@ 2022-03-08 14:09   ` Warner Losh
  2022-03-08 14:26     ` Paolo Bonzini
  0 siblings, 1 reply; 16+ messages in thread
From: Warner Losh @ 2022-03-08 14:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: QEMU Developers


On Mon, Mar 7, 2022 at 10:00 PM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 3/7/22 18:18, Warner Losh wrote:
> > I have a question related to the user-mode emulation and atomics. I
> asked on IRC, but
> > thinking about it, I think it may be too complex to discuss in that
> medium...
> >
> > In FreeBSD we have a system call that uses host atomic operations to
> interact memory that
> > userland also interacts with using atomic operations.
> >
> > In bsd-user we call the kernel with a special flag for dealing with
> 32-bit processes
> > running on a 64-bit kernel. In this case, we use 32-bit-sized atomics to
> set variables in
> > the address space of the bsd-user guest. This is used when running armv7
> binaries on amd64
> > hosts.
> >
> > First question: Is this expected to work? I know I'm a bit vague, so as
> a followup
> > question: If there's restrictions on this, what might they be? Do some
> classes of atomic
> > operations work, while others may fail or need additional cooperation?
> Are there any
> > conformance tests I could compile for FreeBSD/armv7 to test the
> hypothesis that atomic
> > operations are misbehaving?
>
> Yes, qatomic_foo is expected to work.  It's what we use across threads,
> and it is expected
> to work "in kernel mode", i.e. within cpu_loop().
>

Even when the writers are done in the context of system calls to the kernel?


> There are compile-time restrictions on the set of atomic operations,
> mostly based on what
> the host supports.  But anything that actually compiles is expected to
> work (there are a
> set of ifdefs if you need something more than the default).
>
> Beyond that, there is start_exclusive() / end_exclusive() which will
> stop-the-world and
> make sure that the current thread is the only one running.
>

So anything that happens in the BSD host kernel would need to be confined
to the one and only running thread? It also assumes only one thread is
scheduled and running, and that might be a source of 'brokenness' if
there's an issue in the BSD implementation of the mechanisms that are used
for that. And if the system call does this w/o using the
start_exclusive/end_exclusive stuff, is that a problem?


> > Thanks for any help you might be able to give.
>
> Show the code in question?
>

Which code? The test cases that are failing, or the bsd-user code in the
branch I suspect?

Warner


* Re: Question about atomics
  2022-03-08 14:09   ` Warner Losh
@ 2022-03-08 14:26     ` Paolo Bonzini
  2022-03-08 16:29       ` Warner Losh
  0 siblings, 1 reply; 16+ messages in thread
From: Paolo Bonzini @ 2022-03-08 14:26 UTC (permalink / raw)
  To: Warner Losh, Richard Henderson; +Cc: QEMU Developers

On 3/8/22 15:09, Warner Losh wrote:
> 
>     Yes, qatomic_foo is expected to work.  It's what we use across
>     threads, and it is expected to work "in kernel mode", i.e. within cpu_loop().
> 
> Even when the writers are done in the context of system calls to the kernel?

Yes.

That said, for the similar syscall in Linux we just forward it to the 
kernel (and the kernel obviously can only do an atomic---no 
start_exclusive/end_exclusive involved).

> And if the system call does this w/o using
> the start_exclusive/end_exclusive stuff, is that a problem?

If it does it without start_exclusive/end_exclusive they must use 
qatomic_foo().  If it does it with start_exclusive/end_exclusive, they 
can even write a compare-and-exchange as

     old = *(uint64_t *)g2h(cs, addr);
     if (old == oldval)
         *(uint64_t *)g2h(cs, addr) = new;

Paolo



* Re: Question about atomics
  2022-03-08 14:26     ` Paolo Bonzini
@ 2022-03-08 16:29       ` Warner Losh
  2022-03-13  4:59         ` Warner Losh
  0 siblings, 1 reply; 16+ messages in thread
From: Warner Losh @ 2022-03-08 16:29 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Richard Henderson, QEMU Developers


On Tue, Mar 8, 2022 at 7:26 AM Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 3/8/22 15:09, Warner Losh wrote:
> >
> >     Yes, qatomic_foo is expected to work.  It's what we use across
> >     threads, and it is expected to work "in kernel mode", i.e. within
> cpu_loop().
> >
> > Even when the writers are done in the context of system calls to the
> kernel?
>
> Yes.
>
> That said, for the similar syscall in Linux we just forward it to the
> kernel (and the kernel obviously can only do an atomic---no
> start_exclusive/end_exclusive involved).
>

OK. It seemed similar to futex, but I didn't know if that worked because
it restricted itself, or because all atomics worked when used from the
kernel :)


> > And if the system call does this w/o using
> > the start_exclusive/end_exclusive stuff, is that a problem?
>
> If it does it without start_exclusive/end_exclusive they must use
> qatomic_foo().


So this answer is in the context of *-user implementing a system call
that's executed as a callout from cpu_loop()? Or does the kernel
have to use the C11 atomics that qatomic_foo() is based on? I'm
thinking the former based on the above, but want to make sure.


> If it does it with start_exclusive/end_exclusive, they
> can even write a compare-and-exchange as
>
>      old = *(uint64_t *)g2h(cs, addr);
>      if (old == oldval)
>          *(uint64_t *)g2h(cs, addr) = new;
>

 Nice.

The test program that's seeing corrupted mutex state is just two threads.
I've simplified it a bit (it's an ATF test, and there's a lot of
boilerplate that I removed, including validating the return values). It
looks pretty straightforward. Often it will work; sometimes, though, it
fails with an internal assertion in what implements pthread_mutex about
state that's not possible w/o the atomics/system calls that implement the
pthread_mutex failing to work.

Warner

P.S. Here's the code for reading purposes... W/o the headers it won't
compile, and the ATF stuff at the end comes from elsewhere...

static pthread_mutex_t static_mutex = PTHREAD_MUTEX_INITIALIZER;
static int global_x;
bool thread3_started = false;

static void *
mutex3_threadfunc(void *arg) {
        long count = *(int *)arg;

        thread3_started = true;
        while (count--) {
                pthread_mutex_lock(&static_mutex);
                global_x++;
                pthread_mutex_unlock(&static_mutex);
        }
        /* count is -1 here; main checks it via pthread_join() */
        return (void *)count;
}

int main(int argc, char **argv) {
        int count, count2;
        pthread_t new;
        void *joinval;

        global_x = 0;
        count = count2 = 1000;
        pthread_mutex_lock(&static_mutex);
        pthread_create(&new, NULL, mutex3_threadfunc, &count2);
        while (!thread3_started) {
                /* Wait for thread 3 to start to increase chance of race */
        }
        pthread_mutex_unlock(&static_mutex);
        while (count--) {
                pthread_mutex_lock(&static_mutex);
                global_x++;
                pthread_mutex_unlock(&static_mutex);
        }

        pthread_join(new, &joinval);
        pthread_mutex_lock(&static_mutex);
        ATF_REQUIRE_EQ_MSG(count, -1, "%d", count);
        ATF_REQUIRE_EQ_MSG((long)joinval, -1, "%ld", (long)joinval);
        ATF_REQUIRE_EQ_MSG(global_x, 1000 * 2, "%d vs %d", global_x,
            1000 * 2);
}


* Re: Question about atomics
  2022-03-08 16:29       ` Warner Losh
@ 2022-03-13  4:59         ` Warner Losh
  2022-03-13 16:47           ` Richard Henderson
  0 siblings, 1 reply; 16+ messages in thread
From: Warner Losh @ 2022-03-13  4:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Richard Henderson, QEMU Developers


On Tue, Mar 8, 2022 at 9:29 AM Warner Losh <imp@bsdimp.com> wrote:

>
>
> On Tue, Mar 8, 2022 at 7:26 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>> On 3/8/22 15:09, Warner Losh wrote:
>> >
>> >     Yes, qatomic_foo is expected to work.  It's what we use across
>> >     threads, and it is expected to work "in kernel mode", i.e. within
>> cpu_loop().
>> >
>> > Even when the writers are done in the context of system calls to the
>> kernel?
>>
>> Yes.
>>
>> That said, for the similar syscall in Linux we just forward it to the
>> kernel (and the kernel obviously can only do an atomic---no
>> start_exclusive/end_exclusive involved).
>>
>
> OK. It seemed similar to futex, but I didn't know if that worked because
> it restricted itself, or because all atomics worked when used from the
> kernel :)
>

So how do you handle multiple writers? I think I've found a way they
can race, but I need a sanity check to make sure I'm understanding
correctly.

FreeBSD's pthread_mutex is shared between the kernel and userland.
So it does a compare-and-set to take the lock. If the lock is uncontested
and unheld, we've taken the lock and return. Contested locks
are kicked to the kernel to wait. When userland releases the lock,
it signals the kernel to wake up the waiter via a system call. The kernel
then does a cas to try to acquire the lock. It either returns with the lock
held, or goes back to sleep. Thus we have atomics operating both in
the kernel (via standard host atomics) and in userland (done
via start/end_exclusive). So I'm struggling with how start/end_exclusive
interacts with the cas from the kernel. The kernel can modify the value
after tcg has read the old value but before it goes to set it, thinking
that it's OK.

Basically, the lock is unlocked, so it has '0' in the owner field, which is
the oldval for the cas operation.

Thread 1 (tcg inside of start_exclusive)
     old = *(uint64_t *)g2h(cs, addr);
     if (old == oldval)
         /* kernel does the cas here, finds it uncontested and
          * stores its ownership */
         *(uint64_t *)g2h(cs, addr) = new;
         /* now the kernel value is overwritten; two threads think
          * they own the lock */

Or am I missing something there?
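
The interleaving described above can be replayed deterministically in a single thread. The following is only a sketch under stated assumptions: `owner` stands in for the shared lock word, and `KERNEL_TID`, `GUEST_TID`, `kernel_cas()` and both `replay_*` functions are hypothetical names that exist neither in QEMU nor in FreeBSD. It contrasts a plain read/compare/store on the emulator side with a host compare-and-swap on both sides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical owner IDs; 0 means "unowned", as in the description above. */
enum { UNOWNED = 0, KERNEL_TID = 100, GUEST_TID = 200 };

/* Stand-in for the lock word shared by the kernel and the emulated guest. */
static _Atomic uint32_t owner;

/* Kernel side: a real host compare-and-swap. */
static bool kernel_cas(uint32_t expected, uint32_t desired)
{
    return atomic_compare_exchange_strong(&owner, &expected, desired);
}

/*
 * Replay the problematic interleaving step by step so that the outcome is
 * deterministic. Returns how many parties believe they acquired the lock.
 */
static int replay_mixed_cas_race(void)
{
    int winners = 0;

    atomic_store(&owner, UNOWNED);

    /* Emulator, inside its exclusive section: read the old value. */
    uint32_t old = atomic_load(&owner);

    /* The kernel is not stopped by that section; its CAS lands here. */
    if (kernel_cas(UNOWNED, KERNEL_TID)) {
        winners++;
    }

    /* Emulator resumes: the stale comparison passes and the plain store
     * overwrites the kernel's ownership. */
    if (old == UNOWNED) {
        atomic_store(&owner, GUEST_TID);
        winners++;
    }
    return winners;
}

/* The same sequence with a host CAS on the emulator side as well. */
static int replay_host_cas_only(void)
{
    int winners = 0;
    uint32_t expected = UNOWNED;

    atomic_store(&owner, UNOWNED);
    if (kernel_cas(UNOWNED, KERNEL_TID)) {
        winners++;
    }
    /* This CAS observes the kernel's value and fails. */
    if (atomic_compare_exchange_strong(&owner, &expected, GUEST_TID)) {
        winners++;
    }
    return winners;
}
```

In the mixed replay both sides count themselves as owners; with a host CAS on both sides only the first one wins, and the lock word keeps the kernel's value.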

This doesn't necessarily need to work, but I'm trying to check whether I
understand the race well enough (and if so, I'll need to do something else
to implement these things).

Warner




* Re: Question about atomics
  2022-03-13  4:59         ` Warner Losh
@ 2022-03-13 16:47           ` Richard Henderson
  2022-03-13 16:57             ` Warner Losh
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-13 16:47 UTC (permalink / raw)
  To: Warner Losh, Paolo Bonzini; +Cc: QEMU Developers

On 3/12/22 20:59, Warner Losh wrote:
> FreeBSD's pthread_mutex is shared between the kernel and user land.
> So it does a compare and set to take the lock. Uncontested and unheld
> locks will mean we've taken the lock and return. Contested locks
> are kicked to the kernel to wait. When userland releases the lock
> it signals the kernel to wakeup via a system call. The kernel then
> does a cas to try to acquire the lock. It either returns with the lock
> held, or goes back to sleep. This we have atomics operating both in
> the kernel (via standard host atomics) and userland atomics done
> via start/end_exclusive.

You need to use standard host atomics for this case.
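
As a rough illustration of what "standard host atomics" could look like for a 32-bit lock word, here is a minimal sketch. It assumes a GCC- or Clang-compatible compiler (for the `__atomic` builtins); `host_cas32()` is a hypothetical helper, not a QEMU function. Inside QEMU the equivalent operation would presumably go through the qatomic_foo family (qatomic_cmpxchg()) on a host pointer obtained from g2h().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compare-and-swap a 32-bit word at a host address
 * (for example, one obtained by translating a guest address).
 * Returns the value that was in memory before the operation, so the caller
 * succeeded if and only if the return value equals oldval.
 */
static uint32_t host_cas32(uint32_t *host_addr, uint32_t oldval,
                           uint32_t newval)
{
    /* A 32-bit CAS needs natural alignment to be atomic on common hosts. */
    assert(((uintptr_t)host_addr & 3) == 0);

    uint32_t expected = oldval;

    /* On failure the builtin stores the current memory value in expected;
     * on success expected still holds oldval, which was the old contents. */
    __atomic_compare_exchange_n(host_addr, &expected, newval,
                                0 /* strong */,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}
```

Because the comparison and the store happen in one host instruction, a kernel-side CAS on the same word cannot slip in between them.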

r~



* Re: Question about atomics
  2022-03-13 16:47           ` Richard Henderson
@ 2022-03-13 16:57             ` Warner Losh
  2022-03-13 17:03               ` Richard Henderson
  0 siblings, 1 reply; 16+ messages in thread
From: Warner Losh @ 2022-03-13 16:57 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, QEMU Developers


On Sun, Mar 13, 2022, 10:47 AM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 3/12/22 20:59, Warner Losh wrote:
> > FreeBSD's pthread_mutex is shared between the kernel and user land.
> > So it does a compare and set to take the lock. Uncontested and unheld
> > locks will mean we've taken the lock and return. Contested locks
> > are kicked to the kernel to wait. When userland releases the lock
> > it signals the kernel to wakeup via a system call. The kernel then
> > does a cas to try to acquire the lock. It either returns with the lock
> > held, or goes back to sleep. This we have atomics operating both in
> > the kernel (via standard host atomics) and userland atomics done
> > via start/end_exclusive.
>
> You need to use standard host atomics for this case.
>

Or use the start/end_exclusive for both by emulating the kernel call, I
presume? It's the mixing that's the problem, right?

Warner


* Re: Question about atomics
  2022-03-13 16:57             ` Warner Losh
@ 2022-03-13 17:03               ` Richard Henderson
  2022-03-13 18:29                 ` Warner Losh
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-13 17:03 UTC (permalink / raw)
  To: Warner Losh; +Cc: Paolo Bonzini, QEMU Developers

On 3/13/22 09:57, Warner Losh wrote:
> 
> 
> On Sun, Mar 13, 2022, 10:47 AM Richard Henderson <richard.henderson@linaro.org 
> <mailto:richard.henderson@linaro.org>> wrote:
> 
>     On 3/12/22 20:59, Warner Losh wrote:
>      > FreeBSD's pthread_mutex is shared between the kernel and user land.
>      > So it does a compare and set to take the lock. Uncontested and unheld
>      > locks will mean we've taken the lock and return. Contested locks
>      > are kicked to the kernel to wait. When userland releases the lock
>      > it signals the kernel to wakeup via a system call. The kernel then
>      > does a cas to try to acquire the lock. It either returns with the lock
>      > held, or goes back to sleep. This we have atomics operating both in
>      > the kernel (via standard host atomics) and userland atomics done
>      > via start/end_exclusive.
> 
>     You need to use standard host atomics for this case.
> 
> 
> Or use the start/end_exclusive for both by emulating the kernel call, I presume? It's the 
> mixing that's the problem, right?

Well, preferably no.  Use start/end_exclusive only when you have no alternative, which for 
a simple CAS should not be the case on any FreeBSD host.

Using start/end_exclusive is entirely local to the current process, and means you don't 
have atomicity across processes.  Which can cause problems when emulating an entire chroot.
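
The cross-process point can be illustrated with a small POSIX sketch: two processes share one word through an anonymous shared mapping, and host atomic increments still add up, whereas a stop-the-world mechanism confined to one process has no way to pause the other. `count_across_processes()` is a hypothetical name, and the example assumes a host that provides `MAP_ANONYMOUS` and the GCC/Clang `__atomic` builtins.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork once and let parent and child each add `iterations` to a counter
 * that lives in memory shared between the two processes.
 * Returns the final counter value as seen by the parent.
 */
static uint32_t count_across_processes(uint32_t iterations)
{
    uint32_t *counter = mmap(NULL, sizeof(*counter), PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    assert(counter != MAP_FAILED);
    *counter = 0;

    pid_t child = fork();
    assert(child >= 0);

    /* Both processes run this loop concurrently on the same word. */
    for (uint32_t i = 0; i < iterations; i++) {
        __atomic_fetch_add(counter, 1, __ATOMIC_SEQ_CST);
    }

    if (child == 0) {
        _exit(0);               /* child: done, skip the parent's cleanup */
    }

    int status;
    pid_t waited = waitpid(child, &status, 0);
    assert(waited == child);

    uint32_t total = __atomic_load_n(counter, __ATOMIC_SEQ_CST);
    munmap(counter, sizeof(*counter));
    return total;
}
```

With host atomics no increment is lost, so the parent should see exactly twice the iteration count.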


r~



* Re: Question about atomics
  2022-03-13 17:03               ` Richard Henderson
@ 2022-03-13 18:29                 ` Warner Losh
  2022-03-13 20:19                   ` Richard Henderson
  0 siblings, 1 reply; 16+ messages in thread
From: Warner Losh @ 2022-03-13 18:29 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, QEMU Developers


On Sun, Mar 13, 2022 at 11:03 AM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 3/13/22 09:57, Warner Losh wrote:
> >
> >
> > On Sun, Mar 13, 2022, 10:47 AM Richard Henderson <
> richard.henderson@linaro.org
> > <mailto:richard.henderson@linaro.org>> wrote:
> >
> >     On 3/12/22 20:59, Warner Losh wrote:
> >      > FreeBSD's pthread_mutex is shared between the kernel and user
> land.
> >      > So it does a compare and set to take the lock. Uncontested and
> unheld
> >      > locks will mean we've taken the lock and return. Contested locks
> >      > are kicked to the kernel to wait. When userland releases the lock
> >      > it signals the kernel to wakeup via a system call. The kernel then
> >      > does a cas to try to acquire the lock. It either returns with the
> lock
> >      > held, or goes back to sleep. This we have atomics operating both
> in
> >      > the kernel (via standard host atomics) and userland atomics done
> >      > via start/end_exclusive.
> >
> >     You need to use standard host atomics for this case.
> >
> >
> > Or use the start/end_exclusive for both by emulating the kernel call, I
> presume? It's the
> > mixing that's the problem, right?
>
> Well, preferably no.  Use start/end_exclusive only when you have no
> alternative, which for
> a simple CAS should not be the case on any FreeBSD host.
>
> Using start/end_exclusive is entirely local to the current process, and
> means you don't
> have atomicity across processes.  Which can cause problems when emulating
> an entire chroot.
>

So I was assuming that the cas instructions for arm use start/end_exclusive
under the covers. Is that the case? Or is there something clever there I've
overlooked and the start/end_exclusive stuff is only used for fallbacks?

Warner


* Re: Question about atomics
  2022-03-13 18:29                 ` Warner Losh
@ 2022-03-13 20:19                   ` Richard Henderson
  2022-03-14  4:09                     ` Warner Losh
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-13 20:19 UTC (permalink / raw)
  To: Warner Losh; +Cc: Paolo Bonzini, QEMU Developers

On 3/13/22 11:29, Warner Losh wrote:
> So I was assuming that the cas instructions for arm use start/end_exclsive under the covers.
> Is that the case?

Nope.  The store exclusive guest insn is implemented with a host cmpxchg.
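
One way to picture a store-exclusive built on a host cmpxchg is the toy model below. It is a sketch, not QEMU's code: `struct excl_monitor`, `emu_ldrex()` and `emu_strex()` are hypothetical names, and the `__atomic` builtins assume a GCC- or Clang-compatible compiler. The load-exclusive remembers the address and the value it saw; the store-exclusive succeeds only if a host compare-and-swap still finds that value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU state: what the last load-exclusive observed. */
struct excl_monitor {
    uint32_t *addr;   /* host address of the monitored word, NULL if none */
    uint32_t  val;    /* value seen by the load-exclusive */
};

/* Load-exclusive: read the word and remember its address and value. */
static uint32_t emu_ldrex(struct excl_monitor *m, uint32_t *addr)
{
    m->addr = addr;
    m->val = __atomic_load_n(addr, __ATOMIC_SEQ_CST);
    return m->val;
}

/*
 * Store-exclusive: succeed only if the word still holds the remembered
 * value, checked and updated in one host compare-and-swap.
 * Returns 0 on success and 1 on failure, like the guest instruction.
 */
static int emu_strex(struct excl_monitor *m, uint32_t *addr, uint32_t newval)
{
    int failed = 1;

    if (m->addr == addr) {
        uint32_t expected = m->val;
        if (__atomic_compare_exchange_n(addr, &expected, newval,
                                        0 /* strong */,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) {
            failed = 0;
        }
    }
    m->addr = NULL;   /* the monitor is cleared either way */
    return failed;
}
```

A value-comparing scheme like this cannot notice a write that puts the same value back, which real exclusive monitors would catch; the model only shows where the host cmpxchg fits.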

Oh, I'd forgotten about the old arm cmpxchg64 syscall, which is still implemented with 
start/end_exclusive.  That should get fixed...


r~



* Re: Question about atomics
  2022-03-13 20:19                   ` Richard Henderson
@ 2022-03-14  4:09                     ` Warner Losh
  2022-03-14  4:36                       ` Richard Henderson
  2022-03-14  4:43                       ` Richard Henderson
  0 siblings, 2 replies; 16+ messages in thread
From: Warner Losh @ 2022-03-14  4:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, QEMU Developers


On Sun, Mar 13, 2022 at 2:19 PM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 3/13/22 11:29, Warner Losh wrote:
> > So I was assuming that the cas instructions for arm use
> start/end_exclsive under the covers.
> > Is that the case?
>
> Nope.  The store exclusive guest insn is implemented with a host cmpxchg.
>

Oh? Out of paranoia, how can I verify that this is the case when compiled
on FreeBSD? Perhaps the atomic sequence FreeBSD uses differs a little from
Linux and we don't trigger that code? Or there's some adjustment that I've
not made yet... The code seems to work on 3.1 but not on latest, and
there have been a lot of changes to tcg, so I'd like to rule it out, since
there's a lot of other change too and there are too many variables...


> Oh, I'd forgotten about the old arm cmpxchg64 syscall, which is still
> implemented with
> start/end_exclusive.  That should get fixed...
>

FreeBSD doesn't have this helper. So bsd-user doesn't implement it... The
only good thing about that is that there's nothing for me to fix :/...

Warner



* Re: Question about atomics
  2022-03-14  4:09                     ` Warner Losh
@ 2022-03-14  4:36                       ` Richard Henderson
  2022-03-14  5:42                         ` Warner Losh
  2022-03-14  4:43                       ` Richard Henderson
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-14  4:36 UTC (permalink / raw)
  To: Warner Losh; +Cc: Paolo Bonzini, QEMU Developers

On 3/13/22 21:09, Warner Losh wrote:
> Oh? Out of paranoia, how can I verify that this is the case when compiled on FreeBSD?
> Perhaps the atomic sequence FreeBSD uses differs a little from Linux and we don't trigger
> that code?

$ objdump -dr libqemu-arm-*-user.fa.p/accel_tcg_user-exec.c.o

0000000000001490 <helper_atomic_cmpxchgl_le>:
...
     14b7:       e8 04 ec ff ff          callq  c0 <atomic_mmu_lookup.constprop.0>
     14bc:       48 89 c2                mov    %rax,%rdx
     14bf:       44 89 e0                mov    %r12d,%eax
     14c2:       f0 44 0f b1 32          lock cmpxchg %r14d,(%rdx)


r~



* Re: Question about atomics
  2022-03-14  4:09                     ` Warner Losh
  2022-03-14  4:36                       ` Richard Henderson
@ 2022-03-14  4:43                       ` Richard Henderson
  2022-03-14  5:34                         ` Warner Losh
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Henderson @ 2022-03-14  4:43 UTC (permalink / raw)
  To: Warner Losh; +Cc: Paolo Bonzini, QEMU Developers

On 3/13/22 21:09, Warner Losh wrote:
> Oh? Out of paranoia, how can I verify that this is the case when compiled on FreeBSD?
> Perhaps the atomic sequence FreeBSD uses differs a little from Linux and we don't trigger
> that code? Or there's some adjustment that I've not made yet... the code seems to work
> on 3.1 but not on latest, and there's been a lot of changes to tcg, so I'd like to rule it
> out since there's a lot of other change too and there's too many variables...

Can you point me at this code on your branch?


r~



* Re: Question about atomics
  2022-03-14  4:43                       ` Richard Henderson
@ 2022-03-14  5:34                         ` Warner Losh
  0 siblings, 0 replies; 16+ messages in thread
From: Warner Losh @ 2022-03-14  5:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, QEMU Developers


On Sun, Mar 13, 2022 at 10:43 PM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 3/13/22 21:09, Warner Losh wrote:
> > Oh? Out of paranoia, how can I verify that this is the case when
> compiled on FreeBSD?
> > Perhaps the atomic sequence FreeBSD uses differs a little from Linux and
> we don't trigger
> > that code? Or there's some adjustment that I've not made yet... the code
> seems to work
> > on 3.1 but not on latest, and there's been a lot of changes to tcg, so
> I'd like to rule it
> > out since there's a lot of other change too and there's too many
> variables...
>
> Can you point me at this code on your branch?
>

I just pushed this to gitlab:

https://gitlab.com/bsdimp/qemu/-/tree/blitz

Thanks for any insight that you can provide...

Warner


* Re: Question about atomics
  2022-03-14  4:36                       ` Richard Henderson
@ 2022-03-14  5:42                         ` Warner Losh
  0 siblings, 0 replies; 16+ messages in thread
From: Warner Losh @ 2022-03-14  5:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, QEMU Developers


On Sun, Mar 13, 2022 at 10:36 PM Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 3/13/22 21:09, Warner Losh wrote:
> > Oh? Out of paranoia, how can I verify that this is the case when
> compiled on FreeBSD?
> > Perhaps the atomic sequence FreeBSD uses differs a little from Linux and
> we don't trigger
> > that code?
>
> $ objdump -dr libqemu-arm-*-user.fa.p/accel_tcg_user-exec.c.o
>
> 0000000000001490 <helper_atomic_cmpxchgl_le>:
> ...
>      14b7:       e8 04 ec ff ff          callq  c0
> <atomic_mmu_lookup.constprop.0>
>      14bc:       48 89 c2                mov    %rax,%rdx
>      14bf:       44 89 e0                mov    %r12d,%eax
>      14c2:       f0 44 0f b1 32          lock cmpxchg %r14d,(%rdx)
>

Looks like this compiles correctly on FreeBSD... We have something similar:

    1f69:       41 89 f1                mov    %esi,%r9d
    1f6c:       48 8b 3d 00 00 00 00    mov    0x0(%rip),%rdi        # 1f73 <helper_atomic_cmpxchgl_le+0x53>
    1f73:       64 48 8b 34 25 00 00    mov    %fs:0x0,%rsi
    1f7a:       00 00
    1f7c:       48 89 8e 00 00 00 00    mov    %rcx,0x0(%rsi)
    1f83:       89 d0                   mov    %edx,%eax
    1f85:       f0 46 0f b1 04 0f       lock cmpxchg %r8d,(%rdi,%r9,1)

Warner


