[Qemu-devel] Self-modifying test case for mttcg

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] Self-modifying test case for mttcg
@ 2015-07-21 10:58 Alexander Spyridakis
  2015-07-22 12:38 ` Andrew Jones
  0 siblings, 1 reply; 8+ messages in thread
From: Alexander Spyridakis @ 2015-07-21 10:58 UTC (permalink / raw)
  To: mttcg
  Cc: Claudio Fontana, Mark Burton, Alvise Rigo, QEMU Developers,
	Jani Kokkonen, Alex Bennée, KONRAD Frédéric

Hello all,

You can find a new self-modifying test case in the following branch:
> git clone https://git.virtualopensystems.com/dev/tcg_baremetal_tests.git -b smc_test

For each core, the test will run a small assembly snippet which
increments a variable. Immediately after, the snippet is modified in
memory to increment by 1 or 2 every other loop cycle, then passes
execution to the next core. At the end of the test we calculate the
expected result and compare it to the actual incremented variable. If
all code modifications happened correctly we pass the test.

The test case has been tested with upstream QEMU, MTTCG and KVM with
success. Next version of the test will include more corner cases, such
as changing TBs immediately after code modification, to make sure that
we cover every scenario.

To run it:
> make virt (or virt64/vexpress for other targets)
> ~/mttcg/arm-softmmu/qemu-system-arm -nographic -M virt -cpu cortex-a15 -kernel build-virt/image-virt.axf -smp 8

Also, by popular demand I started a port of the test for kvm-unit-tests:
> git clone https://git.virtualopensystems.com/dev/kvm-unit-tests.git

For the kvm-unit-tests version, I have some troubles with caches and
the MMU (which is disabled for this test). While TCG and MTTCG work,
KVM fails the test with strange results. I will keep looking to find
the exact problem.

Best regards.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-21 10:58 [Qemu-devel] Self-modifying test case for mttcg Alexander Spyridakis
@ 2015-07-22 12:38 ` Andrew Jones
  2015-07-22 13:06   ` Paolo Bonzini
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Andrew Jones @ 2015-07-22 12:38 UTC (permalink / raw)
  To: Alexander Spyridakis
  Cc: mttcg, Claudio Fontana, Mark Burton, Alvise Rigo, QEMU Developers,
	Jani Kokkonen, Alex Bennée, KONRAD Frédéric

On Tue, Jul 21, 2015 at 12:58:56PM +0200, Alexander Spyridakis wrote:
> Hello all,
> 
> You can find a new self-modifying test case in the following branch:
> > git clone https://git.virtualopensystems.com/dev/tcg_baremetal_tests.git -b smc_test
> 
> For each core, the test will run a small assembly snippet which
> increments a variable. Immediately after, the snippet is modified in
> memory to increment by 1 or 2 every other loop cycle, then passes
> execution to the next core. At the end of the test we calculate the
> expected result and compare it to the actual incremented variable. If
> all code modifications happened correctly we pass the test.
> 
> The test case has been tested with upstream QEMU, MTTCG and KVM with
> success. Next version of the test will include more corner cases, such
> as changing TBs immediately after code modification, to make sure that
> we cover every scenario.
> 
> To run it:
> > make virt (or virt64/vexpress for other targets)
> > ~/mttcg/arm-softmmu/qemu-system-arm -nographic -M virt -cpu cortex-a15 -kernel build-virt/image-virt.axf -smp 8
> 
> Also, by popular demand I started a port of the test for kvm-unit-tests:
> > git clone https://git.virtualopensystems.com/dev/kvm-unit-tests.git

I took a quick look at this and see issues with the test code. First,
you're spinning on a stack variable with this,

    /* Wait for our turn */
    while(next_cpu != cpu);

next_cpu needs to be global, and incremented atomically. I haven't gotten
around to adding atomic_add/inc yet, but it would easy, and I'm happy to
do it, even yet this week.

And, as for the MMU, I see from the comment in your test code that you're
hitting an exception when trying to modify code. This is because the code
is mapped readonly in order to use it from usermode. I suggest you modify
the page tables (see below for how) to map the code writeable. Do this
before kicking your secondary cpus, so they'll come up ready.

There are other issues you'll need to fix as well though in the test code;
count should be initialized, result should be volatile, others? I suggest
you make sure it works for one vcpu first.

For modifying page tables, I think something like this should work for
you (untested)

#include <asm/setup.h>
int main(void)
{
    mmu_set_range_ptes(mmu_idmap, PHYS_OFFSET, PHYS_OFFSET,
                       PHYS_END, __pgprot(PTE_WBWA));
    flush_tlb_all();

    ...

I look forward to seeing your fixed up kvm-unit-test test posted. Please
CC me on it.

drew

> 
> For the kvm-unit-tests version, I have some troubles with caches and
> the MMU (which is disabled for this test). While TCG and MTTCG work,
> KVM fails the test with strange results. I will keep looking to find
> the exact problem.
> 
> Best regards.
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-22 12:38 ` Andrew Jones
@ 2015-07-22 13:06   ` Paolo Bonzini
  2015-07-22 13:44     ` Andrew Jones
  2015-07-22 15:01   ` Andrew Jones
  2015-07-22 23:12   ` Alexander Spyridakis
  2 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2015-07-22 13:06 UTC (permalink / raw)
  To: Andrew Jones, Alexander Spyridakis
  Cc: mttcg, Claudio Fontana, Mark Burton, Alvise Rigo, QEMU Developers,
	Jani Kokkonen, Alex Bennée, KONRAD Frédéric



On 22/07/2015 14:38, Andrew Jones wrote:
> I took a quick look at this and see issues with the test code. First,
> you're spinning on a stack variable with this,
> 
>     /* Wait for our turn */
>     while(next_cpu != cpu);
> 
> next_cpu needs to be global, and incremented atomically. I haven't gotten
> around to adding atomic_add/inc yet, but it would easy, and I'm happy to
> do it, even yet this week.

You can just use __sync_fetch_and_add(&next_cpu, 1) too, so we don't end
up with too much arch-specific code.

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-22 13:06   ` Paolo Bonzini
@ 2015-07-22 13:44     ` Andrew Jones
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Jones @ 2015-07-22 13:44 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: mttcg, Mark Burton, Alexander Spyridakis, Claudio Fontana,
	QEMU Developers, Alvise Rigo, Jani Kokkonen, Alex Bennée,
	KONRAD Frédéric

On Wed, Jul 22, 2015 at 03:06:03PM +0200, Paolo Bonzini wrote:
> 
> 
> On 22/07/2015 14:38, Andrew Jones wrote:
> > I took a quick look at this and see issues with the test code. First,
> > you're spinning on a stack variable with this,
> > 
> >     /* Wait for our turn */
> >     while(next_cpu != cpu);
> > 
> > next_cpu needs to be global, and incremented atomically. I haven't gotten
> > around to adding atomic_add/inc yet, but it would easy, and I'm happy to
> > do it, even yet this week.
> 
> You can just use __sync_fetch_and_add(&next_cpu, 1) too, so we don't end
> up with too much arch-specific code.

Ah yes, I should take more advantage of gcc-builtins in general.

drew

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-22 12:38 ` Andrew Jones
  2015-07-22 13:06   ` Paolo Bonzini
@ 2015-07-22 15:01   ` Andrew Jones
  2015-07-22 23:12   ` Alexander Spyridakis
  2 siblings, 0 replies; 8+ messages in thread
From: Andrew Jones @ 2015-07-22 15:01 UTC (permalink / raw)
  To: Alexander Spyridakis
  Cc: mttcg, Claudio Fontana, Mark Burton, Alvise Rigo, QEMU Developers,
	Jani Kokkonen, Alex Bennée, KONRAD Frédéric

On Wed, Jul 22, 2015 at 02:38:11PM +0200, Andrew Jones wrote:
> On Tue, Jul 21, 2015 at 12:58:56PM +0200, Alexander Spyridakis wrote:
> > Hello all,
> > 
> > You can find a new self-modifying test case in the following branch:
> > > git clone https://git.virtualopensystems.com/dev/tcg_baremetal_tests.git -b smc_test
> > 
> > For each core, the test will run a small assembly snippet which
> > increments a variable. Immediately after, the snippet is modified in
> > memory to increment by 1 or 2 every other loop cycle, then passes
> > execution to the next core. At the end of the test we calculate the
> > expected result and compare it to the actual incremented variable. If
> > all code modifications happened correctly we pass the test.
> > 
> > The test case has been tested with upstream QEMU, MTTCG and KVM with
> > success. Next version of the test will include more corner cases, such
> > as changing TBs immediately after code modification, to make sure that
> > we cover every scenario.
> > 
> > To run it:
> > > make virt (or virt64/vexpress for other targets)
> > > ~/mttcg/arm-softmmu/qemu-system-arm -nographic -M virt -cpu cortex-a15 -kernel build-virt/image-virt.axf -smp 8
> > 
> > Also, by popular demand I started a port of the test for kvm-unit-tests:
> > > git clone https://git.virtualopensystems.com/dev/kvm-unit-tests.git
> 
> I took a quick look at this and see issues with the test code. First,
> you're spinning on a stack variable with this,
> 
>     /* Wait for our turn */
>     while(next_cpu != cpu);
> 
> next_cpu needs to be global, and incremented atomically. I haven't gotten
> around to adding atomic_add/inc yet, but it would easy, and I'm happy to
> do it, even yet this week.
> 
> And, as for the MMU, I see from the comment in your test code that you're
> hitting an exception when trying to modify code. This is because the code
> is mapped readonly in order to use it from usermode. I suggest you modify
> the page tables (see below for how) to map the code writeable. Do this
> before kicking your secondary cpus, so they'll come up ready.
> 
> There are other issues you'll need to fix as well though in the test code;
> count should be initialized, result should be volatile, others? I suggest
> you make sure it works for one vcpu first.
> 
> For modifying page tables, I think something like this should work for
> you (untested)
> 
> #include <asm/setup.h>
> int main(void)
> {
>     mmu_set_range_ptes(mmu_idmap, PHYS_OFFSET, PHYS_OFFSET,
>                        PHYS_END, __pgprot(PTE_WBWA));
>     flush_tlb_all();
> 
>     ...
> 
> I look forward to seeing your fixed up kvm-unit-test test posted. Please
> CC me on it.

Just thought of another issue with the unit test. There's no isb()
following the code modification.

> 
> drew
> 
> > 
> > For the kvm-unit-tests version, I have some troubles with caches and
> > the MMU (which is disabled for this test). While TCG and MTTCG work,
> > KVM fails the test with strange results. I will keep looking to find
> > the exact problem.
> > 
> > Best regards.
> > 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-22 12:38 ` Andrew Jones
  2015-07-22 13:06   ` Paolo Bonzini
  2015-07-22 15:01   ` Andrew Jones
@ 2015-07-22 23:12   ` Alexander Spyridakis
  2015-07-23 10:04     ` Andrew Jones
  2 siblings, 1 reply; 8+ messages in thread
From: Alexander Spyridakis @ 2015-07-22 23:12 UTC (permalink / raw)
  To: Andrew Jones
  Cc: mttcg, Mark Burton, Alexander Spyridakis, Claudio Fontana,
	QEMU Developers, Alvise Rigo, Jani Kokkonen, Alex Bennée,
	KONRAD Frédéric

Hello Andrew,

First, thanks for the comments.

On 22 July 2015 at 14:38, Andrew Jones <drjones@redhat.com> wrote:
> I took a quick look at this and see issues with the test code. First,
> you're spinning on a stack variable with this,
>
>     /* Wait for our turn */
>     while(next_cpu != cpu);
>
> next_cpu needs to be global, and incremented atomically. I haven't gotten
> around to adding atomic_add/inc yet, but it would easy, and I'm happy to
> do it, even yet this week.

I tend to agree with what you are saying, but isn't a static local
declaration forcing all CPUs to see the same variable (and not their
instance)? Otherwise the test would hang indefinitely with multiple
CPUs in this check.

About atomicity I thought of using a lock, but then I realized that
there is no way to have a race condition. Let me explain in more
detail:
Variable 'next_cpu', as being static will be initialized to 0, making
all CPUs, except CPU0, to spin. Since only CPU0 will update next_cpu
(from 0 to 1), while all others are spinning, incrementing it
atomically shouldn't be mandatory. Of course I might have missed
something very obvious, so please correct me if I am wrong.

> And, as for the MMU, I see from the comment in your test code that you're
> hitting an exception when trying to modify code. This is because the code
> is mapped readonly in order to use it from usermode. I suggest you modify
> the page tables (see below for how) to map the code writeable. Do this
> before kicking your secondary cpus, so they'll come up ready.

Indeed it is as you say. When I was experimenting with the MMU, by
removing PTE_RDONLY in lib/arm/mmu.c:113 I was able to run the test
without the need to disable it. At the time, I was uncertain how to do
it properly without this hack (which was outside the test case), so
instead I opted to disable the MMU. Your suggested way, to set it
inside the test case, makes sense, so I will probably do it like that.

> There are other issues you'll need to fix as well though in the test code;
> count should be initialized, result should be volatile, others? I suggest
> you make sure it works for one vcpu first.

Again, 'count' is declared as static, which initializes it to 0. I
played around with 'result' being volatile or not, but I didn't see
any difference. I guess it is better to play it on the safe side,
since this one is sensitive for the test.

> Just thought of another issue with the unit test. There's no isb()
> following the code modification.

Good catch, I played around with dsb and isb, as well as
local_flush_tlb_all() at the time, but I wasn't sure about if I am
paranoid or really needed. Since you also mention this I will test
more the use of an isb after modifying the opcode.

> I look forward to seeing your fixed up kvm-unit-test test posted. Please
> CC me on it.

Thanks again for your input, I will try to update by early next week.

Best regards.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-22 23:12   ` Alexander Spyridakis
@ 2015-07-23 10:04     ` Andrew Jones
  2015-07-23 14:42       ` Alexander Spyridakis
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Jones @ 2015-07-23 10:04 UTC (permalink / raw)
  To: Alexander Spyridakis
  Cc: mttcg, Mark Burton, Claudio Fontana, QEMU Developers, Alvise Rigo,
	Jani Kokkonen, Alex Bennée, KONRAD Frédéric

On Thu, Jul 23, 2015 at 01:12:30AM +0200, Alexander Spyridakis wrote:
> Hello Andrew,
> 
> First, thanks for the comments.
> 
> On 22 July 2015 at 14:38, Andrew Jones <drjones@redhat.com> wrote:
> > I took a quick look at this and see issues with the test code. First,
> > you're spinning on a stack variable with this,
> >
> >     /* Wait for our turn */
> >     while(next_cpu != cpu);
> >
> > next_cpu needs to be global, and incremented atomically. I haven't gotten
> > around to adding atomic_add/inc yet, but it would easy, and I'm happy to
> > do it, even yet this week.
> 
> I tend to agree with what you are saying, but isn't a static local
> declaration forcing all CPUs to see the same variable (and not their
> instance)? Otherwise the test would hang indefinitely with multiple
> CPUs in this check.
> 
> About atomicity I thought of using a lock, but then I realized that
> there is no way to have a race condition. Let me explain in more
> detail:
> Variable 'next_cpu', as being static will be initialized to 0, making
> all CPUs, except CPU0, to spin. Since only CPU0 will update next_cpu
> (from 0 to 1), while all others are spinning, incrementing it
> atomically shouldn't be mandatory. Of course I might have missed
> something very obvious, so please correct me if I am wrong.

I didn't read the unit test closely enough, looked over the static.
You're right about it. I would actually make the test more smp-ish
though. Rather than force cpus to run one at a time, I would use atomic
ops, allowing the cpus to run in parallel.

> 
> > And, as for the MMU, I see from the comment in your test code that you're
> > hitting an exception when trying to modify code. This is because the code
> > is mapped readonly in order to use it from usermode. I suggest you modify
> > the page tables (see below for how) to map the code writeable. Do this
> > before kicking your secondary cpus, so they'll come up ready.
> 
> Indeed it is as you say. When I was experimenting with the MMU, by
> removing PTE_RDONLY in lib/arm/mmu.c:113 I was able to run the test
> without the need to disable it. At the time, I was uncertain how to do
> it properly without this hack (which was outside the test case), so
> instead I opted to disable the MMU. Your suggested way, to set it
> inside the test case, makes sense, so I will probably do it like that.
> 
> > There are other issues you'll need to fix as well though in the test code;
> > count should be initialized, result should be volatile, others? I suggest
> > you make sure it works for one vcpu first.
> 
> Again, 'count' is declared as static, which initializes it to 0. I
> played around with 'result' being volatile or not, but I didn't see
> any difference. I guess it is better to play it on the safe side,
> since this one is sensitive for the test.

Actually, now that I'm looking closer. You can scratch this suggestion.
As you only use result in an asm where you have an explicit store, it
doesn't need to be declared volatile.

> 
> > Just thought of another issue with the unit test. There's no isb()
> > following the code modification.
> 
> Good catch, I played around with dsb and isb, as well as
> local_flush_tlb_all() at the time, but I wasn't sure about if I am
> paranoid or really needed. Since you also mention this I will test
> more the use of an isb after modifying the opcode.

Right, the lack of isb() is the reason you have a printf in there as a
hack to make things work.

> 
> > I look forward to seeing your fixed up kvm-unit-test test posted. Please
> > CC me on it.
> 
> Thanks again for your input, I will try to update by early next week.
> 
> Best regards.
> 

One more observation; to make this work for aarch64 you'll need to modify
your instruction modification. The add immediate can't be changed with
+-1 like it can for arm. Please make sure the code works for both, even
if you need to do

#ifdef __arm__
asm volatile(arm asm)
#else
asm volatile(aarch64 asm)
#endif

Also, if you need a flip-flopping flag, why not use xor instead of
a counter mod 2?

Thanks,
drew

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] Self-modifying test case for mttcg
  2015-07-23 10:04     ` Andrew Jones
@ 2015-07-23 14:42       ` Alexander Spyridakis
  0 siblings, 0 replies; 8+ messages in thread
From: Alexander Spyridakis @ 2015-07-23 14:42 UTC (permalink / raw)
  To: Andrew Jones
  Cc: mttcg, Claudio Fontana, Alexander Spyridakis, Mark Burton,
	Alvise Rigo, QEMU Developers, Jani Kokkonen, Alex Bennée,
	KONRAD Frédéric

On 23 July 2015 at 12:04, Andrew Jones <drjones@redhat.com> wrote:
> One more observation; to make this work for aarch64 you'll need to modify
> your instruction modification. The add immediate can't be changed with
> +-1 like it can for arm. Please make sure the code works for both, even
> if you need to do
>
> #ifdef __arm__
> asm volatile(arm asm)
> #else
> asm volatile(aarch64 asm)
> #endif

Yes I already did something like this on my original repo (ptr points
to seg0), will update it:
#ifdef ARCH_ARM
            (int *)seg0++;
        } else {
            (int *)seg0--;
        }
#elif ARCH_AARCH64
            *ptr |= 1 << 11;
            *ptr &= ~(1 << 10);
        } else {
            *ptr |= 1 << 10;
            *ptr &= ~(1 << 11);
        }
#endif

> Also, if you need a flip-flopping flag, why not use xor instead of
> a counter mod 2?

You are right xor is more straightforward.

Best regards.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-07-23 14:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-07-21 10:58 [Qemu-devel] Self-modifying test case for mttcg Alexander Spyridakis
2015-07-22 12:38 ` Andrew Jones
2015-07-22 13:06   ` Paolo Bonzini
2015-07-22 13:44     ` Andrew Jones
2015-07-22 15:01   ` Andrew Jones
2015-07-22 23:12   ` Alexander Spyridakis
2015-07-23 10:04     ` Andrew Jones
2015-07-23 14:42       ` Alexander Spyridakis

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).