qemu-devel.nongnu.org archive mirror
* [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
@ 2015-04-16 16:43 Stefan Berger
  2015-04-16 19:42 ` Mark Cave-Ayland
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Berger @ 2015-04-16 16:43 UTC (permalink / raw)
  To: agraf, Mark Cave-Ayland, qemu-devel, qemu-ppc

The culprit patch seems to be the following commit. If I remove these 
changes from the tip of the tree it works again (on SLOF level):

commit 2360b6e84f78d41fa0f76555a947148b73645259
Author: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Date:   Mon Feb 9 22:40:48 2015 +0000

     target-ppc: force update of msr bits in cpu_post_load

     Since env->msr has already been restored by the time cpu_post_load is
     called, make sure that ppc_store_msr() is explicitly called with all msr
     bits except MSR_TGPR marked as invalid.

     This solves the issue where MSR flags aren't set correctly when restoring
     a VM snapshot, in particular the internal env->excp_prefix value when
     MSR_EP has been altered by a guest.

     Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
     Signed-off-by: Alexander Graf <agraf@suse.de>

diff --git a/target-ppc/machine.c b/target-ppc/machine.c
index c801b82..3921012 100644
--- a/target-ppc/machine.c
+++ b/target-ppc/machine.c
@@ -159,6 +159,7 @@ static int cpu_post_load(void *opaque, int version_id)
      PowerPCCPU *cpu = opaque;
      CPUPPCState *env = &cpu->env;
      int i;
+    target_ulong msr;

      /*
       * We always ignore the source PVR. The user or management
@@ -190,7 +191,12 @@ static int cpu_post_load(void *opaque, int version_id)
          /* Restore htab_base and htab_mask variables */
          ppc_store_sdr1(env, env->spr[SPR_SDR1]);
      }
-    hreg_compute_hflags(env);
+
+    /* Mark msr bits except MSR_TGPR invalid before restoring */
+    msr = env->msr;
+    env->msr ^= ~(1ULL << MSR_TGPR);
+    ppc_store_msr(env, msr);
+
      hreg_compute_mem_idx(env);

      return 0;


    Stefan

PS: Sorry for the late notice (-rc3), but I only started doing things 
with ppc64 a few days ago.


* Re: [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
  2015-04-16 16:43 [Qemu-devel] ppc64 not resuming with v2.3.0-rc3 Stefan Berger
@ 2015-04-16 19:42 ` Mark Cave-Ayland
  2015-04-16 19:49   ` Stefan Berger
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Cave-Ayland @ 2015-04-16 19:42 UTC (permalink / raw)
  To: Stefan Berger, agraf, qemu-devel, qemu-ppc

On 16/04/15 17:43, Stefan Berger wrote:

> The culprit patch seems to be the following commit. If I remove these
> changes from the tip of the tree it works again (on SLOF level):
> 
> [...]

Hmmmm the fix is correct in that internal MSR variables need to be
updated post-restore (as noted in the message above it was the exception
prefix variables that weren't updated by having MSR_EP set).

Maybe on ppc64 there is another bit similar to MSR_TGPR that needs to be
excluded? Alex, any thoughts?


ATB,

Mark.
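
To make the reasoning in this message concrete, here is a minimal, self-contained C sketch of why flipping the saved bits before calling the store helper forces derived state to be recomputed. It is a toy model, not QEMU code: ToyEnv, toy_store_msr() and toy_post_load() are hypothetical names, and the bit numbers and the 0xFFF00000 prefix value are assumptions chosen for illustration. The only thing taken from the patch is the XOR-then-store pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed for this toy model only. */
#define TOY_MSR_EP   6   /* selects the exception prefix */
#define TOY_MSR_TGPR 17  /* toggling it has a side effect in the helper */

typedef struct {
    uint64_t msr;          /* register value as saved/restored */
    uint64_t excp_prefix;  /* derived state cached outside the register */
    int tgpr_swaps;        /* counts side effects of toggling TGPR */
} ToyEnv;

/* Store helper that only recomputes derived state for bits that changed. */
static void toy_store_msr(ToyEnv *env, uint64_t value)
{
    uint64_t changed = env->msr ^ value;

    if (changed & (1ULL << TOY_MSR_EP)) {
        env->excp_prefix = ((value >> TOY_MSR_EP) & 1) ? 0xFFF00000ULL : 0;
    }
    if (changed & (1ULL << TOY_MSR_TGPR)) {
        env->tgpr_swaps++;  /* a real helper might swap register banks here */
    }
    env->msr = value;
}

/*
 * Post-load hook: env->msr already holds the restored value, so storing it
 * again would be a no-op. Flip every bit except TGPR first so the helper
 * sees all of them as changed, then store the real value.
 */
static void toy_post_load(ToyEnv *env)
{
    uint64_t msr = env->msr;

    env->msr ^= ~(1ULL << TOY_MSR_TGPR);
    toy_store_msr(env, msr);
    assert(env->msr == msr);  /* the register value itself is unchanged */
}
```

After toy_post_load(), a restored MSR with the EP bit set yields a non-zero excp_prefix, even though a plain toy_store_msr() of the same value would have been a no-op. Any bit that the store helper treats specially has to be left out of the XOR mask, which is the kind of candidate asked about above.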


* Re: [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
  2015-04-16 19:42 ` Mark Cave-Ayland
@ 2015-04-16 19:49   ` Stefan Berger
  2015-04-16 20:53     ` Mark Cave-Ayland
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Berger @ 2015-04-16 19:49 UTC (permalink / raw)
  To: Mark Cave-Ayland, agraf, qemu-devel, qemu-ppc

On 04/16/2015 03:42 PM, Mark Cave-Ayland wrote:
> On 16/04/15 17:43, Stefan Berger wrote:
>
>> The culprit patch seems to be the following commit. If I remove these
>> changes from the tip of the tree it works again (on SLOF level):
>>
>> [...]
> Hmmmm the fix is correct in that internal MSR variables need to be
> updated post-restore (as noted in the message above it was the exception
> prefix variables that weren't updated by having MSR_EP set).
>
> Maybe on ppc64 there is another bit similar to MSR_TGPR that needs to be
> excluded? Alex, any thoughts?

I want to add that I am running QEMU for ppc64 in emulation mode on an
x86_64 host. The suspend/resume problem in SLOF did not exist in QEMU v2.2,
so this is a regression. I expect it would also be visible with QEMU on
KVM, though a simple test on such a machine may show otherwise...

Removing the patch solves the problem while in SLOF. After booting into
Linux, however, suspend/resume does not work with qemu-system-ppc64 on an
x86_64 host: the timestamps shown by Linux jump backwards and Linux
ultimately hangs.


      Stefan




* Re: [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
  2015-04-16 19:49   ` Stefan Berger
@ 2015-04-16 20:53     ` Mark Cave-Ayland
  2015-04-16 21:24       ` Stefan Berger
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Cave-Ayland @ 2015-04-16 20:53 UTC (permalink / raw)
  To: Stefan Berger, agraf, qemu-devel, qemu-ppc

On 16/04/15 20:49, Stefan Berger wrote:
> I want to add that I am running QEMU for ppc64 in emulation mode on an
> x86_64 host. The suspend/resume problem in SLOF did not exist in QEMU v2.2,
> so this is a regression. I expect it would also be visible with QEMU on
> KVM, though a simple test on such a machine may show otherwise...
>
> Removing the patch solves the problem while in SLOF. After booting into
> Linux, however, suspend/resume does not work with qemu-system-ppc64 on an
> x86_64 host: the timestamps shown by Linux jump backwards and Linux
> ultimately hangs.

Just to clarify the terminology here, when you say suspend/resume are
you talking about a hardware suspend/resume or issuing a savevm/loadvm
sequence in the QEMU monitor? Are you able to provide further detail to
reproduce your test case?


ATB,

Mark.


* Re: [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
  2015-04-16 20:53     ` Mark Cave-Ayland
@ 2015-04-16 21:24       ` Stefan Berger
  2015-04-16 21:53         ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Stefan Berger @ 2015-04-16 21:24 UTC (permalink / raw)
  To: Mark Cave-Ayland, agraf, qemu-devel, qemu-ppc

On 04/16/2015 04:53 PM, Mark Cave-Ayland wrote:
> Just to clarify the terminology here, when you say suspend/resume are
> you talking about a hardware suspend/resume or issuing a savevm/loadvm
> sequence in the QEMU monitor? Are you able to provide further detail to
> reproduce your test case?

I am using 'virsh save' to suspend the VM and 'virsh restore' to resume it,
so I am doing this at the libvirt level.

This is the domain XML I use to test suspend/resume while the guest is still
in SLOF. No disk is needed.

<domain type='qemu'>
   <name>ppc-test</name>
   <uuid>3e17dcdb-4a22-49ed-b8f9-4df523d04bb3</uuid>
   <memory unit='KiB'>1310720</memory>
   <currentMemory unit='KiB'>1310720</currentMemory>
   <vcpu placement='static'>1</vcpu>
   <resource>
     <partition>/machine</partition>
   </resource>
   <os>
     <type arch='ppc64' machine='pseries-2.2'>hvm</type>
     <boot dev='hd'/>
     <boot dev='cdrom'/>
   </os>
   <clock offset='utc'/>
   <on_poweroff>destroy</on_poweroff>
   <on_reboot>restart</on_reboot>
   <on_crash>restart</on_crash>
   <devices>
     <emulator>/usr/bin/qemu-system-ppc64</emulator>
     <controller type='usb' index='0'>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
     </controller>
     <controller type='pci' index='0' model='pci-root'/>
     <input type='keyboard' bus='usb'/>
     <input type='mouse' bus='usb'/>
     <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
       <listen type='address' address='0.0.0.0'/>
     </graphics>
     <video>
       <model type='vga' vram='16384' heads='1'/>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
     </video>
     <memballoon model='virtio'>
       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
     </memballoon>
   </devices>
   <seclabel type='dynamic' model='selinux' relabel='yes'/>
</domain>


     Stefan




* Re: [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
  2015-04-16 21:24       ` Stefan Berger
@ 2015-04-16 21:53         ` Paolo Bonzini
  2015-04-16 22:23           ` Mark Cave-Ayland
  0 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2015-04-16 21:53 UTC (permalink / raw)
  To: Stefan Berger, Mark Cave-Ayland, agraf, qemu-devel, qemu-ppc



On 16/04/2015 23:24, Stefan Berger wrote:
> I am using 'virsh save' to suspend the VM and 'virsh restore' to resume it,
> so I am doing this at the libvirt level.

Ok, that's the equivalent of "migrate exec:cat>foo.save" in the monitor and
"-incoming 'exec:cat<foo.save'" on the command line.

Paolo



* Re: [Qemu-devel] ppc64 not resuming with v2.3.0-rc3
  2015-04-16 21:53         ` Paolo Bonzini
@ 2015-04-16 22:23           ` Mark Cave-Ayland
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Cave-Ayland @ 2015-04-16 22:23 UTC (permalink / raw)
  To: Paolo Bonzini, Stefan Berger, agraf, qemu-devel, qemu-ppc

On 16/04/15 22:53, Paolo Bonzini wrote:

> Ok, that's the equivalent of "migrate exec:cat>foo.save" in the monitor and
> "-incoming 'exec:cat<foo.save'" on the command line.
> 
> Paolo

Thanks - that's exactly what I needed to reproduce here.

Not working (git master):

$ ./qemu-system-ppc64 -m 128 -prom-env 'auto-boot?=false' \
    -incoming 'exec:cat</tmp/foo.save'
hflags: 0x9000000000000000
msr: 0x9000000000000000

Working (patch reverted):

$ ./qemu-system-ppc64 -m 128 -prom-env 'auto-boot?=false' \
    -incoming 'exec:cat</tmp/foo.save'
hflags: 0x8000000000000000
msr: 0x8000000000000000

The only difference between the two is bit 60 of the MSR, which is set on
git master and clear with the patch reverted. That looks like MSR_SHV
(hypervisor state).


ATB,

Mark.
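
The two msr values printed above can be compared mechanically. The following C sketch XORs them to find the differing bit and shows how an extra bit could be left out of an invalidation mask. Both helper names are hypothetical, and whether excluding that bit is the right change for cpu_post_load() is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index (0 = least significant) of the single bit that differs. */
static int single_differing_bit(uint64_t a, uint64_t b)
{
    uint64_t diff = a ^ b;
    int bit = 0;

    /* Precondition: exactly one bit differs between the two values. */
    assert(diff != 0 && (diff & (diff - 1)) == 0);
    while ((diff >> bit) != 1) {
        bit++;
    }
    return bit;
}

/* Flip every bit of msr except those set in keep_mask. */
static uint64_t invalidate_except(uint64_t msr, uint64_t keep_mask)
{
    return msr ^ ~keep_mask;
}
```

single_differing_bit(0x9000000000000000ULL, 0x8000000000000000ULL) returns 60, counting from the least significant bit. Passing (1ULL << 60) as part of keep_mask to invalidate_except() leaves that bit at its restored value while every other bit is still flipped.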

