* 3.0.5 rc3 paravirt save failures ?
From: Daniel P. Berrange @ 2007-05-01 11:55 UTC
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 1144 bytes --]

I'm seeing a fairly frequent problem when trying to save paravirt domains.
xc_save is failing, and logging the following error:

(XendCheckpoint:349) ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
(XendCheckpoint:349) ERROR Internal error: entry 206848: p2m_frame_list[404] is 0x0
(XendCheckpoint:349) ERROR Internal error: Failed to map/save the p2m frame list

Save/restore of fullyvirt on the same box is working pretty well. Anyone
have ideas on what the error message might be trying to tell me.... ? 

I'm attaching the xend.log and the guest config from xm list --long.

The host is x86_64, running the 3.0.5 rc3 HV with 3.0.5 rc4 userspace.
I don't see anything between the rc3 and rc4 HV that explicitly addresses PV
save failures, but I'm doing a new build to check anyway.

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

[-- Attachment #2: test.log --]
[-- Type: text/plain, Size: 7026 bytes --]

[2007-05-01 08:06:09 4135] DEBUG (XendCheckpoint:88) [xc_save]: /usr/lib64/xen/bin/xc_save 19 11 0 0 0
[2007-05-01 08:06:09 4135] DEBUG (XendCheckpoint:320) suspend
[2007-05-01 08:06:09 4135] DEBUG (XendCheckpoint:91) In saveInputHandler suspend
[2007-05-01 08:06:09 4135] DEBUG (XendCheckpoint:93) Suspending 11 ...
[2007-05-01 08:06:09 4135] DEBUG (XendDomainInfo:443) XendDomainInfo.shutdown(suspend)
[2007-05-01 08:06:09 4135] DEBUG (XendDomainInfo:954) XendDomainInfo.handleShutdownWatch
[2007-05-01 08:06:09 4135] DEBUG (XendDomainInfo:954) XendDomainInfo.handleShutdownWatch
[2007-05-01 08:06:09 4135] INFO (XendDomainInfo:1136) Domain has shutdown: name=migrating-test id=11 reason=suspend.
[2007-05-01 08:06:09 4135] INFO (XendCheckpoint:98) Domain 11 suspended.
[2007-05-01 08:06:09 4135] DEBUG (XendCheckpoint:107) Written done
[2007-05-01 08:06:09 4135] INFO (XendCheckpoint:349) ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
[2007-05-01 08:06:09 4135] INFO (XendCheckpoint:349) ERROR Internal error: entry 206848: p2m_frame_list[404] is 0x0
[2007-05-01 08:06:09 4135] INFO (XendCheckpoint:349) ERROR Internal error: Failed to map/save the p2m frame list
[2007-05-01 08:06:09 4135] INFO (XendCheckpoint:349) Save exit rc=1
[2007-05-01 08:06:09 4135] ERROR (XendCheckpoint:140) Save failed on domain test (11).
Traceback (most recent call last):
  File "/usr/lib64/python2.5/site-packages/xen/xend/XendCheckpoint.py", line 109, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.5/site-packages/xen/xend/XendCheckpoint.py", line 337, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_save 19 11 0 0 0 failed
[2007-05-01 08:06:09 4135] DEBUG (XendDomainInfo:1741) XendDomainInfo.resumeDomain(11)
[2007-05-01 08:06:09 4135] INFO (XendDomainInfo:1949) Dev 51712 still active, looping...
[2007-05-01 08:06:09 4135] INFO (XendDomainInfo:1949) Dev 51712 still active, looping...
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1949) Dev 51712 still active, looping...
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1949) Dev 51712 still active, looping...
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1949) Dev 51712 still active, looping...
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1949) Dev 51712 still active, looping...
[2007-05-01 08:06:10 4135] DEBUG (XendDomainInfo:1752) XendDomainInfo.resumeDomain: devices released
[2007-05-01 08:06:10 4135] DEBUG (XendDomainInfo:954) XendDomainInfo.handleShutdownWatch
[2007-05-01 08:06:10 4135] DEBUG (XendDomainInfo:873) Storing domain details: {'console/ring-ref': '1061645', 'image/entry': '18446744071564165120', 'console/port': '2', 'store/ring-ref': '1061646', 'image/loader': 'generic', 'vm': '/vm/66857c70-9898-fefc-ed53-8eee3ba294bf', 'control/platform-feature-multiprocessor-suspend': '1', 'image/guest-os': 'linux', 'cpu/1/availability': 'online', 'image/features/writable-descriptor-tables': '1', 'image/virt-base': '18446744071562067968', 'memory/target': '419840', 'image/guest-version': '2.6', 'image/features/supervisor-mode-kernel': '1', 'console/limit': '1048576', 'image/paddr-offset': '18446744071562067968', 'image/hypercall-page': '18446744071564189696', 'cpu/0/availability': 'online', 'image/features/pae-pgdir-above-4gb': '1', 'image/features/writable-page-tables': '1', 'image/features/auto-translated-physmap': '1', 'name': 'migrating-test',
  'domid': '11', 'image/xen-version': 'xen-3.0', 'store/port': '1'}
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1362) createDevice: vkbd : {'devid': 0, 'uuid': '062e03d5-b82d-1b98-4588-6b1043654e19'}
[2007-05-01 08:06:10 4135] DEBUG (DevController:115) DevController: writing {'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/vkbd/11/0'} to /local/domain/11/device/vkbd/0.
[2007-05-01 08:06:10 4135] DEBUG (DevController:117) DevController: writing {'frontend-id': '11', 'domain': 'migrating-test', 'frontend': '/local/domain/11/device/vkbd/0', 'state': '1', 'online': '1'} to /local/domain/0/backend/vkbd/11/0.
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1362) createDevice: vfb : {'vncunused': '1', 'vnclisten': '0.0.0.0', 'uuid': 'e6460c6e-e4f5-86dc-dd4c-9f1ac81ce0fe', 'devid': None, 'other_config': {'vncunused': '1', 'vnclisten': '0.0.0.0', 'type': 'vnc', 'display': 'localhost:10.0', 'xauthority': '/root/.Xauthority'}, 'type': 'vnc', 'display': 'localhost:10.0', 'xauthority': '/root/.Xauthority'}
[2007-05-01 08:06:10 4135] DEBUG (DevController:115) DevController: writing {'state': '1', 'backend-id': '0', 'backend': '/local/domain/0/backend/vfb/11/0'} to /local/domain/11/device/vfb/0.
[2007-05-01 08:06:10 4135] DEBUG (DevController:117) DevController: writing {'vncunused': '1', 'domain': 'migrating-test', 'frontend': '/local/domain/11/device/vfb/0', 'uuid': 'e6460c6e-e4f5-86dc-dd4c-9f1ac81ce0fe', 'vnclisten': '0.0.0.0', 'state': '1', 'online': '1', 'frontend-id': '11', 'type': 'vnc', 'display': 'localhost:10.0', 'xauthority': '/root/.Xauthority'} to /local/domain/0/backend/vfb/11/0.
[2007-05-01 08:06:10 4135] DEBUG (vfbif:72) No VNC passwd configured for vfb access
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1362) createDevice: vbd : {'uuid': 'ea8593a9-a51b-63bf-fa2e-8d883930e0f1', 'bootable': 1, 'devid': 51712, 'driver': 'paravirtualised', 'dev': 'xvda', 'uname': 'file:/xen/test.img', 'mode': 'w'}
[2007-05-01 08:06:10 4135] DEBUG (DevController:115) DevController: writing {'backend-id': '0', 'virtual-device': '51712', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/vbd/11/51712'} to /local/domain/11/device/vbd/51712.
[2007-05-01 08:06:10 4135] DEBUG (DevController:117) DevController: writing {'domain': 'migrating-test', 'frontend': '/local/domain/11/device/vbd/51712', 'uuid': 'ea8593a9-a51b-63bf-fa2e-8d883930e0f1', 'dev': 'xvda', 'state': '1', 'params': '/xen/test.img', 'mode': 'w', 'online': '1', 'frontend-id': '11', 'type': 'file'} to /local/domain/0/backend/vbd/11/51712.
[2007-05-01 08:06:10 4135] INFO (XendDomainInfo:1362) createDevice: vif : {'bridge': 'virbr0', 'mac': '00:16:3e:6d:af:db', 'devid': 0, 'uuid': '2fddf993-0bdb-a3ce-3da0-03661210a1c0'}
[2007-05-01 08:06:10 4135] DEBUG (DevController:115) DevController: writing {'backend-id': '0', 'mac': '00:16:3e:6d:af:db', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/11/0'} to /local/domain/11/device/vif/0.
[2007-05-01 08:06:10 4135] DEBUG (DevController:117) DevController: writing {'bridge': 'virbr0', 'domain': 'migrating-test', 'handle': '0', 'uuid': '2fddf993-0bdb-a3ce-3da0-03661210a1c0', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/11/device/vif/0', 'mac': '00:16:3e:6d:af:db', 'online': '1', 'frontend-id': '11', 'type': 'netfront'} to /local/domain/0/backend/vif/11/0.
[2007-05-01 08:06:10 4135] DEBUG (XendDomainInfo:1764) XendDomainInfo.resumeDomain: devices created
[2007-05-01 08:06:10 4135] DEBUG (XendCheckpoint:143) XendCheckpoint.save: resumeDomain

[-- Attachment #3: test.sxp --]
[-- Type: text/plain, Size: 1983 bytes --]

(domain
    (domid 11)
    (on_crash restart)
    (uuid 66857c70-9898-fefc-ed53-8eee3ba294bf)
    (bootloader_args -q)
    (vcpus 2)
    (name test)
    (on_poweroff destroy)
    (on_reboot restart)
    (bootloader /usr/bin/pygrub)
    (maxmem 800)
    (memory 410)
    (shadow_memory 0)
    (cpu_weight 256)
    (cpu_cap 0)
    (features )
    (on_xend_start ignore)
    (on_xend_stop ignore)
    (start_time 1178020682.09)
    (cpu_time 13.860372956)
    (online_vcpus 2)
    (image
        (linux
            (kernel )
            (rtc_timeoffset 'None')
            (notes
                (FEATURES
                    'writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel'
                )
                (VIRT_BASE 18446744071562067968)
                (GUEST_VERSION 2.6)
                (PADDR_OFFSET 18446744071562067968)
                (GUEST_OS linux)
                (HYPERCALL_PAGE 18446744071564189696)
                (LOADER generic)
                (ENTRY 18446744071564165120)
                (XEN_VERSION xen-3.0)
            )
        )
    )
    (status 2)
    (state -b----)
    (store_mfn 1061646)
    (console_mfn 1061645)
    (device
        (vif
            (bridge virbr0)
            (uuid 2fddf993-0bdb-a3ce-3da0-03661210a1c0)
            (script vif-bridge)
            (mac 00:16:3e:6d:af:db)
            (type netfront)
            (backend 0)
        )
    )
    (device
        (vbd
            (uname file:/xen/test.img)
            (uuid ea8593a9-a51b-63bf-fa2e-8d883930e0f1)
            (mode w)
            (dev xvda:disk)
            (backend 0)
            (bootable 1)
        )
    )
    (device (vkbd (backend 0)))
    (device
        (vfb
            (vncunused 1)
            (uuid e6460c6e-e4f5-86dc-dd4c-9f1ac81ce0fe)
            (vnclisten 0.0.0.0)
            (type vnc)
            (display localhost:10.0)
            (xauthority /root/.Xauthority)
        )
    )
)


* Re: 3.0.5 rc3 paravirt save failures ?
From: Steven Hand @ 2007-05-01 12:47 UTC
  To: Daniel P. Berrange; +Cc: xen-devel, Steven.Hand

>I'm seeing a fairly frequent problem when trying to save paravirt domains.
>xc_save is failing, and logging the following error:
>
>(XendCheckpoint:349) ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
>(XendCheckpoint:349) ERROR Internal error: entry 206848: p2m_frame_list[404] is 0x0
>(XendCheckpoint:349) ERROR Internal error: Failed to map/save the p2m frame list
>
>Save/restore of fullyvirt on the same box is working pretty well. Anyone
>have ideas on what the error message might be trying to tell me.... ? 

Looks like you're running off the top of the p2m; your domain has 800MB
'maxmem' and hence should have a p2m covering 800 + 8 (slack) = 808MB.
However, entry 206848 is for the physical page just beyond that.
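
The arithmetic behind this can be checked with a short sketch. It assumes
4 KiB pages and 8-byte p2m entries (so 512 entries per p2m frame on x86_64);
neither figure is stated in this thread, and the helper names are made up
purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (not stated in the thread): 4 KiB pages, 8-byte p2m entries. */
#define PAGE_SIZE_BYTES   4096UL
#define P2M_ENTRY_BYTES   8UL
#define ENTRIES_PER_FRAME (PAGE_SIZE_BYTES / P2M_ENTRY_BYTES)   /* 512 */

/* Number of guest pages covered by `mib` MiB of pseudo-physical memory. */
static uint64_t pages_for_mib(uint64_t mib)
{
    return mib * 1024UL * 1024UL / PAGE_SIZE_BYTES;
}

/* Index into the p2m frame list of the frame that holds entry `pfn`. */
static uint64_t p2m_frame_index(uint64_t pfn)
{
    return pfn / ENTRIES_PER_FRAME;
}

/* Number of p2m frames needed to hold `p2m_size` entries (rounded up). */
static uint64_t p2m_frames_needed(uint64_t p2m_size)
{
    assert(p2m_size > 0);
    return (p2m_size + ENTRIES_PER_FRAME - 1) / ENTRIES_PER_FRAME;
}
```

Under those assumptions pages_for_mib(808) is 206848, so the valid entries
are 0..206847 and entry 206848 is the first one past the end.
p2m_frame_index(206848) is 404, which lines up with the p2m_frame_list[404]
in the logged error.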

Can you check the value you're getting for p2m_size in xc_domain_save.c, 
e.g. apply the following? 

diff -r d79436447a05 tools/libxc/xc_domain_save.c
--- a/tools/libxc/xc_domain_save.c      Fri Apr 27 16:17:54 2007 +0100
+++ b/tools/libxc/xc_domain_save.c      Tue May 01 13:46:26 2007 +0100
@@ -871,6 +871,7 @@ int xc_domain_save(int xc_handle, int io
 
     /* Get the size of the P2M table */
     p2m_size = xc_memory_op(xc_handle, XENMEM_maximum_gpfn, &dom) + 1;
+    DPRINTF("DBG - got size of p2m table as %ld\n", p2m_size);
 
     /* Domain is still running at this point */
     if ( live )



cheers,

S.


* Re: 3.0.5 rc3 paravirt save failures ?
From: Keir Fraser @ 2007-05-01 13:01 UTC
  To: Steven Hand, Daniel P. Berrange; +Cc: xen-devel

On 1/5/07 13:47, "Steven Hand" <Steven.Hand@cl.cam.ac.uk> wrote:

> Looks like you're running off the top of the p2m ; your domain has 800Mb
> 'maxmem' and hence should have a p2m covering 800 + 8 (slack) = 808Mb.
> However, entry 206848 is for the physical page just beyond that.
> 
> Can you check the value you're getting for p2m_size in xc_domain_save.c,
> e.g. apply the following?

And is your Xen precisely matched against libxc? I fixed XENMEM_maximum_gpfn
to really return the last gpfn known to the guest, rather than the last plus
one. Hence we now add one to that value to get p2m_size in xc_domain_save().
But if this fixed xc_domain_save() was run against older Xen, you would have
p2m_size too big by one.
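
A toy model of that mismatch, as a sketch only: the two "xen" functions
below are hypothetical stand-ins for the old and fixed hypervisor behaviour,
not the real hypercall interface, and the guest size used in the note after
the code assumes 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the older hypervisor: last gpfn plus one. */
static uint64_t old_xen_maximum_gpfn(uint64_t last_gpfn)
{
    return last_gpfn + 1;
}

/* Hypothetical stand-in for the fixed hypervisor: the last gpfn itself. */
static uint64_t new_xen_maximum_gpfn(uint64_t last_gpfn)
{
    return last_gpfn;
}

/* What the updated xc_domain_save() does with the result: add one. */
static uint64_t libxc_p2m_size(uint64_t maximum_gpfn_result)
{
    assert(maximum_gpfn_result < UINT64_MAX);
    return maximum_gpfn_result + 1;
}
```

For a guest whose last gpfn is 206847 (808MB of 4 KiB pages), the matched
pair libxc_p2m_size(new_xen_maximum_gpfn(206847)) gives 206848, while the
mismatched pair libxc_p2m_size(old_xen_maximum_gpfn(206847)) gives 206849,
one entry too many.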

 -- Keir


* Re: 3.0.5 rc3 paravirt save failures ?
From: Daniel P. Berrange @ 2007-05-01 13:04 UTC
  To: Keir Fraser; +Cc: xen-devel, Steven Hand

On Tue, May 01, 2007 at 02:01:45PM +0100, Keir Fraser wrote:
> On 1/5/07 13:47, "Steven Hand" <Steven.Hand@cl.cam.ac.uk> wrote:
> 
> > Looks like you're running off the top of the p2m ; your domain has 800Mb
> > 'maxmem' and hence should have a p2m covering 800 + 8 (slack) = 808Mb.
> > However, entry 206848 is for the physical page just beyond that.
> > 
> > Can you check the value you're getting for p2m_size in xc_domain_save.c,
> > e.g. apply the following?
> 
> And is your Xen precisely matched against libxc? I fixed XENMEM_maximum_gpfn
> to really return the last gpfn known to the guest, rather than the last plus
> one. Hence we now add one to that value to get p2m_size in xc_domain_save().
> But if this fixed xc_domain_save() was run against older Xen, you would have
> p2m_size too big by one.

Ahh, that could well be the problem. It looks like my HV / libxc are just
straddling your changeset :-(

Dan.

* Re: 3.0.5 rc3 paravirt save failures ?
From: Daniel P. Berrange @ 2007-05-01 15:01 UTC
  To: Steven Hand; +Cc: xen-devel

On Tue, May 01, 2007 at 01:47:39PM +0100, Steven Hand wrote:
> >I'm seeing a fairly frequent problem when trying to save paravirt domains.
> >xc_save is failing, and logging the following error:
> >
> >(XendCheckpoint:349) ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
> >(XendCheckpoint:349) ERROR Internal error: entry 206848: p2m_frame_list[404] is 0x0
> >(XendCheckpoint:349) ERROR Internal error: Failed to map/save the p2m frame list
> >
> >Save/restore of fullyvirt on the same box is working pretty well. Anyone
> >have ideas on what the error message might be trying to tell me.... ? 
> 
> Looks like you're running off the top of the p2m ; your domain has 800Mb 
> 'maxmem' and hence should have a p2m covering 800 + 8 (slack) = 808Mb. 
> However, entry 206848 is for the physical page just beyond that.
> 
> Can you check the value you're getting for p2m_size in xc_domain_save.c, 
> e.g. apply the following? 

I did that and it was showing 206869. I updated the HV to include Keir's
changes and it now shows 206868, so it was an off-by-one due to the HV/libxc
mismatch.
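
Why a single extra entry is enough to fail the save can be sketched as
follows. This is a simplified model, not the libxc code: it treats a zero
slot as "not a valid frame", assumes 512 entries per p2m frame, and the
function name is invented. The sizes mentioned in the note after the code
are the ones implied by the logged entry number (206848), not figures taken
from the debug output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES_PER_FRAME 512UL   /* assumed: 4 KiB frame / 8-byte entry */

/*
 * Walk the frame list one frame at a time, as a saver might before mapping
 * the p2m. Returns the index of the first frame-list slot that is missing
 * or zero, or -1 if every slot needed for `p2m_size` entries is populated.
 * `list_len` is the number of slots in `frame_list`.
 */
static long first_bad_p2m_frame(const uint64_t *frame_list, size_t list_len,
                                uint64_t p2m_size)
{
    assert(frame_list != NULL);
    for (uint64_t pfn = 0; pfn < p2m_size; pfn += ENTRIES_PER_FRAME) {
        uint64_t idx = pfn / ENTRIES_PER_FRAME;
        if (idx >= list_len || frame_list[idx] == 0)
            return (long)idx;
    }
    return -1;
}
```

With 404 populated slots followed by a zero slot, a p2m_size of 206848 stops
at slot 403 and passes, whereas 206849 makes the walk touch slot 404 and
fail there.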

I really was not expecting to see HV hypercall API changes in
the 3.0.5 testing rc releases :-(

Are any more hypercall API/ABI changes planned before 3.0.5 is
released that would require a matching HV/userspace update?

Dan.

* Re: 3.0.5 rc3 paravirt save failures ?
From: Keir Fraser @ 2007-05-01 15:09 UTC
  To: Daniel P. Berrange, Steven Hand; +Cc: xen-devel

On 1/5/07 16:01, "Daniel P. Berrange" <berrange@redhat.com> wrote:

> I really was not expecting to be seeing HV hypercall API changes in
> the 3.0.5 testing rc  releases :-(
> 
> Are there any more planned hypercall API/ABI changes anticipated before
> 3.0.5 is released, which would require a match HV/userspace update ?

Only if further bugs are found.

 -- Keir
