We are working on live migration based on Xen 4.0.1 (for historical reasons; in the meantime we are upgrading to the latest Xen version). Restore fails when live-migrating an Ubuntu 12.04 guest on Xen 4.0.1. More specifically, the error occurs while populating memory. The error messages are as follows:
[2014-09-12 22:40:40 7331 1189091648] DEBUG (XendCheckpoint:307) [xc_restore]: /usr/lib64/xen/bin/xc_restore 4 2763 3 4 1 1 1 0
[2014-09-12 22:40:40 7331 1189091648] DEBUG (XendCheckpoint:428) Thread-40188
[2014-09-12 22:40:40 7331 1172306240] INFO (XendCheckpoint:476) Thread-40188:xc_domain_restore start: p2m_size = fefff
[2014-09-12 22:40:40 7331 1172306240] INFO (XendCheckpoint:476) Thread-40188:Reloading memory pages: 0%
[2014-09-12 22:40:50 7331 1172306240] INFO (XendCheckpoint:476) Thread-40188:Failed allocation for dom 2763: 128 extents of order 0
[2014-09-12 22:40:50 7331 1172306240] INFO (XendCheckpoint:476) Thread-40188:ERROR Internal error: Failed to allocate memory for batch.!
[2014-09-12 22:40:50 7331 1172306240] INFO (XendCheckpoint:476) Thread-40188:
[2014-09-12 22:40:50 7331 1172306240] INFO (XendCheckpoint:476) Thread-40188:Restore exit with rc=1
[2014-09-12 22:40:50 7331 1189091648] DEBUG (XendCheckpoint:462) /usr/lib64/xen/bin/xc_restore 4 2763 3 4 1 1 1 0 failed status 256
[2014-09-12 22:40:50 7331 1189091648] DEBUG (XendDomainInfo:3845) XendDomainInfo.destroy: domid=2763
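To put the numbers in the log into perspective, here is a small sketch that converts them to sizes. It assumes that p2m_size is a hexadecimal count of guest page frames, that pages are 4 KiB, and that an extent of order n covers 2^n pages; the constant and function names are only illustrative.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages (x86)

def extent_bytes(nr_extents, order):
    """Bytes covered by nr_extents extents of the given order."""
    return nr_extents * (1 << order) * PAGE_SIZE

# "p2m_size = fefff" from the log, read as a hex page-frame count
p2m_pages = int("fefff", 16)
p2m_bytes = p2m_pages * PAGE_SIZE

# "128 extents of order 0" from the log
batch_bytes = extent_bytes(128, 0)

print(p2m_pages, "page frames,", p2m_bytes // (1024 * 1024), "MiB (rounded down)")
print("failed batch:", batch_bytes // 1024, "KiB")
```

Under these assumptions the guest spans a little under 4 GiB of page frames, and the batch that failed is only 512 KiB.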
In this case, populate_physmap terminated with nr_done = 127, so xc_memory_op returned 127 while nr_extents was 128.
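The short count can be illustrated with a toy model. This is not the libxc or hypervisor code: the function names and the free-page counter are hypothetical, and running out of free pages is used here only as a stand-in to produce a partial result, since the real reason the hypervisor stopped at 127 is exactly what we do not know yet.

```python
def toy_populate_physmap(free_pages, nr_extents, order=0):
    """Toy model: allocate extents one by one and return how many were done."""
    pages_per_extent = 1 << order
    nr_done = 0
    while nr_done < nr_extents and free_pages >= pages_per_extent:
        free_pages -= pages_per_extent
        nr_done += 1
    return nr_done  # may be smaller than nr_extents

def toy_restore_batch(free_pages, nr_extents):
    """Toy caller: any short count is treated as a failed batch."""
    nr_done = toy_populate_physmap(free_pages, nr_extents)
    if nr_done != nr_extents:
        print("Failed allocation: %d of %d extents done" % (nr_done, nr_extents))
        return -1
    return 0

# One page short of the request: 127 extents done out of 128, batch fails.
rc = toy_restore_batch(free_pages=127, nr_extents=128)
```

The point of the sketch is only that a single missing extent out of 128 is enough to make the whole batch, and therefore the whole restore, fail.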
This problem happens roughly once every 1770 live migrations. I am still debugging it and am sending this email to ask for suggestions.
Thanks,
Huaixin Chang