* Improving domU restore time
@ 2010-05-25 10:35 Rafal Wojtczuk
  2010-05-25 10:58 ` Joanna Rutkowska
  2010-05-25 11:50 ` Keir Fraser
  0 siblings, 2 replies; 18+ messages in thread
From: Rafal Wojtczuk @ 2010-05-25 10:35 UTC (permalink / raw)
  To: xen-devel

Hello,
I would be grateful for comments on possible methods to improve domain
restore performance. Focusing on the PV case, if it matters.

1) xen-4.0.0
I see a problem similar to the one reported in the thread at
http://lists.xensource.com/archives/html/xen-devel/2010-05/msg00677.html

Dom0 is 2.6.32.9-7.pvops0 x86_64, xen-4.0.0 x86_64.
[user@qubes ~]$ xm create /dev/null kernel=/boot/vmlinuz-2.6.32.9-7.pvops0.qubes.x86_64 root=/dev/mapper/dmroot extra="rootdelay=1000" memory=400
...wait a second...
[user@qubes ~]$ xm save null nullsave
[user@qubes ~]$ time cat nullsave >/dev/null
...
[user@qubes ~]$ time cat nullsave >/dev/null
...
[user@qubes ~]$ time cat nullsave >/dev/null
real 0m0.173s
user 0m0.010s
sys 0m0.164s
/* sits nicely in the cache, let's restore... */
[user@qubes ~]$ time xm restore nullsave
real 0m9.189s
user 0m0.151s
sys 0m0.039s

According to systemtap, xc_restore uses 3.812s of CPU time; besides that being
a lot, what uses the remaining 6s?
Just as reported previously, there are some errors in xend.log:

[2010-05-25 10:49:02 2392] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x19000000, _static_min=0x0,
[2010-05-25 10:49:02 2392] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib64/xen/bin/xc_restore 39 3 1 2 0 0 0 0
[2010-05-25 10:49:02 2392] INFO (XendCheckpoint:423) xc_domain_restore start: p2m_size = 19000
[2010-05-25 10:49:02 2392] INFO (XendCheckpoint:423) Reloading memory pages: 0%
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) ERROR Internal error: error when buffering batch, finishing
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423)
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:4100%
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Memory reloaded (0 pages)
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) read VCPU 0
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Completed checkpoint load
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Domain ready to be built.
[2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Restore exit with rc=0

Note, xc_restore on xen-3.4.3 works much faster (and with no warnings in the
log), with the same dom0 pvops kernel.

Ok, so there is some issue here. Some more generic thoughts below.

2) xen-3.4.3
Firstly, /etc/xen/scripts/block in xen-3.4.3 tries to do something like
for i in /dev/loop* ; do
    losetup $i
done
that is, it spawns one losetup process for each existing /dev/loopX; this hogs
CPU, especially if your system comes with maxloops=255 :). So, let's replace it
with the xen-4.0.0 version, where this problem is fixed (it uses losetup -a,
hurray).
Then, restore time for a 400MB domain, with the restore file in the cache,
with 4 vbds backed by /dev/loopX, with one vif, is ca. 2.7s real time.
According to systemtap, the CPU time requirements are
xend threads - 0.363s
udevd (in dom0) - 0.007s
/etc/xen/scripts/block and its children - 1.075s
xc_restore - 1.368s
/etc/xen/scripts/vif-bridge (in netvm) - 0.130s

The obvious idea to improve /etc/xen/scripts/block shell script execution time
is to recode it in some other language that will not spawn hundreds of
processes to do its job.

Now, xc_restore.
a) Is it correct that when xc_restore runs, the target domain memory is already
zeroed (because the hypervisor scrubs free memory before it is assigned to a
new domain)? If so, xc_save could check whether a given page contains only
zeroes and, if it does, omit it from the savefile. This could result in quite
significant savings when
- we save a freshly booted domain, or if we can zero out free memory in the
domain before saving
- we plan to restore multiple times from the same savefile (yes, vbd must be
restored in this case too).

b) xen-3.4.3/xc_restore reads data from the savefile in 4k portions - so, one
read syscall per page. Make it read in larger chunks. It looks like this is
fixed in xen-4.0.0, is this correct?

Also, it looks really excessive that basically copying 400MB of memory takes
over 1.3s of CPU time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its
dom0 kernel code? Xen mm code? hypercall overhead?), or is it anything
else?
I am aware that in the usual cases xc_restore is not the bottleneck
(reading the savefile from the disk or the network is), but when we can fetch
the savefile quickly, it matters.

Is the 3.4.3 branch still being developed, or is it in pure maintenance mode,
so that new code should be prepared for 4.0.0?

Regards,
Rafal Wojtczuk
Principal Researcher
Invisible Things Lab, Qubes-os project

^ permalink raw reply [flat|nested] 18+ messages in thread
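The zero-page idea in (a) can be sketched as follows. This is only a minimal illustration in Python, not xc_save code: the names `is_zero_page` and `filter_zero_pages`, the (pfn, page) pairs, and the 4096-byte page size are assumptions made for the example, and the real savefile format is not modelled. It would also only be safe if restored domain memory really does start out zeroed, which is exactly the open question above.

```python
PAGE_SIZE = 4096              # assumed x86 page size
ZERO_PAGE = bytes(PAGE_SIZE)  # reference page of all zero bytes


def is_zero_page(page: bytes) -> bool:
    """Return True if the page consists solely of zero bytes."""
    return page == ZERO_PAGE


def filter_zero_pages(pages):
    """Yield only the (pfn, page) pairs whose content is not all zeroes.

    `pages` is a hypothetical iterable of (pfn, bytes) pairs standing in
    for whatever the save code iterates over.
    """
    for pfn, page in pages:
        if len(page) != PAGE_SIZE:
            raise ValueError("page %d has unexpected size %d" % (pfn, len(page)))
        if not is_zero_page(page):
            yield pfn, page


def demo():
    """Build four pages, two of them empty, and return the pfns that are kept."""
    pages = [
        (0, bytes(PAGE_SIZE)),                 # all zero, skipped
        (1, b"\x01" + bytes(PAGE_SIZE - 1)),   # first byte set, kept
        (2, bytes(PAGE_SIZE)),                 # all zero, skipped
        (3, bytes(PAGE_SIZE - 1) + b"\xff"),   # last byte set, kept
    ]
    return [pfn for pfn, _ in filter_zero_pages(pages)]
```

In this sketch the restore side would have to treat every pfn missing from the savefile as "leave the freshly allocated page untouched", so the saving in file size only translates into a correct restore under the zeroed-memory assumption.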
* Re: Improving domU restore time
  2010-05-25 10:35 Improving domU restore time Rafal Wojtczuk
@ 2010-05-25 10:58 ` Joanna Rutkowska
  2010-05-25 11:50 ` Keir Fraser
  1 sibling, 0 replies; 18+ messages in thread
From: Joanna Rutkowska @ 2010-05-25 10:58 UTC (permalink / raw)
  To: Rafal Wojtczuk; +Cc: xen-devel

A bit of background to Rafal's post -- we plan to implement a feature
that we call "Disposable VMs" in Qubes, which would essentially allow for
super-fast creation of small, single-purpose VMs (DomUs), e.g. just for
opening a PDF or a Word document, etc. The point is: the creation &
resume of such a VM must be really fast, i.e. much below 1s.

And this seems possible, especially if we use sparse files for storing
the VM's save-image and for the restore operation (the VMs we're talking
about here would have around 100-150MB of actual data recorded in a
sparse savefile).

But, as Rafal pointed out, some of the operations that Xen performs seem
to be implemented inefficiently, and we wanted to get your opinion before
we start optimizing them (i.e. the xc_restore and /etc/xen/scripts/block
optimizations that Rafal mentioned).

Thanks,
j.

On 05/25/2010 12:35 PM, Rafal Wojtczuk wrote:
> Hello,
> I would be grateful for the comments on possible methods to improve domain
> restore performance. Focusing on the PV case, if it matters.
> 1) xen-4.0.0
> I see a similar problem to the one reported at the thread at
> http://lists.xensource.com/archives/html/xen-devel/2010-05/msg00677.html
>
> Dom0 is 2.6.32.9-7.pvops0 x86_64, xen-4.0.0 x86_64.
> [user@qubes ~]$ xm create /dev/null
> kernel=/boot/vmlinuz-2.6.32.9-7.pvops0.qubes.x86_64
> root=/dev/mapper/dmroot extra="rootdelay=1000" memory=400
> ...wait a second...
> [user@qubes ~]$ xm save null nullsave
> [user@qubes ~]$ time cat nullsave >/dev/null
> ...
> [user@qubes ~]$ time cat nullsave >/dev/null
> ...
> [user@qubes ~]$ time cat nullsave >/dev/null > real 0m0.173s > user 0m0.010s > sys 0m0.164s > /* sits nicely in the cache, let's restore... */ > [user@qubes ~]$ time xm restore nullsave > real 0m9.189s > user 0m0.151s > sys 0m0.039s > > According to systemtap, xc_restore uses 3812s of CPU time; besides it being > a lot, what uses the remaining 6s ? Just as reported previously, there are > some errors in xend.log > > [2010-05-25 10:49:02 2392] DEBUG (XendCheckpoint:286) restore:shadow=0x0, > _static_max=0x19000000, _static_min=0x0, > [2010-05-25 10:49:02 2392] DEBUG (XendCheckpoint:305) [xc_restore]: > /usr/lib64/xen/bin/xc_restore 39 3 1 2 0 0 0 0 > [2010-05-25 10:49:02 2392] INFO (XendCheckpoint:423) xc_domain_restore > start: p2m_size = 19000 > [2010-05-25 10:49:02 2392] INFO (XendCheckpoint:423) Reloading memory pages: > 0% > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) ERROR Internal error: > Error when reading batch size > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) ERROR Internal error: > error when buffering batch, finishing > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:4100% > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Memory reloaded (0 > pages) > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) read VCPU 0 > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Completed checkpoint > load > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Domain ready to be > built. > [2010-05-25 10:49:11 2392] INFO (XendCheckpoint:423) Restore exit with rc=0 > > Note, xc_restore on xen-3.4.3 works much faster (and with no warnings in the > log), with the same dom0 pvops kernel. > > Ok, so there is some issue here. Some more generic thoughts below. 
> > 2) xen-3.4.3 > Firstly, /etc/xen/scripts/block in xen-3.4.3 tries to do something like > for i in /dev/loop* ; do > losetup $i > so, spawn one losetup process per each existing /dev/loopX; it hogs CPU, > especially if your system comes with maxloops=255 :). So, > let's replace it with the xen-4.0.0 version, where this problem is fixed (it > uses losetup -a, hurray). > Then, restore time for a 400MB domain, with the restore file in the cache, > with 4 vbds backed by /dev/loopX, with one vif, is ca 2.7s real time. > According to systemtap, the CPU time requirements are > xend threads- 0.363s > udevd(in dom0) - 0.007s > /etc/xen/scripts/block and its children - 1.075s > xc_restore - 1.368s > /etc/xen/scripts/vif-bridge (in netvm) - 0.130s > > The obvious idea to improve /etc/xen/scripts/block shell script execution time > is to recode it, in some other language that will not spawn hundreds of > processes to do its job. > > Now, xc_restore. > a) Is it correct that when xc_restore runs, the target domain memory is already > zeroed (because hypervisor scrubs free memory, before it is assigned to a > new domain) ? So, xc_save could check whether a given page contains only > zeroes and if so, omit it in the savefile. This could result in quite > significant savings when > - we save a freshly booted domain, or if we can zero out free memory in the > domain before saving > - we plan to restore multiple times from the same savefile (yes, vbd must be > restored in this case too). > > b) xen-3.4.3/xc_restore reads data from savefile in 4k portions - so, one > read syscall per page. Make it read in larger chunks. It looks it is fixed in > xen-4.0.0, is this correct ? > > Also, it looks really excessive that basically copying 400MB of memory takes > over 1.3s cpu time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its > dom0 kernel code ? Xen mm code ? hypercall overhead ? ), anything > else ? 
> I am aware that in the usual cases, xc_restore is not the bottleneck > (savefile reads from the disk or the network is), but in case we can fetch > savefile quickly, it matters. > > Is 3.4.3 branch still being developed, or pure maintenance mode only, so new > code should be prepared for 4.0.0 ? > > Regards, > Rafal Wojtczuk > Principal Researcher > Invisible Things Lab, Qubes-os project > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel [-- Attachment #1.2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 226 bytes --] [-- Attachment #2: Type: text/plain, Size: 138 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Improving domU restore time 2010-05-25 10:35 Improving domU restore time Rafal Wojtczuk 2010-05-25 10:58 ` Joanna Rutkowska @ 2010-05-25 11:50 ` Keir Fraser 2010-05-25 12:50 ` Rafal Wojtczuk 2010-05-31 9:42 ` Rafal Wojtczuk 1 sibling, 2 replies; 18+ messages in thread From: Keir Fraser @ 2010-05-25 11:50 UTC (permalink / raw) To: Rafal Wojtczuk, xen-devel@lists.xensource.com On 25/05/2010 11:35, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote: > a) Is it correct that when xc_restore runs, the target domain memory is > already > zeroed (because hypervisor scrubs free memory, before it is assigned to a > new domain) There is no guarantee that the memory will be zeroed. > b) xen-3.4.3/xc_restore reads data from savefile in 4k portions - so, one > read syscall per page. Make it read in larger chunks. It looks it is fixed in > xen-4.0.0, is this correct ? It got changed a lot for Remus. I expect performance was on their mind. Normally kernel's file readahead heuristic would get back most of the performance of not reading in larger chunks. > Also, it looks really excessive that basically copying 400MB of memory takes > over 1.3s cpu time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its > dom0 kernel code ? Xen mm code ? hypercall overhead ? ), anything > else ? I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of that loop. -- Keir > I am aware that in the usual cases, xc_restore is not the bottleneck > (savefile reads from the disk or the network is), but in case we can fetch > savefile quickly, it matters. > > Is 3.4.3 branch still being developed, or pure maintenance mode only, so new > code should be prepared for 4.0.0 ? ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Improving domU restore time 2010-05-25 11:50 ` Keir Fraser @ 2010-05-25 12:50 ` Rafal Wojtczuk 2010-05-25 12:59 ` Keir Fraser 2010-05-25 13:02 ` Improving domU restore time Keir Fraser 2010-05-31 9:42 ` Rafal Wojtczuk 1 sibling, 2 replies; 18+ messages in thread From: Rafal Wojtczuk @ 2010-05-25 12:50 UTC (permalink / raw) To: Keir Fraser; +Cc: xen-devel@lists.xensource.com On Tue, May 25, 2010 at 12:50:40PM +0100, Keir Fraser wrote: > On 25/05/2010 11:35, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote: > > > a) Is it correct that when xc_restore runs, the target domain memory is > > already > > zeroed (because hypervisor scrubs free memory, before it is assigned to a > > new domain) > > There is no guarantee that the memory will be zeroed. Interesting. For my education, could you explain who is responsible for clearing memory of a newborn domain ? Xend ? Could you point me to the relevant code fragments ? It looks sensible to clear free memory in hypervisor context in its idle cycles; if non-temporal instructions (movnti) were used for this, it would not pollute caches, and it must be done anyway ? > > b) xen-3.4.3/xc_restore reads data from savefile in 4k portions - so, one > > read syscall per page. Make it read in larger chunks. It looks it is fixed in > > xen-4.0.0, is this correct ? > > It got changed a lot for Remus. I expect performance was on their mind. > Normally kernel's file readahead heuristic would get back most of the > performance of not reading in larger chunks. Yes, readahead would keep the disk request queue full, but I was just thinking of lowering the syscall overhead. 
1e5 syscalls is a lot :)

[user@qubes ~]$ dd if=/dev/zero of=/dev/null bs=4k count=102400
102400+0 records in
102400+0 records out
419430400 bytes (419 MB) copied, 0.307211 s, 1.4 GB/s
[user@qubes ~]$ dd if=/dev/zero of=/dev/null bs=4M count=100
100+0 records in
100+0 records out
419430400 bytes (419 MB) copied, 0.25347 s, 1.7 GB/s

RW

^ permalink raw reply [flat|nested] 18+ messages in thread
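The "1e5 syscalls" figure follows from 419430400 bytes / 4096 bytes per read = 102400 reads, against 100 reads with 4M chunks. A rough way to see the same effect outside dd is sketched below in Python; `count_reads` and `expected_reads` are hypothetical helper names, and the demo uses a small temporary file rather than a real savefile.

```python
import os
import tempfile


def expected_reads(total_bytes: int, chunk_size: int) -> int:
    """Number of full-size reads needed to consume total_bytes (ceiling division)."""
    return -(-total_bytes // chunk_size)


def count_reads(path: str, chunk_size: int) -> int:
    """Read the whole file with os.read() and count the calls that returned data."""
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk_size)  # one read syscall per iteration
            if not data:                    # empty result means end of file
                break
            calls += 1
    finally:
        os.close(fd)
    return calls


def demo():
    """Compare 4k reads with 1M reads on a 4 MiB scratch file."""
    size = 4 * 1024 * 1024
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(bytes(size))
        path = f.name
    try:
        return count_reads(path, 4096), count_reads(path, 1024 * 1024)
    finally:
        os.unlink(path)
```

The counts only show how many syscalls are issued; how much wall-clock time that costs depends on the kernel and hardware, which is what the dd timings above measure.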
* Re: Improving domU restore time 2010-05-25 12:50 ` Rafal Wojtczuk @ 2010-05-25 12:59 ` Keir Fraser 2010-05-25 13:33 ` scrubbing free'd pages James Harper 2010-05-25 14:12 ` scrubbing pages on vm pause Joanna Rutkowska 2010-05-25 13:02 ` Improving domU restore time Keir Fraser 1 sibling, 2 replies; 18+ messages in thread From: Keir Fraser @ 2010-05-25 12:59 UTC (permalink / raw) To: Rafal Wojtczuk; +Cc: xen-devel@lists.xensource.com On 25/05/2010 13:50, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote: >> There is no guarantee that the memory will be zeroed. > Interesting. > For my education, could you explain who is responsible for clearing memory > of a newborn domain ? Xend ? Could you point me to the relevant code > fragments ? New domains are not guaranteed to receive zeroed memory. The only guarantee Xen provides is that when it frees memory for a *dead* domain, it will scrub the contents before reallocation (it may not write zeroes however, in a debug build of Xen for example!). Other memory pages the domain freeing the pages must scrub them itself before freeing them back to Xen. > It looks sensible to clear free memory in hypervisor context in its idle > cycles; if non-temporal instructions (movnti) were used for this, it would > not pollute caches, and it must be done anyway ? Only for that one case (freeing pages of a dead domain). In that one case we currently do it synchronously. But that is because it was better than my previous crappy asynchronous scrubbing code. :-) >>> b) xen-3.4.3/xc_restore reads data from savefile in 4k portions - so, one >>> read syscall per page. Make it read in larger chunks. It looks it is fixed >>> in >>> xen-4.0.0, is this correct ? >> >> It got changed a lot for Remus. I expect performance was on their mind. >> Normally kernel's file readahead heuristic would get back most of the >> performance of not reading in larger chunks. 
> Yes, readahead would keep the disk request queue full, but I was just > thinking of lowering the syscall overhead. 1e5 syscalls is a lot :) Well the code looks like it batches now anyway. If it isn't, it would be interesting to see if making batches would measurably improve performance. -- Keir > [user@qubes ~]$ dd if=/dev/zero of=/dev/null bs=4k count=102400 > 102400+0 records in > 102400+0 records out > 419430400 bytes (419 MB) copied, 0.307211 s, 1.4 GB/s > [user@qubes ~]$ dd if=/dev/zero of=/dev/null bs=4M count=100 > 100+0 records in > 100+0 records out > 419430400 bytes (419 MB) copied, 0.25347 s, 1.7 GB/s ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: scrubbing free'd pages 2010-05-25 12:59 ` Keir Fraser @ 2010-05-25 13:33 ` James Harper 2010-05-25 13:39 ` Keir Fraser 2010-05-25 14:12 ` scrubbing pages on vm pause Joanna Rutkowska 1 sibling, 1 reply; 18+ messages in thread From: James Harper @ 2010-05-25 13:33 UTC (permalink / raw) To: Keir Fraser, Rafal Wojtczuk; +Cc: xen-devel > Other memory pages the domain freeing the > pages must scrub them itself before freeing them back to Xen. Is that true for a HVM domain making a decrease_reservation hypercall? If so I should modify my code accordingly... it also means I need to know if the page I'm decreasing is an unpopulated PoD page or not too. James ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: scrubbing free'd pages 2010-05-25 13:33 ` scrubbing free'd pages James Harper @ 2010-05-25 13:39 ` Keir Fraser 2010-05-25 13:48 ` Paul Durrant 0 siblings, 1 reply; 18+ messages in thread From: Keir Fraser @ 2010-05-25 13:39 UTC (permalink / raw) To: James Harper, Rafal Wojtczuk; +Cc: xen-devel@lists.xensource.com On 25/05/2010 14:33, "James Harper" <james.harper@bendigoit.com.au> wrote: >> Other memory pages the domain freeing the >> pages must scrub them itself before freeing them back to Xen. > > Is that true for a HVM domain making a decrease_reservation hypercall? > If so I should modify my code accordingly... Yes you should. > it also means I need to > know if the page I'm decreasing is an unpopulated PoD page or not too. Certainly you could avoid it in that case. Actually I think the PoD code can detect and reclaim allocated-but-zeroed pages however. But not sure if you really have to rely on that or not. -- Keir ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: scrubbing free'd pages 2010-05-25 13:39 ` Keir Fraser @ 2010-05-25 13:48 ` Paul Durrant 0 siblings, 0 replies; 18+ messages in thread From: Paul Durrant @ 2010-05-25 13:48 UTC (permalink / raw) To: Keir Fraser, James Harper, Rafal Wojtczuk; +Cc: xen-devel@lists.xensource.com > -----Original Message----- > From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel- > bounces@lists.xensource.com] On Behalf Of Keir Fraser > Sent: 25 May 2010 14:40 > To: James Harper; Rafal Wojtczuk > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] scrubbing free'd pages > > On 25/05/2010 14:33, "James Harper" <james.harper@bendigoit.com.au> > wrote: > > >> Other memory pages the domain freeing the > >> pages must scrub them itself before freeing them back to Xen. > > > > Is that true for a HVM domain making a decrease_reservation > hypercall? > > If so I should modify my code accordingly... > > Yes you should. > > > it also means I need to > > know if the page I'm decreasing is an unpopulated PoD page or not > too. > > Certainly you could avoid it in that case. Actually I think the PoD > code can > detect and reclaim allocated-but-zeroed pages however. But not sure > if you > really have to rely on that or not. > Yes, that's true, but it would be better if we didn't have to scrub pages and cause a populate immediately before an invalidate. Paul ^ permalink raw reply [flat|nested] 18+ messages in thread
* scrubbing pages on vm pause
  2010-05-25 12:59 ` Keir Fraser
  2010-05-25 13:33 ` scrubbing free'd pages James Harper
@ 2010-05-25 14:12 ` Joanna Rutkowska
  2010-05-25 14:13 ` Keir Fraser
  1 sibling, 1 reply; 18+ messages in thread
From: Joanna Rutkowska @ 2010-05-25 14:12 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Rafal Wojtczuk

On 05/25/2010 02:59 PM, Keir Fraser wrote:
> On 25/05/2010 13:50, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote:
>
>>> There is no guarantee that the memory will be zeroed.
>> Interesting.
>> For my education, could you explain who is responsible for clearing memory
>> of a newborn domain ? Xend ? Could you point me to the relevant code
>> fragments ?
>
> New domains are not guaranteed to receive zeroed memory. The only guarantee
> Xen provides is that when it frees memory for a *dead* domain, it will scrub
> the contents before reallocation (it may not write zeroes however, in a
> debug build of Xen for example!). Other memory pages the domain freeing the
> pages must scrub them itself before freeing them back to Xen.
>

And what happens when we pause and save a domain? Are the pages zeroed out
by Xen in that case?

joanna.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: scrubbing pages on vm pause 2010-05-25 14:12 ` scrubbing pages on vm pause Joanna Rutkowska @ 2010-05-25 14:13 ` Keir Fraser 2010-05-25 14:19 ` Joanna Rutkowska 0 siblings, 1 reply; 18+ messages in thread From: Keir Fraser @ 2010-05-25 14:13 UTC (permalink / raw) To: Joanna Rutkowska; +Cc: xen-devel@lists.xensource.com, Rafal Wojtczuk On 25/05/2010 15:12, "Joanna Rutkowska" <joanna@invisiblethingslab.com> wrote: >> New domains are not guaranteed to receive zeroed memory. The only guarantee >> Xen provides is that when it frees memory for a *dead* domain, it will scrub >> the contents before reallocation (it may not write zeroes however, in a >> debug build of Xen for example!). Other memory pages the domain freeing the >> pages must scrub them itself before freeing them back to Xen. >> > > And what happens when we pause and save a domain? Are the pages zero-out > by xen in that case? If the original domain is subsequently destroyed then yes, Xen zeroes the pages. -- Keir ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: scrubbing pages on vm pause
  2010-05-25 14:13 ` Keir Fraser
@ 2010-05-25 14:19 ` Joanna Rutkowska
  2010-05-25 14:19 ` Keir Fraser
  0 siblings, 1 reply; 18+ messages in thread
From: Joanna Rutkowska @ 2010-05-25 14:19 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Rafal Wojtczuk

On 05/25/2010 04:13 PM, Keir Fraser wrote:
> On 25/05/2010 15:12, "Joanna Rutkowska" <joanna@invisiblethingslab.com>
> wrote:
>
>>> New domains are not guaranteed to receive zeroed memory. The only guarantee
>>> Xen provides is that when it frees memory for a *dead* domain, it will scrub
>>> the contents before reallocation (it may not write zeroes however, in a
>>> debug build of Xen for example!). Other memory pages the domain freeing the
>>> pages must scrub them itself before freeing them back to Xen.
>>>
>>
>> And what happens when we pause and save a domain? Are the pages zero-out
>> by xen in that case?
>
> If the original domain is subsequently destroyed then yes, Xen zeroes the
> pages.
>

Let's consider this scenario:

xm save domain1

xm create domain2

Can the domain2 get *unscrubbed* pages that were previously used by
domain1, but were not scrubbed properly by domain1?

j.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: scrubbing pages on vm pause 2010-05-25 14:19 ` Joanna Rutkowska @ 2010-05-25 14:19 ` Keir Fraser 2010-05-25 14:24 ` Joanna Rutkowska 0 siblings, 1 reply; 18+ messages in thread From: Keir Fraser @ 2010-05-25 14:19 UTC (permalink / raw) To: Joanna Rutkowska; +Cc: xen-devel@lists.xensource.com, Rafal Wojtczuk On 25/05/2010 15:19, "Joanna Rutkowska" <joanna@invisiblethingslab.com> wrote: > Let's consider this scenario: > > xm save domain1 > > xm create domain2 > > Can the domain2 get *unscrubbed* pages that were previously used by > domain1, but were not scrubbed properly by domain1? Generally speaking a domain loses pages to the free pool in only two ways: via a decrease_reservation hypercall, and via domain destruction. In the former case the domain itself is responsible for first scrubbing the page. In the latter case Xen is responsible. With both avenues covered, domain2 cannot get unscrubbed pages from domain1. -- Keir ^ permalink raw reply [flat|nested] 18+ messages in thread
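The two-avenue argument above can be made concrete with a toy model. The sketch below is in Python and is not Xen code: the `ToyXen` class, its method names, and the byte-string "pages" are all invented for illustration. It only encodes the rule as stated in the thread: a page returned via decrease_reservation is the guest's responsibility to scrub, while pages of a destroyed domain are scrubbed by Xen (with whatever pattern the build uses, not necessarily zeroes).

```python
class ToyXen:
    """Toy model of the free pool; every name here is hypothetical."""

    def __init__(self, scrub_byte: bytes = b"\x00"):
        self.scrub_byte = scrub_byte  # a debug build might use a non-zero pattern
        self.free_pool = []           # pages available for new domains
        self.domains = {}             # domain name -> list of bytearray pages

    def create_domain(self, name, contents):
        """Give the domain its own copies of the supplied page contents."""
        self.domains[name] = [bytearray(c) for c in contents]

    def decrease_reservation(self, name, index, guest_scrubs=True):
        """Guest hands one page back; scrubbing is the guest's own job."""
        page = self.domains[name].pop(index)
        if guest_scrubs:
            page[:] = bytes(len(page))
        self.free_pool.append(page)

    def destroy_domain(self, name):
        """On destruction, the hypervisor scrubs every remaining page."""
        for page in self.domains.pop(name):
            page[:] = self.scrub_byte * len(page)
            self.free_pool.append(page)


def pool_leaks(xen, secret: bytes) -> bool:
    """Return True if any page in the free pool still contains `secret`."""
    return any(secret in bytes(page) for page in xen.free_pool)


def demo(guest_scrubs: bool) -> bool:
    """domain1 frees one page itself and is then destroyed; report any leak."""
    xen = ToyXen()
    xen.create_domain("domain1", [b"secret-A" * 4, b"secret-B" * 4])
    xen.decrease_reservation("domain1", 0, guest_scrubs=guest_scrubs)
    xen.destroy_domain("domain1")
    return pool_leaks(xen, b"secret")
```

With a well-behaved guest, `demo(True)` reports no leak; `demo(False)` shows that the only way stale data reaches the pool in this model is a guest that skips its own scrub before decrease_reservation.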
* Re: scrubbing pages on vm pause
  2010-05-25 14:19 ` Keir Fraser
@ 2010-05-25 14:24 ` Joanna Rutkowska
  0 siblings, 0 replies; 18+ messages in thread
From: Joanna Rutkowska @ 2010-05-25 14:24 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Rafal Wojtczuk

On 05/25/2010 04:19 PM, Keir Fraser wrote:
> On 25/05/2010 15:19, "Joanna Rutkowska" <joanna@invisiblethingslab.com>
> wrote:
>
>> Let's consider this scenario:
>>
>> xm save domain1
>>
>> xm create domain2
>>
>> Can the domain2 get *unscrubbed* pages that were previously used by
>> domain1, but were not scrubbed properly by domain1?
>
> Generally speaking a domain loses pages to the free pool in only two ways:
> via a decrease_reservation hypercall, and via domain destruction. In the
> former case the domain itself is responsible for first scrubbing the page.
> In the latter case Xen is responsible. With both avenues covered, domain2
> cannot get unscrubbed pages from domain1.
>

Makes sense. Thanks,
j.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Improving domU restore time 2010-05-25 12:50 ` Rafal Wojtczuk 2010-05-25 12:59 ` Keir Fraser @ 2010-05-25 13:02 ` Keir Fraser 1 sibling, 0 replies; 18+ messages in thread From: Keir Fraser @ 2010-05-25 13:02 UTC (permalink / raw) To: Rafal Wojtczuk; +Cc: xen-devel@lists.xensource.com On 25/05/2010 13:50, "Rafal Wojtczuk" <rafal@invisiblethingslab.com> wrote: >> There is no guarantee that the memory will be zeroed. > Interesting. > For my education, could you explain who is responsible for clearing memory > of a newborn domain ? Xend ? Could you point me to the relevant code > fragments ? New domains are not guaranteed to receive zeroed memory. The only guarantee Xen provides is that when it frees memory for a *dead* domain, it will scrub the contents before reallocation (it may not write zeroes however, in a debug build of Xen for example!). Other memory pages the domain freeing the pages must scrub them itself before freeing them back to Xen. > It looks sensible to clear free memory in hypervisor context in its idle > cycles; if non-temporal instructions (movnti) were used for this, it would > not pollute caches, and it must be done anyway ? Only for that one case (freeing pages of a dead domain). In that one case we currently do it synchronously. But that is because it was better than my previous crappy asynchronous scrubbing code. :-) >>> b) xen-3.4.3/xc_restore reads data from savefile in 4k portions - so, one >>> read syscall per page. Make it read in larger chunks. It looks it is fixed >>> in >>> xen-4.0.0, is this correct ? >> >> It got changed a lot for Remus. I expect performance was on their mind. >> Normally kernel's file readahead heuristic would get back most of the >> performance of not reading in larger chunks. > Yes, readahead would keep the disk request queue full, but I was just > thinking of lowering the syscall overhead. 1e5 syscalls is a lot :) Well the code looks like it batches now anyway. 
If it isn't, it would be interesting to see if making batches would measurably improve performance. -- Keir > [user@qubes ~]$ dd if=/dev/zero of=/dev/null bs=4k count=102400 > 102400+0 records in > 102400+0 records out > 419430400 bytes (419 MB) copied, 0.307211 s, 1.4 GB/s > [user@qubes ~]$ dd if=/dev/zero of=/dev/null bs=4M count=100 > 100+0 records in > 100+0 records out > 419430400 bytes (419 MB) copied, 0.25347 s, 1.7 GB/s ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Improving domU restore time 2010-05-25 11:50 ` Keir Fraser 2010-05-25 12:50 ` Rafal Wojtczuk @ 2010-05-31 9:42 ` Rafal Wojtczuk 2010-06-01 17:00 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 18+ messages in thread From: Rafal Wojtczuk @ 2010-05-31 9:42 UTC (permalink / raw) To: Keir Fraser; +Cc: xen-devel@lists.xensource.com [-- Attachment #1: Type: text/plain, Size: 3918 bytes --] Hello, > I would be grateful for the comments on possible methods to improve domain > restore performance. Focusing on the PV case, if it matters. Continuing the topic; thank you to everyone that responded so far. Focusing on xen-3.4.3 case for now, dom0/domU still 2.6.32.x pvops x86_64. Let me just reiterate that for our purposes, the domain save time (and possible related post-processing) is not critical, it is only the restore time that matters. I did some experiments; they involve: 1) before saving a domain, have domU allocate all free memory in an userland process, then fill it with some MAGIC_PATTERN. Save domU, then process the savefile, removing all pfns (and their page content) that refer to a page containing MAGIC_PATTERN. This reduces the savefile size. 2) instead of executing "xm restore savefile", just poke the xmlrpc request to Xend unix socket via socat 3) change the /etc/xen/scripts/block so that in the "add file:" case, it calls only 3 processes (xenstore-read, losetup, xenstore-write); assuming the sharing check can be done elsewhere, this should provide realistic lower bound for the execution time For a domain with 400MB RAM and 4 vbds, with the savefile in the fs cache, this cuts down the restore real time from 2700 ms to 1153 ms. Some questions: a) is the 1) method safe ? Normally, xc_domain_restore() allocates mfns via xc_domain_memory_populate_physmap() and then calls xc_add_mmu_update(MMU_MACHPHYS_UPDATE) on the pfn/mfn pairs. If we remove some pfns from the savefile, this will not happen. 
Instead, the mfn for the removed pfn (referring to memory whose content we don't care for) will be allocated in uncanonicalize_pagetable(), because there will be a pte entry for this page. But uncanonicalize_pagetable() does not call xc_add_mmu_update(). Still, the domain seems to be restored properly (naturally the buffer filled previously with MAGIC_PATTERN now contains junk, but this is the whole purpose of it). Again, is xc_add_mmu_update(MMU_MACHPHYS_UPDATE) really needed in the above scenario ? It basically does set_gpfn_from_mfn(mfn, gpfn) but this should already be taken care for by xc_domain_memory_populate_physmap() ? b) There still seems to be some discrepancy between the real time (1153ms) and the CPU time (970ms); considering this is a machine with 2 cores (and at least the hotplug scripts execute in parallel), it is notable. What can cause the involved processes to sleep (we read the savefile from fs cache, so there should be no disk reads at all). Is the single threaded nature of xenstored the possible cause for the delays ? Generally xenstored seems to be quite busy during the restore. Do you think some of the queries (from Xend?) are redundant ? Is there anything else that can be removed from the relevant Xend code with no harm ? This question may sound too blunt; but given the fact that "xm restore savefile" wastes 220 ms of CPU time doing apparently nothing useful, I would assume there is some overhead in Xend too. The systemtap trace in the attachment; it does not contain a line about the xenstored CPU ticks (259ms, really a lot?), as xenstored does not terminate any thread. c) >> Also, it looks really excessive that basically copying 400MB of memory takes >> over 1.3s cpu time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its > I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of > that loop. Let's imagine there is a hypercall do_direct_memcpy_from_dom0_to_mfn(int mfn_count, mfn* mfn_array, char * pages_content). 
Would it make xc_restore faster if instead of using the xc_map_foreign_batch()
interface, it would call the above hypercall ? On x86_64 all the physical
memory is already mapped in the hypervisor (is this correct?), so this could
be quicker, as no page table setup would be necessary ?

Regards,
Rafal Wojtczuk
Principal Researcher
Invisible Things Lab, Qubes-os project

[-- Attachment #2: probe.systemtap --]
[-- Type: text/plain, Size: 634 bytes --]

global process_start
probe kernel.function("do_execve").return {
	p=pid()
	t=gettimeofday_us()
	process_start[p]=t
	printf("executed pid %d parent %d time=%d name=%s\n", p, ppid(), t, execname())
}
probe timer.profile {
	tid=tid()
	if (!user_mode())
		kticks[tid] <<< 1
	else
		uticks[tid] <<< 1
	tids[tid] <<< 1
}
global uticks, kticks, tids
function logit()
{
	tid=tid()
	p=pid()
	t=gettimeofday_us()
	elapsed=t-process_start[p]
	printf("finishing tid %d pid %d parent %d ticks %d real %d time=%d name=%s\n", tid, p, ppid(), @count(kticks[tid])+@count(uticks[tid]), elapsed, t, execname())
}
probe syscall.exit { logit() }

[-- Attachment #3: probeoutput.txt --]
[-- Type: text/plain, Size: 3909 bytes --]

# slightly postprocessed, to include [delta_from_prev, abstime] values before
# each event logged
0 0 executed pid 10521 parent 2232 time=1275054562565050 name=socat
254543 254543 executed pid 10523 parent 10215 time=1275054562819593 name=block
28076 282619 executed pid 10524 parent 10220 time=1275054562847669 name=block
4585 287204 executed pid 10525 parent 10523 time=1275054562852254 name=xenstore-read
2645 289849 executed pid 10526 parent 10524 time=1275054562854899 name=xenstore-read
13441 303290 finishing tid 10525 pid 10525 parent 10523 ticks 3 real 16086 time=1275054562868340 name=xenstore-read
6564 309854 executed pid 10528 parent 10523 time=1275054562874904 name=losetup
4772 314626 finishing tid 10526 pid 10526 parent 10524 ticks 3 real 24777 time=1275054562879676 name=xenstore-read
2782 317408 executed pid 10530 parent 10524 time=1275054562882458 name=losetup
1715 319123 executed pid 10529 parent 10527 time=1275054562884173 name=block
1816 320939 finishing tid 10530 pid 10530 parent 10524 ticks 2 real 3531 time=1275054562885989 name=losetup
4658 325597 executed pid 10532 parent 10524 time=1275054562890647 name=xenstore-write
1820 327417 executed pid 10533 parent 10529 time=1275054562892467 name=xenstore-read
4007 331424 finishing tid 10532 pid 10532 parent 10524 ticks 2 real 5827 time=1275054562896474 name=xenstore-write
538 331962 finishing tid 10524 pid 10524 parent 10220 ticks 7 real 49343 time=1275054562897012 name=block
1133 333095 finishing tid 10528 pid 10528 parent 10523 ticks 4 real 23241 time=1275054562898145 name=losetup
3605 336700 finishing tid 10533 pid 10533 parent 10529 ticks 2 real 9283 time=1275054562901750 name=xenstore-read
1714 338414 executed pid 10535 parent 10523 time=1275054562903464 name=xenstore-write
3556 341970 executed pid 10536 parent 10529 time=1275054562907020 name=losetup
2064 344034 finishing tid 10536 pid 10536 parent 10529 ticks 3 real 2064 time=1275054562909084 name=losetup
1382 345416 finishing tid 10535 pid 10535 parent 10523 ticks 3 real 7002 time=1275054562910466 name=xenstore-write
591 346007 finishing tid 10523 pid 10523 parent 10215 ticks 7 real 91464 time=1275054562911057 name=block
3332 349339 executed pid 10538 parent 10529 time=1275054562914389 name=xenstore-write
4557 353896 finishing tid 10538 pid 10538 parent 10529 ticks 2 real 4557 time=1275054562918946 name=xenstore-write
549 354445 finishing tid 10529 pid 10529 parent 10527 ticks 7 real 35322 time=1275054562919495 name=block
25937 380382 executed pid 10539 parent 10215 time=1275054562945432 name=block
6636 387018 executed pid 10540 parent 10539 time=1275054562952068 name=xenstore-read
4327 391345 finishing tid 10540 pid 10540 parent 10539 ticks 3 real 4327 time=1275054562956395 name=xenstore-read
3895 395240 executed pid 10541 parent 10539 time=1275054562960290 name=losetup
1603 396843 finishing tid 10541 pid 10541 parent 10539 ticks 3 real 1603 time=1275054562961893 name=losetup
2141 398984 executed pid 10543 parent 10539 time=1275054562964034 name=xenstore-write
5343 404327 finishing tid 10543 pid 10543 parent 10539 ticks 2 real 5343 time=1275054562969377 name=xenstore-write
577 404904 finishing tid 10539 pid 10539 parent 10215 ticks 7 real 24522 time=1275054562969954 name=block
67293 472197 executed pid 10544 parent 8826 time=1275054563037247 name=xc_restore
407415 879612 finishing tid 10544 pid 10544 parent 8826 ticks 387 real 407415 time=1275054563444662 name=xc_restore
2571 882183 finishing tid 10545 pid 8826 parent 1 ticks 15 real 1275054563447233 time=1275054563447233 name=xend
271673 1153856 finishing tid 10521 pid 10521 parent 2232 ticks 8 real 1153856 time=1275054563718906 name=socat
73 1153929 finishing tid 10522 pid 8826 parent 1 ticks 238 real 1275054563718979 time=1275054563718979 name=xend
2258682 3412611 finishing tid 10215 pid 10215 parent 748 ticks 5 real 1275054565977661 time=1275054565977661 name=udevd

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
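
The header of probeoutput.txt says each record was prefixed with [delta_from_prev, abstime] values. A minimal sketch of how such a prefix could be recomputed from raw probe lines, and how the tick counts could be summed per process name, is shown below. It is an illustration only, not the script actually used for the attachment; the names annotate and ticks_by_name are made up here. What one tick corresponds to in wall-clock time depends on the kernel's timer configuration, so no conversion to milliseconds is attempted.

```python
import re
from collections import defaultdict

# Fields printed by both printf() calls in the probe script.
TIME_NAME = re.compile(r"time=(\d+) name=(\S+)")
# Present only on "finishing" records.
TICKS = re.compile(r"\bticks (\d+)\b")


def annotate(raw_lines):
    """Prefix each probe record with 'delta_from_prev abstime' (microseconds)."""
    out, first, prev = [], None, None
    for line in raw_lines:
        m = TIME_NAME.search(line)
        if m is None:
            continue  # skip comments and anything that is not a probe record
        t = int(m.group(1))
        if first is None:
            first = prev = t
        out.append(f"{t - prev} {t - first} {line.strip()}")
        prev = t
    return out


def ticks_by_name(lines):
    """Sum the 'ticks' field of finishing records, grouped by process name."""
    totals = defaultdict(int)
    for line in lines:
        m, k = TIME_NAME.search(line), TICKS.search(line)
        if m and k:
            totals[m.group(2)] += int(k.group(1))
    return dict(totals)


# Four raw records taken from the trace above, without their prefixes.
sample = [
    "executed pid 10521 parent 2232 time=1275054562565050 name=socat",
    "executed pid 10523 parent 10215 time=1275054562819593 name=block",
    "executed pid 10544 parent 8826 time=1275054563037247 name=xc_restore",
    "finishing tid 10544 pid 10544 parent 8826 ticks 387 real 407415 "
    "time=1275054563444662 name=xc_restore",
]

for row in annotate(sample):
    print(row)
print(ticks_by_name(sample))
```

Feeding the whole attachment through ticks_by_name would give a per-program tick total, which is one way to see where the CPU time in question b) is spent.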
* Re: Re: Improving domU restore time
  2010-05-31  9:42 ` Rafal Wojtczuk
@ 2010-06-01 17:00   ` Jeremy Fitzhardinge
  2010-06-02 16:24     ` Rafal Wojtczuk
  0 siblings, 1 reply; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2010-06-01 17:00 UTC
To: Rafal Wojtczuk; +Cc: xen-devel@lists.xensource.com, Keir Fraser

On 05/31/2010 02:42 AM, Rafal Wojtczuk wrote:
> Hello,
>
>> I would be grateful for the comments on possible methods to improve domain
>> restore performance. Focusing on the PV case, if it matters.
>>
> Continuing the topic; thank you to everyone that responded so far.
>
> Focusing on xen-3.4.3 case for now, dom0/domU still 2.6.32.x pvops x86_64.
> Let me just reiterate that for our purposes, the domain save time (and
> possible related post-processing) is not critical, it
> is only the restore time that matters. I did some experiments; they involve:
> 1) before saving a domain, have domU allocate all free memory in a userland
> process, then fill it with some MAGIC_PATTERN. Save domU, then process the
> savefile, removing all pfns (and their page content) that refer to a page
> containing MAGIC_PATTERN.
> This reduces the savefile size.
>

Why not just balloon the domain down?

> 2) instead of executing "xm restore savefile", just poke the xmlrpc request
> to Xend unix socket via socat
>

I would seek alternatives to the xend/xm toolset. I've been doing my
bit to make libxenlight/xl useful, though it still needs a lot of work
to get it to anything remotely production-ready...

> 3) change the /etc/xen/scripts/block so that in the "add file:" case, it calls
> only 3 processes (xenstore-read, losetup, xenstore-write); assuming the
> sharing check can be done elsewhere, this should provide realistic lower
> bound for the execution time
>
> For a domain with 400MB RAM and 4 vbds, with the savefile in the fs cache,
> this cuts down the restore real time from 2700 ms to 1153 ms. Some questions:
> a) is the 1) method safe ? Normally, xc_domain_restore() allocates mfns via
> xc_domain_memory_populate_physmap() and then calls
> xc_add_mmu_update(MMU_MACHPHYS_UPDATE) on
> the pfn/mfn pairs. If we remove some pfns from the savefile, this will not
> happen. Instead, the mfn for the removed pfn (referring to memory whose
> content we don't care for) will be allocated in uncanonicalize_pagetable(),
> because there will be a pte entry for this page. But uncanonicalize_pagetable()
> does not call xc_add_mmu_update(). Still, the domain seems to be restored
> properly (naturally the buffer filled previously with MAGIC_PATTERN now
> contains junk, but this is the whole purpose of it).
> Again, is xc_add_mmu_update(MMU_MACHPHYS_UPDATE) really needed in the above
> scenario ? It basically does
> set_gpfn_from_mfn(mfn, gpfn)
> but this should already be taken care of by
> xc_domain_memory_populate_physmap() ?
>
> b) There still seems to be some discrepancy between the real time (1153ms) and
> the CPU time (970ms); considering this is a machine with 2 cores (and at
> least the hotplug scripts execute in parallel), it is notable. What can cause
> the involved processes to sleep (we read the savefile from fs cache, so there
> should be no disk reads at all). Is the single threaded nature of xenstored
> the possible cause for the delays ?
>

Have you tried oxenstored? It works well for me, and seems to be a lot
faster.

> Generally xenstored seems to be quite busy during the restore. Do you think
> some of the queries (from Xend?) are redundant ? Is there anything else
> that can be removed from the relevant Xend code with no harm ? This question
> may sound too blunt; but given the fact that "xm restore savefile" wastes 220
> ms of CPU time doing apparently nothing useful, I would assume there is some
> overhead in Xend too.
> The systemtap trace in the attachment; it does not contain a line about the
> xenstored CPU ticks (259ms, really a lot?), as xenstored does not terminate
> any thread.
>
> c)
>
>>> Also, it looks really excessive that basically copying 400MB of memory takes
>>> over 1.3s cpu time. Is IOCTL_PRIVCMD_MMAPBATCH the culprit (its
>>>
>> I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of
>> that loop.
>>
> Let's imagine there is a hypercall do_direct_memcpy_from_dom0_to_mfn(int
> mfn_count, mfn* mfn_array, char * pages_content).
> Would it make xc_restore faster if instead of using the xc_map_foreign_batch()
> interface, it would call the above hypercall ? On x86_64 all the physical
> memory is already mapped in the hypervisor (is this correct?), so this could
> be quicker, as no page table setup would be necessary ?
>

The main cost of pagetable manipulations is the tlb flush; if you can
batch all your setups together to amortize the cost of the tlb flush, it
should be pretty quick. But if batching is not being used properly,
then it could get very expensive. My own observation of "strace xl
restore" is that it seems to do a *lot* of ioctls on privcmd, but I
haven't looked more closely to see what those calls are, and whether
they're being done in an optimal way.

    J
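
The amortization argument above can be made concrete with a toy cost model. The sketch below is not derived from Xen code: the function name mapping_cost is hypothetical, and the per-page and per-flush costs are arbitrary unit-less placeholders chosen only to show the shape of the effect, not measurements. It assumes every mapped page pays a fixed setup cost and every batch pays for one flush.

```python
def mapping_cost(pages, batch_pages, per_page_cost, flush_cost):
    """Toy model: each page costs per_page_cost, each batch costs one flush."""
    batches = -(-pages // batch_pages)  # ceiling division
    return pages * per_page_cost + batches * flush_cost


# 400 MB of 4 KiB pages (page size assumed); costs are arbitrary units.
PAGES = 400 * 1024 * 1024 // 4096

one_flush_per_page = mapping_cost(PAGES, 1, per_page_cost=1, flush_cost=50)
one_flush_per_1024 = mapping_cost(PAGES, 1024, per_page_cost=1, flush_cost=50)

print(one_flush_per_page, one_flush_per_1024)
```

Under these made-up costs the flush term dominates when every page is mapped on its own and nearly vanishes with large batches, which is why it matters whether the privcmd ioctls seen in strace are per batch or per page.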
* Re: Re: Improving domU restore time
  2010-06-01 17:00 ` Jeremy Fitzhardinge
@ 2010-06-02 16:24   ` Rafal Wojtczuk
  2010-06-02 16:33     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 18+ messages in thread
From: Rafal Wojtczuk @ 2010-06-02 16:24 UTC
To: Jeremy Fitzhardinge; +Cc: xen-devel@lists.xensource.com, Keir Fraser

On Tue, Jun 01, 2010 at 10:00:09AM -0700, Jeremy Fitzhardinge wrote:
> On 05/31/2010 02:42 AM, Rafal Wojtczuk wrote:
> > Hello,
> >
> >> I would be grateful for the comments on possible methods to improve domain
> >> restore performance. Focusing on the PV case, if it matters.
> >>
> > Continuing the topic; thank you to everyone that responded so far.
> >
> > Focusing on xen-3.4.3 case for now, dom0/domU still 2.6.32.x pvops x86_64.
> > Let me just reiterate that for our purposes, the domain save time (and
> > possible related post-processing) is not critical, it
> > is only the restore time that matters. I did some experiments; they involve:
> > 1) before saving a domain, have domU allocate all free memory in a userland
> > process, then fill it with some MAGIC_PATTERN. Save domU, then process the
> > savefile, removing all pfns (and their page content) that refer to a page
> > containing MAGIC_PATTERN.
> > This reduces the savefile size.
> Why not just balloon the domain down?
I thought it (well, rather the matching balloon up after restore) would cost
quite some CPU time; it used to AFAIR. But nowadays it looks sensible, in 90ms
range. Yes, that is much cleaner, thank you for the hint.

> > should be no disk reads at all). Is the single threaded nature of xenstored
> > the possible cause for the delays ?
> Have you tried oxenstored? It works well for me, and seems to be a lot
> faster.
Do you mean
http://xenbits.xensource.com/ext/xen-ocaml-tools.hg
?
After some tweaks to Makefiles (-fPIC is required on x86_64 for libs sources)
it compiles, but then it bails during startup with
fatal error: exception Failure("ioctl bind_interdomain failed")
This happens under xen-3.4.3; does it require 4.0.0 ?

> >> I would expect IOCTL_PRIVCMD_MMAPBATCH to be the most significant part of
> >> that loop.
> > Let's imagine there is a hypercall do_direct_memcpy_from_dom0_to_mfn(int
> > mfn_count, mfn* mfn_array, char * pages_content).
> The main cost of pagetable manipulations is the tlb flush; if you can
> batch all your setups together to amortize the cost of the tlb flush, it
> should be pretty quick. But if batching is not being used properly,
> then it could get very expensive. My own observation of "strace xl
> restore" is that it seems to do a *lot* of ioctls on privcmd, but I
> haven't looked more closely to see what those calls are, and whether
> they're being done in an optimal way.
Well, it looks like xc_restore should _usually_ call
xc_map_foreign_batch once per pages batch (once per 1024 read pages), which
looks sensible. xc_add_mmu_update also tries to batch requests. There are
432 occurrences of ioctl syscall in the xc_restore strace output; I am not
sure if it is damagingly numerous.

Regards,
Rafal Wojtczuk
Principal Researcher
Invisible Things Lab, Qubes-os project
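
To put the 432 ioctls in proportion, the number of 1024-page batches for a 400MB domain can be estimated with a few lines of arithmetic. This is a back-of-the-envelope sketch: the 4 KiB page size is an assumption (the usual x86_64 value), the function name restore_batches is made up for illustration, and the estimate ignores both the pages stripped from the savefile and any privcmd calls unrelated to page mapping.

```python
PAGE_SIZE = 4096    # bytes; assumed x86_64 page size
BATCH_PAGES = 1024  # pages per batch, as stated in the message above


def restore_batches(mem_bytes, page_size=PAGE_SIZE, batch_pages=BATCH_PAGES):
    """Return (pages, batches) needed to reload mem_bytes of guest memory."""
    pages = mem_bytes // page_size
    batches = -(-pages // batch_pages)  # ceiling division
    return pages, batches


pages, batches = restore_batches(400 * 1024 * 1024)
print(pages, batches)

# Cross-check against "p2m_size = 19000" logged at the start of the thread,
# assuming that value is printed in hexadecimal.
print(int("19000", 16) == pages)
```

With these assumptions a full 400MB image needs 102400 pages in 100 batches, so 432 ioctls would be a handful per batch rather than one per page.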
* Re: Re: Improving domU restore time
  2010-06-02 16:24 ` Rafal Wojtczuk
@ 2010-06-02 16:33   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2010-06-02 16:33 UTC
To: Rafal Wojtczuk; +Cc: xen-devel@lists.xensource.com, Keir Fraser

On 06/02/2010 09:24 AM, Rafal Wojtczuk wrote:
>> Why not just balloon the domain down?
>>
> I thought it (well, rather the matching balloon up after restore) would cost
> quite some CPU time; it used to AFAIR. But nowadays it looks sensible, in 90ms
> range. Yes, that is much cleaner, thank you for the hint.
>

Aside from the cost of the hypercalls to actually give up the pages,
ballooning is just the same as memory allocation from the system's
perspective.

>>> should be no disk reads at all). Is the single threaded nature of xenstored
>>> the possible cause for the delays ?
>>>
>> Have you tried oxenstored? It works well for me, and seems to be a lot
>> faster.
>>
> Do you mean
> http://xenbits.xensource.com/ext/xen-ocaml-tools.hg
> ?
> After some tweaks to Makefiles (-fPIC is required on x86_64 for libs sources)
> it compiles,

It builds out of the box for me on my x86-64 machine.

> but then it bails during startup with
> fatal error: exception Failure("ioctl bind_interdomain failed")
> This happens under xen-3.4.3; does it require 4.0.0 ?
>

No, I don't think so, but it does have to be the first xenstore you run
after boot. Ah, but Xen 4 probably has oxenstored build and other fixes
which aren't in 3.4.3. In particular, I think it has been brought into
the main xen-unstable repo, rather than living off to the side.

But it is much quicker than the C one, I think primarily because it is
entirely memory resident.

> Well, it looks like xc_restore should _usually_ call
> xc_map_foreign_batch once per pages batch (once per 1024 read pages), which
> looks sensible. xc_add_mmu_update also tries to batch requests. There are
> 432 occurrences of ioctl syscall in the xc_restore strace output; I am not
> sure if it is damagingly numerous.
>

Time for some profiling to see where the time is going then.

    J