* Re: [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
[not found] <mailman.1648.1327589466.1471.xen-devel@lists.xensource.com>
@ 2012-01-26 16:30 ` Andres Lagar-Cavilla
2012-01-26 16:38 ` Olaf Hering
0 siblings, 1 reply; 8+ messages in thread
From: Andres Lagar-Cavilla @ 2012-01-26 16:30 UTC (permalink / raw)
To: xen-devel; +Cc: olaf
> Date: Thu, 26 Jan 2012 15:51:02 +0100
> From: Olaf Hering <olaf@aepfle.de>
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] [PATCH] tools/libxc: send page-in requests in
> batches in linux_privcmd_map_foreign_bulk
> Message-ID: <bace47d7623cb92d5b08.1327589462@probook.site>
> Content-Type: text/plain; charset="us-ascii"
>
> # HG changeset patch
> # User Olaf Hering <olaf@aepfle.de>
> # Date 1327589312 -3600
> # Node ID bace47d7623cb92d5b080c54d3850642b8c52b46
> # Parent a87cbe503fed9e15d90d0bf6645d835331ec8874
> tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
>
> One of the bottlenecks with foreign page-in requests is the poor retry
> handling in linux_privcmd_map_foreign_bulk(). It sends one request per
> paged gfn at a time and waits until that gfn becomes accessible. This
> causes long delays in mmap requests from qemu-dm and xc_save.
>
> Instead of sending one request at a time, walk the entire gfn list and
> send batches of mmap requests. They will eventually end up in the pager's
> request ring (once it has room again) and will fill up this ring, so that
> the pager in turn can also process page-ins in batches.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>
> diff -r a87cbe503fed -r bace47d7623c tools/libxc/xc_linux_osdep.c
> --- a/tools/libxc/xc_linux_osdep.c
> +++ b/tools/libxc/xc_linux_osdep.c
> @@ -191,6 +191,59 @@ static void *linux_privcmd_map_foreign_b
> return addr;
> }
>
> +/*
> + * Retry mmap of paged gfns in batches
> + * returns < 0 on fatal error
> + * returns 0 if all gfns left paging state
> + * returns > 0 if some gfns are still in paging state
> + *
> + * Walk all gfns and assemble blocks of gfns in paging state.
> + * This will keep the request ring full and avoids delays.
> + */
> +static int retry_paged(int fd, uint32_t dom, void *addr,
> + const xen_pfn_t *arr, int *err, unsigned int num)
> +{
> + privcmd_mmapbatch_v2_t ioctlx;
> + int rc, paged = 0, i = 0;
> +
> + do
> + {
> + /* Skip gfns not in paging state */
> + if ( err[i] != -ENOENT )
> + {
> + i++;
> + continue;
> + }
> +
> + paged++;
> +
> + /* At least one gfn is still in paging state */
> + ioctlx.num = 1;
> + ioctlx.dom = dom;
> + ioctlx.addr = (unsigned long)addr + ((unsigned long)i<<XC_PAGE_SHIFT);
> + ioctlx.arr = arr + i;
> + ioctlx.err = err + i;
> +
> + /* Assemble a batch of requests */
> + while ( ++i < num )
> + {
> + if ( err[i] != -ENOENT )
> + break;
> + ioctlx.num++;
> + }
Olaf,
if I get a pattern of errors in the err array of the form:
{ENOENT, EWHATEVER, ENOENT, EWHATEVER}
with EWHATEVER != ENOENT
this patch does not optimize anything.
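Concretely, a toy illustration of that worst case (untested; EINVAL just
stands in for EWHATEVER, and the loop walks err[] the same way your
retry_paged() does):

    #include <errno.h>
    #include <stdio.h>

    int main(void)
    {
        /* Alternating paged/non-paged entries */
        int err[] = { -ENOENT, -EINVAL, -ENOENT, -EINVAL };
        unsigned int num = sizeof(err) / sizeof(err[0]), i = 0;

        while ( i < num )
        {
            if ( err[i] != -ENOENT )
            {
                i++;
                continue;
            }
            unsigned int batch = 1;
            while ( ++i < num && err[i] == -ENOENT )
                batch++;
            /* every batch degenerates to a single gfn */
            printf("batch of %u\n", batch);
        }
        return 0;
    }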
Why not alloc a separate arr2 array, move the gfns that failed with
ENOENT there, and retry the whole new arr2 block? You only need to
allocate arr2 once, to size num. That way you always batch everything.
Thanks,
Andres
> +
> + /* Send request and abort on fatal error */
> + rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
> + if ( rc < 0 && errno != ENOENT )
> + goto out;
> +
> + } while ( i < num );
> +
> + rc = paged;
> +out:
> + return rc;
> +}
> +
> static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h,
> uint32_t dom, int prot,
> const xen_pfn_t *arr, int *err, unsigned int num)
> @@ -220,21 +273,10 @@ static void *linux_privcmd_map_foreign_b
> /* Command was recognized, some gfns in arr are in paging state */
> if ( rc < 0 && errno == ENOENT )
> {
> - for ( i = rc = 0; rc == 0 && i < num; i++ )
> - {
> - if ( err[i] != -ENOENT )
> - continue;
> -
> - ioctlx.num = 1;
> - ioctlx.dom = dom;
> - ioctlx.addr = (unsigned long)addr + ((unsigned long)i<<XC_PAGE_SHIFT);
> - ioctlx.arr = arr + i;
> - ioctlx.err = err + i;
> - do {
> - usleep(100);
> - rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
> - } while ( rc < 0 && errno == ENOENT && err[i] == -ENOENT );
> - }
> + do {
> + usleep(100);
> + rc = retry_paged(fd, dom, addr, arr, err, num);
> + } while ( rc > 0 );
> }
> /* Command was not recognized, use fallback */
> else if ( rc < 0 && errno == EINVAL && (int)num > 0 )
>
>
>
* Re: [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
2012-01-26 16:30 ` [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk Andres Lagar-Cavilla
@ 2012-01-26 16:38 ` Olaf Hering
2012-01-26 16:58 ` Andres Lagar-Cavilla
0 siblings, 1 reply; 8+ messages in thread
From: Olaf Hering @ 2012-01-26 16:38 UTC (permalink / raw)
To: Andres Lagar-Cavilla; +Cc: xen-devel
On Thu, Jan 26, Andres Lagar-Cavilla wrote:
> Olaf,
> if I get a pattern of errors in the err array of the form:
>
> {ENOENT, EWHATEVER, ENOENT, EWHATEVER}
> with EWHATEVER != ENOENT
>
> this patch does not optimize anything.
It will poke all gfns instead of a single one before going to sleep.
> Why not alloc a separate arr2 array, move the gfns that failed with
> ENOENT there, and retry the whole new arr2 block? You only need to
> allocate arr2 once, to size num. That way you always batch everything.
The errors for gfns are most likely fragmented. Merging the gfns means
they will get a different addr. So the overall layout of arr and err
needs to remain the same.
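To spell out the constraint (a minimal sketch, not code from the patch;
entry_va() is just a name for illustration):

    #define XC_PAGE_SHIFT 12  /* as in libxc */

    /* Entry i of arr/err always corresponds to this fixed virtual
     * address, so moving a gfn to a different slot in a compacted
     * arr2 would map it at the wrong place. */
    static void *entry_va(void *addr, unsigned int i)
    {
        return (char *)addr + ((unsigned long)i << XC_PAGE_SHIFT);
    }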
Olaf
* Re: [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
2012-01-26 16:38 ` Olaf Hering
@ 2012-01-26 16:58 ` Andres Lagar-Cavilla
2012-01-26 17:02 ` Olaf Hering
2012-01-27 12:35 ` Olaf Hering
0 siblings, 2 replies; 8+ messages in thread
From: Andres Lagar-Cavilla @ 2012-01-26 16:58 UTC (permalink / raw)
To: Olaf Hering; +Cc: xen-devel
> On Thu, Jan 26, Andres Lagar-Cavilla wrote:
>
>> Olaf,
>> if I get a pattern of errors in the err array of the form:
>>
>> {ENOENT, EWHATEVER, ENOENT, EWHATEVER}
>> with EWHATEVER != ENOENT
>>
>> this patch does not optimize anything.
>
> It will poke all gfns instead of a single one before going to sleep.
All gfns that are contiguous in the arr array and have all failed with
ENOENT.
>
>> Why not alloc a separate arr2 array, move the gfns that failed with
>> ENOENT there, and retry the whole new arr2 block? You only need to
>> allocate arr2 once, to size num. That way you always batch everything.
>
> The errors for gfns are most likely fragmented. Merging the gfns means
> they will get a different addr. So the overall layout of arr and err
> needs to remain the same.
Ummh yeah, addr is the problem. But as you say, the errors for the gfns are
fragmented, so that seriously limits your batching abilities.
You could pass the whole arr sub-segment encompassing the first and last
gfn that failed with ENOENT. Successful maps within that array will be
re-done by the hypervisor, at no correctness cost. I would imagine that
the extra work is offset by the gains, but that remains to be seen.
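Untested sketch of what I mean, with the same types and ioctl as your
patch (retry_span() is a hypothetical name; the caller would loop on
rc < 0 && errno == ENOENT as before):

    static int retry_span(int fd, uint32_t dom, void *addr,
                          const xen_pfn_t *arr, int *err, unsigned int num)
    {
        privcmd_mmapbatch_v2_t ioctlx;
        unsigned int first = num, last = 0, i;

        /* Find the sub-segment spanning all gfns still in paging state */
        for ( i = 0; i < num; i++ )
        {
            if ( err[i] != -ENOENT )
                continue;
            if ( first == num )
                first = i;
            last = i;
        }

        if ( first == num )
            return 0;   /* all gfns left paging state */

        /* One request covering first..last; already mapped entries in
         * between are simply re-done by the hypervisor */
        ioctlx.num  = last - first + 1;
        ioctlx.dom  = dom;
        ioctlx.addr = (unsigned long)addr + ((unsigned long)first << XC_PAGE_SHIFT);
        ioctlx.arr  = arr + first;
        ioctlx.err  = err + first;

        return ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
    }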
Andres
>
> Olaf
>
* Re: [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
2012-01-26 16:58 ` Andres Lagar-Cavilla
@ 2012-01-26 17:02 ` Olaf Hering
2012-01-27 12:35 ` Olaf Hering
1 sibling, 0 replies; 8+ messages in thread
From: Olaf Hering @ 2012-01-26 17:02 UTC (permalink / raw)
To: Andres Lagar-Cavilla; +Cc: xen-devel
On Thu, Jan 26, Andres Lagar-Cavilla wrote:
> >> Why not alloc a separate arr2 array, move the gfns that failed with
> >> ENOENT there, and retry the whole new arr2 block? You only need to
> >> allocate arr2 once, to size num. That way you always batch everything.
> >
> > The errors for gfns are most likely fragmented. Merging the gfns means
> > they will get a different addr. So the overall layout of arr and err
> > needs to remain the same.
>
> Ummh yeah, addr is the problem. But as you say, the errors for the gfns are
> fragmented, so that seriously limits your batching abilities.
> You could pass the whole arr sub-segment encompassing the first and last
> gfn that failed with ENOENT. Successful maps within that array will be
> re-done by the hypervisor, at no correctness cost. I would imagine that
> the extra work is offset by the gains, but that remains to be seen.
Isn't that what the patch does: find a range of ENOENTs and try that
again?
Olaf
* Re: [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
2012-01-26 16:58 ` Andres Lagar-Cavilla
2012-01-26 17:02 ` Olaf Hering
@ 2012-01-27 12:35 ` Olaf Hering
1 sibling, 0 replies; 8+ messages in thread
From: Olaf Hering @ 2012-01-27 12:35 UTC (permalink / raw)
To: Andres Lagar-Cavilla; +Cc: xen-devel
On Thu, Jan 26, Andres Lagar-Cavilla wrote:
> You could pass the whole arr sub-segment encompassing the first and last
> gfn that failed with ENOENT. Successful maps within that array will be
> re-done by the hypervisor, at no correctness cost. I would imagine that
> the extra work is offset by the gains, but that remains to be seen.
I just tried it once again, and with xenpaging as it is written now, a
live migration of an idle 512MB guest with a 90MB paging target was done
in less than a minute. Previously it took a very long time; only
2MB/sec were sent over the wire.
Olaf
* [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
@ 2012-03-26 13:22 Olaf Hering
2012-04-02 16:41 ` Ian Jackson
0 siblings, 1 reply; 8+ messages in thread
From: Olaf Hering @ 2012-03-26 13:22 UTC (permalink / raw)
To: xen-devel
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1332768138 -7200
# Node ID b66a4a57cee2de421fb5938db04797b5de10162d
# Parent 4bd752a4cdf323c41c50f8cd6286f566d67adeae
tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
One of the bottlenecks with foreign page-in requests is the poor retry
handling in linux_privcmd_map_foreign_bulk(). It sends one request per
paged gfn at a time and waits until that gfn becomes accessible. This
causes long delays in mmap requests from qemu-dm and xc_save.
Instead of sending one request at a time, walk the entire gfn list and
send batches of mmap requests. They will eventually end up in the pager's
request ring (once it has room again) and will fill up this ring, so that
the pager in turn can also process page-ins in batches.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
diff -r 4bd752a4cdf3 -r b66a4a57cee2 tools/libxc/xc_linux_osdep.c
--- a/tools/libxc/xc_linux_osdep.c
+++ b/tools/libxc/xc_linux_osdep.c
@@ -191,6 +191,59 @@ static void *linux_privcmd_map_foreign_b
return addr;
}
+/*
+ * Retry mmap of all paged gfns in batches
+ * returns < 0 on fatal error
+ * returns 0 if all gfns left paging state
+ * returns > 0 if some gfns are still in paging state
+ *
+ * Walk all gfns and try to assemble blocks of gfns in paging state.
+ * This will keep the request ring full and avoids delays.
+ */
+static int retry_paged(int fd, uint32_t dom, void *addr,
+ const xen_pfn_t *arr, int *err, unsigned int num)
+{
+ privcmd_mmapbatch_v2_t ioctlx;
+ int rc, paged = 0, i = 0;
+
+ do
+ {
+ /* Skip gfns not in paging state */
+ if ( err[i] != -ENOENT )
+ {
+ i++;
+ continue;
+ }
+
+ paged++;
+
+ /* At least one gfn is still in paging state */
+ ioctlx.num = 1;
+ ioctlx.dom = dom;
+ ioctlx.addr = (unsigned long)addr + ((unsigned long)i<<XC_PAGE_SHIFT);
+ ioctlx.arr = arr + i;
+ ioctlx.err = err + i;
+
+ /* Assemble a batch of requests */
+ while ( ++i < num )
+ {
+ if ( err[i] != -ENOENT )
+ break;
+ ioctlx.num++;
+ }
+
+ /* Send request and abort on fatal error */
+ rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
+ if ( rc < 0 && errno != ENOENT )
+ goto out;
+
+ } while ( i < num );
+
+ rc = paged;
+out:
+ return rc;
+}
+
static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h,
uint32_t dom, int prot,
const xen_pfn_t *arr, int *err, unsigned int num)
@@ -220,21 +273,10 @@ static void *linux_privcmd_map_foreign_b
/* Command was recognized, some gfns in arr are in paging state */
if ( rc < 0 && errno == ENOENT )
{
- for ( i = rc = 0; rc == 0 && i < num; i++ )
- {
- if ( err[i] != -ENOENT )
- continue;
-
- ioctlx.num = 1;
- ioctlx.dom = dom;
- ioctlx.addr = (unsigned long)addr + ((unsigned long)i<<XC_PAGE_SHIFT);
- ioctlx.arr = arr + i;
- ioctlx.err = err + i;
- do {
- usleep(100);
- rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
- } while ( rc < 0 && errno == ENOENT && err[i] == -ENOENT );
- }
+ do {
+ usleep(100);
+ rc = retry_paged(fd, dom, addr, arr, err, num);
+ } while ( rc > 0 );
}
/* Command was not recognized, use fallback */
else if ( rc < 0 && errno == EINVAL && (int)num > 0 )
* Re: [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
2012-03-26 13:22 Olaf Hering
@ 2012-04-02 16:41 ` Ian Jackson
0 siblings, 0 replies; 8+ messages in thread
From: Ian Jackson @ 2012-04-02 16:41 UTC (permalink / raw)
To: Olaf Hering; +Cc: xen-devel
Olaf Hering writes ("[Xen-devel] [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk"):
> tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
* [PATCH] tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
@ 2012-01-26 14:51 Olaf Hering
0 siblings, 0 replies; 8+ messages in thread
From: Olaf Hering @ 2012-01-26 14:51 UTC (permalink / raw)
To: xen-devel
# HG changeset patch
# User Olaf Hering <olaf@aepfle.de>
# Date 1327589312 -3600
# Node ID bace47d7623cb92d5b080c54d3850642b8c52b46
# Parent a87cbe503fed9e15d90d0bf6645d835331ec8874
tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
One of the bottlenecks with foreign page-in requests is the poor retry
handling in linux_privcmd_map_foreign_bulk(). It sends one request per
paged gfn at a time and waits until that gfn becomes accessible. This
causes long delays in mmap requests from qemu-dm and xc_save.
Instead of sending one request at a time, walk the entire gfn list and
send batches of mmap requests. They will eventually end up in the pager's
request ring (once it has room again) and will fill up this ring, so that
the pager in turn can also process page-ins in batches.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
diff -r a87cbe503fed -r bace47d7623c tools/libxc/xc_linux_osdep.c
--- a/tools/libxc/xc_linux_osdep.c
+++ b/tools/libxc/xc_linux_osdep.c
@@ -191,6 +191,59 @@ static void *linux_privcmd_map_foreign_b
return addr;
}
+/*
+ * Retry mmap of paged gfns in batches
+ * returns < 0 on fatal error
+ * returns 0 if all gfns left paging state
+ * returns > 0 if some gfns are still in paging state
+ *
+ * Walk all gfns and assemble blocks of gfns in paging state.
+ * This will keep the request ring full and avoids delays.
+ */
+static int retry_paged(int fd, uint32_t dom, void *addr,
+ const xen_pfn_t *arr, int *err, unsigned int num)
+{
+ privcmd_mmapbatch_v2_t ioctlx;
+ int rc, paged = 0, i = 0;
+
+ do
+ {
+ /* Skip gfns not in paging state */
+ if ( err[i] != -ENOENT )
+ {
+ i++;
+ continue;
+ }
+
+ paged++;
+
+ /* At least one gfn is still in paging state */
+ ioctlx.num = 1;
+ ioctlx.dom = dom;
+ ioctlx.addr = (unsigned long)addr + ((unsigned long)i<<XC_PAGE_SHIFT);
+ ioctlx.arr = arr + i;
+ ioctlx.err = err + i;
+
+ /* Assemble a batch of requests */
+ while ( ++i < num )
+ {
+ if ( err[i] != -ENOENT )
+ break;
+ ioctlx.num++;
+ }
+
+ /* Send request and abort on fatal error */
+ rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
+ if ( rc < 0 && errno != ENOENT )
+ goto out;
+
+ } while ( i < num );
+
+ rc = paged;
+out:
+ return rc;
+}
+
static void *linux_privcmd_map_foreign_bulk(xc_interface *xch, xc_osdep_handle h,
uint32_t dom, int prot,
const xen_pfn_t *arr, int *err, unsigned int num)
@@ -220,21 +273,10 @@ static void *linux_privcmd_map_foreign_b
/* Command was recognized, some gfns in arr are in paging state */
if ( rc < 0 && errno == ENOENT )
{
- for ( i = rc = 0; rc == 0 && i < num; i++ )
- {
- if ( err[i] != -ENOENT )
- continue;
-
- ioctlx.num = 1;
- ioctlx.dom = dom;
- ioctlx.addr = (unsigned long)addr + ((unsigned long)i<<XC_PAGE_SHIFT);
- ioctlx.arr = arr + i;
- ioctlx.err = err + i;
- do {
- usleep(100);
- rc = ioctl(fd, IOCTL_PRIVCMD_MMAPBATCH_V2, &ioctlx);
- } while ( rc < 0 && errno == ENOENT && err[i] == -ENOENT );
- }
+ do {
+ usleep(100);
+ rc = retry_paged(fd, dom, addr, arr, err, num);
+ } while ( rc > 0 );
}
/* Command was not recognized, use fallback */
else if ( rc < 0 && errno == EINVAL && (int)num > 0 )