* [Qemu-devel] [BUG] Regression of exec migration @ 2009-08-27 9:19 Pierre Riteau 2009-08-27 14:13 ` Anthony Liguori 0 siblings, 1 reply; 9+ messages in thread From: Pierre Riteau @ 2009-08-27 9:19 UTC (permalink / raw) To: qemu-devel, Chris Lalancette [Sorry Chris, resending without the giant attachments.] Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec migration much slower than before. I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2. I'm migrating a fully booted Linux VM (also running Lenny) with 128MB of RAM to a file, using the following command: migrate "exec: cat > vmimage". The resulting file has a size of 57MB (because we save only what is allocated from the 128MB). With the current HEAD, it takes from 15 to 40 seconds (it's variable) to perform the migration to the file. With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or just commenting the "socket_set_nonblock(s->fd);" statement), it takes about 3 seconds. I did a little debugging and it appears that, when in non-blocking mode, a write to the popen'ed process which fails with EAGAIN (because cat is busy writing data on disk) will only be retried after 100 ms (the next timer tick?). Linked below are two traces of qemu's output with DEBUG_BUFFERED_FILE defined. I modified the dprintf macro to print the result of qemu_get_clock(rt_clock) before the message. nonblock.log is with an unmodified HEAD while block.log has the "socket_set_nonblock(s->fd);" statement commented. Notice the "backend not ready, freezing" messages when nonblock is used. http://dl.getdropbox.com/u/1389459/qemu-devel/block.log http://dl.getdropbox.com/u/1389459/qemu-devel/nonblock.log -- Pierre Riteau -- http://perso.univ-rennes1.fr/pierre.riteau/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [BUG] Regression of exec migration 2009-08-27 9:19 [Qemu-devel] [BUG] Regression of exec migration Pierre Riteau @ 2009-08-27 14:13 ` Anthony Liguori 2009-08-27 16:16 ` Pierre Riteau 0 siblings, 1 reply; 9+ messages in thread From: Anthony Liguori @ 2009-08-27 14:13 UTC (permalink / raw) To: Pierre Riteau; +Cc: Chris Lalancette, qemu-devel Pierre Riteau wrote: > [Sorry Chris, resending without the giant attachments.] > > Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec migration > much slower than before. > I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2. > > I'm migrating a fully booted Linux VM (also running Lenny) with 128MB > of RAM to a file, using the following command: migrate "exec: cat > > vmimage". The resulting file has a size of 57MB (because we save only > what is allocated from the 128MB). > With the current HEAD, it takes from 15 to 40 seconds (it's variable) > to perform the migration to the file. > With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or just > commenting the "socket_set_nonblock(s->fd);" statement), it takes > about 3 seconds. Without that changeset, it wasn't a live migration. The better way to compare would be to issue stop before doing the migrate and compare that time with the previous time. When a migration is live, it's iterative which means there's more work to do. > only be retried after 100 ms (the next timer tick?). We should be selecting on the fd so it will wait until we can write to it. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [BUG] Regression of exec migration 2009-08-27 14:13 ` Anthony Liguori @ 2009-08-27 16:16 ` Pierre Riteau 2009-08-27 16:24 ` Anthony Liguori 0 siblings, 1 reply; 9+ messages in thread From: Pierre Riteau @ 2009-08-27 16:16 UTC (permalink / raw) To: Anthony Liguori; +Cc: Chris Lalancette, qemu-devel On 27 août 09, at 16:13, Anthony Liguori wrote: > Pierre Riteau wrote: >> [Sorry Chris, resending without the giant attachments.] >> >> Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec >> migration much slower than before. >> I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2. >> >> I'm migrating a fully booted Linux VM (also running Lenny) with >> 128MB of RAM to a file, using the following command: migrate "exec: >> cat > vmimage". The resulting file has a size of 57MB (because we >> save only what is allocated from the 128MB). >> With the current HEAD, it takes from 15 to 40 seconds (it's >> variable) to perform the migration to the file. >> With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or >> just commenting the "socket_set_nonblock(s->fd);" statement), it >> takes about 3 seconds. > > Without that changeset, it wasn't a live migration. The better way > to compare would be to issue stop before doing the migrate and > compare that time with the previous time. > > When a migration is live, it's iterative which means there's more > work to do. I tried with stop too, and I get the same results. It's an idle VM so only a small number of pages are being modified while the migration is going on. I agree that the changeset seems good, the code it replaces was obviously wrong. But I think there is something wrong somewhere else, unless it is considered normal that it takes so much time for an exec migration. To compare, using the same setup with one more machine and a Gigabit network, a tcp migration capped at 35m (the slowest speed I've measured from the disk, it can be way faster) takes about the same time, between 2 and 4 seconds. >> only be retried after 100 ms (the next timer tick?). > > We should be selecting on the fd so it will wait until we can write > to it. > > Regards, > > Anthony Liguori -- Pierre Riteau -- http://perso.univ-rennes1.fr/pierre.riteau/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [BUG] Regression of exec migration 2009-08-27 16:16 ` Pierre Riteau @ 2009-08-27 16:24 ` Anthony Liguori 2009-08-28 13:04 ` Pierre Riteau 0 siblings, 1 reply; 9+ messages in thread From: Anthony Liguori @ 2009-08-27 16:24 UTC (permalink / raw) To: Pierre Riteau; +Cc: Chris Lalancette, qemu-devel Pierre Riteau wrote: > On 27 août 09, at 16:13, Anthony Liguori wrote: > >> Pierre Riteau wrote: >>> [Sorry Chris, resending without the giant attachments.] >>> >>> Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec migration >>> much slower than before. >>> I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2. >>> >>> I'm migrating a fully booted Linux VM (also running Lenny) with >>> 128MB of RAM to a file, using the following command: migrate "exec: >>> cat > vmimage". The resulting file has a size of 57MB (because we >>> save only what is allocated from the 128MB). >>> With the current HEAD, it takes from 15 to 40 seconds (it's >>> variable) to perform the migration to the file. >>> With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or >>> just commenting the "socket_set_nonblock(s->fd);" statement), it >>> takes about 3 seconds. >> >> Without that changeset, it wasn't a live migration. The better way >> to compare would be to issue stop before doing the migrate and >> compare that time with the previous time. >> >> When a migration is live, it's iterative which means there's more >> work to do. > > I tried with stop too, and I get the same results. It's an idle VM so > only a small number of pages are being modified while the migration is > going on. > I agree that the changeset seems good, the code it replaces was > obviously wrong. > But I think there is something wrong somewhere else, unless it is > considered normal that it takes so much time for an exec migration. > To compare, using the same setup with one more machine and a Gigabit > network, a tcp migration capped at 35m (the slowest speed I've > measured from the disk, it can be way faster) takes about the same > time, between 2 and 4 seconds. I don't think the difference between 3 seconds and 15 seconds is significant. Can you try a different workload that will result in a migration that takes much longer (say multiple minutes)? That is, I'd like to know whether there's a fixed greater cost of exec: migration vs. factor of 5. I expect exec: to be slower because there is more copying but not by a factor of 5. I expect that it's going to be a combination of relatively small constant factor + relatively small constant fixed cost. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [BUG] Regression of exec migration 2009-08-27 16:24 ` Anthony Liguori @ 2009-08-28 13:04 ` Pierre Riteau 2009-08-28 15:41 ` Anthony Liguori 0 siblings, 1 reply; 9+ messages in thread From: Pierre Riteau @ 2009-08-28 13:04 UTC (permalink / raw) To: Anthony Liguori; +Cc: Chris Lalancette, qemu-devel On 27 août 09, at 18:24, Anthony Liguori wrote: > Pierre Riteau wrote: >> On 27 août 09, at 16:13, Anthony Liguori wrote: >> >>> Pierre Riteau wrote: >>>> [Sorry Chris, resending without the giant attachments.] >>>> >>>> Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec >>>> migration much slower than before. >>>> I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2. >>>> >>>> I'm migrating a fully booted Linux VM (also running Lenny) with >>>> 128MB of RAM to a file, using the following command: migrate >>>> "exec: cat > vmimage". The resulting file has a size of 57MB >>>> (because we save only what is allocated from the 128MB). >>>> With the current HEAD, it takes from 15 to 40 seconds (it's >>>> variable) to perform the migration to the file. >>>> With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or >>>> just commenting the "socket_set_nonblock(s->fd);" statement), it >>>> takes about 3 seconds. >>> >>> Without that changeset, it wasn't a live migration. The better >>> way to compare would be to issue stop before doing the migrate and >>> compare that time with the previous time. >>> >>> When a migration is live, it's iterative which means there's more >>> work to do. >> >> I tried with stop too, and I get the same results. It's an idle VM >> so only a small number of pages are being modified while the >> migration is going on. >> I agree that the changeset seems good, the code it replaces was >> obviously wrong. >> But I think there is something wrong somewhere else, unless it is >> considered normal that it takes so much time for an exec migration. >> To compare, using the same setup with one more machine and a >> Gigabit network, a tcp migration capped at 35m (the slowest speed >> I've measured from the disk, it can be way faster) takes about the >> same time, between 2 and 4 seconds. > > I don't think the difference between 3 seconds and 15 seconds is > significant. > > Can you try a different workload that will result in a migration > that takes much longer (say multiple minutes)? That is, I'd like to > know whether there's a fixed greater cost of exec: migration vs. > factor of 5. > > I expect exec: to be slower because there is more copying but not by > a factor of 5. I expect that it's going to be a combination of > relatively small constant factor + relatively small constant fixed > cost. > > Regards, > > Anthony Liguori I did more tests, now with a 1024MB VM. Before launching the migration I run a simple program that allocates 900MB of memory and fills it with random data, then sleeps. The results are: TCP migration: ~ 30s exec migration to hard drive: ~ 4 min 20s exec migration to netcat (replicating the setup used with the TCP migration): ~ 5 min 40s Now, what's interesting is that I modified the program to do an infinite loop after writing the 900MB instead of sleeping. With this new program the results for exec migration are quite different from the previous experiment: TCP migration: ~ 30s exec migration to hard drive: ~ 30s exec migration to netcat: ~ 30s Could it be that, when the VM is idling, qemu sleeps for some time and misses important events like "the popen'ed process is ready to read"?. -- Pierre Riteau -- http://perso.univ-rennes1.fr/pierre.riteau/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [BUG] Regression of exec migration 2009-08-28 13:04 ` Pierre Riteau @ 2009-08-28 15:41 ` Anthony Liguori 2009-08-31 22:55 ` [Qemu-devel] " Charles Duffy 0 siblings, 1 reply; 9+ messages in thread From: Anthony Liguori @ 2009-08-28 15:41 UTC (permalink / raw) To: Pierre Riteau; +Cc: Chris Lalancette, qemu-devel Pierre Riteau wrote: > On 27 août 09, at 18:24, Anthony Liguori wrote: > >> Pierre Riteau wrote: >>> On 27 août 09, at 16:13, Anthony Liguori wrote: >>> >>>> Pierre Riteau wrote: >>>>> [Sorry Chris, resending without the giant attachments.] >>>>> >>>>> Commit 907500095851230a480b14bc852c4e49d32cb16d makes exec >>>>> migration much slower than before. >>>>> I'm running the latest HEAD of qemu, on Debian Lenny 5.0.2. >>>>> >>>>> I'm migrating a fully booted Linux VM (also running Lenny) with >>>>> 128MB of RAM to a file, using the following command: migrate >>>>> "exec: cat > vmimage". The resulting file has a size of 57MB >>>>> (because we save only what is allocated from the 128MB). >>>>> With the current HEAD, it takes from 15 to 40 seconds (it's >>>>> variable) to perform the migration to the file. >>>>> With commit 907500095851230a480b14bc852c4e49d32cb16d reverted (or >>>>> just commenting the "socket_set_nonblock(s->fd);" statement), it >>>>> takes about 3 seconds. >>>> >>>> Without that changeset, it wasn't a live migration. The better way >>>> to compare would be to issue stop before doing the migrate and >>>> compare that time with the previous time. >>>> >>>> When a migration is live, it's iterative which means there's more >>>> work to do. >>> >>> I tried with stop too, and I get the same results. It's an idle VM >>> so only a small number of pages are being modified while the >>> migration is going on. >>> I agree that the changeset seems good, the code it replaces was >>> obviously wrong. >>> But I think there is something wrong somewhere else, unless it is >>> considered normal that it takes so much time for an exec migration. >>> To compare, using the same setup with one more machine and a Gigabit >>> network, a tcp migration capped at 35m (the slowest speed I've >>> measured from the disk, it can be way faster) takes about the same >>> time, between 2 and 4 seconds. >> >> I don't think the difference between 3 seconds and 15 seconds is >> significant. >> >> Can you try a different workload that will result in a migration that >> takes much longer (say multiple minutes)? That is, I'd like to know >> whether there's a fixed greater cost of exec: migration vs. factor of 5. >> >> I expect exec: to be slower because there is more copying but not by >> a factor of 5. I expect that it's going to be a combination of >> relatively small constant factor + relatively small constant fixed cost. >> >> Regards, >> >> Anthony Liguori > > > I did more tests, now with a 1024MB VM. Before launching the migration > I run a simple program that allocates 900MB of memory and fills it > with random data, then sleeps. > The results are: > TCP migration: ~ 30s > exec migration to hard drive: ~ 4 min 20s > exec migration to netcat (replicating the setup used with the TCP > migration): ~ 5 min 40s I think the fundamental problem is that exec migration shouldn't use popen. It should create a pipe() and do a proper fork/exec. I don't think the f* function support asynchronous IO properly. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] Re: [BUG] Regression of exec migration 2009-08-28 15:41 ` Anthony Liguori @ 2009-08-31 22:55 ` Charles Duffy 2009-09-01 11:57 ` Pierre Riteau 0 siblings, 1 reply; 9+ messages in thread From: Charles Duffy @ 2009-08-31 22:55 UTC (permalink / raw) To: qemu-devel [-- Attachment #1: Type: text/plain, Size: 643 bytes --] Anthony Liguori wrote: > I think the fundamental problem is that exec migration shouldn't use > popen. It should create a pipe() and do a proper fork/exec. > > I don't think the f* function support asynchronous IO properly. Per this thread and a suggestion from Anthony on IRC, I've taken a shot at modifying the exec: migration code to be closer to the original kvm implementation. The severe performance degradation no longer appears to be present. This has been successfully smoke tested against the stable-0.11 branch; the rebase to master, attached, has only minor changes. Signed-off-by: Charles Duffy <Charles_Duffy@dell.com> [-- Attachment #2: 0001-stop-using-popen-for-outgoing-migrations.patch --] [-- Type: text/x-patch, Size: 4058 bytes --] >From c48ba672cd92d1d173fe9cb93f8e268d0d7952bc Mon Sep 17 00:00:00 2001 From: Charles Duffy <Charles_Duffy@dell.com> Date: Fri, 28 Aug 2009 22:17:47 -0500 Subject: [PATCH 1/2] stop using popen for outgoing migrations --- migration-exec.c | 95 ++++++++++++++++++++++++++++++++++++----------------- qemu-common.h | 2 + 2 files changed, 66 insertions(+), 31 deletions(-) diff --git a/migration-exec.c b/migration-exec.c index b45c833..c36e756 100644 --- a/migration-exec.c +++ b/migration-exec.c @@ -31,25 +31,72 @@ do { } while (0) #endif -static int file_errno(FdMigrationState *s) +// opaque is pid +typedef struct ExecMigrationState +{ + pid_t pid; +} ExecMigrationState; + +static int exec_errno(FdMigrationState *s) { return errno; } -static int file_write(FdMigrationState *s, const void * buf, size_t size) +static int exec_write(FdMigrationState *s, const void * buf, size_t size) { return write(s->fd, buf, size); } static int exec_close(FdMigrationState *s) { + int ret, status; + dprintf("exec_close\n"); - if (s->opaque) { - qemu_fclose(s->opaque); - s->opaque = NULL; - s->fd = -1; + + close(s->fd); + + if (!(s->opaque)) + return -1; + + do { + ret = waitpid(((ExecMigrationState*)(s->opaque))->pid, &status, 0); + } while (ret == -1 && errno == EINTR); + + qemu_free(s->opaque); + s->opaque = 0; + + return status; +} + +static pid_t exec_start_migration(const char *command, int *fd) { + const char *argv[] = { "sh", "-c", command, NULL }; + int fds[2]; + pid_t pid; + + if (pipe(fds) == -1) { + dprintf("pipe() (%s)\n", strerr(errno)); + return 0; } - return 0; + + pid = fork(); + if(pid == -1) { + close(fds[0]); + close(fds[1]); + dprintf("fork error (%s)\n", strerror(errno)); + return 0; + } else if (pid == 0) { + /* child process */ + close(fds[1]); + dup2(fds[0], STDIN_FILENO); + execv("/bin/sh", (char **)argv); + exit(1); + } + close(fds[0]); + + fcntl(fds[1], O_NONBLOCK); + + *fd = fds[1]; + return pid; } MigrationState *exec_start_outgoing_migration(const char *command, @@ -57,29 +104,20 @@ MigrationState *exec_start_outgoing_migration(const char *command, int detach) { FdMigrationState *s; - FILE *f; + int fd, pid; - s = qemu_mallocz(sizeof(*s)); + pid = exec_start_migration(command, &fd); + if(!pid) return NULL; - f = popen(command, "w"); - if (f == NULL) { - dprintf("Unable to popen exec target\n"); - goto err_after_alloc; - } - - s->fd = fileno(f); - if (s->fd == -1) { - dprintf("Unable to retrieve file descriptor for popen'd handle\n"); - goto err_after_open; - } - - socket_set_nonblock(s->fd); + s = qemu_mallocz(sizeof(*s)); + s->opaque = qemu_mallocz(sizeof(ExecMigrationState)); - s->opaque = qemu_popen(f, "w"); + ((ExecMigrationState*)(s->opaque))->pid = pid; + s->fd = fd; + s->get_error = exec_errno; + s->write = exec_write; s->close = exec_close; - s->get_error = file_errno; - s->write = file_write; s->mig_state.cancel = migrate_fd_cancel; s->mig_state.get_status = migrate_fd_get_status; s->mig_state.release = migrate_fd_release; @@ -93,14 +131,9 @@ MigrationState *exec_start_outgoing_migration(const char *command, migrate_fd_connect(s); return &s->mig_state; - -err_after_open: - pclose(f); -err_after_alloc: - qemu_free(s); - return NULL; } +/* helper for incoming case */ static void exec_accept_incoming_migration(void *opaque) { QEMUFile *f = opaque; diff --git a/qemu-common.h b/qemu-common.h index c8fb315..aeb1d78 100644 --- a/qemu-common.h +++ b/qemu-common.h @@ -24,6 +24,8 @@ #include <fcntl.h> #include <sys/stat.h> #include <assert.h> +#include <sys/types.h> +#include <sys/wait.h> #include "config-host.h" #ifndef O_LARGEFILE -- 1.6.4 [-- Attachment #3: 0002-stop-using-popen-for-incoming-migrations.patch --] [-- Type: text/x-patch, Size: 3641 bytes --] >From e7066ce85e969e43c8142f4845bc843e9d53f5e4 Mon Sep 17 00:00:00 2001 From: Charles Duffy <Charles_Duffy@dell.com> Date: Fri, 28 Aug 2009 23:18:49 -0500 Subject: [PATCH 2/2] stop using popen for incoming migrations --- migration-exec.c | 51 +++++++++++++++++++++++++++++++-------------------- 1 files changed, 31 insertions(+), 20 deletions(-) diff --git a/migration-exec.c b/migration-exec.c index c36e756..3dfea18 100644 --- a/migration-exec.c +++ b/migration-exec.c @@ -31,10 +31,10 @@ do { } while (0) #endif -// opaque is pid typedef struct ExecMigrationState { pid_t pid; + int fd; /* only used for incoming */ } ExecMigrationState; static int exec_errno(FdMigrationState *s) @@ -68,9 +68,9 @@ static int exec_close(FdMigrationState *s) return status; } -static pid_t exec_start_migration(const char *command, int *fd) { +static pid_t exec_start_migration(const char *command, int *fd, unsigned char incoming) { const char *argv[] = { "sh", "-c", command, NULL }; - int fds[2]; + int fds[2], parent_fd, child_fd; pid_t pid; if (pipe(fds) == -1) { @@ -78,6 +78,16 @@ static pid_t exec_start_migration(const char *command, int *fd) { return 0; } + if(incoming) { + /* child writes, parent reads */ + parent_fd = fds[0]; + child_fd = fds[1]; + } else { + /* parent writes, child reads */ + parent_fd = fds[1]; + child_fd = fds[0]; + } + pid = fork(); if(pid == -1) { close(fds[0]); @@ -86,16 +96,14 @@ static pid_t exec_start_migration(const char *command, int *fd) { return 0; } else if (pid == 0) { /* child process */ - close(fds[1]); - dup2(fds[0], STDIN_FILENO); + close(parent_fd); + dup2(child_fd, incoming ? STDOUT_FILENO : STDIN_FILENO); execv("/bin/sh", (char **)argv); exit(1); } - close(fds[0]); - - fcntl(fds[1], O_NONBLOCK); - - *fd = fds[1]; + close(child_fd); + fcntl(parent_fd, O_NONBLOCK); + *fd = parent_fd; return pid; } @@ -106,7 +114,7 @@ MigrationState *exec_start_outgoing_migration(const char *command, FdMigrationState *s; int fd, pid; - pid = exec_start_migration(command, &fd); + pid = exec_start_migration(command, &fd, 0); if(!pid) return NULL; s = qemu_mallocz(sizeof(*s)); @@ -136,7 +144,8 @@ MigrationState *exec_start_outgoing_migration(const char *command, /* helper for incoming case */ static void exec_accept_incoming_migration(void *opaque) { - QEMUFile *f = opaque; + ExecMigrationState *s = opaque; + QEMUFile *f = qemu_popen(fdopen(s->fd, "rb"), "r"); int ret; ret = qemu_loadvm_state(f); @@ -153,22 +162,24 @@ static void exec_accept_incoming_migration(void *opaque) err: qemu_fclose(f); + do { + waitpid(s->pid, NULL, 0); + } while (ret == -1 && errno == EINTR); + qemu_free(s); } int exec_start_incoming_migration(const char *command) { - QEMUFile *f; + ExecMigrationState *s; + + s = qemu_mallocz(sizeof(*s)); - dprintf("Attempting to start an incoming migration\n"); - f = qemu_popen_cmd(command, "r"); - if(f == NULL) { - dprintf("Unable to apply qemu wrapper to popen file\n"); + s->pid = exec_start_migration(command, &(s->fd), 1); + if(!s->pid) { return -errno; } - qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, - exec_accept_incoming_migration, NULL, - (void *)(unsigned long)f); + qemu_set_fd_handler2(s->fd, NULL, exec_accept_incoming_migration, NULL, (void*)s); return 0; } -- 1.6.4 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] Re: [BUG] Regression of exec migration 2009-08-31 22:55 ` [Qemu-devel] " Charles Duffy @ 2009-09-01 11:57 ` Pierre Riteau 2009-09-01 20:37 ` Charles Duffy 0 siblings, 1 reply; 9+ messages in thread From: Pierre Riteau @ 2009-09-01 11:57 UTC (permalink / raw) To: Charles Duffy; +Cc: qemu-devel On 1 sept. 09, at 00:55, Charles Duffy wrote: > Anthony Liguori wrote: >> I think the fundamental problem is that exec migration shouldn't >> use popen. It should create a pipe() and do a proper fork/exec. >> I don't think the f* function support asynchronous IO properly. > > Per this thread and a suggestion from Anthony on IRC, I've taken a > shot at modifying the exec: migration code to be closer to the > original kvm implementation. The severe performance degradation no > longer appears to be present. > > This has been successfully smoke tested against the stable-0.11 > branch; the rebase to master, attached, has only minor changes. I'm afraid this solves the problem only because you disabled non blocking operations on the file descriptor, effectively reverting 907500095851230a480b14bc852c4e49d32cb16d. See comment inline. > > Signed-off-by: Charles Duffy <Charles_Duffy@dell.com> > From c48ba672cd92d1d173fe9cb93f8e268d0d7952bc Mon Sep 17 00:00:00 2001 > From: Charles Duffy <Charles_Duffy@dell.com> > Date: Fri, 28 Aug 2009 22:17:47 -0500 > Subject: [PATCH 1/2] stop using popen for outgoing migrations > > --- > migration-exec.c | 95 +++++++++++++++++++++++++++++++++++ > +----------------- > qemu-common.h | 2 + > 2 files changed, 66 insertions(+), 31 deletions(-) > > diff --git a/migration-exec.c b/migration-exec.c > index b45c833..c36e756 100644 > --- a/migration-exec.c > +++ b/migration-exec.c > @@ -31,25 +31,72 @@ > do { } while (0) > #endif > > -static int file_errno(FdMigrationState *s) > +// opaque is pid > +typedef struct ExecMigrationState > +{ > + pid_t pid; > +} ExecMigrationState; > + > +static int exec_errno(FdMigrationState *s) > { > return errno; > } > > -static int file_write(FdMigrationState *s, const void * buf, size_t > size) > +static int exec_write(FdMigrationState *s, const void * buf, size_t > size) > { > return write(s->fd, buf, size); > } > > static int exec_close(FdMigrationState *s) > { > + int ret, status; > + > dprintf("exec_close\n"); > - if (s->opaque) { > - qemu_fclose(s->opaque); > - s->opaque = NULL; > - s->fd = -1; > + > + close(s->fd); > + > + if (!(s->opaque)) > + return -1; > + > + do { > + ret = waitpid(((ExecMigrationState*)(s->opaque))->pid, > &status, 0); > + } while (ret == -1 && errno == EINTR); > + > + qemu_free(s->opaque); > + s->opaque = 0; > + > + return status; > +} > + > +static pid_t exec_start_migration(const char *command, int *fd) { > + const char *argv[] = { "sh", "-c", command, NULL }; > + int fds[2]; > + pid_t pid; > + > + if (pipe(fds) == -1) { > + dprintf("pipe() (%s)\n", strerr(errno)); > + return 0; > } > - return 0; > + > + pid = fork(); > + if(pid == -1) { > + close(fds[0]); > + close(fds[1]); > + dprintf("fork error (%s)\n", strerror(errno)); > + return 0; > + } else if (pid == 0) { > + /* child process */ > + close(fds[1]); > + dup2(fds[0], STDIN_FILENO); > + execv("/bin/sh", (char **)argv); > + exit(1); > + } > + close(fds[0]); > + > + fcntl(fds[1], O_NONBLOCK); > + > + *fd = fds[1]; > + return pid; > } > > MigrationState *exec_start_outgoing_migration(const char *command, > @@ -57,29 +104,20 @@ MigrationState > *exec_start_outgoing_migration(const char *command, > int detach) > { > FdMigrationState *s; > - FILE *f; > + int fd, pid; > > - s = qemu_mallocz(sizeof(*s)); > + pid = exec_start_migration(command, &fd); > + if(!pid) return NULL; > > - f = popen(command, "w"); > - if (f == NULL) { > - dprintf("Unable to popen exec target\n"); > - goto err_after_alloc; > - } > - > - s->fd = fileno(f); > - if (s->fd == -1) { > - dprintf("Unable to retrieve file descriptor for popen'd > handle\n"); > - goto err_after_open; > - } > - > - socket_set_nonblock(s->fd); > + s = qemu_mallocz(sizeof(*s)); > + s->opaque = qemu_mallocz(sizeof(ExecMigrationState)); > > - s->opaque = qemu_popen(f, "w"); > + ((ExecMigrationState*)(s->opaque))->pid = pid; > > + s->fd = fd; > + s->get_error = exec_errno; > + s->write = exec_write; > s->close = exec_close; > - s->get_error = file_errno; > - s->write = file_write; > s->mig_state.cancel = migrate_fd_cancel; > s->mig_state.get_status = migrate_fd_get_status; > s->mig_state.release = migrate_fd_release; > @@ -93,14 +131,9 @@ MigrationState > *exec_start_outgoing_migration(const char *command, > > migrate_fd_connect(s); > return &s->mig_state; > - > -err_after_open: > - pclose(f); > -err_after_alloc: > - qemu_free(s); > - return NULL; > } > > +/* helper for incoming case */ > static void exec_accept_incoming_migration(void *opaque) > { > QEMUFile *f = opaque; > diff --git a/qemu-common.h b/qemu-common.h > index c8fb315..aeb1d78 100644 > --- a/qemu-common.h > +++ b/qemu-common.h > @@ -24,6 +24,8 @@ > #include <fcntl.h> > #include <sys/stat.h> > #include <assert.h> > +#include <sys/types.h> > +#include <sys/wait.h> > #include "config-host.h" > > #ifndef O_LARGEFILE > -- > 1.6.4 > > From e7066ce85e969e43c8142f4845bc843e9d53f5e4 Mon Sep 17 00:00:00 2001 > From: Charles Duffy <Charles_Duffy@dell.com> > Date: Fri, 28 Aug 2009 23:18:49 -0500 > Subject: [PATCH 2/2] stop using popen for incoming migrations > > --- > migration-exec.c | 51 ++++++++++++++++++++++++++++++ > +-------------------- > 1 files changed, 31 insertions(+), 20 deletions(-) > > diff --git a/migration-exec.c b/migration-exec.c > index c36e756..3dfea18 100644 > --- a/migration-exec.c > +++ b/migration-exec.c > @@ -31,10 +31,10 @@ > do { } while (0) > #endif > > -// opaque is pid > typedef struct ExecMigrationState > { > pid_t pid; > + int fd; /* only used for incoming */ > } ExecMigrationState; > > static int exec_errno(FdMigrationState *s) > @@ -68,9 +68,9 @@ static int exec_close(FdMigrationState *s) > return status; > } > > -static pid_t exec_start_migration(const char *command, int *fd) { > +static pid_t exec_start_migration(const char *command, int *fd, > unsigned char incoming) { > const char *argv[] = { "sh", "-c", command, NULL }; > - int fds[2]; > + int fds[2], parent_fd, child_fd; > pid_t pid; > > if (pipe(fds) == -1) { > @@ -78,6 +78,16 @@ static pid_t exec_start_migration(const char > *command, int *fd) { > return 0; > } > > + if(incoming) { > + /* child writes, parent reads */ > + parent_fd = fds[0]; > + child_fd = fds[1]; > + } else { > + /* parent writes, child reads */ > + parent_fd = fds[1]; > + child_fd = fds[0]; > + } > + > pid = fork(); > if(pid == -1) { > close(fds[0]); > @@ -86,16 +96,14 @@ static pid_t exec_start_migration(const char > *command, int *fd) { > return 0; > } else if (pid == 0) { > /* child process */ > - close(fds[1]); > - dup2(fds[0], STDIN_FILENO); > + close(parent_fd); > + dup2(child_fd, incoming ? STDOUT_FILENO : STDIN_FILENO); > execv("/bin/sh", (char **)argv); > exit(1); > } > - close(fds[0]); > - > - fcntl(fds[1], O_NONBLOCK); > - > - *fd = fds[1]; > + close(child_fd); > + fcntl(parent_fd, O_NONBLOCK); This doesn't look right, You should be using F_SETFL (like socket_set_nonblock does). If you check the return value of fcntl, you'll get -1 and errno set to "Invalid argument" (at least on my system, with a different ABI it could be a valid call to fcntl but doing something different than setting flags). > + *fd = parent_fd; > return pid; > } > > @@ -106,7 +114,7 @@ MigrationState > *exec_start_outgoing_migration(const char *command, > FdMigrationState *s; > int fd, pid; > > - pid = exec_start_migration(command, &fd); > + pid = exec_start_migration(command, &fd, 0); > if(!pid) return NULL; > > s = qemu_mallocz(sizeof(*s)); > @@ -136,7 +144,8 @@ MigrationState > *exec_start_outgoing_migration(const char *command, > /* helper for incoming case */ > static void exec_accept_incoming_migration(void *opaque) > { > - QEMUFile *f = opaque; > + ExecMigrationState *s = opaque; > + QEMUFile *f = qemu_popen(fdopen(s->fd, "rb"), "r"); > int ret; > > ret = qemu_loadvm_state(f); > @@ -153,22 +162,24 @@ static void > exec_accept_incoming_migration(void *opaque) > > err: > qemu_fclose(f); > + do { > + waitpid(s->pid, NULL, 0); > + } while (ret == -1 && errno == EINTR); > + qemu_free(s); > } > > int exec_start_incoming_migration(const char *command) > { > - QEMUFile *f; > + ExecMigrationState *s; > + > + s = qemu_mallocz(sizeof(*s)); > > - dprintf("Attempting to start an incoming migration\n"); > - f = qemu_popen_cmd(command, "r"); > - if(f == NULL) { > - dprintf("Unable to apply qemu wrapper to popen file\n"); > + s->pid = exec_start_migration(command, &(s->fd), 1); > + if(!s->pid) { > return -errno; > } > > - qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, > - exec_accept_incoming_migration, NULL, > - (void *)(unsigned long)f); > + qemu_set_fd_handler2(s->fd, NULL, > exec_accept_incoming_migration, NULL, (void*)s); > > return 0; > } > -- > 1.6.4 > -- Pierre Riteau -- http://perso.univ-rennes1.fr/pierre.riteau/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] Re: [BUG] Regression of exec migration 2009-09-01 11:57 ` Pierre Riteau @ 2009-09-01 20:37 ` Charles Duffy 0 siblings, 0 replies; 9+ messages in thread From: Charles Duffy @ 2009-09-01 20:37 UTC (permalink / raw) To: qemu-devel Pierre Riteau wrote: > I'm afraid this solves the problem only because you disabled non > blocking operations on the file descriptor, effectively reverting > 907500095851230a480b14bc852c4e49d32cb16d. > See comment inline. Indeed -- fixing this oversight, I'm seeing 2m30s to write an 84mb ramsave. This doesn't seem to square with Anthony's suggestion that the f* functions are to blame; we're still doing an fdopen() call and using f* on incoming migrations, but it's not incoming migrations where we're having performance issues. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-09-01 20:37 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-08-27 9:19 [Qemu-devel] [BUG] Regression of exec migration Pierre Riteau 2009-08-27 14:13 ` Anthony Liguori 2009-08-27 16:16 ` Pierre Riteau 2009-08-27 16:24 ` Anthony Liguori 2009-08-28 13:04 ` Pierre Riteau 2009-08-28 15:41 ` Anthony Liguori 2009-08-31 22:55 ` [Qemu-devel] " Charles Duffy 2009-09-01 11:57 ` Pierre Riteau 2009-09-01 20:37 ` Charles Duffy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).