* [Qemu-devel] [PATCH v2 0/5] A bunch of RDMA fixes
@ 2017-07-13 11:56 Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 1/5] migration/rdma: Fix race on source Dr. David Alan Gilbert (git)
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-07-13 11:56 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, lvivier, peterx
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
This is a bunch of RDMA fixes. The first is a race
I spotted a while ago that you don't hit during normal
operation; the rest deal with migration failure and
cancellation, which I started looking at because of lp1545052,
a failure to recover on the source if the destination
fails.
I'm pretty sure there are other cases where the source
might hang waiting for a failed destination, particularly
if the destination hangs rather than fails completely.
One I know of is waiting for the event after the rdma_disconnect,
but I don't have a good fix for it. Suggestions welcome.
v2
Dropped the timeout in the poll to 0.1s
Don't check error_reported, just error_state and received_error
Dr. David Alan Gilbert (5):
migration/rdma: Fix race on source
migration: Close file on failed migration load
migration/rdma: Allow cancelling while waiting for wrid
migration/rdma: Safely convert control types
migration/rdma: Send error during cancelling
migration/migration.c | 1 +
migration/rdma.c | 122 +++++++++++++++++++++++++++++++++++---------------
2 files changed, 88 insertions(+), 35 deletions(-)
--
2.13.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH v2 1/5] migration/rdma: Fix race on source
2017-07-13 11:56 [Qemu-devel] [PATCH v2 0/5] A bunch of RDMA fixes Dr. David Alan Gilbert (git)
@ 2017-07-13 11:56 ` Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 2/5] migration: Close file on failed migration load Dr. David Alan Gilbert (git)
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-07-13 11:56 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, lvivier, peterx
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Fix a race where the destination might try to send the source a
WRID_READY before the source has done a post-recv for it.
rdma_post_recv has to happen after the qp exists, and we're
OK since we've already called qemu_rdma_source_init, which calls
qemu_alloc_qp.
This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1285044
The race can be triggered by adding a wait of a few ms before this
post_recv_control (I originally hit it after turning on loads of
debug).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
migration/rdma.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index c6bc607a03..6111e10c70 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2365,6 +2365,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
caps_to_network(&cap);
+ ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
+ if (ret) {
+ ERROR(errp, "posting second control recv");
+ goto err_rdma_source_connect;
+ }
+
ret = rdma_connect(rdma->cm_id, &conn_param);
if (ret) {
perror("rdma_connect");
@@ -2405,12 +2411,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
rdma_ack_cm_event(cm_event);
- ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
- if (ret) {
- ERROR(errp, "posting second control recv!");
- goto err_rdma_source_connect;
- }
-
rdma->control_ready_expected = 1;
rdma->nb_sent = 0;
return 0;
--
2.13.0
* [Qemu-devel] [PATCH v2 2/5] migration: Close file on failed migration load
2017-07-13 11:56 [Qemu-devel] [PATCH v2 0/5] A bunch of RDMA fixes Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 1/5] migration/rdma: Fix race on source Dr. David Alan Gilbert (git)
@ 2017-07-13 11:56 ` Dr. David Alan Gilbert (git)
2017-07-14 3:27 ` Peter Xu
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid Dr. David Alan Gilbert (git)
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-07-13 11:56 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, lvivier, peterx
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Closing the file before exit on a failure allows
the source to clean up better, especially with RDMA.
Partial fix for https://bugs.launchpad.net/qemu/+bug/1545052
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/migration.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/migration/migration.c b/migration/migration.c
index a0db40d364..8552f54ab4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -348,6 +348,7 @@ static void process_incoming_migration_co(void *opaque)
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
error_report("load of migration failed: %s", strerror(-ret));
+ qemu_fclose(mis->from_src_file);
exit(EXIT_FAILURE);
}
mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
--
2.13.0
* [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid
2017-07-13 11:56 [Qemu-devel] [PATCH v2 0/5] A bunch of RDMA fixes Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 1/5] migration/rdma: Fix race on source Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 2/5] migration: Close file on failed migration load Dr. David Alan Gilbert (git)
@ 2017-07-13 11:56 ` Dr. David Alan Gilbert (git)
2017-07-14 3:36 ` Peter Xu
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 4/5] migration/rdma: Safely convert control types Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 5/5] migration/rdma: Send error during cancelling Dr. David Alan Gilbert (git)
4 siblings, 1 reply; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-07-13 11:56 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, lvivier, peterx
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
When waiting for a WRID, if the other side dies we end up waiting
forever with no way to cancel the migration.
Cure this by poll()ing the fd first with a timeout and checking
error flags and migration state.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/rdma.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 46 insertions(+), 6 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index 6111e10c70..30f5542b49 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1466,6 +1466,50 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
return 0;
}
+/* Wait for activity on the completion channel.
+ * Returns 0 on success, non-0 on error.
+ */
+static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
+{
+ /*
+ * Coroutine doesn't start until migration_fd_process_incoming()
+ * so don't yield unless we know we're running inside of a coroutine.
+ */
+ if (rdma->migration_started_on_destination) {
+ yield_until_fd_readable(rdma->comp_channel->fd);
+ } else {
+ /* This is the source side, we're in a separate thread
+ * or destination prior to migration_fd_process_incoming()
+ * we can't yield; so we have to poll the fd.
+ * But we need to be able to handle 'cancel' or an error
+ * without hanging forever.
+ */
+ while (!rdma->error_state && !rdma->received_error) {
+ GPollFD pfds[1];
+ pfds[0].fd = rdma->comp_channel->fd;
+ pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+ /* 0.1s timeout, should be fine for a 'cancel' */
+ switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
+ case 1: /* fd active */
+ return 0;
+
+ case 0: /* Timeout, go around again */
+ break;
+
+ default: /* Error of some type */
+ return -1;
+ }
+
+ if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
+ /* Bail out and let the cancellation happen */
+ return -EPIPE;
+ }
+ }
+ }
+
+ return rdma->error_state || rdma->received_error;
+}
+
/*
* Block until the next work request has completed.
*
@@ -1513,12 +1557,8 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
}
while (1) {
- /*
- * Coroutine doesn't start until migration_fd_process_incoming()
- * so don't yield unless we know we're running inside of a coroutine.
- */
- if (rdma->migration_started_on_destination) {
- yield_until_fd_readable(rdma->comp_channel->fd);
+ if (qemu_rdma_wait_comp_channel(rdma)) {
+ goto err_block_for_wrid;
}
if (ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx)) {
--
2.13.0
* [Qemu-devel] [PATCH v2 4/5] migration/rdma: Safely convert control types
2017-07-13 11:56 [Qemu-devel] [PATCH v2 0/5] A bunch of RDMA fixes Dr. David Alan Gilbert (git)
` (2 preceding siblings ...)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid Dr. David Alan Gilbert (git)
@ 2017-07-13 11:56 ` Dr. David Alan Gilbert (git)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 5/5] migration/rdma: Send error during cancelling Dr. David Alan Gilbert (git)
4 siblings, 0 replies; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-07-13 11:56 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, lvivier, peterx
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
control_desc[] is an array of strings that correspond to a
series of message types; they're used only for error messages, but if
the message type is seriously broken then we could go off the end of
the array.
Convert the array to a function, control_desc(), that bounds-checks
its argument.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
migration/rdma.c | 54 ++++++++++++++++++++++++++++++++----------------------
1 file changed, 32 insertions(+), 22 deletions(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index 30f5542b49..89684fdec6 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -165,20 +165,6 @@ enum {
RDMA_CONTROL_UNREGISTER_FINISHED, /* unpinning finished */
};
-static const char *control_desc[] = {
- [RDMA_CONTROL_NONE] = "NONE",
- [RDMA_CONTROL_ERROR] = "ERROR",
- [RDMA_CONTROL_READY] = "READY",
- [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
- [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
- [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
- [RDMA_CONTROL_COMPRESS] = "COMPRESS",
- [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
- [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
- [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
- [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
- [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
-};
/*
* Memory and MR structures used to represent an IB Send/Recv work request.
@@ -251,6 +237,30 @@ typedef struct QEMU_PACKED RDMADestBlock {
uint32_t padding;
} RDMADestBlock;
+static const char *control_desc(unsigned int rdma_control)
+{
+ static const char *strs[] = {
+ [RDMA_CONTROL_NONE] = "NONE",
+ [RDMA_CONTROL_ERROR] = "ERROR",
+ [RDMA_CONTROL_READY] = "READY",
+ [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
+ [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
+ [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
+ [RDMA_CONTROL_COMPRESS] = "COMPRESS",
+ [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
+ [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
+ [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
+ [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
+ [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
+ };
+
+ if (rdma_control > RDMA_CONTROL_UNREGISTER_FINISHED) {
+ return "??BAD CONTROL VALUE??";
+ }
+
+ return strs[rdma_control];
+}
+
static uint64_t htonll(uint64_t v)
{
union { uint32_t lv[2]; uint64_t llv; } u;
@@ -1630,7 +1640,7 @@ static int qemu_rdma_post_send_control(RDMAContext *rdma, uint8_t *buf,
.num_sge = 1,
};
- trace_qemu_rdma_post_send_control(control_desc[head->type]);
+ trace_qemu_rdma_post_send_control(control_desc(head->type));
/*
* We don't actually need to do a memcpy() in here if we used
@@ -1709,16 +1719,16 @@ static int qemu_rdma_exchange_get_response(RDMAContext *rdma,
network_to_control((void *) rdma->wr_data[idx].control);
memcpy(head, rdma->wr_data[idx].control, sizeof(RDMAControlHeader));
- trace_qemu_rdma_exchange_get_response_start(control_desc[expecting]);
+ trace_qemu_rdma_exchange_get_response_start(control_desc(expecting));
if (expecting == RDMA_CONTROL_NONE) {
- trace_qemu_rdma_exchange_get_response_none(control_desc[head->type],
+ trace_qemu_rdma_exchange_get_response_none(control_desc(head->type),
head->type);
} else if (head->type != expecting || head->type == RDMA_CONTROL_ERROR) {
error_report("Was expecting a %s (%d) control message"
", but got: %s (%d), length: %d",
- control_desc[expecting], expecting,
- control_desc[head->type], head->type, head->len);
+ control_desc(expecting), expecting,
+ control_desc(head->type), head->type, head->len);
if (head->type == RDMA_CONTROL_ERROR) {
rdma->received_error = true;
}
@@ -1828,7 +1838,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
}
}
- trace_qemu_rdma_exchange_send_waiting(control_desc[resp->type]);
+ trace_qemu_rdma_exchange_send_waiting(control_desc(resp->type));
ret = qemu_rdma_exchange_get_response(rdma, resp,
resp->type, RDMA_WRID_DATA);
@@ -1840,7 +1850,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
if (resp_idx) {
*resp_idx = RDMA_WRID_DATA;
}
- trace_qemu_rdma_exchange_send_received(control_desc[resp->type]);
+ trace_qemu_rdma_exchange_send_received(control_desc(resp->type));
}
rdma->control_ready_expected = 1;
@@ -3390,7 +3400,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
ret = -EIO;
goto out;
default:
- error_report("Unknown control message %s", control_desc[head.type]);
+ error_report("Unknown control message %s", control_desc(head.type));
ret = -EIO;
goto out;
}
--
2.13.0
* [Qemu-devel] [PATCH v2 5/5] migration/rdma: Send error during cancelling
2017-07-13 11:56 [Qemu-devel] [PATCH v2 0/5] A bunch of RDMA fixes Dr. David Alan Gilbert (git)
` (3 preceding siblings ...)
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 4/5] migration/rdma: Safely convert control types Dr. David Alan Gilbert (git)
@ 2017-07-13 11:56 ` Dr. David Alan Gilbert (git)
4 siblings, 0 replies; 9+ messages in thread
From: Dr. David Alan Gilbert (git) @ 2017-07-13 11:56 UTC (permalink / raw)
To: qemu-devel; +Cc: quintela, lvivier, peterx
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
When we issue a cancel and clean up the RDMA channel,
send a CONTROL_ERROR to get the destination to quit.
The rdma_cleanup code waits for the event to come back
from the rdma_disconnect, but that won't happen until the
destination quits, and there's currently nothing to force
it.
Note this makes cancel work while the destination
is alive; it already works if the destination is
truly dead. It doesn't fix the case where the destination
is hung (we get stuck waiting for the rdma_disconnect event).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
migration/rdma.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/migration/rdma.c b/migration/rdma.c
index 89684fdec6..bb9aa48d8c 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2258,7 +2258,9 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
int ret, idx;
if (rdma->cm_id && rdma->connected) {
- if (rdma->error_state && !rdma->received_error) {
+ if ((rdma->error_state ||
+ migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) &&
+ !rdma->received_error) {
RDMAControlHeader head = { .len = 0,
.type = RDMA_CONTROL_ERROR,
.repeat = 1,
--
2.13.0
* Re: [Qemu-devel] [PATCH v2 2/5] migration: Close file on failed migration load
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 2/5] migration: Close file on failed migration load Dr. David Alan Gilbert (git)
@ 2017-07-14 3:27 ` Peter Xu
0 siblings, 0 replies; 9+ messages in thread
From: Peter Xu @ 2017-07-14 3:27 UTC (permalink / raw)
To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, quintela, lvivier
On Thu, Jul 13, 2017 at 12:56:46PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Closing the file before exit on a failure allows
> the source to clean up better, especially with RDMA.
>
> Partial fix for https://bugs.launchpad.net/qemu/+bug/1545052
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid
2017-07-13 11:56 ` [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid Dr. David Alan Gilbert (git)
@ 2017-07-14 3:36 ` Peter Xu
2017-07-14 11:13 ` Dr. David Alan Gilbert
0 siblings, 1 reply; 9+ messages in thread
From: Peter Xu @ 2017-07-14 3:36 UTC (permalink / raw)
To: Dr. David Alan Gilbert (git); +Cc: qemu-devel, quintela, lvivier
On Thu, Jul 13, 2017 at 12:56:47PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> When waiting for a WRID, if the other side dies we end up waiting
> forever with no way to cancel the migration.
> Cure this by poll()ing the fd first with a timeout and checking
> error flags and migration state.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/rdma.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 46 insertions(+), 6 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 6111e10c70..30f5542b49 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1466,6 +1466,50 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
> return 0;
> }
>
> +/* Wait for activity on the completion channel.
> + * Returns 0 on success, non-0 on error.
> + */
> +static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
> +{
> + /*
> + * Coroutine doesn't start until migration_fd_process_incoming()
> + * so don't yield unless we know we're running inside of a coroutine.
> + */
> + if (rdma->migration_started_on_destination) {
> + yield_until_fd_readable(rdma->comp_channel->fd);
> + } else {
> + /* This is the source side, we're in a separate thread
> + * or destination prior to migration_fd_process_incoming()
> + * we can't yield; so we have to poll the fd.
> + * But we need to be able to handle 'cancel' or an error
> + * without hanging forever.
> + */
> + while (!rdma->error_state && !rdma->received_error) {
> + GPollFD pfds[1];
> + pfds[0].fd = rdma->comp_channel->fd;
> + pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> + /* 0.1s timeout, should be fine for a 'cancel' */
> + switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
> + case 1: /* fd active */
> + return 0;
> +
> + case 0: /* Timeout, go around again */
> + break;
> +
> + default: /* Error of some type */
> + return -1;
> + }
> +
> + if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
> + /* Bail out and let the cancellation happen */
> + return -EPIPE;
> + }
> + }
> + }
> +
> + return rdma->error_state || rdma->received_error;
Just to note that this operation will return either 0 or 1, but not
anything <0 (most RDMA code uses <0 for errors, and error_state
should always be <= 0, iiuc).
But given the comment on this function, it's fine.
> +}
> +
> /*
> * Block until the next work request has completed.
> *
> @@ -1513,12 +1557,8 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
> }
>
> while (1) {
> - /*
> - * Coroutine doesn't start until migration_fd_process_incoming()
> - * so don't yield unless we know we're running inside of a coroutine.
> - */
> - if (rdma->migration_started_on_destination) {
> - yield_until_fd_readable(rdma->comp_channel->fd);
> + if (qemu_rdma_wait_comp_channel(rdma)) {
Do we want something like:
ret = -EIO;
here? Or capture the return code of qemu_rdma_wait_comp_channel() (but
then we need to make sure its return code is <0)?
I guess "ret" is still zero, so this function will return
zero as well even if it timed out?
Thanks,
--
Peter Xu
* Re: [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid
2017-07-14 3:36 ` Peter Xu
@ 2017-07-14 11:13 ` Dr. David Alan Gilbert
0 siblings, 0 replies; 9+ messages in thread
From: Dr. David Alan Gilbert @ 2017-07-14 11:13 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, quintela, lvivier
* Peter Xu (peterx@redhat.com) wrote:
> On Thu, Jul 13, 2017 at 12:56:47PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > When waiting for a WRID, if the other side dies we end up waiting
> > forever with no way to cancel the migration.
> > Cure this by poll()ing the fd first with a timeout and checking
> > error flags and migration state.
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> > migration/rdma.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 46 insertions(+), 6 deletions(-)
> >
> > diff --git a/migration/rdma.c b/migration/rdma.c
> > index 6111e10c70..30f5542b49 100644
> > --- a/migration/rdma.c
> > +++ b/migration/rdma.c
> > @@ -1466,6 +1466,50 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
> > return 0;
> > }
> >
> > +/* Wait for activity on the completion channel.
> > + * Returns 0 on success, none-0 on error.
> > + */
> > +static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
> > +{
> > + /*
> > + * Coroutine doesn't start until migration_fd_process_incoming()
> > + * so don't yield unless we know we're running inside of a coroutine.
> > + */
> > + if (rdma->migration_started_on_destination) {
> > + yield_until_fd_readable(rdma->comp_channel->fd);
> > + } else {
> > + /* This is the source side, we're in a separate thread
> > + * or destination prior to migration_fd_process_incoming()
> > + * we can't yield; so we have to poll the fd.
> > + * But we need to be able to handle 'cancel' or an error
> > + * without hanging forever.
> > + */
> > + while (!rdma->error_state && !rdma->received_error) {
> > + GPollFD pfds[1];
> > + pfds[0].fd = rdma->comp_channel->fd;
> > + pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> > + /* 0.1s timeout, should be fine for a 'cancel' */
> > + switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
> > + case 1: /* fd active */
> > + return 0;
> > +
> > + case 0: /* Timeout, go around again */
> > + break;
> > +
> > + default: /* Error of some type */
> > + return -1;
> > + }
> > +
> > + if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
> > + /* Bail out and let the cancellation happen */
> > + return -EPIPE;
> > + }
> > + }
> > + }
> > +
> > + return rdma->error_state || rdma->received_error;
>
> Just to note that this operation will return either 0 or 1, but not
> anything <0 (most RDMA codes are using <0 as error, and error_state
> should be <= 0 always iiuc).
>
> But as the comment for this function, it's fine.
It's interesting in that 'error_state' is a -ve error code, whereas
received_error is just a boolean; I've changed this to:
    if (rdma->received_error) {
        return -EPIPE;
    }
    return rdma->error_state;
to make it a bit more consistent.
> > +}
> > +
> > /*
> > * Block until the next work request has completed.
> > *
> > @@ -1513,12 +1557,8 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
> > }
> >
> > while (1) {
> > - /*
> > - * Coroutine doesn't start until migration_fd_process_incoming()
> > - * so don't yield unless we know we're running inside of a coroutine.
> > - */
> > - if (rdma->migration_started_on_destination) {
> > - yield_until_fd_readable(rdma->comp_channel->fd);
> > + if (qemu_rdma_wait_comp_channel(rdma)) {
>
> Do we want something like:
>
> ret = -EIO;
>
> Here? Or capture the return code of qemu_rdma_wait_comp_channel() (but
> then we need to make sure its return code is <0)?
>
> I guess "ret" is still zero, and it means this function will return
> with zero as well even timed out?
But rdma->error_state is set if wait_comp_channel fails;
(actually it wasn't in one case, where the poll fails, I've just fixed
that and will repost).
Dave
> Thanks,
>
> --
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK