From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: aliguori@us.ibm.com, quintela@redhat.com, qemu-devel@nongnu.org,
owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com,
gokul@us.ibm.com, pbonzini@redhat.com, chegu_vinod@hp.com,
knoel@redhat.com
Subject: Re: [Qemu-devel] [PATCH v11 11/15] rdma: core logic
Date: Tue, 25 Jun 2013 14:38:25 -0400
Message-ID: <51C9E3A1.9060107@linux.vnet.ibm.com>
In-Reply-To: <20130625163149.GA10901@dhcp-192-168-178-175.profitbricks.localdomain>

On 06/25/2013 12:31 PM, Vasilis Liaskovitis wrote:
> Hi,
>
> On Mon, Jun 24, 2013 at 09:58:01PM -0400, mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
> [...]
>> +/*
>> + * Put in the log file which RDMA device was opened and the details
>> + * associated with that device.
>> + */
>> +static void qemu_rdma_dump_id(const char *who, struct ibv_context *verbs)
>> +{
>> + printf("%s RDMA Device opened: kernel name %s "
>> + "uverbs device name %s, "
>> + "infiniband_verbs class device path %s,"
>> + " infiniband class device path %s\n",
>> + who,
>> + verbs->device->name,
>> + verbs->device->dev_name,
>> + verbs->device->dev_path,
>> + verbs->device->ibdev_path);
>> +}
> see below
>
>> +static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>> +{
>> + int ret = -EINVAL, idx;
>> + struct sockaddr_in sin;
>> + struct rdma_cm_id *listen_id;
>> + char ip[40] = "unknown";
>> +
>> + for (idx = 0; idx < RDMA_CONTROL_MAX_WR; idx++) {
>> + rdma->wr_data[idx].control_len = 0;
>> + rdma->wr_data[idx].control_curr = NULL;
>> + }
>> +
>> + if (rdma->host == NULL) {
>> + ERROR(errp, "RDMA host is not set!\n");
>> + rdma->error_state = -EINVAL;
>> + return -1;
>> + }
>> + /* create CM channel */
>> + rdma->channel = rdma_create_event_channel();
>> + if (!rdma->channel) {
>> + ERROR(errp, "could not create rdma event channel\n");
>> + rdma->error_state = -EINVAL;
>> + return -1;
>> + }
>> +
>> + /* create CM id */
>> + ret = rdma_create_id(rdma->channel, &listen_id, NULL, RDMA_PS_TCP);
>> + if (ret) {
>> + ERROR(errp, "could not create cm_id!\n");
>> + goto err_dest_init_create_listen_id;
>> + }
>> +
>> + memset(&sin, 0, sizeof(sin));
>> + sin.sin_family = AF_INET;
>> + sin.sin_port = htons(rdma->port);
>> +
>> + if (rdma->host && strcmp("", rdma->host)) {
>> + struct hostent *dest_addr;
>> + dest_addr = gethostbyname(rdma->host);
>> + if (!dest_addr) {
>> + ERROR(errp, "migration could not gethostbyname!\n");
>> + ret = -EINVAL;
>> + goto err_dest_init_bind_addr;
>> + }
>> + memcpy(&sin.sin_addr.s_addr, dest_addr->h_addr,
>> + dest_addr->h_length);
>> + inet_ntop(AF_INET, dest_addr->h_addr, ip, sizeof ip);
>> + } else {
>> + sin.sin_addr.s_addr = INADDR_ANY;
>> + }
>> +
>> + DPRINTF("%s => %s\n", rdma->host, ip);
>> +
>> + ret = rdma_bind_addr(listen_id, (struct sockaddr *)&sin);
>> + if (ret) {
>> + ERROR(errp, "Error: could not rdma_bind_addr!\n");
>> + goto err_dest_init_bind_addr;
>> + }
>> +
>> + rdma->listen_id = listen_id;
>> + if (listen_id->verbs) {
>> + rdma->verbs = listen_id->verbs;
>> + }
>> + qemu_rdma_dump_id("dest_init", rdma->verbs);
>
> I wonder if you have ever hit the case where rdma_bind_addr() does not set the
> verbs structure in listen_id because we are binding to the loopback device (also
> see linux kernel commit 8523c048).
> I keep hitting this case on my destination VM ("incoming x-rdma:host:port")
>
> Then I think qemu_rdma_dump_id can segfault trying to dereference a null verbs
> structure. The dump_id function should check for non-NULL verbs argument,
> or the dump should be made only in the (verbs != NULL) if clause.
>
> Disabling the dump_id above, I have rdma_resolve_addr() problems on the source
> VM side (getting RDMA_CM_EVENT_ADDR_ERROR instead of
> RDMA_CM_EVENT_ADDR_RESOLVED).
>
> I assume that is because of the null verbs structure destination problem above.
> qemu_rdma_dest_prepare() will always fail with a NULL verbs argument:
Good catch, thank you. I'll fix this in the next version.
I had never tried binding to localhost before.
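A minimal sketch of the NULL guard suggested above, written against hypothetical stand-in types (sketch_device, sketch_verbs) rather than the real libibverbs structures, and formatting into a buffer instead of calling printf() so that it is easy to check. This only illustrates the idea; it is not the code that will go into the next version:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct ibv_device / struct ibv_context. */
struct sketch_device {
    const char *name;
    const char *dev_name;
};

struct sketch_verbs {
    struct sketch_device *device;
};

/*
 * Format the device description into buf.
 * Returns the snprintf() result, or -1 when there is no verbs context
 * (or no device) to describe; in that case buf holds an empty string.
 */
static int sketch_format_id(char *buf, size_t len, const char *who,
                            const struct sketch_verbs *verbs)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    /* The loopback bind case: nothing to dereference, so bail out. */
    if (!verbs || !verbs->device) {
        return -1;
    }

    return snprintf(buf, len,
                    "%s RDMA Device opened: kernel name %s, "
                    "uverbs device name %s",
                    who, verbs->device->name, verbs->device->dev_name);
}
```

The alternative mentioned above, making the call only inside the existing (verbs != NULL) branch, would avoid the crash equally well; the guard inside the function just protects every future caller too.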
>> +
>> +static int qemu_rdma_dest_prepare(RDMAContext *rdma, Error **errp)
>> +{
>> + int ret;
>> + int idx;
>> +
>> + if (!rdma->verbs) {
>> + ERROR(errp, "no verbs context!\n");
>> + return 0;
>> + }
> It is first called from rdma_start_incoming_migration() and will fail with the
> loopback binding case (rdma->verbs == NULL).
>
> however later qemu_rdma_accept() will check against the incoming cm_event
> verbs structure and set the RDMAContext's verb struct, calling
> qemu_rdma_dest_prepare with that struct:
>
> [...]
>> +static int qemu_rdma_accept(RDMAContext *rdma)
> [...]
>> + if (!rdma->verbs) {
>> + rdma->verbs = verbs;
>> + /*
>> + * Cannot propagate errp, as there is no error pointer
>> + * to be propagated.
>> + */
>> + ret = qemu_rdma_dest_prepare(rdma, NULL);
>> + if (ret) {
>> + fprintf(stderr, "rdma migration: error preparing dest!\n");
>> + goto err_rdma_dest_wait;
>> + }
> Are these two cases intentionally different?
Another good catch. We should definitely allow loopback RDMA without any
issues. I will fix this as well in the next version.
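One possible shape for making the two call sites consistent, sketched with a hypothetical stand-in context (sketch_ctx) and hypothetical function names; the real RDMAContext and the eventual fix may well look different. The assumption here is that preparation is simply deferred when the bind yields no verbs context, and that the prepare step reports a missing context as an error rather than returning 0:

```c
#include <stddef.h>

/* Hypothetical stand-in for RDMAContext; verbs is opaque here. */
struct sketch_ctx {
    void *verbs;
    int prepared;
};

/* Fails with -1 when there is no verbs context, instead of returning 0. */
static int sketch_dest_prepare(struct sketch_ctx *ctx)
{
    if (!ctx->verbs) {
        return -1;
    }
    ctx->prepared = 1;
    return 0;
}

/* Bind time: prepare only if the bind produced a verbs context. */
static int sketch_dest_init(struct sketch_ctx *ctx, void *bound_verbs)
{
    ctx->verbs = bound_verbs;
    ctx->prepared = 0;
    if (!ctx->verbs) {
        return 0; /* deferred until accept, e.g. after a loopback bind */
    }
    return sketch_dest_prepare(ctx);
}

/* Accept time: adopt verbs from the connect request if still missing. */
static int sketch_accept(struct sketch_ctx *ctx, void *event_verbs)
{
    if (!ctx->verbs) {
        ctx->verbs = event_verbs;
    }
    if (ctx->prepared) {
        return 0; /* already prepared at bind time */
    }
    return sketch_dest_prepare(ctx);
}
```

With this arrangement a loopback bind is not an error at init time, and a missing verbs context is reported exactly once, at accept time, if the connect request does not supply one either.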
- Michael