From: "Michael R. Hines" <mrhines@linux.vnet.ibm.com>
To: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: aliguori@us.ibm.com, quintela@redhat.com, qemu-devel@nongnu.org,
owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com,
gokul@us.ibm.com, pbonzini@redhat.com, chegu_vinod@hp.com,
knoel@redhat.com
Subject: Re: [Qemu-devel] [PATCH v11 11/15] rdma: core logic
Date: Tue, 25 Jun 2013 14:38:25 -0400
Message-ID: <51C9E3A1.9060107@linux.vnet.ibm.com>
In-Reply-To: <20130625163149.GA10901@dhcp-192-168-178-175.profitbricks.localdomain>
On 06/25/2013 12:31 PM, Vasilis Liaskovitis wrote:
> Hi,
>
> On Mon, Jun 24, 2013 at 09:58:01PM -0400, mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
> [...]
>> +/*
>> + * Put in the log file which RDMA device was opened and the details
>> + * associated with that device.
>> + */
>> +static void qemu_rdma_dump_id(const char *who, struct ibv_context *verbs)
>> +{
>> +    printf("%s RDMA Device opened: kernel name %s "
>> +           "uverbs device name %s, "
>> +           "infiniband_verbs class device path %s,"
>> +           " infiniband class device path %s\n",
>> +           who,
>> +           verbs->device->name,
>> +           verbs->device->dev_name,
>> +           verbs->device->dev_path,
>> +           verbs->device->ibdev_path);
>> +}
> see below
>
>> +static int qemu_rdma_dest_init(RDMAContext *rdma, Error **errp)
>> +{
>> +    int ret = -EINVAL, idx;
>> +    struct sockaddr_in sin;
>> +    struct rdma_cm_id *listen_id;
>> +    char ip[40] = "unknown";
>> +
>> +    for (idx = 0; idx < RDMA_CONTROL_MAX_WR; idx++) {
>> +        rdma->wr_data[idx].control_len = 0;
>> +        rdma->wr_data[idx].control_curr = NULL;
>> +    }
>> +
>> +    if (rdma->host == NULL) {
>> +        ERROR(errp, "RDMA host is not set!\n");
>> +        rdma->error_state = -EINVAL;
>> +        return -1;
>> +    }
>> +    /* create CM channel */
>> +    rdma->channel = rdma_create_event_channel();
>> +    if (!rdma->channel) {
>> +        ERROR(errp, "could not create rdma event channel\n");
>> +        rdma->error_state = -EINVAL;
>> +        return -1;
>> +    }
>> +
>> +    /* create CM id */
>> +    ret = rdma_create_id(rdma->channel, &listen_id, NULL, RDMA_PS_TCP);
>> +    if (ret) {
>> +        ERROR(errp, "could not create cm_id!\n");
>> +        goto err_dest_init_create_listen_id;
>> +    }
>> +
>> +    memset(&sin, 0, sizeof(sin));
>> +    sin.sin_family = AF_INET;
>> +    sin.sin_port = htons(rdma->port);
>> +
>> +    if (rdma->host && strcmp("", rdma->host)) {
>> +        struct hostent *dest_addr;
>> +        dest_addr = gethostbyname(rdma->host);
>> +        if (!dest_addr) {
>> +            ERROR(errp, "migration could not gethostbyname!\n");
>> +            ret = -EINVAL;
>> +            goto err_dest_init_bind_addr;
>> +        }
>> +        memcpy(&sin.sin_addr.s_addr, dest_addr->h_addr,
>> +               dest_addr->h_length);
>> +        inet_ntop(AF_INET, dest_addr->h_addr, ip, sizeof ip);
>> +    } else {
>> +        sin.sin_addr.s_addr = INADDR_ANY;
>> +    }
>> +
>> +    DPRINTF("%s => %s\n", rdma->host, ip);
>> +
>> +    ret = rdma_bind_addr(listen_id, (struct sockaddr *)&sin);
>> +    if (ret) {
>> +        ERROR(errp, "Error: could not rdma_bind_addr!\n");
>> +        goto err_dest_init_bind_addr;
>> +    }
>> +
>> +    rdma->listen_id = listen_id;
>> +    if (listen_id->verbs) {
>> +        rdma->verbs = listen_id->verbs;
>> +    }
>> +    qemu_rdma_dump_id("dest_init", rdma->verbs);
>
> I wonder if you have ever hit the case where rdma_bind_addr() does not set the
> verbs structure in listen_id because we are binding to the loopback device (also
> see linux kernel commit 8523c048).
> I keep hitting this case on my destination VM ("incoming x-rdma:host:port").
>
> Then I think qemu_rdma_dump_id can segfault trying to dereference a null verbs
> structure. The dump_id function should check for non-NULL verbs argument,
> or the dump should be made only in the (verbs != NULL) if clause.
>
> Disabling the dump_id above, I have rdma_resolve_addr() problems on the source
> VM side (getting RDMA_CM_EVENT_ADDR_ERROR instead of
> RDMA_CM_EVENT_ADDR_RESOLVED).
>
> I assume that is because of the null verbs structure destination problem above.
> qemu_rdma_dest_prepare() will always fail with a NULL verbs argument:
Good catch, thank you. I'll fix this in the next version.
I had never tried binding to localhost before.
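
A minimal sketch of the kind of guard suggested above, so the dump
cannot dereference a NULL verbs pointer. This is not the code from the
patch: the demo_* types and demo_rdma_dump_id() are hypothetical
stand-ins that carry only the four fields the dump reads, so the
example builds without the RDMA headers, and it formats into a caller
buffer instead of calling printf() so the result can be checked.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the verbs device and context, reduced to
 * the fields that the dump prints. Not the real libibverbs types.
 */
struct demo_device {
    const char *name;
    const char *dev_name;
    const char *dev_path;
    const char *ibdev_path;
};

struct demo_context {
    struct demo_device *device;
};

/*
 * Describe the opened device in buf.
 * Returns 0 on success, or -1 when there is no verbs context to
 * describe, in which case a placeholder message is written instead.
 */
int demo_rdma_dump_id(const char *who, const struct demo_context *verbs,
                      char *buf, size_t len)
{
    if (!verbs || !verbs->device) {
        snprintf(buf, len, "%s RDMA Device opened: no verbs context yet",
                 who);
        return -1;
    }

    snprintf(buf, len,
             "%s RDMA Device opened: kernel name %s "
             "uverbs device name %s, "
             "infiniband_verbs class device path %s, "
             "infiniband class device path %s",
             who,
             verbs->device->name,
             verbs->device->dev_name,
             verbs->device->dev_path,
             verbs->device->ibdev_path);
    return 0;
}
```

The alternative mentioned in the review, calling the dump only inside
the existing if (listen_id->verbs) block, avoids the extra check but
logs nothing at all in the loopback case.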
>> +
>> +static int qemu_rdma_dest_prepare(RDMAContext *rdma, Error **errp)
>> +{
>> +    int ret;
>> +    int idx;
>> +
>> +    if (!rdma->verbs) {
>> +        ERROR(errp, "no verbs context!\n");
>> +        return 0;
>> +    }
> It is first called from rdma_start_incoming_migration() and will fail with the
> loopback binding case (rdma->verbs == NULL).
>
> However, later qemu_rdma_accept() will check against the incoming cm_event
> verbs structure and set the RDMAContext's verb struct, calling
> qemu_rdma_dest_prepare with that struct:
>
> [...]
>> +static int qemu_rdma_accept(RDMAContext *rdma)
> [...]
>> +    if (!rdma->verbs) {
>> +        rdma->verbs = verbs;
>> +        /*
>> +         * Cannot propagate errp, as there is no error pointer
>> +         * to be propagated.
>> +         */
>> +        ret = qemu_rdma_dest_prepare(rdma, NULL);
>> +        if (ret) {
>> +            fprintf(stderr, "rdma migration: error preparing dest!\n");
>> +            goto err_rdma_dest_wait;
>> +        }
> Are these two cases intentionally different?
Another good catch. We should definitely allow loopback RDMA without any
issues. I will fix this as well in the next version.
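
One way the two call sites could be made consistent is sketched below,
under stated assumptions: prepare at listen time only when the bind
already produced a verbs context, otherwise defer to accept time and
take the context from the connection request. This is only an
illustration of that ordering, not what the next version will
necessarily do. All demo_* names are hypothetical, and the prepared
flag is an assumption added for the sketch; RDMAContext in the patch
has no such field.

```c
#include <stddef.h>

/* Hypothetical stand-in for the verbs context. */
struct demo_verbs {
    int id;
};

/* Hypothetical stand-in for the destination-side migration state. */
struct demo_rdma {
    struct demo_verbs *verbs;  /* NULL until a device is known */
    int prepared;              /* set once per-device setup has run */
};

/*
 * Per-device setup. Unlike the quoted code, a missing verbs context is
 * reported as a failure (-1) rather than as 0.
 */
int demo_dest_prepare(struct demo_rdma *rdma)
{
    if (!rdma->verbs) {
        return -1;
    }
    rdma->prepared = 1;
    return 0;
}

/*
 * Listen-time setup. bound_verbs is whatever the bind produced; it may
 * be NULL, as reported for the loopback case, and that is not treated
 * as an error here.
 */
int demo_dest_init(struct demo_rdma *rdma, struct demo_verbs *bound_verbs)
{
    rdma->verbs = bound_verbs;
    rdma->prepared = 0;
    if (!rdma->verbs) {
        return 0;              /* defer preparation to accept time */
    }
    return demo_dest_prepare(rdma);
}

/*
 * Accept-time setup. Adopts the verbs context carried by the connection
 * request if none is known yet, then prepares exactly once.
 */
int demo_accept(struct demo_rdma *rdma, struct demo_verbs *event_verbs)
{
    if (!rdma->verbs) {
        rdma->verbs = event_verbs;
    }
    if (!rdma->prepared) {
        return demo_dest_prepare(rdma);
    }
    return 0;
}
```

With this ordering a loopback bind succeeds at listen time and the
per-device setup runs once in the accept path, while a bind that
already has a device behaves as before.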
- Michael