From: Fabiano Rosas <farosas@suse.de>
To: "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"leobras@redhat.com" <leobras@redhat.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>
Subject: Re: [PATCH 2/2] migration/rdma: zore out head.repeat to make the error more clear
Date: Thu, 21 Sep 2023 09:29:31 -0300
Message-ID: <871qervhec.fsf@suse.de>
In-Reply-To: <2d876f0c-8726-81df-3a62-2d79a6b44ba8@fujitsu.com>

"Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> writes:

> On 20/09/2023 21:01, Fabiano Rosas wrote:
>> Li Zhijian <lizhijian@fujitsu.com> writes:
>> 
>>> From: Li Zhijian <lizhijian@cn.fujitsu.com>
>>>
>>> Previously, we got a confusing error that complained about
>>> RDMAControlHeader.repeat:
>>> qemu-system-x86_64: rdma: Too many requests in this message (3638950032).Bailing.
>>>
>>> Actually, it is caused by an unexpected RDMAControlHeader.type.
>>> After this patch, the error becomes:
>>> qemu-system-x86_64: Unknown control message QEMU FILE
>>>
>>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>>> ---
>>>   migration/rdma.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/migration/rdma.c b/migration/rdma.c
>>> index a2a3db35b1..3073d9953c 100644
>>> --- a/migration/rdma.c
>>> +++ b/migration/rdma.c
>>> @@ -2812,7 +2812,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
>>>           size_t remaining = iov[i].iov_len;
>>>           uint8_t * data = (void *)iov[i].iov_base;
>>>           while (remaining) {
>>> -            RDMAControlHeader head;
>>> +            RDMAControlHeader head = {};
>>>   
>>>               len = MIN(remaining, RDMA_SEND_INCREMENT);
>>>               remaining -= len;
>> 
>
>             RDMAControlHeader head = {};
>
>             len = MIN(remaining, RDMA_SEND_INCREMENT);
>             remaining -= len;
>
>             head.len = len;
>             head.type = RDMA_CONTROL_QEMU_FILE;
>
>             ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL);
>
>> I'm struggling to see how head is used before we set the type a couple
>> of lines below. Could you expand on it?
>
>
> IIUC, head is used for both the common migration control path and the RDMA-specific control path.
>
> hook_stage(RAM_SAVE_FLAG_HOOK) {
>     rdma_hook_process(qemu_rdma_registration_handle) {
>        do {
>            // This is RDMA's own control block; it should not be disturbed by the common migration control path.
>            // head is extracted and processed here.
>            // If qio_channel_rdma_writev() sends RDMA_CONTROL_QEMU_FILE, that is an unexpected message for this block.
>            // head.repeat is examined before head.type, so an uninitialized repeat produces a misleading error here.
>        } while (!RDMA_CONTROL_REGISTER_FINISHED && !error)
>     }
> }
>
>
> When qio_channel_rdma_writev() is used for the common migration control path, repeat is unused and is never examined.
>
> With this patch, we can quickly identify the cause.
>

Ah, right. Somehow I interpreted the commit message as meaning the
'type' field was bogus. But it's the 'repeat' field that causes the
issue. Thanks for the explanation.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
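
For readers following the thread, here is a minimal sketch of the failure
mode discussed above. It is not QEMU code: the struct layout, the Sketch*
names, the constant values and the limit of 4096 are all assumptions made
for illustration. The sender fills only len and type, and the receiver
validates repeat before type, so leftover bytes in repeat hide the real
problem (an unexpected type). The 0xA5 fill stands in for stack garbage,
since reading a truly uninitialized field would be undefined behavior.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for RDMAControlHeader (assumed layout, not QEMU's). */
typedef struct {
    uint32_t len;
    uint32_t type;
    uint32_t repeat;
} SketchHeader;

/* Hypothetical constants, used by this sketch only. */
enum { SKETCH_QEMU_FILE = 1, SKETCH_REGISTER_REQUEST = 2 };
enum { SKETCH_MAX_REPEAT = 4096 };

typedef enum {
    SKETCH_OK,
    SKETCH_TOO_MANY_REQUESTS,
    SKETCH_UNKNOWN_TYPE
} SketchResult;

/*
 * Sender side: sets only len and type, like the loop quoted above.
 * zero_init == 0 simulates the old "RDMAControlHeader head;" by filling
 * the struct with 0xA5 bytes; zero_init != 0 simulates "head = {}".
 */
SketchHeader sketch_make_file_header(uint32_t len, int zero_init)
{
    SketchHeader head;

    memset(&head, zero_init ? 0x00 : 0xA5, sizeof(head));
    head.len = len;
    head.type = SKETCH_QEMU_FILE;
    return head;
}

/* Receiver side: repeat is checked before type, as described in the thread. */
SketchResult sketch_handle_registration(const SketchHeader *head)
{
    if (head->repeat > SKETCH_MAX_REPEAT) {
        return SKETCH_TOO_MANY_REQUESTS;   /* misleading when repeat is garbage */
    }

    switch (head->type) {
    case SKETCH_REGISTER_REQUEST:
        return SKETCH_OK;
    default:
        return SKETCH_UNKNOWN_TYPE;        /* the error that names the real cause */
    }
}
```

With zero_init == 0 the handler returns SKETCH_TOO_MANY_REQUESTS; with the
zeroed header it returns SKETCH_UNKNOWN_TYPE, which corresponds to the
clearer "Unknown control message" error the patch aims for.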



