qemu-devel.nongnu.org archive mirror
From: "Gonglei (Arei)" via <qemu-devel@nongnu.org>
To: Jinpu Wang <jinpu.wang@ionos.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"yu.zhang@ionos.com" <yu.zhang@ionos.com>,
	"mgalaxy@akamai.com" <mgalaxy@akamai.com>,
	"elmar.gerdes@ionos.com" <elmar.gerdes@ionos.com>,
	zhengchuan <zhengchuan@huawei.com>,
	"berrange@redhat.com" <berrange@redhat.com>,
	"armbru@redhat.com" <armbru@redhat.com>,
	"lizhijian@fujitsu.com" <lizhijian@fujitsu.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	Xiexiangyou <xiexiangyou@huawei.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"lixiao (H)" <lixiao91@huawei.com>,
	Wangjialin <wangjialin23@huawei.com>
Subject: RE: [PATCH 0/6] refactor RDMA live migration based on rsocket API
Date: Fri, 7 Jun 2024 08:28:29 +0000
Message-ID: <b637ce3cac16409c83a3391b05011eec@huawei.com>
In-Reply-To: <CAMGffEkUd2EOS3+PQ9Yfp=8V1pZB_emo7gcmxmvOX=iWVG6Axg@mail.gmail.com>



> -----Original Message-----
> From: Jinpu Wang [mailto:jinpu.wang@ionos.com]
> Sent: Friday, June 7, 2024 1:54 PM
> To: Gonglei (Arei) <arei.gonglei@huawei.com>
> Cc: qemu-devel@nongnu.org; peterx@redhat.com; yu.zhang@ionos.com;
> mgalaxy@akamai.com; elmar.gerdes@ionos.com; zhengchuan
> <zhengchuan@huawei.com>; berrange@redhat.com; armbru@redhat.com;
> lizhijian@fujitsu.com; pbonzini@redhat.com; mst@redhat.com; Xiexiangyou
> <xiexiangyou@huawei.com>; linux-rdma@vger.kernel.org; lixiao (H)
> <lixiao91@huawei.com>; Wangjialin <wangjialin23@huawei.com>
> Subject: Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API
> 
> Hi Gonglei, hi folks on the list,
> 
> On Tue, Jun 4, 2024 at 2:14 PM Gonglei <arei.gonglei@huawei.com> wrote:
> >
> > From: Jialin Wang <wangjialin23@huawei.com>
> >
> > Hi,
> >
> > This patch series attempts to refactor RDMA live migration by
> > introducing a new QIOChannelRDMA class based on the rsocket API.
> >
> > The /usr/include/rdma/rsocket.h header provides a higher-level rsocket
> > API that is a 1-1 match of the normal kernel 'sockets' API. It hides
> > the details of the RDMA protocol inside rsocket and allows us to add
> > support for some modern features like multifd more easily.
> >
> > Here is the previous discussion on refactoring RDMA live migration
> > using the rsocket API:
> >
> > https://lore.kernel.org/qemu-devel/20240328130255.52257-1-philmd@linaro.org/
> >
> > We have encountered some bugs when using rsocket and plan to submit
> > them to the rdma-core community.
> >
> > In addition, using rsocket makes our programming more convenient, but
> > it must be noted that this method introduces multiple memory copies,
> > so some performance degradation is to be expected. We hope that
> > friends with RDMA network cards can help verify this. Thank you!
>
> First, thanks for the effort. We are running migration tests on our IB fabric
> with different generations of Mellanox HCAs. The migration works OK, but there
> are a few failures; Yu will share the results separately later.
> 

Thank you so much. 

> The one blocker for the change is that the old implementation and the new
> rsocket implementation cannot talk to each other, because they use different
> wire protocols during connection establishment.
> E.g. the old RDMA migration uses a special control message during the
> migration flow, while rsocket uses a different control message, so there is
> no way to migrate a VM using the RDMA transport from a version predating the
> rsocket patchset to a new version with the rsocket implementation.
> 
> Probably we should keep both implementations for a while, mark the old
> implementation as deprecated, promote the new implementation, and
> highlight in the docs that they are not compatible.
> 

IMO it makes sense. What's your opinion, @Peter?


Regards,
-Gonglei

> Regards!
> Jinpu
> 
> 
> 
> >
> > Jialin Wang (6):
> >   migration: remove RDMA live migration temporarily
> >   io: add QIOChannelRDMA class
> >   io/channel-rdma: support working in coroutine
> >   tests/unit: add test-io-channel-rdma.c
> >   migration: introduce new RDMA live migration
> >   migration/rdma: support multifd for RDMA migration
> >
> >  docs/rdma.txt                     |  420 ---
> >  include/io/channel-rdma.h         |  165 ++
> >  io/channel-rdma.c                 |  798 ++++++
> >  io/meson.build                    |    1 +
> >  io/trace-events                   |   14 +
> >  meson.build                       |    6 -
> >  migration/meson.build             |    3 +-
> >  migration/migration-stats.c       |    5 +-
> >  migration/migration-stats.h       |    4 -
> >  migration/migration.c             |   13 +-
> >  migration/migration.h             |    9 -
> >  migration/multifd.c               |   10 +
> >  migration/options.c               |   16 -
> >  migration/options.h               |    2 -
> >  migration/qemu-file.c             |    1 -
> >  migration/ram.c                   |   90 +-
> >  migration/rdma.c                  | 4205 +----------------------------
> >  migration/rdma.h                  |   67 +-
> >  migration/savevm.c                |    2 +-
> >  migration/trace-events            |   68 +-
> >  qapi/migration.json               |   13 +-
> >  scripts/analyze-migration.py      |    3 -
> >  tests/unit/meson.build            |    1 +
> >  tests/unit/test-io-channel-rdma.c |  276 ++
> >  24 files changed, 1360 insertions(+), 4832 deletions(-)
> >  delete mode 100644 docs/rdma.txt
> >  create mode 100644 include/io/channel-rdma.h
> >  create mode 100644 io/channel-rdma.c
> >  create mode 100644 tests/unit/test-io-channel-rdma.c
> >
> > --
> > 2.43.0
> >
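
Editorial sketch (not part of the patch series): the cover letter's remark that rsocket is a 1-1 match of the kernel sockets API can be illustrated with a small ops table. Everything named here (chan_ops, tcp_ops, chan_write_all, chan_read_all) is hypothetical and is not taken from io/channel-rdma.c. The block compiles against plain sockets only; the rsocket variant is shown in a comment and assumes rdma-core's <rdma/rsocket.h>, whose rsend(), rrecv() and rclose() take the same arguments as send(), recv() and close().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical ops table: one slot per socket call the channel needs. */
struct chan_ops {
    ssize_t (*send)(int fd, const void *buf, size_t len, int flags);
    ssize_t (*recv)(int fd, void *buf, size_t len, int flags);
    int (*close)(int fd);
};

/* Plain kernel sockets. */
static const struct chan_ops tcp_ops = { send, recv, close };

/*
 * Assumption: with rdma-core installed, the RDMA flavour would only swap
 * the function pointers (not compiled here):
 *
 *   #include <rdma/rsocket.h>
 *   static const struct chan_ops rdma_ops = { rsend, rrecv, rclose };
 */

/* Send exactly len bytes, retrying on short writes and EINTR. */
static int chan_write_all(const struct chan_ops *ops, int fd,
                          const void *buf, size_t len)
{
    const char *p = buf;

    assert(ops && ops->send);
    while (len > 0) {
        ssize_t n = ops->send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Receive exactly len bytes; returns -1 on error or early EOF. */
static int chan_read_all(const struct chan_ops *ops, int fd,
                         void *buf, size_t len)
{
    char *p = buf;

    assert(ops && ops->recv);
    while (len > 0) {
        ssize_t n = ops->recv(fd, p, len, 0);
        if (n < 0 && errno == EINTR) {
            continue;
        }
        if (n <= 0) {
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The calling code stays the same whichever table is passed in, which is the convenience the cover letter describes. The sketch does not show the extra memory copies or the connection-time wire protocol discussed in this thread.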



Thread overview: 55+ messages
2024-06-04 12:14 [PATCH 0/6] refactor RDMA live migration based on rsocket API Gonglei via
2024-06-04 12:14 ` [PATCH 1/6] migration: remove RDMA live migration temporarily Gonglei via
2024-06-04 14:01   ` David Hildenbrand
2024-06-05 10:02     ` Gonglei (Arei) via
2024-06-10 11:45   ` Markus Armbruster
2024-06-04 12:14 ` [PATCH 2/6] io: add QIOChannelRDMA class Gonglei via
2024-06-10  6:54   ` Jinpu Wang
2024-06-04 12:14 ` [PATCH 3/6] io/channel-rdma: support working in coroutine Gonglei via
2024-06-06 13:34   ` Haris Iqbal
2024-06-07  8:45     ` Gonglei (Arei) via
2024-06-07 10:01       ` Haris Iqbal
2024-06-07  9:04   ` Daniel P. Berrangé
2024-06-07  9:28     ` Gonglei (Arei) via
2024-06-04 12:14 ` [PATCH 4/6] tests/unit: add test-io-channel-rdma.c Gonglei via
2024-06-04 12:14 ` [PATCH 5/6] migration: introduce new RDMA live migration Gonglei via
2024-06-04 12:14 ` [PATCH 6/6] migration/rdma: support multifd for RDMA migration Gonglei via
2024-06-04 19:32 ` [PATCH 0/6] refactor RDMA live migration based on rsocket API Peter Xu
2024-06-05 10:09   ` Gonglei (Arei) via
2024-06-05 14:18     ` Peter Xu
2024-06-07  8:49       ` Gonglei (Arei) via
2024-06-10 16:35         ` Peter Xu
2024-06-07 10:06   ` Daniel P. Berrangé
2024-06-05  7:57 ` Michael S. Tsirkin
2024-06-05 10:00   ` Gonglei (Arei) via
2024-06-05 10:23     ` Michael S. Tsirkin
2024-06-06 11:31     ` Leon Romanovsky
2024-06-07  1:04       ` Zhijian Li (Fujitsu) via
2024-06-07 16:24     ` Yu Zhang
2024-06-07  5:53 ` Jinpu Wang
2024-06-07  8:28   ` Gonglei (Arei) via [this message]
2024-06-10 16:31     ` Peter Xu
2024-08-27 20:15 ` Peter Xu
2024-08-27 20:57   ` Michael S. Tsirkin
2024-09-22 19:29     ` Michael Galaxy
2024-09-23  1:04       ` Gonglei (Arei) via
2024-09-25 15:08         ` Peter Xu
2024-09-27 21:45           ` Sean Hefty
2024-09-28 17:52             ` Michael Galaxy
2024-09-29 18:14               ` Michael S. Tsirkin
2024-09-29 20:26                 ` Michael Galaxy
2024-09-29 22:26                   ` Michael S. Tsirkin
2024-09-30 15:00                     ` Michael Galaxy
2024-09-30 15:31                       ` Yu Zhang
2024-09-30 18:16               ` Peter Xu
2024-09-30 19:20                 ` Sean Hefty
2024-09-30 19:47                   ` Peter Xu
2024-10-03 21:26                     ` Michael Galaxy
2024-10-03 21:43                       ` Peter Xu
2024-10-04 14:04                         ` Michael Galaxy
2024-10-07  8:47                           ` Yu Zhang
2024-10-07 13:45                             ` Michael Galaxy
2024-10-07 18:15                               ` Leon Romanovsky
2024-10-08  9:31                                 ` Zhu Yanjun
2024-10-23 13:42                               ` Michael Galaxy
2024-09-27 20:34         ` Michael Galaxy
