* [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
@ 2024-08-02 7:20 Dragos Tatulea
2024-08-02 7:20 ` [PATCH mlx5-vhost 1/7] net/mlx5: Support throttled commands from async API Dragos Tatulea
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Dragos Tatulea @ 2024-08-02 7:20 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Dragos Tatulea,
Michael S. Tsirkin, Jason Wang, Xuan Zhuo, Eugenio Pérez
Cc: Si-Wei Liu, netdev, linux-rdma, linux-kernel, virtualization
This series parallelizes the mlx5_vdpa device suspend and resume
operations through the firmware async API. The purpose is to reduce live
migration downtime.
The series starts with changing the VQ suspend and resume commands
to the async API. After that, the switch is made to issue multiple
commands of the same type in parallel.
Finally, a bonus improvement is thrown in: keep the notifier enabled
during suspend but make it a NOP. Upon resume, make sure that the link
state is forwarded. This shaves around 30 ms of constant time per device.
For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
x 2 threads per core), the improvements are:
+-------------------+--------+--------+-----------+
| operation | Before | After | Reduction |
|-------------------+--------+--------+-----------|
| mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
| mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
+-------------------+--------+--------+-----------+
Note for the maintainers:
The first patch contains changes for mlx5_core. This must be applied
into the mlx5-vhost tree [0] first. Once this patch is applied on
mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
tree and only then the remaining patches can be applied.
[0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
Dragos Tatulea (7):
net/mlx5: Support throttled commands from async API
vdpa/mlx5: Introduce error logging function
vdpa/mlx5: Use async API for vq query command
vdpa/mlx5: Use async API for vq modify commands
vdpa/mlx5: Parallelize device suspend
vdpa/mlx5: Parallelize device resume
vdpa/mlx5: Keep notifiers during suspend but ignore
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
3 files changed, 333 insertions(+), 130 deletions(-)
--
2.45.2
* [PATCH mlx5-vhost 1/7] net/mlx5: Support throttled commands from async API
2024-08-02 7:20 [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Dragos Tatulea
@ 2024-08-02 7:20 ` Dragos Tatulea
2024-08-02 13:14 ` [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Michael S. Tsirkin
2024-08-07 13:25 ` Eugenio Perez Martin
2 siblings, 0 replies; 9+ messages in thread
From: Dragos Tatulea @ 2024-08-02 7:20 UTC (permalink / raw)
To: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Si-Wei Liu, Dragos Tatulea, Leon Romanovsky, netdev, linux-rdma,
linux-kernel
Currently, commands that qualify as throttled can't be used via the
async API. That's due to the fact that the throttle semaphore can sleep
but the async API can't.
This patch allows throttling in the async API by using the tentative
variant of the semaphore and upon failure (semaphore at 0) returns EBUSY
to signal to the caller that they need to wait for the completion of
previously issued commands.
Furthermore, make sure that the semaphore is released in the callback.
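The try-acquire/release pattern described above can be illustrated with a
minimal userspace sketch. This is not the mlx5 code: it assumes a C11
atomic counter as a stand-in for the kernel semaphore, and the names
throttle, throttle_init, throttle_try_acquire and throttle_release are
hypothetical. The submit path never sleeps and reports -EBUSY when no slot
is free; the completion path hands the slot back.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the driver's throttle semaphore. */
struct throttle {
	atomic_int slots; /* number of commands that may still be issued */
};

static void throttle_init(struct throttle *t, int slots)
{
	atomic_init(&t->slots, slots);
}

/* Async submit path: must not sleep, so take a slot only if one is free. */
static int throttle_try_acquire(struct throttle *t)
{
	int cur = atomic_load(&t->slots);

	while (cur > 0) {
		/* On failure, cur is reloaded with the current counter value. */
		if (atomic_compare_exchange_weak(&t->slots, &cur, cur - 1))
			return 0;
	}
	return -EBUSY; /* caller waits for earlier completions, then retries */
}

/* Completion callback path: give the slot back. */
static void throttle_release(struct throttle *t)
{
	atomic_fetch_add(&t->slots, 1);
}
```

A caller that gets -EBUSY from throttle_try_acquire() would wait for some
of its previously issued commands to complete (each completion calls
throttle_release()) and then retry the submission.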
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 20768ef2e9d2..f69c977c1569 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1882,10 +1882,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
throttle_op = mlx5_cmd_is_throttle_opcode(opcode);
if (throttle_op) {
- /* atomic context may not sleep */
- if (callback)
- return -EINVAL;
- down(&dev->cmd.vars.throttle_sem);
+ if (callback) {
+ if (down_trylock(&dev->cmd.vars.throttle_sem))
+ return -EBUSY;
+ } else {
+ down(&dev->cmd.vars.throttle_sem);
+ }
}
pages_queue = is_manage_pages(in);
@@ -2091,10 +2093,19 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
{
struct mlx5_async_work *work = _work;
struct mlx5_async_ctx *ctx;
+ struct mlx5_core_dev *dev;
+ u16 opcode;
ctx = work->ctx;
- status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out);
+ dev = ctx->dev;
+ opcode = work->opcode;
+ status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out);
work->user_callback(status, work);
+ /* Can't access "work" from this point on. It could have been freed in
+ * the callback.
+ */
+ if (mlx5_cmd_is_throttle_opcode(opcode))
+ up(&dev->cmd.vars.throttle_sem);
if (atomic_dec_and_test(&ctx->num_inflight))
complete(&ctx->inflight_done);
}
--
2.45.2
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-02 7:20 [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Dragos Tatulea
2024-08-02 7:20 ` [PATCH mlx5-vhost 1/7] net/mlx5: Support throttled commands from async API Dragos Tatulea
@ 2024-08-02 13:14 ` Michael S. Tsirkin
2024-08-04 8:48 ` Leon Romanovsky
2024-08-16 9:13 ` Dragos Tatulea
2024-08-07 13:25 ` Eugenio Perez Martin
2 siblings, 2 replies; 9+ messages in thread
From: Michael S. Tsirkin @ 2024-08-02 13:14 UTC (permalink / raw)
To: Dragos Tatulea
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Si-Wei Liu, netdev, linux-rdma, linux-kernel,
virtualization
On Fri, Aug 02, 2024 at 10:20:17AM +0300, Dragos Tatulea wrote:
> This series parallelizes the mlx5_vdpa device suspend and resume
> operations through the firmware async API. The purpose is to reduce live
> migration downtime.
>
> The series starts with changing the VQ suspend and resume commands
> to the async API. After that, the switch is made to issue multiple
> commands of the same type in parallel.
>
> Finally, a bonus improvement is thrown in: keep the notifier enabled
> during suspend but make it a NOP. Upon resume make sure that the link
> state is forwarded. This shaves around 30ms per device constant time.
>
> For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
> x 2 threads per core), the improvements are:
>
> +-------------------+--------+--------+-----------+
> | operation | Before | After | Reduction |
> |-------------------+--------+--------+-----------|
> | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
> | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
> +-------------------+--------+--------+-----------+
>
> Note for the maintainers:
> The first patch contains changes for mlx5_core. This must be applied
> into the mlx5-vhost tree [0] first. Once this patch is applied on
> mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
> tree and only then the remaining patches can be applied.
Or maintainer just acks it and I apply directly.
Let me know when all this can happen.
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
>
> Dragos Tatulea (7):
> net/mlx5: Support throttled commands from async API
> vdpa/mlx5: Introduce error logging function
> vdpa/mlx5: Use async API for vq query command
> vdpa/mlx5: Use async API for vq modify commands
> vdpa/mlx5: Parallelize device suspend
> vdpa/mlx5: Parallelize device resume
> vdpa/mlx5: Keep notifiers during suspend but ignore
>
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
> 3 files changed, 333 insertions(+), 130 deletions(-)
>
> --
> 2.45.2
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-02 13:14 ` [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Michael S. Tsirkin
@ 2024-08-04 8:48 ` Leon Romanovsky
2024-08-04 13:39 ` Michael S. Tsirkin
2024-08-16 9:13 ` Dragos Tatulea
1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2024-08-04 8:48 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dragos Tatulea, Saeed Mahameed, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Si-Wei Liu, netdev, linux-rdma, linux-kernel,
virtualization
On Fri, Aug 02, 2024 at 09:14:28AM -0400, Michael S. Tsirkin wrote:
> On Fri, Aug 02, 2024 at 10:20:17AM +0300, Dragos Tatulea wrote:
> > This series parallelizes the mlx5_vdpa device suspend and resume
> > operations through the firmware async API. The purpose is to reduce live
> > migration downtime.
> >
> > The series starts with changing the VQ suspend and resume commands
> > to the async API. After that, the switch is made to issue multiple
> > commands of the same type in parallel.
> >
> > Finally, a bonus improvement is thrown in: keep the notifier enabled
> > during suspend but make it a NOP. Upon resume make sure that the link
> > state is forwarded. This shaves around 30ms per device constant time.
> >
> > For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
> > x 2 threads per core), the improvements are:
> >
> > +-------------------+--------+--------+-----------+
> > | operation | Before | After | Reduction |
> > |-------------------+--------+--------+-----------|
> > | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
> > | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
> > +-------------------+--------+--------+-----------+
> >
> > Note for the maintainers:
> > The first patch contains changes for mlx5_core. This must be applied
> > into the mlx5-vhost tree [0] first. Once this patch is applied on
> > mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
> > tree and only then the remaining patches can be applied.
>
> Or maintainer just acks it and I apply directly.
We can do it, but there is a potential to create a conflict between your tree
and netdev for the whole cycle, which will be a bit annoying. The easiest way to
avoid this is to have a shared branch, but in August everyone is on vacation, so
it will probably be fine to apply such a patch directly.
Thanks
>
> Let me know when all this can happen.
>
> > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
> >
> > Dragos Tatulea (7):
> > net/mlx5: Support throttled commands from async API
> > vdpa/mlx5: Introduce error logging function
> > vdpa/mlx5: Use async API for vq query command
> > vdpa/mlx5: Use async API for vq modify commands
> > vdpa/mlx5: Parallelize device suspend
> > vdpa/mlx5: Parallelize device resume
> > vdpa/mlx5: Keep notifiers during suspend but ignore
> >
> > drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
> > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
> > drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
> > 3 files changed, 333 insertions(+), 130 deletions(-)
> >
> > --
> > 2.45.2
>
>
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-04 8:48 ` Leon Romanovsky
@ 2024-08-04 13:39 ` Michael S. Tsirkin
2024-08-04 14:52 ` Leon Romanovsky
0 siblings, 1 reply; 9+ messages in thread
From: Michael S. Tsirkin @ 2024-08-04 13:39 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Dragos Tatulea, Saeed Mahameed, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Si-Wei Liu, netdev, linux-rdma, linux-kernel,
virtualization
On Sun, Aug 04, 2024 at 11:48:39AM +0300, Leon Romanovsky wrote:
> On Fri, Aug 02, 2024 at 09:14:28AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Aug 02, 2024 at 10:20:17AM +0300, Dragos Tatulea wrote:
> > > This series parallelizes the mlx5_vdpa device suspend and resume
> > > operations through the firmware async API. The purpose is to reduce live
> > > migration downtime.
> > >
> > > The series starts with changing the VQ suspend and resume commands
> > > to the async API. After that, the switch is made to issue multiple
> > > commands of the same type in parallel.
> > >
> > > Finally, a bonus improvement is thrown in: keep the notifier enabled
> > > during suspend but make it a NOP. Upon resume make sure that the link
> > > state is forwarded. This shaves around 30ms per device constant time.
> > >
> > > For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
> > > x 2 threads per core), the improvements are:
> > >
> > > +-------------------+--------+--------+-----------+
> > > | operation | Before | After | Reduction |
> > > |-------------------+--------+--------+-----------|
> > > | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
> > > | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
> > > +-------------------+--------+--------+-----------+
> > >
> > > Note for the maintainers:
> > > The first patch contains changes for mlx5_core. This must be applied
> > > into the mlx5-vhost tree [0] first. Once this patch is applied on
> > > mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
> > > tree and only then the remaining patches can be applied.
> >
> > Or maintainer just acks it and I apply directly.
>
> We can do it, but there is a potential to create a conflict between your tree
> and netdev for the whole cycle, which will be a bit annoying. The easiest way to
> avoid this is to have a shared branch, but in August everyone is on vacation, so
> it will probably be fine to apply such a patch directly.
>
> Thanks
We can let Linus do something, it's ok ;)
> >
> > Let me know when all this can happen.
> >
> > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
> > >
> > > Dragos Tatulea (7):
> > > net/mlx5: Support throttled commands from async API
> > > vdpa/mlx5: Introduce error logging function
> > > vdpa/mlx5: Use async API for vq query command
> > > vdpa/mlx5: Use async API for vq modify commands
> > > vdpa/mlx5: Parallelize device suspend
> > > vdpa/mlx5: Parallelize device resume
> > > vdpa/mlx5: Keep notifiers during suspend but ignore
> > >
> > > drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
> > > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
> > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
> > > 3 files changed, 333 insertions(+), 130 deletions(-)
> > >
> > > --
> > > 2.45.2
> >
> >
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-04 13:39 ` Michael S. Tsirkin
@ 2024-08-04 14:52 ` Leon Romanovsky
0 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2024-08-04 14:52 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dragos Tatulea, Saeed Mahameed, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Si-Wei Liu, netdev, linux-rdma, linux-kernel,
virtualization
On Sun, Aug 04, 2024 at 09:39:29AM -0400, Michael S. Tsirkin wrote:
> On Sun, Aug 04, 2024 at 11:48:39AM +0300, Leon Romanovsky wrote:
> > On Fri, Aug 02, 2024 at 09:14:28AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Aug 02, 2024 at 10:20:17AM +0300, Dragos Tatulea wrote:
> > > > This series parallelizes the mlx5_vdpa device suspend and resume
> > > > operations through the firmware async API. The purpose is to reduce live
> > > > migration downtime.
> > > >
> > > > The series starts with changing the VQ suspend and resume commands
> > > > to the async API. After that, the switch is made to issue multiple
> > > > commands of the same type in parallel.
> > > >
> > > > Finally, a bonus improvement is thrown in: keep the notifier enabled
> > > > during suspend but make it a NOP. Upon resume make sure that the link
> > > > state is forwarded. This shaves around 30ms per device constant time.
> > > >
> > > > For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
> > > > x 2 threads per core), the improvements are:
> > > >
> > > > +-------------------+--------+--------+-----------+
> > > > | operation | Before | After | Reduction |
> > > > |-------------------+--------+--------+-----------|
> > > > | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
> > > > | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
> > > > +-------------------+--------+--------+-----------+
> > > >
> > > > Note for the maintainers:
> > > > The first patch contains changes for mlx5_core. This must be applied
> > > > into the mlx5-vhost tree [0] first. Once this patch is applied on
> > > > mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
> > > > tree and only then the remaining patches can be applied.
> > >
> > > Or maintainer just acks it and I apply directly.
> >
> > We can do it, but there is a potential to create a conflict between your tree
> > and netdev for the whole cycle, which will be a bit annoying. The easiest way to
> > avoid this is to have a shared branch, but in August everyone is on vacation, so
> > it will probably be fine to apply such a patch directly.
> >
> > Thanks
>
> We can let Linus do something, it's ok ;)
Right and this is how it was for years - Linus dealt with the conflicts
between RDMA and netdev, until he pushed us to have a shared branch :).
However, in this specific cycle and for this specific change, we probably won't
get any conflicts between various trees.
Thanks
>
> > >
> > > Let me know when all this can happen.
> > >
> > > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
> > > >
> > > > Dragos Tatulea (7):
> > > > net/mlx5: Support throttled commands from async API
> > > > vdpa/mlx5: Introduce error logging function
> > > > vdpa/mlx5: Use async API for vq query command
> > > > vdpa/mlx5: Use async API for vq modify commands
> > > > vdpa/mlx5: Parallelize device suspend
> > > > vdpa/mlx5: Parallelize device resume
> > > > vdpa/mlx5: Keep notifiers during suspend but ignore
> > > >
> > > > drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
> > > > drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
> > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
> > > > 3 files changed, 333 insertions(+), 130 deletions(-)
> > > >
> > > > --
> > > > 2.45.2
> > >
> > >
>
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-02 7:20 [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Dragos Tatulea
2024-08-02 7:20 ` [PATCH mlx5-vhost 1/7] net/mlx5: Support throttled commands from async API Dragos Tatulea
2024-08-02 13:14 ` [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Michael S. Tsirkin
@ 2024-08-07 13:25 ` Eugenio Perez Martin
2024-08-07 14:54 ` Dragos Tatulea
2 siblings, 1 reply; 9+ messages in thread
From: Eugenio Perez Martin @ 2024-08-07 13:25 UTC (permalink / raw)
To: Dragos Tatulea
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Michael S. Tsirkin,
Jason Wang, Xuan Zhuo, Si-Wei Liu, netdev, linux-rdma,
linux-kernel, virtualization
On Fri, Aug 2, 2024 at 9:24 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>
> This series parallelizes the mlx5_vdpa device suspend and resume
> operations through the firmware async API. The purpose is to reduce live
> migration downtime.
>
> The series starts with changing the VQ suspend and resume commands
> to the async API. After that, the switch is made to issue multiple
> commands of the same type in parallel.
>
There is a missed opportunity in processing the CVQ MQ command here,
isn't there? It can be applied on top in another series for sure.
> Finally, a bonus improvement is thrown in: keep the notifier enabled
> during suspend but make it a NOP. Upon resume make sure that the link
> state is forwarded. This shaves around 30ms per device constant time.
>
> For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
> x 2 threads per core), the improvements are:
>
> +-------------------+--------+--------+-----------+
> | operation | Before | After | Reduction |
> |-------------------+--------+--------+-----------|
> | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
> | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
> +-------------------+--------+--------+-----------+
>
Looks great :).
Apart from the nitpick,
Acked-by: Eugenio Pérez <eperezma@redhat.com>
For the vhost part.
Thanks!
> Note for the maintainers:
> The first patch contains changes for mlx5_core. This must be applied
> into the mlx5-vhost tree [0] first. Once this patch is applied on
> mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
> tree and only then the remaining patches can be applied.
>
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
>
> Dragos Tatulea (7):
> net/mlx5: Support throttled commands from async API
> vdpa/mlx5: Introduce error logging function
> vdpa/mlx5: Use async API for vq query command
> vdpa/mlx5: Use async API for vq modify commands
> vdpa/mlx5: Parallelize device suspend
> vdpa/mlx5: Parallelize device resume
> vdpa/mlx5: Keep notifiers during suspend but ignore
>
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
> 3 files changed, 333 insertions(+), 130 deletions(-)
>
> --
> 2.45.2
>
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-07 13:25 ` Eugenio Perez Martin
@ 2024-08-07 14:54 ` Dragos Tatulea
0 siblings, 0 replies; 9+ messages in thread
From: Dragos Tatulea @ 2024-08-07 14:54 UTC (permalink / raw)
To: Eugenio Perez Martin
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Michael S. Tsirkin,
Jason Wang, Xuan Zhuo, Si-Wei Liu, netdev, linux-rdma,
linux-kernel, virtualization
On 07.08.24 15:25, Eugenio Perez Martin wrote:
> On Fri, Aug 2, 2024 at 9:24 AM Dragos Tatulea <dtatulea@nvidia.com> wrote:
>>
>> This series parallelizes the mlx5_vdpa device suspend and resume
>> operations through the firmware async API. The purpose is to reduce live
>> migration downtime.
>>
>> The series starts with changing the VQ suspend and resume commands
>> to the async API. After that, the switch is made to issue multiple
>> commands of the same type in parallel.
>>
>
> There is a missed opportunity in processing the CVQ MQ command here,
> isn't there? It can be applied on top in another series for sure.
>
Initially I considered that it would complicate the code too much in
change_num_qps(). But in the current state of the patches it's doable.
Will send a V2 with an extra patch for this.
>> Finally, a bonus improvement is thrown in: keep the notifier enabled
>> during suspend but make it a NOP. Upon resume make sure that the link
>> state is forwarded. This shaves around 30ms per device constant time.
>>
>> For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
>> x 2 threads per core), the improvements are:
>>
>> +-------------------+--------+--------+-----------+
>> | operation | Before | After | Reduction |
>> |-------------------+--------+--------+-----------|
>> | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
>> | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
>> +-------------------+--------+--------+-----------+
>>
>
> Looks great :).
>
> Apart from the nitpick,
>
> Acked-by: Eugenio Pérez <eperezma@redhat.com>
>
> For the vhost part.
Thanks!
>
> Thanks!
>
>> Note for the maintainers:
>> The first patch contains changes for mlx5_core. This must be applied
>> into the mlx5-vhost tree [0] first. Once this patch is applied on
>> mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
>> tree and only then the remaining patches can be applied.
>>
>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
>>
>> Dragos Tatulea (7):
>> net/mlx5: Support throttled commands from async API
>> vdpa/mlx5: Introduce error logging function
>> vdpa/mlx5: Use async API for vq query command
>> vdpa/mlx5: Use async API for vq modify commands
>> vdpa/mlx5: Parallelize device suspend
>> vdpa/mlx5: Parallelize device resume
>> vdpa/mlx5: Keep notifiers during suspend but ignore
>>
>> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
>> 3 files changed, 333 insertions(+), 130 deletions(-)
>>
>> --
>> 2.45.2
>>
>
* Re: [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume
2024-08-02 13:14 ` [PATCH vhost 0/7] vdpa/mlx5: Parallelize device suspend/resume Michael S. Tsirkin
2024-08-04 8:48 ` Leon Romanovsky
@ 2024-08-16 9:13 ` Dragos Tatulea
1 sibling, 0 replies; 9+ messages in thread
From: Dragos Tatulea @ 2024-08-16 9:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang, Xuan Zhuo,
Eugenio Pérez, Si-Wei Liu, netdev, linux-rdma, linux-kernel,
virtualization
On 02.08.24 15:14, Michael S. Tsirkin wrote:
> On Fri, Aug 02, 2024 at 10:20:17AM +0300, Dragos Tatulea wrote:
>> This series parallelizes the mlx5_vdpa device suspend and resume
>> operations through the firmware async API. The purpose is to reduce live
>> migration downtime.
>>
>> The series starts with changing the VQ suspend and resume commands
>> to the async API. After that, the switch is made to issue multiple
>> commands of the same type in parallel.
>>
>> Finally, a bonus improvement is thrown in: keep the notifier enabled
>> during suspend but make it a NOP. Upon resume make sure that the link
>> state is forwarded. This shaves around 30ms per device constant time.
>>
>> For 1 vDPA device x 32 VQs (16 VQPs), on a large VM (256 GB RAM, 32 CPUs
>> x 2 threads per core), the improvements are:
>>
>> +-------------------+--------+--------+-----------+
>> | operation | Before | After | Reduction |
>> |-------------------+--------+--------+-----------|
>> | mlx5_vdpa_suspend | 37 ms | 2.5 ms | 14x |
>> | mlx5_vdpa_resume | 16 ms | 5 ms | 3x |
>> +-------------------+--------+--------+-----------+
>>
>> Note for the maintainers:
>> The first patch contains changes for mlx5_core. This must be applied
>> into the mlx5-vhost tree [0] first. Once this patch is applied on
>> mlx5-vhost, the change has to be pulled from mlx5-vdpa into the vhost
>> tree and only then the remaining patches can be applied.
>
> Or maintainer just acks it and I apply directly.
>
Tariq reviewed the patch, he is a mlx5_core maintainer. So consider it acked.
Just sent the v2 with the same note in the cover letter.
Thanks,
Dragos
> Let me know when all this can happen.
>
>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git/log/?h=mlx5-vhost
>>
>> Dragos Tatulea (7):
>> net/mlx5: Support throttled commands from async API
>> vdpa/mlx5: Introduce error logging function
>> vdpa/mlx5: Use async API for vq query command
>> vdpa/mlx5: Use async API for vq modify commands
>> vdpa/mlx5: Parallelize device suspend
>> vdpa/mlx5: Parallelize device resume
>> vdpa/mlx5: Keep notifiers during suspend but ignore
>>
>> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 +-
>> drivers/vdpa/mlx5/core/mlx5_vdpa.h | 7 +
>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 435 +++++++++++++-----
>> 3 files changed, 333 insertions(+), 130 deletions(-)
>>
>> --
>> 2.45.2
>