From: Si-Wei Liu <si-wei.liu@oracle.com>
To: Eli Cohen <elic@nvidia.com>, jasowang@redhat.com
Cc: mst@redhat.com, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org
Subject: Re: [PATCH 1/2] vdpa/mlx5: Fix suspend/resume index restoration
Date: Fri, 19 Feb 2021 18:42:04 -0800 [thread overview]
Message-ID: <033b0806-4037-5755-a1fa-91dbb58bab2e@oracle.com> (raw)
In-Reply-To: <4ecc1c7f-4f5a-68be-6734-e18dfeb91437@oracle.com>
On 2/17/2021 11:42 AM, Si-Wei Liu wrote:
>
>
> On 2/16/2021 8:20 AM, Eli Cohen wrote:
>> When we suspend the VM, the VDPA interface will be reset. When the VM is
>> resumed again, clear_virtqueues() will clear the available and used
>> indices, resulting in hardware virtqueue objects becoming out of sync.
>> We can avoid this function altogether since qemu will clear them if
>> required, e.g. when the VM went through a reboot.
>>
>> Moreover, since the hw available and used indices should always be
>> identical on query and should be restored to the same value for
>> virtqueues that complete in order, we set the single value provided
>> by set_vq_state(). In get_vq_state() we return the value of the
>> hardware used index.
>>
>> Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5
>> devices")
>> Signed-off-by: Eli Cohen <elic@nvidia.com>
> Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
I'd take it back for the moment, per Jason's latest comment. The
discussion is still ongoing.
>
>> ---
>> drivers/vdpa/mlx5/net/mlx5_vnet.c | 17 ++++-------------
>> 1 file changed, 4 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index b8e9d525d66c..a51b0f86afe2 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1169,6 +1169,7 @@ static void suspend_vq(struct mlx5_vdpa_net
>> *ndev, struct mlx5_vdpa_virtqueue *m
>> return;
>> }
>> mvq->avail_idx = attr.available_index;
>> + mvq->used_idx = attr.used_index;
>> }
>> static void suspend_vqs(struct mlx5_vdpa_net *ndev)
>> @@ -1426,6 +1427,7 @@ static int mlx5_vdpa_set_vq_state(struct
>> vdpa_device *vdev, u16 idx,
>> return -EINVAL;
>> }
>> + mvq->used_idx = state->avail_index;
>> mvq->avail_idx = state->avail_index;
This is where things start getting interesting. According to Jason, the
original expectation of this API is to restore the internal
last_avail_index in the hardware. With the Mellanox network vDPA
hardware implementation, there is no such last_avail_index in the
hardware, only a single last_used_index representing both indices, which
should always be identical and in sync with each other. So from a
migration point of view it appears logical to restore the saved
last_avail_index into the hardware's last_used_index, right? But then
what is the point of restoring mvq->avail_idx?
Actually, this code path is being repurposed to address the index reset
problem in the device reset scenario, where mvq->avail_idx and
mvq->used_idx are both reset to 0 once the device is reset. That crosses
the boundary a bit, in my view.
>> return 0;
>> }
>> @@ -1443,7 +1445,7 @@ static int mlx5_vdpa_get_vq_state(struct
>> vdpa_device *vdev, u16 idx, struct vdpa
>> * that cares about emulating the index after vq is stopped.
>> */
>> if (!mvq->initialized) {
>> - state->avail_index = mvq->avail_idx;
>> + state->avail_index = mvq->used_idx;
This is where the last_used_index gets loaded from the hardware (when
the device is stopped).
>> return 0;
>> }
>> @@ -1452,7 +1454,7 @@ static int mlx5_vdpa_get_vq_state(struct
>> vdpa_device *vdev, u16 idx, struct vdpa
>> mlx5_vdpa_warn(mvdev, "failed to query virtqueue\n");
>> return err;
>> }
>> - state->avail_index = attr.available_index;
>> + state->avail_index = attr.used_index;
This code path never gets called from userspace (when the device is running).
-Siwei
>> return 0;
>> }
>> @@ -1532,16 +1534,6 @@ static void teardown_virtqueues(struct
>> mlx5_vdpa_net *ndev)
>> }
>> }
>> -static void clear_virtqueues(struct mlx5_vdpa_net *ndev)
>> -{
>> - int i;
>> -
>> - for (i = ndev->mvdev.max_vqs - 1; i >= 0; i--) {
>> - ndev->vqs[i].avail_idx = 0;
>> - ndev->vqs[i].used_idx = 0;
>> - }
>> -}
>> -
>> /* TODO: cross-endian support */
>> static inline bool mlx5_vdpa_is_little_endian(struct mlx5_vdpa_dev
>> *mvdev)
>> {
>> @@ -1777,7 +1769,6 @@ static void mlx5_vdpa_set_status(struct
>> vdpa_device *vdev, u8 status)
>> if (!status) {
>> mlx5_vdpa_info(mvdev, "performing device reset\n");
>> teardown_driver(ndev);
>> - clear_virtqueues(ndev);
>> mlx5_vdpa_destroy_mr(&ndev->mvdev);
>> ndev->mvdev.status = 0;
>> ++mvdev->generation;
>
Thread overview:
2021-02-16 16:20 [PATCH 1/2] vdpa/mlx5: Fix suspend/resume index restoration Eli Cohen
2021-02-17 19:42 ` Si-Wei Liu
2021-02-17 21:20 ` Michael S. Tsirkin
2021-02-17 21:51 ` Si-Wei Liu
2021-02-18 7:18 ` Eli Cohen
2021-02-20 2:42 ` Si-Wei Liu [this message]