From: Akihiko Odaki <akihiko.odaki@daynix.com>
To: Laurent Vivier <lvivier@redhat.com>, Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [PULL 08/20] virtio-net: Add only one queue pair when realizing
Date: Thu, 17 Oct 2024 18:42:44 +0900
Message-ID: <002f53e8-501e-4b4d-b1fc-67ec51e3a94f@daynix.com>
In-Reply-To: <bc493771-e507-4027-af76-f9a95e99b81d@redhat.com>
On 2024/10/17 18:17, Laurent Vivier wrote:
> On 17/10/2024 11:07, Akihiko Odaki wrote:
>> On 2024/10/17 16:32, Laurent Vivier wrote:
>>> On 17/10/2024 08:59, Jason Wang wrote:
>>>> On Mon, Oct 14, 2024 at 11:16 PM Laurent Vivier <lvivier@redhat.com>
>>>> wrote:
>>>>>
>>>>> On 14/10/2024 10:30, Laurent Vivier wrote:
>>>>>> Hi Akihiko,
>>>>>>
>>>>>> On 04/06/2024 09:37, Jason Wang wrote:
>>>>>>> From: Akihiko Odaki <akihiko.odaki@daynix.com>
>>>>>>>
>>>>>>> Multiqueue usage is not negotiated yet when realizing. If more
>>>>>>> than one queue is added and the guest never requests to enable
>>>>>>> multiqueue, the extra queues will not be deleted when unrealizing
>>>>>>> and will leak.
>>>>>>>
>>>>>>> Fixes: f9d6dbf0bf6e ("virtio-net: remove virtio queues if the
>>>>>>> guest doesn't support multiqueue")
>>>>>>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>> ---
>>>>>>> hw/net/virtio-net.c | 4 +---
>>>>>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>>>>> index 3cee2ef3ac..a8db8bfd9c 100644
>>>>>>> --- a/hw/net/virtio-net.c
>>>>>>> +++ b/hw/net/virtio-net.c
>>>>>>> @@ -3743,9 +3743,7 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp)
>>>>>>>      n->net_conf.tx_queue_size = MIN(virtio_net_max_tx_queue_size(n),
>>>>>>>                                      n->net_conf.tx_queue_size);
>>>>>>> -    for (i = 0; i < n->max_queue_pairs; i++) {
>>>>>>> -        virtio_net_add_queue(n, i);
>>>>>>> -    }
>>>>>>> +    virtio_net_add_queue(n, 0);
>>>>>>>      n->ctrl_vq = virtio_add_queue(vdev, 64, virtio_net_handle_ctrl);
>>>>>>>      qemu_macaddr_default_if_unset(&n->nic_conf.macaddr);
>>>>>>
>>>>>> This change breaks virtio net migration when multiqueue is enabled.
>>>>>>
>>>>>> I think this is because the virtqueues are half initialized after
>>>>>> migration: they are initialized on the guest side (the kernel is
>>>>>> using them) but not on the QEMU side (realize has only initialized
>>>>>> one). After migration, they are not initialized by the call to
>>>>>> virtio_net_set_multiqueue() from virtio_net_set_features(), because
>>>>>> virtio_get_num_queues() already reports n->max_queue_pairs, as this
>>>>>> value comes from the source guest memory.
>>>>>>
>>>>>> I don't think we have a way to half-initialize a virtqueue (to
>>>>>> initialize it only on the QEMU side when it is already initialized
>>>>>> on the kernel side).
>>>>>>
>>>>>> I think this change should be reverted to fix the migration issue.
>>>>>>
>>>>>
>>>>> Moreover, looking at the code of virtio_load() and virtio_add_queue(),
>>>>> we can see it is not correct to migrate a virtqueue that is not
>>>>> initialized on the destination side, because fields like
>>>>> 'vdev->vq[i].handle_output' or 'vdev->vq[i].used_elems' cannot be
>>>>> initialized by virtio_load(), nor by virtio_add_queue() after
>>>>> virtio_load(), since fields like 'vring.num' have already been set
>>>>> by virtio_load().
>>>>>
>>>>> For instance, in virtio_load() we set:
>>>>>
>>>>> for (i = 0; i < num; i++) {
>>>>> vdev->vq[i].vring.num = qemu_get_be32(f);
>>>>>
>>>>> and in virtio_add_queue() we search for the first available queue
>>>>> to add with:
>>>>>
>>>>> for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
>>>>> if (vdev->vq[i].vring.num == 0)
>>>>> break;
>>>>> }
>>>>>
>>>>> So virtio_add_queue() cannot be used to set:
>>>>>
>>>>> vdev->vq[i].handle_output = handle_output;
>>>>> vdev->vq[i].used_elems = g_new0(VirtQueueElement, queue_size);
>>>>>
>>>>> Moreover it would overwrite fields already set by virtio_load():
>>>>>
>>>>> vdev->vq[i].vring.num = queue_size;
>>>>> vdev->vq[i].vring.align = VIRTIO_PCI_VRING_ALIGN;
>>>>>
>>>>> It also explains why virtio_net_change_num_queue_pairs() (indirectly
>>>>> called by virtio_net_set_features()) doesn't update the number of
>>>>> queue pairs: vring.num is already set, so it thinks there are no
>>>>> more queues to add.
>>>>>
>>>>> Thanks,
>>>>> Laurent
>>>>>
>>>>
>>>> I agree.
>>>>
>>>> Laurent, would you like to send a patch to revert this?
>>>>
>>>
>>> Yes. I will also try to fix the leak in unrealize that the patch
>>> wanted to fix initially.
>>
>> I wrote a fix so I will submit it once internal testing is done. You
>> can see the change at:
>> https://gitlab.com/akihiko.odaki/qemu-kvm/-/commit/22161221aa2d2031d7ad1be7701852083aa9109a
>
> It works fine for me but I don't know if it's a good idea to add queues
> while the state is loading.
I couldn't come up with other options. The problem is that the number of
queues added during realization does not match the loaded state. To fix
this, we need to add the queues after the negotiated feature set is
known but before the queue states are loaded.
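
To illustrate the idea, here is a minimal sketch. The hook name, its
signature, and its call site are my assumptions for illustration only
(the actual change is in the commit linked above), and it glosses over
details a real implementation has to handle, such as keeping the ctrl
vq as the last queue:

/*
 * Hypothetical hook, called from virtio_load() after the feature bits
 * have been restored but before the per-virtqueue state is read, so
 * that every queue pair the source had also exists on the destination.
 */
static int virtio_net_pre_load_queues(VirtIODevice *vdev)
{
    VirtIONet *n = VIRTIO_NET(vdev);
    bool mq = virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ);
    int max = mq ? n->max_queue_pairs : 1;
    int i;

    /* Add the RX/TX pairs that realize deliberately left out. */
    for (i = virtio_get_num_queues(vdev) / 2; i < max; i++) {
        virtio_net_add_queue(n, i);
    }

    return 0;
}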
Reverting will add the queues that are used when the multiqueue feature
is negotiated, so it will fix migration for such cases, but it will
also break the other cases (i.e., when the multiqueue feature is not
negotiated) because it adds too many queues.
Regards,
Akihiko Odaki
>
> Jason, let me know which solution you prefer (revert or pre_load_queues
> helper).
>
> CC'ing MST
>
> Thanks,
> Laurent
>