qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Sahil Siddiq <icegambit91@gmail.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: sgarzare@redhat.com, mst@redhat.com, qemu-devel@nongnu.org,
	Sahil Siddiq <sahilcdq@proton.me>
Subject: Re: [RFC v4 0/5] Add packed virtqueue to shadow virtqueue
Date: Mon, 10 Feb 2025 21:55:17 +0530	[thread overview]
Message-ID: <37880ee7-74c4-47c3-93ce-c2a8b177fc10@gmail.com> (raw)
In-Reply-To: <CAJaqyWf4OLVmZn+g7B6X97QFUjRV9K=u-Bkr_OhRKUSsJgd6tg@mail.gmail.com>

Hi,

On 2/10/25 7:53 PM, Eugenio Perez Martin wrote:
> On Mon, Feb 10, 2025 at 11:58 AM Sahil Siddiq <icegambit91@gmail.com> wrote:
>> On 2/6/25 8:47 PM, Sahil Siddiq wrote:
>>> On 2/6/25 12:42 PM, Eugenio Perez Martin wrote:
>>>> On Thu, Feb 6, 2025 at 6:26 AM Sahil Siddiq <icegambit91@gmail.com> wrote:
>>>>> On 2/4/25 11:45 PM, Eugenio Perez Martin wrote:
>>>>>> PS: Please note that you can check packed_vq SVQ implementation
>>>>>> already without CVQ, as these features are totally orthogonal :).
>>>>>>
>>>>>
>>>>> Right. Now that I can ping with the ctrl features turned off, I think
>>>>> this should take precedence. There's another issue specific to the
>>>>> packed virtqueue case. It causes the kernel to crash. I have been
>>>>> investigating this and the situation here looks very similar to what's
>>>>> explained in Jason Wang's mail [2]. My plan of action is to apply his
>>>>> changes in L2's kernel and check if that resolves the problem.
>>>>>
>>>>> The details of the crash can be found in this mail [3].
>>>>>
>>>>
>>>> If you're testing this series without changes, I think that is caused
>>>> by not implementing the packed version of vhost_svq_get_buf.
>>>>
>>>> https://lists.nongnu.org/archive/html/qemu-devel/2024-12/msg01902.html
>>>>
>>>
>>> Oh, apologies, I think I had misunderstood your response in the linked mail.
>>> Until now, I thought they were unrelated. In that case, I'll implement the
>>> packed version of vhost_svq_get_buf. Hopefully that fixes it :).
>>>
>>
>> I noticed one thing while testing some of the changes that I have made.
>> I haven't finished making the relevant changes to all the functions which
>> will have to handle split and packed vq differently. L2's kernel crashes
>> when I launch L0-QEMU with ctrl_vq=on,ctrl_rx=on.
> 
> Interesting, is a similar crash than this? (NULL ptr deference on
> virtnet_set_features)?
>
> https://issues.redhat.com/browse/RHEL-391
I am not able to access this bug report (even with a Red Hat account). It
says it may have been deleted or I don't have the permission to view it.

It's hard to tell if this is the same issue. I don't think it is the same
issue though since I don't see any such indication in the logs. The kernel
throws the following:

[   23.047503] virtio_net virtio1: output.0:id 0 is not a head!
[   49.173243] watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [NetworkManager:782]
[   49.174167] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core intel_vsec pmt_telemetry pmt_class kvg
[   49.188258] CPU: 1 PID: 782 Comm: NetworkManager Not tainted 6.8.7-200.fc39.x86_64 #1
[   49.193196] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[   49.193196] RIP: 0010:virtqueue_get_buf+0x0/0x20

Maybe I was incorrect in stating that the kernel crashes. It's more like
the kernel is stuck in a loop (according to these blog posts on soft
lockup [1][2]).

In the above trace, RIP is in virtqueue_get_buf() [3]. This is what
calls virtqueue_get_buf_ctx_packed() [4] which throws the error.

What I don't understand is why vq->packed.desc_state[id].data [5] is
NULL when the control features are turned on, but doesn't seem to be
NULL when the control features are turned off.

>> However, when I start
>> L0-QEMU with ctrl_vq=off,ctrl_rx=off,ctrl_vlan=off,ctrl_mac_addr=off, L2's
>> kernel boots successfully. Tracing L2-QEMU also confirms that the packed
>> feature is enabled. With all the ctrl features disabled, I think pinging
>> will also be possible once I finish implementing the packed versions of
>> the other functions.
>>
> 
> Good!
> 
>> There's another thing that I am confused about regarding the current
>> implementation (in the master branch).
>>
>> In hw/virtio/vhost-shadow-virtqueue.c:vhost_svq_vring_write_descs() [1],
>> svq->free_head saves the descriptor in the specified format using
>> "le16_to_cpu" (line 171).
> 
> Good catch, this should be le16_to_cpu actually. But code wise is the
> same, so we have no visible error. Do you want to send a patch to fix
> it?
> 
Sorry, I am still a little confused here. Did you mean cpu_to_le16
by any chance? Based on what I have understood, if it is to be used
by the host machine, then it should be cpu_to_le16.

I can send a patch once this is clear, or can even integrate it in
this patch series since this patch series refactors that function
anyway.

>> On the other hand, the value of i is stored
>> in the native endianness using "cpu_to_le16" (line 168). If "i" is to be
>> stored in the native endianness (little endian in this case), then
>> should svq->free_head first be converted to little endian before being
>> assigned to "i" at the start of the function (line 142)?
>>
> 
> This part is correct in the code, as it is used by the host, not
> written to the guest or read from the guest. So no conversion is
> needed.

Understood.

Thanks,
Sahil

[1] https://www.suse.com/support/kb/doc/?id=000018705
[2] https://softlockup.com/SystemAdministration/Linux/Kernel/softlockup/
[3] https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L2545
[4] https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L1727
[5] https://github.com/torvalds/linux/blob/master/drivers/virtio/virtio_ring.c#L1762



  reply	other threads:[~2025-02-10 16:26 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-05 20:34 [RFC v4 0/5] Add packed virtqueue to shadow virtqueue Sahil Siddiq
2024-12-05 20:34 ` [RFC v4 1/5] vhost: Refactor vhost_svq_add_split Sahil Siddiq
2024-12-10  8:40   ` Eugenio Perez Martin
2024-12-05 20:34 ` [RFC v4 2/5] vhost: Write descriptors to packed svq Sahil Siddiq
2024-12-10  8:54   ` Eugenio Perez Martin
2024-12-11 15:58     ` Sahil Siddiq
2024-12-05 20:34 ` [RFC v4 3/5] vhost: Data structure changes to support packed vqs Sahil Siddiq
2024-12-10  8:55   ` Eugenio Perez Martin
2024-12-11 15:59     ` Sahil Siddiq
2024-12-05 20:34 ` [RFC v4 4/5] vdpa: Allocate memory for svq and map them to vdpa Sahil Siddiq
2024-12-05 20:34 ` [RFC v4 5/5] vdpa: Support setting vring_base for packed svq Sahil Siddiq
2024-12-10  9:27 ` [RFC v4 0/5] Add packed virtqueue to shadow virtqueue Eugenio Perez Martin
2024-12-11 15:57   ` Sahil Siddiq
2024-12-15 17:27   ` Sahil Siddiq
2024-12-16  8:39     ` Eugenio Perez Martin
2024-12-17  5:45       ` Sahil Siddiq
2024-12-17  7:50         ` Eugenio Perez Martin
2024-12-19 19:37           ` Sahil Siddiq
2024-12-20  6:58             ` Eugenio Perez Martin
2025-01-03 13:06               ` Sahil Siddiq
2025-01-07  8:05                 ` Eugenio Perez Martin
2025-01-19  6:37                   ` Sahil Siddiq
2025-01-21 16:37                     ` Eugenio Perez Martin
2025-01-24  5:46                       ` Sahil Siddiq
2025-01-24  7:34                         ` Eugenio Perez Martin
2025-01-31  5:04                           ` Sahil Siddiq
2025-01-31  6:57                             ` Eugenio Perez Martin
2025-02-04 12:49                               ` Sahil Siddiq
2025-02-04 18:10                                 ` Eugenio Perez Martin
2025-02-04 18:15                                   ` Eugenio Perez Martin
2025-02-06  5:26                                   ` Sahil Siddiq
2025-02-06  7:12                                     ` Eugenio Perez Martin
2025-02-06 15:17                                       ` Sahil Siddiq
2025-02-10 10:58                                         ` Sahil Siddiq
2025-02-10 14:23                                           ` Eugenio Perez Martin
2025-02-10 16:25                                             ` Sahil Siddiq [this message]
2025-02-11  7:57                                               ` Eugenio Perez Martin
2025-03-06  5:25                                                 ` Sahil Siddiq
2025-03-06  7:23                                                   ` Eugenio Perez Martin
2025-03-24 13:54                                                     ` Sahil Siddiq

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37880ee7-74c4-47c3-93ce-c2a8b177fc10@gmail.com \
    --to=icegambit91@gmail.com \
    --cc=eperezma@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=sahilcdq@proton.me \
    --cc=sgarzare@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).