qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Hawkins Jiawei <yin31149@gmail.com>
To: jasowang@redhat.com, mst@redhat.com, eperezma@redhat.com
Cc: qemu-devel@nongnu.org, yin31149@gmail.com, 18801353760@163.com
Subject: [PATCH v3 6/7] vdpa: Avoid forwarding large CVQ command failures
Date: Fri,  7 Jul 2023 23:27:33 +0800	[thread overview]
Message-ID: <267e15e4eed2d7aeb9887f193da99a13d22a2f1d.1688743107.git.yin31149@gmail.com> (raw)
In-Reply-To: <cover.1688743107.git.yin31149@gmail.com>

Due to the size limitation of the out buffer sent to the vdpa device,
which is determined by vhost_vdpa_net_cvq_cmd_len(), excessive CVQ
command is truncated in QEMU. As a result, the vdpa device rejects
this flawd CVQ command.

However, the problem is that, the VIRTIO_NET_CTRL_MAC_TABLE_SET
CVQ command has a variable length, which may exceed
vhost_vdpa_net_cvq_cmd_len() if the guest sets more than
`MAC_TABLE_ENTRIES` MAC addresses for the filter table.

This patch solves this problem by following steps:

  * Increase the out buffer size to vhost_vdpa_net_cvq_cmd_page_len(),
which represents the size of the buffer that is allocated and mmaped.
This ensures that everything works correctly as long as the guest
sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
sizeof(struct virtio_net_ctrl_hdr)
- 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
    Considering the highly unlikely scenario for the guest setting
more than that number of MAC addresses for the filter table, this
should work fine for the majority of cases.

  * If the CVQ command exceeds vhost_vdpa_net_cvq_cmd_page_len(),
instead of directly sending this CVQ command, QEMU should send
a VIRTIO_NET_CTRL_RX_PROMISC CVQ command to vdpa device. Addtionally,
a fake VIRTIO_NET_CTRL_MAC_TABLE_SET command including
(`MAC_TABLE_ENTRIES` + 1) non-multicast MAC addresses and
(`MAC_TABLE_ENTRIES` + 1) multicast MAC addresses should be provided
to the device model.
    By doing so, the vdpa device turns promiscuous mode on, aligning
with the VirtIO standard. The device model marks
`n->mac_table.uni_overflow` and `n->mac_table.multi_overflow`,
which aligns with the state of the vdpa device.

Note that the bug cannot be triggered at the moment, since
VIRTIO_NET_F_CTRL_RX feature is not enabled for SVQ.

Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
---
 net/vhost-vdpa.c | 162 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 161 insertions(+), 1 deletion(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index a84eb088a0..a4ff6c52b7 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -916,6 +916,148 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
     .check_peer_type = vhost_vdpa_check_peer_type,
 };
 
+/*
+ * Forward the excessive VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command to
+ * vdpa device.
+ *
+ * Considering that QEMU cannot send the entire filter table to the
+ * vdpa device, it should send the VIRTIO_NET_CTRL_RX_PROMISC CVQ
+ * command to enable promiscuous mode to receive all packets,
+ * according to VirtIO standard, "Since there are no guarantees,
+ * it can use a hash filter or silently switch to allmulti or
+ * promiscuous mode if it is given too many addresses.".
+ *
+ * Since QEMU ignores MAC addresses beyond `MAC_TABLE_ENTRIES` and
+ * marks `n->mac_table.x_overflow` accordingly, it should have
+ * the same effect on the device model to receive
+ * (`MAC_TABLE_ENTRIES` + 1) or more non-multicast MAC addresses.
+ * The same applies to multicast MAC addresses.
+ *
+ * Therefore, QEMU can provide the device model with a fake
+ * VIRTIO_NET_CTRL_MAC_TABLE_SET command with (`MAC_TABLE_ENTRIES` + 1)
+ * non-multicast MAC addresses and (`MAC_TABLE_ENTRIES` + 1) multicast
+ * MAC addresses. This ensures that the device model marks
+ * `n->mac_table.uni_overflow` and `n->mac_table.multi_overflow`,
+ * allowing all packets to be received, which aligns with the
+ * state of the vdpa device.
+ */
+static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
+                                                       VirtQueueElement *elem,
+                                                       struct iovec *out)
+{
+    struct virtio_net_ctrl_mac mac_data, *mac_ptr;
+    struct virtio_net_ctrl_hdr *hdr_ptr;
+    uint32_t cursor;
+    ssize_t r;
+
+    /* parse the non-multicast MAC address entries from CVQ command */
+    cursor = sizeof(*hdr_ptr);
+    r = iov_to_buf(elem->out_sg, elem->out_num, cursor,
+                   &mac_data, sizeof(mac_data));
+    if (unlikely(r != sizeof(mac_data))) {
+        /*
+         * If the CVQ command is invalid, we should simulate the vdpa device
+         * to reject the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
+         */
+        *s->status = VIRTIO_NET_ERR;
+        return sizeof(*s->status);
+    }
+    cursor += sizeof(mac_data) + le32_to_cpu(mac_data.entries) * ETH_ALEN;
+
+    /* parse the multicast MAC address entries from CVQ command */
+    r = iov_to_buf(elem->out_sg, elem->out_num, cursor,
+                   &mac_data, sizeof(mac_data));
+    if (r != sizeof(mac_data)) {
+        /*
+         * If the CVQ command is invalid, we should simulate the vdpa device
+         * to reject the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
+         */
+        *s->status = VIRTIO_NET_ERR;
+        return sizeof(*s->status);
+    }
+    cursor += sizeof(mac_data) + le32_to_cpu(mac_data.entries) * ETH_ALEN;
+
+    /* validate the CVQ command */
+    if (iov_size(elem->out_sg, elem->out_num) != cursor) {
+        /*
+         * If the CVQ command is invalid, we should simulate the vdpa device
+         * to reject the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
+         */
+        *s->status = VIRTIO_NET_ERR;
+        return sizeof(*s->status);
+    }
+
+    /*
+     * According to VirtIO standard, "Since there are no guarantees,
+     * it can use a hash filter or silently switch to allmulti or
+     * promiscuous mode if it is given too many addresses.".
+     *
+     * Therefore, considering that QEMU is unable to send the entire
+     * filter table to the vdpa device, it should send the
+     * VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
+     */
+    r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
+    if (unlikely(r < 0)) {
+        return r;
+    }
+    if (*s->status != VIRTIO_NET_OK) {
+        return sizeof(*s->status);
+    }
+
+    /*
+     * QEMU should also send a fake VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ
+     * command to the device model, including (`MAC_TABLE_ENTRIES` + 1)
+     * non-multicast MAC addresses and (`MAC_TABLE_ENTRIES` + 1)
+     * multicast MAC addresses.
+     *
+     * By doing so, the device model can mark `n->mac_table.uni_overflow`
+     * and `n->mac_table.multi_overflow`, enabling all packets to be
+     * received, which aligns with the state of the vdpa device.
+     */
+    cursor = 0;
+    uint32_t fake_uni_entries = MAC_TABLE_ENTRIES + 1,
+             fake_mul_entries = MAC_TABLE_ENTRIES + 1,
+             fake_cvq_size = sizeof(struct virtio_net_ctrl_hdr) +
+                             sizeof(mac_data) + fake_uni_entries * ETH_ALEN +
+                             sizeof(mac_data) + fake_mul_entries * ETH_ALEN;
+
+    assert(fake_cvq_size < vhost_vdpa_net_cvq_cmd_page_len());
+    out->iov_len = fake_cvq_size;
+
+    /* pack the header for fake CVQ command */
+    hdr_ptr = out->iov_base + cursor;
+    hdr_ptr->class = VIRTIO_NET_CTRL_MAC;
+    hdr_ptr->cmd = VIRTIO_NET_CTRL_MAC_TABLE_SET;
+    cursor += sizeof(*hdr_ptr);
+
+    /*
+     * Pack the non-multicast MAC addresses part for fake CVQ command.
+     *
+     * According to virtio_net_handle_mac(), QEMU doesn't verify the MAC
+     * addresses provieded in CVQ command. Therefore, only the entries
+     * field need to be prepared in the CVQ command.
+     */
+    mac_ptr = out->iov_base + cursor;
+    mac_ptr->entries = cpu_to_le32(fake_uni_entries);
+    cursor += sizeof(*mac_ptr) + fake_uni_entries * ETH_ALEN;
+
+    /*
+     * Pack the multicast MAC addresses part for fake CVQ command.
+     *
+     * According to virtio_net_handle_mac(), QEMU doesn't verify the MAC
+     * addresses provieded in CVQ command. Therefore, only the entries
+     * field need to be prepared in the CVQ command.
+     */
+    mac_ptr = out->iov_base + cursor;
+    mac_ptr->entries = cpu_to_le32(fake_mul_entries);
+
+    /*
+     * Simulating QEMU poll a vdpa device used buffer
+     * for VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
+     */
+    return sizeof(*s->status);
+}
+
 /**
  * Validate and copy control virtqueue commands.
  *
@@ -943,7 +1085,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
 
     out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
                              s->cvq_cmd_out_buffer,
-                             vhost_vdpa_net_cvq_cmd_len());
+                             vhost_vdpa_net_cvq_cmd_page_len());
 
     ctrl = s->cvq_cmd_out_buffer;
     if (ctrl->class == VIRTIO_NET_CTRL_ANNOUNCE) {
@@ -953,6 +1095,24 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
          */
         dev_written = sizeof(status);
         *s->status = VIRTIO_NET_OK;
+    } else if (unlikely(ctrl->class == VIRTIO_NET_CTRL_MAC &&
+                        ctrl->cmd == VIRTIO_NET_CTRL_MAC_TABLE_SET &&
+                        iov_size(elem->out_sg, elem->out_num) > out.iov_len)) {
+        /*
+         * Due to the size limitation of the out buffer sent to the vdpa device,
+         * which is determined by vhost_vdpa_net_cvq_cmd_page_len(), excessive
+         * MAC addresses set by the driver for the filter table can cause
+         * truncation of the CVQ command in QEMU. As a result, the vdpa device
+         * rejects the flawed CVQ command.
+         *
+         * Therefore, QEMU must handle this situation instead of sending
+         * the CVQ command direclty.
+         */
+        dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
+                                                                  &out);
+        if (unlikely(dev_written < 0)) {
+            goto out;
+        }
     } else {
         dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
         if (unlikely(dev_written < 0)) {
-- 
2.25.1



  parent reply	other threads:[~2023-07-07 15:28 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-07 15:27 [PATCH v3 0/7] Vhost-vdpa Shadow Virtqueue _F_CTRL_RX commands support Hawkins Jiawei
2023-07-07 15:27 ` [PATCH v3 1/7] vdpa: Use iovec for vhost_vdpa_net_load_cmd() Hawkins Jiawei
2023-08-17  9:23   ` Eugenio Perez Martin
2023-08-17 12:42     ` Hawkins Jiawei
2023-08-17 14:05       ` Eugenio Perez Martin
2023-08-17 14:47         ` Hawkins Jiawei
2023-07-07 15:27 ` [PATCH v3 2/7] vdpa: Restore MAC address filtering state Hawkins Jiawei
2023-08-17 10:18   ` Eugenio Perez Martin
2023-08-17 12:46     ` Hawkins Jiawei
2023-08-17 14:24       ` Eugenio Perez Martin
2023-07-07 15:27 ` [PATCH v3 3/7] vdpa: Restore packet receive filtering state relative with _F_CTRL_RX feature Hawkins Jiawei
2023-08-17 10:19   ` Eugenio Perez Martin
2023-07-07 15:27 ` [PATCH v3 4/7] vhost: Fix false positive out-of-bounds Hawkins Jiawei
2023-08-17 10:20   ` Eugenio Perez Martin
2023-07-07 15:27 ` [PATCH v3 5/7] vdpa: Accessing CVQ header through its structure Hawkins Jiawei
2023-08-17 10:33   ` Eugenio Perez Martin
2023-07-07 15:27 ` Hawkins Jiawei [this message]
2023-08-17 11:08   ` [PATCH v3 6/7] vdpa: Avoid forwarding large CVQ command failures Eugenio Perez Martin
2023-07-07 15:27 ` [PATCH v3 7/7] vdpa: Allow VIRTIO_NET_F_CTRL_RX in SVQ Hawkins Jiawei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=267e15e4eed2d7aeb9887f193da99a13d22a2f1d.1688743107.git.yin31149@gmail.com \
    --to=yin31149@gmail.com \
    --cc=18801353760@163.com \
    --cc=eperezma@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).