From: Eugenio Perez Martin <eperezma@redhat.com>
To: Hawkins Jiawei <yin31149@gmail.com>
Cc: jasowang@redhat.com, mst@redhat.com, qemu-devel@nongnu.org,
18801353760@163.com
Subject: Re: [PATCH v3 6/7] vdpa: Avoid forwarding large CVQ command failures
Date: Thu, 17 Aug 2023 13:08:52 +0200
Message-ID: <CAJaqyWeb5Xu0JaPCg_S_Ay_92NpwsSZR808rTYWcuU1_Aw1EeA@mail.gmail.com>
In-Reply-To: <267e15e4eed2d7aeb9887f193da99a13d22a2f1d.1688743107.git.yin31149@gmail.com>
On Fri, Jul 7, 2023 at 5:28 PM Hawkins Jiawei <yin31149@gmail.com> wrote:
>
> Due to the size limitation of the out buffer sent to the vdpa device,
> which is determined by vhost_vdpa_net_cvq_cmd_len(), an overly long CVQ
> command is truncated in QEMU. As a result, the vdpa device rejects
> this flawed CVQ command.
>
> However, the problem is that, the VIRTIO_NET_CTRL_MAC_TABLE_SET
> CVQ command has a variable length, which may exceed
> vhost_vdpa_net_cvq_cmd_len() if the guest sets more than
> `MAC_TABLE_ENTRIES` MAC addresses for the filter table.
>
> This patch solves this problem with the following steps:
>
> * Increase the out buffer size to vhost_vdpa_net_cvq_cmd_page_len(),
> which represents the size of the buffer that is allocated and mmaped.
> This ensures that everything works correctly as long as the guest
> sets fewer than `(vhost_vdpa_net_cvq_cmd_page_len() -
> sizeof(struct virtio_net_ctrl_hdr)
> - 2 * sizeof(struct virtio_net_ctrl_mac)) / ETH_ALEN` MAC addresses.
> Considering that it is highly unlikely for the guest to set
> more than that number of MAC addresses for the filter table, this
> should work fine for the majority of cases.
>
> * If the CVQ command exceeds vhost_vdpa_net_cvq_cmd_page_len(),
> instead of directly sending this CVQ command, QEMU should send
> a VIRTIO_NET_CTRL_RX_PROMISC CVQ command to the vdpa device. Additionally,
> a fake VIRTIO_NET_CTRL_MAC_TABLE_SET command including
> (`MAC_TABLE_ENTRIES` + 1) non-multicast MAC addresses and
> (`MAC_TABLE_ENTRIES` + 1) multicast MAC addresses should be provided
> to the device model.
> By doing so, the vdpa device turns promiscuous mode on, aligning
> with the VirtIO standard. The device model marks
> `n->mac_table.uni_overflow` and `n->mac_table.multi_overflow`,
> which aligns with the state of the vdpa device.
>
> Note that the bug cannot be triggered at the moment, since the
> VIRTIO_NET_F_CTRL_RX feature is not enabled for SVQ.
>
> Fixes: 7a7f87e94c ("vdpa: Move command buffers map to start of net device")
> Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
> ---
> net/vhost-vdpa.c | 162 ++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 161 insertions(+), 1 deletion(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index a84eb088a0..a4ff6c52b7 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -916,6 +916,148 @@ static NetClientInfo net_vhost_vdpa_cvq_info = {
> .check_peer_type = vhost_vdpa_check_peer_type,
> };
>
> +/*
> + * Forward the excessive VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command to
> + * vdpa device.
> + *
> + * Considering that QEMU cannot send the entire filter table to the
> + * vdpa device, it should send the VIRTIO_NET_CTRL_RX_PROMISC CVQ
> + * command to enable promiscuous mode to receive all packets,
> + * according to VirtIO standard, "Since there are no guarantees,
> + * it can use a hash filter or silently switch to allmulti or
> + * promiscuous mode if it is given too many addresses.".
> + *
> + * Since QEMU ignores MAC addresses beyond `MAC_TABLE_ENTRIES` and
> + * marks `n->mac_table.x_overflow` accordingly, it should have
> + * the same effect on the device model to receive
> + * (`MAC_TABLE_ENTRIES` + 1) or more non-multicast MAC addresses.
> + * The same applies to multicast MAC addresses.
> + *
> + * Therefore, QEMU can provide the device model with a fake
> + * VIRTIO_NET_CTRL_MAC_TABLE_SET command with (`MAC_TABLE_ENTRIES` + 1)
> + * non-multicast MAC addresses and (`MAC_TABLE_ENTRIES` + 1) multicast
> + * MAC addresses. This ensures that the device model marks
> + * `n->mac_table.uni_overflow` and `n->mac_table.multi_overflow`,
> + * allowing all packets to be received, which aligns with the
> + * state of the vdpa device.
> + */
> +static int vhost_vdpa_net_excessive_mac_filter_cvq_add(VhostVDPAState *s,
> + VirtQueueElement *elem,
> + struct iovec *out)
> +{
> + struct virtio_net_ctrl_mac mac_data, *mac_ptr;
> + struct virtio_net_ctrl_hdr *hdr_ptr;
> + uint32_t cursor;
> + ssize_t r;
> +
> + /* parse the non-multicast MAC address entries from CVQ command */
> + cursor = sizeof(*hdr_ptr);
> + r = iov_to_buf(elem->out_sg, elem->out_num, cursor,
> + &mac_data, sizeof(mac_data));
> + if (unlikely(r != sizeof(mac_data))) {
> + /*
> + * If the CVQ command is invalid, simulate the vdpa device
> + * rejecting the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
> + */
> + *s->status = VIRTIO_NET_ERR;
> + return sizeof(*s->status);
> + }
> + cursor += sizeof(mac_data) + le32_to_cpu(mac_data.entries) * ETH_ALEN;
> +
> + /* parse the multicast MAC address entries from CVQ command */
> + r = iov_to_buf(elem->out_sg, elem->out_num, cursor,
> + &mac_data, sizeof(mac_data));
> + if (unlikely(r != sizeof(mac_data))) {
> + /*
> + * If the CVQ command is invalid, simulate the vdpa device
> + * rejecting the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
> + */
> + *s->status = VIRTIO_NET_ERR;
> + return sizeof(*s->status);
> + }
> + cursor += sizeof(mac_data) + le32_to_cpu(mac_data.entries) * ETH_ALEN;
> +
> + /* validate the CVQ command */
> + if (iov_size(elem->out_sg, elem->out_num) != cursor) {
> + /*
> + * If the CVQ command is invalid, simulate the vdpa device
> + * rejecting the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
> + */
> + *s->status = VIRTIO_NET_ERR;
> + return sizeof(*s->status);
> + }
> +
> + /*
> + * According to VirtIO standard, "Since there are no guarantees,
> + * it can use a hash filter or silently switch to allmulti or
> + * promiscuous mode if it is given too many addresses.".
> + *
> + * Therefore, considering that QEMU is unable to send the entire
> + * filter table to the vdpa device, it should send the
> + * VIRTIO_NET_CTRL_RX_PROMISC CVQ command to enable promiscuous mode
> + */
> + r = vhost_vdpa_net_load_rx_mode(s, VIRTIO_NET_CTRL_RX_PROMISC, 1);
> + if (unlikely(r < 0)) {
> + return r;
> + }
> + if (*s->status != VIRTIO_NET_OK) {
> + return sizeof(*s->status);
> + }
> +
> + /*
> + * QEMU should also send a fake VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ
> + * command to the device model, including (`MAC_TABLE_ENTRIES` + 1)
> + * non-multicast MAC addresses and (`MAC_TABLE_ENTRIES` + 1)
> + * multicast MAC addresses.
> + *
> + * By doing so, the device model can mark `n->mac_table.uni_overflow`
> + * and `n->mac_table.multi_overflow`, enabling all packets to be
> + * received, which aligns with the state of the vdpa device.
> + */
> + cursor = 0;
> + uint32_t fake_uni_entries = MAC_TABLE_ENTRIES + 1,
> + fake_mul_entries = MAC_TABLE_ENTRIES + 1,
> + fake_cvq_size = sizeof(struct virtio_net_ctrl_hdr) +
> + sizeof(mac_data) + fake_uni_entries * ETH_ALEN +
> + sizeof(mac_data) + fake_mul_entries * ETH_ALEN;
> +
> + assert(fake_cvq_size < vhost_vdpa_net_cvq_cmd_page_len());
> + out->iov_len = fake_cvq_size;
> +
> + /* pack the header for fake CVQ command */
> + hdr_ptr = out->iov_base + cursor;
> + hdr_ptr->class = VIRTIO_NET_CTRL_MAC;
> + hdr_ptr->cmd = VIRTIO_NET_CTRL_MAC_TABLE_SET;
> + cursor += sizeof(*hdr_ptr);
> +
> + /*
> + * Pack the non-multicast MAC addresses part for fake CVQ command.
> + *
> + * According to virtio_net_handle_mac(), QEMU doesn't verify the MAC
> + * addresses provided in the CVQ command. Therefore, only the entries
> + * field needs to be prepared in the CVQ command.
> + */
> + mac_ptr = out->iov_base + cursor;
> + mac_ptr->entries = cpu_to_le32(fake_uni_entries);
> + cursor += sizeof(*mac_ptr) + fake_uni_entries * ETH_ALEN;
> +
> + /*
> + * Pack the multicast MAC addresses part for fake CVQ command.
> + *
> + * According to virtio_net_handle_mac(), QEMU doesn't verify the MAC
> + * addresses provided in the CVQ command. Therefore, only the entries
> + * field needs to be prepared in the CVQ command.
> + */
> + mac_ptr = out->iov_base + cursor;
> + mac_ptr->entries = cpu_to_le32(fake_mul_entries);
> +
> + /*
> + * Simulate QEMU polling a used buffer from the vdpa device
> + * for the VIRTIO_NET_CTRL_MAC_TABLE_SET CVQ command
> + */
> + return sizeof(*s->status);
> +}
> +
> /**
> * Validate and copy control virtqueue commands.
> *
> @@ -943,7 +1085,7 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>
> out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0,
> s->cvq_cmd_out_buffer,
> - vhost_vdpa_net_cvq_cmd_len());
> + vhost_vdpa_net_cvq_cmd_page_len());
>
> ctrl = s->cvq_cmd_out_buffer;
> if (ctrl->class == VIRTIO_NET_CTRL_ANNOUNCE) {
> @@ -953,6 +1095,24 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> */
> dev_written = sizeof(status);
> *s->status = VIRTIO_NET_OK;
> + } else if (unlikely(ctrl->class == VIRTIO_NET_CTRL_MAC &&
> + ctrl->cmd == VIRTIO_NET_CTRL_MAC_TABLE_SET &&
> + iov_size(elem->out_sg, elem->out_num) > out.iov_len)) {
> + /*
> + * Due to the size limitation of the out buffer sent to the vdpa device,
> + * which is determined by vhost_vdpa_net_cvq_cmd_page_len(), excessive
> + * MAC addresses set by the driver for the filter table can cause
> + * truncation of the CVQ command in QEMU. As a result, the vdpa device
> + * rejects the flawed CVQ command.
> + *
> + * Therefore, QEMU must handle this situation instead of sending
> + * the CVQ command directly.
> + */
> + dev_written = vhost_vdpa_net_excessive_mac_filter_cvq_add(s, elem,
> + &out);
> + if (unlikely(dev_written < 0)) {
> + goto out;
> + }
> } else {
> dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
> if (unlikely(dev_written < 0)) {
> --
> 2.25.1
>