From: Jonah Palmer <jonah.palmer@oracle.com>
To: Eugenio Perez Martin <eperezma@redhat.com>
Cc: qemu-devel@nongnu.org, mst@redhat.com, raphael@enfabrica.net,
kwolf@redhat.com, hreitz@redhat.com, jasowang@redhat.com,
pbonzini@redhat.com, fam@euphon.net, stefanha@redhat.com,
qemu-block@nongnu.org, schalla@marvell.com, leiyang@redhat.com,
virtio-fs@lists.linux.dev, si-wei.liu@oracle.com,
boris.ostrovsky@oracle.com
Subject: Re: [PATCH v3 0/6] virtio,vhost: Add VIRTIO_F_IN_ORDER support
Date: Mon, 24 Jun 2024 10:57:41 -0400
Message-ID: <59b17704-0d15-4963-aa7f-b469494e58cb@oracle.com>
In-Reply-To: <CAJaqyWebry2dbn4EzfBeqadhzw25tMUFSOKLL6AsUXEFiH9gsQ@mail.gmail.com>
On 6/20/24 2:40 PM, Eugenio Perez Martin wrote:
> On Thu, Jun 20, 2024 at 7:56 PM Jonah Palmer <jonah.palmer@oracle.com> wrote:
>>
>> The goal of these patches is to add support to a variety of virtio and
>> vhost devices for the VIRTIO_F_IN_ORDER transport feature. This feature
>> indicates that all buffers are used by the device in the same order in
>> which they were made available by the driver.
>>
>> These patches attempt to implement a generalized, non-device-specific
>> solution to support this feature.
>>
>> The core of this solution is a buffering mechanism in the form of a
>> VirtQueue's used_elems VirtQueueElement array. This keeps the overhead
>> minimal for devices that already use buffers in order by default.
>> Devices that may not always use buffers in order will likely see a
>> performance hit, and how large that hit is depends on how frequently
>> elements are completed out of order.
>>
>> A VirtQueue whose device uses this feature holds used VirtQueueElements
>> in its used_elems array. Each used element is placed in used_elems at
>> the same index it would occupy on the used/descriptor ring under the
>> in-order requirement. In other words, used elements are staged at their
>> in-order locations in used_elems and are only written out to the
>> used/descriptor ring once all of the elements ahead of them in the
>> expected order have also been used.
>>
>> To differentiate between a "used" and "unused" element on the used_elems
>> array (a "used" element being an element that has returned from
>> processing and an "unused" element being an element that has not yet
>> been processed), we added a boolean 'in_order_filled' member to the
>> VirtQueueElement struct. This flag is set to true when the element comes
>> back from processing (virtqueue_ordered_fill) and then set back to false
>> once it's been written to the used/descriptor ring
>> (virtqueue_ordered_flush).
>>
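
(Side note for anyone skimming the thread: below is a minimal,
self-contained sketch of the bookkeeping idea described above. This is
simplified illustration code, not the actual code from this series;
only the 'in_order_filled' name matches the patches, and the types and
helpers around it are made up.)

/* Completed elements are staged at their in-order slot; the used ring
 * is only advanced over a contiguous run of completed slots. */
#include <stdbool.h>

typedef struct {
    unsigned int len;
    bool in_order_filled;    /* completed, not yet on the used ring */
} Elem;

typedef struct {
    Elem *used_elems;        /* one slot per element, in avail order */
    unsigned int num;        /* ring size */
    unsigned int next_flush; /* next slot expected on the used ring */
} Vq;

/* "fill": the device finished an element, possibly out of order. */
static void ordered_fill(Vq *vq, unsigned int slot, unsigned int len)
{
    vq->used_elems[slot].len = len;
    vq->used_elems[slot].in_order_filled = true;
}

/* "flush": write out as many consecutive completed elements as possible. */
static void ordered_flush(Vq *vq)
{
    while (vq->used_elems[vq->next_flush].in_order_filled) {
        /* ...write used_elems[next_flush] to the used/descriptor ring... */
        vq->used_elems[vq->next_flush].in_order_filled = false;
        vq->next_flush = (vq->next_flush + 1) % vq->num;
    }
}
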
>> Testing:
>> ========
>> Testing was done with the dpdk-testpmd application on both the host and
>> guest, using the configurations below. Traffic was generated between
>> the host and guest after running 'start tx_first' in both the host and
>> guest dpdk-testpmd applications. The results below were collected after
>> traffic had been generated for several seconds.
>>
>> Relevant QEMU args:
>> -------------------
>> -chardev socket,id=char1,path=/tmp/vhost-user1,server=off
>> -chardev socket,id=char2,path=/tmp/vhost-user2,server=off
>> -netdev type=vhost-user,id=net1,chardev=char1,vhostforce=on,queues=1
>> -netdev type=vhost-user,id=net2,chardev=char2,vhostforce=on,queues=1
>> -device virtio-net-pci,in_order=true,packed=true,netdev=net1,
>> mac=56:48:4f:53:54:00,mq=on,vectors=4,rx_queue_size=256
>> -device virtio-net-pci,in_order=true,packed=true,netdev=net2,
>> mac=56:48:4f:53:54:01,mq=on,vectors=4,rx_queue_size=256
>>
>
> Hi Jonah,
>
> These tests are great, but others should also be performed. In
> particular, QEMU should run ok with "tap" netdev with vhost=off
> instead of vhost-user:
>
> -netdev type=tap,id=net1,vhost=off
> -netdev type=tap,id=net2,vhost=off
>
> This way, packets are going through the modified code. With this
> configuration, QEMU is the one forwarding the packets so testpmd is
> not needed in the host. It's still needed in the guest, as the Linux
> guest driver does not support in_order. The guest kernel cmdline and testpmd
> cmdline should require no changes from the configuration you describe
> here.
>
Oof... I didn't even realize that my tests weren't actually testing all
of the modified code. I was so focused on getting DPDK to work that I
didn't think to check that. My apologies.
I am running into some trouble, though, trying to get packets flowing in
the guest. My QEMU args look like this:
# Regular virtio-net device
-device virtio-net-pci,netdev=net0,disable-legacy=off,disable-modern=off
-netdev tap,id=net0,vhost=off,ifname=tap0,script=${QEMU_IFUP},
downscript=no
# Virtio-net devices for testing
-netdev type=tap,id=net1,vhost=off,ifname=tap1,script=no,downscript=no
-netdev type=tap,id=net2,vhost=off,ifname=tap2,script=no,downscript=no
-device virtio-net-pci,in_order=true,packed=true,netdev=net1,
mac=56:48:4f:53:54:00,mq=on,vectors=4,rx_queue_size=256
-device virtio-net-pci,in_order=true,packed=true,netdev=net2,
mac=56:48:4f:53:54:01,mq=on,vectors=4,rx_queue_size=256
In the guest I have the virtio-net devices I'm using for testing bound
to the uio_pci_generic driver:
dpdk-devbind.py --status net
Network devices using DPDK-compatible driver
============================================
0000:00:02.0 'Virtio network device 1000' drv=uio_pci_generic
unused=virtio_pci,vfio-pci
0000:00:03.0 'Virtio network device 1000' drv=uio_pci_generic
unused=virtio_pci,vfio-pci
Network devices using kernel driver
===================================
0000:00:04.0 'Virtio network device 1000' if=enp0s4
drv=virtio-pci unused=virtio_pci,vfio-pci,uio_pci_generic *Active*
But then after running the dpdk-testpmd command in the guest:
dpdk-testpmd -l 0,1 -a 0000:00:02.0 -a 0000:00:03.0 --
--portmask=3 --rxq=1 --txq=1 --nb-cores=1 --forward-mode=io -i
EAL: Detected CPU lcores: 6
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:00:02.0
(socket -1)
EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:00:03.0
(socket -1)
TELEMETRY: No legacy callbacks, legacy socket not created
Set io packet forwarding mode
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config
and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
Port 0: 56:48:4F:53:54:00
Configuring Port 1 (socket 0)
Port 1: 56:48:4F:53:54:01
Checking link statuses...
Done
I'm not able to see any packets flowing after starting packet forwarding
and running 'show port stats all':
testpmd> start
io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support
enabled, MP allocation mode: native
Logical Core 1 (socket 0) forwards packets on 2 streams:
RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01
RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
io packet forwarding packets/burst=32
nb forwarding cores=1 - nb forwarding ports=2
port 0: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=0 - RX free threshold=0
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=0 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0
port 1: RX queue number: 1 Tx queue number: 1
Rx offloads=0x0 Tx offloads=0x0
RX queue: 0
RX desc=0 - RX free threshold=0
RX threshold registers: pthresh=0 hthresh=0 wthresh=0
RX Offloads=0x0
TX queue: 0
TX desc=0 - TX free threshold=0
TX threshold registers: pthresh=0 hthresh=0 wthresh=0
TX offloads=0x0 - TX RS bit threshold=0
testpmd>
testpmd> show port stats all
###### NIC statistics for port 0 ######
RX-packets: 0 RX-missed: 0 RX-bytes: 0
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 0 Tx-bps: 0
#########################################################
###### NIC statistics for port 1 ######
RX-packets: 0 RX-missed: 0 RX-bytes: 0
RX-errors: 0
RX-nombuf: 0
TX-packets: 0 TX-errors: 0 TX-bytes: 0
Throughput (since last show)
Rx-pps: 0 Rx-bps: 0
Tx-pps: 0 Tx-bps: 0
#########################################################
I'm still working on improving my networking skills, so I'm going to try
to figure out what's going on here. I'll let you know once I figure it
out, and I'll check in with you to see whether the test results are
satisfactory before sending out a v4.
> And then try with in_order=true,packed=false and
> in_order=true,packed=off in the corresponding virtio-net-pci devices.
>
> Performance comparison between in_order=true and in_order=false is
> also interesting, but since we're not batching I don't think we will
> see an extreme improvement.
>
> Does the plan work for you?
>
> Thanks!
>
>> Host dpdk-testpmd command:
>> --------------------------
>> dpdk-testpmd -l 0,2,3,4,5 --socket-mem=1024 -n 4
>> --vdev 'net_vhost0,iface=/tmp/vhost-user1'
>> --vdev 'net_vhost1,iface=/tmp/vhost-user2' --
>> --portmask=f -i --rxq=1 --txq=1 --nb-cores=4 --forward-mode=io
>>
>> Guest dpdk-testpmd command:
>> ---------------------------
>> dpdk-testpmd -l 0,1 -a 0000:00:02.0 -a 0000:00:03.0 -- --portmask=3
>> --rxq=1 --txq=1 --nb-cores=1 --forward-mode=io -i
>>
>> Results:
>> --------
>> +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
>> RX-packets: 79067488 RX-dropped: 0 RX-total: 79067488
>> TX-packets: 79067552 TX-dropped: 0 TX-total: 79067552
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>> ---
>> v3: Drop Tested-by tags until patches are re-tested.
>> Replace 'prev_avail_idx' with 'vq->last_avail_idx - 1' in
>> virtqueue_split_pop.
>> Remove redundant '+vq->vring.num' in 'max_steps' calculation in
>> virtqueue_ordered_fill.
>> Add test results to CV.
>>
>> v2: Make 'in_order_filled' more descriptive.
>> Change 'j' to more descriptive var name in virtqueue_split_pop.
>> Use more definitive search conditional in virtqueue_ordered_fill.
>> Avoid code duplication in virtqueue_ordered_flush.
>>
>> v1: Move series from RFC to PATCH for submission.
>>
>> Jonah Palmer (6):
>> virtio: Add bool to VirtQueueElement
>> virtio: virtqueue_pop - VIRTIO_F_IN_ORDER support
>> virtio: virtqueue_ordered_fill - VIRTIO_F_IN_ORDER support
>> virtio: virtqueue_ordered_flush - VIRTIO_F_IN_ORDER support
>> vhost,vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits
>> virtio: Add VIRTIO_F_IN_ORDER property definition
>>
>> hw/block/vhost-user-blk.c | 1 +
>> hw/net/vhost_net.c | 2 +
>> hw/scsi/vhost-scsi.c | 1 +
>> hw/scsi/vhost-user-scsi.c | 1 +
>> hw/virtio/vhost-user-fs.c | 1 +
>> hw/virtio/vhost-user-vsock.c | 1 +
>> hw/virtio/virtio.c | 123 ++++++++++++++++++++++++++++++++++-
>> include/hw/virtio/virtio.h | 6 +-
>> net/vhost-vdpa.c | 1 +
>> 9 files changed, 134 insertions(+), 3 deletions(-)
>>
>> --
>> 2.43.0
>>
>