From: Vivek Goyal <vgoyal@redhat.com>
To: Peter-Jan Gootzen <peter-jan@gootzen.net>
Cc: German Maglione <gmaglione@redhat.com>,
virtualization@lists.linux-foundation.org,
Jonas Pfefferle <JPF@zurich.ibm.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
miklos@szeredi.hu
Subject: Re: virtio-fs: adding support for multi-queue
Date: Wed, 8 Feb 2023 15:23:51 -0500
Message-ID: <Y+QE17rQaj8/vjrl@redhat.com>
In-Reply-To: <82ddafee-7548-e7bd-2f41-24ce9251aa25@gootzen.net>

On Wed, Feb 08, 2023 at 05:29:25PM +0100, Peter-Jan Gootzen wrote:
> On 08/02/2023 11:43, Stefan Hajnoczi wrote:
> > On Wed, Feb 08, 2023 at 09:33:33AM +0100, Peter-Jan Gootzen wrote:
> > >
> > >
> > > On 07/02/2023 22:57, Vivek Goyal wrote:
> > > > On Tue, Feb 07, 2023 at 04:32:02PM -0500, Stefan Hajnoczi wrote:
> > > > > On Tue, Feb 07, 2023 at 02:53:58PM -0500, Vivek Goyal wrote:
> > > > > > On Tue, Feb 07, 2023 at 02:45:39PM -0500, Stefan Hajnoczi wrote:
> > > > > > > On Tue, Feb 07, 2023 at 11:14:46AM +0100, Peter-Jan Gootzen wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > >
> > > > > > [cc German]
> > > > > >
> > > > > > > > For my MSc thesis project in collaboration with IBM
> > > > > > > > (https://github.com/IBM/dpu-virtio-fs) we are looking to improve the
> > > > > > > > performance of the virtio-fs driver in high throughput scenarios. We think
> > > > > > > > the main bottleneck is the fact that the virtio-fs driver does not support
> > > > > > > > multi-queue (while the spec does). A big factor in this is that our setup on
> > > > > > > > the virtio-fs device-side (a DPU) does not easily allow multiple cores to
> > > > > > > > tend to a single virtio queue.
> > > > > >
> > > > > > This is an interesting limitation in DPU.
> > > > >
> > > > > Virtqueues are single-consumer queues anyway. Sharing them between
> > > > > multiple threads would be expensive. I think using multiqueue is natural
> > > > > and not specific to DPUs.
> > > >
> > > > Can we create multiple threads (a thread pool) on DPU and let these
> > > > threads process requests in parallel (While there is only one virt
> > > > queue).
> > > >
> > > > So this is what we had done in virtiofsd. One thread is dedicated to
> > > > pull the requests from virt queue and then pass the request to thread
> > > > pool to process it. And that seems to help with performance in
> > > > certain cases.
> > > >
> > > > Is that possible on DPU? That itself can give a nice performance
> > > > boost for certain workloads without having to implement multiqueue
> > > > actually.
> > > >
> > > > Just curious. I am not opposed to the idea of multiqueue. I am
> > > > just curious about the kind of performance gain (if any) it can
> > > > provide. And will this be helpful for rust virtiofsd running on
> > > > host as well?
> > > >
> > > > Thanks
> > > > Vivek
> > > >
> > > There is technically nothing preventing us from consuming a single queue on
> > > multiple cores, however our current Virtio implementation (DPU-side) is set
> > > up with the assumption that you should never want to do that (concurrency
> > > mayhem around the Virtqueues and the DMAs). So instead of putting all the
> > > work into reworking the implementation to support that and still incur the
> > > big overhead, we see it more fitting to amend the virtio-fs driver with
> > > multi-queue support.
> > >
> > >
> > > > Is it just a theory at this point of time or have you implemented
> > > > it and seeing significant performance benefit with multiqueue?
> > >
> > > It is a theory, but we are currently seeing that using the single request
> > > queue, the single core attending to that queue on the DPU is reasonably
> > > close to being fully saturated.
> > >
> > > > And will this be helpful for rust virtiofsd running on
> > > > host as well?
> > >
> > > I figure this would be dependent on the workload and the users' needs.
> > > Having many cores concurrently pulling on their own virtq and then
> > > immediately processing the request locally would of course improve performance.
> > > But we are offloading all this work to the DPU, for providing
> > > high-throughput cloud services.
> >
> > I think Vivek is getting at whether your code processes requests
> > sequentially or in parallel. A single thread processing the virtqueue
> > that hands off requests to worker threads or uses io_uring to perform
> > I/O asynchronously will perform differently from a single thread that
> > processes requests sequentially in a blocking fashion. Multiqueue is not
> > necessary for parallelism, but the single queue might become a
> > bottleneck.
>
> Requests are handled non-blocking with remote IO on the DPU. Our current
> architecture is as follows:
> T1: Tends to the Virtq, parses FUSE to remote IO and fires off the
> asynchronous remote IO.
> T2: Polls for completion on the remote IO and parses it back to FUSE, puts
> the FUSE buffers in a completion queue of T1.
> T1: Handles the Virtio completion and DMA of the requests in the CQ.
>
> Thread 1 is busy polling on its two queues (Virtq and CQ) with equal
> priority, thread 2 is busy polling as well. This setup is not really
> optimal, but we are working within the constraints of both our DPU and
> remote IO stack.
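
A minimal toy sketch of the two-thread pipeline described above, in
Python. It is only a model under stated assumptions: run_pipeline() and
the queue names (virtq, remote, cq) are hypothetical, ordinary in-memory
queues stand in for the virtqueue and the remote I/O stack, and no real
virtio, FUSE parsing or DMA is involved.

```python
import queue
import threading


def run_pipeline(requests):
    """Toy model: T1 polls the virtq and the CQ, T2 completes remote I/O."""
    virtq = queue.Queue()   # stands in for the single request virtqueue
    remote = queue.Queue()  # hypothetical in-flight remote I/O
    cq = queue.Queue()      # completion queue filled by T2, drained by T1
    done = []

    def t2():
        # T2: wait for remote I/O completions and hand the reply back to T1.
        while True:
            req = remote.get()
            if req is None:  # shutdown marker
                break
            cq.put(req)

    worker = threading.Thread(target=t2)
    worker.start()

    for req in requests:
        virtq.put(req)

    pending = len(requests)
    while pending:
        # T1: busy-poll both queues with equal priority.
        try:
            remote.put(virtq.get_nowait())  # fire off asynchronous remote I/O
        except queue.Empty:
            pass
        try:
            done.append(cq.get_nowait())    # complete the request to the guest
            pending -= 1
        except queue.Empty:
            pass

    remote.put(None)
    worker.join()
    return done


if __name__ == "__main__":
    print(run_pipeline(list(range(8))))
```

In this model every request passes through T1 twice (submission and
completion), which is one way to picture why a single core tending the
queue could saturate before the remote side does.
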
> Currently we are able to get with sequential single job 4k throughput:
> Write: 246MiB/s
> Read: 20MiB/s

I had been doing some performance benchmarking for virtiofs and I found
some old results.

https://github.com/rhvgoyal/virtiofs-tests/tree/master/performance-results/feb-10-2021

While running on top of a local fs, with bs=4K and a single queue, I could
achieve more than 600MB/s.

NAME       WORKLOAD        Bandwidth   IOPS
default    seqread-psync   625.0mb     156.2k
no-tpool   seqread-psync   660.8mb     165.2k

But the catch here, I think, is that the host is doing the caching. In your
case I am assuming there is no caching at the DPU and all the I/O is
going to remote storage (which might be doing caching in memory).

Anyway, the point I am trying to make is that even with a single vq, virtiofs
can push a reasonable amount of I/O.

I will be curious to find out how multiqueue can improve these numbers
further.
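
As a back-of-the-envelope aid for comparing the figures in this thread,
the sketch below converts 4k throughput into IOPS and an implied
per-request time. It assumes exactly one request in flight (as with a
single psync-style job), which may not hold for the DPU write numbers if
writes are buffered; the helper names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def iops(throughput_bytes_per_s, block_bytes=4 * KIB):
    """Requests per second needed to sustain the given throughput."""
    return throughput_bytes_per_s / block_bytes


def per_request_us(throughput_bytes_per_s, block_bytes=4 * KIB, in_flight=1):
    """Implied time per request in microseconds (Little's law).

    in_flight=1 is an assumption: one outstanding request at a time.
    """
    return in_flight / iops(throughput_bytes_per_s, block_bytes) * 1e6


if __name__ == "__main__":
    # DPU figures quoted above: 246 MiB/s write, 20 MiB/s read at 4k.
    for name, mib_per_s in (("write", 246), ("read", 20)):
        rate = mib_per_s * MIB
        print(f"{name}: {iops(rate):.0f} IOPS, "
              f"{per_request_us(rate):.1f} us per request")
```

Under the same one-in-flight assumption, the 156.2k IOPS in the table
above corresponds to roughly 6.4 us per request. If a workload really
keeps only one request outstanding, additional queues alone would not be
expected to raise its throughput.
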
> We are not sure yet where the bottleneck is for reads, we hope to be able to
> match it to the write speed. For writes the two main bottlenecks we see are:
> the single Virtq (so limited parallelism on the DPU and remote-side) and
> that virtio-fs IO is constrained to the page size of 4k (NFS for example,
> who we are trying to replace, sees huge performance gains with larger block
> sizes).

I am wondering how you concluded that the single vq is the performance
bottleneck, and not the remote storage the DPU is sending I/O to.

Thanks
Vivek

>
> > > This is what I remembered as well, but can't find it clearly in the source
> > > right now, do you have references to the source for this?
> >
> > virtio_blk.ko uses an irq_affinity descriptor to tell virtio_find_vqs()
> > to spread MSI interrupts across CPUs:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/block/virtio_blk.c#n609
> >
> > The core blk-mq code has the blk_mq_virtio_map_queues() function to map
> > block layer queues to virtqueues:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/block/blk-mq-virtio.c#n24
> >
> > virtio_net.ko manually sets virtqueue affinity:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/virtio_net.c#n2283
> >
> > virtio_net.ko tells the core net subsystem about queues using
> > netif_set_real_num_tx_queues() and then skbs are mapped to queues by
> > common code:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/core/dev.c#n4079
>
> Thanks for the pointers. :)
>
> Thanks,
> Peter-Jan
>