From mboxrd@z Thu Jan 1 00:00:00 1970
From: Rusty Russell
Subject: Re: [PATCH 2/3] virtio: indirect ring entries (VIRTIO_RING_F_INDIRECT_DESC)
Date: Mon, 4 May 2009 11:49:00 +0930
Message-ID: <200905041149.00724.rusty@rustcorp.com.au>
References: <1229620222-22216-1-git-send-email-markmc@redhat.com> <1240318745.443.42.camel@blaa> <49F56239.9010509@redhat.com>
In-Reply-To: <49F56239.9010509@redhat.com>
To: dlaor@redhat.com
Cc: Mark McLoughlin, Dor Laor, virtualization@lists.linux-foundation.org, Avi Kivity, netdev@vger.kernel.org
List-Id: virtualization@lists.linuxfoundation.org

On Mon, 27 Apr 2009 05:13:53 pm Dor Laor wrote:
> Mark McLoughlin wrote:
> > Hi Rusty,
> >
> > On Thu, 2008-12-18 at 17:10 +0000, Mark McLoughlin wrote:
> >
> >> Add a new feature flag for indirect ring entries. These are ring
> >> entries which point to a table of buffer descriptors.
> >>
> >> The idea here is to increase the ring capacity by allowing a larger
> >> effective ring size whereby the ring size dictates the number of
> >> requests that may be outstanding, rather than the size of those
> >> requests.

OK, just so we track our mistakes:

1) virtio_rings must be physically contiguous, even though they actually
   have two independent parts.
2) The number of elements in a ring must be a power of 2.
3) virtio_pci tells the guest what number of elements to use.
4) The guest has to allocate that much physically contiguous memory, or
   fail.

In practice, 128 elements = 2 pages, 256 elements = 3 pages, 512
elements = 5 pages. That's order 1, order 2 and order 3 allocations
under Linux: order 1 is OK, order 2 is iffy, order 3 is hard.
Blocked from doing the simpler thing, we've decided to go with a layer
of indirection. But the patch is simple and clean, so there's nothing
fundamental to object to.

I can't find 3/3, did it go missing?

Thanks,
Rusty.

> >> This should be most effective in the case of block I/O, where we
> >> can potentially benefit by concurrently dispatching a large number
> >> of large requests. Even in the simple case of single-segment block
> >> requests, this results in a threefold increase in ring capacity.
> >
> > Apparently, this would also be useful for the Windows virtio-net
> > drivers.
> >
> > Dor can explain further, but apparently Windows has been observed
> > passing the driver a packet with >256 fragments when using TSO.
> >
> > With a ring size of 256, the guest can either drop the packet or
> > copy it into a single buffer. We'd much rather be able to use an
> > indirect ring entry to pass this number of fragments without
> > copying.
>
> Correct. This is what we do in Windows today. The problem arises when
> sending lots of small packets from the Windows guest with TSO:
> Windows prepares a very big scatter-gather list, bigger than the ring
> size (270 fragments). Having indirect ring entries is good both for
> this and for block I/O, as described above.
>
> Cheers,
> Dor
>
> > For reference, the original patch was here:
> >
> > http://lkml.org/lkml/2008/12/18/212
> >
> > Cheers,
> > Mark.
> >
> > _______________________________________________
> > Virtualization mailing list
> > Virtualization@lists.linux-foundation.org
> > https://lists.linux-foundation.org/mailman/listinfo/virtualization