Re: [kvm-devel] [Qemu-devel] [PATCH 3/6] virtio for QEMU

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Anthony Liguori <anthony@codemonkey.ws>
To: dor.laor@qumranet.com
Cc: kvm-devel@lists.sourceforge.net,
	Marcelo Tosatti <marcelo@kvack.org>,
	qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>
Subject: Re: [kvm-devel] [Qemu-devel] [PATCH 3/6] virtio for QEMU
Date: Sun, 30 Mar 2008 17:59:05 -0500	[thread overview]
Message-ID: <47F01B39.4040100@codemonkey.ws> (raw)
In-Reply-To: <1206897941.3357.15.camel@localhost.localdomain>

Dor Laor wrote:
> On Sat, 2008-03-29 at 16:55 -0500, Anthony Liguori wrote:
>   
>> This patch introduces virtio support over PCI.  virtio is a generic virtual IO
>> framework for Linux first introduced in 2.6.23.  Since 2.6.25, virtio has
>> supported a PCI transport which this patch implements.
>>
>> Since the last time these patches were posted to qemu-devel, I've reworked it
>> to use the proper access functions to manipulate guest memory.
>>
>> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>     
>
> It's will be great to drop the nasty hacks :)
> Do you still get 1G net performance using the extra copy from tap
> (memcpy_to_iovector)?
>   

We take a bit of a hit (probably 10-20%) doing copy.  The "tap hacks" 
require a more invasive set of patches to refactor the VLAN support in 
QEMU.  The fundamental problem with tap is that it only supports one tap 
device per VLAN.  What we really want is a VLAN were each VLAN client 
has it's own tap device.  This is also necessary to properly support the 
upcoming ring queue support for TAP along with GSO.

That patch set is why I revisited this one in fact :-)  Once we get 
virtio merged, I'll then send out the VLAN refactoring.  The nice thing 
though is that once we have the VLAN refactoring, we can optimize the 
e1000 device to make use of it.

> [snip]
>
>   
>> +static uint32_t vring_desc_len(VirtQueue *vq, unsigned int i)
>> +{
>>     
>
> Below there were place you did use offsetof(vq->vring.desc[i], len) so
> we better be consistent + its nicer
>
>   
>> +    return ldl_phys(vq->vring.desc + i * sizeof(VRingDesc) + 
>> +		    offsetof(VRingDesc, len));
>> +}
>>     

Yup, I just missed this one.  Thanks for the catch!

>> +VirtQueueElement *virtqueue_pop(VirtQueue *vq)
>> +{
>> +    unsigned int i, head;
>> +    unsigned int position;
>> +    VirtQueueElement *elem;
>> +
>> +    /* Check it isn't doing very strange things with descriptor numbers. */
>> +    if ((uint16_t)(vring_avail_idx(vq) - vq->last_avail_idx) > vq->vring.num)
>> +	errx(1, "Guest moved used index from %u to %u",
>> +	     vq->last_avail_idx, vring_avail_idx(vq));
>> +
>> +    /* If there's nothing new since last we looked, return invalid. */
>> +    if (vring_avail_idx(vq) == vq->last_avail_idx)
>> +	return NULL;
>> +
>> +    /* Grab the next descriptor number they're advertising, and increment
>> +     * the index we've seen. */
>> +    head = vring_avail_ring(vq, vq->last_avail_idx++ % vq->vring.num);
>> +
>> +    /* If their number is silly, that's a fatal mistake. */
>> +    if (head >= vq->vring.num)
>> +	errx(1, "Guest says index %u is available", head);
>> +
>> +    /* When we start there are none of either input nor output. */
>> +    position = 0;
>> +
>> +    elem = qemu_mallocz(sizeof(VirtQueueElement));
>> +
>> +    elem->phys_in = qemu_mallocz(sizeof(PhysIOVector) +
>> +				 vq->vring.num * sizeof(PhysIOVectorElement));
>> +    elem->phys_out = qemu_mallocz(sizeof(PhysIOVector) +
>> +				  vq->vring.num * sizeof(PhysIOVectorElement));
>>     
>
> I was wondering whether it can be optimized since vring.num is sometimes
> 512 so and we can either use a pool of these or calculate the vring.num
> from the descriptors but it seems like your way is the best.
>   

My thinking right now is to use qemu_mallocz() for everything and then 
we can go back and optimize with pooling if necessary.

>> +
>> +    i = head;
>> +    do {
>> +	PhysIOVectorElement *sge;
>> +
>> +	if (vring_desc_flags(vq, i) & VRING_DESC_F_WRITE)
>> +	    sge = &elem->phys_in->sg[elem->phys_in->num++];
>> +	else
>> +	    sge = &elem->phys_out->sg[elem->phys_out->num++];
>> +
>> +	/* Grab the first descriptor, and check it's OK. */
>> +	sge->len = vring_desc_len(vq, i);
>> +	sge->base = vring_desc_addr(vq, i);
>> +
>> +	/* If we've got too many, that implies a descriptor loop. */
>> +	if ((elem->phys_in->num + elem->phys_out->num) > vq->vring.num)
>> +	    errx(1, "Looped descriptor");
>> +    } while ((i = virtqueue_next_desc(vq, i)) != vq->vring.num);
>> +
>> +    elem->virt_in = pci_device_dma_map(&vq->vdev->pci_dev, elem->phys_in);
>> +    elem->virt_out = pci_device_dma_map(&vq->vdev->pci_dev, elem->phys_out);
>> +    elem->index = head;
>> +
>> +    if (elem->virt_in == NULL || elem->virt_out == NULL)
>> +	errx(1, "Bad DMA");
>> +
>> +    return elem;
>> +}
>> +
>> +
>>     
>
> The name below is a bit misleading since when enable is true you
> actually set no_notify.
> So I name it something like virtio_vring_set_no_notify(...) or similar.
>   

Yeah, that's not a bad suggestion.

Thanks,

Anthony Liguori

>> +void virtio_ring_set_used_notify(VirtQueue *vq, int enable)
>> +{
>> +    if (enable)
>> +	vring_used_set_flag(vq, VRING_USED_F_NO_NOTIFY);
>> +    else
>> +	vring_used_unset_flag(vq, VRING_USED_F_NO_NOTIFY);
>> +}
>> +
>>     
>
> Cheers,
> Dor
>
>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
> _______________________________________________
> kvm-devel mailing list
> kvm-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kvm-devel
>

next prev parent reply	other threads:[~2008-03-30 22:59 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-29 21:55 [Qemu-devel] [PATCH 1/6] Use ram_addr_t for cpu_get_physical_page_desc Anthony Liguori
2008-03-29 21:55 ` [Qemu-devel] [PATCH 2/6] PCI DMA API Anthony Liguori
2008-03-30  7:06   ` Blue Swirl
2008-03-30 14:44     ` Anthony Liguori
2008-03-30 14:49       ` Avi Kivity
2008-03-30 14:56         ` [kvm-devel] " Anthony Liguori
2008-03-30 14:58       ` Blue Swirl
2008-03-30 15:11         ` Anthony Liguori
2008-03-30 10:18   ` Paul Brook
2008-03-30 14:42     ` Anthony Liguori
2008-03-30 18:19       ` Paul Brook
2008-03-30 19:02         ` Anthony Liguori
2008-03-30 10:25   ` [Qemu-devel] Re: [kvm-devel] " Avi Kivity
2008-03-30 14:49     ` Anthony Liguori
2008-03-29 21:55 ` [Qemu-devel] [PATCH 3/6] virtio for QEMU Anthony Liguori
2008-03-30 17:25   ` Dor Laor
2008-03-30 22:59     ` Anthony Liguori [this message]
2008-04-05  3:09     ` Anthony Liguori
2008-03-29 21:55 ` [Qemu-devel] [PATCH 4/6] virtio network driver Anthony Liguori
2008-03-30 10:27   ` Paul Brook
2008-03-30 14:47     ` Anthony Liguori
2008-03-29 21:55 ` [Qemu-devel] [PATCH 5/6] virtio block driver Anthony Liguori
2008-03-29 21:56 ` [Qemu-devel] [PATCH 6/6] virtio balloon driver Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47F01B39.4040100@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=aurelien@aurel32.net \
    --cc=dor.laor@qumranet.com \
    --cc=kvm-devel@lists.sourceforge.net \
    --cc=marcelo@kvack.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).