From: Anthony Liguori
Subject: Re: [PATCH 3/3] virtio PCI device
Date: Tue, 20 Nov 2007 16:16:43 -0600
Message-ID: <47435CCB.1050506@us.ibm.com>
In-Reply-To: <4743076F.8000105-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
References: <11944899922822-git-send-email-aliguori@us.ibm.com>
 <11944900141678-git-send-email-aliguori@us.ibm.com>
 <11944900152750-git-send-email-aliguori@us.ibm.com>
 <11944900163817-git-send-email-aliguori@us.ibm.com>
 <4742F6B7.20503@qumranet.com> <474300AD.4060509@us.ibm.com>
 <4743076F.8000105@qumranet.com>
To: Avi Kivity
Cc: kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org,
 virtualization-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
 linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
 Eric Van Hensbergen

Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>
>>> Anthony Liguori wrote:
>>>
>>>> This is a PCI device that implements a transport for virtio.  It
>>>> allows virtio devices to be used by QEMU based VMMs like KVM or Xen.
>>>>
>>>> +
>>>> +/* the notify function used when creating a virt queue */
>>>> +static void vp_notify(struct virtqueue *vq)
>>>> +{
>>>> +	struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
>>>> +	struct virtio_pci_vq_info *info = vq->priv;
>>>> +
>>>> +	/* we write the queue's selector into the notification register to
>>>> +	 * signal the other end */
>>>> +	iowrite16(info->queue_index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY);
>>>> +}
>>>>
>>> This means we can't kick multiple queues with one exit.
>>>
>>
>> There is no interface in virtio currently to batch multiple queue
>> notifications, so the only way one could do this AFAICT is to use a
>> timer to delay the notifications.  Were you thinking of something else?
>>
>
> No.  We can change virtio though, so let's have a flexible ABI.

Well, please propose the virtio API first and then I'll adjust the PCI
ABI.  I don't want to build things into the ABI that we never actually
end up using in virtio :-)

>>> I'd also like to see a hypercall-capable version of this (but that
>>> can wait).
>>>
>>
>> That can be a different device.
>>
>
> That means the user has to select which device to expose.  With
> feature bits, the hypervisor advertises both pio and hypercalls, the
> guest picks whatever it wants.

I was thinking more along the lines that a hypercall-based device would
certainly be implemented in-kernel, whereas the current device is
naturally implemented in userspace.  We can simply use a different
device for in-kernel drivers than for userspace drivers.  There's no
point at all in doing a hypercall-based userspace device IMHO.

>> I don't think so.  A vmexit is required to lower the IRQ line.  It
>> may be possible to do something clever like set a shared memory value
>> that's checked on every vmexit.  I think it's very unlikely that it's
>> worth it though.
>>
>
> Why so unlikely?  Not all workloads will have good batching.

It's pretty invasive.  I think a more paravirt device that expected an
edge-triggered interrupt would be a better solution for those types of
devices.
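To come back to the kick batching point above for a moment: one purely
hypothetical shape for such an interface would be a register that takes
a bitmap of pending queues, so several rings can be kicked with a single
write (and therefore a single exit).  The VIRTIO_PCI_QUEUE_NOTIFY_BATCH
name below is invented for illustration only; it is not part of this
patch or of virtio today.

/* hypothetical sketch, not in this patch: kick every queue whose bit is
 * set in 'pending' with one register write, i.e. one exit */
static void vp_notify_batched(struct virtio_pci_device *vp_dev, u32 pending)
{
	iowrite32(pending, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NOTIFY_BATCH);
}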
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +/* the config->find_vq() implementation */
>>>> +static struct virtqueue *vp_find_vq(struct virtio_device *vdev, unsigned index,
>>>> +				    bool (*callback)(struct virtqueue *vq))
>>>> +{
>>>> +	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>>>> +	struct virtio_pci_vq_info *info;
>>>> +	struct virtqueue *vq;
>>>> +	int err;
>>>> +	u16 num;
>>>> +
>>>> +	/* Select the queue we're interested in */
>>>> +	iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
>>>>
>>> I would really like to see this implemented as pci config space,
>>> with no tricks like multiplexing several virtqueues on one register.
>>> Something like the PCI BARs where you have all the register numbers
>>> allocated statically to queues.
>>>
>>
>> My first implementation did that.  I switched to using a selector
>> because it reduces the amount of PCI config space used and does not
>> limit the number of queues defined by the ABI as much.
>>
>
> But... it's tricky, and it's nonstandard.  With pci config, you can do
> live migration by shipping the pci config space to the other side.
> With the special iospace, you need to encode/decode it.

None of the PCI devices currently work like that in QEMU.  It would be
very hard to make a device that worked this way because the order in
which values are written matters a whole lot.  For instance, if you
wrote the status register before the queue information, the driver
could get into a funky state.

We'll still need save/restore routines for virtio devices.  I don't
really see this as a problem since we do this for every other device.

> Not much of an argument, I know.
>
> wrt. number of queues, 8 queues will consume 32 bytes of pci space if
> all you store is the ring pfn.

You also at least need a num (ring size) register per queue, which
takes you to 48 bytes, or 64 if you pad each queue's registers out to a
nicer alignment instead of packing them oddly.

8 queues may not be enough either.  Eric and I have discussed whether
the 9p virtio device should support multiple mounts per virtio device
and, if so, whether each one should have its own queue.  Any device
that supports this sort of multiplexing will very quickly start using a
lot of queues.

I think most types of hardware have some notion of a selector or mode.
Take a look at the LSI adapter or even VGA.

Regards,

Anthony Liguori
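P.S.  For anyone skimming the thread, the select-then-access pattern
being debated looks roughly like the sketch below from the driver side.
Only VIRTIO_PCI_QUEUE_SEL appears in the hunk quoted above; the
VIRTIO_PCI_QUEUE_NUM offset and the helper itself are assumed here just
for illustration.

/* illustration of the selector pattern: all per-queue registers sit
 * behind one shared window, so touching queue 'index' is always a
 * select-then-access sequence (VIRTIO_PCI_QUEUE_NUM is assumed) */
static u16 vp_queue_size(struct virtio_pci_device *vp_dev, unsigned index)
{
	/* point the shared window at the queue we care about */
	iowrite16(index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
	/* the num register now reports that queue's ring size */
	return ioread16(vp_dev->ioaddr + VIRTIO_PCI_QUEUE_NUM);
}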