kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: Amit Shah <amit.shah@redhat.com>
Cc: avi@redhat.com, kvm@vger.kernel.org, muli@il.ibm.com,
	benami@il.ibm.com, weidong.han@intel.com, allen.m.kay@intel.com
Subject: Re: [PATCH 5/7] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
Date: Wed, 24 Sep 2008 10:07:24 -0500	[thread overview]
Message-ID: <48DA57AC.1020903@codemonkey.ws> (raw)
In-Reply-To: <200809241028.09067.amit.shah@redhat.com>

Amit Shah wrote:
> * On Tuesday 23 Sep 2008 22:00:32 Anthony Liguori wrote:
>   
>> Amit Shah wrote:
>>     
>
>   
>>> diff --git a/qemu/Makefile.target b/qemu/Makefile.target
>>> index 72f3db8..40eb273 100644
>>> --- a/qemu/Makefile.target
>>> +++ b/qemu/Makefile.target
>>> @@ -616,6 +616,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o
>>>  OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o
>>>  OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o
>>>  OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o
>>> +OBJS+= device-assignment.o
>>>       
>> This needs to be conditional on at least linux hosts, but probably also
>> kvm support.
>>     
>
> I didn't see any other file that's doing it. So I added this conditional in 
> vl.c by having a #if defined(__linux__). That's how usb-linux.c does it as 
> well. Is there a better way?
>   

aio and compatfd currently do it this way.  block-raw-win32 and 
block-raw-posix are this way.  We're slowly moving things away from 
#ifdef #else #endif to conditional compilation.

> Not the whole functionality needs kvm support. This should be able to work 
> even without kvm (for example, when the guest is 1:1 mapped in the host 
> address space).
>   

KVM is needed for interrupt remapping though.  That's something I don't 
see happening for normal userspace any time soon.

>>> +	/* FIXME: Add support for emulated MMIO for non-kvm guests */
>>> +	if (kvm_enabled()) {
>>>       
>> This doesn't work at all if kvm isn't enabled right?  You should
>> probably bail out in the init if kvm isn't enabled.  If this whole file
>> is included conditionally based on KVM support, then you don't have to
>> worry about using kvm_enabled() guards to conditionally compile out code.
>>     
>
> Non-kvm support is currently broken and should be fixed, but that can happen 
> after we get this merged.
>   

But it would take bouncing interrupts to userspace?  I don't think that 
will ever happen upstream personally.  At any rate, there's no point in 
even trying to support something like that until progress is made 
upstream on this front.

> I can temporarily add a check for kvm_enabled and bail out.
>
>   
>>> +	sprintf(dir, "/sys/bus/pci/devices/0000:%02x:%02x.%x/",
>>> +		r_bus, r_dev, r_func);
>>>       
>> snprintf()
>>     
>
> It's guarded by the %02x modifiers; so this doesn't depend on user input.
>   

strcpy or sprintf should never be used.  It doesn't matter if it's safe 
in a particular instance.  There are safer functions to use (like snprintf).

All it takes is for someone to come along and change the /sys/bus path 
to be larger without adjusting the buffer size and everything goes to 
hell.  It's inherently brittle.

>>> +	fprintf(stderr, "Registered host PCI device %02x:%02x.%1x "
>>> +		"(\"%s\") as guest device %02x:%02x.%1x\n",
>>> +		r_bus, r_dev, r_func, e_dev_name,
>>> +		pci_bus_num(e_bus), e_device, r_func);
>>>       
>> Please don't fprintf() unconditionally.
>>     
>
> OK; however, a vmdk file open does that so I though it was alright to do it.
>   

I obviously don't use vmdk or else I would have removed that by now :-)

>> A lot more checks are needed here to see if things can succeed.  We
>> definitely should bail out if they can't.
>>     
>
> Bailing out is done in the out: label below. What else do  you think can fail? 
> I've taken care of all the cases that do fail IMO.
>
>   
>>> +	return pci_dev;
>>> +out:
>>> +	pci_unregister_device(&pci_dev->dev);
>>> +	return NULL;
>>> +}
>>>       
>
>   
>>> +/*
>>> + * Syntax to assign device:
>>> + *
>>> + * -pcidevice dev=bus:dev.func,dma=dma
>>> + *
>>> + * Example:
>>> + * -pcidevice host=00:13.0,dma=pvdma
>>> + *
>>> + * dma can currently only be 'none' to disable iommu support.
>>>       
>> Does it actually work if you disable iommu support?
>>     
>
> If the guest is 1:1 mapped.
>   

You mean with Andrea's reserved ram patches?

>>> +#include <sys/mman.h>
>>>       
>> Don't think this is needed here.
>>     
>
> We use mmap(), so this is needed.
>   

Ah.

>>> +    /* Initialize assigned devices */
>>> +    if (pci_enabled) {
>>> +        int r = -1;
>>> +        do {
>>> +            init_assigned_device(pci_bus, &r);
>>>       
>> Why pass r by reference instead of just returning it?  At any rate, you
>> should detect when this fails and gracefully terminate QEMU.
>>     
>
> 'r' is the count of the number of assigned devices -- mostly needed because we 
> have the data stored in an array. If we migrate to a list, this can be 
> relaxed.
>
> ATM, I start the guest without assigning the device. I haven't figured out a 
> way to gracefully terminate qemu yet.
>   

In the case of hot plug, you fail the hot plug.  If you start with 
device assignment, just doing an "exit" would be sufficient.

>>> +#if defined(TARGET_I386) || defined(TARGET_X86_64) || defined(__linux__)
>>> +	    case QEMU_OPTION_pcidevice:
>>> +		add_assigned_device(optarg);
>>>       
>> You should copy into an array, then in pc.c, iterate through the array
>> and call into add_assigned_device.
>>     
>
> Is there any benefit in doing this? We're moving the iterate out of vl.c to 
> pc.c and both will happen at the same time.
>   

It's how everything else works.

Regards,

Anthony Liguori

>> Regards,
>>
>> Anthony Liguori
>>     
>
> Thanks!
>
> Amit.
>   


  reply	other threads:[~2008-09-24 15:08 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-23 14:54 [V6] Userspace patches for PCI device assignment Amit Shah
2008-09-23 14:54 ` [PATCH 1/7] KVM/userspace: Device Assignment: Add ioctl wrappers needed for assigning devices Amit Shah
2008-09-23 14:54   ` [PATCH 2/7] qemu: Introduce pci_map_irq to get irq nr from pin number for a PCI device Amit Shah
2008-09-23 14:54     ` [PATCH 3/7] qemu: piix: Introduce functions to get pin number from irq and vice versa Amit Shah
2008-09-23 14:54       ` [PATCH 4/7] qemu: Include hw.h in qemu/hw/isa.h to fix compile issues Amit Shah
2008-09-23 14:54         ` [PATCH 5/7] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests Amit Shah
2008-09-23 14:54           ` [PATCH 6/7] KVM/userspace: Build vtd.c for Intel IOMMU support Amit Shah
2008-09-23 14:54             ` [PATCH 7/7] KVM/userspace: Device Assignment: Support for hot plugging PCI devices Amit Shah
2008-09-23 16:32               ` Anthony Liguori
2008-09-24  4:24                 ` Amit Shah
2008-09-23 16:31             ` [PATCH 6/7] KVM/userspace: Build vtd.c for Intel IOMMU support Anthony Liguori
2008-09-24  4:25               ` Amit Shah
2008-09-24 15:08                 ` Anthony Liguori
2008-09-23 16:30           ` [PATCH 5/7] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests Anthony Liguori
2008-09-23 18:32             ` Muli Ben-Yehuda
2008-09-23 19:18               ` Anthony Liguori
2008-09-23 19:43                 ` Muli Ben-Yehuda
2008-09-23 19:58                   ` Anthony Liguori
2008-09-24  4:58             ` Amit Shah
2008-09-24 15:07               ` Anthony Liguori [this message]
2008-09-24 17:10                 ` Amit Shah
2008-09-25  4:54           ` Yang, Sheng
2008-09-25  5:20             ` Yang, Sheng
2008-09-26  1:34           ` Yang, Sheng
2008-09-23 16:13         ` [PATCH 4/7] qemu: Include hw.h in qemu/hw/isa.h to fix compile issues Anthony Liguori
2008-09-24  4:27           ` Amit Shah
2008-09-24 11:35             ` Amit Shah
2008-09-24 14:59             ` Anthony Liguori
2008-09-23 16:13       ` [PATCH 3/7] qemu: piix: Introduce functions to get pin number from irq and vice versa Anthony Liguori
2008-09-24  4:28         ` Amit Shah
2008-09-23 16:12     ` [PATCH 2/7] qemu: Introduce pci_map_irq to get irq nr from pin number for a PCI device Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48DA57AC.1020903@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=allen.m.kay@intel.com \
    --cc=amit.shah@redhat.com \
    --cc=avi@redhat.com \
    --cc=benami@il.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=muli@il.ibm.com \
    --cc=weidong.han@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).