All of lore.kernel.org
 help / color / mirror / Atom feed
From: Anthony Liguori <anthony@codemonkey.ws>
To: Amit Shah <amit.shah@redhat.com>
Cc: avi@redhat.com, kvm@vger.kernel.org, muli@il.ibm.com,
	benami@il.ibm.com, weidong.han@intel.com, allen.m.kay@intel.com
Subject: Re: [PATCH 5/7] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests
Date: Wed, 24 Sep 2008 10:07:24 -0500	[thread overview]
Message-ID: <48DA57AC.1020903@codemonkey.ws> (raw)
In-Reply-To: <200809241028.09067.amit.shah@redhat.com>

Amit Shah wrote:
> * On Tuesday 23 Sep 2008 22:00:32 Anthony Liguori wrote:
>   
>> Amit Shah wrote:
>>     
>
>   
>>> diff --git a/qemu/Makefile.target b/qemu/Makefile.target
>>> index 72f3db8..40eb273 100644
>>> --- a/qemu/Makefile.target
>>> +++ b/qemu/Makefile.target
>>> @@ -616,6 +616,7 @@ OBJS+= ide.o pckbd.o ps2.o vga.o $(SOUND_HW) dma.o
>>>  OBJS+= fdc.o mc146818rtc.o serial.o i8259.o i8254.o pcspk.o pc.o
>>>  OBJS+= cirrus_vga.o apic.o parallel.o acpi.o piix_pci.o
>>>  OBJS+= usb-uhci.o vmmouse.o vmport.o vmware_vga.o extboot.o
>>> +OBJS+= device-assignment.o
>>>       
>> This needs to be conditional on at least linux hosts, but probably also
>> kvm support.
>>     
>
> I didn't see any other file that's doing it. So I added this conditional in 
> vl.c by having a #if defined(__linux__). That's how usb-linux.c does it as 
> well. Is there a better way?
>   

aio and compatfd currently do it this way.  block-raw-win32 and 
block-raw-posix are this way.  We're slowly moving things away from 
#ifdef #else #endif to conditional compilation.

> Not the whole functionality needs kvm support. This should be able to work 
> even without kvm (for example, when the guest is 1:1 mapped in the host 
> address space).
>   

KVM is needed for interrupt remapping though.  That's something I don't 
see happening for normal userspace any time soon.

>>> +	/* FIXME: Add support for emulated MMIO for non-kvm guests */
>>> +	if (kvm_enabled()) {
>>>       
>> This doesn't work at all if kvm isn't enabled right?  You should
>> probably bail out in the init if kvm isn't enabled.  If this whole file
>> is included conditionally based on KVM support, then you don't have to
>> worry about using kvm_enabled() guards to conditionally compile out code.
>>     
>
> Non-kvm support is currently broken and should be fixed, but that can happen 
> after we get this merged.
>   

But it would take bouncing interrupts to userspace?  I don't think that 
will ever happen upstream personally.  At any rate, there's no point in 
even trying to support something like that until progress is made 
upstream on this front.

> I can temporarily add a check for kvm_enabled and bail out.
>
>   
>>> +	sprintf(dir, "/sys/bus/pci/devices/0000:%02x:%02x.%x/",
>>> +		r_bus, r_dev, r_func);
>>>       
>> snprintf()
>>     
>
> It's guarded by the %02x modifiers; so this doesn't depend on user input.
>   

strcpy or sprintf should never be used.  It doesn't matter if it's safe 
in a particular instance.  There are safer functions to use (like snprintf).

All it takes is for someone to come along and change the /sys/bus path 
to be larger without adjusting the buffer size and everything goes to 
hell.  It's inherently brittle.

>>> +	fprintf(stderr, "Registered host PCI device %02x:%02x.%1x "
>>> +		"(\"%s\") as guest device %02x:%02x.%1x\n",
>>> +		r_bus, r_dev, r_func, e_dev_name,
>>> +		pci_bus_num(e_bus), e_device, r_func);
>>>       
>> Please don't fprintf() unconditionally.
>>     
>
> OK; however, a vmdk file open does that so I though it was alright to do it.
>   

I obviously don't use vmdk or else I would have removed that by now :-)

>> A lot more checks are needed here to see if things can succeed.  We
>> definitely should bail out if they can't.
>>     
>
> Bailing out is done in the out: label below. What else do  you think can fail? 
> I've taken care of all the cases that do fail IMO.
>
>   
>>> +	return pci_dev;
>>> +out:
>>> +	pci_unregister_device(&pci_dev->dev);
>>> +	return NULL;
>>> +}
>>>       
>
>   
>>> +/*
>>> + * Syntax to assign device:
>>> + *
>>> + * -pcidevice dev=bus:dev.func,dma=dma
>>> + *
>>> + * Example:
>>> + * -pcidevice host=00:13.0,dma=pvdma
>>> + *
>>> + * dma can currently only be 'none' to disable iommu support.
>>>       
>> Does it actually work if you disable iommu support?
>>     
>
> If the guest is 1:1 mapped.
>   

You mean with Andrea's reserved ram patches?

>>> +#include <sys/mman.h>
>>>       
>> Don't think this is needed here.
>>     
>
> We use mmap(), so this is needed.
>   

Ah.

>>> +    /* Initialize assigned devices */
>>> +    if (pci_enabled) {
>>> +        int r = -1;
>>> +        do {
>>> +            init_assigned_device(pci_bus, &r);
>>>       
>> Why pass r by reference instead of just returning it?  At any rate, you
>> should detect when this fails and gracefully terminate QEMU.
>>     
>
> 'r' is the count of the number of assigned devices -- mostly needed because we 
> have the data stored in an array. If we migrate to a list, this can be 
> relaxed.
>
> ATM, I start the guest without assigning the device. I haven't figured out a 
> way to gracefully terminate qemu yet.
>   

In the case of hot plug, you fail the hot plug.  If you start with 
device assignment, just doing an "exit" would be sufficient.

>>> +#if defined(TARGET_I386) || defined(TARGET_X86_64) || defined(__linux__)
>>> +	    case QEMU_OPTION_pcidevice:
>>> +		add_assigned_device(optarg);
>>>       
>> You should copy into an array, then in pc.c, iterate through the array
>> and call into add_assigned_device.
>>     
>
> Is there any benefit in doing this? We're moving the iterate out of vl.c to 
> pc.c and both will happen at the same time.
>   

It's how everything else works.

Regards,

Anthony Liguori

>> Regards,
>>
>> Anthony Liguori
>>     
>
> Thanks!
>
> Amit.
>   


  reply	other threads:[~2008-09-24 15:08 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-09-23 14:54 [V6] Userspace patches for PCI device assignment Amit Shah
2008-09-23 14:54 ` [PATCH 1/7] KVM/userspace: Device Assignment: Add ioctl wrappers needed for assigning devices Amit Shah
2008-09-23 14:54   ` [PATCH 2/7] qemu: Introduce pci_map_irq to get irq nr from pin number for a PCI device Amit Shah
2008-09-23 14:54     ` [PATCH 3/7] qemu: piix: Introduce functions to get pin number from irq and vice versa Amit Shah
2008-09-23 14:54       ` [PATCH 4/7] qemu: Include hw.h in qemu/hw/isa.h to fix compile issues Amit Shah
2008-09-23 14:54         ` [PATCH 5/7] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests Amit Shah
2008-09-23 14:54           ` [PATCH 6/7] KVM/userspace: Build vtd.c for Intel IOMMU support Amit Shah
2008-09-23 14:54             ` [PATCH 7/7] KVM/userspace: Device Assignment: Support for hot plugging PCI devices Amit Shah
2008-09-23 16:32               ` Anthony Liguori
2008-09-24  4:24                 ` Amit Shah
2008-09-23 16:31             ` [PATCH 6/7] KVM/userspace: Build vtd.c for Intel IOMMU support Anthony Liguori
2008-09-24  4:25               ` Amit Shah
2008-09-24 15:08                 ` Anthony Liguori
2008-09-23 16:30           ` [PATCH 5/7] KVM/userspace: Device Assignment: Support for assigning PCI devices to guests Anthony Liguori
2008-09-23 18:32             ` Muli Ben-Yehuda
2008-09-23 19:18               ` Anthony Liguori
2008-09-23 19:43                 ` Muli Ben-Yehuda
2008-09-23 19:58                   ` Anthony Liguori
2008-09-24  4:58             ` Amit Shah
2008-09-24 15:07               ` Anthony Liguori [this message]
2008-09-24 17:10                 ` Amit Shah
2008-09-25  4:54           ` Yang, Sheng
2008-09-25  5:20             ` Yang, Sheng
2008-09-26  1:34           ` Yang, Sheng
2008-09-23 16:13         ` [PATCH 4/7] qemu: Include hw.h in qemu/hw/isa.h to fix compile issues Anthony Liguori
2008-09-24  4:27           ` Amit Shah
2008-09-24 11:35             ` Amit Shah
2008-09-24 14:59             ` Anthony Liguori
2008-09-23 16:13       ` [PATCH 3/7] qemu: piix: Introduce functions to get pin number from irq and vice versa Anthony Liguori
2008-09-24  4:28         ` Amit Shah
2008-09-23 16:12     ` [PATCH 2/7] qemu: Introduce pci_map_irq to get irq nr from pin number for a PCI device Anthony Liguori

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48DA57AC.1020903@codemonkey.ws \
    --to=anthony@codemonkey.ws \
    --cc=allen.m.kay@intel.com \
    --cc=amit.shah@redhat.com \
    --cc=avi@redhat.com \
    --cc=benami@il.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=muli@il.ibm.com \
    --cc=weidong.han@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.