From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: RFC: Automatically making a PCI device assignable in the config file
Date: Tue, 9 Jul 2013 17:38:08 +0100
Message-ID: <51DC3C70.2010605@eu.citrix.com>
References: <51D6CC76.4040906@citrix.com> <51D6CDE2.90808@eu.citrix.com> <51D6CEBE.4040101@citrix.com> <51D6CF88.4010203@eu.citrix.com> <20130708192336.GB4927@phenom.dumpdata.com> <51DC0796.3050206@eu.citrix.com> <20130709142527.GD24897@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20130709142527.GD24897@phenom.dumpdata.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Konrad Rzeszutek Wilk
Cc: Andrew Cooper, Ian Jackson, Ian Campbell, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 07/09/2013 03:25 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 09, 2013 at 01:52:38PM +0100, George Dunlap wrote:
>> On 07/08/2013 08:23 PM, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jul 05, 2013 at 02:52:08PM +0100, George Dunlap wrote:
>>>> On 05/07/13 14:48, Andrew Cooper wrote:
>>>>> On 05/07/13 14:45, George Dunlap wrote:
>>>>>> On 05/07/13 14:39, Andrew Cooper wrote:
>>>>>>> On 05/07/13 12:01, George Dunlap wrote:
>>>>>>>> I've been doing some work to try to make driver domains easier to set
>>>>>>>> up and use. At the moment, in order to pass a device through to a
>>>>>>>> guest, you first need to assign it to pciback. This involves doing
>>>>>>>> one of three things:
>>>>>>>> * Running xl pci-assignable-add for the device
>>>>>>>> * Specifying the device to be grabbed on the dom0 Linux command-line
>>>>>>>> * Doing some hackery in /etc/modules.d
>>>>>>>>
>>>>>>>> None of these are very satisfying.
>>>>>>>> What I think would be better is if there was a way to specify in the
>>>>>>>> guest config file, "If device X is not assignable, try to make it
>>>>>>>> assignable". That way you can have a driver domain grab the
>>>>>>>> appropriate device just by running "xl create domnet"; and once we
>>>>>>>> have the xendomains script up and running with xl, you can simply
>>>>>>>> configure your domnet appropriately, and then put it in
>>>>>>>> /etc/xen/auto, to be started automatically on boot.
>>>>>>>>
>>>>>>>> My initial idea was to add a parameter to the pci argument in the
>>>>>>>> config file; for example:
>>>>>>>>
>>>>>>>> pci = ['08:04.1,permissive=1,seize=1']
>>>>>>>>
>>>>>>>> The 'seize=1' would indicate that if bdf 08:04.1 is not already
>>>>>>>> assignable, xl should try to make it assignable.
>>>>>>>>
>>>>>>>> The problem here is that this would need to be parsed by
>>>>>>>> xlu_pci_parse_bdf(), which only takes an argument of type
>>>>>>>> libxl_device_pci.
>>>>>>>>
>>>>>>>> Now it seems to me that the right place to do this "seizing" is in xl,
>>>>>>>> not inside libxl -- the functions for doing assignment exist already,
>>>>>>>> and are simple and straightforward. But doing it in xl, but as a
>>>>>>>> parameter of the "pci" setting, means changing xlu_pci_parse_bdf() to
>>>>>>>> pass something else back, which begins to get awkward.
>>>>>>>>
>>>>>>>> So it seems to me we have a couple of options:
>>>>>>>> 1. Create a new argument, "pci_seize" or something like that, which
>>>>>>>> would be processed separately from pci
>>>>>>>> 2. Change xlu_pci_parse_bdf to take a pointer to an extra struct, for
>>>>>>>> arguments directed at xl rather than libxl
>>>>>>>> 3. Add "seize" to libxl_device_pci, but have it only used by xl
>>>>>>>> 4. Add "seize" to libxl_device_pci, and have libxl do the seizing.
>>>>>>>>
>>>>>>>> Any preference -- or any other ideas?
>>>>>>>>
>>>>>>>> -George
>>>>>>> How about a setting in xl.conf of "auto-seize pci devices" ?
>>>>>>> That way the seizing is entirely part of xl
>>>>>> Auto-seizing is fairly dangerous; you could easily accidentally yank
>>>>>> out the ethernet card, or even the disk that dom0 is using. I really
>>>>>> think it should have to be enabled on a device-by-device basis.
>>>>>>
>>>>>> I suppose another option would be to be able to set, in xl.conf, a
>>>>>> list of auto-seizeable devices. I don't really like that option as
>>>>>> well, though. I'd rather be able to keep all the configuration in one
>>>>>> place.
>>>>>>
>>>>>> -George
>>>>> Or a slightly less extreme version.
>>>>>
>>>>> If xl sees that it would need to seize a device, it could ask "You are
>>>>> trying to create a domain with device $FOO. Would you like to seize it
>>>>> from dom0 ?"
>>>>
>>>> That won't work for driver domains, as we want it all to happen
>>>> automatically when the host is booting. :-)
>>>
>>> The high-level goal is that we want to put the network devices with a
>>> network backend and storage devices with storage backend. Ignoring
>>> that for network devices you might want separate backends for each
>>> device (say one backend for Wireless, one for Ethernet, etc).
>>>
>>> Perhaps the logic ought to do grouping - so you say:
>>> a) "backends:all-network" (which would create one backend with all of the
>>> wireless, ethernet, etc PCI devices), or
>>> b) "backends:all-network,seperate-storage", which creates one backend with
>>> all of the wireless, ethernet in one backend; and one backend domain for each
>>> storage device?
>>>
>>> Naturally the user gets to choose which grouping they would like?
>>
>> We seem to be talking about different things. You seem to be
>> talking about automatically starting some pre-made VMs and assigning
>> devices and backends to them? But I'm not really sure.
>
> I am trying to look at it from a high perspective to see whether we can
> make this automated for 99% of people out of the box. Hence the
> idea of grouping.
> And yes to '..assigning devices and backends to them'.
>>
>> I was assuming that the user was going to be installing and
>> configuring their own driver domains. The user already has to
>> specify "pci=['$BDF']" in their config file to get specific devices
>> passed through -- this would just be making it easy to have the
>> device assigned to pciback as well.
>
> I think the technical bits of what libxl is doing and yanking devices
> around is driven either by the admin or a policy. If the policy
> is this idea of grouping (that is a terrible name now that I think
> of it), then perhaps we should think how to make that work and then
> the details (such as this automatic yanking of devices to pci-back)
> can be filled in.
>
>
>>
>> I suspect that a lot of people will want to have one network card
>> assigned to domain 0 as a "management network", and only have other
>> devices assigned to driver domains. I think that having one device
>> per domain is probably the best recommendation; although we
>> obviously want to support someone who wants a single "manage all the
>> devices" domain, we should assume that people are going to have one
>> device per driver domain.
>
> I don't know. My feeble idea was that we would have at minimum _two_
> guests on bootup. One is a control one that has no devices - but is
> the one that launches the guests.
>
> Then there is the dom1 which would have all (or some) of the storage
> and network devices plugged in along with the backends. Then a dom2
> which would be the old-style-dom0 - so it would have the graphics card
> and the rest of the PCI devices.
>
> In other words, when I boot I would have two tiny domains launch
> right before "old-style-dom0" is started. But I am getting into specifics
> here.
>
> Perhaps you could explain to me how you envisioned how the device
> driver domains idea would work? How would you want it to work on your
> laptop?
>
> Or are we right now just thinking of the small pieces of making the
> code be able to yank the devices around and assign them?

I was thinking for now of just making the "manually configure it" case
easier. I decided to switch one of my test boxen to using a network
driver domain by default, and although the core is there, there are a
bunch of things that are unnecessarily crufty.

I do agree that long term it would be nice to make it easy to make
driver domains the default, but that's not what I had in mind for this
conversation. :-)

The hard part for making it really automated, it seems to me, comes
from two things.

One, you have to make sure your driver domain has the appropriate
hardware drivers for your system as well. We don't want to be in the
business of maintaining a distro; most people will probably want the
driver domain to be from the same distro they're using for dom0, which
means that setting up such a domain will need to be done differently on
a distro-by-distro basis.

Two, you have the configuration problem. In Debian, for instance, if
you wanted to switch a device from being owned by dom0 to being in a
driver domain, you'd have to:
* Copy over the udev rules recognizing the MAC address, so it gets the
same ethN
* Copy over the eth and bridge info from dom0's /etc/network/interfaces
into the guest /etc/network/interfaces

I'm not sure exactly what you have to do in Fedora, but I bet it's
something similar.

It might be nice to work with distros to make the process of making
driver domains / stub domains easier, and to make it easy to configure
driver domain networking options from the distro's network scripts; but
that's kind of another level of functionality. I think first things
first: make manually-set-up driver domains actually easy to use.

 -George