From: George Dunlap <george.dunlap@eu.citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
Ian Jackson <ian.jackson@eu.citrix.com>,
Ian Campbell <Ian.Campbell@citrix.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: RFC: Automatically making a PCI device assignable in the config file
Date: Tue, 9 Jul 2013 17:38:08 +0100
Message-ID: <51DC3C70.2010605@eu.citrix.com>
In-Reply-To: <20130709142527.GD24897@phenom.dumpdata.com>
On 07/09/2013 03:25 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 09, 2013 at 01:52:38PM +0100, George Dunlap wrote:
>> On 07/08/2013 08:23 PM, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jul 05, 2013 at 02:52:08PM +0100, George Dunlap wrote:
>>>> On 05/07/13 14:48, Andrew Cooper wrote:
>>>>> On 05/07/13 14:45, George Dunlap wrote:
>>>>>> On 05/07/13 14:39, Andrew Cooper wrote:
>>>>>>> On 05/07/13 12:01, George Dunlap wrote:
>>>>>>>> I've been doing some work to try to make driver domains easier to set
>>>>>>>> up and use. At the moment, in order to pass a device through to a
>>>>>>>> guest, you first need to assign it to pciback. This involves doing
>>>>>>>> one of three things:
>>>>>>>> * Running xl pci-assignable-add for the device
>>>>>>>> * Specifying the device to be grabbed on the dom0 Linux command-line
>>>>>>>> * Doing some hackery in /etc/modules.d
>>>>>>>>
>>>>>>>> None of these are very satisfying. What I think would be better is if
>>>>>>>> there was a way to specify in the guest config file, "If device X is
>>>>>>>> not assignable, try to make it assignable". That way you can have a
>>>>>>>> driver domain grab the appropriate device just by running "xl create
>>>>>>>> domnet"; and once we have the xendomains script up and running with
>>>>>>>> xl, you can simply configure your domnet appropriately, and then put
>>>>>>>> it in /etc/xen/auto, to be started automatically on boot.
>>>>>>>>
>>>>>>>> My initial idea was to add a parameter to the pci argument in the
>>>>>>>> config file; for example:
>>>>>>>>
>>>>>>>> pci = ['08:04.1,permissive=1,seize=1']
>>>>>>>>
>>>>>>>> The 'seize=1' would indicate that if bdf 08:04.1 is not already
>>>>>>>> assignable, xl should try to make it assignable.
>>>>>>>>
>>>>>>>> The problem here is that this would need to be parsed by
>>>>>>>> xlu_pci_parse_bdf(), which only takes an argument of type
>>>>>>>> libxl_device_pci.
>>>>>>>>
>>>>>>>> Now it seems to me that the right place to do this "seizing" is in xl,
>>>>>>>> not inside libxl -- the functions for doing assignment exist already,
>>>>>>>> and are simple and straightforward. But doing it in xl, but as a
>>>>>>>> parameter of the "pci" setting, means changing xlu_pci_parse_bdf() to
>>>>>>>> pass something else back, which begins to get awkward.
>>>>>>>>
>>>>>>>> So it seems to me we have a couple of options:
>>>>>>>> 1. Create a new argument, "pci_seize" or something like that, which
>>>>>>>> would be processed separately from pci
>>>>>>>> 2. Change xlu_pci_parse_bdf to take a pointer to an extra struct, for
>>>>>>>> arguments directed at xl rather than libxl
>>>>>>>> 3. Add "seize" to libxl_device_pci, but have it only used by xl
>>>>>>>> 4. Add "seize" to libxl_device_pci, and have libxl do the seizing.
>>>>>>>>
>>>>>>>> Any preference -- or any other ideas?
>>>>>>>>
>>>>>>>> -George
>>>>>>> How about a setting in xl.conf of "auto-seize pci devices" ? That way
>>>>>>> the seizing is entirely part of xl
>>>>>> Auto-seizing is fairly dangerous; you could easily accidentally yank
>>>>>> out the ethernet card, or even the disk that dom0 is using. I really
>>>>>> think it should have to be enabled on a device-by-device basis.
>>>>>>
>>>>>> I suppose another option would be to be able to set, in xl.conf, a
>>>>>> list of auto-seizeable devices. I don't really like that option as
>>>>>> well, though. I'd rather be able to keep all the configuration in one
>>>>>> place.
>>>>>>
>>>>>> -George
>>>>> Or a slightly less extreme version.
>>>>>
>>>>> If xl sees that it would need to seize a device, it could ask "You are
>>>>> trying to create a domain with device $FOO. Would you like to seize it
>>>>> from dom0?"
>>>>
>>>> That won't work for driver domains, as we want it all to happen
>>>> automatically when the host is booting. :-)
>>>
>>> The high-level goal is that we want to put the network devices with a
>>> network backend and storage devices with a storage backend. Ignoring
>>> that for network devices you might want separate backends for each
>>> device (say one backend for Wireless, one for Ethernet, etc).
>>>
>>> Perhaps the logic ought to do grouping - so you say:
>>> a) "backends:all-network" (which would create one backend with all of the
>>> wireless, ethernet, etc PCI devices), or
>>> b) "backends:all-network,seperate-storage", which would create one backend
>>> domain with all of the wireless and ethernet devices, and one backend domain
>>> for each storage device?
>>>
>>> Naturally the user gets to choose which grouping they would like?
>>
>> We seem to be talking about different things. You seem to be
>> talking about automatically starting some pre-made VMs and assigning
>> devices and backends to them? But I'm not really sure.
>
> I am trying to look at it from a high perspective to see whether we can
> make this automated for 99% of people out of the box. Hence the
> idea of grouping. And yes to '..assigning devices and backends to them'.
>>
>> I was assuming that the user was going to be installing and
>> configuring their own driver domains. The user already has to
>> specify "pci=['$BDF']" in their config file to get specific devices
>> passed through -- this would just be making it easy to have the
>> device assigned to pciback as well.
>
> I think the technical bits of what libxl is doing, yanking devices
> around, are driven either by the admin or by a policy. If the policy
> is this idea of grouping (that is a terrible name now that I think
> of it), then perhaps we should think about how to make that work, and then
> the details (such as this automatic yanking of devices to pci-back)
> can be filled in.
>
>
>>
>> I suspect that a lot of people will want to have one network card
>> assigned to domain 0 as a "management network", and only have other
>> devices assigned to driver domains. I think that having one device
>> per domain is probably the best recommendation; although we
>> obviously want to support someone who wants a single "manage all the
>> devices" domain, we should assume that people are going to have one
>> device per driver domain.
>
> I don't know. My feeble idea was that we would have at minimum _two_
> guests on bootup. One is a control one that has no devices - but is
> the one that launches the guests.
>
> Then there is the dom1 which would have all (or some) of the storage
> and network devices plugged in along with the backends. Then a dom2
> which would be the old-style-dom0 - so it would have the graphic card
> and the rest of the PCI devices.
>
> In other words, when I boot I would have two tiny domains launch
> right before "old-style-dom0" is started. But I am getting into specifics
> here.
>
> Perhaps you could explain to me how you envisioned how the device
> driver domains idea would work? How would you want it to work on your
> laptop?
>
> Or are we right now just thinking of the small pieces of making the
> code be able to yank the devices around and assign them?
I was thinking for now of just making the "manually configure it" case
easier. I decided to switch one of my test boxen to using a network
driver domain by default, and although the core is there, there are a
bunch of things that are unnecessarily crufty.

I do agree that long term it would be nice to make it easy to make
driver domains the default, but that's not what I had in mind for this
conversation. :-)

The hard part of making it really automated, it seems to me, comes from
two things.

One, you have to make sure your driver domain has the appropriate
hardware drivers for your system as well. We don't want to be in the
business of maintaining a distro; most people will probably want the
driver domain to be from the same distro they're using for dom0, which
means that setting up such a domain will need to be done differently on
a distro-by-distro basis.

Two, you have the configuration problem. In Debian, for instance, if
you wanted to switch a device from being owned by dom0 to being in a
driver domain, you'd have to:
* Copy over the udev rules recognizing the MAC address, so it gets the
same ethN
* Copy over the eth and bridge info from dom0's /etc/network/interfaces
into the guest's /etc/network/interfaces

I'm not sure exactly what you have to do in Fedora, but I bet it's
something similar.

It might be nice to work with distros to make the process of making
driver domains / stub domains easier, and to make it easy to configure
driver domain networking options from the distro's network scripts; but
that's kind of another level of functionality.

I think first things first: make manually-set-up driver domains actually
easy to use.
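
To make the "manual" case concrete, here is a rough sketch (in Python, not
the actual xl/libxl code) of option 2 from the start of the thread: split
each pci entry into the options meant for libxl and the ones meant only for
xl, then work out which devices would need seizing. The names
split_pci_spec, devices_to_seize and XL_ONLY_OPTIONS are made up for
illustration, "seize" is only the proposed option, and the sketch handles
just the plain [DDDD:]BB:DD.F form of a BDF.

```python
import re

# Hypothetical: options that xl itself would consume. "seize" is the option
# proposed in this thread, not an existing one. Everything else is treated
# as an option to pass on to libxl.
XL_ONLY_OPTIONS = {"seize"}

# Plain [DDDD:]BB:DD.F form only; other BDF syntaxes are not handled here.
BDF_RE = re.compile(
    r"^(?:[0-9a-fA-F]{1,4}:)?[0-9a-fA-F]{1,2}:[0-9a-fA-F]{1,2}\.[0-7]$"
)


def split_pci_spec(spec):
    """Split 'BDF,key=val,...' into (bdf, libxl_opts, xl_opts)."""
    bdf, *pairs = [part.strip() for part in spec.split(",")]
    if not BDF_RE.match(bdf):
        raise ValueError(f"not a PCI BDF: {bdf!r}")
    libxl_opts, xl_opts = {}, {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        # Route each option to whichever layer is assumed to own it.
        target = xl_opts if key in XL_ONLY_OPTIONS else libxl_opts
        target[key] = value
    return bdf, libxl_opts, xl_opts


def devices_to_seize(pci_list, assignable):
    """Return BDFs marked seize=1 that are not already assignable."""
    todo = []
    for spec in pci_list:
        bdf, _, xl_opts = split_pci_spec(spec)
        if xl_opts.get("seize") == "1" and bdf not in assignable:
            todo.append(bdf)
    return todo
```

With pci = ['08:04.1,permissive=1,seize=1'] and nothing assignable yet,
devices_to_seize() returns ['08:04.1']; xl would then do the equivalent of
"xl pci-assignable-add" for each entry before creating the domain. Devices
without seize=1 are never touched, which keeps the decision per-device.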
-George