xen-devel.lists.xenproject.org archive mirror
From: George Dunlap <george.dunlap@eu.citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Ian Jackson <ian.jackson@eu.citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: RFC: Automatically making a PCI device assignable in the config file
Date: Tue, 9 Jul 2013 17:38:08 +0100
Message-ID: <51DC3C70.2010605@eu.citrix.com>
In-Reply-To: <20130709142527.GD24897@phenom.dumpdata.com>

On 07/09/2013 03:25 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 09, 2013 at 01:52:38PM +0100, George Dunlap wrote:
>> On 07/08/2013 08:23 PM, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jul 05, 2013 at 02:52:08PM +0100, George Dunlap wrote:
>>>> On 05/07/13 14:48, Andrew Cooper wrote:
>>>>> On 05/07/13 14:45, George Dunlap wrote:
>>>>>> On 05/07/13 14:39, Andrew Cooper wrote:
>>>>>>> On 05/07/13 12:01, George Dunlap wrote:
>>>>>>>> I've been doing some work to try to make driver domains easier to set
>>>>>>>> up and use.  At the moment, in order to pass a device through to a
>>>>>>>> guest, you first need to assign it to pciback.  This involves doing
>>>>>>>> one of three things:
>>>>>>>> * Running xl pci-assignable-add for the device
>>>>>>>> * Specifying the device to be grabbed on the dom0 Linux command-line
>>>>>>>> * Doing some hackery in /etc/modules.d
>>>>>>>>
>>>>>>>> None of these are very satisfying.  What I think would be better is if
>>>>>>>> there was a way to specify in the guest config file, "If device X is
>>>>>>>> not assignable, try to make it assignable".  That way you can have a
>>>>>>>> driver domain grab the appropriate device just by running "xl create
>>>>>>>> domnet"; and once we have the xendomains script up and running with
>>>>>>>> xl, you can simply configure your domnet appropriately, and then put
>>>>>>>> it in /etc/xen/auto, to be started automatically on boot.
>>>>>>>>
>>>>>>>> My initial idea was to add a parameter to the pci argument in the
>>>>>>>> config file; for example:
>>>>>>>>
>>>>>>>> pci = ['08:04.1,permissive=1,seize=1']
>>>>>>>>
>>>>>>>> The 'seize=1' would indicate that if bdf 08:04.1 is not already
>>>>>>>> assignable, xl should try to make it assignable.
>>>>>>>>
>>>>>>>> The problem here is that this would need to be parsed by
>>>>>>>> xlu_pci_parse_bdf(), which only takes an argument of type
>>>>>>>> libxl_device_pci.
>>>>>>>>
>>>>>>>> Now it seems to me that the right place to do this "seizing" is in xl,
>>>>>>>> not inside libxl -- the functions for doing assignment exist already,
>>>>>>>> and are simple and straightforward.  But doing it in xl, but as a
>>>>>>>> parameter of the "pci" setting, means changing xlu_pci_parse_bdf() to
>>>>>>>> pass something else back, which begins to get awkward.
>>>>>>>>
>>>>>>>> So it seems to me we have a couple of options:
>>>>>>>> 1. Create a new argument, "pci_seize" or something like that, which
>>>>>>>> would be processed separately from pci
>>>>>>>> 2. Change xlu_pci_parse_bdf to take a pointer to an extra struct, for
>>>>>>>> arguments directed at xl rather than libxl
>>>>>>>> 3. Add "seize" to libxl_device_pci, but have it only used by xl
>>>>>>>> 4. Add "seize" to libxl_device_pci, and have libxl do the seizing.
>>>>>>>>
>>>>>>>> Any preference -- or any other ideas?
>>>>>>>>
>>>>>>>>    -George
>>>>>>> How about a setting in xl.conf of "auto-seize pci devices" ?  That way
>>>>>>> the seizing is entirely part of xl
>>>>>> Auto-seizing is fairly dangerous; you could easily accidentally yank
>>>>>> out the ethernet card, or even the disk that dom0 is using.  I really
>>>>>> think it should have to be enabled on a device-by-device basis.
>>>>>>
>>>>>> I suppose another option would be to be able to set, in xl.conf, a
>>>>>> list of auto-seizeable devices.  I don't really like that option as
>>>>>> well, though.  I'd rather be able to keep all the configuration in one
>>>>>> place.
>>>>>>
>>>>>>   -George
>>>>> Or a slightly less extreme version.
>>>>>
>>>>> If xl sees that it would need to seize a device, it could ask "You are
>>>>> trying to create a domain with device $FOO.  Would you like to seize it
>>>>> from dom0?"
>>>>
>>>> That won't work for driver domains, as we want it all to happen
>>>> automatically when the host is booting. :-)
>>>
>>> The high-level goal is that we want to put the network devices with a
>>> network backend and storage devices with storage backend. Ignoring
>>> that for network devices you might want separate backends for each
>>> device (say one backend for Wireless, one for Ethernet, etc).
>>>
>>> Perhaps the logic ought to do grouping - so you say:
>>>   a) "backends:all-network" (which would create one backend with all of the
>>>     wireless, ethernet, etc PCI devices), or
>>>   b) "backends:all-network,seperate-storage", which would create one backend
>>>    with all of the wireless and ethernet devices, and one backend domain for
>>>    each storage device?
>>>
>>> Naturally the user gets to choose which grouping they would like?
>>
>> We seem to be talking about different things.  You seem to be
>> talking about automatically starting some pre-made VMs and assigning
>> devices and backends to them?  But I'm not really sure.
>
> I am trying to look at it from a high perspective to see whether we can
> make this automated for 99% of people out of the box. Hence the
> idea of grouping. And yes to '..assigning devices and backends to them'.
>>
>> I was assuming that the user was going to be installing and
>> configuring their own driver domains.  The user already has to
>> specify "pci=['$BDF']" in their config file to get specific devices
>> passed through -- this would just be making it easy to have the
>> device assigned to pciback as well.
>
> I think the technical bits of what libxl is doing and yanking devices
> around is driven either by the admin or a policy. If the policy
> is this idea of grouping (that is a terrible name now that I think
> of it), then perhaps we should think how to make that work and then
> the details (such as this automatic yanking of devices to pci-back)
> can be filled in.
>
>
>>
>> I suspect that a lot of people will want to have one network card
>> assigned to domain 0 as a "management network", and only have other
>> devices assigned to driver domains.  I think that having one device
>> per domain is probably the best recommendation; although we
>> obviously want to support someone who wants a single "manage all the
>> devices" domain, we should assume that people are going to have one
>> device per driver domain.
>
> I don't know. My feeble idea was that we would have at minimum _two_
> guests on bootup. One is a control one that has no devices - but is
> the one that launches the guests.
>
> Then there is the dom1 which would have all (or some) of the storage
> and network devices plugged in along with the backends. Then a dom2
> which would be the old-style-dom0 - so it would have the graphic card
> and the rest of the PCI devices.
>
> In other words, when I boot I would have two tiny domains launch
> right before "old-style-dom0" is started. But I am getting into specifics
> here.
>
> Perhaps you could explain to me how you envisioned how the device
> driver domains idea would work? How would you want it to work on your
> laptop?
>
> Or are we right now just thinking of the small pieces of making the
> code be able to yank the devices around and assign them?

I was thinking for now just making the "manually configure it" case 
easier.  I decided to switch one of my test boxen to using a network 
driver domain by default, and although the core is there, there are a 
bunch of things that are unnecessarily crufty.

I do agree that long term it would be nice to make it easy to make 
driver domains the default, but that's not what I had in mind for this 
conversation. :-)

The hard part for making it really automated, it seems to me, comes from 
two things.

One, you have to make sure your driver domain has the appropriate 
hardware drivers for your system as well.  We don't want to be in the 
business of maintaining a distro; most people will probably want the 
driver domain to be from the same distro they're using for dom0, which 
means that setting up such a domain will need to be done differently on 
a distro-by-distro basis.

Two, you have the configuration problem.  In Debian, for instance, if 
you wanted to switch a device from being owned by dom0 to being in a 
driver domain, you'd have to:
* Copy over the udev rules recognizing the MAC address, so it gets the 
same ethN
* Copy over the eth and bridge info from dom0's /etc/network/interfaces 
into the guest's /etc/network/interfaces

I'm not sure exactly what you have to do in Fedora, but I bet it's 
something similar.
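
To make the second Debian step concrete, here is a rough sketch of pulling 
the relevant stanzas out of an interfaces(5)-style file so they can be 
pasted into the guest's copy.  This is only an illustration: the function 
name extract_stanzas and the interface names (eth0, xenbr0) are made up for 
the example, and the parsing is deliberately simplistic (it only looks at 
the first name on "auto"/"iface" lines).

```python
def extract_stanzas(interfaces_text, names):
    """Return the stanzas of an interfaces(5)-style file whose first
    interface name is in `names`.  Hypothetical helper, not part of any
    Xen or Debian tool."""
    # Keywords that begin a new stanza in /etc/network/interfaces.
    starters = ("iface", "auto", "allow-", "mapping", "source")
    stanzas, current = [], []
    for line in interfaces_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped.startswith(starters) and current:
            stanzas.append(current)
            current = []
        current.append(line)
    if current:
        stanzas.append(current)
    # Keep only stanzas whose second word is one of the wanted names.
    wanted = [s for s in stanzas
              if len(s[0].split()) > 1 and s[0].split()[1] in names]
    return "\n".join("\n".join(s) for s in wanted)


dom0_interfaces = """auto lo
iface lo inet loopback

auto eth0
iface eth0 inet manual

auto xenbr0
iface xenbr0 inet dhcp
    bridge_ports eth0
"""

guest_fragment = extract_stanzas(dom0_interfaces, {"eth0", "xenbr0"})
print(guest_fragment)
```

The output would still need to be reviewed by hand before going into the 
guest's /etc/network/interfaces; the udev rule from the first step is not 
covered here.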

It might be nice to work with distros to make the process of making 
driver domains / stub domains easier, and to make it easy to configure 
driver domain networking options from the distro's network scripts; but 
that's kind of another level of functionality.

I think first things first: make manually-set-up driver domains actually 
easy to use.
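
For reference, the per-device 'seize=1' idea quoted above boils down to 
roughly the following decision.  This is a sketch under stated assumptions, 
not xl or libxl code: the helper names are invented, the set of 
already-assignable devices is passed in by the caller (in practice it might 
come from the output of "xl pci-assignable-list"), and the parsing handles 
only the simple "BDF,key=value" form shown in the example, not the full pci 
syntax.

```python
def normalize_bdf(bdf):
    """Prefix the default PCI domain 0000 when the BDF omits it."""
    return bdf if bdf.count(":") == 2 else "0000:" + bdf


def parse_pci_spec(spec):
    """Split 'BDF,key=val,...' into (normalized BDF, options dict)."""
    bdf, *opts = [part.strip() for part in spec.split(",")]
    options = {}
    for opt in opts:
        key, _, value = opt.partition("=")
        options[key] = value
    return normalize_bdf(bdf), options


def devices_to_seize(pci_specs, assignable):
    """Return the BDFs that are marked seize=1 and not yet assignable.

    Devices without seize=1 are never returned, so nothing is taken
    from dom0 unless the config asks for it device by device.
    """
    assignable = {normalize_bdf(b) for b in assignable}
    to_seize = []
    for spec in pci_specs:
        bdf, options = parse_pci_spec(spec)
        if options.get("seize") == "1" and bdf not in assignable:
            to_seize.append(bdf)
    return to_seize


pci = ['08:04.1,permissive=1,seize=1', '08:05.0']
print(devices_to_seize(pci, assignable=[]))
```

Each BDF returned would then be handed to whatever does the equivalent of 
"xl pci-assignable-add" before the domain is built.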

  -George
