From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=53131 helo=eggs.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1PK6ZC-00062P-3a
    for qemu-devel@nongnu.org; Sun, 21 Nov 2010 04:50:39 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    id 1PK6Z9-0003KV-GB
    for qemu-devel@nongnu.org; Sun, 21 Nov 2010 04:50:37 -0500
Received: from mx1.redhat.com ([209.132.183.28]:37351)
    by eggs.gnu.org with esmtp (Exim 4.71) id 1PK6Z9-0003K8-5Q
    for qemu-devel@nongnu.org; Sun, 21 Nov 2010 04:50:35 -0500
Date: Sun, 21 Nov 2010 11:50:18 +0200
From: "Michael S. Tsirkin"
Subject: Re: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device
Message-ID: <20101121095018.GA19477@redhat.com>
References: <20101004215311.17070.54862.stgit@s20.home>
    <20101108112227.GA1075@redhat.com>
    <1289227932.19902.11.camel@x201>
    <20101108162633.GA7962@redhat.com>
    <20101109024143.GD4983@valinux.co.jp>
    <20101109115315.GB22705@redhat.com>
    <20101119203842.GA11108@redhat.com>
    <20101120201709.GA8388@redhat.com>
    <20101121083211.GB7948@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20101121083211.GB7948@redhat.com>
List-Id: qemu-devel.nongnu.org
To: Gleb Natapov
Cc: Isaku Yamahata, Alex Williamson, Markus Armbruster, qemu-devel@nongnu.org

On Sun, Nov 21, 2010 at 10:32:11AM +0200, Gleb Natapov wrote:
> On Sat, Nov 20, 2010 at 10:17:09PM +0200, Michael S. Tsirkin wrote:
> > On Fri, Nov 19, 2010 at 10:38:42PM +0200, Gleb Natapov wrote:
> > > On Fri, Nov 19, 2010 at 06:02:58PM +0100, Markus Armbruster wrote:
> > > > "Michael S. Tsirkin" writes:
> > > >
> > > > > On Tue, Nov 09, 2010 at 11:41:43AM +0900, Isaku Yamahata wrote:
> > > > >> On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote:
> > > > >> > Replace bus number with slot numbers of parent bridges up to the root.
> > > > >> > This works for root bridge in a compatible way because bus number there
> > > > >> > is hard-coded to 0.
> > > > >> > IMO nested bridges are broken anyway, no way to be compatible there.
> > > > >> >
> > > > >> >
> > > > >> > Gleb, Markus, I think the following should be sufficient for PCI. What
> > > > >> > do you think? Also - do we need to update QMP/monitor to teach them to
> > > > >> > work with these paths?
> > > > >> >
> > > > >> > This is on top of Alex's patch, completely untested.
> > > > >> >
> > > > >> >
> > > > >> > pci: fix device path for devices behind nested bridges
> > > > >> >
> > > > >> > We were using bus number in the device path, which is clearly
> > > > >> > broken as this number is guest-assigned for all devices
> > > > >> > except the root.
> > > > >> >
> > > > >> > Fix by using hierarchical list of slots, walking the path
> > > > >> > from root down to device, instead. Add :00 as bus number
> > > > >> > so that if there are no nested bridges, this is compatible
> > > > >> > with what we have now.
> > > > >>
> > > > >> This format, Domain:00:Slot:Slot....:Slot.Function, doesn't work
> > > > >> because pci-to-pci bridge is pci function.
> > > > >> So the format should be
> > > > >> Domain:00:Slot.Function:Slot.Function....:Slot.Function
> > > > >>
> > > > >> thanks,
> > > > >
> > > > > Hmm, interesting. If we do this we aren't backwards compatible
> > > > > though, so maybe we could try using openfirmware paths, just as well.
> > > >
> > > > Whatever we do, we need to make it work for all (qdevified) devices and
> > > > buses.
> > > >
> > > > It should also be possible to use canonical addressing with device_add &
> > > > friends. I.e. permit naming a device by (a unique abbreviation of) its
> > > > canonical address in addition to naming it by its user-defined ID.
> > > > For instance, something like
> > > >
> > > >     device_del /pci/@1,1
> > > >
> > > FWIW openbios allows this kind of abbreviation.
> > >
> > > > in addition to
> > > >
> > > >     device_del ID
> > > >
> > > > Open Firmware is a useful source of inspiration there, but should it
> > > > come into conflict with usability, we should let usability win.
> > >
> > > --
> > > Gleb.
> >
> >
> > I think that the domain (PCI segment group), bus, slot, function way to
> > address pci devices is still the most familiar and the easiest to map to
> Most familiar to whom?

The guests.
For CLI, we need an easy way to map a device in guest to the
device in qemu and back.

> It looks like you identify yourself with most of
> qemu users, but if most qemu users are like you then qemu has not enough
> users :) Most users that consider themselves to be "advanced" may know
> what eth1 or /dev/sdb means. This doesn't mean we should provide
> "device_del eth1" or "device_add /dev/sdb" command though.
>
> More important is that "domain" (encoded as number like you used to)
> and "bus number" has no meaning from inside qemu.
> So while I said many
> times I don't care about exact CLI syntax too much it should make sense
> at least. It can use id to specify PCI bus in CLI like this:
> device_del pci.0:1.1. Or it can even use device id too like this:
> device_del pci.0:ide.0. Or it can use HW topology like in OF device
> path. But doing ad hoc device enumeration inside qemu and then using it
> for CLI is not it.
>
> > functionality in the guests. Qemu is buggy at the moment in that it
> > uses the bus addresses assigned by guest and not the ones in ACPI,
> > but that can be fixed.
> It looks like you confused ACPI _SEG for something it isn't.

Maybe I did.
This is what linux does:

struct pci_bus * __devinit pci_acpi_scan_root(struct acpi_pci_root *root)
{
        struct acpi_device *device = root->device;
        int domain = root->segment;
        int busnum = root->secondary.start;

And I think this is consistent with the spec.

> ACPI spec
> says that PCI segment group is purely software concept managed by system
> firmware. In fact one segment may include multiple PCI host bridges.

It can't I think:

    Multiple Host Bridges

    A platform may have multiple PCI Express or PCI-X host bridges. The base
    address for the MMCONFIG space for these host bridges may need to be
    allocated at different locations. In such cases, using MCFG table and
    _CBA method as defined in this section means that each of these host
    bridges must be in its own PCI Segment Group.

> _SEG
> is not what OSPM uses to tie HW resource to ACPI resource. It used _CRS
> (Current Resource Settings) for that just like OF. No surprise there.

OSPM uses both I think.

All I see linux do with CRS is get the bus number range.
And the spec says, e.g.:

    the memory mapped configuration base
    address (always corresponds to bus number 0) for the PCI Segment Group
    of the host bridge is provided by _CBA and the bus range covered by the
    base address is indicated by the corresponding bus range specified in
    _CRS.

> >
> > That should be enough for e.g. device_del. We do have the need to
> > describe the topology when we interface with firmware, e.g. to describe
> > the ACPI tables themselves to qemu (this is what Gleb's patches deal
> > with), but that's probably the only case.
> >
> Describing HW topology is the only way to unambiguously describe device to
> something or someone outside qemu and have persistent device naming
> between different HW configuration.

Not really, since ACPI is a binary blob programmed by qemu.

> --
> Gleb.
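
A minimal sketch of the Domain:00:Slot.Function:...:Slot.Function path format discussed in the thread above, walking from the root bus down to the device. Everything here is an assumption made for illustration: struct pci_node and pci_topo_path() are hypothetical names and not QEMU's API, and the field widths (%04x for the domain, %02x.%x per hop) are simply modeled on the usual domain:bus:slot.function notation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>   /* strcmp, used only when checking the result */

#define MAX_DEPTH 32  /* hypothetical limit on bridge nesting */

/* Hypothetical stand-in for a PCI function; not QEMU's PCIDevice. */
struct pci_node {
    const struct pci_node *parent; /* bridge function above us, NULL if on the root bus */
    unsigned slot;                 /* device number on its bus */
    unsigned func;                 /* function number within the slot */
};

/*
 * Write "Domain:00:Slot.Function[:Slot.Function...]" for dev into buf.
 * Returns the string length on success, -1 if buf is too small or the
 * hierarchy is deeper than MAX_DEPTH.
 */
int pci_topo_path(const struct pci_node *dev, unsigned domain,
                  char *buf, size_t len)
{
    const struct pci_node *chain[MAX_DEPTH];
    const struct pci_node *n;
    int depth = 0;
    int w;
    size_t off;

    assert(dev != NULL && buf != NULL);

    /* Collect the device and its parent bridges, device first. */
    for (n = dev; n != NULL; n = n->parent) {
        if (depth == MAX_DEPTH)
            return -1;
        chain[depth++] = n;
    }

    /* The root bus number is fixed at 00, so it is emitted literally. */
    w = snprintf(buf, len, "%04x:00", domain);
    if (w < 0 || (size_t)w >= len)
        return -1;
    off = (size_t)w;

    /* Emit one ":Slot.Function" hop per level, root-most bridge first. */
    while (depth-- > 0) {
        w = snprintf(buf + off, len - off, ":%02x.%x",
                     chain[depth]->slot, chain[depth]->func);
        if (w < 0 || (size_t)w >= len - off)
            return -1;
        off += (size_t)w;
    }
    return (int)off;
}
```

With this sketch, a function at slot 1, function 1 directly on the root bus yields 0000:00:01.1, which is the same string as the flat domain:bus:slot.function form, while a device at slot 3 behind a bridge at slot 1 yields 0000:00:01.0:03.0. No guest-assigned bus number appears anywhere in the result.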