From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=52705 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PFVDA-00035s-CS for qemu-devel@nongnu.org; Mon, 08 Nov 2010 12:08:55 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PFVD2-0000pJ-8W for qemu-devel@nongnu.org; Mon, 08 Nov 2010 12:08:45 -0500 Received: from mx1.redhat.com ([209.132.183.28]:32026) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PFVD1-0000p8-SQ for qemu-devel@nongnu.org; Mon, 08 Nov 2010 12:08:44 -0500 Date: Mon, 8 Nov 2010 19:08:37 +0200 From: "Michael S. Tsirkin" Message-ID: <20101108170837.GI7962@redhat.com> References: <20101004215311.17070.54862.stgit@s20.home> <20101108112227.GA1075@redhat.com> <1289227932.19902.11.camel@x201> <20101108162633.GA7962@redhat.com> <20101108170015.GA705@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20101108170015.GA705@redhat.com> Subject: [Qemu-devel] Re: [PATCH] PCI: Bus number from the bridge, not the device List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Gleb Natapov Cc: yamahata@valinux.co.jp, Alex Williamson , qemu-devel@nongnu.org, armbru@redhat.com On Mon, Nov 08, 2010 at 07:00:15PM +0200, Gleb Natapov wrote: > On Mon, Nov 08, 2010 at 06:26:33PM +0200, Michael S. Tsirkin wrote: > > On Mon, Nov 08, 2010 at 07:52:12AM -0700, Alex Williamson wrote: > > > On Mon, 2010-11-08 at 13:22 +0200, Michael S. Tsirkin wrote: > > > > On Mon, Oct 04, 2010 at 03:53:11PM -0600, Alex Williamson wrote: > > > > > pcibus_dev_print() was erroneously retrieving the device bus > > > > > number from the secondary bus number offset of the device > > > > > instead of the bridge above the device. This ends of landing > > > > > in the 2nd byte of the 3rd BAR for devices, which thankfully > > > > > is usually zero. pcibus_get_dev_path() copied this code, > > > > > inheriting the same bug. pcibus_get_dev_path() is used for > > > > > ramblock naming, so changing it can effect migration. However, > > > > > I've only seen this byte be non-zero for an assigned device, > > > > > which can't migrate anyway, so hopefully we won't run into > > > > > any issues. > > > > > > > > > > Signed-off-by: Alex Williamson > > > > > > > > Good catch. Applied. > > > > I don't really see why do we put the dev path > > > > in the bus object: why not let device supply its name? > > > > > > Because the device name is not unique. This came about from the > > > discussion about how to create a canonical device path that Gleb and > > > Markus are again trying to hash out. If we go up to the bus and get the > > > bus address, we have a VM unique name. Unfortunately, it's difficult to > > > define what the bus should print in all cases (ISA), but since they > > > don't do hotplug and typically don't allocate ramblocks, we can mostly > > > ignore it for this use case. > > > > > > > And I think this will affect nested bridges. However they are currently > > > > broken anyway: we really must convert to topological names as bus number > > > > is guest-assigned - they don't have to be unique, even. > > > > > > Yes, nested bridges are a problem. How can the seg/bus/devfn not be > > > unique? > > > > Bus numbers for nested bridges are guest assigned. We start with 0 after reset. > > > > > > What does fixing this involve? Just changing pcibus_get_dev_path? > > > > > > How do you plan to fix it? Don't forget that migration depends on these > > > names, so some kind of compatibility layer would be required. Thanks, > > > > > > Alex > > > > Replace bus number with slot numbers of parent bridges up to the root. > > This works for root bridge in a compatible way because bus number there > > is hard-coded to 0. > > IMO nested bridges are broken anyway, no way to be compatible there. > > > > > > Gleb, Markus, I think the following should be sufficient for PCI. What > > do you think? Also - do we need to update QMP/monitor to teach them to > > work with these paths? > > > > I am no longer use bus's get_dev_path callback for my purpose (I added > another one get_fw_dev_path) Why? Because get_dev_path is unstable? to avoid changing for migration? So with this patch, you get to use it again. IMO two ways to address devices is a bad idea. > since it is used for migration what is used for migration? > it should > be done for all buses to work properly. And by properly I mean produce > full path from system root to the device itself recursively. This is what the code does. Recursion is not needed here though. > But what I > learned is that by changing get_dev_path output you will break migration > from older guest to newer once (something we have to support). Well I think migration for sytems with nested buses are broken anyway: it's just a new feature where we simply did not figure out the migration andgle before the last release. Without nesting my code returns exactly the existing output so no, it's not broken. > And of > course, as Alex said already, you need to traverse bridges recursively Well this is an implementation detail. loop is functionally equivalent, only cleaner and easier to understand. > and > domain does not provide any meaningful information. I dont understand yet, sorry. Code I posted returns exactly what's there today. If domain does not provide any meaningful information, How do things work then? And how do you address roots on a system with multiple roots? > > This is on top of Alex's patch, completely untested. > > > > > > pci: fix device path for devices behind nested bridges > > > > We were using bus number in the device path, which is clearly > > broken as this number is guest-assigned for all devices > > except the root. > > > > Fix by using hierarchical list of slots, walking the path > > from root down to device, instead. Add :00 as bus number > > so that if there are no nested bridges, this is compatible > > with what we have now. > > > > Signed-off-by: Michael S. Tsirkin > > > > diff --git a/hw/pci.c b/hw/pci.c > > index 7d12473..fa98d94 100644 > > --- a/hw/pci.c > > +++ b/hw/pci.c > > @@ -1826,13 +1826,45 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) > > > > static char *pcibus_get_dev_path(DeviceState *dev) > > { > > - PCIDevice *d = (PCIDevice *)dev; > > - char path[16]; > > - > > - snprintf(path, sizeof(path), "%04x:%02x:%02x.%x", > > - pci_find_domain(d->bus), pci_bus_num(d->bus), > > - PCI_SLOT(d->devfn), PCI_FUNC(d->devfn)); > > - > > - return strdup(path); > > + PCIDevice *d = container_of(dev, PCIDevice, qdev); > > + PCIDevice *t; > > + int slot_depth; > > + /* Path format: Domain:00:Slot:Slot....:Slot.Function. > > + * 00 is added here to make this format compatible with > > + * domain:Bus:Slot.Func for systems without nested PCI bridges. > > + * Slot list specifies the slot numbers for all devices on the > > + * path from root to the specific device. */ > > + int domain_len = strlen("DDDD:00"); > > + int func_len = strlen(".F"); > > + int slot_len = strlen(":SS"); > > + int path_len; > > + char *path, *p; > > + > > + /* Calculate # of slots on path between device and root. */; > > + slot_depth = 0; > > + for (t = d; t; t = t->bus->parent_dev) > > + ++slot_depth; > > + > > + path_len = domain_len + bus_len + slot_len * slot_depth + func_len; > > + > > + /* Allocate memory, fill in the terminating null byte. */ > > + path = malloc(path_len + 1 /* For '\0' */); > > + path[path_len] = '\0'; > > + > > + /* First field is the domain. */ > > + snprintf(path, domain_len, "%04x", pci_find_domain(d->bus)); > > + > > + /* Leave space for slot numbers and fill in function number. */ > > + p = path + domain_len + slot_len * slot_depth; > > + snprintf(p, func_len, ".%02x", PCI_FUNC(d->devfn)); > > + > > + /* Fill in slot numbers. We walk up from device to root, so need to print > > + * them in the reverse order, last to first. */ > > + for (t = d; t; t = t->bus->parent_dev) { > > + p -= slot_len; > > + snprintf(p, slot_len, ":%x", PCI_SLOT(t->devfn)); > > + } > > + > > + return path; > > } > > > > -- > Gleb.