Date: Tue, 17 Dec 2013 13:59:36 +0200
From: "Michael S. Tsirkin"
Message-ID: <20131217115936.GA30168@redhat.com>
References: <1387185088-16811-1-git-send-email-kraxel@redhat.com>
 <20131216115401.GA19233@redhat.com>
 <1387201577.28883.38.camel@nilsson.home.kraxel.org>
 <20131216192843.GB21330@redhat.com>
 <1387277686.12500.35.camel@nilsson.home.kraxel.org>
In-Reply-To: <1387277686.12500.35.camel@nilsson.home.kraxel.org>
Subject: Re: [Qemu-devel] [PATCH v2] x86: gigabyte alignment for ram
To: Gerd Hoffmann
Cc: qemu-devel@nongnu.org, Anthony Liguori

On Tue, Dec 17, 2013 at 11:54:46AM +0100, Gerd Hoffmann wrote:
>   Hi,
>
> > > Problem is that the firmware places the xbar @ 0xb0000000.
> > > Hardcoded, assuming qemu will not map ram above 0xb0000000.
> >
> > Can't bios figure out the size of memory below 4G from fwcfg?
> > I refer to 7db16f2480db5e246d34d0c453cff4f58549df0e specifically.
>
> It can, but it doesn't.
>
> Additional issue for coreboot is that mmconfig base is a compile-time
> constant, because it is set up _very_ early in the boot process.
> Coreboot then does the whole pci initialization using mmconfig.
>
> On the other hand coreboot has a much more sophisticated resource
> management than seabios, so moving the mmconf xbar up to
> 0xe0000000-0xefffffff, then managing two regions (below 0xe0000000 and
> above 0xf0000000) for pci bars probably isn't a big issue for coreboot.
>
> > > So, we must (a) fix firmware first and (b) get an ugly dependency
> > > that older firmware will not run on latest qemu.
> >
> > That's only important for old machine types though, right?
>
> Correct.  That makes it a bit less problematic, but it is still not very
> nice.
>
> > > We also need to figure how we wanna fixup things.  So, current memory
> > > layout looks like this:
> > >
> > >     0x00000000 - 0xafffffff  --  RAM / unused
> > >     0xb0000000 - 0xbfffffff  --  mmconfig xbar [256 pci busses]
> > >     0xc0000000 - 0xfec00000  --  space for pci bars, almost 1g
> > >
> > > Largest pci bar we can map below 4g is 512m, @ 0xc0000000.
> > >
> > > If we wanna map 3G RAM we need to move the xbar somewhere else.  Big
> > > question is where?
> > >
> > > We can move it to 0xc0000000.  Then we can't map 512m pci bars any more.
> >
> > I would go for this when we have 3G of RAM.
> > I think that ability to support a single 512m BAR is not all that important.
>
> Use case: pci passthrough of graphics cards.
>
> > > We can move it to 0xe0000000.  Then we have to split the pci bar space,
> > > mapping large bars below 0xe0000000 and small ones above 0xf0000000.
> > > SeaBIOS pci init code isn't really up to it.
> > > Could also become tricky
> > > to declare it correctly in acpi / e820 due to the split.
> >
> > My laptop's ACPI has this space all fragmented up, seems to boot fine...
>
> We need to change the way we reserve the mmconfig space though.
>
> Currently it is marked reserved in the e820 table.  Having that overlap
> with the _CRS region makes windows quite unhappy, we tried that
> recently.

Yes this also contradicts the spec, see below.

> My laptop has the mmconfig space declared as LPC resource:
>
>         Device (LPC)
>         {
>             Name (_ADR, 0x001F0000)  // _ADR: Address
>             Name (_S3D, 0x03)  // _S3D: S3 Device State
>             Name (RID, 0x00)
>             Device (SIO)
>             {
>                 Name (_HID, EisaId ("PNP0C02"))
>                 Name (_UID, 0x00)  // _UID: Unique ID
>                 Name (SCRS, ResourceTemplate ()
>                     [ ... ]
>                     Memory32Fixed (ReadWrite,
>                         0xF8000000,         // Address Base
>                         0x04000000,         // Address Length
>                         )
>                     [ ... ]
>                 Method (_CRS, 0, NotSerialized)
>                     [ ... return SCRS, with updates applied in some cases ... ]
>
> When doing it this way we can simply make the PCI0._CRS cover the whole
> end-of-ram -> ioapic-base range, similar to piix, and we are pretty free
> to place the mmconfig xbar anywhere in that area.

The spec says:

	2. If the operating system does not natively comprehend reserving
	the MMCFG region, the MMCFG region must be reserved by firmware.
	The address range reported in the MCFG table or by _CBA method
	(see Section 4.1.3) must be reserved by declaring a motherboard
	resource. For most systems, the motherboard resource would appear
	at the root of the ACPI namespace (under \_SB) in a node with a
	_HID of EISAID (PNP0C02), and the resources in this case should
	not be claimed in the root PCI bus’s _CRS. The resources can
	optionally be returned in Int15 E820 or EFIGetMemoryMap as
	reserved memory but must always be reported through ACPI as a
	motherboard resource.

My reading of the above is that this can be an LPC resource, but then
claiming it in the root's _CRS isn't OK.

>
> Doing the transition is non-trivial though because we (a) move the job
> of reserving the mmconfig area from firmware to qemu and (b) the testing
> needed for that.
>
> Maybe we should set the gbyte alignment on q35 aside for a while and
> tackle the mmconfig reservation handling first.
>
> cheers,
>   Gerd

I merged your patch but split it: q35 is separate and piix is separate.
Would you like me to drop the q35 part then?
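
For illustration only, the BAR-size arithmetic behind the layout options
discussed above can be checked with a small Python sketch. It is not QEMU,
SeaBIOS, or coreboot code. The helper name largest_bar and the 16-byte lower
bound are assumptions made for this example. The window addresses come from
the quoted layout, and the sketch relies on PCI memory BARs being
power-of-two sized and naturally aligned.

```python
IOAPIC_BASE = 0xFEC00000  # end of the PCI window in the quoted layout


def largest_bar(start, end):
    """Return (size, base) of the largest power-of-two, naturally aligned
    region that fits in [start, end), or (0, None) if nothing fits.

    Hypothetical helper for illustration; 16 bytes is assumed as the
    smallest size worth considering.
    """
    size = 1 << 31  # 2 GiB is the largest candidate below 4 GiB
    while size >= 16:
        base = (start + size - 1) & ~(size - 1)  # round start up to alignment
        if base + size <= end:
            return size, base
        size >>= 1
    return 0, None


# Current layout: xbar at 0xb0000000, PCI window starts at 0xc0000000.
current = largest_bar(0xC0000000, IOAPIC_BASE)

# Option 1: xbar moved to 0xc0000000, window starts at 0xd0000000.
xbar_at_c = largest_bar(0xD0000000, IOAPIC_BASE)

# Option 2: xbar at 0xe0000000 with 3G of RAM, window split in two.
xbar_at_e_low = largest_bar(0xC0000000, 0xE0000000)
xbar_at_e_high = largest_bar(0xF0000000, IOAPIC_BASE)

for name, (size, base) in [
    ("current", current),
    ("xbar @ 0xc0000000", xbar_at_c),
    ("xbar @ 0xe0000000, low", xbar_at_e_low),
    ("xbar @ 0xe0000000, high", xbar_at_e_high),
]:
    print(f"{name}: {size >> 20} MiB at {base:#x}")
```

Under these assumptions the current layout allows one 512 MiB BAR at
0xc0000000, option 1 drops the maximum to 256 MiB, and option 2 keeps a
512 MiB BAR below 0xe0000000 but leaves only 128 MiB above 0xf0000000.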