From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:49414) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WSoK6-0006oy-DG for qemu-devel@nongnu.org; Wed, 26 Mar 2014 09:57:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WSoJz-0007mO-7k for qemu-devel@nongnu.org; Wed, 26 Mar 2014 09:56:54 -0400 Received: from mx1.redhat.com ([209.132.183.28]:37538) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WSoJy-0007mH-T6 for qemu-devel@nongnu.org; Wed, 26 Mar 2014 09:56:47 -0400 Date: Wed, 26 Mar 2014 15:56:53 +0200 From: "Michael S. Tsirkin" Message-ID: <20140326135653.GA2691@redhat.com> References: <20140326064510.5518.72436.malonedeb@chaenomeles.canonical.com> <20140326064510.5518.72436.malonedeb@chaenomeles.canonical.com> <20140326103112.GA20219@redhat.com> <5332C7D2.3030901@redhat.com> <20140326125828.GA2192@redhat.com> <20140326144829.52f72b74@nial.usersys.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20140326144829.52f72b74@nial.usersys.redhat.com> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [Bug 1297651] [NEW] KVM create a win7 guest with Qemu, it boots up fail List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Igor Mammedov Cc: robert.hu@intel.com, Bug 1297651 <1297651@bugs.launchpad.net>, Laszlo Ersek , qemu-devel@nongnu.org, ehabkost@redhat.com On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote: > On Wed, 26 Mar 2014 14:58:28 +0200 > "Michael S. Tsirkin" wrote: >=20 > > On Wed, Mar 26, 2014 at 01:28:02PM +0100, Laszlo Ersek wrote: > > > On 03/26/14 11:31, Michael S. Tsirkin wrote: > > >=20 > > > > On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote: > > >=20 > > > >> Date: Mon Mar 17 17:05:16 2014 +0100 > > > >> > > > >> i386/acpi-build: allow more than 255 elements in CPON > > > >> > > > >> The build_ssdt() function builds a number of AML objects tha= t are related > > > >> to CPU hotplug, and whose IDs form a contiguous sequence of = APIC IDs. > > > >> (APIC IDs are in fact discontiguous, but this is the traditi= onal > > > >> interface: build a contiguous sequence from zero up that cov= ers all > > > >> possible APIC IDs.) These objects are: > > > >> > > > >> - a Processor() object for each VCPU, > > > >> - a NTFY method, with one branch for each VCPU, > > > >> - a CPON package with one element (hotplug status byte) for = each VCPU. > > > >> > > > >> The build_ssdt() function currently limits the *count* of pr= ocessor > > > >> objects, and NTFY branches, and CPON elements, in 0xFF (see = the assignment > > > >> to "acpi_cpus"). This allows for an inclusive APIC ID range = of [0..254]. > > > >> This is incorrect, because the highest APIC ID that we other= wise allow a > > > >> VCPU to take is 255. > > > >> > > > >> In order to extend the maximum count to 256, and the travers= ed APIC ID > > > >> range correspondingly to [0..255]: > > > >> - the Processor() objects need no change, > > > >> - the NTFY method also needs no change, > > > >> - the CPON package must be updated, because it is defined wi= th a > > > >> DefPackage, and the number of elements in such a package c= an be at most > > > >> 255. We pick a DefVarPackage instead. > > > >> > > > >> We replace the Op byte, and the encoding of the number of el= ements. > > > >> Compare: > > > >> > > > >> DefPackage :=3D PackageOp PkgLength NumElements Pa= ckageElementList > > > >> DefVarPackage :=3D VarPackageOp PkgLength VarNumElements Pa= ckageElementList > > > >> > > > >> PackageOp :=3D 0x12 > > > >> VarPackageOp :=3D 0x13 > > > >=20 > > > >=20 > > > > I think I know what's going on here: the specification says: > > > >=20 > > > > The ASL compiler can emit two different AML opcodes for a Package > > > > declaration, either PackageOp or VarPackageOp. For small, fixed-l= ength > > > > packages, the PackageOp is used and this > > > >=20 > > > > opcode is compatible with ACPI 1.0. A VarPackageOp will be emitte= d if > > > > any of the following conditions are true: > > > > =E2=80=A2 > > > > The NumElements argument is a TermArg that can only be resolved = at > > > > runtime. > > > > =E2=80=A2 > > > > At compile time, NumElements resolves to a constant that is larg= er than > > > > 255. > > > > =E2=80=A2 > > > > The PackageList contains more than 255 initializer elements. > > > >=20 > > > >=20 > > > > So we clearly violate this rule. > > >=20 > > > I did see this passage of the spec, but it is not relevant. It is a= bout what the ASL compiler does. It comes from: > > >=20 > > > 19 ACPI Source Language (ASL)Reference > > > 19.5 ASL Operator Reference > > > 19.5.98 Package (Declare Package Object) > > >=20 > > > We do not have an ASL compiler at hand. > >=20 > > True. But I think the spec and guests simply didn't envision writing = AML by hand :) > >=20 > > > The specification nowhere restricts VarPackageOp to > 255. > > >=20 > > > However, what I *did* mess up is compatibility with ACPI 1.0. Just = below the quoted part, there's also this: > > >=20 > > > Note: The ability to create variable-sized packages was first > > > introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size > > > packages with up to 255 elements. > > >=20 > > > I forgot that the header of the containing table stated the ACPI ve= rsion: > > >=20 > > > /* Copy header and patch values in the S3_ / S4_ / S5_ packages= */ > > > ssdt_ptr =3D acpi_data_push(table_data, sizeof(ssdp_misc_aml)); > > >=20 > > > and > > >=20 > > > DefinitionBlock ("ssdt-misc.aml", "SSDT", 0x01, "BXPC", "BXSSDTS= USP", 0x1) > > > ^^^^ > > > ComplianceRevision > > >=20 > > > So my patch generates ACPI 2.0+ contents for an 1.0 table. > > >=20 > > > > The following seems to fix the issue - still testing. Can you con= firm please? > > >=20 > > > This patch only restricts the bug to a subset of cases, but it does= n't fix it. > > >=20 > > > > However the question we should ask is whether > > > > it's a good idea to allow hotplug ID values that might > > > > make guests fail to boot. > > > >=20 > > > > How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255? > > >=20 > > > I think it's not the package size / APIC ID value per se that break= s the boot, but the incompatibility between the ACPI revision stated in t= he SSDT header, and the construct in the table. > >=20 > >=20 > > It would be interesting to try tweaking the table version and seeing > > what happens. Does it help any guests? > It won't help XP based guests since they support 1.0 revision only. >=20 > >=20 > > > >=20 > > > > We never allowed > 255 in the past, is it worth the > > > > maintainance headaches? > > > >=20 > > > > =20 > > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > > > > index f1054dd..7597517 100644 > > > > --- a/hw/i386/acpi-build.c > > > > +++ b/hw/i386/acpi-build.c > > > > @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *lin= ker, > > > > =20 > > > > { > > > > GArray *package =3D build_alloc_array(); > > > > - uint8_t op =3D 0x13; /* VarPackageOp */ > > > > + uint8_t op; > > > > + > > > > + /* > > > > + * Note: The ability to create variable-sized packag= es was first introduced in ACPI 2.0. ACPI 1.0 only > > > > + * allowed fixed-size packages with up to 255 elemen= ts. > > > > + * Windows guests up to win2k8 fail when VarPackageO= p is used. > > > > + */ > > > > + if (acpi_cpus <=3D 255) { > > > > + op =3D 0x12; /* PackageOp */ > > > > + build_append_byte(package, acpi_cpus); /* NumEle= ments */ > > > > + } else { > > > > + op =3D 0x13; /* VarPackageOp */ > > > > + build_append_int(package, acpi_cpus); /* VarNumE= lements */ > > > > + } > > > > =20 > > > > - build_append_int(package, acpi_cpus); /* VarNumEleme= nts */ > > > > for (i =3D 0; i < acpi_cpus; i++) { > > > > uint8_t b =3D test_bit(i, cpu->found_cpus) ? 0x0= 1 : 0x00; > > > > build_append_byte(package, b); > > > >=20 > > >=20 > > > The patch will mask the problem for most of the cases, but when Var= PackageOp is used, it will be broken just the same (because the ACPI revi= sion in the SSDT header will still mismatch). > >=20 > > yes but modern guests don't seem to care, and it was already broken i= n > > 1.7, wasn't it? > >=20 > > > Here's my proposal: > > > - I can post a patch that changes the SSDT DSL files, *and* the DSD= T files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of 2= .0. (The ACPI revision of the DSDT file determines integer sizes for SSDT= s too, so we can't just bump the SSDTs to 2.0) > >=20 > > It should not be a problem but I'm not sure I get this comment. Can y= ou explain? > >=20 > > > - Or I suggest to revert my patches for 2.0. > > >=20 > > > You probably won't like bumping the ACPI rev in DSDT/SSDT, for vari= ous compatibility reasons, so I think I suggest to revert these two patch= es of mine. It's now clear to me that this VCPU hotplug limit cannot be l= ifted without much more intrusive (and guest visible) changes. Sorry abou= t missing this fact in my original submission. > > >=20 > > > Thanks, > > > Laszlo > >=20 > > I have a problem with both approaches :) > >=20 > > If we want to change ACPI rev, I think we should do this > > conditionally when max_cpus > 255. > > Would be worth it if this fixes some guests. > >=20 > > As for reverting, I think it's a problem that we seem to > > allow max_cpus =3D 256 but then it doesn't really work. > more clean would be to abort if CPON index (i.e. APIC ID) > is more than 255. That would affect small number of weird > topologies but sould be fine for most usecases. >=20 > >=20 > >=20 > >=20 > > I think the patch I posted might be good enough for 2.0. > > It seems to make things work for new guests, and old guests > > work as long as you don't specify max_cpus =3D 255. > > The config with a high max_cpus value never really worked so > > not a big deal. > >=20 > >=20 > > Alternatively limit max_cpus to 255, to make it fail cleanly. > that won't work since max_cpus is not equivalent to index in CPON > which depends on topology as well. I'm inclined to keep my patch for now until we have more data. As far as I can see, it doesn't break anything that isn't already broken. --=20 MST