From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1466097133-5489-1-git-send-email-dgilbert@redhat.com> <1466097133-5489-5-git-send-email-dgilbert@redhat.com> <20160616202449.GY18662@thinpad.lan.raisama.net> <20160617081505.GA2273@work-vm> <20160617131815.GA18662@thinpad.lan.raisama.net>
From: Paolo Bonzini
Date: Fri, 17 Jun 2016 15:38:53 +0200
MIME-Version: 1.0
In-Reply-To: <20160617131815.GA18662@thinpad.lan.raisama.net>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [PATCH 4/5] x86: Allow physical address bits to be set
To: Eduardo Habkost, "Dr. David Alan Gilbert"
Cc: qemu-devel@nongnu.org, aarcange@redhat.com, Marcel Apfelbaum, "Michael S. Tsirkin"

On 17/06/2016 15:18, Eduardo Habkost wrote:
> On Fri, Jun 17, 2016 at 09:15:06AM +0100, Dr. David Alan Gilbert wrote:
>> * Eduardo Habkost (ehabkost@redhat.com) wrote:
>>> On Thu, Jun 16, 2016 at 06:12:12PM +0100, Dr. David Alan Gilbert (git) wrote:
>>>> From: "Dr. David Alan Gilbert"
>>>>
>>>> Currently QEMU sets the x86 number of physical address bits to the
>>>> magic number 40. This is only correct on some small AMD systems;
>>>> Intel systems tend to have 36, 39, 46 bits, and large AMD systems
>>>> tend to have 48.
>>>>
>>>> Having the value different from your actual hardware is detectable
>>>> by the guest and in principal can cause problems;
>>>
>>> What kind of problems?
>>>
>>> Is it a problem to have something smaller from the actual
>>> hardware, or just if it's higher?
>>
>> I'm a bit vague on the failure cases; but my understanding of the two
>> cases are;
>>
>> Larger is a problem if the guest tries to map something to a high
>> address that's not addressable.

(Note: this is a problem when migrating to hosts with _smaller_ phys-bits)

>> Smaller is potentially a problem if the guest plays tricks with
>> what it thinks are spare bits in page tables but which are actually
>> interpreted. I believe KVM plays a trick like this.

(Note: this is a problem when migrating to hosts with _larger_ phys-bits)

> If both smaller and larger are a problem, we have a much bigger
> problem than we thought. We need to confirm this.
>
> So, what happens if the guest play tricks in bits 40-45 when QEMU
> sets the limit to 40 but we are running in a 46-bit host? Is it
> really a problem? I assumed it would be safe.

The guest expects a "reserved bit set" page fault, but doesn't get one.

>> 2) While we have maxmem settings to tell us the top of VM RAM, do
>>    we have anything that tells us the top of IO space? What happens
>>    when we hotplug a PCI card?
>
> (CCing Marcel and Michael, as we were discussing this recently.)
>
> That's a good question. When calculating how many bits the
> machine requires, machine code could choose to reserve a
> reasonable amount of space for hotplug by default.
>
> Whatever we choose as the default, in some corner cases (e.g.
> almost-32GB VMs running in a 39-bit host) we will still need to
> let the user choose between having extra space for hotplug and
> being able to safely migrate to 36-bit hosts.

No, this is not possible unfortunately. If you set phys-bits <
host-phys-bits, the guest may expect some bits to be reserved, when they
actually aren't. In practice this doesn't happen for the reason I
mentioned in my other message (tl;dr: 1-the trick is rarely used though
KVM uses it, 2-if they use bit 51 they're safe in practice). But still
making phys-bits smaller than host-phys-bits is a bad idea.

Making the guest's phys-bits larger than host-phys-bits would be okay if
you reserve the area in the e820 and assume the guest doesn't touch it.
But it is not a great idea too, because e820 describes RAM, so you're
telling the guest "look, there's 64 TB of reserved RAM up there".

>> 3) Is it better to stick to sizes that correspond to real hardware
>>    if you can? For example I don't know of any machines with 37 bits
>>    - in practice I think it's best to stick with sizes that correspond
>>    to some real hardware.
>
> Yeah, "as small as possible" could be actually "the smallest
> possible value from a set of known-to-exist values". e.g. if we
> find out that we need 37 bits, it's probably better to simply use
> 39 bits.
>
> Choosing from a smaller set of values also makes corner cases
> (like the example above) less likely to happen.

Not really, because any value that doesn't match the host is
problematic, albeit in different ways.

Paolo
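
For illustration only, the two mismatch cases discussed in this thread can be put in terms of page-table address bits. This is a rough sketch, not QEMU or KVM code: the function names are made up, and it assumes the x86-64 rule that page-table address bits from phys-bits up to bit 51 are reserved.

```python
# Hypothetical helper names; this only models which page-table address
# bits each side treats as reserved, nothing more.
MAX_PHYS_BITS = 52  # architectural upper bound on x86-64 physical address width


def reserved_mask(phys_bits: int) -> int:
    """Mask of page-table address bits [phys_bits, 51], which are reserved."""
    if not 32 <= phys_bits <= MAX_PHYS_BITS:
        raise ValueError(f"implausible phys_bits: {phys_bits}")
    return ((1 << MAX_PHYS_BITS) - 1) & ~((1 << phys_bits) - 1)


def mismatch(guest_phys_bits: int, host_phys_bits: int) -> dict:
    """Compare what the guest believes is reserved with what the host enforces."""
    guest = reserved_mask(guest_phys_bits)
    host = reserved_mask(host_phys_bits)
    return {
        # Guest expects a "reserved bit set" fault here, but the host
        # treats these bits as ordinary address bits (guest < host).
        "missing_faults": guest & ~host,
        # Guest believes these bits are addressable, but the host
        # cannot address them (guest > host).
        "unaddressable": host & ~guest,
    }


if __name__ == "__main__":
    for guest_bits, host_bits in [(40, 46), (48, 46), (46, 46)]:
        result = mismatch(guest_bits, host_bits)
        print(guest_bits, host_bits,
              hex(result["missing_faults"]), hex(result["unaddressable"]))
```

With guest phys-bits 40 on a 46-bit host, bits 40-45 land in the first category. Bit 51 stays reserved for any host width below 52, which is consistent with the remark above that guests using bit 51 are safe in practice.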