From: Alex Williamson
Subject: [RFC] Extending MTRRs above 4G
Date: Wed, 17 Sep 2008 11:51:12 -0600
Message-ID: <1221673872.16470.27.camel@lappy>
To: kvm
Sender: kvm-owner@vger.kernel.org

When I try to boot guests using a recent Linux kernel (2.6.26+), memory
above 3.5G gets thrown away with an error like this:

WARNING: BIOS bug: CPU MTRRs don't cover all of memory, losing 4608MB of RAM.

And it's true, we're only providing MTRRs for memory below 4G. In fact
rombios32 knows very little, if anything, about memory above 4G, as seen
by memory reporting in the SMBIOS table. It looks like the Linux kernel
MTRR code does have a bail-out point for kvm/qemu, but that was only
effective before we started reporting MTRRs.

On real hardware, I have two systems that do this two different ways.
The first is an Intel-based system, which reports MTRRs to cover the I/O
space, then defaults the rest of memory to WB. The second is an AMD-based
system which uses MTRRs to cover memory below 4G, then seems to have a
special AMD MSR to describe the top of memory above 4G. Xen appears to
mimic the first approach.

Is there any reason that KVM sets the default MTRR type to UC, then only
sets up MTRRs for the memory below 4G? The patch below is a possible
approach to continue down this path and enlighten rombios32 about the
real top of memory, and set up MTRRs appropriately. It doesn't address
SMBIOS or whatever causes grub to only report upper memory below 4G.
Alternatively we could switch to the Intel/Xen system approach, but it
seems rombios32 needs to understand the extra memory at some point
anyway. Thoughts?

BTW, another benefit to the default WB approach is that MTRRs are a
limited resource and there will be memory sizes we can't fully cover
using the approach below.

Thanks,

Alex

Signed-off-by: Alex Williamson
--

diff --git a/bios/rombios32.c b/bios/rombios32.c
index 2dc1d25..c57e967 100755
--- a/bios/rombios32.c
+++ b/bios/rombios32.c
@@ -416,6 +416,7 @@ uint32_t cpuid_signature;
 uint32_t cpuid_features;
 uint32_t cpuid_ext_features;
 unsigned long ram_size;
+uint64_t above4g_ram_size;
 uint8_t bios_uuid[16];
 #ifdef BX_USE_EBDA_TABLES
 unsigned long ebda_cur_addr;
@@ -530,6 +531,14 @@ void setup_mtrr(void)
         wrmsr_smp(MTRRphysMask_MSR(i), (~vmask & 0xfffffff000ull) | 0x800);
         vbase += vmask + 1;
     }
+    for (vbase = 1ull << 32; i < vcnt && vbase < above4g_ram_size; ++i) {
+        vmask = (1ull << 40) - 1;
+        while (vbase + vmask + 1 > above4g_ram_size)
+            vmask >>= 1;
+        wrmsr_smp(MTRRphysBase_MSR(i), vbase | 6);
+        wrmsr_smp(MTRRphysMask_MSR(i), (~vmask & 0xfffffff000ull) | 0x800);
+        vbase += vmask + 1;
+    }
     wrmsr_smp(MSR_MTRRdefType, 0xc00);
 }
 
@@ -540,10 +549,19 @@ void ram_probe(void)
         16 * 1024 * 1024;
   else
     ram_size = (cmos_readb(0x17) | (cmos_readb(0x18) << 8)) * 1024;
+
+    if (cmos_readb(0x5b) | cmos_readb(0x5c) | cmos_readb(0x5d))
+        above4g_ram_size = ((uint64_t)cmos_readb(0x5b) << 16) |
+            ((uint64_t)cmos_readb(0x5c) << 24) | ((uint64_t)cmos_readb(0x5d) << 32);
+
+    if (above4g_ram_size)
+        above4g_ram_size += 1ull << 32;
+
 #ifdef BX_USE_EBDA_TABLES
     ebda_cur_addr = ((*(uint16_t *)(0x40e)) << 4) + 0x380;
 #endif
     BX_INFO("ram_size=0x%08lx\n", ram_size);
+    BX_INFO("top of ram %ldMB\n", above4g_ram_size >> 20);
     setup_mtrr();
 }
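
For anyone who wants to play with the range splitting outside the BIOS,
here is a minimal standalone sketch of the same idea. It is not the patch
itself: the names top_of_ram(), mtrr_mask(), cover_above_4g() and
struct mtrr_range are made up for illustration, and the CMOS bytes are
passed in as plain arguments instead of being read with cmos_readb(). It
also adds one thing the patch does not do, namely shrinking a range until
its base is aligned to its size, on the assumption that variable-range
MTRRs want naturally aligned power-of-two blocks. A return value of -1
models the case where we run out of MTRRs before reaching the top of
memory.

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GB (1ull << 32)

/* One variable-range MTRR: a power-of-two sized block of memory. */
struct mtrr_range {
    uint64_t base;
    uint64_t size;
};

/*
 * Hypothetical helper mirroring the CMOS decoding in the patch: bytes
 * 0x5b..0x5d hold the amount of RAM above 4G in 64KB units. Returns the
 * end address of RAM above 4G, or 0 when there is none.
 */
static uint64_t top_of_ram(uint8_t b5b, uint8_t b5c, uint8_t b5d)
{
    uint64_t size = ((uint64_t)b5b << 16) |
                    ((uint64_t)b5c << 24) |
                    ((uint64_t)b5d << 32);

    return size ? size + FOUR_GB : 0;
}

/* PhysMask value in the form the patch writes: address mask plus valid bit. */
static uint64_t mtrr_mask(uint64_t size)
{
    return (~(size - 1) & 0xfffffff000ull) | 0x800;
}

/*
 * Split [4G, top) into at most max power-of-two ranges. Each range is
 * shrunk until it fits below top AND its base is a multiple of its size
 * (the alignment test is an addition, it is not in the patch).
 * Returns the number of ranges used, or -1 if max ranges are not enough.
 */
static int cover_above_4g(uint64_t top, struct mtrr_range *out, int max)
{
    uint64_t base = FOUR_GB;
    int n = 0;

    /* 64KB granularity guarantees that the inner loop terminates. */
    assert((top & 0xffff) == 0);

    while (base < top && n < max) {
        uint64_t size = 1ull << 40;

        while (base + size > top || (base & (size - 1)))
            size >>= 1;

        out[n].base = base;
        out[n].size = size;
        n++;
        base += size;
    }

    return base >= top ? n : -1;
}
```

With the 4608MB from the warning above (CMOS bytes 0x00, 0x20, 0x01) this
gives a top of memory of 0x220000000 and two ranges: 4G at a 4G base, then
512M at an 8G base. Each range would then be programmed with
"base | 6" and mtrr_mask(size), as in the patch.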