From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Chen, Tiejun" Subject: Re: [v8][PATCH 07/16] hvmloader/e820: construct guest e820 table Date: Thu, 16 Jul 2015 23:29:03 +0800 Message-ID: <55A7CDBF.1030104@intel.com> References: <1437029582-19564-1-git-send-email-tiejun.chen@intel.com> <1437029582-19564-8-git-send-email-tiejun.chen@intel.com> <55A7ADAC.1010004@intel.com> <55A7C804.2040101@intel.com> <55A7CABE.4040005@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <55A7CABE.4040005@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: George Dunlap , George Dunlap Cc: Keir Fraser , Ian Campbell , Stefano Stabellini , Andrew Cooper , Ian Jackson , "xen-devel@lists.xen.org" , Jan Beulich , Wei Liu List-Id: xen-devel@lists.xenproject.org On 2015/7/16 23:16, George Dunlap wrote: > On 07/16/2015 04:04 PM, Chen, Tiejun wrote: >>> Yes, sorry, add_high_mem will be the size of memory *relocated*, not >>> the actual end of it (unless, as you say, the original highmem region >>> didn't exist). >>> >>> What I really meant was that either way, after adjusting the highmem >>> region in the e820, the end of that region should correspond to >>> hvm_info->high_mem_pgend. >>> >>> What about something like this? >>> --- >>> /* >>> * And then we also need to adjust highmem. >>> */ >>> if ( add_high_mem ) >>> { >>> /* >>> * Modify the existing highmem region if it exists >>> */ >>> for ( i = 0; i < nr; i++ ) >>> { >>> if ( e820[i].type == E820_RAM && >>> e820[i].addr == (1ull << 32)) >>> { >>> e820[i].size += add_high_mem; >>> break; >>> } >>> } >>> >>> /* >>> * If we didn't find a highmem region, make one >>> */ >>> if ( i == nr ) >>> { >>> e820[nr].addr = ((uint64_t)1 << 32); >>> e820[nr].size = e820[nr].addr + add_high_mem; >>> e820[nr].type = E820_RAM; >>> nr++; >>> } >>> >>> /* >>> * Either way, at this point i points to the entry containing >>> * highmem. Compare it to what's in hvm_info as a sanity >>> * check. >>> */ >>> BUG_ON(e820[i].addr+e820[i].size != >>> ((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT)); >>> } >>> >> >> Looks really better. >> >> I just introduce a little change based on yours, and I post this as a >> whole, >> >> diff --git a/tools/firmware/hvmloader/e820.c >> b/tools/firmware/hvmloader/e820.c >> index 7a414ab..8c9b01f 100644 >> --- a/tools/firmware/hvmloader/e820.c >> +++ b/tools/firmware/hvmloader/e820.c >> @@ -105,7 +105,10 @@ int build_e820_table(struct e820entry *e820, >> unsigned int lowmem_reserved_base, >> unsigned int bios_image_base) >> { >> - unsigned int nr = 0; >> + unsigned int nr = 0, i, j; >> + uint32_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; >> + uint64_t high_mem_end = (uint64_t)hvm_info->high_mem_pgend << >> PAGE_SHIFT; >> + uint64_t add_high_mem = 0; >> >> if ( !lowmem_reserved_base ) >> lowmem_reserved_base = 0xA0000; >> @@ -149,13 +152,6 @@ int build_e820_table(struct e820entry *e820, >> e820[nr].type = E820_RESERVED; >> nr++; >> >> - /* Low RAM goes here. Reserve space for special pages. */ >> - BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20)); >> - e820[nr].addr = 0x100000; >> - e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - >> e820[nr].addr; >> - e820[nr].type = E820_RAM; >> - nr++; >> - >> /* >> * Explicitly reserve space for special pages. >> * This space starts at RESERVED_MEMBASE an extends to cover various >> @@ -191,16 +187,91 @@ int build_e820_table(struct e820entry *e820, >> nr++; >> } >> >> - >> - if ( hvm_info->high_mem_pgend ) >> + /* >> + * Construct E820 table according to recorded memory map. >> + * >> + * The memory map created by toolstack may include, >> + * >> + * #1. Low memory region >> + * >> + * Low RAM starts at least from 1M to make sure all standard regions >> + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios, >> + * have enough space. >> + * >> + * #2. Reserved regions if they exist >> + * >> + * #3. High memory region if it exists >> + */ >> + for ( i = 0; i < memory_map.nr_map; i++ ) >> { >> - e820[nr].addr = ((uint64_t)1 << 32); >> - e820[nr].size = >> - ((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - >> e820[nr].addr; >> - e820[nr].type = E820_RAM; >> + e820[nr] = memory_map.map[i]; >> nr++; >> } >> >> + /* Low RAM goes here. Reserve space for special pages. */ >> + BUG_ON(low_mem_end < (2u << 20)); >> + >> + /* >> + * Its possible to relocate RAM to allocate sufficient MMIO previously >> + * so low_mem_pgend would be changed over there. And here memory_map[] >> + * records the original low/high memory, so if low_mem_end is less >> than >> + * the original we need to revise low/high memory range in e820. >> + */ >> + for ( i = 0; i < nr; i++ ) >> + { >> + uint64_t end = e820[i].addr + e820[i].size; >> + if ( e820[i].type == E820_RAM && >> + low_mem_end > e820[i].addr && low_mem_end < end ) >> + { >> + add_high_mem = end - low_mem_end; >> + e820[i].size = low_mem_end - e820[i].addr; >> + } >> + } >> + >> + /* >> + * And then we also need to adjust highmem. >> + */ >> + if ( add_high_mem ) >> + { >> + /* Modify the existing highmem region if it exists. */ >> + for ( i = 0; i < nr; i++ ) >> + { >> + if ( e820[i].type == E820_RAM && >> + e820[i].addr == ((uint64_t)1 << 32)) >> + { >> + e820[i].size += add_high_mem; >> + break; >> + } >> + } >> + >> + /* If there was no highmem region, just create one. */ >> + if ( i == nr ) >> + { >> + e820[nr].addr = ((uint64_t)1 << 32); >> + e820[nr].size = high_mem_end - e820[nr].addr; >> + e820[nr].type = E820_RAM; >> + nr++; >> + } >> + >> + /* A sanity check if high memory is broken. */ >> + BUG_ON( high_mem_end != e820[i].addr + e820[i].size); > > The reason I wrote it the way I did was so that we would cross-check our > lowmem adjustments (via add_high_mem) with the value in hvm_info in > *both cases*. > > In the code above, you'll get the sanity check if we modify an existing > e820 entry; but if we create a new entry, then we don't check to make > sure that the amount we removed from the lowmem entry equals the amount > we added to the highmem entry. Are you saying the following two cases are not same? uint64_t high_mem_end = (uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT; BUG_ON( high_mem_end != e820[i].addr + e820[i].size); vs. BUG_ON(e820[i].addr+e820[i].size != ((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT)); Why? Note hvm_info->high_mem_pgend don't change while build e820 table. Honestly I didn't try to change that point but maybe I'm missing something? Thanks Tiejun > > By all means, calculate high_mem_end so it's easier to read. But then, > when creating a new region, set e820[nr].size = add_high_mem, so that > the BUG_ON() that follows actually checks something useful. > > -George > >