From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <55191EA1.5020500@suse.com>
Date: Mon, 30 Mar 2015 12:00:01 +0200
From: Juergen Gross
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: David Vrabel, linux-kernel@vger.kernel.org,
 xen-devel@lists.xensource.com, konrad.wilk@oracle.com,
 boris.ostrovsky@oracle.com
Subject: Re: [Xen-devel] [PATCH 06/13] xen: detect pre-allocated memory
 interfering with e820 map
References: <1424242326-26611-1-git-send-email-jgross@suse.com>
 <1424242326-26611-7-git-send-email-jgross@suse.com>
 <54E62662.1050703@cantab.net> <54EC19E6.6070007@suse.com>
 <54EDDB2F.6050601@citrix.com> <54EDF187.4050507@suse.com>
In-Reply-To: <54EDF187.4050507@suse.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/25/2015 05:00 PM, Juergen Gross wrote:
> On 02/25/2015 03:24 PM, David Vrabel wrote:
>> On 24/02/15 06:27, Juergen Gross wrote:
>>> On 02/19/2015 07:07 PM, David Vrabel wrote:
>>>> On 18/02/2015 06:51, Juergen Gross wrote:
>>>>> +{
>>>>> +	unsigned long pfn;
>>>>> +	unsigned long area_start, area_end;
>>>>> +	unsigned i;
>>>>> +
>>>>> +	for (i = 0; i < XEN_N_RESERVED_AREAS; i++) {
>>>>> +
>>>>> +		if (!xen_reserved_area[i].size)
>>>>> +			break;
>>>>> +
>>>>> +		area_start = PFN_DOWN(xen_reserved_area[i].start);
>>>>> +		area_end = PFN_UP(xen_reserved_area[i].start +
>>>>> +				  xen_reserved_area[i].size);
>>>>> +		if (area_start >= end_pfn || area_end <= start_pfn)
>>>>> +			continue;
>>>>> +
>>>>> +		if (area_start > start_pfn)
>>>>> +			xen_set_identity_and_remap_chunk(start_pfn, area_start,
>>>>> +							 released, remapped);
>>>>> +
>>>>> +		if (area_end < end_pfn)
>>>>> +			xen_set_identity_and_remap_chunk(area_end, end_pfn,
>>>>> +							 released, remapped);
>>>>> +
>>>>> +		*remapped += min(area_end, end_pfn) -
>>>>> +			     max(area_start, start_pfn);
>>>>> +
>>>>> +		return;
>>>>
>>>> Why not defer the whole chunk that conflicts? Or for that matter defer
>>>> all this remapping to the last minute.
>>>
>>> There are two problems arising from this:
>>>
>>> - In the initrd case remapping would be deferred too long: the initrd
>>>   data is still in use when device initialization is running. And we
>>>   really want the remap to have happened before PCI space is being
>>>   used.
>>
>> I'm not sure I understand what you're saying here.
>
> I thought you wanted to defer the remapping to the point where the
> initrd memory is no longer being used. But the suggestion below is
> more clear.
>
>>
>> I'm suggesting:
>>
>> 1. Reserve all holes.
>>
>> 2. Relocate (if necessary) all modules (initrd, etc.) to regions that
>> are RAM in the e820.
>>
>> 3. Rebuild the p2m in RAM.
>>
>> 4. Relocate frames from E820 holes/reserved to the end, free p2m pages
>> from the holes and replacing them with the read-only 1:1 page (where
>> possible).
>>
>>> - Delaying all remapping to the point where the new p2m list is in place
>>>   would either result in a p2m list with all memory holes covered with
>>>   individual entries as the new list is built with those holes still
>>>   populated, ...
>>> The first option could easily waste significant amounts of memory (on
>>> my test machine with 1TB RAM this would have been about 1GB), while
>>> the second option would be performance critical.
>>
>> I don't understand how this wastes memory. When you relocate the
>> frames from the holes you can reclaim the p2m pages for the holes (and
>> replace them with the r/o mapped identity p2m page).
>
> Okay, this would work, I guess.
>
> I'll have a try with some new patches...

I tried your approach and hit a problem I can't solve without a major
rework of the kernel's init sequence: dmi_scan_machine() (and possibly
other functions like probe_roms()) need the identity mappings of BIOS,
ACPI or PCI memory. Otherwise SMBIOS, DMI and extension ROMs won't be
discovered.

This can be solved only by either a complete rework of the order in
which the init functions are called (not desirable, I guess) or by doing
the unmap part of the remapping as early as it is done today.

This means, of course, that I was just lucky when I resolved the
conflict between the p2m table and the E820 map by merely delaying the
remapping of this memory area: had the p2m table collided with an area
that needs to be identity mapped early, the machine wouldn't have been
able to boot my kernel.

So I really need to relocate the p2m list, even if this is not as easy
as delaying the remapping.

Juergen
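
For reference, the range splitting done by the quoted hunk can be
sketched in isolation as below. This is only an illustration under
stated assumptions, not the kernel code: PAGE_SHIFT is fixed at 12,
PFN_DOWN/PFN_UP are local stand-ins, and struct reserved_area, struct
pfn_range, struct split_result and split_around_area() are hypothetical
names. Like the hunk, it stops at the first conflicting area and
reports the overlap that the patch adds to *remapped.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins (assumption: 4 KiB pages), not the kernel's macros. */
#define PAGE_SHIFT 12
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_UP(x)   (((x) + (1UL << PAGE_SHIFT) - 1) >> PAGE_SHIFT)

/* Hypothetical mirror of one xen_reserved_area[] entry (byte units). */
struct reserved_area {
	unsigned long start;
	unsigned long size;
};

/* Half-open PFN range [start_pfn, end_pfn); empty if both are equal. */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

struct split_result {
	int conflict;            /* 1 if some area overlaps the range */
	struct pfn_range before; /* part below the area, remappable now */
	struct pfn_range after;  /* part above the area, remappable now */
	unsigned long overlap;   /* pages whose remapping is deferred */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Split [start_pfn, end_pfn) around the first reserved area that
 * overlaps it. A zero-sized entry terminates the list, as in the hunk.
 */
struct split_result split_around_area(unsigned long start_pfn,
				      unsigned long end_pfn,
				      const struct reserved_area *areas,
				      size_t n_areas)
{
	struct split_result r = { 0, { 0, 0 }, { 0, 0 }, 0 };
	size_t i;

	for (i = 0; i < n_areas; i++) {
		unsigned long area_start, area_end;

		if (!areas[i].size)
			break;

		area_start = PFN_DOWN(areas[i].start);
		area_end = PFN_UP(areas[i].start + areas[i].size);
		if (area_start >= end_pfn || area_end <= start_pfn)
			continue;	/* no overlap with this area */

		r.conflict = 1;
		if (area_start > start_pfn) {
			r.before.start_pfn = start_pfn;
			r.before.end_pfn = area_start;
		}
		if (area_end < end_pfn) {
			r.after.start_pfn = area_end;
			r.after.end_pfn = end_pfn;
		}
		r.overlap = min_ul(area_end, end_pfn) -
			    max_ul(area_start, start_pfn);
		return r;
	}

	return r;	/* no conflict: all fields stay zero */
}
```

With an area covering bytes 0x3000..0x4fff (PFNs 3 and 4), the range
[1, 8) splits into [1, 3) and [5, 8) with an overlap of two pages.
Only that overlap is what gets delayed, and it is exactly this part
that must not coincide with memory needing an early identity mapping.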