From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Hansen
Subject: Re: [GIT PULL] x86/mm changes for v3.9-rc1
Date: Fri, 22 Feb 2013 09:30:28 -0800
Message-ID: <5127AB34.8090406@linux.vnet.ibm.com>
References: <201302220034.r1M0Y6O8008311@terminus.zytor.com>
 <20130222165531.GA29308@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20130222165531.GA29308@phenom.dumpdata.com>
Sender: virtualization-bounces@lists.linux-foundation.org
Errors-To: virtualization-bounces@lists.linux-foundation.org
To: Konrad Rzeszutek Wilk
Cc: linux-mips@linux-mips.org, Jeremy Fitzhardinge, "H. J. Lu",
 Frederic Weisbecker, Joe Millenbach,
 virtualization@lists.linux-foundation.org, Gokul Caushik, Ralf Baechle,
 Pavel Machek, "H. Peter Anvin", sparclinux@vger.kernel.org,
 Christoph Lameter, Ingo Molnar, Ville Syrjälä, Marek Szyprowski,
 Andrea Arcangeli, Lee Schermerhorn, xen-devel@lists.xensource.com,
 Russell King, Len Brown, Joerg Roedel, linux-pm@vger.kernel.org,
 Hugh Dickins, Yasuaki Ishimatsu
List-Id: linux-pm@vger.kernel.org

On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
>> Hi Linus,
>>
>> This is a huge set of several partly interrelated (and concurrently
>> developed) changes, which is why the branch history is messier than
>> one would like.
>>
>> The *really* big items are two humonguous patchsets mostly developed
>> by Yinghai Lu at my request, which completely revamps the way we
>> create initial page tables.
>> In particular, rather than estimating how
>> much memory we will need for page tables and then build them into that
>> memory -- a calculation that has shown to be incredibly fragile -- we
>> now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
>> a #PF handler which creates temporary page tables on demand.
>>
>> This has several advantages:
>>
>> 1. It makes it much easier to support things that need access to
>>    data very early (a followon patchset uses this to load microcode
>>    way early in the kernel startup).
>>
>> 2. It allows the kernel and all the kernel data objects to be invoked
>>    from above the 4 GB limit.  This allows kdump to work on very large
>>    systems.
>>
>> 3. It greatly reduces the difference between Xen and native (Xen's
>>    equivalent of the #PF handler are the temporary page tables created
>>    by the domain builder), eliminating a bunch of fragile hooks.
>>
>> The patch series also gets us a bit closer to W^X.
>>
>> Additional work in this pull is the 64-bit get_user() work which you
>> were also involved with, and a bunch of cleanups/speedups to
>> __phys_addr()/__pa().
>
> Looking at figuring out which of the patches in the branch did this, but
> with this merge I am getting a crash with a very simple PV guest (booted with
> one 1G):
>
> Call Trace:
> [] xen_get_user_pgd+0x5a <--
> [] xen_get_user_pgd+0x5a
> [] xen_write_cr3+0x77
> [] init_mem_mapping+0x1f9
> [] setup_arch+0x742
> [] printk+0x48
> [] start_kernel+0x90
> [] __add_preferred_console.clone.1+0x9b
> [] x86_64_start_reservations+0x2a
> [] xen_start_kernel+0x564

Do you have CONFIG_DEBUG_VIRTUAL on?

You're probably hitting the new BUG_ON() in __phys_addr().  It's
intended to detect places where someone is doing a __pa()/__phys_addr()
on an address that's outside the kernel's identity mapping.

There are a lot of __pa() calls around there, but from the looks of it,
it's this code:

static pgd_t *xen_get_user_pgd(pgd_t *pgd)
{
...
        if (offset < pgd_index(USER_LIMIT)) {
                struct page *page = virt_to_page(pgd_page);

I'm a bit fuzzy on exactly what the code is trying to do here.  It could
mean either that the identity mapping isn't set up enough yet, or that
__pa() is getting called on a bogus address.

I'm especially fuzzy on why we'd be calling anything that's looking at
userspace pagetables (xen_get_user_pgd() ??) this early in boot.
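
To make the kind of check described above concrete, here is a rough
userspace sketch of a range-checked virtual-to-physical translation.
It is not the kernel's __phys_addr(): every name and constant below
(sketch_virt_to_phys(), the SKETCH_* values, the 1G RAM size, the image
sitting at physical 0) is made up for illustration.  Where a kernel
built with CONFIG_DEBUG_VIRTUAL would BUG, the sketch just returns
false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only; not the kernel's real values. */
#define SKETCH_DIRECT_MAP_BASE   0xffff880000000000ULL /* linear map of RAM */
#define SKETCH_KERNEL_MAP_BASE   0xffffffff80000000ULL /* kernel image map  */
#define SKETCH_KERNEL_IMAGE_SIZE (512ULL << 20)        /* 512 MB window     */
#define SKETCH_RAM_SIZE          (1ULL << 30)          /* pretend 1G guest  */

/*
 * Translate vaddr to a physical address if it lies inside one of the two
 * linearly mapped ranges.  Returns false for anything else, which is the
 * case a debug check would flag.
 */
static bool sketch_virt_to_phys(uint64_t vaddr, uint64_t *paddr)
{
	assert(paddr != NULL);

	/* Kernel image mapping; tested first because it is the higher range. */
	if (vaddr >= SKETCH_KERNEL_MAP_BASE) {
		uint64_t off = vaddr - SKETCH_KERNEL_MAP_BASE;

		if (off >= SKETCH_KERNEL_IMAGE_SIZE)
			return false;
		*paddr = off;	/* assumes the image is loaded at physical 0 */
		return true;
	}

	/* Direct (linear) mapping of RAM. */
	if (vaddr >= SKETCH_DIRECT_MAP_BASE) {
		uint64_t off = vaddr - SKETCH_DIRECT_MAP_BASE;

		if (off >= SKETCH_RAM_SIZE)
			return false;
		*paddr = off;
		return true;
	}

	/* Below both ranges: userspace or otherwise bogus for this purpose. */
	return false;
}
```

With this layout, an address one page into the direct map translates
to 0x1000, while a userspace-looking address or one past the end of
the pretend RAM is rejected.  Either of the two cases above (mapping
not set up yet, or a bogus pointer) would show up as the rejected path.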