From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932302AbbLNLUP (ORCPT ); Mon, 14 Dec 2015 06:20:15 -0500 Received: from mx2.suse.de ([195.135.220.15]:55573 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932276AbbLNLUN (ORCPT ); Mon, 14 Dec 2015 06:20:13 -0500 Date: Mon, 14 Dec 2015 12:20:04 +0100 From: Borislav Petkov To: Will Deacon Cc: Fu Wei , Lorenzo Pieralisi , Hanjun Guo , Tomasz Nowicki , Tomasz Nowicki , Rafael Wysocki , Len Brown , Catalin Marinas , LKML , linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org, Linaro ACPI Mailman List , G Gregory , Al Stone , Mark Rutland , Marc Zyngier , rruigrok@codeaurora.org, "Abdulhamid, Harb" , Jon Masters , Mark Salter , Grant Likely , rrichter@cavium.com, jarkko.nikula@linux.intel.com, Jonathan Zhang , Michal Hocko , Vlastimil Babka Subject: Re: [PATCH v4] acpi, apei, arm64: APEI initial support for aarch64. Message-ID: <20151214112004.GB11630@pd.tnic> References: <1449558183-12259-1-git-send-email-fu.wei@linaro.org> <20151208123433.GB404@red-moon> <5666D28C.4090305@linaro.org> <20151208140720.GA2395@red-moon> <20151210110135.GC21134@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20151210110135.GC21134@arm.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Dec 10, 2015 at 11:01:35AM +0000, Will Deacon wrote: > [adding Boris, as he might know how this works] Gee, thanks Will, now you're making me look at this too :-) > It's not about flushing one page, flush_tlb_kernel_range (which is called > by unmap_kernel_range) already takes care of that. The problem is that > the unmap is called from irq/nmi context, so the IPIs required for > broadcasting the TLB maintenance on x86 cannot be safely executed. Hmm, if you're talking about ghes_iounmap_nmi() and ghes_iounmap_irq() which are the two callers of unmap_kernel_range_noflush(), that last one is calling vunmap_page_range() which is fiddling with the page table. And I don't see TLB flushing IPIs there. If you mean arch_apei_flush_tlb_one(), that's INVLPG on x86 so also no IPI. What am I missing? > Ideally, I think the ghes code would just use unmap_kernel_range unless > the architecture specifically says that doesn't work in irq context. In > that case, we don't need to implement the arch_apei_flush_tlb_one callback > on arm64. Well, what bothers me with using unmap_kernel_range()/vunmap_page_range() is that if a GHES IRQ/NMI happens while something is executing those, the NMI will interrupt whatever's happening and it will possibly corrupt the pagetable, IMHO. Michal, Vlasta, can you please take a look? More specifically, those ghes_iounmap_nmi/ghes_iounmap_irq calls to unmap_kernel_range_noflush() happening in NMI/IRQ context. > One thing I don't fully grok about the code: since the page is mapped > using ioremap_page_range, doesn't that allow other CPUs to speculatively > fill their TLB with entries corresponding to the page mapped by the IRQ > handler on another core? If the core with the speculative entries then > takes an APEI exception, what guarantees do we have that it's looking at > the right page? I think, for x86, we need a local invalidation on map, > too. You're looking at ghes_copy_tofrom_phys(), right? That's grabbing spinlocks in IRQ/NMI context and doing the iounmap a bit later, below on the same core. I mean, I don't see us landing on another core in between, we're non-preemptible... Or do you mean something else? -- Regards/Gruss, Boris. SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) --