From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751707AbbI1F7N (ORCPT ); Mon, 28 Sep 2015 01:59:13 -0400
Received: from mail-wi0-f176.google.com ([209.85.212.176]:38027 "EHLO
	mail-wi0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751130AbbI1F7M (ORCPT );
	Mon, 28 Sep 2015 01:59:12 -0400
Date: Mon, 28 Sep 2015 07:59:07 +0200
From: Ingo Molnar
To: Dave Hansen
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linus Torvalds, Andrew Morton, Peter Zijlstra, Thomas Gleixner
Subject: Re: [PATCH 10/26] x86, pkeys: notify userspace about protection key faults
Message-ID: <20150928055907.GA2684@gmail.com>
References: <20150916174903.E112E464@viggo.jf.intel.com>
	<20150916174906.51062FBC@viggo.jf.intel.com>
	<20150924092320.GA26876@gmail.com>
	<20150924093026.GA29699@gmail.com>
	<560435B4.1010603@sr71.net>
	<20150925071119.GB15753@gmail.com>
	<5605D660.8000009@sr71.net>
	<20150926062023.GB27841@gmail.com>
	<5608703E.5070406@sr71.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5608703E.5070406@sr71.net>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Dave Hansen wrote:

> On 09/25/2015 11:20 PM, Ingo Molnar wrote:
> > * Dave Hansen wrote:
> ...
> >> Since follow_pte() fails for all huge
> >> pages, it just falls back to pulling the protection key out of the VMA,
> >> which _does_ work for huge pages.
> >
> > That might be true for explicit hugetlb vmas, but what about transparent hugepages
> > that can show up in regular vmas?
>
> All PTEs (large or small) established under a given VMA have the same
> protection key. [...]

So a 'pte' is only small. The 'large' thing is called a pmd. So follow_pte() is
not adequate.
But with that removed everything should be fine as the vma (protection) flags are
size independent.

> So I think it's safe to rely on the VMA entirely.  Well, at least as safe as the
> PTE.  It's definitely a wee bit racy, which I'll elaborate on when I repost the
> patches.

So the race I can see is wrt. mprotect(), and we should fix that, because the
existing method of recovering the 'page fault reason', error_code, is not racy - so
the extension of it (the protection key) should not be racy either.

By the time user-space processes the signal we might race with other threads, but
at least the fault-address/error-reason information itself should be coherent.

This can be solved by getting the protection key while still under the down_read()
of the vma - instead of your current solution of a second find_vma().

Thanks,

	Ingo