From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757201AbcBWGqH (ORCPT ); Tue, 23 Feb 2016 01:46:07 -0500
Received: from mail-wm0-f47.google.com ([74.125.82.47]:36092 "EHLO
	mail-wm0-f47.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750704AbcBWGqE (ORCPT );
	Tue, 23 Feb 2016 01:46:04 -0500
Date: Tue, 23 Feb 2016 07:45:59 +0100
From: Ingo Molnar
To: Dave Hansen
Cc: linux-kernel@vger.kernel.org, dave.hansen@linux.intel.com,
	linux-api@vger.kernel.org, linux-mm@kvack.org, x86@kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org
Subject: Re: [RFC][PATCH 6/7] x86, pkeys: add pkey set/get syscalls
Message-ID: <20160223064559.GB21091@gmail.com>
References: <20160223011107.FB9B8215@viggo.jf.intel.com>
	<20160223011116.471AAADA@viggo.jf.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160223011116.471AAADA@viggo.jf.intel.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

* Dave Hansen wrote:

>
> From: Dave Hansen
>
> This establishes two more system calls for protection key management:
>
> 	unsigned long pkey_get(int pkey);
> 	int pkey_set(int pkey, unsigned long access_rights);
>
> The return value from pkey_get() and the 'access_rights' passed
> to pkey_set() are the same format: a bitmask containing
> PKEY_DENY_WRITE and/or PKEY_DENY_ACCESS, or nothing set at all.
>
> These can replace userspace's direct use of the new rdpkru/wrpkru
> instructions.
>
> With current hardware, the kernel can not enforce that it has
> control over a given key.  But, this at least allows the kernel
> to indicate to userspace that userspace does not control a given
> protection key.  This makes it more likely that situations like
> using a pkey after sys_pkey_free() can be detected.
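
[Illustration, not part of the patch] The bitmask format described above can be modelled in plain C without pkeys hardware. This is only a sketch under stated assumptions: the flag values 0x1 and 0x2 are assumed (the patch description names the flags but does not give their values), fake_pkru is an ordinary variable standing in for the real PKRU register, and model_pkey_get()/model_pkey_set() are hypothetical names, not the proposed syscalls. The layout follows the architectural PKRU format of two bits per key: access-disable at bit 2*pkey and write-disable at bit 2*pkey+1.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: the patch description names these flags but not their values. */
#define PKEY_DENY_ACCESS 0x1UL
#define PKEY_DENY_WRITE  0x2UL
#define NR_PKEYS         16

/* Stand-in for the per-thread PKRU register (real code would use RDPKRU/WRPKRU). */
static uint32_t fake_pkru;

/* Hypothetical model of pkey_get(): return the two rights bits for 'pkey'. */
static unsigned long model_pkey_get(int pkey)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);
	return (fake_pkru >> (2 * pkey)) & (PKEY_DENY_ACCESS | PKEY_DENY_WRITE);
}

/* Hypothetical model of pkey_set(): replace the two rights bits for 'pkey'. */
static int model_pkey_set(int pkey, unsigned long access_rights)
{
	if (pkey < 0 || pkey >= NR_PKEYS)
		return -1;	/* no such key */
	if (access_rights & ~(PKEY_DENY_ACCESS | PKEY_DENY_WRITE))
		return -1;	/* unknown bits in the mask */
	fake_pkru &= ~(0x3u << (2 * pkey));			/* clear old rights */
	fake_pkru |= (uint32_t)access_rights << (2 * pkey);	/* install new rights */
	return 0;
}
```

In this model, model_pkey_set(3, PKEY_DENY_WRITE) followed by model_pkey_get(3) hands back PKEY_DENY_WRITE, and the other keys are left untouched.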

So it's analogous to file descriptor open()/close() syscalls: the kernel does not
enforce that different libraries of the same process do not interfere with each
other's file descriptors - but in practice it's not a problem because everyone
uses open()/close().

Resources that a process uses don't per se 'need' kernel level isolation to be
useful.

> The kernel does _not_ enforce that this interface must be used for
> changes to PKRU, whether or not a key has been "allocated".

Nor does the kernel enforce that open() must be used to get a file descriptor, so
code can do the following:

	close(100);

and can interfere with a library that is holding a file open - but it's generally
not a problem and the above is considered poor code that will cause problems.

One thing that is different is that file descriptors are generally plentiful,
while of pkeys there are at most 16 - but I think it's still "large enough" to not
be an issue in practice. We'll see ...

> This syscall interface could also theoretically be replaced with a pair of
> vsyscalls.  The vsyscalls would just call WRPKRU/RDPKRU directly in situations
> where they are drop-in equivalents for what the kernel would be doing.

Indeed.

Thanks,

	Ingo
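
[Illustration, not part of the original mail] The file descriptor analogy can be reproduced with a few lines of C. This is a minimal sketch: lib_open(), lib_write() and rogue_close() are hypothetical names invented for the example, and /dev/null is used only so that the write has no side effects. It shows that the kernel does not stop one part of a process from closing a descriptor that another part still relies on.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* "Library" side: opens a file and keeps the descriptor for later use. */
static int lib_open(void)
{
	return open("/dev/null", O_WRONLY);
}

/* "Library" side: returns 0 on success, or the errno value on failure. */
static int lib_write(int fd)
{
	assert(fd >= 0);
	return write(fd, "x", 1) == 1 ? 0 : errno;
}

/* Unrelated code closing a descriptor it never opened, like close(100). */
static void rogue_close(int fd)
{
	close(fd);
}
```

If rogue_close() is called on the descriptor returned by lib_open(), the next lib_write() on it fails with EBADF: nothing in the kernel prevented the interference, it is purely a matter of convention between the two pieces of code.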