From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 26 Aug 2020 14:01:36 +0300
From: Mike Rapoport
To: Andrew Morton
Cc: Alexander Viro, Andy Lutomirski, Arnd Bergmann, Borislav Petkov,
 Catalin Marinas, Christopher Lameter, Dan Williams, Dave Hansen,
 Elena Reshetova, "H. Peter Anvin", Idan Yaniv, Ingo Molnar,
 James Bottomley, "Kirill A. Shutemov", Matthew Wilcox, Mark Rutland,
 Mike Rapoport, Michael Kerrisk, Palmer Dabbelt, Paul Walmsley,
 Peter Zijlstra, Thomas Gleixner, Tycho Andersen, Will Deacon,
 linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org,
 x86@kernel.org
Subject: Re: [PATCH v4 0/6] mm: introduce memfd_secret system call to create "secret" memory areas
Message-ID: <20200826110136.GA69706@kernel.org>
References: <20200818141554.13945-1-rppt@kernel.org>
In-Reply-To: <20200818141554.13945-1-rppt@kernel.org>
X-Mailing-List: linux-api@vger.kernel.org

Any comments on this?

On Tue, Aug 18, 2020 at 05:15:48PM +0300, Mike Rapoport wrote:
> From: Mike Rapoport
>
> Hi,
>
> This is an implementation of "secret" mappings backed by a file descriptor.
>
> v4 changes:
> * rebase on v5.9-rc1
> * Do not redefine PMD_PAGE_ORDER in fs/dax.c, thanks Kirill
> * Make secret mappings exclusive by default and only require flags to
>   memfd_secret() system call for uncached mappings, thanks again Kirill :)
>
> v3 changes:
> * Squash kernel-parameters.txt update into the commit that added the
>   command line option.
> * Make uncached mode explicitly selectable by architectures. For now enable
>   it only on x86.
>
> v2 changes:
> * Follow Michael's suggestion and name the new system call 'memfd_secret'
> * Add kernel-parameters documentation about the boot option
> * Fix i386-tinyconfig regression reported by the kbuild bot.
>   CONFIG_SECRETMEM now depends on !EMBEDDED, which disables it on small
>   systems while still making it available unconditionally on
>   architectures that support SET_DIRECT_MAP.
>
>
> The file descriptor backing secret memory mappings is created using a
> dedicated memfd_secret system call. The desired protection mode for the
> memory is configured using the flags parameter of the system call. The
> mmap() of the file descriptor created with memfd_secret() will create a
> "secret" memory mapping. The pages in that mapping will be marked as not
> present in the direct map and will have the desired protection bits set in
> the user page table. For instance, the current implementation allows
> uncached mappings.
>
> Although normally Linux userspace mappings are protected from other users,
> such secret mappings are useful for environments where a hostile tenant is
> trying to trick the kernel into giving them access to other tenants'
> mappings.
>
> Additionally, the secret mappings may be used as a means to protect guest
> memory in a virtual machine host.
>
> For demonstration of secret memory usage we've created a userspace library
> [1] that does two things. First, it acts as a preloader for openssl that
> redirects all the OPENSSL_malloc calls to secret memory, so any secret
> keys get automatically protected this way. Second, it exposes the API to
> the user who needs it. We anticipate that a lot of the use cases would be
> like the openssl one: many toolkits that deal with secret keys already
> have special handling for the memory to try to give them greater
> protection, so this would simply be pluggable into the toolkits without
> any need for user application modification.
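
For illustration, the userspace flow described above (create the descriptor,
mmap() it, use the mapping) could look roughly like the sketch below. It is
not code from the series. The helper names secret_fd_create(), secret_alloc()
and secret_free() are hypothetical, the syscall is only attempted when the
installed headers define __NR_memfd_secret, and sizing the file with
ftruncate() and mapping it with MAP_SHARED are assumptions about how the
interface is meant to be used.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: create the file descriptor that backs a secret
 * mapping. The raw syscall is used because libc may not ship a wrapper.
 * If the headers do not know the syscall number, fail with ENOSYS.
 */
int secret_fd_create(unsigned int flags)
{
#ifdef __NR_memfd_secret
	return (int)syscall(__NR_memfd_secret, flags);
#else
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}

/* Map len bytes of secret memory; returns NULL with errno set on failure. */
void *secret_alloc(size_t len)
{
	void *p;
	int fd, saved;

	assert(len > 0);

	/* flags == 0: the default (exclusive) mode, no uncached request */
	fd = secret_fd_create(0);
	if (fd < 0)
		return NULL;

	/* Assumption: the file size is set with ftruncate() before mmap(). */
	if (ftruncate(fd, (off_t)len) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return NULL;
	}

	/* Assumption: the mapping has to be shared. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* The mapping keeps the file alive; preserve errno across close(). */
	saved = errno;
	close(fd);
	errno = saved;

	return p == MAP_FAILED ? NULL : p;
}

/* Unmap a region obtained from secret_alloc(). */
void secret_free(void *p, size_t len)
{
	if (p)
		munmap(p, len);
}
```

A caller would treat a NULL return as "secret memory is not available here"
(for example ENOSYS on a kernel without the system call) and fall back to
its usual allocator.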
>
> I've hesitated whether to continue to use new flags to memfd_create() or to
> add a new system call, and I've decided to use a new system call after I
> started to look into the man pages update. There would have been two
> completely independent descriptions and I think it would have been very
> confusing.
>
> Hiding secret memory mappings behind an anonymous file allows (ab)use of
> the page cache for tracking pages allocated for the "secret" mappings as
> well as using address_space_operations for e.g. page migration callbacks.
>
> The anonymous file may also be used implicitly, like hugetlb files, to
> implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
> ABIs in the future.
>
> As the fragmentation of the direct map was one of the major concerns raised
> during the previous postings, I've added an amortizing cache of PMD-size
> pages to each file descriptor and an ability to reserve large chunks of the
> physical memory at boot time and then use this memory as an allocation pool
> for the secret memory areas.
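
To put a number on the amortization mentioned above, here is a small
arithmetic sketch. It is not code from the series. It assumes that
PMD_PAGE_ORDER is PMD_SHIFT - PAGE_SHIFT and uses the x86-64 values for
4 KiB base pages (PAGE_SHIFT 12, PMD_SHIFT 21); the *_ASSUMED constants and
both function names are hypothetical.

```c
#include <stddef.h>

/* Assumed x86-64 values with 4 KiB base pages; other configurations differ. */
enum {
	PAGE_SHIFT_ASSUMED = 12,
	PMD_SHIFT_ASSUMED = 21,
	PMD_PAGE_ORDER_ASSUMED = PMD_SHIFT_ASSUMED - PAGE_SHIFT_ASSUMED
};

/* Number of base pages covered by one PMD-size page. */
size_t pages_per_pmd(void)
{
	return (size_t)1 << PMD_PAGE_ORDER_ASSUMED;
}

/*
 * Number of PMD-size chunks a cache would need to hand out npages base
 * pages, i.e. npages / pages_per_pmd() rounded up.
 */
size_t pmd_chunks_needed(size_t npages)
{
	size_t per = pages_per_pmd();

	return (npages + per - 1) / per;
}
```

Under these assumptions one PMD-size chunk covers 512 base pages, so a cache
of this kind would have to take a new chunk out of the direct map roughly
once per 512 secret pages instead of once per page.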
>
> v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
> v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
> v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org/
> rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/
> rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
>
> Mike Rapoport (6):
>   mm: add definition of PMD_PAGE_ORDER
>   mmap: make mlock_future_check() global
>   mm: introduce memfd_secret system call to create "secret" memory areas
>   arch, mm: wire up memfd_secret system call were relevant
>   mm: secretmem: use PMD-size pages to amortize direct map fragmentation
>   mm: secretmem: add ability to reserve memory at boot
>
>  arch/Kconfig                           |   7 +
>  arch/arm64/include/asm/unistd.h        |   2 +-
>  arch/arm64/include/asm/unistd32.h      |   2 +
>  arch/arm64/include/uapi/asm/unistd.h   |   1 +
>  arch/riscv/include/asm/unistd.h        |   1 +
>  arch/x86/Kconfig                       |   1 +
>  arch/x86/entry/syscalls/syscall_32.tbl |   1 +
>  arch/x86/entry/syscalls/syscall_64.tbl |   1 +
>  fs/dax.c                               |  11 +-
>  include/linux/pgtable.h                |   3 +
>  include/linux/syscalls.h               |   1 +
>  include/uapi/asm-generic/unistd.h      |   7 +-
>  include/uapi/linux/magic.h             |   1 +
>  include/uapi/linux/secretmem.h         |   8 +
>  kernel/sys_ni.c                        |   2 +
>  mm/Kconfig                             |   4 +
>  mm/Makefile                            |   1 +
>  mm/internal.h                          |   3 +
>  mm/mmap.c                              |   5 +-
>  mm/secretmem.c                         | 451 +++++++++++++++++++++++++
>  20 files changed, 501 insertions(+), 12 deletions(-)
>  create mode 100644 include/uapi/linux/secretmem.h
>  create mode 100644 mm/secretmem.c
>
> --
> 2.26.2
>

--
Sincerely yours,
Mike.
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF1C1C433E3 for ; Wed, 26 Aug 2020 11:01:50 +0000 (UTC) Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A028B2087C for ; Wed, 26 Aug 2020 11:01:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="kiv8zgwJ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A028B2087C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-nvdimm-bounces@lists.01.org Received: from ml01.vlan13.01.org (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 741B2123B993C; Wed, 26 Aug 2020 04:01:50 -0700 (PDT) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=198.145.29.99; helo=mail.kernel.org; envelope-from=rppt@kernel.org; receiver= Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 721691232DC96 for ; Wed, 26 Aug 2020 04:01:48 -0700 (PDT) Received: from kernel.org (unknown [87.70.91.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 973122083B; Wed, 26 Aug 2020 11:01:40 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1598439708; bh=dZBupqK1ICb+D03LduZMWk3v5MqcNKkHFnB2Fn2U8ew=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=kiv8zgwJI7uG2FhY/e8QqDG2hOoqnYuNOROcQg+SjF8hvfVdRzG1wHOGn6+X30FDO ZWWHCGgtt+W1/ww34rQ+2mr81fCzhvdmP2nrQmcqhUPDEJoNN6IRnAO8LrR6+rqjFf kjaCY34RDfsI87WbIfQ9Ghhg2byDUnNBjxwF7HUM= Date: Wed, 26 Aug 2020 14:01:36 +0300 From: Mike Rapoport To: Andrew Morton Subject: Re: [PATCH v4 0/6] mm: introduce memfd_secret system call to create "secret" memory areas Message-ID: <20200826110136.GA69706@kernel.org> References: <20200818141554.13945-1-rppt@kernel.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20200818141554.13945-1-rppt@kernel.org> Message-ID-Hash: 5Z4I4GSFMYLY56APNZCM7AC4VKAZ5CPB X-Message-ID-Hash: 5Z4I4GSFMYLY56APNZCM7AC4VKAZ5CPB X-MailFrom: rppt@kernel.org X-Mailman-Rule-Hits: nonmember-moderation X-Mailman-Rule-Misses: dmarc-mitigation; no-senders; approved; emergency; loop; banned-address; member-moderation CC: Alexander Viro , Andy Lutomirski , Arnd Bergmann , Borislav Petkov , Catalin Marinas , Christopher Lameter , Dave Hansen , Elena Reshetova , "H. Peter Anvin" , Idan Yaniv , Ingo Molnar , James Bottomley , "Kirill A. Shutemov" , Matthew Wilcox , Mark Rutland , Mike Rapoport , Michael Kerrisk , Palmer Dabbelt , Paul Walmsley , Peter Zijlstra , Thomas Gleixner , Tycho Andersen , Will Deacon , linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm @kvack.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org, x86@kernel.org X-Mailman-Version: 3.1.1 Precedence: list List-Id: "Linux-nvdimm developer list." Archived-At: List-Archive: List-Help: List-Post: List-Subscribe: List-Unsubscribe: Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Any comments on this? 
On Tue, Aug 18, 2020 at 05:15:48PM +0300, Mike Rapoport wrote: > From: Mike Rapoport > > Hi, > > This is an implementation of "secret" mappings backed by a file descriptor. > > v4 changes: > * rebase on v5.9-rc1 > * Do not redefine PMD_PAGE_ORDER in fs/dax.c, thanks Kirill > * Make secret mappings exclusive by default and only require flags to > memfd_secret() system call for uncached mappings, thanks again Kirill :) > > v3 changes: > * Squash kernel-parameters.txt update into the commit that added the > command line option. > * Make uncached mode explicitly selectable by architectures. For now enable > it only on x86. > > v2 changes: > * Follow Michael's suggestion and name the new system call 'memfd_secret' > * Add kernel-parameters documentation about the boot option > * Fix i386-tinyconfig regression reported by the kbuild bot. > CONFIG_SECRETMEM now depends on !EMBEDDED to disable it on small systems > from one side and still make it available unconditionally on > architectures that support SET_DIRECT_MAP. > > > The file descriptor backing secret memory mappings is created using a > dedicated memfd_secret system call The desired protection mode for the > memory is configured using flags parameter of the system call. The mmap() > of the file descriptor created with memfd_secret() will create a "secret" > memory mapping. The pages in that mapping will be marked as not present in > the direct map and will have desired protection bits set in the user page > table. For instance, current implementation allows uncached mappings. > > Although normally Linux userspace mappings are protected from other users, > such secret mappings are useful for environments where a hostile tenant is > trying to trick the kernel into giving them access to other tenants > mappings. > > Additionally, the secret mappings may be used as a mean to protect guest > memory in a virtual machine host. 
> > For demonstration of secret memory usage we've created a userspace library > [1] that does two things: the first is act as a preloader for openssl to > redirect all the OPENSSL_malloc calls to secret memory meaning any secret > keys get automatically protected this way and the other thing it does is > expose the API to the user who needs it. We anticipate that a lot of the > use cases would be like the openssl one: many toolkits that deal with > secret keys already have special handling for the memory to try to give > them greater protection, so this would simply be pluggable into the > toolkits without any need for user application modification. > > I've hesitated whether to continue to use new flags to memfd_create() or to > add a new system call and I've decided to use a new system call after I've > started to look into man pages update. There would have been two completely > independent descriptions and I think it would have been very confusing. > > Hiding secret memory mappings behind an anonymous file allows (ab)use of > the page cache for tracking pages allocated for the "secret" mappings as > well as using address_space_operations for e.g. page migration callbacks. > > The anonymous file may be also used implicitly, like hugetlb files, to > implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm > ABIs in the future. > > As the fragmentation of the direct map was one of the major concerns raised > during the previous postings, I've added an amortizing cache of PMD-size > pages to each file descriptor and an ability to reserve large chunks of the > physical memory at boot time and then use this memory as an allocation pool > for the secret memory areas. 
> > v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org > v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org > v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org/ > rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/ > rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/ > > Mike Rapoport (6): > mm: add definition of PMD_PAGE_ORDER > mmap: make mlock_future_check() global > mm: introduce memfd_secret system call to create "secret" memory areas > arch, mm: wire up memfd_secret system call were relevant > mm: secretmem: use PMD-size pages to amortize direct map fragmentation > mm: secretmem: add ability to reserve memory at boot > > arch/Kconfig | 7 + > arch/arm64/include/asm/unistd.h | 2 +- > arch/arm64/include/asm/unistd32.h | 2 + > arch/arm64/include/uapi/asm/unistd.h | 1 + > arch/riscv/include/asm/unistd.h | 1 + > arch/x86/Kconfig | 1 + > arch/x86/entry/syscalls/syscall_32.tbl | 1 + > arch/x86/entry/syscalls/syscall_64.tbl | 1 + > fs/dax.c | 11 +- > include/linux/pgtable.h | 3 + > include/linux/syscalls.h | 1 + > include/uapi/asm-generic/unistd.h | 7 +- > include/uapi/linux/magic.h | 1 + > include/uapi/linux/secretmem.h | 8 + > kernel/sys_ni.c | 2 + > mm/Kconfig | 4 + > mm/Makefile | 1 + > mm/internal.h | 3 + > mm/mmap.c | 5 +- > mm/secretmem.c | 451 +++++++++++++++++++++++++ > 20 files changed, 501 insertions(+), 12 deletions(-) > create mode 100644 include/uapi/linux/secretmem.h > create mode 100644 mm/secretmem.c > > -- > 2.26.2 > -- Sincerely yours, Mike. 
_______________________________________________ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-leave@lists.01.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F690C433E1 for ; Wed, 26 Aug 2020 11:02:09 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 69CE720838 for ; Wed, 26 Aug 2020 11:02:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="ajo+T68o"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="kiv8zgwJ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 69CE720838 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References:Message-ID: Subject:To:From:Date:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; 
bh=a5mCKeOpXpxyOL9tMNDNXOrv1bg1hHMNpG9z2aQmpcU=; b=ajo+T68oguuYusEhUjC0i5Ek0 Nfh0+E+ze8EqFTytdlL5ibSun6ZK+s3F6QOYtPqd+FjX94wijLvV5/jjiojp5lq1PTc2ciJPvMaFq 0VKzn4+jTAGB4/kMSzoHrTVgs/uoRLUz9NiwMZg1Z1Nu7+r0LYywqPXepd0MRgq1guvzUQivfDmU7 nPbXbiU00PffpG8UMIHte1NDcjW4wuqTL+5+pK7Bh3mHn4DRwmV7ZG+DzP+qZtsia9OOFR3V2e22C XGbdPIOtkLf8K5Kxr18HKWwz/cuW3tT0cb/lltmq3W76TWqdej/2fJBx9GnmiFqHlLu3fO6pgZRpK 4/j1lNF5Q==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kAtBk-0002GL-1p; Wed, 26 Aug 2020 11:01:56 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kAtBd-0002Dg-JQ; Wed, 26 Aug 2020 11:01:50 +0000 Received: from kernel.org (unknown [87.70.91.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 973122083B; Wed, 26 Aug 2020 11:01:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1598439708; bh=dZBupqK1ICb+D03LduZMWk3v5MqcNKkHFnB2Fn2U8ew=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=kiv8zgwJI7uG2FhY/e8QqDG2hOoqnYuNOROcQg+SjF8hvfVdRzG1wHOGn6+X30FDO ZWWHCGgtt+W1/ww34rQ+2mr81fCzhvdmP2nrQmcqhUPDEJoNN6IRnAO8LrR6+rqjFf kjaCY34RDfsI87WbIfQ9Ghhg2byDUnNBjxwF7HUM= Date: Wed, 26 Aug 2020 14:01:36 +0300 From: Mike Rapoport To: Andrew Morton Subject: Re: [PATCH v4 0/6] mm: introduce memfd_secret system call to create "secret" memory areas Message-ID: <20200826110136.GA69706@kernel.org> References: <20200818141554.13945-1-rppt@kernel.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20200818141554.13945-1-rppt@kernel.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200826_070149_782864_FC82FB67 X-CRM114-Status: GOOD ( 37.85 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: 
list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , Peter Zijlstra , Catalin Marinas , Dave Hansen , linux-mm@kvack.org, "H. Peter Anvin" , Christopher Lameter , Idan Yaniv , Dan Williams , Elena Reshetova , linux-arch@vger.kernel.org, Tycho Andersen , linux-nvdimm@lists.01.org, Will Deacon , x86@kernel.org, Matthew Wilcox , Mike Rapoport , Ingo Molnar , Michael Kerrisk , Arnd Bergmann , James Bottomley , Borislav Petkov , Alexander Viro , Andy Lutomirski , Paul Walmsley , "Kirill A. Shutemov" , Thomas Gleixner , linux-arm-kernel@lists.infradead.org, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Palmer Dabbelt , linux-fsdevel@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Any comments on this? On Tue, Aug 18, 2020 at 05:15:48PM +0300, Mike Rapoport wrote: > From: Mike Rapoport > > Hi, > > This is an implementation of "secret" mappings backed by a file descriptor. > > v4 changes: > * rebase on v5.9-rc1 > * Do not redefine PMD_PAGE_ORDER in fs/dax.c, thanks Kirill > * Make secret mappings exclusive by default and only require flags to > memfd_secret() system call for uncached mappings, thanks again Kirill :) > > v3 changes: > * Squash kernel-parameters.txt update into the commit that added the > command line option. > * Make uncached mode explicitly selectable by architectures. For now enable > it only on x86. > > v2 changes: > * Follow Michael's suggestion and name the new system call 'memfd_secret' > * Add kernel-parameters documentation about the boot option > * Fix i386-tinyconfig regression reported by the kbuild bot. > CONFIG_SECRETMEM now depends on !EMBEDDED to disable it on small systems > from one side and still make it available unconditionally on > architectures that support SET_DIRECT_MAP. 
> > > The file descriptor backing secret memory mappings is created using a > dedicated memfd_secret system call The desired protection mode for the > memory is configured using flags parameter of the system call. The mmap() > of the file descriptor created with memfd_secret() will create a "secret" > memory mapping. The pages in that mapping will be marked as not present in > the direct map and will have desired protection bits set in the user page > table. For instance, current implementation allows uncached mappings. > > Although normally Linux userspace mappings are protected from other users, > such secret mappings are useful for environments where a hostile tenant is > trying to trick the kernel into giving them access to other tenants > mappings. > > Additionally, the secret mappings may be used as a mean to protect guest > memory in a virtual machine host. > > For demonstration of secret memory usage we've created a userspace library > [1] that does two things: the first is act as a preloader for openssl to > redirect all the OPENSSL_malloc calls to secret memory meaning any secret > keys get automatically protected this way and the other thing it does is > expose the API to the user who needs it. We anticipate that a lot of the > use cases would be like the openssl one: many toolkits that deal with > secret keys already have special handling for the memory to try to give > them greater protection, so this would simply be pluggable into the > toolkits without any need for user application modification. > > I've hesitated whether to continue to use new flags to memfd_create() or to > add a new system call and I've decided to use a new system call after I've > started to look into man pages update. There would have been two completely > independent descriptions and I think it would have been very confusing. 
> > Hiding secret memory mappings behind an anonymous file allows (ab)use of > the page cache for tracking pages allocated for the "secret" mappings as > well as using address_space_operations for e.g. page migration callbacks. > > The anonymous file may be also used implicitly, like hugetlb files, to > implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm > ABIs in the future. > > As the fragmentation of the direct map was one of the major concerns raised > during the previous postings, I've added an amortizing cache of PMD-size > pages to each file descriptor and an ability to reserve large chunks of the > physical memory at boot time and then use this memory as an allocation pool > for the secret memory areas. > > v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org > v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org > v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org/ > rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/ > rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/ > > Mike Rapoport (6): > mm: add definition of PMD_PAGE_ORDER > mmap: make mlock_future_check() global > mm: introduce memfd_secret system call to create "secret" memory areas > arch, mm: wire up memfd_secret system call were relevant > mm: secretmem: use PMD-size pages to amortize direct map fragmentation > mm: secretmem: add ability to reserve memory at boot > > arch/Kconfig | 7 + > arch/arm64/include/asm/unistd.h | 2 +- > arch/arm64/include/asm/unistd32.h | 2 + > arch/arm64/include/uapi/asm/unistd.h | 1 + > arch/riscv/include/asm/unistd.h | 1 + > arch/x86/Kconfig | 1 + > arch/x86/entry/syscalls/syscall_32.tbl | 1 + > arch/x86/entry/syscalls/syscall_64.tbl | 1 + > fs/dax.c | 11 +- > include/linux/pgtable.h | 3 + > include/linux/syscalls.h | 1 + > include/uapi/asm-generic/unistd.h | 7 +- > include/uapi/linux/magic.h | 1 + > include/uapi/linux/secretmem.h | 8 + 
> kernel/sys_ni.c | 2 + > mm/Kconfig | 4 + > mm/Makefile | 1 + > mm/internal.h | 3 + > mm/mmap.c | 5 +- > mm/secretmem.c | 451 +++++++++++++++++++++++++ > 20 files changed, 501 insertions(+), 12 deletions(-) > create mode 100644 include/uapi/linux/secretmem.h > create mode 100644 mm/secretmem.c > > -- > 2.26.2 > -- Sincerely yours, Mike. _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47043C433E1 for ; Wed, 26 Aug 2020 11:03:08 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1218B20838 for ; Wed, 26 Aug 2020 11:03:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="fV2ZdnYZ"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="kiv8zgwJ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1218B20838 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; 
h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References:Message-ID: Subject:To:From:Date:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=laZXsmlAJbaz8ZNTVVkpcRaH0gvahw4ymk09q7Dn62M=; b=fV2ZdnYZZVlMspr5IHjEEpwGl ypNfxzBAYlnXJJPEXKdnsDthENAkbSH4GOli6sJYtMxN3IDUb0cDBjHqxIfA05iLxePOcYXtutuqV z+iT85TTL1NyiW88bvHrY/lB/Hb1QyEw9xNNtCcX/CDQ6A/Hzj49w+T5BdXj2ca/DHRgN4K1uRxTu qRBh4MJ8STb6wokHRhCXepWLjujTCxu5E66tmfTHdi9fK9qCMjXX3PI9vvPsLWWrqohoaRv1sQndJ QhZVnlGfYrAiB4KLw9bkrN+ziSkxE/q0s4e+tuTs7rVZNu8W4ry0U6NVIPcjvf0NtEXVPRXxJFm3M LcMauAMWg==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1kAtBg-0002FL-CV; Wed, 26 Aug 2020 11:01:52 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1kAtBd-0002Dg-JQ; Wed, 26 Aug 2020 11:01:50 +0000 Received: from kernel.org (unknown [87.70.91.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 973122083B; Wed, 26 Aug 2020 11:01:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1598439708; bh=dZBupqK1ICb+D03LduZMWk3v5MqcNKkHFnB2Fn2U8ew=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=kiv8zgwJI7uG2FhY/e8QqDG2hOoqnYuNOROcQg+SjF8hvfVdRzG1wHOGn6+X30FDO ZWWHCGgtt+W1/ww34rQ+2mr81fCzhvdmP2nrQmcqhUPDEJoNN6IRnAO8LrR6+rqjFf kjaCY34RDfsI87WbIfQ9Ghhg2byDUnNBjxwF7HUM= Date: Wed, 26 Aug 2020 14:01:36 +0300 From: Mike Rapoport To: Andrew Morton Subject: Re: [PATCH v4 0/6] mm: introduce memfd_secret system call to create "secret" memory areas Message-ID: <20200826110136.GA69706@kernel.org> References: <20200818141554.13945-1-rppt@kernel.org> MIME-Version: 1.0 
Content-Disposition: inline In-Reply-To: <20200818141554.13945-1-rppt@kernel.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200826_070149_782864_FC82FB67 X-CRM114-Status: GOOD ( 37.85 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , Peter Zijlstra , Catalin Marinas , Dave Hansen , linux-mm@kvack.org, "H. Peter Anvin" , Christopher Lameter , Idan Yaniv , Dan Williams , Elena Reshetova , linux-arch@vger.kernel.org, Tycho Andersen , linux-nvdimm@lists.01.org, Will Deacon , x86@kernel.org, Matthew Wilcox , Mike Rapoport , Ingo Molnar , Michael Kerrisk , Arnd Bergmann , James Bottomley , Borislav Petkov , Alexander Viro , Andy Lutomirski , Paul Walmsley , "Kirill A. Shutemov" , Thomas Gleixner , linux-arm-kernel@lists.infradead.org, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Palmer Dabbelt , linux-fsdevel@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Any comments on this? On Tue, Aug 18, 2020 at 05:15:48PM +0300, Mike Rapoport wrote: > From: Mike Rapoport > > Hi, > > This is an implementation of "secret" mappings backed by a file descriptor. > > v4 changes: > * rebase on v5.9-rc1 > * Do not redefine PMD_PAGE_ORDER in fs/dax.c, thanks Kirill > * Make secret mappings exclusive by default and only require flags to > memfd_secret() system call for uncached mappings, thanks again Kirill :) > > v3 changes: > * Squash kernel-parameters.txt update into the commit that added the > command line option. > * Make uncached mode explicitly selectable by architectures. For now enable > it only on x86. 
>
> v2 changes:
> * Follow Michael's suggestion and name the new system call 'memfd_secret'
> * Add kernel-parameters documentation about the boot option
> * Fix i386-tinyconfig regression reported by the kbuild bot.
>   CONFIG_SECRETMEM now depends on !EMBEDDED to disable it on small systems
>   while still making it available unconditionally on architectures that
>   support SET_DIRECT_MAP.
>
>
> The file descriptor backing secret memory mappings is created using a
> dedicated memfd_secret system call. The desired protection mode for the
> memory is configured using the flags parameter of the system call. The mmap()
> of the file descriptor created with memfd_secret() will create a "secret"
> memory mapping. The pages in that mapping will be marked as not present in
> the direct map and will have the desired protection bits set in the user page
> table. For instance, the current implementation allows uncached mappings.
>
> Although normally Linux userspace mappings are protected from other users,
> such secret mappings are useful for environments where a hostile tenant is
> trying to trick the kernel into giving them access to other tenants'
> mappings.
>
> Additionally, the secret mappings may be used as a means to protect guest
> memory in a virtual machine host.
>
> For demonstration of secret memory usage we've created a userspace library
> [1] that does two things: first, it acts as a preloader for openssl that
> redirects all the OPENSSL_malloc calls to secret memory, so any secret
> keys get protected automatically; second, it exposes the API to the user
> who needs it. We anticipate that a lot of the use cases would be like the
> openssl one: many toolkits that deal with secret keys already have special
> handling for the memory to try to give them greater protection, so this
> would simply be pluggable into the toolkits without any need for user
> application modification.
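
The memfd_secret() plus mmap() flow described in the cover letter could look roughly like the userspace sketch below. It is only an illustration under several assumptions: the helper names (memfd_secret_fd, secret_map, secret_unmap) are hypothetical, the syscall is reached through syscall(2) and only when the installed headers define SYS_memfd_secret, the region is sized with ftruncate(), and a flags value of 0 is taken to mean the default exclusive mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: go through syscall(2) because libc may not
 * provide memfd_secret(). If the headers do not know the syscall
 * number, fail with ENOSYS instead of guessing one.
 */
static int memfd_secret_fd(unsigned int flags)
{
#ifdef SYS_memfd_secret
	return (int)syscall(SYS_memfd_secret, flags);
#else
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Hypothetical helper: map len bytes of "secret" memory.
 * Returns the mapping, or NULL with errno set on any failure
 * (syscall unavailable, feature disabled, RLIMIT_MEMLOCK hit, ...).
 */
static void *secret_map(size_t len)
{
	assert(len > 0);

	/* Assumption: 0 requests the default (exclusive) protection mode. */
	int fd = memfd_secret_fd(0);
	if (fd < 0)
		return NULL;

	void *p = MAP_FAILED;
	if (ftruncate(fd, (off_t)len) == 0)
		p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* The mapping, if any, stays valid after the descriptor is closed. */
	int saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a mapping obtained from secret_map(). */
static void secret_unmap(void *p, size_t len)
{
	munmap(p, len);
}
```

A caller would treat a NULL return as "secret memory is not available here" and fall back to ordinary memory, since the syscall may be missing or disabled on a given kernel.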
>
> I hesitated over whether to continue using new flags to memfd_create() or to
> add a new system call, and I decided to use a new system call after I
> started to look into the man pages update. There would have been two
> completely independent descriptions and I think it would have been very
> confusing.
>
> Hiding secret memory mappings behind an anonymous file allows (ab)use of
> the page cache for tracking pages allocated for the "secret" mappings as
> well as using address_space_operations for e.g. page migration callbacks.
>
> The anonymous file may also be used implicitly, like hugetlb files, to
> implement mmap(MAP_SECRET) and use the secret memory areas with "native" mm
> ABIs in the future.
>
> As the fragmentation of the direct map was one of the major concerns raised
> during the previous postings, I've added an amortizing cache of PMD-size
> pages to each file descriptor and an ability to reserve large chunks of the
> physical memory at boot time and then use this memory as an allocation pool
> for the secret memory areas.
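
The amortization idea in the last paragraph can be illustrated with a toy userspace model. This is not the kernel code from the series; it only shows why handing out small pages from one large chunk reduces the number of large-block operations. The names (page_pool, pool_alloc_page) are hypothetical, and the sizes assume 4 KiB base pages and a 2 MiB PMD, as on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE_BYTES  4096UL                  /* assumed base page size */
#define CHUNK_SIZE_BYTES (2UL * 1024 * 1024)     /* assumed PMD size */
#define PAGES_PER_CHUNK  (CHUNK_SIZE_BYTES / PAGE_SIZE_BYTES)

/* Toy per-descriptor cache: one large chunk carved into small pages. */
struct page_pool {
	unsigned char *chunk;  /* current large chunk, NULL before first refill */
	size_t next;           /* index of the next unused page in the chunk */
	size_t refills;        /* how many large chunks were taken so far */
};

/*
 * Hand out one page. A new large chunk is taken only when the current
 * one is exhausted, so the expensive step (in the kernel analogy,
 * splitting a large direct-map entry) happens once per PAGES_PER_CHUNK
 * allocations rather than once per page. Old chunks are never freed
 * in this sketch.
 */
static void *pool_alloc_page(struct page_pool *pool)
{
	assert(pool != NULL);

	if (pool->chunk == NULL || pool->next == PAGES_PER_CHUNK) {
		unsigned char *chunk =
			aligned_alloc(PAGE_SIZE_BYTES, CHUNK_SIZE_BYTES);
		if (chunk == NULL)
			return NULL;
		pool->chunk = chunk;
		pool->next = 0;
		pool->refills++;
	}

	return pool->chunk + PAGE_SIZE_BYTES * pool->next++;
}
```

With these assumed sizes, 512 consecutive page allocations are served from a single chunk, and only the 513th triggers a second refill.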
>
> v3: https://lore.kernel.org/lkml/20200804095035.18778-1-rppt@kernel.org
> v2: https://lore.kernel.org/lkml/20200727162935.31714-1-rppt@kernel.org
> v1: https://lore.kernel.org/lkml/20200720092435.17469-1-rppt@kernel.org/
> rfc-v2: https://lore.kernel.org/lkml/20200706172051.19465-1-rppt@kernel.org/
> rfc-v1: https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
>
> Mike Rapoport (6):
>   mm: add definition of PMD_PAGE_ORDER
>   mmap: make mlock_future_check() global
>   mm: introduce memfd_secret system call to create "secret" memory areas
>   arch, mm: wire up memfd_secret system call were relevant
>   mm: secretmem: use PMD-size pages to amortize direct map fragmentation
>   mm: secretmem: add ability to reserve memory at boot
>
>  arch/Kconfig                           |   7 +
>  arch/arm64/include/asm/unistd.h        |   2 +-
>  arch/arm64/include/asm/unistd32.h      |   2 +
>  arch/arm64/include/uapi/asm/unistd.h   |   1 +
>  arch/riscv/include/asm/unistd.h        |   1 +
>  arch/x86/Kconfig                       |   1 +
>  arch/x86/entry/syscalls/syscall_32.tbl |   1 +
>  arch/x86/entry/syscalls/syscall_64.tbl |   1 +
>  fs/dax.c                               |  11 +-
>  include/linux/pgtable.h                |   3 +
>  include/linux/syscalls.h               |   1 +
>  include/uapi/asm-generic/unistd.h      |   7 +-
>  include/uapi/linux/magic.h             |   1 +
>  include/uapi/linux/secretmem.h         |   8 +
>  kernel/sys_ni.c                        |   2 +
>  mm/Kconfig                             |   4 +
>  mm/Makefile                            |   1 +
>  mm/internal.h                          |   3 +
>  mm/mmap.c                              |   5 +-
>  mm/secretmem.c                         | 451 +++++++++++++++++++++++++
>  20 files changed, 501 insertions(+), 12 deletions(-)
>  create mode 100644 include/uapi/linux/secretmem.h
>  create mode 100644 mm/secretmem.c
>
> --
> 2.26.2
>

--
Sincerely yours,
Mike.