Date: Mon, 8 Feb 2021 23:26:05 +0200
From: Mike Rapoport
To: Michal Hocko
Cc: Mark Rutland, David Hildenbrand, Peter Zijlstra, Catalin Marinas,
 Dave Hansen, linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
 "H. Peter Anvin", Christopher Lameter, Shuah Khan, Thomas Gleixner,
 Elena Reshetova, linux-arch@vger.kernel.org, Tycho Andersen,
 linux-nvdimm@lists.01.org, Will Deacon, x86@kernel.org, Matthew Wilcox,
 Mike Rapoport, Ingo Molnar, Michael Kerrisk, Palmer Dabbelt,
 Arnd Bergmann, James Bottomley, Hagen Paul Pfeifer, Borislav Petkov,
 Alexander Viro, Andy Lutomirski, Paul Walmsley, "Kirill A. Shutemov",
 Dan Williams, linux-arm-kernel@lists.infradead.org,
 linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 Shakeel Butt, Andrew Morton, Rick Edgecombe, Roman Gushchin
Subject: Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to
 create "secret" memory areas
Message-ID: <20210208212605.GX242749@kernel.org>
References: <20210208084920.2884-1-rppt@kernel.org>
 <20210208084920.2884-8-rppt@kernel.org>

On Mon, Feb 08, 2021 at 11:49:22AM +0100, Michal Hocko wrote:
> On Mon 08-02-21 10:49:17, Mike Rapoport wrote:
> > From: Mike Rapoport
> >
> > Introduce "memfd_secret" system call with the ability to create memory
> > areas visible only in the context of the owning process and not mapped not
> > only to other processes but in the kernel page tables as well.
> >
> > The secretmem feature is off by default and the user must explicitly enable
> > it at the boot time.
> >
> > Once secretmem is enabled, the user will be able to create a file
> > descriptor using the memfd_secret() system call. The memory areas created
> > by mmap() calls from this file descriptor will be unmapped from the kernel
> > direct map and they will be only mapped in the page table of the owning mm.
>
> Is this really true? I guess you meant to say that the memory will
> visible only via page tables to anybody who can mmap the respective file
> descriptor. There is nothing like an owning mm as the fd is inherently a
> shareable resource and the ownership becomes a very vague and hard to
> define term.

Hmm, it seems I've been dragging this paragraph from the very first
mmap(MAP_EXCLUSIVE) RFC and nobody (including myself) noticed the
inconsistency.

> > The file descriptor based memory has several advantages over the
> > "traditional" mm interfaces, such as mlock(), mprotect(), madvise(). It
> > paves the way for VMMs to remove the secret memory range from the process;
>
> I do not understand how it helps to remove the memory from the process
> as the interface explicitly allows to add a memory that is removed from
> all other processes via direct map.

The current implementation does not help to remove the memory from the
process, but fd-backed memory seems a better interface than mmap for
removing guest memory from host mappings. As Andy nicely put it:

"Getting fd-backed memory into a guest will take some possibly major work in
the kernel, but getting vma-backed memory into a guest without mapping it
in the host user address space seems much, much worse."

> > As secret memory implementation is not an extension of tmpfs or hugetlbfs,
> > usage of a dedicated system call rather than hooking new functionality into
> > memfd_create(2) emphasises that memfd_secret(2) has different semantics and
> > allows better upwards compatibility.
>
> What is this supposed to mean? What are differences?

Well, the phrasing could be better indeed. That is supposed to mean that
they differ in the semantics behind the file descriptor: memfd_create
implements sealing for shmem and hugetlbfs, while memfd_secret implements
memory hidden from the kernel.

> > The secretmem mappings are locked in memory so they cannot exceed
> > RLIMIT_MEMLOCK. Since these mappings are already locked an attempt to
> > mlock() secretmem range would fail and mlockall() will ignore secretmem
> > mappings.
>
> What about munlock?

Isn't this implied? ;-)
I'll add a sentence about it.

-- 
Sincerely yours,
Mike.
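
For readers following the thread, the usage described in the quoted
changelog (create a descriptor with memfd_secret(), size it, mmap() it)
can be sketched roughly as below. This is an illustration, not code from
the patch set. The assumptions are: the syscall number 447 (the value used
on x86-64, arm64 and riscv in mainline), the helper names memfd_secret()
and secret_roundtrip(), and the use of Python's ctypes to reach syscall(2)
because there is no library wrapper. The sketch returns None instead of
failing when the kernel does not offer secretmem, which per the changelog
is the default unless it is enabled at boot.

```python
import ctypes
import mmap
import os
import sys

# Assumption: memfd_secret is syscall 447 on x86-64, arm64 and riscv.
# It is hard-coded here because Python has no wrapper for it.
NR_MEMFD_SECRET = 447


def memfd_secret(flags=0):
    """Return a secretmem file descriptor, or None if the call is refused."""
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    fd = libc.syscall(ctypes.c_long(NR_MEMFD_SECRET), ctypes.c_uint(flags))
    if fd < 0:
        # Typically ENOSYS: secretmem not built in or not enabled at boot.
        # A sandbox may also reject the call with EPERM.
        return None
    return fd


def secret_roundtrip(payload=b"key material"):
    """Write payload into a secret mapping and read it back.

    Returns the bytes read back, or None when secretmem is unavailable
    or the mapping could not be set up.
    """
    fd = memfd_secret()
    if fd is None:
        return None
    try:
        size = mmap.PAGESIZE
        os.ftruncate(fd, size)  # give the area a size before mapping it
        with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as area:
            area[:len(payload)] = payload
            return bytes(area[:len(payload)])
    except (OSError, ValueError):
        # For example: the locked pages would exceed RLIMIT_MEMLOCK.
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    result = secret_roundtrip()
    print("secretmem unavailable" if result is None else result)
```

Any process that holds the descriptor can map it, which is Michal's point
above about there being no single "owning mm". The sketch deliberately
does not check the mlock() or munlock() behaviour discussed in the thread,
since that is exactly what the changelog wording is still settling.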