Date: Fri, 24 Feb 2023 12:36:00 +0300
From: kirill.shutemov@linux.intel.com
To: Ackerley Tng
Cc: "Kirill A. Shutemov", kvm@vger.kernel.org, linux-api@vger.kernel.org,
    linux-arch@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, qemu-devel@nongnu.org, chao.p.peng@linux.intel.com,
    aarcange@redhat.com, ak@linux.intel.com, akpm@linux-foundation.org,
    arnd@arndb.de, bfields@fieldses.org, bp@alien8.de, corbet@lwn.net,
    dave.hansen@intel.com, david@redhat.com, ddutile@redhat.com,
    dhildenb@redhat.com, hpa@zytor.com, hughd@google.com, jlayton@kernel.org,
    jmattson@google.com, joro@8bytes.org, jun.nakajima@intel.com,
    linmiaohe@huawei.com, luto@kernel.org, mail@maciej.szmigiero.name,
    mhocko@suse.com, michael.roth@amd.com, mingo@redhat.com,
    naoya.horiguchi@nec.com, pbonzini@redhat.com, qperret@google.com,
    rppt@kernel.org, seanjc@google.com, shuah@kernel.org,
    steven.price@arm.com, tabba@google.com, tglx@linutronix.de,
    vannapurve@google.com, vbabka@suse.cz, vkuznets@redhat.com,
    wanpengli@tencent.com, wei.w.wang@intel.com, x86@kernel.org,
    yu.c.zhang@linux.intel.com
Subject: Re: [RFC PATCH 1/2] mm: restrictedmem: Allow userspace to specify
    mount_path for memfd_restricted
Message-ID: <20230224093600.osmbpilmsi64wlwb@box.shutemov.name>
References: <20230216100150.yv2ehwrdcfzbdhcq@box.shutemov.name>

On Thu, Feb 23, 2023 at 12:55:16AM +0000, Ackerley Tng wrote:
>
> "Kirill A. Shutemov" writes:
>
> > On Thu, Feb 16, 2023 at 12:41:16AM +0000, Ackerley Tng wrote:
> > > By default, the backing shmem file for a restrictedmem fd is created
> > > on shmem's kernel space mount.
>
> > > With this patch, an optional tmpfs mount can be specified, which will
> > > be used as the mountpoint for backing the shmem file associated with a
> > > restrictedmem fd.
>
> > > This change is modeled after how sys_open() can create an unnamed
> > > temporary file in a given directory with O_TMPFILE.
>
> > > This will help restrictedmem fds inherit the properties of the
> > > provided tmpfs mounts, for example, hugepage allocation hints, NUMA
> > > binding hints, etc.
>
> > > Signed-off-by: Ackerley Tng
> > > ---
> > >   include/linux/syscalls.h           |  2 +-
> > >   include/uapi/linux/restrictedmem.h |  8 ++++
> > >   mm/restrictedmem.c                 | 63 +++++++++++++++++++++++++++---
> > >   3 files changed, 66 insertions(+), 7 deletions(-)
> > >   create mode 100644 include/uapi/linux/restrictedmem.h
>
> > > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> > > index f9e9e0c820c5..4b8efe9a8680 100644
> > > --- a/include/linux/syscalls.h
> > > +++ b/include/linux/syscalls.h
> > > @@ -1056,7 +1056,7 @@ asmlinkage long sys_memfd_secret(unsigned int
> > > flags);
> > >   asmlinkage long sys_set_mempolicy_home_node(unsigned long start,
> > > unsigned long len,
> > >   					    unsigned long home_node,
> > >   					    unsigned long flags);
> > > -asmlinkage long sys_memfd_restricted(unsigned int flags);
> > > +asmlinkage long sys_memfd_restricted(unsigned int flags, const char
> > > __user *mount_path);
>
> > >   /*
> > >    * Architecture-specific system calls
>
> > I'm not sure what the right practice now: do we provide string that
> > contains mount path or fd that represents the filesystem (returned from
> > fsmount(2) or open_tree(2)).
>
> > fd seems more flexible: it allows to specify unbind mounts.
>
> I tried out the suggestion of passing fds to memfd_restricted() instead
> of strings.
>
> One benefit I see of using fds is interface uniformity: it feels more
> aligned with other syscalls like fsopen(), fsconfig(), and fsmount() in
> terms of using and passing around fds.
>
> Other than being able to use a mount without a path attached to the
> mount, are there any other benefits of using fds over using the path string?

It would be nice if someone from the fs folks could comment on this.

> Should I post the patches that allows specifying a mount using fds?
> Should I post them as a separate RFC, or as a new revision to this RFC?

Let's first decide what the right direction is.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
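
For readers following the path-vs-fd question, here is a minimal userspace
sketch of how "an fd that represents the filesystem" might be obtained with
fsopen(2), fsconfig(2) and fsmount(2). The helper name
tmpfs_detached_mount_fd() is hypothetical and not part of the patch. The
sketch assumes kernel headers from 5.2 or later, uses raw syscall(2) because
older glibc has no wrappers, and needs mount privileges (CAP_SYS_ADMIN) at
run time. It deliberately stops before calling memfd_restricted(), since the
fd-taking variant is only a proposal in this thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/mount.h>   /* FSOPEN_CLOEXEC, FSCONFIG_*, FSMOUNT_CLOEXEC */
#include <sys/syscall.h>   /* SYS_fsopen, SYS_fsconfig, SYS_fsmount */
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not from the patch):
 * create a tmpfs instance that is not attached anywhere in the
 * mount tree and return an fd referring to it.
 *
 * @huge: optional value for the tmpfs "huge=" option (for example
 *        "always"), or NULL to leave the default.
 *
 * Returns the mount fd, or -1 with errno set.
 */
int tmpfs_detached_mount_fd(const char *huge)
{
	int fsfd, mntfd, saved;

	/* Open a filesystem configuration context for tmpfs. */
	fsfd = syscall(SYS_fsopen, "tmpfs", FSOPEN_CLOEXEC);
	if (fsfd < 0)
		return -1;

	/* Optionally set the hugepage allocation hint. */
	if (huge && syscall(SYS_fsconfig, fsfd, FSCONFIG_SET_STRING,
			    "huge", huge, 0) < 0)
		goto err;

	/* Instantiate the superblock from the configured context. */
	if (syscall(SYS_fsconfig, fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0)
		goto err;

	/* Turn it into a detached mount; no path is involved. */
	mntfd = syscall(SYS_fsmount, fsfd, FSMOUNT_CLOEXEC, 0);
	saved = errno;
	close(fsfd);
	errno = saved;
	return mntfd;

err:
	saved = errno;
	close(fsfd);
	errno = saved;
	return -1;
}
```

The resulting fd names a mount that has no path at all, which is the case a
mount_path string cannot express. An already mounted tmpfs could instead be
referenced through open_tree(2) on its path.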