From mboxrd@z Thu Jan 1 00:00:00 1970 From: Catalin Marinas Subject: Re: [PATCH v2 3/7] mm: introduce memfd_secret system call to create "secret" memory areas Date: Fri, 31 Jul 2020 17:22:34 +0100 Message-ID: <20200731162234.GF29569@gaia> References: <20200727162935.31714-1-rppt@kernel.org> <20200727162935.31714-4-rppt@kernel.org> <20200730162209.GB3128@gaia> <20200731142905.GA67415@C02TD0UTHF1T.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Content-Disposition: inline In-Reply-To: <20200731142905.GA67415-NxAhncQ8SJbILnBEAk/BfazUEOm+Xw19@public.gmane.org> Sender: linux-api-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org To: Mark Rutland Cc: Mike Rapoport , linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Alexander Viro , Andrew Morton , Andy Lutomirski , Arnd Bergmann , Borislav Petkov , Christopher Lameter , Dan Williams , Dave Hansen , Elena Reshetova , "H. Peter Anvin" , Idan Yaniv , Ingo Molnar , James Bottomley , "Kirill A. Shutemov" , Matthew Wilcox , Mike Rapoport , Michael Kerrisk , Palmer Dabbelt , Paul List-Id: linux-arch.vger.kernel.org On Fri, Jul 31, 2020 at 03:29:05PM +0100, Mark Rutland wrote: > On Thu, Jul 30, 2020 at 05:22:10PM +0100, Catalin Marinas wrote: > > On Mon, Jul 27, 2020 at 07:29:31PM +0300, Mike Rapoport wrote: > > > +static int secretmem_mmap(struct file *file, struct vm_area_struct *vma) > > > +{ > > > + struct secretmem_ctx *ctx = file->private_data; > > > + unsigned long mode = ctx->mode; > > > + unsigned long len = vma->vm_end - vma->vm_start; > > > + > > > + if (!mode) > > > + return -EINVAL; > > > + > > > + if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0) > > > + return -EINVAL; > > > + > > > + if (mlock_future_check(vma->vm_mm, vma->vm_flags | VM_LOCKED, len)) > > > + return -EAGAIN; > > > + > > > + switch (mode) { > > > + case SECRETMEM_UNCACHED: > > > + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > > > + fallthrough; > > > + case SECRETMEM_EXCLUSIVE: > > > + vma->vm_ops = &secretmem_vm_ops; > > > + break; > > > + default: > > > + return -EINVAL; > > > + } > > > + > > > + vma->vm_flags |= VM_LOCKED; > > > + > > > + return 0; > > > +} > > > > I think the uncached mapping is not the right thing for arm/arm64. First > > of all, pgprot_noncached() gives us Strongly Ordered (Device memory) > > semantics together with not allowing unaligned accesses. I suspect the > > semantics are different on x86. > > > The second, more serious problem, is that I can't find any place where > > the caches are flushed for the page mapped on fault. When a page is > > allocated, assuming GFP_ZERO, only the caches are guaranteed to be > > zeroed. Exposing this subsequently to user space as uncached would allow > > the user to read stale data prior to zeroing. The arm64 > > set_direct_map_default_noflush() doesn't do any cache maintenance. > > It's also worth noting that in a virtual machine this is liable to be > either broken (with a potential loss of coherency if the host has a > cacheable alias as existing KVM hosts have), or pointless (if the host > uses S2FWB to upgrade Stage-1 attribues to cacheable as existing KVM > hosts also have). > > I think that trying to avoid the data caches creates many more problems > than it solves, and I don't think there's a strong justification for > trying to support that on arm64 to begin with, so I'd rather entirely > opt-out on supporting SECRETMEM_UNCACHED. Good point, I forgot the virtualisation aspect. So unless there is a hypervisor API to unmap it from the host memory, the uncached option isn't of much use on arm64. -- Catalin