Date: Thu, 21 Mar 2024 18:46:38 -0400
From: Peter Xu
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, Heiko Carstens, Vasily Gorbik, Andrew Morton, Alexander Gordeev, Sven Schnelle, Gerald Schaefer, Andrea Arcangeli, kvm@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [PATCH v1 1/2] mm/userfaultfd: don't place zeropages when zeropages are disallowed
References: <20240321215954.177730-1-david@redhat.com> <20240321215954.177730-2-david@redhat.com> <48d1282c-e4db-4b55-ab3f-3344af2440c4@redhat.com>
In-Reply-To: <48d1282c-e4db-4b55-ab3f-3344af2440c4@redhat.com>

On Thu, Mar 21, 2024 at 11:29:45PM +0100, David Hildenbrand wrote:
> On 21.03.24 23:20, Peter Xu wrote:
> > On Thu, Mar 21, 2024 at 10:59:53PM +0100, David Hildenbrand wrote:
> > > s390x must disable shared zeropages for processes running VMs, because
> > > the VMs could end up making use of "storage keys" or protected
> > > virtualization, which are incompatible with shared zeropages.
> > >
> > > Yet, with userfaultfd it is possible to insert shared zeropages into
> > > such processes. Let's fallback to simply allocating a fresh zeroed
> > > anonymous folio and insert that instead.
> > >
> > > mm_forbids_zeropage() was introduced in commit 593befa6ab74 ("mm: introduce
> > > mm_forbids_zeropage function"), briefly before userfaultfd went
> > > upstream.
> > >
> > > Note that we don't want to fail the UFFDIO_ZEROPAGE request like we do
> > > for hugetlb, it would be rather unexpected. Further, we also
> > > cannot really indicated "not supported" to user space ahead of time: it
> > > could be that the MM disallows zeropages after userfaultfd was already
> > > registered.
> > >
> > > Fixes: c1a4de99fada ("userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation")
> > > Signed-off-by: David Hildenbrand
> >
> > Reviewed-by: Peter Xu
> >
> > Still, a few comments below.
> >
> > > ---
> > >  mm/userfaultfd.c | 35 +++++++++++++++++++++++++++++++++++
> > >  1 file changed, 35 insertions(+)
> > >
> > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > > index 712160cd41eca..1d1061ccd1dea 100644
> > > --- a/mm/userfaultfd.c
> > > +++ b/mm/userfaultfd.c
> > > @@ -316,6 +316,38 @@ static int mfill_atomic_pte_copy(pmd_t *dst_pmd,
> > >  	goto out;
> > >  }
> > > +static int mfill_atomic_pte_zeroed_folio(pmd_t *dst_pmd,
> > > +		struct vm_area_struct *dst_vma, unsigned long dst_addr)
> > > +{
> > > +	struct folio *folio;
> > > +	int ret;
> >
> > nitpick: we can set -ENOMEM here, then
> >
> > > +
> > > +	folio = vma_alloc_zeroed_movable_folio(dst_vma, dst_addr);
> > > +	if (!folio)
> > > +		return -ENOMEM;
> >
> > return ret;
> >
> > > +
> > > +	ret = -ENOMEM;
> >
> > drop.
>
> Sure!
>
> >
> > > +	if (mem_cgroup_charge(folio, dst_vma->vm_mm, GFP_KERNEL))
> > > +		goto out_put;
> > > +
> > > +	/*
> > > +	 * The memory barrier inside __folio_mark_uptodate makes sure that
> > > +	 * preceding stores to the page contents become visible before
> > > +	 * the set_pte_at() write.
> > > +	 */
> >
> > This comment doesn't apply. We can drop it.
> >
>
> I thought the same until I spotted that comment (where uffd originally
> copied this from I strongly assume) in do_anonymous_page().
>
> "Preceding stores" here are: zeroing out the memory.

Ah.. that's okay then.

Considering that userfault has always been pretty cautious about such
ordering, since it is special in involving many user updates to the page,
would you mind spelling those details out?

	/*
	 * __folio_mark_uptodate contains the memory barrier to make sure
	 * the stores that zero the page will be visible before
	 * installing the pgtable entries.  See do_anonymous_page().
	 */

Or anything better than my wording.

Thanks!

-- 
Peter Xu
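
For readers who want to see the two review points in isolation, here is a minimal user-space sketch. It is not the kernel code and not the final patch. `struct slot`, `install_zeroed_page()` and `lookup_page()` are hypothetical names made up for illustration. A C11 release store stands in for the barrier in __folio_mark_uptodate, and a plain atomic pointer stands in for the page table entry. It shows (1) presetting `ret` to -ENOMEM once so the allocation failure path can simply `return ret`, and (2) zeroing the page before publishing it, so that anyone who observes the pointer also observes the zeroed contents.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a page table slot; not a kernel type. */
struct slot {
	_Atomic(unsigned char *) page;
};

/*
 * Allocate a zeroed buffer and publish it in @s.
 * Returns 0 on success, -ENOMEM if the allocation fails and
 * -EEXIST if the slot is already populated.
 */
static int install_zeroed_page(struct slot *s, size_t size)
{
	unsigned char *page, *expected = NULL;
	int ret = -ENOMEM;	/* preset once, as suggested in the review */

	assert(s != NULL && size > 0);

	page = malloc(size);
	if (!page)
		return ret;

	memset(page, 0, size);	/* the "preceding stores" */

	/*
	 * Release ordering on success: the zeroing above becomes visible
	 * before the pointer does. This plays the role that the barrier in
	 * __folio_mark_uptodate plays ahead of set_pte_at() in the kernel.
	 */
	ret = -EEXIST;
	if (!atomic_compare_exchange_strong_explicit(&s->page, &expected, page,
						     memory_order_release,
						     memory_order_relaxed))
		goto out_free;

	return 0;

out_free:
	free(page);
	return ret;
}

/* Reader side: the acquire load pairs with the release store above. */
static unsigned char *lookup_page(struct slot *s)
{
	return atomic_load_explicit(&s->page, memory_order_acquire);
}
```

A caller would initialise the slot with atomic_init(&s.page, NULL), call install_zeroed_page(&s, 4096), and then read through lookup_page(&s). The acquire/release pairing is an assumption of this sketch; the kernel relies on its own barrier primitives and page table locking rather than C11 atomics.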