From: Sean Christopherson
To: Peter Xu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador,
 Jason Gunthorpe, Axel Rasmussen, linux-arm-kernel@lists.infradead.org,
 x86@kernel.org, Will Deacon, Gavin Shan, Paolo Bonzini, Zi Yan,
 Andrew Morton, Catalin Marinas, Ingo Molnar, Alistair Popple,
 Borislav Petkov, David Hildenbrand, Thomas Gleixner, kvm@vger.kernel.org,
 Dave Hansen, Alex Williamson, Yan Zhao
Date: Fri, 16 Aug 2024 16:12:24 -0700
Subject: Re: [PATCH 09/19] mm: New follow_pfnmap API
In-Reply-To: <20240809160909.1023470-10-peterx@redhat.com>
References: <20240809160909.1023470-1-peterx@redhat.com>
 <20240809160909.1023470-10-peterx@redhat.com>

On Fri, Aug 09, 2024, Peter Xu wrote:
> Introduce a pair of APIs to follow pfn mappings to get entry information.
> It's very similar to what follow_pte() does before, but different in that
> it recognizes huge pfn mappings.

...

> +int follow_pfnmap_start(struct follow_pfnmap_args *args);
> +void follow_pfnmap_end(struct follow_pfnmap_args *args);

I find the start+end() terminology to be unintuitive.  E.g. I had to look at the
implementation to understand why KVM invokes fixup_user_fault() if
follow_pfnmap_start() fails.

What about follow_pfnmap_and_lock()?  And then maybe follow_pfnmap_unlock()?
Though that second one reads a little weird.

> + * Return: zero on success, -ve otherwise.

ve?

> +int follow_pfnmap_start(struct follow_pfnmap_args *args)
> +{
> +	struct vm_area_struct *vma = args->vma;
> +	unsigned long address = args->address;
> +	struct mm_struct *mm = vma->vm_mm;
> +	spinlock_t *lock;
> +	pgd_t *pgdp;
> +	p4d_t *p4dp, p4d;
> +	pud_t *pudp, pud;
> +	pmd_t *pmdp, pmd;
> +	pte_t *ptep, pte;
> +
> +	pfnmap_lockdep_assert(vma);
> +
> +	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
> +		goto out;
> +
> +	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> +		goto out;

Why use goto instead of simply

	return -EINVAL;

?

That's relevant because I think the cases where no PxE is found should return
-ENOENT, not -EINVAL.  E.g. if the caller doesn't precheck, then it can bail
immediately on -EINVAL, but know that it's worth trying to fault-in the pfn on
-ENOENT.

> +retry:
> +	pgdp = pgd_offset(mm, address);
> +	if (pgd_none(*pgdp) || unlikely(pgd_bad(*pgdp)))
> +		goto out;
> +
> +	p4dp = p4d_offset(pgdp, address);
> +	p4d = READ_ONCE(*p4dp);
> +	if (p4d_none(p4d) || unlikely(p4d_bad(p4d)))
> +		goto out;
> +
> +	pudp = pud_offset(p4dp, address);
> +	pud = READ_ONCE(*pudp);
> +	if (pud_none(pud))
> +		goto out;
> +	if (pud_leaf(pud)) {
> +		lock = pud_lock(mm, pudp);
> +		if (!unlikely(pud_leaf(pud))) {
> +			spin_unlock(lock);
> +			goto retry;
> +		}
> +		pfnmap_args_setup(args, lock, NULL, pud_pgprot(pud),
> +				  pud_pfn(pud), PUD_MASK, pud_write(pud),
> +				  pud_special(pud));
> +		return 0;
> +	}
> +
> +	pmdp = pmd_offset(pudp, address);
> +	pmd = pmdp_get_lockless(pmdp);
> +	if (pmd_leaf(pmd)) {
> +		lock = pmd_lock(mm, pmdp);
> +		if (!unlikely(pmd_leaf(pmd))) {
> +			spin_unlock(lock);
> +			goto retry;
> +		}
> +		pfnmap_args_setup(args, lock, NULL, pmd_pgprot(pmd),
> +				  pmd_pfn(pmd), PMD_MASK, pmd_write(pmd),
> +				  pmd_special(pmd));
> +		return 0;
> +	}
> +
> +	ptep = pte_offset_map_lock(mm, pmdp, address, &lock);
> +	if (!ptep)
> +		goto out;
> +	pte = ptep_get(ptep);
> +	if (!pte_present(pte))
> +		goto unlock;
> +	pfnmap_args_setup(args, lock, ptep, pte_pgprot(pte),
> +			  pte_pfn(pte), PAGE_MASK, pte_write(pte),
> +			  pte_special(pte));
> +	return 0;
> +unlock:
> +	pte_unmap_unlock(ptep, lock);
> +out:
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL_GPL(follow_pfnmap_start);
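
To make the -EINVAL vs. -ENOENT point concrete, below is a minimal userspace
sketch of the caller-side flow that split would allow.  It is not kernel code
and not from the patch: every mock_* name, the mock_state enum, and the pfn
value are hypothetical stand-ins, and mock_fault_in() only plays the role that
fixup_user_fault() would play in a real caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
enum mock_state { MOCK_NOT_PFNMAP, MOCK_NOT_MAPPED, MOCK_MAPPED };

struct mock_args {
	enum mock_state state;	/* simulated state of the mapping */
	unsigned long pfn;	/* output, only meaningful while "locked" */
	bool locked;		/* simulated page-table lock */
};

/* Assumed error split: -EINVAL for a bad request, -ENOENT for no entry. */
static int mock_follow_pfnmap_start(struct mock_args *args)
{
	if (args->state == MOCK_NOT_PFNMAP)
		return -EINVAL;	/* retrying cannot help */
	if (args->state == MOCK_NOT_MAPPED)
		return -ENOENT;	/* faulting in may populate the entry */

	args->pfn = 0x1234;	/* arbitrary illustrative value */
	args->locked = true;	/* success returns with the lock held */
	return 0;
}

/* Must only be called after a successful start. */
static void mock_follow_pfnmap_end(struct mock_args *args)
{
	assert(args->locked);
	args->locked = false;
}

/* Stand-in for a fault-in helper: pretend the fault populated the entry. */
static int mock_fault_in(struct mock_args *args)
{
	args->state = MOCK_MAPPED;
	return 0;
}

/* Caller: bail immediately on -EINVAL, fault in and retry once on -ENOENT. */
static int mock_lookup_pfn(struct mock_args *args, unsigned long *pfn)
{
	int r = mock_follow_pfnmap_start(args);

	if (r == -ENOENT) {
		r = mock_fault_in(args);
		if (r)
			return r;
		r = mock_follow_pfnmap_start(args);
	}
	if (r)
		return r;

	*pfn = args->pfn;	/* copy out before dropping the lock */
	mock_follow_pfnmap_end(args);
	return 0;
}
```

With a single -EINVAL for every failure, a caller like mock_lookup_pfn() either
has to precheck the VMA itself or attempt the fault-in even when it cannot
help.  The sketch also shows why the "start" call reads like a lock
acquisition: the output is only stable until the matching "end" call.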