Date: Fri, 16 Aug 2024 16:12:24 -0700
In-Reply-To: <20240809160909.1023470-10-peterx@redhat.com>
References: <20240809160909.1023470-1-peterx@redhat.com> <20240809160909.1023470-10-peterx@redhat.com>
Subject: Re: [PATCH 09/19] mm: New follow_pfnmap API
From: Sean Christopherson
To: Peter Xu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Oscar Salvador, Jason Gunthorpe, Axel Rasmussen, linux-arm-kernel@lists.infradead.org, x86@kernel.org, Will Deacon, Gavin Shan, Paolo Bonzini, Zi Yan, Andrew Morton, Catalin Marinas, Ingo Molnar, Alistair Popple, Borislav Petkov, David Hildenbrand, Thomas Gleixner, kvm@vger.kernel.org, Dave Hansen, Alex Williamson, Yan Zhao

On Fri, Aug 09, 2024, Peter Xu wrote:
> Introduce a pair of APIs to follow pfn mappings to get entry information.
> It's very similar to what follow_pte() does before, but different in that
> it recognizes huge pfn mappings.

...

> +int follow_pfnmap_start(struct follow_pfnmap_args *args);
> +void follow_pfnmap_end(struct follow_pfnmap_args *args);

I find the start+end() terminology to be unintuitive.  E.g. I had to look at the
implementation to understand why KVM invokes fixup_user_fault() if
follow_pfnmap_start() fails.

What about follow_pfnmap_and_lock()?  And then maybe follow_pfnmap_unlock()?
Though that second one reads a little weird.

> + * Return: zero on success, -ve otherwise.

ve?

> +int follow_pfnmap_start(struct follow_pfnmap_args *args)
> +{
> +	struct vm_area_struct *vma = args->vma;
> +	unsigned long address = args->address;
> +	struct mm_struct *mm = vma->vm_mm;
> +	spinlock_t *lock;
> +	pgd_t *pgdp;
> +	p4d_t *p4dp, p4d;
> +	pud_t *pudp, pud;
> +	pmd_t *pmdp, pmd;
> +	pte_t *ptep, pte;
> +
> +	pfnmap_lockdep_assert(vma);
> +
> +	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
> +		goto out;
> +
> +	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
> +		goto out;

Why use goto instead of simply returning directly?

	return -EINVAL;

That's relevant because I think the cases where no PxE is found should return
-ENOENT, not -EINVAL.  E.g. if the caller doesn't precheck, then it can bail
immediately on EINVAL, but know that it's worth trying to fault-in the pfn on
ENOENT.

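To make the -ENOENT suggestion concrete, here is a minimal userspace sketch of
the caller-side logic.  It is not kernel code: mock_follow_pfnmap_start(),
mock_fault_in(), lookup_pfn() and struct mock_args are hypothetical stand-ins
invented for illustration.  Only the error-code split (-EINVAL for a bad
request, -ENOENT for a missing entry) comes from the comment above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct follow_pfnmap_args. */
struct mock_args {
	bool valid_vma;		/* false models a bad address or non-PFNMAP vma */
	bool mapped;		/* false models "no PxE found" */
	unsigned long pfn;	/* output, valid only on success */
};

/* Mock lookup: -EINVAL for a bad request, -ENOENT for a missing entry. */
static int mock_follow_pfnmap_start(struct mock_args *args)
{
	assert(args != NULL);
	if (!args->valid_vma)
		return -EINVAL;
	if (!args->mapped)
		return -ENOENT;
	args->pfn = 0x1234;	/* arbitrary value for the sketch */
	return 0;
}

/* Mock fault-in: pretend the fault handler installed the mapping. */
static int mock_fault_in(struct mock_args *args)
{
	args->mapped = true;
	return 0;
}

/* Caller: bail on -EINVAL, fault in and retry once on -ENOENT. */
int lookup_pfn(struct mock_args *args, int *faults)
{
	int r = mock_follow_pfnmap_start(args);

	if (r != -ENOENT)
		return r;	/* success, or an error not worth retrying */

	(*faults)++;
	r = mock_fault_in(args);
	if (r)
		return r;
	return mock_follow_pfnmap_start(args);
}
```

With a single -EINVAL for every failure, a caller in this position cannot tell
a hopeless request from a merely unpopulated entry, so it has to either
precheck the vma itself or attempt the fault unconditionally.
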
> +retry:
> +	pgdp = pgd_offset(mm, address);
> +	if (pgd_none(*pgdp) || unlikely(pgd_bad(*pgdp)))
> +		goto out;
> +
> +	p4dp = p4d_offset(pgdp, address);
> +	p4d = READ_ONCE(*p4dp);
> +	if (p4d_none(p4d) || unlikely(p4d_bad(p4d)))
> +		goto out;
> +
> +	pudp = pud_offset(p4dp, address);
> +	pud = READ_ONCE(*pudp);
> +	if (pud_none(pud))
> +		goto out;
> +	if (pud_leaf(pud)) {
> +		lock = pud_lock(mm, pudp);
> +		if (!unlikely(pud_leaf(pud))) {
> +			spin_unlock(lock);
> +			goto retry;
> +		}
> +		pfnmap_args_setup(args, lock, NULL, pud_pgprot(pud),
> +				  pud_pfn(pud), PUD_MASK, pud_write(pud),
> +				  pud_special(pud));
> +		return 0;
> +	}
> +
> +	pmdp = pmd_offset(pudp, address);
> +	pmd = pmdp_get_lockless(pmdp);
> +	if (pmd_leaf(pmd)) {
> +		lock = pmd_lock(mm, pmdp);
> +		if (!unlikely(pmd_leaf(pmd))) {
> +			spin_unlock(lock);
> +			goto retry;
> +		}
> +		pfnmap_args_setup(args, lock, NULL, pmd_pgprot(pmd),
> +				  pmd_pfn(pmd), PMD_MASK, pmd_write(pmd),
> +				  pmd_special(pmd));
> +		return 0;
> +	}
> +
> +	ptep = pte_offset_map_lock(mm, pmdp, address, &lock);
> +	if (!ptep)
> +		goto out;
> +	pte = ptep_get(ptep);
> +	if (!pte_present(pte))
> +		goto unlock;
> +	pfnmap_args_setup(args, lock, ptep, pte_pgprot(pte),
> +			  pte_pfn(pte), PAGE_MASK, pte_write(pte),
> +			  pte_special(pte));
> +	return 0;
> +unlock:
> +	pte_unmap_unlock(ptep, lock);
> +out:
> +	return -EINVAL;
> +}
> +EXPORT_SYMBOL_GPL(follow_pfnmap_start);
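
For reference, the contract visible in the quoted function (every "return 0"
leaves the page table lock held, every failure path returns with it dropped)
can be modeled in a few lines of userspace C.  This is only a sketch under
stated assumptions: struct mock_lock, struct mock_pfnmap_args, table_lock and
both mock_* functions are hypothetical, and the names simply borrow the
and_lock/unlock spelling floated earlier in this reply.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page table spinlock. */
struct mock_lock {
	bool held;
};

/* Hypothetical stand-in for struct follow_pfnmap_args. */
struct mock_pfnmap_args {
	bool present;		/* input: does the mock entry exist? */
	unsigned long pfn;	/* output, stable only while the lock is held */
	struct mock_lock *lock;	/* private: set on success, NULL otherwise */
};

static struct mock_lock table_lock;

/* On success, returns 0 with the lock held.  On failure, no lock is held. */
int mock_follow_pfnmap_and_lock(struct mock_pfnmap_args *args)
{
	assert(args != NULL);
	args->lock = NULL;
	table_lock.held = true;			/* models pte_offset_map_lock() */
	if (!args->present) {
		table_lock.held = false;	/* models the unlock: path */
		return -EINVAL;			/* same code as the quoted patch */
	}
	args->pfn = 0x42;			/* arbitrary value for the sketch */
	args->lock = &table_lock;
	return 0;
}

/* Releases the lock taken by a successful lookup. */
void mock_follow_pfnmap_unlock(struct mock_pfnmap_args *args)
{
	assert(args != NULL && args->lock != NULL && args->lock->held);
	args->lock->held = false;
	args->lock = NULL;
}
```

The point of the model is that the "end" call is an unlock and nothing else,
and that it is only valid after a successful lookup.
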