From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2BBDC3DA6D for ; Wed, 21 May 2025 00:39:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id ED7356B0083; Tue, 20 May 2025 20:39:00 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id EAF576B0088; Tue, 20 May 2025 20:39:00 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DC5596B0089; Tue, 20 May 2025 20:39:00 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id BDD8C6B0083 for ; Tue, 20 May 2025 20:39:00 -0400 (EDT) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 5D5681A19ED for ; Wed, 21 May 2025 00:39:00 +0000 (UTC) X-FDA: 83465055240.09.6754E0F Received: from mail-ed1-f53.google.com (mail-ed1-f53.google.com [209.85.208.53]) by imf29.hostedemail.com (Postfix) with ESMTP id 7643C120005 for ; Wed, 21 May 2025 00:38:58 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=R5Ijp8zp; spf=pass (imf29.hostedemail.com: domain of jannh@google.com designates 209.85.208.53 as permitted sender) smtp.mailfrom=jannh@google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1747787938; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=WspCal8y8NZa4tDmuGWQBUOazVn+kNd9Xx/2Q+XoBxI=; b=OKJMetpInqGapQFve9cabi2+/gYkbGBO//F2uIwr/+WHkV7EtlnZAKi0RQE+Ti4DBn8WYR IBfe8uVkQ7IkN24n/uofohS0OwMNpg1AZrNvCWLzUFgCq/T4MkBpL5Z3J4HgeAoxkpmPkX K8xXuqdYttWxkg6Rftm7IaISqI++maA= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1747787938; a=rsa-sha256; cv=none; b=1de26l0oL3YzOGr9c783GRZ1bkgUcbmGfFkkNa2gRlfniXnJWNZh4alXTR0kwoIzOLUU8t IUl1W8UPCs0j7nDkd9LCJ2/OFlT88FaEB1awTv5s0prD+ZHWLOzBmGTn/coM+FYRJ0t6pI xrtpauCe3hRQvc8tk9NtAw3BXgdZe0Q= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=R5Ijp8zp; spf=pass (imf29.hostedemail.com: domain of jannh@google.com designates 209.85.208.53 as permitted sender) smtp.mailfrom=jannh@google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-ed1-f53.google.com with SMTP id 4fb4d7f45d1cf-6000791e832so30047a12.1 for ; Tue, 20 May 2025 17:38:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1747787937; x=1748392737; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=WspCal8y8NZa4tDmuGWQBUOazVn+kNd9Xx/2Q+XoBxI=; b=R5Ijp8zp5ygg1l1PjEDNejfGsu+zrGuxKiXHmnd3ldRujd46SiPSX6Lzhjt8eY1bGy ivPfnlEXFUjc/PDgMsNvH6fE82MpvM26stwzj6rywFD6vrsK9XspeLCyhnWRHK9PO91/ HAzANwwn67N66C6nF7/OY1Y7paUfUVbxUtUJWY9AfaJCA9ywUrDZvDeW/bZG0NaPNs1W 9iVDPnBoZOQFmcHVTsLtIEqf5XVvDya2OUD4+ILzU9LUvx5wPdq9lRr3f2elYLBnKyzO uQz4wuIQXXMWUdAGMb06FNqom/sD2zAXrcwhtVh4krx7CthJQx9o0gcp3q3b45aCQODd sXxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747787937; x=1748392737; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WspCal8y8NZa4tDmuGWQBUOazVn+kNd9Xx/2Q+XoBxI=; b=uBY6qwlVn5VUq8koJBM4ic1W0m8Rk7bR4mgrUind2ug0VJmg0Z7aOHzCXp8Qfjlbx0 iEdDIxs7eC3fhhjE+Q0Ezcx7n3qw4v1YD5P85bucz3McC8uSGaQQNBs1ylGa0HcrmtyM 2lsBnOKgx+gwx1W7ua+nA7WiBfdOdvOAeGtsXo7xgkNThTG0//g82RrDduUL+duaP8oH IP8LIv4bexYxeKLDBR2/b3Pb0Uw4wsNX5uU3RJPAcu2aHzFEc9D+W4osh/WlbI6i/WGr 5ykj3ygrzc3Dk1aOLN73pRscKiPwSwNVVudB2k6in4xByq5ADov/ubV0nfKWPHfe58zo 4gUA== X-Forwarded-Encrypted: i=1; AJvYcCXiOMgmb1hl2csrz5X9mCPHDgCrLIVCAp4VAbnwpf3Cr8BovTPu2BK4V/l8hRUZteRU1u8Dep1FXg==@kvack.org X-Gm-Message-State: AOJu0YwbzDUf0Z3cQKqTQYDIIL976fJJxAZKTyq10d/q1ZySLcd2QSVt 6gIvinmR7rK8+brMY3SCGaraTUlt93P0R/oMe4x7y1uRRnkyXRj+13Xq5dhRGE/RPJDmlnxspYh IcEvqbzUpxq+f8Bwkzt6e0TC61NQjT8cD8c84E+x+ X-Gm-Gg: ASbGncvYobROAr4MtJ81nU497Tv9bRHygDOFl9njALyRxqwO0FB8idWVPZoB2Yc6jWy pW2VQrCFqsu3norjBXy9kBK2NMUy36sfkrDoQmnlSUTC32TX83lL3zLTwtxyH2RJYxK4Jmh6RJo dSMEto7imhiodeHyKb7gQ6YiGBVFxPqsmynIAx9te8ECHRdvdCE2YG5UnohA6XFSovnFHSRYlQ X-Google-Smtp-Source: AGHT+IFGzg1HiyU3iSRyJlcEH+h760Pi+n2Y93HX/OapY6BeDZU3wPr7AjpuGUN09T84tbsasqjEUyHQjDZU2RrY0WM= X-Received: by 2002:a05:6402:14d5:b0:601:233a:4f4d with SMTP id 4fb4d7f45d1cf-6019bf2f776mr369506a12.2.1747787936630; Tue, 20 May 2025 17:38:56 -0700 (PDT) MIME-Version: 1.0 References: <20250520214813.3946964-1-roman.gushchin@linux.dev> In-Reply-To: <20250520214813.3946964-1-roman.gushchin@linux.dev> From: Jann Horn Date: Wed, 21 May 2025 02:38:20 +0200 X-Gm-Features: AX0GCFu94sdnWGHTjJDGScJi6wHq5LLBJs6G1BkmzL4E483Bo3JMeYGYzyLmFEI Message-ID: Subject: Re: [PATCH v5] mmu_gather: move tlb flush for VM_PFNMAP/VM_MIXEDMAP vmas into free_pgtables() To: Roman Gushchin , Hugh Dickins Cc: Andrew Morton , Peter Zijlstra , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Will Deacon , "Aneesh Kumar K.V" , Nick Piggin , linux-arch@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Stat-Signature: 11xj61gkzqozqxkp3ax49d4cnoaktkiy X-Rspamd-Queue-Id: 7643C120005 X-Rspam-User: X-Rspamd-Server: rspam02 X-HE-Tag: 1747787938-674032 X-HE-Meta: U2FsdGVkX1+Dgiky/ebV/qNRDN+r/mX+c7IjyEbM58CHwF/Yr8k/rrp3gdSOzubZ4ILlffXb27jx9AnHR+yoWbIGSyg03OFWm8uRxDa7dD3QX7gpHgj6yVs86eQOlN7OJKNB/vNbTVokgbwwxLuiPtg2+jqO9TnmXFZZoJXoHaMTMMHJ7rPNrki2aTCy8qGFGYb2TIFNKo94xqc697NlMWtblbYk95ygBhi3HfhmVtjfLU4lfSPzlzyNtptlbP1i/32XUKT2Ntl5YMhgiXUum3LUeE5M5zCafwYcCvOVjy8XDpm99zyxnW49eiaAXycJxn4Mx/s0nPn9bfmTrBESYeeqndvF82ZGBwheVFdDRdnHr6aMnAIGpylsYGcWNXpdseKfRTuFGE7kOx92prvrWGwy9yrWHkQpTL96yyK/zTFWAbxPad4eCLpcNAlOw9YhT31uFpmcCGTqhdQu9A9Z1ZiLShYAMFkDovQBoh6wnl/aA6dpMvJoyrbDQOsCQPdGehRQ6LO1VP4PHbfSl5mHdJPnZ7wzuv1vOnfPgUAR2DBOxUgWXQPbcn4e9/BVrwZdQ/zlPjXOlTIplEjRhEL8tOfEgfrGCoxL5ZAv8va6Wt2IT/x2L7R7m1siufwGfuLE1deRdXI34P4BLIMn71R1IROzZyvMicpBrAV+RymQXKxHhb47VDw1i3WrVxfbdmR2i1y7yLYBBExpY+zucJP/+ePVJASQX3AR3yok/my7oD2DKSGsSTvv/x52FBy4ttmJCE2kMjY+ZdUTXQNKEexpeGej9N6fvBA7iRE3vXNe/ikvUV8c64vY8TKHanfChhM5S61z1UuuBLDtcUlahjdtscpk65jXIPA+e88kTvp1qKEtEVuq5SLYoed/Pm43QWRlfUZqNDWUJQFcanzBdLxzYuvO31q3sPp8p3aFaRFA+KTSzBJs1bazvQyppkFtoJBWsRQD41iWt1E9Wa4suHA lknHG/RN 5n0iPDwZI1OuuuLhSzl2mKmwNYMCG8YA6ZhMP3CRQE5I0eM5dbsIp5NIaSWGkiBsp0H5/bOiPdSUA92M5KMWfOXIYtCwqV6bXC54FaLaPn1qqdGOC4D/w9RetmDsid0boTH6syM3JkMOtDYrw4Wjo+Ut4z1E6R0eOM7BomGjOpiCS9/PrpdumWKodwgQB91YkidyOqiP5A1OKvajvWXxlBnsyDv14Lb1AoLdqOX43+0gocNuL0KGvL/T0TUGhA8XXcAbmXnzqmzdjFPQTufoYhH6DL7bePwqg0Rg6a5xROr/EB9T98QjW+iMCdxs2q6OclDBILs66WcwG28Gf/gmQqy7mYhq9ogZRntCsbPcSHDEncRzd1X+FcfxutaHnw1ZR2fS5f74+pnNfveyAmF8rN+QIzRgJfP0kIOHRUVWTaYL9v4JbjqTjwKbhNyg5YPuipFydxGm7bO83FK9AyEGnfKgrZFNsMm9TM09PzAkhOaVLxVU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, May 20, 2025 at 11:50=E2=80=AFPM Roman Gushchin wrote: > Commit b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas") > added a forced tlbflush to tlb_vma_end(), which is required to avoid a > race between munmap() and unmap_mapping_range(). However it added some > overhead to other paths where tlb_vma_end() is used, but vmas are not > removed, e.g. madvise(MADV_DONTNEED). > > Fix this by moving the tlb flush out of tlb_end_vma() into new > tlb_flush_vmas() called from free_pgtables(), somewhat similar to the > stable version of the original commit: > commit 895428ee124a ("mm: Force TLB flush for PFNMAP mappings before > unlink_file_vma()"). > > Note, that if tlb->fullmm is set, no flush is required, as the whole > mm is about to be destroyed. > > Signed-off-by: Roman Gushchin > Cc: Jann Horn > Cc: Peter Zijlstra > Cc: Will Deacon > Cc: "Aneesh Kumar K.V" > Cc: Andrew Morton > Cc: Nick Piggin > Cc: Hugh Dickins > Cc: linux-arch@vger.kernel.org > Cc: linux-mm@kvack.org > > --- > v5: > - tlb_free_vma() -> tlb_free_vmas() to avoid extra checks > > v4: > - naming/comments update (by Peter Z.) > - check vma->vma->vm_flags in tlb_free_vma() (by Peter Z.) > > v3: > - added initialization of vma_pfn in __tlb_reset_range() (by Hugh D.) > > v2: > - moved vma_pfn flag handling into tlb.h (by Peter Z.) > - added comments (by Peter Z.) > - fixed the vma_pfn flag setting (by Hugh D.) > --- > include/asm-generic/tlb.h | 49 +++++++++++++++++++++++++++++++-------- > mm/memory.c | 2 ++ > 2 files changed, 41 insertions(+), 10 deletions(-) > > diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h > index 88a42973fa47..8a8b9535a930 100644 > --- a/include/asm-generic/tlb.h > +++ b/include/asm-generic/tlb.h > @@ -58,6 +58,11 @@ > * Defaults to flushing at tlb_end_vma() to reset the range; helps wh= en > * there's large holes between the VMAs. > * > + * - tlb_free_vmas() > + * > + * tlb_free_vmas() marks the start of unlinking of one or more vmas > + * and freeing page-tables. > + * > * - tlb_remove_table() > * > * tlb_remove_table() is the basic primitive to free page-table direc= tories > @@ -399,7 +404,10 @@ static inline void __tlb_reset_range(struct mmu_gath= er *tlb) > * Do not reset mmu_gather::vma_* fields here, we do not > * call into tlb_start_vma() again to set them if there is an > * intermediate flush. > + * > + * Except for vma_pfn, that only cares if there's pending TLBI. > */ > + tlb->vma_pfn =3D 0; This looks dodgy to me. Can you explain here in more detail why this is okay? Looking at current mainline, tlb->vma_pfn is only set to 1 when tlb_start_vma() calls into tlb_update_vma_flags(); it is never set again after tlb_start_vma(), so I don't think it's legal to just clear it in the middle of a VMA. If we had something like this callgraph on a VM_MIXEDMAP mapping, with an intermediate TLB flush in the middle of the VM_MIXEDMAP mapping: tlb_start_vma() [sets tlb->vma_pfn] zap_pte_range tlb_flush_mmu_tlbonly __tlb_reset_range [clears tlb->vma_pfn] zap_pte_range [zaps more PTEs and queues a pending TLB flush] tlb_end_vma() free_pgtables tlb_free_vmas [checks for tlb->vma_pfn] then tlb_free_vmas() will erroneously not do a flush when it should've done one, right? Why does it even matter to you whether tlb->vma_pfn ever gets reset? I think more or less at worst you do one extra TLB flush in some case involving a munmap() across multiple VMAs including a mix of VM_PFNMAP/VM_MIXEDMAP and non-VM_PFNMAP/VM_MIXEDMAP VMAs (which is already a fairly weird scenario on its own), with the region being operated on large enough or complicated enough that you already did at least one TLB flush, and the unmap sufficiently large or with sufficient address space around it that page tables are getting freed, or something like that? That seems like a scenario in which one more flush isn't going to be a big deal. If you wanted to do this properly, I think you could do something like: - add another flag tlb->current_vma_pfn that tracks whether the current vma is pfnmap/mixedmap - reset tlb->vma_pfn on TLB flush - set tlb->vma_pfn again if a TLB flush is enqueued while tlb->current_vma_pfn is true But that seems way too complicated, so I would just delete these three lines from the patch.