Subject: Re: [RFC PATCH v1 1/6] fs/proc/task_mmu: Fix pte update and tlb maintenance ordering in pagemap_scan_pmd_entry()
From: Jann Horn
Date: Fri, 30 May 2025 18:26:44 +0200
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, Madhavan Srinivasan, Michael Ellerman,
 Nicholas Piggin, Christophe Leroy, "David S. Miller", Andreas Larsson,
 Juergen Gross, Ajay Kaher, Alexey Makhalov, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin", Boris Ostrovsky,
 "Aneesh Kumar K.V", Andrew Morton, Peter Zijlstra, Arnd Bergmann,
 David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Alexei Starovoitov,
 Andrey Ryabinin, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 sparclinux@vger.kernel.org, virtualization@lists.linux.dev,
 xen-devel@lists.xenproject.org, linux-mm@kvack.org, Andy Lutomirski
In-Reply-To: <20250530140446.2387131-2-ryan.roberts@arm.com>
References: <20250530140446.2387131-1-ryan.roberts@arm.com> <20250530140446.2387131-2-ryan.roberts@arm.com>

On Fri, May 30, 2025 at 4:04 PM Ryan Roberts wrote:
> pagemap_scan_pmd_entry() was previously modifying ptes while in lazy mmu
> mode, then performing tlb maintenance for the modified ptes, then
> leaving lazy mmu mode. But any pte modifications during lazy mmu mode
> may be deferred until arch_leave_lazy_mmu_mode(), inverting the required
> ordering between pte modification and tlb maintenance.
>
> Let's fix that by leaving lazy mmu mode, forcing all the pte updates to
> be actioned, before doing the tlb maintenance.
>
> This is a theoretical bug discovered during code review.
>
> Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs")

Hmm... isn't lazy mmu mode supposed to also delay TLB flushes, and
preserve the ordering of PTE modifications and TLB flushes? Looking at
the existing implementations of lazy MMU:

 - In the Xen PV implementation of lazy MMU, I see that TLB flush
   hypercalls are delayed as well (xen_flush_tlb(),
   xen_flush_tlb_one_user() and xen_flush_tlb_multi() all use
   xen_mc_issue(XEN_LAZY_MMU), which delays issuing if lazy MMU mode
   is active).
 - The sparc version also seems to delay TLB flushes, and sparc's
   arch_leave_lazy_mmu_mode() seems to do TLB flushes via
   flush_tlb_pending() if necessary.
 - powerpc's arch_leave_lazy_mmu_mode() also seems to do TLB flushes.

Am I missing something? If arm64 requires different semantics compared
to all existing implementations and doesn't delay TLB flushes for lazy
mmu mode, I think the "Fixes" tag should point to your addition of
lazy mmu support for arm64.

> Signed-off-by: Ryan Roberts
> ---
>  fs/proc/task_mmu.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 994cde10e3f4..361f3ffd9a0c 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -2557,10 +2557,9 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
>         }
>
>  flush_and_return:
> +       arch_leave_lazy_mmu_mode();
>         if (flush_end)
>                 flush_tlb_range(vma, start, addr);
> -
> -       arch_leave_lazy_mmu_mode();

I think this ordering was probably intentional, because doing it this
way around allows Xen PV to avoid one more hypercall: the TLB flush
can be batched together with the page table changes?

>         pte_unmap_unlock(start_pte, ptl);
>
>         cond_resched();
> --
> 2.43.0
>