Date: Mon, 23 Nov 2020 13:04:03 -0700
From: Yu Zhao
To: Will Deacon
Cc: linux-kernel@vger.kernel.org, kernel-team@android.com,
	Catalin Marinas, Minchan Kim, Peter Zijlstra, Linus Torvalds,
	Anshuman Khandual, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 6/6] mm: proc: Avoid fullmm flush for young/dirty bit toggling
Message-ID: <20201123200403.GA3888699@google.com>
References: <20201120143557.6715-1-will@kernel.org>
	<20201120143557.6715-7-will@kernel.org>
	<20201120204005.GC1303870@google.com>
	<20201123183554.GC11688@willie-the-truck>
In-Reply-To: <20201123183554.GC11688@willie-the-truck>

On Mon, Nov 23, 2020 at 06:35:55PM +0000, Will Deacon wrote:
> On Fri, Nov 20, 2020 at 01:40:05PM -0700, Yu Zhao wrote:
> > On Fri, Nov 20, 2020 at 02:35:57PM +0000, Will Deacon wrote:
> > > clear_refs_write() uses the 'fullmm' API for invalidating TLBs after
> > > updating the page-tables for the current mm. However, since the mm is
> > > not being freed, this can result in stale TLB entries on architectures
> > > which elide 'fullmm' invalidation.
> > >
> > > Ensure that TLB invalidation is performed after updating soft-dirty
> > > entries via clear_refs_write() by using the non-fullmm API to MMU
> > > gather.
> > >
> > > Signed-off-by: Will Deacon
> > > ---
> > >  fs/proc/task_mmu.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > > index a76d339b5754..316af047f1aa 100644
> > > --- a/fs/proc/task_mmu.c
> > > +++ b/fs/proc/task_mmu.c
> > > @@ -1238,7 +1238,7 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
> > >  			count = -EINTR;
> > >  			goto out_mm;
> > >  		}
> > > -		tlb_gather_mmu_fullmm(&tlb, mm);
> > > +		tlb_gather_mmu(&tlb, mm, 0, TASK_SIZE);
> >
> > Let's assume my reply to patch 4 is wrong, and therefore we still need
> > tlb_gather/finish_mmu() here. But then wouldn't this change deprive
> > architectures other than ARM of the opportunity to optimize based on
> > the fact that it's a full-mm flush?

I double-checked my conclusion on patch 4, and aside from a couple of
typos, it still seems correct after the weekend.

> Only for the soft-dirty case, but I think TLB invalidation is required
> there because we are write-protecting the entries and I don't see any
> mechanism to handle lazy invalidation for that (compared with the aging
> case, which is handled via pte_accessible()).

The lazy invalidation for that is done when we write-protect a page, not
an individual PTE.
When we do so, our decision is based on both the dirty bit and the
writable bit on each PTE mapping this page. So we only need to make sure
we don't lose both on a PTE. And we don't here.

> Furthermore, if we decide that we can relax the TLB invalidation
> requirements here, then I'd much rather that was done deliberately,
> rather than as an accidental side-effect of another commit (since I
> think the current behaviour was a consequence of 7a30df49f63a).

Nope. tlb_gather/finish_mmu() was added here by b3a81d0841a9 ("mm: fix
KSM data corruption") in the first place.

> > It seems to me ARM's interpretation of tlb->fullmm is a special case,
> > not the other way around.
>
> Although I agree that this is subtle and error-prone (which is why I'm
> trying to make the API more explicit here), it _is_ documented clearly
> in asm-generic/tlb.h:
>
>  *  - mmu_gather::fullmm
>  *
>  *    A flag set by tlb_gather_mmu() to indicate we're going to free
>  *    the entire mm; this allows a number of optimizations.
>  *
>  *    - We can ignore tlb_{start,end}_vma(); because we don't
>  *      care about ranges. Everything will be shot down.
>  *
>  *    - (RISC) architectures that use ASIDs can cycle to a new ASID
>  *      and delay the invalidation until ASID space runs out.

I'd leave the original tlb_gather/finish_mmu() for the first case and add
a new API for the second case, the special case that only applies to
exit_mmap(). This way we won't change any existing behaviors on other
architectures, which seems important to me. Additional cleanups to
tlb_gather/finish_mmu() can come thereafter.
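
To make that concrete, a rough sketch of the kind of split meant above --
illustration only, not a real patch; the helper name and the struct fields
touched here are placeholders, the actual definitions live in
asm-generic/tlb.h:

	/*
	 * Sketch only: one shared init helper, a ranged API for callers
	 * that keep the mm alive, and a fullmm API reserved for
	 * exit_mmap().
	 */
	static void __tlb_gather_mmu(struct mmu_gather *tlb,
				     struct mm_struct *mm, bool fullmm)
	{
		tlb->mm = mm;
		/*
		 * Architectures may elide the flush (e.g. cycle to a new
		 * ASID) only when fullmm is set, i.e. only on the
		 * exit_mmap() path.
		 */
		tlb->fullmm = fullmm;
		/* ... common initialization ... */
	}

	/* General case: unmapping a range while the mm stays alive. */
	void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
			    unsigned long start, unsigned long end)
	{
		__tlb_gather_mmu(tlb, mm, false);
		tlb->start = start;
		tlb->end = end;
	}

	/* Special case: exit_mmap() only, where the whole mm is freed. */
	void tlb_gather_mmu_fullmm(struct mmu_gather *tlb,
				   struct mm_struct *mm)
	{
		__tlb_gather_mmu(tlb, mm, true);
		tlb->start = 0;
		tlb->end = TASK_SIZE;
	}

With something along these lines, callers such as clear_refs_write() keep
forcing a real flush through the ranged API, while the fullmm
optimizations stay confined to the one path where the mm really is going
away.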