Date: Wed, 7 Apr 2021 13:50:44 -0700
From: Michel Lespinasse
To: Peter Zijlstra
Cc: Michel Lespinasse, Linux-MM, Laurent Dufour, Michal Hocko, Matthew Wilcox, Rik van Riel, Paul McKenney, Andrew Morton, Suren Baghdasaryan, Joel Fernandes, Rom Lemarchand, Linux-Kernel
Subject: Re: [RFC PATCH 09/37] mm: add per-mm mmap sequence counter for speculative page fault handling.
Message-ID: <20210407205044.GD25738@lespinasse.org>
References: <20210407014502.24091-1-michel@lespinasse.org> <20210407014502.24091-10-michel@lespinasse.org>

On Wed, Apr 07, 2021 at 04:47:34PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 06, 2021 at 06:44:34PM -0700, Michel Lespinasse wrote:
> > The counter's write side is hooked into the existing mmap locking API:
> > mmap_write_lock() increments the counter to the next (odd) value, and
> > mmap_write_unlock() increments it again to the next (even) value.
> >
> > The counter's speculative read side is supposed to be used as follows:
> >
> > 	seq = mmap_seq_read_start(mm);
> > 	if (seq & 1)
> > 		goto fail;
> > 	.... speculative handling here ....
> > 	if (!mmap_seq_read_check(mm, seq))
> > 		goto fail;
> >
> > This API guarantees that, if none of the "fail" tests abort
> > speculative execution, the speculative code section did not run
> > concurrently with any mmap writer.
>
> So this is obviously safe, but it's also super excessive. Any change,
> anywhere, will invalidate and abort a SPF.
>
> Since you make a complete copy of the vma, you could memcmp it in its
> entirety instead of this.

Yeah, there is a deliberate choice here to start with the simplest
possible approach, but this could lead to more SPF aborts than
strictly necessary.

It's not clear to me that just comparing original vs current vma
attributes would always be sufficient - I think in some cases, we may
also need to worry about attributes being changed back and forth
concurrently with the fault. However, the comparison you suggest would
be safe at least in the case where the original pte was pte_none.

At some point, I did try implementing the optimization you suggest -
if the global counter has changed, re-fetch the current vma and
compare it against the old - but this was basically no help for the
(Android) workloads I looked at. There were just too few aborts that
were caused by the global counter in the first place. Note that my
patchset only handles anon and page cache faults speculatively, so
generally there wasn't very much time for the counter to change.

But yes, this may not hold across all workloads, and maybe we'd want
to do something smarter once/if a problem workload is identified.
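
For readers following the thread, here is a rough user-space sketch of
the two ideas discussed above: the odd/even sequence counter, and the
"re-fetch the vma and compare" fallback. It is not the patchset's code.
Every toy_* name, the struct layouts and the interfere hook are
hypothetical, it is single-threaded, and it uses default C11 atomics
rather than the kernel's barriers. The toy_flip_back case shows the
"changed back and forth" situation: the comparison passes even though
a writer modified the vma during the speculative section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins, not the kernel's vm_area_struct / mm_struct. */
struct toy_vma { unsigned long start, end, flags; };
struct toy_mm {
	atomic_ulong seq;	/* even: no writer, odd: writer active */
	struct toy_vma vma;	/* a single vma, for illustration only */
};

static void toy_write_lock(struct toy_mm *mm)
{
	/* A real writer would acquire the mmap lock before this. */
	atomic_fetch_add(&mm->seq, 1);		/* next (odd) value */
}

static void toy_write_unlock(struct toy_mm *mm)
{
	unsigned long old = atomic_fetch_add(&mm->seq, 1); /* next (even) */
	assert(old & 1);
}

static unsigned long toy_seq_read_start(struct toy_mm *mm)
{
	return atomic_load(&mm->seq);
}

static bool toy_seq_read_check(struct toy_mm *mm, unsigned long seq)
{
	return atomic_load(&mm->seq) == seq;
}

enum toy_result { TOY_OK, TOY_OK_AFTER_COMPARE, TOY_ABORT };

/* 'interfere' simulates a writer running during the speculative section. */
static enum toy_result toy_speculate(struct toy_mm *mm,
				     void (*interfere)(struct toy_mm *))
{
	unsigned long seq = toy_seq_read_start(mm);
	if (seq & 1)
		return TOY_ABORT;		/* writer active at start */

	struct toy_vma snap = mm->vma;		/* complete copy of the vma */
	if (interfere)
		interfere(mm);			/* speculative handling here */

	if (toy_seq_read_check(mm, seq))
		return TOY_OK;			/* no writer ran at all */

	/* Fallback: re-fetch the vma under a fresh, stable counter value. */
	unsigned long seq2 = toy_seq_read_start(mm);
	if (seq2 & 1)
		return TOY_ABORT;
	struct toy_vma cur = mm->vma;
	if (!toy_seq_read_check(mm, seq2))
		return TOY_ABORT;
	return memcmp(&snap, &cur, sizeof(snap)) == 0 ?
		TOY_OK_AFTER_COMPARE : TOY_ABORT;
}

/* Writer that takes the lock but leaves this vma untouched. */
static void toy_touch_unrelated(struct toy_mm *mm)
{
	toy_write_lock(mm);
	toy_write_unlock(mm);
}

/* Writer that changes the vma flags. */
static void toy_change_flags(struct toy_mm *mm)
{
	toy_write_lock(mm);
	mm->vma.flags ^= 1;
	toy_write_unlock(mm);
}

/* Two writers in a row: the flags change and then change back. */
static void toy_flip_back(struct toy_mm *mm)
{
	toy_change_flags(mm);
	toy_change_flags(mm);
}
```

With the counter alone, all three interfering writers would abort the
speculative path. With the comparison fallback, only toy_change_flags
does, which is the saving Peter is pointing at. Whether accepting the
toy_flip_back case is safe depends on what the fault handler did in
between, which is the open question raised in the reply.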