Subject: Re: [PATCH 4/8] membarrier: Make the post-switch-mm barrier explicit
From: Andy Lutomirski
To: Peter Zijlstra, Nicholas Piggin
Cc: x86@kernel.org, Andrew Morton, Dave Hansen, LKML, linux-mm@kvack.org,
 Mathieu Desnoyers
Date: Wed, 16 Jun 2021 11:41:19 -0700
Message-ID: <617cb897-58b1-8266-ecec-ef210832e927@kernel.org>
References: <1623816595.myt8wbkcar.astroid@bobo.none>
On 6/16/21 12:35 AM, Peter Zijlstra wrote:
> On Wed, Jun 16, 2021 at 02:19:49PM +1000, Nicholas Piggin wrote:
>> Excerpts from Andy Lutomirski's message of June 16, 2021 1:21 pm:
>>> membarrier() needs a barrier after any CPU changes mm. There is currently
>>> a comment explaining why this barrier probably exists in all cases. This
>>> is very fragile -- any change to the relevant parts of the scheduler
>>> might get rid of these barriers, and it's not really clear to me that
>>> the barrier actually exists in all necessary cases.
>>
>> The comments and barriers in the mmdrop() hunks? I don't see what is
>> fragile or maybe-buggy about this. The barrier definitely exists.
>>
>> And any change can change anything, that doesn't make it fragile. My
>> lazy tlb refcounting change avoids the mmdrop in some cases, but it
>> replaces it with smp_mb for example.
>
> I'm with Nick again, on this. You're adding extra barriers for no
> discernible reason, that's not generally encouraged, seeing how extra
> barriers is extra slow.
>
> Both mmdrop() itself, as well as the callsite have comments saying how
> membarrier relies on the implied barrier, what's fragile about that?

My real motivation is that mmgrab() and mmdrop() don't actually need to
be full barriers.  The current implementation has them being full
barriers, and the current implementation is quite slow.  So let's try
that commit message again:

membarrier() needs a barrier after any CPU changes mm.  There is
currently a comment explaining why this barrier probably exists in all
cases.  The logic is based on ensuring that the barrier exists on every
control flow path through the scheduler.  It also relies on mmgrab() and
mmdrop() being full barriers.

mmgrab() and mmdrop() would be better if they were not full barriers.  As
a trivial optimization, mmgrab() could use a relaxed atomic and mmdrop()
could use a release on architectures that have these operations.  Larger
optimizations are also in the works.  Doing any of these optimizations
while preserving an unnecessary barrier will complicate the code and
penalize non-membarrier-using tasks.

Simplify the logic by adding an explicit barrier, and allow architectures
to override it as an optimization if they want to.

One of the deleted comments in this patch said "It is therefore possible
to schedule between user->kernel->user threads without passing through
switch_mm()".  It is possible to do this without, say, writing to CR3 on
x86, but the core scheduler indeed calls switch_mm_irqs_off() to tell the
arch code to go back from lazy mode to no-lazy mode.
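
To make the relaxed/release point concrete, here is a minimal userspace
sketch using C11 atomics.  This is not the kernel's mmgrab()/mmdrop()
(those are built on the kernel's own atomic_*() helpers); it only shows
the ordering I have in mind, and the *_sketch names are made up for the
example:

#include <stdatomic.h>
#include <stdlib.h>

struct mm_struct_sketch {
	atomic_int mm_count;
	/* ...the rest of mm_struct... */
};

/* Taking a reference needs no ordering: the caller already holds one,
 * so the object cannot be freed out from under us. */
static void mmgrab_sketch(struct mm_struct_sketch *mm)
{
	atomic_fetch_add_explicit(&mm->mm_count, 1, memory_order_relaxed);
}

/* Dropping a reference uses release so our earlier accesses to *mm are
 * ordered before the count can reach zero; the acquire fence on the
 * zero path pairs with every other thread's release before we free. */
static void mmdrop_sketch(struct mm_struct_sketch *mm)
{
	if (atomic_fetch_sub_explicit(&mm->mm_count, 1,
				      memory_order_release) == 1) {
		atomic_thread_fence(memory_order_acquire);
		free(mm);		/* stand-in for __mmdrop() */
	}
}

int main(void)
{
	struct mm_struct_sketch *mm = malloc(sizeof(*mm));

	atomic_init(&mm->mm_count, 1);	/* initial reference */
	mmgrab_sketch(mm);		/* take a second reference */
	mmdrop_sketch(mm);		/* drop it */
	mmdrop_sketch(mm);		/* last drop frees */
	return 0;
}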
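
And the "explicit barrier with an arch override" shape would look roughly
like the sketch below.  The helper name membarrier_finish_switch_mm() and
the surrounding function are hypothetical stand-ins, not necessarily what
the final patch uses, and smp_mb() is stubbed so the example compiles
outside the kernel:

#include <stdatomic.h>
#include <stddef.h>

struct mm_struct_sketch;		/* opaque for this sketch */

/* stand-in for the kernel's smp_mb() */
#define smp_mb() atomic_thread_fence(memory_order_seq_cst)

/*
 * Generic default: an explicit full barrier once the CPU is running on
 * the new mm, so the ordering membarrier() needs holds on every path
 * through the scheduler instead of being implied by mmdrop().  An
 * architecture whose switch_mm() or return-to-user path already provides
 * the ordering can define this away as an optimization.
 */
#ifndef membarrier_finish_switch_mm
#define membarrier_finish_switch_mm(mm) smp_mb()
#endif

static void finish_task_switch_sketch(struct mm_struct_sketch *next_mm)
{
	/* ...post-switch bookkeeping... */
	membarrier_finish_switch_mm(next_mm);	/* explicit, easy to audit */
}

int main(void)
{
	finish_task_switch_sketch(NULL);	/* nothing to switch in this toy */
	return 0;
}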