From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20220921084302.43631-1-yangyicong@huawei.com> <20220921084302.43631-3-yangyicong@huawei.com>
 <168eac93-a6ee-0b2e-12bb-4222eff24561@arm.com> <8e391962-4e3a-5a56-64b4-78e8637e3b8c@huawei.com>
 <CAGsJ_4z=dZbrAUD9jczT08S3qi_ep-h+EK35UfayVk1S+Cnp2A@mail.gmail.com>
 <ecd161db-b290-7997-a81e-a0a00bd1c599@arm.com> <87o7tx5oyx.fsf@stealth>
In-Reply-To: <87o7tx5oyx.fsf@stealth>
From: Barry Song <21cnbao@gmail.com>
Date: Fri, 28 Oct 2022 10:55:42 +1300
Message-ID: <CAGsJ_4zrGfPYAXGW0g3Z-GF4vT7GD0xDjZn1dv-qruztEQTghg@mail.gmail.com>
Subject: Re: [PATCH v4 2/2] arm64: support batched/deferred tlb shootdown
 during page reclamation
To: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>, Yicong Yang <yangyicong@huawei.com>, 
	yangyicong@hisilicon.com, corbet@lwn.net, peterz@infradead.org, arnd@arndb.de, 
	linux-kernel@vger.kernel.org, darren@os.amperecomputing.com, 
	huzhanyuan@oppo.com, lipeifeng@oppo.com, zhangshiming@oppo.com, 
	guojian@oppo.com, realmz6@gmail.com, linux-mips@vger.kernel.org, 
	openrisc@lists.librecores.org, linux-mm@kvack.org, x86@kernel.org, 
	linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, 
	akpm@linux-foundation.org, linux-riscv@lists.infradead.org, 
	linux-s390@vger.kernel.org, wangkefeng.wang@huawei.com, 
	xhao@linux.alibaba.com, prime.zeng@hisilicon.com, 
	Barry Song <v-songbaohua@oppo.com>, Nadav Amit <namit@vmware.com>, Mel Gorman <mgorman@suse.de>, 
	catalin.marinas@arm.com, will@kernel.org, linux-doc@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"

On Fri, Oct 28, 2022 at 3:19 AM Punit Agrawal
<punit.agrawal@bytedance.com> wrote:
>
>
> [ Apologies for chiming in late in the conversation ]
>
> Anshuman Khandual <anshuman.khandual@arm.com> writes:
>
> > On 9/28/22 05:53, Barry Song wrote:
> >> On Tue, Sep 27, 2022 at 10:15 PM Yicong Yang <yangyicong@huawei.com> wrote:
> >>>
> >>> On 2022/9/27 14:16, Anshuman Khandual wrote:
> >>>> [...]
> >>>>
> >>>> On 9/21/22 14:13, Yicong Yang wrote:
> >>>>> +static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
> >>>>> +{
> >>>>> +    /* for small systems with small number of CPUs, TLB shootdown is cheap */
> >>>>> +    if (num_online_cpus() <= 4)
> >>>>
> >>>> It would be great to have some more inputs from others, whether 4 (which should
> >>>> to be codified into a macro e.g ARM64_NR_CPU_DEFERRED_TLB, or something similar)
> >>>> is optimal for an wide range of arm64 platforms.
> >>>>
> >>
> >> I have tested it on a 4-cpus and 8-cpus machine. but i have no machine
> >> with 5,6,7
> >> cores.
> >> I saw improvement on 8-cpus machines and I found 4-cpus machines don't need
> >> this patch.
> >>
> >> so it seems safe to have
> >> if (num_online_cpus()  < 8)
> >>
> >>>
> >>> Do you prefer this macro to be static or make it configurable through kconfig then
> >>> different platforms can make choice based on their own situations? It maybe hard to
> >>> test on all the arm64 platforms.
> >>
> >> Maybe we can have this default enabled on machines with 8 and more cpus and
> >> provide a tlbflush_batched = on or off to allow users enable or
> >> disable it according
> >> to their hardware and products. Similar example: rodata=on or off.
> >
> > No, sounds bit excessive. Kernel command line options should not be added
> > for every possible run time switch options.
> >
> >>
> >> Hi Anshuman, Will,  Catalin, Andrew,
> >> what do you think about this approach?
> >>
> >> BTW, haoxin mentioned another important user scenarios for tlb bach on arm64:
> >> https://lore.kernel.org/lkml/393d6318-aa38-01ed-6ad8-f9eac89bf0fc@linux.alibaba.com/
> >>
> >> I do believe we need it based on the expensive cost of tlb shootdown in arm64
> >> even by hardware broadcast.
> >
> > Alright, for now could we enable ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH selectively
> > with CONFIG_EXPERT and for num_online_cpus()  > 8 ?
>
> When running the test program in the commit in a VM, I saw benefits from
> the patches at all sizes from 2, 4, 8, 32 vcpus. On the test machine,
> ptep_clear_flush() went from ~1% in the unpatched version to not showing
> up.
>
> Yicong mentioned that he didn't see any benefit for <= 4 CPUs but is
> there any overhead? I am wondering what are the downsides of enabling
> the config by default.

Because the TLB flush is deferred, any path that modifies a VMA while a flush
for it is still pending has to sync first: mprotect(), madvise() etc. call
flush_tlb_batched_pending() to make sure they see the flushed result. If nobody
calls mprotect(), madvise() etc. during the deferred period, the overhead is
zero.
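
To illustrate the cost model, here is a toy userspace sketch. It is not the
kernel code: struct toy_mm and the toy_* helpers are made-up names, and a
"flush" is just a counter increment. It only shows that the sync point is
free unless a deferred flush is actually outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
struct toy_mm {
	bool flush_pending;	/* a deferred TLB flush is still owed */
	unsigned long flushes;	/* number of simulated flushes performed */
};

/* Reclaim unmaps a page but only records that a flush is owed. */
static void toy_unmap_deferred(struct toy_mm *mm)
{
	mm->flush_pending = true;
}

/* End of a reclaim batch: one flush covers every deferred unmap. */
static void toy_batch_flush(struct toy_mm *mm)
{
	if (mm->flush_pending) {
		mm->flushes++;
		mm->flush_pending = false;
	}
}

/*
 * Stand-in for the sync point reached from mprotect()/madvise().
 * Returns true if it had to flush, false if nothing was pending.
 */
static bool toy_sync_pending(struct toy_mm *mm)
{
	if (!mm->flush_pending)
		return false;	/* common case: no extra work */

	mm->flushes++;
	mm->flush_pending = false;
	return true;
}
```

In this model the only extra cost is the early flush taken when
toy_sync_pending() runs between toy_unmap_deferred() and toy_batch_flush();
otherwise the sync point is a single flag check.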

>
> Thanks,
> Punit

Thanks
Barry