From mboxrd@z Thu Jan  1 00:00:00 1970
From: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
Date: Fri, 7 Jul 2023 18:54:43 +0530
Subject: Re: [PATCH v2 0/5] Avoid building lrugen page table walk code
To: Yu Zhao
Cc: linux-mm@kvack.org, akpm@linux-foundation.org
Message-ID: <47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com>
References: <20230706062044.816068-1-aneesh.kumar@linux.ibm.com>
Content-Type: text/plain; charset=UTF-8
On 7/7/23 1:27 PM, Yu Zhao wrote:
> On Thu, Jul 6, 2023 at 12:21 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> This patchset avoids building changes added by commit bd74fdaea146 ("mm:
>> multi-gen LRU: support page table walks") on platforms that don't support
>> hardware atomic updates of access bits.
>>
>> Aneesh Kumar K.V (5):
>>   mm/mglru: Create a new helper iterate_mm_list_walk
>>   mm/mglru: Move Bloom filter code around
>>   mm/mglru: Move code around to make future patch easy
>>   mm/mglru: move iterate_mm_list_walk Helper
>>   mm/mglru: Don't build multi-gen LRU page table walk code on
>>     architecture not supported
>>
>>  arch/Kconfig               |   3 +
>>  arch/arm64/Kconfig         |   1 +
>>  arch/x86/Kconfig           |   1 +
>>  include/linux/memcontrol.h |   2 +-
>>  include/linux/mm_types.h   |  10 +-
>>  include/linux/mmzone.h     |  12 +-
>>  kernel/fork.c              |   2 +-
>>  mm/memcontrol.c            |   2 +-
>>  mm/vmscan.c                | 955 +++++++++++++++++++------------------
>>  9 files changed, 528 insertions(+), 460 deletions(-)
>
> 1. There is no need for a new Kconfig -- the condition is simply
>    defined(CONFIG_LRU_GEN) && !defined(arch_has_hw_pte_young)
>
> 2. The best practice to disable static functions is not by macros but:
>
>     static int const_cond(void)
>     {
>             return 1;
>     }
>
>     int main(void)
>     {
>             int a = const_cond();
>
>             if (a)
>                     return 0;
>
>             /* the compiler doesn't generate code for static funcs below */
>             static_func_1();
>             ...
>             static_func_N();
>     }
>
> LTO also optimizes external functions. But not everyone uses it. So we
> still need macros for them, and of course data structures.
>
> 3.
> In 4/5, you have:
>
> @@ -461,6 +461,7 @@ enum {
>  struct lru_gen_mm_state {
>  	/* set to max_seq after each iteration */
>  	unsigned long seq;
> +#ifdef CONFIG_LRU_TASK_PAGE_AGING
>  	/* where the current iteration continues after */
>  	struct list_head *head;
>  	/* where the last iteration ended before */
> @@ -469,6 +470,11 @@ struct lru_gen_mm_state {
>  	unsigned long *filters[NR_BLOOM_FILTERS];
>  	/* the mm stats for debugging */
>  	unsigned long stats[NR_HIST_GENS][NR_MM_STATS];
> +#else
> +	/* protect the seq update above */
> +	/* May be we can use lruvec->lock? */
> +	spinlock_t lock;
> +#endif
>  };
>
> The answer is yes, and not only that, we don't need lru_gen_mm_state at all.
>
> I'm attaching a patch that fixes all above. If you want to post it,
> please feel free -- fully test it please, since I didn't. Otherwise I
> can ask TJ to help make this work for you.
>
> $ git diff --stat
>  include/linux/memcontrol.h |   2 +-
>  include/linux/mm_types.h   |  12 +-
>  include/linux/mmzone.h     |   2 +
>  kernel/bounds.c            |   6 +-
>  kernel/fork.c              |   2 +-
>  mm/vmscan.c                | 169 +++++++++++++++++++--------
>  6 files changed, 137 insertions(+), 56 deletions(-)
>
> On x86:
>
> $ ./scripts/bloat-o-meter mm/vmscan.o.old mm/vmscan.o
> add/remove: 24/34 grow/shrink: 2/7 up/down: 966/-8716 (-7750)
> Function                                     old     new   delta
> ...
> should_skip_vma                              206       -    -206
> get_pte_pfn                                  261       -    -261
> lru_gen_add_mm                               323       -    -323
> lru_gen_seq_show                            1710    1370    -340
> lru_gen_del_mm                               432       -    -432
> reset_batch_size                             572       -    -572
> try_to_inc_max_seq                          2947    1635   -1312
> walk_pmd_range_locked                       1508       -   -1508
> walk_pud_range                              3238       -   -3238
> Total: Before=99449, After=91699, chg -7.79%
>
> $ objdump -S mm/vmscan.o | grep -A 20 "<inc_max_seq>:"
> 000000000000a350 <inc_max_seq>:
> {
>     a350:	e8 00 00 00 00       	call   a355
>     a355:	55                   	push   %rbp
>     a356:	48 89 e5             	mov    %rsp,%rbp
>     a359:	41 57                	push   %r15
>     a35b:	41 56                	push   %r14
>     a35d:	41 55                	push   %r13
>     a35f:	41 54                	push   %r12
>     a361:	53                   	push   %rbx
>     a362:	48 83 ec 70          	sub    $0x70,%rsp
>     a366:	41 89 d4             	mov    %edx,%r12d
>     a369:	49 89 f6             	mov    %rsi,%r14
>     a36c:	49 89 ff             	mov    %rdi,%r15
> 	spin_lock_irq(&lruvec->lru_lock);
>     a36f:	48 8d 5f 50          	lea    0x50(%rdi),%rbx
>     a373:	48 89 df             	mov    %rbx,%rdi
>     a376:	e8 00 00 00 00       	call   a37b
> 	success = max_seq == lrugen->max_seq;
>     a37b:	49 8b 87 88 00 00 00 	mov    0x88(%r15),%rax
>     a382:	4c 39 f0             	cmp    %r14,%rax

For the below diff:

@@ -4497,14 +4547,16 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq,
 	struct lru_gen_mm_walk *walk;
 	struct mm_struct *mm = NULL;
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
+	struct lru_gen_mm_state *mm_state = get_mm_state(lruvec);
 
 	VM_WARN_ON_ONCE(max_seq > READ_ONCE(lrugen->max_seq));
 
+	if (!mm_state)
+		return inc_max_seq(lruvec, max_seq, can_swap, force_scan);
+
 	/* see the comment in iterate_mm_list() */
-	if (max_seq <= READ_ONCE(lruvec->mm_state.seq)) {
-		success = false;
-		goto done;
-	}
+	if (max_seq <= READ_ONCE(mm_state->seq))
+		return false;
 
 	/*
 	 * If the hardware doesn't automatically set the accessed bit, fallback
@@ -4534,8 +4586,10 @@ static bool try_to_inc_max_seq(struct lruvec *lruvec, unsigned long max_seq,
 		walk_mm(lruvec, mm, walk);
 	} while (mm);
 done:
-	if (success)
-		inc_max_seq(lruvec, can_swap, force_scan);
+	if (success) {
+		success = inc_max_seq(lruvec, max_seq, can_swap, force_scan);
+		WARN_ON_ONCE(!success);
+	}
 
 	return success;
 }

We did discuss a possible race that can happen if we allow multiple
callers to hit inc_max_seq at the same time. inc_max_seq drops the
lru_lock and restarts the loop at the previous value of type. i.e., if
we want to do the above, we might also need the below?

modified   mm/vmscan.c
@@ -4368,6 +4368,7 @@ void inc_max_seq(struct lruvec *lruvec, bool can_swap, bool force_scan)
 	int type, zone;
 	struct lru_gen_struct *lrugen = &lruvec->lrugen;
 
+retry:
 	spin_lock_irq(&lruvec->lru_lock);
 
 	VM_WARN_ON_ONCE(!seq_is_valid(lruvec));
@@ -4381,7 +4382,7 @@ void inc_max_seq(struct lruvec *lruvec, bool can_swap, bool force_scan)
 		while (!inc_min_seq(lruvec, type, can_swap)) {
 			spin_unlock_irq(&lruvec->lru_lock);
 			cond_resched();
-			spin_lock_irq(&lruvec->lru_lock);
+			goto retry;
 		}
 	}

I also found that allowing only one CPU to increment the max seq value,
and making other requests with the same max_seq return false, is also
useful in performance runs. i.e., we need an equivalent of this?

+	if (max_seq <= READ_ONCE(mm_state->seq))
+		return false;