From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 6 Apr 2026 17:37:43 -0400
From: Johannes Weiner <hannes@cmpxchg.org>
To: "Lorenzo Stoakes (Oracle)"
Cc: Andrew Morton, David Hildenbrand, Shakeel Butt, Yosry Ahmed, Zi Yan,
 "Liam R. Howlett", Usama Arif, Kiryl Shutsemau, Dave Chinner,
 Roman Gushchin, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 7/7] mm: switch deferred split shrinker to list_lru
References: <20260318200352.1039011-1-hannes@cmpxchg.org>
 <20260318200352.1039011-8-hannes@cmpxchg.org>
 <0cf8a859-b142-4e53-9113-94872dd68f40@lucifer.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Apr 01, 2026 at 06:33:04PM +0100, Lorenzo Stoakes (Oracle) wrote:
> On Mon, Mar 30, 2026 at 12:40:22PM -0400, Johannes Weiner wrote:
> > > > @@ -414,10 +414,9 @@ static inline int split_huge_page(struct page *page)
> > > > {
> > > > 	return split_huge_page_to_list_to_order(page, NULL, 0);
> > > > }
> > > > +
> > > > +extern struct list_lru deferred_split_lru;
> > >
> > > It might be nice for the sake of avoiding a global to instead expose this
> > > as a getter?
> > >
> > > Or actually better, since every caller outside of huge_memory.c that
> > > references this uses folio_memcg_list_lru_alloc(), do something like:
> > >
> > > int folio_memcg_alloc_deferred(struct folio *folio, gfp_t gfp);
> > >
> > > in mm/huge_memory.c:
> > >
> > > /**
> > >  * blah blah blah put on error blah
> > >  */
> > > int folio_memcg_alloc_deferred(struct folio *folio, gfp_t gfp)
> > > {
> > > 	int err;
> > >
> > > 	err = folio_memcg_list_lru_alloc(folio, &deferred_split_lru, gfp);
> > > 	if (err) {
> > > 		folio_put(folio);
> > > 		return err;
> > > 	}
> > >
> > > 	return 0;
> > > }
> > >
> > > And then the callers can just invoke this, and you can make
> > > deferred_split_lru static in mm/huge_memory.c?
> >
> > That sounds reasonable. Let me make this change.
>
> Thanks!

Done. This looks much nicer. Though I kept the folio_put() in the
caller, because that's who owns the reference.
It would be quite unexpected for this one to consume a ref on error.

> > > > @@ -939,6 +949,7 @@ static int __init thp_shrinker_init(void)
> > > >
> > > > 	huge_zero_folio_shrinker = shrinker_alloc(0, "thp-zero");
> > > > 	if (!huge_zero_folio_shrinker) {
> > > > +		list_lru_destroy(&deferred_split_lru);
> > > > 		shrinker_free(deferred_split_shrinker);
> > >
> > > Presumably no probably-impossible-in-reality race on somebody entering the
> > > shrinker and referencing the deferred_split_lru before the shrinker is freed?
> >
> > Ah right, I think for clarity it would indeed be better to destroy the
> > shrinker, then the queue. Let me re-order this one.
> >
> > But yes, in practice, none of the above fails. If we have trouble
> > doing a couple of small kmallocs during a subsys_initcall(), that
> > machine is unlikely to finish booting, let alone allocate enough
> > memory to enter the THP shrinker.
>
> Yeah I thought that might be the case, but it seems more logical killing the
> shrinker first, thanks!

Done.

> > > > @@ -3854,34 +3761,34 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> > > > 	struct folio *end_folio = folio_next(folio);
> > > > 	struct folio *new_folio, *next;
> > > > 	int old_order = folio_order(folio);
> > > > +	struct list_lru_one *l;
> > >
> > > Nit, and maybe this is a convention, but I hate single-letter variable names;
> > > 'lru' or something might be nicer?
> >
> > Yeah I stuck with the list_lru internal naming, which uses `lru` for
> > the struct list_lru, and `l` for struct list_lru_one. I suppose that
> > was fine for the very domain-specific code and short functions in
> > there, but it's grating in large, general MM functions like these.
> >
> > Since `lru` is taken, any preferences? llo?
>
> ljs? ;)
>
> Could be list?

list is taken in some of these contexts already.

I may have overthought this. lru works fine in those callsites, and is
in line with what other sites are using (git grep list_lru_one).
> But, and I _know_ it's nitty, sorry, but maybe worth expanding that comment to
> explain that e.g. 'we must take the folio lock prior to the list_lru lock to
> avoid racing with deferred_split_scan() in accessing the folio reference count'
> or similar?

Good idea! Done.

> > > > +	int nid = folio_nid(folio);
> > > > 	unsigned long flags;
> > > > 	bool unqueued = false;
> > > >
> > > > 	WARN_ON_ONCE(folio_ref_count(folio));
> > > > 	WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg_charged(folio));
> > > >
> > > > -	ds_queue = folio_split_queue_lock_irqsave(folio, &flags);
> > > > -	if (!list_empty(&folio->_deferred_list)) {
> > > > -		ds_queue->split_queue_len--;
> > > > +	rcu_read_lock();
> > > > +	l = list_lru_lock_irqsave(&deferred_split_lru, nid, folio_memcg(folio), &flags);
> > > > +	if (__list_lru_del(&deferred_split_lru, l, &folio->_deferred_list, nid)) {
> > >
> > > Maybe worth factoring __list_lru_del() into something that explicitly
> > > references &folio->_deferred_list rather than open coding in both places?
> >
> > Hm, I wouldn't want to encode this into the list_lru API, but we could do
> > a huge_memory.c-local helper?
> >
> > folio_deferred_split_del(folio, l, nid)
>
> Well, I kind of hate how we're using the global deferred_split_lru all over the
> place, so a helper would be preferable, but one that could also be used for
> khugepaged.c and memory.c?

This function is used only in huge_memory.c. I managed to make
deferred_split_lru static as well without making any changes to this
particular function/callsite.

Let me know, after looking at the delta diff below, if you'd still
like to see changes here.

> > > > @@ -4534,64 +4438,32 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
> > > > 	}
> > > > 	folio_unlock(folio);
> > > > next:
> > > > -	if (did_split || !folio_test_partially_mapped(folio))
> > > > -		continue;
> > > > 	/*
> > > > 	 * Only add back to the queue if folio is partially mapped.
> > > > 	 * If thp_underused returns false, or if split_folio fails
> > > > 	 * in the case it was underused, then consider it used and
> > > > 	 * don't add it back to split_queue.
> > > > 	 */
> > > > +	if (!did_split && folio_test_partially_mapped(folio)) {
> > > > +		rcu_read_lock();
> > > > +		l = list_lru_lock_irqsave(&deferred_split_lru,
> > > > +					  folio_nid(folio),
> > > > +					  folio_memcg(folio),
> > > > +					  &flags);
> > > > +		__list_lru_add(&deferred_split_lru, l,
> > > > +			       &folio->_deferred_list,
> > > > +			       folio_nid(folio), folio_memcg(folio));
> > > > +		list_lru_unlock_irqrestore(l, &flags);
> > >
> > > Hmm this does make me think it'd be nice to have a list_lru_add() variant
> > > for irqsave/restore then, since it's a repeating pattern!
> >
> > Yeah, this site calls for it the most :( I tried to balance callsite
> > prettiness with the need to extend the list_lru API; it's just one
> > caller. And the possible mutations and variants with these locks are
> > seemingly endless once you open that can of worms...
>
> True...
>
> > Case in point: this is process context and we could use
> > spin_lock_irq() here. I'm just using list_lru_lock_irqsave() because
> > that's the common variant used by the add and del paths already.
> >
> > If I went with a helper, I could do list_lru_add_irq().
> >
> > I think it would actually nicely mirror the list_lru_shrink_walk_irq()
> > a few lines up.
>
> Yeah, I mean I'm pretty sure this repeats quite a few times so is worthy of a
> helper.

It's only one callsite, actually. But I added the helper. It's churny
on the list_lru side, but that callsite does look much better.

Anyway, I hope I got everything. Can you take a look?
Will obviously fold this into the respective patches, but just
double-checking whether these things are what you had in mind.

---

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 8d801ed378db..b473605b4d7d 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -415,7 +415,8 @@ static inline int split_huge_page(struct page *page)
 	return split_huge_page_to_list_to_order(page, NULL, 0);
 }
 
-extern struct list_lru deferred_split_lru;
+int folio_memcg_alloc_deferred(struct folio *folio);
+
 void deferred_split_folio(struct folio *folio, bool partially_mapped);
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 4bd29b61c59a..733a262b91e5 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -83,6 +83,21 @@ int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 			 gfp_t gfp);
 
 #ifdef CONFIG_MEMCG
+/**
+ * folio_memcg_list_lru_alloc - allocate list_lru heads for shrinkable folio
+ * @folio: the newly allocated & charged folio
+ * @lru: the list_lru this might be queued on
+ * @gfp: gfp mask
+ *
+ * Allocate list_lru heads (per-memcg, per-node) needed to queue this
+ * particular folio down the line.
+ *
+ * This does memcg_list_lru_alloc(), but on the memcg that @folio is
+ * associated with. Handles folio_memcg() access rules in the fast
+ * path (list_lru heads allocated) and the allocation slowpath.
+ *
+ * Returns 0 on success, a negative error value otherwise.
+ */
 int folio_memcg_list_lru_alloc(struct folio *folio, struct list_lru *lru,
 			       gfp_t gfp);
 #else
@@ -118,6 +133,10 @@ struct list_lru_one *list_lru_lock(struct list_lru *lru, int nid,
  */
 void list_lru_unlock(struct list_lru_one *l);
 
+struct list_lru_one *list_lru_lock_irq(struct list_lru *lru, int nid,
+				       struct mem_cgroup *memcg);
+void list_lru_unlock_irq(struct list_lru_one *l);
+
 struct list_lru_one *list_lru_lock_irqsave(struct list_lru *lru, int nid,
 		struct mem_cgroup *memcg, unsigned long *irq_flags);
 void list_lru_unlock_irqrestore(struct list_lru_one *l,
@@ -161,6 +180,9 @@ bool __list_lru_del(struct list_lru *lru, struct list_lru_one *l,
 bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
 		  struct mem_cgroup *memcg);
 
+bool list_lru_add_irq(struct list_lru *lru, struct list_head *item, int nid,
+		      struct mem_cgroup *memcg);
+
 /**
  * list_lru_add_obj: add an element to the lru list's tail
  * @lru: the lru pointer
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c8c6c4602cc7..a0cce6a56620 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -69,7 +69,7 @@ unsigned long transparent_hugepage_flags __read_mostly = (1<
-	/* Prevent deferred_split_scan() touching ->_refcount */
+	/*
+	 * If this folio can be on the deferred split queue, lock out
+	 * the shrinker before freezing the ref. If the shrinker sees
+	 * a 0-ref folio, it assumes it beat folio_put() to the list
+	 * lock and must clean up the LRU state - the same dequeue we
+	 * will do below as part of the split.
+	 */
 	dequeue_deferred = folio_test_anon(folio) && old_order > 1;
 	if (dequeue_deferred) {
 		rcu_read_lock();
-		l = list_lru_lock(&deferred_split_lru,
-				  folio_nid(folio), folio_memcg(folio));
+		lru = list_lru_lock(&deferred_split_lru,
+				    folio_nid(folio), folio_memcg(folio));
 	}
 
 	if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
 		struct swap_cluster_info *ci = NULL;
 		struct lruvec *lruvec;
 
 		if (dequeue_deferred) {
-			__list_lru_del(&deferred_split_lru, l,
+			__list_lru_del(&deferred_split_lru, lru,
 				       &folio->_deferred_list,
 				       folio_nid(folio));
 			if (folio_test_partially_mapped(folio)) {
 				folio_clear_partially_mapped(folio);
 				mod_mthp_stat(old_order,
 					      MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 			}
-			list_lru_unlock(l);
+			list_lru_unlock(lru);
 			rcu_read_unlock();
 		}
@@ -3890,7 +3901,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 			swap_cluster_unlock(ci);
 	} else {
 		if (dequeue_deferred) {
-			list_lru_unlock(l);
+			list_lru_unlock(lru);
 			rcu_read_unlock();
 		}
 		return -EAGAIN;
@@ -4268,7 +4279,7 @@ int split_folio_to_list(struct folio *folio, struct list_head *list)
  */
 bool __folio_unqueue_deferred_split(struct folio *folio)
 {
-	struct list_lru_one *l;
+	struct list_lru_one *lru;
 	int nid = folio_nid(folio);
 	unsigned long flags;
 	bool unqueued = false;
@@ -4277,8 +4288,8 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
 	WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg_charged(folio));
 
 	rcu_read_lock();
-	l = list_lru_lock_irqsave(&deferred_split_lru, nid, folio_memcg(folio), &flags);
-	if (__list_lru_del(&deferred_split_lru, l, &folio->_deferred_list, nid)) {
+	lru = list_lru_lock_irqsave(&deferred_split_lru, nid, folio_memcg(folio), &flags);
+	if (__list_lru_del(&deferred_split_lru, lru, &folio->_deferred_list, nid)) {
 		if (folio_test_partially_mapped(folio)) {
 			folio_clear_partially_mapped(folio);
 			mod_mthp_stat(folio_order(folio),
@@ -4286,7 +4297,7 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
 		}
 		unqueued = true;
 	}
-	list_lru_unlock_irqrestore(l, &flags);
+	list_lru_unlock_irqrestore(lru, &flags);
 	rcu_read_unlock();
 
 	return unqueued;	/* useful for debug warnings */
@@ -4295,7 +4306,7 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
 /* partially_mapped=false won't clear PG_partially_mapped folio flag */
 void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
-	struct list_lru_one *l;
+	struct list_lru_one *lru;
 	int nid;
 	struct mem_cgroup *memcg;
 	unsigned long flags;
@@ -4324,7 +4335,7 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 
 	rcu_read_lock();
 	memcg = folio_memcg(folio);
-	l = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
+	lru = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
 	if (partially_mapped) {
 		if (!folio_test_partially_mapped(folio)) {
 			folio_set_partially_mapped(folio);
@@ -4337,8 +4348,8 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 		/* partially mapped folios cannot become non-partially mapped */
 		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
 	}
-	__list_lru_add(&deferred_split_lru, l, &folio->_deferred_list, nid, memcg);
-	list_lru_unlock_irqrestore(l, &flags);
+	__list_lru_add(&deferred_split_lru, lru, &folio->_deferred_list, nid, memcg);
+	list_lru_unlock_irqrestore(lru, &flags);
 	rcu_read_unlock();
 }
@@ -4411,8 +4422,6 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 	list_for_each_entry_safe(folio, next, &dispose, _deferred_list) {
 		bool did_split = false;
 		bool underused = false;
-		struct list_lru_one *l;
-		unsigned long flags;
 
 		list_del_init(&folio->_deferred_list);
@@ -4446,14 +4455,10 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 		 */
 		if (!did_split && folio_test_partially_mapped(folio)) {
 			rcu_read_lock();
-			l = list_lru_lock_irqsave(&deferred_split_lru,
-						  folio_nid(folio),
-						  folio_memcg(folio),
-						  &flags);
-			__list_lru_add(&deferred_split_lru, l,
-				       &folio->_deferred_list,
-				       folio_nid(folio), folio_memcg(folio));
-			list_lru_unlock_irqrestore(l, &flags);
+			list_lru_add_irq(&deferred_split_lru,
+					 &folio->_deferred_list,
+					 folio_nid(folio),
+					 folio_memcg(folio));
 			rcu_read_unlock();
 		}
 		folio_put(folio);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a81470f529e3..44a9b1350dbd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1121,7 +1121,7 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long a
 	if (result != SCAN_SUCCEED)
 		goto out_nolock;
 
-	if (folio_memcg_list_lru_alloc(folio, &deferred_split_lru, GFP_KERNEL))
+	if (folio_memcg_alloc_deferred(folio))
 		goto out_nolock;
 
 	mmap_read_lock(mm);
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 1ccdd45b1d14..23bf7c243083 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -160,6 +160,18 @@ void list_lru_unlock(struct list_lru_one *l)
 	unlock_list_lru(l, /*irq_off=*/false, /*irq_flags=*/NULL);
 }
 
+struct list_lru_one *list_lru_lock_irq(struct list_lru *lru, int nid,
+				       struct mem_cgroup *memcg)
+{
+	return lock_list_lru_of_memcg(lru, nid, memcg, /*irq=*/true,
+				      /*irq_flags=*/NULL, /*skip_empty=*/false);
+}
+
+void list_lru_unlock_irq(struct list_lru_one *l)
+{
+	unlock_list_lru(l, /*irq_off=*/true, /*irq_flags=*/NULL);
+}
+
 struct list_lru_one *list_lru_lock_irqsave(struct list_lru *lru, int nid,
 					   struct mem_cgroup *memcg,
 					   unsigned long *flags)
@@ -213,6 +225,18 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
 	return ret;
 }
 
+bool list_lru_add_irq(struct list_lru *lru, struct list_head *item,
+		      int nid, struct mem_cgroup *memcg)
+{
+	struct list_lru_one *l;
+	bool ret;
+
+	l = list_lru_lock_irq(lru, nid, memcg);
+	ret = __list_lru_add(lru, l, item, nid, memcg);
+	list_lru_unlock_irq(l);
+	return ret;
+}
+
 bool list_lru_add_obj(struct list_lru *lru, struct list_head *item)
 {
 	bool ret;
diff --git a/mm/memory.c b/mm/memory.c
index 24dd531125b4..23da4720576d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4658,8 +4658,7 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
 			folio_put(folio);
 			goto next;
 		}
-		if (order > 1 &&
-		    folio_memcg_list_lru_alloc(folio, &deferred_split_lru, GFP_KERNEL)) {
+		if (order > 1 && folio_memcg_alloc_deferred(folio)) {
 			folio_put(folio);
 			goto fallback;
 		}
@@ -5183,8 +5182,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 			folio_put(folio);
 			goto next;
 		}
-		if (order > 1 &&
-		    folio_memcg_list_lru_alloc(folio, &deferred_split_lru, GFP_KERNEL)) {
+		if (order > 1 && folio_memcg_alloc_deferred(folio)) {
 			folio_put(folio);
 			goto fallback;
 		}