Date: Sun, 31 May 2026 08:00:52 +0000
From: Wei Yang
To: Johannes Weiner
Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Shakeel Butt,
 Michal Hocko, Dave Chinner, Roman Gushchin, Muchun Song, Qi Zheng,
 Yosry Ahmed, Zi Yan, "Liam R . Howlett", Usama Arif, Kiryl Shutsemau,
 Vlastimil Babka, Kairui Song, Mikhail Zaslonko, Vasily Gorbik,
 Baolin Wang, Barry Song, Dev Jain, Lance Yang, Nico Pache, Ryan Roberts,
 cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 9/9] mm: switch deferred split shrinker to list_lru
Message-ID: <20260531080052.guzobbwdvprrmger@master>
Reply-To: Wei Yang
References: <20260527204757.2544958-1-hannes@cmpxchg.org>
 <20260527204757.2544958-10-hannes@cmpxchg.org>
In-Reply-To: <20260527204757.2544958-10-hannes@cmpxchg.org>
Precedence: bulk
X-Mailing-List: cgroups@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: NeoMutt/20170113 (1.7.2)

On Wed, May 27, 2026 at 04:45:16PM -0400, Johannes Weiner wrote:
>The deferred split queue handles cgroups in a suboptimal fashion. The
>queue is per-NUMA node or per-cgroup, not the intersection.
>That means on a cgrouped system, a node-restricted allocation entering
>reclaim can end up splitting large pages on other nodes:
>
>    alloc/unmap
>      deferred_split_folio()
>        list_add_tail(memcg->split_queue)
>        set_shrinker_bit(memcg, node, deferred_shrinker_id)
>
>    for_each_zone_zonelist_nodemask(restricted_nodes)
>      mem_cgroup_iter()
>        shrink_slab(node, memcg)
>          shrink_slab_memcg(node, memcg)
>            if test_shrinker_bit(memcg, node, deferred_shrinker_id)
>              deferred_split_scan()
>                walks memcg->split_queue
>
>The shrinker bit adds an imperfect guard rail. As soon as the cgroup
>has a single large page on the node of interest, all large pages owned
>by that memcg, including those on other nodes, will be split.
>
>list_lru properly sets up per-node, per-cgroup lists. As a bonus, it
>streamlines a lot of the list operations and reclaim walks. It's used
>widely by other major shrinkers already. Convert the deferred split
>queue as well.
>
>The list_lru per-memcg heads are instantiated on demand when the first
>object of interest is allocated for a cgroup, by calling
>folio_memcg_alloc_deferred(). Add calls to where splittable pages are
>created: anon faults, swapin faults, khugepaged collapse.
>
>These calls create all possible node heads for the cgroup at once, so
>the migration code (between nodes) doesn't need any special care.
>
>Reported-by: Mikhail Zaslonko
>Tested-by: Mikhail Zaslonko
>Acked-by: Shakeel Butt
>Reviewed-by: Lorenzo Stoakes (Oracle)
>Signed-off-by: Johannes Weiner
>---
> include/linux/huge_mm.h    |   7 +-
> include/linux/memcontrol.h |   4 -
> include/linux/mmzone.h     |  12 --
> mm/huge_memory.c           | 364 +++++++++++++------------------------
> mm/internal.h              |   2 +-
> mm/khugepaged.c            |   5 +
> mm/memcontrol.c            |  12 +-
> mm/memory.c                |   4 +
> mm/mm_init.c               |  15 --
> mm/swap_state.c            |  10 +
> 10 files changed, 150 insertions(+), 285 deletions(-)
>
[...]
>@@ -1379,6 +1285,14 @@ static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
> 		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
> 		return NULL;
> 	}
>+
>+	if (folio_memcg_alloc_deferred(folio)) {
>+		folio_put(folio);
>+		count_vm_event(THP_FAULT_FALLBACK);
>+		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
>+		return NULL;
>+	}
>+

Nit: we now have three possible failure points here, with some duplicated
count_vm_event()/count_mthp_stat() calls. Maybe we can have a follow-up
cleanup for it.

Otherwise, looks good. Thanks.

-- 
Wei Yang
Help you, Help me
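
One possible shape for the follow-up cleanup mentioned in the nit is a single fallback label that does the common accounting. The following is a minimal userspace sketch under stated assumptions, not the kernel function: the counters, struct folio, struct fail_cfg and alloc_anon_folio_pmd_sketch() are stand-ins invented for illustration, and the allocation-failure path is assumed (only the charge and folio_memcg_alloc_deferred() failures are visible in the quoted hunk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel counters named in the hunk. */
enum { THP_FAULT_FALLBACK, THP_FAULT_FALLBACK_CHARGE, NR_VM_EVENTS };
enum { MTHP_STAT_ANON_FAULT_FALLBACK, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
       NR_MTHP_STATS };

static unsigned long vm_events[NR_VM_EVENTS];
static unsigned long mthp_stats[NR_MTHP_STATS];

struct folio { int refcount; };
static struct folio the_folio;

static void count_vm_event(int ev) { vm_events[ev]++; }
static void count_mthp_stat(int order, int stat) { (void)order; mthp_stats[stat]++; }
static void folio_put(struct folio *folio) { folio->refcount--; }

/* Hypothetical knobs to simulate each of the three failure points. */
struct fail_cfg { bool alloc, charge, deferred; };

static struct folio *alloc_anon_folio_pmd_sketch(const struct fail_cfg *cfg,
						 int order)
{
	struct folio *folio = NULL;

	assert(cfg != NULL);

	if (cfg->alloc)			/* assumed: allocation failure */
		goto fallback;
	folio = &the_folio;
	folio->refcount = 1;

	if (cfg->charge) {		/* memcg charge failure */
		count_vm_event(THP_FAULT_FALLBACK_CHARGE);
		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
		goto fallback;
	}

	if (cfg->deferred)		/* list_lru head setup failure */
		goto fallback;

	return folio;

fallback:
	/* Common accounting shared by all three failure points. */
	if (folio)
		folio_put(folio);
	count_vm_event(THP_FAULT_FALLBACK);
	count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
	return NULL;
}
```

Only the charge path keeps its extra *_CHARGE counters; everything else is accounted once at the label.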