All of lore.kernel.org
 help / color / mirror / Atom feed
From: gaoxu <gaoxu2@honor.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"surenb@google.com" <surenb@google.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	yipengxiang <yipengxiang@honor.com>
Subject: 回复: [PATCH] mm: simplify zone_idx()
Date: Sat, 12 Apr 2025 10:06:43 +0000	[thread overview]
Message-ID: <c915776e308f49e7867ecb50afa44d36@honor.com> (raw)
In-Reply-To: <CAGsJ_4xDc_5q8dBYVq-Ga0iKJD9pTQdYSHrKw8R=1RHNb4+r7Q@mail.gmail.com>

> 
> On Sat, Apr 12, 2025 at 8:34 PM gaoxu <gaoxu2@honor.com> wrote:
> >
> > >
> > > On Fri, Apr 11, 2025 at 2:42 AM Mike Rapoport <rppt@kernel.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Thu, Apr 10, 2025 at 12:03:00PM +0000, gaoxu wrote:
> > > > > store zone_idx directly in struct zone to simplify and optimize
> > > > > zone_idx()
> > > >
> > > > Do you see an actual speed up somewhere?
> > Almost negligible. my simple code tests showed the patch provides an average
> improvement of ~0.02%.
> > Thus, in the Android 15-6.6 kernel, I confidently retained the original zone_idx
> function.
> > (https://android-review.googlesource.com/c/kernel/common/+/3578322/2/m
> > m/page_alloc.c#770)
> >
> > This patch only eliminates 2-3 assembly instructions, making it
> > challenging to observe measurable performance benefits.
> > However, since the zone struct includes CACHELINE_PADDING (reserving
> > unused space), adding a new member variable does not alter the size of
> > zone. This makes the patch effectively zero-cost while achieving a cleaner
> implementation of zone_idx.
> 
> The struct zone contains many CONFIG_ options to include or exclude certain
> fields.
> If we're confident that our new zone_idx doesn't increase cacheline usage for all
> those cases, this seems like a worthwhile cleanup. Have you checked the
> numbers?

The zone info obtained through T32 in the Android 15-6.6 system(arm64):
(struct zone) struct (1664 bytes,
                [0] unsigned long [4] _watermark,
                [32] unsigned long watermark_boost,
                [40] unsigned long nr_reserved_highatomic,
                [48] long [5] lowmem_reserve,
                [88] struct pglist_data * zone_pgdat,
                [96] struct per_cpu_pages * per_cpu_pageset,
                [104] struct per_cpu_zonestat * per_cpu_zonestats,
                [112] int pageset_high,
                [116] int pageset_batch,
                [120] unsigned long zone_start_pfn,
                [128] atomic_long_t managed_pages,
                [136] unsigned long spanned_pages,
                [144] unsigned long present_pages,
                [152] unsigned long present_early_pages,
                [160] unsigned long cma_pages,
                [168] const char * name,
                [176] unsigned long nr_isolate_pageblock,
                [184] seqlock_t span_seqlock,
                [192] int order,
                [196] int initialized,
                [256] struct cacheline_padding _pad1_,
                [256] struct free_area [11] free_area,
                [1400] unsigned long flags,
                [1408] spinlock_t lock,
                [1472] struct cacheline_padding _pad2_,
                [1472] unsigned long percpu_drift_mark,
                [1480] unsigned long compact_cached_free_pfn,
                [1488] unsigned long [2] compact_cached_migrate_pfn,
                [1504] unsigned long compact_init_migrate_pfn,
                [1512] unsigned long compact_init_free_pfn,
                [1520] unsigned int compact_considered,
                [1524] unsigned int compact_defer_shift,
                [1528] int compact_order_failed,
                [1532] bool compact_blockskip_flush,
                [1533] bool contiguous,
                [1536] struct cacheline_padding _pad3_,
                [1536] atomic_long_t [11] vm_stat,
                [1624] atomic_long_t [0] vm_numa_event)

1) It can be observed that there are 56B of free space in CACHELINE_PADDING(pad1);
2) Before the variables in CACHELINE_PADDING(pad1), there are two CONFIGs that are not enabled in Android 15-6.6:
	#ifdef CONFIG_NUMA
		int node;
	#endif

	#ifndef CONFIG_SPARSEMEM
		unsigned long *pageblock_flags;
	#endif /* CONFIG_SPARSEMEM */
	These two CONFIGs occupy 16B.
3) Compared to the latest Linux code, two new variables, unsigned long nr_free_highatomic and int pageset_high_max, occupy an additional 16B;
Based on the above analysis, there are still 24B of free space before CACHELINE_PADDING(pad1).
(If I misunderstand, please point it out in a timely manner. Thank you!)

It would be more appropriate to place the newly added variable zone_idx before initialized.
> 
> > >
> > > +1. Curious if there's data indicating zone_idx is a hot path.
> > There are several functions in the memory management code that are
> > frequently executed and will call zone_idx:
> > rmqueue()->wakeup_kswapd()->zone_idx()
> > alloc_pages_bulk_noprof()->__count_zid_vm_events()->zone_idx()
> >
> > The patch
> > (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.co
> > m/) will add new hotspot paths, with the details as follows:
> > __zone_watermark_ok()->zone_is_suitable()->zone_idx()
> > zone_watermark_fast()->zone_is_suitable()->zone_idx()
> > get_page_from_freelist()->zone_is_suitable()->zone_idx()
> > __free_one_page()->zone_max_order()->zone_idx()
> >
> > Although The patch
> > (https://lore.kernel.org/all/20240229183436.4110845-2-yuzhao@google.co
> > m/) has not yet merged into the Linux mainline; it is already included
> > in Android 15-6.6.
> > >
> >
> > > >
> > > > > Signed-off-by: gao xu <gaoxu2@honor.com>
> > > > > ---
> > > > >  include/linux/mmzone.h | 3 ++-
> > > > >  mm/mm_init.c           | 1 +
> > > > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > > > > index 4c95fcc9e..7b14f577d 100644
> > > > > --- a/include/linux/mmzone.h
> > > > > +++ b/include/linux/mmzone.h
> > > > > @@ -941,6 +941,7 @@ struct zone {  #endif
> > > > >
> > > > >       const char              *name;
> > > > > +     enum zone_type  zone_idx;
> > > > >
> > > > >  #ifdef CONFIG_MEMORY_ISOLATION
> > > > >       /*
> > > > > @@ -1536,7 +1537,7 @@ static inline int local_memory_node(int
> > > > > node_id)
> > > { return node_id; };
> > > > >  /*
> > > > >   * zone_idx() returns 0 for the ZONE_DMA zone, 1 for the
> > > > > ZONE_NORMAL
> > > zone, etc.
> > > > >   */
> > > > > -#define zone_idx(zone)               ((zone) -
> > > (zone)->zone_pgdat->node_zones)
> > > > > +#define zone_idx(zone)               ((zone)->zone_idx)
> > > > >
> > > > >  #ifdef CONFIG_ZONE_DEVICE
> > > > >  static inline bool zone_is_zone_device(struct zone *zone) diff
> > > > > --git a/mm/mm_init.c b/mm/mm_init.c index 9659689b8..a7f7264f1
> > > > > 100644
> > > > > --- a/mm/mm_init.c
> > > > > +++ b/mm/mm_init.c
> > > > > @@ -1425,6 +1425,7 @@ static void __meminit
> > > > > zone_init_internals(struct
> > > zone *zone, enum zone_type idx,
> > > > >       atomic_long_set(&zone->managed_pages, remaining_pages);
> > > > >       zone_set_nid(zone, nid);
> > > > >       zone->name = zone_names[idx];
> > > > > +     zone->zone_idx = idx;
> > > > >       zone->zone_pgdat = NODE_DATA(nid);
> > > > >       spin_lock_init(&zone->lock);
> > > > >       zone_seqlock_init(zone);
> > > > > --
> > > > > 2.17.1
> > > >
> > > > --
> > > > Sincerely yours,
> > > > Mike.
> > >
> 
> Thanks
> Barry

  reply	other threads:[~2025-04-12 10:06 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-10 12:03 [PATCH] mm: simplify zone_idx() gaoxu
2025-04-10 14:41 ` Mike Rapoport
2025-04-10 21:59   ` Barry Song
2025-04-12  8:34     ` 回复: " gaoxu
2025-04-12  9:15       ` Barry Song
2025-04-12 10:06         ` gaoxu [this message]
2025-04-13 21:57           ` Barry Song
2025-04-13 23:20             ` Matthew Wilcox
2025-04-15 12:34               ` gaoxu
2025-04-15 14:49                 ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c915776e308f49e7867ecb50afa44d36@honor.com \
    --to=gaoxu2@honor.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rppt@kernel.org \
    --cc=surenb@google.com \
    --cc=yipengxiang@honor.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.