Re: [PATCH] mm/sparse: Optimize section number calculations using bit shifts

The Linux Kernel Mailing List
 help / color / mirror / Atom feed

From: Mike Rapoport <rppt@kernel.org>
To: Zhen Ni <zhen.ni@easystack.cn>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Kairui Song <kasong@tencent.com>, Qi Zheng <qi.zheng@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Barry Song <baohua@kernel.org>,
	Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	"Liam R. Howlett" <liam@infradead.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/sparse: Optimize section number calculations using bit shifts
Date: Tue, 16 Jun 2026 09:32:03 +0300	[thread overview]
Message-ID: <ajDt4wHn4MWnNsdH@kernel.org> (raw)
In-Reply-To: <20260616025942.3572473-1-zhen.ni@easystack.cn>

Hi,

On Tue, Jun 16, 2026 at 10:59:42AM +0800, Zhen Ni wrote:
> Add SECTIONS_PER_ROOT_SHIFT = ilog2(SECTIONS_PER_ROOT) with correctness
> guaranteed by BUILD_BUG_ON in sparse_init(). Convert SECTION_NR_TO_ROOT
> to use right shift instead of division for better performance. Add
> SECTION_NR_IN_ROOT() macro to improve code readability.
> 
> This improves code efficiency in hot paths where __nr_to_section() is
> frequently called, such as sparse_init() and memory section management
> operations.
> 
> Performance verification in sparse_init() on ARM (8GB RAM, 4 NUMA nodes):
> 
>     sparse_init()
>     |
>     +----> memblocks_present()
>     |
>     +----> section initialization (sparse_init_nid loop)
> 
> Time measurement points:
> 
>     [T1] sparse_init start
>          |
>          v
>     [T2] memblocks_present() complete
>          |
>          v
>     [T3] sparse_init_nid() loop complete / sparse_init end
> 
> Measurement values:
>     memblocks_present_cycles = T2 - T1
>     section_initialization_cycles = T3 - T2
>     total_cycles = T3 - T1
> 
> Before (division):
> [    0.000000] sparse_init: total 7538 cycles
> [    0.000000]   memblocks_present: 4232 cycles
> [    0.000000]   section initialization: 3261 cycles
> 
> After (bit shift):
> [    0.000000] sparse_init: total 5641 cycles
> [    0.000000]   memblocks_present: 3562 cycles
> [    0.000000]   section initialization: 2057 cycles
> 
> Performance improvement:
>   Total: (7538-5641)/7538 = 25.2% faster
>   memblocks_present: (4232-3562)/4232 = 15.8% faster
>   section initialization: (3261-2057)/3261 = 36.9% faster

This is a nice improvement, but it's not the hot path. I believe you can
derive improvement to __nr_to_section() from these measurements.
 
> Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
> ---
>  include/linux/mmzone.h | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 9adb2ad21da5..5daf471f6823 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2035,11 +2035,14 @@ struct mem_section {
>  
>  #ifdef CONFIG_SPARSEMEM_EXTREME
>  #define SECTIONS_PER_ROOT       (PAGE_SIZE / sizeof (struct mem_section))
> +#define SECTIONS_PER_ROOT_SHIFT ilog2(SECTIONS_PER_ROOT)
>  #else
>  #define SECTIONS_PER_ROOT	1
> +#define SECTIONS_PER_ROOT_SHIFT 0
>  #endif
>  
> -#define SECTION_NR_TO_ROOT(sec)	((sec) / SECTIONS_PER_ROOT)
> +#define SECTION_NR_TO_ROOT(sec)	((sec) >> SECTIONS_PER_ROOT_SHIFT)
> +#define SECTION_NR_IN_ROOT(sec)	((sec) & SECTION_ROOT_MASK)
>  #define NR_SECTION_ROOTS	DIV_ROUND_UP(NR_MEM_SECTIONS, SECTIONS_PER_ROOT)
>  #define SECTION_ROOT_MASK	(SECTIONS_PER_ROOT - 1)
>  
> @@ -2065,7 +2068,7 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
>  	if (!mem_section || !mem_section[root])
>  		return NULL;
>  #endif
> -	return &mem_section[root][nr & SECTION_ROOT_MASK];
> +	return &mem_section[root][SECTION_NR_IN_ROOT(nr)];

The explicit masking is clearer IMO.

>  }
>  extern size_t mem_section_usage_size(void);

Hmm, I don't see BUILD_BUG_ON() you mention in the changelog.
 
> -- 
> 2.20.1
> 

-- 
Sincerely yours,
Mike.

next prev parent reply	other threads:[~2026-06-16  6:32 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16  2:59 [PATCH] mm/sparse: Optimize section number calculations using bit shifts Zhen Ni
2026-06-16  6:32 ` Mike Rapoport [this message]
     [not found]   ` <764b8fef-4e77-4daf-b2ba-45745061ade9@easystack.cn>
2026-06-16  7:56     ` Mike Rapoport
2026-06-16  8:06 ` David Hildenbrand (Arm)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajDt4wHn4MWnNsdH@kernel.org \
    --to=rppt@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=axelrasmussen@google.com \
    --cc=baohua@kernel.org \
    --cc=david@kernel.org \
    --cc=kasong@tencent.com \
    --cc=liam@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=mhocko@suse.com \
    --cc=qi.zheng@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=weixugc@google.com \
    --cc=yuanchu@google.com \
    --cc=zhen.ni@easystack.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox