linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Luiz Capitulino <luizcap@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	yuzhao@google.com, pasha.tatashin@soleen.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, muchun.song@linux.dev
Subject: Re: [PATCH v2 2/4] mm: page_ext: add an iteration API for page extensions
Date: Tue, 25 Feb 2025 17:39:39 +0100	[thread overview]
Message-ID: <3d45f4b1-83a3-4fb0-88cc-9cca5df26320@redhat.com> (raw)
In-Reply-To: <08aea5d87f5419f4c7033c81d97645f940f87f7e.1740434344.git.luizcap@redhat.com>

On 24.02.25 22:59, Luiz Capitulino wrote:
> The page extension implementation assumes that all page extensions of
> a given page order are stored in the same memory section. The function
> page_ext_next() relies on this assumption by adding an offset to the
> current object to return the next adjacent page extension.
> 
> This behavior works as expected for flatmem but fails for sparsemem when
> using 1G pages. Commit cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for
> gigantic folios") exposed this issue, making a crash possible when
> using the page_owner or page_table_check page extensions.
> 
> The problem is that for 1G pages, the page extensions may span memory
> section boundaries and be stored in different memory sections. This issue
> was not visible before commit cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP
> for gigantic folios") because alloc_contig_pages() never passed more than
> MAX_PAGE_ORDER to post_alloc_hook(). However, the series introducing the
> mentioned commit changed this behavior, allowing the full 1G page order
> to be passed.
> 
> Reproducer:
> 
>   1. Build the kernel with CONFIG_SPARSEMEM=y and page extensions
>      support
>   2. Pass 'default_hugepagesz=1G page_owner=on' in the kernel command line
>   3. Reserve one 1G page at run-time, this should crash (backtrace below)
> 
> To address this issue, this commit introduces a new API for iterating
> through page extensions. The main iteration macro is for_each_page_ext()
> and it must be called with the RCU read lock held. Here's a usage
> example:
> 
> """
> struct page_ext_iter iter;
> struct page_ext *page_ext;
> 
> ...
> 
> rcu_read_lock();
> for_each_page_ext(page, 1 << order, page_ext, iter) {
> 	struct my_page_ext *obj = get_my_page_ext_obj(page_ext);
> 	...
> }
> rcu_read_unlock();
> """
> 
> The loop construct uses page_ext_iter_next() which checks to see if we
> have crossed sections in the iteration. In this case, page_ext_iter_next()
> retrieves the next page_ext object from another section.
> 
> Thanks to David Hildenbrand for helping identify the root cause and
> providing suggestions on how to fix and optimize the solution (the final
> implementation and any bugs are all mine, though).
> 
> Lastly, here's the backtrace; without KASAN you can get random crashes:
> 
> [   76.052526] BUG: KASAN: slab-out-of-bounds in __update_page_owner_handle+0x238/0x298
> [   76.060283] Write of size 4 at addr ffff07ff96240038 by task tee/3598
> [   76.066714]
> [   76.068203] CPU: 88 UID: 0 PID: 3598 Comm: tee Kdump: loaded Not tainted 6.13.0-rep1 #3
> [   76.076202] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 2.10.20220810 (SCP: 2.10.20220810) 2022/08/10
> [   76.088972] Call trace:
> [   76.091411]  show_stack+0x20/0x38 (C)
> [   76.095073]  dump_stack_lvl+0x80/0xf8
> [   76.098733]  print_address_description.constprop.0+0x88/0x398
> [   76.104476]  print_report+0xa8/0x278
> [   76.108041]  kasan_report+0xa8/0xf8
> [   76.111520]  __asan_report_store4_noabort+0x20/0x30
> [   76.116391]  __update_page_owner_handle+0x238/0x298
> [   76.121259]  __set_page_owner+0xdc/0x140
> [   76.125173]  post_alloc_hook+0x190/0x1d8
> [   76.129090]  alloc_contig_range_noprof+0x54c/0x890
> [   76.133874]  alloc_contig_pages_noprof+0x35c/0x4a8
> [   76.138656]  alloc_gigantic_folio.isra.0+0x2c0/0x368
> [   76.143616]  only_alloc_fresh_hugetlb_folio.isra.0+0x24/0x150
> [   76.149353]  alloc_pool_huge_folio+0x11c/0x1f8
> [   76.153787]  set_max_huge_pages+0x364/0xca8
> [   76.157961]  __nr_hugepages_store_common+0xb0/0x1a0
> [   76.162829]  nr_hugepages_store+0x108/0x118
> [   76.167003]  kobj_attr_store+0x3c/0x70
> [   76.170745]  sysfs_kf_write+0xfc/0x188
> [   76.174492]  kernfs_fop_write_iter+0x274/0x3e0
> [   76.178927]  vfs_write+0x64c/0x8e0
> [   76.182323]  ksys_write+0xf8/0x1f0
> [   76.185716]  __arm64_sys_write+0x74/0xb0
> [   76.189630]  invoke_syscall.constprop.0+0xd8/0x1e0
> [   76.194412]  do_el0_svc+0x164/0x1e0
> [   76.197891]  el0_svc+0x40/0xe0
> [   76.200939]  el0t_64_sync_handler+0x144/0x168
> [   76.205287]  el0t_64_sync+0x1ac/0x1b0
> 
> Fixes: cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for gigantic folios")
> Signed-off-by: Luiz Capitulino <luizcap@redhat.com>
> ---


Acked-by: David Hildenbrand <david@redhat.com>


-- 
Cheers,

David / dhildenb



Thread overview: 15+ messages
2025-02-24 21:59 [PATCH v2 0/4] mm: page_ext: Introduce new iteration API Luiz Capitulino
2025-02-24 21:59 ` [PATCH v2 1/4] mm: page_ext: make lookup_page_ext() public Luiz Capitulino
2025-02-25 16:38   ` David Hildenbrand
2025-02-25 22:29     ` Luiz Capitulino
2025-02-26 17:07       ` David Hildenbrand
2025-02-24 21:59 ` [PATCH v2 2/4] mm: page_ext: add an iteration API for page extensions Luiz Capitulino
2025-02-25 16:39   ` David Hildenbrand [this message]
2025-02-24 21:59 ` [PATCH v2 3/4] mm: page_table_check: use new iteration API Luiz Capitulino
2025-02-25 16:40   ` David Hildenbrand
2025-02-24 21:59 ` [PATCH v2 4/4] mm: page_owner: " Luiz Capitulino
2025-02-25 16:44   ` David Hildenbrand
2025-02-25 22:30     ` Luiz Capitulino
2025-02-27 13:50       ` David Hildenbrand
2025-02-27 20:50         ` Luiz Capitulino
2025-02-28  9:09           ` David Hildenbrand
