linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Suren Baghdasaryan <surenb@google.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	kernel test robot <oliver.sang@intel.com>,
	 Alexei Starovoitov <ast@kernel.org>,
	Harry Yoo <harry.yoo@oracle.com>,
	oe-lkp@lists.linux.dev,  kbuild test robot <lkp@intel.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	 "open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: [linux-next:master] [slab] db93cdd664: BUG:kernel_NULL_pointer_dereference,address
Date: Fri, 19 Sep 2025 08:01:36 -0700	[thread overview]
Message-ID: <CAJuCfpGA_YKuzHu0TM718LFHr92PyyKdD27yJVbtvfF=ZzNOfQ@mail.gmail.com> (raw)
In-Reply-To: <CAADnVQLPq=puz04wNCnUeSUeF2s1SwTUoQvzMWsHCVhjFcyBeg@mail.gmail.com>

On Thu, Sep 18, 2025 at 6:39 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 7:49 AM Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 12:06 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> > >
> > > On 9/17/25 20:38, Alexei Starovoitov wrote:
> > > > On Wed, Sep 17, 2025 at 2:18 AM Vlastimil Babka <vbabka@suse.cz> wrote:
> > > >>
> > > >> Also I was curious to find out which path is triggered so I've put a
> > > >> dump_stack() before the kmalloc_nolock call:
> > > >>
> > > >> [    0.731812][    T0] Call Trace:
> > > >> [    0.732406][    T0]  __dump_stack+0x18/0x30
> > > >> [    0.733200][    T0]  dump_stack_lvl+0x32/0x90
> > > >> [    0.734037][    T0]  dump_stack+0xd/0x20
> > > >> [    0.734780][    T0]  alloc_slab_obj_exts+0x181/0x1f0
> > > >> [    0.735862][    T0]  __alloc_tagging_slab_alloc_hook+0xd1/0x330
> > > >> [    0.736988][    T0]  ? __slab_alloc+0x4e/0x70
> > > >> [    0.737858][    T0]  ? __set_page_owner+0x167/0x280
> > > >> [    0.738774][    T0]  __kmalloc_cache_noprof+0x379/0x460
> > > >> [    0.739756][    T0]  ? depot_fetch_stack+0x164/0x180
> > > >> [    0.740687][    T0]  ? __set_page_owner+0x167/0x280
> > > >> [    0.741604][    T0]  __set_page_owner+0x167/0x280
> > > >> [    0.742503][    T0]  post_alloc_hook+0x17a/0x200
> > > >> [    0.743404][    T0]  get_page_from_freelist+0x13b3/0x16b0
> > > >> [    0.744427][    T0]  ? kvm_sched_clock_read+0xd/0x20
> > > >> [    0.745358][    T0]  ? kvm_sched_clock_read+0xd/0x20
> > > >> [    0.746290][    T0]  ? __next_zones_zonelist+0x26/0x60
> > > >> [    0.747265][    T0]  __alloc_frozen_pages_noprof+0x143/0x1080
> > > >> [    0.748358][    T0]  ? lock_acquire+0x8b/0x180
> > > >> [    0.749209][    T0]  ? pcpu_alloc_noprof+0x181/0x800
> > > >> [    0.750198][    T0]  ? sched_clock_noinstr+0x8/0x10
> > > >> [    0.751119][    T0]  ? local_clock_noinstr+0x137/0x140
> > > >> [    0.752089][    T0]  ? kvm_sched_clock_read+0xd/0x20
> > > >> [    0.753023][    T0]  alloc_slab_page+0xda/0x150
> > > >> [    0.753879][    T0]  new_slab+0xe1/0x500
> > > >> [    0.754615][    T0]  ? kvm_sched_clock_read+0xd/0x20
> > > >> [    0.755577][    T0]  ___slab_alloc+0xd79/0x1680
> > > >> [    0.756469][    T0]  ? pcpu_alloc_noprof+0x538/0x800
> > > >> [    0.757408][    T0]  ? __mutex_unlock_slowpath+0x195/0x3e0
> > > >> [    0.758446][    T0]  __slab_alloc+0x4e/0x70
> > > >> [    0.759237][    T0]  ? mm_alloc+0x38/0x80
> > > >> [    0.759993][    T0]  kmem_cache_alloc_noprof+0x1db/0x470
> > > >> [    0.760993][    T0]  ? mm_alloc+0x38/0x80
> > > >> [    0.761745][    T0]  ? mm_alloc+0x38/0x80
> > > >> [    0.762506][    T0]  mm_alloc+0x38/0x80
> > > >> [    0.763260][    T0]  poking_init+0xe/0x80
> > > >> [    0.764032][    T0]  start_kernel+0x16b/0x470
> > > >> [    0.764858][    T0]  i386_start_kernel+0xce/0xf0
> > > >> [    0.765723][    T0]  startup_32_smp+0x151/0x160
> > > >>
> > > >> And the reason is we still have restricted gfp_allowed_mask at this point:
> > > >> /* The GFP flags allowed during early boot */
> > > >> #define GFP_BOOT_MASK (__GFP_BITS_MASK & ~(__GFP_RECLAIM|__GFP_IO|__GFP_FS))
> > > >>
> > > >> It's only lifted to a full allowed mask later in the boot.
> > > >
> > > > Ohh. That's interesting.
> > > >
> > > >> That means due to "kmalloc_nolock() is not supported on architectures that
> > > >> don't implement cmpxchg16b" such architectures will no longer get objexts
> > > >> allocated in early boot. I guess that's not a big deal.
> > > >>
> > > >> Also any later allocation having its flags screwed for some reason to not
> > > >> have __GFP_RECLAIM will also lose its objexts. Hope that's also acceptable.
> > > >> I don't know if we can distinguish a real kmalloc_nolock() scope in
> > > >> alloc_slab_obj_exts() without inventing new gfp flags or passing an extra
> > > >> argument through several layers of functions.
> > > >
> > > > I think it's ok-ish.
> > > > Can we add a check to alloc_slab_obj_exts() that sets allow_spin=true
> > > > if we're in the boot phase? Like:
> > > > if (gfp_allowed_mask != __GFP_BITS_MASK)
> > > >    allow_spin = true;
> > > > or some cleaner way to detect boot time by checking slab_state ?
> > > > bpf is not active during the boot and nothing should be
> > > > calling kmalloc_nolock.
> > >
> > > Checking the gfp_allowed_mask should work. Slab state is already UP so won't
> > > help, and this is not really about slab state anyway.
> > > But whether worth it... Suren what do you think?
> >
> > Vlastimil's fix is correct. We definitely need __GFP_NO_OBJ_EXT when
> > allocating an obj_exts vector, otherwise it will try to recursively
> > allocate an obj_exts vector for obj_exts allocation.
> >
> > For the additional __GFP_BITS_MASK check, that sounds good to me as
> > long as we add a comment on why that is there. Or maybe such a check
> > deserves to be placed in a separate function similar to
> > gfpflags_allow_{spinning | blocking}?
>
> I would not. I think adding 'boot or not' logic to these two
> will muddy the waters and will make the whole slab/page_alloc/memcg
> logic and dependencies between them much harder to follow.
> I'd either add a comment to alloc_slab_obj_exts() explaining
> what may happen or add 'boot or not' check only there.
> imo this is a niche, rare and special.

Ok, comment it is then.
Will you be sending a new version or Vlastimil will be including that
in his fixup?


  reply	other threads:[~2025-09-19 15:01 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-17  5:01 [linux-next:master] [slab] db93cdd664: BUG:kernel_NULL_pointer_dereference,address kernel test robot
2025-09-17  8:03 ` Vlastimil Babka
2025-09-17  9:18   ` Vlastimil Babka
2025-09-17 18:38     ` Alexei Starovoitov
2025-09-18  7:06       ` Vlastimil Babka
2025-09-18 14:49         ` Suren Baghdasaryan
2025-09-19  1:39           ` Alexei Starovoitov
2025-09-19 15:01             ` Suren Baghdasaryan [this message]
2025-09-19 18:31               ` Alexei Starovoitov
2025-09-26 12:25                 ` Vlastimil Babka
2025-09-26 15:30                   ` Alexei Starovoitov
2025-09-26 15:38                     ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJuCfpGA_YKuzHu0TM718LFHr92PyyKdD27yJVbtvfF=ZzNOfQ@mail.gmail.com' \
    --to=surenb@google.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=harry.yoo@oracle.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-mm@kvack.org \
    --cc=lkp@intel.com \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).