From: Calvin Owens <calvin@wbinvd.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-kernel@vger.kernel.org, "Lai, Yi" <yi1.lai@linux.intel.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
x86@kernel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local hash
Date: Fri, 20 Jun 2025 18:02:28 -0700
Message-ID: <aFYEpPIwhlL1WvR0@mozart.vkv.me>
In-Reply-To: <aFWuwdJUEUD8VcTJ@mozart.vkv.me>
On Friday 06/20 at 11:56 -0700, Calvin Owens wrote:
> On Friday 06/20 at 12:31 +0200, Sebastian Andrzej Siewior wrote:
> > On 2025-06-19 14:07:30 [-0700], Calvin Owens wrote:
> > > > Machine #2 oopsed with the GCC kernel after just over an hour:
> > > >
> > > > BUG: unable to handle page fault for address: ffff88a91eac4458
> > > > RIP: 0010:futex_hash+0x16/0x90
> > …
> > > > Call Trace:
> > > > <TASK>
> > > > futex_wait_setup+0x51/0x1b0
> > …
> >
> > The futex_hash_bucket pointer has an invalid ->priv pointer.
> > This could be use-after-free or double-free. I've been looking through
> > your config and you don't have CONFIG_SLAB_FREELIST_* set. I don't
> > remember which one but one of the two has a "primitive" double free
> > detection.
> >
> > …
> > > I am not able to reproduce the oops at all with these options:
> > >
> > > * DEBUG_PAGEALLOC_ENABLE_DEFAULT
> > > * SLUB_DEBUG_ON
> >
> > SLUB_DEBUG_ON is something that would "reliably" notice double free.
> > If you drop SLUB_DEBUG_ON (but keep SLUB_DEBUG) then you can boot with
> > slab_debug=f keeping only the consistency checks. The "poison" checks
> > would be excluded for instance. That allocation is kvzalloc() but it
> > should be small on your machine to avoid vmalloc() and use only
> > kmalloc().
>
> I'll try slab_debug=f next.

I just hit the oops with SLUB_DEBUG and slab_debug=f, but nothing new
was logged.

> > > I'm also experimenting with stress-ng as a reproducer, no luck so far.
> >
> > Not sure what you are using there. I think cargo does:
> > - lock/unlock in threads
> > - create new thread which triggers auto-resize
> > - auto-resize gets delayed due to lock/unlock in other threads (the
> > reference is held)
>
> I've tried various combinations of --io, --fork, --exec, --futex, --cpu,
> --vm, and --forkheavy. It's not mixing the operations in threads as I
> understand it, so I guess it won't ever do anything like what you're
> describing no matter what stressors I run?
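
For anyone who wants something closer to that pattern than the stress-ng
stressors, here is a minimal userspace sketch in C. It is a guess at what
the description above boils down to, not code taken from cargo or from
this thread: a few threads contend on one process-private mutex (so
waiters should end up in the private futex path) while the caller keeps
creating more threads, which is the step said above to trigger the
auto-resize. Every name in it (futex_resize_stress, locker, parker, the
two limits) is hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_LOCKERS 64	/* arbitrary limits chosen for this sketch */
#define MAX_SPAWNS  256

static pthread_mutex_t hot = PTHREAD_MUTEX_INITIALIZER;  /* contended lock */
static pthread_mutex_t park = PTHREAD_MUTEX_INITIALIZER; /* parks spawned threads */
static atomic_int stop;
static atomic_long lock_ops;

/* Hammer one process-private mutex; contended waiters block in futex wait. */
static void *locker(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop)) {
		pthread_mutex_lock(&hot);
		atomic_fetch_add(&lock_ops, 1);
		pthread_mutex_unlock(&hot);
	}
	return NULL;
}

/* Stay alive until the caller releases 'park', so the thread count grows. */
static void *parker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&park);
	pthread_mutex_unlock(&park);
	return NULL;
}

/*
 * Run nlockers contending threads while creating nspawns extra threads.
 * Assumption from the thread above: creating a thread is what may trigger
 * an auto-resize of the private hash.
 * Returns the number of lock/unlock pairs completed, or -1 on error.
 */
long futex_resize_stress(int nlockers, int nspawns)
{
	pthread_t lockers[MAX_LOCKERS], parked[MAX_SPAWNS];
	int nl = 0, np = 0, i;
	long ret = 0;

	if (nlockers < 1 || nlockers > MAX_LOCKERS ||
	    nspawns < 0 || nspawns > MAX_SPAWNS)
		return -1;

	atomic_store(&stop, 0);
	atomic_store(&lock_ops, 0);
	pthread_mutex_lock(&park);

	while (ret == 0 && nl < nlockers) {
		if (pthread_create(&lockers[nl], NULL, locker, NULL))
			ret = -1;
		else
			nl++;
	}
	/* Thread creation happens while the lockers are busy on 'hot'. */
	while (ret == 0 && np < nspawns) {
		if (pthread_create(&parked[np], NULL, parker, NULL))
			ret = -1;
		else
			np++;
	}
	/* Make sure the lockers actually ran before shutting down. */
	while (ret == 0 && atomic_load(&lock_ops) == 0)
		sched_yield();

	atomic_store(&stop, 1);
	pthread_mutex_unlock(&park);
	for (i = 0; i < nl; i++)
		pthread_join(lockers[i], NULL);
	for (i = 0; i < np; i++)
		pthread_join(parked[i], NULL);

	return ret ? ret : atomic_load(&lock_ops);
}
```

Built with -pthread and called in a loop from a small main(), e.g.
futex_resize_stress(8, 64), this keeps eight threads contending while 64
more are created. It has not been shown to reproduce the oops; it only
mirrors the three steps listed above.
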
>
> I did get this message once, something I haven't seen before:
>
> [33024.247423] [ T281] sched: DL replenish lagged too much
>
> ...but maybe that's my fault for overloading it so much.
>
> > And now something happens leading to what we see.
> > _Maybe_ the cargo application terminates/ execs before the new struct is
> > assigned in an unexpected way.
> > The regular hash bucket has reference counting so it should raise
> > warnings if it goes wrong. I haven't seen those.
> >
> > > A third machine with an older Skylake CPU died overnight, but nothing
> > > was logged over netconsole. Luckily it actually has a serial header on
> > > the motherboard, so that's wired up and it's running again, maybe it
> > > dies in a different way that might be a better clue...
> >
> > So far I *think* that cargo does something that I don't expect and this
> > leads to a memory double-free. The SLUB_DEBUG_ON hopefully delays the
> > process long enough that the double free does not trigger.
> >
> > I think I'm going to look for a random Rust package that is using cargo
> > for building (unless you have a recommendation) and look what it is
> > doing. It was always cargo after all. Maybe this brings some light.
>
> The list of things in my big build that use cargo is pretty short:
>
> === Dependendency Snapshot ===
> Dep =mc:house:cargo-native.do_install
> Package=mc:house:cargo-native.do_populate_sysroot
> RDep =mc:house:cargo-c-native.do_prepare_recipe_sysroot
> mc:house:cargo-native.do_create_spdx
> mc:house:cbindgen-native.do_prepare_recipe_sysroot
> mc:house:librsvg-native.do_prepare_recipe_sysroot
> mc:house:librsvg.do_prepare_recipe_sysroot
> mc:house:libstd-rs.do_prepare_recipe_sysroot
> mc:house:python3-maturin-native.do_prepare_recipe_sysroot
> mc:house:python3-maturin-native.do_populate_sysroot
> mc:house:python3-rpds-py.do_prepare_recipe_sysroot
> mc:house:python3-setuptools-rust-native.do_prepare_recipe_sysroot
>
> I've tried building each of those targets alone (and all of them
> together) in a loop, but that hasn't triggered anything. I guess that
> other concurrent builds are necessary to trigger whatever this is.
>
> I tried using stress-ng --vm and --cpu together to "load up" the machine
> while running the isolated targets, but that hasn't worked either.
>
> If you want to run *exactly* what I am, clone this unholy mess:
>
> https://github.com/jcalvinowens/meta-house
>
> ...setup for yocto and install kas as described here:
>
> https://docs.yoctoproject.org/ref-manual/system-requirements.html#ubuntu-and-debian
> https://github.com/jcalvinowens/meta-house/blob/6f6a9c643169fc37ba809f7230261d0e5255b6d7/README.md#kas
>
> ...and run (for the 32-thread machine):
>
> BB_NUMBER_THREADS="48" PARALLEL_MAKE="-j 36" kas build kas/walnascar.yaml -- -k
>
> Fair warning, it needs a *lot* of RAM at the high concurrency, I have
> 96GB with 128GB of swap to spill into. It needs ~500GB of disk space if
> it runs to completion and downloads ~15GB of tarballs when it starts.
>
> Annoyingly it won't work if the system compiler is gcc-15 right now (the
> version of glib it has won't build, haven't had a chance to fix it yet).
>
> > > > > Thanks,
> > > > > Calvin
> >
> > Sebastian