From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
Luis Goncalves <lgoncalv@redhat.com>,
Chunyu Hu <chuhu@redhat.com>
Subject: Re: [PATCH] locking/rtmutex: Always use trylock in rt_mutex_trylock()
Date: Mon, 7 Oct 2024 16:50:37 +0200 [thread overview]
Message-ID: <20241007145037.GE4879@noisy.programming.kicks-ass.net>
In-Reply-To: <7918987a-4b66-4191-aa52-798f9434352a@redhat.com>
On Wed, Oct 02, 2024 at 01:54:16PM -0400, Waiman Long wrote:
>
> On 10/2/24 05:37, Peter Zijlstra wrote:
> > On Thu, Sep 26, 2024 at 11:13:15AM -0400, Waiman Long wrote:
> > > One reason to use a trylock is to avoid an ABBA deadlock in case we need
> > > to use a locking sequence that is not in the expected locking order. That
> > > requires the use of trylock all the way down the abnormal locking
> > > sequence. Unfortunately, this is not the case for rt_mutex_trylock() as
> > > it uses a raw_spin_lock_irqsave() to acquire the lock->wait_lock. That
> > > will cause a lockdep splat like the following in a PREEMPT_RT kernel:
> > >
> > > [ 63.695668] -> #0 (&lock->wait_lock){-...}-{2:2}:
> > > [ 63.695674] check_prev_add+0x1bd/0x1310
> > > [ 63.695678] validate_chain+0x6cf/0x7c0
> > > [ 63.695682] __lock_acquire+0x879/0xf40
> > > [ 63.695686] lock_acquire.part.0+0xfa/0x2d0
> > > [ 63.695690] _raw_spin_lock_irqsave+0x46/0x90
> > > [ 63.695695] rt_mutex_slowtrylock+0x3f/0xb0
> > > [ 63.695699] rt_spin_trylock+0x13/0xc0
> > > [ 63.695702] rmqueue_pcplist+0x5b/0x180
> > > [ 63.695705] rmqueue+0xb01/0x1040
> > > [ 63.695708] get_page_from_freelist+0x1d0/0xa00
> > > [ 63.695712] __alloc_pages_noprof+0x19a/0x450
> > > [ 63.695716] alloc_pages_mpol_noprof+0xaf/0x1e0
> > > [ 63.695721] stack_depot_save_flags+0x4db/0x520
> > > [ 63.695727] kasan_save_stack+0x3f/0x50
> > > [ 63.695731] __kasan_record_aux_stack+0x8e/0xa0
> > > [ 63.695736] task_work_add+0x1ad/0x240
> > > [ 63.695741] sched_tick+0x1c7/0x500
> > > [ 63.695744] update_process_times+0xf1/0x130
> > > [ 63.695750] tick_nohz_handler+0xf7/0x230
> > > [ 63.695754] __hrtimer_run_queues+0x13b/0x7b0
> > > [ 63.695758] hrtimer_interrupt+0x1c2/0x350
> > > [ 63.695763] __sysvec_apic_timer_interrupt+0xdb/0x340
> > > [ 63.695770] sysvec_apic_timer_interrupt+0x9c/0xd0
> > > [ 63.695774] asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > > [ 63.695780] __asan_load8+0x8/0xa0
> > > [ 63.695784] mas_wr_end_piv+0x28/0x2c0
> > > [ 63.695789] mas_preallocate+0x3a8/0x680
> > > [ 63.695793] vma_shrink+0x180/0x3f0
> > > [ 63.695799] shift_arg_pages+0x219/0x2c0
> > > [ 63.695804] setup_arg_pages+0x343/0x5e0
> > > [ 63.695807] load_elf_binary+0x5ac/0x15d0
> > > [ 63.695813] search_binary_handler+0x125/0x370
> > > [ 63.695818] exec_binprm+0xc9/0x3d0
> > > [ 63.695821] bprm_execve+0x103/0x230
> > > [ 63.695824] kernel_execve+0x187/0x1c0
> > > [ 63.695828] call_usermodehelper_exec_async+0x145/0x1e0
> > > [ 63.695832] ret_from_fork+0x31/0x60
> > > [ 63.695836] ret_from_fork_asm+0x1a/0x30
> > > [ 63.695840]
> > > [ 63.695840] other info that might help us debug this:
> > > [ 63.695840]
> > > [ 63.695842] Chain exists of:
> > > [ 63.695842] &lock->wait_lock --> &p->pi_lock --> &rq->__lock
> > > [ 63.695842]
> > > [ 63.695850] Possible unsafe locking scenario:
> > > [ 63.695850]
> > > [ 63.695851] CPU0 CPU1
> > > [ 63.695852] ---- ----
> > > [ 63.695854] lock(&rq->__lock);
> > > [ 63.695857] lock(&p->pi_lock);
> > > [ 63.695861] lock(&rq->__lock);
> > > [ 63.695864] lock(&lock->wait_lock);
> > > [ 63.695868]
> > > [ 63.695868] *** DEADLOCK ***
> > >
> > > Fix it by using raw_spin_trylock_irqsave() instead.
> > That truncated lockdep report doesn't really explain anything. Please
> > just transcribe the full lockdep report into something small and
> > coherent.
>
> I was trying to show where the offending call is coming from. I will send a
> v2 with a condensed lockdep splat.
No no no... explain the actual problem.
Is the problem that:

  sched_tick()
    task_tick_mm_cid()
      task_work_add()
        kasan_save_stack()

idiotic crap while holding rq->__lock?
Because afaict that is completely insane. And has nothing to do with
rtmutex.
We are not going to change rtmutex because instrumentation shit is shit.
Thread overview: 9+ messages
2024-09-26 15:13 [PATCH] locking/rtmutex: Always use trylock in rt_mutex_trylock() Waiman Long
2024-10-02 9:37 ` Peter Zijlstra
2024-10-02 17:54 ` Waiman Long
2024-10-07 14:50 ` Peter Zijlstra [this message]
2024-10-07 15:23 ` Waiman Long
2024-10-07 15:33 ` Peter Zijlstra
2024-10-07 15:54 ` Waiman Long
2024-10-08 7:38 ` Peter Zijlstra
2024-10-08 13:21 ` Waiman Long