All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org, Mel Gorman <mgorman@techsingularity.net>,
	Jan Kara <jack@suse.cz>, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH] rtmutex: Add acquire semantics for rtmutex lock acquisition slow path
Date: Tue, 18 Apr 2023 18:25:48 +0200	[thread overview]
Message-ID: <2023041854-cranium-prone-b9fa@gregkh> (raw)
In-Reply-To: <20230418154315.9PD52J2N@linutronix.de>

On Tue, Apr 18, 2023 at 05:43:15PM +0200, Sebastian Andrzej Siewior wrote:
> From: Mel Gorman <mgorman@techsingularity.net>
> 
> commit 1c0908d8e441631f5b8ba433523cf39339ee2ba0 upstream.
> 
> Jan Kara reported the following bug triggering on 6.0.5-rt14 running dbench
> on XFS on arm64.
> 
>  kernel BUG at fs/inode.c:625!
>  Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
>  CPU: 11 PID: 6611 Comm: dbench Tainted: G            E   6.0.0-rt14-rt+ #1
>  pc : clear_inode+0xa0/0xc0
>  lr : clear_inode+0x38/0xc0
>  Call trace:
>   clear_inode+0xa0/0xc0
>   evict+0x160/0x180
>   iput+0x154/0x240
>   do_unlinkat+0x184/0x300
>   __arm64_sys_unlinkat+0x48/0xc0
>   el0_svc_common.constprop.4+0xe4/0x2c0
>   do_el0_svc+0xac/0x100
>   el0_svc+0x78/0x200
>   el0t_64_sync_handler+0x9c/0xc0
>   el0t_64_sync+0x19c/0x1a0
> 
> It also affects 6.1-rc7-rt5 and affects a preempt-rt fork of 5.14 so this
> is likely a bug that existed forever and only became visible when ARM
> support was added to preempt-rt. The same problem does not occur on x86-64
> and he also reported that converting sb->s_inode_wblist_lock to
> raw_spinlock_t makes the problem disappear indicating that the RT spinlock
> variant is the problem.
> 
> Which in turn means that RT mutexes on ARM64 and any other weakly ordered
> architecture are affected by this independent of RT.
> 
> Will Deacon observed:
> 
>   "I'd be more inclined to be suspicious of the slowpath tbh, as we need to
>    make sure that we have acquire semantics on all paths where the lock can
>    be taken. Looking at the rtmutex code, this really isn't obvious to me
>    -- for example, try_to_take_rt_mutex() appears to be able to return via
>    the 'takeit' label without acquire semantics and it looks like we might
>    be relying on the caller's subsequent _unlock_ of the wait_lock for
>    ordering, but that will give us release semantics which aren't correct."
> 
> Sebastian Andrzej Siewior prototyped a fix that does work based on that
> comment but it was a little bit overkill and added some fences that should
> not be necessary.
> 
> The lock owner is updated with an IRQ-safe raw spinlock held, but the
> spin_unlock does not provide acquire semantics which are needed when
> acquiring a mutex.
> 
> Adds the necessary acquire semantics for lock owner updates in the slow path
> acquisition and the waiter bit logic.
> 
> It successfully completed 10 iterations of the dbench workload while the
> vanilla kernel fails on the first iteration.
> 
> [ bigeasy@linutronix.de: Initial prototype fix ]
> 
> Fixes: 700318d1d7b38 ("locking/rtmutex: Use acquire/release semantics")
> Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
> Reported-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/r/20221202100223.6mevpbl7i6x5udfd@techsingularity.net
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> 
> Could this be please backported to 5.15 and earlier? It is already part
> of the 6.X kernels. I asked about this by the end of January and I'm
> kindly asking again ;)

I thought this was only an issues when using the out-of-tree RT patches
with these kernels, right?  Or is it relevant for 5.15.y from kernel.org
without anything else?

> This patch applies against v5.15. Should it not apply to earlier
> versions, please let me know an I kindly provide a backport.

How far back should it go?

> I received reports that this fixes "mysterious" crashes and that is how
> I noticed that it is not part of the earlier kernels.

Again, isn't this only needed for -rt kernels?

thanks,

greg k-h

  reply	other threads:[~2023-04-18 16:25 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-18 15:43 [PATCH] rtmutex: Add acquire semantics for rtmutex lock acquisition slow path Sebastian Andrzej Siewior
2023-04-18 16:25 ` Greg KH [this message]
2023-04-19  7:25   ` Sebastian Andrzej Siewior
2023-04-21  7:29     ` Thomas Gleixner
2023-04-21 12:45       ` Greg KH
2023-04-21 15:30         ` Thomas Gleixner
2023-04-21 16:09           ` Sebastian Andrzej Siewior
2023-04-21 16:33             ` Thomas Gleixner
2023-04-22 15:07               ` Greg KH
  -- strict thread matches above, loose matches on Subject: below --
2023-01-30 11:36 Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2023041854-cranium-prone-b9fa@gregkh \
    --to=gregkh@linuxfoundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=jack@suse.cz \
    --cc=mgorman@techsingularity.net \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.