From: John Keeping <john@metanate.com>
To: linux-rt-users@vger.kernel.org
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [RT] Deadlock on filesystem operations v5.15+
Date: Thu, 7 Jul 2022 15:43:03 +0100
Message-ID: <Ysbw904Lf1AXlSI0@donbot>
In-Reply-To: <YsL2gTPQk5YnI2W3@donbot>

On Mon, Jul 04, 2022 at 03:17:39PM +0100, John Keeping wrote:
> I'm seeing a deadlock in filesystem operations on v5.15.49-rt47-rebase;
> after further testing this is reproducible in all v5.15 releases I
> tested as well as v5.19-rc3-rt5-rebase.  I believe it is specific to
> CONFIG_PREEMPT_RT.
> 
> With added logging, I've confirmed that the kworker here is plugged and
> the buffer tar is waiting for is one of those held up by the plug.  The
> vfat filesystem here is on a loop device.
> 
> Both tasks here are running at normal priority (SCHED_OTHER, nice 0).
> 
> This seems to be similar to the issue fixed by b0fdc01354f4
> ("sched/core: Schedule new worker even if PI-blocked") since the lock
> being used here is msdos_sb_info::s_lock which is a normal struct mutex
> and thus tsk_is_pi_blocked() will be false for non-RT but is returning
> true on RT meaning that blk_schedule_flush_plug() is skipped in
> sched_submit_work().  If I remove the tsk_is_pi_blocked() check then I
> can't reproduce the hang any more.
> 
> Is some further refinement of the condition here needed?

Having taken another look at why this wasn't happening in v5.10-rt
previously, I found that the tsk_is_pi_blocked() check in
sched_submit_work() was removed by a patch [1] added in v5.9.1-rt20 but
dropped in v5.13-rt1.

It looks like the rest of that patch was replaced by b4bfa3fcfe3b
("sched/core: Rework the __schedule() preempt argument") upstream but
the hunks dropping tsk_is_pi_blocked() have been totally lost.

So I've sent a patch to drop the check [2], as it seems to create a
difference in the behaviour of always-preemptible locks between RT and
!RT, while this function isn't called at all by any of the locks that
become preemptible only with RT.
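
To illustrate the behaviour difference, here is a rough user-space model
of the check. It is only a sketch, not the kernel code: struct task, its
fields, flush_plug() and submit_work() are hypothetical stand-ins for the
task state, blk_schedule_flush_plug() and the tail of
sched_submit_work(), and the check_pi_blocked flag is an assumption used
to compare the two variants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the task state relevant to this issue. */
struct task {
	bool pi_blocked;	/* models tsk_is_pi_blocked() returning true */
	int plugged_io;		/* requests currently held back by the plug */
};

/* Models flushing the plug: everything held back gets submitted. */
static int flush_plug(struct task *t)
{
	int submitted = t->plugged_io;

	t->plugged_io = 0;
	return submitted;
}

/*
 * Models the end of sched_submit_work() for a task about to sleep.
 * With check_pi_blocked set, a PI-blocked task returns early and keeps
 * its plugged I/O; with it cleared, the plug is flushed regardless.
 * Returns the number of requests submitted.
 */
static int submit_work(struct task *t, bool check_pi_blocked)
{
	if (check_pi_blocked && t->pi_blocked)
		return 0;

	if (t->plugged_io > 0)
		return flush_plug(t);

	return 0;
}
```

In this model, a task with pi_blocked set (the RT case when sleeping on
a struct mutex) keeps its requests plugged while the check is present,
so anything waiting on those buffers cannot make progress; with the
check dropped, RT and !RT both flush before sleeping.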

[1] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/0022-locking-rtmutex-Use-custom-scheduling-function-for-s.patch?h=linux-5.10.y-rt-patches
[2] https://lore.kernel.org/lkml/20220707143902.529938-1-john@metanate.com/

