From: John Keeping <john@metanate.com>
To: linux-rt-users@vger.kernel.org
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [RT] Deadlock on filesystem operations v5.15+
Date: Thu, 7 Jul 2022 15:43:03 +0100
Message-ID: <Ysbw904Lf1AXlSI0@donbot>
In-Reply-To: <YsL2gTPQk5YnI2W3@donbot>

On Mon, Jul 04, 2022 at 03:17:39PM +0100, John Keeping wrote:
> I'm seeing a deadlock in filesystem operations on v5.15.49-rt47-rebase;
> after further testing this is reproducible in all v5.15 releases I
> tested as well as v5.19-rc3-rt5-rebase.  I believe it is specific to
> CONFIG_PREEMPT_RT.
> 
> With added logging, I've confirmed that the kworker here is plugged and
> the buffer tar is waiting for is one of those held up by the plug.  The
> vfat filesystem here is on a loop device.
> 
> Both tasks here are running at normal priority (SCHED_OTHER, nice 0).
> 
> This seems to be similar to the issue fixed by b0fdc01354f4
> ("sched/core: Schedule new worker even if PI-blocked") since the lock
> being used here is msdos_sb_info::s_lock which is a normal struct mutex
> and thus tsk_is_pi_blocked() will be false for non-RT but is returning
> true on RT meaning that blk_schedule_flush_plug() is skipped in
> sched_submit_work().  If I remove the tsk_is_pi_blocked() check then I
> can't reproduce the hang any more.
> 
> Is some further refinement of the condition here needed?

I took another look at why this wasn't happening in v5.10-rt
previously: the tsk_is_pi_blocked() check in sched_submit_work() was
removed by a patch [1] added in v5.9.1-rt20 but dropped in v5.13-rt1.

It looks like the rest of that patch was replaced by b4bfa3fcfe3b
("sched/core: Rework the __schedule() preempt argument") upstream but
the hunks dropping tsk_is_pi_blocked() have been totally lost.

So I've sent a patch to drop the check [2].  The check seems to create
a difference in the behaviour of always-preemptible locks between RT
and !RT, whereas sched_submit_work() isn't called at all by any of the
locks that become preemptible only with RT.
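
To make the condition above concrete, here is a minimal user-space
sketch in C.  It is a toy model, not kernel code: the toy_* names and
the struct are made up for illustration, and the only behaviour it
encodes is what is described in this thread (tsk_is_pi_blocked() being
false on !RT and true on RT for a task blocked on a struct mutex, and
the plug flush being skipped when the check returns true).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task state discussed in this thread;
 * this is not the kernel's struct task_struct. */
struct toy_task {
	bool pi_blocked;  /* what tsk_is_pi_blocked() would report */
	bool has_plug;    /* plugged block I/O is still queued */
};

/* Assumption taken from the report: for a task sleeping on a normal
 * struct mutex, the check is false on !RT and true on RT. */
static bool toy_pi_blocked_on_mutex(bool preempt_rt)
{
	return preempt_rt;
}

/*
 * Toy model of the decision described for sched_submit_work().
 * keep_pi_check = true models the behaviour that hangs on RT;
 * keep_pi_check = false models the check being dropped.
 * Returns true if the plugged I/O would be flushed before sleeping.
 */
static bool toy_submit_work_flushes(const struct toy_task *t,
				    bool keep_pi_check)
{
	if (keep_pi_check && t->pi_blocked)
		return false;  /* flush skipped, plugged I/O stays held up */

	return t->has_plug;
}
```

In this model, a task with plugged I/O that blocks on a mutex under RT
never flushes its plug while the check is kept, so a waiter on one of
the plugged buffers cannot make progress.  With the check dropped, RT
and !RT give the same answer.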

[1] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/0022-locking-rtmutex-Use-custom-scheduling-function-for-s.patch?h=linux-5.10.y-rt-patches
[2] https://lore.kernel.org/lkml/20220707143902.529938-1-john@metanate.com/
