linux-fsdevel.vger.kernel.org archive mirror
From: "Theodore Ts'o" <tytso@mit.edu>
To: Matthew Wilcox <willy@infradead.org>
Cc: Hillf Danton <hdanton@sina.com>,
	syzbot <syzbot+9c3fb12e9128b6e1d7eb@syzkaller.appspotmail.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com
Subject: Re: [syzbot] INFO: task hung in jbd2_journal_commit_transaction (3)
Date: Mon, 20 Dec 2021 16:24:26 -0500
Message-ID: <YcD0itBAoOPNNKvX@mit.edu>
In-Reply-To: <Yb6zKVoxuD3lQMA/@casper.infradead.org>

On Sun, Dec 19, 2021 at 04:20:57AM +0000, Matthew Wilcox wrote:
> > Hey Willy
> > >
> > >sched_setattr(0x0, &(0x7f0000000080)={0x38, 0x1, 0x0, 0x0, 0x1}, 0x0)
> > >
> > >so you've set a SCHED_FIFO priority and then are surprised that some
> > >tasks are getting starved?
> > 
> > Can you specify a bit more on how fifo could block journald waiting for
> > IO completion more than 120 seconds?
> 
> Sure!  You can see from the trace below that jbd2/sda1-8 is in D state,
> so we know nobody's called unlock_buffer() yet, which would have woken
> it.  That would happen in journal_end_buffer_io_sync(), which is
> the b_end_io for the buffer.
> 
> Learning more detail than that would require knowing the I/O path
> for this particular test system.  I suspect that the I/O was submitted
> and has even completed, but there's a kernel thread waiting to run which
> will call the ->b_end_io that hasn't been scheduled yet, because it's
> at a lower priority than all the threads which are running at
> SCHED_FIFO.
> 
> I'm disinclined to look at this report much further because syzbot is
> stumbling around trying things which are definitely in the category of
> "if you do this and things break, you get to keep both pieces".  You
> can learn some interesting things by playing with the various RT
> scheduling classes, but mostly what you can learn is that you need to
> choose your priorities carefully to have a functioning system.

In general, real-time threads (anything scheduled with SCHED_FIFO or
SCHED_RR) should never, *ever* try to do any kind of I/O.  After all,
I/O can block, and if a real-time thread blocks, so much for any kind
of real-time guarantee that you might have.
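
For reference, the sched_setattr() arguments quoted above from the
syzbot reproducer can be decoded by hand.  The sketch below assumes the
five values are the leading fields of struct sched_attr in declaration
order (size, sched_policy, sched_flags, sched_nice, sched_priority) and
uses the policy numbers from the Linux UAPI headers; the helper name
decode_sched_attr is made up for illustration.

```python
# Policy numbers as defined in the Linux UAPI header <linux/sched.h>.
POLICY_NAMES = {
    0: "SCHED_OTHER",
    1: "SCHED_FIFO",
    2: "SCHED_RR",
    3: "SCHED_BATCH",
    5: "SCHED_IDLE",
    6: "SCHED_DEADLINE",
}

# The two fixed-priority real-time policies.
REALTIME_POLICIES = {"SCHED_FIFO", "SCHED_RR"}


def decode_sched_attr(fields):
    """Decode the first five fields of a struct sched_attr (hypothetical helper)."""
    size, policy, flags, nice, priority = fields
    name = POLICY_NAMES.get(policy, "unknown(%d)" % policy)
    return {
        "size": size,
        "policy": name,
        "flags": flags,
        "nice": nice,
        "priority": priority,
        "realtime": name in REALTIME_POLICIES,
    }


# Values copied from the reproducer line quoted earlier in this message.
attr = decode_sched_attr((0x38, 0x1, 0x0, 0x0, 0x1))
```

Under those assumptions the call asks for SCHED_FIFO at priority 1,
which is consistent with Willy's reading of the trace.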

If you must do I/O from a soft real-time thread, one trick you *can*
do is to have some number of CPUs which are reserved for real-time
threads, and a subset of CPUs which are reserved for non-real-time
threads, enforced using CPU pinning.  It's still not perfect, since
there are still priority inheritance issues, and while this protects
against a non-real-time thread holding some lock which is needed by a
real-time (SCHED_FIFO) thread, if there are two SCHED_FIFO threads
running at different priorities it's still possible to deadlock the
entire kernel.
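
A minimal sketch of the CPU-partitioning idea, assuming a Linux host
and Python's os.sched_setaffinity() / os.sched_setscheduler() wrappers.
The helper names (partition_cpus, pin_calling_thread, make_fifo) and
the choice of how many CPUs to reserve are illustrative assumptions,
not a tested recipe, and this does nothing about the priority
inheritance problems mentioned above.

```python
import os


def partition_cpus(cpus, rt_count):
    """Split a set of CPU ids into (real-time set, non-real-time set)."""
    ordered = sorted(cpus)
    if not 0 < rt_count < len(ordered):
        raise ValueError("need at least one CPU on each side of the split")
    return set(ordered[:rt_count]), set(ordered[rt_count:])


def pin_calling_thread(cpus):
    """Restrict the calling thread to the given CPUs (Linux only)."""
    os.sched_setaffinity(0, cpus)


def make_fifo(priority):
    """Switch the calling thread to SCHED_FIFO; normally needs CAP_SYS_NICE."""
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))


def start_realtime_worker(rt_count=1, priority=1):
    """Pin to the reserved real-time CPUs first, then raise the policy."""
    rt_cpus, _other_cpus = partition_cpus(os.sched_getaffinity(0), rt_count)
    pin_calling_thread(rt_cpus)
    make_fifo(priority)
```

Non-real-time threads would be pinned to the other half of the split in
the same way, so that they always have a CPU which no SCHED_FIFO thread
can monopolize.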

Can it be done?  Sure; I was part of an effort to make it work for the
US Navy's DDG-1000 Zumwalt-class destroyer[1].  But it's tricky, and
it's why IBM got paid the big bucks. :-)  Certainly it's going to be
problematic for syzkaller if it's just going to be randomly trying to
set some threads to be real-time without doing any kind of formal
planning.

[1] https://dl.acm.org/doi/10.1147/sj.472.0207

Cheers,

						- Ted


Thread overview: 11+ messages
2021-12-18 19:50 [syzbot] INFO: task hung in jbd2_journal_commit_transaction (3) syzbot
2021-12-18 21:22 ` Matthew Wilcox
     [not found] ` <20211219023540.1638-1-hdanton@sina.com>
2021-12-19  4:20   ` Matthew Wilcox
2021-12-20 21:24     ` Theodore Ts'o [this message]
     [not found]     ` <20211221090804.1810-1-hdanton@sina.com>
2021-12-21 22:32       ` Theodore Ts'o
     [not found]       ` <20211222022527.1880-1-hdanton@sina.com>
2021-12-22  4:35         ` Theodore Ts'o
2022-05-20 11:57           ` Dmitry Vyukov
2022-05-20 21:45             ` Theodore Ts'o
2022-05-23 11:34               ` Dmitry Vyukov
2022-05-24 10:59                 ` Jan Kara
2021-12-23  5:32 ` syzbot
