From: "Theodore Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: Mateusz Guzik <mguzik@redhat.com>,
Dave Chinner <david@fromorbit.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Josef Bacik <jbacik@fb.com>, Al Viro <viro@ZenIV.linux.org.uk>,
Eric Sandeen <esandeen@redhat.com>
Subject: Re: [PATCH 2/2] fs: print a message when freezing/unfreezing filesystems
Date: Thu, 15 May 2014 10:15:15 -0400
Message-ID: <20140515141515.GA21632@thunk.org>
In-Reply-To: <20140515134610.GB660@quack.suse.cz>
On Thu, May 15, 2014 at 03:46:10PM +0200, Jan Kara wrote:
> > Saving it in the superblock would require changing a bunch of file
> > systems. What if we store this information in memory, and print it
> > out under certain conditions (i.e., after a soft lockup detection, or
> > upon request of some magic sysrq request)?
> By 'superblock' I meant 'struct super_block' ;) So we are in agreement I
> believe.
Ah, yes, we're in agreement. I thought you were talking about the
on-disk superblock.
> > Or we could create a tunable threshold and print a message after a
> > file system has been frozen more than a particular specified duration,
> > with that duration set conservatively to something like 60 or 120
> > seconds by default.
> I was thinking about this as well but all these "warn after X seconds"
> warnings tend to have quite a few false positives in practice so dumping
> this in emergency-thaw sysrq handler or exposing the information somewhere
> in proc (e.g. mountinfo) would look like a better option to me.
Well, we already have the soft lockup warning, which sometimes has
some false positives, but in practice, if a process is runnable but
doesn't get to run in 2 minutes (the default is 20 seconds, but we've
used 2 minutes to avoid the false positive problem on a super busy
system), something is probably clearly wrong.
Similarly, if a process is trying to write to a frozen file system,
and can't after two minutes, something is almost certainly wrong, or
at least, it's something a system administrator should know about. We
can argue over whether the default threshold should be 20 seconds, or
120 seconds, or 2 hours, but I think there would be agreement that for
pretty much any configuration, there is some delay after which
printing a message is actually the right thing to do. (Yes, "time
that a process is waiting to write to a frozen file system != time the
file system is frozen" --- the latter is easier to implement, but if
people feel strongly about it, the former isn't that much more
difficult.)
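To illustrate the simpler variant (time the file system has been
frozen), here is a minimal user-space sketch in C. It is not kernel
code and not taken from the patch under discussion: struct freeze_info,
freeze_warn_secs, and the three helpers are hypothetical names, and a
real implementation would presumably keep this state in struct
super_block and take its timestamps from the kernel clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory freeze record (assumption: one per file system). */
struct freeze_info {
	bool     frozen;       /* is the file system currently frozen? */
	uint64_t frozen_since; /* time the freeze began, in seconds */
	bool     warned;       /* already warned about this freeze? */
};

/* Hypothetical tunable: warn once a freeze lasts this long; 0 disables. */
static uint64_t freeze_warn_secs = 120;

/* Record the start of a freeze at time 'now' (seconds). */
static void record_freeze(struct freeze_info *fi, uint64_t now)
{
	fi->frozen = true;
	fi->frozen_since = now;
	fi->warned = false;
}

/* Record a thaw; clears the state so the next freeze starts fresh. */
static void record_thaw(struct freeze_info *fi)
{
	fi->frozen = false;
	fi->warned = false;
}

/*
 * Return true at most once per freeze, the first time it is called
 * after the file system has been frozen for at least freeze_warn_secs.
 */
static bool freeze_warning_due(struct freeze_info *fi, uint64_t now)
{
	if (!fi->frozen || fi->warned || freeze_warn_secs == 0)
		return false;
	if (now - fi->frozen_since < freeze_warn_secs)
		return false;
	fi->warned = true;
	return true;
}
```

A periodic checker would call freeze_warning_due() with the current
time and log the device name when it returns true. The warned flag
limits the output to one message per freeze, which is one way to keep
the noise down if the threshold turns out to be too aggressive.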
The problem with using a sysrq handler is that the user has to know how
to use it. If the user files a bug saying the system has mysteriously
hung, the fact that the system log contains a hint as to what might be
going on would be very useful for an enterprise distribution's help
desk. (Yes, this won't help if the root file system is the one
that's been frozen, unless the customer has configured remote syslog.
But for many cases, it might provide a vital clue that could save a
lot of time and support costs.)
- Ted