From: Dave Chinner <david@fromorbit.com>
To: Mateusz Guzik <mguzik@redhat.com>
Cc: "Lukáš Czerner" <lczerner@redhat.com>,
	sandeen@redhat.com, "Jan Kara" <jack@suse.cz>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Josef Bacik" <jbacik@fb.com>,
	"Al Viro" <viro@ZenIV.linux.org.uk>,
	"Joe Perches" <joe@perches.com>
Subject: Re: [PATCH V2 2/2] fs: print a message when freezing/unfreezing filesystems
Date: Fri, 16 May 2014 10:11:56 +1000
Message-ID: <20140516001156.GI5421@dastard>
In-Reply-To: <20140515231908.GB24089@mguzik.redhat.com>

On Fri, May 16, 2014 at 01:19:09AM +0200, Mateusz Guzik wrote:
> On Fri, May 16, 2014 at 08:51:41AM +1000, Dave Chinner wrote:
> > On Fri, May 16, 2014 at 12:34:40AM +0200, Mateusz Guzik wrote:
> > > On Fri, May 16, 2014 at 08:21:35AM +1000, Dave Chinner wrote:
> > > > > IOW, a new column in mountinfo. For frozen filesystems it would contain
> > > > > 'frozen_by=[%s]:[%d]' (escaped comm, pid).
> > > > 
> > > > I really don't see that the process that froze the filesystem is
> > > > particularly useful - in many cases that process is long gone (e.g.
> > > > fsfreeze is being used to allow a HW array to take a snapshot). Just
> > > > the fact it is in the process of freezing (if stuck, stack trace in
> > > > sysrq-w should be present) or frozen (freezing process may be long
> > > > gone, and is mostly irrelevant because you're now tracking down why
> > > > a thaw hasn't happened)...
> > > 
> > > There are daemons which perform freezing and unfreezing on their own.
> > > Thus storing the name along with pid helps to determine whether someone
> > > went behind such daemon's back, or maybe it's the daemon which "forgot" to
> > > unfreeze after all.
> > 
> > Such a daemon should be logging the fact that it's freezing and
> > thawing the filesystem. The kernel is not the place to track what
> > buggy userspace applications are doing wrong.
> > 
> 
> Except there is no log entry if /var got frozen (and this is not an
> imaginary example).

Freezing the filesystem that the freezing daemon logs to is, well, a
major application architecture fail. Sorry, catering for the lowest
common denominator (i.e. stupidity) is not a valid argument for
adding stuff to the kernel....

> Grabbing a debugger to inspect daemon's state is not
> exactly what your typical support associate can or should do.

No, but they can read /proc/self/mountinfo, and grab sysrq-w output.
And they should be able to read that and tell that there is a freeze
hang from that info. This is "filesystem hang triage 101" stuff....
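
As a rough sketch of the mountinfo half of that triage, the snippet below
splits mountinfo-format text into the fields described in proc(5). The helper
name parse_mountinfo and the SAMPLE line (taken from the proc(5) example) are
illustrative assumptions, not anything from the patch set. It only shows how
the fields are laid out; whether and how a freeze state shows up there (such
as the 'frozen_by=' column suggested earlier in the thread) depends on the
kernel in use.

```python
def parse_mountinfo(text):
    """Parse mountinfo-format text into a list of dicts (field order per proc(5))."""
    mounts = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split()
        # A lone "-" terminates the variable-length run of optional fields.
        sep = fields.index("-")
        mounts.append({
            "mount_id": int(fields[0]),
            "parent_id": int(fields[1]),
            "dev": fields[2],                      # major:minor
            "root": fields[3],
            "mount_point": fields[4],
            "mount_options": fields[5].split(","),
            "optional": fields[6:sep],
            "fstype": fields[sep + 1],
            "source": fields[sep + 2],
            "super_options": fields[sep + 3].split(","),
        })
    return mounts


# Example line in the documented format; not output from a real system.
SAMPLE = "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue\n"

if __name__ == "__main__":
    # On a Linux system, read the live table instead of the sample.
    with open("/proc/self/mountinfo") as f:
        for m in parse_mountinfo(f.read()):
            print(m["mount_point"], m["fstype"], m["source"])
```

Keeping the optional fields as a separate list means any extra per-mount tag
would land there without disturbing the fixed fields.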

> But this was a side request, I'm not going to argue about including
> this since it turns out there is a better way.
> 
> Somewhere in the thread an idea to log long-standing freezes was
> mentioned which would provide sufficient information as far as

You've already got the hung task timer firing when a fs is frozen
for too long. You'll see processes hung in sb_write_wait(), and that
tells you the filesystem is frozen. Then look at
/proc/self/mountinfo to find which fs is frozen....
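
A minimal sketch of the first step, assuming the hung task reports have
already been captured as text (for example from dmesg): it picks out the
tasks whose report mentions a given symbol. The function name
tasks_blocked_in and the SAMPLE_LOG text are made up for illustration; the
sample is not real kernel output, and the symbol to search for should be
whatever actually appears in your traces.

```python
import re

# Matches the hung task detector's report header line.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def tasks_blocked_in(log_text, symbol):
    """Return (comm, pid) for each hung-task report whose lines mention symbol."""
    hits = []
    current = None
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            # A new report starts; later lines belong to this task.
            current = (m.group("comm"), int(m.group("pid")))
            continue
        if current and symbol in line and current not in hits:
            hits.append(current)
    return hits


# Hypothetical, hand-written input in the general shape of a hung task report.
SAMPLE_LOG = """\
INFO: task rsyslogd:812 blocked for more than 120 seconds.
 sb_write_wait+0x10/0x20
 vfs_write+0x30/0x40
INFO: task sshd:900 blocked for more than 120 seconds.
 schedule_timeout+0x50/0x60
"""

if __name__ == "__main__":
    print(tasks_blocked_in(SAMPLE_LOG, "sb_write_wait"))
```

Any task this returns is a candidate for "blocked on a frozen filesystem";
the mountinfo lookup then narrows down which filesystem it is.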

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
