linux-fsdevel.vger.kernel.org archive mirror
From: Jeff Layton <jlayton@redhat.com>
To: NeilBrown <neilb@suse.com>, Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-ext4@vger.kernel.org, akpm@linux-foundation.org,
	tytso@mit.edu, jack@suse.cz
Subject: Re: [RFC PATCH 0/4] fs: introduce new writeback error tracking infrastructure and convert ext4 to use it
Date: Thu, 06 Apr 2017 10:02:51 -0400
Message-ID: <1491487371.18658.22.camel@redhat.com>
In-Reply-To: <87efx6tnbr.fsf@notabene.neil.brown.name>

On Thu, 2017-04-06 at 10:02 +1000, NeilBrown wrote:
> On Thu, Apr 06 2017, Jeff Layton wrote:
> 
> > On Tue, 2017-04-04 at 10:09 -0700, Matthew Wilcox wrote:
> > > On Tue, Apr 04, 2017 at 12:25:46PM -0400, Jeff Layton wrote:
> > > > That said, I think giving more specific errors where we can is useful.
> > > > When your program is erroring out and writing 'I/O error' to the logs,
> > > > then how much time will your admins burn before they figure out that it
> > > > really failed because the filesystem was full?
> > > 
> > > df is one of the first things I check ... a few years ago, I also learned
> > > to check df -i ... ;-)
> > > 
> > > Anyway, given the decision to simply report the last error lets us do this
> > > implementation:
> > > 
> > > void filemap_set_wb_error(struct address_space *mapping, int err)
> > > {
> > > 	struct inode *inode = mapping->host;
> > > 	unsigned int wb_err;
> > > 
> > > 	if (!err)
> > > 		return;
> > > 	/*
> > > 	 * This should be called with the error code that we want to return
> > > 	 * on fsync. Thus, it should always be <= 0.
> > > 	 */
> > > 	WARN_ON(err > 0 || err < -MAX_ERRNO);
> > > 
> > > 	spin_lock(&inode->i_lock);
> > > 	wb_err = ((mapping->wb_err & ~MAX_ERRNO) + (1 << 12)) | -err;
> > > 	WRITE_ONCE(mapping->wb_err, wb_err);
> > > 	spin_unlock(&inode->i_lock);
> > > }
> > > 
> > 
> > I like this idea of being able to store arbitrary error codes there.
> > That should be used judiciously of course, but we already allow
> > returning arbitrary errors via the ->fsync op anyway.
> > 
> > I'll plan to incorporate something like that into the next set (with
> > judicious comments and constants).
> > 
> > One question...is the i_lock the right way to protect this? I think we
> > could do this locklessly too (cmpxchg in a loop, for instance). I'm not
> > worried about performance here -- it's just nice to be able to call
> > simple stuff like this without worrying about locking.
> 
> I like the idea of using cmpxchg.
> 
> 
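
To make the "cmpxchg in a loop" idea concrete, here is a rough userspace
sketch using C11 atomics in place of the kernel's cmpxchg(). It keeps the
same layout as the function above (errno in the low 12 bits, a 20-bit
counter above it). The names wb_err_state, wb_err_set, wb_err_check and
WB_ERR_INC are made up for illustration, and having the check update the
caller's sampled value is my assumption; the version quoted below does
not do that.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_ERRNO   4095u        /* low 12 bits hold the errno */
#define WB_ERR_INC  (1u << 12)   /* one counter step (hypothetical name) */

/* Userspace stand-in for the wb_err field in struct address_space. */
struct wb_err_state {
	_Atomic uint32_t wb_err;
};

/* Record a writeback error (a negative errno) without taking a lock. */
static void wb_err_set(struct wb_err_state *st, int err)
{
	uint32_t old, new;

	if (err >= 0 || err < -(int)MAX_ERRNO)
		return;

	old = atomic_load(&st->wb_err);
	do {
		/* Bump the counter and replace the errno bits. */
		new = ((old & ~MAX_ERRNO) + WB_ERR_INC) | (uint32_t)-err;
		/* On failure, 'old' is reloaded and we recompute 'new'. */
	} while (!atomic_compare_exchange_weak(&st->wb_err, &old, new));
}

/*
 * Return 0 if nothing was recorded since '*since' was sampled, else the
 * most recent error. Updating '*since' here is an assumption of this
 * sketch, so that the same error is not reported twice to one caller.
 */
static int wb_err_check(struct wb_err_state *st, uint32_t *since)
{
	uint32_t cur = atomic_load(&st->wb_err);

	if (cur == *since)
		return 0;
	*since = cur;
	return -(int)(cur & MAX_ERRNO);
}
```

A caller would sample the value once (at open time, say), call
wb_err_set() from the I/O completion path, and call wb_err_check() from
fsync. No lock is needed because a racing update just makes the
compare-and-exchange fail and retry.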
> > 
> > > int filemap_report_wb_error(struct file *file)
> > > {
> > > > 	struct address_space *mapping = file->f_mapping;
> > > 	unsigned int wb_err = READ_ONCE(mapping->wb_err);
> > > 
> > > 	if (file->f_wb_err == wb_err)
> > > 		return 0;
> > > 	return -(wb_err & 4095);
> > > }
> > > 
> > > That only gives us 20 bits of counter, but I think that's enough.
> > 
> > 2^20 is 1048576, which seems a little small to me.
> > 
> > We may end up bumping the counter on every failed I/O. How fast can we
> > generate 1M failed I/Os? :)
> 
> Do we need to count all of those if no-one sees them?
> i.e. use one bit to say "this error hasn't been seen".
> If an error occurs which has the same error code as is currently stored,
> and the bit is set, don't make a change.  Otherwise make the change,
> inc the counter, set the bit.
> When checking for an error, if the bit is set, clear it first.
> Then you can count 500,000 errors-returned-to-some-thread, which is
> probably enough.
> 

Yeah, that seems like it might be a good idea if we want to stick to a
small value here.
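
For reference, here is how I read that scheme, as a rough userspace
sketch with C11 atomics. The bit layout is an assumption on my part:
errno in bits 0-11, the "unseen" flag in bit 12, and a 19-bit counter in
bits 13-31 (which is where the ~500,000 figure comes from). All of the
names (WB_UNSEEN, wb_err_record, wb_err_report and so on) are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define WB_ERRNO_MASK  0xfffu       /* bits 0-11: errno */
#define WB_UNSEEN      (1u << 12)   /* bit 12: nobody has looked yet */
#define WB_CTR_INC     (1u << 13)   /* bits 13-31: 19-bit counter */

/* Record an error; skip the update if the same error is still unseen. */
static void wb_err_record(_Atomic uint32_t *wb_err, int err)
{
	uint32_t code = (uint32_t)-err;
	uint32_t old = atomic_load(wb_err);
	uint32_t new;

	if (err >= 0 || code > WB_ERRNO_MASK)
		return;

	do {
		/* Same code and still unseen: nothing new to report. */
		if ((old & WB_UNSEEN) && (old & WB_ERRNO_MASK) == code)
			return;
		/* Otherwise bump the counter, store the code, set the flag. */
		new = ((old & ~(WB_ERRNO_MASK | WB_UNSEEN)) + WB_CTR_INC)
			| WB_UNSEEN | code;
	} while (!atomic_compare_exchange_weak(wb_err, &old, new));
}

/*
 * Clear the unseen flag first, then compare against the caller's
 * sample. Returns 0 if nothing new, else the latest error. '*since'
 * holds a value with the flag already masked off.
 */
static int wb_err_report(_Atomic uint32_t *wb_err, uint32_t *since)
{
	uint32_t cur = atomic_fetch_and(wb_err, ~WB_UNSEEN) & ~WB_UNSEEN;

	if (cur == *since)
		return 0;
	*since = cur;
	return -(int)(cur & WB_ERRNO_MASK);
}
```

With this, a storm of identical failures that no one checks for consumes
a single counter value, and the counter only advances once some thread
has actually observed the previous error (or the error code changes).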

> > 
> > 2^52 however is 4503599627370496 (4P I/Os or so) ... that might take a
> > little longer to overflow. Is it worth the cost here to ensure that
> > this won't occur?
> > 
> > Actually...we could put this field in the inode instead of the mapping.
> > I know we've traditionally tracked this in the mapping, but is that
> > required here?
> 
> What if the address_space is shared by two inodes?  That is the whole
> point of the i_mapping pointer.  This would make it harder for the
> "other" inode to get the error.
> (Does anyone actually use fs/coda ??
>  Actually, block devices use i_mapping too.
>  If two block device inodes have the same major/minor number, they
>  end up having i_mapping point to the same place)
> 

Ahh ok, that makes sense. I'll plan to keep it part of the mapping.

> If you are concerned about space in 'struct address_space', just prune
> some wastage.
> The "host" field brings no value.  It is only ever assigned in
> inode_init_always():
> 
> 	struct address_space *const mapping = &inode->i_data;
> ......
> 	mapping->host = inode;
> 
> So you could change all references to use
>    container_of(mapping, struct inode, i_data)
> 

That might be a nice cleanup, but I think I'll leave that to be done
separately.
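
For anyone following along, the container_of() approach would look
roughly like the userspace sketch below. The two structs are cut-down
stand-ins rather than the real kernel definitions, mapping_host() is a
hypothetical helper name, and the macro is a simplified version without
the kernel's type checking. It is only valid when the mapping really is
the i_data embedded in an inode.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): step back from a member to its container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-ins for the kernel structures. */
struct address_space {
	unsigned int wb_err;
};

struct inode {
	unsigned long i_ino;
	struct address_space i_data;	/* embedded, not a pointer */
};

/* Hypothetical replacement for reading mapping->host. */
static struct inode *mapping_host(struct address_space *mapping)
{
	return container_of(mapping, struct inode, i_data);
}
```

The pointer arithmetic is resolved at compile time to a constant offset,
so this costs no more than the load from mapping->host it would replace.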

-- 
Jeff Layton <jlayton@redhat.com>

