From: Jeff Layton <jlayton@redhat.com>
To: NeilBrown <neilb@suse.com>, Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-ext4@vger.kernel.org, akpm@linux-foundation.org,
tytso@mit.edu, jack@suse.cz
Subject: Re: [RFC PATCH 0/4] fs: introduce new writeback error tracking infrastructure and convert ext4 to use it
Date: Thu, 06 Apr 2017 10:02:51 -0400
Message-ID: <1491487371.18658.22.camel@redhat.com>
In-Reply-To: <87efx6tnbr.fsf@notabene.neil.brown.name>
On Thu, 2017-04-06 at 10:02 +1000, NeilBrown wrote:
> On Thu, Apr 06 2017, Jeff Layton wrote:
>
> > On Tue, 2017-04-04 at 10:09 -0700, Matthew Wilcox wrote:
> > > On Tue, Apr 04, 2017 at 12:25:46PM -0400, Jeff Layton wrote:
> > > > That said, I think giving more specific errors where we can is useful.
> > > > When your program is erroring out and writing 'I/O error' to the logs,
> > > > then how much time will your admins burn before they figure out that it
> > > > really failed because the filesystem was full?
> > >
> > > df is one of the first things I check ... a few years ago, I also learned
> > > to check df -i ... ;-)
> > >
> > > Anyway, given the decision to simply report the last error lets us do this
> > > implementation:
> > >
> > > void filemap_set_wb_error(struct address_space *mapping, int err)
> > > {
> > > 	struct inode *inode = mapping->host;
> > > 	unsigned int wb_err;
> > >
> > > 	if (!err)
> > > 		return;
> > > 	/*
> > > 	 * This should be called with the error code that we want to return
> > > 	 * on fsync. Thus, it should always be <= 0.
> > > 	 */
> > > 	WARN_ON(err > 0 || err < -MAX_ERRNO);
> > >
> > > 	spin_lock(&inode->i_lock);
> > > 	wb_err = ((mapping->wb_err & ~MAX_ERRNO) + (1 << 12)) | -err;
> > > 	WRITE_ONCE(mapping->wb_err, wb_err);
> > > 	spin_unlock(&inode->i_lock);
> > > }
> > >
> >
> > I like this idea of being able to store arbitrary error codes there.
> > That should be used judiciously of course, but we already allow
> > returning arbitrary errors via the ->fsync op anyway.
> >
> > I'll plan to incorporate something like that into the next set (with
> > judicious comments and constants).
> >
> > One question...is the i_lock the right way to protect this? I think we
> > could do this locklessly too (cmpxchg in a loop, for instance). I'm not
> > worried about performance here -- it's just nice to be able to call
> > simple stuff like this without worrying about locking.
>
> I like the idea of using cmpxchg.
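For illustration, here is a minimal userspace sketch of what the lockless variant could look like. It uses C11 atomics in place of the kernel's cmpxchg(), and assert() in place of WARN_ON(). The struct wb_err_state wrapper, the wb_err_set() name, and the WB_ERR_SHIFT constant are hypothetical names made up for this sketch; only the bit layout (errno in the low 12 bits, counter above them) comes from the code quoted above.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_ERRNO	4095u	/* low 12 bits hold the (positive) errno */
#define WB_ERR_SHIFT	12	/* hypothetical name: counter starts at bit 12 */

/* Hypothetical stand-in for the wb_err field of struct address_space. */
struct wb_err_state {
	_Atomic unsigned int wb_err;
};

/* Record a writeback error without taking any lock. */
static void wb_err_set(struct wb_err_state *st, int err)
{
	unsigned int old, new;

	if (!err)
		return;
	/* Userspace stand-in for the WARN_ON() range check. */
	assert(err < 0 && err >= -(int)MAX_ERRNO);

	old = atomic_load(&st->wb_err);
	do {
		/* Bump the counter and replace the stored errno. */
		new = ((old & ~MAX_ERRNO) + (1u << WB_ERR_SHIFT)) |
		      (unsigned int)-err;
		/* On failure, 'old' is refreshed and we recompute 'new'. */
	} while (!atomic_compare_exchange_weak(&st->wb_err, &old, new));
}
```

Starting from zero, calling wb_err_set(&st, -EIO) and then wb_err_set(&st, -ENOSPC) leaves the counter at 2 with ENOSPC in the low bits, which is the same result the i_lock version would give.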
>
>
> >
> > > int filemap_report_wb_error(struct file *file)
> > > {
> > > 	struct address_space *mapping = file->f_mapping;
> > > 	unsigned int wb_err = READ_ONCE(mapping->wb_err);
> > >
> > > 	if (file->f_wb_err == wb_err)
> > > 		return 0;
> > > 	return -(wb_err & 4095);
> > > }
> > >
> > > That only gives us 20 bits of counter, but I think that's enough.
> >
> > 2^20 is 1048576, which seems a little small to me.
> >
> > We may end up bumping the counter on every failed I/O. How fast can we
> > generate 1M failed I/Os? :)
>
> Do we need to count all of those if no-one sees them?
> i.e. use one bit to say "this error hasn't been seen".
> If an error occurs which has the same error code as is currently stored,
> and the bit is set, don't make a change. Otherwise make the change,
> inc the counter, set the bit.
> When checking for an error, if the bit is set, clear it first.
> Then you can count 500,000 errors-returned-to-some-thread, which is
> probably enough.
>
Yeah, that seems like it might be a good idea if we want to stick to a
small value here.
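A rough userspace sketch of that "not yet seen" bit, again with C11 atomics standing in for cmpxchg(). Everything named here is an assumption for illustration: the wb_err_t typedef, the wb_err_record()/wb_err_sample()/wb_err_check() helpers, and the choice of bit 12 for the flag, which leaves a 19-bit counter in bits 13-31. The per-file cursor update in wb_err_check() is also an assumption; the report function quoted earlier does not show one.

```c
#include <assert.h>
#include <stdatomic.h>

#define WB_ERRNO_MASK	4095u		/* bits 0-11: last errno */
#define WB_UNSEEN	(1u << 12)	/* bit 12: nobody has checked yet */
#define WB_CTR_INC	(1u << 13)	/* bits 13-31: 19-bit counter */

typedef _Atomic unsigned int wb_err_t;	/* hypothetical type name */

/* Record an error; skip the update if the same error is still unseen. */
static void wb_err_record(wb_err_t *wb, int err)
{
	unsigned int old, new;

	if (!err)
		return;
	assert(err < 0 && err >= -(int)WB_ERRNO_MASK);

	old = atomic_load(wb);
	do {
		if ((old & WB_UNSEEN) &&
		    (old & WB_ERRNO_MASK) == (unsigned int)-err)
			return;	/* same code, not yet observed: no change */
		new = ((old & ~(WB_ERRNO_MASK | WB_UNSEEN)) + WB_CTR_INC) |
		      WB_UNSEEN | (unsigned int)-err;
	} while (!atomic_compare_exchange_weak(wb, &old, new));
}

/* Take a cursor (e.g. at open time); the flag is masked out of it. */
static unsigned int wb_err_sample(wb_err_t *wb)
{
	return atomic_load(wb) & ~WB_UNSEEN;
}

/*
 * Report an error recorded since *since was taken, then advance the
 * cursor. The caller is assumed to serialize access to *since.
 */
static int wb_err_check(wb_err_t *wb, unsigned int *since)
{
	unsigned int old = atomic_load(wb);
	unsigned int cur;

	/* Clear the "unseen" flag first, retrying if the value changed. */
	while ((old & WB_UNSEEN) &&
	       !atomic_compare_exchange_weak(wb, &old, old & ~WB_UNSEEN))
		;

	cur = old & ~WB_UNSEEN;
	if (cur == *since)
		return 0;
	*since = cur;
	return -(int)(cur & WB_ERRNO_MASK);
}
```

With this layout, a storm of identical failures that nobody has checked for costs one counter increment rather than one per failed I/O. One consequence of masking the flag in wb_err_sample() is that a cursor taken after an unseen error will not report that error; whether that is the desired semantic is a separate question.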
> >
> > 2^52 however is 4503599627370496 (4P I/Os or so) ... that might take a
> > little longer to overflow. Is it worth the cost here to ensure that
> > this won't occur?
> >
> > Actually...we could put this field in the inode instead of the mapping.
> > I know we've traditionally tracked this in the mapping, but is that
> > required here?
>
> What if the address_space is shared by two inodes? That is the whole
> point of the i_mapping pointer. This would make it harder for the
> "other" inode to get the error.
> (Does anyone actually use fs/coda ??
> Actually, block devices use i_mapping too.
> If two block device inodes have the same major/minor number, they
> end up having i_mapping point to the same place)
>
Ahh ok, that makes sense. I'll plan to keep it part of the mapping.
> If you are concerned about space in 'struct address_space', just prune
> some wastage.
> The "host" field brings no value. It is only ever assigned in
> inode_init_always():
>
> struct address_space *const mapping = &inode->i_data;
> ......
> mapping->host = inode;
>
> So you could change all references to use
> container_of(mapping, struct inode, i_data)
>
That might be a nice cleanup, but I think I'll leave that to be done
separately.
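For reference, a small userspace sketch of the container_of() idea. The struct definitions are cut-down, hypothetical stand-ins for the kernel's struct inode and struct address_space, and mapping_host() is a made-up helper name. It relies on the premise stated above that every address_space is the i_data of some inode.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel macro, without the type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-ins for the kernel structures. */
struct address_space {
	unsigned int wb_err;
};

struct inode {
	unsigned long i_ino;
	struct address_space *i_mapping;	/* may point at another inode's i_data */
	struct address_space i_data;		/* embedded mapping */
};

/* Hypothetical replacement for reading mapping->host. */
static struct inode *mapping_host(struct address_space *mapping)
{
	assert(mapping != NULL);
	return container_of(mapping, struct inode, i_data);
}
```

If two inodes share a mapping (inode B's i_mapping pointing at inode A's i_data), mapping_host() returns A for both, which matches what a host pointer assigned only in inode_init_always() would hold.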
--
Jeff Layton <jlayton@redhat.com>