linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Kevin Wolf <kwolf@redhat.com>, Rik van Riel <riel@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	lsf-pc@lists.linux-foundation.org,
	Ric Wheeler <rwheeler@redhat.com>
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] I/O error handling and fsync()
Date: Wed, 11 Jan 2017 10:47:29 +0100	[thread overview]
Message-ID: <20170111094729.GH16116@quack2.suse.cz> (raw)
In-Reply-To: <20170111050356.ldlx73n66zjdkh6i@thunk.org>

On Wed 11-01-17 00:03:56, Ted Tso wrote:
> A couple of thoughts.
> 
> First of all, one of the reasons why this probably hasn't been
> addressed for so long is because programs who really care about issues
> like this tend to use Direct I/O, and don't use the page cache at all.
> And perhaps this is an option open to qemu as well?
> 
> Secondly, one of the reasons why we mark the page clean is because we
> didn't want a failing disk to memory to be trapped with no way of
> releasing the pages.  For example, if a user plugs in a USB
> thumbstick, writes to it, and then rudely yanks it out before all of
> the pages have been writeback, it would be unfortunate if the dirty
> pages can only be released by rebooting the system.
> 
> So an approach that might work is fsync() will keep the pages dirty
> --- but only while the file descriptor is open.  This could either be
> the default behavior, or something that has to be specifically
> requested via fcntl(2).  That way, as soon as the process exits (at
> which point it will be too late for it do anything to save the
> contents of the file) we also release the memory.  And if the process
> gets OOM killed, again, the right thing happens.  But if the process
> wants to take emergency measures to write the file somewhere else, it
> knows that the pages won't get lost until the file gets closed.

Well, as Neil pointed out, the problem is that once the data hits page
cache, we lose the association with a file descriptor. So for example
background writeback or sync(2) can find the dirty data and try to write
it, get EIO, and then you have to do something about it because you don't
know whether fsync(2) is coming or not.

That being said if we'd just keep the pages which failed write out dirty,
the system will eventually block all writers in balance_dirty_pages() and
at that point it is IMO a policy decision (probably per device or per fs)
whether you should just keep things blocked waiting for better times or
whether you just want to start discarding dirty data on a failed write.
Now discarding data that failed to write only when we are close to dirty
limit (or after some timeout or whatever) has a disadvantage that it is
not easy to predict from user POV so I'm not sure if we want to go down
that path. But I can see two options making sense:

1) Just hold onto data and wait indefinitely. Possibly provide a way for
   userspace to forcibly unmount a filesystem in such state.

2) Do what we do now.
 
								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-01-11  9:47 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-10 16:02 [LSF/MM TOPIC] I/O error handling and fsync() Kevin Wolf
2017-01-11  0:41 ` NeilBrown
2017-01-13 11:09   ` Kevin Wolf
2017-01-13 14:21     ` Theodore Ts'o
2017-01-13 16:00       ` Kevin Wolf
2017-01-13 22:28         ` NeilBrown
2017-01-14  6:18           ` Darrick J. Wong
2017-01-16 12:14           ` [Lsf-pc] " Jeff Layton
2017-01-22 22:44             ` NeilBrown
2017-01-22 23:31               ` Jeff Layton
2017-01-23  0:21                 ` Theodore Ts'o
2017-01-23 10:09                   ` Kevin Wolf
2017-01-23 12:10                     ` Jeff Layton
2017-01-23 17:25                       ` Theodore Ts'o
2017-01-23 17:53                         ` Chuck Lever
2017-01-23 22:40                         ` Jeff Layton
2017-01-23 22:35                     ` Jeff Layton
2017-01-23 23:09                       ` Trond Myklebust
2017-01-24  0:16                         ` NeilBrown
2017-01-24  0:46                           ` Jeff Layton
2017-01-24 21:58                             ` NeilBrown
2017-01-25 13:00                               ` Jeff Layton
2017-01-30  5:30                                 ` NeilBrown
2017-01-24  3:34                           ` Trond Myklebust
2017-01-25 18:35                             ` Theodore Ts'o
2017-01-26  0:36                               ` NeilBrown
2017-01-26  9:25                                 ` Jan Kara
2017-01-26 22:19                                   ` NeilBrown
2017-01-27  3:23                                     ` Theodore Ts'o
2017-01-27  6:03                                       ` NeilBrown
2017-01-30 16:04                                       ` Jan Kara
2017-01-13 18:40     ` Al Viro
2017-01-13 19:06       ` Kevin Wolf
2017-01-11  5:03 ` Theodore Ts'o
2017-01-11  9:47   ` Jan Kara [this message]
2017-01-11 15:45     ` [Lsf-pc] " Theodore Ts'o
2017-01-11 10:55   ` Chris Vest
2017-01-11 11:40   ` Kevin Wolf
2017-01-13  4:51     ` NeilBrown
2017-01-13 11:51       ` Kevin Wolf
2017-01-13 21:55         ` NeilBrown
2017-01-11 12:14   ` Chris Vest

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170111094729.GH16116@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=hch@infradead.org \
    --cc=kwolf@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lsf-pc@lists.linux-foundation.org \
    --cc=riel@redhat.com \
    --cc=rwheeler@redhat.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).