From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Andres Freund <andres@anarazel.de>
Cc: Andreas Dilger <adilger@dilger.ca>,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>,
Jeff Layton <jlayton@redhat.com>,
"Joshua D. Drake" <jd@commandprompt.com>
Subject: Re: fsync() errors is unsafe and risks data loss
Date: Thu, 12 Apr 2018 17:52:52 -0400
Message-ID: <20180412215252.GW2801@thunk.org>
In-Reply-To: <20180412195536.4nunjt5li2xb4rpw@alap3.anarazel.de>
On Thu, Apr 12, 2018 at 12:55:36PM -0700, Andres Freund wrote:
>
> Any pointers to the underlying netlink mechanism? If we can force
> postgres to kill itself when such an error is detected (via a dedicated
> monitoring process), I'd personally be happy enough. It'd be nicer if
> we could associate that knowledge with particular filesystems etc
> (which'd possibly be hard through dm etc?), but this'd be much better than
> nothing.
Yeah, sorry, it never got upstreamed. It's not really all that
complicated, it was just that there were some other folks who wanted
to do something similar, and there was a round of bike-shedding
several years ago, and nothing ever went upstream. Part of the
problem was that our original scheme sent up information about file
system-level corruption reports --- e.g., those stemming from calls to
ext4_error() --- and lots of people had different ideas about how to
get all of the possible information up in some structured format.
(Think something like uerf from Digital's OSF/1.)
We did something *really* simple/stupid. We just sent essentially an
ASCII text string out the netlink socket. That's because what we were
doing before was essentially scraping the output of dmesg
(e.g. /dev/kmsg).
That's actually probably the simplest thing to do, and it has the
advantage that it will work even on ancient enterprise kernels that PG
users are likely to want to use. So you will need to implement the
dmesg text scraper anyway, and that's probably good enough for most
use cases.
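To make the scraper idea concrete, here is a minimal sketch in Python. It
assumes the record layout documented for /dev/kmsg
("prio,seq,timestamp_usec,flags;message"). The names parse_kmsg,
looks_like_io_error and follow_kmsg are made up for illustration, and the
substring test is a deliberately crude placeholder, not what our scheme used.

```python
import os
from typing import Iterator, NamedTuple, Optional


class KmsgRecord(NamedTuple):
    level: int  # syslog severity, 0 (emerg) .. 7 (debug)
    seq: int    # kernel sequence number
    usec: int   # timestamp in microseconds since boot
    text: str   # the human-readable message


def parse_kmsg(line: str) -> Optional[KmsgRecord]:
    """Parse one /dev/kmsg record: 'prio,seq,usec,flags[,...];message'."""
    header, sep, text = line.rstrip("\n").partition(";")
    fields = header.split(",")
    if not sep or len(fields) < 3:
        return None  # continuation line (' KEY=value') or unknown format
    try:
        prio, seq, usec = int(fields[0]), int(fields[1]), int(fields[2])
    except ValueError:
        return None
    # The low three bits of the priority are the severity level.
    return KmsgRecord(prio & 7, seq, usec, text)


def looks_like_io_error(rec: KmsgRecord) -> bool:
    # Assumption: error severity (<= 3) plus a crude substring match.
    return rec.level <= 3 and "I/O error" in rec.text


def follow_kmsg(path: str = "/dev/kmsg") -> Iterator[KmsgRecord]:
    """Yield records as the kernel logs them (usually needs privileges)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            try:
                data = os.read(fd, 8192)  # one read() returns one record
            except BrokenPipeError:
                continue  # EPIPE: older records were overwritten; keep going
            if not data:
                return  # only happens if path is not a real kmsg device
            first = data.decode("utf-8", "replace").split("\n", 1)[0]
            rec = parse_kmsg(first)
            if rec is not None:
                yield rec
    finally:
        os.close(fd)
```

A monitoring process could loop over follow_kmsg() and signal the database
whenever looks_like_io_error() returns true; what to do on a match is a
policy decision left out of the sketch.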
> The problem really isn't about *recovering* from disk errors. *Knowing*
> about them is the crucial part. We do not want to give back clients the
> information that an operation succeeded, when it actually didn't. There
> could be improvements above that, but as long as it's guaranteed that
> "we" get the error (rather than just some kernel log we don't have
> access to, which looks different due to config etc), it's ok. We can
> throw our hands up in the air and give up.
Right, it's a little challenging because the actual regexps you would
need to use do vary from device driver to device driver. Fortunately
nearly everything is a SCSI/SATA device these days, so there isn't
_that_ much variability.
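As a rough illustration of what such a pattern table might look like, here
is a small Python sketch. The three regexps are assumptions modelled on
messages commonly seen from the block layer, the buffer cache and ext4; the
exact wording differs between kernel versions and drivers, so treat them as
a starting point to be checked against real logs. The name failing_device
is hypothetical.

```python
import re
from typing import Optional

# Assumption: illustrative patterns only; verify against your kernel's logs.
PATTERNS = [
    # e.g. "blk_update_request: I/O error, dev sda, sector 2048"
    re.compile(r"I/O error, dev (?P<dev>[^\s,]+), sector \d+"),
    # e.g. "Buffer I/O error on dev sda1, logical block 123, ..."
    re.compile(r"Buffer I/O error on dev(?:ice)? (?P<dev>[^\s,]+), logical block \d+"),
    # e.g. "EXT4-fs error (device sda1): ext4_find_entry:1436: ..."
    re.compile(r"EXT4-fs error \(device (?P<dev>[^)]+)\)"),
]


def failing_device(message: str) -> Optional[str]:
    """Return the device named in an error message, or None if no match."""
    for pattern in PATTERNS:
        match = pattern.search(message)
        if match:
            return match.group("dev")
    return None
```

Pulling out the device name is what would let a monitor tie an error back
to a particular file system, although stacked devices such as dm make that
mapping less direct.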
> Yea, agreed on all that. I don't think anybody actually involved in
> postgres wants to do anything like that. Seems far outside of postgres'
> remit.
Some people on the pg-hackers list were talking about wanting to retry
the fsync() and hoping that would cause the write to somehow succeed.
It's *possible* that might help, but it's not likely to be helpful in
my experience.
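For contrast with a retry loop, here is a minimal Python sketch of the
"give up" policy from the quoted text: a failed fsync() is reported once
and never retried. DurabilityError and fsync_or_give_up are hypothetical
names, and what the caller does next (abort, crash recovery) is an
assumption outside the sketch.

```python
import os


class DurabilityError(RuntimeError):
    """fsync() failed; the caller is expected to stop rather than retry."""


def fsync_or_give_up(fd: int) -> None:
    """Call fsync() exactly once and turn any failure into a hard error."""
    try:
        os.fsync(fd)
    except OSError as exc:
        # No retry here: a second fsync() returning success would not
        # prove that the data from the failed attempt reached the disk.
        raise DurabilityError(f"fsync failed on fd {fd}: {exc}") from exc
```

A caller would let DurabilityError propagate and refuse to acknowledge the
operation to the client.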
Cheers,
- Ted