* Journal async commit broken for data=ordered?
From: Jan Kara @ 2012-02-14 15:55 UTC
To: linux-ext4; +Cc: Ted Tso
Hello,
I've just realized that JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT breaks
guarantees of data=ordered mode in ext4. The problem is that async commit
code assumes that when a checksum of a transaction in the journal matches,
all necessary data is on disk. This is true for metadata but need not be so
for data - the whole transaction may be correctly on permanent storage
while some data is still sitting in the drive's caches. Thus if a power failure
happens at that moment, we have broken guarantees of data=ordered mode.
Seeing that async commit code isn't used by default anyway (I remember
there used to be some problems with it), shouldn't we just rip it out?
Honza
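The failure mode described above can be illustrated with a toy model. This is
only a sketch under stated assumptions, not JBD2 code: `ToyDisk`, `recover`,
the block names, and the use of CRC32 are all hypothetical stand-ins. The
model assumes a drive with a volatile write cache that may persist cached
blocks in any order.

```python
import zlib


class ToyDisk:
    """Hypothetical disk model: writes land in a volatile cache first."""

    def __init__(self):
        self.media = {}  # blocks that survive a power failure
        self.cache = {}  # blocks accepted by the drive but still volatile

    def write(self, block, payload):
        self.cache[block] = payload

    def writeback(self, block):
        # The drive is free to persist cached blocks in any order.
        self.media[block] = self.cache.pop(block)

    def power_failure(self):
        # Everything still in the cache is lost.
        self.cache.clear()


def checksum(metadata):
    return zlib.crc32(metadata)


def recover(disk):
    """Replay the transaction if its checksum matches (toy recovery)."""
    metadata = disk.media.get("journal:metadata")
    commit = disk.media.get("journal:commit")
    if metadata is None or commit is None:
        return False
    if checksum(metadata) != commit:
        return False
    disk.media["inode"] = metadata  # metadata now points at the data block
    return True


disk = ToyDisk()
disk.media["data"] = b"old contents"

# data=ordered: file data is submitted before the transaction commits...
disk.write("data", b"new contents")
# ...but in this model nothing forces it out of the drive cache first.
metadata = b"inode -> data block, size=12"
disk.write("journal:metadata", metadata)
disk.write("journal:commit", checksum(metadata))

# The drive happens to persist the journal blocks before the data block.
disk.writeback("journal:metadata")
disk.writeback("journal:commit")
disk.power_failure()

replayed = recover(disk)
print(replayed, disk.media["data"])
```

In this model the checksum matches and the transaction is replayed, yet the
data block on the media still holds the old contents, which is exactly what
data=ordered is supposed to rule out.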
* Re: Journal async commit broken for data=ordered?
From: Andreas Dilger @ 2012-02-14 20:56 UTC
To: Jan Kara; +Cc: linux-ext4, Ted Tso
On 2012-02-14, at 8:55 AM, Jan Kara wrote:
> I've just realized that JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT breaks
> guarantees of data=ordered mode in ext4. The problem is that async commit
> code assumes that when a checksum of a transaction in the journal matches,
> all necessary data is on disk. This is true for metadata but need not be so
> for data - the whole transaction may be correctly on permanent storage
> while some data is still sitting in the drive's caches. Thus if a power failure
> happens at that moment, we have broken guarantees of data=ordered mode.
> Seeing that async commit code isn't used by default anyway (I remember
> there used to be some problems with it), shouldn't we just rip it out?
A better long-term solution would be to submit the block IO in advance of
marking the metadata dirty, so the data is safe on disk before the journal
becomes involved. That would also avoid the problem of entangling the
journal commit latency with IO waiting to be flushed to disk.
Cheers, Andreas
* Re: Journal async commit broken for data=ordered?
From: Jan Kara @ 2012-02-15 15:57 UTC
To: Andreas Dilger; +Cc: Jan Kara, linux-ext4, Ted Tso
On Tue 14-02-12 13:56:30, Andreas Dilger wrote:
> On 2012-02-14, at 8:55 AM, Jan Kara wrote:
> > I've just realized that JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT breaks
> > guarantees of data=ordered mode in ext4. The problem is that async commit
> > code assumes that when a checksum of a transaction in the journal matches,
> > all necessary data is on disk. This is true for metadata but need not be so
> > for data - the whole transaction may be correctly on permanent storage
> > while some data is still sitting in the drive's caches. Thus if a power failure
> > happens at that moment, we have broken guarantees of data=ordered mode.
> > Seeing that async commit code isn't used by default anyway (I remember
> > there used to be some problems with it), shouldn't we just rip it out?
>
> A better long-term solution would be to submit the block IO in advance of
> marking the metadata dirty, so the data is safe on disk before the journal
> becomes involved.
Umm, I don't see how that would really make a difference. Unless you flush
the disk's caches, you can never be sure the data made it to disk. And you must
make sure the data is on disk before anyone has a chance to see the transaction
that writes this data fully written into the log.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
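The ordering requirement in the reply above (flush the data, and only then
let the commit become visible) has a rough user-space analogue. The sketch
below is not how JBD2 does it; `write_durably` and the file names are
hypothetical, and it assumes that `os.fsync` on the underlying filesystem and
device actually results in a drive cache flush, which depends on
configuration and hardware.

```python
import os
import tempfile


def write_durably(path, payload):
    """Write payload, then ask the kernel to flush it to stable storage."""
    with open(path, "wb") as f:
        f.write(payload)      # may only reach the page cache so far
        f.flush()             # push Python's buffer down to the kernel
        os.fsync(f.fileno())  # request a flush before returning


workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "data")
commit_path = os.path.join(workdir, "commit")

# Step 1: data first, flushed before any commit record exists.
write_durably(data_path, b"new contents")
# Step 2: only now write the record that declares the update complete.
write_durably(commit_path, b"committed: data")
```

If the flush in step 1 were skipped, a reader that finds the commit record
after a crash would have no guarantee that the data it refers to ever reached
the media.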