From: Andreas Dilger <adilger@clusterfs.com>
To: Alex Tomas <alex@clusterfs.com>
Cc: Girish Shilamkar <girish@clusterfs.com>,
Ext4 Mailing List <linux-ext4@vger.kernel.org>,
Theodore Tso <tytso@mit.edu>
Subject: Re: Updated patches for journal checksums.
Date: Tue, 19 Jun 2007 13:06:46 -0600
Message-ID: <20070619190645.GT5181@schatzie.adilger.int>
In-Reply-To: <46779D9D.2090109@clusterfs.com>
On Jun 19, 2007 13:10 +0400, Alex Tomas wrote:
> Andreas Dilger wrote:
> >I _think_ Alex is asking "what happens if during a transaction undergoing
> >checkpoint of blocks to filesystem (not the last one in the journal) is
> >interrupted by a crash and upon restart the partially-checkpointed
> >transaction is found to have a checksum error?"
>
> yup, thanks for clarification.
>
> >>>what do we do if transaction in the journal is found with wrong
> >>>checksum? leave partial transaction in-place?
> >>The sanity of the transaction is checked in PASS_SCAN. And if checksum
> >>is found to be incorrect for nth transaction then last transaction which
> >>is written to disk is (n - 1).
> >
> >The recovery.c code (AFAIK) does not do replay for any transaction that
> >does not have a valid checksum, or transactions beyond that. If the
> >bad transaction had already started checkpoint (i.e. isn't the last
> >committed transaction) then the journal _should_ return an error up to
> >the filesystem, so it can call ext4_error() at startup. For e2fsck
> >(which normally does journal replay & recovery) it can do a full
> >filesystem check at this point.
>
> hmm. it actually can be last transaction (following no activity?)

In the current implementation, if you are running with sync commits
(i.e. the current 2-phase write {data blocks} {wait} {commit block}
{wait}) and you detect a checksum error in the last transaction, this
SHOULD result in the filesystem being marked in error (I haven't
verified that this is in the most recent version of the code). This is
what JBD_FEATURE_COMPAT_CHECKSUM does - it only adds checksums to the
commit blocks.

However, in the case of async commits (JBD_FEATURE_INCOMPAT_ASYNC_COMMIT
- write {data blocks} {commit block} {wait}, which is what gives the
performance gain) there isn't any way to tell the difference between a
checksum error in the last transaction and the case of an interrupted
commit. In the case of an interrupted commit there would not have been
any checkpointing into the filesystem yet, so this case shouldn't matter.
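
As a rough illustration of the cases above, the decision that recovery
faces when it scans a transaction could be sketched as below. This is
not the recovery.c code; the enum, struct, and function names are all
made up for the example, and the three booleans stand in for state the
real code would derive from the journal.

```c
#include <stdbool.h>

/* Hypothetical outcomes for one scanned transaction. */
enum sketch_action {
    SKETCH_REPLAY,      /* checksum matches: replay this transaction */
    SKETCH_STOP_CLEAN,  /* treat as an interrupted commit: stop, no error */
    SKETCH_STOP_ERROR,  /* stop and report an error up to the filesystem */
};

/* Hypothetical summary of what the scan pass knows about a transaction. */
struct sketch_txn {
    bool checksum_ok;   /* commit block checksum matched */
    bool is_last;       /* last committed transaction in the journal */
    bool async_commit;  /* journal uses the async commit feature */
};

static enum sketch_action sketch_classify(const struct sketch_txn *t)
{
    if (t->checksum_ok)
        return SKETCH_REPLAY;

    /* A bad transaction that is not the last one may already have
     * started checkpoint, so the filesystem should be told. */
    if (!t->is_last)
        return SKETCH_STOP_ERROR;

    /* Sync commit: the commit block is only written after the data
     * blocks are on disk, so a mismatch means real corruption. */
    if (!t->async_commit)
        return SKETCH_STOP_ERROR;

    /* Async commit: a mismatch in the last transaction cannot be told
     * apart from an interrupted commit, so it is treated as one. */
    return SKETCH_STOP_CLEAN;
}
```

The last branch is the ambiguous one: the sketch has no way to notice a
finished async commit that was later corrupted.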

This leaves only the "last transaction, finished async commit, started
checkpoint, crash, corruption, checksum error" case. I agree this could
cause data corruption. One possibility is to limit the corruption to
at most a single block by adding checksums to the transaction descriptor
blocks (which already hold the transaction ID to verify they are not
stale) that also cover the block numbers. The descriptor blocks would
seem to be the only place where we could get error "fan out", so this
would limit the corruption to the block(s) that were actually corrupted.
That wouldn't be any different from having random corruption of disk
blocks inside the fs. This would probably need a separate INCOMPAT flag.
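
To make the descriptor checksum idea concrete, here is a minimal sketch
of a checksum that covers the transaction ID and the block numbers. It
is only an illustration: the struct layout, the names, and the choice
of a plain bitwise CRC32 are assumptions for the example, not the JBD
on-disk format, and it hashes in-memory values where real code would
use fixed-endian on-disk fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_TAGS 8   /* arbitrary limit for the example */

/* Hypothetical in-memory view of a descriptor block. */
struct sketch_descriptor {
    uint32_t tid;                       /* transaction ID */
    uint32_t nr_tags;                   /* number of block numbers used */
    uint64_t blocknr[SKETCH_MAX_TAGS];  /* target filesystem blocks */
    uint32_t checksum;                  /* covers tid, nr_tags, blocknr[] */
};

/* Bitwise CRC32 (reflected polynomial 0xEDB88320); pass 0 to start,
 * or a previous result to continue over more data. */
static uint32_t sketch_crc32(uint32_t crc, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

static uint32_t sketch_descriptor_checksum(const struct sketch_descriptor *d)
{
    uint32_t crc = sketch_crc32(0, &d->tid, sizeof(d->tid));

    crc = sketch_crc32(crc, &d->nr_tags, sizeof(d->nr_tags));
    return sketch_crc32(crc, d->blocknr, d->nr_tags * sizeof(d->blocknr[0]));
}

/* A descriptor that fails this check would not be trusted to say where
 * its data blocks belong, which is what stops the error "fan out". */
static bool sketch_descriptor_valid(const struct sketch_descriptor *d)
{
    if (d->nr_tags > SKETCH_MAX_TAGS)
        return false;
    return d->checksum == sketch_descriptor_checksum(d);
}
```

With something like this, a corrupted block number is caught before
replay writes a data block to the wrong place in the filesystem.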
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.