linux-ext4.vger.kernel.org archive mirror
From: "Theodore Ts'o" <tytso@mit.edu>
To: Ye Bin <yebin@huaweicloud.com>
Cc: adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
	jack@suse.cz, Ye Bin <yebin10@huawei.com>
Subject: Re: [PATCH v3] jbd2: avoid mount failed when commit block is partial submitted
Date: Wed, 19 Jun 2024 22:50:31 -0400
Message-ID: <20240620025031.GA1553731@mit.edu>
In-Reply-To: <20240425064515.836633-1-yebin@huaweicloud.com>

Apologies for not getting back to this until now; I was focused on
finalizing changes for the merge window, and then I was on vacation
for 3 or 4 weeks.

On Thu, Apr 25, 2024 at 02:45:15PM +0800, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
> 
> We encountered a problem that the file system could not be mounted in
> the power-off scenario. The analysis of the file system mirror shows that
> only part of the data is written to the last commit block.
> The valid data of the commit block is concentrated in the first sector.
> However, the data of the entire block is involved in the checksum calculation.
> For different hardware, the minimum atomic unit may be different.
> If the checksum of a committed block is incorrect, clear the data except the
> 'commit_header' and then calculate the checksum. If the checksum is correct,
> it is considered that the block is partially committed.

This makes a lot of sense; thanks for changing the patch to do this.
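
To make the check described above concrete, here is a minimal
user-space sketch. It is not the patch's code and not the jbd2
implementation. The names HDR_LEN, CSUM_OFF, CSUM_SEED, block_csum()
and classify_commit_block() are hypothetical, the header layout is a
simplified model, and the checksum is stored in host byte order for
brevity. It assumes that everything after the header in a fully
written commit block is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model constants; not taken from the jbd2 on-disk format. */
#define HDR_LEN   60u          /* bytes of the block that carry commit data */
#define CSUM_OFF  16u          /* offset of the stored 32-bit checksum */
#define CSUM_SEED 0xFFFFFFFFu  /* stand-in for the per-journal seed */
#define MAX_BLOCK 4096u

enum commit_state { COMMIT_OK, COMMIT_PARTIAL, COMMIT_BAD };

/* Bitwise CRC32C update (reflected polynomial 0x82F63B78), no final XOR. */
static uint32_t crc32c(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Checksum of the whole block, with the stored checksum field read as zero. */
static uint32_t block_csum(const unsigned char *blk, size_t len)
{
    const uint32_t zero = 0;
    uint32_t crc = crc32c(CSUM_SEED, blk, CSUM_OFF);

    crc = crc32c(crc, (const unsigned char *)&zero, sizeof(zero));
    return crc32c(crc, blk + CSUM_OFF + 4, len - CSUM_OFF - 4);
}

/*
 * First verify the block as read.  If that fails, zero everything
 * after the header in a scratch copy and verify again; a match means
 * only the tail of the block differs from what was meant to be written.
 */
static enum commit_state classify_commit_block(const unsigned char *blk,
                                               size_t len)
{
    unsigned char tmp[MAX_BLOCK];
    uint32_t stored;

    assert(len >= HDR_LEN && len <= MAX_BLOCK);
    memcpy(&stored, blk + CSUM_OFF, sizeof(stored));

    if (block_csum(blk, len) == stored)
        return COMMIT_OK;

    memcpy(tmp, blk, HDR_LEN);
    memset(tmp + HDR_LEN, 0, len - HDR_LEN);
    if (block_csum(tmp, len) == stored)
        return COMMIT_PARTIAL;

    return COMMIT_BAD;
}
```

In this model, stale bytes left in a later sector by a torn write
yield COMMIT_PARTIAL, while damage inside the header still yields
COMMIT_BAD.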

> However, if there are valid description/revoke blocks, it is
> considered that the data is abnormal and the log replay is stopped.

I'm not sure I understand your thinking behind this part of the patch,
though.  The descriptor/revoke blocks will have their own checksums,
and while I grant that it would be... highly unusual for the commit
block to be partially written as the result of a torn write, and then
for there to be subsequent valid descriptor or revoke blocks (which
would presumably be part of the next transaction), I wonder if the
extra complexity is worth it.

I can't think of a situation where this might happen other than, say, a
bit flip in the portion of the commit block where we don't care about
its contents; but in that case, after zeroing out parts of the commit
block that we don't care about, if the checksum is valid, presumably
we would have managed to luckily recover from the bit flip.  So
continuing shouldn't be risky.

What am I missing?

						- Ted
