From: piaojun <piaojun@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: return error when we attempt to access a dirty bh in jbd2
Date: Fri, 26 Jan 2018 10:31:19 +0800 [thread overview]
Message-ID: <5A6A92F7.2000306@huawei.com> (raw)
In-Reply-To: <5A6B0017020000F9000A6651@prv-mh.provo.novell.com>
Hi Gang,
On 2018/1/26 10:16, Gang He wrote:
> Hi Jun,
>
>
>>>>
>> Hi Gang,
>>
>> Filesystem won't become readonly and journal remains normal,
> How does the user aware this kind of issue happen? watch the kernel message? but some users do not like watch the kernel message except
> there is serious problem (e.g. crash).
Sadlly, user could only identify this problem by kernel message.
> so this problem needs user umount and mount again to recover.
> If the user skip this error prints in the kernel message, there will be further damage to the file system?
> If yes, we should let the user know this problem more obviously.
> And I'm thinking
This error won't cause further damage, but will make the operations
accessing truncate log bh failed, such as 'rm', 'truncate'. Before the
patch(acf8fdbe6afb), we set BUG here, but I think it's a little rude.
Perhaps we could set read-only or set jbd2 aborted.
thanks,
Jun
>> about if we should set readonly to notice user.
>>
>> thanks,
>> Jun
>>
>> On 2018/1/25 16:40, Gang He wrote:
>>> Hi Jun,
>>>
>>> If we return -EIO here, what is the consequence?
>>> make the journal aborted and file system will become read-only?
>>>
>>> Thanks
>>> Gang
>>>
>>>
>>>>>>
>>>> We should not reuse the dirty bh in jbd2 directly due to the following
>>>> situation:
>>>>
>>>> 1. When removing extent rec, we will dirty the bhs of extent rec and
>>>> truncate log at the same time, and hand them over to jbd2.
>>>> 2. The bhs are not flushed to disk due to abnormal storage link.
>>>> 3. After a while the storage link become normal. Truncate log flush
>>>> worker triggered by the next space reclaiming found the dirty bh of
>>>> truncate log and clear its 'BH_Write_EIO' and then set it uptodate
>>>> in __ocfs2_journal_access():
>>>>
>>>> ocfs2_truncate_log_worker
>>>> ocfs2_flush_truncate_log
>>>> __ocfs2_flush_truncate_log
>>>> ocfs2_replay_truncate_records
>>>> ocfs2_journal_access_di
>>>> __ocfs2_journal_access // here we clear io_error and set 'tl_bh'
>>>> uptodata.
>>>>
>>>> 4. Then jbd2 will flush the bh of truncate log to disk, but the bh of
>>>> extent rec is still in error state, and unfortunately nobody will
>>>> take care of it.
>>>> 5. At last the space of extent rec was not reduced, but truncate log
>>>> flush worker have given it back to globalalloc. That will cause
>>>> duplicate cluster problem which could be identified by fsck.ocfs2.
>>>>
>>>> So we should return -EIO in case of ruining atomicity and consistency
>>>> of space reclaim.
>>>>
>>>> Fixes: acf8fdbe6afb ("ocfs2: do not BUG if buffer not uptodate in
>>>> __ocfs2_journal_access")
>>>>
>>>> Signed-off-by: Jun Piao <piaojun@huawei.com>
>>>> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
>>>> ---
>>>> fs/ocfs2/journal.c | 45 +++++++++++++++++++++++++++++++++++----------
>>>> 1 file changed, 35 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>>> index 3630443..d769ca2 100644
>>>> --- a/fs/ocfs2/journal.c
>>>> +++ b/fs/ocfs2/journal.c
>>>> @@ -666,21 +666,46 @@ static int __ocfs2_journal_access(handle_t *handle,
>>>> /* we can safely remove this assertion after testing. */
>>>> if (!buffer_uptodate(bh)) {
>>>> mlog(ML_ERROR, "giving me a buffer that's not uptodate!\n");
>>>> - mlog(ML_ERROR, "b_blocknr=%llu\n",
>>>> - (unsigned long long)bh->b_blocknr);
>>>> + mlog(ML_ERROR, "b_blocknr=%llu, b_state=0x%lx\n",
>>>> + (unsigned long long)bh->b_blocknr, bh->b_state);
>>>>
>>>> lock_buffer(bh);
>>>> /*
>>>> - * A previous attempt to write this buffer head failed.
>>>> - * Nothing we can do but to retry the write and hope for
>>>> - * the best.
>>>> + * We should not reuse the dirty bh directly due to the
>>>> + * following situation:
>>>> + *
>>>> + * 1. When removing extent rec, we will dirty the bhs of
>>>> + * extent rec and truncate log at the same time, and
>>>> + * hand them over to jbd2.
>>>> + * 2. The bhs are not flushed to disk due to abnormal
>>>> + * storage link.
>>>> + * 3. After a while the storage link become normal.
>>>> + * Truncate log flush worker triggered by the next
>>>> + * space reclaiming found the dirty bh of truncate log
>>>> + * and clear its 'BH_Write_EIO' and then set it uptodate
>>>> + * in __ocfs2_journal_access():
>>>> + *
>>>> + * ocfs2_truncate_log_worker
>>>> + * ocfs2_flush_truncate_log
>>>> + * __ocfs2_flush_truncate_log
>>>> + * ocfs2_replay_truncate_records
>>>> + * ocfs2_journal_access_di
>>>> + * __ocfs2_journal_access
>>>> + *
>>>> + * 4. Then jbd2 will flush the bh of truncate log to disk,
>>>> + * but the bh of extent rec is still in error state, and
>>>> + * unfortunately nobody will take care of it.
>>>> + * 5. At last the space of extent rec was not reduced,
>>>> + * but truncate log flush worker have given it back to
>>>> + * globalalloc. That will cause duplicate cluster problem
>>>> + * which could be identified by fsck.ocfs2.
>>>> + *
>>>> + * So we should return -EIO in case of ruining atomicity
>>>> + * and consistency of space reclaim.
>>>> */
>>>> if (buffer_write_io_error(bh) && !buffer_uptodate(bh)) {
>>>> - clear_buffer_write_io_error(bh);
>>>> - set_buffer_uptodate(bh);
>>>> - }
>>>> -
>>>> - if (!buffer_uptodate(bh)) {
>>>> + mlog(ML_ERROR, "A previous attempt to write this "
>>>> + "buffer head failed\n");
>>>> unlock_buffer(bh);
>>>> return -EIO;
>>>> }
>>>> --
>>>>
>>>> _______________________________________________
>>>> Ocfs2-devel mailing list
>>>> Ocfs2-devel at oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>> .
>>>
> .
>
next prev parent reply other threads:[~2018-01-26 2:31 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-25 2:41 [Ocfs2-devel] [PATCH] ocfs2: return error when we attempt to access a dirty bh in jbd2 piaojun
2018-01-25 8:40 ` Gang He
2018-01-25 12:10 ` piaojun
2018-01-26 2:16 ` Gang He
2018-01-26 2:31 ` piaojun [this message]
2018-01-25 11:59 ` Changwei Ge
2018-01-25 12:17 ` piaojun
2018-01-25 12:30 ` Changwei Ge
2018-01-25 12:44 ` piaojun
2018-01-26 1:00 ` Changwei Ge
2018-01-26 1:45 ` piaojun
2018-01-26 2:03 ` Changwei Ge
2018-01-27 3:51 ` piaojun
2018-01-27 5:17 ` Changwei Ge
2018-01-27 6:36 ` piaojun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5A6A92F7.2000306@huawei.com \
--to=piaojun@huawei.com \
--cc=ocfs2-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.