From: "Zhengyuan Liu" <liuzhengyuan@kylinos.cn>
To: "Shaohua Li" <shli@kernel.org>
Cc: "Song Liu" <songliubraving@fb.com>,
linux-raid <linux-raid@vger.kernel.org>,
liuzhengyuang521 <liuzhengyuang521@gmail.com>
Subject: Re: [PATCH] md/raid5: write an empty meta-block when creating log super-block
Date: Thu, 27 Oct 2016 22:05:06 +0800
Message-ID: <tencent_647E309D0106DEBE08875B26@qq.com>
Sorry for the unclear explanation.
The log might look like this before we do a recovery:
| mb1 | mb2 | mb3 | | | |
last_checkpoint = mb1's position, last_cp_seq = mb1's seq
After a recovery (which writes an empty meta block, emb, at the log tail):
| mb1 | mb2 | mb3 | emb | | |
last_checkpoint = emb's position, last_cp_seq = mb1's seq + 11
Then we write two meta blocks, and suppose a crash happens:
| mb1 | mb2 | mb3 | emb | mb4 | mb5 |
last_checkpoint = emb's position, last_cp_seq = mb1's seq + 11
Now we do another recovery after the restart, and suppose mb4 is invalid:
| mb1 | mb2 | mb3 | emb | mb4 | mb5 |
last_checkpoint = emb's position, last_cp_seq = mb1's seq + 11
Since mb4 is invalid, recovery stops there and never reaches mb5, which should
be discarded. After recovery, log_start points to mb4's position, and we do not
write an empty meta block because the condition "ctx.seq > log->last_cp_seq + 1"
is not satisfied. If we then write a new valid meta block and a crash happens
again, the new meta block lands in mb4's position, and the next recovery may go
on to replay the stale mb5, since its seq matches.
What I am trying to say is that the problem is not limited to an invalid middle
meta block: if the first meta block we wrote is invalid, log recovery can go
wrong in the same way. I think the condition for writing an empty meta block
should look like this:
- if (ctx.seq > log->last_cp_seq + 1) {
+ if (ctx.seq > log->last_cp_seq) {
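To make the sequence above concrete, here is a minimal user-space sketch in C.
It is only a toy model under stated assumptions, not the raid5-cache code: each
meta block takes one slot, validity is a plain flag instead of a checksum, and
last_cp_seq is kept in the struct rather than read back from the block at
last_checkpoint. The names toy_log, toy_recover, stale_block_replayed and
LOG_SLOTS, and the starting seq of 100, are made up for illustration.

```c
#include <stdbool.h>

#define LOG_SLOTS 8	/* hypothetical ring size, one meta block per slot */

struct meta_block {
	bool valid;		/* stands in for a passing checksum */
	unsigned long seq;
};

struct toy_log {
	struct meta_block slot[LOG_SLOTS];
	int last_checkpoint;		/* slot the superblock points to */
	unsigned long last_cp_seq;	/* seq expected at last_checkpoint */
	int log_start;			/* slot of the next write */
	unsigned long seq;		/* seq of the next write */
};

/*
 * Walk forward from last_checkpoint while blocks are valid and their seq
 * matches, then decide whether to write an empty meta block. "patched"
 * selects the proposed condition instead of the current one.
 * Returns the slot where the walk stopped (the analogue of ctx.pos).
 */
static int toy_recover(struct toy_log *log, bool patched)
{
	int pos = log->last_checkpoint;
	unsigned long seq = log->last_cp_seq;
	int walked = 0;

	while (walked < LOG_SLOTS && log->slot[pos].valid &&
	       log->slot[pos].seq == seq) {
		pos = (pos + 1) % LOG_SLOTS;
		seq++;
		walked++;
	}

	if (patched ? seq > log->last_cp_seq : seq > log->last_cp_seq + 1) {
		/* empty meta block with a seq jump, superblock moved to it */
		log->slot[pos].valid = true;
		log->slot[pos].seq = seq + 10;
		log->last_checkpoint = pos;
		log->last_cp_seq = seq + 10;
		log->seq = seq + 11;
		log->log_start = (pos + 1) % LOG_SLOTS;
	} else {
		log->log_start = pos;
		log->seq = seq;
	}
	return pos;
}

/* Replays the scenario: emb at slot 3, torn mb4 at slot 4, valid mb5 at slot 5. */
static bool stale_block_replayed(bool patched)
{
	struct toy_log log = {
		.last_checkpoint = 3, .last_cp_seq = 100,
		.log_start = 4, .seq = 101,
	};
	int end;

	log.slot[3] = (struct meta_block){ true, 100 };		/* emb */
	log.slot[4] = (struct meta_block){ false, 101 };	/* mb4, invalid */
	log.slot[5] = (struct meta_block){ true, 102 };		/* mb5 */

	toy_recover(&log, patched);	/* recovery after the first crash */

	/* one new valid meta block is written, then a second crash */
	log.slot[log.log_start] = (struct meta_block){ true, log.seq };

	end = toy_recover(&log, patched);
	/* true if the walk went past slot 5 while it still held the old mb5 */
	return end > 5 && log.slot[5].seq == 102;
}
```

In this model stale_block_replayed(false) returns true, because the new block
takes mb4's slot with the seq that mb5 was waiting for, while
stale_block_replayed(true) returns false, because the empty meta block takes
mb4's slot with a jumped seq and the next write lands on top of mb5.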
------------------ Original ------------------
From: "Shaohua Li"<shli@kernel.org>;
Date: Thu, Oct 27, 2016 02:35 AM
To: "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>;
Cc: "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>; "liuzhengyuang521"<liuzhengyuang521@gmail.com>;
Subject: Re: [PATCH] md/raid5: write an empty meta-block when creating log super-block
On Tue, Oct 25, 2016 at 08:43:50PM +0800, Zhengyuan Liu wrote:
> After discussion with my colleague, I think there is still a problem,
> although it is very unlikely to happen. The superblock should point to the
> last meta block we have written after log reclaim, or to the empty meta
> block after log recovery. Consider that we write some meta blocks after the
> superblock position and suppose a crash happens. If the first meta block we
> wrote next to the superblock position is invalid, ctx.seq would
> also equal last_cp_seq+1 after a recovery. So the safest way is
> to always write an empty meta block at ctx.pos after a recovery, no matter
> how far ctx.seq is ahead of last_cp_seq.
> What do you think, Shaohua? If it is necessary, I'd revert this patch and
> resend one.
I didn't get the point. Could you please elaborate it again?
Thanks,
Shaohua
>
> ------------------ Original ------------------
> From: "Shaohua Li"<shli@kernel.org>;
> Date: Tue, Oct 25, 2016 05:23 AM
> To: "Zhengyuan Liu"<liuzhengyuan@kylinos.cn>;
> Cc: "shli"<shli@fb.com>; "Song Liu"<songliubraving@fb.com>; "linux-raid"<linux-raid@vger.kernel.org>; "liuzhengyuang521"<liuzhengyuang521@gmail.com>;
> Subject: Re: [PATCH] md/raid5: write an empty meta-block when creating log super-block
>
> On Mon, Oct 24, 2016 at 04:15:59PM +0800, Zhengyuan Liu wrote:
> > If the superblock points to an invalid meta block, r5l_load_log will set
> > create_super to true and create a new superblock. This runtime path
> > is always taken if no write I/O has been done to the array since it was
> > created. Writing an empty meta block when we first create the log
> > superblock avoids this unnecessary action.
> >
> > Another reason is the correctness of log recovery. Currently we have the
> > code below to guarantee that log recovery is correct.
> >
> > 	if (ctx.seq > log->last_cp_seq + 1) {
> > 		int ret;
> >
> > 		ret = r5l_log_write_empty_meta_block(log, ctx.pos, ctx.seq + 10);
> > 		if (ret)
> > 			return ret;
> > 		log->seq = ctx.seq + 11;
> > 		log->log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
> > 		r5l_write_super(log, ctx.pos);
> > 	} else {
> > 		log->log_start = ctx.pos;
> > 		log->seq = ctx.seq;
> > 	}
> >
> > If we have just created an array with a journal device, log->log_start and
> > log->last_checkpoint should both be 0. Suppose we then write three meta
> > blocks, all valid except the middle one, and a crash happens. After
> > recovery, ctx.seq would equal log->last_cp_seq + 1 and log->log_start
> > would be set to the position of the invalid middle meta block. This
> > leads to problems that this patch avoids.
>
> This would be very unlikely, but better to fix. Applied, thanks!