linux-raid.vger.kernel.org archive mirror
From: Shaohua Li <shli@kernel.org>
To: Song Liu <songliubraving@fb.com>
Cc: linux-raid@vger.kernel.org, neilb@suse.com, shli@fb.com,
	kernel-team@fb.com, dan.j.williams@intel.com, hch@infradead.org,
	liuzhengyuang521@gmail.com, liuzhengyuan@kylinos.cn
Subject: Re: [PATCH v2 4/6] r5cache: r5c recovery
Date: Tue, 27 Sep 2016 18:08:06 -0700
Message-ID: <20160928010806.GD98100@kernel.org>
In-Reply-To: <20160926233050.3351081-5-songliubraving@fb.com>

On Mon, Sep 26, 2016 at 04:30:48PM -0700, Song Liu wrote:
> For Data-Only strips, we need to finish complete calculate parity and
> finish the full reconstruct write or RMW write. For simplicity, in
> the recovery, we load the stripe to stripe cache. Once the array is
> started, the stripe cache state machine will handle these stripes
> through normal write path.

Please make sure not to change the behavior of writethrough mode. In
writethrough mode, we discard data-only stripes.
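
To illustrate the concern, here is a minimal sketch of the expected split
between the two modes. All names (journal_mode, rec_action, recovery_action)
are hypothetical and do not come from the patch or from raid5-cache.c. The
sketch assumes data-parity stripes are replayed in either mode, and that
data-only stripes are kept for the normal write path only outside
writethrough.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the raid5-cache API. */
enum journal_mode { MODE_WRITE_THROUGH, MODE_WRITE_BACK };
enum rec_action { ACT_REPLAY, ACT_DISCARD, ACT_KEEP_IN_CACHE };

/*
 * Decide what recovery does with one stripe found in the journal.
 * has_parity is true when the journal holds parity for the stripe.
 */
static enum rec_action recovery_action(bool has_parity, enum journal_mode mode)
{
	if (has_parity)
		return ACT_REPLAY;		/* data-parity: replay in both modes */
	if (mode == MODE_WRITE_THROUGH)
		return ACT_DISCARD;		/* data-only: drop, as before */
	return ACT_KEEP_IN_CACHE;		/* data-only: hand to the write path */
}
```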

Is it safe to run the state machine in the recovery stage? For example, the md
personality ->run is called before the bitmap is initialized.

> r5c_recovery_flush_log contains the main procedure of recovery. The
> recovery code first scans through the journal and loads data to
> stripe cache. The code keeps tracks of all these stripes in a list
> (use sh->lru and ctx->cached_list), stripes in the list are
> organized in the order of its first appearance on the journal.
> During the scan, the recovery code assesses each stripe as
> Data-Parity or Data-Only.
> 
> During scan, the array may run out of stripe cache. In these cases,
> the recovery code tries to release some stripe head by replaying
> existing Data-Parity stripes. Once these replays are done, these
> stripes can be released. When releasing Data-Parity stripes is not
> enough, the recovery code will also call raid5_set_cache_size to
> increase stripe cache size.
> 
> At the end of scan, the recovery code replays all Data-Parity
> stripes, and sets proper states for Data-Only stripes. The recovery
> code also increases seq number by 10 and rewrites all Data-Only
> stripes to journal. This is to avoid confusion after repeated
> crashes. More details is explained in raid5-cache.c before
> r5c_recovery_rewrite_data_only_stripes().
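
The scan described in the quoted text can be modeled roughly as below. This is
a simplified sketch, not the patch's code: the types and functions (scan_ctx,
scan_stripe, lookup_or_add, scan_payload, finish_scan) are hypothetical, a
fixed array stands in for the sh->lru list on ctx->cached_list, and a data
payload that arrives after a stripe already has parity is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STRIPES 16

enum stripe_kind { DATA_ONLY, DATA_PARITY };

struct scan_stripe {
	uint64_t sector;
	enum stripe_kind kind;
};

struct scan_ctx {
	struct scan_stripe stripes[MAX_STRIPES];	/* order of first appearance */
	size_t nr;
	uint64_t seq;
};

/* Find a stripe by sector, or append it at the tail on first appearance. */
static struct scan_stripe *lookup_or_add(struct scan_ctx *ctx, uint64_t sector)
{
	for (size_t i = 0; i < ctx->nr; i++)
		if (ctx->stripes[i].sector == sector)
			return &ctx->stripes[i];
	if (ctx->nr == MAX_STRIPES)
		return NULL;	/* real code would replay stripes or grow the cache */
	ctx->stripes[ctx->nr].sector = sector;
	ctx->stripes[ctx->nr].kind = DATA_ONLY;
	return &ctx->stripes[ctx->nr++];
}

/* Record one journal payload; a parity payload marks the stripe Data-Parity. */
static int scan_payload(struct scan_ctx *ctx, uint64_t sector, int is_parity)
{
	struct scan_stripe *s = lookup_or_add(ctx, sector);

	if (!s)
		return -1;
	if (is_parity)
		s->kind = DATA_PARITY;
	return 0;
}

/* End of scan: count Data-Only stripes to rewrite and bump seq by 10. */
static size_t finish_scan(struct scan_ctx *ctx)
{
	size_t data_only = 0;

	for (size_t i = 0; i < ctx->nr; i++)
		if (ctx->stripes[i].kind == DATA_ONLY)
			data_only++;
	ctx->seq += 10;
	return data_only;
}
```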
...
> +r5c_recovery_analyze_meta_block(struct r5l_log *log,
> +				struct r5l_recovery_ctx *ctx,
> +				struct list_head *cached_stripe_list)
> +{
> +	struct mddev *mddev = log->rdev->mddev;
> +	struct r5conf *conf = mddev->private;
>  	struct r5l_meta_block *mb;
> -	int offset;
> +	struct r5l_payload_data_parity *payload;
> +	int mb_offset;
>  	sector_t log_offset;
> -	sector_t stripe_sector;
> +	sector_t stripe_sect;
> +	struct stripe_head *sh;
> +	int ret;
> +
> +	/* for mismatch in data blocks, we will drop all data in this mb, but
> +	 * we will still read next mb for other data with FLUSH flag, as
> +	 * io_unit could finish out of order.
> +	 */
Please correct the comment format.
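
For reference, a sketch of the quoted comment in the multi-line style
preferred by the kernel coding style document, assuming that is the format
meant here. The wording of the comment is unchanged; only the opening line
moves.

```c
	/*
	 * for mismatch in data blocks, we will drop all data in this mb, but
	 * we will still read next mb for other data with FLUSH flag, as
	 * io_unit could finish out of order.
	 */
```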

> +	ret = r5l_recovery_verify_data_checksum_for_mb(log, ctx);
> +	if (ret == -EINVAL)
> +		return -EAGAIN;
> +	else if (ret)
> +		return ret;
>  
>  	mb = page_address(ctx->meta_page);
> -	offset = sizeof(struct r5l_meta_block);
> +	mb_offset = sizeof(struct r5l_meta_block);
>  	log_offset = r5l_ring_add(log, ctx->pos, BLOCK_SECTORS);
>  
> -	while (offset < le32_to_cpu(mb->meta_size)) {
> +	while (mb_offset < le32_to_cpu(mb->meta_size)) {
>  		int dd;
>  
> -		payload = (void *)mb + offset;
> -		stripe_sector = raid5_compute_sector(conf,
> -						     le64_to_cpu(payload->location), 0, &dd, NULL);
> -		if (r5l_recovery_flush_one_stripe(log, ctx, stripe_sector,
> -						  &offset, &log_offset))
> +		payload = (void *)mb + mb_offset;
> +		stripe_sect = (payload->header.type == R5LOG_PAYLOAD_DATA) ?
> +			raid5_compute_sector(
> +				conf, le64_to_cpu(payload->location), 0, &dd,
> +				NULL)
> +			: le64_to_cpu(payload->location);
> +
> +		sh = r5c_recovery_lookup_stripe(cached_stripe_list,
> +						stripe_sect);
> +
> +		if (!sh) {
> +			sh = r5c_recovery_alloc_stripe(conf, cached_stripe_list,
> +						       stripe_sect, ctx->pos);
> +			/* cannot get stripe from raid5_get_active_stripe
> +			 * try replay some stripes
> +			 */
ditto

Thanks,
Shaohua
