linux-raid.vger.kernel.org archive mirror
From: Shaohua Li <shli@kernel.org>
To: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Cc: shli@fb.com, linux-raid@vger.kernel.org
Subject: Re: [PATCH v5 3/7] raid5-ppl: Partial Parity Log write logging implementation
Date: Thu, 9 Mar 2017 15:24:51 -0800
Message-ID: <20170309232451.4hw2pgizn7potlrj@kernel.org>
In-Reply-To: <20170309090003.13298-4-artur.paszkiewicz@intel.com>

On Thu, Mar 09, 2017 at 09:59:59AM +0100, Artur Paszkiewicz wrote:
> Implement the calculation of partial parity for a stripe and PPL write
> logging functionality. The description of PPL is added to the
> documentation. More details can be found in the comments in raid5-ppl.c.
> 
> Attach a page for holding the partial parity data to stripe_head.
> Allocate it only if mddev has the MD_HAS_PPL flag set.
> 
> Partial parity is the xor of not modified data chunks of a stripe and is
> calculated as follows:
> 
> - reconstruct-write case:
>   xor data from all not updated disks in a stripe
> 
> - read-modify-write case:
>   xor old data and parity from all updated disks in a stripe
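
The two cases above should yield the same partial parity, because XORing the
old parity with the old data of the updated disks cancels those disks out and
leaves the XOR of the untouched data. A minimal userspace sketch of that
identity follows. It is not taken from the patch: NDATA, CHUNK, xor_into(),
pp_rcw(), pp_rmw() and pp_methods_agree() are hypothetical names chosen for
illustration, and plain byte arrays stand in for the stripe pages that the
kernel code feeds to the async_tx API.

```c
#include <stdint.h>
#include <string.h>

#define NDATA 4 /* data chunks per stripe (assumed for illustration) */
#define CHUNK 8 /* bytes per chunk, kept tiny on purpose */

/* XOR one chunk into another, byte by byte. */
static void xor_into(uint8_t *dst, const uint8_t *src)
{
	for (int i = 0; i < CHUNK; i++)
		dst[i] ^= src[i];
}

/* rcw case: xor data from all disks that are NOT being updated. */
static void pp_rcw(uint8_t *pp, uint8_t data[NDATA][CHUNK],
		   const int *updated)
{
	memset(pp, 0, CHUNK);
	for (int d = 0; d < NDATA; d++)
		if (!updated[d])
			xor_into(pp, data[d]);
}

/* rmw case: xor old parity with old data from the disks being updated. */
static void pp_rmw(uint8_t *pp, uint8_t data[NDATA][CHUNK],
		   const uint8_t *parity, const int *updated)
{
	memcpy(pp, parity, CHUNK);
	for (int d = 0; d < NDATA; d++)
		if (updated[d])
			xor_into(pp, data[d]);
}

/*
 * Build a stripe with arbitrary old data, compute its parity, then
 * return 1 if both methods produce the same partial parity for the
 * given set of updated disks, 0 otherwise.
 */
int pp_methods_agree(const int *updated)
{
	uint8_t data[NDATA][CHUNK];
	uint8_t parity[CHUNK] = { 0 };
	uint8_t a[CHUNK], b[CHUNK];

	for (int d = 0; d < NDATA; d++) {
		for (int i = 0; i < CHUNK; i++)
			data[d][i] = (uint8_t)(17 * d + 3 * i + 1);
		xor_into(parity, data[d]);
	}

	pp_rcw(a, data, updated);
	pp_rmw(b, data, parity, updated);
	return memcmp(a, b, CHUNK) == 0;
}
```

With every disk marked as updated, both helpers in this sketch produce an
all-zero buffer, which is consistent with the statement below that partial
parity carries no information for a full stripe write.
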
> 
> Implement it using the async_tx API and integrate into raid_run_ops().
> It must be called when we still have access to old data, so do it when
> STRIPE_OP_BIODRAIN is set, but before ops_run_prexor5(). The result is
> stored into sh->ppl_page.
> 
> Partial parity is not meaningful for full stripe write and is not stored
> in the log or used for recovery, so don't attempt to calculate it when
> stripe has STRIPE_FULL_WRITE.
> 
> Put the PPL metadata structures to md_p.h because userspace tools
> (mdadm) will also need to read/write PPL.
> 
> Warn about using PPL with enabled disk volatile write-back cache for
> now. It can be removed once disk cache flushing before writing PPL is
> implemented.
> 
> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>

... snip ...

> +struct dma_async_tx_descriptor *
> +ops_run_partial_parity(struct stripe_head *sh, struct raid5_percpu *percpu,
> +		       struct dma_async_tx_descriptor *tx)
> +{
> +	int disks = sh->disks;
> +	struct page **xor_srcs = flex_array_get(percpu->scribble, 0);
> +	int count = 0, pd_idx = sh->pd_idx, i;
> +	struct async_submit_ctl submit;
> +
> +	pr_debug("%s: stripe %llu\n", __func__, (unsigned long long)sh->sector);
> +
> +	/*
> +	 * Partial parity is the XOR of stripe data chunks that are not changed
> +	 * during the write request. Depending on available data
> +	 * (read-modify-write vs. reconstruct-write case) we calculate it
> +	 * differently.
> +	 */
> +	if (sh->reconstruct_state == reconstruct_state_prexor_drain_run) {
> +		/* rmw: xor old data and parity from updated disks */
> +		for (i = disks; i--;) {
> +			struct r5dev *dev = &sh->dev[i];
> +			if (test_bit(R5_Wantdrain, &dev->flags) || i == pd_idx)
> +				xor_srcs[count++] = dev->page;
> +		}
> +	} else if (sh->reconstruct_state == reconstruct_state_drain_run) {
> +		/* rcw: xor data from all not updated disks */
> +		for (i = disks; i--;) {
> +			struct r5dev *dev = &sh->dev[i];
> +			if (test_bit(R5_UPTODATE, &dev->flags))
> +				xor_srcs[count++] = dev->page;
> +		}
> +	} else {
> +		return tx;
> +	}
> +
> +	init_async_submit(&submit, ASYNC_TX_XOR_ZERO_DST, tx, NULL, sh,
> +			  flex_array_get(percpu->scribble, 0)
> +			  + sizeof(struct page *) * (sh->disks + 2));

Since this must complete before the biodrain, should this also set the ASYNC_TX_FENCE flag?

Thanks,
Shaohua
