[PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3)

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3)
@ 2009-05-19  0:59 Dan Williams
  2009-05-19  0:59 ` [PATCH v2 01/11] async_tx: rename zero_sum to val Dan Williams
                   ` (10 more replies)
  0 siblings, 11 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

This series extends the async_tx api to include support for offloading raid6
operations to hardware.  Part 2 covers updates to the iop-adma driver to
enable raid6.  Part 3 covers the changes to md/raid6 to use the async_tx api
for its raid6 computations.

A merge of these 3 topic series is available via the raid6 branch of
async_tx.git:

	git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx.git raid6

There are several changes since v1 of the patch set[1].  The primary updates
are:
1/ Address H. Peter Anvin's comments around the asynchronous raid6 recovery
   implementation.  It has been rewritten to more closely model the existing
   recovery implementation [2].

2/ Address Andre Noll's comment concerning the excessive number of arguments
   passed to the async_* routines.  Common parameters are now passed in a
   structure.

3/ Address Neil Brown's comment that we should solve the "kernel stack space
   limits the size of an array" problem with pre-allocated buffers.
   stripe_head objects now carry a kmalloc'd 'scribble' buffer relieving the
   stack pressure of raid computations.

4/ Various updates to get the code to pass the new raid6 recovery self test

Please review.

Thanks,
Dan

[1] http://marc.info/?l=linux-raid&m=123740421725422&w=2
[2] http://marc.info/?l=linux-raid&m=124096283425451&w=2

---

Dan Williams (11):
      async_tx: rename zero_sum to val
      async_tx: kill ASYNC_TX_DEP_ACK flag
      async_tx: structify submission arguments, add scribble
      async_xor: permit callers to pass in a 'dma/page scribble' region
      md/raid5: add scribble region for buffer lists
      async_tx: add sum check flags
      async_tx: kill needless module_{init|exit}
      async_tx: add support for asynchronous GF multiplication
      async_tx: add support for asynchronous RAID6 recovery operations
      dmatest: add pq support
      async_tx: raid6 recovery self test

 arch/arm/include/asm/hardware/iop3xx-adma.h |    5 
 arch/arm/mach-iop13xx/include/mach/adma.h   |   12 -
 arch/arm/mach-iop13xx/setup.c               |   10 -
 arch/arm/plat-iop/adma.c                    |    2 
 crypto/async_tx/Kconfig                     |    9 +
 crypto/async_tx/Makefile                    |    3 
 crypto/async_tx/async_memcpy.c              |   34 +-
 crypto/async_tx/async_memset.c              |   36 +-
 crypto/async_tx/async_pq.c                  |  399 +++++++++++++++++++++++++++
 crypto/async_tx/async_raid6_recov.c         |  292 ++++++++++++++++++++
 crypto/async_tx/async_tx.c                  |   53 +---
 crypto/async_tx/async_xor.c                 |  168 +++++------
 crypto/async_tx/raid6test.c                 |  212 ++++++++++++++
 drivers/dma/Kconfig                         |    2 
 drivers/dma/dmaengine.c                     |    8 -
 drivers/dma/dmatest.c                       |   26 ++
 drivers/dma/iop-adma.c                      |   40 +--
 drivers/md/Kconfig                          |   13 +
 drivers/md/raid5.c                          |  123 ++++++--
 drivers/md/raid5.h                          |   10 +
 include/linux/async_tx.h                    |  103 +++++--
 include/linux/dmaengine.h                   |   87 +++++-
 22 files changed, 1347 insertions(+), 300 deletions(-)
 create mode 100644 crypto/async_tx/async_pq.c
 create mode 100644 crypto/async_tx/async_raid6_recov.c
 create mode 100644 crypto/async_tx/raid6test.c

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 01/11] async_tx: rename zero_sum to val
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
@ 2009-05-19  0:59 ` Dan Williams
       [not found]   ` <f12847240905200110x63b22601idbbdf3369984fa9a@mail.gmail.com>
  2009-05-19  0:59 ` [PATCH v2 02/11] async_tx: kill ASYNC_TX_DEP_ACK flag Dan Williams
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

'zero_sum' does not properly describe the operation of generating parity
and checking that it validates against an existing buffer.  Change the
name of the operation to 'val' (for 'validate').  This is in
anticipation of the p+q case where it is a requirement to identify the
target parity buffers separately from the source buffers, because the
target parity buffers will not have corresponding pq coefficients.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/arm/mach-iop13xx/setup.c |    8 ++++----
 arch/arm/plat-iop/adma.c      |    2 +-
 crypto/async_tx/async_xor.c   |   16 ++++++++--------
 drivers/dma/dmaengine.c       |    4 ++--
 drivers/dma/iop-adma.c        |   38 +++++++++++++++++++-------------------
 drivers/md/raid5.c            |    2 +-
 include/linux/async_tx.h      |    2 +-
 include/linux/dmaengine.h     |    8 ++++----
 8 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/arch/arm/mach-iop13xx/setup.c b/arch/arm/mach-iop13xx/setup.c
index cfd4d2e..9800228 100644
--- a/arch/arm/mach-iop13xx/setup.c
+++ b/arch/arm/mach-iop13xx/setup.c
@@ -478,7 +478,7 @@ void __init iop13xx_platform_init(void)
 			dma_cap_set(DMA_MEMCPY, plat_data->cap_mask);
 			dma_cap_set(DMA_XOR, plat_data->cap_mask);
 			dma_cap_set(DMA_DUAL_XOR, plat_data->cap_mask);
-			dma_cap_set(DMA_ZERO_SUM, plat_data->cap_mask);
+			dma_cap_set(DMA_XOR_VAL, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMSET, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMCPY_CRC32C, plat_data->cap_mask);
 			dma_cap_set(DMA_INTERRUPT, plat_data->cap_mask);
@@ -490,7 +490,7 @@ void __init iop13xx_platform_init(void)
 			dma_cap_set(DMA_MEMCPY, plat_data->cap_mask);
 			dma_cap_set(DMA_XOR, plat_data->cap_mask);
 			dma_cap_set(DMA_DUAL_XOR, plat_data->cap_mask);
-			dma_cap_set(DMA_ZERO_SUM, plat_data->cap_mask);
+			dma_cap_set(DMA_XOR_VAL, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMSET, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMCPY_CRC32C, plat_data->cap_mask);
 			dma_cap_set(DMA_INTERRUPT, plat_data->cap_mask);
@@ -502,13 +502,13 @@ void __init iop13xx_platform_init(void)
 			dma_cap_set(DMA_MEMCPY, plat_data->cap_mask);
 			dma_cap_set(DMA_XOR, plat_data->cap_mask);
 			dma_cap_set(DMA_DUAL_XOR, plat_data->cap_mask);
-			dma_cap_set(DMA_ZERO_SUM, plat_data->cap_mask);
+			dma_cap_set(DMA_XOR_VAL, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMSET, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMCPY_CRC32C, plat_data->cap_mask);
 			dma_cap_set(DMA_INTERRUPT, plat_data->cap_mask);
 			dma_cap_set(DMA_PQ_XOR, plat_data->cap_mask);
 			dma_cap_set(DMA_PQ_UPDATE, plat_data->cap_mask);
-			dma_cap_set(DMA_PQ_ZERO_SUM, plat_data->cap_mask);
+			dma_cap_set(DMA_PQ_VAL, plat_data->cap_mask);
 			break;
 		}
 	}
diff --git a/arch/arm/plat-iop/adma.c b/arch/arm/plat-iop/adma.c
index f724208..c040044 100644
--- a/arch/arm/plat-iop/adma.c
+++ b/arch/arm/plat-iop/adma.c
@@ -198,7 +198,7 @@ static int __init iop3xx_adma_cap_init(void)
 	dma_cap_set(DMA_INTERRUPT, iop3xx_aau_data.cap_mask);
 	#else
 	dma_cap_set(DMA_XOR, iop3xx_aau_data.cap_mask);
-	dma_cap_set(DMA_ZERO_SUM, iop3xx_aau_data.cap_mask);
+	dma_cap_set(DMA_XOR_VAL, iop3xx_aau_data.cap_mask);
 	dma_cap_set(DMA_MEMSET, iop3xx_aau_data.cap_mask);
 	dma_cap_set(DMA_INTERRUPT, iop3xx_aau_data.cap_mask);
 	#endif
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 95fe2c8..e0580b0 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -222,7 +222,7 @@ static int page_is_zero(struct page *p, unsigned int offset, size_t len)
 }
 
 /**
- * async_xor_zero_sum - attempt a xor parity check with a dma engine.
+ * async_xor_val - attempt a xor parity check with a dma engine.
  * @dest: destination page used if the xor is performed synchronously
  * @src_list: array of source pages.  The dest page must be listed as a source
  * 	at index zero.  The contents of this array may be overwritten.
@@ -236,13 +236,13 @@ static int page_is_zero(struct page *p, unsigned int offset, size_t len)
  * @cb_param: parameter to pass to the callback routine
  */
 struct dma_async_tx_descriptor *
-async_xor_zero_sum(struct page *dest, struct page **src_list,
+async_xor_val(struct page *dest, struct page **src_list,
 	unsigned int offset, int src_cnt, size_t len,
 	u32 *result, enum async_tx_flags flags,
 	struct dma_async_tx_descriptor *depend_tx,
 	dma_async_tx_callback cb_fn, void *cb_param)
 {
-	struct dma_chan *chan = async_tx_find_channel(depend_tx, DMA_ZERO_SUM,
+	struct dma_chan *chan = async_tx_find_channel(depend_tx, DMA_XOR_VAL,
 						      &dest, 1, src_list,
 						      src_cnt, len);
 	struct dma_device *device = chan ? chan->device : NULL;
@@ -261,15 +261,15 @@ async_xor_zero_sum(struct page *dest, struct page **src_list,
 			dma_src[i] = dma_map_page(device->dev, src_list[i],
 						  offset, len, DMA_TO_DEVICE);
 
-		tx = device->device_prep_dma_zero_sum(chan, dma_src, src_cnt,
-						      len, result,
-						      dma_prep_flags);
+		tx = device->device_prep_dma_xor_val(chan, dma_src, src_cnt,
+						     len, result,
+						     dma_prep_flags);
 		if (unlikely(!tx)) {
 			async_tx_quiesce(&depend_tx);
 
 			while (!tx) {
 				dma_async_issue_pending(chan);
-				tx = device->device_prep_dma_zero_sum(chan,
+				tx = device->device_prep_dma_xor_val(chan,
 					dma_src, src_cnt, len, result,
 					dma_prep_flags);
 			}
@@ -296,7 +296,7 @@ async_xor_zero_sum(struct page *dest, struct page **src_list,
 
 	return tx;
 }
-EXPORT_SYMBOL_GPL(async_xor_zero_sum);
+EXPORT_SYMBOL_GPL(async_xor_val);
 
 static int __init async_xor_init(void)
 {
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 92438e9..6781e8f 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -644,8 +644,8 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_memcpy);
 	BUG_ON(dma_has_cap(DMA_XOR, device->cap_mask) &&
 		!device->device_prep_dma_xor);
-	BUG_ON(dma_has_cap(DMA_ZERO_SUM, device->cap_mask) &&
-		!device->device_prep_dma_zero_sum);
+	BUG_ON(dma_has_cap(DMA_XOR_VAL, device->cap_mask) &&
+		!device->device_prep_dma_xor_val);
 	BUG_ON(dma_has_cap(DMA_MEMSET, device->cap_mask) &&
 		!device->device_prep_dma_memset);
 	BUG_ON(dma_has_cap(DMA_INTERRUPT, device->cap_mask) &&
diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 2f05226..6ff79a6 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -660,9 +660,9 @@ iop_adma_prep_dma_xor(struct dma_chan *chan, dma_addr_t dma_dest,
 }
 
 static struct dma_async_tx_descriptor *
-iop_adma_prep_dma_zero_sum(struct dma_chan *chan, dma_addr_t *dma_src,
-			   unsigned int src_cnt, size_t len, u32 *result,
-			   unsigned long flags)
+iop_adma_prep_dma_xor_val(struct dma_chan *chan, dma_addr_t *dma_src,
+			  unsigned int src_cnt, size_t len, u32 *result,
+			  unsigned long flags)
 {
 	struct iop_adma_chan *iop_chan = to_iop_adma_chan(chan);
 	struct iop_adma_desc_slot *sw_desc, *grp_start;
@@ -906,7 +906,7 @@ out:
 
 #define IOP_ADMA_NUM_SRC_TEST 4 /* must be <= 15 */
 static int __devinit
-iop_adma_xor_zero_sum_self_test(struct iop_adma_device *device)
+iop_adma_xor_val_self_test(struct iop_adma_device *device)
 {
 	int i, src_idx;
 	struct page *dest;
@@ -1002,7 +1002,7 @@ iop_adma_xor_zero_sum_self_test(struct iop_adma_device *device)
 		PAGE_SIZE, DMA_TO_DEVICE);
 
 	/* skip zero sum if the capability is not present */
-	if (!dma_has_cap(DMA_ZERO_SUM, dma_chan->device->cap_mask))
+	if (!dma_has_cap(DMA_XOR_VAL, dma_chan->device->cap_mask))
 		goto free_resources;
 
 	/* zero sum the sources with the destintation page */
@@ -1016,10 +1016,10 @@ iop_adma_xor_zero_sum_self_test(struct iop_adma_device *device)
 		dma_srcs[i] = dma_map_page(dma_chan->device->dev,
 					   zero_sum_srcs[i], 0, PAGE_SIZE,
 					   DMA_TO_DEVICE);
-	tx = iop_adma_prep_dma_zero_sum(dma_chan, dma_srcs,
-					IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
-					&zero_sum_result,
-					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	tx = iop_adma_prep_dma_xor_val(dma_chan, dma_srcs,
+				       IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
+				       &zero_sum_result,
+				       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
 
 	cookie = iop_adma_tx_submit(tx);
 	iop_adma_issue_pending(dma_chan);
@@ -1072,10 +1072,10 @@ iop_adma_xor_zero_sum_self_test(struct iop_adma_device *device)
 		dma_srcs[i] = dma_map_page(dma_chan->device->dev,
 					   zero_sum_srcs[i], 0, PAGE_SIZE,
 					   DMA_TO_DEVICE);
-	tx = iop_adma_prep_dma_zero_sum(dma_chan, dma_srcs,
-					IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
-					&zero_sum_result,
-					DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	tx = iop_adma_prep_dma_xor_val(dma_chan, dma_srcs,
+				       IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
+				       &zero_sum_result,
+				       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
 
 	cookie = iop_adma_tx_submit(tx);
 	iop_adma_issue_pending(dma_chan);
@@ -1192,9 +1192,9 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
 		dma_dev->max_xor = iop_adma_get_max_xor();
 		dma_dev->device_prep_dma_xor = iop_adma_prep_dma_xor;
 	}
-	if (dma_has_cap(DMA_ZERO_SUM, dma_dev->cap_mask))
-		dma_dev->device_prep_dma_zero_sum =
-			iop_adma_prep_dma_zero_sum;
+	if (dma_has_cap(DMA_XOR_VAL, dma_dev->cap_mask))
+		dma_dev->device_prep_dma_xor_val =
+			iop_adma_prep_dma_xor_val;
 	if (dma_has_cap(DMA_INTERRUPT, dma_dev->cap_mask))
 		dma_dev->device_prep_dma_interrupt =
 			iop_adma_prep_dma_interrupt;
@@ -1249,7 +1249,7 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
 
 	if (dma_has_cap(DMA_XOR, dma_dev->cap_mask) ||
 		dma_has_cap(DMA_MEMSET, dma_dev->cap_mask)) {
-		ret = iop_adma_xor_zero_sum_self_test(adev);
+		ret = iop_adma_xor_val_self_test(adev);
 		dev_dbg(&pdev->dev, "xor self test returned %d\n", ret);
 		if (ret)
 			goto err_free_iop_chan;
@@ -1259,10 +1259,10 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
 	  "( %s%s%s%s%s%s%s%s%s%s)\n",
 	  dma_has_cap(DMA_PQ_XOR, dma_dev->cap_mask) ? "pq_xor " : "",
 	  dma_has_cap(DMA_PQ_UPDATE, dma_dev->cap_mask) ? "pq_update " : "",
-	  dma_has_cap(DMA_PQ_ZERO_SUM, dma_dev->cap_mask) ? "pq_zero_sum " : "",
+	  dma_has_cap(DMA_PQ_VAL, dma_dev->cap_mask) ? "pq_val " : "",
 	  dma_has_cap(DMA_XOR, dma_dev->cap_mask) ? "xor " : "",
 	  dma_has_cap(DMA_DUAL_XOR, dma_dev->cap_mask) ? "dual_xor " : "",
-	  dma_has_cap(DMA_ZERO_SUM, dma_dev->cap_mask) ? "xor_zero_sum " : "",
+	  dma_has_cap(DMA_XOR_VAL, dma_dev->cap_mask) ? "xor_val " : "",
 	  dma_has_cap(DMA_MEMSET, dma_dev->cap_mask)  ? "fill " : "",
 	  dma_has_cap(DMA_MEMCPY_CRC32C, dma_dev->cap_mask) ? "cpy+crc " : "",
 	  dma_has_cap(DMA_MEMCPY, dma_dev->cap_mask) ? "cpy " : "",
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 3bbc6d6..f8d2d35 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -854,7 +854,7 @@ static void ops_run_check(struct stripe_head *sh)
 			xor_srcs[count++] = dev->page;
 	}
 
-	tx = async_xor_zero_sum(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
+	tx = async_xor_val(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
 		&sh->ops.zero_sum_result, 0, NULL, NULL, NULL);
 
 	atomic_inc(&sh->count);
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index 5fc2ef8..513150d 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -117,7 +117,7 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
 	dma_async_tx_callback cb_fn, void *cb_fn_param);
 
 struct dma_async_tx_descriptor *
-async_xor_zero_sum(struct page *dest, struct page **src_list,
+async_xor_val(struct page *dest, struct page **src_list,
 	unsigned int offset, int src_cnt, size_t len,
 	u32 *result, enum async_tx_flags flags,
 	struct dma_async_tx_descriptor *depend_tx,
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 2e2aa3d..6768727 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -55,8 +55,8 @@ enum dma_transaction_type {
 	DMA_PQ_XOR,
 	DMA_DUAL_XOR,
 	DMA_PQ_UPDATE,
-	DMA_ZERO_SUM,
-	DMA_PQ_ZERO_SUM,
+	DMA_XOR_VAL,
+	DMA_PQ_VAL,
 	DMA_MEMSET,
 	DMA_MEMCPY_CRC32C,
 	DMA_INTERRUPT,
@@ -214,7 +214,7 @@ struct dma_async_tx_descriptor {
  * @device_free_chan_resources: release DMA channel's resources
  * @device_prep_dma_memcpy: prepares a memcpy operation
  * @device_prep_dma_xor: prepares a xor operation
- * @device_prep_dma_zero_sum: prepares a zero_sum operation
+ * @device_prep_dma_xor_val: prepares a xor validation operation
  * @device_prep_dma_memset: prepares a memset operation
  * @device_prep_dma_interrupt: prepares an end of chain interrupt operation
  * @device_prep_slave_sg: prepares a slave dma operation
@@ -243,7 +243,7 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_xor)(
 		struct dma_chan *chan, dma_addr_t dest, dma_addr_t *src,
 		unsigned int src_cnt, size_t len, unsigned long flags);
-	struct dma_async_tx_descriptor *(*device_prep_dma_zero_sum)(
+	struct dma_async_tx_descriptor *(*device_prep_dma_xor_val)(
 		struct dma_chan *chan, dma_addr_t *src,	unsigned int src_cnt,
 		size_t len, u32 *result, unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_memset)(


^ permalink raw reply related	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905200110x63b22601idbbdf3369984fa9a@mail.gmail.com>]

* RE: [PATCH v2 01/11] async_tx: rename zero_sum to val
       [not found]   ` <f12847240905200110x63b22601idbbdf3369984fa9a@mail.gmail.com>
@ 2009-05-29 13:41     ` Sosnowski, Maciej
  2009-06-03 18:12       ` Dan Williams
  0 siblings, 1 reply; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:41 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> 'zero_sum' does not properly describe the operation of generating parity
> and checking that it validates against an existing buffer.  Change the
> name of the operation to 'val' (for 'validate').  This is in
> anticipation of the p+q case where it is a requirement to identify the
> target parity buffers separately from the source buffers, because the
> target parity buffers will not have corresponding pq coefficients.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/arm/mach-iop13xx/setup.c |    8 ++++----
>  arch/arm/plat-iop/adma.c      |    2 +-
>  crypto/async_tx/async_xor.c   |   16 ++++++++--------
>  drivers/dma/dmaengine.c       |    4 ++--
>  drivers/dma/iop-adma.c        |   38 +++++++++++++++++++-------------------
>  drivers/md/raid5.c            |    2 +-
>  include/linux/async_tx.h      |    2 +-
>  include/linux/dmaengine.h     |    8 ++++----
>  8 files changed, 40 insertions(+), 40 deletions(-)

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>

with following comments:

> @@ -1072,10 +1072,10 @@ iop_adma_xor_zero_sum_self_test(struct
> iop_adma_device *device)
>                dma_srcs[i] = dma_map_page(dma_chan->device->dev,
>                                           zero_sum_srcs[i], 0, PAGE_SIZE,
>                                           DMA_TO_DEVICE);
> -       tx = iop_adma_prep_dma_zero_sum(dma_chan, dma_srcs,
> -                                       IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
> -                                       &zero_sum_result,
> -                                       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
> +       tx = iop_adma_prep_dma_xor_val(dma_chan, dma_srcs,
> +                                      IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
> +                                      &zero_sum_result,
> +                                      DMA_PREP_INTERRUPT | DMA_CTRL_ACK);

What about zero_sum_srcs and zero_sum_result? Shouldn't they be renamed too?

> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -854,7 +854,7 @@ static void ops_run_check(struct stripe_head *sh)
>                        xor_srcs[count++] = dev->page;
>        }
> 
> -       tx = async_xor_zero_sum(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
> +       tx = async_xor_val(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
>                &sh->ops.zero_sum_result, 0, NULL, NULL, NULL);

And the same here...

Regards,
Maciej--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/11] async_tx: rename zero_sum to val
  2009-05-29 13:41     ` Sosnowski, Maciej
@ 2009-06-03 18:12       ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-06-03 18:12 UTC (permalink / raw)
  To: Sosnowski, Maciej; +Cc: linux-raid@vger.kernel.org

2009/5/29 Sosnowski, Maciej <maciej.sosnowski@intel.com>:
> Dan Williams wrote:
>> 'zero_sum' does not properly describe the operation of generating parity
>> and checking that it validates against an existing buffer.  Change the
>> name of the operation to 'val' (for 'validate').  This is in
>> anticipation of the p+q case where it is a requirement to identify the
>> target parity buffers separately from the source buffers, because the
>> target parity buffers will not have corresponding pq coefficients.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  arch/arm/mach-iop13xx/setup.c |    8 ++++----
>>  arch/arm/plat-iop/adma.c      |    2 +-
>>  crypto/async_tx/async_xor.c   |   16 ++++++++--------
>>  drivers/dma/dmaengine.c       |    4 ++--
>>  drivers/dma/iop-adma.c        |   38 +++++++++++++++++++-------------------
>>  drivers/md/raid5.c            |    2 +-
>>  include/linux/async_tx.h      |    2 +-
>>  include/linux/dmaengine.h     |    8 ++++----
>>  8 files changed, 40 insertions(+), 40 deletions(-)
>
> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
>
> with following comments:
>
>> @@ -1072,10 +1072,10 @@ iop_adma_xor_zero_sum_self_test(struct
>> iop_adma_device *device)
>>                dma_srcs[i] = dma_map_page(dma_chan->device->dev,
>>                                           zero_sum_srcs[i], 0, PAGE_SIZE,
>>                                           DMA_TO_DEVICE);
>> -       tx = iop_adma_prep_dma_zero_sum(dma_chan, dma_srcs,
>> -                                       IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
>> -                                       &zero_sum_result,
>> -                                       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
>> +       tx = iop_adma_prep_dma_xor_val(dma_chan, dma_srcs,
>> +                                      IOP_ADMA_NUM_SRC_TEST + 1, PAGE_SIZE,
>> +                                      &zero_sum_result,
>> +                                      DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
>
> What about zero_sum_srcs and zero_sum_result? Shouldn't they be renamed too?

No, the hardware specification calls the operation "zero sum", so it
is fine that the kernel calls it one thing but the driver translates
it into something else.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 02/11] async_tx: kill ASYNC_TX_DEP_ACK flag
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
  2009-05-19  0:59 ` [PATCH v2 01/11] async_tx: rename zero_sum to val Dan Williams
@ 2009-05-19  0:59 ` Dan Williams
       [not found]   ` <f12847240905250320w74897dabo6576b4b48bd19c0c@mail.gmail.com>
  2009-05-19  0:59 ` [PATCH v2 03/11] async_tx: structify submission arguments, add scribble Dan Williams
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

In support of inter-channel chaining async_tx utilizes an ack flag to
gate whether a dependent operation can be chained to another.  While the
flag is not set the chain can be considered open for appending.  Setting
the ack flag closes the chain and flags the descriptor for garbage
collection.  The ASYNC_TX_DEP_ACK flag essentially means "close the
chain after adding this dependency".  Since each operation can only have
one child the api now implicitly sets the ack flag at dependency
submission time.  This removes an unnecessary management burden from
clients of the api.

[ Impact: clean up and enforce one dependency per operation ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 crypto/async_tx/async_memcpy.c |    2 +-
 crypto/async_tx/async_memset.c |    2 +-
 crypto/async_tx/async_tx.c     |    4 ++--
 crypto/async_tx/async_xor.c    |    6 ++----
 drivers/md/raid5.c             |   25 +++++++++++--------------
 include/linux/async_tx.h       |    4 +---
 6 files changed, 18 insertions(+), 25 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index ddccfb0..7117ec6 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -35,7 +35,7 @@
  * @src: src page
  * @offset: offset in pages to start transaction
  * @len: length in bytes
- * @flags: ASYNC_TX_ACK, ASYNC_TX_DEP_ACK,
+ * @flags: ASYNC_TX_ACK
  * @depend_tx: memcpy depends on the result of this transaction
  * @cb_fn: function to call when the memcpy completes
  * @cb_param: parameter to pass to the callback routine
diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index 5b5eb99..b2f1338 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -35,7 +35,7 @@
  * @val: fill value
  * @offset: offset in pages to start transaction
  * @len: length in bytes
- * @flags: ASYNC_TX_ACK, ASYNC_TX_DEP_ACK
+ * @flags: ASYNC_TX_ACK
  * @depend_tx: memset depends on the result of this transaction
  * @cb_fn: function to call when the memcpy completes
  * @cb_param: parameter to pass to the callback routine
diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 06eb6cc..3766bc3 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -223,7 +223,7 @@ async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
 	if (flags & ASYNC_TX_ACK)
 		async_tx_ack(tx);
 
-	if (depend_tx && (flags & ASYNC_TX_DEP_ACK))
+	if (depend_tx)
 		async_tx_ack(depend_tx);
 }
 EXPORT_SYMBOL_GPL(async_tx_submit);
@@ -231,7 +231,7 @@ EXPORT_SYMBOL_GPL(async_tx_submit);
 /**
  * async_trigger_callback - schedules the callback function to be run after
  * any dependent operations have been completed.
- * @flags: ASYNC_TX_ACK, ASYNC_TX_DEP_ACK
+ * @flags: ASYNC_TX_ACK
  * @depend_tx: 'callback' requires the completion of this transaction
  * @cb_fn: function to call after depend_tx completes
  * @cb_param: parameter to pass to the callback routine
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index e0580b0..3cc5dc7 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -105,7 +105,6 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 				_cb_param);
 
 		depend_tx = tx;
-		flags |= ASYNC_TX_DEP_ACK;
 
 		if (src_cnt > xor_src_cnt) {
 			/* drop completed sources */
@@ -168,8 +167,7 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
  * @offset: offset in pages to start transaction
  * @src_cnt: number of source pages
  * @len: length in bytes
- * @flags: ASYNC_TX_XOR_ZERO_DST, ASYNC_TX_XOR_DROP_DEST,
- *	ASYNC_TX_ACK, ASYNC_TX_DEP_ACK
+ * @flags: ASYNC_TX_XOR_ZERO_DST, ASYNC_TX_XOR_DROP_DEST, ASYNC_TX_ACK
  * @depend_tx: xor depends on the result of this transaction.
  * @cb_fn: function to call when the xor completes
  * @cb_param: parameter to pass to the callback routine
@@ -230,7 +228,7 @@ static int page_is_zero(struct page *p, unsigned int offset, size_t len)
  * @src_cnt: number of source pages
  * @len: length in bytes
  * @result: 0 if sum == 0 else non-zero
- * @flags: ASYNC_TX_ACK, ASYNC_TX_DEP_ACK
+ * @flags: ASYNC_TX_ACK
  * @depend_tx: xor depends on the result of this transaction.
  * @cb_fn: function to call when the xor completes
  * @cb_param: parameter to pass to the callback routine
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index f8d2d35..0ef5362 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -525,14 +525,12 @@ async_copy_data(int frombio, struct bio *bio, struct page *page,
 			bio_page = bio_iovec_idx(bio, i)->bv_page;
 			if (frombio)
 				tx = async_memcpy(page, bio_page, page_offset,
-					b_offset, clen,
-					ASYNC_TX_DEP_ACK,
-					tx, NULL, NULL);
+						  b_offset, clen, 0,
+						  tx, NULL, NULL);
 			else
 				tx = async_memcpy(bio_page, page, b_offset,
-					page_offset, clen,
-					ASYNC_TX_DEP_ACK,
-					tx, NULL, NULL);
+						  page_offset, clen, 0,
+						  tx, NULL, NULL);
 		}
 		if (clen < len) /* hit end of page */
 			break;
@@ -615,8 +613,7 @@ static void ops_run_biofill(struct stripe_head *sh)
 	}
 
 	atomic_inc(&sh->count);
-	async_trigger_callback(ASYNC_TX_DEP_ACK | ASYNC_TX_ACK, tx,
-		ops_complete_biofill, sh);
+	async_trigger_callback(ASYNC_TX_ACK, tx, ops_complete_biofill, sh);
 }
 
 static void ops_complete_compute5(void *stripe_head_ref)
@@ -701,8 +698,8 @@ ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 	}
 
 	tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
-		ASYNC_TX_DEP_ACK | ASYNC_TX_XOR_DROP_DST, tx,
-		ops_complete_prexor, sh);
+		       ASYNC_TX_XOR_DROP_DST, tx,
+		       ops_complete_prexor, sh);
 
 	return tx;
 }
@@ -809,7 +806,7 @@ ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 	 * set ASYNC_TX_XOR_DROP_DST and ASYNC_TX_XOR_ZERO_DST
 	 * for the synchronous xor case
 	 */
-	flags = ASYNC_TX_DEP_ACK | ASYNC_TX_ACK |
+	flags = ASYNC_TX_ACK |
 		(prexor ? ASYNC_TX_XOR_DROP_DST : ASYNC_TX_XOR_ZERO_DST);
 
 	atomic_inc(&sh->count);
@@ -858,7 +855,7 @@ static void ops_run_check(struct stripe_head *sh)
 		&sh->ops.zero_sum_result, 0, NULL, NULL, NULL);
 
 	atomic_inc(&sh->count);
-	tx = async_trigger_callback(ASYNC_TX_DEP_ACK | ASYNC_TX_ACK, tx,
+	tx = async_trigger_callback(ASYNC_TX_ACK, tx,
 		ops_complete_check, sh);
 }
 
@@ -2687,8 +2684,8 @@ static void handle_stripe_expansion(raid5_conf_t *conf, struct stripe_head *sh,
 
 			/* place all the copies on one channel */
 			tx = async_memcpy(sh2->dev[dd_idx].page,
-				sh->dev[i].page, 0, 0, STRIPE_SIZE,
-				ASYNC_TX_DEP_ACK, tx, NULL, NULL);
+					  sh->dev[i].page, 0, 0, STRIPE_SIZE,
+					  0, tx, NULL, NULL);
 
 			set_bit(R5_Expanded, &sh2->dev[dd_idx].flags);
 			set_bit(R5_UPTODATE, &sh2->dev[dd_idx].flags);
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index 513150d..9f14cd5 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -58,13 +58,11 @@ struct dma_chan_ref {
  * array.
  * @ASYNC_TX_ACK: immediately ack the descriptor, precludes setting up a
  * dependency chain
- * @ASYNC_TX_DEP_ACK: ack the dependency descriptor.  Useful for chaining.
  */
 enum async_tx_flags {
 	ASYNC_TX_XOR_ZERO_DST	 = (1 << 0),
 	ASYNC_TX_XOR_DROP_DST	 = (1 << 1),
-	ASYNC_TX_ACK		 = (1 << 3),
-	ASYNC_TX_DEP_ACK	 = (1 << 4),
+	ASYNC_TX_ACK		 = (1 << 2),
 };
 
 #ifdef CONFIG_DMA_ENGINE


^ permalink raw reply related	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250320w74897dabo6576b4b48bd19c0c@mail.gmail.com>]

* RE: [PATCH v2 02/11] async_tx: kill ASYNC_TX_DEP_ACK flag
       [not found]   ` <f12847240905250320w74897dabo6576b4b48bd19c0c@mail.gmail.com>
@ 2009-05-29 13:41     ` Sosnowski, Maciej
  2009-06-03 18:42       ` Dan Williams
  0 siblings, 1 reply; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:41 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> In support of inter-channel chaining async_tx utilizes an ack flag to
> gate whether a dependent operation can be chained to another.  While the
> flag is not set the chain can be considered open for appending.  Setting
> the ack flag closes the chain and flags the descriptor for garbage
> collection.  The ASYNC_TX_DEP_ACK flag essentially means "close the
> chain after adding this dependency".  Since each operation can only have
> one child the api now implicitly sets the ack flag at dependency
> submission time.  This removes an unnecessary management burden from
> clients of the api.
> 
> [ Impact: clean up and enforce one dependency per operation ]
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  crypto/async_tx/async_memcpy.c |    2 +-
>  crypto/async_tx/async_memset.c |    2 +-
>  crypto/async_tx/async_tx.c     |    4 ++--
>  crypto/async_tx/async_xor.c    |    6 ++----
>  drivers/md/raid5.c             |   25 +++++++++++--------------
>  include/linux/async_tx.h       |    4 +---
>  6 files changed, 18 insertions(+), 25 deletions(-)

Async-tx-api.txt documentation should be also updated
(it still describes ASYNC_TX_DEP_ACK usage).

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 02/11] async_tx: kill ASYNC_TX_DEP_ACK flag
  2009-05-29 13:41     ` Sosnowski, Maciej
@ 2009-06-03 18:42       ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-06-03 18:42 UTC (permalink / raw)
  To: Sosnowski, Maciej; +Cc: linux-raid@vger.kernel.org

2009/5/29 Sosnowski, Maciej <maciej.sosnowski@intel.com>:
> Dan Williams wrote:
>> In support of inter-channel chaining async_tx utilizes an ack flag to
>> gate whether a dependent operation can be chained to another.  While the
>> flag is not set the chain can be considered open for appending.  Setting
>> the ack flag closes the chain and flags the descriptor for garbage
>> collection.  The ASYNC_TX_DEP_ACK flag essentially means "close the
>> chain after adding this dependency".  Since each operation can only have
>> one child the api now implicitly sets the ack flag at dependency
>> submission time.  This removes an unnecessary management burden from
>> clients of the api.
>>
>> [ Impact: clean up and enforce one dependency per operation ]
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  crypto/async_tx/async_memcpy.c |    2 +-
>>  crypto/async_tx/async_memset.c |    2 +-
>>  crypto/async_tx/async_tx.c     |    4 ++--
>>  crypto/async_tx/async_xor.c    |    6 ++----
>>  drivers/md/raid5.c             |   25 +++++++++++--------------
>>  include/linux/async_tx.h       |    4 +---
>>  6 files changed, 18 insertions(+), 25 deletions(-)
>
> Async-tx-api.txt documentation should be also updated
> (it still describes ASYNC_TX_DEP_ACK usage).
>

Done, thanks... and I also need to update the document for the
async_submit_ctl structure.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 03/11] async_tx: structify submission arguments, add scribble
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
  2009-05-19  0:59 ` [PATCH v2 01/11] async_tx: rename zero_sum to val Dan Williams
  2009-05-19  0:59 ` [PATCH v2 02/11] async_tx: kill ASYNC_TX_DEP_ACK flag Dan Williams
@ 2009-05-19  0:59 ` Dan Williams
  2009-05-20  8:06   ` Andre Noll
       [not found]   ` <f12847240905250321v774c4e8dscd7a466cd2e61168@mail.gmail.com>
  2009-05-19  0:59 ` [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region Dan Williams
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

Prepare the api for the arrival of a new parameter, 'scribble'.  This
will allow callers to identify scratchpad memory for dma address or page
address conversions.  As this adds yet another parameter, take this
opportunity to convert the common submission parameters (flags,
dependency, callback, and callback argument) into an object that is
passed by reference.

[ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 crypto/async_tx/async_memcpy.c |   21 ++++-----
 crypto/async_tx/async_memset.c |   23 ++++------
 crypto/async_tx/async_tx.c     |   34 +++++++--------
 crypto/async_tx/async_xor.c    |   93 +++++++++++++++++-----------------------
 drivers/md/raid5.c             |   59 +++++++++++++++----------
 include/linux/async_tx.h       |   84 +++++++++++++++++++++++-------------
 6 files changed, 161 insertions(+), 153 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index 7117ec6..c9342ae 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -35,26 +35,23 @@
  * @src: src page
  * @offset: offset in pages to start transaction
  * @len: length in bytes
- * @flags: ASYNC_TX_ACK
- * @depend_tx: memcpy depends on the result of this transaction
- * @cb_fn: function to call when the memcpy completes
- * @cb_param: parameter to pass to the callback routine
+ * @submit: submission / completion modifiers
  */
 struct dma_async_tx_descriptor *
 async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
-	unsigned int src_offset, size_t len, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_param)
+	     unsigned int src_offset, size_t len,
+	     struct async_submit_ctl *submit)
 {
-	struct dma_chan *chan = async_tx_find_channel(depend_tx, DMA_MEMCPY,
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_MEMCPY,
 						      &dest, 1, &src, 1, len);
 	struct dma_device *device = chan ? chan->device : NULL;
 	struct dma_async_tx_descriptor *tx = NULL;
 
 	if (device) {
 		dma_addr_t dma_dest, dma_src;
-		unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;
+		unsigned long dma_prep_flags;
 
+		dma_prep_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
 		dma_dest = dma_map_page(device->dev, dest, dest_offset, len,
 					DMA_FROM_DEVICE);
 
@@ -67,13 +64,13 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
 
 	if (tx) {
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
-		async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
+		async_tx_submit(chan, tx, submit);
 	} else {
 		void *dest_buf, *src_buf;
 		pr_debug("%s: (sync) len: %zu\n", __func__, len);
 
 		/* wait for any prerequisite operations */
-		async_tx_quiesce(&depend_tx);
+		async_tx_quiesce(&submit->depend_tx);
 
 		dest_buf = kmap_atomic(dest, KM_USER0) + dest_offset;
 		src_buf = kmap_atomic(src, KM_USER1) + src_offset;
@@ -83,7 +80,7 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
 		kunmap_atomic(dest_buf, KM_USER0);
 		kunmap_atomic(src_buf, KM_USER1);
 
-		async_tx_sync_epilog(cb_fn, cb_param);
+		async_tx_sync_epilog(submit);
 	}
 
 	return tx;
diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index b2f1338..e347dbe 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -35,26 +35,21 @@
  * @val: fill value
  * @offset: offset in pages to start transaction
  * @len: length in bytes
- * @flags: ASYNC_TX_ACK
- * @depend_tx: memset depends on the result of this transaction
- * @cb_fn: function to call when the memcpy completes
- * @cb_param: parameter to pass to the callback routine
  */
 struct dma_async_tx_descriptor *
-async_memset(struct page *dest, int val, unsigned int offset,
-	size_t len, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_param)
+async_memset(struct page *dest, int val, unsigned int offset, size_t len,
+	     struct async_submit_ctl *submit)
 {
-	struct dma_chan *chan = async_tx_find_channel(depend_tx, DMA_MEMSET,
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_MEMSET,
 						      &dest, 1, NULL, 0, len);
 	struct dma_device *device = chan ? chan->device : NULL;
 	struct dma_async_tx_descriptor *tx = NULL;
 
 	if (device) {
 		dma_addr_t dma_dest;
-		unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;
+		unsigned long dma_prep_flags;
 
+		dma_prep_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
 		dma_dest = dma_map_page(device->dev, dest, offset, len,
 					DMA_FROM_DEVICE);
 
@@ -64,19 +59,19 @@ async_memset(struct page *dest, int val, unsigned int offset,
 
 	if (tx) {
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
-		async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
+		async_tx_submit(chan, tx, submit);
 	} else { /* run the memset synchronously */
 		void *dest_buf;
 		pr_debug("%s: (sync) len: %zu\n", __func__, len);
 
-		dest_buf = (void *) (((char *) page_address(dest)) + offset);
+		dest_buf = page_address(dest) + offset;
 
 		/* wait for any prerequisite operations */
-		async_tx_quiesce(&depend_tx);
+		async_tx_quiesce(&submit->depend_tx);
 
 		memset(dest_buf, val, len);
 
-		async_tx_sync_epilog(cb_fn, cb_param);
+		async_tx_sync_epilog(submit);
 	}
 
 	return tx;
diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 3766bc3..85e1b44 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -45,13 +45,15 @@ static void __exit async_tx_exit(void)
 /**
  * __async_tx_find_channel - find a channel to carry out the operation or let
  *	the transaction execute synchronously
- * @depend_tx: transaction dependency
+ * @submit: transaction dependency and submission modifiers
  * @tx_type: transaction type
  */
 struct dma_chan *
-__async_tx_find_channel(struct dma_async_tx_descriptor *depend_tx,
-	enum dma_transaction_type tx_type)
+__async_tx_find_channel(struct async_submit_ctl *submit,
+			enum dma_transaction_type tx_type)
 {
+	struct dma_async_tx_descriptor *depend_tx = submit->depend_tx;
+
 	/* see if we can keep the chain on one channel */
 	if (depend_tx &&
 	    dma_has_cap(tx_type, depend_tx->chan->device->cap_mask))
@@ -160,11 +162,12 @@ enum submit_disposition {
 
 void
 async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
-	enum async_tx_flags flags, struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_param)
+		struct async_submit_ctl *submit)
 {
-	tx->callback = cb_fn;
-	tx->callback_param = cb_param;
+	struct dma_async_tx_descriptor *depend_tx = submit->depend_tx;
+
+	tx->callback = submit->cb_fn;
+	tx->callback_param = submit->cb_param;
 
 	if (depend_tx) {
 		enum submit_disposition s;
@@ -220,7 +223,7 @@ async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
 		tx->tx_submit(tx);
 	}
 
-	if (flags & ASYNC_TX_ACK)
+	if (submit->flags & ASYNC_TX_ACK)
 		async_tx_ack(tx);
 
 	if (depend_tx)
@@ -231,19 +234,14 @@ EXPORT_SYMBOL_GPL(async_tx_submit);
 /**
  * async_trigger_callback - schedules the callback function to be run after
  * any dependent operations have been completed.
- * @flags: ASYNC_TX_ACK
- * @depend_tx: 'callback' requires the completion of this transaction
- * @cb_fn: function to call after depend_tx completes
- * @cb_param: parameter to pass to the callback routine
  */
 struct dma_async_tx_descriptor *
-async_trigger_callback(enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_param)
+async_trigger_callback(struct async_submit_ctl *submit)
 {
 	struct dma_chan *chan;
 	struct dma_device *device;
 	struct dma_async_tx_descriptor *tx;
+	struct dma_async_tx_descriptor *depend_tx = submit->depend_tx;
 
 	if (depend_tx) {
 		chan = depend_tx->chan;
@@ -262,14 +260,14 @@ async_trigger_callback(enum async_tx_flags flags,
 	if (tx) {
 		pr_debug("%s: (async)\n", __func__);
 
-		async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
+		async_tx_submit(chan, tx, submit);
 	} else {
 		pr_debug("%s: (sync)\n", __func__);
 
 		/* wait for any prerequisite operations */
-		async_tx_quiesce(&depend_tx);
+		async_tx_quiesce(&submit->depend_tx);
 
-		async_tx_sync_epilog(cb_fn, cb_param);
+		async_tx_sync_epilog(submit);
 	}
 
 	return tx;
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 3cc5dc7..6290d05 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -34,18 +34,16 @@
 static __async_inline struct dma_async_tx_descriptor *
 do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 	     unsigned int offset, int src_cnt, size_t len,
-	     enum async_tx_flags flags,
-	     struct dma_async_tx_descriptor *depend_tx,
-	     dma_async_tx_callback cb_fn, void *cb_param)
+	     struct async_submit_ctl *submit)
 {
 	struct dma_device *dma = chan->device;
 	dma_addr_t *dma_src = (dma_addr_t *) src_list;
 	struct dma_async_tx_descriptor *tx = NULL;
 	int src_off = 0;
 	int i;
-	dma_async_tx_callback _cb_fn;
-	void *_cb_param;
-	enum async_tx_flags async_flags;
+	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
+	void *cb_param_orig = submit->cb_param;
+	enum async_tx_flags flags_orig = submit->flags;
 	enum dma_ctrl_flags dma_flags;
 	int xor_src_cnt;
 	dma_addr_t dma_dest;
@@ -63,7 +61,7 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 	}
 
 	while (src_cnt) {
-		async_flags = flags;
+		submit->flags = flags_orig;
 		dma_flags = 0;
 		xor_src_cnt = min(src_cnt, dma->max_xor);
 		/* if we are submitting additional xors, leave the chain open,
@@ -71,15 +69,15 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 		 * buffer mapped
 		 */
 		if (src_cnt > xor_src_cnt) {
-			async_flags &= ~ASYNC_TX_ACK;
+			submit->flags &= ~ASYNC_TX_ACK;
 			dma_flags = DMA_COMPL_SKIP_DEST_UNMAP;
-			_cb_fn = NULL;
-			_cb_param = NULL;
+			submit->cb_fn = NULL;
+			submit->cb_param = NULL;
 		} else {
-			_cb_fn = cb_fn;
-			_cb_param = cb_param;
+			submit->cb_fn = cb_fn_orig;
+			submit->cb_param = cb_param_orig;
 		}
-		if (_cb_fn)
+		if (submit->cb_fn)
 			dma_flags |= DMA_PREP_INTERRUPT;
 
 		/* Since we have clobbered the src_list we are committed
@@ -90,7 +88,7 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 					      xor_src_cnt, len, dma_flags);
 
 		if (unlikely(!tx))
-			async_tx_quiesce(&depend_tx);
+			async_tx_quiesce(&submit->depend_tx);
 
 		/* spin wait for the preceeding transactions to complete */
 		while (unlikely(!tx)) {
@@ -101,10 +99,8 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 						      dma_flags);
 		}
 
-		async_tx_submit(chan, tx, async_flags, depend_tx, _cb_fn,
-				_cb_param);
-
-		depend_tx = tx;
+		async_tx_submit(chan, tx, submit);
+		submit->depend_tx = tx;
 
 		if (src_cnt > xor_src_cnt) {
 			/* drop completed sources */
@@ -123,8 +119,7 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 
 static void
 do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
-	    int src_cnt, size_t len, enum async_tx_flags flags,
-	    dma_async_tx_callback cb_fn, void *cb_param)
+	    int src_cnt, size_t len, struct async_submit_ctl *submit)
 {
 	int i;
 	int xor_src_cnt;
@@ -139,7 +134,7 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
 	/* set destination address */
 	dest_buf = page_address(dest) + offset;
 
-	if (flags & ASYNC_TX_XOR_ZERO_DST)
+	if (submit->flags & ASYNC_TX_XOR_ZERO_DST)
 		memset(dest_buf, 0, len);
 
 	while (src_cnt > 0) {
@@ -152,7 +147,7 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
 		src_off += xor_src_cnt;
 	}
 
-	async_tx_sync_epilog(cb_fn, cb_param);
+	async_tx_sync_epilog(submit);
 }
 
 /**
@@ -167,18 +162,13 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
  * @offset: offset in pages to start transaction
  * @src_cnt: number of source pages
  * @len: length in bytes
- * @flags: ASYNC_TX_XOR_ZERO_DST, ASYNC_TX_XOR_DROP_DEST, ASYNC_TX_ACK
- * @depend_tx: xor depends on the result of this transaction.
- * @cb_fn: function to call when the xor completes
- * @cb_param: parameter to pass to the callback routine
+ * @submit: submission / completion modifiers
  */
 struct dma_async_tx_descriptor *
 async_xor(struct page *dest, struct page **src_list, unsigned int offset,
-	int src_cnt, size_t len, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_param)
+	  int src_cnt, size_t len, struct async_submit_ctl *submit)
 {
-	struct dma_chan *chan = async_tx_find_channel(depend_tx, DMA_XOR,
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
 						      &dest, 1, src_list,
 						      src_cnt, len);
 	BUG_ON(src_cnt <= 1);
@@ -188,7 +178,7 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
 		pr_debug("%s (async): len: %zu\n", __func__, len);
 
 		return do_async_xor(chan, dest, src_list, offset, src_cnt, len,
-				    flags, depend_tx, cb_fn, cb_param);
+				    submit);
 	} else {
 		/* run the xor synchronously */
 		pr_debug("%s (sync): len: %zu\n", __func__, len);
@@ -196,16 +186,15 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
 		/* in the sync case the dest is an implied source
 		 * (assumes the dest is the first source)
 		 */
-		if (flags & ASYNC_TX_XOR_DROP_DST) {
+		if (submit->flags & ASYNC_TX_XOR_DROP_DST) {
 			src_cnt--;
 			src_list++;
 		}
 
 		/* wait for any prerequisite operations */
-		async_tx_quiesce(&depend_tx);
+		async_tx_quiesce(&submit->depend_tx);
 
-		do_sync_xor(dest, src_list, offset, src_cnt, len,
-			    flags, cb_fn, cb_param);
+		do_sync_xor(dest, src_list, offset, src_cnt, len, submit);
 
 		return NULL;
 	}
@@ -228,19 +217,14 @@ static int page_is_zero(struct page *p, unsigned int offset, size_t len)
  * @src_cnt: number of source pages
  * @len: length in bytes
  * @result: 0 if sum == 0 else non-zero
- * @flags: ASYNC_TX_ACK
- * @depend_tx: xor depends on the result of this transaction.
- * @cb_fn: function to call when the xor completes
- * @cb_param: parameter to pass to the callback routine
+ * @submit: submission / completion modifiers
  */
 struct dma_async_tx_descriptor *
-async_xor_val(struct page *dest, struct page **src_list,
-	unsigned int offset, int src_cnt, size_t len,
-	u32 *result, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_param)
+async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
+	      int src_cnt, size_t len, u32 *result,
+	      struct async_submit_ctl *submit)
 {
-	struct dma_chan *chan = async_tx_find_channel(depend_tx, DMA_XOR_VAL,
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR_VAL,
 						      &dest, 1, src_list,
 						      src_cnt, len);
 	struct dma_device *device = chan ? chan->device : NULL;
@@ -250,11 +234,12 @@ async_xor_val(struct page *dest, struct page **src_list,
 
 	if (device && src_cnt <= device->max_xor) {
 		dma_addr_t *dma_src = (dma_addr_t *) src_list;
-		unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;
+		unsigned long dma_prep_flags;
 		int i;
 
 		pr_debug("%s: (async) len: %zu\n", __func__, len);
 
+		dma_prep_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
 		for (i = 0; i < src_cnt; i++)
 			dma_src[i] = dma_map_page(device->dev, src_list[i],
 						  offset, len, DMA_TO_DEVICE);
@@ -263,7 +248,7 @@ async_xor_val(struct page *dest, struct page **src_list,
 						     len, result,
 						     dma_prep_flags);
 		if (unlikely(!tx)) {
-			async_tx_quiesce(&depend_tx);
+			async_tx_quiesce(&submit->depend_tx);
 
 			while (!tx) {
 				dma_async_issue_pending(chan);
@@ -273,23 +258,23 @@ async_xor_val(struct page *dest, struct page **src_list,
 			}
 		}
 
-		async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
+		async_tx_submit(chan, tx, submit);
 	} else {
-		unsigned long xor_flags = flags;
+		enum async_tx_flags flags_orig = submit->flags;
 
 		pr_debug("%s: (sync) len: %zu\n", __func__, len);
 
-		xor_flags |= ASYNC_TX_XOR_DROP_DST;
-		xor_flags &= ~ASYNC_TX_ACK;
+		submit->flags |= ASYNC_TX_XOR_DROP_DST;
+		submit->flags &= ~ASYNC_TX_ACK;
 
-		tx = async_xor(dest, src_list, offset, src_cnt, len, xor_flags,
-			depend_tx, NULL, NULL);
+		tx = async_xor(dest, src_list, offset, src_cnt, len, submit);
 
 		async_tx_quiesce(&tx);
 
 		*result = page_is_zero(dest, offset, len) ? 0 : 1;
 
-		async_tx_sync_epilog(cb_fn, cb_param);
+		async_tx_sync_epilog(submit);
+		submit->flags = flags_orig;
 	}
 
 	return tx;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0ef5362..e1920f2 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -499,11 +499,14 @@ async_copy_data(int frombio, struct bio *bio, struct page *page,
 	struct page *bio_page;
 	int i;
 	int page_offset;
+	struct async_submit_ctl submit;
 
 	if (bio->bi_sector >= sector)
 		page_offset = (signed)(bio->bi_sector - sector) * 512;
 	else
 		page_offset = (signed)(sector - bio->bi_sector) * -512;
+
+	init_async_submit(&submit, 0, tx, NULL, NULL, NULL);
 	bio_for_each_segment(bvl, bio, i) {
 		int len = bio_iovec_idx(bio, i)->bv_len;
 		int clen;
@@ -525,13 +528,14 @@ async_copy_data(int frombio, struct bio *bio, struct page *page,
 			bio_page = bio_iovec_idx(bio, i)->bv_page;
 			if (frombio)
 				tx = async_memcpy(page, bio_page, page_offset,
-						  b_offset, clen, 0,
-						  tx, NULL, NULL);
+						  b_offset, clen, &submit);
 			else
 				tx = async_memcpy(bio_page, page, b_offset,
-						  page_offset, clen, 0,
-						  tx, NULL, NULL);
+						  page_offset, clen, &submit);
 		}
+		/* chain the operations */
+		submit.depend_tx = tx;
+
 		if (clen < len) /* hit end of page */
 			break;
 		page_offset +=  len;
@@ -590,6 +594,7 @@ static void ops_run_biofill(struct stripe_head *sh)
 {
 	struct dma_async_tx_descriptor *tx = NULL;
 	raid5_conf_t *conf = sh->raid_conf;
+	struct async_submit_ctl submit;
 	int i;
 
 	pr_debug("%s: stripe %llu\n", __func__,
@@ -613,7 +618,8 @@ static void ops_run_biofill(struct stripe_head *sh)
 	}
 
 	atomic_inc(&sh->count);
-	async_trigger_callback(ASYNC_TX_ACK, tx, ops_complete_biofill, sh);
+	init_async_submit(&submit, ASYNC_TX_ACK, tx, ops_complete_biofill, sh, NULL);
+	async_trigger_callback(&submit);
 }
 
 static void ops_complete_compute5(void *stripe_head_ref)
@@ -645,6 +651,7 @@ static struct dma_async_tx_descriptor *ops_run_compute5(struct stripe_head *sh)
 	struct page *xor_dest = tgt->page;
 	int count = 0;
 	struct dma_async_tx_descriptor *tx;
+	struct async_submit_ctl submit;
 	int i;
 
 	pr_debug("%s: stripe %llu block: %d\n",
@@ -657,13 +664,12 @@ static struct dma_async_tx_descriptor *ops_run_compute5(struct stripe_head *sh)
 
 	atomic_inc(&sh->count);
 
+	init_async_submit(&submit, ASYNC_TX_XOR_ZERO_DST, NULL,
+			  ops_complete_compute5, sh, NULL);
 	if (unlikely(count == 1))
-		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE,
-			0, NULL, ops_complete_compute5, sh);
+		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
 	else
-		tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
-			ASYNC_TX_XOR_ZERO_DST, NULL,
-			ops_complete_compute5, sh);
+		tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
 
 	return tx;
 }
@@ -683,6 +689,7 @@ ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 	int disks = sh->disks;
 	struct page *xor_srcs[disks];
 	int count = 0, pd_idx = sh->pd_idx, i;
+	struct async_submit_ctl submit;
 
 	/* existing parity data subtracted */
 	struct page *xor_dest = xor_srcs[count++] = sh->dev[pd_idx].page;
@@ -697,9 +704,9 @@ ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 			xor_srcs[count++] = dev->page;
 	}
 
-	tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
-		       ASYNC_TX_XOR_DROP_DST, tx,
-		       ops_complete_prexor, sh);
+	init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST, tx,
+			  ops_complete_prexor, sh, NULL);
+	tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
 
 	return tx;
 }
@@ -772,7 +779,7 @@ ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 	/* kernel stack size limits the total number of disks */
 	int disks = sh->disks;
 	struct page *xor_srcs[disks];
-
+	struct async_submit_ctl submit;
 	int count = 0, pd_idx = sh->pd_idx, i;
 	struct page *xor_dest;
 	int prexor = 0;
@@ -811,13 +818,11 @@ ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 
 	atomic_inc(&sh->count);
 
-	if (unlikely(count == 1)) {
-		flags &= ~(ASYNC_TX_XOR_DROP_DST | ASYNC_TX_XOR_ZERO_DST);
-		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE,
-			flags, tx, ops_complete_postxor, sh);
-	} else
-		tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
-			flags, tx, ops_complete_postxor, sh);
+	init_async_submit(&submit, flags, tx, ops_complete_postxor, sh, NULL);
+	if (unlikely(count == 1))
+		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
+	else
+		tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
 }
 
 static void ops_complete_check(void *stripe_head_ref)
@@ -838,6 +843,7 @@ static void ops_run_check(struct stripe_head *sh)
 	int disks = sh->disks;
 	struct page *xor_srcs[disks];
 	struct dma_async_tx_descriptor *tx;
+	struct async_submit_ctl submit;
 
 	int count = 0, pd_idx = sh->pd_idx, i;
 	struct page *xor_dest = xor_srcs[count++] = sh->dev[pd_idx].page;
@@ -851,12 +857,13 @@ static void ops_run_check(struct stripe_head *sh)
 			xor_srcs[count++] = dev->page;
 	}
 
+	init_async_submit(&submit, 0, NULL, NULL, NULL, NULL);
 	tx = async_xor_val(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
-		&sh->ops.zero_sum_result, 0, NULL, NULL, NULL);
+			   &sh->ops.zero_sum_result, &submit);
 
 	atomic_inc(&sh->count);
-	tx = async_trigger_callback(ASYNC_TX_ACK, tx,
-		ops_complete_check, sh);
+	init_async_submit(&submit, ASYNC_TX_ACK, tx, ops_complete_check, sh, NULL);
+	tx = async_trigger_callback(&submit);
 }
 
 static void raid5_run_ops(struct stripe_head *sh, unsigned long ops_request)
@@ -2664,6 +2671,7 @@ static void handle_stripe_expansion(raid5_conf_t *conf, struct stripe_head *sh,
 		if (i != sh->pd_idx && i != sh->qd_idx) {
 			int dd_idx, j;
 			struct stripe_head *sh2;
+			struct async_submit_ctl submit;
 
 			sector_t bn = compute_blocknr(sh, i, 1);
 			sector_t s = raid5_compute_sector(conf, bn, 0,
@@ -2683,9 +2691,10 @@ static void handle_stripe_expansion(raid5_conf_t *conf, struct stripe_head *sh,
 			}
 
 			/* place all the copies on one channel */
+			init_async_submit(&submit, 0, tx, NULL, NULL, NULL);
 			tx = async_memcpy(sh2->dev[dd_idx].page,
 					  sh->dev[i].page, 0, 0, STRIPE_SIZE,
-					  0, tx, NULL, NULL);
+					  &submit);
 
 			set_bit(R5_Expanded, &sh2->dev[dd_idx].flags);
 			set_bit(R5_UPTODATE, &sh2->dev[dd_idx].flags);
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index 9f14cd5..00cfb63 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -65,6 +65,22 @@ enum async_tx_flags {
 	ASYNC_TX_ACK		 = (1 << 2),
 };
 
+/**
+ * struct async_submit_ctl - async_tx submission/completion modifiers
+ * @flags: submission modifiers
+ * @depend_tx: parent dependency of the current operation being submitted
+ * @cb_fn: callback routine to run at operation completion
+ * @cb_param: parameter for the callback routine
+ * @scribble: caller provided space for dma/page address conversions
+ */
+struct async_submit_ctl {
+	enum async_tx_flags flags;
+	struct dma_async_tx_descriptor *depend_tx;
+	dma_async_tx_callback cb_fn;
+	void *cb_param;
+	void *scribble;
+};
+
 #ifdef CONFIG_DMA_ENGINE
 #define async_tx_issue_pending_all dma_issue_pending_all
 #ifdef CONFIG_ARCH_HAS_ASYNC_TX_FIND_CHANNEL
@@ -73,8 +89,8 @@ enum async_tx_flags {
 #define async_tx_find_channel(dep, type, dst, dst_count, src, src_count, len) \
 	 __async_tx_find_channel(dep, type)
 struct dma_chan *
-__async_tx_find_channel(struct dma_async_tx_descriptor *depend_tx,
-	enum dma_transaction_type tx_type);
+__async_tx_find_channel(struct async_submit_ctl *submit,
+			enum dma_transaction_type tx_type);
 #endif /* CONFIG_ARCH_HAS_ASYNC_TX_FIND_CHANNEL */
 #else
 static inline void async_tx_issue_pending_all(void)
@@ -83,9 +99,10 @@ static inline void async_tx_issue_pending_all(void)
 }
 
 static inline struct dma_chan *
-async_tx_find_channel(struct dma_async_tx_descriptor *depend_tx,
-	enum dma_transaction_type tx_type, struct page **dst, int dst_count,
-	struct page **src, int src_count, size_t len)
+async_tx_find_channel(struct async_submit_ctl *submit,
+		      enum dma_transaction_type tx_type, struct page **dst,
+		      int dst_count, struct page **src, int src_count,
+		      size_t len)
 {
 	return NULL;
 }
@@ -97,46 +114,53 @@ async_tx_find_channel(struct dma_async_tx_descriptor *depend_tx,
  * @cb_fn_param: parameter to pass to the callback routine
  */
 static inline void
-async_tx_sync_epilog(dma_async_tx_callback cb_fn, void *cb_fn_param)
+async_tx_sync_epilog(struct async_submit_ctl *submit)
+{
+	if (submit->cb_fn)
+		submit->cb_fn(submit->cb_param);
+}
+
+typedef union {
+	unsigned long addr;
+	struct page *page;
+	dma_addr_t dma;
+} addr_conv_t;
+
+static inline void
+init_async_submit(struct async_submit_ctl *args, enum async_tx_flags flags,
+		  struct dma_async_tx_descriptor *tx,
+		  dma_async_tx_callback cb_fn, void *cb_param,
+		  addr_conv_t *scribble)
 {
-	if (cb_fn)
-		cb_fn(cb_fn_param);
+	args->flags = flags;
+	args->depend_tx = tx;
+	args->cb_fn = cb_fn;
+	args->cb_param = cb_param;
+	args->scribble = scribble;
 }
 
-void
-async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
-	enum async_tx_flags flags, struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_fn_param);
+void async_tx_submit(struct dma_chan *chan, struct dma_async_tx_descriptor *tx,
+		     struct async_submit_ctl *submit);
 
 struct dma_async_tx_descriptor *
 async_xor(struct page *dest, struct page **src_list, unsigned int offset,
-	int src_cnt, size_t len, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_fn_param);
+	  int src_cnt, size_t len, struct async_submit_ctl *submit);
 
 struct dma_async_tx_descriptor *
-async_xor_val(struct page *dest, struct page **src_list,
-	unsigned int offset, int src_cnt, size_t len,
-	u32 *result, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_fn_param);
+async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
+	      int src_cnt, size_t len, u32 *result,
+	      struct async_submit_ctl *submit);
 
 struct dma_async_tx_descriptor *
 async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
-	unsigned int src_offset, size_t len, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_fn_param);
+	     unsigned int src_offset, size_t len,
+	     struct async_submit_ctl *submit);
 
 struct dma_async_tx_descriptor *
 async_memset(struct page *dest, int val, unsigned int offset,
-	size_t len, enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_fn_param);
+	     size_t len, struct async_submit_ctl *submit);
 
-struct dma_async_tx_descriptor *
-async_trigger_callback(enum async_tx_flags flags,
-	struct dma_async_tx_descriptor *depend_tx,
-	dma_async_tx_callback cb_fn, void *cb_fn_param);
+struct dma_async_tx_descriptor *async_trigger_callback(struct async_submit_ctl *submit);
 
 void async_tx_quiesce(struct dma_async_tx_descriptor **tx);
 #endif /* _ASYNC_TX_H_ */


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/11] async_tx: structify submission arguments, add scribble
  2009-05-19  0:59 ` [PATCH v2 03/11] async_tx: structify submission arguments, add scribble Dan Williams
@ 2009-05-20  8:06   ` Andre Noll
  2009-05-20 18:19     ` Dan Williams
       [not found]   ` <f12847240905250321v774c4e8dscd7a466cd2e61168@mail.gmail.com>
  1 sibling, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-20  8:06 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1751 bytes --]

On Mon, May 18, 2009 at 05:59:41PM -0700, Dan Williams wrote:

> diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
> index 7117ec6..c9342ae 100644
> --- a/crypto/async_tx/async_memcpy.c
> +++ b/crypto/async_tx/async_memcpy.c
> @@ -35,26 +35,23 @@
>   * @src: src page
>   * @offset: offset in pages to start transaction
>   * @len: length in bytes
> - * @flags: ASYNC_TX_ACK
> - * @depend_tx: memcpy depends on the result of this transaction
> - * @cb_fn: function to call when the memcpy completes
> - * @cb_param: parameter to pass to the callback routine
> + * @submit: submission / completion modifiers
>   */
>  struct dma_async_tx_descriptor *
>  async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
> -	unsigned int src_offset, size_t len, enum async_tx_flags flags,
> -	struct dma_async_tx_descriptor *depend_tx,
> -	dma_async_tx_callback cb_fn, void *cb_param)
> +	     unsigned int src_offset, size_t len,
> +	     struct async_submit_ctl *submit)
>  {

The third parameter is called "dest_offset", but the comment refers to
"@offset".

> +struct async_submit_ctl {
> +	enum async_tx_flags flags;
> +	struct dma_async_tx_descriptor *depend_tx;
> +	dma_async_tx_callback cb_fn;
> +	void *cb_param;
> +	void *scribble;
> +};

Can't scribble be of type addr_conv_t *?

Apart from these two minor issues the patch looks really nice. Thanks
for taking the time to combine the common submission parameters to
the new structure. This improves readability of the code and makes
adaptation to future needs easier.

Signed-off-by: Andre Noll <maan@systemlinux.org>

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/11] async_tx: structify submission arguments, add scribble
  2009-05-20  8:06   ` Andre Noll
@ 2009-05-20 18:19     ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-20 18:19 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Wed, May 20, 2009 at 1:06 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Mon, May 18, 2009 at 05:59:41PM -0700, Dan Williams wrote:
>
>> diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
>> index 7117ec6..c9342ae 100644
>> --- a/crypto/async_tx/async_memcpy.c
>> +++ b/crypto/async_tx/async_memcpy.c
>> @@ -35,26 +35,23 @@
>>   * @src: src page
>>   * @offset: offset in pages to start transaction
>>   * @len: length in bytes
>> - * @flags: ASYNC_TX_ACK
>> - * @depend_tx: memcpy depends on the result of this transaction
>> - * @cb_fn: function to call when the memcpy completes
>> - * @cb_param: parameter to pass to the callback routine
>> + * @submit: submission / completion modifiers
>>   */
>>  struct dma_async_tx_descriptor *
>>  async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
>> -     unsigned int src_offset, size_t len, enum async_tx_flags flags,
>> -     struct dma_async_tx_descriptor *depend_tx,
>> -     dma_async_tx_callback cb_fn, void *cb_param)
>> +          unsigned int src_offset, size_t len,
>> +          struct async_submit_ctl *submit)
>>  {
>
> The third parameter is called "dest_offset", but the comment refers to
> "@offset".

...and I was missing a comment for src_offset.  fixed.

>
>> +struct async_submit_ctl {
>> +     enum async_tx_flags flags;
>> +     struct dma_async_tx_descriptor *depend_tx;
>> +     dma_async_tx_callback cb_fn;
>> +     void *cb_param;
>> +     void *scribble;
>> +};
>
> Can't scribble be of type addr_conv_t *?

It could, but that would require casting it when it is used.  I really
only added the addr_conv_t type to get a compiler warning if someone
inadvertently swaps the cb_param and scribble arguments to
init_async_submit().  There is no benefit for type safety once we go
to use it, and this will allow me to drop that unnecessary cast from
void * that you identified in [PATCH 04/11].

> Apart from these two minor issues the patch looks really nice. Thanks
> for taking the time to combine the common submission parameters to
> the new structure. This improves readability of the code and makes
> adaptation to future needs easier.
>
> Signed-off-by: Andre Noll <maan@systemlinux.org>

Thanks!

--
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250321v774c4e8dscd7a466cd2e61168@mail.gmail.com>]

* RE: [PATCH v2 03/11] async_tx: structify submission arguments, add scribble
       [not found]   ` <f12847240905250321v774c4e8dscd7a466cd2e61168@mail.gmail.com>
@ 2009-05-29 13:41     ` Sosnowski, Maciej
  2009-06-03 19:05       ` Dan Williams
  0 siblings, 1 reply; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:41 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> Prepare the api for the arrival of a new parameter, 'scribble'.  This
> will allow callers to identify scratchpad memory for dma address or page
> address conversions.  As this adds yet another parameter, take this
> opportunity to convert the common submission parameters (flags,
> dependency, callback, and callback argument) into an object that is
> passed by reference.
> 
> [ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  crypto/async_tx/async_memcpy.c |   21 ++++-----
>  crypto/async_tx/async_memset.c |   23 ++++------
>  crypto/async_tx/async_tx.c     |   34 +++++++--------
>  crypto/async_tx/async_xor.c    |   93 +++++++++++++++++-----------------------
>  drivers/md/raid5.c             |   59 +++++++++++++++----------
>  include/linux/async_tx.h       |   84 +++++++++++++++++++++++-------------
>  6 files changed, 161 insertions(+), 153 deletions(-)

(...)

> @@ -811,13 +818,11 @@ ops_run_postxor(struct stripe_head *sh, struct
> dma_async_tx_descriptor *tx)
> 
>        atomic_inc(&sh->count);
> 
> -       if (unlikely(count == 1)) {
> -               flags &= ~(ASYNC_TX_XOR_DROP_DST | ASYNC_TX_XOR_ZERO_DST);
> -               tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE,
> -                       flags, tx, ops_complete_postxor, sh);
> -       } else
> -               tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
> -                       flags, tx, ops_complete_postxor, sh);
> +       init_async_submit(&submit, flags, tx, ops_complete_postxor, sh, NULL);
> +       if (unlikely(count == 1))
> +               tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
> +       else
> +               tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
>  }

What about ASYNC_TX_XOR_DROP_DST and ASYNC_TX_XOR_ZERO_DST flags clearing before async_memcpy?

Maciej--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/11] async_tx: structify submission arguments, add scribble
  2009-05-29 13:41     ` Sosnowski, Maciej
@ 2009-06-03 19:05       ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-06-03 19:05 UTC (permalink / raw)
  To: Sosnowski, Maciej; +Cc: linux-raid@vger.kernel.org

2009/5/29 Sosnowski, Maciej <maciej.sosnowski@intel.com>:
> Dan Williams wrote:
>> Prepare the api for the arrival of a new parameter, 'scribble'.  This
>> will allow callers to identify scratchpad memory for dma address or page
>> address conversions.  As this adds yet another parameter, take this
>> opportunity to convert the common submission parameters (flags,
>> dependency, callback, and callback argument) into an object that is
>> passed by reference.
>>
>> [ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  crypto/async_tx/async_memcpy.c |   21 ++++-----
>>  crypto/async_tx/async_memset.c |   23 ++++------
>>  crypto/async_tx/async_tx.c     |   34 +++++++--------
>>  crypto/async_tx/async_xor.c    |   93 +++++++++++++++++-----------------------
>>  drivers/md/raid5.c             |   59 +++++++++++++++----------
>>  include/linux/async_tx.h       |   84 +++++++++++++++++++++++-------------
>>  6 files changed, 161 insertions(+), 153 deletions(-)
>
> (...)
>
>> @@ -811,13 +818,11 @@ ops_run_postxor(struct stripe_head *sh, struct
>> dma_async_tx_descriptor *tx)
>>
>>        atomic_inc(&sh->count);
>>
>> -       if (unlikely(count == 1)) {
>> -               flags &= ~(ASYNC_TX_XOR_DROP_DST | ASYNC_TX_XOR_ZERO_DST);
>> -               tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE,
>> -                       flags, tx, ops_complete_postxor, sh);
>> -       } else
>> -               tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
>> -                       flags, tx, ops_complete_postxor, sh);
>> +       init_async_submit(&submit, flags, tx, ops_complete_postxor, sh, NULL);
>> +       if (unlikely(count == 1))
>> +               tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
>> +       else
>> +               tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
>>  }
>
> What about ASYNC_TX_XOR_DROP_DST and ASYNC_TX_XOR_ZERO_DST flags clearing before async_memcpy?
>

Not necessary.  The routines ignore the flags that are not relevant.
However, the relevant flags for each routine used to be documented in
the kerneldoc description, but those lines got removed when the submit
parameter was added.  I'll go back and add notes to each routine about
which flags are honored and fix up compliance with the kerneldoc
format.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (2 preceding siblings ...)
  2009-05-19  0:59 ` [PATCH v2 03/11] async_tx: structify submission arguments, add scribble Dan Williams
@ 2009-05-19  0:59 ` Dan Williams
  2009-05-20  8:08   ` Andre Noll
       [not found]   ` <f12847240905250320w523fc657w3bca47f23442f46e@mail.gmail.com>
  2009-05-19  0:59 ` [PATCH v2 05/11] md/raid5: add scribble region for buffer lists Dan Williams
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

async_xor() needs space to perform dma and page address conversions.  In
most cases the code can simply reuse the struct page * array because the
size of the native pointer matches the size of a dma/page address.  In
order to support archs where sizeof(dma_addr_t) is larger than
sizeof(struct page *), or to preserve the input parameters, we utilize a
memory region passed in by the caller.

Since the code is now prepared to handle the case where it cannot
perform address conversions on the stack, we no longer need the
!HIGHMEM64G dependency in drivers/dma/Kconfig.

[ Impact: don't clobber input buffers for address conversions ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 crypto/async_tx/async_xor.c |   59 ++++++++++++++++++++-----------------------
 drivers/dma/Kconfig         |    2 +
 2 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 6290d05..3caecdd 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -33,11 +33,10 @@
 /* do_async_xor - dma map the pages and perform the xor with an engine */
 static __async_inline struct dma_async_tx_descriptor *
 do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
-	     unsigned int offset, int src_cnt, size_t len,
+	     unsigned int offset, int src_cnt, size_t len, dma_addr_t *dma_src,
 	     struct async_submit_ctl *submit)
 {
 	struct dma_device *dma = chan->device;
-	dma_addr_t *dma_src = (dma_addr_t *) src_list;
 	struct dma_async_tx_descriptor *tx = NULL;
 	int src_off = 0;
 	int i;
@@ -125,9 +124,14 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
 	int xor_src_cnt;
 	int src_off = 0;
 	void *dest_buf;
-	void **srcs = (void **) src_list;
+	void **srcs;
 
-	/* reuse the 'src_list' array to convert to buffer pointers */
+	if (submit->scribble)
+		srcs = (void **) submit->scribble;
+	else
+		srcs = (void **) src_list;
+
+	/* convert to buffer pointers */
 	for (i = 0; i < src_cnt; i++)
 		srcs[i] = page_address(src_list[i]) + offset;
 
@@ -171,17 +175,26 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
 	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
 						      &dest, 1, src_list,
 						      src_cnt, len);
+	dma_addr_t *dma_src = NULL;
+
 	BUG_ON(src_cnt <= 1);
 
-	if (chan) {
+	if (submit->scribble)
+		dma_src = submit->scribble;
+	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
+		dma_src = (dma_addr_t *) src_list;
+
+	if (dma_src && chan) {
 		/* run the xor asynchronously */
 		pr_debug("%s (async): len: %zu\n", __func__, len);
 
 		return do_async_xor(chan, dest, src_list, offset, src_cnt, len,
-				    submit);
+				    dma_src, submit);
 	} else {
 		/* run the xor synchronously */
 		pr_debug("%s (sync): len: %zu\n", __func__, len);
+		WARN_ONCE(chan, "%s: no space for dma address conversion\n",
+			  __func__);
 
 		/* in the sync case the dest is an implied source
 		 * (assumes the dest is the first source)
@@ -229,11 +242,16 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 						      src_cnt, len);
 	struct dma_device *device = chan ? chan->device : NULL;
 	struct dma_async_tx_descriptor *tx = NULL;
+	dma_addr_t *dma_src = NULL;
 
 	BUG_ON(src_cnt <= 1);
 
-	if (device && src_cnt <= device->max_xor) {
-		dma_addr_t *dma_src = (dma_addr_t *) src_list;
+	if (submit->scribble)
+		dma_src = submit->scribble;
+	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
+		dma_src = (dma_addr_t *) src_list;
+
+	if (dma_src && device && src_cnt <= device->max_xor) {
 		unsigned long dma_prep_flags;
 		int i;
 
@@ -263,6 +281,8 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 		enum async_tx_flags flags_orig = submit->flags;
 
 		pr_debug("%s: (sync) len: %zu\n", __func__, len);
+		WARN_ONCE(device && src_cnt <= device->max_xor,
+			  "%s: no space for dma address conversion\n", __func__);
 
 		submit->flags |= ASYNC_TX_XOR_DROP_DST;
 		submit->flags &= ~ASYNC_TX_ACK;
@@ -281,29 +301,6 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 }
 EXPORT_SYMBOL_GPL(async_xor_val);
 
-static int __init async_xor_init(void)
-{
-	#ifdef CONFIG_DMA_ENGINE
-	/* To conserve stack space the input src_list (array of page pointers)
-	 * is reused to hold the array of dma addresses passed to the driver.
-	 * This conversion is only possible when dma_addr_t is less than the
-	 * the size of a pointer.  HIGHMEM64G is known to violate this
-	 * assumption.
-	 */
-	BUILD_BUG_ON(sizeof(dma_addr_t) > sizeof(struct page *));
-	#endif
-
-	return 0;
-}
-
-static void __exit async_xor_exit(void)
-{
-	do { } while (0);
-}
-
-module_init(async_xor_init);
-module_exit(async_xor_exit);
-
 MODULE_AUTHOR("Intel Corporation");
 MODULE_DESCRIPTION("asynchronous xor/xor-zero-sum api");
 MODULE_LICENSE("GPL");
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 3b3c01b..912a51b 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -4,7 +4,7 @@
 
 menuconfig DMADEVICES
 	bool "DMA Engine support"
-	depends on !HIGHMEM64G && HAS_DMA
+	depends on HAS_DMA
 	help
 	  DMA engines can do asynchronous data transfers without
 	  involving the host CPU.  Currently, this framework can be


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-19  0:59 ` [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region Dan Williams
@ 2009-05-20  8:08   ` Andre Noll
  2009-05-20 18:35     ` Dan Williams
       [not found]   ` <f12847240905250320w523fc657w3bca47f23442f46e@mail.gmail.com>
  1 sibling, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-20  8:08 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 2318 bytes --]

On Mon, May 18, 2009 at 05:59:46PM -0700, Dan Williams wrote:

> diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
> index 6290d05..3caecdd 100644
> --- a/crypto/async_tx/async_xor.c
> +++ b/crypto/async_tx/async_xor.c
> @@ -33,11 +33,10 @@
>  /* do_async_xor - dma map the pages and perform the xor with an engine */
>  static __async_inline struct dma_async_tx_descriptor *
>  do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
> -	     unsigned int offset, int src_cnt, size_t len,
> +	     unsigned int offset, int src_cnt, size_t len, dma_addr_t *dma_src,
>  	     struct async_submit_ctl *submit)
>  {
>  	struct dma_device *dma = chan->device;
> -	dma_addr_t *dma_src = (dma_addr_t *) src_list;
>  	struct dma_async_tx_descriptor *tx = NULL;
>  	int src_off = 0;
>  	int i;
> @@ -125,9 +124,14 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
>  	int xor_src_cnt;
>  	int src_off = 0;
>  	void *dest_buf;
> -	void **srcs = (void **) src_list;
> +	void **srcs;
>  
> -	/* reuse the 'src_list' array to convert to buffer pointers */
> +	if (submit->scribble)
> +		srcs = (void **) submit->scribble;

Unnecessary cast as submit->scribble is void *.

> @@ -171,17 +175,26 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
>  	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
>  						      &dest, 1, src_list,
>  						      src_cnt, len);
> +	dma_addr_t *dma_src = NULL;
> +
>  	BUG_ON(src_cnt <= 1);
>  
> -	if (chan) {
> +	if (submit->scribble)
> +		dma_src = submit->scribble;
> +	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
> +		dma_src = (dma_addr_t *) src_list;
> +
> +	if (dma_src && chan) {
>  		/* run the xor asynchronously */
>  		pr_debug("%s (async): len: %zu\n", __func__, len);
>  
>  		return do_async_xor(chan, dest, src_list, offset, src_cnt, len,
> -				    submit);
> +				    dma_src, submit);
>  	} else {

Don't we need to fall back to sync xor if src_cnt exceeds
what the device can handle, i.e. if it is larger than
chan->device->max_xor? async_xor_val() further down has a check for
this condition.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-20  8:08   ` Andre Noll
@ 2009-05-20 18:35     ` Dan Williams
  2009-05-20 19:09       ` Andre Noll
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-20 18:35 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Wed, May 20, 2009 at 1:08 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Mon, May 18, 2009 at 05:59:46PM -0700, Dan Williams wrote:
>
>> diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
>> @@ -125,9 +124,14 @@ do_sync_xor(struct page *dest, struct page **src_list, unsigned int offset,
>>       int xor_src_cnt;
>>       int src_off = 0;
>>       void *dest_buf;
>> -     void **srcs = (void **) src_list;
>> +     void **srcs;
>>
>> -     /* reuse the 'src_list' array to convert to buffer pointers */
>> +     if (submit->scribble)
>> +             srcs = (void **) submit->scribble;
>
> Unnecessary cast as submit->scribble is void *.

fixed.

>
>> @@ -171,17 +175,26 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
>>       struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR,
>>                                                     &dest, 1, src_list,
>>                                                     src_cnt, len);
>> +     dma_addr_t *dma_src = NULL;
>> +
>>       BUG_ON(src_cnt <= 1);
>>
>> -     if (chan) {
>> +     if (submit->scribble)
>> +             dma_src = submit->scribble;
>> +     else if (sizeof(dma_addr_t) <= sizeof(struct page *))
>> +             dma_src = (dma_addr_t *) src_list;
>> +
>> +     if (dma_src && chan) {
>>               /* run the xor asynchronously */
>>               pr_debug("%s (async): len: %zu\n", __func__, len);
>>
>>               return do_async_xor(chan, dest, src_list, offset, src_cnt, len,
>> -                                 submit);
>> +                                 dma_src, submit);
>>       } else {
>
> Don't we need to fall back to sync xor if src_cnt exceeds
> what the device can handle, i.e. if it is larger than
> chan->device->max_xor? async_xor_val() further down has a check for
> this condition.

No, we don't need this check for async_xor().  The asynchronous path
of async_xor_val() has the constraint that it must be able to validate
an xor block without writing to that block i.e. it is a read-only
operation.  The synchronous path recalculates the xor with the
destination as a source and then does a memcmp() to validate that the
new result is zero.

To support more than ->max_xor sources for async_xor_val() we would
need an engine that supported hardware continuation of validate
operations (i.e. an engine with an internal buffer for the
intermediate xor result) otherwise we will need to store an
intermediate xor result in system memory which is no better than the
synchronous path.

For async_xor() we are always allowed to write the destination, so we
can reuse it as a source to continue the calculation of the xor
result.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-20 18:35     ` Dan Williams
@ 2009-05-20 19:09       ` Andre Noll
  2009-05-22  8:29         ` Andre Noll
  0 siblings, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-20 19:09 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1485 bytes --]

On 11:35, Dan Williams wrote:

> > Don't we need to fall back to sync xor if src_cnt exceeds
> > what the device can handle, i.e. if it is larger than
> > chan->device->max_xor? async_xor_val() further down has a check for
> > this condition.
> 
> No, we don't need this check for async_xor().  The asynchronous path
> of async_xor_val() has the constraint that it must be able to validate
> an xor block without writing to that block i.e. it is a read-only
> operation.  The synchronous path recalculates the xor with the
> destination as a source and then does a memcmp() to validate that the
> new result is zero.
> 
> To support more than ->max_xor sources for async_xor_val() we would
> need an engine that supported hardware continuation of validate
> operations (i.e. an engine with an internal buffer for the
> intermediate xor result) otherwise we will need to store an
> intermediate xor result in system memory which is no better than the
> synchronous path.
> 
> For async_xor() we are always allowed to write the destination, so we
> can reuse it as a source to continue the calculation of the xor
> result.

I see. Thanks for this detailed explanation!

BTW: So far I've only glanced at patches 7-11 of your series. I will
review these patches this evening and tomorrow, so I'll send another
batch of comments/questions/sign-offs by Friday.

Robinson^WAndre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-20 19:09       ` Andre Noll
@ 2009-05-22  8:29         ` Andre Noll
  2009-05-22 17:25           ` Dan Williams
  0 siblings, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-22  8:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 524 bytes --]

On Wed, May 20, 2009 at 09:09:28PM +0200, Andre Noll wrote:

> BTW: So far I've only glanced at patches 7-11 of your series. I will
> review these patches this evening and tomorrow, so I'll send another
> batch of comments/questions/sign-offs by Friday.

I've just sent a couple of review comments for the remaining patches.
Since I found only minor issues, feel free to add my sign-off to all
patches of the series.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-22  8:29         ` Andre Noll
@ 2009-05-22 17:25           ` Dan Williams
  2009-05-25  7:55             ` Andre Noll
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-22 17:25 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Fri, May 22, 2009 at 1:29 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Wed, May 20, 2009 at 09:09:28PM +0200, Andre Noll wrote:
>
>> BTW: So far I've only glanced at patches 7-11 of your series. I will
>> review these patches this evening and tomorrow, so I'll send another
>> batch of comments/questions/sign-offs by Friday.
>
> I've just sent a couple of review comments for the remaining patches.
> Since I found only minor issues, feel free to add my sign-off to all
> patches of the series.

Thanks it is very much appreciated.

Strictly speaking your "-by" line in this scenario should be either
Acked-by or Reviewed-by.  Signed-off-by is reserved for people who had
the patch in their possession on its way to mainline.  Any heartburn
if I use Reviewed-by?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
  2009-05-22 17:25           ` Dan Williams
@ 2009-05-25  7:55             ` Andre Noll
  0 siblings, 0 replies; 45+ messages in thread
From: Andre Noll @ 2009-05-25  7:55 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 413 bytes --]

On 10:25, Dan Williams wrote:

> Strictly speaking your "-by" line in this scenario should be either
> Acked-by or Reviewed-by.  Signed-off-by is reserved for people who had
> the patch in their possession on its way to mainline.  Any heartburn
> if I use Reviewed-by?

No heartburns, just use Reviewed-by.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250320w523fc657w3bca47f23442f46e@mail.gmail.com>]

* RE: [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region
       [not found]   ` <f12847240905250320w523fc657w3bca47f23442f46e@mail.gmail.com>
@ 2009-05-29 13:41     ` Sosnowski, Maciej
  0 siblings, 0 replies; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:41 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> async_xor() needs space to perform dma and page address conversions.  In
> most cases the code can simply reuse the struct page * array because the
> size of the native pointer matches the size of a dma/page address.  In
> order to support archs where sizeof(dma_addr_t) is larger than
> sizeof(struct page *), or to preserve the input parameters, we utilize a
> memory region passed in by the caller.
> 
> Since the code is now prepared to handle the case where it cannot
> perform address conversions on the stack, we no longer need the
> !HIGHMEM64G dependency in drivers/dma/Kconfig.
> 
> [ Impact: don't clobber input buffers for address conversions ]
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  crypto/async_tx/async_xor.c |   59 ++++++++++++++++++++-----------------------
>  drivers/dma/Kconfig         |    2 +
>  2 files changed, 29 insertions(+), 32 deletions(-)

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 05/11] md/raid5: add scribble region for buffer lists
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (3 preceding siblings ...)
  2009-05-19  0:59 ` [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region Dan Williams
@ 2009-05-19  0:59 ` Dan Williams
  2009-05-20  8:09   ` Andre Noll
  2009-06-04  6:11   ` Neil Brown
  2009-05-19  0:59 ` [PATCH v2 06/11] async_tx: add sum check flags Dan Williams
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

Hang some memory off of each stripe_head which can be used for storing
the buffer lists used in parity calculations.  Include space for dma
address conversions and pass that to async_tx via the
async_submit_ctl.scribble pointer.

[ Impact: move memory pressure from stack to heap ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/md/raid5.c |   61 ++++++++++++++++++++++++++++++++++++++++++----------
 drivers/md/raid5.h |    5 ++++
 2 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e1920f2..0e456a6 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -275,6 +275,9 @@ static void shrink_buffers(struct stripe_head *sh, int num)
 	struct page *p;
 	int i;
 
+	kfree(sh->scribble);
+	sh->scribble = NULL;
+
 	for (i=0; i<num ; i++) {
 		p = sh->dev[i].page;
 		if (!p)
@@ -284,10 +287,26 @@ static void shrink_buffers(struct stripe_head *sh, int num)
 	}
 }
 
+static size_t scribble_len(int num)
+{
+	size_t len;
+
+	/* return enough space for an array of page pointers and dma
+	 * addresses for the ddf raid6 layout
+	 */
+	len = sizeof(struct page *) * (num+2) + sizeof(addr_conv_t) * (num+2);
+
+	return len;
+}
+
 static int grow_buffers(struct stripe_head *sh, int num)
 {
 	int i;
 
+	sh->scribble = kmalloc(scribble_len(num), GFP_KERNEL);
+	if (!sh->scribble)
+		return 1;
+
 	for (i=0; i<num; i++) {
 		struct page *page;
 
@@ -641,11 +660,16 @@ static void ops_complete_compute5(void *stripe_head_ref)
 	release_stripe(sh);
 }
 
+/* return a pointer to the address conversion region of the scribble buffer */
+static addr_conv_t *sh_to_addr_conv(struct stripe_head *sh)
+{
+	return sh->scribble + sizeof(struct page *) * (sh->disks + 2);
+}
+
 static struct dma_async_tx_descriptor *ops_run_compute5(struct stripe_head *sh)
 {
-	/* kernel stack size limits the total number of disks */
 	int disks = sh->disks;
-	struct page *xor_srcs[disks];
+	struct page **xor_srcs = sh->scribble;
 	int target = sh->ops.target;
 	struct r5dev *tgt = &sh->dev[target];
 	struct page *xor_dest = tgt->page;
@@ -665,7 +689,7 @@ static struct dma_async_tx_descriptor *ops_run_compute5(struct stripe_head *sh)
 	atomic_inc(&sh->count);
 
 	init_async_submit(&submit, ASYNC_TX_XOR_ZERO_DST, NULL,
-			  ops_complete_compute5, sh, NULL);
+			  ops_complete_compute5, sh, sh_to_addr_conv(sh));
 	if (unlikely(count == 1))
 		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
 	else
@@ -685,9 +709,8 @@ static void ops_complete_prexor(void *stripe_head_ref)
 static struct dma_async_tx_descriptor *
 ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 {
-	/* kernel stack size limits the total number of disks */
 	int disks = sh->disks;
-	struct page *xor_srcs[disks];
+	struct page **xor_srcs = sh->scribble;
 	int count = 0, pd_idx = sh->pd_idx, i;
 	struct async_submit_ctl submit;
 
@@ -705,7 +728,7 @@ ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 	}
 
 	init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST, tx,
-			  ops_complete_prexor, sh, NULL);
+			  ops_complete_prexor, sh, sh_to_addr_conv(sh));
 	tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
 
 	return tx;
@@ -776,9 +799,8 @@ static void ops_complete_postxor(void *stripe_head_ref)
 static void
 ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 {
-	/* kernel stack size limits the total number of disks */
 	int disks = sh->disks;
-	struct page *xor_srcs[disks];
+	struct page **xor_srcs = sh->scribble;
 	struct async_submit_ctl submit;
 	int count = 0, pd_idx = sh->pd_idx, i;
 	struct page *xor_dest;
@@ -818,7 +840,8 @@ ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 
 	atomic_inc(&sh->count);
 
-	init_async_submit(&submit, flags, tx, ops_complete_postxor, sh, NULL);
+	init_async_submit(&submit, flags, tx, ops_complete_postxor, sh,
+			  sh_to_addr_conv(sh));
 	if (unlikely(count == 1))
 		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
 	else
@@ -839,9 +862,8 @@ static void ops_complete_check(void *stripe_head_ref)
 
 static void ops_run_check(struct stripe_head *sh)
 {
-	/* kernel stack size limits the total number of disks */
 	int disks = sh->disks;
-	struct page *xor_srcs[disks];
+	struct page **xor_srcs = sh->scribble;
 	struct dma_async_tx_descriptor *tx;
 	struct async_submit_ctl submit;
 
@@ -857,7 +879,7 @@ static void ops_run_check(struct stripe_head *sh)
 			xor_srcs[count++] = dev->page;
 	}
 
-	init_async_submit(&submit, 0, NULL, NULL, NULL, NULL);
+	init_async_submit(&submit, 0, NULL, NULL, NULL, sh_to_addr_conv(sh));
 	tx = async_xor_val(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
 			   &sh->ops.zero_sum_result, &submit);
 
@@ -871,6 +893,7 @@ static void raid5_run_ops(struct stripe_head *sh, unsigned long ops_request)
 	int overlap_clear = 0, i, disks = sh->disks;
 	struct dma_async_tx_descriptor *tx = NULL;
 
+	mutex_lock(&sh->scribble_lock);
 	if (test_bit(STRIPE_OP_BIOFILL, &ops_request)) {
 		ops_run_biofill(sh);
 		overlap_clear++;
@@ -903,6 +926,7 @@ static void raid5_run_ops(struct stripe_head *sh, unsigned long ops_request)
 			if (test_and_clear_bit(R5_Overlap, &dev->flags))
 				wake_up(&sh->raid_conf->wait_for_overlap);
 		}
+	mutex_unlock(&sh->scribble_lock);
 }
 
 static int grow_one_stripe(raid5_conf_t *conf)
@@ -914,6 +938,7 @@ static int grow_one_stripe(raid5_conf_t *conf)
 	memset(sh, 0, sizeof(*sh) + (conf->raid_disks-1)*sizeof(struct r5dev));
 	sh->raid_conf = conf;
 	spin_lock_init(&sh->lock);
+	mutex_init(&sh->scribble_lock);
 
 	if (grow_buffers(sh, conf->raid_disks)) {
 		shrink_buffers(sh, conf->raid_disks);
@@ -1007,6 +1032,7 @@ static int resize_stripes(raid5_conf_t *conf, int newsize)
 
 		nsh->raid_conf = conf;
 		spin_lock_init(&nsh->lock);
+		mutex_init(&nsh->scribble_lock);
 
 		list_add(&nsh->lru, &newstripes);
 	}
@@ -1038,6 +1064,7 @@ static int resize_stripes(raid5_conf_t *conf, int newsize)
 			nsh->dev[i].page = osh->dev[i].page;
 		for( ; i<newsize; i++)
 			nsh->dev[i].page = NULL;
+		nsh->scribble = osh->scribble;
 		kmem_cache_free(conf->slab_cache, osh);
 	}
 	kmem_cache_destroy(conf->slab_cache);
@@ -1058,8 +1085,18 @@ static int resize_stripes(raid5_conf_t *conf, int newsize)
 
 	/* Step 4, return new stripes to service */
 	while(!list_empty(&newstripes)) {
+		void *scribble;
+
 		nsh = list_entry(newstripes.next, struct stripe_head, lru);
 		list_del_init(&nsh->lru);
+
+		scribble = kmalloc(scribble_len(newsize), GFP_NOIO);
+		if (scribble) {
+			kfree(nsh->scribble);
+			nsh->scribble = scribble;
+		} else
+			err = -ENOMEM;
+
 		for (i=conf->raid_disks; i < newsize; i++)
 			if (nsh->dev[i].page == NULL) {
 				struct page *p = alloc_page(GFP_NOIO);
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 52ba999..6ab0ccd 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -211,6 +211,11 @@ struct stripe_head {
 	int			disks;		/* disks in stripe */
 	enum check_states	check_state;
 	enum reconstruct_states reconstruct_state;
+	void			*scribble;	/* space for constructing buffer
+						 * lists and performing address
+						 * conversions
+						 */
+	struct mutex		scribble_lock; /* no concurrent scribbling */
 	/* stripe_operations
 	 * @target - STRIPE_OP_COMPUTE_BLK target
 	 */


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/11] md/raid5: add scribble region for buffer lists
  2009-05-19  0:59 ` [PATCH v2 05/11] md/raid5: add scribble region for buffer lists Dan Williams
@ 2009-05-20  8:09   ` Andre Noll
  2009-05-20 19:05     ` Dan Williams
  2009-06-04  6:11   ` Neil Brown
  1 sibling, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-20  8:09 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 897 bytes --]

On Mon, May 18, 2009 at 05:59:51PM -0700, Dan Williams wrote:

> +static size_t scribble_len(int num)
> +{
> +	size_t len;
> +
> +	/* return enough space for an array of page pointers and dma
> +	 * addresses for the ddf raid6 layout
> +	 */
> +	len = sizeof(struct page *) * (num+2) + sizeof(addr_conv_t) * (num+2);
> +
> +	return len;
> +}

The comment is a bit misleading as the function only returns the
_amount_ of space needed. It should probably also explain the meaning
of the "+2".

> +/* return a pointer to the address conversion region of the scribble buffer */
> +static addr_conv_t *sh_to_addr_conv(struct stripe_head *sh)
> +{
> +	return sh->scribble + sizeof(struct page *) * (sh->disks + 2);
> +}

Maybe it's safer to return NULL if sh->scribble is NULL.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/11] md/raid5: add scribble region for buffer lists
  2009-05-20  8:09   ` Andre Noll
@ 2009-05-20 19:05     ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-20 19:05 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Wed, May 20, 2009 at 1:09 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Mon, May 18, 2009 at 05:59:51PM -0700, Dan Williams wrote:
>
>> +static size_t scribble_len(int num)
>> +{
>> +     size_t len;
>> +
>> +     /* return enough space for an array of page pointers and dma
>> +      * addresses for the ddf raid6 layout
>> +      */
>> +     len = sizeof(struct page *) * (num+2) + sizeof(addr_conv_t) * (num+2);
>> +
>> +     return len;
>> +}
>
> The comment is a bit misleading as the function only returns the
> _amount_ of space needed. It should probably also explain the meaning
> of the "+2".

Ok I updated this to:
/**
 * scribble_len - return the required size of the scribble region
 * @num - total number of disks in the array
 *
 * The size must be enough to contain:
 * 1/ a struct page pointer for each device in the array +2
 * 2/ room to convert each entry in (1) to its corresponding dma
 *    (dma_map_page()) or page (page_address()) address.
 *
 * Note: the +2 is for the destination buffers of the ddf/raid6 case where we
 * calculate over all devices (not just the data blocks), using zeros in place
 * of the P and Q blocks.
 */

>
>> +/* return a pointer to the address conversion region of the scribble buffer */
>> +static addr_conv_t *sh_to_addr_conv(struct stripe_head *sh)
>> +{
>> +     return sh->scribble + sizeof(struct page *) * (sh->disks + 2);
>> +}
>
> Maybe it's safer to return NULL if sh->scribble is NULL.

...yes, and a big fat warning, because it should never be NULL.

Thanks for the review,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/11] md/raid5: add scribble region for buffer lists
  2009-05-19  0:59 ` [PATCH v2 05/11] md/raid5: add scribble region for buffer lists Dan Williams
  2009-05-20  8:09   ` Andre Noll
@ 2009-06-04  6:11   ` Neil Brown
  2009-06-05 19:19     ` Dan Williams
  1 sibling, 1 reply; 45+ messages in thread
From: Neil Brown @ 2009-06-04  6:11 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-raid, maan, linux-kernel, yur, hpa

On Monday May 18, dan.j.williams@intel.com wrote:
> Hang some memory off of each stripe_head which can be used for storing
> the buffer lists used in parity calculations.  Include space for dma
> address conversions and pass that to async_tx via the
> async_submit_ctl.scribble pointer.
> 
> [ Impact: move memory pressure from stack to heap ]

I've finally had a look at this and I cannot say that I like it.

We don't really need one scribble-buffer per stripe_head.
And in fact, that isn't even enough because you find you need a mutex
to avoid multiple-use.

We really want one scribble-buffer per thread, or per CPU, or
something like that.

You could possibly handle it a bit like ->spare_page, though we cope
with that being NULL some times, and you might not be able to do that
with scribble-buffer.
How do the async-raid6 patches cope with possible multiple users of
->spare_page now that the computations are async and so possible in
parallel? 

Maybe a little mempool would be best?.... though given that in most
cases, the stack solution is really quite adequate it would be good to
make sure the replacement isn't too heavy-weight....

I'm not sure what would be best, but I really don't like the current
proposal.

NeilBrown


> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/md/raid5.c |   61 ++++++++++++++++++++++++++++++++++++++++++----------
>  drivers/md/raid5.h |    5 ++++
>  2 files changed, 54 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index e1920f2..0e456a6 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -275,6 +275,9 @@ static void shrink_buffers(struct stripe_head *sh, int num)
>  	struct page *p;
>  	int i;
>  
> +	kfree(sh->scribble);
> +	sh->scribble = NULL;
> +
>  	for (i=0; i<num ; i++) {
>  		p = sh->dev[i].page;
>  		if (!p)
> @@ -284,10 +287,26 @@ static void shrink_buffers(struct stripe_head *sh, int num)
>  	}
>  }
>  
> +static size_t scribble_len(int num)
> +{
> +	size_t len;
> +
> +	/* return enough space for an array of page pointers and dma
> +	 * addresses for the ddf raid6 layout
> +	 */
> +	len = sizeof(struct page *) * (num+2) + sizeof(addr_conv_t) * (num+2);
> +
> +	return len;
> +}
> +
>  static int grow_buffers(struct stripe_head *sh, int num)
>  {
>  	int i;
>  
> +	sh->scribble = kmalloc(scribble_len(num), GFP_KERNEL);
> +	if (!sh->scribble)
> +		return 1;
> +
>  	for (i=0; i<num; i++) {
>  		struct page *page;
>  
> @@ -641,11 +660,16 @@ static void ops_complete_compute5(void *stripe_head_ref)
>  	release_stripe(sh);
>  }
>  
> +/* return a pointer to the address conversion region of the scribble buffer */
> +static addr_conv_t *sh_to_addr_conv(struct stripe_head *sh)
> +{
> +	return sh->scribble + sizeof(struct page *) * (sh->disks + 2);
> +}
> +
>  static struct dma_async_tx_descriptor *ops_run_compute5(struct stripe_head *sh)
>  {
> -	/* kernel stack size limits the total number of disks */
>  	int disks = sh->disks;
> -	struct page *xor_srcs[disks];
> +	struct page **xor_srcs = sh->scribble;
>  	int target = sh->ops.target;
>  	struct r5dev *tgt = &sh->dev[target];
>  	struct page *xor_dest = tgt->page;
> @@ -665,7 +689,7 @@ static struct dma_async_tx_descriptor *ops_run_compute5(struct stripe_head *sh)
>  	atomic_inc(&sh->count);
>  
>  	init_async_submit(&submit, ASYNC_TX_XOR_ZERO_DST, NULL,
> -			  ops_complete_compute5, sh, NULL);
> +			  ops_complete_compute5, sh, sh_to_addr_conv(sh));
>  	if (unlikely(count == 1))
>  		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
>  	else
> @@ -685,9 +709,8 @@ static void ops_complete_prexor(void *stripe_head_ref)
>  static struct dma_async_tx_descriptor *
>  ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
>  {
> -	/* kernel stack size limits the total number of disks */
>  	int disks = sh->disks;
> -	struct page *xor_srcs[disks];
> +	struct page **xor_srcs = sh->scribble;
>  	int count = 0, pd_idx = sh->pd_idx, i;
>  	struct async_submit_ctl submit;
>  
> @@ -705,7 +728,7 @@ ops_run_prexor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
>  	}
>  
>  	init_async_submit(&submit, ASYNC_TX_XOR_DROP_DST, tx,
> -			  ops_complete_prexor, sh, NULL);
> +			  ops_complete_prexor, sh, sh_to_addr_conv(sh));
>  	tx = async_xor(xor_dest, xor_srcs, 0, count, STRIPE_SIZE, &submit);
>  
>  	return tx;
> @@ -776,9 +799,8 @@ static void ops_complete_postxor(void *stripe_head_ref)
>  static void
>  ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
>  {
> -	/* kernel stack size limits the total number of disks */
>  	int disks = sh->disks;
> -	struct page *xor_srcs[disks];
> +	struct page **xor_srcs = sh->scribble;
>  	struct async_submit_ctl submit;
>  	int count = 0, pd_idx = sh->pd_idx, i;
>  	struct page *xor_dest;
> @@ -818,7 +840,8 @@ ops_run_postxor(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
>  
>  	atomic_inc(&sh->count);
>  
> -	init_async_submit(&submit, flags, tx, ops_complete_postxor, sh, NULL);
> +	init_async_submit(&submit, flags, tx, ops_complete_postxor, sh,
> +			  sh_to_addr_conv(sh));
>  	if (unlikely(count == 1))
>  		tx = async_memcpy(xor_dest, xor_srcs[0], 0, 0, STRIPE_SIZE, &submit);
>  	else
> @@ -839,9 +862,8 @@ static void ops_complete_check(void *stripe_head_ref)
>  
>  static void ops_run_check(struct stripe_head *sh)
>  {
> -	/* kernel stack size limits the total number of disks */
>  	int disks = sh->disks;
> -	struct page *xor_srcs[disks];
> +	struct page **xor_srcs = sh->scribble;
>  	struct dma_async_tx_descriptor *tx;
>  	struct async_submit_ctl submit;
>  
> @@ -857,7 +879,7 @@ static void ops_run_check(struct stripe_head *sh)
>  			xor_srcs[count++] = dev->page;
>  	}
>  
> -	init_async_submit(&submit, 0, NULL, NULL, NULL, NULL);
> +	init_async_submit(&submit, 0, NULL, NULL, NULL, sh_to_addr_conv(sh));
>  	tx = async_xor_val(xor_dest, xor_srcs, 0, count, STRIPE_SIZE,
>  			   &sh->ops.zero_sum_result, &submit);
>  
> @@ -871,6 +893,7 @@ static void raid5_run_ops(struct stripe_head *sh, unsigned long ops_request)
>  	int overlap_clear = 0, i, disks = sh->disks;
>  	struct dma_async_tx_descriptor *tx = NULL;
>  
> +	mutex_lock(&sh->scribble_lock);
>  	if (test_bit(STRIPE_OP_BIOFILL, &ops_request)) {
>  		ops_run_biofill(sh);
>  		overlap_clear++;
> @@ -903,6 +926,7 @@ static void raid5_run_ops(struct stripe_head *sh, unsigned long ops_request)
>  			if (test_and_clear_bit(R5_Overlap, &dev->flags))
>  				wake_up(&sh->raid_conf->wait_for_overlap);
>  		}
> +	mutex_unlock(&sh->scribble_lock);
>  }
>  
>  static int grow_one_stripe(raid5_conf_t *conf)
> @@ -914,6 +938,7 @@ static int grow_one_stripe(raid5_conf_t *conf)
>  	memset(sh, 0, sizeof(*sh) + (conf->raid_disks-1)*sizeof(struct r5dev));
>  	sh->raid_conf = conf;
>  	spin_lock_init(&sh->lock);
> +	mutex_init(&sh->scribble_lock);
>  
>  	if (grow_buffers(sh, conf->raid_disks)) {
>  		shrink_buffers(sh, conf->raid_disks);
> @@ -1007,6 +1032,7 @@ static int resize_stripes(raid5_conf_t *conf, int newsize)
>  
>  		nsh->raid_conf = conf;
>  		spin_lock_init(&nsh->lock);
> +		mutex_init(&nsh->scribble_lock);
>  
>  		list_add(&nsh->lru, &newstripes);
>  	}
> @@ -1038,6 +1064,7 @@ static int resize_stripes(raid5_conf_t *conf, int newsize)
>  			nsh->dev[i].page = osh->dev[i].page;
>  		for( ; i<newsize; i++)
>  			nsh->dev[i].page = NULL;
> +		nsh->scribble = osh->scribble;
>  		kmem_cache_free(conf->slab_cache, osh);
>  	}
>  	kmem_cache_destroy(conf->slab_cache);
> @@ -1058,8 +1085,18 @@ static int resize_stripes(raid5_conf_t *conf, int newsize)
>  
>  	/* Step 4, return new stripes to service */
>  	while(!list_empty(&newstripes)) {
> +		void *scribble;
> +
>  		nsh = list_entry(newstripes.next, struct stripe_head, lru);
>  		list_del_init(&nsh->lru);
> +
> +		scribble = kmalloc(scribble_len(newsize), GFP_NOIO);
> +		if (scribble) {
> +			kfree(nsh->scribble);
> +			nsh->scribble = scribble;
> +		} else
> +			err = -ENOMEM;
> +
>  		for (i=conf->raid_disks; i < newsize; i++)
>  			if (nsh->dev[i].page == NULL) {
>  				struct page *p = alloc_page(GFP_NOIO);
> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
> index 52ba999..6ab0ccd 100644
> --- a/drivers/md/raid5.h
> +++ b/drivers/md/raid5.h
> @@ -211,6 +211,11 @@ struct stripe_head {
>  	int			disks;		/* disks in stripe */
>  	enum check_states	check_state;
>  	enum reconstruct_states reconstruct_state;
> +	void			*scribble;	/* space for constructing buffer
> +						 * lists and performing address
> +						 * conversions
> +						 */
> +	struct mutex		scribble_lock; /* no concurrent scribbling */
>  	/* stripe_operations
>  	 * @target - STRIPE_OP_COMPUTE_BLK target
>  	 */

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/11] md/raid5: add scribble region for buffer lists
  2009-06-04  6:11   ` Neil Brown
@ 2009-06-05 19:19     ` Dan Williams
  2009-06-08 17:25       ` Jody McIntyre
  0 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-06-05 19:19 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid, maan, linux-kernel, yur, hpa

On Wed, Jun 3, 2009 at 11:11 PM, Neil Brown<neilb@suse.de> wrote:
> On Monday May 18, dan.j.williams@intel.com wrote:
>> Hang some memory off of each stripe_head which can be used for storing
>> the buffer lists used in parity calculations.  Include space for dma
>> address conversions and pass that to async_tx via the
>> async_submit_ctl.scribble pointer.
>>
>> [ Impact: move memory pressure from stack to heap ]
>
> I've finally had a look at this and I cannot say that I like it.
>
> We don't really need one scribble-buffer per stripe_head.
> And in fact, that isn't even enough because you find you need a mutex
> to avoid multiple-use.

The mutex is probably not necessary, just need to audit the
stripe_operations state machine to make sure threads don't overlap in
raid_run_ops()...  but the point is mute when we move to per-cpu
resources.

> We really want one scribble-buffer per thread, or per CPU, or
> something like that.

One of the design goals was to prevent the occurrence of the
softlockup watchdog events which seem to trigger on large raid6
resyncs.  A per-cpu scheme would still require preempt_disable() while
the calculation is active, so perhaps we just need a call to
cond_resched() in raid5d to appease the scheduler.

> You could possibly handle it a bit like ->spare_page, though we cope
> with that being NULL some times, and you might not be able to do that
> with scribble-buffer.
> How do the async-raid6 patches cope with possible multiple users of
> ->spare_page now that the computations are async and so possible in
> parallel?

Currently the code just takes a spare page lock.

>
> Maybe a little mempool would be best?.... though given that in most
> cases, the stack solution is really quite adequate it would be good to
> make sure the replacement isn't too heavy-weight....
>
> I'm not sure what would be best, but I really don't like the current
> proposal.
>

I'll take a look at a per-cpu implementation.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/11] md/raid5: add scribble region for buffer lists
  2009-06-05 19:19     ` Dan Williams
@ 2009-06-08 17:25       ` Jody McIntyre
  0 siblings, 0 replies; 45+ messages in thread
From: Jody McIntyre @ 2009-06-08 17:25 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid, maan, linux-kernel, yur, hpa

On Fri, Jun 05, 2009 at 12:19:07PM -0700, Dan Williams wrote:
> One of the design goals was to prevent the occurrence of the
> softlockup watchdog events which seem to trigger on large raid6
> resyncs.  A per-cpu scheme would still require preempt_disable() while
> the calculation is active, so perhaps we just need a call to
> cond_resched() in raid5d to appease the scheduler.

FWIW we added this to the patches shipped with Lustre:

Index: linux-2.6.18-128.1.1/drivers/md/raid5.c
===================================================================
--- linux-2.6.18-128.1.1.orig/drivers/md/raid5.c
+++ linux-2.6.18-128.1.1/drivers/md/raid5.c
@@ -2987,6 +2987,8 @@ static void raid5d (mddev_t *mddev)
 		handle_stripe(sh, conf->spare_page);
 		release_stripe(sh);
 
+		cond_resched();
+
 		spin_lock_irq(&conf->device_lock);
 	}
 	PRINTK("%d stripes handled\n", handled);

I thought most of these issues were gone in more recent kernels, but we
haven't tested RAID on anything other than RHEL 4+5 extensively (Lustre
doesn't support sufficiently new kernels yet.)

Cheers,
Jody

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 06/11] async_tx: add sum check flags
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (4 preceding siblings ...)
  2009-05-19  0:59 ` [PATCH v2 05/11] md/raid5: add scribble region for buffer lists Dan Williams
@ 2009-05-19  0:59 ` Dan Williams
       [not found]   ` <f12847240905200111p54382735v6941b52825cf4d7e@mail.gmail.com>
  2009-05-19  1:00 ` [PATCH v2 07/11] async_tx: kill needless module_{init|exit} Dan Williams
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-19  0:59 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

Replace the flat zero_sum_result with a collection of flags to contain
the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
of DMA_ since these flags will be used on non-dma-zero-sum enabled
platforms.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/arm/include/asm/hardware/iop3xx-adma.h |    5 +++--
 arch/arm/mach-iop13xx/include/mach/adma.h   |   12 +++++++-----
 crypto/async_tx/async_xor.c                 |    4 ++--
 drivers/md/raid5.c                          |    2 +-
 drivers/md/raid5.h                          |    5 +++--
 include/linux/async_tx.h                    |    2 +-
 include/linux/dmaengine.h                   |   21 ++++++++++++++++++++-
 7 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/hardware/iop3xx-adma.h b/arch/arm/include/asm/hardware/iop3xx-adma.h
index 83e6ba3..26eefea 100644
--- a/arch/arm/include/asm/hardware/iop3xx-adma.h
+++ b/arch/arm/include/asm/hardware/iop3xx-adma.h
@@ -756,13 +756,14 @@ static inline void iop_desc_set_block_fill_val(struct iop_adma_desc_slot *desc,
 	hw_desc->src[0] = val;
 }
 
-static inline int iop_desc_get_zero_result(struct iop_adma_desc_slot *desc)
+static inline enum sum_check_flags
+iop_desc_get_zero_result(struct iop_adma_desc_slot *desc)
 {
 	struct iop3xx_desc_aau *hw_desc = desc->hw_desc;
 	struct iop3xx_aau_desc_ctrl desc_ctrl = hw_desc->desc_ctrl_field;
 
 	iop_paranoia(!(desc_ctrl.tx_complete && desc_ctrl.zero_result_en));
-	return desc_ctrl.zero_result_err;
+	return desc_ctrl.zero_result_err << SUM_CHECK_P;
 }
 
 static inline void iop_chan_append(struct iop_adma_chan *chan)
diff --git a/arch/arm/mach-iop13xx/include/mach/adma.h b/arch/arm/mach-iop13xx/include/mach/adma.h
index 5722e86..1cd31df 100644
--- a/arch/arm/mach-iop13xx/include/mach/adma.h
+++ b/arch/arm/mach-iop13xx/include/mach/adma.h
@@ -428,18 +428,20 @@ static inline void iop_desc_set_block_fill_val(struct iop_adma_desc_slot *desc,
 	hw_desc->block_fill_data = val;
 }
 
-static inline int iop_desc_get_zero_result(struct iop_adma_desc_slot *desc)
+static inline enum sum_check_flags
+iop_desc_get_zero_result(struct iop_adma_desc_slot *desc)
 {
 	struct iop13xx_adma_desc_hw *hw_desc = desc->hw_desc;
 	struct iop13xx_adma_desc_ctrl desc_ctrl = hw_desc->desc_ctrl_field;
 	struct iop13xx_adma_byte_count byte_count = hw_desc->byte_count_field;
+	enum sum_check_flags flags;
 
 	BUG_ON(!(byte_count.tx_complete && desc_ctrl.zero_result));
 
-	if (desc_ctrl.pq_xfer_en)
-		return byte_count.zero_result_err_q;
-	else
-		return byte_count.zero_result_err;
+	flags = byte_count.zero_result_err_q << SUM_CHECK_Q;
+	flags |= byte_count.zero_result_err << SUM_CHECK_P;
+
+	return flags;
 }
 
 static inline void iop_chan_append(struct iop_adma_chan *chan)
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 3caecdd..3bfbbc0 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -234,7 +234,7 @@ static int page_is_zero(struct page *p, unsigned int offset, size_t len)
  */
 struct dma_async_tx_descriptor *
 async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
-	      int src_cnt, size_t len, u32 *result,
+	      int src_cnt, size_t len, enum sum_check_flags *result,
 	      struct async_submit_ctl *submit)
 {
 	struct dma_chan *chan = async_tx_find_channel(submit, DMA_XOR_VAL,
@@ -291,7 +291,7 @@ async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
 
 		async_tx_quiesce(&tx);
 
-		*result = page_is_zero(dest, offset, len) ? 0 : 1;
+		*result = !page_is_zero(dest, offset, len) << SUM_CHECK_P;
 
 		async_tx_sync_epilog(submit);
 		submit->flags = flags_orig;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0e456a6..a39b14a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2569,7 +2569,7 @@ static void handle_parity_checks5(raid5_conf_t *conf, struct stripe_head *sh,
 		 * we are done.  Otherwise update the mismatch count and repair
 		 * parity if !MD_RECOVERY_CHECK
 		 */
-		if (sh->ops.zero_sum_result == 0)
+		if ((sh->ops.zero_sum_result & SUM_CHECK_P_RESULT) == 0)
 			/* parity is correct (on disc,
 			 * not in buffer any more)
 			 */
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 6ab0ccd..f580cb9 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -2,6 +2,7 @@
 #define _RAID5_H
 
 #include <linux/raid/xor.h>
+#include <linux/dmaengine.h>
 
 /*
  *
@@ -220,8 +221,8 @@ struct stripe_head {
 	 * @target - STRIPE_OP_COMPUTE_BLK target
 	 */
 	struct stripe_operations {
-		int		   target;
-		u32		   zero_sum_result;
+		int		     target;
+		enum sum_check_flags zero_sum_result;
 	} ops;
 	struct r5dev {
 		struct bio	req;
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index 00cfb63..3d21a25 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -148,7 +148,7 @@ async_xor(struct page *dest, struct page **src_list, unsigned int offset,
 
 struct dma_async_tx_descriptor *
 async_xor_val(struct page *dest, struct page **src_list, unsigned int offset,
-	      int src_cnt, size_t len, u32 *result,
+	      int src_cnt, size_t len, enum sum_check_flags *result,
 	      struct async_submit_ctl *submit);
 
 struct dma_async_tx_descriptor *
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 6768727..02447af 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -87,6 +87,25 @@ enum dma_ctrl_flags {
 };
 
 /**
+ * enum sum_check_bits - bit position of pq_check_flags
+ */
+enum sum_check_bits {
+	SUM_CHECK_P = 0,
+	SUM_CHECK_Q = 1,
+};
+
+/**
+ * enum pq_check_flags - result of async_{xor,pq}_zero_sum operations
+ * @SUM_CHECK_P_RESULT - 1 if xor zero sum error, 0 otherwise
+ * @SUM_CHECK_Q_RESULT - 1 if reed-solomon zero sum error, 0 otherwise
+ */
+enum sum_check_flags {
+	SUM_CHECK_P_RESULT = (1 << SUM_CHECK_P),
+	SUM_CHECK_Q_RESULT = (1 << SUM_CHECK_Q),
+};
+
+
+/**
  * dma_cap_mask_t - capabilities bitmap modeled after cpumask_t.
  * See linux/cpumask.h
  */
@@ -245,7 +264,7 @@ struct dma_device {
 		unsigned int src_cnt, size_t len, unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_xor_val)(
 		struct dma_chan *chan, dma_addr_t *src,	unsigned int src_cnt,
-		size_t len, u32 *result, unsigned long flags);
+		size_t len, enum sum_check_flags *result, unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_memset)(
 		struct dma_chan *chan, dma_addr_t dest, int value, size_t len,
 		unsigned long flags);


^ permalink raw reply related	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905200111p54382735v6941b52825cf4d7e@mail.gmail.com>]

* RE: [PATCH v2 06/11] async_tx: add sum check flags
       [not found]   ` <f12847240905200111p54382735v6941b52825cf4d7e@mail.gmail.com>
@ 2009-05-29 13:41     ` Sosnowski, Maciej
  0 siblings, 0 replies; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:41 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> Replace the flat zero_sum_result with a collection of flags to contain
> the P (xor) zero-sum result, and the soon to be utilized Q (raid6 reed
> solomon syndrome) zero-sum result.  Use the SUM_CHECK_ namespace instead
> of DMA_ since these flags will be used on non-dma-zero-sum enabled
> platforms.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/arm/include/asm/hardware/iop3xx-adma.h |    5 +++--
>  arch/arm/mach-iop13xx/include/mach/adma.h   |   12 +++++++-----
>  crypto/async_tx/async_xor.c                 |    4 ++--
>  drivers/md/raid5.c                          |    2 +-
>  drivers/md/raid5.h                          |    5 +++--
>  include/linux/async_tx.h                    |    2 +-
>  include/linux/dmaengine.h                   |   21 ++++++++++++++++++++-
>  7 files changed, 37 insertions(+), 14 deletions(-)

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 07/11] async_tx: kill needless module_{init|exit}
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (5 preceding siblings ...)
  2009-05-19  0:59 ` [PATCH v2 06/11] async_tx: add sum check flags Dan Williams
@ 2009-05-19  1:00 ` Dan Williams
       [not found]   ` <f12847240905250323o21113fb9xbc4c16eea07b215@mail.gmail.com>
  2009-05-19  1:00 ` [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication Dan Williams
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-19  1:00 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

If module_init and module_exit are nops then neither need to be defined.

[ Impact: pure cleanup ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 crypto/async_tx/async_memcpy.c |   13 -------------
 crypto/async_tx/async_memset.c |   13 -------------
 crypto/async_tx/async_tx.c     |   17 +++--------------
 3 files changed, 3 insertions(+), 40 deletions(-)

diff --git a/crypto/async_tx/async_memcpy.c b/crypto/async_tx/async_memcpy.c
index c9342ae..3ec118d 100644
--- a/crypto/async_tx/async_memcpy.c
+++ b/crypto/async_tx/async_memcpy.c
@@ -87,19 +87,6 @@ async_memcpy(struct page *dest, struct page *src, unsigned int dest_offset,
 }
 EXPORT_SYMBOL_GPL(async_memcpy);
 
-static int __init async_memcpy_init(void)
-{
-	return 0;
-}
-
-static void __exit async_memcpy_exit(void)
-{
-	do { } while (0);
-}
-
-module_init(async_memcpy_init);
-module_exit(async_memcpy_exit);
-
 MODULE_AUTHOR("Intel Corporation");
 MODULE_DESCRIPTION("asynchronous memcpy api");
 MODULE_LICENSE("GPL");
diff --git a/crypto/async_tx/async_memset.c b/crypto/async_tx/async_memset.c
index e347dbe..ae63140 100644
--- a/crypto/async_tx/async_memset.c
+++ b/crypto/async_tx/async_memset.c
@@ -78,19 +78,6 @@ async_memset(struct page *dest, int val, unsigned int offset, size_t len,
 }
 EXPORT_SYMBOL_GPL(async_memset);
 
-static int __init async_memset_init(void)
-{
-	return 0;
-}
-
-static void __exit async_memset_exit(void)
-{
-	do { } while (0);
-}
-
-module_init(async_memset_init);
-module_exit(async_memset_exit);
-
 MODULE_AUTHOR("Intel Corporation");
 MODULE_DESCRIPTION("asynchronous memset api");
 MODULE_LICENSE("GPL");
diff --git a/crypto/async_tx/async_tx.c b/crypto/async_tx/async_tx.c
index 85e1b44..71f708f 100644
--- a/crypto/async_tx/async_tx.c
+++ b/crypto/async_tx/async_tx.c
@@ -42,6 +42,9 @@ static void __exit async_tx_exit(void)
 	async_dmaengine_put();
 }
 
+module_init(async_tx_init);
+module_exit(async_tx_exit);
+
 /**
  * __async_tx_find_channel - find a channel to carry out the operation or let
  *	the transaction execute synchronously
@@ -61,17 +64,6 @@ __async_tx_find_channel(struct async_submit_ctl *submit,
 	return async_dma_find_channel(tx_type);
 }
 EXPORT_SYMBOL_GPL(__async_tx_find_channel);
-#else
-static int __init async_tx_init(void)
-{
-	printk(KERN_INFO "async_tx: api initialized (sync-only)\n");
-	return 0;
-}
-
-static void __exit async_tx_exit(void)
-{
-	do { } while (0);
-}
 #endif
 
 
@@ -293,9 +285,6 @@ void async_tx_quiesce(struct dma_async_tx_descriptor **tx)
 }
 EXPORT_SYMBOL_GPL(async_tx_quiesce);
 
-module_init(async_tx_init);
-module_exit(async_tx_exit);
-
 MODULE_AUTHOR("Intel Corporation");
 MODULE_DESCRIPTION("Asynchronous Bulk Memory Transactions API");
 MODULE_LICENSE("GPL");


^ permalink raw reply related	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250323o21113fb9xbc4c16eea07b215@mail.gmail.com>]

* RE: [PATCH v2 07/11] async_tx: kill needless module_{init|exit}
       [not found]   ` <f12847240905250323o21113fb9xbc4c16eea07b215@mail.gmail.com>
@ 2009-05-29 13:42     ` Sosnowski, Maciej
  0 siblings, 0 replies; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> If module_init and module_exit are nops then neither need to be defined.
> 
> [ Impact: pure cleanup ]
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  crypto/async_tx/async_memcpy.c |   13 -------------
>  crypto/async_tx/async_memset.c |   13 -------------
>  crypto/async_tx/async_tx.c     |   17 +++--------------
>  3 files changed, 3 insertions(+), 40 deletions(-)

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (6 preceding siblings ...)
  2009-05-19  1:00 ` [PATCH v2 07/11] async_tx: kill needless module_{init|exit} Dan Williams
@ 2009-05-19  1:00 ` Dan Williams
  2009-05-22  8:29   ` Andre Noll
       [not found]   ` <f12847240905200111q37457b29lb9e30879e251888@mail.gmail.com>
  2009-05-19  1:00 ` [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations Dan Williams
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  1:00 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

[ Based on an original patch by Yuri Tikhonov ]

This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:

 async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

 async_syndrome_val() validates the given source buffers against known P
    and Q values.

When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation.  Care must
be taken to remove Q from P' and P from Q'.  For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:

p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

Note: 4 is the minimum acceptable maxpq otherwise we punt to
synchronous-software path.

The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.

Note: Some devices have native support for P+Q continuation and can skip
this extra work.  Devices with this capability can advertise it with
dma_set_maxpq.  It is up to each driver how the DMA_PREP_CONTINUE flag
is honored.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/arm/mach-iop13xx/setup.c |    2 
 crypto/async_tx/Kconfig       |    4 
 crypto/async_tx/Makefile      |    1 
 crypto/async_tx/async_pq.c    |  399 +++++++++++++++++++++++++++++++++++++++++
 crypto/async_tx/async_xor.c   |    2 
 drivers/dma/dmaengine.c       |    4 
 drivers/dma/iop-adma.c        |    2 
 include/linux/async_tx.h      |    9 +
 include/linux/dmaengine.h     |   58 +++++-
 9 files changed, 472 insertions(+), 9 deletions(-)
 create mode 100644 crypto/async_tx/async_pq.c

diff --git a/arch/arm/mach-iop13xx/setup.c b/arch/arm/mach-iop13xx/setup.c
index 9800228..2e7ca0d 100644
--- a/arch/arm/mach-iop13xx/setup.c
+++ b/arch/arm/mach-iop13xx/setup.c
@@ -506,7 +506,7 @@ void __init iop13xx_platform_init(void)
 			dma_cap_set(DMA_MEMSET, plat_data->cap_mask);
 			dma_cap_set(DMA_MEMCPY_CRC32C, plat_data->cap_mask);
 			dma_cap_set(DMA_INTERRUPT, plat_data->cap_mask);
-			dma_cap_set(DMA_PQ_XOR, plat_data->cap_mask);
+			dma_cap_set(DMA_PQ, plat_data->cap_mask);
 			dma_cap_set(DMA_PQ_UPDATE, plat_data->cap_mask);
 			dma_cap_set(DMA_PQ_VAL, plat_data->cap_mask);
 			break;
diff --git a/crypto/async_tx/Kconfig b/crypto/async_tx/Kconfig
index d8fb391..cb6d731 100644
--- a/crypto/async_tx/Kconfig
+++ b/crypto/async_tx/Kconfig
@@ -14,3 +14,7 @@ config ASYNC_MEMSET
 	tristate
 	select ASYNC_CORE
 
+config ASYNC_PQ
+	tristate
+	select ASYNC_CORE
+
diff --git a/crypto/async_tx/Makefile b/crypto/async_tx/Makefile
index 27baa7d..1b99265 100644
--- a/crypto/async_tx/Makefile
+++ b/crypto/async_tx/Makefile
@@ -2,3 +2,4 @@ obj-$(CONFIG_ASYNC_CORE) += async_tx.o
 obj-$(CONFIG_ASYNC_MEMCPY) += async_memcpy.o
 obj-$(CONFIG_ASYNC_MEMSET) += async_memset.o
 obj-$(CONFIG_ASYNC_XOR) += async_xor.o
+obj-$(CONFIG_ASYNC_PQ) += async_pq.o
diff --git a/crypto/async_tx/async_pq.c b/crypto/async_tx/async_pq.c
new file mode 100644
index 0000000..195e6cf
--- /dev/null
+++ b/crypto/async_tx/async_pq.c
@@ -0,0 +1,399 @@
+/*
+ * Copyright(c) 2007 Yuri Tikhonov <yur@emcraft.com>
+ * Copyright(c) 2009 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59
+ * Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+ *
+ * The full GNU General Public License is included in this distribution in the
+ * file called COPYING.
+ */
+#include <linux/kernel.h>
+#include <linux/interrupt.h>
+#include <linux/dma-mapping.h>
+#include <linux/raid/pq.h>
+#include <linux/async_tx.h>
+
+/**
+ * spare_pages - synchronous zero sum result buffers
+ *
+ * Protected by spare_lock
+ */
+static struct page *spare_pages[2];
+static spinlock_t spare_lock;
+
+/**
+ * scribble - space to hold throwaway P buffer for synchronous gen_syndrome
+ */
+static struct page *scribble;
+
+static bool is_raid6_zero_block(struct page *p)
+{
+	return p == (void *) raid6_empty_zero_page;
+}
+
+/* the struct page *blocks[] parameter passed to async_gen_syndrome()
+ * and async_syndrome_val() contains the 'P' destination address at
+ * blocks[disks-2] and the 'Q' destination address at blocks[disks-1]
+ *
+ * note: these are macros as they are used a lvalues
+ */
+#define P(b, d) (b[d-2])
+#define Q(b, d) (b[d-1])
+
+/**
+ * do_async_gen_syndrome - asynchronously calculate P and/or Q
+ */
+static __async_inline struct dma_async_tx_descriptor *
+do_async_gen_syndrome(struct dma_chan *chan, struct page **blocks,
+		      const unsigned char *scfs, unsigned int offset, int disks,
+		      size_t len, dma_addr_t *dma_src,
+		      struct async_submit_ctl *submit)
+{
+	struct dma_async_tx_descriptor *tx = NULL;
+	struct dma_device *dma = chan->device;
+	enum dma_ctrl_flags dma_flags = 0;
+	enum async_tx_flags flags_orig = submit->flags;
+	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
+	dma_async_tx_callback cb_param_orig = submit->cb_param;
+	int src_cnt = disks - 2;
+	unsigned char coefs[src_cnt];
+	unsigned short pq_src_cnt;
+	dma_addr_t dma_dest[2];
+	int src_off = 0;
+	int idx;
+	int i;
+
+	/* DMAs use destinations as sources, so use BIDIRECTIONAL mapping */
+	if (P(blocks, disks))
+		dma_dest[0] = dma_map_page(dma->dev, P(blocks, disks), offset,
+					   len, DMA_BIDIRECTIONAL);
+	else
+		dma_flags |= DMA_PREP_PQ_DISABLE_P;
+	if (Q(blocks, disks))
+		dma_dest[1] = dma_map_page(dma->dev, Q(blocks, disks), offset,
+					   len, DMA_BIDIRECTIONAL);
+	else
+		dma_flags |= DMA_PREP_PQ_DISABLE_Q;
+
+	/* convert source addresses being careful to collapse 'empty'
+	 * sources and update the coefficients accordingly
+	 */
+	for (i = 0, idx = 0; i < src_cnt; i++) {
+		if (is_raid6_zero_block(blocks[i]))
+			continue;
+		dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len,
+					    DMA_TO_DEVICE);
+		coefs[idx] = scfs[i];
+		idx++;
+	}
+	src_cnt = idx;
+
+	while (src_cnt > 0) {
+		submit->flags = flags_orig;
+		pq_src_cnt = min(src_cnt, dma_maxpq(dma, dma_flags));
+		/* if we are submitting additional pqs, leave the chain open,
+		 * clear the callback parameters, and leave the destination
+		 * buffers mapped
+		 */
+		if (src_cnt > pq_src_cnt) {
+			submit->flags &= ~ASYNC_TX_ACK;
+			dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP;
+			submit->cb_fn = NULL;
+			submit->cb_param = NULL;
+		} else {
+			dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP;
+			submit->cb_fn = cb_fn_orig;
+			submit->cb_param = cb_param_orig;
+		}
+		if (submit->cb_fn)
+			dma_flags |= DMA_PREP_INTERRUPT;
+
+		/* Since we have clobbered the src_list we are committed
+		 * to doing this asynchronously.  Drivers force forward
+		 * progress in case they can not provide a descriptor
+		 */
+		for (;;) {
+			tx = dma->device_prep_dma_pq(chan, dma_dest,
+						     &dma_src[src_off],
+						     pq_src_cnt,
+						     &coefs[src_off], len,
+						     dma_flags);
+			if (likely(tx))
+				break;
+			async_tx_quiesce(&submit->depend_tx);
+			dma_async_issue_pending(chan);
+		}
+
+		async_tx_submit(chan, tx, submit);
+		submit->depend_tx = tx;
+
+		/* drop completed sources */
+		src_cnt -= pq_src_cnt;
+		src_off += pq_src_cnt;
+
+		dma_flags |= DMA_PREP_CONTINUE;
+	}
+
+	return tx;
+}
+
+/**
+ * do_sync_gen_syndrome - synchronously calculate a raid6 syndrome
+ */
+static void
+do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
+		     size_t len, struct async_submit_ctl *submit)
+{
+	void **srcs;
+	int i;
+
+	if (submit->scribble)
+		srcs = (void **) submit->scribble;
+	else
+		srcs = (void **) blocks;
+
+	for (i = 0; i < disks; i++) {
+		if (is_raid6_zero_block(blocks[i])) {
+			BUG_ON(i > disks - 3); /* P or Q can't be zero */
+			srcs[i] = (void *) blocks[i];
+		} else
+			srcs[i] = page_address(blocks[i]) + offset;
+	}
+	raid6_call.gen_syndrome(disks, len, srcs);
+	async_tx_sync_epilog(submit);
+}
+
+/**
+ * async_gen_syndrome - asynchronously calculate a raid6 syndrome
+ * @blocks: source blocks from idx 0..disks-3, P @ disks-2 and Q @ disks-1
+ * @offset: common offset into each block (src and dest) to start transaction
+ * @disks: number of blocks (including missing P or Q, see below)
+ * @len: length of operation in bytes
+ * @submit: submission/completion modifiers
+ *
+ * General note: This routine assumes a field of GF(2^8) with a
+ * primitive polynomial of 0x11d and a generator of {02}.
+ *
+ * 'disks' note: callers can optionally omit either P or Q (but not
+ * both) from the calculation by setting blocks[disks-2] or
+ * blocks[disks-1] to NULL.  When P or Q is omitted 'len' must be <=
+ * PAGE_SIZE as a temporary buffer of this size is used in the
+ * synchronous path.  'disks' always accounts for both destination
+ * buffers.
+ *
+ * 'blocks' note: if submit->scribble is NULL then the contents of
+ * 'blocks' may be overridden
+ */
+struct dma_async_tx_descriptor *
+async_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
+		   size_t len, struct async_submit_ctl *submit)
+{
+	int src_cnt = disks - 2;
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
+						      &P(blocks, disks), 2,
+						      blocks, src_cnt, len);
+	struct dma_device *device = chan ? chan->device : NULL;
+	dma_addr_t *dma_src = NULL;
+
+	BUG_ON(disks > 255 || !(P(blocks, disks) || Q(blocks, disks)));
+
+	if (submit->scribble)
+		dma_src = submit->scribble;
+	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
+		dma_src = (dma_addr_t *) blocks;
+
+	if (dma_src && device &&
+	    (src_cnt <= dma_maxpq(device, 0) ||
+	     dma_maxpq(device, DMA_PREP_CONTINUE) > 0)) {
+		/* run the p+q asynchronously */
+		pr_debug("%s: (async) len: %zu\n", __func__, len);
+		return do_async_gen_syndrome(chan, blocks, raid6_gfexp, offset,
+					     disks, len, dma_src, submit);
+	}
+
+	/* run the pq synchronously */
+	pr_debug("%s: (sync) len: %zu\n", __func__, len);
+
+	/* wait for any prerequisite operations */
+	async_tx_quiesce(&submit->depend_tx);
+
+	if (!P(blocks, disks)) {
+		P(blocks, disks) = scribble;
+		BUG_ON(len + offset > PAGE_SIZE);
+	}
+	if (!Q(blocks, disks)) {
+		Q(blocks, disks) = scribble;
+		BUG_ON(len + offset > PAGE_SIZE);
+	}
+	do_sync_gen_syndrome(blocks, offset, disks, len, submit);
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(async_gen_syndrome);
+
+/**
+ * async_syndrome_val - asynchronously validate a raid6 syndrome
+ * @blocks: source blocks from idx 0..disks-3, P @ disks-2 and Q @ disks-1
+ * @offset: common offset into each block (src and dest) to start transaction
+ * @disks: number of blocks (including missing P or Q, see below)
+ * @len: length of operation in bytes
+ * @pqres: on val failure SUM_CHECK_P_RESULT and/or SUM_CHECK_Q_RESULT are set
+ * @submit: submission / completion modifiers
+ *
+ * The same notes from async_gen_syndrome apply to the 'blocks',
+ * and 'disks' parameters of this routine.
+ */
+struct dma_async_tx_descriptor *
+async_syndrome_val(struct page **blocks, unsigned int offset, int disks,
+		   size_t len, enum sum_check_flags *pqres,
+		   struct async_submit_ctl *submit)
+{
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ_VAL,
+						      NULL, 0,  blocks, disks,
+						      len);
+	struct dma_device *device = chan ? chan->device : NULL;
+	struct dma_async_tx_descriptor *tx = NULL;
+	enum dma_ctrl_flags dma_flags = submit->cb_fn ? DMA_PREP_INTERRUPT : 0;
+	dma_addr_t *dma_src = NULL;
+
+	BUG_ON(disks < 4);
+
+	if (submit->scribble)
+		dma_src = submit->scribble;
+	else if (sizeof(dma_addr_t) <= sizeof(struct page *))
+		dma_src = (dma_addr_t *) blocks;
+
+	if (dma_src && device && disks <= dma_maxpq(device, 0)) {
+		struct device *dev = device->dev;
+		dma_addr_t *pq = &dma_src[disks-2];
+		int i;
+
+		pr_debug("%s: (async) len: %zu\n", __func__, len);
+		if (!P(blocks, disks))
+			dma_flags |= DMA_PREP_PQ_DISABLE_P;
+		if (!Q(blocks, disks))
+			dma_flags |= DMA_PREP_PQ_DISABLE_Q;
+		for (i = 0; i < disks; i++)
+			if (likely(blocks[i])) {
+				BUG_ON(is_raid6_zero_block(blocks[i]));
+				dma_src[i] = dma_map_page(dev, blocks[i],
+							  offset, len,
+							  DMA_TO_DEVICE);
+			}
+
+		for (;;) {
+			tx = device->device_prep_dma_pq_val(chan, pq, dma_src,
+							    disks - 2,
+							    raid6_gfexp,
+							    len, pqres,
+							    dma_flags);
+			if (likely(tx))
+				break;
+			async_tx_quiesce(&submit->depend_tx);
+			dma_async_issue_pending(chan);
+		}
+		async_tx_submit(chan, tx, submit);
+	} else {
+		struct page *p_src = P(blocks, disks);
+		struct page *q_src = Q(blocks, disks);
+		enum async_tx_flags flags_orig = submit->flags;
+		dma_async_tx_callback cb_fn_orig = submit->cb_fn;
+		void *cb_param_orig = submit->cb_param;
+		void *p, *q, *s;
+
+		pr_debug("%s: (sync) len: %zu\n", __func__, len);
+		BUG_ON(len + offset > PAGE_SIZE);
+		submit->flags &= ~ASYNC_TX_ACK;
+		submit->cb_fn = NULL;
+		submit->cb_param = NULL;
+
+		/* recompute the parity into temporary buffers */
+		spin_lock(&spare_lock);
+		P(blocks, disks) = spare_pages[0];
+		Q(blocks, disks) = spare_pages[1];
+		tx = async_gen_syndrome(blocks, offset,
+					disks, len, submit);
+		async_tx_quiesce(&tx);
+
+		/* validate that the existing parity matches the
+		 * temporary result
+		 */
+		*pqres = 0;
+		if (p_src) {
+			p = page_address(p_src) + offset;
+			s = page_address(spare_pages[0]) + offset;
+			*pqres |= !!memcmp(p, s, len) << SUM_CHECK_P;
+		}
+
+		if (q_src) {
+			q = page_address(q_src) + offset;
+			s = page_address(spare_pages[1]) + offset;
+			*pqres |= !!memcmp(q, s, len) << SUM_CHECK_Q;
+		}
+		spin_unlock(&spare_lock);
+
+		/* restore P and Q, but note that if submit->scribble
+		 * was NULL the above call to async_gen_syndrome() will
+		 * have destroyed the contents of 'blocks'
+		 */
+		P(blocks, disks) = p_src;
+		Q(blocks, disks) = q_src;
+
+		submit->cb_fn = cb_fn_orig;
+		submit->cb_param = cb_param_orig;
+		submit->flags = flags_orig;
+		async_tx_sync_epilog(submit);
+	}
+
+	return tx;
+}
+EXPORT_SYMBOL_GPL(async_syndrome_val);
+
+static void safe_put_page(struct page *p)
+{
+	if (p)
+		put_page(p);
+}
+
+static int __init async_pq_init(void)
+{
+	spin_lock_init(&spare_lock);
+
+	spare_pages[0] = alloc_page(GFP_KERNEL);
+	spare_pages[1] = alloc_page(GFP_KERNEL);
+	scribble = alloc_page(GFP_KERNEL);
+
+	if (spare_pages[0] && spare_pages[1] && scribble)
+		return 0;
+
+	safe_put_page(scribble);
+	safe_put_page(spare_pages[1]);
+	safe_put_page(spare_pages[0]);
+	pr_err("%s: failed to allocate required spare pages\n", __func__);
+	return -ENOMEM;
+}
+
+static void __exit async_pq_exit(void)
+{
+	safe_put_page(scribble);
+	safe_put_page(spare_pages[1]);
+	safe_put_page(spare_pages[0]);
+}
+
+module_init(async_pq_init);
+module_exit(async_pq_exit);
+
+MODULE_DESCRIPTION("asynchronous raid6 syndrome generation/validation");
+MODULE_LICENSE("GPL");
diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index 3bfbbc0..63ee568 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -62,7 +62,7 @@ do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 	while (src_cnt) {
 		submit->flags = flags_orig;
 		dma_flags = 0;
-		xor_src_cnt = min(src_cnt, dma->max_xor);
+		xor_src_cnt = min(src_cnt, (int)dma->max_xor);
 		/* if we are submitting additional xors, leave the chain open,
 		 * clear the callback parameters, and leave the destination
 		 * buffer mapped
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 6781e8f..17cd775 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -646,6 +646,10 @@ int dma_async_device_register(struct dma_device *device)
 		!device->device_prep_dma_xor);
 	BUG_ON(dma_has_cap(DMA_XOR_VAL, device->cap_mask) &&
 		!device->device_prep_dma_xor_val);
+	BUG_ON(dma_has_cap(DMA_PQ, device->cap_mask) &&
+		!device->device_prep_dma_pq);
+	BUG_ON(dma_has_cap(DMA_PQ_VAL, device->cap_mask) &&
+		!device->device_prep_dma_pq_val);
 	BUG_ON(dma_has_cap(DMA_MEMSET, device->cap_mask) &&
 		!device->device_prep_dma_memset);
 	BUG_ON(dma_has_cap(DMA_INTERRUPT, device->cap_mask) &&
diff --git a/drivers/dma/iop-adma.c b/drivers/dma/iop-adma.c
index 6ff79a6..4496bc6 100644
--- a/drivers/dma/iop-adma.c
+++ b/drivers/dma/iop-adma.c
@@ -1257,7 +1257,7 @@ static int __devinit iop_adma_probe(struct platform_device *pdev)
 
 	dev_printk(KERN_INFO, &pdev->dev, "Intel(R) IOP: "
 	  "( %s%s%s%s%s%s%s%s%s%s)\n",
-	  dma_has_cap(DMA_PQ_XOR, dma_dev->cap_mask) ? "pq_xor " : "",
+	  dma_has_cap(DMA_PQ, dma_dev->cap_mask) ? "pq " : "",
 	  dma_has_cap(DMA_PQ_UPDATE, dma_dev->cap_mask) ? "pq_update " : "",
 	  dma_has_cap(DMA_PQ_VAL, dma_dev->cap_mask) ? "pq_val " : "",
 	  dma_has_cap(DMA_XOR, dma_dev->cap_mask) ? "xor " : "",
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index 3d21a25..6d80022 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -162,5 +162,14 @@ async_memset(struct page *dest, int val, unsigned int offset,
 
 struct dma_async_tx_descriptor *async_trigger_callback(struct async_submit_ctl *submit);
 
+struct dma_async_tx_descriptor *
+async_gen_syndrome(struct page **blocks, unsigned int offset, int src_cnt,
+		   size_t len, struct async_submit_ctl *submit);
+
+struct dma_async_tx_descriptor *
+async_syndrome_val(struct page **blocks, unsigned int offset, int src_cnt,
+		   size_t len, enum sum_check_flags *pqres,
+		   struct async_submit_ctl *submit);
+
 void async_tx_quiesce(struct dma_async_tx_descriptor **tx);
 #endif /* _ASYNC_TX_H_ */
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 02447af..0a05846 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -52,7 +52,7 @@ enum dma_status {
 enum dma_transaction_type {
 	DMA_MEMCPY,
 	DMA_XOR,
-	DMA_PQ_XOR,
+	DMA_PQ,
 	DMA_DUAL_XOR,
 	DMA_PQ_UPDATE,
 	DMA_XOR_VAL,
@@ -70,20 +70,28 @@ enum dma_transaction_type {
 
 /**
  * enum dma_ctrl_flags - DMA flags to augment operation preparation,
- * 	control completion, and communicate status.
+ *  control completion, and communicate status.
  * @DMA_PREP_INTERRUPT - trigger an interrupt (callback) upon completion of
- * 	this transaction
+ *  this transaction
  * @DMA_CTRL_ACK - the descriptor cannot be reused until the client
- * 	acknowledges receipt, i.e. has has a chance to establish any
- * 	dependency chains
+ *  acknowledges receipt, i.e. has has a chance to establish any dependency
+ *  chains
  * @DMA_COMPL_SKIP_SRC_UNMAP - set to disable dma-unmapping the source buffer(s)
  * @DMA_COMPL_SKIP_DEST_UNMAP - set to disable dma-unmapping the destination(s)
+ * @DMA_PREP_PQ_DISABLE_P - prevent generation of P while generating Q
+ * @DMA_PREP_PQ_DISABLE_Q - prevent generation of Q while generating P
+ * @DMA_PREP_CONTINUE - indicate to a driver that it is reusing buffers as
+ *  sources that were the result of a previous operation, in the case of a PQ
+ *  operation it continues the calculation with new sources
  */
 enum dma_ctrl_flags {
 	DMA_PREP_INTERRUPT = (1 << 0),
 	DMA_CTRL_ACK = (1 << 1),
 	DMA_COMPL_SKIP_SRC_UNMAP = (1 << 2),
 	DMA_COMPL_SKIP_DEST_UNMAP = (1 << 3),
+	DMA_PREP_PQ_DISABLE_P = (1 << 4),
+	DMA_PREP_PQ_DISABLE_Q = (1 << 5),
+	DMA_PREP_CONTINUE = (1 << 6),
 };
 
 /**
@@ -226,6 +234,7 @@ struct dma_async_tx_descriptor {
  * @global_node: list_head for global dma_device_list
  * @cap_mask: one or more dma_capability flags
  * @max_xor: maximum number of xor sources, 0 if no capability
+ * @max_pq: maximum number of PQ sources and PQ-continue capability
  * @dev_id: unique device ID
  * @dev: struct device reference for dma mapping api
  * @device_alloc_chan_resources: allocate resources and return the
@@ -234,6 +243,8 @@ struct dma_async_tx_descriptor {
  * @device_prep_dma_memcpy: prepares a memcpy operation
  * @device_prep_dma_xor: prepares a xor operation
  * @device_prep_dma_xor_val: prepares a xor validation operation
+ * @device_prep_dma_pq: prepares a pq operation
+ * @device_prep_dma_pq_val: prepares a pqzero_sum operation
  * @device_prep_dma_memset: prepares a memset operation
  * @device_prep_dma_interrupt: prepares an end of chain interrupt operation
  * @device_prep_slave_sg: prepares a slave dma operation
@@ -248,7 +259,9 @@ struct dma_device {
 	struct list_head channels;
 	struct list_head global_node;
 	dma_cap_mask_t  cap_mask;
-	int max_xor;
+	unsigned short max_xor;
+	unsigned short max_pq;
+	#define DMA_HAS_PQ_CONTINUE (1 << 15)
 
 	int dev_id;
 	struct device *dev;
@@ -265,6 +278,14 @@ struct dma_device {
 	struct dma_async_tx_descriptor *(*device_prep_dma_xor_val)(
 		struct dma_chan *chan, dma_addr_t *src,	unsigned int src_cnt,
 		size_t len, enum sum_check_flags *result, unsigned long flags);
+	struct dma_async_tx_descriptor *(*device_prep_dma_pq)(
+		struct dma_chan *chan, dma_addr_t *dst, dma_addr_t *src,
+		unsigned int src_cnt, const unsigned char *scf,
+		size_t len, unsigned long flags);
+	struct dma_async_tx_descriptor *(*device_prep_dma_pq_val)(
+		struct dma_chan *chan, dma_addr_t *pq, dma_addr_t *src,
+		unsigned int src_cnt, const unsigned char *scf, size_t len,
+		enum sum_check_flags *pqres, unsigned long flags);
 	struct dma_async_tx_descriptor *(*device_prep_dma_memset)(
 		struct dma_chan *chan, dma_addr_t dest, int value, size_t len,
 		unsigned long flags);
@@ -283,6 +304,31 @@ struct dma_device {
 	void (*device_issue_pending)(struct dma_chan *chan);
 };
 
+static inline void dma_set_maxpq(struct dma_device *dma, int maxpq, int has_pq_continue)
+{
+	dma->max_pq = maxpq;
+	if (has_pq_continue)
+		dma->max_pq |= DMA_HAS_PQ_CONTINUE;
+}
+
+/* dma_maxpq - reduce maxpq in the face of continued operations
+ * @dma - dma device with PQ capability
+ * @flags - to determine if DMA_PREP_CONTINUE is set
+ *
+ * When an engine does not support native continuation we need 3 extra
+ * source slots to reuse P and Q with the following coefficients:
+ * 1/ {00} * P : remove P from Q', but use it as a source for P'
+ * 2/ {01} * Q : use Q to continue Q' calculation
+ * 3/ {00} * Q : subtract Q from P' to cancel (2)
+ */
+static inline int dma_maxpq(struct dma_device *dma, enum dma_ctrl_flags flags)
+{
+	if ((flags & DMA_PREP_CONTINUE) &&
+	    (dma->max_pq & DMA_HAS_PQ_CONTINUE) == 0)
+		return dma->max_pq - 3;
+	return dma->max_pq & ~DMA_HAS_PQ_CONTINUE;
+}
+
 /* --- public DMA engine API --- */
 
 #ifdef CONFIG_DMA_ENGINE


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication
  2009-05-19  1:00 ` [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication Dan Williams
@ 2009-05-22  8:29   ` Andre Noll
  2009-06-03 22:11     ` Dan Williams
       [not found]   ` <f12847240905200111q37457b29lb9e30879e251888@mail.gmail.com>
  1 sibling, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-22  8:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 2496 bytes --]

On Mon, May 18, 2009 at 06:00:07PM -0700, Dan Williams wrote:
> +/* the struct page *blocks[] parameter passed to async_gen_syndrome()
> + * and async_syndrome_val() contains the 'P' destination address at
> + * blocks[disks-2] and the 'Q' destination address at blocks[disks-1]
> + *
> + * note: these are macros as they are used a lvalues
> + */
> +#define P(b, d) (b[d-2])
> +#define Q(b, d) (b[d-1])

s/a lvalues/as lvalues

> +	/* convert source addresses being careful to collapse 'empty'
> +	 * sources and update the coefficients accordingly
> +	 */
> +	for (i = 0, idx = 0; i < src_cnt; i++) {
> +		if (is_raid6_zero_block(blocks[i]))
> +			continue;
> +		dma_src[idx] = dma_map_page(dma->dev, blocks[i], offset, len,
> +					    DMA_TO_DEVICE);
> +		coefs[idx] = scfs[i];
> +		idx++;
> +	}
> +	src_cnt = idx;

If P is disabled, we could further collapse this loop by also skipping
sources where the coefficient is zero. Not sure if this would be a win
though.

> +
> +	while (src_cnt > 0) {
> +		submit->flags = flags_orig;
> +		pq_src_cnt = min(src_cnt, dma_maxpq(dma, dma_flags));
> +		/* if we are submitting additional pqs, leave the chain open,
> +		 * clear the callback parameters, and leave the destination
> +		 * buffers mapped
> +		 */
> +		if (src_cnt > pq_src_cnt) {
> +			submit->flags &= ~ASYNC_TX_ACK;
> +			dma_flags |= DMA_COMPL_SKIP_DEST_UNMAP;
> +			submit->cb_fn = NULL;
> +			submit->cb_param = NULL;
> +		} else {
> +			dma_flags &= ~DMA_COMPL_SKIP_DEST_UNMAP;
> +			submit->cb_fn = cb_fn_orig;
> +			submit->cb_param = cb_param_orig;
> +		}
> +		if (submit->cb_fn)
> +			dma_flags |= DMA_PREP_INTERRUPT;

The last if() can go to the else branch.

> +/**
> + * do_sync_gen_syndrome - synchronously calculate a raid6 syndrome
> + */
> +static void
> +do_sync_gen_syndrome(struct page **blocks, unsigned int offset, int disks,
> +		     size_t len, struct async_submit_ctl *submit)
> +{
> +	void **srcs;
> +	int i;
> +
> +	if (submit->scribble)
> +		srcs = (void **) submit->scribble;

Unnecessary cast.

> +	else
> +		srcs = (void **) blocks;
> +
> +	for (i = 0; i < disks; i++) {
> +		if (is_raid6_zero_block(blocks[i])) {
> +			BUG_ON(i > disks - 3); /* P or Q can't be zero */
> +			srcs[i] = (void *) blocks[i];

Another Unnecessary cast.

Otherwise, this patch looks also very nice.

Thanks
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication
  2009-05-22  8:29   ` Andre Noll
@ 2009-06-03 22:11     ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-06-03 22:11 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Fri, May 22, 2009 at 1:29 AM, Andre Noll <maan@systemlinux.org> wrote:
> If P is disabled, we could further collapse this loop by also skipping
> sources where the coefficient is zero. Not sure if this would be a win
> though.

It's ok to leave this as an optimization opportunity for the caller.

> Otherwise, this patch looks also very nice.
>
> Thanks
> Andre

Thanks, and the other cleanups were addressed as well.

^ permalink raw reply	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905200111q37457b29lb9e30879e251888@mail.gmail.com>]

* RE: [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication
       [not found]   ` <f12847240905200111q37457b29lb9e30879e251888@mail.gmail.com>
@ 2009-05-29 13:42     ` Sosnowski, Maciej
  2009-06-03 22:16       ` Dan Williams
  0 siblings, 1 reply; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> [ Based on an original patch by Yuri Tikhonov ]
> 
> This adds support for doing asynchronous GF multiplication by adding
> two additional functions to the async_tx API:
> 
>  async_gen_syndrome() does simultaneous XOR and Galois field
>    multiplication of sources.
> 
>  async_syndrome_val() validates the given source buffers against known P
>    and Q values.
> 
> When a request is made to run async_pq against more than the hardware
> maximum number of supported sources we need to reuse the previous
> generated P and Q values as sources into the next operation.  Care must
> be taken to remove Q from P' and P from Q'.  For example to perform a 5
> source pq op with hardware that only supports 4 sources at a time the
> following approach is taken:
> 
> p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
> p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))
> 
> p' = p + q + q + src4 = p + src4
> q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4
> 
> Note: 4 is the minimum acceptable maxpq otherwise we punt to
> synchronous-software path.
> 
> The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
> sources (in the above manner) and fill the remaining slots up to maxpq
> with the new sources/coefficients.
> 
> Note: Some devices have native support for P+Q continuation and can skip
> this extra work.  Devices with this capability can advertise it with
> dma_set_maxpq.  It is up to each driver how the DMA_PREP_CONTINUE flag
> is honored.
> 
> Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
> Signed-off-by: Ilya Yanok <yanok@emcraft.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  arch/arm/mach-iop13xx/setup.c |    2
>  crypto/async_tx/Kconfig       |    4
>  crypto/async_tx/Makefile      |    1
>  crypto/async_tx/async_pq.c    |  399 +++++++++++++++++++++++++++++++++++++++++
>  crypto/async_tx/async_xor.c   |    2
>  drivers/dma/dmaengine.c       |    4
>  drivers/dma/iop-adma.c        |    2
>  include/linux/async_tx.h      |    9 +
>  include/linux/dmaengine.h     |   58 +++++-
>  9 files changed, 472 insertions(+), 9 deletions(-)
>  create mode 100644 crypto/async_tx/async_pq.c

(...)

> +               /* Since we have clobbered the src_list we are committed
> +                * to doing this asynchronously.  Drivers force forward
> +                * progress in case they can not provide a descriptor
> +                */
> +               for (;;) {
> +                       tx = dma->device_prep_dma_pq(chan, dma_dest,
> +                                                    &dma_src[src_off],
> +                                                    pq_src_cnt,
> +                                                    &coefs[src_off], len,
> +                                                    dma_flags);
> +                       if (likely(tx))
> +                               break;
> +                       async_tx_quiesce(&submit->depend_tx);
> +                       dma_async_issue_pending(chan);
> +               }

How about adding a timeout to the loop in case we do not get a descriptor at all for some reason?

> +               for (;;) {
> +                       tx = device->device_prep_dma_pq_val(chan, pq, dma_src,
> +                                                           disks - 2,
> +                                                           raid6_gfexp,
> +                                                           len, pqres,
> +                                                           dma_flags);
> +                       if (likely(tx))
> +                               break;
> +                       async_tx_quiesce(&submit->depend_tx);
> +                       dma_async_issue_pending(chan);
> +               }

Same as above...

Thanks,
Maciej--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication
  2009-05-29 13:42     ` Sosnowski, Maciej
@ 2009-06-03 22:16       ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-06-03 22:16 UTC (permalink / raw)
  To: Sosnowski, Maciej; +Cc: linux-raid@vger.kernel.org, NeilBrown

2009/5/29 Sosnowski, Maciej <maciej.sosnowski@intel.com>:
> Dan Williams wrote:
>> +               /* Since we have clobbered the src_list we are committed
>> +                * to doing this asynchronously.  Drivers force forward
>> +                * progress in case they can not provide a descriptor
>> +                */
>> +               for (;;) {
>> +                       tx = dma->device_prep_dma_pq(chan, dma_dest,
>> +                                                    &dma_src[src_off],
>> +                                                    pq_src_cnt,
>> +                                                    &coefs[src_off], len,
>> +                                                    dma_flags);
>> +                       if (likely(tx))
>> +                               break;
>> +                       async_tx_quiesce(&submit->depend_tx);
>> +                       dma_async_issue_pending(chan);
>> +               }
>
> How about adding a timeout to the loop in case we do not get a descriptor at all for some reason?
>

There is an embedded timeout in async_tx_quiesce().  However, now that
we have the ->scribble pointer a future patch could make it a
requirement of the api.  With that in place we would always be able to
fall back to the synchronous at any point because the input parameters
would be preserved.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (7 preceding siblings ...)
  2009-05-19  1:00 ` [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication Dan Williams
@ 2009-05-19  1:00 ` Dan Williams
  2009-05-22  8:29   ` Andre Noll
       [not found]   ` <f12847240905250323q2e14efd6q69022a62cc7fd01f@mail.gmail.com>
  2009-05-19  1:00 ` [PATCH v2 10/11] dmatest: add pq support Dan Williams
  2009-05-19  1:00 ` [PATCH v2 11/11] async_tx: raid6 recovery self test Dan Williams
  10 siblings, 2 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  1:00 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

 async_raid6_2data_recov() recovers two data disk failures

 async_raid6_datap_recov() recovers a data disk and the P disk

These routines are a port of the synchronous versions found in
drivers/md/raid6recov.c.  The primary difference is breaking out the xor
operations into separate calls to async_xor.  Two helper routines are
introduced to perform scalar multiplication where needed.
async_sum_product() multiplies two sources by scalar coefficients and
then sums (xor) the result.  async_mult() simply multiplies a single
source by a scalar.

[ Impact: asynchronous raid6 recovery routines for 2data and datap cases ]

Cc: Yuri Tikhonov <yur@emcraft.com>
Cc: Ilya Yanok <yanok@emcraft.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 crypto/async_tx/Kconfig             |    5 +
 crypto/async_tx/Makefile            |    1 
 crypto/async_tx/async_raid6_recov.c |  292 +++++++++++++++++++++++++++++++++++
 include/linux/async_tx.h            |    8 +
 4 files changed, 306 insertions(+), 0 deletions(-)
 create mode 100644 crypto/async_tx/async_raid6_recov.c

diff --git a/crypto/async_tx/Kconfig b/crypto/async_tx/Kconfig
index cb6d731..e5aeb2b 100644
--- a/crypto/async_tx/Kconfig
+++ b/crypto/async_tx/Kconfig
@@ -18,3 +18,8 @@ config ASYNC_PQ
 	tristate
 	select ASYNC_CORE
 
+config ASYNC_RAID6_RECOV
+	tristate
+	select ASYNC_CORE
+	select ASYNC_PQ
+
diff --git a/crypto/async_tx/Makefile b/crypto/async_tx/Makefile
index 1b99265..9a1a768 100644
--- a/crypto/async_tx/Makefile
+++ b/crypto/async_tx/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_ASYNC_MEMCPY) += async_memcpy.o
 obj-$(CONFIG_ASYNC_MEMSET) += async_memset.o
 obj-$(CONFIG_ASYNC_XOR) += async_xor.o
 obj-$(CONFIG_ASYNC_PQ) += async_pq.o
+obj-$(CONFIG_ASYNC_RAID6_RECOV) += async_raid6_recov.o
diff --git a/crypto/async_tx/async_raid6_recov.c b/crypto/async_tx/async_raid6_recov.c
new file mode 100644
index 0000000..ea019e8
--- /dev/null
+++ b/crypto/async_tx/async_raid6_recov.c
@@ -0,0 +1,292 @@
+/*
+ * Asynchronous RAID-6 recovery calculations ASYNC_TX API.
+ * Copyright(c) 2009 Intel Corporation
+ *
+ * based on raid6recov.c:
+ *   Copyright 2002 H. Peter Anvin
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 51
+ * Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+#include <linux/kernel.h>
+#include <linux/interrupt.h>
+#include <linux/dma-mapping.h>
+#include <linux/raid/pq.h>
+#include <linux/async_tx.h>
+
+static struct dma_async_tx_descriptor *
+async_sum_product(struct page *dest, struct page **srcs, unsigned char *coef,
+		  size_t len, struct async_submit_ctl *submit)
+{
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
+						      &dest, 1, srcs, 2, len);
+	struct dma_device *dma = chan ? chan->device : NULL;
+	const u8 *amul, *bmul;
+	u8 ax, bx;
+	u8 *a, *b, *c;
+
+	if (dma) {
+		dma_addr_t dma_dest[2];
+		dma_addr_t dma_src[2];
+		struct device *dev = dma->dev;
+		struct dma_async_tx_descriptor *tx;
+		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
+
+		dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
+		dma_src[0] = dma_map_page(dev, srcs[0], 0, len, DMA_TO_DEVICE);
+		dma_src[1] = dma_map_page(dev, srcs[1], 0, len, DMA_TO_DEVICE);
+		tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 2, coef,
+					     len, dma_flags);
+		if (tx) {
+			async_tx_submit(chan, tx, submit);
+			return tx;
+		}
+	}
+
+	/* run the operation synchronously */
+	async_tx_quiesce(&submit->depend_tx);
+	amul = raid6_gfmul[coef[0]];
+	bmul = raid6_gfmul[coef[1]];
+	a = page_address(srcs[0]);
+	b = page_address(srcs[1]);
+	c = page_address(dest);
+
+	while (len--) {
+		ax    = amul[*a++];
+		bx    = bmul[*b++];
+		*c++ = ax ^ bx;
+	}
+
+	return NULL;
+}
+
+/**
+ * async_raid6_2data_recov - asynchronously calculate two missing data blocks
+ * @disks: number of disks in the RAID-6 array
+ * @bytes: block size
+ * @faila: first failed drive index
+ * @failb: second failed drive index
+ * @blocks: array of source pointers where the last two entries are p and q
+ * @submit: submission/completion modifiers
+ */
+struct dma_async_tx_descriptor *
+async_raid6_2data_recov(int disks, size_t bytes, int faila, int failb,
+			struct page **blocks, struct async_submit_ctl *submit)
+{
+	struct dma_async_tx_descriptor *tx;
+	struct page *p, *q, *dp, *dq;
+	struct page *srcs[2];
+	unsigned char coef[2];
+	enum async_tx_flags flags_orig = submit->flags;
+	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
+	void *cb_param_orig = submit->cb_param;
+
+	/* we need to preserve the contents of 'blocks' for the async
+	 * case, so punt to synchronous if a scribble buffer is not available
+	 */
+	if (!submit->scribble) {
+		void **ptrs = (void **) blocks;
+		int i;
+
+		async_tx_quiesce(&submit->depend_tx);
+		for (i = 0; i < disks; i++)
+			ptrs[i] = page_address(blocks[i]);
+
+		raid6_2data_recov(disks, bytes, faila, failb, ptrs);
+
+		async_tx_sync_epilog(submit);
+
+		return NULL;
+	}
+
+	p = blocks[disks-2];
+	q = blocks[disks-1];
+
+	/* Compute syndrome with zero for the missing data pages
+	   Use the dead data pages as temporary storage for
+	   delta p and delta q */
+	dp = blocks[faila];
+	blocks[faila] = (void *)raid6_empty_zero_page;
+	blocks[disks-2] = dp;
+	dq = blocks[failb];
+	blocks[failb] = (void *)raid6_empty_zero_page;
+	blocks[disks-1] = dq;
+
+	submit->flags &= ~ASYNC_TX_ACK;
+	submit->cb_fn = NULL;
+	submit->cb_param = NULL;
+	tx = async_gen_syndrome(blocks, 0, disks, bytes, submit);
+	submit->depend_tx = tx;
+
+	/* Restore pointer table */
+	blocks[faila]   = dp;
+	blocks[failb]   = dq;
+	blocks[disks-2] = p;
+	blocks[disks-1] = q;
+
+	/* compute P + Pxy */
+	srcs[0] = dp;
+	srcs[1] = p;
+	submit->flags |= ASYNC_TX_XOR_DROP_DST;
+	tx = async_xor(dp, srcs, 0, 2, bytes, submit);
+	submit->depend_tx = tx;
+
+	/* compute Q + Qxy */
+	srcs[0] = dq;
+	srcs[1] = q;
+	tx = async_xor(dq, srcs, 0, 2, bytes, submit);
+	submit->depend_tx = tx;
+
+	/* Dx = A*(P+Pxy) + B*(Q+Qxy) */
+	srcs[0] = dp;
+	srcs[1] = dq;
+	coef[0] = raid6_gfexi[failb-faila];
+	coef[1] = raid6_gfinv[raid6_gfexp[faila]^raid6_gfexp[failb]];
+	tx = async_sum_product(dq, srcs, coef, bytes, submit);
+	submit->depend_tx = tx;
+
+	/* Dy = P+Pxy+Dx */
+	srcs[0] = dp;
+	srcs[1] = dq;
+	submit->flags = flags_orig | ASYNC_TX_XOR_DROP_DST;
+	submit->cb_fn = cb_fn_orig;
+	submit->cb_param = cb_param_orig;
+	tx = async_xor(dp, srcs, 0, 2, bytes, submit);
+
+	return tx;
+}
+EXPORT_SYMBOL_GPL(async_raid6_2data_recov);
+
+static struct dma_async_tx_descriptor *
+async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
+	   struct async_submit_ctl *submit)
+{
+	struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
+						      &dest, 1, srcs, 2, len);
+	struct dma_device *dma = chan ? chan->device : NULL;
+	const u8 *qmul; /* Q multiplier table */
+	u8 *d, *s;
+
+	if (dma) {
+		dma_addr_t dma_dest[2];
+		dma_addr_t dma_src[1];
+		struct device *dev = dma->dev;
+		struct dma_async_tx_descriptor *tx;
+		enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
+
+		dma_dest[1] = dma_map_page(dev, dest, 0, len, DMA_BIDIRECTIONAL);
+		dma_src[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
+		tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef,
+					     len, dma_flags);
+		if (tx) {
+			async_tx_submit(chan, tx, submit);
+			return tx;
+		}
+	}
+
+	async_tx_quiesce(&submit->depend_tx);
+	qmul  = raid6_gfmul[coef];
+	d = page_address(dest);
+	s = page_address(src);
+
+	while (len--)
+		*d++ = qmul[*s++];
+
+	return NULL;
+}
+
+/**
+ * async_raid6_datap_recov - asynchronously calculate a data and the 'p' block
+ * @disks: number of disks in the RAID-6 array
+ * @bytes: block size
+ * @faila: failed drive index
+ * @blocks: array of source pointers where the last two entries are p and q
+ * @submit: submission/completion modifiers
+ */
+struct dma_async_tx_descriptor *
+async_raid6_datap_recov(int disks, size_t bytes, int faila,
+			struct page **blocks, struct async_submit_ctl *submit)
+{
+	struct dma_async_tx_descriptor *tx;
+	struct page *p, *q, *dq;
+	u8 coef;
+	enum async_tx_flags flags_orig = submit->flags;
+	dma_async_tx_callback cb_fn_orig = submit->cb_fn;
+	void *cb_param_orig = submit->cb_param;
+	struct page *srcs[2];
+
+	/* we need to preserve the contents of 'blocks' for the async
+	 * case, so punt to synchronous if a scribble buffer is not available
+	 */
+	if (!submit->scribble) {
+		void **ptrs = (void **) blocks;
+		int i;
+
+		async_tx_quiesce(&submit->depend_tx);
+		for (i = 0; i < disks; i++)
+			ptrs[i] = page_address(blocks[i]);
+
+		raid6_datap_recov(disks, bytes, faila, ptrs);
+
+		async_tx_sync_epilog(submit);
+
+		return NULL;
+	}
+
+	p = blocks[disks-2];
+	q = blocks[disks-1];
+
+	/* Compute syndrome with zero for the missing data page
+	   Use the dead data page as temporary storage for delta q */
+	dq = blocks[faila];
+	blocks[faila] = (void *)raid6_empty_zero_page;
+	blocks[disks-1] = dq;
+
+	submit->flags &= ~ASYNC_TX_ACK;
+	submit->cb_fn = NULL;
+	submit->cb_param = NULL;
+	tx = async_gen_syndrome(blocks, 0, disks, bytes, submit);
+	submit->depend_tx = tx;
+
+	/* Restore pointer table */
+	blocks[faila]   = dq;
+	blocks[disks-1] = q;
+
+	/* Now, pick the proper data tables */
+	coef = raid6_gfinv[raid6_gfexp[faila]];
+
+	submit->flags |= ASYNC_TX_XOR_DROP_DST;
+	srcs[0] = dq;
+	srcs[1] = q;
+	tx = async_xor(dq, srcs, 0, 2, bytes, submit);
+	submit->depend_tx = tx;
+
+	tx = async_mult(dq, dq, coef, bytes, submit);
+	submit->depend_tx = tx;
+
+	srcs[0] = p;
+	srcs[1] = dq;
+	submit->flags = flags_orig | ASYNC_TX_XOR_DROP_DST;
+	submit->cb_fn = cb_fn_orig;
+	submit->cb_param = cb_param_orig;
+	tx = async_xor(p, srcs, 0, 2, bytes, submit);
+
+	return tx;
+}
+EXPORT_SYMBOL_GPL(async_raid6_datap_recov);
+
+MODULE_AUTHOR("Dan Williams <dan.j.williams@intel.com>");
+MODULE_DESCRIPTION("asynchronous RAID-6 recovery api");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/async_tx.h b/include/linux/async_tx.h
index 6d80022..265e7e2 100644
--- a/include/linux/async_tx.h
+++ b/include/linux/async_tx.h
@@ -171,5 +171,13 @@ async_syndrome_val(struct page **blocks, unsigned int offset, int src_cnt,
 		   size_t len, enum sum_check_flags *pqres,
 		   struct async_submit_ctl *submit);
 
+struct dma_async_tx_descriptor *
+async_raid6_2data_recov(int src_num, size_t bytes, int faila, int failb,
+			struct page **ptrs, struct async_submit_ctl *submit);
+
+struct dma_async_tx_descriptor *
+async_raid6_datap_recov(int src_num, size_t bytes, int faila,
+			struct page **ptrs, struct async_submit_ctl *submit);
+
 void async_tx_quiesce(struct dma_async_tx_descriptor **tx);
 #endif /* _ASYNC_TX_H_ */


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations
  2009-05-19  1:00 ` [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations Dan Williams
@ 2009-05-22  8:29   ` Andre Noll
  2009-05-22 18:39     ` Dan Williams
       [not found]   ` <f12847240905250323q2e14efd6q69022a62cc7fd01f@mail.gmail.com>
  1 sibling, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-22  8:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1893 bytes --]

On Mon, May 18, 2009 at 06:00:12PM -0700, Dan Williams wrote:
> +/**
> + * async_raid6_2data_recov - asynchronously calculate two missing data blocks
> + * @disks: number of disks in the RAID-6 array
> + * @bytes: block size
> + * @faila: first failed drive index
> + * @failb: second failed drive index
> + * @blocks: array of source pointers where the last two entries are p and q
> + * @submit: submission/completion modifiers
> + */
> +struct dma_async_tx_descriptor *
> +async_raid6_2data_recov(int disks, size_t bytes, int faila, int failb,
> +			struct page **blocks, struct async_submit_ctl *submit)
> +{

[...]

> +	/* Dx = A*(P+Pxy) + B*(Q+Qxy) */
> +	srcs[0] = dp;
> +	srcs[1] = dq;
> +	coef[0] = raid6_gfexi[failb-faila];

Here it's essential that faila < failb. This should either be clearly
documented and checked for, or (better) the function should swap
faila and failb if they are in the wrong order.

> +	p = blocks[disks-2];
> +	q = blocks[disks-1];
> +
> +	/* Compute syndrome with zero for the missing data page
> +	   Use the dead data page as temporary storage for delta q */
> +	dq = blocks[faila];
> +	blocks[faila] = (void *)raid6_empty_zero_page;
> +	blocks[disks-1] = dq;
> +
> +	submit->flags &= ~ASYNC_TX_ACK;
> +	submit->cb_fn = NULL;
> +	submit->cb_param = NULL;
> +	tx = async_gen_syndrome(blocks, 0, disks, bytes, submit);
> +	submit->depend_tx = tx;
> +
> +	/* Restore pointer table */
> +	blocks[faila]   = dq;
> +	blocks[disks-1] = q;
> +
> +	/* Now, pick the proper data tables */
> +	coef = raid6_gfinv[raid6_gfexp[faila]];

Pick data tables? The formula for recovering the data block is

        (q + dq) / g^faila

(with g being the generator). What this line actually does is computing
g^{-faila}.

Regards
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations
  2009-05-22  8:29   ` Andre Noll
@ 2009-05-22 18:39     ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-22 18:39 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Fri, May 22, 2009 at 1:29 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Mon, May 18, 2009 at 06:00:12PM -0700, Dan Williams wrote:
>> +/**
>> + * async_raid6_2data_recov - asynchronously calculate two missing data blocks
>> + * @disks: number of disks in the RAID-6 array
>> + * @bytes: block size
>> + * @faila: first failed drive index
>> + * @failb: second failed drive index
>> + * @blocks: array of source pointers where the last two entries are p and q
>> + * @submit: submission/completion modifiers
>> + */
>> +struct dma_async_tx_descriptor *
>> +async_raid6_2data_recov(int disks, size_t bytes, int faila, int failb,
>> +                     struct page **blocks, struct async_submit_ctl *submit)
>> +{
>
> [...]
>
>> +     /* Dx = A*(P+Pxy) + B*(Q+Qxy) */
>> +     srcs[0] = dp;
>> +     srcs[1] = dq;
>> +     coef[0] = raid6_gfexi[failb-faila];
>
> Here it's essential that faila < failb. This should either be clearly
> documented and checked for, or (better) the function should swap
> faila and failb if they are in the wrong order.

Yes, it would be safer if we did not trust the caller to get this right.

>
>> +     p = blocks[disks-2];
>> +     q = blocks[disks-1];
>> +
>> +     /* Compute syndrome with zero for the missing data page
>> +        Use the dead data page as temporary storage for delta q */
>> +     dq = blocks[faila];
>> +     blocks[faila] = (void *)raid6_empty_zero_page;
>> +     blocks[disks-1] = dq;
>> +
>> +     submit->flags &= ~ASYNC_TX_ACK;
>> +     submit->cb_fn = NULL;
>> +     submit->cb_param = NULL;
>> +     tx = async_gen_syndrome(blocks, 0, disks, bytes, submit);
>> +     submit->depend_tx = tx;
>> +
>> +     /* Restore pointer table */
>> +     blocks[faila]   = dq;
>> +     blocks[disks-1] = q;
>> +
>> +     /* Now, pick the proper data tables */
>> +     coef = raid6_gfinv[raid6_gfexp[faila]];
>
> Pick data tables? The formula for recovering the data block is
>
>        (q + dq) / g^faila
>
> (with g being the generator). What this line actually does is computing
> g^{-faila}.

Yeah, that is a leftover comment from the original implementation,
I'll change it to be more informative.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250323q2e14efd6q69022a62cc7fd01f@mail.gmail.com>]

* RE: [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations
       [not found]   ` <f12847240905250323q2e14efd6q69022a62cc7fd01f@mail.gmail.com>
@ 2009-05-29 13:42     ` Sosnowski, Maciej
  0 siblings, 0 replies; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
>  async_raid6_2data_recov() recovers two data disk failures
> 
>  async_raid6_datap_recov() recovers a data disk and the P disk
> 
> These routines are a port of the synchronous versions found in
> drivers/md/raid6recov.c.  The primary difference is breaking out the xor
> operations into separate calls to async_xor.  Two helper routines are
> introduced to perform scalar multiplication where needed.
> async_sum_product() multiplies two sources by scalar coefficients and
> then sums (xor) the result.  async_mult() simply multiplies a single
> source by a scalar.
> 
> [ Impact: asynchronous raid6 recovery routines for 2data and datap cases ]
> 
> Cc: Yuri Tikhonov <yur@emcraft.com>
> Cc: Ilya Yanok <yanok@emcraft.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  crypto/async_tx/Kconfig             |    5 +
>  crypto/async_tx/Makefile            |    1
>  crypto/async_tx/async_raid6_recov.c |  292 +++++++++++++++++++++++++++++++++++
>  include/linux/async_tx.h            |    8 +
>  4 files changed, 306 insertions(+), 0 deletions(-)
>  create mode 100644 crypto/async_tx/async_raid6_recov.c

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
 
With a minor comment:

> +static struct dma_async_tx_descriptor *
> +async_mult(struct page *dest, struct page *src, u8 coef, size_t len,
> +          struct async_submit_ctl *submit)
> +{
> +       struct dma_chan *chan = async_tx_find_channel(submit, DMA_PQ,
> +                                                     &dest, 1, srcs, 2, len);
> +       struct dma_device *dma = chan ? chan->device : NULL;
> +       const u8 *qmul; /* Q multiplier table */
> +       u8 *d, *s;
> +
> +       if (dma) {
> +               dma_addr_t dma_dest[2];
> +               dma_addr_t dma_src[1];
> +               struct device *dev = dma->dev;
> +               struct dma_async_tx_descriptor *tx;
> +               enum dma_ctrl_flags dma_flags = DMA_PREP_PQ_DISABLE_P;
> +
> +               dma_dest[1] = dma_map_page(dev, dest, 0, len,
> DMA_BIDIRECTIONAL);
> +               dma_src[0] = dma_map_page(dev, src, 0, len, DMA_TO_DEVICE);
> +               tx = dma->device_prep_dma_pq(chan, dma_dest, dma_src, 1, &coef,
> +                                            len, dma_flags);
> +               if (tx) {
> +                       async_tx_submit(chan, tx, submit);
> +                       return tx;
> +               }
> +       }

How about adding "run the operation synchronously" comment at this point, just like it is in async_sum_product?

Maciej--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 10/11] dmatest: add pq support
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (8 preceding siblings ...)
  2009-05-19  1:00 ` [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations Dan Williams
@ 2009-05-19  1:00 ` Dan Williams
       [not found]   ` <f12847240905250324t1a55b757hc8cd06d6b9663efe@mail.gmail.com>
  2009-05-19  1:00 ` [PATCH v2 11/11] async_tx: raid6 recovery self test Dan Williams
  10 siblings, 1 reply; 45+ messages in thread
From: Dan Williams @ 2009-05-19  1:00 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

Test raid6 p+q operations with a simple "always multiply by 1" q
calculation to fit into dmatest's current destination verification
scheme.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dma/dmatest.c |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index a27c0fb..a5ee541 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -43,6 +43,11 @@ module_param(xor_sources, uint, S_IRUGO);
 MODULE_PARM_DESC(xor_sources,
 		"Number of xor source buffers (default: 3)");
 
+static unsigned int pq_sources = 3;
+module_param(pq_sources, uint, S_IRUGO);
+MODULE_PARM_DESC(pq_sources,
+		"Number of p+q source buffers (default: 3)");
+
 /*
  * Initialization patterns. All bytes in the source buffer has bit 7
  * set, all bytes in the destination buffer has bit 7 cleared.
@@ -227,6 +232,7 @@ static int dmatest_func(void *data)
 	dma_cookie_t		cookie;
 	enum dma_status		status;
 	enum dma_ctrl_flags 	flags;
+	u8			pq_coefs[pq_sources];
 	int			ret;
 	int			src_cnt;
 	int			dst_cnt;
@@ -243,6 +249,11 @@ static int dmatest_func(void *data)
 	else if (thread->type == DMA_XOR) {
 		src_cnt = xor_sources | 1; /* force odd to ensure dst = src */
 		dst_cnt = 1;
+	} else if (thread->type == DMA_PQ) {
+		src_cnt = pq_sources | 1; /* force odd to ensure dst = src */
+		dst_cnt = 2;
+		for (i = 0; i < pq_sources; i++)
+			pq_coefs[i] = 1;
 	} else
 		goto err_srcs;
 
@@ -310,6 +321,15 @@ static int dmatest_func(void *data)
 						      dma_dsts[0] + dst_off,
 						      dma_srcs, xor_sources,
 						      len, flags);
+		else if (thread->type == DMA_PQ) {
+			dma_addr_t dma_pq[dst_cnt];
+
+			for (i = 0; i < dst_cnt; i++)
+				dma_pq[i] = dma_dsts[i] + dst_off;
+			tx = dev->device_prep_dma_pq(chan, dma_pq, dma_srcs,
+						     pq_sources, pq_coefs,
+						     len, flags);
+		}
 
 		if (!tx) {
 			for (i = 0; i < src_cnt; i++)
@@ -446,6 +466,8 @@ static int dmatest_add_threads(struct dmatest_chan *dtc, enum dma_transaction_ty
 		op = "copy";
 	else if (type == DMA_XOR)
 		op = "xor";
+	else if (type == DMA_PQ)
+		op = "pq";
 	else
 		return -EINVAL;
 
@@ -501,6 +523,10 @@ static int dmatest_add_channel(struct dma_chan *chan)
 		cnt = dmatest_add_threads(dtc, DMA_XOR);
 		thread_count += cnt > 0 ?: 0;
 	}
+	if (dma_has_cap(DMA_PQ, dma_dev->cap_mask)) {
+		cnt = dmatest_add_threads(dtc, DMA_PQ);
+		thread_count += cnt > 0 ?: 0;
+	}
 
 	pr_info("dmatest: Started %u threads using %s\n",
 		thread_count, dma_chan_name(chan));


^ permalink raw reply related	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250324t1a55b757hc8cd06d6b9663efe@mail.gmail.com>]

* RE: [PATCH v2 10/11] dmatest: add pq support
       [not found]   ` <f12847240905250324t1a55b757hc8cd06d6b9663efe@mail.gmail.com>
@ 2009-05-29 13:42     ` Sosnowski, Maciej
  0 siblings, 0 replies; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> Test raid6 p+q operations with a simple "always multiply by 1" q
> calculation to fit into dmatest's current destination verification
> scheme.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/dma/dmatest.c |   26 ++++++++++++++++++++++++++
>  1 files changed, 26 insertions(+), 0 deletions(-)

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 11/11] async_tx: raid6 recovery self test
  2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
                   ` (9 preceding siblings ...)
  2009-05-19  1:00 ` [PATCH v2 10/11] dmatest: add pq support Dan Williams
@ 2009-05-19  1:00 ` Dan Williams
  2009-05-22  8:29   ` Andre Noll
       [not found]   ` <f12847240905250324k2b4a0c7as9ac9b084d3707ce5@mail.gmail.com>
  10 siblings, 2 replies; 45+ messages in thread
From: Dan Williams @ 2009-05-19  1:00 UTC (permalink / raw)
  To: neilb, linux-raid; +Cc: maan, linux-kernel, yur, hpa

Port drivers/md/raid6test/test.c to use the async raid6 recovery
routines.  This is meant as a unit test for raid6 acceleration drivers.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 crypto/async_tx/Makefile    |    1 
 crypto/async_tx/raid6test.c |  212 +++++++++++++++++++++++++++++++++++++++++++
 drivers/md/Kconfig          |   13 +++
 3 files changed, 226 insertions(+), 0 deletions(-)
 create mode 100644 crypto/async_tx/raid6test.c

diff --git a/crypto/async_tx/Makefile b/crypto/async_tx/Makefile
index 9a1a768..d1e0e6f 100644
--- a/crypto/async_tx/Makefile
+++ b/crypto/async_tx/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_ASYNC_MEMSET) += async_memset.o
 obj-$(CONFIG_ASYNC_XOR) += async_xor.o
 obj-$(CONFIG_ASYNC_PQ) += async_pq.o
 obj-$(CONFIG_ASYNC_RAID6_RECOV) += async_raid6_recov.o
+obj-$(CONFIG_ASYNC_RAID6_TEST) += raid6test.o
diff --git a/crypto/async_tx/raid6test.c b/crypto/async_tx/raid6test.c
new file mode 100644
index 0000000..9c16aeb
--- /dev/null
+++ b/crypto/async_tx/raid6test.c
@@ -0,0 +1,212 @@
+/*
+ * asynchronous raid6 recovery self test
+ * Copyright (c) 2009, Intel Corporation.
+ *
+ * based on drivers/md/raid6test/test.c:
+ * 	Copyright 2002-2007 H. Peter Anvin
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+#include <linux/async_tx.h>
+#include <linux/random.h>
+
+#undef pr
+#define pr(fmt, args...) pr_info("%s: " fmt, THIS_MODULE->name, ##args)
+
+#define NDISKS 16 /* Including P and Q */
+
+static struct page *dataptrs[NDISKS];
+static struct page *data[NDISKS+2];
+static struct page *recovi;
+static struct page *recovj;
+
+static void callback(void *param)
+{
+	struct completion *cmp = param;
+
+	complete(cmp);
+}
+
+static void makedata(void)
+{
+	int i, j;
+
+	for (i = 0; i < NDISKS; i++) {
+		for (j = 0; j < PAGE_SIZE/sizeof(u32); j += sizeof(u32)) {
+			u32 *p = page_address(data[i]) + j;
+
+			*p = random32();
+		}
+
+		dataptrs[i] = data[i];
+	}
+}
+
+static char disk_type(int d)
+{
+	switch (d) {
+	case NDISKS-2:
+		return 'P';
+	case NDISKS-1:
+		return 'Q';
+	default:
+		return 'D';
+	}
+}
+
+/* Recover two failed blocks. */
+static void raid6_dual_recov(int disks, size_t bytes, int faila, int failb, struct page **ptrs)
+{
+	struct async_submit_ctl submit;
+	addr_conv_t addr_conv[NDISKS];
+	struct completion cmp;
+	struct dma_async_tx_descriptor *tx = NULL;
+	enum sum_check_flags result = ~0;
+	bool dataq = false;
+
+	if (faila > failb)
+		swap(faila, failb);
+
+	init_async_submit(&submit, 0, NULL, NULL, NULL, addr_conv);
+	if (failb == disks-1) {
+		if (faila == disks-2) {
+			/* P+Q failure.  Just rebuild the syndrome. */
+			tx = async_gen_syndrome(ptrs, 0, disks, bytes, &submit);
+		} else {
+			/* data+Q failure.  Reconstruct data from P,
+			   then rebuild syndrome. */
+			/* NOT IMPLEMENTED - equivalent to RAID-5 */
+			dataq = true;
+		}
+	} else {
+		if (failb == disks-2) {
+			/* data+P failure. */
+			tx = async_raid6_datap_recov(disks, bytes, faila, ptrs, &submit);
+		} else {
+			/* data+data failure. */
+			tx = async_raid6_2data_recov(disks, bytes, faila, failb, ptrs, &submit);
+		}
+	}
+	init_completion(&cmp);
+	init_async_submit(&submit, ASYNC_TX_ACK, tx, callback, &cmp, addr_conv);
+	async_syndrome_val(ptrs, 0, disks, bytes, &result, &submit);
+
+	if (wait_for_completion_timeout(&cmp, msecs_to_jiffies(3000)) == 0)
+		pr("%s: timeout! (faila: %d failb: %d disks: %d)\n",
+		   __func__, faila, failb, disks);
+
+	if (!dataq && result != 0)
+		pr("%s: validation failure! faila: %d failb: %d sum_check_flags: %x\n",
+		   __func__, faila, failb, result);
+}
+
+static int test_disks(int i, int j)
+{
+	int erra, errb;
+
+	memset(page_address(recovi), 0xf0, PAGE_SIZE);
+	memset(page_address(recovj), 0xba, PAGE_SIZE);
+
+	dataptrs[i] = recovi;
+	dataptrs[j] = recovj;
+
+	raid6_dual_recov(NDISKS, PAGE_SIZE, i, j, dataptrs);
+
+	erra = memcmp(page_address(data[i]), page_address(recovi), PAGE_SIZE);
+	errb = memcmp(page_address(data[j]), page_address(recovj), PAGE_SIZE);
+
+	if (i < NDISKS-2 && j == NDISKS-1) {
+		/* We don't implement the DQ failure scenario, since it's
+		   equivalent to a RAID-5 failure (XOR, then recompute Q) */
+		erra = errb = 0;
+	} else {
+		pr("%s(%d, %d): faila=%3d(%c)  failb=%3d(%c)  %s\n",
+		   __func__, i, j,
+		   i, disk_type(i),
+		   j, disk_type(j),
+		   (!erra && !errb) ? "OK" :
+		   !erra ? "ERRB" :
+		   !errb ? "ERRA" : "ERRAB");
+	}
+
+	dataptrs[i] = data[i];
+	dataptrs[j] = data[j];
+
+	return erra || errb;
+}
+
+static int raid6_test(void)
+{
+	struct async_submit_ctl submit;
+	addr_conv_t addr_conv[NDISKS];
+	struct completion cmp;
+	int err = 0;
+	int tests = 0;
+	int i, j;
+
+	for (i = 0; i < NDISKS+2; i++) {
+		data[i] = alloc_page(GFP_KERNEL);
+		if (!data[i]) {
+			while (i--)
+				put_page(data[i]);
+			return -ENOMEM;
+		}
+	}
+	recovi = data[NDISKS];
+	recovj = data[NDISKS+1];
+
+	makedata();
+
+	/* Nuke syndromes */
+	memset(page_address(data[NDISKS-2]), 0xee, PAGE_SIZE);
+	memset(page_address(data[NDISKS-1]), 0xee, PAGE_SIZE);
+
+	/* Generate assumed good syndrome */
+	init_completion(&cmp);
+	init_async_submit(&submit, ASYNC_TX_ACK, NULL, callback, &cmp, addr_conv);
+	async_gen_syndrome(dataptrs, 0, NDISKS, PAGE_SIZE, &submit);
+
+	if (wait_for_completion_timeout(&cmp, msecs_to_jiffies(3000)) == 0) {
+		pr("error: initial gen_syndrome timed out\n");
+		goto out;
+	}
+
+	for (i = 0; i < NDISKS-1; i++)
+		for (j = i+1; j < NDISKS; j++) {
+			tests++;
+			err += test_disks(i, j);
+		}
+
+ out:
+	pr("\n");
+	pr("complete (%d tests, %d failure%s)\n",
+	   tests, err, err == 1 ? "" : "s");
+
+	for (i = 0; i < NDISKS+2; i++)
+		put_page(data[i]);
+
+	return 0;
+}
+
+static void raid6_test_exit(void)
+{
+}
+
+/* when compiled-in wait for drivers to load first */
+late_initcall(raid6_test);
+module_exit(raid6_test_exit);
+MODULE_AUTHOR("Dan Williams <dan.j.williams@intel.com>");
+MODULE_DESCRIPTION("asynchronous RAID-6 recovery self tests");
+MODULE_LICENSE("GPL");
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 36e0675..41b3ae2 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -155,6 +155,19 @@ config MD_RAID456
 config MD_RAID6_PQ
 	tristate
 
+config ASYNC_RAID6_TEST
+	tristate "Self test for hardware accelerated raid6 recovery"
+	depends on MD_RAID6_PQ
+	select ASYNC_RAID6_RECOV
+	---help---
+	  This is a one-shot self test that permutes through the
+	  recovery of all the possible two disk failure scenarios for a
+	  N-disk array.  Recovery is performed with the asynchronous
+	  raid6 recovery routines, and will optionally use an offload
+	  engine if one is available.
+
+	  If unsure, say N.
+
 config MD_MULTIPATH
 	tristate "Multipath I/O support"
 	depends on BLK_DEV_MD


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 11/11] async_tx: raid6 recovery self test
  2009-05-19  1:00 ` [PATCH v2 11/11] async_tx: raid6 recovery self test Dan Williams
@ 2009-05-22  8:29   ` Andre Noll
  2009-06-03 21:42     ` Dan Williams
       [not found]   ` <f12847240905250324k2b4a0c7as9ac9b084d3707ce5@mail.gmail.com>
  1 sibling, 1 reply; 45+ messages in thread
From: Andre Noll @ 2009-05-22  8:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: Neil Brown, linux-raid

[-- Attachment #1: Type: text/plain, Size: 2117 bytes --]

On Mon, May 18, 2009 at 06:00:22PM -0700, Dan Williams wrote:

> +static void makedata(void)
> +{
> +	int i, j;
> +
> +	for (i = 0; i < NDISKS; i++) {

i < NDISKS - 2 would be sufficient.

> +static char disk_type(int d)
> +{
> +	switch (d) {
> +	case NDISKS-2:
> +		return 'P';
> +	case NDISKS-1:
> +		return 'Q';
> +	default:
> +		return 'D';
> +	}
> +}

I like this function very much because "if (disk_type(faila) == 'Q')"
is so much more readable than "if (faila == num_disks - 1)". It's a
pity that we only have it in this test module ;)

> +/* Recover two failed blocks. */
> +static void raid6_dual_recov(int disks, size_t bytes, int faila, int failb, struct page **ptrs)

As disks is always NDISKS, the disks parameter could be removed.

> +{
> +	struct async_submit_ctl submit;
> +	addr_conv_t addr_conv[NDISKS];
> +	struct completion cmp;
> +	struct dma_async_tx_descriptor *tx = NULL;
> +	enum sum_check_flags result = ~0;
> +	bool dataq = false;
> +
> +	if (faila > failb)
> +		swap(faila, failb);
> +
> +	init_async_submit(&submit, 0, NULL, NULL, NULL, addr_conv);
> +	if (failb == disks-1) {

if (disk_type(failb) == 'Q'). Similar for the other tests in this function.

> +	/* Generate assumed good syndrome */
> +	init_completion(&cmp);
> +	init_async_submit(&submit, ASYNC_TX_ACK, NULL, callback, &cmp, addr_conv);
> +	async_gen_syndrome(dataptrs, 0, NDISKS, PAGE_SIZE, &submit);

How hard would it be to also test the fallback code for the synchronous
paths? AFAICS this test module wouldn't notice errors in the fallback
logic.

> +config ASYNC_RAID6_TEST
> +	tristate "Self test for hardware accelerated raid6 recovery"
> +	depends on MD_RAID6_PQ
> +	select ASYNC_RAID6_RECOV
> +	---help---
> +	  This is a one-shot self test that permutes through the
> +	  recovery of all the possible two disk failure scenarios for a
> +	  N-disk array.  Recovery is performed with the asynchronous

Currently N is 16. So s/N-disk/raid6 perhaps?

Regards
Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 11/11] async_tx: raid6 recovery self test
  2009-05-22  8:29   ` Andre Noll
@ 2009-06-03 21:42     ` Dan Williams
  0 siblings, 0 replies; 45+ messages in thread
From: Dan Williams @ 2009-06-03 21:42 UTC (permalink / raw)
  To: Andre Noll; +Cc: Neil Brown, linux-raid

On Fri, May 22, 2009 at 1:29 AM, Andre Noll <maan@systemlinux.org> wrote:
> On Mon, May 18, 2009 at 06:00:22PM -0700, Dan Williams wrote:
>> +static char disk_type(int d)
>> +{
>> +     switch (d) {
>> +     case NDISKS-2:
>> +             return 'P';
>> +     case NDISKS-1:
>> +             return 'Q';
>> +     default:
>> +             return 'D';
>> +     }
>> +}
>
> I like this function very much because "if (disk_type(faila) == 'Q')"
> is so much more readable than "if (faila == num_disks - 1)". It's a
> pity that we only have it in this test module ;)

I do not see too many places outside this file where it would be
helpful... but an incremental patch for this cleanup would be welcome.
 In general this code tries to duplicate the look of
drivers/md/raid6test/test.c, so any cleanups would be applicable in
both locations.

> How hard would it be to also test the fallback code for the synchronous
> paths? AFAICS this test module wouldn't notice errors in the fallback
> logic.

It does.  Simply omit loading an offload driver and the api will by
necessity fallback to the synchronous path.

Thanks,
Dan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

[parent not found: <f12847240905250324k2b4a0c7as9ac9b084d3707ce5@mail.gmail.com>]

* RE: [PATCH v2 11/11] async_tx: raid6 recovery self test
       [not found]   ` <f12847240905250324k2b4a0c7as9ac9b084d3707ce5@mail.gmail.com>
@ 2009-05-29 13:42     ` Sosnowski, Maciej
  0 siblings, 0 replies; 45+ messages in thread
From: Sosnowski, Maciej @ 2009-05-29 13:42 UTC (permalink / raw)
  To: Williams, Dan J
  Cc: neilb@suse.de, linux-raid@vger.kernel.org, maan@systemlinux.org,
	linux-kernel@vger.kernel.org, yur@emcraft.com, hpa@zytor.com

Dan Williams wrote:
> Port drivers/md/raid6test/test.c to use the async raid6 recovery
> routines.  This is meant as a unit test for raid6 acceleration drivers.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  crypto/async_tx/Makefile    |    1
>  crypto/async_tx/raid6test.c |  212 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/md/Kconfig          |   13 +++
>  3 files changed, 226 insertions(+), 0 deletions(-)
>  create mode 100644 crypto/async_tx/raid6test.c

Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2009-06-08 17:25 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-19  0:59 [PATCH v2 00/11] Asynchronous raid6 acceleration (part 1 of 3) Dan Williams
2009-05-19  0:59 ` [PATCH v2 01/11] async_tx: rename zero_sum to val Dan Williams
     [not found]   ` <f12847240905200110x63b22601idbbdf3369984fa9a@mail.gmail.com>
2009-05-29 13:41     ` Sosnowski, Maciej
2009-06-03 18:12       ` Dan Williams
2009-05-19  0:59 ` [PATCH v2 02/11] async_tx: kill ASYNC_TX_DEP_ACK flag Dan Williams
     [not found]   ` <f12847240905250320w74897dabo6576b4b48bd19c0c@mail.gmail.com>
2009-05-29 13:41     ` Sosnowski, Maciej
2009-06-03 18:42       ` Dan Williams
2009-05-19  0:59 ` [PATCH v2 03/11] async_tx: structify submission arguments, add scribble Dan Williams
2009-05-20  8:06   ` Andre Noll
2009-05-20 18:19     ` Dan Williams
     [not found]   ` <f12847240905250321v774c4e8dscd7a466cd2e61168@mail.gmail.com>
2009-05-29 13:41     ` Sosnowski, Maciej
2009-06-03 19:05       ` Dan Williams
2009-05-19  0:59 ` [PATCH v2 04/11] async_xor: permit callers to pass in a 'dma/page scribble' region Dan Williams
2009-05-20  8:08   ` Andre Noll
2009-05-20 18:35     ` Dan Williams
2009-05-20 19:09       ` Andre Noll
2009-05-22  8:29         ` Andre Noll
2009-05-22 17:25           ` Dan Williams
2009-05-25  7:55             ` Andre Noll
     [not found]   ` <f12847240905250320w523fc657w3bca47f23442f46e@mail.gmail.com>
2009-05-29 13:41     ` Sosnowski, Maciej
2009-05-19  0:59 ` [PATCH v2 05/11] md/raid5: add scribble region for buffer lists Dan Williams
2009-05-20  8:09   ` Andre Noll
2009-05-20 19:05     ` Dan Williams
2009-06-04  6:11   ` Neil Brown
2009-06-05 19:19     ` Dan Williams
2009-06-08 17:25       ` Jody McIntyre
2009-05-19  0:59 ` [PATCH v2 06/11] async_tx: add sum check flags Dan Williams
     [not found]   ` <f12847240905200111p54382735v6941b52825cf4d7e@mail.gmail.com>
2009-05-29 13:41     ` Sosnowski, Maciej
2009-05-19  1:00 ` [PATCH v2 07/11] async_tx: kill needless module_{init|exit} Dan Williams
     [not found]   ` <f12847240905250323o21113fb9xbc4c16eea07b215@mail.gmail.com>
2009-05-29 13:42     ` Sosnowski, Maciej
2009-05-19  1:00 ` [PATCH v2 08/11] async_tx: add support for asynchronous GF multiplication Dan Williams
2009-05-22  8:29   ` Andre Noll
2009-06-03 22:11     ` Dan Williams
     [not found]   ` <f12847240905200111q37457b29lb9e30879e251888@mail.gmail.com>
2009-05-29 13:42     ` Sosnowski, Maciej
2009-06-03 22:16       ` Dan Williams
2009-05-19  1:00 ` [PATCH v2 09/11] async_tx: add support for asynchronous RAID6 recovery operations Dan Williams
2009-05-22  8:29   ` Andre Noll
2009-05-22 18:39     ` Dan Williams
     [not found]   ` <f12847240905250323q2e14efd6q69022a62cc7fd01f@mail.gmail.com>
2009-05-29 13:42     ` Sosnowski, Maciej
2009-05-19  1:00 ` [PATCH v2 10/11] dmatest: add pq support Dan Williams
     [not found]   ` <f12847240905250324t1a55b757hc8cd06d6b9663efe@mail.gmail.com>
2009-05-29 13:42     ` Sosnowski, Maciej
2009-05-19  1:00 ` [PATCH v2 11/11] async_tx: raid6 recovery self test Dan Williams
2009-05-22  8:29   ` Andre Noll
2009-06-03 21:42     ` Dan Williams
     [not found]   ` <f12847240905250324k2b4a0c7as9ac9b084d3707ce5@mail.gmail.com>
2009-05-29 13:42     ` Sosnowski, Maciej

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).