public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
@ 2015-05-07 12:57 Maxime Ripard
  2015-05-07 14:39 ` AW: " Markus Stockhausen
  2015-05-11  6:26 ` Shaohua Li
  0 siblings, 2 replies; 7+ messages in thread
From: Maxime Ripard @ 2015-05-07 12:57 UTC (permalink / raw)
  To: Neil Brown, Shaohua Li
  Cc: linux-raid, linux-kernel, Lior Amsalem, Thomas Petazzoni,
	Gregory Clement, Boris Brezillon

[-- Attachment #1: Type: text/plain, Size: 1318 bytes --]

Hi,

I'm currently trying to add support for the PQ operations on the
marvell XOR engine, in dmaengine, obviously to be able to use async_tx
to offload these operations.

I'm testing these patches with a RAID6 array with 4 disks.

However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
stripe write", every write to that array fails with the following
stacktrace.

http://code.bulix.org/eh8iew-88342?raw

It seems to be generated by that warning here:

http://lxr.free-electrons.com/source/crypto/async_tx/async_tx.c#L173

And indeed, if we dump the status of depend_tx here, it's already been
acked.

That doesn't happen if ASYNC_TX_DMA is disabled, hence using the
software version of it, instead of relying on our XOR engine. It
doesn't happen on any commit prior to the one mentionned above, with
the exact same changes applied. These changes are meant to be
contributed, so I can definitely push them somewhere if needed.

I don't really know where to look for though, the change that is
causing this is probably the change in ops_run_reconstruct6, but I'm
not sure that this partial revert alone would work with regard to the
rest of the patch.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* AW: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
  2015-05-07 12:57 Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1 Maxime Ripard
@ 2015-05-07 14:39 ` Markus Stockhausen
  2015-05-11  9:13   ` Maxime Ripard
  2015-05-11  6:26 ` Shaohua Li
  1 sibling, 1 reply; 7+ messages in thread
From: Markus Stockhausen @ 2015-05-07 14:39 UTC (permalink / raw)
  To: Maxime Ripard, Neil Brown, Shaohua Li
  Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	Lior Amsalem, Thomas Petazzoni, Gregory Clement, Boris Brezillon

[-- Attachment #1: Type: text/plain, Size: 2636 bytes --]

Hi Maxime,

> Von: linux-raid-owner@vger.kernel.org [linux-raid-owner@vger.kernel.org]" im Auftrag von "Maxime Ripard [maxime.ripard@free-electrons.com]
> Gesendet: Donnerstag, 7. Mai 2015 14:57
> An: Neil Brown; Shaohua Li
> Cc: linux-raid@vger.kernel.org; linux-kernel@vger.kernel.org; Lior Amsalem; Thomas Petazzoni; Gregory Clement; Boris Brezillon
> Betreff: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
> 
> Hi,
> 
> I'm currently trying to add support for the PQ operations on the
> marvell XOR engine, in dmaengine, obviously to be able to use async_tx
> to offload these operations.
> 
> I'm testing these patches with a RAID6 array with 4 disks.
> 
> However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
> stripe write", every write to that array fails with the following
> stacktrace.
> 
> http://code.bulix.org/eh8iew-88342?raw

I don't know if it might be related. I added support for RAID6 Read-Modify-Write
in software XOR with some patches. The following commit mangles some lines in 
async_pq.c:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?
id=584acdd49cd2472ca0f5a06adbe979db82d0b4af

I introduced a new flag ASYNC_TX_PQ_XOR_DST that notifies the async layer
that we want to do a XOR syndrome operation instead of a full calculation.
This will enforce the software path because I guessed that hardware does not
support that case. Without hardware to check I might have missed some 
checks in the async layer.

In the upper layer ops_run_reconstruct6 will set the flag if we determined
that rmw is faster than rcw.

Can you check if rmw_level=0 fixes the issue. See:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?
id=d06f191f8ecaef4d524e765fdb455f96392fbd42

> It seems to be generated by that warning here:
> 
> http://lxr.free-electrons.com/source/crypto/async_tx/async_tx.c#L173
> 
> And indeed, if we dump the status of depend_tx here, it's already been
> acked.
> 
> That doesn't happen if ASYNC_TX_DMA is disabled, hence using the
> software version of it, instead of relying on our XOR engine. It
> doesn't happen on any commit prior to the one mentionned above, with
> the exact same changes applied. These changes are meant to be
> contributed, so I can definitely push them somewhere if needed.
> 
> I don't really know where to look for though, the change that is
> causing this is probably the change in ops_run_reconstruct6, but I'm
> not sure that this partial revert alone would work with regard to the
> rest of the patch.
> 
> Maxime

Markus
=

[-- Attachment #2: InterScan_Disclaimer.txt --]
[-- Type: text/plain, Size: 1650 bytes --]

****************************************************************************
Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail
irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und
vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die unbefugte
Weitergabe dieser Mail ist nicht gestattet.

Über das Internet versandte E-Mails können unter fremden Namen erstellt oder
manipuliert werden. Deshalb ist diese als E-Mail verschickte Nachricht keine
rechtsverbindliche Willenserklärung.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

Vorstand:
Kadir Akin
Dr. Michael Höhnerbach

Vorsitzender des Aufsichtsrates:
Hans Kristian Langva

Registergericht: Amtsgericht Köln
Registernummer: HRB 52 497

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497

****************************************************************************

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
  2015-05-07 12:57 Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1 Maxime Ripard
  2015-05-07 14:39 ` AW: " Markus Stockhausen
@ 2015-05-11  6:26 ` Shaohua Li
  2015-05-12 12:55   ` Maxime Ripard
  1 sibling, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2015-05-11  6:26 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Neil Brown, Shaohua Li, linux-raid, linux-kernel, Lior Amsalem,
	Thomas Petazzoni, Gregory Clement, Boris Brezillon

On Thu, May 07, 2015 at 02:57:02PM +0200, Maxime Ripard wrote:
> Hi,
> 
> I'm currently trying to add support for the PQ operations on the
> marvell XOR engine, in dmaengine, obviously to be able to use async_tx
> to offload these operations.
> 
> I'm testing these patches with a RAID6 array with 4 disks.
> 
> However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
> stripe write", every write to that array fails with the following
> stacktrace.
> 
> http://code.bulix.org/eh8iew-88342?raw
> 
> It seems to be generated by that warning here:
> 
> http://lxr.free-electrons.com/source/crypto/async_tx/async_tx.c#L173
> 
> And indeed, if we dump the status of depend_tx here, it's already been
> acked.
> 
> That doesn't happen if ASYNC_TX_DMA is disabled, hence using the
> software version of it, instead of relying on our XOR engine. It
> doesn't happen on any commit prior to the one mentionned above, with
> the exact same changes applied. These changes are meant to be
> contributed, so I can definitely push them somewhere if needed.
> 
> I don't really know where to look for though, the change that is
> causing this is probably the change in ops_run_reconstruct6, but I'm
> not sure that this partial revert alone would work with regard to the
> rest of the patch.

I don't have a machine with dmaengine, it's likely there is error in this side.
Could you please make stripe_can_batch() returns false always and check if the
error disappear? This should narrow down if it's related to batch issue.

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
  2015-05-07 14:39 ` AW: " Markus Stockhausen
@ 2015-05-11  9:13   ` Maxime Ripard
  0 siblings, 0 replies; 7+ messages in thread
From: Maxime Ripard @ 2015-05-11  9:13 UTC (permalink / raw)
  To: Markus Stockhausen
  Cc: Neil Brown, Shaohua Li, linux-raid@vger.kernel.org,
	linux-kernel@vger.kernel.org, Lior Amsalem, Thomas Petazzoni,
	Gregory Clement, Boris Brezillon

[-- Attachment #1: Type: text/plain, Size: 2666 bytes --]

Hi Markus,

On Thu, May 07, 2015 at 02:39:07PM +0000, Markus Stockhausen wrote:
> Hi Maxime,
> 
> > Von: linux-raid-owner@vger.kernel.org [linux-raid-owner@vger.kernel.org]" im Auftrag von "Maxime Ripard [maxime.ripard@free-electrons.com]
> > Gesendet: Donnerstag, 7. Mai 2015 14:57
> > An: Neil Brown; Shaohua Li
> > Cc: linux-raid@vger.kernel.org; linux-kernel@vger.kernel.org; Lior Amsalem; Thomas Petazzoni; Gregory Clement; Boris Brezillon
> > Betreff: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
> > 
> > Hi,
> > 
> > I'm currently trying to add support for the PQ operations on the
> > marvell XOR engine, in dmaengine, obviously to be able to use async_tx
> > to offload these operations.
> > 
> > I'm testing these patches with a RAID6 array with 4 disks.
> > 
> > However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
> > stripe write", every write to that array fails with the following
> > stacktrace.
> > 
> > http://code.bulix.org/eh8iew-88342?raw
> 
> I don't know if it might be related. I added support for RAID6 Read-Modify-Write
> in software XOR with some patches. The following commit mangles some lines in 
> async_pq.c:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?
> id=584acdd49cd2472ca0f5a06adbe979db82d0b4af
> 
> I introduced a new flag ASYNC_TX_PQ_XOR_DST that notifies the async layer
> that we want to do a XOR syndrome operation instead of a full calculation.
> This will enforce the software path because I guessed that hardware does not
> support that case. Without hardware to check I might have missed some 
> checks in the async layer.
> 
> In the upper layer ops_run_reconstruct6 will set the flag if we determined
> that rmw is faster than rcw.
> 
> Can you check if rmw_level=0 fixes the issue. See:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?
> id=d06f191f8ecaef4d524e765fdb455f96392fbd42

I just gave this a try, and it doesn't fix anything.

One thing I forgot to mention is that our hardware doesn't support the
PQ multiplications and product sums, so one of the patches we have is
to add a new ASYNC_TX flag to be able to identify and bail out of such
transfers.

The patch is here:
https://github.com/MISL-EBU-System-SW/mainline-public/commit/9964fe4a79da10162f83bd527b3fe44da60d7e0f

There might be some interaction between your patch and this one, even
though the async_tx code itself looks to be untouched by your patches.

Thanks!
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
  2015-05-12 12:55   ` Maxime Ripard
@ 2015-05-12 10:59     ` Shaohua Li
  2015-05-13  7:46       ` Maxime Ripard
  0 siblings, 1 reply; 7+ messages in thread
From: Shaohua Li @ 2015-05-12 10:59 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Neil Brown, linux-raid, linux-kernel, Lior Amsalem,
	Thomas Petazzoni, Gregory Clement, Boris Brezillon

On Tue, May 12, 2015 at 02:55:46PM +0200, Maxime Ripard wrote:
> Hi Shaohua,
> 
> On Sun, May 10, 2015 at 11:26:38PM -0700, Shaohua Li wrote:
> > On Thu, May 07, 2015 at 02:57:02PM +0200, Maxime Ripard wrote:
> > > Hi,
> > > 
> > > I'm currently trying to add support for the PQ operations on the
> > > marvell XOR engine, in dmaengine, obviously to be able to use async_tx
> > > to offload these operations.
> > > 
> > > I'm testing these patches with a RAID6 array with 4 disks.
> > > 
> > > However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
> > > stripe write", every write to that array fails with the following
> > > stacktrace.
> > > 
> > > http://code.bulix.org/eh8iew-88342?raw
> > > 
> > > It seems to be generated by that warning here:
> > > 
> > > http://lxr.free-electrons.com/source/crypto/async_tx/async_tx.c#L173
> > > 
> > > And indeed, if we dump the status of depend_tx here, it's already been
> > > acked.
> > > 
> > > That doesn't happen if ASYNC_TX_DMA is disabled, hence using the
> > > software version of it, instead of relying on our XOR engine. It
> > > doesn't happen on any commit prior to the one mentionned above, with
> > > the exact same changes applied. These changes are meant to be
> > > contributed, so I can definitely push them somewhere if needed.
> > > 
> > > I don't really know where to look for though, the change that is
> > > causing this is probably the change in ops_run_reconstruct6, but I'm
> > > not sure that this partial revert alone would work with regard to the
> > > rest of the patch.
> > 
> > I don't have a machine with dmaengine, it's likely there is error in this side.
> > Could you please make stripe_can_batch() returns false always and check if the
> > error disappear? This should narrow down if it's related to batch issue.
> 
> The error indeed disappears if stripe_can_batch always returns false.

Does this fix it?


diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 77dfd72..5e820fc 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1825,7 +1825,7 @@ ops_run_reconstruct6(struct stripe_head *sh, struct raid5_percpu *percpu,
 	} else
 		init_async_submit(&submit, 0, tx, NULL, NULL,
 				  to_addr_conv(sh, percpu, j));
-	async_gen_syndrome(blocks, 0, count+2, STRIPE_SIZE,  &submit);
+	tx = async_gen_syndrome(blocks, 0, count+2, STRIPE_SIZE,  &submit);
 	if (!last_stripe) {
 		j++;
 		sh = list_first_entry(&sh->batch_list, struct stripe_head,

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
  2015-05-11  6:26 ` Shaohua Li
@ 2015-05-12 12:55   ` Maxime Ripard
  2015-05-12 10:59     ` Shaohua Li
  0 siblings, 1 reply; 7+ messages in thread
From: Maxime Ripard @ 2015-05-12 12:55 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Neil Brown, linux-raid, linux-kernel, Lior Amsalem,
	Thomas Petazzoni, Gregory Clement, Boris Brezillon

[-- Attachment #1: Type: text/plain, Size: 1902 bytes --]

Hi Shaohua,

On Sun, May 10, 2015 at 11:26:38PM -0700, Shaohua Li wrote:
> On Thu, May 07, 2015 at 02:57:02PM +0200, Maxime Ripard wrote:
> > Hi,
> > 
> > I'm currently trying to add support for the PQ operations on the
> > marvell XOR engine, in dmaengine, obviously to be able to use async_tx
> > to offload these operations.
> > 
> > I'm testing these patches with a RAID6 array with 4 disks.
> > 
> > However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
> > stripe write", every write to that array fails with the following
> > stacktrace.
> > 
> > http://code.bulix.org/eh8iew-88342?raw
> > 
> > It seems to be generated by that warning here:
> > 
> > http://lxr.free-electrons.com/source/crypto/async_tx/async_tx.c#L173
> > 
> > And indeed, if we dump the status of depend_tx here, it's already been
> > acked.
> > 
> > That doesn't happen if ASYNC_TX_DMA is disabled, hence using the
> > software version of it, instead of relying on our XOR engine. It
> > doesn't happen on any commit prior to the one mentionned above, with
> > the exact same changes applied. These changes are meant to be
> > contributed, so I can definitely push them somewhere if needed.
> > 
> > I don't really know where to look for though, the change that is
> > causing this is probably the change in ops_run_reconstruct6, but I'm
> > not sure that this partial revert alone would work with regard to the
> > rest of the patch.
> 
> I don't have a machine with dmaengine, it's likely there is error in this side.
> Could you please make stripe_can_batch() returns false always and check if the
> error disappear? This should narrow down if it's related to batch issue.

The error indeed disappears if stripe_can_batch always returns false.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1
  2015-05-12 10:59     ` Shaohua Li
@ 2015-05-13  7:46       ` Maxime Ripard
  0 siblings, 0 replies; 7+ messages in thread
From: Maxime Ripard @ 2015-05-13  7:46 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Neil Brown, linux-raid, linux-kernel, Lior Amsalem,
	Thomas Petazzoni, Gregory Clement, Boris Brezillon

[-- Attachment #1: Type: text/plain, Size: 2902 bytes --]

Hi,

On Tue, May 12, 2015 at 03:59:07AM -0700, Shaohua Li wrote:
> On Tue, May 12, 2015 at 02:55:46PM +0200, Maxime Ripard wrote:
> > Hi Shaohua,
> > 
> > On Sun, May 10, 2015 at 11:26:38PM -0700, Shaohua Li wrote:
> > > On Thu, May 07, 2015 at 02:57:02PM +0200, Maxime Ripard wrote:
> > > > Hi,
> > > > 
> > > > I'm currently trying to add support for the PQ operations on the
> > > > marvell XOR engine, in dmaengine, obviously to be able to use async_tx
> > > > to offload these operations.
> > > > 
> > > > I'm testing these patches with a RAID6 array with 4 disks.
> > > > 
> > > > However, since the commit 59fc630b8b5f ("RAID5: batch adjacent full
> > > > stripe write", every write to that array fails with the following
> > > > stacktrace.
> > > > 
> > > > http://code.bulix.org/eh8iew-88342?raw
> > > > 
> > > > It seems to be generated by that warning here:
> > > > 
> > > > http://lxr.free-electrons.com/source/crypto/async_tx/async_tx.c#L173
> > > > 
> > > > And indeed, if we dump the status of depend_tx here, it's already been
> > > > acked.
> > > > 
> > > > That doesn't happen if ASYNC_TX_DMA is disabled, hence using the
> > > > software version of it, instead of relying on our XOR engine. It
> > > > doesn't happen on any commit prior to the one mentionned above, with
> > > > the exact same changes applied. These changes are meant to be
> > > > contributed, so I can definitely push them somewhere if needed.
> > > > 
> > > > I don't really know where to look for though, the change that is
> > > > causing this is probably the change in ops_run_reconstruct6, but I'm
> > > > not sure that this partial revert alone would work with regard to the
> > > > rest of the patch.
> > > 
> > > I don't have a machine with dmaengine, it's likely there is error in this side.
> > > Could you please make stripe_can_batch() returns false always and check if the
> > > error disappear? This should narrow down if it's related to batch issue.
> > 
> > The error indeed disappears if stripe_can_batch always returns false.
> 
> Does this fix it?
> 
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 77dfd72..5e820fc 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -1825,7 +1825,7 @@ ops_run_reconstruct6(struct stripe_head *sh, struct raid5_percpu *percpu,
>  	} else
>  		init_async_submit(&submit, 0, tx, NULL, NULL,
>  				  to_addr_conv(sh, percpu, j));
> -	async_gen_syndrome(blocks, 0, count+2, STRIPE_SIZE,  &submit);
> +	tx = async_gen_syndrome(blocks, 0, count+2, STRIPE_SIZE,  &submit);
>  	if (!last_stripe) {
>  		j++;
>  		sh = list_first_entry(&sh->batch_list, struct stripe_head,

It does, thanks!

Feel free to add my Tested-by if you submit this patch.

Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-05-13  7:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-07 12:57 Possible RAID6 regression with ASYNC_TX_DMA enabled in 4.1 Maxime Ripard
2015-05-07 14:39 ` AW: " Markus Stockhausen
2015-05-11  9:13   ` Maxime Ripard
2015-05-11  6:26 ` Shaohua Li
2015-05-12 12:55   ` Maxime Ripard
2015-05-12 10:59     ` Shaohua Li
2015-05-13  7:46       ` Maxime Ripard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox