linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
@ 2011-11-22  4:55 b29237
  2011-11-22 18:59 ` Ira W. Snyder
  0 siblings, 1 reply; 14+ messages in thread
From: b29237 @ 2011-11-22  4:55 UTC (permalink / raw)
  To: dan.j.williams, leoli, zw, vinod.koul
  Cc: Forrest Shi, linuxppc-dev, linux-kernel

From: Forrest Shi <b29237@freescale.com>

    dma status check function fsl_tx_status is heavily called in
    a tight loop and the desc lock in fsl_tx_status contended by
    the dma status update function. this caused the dma performance
    degrades much.

    this patch releases the lock in the fsl_tx_status function.
    I believe it has no neglect impact on the following call of
    dma_async_is_complete(...).

    we can see below three conditions will be identified as success
    a)  x < complete < use
    b)  x < complete+N < use+N
    c)  x < complete < use+N
    here complete is the completed_cookie, use is the last_used
    cookie, x is the querying cookie, N is MAX cookie

    when chan->completed_cookie is being read, the last_used may
    be incresed. Anyway it has no neglect impact on the dma status
    decision.

    Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
---
 drivers/dma/fsldma.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8a78154..1dca56f 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -986,15 +986,10 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
 	struct fsldma_chan *chan = to_fsl_chan(dchan);
 	dma_cookie_t last_complete;
 	dma_cookie_t last_used;
-	unsigned long flags;
-
-	spin_lock_irqsave(&chan->desc_lock, flags);
 
 	last_complete = chan->completed_cookie;
 	last_used = dchan->cookie;
 
-	spin_unlock_irqrestore(&chan->desc_lock, flags);
-
 	dma_set_tx_state(txstate, last_complete, last_used, 0);
 	return dma_async_is_complete(cookie, last_complete, last_used);
 }
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-22  4:55 [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use b29237
@ 2011-11-22 18:59 ` Ira W. Snyder
  2011-11-24  8:12   ` Shi Xuelin-B29237
  0 siblings, 1 reply; 14+ messages in thread
From: Ira W. Snyder @ 2011-11-22 18:59 UTC (permalink / raw)
  To: b29237; +Cc: vinod.koul, linux-kernel, zw, dan.j.williams, linuxppc-dev

On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29237@freescale.com wrote:
> From: Forrest Shi <b29237@freescale.com>
> 
>     dma status check function fsl_tx_status is heavily called in
>     a tight loop and the desc lock in fsl_tx_status contended by
>     the dma status update function. this caused the dma performance
>     degrades much.
> 
>     this patch releases the lock in the fsl_tx_status function.
>     I believe it has no neglect impact on the following call of
>     dma_async_is_complete(...).
> 
>     we can see below three conditions will be identified as success
>     a)  x < complete < use
>     b)  x < complete+N < use+N
>     c)  x < complete < use+N
>     here complete is the completed_cookie, use is the last_used
>     cookie, x is the querying cookie, N is MAX cookie
> 
>     when chan->completed_cookie is being read, the last_used may
>     be incresed. Anyway it has no neglect impact on the dma status
>     decision.
> 
>     Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
> ---
>  drivers/dma/fsldma.c |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
> index 8a78154..1dca56f 100644
> --- a/drivers/dma/fsldma.c
> +++ b/drivers/dma/fsldma.c
> @@ -986,15 +986,10 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
>  	struct fsldma_chan *chan = to_fsl_chan(dchan);
>  	dma_cookie_t last_complete;
>  	dma_cookie_t last_used;
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&chan->desc_lock, flags);
>  

This will cause a bug. See below for a detailed explanation. You need
this instead:

	/*
	 * On an SMP system, we must ensure that this CPU has seen the
	 * memory accesses performed by another CPU under the
	 * chan->desc_lock spinlock.
	 */
	smp_mb();
>  	last_complete = chan->completed_cookie;
>  	last_used = dchan->cookie;
>  
> -	spin_unlock_irqrestore(&chan->desc_lock, flags);
> -
>  	dma_set_tx_state(txstate, last_complete, last_used, 0);
>  	return dma_async_is_complete(cookie, last_complete, last_used);
>  }

Facts:
- dchan->cookie is the same member as chan->common.cookie (same memory location)
- chan->common.cookie is the "last allocated cookie for a pending transaction"
- chan->completed_cookie is the "last completed transaction"

I have replaced "dchan->cookie" with "chan->common.cookie" in the below
explanation, to keep everything referenced from the same structure.

Variable usage before your change. Everything is used locked.
- RW chan->common.cookie		(fsl_dma_tx_submit)
- R  chan->common.cookie		(fsl_tx_status)
- R  chan->completed_cookie		(fsl_tx_status)
- W  chan->completed_cookie		(dma_do_tasklet)

Variable usage after your change:
- RW chan->common.cookie		LOCKED
- R  chan->common.cookie		NO LOCK
- R  chan->completed_cookie		NO LOCK
- W  chan->completed_cookie             LOCKED

What if we assume that you have a 2 CPU system (such as a P2020). After
your changes, one possible sequence is:

=== CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() ===
spin_lock_irqsave
descriptor->cookie = 20		(x in your example)
chan->common.cookie = 20	(used in your example)
spin_unlock_irqrestore

=== CPU2 - immediately calls fsl_tx_status() ===
chan->common.cookie == 19
chan->completed_cookie == 19
descriptor->cookie == 20

Since we don't have locks anymore, CPU2 may not have seen the write to
chan->common.cookie yet.

Also assume that the DMA hardware has not started processing the
transaction yet. Therefore dma_do_tasklet() has not been called, and
chan->completed_cookie has not been updated.

In this case, dma_async_is_complete() (on CPU2) returns DMA_SUCCESS,
even though the DMA operation has not succeeded. The DMA operation has
not even started yet!

The smp_mb() fixes this, since it forces CPU2 to have seen all memory
operations that happened before CPU1 released the spinlock. Spinlocks
are implicit SMP memory barriers.

Therefore, the above example becomes:
smp_mb();
chan->common.cookie == 20
chan->completed_cookie == 19
descriptor->cookie == 20

Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct.

Thanks,
Ira

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-22 18:59 ` Ira W. Snyder
@ 2011-11-24  8:12   ` Shi Xuelin-B29237
  2011-11-28 16:38     ` Ira W. Snyder
  0 siblings, 1 reply; 14+ messages in thread
From: Shi Xuelin-B29237 @ 2011-11-24  8:12 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	linux-kernel@vger.kernel.org

Hi Ira,

Thanks for your review.

After second thought, I think your scenario may not occur.
Because the cookie 20 we query must be returned by fsl_dma_tx_submit(...) i=
n practice.=20
We never query a cookie not returned by fsl_dma_tx_submit(...).

When we call fsl_tx_status(20), the chan->common.cookie is definitely wrote=
 as 20 and cpu2 could not read as 19.

How do you think?=20

Happy Thanks Giving day,
Forrest

-----Original Message-----
From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]=20
Sent: 2011=1B$BG/=1B(B11=1B$B7n=1B(B23=1B$BF|=1B(B 2:59
To: Shi Xuelin-B29237
Cc: dan.j.williams@intel.com; Li Yang-R58472; zw@zh-kernel.org; vinod.koul@=
intel.com; linuxppc-dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing=
 spinlock use.

On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29237@freescale.com wrote:
> From: Forrest Shi <b29237@freescale.com>
>=20
>     dma status check function fsl_tx_status is heavily called in
>     a tight loop and the desc lock in fsl_tx_status contended by
>     the dma status update function. this caused the dma performance
>     degrades much.
>=20
>     this patch releases the lock in the fsl_tx_status function.
>     I believe it has no neglect impact on the following call of
>     dma_async_is_complete(...).
>=20
>     we can see below three conditions will be identified as success
>     a)  x < complete < use
>     b)  x < complete+N < use+N
>     c)  x < complete < use+N
>     here complete is the completed_cookie, use is the last_used
>     cookie, x is the querying cookie, N is MAX cookie
>=20
>     when chan->completed_cookie is being read, the last_used may
>     be incresed. Anyway it has no neglect impact on the dma status
>     decision.
>=20
>     Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
> ---
>  drivers/dma/fsldma.c |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
>=20
> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c index=20
> 8a78154..1dca56f 100644
> --- a/drivers/dma/fsldma.c
> +++ b/drivers/dma/fsldma.c
> @@ -986,15 +986,10 @@ static enum dma_status fsl_tx_status(struct dma_cha=
n *dchan,
>  	struct fsldma_chan *chan =3D to_fsl_chan(dchan);
>  	dma_cookie_t last_complete;
>  	dma_cookie_t last_used;
> -	unsigned long flags;
> -
> -	spin_lock_irqsave(&chan->desc_lock, flags);
> =20

This will cause a bug. See below for a detailed explanation. You need this =
instead:

	/*
	 * On an SMP system, we must ensure that this CPU has seen the
	 * memory accesses performed by another CPU under the
	 * chan->desc_lock spinlock.
	 */
	smp_mb();
>  	last_complete =3D chan->completed_cookie;
>  	last_used =3D dchan->cookie;
> =20
> -	spin_unlock_irqrestore(&chan->desc_lock, flags);
> -
>  	dma_set_tx_state(txstate, last_complete, last_used, 0);
>  	return dma_async_is_complete(cookie, last_complete, last_used);  }

Facts:
- dchan->cookie is the same member as chan->common.cookie (same memory loca=
tion)
- chan->common.cookie is the "last allocated cookie for a pending transacti=
on"
- chan->completed_cookie is the "last completed transaction"

I have replaced "dchan->cookie" with "chan->common.cookie" in the below exp=
lanation, to keep everything referenced from the same structure.

Variable usage before your change. Everything is used locked.
- RW chan->common.cookie		(fsl_dma_tx_submit)
- R  chan->common.cookie		(fsl_tx_status)
- R  chan->completed_cookie		(fsl_tx_status)
- W  chan->completed_cookie		(dma_do_tasklet)

Variable usage after your change:
- RW chan->common.cookie		LOCKED
- R  chan->common.cookie		NO LOCK
- R  chan->completed_cookie		NO LOCK
- W  chan->completed_cookie             LOCKED

What if we assume that you have a 2 CPU system (such as a P2020). After you=
r changes, one possible sequence is:

=3D=3D=3D CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() =3D=3D=
=3D spin_lock_irqsave
descriptor->cookie =3D 20		(x in your example)
chan->common.cookie =3D 20	(used in your example)
spin_unlock_irqrestore

=3D=3D=3D CPU2 - immediately calls fsl_tx_status() =3D=3D=3D
chan->common.cookie =3D=3D 19
chan->completed_cookie =3D=3D 19
descriptor->cookie =3D=3D 20

Since we don't have locks anymore, CPU2 may not have seen the write to
chan->common.cookie yet.

Also assume that the DMA hardware has not started processing the transactio=
n yet. Therefore dma_do_tasklet() has not been called, and
chan->completed_cookie has not been updated.

In this case, dma_async_is_complete() (on CPU2) returns DMA_SUCCESS, even t=
hough the DMA operation has not succeeded. The DMA operation has not even s=
tarted yet!

The smp_mb() fixes this, since it forces CPU2 to have seen all memory opera=
tions that happened before CPU1 released the spinlock. Spinlocks are implic=
it SMP memory barriers.

Therefore, the above example becomes:
smp_mb();
chan->common.cookie =3D=3D 20
chan->completed_cookie =3D=3D 19
descriptor->cookie =3D=3D 20

Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct.

Thanks,
Ira

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-24  8:12   ` Shi Xuelin-B29237
@ 2011-11-28 16:38     ` Ira W. Snyder
  2011-11-29  3:19       ` Li Yang-R58472
  2011-11-29  3:41       ` Shi Xuelin-B29237
  0 siblings, 2 replies; 14+ messages in thread
From: Ira W. Snyder @ 2011-11-28 16:38 UTC (permalink / raw)
  To: Shi Xuelin-B29237
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	linux-kernel@vger.kernel.org

On Thu, Nov 24, 2011 at 08:12:25AM +0000, Shi Xuelin-B29237 wrote:
> Hi Ira,
> 
> Thanks for your review.
> 
> After second thought, I think your scenario may not occur.
> Because the cookie 20 we query must be returned by fsl_dma_tx_submit(...) in practice. 
> We never query a cookie not returned by fsl_dma_tx_submit(...).
> 

I agree about this part.

> When we call fsl_tx_status(20), the chan->common.cookie is definitely wrote as 20 and cpu2 could not read as 19.
> 

This is what I don't agree about. However, I'm not an expert on CPU
cache vs. memory accesses in an multi-processor system. The section
titled "CACHE COHERENCY" in Documentation/memory-barriers.txt leads me
to believe that the scenario I described is possible.

What happens if CPU1's write of chan->common.cookie only goes into
CPU1's cache. It never makes it to main memory before CPU2 fetches the
old value of 19.

I don't think you should see any performance impact from the smp_mb()
operation.

Thanks,
Ira

> -----Original Message-----
> From: Ira W. Snyder [mailto:iws@ovro.caltech.edu] 
> Sent: 2011年11月23日 2:59
> To: Shi Xuelin-B29237
> Cc: dan.j.williams@intel.com; Li Yang-R58472; zw@zh-kernel.org; vinod.koul@intel.com; linuxppc-dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
> 
> On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29237@freescale.com wrote:
> > From: Forrest Shi <b29237@freescale.com>
> > 
> >     dma status check function fsl_tx_status is heavily called in
> >     a tight loop and the desc lock in fsl_tx_status contended by
> >     the dma status update function. this caused the dma performance
> >     degrades much.
> > 
> >     this patch releases the lock in the fsl_tx_status function.
> >     I believe it has no neglect impact on the following call of
> >     dma_async_is_complete(...).
> > 
> >     we can see below three conditions will be identified as success
> >     a)  x < complete < use
> >     b)  x < complete+N < use+N
> >     c)  x < complete < use+N
> >     here complete is the completed_cookie, use is the last_used
> >     cookie, x is the querying cookie, N is MAX cookie
> > 
> >     when chan->completed_cookie is being read, the last_used may
> >     be incresed. Anyway it has no neglect impact on the dma status
> >     decision.
> > 
> >     Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
> > ---
> >  drivers/dma/fsldma.c |    5 -----
> >  1 files changed, 0 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c index 
> > 8a78154..1dca56f 100644
> > --- a/drivers/dma/fsldma.c
> > +++ b/drivers/dma/fsldma.c
> > @@ -986,15 +986,10 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
> >  	struct fsldma_chan *chan = to_fsl_chan(dchan);
> >  	dma_cookie_t last_complete;
> >  	dma_cookie_t last_used;
> > -	unsigned long flags;
> > -
> > -	spin_lock_irqsave(&chan->desc_lock, flags);
> >  
> 
> This will cause a bug. See below for a detailed explanation. You need this instead:
> 
> 	/*
> 	 * On an SMP system, we must ensure that this CPU has seen the
> 	 * memory accesses performed by another CPU under the
> 	 * chan->desc_lock spinlock.
> 	 */
> 	smp_mb();
> >  	last_complete = chan->completed_cookie;
> >  	last_used = dchan->cookie;
> >  
> > -	spin_unlock_irqrestore(&chan->desc_lock, flags);
> > -
> >  	dma_set_tx_state(txstate, last_complete, last_used, 0);
> >  	return dma_async_is_complete(cookie, last_complete, last_used);  }
> 
> Facts:
> - dchan->cookie is the same member as chan->common.cookie (same memory location)
> - chan->common.cookie is the "last allocated cookie for a pending transaction"
> - chan->completed_cookie is the "last completed transaction"
> 
> I have replaced "dchan->cookie" with "chan->common.cookie" in the below explanation, to keep everything referenced from the same structure.
> 
> Variable usage before your change. Everything is used locked.
> - RW chan->common.cookie		(fsl_dma_tx_submit)
> - R  chan->common.cookie		(fsl_tx_status)
> - R  chan->completed_cookie		(fsl_tx_status)
> - W  chan->completed_cookie		(dma_do_tasklet)
> 
> Variable usage after your change:
> - RW chan->common.cookie		LOCKED
> - R  chan->common.cookie		NO LOCK
> - R  chan->completed_cookie		NO LOCK
> - W  chan->completed_cookie             LOCKED
> 
> What if we assume that you have a 2 CPU system (such as a P2020). After your changes, one possible sequence is:
> 
> === CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() === spin_lock_irqsave
> descriptor->cookie = 20		(x in your example)
> chan->common.cookie = 20	(used in your example)
> spin_unlock_irqrestore
> 
> === CPU2 - immediately calls fsl_tx_status() ===
> chan->common.cookie == 19
> chan->completed_cookie == 19
> descriptor->cookie == 20
> 
> Since we don't have locks anymore, CPU2 may not have seen the write to
> chan->common.cookie yet.
> 
> Also assume that the DMA hardware has not started processing the transaction yet. Therefore dma_do_tasklet() has not been called, and
> chan->completed_cookie has not been updated.
> 
> In this case, dma_async_is_complete() (on CPU2) returns DMA_SUCCESS, even though the DMA operation has not succeeded. The DMA operation has not even started yet!
> 
> The smp_mb() fixes this, since it forces CPU2 to have seen all memory operations that happened before CPU1 released the spinlock. Spinlocks are implicit SMP memory barriers.
> 
> Therefore, the above example becomes:
> smp_mb();
> chan->common.cookie == 20
> chan->completed_cookie == 19
> descriptor->cookie == 20
> 
> Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct.
> 
> Thanks,
> Ira
> 
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-28 16:38     ` Ira W. Snyder
@ 2011-11-29  3:19       ` Li Yang-R58472
  2011-11-29 17:25         ` Ira W. Snyder
  2011-11-29 19:49         ` Scott Wood
  2011-11-29  3:41       ` Shi Xuelin-B29237
  1 sibling, 2 replies; 14+ messages in thread
From: Li Yang-R58472 @ 2011-11-29  3:19 UTC (permalink / raw)
  To: Ira W. Snyder, Shi Xuelin-B29237
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org

PiBTdWJqZWN0OiBSZTogW1BBVENIXVtSRkNdIGZzbGRtYTogZml4IHBlcmZvcm1hbmNlIGRlZ3Jh
ZGF0aW9uIGJ5IG9wdGltaXppbmcNCj4gc3BpbmxvY2sgdXNlLg0KPiANCj4gT24gVGh1LCBOb3Yg
MjQsIDIwMTEgYXQgMDg6MTI6MjVBTSArMDAwMCwgU2hpIFh1ZWxpbi1CMjkyMzcgd3JvdGU6DQo+
ID4gSGkgSXJhLA0KPiA+DQo+ID4gVGhhbmtzIGZvciB5b3VyIHJldmlldy4NCj4gPg0KPiA+IEFm
dGVyIHNlY29uZCB0aG91Z2h0LCBJIHRoaW5rIHlvdXIgc2NlbmFyaW8gbWF5IG5vdCBvY2N1ci4N
Cj4gPiBCZWNhdXNlIHRoZSBjb29raWUgMjAgd2UgcXVlcnkgbXVzdCBiZSByZXR1cm5lZCBieSBm
c2xfZG1hX3R4X3N1Ym1pdCguLi4pIGluDQo+IHByYWN0aWNlLg0KPiA+IFdlIG5ldmVyIHF1ZXJ5
IGEgY29va2llIG5vdCByZXR1cm5lZCBieSBmc2xfZG1hX3R4X3N1Ym1pdCguLi4pLg0KPiA+DQo+
IA0KPiBJIGFncmVlIGFib3V0IHRoaXMgcGFydC4NCj4gDQo+ID4gV2hlbiB3ZSBjYWxsIGZzbF90
eF9zdGF0dXMoMjApLCB0aGUgY2hhbi0+Y29tbW9uLmNvb2tpZSBpcyBkZWZpbml0ZWx5IHdyb3Rl
IGFzDQo+IDIwIGFuZCBjcHUyIGNvdWxkIG5vdCByZWFkIGFzIDE5Lg0KPiA+DQo+IA0KPiBUaGlz
IGlzIHdoYXQgSSBkb24ndCBhZ3JlZSBhYm91dC4gSG93ZXZlciwgSSdtIG5vdCBhbiBleHBlcnQg
b24gQ1BVIGNhY2hlIHZzLg0KPiBtZW1vcnkgYWNjZXNzZXMgaW4gYW4gbXVsdGktcHJvY2Vzc29y
IHN5c3RlbS4gVGhlIHNlY3Rpb24gdGl0bGVkICJDQUNIRQ0KPiBDT0hFUkVOQ1kiIGluIERvY3Vt
ZW50YXRpb24vbWVtb3J5LWJhcnJpZXJzLnR4dCBsZWFkcyBtZSB0byBiZWxpZXZlIHRoYXQgdGhl
DQo+IHNjZW5hcmlvIEkgZGVzY3JpYmVkIGlzIHBvc3NpYmxlLg0KDQpGb3IgRnJlZXNjYWxlIFBv
d2VyUEMsIHRoZSBjaGlwIGF1dG9tYXRpY2FsbHkgdGFrZXMgY2FyZSBvZiBjYWNoZSBjb2hlcmVu
Y3kuICBFdmVuIGlmIHRoaXMgaXMgYSBjb25jZXJuLCBzcGlubG9jayBjYW4ndCBhZGRyZXNzIGl0
Lg0KDQo+IA0KPiBXaGF0IGhhcHBlbnMgaWYgQ1BVMSdzIHdyaXRlIG9mIGNoYW4tPmNvbW1vbi5j
b29raWUgb25seSBnb2VzIGludG8gQ1BVMSdzDQo+IGNhY2hlLiBJdCBuZXZlciBtYWtlcyBpdCB0
byBtYWluIG1lbW9yeSBiZWZvcmUgQ1BVMiBmZXRjaGVzIHRoZSBvbGQgdmFsdWUgb2YgMTkuDQo+
IA0KPiBJIGRvbid0IHRoaW5rIHlvdSBzaG91bGQgc2VlIGFueSBwZXJmb3JtYW5jZSBpbXBhY3Qg
ZnJvbSB0aGUgc21wX21iKCkNCj4gb3BlcmF0aW9uLg0KDQpTbXBfbWIoKSBkbyBoYXZlIGltcGFj
dCBvbiBwZXJmb3JtYW5jZSBpZiBpdCdzIGluIHRoZSBob3QgcGF0aC4gIFdoaWxlIGl0IG1pZ2h0
IGJlIHNhZmVyIGhhdmluZyBpdCwgSSBkb3VidCBpdCBpcyByZWFsbHkgbmVjZXNzYXJ5LiAgSWYg
dGhlIENQVTEgZG9lc24ndCBoYXZlIHRoZSB1cGRhdGVkIGxhc3RfdXNlZCwgaXQncyBzaG91bGRu
J3QgaGF2ZSBrbm93biB0aGVyZSBpcyBhIGNvb2tpZSAyMCBleGlzdGVkIGVpdGhlci4NCg0KLSBM
ZW8NCg0KPiANCj4gVGhhbmtzLA0KPiBJcmENCj4gDQo+ID4gLS0tLS1PcmlnaW5hbCBNZXNzYWdl
LS0tLS0NCj4gPiBGcm9tOiBJcmEgVy4gU255ZGVyIFttYWlsdG86aXdzQG92cm8uY2FsdGVjaC5l
ZHVdDQo+ID4gU2VudDogMjAxMeW5tDEx5pyIMjPml6UgMjo1OQ0KPiA+IFRvOiBTaGkgWHVlbGlu
LUIyOTIzNw0KPiA+IENjOiBkYW4uai53aWxsaWFtc0BpbnRlbC5jb207IExpIFlhbmctUjU4NDcy
OyB6d0B6aC1rZXJuZWwub3JnOw0KPiA+IHZpbm9kLmtvdWxAaW50ZWwuY29tOyBsaW51eHBwYy1k
ZXZAbGlzdHMub3psYWJzLm9yZzsNCj4gPiBsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnDQo+
ID4gU3ViamVjdDogUmU6IFtQQVRDSF1bUkZDXSBmc2xkbWE6IGZpeCBwZXJmb3JtYW5jZSBkZWdy
YWRhdGlvbiBieSBvcHRpbWl6aW5nDQo+IHNwaW5sb2NrIHVzZS4NCj4gPg0KPiA+IE9uIFR1ZSwg
Tm92IDIyLCAyMDExIGF0IDEyOjU1OjA1UE0gKzA4MDAsIGIyOTIzN0BmcmVlc2NhbGUuY29tIHdy
b3RlOg0KPiA+ID4gRnJvbTogRm9ycmVzdCBTaGkgPGIyOTIzN0BmcmVlc2NhbGUuY29tPg0KPiA+
ID4NCj4gPiA+ICAgICBkbWEgc3RhdHVzIGNoZWNrIGZ1bmN0aW9uIGZzbF90eF9zdGF0dXMgaXMg
aGVhdmlseSBjYWxsZWQgaW4NCj4gPiA+ICAgICBhIHRpZ2h0IGxvb3AgYW5kIHRoZSBkZXNjIGxv
Y2sgaW4gZnNsX3R4X3N0YXR1cyBjb250ZW5kZWQgYnkNCj4gPiA+ICAgICB0aGUgZG1hIHN0YXR1
cyB1cGRhdGUgZnVuY3Rpb24uIHRoaXMgY2F1c2VkIHRoZSBkbWEgcGVyZm9ybWFuY2UNCj4gPiA+
ICAgICBkZWdyYWRlcyBtdWNoLg0KPiA+ID4NCj4gPiA+ICAgICB0aGlzIHBhdGNoIHJlbGVhc2Vz
IHRoZSBsb2NrIGluIHRoZSBmc2xfdHhfc3RhdHVzIGZ1bmN0aW9uLg0KPiA+ID4gICAgIEkgYmVs
aWV2ZSBpdCBoYXMgbm8gbmVnbGVjdCBpbXBhY3Qgb24gdGhlIGZvbGxvd2luZyBjYWxsIG9mDQo+
ID4gPiAgICAgZG1hX2FzeW5jX2lzX2NvbXBsZXRlKC4uLikuDQo+ID4gPg0KPiA+ID4gICAgIHdl
IGNhbiBzZWUgYmVsb3cgdGhyZWUgY29uZGl0aW9ucyB3aWxsIGJlIGlkZW50aWZpZWQgYXMgc3Vj
Y2Vzcw0KPiA+ID4gICAgIGEpICB4IDwgY29tcGxldGUgPCB1c2UNCj4gPiA+ICAgICBiKSAgeCA8
IGNvbXBsZXRlK04gPCB1c2UrTg0KPiA+ID4gICAgIGMpICB4IDwgY29tcGxldGUgPCB1c2UrTg0K
PiA+ID4gICAgIGhlcmUgY29tcGxldGUgaXMgdGhlIGNvbXBsZXRlZF9jb29raWUsIHVzZSBpcyB0
aGUgbGFzdF91c2VkDQo+ID4gPiAgICAgY29va2llLCB4IGlzIHRoZSBxdWVyeWluZyBjb29raWUs
IE4gaXMgTUFYIGNvb2tpZQ0KPiA+ID4NCj4gPiA+ICAgICB3aGVuIGNoYW4tPmNvbXBsZXRlZF9j
b29raWUgaXMgYmVpbmcgcmVhZCwgdGhlIGxhc3RfdXNlZCBtYXkNCj4gPiA+ICAgICBiZSBpbmNy
ZXNlZC4gQW55d2F5IGl0IGhhcyBubyBuZWdsZWN0IGltcGFjdCBvbiB0aGUgZG1hIHN0YXR1cw0K
PiA+ID4gICAgIGRlY2lzaW9uLg0KPiA+ID4NCj4gPiA+ICAgICBTaWduZWQtb2ZmLWJ5OiBGb3Jy
ZXN0IFNoaSA8eHVlbGluLnNoaUBmcmVlc2NhbGUuY29tPg0KPiA+ID4gLS0tDQo+ID4gPiAgZHJp
dmVycy9kbWEvZnNsZG1hLmMgfCAgICA1IC0tLS0tDQo+ID4gPiAgMSBmaWxlcyBjaGFuZ2VkLCAw
IGluc2VydGlvbnMoKyksIDUgZGVsZXRpb25zKC0pDQo+ID4gPg0KPiA+ID4gZGlmZiAtLWdpdCBh
L2RyaXZlcnMvZG1hL2ZzbGRtYS5jIGIvZHJpdmVycy9kbWEvZnNsZG1hLmMgaW5kZXgNCj4gPiA+
IDhhNzgxNTQuLjFkY2E1NmYgMTAwNjQ0DQo+ID4gPiAtLS0gYS9kcml2ZXJzL2RtYS9mc2xkbWEu
Yw0KPiA+ID4gKysrIGIvZHJpdmVycy9kbWEvZnNsZG1hLmMNCj4gPiA+IEBAIC05ODYsMTUgKzk4
NiwxMCBAQCBzdGF0aWMgZW51bSBkbWFfc3RhdHVzIGZzbF90eF9zdGF0dXMoc3RydWN0DQo+IGRt
YV9jaGFuICpkY2hhbiwNCj4gPiA+ICAJc3RydWN0IGZzbGRtYV9jaGFuICpjaGFuID0gdG9fZnNs
X2NoYW4oZGNoYW4pOw0KPiA+ID4gIAlkbWFfY29va2llX3QgbGFzdF9jb21wbGV0ZTsNCj4gPiA+
ICAJZG1hX2Nvb2tpZV90IGxhc3RfdXNlZDsNCj4gPiA+IC0JdW5zaWduZWQgbG9uZyBmbGFnczsN
Cj4gPiA+IC0NCj4gPiA+IC0Jc3Bpbl9sb2NrX2lycXNhdmUoJmNoYW4tPmRlc2NfbG9jaywgZmxh
Z3MpOw0KPiA+ID4NCj4gPg0KPiA+IFRoaXMgd2lsbCBjYXVzZSBhIGJ1Zy4gU2VlIGJlbG93IGZv
ciBhIGRldGFpbGVkIGV4cGxhbmF0aW9uLiBZb3UgbmVlZCB0aGlzIGluc3RlYWQ6DQo+ID4NCj4g
PiAJLyoNCj4gPiAJICogT24gYW4gU01QIHN5c3RlbSwgd2UgbXVzdCBlbnN1cmUgdGhhdCB0aGlz
IENQVSBoYXMgc2VlbiB0aGUNCj4gPiAJICogbWVtb3J5IGFjY2Vzc2VzIHBlcmZvcm1lZCBieSBh
bm90aGVyIENQVSB1bmRlciB0aGUNCj4gPiAJICogY2hhbi0+ZGVzY19sb2NrIHNwaW5sb2NrLg0K
PiA+IAkgKi8NCj4gPiAJc21wX21iKCk7DQo+ID4gPiAgCWxhc3RfY29tcGxldGUgPSBjaGFuLT5j
b21wbGV0ZWRfY29va2llOw0KPiA+ID4gIAlsYXN0X3VzZWQgPSBkY2hhbi0+Y29va2llOw0KPiA+
ID4NCj4gPiA+IC0Jc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmY2hhbi0+ZGVzY19sb2NrLCBmbGFn
cyk7DQo+ID4gPiAtDQo+ID4gPiAgCWRtYV9zZXRfdHhfc3RhdGUodHhzdGF0ZSwgbGFzdF9jb21w
bGV0ZSwgbGFzdF91c2VkLCAwKTsNCj4gPiA+ICAJcmV0dXJuIGRtYV9hc3luY19pc19jb21wbGV0
ZShjb29raWUsIGxhc3RfY29tcGxldGUsIGxhc3RfdXNlZCk7ICB9DQo+ID4NCj4gPiBGYWN0czoN
Cj4gPiAtIGRjaGFuLT5jb29raWUgaXMgdGhlIHNhbWUgbWVtYmVyIGFzIGNoYW4tPmNvbW1vbi5j
b29raWUgKHNhbWUgbWVtb3J5DQo+ID4gbG9jYXRpb24pDQo+ID4gLSBjaGFuLT5jb21tb24uY29v
a2llIGlzIHRoZSAibGFzdCBhbGxvY2F0ZWQgY29va2llIGZvciBhIHBlbmRpbmcgdHJhbnNhY3Rp
b24iDQo+ID4gLSBjaGFuLT5jb21wbGV0ZWRfY29va2llIGlzIHRoZSAibGFzdCBjb21wbGV0ZWQg
dHJhbnNhY3Rpb24iDQo+ID4NCj4gPiBJIGhhdmUgcmVwbGFjZWQgImRjaGFuLT5jb29raWUiIHdp
dGggImNoYW4tPmNvbW1vbi5jb29raWUiIGluIHRoZSBiZWxvdw0KPiBleHBsYW5hdGlvbiwgdG8g
a2VlcCBldmVyeXRoaW5nIHJlZmVyZW5jZWQgZnJvbSB0aGUgc2FtZSBzdHJ1Y3R1cmUuDQo+ID4N
Cj4gPiBWYXJpYWJsZSB1c2FnZSBiZWZvcmUgeW91ciBjaGFuZ2UuIEV2ZXJ5dGhpbmcgaXMgdXNl
ZCBsb2NrZWQuDQo+ID4gLSBSVyBjaGFuLT5jb21tb24uY29va2llCQkoZnNsX2RtYV90eF9zdWJt
aXQpDQo+ID4gLSBSICBjaGFuLT5jb21tb24uY29va2llCQkoZnNsX3R4X3N0YXR1cykNCj4gPiAt
IFIgIGNoYW4tPmNvbXBsZXRlZF9jb29raWUJCShmc2xfdHhfc3RhdHVzKQ0KPiA+IC0gVyAgY2hh
bi0+Y29tcGxldGVkX2Nvb2tpZQkJKGRtYV9kb190YXNrbGV0KQ0KPiA+DQo+ID4gVmFyaWFibGUg
dXNhZ2UgYWZ0ZXIgeW91ciBjaGFuZ2U6DQo+ID4gLSBSVyBjaGFuLT5jb21tb24uY29va2llCQlM
T0NLRUQNCj4gPiAtIFIgIGNoYW4tPmNvbW1vbi5jb29raWUJCU5PIExPQ0sNCj4gPiAtIFIgIGNo
YW4tPmNvbXBsZXRlZF9jb29raWUJCU5PIExPQ0sNCj4gPiAtIFcgIGNoYW4tPmNvbXBsZXRlZF9j
b29raWUgICAgICAgICAgICAgTE9DS0VEDQo+ID4NCj4gPiBXaGF0IGlmIHdlIGFzc3VtZSB0aGF0
IHlvdSBoYXZlIGEgMiBDUFUgc3lzdGVtIChzdWNoIGFzIGEgUDIwMjApLiBBZnRlciB5b3VyDQo+
IGNoYW5nZXMsIG9uZSBwb3NzaWJsZSBzZXF1ZW5jZSBpczoNCj4gPg0KPiA+ID09PSBDUFUxIC0g
YWxsb2NhdGUgKyBzdWJtaXQgZGVzY3JpcHRvcjogZnNsX2RtYV90eF9zdWJtaXQoKSA9PT0NCj4g
PiBzcGluX2xvY2tfaXJxc2F2ZQ0KPiA+IGRlc2NyaXB0b3ItPmNvb2tpZSA9IDIwCQkoeCBpbiB5
b3VyIGV4YW1wbGUpDQo+ID4gY2hhbi0+Y29tbW9uLmNvb2tpZSA9IDIwCSh1c2VkIGluIHlvdXIg
ZXhhbXBsZSkNCj4gPiBzcGluX3VubG9ja19pcnFyZXN0b3JlDQo+ID4NCj4gPiA9PT0gQ1BVMiAt
IGltbWVkaWF0ZWx5IGNhbGxzIGZzbF90eF9zdGF0dXMoKSA9PT0NCj4gPiBjaGFuLT5jb21tb24u
Y29va2llID09IDE5DQo+ID4gY2hhbi0+Y29tcGxldGVkX2Nvb2tpZSA9PSAxOQ0KPiA+IGRlc2Ny
aXB0b3ItPmNvb2tpZSA9PSAyMA0KPiA+DQo+ID4gU2luY2Ugd2UgZG9uJ3QgaGF2ZSBsb2NrcyBh
bnltb3JlLCBDUFUyIG1heSBub3QgaGF2ZSBzZWVuIHRoZSB3cml0ZSB0bw0KPiA+IGNoYW4tPmNv
bW1vbi5jb29raWUgeWV0Lg0KPiA+DQo+ID4gQWxzbyBhc3N1bWUgdGhhdCB0aGUgRE1BIGhhcmR3
YXJlIGhhcyBub3Qgc3RhcnRlZCBwcm9jZXNzaW5nIHRoZQ0KPiA+IHRyYW5zYWN0aW9uIHlldC4g
VGhlcmVmb3JlIGRtYV9kb190YXNrbGV0KCkgaGFzIG5vdCBiZWVuIGNhbGxlZCwgYW5kDQo+ID4g
Y2hhbi0+Y29tcGxldGVkX2Nvb2tpZSBoYXMgbm90IGJlZW4gdXBkYXRlZC4NCj4gPg0KPiA+IElu
IHRoaXMgY2FzZSwgZG1hX2FzeW5jX2lzX2NvbXBsZXRlKCkgKG9uIENQVTIpIHJldHVybnMgRE1B
X1NVQ0NFU1MsIGV2ZW4NCj4gdGhvdWdoIHRoZSBETUEgb3BlcmF0aW9uIGhhcyBub3Qgc3VjY2Vl
ZGVkLiBUaGUgRE1BIG9wZXJhdGlvbiBoYXMgbm90IGV2ZW4NCj4gc3RhcnRlZCB5ZXQhDQo+ID4N
Cj4gPiBUaGUgc21wX21iKCkgZml4ZXMgdGhpcywgc2luY2UgaXQgZm9yY2VzIENQVTIgdG8gaGF2
ZSBzZWVuIGFsbCBtZW1vcnkgb3BlcmF0aW9ucw0KPiB0aGF0IGhhcHBlbmVkIGJlZm9yZSBDUFUx
IHJlbGVhc2VkIHRoZSBzcGlubG9jay4gU3BpbmxvY2tzIGFyZSBpbXBsaWNpdCBTTVANCj4gbWVt
b3J5IGJhcnJpZXJzLg0KPiA+DQo+ID4gVGhlcmVmb3JlLCB0aGUgYWJvdmUgZXhhbXBsZSBiZWNv
bWVzOg0KPiA+IHNtcF9tYigpOw0KPiA+IGNoYW4tPmNvbW1vbi5jb29raWUgPT0gMjANCj4gPiBj
aGFuLT5jb21wbGV0ZWRfY29va2llID09IDE5DQo+ID4gZGVzY3JpcHRvci0+Y29va2llID09IDIw
DQo+ID4NCj4gPiBUaGVuIGRtYV9hc3luY19pc19jb21wbGV0ZSgpIHJldHVybnMgRE1BX0lOX1BS
T0dSRVNTLCB3aGljaCBpcyBjb3JyZWN0Lg0KPiA+DQo+ID4gVGhhbmtzLA0KPiA+IElyYQ0KPiA+
DQo+ID4NCj4gPiBfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f
Xw0KPiA+IExpbnV4cHBjLWRldiBtYWlsaW5nIGxpc3QNCj4gPiBMaW51eHBwYy1kZXZAbGlzdHMu
b3psYWJzLm9yZw0KPiA+IGh0dHBzOi8vbGlzdHMub3psYWJzLm9yZy9saXN0aW5mby9saW51eHBw
Yy1kZXYNCg0K

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-28 16:38     ` Ira W. Snyder
  2011-11-29  3:19       ` Li Yang-R58472
@ 2011-11-29  3:41       ` Shi Xuelin-B29237
  1 sibling, 0 replies; 14+ messages in thread
From: Shi Xuelin-B29237 @ 2011-11-29  3:41 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, Li Yang-R58472,
	linux-kernel@vger.kernel.org

SGkgSXJhLCANCg0Kc2VlIG15IGNvbW1lbnRzIGJlbG93Lg0KDQpUaGFua3MsDQpGb3JyZXN0DQoN
Ci0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQpGcm9tOiBJcmEgVy4gU255ZGVyIFttYWlsdG86
aXdzQG92cm8uY2FsdGVjaC5lZHVdIA0KU2VudDogMjAxMeW5tDEx5pyIMjnml6UgMDozOA0KVG86
IFNoaSBYdWVsaW4tQjI5MjM3DQpDYzogdmlub2Qua291bEBpbnRlbC5jb207IGRhbi5qLndpbGxp
YW1zQGludGVsLmNvbTsgbGludXhwcGMtZGV2QGxpc3RzLm96bGFicy5vcmc7IExpIFlhbmctUjU4
NDcyOyBsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnDQpTdWJqZWN0OiBSZTogW1BBVENIXVtS
RkNdIGZzbGRtYTogZml4IHBlcmZvcm1hbmNlIGRlZ3JhZGF0aW9uIGJ5IG9wdGltaXppbmcgc3Bp
bmxvY2sgdXNlLg0KDQpPbiBUaHUsIE5vdiAyNCwgMjAxMSBhdCAwODoxMjoyNUFNICswMDAwLCBT
aGkgWHVlbGluLUIyOTIzNyB3cm90ZToNCj4gSGkgSXJhLA0KPiANCj4gVGhhbmtzIGZvciB5b3Vy
IHJldmlldy4NCj4gDQo+IEFmdGVyIHNlY29uZCB0aG91Z2h0LCBJIHRoaW5rIHlvdXIgc2NlbmFy
aW8gbWF5IG5vdCBvY2N1ci4NCj4gQmVjYXVzZSB0aGUgY29va2llIDIwIHdlIHF1ZXJ5IG11c3Qg
YmUgcmV0dXJuZWQgYnkgZnNsX2RtYV90eF9zdWJtaXQoLi4uKSBpbiBwcmFjdGljZS4gDQo+IFdl
IG5ldmVyIHF1ZXJ5IGEgY29va2llIG5vdCByZXR1cm5lZCBieSBmc2xfZG1hX3R4X3N1Ym1pdCgu
Li4pLg0KPiANCg0KSSBhZ3JlZSBhYm91dCB0aGlzIHBhcnQuDQoNCj4gV2hlbiB3ZSBjYWxsIGZz
bF90eF9zdGF0dXMoMjApLCB0aGUgY2hhbi0+Y29tbW9uLmNvb2tpZSBpcyBkZWZpbml0ZWx5IHdy
b3RlIGFzIDIwIGFuZCBjcHUyIGNvdWxkIG5vdCByZWFkIGFzIDE5Lg0KPiANCg0KVGhpcyBpcyB3
aGF0IEkgZG9uJ3QgYWdyZWUgYWJvdXQuIEhvd2V2ZXIsIEknbSBub3QgYW4gZXhwZXJ0IG9uIENQ
VSBjYWNoZSB2cy4gbWVtb3J5IGFjY2Vzc2VzIGluIGFuIG11bHRpLXByb2Nlc3NvciBzeXN0ZW0u
IFRoZSBzZWN0aW9uIHRpdGxlZCAiQ0FDSEUgQ09IRVJFTkNZIiBpbiBEb2N1bWVudGF0aW9uL21l
bW9yeS1iYXJyaWVycy50eHQgbGVhZHMgbWUgdG8gYmVsaWV2ZSB0aGF0IHRoZSBzY2VuYXJpbyBJ
IGRlc2NyaWJlZCBpcyBwb3NzaWJsZS4NCg0KIA0KW1NoaSBYdWVsaW4tQjI5MjM3XSBJZiBxdWVy
eSBpcyB1c2VkIHdpdGhvdXQgcnVsZXMsIHlvdXIgc2NlbmFyaW8gbWF5IGhhcHBlbi4gQnV0IGlu
IERNQSB1c2FnZSBoZXJlLCB0aGUgcXVlcnkgaXMgdXNlZCBzb21ldGhpbmcgbGlrZSBzZXF1ZW50
aWFsbHkuIE9ubHkgYWZ0ZXIgY2hhbi0+Y29tbW9uLmNvb2tpZSBpcyB1cGRhdGVkIGluIGZzbF9k
bWFfdHhfc3VibWl0KC4uLikgYW5kIHJldHVybmVkLCB0aGVuIGl0IGNvdWxkIGJlIHF1ZXJpZWQu
IFNvIHlvdSBzY2VuYXJpbyB3aWxsIG5vdCBoYXBwZW4uDQoNCldoYXQgaGFwcGVucyBpZiBDUFUx
J3Mgd3JpdGUgb2YgY2hhbi0+Y29tbW9uLmNvb2tpZSBvbmx5IGdvZXMgaW50byBDUFUxJ3MgY2Fj
aGUuIEl0IG5ldmVyIG1ha2VzIGl0IHRvIG1haW4gbWVtb3J5IGJlZm9yZSBDUFUyIGZldGNoZXMg
dGhlIG9sZCB2YWx1ZSBvZiAxOS4NCiANCltTaGkgWHVlbGluLUIyOTIzN10gQXMgeW91IHNlZSwg
Y2hhbi0+Y29tbW9uLmNvb2tpZSBpcyB1cGRhdGVkIGluIGZzbF9kbWFfdHhfc3VibWl0KC4uLikg
YW5kIGVuY2xvc2VkIGJ5IHNwaW5sb2NrLiBUaGUgc3BpbmxvY2sgaW1wbGVtZW50YXRpb24gaW4g
UFBDIHdpbGwgZ3VhcmFudGVlIHRoZSBjYWNoZSBjb2hlcmVuY2UgYW1vbmcgY29yZXMsIHNvbWV0
aGluZyBsaWtlIHlvdSBjYWxsZWQgaW1wbGljaXQgc21wX21iLg0KDQpJIGRvbid0IHRoaW5rIHlv
dSBzaG91bGQgc2VlIGFueSBwZXJmb3JtYW5jZSBpbXBhY3QgZnJvbSB0aGUgc21wX21iKCkgb3Bl
cmF0aW9uLg0KDQpbU2hpIFh1ZWxpbi1CMjkyMzddIE9ubHkgYWRkIHNtcF9tYigpIGRvZXNuJ3Qg
d29yay4gSXQgb25seSBzeW5jIG9uIHRoaXMgc3RlcCwgYnV0IG5leHQgc3RlcCBpcyB0aGUgc2Ft
ZSBhcyBqdXN0IGdldHRpbmcgaW50byB0aGlzIGZ1bmN0aW9uIHdpdGhvdXQgc21wX21iIG9wZXJh
dGlvbi4NCg0KVGhhbmtzLA0KSXJhDQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4g
RnJvbTogSXJhIFcuIFNueWRlciBbbWFpbHRvOml3c0BvdnJvLmNhbHRlY2guZWR1XQ0KPiBTZW50
OiAyMDEx5bm0MTHmnIgyM+aXpSAyOjU5DQo+IFRvOiBTaGkgWHVlbGluLUIyOTIzNw0KPiBDYzog
ZGFuLmoud2lsbGlhbXNAaW50ZWwuY29tOyBMaSBZYW5nLVI1ODQ3MjsgendAemgta2VybmVsLm9y
ZzsgDQo+IHZpbm9kLmtvdWxAaW50ZWwuY29tOyBsaW51eHBwYy1kZXZAbGlzdHMub3psYWJzLm9y
ZzsgDQo+IGxpbnV4LWtlcm5lbEB2Z2VyLmtlcm5lbC5vcmcNCj4gU3ViamVjdDogUmU6IFtQQVRD
SF1bUkZDXSBmc2xkbWE6IGZpeCBwZXJmb3JtYW5jZSBkZWdyYWRhdGlvbiBieSBvcHRpbWl6aW5n
IHNwaW5sb2NrIHVzZS4NCj4gDQo+IE9uIFR1ZSwgTm92IDIyLCAyMDExIGF0IDEyOjU1OjA1UE0g
KzA4MDAsIGIyOTIzN0BmcmVlc2NhbGUuY29tIHdyb3RlOg0KPiA+IEZyb206IEZvcnJlc3QgU2hp
IDxiMjkyMzdAZnJlZXNjYWxlLmNvbT4NCj4gPiANCj4gPiAgICAgZG1hIHN0YXR1cyBjaGVjayBm
dW5jdGlvbiBmc2xfdHhfc3RhdHVzIGlzIGhlYXZpbHkgY2FsbGVkIGluDQo+ID4gICAgIGEgdGln
aHQgbG9vcCBhbmQgdGhlIGRlc2MgbG9jayBpbiBmc2xfdHhfc3RhdHVzIGNvbnRlbmRlZCBieQ0K
PiA+ICAgICB0aGUgZG1hIHN0YXR1cyB1cGRhdGUgZnVuY3Rpb24uIHRoaXMgY2F1c2VkIHRoZSBk
bWEgcGVyZm9ybWFuY2UNCj4gPiAgICAgZGVncmFkZXMgbXVjaC4NCj4gPiANCj4gPiAgICAgdGhp
cyBwYXRjaCByZWxlYXNlcyB0aGUgbG9jayBpbiB0aGUgZnNsX3R4X3N0YXR1cyBmdW5jdGlvbi4N
Cj4gPiAgICAgSSBiZWxpZXZlIGl0IGhhcyBubyBuZWdsZWN0IGltcGFjdCBvbiB0aGUgZm9sbG93
aW5nIGNhbGwgb2YNCj4gPiAgICAgZG1hX2FzeW5jX2lzX2NvbXBsZXRlKC4uLikuDQo+ID4gDQo+
ID4gICAgIHdlIGNhbiBzZWUgYmVsb3cgdGhyZWUgY29uZGl0aW9ucyB3aWxsIGJlIGlkZW50aWZp
ZWQgYXMgc3VjY2Vzcw0KPiA+ICAgICBhKSAgeCA8IGNvbXBsZXRlIDwgdXNlDQo+ID4gICAgIGIp
ICB4IDwgY29tcGxldGUrTiA8IHVzZStODQo+ID4gICAgIGMpICB4IDwgY29tcGxldGUgPCB1c2Ur
Tg0KPiA+ICAgICBoZXJlIGNvbXBsZXRlIGlzIHRoZSBjb21wbGV0ZWRfY29va2llLCB1c2UgaXMg
dGhlIGxhc3RfdXNlZA0KPiA+ICAgICBjb29raWUsIHggaXMgdGhlIHF1ZXJ5aW5nIGNvb2tpZSwg
TiBpcyBNQVggY29va2llDQo+ID4gDQo+ID4gICAgIHdoZW4gY2hhbi0+Y29tcGxldGVkX2Nvb2tp
ZSBpcyBiZWluZyByZWFkLCB0aGUgbGFzdF91c2VkIG1heQ0KPiA+ICAgICBiZSBpbmNyZXNlZC4g
QW55d2F5IGl0IGhhcyBubyBuZWdsZWN0IGltcGFjdCBvbiB0aGUgZG1hIHN0YXR1cw0KPiA+ICAg
ICBkZWNpc2lvbi4NCj4gPiANCj4gPiAgICAgU2lnbmVkLW9mZi1ieTogRm9ycmVzdCBTaGkgPHh1
ZWxpbi5zaGlAZnJlZXNjYWxlLmNvbT4NCj4gPiAtLS0NCj4gPiAgZHJpdmVycy9kbWEvZnNsZG1h
LmMgfCAgICA1IC0tLS0tDQo+ID4gIDEgZmlsZXMgY2hhbmdlZCwgMCBpbnNlcnRpb25zKCspLCA1
IGRlbGV0aW9ucygtKQ0KPiA+IA0KPiA+IGRpZmYgLS1naXQgYS9kcml2ZXJzL2RtYS9mc2xkbWEu
YyBiL2RyaXZlcnMvZG1hL2ZzbGRtYS5jIGluZGV4IA0KPiA+IDhhNzgxNTQuLjFkY2E1NmYgMTAw
NjQ0DQo+ID4gLS0tIGEvZHJpdmVycy9kbWEvZnNsZG1hLmMNCj4gPiArKysgYi9kcml2ZXJzL2Rt
YS9mc2xkbWEuYw0KPiA+IEBAIC05ODYsMTUgKzk4NiwxMCBAQCBzdGF0aWMgZW51bSBkbWFfc3Rh
dHVzIGZzbF90eF9zdGF0dXMoc3RydWN0IGRtYV9jaGFuICpkY2hhbiwNCj4gPiAgCXN0cnVjdCBm
c2xkbWFfY2hhbiAqY2hhbiA9IHRvX2ZzbF9jaGFuKGRjaGFuKTsNCj4gPiAgCWRtYV9jb29raWVf
dCBsYXN0X2NvbXBsZXRlOw0KPiA+ICAJZG1hX2Nvb2tpZV90IGxhc3RfdXNlZDsNCj4gPiAtCXVu
c2lnbmVkIGxvbmcgZmxhZ3M7DQo+ID4gLQ0KPiA+IC0Jc3Bpbl9sb2NrX2lycXNhdmUoJmNoYW4t
PmRlc2NfbG9jaywgZmxhZ3MpOw0KPiA+ICANCj4gDQo+IFRoaXMgd2lsbCBjYXVzZSBhIGJ1Zy4g
U2VlIGJlbG93IGZvciBhIGRldGFpbGVkIGV4cGxhbmF0aW9uLiBZb3UgbmVlZCB0aGlzIGluc3Rl
YWQ6DQo+IA0KPiAJLyoNCj4gCSAqIE9uIGFuIFNNUCBzeXN0ZW0sIHdlIG11c3QgZW5zdXJlIHRo
YXQgdGhpcyBDUFUgaGFzIHNlZW4gdGhlDQo+IAkgKiBtZW1vcnkgYWNjZXNzZXMgcGVyZm9ybWVk
IGJ5IGFub3RoZXIgQ1BVIHVuZGVyIHRoZQ0KPiAJICogY2hhbi0+ZGVzY19sb2NrIHNwaW5sb2Nr
Lg0KPiAJICovDQo+IAlzbXBfbWIoKTsNCj4gPiAgCWxhc3RfY29tcGxldGUgPSBjaGFuLT5jb21w
bGV0ZWRfY29va2llOw0KPiA+ICAJbGFzdF91c2VkID0gZGNoYW4tPmNvb2tpZTsNCj4gPiAgDQo+
ID4gLQlzcGluX3VubG9ja19pcnFyZXN0b3JlKCZjaGFuLT5kZXNjX2xvY2ssIGZsYWdzKTsNCj4g
PiAtDQo+ID4gIAlkbWFfc2V0X3R4X3N0YXRlKHR4c3RhdGUsIGxhc3RfY29tcGxldGUsIGxhc3Rf
dXNlZCwgMCk7DQo+ID4gIAlyZXR1cm4gZG1hX2FzeW5jX2lzX2NvbXBsZXRlKGNvb2tpZSwgbGFz
dF9jb21wbGV0ZSwgbGFzdF91c2VkKTsgIH0NCj4gDQo+IEZhY3RzOg0KPiAtIGRjaGFuLT5jb29r
aWUgaXMgdGhlIHNhbWUgbWVtYmVyIGFzIGNoYW4tPmNvbW1vbi5jb29raWUgKHNhbWUgbWVtb3J5
IA0KPiBsb2NhdGlvbikNCj4gLSBjaGFuLT5jb21tb24uY29va2llIGlzIHRoZSAibGFzdCBhbGxv
Y2F0ZWQgY29va2llIGZvciBhIHBlbmRpbmcgdHJhbnNhY3Rpb24iDQo+IC0gY2hhbi0+Y29tcGxl
dGVkX2Nvb2tpZSBpcyB0aGUgImxhc3QgY29tcGxldGVkIHRyYW5zYWN0aW9uIg0KPiANCj4gSSBo
YXZlIHJlcGxhY2VkICJkY2hhbi0+Y29va2llIiB3aXRoICJjaGFuLT5jb21tb24uY29va2llIiBp
biB0aGUgYmVsb3cgZXhwbGFuYXRpb24sIHRvIGtlZXAgZXZlcnl0aGluZyByZWZlcmVuY2VkIGZy
b20gdGhlIHNhbWUgc3RydWN0dXJlLg0KPiANCj4gVmFyaWFibGUgdXNhZ2UgYmVmb3JlIHlvdXIg
Y2hhbmdlLiBFdmVyeXRoaW5nIGlzIHVzZWQgbG9ja2VkLg0KPiAtIFJXIGNoYW4tPmNvbW1vbi5j
b29raWUJCShmc2xfZG1hX3R4X3N1Ym1pdCkNCj4gLSBSICBjaGFuLT5jb21tb24uY29va2llCQko
ZnNsX3R4X3N0YXR1cykNCj4gLSBSICBjaGFuLT5jb21wbGV0ZWRfY29va2llCQkoZnNsX3R4X3N0
YXR1cykNCj4gLSBXICBjaGFuLT5jb21wbGV0ZWRfY29va2llCQkoZG1hX2RvX3Rhc2tsZXQpDQo+
IA0KPiBWYXJpYWJsZSB1c2FnZSBhZnRlciB5b3VyIGNoYW5nZToNCj4gLSBSVyBjaGFuLT5jb21t
b24uY29va2llCQlMT0NLRUQNCj4gLSBSICBjaGFuLT5jb21tb24uY29va2llCQlOTyBMT0NLDQo+
IC0gUiAgY2hhbi0+Y29tcGxldGVkX2Nvb2tpZQkJTk8gTE9DSw0KPiAtIFcgIGNoYW4tPmNvbXBs
ZXRlZF9jb29raWUgICAgICAgICAgICAgTE9DS0VEDQo+IA0KPiBXaGF0IGlmIHdlIGFzc3VtZSB0
aGF0IHlvdSBoYXZlIGEgMiBDUFUgc3lzdGVtIChzdWNoIGFzIGEgUDIwMjApLiBBZnRlciB5b3Vy
IGNoYW5nZXMsIG9uZSBwb3NzaWJsZSBzZXF1ZW5jZSBpczoNCj4gDQo+ID09PSBDUFUxIC0gYWxs
b2NhdGUgKyBzdWJtaXQgZGVzY3JpcHRvcjogZnNsX2RtYV90eF9zdWJtaXQoKSA9PT0gDQo+IHNw
aW5fbG9ja19pcnFzYXZlDQo+IGRlc2NyaXB0b3ItPmNvb2tpZSA9IDIwCQkoeCBpbiB5b3VyIGV4
YW1wbGUpDQo+IGNoYW4tPmNvbW1vbi5jb29raWUgPSAyMAkodXNlZCBpbiB5b3VyIGV4YW1wbGUp
DQo+IHNwaW5fdW5sb2NrX2lycXJlc3RvcmUNCj4gDQo+ID09PSBDUFUyIC0gaW1tZWRpYXRlbHkg
Y2FsbHMgZnNsX3R4X3N0YXR1cygpID09PQ0KPiBjaGFuLT5jb21tb24uY29va2llID09IDE5DQo+
IGNoYW4tPmNvbXBsZXRlZF9jb29raWUgPT0gMTkNCj4gZGVzY3JpcHRvci0+Y29va2llID09IDIw
DQo+IA0KPiBTaW5jZSB3ZSBkb24ndCBoYXZlIGxvY2tzIGFueW1vcmUsIENQVTIgbWF5IG5vdCBo
YXZlIHNlZW4gdGhlIHdyaXRlIHRvDQo+IGNoYW4tPmNvbW1vbi5jb29raWUgeWV0Lg0KPiANCj4g
QWxzbyBhc3N1bWUgdGhhdCB0aGUgRE1BIGhhcmR3YXJlIGhhcyBub3Qgc3RhcnRlZCBwcm9jZXNz
aW5nIHRoZSANCj4gdHJhbnNhY3Rpb24geWV0LiBUaGVyZWZvcmUgZG1hX2RvX3Rhc2tsZXQoKSBo
YXMgbm90IGJlZW4gY2FsbGVkLCBhbmQNCj4gY2hhbi0+Y29tcGxldGVkX2Nvb2tpZSBoYXMgbm90
IGJlZW4gdXBkYXRlZC4NCj4gDQo+IEluIHRoaXMgY2FzZSwgZG1hX2FzeW5jX2lzX2NvbXBsZXRl
KCkgKG9uIENQVTIpIHJldHVybnMgRE1BX1NVQ0NFU1MsIGV2ZW4gdGhvdWdoIHRoZSBETUEgb3Bl
cmF0aW9uIGhhcyBub3Qgc3VjY2VlZGVkLiBUaGUgRE1BIG9wZXJhdGlvbiBoYXMgbm90IGV2ZW4g
c3RhcnRlZCB5ZXQhDQo+IA0KPiBUaGUgc21wX21iKCkgZml4ZXMgdGhpcywgc2luY2UgaXQgZm9y
Y2VzIENQVTIgdG8gaGF2ZSBzZWVuIGFsbCBtZW1vcnkgb3BlcmF0aW9ucyB0aGF0IGhhcHBlbmVk
IGJlZm9yZSBDUFUxIHJlbGVhc2VkIHRoZSBzcGlubG9jay4gU3BpbmxvY2tzIGFyZSBpbXBsaWNp
dCBTTVAgbWVtb3J5IGJhcnJpZXJzLg0KPiANCj4gVGhlcmVmb3JlLCB0aGUgYWJvdmUgZXhhbXBs
ZSBiZWNvbWVzOg0KPiBzbXBfbWIoKTsNCj4gY2hhbi0+Y29tbW9uLmNvb2tpZSA9PSAyMA0KPiBj
aGFuLT5jb21wbGV0ZWRfY29va2llID09IDE5DQo+IGRlc2NyaXB0b3ItPmNvb2tpZSA9PSAyMA0K
PiANCj4gVGhlbiBkbWFfYXN5bmNfaXNfY29tcGxldGUoKSByZXR1cm5zIERNQV9JTl9QUk9HUkVT
Uywgd2hpY2ggaXMgY29ycmVjdC4NCj4gDQo+IFRoYW5rcywNCj4gSXJhDQo+IA0KPiANCj4gX19f
X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18NCj4gTGludXhwcGMt
ZGV2IG1haWxpbmcgbGlzdA0KPiBMaW51eHBwYy1kZXZAbGlzdHMub3psYWJzLm9yZw0KPiBodHRw
czovL2xpc3RzLm96bGFicy5vcmcvbGlzdGluZm8vbGludXhwcGMtZGV2DQoNCg==

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-29  3:19       ` Li Yang-R58472
@ 2011-11-29 17:25         ` Ira W. Snyder
  2011-11-30  0:08           ` Tabi Timur-B04825
  2011-11-30  9:57           ` Shi Xuelin-B29237
  2011-11-29 19:49         ` Scott Wood
  1 sibling, 2 replies; 14+ messages in thread
From: Ira W. Snyder @ 2011-11-29 17:25 UTC (permalink / raw)
  To: Li Yang-R58472
  Cc: vinod.koul@intel.com, Shi Xuelin-B29237,
	linuxppc-dev@lists.ozlabs.org, dan.j.williams@intel.com,
	linux-kernel@vger.kernel.org

On Tue, Nov 29, 2011 at 03:19:05AM +0000, Li Yang-R58472 wrote:
> > Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing
> > spinlock use.
> > 
> > On Thu, Nov 24, 2011 at 08:12:25AM +0000, Shi Xuelin-B29237 wrote:
> > > Hi Ira,
> > >
> > > Thanks for your review.
> > >
> > > After second thought, I think your scenario may not occur.
> > > Because the cookie 20 we query must be returned by fsl_dma_tx_submit(...) in
> > practice.
> > > We never query a cookie not returned by fsl_dma_tx_submit(...).
> > >
> > 
> > I agree about this part.
> > 
> > > When we call fsl_tx_status(20), the chan->common.cookie is definitely wrote as
> > 20 and cpu2 could not read as 19.
> > >
> > 
> > This is what I don't agree about. However, I'm not an expert on CPU cache vs.
> > memory accesses in an multi-processor system. The section titled "CACHE
> > COHERENCY" in Documentation/memory-barriers.txt leads me to believe that the
> > scenario I described is possible.
> 
> For Freescale PowerPC, the chip automatically takes care of cache coherency.  Even if this is a concern, spinlock can't address it.
> 
> > 
> > What happens if CPU1's write of chan->common.cookie only goes into CPU1's
> > cache. It never makes it to main memory before CPU2 fetches the old value of 19.
> > 
> > I don't think you should see any performance impact from the smp_mb()
> > operation.
> 
> Smp_mb() do have impact on performance if it's in the hot path.  While it might be safer having it, I doubt it is really necessary.  If the CPU1 doesn't have the updated last_used, it's shouldn't have known there is a cookie 20 existed either.
> 

I believe that you are correct, for powerpc. However, anything outside
of arch/powerpc shouldn't assume it only runs on powerpc. I wouldn't be
surprised to see fsldma running on an iMX someday (ARM processor).

My interpretation says that the change introduces the possibility that
fsl_tx_status() returns the wrong answer for an extremely small time
window, on SMP only, based on Documentation/memory-barriers.txt. But I
can't seem convince you.

My real question is what code path is hitting this spinlock? Is it in
mainline Linux? Why is it polling rather than using callbacks to
determine DMA completion?

Thanks,
Ira

> > > -----Original Message-----
> > > From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> > > Sent: 2011年11月23日 2:59
> > > To: Shi Xuelin-B29237
> > > Cc: dan.j.williams@intel.com; Li Yang-R58472; zw@zh-kernel.org;
> > > vinod.koul@intel.com; linuxppc-dev@lists.ozlabs.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing
> > spinlock use.
> > >
> > > On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29237@freescale.com wrote:
> > > > From: Forrest Shi <b29237@freescale.com>
> > > >
> > > >     dma status check function fsl_tx_status is heavily called in
> > > >     a tight loop and the desc lock in fsl_tx_status contended by
> > > >     the dma status update function. this caused the dma performance
> > > >     degrades much.
> > > >
> > > >     this patch releases the lock in the fsl_tx_status function.
> > > >     I believe it has no neglect impact on the following call of
> > > >     dma_async_is_complete(...).
> > > >
> > > >     we can see below three conditions will be identified as success
> > > >     a)  x < complete < use
> > > >     b)  x < complete+N < use+N
> > > >     c)  x < complete < use+N
> > > >     here complete is the completed_cookie, use is the last_used
> > > >     cookie, x is the querying cookie, N is MAX cookie
> > > >
> > > >     when chan->completed_cookie is being read, the last_used may
> > > >     be incresed. Anyway it has no neglect impact on the dma status
> > > >     decision.
> > > >
> > > >     Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
> > > > ---
> > > >  drivers/dma/fsldma.c |    5 -----
> > > >  1 files changed, 0 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c index
> > > > 8a78154..1dca56f 100644
> > > > --- a/drivers/dma/fsldma.c
> > > > +++ b/drivers/dma/fsldma.c
> > > > @@ -986,15 +986,10 @@ static enum dma_status fsl_tx_status(struct
> > dma_chan *dchan,
> > > >  	struct fsldma_chan *chan = to_fsl_chan(dchan);
> > > >  	dma_cookie_t last_complete;
> > > >  	dma_cookie_t last_used;
> > > > -	unsigned long flags;
> > > > -
> > > > -	spin_lock_irqsave(&chan->desc_lock, flags);
> > > >
> > >
> > > This will cause a bug. See below for a detailed explanation. You need this instead:
> > >
> > > 	/*
> > > 	 * On an SMP system, we must ensure that this CPU has seen the
> > > 	 * memory accesses performed by another CPU under the
> > > 	 * chan->desc_lock spinlock.
> > > 	 */
> > > 	smp_mb();
> > > >  	last_complete = chan->completed_cookie;
> > > >  	last_used = dchan->cookie;
> > > >
> > > > -	spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > > -
> > > >  	dma_set_tx_state(txstate, last_complete, last_used, 0);
> > > >  	return dma_async_is_complete(cookie, last_complete, last_used);  }
> > >
> > > Facts:
> > > - dchan->cookie is the same member as chan->common.cookie (same memory
> > > location)
> > > - chan->common.cookie is the "last allocated cookie for a pending transaction"
> > > - chan->completed_cookie is the "last completed transaction"
> > >
> > > I have replaced "dchan->cookie" with "chan->common.cookie" in the below
> > explanation, to keep everything referenced from the same structure.
> > >
> > > Variable usage before your change. Everything is used locked.
> > > - RW chan->common.cookie		(fsl_dma_tx_submit)
> > > - R  chan->common.cookie		(fsl_tx_status)
> > > - R  chan->completed_cookie		(fsl_tx_status)
> > > - W  chan->completed_cookie		(dma_do_tasklet)
> > >
> > > Variable usage after your change:
> > > - RW chan->common.cookie		LOCKED
> > > - R  chan->common.cookie		NO LOCK
> > > - R  chan->completed_cookie		NO LOCK
> > > - W  chan->completed_cookie             LOCKED
> > >
> > > What if we assume that you have a 2 CPU system (such as a P2020). After your
> > changes, one possible sequence is:
> > >
> > > === CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() ===
> > > spin_lock_irqsave
> > > descriptor->cookie = 20		(x in your example)
> > > chan->common.cookie = 20	(used in your example)
> > > spin_unlock_irqrestore
> > >
> > > === CPU2 - immediately calls fsl_tx_status() ===
> > > chan->common.cookie == 19
> > > chan->completed_cookie == 19
> > > descriptor->cookie == 20
> > >
> > > Since we don't have locks anymore, CPU2 may not have seen the write to
> > > chan->common.cookie yet.
> > >
> > > Also assume that the DMA hardware has not started processing the
> > > transaction yet. Therefore dma_do_tasklet() has not been called, and
> > > chan->completed_cookie has not been updated.
> > >
> > > In this case, dma_async_is_complete() (on CPU2) returns DMA_SUCCESS, even
> > though the DMA operation has not succeeded. The DMA operation has not even
> > started yet!
> > >
> > > The smp_mb() fixes this, since it forces CPU2 to have seen all memory operations
> > that happened before CPU1 released the spinlock. Spinlocks are implicit SMP
> > memory barriers.
> > >
> > > Therefore, the above example becomes:
> > > smp_mb();
> > > chan->common.cookie == 20
> > > chan->completed_cookie == 19
> > > descriptor->cookie == 20
> > >
> > > Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct.
> > >
> > > Thanks,
> > > Ira
> > >
> > >
> > > _______________________________________________
> > > Linuxppc-dev mailing list
> > > Linuxppc-dev@lists.ozlabs.org
> > > https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-29  3:19       ` Li Yang-R58472
  2011-11-29 17:25         ` Ira W. Snyder
@ 2011-11-29 19:49         ` Scott Wood
  1 sibling, 0 replies; 14+ messages in thread
From: Scott Wood @ 2011-11-29 19:49 UTC (permalink / raw)
  To: Li Yang-R58472
  Cc: Ira W. Snyder, vinod.koul@intel.com, linux-kernel@vger.kernel.org,
	Shi Xuelin-B29237, linuxppc-dev@lists.ozlabs.org,
	dan.j.williams@intel.com

On 11/28/2011 09:19 PM, Li Yang-R58472 wrote:
>> Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing
>> spinlock use.
>>
>> On Thu, Nov 24, 2011 at 08:12:25AM +0000, Shi Xuelin-B29237 wrote:
>>> Hi Ira,
>>>
>>> Thanks for your review.
>>>
>>> After second thought, I think your scenario may not occur.
>>> Because the cookie 20 we query must be returned by fsl_dma_tx_submit(...) in
>> practice.
>>> We never query a cookie not returned by fsl_dma_tx_submit(...).
>>>
>>
>> I agree about this part.
>>
>>> When we call fsl_tx_status(20), the chan->common.cookie is definitely wrote as
>> 20 and cpu2 could not read as 19.
>>>
>>
>> This is what I don't agree about. However, I'm not an expert on CPU cache vs.
>> memory accesses in an multi-processor system. The section titled "CACHE
>> COHERENCY" in Documentation/memory-barriers.txt leads me to believe that the
>> scenario I described is possible.
> 
> For Freescale PowerPC, the chip automatically takes care of cache coherency.  Even if this is a concern, spinlock can't address it.

Cache coherency is not the same thing as ordering -- and spinlocks do
address ordering, because there are memory barriers in the lock
implementation.

If you're relying on some non-universal ordering guarantee that all
chips with this device make, it needs to be documented explicitly what
you're assuming and why it's valid.

-Scott

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-29 17:25         ` Ira W. Snyder
@ 2011-11-30  0:08           ` Tabi Timur-B04825
  2011-11-30  9:57           ` Shi Xuelin-B29237
  1 sibling, 0 replies; 14+ messages in thread
From: Tabi Timur-B04825 @ 2011-11-30  0:08 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: Li Yang-R58472, vinod.koul@intel.com,
	linux-kernel@vger.kernel.org, Shi Xuelin-B29237,
	linuxppc-dev@lists.ozlabs.org, dan.j.williams@intel.com

On Tue, Nov 29, 2011 at 11:25 AM, Ira W. Snyder <iws@ovro.caltech.edu> wrot=
e:

> I believe that you are correct, for powerpc. However, anything outside
> of arch/powerpc shouldn't assume it only runs on powerpc. I wouldn't be
> surprised to see fsldma running on an iMX someday (ARM processor).

I certainly would.  The i.MX chips have their own DMA controller.  The
DMA controller that fsldma.c works with is unique to Freescale's 83xx,
85xx, and QorIQ chips.

--=20
Timur Tabi
Linux kernel developer at Freescale=

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-29 17:25         ` Ira W. Snyder
  2011-11-30  0:08           ` Tabi Timur-B04825
@ 2011-11-30  9:57           ` Shi Xuelin-B29237
  2011-11-30 17:07             ` Ira W. Snyder
  1 sibling, 1 reply; 14+ messages in thread
From: Shi Xuelin-B29237 @ 2011-11-30  9:57 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Yang-R58472

SGVsbG8gSXJhLA0KDQpJbiBkcml2ZXJzL2RtYS9kbWFlbmdpbmUuYywgd2UgaGF2ZSBiZWxvdyB0
aWdodCBsb29wIHRvIGNoZWNrIERNQSBjb21wbGV0aW9uIGluIG1haW5saW5lIExpbnV4Og0KICAg
ICAgIGRvIHsNCiAgICAgICAgICAgICAgICBzdGF0dXMgPSBkbWFfYXN5bmNfaXNfdHhfY29tcGxl
dGUoY2hhbiwgY29va2llLCBOVUxMLCBOVUxMKTsNCiAgICAgICAgICAgICAgICBpZiAodGltZV9h
ZnRlcl9lcShqaWZmaWVzLCBkbWFfc3luY193YWl0X3RpbWVvdXQpKSB7DQogICAgICAgICAgICAg
ICAgICAgICAgICBwcmludGsoS0VSTl9FUlIgImRtYV9zeW5jX3dhaXRfdGltZW91dCFcbiIpOw0K
ICAgICAgICAgICAgICAgICAgICAgICAgcmV0dXJuIERNQV9FUlJPUjsNCiAgICAgICAgICAgICAg
ICB9DQogICAgICAgIH0gd2hpbGUgKHN0YXR1cyA9PSBETUFfSU5fUFJPR1JFU1MpOw0KDQoNClRo
YW5rcywNCkZvcnJlc3QNCg0KLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCkZyb206IElyYSBX
LiBTbnlkZXIgW21haWx0bzppd3NAb3Zyby5jYWx0ZWNoLmVkdV0gDQpTZW50OiAyMDEx5bm0MTHm
nIgzMOaXpSAxOjI2DQpUbzogTGkgWWFuZy1SNTg0NzINCkNjOiBTaGkgWHVlbGluLUIyOTIzNzsg
dmlub2Qua291bEBpbnRlbC5jb207IGRhbi5qLndpbGxpYW1zQGludGVsLmNvbTsgbGludXhwcGMt
ZGV2QGxpc3RzLm96bGFicy5vcmc7IGxpbnV4LWtlcm5lbEB2Z2VyLmtlcm5lbC5vcmcNClN1Ympl
Y3Q6IFJlOiBbUEFUQ0hdW1JGQ10gZnNsZG1hOiBmaXggcGVyZm9ybWFuY2UgZGVncmFkYXRpb24g
Ynkgb3B0aW1pemluZyBzcGlubG9jayB1c2UuDQoNCk9uIFR1ZSwgTm92IDI5LCAyMDExIGF0IDAz
OjE5OjA1QU0gKzAwMDAsIExpIFlhbmctUjU4NDcyIHdyb3RlOg0KPiA+IFN1YmplY3Q6IFJlOiBb
UEFUQ0hdW1JGQ10gZnNsZG1hOiBmaXggcGVyZm9ybWFuY2UgZGVncmFkYXRpb24gYnkgDQo+ID4g
b3B0aW1pemluZyBzcGlubG9jayB1c2UuDQo+ID4gDQo+ID4gT24gVGh1LCBOb3YgMjQsIDIwMTEg
YXQgMDg6MTI6MjVBTSArMDAwMCwgU2hpIFh1ZWxpbi1CMjkyMzcgd3JvdGU6DQo+ID4gPiBIaSBJ
cmEsDQo+ID4gPg0KPiA+ID4gVGhhbmtzIGZvciB5b3VyIHJldmlldy4NCj4gPiA+DQo+ID4gPiBB
ZnRlciBzZWNvbmQgdGhvdWdodCwgSSB0aGluayB5b3VyIHNjZW5hcmlvIG1heSBub3Qgb2NjdXIu
DQo+ID4gPiBCZWNhdXNlIHRoZSBjb29raWUgMjAgd2UgcXVlcnkgbXVzdCBiZSByZXR1cm5lZCBi
eSANCj4gPiA+IGZzbF9kbWFfdHhfc3VibWl0KC4uLikgaW4NCj4gPiBwcmFjdGljZS4NCj4gPiA+
IFdlIG5ldmVyIHF1ZXJ5IGEgY29va2llIG5vdCByZXR1cm5lZCBieSBmc2xfZG1hX3R4X3N1Ym1p
dCguLi4pLg0KPiA+ID4NCj4gPiANCj4gPiBJIGFncmVlIGFib3V0IHRoaXMgcGFydC4NCj4gPiAN
Cj4gPiA+IFdoZW4gd2UgY2FsbCBmc2xfdHhfc3RhdHVzKDIwKSwgdGhlIGNoYW4tPmNvbW1vbi5j
b29raWUgaXMgDQo+ID4gPiBkZWZpbml0ZWx5IHdyb3RlIGFzDQo+ID4gMjAgYW5kIGNwdTIgY291
bGQgbm90IHJlYWQgYXMgMTkuDQo+ID4gPg0KPiA+IA0KPiA+IFRoaXMgaXMgd2hhdCBJIGRvbid0
IGFncmVlIGFib3V0LiBIb3dldmVyLCBJJ20gbm90IGFuIGV4cGVydCBvbiBDUFUgY2FjaGUgdnMu
DQo+ID4gbWVtb3J5IGFjY2Vzc2VzIGluIGFuIG11bHRpLXByb2Nlc3NvciBzeXN0ZW0uIFRoZSBz
ZWN0aW9uIHRpdGxlZCANCj4gPiAiQ0FDSEUgQ09IRVJFTkNZIiBpbiBEb2N1bWVudGF0aW9uL21l
bW9yeS1iYXJyaWVycy50eHQgbGVhZHMgbWUgdG8gDQo+ID4gYmVsaWV2ZSB0aGF0IHRoZSBzY2Vu
YXJpbyBJIGRlc2NyaWJlZCBpcyBwb3NzaWJsZS4NCj4gDQo+IEZvciBGcmVlc2NhbGUgUG93ZXJQ
QywgdGhlIGNoaXAgYXV0b21hdGljYWxseSB0YWtlcyBjYXJlIG9mIGNhY2hlIGNvaGVyZW5jeS4g
IEV2ZW4gaWYgdGhpcyBpcyBhIGNvbmNlcm4sIHNwaW5sb2NrIGNhbid0IGFkZHJlc3MgaXQuDQo+
IA0KPiA+IA0KPiA+IFdoYXQgaGFwcGVucyBpZiBDUFUxJ3Mgd3JpdGUgb2YgY2hhbi0+Y29tbW9u
LmNvb2tpZSBvbmx5IGdvZXMgaW50byANCj4gPiBDUFUxJ3MgY2FjaGUuIEl0IG5ldmVyIG1ha2Vz
IGl0IHRvIG1haW4gbWVtb3J5IGJlZm9yZSBDUFUyIGZldGNoZXMgdGhlIG9sZCB2YWx1ZSBvZiAx
OS4NCj4gPiANCj4gPiBJIGRvbid0IHRoaW5rIHlvdSBzaG91bGQgc2VlIGFueSBwZXJmb3JtYW5j
ZSBpbXBhY3QgZnJvbSB0aGUgDQo+ID4gc21wX21iKCkgb3BlcmF0aW9uLg0KPiANCj4gU21wX21i
KCkgZG8gaGF2ZSBpbXBhY3Qgb24gcGVyZm9ybWFuY2UgaWYgaXQncyBpbiB0aGUgaG90IHBhdGgu
ICBXaGlsZSBpdCBtaWdodCBiZSBzYWZlciBoYXZpbmcgaXQsIEkgZG91YnQgaXQgaXMgcmVhbGx5
IG5lY2Vzc2FyeS4gIElmIHRoZSBDUFUxIGRvZXNuJ3QgaGF2ZSB0aGUgdXBkYXRlZCBsYXN0X3Vz
ZWQsIGl0J3Mgc2hvdWxkbid0IGhhdmUga25vd24gdGhlcmUgaXMgYSBjb29raWUgMjAgZXhpc3Rl
ZCBlaXRoZXIuDQo+IA0KDQpJIGJlbGlldmUgdGhhdCB5b3UgYXJlIGNvcnJlY3QsIGZvciBwb3dl
cnBjLiBIb3dldmVyLCBhbnl0aGluZyBvdXRzaWRlIG9mIGFyY2gvcG93ZXJwYyBzaG91bGRuJ3Qg
YXNzdW1lIGl0IG9ubHkgcnVucyBvbiBwb3dlcnBjLiBJIHdvdWxkbid0IGJlIHN1cnByaXNlZCB0
byBzZWUgZnNsZG1hIHJ1bm5pbmcgb24gYW4gaU1YIHNvbWVkYXkgKEFSTSBwcm9jZXNzb3IpLg0K
DQpNeSBpbnRlcnByZXRhdGlvbiBzYXlzIHRoYXQgdGhlIGNoYW5nZSBpbnRyb2R1Y2VzIHRoZSBw
b3NzaWJpbGl0eSB0aGF0DQpmc2xfdHhfc3RhdHVzKCkgcmV0dXJucyB0aGUgd3JvbmcgYW5zd2Vy
IGZvciBhbiBleHRyZW1lbHkgc21hbGwgdGltZSB3aW5kb3csIG9uIFNNUCBvbmx5LCBiYXNlZCBv
biBEb2N1bWVudGF0aW9uL21lbW9yeS1iYXJyaWVycy50eHQuIEJ1dCBJIGNhbid0IHNlZW0gY29u
dmluY2UgeW91Lg0KDQpNeSByZWFsIHF1ZXN0aW9uIGlzIHdoYXQgY29kZSBwYXRoIGlzIGhpdHRp
bmcgdGhpcyBzcGlubG9jaz8gSXMgaXQgaW4gbWFpbmxpbmUgTGludXg/IFdoeSBpcyBpdCBwb2xs
aW5nIHJhdGhlciB0aGFuIHVzaW5nIGNhbGxiYWNrcyB0byBkZXRlcm1pbmUgRE1BIGNvbXBsZXRp
b24/DQoNClRoYW5rcywNCklyYQ0KDQo+ID4gPiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0K
PiA+ID4gRnJvbTogSXJhIFcuIFNueWRlciBbbWFpbHRvOml3c0BvdnJvLmNhbHRlY2guZWR1XQ0K
PiA+ID4gU2VudDogMjAxMeW5tDEx5pyIMjPml6UgMjo1OQ0KPiA+ID4gVG86IFNoaSBYdWVsaW4t
QjI5MjM3DQo+ID4gPiBDYzogZGFuLmoud2lsbGlhbXNAaW50ZWwuY29tOyBMaSBZYW5nLVI1ODQ3
MjsgendAemgta2VybmVsLm9yZzsgDQo+ID4gPiB2aW5vZC5rb3VsQGludGVsLmNvbTsgbGludXhw
cGMtZGV2QGxpc3RzLm96bGFicy5vcmc7IA0KPiA+ID4gbGludXgta2VybmVsQHZnZXIua2VybmVs
Lm9yZw0KPiA+ID4gU3ViamVjdDogUmU6IFtQQVRDSF1bUkZDXSBmc2xkbWE6IGZpeCBwZXJmb3Jt
YW5jZSBkZWdyYWRhdGlvbiBieSANCj4gPiA+IG9wdGltaXppbmcNCj4gPiBzcGlubG9jayB1c2Uu
DQo+ID4gPg0KPiA+ID4gT24gVHVlLCBOb3YgMjIsIDIwMTEgYXQgMTI6NTU6MDVQTSArMDgwMCwg
YjI5MjM3QGZyZWVzY2FsZS5jb20gd3JvdGU6DQo+ID4gPiA+IEZyb206IEZvcnJlc3QgU2hpIDxi
MjkyMzdAZnJlZXNjYWxlLmNvbT4NCj4gPiA+ID4NCj4gPiA+ID4gICAgIGRtYSBzdGF0dXMgY2hl
Y2sgZnVuY3Rpb24gZnNsX3R4X3N0YXR1cyBpcyBoZWF2aWx5IGNhbGxlZCBpbg0KPiA+ID4gPiAg
ICAgYSB0aWdodCBsb29wIGFuZCB0aGUgZGVzYyBsb2NrIGluIGZzbF90eF9zdGF0dXMgY29udGVu
ZGVkIGJ5DQo+ID4gPiA+ICAgICB0aGUgZG1hIHN0YXR1cyB1cGRhdGUgZnVuY3Rpb24uIHRoaXMg
Y2F1c2VkIHRoZSBkbWEgcGVyZm9ybWFuY2UNCj4gPiA+ID4gICAgIGRlZ3JhZGVzIG11Y2guDQo+
ID4gPiA+DQo+ID4gPiA+ICAgICB0aGlzIHBhdGNoIHJlbGVhc2VzIHRoZSBsb2NrIGluIHRoZSBm
c2xfdHhfc3RhdHVzIGZ1bmN0aW9uLg0KPiA+ID4gPiAgICAgSSBiZWxpZXZlIGl0IGhhcyBubyBu
ZWdsZWN0IGltcGFjdCBvbiB0aGUgZm9sbG93aW5nIGNhbGwgb2YNCj4gPiA+ID4gICAgIGRtYV9h
c3luY19pc19jb21wbGV0ZSguLi4pLg0KPiA+ID4gPg0KPiA+ID4gPiAgICAgd2UgY2FuIHNlZSBi
ZWxvdyB0aHJlZSBjb25kaXRpb25zIHdpbGwgYmUgaWRlbnRpZmllZCBhcyBzdWNjZXNzDQo+ID4g
PiA+ICAgICBhKSAgeCA8IGNvbXBsZXRlIDwgdXNlDQo+ID4gPiA+ICAgICBiKSAgeCA8IGNvbXBs
ZXRlK04gPCB1c2UrTg0KPiA+ID4gPiAgICAgYykgIHggPCBjb21wbGV0ZSA8IHVzZStODQo+ID4g
PiA+ICAgICBoZXJlIGNvbXBsZXRlIGlzIHRoZSBjb21wbGV0ZWRfY29va2llLCB1c2UgaXMgdGhl
IGxhc3RfdXNlZA0KPiA+ID4gPiAgICAgY29va2llLCB4IGlzIHRoZSBxdWVyeWluZyBjb29raWUs
IE4gaXMgTUFYIGNvb2tpZQ0KPiA+ID4gPg0KPiA+ID4gPiAgICAgd2hlbiBjaGFuLT5jb21wbGV0
ZWRfY29va2llIGlzIGJlaW5nIHJlYWQsIHRoZSBsYXN0X3VzZWQgbWF5DQo+ID4gPiA+ICAgICBi
ZSBpbmNyZXNlZC4gQW55d2F5IGl0IGhhcyBubyBuZWdsZWN0IGltcGFjdCBvbiB0aGUgZG1hIHN0
YXR1cw0KPiA+ID4gPiAgICAgZGVjaXNpb24uDQo+ID4gPiA+DQo+ID4gPiA+ICAgICBTaWduZWQt
b2ZmLWJ5OiBGb3JyZXN0IFNoaSA8eHVlbGluLnNoaUBmcmVlc2NhbGUuY29tPg0KPiA+ID4gPiAt
LS0NCj4gPiA+ID4gIGRyaXZlcnMvZG1hL2ZzbGRtYS5jIHwgICAgNSAtLS0tLQ0KPiA+ID4gPiAg
MSBmaWxlcyBjaGFuZ2VkLCAwIGluc2VydGlvbnMoKyksIDUgZGVsZXRpb25zKC0pDQo+ID4gPiA+
DQo+ID4gPiA+IGRpZmYgLS1naXQgYS9kcml2ZXJzL2RtYS9mc2xkbWEuYyBiL2RyaXZlcnMvZG1h
L2ZzbGRtYS5jIGluZGV4IA0KPiA+ID4gPiA4YTc4MTU0Li4xZGNhNTZmIDEwMDY0NA0KPiA+ID4g
PiAtLS0gYS9kcml2ZXJzL2RtYS9mc2xkbWEuYw0KPiA+ID4gPiArKysgYi9kcml2ZXJzL2RtYS9m
c2xkbWEuYw0KPiA+ID4gPiBAQCAtOTg2LDE1ICs5ODYsMTAgQEAgc3RhdGljIGVudW0gZG1hX3N0
YXR1cyANCj4gPiA+ID4gZnNsX3R4X3N0YXR1cyhzdHJ1Y3QNCj4gPiBkbWFfY2hhbiAqZGNoYW4s
DQo+ID4gPiA+ICAJc3RydWN0IGZzbGRtYV9jaGFuICpjaGFuID0gdG9fZnNsX2NoYW4oZGNoYW4p
Ow0KPiA+ID4gPiAgCWRtYV9jb29raWVfdCBsYXN0X2NvbXBsZXRlOw0KPiA+ID4gPiAgCWRtYV9j
b29raWVfdCBsYXN0X3VzZWQ7DQo+ID4gPiA+IC0JdW5zaWduZWQgbG9uZyBmbGFnczsNCj4gPiA+
ID4gLQ0KPiA+ID4gPiAtCXNwaW5fbG9ja19pcnFzYXZlKCZjaGFuLT5kZXNjX2xvY2ssIGZsYWdz
KTsNCj4gPiA+ID4NCj4gPiA+DQo+ID4gPiBUaGlzIHdpbGwgY2F1c2UgYSBidWcuIFNlZSBiZWxv
dyBmb3IgYSBkZXRhaWxlZCBleHBsYW5hdGlvbi4gWW91IG5lZWQgdGhpcyBpbnN0ZWFkOg0KPiA+
ID4NCj4gPiA+IAkvKg0KPiA+ID4gCSAqIE9uIGFuIFNNUCBzeXN0ZW0sIHdlIG11c3QgZW5zdXJl
IHRoYXQgdGhpcyBDUFUgaGFzIHNlZW4gdGhlDQo+ID4gPiAJICogbWVtb3J5IGFjY2Vzc2VzIHBl
cmZvcm1lZCBieSBhbm90aGVyIENQVSB1bmRlciB0aGUNCj4gPiA+IAkgKiBjaGFuLT5kZXNjX2xv
Y2sgc3BpbmxvY2suDQo+ID4gPiAJICovDQo+ID4gPiAJc21wX21iKCk7DQo+ID4gPiA+ICAJbGFz
dF9jb21wbGV0ZSA9IGNoYW4tPmNvbXBsZXRlZF9jb29raWU7DQo+ID4gPiA+ICAJbGFzdF91c2Vk
ID0gZGNoYW4tPmNvb2tpZTsNCj4gPiA+ID4NCj4gPiA+ID4gLQlzcGluX3VubG9ja19pcnFyZXN0
b3JlKCZjaGFuLT5kZXNjX2xvY2ssIGZsYWdzKTsNCj4gPiA+ID4gLQ0KPiA+ID4gPiAgCWRtYV9z
ZXRfdHhfc3RhdGUodHhzdGF0ZSwgbGFzdF9jb21wbGV0ZSwgbGFzdF91c2VkLCAwKTsNCj4gPiA+
ID4gIAlyZXR1cm4gZG1hX2FzeW5jX2lzX2NvbXBsZXRlKGNvb2tpZSwgbGFzdF9jb21wbGV0ZSwg
DQo+ID4gPiA+IGxhc3RfdXNlZCk7ICB9DQo+ID4gPg0KPiA+ID4gRmFjdHM6DQo+ID4gPiAtIGRj
aGFuLT5jb29raWUgaXMgdGhlIHNhbWUgbWVtYmVyIGFzIGNoYW4tPmNvbW1vbi5jb29raWUgKHNh
bWUgDQo+ID4gPiBtZW1vcnkNCj4gPiA+IGxvY2F0aW9uKQ0KPiA+ID4gLSBjaGFuLT5jb21tb24u
Y29va2llIGlzIHRoZSAibGFzdCBhbGxvY2F0ZWQgY29va2llIGZvciBhIHBlbmRpbmcgdHJhbnNh
Y3Rpb24iDQo+ID4gPiAtIGNoYW4tPmNvbXBsZXRlZF9jb29raWUgaXMgdGhlICJsYXN0IGNvbXBs
ZXRlZCB0cmFuc2FjdGlvbiINCj4gPiA+DQo+ID4gPiBJIGhhdmUgcmVwbGFjZWQgImRjaGFuLT5j
b29raWUiIHdpdGggImNoYW4tPmNvbW1vbi5jb29raWUiIGluIHRoZSANCj4gPiA+IGJlbG93DQo+
ID4gZXhwbGFuYXRpb24sIHRvIGtlZXAgZXZlcnl0aGluZyByZWZlcmVuY2VkIGZyb20gdGhlIHNh
bWUgc3RydWN0dXJlLg0KPiA+ID4NCj4gPiA+IFZhcmlhYmxlIHVzYWdlIGJlZm9yZSB5b3VyIGNo
YW5nZS4gRXZlcnl0aGluZyBpcyB1c2VkIGxvY2tlZC4NCj4gPiA+IC0gUlcgY2hhbi0+Y29tbW9u
LmNvb2tpZQkJKGZzbF9kbWFfdHhfc3VibWl0KQ0KPiA+ID4gLSBSICBjaGFuLT5jb21tb24uY29v
a2llCQkoZnNsX3R4X3N0YXR1cykNCj4gPiA+IC0gUiAgY2hhbi0+Y29tcGxldGVkX2Nvb2tpZQkJ
KGZzbF90eF9zdGF0dXMpDQo+ID4gPiAtIFcgIGNoYW4tPmNvbXBsZXRlZF9jb29raWUJCShkbWFf
ZG9fdGFza2xldCkNCj4gPiA+DQo+ID4gPiBWYXJpYWJsZSB1c2FnZSBhZnRlciB5b3VyIGNoYW5n
ZToNCj4gPiA+IC0gUlcgY2hhbi0+Y29tbW9uLmNvb2tpZQkJTE9DS0VEDQo+ID4gPiAtIFIgIGNo
YW4tPmNvbW1vbi5jb29raWUJCU5PIExPQ0sNCj4gPiA+IC0gUiAgY2hhbi0+Y29tcGxldGVkX2Nv
b2tpZQkJTk8gTE9DSw0KPiA+ID4gLSBXICBjaGFuLT5jb21wbGV0ZWRfY29va2llICAgICAgICAg
ICAgIExPQ0tFRA0KPiA+ID4NCj4gPiA+IFdoYXQgaWYgd2UgYXNzdW1lIHRoYXQgeW91IGhhdmUg
YSAyIENQVSBzeXN0ZW0gKHN1Y2ggYXMgYSBQMjAyMCkuIA0KPiA+ID4gQWZ0ZXIgeW91cg0KPiA+
IGNoYW5nZXMsIG9uZSBwb3NzaWJsZSBzZXF1ZW5jZSBpczoNCj4gPiA+DQo+ID4gPiA9PT0gQ1BV
MSAtIGFsbG9jYXRlICsgc3VibWl0IGRlc2NyaXB0b3I6IGZzbF9kbWFfdHhfc3VibWl0KCkgPT09
IA0KPiA+ID4gc3Bpbl9sb2NrX2lycXNhdmUNCj4gPiA+IGRlc2NyaXB0b3ItPmNvb2tpZSA9IDIw
CQkoeCBpbiB5b3VyIGV4YW1wbGUpDQo+ID4gPiBjaGFuLT5jb21tb24uY29va2llID0gMjAJKHVz
ZWQgaW4geW91ciBleGFtcGxlKQ0KPiA+ID4gc3Bpbl91bmxvY2tfaXJxcmVzdG9yZQ0KPiA+ID4N
Cj4gPiA+ID09PSBDUFUyIC0gaW1tZWRpYXRlbHkgY2FsbHMgZnNsX3R4X3N0YXR1cygpID09PQ0K
PiA+ID4gY2hhbi0+Y29tbW9uLmNvb2tpZSA9PSAxOQ0KPiA+ID4gY2hhbi0+Y29tcGxldGVkX2Nv
b2tpZSA9PSAxOQ0KPiA+ID4gZGVzY3JpcHRvci0+Y29va2llID09IDIwDQo+ID4gPg0KPiA+ID4g
U2luY2Ugd2UgZG9uJ3QgaGF2ZSBsb2NrcyBhbnltb3JlLCBDUFUyIG1heSBub3QgaGF2ZSBzZWVu
IHRoZSANCj4gPiA+IHdyaXRlIHRvDQo+ID4gPiBjaGFuLT5jb21tb24uY29va2llIHlldC4NCj4g
PiA+DQo+ID4gPiBBbHNvIGFzc3VtZSB0aGF0IHRoZSBETUEgaGFyZHdhcmUgaGFzIG5vdCBzdGFy
dGVkIHByb2Nlc3NpbmcgdGhlIA0KPiA+ID4gdHJhbnNhY3Rpb24geWV0LiBUaGVyZWZvcmUgZG1h
X2RvX3Rhc2tsZXQoKSBoYXMgbm90IGJlZW4gY2FsbGVkLCANCj4gPiA+IGFuZA0KPiA+ID4gY2hh
bi0+Y29tcGxldGVkX2Nvb2tpZSBoYXMgbm90IGJlZW4gdXBkYXRlZC4NCj4gPiA+DQo+ID4gPiBJ
biB0aGlzIGNhc2UsIGRtYV9hc3luY19pc19jb21wbGV0ZSgpIChvbiBDUFUyKSByZXR1cm5zIA0K
PiA+ID4gRE1BX1NVQ0NFU1MsIGV2ZW4NCj4gPiB0aG91Z2ggdGhlIERNQSBvcGVyYXRpb24gaGFz
IG5vdCBzdWNjZWVkZWQuIFRoZSBETUEgb3BlcmF0aW9uIGhhcyANCj4gPiBub3QgZXZlbiBzdGFy
dGVkIHlldCENCj4gPiA+DQo+ID4gPiBUaGUgc21wX21iKCkgZml4ZXMgdGhpcywgc2luY2UgaXQg
Zm9yY2VzIENQVTIgdG8gaGF2ZSBzZWVuIGFsbCANCj4gPiA+IG1lbW9yeSBvcGVyYXRpb25zDQo+
ID4gdGhhdCBoYXBwZW5lZCBiZWZvcmUgQ1BVMSByZWxlYXNlZCB0aGUgc3BpbmxvY2suIFNwaW5s
b2NrcyBhcmUgDQo+ID4gaW1wbGljaXQgU01QIG1lbW9yeSBiYXJyaWVycy4NCj4gPiA+DQo+ID4g
PiBUaGVyZWZvcmUsIHRoZSBhYm92ZSBleGFtcGxlIGJlY29tZXM6DQo+ID4gPiBzbXBfbWIoKTsN
Cj4gPiA+IGNoYW4tPmNvbW1vbi5jb29raWUgPT0gMjANCj4gPiA+IGNoYW4tPmNvbXBsZXRlZF9j
b29raWUgPT0gMTkNCj4gPiA+IGRlc2NyaXB0b3ItPmNvb2tpZSA9PSAyMA0KPiA+ID4NCj4gPiA+
IFRoZW4gZG1hX2FzeW5jX2lzX2NvbXBsZXRlKCkgcmV0dXJucyBETUFfSU5fUFJPR1JFU1MsIHdo
aWNoIGlzIGNvcnJlY3QuDQo+ID4gPg0KPiA+ID4gVGhhbmtzLA0KPiA+ID4gSXJhDQo+ID4gPg0K
PiA+ID4NCj4gPiA+IF9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f
X19fDQo+ID4gPiBMaW51eHBwYy1kZXYgbWFpbGluZyBsaXN0DQo+ID4gPiBMaW51eHBwYy1kZXZA
bGlzdHMub3psYWJzLm9yZw0KPiA+ID4gaHR0cHM6Ly9saXN0cy5vemxhYnMub3JnL2xpc3RpbmZv
L2xpbnV4cHBjLWRldg0KPiANCg0K

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-30  9:57           ` Shi Xuelin-B29237
@ 2011-11-30 17:07             ` Ira W. Snyder
  2011-12-02  3:47               ` Shi Xuelin-B29237
  0 siblings, 1 reply; 14+ messages in thread
From: Ira W. Snyder @ 2011-11-30 17:07 UTC (permalink / raw)
  To: Shi Xuelin-B29237
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Yang-R58472

On Wed, Nov 30, 2011 at 09:57:47AM +0000, Shi Xuelin-B29237 wrote:
> Hello Ira,
> 
> In drivers/dma/dmaengine.c, we have below tight loop to check DMA completion in mainline Linux:
>        do {
>                 status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
>                 if (time_after_eq(jiffies, dma_sync_wait_timeout)) {
>                         printk(KERN_ERR "dma_sync_wait_timeout!\n");
>                         return DMA_ERROR;
>                 }
>         } while (status == DMA_IN_PROGRESS);
> 

That is the body of dma_sync_wait(). It is mostly used in the raid code.
I understand that you don't want to change the raid code to use
callbacks.

In any case, I think we've strayed from the topic under consideration,
which is: can we remove this spinlock without introducing a bug.

I'm convinced that "smp_rmb()" is needed when removing the spinlock. As
noted, Documentation/memory-barriers.txt says that stores on one CPU can
be observed by another CPU in a different order.

Previously, there was an UNLOCK (in fsl_dma_tx_submit) followed by a
LOCK (in fsl_tx_status). This provided a "full barrier", forcing the
operations to complete correctly when viewed by the second CPU. From the
text:

> Therefore, from (1), (2) and (4) an UNLOCK followed by an unconditional LOCK is
> equivalent to a full barrier, but a LOCK followed by an UNLOCK is not.

Also, please read "EXAMPLES OF MEMORY BARRIER SEQUENCES" and "INTER-CPU
LOCKING BARRIER EFFECTS". Particularly, in "EXAMPLES OF MEMORY BARRIER
SEQUENCES", the text notes:

> Without intervention, CPU 2 may perceive the events on CPU 1 in some
> effectively random order, despite the write barrier issued by CPU 1:
>
> [snip diagram]
>
> And thirdly, a read barrier acts as a partial order on loads. Consider the
> following sequence of events:
>
> [snip diagram]
>
> Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in
> some effectively random order, despite the write barrier issued by CPU 1:
>
> [snip diagram]
>

And so on. Please read this entire section in the document.

I can't give you an ACK on the proposed patch. To the best of my
understanding, I believe it introduces a bug. I've tried to provide as
much evidence for this belief as I can, in the form of documentation in
the kernel source tree. If you can cite some documentation that shows I
am wrong, I will happily change my mind!

Ira

> -----Original Message-----
> From: Ira W. Snyder [mailto:iws@ovro.caltech.edu] 
> Sent: 2011年11月30日 1:26
> To: Li Yang-R58472
> Cc: Shi Xuelin-B29237; vinod.koul@intel.com; dan.j.williams@intel.com; linuxppc-dev@lists.ozlabs.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
> 
> On Tue, Nov 29, 2011 at 03:19:05AM +0000, Li Yang-R58472 wrote:
> > > Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by 
> > > optimizing spinlock use.
> > > 
> > > On Thu, Nov 24, 2011 at 08:12:25AM +0000, Shi Xuelin-B29237 wrote:
> > > > Hi Ira,
> > > >
> > > > Thanks for your review.
> > > >
> > > > After second thought, I think your scenario may not occur.
> > > > Because the cookie 20 we query must be returned by 
> > > > fsl_dma_tx_submit(...) in
> > > practice.
> > > > We never query a cookie not returned by fsl_dma_tx_submit(...).
> > > >
> > > 
> > > I agree about this part.
> > > 
> > > > When we call fsl_tx_status(20), the chan->common.cookie is 
> > > > definitely wrote as
> > > 20 and cpu2 could not read as 19.
> > > >
> > > 
> > > This is what I don't agree about. However, I'm not an expert on CPU cache vs.
> > > memory accesses in an multi-processor system. The section titled 
> > > "CACHE COHERENCY" in Documentation/memory-barriers.txt leads me to 
> > > believe that the scenario I described is possible.
> > 
> > For Freescale PowerPC, the chip automatically takes care of cache coherency.  Even if this is a concern, spinlock can't address it.
> > 
> > > 
> > > What happens if CPU1's write of chan->common.cookie only goes into 
> > > CPU1's cache. It never makes it to main memory before CPU2 fetches the old value of 19.
> > > 
> > > I don't think you should see any performance impact from the 
> > > smp_mb() operation.
> > 
> > Smp_mb() do have impact on performance if it's in the hot path.  While it might be safer having it, I doubt it is really necessary.  If the CPU1 doesn't have the updated last_used, it's shouldn't have known there is a cookie 20 existed either.
> > 
> 
> I believe that you are correct, for powerpc. However, anything outside of arch/powerpc shouldn't assume it only runs on powerpc. I wouldn't be surprised to see fsldma running on an iMX someday (ARM processor).
> 
> My interpretation says that the change introduces the possibility that
> fsl_tx_status() returns the wrong answer for an extremely small time window, on SMP only, based on Documentation/memory-barriers.txt. But I can't seem convince you.
> 
> My real question is what code path is hitting this spinlock? Is it in mainline Linux? Why is it polling rather than using callbacks to determine DMA completion?
> 
> Thanks,
> Ira
> 
> > > > -----Original Message-----
> > > > From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]
> > > > Sent: 2011年11月23日 2:59
> > > > To: Shi Xuelin-B29237
> > > > Cc: dan.j.williams@intel.com; Li Yang-R58472; zw@zh-kernel.org; 
> > > > vinod.koul@intel.com; linuxppc-dev@lists.ozlabs.org; 
> > > > linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by 
> > > > optimizing
> > > spinlock use.
> > > >
> > > > On Tue, Nov 22, 2011 at 12:55:05PM +0800, b29237@freescale.com wrote:
> > > > > From: Forrest Shi <b29237@freescale.com>
> > > > >
> > > > >     dma status check function fsl_tx_status is heavily called in
> > > > >     a tight loop and the desc lock in fsl_tx_status contended by
> > > > >     the dma status update function. this caused the dma performance
> > > > >     degrades much.
> > > > >
> > > > >     this patch releases the lock in the fsl_tx_status function.
> > > > >     I believe it has no neglect impact on the following call of
> > > > >     dma_async_is_complete(...).
> > > > >
> > > > >     we can see below three conditions will be identified as success
> > > > >     a)  x < complete < use
> > > > >     b)  x < complete+N < use+N
> > > > >     c)  x < complete < use+N
> > > > >     here complete is the completed_cookie, use is the last_used
> > > > >     cookie, x is the querying cookie, N is MAX cookie
> > > > >
> > > > >     when chan->completed_cookie is being read, the last_used may
> > > > >     be incresed. Anyway it has no neglect impact on the dma status
> > > > >     decision.
> > > > >
> > > > >     Signed-off-by: Forrest Shi <xuelin.shi@freescale.com>
> > > > > ---
> > > > >  drivers/dma/fsldma.c |    5 -----
> > > > >  1 files changed, 0 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c index 
> > > > > 8a78154..1dca56f 100644
> > > > > --- a/drivers/dma/fsldma.c
> > > > > +++ b/drivers/dma/fsldma.c
> > > > > @@ -986,15 +986,10 @@ static enum dma_status 
> > > > > fsl_tx_status(struct
> > > dma_chan *dchan,
> > > > >  	struct fsldma_chan *chan = to_fsl_chan(dchan);
> > > > >  	dma_cookie_t last_complete;
> > > > >  	dma_cookie_t last_used;
> > > > > -	unsigned long flags;
> > > > > -
> > > > > -	spin_lock_irqsave(&chan->desc_lock, flags);
> > > > >
> > > >
> > > > This will cause a bug. See below for a detailed explanation. You need this instead:
> > > >
> > > > 	/*
> > > > 	 * On an SMP system, we must ensure that this CPU has seen the
> > > > 	 * memory accesses performed by another CPU under the
> > > > 	 * chan->desc_lock spinlock.
> > > > 	 */
> > > > 	smp_mb();
> > > > >  	last_complete = chan->completed_cookie;
> > > > >  	last_used = dchan->cookie;
> > > > >
> > > > > -	spin_unlock_irqrestore(&chan->desc_lock, flags);
> > > > > -
> > > > >  	dma_set_tx_state(txstate, last_complete, last_used, 0);
> > > > >  	return dma_async_is_complete(cookie, last_complete, 
> > > > > last_used);  }
> > > >
> > > > Facts:
> > > > - dchan->cookie is the same member as chan->common.cookie (same 
> > > > memory
> > > > location)
> > > > - chan->common.cookie is the "last allocated cookie for a pending transaction"
> > > > - chan->completed_cookie is the "last completed transaction"
> > > >
> > > > I have replaced "dchan->cookie" with "chan->common.cookie" in the 
> > > > below
> > > explanation, to keep everything referenced from the same structure.
> > > >
> > > > Variable usage before your change. Everything is used locked.
> > > > - RW chan->common.cookie		(fsl_dma_tx_submit)
> > > > - R  chan->common.cookie		(fsl_tx_status)
> > > > - R  chan->completed_cookie		(fsl_tx_status)
> > > > - W  chan->completed_cookie		(dma_do_tasklet)
> > > >
> > > > Variable usage after your change:
> > > > - RW chan->common.cookie		LOCKED
> > > > - R  chan->common.cookie		NO LOCK
> > > > - R  chan->completed_cookie		NO LOCK
> > > > - W  chan->completed_cookie             LOCKED
> > > >
> > > > What if we assume that you have a 2 CPU system (such as a P2020). 
> > > > After your
> > > changes, one possible sequence is:
> > > >
> > > > === CPU1 - allocate + submit descriptor: fsl_dma_tx_submit() === 
> > > > spin_lock_irqsave
> > > > descriptor->cookie = 20		(x in your example)
> > > > chan->common.cookie = 20	(used in your example)
> > > > spin_unlock_irqrestore
> > > >
> > > > === CPU2 - immediately calls fsl_tx_status() ===
> > > > chan->common.cookie == 19
> > > > chan->completed_cookie == 19
> > > > descriptor->cookie == 20
> > > >
> > > > Since we don't have locks anymore, CPU2 may not have seen the 
> > > > write to
> > > > chan->common.cookie yet.
> > > >
> > > > Also assume that the DMA hardware has not started processing the 
> > > > transaction yet. Therefore dma_do_tasklet() has not been called, 
> > > > and
> > > > chan->completed_cookie has not been updated.
> > > >
> > > > In this case, dma_async_is_complete() (on CPU2) returns 
> > > > DMA_SUCCESS, even
> > > though the DMA operation has not succeeded. The DMA operation has 
> > > not even started yet!
> > > >
> > > > The smp_mb() fixes this, since it forces CPU2 to have seen all 
> > > > memory operations
> > > that happened before CPU1 released the spinlock. Spinlocks are 
> > > implicit SMP memory barriers.
> > > >
> > > > Therefore, the above example becomes:
> > > > smp_mb();
> > > > chan->common.cookie == 20
> > > > chan->completed_cookie == 19
> > > > descriptor->cookie == 20
> > > >
> > > > Then dma_async_is_complete() returns DMA_IN_PROGRESS, which is correct.
> > > >
> > > > Thanks,
> > > > Ira
> > > >
> > > >
> > > > _______________________________________________
> > > > Linuxppc-dev mailing list
> > > > Linuxppc-dev@lists.ozlabs.org
> > > > https://lists.ozlabs.org/listinfo/linuxppc-dev
> > 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-11-30 17:07             ` Ira W. Snyder
@ 2011-12-02  3:47               ` Shi Xuelin-B29237
  2011-12-02 17:13                 ` Ira W. Snyder
  0 siblings, 1 reply; 14+ messages in thread
From: Shi Xuelin-B29237 @ 2011-12-02  3:47 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Yang-R58472

SGkgSXJpcywNCg0KPkknbSBjb252aW5jZWQgdGhhdCAic21wX3JtYigpIiBpcyBuZWVkZWQgd2hl
biByZW1vdmluZyB0aGUgc3BpbmxvY2suIEFzIG5vdGVkLCBEb2N1bWVudGF0aW9uL21lbW9yeS1i
YXJyaWVycy50eHQgc2F5cyB0aGF0IHN0b3JlcyBvbiBvbmUgQ1BVIGNhbiBiZQ0KPm9ic2VydmVk
IGJ5IGFub3RoZXIgQ1BVIGluIGEgZGlmZmVyZW50IG9yZGVyLg0KPlByZXZpb3VzbHksIHRoZXJl
IHdhcyBhbiBVTkxPQ0sgKGluIGZzbF9kbWFfdHhfc3VibWl0KSBmb2xsb3dlZCBieSBhIExPQ0sg
KGluIGZzbF90eF9zdGF0dXMpLiBUaGlzIHByb3ZpZGVkIGEgImZ1bGwgYmFycmllciIsIGZvcmNp
bmcgdGhlIG9wZXJhdGlvbnMgdG8gDQo+Y29tcGxldGUgY29ycmVjdGx5IHdoZW4gdmlld2VkIGJ5
IHRoZSBzZWNvbmQgQ1BVLiANCg0KSSBkbyBub3QgYWdyZWUgdGhpcyBzbXBfcm1iKCkgd29ya3Mg
aGVyZS4gQmVjYXVzZSB3aGVuIHRoaXMgc21wX3JtYigpIGV4ZWN1dGVkIGFuZCBiZWdpbiB0byBy
ZWFkIGNoYW4tPmNvbW1vbi5jb29raWUsIHlvdSBzdGlsbCBjYW5ub3QgYXZvaWQgdGhlIG9yZGVy
IGlzc3VlLiBTb21ldGhpbmcgbGlrZSBvbmUgaXMgcmVhZGluZyBvbGQgdmFsdWUsIGJ1dCBhbm90
aGVyIENQVSBpcyB1cGRhdGluZyB0aGUgbmV3IHZhbHVlLiANCg0KTXkgcG9pbnQgaXMgaGVyZSB0
aGUgb3JkZXIgaXMgbm90IGltcG9ydGFudCBmb3IgdGhlIERNQSBkZWNpc2lvbi4NCkNvbXBsZXRl
ZCBETUEgdHggaXMgZGVjaWRlZCBhcyBub3QgY29tcGxldGUgaXMgbm90IGEgYmlnIGRlYWwsIGJl
Y2F1c2UgbmV4dCB0aW1lIGl0IHdpbGwgYmUgT0suDQoNCkkgYmVsaWV2ZSB0aGVyZSBpcyBubyBj
YXNlIHRoYXQgY291bGQgY2F1c2UgdW5jb21wbGV0ZWQgRE1BIHR4IGlzIGRlY2lkZWQgYXMgY29t
cGxldGVkLCBiZWNhdXNlIHRoZSBmc2xfdHhfc3RhdHVzIGlzIGNhbGxlZCBhZnRlciBmc2xfZG1h
X3R4X3N1Ym1pdCBmb3IgYSBzcGVjaWZpYyBjb29raWUuIElmIHlvdSBjYW4gZ2l2ZSBtZSBhbiBl
eGFtcGxlIGhlcmUsIEkgd2lsbCBhZ3JlZSB3aXRoIHlvdS4NCg0KVGhhbmtzLA0KRm9ycmVzdCAN
Cg0KLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCkZyb206IElyYSBXLiBTbnlkZXIgW21haWx0
bzppd3NAb3Zyby5jYWx0ZWNoLmVkdV0gDQpTZW50OiAyMDEx5bm0MTLmnIgx5pelIDE6MDgNClRv
OiBTaGkgWHVlbGluLUIyOTIzNw0KQ2M6IHZpbm9kLmtvdWxAaW50ZWwuY29tOyBkYW4uai53aWxs
aWFtc0BpbnRlbC5jb207IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3JnOyBsaW51eC1rZXJu
ZWxAdmdlci5rZXJuZWwub3JnOyBMaSBZYW5nLVI1ODQ3Mg0KU3ViamVjdDogUmU6IFtQQVRDSF1b
UkZDXSBmc2xkbWE6IGZpeCBwZXJmb3JtYW5jZSBkZWdyYWRhdGlvbiBieSBvcHRpbWl6aW5nIHNw
aW5sb2NrIHVzZS4NCg0KT24gV2VkLCBOb3YgMzAsIDIwMTEgYXQgMDk6NTc6NDdBTSArMDAwMCwg
U2hpIFh1ZWxpbi1CMjkyMzcgd3JvdGU6DQo+IEhlbGxvIElyYSwNCj4gDQo+IEluIGRyaXZlcnMv
ZG1hL2RtYWVuZ2luZS5jLCB3ZSBoYXZlIGJlbG93IHRpZ2h0IGxvb3AgdG8gY2hlY2sgRE1BIGNv
bXBsZXRpb24gaW4gbWFpbmxpbmUgTGludXg6DQo+ICAgICAgICBkbyB7DQo+ICAgICAgICAgICAg
ICAgICBzdGF0dXMgPSBkbWFfYXN5bmNfaXNfdHhfY29tcGxldGUoY2hhbiwgY29va2llLCBOVUxM
LCBOVUxMKTsNCj4gICAgICAgICAgICAgICAgIGlmICh0aW1lX2FmdGVyX2VxKGppZmZpZXMsIGRt
YV9zeW5jX3dhaXRfdGltZW91dCkpIHsNCj4gICAgICAgICAgICAgICAgICAgICAgICAgcHJpbnRr
KEtFUk5fRVJSICJkbWFfc3luY193YWl0X3RpbWVvdXQhXG4iKTsNCj4gICAgICAgICAgICAgICAg
ICAgICAgICAgcmV0dXJuIERNQV9FUlJPUjsNCj4gICAgICAgICAgICAgICAgIH0NCj4gICAgICAg
ICB9IHdoaWxlIChzdGF0dXMgPT0gRE1BX0lOX1BST0dSRVNTKTsNCj4gDQoNClRoYXQgaXMgdGhl
IGJvZHkgb2YgZG1hX3N5bmNfd2FpdCgpLiBJdCBpcyBtb3N0bHkgdXNlZCBpbiB0aGUgcmFpZCBj
b2RlLg0KSSB1bmRlcnN0YW5kIHRoYXQgeW91IGRvbid0IHdhbnQgdG8gY2hhbmdlIHRoZSByYWlk
IGNvZGUgdG8gdXNlIGNhbGxiYWNrcy4NCg0KSW4gYW55IGNhc2UsIEkgdGhpbmsgd2UndmUgc3Ry
YXllZCBmcm9tIHRoZSB0b3BpYyB1bmRlciBjb25zaWRlcmF0aW9uLCB3aGljaCBpczogY2FuIHdl
IHJlbW92ZSB0aGlzIHNwaW5sb2NrIHdpdGhvdXQgaW50cm9kdWNpbmcgYSBidWcuDQoNCkknbSBj
b252aW5jZWQgdGhhdCAic21wX3JtYigpIiBpcyBuZWVkZWQgd2hlbiByZW1vdmluZyB0aGUgc3Bp
bmxvY2suIEFzIG5vdGVkLCBEb2N1bWVudGF0aW9uL21lbW9yeS1iYXJyaWVycy50eHQgc2F5cyB0
aGF0IHN0b3JlcyBvbiBvbmUgQ1BVIGNhbiBiZSBvYnNlcnZlZCBieSBhbm90aGVyIENQVSBpbiBh
IGRpZmZlcmVudCBvcmRlci4NCg0KUHJldmlvdXNseSwgdGhlcmUgd2FzIGFuIFVOTE9DSyAoaW4g
ZnNsX2RtYV90eF9zdWJtaXQpIGZvbGxvd2VkIGJ5IGEgTE9DSyAoaW4gZnNsX3R4X3N0YXR1cyku
IFRoaXMgcHJvdmlkZWQgYSAiZnVsbCBiYXJyaWVyIiwgZm9yY2luZyB0aGUgb3BlcmF0aW9ucyB0
byBjb21wbGV0ZSBjb3JyZWN0bHkgd2hlbiB2aWV3ZWQgYnkgdGhlIHNlY29uZCBDUFUuIEZyb20g
dGhlDQp0ZXh0Og0KDQo+IFRoZXJlZm9yZSwgZnJvbSAoMSksICgyKSBhbmQgKDQpIGFuIFVOTE9D
SyBmb2xsb3dlZCBieSBhbiANCj4gdW5jb25kaXRpb25hbCBMT0NLIGlzIGVxdWl2YWxlbnQgdG8g
YSBmdWxsIGJhcnJpZXIsIGJ1dCBhIExPQ0sgZm9sbG93ZWQgYnkgYW4gVU5MT0NLIGlzIG5vdC4N
Cg0KQWxzbywgcGxlYXNlIHJlYWQgIkVYQU1QTEVTIE9GIE1FTU9SWSBCQVJSSUVSIFNFUVVFTkNF
UyIgYW5kICJJTlRFUi1DUFUgTE9DS0lORyBCQVJSSUVSIEVGRkVDVFMiLiBQYXJ0aWN1bGFybHks
IGluICJFWEFNUExFUyBPRiBNRU1PUlkgQkFSUklFUiBTRVFVRU5DRVMiLCB0aGUgdGV4dCBub3Rl
czoNCg0KPiBXaXRob3V0IGludGVydmVudGlvbiwgQ1BVIDIgbWF5IHBlcmNlaXZlIHRoZSBldmVu
dHMgb24gQ1BVIDEgaW4gc29tZSANCj4gZWZmZWN0aXZlbHkgcmFuZG9tIG9yZGVyLCBkZXNwaXRl
IHRoZSB3cml0ZSBiYXJyaWVyIGlzc3VlZCBieSBDUFUgMToNCj4NCj4gW3NuaXAgZGlhZ3JhbV0N
Cj4NCj4gQW5kIHRoaXJkbHksIGEgcmVhZCBiYXJyaWVyIGFjdHMgYXMgYSBwYXJ0aWFsIG9yZGVy
IG9uIGxvYWRzLiBDb25zaWRlciANCj4gdGhlIGZvbGxvd2luZyBzZXF1ZW5jZSBvZiBldmVudHM6
DQo+DQo+IFtzbmlwIGRpYWdyYW1dDQo+DQo+IFdpdGhvdXQgaW50ZXJ2ZW50aW9uLCBDUFUgMiBt
YXkgdGhlbiBjaG9vc2UgdG8gcGVyY2VpdmUgdGhlIGV2ZW50cyBvbiANCj4gQ1BVIDEgaW4gc29t
ZSBlZmZlY3RpdmVseSByYW5kb20gb3JkZXIsIGRlc3BpdGUgdGhlIHdyaXRlIGJhcnJpZXIgaXNz
dWVkIGJ5IENQVSAxOg0KPg0KPiBbc25pcCBkaWFncmFtXQ0KPg0KDQpBbmQgc28gb24uIFBsZWFz
ZSByZWFkIHRoaXMgZW50aXJlIHNlY3Rpb24gaW4gdGhlIGRvY3VtZW50Lg0KDQpJIGNhbid0IGdp
dmUgeW91IGFuIEFDSyBvbiB0aGUgcHJvcG9zZWQgcGF0Y2guIFRvIHRoZSBiZXN0IG9mIG15IHVu
ZGVyc3RhbmRpbmcsIEkgYmVsaWV2ZSBpdCBpbnRyb2R1Y2VzIGEgYnVnLiBJJ3ZlIHRyaWVkIHRv
IHByb3ZpZGUgYXMgbXVjaCBldmlkZW5jZSBmb3IgdGhpcyBiZWxpZWYgYXMgSSBjYW4sIGluIHRo
ZSBmb3JtIG9mIGRvY3VtZW50YXRpb24gaW4gdGhlIGtlcm5lbCBzb3VyY2UgdHJlZS4gSWYgeW91
IGNhbiBjaXRlIHNvbWUgZG9jdW1lbnRhdGlvbiB0aGF0IHNob3dzIEkgYW0gd3JvbmcsIEkgd2ls
bCBoYXBwaWx5IGNoYW5nZSBteSBtaW5kIQ0KDQpJcmENCg0KPiAtLS0tLU9yaWdpbmFsIE1lc3Nh
Z2UtLS0tLQ0KPiBGcm9tOiBJcmEgVy4gU255ZGVyIFttYWlsdG86aXdzQG92cm8uY2FsdGVjaC5l
ZHVdDQo+IFNlbnQ6IDIwMTHlubQxMeaciDMw5pelIDE6MjYNCj4gVG86IExpIFlhbmctUjU4NDcy
DQo+IENjOiBTaGkgWHVlbGluLUIyOTIzNzsgdmlub2Qua291bEBpbnRlbC5jb207IGRhbi5qLndp
bGxpYW1zQGludGVsLmNvbTsgDQo+IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3JnOyBsaW51
eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnDQo+IFN1YmplY3Q6IFJlOiBbUEFUQ0hdW1JGQ10gZnNs
ZG1hOiBmaXggcGVyZm9ybWFuY2UgZGVncmFkYXRpb24gYnkgb3B0aW1pemluZyBzcGlubG9jayB1
c2UuDQo+IA0KPiBPbiBUdWUsIE5vdiAyOSwgMjAxMSBhdCAwMzoxOTowNUFNICswMDAwLCBMaSBZ
YW5nLVI1ODQ3MiB3cm90ZToNCj4gPiA+IFN1YmplY3Q6IFJlOiBbUEFUQ0hdW1JGQ10gZnNsZG1h
OiBmaXggcGVyZm9ybWFuY2UgZGVncmFkYXRpb24gYnkgDQo+ID4gPiBvcHRpbWl6aW5nIHNwaW5s
b2NrIHVzZS4NCj4gPiA+IA0KPiA+ID4gT24gVGh1LCBOb3YgMjQsIDIwMTEgYXQgMDg6MTI6MjVB
TSArMDAwMCwgU2hpIFh1ZWxpbi1CMjkyMzcgd3JvdGU6DQo+ID4gPiA+IEhpIElyYSwNCj4gPiA+
ID4NCj4gPiA+ID4gVGhhbmtzIGZvciB5b3VyIHJldmlldy4NCj4gPiA+ID4NCj4gPiA+ID4gQWZ0
ZXIgc2Vjb25kIHRob3VnaHQsIEkgdGhpbmsgeW91ciBzY2VuYXJpbyBtYXkgbm90IG9jY3VyLg0K
PiA+ID4gPiBCZWNhdXNlIHRoZSBjb29raWUgMjAgd2UgcXVlcnkgbXVzdCBiZSByZXR1cm5lZCBi
eQ0KPiA+ID4gPiBmc2xfZG1hX3R4X3N1Ym1pdCguLi4pIGluDQo+ID4gPiBwcmFjdGljZS4NCj4g
PiA+ID4gV2UgbmV2ZXIgcXVlcnkgYSBjb29raWUgbm90IHJldHVybmVkIGJ5IGZzbF9kbWFfdHhf
c3VibWl0KC4uLikuDQo+ID4gPiA+DQo+ID4gPiANCj4gPiA+IEkgYWdyZWUgYWJvdXQgdGhpcyBw
YXJ0Lg0KPiA+ID4gDQo+ID4gPiA+IFdoZW4gd2UgY2FsbCBmc2xfdHhfc3RhdHVzKDIwKSwgdGhl
IGNoYW4tPmNvbW1vbi5jb29raWUgaXMgDQo+ID4gPiA+IGRlZmluaXRlbHkgd3JvdGUgYXMNCj4g
PiA+IDIwIGFuZCBjcHUyIGNvdWxkIG5vdCByZWFkIGFzIDE5Lg0KPiA+ID4gPg0KPiA+ID4gDQo+
ID4gPiBUaGlzIGlzIHdoYXQgSSBkb24ndCBhZ3JlZSBhYm91dC4gSG93ZXZlciwgSSdtIG5vdCBh
biBleHBlcnQgb24gQ1BVIGNhY2hlIHZzLg0KPiA+ID4gbWVtb3J5IGFjY2Vzc2VzIGluIGFuIG11
bHRpLXByb2Nlc3NvciBzeXN0ZW0uIFRoZSBzZWN0aW9uIHRpdGxlZCANCj4gPiA+ICJDQUNIRSBD
T0hFUkVOQ1kiIGluIERvY3VtZW50YXRpb24vbWVtb3J5LWJhcnJpZXJzLnR4dCBsZWFkcyBtZSB0
byANCj4gPiA+IGJlbGlldmUgdGhhdCB0aGUgc2NlbmFyaW8gSSBkZXNjcmliZWQgaXMgcG9zc2li
bGUuDQo+ID4gDQo+ID4gRm9yIEZyZWVzY2FsZSBQb3dlclBDLCB0aGUgY2hpcCBhdXRvbWF0aWNh
bGx5IHRha2VzIGNhcmUgb2YgY2FjaGUgY29oZXJlbmN5LiAgRXZlbiBpZiB0aGlzIGlzIGEgY29u
Y2Vybiwgc3BpbmxvY2sgY2FuJ3QgYWRkcmVzcyBpdC4NCj4gPiANCj4gPiA+IA0KPiA+ID4gV2hh
dCBoYXBwZW5zIGlmIENQVTEncyB3cml0ZSBvZiBjaGFuLT5jb21tb24uY29va2llIG9ubHkgZ29l
cyBpbnRvIA0KPiA+ID4gQ1BVMSdzIGNhY2hlLiBJdCBuZXZlciBtYWtlcyBpdCB0byBtYWluIG1l
bW9yeSBiZWZvcmUgQ1BVMiBmZXRjaGVzIHRoZSBvbGQgdmFsdWUgb2YgMTkuDQo+ID4gPiANCj4g
PiA+IEkgZG9uJ3QgdGhpbmsgeW91IHNob3VsZCBzZWUgYW55IHBlcmZvcm1hbmNlIGltcGFjdCBm
cm9tIHRoZQ0KPiA+ID4gc21wX21iKCkgb3BlcmF0aW9uLg0KPiA+IA0KPiA+IFNtcF9tYigpIGRv
IGhhdmUgaW1wYWN0IG9uIHBlcmZvcm1hbmNlIGlmIGl0J3MgaW4gdGhlIGhvdCBwYXRoLiAgV2hp
bGUgaXQgbWlnaHQgYmUgc2FmZXIgaGF2aW5nIGl0LCBJIGRvdWJ0IGl0IGlzIHJlYWxseSBuZWNl
c3NhcnkuICBJZiB0aGUgQ1BVMSBkb2Vzbid0IGhhdmUgdGhlIHVwZGF0ZWQgbGFzdF91c2VkLCBp
dCdzIHNob3VsZG4ndCBoYXZlIGtub3duIHRoZXJlIGlzIGEgY29va2llIDIwIGV4aXN0ZWQgZWl0
aGVyLg0KPiA+IA0KPiANCj4gSSBiZWxpZXZlIHRoYXQgeW91IGFyZSBjb3JyZWN0LCBmb3IgcG93
ZXJwYy4gSG93ZXZlciwgYW55dGhpbmcgb3V0c2lkZSBvZiBhcmNoL3Bvd2VycGMgc2hvdWxkbid0
IGFzc3VtZSBpdCBvbmx5IHJ1bnMgb24gcG93ZXJwYy4gSSB3b3VsZG4ndCBiZSBzdXJwcmlzZWQg
dG8gc2VlIGZzbGRtYSBydW5uaW5nIG9uIGFuIGlNWCBzb21lZGF5IChBUk0gcHJvY2Vzc29yKS4N
Cj4gDQo+IE15IGludGVycHJldGF0aW9uIHNheXMgdGhhdCB0aGUgY2hhbmdlIGludHJvZHVjZXMg
dGhlIHBvc3NpYmlsaXR5IHRoYXQNCj4gZnNsX3R4X3N0YXR1cygpIHJldHVybnMgdGhlIHdyb25n
IGFuc3dlciBmb3IgYW4gZXh0cmVtZWx5IHNtYWxsIHRpbWUgd2luZG93LCBvbiBTTVAgb25seSwg
YmFzZWQgb24gRG9jdW1lbnRhdGlvbi9tZW1vcnktYmFycmllcnMudHh0LiBCdXQgSSBjYW4ndCBz
ZWVtIGNvbnZpbmNlIHlvdS4NCj4gDQo+IE15IHJlYWwgcXVlc3Rpb24gaXMgd2hhdCBjb2RlIHBh
dGggaXMgaGl0dGluZyB0aGlzIHNwaW5sb2NrPyBJcyBpdCBpbiBtYWlubGluZSBMaW51eD8gV2h5
IGlzIGl0IHBvbGxpbmcgcmF0aGVyIHRoYW4gdXNpbmcgY2FsbGJhY2tzIHRvIGRldGVybWluZSBE
TUEgY29tcGxldGlvbj8NCj4gDQo+IFRoYW5rcywNCj4gSXJhDQo+IA0KPiA+ID4gPiAtLS0tLU9y
aWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiA+ID4gPiBGcm9tOiBJcmEgVy4gU255ZGVyIFttYWlsdG86
aXdzQG92cm8uY2FsdGVjaC5lZHVdDQo+ID4gPiA+IFNlbnQ6IDIwMTHlubQxMeaciDIz5pelIDI6
NTkNCj4gPiA+ID4gVG86IFNoaSBYdWVsaW4tQjI5MjM3DQo+ID4gPiA+IENjOiBkYW4uai53aWxs
aWFtc0BpbnRlbC5jb207IExpIFlhbmctUjU4NDcyOyB6d0B6aC1rZXJuZWwub3JnOyANCj4gPiA+
ID4gdmlub2Qua291bEBpbnRlbC5jb207IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3JnOyAN
Cj4gPiA+ID4gbGludXgta2VybmVsQHZnZXIua2VybmVsLm9yZw0KPiA+ID4gPiBTdWJqZWN0OiBS
ZTogW1BBVENIXVtSRkNdIGZzbGRtYTogZml4IHBlcmZvcm1hbmNlIGRlZ3JhZGF0aW9uIGJ5IA0K
PiA+ID4gPiBvcHRpbWl6aW5nDQo+ID4gPiBzcGlubG9jayB1c2UuDQo+ID4gPiA+DQo+ID4gPiA+
IE9uIFR1ZSwgTm92IDIyLCAyMDExIGF0IDEyOjU1OjA1UE0gKzA4MDAsIGIyOTIzN0BmcmVlc2Nh
bGUuY29tIHdyb3RlOg0KPiA+ID4gPiA+IEZyb206IEZvcnJlc3QgU2hpIDxiMjkyMzdAZnJlZXNj
YWxlLmNvbT4NCj4gPiA+ID4gPg0KPiA+ID4gPiA+ICAgICBkbWEgc3RhdHVzIGNoZWNrIGZ1bmN0
aW9uIGZzbF90eF9zdGF0dXMgaXMgaGVhdmlseSBjYWxsZWQgaW4NCj4gPiA+ID4gPiAgICAgYSB0
aWdodCBsb29wIGFuZCB0aGUgZGVzYyBsb2NrIGluIGZzbF90eF9zdGF0dXMgY29udGVuZGVkIGJ5
DQo+ID4gPiA+ID4gICAgIHRoZSBkbWEgc3RhdHVzIHVwZGF0ZSBmdW5jdGlvbi4gdGhpcyBjYXVz
ZWQgdGhlIGRtYSBwZXJmb3JtYW5jZQ0KPiA+ID4gPiA+ICAgICBkZWdyYWRlcyBtdWNoLg0KPiA+
ID4gPiA+DQo+ID4gPiA+ID4gICAgIHRoaXMgcGF0Y2ggcmVsZWFzZXMgdGhlIGxvY2sgaW4gdGhl
IGZzbF90eF9zdGF0dXMgZnVuY3Rpb24uDQo+ID4gPiA+ID4gICAgIEkgYmVsaWV2ZSBpdCBoYXMg
bm8gbmVnbGVjdCBpbXBhY3Qgb24gdGhlIGZvbGxvd2luZyBjYWxsIG9mDQo+ID4gPiA+ID4gICAg
IGRtYV9hc3luY19pc19jb21wbGV0ZSguLi4pLg0KPiA+ID4gPiA+DQo+ID4gPiA+ID4gICAgIHdl
IGNhbiBzZWUgYmVsb3cgdGhyZWUgY29uZGl0aW9ucyB3aWxsIGJlIGlkZW50aWZpZWQgYXMgc3Vj
Y2Vzcw0KPiA+ID4gPiA+ICAgICBhKSAgeCA8IGNvbXBsZXRlIDwgdXNlDQo+ID4gPiA+ID4gICAg
IGIpICB4IDwgY29tcGxldGUrTiA8IHVzZStODQo+ID4gPiA+ID4gICAgIGMpICB4IDwgY29tcGxl
dGUgPCB1c2UrTg0KPiA+ID4gPiA+ICAgICBoZXJlIGNvbXBsZXRlIGlzIHRoZSBjb21wbGV0ZWRf
Y29va2llLCB1c2UgaXMgdGhlIGxhc3RfdXNlZA0KPiA+ID4gPiA+ICAgICBjb29raWUsIHggaXMg
dGhlIHF1ZXJ5aW5nIGNvb2tpZSwgTiBpcyBNQVggY29va2llDQo+ID4gPiA+ID4NCj4gPiA+ID4g
PiAgICAgd2hlbiBjaGFuLT5jb21wbGV0ZWRfY29va2llIGlzIGJlaW5nIHJlYWQsIHRoZSBsYXN0
X3VzZWQgbWF5DQo+ID4gPiA+ID4gICAgIGJlIGluY3Jlc2VkLiBBbnl3YXkgaXQgaGFzIG5vIG5l
Z2xlY3QgaW1wYWN0IG9uIHRoZSBkbWEgc3RhdHVzDQo+ID4gPiA+ID4gICAgIGRlY2lzaW9uLg0K
PiA+ID4gPiA+DQo+ID4gPiA+ID4gICAgIFNpZ25lZC1vZmYtYnk6IEZvcnJlc3QgU2hpIDx4dWVs
aW4uc2hpQGZyZWVzY2FsZS5jb20+DQo+ID4gPiA+ID4gLS0tDQo+ID4gPiA+ID4gIGRyaXZlcnMv
ZG1hL2ZzbGRtYS5jIHwgICAgNSAtLS0tLQ0KPiA+ID4gPiA+ICAxIGZpbGVzIGNoYW5nZWQsIDAg
aW5zZXJ0aW9ucygrKSwgNSBkZWxldGlvbnMoLSkNCj4gPiA+ID4gPg0KPiA+ID4gPiA+IGRpZmYg
LS1naXQgYS9kcml2ZXJzL2RtYS9mc2xkbWEuYyBiL2RyaXZlcnMvZG1hL2ZzbGRtYS5jIGluZGV4
IA0KPiA+ID4gPiA+IDhhNzgxNTQuLjFkY2E1NmYgMTAwNjQ0DQo+ID4gPiA+ID4gLS0tIGEvZHJp
dmVycy9kbWEvZnNsZG1hLmMNCj4gPiA+ID4gPiArKysgYi9kcml2ZXJzL2RtYS9mc2xkbWEuYw0K
PiA+ID4gPiA+IEBAIC05ODYsMTUgKzk4NiwxMCBAQCBzdGF0aWMgZW51bSBkbWFfc3RhdHVzIA0K
PiA+ID4gPiA+IGZzbF90eF9zdGF0dXMoc3RydWN0DQo+ID4gPiBkbWFfY2hhbiAqZGNoYW4sDQo+
ID4gPiA+ID4gIAlzdHJ1Y3QgZnNsZG1hX2NoYW4gKmNoYW4gPSB0b19mc2xfY2hhbihkY2hhbik7
DQo+ID4gPiA+ID4gIAlkbWFfY29va2llX3QgbGFzdF9jb21wbGV0ZTsNCj4gPiA+ID4gPiAgCWRt
YV9jb29raWVfdCBsYXN0X3VzZWQ7DQo+ID4gPiA+ID4gLQl1bnNpZ25lZCBsb25nIGZsYWdzOw0K
PiA+ID4gPiA+IC0NCj4gPiA+ID4gPiAtCXNwaW5fbG9ja19pcnFzYXZlKCZjaGFuLT5kZXNjX2xv
Y2ssIGZsYWdzKTsNCj4gPiA+ID4gPg0KPiA+ID4gPg0KPiA+ID4gPiBUaGlzIHdpbGwgY2F1c2Ug
YSBidWcuIFNlZSBiZWxvdyBmb3IgYSBkZXRhaWxlZCBleHBsYW5hdGlvbi4gWW91IG5lZWQgdGhp
cyBpbnN0ZWFkOg0KPiA+ID4gPg0KPiA+ID4gPiAJLyoNCj4gPiA+ID4gCSAqIE9uIGFuIFNNUCBz
eXN0ZW0sIHdlIG11c3QgZW5zdXJlIHRoYXQgdGhpcyBDUFUgaGFzIHNlZW4gdGhlDQo+ID4gPiA+
IAkgKiBtZW1vcnkgYWNjZXNzZXMgcGVyZm9ybWVkIGJ5IGFub3RoZXIgQ1BVIHVuZGVyIHRoZQ0K
PiA+ID4gPiAJICogY2hhbi0+ZGVzY19sb2NrIHNwaW5sb2NrLg0KPiA+ID4gPiAJICovDQo+ID4g
PiA+IAlzbXBfbWIoKTsNCj4gPiA+ID4gPiAgCWxhc3RfY29tcGxldGUgPSBjaGFuLT5jb21wbGV0
ZWRfY29va2llOw0KPiA+ID4gPiA+ICAJbGFzdF91c2VkID0gZGNoYW4tPmNvb2tpZTsNCj4gPiA+
ID4gPg0KPiA+ID4gPiA+IC0Jc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmY2hhbi0+ZGVzY19sb2Nr
LCBmbGFncyk7DQo+ID4gPiA+ID4gLQ0KPiA+ID4gPiA+ICAJZG1hX3NldF90eF9zdGF0ZSh0eHN0
YXRlLCBsYXN0X2NvbXBsZXRlLCBsYXN0X3VzZWQsIDApOw0KPiA+ID4gPiA+ICAJcmV0dXJuIGRt
YV9hc3luY19pc19jb21wbGV0ZShjb29raWUsIGxhc3RfY29tcGxldGUsIA0KPiA+ID4gPiA+IGxh
c3RfdXNlZCk7ICB9DQo+ID4gPiA+DQo+ID4gPiA+IEZhY3RzOg0KPiA+ID4gPiAtIGRjaGFuLT5j
b29raWUgaXMgdGhlIHNhbWUgbWVtYmVyIGFzIGNoYW4tPmNvbW1vbi5jb29raWUgKHNhbWUgDQo+
ID4gPiA+IG1lbW9yeQ0KPiA+ID4gPiBsb2NhdGlvbikNCj4gPiA+ID4gLSBjaGFuLT5jb21tb24u
Y29va2llIGlzIHRoZSAibGFzdCBhbGxvY2F0ZWQgY29va2llIGZvciBhIHBlbmRpbmcgdHJhbnNh
Y3Rpb24iDQo+ID4gPiA+IC0gY2hhbi0+Y29tcGxldGVkX2Nvb2tpZSBpcyB0aGUgImxhc3QgY29t
cGxldGVkIHRyYW5zYWN0aW9uIg0KPiA+ID4gPg0KPiA+ID4gPiBJIGhhdmUgcmVwbGFjZWQgImRj
aGFuLT5jb29raWUiIHdpdGggImNoYW4tPmNvbW1vbi5jb29raWUiIGluIA0KPiA+ID4gPiB0aGUg
YmVsb3cNCj4gPiA+IGV4cGxhbmF0aW9uLCB0byBrZWVwIGV2ZXJ5dGhpbmcgcmVmZXJlbmNlZCBm
cm9tIHRoZSBzYW1lIHN0cnVjdHVyZS4NCj4gPiA+ID4NCj4gPiA+ID4gVmFyaWFibGUgdXNhZ2Ug
YmVmb3JlIHlvdXIgY2hhbmdlLiBFdmVyeXRoaW5nIGlzIHVzZWQgbG9ja2VkLg0KPiA+ID4gPiAt
IFJXIGNoYW4tPmNvbW1vbi5jb29raWUJCShmc2xfZG1hX3R4X3N1Ym1pdCkNCj4gPiA+ID4gLSBS
ICBjaGFuLT5jb21tb24uY29va2llCQkoZnNsX3R4X3N0YXR1cykNCj4gPiA+ID4gLSBSICBjaGFu
LT5jb21wbGV0ZWRfY29va2llCQkoZnNsX3R4X3N0YXR1cykNCj4gPiA+ID4gLSBXICBjaGFuLT5j
b21wbGV0ZWRfY29va2llCQkoZG1hX2RvX3Rhc2tsZXQpDQo+ID4gPiA+DQo+ID4gPiA+IFZhcmlh
YmxlIHVzYWdlIGFmdGVyIHlvdXIgY2hhbmdlOg0KPiA+ID4gPiAtIFJXIGNoYW4tPmNvbW1vbi5j
b29raWUJCUxPQ0tFRA0KPiA+ID4gPiAtIFIgIGNoYW4tPmNvbW1vbi5jb29raWUJCU5PIExPQ0sN
Cj4gPiA+ID4gLSBSICBjaGFuLT5jb21wbGV0ZWRfY29va2llCQlOTyBMT0NLDQo+ID4gPiA+IC0g
VyAgY2hhbi0+Y29tcGxldGVkX2Nvb2tpZSAgICAgICAgICAgICBMT0NLRUQNCj4gPiA+ID4NCj4g
PiA+ID4gV2hhdCBpZiB3ZSBhc3N1bWUgdGhhdCB5b3UgaGF2ZSBhIDIgQ1BVIHN5c3RlbSAoc3Vj
aCBhcyBhIFAyMDIwKS4gDQo+ID4gPiA+IEFmdGVyIHlvdXINCj4gPiA+IGNoYW5nZXMsIG9uZSBw
b3NzaWJsZSBzZXF1ZW5jZSBpczoNCj4gPiA+ID4NCj4gPiA+ID4gPT09IENQVTEgLSBhbGxvY2F0
ZSArIHN1Ym1pdCBkZXNjcmlwdG9yOiBmc2xfZG1hX3R4X3N1Ym1pdCgpID09PSANCj4gPiA+ID4g
c3Bpbl9sb2NrX2lycXNhdmUNCj4gPiA+ID4gZGVzY3JpcHRvci0+Y29va2llID0gMjAJCSh4IGlu
IHlvdXIgZXhhbXBsZSkNCj4gPiA+ID4gY2hhbi0+Y29tbW9uLmNvb2tpZSA9IDIwCSh1c2VkIGlu
IHlvdXIgZXhhbXBsZSkNCj4gPiA+ID4gc3Bpbl91bmxvY2tfaXJxcmVzdG9yZQ0KPiA+ID4gPg0K
PiA+ID4gPiA9PT0gQ1BVMiAtIGltbWVkaWF0ZWx5IGNhbGxzIGZzbF90eF9zdGF0dXMoKSA9PT0N
Cj4gPiA+ID4gY2hhbi0+Y29tbW9uLmNvb2tpZSA9PSAxOQ0KPiA+ID4gPiBjaGFuLT5jb21wbGV0
ZWRfY29va2llID09IDE5DQo+ID4gPiA+IGRlc2NyaXB0b3ItPmNvb2tpZSA9PSAyMA0KPiA+ID4g
Pg0KPiA+ID4gPiBTaW5jZSB3ZSBkb24ndCBoYXZlIGxvY2tzIGFueW1vcmUsIENQVTIgbWF5IG5v
dCBoYXZlIHNlZW4gdGhlIA0KPiA+ID4gPiB3cml0ZSB0bw0KPiA+ID4gPiBjaGFuLT5jb21tb24u
Y29va2llIHlldC4NCj4gPiA+ID4NCj4gPiA+ID4gQWxzbyBhc3N1bWUgdGhhdCB0aGUgRE1BIGhh
cmR3YXJlIGhhcyBub3Qgc3RhcnRlZCBwcm9jZXNzaW5nIHRoZSANCj4gPiA+ID4gdHJhbnNhY3Rp
b24geWV0LiBUaGVyZWZvcmUgZG1hX2RvX3Rhc2tsZXQoKSBoYXMgbm90IGJlZW4gY2FsbGVkLCAN
Cj4gPiA+ID4gYW5kDQo+ID4gPiA+IGNoYW4tPmNvbXBsZXRlZF9jb29raWUgaGFzIG5vdCBiZWVu
IHVwZGF0ZWQuDQo+ID4gPiA+DQo+ID4gPiA+IEluIHRoaXMgY2FzZSwgZG1hX2FzeW5jX2lzX2Nv
bXBsZXRlKCkgKG9uIENQVTIpIHJldHVybnMgDQo+ID4gPiA+IERNQV9TVUNDRVNTLCBldmVuDQo+
ID4gPiB0aG91Z2ggdGhlIERNQSBvcGVyYXRpb24gaGFzIG5vdCBzdWNjZWVkZWQuIFRoZSBETUEg
b3BlcmF0aW9uIGhhcyANCj4gPiA+IG5vdCBldmVuIHN0YXJ0ZWQgeWV0IQ0KPiA+ID4gPg0KPiA+
ID4gPiBUaGUgc21wX21iKCkgZml4ZXMgdGhpcywgc2luY2UgaXQgZm9yY2VzIENQVTIgdG8gaGF2
ZSBzZWVuIGFsbCANCj4gPiA+ID4gbWVtb3J5IG9wZXJhdGlvbnMNCj4gPiA+IHRoYXQgaGFwcGVu
ZWQgYmVmb3JlIENQVTEgcmVsZWFzZWQgdGhlIHNwaW5sb2NrLiBTcGlubG9ja3MgYXJlIA0KPiA+
ID4gaW1wbGljaXQgU01QIG1lbW9yeSBiYXJyaWVycy4NCj4gPiA+ID4NCj4gPiA+ID4gVGhlcmVm
b3JlLCB0aGUgYWJvdmUgZXhhbXBsZSBiZWNvbWVzOg0KPiA+ID4gPiBzbXBfbWIoKTsNCj4gPiA+
ID4gY2hhbi0+Y29tbW9uLmNvb2tpZSA9PSAyMA0KPiA+ID4gPiBjaGFuLT5jb21wbGV0ZWRfY29v
a2llID09IDE5DQo+ID4gPiA+IGRlc2NyaXB0b3ItPmNvb2tpZSA9PSAyMA0KPiA+ID4gPg0KPiA+
ID4gPiBUaGVuIGRtYV9hc3luY19pc19jb21wbGV0ZSgpIHJldHVybnMgRE1BX0lOX1BST0dSRVNT
LCB3aGljaCBpcyBjb3JyZWN0Lg0KPiA+ID4gPg0KPiA+ID4gPiBUaGFua3MsDQo+ID4gPiA+IEly
YQ0KPiA+ID4gPg0KPiA+ID4gPg0KPiA+ID4gPiBfX19fX19fX19fX19fX19fX19fX19fX19fX19f
X19fX19fX19fX19fX19fX19fXw0KPiA+ID4gPiBMaW51eHBwYy1kZXYgbWFpbGluZyBsaXN0DQo+
ID4gPiA+IExpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3JnDQo+ID4gPiA+IGh0dHBzOi8vbGlz
dHMub3psYWJzLm9yZy9saXN0aW5mby9saW51eHBwYy1kZXYNCj4gPiANCj4gDQoNCg==

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-12-02  3:47               ` Shi Xuelin-B29237
@ 2011-12-02 17:13                 ` Ira W. Snyder
  2011-12-05  6:11                   ` Shi Xuelin-B29237
  0 siblings, 1 reply; 14+ messages in thread
From: Ira W. Snyder @ 2011-12-02 17:13 UTC (permalink / raw)
  To: Shi Xuelin-B29237
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Yang-R58472

On Fri, Dec 02, 2011 at 03:47:27AM +0000, Shi Xuelin-B29237 wrote:
> Hi Iris,
> 
> >I'm convinced that "smp_rmb()" is needed when removing the spinlock. As noted, Documentation/memory-barriers.txt says that stores on one CPU can be
> >observed by another CPU in a different order.
> >Previously, there was an UNLOCK (in fsl_dma_tx_submit) followed by a LOCK (in fsl_tx_status). This provided a "full barrier", forcing the operations to 
> >complete correctly when viewed by the second CPU. 
> 
> I do not agree this smp_rmb() works here. Because when this smp_rmb() executed and begin to read chan->common.cookie, you still cannot avoid the order issue. Something like one is reading old value, but another CPU is updating the new value. 
> 
> My point is here the order is not important for the DMA decision.
> Completed DMA tx is decided as not complete is not a big deal, because next time it will be OK.
> 
> I believe there is no case that could cause uncompleted DMA tx is decided as completed, because the fsl_tx_status is called after fsl_dma_tx_submit for a specific cookie. If you can give me an example here, I will agree with you.
> 

According to memory-barriers.txt, writes to main memory may be observed in
any order if memory barriers are not used. This means that writes can
appear to happen in a different order than they were issued by the CPU.

Citing from the text:

> There are certain things that the Linux kernel memory barriers do not guarantee:
>
>  (*) There is no guarantee that any of the memory accesses specified before a
>      memory barrier will be _complete_ by the completion of a memory barrier
>      instruction; the barrier can be considered to draw a line in that CPU's
>      access queue that accesses of the appropriate type may not cross.

Also:

> Without intervention, CPU 2 may perceive the events on CPU 1 in some
> effectively random order, despite the write barrier issued by CPU 1:

Also:

> When dealing with CPU-CPU interactions, certain types of memory barrier should
> always be paired.  A lack of appropriate pairing is almost certainly an error.
>
> A write barrier should always be paired with a data dependency barrier or read
> barrier, though a general barrier would also be viable.

Therefore, in an SMP system, the following situation can happen.

descriptor->cookie = 2
chan->common.cookie = 1
chan->completed_cookie = 1

This occurs when CPU-A calls fsl_dma_tx_submit() and then CPU-B calls
dma_async_is_complete() ***after*** CPU-B has observed the write to
descriptor->cookie, and ***before*** before CPU-B has observed the write to
chan->common.cookie.

Remember, without barriers, CPU-B can observe CPU-A's memory accesses in
*any possible order*. Memory accesses are not guaranteed to be *complete*
by the time fsl_dma_tx_submit() returns!

With the above values, dma_async_is_complete() returns DMA_COMPLETE. This
is incorrect: the DMA is still in progress. The required invariant
chan->common.cookie >= descriptor->cookie has not been met.

By adding an smp_rmb(), I force CPU-B to stall until *both* stores in
fsl_dma_tx_submit() (descriptor->cookie and chan->common.cookie) actually
hit main memory. This avoids the above situation: all CPU's observe
descriptor->cookie and chan->common.cookie to update in sync with each
other.

Is this unclear in any way?

Please run your test with the smp_rmb() and measure the performance
impact.

Ira

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use.
  2011-12-02 17:13                 ` Ira W. Snyder
@ 2011-12-05  6:11                   ` Shi Xuelin-B29237
  0 siblings, 0 replies; 14+ messages in thread
From: Shi Xuelin-B29237 @ 2011-12-05  6:11 UTC (permalink / raw)
  To: Ira W. Snyder
  Cc: vinod.koul@intel.com, dan.j.williams@intel.com,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Li Yang-R58472

Hi Iris,

>Remember, without barriers, CPU-B can observe CPU-A's memory accesses in *=
any possible order*. Memory accesses are not guaranteed to be *complete* by=
=20
>the time fsl_dma_tx_submit() returns!

fsl_dma_tx_submit is enclosed by spin_lock_irqsave/spin_unlock_irqrestore, =
when this function returns, I believe the memory access are completed. spin=
_unlock_irqsave is an implicit memory barrier and guaranteed this.

Thanks,
Forrest


-----Original Message-----
From: Ira W. Snyder [mailto:iws@ovro.caltech.edu]=20
Sent: 2011=1B$BG/=1B(B12=1B$B7n=1B(B3=1B$BF|=1B(B 1:14
To: Shi Xuelin-B29237
Cc: vinod.koul@intel.com; dan.j.williams@intel.com; linuxppc-dev@lists.ozla=
bs.org; linux-kernel@vger.kernel.org; Li Yang-R58472
Subject: Re: [PATCH][RFC] fsldma: fix performance degradation by optimizing=
 spinlock use.

On Fri, Dec 02, 2011 at 03:47:27AM +0000, Shi Xuelin-B29237 wrote:
> Hi Iris,
>=20
> >I'm convinced that "smp_rmb()" is needed when removing the spinlock.=20
> >As noted, Documentation/memory-barriers.txt says that stores on one CPU =
can be observed by another CPU in a different order.
> >Previously, there was an UNLOCK (in fsl_dma_tx_submit) followed by a=20
> >LOCK (in fsl_tx_status). This provided a "full barrier", forcing the ope=
rations to complete correctly when viewed by the second CPU.
>=20
> I do not agree this smp_rmb() works here. Because when this smp_rmb() exe=
cuted and begin to read chan->common.cookie, you still cannot avoid the ord=
er issue. Something like one is reading old value, but another CPU is updat=
ing the new value.=20
>=20
> My point is here the order is not important for the DMA decision.
> Completed DMA tx is decided as not complete is not a big deal, because ne=
xt time it will be OK.
>=20
> I believe there is no case that could cause uncompleted DMA tx is decided=
 as completed, because the fsl_tx_status is called after fsl_dma_tx_submit =
for a specific cookie. If you can give me an example here, I will agree wit=
h you.
>=20

According to memory-barriers.txt, writes to main memory may be observed in =
any order if memory barriers are not used. This means that writes can appea=
r to happen in a different order than they were issued by the CPU.

Citing from the text:

> There are certain things that the Linux kernel memory barriers do not gua=
rantee:
>
>  (*) There is no guarantee that any of the memory accesses specified befo=
re a
>      memory barrier will be _complete_ by the completion of a memory barr=
ier
>      instruction; the barrier can be considered to draw a line in that CP=
U's
>      access queue that accesses of the appropriate type may not cross.

Also:

> Without intervention, CPU 2 may perceive the events on CPU 1 in some=20
> effectively random order, despite the write barrier issued by CPU 1:

Also:

> When dealing with CPU-CPU interactions, certain types of memory=20
> barrier should always be paired.  A lack of appropriate pairing is almost=
 certainly an error.
>
> A write barrier should always be paired with a data dependency barrier=20
> or read barrier, though a general barrier would also be viable.

Therefore, in an SMP system, the following situation can happen.

descriptor->cookie =3D 2
chan->common.cookie =3D 1
chan->completed_cookie =3D 1

This occurs when CPU-A calls fsl_dma_tx_submit() and then CPU-B calls
dma_async_is_complete() ***after*** CPU-B has observed the write to
descriptor->cookie, and ***before*** before CPU-B has observed the write=20
descriptor->to
chan->common.cookie.

Remember, without barriers, CPU-B can observe CPU-A's memory accesses in *a=
ny possible order*. Memory accesses are not guaranteed to be *complete* by =
the time fsl_dma_tx_submit() returns!

With the above values, dma_async_is_complete() returns DMA_COMPLETE. This i=
s incorrect: the DMA is still in progress. The required invariant
chan->common.cookie >=3D descriptor->cookie has not been met.

By adding an smp_rmb(), I force CPU-B to stall until *both* stores in
fsl_dma_tx_submit() (descriptor->cookie and chan->common.cookie) actually h=
it main memory. This avoids the above situation: all CPU's observe
descriptor->cookie and chan->common.cookie to update in sync with each
other.

Is this unclear in any way?

Please run your test with the smp_rmb() and measure the performance impact.

Ira

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-12-05  6:11 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-22  4:55 [PATCH][RFC] fsldma: fix performance degradation by optimizing spinlock use b29237
2011-11-22 18:59 ` Ira W. Snyder
2011-11-24  8:12   ` Shi Xuelin-B29237
2011-11-28 16:38     ` Ira W. Snyder
2011-11-29  3:19       ` Li Yang-R58472
2011-11-29 17:25         ` Ira W. Snyder
2011-11-30  0:08           ` Tabi Timur-B04825
2011-11-30  9:57           ` Shi Xuelin-B29237
2011-11-30 17:07             ` Ira W. Snyder
2011-12-02  3:47               ` Shi Xuelin-B29237
2011-12-02 17:13                 ` Ira W. Snyder
2011-12-05  6:11                   ` Shi Xuelin-B29237
2011-11-29 19:49         ` Scott Wood
2011-11-29  3:41       ` Shi Xuelin-B29237

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).