[PATCH 0/2] net: davinci_cpdma: reduce latency on -rt

netdev.vger.kernel.org archive mirror

* [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
@ 2016-07-26 12:02 Uwe Kleine-König
  2016-07-26 12:02 ` [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set Uwe Kleine-König
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-26 12:02 UTC (permalink / raw)
  To: Mugunthan V N, Grygorii Strashko; +Cc: linux-omap, netdev, kernel

Hello,

these patches are based on next-20160726. I haven't measured yet how much
they improve latency, but even if the improvement is small, it's still a
good idea to have them.

A second pair of eyes checking what I did would be great.

Best regards
Uwe


Uwe Kleine-König (2):
  net: davinci_cpdma: reduce time holding ctlr->lock in
    cpdma_control_set
  net: davinci_cpdma: reduce time holding chan->lock in
    cpdma_chan_submit

 drivers/net/ethernet/ti/davinci_cpdma.c | 58 ++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 29 deletions(-)

-- 
2.8.1


* [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set
  2016-07-26 12:02 [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt Uwe Kleine-König
@ 2016-07-26 12:02 ` Uwe Kleine-König
  2016-08-04 15:22   ` Grygorii Strashko
  2016-08-09  8:27   ` Mugunthan V N
  2016-07-26 12:02 ` [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit Uwe Kleine-König
  2016-07-26 14:36 ` [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt Grygorii Strashko
  2 siblings, 2 replies; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-26 12:02 UTC (permalink / raw)
  To: Mugunthan V N, Grygorii Strashko; +Cc: linux-omap, netdev, kernel

The only user of cpdma_control_set (i.e. cpsw_ndo_open) doesn't check
the return code, so it doesn't matter which error triggers. So the
checks that are independent of the fields protected by ctlr->lock can be
moved out of the critical section.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 drivers/net/ethernet/ti/davinci_cpdma.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
index 73638f7a55d4..5ffa04a306c6 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -877,23 +877,21 @@ int cpdma_control_set(struct cpdma_ctlr *ctlr, int control, int value)
 	int ret;
 	u32 val;
 
-	spin_lock_irqsave(&ctlr->lock, flags);
-
-	ret = -ENOTSUPP;
 	if (!ctlr->params.has_ext_regs)
-		goto unlock_ret;
+		return -ENOTSUPP;
 
-	ret = -EINVAL;
-	if (ctlr->state != CPDMA_STATE_ACTIVE)
-		goto unlock_ret;
-
-	ret = -ENOENT;
 	if (control < 0 || control >= ARRAY_SIZE(controls))
-		goto unlock_ret;
+		return -ENOENT;
 
-	ret = -EPERM;
 	if ((info->access & ACCESS_WO) != ACCESS_WO)
+		return -EPERM;
+
+	spin_lock_irqsave(&ctlr->lock, flags);
+
+	if (ctlr->state != CPDMA_STATE_ACTIVE) {
+		ret = -EINVAL;
 		goto unlock_ret;
+	}
 
 	val  = dma_reg_read(ctlr, info->reg);
 	val &= ~(info->mask << info->shift);
-- 
2.8.1
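
The reordering in this patch can be illustrated with a small userspace sketch. This is not the driver code: `struct fake_lock`, `struct ctlr_model` and `control_set_model()` are hypothetical stand-ins, the "lock" only records how often it is taken, and `ENOTSUP` is used in place of the kernel-internal `ENOTSUPP`. It assumes, as the commit message does, that `has_ext_regs` never changes after setup while `state` can change at runtime.

```c
#include <assert.h>
#include <errno.h>

#define NUM_CONTROLS 4

/* Hypothetical stand-in for a spinlock: it only records usage. */
struct fake_lock {
	int held;
	unsigned int acquisitions;
};

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = 0;
}

enum model_state { MODEL_IDLE, MODEL_ACTIVE };

struct ctlr_model {
	struct fake_lock lock;
	int has_ext_regs;         /* assumed fixed after setup */
	enum model_state state;   /* may change at runtime, needs the lock */
	int regs[NUM_CONTROLS];   /* pretend registers, one per control */
};

int control_set_model(struct ctlr_model *c, int control, int value)
{
	int ret = 0;

	/* Checks on fixed data and on the arguments need no lock. */
	if (!c->has_ext_regs)
		return -ENOTSUP;
	if (control < 0 || control >= NUM_CONTROLS)
		return -ENOENT;

	fake_lock_acquire(&c->lock);

	/* The state can change concurrently, so test it under the lock. */
	if (c->state != MODEL_ACTIVE) {
		ret = -EINVAL;
		goto unlock_ret;
	}

	c->regs[control] = value;

unlock_ret:
	fake_lock_release(&c->lock);
	return ret;
}
```

In this model the two early error paths return without ever touching the lock; only the state check and the register update happen inside the critical section.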


* Re: [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set
  2016-07-26 12:02 ` [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set Uwe Kleine-König
@ 2016-08-04 15:22   ` Grygorii Strashko
  2016-08-09  8:27   ` Mugunthan V N
  1 sibling, 0 replies; 15+ messages in thread
From: Grygorii Strashko @ 2016-08-04 15:22 UTC (permalink / raw)
  To: Uwe Kleine-König, Mugunthan V N; +Cc: linux-omap, netdev, kernel

On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
> The only user of cpdma_control_set (i.e. cpsw_ndo_open) doesn't check
> the return code, so it doesn't matter, which error triggers. So the
> checks that are independant of the fields protected by ctlr->lock can be
> moved out of the critical section.
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>


Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>

> ---
>  drivers/net/ethernet/ti/davinci_cpdma.c | 20 +++++++++-----------
>  1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
> index 73638f7a55d4..5ffa04a306c6 100644
> --- a/drivers/net/ethernet/ti/davinci_cpdma.c
> +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
> @@ -877,23 +877,21 @@ int cpdma_control_set(struct cpdma_ctlr *ctlr, int control, int value)
>  	int ret;
>  	u32 val;
>
> -	spin_lock_irqsave(&ctlr->lock, flags);
> -
> -	ret = -ENOTSUPP;
>  	if (!ctlr->params.has_ext_regs)
> -		goto unlock_ret;
> +		return -ENOTSUPP;
>
> -	ret = -EINVAL;
> -	if (ctlr->state != CPDMA_STATE_ACTIVE)
> -		goto unlock_ret;
> -
> -	ret = -ENOENT;
>  	if (control < 0 || control >= ARRAY_SIZE(controls))
> -		goto unlock_ret;
> +		return -ENOENT;
>
> -	ret = -EPERM;
>  	if ((info->access & ACCESS_WO) != ACCESS_WO)
> +		return -EPERM;
> +
> +	spin_lock_irqsave(&ctlr->lock, flags);
> +
> +	if (ctlr->state != CPDMA_STATE_ACTIVE) {
> +		ret = -EINVAL;
>  		goto unlock_ret;
> +	}
>
>  	val  = dma_reg_read(ctlr, info->reg);
>  	val &= ~(info->mask << info->shift);
>


-- 
regards,
-grygorii


* Re: [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set
  2016-07-26 12:02 ` [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set Uwe Kleine-König
  2016-08-04 15:22   ` Grygorii Strashko
@ 2016-08-09  8:27   ` Mugunthan V N
  1 sibling, 0 replies; 15+ messages in thread
From: Mugunthan V N @ 2016-08-09  8:27 UTC (permalink / raw)
  To: Uwe Kleine-König, Grygorii Strashko; +Cc: linux-omap, netdev, kernel

On Tuesday 26 July 2016 05:32 PM, Uwe Kleine-König wrote:
> The only user of cpdma_control_set (i.e. cpsw_ndo_open) doesn't check
> the return code, so it doesn't matter, which error triggers. So the
> checks that are independant of the fields protected by ctlr->lock can be
> moved out of the critical section.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>

Regards
Mugunthan V N


* [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit
  2016-07-26 12:02 [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt Uwe Kleine-König
  2016-07-26 12:02 ` [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set Uwe Kleine-König
@ 2016-07-26 12:02 ` Uwe Kleine-König
  2016-07-26 14:25   ` Grygorii Strashko
  2016-07-26 14:36 ` [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt Grygorii Strashko
  2 siblings, 1 reply; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-26 12:02 UTC (permalink / raw)
  To: Mugunthan V N, Grygorii Strashko; +Cc: linux-omap, netdev, kernel

Allocating and preparing a dma descriptor doesn't need to happen under
the channel's lock. So do this before taking the channel's lock. The only
downside is that the dma descriptor might be allocated even though the
channel is about to be stopped. This is unlikely though.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 drivers/net/ethernet/ti/davinci_cpdma.c | 38 +++++++++++++++++----------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
index 5ffa04a306c6..ba3462707ae3 100644
--- a/drivers/net/ethernet/ti/davinci_cpdma.c
+++ b/drivers/net/ethernet/ti/davinci_cpdma.c
@@ -542,24 +542,10 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
 	u32				mode;
 	int				ret = 0;
 
-	spin_lock_irqsave(&chan->lock, flags);
-
-	if (chan->state == CPDMA_STATE_TEARDOWN) {
-		ret = -EINVAL;
-		goto unlock_ret;
-	}
-
-	if (chan->count >= chan->desc_num)	{
-		chan->stats.desc_alloc_fail++;
-		ret = -ENOMEM;
-		goto unlock_ret;
-	}
-
 	desc = cpdma_desc_alloc(ctlr->pool);
 	if (!desc) {
 		chan->stats.desc_alloc_fail++;
-		ret = -ENOMEM;
-		goto unlock_ret;
+		return -ENOMEM;
 	}
 
 	if (len < ctlr->params.min_packet_size) {
@@ -571,8 +557,7 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
 	ret = dma_mapping_error(ctlr->dev, buffer);
 	if (ret) {
 		cpdma_desc_free(ctlr->pool, desc, 1);
-		ret = -EINVAL;
-		goto unlock_ret;
+		return -EINVAL;
 	}
 
 	mode = CPDMA_DESC_OWNER | CPDMA_DESC_SOP | CPDMA_DESC_EOP;
@@ -586,6 +571,19 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
 	desc_write(desc, sw_buffer, buffer);
 	desc_write(desc, sw_len,    len);
 
+	spin_lock_irqsave(&chan->lock, flags);
+
+	if (chan->state == CPDMA_STATE_TEARDOWN) {
+		ret = -EINVAL;
+		goto unlock_free;
+	}
+
+	if (chan->count >= chan->desc_num)	{
+		chan->stats.desc_alloc_fail++;
+		ret = -ENOMEM;
+		goto unlock_free;
+	}
+
 	__cpdma_chan_submit(chan, desc);
 
 	if (chan->state == CPDMA_STATE_ACTIVE && chan->rxfree)
@@ -593,8 +591,12 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
 
 	chan->count++;
 
-unlock_ret:
 	spin_unlock_irqrestore(&chan->lock, flags);
+	return 0;
+
+unlock_free:
+	spin_unlock_irqrestore(&chan->lock, flags);
+	cpdma_desc_free(ctlr->pool, desc, 1);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(cpdma_chan_submit);
-- 
2.8.1
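
The pattern this patch applies (prepare outside the lock, validate under the lock, undo the preparation on failure) can be sketched in userspace as follows. Everything here is a hypothetical model rather than the driver: `struct submit_lock` is just a flag, the static `pool` array stands in for the descriptor pool and is assumed to handle its own synchronisation, and `chan_submit_model()` only counts queued descriptors instead of handing them to hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define POOL_SIZE 8

/* Hypothetical stand-in for a spinlock: just a "held" flag. */
struct submit_lock {
	int held;
};

struct desc_model {
	int in_use;
	const void *buf;
	size_t len;
};

/* Toy descriptor pool; assumed to do its own synchronisation. */
static struct desc_model pool[POOL_SIZE];

int pool_in_use(void)
{
	int i, n = 0;

	for (i = 0; i < POOL_SIZE; i++)
		n += pool[i].in_use;
	return n;
}

static struct desc_model *desc_alloc(void)
{
	int i;

	for (i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = 1;
			return &pool[i];
		}
	}
	return NULL;
}

static void desc_free(struct desc_model *d)
{
	d->in_use = 0;
}

enum chan_state { CHAN_ACTIVE, CHAN_TEARDOWN };

struct chan_model {
	struct submit_lock lock;
	enum chan_state state;
	int count;     /* descriptors currently queued */
	int desc_num;  /* per-channel limit */
};

int chan_submit_model(struct chan_model *ch, const void *buf, size_t len)
{
	struct desc_model *desc;
	int ret;

	/* Allocate and fill the descriptor before taking the lock. */
	desc = desc_alloc();
	if (!desc)
		return -ENOMEM;
	desc->buf = buf;
	desc->len = len;

	assert(!ch->lock.held);
	ch->lock.held = 1;

	if (ch->state == CHAN_TEARDOWN) {
		ret = -EINVAL;
		goto unlock_free;
	}
	if (ch->count >= ch->desc_num) {
		ret = -ENOMEM;
		goto unlock_free;
	}

	ch->count++;	/* stands in for queueing the descriptor */
	ch->lock.held = 0;
	return 0;

unlock_free:
	ch->lock.held = 0;
	desc_free(desc);	/* undo the speculative allocation */
	return ret;
}
```

The cost of the shorter critical section is visible in the `unlock_free` path: a descriptor that was allocated speculatively has to be handed back whenever the checks under the lock fail.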


* Re: [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit
  2016-07-26 12:02 ` [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit Uwe Kleine-König
@ 2016-07-26 14:25   ` Grygorii Strashko
  2016-07-27  7:12     ` Uwe Kleine-König
  0 siblings, 1 reply; 15+ messages in thread
From: Grygorii Strashko @ 2016-07-26 14:25 UTC (permalink / raw)
  To: Uwe Kleine-König, Mugunthan V N; +Cc: linux-omap, netdev, kernel

On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
> Allocating and preparing a dma descriptor doesn't need to happen under
> the channel's lock. So do this before taking the channel's lock. The only
> down side is that the dma descriptor might be allocated even though the
> channel is about to be stopped. This is unlikely though.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
>  drivers/net/ethernet/ti/davinci_cpdma.c | 38 +++++++++++++++++----------------
>  1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
> index 5ffa04a306c6..ba3462707ae3 100644
> --- a/drivers/net/ethernet/ti/davinci_cpdma.c
> +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
> @@ -542,24 +542,10 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
>  	u32				mode;
>  	int				ret = 0;
>  
> -	spin_lock_irqsave(&chan->lock, flags);
> -
> -	if (chan->state == CPDMA_STATE_TEARDOWN) {
> -		ret = -EINVAL;
> -		goto unlock_ret;
> -	}
> -
> -	if (chan->count >= chan->desc_num)	{
> -		chan->stats.desc_alloc_fail++;
> -		ret = -ENOMEM;
> -		goto unlock_ret;
> -	}

I'm not sure this is the right thing to do. This check is expected to be strict;
it means "the channel has exhausted the available descriptors, so further descriptor allocation is not allowed".


This might also affect Ivan's work [1], "[PATCH 0/4]  net: ethernet: ti: cpsw: add multi-queue support".



[1] https://lkml.org/lkml/2016/6/30/603
-- 
regards,
-grygorii


* Re: [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit
  2016-07-26 14:25   ` Grygorii Strashko
@ 2016-07-27  7:12     ` Uwe Kleine-König
  2016-07-27 14:08       ` Grygorii Strashko
  0 siblings, 1 reply; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-27  7:12 UTC (permalink / raw)
  To: Grygorii Strashko; +Cc: Mugunthan V N, linux-omap, netdev, kernel

Hello,

On Tue, Jul 26, 2016 at 05:25:58PM +0300, Grygorii Strashko wrote:
> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
> > Allocating and preparing a dma descriptor doesn't need to happen under
> > the channel's lock. So do this before taking the channel's lock. The only
> > down side is that the dma descriptor might be allocated even though the
> > channel is about to be stopped. This is unlikely though.
> > 
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> >  drivers/net/ethernet/ti/davinci_cpdma.c | 38 +++++++++++++++++----------------
> >  1 file changed, 20 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
> > index 5ffa04a306c6..ba3462707ae3 100644
> > --- a/drivers/net/ethernet/ti/davinci_cpdma.c
> > +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
> > @@ -542,24 +542,10 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
> >  	u32				mode;
> >  	int				ret = 0;
> >  
> > -	spin_lock_irqsave(&chan->lock, flags);
> > -
> > -	if (chan->state == CPDMA_STATE_TEARDOWN) {
> > -		ret = -EINVAL;
> > -		goto unlock_ret;
> > -	}
> > -
> > -	if (chan->count >= chan->desc_num)	{
> > -		chan->stats.desc_alloc_fail++;
> > -		ret = -ENOMEM;
> > -		goto unlock_ret;
> > -	}
> 
> I'm not sure this is right thing to do. This check is expected to be strict
> and means "channel has exhausted the available descriptors, so further descs allocation does not allowed".

I developed this patch based on a 4.4 kernel, which doesn't have
742fb20fd4c7 ("net: ethernet: ti: cpdma: switch to use genalloc"). There
my patch is more obviously correct. As chan->count is currently
protected by chan->lock, we must hold the lock for this check. If a
failing check means we must not call cpdma_desc_alloc in the first
place, that's bad.

But I'm not sure this is the case here. After all cpdma_desc_alloc
doesn't do anything relevant for the hardware, right?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* Re: [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit
  2016-07-27  7:12     ` Uwe Kleine-König
@ 2016-07-27 14:08       ` Grygorii Strashko
  2016-07-27 18:11         ` Ivan Khoronzhuk
  0 siblings, 1 reply; 15+ messages in thread
From: Grygorii Strashko @ 2016-07-27 14:08 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: Mugunthan V N, linux-omap, netdev, kernel, Ivan Khoronzhuk

On 07/27/2016 10:12 AM, Uwe Kleine-König wrote:
> Hello,
> 
> On Tue, Jul 26, 2016 at 05:25:58PM +0300, Grygorii Strashko wrote:
>> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
>>> Allocating and preparing a dma descriptor doesn't need to happen under
>>> the channel's lock. So do this before taking the channel's lock. The only
>>> down side is that the dma descriptor might be allocated even though the
>>> channel is about to be stopped. This is unlikely though.
>>>
>>> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>>> ---
>>>  drivers/net/ethernet/ti/davinci_cpdma.c | 38 +++++++++++++++++----------------
>>>  1 file changed, 20 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
>>> index 5ffa04a306c6..ba3462707ae3 100644
>>> --- a/drivers/net/ethernet/ti/davinci_cpdma.c
>>> +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
>>> @@ -542,24 +542,10 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
>>>  	u32				mode;
>>>  	int				ret = 0;
>>>  
>>> -	spin_lock_irqsave(&chan->lock, flags);
>>> -
>>> -	if (chan->state == CPDMA_STATE_TEARDOWN) {
>>> -		ret = -EINVAL;
>>> -		goto unlock_ret;
>>> -	}
>>> -
>>> -	if (chan->count >= chan->desc_num)	{
>>> -		chan->stats.desc_alloc_fail++;
>>> -		ret = -ENOMEM;
>>> -		goto unlock_ret;
>>> -	}
>>
>> I'm not sure this is right thing to do. This check is expected to be strict
>> and means "channel has exhausted the available descriptors, so further descs allocation does not allowed".
> 
> I developed this patch basing on a 4.4 kernel which doesn't have
> 742fb20fd4c7 ("net: ethernet: ti: cpdma: switch to use genalloc"). There
> my patch is more obviously correct. As currently chan->count is
> protected by chan->lock we must hold the lock for this check. If a
> failing check means we must not call cpdma_desc_alloc in the first
> place, that's bad.

Yes. That's the intention of this check :(
As an example, take two (rx/tx) channels with
RX desc_num = 16 (max allowed number of descriptors)
TX desc_num = 16 (max allowed number of descriptors)
With the current code the number of allocated descriptors will never exceed 16.

With your change, in the corner case where the TX channel has already used 16 descriptors, the
following will happen:
cpdma_chan_submit()
 - cpdma_desc_alloc() -  will allocate a 17th desc
 - lock
 - check for chan->count - fail
 - unlock
 - cpdma_desc_free()

So your patch adds an extra desc_alloc/desc_free pair in the above corner case,
and that's what I'm worried about (TEARDOWN seems ok), especially taking into account
further multi-queue feature development.

The above corner case should be very rare because of the guard check in cpsw_ndo_start_xmit(),
but it could happen.
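
The corner case above can be reproduced with a purely counting model. This is only a sketch under assumptions: `struct pool_counter`, `struct chan_counter` and both submit functions are hypothetical, there is no locking or hardware involved, and the pool merely tracks how many descriptors are allocated and the highest value that number ever reached.

```c
#include <assert.h>
#include <errno.h>

struct pool_counter {
	int in_use;  /* descriptors currently allocated */
	int peak;    /* highest value in_use ever reached */
};

struct chan_counter {
	int count;     /* descriptors queued on this channel */
	int desc_num;  /* max allowed number of descriptors */
};

static void pool_get(struct pool_counter *p)
{
	p->in_use++;
	if (p->in_use > p->peak)
		p->peak = p->in_use;
}

static void pool_put(struct pool_counter *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}

/* Current order: test the limit first, allocate only if it passes. */
int submit_check_first(struct pool_counter *p, struct chan_counter *ch)
{
	if (ch->count >= ch->desc_num)
		return -ENOMEM;
	pool_get(p);
	ch->count++;
	return 0;
}

/* Patched order: allocate first, test the limit afterwards. */
int submit_alloc_first(struct pool_counter *p, struct chan_counter *ch)
{
	pool_get(p);
	if (ch->count >= ch->desc_num) {
		pool_put(p);	/* the extra alloc/free pair */
		return -ENOMEM;
	}
	ch->count++;
	return 0;
}
```

With desc_num = 16 and a 17th submit attempt, both variants return -ENOMEM and end with 16 descriptors allocated; the difference in this model is that the alloc-first variant briefly reaches a peak of 17.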

> 
> But I'm not sure this is the case here. After all cpdma_desc_alloc
> doesn't do anything relevant for the hardware, right?

Right.

Thanks. I'll try to do some measurements as well.
-- 
regards,
-grygorii


* Re: [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit
  2016-07-27 14:08       ` Grygorii Strashko
@ 2016-07-27 18:11         ` Ivan Khoronzhuk
  0 siblings, 0 replies; 15+ messages in thread
From: Ivan Khoronzhuk @ 2016-07-27 18:11 UTC (permalink / raw)
  To: Grygorii Strashko, Uwe Kleine-König
  Cc: Mugunthan V N, linux-omap, netdev, kernel



On 27.07.16 17:08, Grygorii Strashko wrote:
> On 07/27/2016 10:12 AM, Uwe Kleine-König wrote:
>> Hello,
>>
>> On Tue, Jul 26, 2016 at 05:25:58PM +0300, Grygorii Strashko wrote:
>>> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
>>>> Allocating and preparing a dma descriptor doesn't need to happen under
>>>> the channel's lock. So do this before taking the channel's lock. The only
>>>> down side is that the dma descriptor might be allocated even though the
>>>> channel is about to be stopped. This is unlikely though.
>>>>
>>>> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>>>> ---
>>>>  drivers/net/ethernet/ti/davinci_cpdma.c | 38 +++++++++++++++++----------------
>>>>  1 file changed, 20 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c
>>>> index 5ffa04a306c6..ba3462707ae3 100644
>>>> --- a/drivers/net/ethernet/ti/davinci_cpdma.c
>>>> +++ b/drivers/net/ethernet/ti/davinci_cpdma.c
>>>> @@ -542,24 +542,10 @@ int cpdma_chan_submit(struct cpdma_chan *chan, void *token, void *data,
>>>>  	u32				mode;
>>>>  	int				ret = 0;
>>>>
>>>> -	spin_lock_irqsave(&chan->lock, flags);
>>>> -
>>>> -	if (chan->state == CPDMA_STATE_TEARDOWN) {
>>>> -		ret = -EINVAL;
>>>> -		goto unlock_ret;
>>>> -	}
>>>> -
>>>> -	if (chan->count >= chan->desc_num)	{
>>>> -		chan->stats.desc_alloc_fail++;
>>>> -		ret = -ENOMEM;
>>>> -		goto unlock_ret;
>>>> -	}
>>>
>>> I'm not sure this is right thing to do. This check is expected to be strict
>>> and means "channel has exhausted the available descriptors, so further descs allocation does not allowed".
>>
>> I developed this patch basing on a 4.4 kernel which doesn't have
>> 742fb20fd4c7 ("net: ethernet: ti: cpdma: switch to use genalloc"). There
>> my patch is more obviously correct. As currently chan->count is
>> protected by chan->lock we must hold the lock for this check. If a
>> failing check means we must not call cpdma_desc_alloc in the first
>> place, that's bad.
Unfortunately, chan->count is not the only case where this lock is needed.
I like the idea of removing a bunch of locks from here (I was wondering why so many
locks are needed when using h/w queues, but this is the style the driver is written in).
At the very least, this lock is also needed to cover the stats counters.
In the case of the cpsw driver, which uses cpdma_chan, the same channel can be shared
between two emacs (in dual emac mode), so the lock is needed for every chan var.
So that's not a rare case. In general, optimizing cpdma is a good idea,
but it seems it could require many more changes.

>
> Yes. That's intention of this check :(
> Now it'll work as following for two (rx/tx) channels, as example
> RX desc_num = 16 (max allowed number of descriptors)
> TX desc_num = 16 (max allowed number of descriptors)
> and with current code number of allocated descriptors will never exceed 16.
>
> with your change, in corner case when TX channel's already utilized 16 descriptors the
> following will happen:
> cpdma_chan_submit()
>  - cpdma_desc_alloc() -  will allocate 17th desc
>  - lock
>  - check for chan->count - fail
>  - unlock
>  - cpdma_desc_free()
>
> so your patch will add additional desc_alloc/desc_free in the above corner case
> and that's what i'm worry about (TEARDOWN seems ok) especially taking into account
> further multi-queue feature development.
>
> Above corner case seems might happen very rare, because of the guard check in cpsw_ndo_start_xmit(),
> but it could.
>
>>
>> But I'm not sure this is the case here. After all cpdma_desc_alloc
>> doesn't do anything relevant for the hardware, right?
>
> Right.
>
> Thanks. I'd try to do some measurement also.
>

-- 
Regards,
Ivan Khoronzhuk


* Re: [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
  2016-07-26 12:02 [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt Uwe Kleine-König
  2016-07-26 12:02 ` [PATCH 1/2] net: davinci_cpdma: reduce time holding ctlr->lock in cpdma_control_set Uwe Kleine-König
  2016-07-26 12:02 ` [PATCH 2/2] net: davinci_cpdma: reduce time holding chan->lock in cpdma_chan_submit Uwe Kleine-König
@ 2016-07-26 14:36 ` Grygorii Strashko
  2016-07-27  7:03   ` Uwe Kleine-König
  2 siblings, 1 reply; 15+ messages in thread
From: Grygorii Strashko @ 2016-07-26 14:36 UTC (permalink / raw)
  To: Uwe Kleine-König, Mugunthan V N; +Cc: linux-omap, netdev, kernel

On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
> Hello,
>
> these patches are based on next-20160726. I didn't check yet how latency
> improves by using these patches, but even if the improvment is small,
> it's still a good idea to have them.

Sorry, but how will this affect -RT? These are not raw locks, so
they will be converted to rt-mutexes, which are sleepable.
Or have I missed something?

>
> A second pair of eyes checking what I did would be great.
>
> Best regards
> Uwe
>
>
> Uwe Kleine-König (2):
>   net: davinci_cpdma: reduce time holding ctlr->lock in
>     cpdma_control_set
>   net: davinci_cpdma: reduce time holding chan->lock in
>     cpdma_chan_submit
>
>  drivers/net/ethernet/ti/davinci_cpdma.c | 58 ++++++++++++++++-----------------
>  1 file changed, 29 insertions(+), 29 deletions(-)
>


-- 
regards,
-grygorii


* Re: [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
  2016-07-26 14:36 ` [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt Grygorii Strashko
@ 2016-07-27  7:03   ` Uwe Kleine-König
  2016-07-27 14:11     ` Grygorii Strashko
  0 siblings, 1 reply; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-27  7:03 UTC (permalink / raw)
  To: Grygorii Strashko; +Cc: Mugunthan V N, linux-omap, netdev, kernel

On Tue, Jul 26, 2016 at 05:36:49PM +0300, Grygorii Strashko wrote:
> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
> >Hello,
> >
> >these patches are based on next-20160726. I didn't check yet how latency
> >improves by using these patches, but even if the improvment is small,
> >it's still a good idea to have them.
> 
> Sry, but how this will affect on -RT? This is not a raw locks, so
> they will be converted to rt-mutexes which are sleepable.
> Or I've missed smth?

They are still locks after all. On -rt I saw the following for the relevant
application:

  send packet          |
    take lock          |
    write pckt to hw   |
                       | rcv irq
                       |   take lock
                       |     schedule
    drop lock          |
      schedule         |
                       |   get pckt from hw
                       |   drop lock

So reducing the time a lock is held reduces the chance that another
thread finds the lock contended, which results in extra context switches.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* Re: [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
  2016-07-27  7:03   ` Uwe Kleine-König
@ 2016-07-27 14:11     ` Grygorii Strashko
  2016-07-27 14:38       ` Uwe Kleine-König
  0 siblings, 1 reply; 15+ messages in thread
From: Grygorii Strashko @ 2016-07-27 14:11 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: Mugunthan V N, linux-omap, netdev, kernel

On 07/27/2016 10:03 AM, Uwe Kleine-König wrote:
> On Tue, Jul 26, 2016 at 05:36:49PM +0300, Grygorii Strashko wrote:
>> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
>>> Hello,
>>>
>>> these patches are based on next-20160726. I didn't check yet how latency
>>> improves by using these patches, but even if the improvment is small,
>>> it's still a good idea to have them.
>>
>> Sry, but how this will affect on -RT? This is not a raw locks, so
>> they will be converted to rt-mutexes which are sleepable.
>> Or I've missed smth?
> 
> They are still locks after all. On -rt I saw for the relevant
> application:
> 
>   send package         |
>     take lock          |
>     write pckt to hw   |
>                        | rcv irq
> 		       |   take lock
> 		       |     schedule
>     drop lock	       | 
>       schedule         |
>                        |   get pckt from hw
> 		       |   drop lock
> 
> So reducing the time a lock is taken reduces the chances that the lock
> is contended for another thread which results in extra context switches.
> 
Thanks a lot for the explanation. So, this is not exactly an rt-latency reduction,
but it might improve net performance on -RT. Correct?

Thanks.
-- 
regards,
-grygorii


* Re: [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
  2016-07-27 14:11     ` Grygorii Strashko
@ 2016-07-27 14:38       ` Uwe Kleine-König
  2016-07-28  9:34         ` Grygorii Strashko
  0 siblings, 1 reply; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-27 14:38 UTC (permalink / raw)
  To: Grygorii Strashko; +Cc: Mugunthan V N, linux-omap, netdev, kernel

Hello,

On Wed, Jul 27, 2016 at 05:11:54PM +0300, Grygorii Strashko wrote:
> On 07/27/2016 10:03 AM, Uwe Kleine-König wrote:
> > On Tue, Jul 26, 2016 at 05:36:49PM +0300, Grygorii Strashko wrote:
> >> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
> >>> Hello,
> >>>
> >>> these patches are based on next-20160726. I didn't check yet how latency
> >>> improves by using these patches, but even if the improvment is small,
> >>> it's still a good idea to have them.
> >>
> >> Sry, but how this will affect on -RT? This is not a raw locks, so
> >> they will be converted to rt-mutexes which are sleepable.
> >> Or I've missed smth?
> > 
> > They are still locks after all. On -rt I saw for the relevant
> > application:
> > 
> >   send package         |
> >     take lock          |
> >     write pckt to hw   |
> >                        | rcv irq
> > 		       |   take lock
> > 		       |     schedule
> >     drop lock	       | 
> >       schedule         |
> >                        |   get pckt from hw
> > 		       |   drop lock
> > 
> > So reducing the time a lock is taken reduces the chances that the lock
> > is contended for another thread which results in extra context switches.
> > 
> Thanks a lot for explanation. So, this is not exactly rt-latency reduction,
> but it might improve net performance on -RT. correct?

Well, it's not really rt related, but if you hit a locked lock on rt it
hurts more than on !rt. And this results in increased latency.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |


* Re: [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
  2016-07-27 14:38       ` Uwe Kleine-König
@ 2016-07-28  9:34         ` Grygorii Strashko
  2016-07-28 12:12           ` Uwe Kleine-König
  0 siblings, 1 reply; 15+ messages in thread
From: Grygorii Strashko @ 2016-07-28  9:34 UTC (permalink / raw)
  To: Uwe Kleine-König; +Cc: Mugunthan V N, linux-omap, netdev, kernel

On 07/27/2016 05:38 PM, Uwe Kleine-König wrote:
> Hello,
> 
> On Wed, Jul 27, 2016 at 05:11:54PM +0300, Grygorii Strashko wrote:
>> On 07/27/2016 10:03 AM, Uwe Kleine-König wrote:
>>> On Tue, Jul 26, 2016 at 05:36:49PM +0300, Grygorii Strashko wrote:
>>>> On 07/26/2016 03:02 PM, Uwe Kleine-König wrote:
>>>>> Hello,
>>>>>
>>>>> these patches are based on next-20160726. I didn't check yet how latency
>>>>> improves by using these patches, but even if the improvment is small,
>>>>> it's still a good idea to have them.
>>>>
>>>> Sry, but how this will affect on -RT? This is not a raw locks, so
>>>> they will be converted to rt-mutexes which are sleepable.
>>>> Or I've missed smth?
>>>
>>> They are still locks after all. On -rt I saw for the relevant
>>> application:
>>>
>>>   send package         |
>>>     take lock          |
>>>     write pckt to hw   |
>>>                        | rcv irq
>>> 		       |   take lock
>>> 		       |     schedule
>>>     drop lock	       | 
>>>       schedule         |
>>>                        |   get pckt from hw
>>> 		       |   drop lock
>>>
>>> So reducing the time a lock is taken reduces the chances that the lock
>>> is contended for another thread which results in extra context switches.
>>>
>> Thanks a lot for explanation. So, this is not exactly rt-latency reduction,
>> but it might improve net performance on -RT. correct?
> 
> Well, it's not really rt related, but if you hit a locked lock on rt it
> hurts more than on !rt. And this results in increased latency.
> 

Thanks. I just wanted to have a clear understanding of the [possible] issue.
And I'd appreciate it if you could share any measurement results you have.

-- 
regards,
-grygorii


* Re: [PATCH 0/2] net: davinci_cpdma: reduce latency on -rt
  2016-07-28  9:34         ` Grygorii Strashko
@ 2016-07-28 12:12           ` Uwe Kleine-König
  0 siblings, 0 replies; 15+ messages in thread
From: Uwe Kleine-König @ 2016-07-28 12:12 UTC (permalink / raw)
  To: Grygorii Strashko; +Cc: Mugunthan V N, linux-omap, kernel, netdev

Hello Grygorii,

On Thu, Jul 28, 2016 at 12:34:19PM +0300, Grygorii Strashko wrote:
> Thanks. I've just wanted to have clear understanding of the [possible] issue.
> And I'd be appreciated if you could share and measurement results if you have.

I haven't measured anything (yet); I just considered these patches
low-hanging fruit. But when I look at this again, I will provide some
numbers.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
