From: Tero Kristo <t-kristo@ti.com>
To: Grygorii Strashko <grygorii.strashko@ti.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Dave Gerlach <d-gerlach@ti.com>
Cc: tony@atomide.com, lokeshvutla@ti.com,
	linux-crypto@vger.kernel.org, linux-omap@vger.kernel.org,
	davem@davemloft.net, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 02/28] crypto: omap-sham: Don't idle/start SHA device between Encrypt operations
Date: Wed, 22 Jun 2016 12:17:09 +0300
Message-ID: <576A5795.5030300@ti.com>
In-Reply-To: <5756BD0B.4030506@ti.com>

On 07/06/16 15:24, Grygorii Strashko wrote:
> On 06/07/2016 02:52 PM, Tero Kristo wrote:
>> On 07/06/16 13:08, Herbert Xu wrote:
>>> On Wed, Jun 01, 2016 at 06:03:52PM -0500, Dave Gerlach wrote:
>>>> On 06/01/2016 04:53 AM, Grygorii Strashko wrote:
>>>>> On 06/01/2016 11:56 AM, Tero Kristo wrote:
>>>>>> From: Lokesh Vutla <lokeshvutla@ti.com>
>>>>>>
>>>>>> Calling the runtime PM API for every block causes a serious perf hit
>>>>>> to crypto operations that are done on a long buffer. As crypto is
>>>>>> performed on a page boundary, encrypting large buffers can cause a
>>>>>> series of crypto operations divided by page. The runtime PM API is
>>>>>> also called that many times.
>>>>>>
>>>>>> We call pm_runtime_get_sync only at the beginning of the session
>>>>>> (cra_init) and pm_runtime_put at the end. This results in up to a 50%
>>>>>> speedup. This doesn't make the driver keep the system awake, as
>>>>>> runtime get/put is only called during a crypto session, which usually
>>>>>> completes quickly.
>>>>>>
>>>>>> Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
>>>>>> Signed-off-by: Tero Kristo <t-kristo@ti.com>
>>>>>> ---
>>>>>>   drivers/crypto/omap-sham.c | 27 +++++++++++++++++----------
>>>>>>   1 file changed, 17 insertions(+), 10 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
>>>>>> index 6eefaa2..bd0258f 100644
>>>>>> --- a/drivers/crypto/omap-sham.c
>>>>>> +++ b/drivers/crypto/omap-sham.c
>>>>>> @@ -360,14 +360,6 @@ static void omap_sham_copy_ready_hash(struct ahash_request *req)
>>>>>>
>>>>>>   static int omap_sham_hw_init(struct omap_sham_dev *dd)
>>>>>>   {
>>>>>> -    int err;
>>>>>> -
>>>>>> -    err = pm_runtime_get_sync(dd->dev);
>>>>>> -    if (err < 0) {
>>>>>> -        dev_err(dd->dev, "failed to get sync: %d\n", err);
>>>>>> -        return err;
>>>>>> -    }
>>>>>> -
>>>>
>>>> Would it be worth it to investigate a pm_runtime autosuspend
>>>> approach rather than knocking runtime PM out here completely? I am
>>>> not clear if the overhead is coming from the pm_runtime calls
>>>> themselves or the actual idling of the IP, but if it's the idling of
>>>> the IP causing the slowdown, with a large enough autosuspend_delay
>>>> we don't actually sleep between each block but after a long enough
>>>> period of idle time we would actually suspend.
>>>
>>> Indeed, I think this patch is bogus.  cra_init is associated
>>> with the tfm object which is usually long-lived.  So doing power
>>> management there makes no sense.
>>>
>>> Cheers,
>>>
>>
>> I can investigate this further, but I believe this patch itself gave a
>> noticeable performance boost.
>>
>> This is an optimization anyway, and not critical for functionality.
>>
>
> It is not critical only if the code below does not introduce races

I don't get your point here. This patch is an optimization, and the 
driver works fine without it.

> +    spin_lock_bh(&sham.lock);
> +    list_for_each_entry(dd, &sham.dev_list, list) {
> +        break;
> +    }
> +    spin_unlock_bh(&sham.lock);
>
> Is it guaranteed that dd will always be alive at this moment?

Typically yes, but I think there might be a race condition here if the 
driver is removed during operation. Anyway, I'll drop this patch and 
change the optimization to use autosuspend as Dave suggested; that gives 
almost the same performance boost as this one (I miss a couple of 
percent in the overall performance, but I can live with that.)

-Tero

>
> +
> +    pm_runtime_get_sync(dd->dev);
>
>
>
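
For readers following the autosuspend suggestion above, here is a minimal
user-space sketch of why an autosuspend delay avoids the per-block cost.
It is a toy model, not the omap-sham driver and not the kernel runtime PM
implementation: struct model_dev and the model_* functions are hypothetical
names invented for illustration, and the 1 ms block time is an assumption.
In a real driver the corresponding calls would be along the lines of
pm_runtime_use_autosuspend(), pm_runtime_set_autosuspend_delay(),
pm_runtime_mark_last_busy() and pm_runtime_put_autosuspend().

```c
#include <stdbool.h>

/* Hypothetical user-space model of a runtime-PM-managed device. */
struct model_dev {
	bool active;          /* true while the device is powered */
	int usage;            /* outstanding "get" references */
	long last_busy_ms;    /* time of the most recent put */
	long autosuspend_ms;  /* 0 means suspend right away on the last put */
	int resumes;          /* number of off -> on transitions */
	int suspends;         /* number of on -> off transitions */
};

/* Model of the autosuspend timer: power down once idle for long enough. */
static void model_tick(struct model_dev *d, long now_ms)
{
	if (d->active && d->usage == 0 && d->autosuspend_ms > 0 &&
	    now_ms - d->last_busy_ms >= d->autosuspend_ms) {
		d->active = false;
		d->suspends++;
	}
}

/* Take a reference, resuming the device if it is currently off. */
static void model_get(struct model_dev *d, long now_ms)
{
	model_tick(d, now_ms);
	if (!d->active) {
		d->active = true;
		d->resumes++;
	}
	d->usage++;
}

/* Drop a reference; without autosuspend the device idles immediately. */
static void model_put(struct model_dev *d, long now_ms)
{
	d->usage--;
	d->last_busy_ms = now_ms;
	if (d->usage == 0 && d->autosuspend_ms == 0) {
		d->active = false;
		d->suspends++;
	}
}

/*
 * Process nblocks blocks with one get/put pair per block and gap_ms of
 * idle time between blocks. Returns how many times the device resumed.
 */
static int model_run(long autosuspend_ms, int nblocks, long gap_ms)
{
	struct model_dev d = { .autosuspend_ms = autosuspend_ms };
	long now = 0;

	for (int i = 0; i < nblocks; i++) {
		model_get(&d, now);
		now += 1;	/* assumed: each block takes 1 ms */
		model_put(&d, now);
		now += gap_ms;
	}
	return d.resumes;
}
```

In this model, model_run(0, 100, 1) resumes the device once per block,
while model_run(50, 100, 1) resumes it only once, because the gap between
blocks never reaches the delay. The get/put calls are still made for every
block, so any overhead in the calls themselves remains.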

