* [PATCH] hwrng: bcm2835: Fix hwrng throughput regression
@ 2023-08-26 11:28 Stefan Wahren
2023-08-26 12:34 ` Jason A. Donenfeld
0 siblings, 1 reply; 6+ messages in thread
From: Stefan Wahren @ 2023-08-26 11:28 UTC (permalink / raw)
To: Olivia Mackall, Herbert Xu, Florian Fainelli, Ray Jui,
Scott Branden
Cc: Jason A. Donenfeld, Mark Brown, linux-crypto, linux-arm-kernel,
bcm-kernel-feedback-list, Stefan Wahren
The recent RCU stall fix caused a massive throughput regression of the
hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput
as before the RCU stall fix.
Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
cpu_relax ~138025 Bytes / sec
hwrng_msleep(1000) ~13 Bytes / sec
usleep_range(100,200) ~92141 Bytes / sec
Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
---
drivers/char/hw_random/bcm2835-rng.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
index e98fcac578d6..3f1b6aaa98ee 100644
--- a/drivers/char/hw_random/bcm2835-rng.c
+++ b/drivers/char/hw_random/bcm2835-rng.c
@@ -14,6 +14,7 @@
#include <linux/printk.h>
#include <linux/clk.h>
#include <linux/reset.h>
+#include <linux/delay.h>
#define RNG_CTRL 0x0
#define RNG_STATUS 0x4
@@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
if (!wait)
return 0;
- hwrng_msleep(rng, 1000);
+ usleep_range(100, 200);
}
num_words = rng_readl(priv, RNG_STATUS) >> 24;
--
2.34.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] hwrng: bcm2835: Fix hwrng throughput regression
2023-08-26 11:28 [PATCH] hwrng: bcm2835: Fix hwrng throughput regression Stefan Wahren
@ 2023-08-26 12:34 ` Jason A. Donenfeld
2023-08-26 14:01 ` Stefan Wahren
0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2023-08-26 12:34 UTC (permalink / raw)
To: Stefan Wahren
Cc: Olivia Mackall, Herbert Xu, Florian Fainelli, Ray Jui,
Scott Branden, Mark Brown, linux-crypto, linux-arm-kernel,
bcm-kernel-feedback-list
On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote:
> The recent RCU stall fix caused a massive throughput regression of the
> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput
> as before the RCU stall fix.
>
> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
>
> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
>
> cpu_relax ~138025 Bytes / sec
> hwrng_msleep(1000) ~13 Bytes / sec
> usleep_range(100,200) ~92141 Bytes / sec
>
> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/
> Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
> ---
> drivers/char/hw_random/bcm2835-rng.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
> index e98fcac578d6..3f1b6aaa98ee 100644
> --- a/drivers/char/hw_random/bcm2835-rng.c
> +++ b/drivers/char/hw_random/bcm2835-rng.c
> @@ -14,6 +14,7 @@
> #include <linux/printk.h>
> #include <linux/clk.h>
> #include <linux/reset.h>
> +#include <linux/delay.h>
>
> #define RNG_CTRL 0x0
> #define RNG_STATUS 0x4
> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
> if (!wait)
> return 0;
> - hwrng_msleep(rng, 1000);
> + usleep_range(100, 200);
I think we still need to use the hwrng_msleep function so that the sleep
remains cancelable. Maybe just change the 1000 to 100?
Jason
> }
>
> num_words = rng_readl(priv, RNG_STATUS) >> 24;
> --
> 2.34.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] hwrng: bcm2835: Fix hwrng throughput regression
2023-08-26 12:34 ` Jason A. Donenfeld
@ 2023-08-26 14:01 ` Stefan Wahren
2023-08-26 15:48 ` Jason A. Donenfeld
0 siblings, 1 reply; 6+ messages in thread
From: Stefan Wahren @ 2023-08-26 14:01 UTC (permalink / raw)
To: Jason A. Donenfeld
Cc: Olivia Mackall, Herbert Xu, Florian Fainelli, Ray Jui,
Scott Branden, Mark Brown, linux-crypto, linux-arm-kernel,
bcm-kernel-feedback-list
Hi Jason,
Am 26.08.23 um 14:34 schrieb Jason A. Donenfeld:
> On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote:
>> The recent RCU stall fix caused a massive throughput regression of the
>> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput
>> as before the RCU stall fix.
>>
>> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
>>
>> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
>>
>> cpu_relax ~138025 Bytes / sec
>> hwrng_msleep(1000) ~13 Bytes / sec
>> usleep_range(100,200) ~92141 Bytes / sec
>>
>> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
>> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/
>> Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
>> ---
>> drivers/char/hw_random/bcm2835-rng.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
>> index e98fcac578d6..3f1b6aaa98ee 100644
>> --- a/drivers/char/hw_random/bcm2835-rng.c
>> +++ b/drivers/char/hw_random/bcm2835-rng.c
>> @@ -14,6 +14,7 @@
>> #include <linux/printk.h>
>> #include <linux/clk.h>
>> #include <linux/reset.h>
>> +#include <linux/delay.h>
>>
>> #define RNG_CTRL 0x0
>> #define RNG_STATUS 0x4
>> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
>> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
>> if (!wait)
>> return 0;
>> - hwrng_msleep(rng, 1000);
>> + usleep_range(100, 200);
>
> I think we still need to use the hwrng_msleep function so that the sleep
> remains cancelable. Maybe just change the 1000 to 100?
i found that other hwrng driver like iproc-rng200 (Raspberry Pi 4) also
use usleep_range().
Nevertheless here are more numbers:
usleep_range(200,400) : 47776 bytes / sec
hwrng_msleep(20) : 715 bytes / sec
Changing to 100 ms won't be a real gain.
>
> Jason
>
>> }
>>
>> num_words = rng_readl(priv, RNG_STATUS) >> 24;
>> --
>> 2.34.1
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] hwrng: bcm2835: Fix hwrng throughput regression
2023-08-26 14:01 ` Stefan Wahren
@ 2023-08-26 15:48 ` Jason A. Donenfeld
2023-08-30 18:05 ` Stefan Wahren
0 siblings, 1 reply; 6+ messages in thread
From: Jason A. Donenfeld @ 2023-08-26 15:48 UTC (permalink / raw)
To: Stefan Wahren
Cc: Olivia Mackall, Herbert Xu, Florian Fainelli, Ray Jui,
Scott Branden, Mark Brown, linux-crypto, linux-arm-kernel,
bcm-kernel-feedback-list
On Sat, Aug 26, 2023 at 04:01:58PM +0200, Stefan Wahren wrote:
> Hi Jason,
>
> Am 26.08.23 um 14:34 schrieb Jason A. Donenfeld:
> > On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote:
> >> The recent RCU stall fix caused a massive throughput regression of the
> >> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput
> >> as before the RCU stall fix.
> >>
> >> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
> >>
> >> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
> >>
> >> cpu_relax ~138025 Bytes / sec
> >> hwrng_msleep(1000) ~13 Bytes / sec
> >> usleep_range(100,200) ~92141 Bytes / sec
> >>
> >> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
> >> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/
> >> Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
> >> ---
> >> drivers/char/hw_random/bcm2835-rng.c | 3 ++-
> >> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
> >> index e98fcac578d6..3f1b6aaa98ee 100644
> >> --- a/drivers/char/hw_random/bcm2835-rng.c
> >> +++ b/drivers/char/hw_random/bcm2835-rng.c
> >> @@ -14,6 +14,7 @@
> >> #include <linux/printk.h>
> >> #include <linux/clk.h>
> >> #include <linux/reset.h>
> >> +#include <linux/delay.h>
> >>
> >> #define RNG_CTRL 0x0
> >> #define RNG_STATUS 0x4
> >> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
> >> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
> >> if (!wait)
> >> return 0;
> >> - hwrng_msleep(rng, 1000);
> >> + usleep_range(100, 200);
> >
> > I think we still need to use the hwrng_msleep function so that the sleep
> > remains cancelable. Maybe just change the 1000 to 100?
> i found that other hwrng driver like iproc-rng200 (Raspberry Pi 4) also
> use usleep_range().
>
> Nevertheless here are more numbers:
>
> usleep_range(200,400) : 47776 bytes / sec
> hwrng_msleep(20) : 715 bytes / sec
>
> Changing to 100 ms won't be a real gain.
I'm fine with whatever number you want there. Maybe we need a
hwrng_usleep_range() that takes into account rng->dying like
hwrng_msleep() does? (And iproc-rng200 should probably use that too?)
Jason
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] hwrng: bcm2835: Fix hwrng throughput regression
2023-08-26 15:48 ` Jason A. Donenfeld
@ 2023-08-30 18:05 ` Stefan Wahren
2023-08-30 22:31 ` Jason A. Donenfeld
0 siblings, 1 reply; 6+ messages in thread
From: Stefan Wahren @ 2023-08-30 18:05 UTC (permalink / raw)
To: Jason A. Donenfeld, Herbert Xu
Cc: Olivia Mackall, Florian Fainelli, Ray Jui, Scott Branden,
Mark Brown, linux-crypto, linux-arm-kernel,
bcm-kernel-feedback-list, Thorsten Leemhuis, Phil Elwell
Hi Jason,
Am 26.08.23 um 17:48 schrieb Jason A. Donenfeld:
> On Sat, Aug 26, 2023 at 04:01:58PM +0200, Stefan Wahren wrote:
>> Hi Jason,
>>
>> Am 26.08.23 um 14:34 schrieb Jason A. Donenfeld:
>>> On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote:
>>>> The recent RCU stall fix caused a massive throughput regression of the
>>>> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput
>>>> as before the RCU stall fix.
>>>>
>>>> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
>>>>
>>>> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
>>>>
>>>> cpu_relax ~138025 Bytes / sec
>>>> hwrng_msleep(1000) ~13 Bytes / sec
>>>> usleep_range(100,200) ~92141 Bytes / sec
>>>>
>>>> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
>>>> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/
>>>> Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
>>>> ---
>>>> drivers/char/hw_random/bcm2835-rng.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
>>>> index e98fcac578d6..3f1b6aaa98ee 100644
>>>> --- a/drivers/char/hw_random/bcm2835-rng.c
>>>> +++ b/drivers/char/hw_random/bcm2835-rng.c
>>>> @@ -14,6 +14,7 @@
>>>> #include <linux/printk.h>
>>>> #include <linux/clk.h>
>>>> #include <linux/reset.h>
>>>> +#include <linux/delay.h>
>>>>
>>>> #define RNG_CTRL 0x0
>>>> #define RNG_STATUS 0x4
>>>> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
>>>> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
>>>> if (!wait)
>>>> return 0;
>>>> - hwrng_msleep(rng, 1000);
>>>> + usleep_range(100, 200);
>>> I think we still need to use the hwrng_msleep function so that the sleep
>>> remains cancelable. Maybe just change the 1000 to 100?
>> i found that other hwrng driver like iproc-rng200 (Raspberry Pi 4) also
>> use usleep_range().
>>
>> Nevertheless here are more numbers:
>>
>> usleep_range(200,400) : 47776 bytes / sec
>> hwrng_msleep(20) : 715 bytes / sec
>>
>> Changing to 100 ms won't be a real gain.
> I'm fine with whatever number you want there. Maybe we need a
> hwrng_usleep_range() that takes into account rng->dying like
> hwrng_msleep() does? (And iproc-rng200 should probably use that too?)
the idea of this patch was to fix the performance regression in upcoming
mainline and backport the fix to Linux 6.1 LTS. After that i'm fine with
the introduction of hwrng_usleep_range().
Best regards
> Jason
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] hwrng: bcm2835: Fix hwrng throughput regression
2023-08-30 18:05 ` Stefan Wahren
@ 2023-08-30 22:31 ` Jason A. Donenfeld
0 siblings, 0 replies; 6+ messages in thread
From: Jason A. Donenfeld @ 2023-08-30 22:31 UTC (permalink / raw)
To: Stefan Wahren
Cc: Herbert Xu, Olivia Mackall, Florian Fainelli, Ray Jui,
Scott Branden, Mark Brown, linux-crypto, linux-arm-kernel,
bcm-kernel-feedback-list, Thorsten Leemhuis, Phil Elwell
On Wed, Aug 30, 2023 at 08:05:39PM +0200, Stefan Wahren wrote:
> Hi Jason,
>
> Am 26.08.23 um 17:48 schrieb Jason A. Donenfeld:
> > On Sat, Aug 26, 2023 at 04:01:58PM +0200, Stefan Wahren wrote:
> >> Hi Jason,
> >>
> >> Am 26.08.23 um 14:34 schrieb Jason A. Donenfeld:
> >>> On Sat, Aug 26, 2023 at 01:28:28PM +0200, Stefan Wahren wrote:
> >>>> The recent RCU stall fix caused a massive throughput regression of the
> >>>> hwrng on Raspberry Pi 0 - 3. So try to restore a similiar throughput
> >>>> as before the RCU stall fix.
> >>>>
> >>>> Some performance measurements on Raspberry Pi 3B+ (arm64/defconfig):
> >>>>
> >>>> sudo dd if=/dev/hwrng of=/dev/null count=1 bs=10000
> >>>>
> >>>> cpu_relax ~138025 Bytes / sec
> >>>> hwrng_msleep(1000) ~13 Bytes / sec
> >>>> usleep_range(100,200) ~92141 Bytes / sec
> >>>>
> >>>> Fixes: 96cb9d055445 ("hwrng: bcm2835 - use hwrng_msleep() instead of cpu_relax()")
> >>>> Link: https://lore.kernel.org/linux-arm-kernel/bc97ece5-44a3-4c4e-77da-2db3eb66b128@gmx.net/
> >>>> Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
> >>>> ---
> >>>> drivers/char/hw_random/bcm2835-rng.c | 3 ++-
> >>>> 1 file changed, 2 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
> >>>> index e98fcac578d6..3f1b6aaa98ee 100644
> >>>> --- a/drivers/char/hw_random/bcm2835-rng.c
> >>>> +++ b/drivers/char/hw_random/bcm2835-rng.c
> >>>> @@ -14,6 +14,7 @@
> >>>> #include <linux/printk.h>
> >>>> #include <linux/clk.h>
> >>>> #include <linux/reset.h>
> >>>> +#include <linux/delay.h>
> >>>>
> >>>> #define RNG_CTRL 0x0
> >>>> #define RNG_STATUS 0x4
> >>>> @@ -71,7 +72,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
> >>>> while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
> >>>> if (!wait)
> >>>> return 0;
> >>>> - hwrng_msleep(rng, 1000);
> >>>> + usleep_range(100, 200);
> >>> I think we still need to use the hwrng_msleep function so that the sleep
> >>> remains cancelable. Maybe just change the 1000 to 100?
> >> i found that other hwrng driver like iproc-rng200 (Raspberry Pi 4) also
> >> use usleep_range().
> >>
> >> Nevertheless here are more numbers:
> >>
> >> usleep_range(200,400) : 47776 bytes / sec
> >> hwrng_msleep(20) : 715 bytes / sec
> >>
> >> Changing to 100 ms won't be a real gain.
> > I'm fine with whatever number you want there. Maybe we need a
> > hwrng_usleep_range() that takes into account rng->dying like
> > hwrng_msleep() does? (And iproc-rng200 should probably use that too?)
> the idea of this patch was to fix the performance regression in upcoming
> mainline and backport the fix to Linux 6.1 LTS. After that i'm fine with
> the introduction of hwrng_usleep_range().
No, you've got it backwards, but maybe it's a bit confusing why that is.
Originally, lots of drivers called variants of the msleep/usleep
functions. But this lead to some subtle bugs when the hwrng subsystem
was changing the active RNG or when the driver was unloading or during
suspend or a couple of other edge cases. In some cases, I believe there
were some interesting deadlocks. The solution to it was to make those
sleep calls cancellable with the &rng->dying completion, so those sleeps
could break, regardless of the time or the state of timekeeping on the
system. The &rng->dying thing was encapsulated in the hwrng_msleep()
function, and now it's a rule that any sleeps in hwrng drivers have got
to go through that.
Now, it sounds like you've come up with a case where hwrng_msleep(1) is
too long of a sleep for your purposes. In that case, indeed a usleep
variant might be called for. But in doing so, don't _reintroduce the
same bug we already fixed_. In other words, don't fix one regression by
adding another. Instead, follow the established pattern, and add a
hwrng_usleep_whatever() that waits on that same &rng->dying.
Jason
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-08-30 22:32 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-26 11:28 [PATCH] hwrng: bcm2835: Fix hwrng throughput regression Stefan Wahren
2023-08-26 12:34 ` Jason A. Donenfeld
2023-08-26 14:01 ` Stefan Wahren
2023-08-26 15:48 ` Jason A. Donenfeld
2023-08-30 18:05 ` Stefan Wahren
2023-08-30 22:31 ` Jason A. Donenfeld
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).