public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] random32: Various minor cleanups
@ 2014-06-07  8:18 George Spelvin
  2014-06-07  8:19 ` [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst George Spelvin
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:18 UTC (permalink / raw)
  To: davem, dborkman, shemminger, tytso; +Cc: linux, linux-kernel

I was looking for self-test code to emulate for lib/glob.c and, while
canvassing other code with self-tests, found some code that made me itch,
so I scratched it.

7 patches;

1/7: Mark the self-test data as __initconst
2/7: Remove excess calls to prandom_u32_state in initialization
3/7: Replace an #ifdef with a stub prandom_state_selftest()
4/7: Use <asm/unaligned.h> instead of hand-rolling it
5/7: Make prandom_u32_max efficient for powers of 2
6/7: Randomize timeout to the millisecond, not the second
7/7: Remove redundant U suffixes on integers

I have a big follow-on patch series to replace all the (many) instances
of "prandom_u32() % x" in the kernel with prandom_u32_max(x), which is
more efficient.

As of patch 5/7, "prandom_u32() & (x-1)" can also be replaced by
"prandom_u32_max(x)", but that's optional.

Patch 6/7 is dubious, and I'd like comments.  I just don't see
a reason why integer granularity would be desirable.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
@ 2014-06-07  8:19 ` George Spelvin
  2014-06-08 12:06   ` Daniel Borkmann
  2014-06-07  8:20 ` [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization George Spelvin
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:19 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

So it can be thrown away along with the code that uses it.

Signed-off-by: George Spelvin <linux@horizon.com>
---
 lib/random32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index fa5da61c..4da5d281 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -296,7 +296,7 @@ late_initcall(prandom_reseed);
 static struct prandom_test1 {
 	u32 seed;
 	u32 result;
-} test1[] = {
+} const test1[] __initconst = {
 	{ 1U, 3484351685U },
 	{ 2U, 2623130059U },
 	{ 3U, 3125133893U },
@@ -307,7 +307,7 @@ static struct prandom_test2 {
 	u32 seed;
 	u32 iteration;
 	u32 result;
-} test2[] = {
+} const test2[] __initconst = {
 	/* Test cases against taus113 from GSL library. */
 	{  931557656U, 959U, 2975593782U },
 	{ 1339693295U, 876U, 3887776532U },
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
  2014-06-07  8:19 ` [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst George Spelvin
@ 2014-06-07  8:20 ` George Spelvin
  2014-06-08 12:11   ` Daniel Borkmann
  2014-06-07  8:22 ` [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest() George Spelvin
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:20 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

Unrolling code in single-use code paths is just silly.  There are two
instances:

1) prandom_warmup() calls 10 times.
2) prandom_state_selftest() can avoid one call and simplify the
   loop logic by repeating an assignment to a local variable
   (that probably adds zero code anyway)

Signed-off-by: George Spelvin <linux@horizon.com>
---
 lib/random32.c | 21 ++++++++-------------
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index 4da5d281..2e7c15c0 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -134,17 +134,11 @@ EXPORT_SYMBOL(prandom_bytes);
 
 static void prandom_warmup(struct rnd_state *state)
 {
+	int i;
+
 	/* Calling RNG ten times to satify recurrence condition */
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
-	prandom_u32_state(state);
+	for (i = 0; i < 10; i++)
+		prandom_u32_state(state);
 }
 
 static void prandom_seed_very_weak(struct rnd_state *state, u32 seed)
@@ -433,14 +427,15 @@ static void __init prandom_state_selftest(void)
 
 	for (i = 0; i < ARRAY_SIZE(test2); i++) {
 		struct rnd_state state;
+		u32 result;
 
 		prandom_seed_very_weak(&state, test2[i].seed);
 		prandom_warmup(&state);
 
-		for (j = 0; j < test2[i].iteration - 1; j++)
-			prandom_u32_state(&state);
+		for (j = 0; j < test2[i].iteration; j++)
+			result = prandom_u32_state(&state);
 
-		if (test2[i].result != prandom_u32_state(&state))
+		if (test2[i].result != result)
 			errors++;
 
 		runs++;
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest()
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
  2014-06-07  8:19 ` [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst George Spelvin
  2014-06-07  8:20 ` [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization George Spelvin
@ 2014-06-07  8:22 ` George Spelvin
  2014-06-08 12:16   ` Daniel Borkmann
  2014-06-07  8:25 ` [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it George Spelvin
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:22 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

The preferred Linux style for optional features is to define
no-op stub functions rather than wrap each call site in #ifdef.

Signed-off-by: George Spelvin <linux@horizon.com>
---
 lib/random32.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index 2e7c15c0..e8f3557b 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -40,6 +40,8 @@
 
 #ifdef CONFIG_RANDOM32_SELFTEST
 static void __init prandom_state_selftest(void);
+#else
+#define prandom_state_selftest() (void)0
 #endif
 
 static DEFINE_PER_CPU(struct rnd_state, net_rand_state);
@@ -188,9 +190,7 @@ static int __init prandom_init(void)
 {
 	int i;
 
-#ifdef CONFIG_RANDOM32_SELFTEST
 	prandom_state_selftest();
-#endif
 
 	for_each_possible_cpu(i) {
 		struct rnd_state *state = &per_cpu(net_rand_state,i);
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
                   ` (2 preceding siblings ...)
  2014-06-07  8:22 ` [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest() George Spelvin
@ 2014-06-07  8:25 ` George Spelvin
  2014-06-08 12:25   ` Daniel Borkmann
  2014-06-07  8:28 ` [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2 George Spelvin
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:25 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

The functions exist for a reason; the manual byte-at-a-time code
is unnecessarily slow (and bloated).

Signed-off-by: George Spelvin <linux@horizon.com>
---
 lib/random32.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index e8f3557b..eee60100 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -37,6 +37,7 @@
 #include <linux/jiffies.h>
 #include <linux/random.h>
 #include <linux/sched.h>
+#include <asm/unaligned.h>
 
 #ifdef CONFIG_RANDOM32_SELFTEST
 static void __init prandom_state_selftest(void);
@@ -97,25 +98,23 @@ EXPORT_SYMBOL(prandom_u32);
  */
 void prandom_bytes_state(struct rnd_state *state, void *buf, int bytes)
 {
-	unsigned char *p = buf;
+	u8 *p = buf;
 	int i;
 
-	for (i = 0; i < round_down(bytes, sizeof(u32)); i += sizeof(u32)) {
-		u32 random = prandom_u32_state(state);
-		int j;
+	for (i = 0; i < round_down(bytes, sizeof(u32)); i += sizeof(u32))
+		put_unaligned_le32(prandom_u32_state(state), p+i);
 
-		for (j = 0; j < sizeof(u32); j++) {
-			p[i + j] = random;
-			random >>= BITS_PER_BYTE;
-		}
-	}
 	if (i < bytes) {
 		u32 random = prandom_u32_state(state);
 
-		for (; i < bytes; i++) {
-			p[i] = random;
-			random >>= BITS_PER_BYTE;
+		if (bytes & 2) {
+			put_unaligned_le16((u16)random, p+i);
+			if ((bytes & 1) == 0)
+				return;
+			i += 2;
+			random >>= 16;
 		}
+		p[i] = (u8)random;
 	}
 }
 EXPORT_SYMBOL(prandom_bytes_state);
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
                   ` (3 preceding siblings ...)
  2014-06-07  8:25 ` [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it George Spelvin
@ 2014-06-07  8:28 ` George Spelvin
  2014-06-08 12:03   ` Daniel Borkmann
  2014-06-08 17:34   ` Hannes Frederic Sowa
  2014-06-07  8:28 ` [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second George Spelvin
  2014-06-07  8:31 ` [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers George Spelvin
  6 siblings, 2 replies; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:28 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

The multiply-and-shift is efficient in the general case, but slower than
a simple bitwise AND if the range is a power of 2.  Make the function
handle this case so callers don't have to worry about micro-optimizing it.

Signed-off-by: George Spelvin <linux@horizon.com>
---
 include/linux/random.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/include/linux/random.h b/include/linux/random.h
index 57fbbffd..e1f3ec9a 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -47,11 +47,23 @@ void prandom_bytes_state(struct rnd_state *state, void *buf, int nbytes);
  * generator, that is, prandom_u32(). This is useful when requesting a
  * random index of an array containing ep_ro elements, for example.
  *
+ * If ep_ro is a power of 2 known at compile time, a modulo operation
+ * reduces to a simple mask to extract the low order bits.  Otherwise,
+ * it uses a multiply and shift, which is faster than a general modulus.
+ *
  * Returns: pseudo-random number in interval [0, ep_ro)
  */
 static inline u32 prandom_u32_max(u32 ep_ro)
 {
-	return (u32)(((u64) prandom_u32() * ep_ro) >> 32);
+	/*
+	 * Instead of just __builtin_constant_p(ep_ro), this test is
+	 * "is it known at compile time that ep_ro is a power of 2?", and
+	 * can in theory handle the case that it's an unknown power of 2.
+	 */
+	if (__builtin_constant_p(ep_ro & (ep_ro-1)) && !(ep_ro & (ep_ro-1)))
+		return prandom_u32() & (ep_ro-1);
+	else
+		return (u32)((u64)prandom_u32() * ep_ro >> 32);
 }
 
 /*
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
                   ` (4 preceding siblings ...)
  2014-06-07  8:28 ` [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2 George Spelvin
@ 2014-06-07  8:28 ` George Spelvin
  2014-06-08 10:43   ` Daniel Borkmann
  2014-06-07  8:31 ` [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers George Spelvin
  6 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:28 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

If you're going to bother randomizing it, do it right.
And use prandom_u32_max(), which is designed for the job, rather
than "% 40".

Signed-off-by: George Spelvin <linux@horizon.com>
---
 lib/random32.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index eee60100..9cc410dd 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -208,14 +208,14 @@ static DEFINE_TIMER(seed_timer, __prandom_timer, 0, 0);
 static void __prandom_timer(unsigned long dontcare)
 {
 	u32 entropy;
-	unsigned long expires;
+	unsigned int expires;
 
 	get_random_bytes(&entropy, sizeof(entropy));
 	prandom_seed(entropy);
 
 	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
-	expires = 40 + (prandom_u32() % 40);
-	seed_timer.expires = jiffies + msecs_to_jiffies(expires * MSEC_PER_SEC);
+	expires = 40000 + prandom_u32_max(40000);
+	seed_timer.expires = jiffies + msecs_to_jiffies(expires);
 
 	add_timer(&seed_timer);
 }
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers
  2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
                   ` (5 preceding siblings ...)
  2014-06-07  8:28 ` [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second George Spelvin
@ 2014-06-07  8:31 ` George Spelvin
  2014-06-08 10:36   ` Daniel Borkmann
  6 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-07  8:31 UTC (permalink / raw)
  To: davem, dborkman, linux, shemminger, tytso; +Cc: linux-kernel

Get rid of a few of the extraneous U suffixes on ordinary integers.

Signed-off-by: George Spelvin <linux@horizon.com>
---
 lib/random32.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/lib/random32.c b/lib/random32.c
index 9cc410dd..ad0c2ed1 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -58,10 +58,10 @@ u32 prandom_u32_state(struct rnd_state *state)
 {
 #define TAUSWORTHE(s,a,b,c,d) ((s&c)<<d) ^ (((s <<a) ^ s)>>b)
 
-	state->s1 = TAUSWORTHE(state->s1,  6U, 13U, 4294967294U, 18U);
-	state->s2 = TAUSWORTHE(state->s2,  2U, 27U, 4294967288U,  2U);
-	state->s3 = TAUSWORTHE(state->s3, 13U, 21U, 4294967280U,  7U);
-	state->s4 = TAUSWORTHE(state->s4,  3U, 12U, 4294967168U, 13U);
+	state->s1 = TAUSWORTHE(state->s1,  6, 13, 4294967294U, 18);
+	state->s2 = TAUSWORTHE(state->s2,  2, 27, 4294967288U,  2);
+	state->s3 = TAUSWORTHE(state->s3, 13, 21, 4294967280U,  7);
+	state->s4 = TAUSWORTHE(state->s4,  3, 12, 4294967168U, 13);
 
 	return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
 }
@@ -77,9 +77,8 @@ EXPORT_SYMBOL(prandom_u32_state);
 u32 prandom_u32(void)
 {
 	struct rnd_state *state = &get_cpu_var(net_rand_state);
-	u32 res;
+	u32 res = prandom_u32_state(state);
 
-	res = prandom_u32_state(state);
 	put_cpu_var(state);
 
 	return res;
-- 
2.0.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers
  2014-06-07  8:31 ` [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers George Spelvin
@ 2014-06-08 10:36   ` Daniel Borkmann
  2014-06-08 11:14     ` George Spelvin
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 10:36 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

[ Please also Cc Hannes in your series. ]

On 06/07/2014 10:31 AM, George Spelvin wrote:
> Get rid of a few of the extraneous U suffixes on ordinary integers.
>
> Signed-off-by: George Spelvin <linux@horizon.com>
> ---
>   lib/random32.c | 11 +++++------
>   1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/lib/random32.c b/lib/random32.c
> index 9cc410dd..ad0c2ed1 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -58,10 +58,10 @@ u32 prandom_u32_state(struct rnd_state *state)
>   {
>   #define TAUSWORTHE(s,a,b,c,d) ((s&c)<<d) ^ (((s <<a) ^ s)>>b)
>
> -	state->s1 = TAUSWORTHE(state->s1,  6U, 13U, 4294967294U, 18U);
> -	state->s2 = TAUSWORTHE(state->s2,  2U, 27U, 4294967288U,  2U);
> -	state->s3 = TAUSWORTHE(state->s3, 13U, 21U, 4294967280U,  7U);
> -	state->s4 = TAUSWORTHE(state->s4,  3U, 12U, 4294967168U, 13U);
> +	state->s1 = TAUSWORTHE(state->s1,  6, 13, 4294967294U, 18);
> +	state->s2 = TAUSWORTHE(state->s2,  2, 27, 4294967288U,  2);
> +	state->s3 = TAUSWORTHE(state->s3, 13, 21, 4294967280U,  7);
> +	state->s4 = TAUSWORTHE(state->s4,  3, 12, 4294967168U, 13);

I don't see the point in why we _need_ to change this here.

>   	return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
>   }
> @@ -77,9 +77,8 @@ EXPORT_SYMBOL(prandom_u32_state);
>   u32 prandom_u32(void)
>   {
>   	struct rnd_state *state = &get_cpu_var(net_rand_state);
> -	u32 res;
> +	u32 res = prandom_u32_state(state);
>
> -	res = prandom_u32_state(state);
>   	put_cpu_var(state);

Ditto.

>   	return res;
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second
  2014-06-07  8:28 ` [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second George Spelvin
@ 2014-06-08 10:43   ` Daniel Borkmann
  2014-06-08 11:30     ` George Spelvin
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 10:43 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

On 06/07/2014 10:28 AM, George Spelvin wrote:
> If you're going to bother randomizing it, do it right.
> And use prandom_u32_max(), which is designed for the job, rather
> than "% 40".
>
> Signed-off-by: George Spelvin <linux@horizon.com>

Fine by me this cleanup, although not strictly needed.

> ---
>   lib/random32.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/random32.c b/lib/random32.c
> index eee60100..9cc410dd 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -208,14 +208,14 @@ static DEFINE_TIMER(seed_timer, __prandom_timer, 0, 0);
>   static void __prandom_timer(unsigned long dontcare)
>   {
>   	u32 entropy;
> -	unsigned long expires;
> +	unsigned int expires;
>
>   	get_random_bytes(&entropy, sizeof(entropy));
>   	prandom_seed(entropy);
>
>   	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
> -	expires = 40 + (prandom_u32() % 40);
> -	seed_timer.expires = jiffies + msecs_to_jiffies(expires * MSEC_PER_SEC);
> +	expires = 40000 + prandom_u32_max(40000);
> +	seed_timer.expires = jiffies + msecs_to_jiffies(expires);
>
>   	add_timer(&seed_timer);
>   }
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers
  2014-06-08 10:36   ` Daniel Borkmann
@ 2014-06-08 11:14     ` George Spelvin
  2014-06-08 12:05       ` Daniel Borkmann
  0 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-08 11:14 UTC (permalink / raw)
  To: dborkman, linux; +Cc: davem, hannes, linux-kernel, shemminger, tytso

> [ Please also Cc Hannes in your series. ]

Thank you!  But there are two frequent kernel contributors with that name;
the other is Hannes Reinecke <hare@suse.de>.

> I don't see the point in why we _need_ to change this here.

We don't.  It was just a slight legibility improvement (at least, *I*
find it more legible) included as long as I was working in the area.
But split out to avoid making review of the substantive parts any
more difficult, and so it can be dropped easily if there are objections.

We tend to avoid cosmetic cleanups if there is not other work to
do in an area, so I assume that if there *is* other work to do in
an area, cosmetic cleanups are okay.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second
  2014-06-08 10:43   ` Daniel Borkmann
@ 2014-06-08 11:30     ` George Spelvin
  2014-06-08 12:28       ` Daniel Borkmann
  0 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-08 11:30 UTC (permalink / raw)
  To: dborkman, linux; +Cc: davem, hannes, linux-kernel, shemminger, tytso

> Fine by me this cleanup, although not strictly needed.

Agreed.  The timer slack is set to HZ (1 second) anyway.

It just dawned on me that a simpler and more efficient way to do this
(which I'll do in v2 of this) would be:

   	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
-	expires = 40 + (prandom_u32() % 40);
-	seed_timer.expires = jiffies + msecs_to_jiffies(expires * MSEC_PER_SEC);
+	expires = 40*HZ + prandom_u32_max(40*HZ);
+	seed_timer.expires = jiffies + expires;

That avoids calling msecs_to_jiffies entirely.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-07  8:28 ` [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2 George Spelvin
@ 2014-06-08 12:03   ` Daniel Borkmann
  2014-06-08 17:34   ` Hannes Frederic Sowa
  1 sibling, 0 replies; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:03 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

On 06/07/2014 10:28 AM, George Spelvin wrote:
> The multiply-and-shift is efficient in the general case, but slower than
> a simple bitwise AND if the range is a power of 2.  Make the function
> handle this case so callers don't have to worry about micro-optimizing it.
>
> Signed-off-by: George Spelvin <linux@horizon.com>
> ---
>   include/linux/random.h | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/random.h b/include/linux/random.h
> index 57fbbffd..e1f3ec9a 100644
> --- a/include/linux/random.h
> +++ b/include/linux/random.h
> @@ -47,11 +47,23 @@ void prandom_bytes_state(struct rnd_state *state, void *buf, int nbytes);
>    * generator, that is, prandom_u32(). This is useful when requesting a
>    * random index of an array containing ep_ro elements, for example.
>    *
> + * If ep_ro is a power of 2 known at compile time, a modulo operation
> + * reduces to a simple mask to extract the low order bits.  Otherwise,
> + * it uses a multiply and shift, which is faster than a general modulus.
> + *
>    * Returns: pseudo-random number in interval [0, ep_ro)
>    */
>   static inline u32 prandom_u32_max(u32 ep_ro)
>   {
> -	return (u32)(((u64) prandom_u32() * ep_ro) >> 32);
> +	/*
> +	 * Instead of just __builtin_constant_p(ep_ro), this test is
> +	 * "is it known at compile time that ep_ro is a power of 2?", and
> +	 * can in theory handle the case that it's an unknown power of 2.
> +	 */
> +	if (__builtin_constant_p(ep_ro & (ep_ro-1)) && !(ep_ro & (ep_ro-1)))
> +		return prandom_u32() & (ep_ro-1);
> +	else
> +		return (u32)((u64)prandom_u32() * ep_ro >> 32);

Okay, I guess it's fine since we expect a random number here anyway, so
it doesn't matter much how we get to the result; otherwise I'd have
complained as both function don't do the same thing. Please leave some
whitespace, i.e. I mean "ep_ro-1" -> "ep_ro - 1".

Btw, there are many other users which hard code prandom_u32_max() resp.
reciprocal_scale() function in the source tree. If you have some cycles,
feel free to migrate them and let them use these functions.

>   }
>
>   /*
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers
  2014-06-08 11:14     ` George Spelvin
@ 2014-06-08 12:05       ` Daniel Borkmann
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:05 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, hannes, linux-kernel, shemminger, tytso

On 06/08/2014 01:14 PM, George Spelvin wrote:
>> [ Please also Cc Hannes in your series. ]
>
> Thank you!  But there are two frequent kernel contributors with that name;
> the other is Hannes Reinecke <hare@suse.de>.

I mean <hannes@stressinduktion.org> which I already did for you. I
thought this would have been obvious based on:

    git log lib/random32.c

;)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst
  2014-06-07  8:19 ` [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst George Spelvin
@ 2014-06-08 12:06   ` Daniel Borkmann
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:06 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

On 06/07/2014 10:19 AM, George Spelvin wrote:
> So it can be thrown away along with the code that uses it.
>
> Signed-off-by: George Spelvin <linux@horizon.com>

Fine by me.

Acked-by: Daniel Borkmann <dborkman@redhat.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization
  2014-06-07  8:20 ` [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization George Spelvin
@ 2014-06-08 12:11   ` Daniel Borkmann
  2014-06-08 12:19     ` George Spelvin
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:11 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

On 06/07/2014 10:20 AM, George Spelvin wrote:
> Unrolling code in single-use code paths is just silly.  There are two
> instances:
>
> 1) prandom_warmup() calls 10 times.
> 2) prandom_state_selftest() can avoid one call and simplify the
>     loop logic by repeating an assignment to a local variable
>     (that probably adds zero code anyway)
>
> Signed-off-by: George Spelvin <linux@horizon.com>

Fine by me although we simply resembled initialization code from
GSL here. I think that your subject line is a bit misleading though.

> ---
>   lib/random32.c | 21 ++++++++-------------
>   1 file changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/lib/random32.c b/lib/random32.c
> index 4da5d281..2e7c15c0 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -134,17 +134,11 @@ EXPORT_SYMBOL(prandom_bytes);
>
>   static void prandom_warmup(struct rnd_state *state)
>   {
> +	int i;
> +
>   	/* Calling RNG ten times to satify recurrence condition */
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> -	prandom_u32_state(state);
> +	for (i = 0; i < 10; i++)
> +		prandom_u32_state(state);
>   }
>
>   static void prandom_seed_very_weak(struct rnd_state *state, u32 seed)
> @@ -433,14 +427,15 @@ static void __init prandom_state_selftest(void)
>
>   	for (i = 0; i < ARRAY_SIZE(test2); i++) {
>   		struct rnd_state state;
> +		u32 result;
>
>   		prandom_seed_very_weak(&state, test2[i].seed);
>   		prandom_warmup(&state);
>
> -		for (j = 0; j < test2[i].iteration - 1; j++)
> -			prandom_u32_state(&state);
> +		for (j = 0; j < test2[i].iteration; j++)
> +			result = prandom_u32_state(&state);
>
> -		if (test2[i].result != prandom_u32_state(&state))
> +		if (test2[i].result != result)
>   			errors++;
>
>   		runs++;
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest()
  2014-06-07  8:22 ` [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest() George Spelvin
@ 2014-06-08 12:16   ` Daniel Borkmann
  2014-06-08 12:27     ` George Spelvin
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:16 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

On 06/07/2014 10:22 AM, George Spelvin wrote:
> The preferred Linux style for optional features is to define
> no-op stub functions rather than wrap each call site in #ifdef.
>
> Signed-off-by: George Spelvin <linux@horizon.com>
> ---
>   lib/random32.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/random32.c b/lib/random32.c
> index 2e7c15c0..e8f3557b 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -40,6 +40,8 @@
>
>   #ifdef CONFIG_RANDOM32_SELFTEST
>   static void __init prandom_state_selftest(void);
> +#else
> +#define prandom_state_selftest() (void)0

Fine by me. I think you can remove this '(void)0' here, though.

>   #endif
>
>   static DEFINE_PER_CPU(struct rnd_state, net_rand_state);
> @@ -188,9 +190,7 @@ static int __init prandom_init(void)
>   {
>   	int i;
>
> -#ifdef CONFIG_RANDOM32_SELFTEST
>   	prandom_state_selftest();
> -#endif
>
>   	for_each_possible_cpu(i) {
>   		struct rnd_state *state = &per_cpu(net_rand_state,i);
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization
  2014-06-08 12:11   ` Daniel Borkmann
@ 2014-06-08 12:19     ` George Spelvin
  0 siblings, 0 replies; 31+ messages in thread
From: George Spelvin @ 2014-06-08 12:19 UTC (permalink / raw)
  To: dborkman, linux; +Cc: davem, hannes, linux-kernel, shemminger, tytso

Thank you for your time and attention on all these comments.

> Fine by me although we simply resembled initialization code from
> GSL here. I think that your subject line is a bit misleading though.

Yes, I had a hard time coming up with a good summary.  I'm removing
*static* calls but not dynamic ones.

How about "don't unroll one-time initialization code"?
Would that be a better way to put it?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it
  2014-06-07  8:25 ` [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it George Spelvin
@ 2014-06-08 12:25   ` Daniel Borkmann
  2014-06-08 12:40     ` George Spelvin
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:25 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, shemminger, tytso, linux-kernel, hannes

On 06/07/2014 10:25 AM, George Spelvin wrote:
> The functions exist for a reason; the manual byte-at-a-time code
> is unnecessarily slow (and bloated).
>
> Signed-off-by: George Spelvin <linux@horizon.com>

Seems fine by me, since it's random anyway archs might not care
about the *_le32, though this might yield additional work in some
cases I presume.

> ---
>   lib/random32.c | 23 +++++++++++------------
>   1 file changed, 11 insertions(+), 12 deletions(-)
>
> diff --git a/lib/random32.c b/lib/random32.c
> index e8f3557b..eee60100 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -37,6 +37,7 @@
>   #include <linux/jiffies.h>
>   #include <linux/random.h>
>   #include <linux/sched.h>
> +#include <asm/unaligned.h>
>
>   #ifdef CONFIG_RANDOM32_SELFTEST
>   static void __init prandom_state_selftest(void);
> @@ -97,25 +98,23 @@ EXPORT_SYMBOL(prandom_u32);
>    */
>   void prandom_bytes_state(struct rnd_state *state, void *buf, int bytes)
>   {
> -	unsigned char *p = buf;
> +	u8 *p = buf;
>   	int i;
>
> -	for (i = 0; i < round_down(bytes, sizeof(u32)); i += sizeof(u32)) {
> -		u32 random = prandom_u32_state(state);
> -		int j;
> +	for (i = 0; i < round_down(bytes, sizeof(u32)); i += sizeof(u32))
> +		put_unaligned_le32(prandom_u32_state(state), p+i);

Nit: 'p + i'

>
> -		for (j = 0; j < sizeof(u32); j++) {
> -			p[i + j] = random;
> -			random >>= BITS_PER_BYTE;
> -		}
> -	}
>   	if (i < bytes) {
>   		u32 random = prandom_u32_state(state);
>
> -		for (; i < bytes; i++) {
> -			p[i] = random;
> -			random >>= BITS_PER_BYTE;
> +		if (bytes & 2) {
> +			put_unaligned_le16((u16)random, p+i);

Ditto.

> +			if ((bytes & 1) == 0)
> +				return;
> +			i += 2;
> +			random >>= 16;
>   		}
> +		p[i] = (u8)random;

Nit: '(u8) random'

You could probably use a switch statement with fall-through for
filling the remaining stuff, might simplify it further, perhaps.

>   	}
>   }
>   EXPORT_SYMBOL(prandom_bytes_state);
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest()
  2014-06-08 12:16   ` Daniel Borkmann
@ 2014-06-08 12:27     ` George Spelvin
  0 siblings, 0 replies; 31+ messages in thread
From: George Spelvin @ 2014-06-08 12:27 UTC (permalink / raw)
  To: dborkman, linux; +Cc: davem, hannes, linux-kernel, shemminger, tytso

>>   #ifdef CONFIG_RANDOM32_SELFTEST
>>   static void __init prandom_state_selftest(void);
>> +#else
>> +#define prandom_state_selftest() (void)0

> Fine by me. I think you can remove this '(void)0' here, though.

That's the standard way to write a no-op statement in C.

I seem to recall there's a reason that the empty string can cause problems
in some syntactic contexts, but I can't figure out what the situation is.

At first, I thought of the obvious:

if (condition)
	prandom_state_selftest();
unconditional_code();

... but the semicolon makes that work.  I'll try to remember
the reason.

(I know that nobody uses it in any such context, but it's
good manners to make a function-like macro behave as exactly
like a function as possible.)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second
  2014-06-08 11:30     ` George Spelvin
@ 2014-06-08 12:28       ` Daniel Borkmann
  2014-06-08 12:42         ` George Spelvin
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 12:28 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, hannes, linux-kernel, shemminger, tytso

On 06/08/2014 01:30 PM, George Spelvin wrote:
>> Fine by me this cleanup, although not strictly needed.
>
> Agreed.  The timer slack is set to HZ (1 second) anyway.
>
> It just dawned on me that a simpler and more efficient way to do this
> (which I'll do in v2 of this) would be:

Note, when you talk about efficiency here, this is called once every
40+ secs ... ;)

>     	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
> -	expires = 40 + (prandom_u32() % 40);
> -	seed_timer.expires = jiffies + msecs_to_jiffies(expires * MSEC_PER_SEC);
> +	expires = 40*HZ + prandom_u32_max(40*HZ);
> +	seed_timer.expires = jiffies + expires;
>
> That avoids calling msecs_to_jiffies entirely.
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it
  2014-06-08 12:25   ` Daniel Borkmann
@ 2014-06-08 12:40     ` George Spelvin
  2014-06-08 20:26       ` Hannes Frederic Sowa
  2014-06-10 15:13       ` Daniel Borkmann
  0 siblings, 2 replies; 31+ messages in thread
From: George Spelvin @ 2014-06-08 12:40 UTC (permalink / raw)
  To: dborkman, linux; +Cc: davem, hannes, linux-kernel, shemminger, tytso

> Seems fine by me, since it's random anyway archs might not care
> about the *_le32, though this might yield additional work in some
> cases I presume.

If you think that's okay, sure.  I kept it consistent to get byte-for-byte
identical results.

If variations in that are allowed, that can also
simplify the trailing storage case:

+		if (bytes & 2)
+			put_unaligned_le16((u16)random, p+i);
+		if (bytes & 1)
+			p[i+bytes-1] = (u8)(random >> 16);

>> +	for (i = 0; i < round_down(bytes, sizeof(u32)); i += sizeof(u32))
>> +		put_unaligned_le32(prandom_u32_state(state), p+i);

> Nit: 'p + i'

I don't really care, but I'm happy without the spaces; I add them
to show what binds more weakly, and in this case that's the
argument-separating comma.

>> +		p[i] = (u8)random;

> Nit: '(u8) random'

Actually, my style and most of the kerel is to not put
a space in a cast, since it binds so tighty.

E.g. compare

git grep ')[a-z]' kernel/
git grep ') [a-z]' kernel/

Notice that the second is alsmost all comments.
(There are some spaces in kernel/ptrace.v.)

I'd rather leave it without the space unless you feel
very strongly about it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second
  2014-06-08 12:28       ` Daniel Borkmann
@ 2014-06-08 12:42         ` George Spelvin
  2014-06-08 20:01           ` Daniel Borkmann
  0 siblings, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-08 12:42 UTC (permalink / raw)
  To: dborkman, linux; +Cc: davem, hannes, linux-kernel, shemminger, tytso

> Note, when you talk about efficiency here, this is called once every
> 40+ secs ... ;)

Agreed, in this case the code size saving (avoid a function call) is the
main benefit.  And I prefer to avoid inefficiency on general priinciples,
if it's not serving some other goal.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-07  8:28 ` [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2 George Spelvin
  2014-06-08 12:03   ` Daniel Borkmann
@ 2014-06-08 17:34   ` Hannes Frederic Sowa
  2014-06-08 20:02     ` Daniel Borkmann
  2014-06-08 20:48     ` George Spelvin
  1 sibling, 2 replies; 31+ messages in thread
From: Hannes Frederic Sowa @ 2014-06-08 17:34 UTC (permalink / raw)
  To: George Spelvin, davem, dborkman, shemminger, tytso; +Cc: linux-kernel

On Sat, Jun 7, 2014, at 1:28, George Spelvin wrote:
> diff --git a/include/linux/random.h b/include/linux/random.h
> index 57fbbffd..e1f3ec9a 100644
> --- a/include/linux/random.h
> +++ b/include/linux/random.h
> @@ -47,11 +47,23 @@ void prandom_bytes_state(struct rnd_state *state, void *buf, int nbytes);
>   * generator, that is, prandom_u32(). This is useful when requesting a
>   * random index of an array containing ep_ro elements, for example.
>   *
> + * If ep_ro is a power of 2 known at compile time, a modulo operation
> + * reduces to a simple mask to extract the low order bits.  Otherwise,
> + * it uses a multiply and shift, which is faster than a general modulus.
> + *
>   * Returns: pseudo-random number in interval [0, ep_ro)
>   */
>  static inline u32 prandom_u32_max(u32 ep_ro)
>  {
> -	return (u32)(((u64) prandom_u32() * ep_ro) >> 32);
> +	/*
> +	 * Instead of just __builtin_constant_p(ep_ro), this test is
> +	 * "is it known at compile time that ep_ro is a power of 2?", and
> +	 * can in theory handle the case that it's an unknown power of 2.
> +	 */
> +	if (__builtin_constant_p(ep_ro & (ep_ro-1)) && !(ep_ro & (ep_ro-1)))
> +		return prandom_u32() & (ep_ro-1);
> +	else
> +		return (u32)((u64)prandom_u32() * ep_ro >> 32);
>  }

Have you checked assembler output if this helps anything at all? Constant propagation in the compiler should be able to figure that out all by itself. The only places I use __builtin_constant_p today are where I also make use of inline assembler.

Please check this as it makes the code more complicated and I doubt it is worth it.

Btw, IIRC there is a function is_power_of_2 somewhere. ;)

Thanks,

  Hannes

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second
  2014-06-08 12:42         ` George Spelvin
@ 2014-06-08 20:01           ` Daniel Borkmann
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 20:01 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, hannes, linux-kernel, shemminger, tytso

On 06/08/2014 02:42 PM, George Spelvin wrote:
>> Note, when you talk about efficiency here, this is called once every
>> 40+ secs ... ;)
>
> Agreed, in this case the code size saving (avoid a function call) is the
> main benefit.  And I prefer to avoid inefficiency on general priinciples,
> if it's not serving some other goal.

Hm, don't think it will make much of a difference.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-08 17:34   ` Hannes Frederic Sowa
@ 2014-06-08 20:02     ` Daniel Borkmann
  2014-06-09  0:28       ` George Spelvin
  2014-06-08 20:48     ` George Spelvin
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-08 20:02 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: George Spelvin, davem, shemminger, tytso, linux-kernel

On 06/08/2014 07:34 PM, Hannes Frederic Sowa wrote:
...
> Please check this as it makes the code more complicated and I doubt it is worth it.

Agreed.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it
  2014-06-08 12:40     ` George Spelvin
@ 2014-06-08 20:26       ` Hannes Frederic Sowa
  2014-06-10 15:13       ` Daniel Borkmann
  1 sibling, 0 replies; 31+ messages in thread
From: Hannes Frederic Sowa @ 2014-06-08 20:26 UTC (permalink / raw)
  To: George Spelvin, dborkman; +Cc: davem, linux-kernel, shemminger, tytso

On Sun, Jun 8, 2014, at 5:40, George Spelvin wrote:
> > Seems fine by me, since it's random anyway archs might not care
> > about the *_le32, though this might yield additional work in some
> > cases I presume.
> 
> If you think that's okay, sure.  I kept it consistent to get byte-for-byte
> identical results.
> 
> If variations in that are allowed, that can also
> simplify the trailing storage case:
> 
> +		if (bytes & 2)
> +			put_unaligned_le16((u16)random, p+i);
> +		if (bytes & 1)
> +			p[i+bytes-1] = (u8)(random >> 16);

Yes, please! Otherwise the code looks much too complicated for what it does.

> > Nit: '(u8) random'
> 
> Actually, my style and most of the kerel is to not put
> a space in a cast, since it binds so tighty.
> 
> E.g. compare
> 
> git grep ')[a-z]' kernel/
> git grep ') [a-z]' kernel/
> 
> Notice that the second is alsmost all comments.
> (There are some spaces in kernel/ptrace.v.)
> 
> I'd rather leave it without the space unless you feel
> very strongly about it.

IMHO I wouldn't put a whitespace here.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-08 17:34   ` Hannes Frederic Sowa
  2014-06-08 20:02     ` Daniel Borkmann
@ 2014-06-08 20:48     ` George Spelvin
  2014-06-09 10:30       ` Hannes Frederic Sowa
  1 sibling, 1 reply; 31+ messages in thread
From: George Spelvin @ 2014-06-08 20:48 UTC (permalink / raw)
  To: davem, dborkman, hannes, linux, shemminger, tytso; +Cc: linux-kernel

Thank you for your comments!

> Have you checked assembler output if this helps anything at all? Constant
> propagation in the compiler should be able to figure that out all by
> itself. The only places I use __builtin_constant_p today are where I
> also make use of inline assembler.

Yes, I did.  (I'll expand the commit comment for v2; my bad.)

It seems that GCC isn't smart enough to reduce this to a single shift.
With the multiply and reduce, the code looks like:
        call    prandom_u32
        xorl    %edx, %edx
        shldl   $4, %eax, %edx
        movl    %edx, %eax

Instead of the hoped-for
        call    prandom_u32
	shrl	$28, %eax

Converting to a single mask is something the compiler can't do,
because it doesn't understand that using the lsbits instead of the
msbits is okay.

With the mask, it turns into the spectacularly simple:
        call    prandom_u32
        andl    $15, %eax

An interesting question is which is preferred in general.

The AND allows non-constant powers of 2 without requiring CLZ.  But I
don't recall seeing that actually happen anywhere.  And the shift allows
a smaller encoding (8-bit rather than 32-bit immediate constant) when
the power of 2 is known at compile time and is larger than 128 (for
example, PAGE_SIZE).

Me, I thought it was in the noise and not worth stressing about,
but I also understand the hackers's urge for maximum tweaking.


The other thing that I couldn't think of a clean wrapper for is the
"probability 1 in N" case that happens in several bits of code.
For example, in drivers/mtd/ubi/debug.h, I made the following change:

 static inline int ubi_dbg_is_bitflip(const struct ubi_device *ubi)
 {
-	if (ubi->dbg.emulate_bitflips)
-		return !(prandom_u32() % 200);
-	return 0;
+	return ubi->dbg.emulate_bitflips && prandom_u32() < -1u/200;
 }

GCC doesn't know how to optimize "prandom_u32_max(200) == 0", which
is basically the same.  It spits out:

        call    prandom_u32
        movl    %eax, %edx
        movl    $200, %eax
        mull    %edx
        testl   %edx, %edx

I couldn't think of a good enough function name for this, so I just left
it as inline code.  (Using "-1u/200" rather than the technically correct
"(u32)(((u64)1 << 32)/200)" is okay as long as 200 isn't a power of 2, and
even if it were, the error wouldn't matter since it's approximate anyway.)

> Btw, IIRC there is a function is_power_of_2 somewhere. ;)

In <linux/log2.h>; thanks for the pointer.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-08 20:02     ` Daniel Borkmann
@ 2014-06-09  0:28       ` George Spelvin
  0 siblings, 0 replies; 31+ messages in thread
From: George Spelvin @ 2014-06-09  0:28 UTC (permalink / raw)
  To: dborkman, hannes; +Cc: davem, linux-kernel, linux, shemminger, tytso

Daniel Borkmann wrote:
> On 06/08/2014 07:34 PM, Hannes Frederic Sowa wrote:
>> Please check this as it makes the code more complicated and I doubt it is worth it.
>
> Agreed.

Just to clarify, the goal is to be able to replace "prandom_u32() %
range" all over the kernel with "prandom_u32_max(range)" and promise
people that it's guaranteed to be just as good.

My goal is a one-time cleanup to eliminate "prandom_u32() % range" so
people will stop cutting and pasting it into new code.

(FIXME: Add a kdoc update on the subject.)

It uglifies the random32 code for sure, but by encapsulating the cleverness
it lets all the call sites be simpler and reduces arguments about the change.

E.g. have a look at all the prandom_u32 calls in net/core/pktgen.c.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2
  2014-06-08 20:48     ` George Spelvin
@ 2014-06-09 10:30       ` Hannes Frederic Sowa
  0 siblings, 0 replies; 31+ messages in thread
From: Hannes Frederic Sowa @ 2014-06-09 10:30 UTC (permalink / raw)
  To: George Spelvin, davem, dborkman, shemminger, tytso; +Cc: linux-kernel

Thanks for your detailed explanation!

On Sun, Jun 8, 2014, at 13:48, George Spelvin wrote:
> Thank you for your comments!
> 
> > Have you checked assembler output if this helps anything at all? Constant
> > propagation in the compiler should be able to figure that out all by
> > itself. The only places I use __builtin_constant_p today are where I
> > also make use of inline assembler.
> 
> Yes, I did.  (I'll expand the commit comment for v2; my bad.)
> 
> It seems that GCC isn't smart enough to reduce this to a single shift.
> With the multiply and reduce, the code looks like:
>         call    prandom_u32
>         xorl    %edx, %edx
>         shldl   $4, %eax, %edx
>         movl    %edx, %eax
> 
> Instead of the hoped-for
>         call    prandom_u32
> 	shrl	$28, %eax

On x86_64 I get the above result. Seems like gcc doesn't see the downcast to u32 far enough ahead and stays in DI mode on i386, thus the shldl. It shouldn't matter that much... ;)

> Converting to a single mask is something the compiler can't do,
> because it doesn't understand that using the lsbits instead of the
> msbits is okay.

Yep, sure.

> With the mask, it turns into the spectacularly simple:
>         call    prandom_u32
>         andl    $15, %eax
> 
> An interesting question is which is preferred in general.
> 
> The AND allows non-constant powers of 2 without requiring CLZ.  But I
> don't recall seeing that actually happen anywhere.  And the shift allows
> a smaller encoding (8-bit rather than 32-bit immediate constant) when
> the power of 2 is known at compile time and is larger than 128 (for
> example, PAGE_SIZE).

I actually don't know if folding logic in gcc is so enhanced to see that coming. ;)
Would be interesting tough, maybe I'll try that later.

> Me, I thought it was in the noise and not worth stressing about,
> but I also understand the hackers's urge for maximum tweaking.

Totally ok. ;)

I don't have any problems with the patch, although such a detailed changelog would be nice + some approx. numbers of how many times we run into the new optimization.

Bye,
Hannes

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it
  2014-06-08 12:40     ` George Spelvin
  2014-06-08 20:26       ` Hannes Frederic Sowa
@ 2014-06-10 15:13       ` Daniel Borkmann
  1 sibling, 0 replies; 31+ messages in thread
From: Daniel Borkmann @ 2014-06-10 15:13 UTC (permalink / raw)
  To: George Spelvin; +Cc: davem, hannes, linux-kernel, shemminger, tytso

On 06/08/2014 02:40 PM, George Spelvin wrote:
...
>>> +	for (i = 0; i < round_down(bytes, sizeof(u32)); i += sizeof(u32))
>>> +		put_unaligned_le32(prandom_u32_state(state), p+i);
>
>> Nit: 'p + i'
>
> I don't really care, but I'm happy without the spaces; I add them
> to show what binds more weakly, and in this case that's the
> argument-separating comma.

If you invent such things, then first send a patch to
Documentation/CodingStyle to change and discuss this.

Otherwise, I urge you to read:

   vi Documentation/CodingStyle +206

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2014-06-10 15:13 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-07  8:18 [PATCH 0/7] random32: Various minor cleanups George Spelvin
2014-06-07  8:19 ` [PATCH 1/7] lib/random32.c: Mark self-test data as __initconst George Spelvin
2014-06-08 12:06   ` Daniel Borkmann
2014-06-07  8:20 ` [PATCH 2/7] lib/random32.c: Remove excess calls to prandom_u32_state in initialization George Spelvin
2014-06-08 12:11   ` Daniel Borkmann
2014-06-08 12:19     ` George Spelvin
2014-06-07  8:22 ` [PATCH 3/7] lib/random32.c: Replace an #ifdef with a stub prandom_state_selftest() George Spelvin
2014-06-08 12:16   ` Daniel Borkmann
2014-06-08 12:27     ` George Spelvin
2014-06-07  8:25 ` [PATCH 4/7] lib/random32.c: Use <asm/unaligned.h> instead of hand-rolling it George Spelvin
2014-06-08 12:25   ` Daniel Borkmann
2014-06-08 12:40     ` George Spelvin
2014-06-08 20:26       ` Hannes Frederic Sowa
2014-06-10 15:13       ` Daniel Borkmann
2014-06-07  8:28 ` [PATCH 5/7] lib/random32.c: Make prandom_u32_max efficient for powers of 2 George Spelvin
2014-06-08 12:03   ` Daniel Borkmann
2014-06-08 17:34   ` Hannes Frederic Sowa
2014-06-08 20:02     ` Daniel Borkmann
2014-06-09  0:28       ` George Spelvin
2014-06-08 20:48     ` George Spelvin
2014-06-09 10:30       ` Hannes Frederic Sowa
2014-06-07  8:28 ` [PATCH 6/7] lib/random32.c: Randomize timeout to the millisecond, not the second George Spelvin
2014-06-08 10:43   ` Daniel Borkmann
2014-06-08 11:30     ` George Spelvin
2014-06-08 12:28       ` Daniel Borkmann
2014-06-08 12:42         ` George Spelvin
2014-06-08 20:01           ` Daniel Borkmann
2014-06-07  8:31 ` [PATCH 7/7] lib/random32.c: Remove redundant U suffixes on integers George Spelvin
2014-06-08 10:36   ` Daniel Borkmann
2014-06-08 11:14     ` George Spelvin
2014-06-08 12:05       ` Daniel Borkmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox