* [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout @ 2020-03-28 16:43 ` George Spelvin 0 siblings, 0 replies; 8+ messages in thread From: George Spelvin @ 2019-03-21 3:07 UTC (permalink / raw) To: linux-kernel, lkml; +Cc: Mark Fasheh, Joel Becker, Joseph Qi, ocfs2-devel get_random_bytes() is expensive crypto-quality random numbers. If we're just doing random backoff, prandom_u32() is plenty. (Not to mention fetching 8 bytes of seed material only to reduce it modulo 5000 is a huge waste.) Also, convert timeouts to jiffies at compile time; convert milliseconds to jiffies before picking a random number in the range to take advantage of compile-time constant folding. Signed-off-by: George Spelvin <lkml@sdf.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: ocfs2-devel@oss.oracle.com --- fs/ocfs2/journal.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index 68ba354cf3610..939a12e57fa8b 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) */ static inline unsigned long ocfs2_orphan_scan_timeout(void) { - unsigned long time; - - get_random_bytes(&time, sizeof(time)); - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); - return msecs_to_jiffies(time); + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + + prandom_u32_max(5 * HZ); } /* -- 2.26.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Ocfs2-devel] [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout @ 2020-03-28 16:43 ` George Spelvin 0 siblings, 0 replies; 8+ messages in thread From: George Spelvin @ 2020-03-28 16:43 UTC (permalink / raw) To: ocfs2-devel get_random_bytes() is expensive crypto-quality random numbers. If we're just doing random backoff, prandom_u32() is plenty. (Not to mention fetching 8 bytes of seed material only to reduce it modulo 5000 is a huge waste.) Also, convert timeouts to jiffies at compile time; convert milliseconds to jiffies before picking a random number in the range to take advantage of compile-time constant folding. Signed-off-by: George Spelvin <lkml@sdf.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: ocfs2-devel at oss.oracle.com --- fs/ocfs2/journal.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c index 68ba354cf3610..939a12e57fa8b 100644 --- a/fs/ocfs2/journal.c +++ b/fs/ocfs2/journal.c @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) */ static inline unsigned long ocfs2_orphan_scan_timeout(void) { - unsigned long time; - - get_random_bytes(&time, sizeof(time)); - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); - return msecs_to_jiffies(time); + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + + prandom_u32_max(5 * HZ); } /* -- 2.26.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [Ocfs2-devel] [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout 2020-03-28 16:43 ` [Ocfs2-devel] " George Spelvin @ 2020-03-30 12:09 ` Joseph Qi -1 siblings, 0 replies; 8+ messages in thread From: Joseph Qi @ 2020-03-30 12:09 UTC (permalink / raw) To: ocfs2-devel Sorry for the late reply since I might miss this mail. On 2019/3/21 11:07, George Spelvin wrote: > get_random_bytes() is expensive crypto-quality random numbers. > If we're just doing random backoff, prandom_u32() is plenty. > > (Not to mention fetching 8 bytes of seed material only to > reduce it modulo 5000 is a huge waste.) > > Also, convert timeouts to jiffies at compile time; convert > milliseconds to jiffies before picking a random number in the > range to take advantage of compile-time constant folding. > > Signed-off-by: George Spelvin <lkml@sdf.org> > Cc: Mark Fasheh <mark@fasheh.com> > Cc: Joel Becker <jlbec@evilplan.org> > Cc: Joseph Qi <joseph.qi@linux.alibaba.com> > Cc: ocfs2-devel at oss.oracle.com > --- > fs/ocfs2/journal.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c > index 68ba354cf3610..939a12e57fa8b 100644 > --- a/fs/ocfs2/journal.c > +++ b/fs/ocfs2/journal.c > @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) > */ > static inline unsigned long ocfs2_orphan_scan_timeout(void) > { > - unsigned long time; > - > - get_random_bytes(&time, sizeof(time)); > - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); > - return msecs_to_jiffies(time); > + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + > + prandom_u32_max(5 * HZ); Seems better include the prandom_u32_max() into msecs_to_jiffies()? Thanks, Joseph > } > > /* > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout @ 2020-03-30 12:09 ` Joseph Qi 0 siblings, 0 replies; 8+ messages in thread From: Joseph Qi @ 2020-03-30 12:09 UTC (permalink / raw) To: George Spelvin, linux-kernel; +Cc: Mark Fasheh, Joel Becker, ocfs2-devel Sorry for the late reply since I might miss this mail. On 2019/3/21 11:07, George Spelvin wrote: > get_random_bytes() is expensive crypto-quality random numbers. > If we're just doing random backoff, prandom_u32() is plenty. > > (Not to mention fetching 8 bytes of seed material only to > reduce it modulo 5000 is a huge waste.) > > Also, convert timeouts to jiffies at compile time; convert > milliseconds to jiffies before picking a random number in the > range to take advantage of compile-time constant folding. > > Signed-off-by: George Spelvin <lkml@sdf.org> > Cc: Mark Fasheh <mark@fasheh.com> > Cc: Joel Becker <jlbec@evilplan.org> > Cc: Joseph Qi <joseph.qi@linux.alibaba.com> > Cc: ocfs2-devel@oss.oracle.com > --- > fs/ocfs2/journal.c | 7 ++----- > 1 file changed, 2 insertions(+), 5 deletions(-) > > diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c > index 68ba354cf3610..939a12e57fa8b 100644 > --- a/fs/ocfs2/journal.c > +++ b/fs/ocfs2/journal.c > @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) > */ > static inline unsigned long ocfs2_orphan_scan_timeout(void) > { > - unsigned long time; > - > - get_random_bytes(&time, sizeof(time)); > - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); > - return msecs_to_jiffies(time); > + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + > + prandom_u32_max(5 * HZ); Seems better include the prandom_u32_max() into msecs_to_jiffies()? Thanks, Joseph > } > > /* > ^ permalink raw reply [flat|nested] 8+ messages in thread
* [Ocfs2-devel] [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout 2020-03-30 12:09 ` Joseph Qi @ 2020-03-30 16:34 ` George Spelvin -1 siblings, 0 replies; 8+ messages in thread From: George Spelvin @ 2020-03-30 16:34 UTC (permalink / raw) To: ocfs2-devel On Mon, Mar 30, 2020 at 08:09:33PM +0800, Joseph Qi wrote: > Sorry for the late reply since I might miss this mail. You're hardly late; I expect replies to dribble in for a week. > On 2019/3/21 11:07, George Spelvin wrote: >> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c >> index 68ba354cf3610..939a12e57fa8b 100644 >> --- a/fs/ocfs2/journal.c >> +++ b/fs/ocfs2/journal.c >> @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) >> */ >> static inline unsigned long ocfs2_orphan_scan_timeout(void) >> { >> - unsigned long time; >> - >> - get_random_bytes(&time, sizeof(time)); >> - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); >> - return msecs_to_jiffies(time); >> + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + >> + prandom_u32_max(5 * HZ); > > Seems better include the prandom_u32_max() into msecs_to_jiffies()? What I'm trying to take advantage of here is constant propagation. msecs_to_jiffies is zero cost (it's evaluated entirely at compile time) if its argument is a compile-time constant. It's a function call and a few instructions if its argument is variable. msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT + prandom_u32_max(5000)) would be forced to use the expensive version. The compiler does't know, but *I* know, that msecs_to_jiffies() is a linear function, and prandom_u32_max() is a sort-of linear function. (It's a linear function for a given PRNG starting state, so each individual call is linear, but multiple calls mess things up.) Modulo a bit of rounding, we have: msecs_to_jiffies(a + b) = msecs_to_jiffies(a) + msecs_to_jiffies(b) msecs_to_jiffies(a) * b = msecs_to_jiffies(a * b) prandom_u32_max(a) * b = prandom_u32_max(a * b) prandom_u32_max(msecs_to_jiffies(a)) = msecs_to_jiffies(prandom_u32_max(a)) By doing the addition in jiffies rather than milliseconds, we get the cheap code. It's not a huge big deal, but it's definitely smaller and faster. Admittedly, I happen to be using HZ = 300, which requires a multiply to convert, and makes the resultant random numbers slightly non-uniform. The default HZ = 250 makes it just a divide by 4, which is pretty simple. (When HZ = 300, you get 1..3 ms -> 1 jiffy, 4..6 ms -> 2 jiffies, and 7..10 ms -> 3 jiffies. Multiples of 3 are 33% more likely to be chosen.) It just seems silly and wasteful to pick a random number between 0 and 4999 (plus 30000), only to convert it to a random number between 0 and 1249 (plus 7500). And if HZ = 2000 ever happens, the timeout won't be artificially limited to integer milliseconds. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout @ 2020-03-30 16:34 ` George Spelvin 0 siblings, 0 replies; 8+ messages in thread From: George Spelvin @ 2020-03-30 16:34 UTC (permalink / raw) To: Joseph Qi; +Cc: linux-kernel, Mark Fasheh, Joel Becker, ocfs2-devel, lkml On Mon, Mar 30, 2020 at 08:09:33PM +0800, Joseph Qi wrote: > Sorry for the late reply since I might miss this mail. You're hardly late; I expect replies to dribble in for a week. > On 2019/3/21 11:07, George Spelvin wrote: >> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c >> index 68ba354cf3610..939a12e57fa8b 100644 >> --- a/fs/ocfs2/journal.c >> +++ b/fs/ocfs2/journal.c >> @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) >> */ >> static inline unsigned long ocfs2_orphan_scan_timeout(void) >> { >> - unsigned long time; >> - >> - get_random_bytes(&time, sizeof(time)); >> - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); >> - return msecs_to_jiffies(time); >> + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + >> + prandom_u32_max(5 * HZ); > > Seems better include the prandom_u32_max() into msecs_to_jiffies()? What I'm trying to take advantage of here is constant propagation. msecs_to_jiffies is zero cost (it's evaluated entirely at compile time) if its argument is a compile-time constant. It's a function call and a few instructions if its argument is variable. msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT + prandom_u32_max(5000)) would be forced to use the expensive version. The compiler does't know, but *I* know, that msecs_to_jiffies() is a linear function, and prandom_u32_max() is a sort-of linear function. (It's a linear function for a given PRNG starting state, so each individual call is linear, but multiple calls mess things up.) Modulo a bit of rounding, we have: msecs_to_jiffies(a + b) = msecs_to_jiffies(a) + msecs_to_jiffies(b) msecs_to_jiffies(a) * b = msecs_to_jiffies(a * b) prandom_u32_max(a) * b = prandom_u32_max(a * b) prandom_u32_max(msecs_to_jiffies(a)) = msecs_to_jiffies(prandom_u32_max(a)) By doing the addition in jiffies rather than milliseconds, we get the cheap code. It's not a huge big deal, but it's definitely smaller and faster. Admittedly, I happen to be using HZ = 300, which requires a multiply to convert, and makes the resultant random numbers slightly non-uniform. The default HZ = 250 makes it just a divide by 4, which is pretty simple. (When HZ = 300, you get 1..3 ms -> 1 jiffy, 4..6 ms -> 2 jiffies, and 7..10 ms -> 3 jiffies. Multiples of 3 are 33% more likely to be chosen.) It just seems silly and wasteful to pick a random number between 0 and 4999 (plus 30000), only to convert it to a random number between 0 and 1249 (plus 7500). And if HZ = 2000 ever happens, the timeout won't be artificially limited to integer milliseconds. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [Ocfs2-devel] [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout 2020-03-30 16:34 ` George Spelvin @ 2020-03-31 0:56 ` Joseph Qi -1 siblings, 0 replies; 8+ messages in thread From: Joseph Qi @ 2020-03-31 0:56 UTC (permalink / raw) To: ocfs2-devel On 2020/3/31 00:34, George Spelvin wrote: > On Mon, Mar 30, 2020 at 08:09:33PM +0800, Joseph Qi wrote: >> Sorry for the late reply since I might miss this mail. > > You're hardly late; I expect replies to dribble in for a week. > >> On 2019/3/21 11:07, George Spelvin wrote: >>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c >>> index 68ba354cf3610..939a12e57fa8b 100644 >>> --- a/fs/ocfs2/journal.c >>> +++ b/fs/ocfs2/journal.c >>> @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) >>> */ >>> static inline unsigned long ocfs2_orphan_scan_timeout(void) >>> { >>> - unsigned long time; >>> - >>> - get_random_bytes(&time, sizeof(time)); >>> - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); >>> - return msecs_to_jiffies(time); >>> + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + >>> + prandom_u32_max(5 * HZ); >> >> Seems better include the prandom_u32_max() into msecs_to_jiffies()? > > What I'm trying to take advantage of here is constant propagation. > > msecs_to_jiffies is zero cost (it's evaluated entirely at compile > time) if its argument is a compile-time constant. It's a function call > and a few instructions if its argument is variable. > > msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT + prandom_u32_max(5000)) > would be forced to use the expensive version. > > The compiler does't know, but *I* know, that msecs_to_jiffies() is a > linear function, and prandom_u32_max() is a sort-of linear function. > > (It's a linear function for a given PRNG starting state, so each > individual call is linear, but multiple calls mess things up.) > > Modulo a bit of rounding, we have: > > msecs_to_jiffies(a + b) = msecs_to_jiffies(a) + msecs_to_jiffies(b) > msecs_to_jiffies(a) * b = msecs_to_jiffies(a * b) > prandom_u32_max(a) * b = prandom_u32_max(a * b) > prandom_u32_max(msecs_to_jiffies(a)) = msecs_to_jiffies(prandom_u32_max(a)) > > By doing the addition in jiffies rather than milliseconds, we get the > cheap code. It's not a huge big deal, but it's definitely smaller and > faster. > > Admittedly, I happen to be using HZ = 300, which requires a multiply to > convert, and makes the resultant random numbers slightly non-uniform. > The default HZ = 250 makes it just a divide by 4, which is pretty simple. > > (When HZ = 300, you get 1..3 ms -> 1 jiffy, 4..6 ms -> 2 jiffies, and > 7..10 ms -> 3 jiffies. Multiples of 3 are 33% more likely to be chosen.) > > It just seems silly and wasteful to pick a random number between 0 and > 4999 (plus 30000), only to convert it to a random number between 0 and > 1249 (plus 7500). > > And if HZ = 2000 ever happens, the timeout won't be artificially limited > to integer milliseconds. > Thanks for the detail explanation. Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout @ 2020-03-31 0:56 ` Joseph Qi 0 siblings, 0 replies; 8+ messages in thread From: Joseph Qi @ 2020-03-31 0:56 UTC (permalink / raw) To: George Spelvin, Andrew Morton Cc: linux-kernel, Mark Fasheh, Joel Becker, ocfs2-devel On 2020/3/31 00:34, George Spelvin wrote: > On Mon, Mar 30, 2020 at 08:09:33PM +0800, Joseph Qi wrote: >> Sorry for the late reply since I might miss this mail. > > You're hardly late; I expect replies to dribble in for a week. > >> On 2019/3/21 11:07, George Spelvin wrote: >>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c >>> index 68ba354cf3610..939a12e57fa8b 100644 >>> --- a/fs/ocfs2/journal.c >>> +++ b/fs/ocfs2/journal.c >>> @@ -1884,11 +1884,8 @@ int ocfs2_mark_dead_nodes(struct ocfs2_super *osb) >>> */ >>> static inline unsigned long ocfs2_orphan_scan_timeout(void) >>> { >>> - unsigned long time; >>> - >>> - get_random_bytes(&time, sizeof(time)); >>> - time = ORPHAN_SCAN_SCHEDULE_TIMEOUT + (time % 5000); >>> - return msecs_to_jiffies(time); >>> + return msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT) + >>> + prandom_u32_max(5 * HZ); >> >> Seems better include the prandom_u32_max() into msecs_to_jiffies()? > > What I'm trying to take advantage of here is constant propagation. > > msecs_to_jiffies is zero cost (it's evaluated entirely at compile > time) if its argument is a compile-time constant. It's a function call > and a few instructions if its argument is variable. > > msecs_to_jiffies(ORPHAN_SCAN_SCHEDULE_TIMEOUT + prandom_u32_max(5000)) > would be forced to use the expensive version. > > The compiler does't know, but *I* know, that msecs_to_jiffies() is a > linear function, and prandom_u32_max() is a sort-of linear function. > > (It's a linear function for a given PRNG starting state, so each > individual call is linear, but multiple calls mess things up.) > > Modulo a bit of rounding, we have: > > msecs_to_jiffies(a + b) = msecs_to_jiffies(a) + msecs_to_jiffies(b) > msecs_to_jiffies(a) * b = msecs_to_jiffies(a * b) > prandom_u32_max(a) * b = prandom_u32_max(a * b) > prandom_u32_max(msecs_to_jiffies(a)) = msecs_to_jiffies(prandom_u32_max(a)) > > By doing the addition in jiffies rather than milliseconds, we get the > cheap code. It's not a huge big deal, but it's definitely smaller and > faster. > > Admittedly, I happen to be using HZ = 300, which requires a multiply to > convert, and makes the resultant random numbers slightly non-uniform. > The default HZ = 250 makes it just a divide by 4, which is pretty simple. > > (When HZ = 300, you get 1..3 ms -> 1 jiffy, 4..6 ms -> 2 jiffies, and > 7..10 ms -> 3 jiffies. Multiples of 3 are 33% more likely to be chosen.) > > It just seems silly and wasteful to pick a random number between 0 and > 4999 (plus 30000), only to convert it to a random number between 0 and > 1249 (plus 7500). > > And if HZ = 2000 ever happens, the timeout won't be artificially limited > to integer milliseconds. > Thanks for the detail explanation. Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-03-31 0:57 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-03-21 3:07 [RFC PATCH v1 29/50] fs/ocfs2/journal: Use prandom_u32() and not /dev/random for timeout George Spelvin 2020-03-28 16:43 ` [Ocfs2-devel] " George Spelvin 2020-03-30 12:09 ` Joseph Qi 2020-03-30 12:09 ` Joseph Qi 2020-03-30 16:34 ` [Ocfs2-devel] " George Spelvin 2020-03-30 16:34 ` George Spelvin 2020-03-31 0:56 ` [Ocfs2-devel] " Joseph Qi 2020-03-31 0:56 ` Joseph Qi
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.