* [RFC] jiffies_till_first_fqs off by 1
From: Joel Fernandes @ 2025-12-23 17:38 UTC
To: rcu
Cc: Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang, rcu

Hello,

While studying some synchronize_rcu() latencies, I found that the delay resulting from the jiffies_till_first_fqs value passed to the timer tick subsystem is always off by one. This is natural due to calc_index() rounding up.

For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay is actually 4ms. The same goes for the next FQS. In fact, testing shows it can never be 3ms for HZ=1000. And in rare cases, it goes to 5ms, probably due to interrupts.

Considering this, I think it is better to reduce jiffies_till_first_fqs by 1 before passing it to the wait APIs.

But before sending a patch, I wanted to get everyone's thoughts. Consider this the RFC.

The other place I found this was when call_rcu_hurry() is called but the GP thread takes a tick to wake up. This isn't related to the timer per se; it is just that we don't want to wake the GP thread too often, so we wait for the next tick to notice callbacks before doing a wakeup.

Heh, and this means synchronize_rcu() latencies will multiply when HZ < 1000. I wonder if this is also what caused Uladzislau to investigate it for mobile devices.

 - Joel
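The rounding described in this message can be modeled in a few lines. The sketch below is a simplified model, not kernel code: it assumes the timeout is small enough to land in the first timer-wheel level (one-jiffy granularity) and that the expiry is rounded up by one unit at that level. The function names and the `phase` parameter (how far into the current tick the wait is armed) are made up for illustration, and the rare 5ms case is not modeled.

```python
def wheel_expiry_tick(now_tick: int, timeout_jiffies: int) -> int:
    """Tick at which the modeled timer fires.

    Assumption: the wheel rounds the requested expiry up by one unit
    of first-level granularity (one jiffy) so a timer never fires early.
    """
    expires = now_tick + timeout_jiffies
    return expires + 1


def observed_delay_jiffies(timeout_jiffies: int, phase: float = 0.0) -> float:
    """Wall-clock wait in jiffies for a wait armed `phase` of a tick
    (0 <= phase < 1) after the most recent tick boundary."""
    if not 0.0 <= phase < 1.0:
        raise ValueError("phase must be in [0, 1)")
    now_tick = 0
    return wheel_expiry_tick(now_tick, timeout_jiffies) - (now_tick + phase)


def observed_delay_ms(timeout_jiffies: int, hz: int, phase: float = 0.0) -> float:
    """Same delay converted to milliseconds; one jiffy is 1000/hz ms."""
    return observed_delay_jiffies(timeout_jiffies, phase) * 1000.0 / hz


if __name__ == "__main__":
    print(observed_delay_ms(3, hz=1000))  # requested 3 jiffies at HZ=1000
    print(observed_delay_ms(3, hz=250))   # same request at HZ=250
```

Under these assumptions a request of 3 jiffies armed right at a tick boundary waits 4 jiffies, which is 4ms at HZ=1000 and 16ms at HZ=250, matching the observation that the latency scales up as HZ drops.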
* Re: [RFC] jiffies_till_first_fqs off by 1 2025-12-23 17:38 [RFC] jiffies_till_first_fqs off by 1 Joel Fernandes @ 2025-12-23 23:53 ` Paul E. McKenney 2025-12-24 2:06 ` Joel Fernandes 0 siblings, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2025-12-23 23:53 UTC (permalink / raw) To: Joel Fernandes Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > Hello, > > During studying some synchronize_rcu() latencies, I found that the > jiffies_till_first_fqs value passed to the timer tick subsystem does is always > off by one. This is natural due to calc_index() rounding up. > > For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > to interrupts. > > Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > before passing it to the wait APIs. > > But before I wanted to send a patch, I wanted to get everyone's thoughts. > Considering this the RFC. Inadvertent passing of the value zero? > The other place I found this was when call_rcu_hurry() is called, but the GP > thread takes a tick to wake up, but this isn't related to the timer per-se, it > is just that we don't want to wake the GP thread too often. So we just wait for > the next tick to notice callbacks before doing a wakeup. > > Heh, and this means synchronize_rcu() latencies will multiply when HZ < 1000. I > wonder if this is also what caused Uladzislau to investigate it for mobile devices. Quite possibly! Back in the day, the theory was that lower HZ tended to imply less-capable CPUs, and thus a need to lighten the load. So there might need to be some adjustment for present-day hardware. 
Thanx, Paul
* Re: [RFC] jiffies_till_first_fqs off by 1 2025-12-23 23:53 ` Paul E. McKenney @ 2025-12-24 2:06 ` Joel Fernandes 2025-12-25 18:54 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Joel Fernandes @ 2025-12-24 2:06 UTC (permalink / raw) To: Paul E. McKenney Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang Hi Paul, On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: > On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > > During studying some synchronize_rcu() latencies, I found that the > > jiffies_till_first_fqs value passed to the timer tick subsystem does is always > > off by one. This is natural due to calc_index() rounding up. > > > > For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > > is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > > never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > > to interrupts. > > > > Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > > before passing it to the wait APIs. > > > > But before I wanted to send a patch, I wanted to get everyone's thoughts. > > Considering this the RFC. > > Inadvertent passing of the value zero? This should not be an issue because at the moment, even a value of jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). But you raise a good point, we should cap the minimum allowed jiffie value for the fqs parameters to 1 so that we don't pass schedule_timeout() with negative values when/if we do the reduce-by-one approach. thanks, - Joel ^ permalink raw reply [flat|nested] 10+ messages in thread
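A minimal sketch of the clamping suggested in this message, assuming the reduce-by-one adjustment were adopted. `adjusted_fqs_wait` is a hypothetical helper, not an existing kernel function; it only shows that flooring the parameter at 1 before subtracting keeps the value handed to the wait API non-negative.

```python
def adjusted_fqs_wait(jiffies_till_fqs: int) -> int:
    """Return the timeout to request after the hypothetical reduce-by-one.

    The parameter is floored at 1 first, so the result is never negative.
    """
    floored = max(1, jiffies_till_fqs)
    return floored - 1


if __name__ == "__main__":
    for requested in (0, 1, 3):
        print(requested, "->", adjusted_fqs_wait(requested))
```

With this floor, a setting of 0 and a setting of 1 both request a timeout of 0, so the floor alone does not provide a distinct "no wait" mode.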
* Re: [RFC] jiffies_till_first_fqs off by 1 2025-12-24 2:06 ` Joel Fernandes @ 2025-12-25 18:54 ` Paul E. McKenney 2025-12-26 2:15 ` Joel Fernandes 0 siblings, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2025-12-25 18:54 UTC (permalink / raw) To: Joel Fernandes Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: > Hi Paul, > > On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: > > On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > > > During studying some synchronize_rcu() latencies, I found that the > > > jiffies_till_first_fqs value passed to the timer tick subsystem does is always > > > off by one. This is natural due to calc_index() rounding up. > > > > > > For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > > > is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > > > never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > > > to interrupts. > > > > > > Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > > > before passing it to the wait APIs. > > > > > > But before I wanted to send a patch, I wanted to get everyone's thoughts. > > > Considering this the RFC. > > > > Inadvertent passing of the value zero? > > This should not be an issue because at the moment, even a value of > jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). > > But you raise a good point, we should cap the minimum allowed jiffie value > for the fqs parameters to 1 so that we don't pass schedule_timeout() with > negative values when/if we do the reduce-by-one approach. 
There is a potential use case for jiffies_till_first_fqs=0 and no wait, which would be systems that want to scan for idle CPUs immediately after the grace period has been initialized. Note the word "potential". ;-)

If we want to support this, then perhaps we would need to avoid that schedule_timeout(0). Or rcu_gp_fqs_check_wake(), as the case may be.

Thanx, Paul
* Re: [RFC] jiffies_till_first_fqs off by 1 2025-12-25 18:54 ` Paul E. McKenney @ 2025-12-26 2:15 ` Joel Fernandes 2026-01-01 22:24 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Joel Fernandes @ 2025-12-26 2:15 UTC (permalink / raw) To: Paul E. McKenney Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote: > On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: > > Hi Paul, > > > > On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: > > > On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > > > > During studying some synchronize_rcu() latencies, I found that the > > > > jiffies_till_first_fqs value passed to the timer tick subsystem does is always > > > > off by one. This is natural due to calc_index() rounding up. > > > > > > > > For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > > > > is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > > > > never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > > > > to interrupts. > > > > > > > > Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > > > > before passing it to the wait APIs. > > > > > > > > But before I wanted to send a patch, I wanted to get everyone's thoughts. > > > > Considering this the RFC. > > > > > > Inadvertent passing of the value zero? > > > > This should not be an issue because at the moment, even a value of > > jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). > > > > But you raise a good point, we should cap the minimum allowed jiffie value > > for the fqs parameters to 1 so that we don't pass schedule_timeout() with > > negative values when/if we do the reduce-by-one approach. 
>
> There is a potential use case for jiffies_till_first_fqs=0 and no wait,
> which would be systems that want to scan for idle CPUs immediately after
> the grace period has been initialized. Note the word "potential". ;-)

Sure, we could add support for that, but that would be new behavior that is not in the existing code.

So I think jiffies_till_first_fqs=0 is not 'working as intended' today, because I think it will always wait. So we should fix that too? Or maybe it can be a patch separate from this one (that I can work on). I see no harm in allowing that mode; at least it would be more in line with the expected outcome.

>
> If we want to support this, then perhaps we would need to avoid that
> schedule_timeout(0). Or rcu_gp_fqs_check_wake(), as the case may be.

True.

thanks,

 - Joel
* Re: [RFC] jiffies_till_first_fqs off by 1 2025-12-26 2:15 ` Joel Fernandes @ 2026-01-01 22:24 ` Paul E. McKenney 2026-01-02 2:59 ` Joel Fernandes 0 siblings, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2026-01-01 22:24 UTC (permalink / raw) To: Joel Fernandes Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On Thu, Dec 25, 2025 at 09:15:59PM -0500, Joel Fernandes wrote: > On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote: > > On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: > > > Hi Paul, > > > > > > On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: > > > > On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > > > > > During studying some synchronize_rcu() latencies, I found that the > > > > > jiffies_till_first_fqs value passed to the timer tick subsystem does is always > > > > > off by one. This is natural due to calc_index() rounding up. > > > > > > > > > > For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > > > > > is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > > > > > never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > > > > > to interrupts. > > > > > > > > > > Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > > > > > before passing it to the wait APIs. > > > > > > > > > > But before I wanted to send a patch, I wanted to get everyone's thoughts. > > > > > Considering this the RFC. > > > > > > > > Inadvertent passing of the value zero? > > > > > > This should not be an issue because at the moment, even a value of > > > jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). 
> > > > > > But you raise a good point, we should cap the minimum allowed jiffie value > > > for the fqs parameters to 1 so that we don't pass schedule_timeout() with > > > negative values when/if we do the reduce-by-one approach. > > > > There is a potential use case for jiffies_till_first_fqs=0 and no wait, > > which would be systems that want to scan for idle CPUs immediately after > > the grace period has been initialized. Note the word "potential". ;-) > > Sure, we could add support for that but that would be new behavior that is > not in the existing code. > > So jiffies_till_first_fqs=0 today, I think it is not 'working as intended' > because it will never not wait I think. Agreed. > So we should fix that too? Or maybe it can be a patch separate from this > (that I can work on). I think no harming in allowing that mode, at least it > will be more in line with the expected outcome. Makes sense! However, given that no one has complained, care is required. Someone might be relying on the old behavior. (In which case an easy fix would be to make -1 be no waiting, though one might hope for a better fix.) Thanx, Paul > > If we want to support this, then perhaps we would need to avoid that > > schedule_timeout(0). Or rcu_gp_fqs_check_wake(), as the case may be. > > True. > > thanks, > > - Joel > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] jiffies_till_first_fqs off by 1 2026-01-01 22:24 ` Paul E. McKenney @ 2026-01-02 2:59 ` Joel Fernandes 2026-01-02 3:41 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Joel Fernandes @ 2026-01-02 2:59 UTC (permalink / raw) To: paulmck Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On 1/1/2026 5:24 PM, Paul E. McKenney wrote: > On Thu, Dec 25, 2025 at 09:15:59PM -0500, Joel Fernandes wrote: >> On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote: >>> On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: >>>> Hi Paul, >>>> >>>> On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: >>>>> On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: >>>>>> During studying some synchronize_rcu() latencies, I found that the >>>>>> jiffies_till_first_fqs value passed to the timer tick subsystem does is always >>>>>> off by one. This is natural due to calc_index() rounding up. >>>>>> >>>>>> For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay >>>>>> is actually 4ms. And same for the next FQS. In fact, in testing it shows it can >>>>>> never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due >>>>>> to interrupts. >>>>>> >>>>>> Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 >>>>>> before passing it to the wait APIs. >>>>>> >>>>>> But before I wanted to send a patch, I wanted to get everyone's thoughts. >>>>>> Considering this the RFC. >>>>> >>>>> Inadvertent passing of the value zero? >>>> >>>> This should not be an issue because at the moment, even a value of >>>> jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). 
>>>> >>>> But you raise a good point, we should cap the minimum allowed jiffie value >>>> for the fqs parameters to 1 so that we don't pass schedule_timeout() with >>>> negative values when/if we do the reduce-by-one approach. >>> >>> There is a potential use case for jiffies_till_first_fqs=0 and no wait, >>> which would be systems that want to scan for idle CPUs immediately after >>> the grace period has been initialized. Note the word "potential". ;-) >> >> Sure, we could add support for that but that would be new behavior that is >> not in the existing code. >> >> So jiffies_till_first_fqs=0 today, I think it is not 'working as intended' >> because it will never not wait I think. > > Agreed. > >> So we should fix that too? Or maybe it can be a patch separate from this >> (that I can work on). I think no harming in allowing that mode, at least it >> will be more in line with the expected outcome. > > Makes sense! However, given that no one has complained, care is required. > Someone might be relying on the old behavior. (In which case an easy > fix would be to make -1 be no waiting, though one might hope for a > better fix.) Some further investigations revealed that the "1 jiffie error" is actually worst case. In the best case, it could still be closer to a jiffie. It is just the nature of the timer wheel, since it snaps to numerical TICK_NS boundary, the rounding error is intentionally added depending on how far along in the boundary was the timer for the wait enqueued. If we took probability distributions, we should be landing with a 1/2 jiffie error, though in practice I've seen it to be 3/4 jiffie error on average. Given this, it would probably not make sense for us to do the -1 to adjust for the error (since we don't clearly have bounds on the minimum error). We just have to accept that we'd lose 1-2 extra jiffie per FQS loop iteration wait, which is amplified if a grace period is already in progress. 
I've seen this add up to 4 jiffies to back-to-back synchronize_rcu() latency even when there are no readers in progress.

But I had to go down the rabbit hole and check... ;-)

thanks,

 - Joel
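The 1/2-jiffie expectation mentioned in this message can be checked with a small model. The sketch assumes the same simplified round-up as a one-jiffy timer-wheel level: a wait armed a fraction `phase` into the current tick waits `1 - phase` jiffies beyond what was requested. `mean_extra_delay` and its parameters are illustrative names, and the arming phases are spread on an even grid rather than sampled randomly so the result is reproducible.

```python
def mean_extra_delay(max_phase: float = 1.0, samples: int = 1000) -> float:
    """Average extra wait in jiffies beyond the requested timeout.

    Arming phases are spread evenly over [0, max_phase). With
    max_phase=1.0 this models a wait armed at a uniformly distributed
    point within the tick.
    """
    total = 0.0
    for i in range(samples):
        phase = max_phase * (i + 0.5) / samples
        total += 1.0 - phase  # modeled round-up: fires at the next boundary + 1
    return total / samples


if __name__ == "__main__":
    print(mean_extra_delay())               # uniform arming phase
    print(mean_extra_delay(max_phase=0.5))  # arming only in the first half of a tick
```

A uniform phase gives an average of about 1/2 jiffie. The second call is purely illustrative: if waits tended to be armed early in the tick, the average would move toward 3/4 jiffie. That is one possible reading of the observed figure, not something derived from the measurements in the thread.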
* Re: [RFC] jiffies_till_first_fqs off by 1 2026-01-02 2:59 ` Joel Fernandes @ 2026-01-02 3:41 ` Paul E. McKenney 2026-01-02 17:58 ` Joel Fernandes 0 siblings, 1 reply; 10+ messages in thread From: Paul E. McKenney @ 2026-01-02 3:41 UTC (permalink / raw) To: Joel Fernandes Cc: rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On Thu, Jan 01, 2026 at 09:59:27PM -0500, Joel Fernandes wrote: > > > On 1/1/2026 5:24 PM, Paul E. McKenney wrote: > > On Thu, Dec 25, 2025 at 09:15:59PM -0500, Joel Fernandes wrote: > >> On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote: > >>> On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: > >>>> Hi Paul, > >>>> > >>>> On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: > >>>>> On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > >>>>>> During studying some synchronize_rcu() latencies, I found that the > >>>>>> jiffies_till_first_fqs value passed to the timer tick subsystem does is always > >>>>>> off by one. This is natural due to calc_index() rounding up. > >>>>>> > >>>>>> For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > >>>>>> is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > >>>>>> never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > >>>>>> to interrupts. > >>>>>> > >>>>>> Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > >>>>>> before passing it to the wait APIs. > >>>>>> > >>>>>> But before I wanted to send a patch, I wanted to get everyone's thoughts. > >>>>>> Considering this the RFC. > >>>>> > >>>>> Inadvertent passing of the value zero? > >>>> > >>>> This should not be an issue because at the moment, even a value of > >>>> jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). 
> >>>> > >>>> But you raise a good point, we should cap the minimum allowed jiffie value > >>>> for the fqs parameters to 1 so that we don't pass schedule_timeout() with > >>>> negative values when/if we do the reduce-by-one approach. > >>> > >>> There is a potential use case for jiffies_till_first_fqs=0 and no wait, > >>> which would be systems that want to scan for idle CPUs immediately after > >>> the grace period has been initialized. Note the word "potential". ;-) > >> > >> Sure, we could add support for that but that would be new behavior that is > >> not in the existing code. > >> > >> So jiffies_till_first_fqs=0 today, I think it is not 'working as intended' > >> because it will never not wait I think. > > > > Agreed. > > >> So we should fix that too? Or maybe it can be a patch separate from this > >> (that I can work on). I think no harming in allowing that mode, at least it > >> will be more in line with the expected outcome. > > > > Makes sense! However, given that no one has complained, care is required. > > Someone might be relying on the old behavior. (In which case an easy > > fix would be to make -1 be no waiting, though one might hope for a > > better fix.) > Some further investigations revealed that the "1 jiffie error" is actually worst > case. In the best case, it could still be closer to a jiffie. It is just the > nature of the timer wheel, since it snaps to numerical TICK_NS boundary, the > rounding error is intentionally added depending on how far along in the boundary > was the timer for the wait enqueued. If we took probability distributions, we > should be landing with a 1/2 jiffie error, though in practice I've seen it to be > 3/4 jiffie error on average. > > Given this, it would probably not make sense for us to do the -1 to adjust for > the error (since we don't clearly have bounds on the minimum error). 
We just
> have to accept that we'd lose 1-2 extra jiffie per FQS loop iteration wait,
> which is amplified if a grace period is already in progress. I've seen this add
> upto 4 jiffies to back-to-back synchronize_rcu() latency even when there are no
> readers in progress.
>
> But I had to go down the rabbit hole and check... ;-)

I was thinking in terms of special-casing -1 to skip the sleep, but I guess that there are as many ways to skin a rabbit as a cat. ;-)

Thanx, Paul
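A sketch of what special-casing -1 could look like, written as a decision function rather than kernel code. Everything here is hypothetical: the `NO_WAIT` sentinel, the `fqs_wait_plan` name, and the treatment of the parameter as a signed integer are assumptions made for illustration. The point is that 0 keeps its existing behavior (it still sleeps) while only the new sentinel skips the sleep.

```python
NO_WAIT = -1  # hypothetical sentinel meaning "do not sleep before the first FQS"


def fqs_wait_plan(jiffies_till_first_fqs: int) -> tuple:
    """Return ("skip", 0) for the sentinel, otherwise ("sleep", timeout).

    A value of 0 is deliberately left as a sleep request so that anyone
    relying on the existing behavior of 0 sees no change.
    """
    if jiffies_till_first_fqs == NO_WAIT:
        return ("skip", 0)
    return ("sleep", max(0, jiffies_till_first_fqs))


if __name__ == "__main__":
    for value in (-1, 0, 3):
        print(value, fqs_wait_plan(value))
```

Keeping 0 unchanged and adding a separate sentinel trades discoverability for compatibility: existing configurations behave as before, but users have to learn about the new value to benefit from it.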
* Re: [RFC] jiffies_till_first_fqs off by 1 2026-01-02 3:41 ` Paul E. McKenney @ 2026-01-02 17:58 ` Joel Fernandes 2026-01-02 19:50 ` Paul E. McKenney 0 siblings, 1 reply; 10+ messages in thread From: Joel Fernandes @ 2026-01-02 17:58 UTC (permalink / raw) To: paulmck Cc: Joel Fernandes, rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang > On Jan 1, 2026, at 10:41 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Thu, Jan 01, 2026 at 09:59:27PM -0500, Joel Fernandes wrote: >> >> >>> On 1/1/2026 5:24 PM, Paul E. McKenney wrote: >>> On Thu, Dec 25, 2025 at 09:15:59PM -0500, Joel Fernandes wrote: >>>> On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote: >>>>> On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: >>>>>> Hi Paul, >>>>>> >>>>>> On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: >>>>>>> On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: >>>>>>>> During studying some synchronize_rcu() latencies, I found that the >>>>>>>> jiffies_till_first_fqs value passed to the timer tick subsystem does is always >>>>>>>> off by one. This is natural due to calc_index() rounding up. >>>>>>>> >>>>>>>> For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay >>>>>>>> is actually 4ms. And same for the next FQS. In fact, in testing it shows it can >>>>>>>> never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due >>>>>>>> to interrupts. >>>>>>>> >>>>>>>> Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 >>>>>>>> before passing it to the wait APIs. >>>>>>>> >>>>>>>> But before I wanted to send a patch, I wanted to get everyone's thoughts. >>>>>>>> Considering this the RFC. >>>>>>> >>>>>>> Inadvertent passing of the value zero? 
>>>>>> >>>>>> This should not be an issue because at the moment, even a value of >>>>>> jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). >>>>>> >>>>>> But you raise a good point, we should cap the minimum allowed jiffie value >>>>>> for the fqs parameters to 1 so that we don't pass schedule_timeout() with >>>>>> negative values when/if we do the reduce-by-one approach. >>>>> >>>>> There is a potential use case for jiffies_till_first_fqs=0 and no wait, >>>>> which would be systems that want to scan for idle CPUs immediately after >>>>> the grace period has been initialized. Note the word "potential". ;-) >>>> >>>> Sure, we could add support for that but that would be new behavior that is >>>> not in the existing code. >>>> >>>> So jiffies_till_first_fqs=0 today, I think it is not 'working as intended' >>>> because it will never not wait I think. >>> >>> Agreed. >>>>> So we should fix that too? Or maybe it can be a patch separate from this >>>> (that I can work on). I think no harming in allowing that mode, at least it >>>> will be more in line with the expected outcome. >>> >>> Makes sense! However, given that no one has complained, care is required. >>> Someone might be relying on the old behavior. (In which case an easy >>> fix would be to make -1 be no waiting, though one might hope for a >>> better fix.) >> Some further investigations revealed that the "1 jiffie error" is actually worst >> case. In the best case, it could still be closer to a jiffie. It is just the >> nature of the timer wheel, since it snaps to numerical TICK_NS boundary, the >> rounding error is intentionally added depending on how far along in the boundary >> was the timer for the wait enqueued. If we took probability distributions, we >> should be landing with a 1/2 jiffie error, though in practice I've seen it to be >> 3/4 jiffie error on average. 
>> >> Given this, it would probably not make sense for us to do the -1 to adjust for >> the error (since we don't clearly have bounds on the minimum error). We just >> have to accept that we'd lose 1-2 extra jiffie per FQS loop iteration wait, >> which is amplified if a grace period is already in progress. I've seen this add >> upto 4 jiffies to back-to-back synchronize_rcu() latency even when there are no >> readers in progress. > . >> But I had to go down the rabbit hole and check... ;-) > > I was thinking in terms of special-casing -1 to skip the sleep, but I > guess that there are as many ways to skin a rabbit as a cat. ;-) Sure I am happy to do that. One of my fears though is no one will know to use it that way making it not that useful. Do let me know if anyone sets it to 0 though. Perhaps for testing even to make the GP cycle shorter? - Joel > > Thanx, Paul > > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC] jiffies_till_first_fqs off by 1 2026-01-02 17:58 ` Joel Fernandes @ 2026-01-02 19:50 ` Paul E. McKenney 0 siblings, 0 replies; 10+ messages in thread From: Paul E. McKenney @ 2026-01-02 19:50 UTC (permalink / raw) To: Joel Fernandes Cc: Joel Fernandes, rcu, Steven Rostedt, linux-kernel, Davidlohr Bueso, Josh Triplett, Frederic Weisbecker, Neeraj Upadhyay, Boqun Feng, Uladzislau Rezki, Mathieu Desnoyers, Lai Jiangshan, Zqiang On Fri, Jan 02, 2026 at 12:58:08PM -0500, Joel Fernandes wrote: > > > > On Jan 1, 2026, at 10:41 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Thu, Jan 01, 2026 at 09:59:27PM -0500, Joel Fernandes wrote: > >> > >> > >>> On 1/1/2026 5:24 PM, Paul E. McKenney wrote: > >>> On Thu, Dec 25, 2025 at 09:15:59PM -0500, Joel Fernandes wrote: > >>>> On Thu, Dec 25, 2025 at 10:54:20AM -0800, Paul E. McKenney wrote: > >>>>> On Tue, Dec 23, 2025 at 09:06:19PM -0500, Joel Fernandes wrote: > >>>>>> Hi Paul, > >>>>>> > >>>>>> On Tue, Dec 23, 2025 at 03:53:23PM -0800, Paul E. McKenney wrote: > >>>>>>> On Tue, Dec 23, 2025 at 12:38:19PM -0500, Joel Fernandes wrote: > >>>>>>>> During studying some synchronize_rcu() latencies, I found that the > >>>>>>>> jiffies_till_first_fqs value passed to the timer tick subsystem does is always > >>>>>>>> off by one. This is natural due to calc_index() rounding up. > >>>>>>>> > >>>>>>>> For example, jiffies_till_first_fqs=3 means the "Jiffies till first FQS" delay > >>>>>>>> is actually 4ms. And same for the next FQS. In fact, in testing it shows it can > >>>>>>>> never ever be 3ms for HZ=1000. And in rare cases, it will go to 5ms probably due > >>>>>>>> to interrupts. > >>>>>>>> > >>>>>>>> Considering this, I think it is better to reduce the jiffies_till_first_fqs by 1 > >>>>>>>> before passing it to the wait APIs. > >>>>>>>> > >>>>>>>> But before I wanted to send a patch, I wanted to get everyone's thoughts. > >>>>>>>> Considering this the RFC. 
> >>>>>>> > >>>>>>> Inadvertent passing of the value zero? > >>>>>> > >>>>>> This should not be an issue because at the moment, even a value of > >>>>>> jiffies_till_first_fqs == 0 waits for ~1 jiffie due to schedule_timeout(0). > >>>>>> > >>>>>> But you raise a good point, we should cap the minimum allowed jiffie value > >>>>>> for the fqs parameters to 1 so that we don't pass schedule_timeout() with > >>>>>> negative values when/if we do the reduce-by-one approach. > >>>>> > >>>>> There is a potential use case for jiffies_till_first_fqs=0 and no wait, > >>>>> which would be systems that want to scan for idle CPUs immediately after > >>>>> the grace period has been initialized. Note the word "potential". ;-) > >>>> > >>>> Sure, we could add support for that but that would be new behavior that is > >>>> not in the existing code. > >>>> > >>>> So jiffies_till_first_fqs=0 today, I think it is not 'working as intended' > >>>> because it will never not wait I think. > >>> > >>> Agreed. > >>>>> So we should fix that too? Or maybe it can be a patch separate from this > >>>> (that I can work on). I think no harming in allowing that mode, at least it > >>>> will be more in line with the expected outcome. > >>> > >>> Makes sense! However, given that no one has complained, care is required. > >>> Someone might be relying on the old behavior. (In which case an easy > >>> fix would be to make -1 be no waiting, though one might hope for a > >>> better fix.) > >> Some further investigations revealed that the "1 jiffie error" is actually worst > >> case. In the best case, it could still be closer to a jiffie. It is just the > >> nature of the timer wheel, since it snaps to numerical TICK_NS boundary, the > >> rounding error is intentionally added depending on how far along in the boundary > >> was the timer for the wait enqueued. 
If we took probability distributions, we > >> should be landing with a 1/2 jiffie error, though in practice I've seen it to be > >> 3/4 jiffie error on average. > >> > >> Given this, it would probably not make sense for us to do the -1 to adjust for > >> the error (since we don't clearly have bounds on the minimum error). We just > >> have to accept that we'd lose 1-2 extra jiffie per FQS loop iteration wait, > >> which is amplified if a grace period is already in progress. I've seen this add > >> upto 4 jiffies to back-to-back synchronize_rcu() latency even when there are no > >> readers in progress. > > . > >> But I had to go down the rabbit hole and check... ;-) > > > > I was thinking in terms of special-casing -1 to skip the sleep, but I > > guess that there are as many ways to skin a rabbit as a cat. ;-) > > Sure I am happy to do that. One of my fears though is no one will know to use it that way making it not that useful. > > Do let me know if anyone sets it to 0 though. Perhaps for testing even to make the GP cycle shorter? I do not know of anyone doing that, hence the non-urgency. The "-1" would be just in case someone actually is setting it to zero, and complains about us breaking userspace. :-/ Thanx, Paul ^ permalink raw reply [flat|nested] 10+ messages in thread