* [patch 0/2] Introduce round_jiffies() to save spurious wakeups @ 2006-10-10 16:02 Arjan van de Ven 2006-10-10 16:03 ` [patch 1/2] round_jiffies infrastructure Arjan van de Ven 2006-10-11 17:23 ` [patch 0/2] Introduce round_jiffies() to save spurious wakeups Christoph Hellwig 0 siblings, 2 replies; 13+ messages in thread From: Arjan van de Ven @ 2006-10-10 16:02 UTC (permalink / raw) To: linux-kernel; +Cc: akpm, mingo Hi, the following 2 patches will introduce the round_jiffies() api and users thereof. The general idea is that by rounding the jiffies for certain timers to the next whole second will make those timers all happen at the same time; and thus reduce the number of times the cpu has to wake up to service timers (this assumes a tickless kernel) Obviously only timers where the exact time of firing isn't so important can do this; several of the recurring "always live" timers of the kernel are of this kind, they want "about once a second" or "about once every 4 seconds" and such, and don't really care about the exact jiffy in which they fire. An alternative would have been to introduce mod_timer_rounded() or somesuch APIs (but there's many variants that take jiffies); I feel that an explicit caller based rounding actually is quite reasonable. Greetings, Arjan van de Ven ^ permalink raw reply [flat|nested] 13+ messages in thread
* [patch 1/2] round_jiffies infrastructure 2006-10-10 16:02 [patch 0/2] Introduce round_jiffies() to save spurious wakeups Arjan van de Ven @ 2006-10-10 16:03 ` Arjan van de Ven 2006-10-10 16:04 ` [patch 2/2] round_jiffies users Arjan van de Ven ` (2 more replies) 2006-10-11 17:23 ` [patch 0/2] Introduce round_jiffies() to save spurious wakeups Christoph Hellwig 1 sibling, 3 replies; 13+ messages in thread From: Arjan van de Ven @ 2006-10-10 16:03 UTC (permalink / raw) To: linux-kernel; +Cc: mingo, akpm From: Arjan van de Ven <arjan@linux.intel.com> Subject: round_jiffies infrastructure This patch introduces a round_jiffies() function as well as a round_jiffies_relative() function. These functions round a jiffies value to the next whole second. The primary purpose of this rounding is to cause all "we don't care exactly when" timers to happen at the same jiffy. This avoids multiple timers to fire within the second for no real reason; with dynamic ticks these extra timers cause wakeups from deep sleep CPU sleep states and thus waste power. The exact wakeup moment is skewed by the cpu number, to avoid all cpus from waking up at the exact same time (and hitting the same lock/cachelines there) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Index: linux-2.6.19-rc1-git6/include/linux/timer.h =================================================================== --- linux-2.6.19-rc1-git6.orig/include/linux/timer.h +++ linux-2.6.19-rc1-git6/include/linux/timer.h @@ -98,4 +98,10 @@ extern void run_local_timers(void); struct hrtimer; extern int it_real_fn(struct hrtimer *); +unsigned long __round_jiffies(unsigned long T, int CPU); +unsigned long __round_jiffies_relative(unsigned long T, int CPU); +unsigned long round_jiffies(unsigned long T); +unsigned long round_jiffies_relative(unsigned long T); + + #endif Index: linux-2.6.19-rc1-git6/kernel/timer.c =================================================================== --- linux-2.6.19-rc1-git6.orig/kernel/timer.c +++ linux-2.6.19-rc1-git6/kernel/timer.c @@ -80,6 +80,56 @@ tvec_base_t boot_tvec_bases; EXPORT_SYMBOL(boot_tvec_bases); static DEFINE_PER_CPU(tvec_base_t *, tvec_bases) = &boot_tvec_bases; +unsigned long __round_jiffies(unsigned long T, int CPU) +{ + int rem; + int original = T; + rem = T % HZ; + if (rem < HZ/4) + T = T - rem; + else + T = T - rem + HZ; + /* we don't want all cpus firing at once hitting the same lock/memory */ + T += CPU * 3; + if (T <= jiffies) /* rounding ate our timeout entirely */ + return original; + return T; +} +EXPORT_SYMBOL_GPL(__round_jiffies); + +unsigned long __round_jiffies_relative(unsigned long T, int CPU) +{ + int rem; + int original = T; + T=T+jiffies; + rem = T % HZ; + if (rem < HZ/4) + T = T - rem; + else + T = T - rem + HZ; + /* we don't want all cpus firing at once hitting the same lock/memory */ + T += CPU * 3; + T = T-jiffies; + if (T<=0) /* rounding ate our delay entirely, don't round */ + return original; + return T; +} +EXPORT_SYMBOL_GPL(__round_jiffies_relative); + +unsigned long round_jiffies(unsigned long T) +{ + return __round_jiffies(T, raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(round_jiffies); + +unsigned long round_jiffies_relative(unsigned long T) +{ + return __round_jiffies_relative(T, raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(round_jiffies_relative); + + + static inline void set_running_timer(tvec_base_t *base, struct timer_list *timer) { ^ permalink raw reply [flat|nested] 13+ messages in thread
* [patch 2/2] round_jiffies users 2006-10-10 16:03 ` [patch 1/2] round_jiffies infrastructure Arjan van de Ven @ 2006-10-10 16:04 ` Arjan van de Ven 2006-10-10 16:47 ` Ingo Oeser 2006-10-10 22:47 ` Paul Dickson 2006-10-10 18:56 ` [patch 1/2] round_jiffies infrastructure Andrew Morton 2006-10-16 13:42 ` Andi Kleen 2 siblings, 2 replies; 13+ messages in thread From: Arjan van de Ven @ 2006-10-10 16:04 UTC (permalink / raw) To: linux-kernel; +Cc: netdev, jgarzik, akpm, mingo From: Arjan van de Ven <arjan@linux.intel.com> Subject: round_jiffies users CC: jgarzik@pobox.com CC: netdev@vger.kernel.org This patch introduces users of the round_jiffies() function. These timers all were of the "about once a second" or "about once every X seconds" variety and several showed up in the "what wakes the cpu up" profiles that the tickless patches provide. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Index: linux-2.6.19-rc1-git6/mm/slab.c =================================================================== --- linux-2.6.19-rc1-git6.orig/mm/slab.c +++ linux-2.6.19-rc1-git6/mm/slab.c @@ -926,7 +926,7 @@ static void __devinit start_cpu_timer(in if (keventd_up() && reap_work->func == NULL) { init_reap_node(cpu); INIT_WORK(reap_work, cache_reap, NULL); - schedule_delayed_work_on(cpu, reap_work, HZ + 3 * cpu); + schedule_delayed_work_on(cpu, reap_work, __round_jiffies_relative(HZ, cpu)); } } @@ -3821,7 +3821,7 @@ static void cache_reap(void *unused) if (!mutex_trylock(&cache_chain_mutex)) { /* Give up. Setup the next iteration. */ schedule_delayed_work(&__get_cpu_var(reap_work), - REAPTIMEOUT_CPUC); + round_jiffies_relative(REAPTIMEOUT_CPUC)); return; } @@ -3867,7 +3867,8 @@ next: next_reap_node(); refresh_cpu_vm_stats(smp_processor_id()); /* Set up the next iteration */ - schedule_delayed_work(&__get_cpu_var(reap_work), REAPTIMEOUT_CPUC); + schedule_delayed_work(&__get_cpu_var(reap_work), + round_jiffies_relative(REAPTIMEOUT_CPUC)); } #ifdef CONFIG_PROC_FS Index: linux-2.6.19-rc1-git6/fs/jbd/transaction.c =================================================================== --- linux-2.6.19-rc1-git6.orig/fs/jbd/transaction.c +++ linux-2.6.19-rc1-git6/fs/jbd/transaction.c @@ -53,7 +53,7 @@ get_transaction(journal_t *journal, tran spin_lock_init(&transaction->t_handle_lock); /* Set up the commit timer for the new transaction. */ - journal->j_commit_timer.expires = transaction->t_expires; + journal->j_commit_timer.expires = round_jiffies(transaction->t_expires); add_timer(&journal->j_commit_timer); J_ASSERT(journal->j_running_transaction == NULL); Index: linux-2.6.19-rc1-git6/drivers/ata/libata-scsi.c =================================================================== --- linux-2.6.19-rc1-git6.orig/drivers/ata/libata-scsi.c +++ linux-2.6.19-rc1-git6/drivers/ata/libata-scsi.c @@ -3094,7 +3094,8 @@ void ata_scsi_hotplug(void *data) for (i = 0; i < ATA_MAX_DEVICES; i++) { struct ata_device *dev = &ap->device[i]; if (ata_dev_enabled(dev) && !dev->sdev) { - queue_delayed_work(ata_aux_wq, &ap->hotplug_task, HZ); + queue_delayed_work(ata_aux_wq, &ap->hotplug_task, + round_jiffies_relative(HZ)); break; } } Index: linux-2.6.19-rc1-git6/net/core/dst.c =================================================================== --- linux-2.6.19-rc1-git6.orig/net/core/dst.c +++ linux-2.6.19-rc1-git6/net/core/dst.c @@ -99,7 +99,14 @@ static void dst_run_gc(unsigned long dum printk("dst_total: %d/%d %ld\n", atomic_read(&dst_total), delayed, dst_gc_timer_expires); #endif - mod_timer(&dst_gc_timer, jiffies + dst_gc_timer_expires); + /* if the next desired timer is more than 4 seconds in the future + * then round the timer to whole seconds + */ + if (dst_gc_timer_expires > 4*HZ) + mod_timer(&dst_gc_timer, + round_jiffies(jiffies + dst_gc_timer_expires)); + else + mod_timer(&dst_gc_timer, jiffies + dst_gc_timer_expires); out: spin_unlock(&dst_lock); Index: linux-2.6.19-rc1-git6/net/core/neighbour.c =================================================================== --- linux-2.6.19-rc1-git6.orig/net/core/neighbour.c +++ linux-2.6.19-rc1-git6/net/core/neighbour.c @@ -695,7 +695,10 @@ next_elt: if (!expire) expire = 1; - mod_timer(&tbl->gc_timer, now + expire); + if (expire>HZ) + mod_timer(&tbl->gc_timer, round_jiffies(now + expire)); + else + mod_timer(&tbl->gc_timer, now + expire); write_unlock(&tbl->lock); } Index: linux-2.6.19-rc1-git6/net/sched/sch_generic.c =================================================================== --- linux-2.6.19-rc1-git6.orig/net/sched/sch_generic.c +++ linux-2.6.19-rc1-git6/net/sched/sch_generic.c @@ -209,7 +209,7 @@ static void dev_watchdog(unsigned long a dev->name); dev->tx_timeout(dev); } - if (!mod_timer(&dev->watchdog_timer, jiffies + dev->watchdog_timeo)) + if (!mod_timer(&dev->watchdog_timer, round_jiffies(jiffies + dev->watchdog_timeo))) dev_hold(dev); } } Index: linux-2.6.19-rc1-git6/drivers/net/e1000/e1000_main.c =================================================================== --- linux-2.6.19-rc1-git6.orig/drivers/net/e1000/e1000_main.c +++ linux-2.6.19-rc1-git6/drivers/net/e1000/e1000_main.c @@ -483,7 +483,7 @@ e1000_up(struct e1000_adapter *adapter) clear_bit(__E1000_DOWN, &adapter->flags); - mod_timer(&adapter->watchdog_timer, jiffies + 2 * HZ); + mod_timer(&adapter->watchdog_timer, round_jiffies(jiffies + 2 * HZ)); return 0; } @@ -2493,7 +2493,7 @@ e1000_watchdog(unsigned long data) netif_carrier_on(netdev); netif_wake_queue(netdev); - mod_timer(&adapter->phy_info_timer, jiffies + 2 * HZ); + mod_timer(&adapter->phy_info_timer, round_jiffies(jiffies + 2 * HZ)); adapter->smartspeed = 0; } } else { @@ -2503,7 +2503,7 @@ e1000_watchdog(unsigned long data) DPRINTK(LINK, INFO, "NIC Link is Down\n"); netif_carrier_off(netdev); netif_stop_queue(netdev); - mod_timer(&adapter->phy_info_timer, jiffies + 2 * HZ); + mod_timer(&adapter->phy_info_timer, round_jiffies(jiffies + 2 * HZ)); /* 80003ES2LAN workaround-- * For packet buffer work-around on link down event; @@ -2568,7 +2568,7 @@ e1000_watchdog(unsigned long data) e1000_rar_set(&adapter->hw, adapter->hw.mac_addr, 0); /* Reset the timer */ - mod_timer(&adapter->watchdog_timer, jiffies + 2 * HZ); + mod_timer(&adapter->watchdog_timer, round_jiffies(jiffies + 2 * HZ)); } #define E1000_TX_FLAGS_CSUM 0x00000001 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 2/2] round_jiffies users 2006-10-10 16:04 ` [patch 2/2] round_jiffies users Arjan van de Ven @ 2006-10-10 16:47 ` Ingo Oeser 2006-10-10 16:59 ` Arjan van de Ven 2006-10-10 22:47 ` Paul Dickson 1 sibling, 1 reply; 13+ messages in thread From: Ingo Oeser @ 2006-10-10 16:47 UTC (permalink / raw) To: Arjan van de Ven; +Cc: linux-kernel, netdev, jgarzik, akpm, mingo Hi Arjan, Arjan van de Ven wrote: > Index: linux-2.6.19-rc1-git6/mm/slab.c > =================================================================== > --- linux-2.6.19-rc1-git6.orig/mm/slab.c > +++ linux-2.6.19-rc1-git6/mm/slab.c > @@ -926,7 +926,7 @@ static void __devinit start_cpu_timer(in > if (keventd_up() && reap_work->func == NULL) { > init_reap_node(cpu); > INIT_WORK(reap_work, cache_reap, NULL); > - schedule_delayed_work_on(cpu, reap_work, HZ + 3 * cpu); > + schedule_delayed_work_on(cpu, reap_work, __round_jiffies_relative(HZ, cpu)); > } > } > Did you changed the behavior by intention? You seem to miss the factor "3" here. This hunk should read: --- linux-2.6.19-rc1-git6.orig/mm/slab.c +++ linux-2.6.19-rc1-git6/mm/slab.c @@ -926,7 +926,7 @@ static void __devinit start_cpu_timer(in if (keventd_up() && reap_work->func == NULL) { init_reap_node(cpu); INIT_WORK(reap_work, cache_reap, NULL); - schedule_delayed_work_on(cpu, reap_work, HZ + 3 * cpu); + schedule_delayed_work_on(cpu, reap_work, __round_jiffies_relative(HZ, 3 * cpu)); } } In case you apply it: Signed-off-by: Ingo Oese <netdev@axxeo.de> Regards Ingo Oeser ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 2/2] round_jiffies users 2006-10-10 16:47 ` Ingo Oeser @ 2006-10-10 16:59 ` Arjan van de Ven 0 siblings, 0 replies; 13+ messages in thread From: Arjan van de Ven @ 2006-10-10 16:59 UTC (permalink / raw) To: Ingo Oeser; +Cc: linux-kernel, netdev, jgarzik, akpm, mingo On Tue, 2006-10-10 at 18:47 +0200, Ingo Oeser wrote: > Hi Arjan, > > Arjan van de Ven wrote: > > Index: linux-2.6.19-rc1-git6/mm/slab.c > > =================================================================== > > --- linux-2.6.19-rc1-git6.orig/mm/slab.c > > +++ linux-2.6.19-rc1-git6/mm/slab.c > > @@ -926,7 +926,7 @@ static void __devinit start_cpu_timer(in > > if (keventd_up() && reap_work->func == NULL) { > > init_reap_node(cpu); > > INIT_WORK(reap_work, cache_reap, NULL); > > - schedule_delayed_work_on(cpu, reap_work, HZ + 3 * cpu); > > + schedule_delayed_work_on(cpu, reap_work, __round_jiffies_relative(HZ, cpu)); > > } > > } > > > > Did you changed the behavior by intention? > You seem to miss the factor "3" here. This hunk should read: Hi, actually.. not really; the __round_jiffies_relative function just takes a CPU number, and internally takes care of spreading things around based on CPU number (eg it does the *3 internally); it's cleaner that way, the callers don't need to bother by how much to spread for each cpu etc etc... So the patch is correct as is. Greetings, Arjan van de Ven ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 2/2] round_jiffies users 2006-10-10 16:04 ` [patch 2/2] round_jiffies users Arjan van de Ven 2006-10-10 16:47 ` Ingo Oeser @ 2006-10-10 22:47 ` Paul Dickson 2006-10-10 23:52 ` Arjan van de Ven 1 sibling, 1 reply; 13+ messages in thread From: Paul Dickson @ 2006-10-10 22:47 UTC (permalink / raw) To: Arjan van de Ven; +Cc: linux-kernel, netdev, jgarzik, akpm, mingo On Tue, 10 Oct 2006 18:04:23 +0200, Arjan van de Ven wrote: > + mod_timer(&adapter->phy_info_timer, round_jiffies(jiffies + 2 * HZ)); Shouldn't round_jiffies_relative be used for some of these, a la: + mod_timer(&adapter->phy_info_timer, round_jiffies_relative(2 * HZ)); -Paul ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 2/2] round_jiffies users 2006-10-10 22:47 ` Paul Dickson @ 2006-10-10 23:52 ` Arjan van de Ven 0 siblings, 0 replies; 13+ messages in thread From: Arjan van de Ven @ 2006-10-10 23:52 UTC (permalink / raw) To: Paul Dickson; +Cc: linux-kernel, netdev, jgarzik, akpm, mingo Paul Dickson wrote: > On Tue, 10 Oct 2006 18:04:23 +0200, Arjan van de Ven wrote: > >> + mod_timer(&adapter->phy_info_timer, round_jiffies(jiffies + 2 * HZ)); > > Shouldn't round_jiffies_relative be used for some of these, a la: > > + mod_timer(&adapter->phy_info_timer, round_jiffies_relative(2 * HZ)); > mod_timer() takes an absolute jiffies value as argument, so... no :) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/2] round_jiffies infrastructure 2006-10-10 16:03 ` [patch 1/2] round_jiffies infrastructure Arjan van de Ven 2006-10-10 16:04 ` [patch 2/2] round_jiffies users Arjan van de Ven @ 2006-10-10 18:56 ` Andrew Morton 2006-10-10 20:48 ` Arjan van de Ven 2006-10-16 13:42 ` Andi Kleen 2 siblings, 1 reply; 13+ messages in thread From: Andrew Morton @ 2006-10-10 18:56 UTC (permalink / raw) To: Arjan van de Ven; +Cc: linux-kernel, mingo On Tue, 10 Oct 2006 18:03:30 +0200 Arjan van de Ven <arjan@linux.intel.com> wrote: > @@ -80,6 +80,56 @@ tvec_base_t boot_tvec_bases; > EXPORT_SYMBOL(boot_tvec_bases); > static DEFINE_PER_CPU(tvec_base_t *, tvec_bases) = &boot_tvec_bases; > > +unsigned long __round_jiffies(unsigned long T, int CPU) > +{ > + int rem; > + int original = T; > + rem = T % HZ; > + if (rem < HZ/4) > + T = T - rem; > + else > + T = T - rem + HZ; > + /* we don't want all cpus firing at once hitting the same lock/memory */ > + T += CPU * 3; > + if (T <= jiffies) /* rounding ate our timeout entirely */ > + return original; > + return T; > +} > +EXPORT_SYMBOL_GPL(__round_jiffies); > + c'mon Arjan. If we're going to create new, kernel-wide, exported-to-modules infrastructure then it deserves slightly more than zero documentation. Some commentary explaining/justifying the magic numbers in there would be useful too. The HZ/4, the cpu*3. And coding style too, please: consistent spacing around arithmetic operators, variables are lower case, constants are upper case. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/2] round_jiffies infrastructure 2006-10-10 18:56 ` [patch 1/2] round_jiffies infrastructure Andrew Morton @ 2006-10-10 20:48 ` Arjan van de Ven 0 siblings, 0 replies; 13+ messages in thread From: Arjan van de Ven @ 2006-10-10 20:48 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, mingo > c'mon Arjan. If we're going to create new, kernel-wide, > exported-to-modules infrastructure then it deserves slightly more than zero > documentation. ok I had a weak moment; updated patch below (lightly tested because current 2.6.19-rc1-git6 dies on x86_64 with an interrupt issue within a few minutes even unpatched on my machines) From: Arjan van de Ven <arjan@linux.intel.com> Subject: round_jiffies infrastructure This patch introduces a round_jiffies() function as well as a round_jiffies_relative() function. These functions round a jiffies value to the next whole second. The primary purpose of this rounding is to cause all "we don't care exactly when" timers to happen at the same jiffy. This avoids multiple timers to fire within the second for no real reason; with dynamic ticks these extra timers cause wakeups from deep sleep CPU sleep states and thus waste power. The exact wakeup moment is skewed by the cpu number, to avoid all cpus from waking up at the exact same time (and hitting the same lock/cachelines there) Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Index: linux-2.6.19-rc1-git6/include/linux/timer.h =================================================================== --- linux-2.6.19-rc1-git6.orig/include/linux/timer.h +++ linux-2.6.19-rc1-git6/include/linux/timer.h @@ -98,4 +98,10 @@ extern void run_local_timers(void); struct hrtimer; extern int it_real_fn(struct hrtimer *); +unsigned long __round_jiffies(unsigned long j, int cpu); +unsigned long __round_jiffies_relative(unsigned long j, int cpu); +unsigned long round_jiffies(unsigned long j); +unsigned long round_jiffies_relative(unsigned long j); + + #endif Index: linux-2.6.19-rc1-git6/kernel/timer.c =================================================================== --- linux-2.6.19-rc1-git6.orig/kernel/timer.c +++ linux-2.6.19-rc1-git6/kernel/timer.c @@ -80,6 +80,139 @@ tvec_base_t boot_tvec_bases; EXPORT_SYMBOL(boot_tvec_bases); static DEFINE_PER_CPU(tvec_base_t *, tvec_bases) = &boot_tvec_bases; +/** + * __round_jiffies - function to round jiffies to a full second + * @j: the time in (absolute) jiffies that should be rounded + * @cpu: the processor number on which the timeout will happen + * + * __round_jiffies rounds an absolute time in the future (in jiffies) + * up or down to (approximately) full seconds. This is useful for timers + * for which the exact time they fire does not matter too much, as long as + * they fire approximately every X seconds. + * + * By rounding these timers to whole seconds, all such timers will fire + * at the same time, rather than at various times spread out. The goal + * of this is to have the CPU wake up less, which saves power. + * + * The exact rounding is skewed for each processor to avoid all + * processors firing at the exact same time, which could lead + * to lock contention or spurious cache line bouncing. + * + * The return value is the rounded version of the "j" parameter. + */ +unsigned long __round_jiffies(unsigned long j, int cpu) +{ + int rem; + int original = j; + + /* + * We don't want all cpus firing their timers at once hitting the + * same lock or cachelines, so we skew each extra cpu with an extra + * 3 jiffies. This 3 jiffies came originally from the mm/ code which + * already did this. + * The skew is done by adding 3*cpunr, then round, then subtract this + * extra offset again. + */ + j += cpu * 3; + + rem = j % HZ; + + /* + * If the target jiffie is just after a whole second (which can happen + * due to delays of the timer irq, long irq off times etc etc) then + * we should round down to the whole second, not up. Use 1/4th second + * as cutoff for this rounding as an extreme upper bound for this. + */ + if (rem < HZ/4) /* round down */ + j = j - rem; + else /* round up */ + j = j - rem + HZ; + + /* now that we have rounded, subtract the extra skew again */ + j -= cpu * 3; + + if (j <= jiffies) /* rounding ate our timeout entirely; */ + return original; + return j; +} +EXPORT_SYMBOL_GPL(__round_jiffies); + +/** + * __round_jiffies_relative - function to round jiffies to a full second + * @j: the time in (relative) jiffies that should be rounded + * @cpu: the processor number on which the timeout will happen + * + * __round_jiffies_relative rounds a time delta in the future (in jiffies) + * up or down to (approximately) full seconds. This is useful for timers + * for which the exact time they fire does not matter too much, as long as + * they fire approximately every X seconds. + * + * By rounding these timers to whole seconds, all such timers will fire + * at the same time, rather than at various times spread out. The goal + * of this is to have the CPU wake up less, which saves power. + * + * The exact rounding is skewed for each processor to avoid all + * processors firing at the exact same time, which could lead + * to lock contention or spurious cache line bouncing. + * + * The return value is the rounded version of the "j" parameter. + */ +unsigned long __round_jiffies_relative(unsigned long j, int cpu) +{ + /* + * In theory the following code can skip a jiffy in case jiffies + * increments right between the addition and the later subtraction. + * However since the entire point of this function is to use approximate + * timeouts, it's entirely ok to not handle that. + */ + return __round_jiffies(j + jiffies, cpu) - jiffies; +} +EXPORT_SYMBOL_GPL(__round_jiffies_relative); + +/** + * round_jiffies - function to round jiffies to a full second + * @j: the time in (absolute) jiffies that should be rounded + * + * round_jiffies rounds an absolute time in the future (in jiffies) + * up or down to (approximately) full seconds. This is useful for timers + * for which the exact time they fire does not matter too much, as long as + * they fire approximately every X seconds. + * + * By rounding these timers to whole seconds, all such timers will fire + * at the same time, rather than at various times spread out. The goal + * of this is to have the CPU wake up less, which saves power. + * + * The return value is the rounded version of the "j" parameter. + */ +unsigned long round_jiffies(unsigned long j) +{ + return __round_jiffies(j, raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(round_jiffies); + +/** + * round_jiffies_relative - function to round jiffies to a full second + * @j: the time in (relative) jiffies that should be rounded + * + * round_jiffies_relative rounds a time delta in the future (in jiffies) + * up or down to (approximately) full seconds. This is useful for timers + * for which the exact time they fire does not matter too much, as long as + * they fire approximately every X seconds. + * + * By rounding these timers to whole seconds, all such timers will fire + * at the same time, rather than at various times spread out. The goal + * of this is to have the CPU wake up less, which saves power. + * + * The return value is the rounded version of the "j" parameter. + */ +unsigned long round_jiffies_relative(unsigned long j) +{ + return __round_jiffies_relative(j, raw_smp_processor_id()); +} +EXPORT_SYMBOL_GPL(round_jiffies_relative); + + + static inline void set_running_timer(tvec_base_t *base, struct timer_list *timer) { ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 1/2] round_jiffies infrastructure 2006-10-10 16:03 ` [patch 1/2] round_jiffies infrastructure Arjan van de Ven 2006-10-10 16:04 ` [patch 2/2] round_jiffies users Arjan van de Ven 2006-10-10 18:56 ` [patch 1/2] round_jiffies infrastructure Andrew Morton @ 2006-10-16 13:42 ` Andi Kleen 2 siblings, 0 replies; 13+ messages in thread From: Andi Kleen @ 2006-10-16 13:42 UTC (permalink / raw) To: Arjan van de Ven; +Cc: mingo, akpm, linux-kernel Arjan van de Ven <arjan@linux.intel.com> writes: > +} > +EXPORT_SYMBOL_GPL(__round_jiffies); This means non GPL modules will disturb your timers again. probably not a good strategy. > + > +unsigned long __round_jiffies_relative(unsigned long T, int CPU) > +{ > + int rem; > + int original = T; > + T=T+jiffies; > + rem = T % HZ; > + if (rem < HZ/4) > + T = T - rem; > + else > + T = T - rem + HZ; > + /* we don't want all cpus firing at once hitting the same lock/memory */ > + T += CPU * 3; Consider a dual core Yonah/Merom: it has shared caches and the two cores can only go to sleep together. With this the wakeups will be always twice. Not good. I guess you need to add some topology awareness here and e.g. only spread it for sockets. BTW we normally put spaces around operators inside expressions. -Andi ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 0/2] Introduce round_jiffies() to save spurious wakeups 2006-10-10 16:02 [patch 0/2] Introduce round_jiffies() to save spurious wakeups Arjan van de Ven 2006-10-10 16:03 ` [patch 1/2] round_jiffies infrastructure Arjan van de Ven @ 2006-10-11 17:23 ` Christoph Hellwig 2006-10-11 17:54 ` Arjan van de Ven 1 sibling, 1 reply; 13+ messages in thread From: Christoph Hellwig @ 2006-10-11 17:23 UTC (permalink / raw) To: Arjan van de Ven; +Cc: linux-kernel, akpm, mingo On Tue, Oct 10, 2006 at 06:02:45PM +0200, Arjan van de Ven wrote: > Hi, > > the following 2 patches will introduce the round_jiffies() api and users > thereof. > > The general idea is that by rounding the jiffies for certain timers to > the next whole second will make those timers all happen at the same > time; and thus reduce the number of times the cpu has to wake up to > service timers (this assumes a tickless kernel) > > Obviously only timers where the exact time of firing isn't so important > can do this; several of the recurring "always live" timers of the kernel > are of this kind, they want "about once a second" or "about once every 4 > seconds" and such, and don't really care about the exact jiffy in which > they fire. > > An alternative would have been to introduce mod_timer_rounded() or > somesuch APIs (but there's many variants that take jiffies); I feel that > an explicit caller based rounding actually is quite reasonable. I think the API you proposed is horrible. Having jiffies exposed in ani API is a mistake, and adding more makes this problem worse. I'd suggest to start with Alan's patches that add a timer variant that takes a miliseconds argument instead of jiffies and add a _rounded varaint to it that has a new parameter that specifies the precision. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 0/2] Introduce round_jiffies() to save spurious wakeups 2006-10-11 17:23 ` [patch 0/2] Introduce round_jiffies() to save spurious wakeups Christoph Hellwig @ 2006-10-11 17:54 ` Arjan van de Ven 2006-10-12 19:02 ` Christoph Hellwig 0 siblings, 1 reply; 13+ messages in thread From: Arjan van de Ven @ 2006-10-11 17:54 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-kernel, akpm, mingo On Wed, 2006-10-11 at 18:23 +0100, Christoph Hellwig wrote: > On Tue, Oct 10, 2006 at 06:02:45PM +0200, Arjan van de Ven wrote: > > Hi, > > > > the following 2 patches will introduce the round_jiffies() api and users > > thereof. > > > > The general idea is that by rounding the jiffies for certain timers to > > the next whole second will make those timers all happen at the same > > time; and thus reduce the number of times the cpu has to wake up to > > service timers (this assumes a tickless kernel) > > > > Obviously only timers where the exact time of firing isn't so important > > can do this; several of the recurring "always live" timers of the kernel > > are of this kind, they want "about once a second" or "about once every 4 > > seconds" and such, and don't really care about the exact jiffy in which > > they fire. > > > > An alternative would have been to introduce mod_timer_rounded() or > > somesuch APIs (but there's many variants that take jiffies); I feel that > > an explicit caller based rounding actually is quite reasonable. > > I think the API you proposed is horrible. Having jiffies exposed in > ani API is a mistake, and adding more makes this problem worse. and other people like Linus disagree with you. > I'd suggest > to start with Alan's patches that add a timer variant that takes a miliseconds > argument instead of jiffies and add a _rounded varaint to it that has > a new parameter that specifies the precision. it's half a solution; there's many apis that currently take either absolute or relative jiffies, and want rounding. Duplicating that lot doesn't look like the best idea either... ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [patch 0/2] Introduce round_jiffies() to save spurious wakeups 2006-10-11 17:54 ` Arjan van de Ven @ 2006-10-12 19:02 ` Christoph Hellwig 0 siblings, 0 replies; 13+ messages in thread From: Christoph Hellwig @ 2006-10-12 19:02 UTC (permalink / raw) To: Arjan van de Ven; +Cc: Christoph Hellwig, linux-kernel, akpm, mingo On Wed, Oct 11, 2006 at 07:54:57PM +0200, Arjan van de Ven wrote: > > > An alternative would have been to introduce mod_timer_rounded() or > > > somesuch APIs (but there's many variants that take jiffies); I feel that > > > an explicit caller based rounding actually is quite reasonable. > > > > I think the API you proposed is horrible. Having jiffies exposed in > > ani API is a mistake, and adding more makes this problem worse. > > and other people like Linus disagree with you. The only argument from Linus was about getting rid of jiffies completly. He certainly didn't complain when we added interfaces to reduce jiffies use in the past (e.g. the msleep APIs) ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2006-10-16 13:42 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-10-10 16:02 [patch 0/2] Introduce round_jiffies() to save spurious wakeups Arjan van de Ven 2006-10-10 16:03 ` [patch 1/2] round_jiffies infrastructure Arjan van de Ven 2006-10-10 16:04 ` [patch 2/2] round_jiffies users Arjan van de Ven 2006-10-10 16:47 ` Ingo Oeser 2006-10-10 16:59 ` Arjan van de Ven 2006-10-10 22:47 ` Paul Dickson 2006-10-10 23:52 ` Arjan van de Ven 2006-10-10 18:56 ` [patch 1/2] round_jiffies infrastructure Andrew Morton 2006-10-10 20:48 ` Arjan van de Ven 2006-10-16 13:42 ` Andi Kleen 2006-10-11 17:23 ` [patch 0/2] Introduce round_jiffies() to save spurious wakeups Christoph Hellwig 2006-10-11 17:54 ` Arjan van de Ven 2006-10-12 19:02 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox