* [PATCH] net: use a deferred timer in rt_check_expire
@ 2009-06-12 6:10 Eric Dumazet
2009-06-14 6:38 ` David Miller
From: Eric Dumazet @ 2009-06-12 6:10 UTC (permalink / raw)
To: David S. Miller; +Cc: Linux Netdev List, Tero.Kristo@nokia.com
For the sake of power-saving enthusiasts, use a deferrable timer to fire rt_check_expire().
Because the route cache equilibrium on some big routers depends on garbage collection being done in time,
we take into account the time elapsed between two rt_check_expire() invocations
to adjust the number of slots we have to check.
Based on an initial idea and patch from Tero Kristo
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
---
net/ipv4/route.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index a849bb1..cd76b3c 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -131,8 +131,8 @@ static int ip_rt_min_advmss __read_mostly = 256;
static int ip_rt_secret_interval __read_mostly = 10 * 60 * HZ;
static int rt_chain_length_max __read_mostly = 20;
-static void rt_worker_func(struct work_struct *work);
-static DECLARE_DELAYED_WORK(expires_work, rt_worker_func);
+static struct delayed_work expires_work;
+static unsigned long expires_ljiffies;
/*
* Interface to generic destination cache.
@@ -787,9 +787,12 @@ static void rt_check_expire(void)
struct rtable *rth, *aux, **rthp;
unsigned long samples = 0;
unsigned long sum = 0, sum2 = 0;
+ unsigned long delta;
u64 mult;
- mult = ((u64)ip_rt_gc_interval) << rt_hash_log;
+ delta = jiffies - expires_ljiffies;
+ expires_ljiffies = jiffies;
+ mult = ((u64)delta) << rt_hash_log;
if (ip_rt_gc_timeout > 1)
do_div(mult, ip_rt_gc_timeout);
goal = (unsigned int)mult;
@@ -3397,6 +3400,8 @@ int __init ip_rt_init(void)
/* All the timers, started at system startup tend
to synchronize. Perturb it a bit.
*/
+ INIT_DELAYED_WORK_DEFERRABLE(&expires_work, rt_worker_func);
+ expires_ljiffies = jiffies;
schedule_delayed_work(&expires_work,
net_random() % ip_rt_gc_interval + ip_rt_gc_interval);
* Re: [PATCH] net: use a deferred timer in rt_check_expire
2009-06-12 6:10 [PATCH] net: use a deferred timer in rt_check_expire Eric Dumazet
@ 2009-06-14 6:38 ` David Miller
From: David Miller @ 2009-06-14 6:38 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, Tero.Kristo
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 12 Jun 2009 08:10:07 +0200
> For the sake of power-saving enthusiasts, use a deferrable timer to fire rt_check_expire().
>
> Because the route cache equilibrium on some big routers depends on garbage collection being done in time,
> we take into account the time elapsed between two rt_check_expire() invocations
> to adjust the number of slots we have to check.
>
> Based on an initial idea and patch from Tero Kristo
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
Applied.
* Network stack timer hacks for power saving
@ 2009-05-19 8:13 Tero.Kristo
2009-05-19 9:04 ` Eric Dumazet
From: Tero.Kristo @ 2009-05-19 8:13 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: text/plain, Size: 514 bytes --]
Hi,
I have been looking at network stack timer optimization for
power saving in embedded ARM environment, basically trying to
avoid as many wakeups as possible. I have changed several
timers in the network stack into deferred ones, i.e. they do
not wake up the device from low power modes but instead they
are deferred until next wakeup from some other source, like
another (non-deferred) timer or some I/O. Attached a patch
about the changes I've done, is something like this safe to do?
-Tero
[-- Attachment #2: 0001-Network-stack-timer-optimizations-for-power-saving.patch --]
[-- Type: application/octet-stream, Size: 2685 bytes --]
From 92856356359fd3dd4e859119676a5fa7aee3ba79 Mon Sep 17 00:00:00 2001
From: Tero Kristo <tero.kristo@nokia.com>
Date: Wed, 13 May 2009 15:39:00 +0300
Subject: [PATCH] Network stack timer optimizations for power saving
Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
---
net/core/flow.c | 1 +
net/core/neighbour.c | 1 +
net/ipv4/inet_fragment.c | 1 +
net/ipv4/inetpeer.c | 1 +
net/ipv4/route.c | 1 +
5 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/net/core/flow.c b/net/core/flow.c
index 5cf8105..44d963c 100644
--- a/net/core/flow.c
+++ b/net/core/flow.c
@@ -352,6 +352,7 @@ static int __init flow_cache_init(void)
flow_hwm = 4 * flow_hash_size;
setup_timer(&flow_hash_rnd_timer, flow_cache_new_hashrnd, 0);
+ init_timer_deferrable(&flow_hash_rnd_timer);
flow_hash_rnd_timer.expires = jiffies + FLOW_HASH_RND_PERIOD;
add_timer(&flow_hash_rnd_timer);
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 1dc728b..c9217d7 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1445,6 +1445,7 @@ void neigh_table_init_no_netlink(struct neigh_table *tbl)
rwlock_init(&tbl->lock);
setup_timer(&tbl->gc_timer, neigh_periodic_timer, (unsigned long)tbl);
+ init_timer_deferrable(&tbl->gc_timer);
tbl->gc_timer.expires = now + 1;
add_timer(&tbl->gc_timer);
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index 6c52e08..91a099f 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -64,6 +64,7 @@ void inet_frags_init(struct inet_frags *f)
setup_timer(&f->secret_timer, inet_frag_secret_rebuild,
(unsigned long)f);
+ init_timer_deferrable(&f->secret_timer);
f->secret_timer.expires = jiffies + f->secret_interval;
add_timer(&f->secret_timer);
}
diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c
index a456cee..19eea7c 100644
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -125,6 +125,7 @@ void __init inet_initpeers(void)
/* All the timers, started at system startup tend
to synchronize. Perturb it a bit.
*/
+ init_timer_deferrable(&peer_periodic_timer);
peer_periodic_timer.expires = jiffies
+ net_random() % inet_peer_gc_maxtime
+ inet_peer_gc_maxtime;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 2ea6dcc..792bd7e 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -3288,6 +3288,7 @@ int __init ip_rt_init(void)
/* All the timers, started at system startup tend
to synchronize. Perturb it a bit.
*/
+ INIT_DELAYED_WORK_DEFERRABLE(&expires_work, rt_worker_func);
schedule_delayed_work(&expires_work,
net_random() % ip_rt_gc_interval + ip_rt_gc_interval);
--
1.5.4.3
* Re: Network stack timer hacks for power saving
2009-05-19 8:13 Network stack timer hacks for power saving Tero.Kristo
@ 2009-05-19 9:04 ` Eric Dumazet
2009-05-19 9:46 ` Tero.Kristo
From: Eric Dumazet @ 2009-05-19 9:04 UTC (permalink / raw)
To: Tero.Kristo; +Cc: netdev
Tero.Kristo@nokia.com wrote:
> Hi,
>
> I have been looking at network stack timer optimization for
> power saving in embedded ARM environment, basically trying to
> avoid as many wakeups as possible. I have changed several
> timers in the network stack into deferred ones, i.e. they do
> not wake up the device from low power modes but instead they
> are deferred until next wakeup from some other source, like
> another (non-deferred) timer or some I/O. Attached a patch
> about the changes I've done, is something like this safe to do?
>
> -Tero
Hi Tero
When TCP communications are active, we set up a timer for *every* frame
we receive or send. These timers won't be deferrable anyway.
Delaying one wakeup every 60 seconds (taking your net/ipv4/route.c change as an example)
won't change power savings that much, or did I miss something?
On big routers, we need to lower ip_rt_gc_interval from 60 seconds to one second
in order to perform effective garbage collection.
So, if we use a deferred timer and:
schedule_delayed_work(&expires_work, HZ);
how many times will the worker be started every minute?
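One way to reason about this question is a small userspace model, not kernel code. It assumes that a deferrable work item only runs when something else has already woken the CPU at or after its expiry, and that it re-arms itself `period` ticks after each run. The function name `count_deferrable_runs`, the `wakeups` array and all tick values are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/*
 * Userspace model (an assumption, not the kernel implementation):
 * wakeups[] holds sorted tick times at which some other source
 * (a non-deferrable timer, I/O, ...) wakes the CPU. The deferrable
 * work runs at the first such wakeup at or after its expiry and then
 * re-arms itself "period" ticks later.
 */
unsigned int count_deferrable_runs(const unsigned long *wakeups, size_t n,
                                   unsigned long period)
{
    unsigned long expires = period;   /* armed at tick 0 */
    unsigned int runs = 0;

    for (size_t i = 0; i < n; i++) {
        if (wakeups[i] >= expires) {
            runs++;                          /* work executes now */
            expires = wakeups[i] + period;   /* and re-arms itself */
        }
    }
    return runs;
}
```

Under these assumptions, a 1000-tick period with another wakeup source every 250 ticks gives 60 runs over 60000 ticks, while a wakeup source that only fires every 10000 ticks gives 6 runs. In this model the answer depends entirely on how often something else wakes the CPU.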
* RE: Network stack timer hacks for power saving
2009-05-19 9:04 ` Eric Dumazet
@ 2009-05-19 9:46 ` Tero.Kristo
2009-05-19 18:56 ` [PATCH] net: use a deferred timer in rt_check_expire Eric Dumazet
From: Tero.Kristo @ 2009-05-19 9:46 UTC (permalink / raw)
To: dada1; +Cc: netdev
>-----Original Message-----
>From: ext Eric Dumazet [mailto:dada1@cosmosbay.com]
>Sent: 19 May, 2009 12:04
>To: Kristo Tero (Nokia-D/Tampere)
>Cc: netdev@vger.kernel.org
>Subject: Re: Network stack timer hacks for power saving
>
>Tero.Kristo@nokia.com wrote:
>> Hi,
>>
>> I have been looking at network stack timer optimization for power
>> saving in embedded ARM environment, basically trying to
>avoid as many
>> wakeups as possible. I have changed several timers in the network
>> stack into deferred ones, i.e. they do not wake up the
>device from low
>> power modes but instead they are deferred until next wakeup
>from some
>> other source, like another (non-deferred) timer or some I/O.
>Attached
>> a patch about the changes I've done, is something like this safe to
>> do?
>>
>> -Tero
>
>Hi Tero
>
Hi Eric,
Thanks for the comments. I should clarify one thing: I am not proposing that these changes be merged into the Linux network stack, at least not as such, because I am aware that they might cause problems on some systems. If something like this were merged, I think it should go behind a kernel config option.
>
>When tcp communications are active, we setup a timer for
>*every* frame we receive or we send. These timers wont be
>deferrable anyway.
>
>delaying one wakeup every 60 seconds (if I take your
>net/ipv4/route.c change) wont change that much power savings,
>or did I missed something ?
No, this alone won't change it that much, but it still has some effect. Also, if everybody thinks that a 1 minute timer does not matter much, we soon have 60 timers each expiring once a minute, which in the worst case can cause a wakeup from low power modes every second.
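The worst case mentioned here can be checked with a short userspace sketch. It assumes every timer has a one-minute period and a fixed phase given in whole seconds; the function name `wakeups_per_minute` and the `offsets_s` array are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define WINDOW_S 60   /* one-minute window, in seconds */

/*
 * Count the distinct seconds within one minute at which at least one
 * of n once-a-minute timers expires. offsets_s[i] is the phase of
 * timer i in seconds (illustrative model, not kernel code).
 */
unsigned int wakeups_per_minute(const unsigned int *offsets_s, size_t n)
{
    bool fires[WINDOW_S] = { false };
    unsigned int wakeups = 0;

    for (size_t i = 0; i < n; i++)
        fires[offsets_s[i] % WINDOW_S] = true;

    for (size_t s = 0; s < WINDOW_S; s++)
        if (fires[s])
            wakeups++;

    return wakeups;
}
```

With 60 timers whose phases are spread over all 60 seconds, the function returns 60, i.e. one wakeup per second. If all 60 timers share the same phase, it returns 1.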
>
>On big routers, we need to set ip_rt_gc_interval from 60
>seconds to one second, in order to perform an effective
>garbage collection.
>
>So, if we use a deferred timer and :
>
>schedule_delayed_work(&expires_work, HZ);
>
>How many times worker will be started every minute ?
I think big routers do not enter any low power states, because heavy network traffic keeps them busy. Even if they do enter a low power mode, I guess the network HW will wake them up rather quickly, causing the delayed work to be executed at approximately (or probably exactly) the time it was scheduled. I might be wrong here, as I do not know much about network router power management.
Here is some data I grabbed from /proc/timer_stats before doing these changes (I only included the network stack entries here and calculated the expiry rates). I already changed most of the timers to deferrable in this sample, and also made a hack to show workqueues properly. The device was basically just idling during this measurement, not doing any frequent network communication.
Timer Stats Version: v0.2
Sample period: 59672.362 s
7455D, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer) [neighbour.c, rate 1 per 8s]
995, 0 workqueue queue_delayed_work (rt_worker_func) [route.c, rate 1 per min]
498D, 1 swapper inet_initpeers (peer_check_expire) [inetpeer.c, rate 1 per 2 min]
99D, 1 swapper flow_cache_init (flow_cache_new_hashrnd) [flow.c, rate 1 per 10 min]
99D, 1 swapper inet_frags_init (inet_frag_secret_rebuild) [inet_fragment.c, rate 1 per 10 min]
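
The rates in brackets follow from dividing the sample period by each event count. A minimal sketch of that arithmetic, with a hypothetical helper name `seconds_per_event`:

```c
/*
 * Average number of seconds between expirations for one row of the
 * listing above: sample period divided by the event count.
 * Returns 0.0 for a row that never fired.
 */
double seconds_per_event(double sample_period_s, unsigned long count)
{
    if (count == 0)
        return 0.0;
    return sample_period_s / (double)count;
}
```

For the 59672.362 s sample, 7455 events work out to roughly 8 s per event, 995 events to roughly 60 s, 498 events to roughly 120 s, and 99 events to roughly 600 s, which matches the annotated rates.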
-Tero
* [PATCH] net: use a deferred timer in rt_check_expire
2009-05-19 9:46 ` Tero.Kristo
@ 2009-05-19 18:56 ` Eric Dumazet
From: Eric Dumazet @ 2009-05-19 18:56 UTC (permalink / raw)
Cc: Tero.Kristo@nokia.com, netdev
Tero.Kristo@nokia.com wrote:
>
>
>> -----Original Message-----
>> From: ext Eric Dumazet [mailto:dada1@cosmosbay.com]
>> Sent: 19 May, 2009 12:04
>> To: Kristo Tero (Nokia-D/Tampere)
>> Cc: netdev@vger.kernel.org
>> Subject: Re: Network stack timer hacks for power saving
>>
>> Tero.Kristo@nokia.com wrote:
>>> Hi,
>>>
>>> I have been looking at network stack timer optimization for power
>>> saving in embedded ARM environment, basically trying to
>> avoid as many
>>> wakeups as possible. I have changed several timers in the network
>>> stack into deferred ones, i.e. they do not wake up the
>> device from low
>>> power modes but instead they are deferred until next wakeup
>> from some
>>> other source, like another (non-deferred) timer or some I/O.
>> Attached
>>> a patch about the changes I've done, is something like this safe to
>>> do?
>>>
>>> -Tero
Here is the patch I cooked up and tested on a machine where ip_rt_gc_interval
is set to its minimal value (1 second) and where equilibrium depends on garbage collection
being done in time.
I found that deferred timers could be *really* delayed, so I think we must take
into account the elapsed time (in jiffies) between two rt_check_expire()
calls to "guarantee" a full scan of the rt cache within an ip_rt_gc_timeout period.
Not for inclusion, as work is ongoing in this function
for a bug fix. I'll redo the patch later once it has stabilized.
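
The scaling described here can be sketched in userspace as follows. The parameter names mirror the kernel variables in the patch, but the function `rt_scan_goal` is hypothetical. The final clamp to the table size is an assumption of this sketch, since the quoted hunk does not show the code that follows the division.

```c
#include <stdint.h>

/*
 * Userspace model of the slot-count ("goal") computation:
 *   goal = (delta << rt_hash_log) / ip_rt_gc_timeout
 * where delta is the number of jiffies since the previous call.
 * Scanning that many slots per call covers the whole table once per
 * gc_timeout period, however irregularly the calls arrive.
 */
unsigned int rt_scan_goal(unsigned long delta_jiffies,
                          unsigned int rt_hash_log,
                          unsigned long gc_timeout_jiffies)
{
    uint64_t buckets = (uint64_t)1 << rt_hash_log;
    uint64_t mult = (uint64_t)delta_jiffies << rt_hash_log;

    if (gc_timeout_jiffies > 1)
        mult /= gc_timeout_jiffies;   /* do_div() in the kernel */

    /* Assumption of this sketch: never scan more than the whole table. */
    if (mult > buckets)
        mult = buckets;

    return (unsigned int)mult;
}
```

As an illustration with assumed values (HZ of 1000, a 65536-slot table, a 300 second timeout): a call after 1 second yields 218 slots, a call after 60 seconds yields 13107 slots, and any call delayed by more than the timeout scans the full 65536 slots.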
[PATCH] net: use a deferred timer in rt_check_expire
For the sake of power-saving enthusiasts, use a deferrable timer to fire rt_check_expire().
Because the route cache equilibrium on some big routers depends on garbage collection being done in time,
we take into account the time elapsed between two rt_check_expire() invocations
to adjust the number of slots we have to check.
Based on an initial idea and patch from Tero Kristo
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
---
net/ipv4/route.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index c4c60e9..b2c6793 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -131,8 +131,8 @@ static int ip_rt_min_advmss __read_mostly = 256;
static int ip_rt_secret_interval __read_mostly = 10 * 60 * HZ;
static int rt_chain_length_max __read_mostly = 20;
-static void rt_worker_func(struct work_struct *work);
-static DECLARE_DELAYED_WORK(expires_work, rt_worker_func);
+static struct delayed_work expires_work;
+static unsigned long expires_ljiffies;
/*
* Interface to generic destination cache.
@@ -787,9 +787,12 @@ static void rt_check_expire(void)
struct rtable *rth, **rthp;
unsigned long length = 0, samples = 0;
unsigned long sum = 0, sum2 = 0;
+ unsigned long delta;
u64 mult;
- mult = ((u64)ip_rt_gc_interval) << rt_hash_log;
+ delta = jiffies - expires_ljiffies;
+ expires_ljiffies = jiffies;
+ mult = ((u64)delta) << rt_hash_log;
if (ip_rt_gc_timeout > 1)
do_div(mult, ip_rt_gc_timeout);
goal = (unsigned int)mult;
@@ -3410,6 +3413,8 @@ int __init ip_rt_init(void)
/* All the timers, started at system startup tend
to synchronize. Perturb it a bit.
*/
+ INIT_DELAYED_WORK_DEFERRABLE(&expires_work, rt_worker_func);
+ expires_ljiffies = jiffies;
schedule_delayed_work(&expires_work,
net_random() % ip_rt_gc_interval + ip_rt_gc_interval);