* [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c
@ 2009-07-09 21:59 Thomas Gleixner
2009-07-09 21:59 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Thomas Gleixner
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: Thomas Gleixner @ 2009-07-09 21:59 UTC (permalink / raw)
To: netdev; +Cc: LKML, David Miller, Patrick McHardy, Peter Zijlstra
While looking at the "Soft-Lockup/Race in networking in 2.6.31-rc1+195
( possibly caused by netem)" issue (http://lkml.org/lkml/2009/7/2/519)
I noticed some serious problems in the hrtimer related code of
net/sched/sched/sch_cbq.c.
The following patch series addresses these.
I'm not entirely sure whether patch [1/3] covers all possible
concurrent modifications, but it fixes the most obvious ones. The
remaining details are left to the networking experts.
Patches [1/3] and [2/3] might be stable material as well.
Thanks,
tglx
^ permalink raw reply [flat|nested] 23+ messages in thread
* [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-09 21:59 [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c Thomas Gleixner
@ 2009-07-09 21:59 ` Thomas Gleixner
2009-07-12 20:55 ` David Miller
2009-07-09 21:59 ` [patch 2/3] net: sanitize hrtimer usage " Thomas Gleixner
` (3 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2009-07-09 21:59 UTC (permalink / raw)
To: netdev; +Cc: LKML, David Miller, Patrick McHardy, Peter Zijlstra
[-- Attachment #1: net-sched-cbq-serialize-hrtimer-callback.patch --]
[-- Type: text/plain, Size: 1956 bytes --]
The hrtimer callback cbq_undelay() is not serialized against
cbq_ovl_delay(). That affects at least q->pmask and q->delay_timer.
Lock it proper.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
net/sched/sch_cbq.c | 8 ++++++++
1 file changed, 8 insertions(+)
Index: linux-2.6/net/sched/sch_cbq.c
===================================================================
--- linux-2.6.orig/net/sched/sch_cbq.c
+++ linux-2.6/net/sched/sch_cbq.c
@@ -163,6 +163,7 @@ struct cbq_sched_data
psched_time_t now_rt; /* Cached real time */
unsigned pmask;
+ spinlock_t lock;
struct hrtimer delay_timer;
struct qdisc_watchdog watchdog; /* Watchdog timer,
started when CBQ has
@@ -503,6 +504,9 @@ static void cbq_ovl_delay(struct cbq_cla
cl->undertime = q->now + delay;
if (delay > 0) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&q->lock, flags);
sched += delay + cl->penalty;
cl->penalized = sched;
cl->cpriority = TC_CBQ_MAXPRIO;
@@ -518,6 +522,7 @@ static void cbq_ovl_delay(struct cbq_cla
hrtimer_restart(&q->delay_timer);
cl->delayed = 1;
cl->xstats.overactions++;
+ spin_unlock_irqrestore(&q->lock, flags);
return;
}
delay = 1;
@@ -599,6 +604,7 @@ static enum hrtimer_restart cbq_undelay(
now = psched_get_time();
+ spin_lock(&q->lock);
pmask = q->pmask;
q->pmask = 0;
@@ -623,6 +629,7 @@ static enum hrtimer_restart cbq_undelay(
time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
}
+ spin_unlock(&q->lock);
sch->flags &= ~TCQ_F_THROTTLED;
__netif_schedule(qdisc_root(sch));
@@ -1396,6 +1403,7 @@ static int cbq_init(struct Qdisc *sch, s
q->link.avpkt = q->link.allot/2;
q->link.minidle = -0x7FFFFFFF;
+ spin_lock_init(&q->lock);
qdisc_watchdog_init(&q->watchdog, sch);
hrtimer_init(&q->delay_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
q->delay_timer.function = cbq_undelay;
^ permalink raw reply [flat|nested] 23+ messages in thread
* [patch 2/3] net: sanitize hrtimer usage in sched_cbq
2009-07-09 21:59 [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c Thomas Gleixner
2009-07-09 21:59 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Thomas Gleixner
@ 2009-07-09 21:59 ` Thomas Gleixner
2009-07-09 21:59 ` [patch 3/3] net: use HRTIMER_RESTART " Thomas Gleixner
` (2 subsequent siblings)
4 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2009-07-09 21:59 UTC (permalink / raw)
To: netdev; +Cc: LKML, David Miller, Patrick McHardy, Peter Zijlstra
[-- Attachment #1: net-sched-cbq-sanitize-hrtimer-usage.patch --]
[-- Type: text/plain, Size: 3134 bytes --]
The usage of hrtimer in cbq_ovl_delay() is less than obvious and in
two aspects wrong.
The intention of the code is to arm an hrtimer with the expiry time
X. If the timer is already armed the code needs to check whether the
expiry time X is earlier than the expiry time of the armed timer and
either keep the active timer or rearm it to X. If the timer is not
armed it needs to schedule it to X. That's not what the code does.
It calls hrtimer_try_to_cancel() unconditionally and checks the return
value. If the return value is non zero then it compares the expiry
time of the timer with the new expiry time and picks the earlier
one. The return value check does not take into account that the timer
might run its callback (expressed by a return value of -1). In that
case the expiry time of the timer is probably earlier than the new
expiry time so it rearms the already expired timer with the expiry
value in the past.
If the timer is not active (hrtimer_try_to_cancel() returns 0) it does
not set the new expiry time X but instead restarts the timer with the
expiry time which was active when the timer fired last. That's in the
past as well.
Change the code to check whether the timer is enqueued. If it is
enqueued then the expiry time of the timer is checked against the new
expiry time and it only calls hrtimer_start when the new expiry time
is earlier than the already armed timer. If the timer is not active
then arm it unconditionally with the new expiry time.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Patrick McHardy <kaber@trash.net>
---
net/sched/sch_cbq.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
Index: linux-2.6/net/sched/sch_cbq.c
===================================================================
--- linux-2.6.orig/net/sched/sch_cbq.c
+++ linux-2.6/net/sched/sch_cbq.c
@@ -494,7 +494,6 @@ static void cbq_ovl_delay(struct cbq_cla
if (!cl->delayed) {
psched_time_t sched = q->now;
- ktime_t expires;
delay += cl->offtime;
if (cl->avgidle < 0)
@@ -504,6 +503,7 @@ static void cbq_ovl_delay(struct cbq_cla
cl->undertime = q->now + delay;
if (delay > 0) {
+ ktime_t expires, actexp;
unsigned long flags;
spin_lock_irqsave(&q->lock, flags);
@@ -514,12 +514,19 @@ static void cbq_ovl_delay(struct cbq_cla
expires = ktime_set(0, 0);
expires = ktime_add_ns(expires, PSCHED_TICKS2NS(sched));
- if (hrtimer_try_to_cancel(&q->delay_timer) &&
- ktime_to_ns(ktime_sub(
- hrtimer_get_expires(&q->delay_timer),
- expires)) > 0)
- hrtimer_set_expires(&q->delay_timer, expires);
- hrtimer_restart(&q->delay_timer);
+ /*
+ * If the timer is queued check whether the
+ * new expiry time is earlier than the current
+ * one.
+ */
+ if (hrtimer_is_queued(&q->delay_timer)) {
+ actexp = hrtimer_get_expires(&q->delay_timer);
+ if (expires.tv64 >= actexp.tv64)
+ expires.tv64 = 0;
+ }
+ if (expires.tv64)
+ hrtimer_start(&q->delay_timer, expires,
+ HRTIMER_MODE_ABS);
cl->delayed = 1;
cl->xstats.overactions++;
spin_unlock_irqrestore(&q->lock, flags);
^ permalink raw reply [flat|nested] 23+ messages in thread
* [patch 3/3] net: use HRTIMER_RESTART in sched_cbq
2009-07-09 21:59 [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c Thomas Gleixner
2009-07-09 21:59 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Thomas Gleixner
2009-07-09 21:59 ` [patch 2/3] net: sanitize hrtimer usage " Thomas Gleixner
@ 2009-07-09 21:59 ` Thomas Gleixner
2009-07-10 0:39 ` [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c David Miller
2009-07-12 20:57 ` David Miller
4 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2009-07-09 21:59 UTC (permalink / raw)
To: netdev; +Cc: LKML, David Miller, Patrick McHardy, Peter Zijlstra
[-- Attachment #1: net-sched-cbq-use-hrtimer-restart.patch --]
[-- Type: text/plain, Size: 1320 bytes --]
Restarting a hrtimer from the callback function via hrtimer_start is
inefficient. The canonical way is to modify the expiry value of the
timer and return HRTIMER_RESTART to the hrtimer core code which takes
care of the restart.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
net/sched/sch_cbq.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Index: linux-2.6/net/sched/sch_cbq.c
===================================================================
--- linux-2.6.orig/net/sched/sch_cbq.c
+++ linux-2.6/net/sched/sch_cbq.c
@@ -605,6 +605,7 @@ static enum hrtimer_restart cbq_undelay(
struct cbq_sched_data *q = container_of(timer, struct cbq_sched_data,
delay_timer);
struct Qdisc *sch = q->watchdog.qdisc;
+ enum hrtimer_restart ret = HRTIMER_NORESTART;
psched_time_t now;
psched_tdiff_t delay = 0;
unsigned pmask;
@@ -634,13 +635,14 @@ static enum hrtimer_restart cbq_undelay(
time = ktime_set(0, 0);
time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
- hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
+ hrtimer_set_expires(&q->delay_timer, time);
+ ret = HRTIMER_RESTART;
}
spin_unlock(&q->lock);
sch->flags &= ~TCQ_F_THROTTLED;
__netif_schedule(qdisc_root(sch));
- return HRTIMER_NORESTART;
+ return ret;
}
#ifdef CONFIG_NET_CLS_ACT
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c
2009-07-09 21:59 [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c Thomas Gleixner
` (2 preceding siblings ...)
2009-07-09 21:59 ` [patch 3/3] net: use HRTIMER_RESTART " Thomas Gleixner
@ 2009-07-10 0:39 ` David Miller
2009-07-12 20:57 ` David Miller
4 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2009-07-10 0:39 UTC (permalink / raw)
To: tglx; +Cc: netdev, linux-kernel, kaber, peterz
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 09 Jul 2009 21:59:18 -0000
> While looking at the "Soft-Lockup/Race in networking in 2.6.31-rc1+195
> ( possibly caused by netem)" issue (http://lkml.org/lkml/2009/7/2/519)
> I noticed some serious problems in the hrtimer related code of
> net/sched/sched/sch_cbq.c.
>
> The following patch series addresses these.
>
> I'm not entirely sure whether patch [1/3] covers all possible
> concurrent modifications, but it fixes the most obvious ones. The
> remaining details are left to the networking experts.
>
> Patches [1/3] and [2/3] might be stable material as well.
Thanks a lot for these paches Thomas, I'll look at them more
closely.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-09 21:59 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Thomas Gleixner
@ 2009-07-12 20:55 ` David Miller
2009-07-14 8:22 ` Patrick McHardy
2009-07-14 8:55 ` Thomas Gleixner
0 siblings, 2 replies; 23+ messages in thread
From: David Miller @ 2009-07-12 20:55 UTC (permalink / raw)
To: tglx; +Cc: netdev, linux-kernel, kaber, peterz
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 09 Jul 2009 21:59:22 -0000
> The hrtimer callback cbq_undelay() is not serialized against
> cbq_ovl_delay(). That affects at least q->pmask and q->delay_timer.
>
> Lock it proper.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The problems here are even much deeper than it appears.
First of all, I am to understand that hrtimers run from hardware
interrupt context, right? If so, all of these datastructures are
softirq safe only.
And it is not merely the immediate things you see being modified in
this hrtimer, such as ->pmask etc., it is also the q->active[]
pointers, the list state for the classes, just about everything in the
qdisc state is referenced in this hrtimer code path.
I wonder how many queer unexplainable bugs we see because of this.
What should probably happen is that the hrtimer merely fires off work
at software interrupt context (perhaps a tasklet or similar), and that
software interrupt code take the qdisc's root lock throughout it's
execution.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c
2009-07-09 21:59 [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c Thomas Gleixner
` (3 preceding siblings ...)
2009-07-10 0:39 ` [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c David Miller
@ 2009-07-12 20:57 ` David Miller
4 siblings, 0 replies; 23+ messages in thread
From: David Miller @ 2009-07-12 20:57 UTC (permalink / raw)
To: tglx; +Cc: netdev, linux-kernel, kaber, peterz
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 09 Jul 2009 21:59:18 -0000
> While looking at the "Soft-Lockup/Race in networking in 2.6.31-rc1+195
> ( possibly caused by netem)" issue (http://lkml.org/lkml/2009/7/2/519)
> I noticed some serious problems in the hrtimer related code of
> net/sched/sched/sch_cbq.c.
>
> The following patch series addresses these.
>
> I'm not entirely sure whether patch [1/3] covers all possible
> concurrent modifications, but it fixes the most obvious ones. The
> remaining details are left to the networking experts.
>
> Patches [1/3] and [2/3] might be stable material as well.
I'm holding off on this entire set until we figure out what
to really do with the CBQ hrtimer situation. See my reply
to patch #1, thanks.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-12 20:55 ` David Miller
@ 2009-07-14 8:22 ` Patrick McHardy
2009-07-14 8:30 ` Peter Zijlstra
2009-07-14 16:01 ` David Miller
2009-07-14 8:55 ` Thomas Gleixner
1 sibling, 2 replies; 23+ messages in thread
From: Patrick McHardy @ 2009-07-14 8:22 UTC (permalink / raw)
To: David Miller; +Cc: tglx, netdev, linux-kernel, peterz
[-- Attachment #1: Type: text/plain, Size: 1264 bytes --]
David Miller wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Thu, 09 Jul 2009 21:59:22 -0000
>
>> The hrtimer callback cbq_undelay() is not serialized against
>> cbq_ovl_delay(). That affects at least q->pmask and q->delay_timer.
>>
>> Lock it proper.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> The problems here are even much deeper than it appears.
>
> First of all, I am to understand that hrtimers run from hardware
> interrupt context, right? If so, all of these datastructures are
> softirq safe only.
>
> And it is not merely the immediate things you see being modified in
> this hrtimer, such as ->pmask etc., it is also the q->active[]
> pointers, the list state for the classes, just about everything in the
> qdisc state is referenced in this hrtimer code path.
>
> I wonder how many queer unexplainable bugs we see because of this.
>
> What should probably happen is that the hrtimer merely fires off work
> at software interrupt context (perhaps a tasklet or similar), and that
> software interrupt code take the qdisc's root lock throughout it's
> execution.
That's my understanding what HRTIMER_SOFTIRQ is used for. I think
simply grabbing the root lock in cbq_undelay() should be fine.
Compile-tested only.
[-- Attachment #2: 01.diff --]
[-- Type: text/x-patch, Size: 1229 bytes --]
commit a790fb8873f1cbe8b9cb48cb368851e30d3ec172
Author: Patrick McHardy <kaber@trash.net>
Date: Tue Jul 14 10:19:47 2009 +0200
net-sched: sch_cbq: fix locking in cbq_undelay()
The hrtimer callback cbq_undelay() is not serialized against
cbq_ovl_delay(). That affects at least q->pmask and q->delay_timer.
Lock it proper.
Based on patch by Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 23a1676..7c659c6 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -593,12 +593,16 @@ static enum hrtimer_restart cbq_undelay(struct hrtimer *timer)
struct cbq_sched_data *q = container_of(timer, struct cbq_sched_data,
delay_timer);
struct Qdisc *sch = q->watchdog.qdisc;
+ spinlock_t *root_lock;
psched_time_t now;
psched_tdiff_t delay = 0;
unsigned pmask;
now = psched_get_time();
+ root_lock = qdisc_lock(qdisc_root(sch));
+ spin_lock(root_lock);
+
pmask = q->pmask;
q->pmask = 0;
@@ -615,6 +619,7 @@ static enum hrtimer_restart cbq_undelay(struct hrtimer *timer)
delay = tmp;
}
}
+ spin_unlock(root_lock);
if (delay) {
ktime_t time;
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 8:22 ` Patrick McHardy
@ 2009-07-14 8:30 ` Peter Zijlstra
2009-07-14 16:01 ` David Miller
1 sibling, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2009-07-14 8:30 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, tglx, netdev, linux-kernel
On Tue, 2009-07-14 at 10:22 +0200, Patrick McHardy wrote:
>
> That's my understanding what HRTIMER_SOFTIRQ is used for. I think
> simply grabbing the root lock in cbq_undelay() should be fine.
Its not, and its going away soon (again) :-)
The current use of HRTIMER_SOFTIRQ is for when we enqueue a hrtimer with
an expiration time in the past. The current implementation tries to run
the timer instantly, however we cannot do it from the context calling
hrtimer_start(), since that might be holding locks the timer callback
also wants to hold, resulting in deadlocks.
Instead we queue the timer to the softirq, and kick the softirq. Which
leads to another problem in that we cannot always kick the softirq (esp
from within the scheduler).
We're going to change hrtimer_start() to return -ETIME instead of trying
to run the timer in-place, leaving the callers to figure it out.
The basic patch is done, but I still need to audit all the hrtimer users
in the kernel.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-12 20:55 ` David Miller
2009-07-14 8:22 ` Patrick McHardy
@ 2009-07-14 8:55 ` Thomas Gleixner
2009-07-14 16:00 ` David Miller
1 sibling, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2009-07-14 8:55 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-kernel, kaber, peterz
David,
On Sun, 12 Jul 2009, David Miller wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Thu, 09 Jul 2009 21:59:22 -0000
>
> > The hrtimer callback cbq_undelay() is not serialized against
> > cbq_ovl_delay(). That affects at least q->pmask and q->delay_timer.
> >
> > Lock it proper.
> >
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>
> The problems here are even much deeper than it appears.
>
> First of all, I am to understand that hrtimers run from hardware
> interrupt context, right? If so, all of these datastructures are
> softirq safe only.
>
> And it is not merely the immediate things you see being modified in
> this hrtimer, such as ->pmask etc., it is also the q->active[]
> pointers, the list state for the classes, just about everything in the
> qdisc state is referenced in this hrtimer code path.
That's what I was worried about.
> I wonder how many queer unexplainable bugs we see because of this.
>
> What should probably happen is that the hrtimer merely fires off work
> at software interrupt context (perhaps a tasklet or similar), and that
> software interrupt code take the qdisc's root lock throughout it's
> execution.
Sigh, I almost expected that the removal of the callback modes will
fire back some day.
Thanks,
tglx
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 8:55 ` Thomas Gleixner
@ 2009-07-14 16:00 ` David Miller
2009-07-14 16:28 ` Peter Zijlstra
2009-07-15 9:56 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Oliver Hartkopp
0 siblings, 2 replies; 23+ messages in thread
From: David Miller @ 2009-07-14 16:00 UTC (permalink / raw)
To: tglx; +Cc: netdev, linux-kernel, kaber, peterz
From: Thomas Gleixner <tglx@linutronix.de>
Date: Tue, 14 Jul 2009 10:55:14 +0200 (CEST)
> David,
>
> On Sun, 12 Jul 2009, David Miller wrote:
>
>> What should probably happen is that the hrtimer merely fires off work
>> at software interrupt context (perhaps a tasklet or similar), and that
>> software interrupt code take the qdisc's root lock throughout it's
>> execution.
>
> Sigh, I almost expected that the removal of the callback modes will
> fire back some day.
Well this makes hrtimers decidedly less useful for networking and we
have a ton of bugs right now, basically in every hrtimer used by the
networking currently.
The only way we can use them, as things currently stand, is as
triggers for softirq work.
Is it really that troublesome to provide this kind of facility
generically, rather than having various subsystems replicate such code
where they want to use hrtimers and are restricted to softirqs?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 8:22 ` Patrick McHardy
2009-07-14 8:30 ` Peter Zijlstra
@ 2009-07-14 16:01 ` David Miller
1 sibling, 0 replies; 23+ messages in thread
From: David Miller @ 2009-07-14 16:01 UTC (permalink / raw)
To: kaber; +Cc: tglx, netdev, linux-kernel, peterz
From: Patrick McHardy <kaber@trash.net>
Date: Tue, 14 Jul 2009 10:22:02 +0200
> That's my understanding what HRTIMER_SOFTIRQ is used for. I think
> simply grabbing the root lock in cbq_undelay() should be fine.
>
> Compile-tested only.
Unfortunately, as Peter and Thomas explained, this is not the case.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 16:00 ` David Miller
@ 2009-07-14 16:28 ` Peter Zijlstra
2009-07-14 16:42 ` Linus Torvalds
2009-07-15 9:56 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Oliver Hartkopp
1 sibling, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2009-07-14 16:28 UTC (permalink / raw)
To: David Miller; +Cc: tglx, netdev, linux-kernel, kaber, Linus Torvalds
On Tue, 2009-07-14 at 09:00 -0700, David Miller wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Tue, 14 Jul 2009 10:55:14 +0200 (CEST)
> > On Sun, 12 Jul 2009, David Miller wrote:
> >
> >> What should probably happen is that the hrtimer merely fires off work
> >> at software interrupt context (perhaps a tasklet or similar), and that
> >> software interrupt code take the qdisc's root lock throughout it's
> >> execution.
> >
> > Sigh, I almost expected that the removal of the callback modes will
> > fire back some day.
>
> Well this makes hrtimers decidedly less useful for networking and we
> have a ton of bugs right now, basically in every hrtimer used by the
> networking currently.
>
> The only way we can use them, as things currently stand, is as
> triggers for softirq work.
>
> Is it really that troublesome to provide this kind of facility
> generically, rather than having various subsystems replicate such code
> where they want to use hrtimers and are restricted to softirqs?
Linus really hated the softirq mode, which is what prompted me to change
that.
Now, it might be he only hated the particular interface and the
resulting code, but I think to remember he simply thought the whole
thing daft.
I can look into adding it back if we can agree on the interface and code
impact, but looking at:
# git grep hrtimer_init net/ | sort -u
net/can/bcm.c: hrtimer_init(&op->thrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
net/can/bcm.c: hrtimer_init(&op->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
net/sched/sch_api.c: hrtimer_init(&wd->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
net/sched/sch_cbq.c: hrtimer_init(&q->delay_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
I wonder if its worth the impact on the core kernel code, or whether its
better for these few timers to kick off a tasklet or the like.
Further, I don't think a lot of subsystems would need this, as the
general trend is away from softirqs/tasklets and towards
threads/workqueues as most people want to schedule. And for those
hardirq hrtimers are good enough as a wakeup source.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 16:28 ` Peter Zijlstra
@ 2009-07-14 16:42 ` Linus Torvalds
2009-07-17 12:14 ` Peter Zijlstra
0 siblings, 1 reply; 23+ messages in thread
From: Linus Torvalds @ 2009-07-14 16:42 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: David Miller, tglx, netdev, linux-kernel, kaber
On Tue, 14 Jul 2009, Peter Zijlstra wrote:
>
> Linus really hated the softirq mode, which is what prompted me to change
> that.
>
> Now, it might be he only hated the particular interface and the
> resulting code, but I think to remember he simply thought the whole
> thing daft.
Yes. And I hated the bugs it had.
Don't make something as core as timers any more complicated. Don't take
locks in timers and then complain about deadlocks. If your locking is
broken, don't make the core timers be idiotically broken.
Because it was. The code was a total mess to follow, and had bugs.
Linus
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 16:00 ` David Miller
2009-07-14 16:28 ` Peter Zijlstra
@ 2009-07-15 9:56 ` Oliver Hartkopp
1 sibling, 0 replies; 23+ messages in thread
From: Oliver Hartkopp @ 2009-07-15 9:56 UTC (permalink / raw)
To: David Miller, tglx; +Cc: netdev, linux-kernel, kaber, peterz
David Miller wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> Date: Tue, 14 Jul 2009 10:55:14 +0200 (CEST)
>
>> David,
>>
>> On Sun, 12 Jul 2009, David Miller wrote:
>>
>>> What should probably happen is that the hrtimer merely fires off work
>>> at software interrupt context (perhaps a tasklet or similar), and that
>>> software interrupt code take the qdisc's root lock throughout it's
>>> execution.
>> Sigh, I almost expected that the removal of the callback modes will
>> fire back some day.
>
> Well this makes hrtimers decidedly less useful for networking and we
> have a ton of bugs right now, basically in every hrtimer used by the
> networking currently.
The CAN stuff is clean in this topic. See below.
>
> The only way we can use them, as things currently stand, is as
> triggers for softirq work.
>
> Is it really that troublesome to provide this kind of facility
> generically, rather than having various subsystems replicate such code
> where they want to use hrtimers and are restricted to softirqs?
Indeed this had been my concerns also, when i moved the hrtimer usage in a CAN
protocol to use tasklets.
("can: update can-bcm for hrtimer hardirq callbacks")
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
due to
("hrtimer: removing all ur callback modes")
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ca109491f612aab5c8152207631c0444f63da97f
I was not very amused that time and wanted to NACK that change, but Linus said:
"Quite frankly, your NAK doesn't matter.
We've had too many bugs in hrtimers. They _will_ get simplified."
(http://lkml.indiana.edu/hypermail/linux/kernel/0812.1/00218.html)
Thomas, is there chance to get this nice simple possibility back to invoke at
least a hrtimer for SOFT_IRQ context additional to the current functionality??
I would expect this to save lot's of tasklet code that is - and will be -
created due to the lack of the hrtimers softirq capability ...
FWIK there were really many callback modes before (per-cpu stuff and so), that
were probably too much. But having a SOFT_IRQ callback mode again would really
help.
Regards,
Oliver
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-14 16:42 ` Linus Torvalds
@ 2009-07-17 12:14 ` Peter Zijlstra
2009-07-17 13:26 ` Oliver Hartkopp
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Peter Zijlstra @ 2009-07-17 12:14 UTC (permalink / raw)
To: Linus Torvalds; +Cc: David Miller, tglx, netdev, linux-kernel, kaber
On Tue, 2009-07-14 at 09:42 -0700, Linus Torvalds wrote:
>
> On Tue, 14 Jul 2009, Peter Zijlstra wrote:
> >
> > Linus really hated the softirq mode, which is what prompted me to change
> > that.
> >
> > Now, it might be he only hated the particular interface and the
> > resulting code, but I think to remember he simply thought the whole
> > thing daft.
>
> Yes. And I hated the bugs it had.
>
> Don't make something as core as timers any more complicated. Don't take
> locks in timers and then complain about deadlocks. If your locking is
> broken, don't make the core timers be idiotically broken.
>
> Because it was. The code was a total mess to follow, and had bugs.
How would something like the below work for people?
---
include/linux/hrtimer.h | 22 ++++++++++++++++++++--
kernel/hrtimer.c | 23 ++++++++++++++++++++++-
2 files changed, 42 insertions(+), 3 deletions(-)
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 4759917..e7559fe 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -22,6 +22,7 @@
#include <linux/wait.h>
#include <linux/percpu.h>
#include <linux/timer.h>
+#include <linux/interrupt.h>
struct hrtimer_clock_base;
@@ -91,7 +92,6 @@ enum hrtimer_restart {
* @function: timer expiry callback function
* @base: pointer to the timer base (per cpu and per clock)
* @state: state information (See bit values above)
- * @cb_entry: list head to enqueue an expired timer into the callback list
* @start_site: timer statistics field to store the site where the timer
* was started
* @start_comm: timer statistics field to store the name of the process which
@@ -108,7 +108,6 @@ struct hrtimer {
enum hrtimer_restart (*function)(struct hrtimer *);
struct hrtimer_clock_base *base;
unsigned long state;
- struct list_head cb_entry;
#ifdef CONFIG_TIMER_STATS
int start_pid;
void *start_site;
@@ -116,6 +115,12 @@ struct hrtimer {
#endif
};
+struct hrtimer_softirq {
+ struct hrtimer timer;
+ struct tasklet_struct tasklet;
+ enum hrtimer_restart (*function)(struct hrtimer *);
+};
+
/**
* struct hrtimer_sleeper - simple sleeper structure
* @timer: embedded timer structure
@@ -335,6 +340,19 @@ static inline void hrtimer_init_on_stack(struct hrtimer *timer,
static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
#endif
+enum hrtimer_restart __hrtimer_softirq_trampoline(struct hrtimer *timer);
+void __hrtimer_tasklet_trampoline(unsigned long data);
+
+static inline void hrtimer_softirq_init(struct hrtimer_softirq *stimer,
+ enum hrtimer_restart (*func)(struct hrtimer *),
+ clockid_t which_clock, enum hrtimer_mode mode)
+{
+ hrtimer_init(&stimer->timer, which_clock, mode);
+ stimer->timer.function = __hrtimer_softirq_trampoline;
+ tasklet_init(&stimer->tasklet, __hrtimer_tasklet_trampoline, stimer);
+ stimer->function = func;
+}
+
/* Basic timer operations: */
extern int hrtimer_start(struct hrtimer *timer, ktime_t tim,
const enum hrtimer_mode mode);
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index ab5eb70..dae063c 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1098,7 +1098,6 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
clock_id = CLOCK_MONOTONIC;
timer->base = &cpu_base->clock_base[clock_id];
- INIT_LIST_HEAD(&timer->cb_entry);
hrtimer_init_timer_hres(timer);
#ifdef CONFIG_TIMER_STATS
@@ -1141,6 +1140,28 @@ int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
}
EXPORT_SYMBOL_GPL(hrtimer_get_res);
+enum hrtimer_restart __hrtimer_softirq_trampoline(struct hrtimer *timer)
+{
+ struct hrtimer_softirq *stimer =
+ container_of(timer, struct hrtimer_softirq, timer);
+
+ tasklet_hi_schedule(&timer->tasklet);
+
+ return HRTIMER_NORESTART;
+}
+EXPORT_SYMBOL_GPL(__hrtimer_softirq_trampoline);
+
+void __hrtimer_tasklet_trampoline(unsigned long data)
+{
+ struct hrtimer_softirq *stimer = (void *)data;
+ enum hrtimer_restart restart;
+
+ restart = stimer->function(&stimer->timer);
+ if (restart != HRTIMER_NORESTART)
+ hrtimer_restart(&stimer->timer);
+}
+EXPORT_SYMBOL_GPL(__hrtimer_tasklet_trampoline);
+
static void __run_hrtimer(struct hrtimer *timer)
{
struct hrtimer_clock_base *base = timer->base;
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-17 12:14 ` Peter Zijlstra
@ 2009-07-17 13:26 ` Oliver Hartkopp
2009-07-17 15:44 ` Linus Torvalds
2009-07-22 3:18 ` David Miller
2 siblings, 0 replies; 23+ messages in thread
From: Oliver Hartkopp @ 2009-07-17 13:26 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Linus Torvalds, David Miller, tglx, netdev, linux-kernel, kaber
Peter Zijlstra wrote:
> On Tue, 2009-07-14 at 09:42 -0700, Linus Torvalds wrote:
>> On Tue, 14 Jul 2009, Peter Zijlstra wrote:
>>> Linus really hated the softirq mode, which is what prompted me to change
>>> that.
>>>
>>> Now, it might be he only hated the particular interface and the
>>> resulting code, but I think to remember he simply thought the whole
>>> thing daft.
>> Yes. And I hated the bugs it had.
>>
>> Don't make something as core as timers any more complicated. Don't take
>> locks in timers and then complain about deadlocks. If your locking is
>> broken, don't make the core timers be idiotically broken.
>>
>> Because it was. The code was a total mess to follow, and had bugs.
>
> How would something like the below work for people?
Would be fine to me.
It reduces the duplicated code as well as private structs for hrtimers &
tasklets. And finally your suggestion preserves the proper separation of the
hrtimers and the tasklets that are used as underlying concepts.
Regards,
Oliver (who wrote net/can/bcm.c)
>
> ---
> include/linux/hrtimer.h | 22 ++++++++++++++++++++--
> kernel/hrtimer.c | 23 ++++++++++++++++++++++-
> 2 files changed, 42 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
> index 4759917..e7559fe 100644
> --- a/include/linux/hrtimer.h
> +++ b/include/linux/hrtimer.h
> @@ -22,6 +22,7 @@
> #include <linux/wait.h>
> #include <linux/percpu.h>
> #include <linux/timer.h>
> +#include <linux/interrupt.h>
>
>
> struct hrtimer_clock_base;
> @@ -91,7 +92,6 @@ enum hrtimer_restart {
> * @function: timer expiry callback function
> * @base: pointer to the timer base (per cpu and per clock)
> * @state: state information (See bit values above)
> - * @cb_entry: list head to enqueue an expired timer into the callback list
> * @start_site: timer statistics field to store the site where the timer
> * was started
> * @start_comm: timer statistics field to store the name of the process which
> @@ -108,7 +108,6 @@ struct hrtimer {
> enum hrtimer_restart (*function)(struct hrtimer *);
> struct hrtimer_clock_base *base;
> unsigned long state;
> - struct list_head cb_entry;
> #ifdef CONFIG_TIMER_STATS
> int start_pid;
> void *start_site;
> @@ -116,6 +115,12 @@ struct hrtimer {
> #endif
> };
>
> +struct hrtimer_softirq {
> + struct hrtimer timer;
> + struct tasklet_struct tasklet;
> + enum hrtimer_restart (*function)(struct hrtimer *);
> +};
> +
> /**
> * struct hrtimer_sleeper - simple sleeper structure
> * @timer: embedded timer structure
> @@ -335,6 +340,19 @@ static inline void hrtimer_init_on_stack(struct hrtimer *timer,
> static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
> #endif
>
> +enum hrtimer_restart __hrtimer_softirq_trampoline(struct hrtimer *timer);
> +void __hrtimer_tasklet_trampoline(unsigned long data);
> +
> +static inline void hrtimer_softirq_init(struct hrtimer_softirq *stimer,
> + enum hrtimer_restart (*func)(struct hrtimer *),
> + clockid_t which_clock, enum hrtimer_mode mode)
> +{
> + hrtimer_init(&stimer->timer, which_clock, mode);
> + stimer->timer.function = __hrtimer_softirq_trampoline;
> + tasklet_init(&stimer->tasklet, __hrtimer_tasklet_trampoline, stimer);
> + stimer->function = func;
> +}
> +
> /* Basic timer operations: */
> extern int hrtimer_start(struct hrtimer *timer, ktime_t tim,
> const enum hrtimer_mode mode);
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index ab5eb70..dae063c 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1098,7 +1098,6 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
> clock_id = CLOCK_MONOTONIC;
>
> timer->base = &cpu_base->clock_base[clock_id];
> - INIT_LIST_HEAD(&timer->cb_entry);
> hrtimer_init_timer_hres(timer);
>
> #ifdef CONFIG_TIMER_STATS
> @@ -1141,6 +1140,28 @@ int hrtimer_get_res(const clockid_t which_clock, struct timespec *tp)
> }
> EXPORT_SYMBOL_GPL(hrtimer_get_res);
>
> +enum hrtimer_restart __hrtimer_softirq_trampoline(struct hrtimer *timer)
> +{
> + struct hrtimer_softirq *stimer =
> + container_of(timer, struct hrtimer_softirq, timer);
> +
> + tasklet_hi_schedule(&timer->tasklet);
> +
> + return HRTIMER_NORESTART;
> +}
> +EXPORT_SYMBOL_GPL(__hrtimer_softirq_trampoline);
> +
> +void __hrtimer_tasklet_trampoline(unsigned long data)
> +{
> + struct hrtimer_softirq *stimer = (void *)data;
> + enum hrtimer_restart restart;
> +
> + restart = stimer->function(&stimer->timer);
> + if (restart != HRTIMER_NORESTART)
> + hrtimer_restart(&stimer->timer);
> +}
> +EXPORT_SYMBOL_GPL(__hrtimer_tasklet_trampoline);
> +
> static void __run_hrtimer(struct hrtimer *timer)
> {
> struct hrtimer_clock_base *base = timer->base;
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-17 12:14 ` Peter Zijlstra
2009-07-17 13:26 ` Oliver Hartkopp
@ 2009-07-17 15:44 ` Linus Torvalds
2009-07-22 3:18 ` David Miller
2 siblings, 0 replies; 23+ messages in thread
From: Linus Torvalds @ 2009-07-17 15:44 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: David Miller, tglx, netdev, linux-kernel, kaber
On Fri, 17 Jul 2009, Peter Zijlstra wrote:
>
> How would something like the below work for people?
This looks saner.
It was the insanity of having the core timer code know about different
modes that caused all the sily problems.
Having a separate abstraction layer for "I want to get a softirq timeout"
sounds fine, as long as the timer code itself never cares.
That said, I don't think this shoud be a "hrtimer" issue (reflected in
your naming and include file choice). I think this is a softirq or tasklet
(or whatever) issue, and should be named that way.
Why should the timer code (and header files) care about how you can use
tasklets with them? It shouldn't. The timers should be seen as the really
low-level critical code, and the timer code should never need to know
about softirq's or tasklets or whatever.
So I think you shouldmove it to kernel/softirq.c.
Linus
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-17 12:14 ` Peter Zijlstra
2009-07-17 13:26 ` Oliver Hartkopp
2009-07-17 15:44 ` Linus Torvalds
@ 2009-07-22 3:18 ` David Miller
2009-07-22 6:29 ` Peter Zijlstra
2009-07-22 12:28 ` [PATCH] softirq: tasklet_hrtimer Peter Zijlstra
2 siblings, 2 replies; 23+ messages in thread
From: David Miller @ 2009-07-22 3:18 UTC (permalink / raw)
To: peterz; +Cc: torvalds, tglx, netdev, linux-kernel, kaber
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri, 17 Jul 2009 14:14:52 +0200
> On Tue, 2009-07-14 at 09:42 -0700, Linus Torvalds wrote:
>>
>> On Tue, 14 Jul 2009, Peter Zijlstra wrote:
>> >
>> > Linus really hated the softirq mode, which is what prompted me to change
>> > that.
>> >
>> > Now, it might be he only hated the particular interface and the
>> > resulting code, but I think to remember he simply thought the whole
>> > thing daft.
>>
>> Yes. And I hated the bugs it had.
>>
>> Don't make something as core as timers any more complicated. Don't take
>> locks in timers and then complain about deadlocks. If your locking is
>> broken, don't make the core timers be idiotically broken.
>>
>> Because it was. The code was a total mess to follow, and had bugs.
>
> How would something like the below work for people?
I like it, but like Linus said it probably belongs in
kernel/softirq.c
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch 1/3] net: serialize hrtimer callback in sched_cbq
2009-07-22 3:18 ` David Miller
@ 2009-07-22 6:29 ` Peter Zijlstra
2009-07-22 12:28 ` [PATCH] softirq: tasklet_hrtimer Peter Zijlstra
1 sibling, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2009-07-22 6:29 UTC (permalink / raw)
To: David Miller; +Cc: torvalds, tglx, netdev, linux-kernel, kaber
On Tue, 2009-07-21 at 20:18 -0700, David Miller wrote:
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Fri, 17 Jul 2009 14:14:52 +0200
>
> > On Tue, 2009-07-14 at 09:42 -0700, Linus Torvalds wrote:
> >>
> >> On Tue, 14 Jul 2009, Peter Zijlstra wrote:
> >> >
> >> > Linus really hated the softirq mode, which is what prompted me to change
> >> > that.
> >> >
> >> > Now, it might be he only hated the particular interface and the
> >> > resulting code, but I think to remember he simply thought the whole
> >> > thing daft.
> >>
> >> Yes. And I hated the bugs it had.
> >>
> >> Don't make something as core as timers any more complicated. Don't take
> >> locks in timers and then complain about deadlocks. If your locking is
> >> broken, don't make the core timers be idiotically broken.
> >>
> >> Because it was. The code was a total mess to follow, and had bugs.
> >
> > How would something like the below work for people?
>
> I like it, but like Linus said it probably belongs in
> kernel/softirq.c
Right, I meant to rework it, but stuff kept preempting me. I'll try and
get around to it today :-)
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] softirq: tasklet_hrtimer
2009-07-22 3:18 ` David Miller
2009-07-22 6:29 ` Peter Zijlstra
@ 2009-07-22 12:28 ` Peter Zijlstra
2009-07-22 15:39 ` David Miller
2009-07-22 16:01 ` Linus Torvalds
1 sibling, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2009-07-22 12:28 UTC (permalink / raw)
To: David Miller; +Cc: torvalds, tglx, netdev, linux-kernel, kaber
Thomas, will you take this?
---
Subject: softirq: tasklet_hrtimer
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Jul 22 14:18:35 CEST 2009
Stick tasklets and hrtimers together to provide an in-softirq hrtimer
experience.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
include/linux/interrupt.h | 11 +++++++++++
kernel/softirq.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 54 insertions(+), 1 deletion(-)
Index: linux-2.6/include/linux/interrupt.h
===================================================================
--- linux-2.6.orig/include/linux/interrupt.h
+++ linux-2.6/include/linux/interrupt.h
@@ -14,6 +14,7 @@
#include <linux/irqflags.h>
#include <linux/smp.h>
#include <linux/percpu.h>
+#include <linux/hrtimer.h>
#include <asm/atomic.h>
#include <asm/ptrace.h>
@@ -517,6 +518,16 @@ extern void tasklet_kill_immediate(struc
extern void tasklet_init(struct tasklet_struct *t,
void (*func)(unsigned long), unsigned long data);
+struct tasklet_hrtimer {
+ struct hrtimer timer;
+ struct tasklet_struct tasklet;
+ enum hrtimer_restart (*function)(struct hrtimer *);
+};
+
+void tasklet_hrtimer_init(struct tasklet_hrtimer *ttimer,
+ enum hrtimer_restart (*function)(struct hrtimer *),
+ clockid_t which_clock, enum hrtimer_mode mode);
+
/*
* Autoprobing for irqs:
*
Index: linux-2.6/kernel/softirq.c
===================================================================
--- linux-2.6.orig/kernel/softirq.c
+++ linux-2.6/kernel/softirq.c
@@ -345,7 +345,9 @@ void open_softirq(int nr, void (*action)
softirq_vec[nr].action = action;
}
-/* Tasklets */
+/*
+ * Tasklets
+ */
struct tasklet_head
{
struct tasklet_struct *head;
@@ -493,6 +495,46 @@ void tasklet_kill(struct tasklet_struct
EXPORT_SYMBOL(tasklet_kill);
+/*
+ * tasklet_hrtimer
+ */
+
+static enum hrtimer_restart __hrtimer_tasklet_trampoline(struct hrtimer *timer)
+{
+ struct tasklet_hrtimer *ttimer =
+ container_of(timer, struct tasklet_hrtimer, timer);
+
+ tasklet_hi_schedule(&ttimer->tasklet);
+
+ return HRTIMER_NORESTART;
+}
+
+static void __tasklet_hrtimer_trampoline(unsigned long data)
+{
+ struct tasklet_hrtimer *ttimer = (void *)data;
+ enum hrtimer_restart restart;
+
+ restart = ttimer->function(&ttimer->timer);
+ if (restart != HRTIMER_NORESTART)
+ hrtimer_restart(&ttimer->timer);
+}
+
+void tasklet_hrtimer_init(struct tasklet_hrtimer *ttimer,
+ enum hrtimer_restart (*function)(struct hrtimer *),
+ clockid_t which_clock, enum hrtimer_mode mode)
+{
+ hrtimer_init(&ttimer->timer, which_clock, mode);
+ ttimer->timer.function = __hrtimer_tasklet_trampoline;
+ tasklet_init(&ttimer->tasklet, __tasklet_hrtimer_trampoline,
+ (unsigned long)ttimer);
+ ttimer->function = function;
+}
+EXPORT_SYMBOL_GPL(tasklet_hrtimer_init);
+
+/*
+ * Remote softirq bits
+ */
+
DEFINE_PER_CPU(struct list_head [NR_SOFTIRQS], softirq_work_list);
EXPORT_PER_CPU_SYMBOL(softirq_work_list);
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] softirq: tasklet_hrtimer
2009-07-22 12:28 ` [PATCH] softirq: tasklet_hrtimer Peter Zijlstra
@ 2009-07-22 15:39 ` David Miller
2009-07-22 16:01 ` Linus Torvalds
1 sibling, 0 replies; 23+ messages in thread
From: David Miller @ 2009-07-22 15:39 UTC (permalink / raw)
To: peterz; +Cc: torvalds, tglx, netdev, linux-kernel, kaber
From: Peter Zijlstra <peterz@infradead.org>
Date: Wed, 22 Jul 2009 14:28:44 +0200
> Thomas, will you take this?
>
> ---
> Subject: softirq: tasklet_hrtimer
> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed Jul 22 14:18:35 CEST 2009
>
> Stick tasklets and hrtimers together to provide an in-softirq hrtimer
> experience.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] softirq: tasklet_hrtimer
2009-07-22 12:28 ` [PATCH] softirq: tasklet_hrtimer Peter Zijlstra
2009-07-22 15:39 ` David Miller
@ 2009-07-22 16:01 ` Linus Torvalds
1 sibling, 0 replies; 23+ messages in thread
From: Linus Torvalds @ 2009-07-22 16:01 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: David Miller, tglx, netdev, linux-kernel, kaber
On Wed, 22 Jul 2009, Peter Zijlstra wrote:
>
> Stick tasklets and hrtimers together to provide an in-softirq hrtimer
> experience.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Looks ok by me now.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Thanks,
Linus
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2009-07-22 16:02 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-09 21:59 [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c Thomas Gleixner
2009-07-09 21:59 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Thomas Gleixner
2009-07-12 20:55 ` David Miller
2009-07-14 8:22 ` Patrick McHardy
2009-07-14 8:30 ` Peter Zijlstra
2009-07-14 16:01 ` David Miller
2009-07-14 8:55 ` Thomas Gleixner
2009-07-14 16:00 ` David Miller
2009-07-14 16:28 ` Peter Zijlstra
2009-07-14 16:42 ` Linus Torvalds
2009-07-17 12:14 ` Peter Zijlstra
2009-07-17 13:26 ` Oliver Hartkopp
2009-07-17 15:44 ` Linus Torvalds
2009-07-22 3:18 ` David Miller
2009-07-22 6:29 ` Peter Zijlstra
2009-07-22 12:28 ` [PATCH] softirq: tasklet_hrtimer Peter Zijlstra
2009-07-22 15:39 ` David Miller
2009-07-22 16:01 ` Linus Torvalds
2009-07-15 9:56 ` [patch 1/3] net: serialize hrtimer callback in sched_cbq Oliver Hartkopp
2009-07-09 21:59 ` [patch 2/3] net: sanitize hrtimer usage " Thomas Gleixner
2009-07-09 21:59 ` [patch 3/3] net: use HRTIMER_RESTART " Thomas Gleixner
2009-07-10 0:39 ` [patch 0/3] net: Sanitizing hrtimer usage in net/sched/sch_cbq.c David Miller
2009-07-12 20:57 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).