linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2][RT] hrtimers stuck in waitqueue
@ 2008-08-18 14:42 Gilles Carry
  2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
  2008-08-18 14:42 ` [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup Gilles Carry
  0 siblings, 2 replies; 13+ messages in thread
From: Gilles Carry @ 2008-08-18 14:42 UTC (permalink / raw)
  To: linux-rt-users
  Cc: tglx, mingo, tinytim, jean-pierre.dion, sebastien.dugue,
	gilles.carry



Hello,

These patches are to fix a bug for high resolution timers initialized by
hrtimer_init_sleeper (nanosleep and futexes) which can get stuck on a
wait queue.
They apply onto 2.6.26-rt1

The below test shows up the bug. Though the test hangs immediately on
my ppc64 (8 CPU), it can takes tens of minutes on my x86_64 (8 CPU).
(kernel must feature: CONFIG_HIGH_RES_TIMERS=y)

#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

#define NUM_THREADS     30
#define NUM_LOOPS       10000

void *worker_thread(void *arg)
{
        long id = (long)arg;
        int i;

        for (i = 0; i < NUM_LOOPS; i++) {
                usleep(1000);
        }

        printf("thread %02ld done\n", id+1);

        return NULL;
}

int main(int argc, char* argv[])
{
        int i;
        struct sched_param param;
        pthread_attr_t attr;
        pthread_t *threads;

        if ((threads = malloc(NUM_THREADS * sizeof(pthread_t))) == NULL)
{
                perror("Failed to allocate threads\n");
                return 1;
        }

        param.sched_priority = sched_get_priority_min(SCHED_FIFO);
        pthread_attr_init(&attr);
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedparam(&attr, &param);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);

        /* start threads */
        for (i = 0; i < NUM_THREADS; i++) {
                if (pthread_create(&threads[i], &attr,
                                   worker_thread, (void *)(long)i))
                        perror("Failed to create thread\n");
        }

        pthread_attr_destroy(&attr);

        for (i = 0; i < NUM_THREADS; i++)
                pthread_join(threads[i], NULL);

        free(threads);

        return 0;
}


This occurs when hrtimer_interrupt is very busy and some awakened
threads enter hrtimer_cancel before hrtimer_interrupt has changed the
timer status. These threads are queued on a wait queue and are almost
never awakened since HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers are not supposed
to raise a softirq.
They would sometimes be awakened and only when another timer awakes and
uses a softirq call back set on the same CPU!!!
Before the patch, I could unlock them all by flooding the system with the
below program in order to run softirq timers with the same CB mode
on all CPUs.

#include <unistd.h>

main() {
	alarm(1);
	pause();
}

Adding traces (not included in this patch) to /proc/timer_list did
help to track the bug.


The second patch is a code cleanup that makes the code more readable.

I have run flawlessly the above test with the patched kernel for
~100 hours on two 8-way systems: x86_64 and ppc64 (power 6)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/2] [RT] hrtimers stuck in waitqueue
  2008-08-18 14:42 [PATCH 0/2][RT] hrtimers stuck in waitqueue Gilles Carry
@ 2008-08-18 14:42 ` Gilles Carry
  2008-08-19 14:10   ` Gregory Haskins
                     ` (2 more replies)
  2008-08-18 14:42 ` [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup Gilles Carry
  1 sibling, 3 replies; 13+ messages in thread
From: Gilles Carry @ 2008-08-18 14:42 UTC (permalink / raw)
  To: linux-rt-users
  Cc: tglx, mingo, tinytim, jean-pierre.dion, sebastien.dugue,
	gilles.carry

This patch makes hrtimers initialized with hrtimer_init_sleeper
to use another mode and then not be stuck in waitqueues when
hrtimer_interrupt is very busy.

The new mode is HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
The above-mentionned timers have been moved from
HRTIMER_CB_IRQSAFE_NO_SOFTIRQ to
HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.

HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ timers use a slightly different
state machine from HRTIMER_CB_IRQSAFE_NO_SOFTIRQ's as when removing the
timer, __run_hrtimer sets the status to INACTIVE _then_
wakes up the thread. This way, an awakened thread cannot enter
hrtimer_cancel before the timer's status has changed.

Signed-off-by: Gilles Carry <gilles.carry@bull.net>

---
 include/linux/hrtimer.h |    4 ++++
 kernel/hrtimer.c        |   26 +++++++++++++++++++++-----
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index f3867fd..53f4d44 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -49,12 +49,16 @@ enum hrtimer_restart {
  *					does not restart the timer
  *	HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:	Callback must run in hardirq context
  *					Special mode for tick emultation
+ *	HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
+ *					Callback must run in hardirq context and
+ *					does not restart the timer
  */
 enum hrtimer_cb_mode {
 	HRTIMER_CB_SOFTIRQ,
 	HRTIMER_CB_IRQSAFE,
 	HRTIMER_CB_IRQSAFE_NO_RESTART,
 	HRTIMER_CB_IRQSAFE_NO_SOFTIRQ,
+	HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ,
 };
 
 /*
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index fa63a24..778eb7e 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -668,6 +668,7 @@ static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
 		/* Timer is expired, act upon the callback mode */
 		switch(timer->cb_mode) {
 		case HRTIMER_CB_IRQSAFE_NO_RESTART:
+		case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
 			debug_hrtimer_deactivate(timer);
 			/*
 			 * We can call the callback from here. No restart
@@ -1089,6 +1090,8 @@ int hrtimer_cancel(struct hrtimer *timer)
 
 		if (ret >= 0)
 			return ret;
+		WARN_ON (timer->cb_mode ==
+			HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ);
 		hrtimer_wait_for_timer(timer);
 	}
 }
@@ -1298,11 +1301,18 @@ static void __run_hrtimer(struct hrtimer *timer)
 	int restart;
 
 	debug_hrtimer_deactivate(timer);
-	__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
-	timer_stats_account_hrtimer(timer);
 
 	fn = timer->function;
-	if (timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_SOFTIRQ) {
+
+	switch (timer->cb_mode) {
+	case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
+		__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);
+		timer_stats_account_hrtimer(timer);
+		spin_unlock(&cpu_base->lock);
+		fn(timer);
+		spin_lock(&cpu_base->lock);
+		return;
+	case HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:
 		/*
 		 * Used for scheduler timers, avoid lock inversion with
 		 * rq->lock and tasklist_lock.
@@ -1310,11 +1320,17 @@ static void __run_hrtimer(struct hrtimer *timer)
 		 * These timers are required to deal with enqueue expiry
 		 * themselves and are not allowed to migrate.
 		 */
+		__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
+		timer_stats_account_hrtimer(timer);
 		spin_unlock(&cpu_base->lock);
 		restart = fn(timer);
 		spin_lock(&cpu_base->lock);
-	} else
+		break;
+	default:
+		__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
+		timer_stats_account_hrtimer(timer);
 		restart = fn(timer);
+	}
 
 	/*
 	 * Note: We clear the CALLBACK bit after enqueue_hrtimer to avoid
@@ -1516,7 +1532,7 @@ void hrtimer_init_sleeper(struct hrtimer_sleeper *sl, struct task_struct *task)
 	sl->timer.function = hrtimer_wakeup;
 	sl->task = task;
 #ifdef CONFIG_HIGH_RES_TIMERS
-	sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_SOFTIRQ;
+	sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ;
 #endif
 }
 
-- 
1.5.2.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup
  2008-08-18 14:42 [PATCH 0/2][RT] hrtimers stuck in waitqueue Gilles Carry
  2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
@ 2008-08-18 14:42 ` Gilles Carry
  2008-08-20 21:48   ` John Kacur
  1 sibling, 1 reply; 13+ messages in thread
From: Gilles Carry @ 2008-08-18 14:42 UTC (permalink / raw)
  To: linux-rt-users
  Cc: tglx, mingo, tinytim, jean-pierre.dion, sebastien.dugue,
	gilles.carry

This is a code cleanup that makes the code more readable.
Some HRTIMER_CB_SOFTIRQ processing have been moved to __run_hrtimer.

Signed-off-by: Gilles Carry <gilles.carry@bull.net>

---
 kernel/hrtimer.c |   32 +++++++++++---------------------
 1 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 778eb7e..26cfcdc 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1293,7 +1293,7 @@ static void run_hrtimer_pending(struct hrtimer_cpu_base *cpu_base)
 	wake_up_timer_waiters(cpu_base);
 }
 
-static void __run_hrtimer(struct hrtimer *timer)
+static int __run_hrtimer(struct hrtimer *timer)
 {
 	struct hrtimer_clock_base *base = timer->base;
 	struct hrtimer_cpu_base *cpu_base = base->cpu_base;
@@ -1311,7 +1311,7 @@ static void __run_hrtimer(struct hrtimer *timer)
 		spin_unlock(&cpu_base->lock);
 		fn(timer);
 		spin_lock(&cpu_base->lock);
-		return;
+		return 0;
 	case HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:
 		/*
 		 * Used for scheduler timers, avoid lock inversion with
@@ -1326,6 +1326,12 @@ static void __run_hrtimer(struct hrtimer *timer)
 		restart = fn(timer);
 		spin_lock(&cpu_base->lock);
 		break;
+	case HRTIMER_CB_SOFTIRQ:
+		/* Move softirq callbacks to the pending list */
+		__remove_hrtimer(timer, base, HRTIMER_STATE_PENDING, 0);
+		timer_stats_account_hrtimer(timer);
+		list_add_tail(&timer->cb_entry, &base->cpu_base->cb_pending);
+		return 1; /* Raise soft irq */
 	default:
 		__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
 		timer_stats_account_hrtimer(timer);
@@ -1342,6 +1348,8 @@ static void __run_hrtimer(struct hrtimer *timer)
 		enqueue_hrtimer(timer, base, 0);
 	}
 	timer->state &= ~HRTIMER_STATE_CALLBACK;
+
+	return 0;
 }
 
 #ifdef CONFIG_HIGH_RES_TIMERS
@@ -1394,17 +1402,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 
 			ftrace_event_timer_triggered(&timer->expires, timer);
 
-			/* Move softirq callbacks to the pending list */
-			if (timer->cb_mode == HRTIMER_CB_SOFTIRQ) {
-				__remove_hrtimer(timer, base,
-						 HRTIMER_STATE_PENDING, 0);
-				list_add_tail(&timer->cb_entry,
-					      &base->cpu_base->cb_pending);
-				raise = 1;
-				continue;
-			}
-
-			__run_hrtimer(timer);
+			raise |= __run_hrtimer(timer);
 		}
 		spin_unlock(&cpu_base->lock);
 		base++;
@@ -1497,14 +1495,6 @@ void hrtimer_run_queues(void)
 			if (base->softirq_time.tv64 <= timer->expires.tv64)
 				break;
 
-			if (timer->cb_mode == HRTIMER_CB_SOFTIRQ) {
-				__remove_hrtimer(timer, base,
-					HRTIMER_STATE_PENDING, 0);
-				list_add_tail(&timer->cb_entry,
-					&base->cpu_base->cb_pending);
-				continue;
-			}

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [RT] hrtimers stuck in waitqueue
  2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
@ 2008-08-19 14:10   ` Gregory Haskins
       [not found]     ` <789E827C-DB3F-451E-BFFF-4210433029DF@free.fr>
  2008-08-21 13:16   ` John Kacur
  2008-08-22 14:39   ` Thomas Gleixner
  2 siblings, 1 reply; 13+ messages in thread
From: Gregory Haskins @ 2008-08-19 14:10 UTC (permalink / raw)
  To: Gilles Carry
  Cc: linux-rt-users, tglx, mingo, tinytim, jean-pierre.dion,
	sebastien.dugue

[-- Attachment #1: Type: text/plain, Size: 836 bytes --]

Hi Gilles,

Gilles Carry wrote:
> This patch makes hrtimers initialized with hrtimer_init_sleeper
> to use another mode and then not be stuck in waitqueues when
> hrtimer_interrupt is very busy.
>   

Not sure if this is related, but I just posted a patch to fix a 
condition where hrtimer_interrupt could get artificially busy for *very* 
long periods of time (I observed 500ms).

http://article.gmane.org/gmane.linux.rt.user/3357

This bug was specific to 26-rt1.  Did you investigate the cause of the 
busy hrtimer_interrupt?  If not, its possible that your issue would be 
indirectly fixed now that the HRT subsystem doesn't spike-out like 
that.  Or do you believe these are orthogonal issues and we should still 
protect against non-pathological cases of hrtimer_interrupt running hot?

Regards,
-Greg




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [RT] hrtimers stuck in waitqueue
       [not found]     ` <789E827C-DB3F-451E-BFFF-4210433029DF@free.fr>
@ 2008-08-20 10:57       ` Gregory Haskins
  0 siblings, 0 replies; 13+ messages in thread
From: Gregory Haskins @ 2008-08-20 10:57 UTC (permalink / raw)
  To: Gilles Carry
  Cc: Gilles Carry, linux-rt-users, tglx, mingo, tinytim,
	jean-pierre.dion, sebastien.dugue

[-- Attachment #1: Type: text/plain, Size: 1649 bytes --]

Hi Gilles,

Gilles Carry wrote:
>
> Le 19 août 08 à 16:10, Gregory Haskins a écrit :
>
>> Hi Gilles,
>>
>> Gilles Carry wrote:
>>> This patch makes hrtimers initialized with hrtimer_init_sleeper
>>> to use another mode and then not be stuck in waitqueues when
>>> hrtimer_interrupt is very busy.
>>>  
>>
>> Not sure if this is related, but I just posted a patch to fix a 
>> condition where hrtimer_interrupt could get artificially busy for 
>> *very* long periods of time (I observed 500ms).
>>
>> http://article.gmane.org/gmane.linux.rt.user/3357
>>
>> This bug was specific to 26-rt1.  Did you investigate the cause of 
>> the busy hrtimer_interrupt?  If not, its possible that your issue 
>> would be indirectly fixed now that the HRT subsystem doesn't 
>> spike-out like that.  Or do you believe these are orthogonal issues 
>> and we should still protect against non-pathological cases of 
>> hrtimer_interrupt running hot?
>
> Hello Greg,
> The cause of busy hrtimer interrupt is the test that stresses the 
> system. Many timers expire and are treated in a single interrupt run.
>
> Anyway looking at __run_hrtimer we can see:
> spin_unlock (&cpu_base->lock);
> restart = fn(timer);
> spin_lock(&cpu_base->lock);
>
> I guess It is the point where awaken threads might migrate to another cpu and reach hrtimer_cancel *before* __run_hrtimer has changed the timer state. (at the bottom of the function)
> This is why a new state machine is needed.

It sounds reasonable that both fixes are needed, then.  Thanks.

-Greg

>
> Regards,
> Gilles.
>
>>
>>
>> Regards,
>> -Greg
>>
>>
>>
>



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup
  2008-08-18 14:42 ` [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup Gilles Carry
@ 2008-08-20 21:48   ` John Kacur
  2008-08-21 12:18     ` Gilles Carry
  0 siblings, 1 reply; 13+ messages in thread
From: John Kacur @ 2008-08-20 21:48 UTC (permalink / raw)
  To: Gilles Carry
  Cc: linux-rt-users, tglx, mingo, tinytim, jean-pierre.dion,
	sebastien.dugue

On Mon, Aug 18, 2008 at 4:42 PM, Gilles Carry <gilles.carry@bull.net> wrote:
> This is a code cleanup that makes the code more readable.
> Some HRTIMER_CB_SOFTIRQ processing have been moved to __run_hrtimer.
>
> Signed-off-by: Gilles Carry <gilles.carry@bull.net>
>
> ---
>  kernel/hrtimer.c |   32 +++++++++++---------------------
>  1 files changed, 11 insertions(+), 21 deletions(-)
>
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index 778eb7e..26cfcdc 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1293,7 +1293,7 @@ static void run_hrtimer_pending(struct hrtimer_cpu_base *cpu_base)
>        wake_up_timer_waiters(cpu_base);
>  }
>
> -static void __run_hrtimer(struct hrtimer *timer)
> +static int __run_hrtimer(struct hrtimer *timer)
>  {
>        struct hrtimer_clock_base *base = timer->base;
>        struct hrtimer_cpu_base *cpu_base = base->cpu_base;
> @@ -1311,7 +1311,7 @@ static void __run_hrtimer(struct hrtimer *timer)
>                spin_unlock(&cpu_base->lock);
>                fn(timer);
>                spin_lock(&cpu_base->lock);
> -               return;
> +               return 0;
>        case HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:
>                /*
>                 * Used for scheduler timers, avoid lock inversion with
> @@ -1326,6 +1326,12 @@ static void __run_hrtimer(struct hrtimer *timer)
>                restart = fn(timer);
>                spin_lock(&cpu_base->lock);
>                break;
> +       case HRTIMER_CB_SOFTIRQ:
> +               /* Move softirq callbacks to the pending list */
> +               __remove_hrtimer(timer, base, HRTIMER_STATE_PENDING, 0);
> +               timer_stats_account_hrtimer(timer);
> +               list_add_tail(&timer->cb_entry, &base->cpu_base->cb_pending);
> +               return 1; /* Raise soft irq */

I kind of find this confusing, because the return value is only useful
in one specific instance. In other cases, the return value is thrown
away. People who are not aware of the history of this code and examine
it later may ask why the return value is being ignored in some cases.

>        default:
>                __remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
>                timer_stats_account_hrtimer(timer);
> @@ -1342,6 +1348,8 @@ static void __run_hrtimer(struct hrtimer *timer)
>                enqueue_hrtimer(timer, base, 0);
>        }
>        timer->state &= ~HRTIMER_STATE_CALLBACK;
> +
> +       return 0;
>  }
>
>  #ifdef CONFIG_HIGH_RES_TIMERS
> @@ -1394,17 +1402,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>
>                        ftrace_event_timer_triggered(&timer->expires, timer);
>
> -                       /* Move softirq callbacks to the pending list */
> -                       if (timer->cb_mode == HRTIMER_CB_SOFTIRQ) {
> -                               __remove_hrtimer(timer, base,
> -                                                HRTIMER_STATE_PENDING, 0);
> -                               list_add_tail(&timer->cb_entry,
> -                                             &base->cpu_base->cb_pending);
> -                               raise = 1;
> -                               continue;
> -                       }
> -
> -                       __run_hrtimer(timer);
> +                       raise |= __run_hrtimer(timer);
>                }
>                spin_unlock(&cpu_base->lock);
>                base++;
> @@ -1497,14 +1495,6 @@ void hrtimer_run_queues(void)
>                        if (base->softirq_time.tv64 <= timer->expires.tv64)
>                                break;
>
> -                       if (timer->cb_mode == HRTIMER_CB_SOFTIRQ) {
> -                               __remove_hrtimer(timer, base,
> -                                       HRTIMER_STATE_PENDING, 0);
> -                               list_add_tail(&timer->cb_entry,
> -                                       &base->cpu_base->cb_pending);
> -                               continue;
> -                       }
> -
>                        __run_hrtimer(timer);
>                }
>                spin_unlock(&cpu_base->lock);
> --
> 1.5.2.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup
  2008-08-20 21:48   ` John Kacur
@ 2008-08-21 12:18     ` Gilles Carry
  2008-08-21 13:03       ` John Kacur
  0 siblings, 1 reply; 13+ messages in thread
From: Gilles Carry @ 2008-08-21 12:18 UTC (permalink / raw)
  To: John Kacur
  Cc: Gilles Carry, linux-rt-users, tglx, mingo, tinytim,
	jean-pierre.dion, sebastien.dugue

Le 20 août 08 à 23:48, John Kacur a écrit :

>
> I kind of find this confusing, because the return value is only useful
> in one specific instance. In other cases, the return value is thrown
> away. People who are not aware of the history of this code and examine
> it later may ask why the return value is being ignored in some cases.
>
>>

John,

I don't see anything confusing here. It's similar to a function where  
you have plenty
of parameters and depending on the context, only a few are useful. In  
the present
case, the timers mode names (...NOSOFTIRQ) speak for themselves. They  
don't need
this return code.
As there is already plenty of explanations and comments within the code,
people 'not aware' will do as everbody else do: decrypt and  
understand. ;-)
Anyway, I don't mind adding comments at the head of the function.

My goal here is only to make the code more readable and easier to  
maintain by
factorization this is why I separated this patch from the fix.
If you look at the factorized code, you see the only difference is the  
raisesoftirq stuff.
I thought it was worth doing it.

Gilles.--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup
  2008-08-21 12:18     ` Gilles Carry
@ 2008-08-21 13:03       ` John Kacur
  2008-08-22  6:04         ` Gilles Carry
  0 siblings, 1 reply; 13+ messages in thread
From: John Kacur @ 2008-08-21 13:03 UTC (permalink / raw)
  To: Gilles Carry
  Cc: Gilles Carry, linux-rt-users, tglx, mingo, tinytim,
	jean-pierre.dion, sebastien.dugue

On Thu, Aug 21, 2008 at 2:18 PM, Gilles Carry <glscarry@free.fr> wrote:
> Le 20 août 08 à 23:48, John Kacur a écrit :
>
>>
>> I kind of find this confusing, because the return value is only useful
>> in one specific instance. In other cases, the return value is thrown
>> away. People who are not aware of the history of this code and examine
>> it later may ask why the return value is being ignored in some cases.
>>
>>>
>
> John,
>
> I don't see anything confusing here. It's similar to a function where you
> have plenty
> of parameters and depending on the context, only a few are useful. In the
> present
> case, the timers mode names (...NOSOFTIRQ) speak for themselves. They don't
> need
> this return code.
> As there is already plenty of explanations and comments within the code,
> people 'not aware' will do as everbody else do: decrypt and understand. ;-)
> Anyway, I don't mind adding comments at the head of the function.
>
> My goal here is only to make the code more readable and easier to maintain
> by
> factorization this is why I separated this patch from the fix.
> If you look at the factorized code, you see the only difference is the
> raisesoftirq stuff.
> I thought it was worth doing it.
>
> Gilles.--
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

The goal of factorization is a good one of course. I also don't want
to argue too much about the implementation when you have done the
detective work to solve a hard problem, but I am just wondering if
there is a better way to do this - that would make the code even more
readable. As I go over this code again, I'm also thinking that it
conceptually doesn't really belong in the __run_timer function, even
if it works. If you look at the code before your changes, the function
__run_timer did just that, it ran a timer, but now it's manipulating
callback lists. Is there perhaps another way to common this code up
then?
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [RT] hrtimers stuck in waitqueue
  2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
  2008-08-19 14:10   ` Gregory Haskins
@ 2008-08-21 13:16   ` John Kacur
  2008-08-22  6:11     ` Gilles Carry
  2008-08-22 14:39   ` Thomas Gleixner
  2 siblings, 1 reply; 13+ messages in thread
From: John Kacur @ 2008-08-21 13:16 UTC (permalink / raw)
  To: Gilles Carry
  Cc: linux-rt-users, tglx, mingo, tinytim, jean-pierre.dion,
	sebastien.dugue

[-- Attachment #1: Type: text/plain, Size: 5790 bytes --]

On Mon, Aug 18, 2008 at 4:42 PM, Gilles Carry <gilles.carry@bull.net> wrote:
> This patch makes hrtimers initialized with hrtimer_init_sleeper
> to use another mode and then not be stuck in waitqueues when
> hrtimer_interrupt is very busy.
>
> The new mode is HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
> The above-mentionned timers have been moved from
> HRTIMER_CB_IRQSAFE_NO_SOFTIRQ to
> HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
>
> HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ timers use a slightly different
> state machine from HRTIMER_CB_IRQSAFE_NO_SOFTIRQ's as when removing the
> timer, __run_hrtimer sets the status to INACTIVE _then_
> wakes up the thread. This way, an awakened thread cannot enter
> hrtimer_cancel before the timer's status has changed.
>
> Signed-off-by: Gilles Carry <gilles.carry@bull.net>
>
> ---
>  include/linux/hrtimer.h |    4 ++++
>  kernel/hrtimer.c        |   26 +++++++++++++++++++++-----
>  2 files changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
> index f3867fd..53f4d44 100644
> --- a/include/linux/hrtimer.h
> +++ b/include/linux/hrtimer.h
> @@ -49,12 +49,16 @@ enum hrtimer_restart {
>  *                                     does not restart the timer
>  *     HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:  Callback must run in hardirq context
>  *                                     Special mode for tick emultation
> + *     HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
> + *                                     Callback must run in hardirq context and
> + *                                     does not restart the timer
>  */
>  enum hrtimer_cb_mode {
>        HRTIMER_CB_SOFTIRQ,
>        HRTIMER_CB_IRQSAFE,
>        HRTIMER_CB_IRQSAFE_NO_RESTART,
>        HRTIMER_CB_IRQSAFE_NO_SOFTIRQ,
> +       HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ,
>  };
>
>  /*
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index fa63a24..778eb7e 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -668,6 +668,7 @@ static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
>                /* Timer is expired, act upon the callback mode */
>                switch(timer->cb_mode) {
>                case HRTIMER_CB_IRQSAFE_NO_RESTART:
> +               case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
>                        debug_hrtimer_deactivate(timer);
>                        /*
>                         * We can call the callback from here. No restart
> @@ -1089,6 +1090,8 @@ int hrtimer_cancel(struct hrtimer *timer)
>
>                if (ret >= 0)
>                        return ret;
> +               WARN_ON (timer->cb_mode ==

Minor point here, we may as well squash checkpatch.pl complaints now,
to reduce any clean-up later to push to mainline. if you run
checkpatch.pl
WARNING: space prohibited between function name and open parenthesis '('
#69: FILE: kernel/hrtimer.c:1093:
+               WARN_ON (timer->cb_mode ==


> +                       HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ);
>                hrtimer_wait_for_timer(timer);
>        }
>  }
> @@ -1298,11 +1301,18 @@ static void __run_hrtimer(struct hrtimer *timer)
>        int restart;
>
>        debug_hrtimer_deactivate(timer);
> -       __remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
> -       timer_stats_account_hrtimer(timer);
>
>        fn = timer->function;
> -       if (timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_SOFTIRQ) {
> +
> +       switch (timer->cb_mode) {
> +       case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
> +               __remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);

Correct me if I'm wrong but there is a lot of code reworking and code
expansion here, when really the above line is the only difference
between your new HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ
and HRTIMER_CB_IRQSAFE_NO_SOFTIRQ.
It is not even strictly necessary to return, since the rest of the
function should be safe. I have attached an alternate smaller patch, -
maybe you'll find it simpler?


> +               timer_stats_account_hrtimer(timer);
> +               spin_unlock(&cpu_base->lock);
> +               fn(timer);
> +               spin_lock(&cpu_base->lock);
> +               return;
> +       case HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:
>                /*
>                 * Used for scheduler timers, avoid lock inversion with
>                 * rq->lock and tasklist_lock.
> @@ -1310,11 +1320,17 @@ static void __run_hrtimer(struct hrtimer *timer)
>                 * These timers are required to deal with enqueue expiry
>                 * themselves and are not allowed to migrate.
>                 */
> +               __remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
> +               timer_stats_account_hrtimer(timer);
>                spin_unlock(&cpu_base->lock);
>                restart = fn(timer);
>                spin_lock(&cpu_base->lock);
> -       } else
> +               break;
> +       default:
> +               __remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
> +               timer_stats_account_hrtimer(timer);
>                restart = fn(timer);
> +       }
>
>        /*
>         * Note: We clear the CALLBACK bit after enqueue_hrtimer to avoid
> @@ -1516,7 +1532,7 @@ void hrtimer_init_sleeper(struct hrtimer_sleeper *sl, struct task_struct *task)
>        sl->timer.function = hrtimer_wakeup;
>        sl->task = task;
>  #ifdef CONFIG_HIGH_RES_TIMERS
> -       sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_SOFTIRQ;
> +       sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ;
>  #endif
>  }
>
> --
> 1.5.2.2
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: hrtimers_stuck_in_waitq-jk.patch --]
[-- Type: text/x-patch; name=hrtimers_stuck_in_waitq-jk.patch, Size: 3722 bytes --]

From: gilles.carry@bull.net Gilles Carry 
To: linux-rt-users@vger.kernel.org 
Date: Mon, 18 Aug 2008 16:42:31 +0200 
Subject: [PATCH 1/2] [RT] hrtimers stuck in waitqueue 
 
This patch makes hrtimers initialized with hrtimer_init_sleeper
to use another mode and then not be stuck in waitqueues when
hrtimer_interrupt is very busy.

The new mode is HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
The above-mentionned timers have been moved from
HRTIMER_CB_IRQSAFE_NO_SOFTIRQ to
HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.

HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ timers use a slightly different
state machine from HRTIMER_CB_IRQSAFE_NO_SOFTIRQ's as when removing the
timer, __run_hrtimer sets the status to INACTIVE _then_
wakes up the thread. This way, an awakened thread cannot enter
hrtimer_cancel before the timer's status has changed.

Signed-off-by: Gilles Carry <gilles.carry@bull.net>
-reworked the patch to reduce it's size and more closely match the original function.
- removed space after WARN_ON to remove checkpatch.pl warning.
Signed-off-by: John Kacur <jkacur at gmail dot com>

---
 include/linux/hrtimer.h |    4 ++++
 kernel/hrtimer.c        |   26 +++++++++++++++++++++-----
 2 files changed, 25 insertions(+), 5 deletions(-)

Index: linux-2.6.26.2-rt1-jk/include/linux/hrtimer.h
===================================================================
--- linux-2.6.26.2-rt1-jk.orig/include/linux/hrtimer.h
+++ linux-2.6.26.2-rt1-jk/include/linux/hrtimer.h
@@ -49,12 +49,16 @@ enum hrtimer_restart {
  *					does not restart the timer
  *	HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:	Callback must run in hardirq context
  *					Special mode for tick emultation
+ *	HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
+ *					Callback must run in hardirq context and
+ *					does not restart the timer
  */
 enum hrtimer_cb_mode {
 	HRTIMER_CB_SOFTIRQ,
 	HRTIMER_CB_IRQSAFE,
 	HRTIMER_CB_IRQSAFE_NO_RESTART,
 	HRTIMER_CB_IRQSAFE_NO_SOFTIRQ,
+	HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ,
 };
 
 /*
Index: linux-2.6.26.2-rt1-jk/kernel/hrtimer.c
===================================================================
--- linux-2.6.26.2-rt1-jk.orig/kernel/hrtimer.c
+++ linux-2.6.26.2-rt1-jk/kernel/hrtimer.c
@@ -668,6 +668,7 @@ static inline int hrtimer_enqueue_reprog
 		/* Timer is expired, act upon the callback mode */
 		switch(timer->cb_mode) {
 		case HRTIMER_CB_IRQSAFE_NO_RESTART:
+		case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
 			debug_hrtimer_deactivate(timer);
 			/*
 			 * We can call the callback from here. No restart
@@ -1089,6 +1090,8 @@ int hrtimer_cancel(struct hrtimer *timer
 
 		if (ret >= 0)
 			return ret;
+		WARN_ON(timer->cb_mode ==
+			HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ);
 		hrtimer_wait_for_timer(timer);
 	}
 }
@@ -1298,11 +1301,15 @@ static void __run_hrtimer(struct hrtimer
 	int restart;
 
 	debug_hrtimer_deactivate(timer);
-	__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
+	if (timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ)
+		__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);
+	else
+		__remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
 	timer_stats_account_hrtimer(timer);
 
 	fn = timer->function;
-	if (timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_SOFTIRQ) {
+
+	if (timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_SOFTIRQ || timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ) {
 		/*
 		 * Used for scheduler timers, avoid lock inversion with
 		 * rq->lock and tasklist_lock.
@@ -1516,7 +1523,7 @@ void hrtimer_init_sleeper(struct hrtimer
 	sl->timer.function = hrtimer_wakeup;
 	sl->task = task;
 #ifdef CONFIG_HIGH_RES_TIMERS
-	sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_SOFTIRQ;
+	sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ;
 #endif
 }
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup
  2008-08-21 13:03       ` John Kacur
@ 2008-08-22  6:04         ` Gilles Carry
  0 siblings, 0 replies; 13+ messages in thread
From: Gilles Carry @ 2008-08-22  6:04 UTC (permalink / raw)
  To: John Kacur
  Cc: Gilles Carry, linux-rt-users, tglx, mingo, tinytim,
	jean-pierre.dion, sebastien.dugue


Le 21 août 08 à 15:03, John Kacur a écrit :

> On Thu, Aug 21, 2008 at 2:18 PM, Gilles Carry <glscarry@free.fr>  
> wrote:
>> Le 20 août 08 à 23:48, John Kacur a écrit :
>>
>>>
>>> I kind of find this confusing, because the return value is only  
>>> useful
>>> in one specific instance. In other cases, the return value is thrown
>>> away. People who are not aware of the history of this code and  
>>> examine
>>> it later may ask why the return value is being ignored in some  
>>> cases.
>>>
>>>>
>>
>> John,
>>
>> I don't see anything confusing here. It's similar to a function  
>> where you
>> have plenty
>> of parameters and depending on the context, only a few are useful.  
>> In the
>> present
>> case, the timers mode names (...NOSOFTIRQ) speak for themselves.  
>> They don't
>> need
>> this return code.
>> As there is already plenty of explanations and comments within the  
>> code,
>> people 'not aware' will do as everbody else do: decrypt and  
>> understand. ;-)
>> Anyway, I don't mind adding comments at the head of the function.
>>
>> My goal here is only to make the code more readable and easier to  
>> maintain
>> by
>> factorization this is why I separated this patch from the fix.
>> If you look at the factorized code, you see the only difference is  
>> the
>> raisesoftirq stuff.
>> I thought it was worth doing it.
>>
>> Gilles.--
>> To unsubscribe from this list: send the line "unsubscribe linux-rt- 
>> users" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> The goal of factorization is a good one of course. I also don't want
> to argue too much about the implementation when you have done the
> detective work to solve a hard problem, but I am just wondering if
> there is a better way to do this - that would make the code even more
> readable. As I go over this code again, I'm also thinking that it
> conceptually doesn't really belong in the __run_timer function, even
> if it works. If you look at the code before your changes, the function
> __run_timer did just that, it ran a timer, but now it's manipulating
> callback lists. Is there perhaps another way to common this code up
> then?

True, __run_hrtimer does a bit more than it used to but if you consider
that this function as a method of an object of class_hrtimer, it is
exactly what it would do. Also, wakeup_thread simply puts a thread
back on a run queue which somewhat similar to put a timer in a
callback list and change its status.

But as I told you, this patch is only a proposal.

Gilles.--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [RT] hrtimers stuck in waitqueue
  2008-08-21 13:16   ` John Kacur
@ 2008-08-22  6:11     ` Gilles Carry
  0 siblings, 0 replies; 13+ messages in thread
From: Gilles Carry @ 2008-08-22  6:11 UTC (permalink / raw)
  To: John Kacur
  Cc: Gilles Carry, linux-rt-users, tglx, mingo, tinytim,
	jean-pierre.dion, sebastien.dugue


Le 21 août 08 à 15:16, John Kacur a écrit :

> On Mon, Aug 18, 2008 at 4:42 PM, Gilles Carry  
> <gilles.carry@bull.net> wrote:
>> This patch makes hrtimers initialized with hrtimer_init_sleeper
>> to use another mode and then not be stuck in waitqueues when
>> hrtimer_interrupt is very busy.
>>
>> The new mode is HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
>> The above-mentionned timers have been moved from
>> HRTIMER_CB_IRQSAFE_NO_SOFTIRQ to
>> HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
>>
>> HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ timers use a slightly  
>> different
>> state machine from HRTIMER_CB_IRQSAFE_NO_SOFTIRQ's as when removing  
>> the
>> timer, __run_hrtimer sets the status to INACTIVE _then_
>> wakes up the thread. This way, an awakened thread cannot enter
>> hrtimer_cancel before the timer's status has changed.
>>
>> Signed-off-by: Gilles Carry <gilles.carry@bull.net>
>>
>> ---
>> include/linux/hrtimer.h |    4 ++++
>> kernel/hrtimer.c        |   26 +++++++++++++++++++++-----
>> 2 files changed, 25 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
>> index f3867fd..53f4d44 100644
>> --- a/include/linux/hrtimer.h
>> +++ b/include/linux/hrtimer.h
>> @@ -49,12 +49,16 @@ enum hrtimer_restart {
>> *                                     does not restart the timer
>> *     HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:  Callback must run in hardirq  
>> context
>> *                                     Special mode for tick  
>> emultation
>> + *     HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
>> + *                                     Callback must run in  
>> hardirq context and
>> + *                                     does not restart the timer
>> */
>> enum hrtimer_cb_mode {
>>       HRTIMER_CB_SOFTIRQ,
>>       HRTIMER_CB_IRQSAFE,
>>       HRTIMER_CB_IRQSAFE_NO_RESTART,
>>       HRTIMER_CB_IRQSAFE_NO_SOFTIRQ,
>> +       HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ,
>> };
>>
>> /*
>> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
>> index fa63a24..778eb7e 100644
>> --- a/kernel/hrtimer.c
>> +++ b/kernel/hrtimer.c
>> @@ -668,6 +668,7 @@ static inline int  
>> hrtimer_enqueue_reprogram(struct hrtimer *timer,
>>               /* Timer is expired, act upon the callback mode */
>>               switch(timer->cb_mode) {
>>               case HRTIMER_CB_IRQSAFE_NO_RESTART:
>> +               case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
>>                       debug_hrtimer_deactivate(timer);
>>                       /*
>>                        * We can call the callback from here. No  
>> restart
>> @@ -1089,6 +1090,8 @@ int hrtimer_cancel(struct hrtimer *timer)
>>
>>               if (ret >= 0)
>>                       return ret;
>> +               WARN_ON (timer->cb_mode ==
>
> Minor point here, we may as well squash checkpatch.pl complaints now,
> to reduce any clean-up later to push to mainline. if you run
> checkpatch.pl
> WARNING: space prohibited between function name and open parenthesis  
> '('
> #69: FILE: kernel/hrtimer.c:1093:
> +               WARN_ON (timer->cb_mode ==

I will fix that soon.
I split the line because it was to long. I know some people don't like  
more than
80 chars a line. This is why I did this.

>
>
>
>> +                       HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ);
>>               hrtimer_wait_for_timer(timer);
>>       }
>> }
>> @@ -1298,11 +1301,18 @@ static void __run_hrtimer(struct hrtimer  
>> *timer)
>>       int restart;
>>
>>       debug_hrtimer_deactivate(timer);
>> -       __remove_hrtimer(timer, base, HRTIMER_STATE_CALLBACK, 0);
>> -       timer_stats_account_hrtimer(timer);
>>
>>       fn = timer->function;
>> -       if (timer->cb_mode == HRTIMER_CB_IRQSAFE_NO_SOFTIRQ) {
>> +
>> +       switch (timer->cb_mode) {
>> +       case HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ:
>> +               __remove_hrtimer(timer, base,  
>> HRTIMER_STATE_INACTIVE, 0);
>
> Correct me if I'm wrong but there is a lot of code reworking and code
> expansion here, when really the above line is the only difference
> between your new HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ
> and HRTIMER_CB_IRQSAFE_NO_SOFTIRQ.
> It is not even strictly necessary to return, since the rest of the
> function should be safe. I have attached an alternate smaller patch, -
> maybe you'll find it simpler?
>

You're right I did some reworking to have clear situation of what is  
to be done
depending on the timer's mode.
I'll check your patch as soon as I come back to my office (monday).
I just have an email client down here.

Thank-you for all your comments, John.

>
>> +               timer_stats_account_hrtimer(timer);
>> +               spin_unlock(&cpu_base->lock);
>> +               fn(timer);
>> +               spin_lock(&cpu_base->lock);
>> +               return;
>> +       case HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:
>>               /*
>>                * Used for scheduler timers, avoid lock inversion with
>>                * rq->lock and tasklist_lock.
>> @@ -1310,11 +1320,17 @@ static void __run_hrtimer(struct hrtimer  
>> *timer)
>>                * These timers are required to deal with enqueue  
>> expiry
>>                * themselves and are not allowed to migrate.
>>                */
>> +               __remove_hrtimer(timer, base,  
>> HRTIMER_STATE_CALLBACK, 0);
>> +               timer_stats_account_hrtimer(timer);
>>               spin_unlock(&cpu_base->lock);
>>               restart = fn(timer);
>>               spin_lock(&cpu_base->lock);
>> -       } else
>> +               break;
>> +       default:
>> +               __remove_hrtimer(timer, base,  
>> HRTIMER_STATE_CALLBACK, 0);
>> +               timer_stats_account_hrtimer(timer);
>>               restart = fn(timer);
>> +       }
>>
>>       /*
>>        * Note: We clear the CALLBACK bit after enqueue_hrtimer to  
>> avoid
>> @@ -1516,7 +1532,7 @@ void hrtimer_init_sleeper(struct  
>> hrtimer_sleeper *sl, struct task_struct *task)
>>       sl->timer.function = hrtimer_wakeup;
>>       sl->task = task;
>> #ifdef CONFIG_HIGH_RES_TIMERS
>> -       sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_SOFTIRQ;
>> +       sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFTIRQ;
>> #endif
>> }
>>
>> --
>> 1.5.2.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rt- 
>> users" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> <hrtimers_stuck_in_waitq-jk.patch>

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [RT] hrtimers stuck in waitqueue
  2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
  2008-08-19 14:10   ` Gregory Haskins
  2008-08-21 13:16   ` John Kacur
@ 2008-08-22 14:39   ` Thomas Gleixner
  2008-08-25 13:09     ` Gilles Carry
  2 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2008-08-22 14:39 UTC (permalink / raw)
  To: Gilles Carry
  Cc: linux-rt-users, mingo, tinytim, jean-pierre.dion, sebastien.dugue

On Mon, 18 Aug 2008, Gilles Carry wrote:
> This patch makes hrtimers initialized with hrtimer_init_sleeper
> to use another mode and then not be stuck in waitqueues when
> hrtimer_interrupt is very busy.
> 
> The new mode is HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
> The above-mentionned timers have been moved from
> HRTIMER_CB_IRQSAFE_NO_SOFTIRQ to
> HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
> 
> HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ timers use a slightly different
> state machine from HRTIMER_CB_IRQSAFE_NO_SOFTIRQ's as when removing the
> timer, __run_hrtimer sets the status to INACTIVE _then_
> wakes up the thread. This way, an awakened thread cannot enter
> hrtimer_cancel before the timer's status has changed.

NAK. That solution is racy.

     CPU 0                        CPU 1

     timer interrupt runs         signal wakeup for task which sleeps
     timer->state = INACTIVE;

-> Race window start
     base->lock is dropped
				  hrtimer_cancel()
				  data structure on stack is destroyed

     timer function called
     	   data structure access --> POOOF

-> Race window end

     base->lock is locked

The race is extremly narrow and requires an SMI or some other delay
(bus stall, cache miss ...) on CPU 0, but it exists.

Fix below.

Thanks,
	tglx

---------------->
Subject: hrtimer: avoid waitqueue starvation
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 22 Aug 2008 16:27:13 +0200

Garry Gilles found that when a hrtimer callback runs in interrupt
context, then the woken up thread might end up on the timer wakequeue,
which might be blocked for a long time due to interrupts, higher
priority threads and no timers in the softirq list.

For timers which run their callback function in the hard irq context
we can safely spin and wait for the state to become inactive. The
waitqueue is only necessary for timers which run their callback
function in softirq context.

Debugged-by: Gilles Carry <gilles.carry@bull.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/hrtimer.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6.24.7/kernel/hrtimer.c
===================================================================
--- linux-2.6.24.7.orig/kernel/hrtimer.c
+++ linux-2.6.24.7/kernel/hrtimer.c
@@ -990,7 +990,14 @@ int hrtimer_cancel(struct hrtimer *timer
 
 		if (ret >= 0)
 			return ret;
-		hrtimer_wait_for_timer(timer);
+		switch (timer->cb_mode) {
+		case HRTIMER_CB_IRQSAFE_NO_SOFTIRQ:
+		case HRTIMER_CB_IRQSAFE_NO_RESTART:
+			cpu_relax();
+			break;
+		default:
+			hrtimer_wait_for_timer(timer);
+		}
 	}
 }
 EXPORT_SYMBOL_GPL(hrtimer_cancel);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/2] [RT] hrtimers stuck in waitqueue
  2008-08-22 14:39   ` Thomas Gleixner
@ 2008-08-25 13:09     ` Gilles Carry
  0 siblings, 0 replies; 13+ messages in thread
From: Gilles Carry @ 2008-08-25 13:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-rt-users, mingo, tinytim, jean-pierre.dion, sebastien.dugue

Thomas Gleixner wrote:
> On Mon, 18 Aug 2008, Gilles Carry wrote:
> 
>>This patch makes hrtimers initialized with hrtimer_init_sleeper
>>to use another mode and then not be stuck in waitqueues when
>>hrtimer_interrupt is very busy.
>>
>>The new mode is HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
>>The above-mentionned timers have been moved from
>>HRTIMER_CB_IRQSAFE_NO_SOFTIRQ to
>>HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ.
>>
>>HRTIMER_CB_IRQSAFE_NO_RESTART_NO_SOFIRQ timers use a slightly different
>>state machine from HRTIMER_CB_IRQSAFE_NO_SOFTIRQ's as when removing the
>>timer, __run_hrtimer sets the status to INACTIVE _then_
>>wakes up the thread. This way, an awakened thread cannot enter
>>hrtimer_cancel before the timer's status has changed.
> 
> 
> NAK. That solution is racy.
> 
>      CPU 0                        CPU 1
> 
>      timer interrupt runs         signal wakeup for task which sleeps
>      timer->state = INACTIVE;
> 
> -> Race window start
>      base->lock is dropped
> 				  hrtimer_cancel()
> 				  data structure on stack is destroyed
> 
>      timer function called
>      	   data structure access --> POOOF
> 
> -> Race window end
> 
>      base->lock is locked
> 
> The race is extremly narrow and requires an SMI or some other delay
> (bus stall, cache miss ...) on CPU 0, but it exists.
> 
> Fix below.
> 
> Thanks,
> 	tglx
> 

Ooops! I did not think of that.
A so narrow window that my tests did not reveal.

Thank-you Thomas.

Gilles.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-08-25 13:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-18 14:42 [PATCH 0/2][RT] hrtimers stuck in waitqueue Gilles Carry
2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
2008-08-19 14:10   ` Gregory Haskins
     [not found]     ` <789E827C-DB3F-451E-BFFF-4210433029DF@free.fr>
2008-08-20 10:57       ` Gregory Haskins
2008-08-21 13:16   ` John Kacur
2008-08-22  6:11     ` Gilles Carry
2008-08-22 14:39   ` Thomas Gleixner
2008-08-25 13:09     ` Gilles Carry
2008-08-18 14:42 ` [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup Gilles Carry
2008-08-20 21:48   ` John Kacur
2008-08-21 12:18     ` Gilles Carry
2008-08-21 13:03       ` John Kacur
2008-08-22  6:04         ` Gilles Carry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).