linux-rt-users.vger.kernel.org archive mirror
* [PATCH 0/2][RT] hrtimers stuck in waitqueue
@ 2008-08-18 14:42 Gilles Carry
  2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
  2008-08-18 14:42 ` [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup Gilles Carry
  0 siblings, 2 replies; 13+ messages in thread
From: Gilles Carry @ 2008-08-18 14:42 UTC (permalink / raw)
  To: linux-rt-users
  Cc: tglx, mingo, tinytim, jean-pierre.dion, sebastien.dugue,
	gilles.carry



Hello,

These patches fix a bug in which high resolution timers initialized by
hrtimer_init_sleeper (used by nanosleep and futexes) can get stuck on a
wait queue.
They apply to 2.6.26-rt1.

The test below reproduces the bug. It hangs immediately on my ppc64
(8 CPUs), but can take tens of minutes to hang on my x86_64 (8 CPUs).
(The kernel must be built with CONFIG_HIGH_RES_TIMERS=y.)

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

#define NUM_THREADS     30
#define NUM_LOOPS       10000

void *worker_thread(void *arg)
{
        long id = (long)arg;
        int i;

        for (i = 0; i < NUM_LOOPS; i++) {
                usleep(1000);
        }

        printf("thread %02ld done\n", id+1);

        return NULL;
}

int main(int argc, char* argv[])
{
        int i;
        struct sched_param param;
        pthread_attr_t attr;
        pthread_t *threads;

        threads = malloc(NUM_THREADS * sizeof(pthread_t));
        if (threads == NULL) {
                perror("Failed to allocate threads");
                return 1;
        }

        param.sched_priority = sched_get_priority_min(SCHED_FIFO);
        pthread_attr_init(&attr);
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
        pthread_attr_setschedparam(&attr, &param);
        pthread_attr_setschedpolicy(&attr, SCHED_FIFO);

        /* start threads */
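        for (i = 0; i < NUM_THREADS; i++) {
                /* pthread_create returns an error number; it does not set errno */
                int err = pthread_create(&threads[i], &attr,
                                         worker_thread, (void *)(long)i);
                if (err)
                        fprintf(stderr, "Failed to create thread: error %d\n", err);
        }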

        pthread_attr_destroy(&attr);

        for (i = 0; i < NUM_THREADS; i++)
                pthread_join(threads[i], NULL);

        free(threads);

        return 0;
}


This occurs when hrtimer_interrupt is very busy and some awakened
threads enter hrtimer_cancel before hrtimer_interrupt has changed the
timer status. These threads are queued on a wait queue and are almost
never awakened, since HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers are not supposed
to raise a softirq.
They are awakened only occasionally, when another timer that uses a
softirq callback expires on the same CPU.
Before the patch, I could unblock them all by flooding the system with the
program below, so that softirq timers with the same CB mode run
on all CPUs.

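#include <unistd.h>

int main(void)
{
	alarm(1);	/* SIGALRM is delivered after one second */
	pause();	/* sleep until the signal terminates the process */
	return 0;
}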
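
For anyone who wants to repeat the "flooding" step in a controlled way,
here is a minimal sketch that forks one alarm()/pause() child per CPU.
It is not part of the patches. The function name flood_once and the
best-effort pinning with sched_setaffinity are my assumptions; the
original workaround simply ran many copies of the program above and
left CPU placement to the scheduler.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork one child per CPU index in [0, ncpus).
 * Each child tries to pin itself to its CPU (failure is ignored),
 * arms alarm(1) and pauses, like the one-shot program above.
 * Returns the number of children that were terminated by SIGALRM.
 */
int flood_once(int ncpus)
{
        int cpu, status, spawned = 0, fired = 0;

        for (cpu = 0; cpu < ncpus; cpu++) {
                pid_t pid = fork();

                if (pid < 0)
                        break;          /* stop spawning, reap what we have */
                if (pid == 0) {
                        cpu_set_t set;
                        sigset_t sigs;

                        CPU_ZERO(&set);
                        CPU_SET(cpu, &set);
                        sched_setaffinity(0, sizeof(set), &set);

                        /* make sure SIGALRM is unblocked and fatal */
                        sigemptyset(&sigs);
                        sigaddset(&sigs, SIGALRM);
                        sigprocmask(SIG_UNBLOCK, &sigs, NULL);
                        signal(SIGALRM, SIG_DFL);

                        alarm(1);
                        pause();
                        _exit(1);       /* only reached if pause() returns */
                }
                spawned++;
        }

        while (spawned-- > 0) {
                if (wait(&status) > 0 && WIFSIGNALED(status) &&
                    WTERMSIG(status) == SIGALRM)
                        fired++;
        }

        return fired;
}
```

Calling flood_once(8) in a loop from main() on an 8-way box would
approximate the flooding described above; each call takes about one
second because of alarm(1).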

Adding traces to /proc/timer_list (not included in this patch) helped
track down the bug.


The second patch is a code cleanup that makes the code more readable.

I have run the above test on the patched kernel for ~100 hours without
a failure on two 8-way systems: x86_64 and ppc64 (POWER6).


end of thread, other threads:[~2008-08-25 13:09 UTC | newest]

Thread overview: 13+ messages
2008-08-18 14:42 [PATCH 0/2][RT] hrtimers stuck in waitqueue Gilles Carry
2008-08-18 14:42 ` [PATCH 1/2] [RT] " Gilles Carry
2008-08-19 14:10   ` Gregory Haskins
     [not found]     ` <789E827C-DB3F-451E-BFFF-4210433029DF@free.fr>
2008-08-20 10:57       ` Gregory Haskins
2008-08-21 13:16   ` John Kacur
2008-08-22  6:11     ` Gilles Carry
2008-08-22 14:39   ` Thomas Gleixner
2008-08-25 13:09     ` Gilles Carry
2008-08-18 14:42 ` [PATCH 2/2] [RT] hrtimer __run_hrtimer code cleanup Gilles Carry
2008-08-20 21:48   ` John Kacur
2008-08-21 12:18     ` Gilles Carry
2008-08-21 13:03       ` John Kacur
2008-08-22  6:04         ` Gilles Carry
