From: john cooper <john.cooper@timesys.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: john cooper <john.cooper@timesys.com>,
	Daniel Walker <dwalker@mvista.com>,
	linux-kernel@vger.kernel.org
Subject: Re: RT and Cascade interrupts
Date: Tue, 24 May 2005 12:32:21 -0400
Message-ID: <42935715.2000505@timesys.com>
In-Reply-To: <4284A7B6.4090408@timesys.com>

[-- Attachment #1: Type: text/plain, Size: 1550 bytes --]

john cooper wrote:
> I'm seeing the BUG assert in kernel/timers.c:cascade()
> kick in (tmp->base is somehow 0) during a test which
> creates a few tasks of priority higher than ksoftirqd.
> This race doesn't happen if ksoftirqd's priority is
> elevated (eg: chrt -f -p 75 2) so the -RT patch might
> be opening up a window here.

There is a window in rpc_run_timer() which allows
it to lose track of timer ownership when ksoftirqd
(and thus rpc_run_timer() itself) is preempted.  This
doesn't cause a problem immediately, but it does
corrupt the timer cascade list when the timer struct
is recycled/requeued.  The corruption shows up some
time later, when the list is processed.  The failure
mode is cascade() attempting to percolate a timer with
poisoned next/prev pointers and a NULL base, causing
the assertion BUG_ON(tmp->base != base) to kick in.

The RPC code attempts to track timer ownership for
a given rpc_task via the RPC_TASK_HAS_TIMER bit in
rpc_task.tk_runstate.  Besides not working correctly
in a preemptible context, the bit merely duplicates
state the timer code already keeps: whether the timer
is pending in the cascade structure (ie: timer->base
is non-NULL).  The fix changes the RPC code to test
timer->base when deciding whether an outstanding
timer registration exists during rpc_task teardown.
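
To make the window concrete, here is a small user-space
sketch, not kernel code.  Every toy_* name is made up for
illustration, and the interleaving it replays (handler
preempted after the timer is dequeued, timer re-armed,
handler then clearing the flag) is an assumption about one
way ownership could be lost, not a trace from the failing
system.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_timer {
	void *base;		/* non-NULL while queued, like timer->base */
};

struct toy_task {
	struct toy_timer timer;
	bool has_timer;		/* models the RPC_TASK_HAS_TIMER bit */
};

static int toy_base;		/* dummy object standing in for a timer base */

/* Models __rpc_add_timer(): set the flag, then queue the timer. */
static void toy_add_timer(struct toy_task *t)
{
	t->has_timer = true;
	t->timer.base = &toy_base;
}

/* Models the timer core dequeuing the timer before calling its handler. */
static void toy_timer_expires(struct toy_task *t)
{
	t->timer.base = NULL;
}

/* Models the tail of the old rpc_run_timer(): clear the flag. */
static void toy_run_timer_tail(struct toy_task *t)
{
	t->has_timer = false;
}

/* Old teardown test: trust the duplicated flag. */
static void toy_delete_timer_flag(struct toy_task *t)
{
	if (t->has_timer) {
		t->has_timer = false;
		t->timer.base = NULL;	/* timer removed from the list */
	}
}

/* Patched teardown test: ask the timer itself. */
static void toy_delete_timer_base(struct toy_task *t)
{
	if (t->timer.base)
		t->timer.base = NULL;	/* timer removed from the list */
}

/*
 * Replays the assumed interleaving sequentially and reports whether
 * the timer is still queued after teardown (true means it leaked).
 */
static bool toy_replay(bool use_base_check)
{
	struct toy_task t = { { NULL }, false };

	toy_add_timer(&t);	/* task arms its timer */
	toy_timer_expires(&t);	/* timer fires, handler starts */
	/* handler preempted here; the task re-arms the same timer */
	toy_add_timer(&t);
	toy_run_timer_tail(&t);	/* handler resumes, clears the flag */

	if (use_base_check)
		toy_delete_timer_base(&t);
	else
		toy_delete_timer_flag(&t);

	return t.timer.base != NULL;
}
```

In this model toy_replay(false) returns true: the flag says
no timer is outstanding, teardown skips the delete, and a
timer is left queued on a task about to be recycled.
toy_replay(true) returns false because the check reads the
timer's own state.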

Note: this failure occurred with the 40-04 version of
the -RT patch, though it also applies to more current
versions.  It was seen when executing stress tests on a
number of PPC targets running with an NFS-mounted root,
but was not observed on an x86 target under similar
conditions.

-john


-- 
john.cooper@timesys.com

[-- Attachment #2: RPC.patch --]
[-- Type: text/plain, Size: 1630 bytes --]

./include/linux/sunrpc/sched.h
./net/sunrpc/sched.c
=================================================================
--- ./include/linux/sunrpc/sched.h.ORG	2005-05-24 10:29:24.000000000 -0400
+++ ./include/linux/sunrpc/sched.h	2005-05-24 10:47:56.000000000 -0400
@@ -142,7 +142,6 @@ typedef void			(*rpc_action)(struct rpc_
 #define RPC_TASK_RUNNING	0
 #define RPC_TASK_QUEUED		1
 #define RPC_TASK_WAKEUP		2
-#define RPC_TASK_HAS_TIMER	3
 
 #define RPC_IS_RUNNING(t)	(test_bit(RPC_TASK_RUNNING, &(t)->tk_runstate))
 #define rpc_set_running(t)	(set_bit(RPC_TASK_RUNNING, &(t)->tk_runstate))
=================================================================
--- ./net/sunrpc/sched.c.ORG	2005-05-24 10:29:52.000000000 -0400
+++ ./net/sunrpc/sched.c	2005-05-24 11:02:44.000000000 -0400
@@ -103,9 +103,6 @@ static void rpc_run_timer(struct rpc_tas
 		dprintk("RPC: %4d running timer\n", task->tk_pid);
 		callback(task);
 	}
-	smp_mb__before_clear_bit();
-	clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate);
-	smp_mb__after_clear_bit();
 }
 
 /*
@@ -124,7 +121,6 @@ __rpc_add_timer(struct rpc_task *task, r
 		task->tk_timeout_fn = timer;
 	else
 		task->tk_timeout_fn = __rpc_default_timer;
-	set_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate);
 	mod_timer(&task->tk_timer, jiffies + task->tk_timeout);
 }
 
@@ -135,7 +131,7 @@ __rpc_add_timer(struct rpc_task *task, r
 static inline void
 rpc_delete_timer(struct rpc_task *task)
 {
-	if (test_and_clear_bit(RPC_TASK_HAS_TIMER, &task->tk_runstate)) {
+	if (task->tk_timer.base) {
 		del_singleshot_timer_sync(&task->tk_timer);
 		dprintk("RPC: %4d deleting timer\n", task->tk_pid);
 	}
