From: john cooper <john.cooper@timesys.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Oleg Nesterov <oleg@tv-sign.ru>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Olaf Kirch <okir@suse.de>, john cooper <john.cooper@timesys.com>
Subject: Re: RT and Cascade interrupts
Date: Wed, 01 Jun 2005 23:31:29 -0400	[thread overview]
Message-ID: <429E7D91.9000808@timesys.com> (raw)
In-Reply-To: <1117666319.10822.17.camel@lade.trondhjem.org>

Trond Myklebust wrote:
> On Wed, 01.06.2005 at 16:59 (-0400), john cooper wrote:
> 
>>Yes later versions of the patch do.  The version at hand
>>40-04 is based on 2.6.11.  We intend to sync-up with a
>>more recent version of the RT patch pending resolution
>>of this issue.
> 
> 
> Well it is pointless to concentrate on an obviously buggy patch. Could
> you please sync up to rc5-rt-V0.7.47-15 at least: that looks like it
> might be working (or at least be close to working).

I fully share your frustration of wanting to "use the
latest patch -- dammit".  However there are other practical
constraints coming into play.  This tree has accumulated a
substantial number of fixes for scheduler violation assertions
along with associated testing and has fared well thus far.
The bug under discussion here is the last major operational
problem found in the associated testing process.  Arriving
at this point also required development of target specific
driver/board code so a resync to a later version is not a
trivial operation.  However it would be justifiable in the
case of encountering an impasse with the current tree.

> Could you then apply the following debugging patch? It should warn you
> in case something happens to corrupt base->running_timer (something
> which would screw up del_timer_sync()). I'm not sure that can happen,
> but it might be worth checking.

Yes, thanks.  However, the event trace does not suggest
reentrance in __run_timers() but rather a preemption of it
during the call to rpc_run_timer() by a high priority
application task in the midst of an RPC.  The preempting
task requeues the timer in the cascade at the tail of
xprt_transmit().  rpc_run_timer() upon resuming execution
unconditionally clears the RPC_TASK_HAS_TIMER flag.  This
creates the inconsistent state.

No explicit deletion attempt of the timer (synchronous or
otherwise) is coming into play in the failure scenario as
witnessed by the event trace.  Rather it is the implicit
dequeue of the timer from the cascade in __run_timers() and
attempt to track ownership of it in rpc_run_timer() via
RPC_TASK_HAS_TIMER which is undermined in the case of
preemption.
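
As a rough illustration of the sequence above, here is a minimal
user-space sketch in C.  It is not kernel code: the struct and
the sim_* names are made up for illustration, and the two
booleans only stand in for the RPC_TASK_HAS_TIMER flag and for
the timer being linked in the cascade.  It assumes the ordering
seen in the event trace: the timer is dequeued, the handler is
preempted, the preempting task requeues the timer, and the
handler then clears the flag on resuming.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task timer state. */
struct sim_task {
    bool has_timer_flag;  /* models RPC_TASK_HAS_TIMER */
    bool timer_queued;    /* models the timer being in the cascade */
};

/* Models the implicit dequeue of an expired timer before its handler runs. */
static void sim_dequeue_expired(struct sim_task *t)
{
    t->timer_queued = false;
}

/* Models the requeue done by the preempting task at the tail of xprt_transmit(). */
static void sim_requeue(struct sim_task *t)
{
    t->timer_queued = true;
    t->has_timer_flag = true;
}

/* Models the handler resuming and clearing the flag unconditionally. */
static void sim_handler_finish(struct sim_task *t)
{
    t->has_timer_flag = false;
}

/* Runs one timer expiry; returns true if flag and queue state end up disagreeing. */
static bool sim_run(bool preempted)
{
    struct sim_task t = { .has_timer_flag = true, .timer_queued = true };

    assert(t.has_timer_flag == t.timer_queued);  /* starts consistent */

    sim_dequeue_expired(&t);
    if (preempted)
        sim_requeue(&t);      /* happens while the handler is preempted */
    sim_handler_finish(&t);

    return t.timer_queued != t.has_timer_flag;
}
```

Calling sim_run(false) leaves the flag and the queue state in
agreement, while sim_run(true) ends with the timer queued but
the flag cleared, which is the inconsistent state described.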

From earlier mail:

 > There should be no instances of RPC entering call_transmit() or any
 > other tk_action callback with a pending timer.

My description wasn't clear.  The timeout isn't pending
before call_transmit().  Rather the RPC appears to be
blocked elsewhere and upon wakeup via __run_timers()/xprt_timer()
preempts ksoftirqd and does the __rpc_sleep_on()/__mod_timer()
at the very tail of xprt_transmit().

I have work-arounds which detect the preemption of
rpc_run_timer() and correct the timer state.  But I
don't suggest they are general solutions or a
permanent fix.
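
To give an idea of what detecting the preemption could look
like, here is a hypothetical user-space sketch in C.  It is not
the actual work-around and not kernel code: the requeue counter,
the struct and the wa_* names are all assumptions made for
illustration.  The idea shown is that the handler snapshots a
counter on entry and only clears the flag if no requeue happened
in the meantime.  A real implementation would also need proper
locking, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task timer state. */
struct wa_task {
    bool has_timer_flag;     /* models RPC_TASK_HAS_TIMER */
    bool timer_queued;       /* models the timer being in the cascade */
    unsigned requeue_count;  /* assumed extra field: bumped on every requeue */
};

/* Models a requeue by a preempting task; records that it happened. */
static void wa_requeue(struct wa_task *t)
{
    t->timer_queued = true;
    t->has_timer_flag = true;
    t->requeue_count++;
}

/* Runs one timer expiry; returns true if flag and queue state agree at the end. */
static bool wa_run(bool preempted)
{
    struct wa_task t = { .has_timer_flag = true, .timer_queued = true,
                         .requeue_count = 0 };
    unsigned seen = t.requeue_count;  /* snapshot on handler entry */

    t.timer_queued = false;           /* timer dequeued before handler runs */
    if (preempted)
        wa_requeue(&t);               /* happens while the handler is preempted */

    /* Clear the flag only if nobody requeued the timer behind our back. */
    if (t.requeue_count == seen)
        t.has_timer_flag = false;

    return t.timer_queued == t.has_timer_flag;
}
```

With this check both wa_run(false) and wa_run(true) end in a
consistent state, at the cost of an extra field per task.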

In any case I will ascertain whether or not the problem here
still exists as soon as possible upon resync with a current
tree.

-john


-- 
john.cooper@timesys.com
