From: john cooper <john.cooper@timesys.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: john cooper <john.cooper@timesys.com>,
	Oleg Nesterov <oleg@tv-sign.ru>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Olaf Kirch <okir@suse.de>
Subject: Re: RT and Cascade interrupts
Date: Tue, 31 May 2005 19:09:44 -0400
Message-ID: <429CEEB8.1010404@timesys.com>
In-Reply-To: <429B8678.1000706@timesys.com>

john cooper wrote:
> Trond Myklebust wrote:
> 
>> I've appended a patch that
>> should check for strict compliance of the above rules. Could you try it
>> out and see if it triggers any Oopses?
> 
> 
> Yes, the assert in rpc_delete_timer() occurs just before
> the cascade list corruption.  This is consistent with
> what I have seen.  ie: the timer in a released rpc_task
> is still active.

I've captured more data from the instrumentation and found
that the rpc_task's timer is being requeued by an application
task which preempts ksoftirqd when it wakes up in
xprt_transmit().  This is what I had originally suspected
but likely didn't communicate effectively.

The scenario unfolds as:

                                          [high priority app task]
                                              :
                                          call_transmit()
                                             xprt_transmit()
                                             /* blocks in xprt_transmit() */
ksoftirqd()
     __run_timers()
         list_del("rpc_task_X.timer") /* logically off cascade */
         rpc_run_timer(data)
             task->tk_timeout_fn(task)

             /* ksoftirqd preempted */

                                              :
                                              ---------------------------------------------------------
                                              /* Don't race with disconnect */
                                              if (!xprt_connected(xprt))
                                                  task->tk_status = -ENOTCONN;
                                              else if (!req->rq_received)
                                                  rpc_sleep_on(&xprt->pending, task, NULL, xprt_timer);
                                              ---------------------------------------------------------
                                                      __rpc_sleep_on()
                                                          __mod_timer("rpc_task_X.timer")  /* requeued in cascade */
                                              /* blocks */

         /* rpc_run_timer resumes from preempt */
         clear_bit(RPC_TASK_HAS_TIMER, "rpc_task_X.tk_runstate");

         /* rpc_task_X.timer is now enqueued in cascade without
            RPC_TASK_HAS_TIMER set and will not be dequeued
            in rpc_release_task()/rpc_delete_timer() */


The notation "rpc_task_X.timer" denotes a single timer
struct: the same KVA (kernel virtual address) was observed
at each of the associated points in the instrumented code.

The above was gathered by logging usage of the
kernel/timer.c primitives, so I don't have more
detailed state of the rpc_task in RPC context.
However, I did verify which of the three calls to
rpc_sleep_on() in xprt_transmit() was being invoked
(the one shown above).
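
The interleaving above can be replayed deterministically in
a small userspace model.  This is only a sketch under stated
assumptions: the model_* names are hypothetical, a plain bool
stands in for "linked in the cascade", and an unsigned word
stands in for tk_runstate.  It is not the sunrpc code, just
the ordering of the four steps from the trace.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
#define MODEL_HAS_TIMER 1u          /* stands in for RPC_TASK_HAS_TIMER */

struct model_task {
    unsigned runstate;              /* stands in for tk_runstate */
    bool timer_queued;              /* true while the timer is in the cascade */
};

/* Models __rpc_sleep_on() arming the timer: set flag, enqueue. */
static void model_add_timer(struct model_task *t)
{
    t->runstate |= MODEL_HAS_TIMER;
    t->timer_queued = true;
}

/* Models __run_timers() unlinking the timer before the callback. */
static void model_expire_begin(struct model_task *t)
{
    assert(t->timer_queued);        /* only a queued timer can expire */
    t->timer_queued = false;
}

/* Models the tail of rpc_run_timer(): clear the flag unconditionally. */
static void model_expire_end(struct model_task *t)
{
    t->runstate &= ~MODEL_HAS_TIMER;
}

/* Models rpc_delete_timer(): dequeue only if the flag says so. */
static void model_delete_timer(struct model_task *t)
{
    if (t->runstate & MODEL_HAS_TIMER) {
        t->runstate &= ~MODEL_HAS_TIMER;
        t->timer_queued = false;
    }
}

/*
 * Replays the trace.  With preempted == true the application task
 * requeues the timer between the two halves of the expiry path.
 * Returns true if the timer is still queued after the release step.
 */
static bool model_replay(bool preempted)
{
    struct model_task t = { 0, false };

    model_add_timer(&t);            /* timer initially armed */
    model_expire_begin(&t);         /* ksoftirqd: off the cascade */
    if (preempted)
        model_add_timer(&t);        /* app task: requeued in cascade */
    model_expire_end(&t);           /* ksoftirqd resumes: flag cleared */
    model_delete_timer(&t);         /* release path trusts the flag */
    return t.timer_queued;
}
```

Calling model_replay(true) leaves the timer queued after the
release step, which corresponds to the stale cascade entry;
model_replay(false) does not.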

So the root cause appears to be the rpc_task's timer
being requeued in xprt_transmit() while rpc_run_timer()
is preempted.  From looking at the code, I'm unsure
whether modifying xprt_transmit()/out_receive to
synchronize with rpc_release_task() is appropriate.  It
seems more natural to allow the rpc_sleep_on() to occur
and to have rpc_release_task() detect the pending timer
and remove it before proceeding.  I'm still digesting
the logic here, but I thought there was enough
information to be of use.  Suggestions and warnings
welcome.
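
To make the second option concrete, here is a minimal
userspace sketch of a release path that looks at the timer
itself rather than at the flag.  Everything named sketch_* is
hypothetical and the bool again stands in for "linked in the
cascade".  In the kernel the natural primitives would
presumably be timer_pending() and del_timer_sync(), but
whether they are safe to call from this path is exactly the
open question, so treat this as an illustration of the idea
and not as a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not kernel code. */
#define SKETCH_HAS_TIMER 1u         /* stands in for RPC_TASK_HAS_TIMER */

struct sketch_task {
    unsigned runstate;              /* stands in for tk_runstate */
    bool timer_queued;              /* true while the timer is in the cascade */
};

/* State left behind by the race: timer queued, flag already cleared. */
static struct sketch_task sketch_stale_state(void)
{
    struct sketch_task t = { 0, true };
    return t;
}

/*
 * Release that trusts only the flag.
 * Returns true if the task is safe to free (timer not queued).
 */
static bool sketch_release_flag_only(struct sketch_task *t)
{
    assert(t != NULL);
    if (t->runstate & SKETCH_HAS_TIMER) {
        t->runstate &= ~SKETCH_HAS_TIMER;
        t->timer_queued = false;
    }
    return !t->timer_queued;
}

/*
 * Release that checks the timer directly and dequeues it
 * regardless of what the flag says.
 * Returns true if the task is safe to free (timer not queued).
 */
static bool sketch_release_check_timer(struct sketch_task *t)
{
    assert(t != NULL);
    if (t->timer_queued)
        t->timer_queued = false;    /* detect and remove pending timer */
    t->runstate &= ~SKETCH_HAS_TIMER;
    return !t->timer_queued;
}
```

Starting from sketch_stale_state(), the flag-only release
reports the task as unsafe to free, while the variant that
checks the timer removes it first.  The sketch ignores the
concurrency a real fix would have to handle (the timer
callback may be running on another CPU at that moment).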

-john


-- 
john.cooper@timesys.com
