public inbox for linux-kernel@vger.kernel.org
From: john cooper <john.cooper@timesys.com>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Olaf Kirch <okir@suse.de>, john cooper <john.cooper@timesys.com>
Subject: Re: RT and Cascade interrupts
Date: Sat, 28 May 2005 10:02:37 -0400	[thread overview]
Message-ID: <429879FD.30002@timesys.com> (raw)
In-Reply-To: <42983135.C521F1C8@tv-sign.ru>

Oleg Nesterov wrote:

[race scenario deleted]

This was not the race scenario of concern.

> This is totally pointless to do:
> 
> 	if (timer_pending())
> 		del_singleshot_timer_sync();

However, this is the action the RPC code was originally
attempting to take, based upon stale data.  Again, the
failure mode was that the RPC_TASK_HAS_TIMER bit was not
set while the embedded timer actually was still present
in the timer cascade structure (i.e., timer->base != NULL
and timer_pending() true).  This caused the rpc_task to
be recycled with the embedded timer struct still linked
into the timer cascade.

> If you need to ensure that timer's handler is not running on any
> cpu then timer_pending() can't help. If you don't need this, you
> should use plain del_timer().

That's not the goal of the timer_pending() usage here.
Rather, we're at a point in rpc_release_task() where we
want to tear down an rpc_task.  The call to timer_pending()
determines whether the embedded timer is still linked into
the timer cascade structure.

The embedded timer may be running on another CPU.  We
may even observe timer->base as non-NULL during this race
and wind up attempting to delete a timer which the path
running on another CPU has already deleted.  That is
allowable; we simply want to idle the timer.

> I don't understand why this race is rt specific. I think
> that PREEMPT_SOFTIRQS just enlarges the window.

That is certainly possible.  However, this code has been
around for some time and this problem apparently hasn't
surfaced otherwise.  I suspect it depends on the
combination of the timer cascade structures existing
per-CPU and the execution context of the callback(task)
invoked in rpc_run_timer() now being preemptible (i.e.,
preemption on the same CPU is required for this
problem to surface).

> I believe
> the right fix is just to call del_singleshot_timer_sync()
> unconditionally and kill RPC_TASK_HAS_TIMER bit.

Ignoring RPC_TASK_HAS_TIMER and always calling
del_singleshot_timer_sync() will work but seems
excessive.

> I have added Olaf Kirch to the CC: list.

Welcome.

>>The fix removes the attempt to replicate timer queue
>>status from the RPC code by removing the usage of the
>>pre-calculated RPC_TASK_HAS_TIMER.
> 
> 
> No, RPC_TASK_HAS_TIMER doesn't replicate timer queue status.
> I believe, it was intended as optimization, to avoid costly
> del_timer_sync() call.

If so, its chosen name seems a misnomer.  In any case
the embedded timer in the rpc_task structure must be
quiesced before the rpc_task is made available for
reuse, and the fix under discussion here does so.  Note
the goal was not to rearchitect the RPC code but rather
to address this very specific bug, which surfaced during
stress testing.

If anyone with more ownership of the RPC code would like
to comment, any insight would be most welcome.

-john


-- 
john.cooper@timesys.com
