xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Dario Faggioli <dario.faggioli@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	TimDeegan <tim@xen.org>, Julien Grall <julien.grall@arm.com>,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer()
Date: Wed, 27 Sep 2017 12:18:27 +0200	[thread overview]
Message-ID: <1506507507.17428.1.camel@citrix.com> (raw)
In-Reply-To: <59CB7B5D0200007800180006@prv-mh.provo.novell.com>


[-- Attachment #1.1: Type: text/plain, Size: 4151 bytes --]

On Wed, 2017-09-27 at 02:20 -0600, Jan Beulich wrote:
> > > > On 26.09.17 at 20:11, <dario.faggioli@citrix.com> wrote:
> > it means there is an event that *is* in progress right now (i.e.,
> > we're
> > stopping the timer on the path that goes from the interrupt that
> > raised
> > TIMER_SOFTIRQ, and the timer softirq handler).
> > 
> > So, basically, I'm trying to achieve exactly what you call
> > reasonable:
> > servicing an event which is in progress. :-)
> 
> I'm afraid I don't understand - if the processing of the timer is
> already in progress, why would you need to raise another
> softirq? Wouldn't it suffice to simply skip deactivate_timer() then?
>
Ah, yes! In the use case I'm trying to fix, that's indeed not
necessary, as it has already been risen.

Basically, as it's harmless to re-raise it, I thought to do so, with
the aim of making it easier to understand what the code was trying to
achieve. But now I actually agree with you that the effect is quite the
opposite. :-(

I will get rid of the re-raising, and explain the situation better in
the both the comment and the changelog.

> This raising of the softirq is what made me imply that, perhaps
> due to some minimal skew e.g. resulting from rounding, you
> assume the interrupt did not fire yet, which I'd then call the
> timer event not being in progress (yet).
> 
Sure, I totally see it now, and I also totally agree.

> In the end what I think I'm missing is a clear description of an
> actual
> case where the current behavior causes breakage (plus of course
> an explanation why the new behavior is unlikely to cause issues with
> existing users).
> 
So, the problem is that the handler of the RCU idle_timer, introduced
by 2b936ea7b716dc1a13c ("xen: RCU: avoid busy waiting until the end of
grace period."), never runs.

And that is because the following happens:
- the CPU wants to go idle
- sched_tick_suspend()
    rcu_idle_timer_start()
      set_timer(RCU_idle_timer)
- the CPU goes idle
  ... ... ...
- RCU_idle_timer's IRQ arrives
- the CPU wakes up
- raise_softirq(TIMER_SOFTIRQ)
- sched_tick_resume()
    rcu_idle_timer_stop()
      stop_timer(RCU_idle_timer)
        deactivate_timer(RCU_idle_timer)
          remove_entry(RCU_idle_timer) // timer out of heap/list
- do_softirq() (we are inside idle_loop())
- softirq_handlers[TIMER_SOFTIRQ]()
- timer_softirq_action()
    // but the timer is not in the heap/list!

Now, as far as the purpose of that patch goes, we're fine. In fact,
there, we "only" needed to avoid that a certain CPU (because of its
role in the grace period) would sleep too long/indefinitely. And the
fact that the CPU does wake up, because of the timer interrupt, is
enough.

However, it's been asked to try to make the logic a bit more clever,
basically for preventing RCU_idle_timer from firing too often. And
that's actually what this series is doing. But now, in order to achieve
that, I do need the timer handler to run.


About the (lack of) breakage of existing use cases. Well, hard to tell
like this, but I'd say that, if, right now, we are not missing any
timer event handling, it means that this situation --where you stop the
timer in between raise_softirq(TIMER_SOFTIRQ) and
softirq_handler[TIMER_SOFTIRQ]()-- never happens.

Inspecting the code, in fact, I can't spot any stop_timer() happening
within that window, which I think it means we're fine (IOW,
RCU_idle_timer is the first, and for now only, timer that is stopped on
the CPU wake-up-from-idle path).

It is important (in the new version of this patch) for deactivation to
be skipped only in stop_timer(), and not, e.g., in kill_timer(). As, if
someone, in future, will want to kill and free the timer during the
window, then in that case the handler *must* not run.

Makes sense?

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

[-- Attachment #2: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-09-27 10:18 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-15 18:01 [PATCH 0/3] xen: RCU: Improve the idle timer handling Dario Faggioli
2017-09-15 18:01 ` [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer() Dario Faggioli
2017-09-26 14:59   ` Jan Beulich
2017-09-26 18:11     ` Dario Faggioli
2017-09-27  8:20       ` Jan Beulich
2017-09-27 10:18         ` Dario Faggioli [this message]
2017-09-27 10:30           ` Jan Beulich
2017-09-27 13:39             ` Dario Faggioli
2017-09-28  9:51               ` Dario Faggioli
2017-09-15 18:01 ` [PATCH 2/3] xen: RCU: make the period of the idle timer configurable Dario Faggioli
2017-09-26 15:14   ` Jan Beulich
2017-09-26 17:48     ` Dario Faggioli
2017-09-27  8:26       ` Jan Beulich
2017-09-15 18:01 ` [PATCH 3/3] xen: RCU: make the period of the idle timer adaptive Dario Faggioli
2017-09-26 15:24   ` Jan Beulich
2017-09-26 17:50     ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1506507507.17428.1.camel@citrix.com \
    --to=dario.faggioli@citrix.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=julien.grall@arm.com \
    --cc=sstabellini@kernel.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).