From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer() Date: Wed, 27 Sep 2017 12:18:27 +0200 Message-ID: <1506507507.17428.1.camel@citrix.com> References: <150549814200.30442.14095065559616595274.stgit@Solace.fritz.box> <150549847722.30442.110948122518687861.stgit@Solace.fritz.box> <59CA8789020000780017FDDC@prv-mh.provo.novell.com> <1506449460.27663.37.camel@citrix.com> <59CB7B5D0200007800180006@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8018351742219169935==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dx9QP-000685-2B for xen-devel@lists.xenproject.org; Wed, 27 Sep 2017 10:18:41 +0000 In-Reply-To: <59CB7B5D0200007800180006@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Jan Beulich Cc: Stefano Stabellini , George Dunlap , Andrew Cooper , TimDeegan , Julien Grall , xen-devel@lists.xenproject.org List-Id: xen-devel@lists.xenproject.org --===============8018351742219169935== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-oxJNlF77ccTlPVHsZtew" --=-oxJNlF77ccTlPVHsZtew Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2017-09-27 at 02:20 -0600, Jan Beulich wrote: > > > > On 26.09.17 at 20:11, wrote: > > it means there is an event that *is* in progress right now (i.e., > > we're > > stopping the timer on the path that goes from the interrupt that > > raised > > TIMER_SOFTIRQ, and the timer softirq handler). > >=20 > > So, basically, I'm trying to achieve exactly what you call > > reasonable: > > servicing an event which is in progress. :-) >=20 > I'm afraid I don't understand - if the processing of the timer is > already in progress, why would you need to raise another > softirq? Wouldn't it suffice to simply skip deactivate_timer() then? > Ah, yes! In the use case I'm trying to fix, that's indeed not necessary, as it has already been risen. Basically, as it's harmless to re-raise it, I thought to do so, with the aim of making it easier to understand what the code was trying to achieve. But now I actually agree with you that the effect is quite the opposite. :-( I will get rid of the re-raising, and explain the situation better in the both the comment and the changelog. > This raising of the softirq is what made me imply that, perhaps > due to some minimal skew e.g. resulting from rounding, you > assume the interrupt did not fire yet, which I'd then call the > timer event not being in progress (yet). >=20 Sure, I totally see it now, and I also totally agree. > In the end what I think I'm missing is a clear description of an > actual > case where the current behavior causes breakage (plus of course > an explanation why the new behavior is unlikely to cause issues with > existing users). >=20 So, the problem is that the handler of the RCU idle_timer, introduced by 2b936ea7b716dc1a13c ("xen: RCU: avoid busy waiting until the end of grace period."), never runs. And that is because the following happens: - the CPU wants to go idle - sched_tick_suspend() rcu_idle_timer_start() set_timer(RCU_idle_timer) - the CPU goes idle ... ... ... - RCU_idle_timer's IRQ arrives - the CPU wakes up - raise_softirq(TIMER_SOFTIRQ) - sched_tick_resume() rcu_idle_timer_stop() stop_timer(RCU_idle_timer) deactivate_timer(RCU_idle_timer) remove_entry(RCU_idle_timer) // timer out of heap/list - do_softirq() (we are inside idle_loop()) - softirq_handlers[TIMER_SOFTIRQ]() - timer_softirq_action() // but the timer is not in the heap/list! Now, as far as the purpose of that patch goes, we're fine. In fact, there, we "only" needed to avoid that a certain CPU (because of its role in the grace period) would sleep too long/indefinitely. And the fact that the CPU does wake up, because of the timer interrupt, is enough. However, it's been asked to try to make the logic a bit more clever, basically for preventing RCU_idle_timer from firing too often. And that's actually what this series is doing. But now, in order to achieve that, I do need the timer handler to run. About the (lack of) breakage of existing use cases. Well, hard to tell like this, but I'd say that, if, right now, we are not missing any timer event handling, it means that this situation --where you stop the timer in between raise_softirq(TIMER_SOFTIRQ) and softirq_handler[TIMER_SOFTIRQ]()-- never happens. Inspecting the code, in fact, I can't spot any stop_timer() happening within that window, which I think it means we're fine (IOW, RCU_idle_timer is the first, and for now only, timer that is stopped on the CPU wake-up-from-idle path). It is important (in the new version of this patch) for deactivation to be skipped only in stop_timer(), and not, e.g., in kill_timer(). As, if someone, in future, will want to kill and free the timer during the window, then in that case the handler *must* not run. Makes sense? Thanks and Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-oxJNlF77ccTlPVHsZtew Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZy3r0AAoJEBZCeImluHPufZQQAIAvtsMVB1jCk6FacYcB/QQz gEn4fFrJxytOGl8FjIJbDJlUF65K3njpsHkyubsXX0XAKEJCNeeRglubpd/eBH2K W51nBFAg8akENY9NqwHNU7Hxc7GUytLpyuk9sMnR7PTknoGCxifDuNPH/mz1sE/j 1dcRQVQn4u9r2J61lJAnZWt4wTPuUM5BMHBxBLY7xNij4z0lQCg+DVw67Jy5ECSB ursqjqhCtjz2gDvPD8ECDiJ1YaS/g44zYUsk2eL8AI+lrA9sodWY17bRMLBQdy4y eRFonm+YziarfTOFEIXF+T8knvV3LYHFVvp/kO3Ljx8Ci1zLfEJmGz4qvlDXhnWL pcIOVFlN5OHKYarfXwtzGPwA7dO9jl2P+kuwWf91ksT5jTqkTCOrccktsHrdkwHS aTq8g7l6x75FUv5QveZEU1DjManX9rLtERXfDIW+oe8kQjv/2EnJ6SSOwob62XSK slBUVYBkAN2QFTWXAyqDbCx3wnYj4p1Vf8QGBCU1JFaqs5qkSBz8ugMXeVZjysok R9DKYSDD0u9mUYPkSvVr+lIL12Qyh4FJCLkSILSbKNw3h+PoF6V8rx0HUH1nSi35 ugDpjioEcfFj7+fDMxh/u9U+8NcEk6x8kzluoaGn4mml83ixinyJVIb+6p5jJOow BMytAHs4tRul6zvb1XJH =Iumc -----END PGP SIGNATURE----- --=-oxJNlF77ccTlPVHsZtew-- --===============8018351742219169935== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============8018351742219169935==--