From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH 1/3] xen: timers: don't miss a timer event because of stop_timer() Date: Tue, 26 Sep 2017 20:11:00 +0200 Message-ID: <1506449460.27663.37.camel@citrix.com> References: <150549814200.30442.14095065559616595274.stgit@Solace.fritz.box> <150549847722.30442.110948122518687861.stgit@Solace.fritz.box> <59CA8789020000780017FDDC@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============8872994343837180014==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dwuK5-0004sl-ER for xen-devel@lists.xenproject.org; Tue, 26 Sep 2017 18:11:09 +0000 In-Reply-To: <59CA8789020000780017FDDC@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: Jan Beulich Cc: Stefano Stabellini , George Dunlap , Andrew Cooper , Tim Deegan , Julien Grall , xen-devel@lists.xenproject.org List-Id: xen-devel@lists.xenproject.org --===============8872994343837180014== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-Vjq4R9ZeYvByeUCxgnDk" --=-Vjq4R9ZeYvByeUCxgnDk Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, 2017-09-26 at 08:59 -0600, Jan Beulich wrote: > > > > On 15.09.17 at 20:01, wrote: > > --- a/xen/common/timer.c > > +++ b/xen/common/timer.c > > @@ -217,7 +217,7 @@ static inline void activate_timer(struct timer > > *timer) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0timer->status =3D TIMER_STATUS_invalid; > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0list_del(&timer->inactive); > > =C2=A0 > > -=C2=A0=C2=A0=C2=A0=C2=A0if ( add_entry(timer) ) > > +=C2=A0=C2=A0=C2=A0=C2=A0if ( add_entry(timer) || timer->expires <=3D N= OW() ) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0cpu_raise_softirq= (timer->cpu, TIMER_SOFTIRQ); > > =C2=A0} >=20 > I'm not convinced of this change - it's unrelated to what title and > description say,=20 > You're right. This should either be mentioned, or dropped (or live in another patch). > > @@ -326,7 +326,17 @@ void stop_timer(struct timer *timer) > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0return; > > =C2=A0 > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( active_timer(timer) ) > > -=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0deactivate_timer(timer= ); > > +=C2=A0=C2=A0=C2=A0=C2=A0{ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0/* > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* If the timer i= s expired already, 'call' the softirq > > handler to > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* execute it (it= will leave it inactive after that). If > > not, just > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0* deactivate it. > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0*/ > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0if ( timer->expires <= =3D NOW() ) > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0cpu_raise_softirq(timer->cpu, TIMER_SOFTIRQ); > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0else > > +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0deactivate_timer(timer); > > +=C2=A0=C2=A0=C2=A0=C2=A0} >=20 > Isn't it reasonable to expect current behavior, i.e. the timer not > raising any events other than those already in progress? >=20 Well, if we managed to get to here, with the timer that is both: - active, - with its expiry time in the past, it means there is an event that *is* in progress right now (i.e., we're stopping the timer on the path that goes from the interrupt that raised TIMER_SOFTIRQ, and the timer softirq handler). So, basically, I'm trying to achieve exactly what you call reasonable: servicing an event which is in progress. :-) The alternative is that the event happened, but we somehow managed to miss running the timer handler for it, and we realize this only now, potentially a long time after the miss. This scenario, as far as I can tell, can't happen, but if it could/does, I'd still say running the handler late is better than not running it at all. > Additionally I'm not convinced the changed actually does what you > want: What if NOW() becomes equal to timer->expires immediately > after you've managed to obtain its value, before execution makes > it into deactivate_timer()? IOW you're shrinking a time window which > (I think) you really mean to eliminate. >=20 Well, I certainly don't like the window to be there, and closing it would be ideal, IMO. However, in this patch, I'm addressing the specific situation of when we are stopping a timer for which an interrupt has already fired, the interrupt's ISR has already raised TIMER_SOFTIRQ, and yet we don't run the handler. Yes, if an interrupt is about to be raised, and/or it arrives _while_ we are inside this function (after the check), or already in deactivate_timer(), we also end up not running the handler. I guess these can be seen as two aspects of the same problem, or as conceptually different issues, but whatever you consider them, getting rid of the former is something I consider an improvement. I certainly can try to state the problem better, and describe the situation more clearly in the changelog. > Furthermore, taking both changes together: What if the same > you try to address in stop_timer() happens in set_timer()? Wouldn't > it then be only consistent to also require a timer even to fire for > the > old expiry value, before the new one is being set? >=20 Yes, personally, I think that, whenever it is that we figure out that we missed handling a timer interrupt, we should run it then. Unfortunately, for achieving this, e.g., in the set_timer() case, I don't see much alternatives to call execute_timer() right in there. But that would violate the current invariant that execute_timer() only run from the TIMER_SOFTIRQ handler... which is probably doable, but is at the same time unrelated to the problem I'm tackling here, and I would rather do it in a dedicated series. Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-Vjq4R9ZeYvByeUCxgnDk Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZypg0AAoJEBZCeImluHPuk8kP/2iQlCwendM7G2AJxdZAErKO X53HnVGUm87qkjYqTcBfNaBvCpXGnQWvvjmCsnak5I7uabptPEnXp7tCgupQ9HBq IOsz4TkRgpSQsg/sFaTE4Sco/Obsr1BBcKrw4VWd16eST7JNlXL2o4mHqWDHJ8G4 Y7ckIXfDsT5tWTfbkvivlYSeXDY6e9R0RTfFCdu2E/jH35Y7OcIOx9BXOqki225l BzuauLS0JLGBr1DGWG3B/570EQFpVXmoo5E8qQI4gTHsAhx9fXBJwD+sGCagjJl1 +qeDJCEN9Ev1BvTRnT4jipkBkFplPf3Ah4YU1SsToG4TBUIbn08Ngb36zlBD0p4y plw7tEniLYLltSa4N033X6xofrkIhAITVaHFmMHyCBXWi6ARjwFfsQY72ZQaa/GH 24cAoay7bEGkGNforZe7UqMnSlyByjQks3Rk7M+zMY0qZvn7sVpbgog8MYZVnnIp W9J9TjYmu8VspEmwmSUMdpRTfMVdYg7BSp6laVCRNFeIPqMZJ8t5OWqLfSET0K76 VXDUe5qHAitNqHI+Z/zzSLdTz6WcfWculu1VhCRKew4naBZ76Nl3UrKjRPl19Li1 Znh6TgQ6P2rDADmgZzRfgUN1qrbgDGH1N+C60TBRWp7h+4xyCmA631c9+nV0VR2M 0kfB727dp/657BvW/bEB =jssD -----END PGP SIGNATURE----- --=-Vjq4R9ZeYvByeUCxgnDk-- --===============8872994343837180014== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============8872994343837180014==--