From: Dario Faggioli
Subject: Re: [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model
Date: Fri, 26 Feb 2016 19:09:30 +0100
Message-ID: <1456510170.2959.210.camel@citrix.com>
References: <1456430736-4606-1-git-send-email-tiche@seas.upenn.edu> <1456443078.2959.85.camel@citrix.com> <56CFDF85.504@seas.upenn.edu> <1456477891.2959.132.camel@citrix.com> <56D08B50.5090301@seas.upenn.edu>
In-Reply-To: <56D08B50.5090301@seas.upenn.edu>
To: Tianyang Chen, xen-devel@lists.xenproject.org
Cc: george.dunlap@citrix.com, Dagaen Golomb, Meng Xu

On Fri, 2016-02-26 at 12:28 -0500, Tianyang Chen wrote:
> > So, have you made other changes wrt v6 when trying this?
> The v6 doesn't have the if statement commented out when I submitted
> it.
> But I tried commenting it out, the assertion failed.
>
Ok, thanks for these tests.

Can you send (just quick-&-dirtily, as an attachment to a reply to this
email, no need for a proper re-submission of a new version) the patch
that does this:

> rt_vcpu_sleep(): removing replenishment event if the vcpu is on
> runq/depletedq
> rt_context_saved(): removing replenishment events if not runnable
> rt_vcpu_wake(): not checking if the event is already queued.
>
> I added debug prints in all these functions and noticed that it could be
> caused by racing between spurious wakeups and context switching.
>
And the code that produces this debug output as well?

> (XEN) cpu1 picked idle
> (XEN) d0 attempted to change d0v1's CR4 flags 00000620 -> 00040660
> (XEN) cpu2 picked idle
> (XEN) vcpu1 sleeps on cpu
> (XEN) cpu0 picked idle
> (XEN) vcpu1 context saved not runnable
> (XEN) vcpu1 wakes up nowhere
> (XEN) cpu0 picked vcpu1
> (XEN) vcpu1 sleeps on cpu
> (XEN) cpu0 picked idle
> (XEN) vcpu1 context saved not runnable
> (XEN) vcpu1 wakes up nowhere
> (XEN) cpu0 picked vcpu1
> (XEN) cpu0 picked idle
> (XEN) vcpu1 context saved not runnable
> (XEN) cpu0 picked vcpu0
> (XEN) vcpu1 wakes up nowhere
> (XEN) cpu1 picked vcpu1     *** vcpu1 is on a cpu
> (XEN) cpu1 picked idle      *** vcpu1 is waiting to be context switched
> (XEN) vcpu2 wakes up nowhere
> (XEN) cpu0 picked vcpu0
> (XEN) cpu2 picked vcpu2
> (XEN) cpu0 picked vcpu0
> (XEN) cpu0 picked vcpu0
> (XEN) d0 attempted to change d0v2's CR4 flags 00000620 -> 00040660
> (XEN) cpu0 picked vcpu0
> (XEN) vcpu2 sleeps on cpu
> (XEN) vcpu1 wakes up nowhere      *** vcpu1 wakes up without sleep?
>
> (XEN) Assertion '!__vcpu_on_replq(svc)' failed at sched_rt.c:526
> (XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[] sched_rt.c#rt_vcpu_wake+0x11f/0x17b
> ...
> (XEN) Xen call trace:
> (XEN)    [] sched_rt.c#rt_vcpu_wake+0x11f/0x17b
> (XEN)    [] vcpu_wake+0x213/0x3d4
> (XEN)    [] vcpu_unblock+0x4b/0x4d
> (XEN)    [] vcpu_kick+0x20/0x6f
> (XEN)    [] vcpu_mark_events_pending+0x2c/0x2f
> (XEN)    [] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
> (XEN)    [] send_guest_vcpu_virq+0x9d/0xba
> (XEN)    [] send_timer_event+0xe/0x10
> (XEN)    [] schedule.c#vcpu_singleshot_timer_fn+0x9/0xb
> (XEN)    [] timer.c#execute_timer+0x4e/0x6c
> (XEN)    [] timer.c#timer_softirq_action+0xdd/0x213
> (XEN)    [] softirq.c#__do_softirq+0x82/0x8d
> (XEN)    [] do_softirq+0x13/0x15
> (XEN)    [] cpufreq.c#process_softirqs+0x21/0x30
>
>
> So, it looks like spurious wakeup for vcpu1 happens before it was
> completely context switched off a cpu. But rt_vcpu_wake() didn't see it
> on cpu with curr_on_cpu() so it fell through the first two RETURNs.
>
> I guess the replenishment queue check is necessary for this situation?
>
Perhaps, but I first want to make sure we understand what is really
happening.
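
For reference, the ordering hypothesised above can be written down as a
toy model. This is only a sketch under assumptions, not code from
sched_rt.c: struct toy_vcpu, the toy_* functions and the guard_replq
parameter are all made up for illustration. It also assumes that the
second early return in the wake path is the "already on runq/depletedq"
case, and that a vcpu picked by a cpu has a replenishment event queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for per-vcpu scheduler state; NOT Xen's struct rt_vcpu. */
struct toy_vcpu {
    bool is_curr;   /* what a curr_on_cpu()-style check would report */
    bool on_runq;   /* queued on the runq or depletedq */
    bool on_replq;  /* replenishment event currently queued */
};

enum toy_wake_result {
    TOY_WAKE_RUNNING,     /* first early return: vcpu is current */
    TOY_WAKE_ON_RUNQ,     /* second early return: already queued to run */
    TOY_WAKE_REPLQ_ADDED, /* this wakeup queued the replenishment event */
    TOY_WAKE_REPLQ_KEPT,  /* event already queued, guard skipped the insert */
    TOY_WAKE_ASSERT_FAIL  /* unguarded insert would trip the assertion */
};

/* A cpu picks the vcpu (assumption: its replenishment event is queued). */
static void toy_pick(struct toy_vcpu *v)
{
    v->is_curr = true;
    v->on_runq = false;
    v->on_replq = true;
}

/* The cpu picks something else (e.g. idle); context is not saved yet. */
static void toy_deschedule(struct toy_vcpu *v)
{
    v->is_curr = false;
}

/* Context-saved step: drop the replenishment event if not runnable. */
static void toy_context_saved(struct toy_vcpu *v, bool runnable)
{
    if (!runnable)
        v->on_replq = false;
}

/* Wake path: two early returns, then the replenishment queue insert. */
static enum toy_wake_result toy_wake(struct toy_vcpu *v, bool guard_replq)
{
    if (v->is_curr)
        return TOY_WAKE_RUNNING;
    if (v->on_runq)
        return TOY_WAKE_ON_RUNQ;
    if (v->on_replq)
        return guard_replq ? TOY_WAKE_REPLQ_KEPT : TOY_WAKE_ASSERT_FAIL;
    v->on_replq = true;
    return TOY_WAKE_REPLQ_ADDED;
}

/* Replay: pick, deschedule, then a wakeup that beats the context save. */
static enum toy_wake_result toy_replay_early_wake(bool guard_replq)
{
    struct toy_vcpu v = { false, false, false };

    toy_pick(&v);
    toy_deschedule(&v);
    return toy_wake(&v, guard_replq);
}

/* Self-check of both variants; call it from a main() to run the model. */
static void toy_selfcheck(void)
{
    assert(toy_replay_early_wake(false) == TOY_WAKE_ASSERT_FAIL);
    assert(toy_replay_early_wake(true) == TOY_WAKE_REPLQ_KEPT);
}
```

With guard_replq false, the replay falls through both early returns and
reports TOY_WAKE_ASSERT_FAIL, which corresponds to the failed
'!__vcpu_on_replq(svc)' assertion in the trace. With it true, the wakeup
leaves the already queued event alone. Whether such a guard is the right
fix, or only hides a different problem, is what still needs to be
understood.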

Regards,
Dario
-- 
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)