From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH] xen: credit2: fix spinlock irq-safety violation Date: Wed, 20 Sep 2017 13:44:04 +0200 Message-ID: <1505907844.3483.12.camel@citrix.com> References: <150578708843.32006.17195420852157192880.stgit@Solace.fritz.box> <68900559-7c07-0009-3a61-a6fe82ddd2b5@citrix.com> <3c1b015f-ad2a-9bc7-7414-3d51c13e15c1@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============6109580954544822168==" Return-path: Received: from mail6.bemta6.messagelabs.com ([193.109.254.103]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dudRa-00072x-Kd for xen-devel@lists.xenproject.org; Wed, 20 Sep 2017 11:45:30 +0000 In-Reply-To: <3c1b015f-ad2a-9bc7-7414-3d51c13e15c1@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: George Dunlap , xen-devel@lists.xenproject.org Cc: George Dunlap , Wei Liu , osstest service owner List-Id: xen-devel@lists.xenproject.org --===============6109580954544822168== Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="=-7NmCPp6iJ5Bke5Dr6kYI" --=-7NmCPp6iJ5Bke5Dr6kYI Content-Type: multipart/mixed; boundary="=-uT08BS8DssbDaZWxS3tl" --=-uT08BS8DssbDaZWxS3tl Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2017-09-20 at 11:04 +0100, George Dunlap wrote: > On 09/20/2017 10:19 AM, George Dunlap wrote: > > > diff --git a/xen/common/sched_credit2.c > > > b/xen/common/sched_credit2.c > > > index 32234ac..7a550db 100644 > > > --- a/xen/common/sched_credit2.c > > > +++ b/xen/common/sched_credit2.c > > > @@ -2923,13 +2923,13 @@ csched2_free_domdata(const struct > > > scheduler *ops, void *data) > > > =C2=A0 > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0write_lock_irqsave(&prv->lock, flags); > > > =C2=A0 > > > -=C2=A0=C2=A0=C2=A0=C2=A0kill_timer(sdom->repl_timer); > > > -=C2=A0=C2=A0=C2=A0=C2=A0xfree(sdom->repl_timer); > > > - > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0list_del_init(&sdom->sdom_elem); > > > =C2=A0 > > > =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0write_unlock_irqrestore(&prv->lock, fla= gs); > > > =C2=A0 > > > +=C2=A0=C2=A0=C2=A0=C2=A0kill_timer(sdom->repl_timer); > > > +=C2=A0=C2=A0=C2=A0=C2=A0xfree(sdom->repl_timer); > >=20 > > Any particular reason for moving the kill_timer() as well as the > > xfree > > outside the lock?=C2=A0=C2=A0What happens if the timer goes off after t= he > > irqrestore but before the kill_timer? >=20 > Looks like if the timer fires, nothing terribly bad will happen; it > will > just do a useless budget replenishment. >=20 It's just that it has not reason to be there, as nothing that it does is serialized by prv->lock, so it only makes the critical section (with IRQ disabled) longer, for no reason, which is bad, as this being a write_lock(), it'll stop readers too. I think it was (my) mistake to put it there in the first place. > It looks like kill_timer() disables irqs, so it could be moved inside > the critical section.=C2=A0=C2=A0I'm inclined to say we should do so.=C2= =A0=C2=A0I don't > anticipate the budget replenishment ever to need to walk the domain > list, but should that change, this would be a bear of a bug to find. >=20 Indeed it's rather unlikely for the replenishment handler to have to use sdom_elem to go through the list of domains. IAC, if you're concerned about that, I'd much rather put both kill_timer() and xfree() before the critical section, rather than after, like in the attached patch. Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-uT08BS8DssbDaZWxS3tl Content-Type: message/rfc822; name="xen-credit2-fix-spinlock" Content-Disposition: attachment; filename="xen-credit2-fix-spinlock" From: Dario Faggioli Date: Wed, 20 Sep 2017 13:42:51 +0200 Subject: No Subject Message-ID: <1505907771.3483.11.camel@citrix.com> Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable [PATCH v2] xen: credit2: fix spinlock irq-safety violation In commit ad4b3e1e9df34 ("xen: credit2: implement utilization cap") xfree() was being called (for deallocating the budget replenishment timer, during domain destruction) inside an IRQ disabled critical section. That must not happen, as it uses the mem-pool's lock, which needs to be taken with IRQ enabled. And, in fact, we crash (in debug builds): (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) Xen BUG at spinlock.c:47 (XEN) **************************************** Let's, therefore, kill and deallocate the timer outside of the critical sections, when IRQs are enabled. Signed-off-by: Dario Faggioli --- Cc: osstest service owner Cc: George Dunlap Cc: Wei Liu --- This was spotted by OSSTest's flight 113562: http://logs.test-lab.xenproject.org/osstest/logs/113562/ http://logs.test-lab.xenproject.org/osstest/logs/113562/test-amd64-amd64-x= l-credit2/serial-godello0.log --- Changes from v1: - kill_timer() and xfree() moved above critical section, instead than below= . diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c index 32234ac..32b0363 100644 --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -2921,11 +2921,11 @@ csched2_free_domdata(const struct scheduler *ops, v= oid *data) struct csched2_dom *sdom =3D data; struct csched2_private *prv =3D csched2_priv(ops); =20 - write_lock_irqsave(&prv->lock, flags); - kill_timer(sdom->repl_timer); xfree(sdom->repl_timer); =20 + write_lock_irqsave(&prv->lock, flags); + list_del_init(&sdom->sdom_elem); =20 write_unlock_irqrestore(&prv->lock, flags); --=-uT08BS8DssbDaZWxS3tl-- --=-7NmCPp6iJ5Bke5Dr6kYI Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZwlSFAAoJEBZCeImluHPuWDwQAIQ9fVZS35HOibMhyOuLBFlU fc/xa0gA7pRkvM6f3/vuhiYLh3P6qAXpyyfye9Gzm/itvgFLIEgWfbqlRBvexD6/ gl7qZmLIuik3zuXHrI0Hth9QMgx1N2Xvh277yU8adTrGP5OIPRjI/kSWMwp+0o9n Rg57RDk0rl/UZmi7X8reepQDrOEpcZ5EnQe2g+f+fEX1K/P19zaEaCX4o3u5D14c lVgUsK7SsF7K8BYEyNvKjCK01mb1GyOn2UsVxXg0fu8RzTPFaETG2vOjnF5QlcZV yKwOGwU3VMsbj1QpGKAD5u4S9IXNDzhEhKXURmU/9FKIAA0a3p7wpwIeyN01uTbI Fce1UTh8znyUkple62Nbfp8HB0sQDHdZOA00nUpRUynZ1Rb4sFWpEHOTXtTFvK4X pt0KJLvXb9vhjVEGBZL8t08gojwi4GitOrmU52DlBS3pQdlF4VWx5w5Cfn1+SMpd +skVC+1dSNRqEg2o2GEBdYSEB3cyRxJ/EQd0TqqN5lcyWFZjiOO7IsYeyQfMDFTj 5spKrErpM29jpE7MLQrvwwYLN451YXrplDvgxa+674ykfone3/pH4uoHDsp1w1mV LpYkjkjwJKXx+QejqxOfGZBWYWat6fSTcaTie3YfcylbtkxYHsL8ziM8gL+G1Bs4 EqkVcp9yocOVP+UwtwsJ =2uLh -----END PGP SIGNATURE----- --=-7NmCPp6iJ5Bke5Dr6kYI-- --===============6109580954544822168== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwczovL2xpc3RzLnhlbi5v cmcveGVuLWRldmVsCg== --===============6109580954544822168==--