From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Bader
Subject: Re: Xen PVM: Strange lockups when running PostgreSQL load
Date: Fri, 19 Oct 2012 16:03:13 +0200
Message-ID: <50815DA1.10303@canonical.com>
In-Reply-To: <5081387B02000078000A288D@nat28.tlf.novell.com>
To: Jan Beulich
Cc: Andrew Cooper, "xen-devel@lists.xen.org", Ian Campbell, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

On 19.10.2012 11:24, Jan Beulich wrote:
>>>> On 19.10.12 at 10:33, Stefan Bader wrote:
>> On 19.10.2012 10:06, Jan Beulich wrote:
>>>>>> On 18.10.12 at 22:52, Stefan Bader wrote:
>>>> Actually I begin to suspect that it could be possible that I just
>>>> overlooked the most obvious thing. Provoking question: are we sure
>>>> we are on the same page about the purpose of the spin_lock_flags
>>>> variant of the pv lock ops interface?
>>>>
>>>> I begin to suspect that it really is not for giving a chance to
>>>> re-enable interrupts. Just what it should be used for I am not clear.
>>>> Anyway it seems all other places more or less ignore the flags and map
>>>> themselves back to an ignorant version of spinlock. Also I believe that
>>>> the only high level function that would end up passing any flags
>>>> would be the spin_lock_irqsave one. And I am pretty sure that this one
>>>> will expect interrupts to stay disabled.
>>>
>>> No - the only requirement here is that from the point on where the lock
>>> is owned interrupts must remain disabled. Re-enabling intermediately is
>>> quite okay (and used to be done by the native kernel prior to the
>>> conversion to ticket locks iirc).
>>
>> Though it seems rather dangerous then. Don't remember the old code, but imo
>> it always opens up a (even microscopic) window to unexpected
>> interruptions.
>
> There just can't be unexpected interruptions. Whenever interrupts are
> enabled, it is expected that they can occur.

Probably one thing that makes things a bit more complicated is that in the
PVM case interrupts map to vcpu->evtchn_upcall_mask.

>
>>>> So I tried below approach and that seems to be surviving the previously
>>>> breaking testcase for much longer than anything I tried before.
>>>
>>> If that indeed fixes your problem, then (minus eventual problems with the
>>> scope of the interrupts-enabled window) this rather points at a bug in
>>> the users of the spinlock interfaces.
>>
>> I would be pragmatic here, none of the other current implementations seem
>> to re-enable interrupts and so this only affects xen pv.
>
> I don't think you really checked - the first arch I looked at (s390, as being
> the most obvious one to look at when it comes to virtualization) quite
> prominently re-enables interrupts in arch_spin_lock_wait_flags().

No, since you said the native kernel did this prior to the ticket lock
conversion, I assumed that checking would involve a more historic search. And
yes, s390 has been doing virtualization quite a bit back into history. Just
not paravirtualization.

And when I look at arch_spin_lock_wait_flags(), enabling/disabling is done
close by (at least I am not leaving off into some hypercall fog).

>
>> And how much really is gained from enabling it compared to the risk of
>> being affected by something that nobody else will be?
>
> The main difference between the native and virtualized cases is that the
> period of time you spend waiting for the lock to become available is pretty
> much unbounded (even more so without ticket locks), and keeping interrupts
> disabled for such an extended period of time is just going to ask for other
> problems.

So not sure I followed all of the right paths here, but I think xen_poll_irq
ends up doing a hypercall via syscall. Syscall seems to mask off some bits of
the flags (maybe irq) but certainly that will not translate automatically into
the upcall mask.

Then, again hopefully the right place, in the mini-os part, in the
hypervisor_callback there is some annotation about leaving the events masked
off until having checked for being already interrupted. Could be the same mask
that is checked here that the guest has cleared to enable interrupts... Would
that be expected?

>
> And note that this isn't the case just for Xen PV - all virtualization
> scenarios suffer from that.
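
For illustration only, here is a minimal userspace sketch in C of the rule
quoted earlier: interrupts may be re-enabled while waiting, as long as they are
disabled again before the lock can be taken. This is not the kernel or Xen
code. Every name in it (sketch_irq_masked, sketch_lock_flags, and so on) is
hypothetical, the boolean merely stands in for something like the upcall mask,
and the bounded loop exists only so the sketch terminates without a second
thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the event mask: true means "interrupts disabled".
 * This is NOT the real Xen or Linux data structure. */
static bool sketch_irq_masked = false;

/* Hypothetical counter: how often the wait loop opened an interrupt window. */
static unsigned long sketch_irq_windows = 0;

typedef struct {
    atomic_flag held;
} sketch_lock_t;

/* Roughly the shape of an irqsave: remember the old state, then mask. */
static bool sketch_irq_save(void)
{
    bool was_masked = sketch_irq_masked;
    sketch_irq_masked = true;
    return was_masked;
}

/* Put the mask back to what the caller saved earlier. */
static void sketch_irq_restore(bool was_masked)
{
    sketch_irq_masked = was_masked;
}

/* Returns true if the lock was free and is now owned by the caller. */
static bool sketch_trylock(sketch_lock_t *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

static void sketch_unlock(sketch_lock_t *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/*
 * Try to take the lock, using the saved flags to decide whether waiting with
 * interrupts enabled is allowed. max_spins bounds the loop (an assumption of
 * this sketch only). Returns false if the lock was never acquired.
 */
static bool sketch_lock_flags(sketch_lock_t *lock, bool was_masked,
                              unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        /* Every acquisition attempt runs with interrupts masked, so the
         * lock is never owned while interrupts are enabled. */
        assert(sketch_irq_masked);
        if (sketch_trylock(lock))
            return true;

        if (!was_masked) {
            /* The caller had interrupts enabled before the irqsave, so a
             * window may be opened while waiting. */
            sketch_irq_masked = false;
            sketch_irq_windows++;
            /* A real implementation would spin or block here. */
            sketch_irq_masked = true;   /* closed again before the retry */
        }
    }
    return false;
}
```

If the caller already had interrupts disabled before saving the flags
(was_masked is true), the loop never opens a window, which corresponds to the
"ignore the flags" behaviour of the other implementations mentioned in the
thread.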
>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
>