From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stefan Bader Subject: Re: Xen PVM: Strange lockups when running PostgreSQL load Date: Wed, 17 Oct 2012 17:21:42 +0200 Message-ID: <507ECD06.2050407@canonical.com> References: <1350479456-4007-1-git-send-email-stefan.bader@canonical.com> <507EB27D.8050308@citrix.com> <1350482118.2460.74.camel@zakaz.uk.xensource.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============6214944843867296276==" Return-path: In-Reply-To: <1350482118.2460.74.camel@zakaz.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Campbell Cc: Andrew Cooper , Konrad Rzeszutek Wilk , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --===============6214944843867296276== Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="------------enigDCF9146FEFDB17A1136CD1F0" This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigDCF9146FEFDB17A1136CD1F0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 17.10.2012 15:55, Ian Campbell wrote: > On Wed, 2012-10-17 at 14:28 +0100, Andrew Cooper wrote: >> In our case, certain processes were locking up, and it turned out that= >> the kernel was issuing SCHOP_poll hypercalls (same the stack trace on >> your launchpad ticket) on its own spinlock IPI event channel >> (understandable, as its a spinlock), but with the event channel >> masked, so it would never wake up from the poll.=20 >=20 > I think (but you might want to check) that SCHEDOP_poll works (or is > supposed to work!) regardless of whether the specific evtchn you ask fo= r > is masked or not. I was assuming it to be supposed to work at least in Xen 4.1.2. Or at lea= st if it did not I would hope to catch VCPUs rather sitting on the hypercall th= an doing nothing. Of course I cannot say how reliable information in crash i= s as this is something new to do after Daniel fixed crash. >=20 > The Linux PV spinlock implementation takes advantage of this because it= > never wants to take a real interrupt from the spinlock poller evtchn. Right, I probably should have realized that. Though I guess it is still interesting to see whether the channel is pending. So when just recreating, I got the q and e info which is (assuming the gu= est domain is enough): (XEN) Event channel information for domain 1: (XEN) Polling vCPUs: {1,4,6} (XEN) port [p/m] (XEN) 1 [0/0]: s=3D3 n=3D0 d=3D0 p=3D68 x=3D0 (XEN) 2 [1/0]: s=3D3 n=3D0 d=3D0 p=3D69 x=3D0 (XEN) 3 [1/0]: s=3D5 n=3D0 v=3D0 x=3D0 (XEN) 4 [1/1]: s=3D6 n=3D0 x=3D0 (XEN) 5 [1/0]: s=3D6 n=3D0 x=3D0 (XEN) 6 [0/0]: s=3D6 n=3D0 x=3D0 (XEN) 7 [0/0]: s=3D5 n=3D0 v=3D1 x=3D0 (XEN) 8 [0/0]: s=3D6 n=3D0 x=3D0 (XEN) 9 [1/0]: s=3D5 n=3D1 v=3D0 x=3D0 (XEN) 10 [0/1]: s=3D6 n=3D1 x=3D0 (XEN) 11 [1/0]: s=3D6 n=3D1 x=3D0 (XEN) 12 [0/0]: s=3D6 n=3D1 x=3D0 (XEN) 13 [0/0]: s=3D5 n=3D1 v=3D1 x=3D0 (XEN) 14 [0/0]: s=3D6 n=3D1 x=3D0 (XEN) 15 [0/0]: s=3D5 n=3D2 v=3D0 x=3D0 (XEN) 16 [1/1]: s=3D6 n=3D2 x=3D0 (XEN) 17 [0/0]: s=3D6 n=3D2 x=3D0 (XEN) 18 [0/0]: s=3D6 n=3D2 x=3D0 (XEN) 19 [0/0]: s=3D5 n=3D2 v=3D1 x=3D0 (XEN) 20 [0/0]: s=3D6 n=3D2 x=3D0 (XEN) 21 [0/0]: s=3D5 n=3D3 v=3D0 x=3D0 (XEN) 22 [1/1]: s=3D6 n=3D3 x=3D0 (XEN) 23 [0/0]: s=3D6 n=3D3 x=3D0 (XEN) 24 [0/0]: s=3D6 n=3D3 x=3D0 (XEN) 25 [0/0]: s=3D5 n=3D3 v=3D1 x=3D0 (XEN) 26 [0/0]: s=3D6 n=3D3 x=3D0 (XEN) 27 [1/0]: s=3D5 n=3D4 v=3D0 x=3D0 (XEN) 28 [0/1]: s=3D6 n=3D4 x=3D0 (XEN) 29 [1/0]: s=3D6 n=3D4 x=3D0 (XEN) 30 [0/0]: s=3D6 n=3D4 x=3D0 (XEN) 31 [0/0]: s=3D5 n=3D4 v=3D1 x=3D0 (XEN) 32 [0/0]: s=3D6 n=3D4 x=3D0 (XEN) 33 [0/0]: s=3D5 n=3D5 v=3D0 x=3D0 (XEN) 34 [0/1]: s=3D6 n=3D5 x=3D0 (XEN) 35 [0/0]: s=3D6 n=3D5 x=3D0 (XEN) 36 [0/0]: s=3D6 n=3D5 x=3D0 (XEN) 37 [0/0]: s=3D5 n=3D5 v=3D1 x=3D0 (XEN) 38 [0/0]: s=3D6 n=3D5 x=3D0 (XEN) 39 [1/0]: s=3D5 n=3D6 v=3D0 x=3D0 (XEN) 40 [0/1]: s=3D6 n=3D6 x=3D0 (XEN) 41 [1/0]: s=3D6 n=3D6 x=3D0 (XEN) 42 [0/0]: s=3D6 n=3D6 x=3D0 (XEN) 43 [0/0]: s=3D5 n=3D6 v=3D1 x=3D0 (XEN) 44 [0/0]: s=3D6 n=3D6 x=3D0 (XEN) 45 [0/0]: s=3D5 n=3D7 v=3D0 x=3D0 (XEN) 46 [1/1]: s=3D6 n=3D7 x=3D0 (XEN) 47 [0/0]: s=3D6 n=3D7 x=3D0 (XEN) 48 [0/0]: s=3D6 n=3D7 x=3D0 (XEN) 49 [0/0]: s=3D5 n=3D7 v=3D1 x=3D0 (XEN) 50 [0/0]: s=3D6 n=3D7 x=3D0 (XEN) 51 [0/0]: s=3D3 n=3D7 d=3D0 p=3D70 x=3D0 (XEN) 52 [0/0]: s=3D3 n=3D0 d=3D0 p=3D71 x=3D0 (XEN) 53 [0/0]: s=3D3 n=3D0 d=3D0 p=3D72 x=3D0 (XEN) 54 [0/0]: s=3D3 n=3D0 d=3D0 p=3D73 x=3D0 (XEN) 55 [1/0]: s=3D3 n=3D0 d=3D0 p=3D74 x=3D0 [maybe someone can tell me what the s,n,d,p and x mean] (XEN) Rangesets belonging to domain 1: (XEN) I/O Ports { } (XEN) Interrupts { } (XEN) I/O Memory { } (XEN) Memory pages belonging to domain 1: (XEN) DomPage list too long to display (XEN) XenPage 00000000008000ab: caf=3Dc000000000000002, taf=3D7400000= 000000002 (XEN) XenPage 00000000008000aa: caf=3Dc000000000000002, taf=3D7400000= 000000002 (XEN) XenPage 00000000008000a9: caf=3Dc000000000000002, taf=3D7400000= 000000002 (XEN) XenPage 00000000008000a8: caf=3Dc000000000000001, taf=3D7400000= 000000001 (XEN) XenPage 00000000000dfae4: caf=3Dc000000000000002, taf=3D7400000= 000000002 (XEN) VCPU information and callbacks for domain 1: (XEN) VCPU0: CPU3 [has=3DT] flags=3D0 poll=3D0 upcall_pend =3D 01, up= call_mask =3D 01 dirty_cpus=3D{3} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU1: CPU7 [has=3DF] flags=3D1 poll=3D10 upcall_pend =3D 01, u= pcall_mask =3D 01 dirty_cpus=3D{} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU2: CPU4 [has=3DF] flags=3D1 poll=3D0 upcall_pend =3D 00, up= call_mask =3D 00 dirty_cpus=3D{} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU3: CPU5 [has=3DF] flags=3D1 poll=3D0 upcall_pend =3D 00, up= call_mask =3D 00 dirty_cpus=3D{} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU4: CPU6 [has=3DF] flags=3D1 poll=3D28 upcall_pend =3D 01, u= pcall_mask =3D 01 dirty_cpus=3D{} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU5: CPU7 [has=3DF] flags=3D1 poll=3D0 upcall_pend =3D 00, up= call_mask =3D 00 dirty_cpus=3D{7} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU6: CPU0 [has=3DF] flags=3D1 poll=3D40 upcall_pend =3D 01, u= pcall_mask =3D 01 dirty_cpus=3D{} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) VCPU7: CPU6 [has=3DT] flags=3D0 poll=3D0 upcall_pend =3D 00, up= call_mask =3D 01 dirty_cpus=3D{6} cpu_affinity=3D{0-127} (XEN) No periodic timer (XEN) Notifying guest 0:0 (virq 1, port 5, stat 0/0/0) (XEN) Notifying guest 0:1 (virq 1, port 11, stat 0/0/0) (XEN) Notifying guest 0:2 (virq 1, port 17, stat 0/0/0) (XEN) Notifying guest 0:3 (virq 1, port 23, stat 0/0/0) (XEN) Notifying guest 0:4 (virq 1, port 29, stat 0/0/0) (XEN) Notifying guest 0:5 (virq 1, port 35, stat 0/0/0) (XEN) Notifying guest 0:6 (virq 1, port 41, stat 0/0/0) (XEN) Notifying guest 0:7 (virq 1, port 47, stat 0/0/0) (XEN) Notifying guest 1:0 (virq 1, port 7, stat 0/0/-1) (XEN) Notifying guest 1:1 (virq 1, port 13, stat 0/0/-1) (XEN) Notifying guest 1:2 (virq 1, port 19, stat 0/0/0) (XEN) Notifying guest 1:3 (virq 1, port 25, stat 0/0/0) (XEN) Notifying guest 1:4 (virq 1, port 31, stat 0/0/-1) (XEN) Notifying guest 1:5 (virq 1, port 37, stat 0/0/0) (XEN) Notifying guest 1:6 (virq 1, port 43, stat 0/0/-1) (XEN) Notifying guest 1:7 (virq 1, port 49, stat 0/0/0) When the guest was unresponsive the console would still show: [10174.372092] INFO: rcu_sched detected stalls on CPUs/tasks: { 0 1} (det= ected by 4, t=3D15002 jiffies) [10284.448089] INFO: rcu_bh detected stalls on CPUs/tasks: { 0 1 4 6} (de= tected by 5, t=3D15004 jiffies) in a repeating pattern. So I take the above as cpus 1,4 and 6 are polling= =2E From the dump and the content of lock_spinners I get: cpu 0 and 1 -> ffff8803bfc13700 (which is runqueue[0] and is unlocked aga= in) cpu 4 and 6 -> ffffffff81f15ef0 (which is blkif_io_lock and is locked) Backtraces would be somewhat inconsistent (as always). Note, I should men= tion that I still had a kernel with my patch applied on that guest. That chang= es things a bit (actually it takes a bit longer to hang but again that might= be just a matter of timing). The strange lock state of 2 spinners on an unlo= cked lock remains the same with or without it. One question about the patch actually, would anybody think that there cou= ld be a case where the unlocking cpu has itself on the spinners list? I did not t= hink so but that might be wrong. > =20 > The IRQ handler for the spinlock evtchn in Linux is: > static irqreturn_t dummy_handler(int irq, void *dev_id) > { > BUG(); > return IRQ_HANDLED; > } > =20 > and right after we register it: > disable_irq(irq); /* make sure it's never delivered */ >=20 > The is no enable -- ignoring bugs of which there have been couple of > instances, but those trigger the BUG() so are pretty obvious. >=20 > Ian. >=20 >=20 --------------enigDCF9146FEFDB17A1136CD1F0 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://www.enigmail.net/ iQIcBAEBCgAGBQJQfs0GAAoJEOhnXe7L7s6jwXUQAL5NF6D2cs6XlFgWJ7CjAZtL jJCCkonPut/ZJG081Jb0F8ZOx8dbHame2/ypTNPghZVNnfWMwPR9Niy99Jk4l2sm H2jMJ2lAfYt2cHHI6T31pbcrO5JJ5aYrRn3A6CS8LIiTSY3gc0CGgIVWU5FWFwGJ czSTqV9Wk7cMIEkdQOSnCotBQ942wa9tI0otMv8RUMBfdfUa89DmuMh571AXLu9p x1KGfwNRo7m6S/1SvO5GnA0lUGueC86deHH+zCRPq003LNIHZR55H/FRwMo4wZxE i5UjUkiVaZdi4DHawVgfWjyVx5BPfLVBAmKCTu5PRcVIF2wv9PXEKXGFtu2uAte0 0qvaav7PIEvAkBlFH4iaXkvLlUEoBTB3OUhxz46WcdzuM09D4m2WqpckJezsw+pP zC2wiw5H+D4KCqGaRLXIc+zidBbzE7C2Y+SJmp3QWO1174VKOwnI0hsVD8uaky/7 6OcyktOX3kf9Hs+Cpc1XneGYWLZmE8v5jfz4JTiVq1YQs35deHK6tvDLaMBrnSk5 1ZW4EAGMSVTk11BxOYlfOqwz6YRmm829xKM/j96zbOeqVjU70fs1ZfmQfSQeyYkK rDxo4UagRdLcxFlNlvFVABP922oZJe0sVuyJXyK13xc76bjDSeGBpp/O3DWuTXj5 e+NdMunsg2owKoGESWjB =JD2M -----END PGP SIGNATURE----- --------------enigDCF9146FEFDB17A1136CD1F0-- --===============6214944843867296276== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============6214944843867296276==--