From: Marek Marczykowski-Górecki
Subject: Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier)
Date: Thu, 19 Mar 2015 02:19:11 +0100
Message-ID: <20150319011911.GA29029@mail-itl>
To: xen-devel
Cc: Boris Ostrovsky, David Vrabel

Hi,

I've hit a deadlock in the kernel xenstore client exposed via
/proc/xen/xenbus. Steps to reproduce are simple:

    #include <xenstore.h>

    int main() {
        struct xs_handle *xs;

        xs = xs_open(0);
        xs_watch(xs, "domid", "token");
        xs_read(xs, 0, "name", NULL);
        return 0;
    }

xs_watch internally creates a new thread, which uses read to wait for the
watch. At the same time, the program tries to read some value, but it
actually hangs while sending the command (before even sending a path to be
read). Strace gives this (simplified for readability):

[pid 2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 16) = 16
[pid 2494] write(3, "domid\0", 6) = 6
[pid 2494] write(3, "token\0", 6) = 6
[pid 2495] read(3,
[pid 2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 2495] <... read resumed> "\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16
[pid 2495] read(3, "domid\0token\0", 12) = 12
[pid 2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16
[pid 2495] read(3, "OK\0", 3) = 3
[pid 2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}
[pid 2494] <... futex resumed> ) = 0
[pid 2495] <... futex resumed> ) = 1
[pid 2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL
[pid 2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1
[pid 2494] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 2495] <... futex resumed> ) = 0
[pid 2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1
[pid 2495] read(3,
[pid 2494] <... futex resumed> ) = 0
[pid 2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, NULL, 8) = 0
[pid 2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER, 0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0
[pid 2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16

And that's all: 2494 is waiting on write, 2495 is waiting on read.

On 3.12.x it works. On 3.17.0 and 3.18.7 it is broken. I haven't checked
the versions in between.

Any ideas?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
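
For readers following the strace output: the 16-byte writes and reads on fd 3 are xenstore wire headers, and decoding them shows which request each thread was in the middle of. The following is a minimal sketch, assuming the header layout of struct xsd_sockmsg from Xen's public io/xs_wire.h (four 32-bit fields: type, req_id, tx_id, len, little-endian on x86) and the message-type numbers from the same header. The names xs_hdr, xs_decode_header and xs_type_name are hypothetical helpers made up for this illustration; they are not part of libxenstore.

```c
#include <stdint.h>

/* Assumed layout of the header that precedes every xenstore message
 * (modeled on struct xsd_sockmsg; redefined here so the sketch builds
 * without the Xen headers installed). */
struct xs_hdr {
    uint32_t type;    /* message type, e.g. 2 = XS_READ */
    uint32_t req_id;  /* request id, echoed back in the reply */
    uint32_t tx_id;   /* transaction id, 0 outside a transaction */
    uint32_t len;     /* number of payload bytes following the header */
};

/* Read one little-endian 32-bit value from a byte buffer. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 16 raw header bytes as they appear in the strace output. */
static struct xs_hdr xs_decode_header(const unsigned char raw[16])
{
    struct xs_hdr h;
    h.type   = le32(raw);
    h.req_id = le32(raw + 4);
    h.tx_id  = le32(raw + 8);
    h.len    = le32(raw + 12);
    return h;
}

/* Name only the message types that occur in the trace. */
static const char *xs_type_name(uint32_t type)
{
    switch (type) {
    case 2:  return "XS_READ";
    case 4:  return "XS_WATCH";
    case 15: return "XS_WATCH_EVENT";
    default: return "(other)";
    }
}
```

Under these assumptions, the first write ("\4...\f\0\0\0") decodes as XS_WATCH with a 12-byte payload, matching "domid\0" plus "token\0". The header read by 2495 ("\17...") decodes as XS_WATCH_EVENT with a 12-byte payload. The final blocked write ("\2...\5\0\0\0") decodes as an XS_READ header announcing 5 payload bytes, which matches "name\0" and the observation that the hang occurs before the path itself is sent.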