From: Joe Jin
Subject: Re: Lots of connections led oxenstored stuck
Date: Tue, 12 Aug 2014 08:19:40 +0800
Message-ID: <53E95D9C.6000901@oracle.com>
References: <53E475C8.2070806@oracle.com>
 <0E6BCB61859D7F4EB9CAC75FC6EE6FF84573109A@SZXEMA502-MBX.china.huawei.com>
 <56117839-54D7-410D-9008-25F5F18514FA@citrix.com>
 <53E80FD9.3010002@oracle.com>
To: Dave Scott
Cc: Zheng Li, "Luis R. Rodriguez", Luonengjun, xen-devel, Fanhenglong,
 "Liuqiming (John)", Ian Jackson
List-Id: xen-devel@lists.xenproject.org

On 08/11/14 17:41, Dave Scott wrote:
>
> On 11 Aug 2014, at 01:35, Joe Jin wrote:
>
>> On 08/08/14 17:37, Dave Scott wrote:
>>>
>>> On 8 Aug 2014, at 09:35, Liuqiming (John) wrote:
>>>
>>>> oxenstored uses "select" for incoming sockets, so I don't think it
>>>> can handle more than 1024 socket connections.
>>>
>>> That's true.
>>
>> The problem is that oxenstored no longer responds to any request, even
>> after all threads have exited. With my reproducer, once you have run it
>> and all its threads have exited, "xm list -l" gets stuck.
>
> OK so is this the behaviour you expect:
>
> * root in dom0 opens many connections, until oxenstored is out of
>   resources (where the most limited resource is currently file descriptors)
> * root in dom0 closes the connections
> * oxenstored recovers, and 'xm list -l' works again
>
> Instead, you're seeing oxenstored getting into a stuck state causing
> 'xm list -l' to block -- is this accurate?

Yes, that's it.

>
> Could you share your reproducer program?

/*
 * This program is used to test the oxenstored stuck-connections issue.
 * Please compile it with the command below:
 *   gcc -o client client.c -lpthread
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

void *main_thread(void *arg)
{
	struct sockaddr_un address;
	int socket_fd;
	int i;

	/* Copy the thread index, then release the heap copy made by main(). */
	memcpy(&i, arg, sizeof(i));
	free(arg);

	socket_fd = socket(PF_UNIX, SOCK_STREAM, 0);
	if (socket_fd < 0) {
		fprintf(stderr, "socket() %dth failed, errno=%d\n", i, errno);
		return NULL;
	}
	fprintf(stderr, "socket() %dth ok!\n", i);

	/* start with a clean address structure */
	memset(&address, 0, sizeof(struct sockaddr_un));
	address.sun_family = AF_UNIX;
	/* Bound the copy by the real size of sun_path, not a fixed 1024. */
	snprintf(address.sun_path, sizeof(address.sun_path),
		 "/var/run/xenstored/socket");

	if (connect(socket_fd, (struct sockaddr *) &address,
		    sizeof(struct sockaddr_un)) != 0) {
		fprintf(stderr, "connect() %dth failed, errno=%d\n", i, errno);
		close(socket_fd);
		return NULL;
	}
	fprintf(stderr, "connect() %dth ok!\n", i);

	/* Hold the connection open until the process exits. */
	while (1)
		sleep(1);

	return NULL;
}

int main(void)
{
	int i;

	for (i = 0; i < 2000; i++) {
		pthread_t thread;
		int rc;
		int *arg = malloc(sizeof(i));

		if (arg == NULL) {
			perror("malloc");
			break;
		}
		*arg = i;

		/* pthread_create() returns the error code; it does not set errno. */
		rc = pthread_create(&thread, NULL, main_thread, arg);
		if (rc != 0) {
			fprintf(stderr, "pthread_create: %s\n", strerror(rc));
			free(arg);
			break;
		}
	}

	/*
	 * Give the threads time to connect. Returning from main() then ends
	 * the process, which terminates every thread and closes its socket.
	 */
	sleep(3);
	return 0;
}
/* end */
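
As an aside on the select() limit raised earlier in the thread: an fd_set can only hold descriptors in the range [0, FD_SETSIZE), and FD_SETSIZE is typically 1024 on Linux with glibc. The sketch below shows the kind of guard a select()-based C program would need before calling FD_SET(). The helper name fd_usable_with_select is hypothetical and is not taken from oxenstored; it only illustrates the limit.

```c
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not part of oxenstored): report whether a
 * descriptor can safely be stored in an fd_set. Calling FD_SET() with
 * a descriptor outside [0, FD_SETSIZE) is undefined behaviour.
 */
static bool fd_usable_with_select(int fd)
{
	return fd >= 0 && fd < FD_SETSIZE;
}
```

With the reproducer above opening 2000 connections, any descriptor numbered FD_SETSIZE or higher on the server side would fail this check.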