From: Joe Jin
Subject: Re: Lots of connections led oxenstored stuck
Date: Tue, 26 Aug 2014 16:15:07 +0800
Message-ID: <53FC420B.10005@oracle.com>
References: <53E475C8.2070806@oracle.com>
 <0E6BCB61859D7F4EB9CAC75FC6EE6FF84573109A@SZXEMA502-MBX.china.huawei.com>
 <56117839-54D7-410D-9008-25F5F18514FA@citrix.com>
 <53E80FD9.3010002@oracle.com>
 <53E95D9C.6000901@oracle.com>
In-Reply-To: <53E95D9C.6000901@oracle.com>
To: Dave Scott
Cc: Zheng Li, "Luis R. Rodriguez", Luonengjun, xen-devel, Fanhenglong,
 Ian Jackson, "Liuqiming (John)"
List-Id: xen-devel@lists.xenproject.org

This bug is caused by the way oxenstored handles incoming requests: when
lots of connections arrive at the same time, it has no chance to delete
closed sockets.

I created a patch for this, please review:

Thanks,
Joe

[PATCH] oxenstored: check and delete closed sockets before accepting incoming connections

When more than SYSCONF.OPEN_MAX connections arrive at the same time and
the connections are closed later, oxenstored has no chance to delete the
closed sockets. This leaves oxenstored stuck and unable to handle any
further incoming requests. This patch makes oxenstored check and process
closed sockets before handling incoming connections, to avoid getting
stuck.

Cc: David Scott
Cc: Zheng Li
Cc: Luis R. Rodriguez
Cc: Ian Jackson
Signed-off-by: Joe Jin
---
 tools/ocaml/xenstored/xenstored.ml |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 1c02f2f..b142952 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -373,10 +373,10 @@ let _ =
 			[], [], [] in
 		let sfds, cfds =
 			List.partition (fun fd -> List.mem fd spec_fds) rset in
-		if List.length sfds > 0 then
-			process_special_fds sfds;
 		if List.length cfds > 0 || List.length wset > 0 then
 			process_connection_fds store cons domains cfds wset;
+		if List.length sfds > 0 then
+			process_special_fds sfds;
 		process_domains store cons domains
 		in
 
-- 
1.7.1

On 08/12/14 08:19, Joe Jin wrote:
> On 08/11/14 17:41, Dave Scott wrote:
>>
>> On 11 Aug 2014, at 01:35, Joe Jin wrote:
>>
>>> On 08/08/14 17:37, Dave Scott wrote:
>>>>
>>>> On 8 Aug 2014, at 09:35, Liuqiming (John) wrote:
>>>>
>>>>> In oxenstored it use "select" for incoming socket, so I don't think it can handle more than 1024 socket connections.
>>>>
>>>> That’s true.
>>>
>>> The problem is when oxenstored does not respond any request anymore even all
>>> thread exited, with my reproducer, when you executed it and all threads exited,
>>> "xm list -l" will stuck.
>>
>> OK so is this the behaviour you expect:
>>
>> * root in dom0 opens many connections, until oxenstored is out of resources (where the most limited resource is currently file descriptors)
>> * root in dom0 closes the connections
>> * oxenstored recovers, and ‘xm list -l’ works again
>>
>> Instead, you’re seeing oxenstored getting into a stuck state causing ‘xm list -l’ to block, is this accurate?
>
> Yes that's it.
>
>>
>> Could you share your reproducer program?
>
> /*
>  * This program is used to test the oxenstored connections stuck issue.
>  * Please compile with the command below:
>  * gcc -o client client.c -lpthread
>  */
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <errno.h>
> #include <unistd.h>
> #include <pthread.h>
> #include <sys/socket.h>
> #include <sys/un.h>
>
>
> void *main_thread(void *arg)
> {
>     struct sockaddr_un address;
>     int socket_fd, nbytes;
>     char buffer[256];
>     int i;
>     extern int errno;
>
>     memcpy(&i, arg, sizeof(i));
>     socket_fd = socket(PF_UNIX, SOCK_STREAM, 0);
>     if (socket_fd < 0) {
>         fprintf(stderr, "socket() %dth failed, errno=%d\n", i, errno);
>         return NULL;
>     }
>     fprintf(stderr, "socket() %dth ok!\n", i);
>
>     /* start with a clean address structure */
>     memset(&address, 0, sizeof(struct sockaddr_un));
>
>     address.sun_family = AF_UNIX;
>     /* bound the copy by the real size of sun_path, not 1024 */
>     snprintf(address.sun_path, sizeof(address.sun_path),
>              "/var/run/xenstored/socket");
>
>     if (connect(socket_fd,
>                 (struct sockaddr *) &address,
>                 sizeof(struct sockaddr_un)) != 0) {
>         fprintf(stderr, "connect() %d failed, error=%d\n", i, errno);
>         return NULL;
>     }
>     fprintf(stderr, "connect() %dth ok!\n", i);
>
>     /* hold the connection open until the process exits */
>     while (1)
>         sleep(1);
>     if (arg) {
>         free(arg);
>         arg = NULL;
>     }
>
>     return NULL;
> }
>
> int main(void)
> {
>     int i;
>     for (i = 0; i < 2000; i++) {
>         void *arg = malloc(sizeof(i));
>         memset(arg, 0, sizeof(i));
>         memcpy(arg, &i, sizeof(i));
>         pthread_t thread;
>         if (pthread_create(&thread, NULL, main_thread, arg) != 0) {
>             perror("pthread_create:");
>             break;
>         }
>     }
>     /* Wait all children exit */
>     sleep(3);
>     return 0;
> }
> /* end */
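
For anyone following the thread without the OCaml source at hand, the ordering the patch aims for can be sketched in C. This is only an illustration under stated assumptions, not oxenstored code: `reap_closed` and `loop_step` are hypothetical names, the listening socket is assumed to be non-blocking, and a peer hang-up is detected by a non-blocking `recv()` with `MSG_PEEK` returning 0.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: close every descriptor in conns[] whose peer has
 * hung up, compact the array, and return how many connections remain.
 */
int reap_closed(int *conns, int n)
{
    int kept = 0;

    for (int i = 0; i < n; i++) {
        char c;
        /* Peek without blocking and without consuming pending data. */
        ssize_t r = recv(conns[i], &c, 1, MSG_PEEK | MSG_DONTWAIT);
        int dead = (r == 0) ||
                   (r < 0 && errno != EAGAIN && errno != EWOULDBLOCK &&
                    errno != EINTR);

        if (dead)
            close(conns[i]);          /* give the descriptor back */
        else
            conns[kept++] = conns[i]; /* still open, keep it */
    }
    return kept;
}

/*
 * Hypothetical loop iteration in the order the patch establishes:
 * existing connections first, the listening socket second.
 * listen_fd is assumed to be non-blocking, so accept() returns -1
 * when no connection is pending. Returns the new connection count.
 */
int loop_step(int listen_fd, int *conns, int n, int cap)
{
    n = reap_closed(conns, n);        /* 1. free closed sockets */

    if (n < cap) {                    /* 2. only then accept */
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0)
            conns[n++] = fd;
    }
    return n;
}
```

The only point of the sketch is the order inside `loop_step`: descriptors held by closed peers are released before `accept()` is attempted, so a burst of connections that has since gone away does not keep the descriptor table full.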