xen-devel.lists.xenproject.org archive mirror
From: Joe Jin <joe.jin@oracle.com>
To: Dave Scott <Dave.Scott@citrix.com>
Cc: Zheng Li <dev@zheng.li>, "Luis R. Rodriguez" <mcgrof@suse.com>,
	Luonengjun <luonengjun@huawei.com>,
	xen-devel <xen-devel@lists.xen.org>,
	Fanhenglong <fanhenglong@huawei.com>,
	Ian Jackson <Ian.Jackson@citrix.com>,
	"Liuqiming (John)" <john.liuqiming@huawei.com>
Subject: Re: Lots of connections led oxenstored stuck
Date: Tue, 26 Aug 2014 16:15:07 +0800
Message-ID: <53FC420B.10005@oracle.com>
In-Reply-To: <53E95D9C.6000901@oracle.com>

This bug is caused by the order in which oxenstored handles incoming
requests: when lots of connections arrive at the same time, it has no
chance to delete closed sockets.

I created a patch for this, please review:

Thanks,
Joe

[PATCH] oxenstored: check and delete closed sockets before accepting incoming connections

When more than SYSCONF.OPEN_MAX connections arrive at the same time and
the connections are closed later, oxenstored has no chance to delete the
closed sockets. This leaves oxenstored stuck and unable to handle any
further incoming requests. This patch makes oxenstored check and process
closed sockets before handling incoming connections, to avoid getting stuck.

Cc: David Scott <dave.scott@eu.citrix.com>
Cc: Zheng Li <dev@zheng.li>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Ian Jackson <Ian.Jackson@citrix.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
---
 tools/ocaml/xenstored/xenstored.ml |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/ocaml/xenstored/xenstored.ml b/tools/ocaml/xenstored/xenstored.ml
index 1c02f2f..b142952 100644
--- a/tools/ocaml/xenstored/xenstored.ml
+++ b/tools/ocaml/xenstored/xenstored.ml
@@ -373,10 +373,10 @@ let _ =
 			[], [], [] in
 		let sfds, cfds =
 			List.partition (fun fd -> List.mem fd spec_fds) rset in
-		if List.length sfds > 0 then
-			process_special_fds sfds;
 		if List.length cfds > 0 || List.length wset > 0 then
 			process_connection_fds store cons domains cfds wset;
+		if List.length sfds > 0 then
+			process_special_fds sfds;
 		process_domains store cons domains
 		in
 
-- 
1.7.1

On 08/12/14 08:19, Joe Jin wrote:
> On 08/11/14 17:41, Dave Scott wrote:
>>
>> On 11 Aug 2014, at 01:35, Joe Jin <joe.jin@oracle.com> wrote:
>>
>>> On 08/08/14 17:37, Dave Scott wrote:
>>>>
>>>> On 8 Aug 2014, at 09:35, Liuqiming (John) <john.liuqiming@huawei.com> wrote:
>>>>
>>>>> oxenstored uses "select" for incoming sockets, so I don't think it can handle more than 1024 socket connections.
>>>>
>>>> That’s true.
>>>
>>> The problem is that oxenstored stops responding to any request, even
>>> after all the threads have exited. With my reproducer, once it has run
>>> and all its threads have exited, "xm list -l" gets stuck.
>>
>> OK so is this the behaviour you expect:
>>
>> * root in dom0 opens many connections, until oxenstored is out of resources (where the most limited resource is currently file descriptors)
>> * root in dom0 closes the connections
>> * oxenstored recovers, and ‘xm list -l’ works again
>>
>> Instead, you’re seeing oxenstored getting into a stuck state causing ‘xm list -l’ to block — is this accurate?
> 
> Yes that's it.
> 
>>
>> Could you share your reproducer program?
> 
> /*
>  * This program is used to reproduce the oxenstored stuck-connection issue.
>  * Compile it with the command below:
>  *	gcc -o client client.c -lpthread
>  */
> #include <stdio.h>
> #include <sys/socket.h>
> #include <sys/un.h>
> #include <unistd.h>
> #include <string.h>
> #include <pthread.h>
> #include <stdlib.h>
> #include <errno.h>
> 
> 
> void *main_thread(void *arg)
> {
> 	struct sockaddr_un address;
> 	int socket_fd;
> 	int i;
> 
> 	/* Copy the thread index out of the heap block, then release it. */
> 	memcpy(&i, arg, sizeof(i));
> 	free(arg);
> 
> 	socket_fd = socket(PF_UNIX, SOCK_STREAM, 0);
> 	if (socket_fd < 0) {
> 		fprintf(stderr, "socket() %dth failed, errno=%d\n", i, errno);
> 		return NULL;
> 	}
> 	fprintf(stderr, "socket() %dth ok!\n", i);
> 
> 	/* start with a clean address structure */
> 	memset(&address, 0, sizeof(struct sockaddr_un));
> 
> 	address.sun_family = AF_UNIX;
> 	/* Bound the copy by the real size of sun_path, not 1024. */
> 	snprintf(address.sun_path, sizeof(address.sun_path),
> 		 "/var/run/xenstored/socket");
> 
> 	if (connect(socket_fd,
> 		    (struct sockaddr *) &address,
> 		    sizeof(struct sockaddr_un)) != 0) {
> 		fprintf(stderr, "connect() %dth failed, errno=%d\n", i, errno);
> 		close(socket_fd);
> 		return NULL;
> 	}
> 	fprintf(stderr, "connect() %dth ok!\n", i);
> 
> 	/* Hold the connection open until the process exits. */
> 	while (1)
> 		sleep(1);
> 
> 	return NULL;
> }
> 
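> int main(void)
> {
> 	int i;
> 	for (i = 0; i < 2000; i++) {
> 		pthread_t thread;
> 		int err;
> 		int *arg = malloc(sizeof(i));
> 
> 		if (arg == NULL) {
> 			perror("malloc");
> 			break;
> 		}
> 		*arg = i;
> 		err = pthread_create(&thread, NULL, main_thread, arg);
> 		if (err != 0) {
> 			/* pthread_create() returns the error; it does not set errno */
> 			fprintf(stderr, "pthread_create: %s\n", strerror(err));
> 			free(arg);
> 			break;
> 		}
> 	}
> 	/*
> 	 * Give the threads time to connect. Returning from main() then ends
> 	 * the process, which closes every connection at once.
> 	 */
> 	sleep(3);
> 	return 0;
> }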
> /* end */
