* intermittent problems with legacy xmlrpc server in 3.0.4
@ 2007-01-17 1:04 John Levon
2007-01-17 1:29 ` Daniel P. Berrange
0 siblings, 1 reply; 3+ messages in thread
From: John Levon @ 2007-01-17 1:04 UTC (permalink / raw)
To: xen-devel; +Cc: atse
I've been having intermittent problems with xm talking to xend over the legacy
xmlrpc server. In theory it should be reproducible under Linux with an
xm list loop, though you might need a heavy load. DTrace says:
0 24244 recv:entry 12954192948474 xend recv(8192)
0 24245 recv:return 12954192971001 xend recv() ret -1 errno 11
0 24250 send:entry 12954196092353 xm send(132, POST /RPC2 HTTP/1.0
Host:
User-Agent: xmlrpclib.py/1.0.1 (by www.pythonware.com)
Content-Type: text/xml
Content-Length: 268
)
0 24251 send:return 12954196113363 xm send() ret -1 errno 32
11 = EAGAIN:
EWOULDBLOCK The socket is marked non-blocking and the
requested operation would block.
32 = EPIPE:
EPIPE The socket is shut down for writing, or the
socket is connection-mode and is no longer
connected. In the latter case, if the socket
is of type SOCK_STREAM, the SIGPIPE signal
is generated to the calling thread.
So for some reason the server is trying to process a request before xm has sent it, and the
EWOULDBLOCK is causing the EPIPE it seems.
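For readers who want to see the two errno values from the trace in isolation, here is a minimal Python 3 sketch. It is not xend code (the thread concerns Python 2's SocketServer.py); it uses socket.socketpair() as a stand-in for the xm/xend connection, and both function names are hypothetical.

```python
import errno
import socket


def recv_without_data():
    """Return the errno from recv() on a non-blocking socket with nothing to read."""
    reader, writer = socket.socketpair()
    try:
        reader.setblocking(False)  # stand-in for the non-blocking server-side socket
        try:
            reader.recv(8192)
        except BlockingIOError as exc:
            return exc.errno  # EAGAIN / EWOULDBLOCK
        return None
    finally:
        reader.close()
        writer.close()


def send_after_peer_close():
    """Return the errno from send() once the peer has closed the connection."""
    reader, writer = socket.socketpair()
    reader.close()  # stand-in for the server dropping the request
    try:
        writer.send(b"POST /RPC2 HTTP/1.0\r\n")
    except BrokenPipeError as exc:
        return exc.errno  # EPIPE (Python ignores SIGPIPE by default)
    finally:
        writer.close()
    return None
```

The first function mirrors the recv() line in the trace, the second the failing send() from xm.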
changeset 12062:5fe8e9ebcf5c made this change:
+        try:
+            self.server.socket.settimeout(1.0)
+            while self.running:
+                self.server.handle_request()
which places xmlrpc.sock in non-blocking mode. SocketServer.py actually
does this on init:
    def __init__(self, request, client_address, server):
        self.request = request
        self.client_address = client_address
        self.server = server
        try:
            self.setup()
            self.handle()
            self.finish()
This self.handle() ends up as the recv() that craps itself when it gets
EAGAIN. This doesn't always happen; presumably the race is between
creating the request thread in SocketServer and xm writing the data.
I've hacked up SocketServer a bit to handle EAGAIN, but this obviously
isn't a good fix. Suggestions welcome, I'm not really familiar with all
this server code.
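The actual change to SocketServer is not shown in this thread. As a rough idea of what "handling EAGAIN" can look like, here is a hedged Python 3 sketch that waits for the socket to become readable and retries. The helper name recv_retrying and its timeout parameter are made up for illustration.

```python
import select
import socket


def recv_retrying(sock, bufsize, timeout=5.0):
    """recv() that waits for readability instead of failing on EAGAIN.

    Hypothetical helper, not the patch discussed in this thread.
    """
    while True:
        try:
            return sock.recv(bufsize)
        except BlockingIOError:
            # Nothing to read yet: block in select() until data arrives,
            # then loop around and try recv() again.
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                raise TimeoutError("no data within %.1f seconds" % timeout)
```

This only papers over the symptom: every other read and write on the request socket would need the same treatment, which is why it is not a good fix.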
regards
john
* Re: intermittent problems with legacy xmlrpc server in 3.0.4
2007-01-17 1:04 intermittent problems with legacy xmlrpc server in 3.0.4 John Levon
@ 2007-01-17 1:29 ` Daniel P. Berrange
2007-01-18 15:55 ` Alastair Tse
0 siblings, 1 reply; 3+ messages in thread
From: Daniel P. Berrange @ 2007-01-17 1:29 UTC (permalink / raw)
To: John Levon; +Cc: atse, xen-devel
On Wed, Jan 17, 2007 at 01:04:01AM +0000, John Levon wrote:
> So for some reason the server is trying to process a request before xm has sent it, and the
> EWOULDBLOCK is causing the EPIPE it seems.
>
> changeset 12062:5fe8e9ebcf5c made this change:
>
> +        try:
> +            self.server.socket.settimeout(1.0)
> +            while self.running:
> +                self.server.handle_request()
>
> which places xmlrpc.sock in non-blocking mode. SocketServer.py actually
> does this on init:
So from reading that changeset, it looks as if the socket is being put
in non-blocking mode so that when XenD shuts down it doesn't wait forever
for active clients to finish. An alternate way to do this would be to
simply set all the client connection handling threads to be daemonized
threads and not bother calling join() on them at all - just rely on
the automatic thread cleanup. This means that the leader process can just
quit & any outstanding client handling threads will simply be killed
off without delay.
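A minimal sketch of the daemonized-threads approach, written against Python 3's socketserver module rather than the Python 2 SocketServer.py that xend used. The class names, the echo handler and start_server() are hypothetical; only ThreadingMixIn and its daemon_threads attribute come from the standard library.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: echo one line back to the client."""

    def handle(self):
        line = self.rfile.readline()  # plain blocking read, no timeout set
        self.wfile.write(line)


class DaemonThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    # Per-request threads are daemonic, so the process can exit without
    # join()ing them and without a timeout on the listening socket.
    daemon_threads = True
    allow_reuse_address = True


def start_server():
    """Start the server on an ephemeral loopback port and return it."""
    server = DaemonThreadedServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

The sockets stay in blocking mode throughout, so the handler never sees EAGAIN. The trade-off is that in-flight requests are cut off abruptly when the main thread exits.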
>     def __init__(self, request, client_address, server):
>         self.request = request
>         self.client_address = client_address
>         self.server = server
>         try:
>             self.setup()
>             self.handle()
>             self.finish()
>
> This self.handle() ends up as the recv() that craps itself when it gets
> EAGAIN. This doesn't always happen; presumably the race is between
> creating the request thread in SocketServer and xm writing the data.
>
> I've hacked up SocketServer a bit to handle EAGAIN, but this obviously
> isn't a good fix. Suggestions welcome, I'm not really familiar with all
> this server code.
Having had a cursory glance at the code, as you say, none of it is expecting
the socket to be in non-blocking mode, so it easily breaks. You'd probably
see the same thing if network congestion caused a data stall of > 1 second.
IMHO the sockets should be put back into blocking mode, and another way
found of dealing with any possible shutdown issues.
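The stall case can be sketched in Python 3 as follows. This is an illustration only, with a hypothetical function name and socket.socketpair() standing in for a real client connection; the timeout is passed in so the demonstration does not have to wait a full second.

```python
import socket


def recv_with_stall(timeout):
    """Read from a socket whose peer sends nothing for longer than `timeout`."""
    reader, writer = socket.socketpair()
    try:
        reader.settimeout(timeout)  # the changeset used settimeout(1.0)
        try:
            reader.recv(8192)
        except socket.timeout:
            # Code written for blocking sockets does not expect this exception.
            return "timed out"
        return "got data"
    finally:
        reader.close()
        writer.close()
```

Calling recv_with_stall(1.0) corresponds to a client that connects and then stalls for more than a second before sending its request.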
Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
* Re: intermittent problems with legacy xmlrpc server in 3.0.4
2007-01-17 1:29 ` Daniel P. Berrange
@ 2007-01-18 15:55 ` Alastair Tse
0 siblings, 0 replies; 3+ messages in thread
From: Alastair Tse @ 2007-01-18 15:55 UTC (permalink / raw)
To: xen-devel
On Wed, 2007-01-17 at 01:29 +0000, Daniel P. Berrange wrote:
> So from reading that changeset, it looks as if the socket is being put
> in non-blocking mode so that when XenD shuts down it doesn't wait forever
> for active clients to finish. An alternate way to do this would be to
> simply set all the client connection handling threads to be daemonized
> threads and not bother calling join() on them at all - just rely on
> the automatic thread cleanup. This means that the leader process can just
> quit & any outstanding client handling threads will simply be killed
> off without delay.
>
I've just committed a patch based on your suggestion that sets all
the threads to daemonic, which gets rid of the join() and the settimeout()
on the socket. Hopefully this solves the intermittent failures John is
seeing.
Thanks,
Alastair