intermittent problems with legacy xmlrpc server in 3.0.4

* intermittent problems with legacy xmlrpc server in 3.0.4
@ 2007-01-17  1:04 John Levon
  2007-01-17  1:29 ` Daniel P. Berrange
  0 siblings, 1 reply; 3+ messages in thread
From: John Levon @ 2007-01-17  1:04 UTC (permalink / raw)
  To: xen-devel; +Cc: atse

I've been having intermittent problems with xm talking to xend over the legacy
xmlrpc server. In theory it should be reproducible under Linux with an
xm list loop, though you might need a heavy load. DTrace says:

  0  24244                       recv:entry 12954192948474 xend recv(8192)
  0  24245                      recv:return 12954192971001 xend recv() ret -1 errno 11
  0  24250                       send:entry 12954196092353 xm send(132, POST /RPC2 HTTP/1.0
Host:
User-Agent: xmlrpclib.py/1.0.1 (by www.pythonware.com)
Content-Type: text/xml
Content-Length: 268

)
  0  24251                      send:return 12954196113363 xm send() ret -1 errno 32

11 = EAGAIN (same value as EWOULDBLOCK here):

     EWOULDBLOCK     The socket is marked  non-blocking  and  the
                     requested operation would block.

32 = EPIPE:

     EPIPE           The socket is shut down for writing, or  the
                     socket  is  connection-mode and is no longer
                     connected. In the latter case, if the socket
                     is  of  type SOCK_STREAM, the SIGPIPE signal
                     is generated to the calling thread.

So for some reason the server is trying to process a request before xm has sent it, and the
EWOULDBLOCK is causing the EPIPE it seems.
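
For readers following along, the errno sequence in the trace can be replayed
in a few lines. This is only a sketch under assumptions: it uses Python 3 and
a local socketpair() rather than xend's real socket, and it assumes the
handler abandons the connection after the failed recv(). The function name
replay_trace is made up for illustration.

```python
import errno
import socket


def replay_trace():
    """Return (recv errno, send errno) for a recv-before-send race."""
    server_side, client_side = socket.socketpair()
    recv_errno = send_errno = None
    try:
        # Stand-in for the handler's socket ending up non-blocking.
        server_side.setblocking(False)
        try:
            server_side.recv(8192)  # the client has not written anything yet
        except BlockingIOError as exc:
            recv_errno = exc.errno  # EAGAIN / EWOULDBLOCK
        # Assumption: the handler gives up on the request and closes.
        server_side.close()
        try:
            client_side.send(b"POST /RPC2 HTTP/1.0\r\n\r\n")
        except BrokenPipeError as exc:
            send_errno = exc.errno  # EPIPE
    finally:
        server_side.close()
        client_side.close()
    return recv_errno, send_errno


if __name__ == "__main__":
    print(replay_trace())
```

The client only sees EPIPE because the other end has already gone away, which
matches the order of events in the DTrace output.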

changeset 12062:5fe8e9ebcf5c made this change:

+        try:
+            self.server.socket.settimeout(1.0)
+            while self.running:
+                self.server.handle_request()

which places xmlrpc.sock in non-blocking mode. SocketServer.py actually
does this on init:

    def __init__(self, request, client_address, server):
        self.request = request
        self.client_address = client_address
        self.server = server
        try:
            self.setup()
            self.handle()
            self.finish()

This self.handle() ends up as the recv() that craps itself when it gets
EAGAIN. This doesn't always happen; presumably the race is between
creating the request thread in SocketServer and xm writing the data.

I've hacked up SocketServer a bit to handle EAGAIN, but this obviously
isn't a good fix. Suggestions welcome, I'm not really familiar with all
this server code.
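
As an illustration of what tolerating EAGAIN could look like (this is not the
actual hack, which is not shown in this thread), here is a sketch in Python 3
that waits for the socket to become readable and retries. The names
recv_retrying and demo_late_writer, and the 5 second default, are assumptions
made up for the example.

```python
import select
import socket
import threading


def recv_retrying(sock, bufsize, timeout=5.0):
    """recv() that waits for readability instead of failing on EAGAIN."""
    while True:
        try:
            return sock.recv(bufsize)
        except BlockingIOError:
            # Nothing to read yet: block in select() until data arrives.
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                raise TimeoutError("no data within %.1f seconds" % timeout)


def demo_late_writer(delay=0.1):
    """Simulate a client that writes its request slightly after connect."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    late_send = threading.Timer(delay, writer.sendall, args=(b"<methodCall/>",))
    late_send.start()
    try:
        return recv_retrying(reader, 8192)
    finally:
        late_send.join()
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(demo_late_writer())
```

This papers over the race rather than removing it, since every other read and
write in the handler would need the same treatment.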

regards
john

* Re: intermittent problems with legacy xmlrpc server in 3.0.4
  2007-01-17  1:04 intermittent problems with legacy xmlrpc server in 3.0.4 John Levon
@ 2007-01-17  1:29 ` Daniel P. Berrange
  2007-01-18 15:55   ` Alastair Tse
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel P. Berrange @ 2007-01-17  1:29 UTC (permalink / raw)
  To: John Levon; +Cc: atse, xen-devel

On Wed, Jan 17, 2007 at 01:04:01AM +0000, John Levon wrote:
> So for some reason the server is trying to process a request before xm has sent it, and the
> EWOULDBLOCK is causing the EPIPE it seems.
> 
> changeset 12062:5fe8e9ebcf5c made this change:
> 
> +        try:
> +            self.server.socket.settimeout(1.0)
> +            while self.running:
> +                self.server.handle_request()
> 
> which places xmlrpc.sock in non-blocking mode. SocketServer.py actually
> does this on init:

So from reading that changeset, it looks as if the socket is being put
in non-blocking mode so that when XenD shuts down it doesn't wait forever
for active clients to finish. An alternate way to do this would be to
simply set all the client connection handling threads to be daemonized
threads and not bother calling join() on them at all - just rely on
the automatic thread cleanup. This means that the leader process can just
quit & any outstanding client handling threads will simply be killed
off without delay.
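
A minimal sketch of the daemonized-thread approach, written against Python 3's
socketserver module rather than xend's own server classes. EchoHandler,
DaemonThreadedServer, run_demo and handler_daemon_flags are hypothetical names
for illustration; daemon_threads is the standard ThreadingMixIn attribute.

```python
import socket
import socketserver
import threading

handler_daemon_flags = []  # demo-only record of each handler thread's daemon flag


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        handler_daemon_flags.append(threading.current_thread().daemon)
        # Plain blocking read: no timeout is set on the connection.
        self.wfile.write(self.rfile.readline())


class DaemonThreadedServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True  # handler threads die with the process, no join()
    allow_reuse_address = True


def run_demo():
    server = DaemonThreadedServer(("127.0.0.1", 0), EchoHandler)
    listener_timeout = server.socket.gettimeout()  # None means blocking
    acceptor = threading.Thread(target=server.serve_forever, daemon=True)
    acceptor.start()
    try:
        with socket.create_connection(server.server_address) as client:
            client.sendall(b"ping\n")
            reply = client.makefile("rb").readline()
    finally:
        server.shutdown()
        server.server_close()
    return reply, listener_timeout


if __name__ == "__main__":
    print(run_demo(), handler_daemon_flags)
```

The listening socket stays in blocking mode and no settimeout() call is
needed; shutdown speed comes from the threads being daemonic.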

>     def __init__(self, request, client_address, server):
>         self.request = request
>         self.client_address = client_address
>         self.server = server
>         try:
>             self.setup()
>             self.handle()
>             self.finish()
> 
> This self.handle() ends up as the recv() that craps itself when it gets
> EAGAIN. This doesn't always happen, presumably the race is between
> creating the request thread in SocketServer and xm writing the data.
> 
> I've hacked up SocketServer a bit to handle EAGAIN, but this obviously
> isn't a good fix. Suggestions welcome, I'm not really familiar with all
> this server code.

Having had a cursory glance at the code, as you say, none of it is expecting
the socket to be in non-blocking mode so it easily breaks. You'd probably
see the same thing if network congestion caused a data stall of > 1 second.
IMHO the sockets should be put back into blocking mode, and another way
found of dealing with any possible shutdown issues.
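
To illustrate the stall case: a sketch in Python 3, assuming the timeout
applies to the connected socket, showing that a recv() which would simply
block on a blocking socket instead raises once the data is late by more than
the timeout. recv_with_stall is a made-up name, and the short 0.05 second
timeout stands in for the 1.0 second one in the changeset.

```python
import socket


def recv_with_stall(timeout_seconds=0.05):
    """Try to read from a peer that never writes within the timeout."""
    reader, writer = socket.socketpair()
    try:
        reader.settimeout(timeout_seconds)
        try:
            reader.recv(8192)
        except socket.timeout:
            # Code written for blocking sockets does not expect this path.
            return "timed out"
        return "got data"
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    print(recv_with_stall())
```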

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=| 

* Re: intermittent problems with legacy xmlrpc server in 3.0.4
  2007-01-17  1:29 ` Daniel P. Berrange
@ 2007-01-18 15:55   ` Alastair Tse
  0 siblings, 0 replies; 3+ messages in thread
From: Alastair Tse @ 2007-01-18 15:55 UTC (permalink / raw)
  To: xen-devel

On Wed, 2007-01-17 at 01:29 +0000, Daniel P. Berrange wrote:
> So from reading that changeset, it looks as if the socket is being put
> in non-blocking mode so that when XenD shuts down it doesn't wait forever
> for active clients to finish. An alternate way to do this would be to
> simply set all the client connection handling threads to be daemonized
> threads and not bother calling join() on them at all - just rely on
> the automatic thread cleanup. This means that the leader process can just
> quit & any outstanding client handling threads will simply be killed
> off without delay.
> 

I've just committed a patch based on your suggestion that sets all
the threads to daemonic, which gets rid of the join() and settimeout()
on the socket. Hopefully this should solve the problems John is seeing
with intermittent failures.

Thanks,

Alastair
