[PATCH] util: don't set SO_REUSEADDR on client sockets

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] util: don't set SO_REUSEADDR on client sockets
@ 2024-10-21 14:54 Daniel P. Berrangé
  2024-10-21 15:45 ` Peter Xu
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Daniel P. Berrangé @ 2024-10-21 14:54 UTC (permalink / raw)
  To: qemu-devel
  Cc: Daniel P. Berrangé, Peter Xu, Brad Smith, Peter Maydell,
	Fabiano Rosas

Setting the SO_REUSEADDR property on a socket allows binding to a port
number that is in the TIMED_WAIT state. This is usually done on listener
sockets, to enable a server to restart itself without having to wait for
the completion of TIMED_WAIT on the port.

It is also possible, but highly unusual, to set it on client sockets. It
is rare to explicitly bind() a client socket, since it is almost always
fine to allow the kernel to auto-bind a client socket to a random free
port. Most systems will have many 10's of 1000's of free ports that
client sockets will be bound to.

eg on Linux

  $ sysctl -a | grep local_port
  net.ipv4.ip_local_port_range = 32768	60999

eg on OpenBSD

  $ sysctl -a | grep net.inet.ip.port
  net.inet.ip.portfirst=1024
  net.inet.ip.portlast=49151
  net.inet.ip.porthifirst=49152
  net.inet.ip.porthilast=65535

A connected socket must have a unique set of value for

 (protocol, localip, localport, remoteip, remoteport)

otherwise it is liable to get EADDRINUSE.

A client connection should trivially avoid EADDRINUSE if letting the
kernel auto-assign the 'localport' value, which QEMU always does.

When QEMU sets SO_REUSEADDR on a client socket on OpenBSD, however, it
upsets this situation.

The OpenBSD kernel appears to happily pick a 'localport' that is in the
TIMED_WAIT state, even if there are many other available local ports
available for use that are not in the TIMED_WAIT state.

A test program that just loops opening client sockets will start seeing
EADDRINUSE on OpenBSD when as few as 2000 ports are in TIMED_WAIT,
despite 10's of 1000's ports still being unused. This contrasts with
Linux which appears to avoid picking local ports in TIMED_WAIT state.

This problem on OpenBSD exhibits itself periodically with the migration
test failing with a message like[1]:

  qemu-system-ppc64: Failed to connect to '127.0.0.1:24109': Address already in use

While I have not been able to reproduce the OpenBSD failure in my own
testing, given the scope of what QEMU tests do, it is entirely possible
that there could be a lot of ports in TIMED_WAIT state when the
migration test runs.

Removing SO_REUSEADDR from the client sockets should not affect normal
QEMU usage, and should improve reliability on OpenBSD.

This use of SO_REUSEADDR on client sockets is highly unusual, and
appears to have been present since the very start of the QEMU socket
helpers in 2008. The orignal commit has no comment about the use of
SO_REUSEADDR on the client, so is most likely just an 16 year old
copy+paste bug.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2024-10/msg03427.html
    https://lists.nongnu.org/archive/html/qemu-devel/2024-02/msg01572.html

Fixes: d247d25f18764402899b37c381bb696a79000b4e
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 util/qemu-sockets.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 60c44b2b56..80594ecad5 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -367,7 +367,6 @@ static int inet_connect_addr(const InetSocketAddress *saddr,
                          addr->ai_family);
         return -1;
     }
-    socket_set_fast_reuse(sock);

     /* connect to peer */
     do {
-- 
2.46.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] util: don't set SO_REUSEADDR on client sockets
  2024-10-21 14:54 [PATCH] util: don't set SO_REUSEADDR on client sockets Daniel P. Berrangé
@ 2024-10-21 15:45 ` Peter Xu
  2024-10-21 16:41 ` Fabiano Rosas
  2024-10-21 16:53 ` Peter Maydell
  2 siblings, 0 replies; 5+ messages in thread
From: Peter Xu @ 2024-10-21 15:45 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: qemu-devel, Brad Smith, Peter Maydell, Fabiano Rosas

On Mon, Oct 21, 2024 at 03:54:10PM +0100, Daniel P. Berrangé wrote:
> Setting the SO_REUSEADDR property on a socket allows binding to a port
> number that is in the TIMED_WAIT state. This is usually done on listener
> sockets, to enable a server to restart itself without having to wait for
> the completion of TIMED_WAIT on the port.
> 
> It is also possible, but highly unusual, to set it on client sockets. It
> is rare to explicitly bind() a client socket, since it is almost always
> fine to allow the kernel to auto-bind a client socket to a random free
> port. Most systems will have many 10's of 1000's of free ports that
> client sockets will be bound to.
> 
> eg on Linux
> 
>   $ sysctl -a | grep local_port
>   net.ipv4.ip_local_port_range = 32768	60999
> 
> eg on OpenBSD
> 
>   $ sysctl -a | grep net.inet.ip.port
>   net.inet.ip.portfirst=1024
>   net.inet.ip.portlast=49151
>   net.inet.ip.porthifirst=49152
>   net.inet.ip.porthilast=65535
> 
> A connected socket must have a unique set of value for
> 
>  (protocol, localip, localport, remoteip, remoteport)
> 
> otherwise it is liable to get EADDRINUSE.
> 
> A client connection should trivially avoid EADDRINUSE if letting the
> kernel auto-assign the 'localport' value, which QEMU always does.
> 
> When QEMU sets SO_REUSEADDR on a client socket on OpenBSD, however, it
> upsets this situation.
> 
> The OpenBSD kernel appears to happily pick a 'localport' that is in the
> TIMED_WAIT state, even if there are many other available local ports
> available for use that are not in the TIMED_WAIT state.
> 
> A test program that just loops opening client sockets will start seeing
> EADDRINUSE on OpenBSD when as few as 2000 ports are in TIMED_WAIT,
> despite 10's of 1000's ports still being unused. This contrasts with
> Linux which appears to avoid picking local ports in TIMED_WAIT state.
> 
> This problem on OpenBSD exhibits itself periodically with the migration
> test failing with a message like[1]:
> 
>   qemu-system-ppc64: Failed to connect to '127.0.0.1:24109': Address already in use
> 
> While I have not been able to reproduce the OpenBSD failure in my own
> testing, given the scope of what QEMU tests do, it is entirely possible
> that there could be a lot of ports in TIMED_WAIT state when the
> migration test runs.
> 
> Removing SO_REUSEADDR from the client sockets should not affect normal
> QEMU usage, and should improve reliability on OpenBSD.
> 
> This use of SO_REUSEADDR on client sockets is highly unusual, and
> appears to have been present since the very start of the QEMU socket
> helpers in 2008. The orignal commit has no comment about the use of
> SO_REUSEADDR on the client, so is most likely just an 16 year old
> copy+paste bug.
> 
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2024-10/msg03427.html
>     https://lists.nongnu.org/archive/html/qemu-devel/2024-02/msg01572.html
> 
> Fixes: d247d25f18764402899b37c381bb696a79000b4e
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

Thanks for digging this.  It's good already to see some line removed on the
client side where it doesn't seem to clear why REUSE is needed at all..

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] util: don't set SO_REUSEADDR on client sockets
  2024-10-21 14:54 [PATCH] util: don't set SO_REUSEADDR on client sockets Daniel P. Berrangé
  2024-10-21 15:45 ` Peter Xu
@ 2024-10-21 16:41 ` Fabiano Rosas
  2024-10-21 16:53 ` Peter Maydell
  2 siblings, 0 replies; 5+ messages in thread
From: Fabiano Rosas @ 2024-10-21 16:41 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel
  Cc: Daniel P. Berrangé, Peter Xu, Brad Smith, Peter Maydell

Daniel P. Berrangé <berrange@redhat.com> writes:

> Setting the SO_REUSEADDR property on a socket allows binding to a port
> number that is in the TIMED_WAIT state. This is usually done on listener
> sockets, to enable a server to restart itself without having to wait for
> the completion of TIMED_WAIT on the port.
>
> It is also possible, but highly unusual, to set it on client sockets. It
> is rare to explicitly bind() a client socket, since it is almost always
> fine to allow the kernel to auto-bind a client socket to a random free
> port. Most systems will have many 10's of 1000's of free ports that
> client sockets will be bound to.
>
> eg on Linux
>
>   $ sysctl -a | grep local_port
>   net.ipv4.ip_local_port_range = 32768	60999
>
> eg on OpenBSD
>
>   $ sysctl -a | grep net.inet.ip.port
>   net.inet.ip.portfirst=1024
>   net.inet.ip.portlast=49151
>   net.inet.ip.porthifirst=49152
>   net.inet.ip.porthilast=65535
>
> A connected socket must have a unique set of value for
>
>  (protocol, localip, localport, remoteip, remoteport)
>
> otherwise it is liable to get EADDRINUSE.
>
> A client connection should trivially avoid EADDRINUSE if letting the
> kernel auto-assign the 'localport' value, which QEMU always does.
>
> When QEMU sets SO_REUSEADDR on a client socket on OpenBSD, however, it
> upsets this situation.
>
> The OpenBSD kernel appears to happily pick a 'localport' that is in the
> TIMED_WAIT state, even if there are many other available local ports
> available for use that are not in the TIMED_WAIT state.
>
> A test program that just loops opening client sockets will start seeing
> EADDRINUSE on OpenBSD when as few as 2000 ports are in TIMED_WAIT,
> despite 10's of 1000's ports still being unused. This contrasts with
> Linux which appears to avoid picking local ports in TIMED_WAIT state.
>
> This problem on OpenBSD exhibits itself periodically with the migration
> test failing with a message like[1]:
>
>   qemu-system-ppc64: Failed to connect to '127.0.0.1:24109': Address already in use
>
> While I have not been able to reproduce the OpenBSD failure in my own
> testing, given the scope of what QEMU tests do, it is entirely possible
> that there could be a lot of ports in TIMED_WAIT state when the
> migration test runs.
>
> Removing SO_REUSEADDR from the client sockets should not affect normal
> QEMU usage, and should improve reliability on OpenBSD.
>
> This use of SO_REUSEADDR on client sockets is highly unusual, and
> appears to have been present since the very start of the QEMU socket
> helpers in 2008. The orignal commit has no comment about the use of
> SO_REUSEADDR on the client, so is most likely just an 16 year old
> copy+paste bug.
>
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2024-10/msg03427.html
>     https://lists.nongnu.org/archive/html/qemu-devel/2024-02/msg01572.html
>
> Fixes: d247d25f18764402899b37c381bb696a79000b4e
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] util: don't set SO_REUSEADDR on client sockets
  2024-10-21 14:54 [PATCH] util: don't set SO_REUSEADDR on client sockets Daniel P. Berrangé
  2024-10-21 15:45 ` Peter Xu
  2024-10-21 16:41 ` Fabiano Rosas
@ 2024-10-21 16:53 ` Peter Maydell
  2024-10-21 17:04   ` Daniel P. Berrangé
  2 siblings, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2024-10-21 16:53 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, Peter Xu, Brad Smith, Fabiano Rosas

On Mon, 21 Oct 2024 at 15:54, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> Setting the SO_REUSEADDR property on a socket allows binding to a port
> number that is in the TIMED_WAIT state. This is usually done on listener
> sockets, to enable a server to restart itself without having to wait for
> the completion of TIMED_WAIT on the port.

[...]

> diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
> index 60c44b2b56..80594ecad5 100644
> --- a/util/qemu-sockets.c
> +++ b/util/qemu-sockets.c
> @@ -367,7 +367,6 @@ static int inet_connect_addr(const InetSocketAddress *saddr,
>                           addr->ai_family);
>          return -1;
>      }
> -    socket_set_fast_reuse(sock);
>
>      /* connect to peer */
>      do {

We definitely want to keep the socket_set_fast_reuse()
call in create_fast_reuse_socket() as that function is
used (only) in the "create socket, listen, bind" server
socket code path. (Arguably create_fast_reuse_socket()
is a bit unnecessary as it has only one callsite.)

The one in inet_connect_addr() is clearly wrong as that's
the client end (fixed in this patch).

How about the call in inet_dgram_saddr() ? I'm not sure how
SO_REUSEADDR interacts with UDP sockets... (I'm assuming
the answer is "we need it there" so I'm kind of asking for
the code-review record really.)

In net/socket.c we already set SO_REUSEADDR for dgram
and for listening sockets but not for client ones, so
we're now consistent there.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] util: don't set SO_REUSEADDR on client sockets
  2024-10-21 16:53 ` Peter Maydell
@ 2024-10-21 17:04   ` Daniel P. Berrangé
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel P. Berrangé @ 2024-10-21 17:04 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, Peter Xu, Brad Smith, Fabiano Rosas

On Mon, Oct 21, 2024 at 05:53:17PM +0100, Peter Maydell wrote:
> On Mon, 21 Oct 2024 at 15:54, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > Setting the SO_REUSEADDR property on a socket allows binding to a port
> > number that is in the TIMED_WAIT state. This is usually done on listener
> > sockets, to enable a server to restart itself without having to wait for
> > the completion of TIMED_WAIT on the port.
> 
> [...]
> 
> > diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
> > index 60c44b2b56..80594ecad5 100644
> > --- a/util/qemu-sockets.c
> > +++ b/util/qemu-sockets.c
> > @@ -367,7 +367,6 @@ static int inet_connect_addr(const InetSocketAddress *saddr,
> >                           addr->ai_family);
> >          return -1;
> >      }
> > -    socket_set_fast_reuse(sock);
> >
> >      /* connect to peer */
> >      do {
> 
> We definitely want to keep the socket_set_fast_reuse()
> call in create_fast_reuse_socket() as that function is
> used (only) in the "create socket, listen, bind" server
> socket code path. (Arguably create_fast_reuse_socket()
> is a bit unnecessary as it has only one callsite.)
> 
> The one in inet_connect_addr() is clearly wrong as that's
> the client end (fixed in this patch).
> 
> How about the call in inet_dgram_saddr() ? I'm not sure how
> SO_REUSEADDR interacts with UDP sockets... (I'm assuming
> the answer is "we need it there" so I'm kind of asking for
> the code-review record really.)

We need the one in inet_dgram_saddr, because there is an
explicit bind() call there, for the situation where the
local UDP address is set.

> In net/socket.c we already set SO_REUSEADDR for dgram
> and for listening sockets but not for client ones, so
> we're now consistent there.
> 
> thanks
> -- PMM
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-10-21 17:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-21 14:54 [PATCH] util: don't set SO_REUSEADDR on client sockets Daniel P. Berrangé
2024-10-21 15:45 ` Peter Xu
2024-10-21 16:41 ` Fabiano Rosas
2024-10-21 16:53 ` Peter Maydell
2024-10-21 17:04   ` Daniel P. Berrangé

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).