From: Eric Dumazet <eric.dumazet@gmail.com>
To: Willy Tarreau <w@1wt.eu>
Cc: "Cyril Bonté" <cyril.bonte@free.fr>,
	netdev@vger.kernel.org, "Daniel Baluta" <daniel.baluta@gmail.com>,
	"Gaspar Chilingarov" <gasparch@gmail.com>,
	"Charles Duffy" <charles@dyfis.net>
Subject: Re: tcp: disallow bind() to reuse addr/port regression in 2.6.38
Date: Sat, 02 Apr 2011 21:44:55 +0200
Message-ID: <1301773495.2837.26.camel@edumazet-laptop>
In-Reply-To: <20110402191516.GG5552@1wt.eu>

On Saturday, 02 April 2011 at 21:15 +0200, Willy Tarreau wrote:
> Hi Eric,
> 
> On Sat, Apr 02, 2011 at 08:46:11PM +0200, Cyril Bonté wrote:
> > On Saturday, 2 April 2011 at 20:10:48, Eric Dumazet wrote:
> > > On Saturday, 02 April 2011 at 20:01 +0200, Cyril Bonté wrote:
> > > (...)
> > > > 		if (shutdown(listenfd, SHUT_WR) == 0 &&
> > > > 		    listen(listenfd, 1024) == 0 &&
> > > > 		    shutdown(listenfd, SHUT_RD) == 0) {
> > > > 			printf("shutdown OK\n");
> > > > 		}
> > > > 	}
> > > > 	exit(0);
> > > > }
> > > 
> > > Wow, not clear what this is doing....
> > > 
> > > Surely the listen() call is not needed?
> > > 
> > > And the shutdown(listenfd, SHUT_WR) is clearly useless too.
> > 
> > Well, I'm not the best one to explain that part, but from what I read in the
> > comments of this part of the code, both listen and SHUT_WR are used to detect
> > errors on various OSes (OpenBSD, Solaris, ...).
> > 
> > > I feel you only needed the shutdown(listenfd, SHUT_RD) call.
> > > 
> > > Why does haproxy need to set up a second listening socket on the same port?
> > 
> > I simplified the test case, which is far from what haproxy does (I just forgot
> > to explain the real behaviour).
> > To reload the configuration, a new haproxy process is launched, sending a 
> > signal to the previous one and asking it to free the ports for a while (the 
> > shutdown part in the test). The new process then tries to bind the ports, 
> > which worked until 2.6.38 (if an error occurs, a new signal is sent to the 
> > previous process to listen to its sockets again).
> 
> Indeed, here's what normally happens when haproxy reloads.
> 
> New process is loaded with a new config. Once the config correctly parses,
> it sends a signal to the previous process asking it to temporarily release
> its listening ports so that the new one can bind, hence the shutdown(SHUT_RD)
> performed in the old process.
> 
> Then the new process can grab the ports and listen to them. Once that's OK,
> it sends another signal to the old process telling it it can go away. But
> if the new process failed to completely start (eg: could not grab one port),
> then it sends a third signal to the old process asking it to rebind the ports
> and serve them again, and the new one dies with an error.
> 
> That way, the service is never interrupted even if the new config fails
> late, because the old process has the ability to rebind to the port it
> temporarily released.
> 
> Now with 2.6.38, as Cyril diagnosed it, the new bind() fails when the
> old process has just performed its shutdown(SHUT_RD), preventing the
> new process from binding to the ports until the old process has
> closed them for good.
> 
> The behaviour is very useful: the old process might have lost its
> privileges, but it will not have to rebind the socket, only listen
> on it again, since it is never closed.
> 
> This is quite embarrassing, because this code used to work for the
> last 10 years, at least since kernel 2.2, and maybe even 2.0, I don't
> remember.
> 
> I'm not sure what the original intent of the patch was, nor what the
> reported issue was, but maybe we could find a way to both fix the
> reported issue (if any) and restore the old behaviour in order not
> to break existing programs.
> 
> Best regards,
> Willy
> 

I wish it was that simple....

http://www.spinics.net/lists/netdev/msg151551.html

Is Cyril's program running OK on FreeBSD?
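
For readers following the thread, here is a minimal single-process sketch of
the reload sequence Willy describes in the quoted text: listen, release the
port with shutdown(SHUT_RD), attempt a second bind() on the same port, then
listen() again on the original socket. It is not haproxy's code and not
Cyril's full test case. The helper names (bind_reuse, pick_free_port,
try_handover), the loopback address and the way the port is chosen are all
assumptions made for illustration. The outcome depends on the kernel and OS,
so no expected output is given.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NEW_BIND_OK   1	/* second socket could bind while the first was paused */
#define OLD_RESUME_OK 2	/* first socket could listen() again afterwards */

/* Hypothetical helper: TCP socket with SO_REUSEADDR bound to 127.0.0.1:port.
 * Returns the descriptor, or -1 on error. */
int bind_reuse(unsigned short port)
{
	struct sockaddr_in sa;
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sa.sin_port = htons(port);
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: find a currently free port by letting the kernel
 * choose one, then releasing it. Returns the port, or -1 on error. */
int pick_free_port(void)
{
	struct sockaddr_in sa;
	socklen_t len = sizeof(sa);
	int fd = bind_reuse(0);
	int port = -1;

	if (fd < 0)
		return -1;
	if (getsockname(fd, (struct sockaddr *)&sa, &len) == 0)
		port = ntohs(sa.sin_port);
	close(fd);
	return port;
}

/* Returns -1 if the setup fails (including shutdown() being refused),
 * otherwise a mask of NEW_BIND_OK and OLD_RESUME_OK. */
int try_handover(unsigned short port)
{
	int result = 0;
	int new_fd;
	int old_fd = bind_reuse(port);	/* explicit port, as a real service uses */

	if (old_fd < 0)
		return -1;
	/* "Old process": listening, then asked to release the port for a while. */
	if (listen(old_fd, 1024) < 0 || shutdown(old_fd, SHUT_RD) < 0) {
		close(old_fd);
		return -1;
	}
	/* "New process": tries to grab the same port. This is the bind()
	 * reported in this thread as failing on 2.6.38. */
	new_fd = bind_reuse(port);
	if (new_fd >= 0) {
		result |= NEW_BIND_OK;
		close(new_fd);	/* pretend the new process failed later and died */
	}
	/* "Old process": asked to serve again; no new bind(), only listen(). */
	if (listen(old_fd, 1024) == 0)
		result |= OLD_RESUME_OK;
	close(old_fd);
	return result;
}
```

Calling try_handover() with the value returned by pick_free_port() (after
checking that it is not -1) and printing the mask on different kernels or
operating systems is one way to compare behaviours. A real reload involves two
processes and signals, which this sketch leaves out.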




