From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:57331) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TG5HU-0001Q2-MM for qemu-devel@nongnu.org; Mon, 24 Sep 2012 05:48:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TG5HO-0004nO-Gn for qemu-devel@nongnu.org; Mon, 24 Sep 2012 05:48:48 -0400 Received: from mx1.redhat.com ([209.132.183.28]:57473) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TG5HO-0004nI-8N for qemu-devel@nongnu.org; Mon, 24 Sep 2012 05:48:42 -0400 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q8O9mfam003698 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 24 Sep 2012 05:48:41 -0400 Message-ID: <50602C73.2080503@redhat.com> Date: Mon, 24 Sep 2012 17:48:35 +0800 From: Amos Kong MIME-Version: 1.0 References: <170225001.2762316.1348154187593.JavaMail.root@redhat.com> <505EAD8A.90701@redhat.com> In-Reply-To: <505EAD8A.90701@redhat.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v3 3/3] Fix address handling in inet_nonblocking_connect List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Orit Wasserman Cc: qemu-devel@nongnu.org On 23/09/12 14:34, Orit Wasserman wrote: > On 09/20/2012 06:16 PM, Amos Kong wrote: >> ----- Original Message ----- >>> On 09/19/2012 11:31 AM, Amos Kong wrote: >>>> On 14/09/12 02:58, Orit Wasserman wrote: >>>>> getaddrinfo can give us a list of addresses, but we only try to >>>>> connect to the first one. If that fails we never proceed to >>>>> the next one. This is common on desktop setups that often have >>>>> ipv6 >>>>> configured but not actually working. >>>>> >>>>> To fix this make inet_connect_nonblocking retry connection with a >>>>> different >>>>> address. >>>>> callers on inet_nonblocking_connect register a callback function >>>>> that will >>>>> be called when connect opertion completes, in case of failure the >>>>> fd will have >>>>> a negative value >>>> >>>> >>>> Hi Orit, >>>> >>>> We almost get to the destination :) >>>> >>>>> Signed-off-by: Orit Wasserman >>>>> Signed-off-by: Michael S. Tsirkin >>>>> --- >>>>> migration-tcp.c | 29 +++++----------- >>>>> qemu-char.c | 2 +- >>>>> qemu-sockets.c | 95 >>>>> +++++++++++++++++++++++++++++++++++++++++++++++------- >>>>> qemu_socket.h | 13 ++++++-- >>>>> 4 files changed, 102 insertions(+), 37 deletions(-) >>>>> >>>> >>>> ... >>>> >>>>> diff --git a/qemu-sockets.c b/qemu-sockets.c >>>>> index 212075d..d321c58 100644 >>>>> --- a/qemu-sockets.c >>>>> +++ b/qemu-sockets.c >>>>> @@ -24,6 +24,7 @@ >>>>> >>>>> #include "qemu_socket.h" >>>>> #include "qemu-common.h" /* for qemu_isdigit */ >>>>> +#include "main-loop.h" >>>>> >>>>> #ifndef AI_ADDRCONFIG >>>>> # define AI_ADDRCONFIG 0 >>>>> @@ -217,11 +218,69 @@ listen: >>>>> ((rc) =3D=3D -EINPROGRESS) >>>>> #endif >>>>> >>>>> +/* Struct to store connect state for non blocking connect */ >>>>> +typedef struct ConnectState { >>>>> + int fd; >>>>> + struct addrinfo *addr_list; >>>>> + struct addrinfo *current_addr; >>>>> + ConnectHandler *callback; >>>>> + void *opaque; >>>> >>>>> + Error *errp; >>>> >>>> This member is not used. >>> I will remove it. >>>> >>>>> +} ConnectState; >>>> >>>> # make (gcc-4.4.6-4.el6.x86_64) >>>> CC qemu-sockets.o >>>> qemu-sockets.c:229: error: redefinition of typedef =E2=80=98ConnectS= tate=E2=80=99 >>>> qemu_socket.h:46: note: previous declaration of =E2=80=98ConnectStat= e=E2=80=99 was >>>> here >>>> make: *** [qemu-sockets.o] Error 1 >>>> >>>> >>>> struct ConnectState { >>>> int fd; >>>> struct addrinfo *addr_list; >>>> struct addrinfo *current_addr; >>>> ConnectHandler *callback; >>>> void *opaque; >>>> }; >>>> >>>> >>>>> + >>>>> static int inet_connect_addr(struct addrinfo *addr, bool block, >>>>> - bool *in_progress) >>>>> + bool *in_progress, ConnectState >>>>> *connect_state); >>>>> + >>>>> +static void wait_for_connect(void *opaque) >>>>> +{ >>>>> + ConnectState *s =3D opaque; >>>>> + int val =3D 0, rc =3D 0; >>>>> + socklen_t valsize =3D sizeof(val); >>>>> + bool in_progress =3D false; >>>>> + >>>>> + qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL); >>>>> + >>>>> + do { >>>>> + rc =3D getsockopt(s->fd, SOL_SOCKET, SO_ERROR, (void *) >>>>> &val, &valsize); >>>>> + } while (rc =3D=3D -1 && socket_error() =3D=3D EINTR); >>>> >>>> # man getsockopt >>>> RETURN VALUE >>>> On success, zero is returned. On error, -1 is returned, and >>>> errno is set appropriately. >>>> >>>> .. original code: >>>> .. >>>> .. do { >>>> .. ret =3D getsockopt(s->fd, SOL_SOCKET, SO_ERROR, (void *) >>>> &val, &valsize); >>>> .. } while (ret =3D=3D -1 && (socket_error()) =3D=3D EINTR); >>>> .. >>>> .. if (ret < 0) { // ret equals to -1, and socket_error() !=3D >>>> EINTR >>>> >>>> If ret < 0, just set the error state, but the callback function is >>>> not set to NULL. >>>> Then tcp_wait_for_connect() will be called again, and retry to >>>> checking current >>>> socket by getsockopt(). >>>> >>>> But in this patch, we will abandon current socket, and connect next >>>> address on the list. >>>> We should reserve that block, and move >>>> qemu_set_fd_handler2(...NULL..) below it. >>> I'm sorry I don't get you point, it makes no difference where is >>> qemu_set_fd_handlers(...NULL...). >>> rc < 0 happens in two cases: >>> 1. val !=3D 0 - connect error >> >> It seems rc is 0 in this case. >> >>> 2. getsockopt failed for reason different than EINTR . >>> In both cases we can only move to the next addr in the list, >>> inet_connect_addr will call qemu_set_fd_handlers with the new socket >>> fd. >> >> In the past, if rc < 0, we will return without setting fd handler to N= ULL, >> tcp_wait_for_connect() will be re-called. the original socket (current= addr) >> will be re-checked by getsockopt() >> >> But in your patch, it will retry next addr in the list. >> > In the past we didn't have the address list so we could not continue to= the next one, > now we can. Yes, we should try to connect next addr if current address is not availab= le. But rc < 0 doesn't mean current is not available. "rc < 0" means fail to get socket option, we need to retry to _get_=20 socket option. --=20 Amos.