* Source IP not corresponding to interface
@ 2010-05-25 16:30 Georgios Cheimonidis
  2010-05-25 17:11 ` Vlad Yasevich
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Georgios Cheimonidis @ 2010-05-25 16:30 UTC (permalink / raw)
  To: linux-sctp

Hi!

I have observed a problem while doing some tests with dynamic address 
reconfiguration. Let me first describe my setup and application.

Setup: I have two hosts, one that acts as a client and another that acts 
as a server. The client has two IPv4 addresses (one on wlan, let's call 
it X, and another on a 3G p-to-p connection, let's call it Y). There are 
two default routes on the client, and the wlan default has a smaller 
metric than the 3G default. The server is single homed. All addresses 
belong to different subnets.
Both hosts are running the net-next kernel (downloaded from David 
Miller's net-next source tree on 12-May-2010). I have also applied two 
extra patches found in: (a) 
http://www.spinics.net/lists/linux-sctp/msg00881.html and 
(b) http://www.spinics.net/lists/linux-sctp/msg00882.html. I have also 
enabled SCTP debugging messages.

Application: In my simple application, only the server transmits 
messages to the client. The server uses blocking send() and the client 
uses blocking recv(). My client application has a simple policy: When IP 
address X is removed from the system, a monitoring process reports this 
event to my application.  When my application receives this event 
notification, it takes two consecutive actions. First, it calls 
sctp_bindx() to remove IP address X from the association. Immediately 
after that, it calls setsockopt(SCTP_SET_PEER_PRIMARY_ADDR) to change the 
peer's (server's) primary destination to address Y. So, when I execute 
"sudo dhcpcd -k wlan0" at the client host, the server should eventually 
remove address X from its list of destination addresses and then change 
its primary (destination) address to Y. When I execute the command "sudo 
dhcpcd wlan0", and when the IP address is finally configured, my client 
application gets a notification and calls sctp_bindx() to first add the 
wlan's IP address X to the association and then calls setsockopt() to 
change the peer's (server's) primary to address X. In simple words, 
whenever both (wlan and 3G) are available at the client, the client 
would like to receive packets from wlan.
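
As a rough sketch of the policy just described (this is not the actual 
application code), the two consecutive actions per event can be written 
as a small planning function. The enum names, the struct and 
plan_actions() are hypothetical and exist only for illustration; a real 
client would turn each planned step into the corresponding sctp_bindx() 
or setsockopt() call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, used only to illustrate the ordering of the calls. */
enum op { OP_BINDX_REM, OP_BINDX_ADD, OP_SET_PEER_PRIMARY };
enum event { EV_WLAN_ADDR_REMOVED, EV_WLAN_ADDR_ADDED };

struct action {
    enum op op;
    char addr;   /* 'X' = wlan address, 'Y' = 3G address */
};

/* Fill out[0] and out[1] with the two consecutive actions taken for an
 * event, in the order they are issued. Returns the number of actions. */
size_t plan_actions(enum event ev, struct action out[2])
{
    assert(out != NULL);
    if (ev == EV_WLAN_ADDR_REMOVED) {
        /* Drop X from the association, then make Y the peer's primary. */
        out[0] = (struct action){ OP_BINDX_REM, 'X' };
        out[1] = (struct action){ OP_SET_PEER_PRIMARY, 'Y' };
    } else {
        /* Add X back, then ask the peer to prefer X again. */
        out[0] = (struct action){ OP_BINDX_ADD, 'X' };
        out[1] = (struct action){ OP_SET_PEER_PRIMARY, 'X' };
    }
    return 2;
}
```

The point of the sketch is only the ordering: the address change is 
always issued first and the primary change immediately after it.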

In the following experiment, I start the association with the client 
having both IP addresses (address X is used for the initial handshake) 
and then I execute "sudo dhcpcd -k wlan0" and after one minute I execute 
"sudo dhcpcd wlan0". Everything is OK after removing wlan's IP address 
(which occurs after executing "dhcpcd -k"). A fast switchover to 3G 
interface is achieved. But after the wlan's address is configured again, 
SOMETIMES (not always!!), all subsequent packets (SACKs and ASCONFs) 
sent from the client to the server are sent from the wlan interface but 
with the 3G IP address!! The source address does not correspond to the 
wlan interface, and the wireless network router discards these packets, 
which consequently never reach the server. I have to note that the 
first ASCONF for adding the new IP address to the association is 
correctly sent from the 3G interface using the 3G IP address. It is the 
second ASCONF (for setting peer's primary address) and all SACKs that 
are sent from wlan with wrong source IP. Also I observe a delay before 
these packets appear in the wlan interface.

I have played with rp_filter and accept_source_route options in 
/etc/sysctl.conf but I haven't observed any difference. The values that 
I used throughout most of my tests were rp_filter = 0 and 
accept_source_route = 1.

Any help will be highly appreciated. If you need the kernel log file I 
can also send it.

Thanks in advance,
George


* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
@ 2010-05-25 17:11 ` Vlad Yasevich
  2010-05-25 19:12 ` Vlad Yasevich
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Vlad Yasevich @ 2010-05-25 17:11 UTC (permalink / raw)
  To: linux-sctp



Georgios Cheimonidis wrote:
> Hi!
> 
> I have observed a problem while doing some tests with dynamic address
> reconfiguration. Let me first describe my setup and application.
> 
> Setup: I have two hosts, one that acts as a client and another that acts
> as a server. The client has two IPv4 addresses (one on wlan, let's call
> it X, and another on a 3G p-to-p connection, let's call it Y). There are
> two default routes on the client, and the wlan default has a smaller
> metric than the 3G default. The server is single homed. All addresses
> belong to different subnets.
> Both hosts are running the net-next kernel (downloaded from David
> Miller's net-next source tree on 12-May-2010). I have also applied two
> extra patches found in: (a)
> http://www.spinics.net/lists/linux-sctp/msg00881.html and
> (b) http://www.spinics.net/lists/linux-sctp/msg00882.html. I have also
> enabled SCTP debugging messages.
> 

Hi George

Thanks for this report.  I am setting up a reproduction environment now.
Will let you know what I find.

It sounds like the routing might get kind-of funky after you add the
address back.  Does the default route get recreated with the right
metric?

Kernel logs are always nice to have.  You can even look through them
and try finding references to the sctp_v4_get_dst() call to see what
it shows you.  That's where routing and source address selection
is done.

I am also assuming that this is all v4, right?  I've finally got the v6
patch ready.  It passed all the tests I could throw at it.

-vlad


* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
  2010-05-25 17:11 ` Vlad Yasevich
@ 2010-05-25 19:12 ` Vlad Yasevich
  2010-05-25 19:53 ` Georgios Cheimonidis
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Vlad Yasevich @ 2010-05-25 19:12 UTC (permalink / raw)
  To: linux-sctp

[-- Attachment #1: Type: text/plain, Size: 4120 bytes --]

Hi George

Georgios Cheimonidis wrote:
> Hi Vlad!
> 
> Thanks for the quick reply!
> - The default route is recreated with a different metric but always
> smaller than the metric corresponding to the default route of the 3G
> interface.
> - The IP addresses were all IPv4, but I used AF_INET6 sockets, since in
> some other tests I add and remove IPv6 addresses as well. I don't know
> if this matters.
> - I am also attaching the kernel log from the client host. Address X of
> the previous description is 192.XXX.XXX.XXX (client's wlan), Y is
> 95.YYY.YYY.YYY (client's 3G) and Z is 213.ZZZ.ZZZ.ZZZ (server's single
> IP address). I will also try to examine it and check the
> sctp_v4_get_dst() calls.
> 
> Nice to hear about the v6 patch! I will also do some testing and let you
> know about the results. Have you already published it in the mailing list?
> 

Ok, so here is a simple patch to try along with the explanation.

When you add an address we send an ASCONF, but the new address is not usable
for anything other than Heartbeats until ASCONF_ACK is received.

Also, the addition of a new default route causes something to time out or change
such that the transport loses a route.  When we look up the new route, we get
an updated route with the lower metric; however, we can't use the source
provided by that route because we have not received the ASCONF_ACK yet.
So, we try to do a lookup with the source addresses provided.  We still can only
use 1 of the addresses (the 3G one).  The routing table still appears to return
us the route with a lower metric.  I can reproduce this with a simple
'ip route get' command.  Try it on your system:

   ip route get <dest> from <second source>

You will see a route that will have the source set to 'second source', but using
the interface that the preferred source is configured on (since that one has a
lower metric).

Thus we end up using the wrong interface, with the 'correct' source address.
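
To make the mismatch concrete, here is a toy model of the lookup described
above.  It is only a sketch under simplifying assumptions: the struct names
and both functions are hypothetical, the "routing table" is just the two
default routes, and the route is picked by metric alone while the requested
source is carried through unchanged.  It is not how the kernel's routing
code is actually structured.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one default route. */
struct route {
    const char *dev;      /* outgoing interface */
    const char *prefsrc;  /* address configured on that interface */
    int metric;           /* lower metric wins */
};

struct lookup_result {
    const char *dev;
    const char *src;
};

/* Model of "ip route get <dest> from <src>": the route is chosen by metric
 * only; a caller-supplied source is kept as-is, otherwise the route's own
 * preferred source is used. */
struct lookup_result route_get_from(const struct route *tbl, size_t n,
                                    const char *from)
{
    assert(tbl != NULL && n > 0);
    const struct route *best = &tbl[0];
    for (size_t i = 1; i < n; i++)
        if (tbl[i].metric < best->metric)
            best = &tbl[i];
    struct lookup_result r = { best->dev, from ? from : best->prefsrc };
    return r;
}

/* Returns 1 when the source address is not the one configured on the
 * interface the lookup selected, 0 otherwise. */
int src_dev_mismatch(const struct route *tbl, size_t n, struct lookup_result r)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].dev, r.dev) == 0)
            return strcmp(tbl[i].prefsrc, r.src) != 0;
    return 0;
}
```

With a table of { wlan0, X, metric 100 } and { ppp0, Y, metric 200 } (the
interface names and metrics are made up), a lookup "from Y" selects wlan0
while keeping Y as the source, which is the condition reported in this thread.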

I don't think there is anything we can do about this before ASCONF_ACK is
received.  However, when we receive the ASCONF_ACK, we can trigger a route
lookup and source address selection again.

I've attached the patch.  So, looks like you will still see this strange
condition for a short duration, but once ASCONF_ACK is received it should clear up.
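
The intent of the attached patch can be pictured with another toy model.
This is a sketch only: the enum, the struct and reroute_on_asconf_ack() are
hypothetical stand-ins, and "rerouted" simply marks that route lookup and
source selection would be redone for that transport.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the transport states. */
enum tstate { T_ACTIVE, T_INACTIVE, T_UNCONFIRMED };

struct transport {
    enum tstate state;
    int rerouted;   /* set when route and source are looked up again */
};

/* Walk the transports once ASCONF_ACK has arrived.
 * skip_active != 0 models the loop before the patch (active transports
 * keep their stale route); skip_active == 0 models the loop after it.
 * Returns how many transports were re-routed. */
size_t reroute_on_asconf_ack(struct transport *t, size_t n, int skip_active)
{
    assert(t != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (skip_active && t[i].state == T_ACTIVE)
            continue;
        t[i].rerouted = 1;
        count++;
    }
    return count;
}
```

With a single active transport to the server, the pre-patch loop re-routes
nothing, so the stale interface/source pair stays in use; the post-patch
loop re-routes it.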

Let me know if this works.  I'll look back in history to see why the code is
the way it is.

-vlad

> Best regards
> George

[-- Attachment #2: george.patch --]
[-- Type: text/x-patch, Size: 523 bytes --]

diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 24effdf..183d38c 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -3136,8 +3136,6 @@ static void sctp_asconf_param_success(struct sctp_association *asoc,
 		local_bh_enable();
 		list_for_each_entry(transport, &asoc->peer.transport_addr_list,
 				transports) {
-			if (transport->state == SCTP_ACTIVE)
-				continue;
 			dst_release(transport->dst);
 			sctp_transport_route(transport, NULL,
 					     sctp_sk(asoc->base.sk));


* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
  2010-05-25 17:11 ` Vlad Yasevich
  2010-05-25 19:12 ` Vlad Yasevich
@ 2010-05-25 19:53 ` Georgios Cheimonidis
  2010-05-26 13:49 ` Georgios Cheimonidis
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Georgios Cheimonidis @ 2010-05-25 19:53 UTC (permalink / raw)
  To: linux-sctp

Hi Vlad!

Ok, it makes sense. I will test this patch tomorrow morning and let you 
know.

Best regards,
George




* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
                   ` (2 preceding siblings ...)
  2010-05-25 19:53 ` Georgios Cheimonidis
@ 2010-05-26 13:49 ` Georgios Cheimonidis
  2010-05-26 13:57 ` Vlad Yasevich
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Georgios Cheimonidis @ 2010-05-26 13:49 UTC (permalink / raw)
  To: linux-sctp

Hi Vlad!

I have applied the patch and repeated the same test. The results are 
good. I don't see any packets with wrong source IP in the wlan interface 
any more. Most of the time the switchover from 3G to wlan (when wlan's 
IP is made available and added to the association) is quite fast. 
Sometimes, I observe a small delay between the ASCONF_ACK received from 
the server (corresponding to the ASCONF for adding the wlan's IP 
address) and the first packet (SACK or ASCONF for setting peer's 
primary) transmitted from the wlan interface. The maximum value of this 
delay is about 1 second. During this small delay, no packets are 
transmitted from wlan or 3G interface.

Best regards,
George



* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
                   ` (3 preceding siblings ...)
  2010-05-26 13:49 ` Georgios Cheimonidis
@ 2010-05-26 13:57 ` Vlad Yasevich
  2010-05-26 15:50 ` Vlad Yasevich
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Vlad Yasevich @ 2010-05-26 13:57 UTC (permalink / raw)
  To: linux-sctp



Georgios Cheimonidis wrote:
> Hi Vlad!
> 
> I have applied the patch and repeated the same test. The results are
> good. I don't see any packets with wrong source IP in the wlan interface
> any more. Most of the time the switchover from 3G to wlan (when wlan's
> IP is made available and added to the association) is quite fast.
> Sometimes, I observe a small delay between the ASCONF_ACK received from
> the server (corresponding to the ASCONF for adding the wlan's IP
> address) and the first packet (SACK or ASCONF for setting peer's
> primary) transmitted from the wlan interface. The maximum value of this
> delay is about 1 second. During this small delay, no packets are
> transmitted from wlan or 3G interface.

Interesting...  Can you send a log when this occurs?

Also,  does this 1 second delay occur if you disable debug output?  I know
sometimes the output itself can cause delays.

-vlad



* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
                   ` (4 preceding siblings ...)
  2010-05-26 13:57 ` Vlad Yasevich
@ 2010-05-26 15:50 ` Vlad Yasevich
  2010-05-26 20:21 ` Georgios Cheimonidis
  2010-05-26 20:31 ` Vlad Yasevich
  7 siblings, 0 replies; 9+ messages in thread
From: Vlad Yasevich @ 2010-05-26 15:50 UTC (permalink / raw)
  To: linux-sctp



Georgios Cheimonidis wrote:
> Hi!
> 
> I repeated the test once again. The scenario for the attached log is the
> following.
> Client starts with 2 IPv4 addresses on the association (X: wlan and Y:
> 3G). Server has only one address Z. I repeatedly do the following on the
> client side:
> - Remove address X and set Y as peer's (server's) primary (whenever
> address X becomes unavailable).
> - Add address X and set X as peer's primary (whenever address X becomes
> available).
> The above is repeated 12 times (12 removals and 12 additions of wlan's
> IP address).
> A measurable delay (about 1 second) occurred during the #4, #6, #7, #9,
> #10 and #12 addition of address X. In the remaining cases the delay was
> negligible. This delay was measured on the server side by examining the
> capture from wireshark. On all occasions, it was the time between the
> ASCONF_ACK sent from the server and the first packet sent from the
> client (SACK most of the times) to the server from the wlan's IP address.
> I have disabled debugging messages in my application.

Hi George

Looking at the log (iteration #4), I see lots of traffic at 16:13:16.
Looks like the client gets the ASCONF_ACK for the ADD_IP parameter, and
re-looks up the route to the server.  The route is now rt_dst:213.ZZZ.ZZZ.ZZZ,
rt_src:192.XXX.XXX.XXX.

It sends the ASCONF for SET_PRIMARY and then doesn't get anything back from
the server until 16:13:17 which is DATA.  Now, the kernel timestamps don't
include milliseconds so it's not really possible to tell how much time has
passed.   So at 16:13:17, there is DATA flow from the server and it triggers a
SACK.  Looks like there is also a HEARTBEAT.

So it could be that the delay is the HEARTBEAT delay.  Try playing with
rto.initial value, or even try forcing a user Heartbeat, when you see a
new path come up on the server.

-vlad



* RE: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
                   ` (5 preceding siblings ...)
  2010-05-26 15:50 ` Vlad Yasevich
@ 2010-05-26 20:21 ` Georgios Cheimonidis
  2010-05-26 20:31 ` Vlad Yasevich
  7 siblings, 0 replies; 9+ messages in thread
From: Georgios Cheimonidis @ 2010-05-26 20:21 UTC (permalink / raw)
  To: linux-sctp

Hi Vlad!

I will do some more tests with various rto.initial values and let you know.
About what you said regarding the transmission of a user HEARTBEAT at the
server: do I get a notification when the address is added to the
association? From what I remember of some tests that I did with the
2.6.31 kernel, this notification (SCTP_ADDR_ADDED), as well as
SCTP_ADDR_REMOVED and SCTP_ADDR_MADE_PRIM, was not received by my
application (probably because they were not implemented yet).

Best regards,
George 
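
To make the question concrete, here is a rough sketch of the server-side logic that such notifications would enable: probe a path as soon as its address is reported as added. It is only an illustration. The state names are the ones mentioned above, used as plain strings; handle_addr_event and the request_heartbeat callback are hypothetical, and no real socket or kernel call is made. As noted above, the notifications may not be delivered at all on the kernels discussed here.

```python
# State names from the thread, kept as plain strings for illustration.
# No kernel constants or socket calls are involved.
ADDR_ADDED = "SCTP_ADDR_ADDED"
ADDR_REMOVED = "SCTP_ADDR_REMOVED"
ADDR_MADE_PRIM = "SCTP_ADDR_MADE_PRIM"


def handle_addr_event(state, address, paths, request_heartbeat):
    """Track known peer addresses and probe newly added ones.

    paths is a set of peer addresses; request_heartbeat is a callback
    (hypothetical) that would trigger a user-requested HEARTBEAT.
    Returns True if a heartbeat was requested for this event.
    """
    if state == ADDR_ADDED:
        paths.add(address)
        request_heartbeat(address)
        return True
    if state == ADDR_REMOVED:
        paths.discard(address)
    # Other states, such as ADDR_MADE_PRIM, need no probing here.
    return False


if __name__ == "__main__":
    probed = []
    paths = {"95.YYY.YYY.YYY"}
    handle_addr_event(ADDR_ADDED, "192.XXX.XXX.XXX", paths, probed.append)
    print(sorted(paths), probed)
```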

-----Original Message-----
From: Vlad Yasevich [mailto:vladislav.yasevich@hp.com] 
Sent: Wednesday, May 26, 2010 5:50 PM
To: Georgios Cheimonidis
Cc: linux-sctp@vger.kernel.org
Subject: Re: Source IP not corresponding to interface



Georgios Cheimonidis wrote:
> Hi!
> 
> I repeated the test once again. The scenario for the attached log is the
> following.
> Client starts with 2 IPv4 addresses on the association (X: wlan and Y:
> 3G). Server has only one address Z. I repeatedly do the following on the
> client side:
> - Remove address X and set Y as peer's (server's) primary (whenever
> address X becomes unavailable).
> - Add address X and set X as peer's primary (whenever address X becomes
> available).
> The above is repeated 12 times (12 removals and 12 additions of wlan's
> IP address).
> A measurable delay (about 1 second) occurred during the #4, #6, #7, #9,
> #10 and #12 additions of address X. In the remaining cases the delay was
> negligible. This delay was measured on the server side by examining the
> capture from wireshark. On all occasions, it was the time between the
> ASCONF_ACK sent from the server and the first packet sent from the
> client (a SACK most of the time) to the server from the wlan's IP address.
> I have disabled debugging messages in my application.

Hi George

Looking at the log (iteration #4), I see lots of traffic at 16:13:16.
Looks like the client gets the ASCONF_ACK for the ADD_IP parameter and
looks up the route to the server again.  The route is now
rt_dst:213.ZZZ.ZZZ.ZZZ, rt_src:192.XXX.XXX.XXX.

It sends the ASCONF for SET_PRIMARY and then doesn't get anything back from
the server until 16:13:17, which is DATA.  Now, the kernel timestamps don't
include milliseconds, so it's not really possible to tell how much time has
passed.  So at 16:13:17, there is DATA flow from the server and it triggers
a SACK.  Looks like there is also a HEARTBEAT.

So it could be that the delay is the HEARTBEAT delay.  Try playing with
the rto.initial value, or even try forcing a user HEARTBEAT when you see a
new path come up on the server.

-vlad
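
Since the kernel log timestamps have no sub-second resolution, the delay is easier to quantify from the packet capture. The sketch below computes, for each ASCONF_ACK, the time until the next packet sent from the wlan address. It assumes the capture has been exported to (timestamp, source address, chunk type) records and already filtered to the ASCONF_ACKs that answer an address addition. The record layout and the function name asconf_ack_gaps are assumptions for illustration, not something taken from the thread.

```python
def asconf_ack_gaps(packets, wlan_addr):
    """Return the gaps (in seconds) between each ASCONF_ACK and the next
    packet whose source is wlan_addr.

    packets: iterable of (timestamp_seconds, source_address, chunk_type)
    tuples, sorted by timestamp (hypothetical export format).
    """
    gaps = []
    pending_ack = None  # timestamp of an ASCONF_ACK still awaiting a wlan packet
    for ts, src, chunk in packets:
        if chunk == "ASCONF_ACK":
            pending_ack = ts
        elif pending_ack is not None and src == wlan_addr:
            gaps.append(ts - pending_ack)
            pending_ack = None
    return gaps


if __name__ == "__main__":
    capture = [
        (10.0, "213.ZZZ.ZZZ.ZZZ", "ASCONF_ACK"),  # server acknowledges the add
        (11.0, "192.XXX.XXX.XXX", "SACK"),        # first packet from wlan
        (11.5, "192.XXX.XXX.XXX", "ASCONF"),      # later packets are ignored
    ]
    print(asconf_ack_gaps(capture, "192.XXX.XXX.XXX"))
```

Running this over all twelve additions would show directly which iterations have the roughly one-second gap and which do not.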




* Re: Source IP not corresponding to interface
  2010-05-25 16:30 Source IP not corresponding to interface Georgios Cheimonidis
                   ` (6 preceding siblings ...)
  2010-05-26 20:21 ` Georgios Cheimonidis
@ 2010-05-26 20:31 ` Vlad Yasevich
  7 siblings, 0 replies; 9+ messages in thread
From: Vlad Yasevich @ 2010-05-26 20:31 UTC (permalink / raw)
  To: linux-sctp



Georgios Cheimonidis wrote:
> Hi Vlad!
> 
> I will do some more tests with various rto.initial values and let you know.
> About what you said regarding the transmission of a user HEARTBEAT at the
> server: do I get a notification when the address is added to the
> association? From what I remember of some tests that I did with the
> 2.6.31 kernel, this notification (SCTP_ADDR_ADDED), as well as
> SCTP_ADDR_REMOVED and SCTP_ADDR_MADE_PRIM, was not received by my
> application (probably because they were not implemented yet).
> 

Hmm.. You are right.  That's a bummer.

I guess we have more work to do... ;)

-vlad



