From: Georgios Cheimonidis <gche@kth.se>
To: linux-sctp@vger.kernel.org
Subject: Re: Gap not retransmitted after switchover
Date: Wed, 12 May 2010 15:26:48 +0000
Message-ID: <4BEAC8B8.8070306@kth.se>
In-Reply-To: <4BE95C1C.4050902@hp.com>

Hi Vlad!

I made quite a lot of tests today. Here are my results.

When I repeated my previous test (IPv4 addresses only), I did not
experience any problems, so it seems that the patch worked! After
receiving three consecutive SACKs reporting the gap (three miss
indications), the server retransmitted the missing TSNs and the data
flow continued normally. I repeated the test many times and the result
was always the same.
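
For reference, the behaviour I was checking for can be written down as a small model. This is only a sketch in Python, not kernel code: the class and method names are made up for illustration, gap-acked TSNs are passed as a plain set rather than as gap ack blocks, and the threshold of three miss indications is the fast retransmit rule from RFC 4960.

```python
from dataclasses import dataclass, field

MISS_THRESHOLD = 3  # fast retransmit after three miss indications (RFC 4960)


@dataclass
class Sender:
    """Hypothetical toy sender that only tracks miss indications per TSN."""
    outstanding: dict = field(default_factory=dict)    # TSN -> miss count
    retransmitted: list = field(default_factory=list)  # TSNs marked for fast rtx

    def send(self, tsn):
        self.outstanding[tsn] = 0

    def on_sack(self, cum_ack, gap_acked):
        # Everything up to the cumulative ack point is done.
        for tsn in [t for t in self.outstanding if t <= cum_ack]:
            del self.outstanding[tsn]
        if not gap_acked:
            return
        highest = max(gap_acked)
        # A TSN below the highest gap-acked TSN that is not itself
        # gap-acked gets one miss indication per SACK.
        for tsn in sorted(self.outstanding):
            if tsn in gap_acked or tsn > highest:
                continue
            self.outstanding[tsn] += 1
            if self.outstanding[tsn] == MISS_THRESHOLD:
                self.retransmitted.append(tsn)


sender = Sender()
for tsn in range(1, 7):
    sender.send(tsn)
# Three SACKs in a row report the same gap: TSNs 3 and 4 are missing.
sender.on_sack(2, {5})
sender.on_sack(2, {5, 6})
sender.on_sack(2, {5, 6})
print(sender.retransmitted)  # prints [3, 4]
```

In the IPv4-only test the captures match this model: the third SACK that reports the gap triggers the retransmission.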

However, I still saw the same problem (not always, but sometimes) with
the following setup:
- Server with both IPv4 and IPv6 addresses on its Ethernet interface.
- Client with IPv6 on Ethernet (X) and IPv4 on WLAN (Y).
- Association established with all of the above addresses belonging to
the association. The client initially contacts the server's IPv6
address from its own IPv6 address, so the initial handshake uses the
IPv6 addresses. Just after association establishment, the client sends
an ASCONF to tell the server to set its primary to X.
- Whenever the Ethernet cable is removed at the client, the client calls
setsockopt(SET_PEER_PRIMARY_ADDR) to tell the server to set Y as its
primary and then calls sctp_bindx() to remove X from the association.
In this scenario, the server sometimes does not retransmit the gap
(after changing primary from X to Y and deleting X from the association).

Another observation: sometimes, after the Ethernet cable is removed and
I call setsockopt(SET_PEER_PRIMARY_ADDR) on the client to set the peer's
primary to Y, the ASCONF chunk is actually transmitted only many seconds
later (in some cases 30 seconds after the call to setsockopt()). I don't
know if this is normal. Even in the IPv4-only test I observed a small
delay between calling setsockopt() and seeing the ASCONF chunk, but it
was about 1-2 seconds. In the IPv4/IPv6 test, this delay varied more.

Looking forward to your comments! Let me know if you want me to test
anything else.

Best regards,
George



On 05/11/2010 05:35 PM, Vlad Yasevich wrote:
>
>
> Vlad Yasevich wrote:
>>
>> Georgios Cheimonidis wrote:
>>> Hi Vlad!
>>>
>>> I have repeated the test with the net-next kernel tree. It seems that
>>> the problem persists. Below, I summarize what I observed from the
>>> capture at the server side (the client's capture agrees with these
>>> observations). Although the timing differs somewhat from the previous
>>> test, the basic observation is still the same. After the server switches
>>> primary address and removes the previous primary from the association,
>>> some unacknowledged DATA packets that were transmitted to the previous
>>> primary (after it became unreachable) are never retransmitted to the new
>>> one.
>>>
>>
>> Thanks for testing.  I am looking to see what can be happening.
>>
>> -vlad
>>
>
> Hi George.
>
> I figured out why there were no retransmits.  Because you changed primary
> path, you kicked in the SFR-CACC algorithm, and our implementation didn't
> deal properly with the fact that some chunks may have moved from the old
> primary to the new one without going through a retransmit.
>
> There are really 2 ways to deal with this:
> 	1).  If we are deleting a transport that had outstanding data,
> 	automatically retransmit the data on the new transport.
>
> 	or.
>
> 	2) Under the same condition as above, move the data to the new primary
> 	destination and let fast-recovery take care of the issue.
>
> Linux implemented (2) from above, and thus this bug surfaced.
>
> Try the attached patch, and let me know if it fixes it for you.
>
> -vlad
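
To check that I understand the two options described above, here is how I picture them as a toy model. This is only a Python sketch under my own assumptions, not the kernel implementation: Chunk, delete_transport and the retransmit_now flag are hypothetical names. Option (1) marks the moved chunks for immediate retransmission, while option (2) only re-homes them on the new primary and relies on later miss indications (fast recovery) to trigger the retransmit.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical outstanding DATA chunk."""
    tsn: int
    transport: str                  # destination the chunk was last sent to
    needs_retransmit: bool = False


def delete_transport(chunks, old, new_primary, retransmit_now):
    """Move chunks outstanding on `old` to `new_primary`.

    retransmit_now=True  models option (1): retransmit immediately.
    retransmit_now=False models option (2): just move the chunks and
    leave the retransmission to fast recovery.
    Returns the TSNs marked for retransmission by this call.
    """
    marked = []
    for chunk in chunks:
        if chunk.transport != old:
            continue
        chunk.transport = new_primary
        if retransmit_now:
            chunk.needs_retransmit = True
            marked.append(chunk.tsn)
    return marked


# Two chunks were sent to X after it became unreachable.
queue = [Chunk(10, "X"), Chunk(11, "X"), Chunk(12, "Y")]
print(delete_transport(queue, "X", "Y", retransmit_now=False))  # prints []
print([c.transport for c in queue])  # prints ['Y', 'Y', 'Y']
```

With option (2) nothing is retransmitted at the moment X is deleted, so the gap is only filled if the moved chunks are later counted as missing on the new primary.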

