linux-sctp.vger.kernel.org archive mirror
* SCTP Association Restart
@ 2009-10-14 11:53 Gregory Waines
  2009-10-14 13:07 ` Vlad Yasevich
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Gregory Waines @ 2009-10-14 11:53 UTC (permalink / raw)
  To: linux-sctp



- ok, so I am using Linux 2.6.14 .
  can someone confirm that association restart should work
  for the SCTP implementation in Linux 2.6.14 .
  i.e. specifically for the side of the association that stays
       up and receives the unexpected INIT and COOKIE_ECHO while
       in the ESTABLISHED state.
       This end should accept the new INIT request as a restart
       (provided ip address and port match), report RESTART to the
       ULP, and reset sequence numbers to zero.
  This all works in 2.6.14 ?


- If I have a Linux process with an established SCTP connection/
  association,
  is there a socket option that prevents the kernel from 
  ABORTing the association if this Linux process fails unexpectedly ?


- I have the following question related to using the one-to-one style socket
  interface when trying to do an Association Restart:
     * if my node is typically the server side of the SCTP connections
     * then on a restart of this node,
     * I assume that I could NOT set up my server's listening socket first
       (i.e. socket(), bind(), listen(), accept()...)
       and then try to re-establish old associations by
       socket(), bind(), connect() ...
       because the bind() would probably fail due to the listening socket
       already being bound to the same SCTP IP Address and Port.
     * is this correct ?
     * i.e. if using the one-to-one style interface, and
               you are the server, and
               you restart, and
               you are trying to recover SCTP Associations,
            then
               the only way you can get around the bind() conflict is to
               recover the SCTP associations first, and then
               set up your listening socket again.


thanks in advance for any help,
Greg Waines
Nortel
waines@nortel.com


* Re: SCTP Association Restart
From: Vlad Yasevich @ 2009-10-14 13:07 UTC (permalink / raw)
  To: linux-sctp



Gregory Waines wrote:
> 
> - ok, so I am using Linux 2.6.14 .
>   can someone confirm that association restart should work
>   for the SCTP implementation in Linux 2.6.14 .
>   i.e. specifically for the side of the association that stays
>        up and receives the unexpected INIT and COOKIE_ECHO while
>        in the ESTABLISHED state.
>        This end should accept the new INIT request as a restart
>        (provided ip address and port match), report RESTART to the
>        ULP, and reset sequence numbers to zero.
>   This all works in 2.6.14 ?

Yes.  There is a bug there, however: any data awaiting re-assembly or
ordering when the restart happens stays queued (as stale data) and will cause
issues.  That was fixed in 2.6.21.  You will want these 2 commits to fix
it:
	0b58a811461ccf3cf848aba4cc192538fd3b0516
	749bf9215ed1a8b6edb4bb03693c2b62c6b9c2a4
> 
> 
> - If I have a Linux process with an established SCTP connection/
>   association,
>   is there a socket option that prevents the kernel from 
>   ABORTing the association if this Linux process fails unexpectedly ?
> 

Nope.  When the socket is closed, the association is closed as well.
Depending on your settings, it will either be ABORTed or closed with SHUTDOWN.
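
For illustration, the main "setting" that usually decides between ABORT and SHUTDOWN on close() is SO_LINGER: a linger time of zero requests an abortive close, while the default gives a graceful SHUTDOWN. The sketch below is a minimal example under that assumption; the helper names `linger_option`, `set_close_behaviour` and `open_sctp_socket` are hypothetical and do not come from this thread, and `open_sctp_socket` assumes a Linux host with SCTP support.

```python
import socket
import struct

# Fall back to the IANA protocol number if this Python build lacks the constant.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def linger_option(abort_on_close: bool) -> bytes:
    """Pack a C `struct linger` (two ints: l_onoff, l_linger).

    l_onoff=1 with l_linger=0 requests an abortive close;
    l_onoff=0 leaves the default graceful close in place.
    """
    l_onoff = 1 if abort_on_close else 0
    return struct.pack("ii", l_onoff, 0)


def set_close_behaviour(sock: socket.socket, abort_on_close: bool) -> None:
    """Apply the linger setting to an already created socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    linger_option(abort_on_close))


def open_sctp_socket(abort_on_close: bool) -> socket.socket:
    """Create a one-to-one style SCTP socket with the chosen close behaviour.

    Assumption: the kernel has SCTP available; otherwise OSError is raised.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_SCTP)
    set_close_behaviour(sock, abort_on_close)
    return sock
```

Note that this only chooses how an explicit or implicit close() behaves. It does not keep the association alive when the process dies, which is the point of the answer above.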

> 
> - I have the following question related to using the one-to-one style socket
>   interface when trying to do an Association Restart:
>      * if my node is typically the server side of the SCTP connections
>      * then on a restart of this node,
>      * I assume that I could NOT set up my server's listening socket first
>        (i.e. socket(), bind(), listen(), accept()...)
>        and then try to re-establish old associations by
>        socket(), bind(), connect() ...
>        because the bind() would probably fail due to the listening socket
>        already being bound to the same SCTP IP Address and Port.
>      * is this correct ?

No.  If your system restarts, you will start with completely fresh state
and you would need to start your service with the normal procedure.

>      * i.e. if using the one-to-one style interface, and
>                you are the server, and
>                you restart, and
>                you are trying to recover SCTP Associations,
>             then
>                the only way you can get around the bind() conflict is to
>                recover the SCTP associations first, and then
>                set up your listening socket again.
>                re-setup your listening socket.
> 

I think you misunderstand when association restart is typically triggered.  The
trigger is one side of the association failing to notify the other that it went
down.  When everything is operating normally, this almost never happens.  It is
usually triggered by a network outage where one side lost reachability and
terminated the association.  The application then attempts to re-establish it,
either by connecting again or by attempting to transmit data (using implicit
connect).  If the network has been restored, you will get a restart.

A restart _might_ get triggered on a system restart if you have a service that
tries to establish associations as part of its start-up procedure and you had
a network overflow/failure that lost the ABORT/SHUTDOWN packets.  Again, this is
not something that's always guaranteed to happen.
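
To make the restart case concrete, here is a simplified sketch of how the side that stayed up can classify an unexpected COOKIE ECHO, loosely following the tag comparison table in RFC 4960 section 5.2.4. It is a toy model and not the Linux kernel code; the `Tcb` and `Cookie` types, the `classify_cookie_echo` function and the returned labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tcb:
    """Verification tags held by the existing association."""
    local_tag: int
    peer_tag: int


@dataclass(frozen=True)
class Cookie:
    """Tags carried in the State Cookie of the arriving COOKIE ECHO."""
    local_tag: int
    peer_tag: int
    local_tie_tag: int  # 0 means the cookie carries no tie-tag
    peer_tie_tag: int


def classify_cookie_echo(tcb: Tcb, cookie: Cookie) -> str:
    """Return a label for the action the existing endpoint should take."""
    local_match = cookie.local_tag == tcb.local_tag
    peer_match = cookie.peer_tag == tcb.peer_tag
    tie_match = (cookie.local_tie_tag == tcb.local_tag
                 and cookie.peer_tie_tag == tcb.peer_tag)
    no_tie = cookie.local_tie_tag == 0 and cookie.peer_tie_tag == 0

    if not local_match and not peer_match and tie_match:
        # New tags, but the tie-tags prove the peer knew the old association.
        return "restart"
    if local_match and not peer_match:
        return "init-collision"
    if not local_match and peer_match and no_tie:
        return "stale-cookie"  # silently discarded
    if local_match and peer_match:
        return "duplicate"
    return "discard"
```

In the "restart" case the endpoint keeps the association, reports RESTART to the ULP and starts over with the sequence numbers negotiated in the new handshake.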

-vlad


* RE: SCTP Association Restart
From: Gregory Waines @ 2009-10-14 13:51 UTC (permalink / raw)
  To: linux-sctp


thanks vlad.

ok ... I now understand the original intent of association restart.

You're correct that I am trying to use the 'association restart'
behaviour for a different purpose.

i.e. I have a 1:1 Active / Standby implementation of 
an Application which uses SCTP connections.
- Active process on node A ... SCTP server with ESTABLISHED SCTP associations
- Standby process on node B ... hot-standby waiting to take over service if
  the Active fails
     * with a variety of data being journalled from node A to node B
     * mostly application/ULP-specific
     * but includes the far-end SCTP IP Address & port of each ESTABLISHED
       SCTP association
- if node A fails ... e.g. say hardware failure / reset.
- Standby process on node B becomes Active
- node B takes over IP Address ... details left out
- node B recovers SCTP Associations using journalled SCTP data (far-end
  IP Addresses & ports)
  ... which would rely on the 'association restart' behaviour at the far-end
      to report a RESTART (rather than an ABORT) to the far-end
      ULP/Application, and to reset far-end sequence numbers, etc., such
      that communication can restart on this SCTP Association.


Are you aware of any implementations similar to the above description ?

The 3GPP TS 36.412 version 8.5.0 Release 8 standard (LTE wireless standard),
Section 7 Transport Layer, describes this "SCTP endpoint redundancy"
for the SCTP connections between the eNodeB and the MME devices, and
actually refers to the behaviour described in RFC 4960 section 5.2.
So ... I'm assuming that this has been or can be done (?).

Comments ?

Greg.




* Re: SCTP Association Restart
From: Vlad Yasevich @ 2009-10-14 15:09 UTC (permalink / raw)
  To: linux-sctp



Gregory Waines wrote:
> thanks vlad.
> 
> ok ... I now understand 'original intent' of the association restart.
> 
> You're correct that I am trying to use the 'association restart'
> behaviour for a different purpose.
> 
> i.e. I have a 1:1 Active / Standby implementation of 
> an Application which uses SCTP connections.
> - Active process on node A ... SCTP server with ESTABLISHED SCTP associations
> - Standby process on node B ... hot-standby waiting to take over service if
>   the Active fails
>      * with a variety of data being journalled from node A to node B
>      * mostly application/ULP-specific
>      * but includes the far-end SCTP IP Address & port of each ESTABLISHED
>        SCTP association
> - if node A fails ... e.g. say hardware failure / reset.
> - Standby process on node B becomes Active
> - node B takes over IP Address ... details left out
> - node B recovers SCTP Associations using journalled SCTP data (far-end
>   IP Addresses & ports)
>   ... which would rely on the 'association restart' behaviour at the far-end
>       to report a RESTART (rather than an ABORT) to the far-end
>       ULP/Application, and to reset far-end sequence numbers, etc., such
>       that communication can restart on this SCTP Association.
> 

Yes, in the case of a hardware failure or operating system crash there
typically will not be any termination sequence from the SCTP layer.  When
the standby takes over, it will trigger a restart procedure at the remote.

However, in cases of application failure, system maintenance reboot, or similar
events where the application or system is terminated semi-gracefully, the
association would be torn down, unless the application has hand-over
functionality to transition to the stand-by.
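
As a rough sketch of the takeover path described in this thread: the active node journals the far-end address and port of each association, and the standby, once it owns the service IP address, reconnects from the same local address and port so that the far end can treat the new INIT as a restart. Everything below is an assumption made for illustration: the `PeerRecord` format, the JSON-lines journal, the function names, IPv4-only addressing, and a Linux host with SCTP support for `recover_association`.

```python
import json
import socket
from dataclasses import dataclass, asdict

# Fall back to the IANA protocol number if this Python build lacks the constant.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


@dataclass(frozen=True)
class PeerRecord:
    """Far-end address of one ESTABLISHED association (hypothetical format)."""
    addr: str
    port: int


def encode_journal(records):
    """Serialise records as JSON lines for shipping to the standby."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def decode_journal(text):
    """Rebuild the records on the standby from the journalled text."""
    return [PeerRecord(**json.loads(line))
            for line in text.splitlines() if line]


def recover_association(local, peer, timeout=5.0):
    """Reconnect from the taken-over local (addr, port) to one far end.

    Raises OSError if SCTP is unavailable or the connect fails.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_SCTP)
    try:
        sock.settimeout(timeout)
        sock.bind(local)
        sock.connect((peer.addr, peer.port))
    except OSError:
        sock.close()
        raise
    return sock
```

A standby would typically call `decode_journal()` on the last journal it received and then `recover_association()` once per record. Whether the far end really reports RESTART still depends on it not having seen an ABORT or SHUTDOWN from the failed node.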

> 
> Are you aware of any implementations similar to the above description ?

Yes.  I am familiar with multiple deployments of the above functionality.
None of them explicitly try to trigger a restart, but they depend on the
capability being there when needed.

-vlad


* RE: SCTP Association Restart
From: Gregory Waines @ 2009-10-28 12:00 UTC (permalink / raw)
  To: linux-sctp

Vlad Yasevich wrote:
> Gregory Waines wrote:
>> 
< SNIP >
>> 
>> - If I have a Linux process with an established SCTP connection/  
>>   association, is there a socket option that prevents the kernel from
>>   ABORTing the association if this Linux process fails unexpectedly ?
>> 
> 
> Nope.  When the socket is closed, the association is closed as well.
> Depending on your settings, it will either be ABORTed or
> closed with SHUTDOWN.
> 
< SNIP >

... a followup question on this ...

remember that
   - I am trying to build a 1:1 (active / standby) type application
     which uses an SCTP connection.
   - And I am trying to take advantage of the ASSOCIATION RESTART
     behaviour (i.e. RFC 4960 section 5.2.4.1) in order to allow the
     Standby Instance of the Application, when becoming Active, to take
     over the SCTP connection without the far-end of the SCTP connection
     terminating or aborting the connection.
   - typically the Standby Instance of the Application is on a different
     card/processor,
   - and for full Card/Processor failures, which result in the Kernel
     having no chance to send an ABORT to the far-end,
     I believe I have no issues with this approach.

... but I would like to also support Application Process Failures, 
in the same way.

This is why I was asking about the socket option above ...
i.e. a socket option that, if the Linux Process dies,
     the kernel would still clean up the SCTP Association locally,
     BUT would NOT send an ABORT to the far-end of the Association.


If I was going to implement this socket option myself,
would you recommend:
   - a new socket option  e.g. SO_NO_ABORT_ON_FAIL
   or
   - use an existing socket option e.g. SO_LINGER

Any thoughts / recommendations ?


thanks in advance,
Greg.

* Re: SCTP Association Restart
From: Vlad Yasevich @ 2009-10-28 13:48 UTC (permalink / raw)
  To: linux-sctp



Gregory Waines wrote:
> Vlad Yasevich wrote:
>> Gregory Waines wrote:
> < SNIP >
>>> - If I have a Linux process with an established SCTP connection/  
>>>   association, is there a socket option that prevents the kernel from
>>>   ABORTing the association if this Linux process fails unexpectedly ?
>>>
>> Nope.  When the socket is closed, the association is closed as well.
>> Depending on your settings, it will either be ABORTed or
>> closed with SHUTDOWN.
>>
> < SNIP >
> 
> ... a followup question on this ...
> 
> remember that 
>    - I am trying to build a 1:1 (active / standby) type application
>      which uses an SCTP connection.  
>    - And I am trying to take advantage of the ASSOCIATION RESTART
>      behaviour (i.e. RFC 4960 section 5.2.4.1) in order to allow the
>      Standby Instance of the Application, when becoming Active, to take
>      over the SCTP connection without the far-end of the SCTP connection
>      terminating or aborting the connection.
>    - typically the Standby Instance of the Application is on a different
>      card/processor,
>    - and for full Card/Processor failures, which result in the Kernel
>      having no chance to send an ABORT to the far-end,
>      I believe I have no issues with this approach.
> 
> ... but I would like to also support Application Process Failures, 
> in the same way.
> 
> This is why I was asking about the socket option above ...
> i.e. a socket option that, if the Linux Process dies,
>      the kernel would still clean up the SCTP Association locally,
>      BUT would NOT send an ABORT to the far-end of the Association.
> 
> 
> If I was going to implement this socket option myself,
> would you recommend:
>    - a new socket option  e.g. SO_NO_ABORT_ON_FAIL
>    or
>    - use an existing socket option e.g. SO_LINGER
> 

I think you should keep it as clean and simple as possible and make it its own
socket option.

One item of note.  When an association restarts, any queued data that is
awaiting reassembly or ordering is discarded.  The reason is that on restarted
associations, TSN and SSN sequences begin anew and there is no way to
re-assemble or re-order old data.
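
A toy model may help show why the held data has to go. The class below is a hypothetical per-stream ordering queue keyed by SSN, written only for illustration; it is not the kernel's implementation. If `restart()` did not clear the pending entries, a message held under an old SSN would later be delivered as if it belonged to the new sequence.

```python
class StreamReorderQueue:
    """Toy per-stream ordering queue keyed by SSN (illustrative only)."""

    def __init__(self):
        self.next_ssn = 0   # next SSN that may be delivered
        self.pending = {}   # out-of-order messages held back, by SSN

    def receive(self, ssn, payload):
        """Queue a message and return whatever is now deliverable in order."""
        self.pending[ssn] = payload
        delivered = []
        while self.next_ssn in self.pending:
            delivered.append(self.pending.pop(self.next_ssn))
            self.next_ssn = (self.next_ssn + 1) % 65536  # SSN is 16 bits
        return delivered

    def restart(self):
        """On association restart: drop held data and expect SSN 0 again.

        Returns the number of discarded messages.
        """
        dropped = len(self.pending)
        self.pending.clear()
        self.next_ssn = 0
        return dropped
```

For example, a message received with SSN 2 while SSN 1 is still missing is held back; after `restart()` it is gone, and a fresh message with SSN 2 from the restarted peer is delivered in its place once SSNs 0 and 1 have arrived.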

-vlad