Netdev List
 help / color / mirror / Atom feed
From: David Miller <davem@davemloft.net>
To: tuong.t.lien@dektech.com.au
Cc: jon.maloy@ericsson.com, ying.xue@windriver.com,
	netdev@vger.kernel.org, tipc-discussion@lists.sourceforge.net
Subject: Re: [net] tipc: fix link session and re-establish issues
Date: Mon, 11 Feb 2019 21:26:37 -0800 (PST)	[thread overview]
Message-ID: <20190211.212637.697481727085947933.davem@davemloft.net> (raw)
In-Reply-To: <20190211062943.4864-1-tuong.t.lien@dektech.com.au>

From: Tuong Lien <tuong.t.lien@dektech.com.au>
Date: Mon, 11 Feb 2019 13:29:43 +0700

> When a link endpoint is re-created (e.g. after a node reboot or
> interface reset), the link session number is varied by random, the peer
> endpoint will be synced with this new session number before the link is
> re-established.
> 
> However, there is a shortcoming in this mechanism that can lead to the
> link never re-established or faced with a failure then. It happens when
> the peer endpoint is ready in ESTABLISHING state, the 'peer_session' as
> well as the 'in_session' flag have been set, but suddenly this link
> endpoint leaves. When it comes back with a random session number, there
> are two situations possible:
> 
> 1/ If the random session number is larger than (or equal to) the
> previous one, the peer endpoint will be updated with this new session
> upon receipt of a RESET_MSG from this endpoint, and the link can be re-
> established as normal. Otherwise, all the RESET_MSGs from this endpoint
> will be rejected by the peer. In turn, when this link endpoint receives
> one ACTIVATE_MSG from the peer, it will move to ESTABLISHED and start
> to send STATE_MSGs, but again these messages will be dropped by the
> peer due to wrong session.
> The peer link endpoint can still become ESTABLISHED after receiving a
> traffic message from this endpoint (e.g. a BCAST_PROTOCOL or
> NAME_DISTRIBUTOR), but since all the STATE_MSGs are invalid, the link
> will be forced down sooner or later!
> 
> Even in case the random session number is larger than the previous one,
> it can be that the ACTIVATE_MSG from the peer arrives first, and this
> link endpoint moves quickly to ESTABLISHED without sending out any
> RESET_MSG yet. Consequently, the peer link will not be updated with the
> new session number, and the same link failure scenario as above will
> happen.
> 
> 2/ Another situation can be that, the peer link endpoint was reset due
> to any reasons in the meantime, its link state was set to RESET from
> ESTABLISHING but still in session, i.e. the 'in_session' flag is not
> reset...
> Now, if the random session number from this endpoint is less than the
> previous one, all the RESET_MSGs from this endpoint will be rejected by
> the peer. In the other direction, when this link endpoint receives a
> RESET_MSG from the peer, it moves to ESTABLISHING and starts to send
> ACTIVATE_MSGs, but all these messages will be rejected by the peer too.
> As a result, the link cannot be re-established but gets stuck with this
> link endpoint in state ESTABLISHING and the peer in RESET!
> 
> Solution:
> ===========
> 
> This link endpoint should not go directly to ESTABLISHED when getting
> ACTIVATE_MSG from the peer which may belong to the old session if the
> link was re-created. To ensure the session to be correct before the
> link is re-established, the peer endpoint in ESTABLISHING state will
> send back the last session number in ACTIVATE_MSG for a verification at
> this endpoint. Then, if needed, a new and more appropriate session
> number will be regenerated to force a re-synch first.
> 
> In addition, when a link in ESTABLISHING state is reset, its state will
> move to RESET according to the link FSM, along with resetting the
> 'in_session' flag (and the other data) as a normal link reset, it will
> also be deleted if requested.
> 
> The solution is backward compatible.
> 
> Acked-by: Jon Maloy <jon.maloy@ericsson.com>
> Acked-by: Ying Xue <ying.xue@windriver.com>
> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>

Applied.

      reply	other threads:[~2019-02-12  5:26 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-11  6:29 [net] tipc: fix link session and re-establish issues Tuong Lien
2019-02-12  5:26 ` David Miller [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190211.212637.697481727085947933.davem@davemloft.net \
    --to=davem@davemloft.net \
    --cc=jon.maloy@ericsson.com \
    --cc=netdev@vger.kernel.org \
    --cc=tipc-discussion@lists.sourceforge.net \
    --cc=tuong.t.lien@dektech.com.au \
    --cc=ying.xue@windriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox