From: Ulf Samuelsson <ulf.samuelsson@ericsson.com>
To: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>,
	<netdev@emagii.com>
Cc: <netdev@vger.kernel.org>
Subject: Re: [PATCH] neighbour.c: Avoid GC directly after state change
Date: Fri, 17 Apr 2015 10:03:12 +0200
Message-ID: <5530BE40.1000504@ericsson.com>
In-Reply-To: <552F45AA.6010906@miraclelinux.com>


On 04/16/2015 07:16 AM, YOSHIFUJI Hideaki wrote:
> Hi,
>
> Ulf Samuelsson wrote:
>
>> The desired functionality is that if communication stops,
>> you want to send out ARP probes, before the entry is deleted.
>>
>> The current (pseudo) code of the neigh timer is:
>>
>>      if (state & NUD_REACHABLE) {
>>          if (now <= confirmed + reachable_time) {
>>              ... /* We are OK */
>>          } else if (now < used + DELAY_PROBE_TIME) {    /* Never happens */
>>              state = NUD_DELAY;
>>          } else {
>>              state = NUD_STALE;
>>              notify = 1;
>>          }
>>      }
>>
>> We never see the state being changed from REACHABLE to DELAY,
>> so the probes are not being sent out; instead you always go
>> from REACHABLE to STALE.
> That's right.
But that is not acceptable in telecom.
>
>
>> DELAY_PROBE_TIME is set to (5 x HZ) and "used"
>> seems to be only set by the periodic_work routine
>> when the neigh entry is in STALE state, and then it is too late.
>> It is also set by "arp_find" which is used by "broken" devices.
>>
> In STALE state, neigh->used is set by neigh_event_send(), called
> by neigh_resolve_output() via neigh->output().



>> In practice, the second condition: "(now < "used" + DELAY_PROBE_TIME)" is never used.
>> What is the intention of this test?
> That's right.  It is NOT used under normal conditions unless
> the reachable time is too short.
>
>
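
To make the point about the second test concrete, here is a minimal
userspace sketch of the REACHABLE branch from the pseudocode above. It is
not the kernel code: HZ = 100 is an assumed tick rate, the function name
reachable_timeout() is hypothetical, and plain integer comparisons stand in
for the kernel's wrap-safe jiffies comparisons.

```c
/* Userspace model of the REACHABLE branch of the neigh timer (sketch only). */
enum nud_state { NUD_REACHABLE, NUD_DELAY, NUD_STALE };

#define HZ 100UL                    /* assumed tick rate, for illustration */
#define DELAY_PROBE_TIME (5 * HZ)   /* 5 x HZ, as stated in the mail */

/* Hypothetical helper: returns the state chosen when the timer fires. */
enum nud_state reachable_timeout(unsigned long now,
                                 unsigned long confirmed,
                                 unsigned long used,
                                 unsigned long reachable_time)
{
    if (now <= confirmed + reachable_time)
        return NUD_REACHABLE;       /* We are OK */
    if (now < used + DELAY_PROBE_TIME)
        return NUD_DELAY;           /* only if "used" is very recent */
    return NUD_STALE;
}
```

With confirmed == used == 0 and a reachable_time of 30 x HZ, the timer
firing at 30 x HZ + 1 returns NUD_STALE, because "used" + DELAY_PROBE_TIME
has long passed. Only when reachable_time is shorter than DELAY_PROBE_TIME
(say 2 x HZ) does the same call return NUD_DELAY.
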
>> By adding a new test + parameter, we would get the desired functionality,
>> with no need to listen for notifications or do ARP state updates from applications.
>>
>>          if (now <= confirmed + reachable_time) {
>>              ... /* We are OK */
>> +        } else if (now <= confirmed + reprobe_time) {
>> +            state = NUD_DELAY;
>>          } else if (now < used + DELAY_PROBE_TIME) {    /* Never happens */
>>              state = NUD_DELAY;
>>          } else {
>>              state = NUD_STALE;
>>              notify = 1;
>>          }
>>
>> This way the entry would remain in REACHABLE while normal communication occurs,
>> then it would enter DELAY state to probe, and if that fails, it goes to STALE state.
> No, it is not what REACHABLE and DELAY mean.
>
>  From RFC2461:
>
> |      REACHABLE   Roughly speaking, the neighbor is known to have been
> |                  reachable recently (within tens of seconds ago).
> :
> |      STALE       The neighbor is no longer known to be reachable but
> |                  until traffic is sent to the neighbor, no attempt
> |                  should be made to verify its reachability.
> |      DELAY       The neighbor is no longer known to be reachable, and
> |                  traffic has recently been sent to the neighbor.
> |                  Rather than probe the neighbor immediately, however,
> |                  delay sending probes for a short while in order to
> |                  give upper layer protocols a chance to provide
> |                  reachability confirmation.
>
>

It all depends on the meaning of the word "recently".
You imply that once the timeouts have been triggered, it is no longer
"recent", but that is not the only interpretation. It is up to the
implementer to decide what "recently" means.

You can argue that for REACHABLE they define it as "(within tens of
seconds ago)", but in a standards document that is not enough,
so the definition of STALE is perfectly OK due to this ambiguity.

We have a situation in which machines enter and exit the network at
unpredictable times, and while traffic is sporadic, they still need to
be reachable.
They should not enter the FAILED state unless they leave the network.

I also see in RFC2461:
"To reduce unnecessary network traffic, probe messages are only sent to
neighbors to which the node is actively sending packets."

In telecom applications, as long as the neighbour is present on the network,
the node will be sending packets, even if it is not that frequent.

These probes are *necessary* for the system to work properly,
due to the long time for garbage collection.

The PROBE state needs to be entered once, and only when these probes get
no answer should the entry move to STALE.
I think that is compliant with the statement above.

Since they leave at unpredictable times, it is not good to set them to
PERMANENT.

Therefore, if a timeout occurs due to no traffic, they must be probed before
they are garbage collected.
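
As a minimal userspace sketch of what I am asking for (not a kernel patch):
the extra test on "confirmed" + reprobe_time moves an idle entry to DELAY
before it can fall through to STALE. reprobe_time is the proposed new
parameter; struct neigh_times, the function name, and HZ = 100 are
hypothetical, and plain comparisons stand in for wrap-safe jiffies
comparisons.

```c
/* Userspace model of the proposed REACHABLE branch (sketch only). */
enum nud_state { NUD_REACHABLE, NUD_DELAY, NUD_STALE };

#define HZ 100UL                    /* assumed tick rate, for illustration */
#define DELAY_PROBE_TIME (5 * HZ)

/* Hypothetical container for the timestamps and timeouts involved. */
struct neigh_times {
    unsigned long confirmed;        /* last reachability confirmation */
    unsigned long used;             /* last time the entry was used */
    unsigned long reachable_time;
    unsigned long reprobe_time;     /* proposed parameter, > reachable_time */
};

enum nud_state reachable_timeout_reprobe(unsigned long now,
                                         const struct neigh_times *t)
{
    if (now <= t->confirmed + t->reachable_time)
        return NUD_REACHABLE;       /* We are OK */
    if (now <= t->confirmed + t->reprobe_time)
        return NUD_DELAY;           /* new: probe before going STALE */
    if (now < t->used + DELAY_PROBE_TIME)
        return NUD_DELAY;           /* existing test */
    return NUD_STALE;
}
```

With reachable_time = 30 x HZ and reprobe_time = 40 x HZ, an entry last
confirmed at time 0 is REACHABLE at 10 x HZ, goes to DELAY at 35 x HZ, and
only becomes STALE after 40 x HZ if nothing confirmed it in between.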

If this is not acceptable, how do you propose to solve the problem that
remote units must not be inaccessible for more than a fraction of a second?


Best Regards,
Ulf Samuelsson

