From: Ulf Samuelsson <netdev@emagii.com>
To: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>,
ulf@emagii.com, netdev@vger.kernel.org
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Subject: Re: [PATCH] neighbour.c: Avoid GC directly after state change
Date: Wed, 18 Mar 2015 00:27:17 +0100 [thread overview]
Message-ID: <5508B855.9090303@emagii.com> (raw)
In-Reply-To: <55081E87.7020603@miraclelinux.com>
Den 2015-03-17 13:31, YOSHIFUJI Hideaki skrev:
> Hello,
>
> Ulf Samuelsson wrote:
>> Den 2015-03-16 05:57, YOSHIFUJI Hideaki/吉藤英明 skrev:
>>> Hello.
>>>
>>> Ulf Samuelsson wrote:
>>>> Den 2015-03-15 09:27, YOSHIFUJI Hideaki skrev:
>>>>> Hello.
>>>>>
>>>>> Ulf Samuelsson wrote:
>>>>>> From: Ulf Samuelsson <ulf@emagii.com>
>>>>>>
>>>>>> The neighbour state is changed in the ARP timer handler.
>>>>>> If the state is changed to NUD_STALE, then the neighbour
>>>>>> entry becomes a candidate for garbage collection.
>>>>>>
>>>>>> The garbage collection is handled by a "periodic work" routine.
>>>>>>
>>>>>> When :
>>>>>>
>>>>>> * noone refers to the entry
>>>>>> * the state is no longer valid (I.E: NUD_STALE).
>>>>> NUD_STALE is still valid.
>>>> Yes, my fault.
>>>> The condition which causes garbage collection to be skipped is.
>>>>
>>>>
>>>> if (state & (NUD_PERMANENT | NUD_IN_TIMER)) {
>>>>
>>>> NUD_STALE is not part of that, so GC will not be skipped,
>>>> and therefore the patch is needed if you want to be able
>>>> to use the API to modify the neigh statemachine..
>>>>>
>>>>>> * the timeout value has been reached or state is FAILED
>>>>>>
>>>>>> the "periodic work" routine will notify
>>>>>> the stack that the entry should be deleted.
>>>>>>
>>>>>> A user application monitoring and controlling the neighbour table
>>>>>> using NETLINK may fail, if the "period work" routine is run
>>>>>> directly after the state has been changed to NUD_STALE,
>>>>>> but before the user application has had a chance to change
>>>>>> the state to something valid.
>>>>>>
>>>>>> The "period work" routine will detect the NUD_STALE state
>>>>>> and if the timeout value has been reached, it will notify the stack
>>>>>> that the entry should be deleted.
>>>>>>
>>>>>> The patch adds a check in the periodic work routine
>>>>>> which will skip test for garbage collection
>>>>>> unless a number of ticks has passed since the last time
>>>>>> the neighbour entry state was changed.
>>>>>>
>>>>>> The feature is controlled through Kconfig
>>>>>>
>>>>>> The featuree is enabled by setting ARP_GC_APPLY_GUARDBAND
>>>>>> The guardband time (in ticks) is set in ARP_GC_GUARDBAND
>>>>>> Default time is 100 ms if HZ_### is set.
>>>>> We have "lower limit" not to start releasing neighbour entries.
>>>>> Try increasing gc_thresh1.
>>>> Why would that work?
>>>>
>>>> The only place where this is used is
>>>>
>>>> "if (atomic_read(&tbl->entries) < tbl->gc_thresh1)"
>>>>
>>>> tbl->entries is related to how many entries there are in the
>>>> neighbour table.
>>>>
>>>> The only way I think this would work, is if this is raised so high
>>>> that
>>>> garbage collection does not occur.
>>>>
>>>> That is not the intention.
>>>>
>>>> It does not solve the race condition between the timer_handler and
>>>> the periodic_work.
>>>
>>> I don't think it is a race.
>> And I think you are wrong and my logging shows that it is.
>
> We do not gurantee holding "stale" entries more than
> gc_thresh1, so it shall not be called as a race at all.
And that is a problem, which results in traffic loss.
>
>>
>>>
>>> You can try increasing gc_staletime to hold each entry based
>>> on last usage. Plus, you can "confirm" neighbors by
>>> MSG_CONFIRM.
>>>
>>> Note that if the number of entries becomes high, "forced GC" will
>>> drop valid, "not connected" entries as well.
>>
>> I can try increasing gc_staletime, but its a waste of time, because
>> it is not the last usage which is interesting.
>> What is interesting is the time when the entry state was updated by
>> the timer handler.
>>
>> Pls explain further MSG_CONFIRM.
>
> Typically, base reachable time is set to 30sec and gc_staletime is
> set to 60sec. So, entries are expected to be held for a while
> after it has become "stale", no?
>
> MSG_CONFIRM is a sendmsg() flag. It allows user-space application
> to confirm reachability of neighbor. It refreshes "confirmed"
> time. In neigh_periodic_work(), "used" time is updated to
> "confirmed" time if "used" time is before "confirmed" time.
>
May work, if the application was totally redesigned,
> ping(8), ping6(8), tftpd(8) use that flag, for example.
>
>
>> The problem occurs when the periodic_work routine runs immediately
>> after the timer handler has changes the state to NUD_STALE and
>> the entry has reached the expiry time.
>
> It is what we expect.
>
> In neigh_periodic_work(), you may try to release non-STALE
> entries first, and then STALE entries if the number of entries is
> still high.
That sounds much more complex than the proposed fix.
>
> --yoshfuji
next prev parent reply other threads:[~2015-03-17 23:27 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-11 20:28 [PATCH] neighbour.c: Avoid GC directly after state change Ulf Samuelsson
2015-03-15 8:27 ` YOSHIFUJI Hideaki
2015-03-15 19:34 ` Ulf Samuelsson
2015-03-16 4:57 ` YOSHIFUJI Hideaki/吉藤英明
2015-03-16 19:55 ` Ulf Samuelsson
2015-03-17 12:31 ` YOSHIFUJI Hideaki
2015-03-17 23:27 ` Ulf Samuelsson [this message]
-- strict thread matches above, loose matches on Subject: below --
2015-03-11 21:01 Ulf Samuelsson
2015-03-12 18:26 ` David Miller
2015-03-17 23:33 ` Ulf Samuelsson
2015-03-18 1:56 ` YOSHIFUJI Hideaki/吉藤英明
2015-04-10 8:26 ` Ulf Samuelsson
2015-04-15 13:40 ` Ulf Samuelsson
2015-04-16 5:16 ` YOSHIFUJI Hideaki
2015-04-17 8:03 ` Ulf Samuelsson
2015-04-20 2:33 ` YOSHIFUJI Hideaki
2015-04-20 12:48 ` Ulf Samuelsson
2015-04-21 3:58 ` YOSHIFUJI Hideaki
2015-04-22 7:42 ` Ulf Samuelsson
2015-04-22 10:46 ` YOSHIFUJI Hideaki
2015-04-22 11:49 ` Ulf Samuelsson
2015-05-08 9:39 ` Ulf Samuelsson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5508B855.9090303@emagii.com \
--to=netdev@emagii.com \
--cc=hideaki.yoshifuji@miraclelinux.com \
--cc=netdev@vger.kernel.org \
--cc=ulf@emagii.com \
--cc=yoshfuji@linux-ipv6.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.