netdev.vger.kernel.org archive mirror
From: "YOSHIFUJI Hideaki/吉藤英明" <hideaki.yoshifuji@miraclelinux.com>
To: ulf@emagii.com, Ulf Samuelsson <netdev@emagii.com>,
	netdev@vger.kernel.org
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>,
	hideaki.yoshifuji@miraclelinux.com
Subject: Re: [PATCH] neighbour.c: Avoid GC directly after state change
Date: Mon, 16 Mar 2015 13:57:47 +0900
Message-ID: <550662CB.50009@miraclelinux.com>
In-Reply-To: <5505DEC6.30006@emagii.com>

Hello.

Ulf Samuelsson wrote:
> On 2015-03-15 09:27, YOSHIFUJI Hideaki wrote:
>> Hello.
>>
>> Ulf Samuelsson wrote:
>>> From: Ulf Samuelsson <ulf@emagii.com>
>>>
>>> The neighbour state is changed in the ARP timer handler.
>>> If the state is changed to NUD_STALE, then the neighbour
>>> entry becomes a candidate for garbage collection.
>>>
>>> The garbage collection is handled by a "periodic work" routine.
>>>
>>> When:
>>>
>>>     * no one refers to the entry
>>>     * the state is no longer valid (i.e. NUD_STALE).
>> NUD_STALE is still valid.
> Yes, my fault.
> The condition which causes garbage collection to be skipped is:
>
>
>      if (state & (NUD_PERMANENT | NUD_IN_TIMER)) {
>
>      NUD_STALE is not part of that, so GC will not be skipped,
>      and therefore the patch is needed if you want to be able
>      to use the API to modify the neigh state machine.
>>
>>>     * the timeout value has been reached or state is FAILED
>>>
>>> the "periodic work" routine will notify
>>> the stack that the entry should be deleted.
>>>
>>> A user application monitoring and controlling the neighbour table
>>> using NETLINK may fail, if the "periodic work" routine is run
>>> directly after the state has been changed to NUD_STALE,
>>> but before the user application has had a chance to change
>>> the state to something valid.
>>>
>>> The "periodic work" routine will detect the NUD_STALE state
>>> and if the timeout value has been reached, it will notify the stack
>>> that the entry should be deleted.
>>>
>>> The patch adds a check in the periodic work routine
>>> which will skip test for garbage collection
>>> unless a number of ticks has passed since the last time
>>> the neighbour entry state was changed.
>>>
>>> The feature is controlled through Kconfig.
>>>
>>> The feature is enabled by setting ARP_GC_APPLY_GUARDBAND.
>>> The guardband time (in ticks) is set in ARP_GC_GUARDBAND.
>>> Default time is 100 ms if HZ_### is set.
>> We have "lower limit" not to start releasing neighbour entries.
>> Try increasing gc_thresh1.
> Why would that work?
>
> The only place where this is used is
>
>      "if (atomic_read(&tbl->entries) < tbl->gc_thresh1)"
>
> tbl->entries is related to how many entries there are in the neighbour table.
>
> The only way I think this would work, is if this is raised so high that
> garbage collection does not occur.
>
> That is not the intention.
>
> It does not solve the race condition between the timer_handler and the periodic_work.

I don't think it is a race.

You can try increasing gc_stale_time (the sysctl behind GC_STALETIME)
to hold each entry longer, based on its last usage.  In addition, you
can "confirm" neighbors by sending with MSG_CONFIRM.
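
A minimal sketch of the MSG_CONFIRM suggestion, assuming an IPv4 UDP
socket. The helper name send_confirmed() is hypothetical and not part of
any API. Per send(2), the flag is valid only on SOCK_DGRAM and SOCK_RAW
sockets, and it should be set only when the application has just received
a valid reply from the peer, because it tells the link layer that forward
progress happened.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send one UDP datagram to ipv4:port with
 * MSG_CONFIRM set, so the kernel treats the neighbour entry for the
 * next hop as confirmed. Returns the number of bytes sent, or -1 on
 * error (bad address string, or sendto() failure with errno set).
 */
static ssize_t send_confirmed(int fd, const char *ipv4, uint16_t port,
                              const void *buf, size_t len)
{
    struct sockaddr_in dst;

    assert(buf != NULL || len == 0);

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &dst.sin_addr) != 1)
        return -1;              /* not a valid dotted-quad address */

    return sendto(fd, buf, len, MSG_CONFIRM,
                  (const struct sockaddr *)&dst, sizeof(dst));
}
```

A caller would create the socket with socket(AF_INET, SOCK_DGRAM, 0) and
call send_confirmed() only for replies to traffic it has just received
from that peer.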

Note that if the number of entries becomes high, "forced GC" will
drop valid, "not connected" entries as well.
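
The two GC paths discussed here can be modelled in user space as below.
This is a simplified sketch, not the kernel code: the gc_thresh1 test is
the condition quoted earlier in this thread, while the forced-GC trigger
(gc_thresh3 reached, or gc_thresh2 reached and more than about 5 seconds
since the last flush) and the 128/512/1024 defaults are assumptions taken
from my reading of neigh_alloc() and arp(7). All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three per-table thresholds. */
struct gc_thresholds {
    int thresh1;    /* below this, periodic GC does not scan at all */
    int thresh2;    /* soft limit for forced GC */
    int thresh3;    /* hard limit for forced GC */
};

/* Assumed defaults for the ARP table, as documented in arp(7). */
static struct gc_thresholds default_arp_thresholds(void)
{
    struct gc_thresholds t = { 128, 512, 1024 };
    return t;
}

/* Periodic GC bails out early while entries < gc_thresh1. */
static bool periodic_gc_scans(int entries, const struct gc_thresholds *t)
{
    assert(t->thresh1 <= t->thresh2 && t->thresh2 <= t->thresh3);
    return entries >= t->thresh1;
}

/*
 * Assumed forced-GC trigger on allocation of a new entry: always at
 * the hard limit, and at the soft limit only if the table has not
 * been flushed for more than 5 seconds.
 */
static bool forced_gc_triggers(int entries, const struct gc_thresholds *t,
                               long secs_since_flush)
{
    return entries >= t->thresh3 ||
           (entries >= t->thresh2 && secs_since_flush > 5);
}
```

Under this model, raising gc_thresh1 only delays the point at which the
periodic scan starts; it does not protect an entry once the table has
grown past the forced-GC limits.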

--yoshfuji

>
> BR
> Ulf Samuelsson
>
>>
>> --yoshfuji
>>
>>> Signed-off-by: Ulf Samuelsson <ulf@emagii.com>
>>> ---
>>>   net/Kconfig          |   32 ++++++++++++++++++++++++++++++++
>>>   net/core/neighbour.c |   15 ++++++++++++---
>>>   2 files changed, 44 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/Kconfig b/net/Kconfig
>>> index 44dd578..099a5dd 100644
>>> --- a/net/Kconfig
>>> +++ b/net/Kconfig
>>> @@ -77,6 +77,38 @@ config INET
>>>         Short answer: say Y.
>>>   if INET
>>> +
>>> +#
>>> +# Core Network configuration
>>> +#
>>> +
>>> +config ARP_GC_APPLY_GUARDBAND
>>> +    bool "IP: ARP: Avoid garbage collection directly after state change"
>>> +    default n
>>> +    ---help---
>>> +      With this item selected, an entry in the neighbour table
>>> +      will not be garbage collected directly after the ARP state
>>> +      has changed to STALE or FAILED.
>>> +      This allows an application program to change the state to something valid
>>> +      before garbage collection occurs.
>>> +
>>> +      If unsure, say N.
>>> +
>>> +config ARP_GC_GUARDBAND
>>> +    int "Guardband time on garbage collection"
>>> +    depends on ARP_GC_APPLY_GUARDBAND
>>> +    default 10 if HZ_100
>>> +    default 25 if HZ_250
>>> +    default 30 if HZ_300
>>> +    default 100 if HZ_1000
>>> +    default 100
>>> +
>>> +    ---help---
>>> +      The number of ticks to delay garbage collection
>>> +      after the neighbour entry has been updated
>>> +      A delay of 100 ms is reasonable.
>>> +      With CONFIG_HZ = 250, this value should be 25
>>> +
>>>   source "net/ipv4/Kconfig"
>>>   source "net/ipv6/Kconfig"
>>>   source "net/netlabel/Kconfig"
>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>> index 70fe9e1..194195d 100644
>>> --- a/net/core/neighbour.c
>>> +++ b/net/core/neighbour.c
>>> @@ -786,13 +786,23 @@ static void neigh_periodic_work(struct work_struct *work)
>>>               state = n->nud_state;
>>>               if (state & (NUD_PERMANENT | NUD_IN_TIMER)) {
>>> -                write_unlock(&n->lock);
>>>                   goto next_elt;
>>>               }
>>>               if (time_before(n->used, n->confirmed))
>>>                   n->used = n->confirmed;
>>> +#if defined(CONFIG_ARP_GC_APPLY_GUARDBAND)
>>> +            /* Do not garbage collect directly after we
>>> +             * updated n->state to allow applications to
>>> +             * react to the event
>>> +             */
>>> +            if (time_before(jiffies,
>>> +                    n->updated + CONFIG_ARP_GC_GUARDBAND)) {
>>> +                goto next_elt;
>>> +            }
>>> +#endif
>>> +
>>>               if (atomic_read(&n->refcnt) == 1 &&
>>>                   (state == NUD_FAILED ||
>>>                    time_after(jiffies, n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) {
>>> @@ -802,9 +812,8 @@ static void neigh_periodic_work(struct work_struct *work)
>>>                   neigh_cleanup_and_release(n);
>>>                   continue;
>>>               }
>>> -            write_unlock(&n->lock);
>>> -
>>>   next_elt:
>>> +            write_unlock(&n->lock);
>>>               np = &n->next;
>>>           }
>>>           /*
>>>
>
>
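
The per-entry check in the quoted patch can be sketched in user space as
follows. This is a model under stated assumptions, not the kernel code:
the NUD_* values and the NUD_IN_TIMER / NUD_VALID masks are copied from
my reading of the kernel headers, time_before_ul() imitates the
wraparound-safe time_before() macro, and struct neigh_model,
gc_would_release() and ms_to_ticks() are hypothetical names. The
"used = confirmed" adjustment from neigh_periodic_work() is omitted.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    NUD_INCOMPLETE = 0x01, NUD_REACHABLE = 0x02, NUD_STALE     = 0x04,
    NUD_DELAY      = 0x08, NUD_PROBE     = 0x10, NUD_FAILED    = 0x20,
    NUD_NOARP      = 0x40, NUD_PERMANENT = 0x80
};
#define NUD_IN_TIMER (NUD_INCOMPLETE | NUD_REACHABLE | NUD_DELAY | NUD_PROBE)
#define NUD_VALID    (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE | \
                      NUD_PROBE | NUD_STALE | NUD_DELAY)

/* Wraparound-safe "a is earlier than b" for tick counters. */
static bool time_before_ul(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/* Guard band in ticks for a delay in milliseconds at a given HZ. */
static unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return ms * hz / 1000;
}

struct neigh_model {
    unsigned int state;     /* one NUD_* value */
    int refcnt;             /* 1 means only the table holds it */
    unsigned long used;     /* tick of last use */
    unsigned long updated;  /* tick of last state change */
};

/* Would the periodic scan release this entry at tick "now"? */
static bool gc_would_release(const struct neigh_model *n, unsigned long now,
                             unsigned long stale_ticks,
                             unsigned long guard_ticks)
{
    if (n->state & (NUD_PERMANENT | NUD_IN_TIMER))
        return false;                       /* skipped by the scan */
    if (time_before_ul(now, n->updated + guard_ticks))
        return false;                       /* proposed guard band */
    return n->refcnt == 1 &&
           (n->state == NUD_FAILED ||
            time_before_ul(n->used + stale_ticks, now));
}
```

With guard_ticks set to 0 the function reduces to the unpatched check,
which makes the effect of the proposed guard band easy to compare.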

-- 
吉藤英明 <hideaki.yoshifuji@miraclelinux.com>
Miracle Linux Corporation, Technical Division, Support Department

