* ndisc_cache garbage collection issue
From: Tom Hughes @ 2019-05-02 12:42 UTC
To: David Ahern; +Cc: netdev
I recently upgraded a machine from a 4.20.13 kernel to 5.0.9 and am
finding that after a few days I start getting a lot of these messages:
    neighbour: ndisc_cache: neighbor table overflow!
and IPv6 networking starts to fail intermittently as a result.
The neighbour table doesn't appear to have much in it, however, so I've
been looking at the code, especially your recent changes to garbage
collection in the neighbour tables. My working theory is that the
value of gc_entries is somehow out of sync with the actual list of
entries that need to be garbage collected.
Looking at the code, I think I see a possible way that this could be
happening since 8cc196d6ef8, which moved the addition of new entries to
the gc list out of neigh_alloc and into ___neigh_create.
The problem is that neigh_alloc is doing the increment of gc_entries, so
if ___neigh_create winds up taking an error path gc_entries will have
been incremented but the neighbour will never be added to the gc list.
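
To illustrate the suspected imbalance, here is a minimal user-space sketch.
It is not kernel code: toy_table, toy_neigh, toy_alloc, toy_create and
toy_drift are hypothetical names invented for this example, and the
fix_leak flag only shows one possible way to rebalance the counter, not
necessarily what any actual patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_neigh {
    struct toy_neigh *next;     /* link on the gc list */
};

struct toy_table {
    int gc_entries;             /* counter compared against the gc thresholds */
    int gc_list_len;            /* entries the collector can actually reach */
    struct toy_neigh *gc_list;
};

/* Plays the role of neigh_alloc: the counter is bumped at allocation time. */
static struct toy_neigh *toy_alloc(struct toy_table *tbl)
{
    struct toy_neigh *n = calloc(1, sizeof(*n));

    if (n)
        tbl->gc_entries++;
    return n;
}

/*
 * Plays the role of ___neigh_create: only the success path links the
 * entry onto the gc list. When fix_leak is false, the error path frees
 * the entry without undoing the increment done in toy_alloc.
 */
static bool toy_create(struct toy_table *tbl, bool fail, bool fix_leak)
{
    struct toy_neigh *n = toy_alloc(tbl);

    if (!n)
        return false;

    if (fail) {
        if (fix_leak)
            tbl->gc_entries--;  /* assumed remedy, for illustration only */
        free(n);
        return false;
    }

    n->next = tbl->gc_list;
    tbl->gc_list = n;
    tbl->gc_list_len++;
    return true;
}

/* Returns gc_entries minus the real gc list length after the given
 * number of failed creations (plus one successful creation). */
int toy_drift(int failures, bool fix_leak)
{
    struct toy_table tbl = {0};
    int drift;

    assert(failures >= 0);

    toy_create(&tbl, false, fix_leak);
    for (int i = 0; i < failures; i++)
        toy_create(&tbl, true, fix_leak);

    drift = tbl.gc_entries - tbl.gc_list_len;

    /* Release the entries that did make it onto the list. */
    while (tbl.gc_list) {
        struct toy_neigh *n = tbl.gc_list;

        tbl.gc_list = n->next;
        free(n);
    }
    return drift;
}
```

In this model every failed creation leaves the counter one higher than the
list can account for, so toy_drift(5, false) is 5 while toy_drift(5, true)
is 0. A counter that only drifts upward would eventually look like a full
table even when few entries exist.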
I don't know for sure yet that this is the cause of my problem, but it
seems to be incorrect in any case unless I have misunderstood something?
Tom
--
Tom Hughes (tom@compton.nu)
http://compton.nu/
* Re: ndisc_cache garbage collection issue
From: Eric Dumazet @ 2019-05-02 13:19 UTC
To: Tom Hughes, David Ahern; +Cc: netdev
Hi Tom
This seems to match your report: https://patchwork.ozlabs.org/patch/1093973/
* Re: ndisc_cache garbage collection issue
From: Tom Hughes @ 2019-05-02 17:59 UTC
To: Eric Dumazet, David Ahern; +Cc: netdev
On 02/05/2019 14:19, Eric Dumazet wrote:
> This seems to match your report : https://patchwork.ozlabs.org/patch/1093973/
That does indeed look like the same issue. I've built a kernel with
that patch applied, so we'll see how it goes.
Tom
--
Tom Hughes (tom@compton.nu)
http://compton.nu/