* Freeing alive fib_info caused by ebc0ffae5
@ 2010-11-04 10:23 Michael Ellerman
2010-11-04 10:30 ` Eric Dumazet
0 siblings, 1 reply; 7+ messages in thread
From: Michael Ellerman @ 2010-11-04 10:23 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet
Hi all,
I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
"Freeing alive fib_info" messages, from free_fib_info().
Actually I only get one per boot, when network interfaces come up.
Seemingly related, I am getting refcount problems when I shut down, i.e.
unregister_netdevice() sees a usage count of 1, which never decrements.
Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
fib: RCU conversion of fib_lookup()
fib_lookup() converted to be called in RCU protected context, no
reference taken and released on a contended cache line (fib_clntref)
Is this a bug in that commit, or a driver bug exposed?
cheers
* Re: Freeing alive fib_info caused by ebc0ffae5
2010-11-04 10:23 Freeing alive fib_info caused by ebc0ffae5 Michael Ellerman
@ 2010-11-04 10:30 ` Eric Dumazet
2010-11-04 10:46 ` Eric Dumazet
2010-11-04 11:21 ` Eric Dumazet
0 siblings, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2010-11-04 10:30 UTC (permalink / raw)
To: michael; +Cc: netdev
On Thursday 04 November 2010 at 21:23 +1100, Michael Ellerman wrote:
> Hi all,
>
> I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
> "Freeing alive fib_info" messages, from free_fib_info().
>
> Actually I only get one per boot, when network interfaces come up.
> Seemingly related, I am getting refcount problems when I shut down, i.e.
> unregister_netdevice() sees a usage count of 1, which never decrements.
>
> Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
>
> fib: RCU conversion of fib_lookup()
>
> fib_lookup() converted to be called in RCU protected context, no
> reference taken and released on a contended cache line (fib_clntref)
>
>
> Is this a bug in that commit, or a driver bug exposed?
Hi Michael, thanks for the report (and the painful bisection, I guess).
That's hard to say... Is it reproducible on my machine?
Thanks
* Re: Freeing alive fib_info caused by ebc0ffae5
2010-11-04 10:30 ` Eric Dumazet
@ 2010-11-04 10:46 ` Eric Dumazet
2010-11-04 11:21 ` Eric Dumazet
1 sibling, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2010-11-04 10:46 UTC (permalink / raw)
To: michael; +Cc: netdev
On Thursday 04 November 2010 at 11:30 +0100, Eric Dumazet wrote:
> On Thursday 04 November 2010 at 21:23 +1100, Michael Ellerman wrote:
> > Hi all,
> >
> > I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
> > "Freeing alive fib_info" messages, from free_fib_info().
> >
> > Actually I only get one per boot, when network interfaces come up.
> > Seemingly related, I am getting refcount problems when I shut down, i.e.
> > unregister_netdevice() sees a usage count of 1, which never decrements.
> >
> > Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
> >
> > fib: RCU conversion of fib_lookup()
> >
> > fib_lookup() converted to be called in RCU protected context, no
> > reference taken and released on a contended cache line (fib_clntref)
> >
> >
> > Is this a bug in that commit, or a driver bug exposed?
>
> Hi Michael, thanks for the report (and the painful bisection, I guess).
>
> That's hard to say... Is it reproducible on my machine?
You could possibly get a stack trace with the patch below; this might help to spot the bug.
Thanks
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 3e0da3e..8039db0 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -159,6 +159,7 @@ void free_fib_info(struct fib_info *fi)
 {
 	if (fi->fib_dead == 0) {
 		pr_warning("Freeing alive fib_info %p\n", fi);
+		WARN_ON_ONCE(1);
 		return;
 	}
 	change_nexthops(fi) {
* Re: Freeing alive fib_info caused by ebc0ffae5
2010-11-04 10:30 ` Eric Dumazet
2010-11-04 10:46 ` Eric Dumazet
@ 2010-11-04 11:21 ` Eric Dumazet
2010-11-04 11:23 ` Michael Ellerman
2010-11-04 11:35 ` Michael Ellerman
1 sibling, 2 replies; 7+ messages in thread
From: Eric Dumazet @ 2010-11-04 11:21 UTC (permalink / raw)
To: michael; +Cc: netdev
On Thursday 04 November 2010 at 11:30 +0100, Eric Dumazet wrote:
> On Thursday 04 November 2010 at 21:23 +1100, Michael Ellerman wrote:
> > Hi all,
> >
> > I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
> > "Freeing alive fib_info" messages, from free_fib_info().
> >
> > Actually I only get one per boot, when network interfaces come up.
> > Seemingly related, I am getting refcount problems when I shut down, i.e.
> > unregister_netdevice() sees a usage count of 1, which never decrements.
> >
> > Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
> >
> > fib: RCU conversion of fib_lookup()
> >
> > fib_lookup() converted to be called in RCU protected context, no
> > reference taken and released on a contended cache line (fib_clntref)
> >
> >
> > Is this a bug in that commit, or a driver bug exposed?
>
> Hi Michael, thanks for the report (and the painful bisection, I guess).
>
> That's hard to say... Is it reproducible on my machine?
>
Hmm, a review of the code spotted a bug in fib_result_assign().
Please try the following patch:
Thanks again!
[PATCH] fib: fib_result_assign() should not change fib refcounts
After commit ebc0ffae5 (RCU conversion of fib_lookup()),
fib_result_assign() should not change fib refcounts anymore.
Thanks to Michael who did the bisection and bug report.
Reported-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
net/ipv4/fib_lookup.h | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h
index a29edf2..c079cc0 100644
--- a/net/ipv4/fib_lookup.h
+++ b/net/ipv4/fib_lookup.h
@@ -47,11 +47,8 @@ extern int fib_detect_death(struct fib_info *fi, int order,
 static inline void fib_result_assign(struct fib_result *res,
 				     struct fib_info *fi)
 {
-	if (res->fi != NULL)
-		fib_info_put(res->fi);
+	/* we used to play games with refcounts, but we now use RCU */
 	res->fi = fi;
-	if (fi != NULL)
-		atomic_inc(&fi->fib_clntref);
 }
 
 #endif /* _FIB_LOOKUP_H */
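
The following is a simplified user-space model, not kernel code, sketching one plausible reading of why the old fib_result_assign() misbehaves once fib_lookup() stops taking a reference. All names prefixed with model_ and both struct types are hypothetical stand-ins; the assumption that a live entry holds exactly one reference (owned by the table) is a simplification made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct fib_info; not the real kernel layout. */
struct fib_info_model {
	int clntref;      /* models fib_clntref */
	bool dead;        /* models fib_dead */
	bool freed_alive; /* set where the kernel would print "Freeing alive fib_info" */
};

/* Hypothetical stand-in for struct fib_result. */
struct fib_result_model {
	struct fib_info_model *fi;
};

/* Models the check at the top of free_fib_info(). */
static void model_free(struct fib_info_model *fi)
{
	if (!fi->dead)
		fi->freed_alive = true;
}

/* Models fib_info_put(): drop one reference, free on zero. */
static void model_put(struct fib_info_model *fi)
{
	if (--fi->clntref == 0)
		model_free(fi);
}

/* Lookup after the RCU conversion: stores the pointer, takes no reference. */
static void model_lookup_rcu(struct fib_result_model *res,
			     struct fib_info_model *fi)
{
	res->fi = fi;
}

/* fib_result_assign() before the patch: still drops and takes references. */
static void model_assign_old(struct fib_result_model *res,
			     struct fib_info_model *fi)
{
	if (res->fi != NULL)
		model_put(res->fi);
	res->fi = fi;
	if (fi != NULL)
		fi->clntref++;
}

/* fib_result_assign() after the patch: plain pointer assignment. */
static void model_assign_new(struct fib_result_model *res,
			     struct fib_info_model *fi)
{
	res->fi = fi;
}

/* Runs both variants on two live entries, each with one (table) reference. */
static void model_demo(void)
{
	struct fib_info_model a = { 1, false, false };
	struct fib_info_model b = { 1, false, false };
	struct fib_result_model res = { NULL };

	model_lookup_rcu(&res, &a);
	model_assign_old(&res, &b);
	assert(a.freed_alive);  /* a lost a reference the lookup never took */
	assert(b.clntref == 2); /* b gained a reference nobody will drop */

	struct fib_info_model c = { 1, false, false };
	struct fib_info_model d = { 1, false, false };

	model_lookup_rcu(&res, &c);
	model_assign_new(&res, &d);
	assert(!c.freed_alive && c.clntref == 1 && d.clntref == 1);
}
```

Calling model_demo() from a main() exercises both paths. In this model the old variant both frees an entry that is still alive and leaves a stray reference on the other entry, which would be consistent with the two symptoms reported at the start of the thread; the patched variant leaves both counts untouched.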
* Re: Freeing alive fib_info caused by ebc0ffae5
2010-11-04 11:21 ` Eric Dumazet
@ 2010-11-04 11:23 ` Michael Ellerman
2010-11-04 11:35 ` Michael Ellerman
1 sibling, 0 replies; 7+ messages in thread
From: Michael Ellerman @ 2010-11-04 11:23 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
On Thu, 2010-11-04 at 12:21 +0100, Eric Dumazet wrote:
> On Thursday 04 November 2010 at 11:30 +0100, Eric Dumazet wrote:
> > On Thursday 04 November 2010 at 21:23 +1100, Michael Ellerman wrote:
> > > Hi all,
> > >
> > > I'm running Linus' latest or thereabouts (ff8b16d), and I'm seeing
> > > "Freeing alive fib_info" messages, from free_fib_info().
> > >
> > > Actually I only get one per boot, when network interfaces come up.
> > > Seemingly related, I am getting refcount problems when I shut down, i.e.
> > > unregister_netdevice() sees a usage count of 1, which never decrements.
> > >
> > > Bisect says it's ebc0ffae5 which causes the problem, or makes it appear.
> > >
> > > fib: RCU conversion of fib_lookup()
> > >
> > > fib_lookup() converted to be called in RCU protected context, no
> > > reference taken and released on a contended cache line (fib_clntref)
> > >
> > >
> > > Is this a bug in that commit, or a driver bug exposed?
> >
> > Hi Michael, thanks for the report (and the painful bisection, I guess).
> >
> > That's hard to say... Is it reproducible on my machine?
> >
>
> Hmm, a review of the code spotted a bug in fib_result_assign()
Aha, I was just adding some debug in there. Let me test the patch.
cheers
* Re: Freeing alive fib_info caused by ebc0ffae5
2010-11-04 11:21 ` Eric Dumazet
2010-11-04 11:23 ` Michael Ellerman
@ 2010-11-04 11:35 ` Michael Ellerman
2010-11-04 19:06 ` David Miller
1 sibling, 1 reply; 7+ messages in thread
From: Michael Ellerman @ 2010-11-04 11:35 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
On Thu, 2010-11-04 at 12:21 +0100, Eric Dumazet wrote:
>
> Hmm, a review of the code spotted a bug in fib_result_assign()
>
> Please try the following patch:
>
> Thanks again !
>
> [PATCH] fib: fib_result_assign() should not change fib refcounts
>
> After commit ebc0ffae5 (RCU conversion of fib_lookup()),
> fib_result_assign() should not change fib refcounts anymore.
>
> Thanks to Michael who did the bisection and bug report.
>
> Reported-by: Michael Ellerman <michael@ellerman.id.au>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> net/ipv4/fib_lookup.h | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h
> index a29edf2..c079cc0 100644
> --- a/net/ipv4/fib_lookup.h
> +++ b/net/ipv4/fib_lookup.h
> @@ -47,11 +47,8 @@ extern int fib_detect_death(struct fib_info *fi, int order,
>  static inline void fib_result_assign(struct fib_result *res,
>  				     struct fib_info *fi)
>  {
> -	if (res->fi != NULL)
> -		fib_info_put(res->fi);
> +	/* we used to play games with refcounts, but we now use RCU */
>  	res->fi = fi;
> -	if (fi != NULL)
> -		atomic_inc(&fi->fib_clntref);
>  }
>
>  #endif /* _FIB_LOOKUP_H */
Perfect, that fixes it, thanks!
cheers
* Re: Freeing alive fib_info caused by ebc0ffae5
2010-11-04 11:35 ` Michael Ellerman
@ 2010-11-04 19:06 ` David Miller
0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2010-11-04 19:06 UTC (permalink / raw)
To: michael; +Cc: eric.dumazet, netdev
From: Michael Ellerman <michael@ellerman.id.au>
Date: Thu, 04 Nov 2010 22:35:26 +1100
> On Thu, 2010-11-04 at 12:21 +0100, Eric Dumazet wrote:
>> [PATCH] fib: fib_result_assign() should not change fib refcounts
>>
>> After commit ebc0ffae5 (RCU conversion of fib_lookup()),
>> fib_result_assign() should not change fib refcounts anymore.
>>
>> Thanks to Michael who did the bisection and bug report.
...
> Perfect, that fixes it, thanks!
Applied, thanks everyone!
end of thread, other threads:[~2010-11-04 19:05 UTC | newest]
Thread overview: 7+ messages
2010-11-04 10:23 Freeing alive fib_info caused by ebc0ffae5 Michael Ellerman
2010-11-04 10:30 ` Eric Dumazet
2010-11-04 10:46 ` Eric Dumazet
2010-11-04 11:21 ` Eric Dumazet
2010-11-04 11:23 ` Michael Ellerman
2010-11-04 11:35 ` Michael Ellerman
2010-11-04 19:06 ` David Miller