* [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule
@ 2017-07-13 19:11 David Ahern
2017-07-13 19:46 ` David Ahern
0 siblings, 1 reply; 6+ messages in thread
From: David Ahern @ 2017-07-13 19:11 UTC
To: idosch, jiri, netdev; +Cc: David Ahern
The recent conversion to refcount_t, 717d1e993ad8 ("net: convert
fib_rule.refcnt from atomic_t to refcount_t"), and subsequent fix
by Eric, 5361e209dd30 ("net: avoid one splat in fib_nl_delrule()"),
exposed a bug in mlxsw.
The driver is doing a put on fib rules after processing them from the
notifier. This triggers a BUG:
[ 104.444889] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 104.452821] IP: fib_rules_lookup+0x39/0x170
[ 104.457056] PGD 409395067
[ 104.457057] P4D 409395067
[ 104.459783] PUD 408c23067
[ 104.462507] PMD 0
...
[ 104.519750] CPU: 1 PID: 900 Comm: vrf Tainted: G W 4.12.0-rc7+ #51
[ 104.527133] Hardware name: Mellanox Technologies Ltd. Mellanox switch/Mellanox switch, BIOS 4.6.5 05/21/2015
[ 104.537084] task: ffff880401454380 task.stack: ffffc900007c0000
[ 104.543029] RIP: 0010:fib_rules_lookup+0x39/0x170
[ 104.547784] RSP: 0000:ffff88041dd039d8 EFLAGS: 00010207
[ 104.553053] RAX: 00000000d8e1b910 RBX: 0000000000000000 RCX: 0000000000000002
[ 104.560264] RDX: 00000000fffffff5 RSI: 0000000000000000 RDI: ffff880408d80f30
[ 104.567461] RBP: ffff88041dd03a08 R08: 000000000000001d R09: 0000000000000000
[ 104.574699] R10: 0000000000000000 R11: 0000000000000006 R12: ffff88040b160cc0
[ 104.581916] R13: ffff88041dd03a18 R14: ffff88040b160d40 R15: ffff88041dd03aa0
[ 104.589130] FS: 00007f44b0edf700(0000) GS:ffff88041dd00000(0000) knlGS:0000000000000000
[ 104.597330] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 104.603151] CR2: 0000000000000010 CR3: 0000000408ca8000 CR4: 00000000001406e0
[ 104.610371] Call Trace:
[ 104.612839] <IRQ>
[ 104.614872] __fib_lookup+0x54/0x90
[ 104.618406] fib_validate_source+0x31d/0x570
[ 104.622731] ? fib_rules_lookup+0x131/0x170
[ 104.626975] ? __fib_lookup+0x54/0x90
[ 104.630685] ip_route_input_rcu+0xbcf/0xd30
Since mlxsw is not doing a get on the rule to increase the ref count, it
should not be doing a put.
Fixes: 5d7bfd141924a ("ipv4: fib_rules: Dump FIB rules when registering FIB notifier")
Signed-off-by: David Ahern <dsahern@gmail.com>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 383fef5a8e24..b0fb8e5e83c9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -2844,7 +2844,6 @@ static void mlxsw_sp_router_fib_event_work(struct work_struct *work)
 		rule = fib_work->fr_info.rule;
 		if (!fib4_rule_default(rule) && !rule->l3mdev)
 			mlxsw_sp_router_fib4_abort(mlxsw_sp);
-		fib_rule_put(rule);
 		break;
 	case FIB_EVENT_NH_ADD: /* fall through */
 	case FIB_EVENT_NH_DEL:
--
2.1.4
* Re: [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule
2017-07-13 19:11 [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule David Ahern
@ 2017-07-13 19:46 ` David Ahern
2017-07-13 20:33 ` Ido Schimmel
0 siblings, 1 reply; 6+ messages in thread
From: David Ahern @ 2017-07-13 19:46 UTC
To: idosch, jiri, netdev
On 7/13/17 1:11 PM, David Ahern wrote:
> Since mlxsw is not doing a get on the rule to increase the ref count, it
> should not be doing a put.
Upon further review, mlxsw is doing a get on the rule.
The problem remains, but this is not the right fix.
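The get/put pairing mentioned here can be sketched in user space. This is only an illustration under stated assumptions: `toy_rule`, `toy_work`, `toy_queue_event`, and `toy_run_work` are hypothetical names, a plain int stands in for the kernel's refcount, and the split between an event side and a deferred-work side is assumed from the work handler visible in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted rule. */
struct toy_rule {
	int refcnt;
};

/* Hypothetical deferred-work item that carries a rule pointer. */
struct toy_work {
	struct toy_rule *rule;
};

/* Event side: take a reference so the rule outlives the callback. */
static void toy_queue_event(struct toy_work *work, struct toy_rule *rule)
{
	rule->refcnt++;		/* the "get" */
	work->rule = rule;
}

/* Work side: process the rule, then release the reference taken above. */
static void toy_run_work(struct toy_work *work)
{
	/* ... the rule would be inspected here ... */
	work->rule->refcnt--;	/* the matching "put" */
	work->rule = NULL;
}
```

With a get on the event side, the put in the work handler is balanced, so removing only the put would leak a reference rather than fix the crash.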
* Re: [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule
2017-07-13 19:46 ` David Ahern
@ 2017-07-13 20:33 ` Ido Schimmel
2017-07-13 20:39 ` David Ahern
0 siblings, 1 reply; 6+ messages in thread
From: Ido Schimmel @ 2017-07-13 20:33 UTC
To: David Ahern; +Cc: jiri, netdev
On Thu, Jul 13, 2017 at 01:46:15PM -0600, David Ahern wrote:
> On 7/13/17 1:11 PM, David Ahern wrote:
> > Since mlxsw is not doing a get on the rule to increase the ref count, it
> > should not be doing a put.
>
> upon further review, mlxsw is doing a get on the rule
>
> Problem remains, but this is not the right fix.
Remains where? It's not clear to me how you concluded mlxsw is at fault.
My setup is running net-next with the refcount patches and I didn't
observe this.
If current trace isn't enough to pinpoint the problem, can you try to
reproduce with a KASAN enabled kernel?
* Re: [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule
2017-07-13 20:33 ` Ido Schimmel
@ 2017-07-13 20:39 ` David Ahern
2017-07-13 20:48 ` Ido Schimmel
2017-07-13 21:05 ` Ido Schimmel
0 siblings, 2 replies; 6+ messages in thread
From: David Ahern @ 2017-07-13 20:39 UTC
To: Ido Schimmel; +Cc: jiri, netdev
On 7/13/17 2:33 PM, Ido Schimmel wrote:
> Remains where? It's not clear to me how you concluded mlxsw is at fault.
> My setup is running net-next with the refcount patches and I didn't
> observe this.
Create a VRF.
See the latest patch. mlxsw releasing the refcnt on the rule was the victim;
Eric's patch to fix a delete was setting the refcnt to 1 after mlxsw
bumped it.
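The ordering problem described above can be modeled with a toy counter. This is a hypothetical user-space sketch, not kernel code: `toy_rule`, `toy_get`, `toy_put`, and `simulate_newrule` are names invented for illustration, and a plain int stands in for refcount_t.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct fib_rule; only the counter is modeled. */
struct toy_rule {
	int refcnt;
	bool freed;
};

static void toy_get(struct toy_rule *r)
{
	r->refcnt++;
}

/* Drop one reference; mark the rule freed when the last one goes away. */
static void toy_put(struct toy_rule *r)
{
	if (--r->refcnt == 0)
		r->freed = true;
}

/*
 * Model of adding a rule. If reset_after_notify is true, the core stores 1
 * into the counter after the listener already took its reference, which
 * discards that reference. Returns true if the rule ends up freed while
 * the core still expects to own it.
 */
static bool simulate_newrule(bool reset_after_notify)
{
	struct toy_rule rule = { .refcnt = 0, .freed = false };

	if (!reset_after_notify)
		rule.refcnt = 1;	/* core's own reference, set first */

	toy_get(&rule);			/* listener holds the rule for deferred work */

	if (reset_after_notify)
		rule.refcnt = 1;	/* overwrites the listener's reference */

	toy_put(&rule);			/* deferred work drops its reference */

	return rule.freed;
}
```

In this model, `simulate_newrule(true)` ends with the rule freed even though the core never released it, while `simulate_newrule(false)` leaves the core's reference intact. That is why the listener's put looks like the culprit although the counter was clobbered elsewhere.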
* Re: [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule
2017-07-13 20:39 ` David Ahern
@ 2017-07-13 20:48 ` Ido Schimmel
2017-07-13 21:05 ` Ido Schimmel
1 sibling, 0 replies; 6+ messages in thread
From: Ido Schimmel @ 2017-07-13 20:48 UTC
To: David Ahern; +Cc: jiri, netdev
On Thu, Jul 13, 2017 at 02:39:10PM -0600, David Ahern wrote:
> On 7/13/17 2:33 PM, Ido Schimmel wrote:
> > Remains where? It's not clear to me how you concluded mlxsw is at fault.
> > My setup is running net-next with the refcount patches and I didn't
> > observe this.
>
> Create a VRF.
Yea, I wasn't running VRFs with the refcount patches.
Reproduced this on my system now. Thanks for the fix.
* Re: [PATCH net] mlxsw: spectrum_router: do not drop refcnt on fib rule
2017-07-13 20:39 ` David Ahern
2017-07-13 20:48 ` Ido Schimmel
@ 2017-07-13 21:05 ` Ido Schimmel
1 sibling, 0 replies; 6+ messages in thread
From: Ido Schimmel @ 2017-07-13 21:05 UTC
To: David Ahern; +Cc: jiri, netdev
On Thu, Jul 13, 2017 at 02:39:10PM -0600, David Ahern wrote:
> On 7/13/17 2:33 PM, Ido Schimmel wrote:
> > Remains where? It's not clear to me how you concluded mlxsw is at fault.
> > My setup is running net-next with the refcount patches and I didn't
> > observe this.
>
> Create a VRF.
BTW, this didn't show up on my dev branch as I've patches that introduce
IPv6 support where I move the rules notifications to core, after the
refcount is set to 1 and just before the netlink notification is sent.
https://github.com/idosch/linux/commit/7b17a21b1d71fc9a1969080e5fdcb90f376b73b2