From mboxrd@z Thu Jan 1 00:00:00 1970 From: Vegard Nossum Subject: Re: [PATCH] net: check for NULL net_device in FIB tables Date: Mon, 4 Jul 2016 21:56:33 +0200 Message-ID: <577ABF71.6020002@oracle.com> References: <1467636446-6771-1-git-send-email-vegard.nossum@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: Andy Gospodarek , davem@davemloft.net, netdev@vger.kernel.org, Dinesh Dutt , Scott Feldman To: Julian Anastasov Return-path: Received: from aserp1040.oracle.com ([141.146.126.69]:42113 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753507AbcGDT4y (ORCPT ); Mon, 4 Jul 2016 15:56:54 -0400 In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On 07/04/2016 09:45 PM, Julian Anastasov wrote: > > Hello, > > On Mon, 4 Jul 2016, Vegard Nossum wrote: > >> struct fib_nh->nh_dev can be NULL, so we should check it before calling >> __in_dev_get_rcu on it. >> >> Multiple places seem to want this (and check the return value), so we can >> add a convenience wrapper for this. >> >> This fixes a crash in AF_NETLINK sendmsg(). >> >> Please double check that I caught all the callers that need the NULL >> guard. > > What kind of configuration causes such crash? It's not very easy to tell, sorry, I was using a netlink fuzzer to find the crash. But see below, you identified the right bit of code. >> - in_dev = __in_dev_get_rtnl(nh->nh_dev); >> + in_dev = in_dev_get_rtnl(nh->nh_dev); > > fib_rebalance can not crash because for multipath > routes (fib_nhs > 1) all nexthops should have valid nh_dev. > Okay, cool. >> >> if (in_dev && >> IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN(in_dev) && >> @@ -559,7 +559,7 @@ static void fib_rebalance(struct fib_info *fi) >> change_nexthops(fi) { >> int upper_bound; >> >> - in_dev = __in_dev_get_rtnl(nexthop_nh->nh_dev); >> + in_dev = in_dev_get_rtnl(nexthop_nh->nh_dev); >> >> if (nexthop_nh->nh_flags & RTNH_F_DEAD) { >> upper_bound = -1; >> @@ -1261,7 +1261,7 @@ int fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event, >> nla_put_u32(skb, RTA_OIF, fi->fib_nh->nh_oif)) >> goto nla_put_failure; >> if (fi->fib_nh->nh_flags & RTNH_F_LINKDOWN) { >> - in_dev = __in_dev_get_rtnl(fi->fib_nh->nh_dev); >> + in_dev = in_dev_get_rtnl(fi->fib_nh->nh_dev); > > Looks like this is the only place that can crash, > adding extra cycles to the other places is not good. And this > can happen only because fib_create_info() allows RTNH_F_LINKDOWN > to come for routes with error. May be fc_flags should be masked > there? Or there is another place that sets the flag when nh_dev > is NULL? Indeed, this is the (only) place that actually crashed for me. > >> if (in_dev && >> IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN(in_dev)) >> rtm->rtm_flags |= RTNH_F_DEAD; >> @@ -1292,7 +1292,7 @@ int fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event, >> >> rtnh->rtnh_flags = nh->nh_flags & 0xFF; >> if (nh->nh_flags & RTNH_F_LINKDOWN) { >> - in_dev = __in_dev_get_rtnl(nh->nh_dev); >> + in_dev = in_dev_get_rtnl(nh->nh_dev); > > Not needed because fib_nhs > 1 > Alright. Thanks for the review! I can submit a new patch to only check the one place above that actually crashed. Otherwise, if you think it's better to go with your fc_flags suggestion, feel free to send a patch for that. As you can tell, I am not very familiar with this code :-) If you do send a patch, I can test it easily. Vegard