From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stephen Hemminger
Subject: Re: [iproute PATCH] ip-route: Propagate errors from parse_one_nh()
Date: Wed, 24 Jan 2018 07:44:42 -0800
Message-ID: <20180124074442.6790409f@xeon-e3>
References: <20180123164047.28661-1-phil@nwl.cc> <20180123144442.1500f35a@xeon-e3> <20180124091924.GF1008@orbyte.nwl.cc>
In-Reply-To: <20180124091924.GF1008@orbyte.nwl.cc>
Cc: netdev@vger.kernel.org, Élie Bouttier
To: Phil Sutter

On Wed, 24 Jan 2018 10:19:24 +0100
Phil Sutter wrote:

> Hi Stephen,
>
> On Tue, Jan 23, 2018 at 02:44:42PM -0800, Stephen Hemminger wrote:
> > On Tue, 23 Jan 2018 17:40:47 +0100
> > Phil Sutter wrote:
> >
> > > The following command segfaults if enp0s31f6 does not exist:
> > >
> > > | # ip -6 route add default proto ra metric 20100 \
> > > |   nexthop via fe80:52:0:2040::1fc dev enp0s31f6 weight 1 \
> > > |   nexthop via fe80:52:0:2040::1fe dev enp0s31f6 weight 1
> > >
> > > Since the non-zero return code from parse_one_nh() is ignored,
> > > parse_nexthops() continues iterating over the same fields in argv
> > > until buffer space is exhausted and eventually accesses unallocated
> > > memory.
> > >
> > > Fix this by aborting on error in parse_nexthops() and make
> > > iproute_modify() fail if parse_nexthops() did.
> > >
> > > Reported-by: Lennart Poettering
> > > Fixes: 2f406f2d0b4ef ("ip route: replace exits with returns")
> > > Signed-off-by: Phil Sutter
> > > ---
> > >  ip/iproute.c | 7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/ip/iproute.c b/ip/iproute.c
> > > index bf886fda9d761..d7accf57ac8d1 100644
> > > --- a/ip/iproute.c
> > > +++ b/ip/iproute.c
> > > @@ -871,7 +871,8 @@ static int parse_nexthops(struct nlmsghdr *n, struct rtmsg *r,
> > >  		memset(rtnh, 0, sizeof(*rtnh));
> > >  		rtnh->rtnh_len = sizeof(*rtnh);
> > >  		rta->rta_len += rtnh->rtnh_len;
> > > -		parse_one_nh(n, r, rta, rtnh, &argc, &argv);
> > > +		if (parse_one_nh(n, r, rta, rtnh, &argc, &argv) < 0)
> > > +			return -1;
> > >  		rtnh = RTNH_NEXT(rtnh);
> > >  	}
> > >
> > > @@ -1318,8 +1319,8 @@ static int iproute_modify(int cmd, unsigned int flags, int argc, char **argv)
> > >  		addattr_l(&req.n, sizeof(req), RTA_METRICS, RTA_DATA(mxrta), RTA_PAYLOAD(mxrta));
> > >  	}
> > >
> > > -	if (nhs_ok)
> > > -		parse_nexthops(&req.n, &req.r, argc, argv);
> > > +	if (nhs_ok && parse_nexthops(&req.n, &req.r, argc, argv) < 0)
> > > +		return -1;
> > >
> > >  	if (req.r.rtm_family == AF_UNSPEC)
> > >  		req.r.rtm_family = AF_INET;
> >
> >
> > The real issue is that handling of invalid device is different than all the other
> > possible semantic errors.
> >
> > My recommendations are:
> >   * change bad device to use invarg() which does exit
> >   * make functions that only return 0 void including
> >       parse_one_nh
> >       lwt_parse_encap
> >       get_addr
> >
> > Also, it looks like read_family converts any address family it doesn't know about to unspec
> > that is stupid behavior as well.
> >
> > The original commit 2f406f2d0b4ef ("ip route: replace exits with returns")
> > looks like well intentioned but suspect. Most of the errors in ip route
> > indicate real issues where continuing is not a good plan.
>
> You're right, the use of invarg() for any other error effectively
> prevents what said commit tried to achieve, so my fix is pretty
> pointless in that regard. Yet I wonder why we still have 'ip -batch
> -force' given that it's not useful. Maybe Élie is able to provide some
> details about the use-case said commit tried to fix?
>
> Meanwhile I'll prepare some patches to address the shortcomings you
> mentioned above.

The use case for batch (and force) is that there may be a large set of route
or qdisc operations where it is ok for some of them to fail because the kernel
rejects them. I don't think batch should ever just continue if handed invalid
syntax for a device or address. There are some borderline cases, for example
if a tunnel device could not be created and later steps depend on that name.

Agreed, let's get some real data on why the original patch was done.
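
To make the distinction above concrete, here is a minimal sketch in C of a
batch loop that keeps going after kernel-side failures when force is set, but
always stops on invalid syntax. This is not iproute2 code: enum cmd_status,
run_batch() and demo_run() are hypothetical names made up for illustration,
and the prefix matching in demo_run() only stands in for real argument
parsing and netlink requests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; not taken from iproute2. */
enum cmd_status {
	CMD_OK = 0,
	CMD_KERNEL_ERR,	/* well-formed request that the kernel rejected */
	CMD_SYNTAX_ERR,	/* invalid argument, e.g. unknown device or bad address */
};

/* Hypothetical callback type: runs one batch line and classifies the result. */
typedef enum cmd_status (*run_cmd_fn)(const char *line);

/* Stand-in runner for illustration: classifies a line by its prefix. */
enum cmd_status demo_run(const char *line)
{
	if (strncmp(line, "bad", 3) == 0)
		return CMD_SYNTAX_ERR;
	if (strncmp(line, "fail", 4) == 0)
		return CMD_KERNEL_ERR;
	return CMD_OK;
}

/*
 * Run n batch lines. With force set, kernel-side failures are recorded and
 * skipped; a syntax error stops the batch regardless of force.
 * Returns 0 if every line succeeded, -1 otherwise. *attempted receives the
 * number of lines that were actually run.
 */
int run_batch(const char *const *lines, size_t n, bool force,
	      run_cmd_fn run, size_t *attempted)
{
	int ret = 0;
	size_t i;

	*attempted = 0;
	for (i = 0; i < n; i++) {
		enum cmd_status st = run(lines[i]);

		(*attempted)++;
		if (st == CMD_OK)
			continue;
		ret = -1;
		if (st == CMD_SYNTAX_ERR || !force)
			break;
	}
	return ret;
}
```

With this policy, a batch of { "ok", "fail", "ok", "bad", "ok" } run with
force stops at the fourth line, and without force it stops at the second. The
borderline case mentioned above (a later step depending on a device that
failed to be created) is not modelled here.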