From: Jiri Pirko
Subject: Re: [PATCH] net: take care of bonding in build_skb_flow_key (v4)
Date: Fri, 22 Jan 2016 07:52:07 +0100
Message-ID: <20160122065207.GA2211@nanopsycho.orion>
References: <1453354378-3018-1-git-send-email-wen.gang.wang@oracle.com> <20160121083506.GA2251@nanopsycho.orion> <56A1AE48.4000908@oracle.com>
In-Reply-To: <56A1AE48.4000908@oracle.com>
To: Wengang Wang
Cc: netdev@vger.kernel.org, sd@queasysnail.net, jay.vosburgh@canonical.com, zyjzyj2000@gmail.com

Fri, Jan 22, 2016 at 05:21:28AM CET, wen.gang.wang@oracle.com wrote:
>
>
>On 2016-01-21 16:35, Jiri Pirko wrote:
>>Thu, Jan 21, 2016 at 06:32:58AM CET, wen.gang.wang@oracle.com wrote:
>>>In a bonding setting, we determine the fragment size according to the MTU and
>>>PMTU associated with the bonding master. If the slave finds the fragment
>>>size is too big, it drops the fragment and calls ip_rt_update_pmtu(),
>>>passing _skb_ and _pmtu_, trying to update the path MTU.
>>>The problem is that the target device that ip_rt_update_pmtu() actually
>>>tries to update is the slave (skb->dev), not the master. Thus, since no
>>>PMTU change happens on the master, the fragment size for later packets doesn't
>>>change, so all later fragments/packets are dropped too.
>>>
>>>The fix is letting build_skb_flow_key() take care of the transition of
>>>device index from bonding slave to the master. That makes the master become
>>>the target device that ip_rt_update_pmtu() tries to update the PMTU on.
>>>
>>>Signed-off-by: Wengang Wang
>>>---
>>>net/ipv4/route.c | 9 +++++++++
>>>1 file changed, 9 insertions(+)
>>>
>>>diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>>>index 85f184e..7e766b5 100644
>>>--- a/net/ipv4/route.c
>>>+++ b/net/ipv4/route.c
>>>@@ -524,10 +524,19 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>>>{
>>> 	const struct iphdr *iph = ip_hdr(skb);
>>> 	int oif = skb->dev->ifindex;
>>>+	struct net_device *master;
>>> 	u8 tos = RT_TOS(iph->tos);
>>> 	u8 prot = iph->protocol;
>>> 	u32 mark = skb->mark;
>>>
>>>+	if (netif_is_bond_slave(skb->dev)) {
>>>+		rcu_read_lock();
>>>+		master = netdev_master_upper_dev_get_rcu(skb->dev);
>>>+		if (master)
>>>+			oif = master->ifindex;
>>>+		rcu_read_unlock();
>>>+	}
>>This is certainly not correct as it should not be bond-specific but
>>rather generic.
>
>Then what would you suggest to fix it?
>>Note that you may have bond over bond or bridge over
>>bond or other scenarios, which this patch ignores.
>I don't think bond over bond is a good configuration. Do you have a real use
>case for that configuration?

Stacking of multiple master devices is absolutely common. You have to
walk the upper-device tree all the way up, for all master device types,
not only bonding.

>
>thanks,
>wengang
>
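
To illustrate the point about stacked masters, here is a minimal userspace
sketch, not kernel code. struct toy_dev and toy_top_master() are hypothetical
names invented for this illustration; they only model the idea of following
master links until none is left, regardless of device type. In the kernel the
walk would have to use the real upper-device links under RCU, and whether the
topmost master is always the right target for the PMTU update is an assumption
of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a network device.
 * This is NOT the kernel's struct net_device.
 */
struct toy_dev {
	int ifindex;
	const char *name;
	struct toy_dev *master;	/* upper master device, NULL at the top */
};

/*
 * Follow the master links until a device without a master is reached.
 * Nothing here depends on the device type, so bond over bond or
 * bridge over bond are handled by the same loop.
 */
const struct toy_dev *toy_top_master(const struct toy_dev *dev)
{
	assert(dev != NULL);
	while (dev->master)
		dev = dev->master;
	return dev;
}

/* Pick the ifindex of the topmost master, or the device's own if unstacked. */
int toy_flow_oif(const struct toy_dev *dev)
{
	return toy_top_master(dev)->ifindex;
}
```

With eth0 enslaved to bond0 and bond0 attached to br0, toy_flow_oif(&eth0)
yields the ifindex of br0, whereas a bond-only check would stop at bond0.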