From: Eric Dumazet <eric.dumazet@gmail.com>
To: Simon Horman <horms@verge.net.au>
Cc: netdev@vger.kernel.org, Jay Vosburgh <fubar@us.ibm.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: bonding: flow control regression [was Re: bridging: flow control regression]
Date: Tue, 02 Nov 2010 08:30:57 +0100
Message-ID: <1288683057.2660.154.camel@edumazet-laptop>
In-Reply-To: <20101102070308.GA19924@verge.net.au>
On Tuesday, 02 November 2010 at 16:03 +0900, Simon Horman wrote:
> On Tue, Nov 02, 2010 at 05:53:42AM +0100, Eric Dumazet wrote:
> > On Tuesday, 02 November 2010 at 11:06 +0900, Simon Horman wrote:
> >
> > > Thanks for the explanation.
> > > I'm not entirely sure how much of a problem this is in practice.
> >
> > Maybe for virtual devices (tunnels, bonding, ...), it would make sense
> > to delay the orphaning up to the real device.
>
> That was my initial thought. Could you give me some guidance
> on how that might be done so I can try and make a patch to test?
>
> > But if the socket send buffer is very large, it would defeat the flow
> > control anyway...
>
> I'm primarily concerned about a situation where
> UDP packets are sent as fast as possible, indefinitely.
> And in that scenario, I think it would need to be a rather large buffer.
>
Please try the following patch, thanks.
 drivers/net/bonding/bond_main.c |    1 +
 include/linux/if.h              |    3 +++
 net/core/dev.c                  |    5 +++--
 3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index bdb68a6..325931e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4714,6 +4714,7 @@ static void bond_setup(struct net_device *bond_dev)
 	bond_dev->flags |= IFF_MASTER|IFF_MULTICAST;
 	bond_dev->priv_flags |= IFF_BONDING;
 	bond_dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
+	bond_dev->priv_flags &= ~IFF_EARLY_ORPHAN;
 
 	if (bond->params.arp_interval)
 		bond_dev->priv_flags |= IFF_MASTER_ARPMON;
diff --git a/include/linux/if.h b/include/linux/if.h
index 1239599..7499a99 100644
--- a/include/linux/if.h
+++ b/include/linux/if.h
@@ -77,6 +77,9 @@
 #define IFF_BRIDGE_PORT	0x8000		/* device used as bridge port */
 #define IFF_OVS_DATAPATH	0x10000		/* device used as Open vSwitch
 					 * datapath port */
+#define IFF_EARLY_ORPHAN	0x20000		/* early orphan skbs in
+					 * dev_hard_start_xmit()
+					 */
 
 #define IF_GET_IFACE	0x0001		/* for querying only */
 #define IF_GET_PROTO	0x0002
diff --git a/net/core/dev.c b/net/core/dev.c
index 35dfb83..eabf94d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2005,7 +2005,8 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
 			skb_dst_drop(skb);
 
-		skb_orphan_try(skb);
+		if (dev->priv_flags & IFF_EARLY_ORPHAN)
+			skb_orphan_try(skb);
 
 		if (vlan_tx_tag_present(skb) &&
 		    !(dev->features & NETIF_F_HW_VLAN_TX)) {
@@ -5590,7 +5591,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
-	dev->priv_flags = IFF_XMIT_DST_RELEASE;
+	dev->priv_flags = IFF_XMIT_DST_RELEASE | IFF_EARLY_ORPHAN;
 	setup(dev);
 	strcpy(dev->name, name);
 	return dev;
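
For readers following the thread, the effect of gating the early orphan on a per-device flag can be illustrated with a small userspace model. This is only a sketch, not kernel code: `simulate()`, `struct sim_result`, and the constants `PKT_SIZE`, `TXQ_LIMIT`, `SENDS_PER_TICK` and `DRAIN_PER_TICK` are hypothetical names and values chosen for illustration. It assumes a sender that tries to queue datagrams faster than the device drains them, and it treats "orphaning" simply as releasing the bytes charged against the socket send buffer.

```c
#include <assert.h>
#include <stdbool.h>

/* All constants below are illustrative assumptions, not kernel values. */
#define PKT_SIZE       1000  /* bytes charged to the socket per datagram */
#define TXQ_LIMIT      100   /* device queue capacity, in packets */
#define SENDS_PER_TICK 10    /* send attempts made by the sender per tick */
#define DRAIN_PER_TICK 1     /* packets the device transmits per tick */

struct sim_result {
	long queued;   /* packets accepted by the device queue */
	long dropped;  /* packets dropped because the queue was full */
	long blocked;  /* send attempts refused by the send buffer limit */
};

/*
 * early_orphan == true models a device with the flag set: the charge is
 * released before the packet is queued, so the socket never sees pressure.
 * early_orphan == false models delayed orphaning: the charge is held until
 * the packet leaves the queue (transmitted or dropped).
 */
static struct sim_result simulate(bool early_orphan, int sndbuf, int ticks)
{
	struct sim_result r = {0, 0, 0};
	int wmem = 0;  /* bytes currently charged to the socket */
	int qlen = 0;  /* packets waiting in the device queue */

	assert(sndbuf >= 0 && ticks >= 0);

	for (int t = 0; t < ticks; t++) {
		for (int i = 0; i < SENDS_PER_TICK; i++) {
			if (wmem + PKT_SIZE > sndbuf) {
				r.blocked++;  /* sender would sleep or get EAGAIN */
				continue;
			}
			wmem += PKT_SIZE;
			if (early_orphan)
				wmem -= PKT_SIZE;  /* charge released up front */
			if (qlen >= TXQ_LIMIT) {
				r.dropped++;
				if (!early_orphan)
					wmem -= PKT_SIZE;  /* freed packet releases charge */
				continue;
			}
			qlen++;
			r.queued++;
		}
		for (int i = 0; i < DRAIN_PER_TICK && qlen > 0; i++) {
			qlen--;
			if (!early_orphan)
				wmem -= PKT_SIZE;  /* released at tx completion */
		}
	}
	return r;
}
```

With a send buffer smaller than the queue (for example `simulate(false, 50000, 1000)`), the sender in this model is throttled and nothing is dropped, whereas `simulate(true, 50000, 1000)` never blocks and overflows the queue. A send buffer larger than the queue (for example `simulate(false, 1000000, 1000)`) drops packets even with delayed orphaning, which is the caveat about very large socket send buffers raised earlier in the thread.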