* Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
2004-11-12 23:51 Radheka Godse
@ 2004-11-12 0:00 ` David S. Miller
0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2004-11-12 0:00 UTC (permalink / raw)
To: Radheka Godse; +Cc: bonding-devel, fubar, ctindel, linux-kernel
On Fri, 12 Nov 2004 15:51:49 -0800 (PST)
Radheka Godse <radheka.godse@intel.com> wrote:
> -
> +
> + /* We let the bond device publish all hardware
> + * acceleration features possible. This is OK,
> + * since if an skb is passed from the bond to
> + * a slave that doesn't support one of those
> + * features, everything is fixed in the
> + * dev_queue_xmit() function (e.g. calculate
> + * check sum, linearize the skb, etc.).
> + */
This is very inefficient if the bond slaves don't
support these features.
I believe you when you say you saw improvement in the
case where the slaves do support TSO, but if you test
a non-TSO slave case I bet you'll see a marked
decrease in system utilization at least.
The upper layers need to know the precise capabilities
of the device in order to optimize the copy from userspace,
the checksumming, and the data gathering into the SKB.
Therefore, if you "fake it out" like this without checking
what the slaves actually support, then a lot of wasted cpu
time will be spent in each dev_queue_xmit() path. There will
in many cases be multiple passes over the data instead of one,
and it is possible to introduce an extra data copy as well.
I would recommend instead the following algorithm. Publish
only the capabilities which all slaves support.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
2004-11-12 21:31 [Bonding-devel][PATCH]Zero Copy Transmit Support (Update) Godse, Radheka
@ 2004-11-12 21:20 ` David S. Miller
2004-11-12 21:56 ` Jay Vosburgh
0 siblings, 1 reply; 9+ messages in thread
From: David S. Miller @ 2004-11-12 21:20 UTC (permalink / raw)
To: Godse, Radheka; +Cc: bonding-devel, fubar, ctindel, linux-kernel
On Fri, 12 Nov 2004 13:31:53 -0800
"Godse, Radheka" <radheka.godse@intel.com> wrote:
> I had similar thoughts but then, the bond device does not have any
> slaves attached to it at load time. By publishing them upfront the bond
> device is able to take advantage of hardware acceleration if it is later
> available...
This is not a problem at all.
You can make ethtool calls on the bonding device to change
settings later. Nothing prevents you from changing the
SG/CSUM/TSO bits after device registration.
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
@ 2004-11-12 21:31 Godse, Radheka
2004-11-12 21:20 ` David S. Miller
0 siblings, 1 reply; 9+ messages in thread
From: Godse, Radheka @ 2004-11-12 21:31 UTC (permalink / raw)
To: David S. Miller; +Cc: bonding-devel, fubar, ctindel, linux-kernel
I had similar thoughts but then, the bond device does not have any
slaves attached to it at load time. By publishing them upfront the bond
device is able to take advantage of hardware acceleration if it is later
available... I will get test data for non-TSO test scenarios to see if
there is significant degradation in performance.
-radheka
-----Original Message-----
From: David S. Miller [mailto:davem@davemloft.net]
Sent: Thursday, November 11, 2004 4:00 PM
To: Godse, Radheka
Cc: bonding-devel@lists.sourceforge.net; fubar@us.ibm.com;
ctindel@users.sourceforge.net; linux-kernel@vger.kernel.org
Subject: Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
On Fri, 12 Nov 2004 15:51:49 -0800 (PST)
Radheka Godse <radheka.godse@intel.com> wrote:
> -
> +
> + /* We let the bond device publish all hardware
> + * acceleration features possible. This is OK,
> + * since if an skb is passed from the bond to
> + * a slave that doesn't support one of those
> + * features, everything is fixed in the
> + * dev_queue_xmit() function (e.g. calculate
> + * check sum, linearize the skb, etc.).
> + */
This is very inefficient if the bond slaves don't
support these features.
I believe you when you say you saw improvement in the
case where the slaves do support TSO, but if you test
a non-TSO slave case I bet you'll see a marked
decrease in system utilization at least.
The upper layers need to know the precise capabilities
of the device in order to optimize the copy from userspace,
the checksumming, and the data gathering into the SKB.
Therefore, if you "fake it out" like this without checking
what the slaves actually support, then a lot of wasted cpu
time will be spent in each dev_queue_xmit() path. There will
in many cases be multiple passes over the data instead of one,
and it is possible to introduce an extra data copy as well.
I would recommend instead the following algorithm. Publish
only the capabilities which all slaves support.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
2004-11-12 21:56 ` Jay Vosburgh
@ 2004-11-12 21:49 ` David S. Miller
2004-11-12 22:20 ` Jay Vosburgh
0 siblings, 1 reply; 9+ messages in thread
From: David S. Miller @ 2004-11-12 21:49 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: radheka.godse, bonding-devel, ctindel, linux-kernel
On Fri, 12 Nov 2004 13:56:26 -0800
Jay Vosburgh <fubar@us.ibm.com> wrote:
>
> David S. Miller <davem@davemloft.net> wrote:
> >> I had similar thoughts but then, the bond device does not have any
> >> slaves attached to it at load time. By publishing them upfront the bond
> >> device is able to take advantage of hardware acceleration if it is later
> >> available...
>
> "Shlomi Yaakobovich" <Shlomi@exanet.com> posted a patch to
> update the features as slaves are added and removed, based on the
> features advertised by the slaves. His original patch wasn't properly
> based; this is the same change set redone to patch against 2.6.9.
That's definitely a good start.
It does need to be fixed to enforce the usual rules about
illegal combinations. And his code is going to include
all sorts of weird things like VLAN offload which I wonder
if works correctly with the current bonding driver? :)
The two rules are codified in register_netdevice() as follows:
/* Fix illegal SG+CSUM combinations. */
if ((dev->features & NETIF_F_SG) &&
!(dev->features & (NETIF_F_IP_CSUM |
NETIF_F_NO_CSUM |
NETIF_F_HW_CSUM))) {
printk("%s: Dropping NETIF_F_SG since no checksum feature.\n",
dev->name);
dev->features &= ~NETIF_F_SG;
}
/* TSO requires that SG is present as well. */
if ((dev->features & NETIF_F_TSO) &&
!(dev->features & NETIF_F_SG)) {
printk("%s: Dropping NETIF_F_TSO since no SG feature.\n",
dev->name);
dev->features &= ~NETIF_F_TSO;
}
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
2004-11-12 21:20 ` David S. Miller
@ 2004-11-12 21:56 ` Jay Vosburgh
2004-11-12 21:49 ` David S. Miller
0 siblings, 1 reply; 9+ messages in thread
From: Jay Vosburgh @ 2004-11-12 21:56 UTC (permalink / raw)
To: David S. Miller; +Cc: Godse, Radheka, bonding-devel, ctindel, linux-kernel
David S. Miller <davem@davemloft.net> wrote:
>> I had similar thoughts but then, the bond device does not have any
>> slaves attached to it at load time. By publishing them upfront the bond
>> device is able to take advantage of hardware acceleration if it is later
>> available...
"Shlomi Yaakobovich" <Shlomi@exanet.com> posted a patch to
update the features as slaves are added and removed, based on the
features advertised by the slaves. His original patch wasn't properly
based; this is the same change set redone to patch against 2.6.9.
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
Index: linux-2681-bd/drivers/net/bonding/bond_main.c
===================================================================
RCS file: /sda7/CVS/linux-2681-bd/drivers/net/bonding/bond_main.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -u -r1.2 -r1.3
--- linux-2681-bd/drivers/net/bonding/bond_main.c 15 Oct 2004 22:11:14 -0000 1.2
+++ linux-2681-bd/drivers/net/bonding/bond_main.c 15 Oct 2004 23:27:34 -0000 1.3
@@ -1580,6 +1580,32 @@
return 0;
}
+enum { /* don't set these here */
+ BOND_MAINTAINED_FEATURES =
+ NETIF_F_VLAN_CHALLENGED|NETIF_F_HW_VLAN_TX |
+ NETIF_F_HW_VLAN_RX |NETIF_F_HW_VLAN_FILTER
+};
+
+static void bond_set_features(struct net_device *bond_dev)
+{
+ struct bonding *bond = bond_dev->priv;
+ struct slave *slave;
+ int i, features = 0;
+
+ bond_for_each_slave(bond, slave, i) {
+ struct net_device *slave_dev = slave->dev;
+ if (i == 0) {
+ features = slave_dev->features;
+ } else {
+ features &= slave_dev->features;
+ }
+ }
+ /* clear all previous features */
+ bond_dev->features &= BOND_MAINTAINED_FEATURES;
+ /* add common features, don't touch driver maintained ones */
+ bond_dev->features |= (features & ~BOND_MAINTAINED_FEATURES);
+}
+
/* enslave device <slave> to bond device <master> */
static int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
{
@@ -1972,6 +1998,7 @@
new_slave->link != BOND_LINK_DOWN ? "n up" : " down");
/* enslave is successful */
+ bond_set_features(bond_dev);
return 0;
/* Undo stages on error */
@@ -2177,6 +2204,7 @@
kfree(slave);
+ bond_set_features(bond_dev);
return 0; /* deletion OK */
}
@@ -2292,6 +2320,7 @@
printk(KERN_INFO DRV_NAME
": %s: released all slaves\n",
bond_dev->name);
+ bond_set_features(bond_dev);
out:
write_unlock_bh(&bond->lock);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
2004-11-12 21:49 ` David S. Miller
@ 2004-11-12 22:20 ` Jay Vosburgh
2004-11-12 22:24 ` David S. Miller
0 siblings, 1 reply; 9+ messages in thread
From: Jay Vosburgh @ 2004-11-12 22:20 UTC (permalink / raw)
To: David S. Miller; +Cc: radheka.godse, bonding-devel, ctindel, linux-kernel
David S. Miller <davem@davemloft.net> wrote:
>That's definitely a good start.
>
>It does need to be fixed to enforce the usual rules about
>illegal combinations. And his code is going to include
>all sorts of weird things like VLAN offload which I wonder
>if works correctly with the current bonding driver? :)
The existing code should handle VLANs correctly, and the patch
excludes the VLAN related bits from the dev->features update.
>The two rules are codified in register_netdevice() as follows:
[...]
Would it be preferrable to duplicate that logic in bonding, or
push it out to an inline or some such?
-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
2004-11-12 22:20 ` Jay Vosburgh
@ 2004-11-12 22:24 ` David S. Miller
0 siblings, 0 replies; 9+ messages in thread
From: David S. Miller @ 2004-11-12 22:24 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: radheka.godse, bonding-devel, ctindel, linux-kernel
On Fri, 12 Nov 2004 14:20:51 -0800
Jay Vosburgh <fubar@us.ibm.com> wrote:
> Would it be preferrable to duplicate that logic in bonding, or
> push it out to an inline or some such?
It's present also in the ethtool methods used to change
these things, so you might consider making ethtool calls
to change the bits as well.
It's a pretty bad idea for folks to be changing the ->features
bits directly in drivers after device registry.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
@ 2004-11-12 23:51 Radheka Godse
2004-11-12 0:00 ` David S. Miller
0 siblings, 1 reply; 9+ messages in thread
From: Radheka Godse @ 2004-11-12 23:51 UTC (permalink / raw)
To: bonding-devel, fubar, ctindel; +Cc: linux-kernel
Please ignore previous submittal, it has reversed tags ('-'s instead of
'+'s; I had swapped the order for diff of modified and un-modified
src trees)
The patch below ADDS Zero Copy Transmit Support to bonding device.
We saw around ~50% system utilization improvement with tcp sendfile()
function enabled netperf test in 802.3ad mode.
Note that this patch was generated for 2.6.9 kernel after applying
bond-patch-2.6.1(also attached to this thread) that was submitted last week and
got accepted into the kernel.
Signed-off-by: Radheka Godse <radheka.godse@intel.com>
diff -uprN -X dontdiff linux-2.6.9_bond-patch-2.6.1/drivers/net/bonding/bonding.h linux-2.6.9/drivers/net/bonding/bonding.h
--- linux-2.6.9_bond-patch-2.6.1/drivers/net/bonding/bonding.h 2004-11-10 15:42:55.000000000 -0800
+++ linux-2.6.9/drivers/net/bonding/bonding.h 2004-11-11 13:05:54.000000000 -0800
@@ -36,8 +36,8 @@
#include "bond_3ad.h"
#include "bond_alb.h"
-#define DRV_VERSION "2.6.1"
-#define DRV_RELDATE "October 29, 2004"
+#define DRV_VERSION "2.6.2"
+#define DRV_RELDATE "November 09, 2004"
#define DRV_NAME "bonding"
#define DRV_DESCRIPTION "Ethernet Channel Bonding Driver"
diff -uprN -X dontdiff linux-2.6.9_bond-patch-2.6.1/drivers/net/bonding/bond_main.c linux-2.6.9/drivers/net/bonding/bond_main.c
--- linux-2.6.9_bond-patch-2.6.1/drivers/net/bonding/bond_main.c 2004-11-10 15:42:55.000000000 -0800
+++ linux-2.6.9/drivers/net/bonding/bond_main.c 2004-11-11 13:05:54.000000000 -0800
@@ -475,6 +475,9 @@
* Solution is to move call to dev_remove_pack outside of the
* spinlock.
* Set version to 2.6.1.
+ * 2004/11/09 - Radheka Godse <radheka.godse at intel dot com>
+ * - Added Zero Copy Transmit Support by setting appropriate flags.
+ * Set version to 2.6.2.
*
*/
@@ -4318,7 +4321,22 @@ static int __init bond_init(struct net_d
bond_dev->features |= (NETIF_F_HW_VLAN_TX |
NETIF_F_HW_VLAN_RX |
NETIF_F_HW_VLAN_FILTER);
-
+
+ /* We let the bond device publish all hardware
+ * acceleration features possible. This is OK,
+ * since if an skb is passed from the bond to
+ * a slave that doesn't support one of those
+ * features, everything is fixed in the
+ * dev_queue_xmit() function (e.g. calculate
+ * check sum, linearize the skb, etc.).
+ */
+ bond_dev->features |= (NETIF_F_SG |
+ NETIF_F_IP_CSUM |
+ NETIF_F_NO_CSUM |
+ NETIF_F_HW_CSUM |
+ NETIF_F_HIGHDMA |
+ NETIF_F_FRAGLIST);
+
#ifdef CONFIG_PROC_FS
bond_create_proc_entry(bond);
#endif
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: [Bonding-devel][PATCH]Zero Copy Transmit Support (Update)
@ 2004-11-18 19:02 Godse, Radheka
0 siblings, 0 replies; 9+ messages in thread
From: Godse, Radheka @ 2004-11-18 19:02 UTC (permalink / raw)
To: David S. Miller; +Cc: bonding-devel, fubar, ctindel, linux-kernel
> -
> +
> + /* We let the bond device publish all hardware
> + * acceleration features possible. This is OK,
> + * since if an skb is passed from the bond to
> + * a slave that doesn't support one of those
> + * features, everything is fixed in the
> + * dev_queue_xmit() function (e.g. calculate
> + * check sum, linearize the skb, etc.).
> + */
This is very inefficient if the bond slaves don't
support these features.
I believe you when you say you saw improvement in the
case where the slaves do support TSO, but if you test
a non-TSO slave case I bet you'll see a marked
decrease in system utilization at least.
...
Note: 2.6.2 patch enables Zero Copy Transmit support by default while
registering the bond device... but, I continue to see better CPU
utilization (~50% improvement) when this patch was tested in various
scenarios(see data below). Test results are from a dual Itanium 2 1.4GHz
system but I saw a similar trend on a PIII Xeon 550MHz x 8 system.
bond-patch-2.6.1 and 82546EB Copper NICs with hardware acceleration
support,
Test results with TSO on and off
ETH1 ETH2 Throughput/Performance System Utilization
on off 1870 35 %
on on 1870 35 %
off off 1871 34 %
bond-patch-2.6.2 and 82546EB Copper NICs with hardware acceleration
support,
Test results with TSO on and off
ETH1 ETH2 Throughput/Performance System Utilization
on off 1879 14 %
on on 1880 14.08 %
off off 1879 13 %
------------------------------------------------------------------
bond-patch-2.6.1 and 82543GC NICs without hardware support for TSO
ETH1 ETH2 Throughput/Performance System Utilization
off off 1815 36.85 %
bond-patch-2.6.2 and 82543GC NICs without hardware support for TSO
ETH1 ETH2 Throughput/Performance System Utilization
off off 1848 11.85 %
------------------------------------------------------------------
bond-patch-2.6.1, and one 82546EB NIC with & one 82543GC NIC without
hardware support for TSO
ETH1 ETH2 Throughput/Performance System Utilization
on off 1817 30.69 %
bond-patch-2.6.2, and one 82546EB NIC with & one 82543GC NIC without
hardware support for TSO
ETH1 ETH2 Throughput/Performance System Utilization
on off 1836 24.54 %
------------------------------------------------------------------
bond-patch-2.6.1 and 82543GC NICs
Test results with all HW acceleration bits on for 1 NIC and turned off
for the other NIC
ETH1 ETH2 Throughput/Performance System Utilization
on off 1851 35 %
bond-patch-2.6.2 and 82543GC NICs
Test results with all HW acceleration bits on for 1 NIC and turned off
for the other NIC
ETH1 ETH2 Throughput/Performance System Utilization
on off 1822 17 %
------------------------------------------------------------------
bond-patch-2.6.1 and 82543GC NICs
Test results with all HW acceleration bits turned off for both NICs
ETH1 ETH2 Throughput/Performance System Utilization
off off 1816 35 %
bond-patch-2.6.2 and 82543GC NICs
Test results with all HW acceleration bits turned off for both NICs
ETH1 ETH2 Throughput/Performance System Utilization
off off 1604 16.23 %
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2004-11-18 19:28 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-11-12 21:31 [Bonding-devel][PATCH]Zero Copy Transmit Support (Update) Godse, Radheka
2004-11-12 21:20 ` David S. Miller
2004-11-12 21:56 ` Jay Vosburgh
2004-11-12 21:49 ` David S. Miller
2004-11-12 22:20 ` Jay Vosburgh
2004-11-12 22:24 ` David S. Miller
-- strict thread matches above, loose matches on Subject: below --
2004-11-12 23:51 Radheka Godse
2004-11-12 0:00 ` David S. Miller
2004-11-18 19:02 Godse, Radheka
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox