Netdev List
* Re: [XFRM]: xfrm_algo_clone() allocates too much memory
From: Eric Dumazet @ 2008-01-09  7:51 UTC (permalink / raw)
  To: David Miller; +Cc: herbert, netdev
In-Reply-To: <20080108.234012.181072357.davem@davemloft.net>

David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Wed, 09 Jan 2008 08:29:11 +0100
> 
> Thanks for catching this.
> 
> Applied to net-2.6
> 
>> +static inline int xfrm_alg_len(struct xfrm_algo *alg)
>> +{
>> +	return sizeof(*alg) + ((alg->alg_key_len + 7) / 8);
>> +}
> 
> That gets emitted as a divide doesn't it :-))))
> 
> 
Yes, I have a patch for these divides, but it will only apply to 2.6.25 once
this one lands there.  (It saves 192 bytes of kernel text, BTW.)
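
For readers following along, here is a minimal userspace sketch of the sizing rule in xfrm_alg_len(): the key length is counted in bits, so the copy must be sized from the bit count rounded up to whole bytes. The struct and function names below are hypothetical stand-ins, not the kernel's; the thread does not quote the old xfrm_algo_clone() code, so this only illustrates the corrected length computation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct xfrm_algo. */
struct algo {
	char name[64];
	unsigned int key_len;	/* key length in bits, not bytes */
	char key[];		/* key material follows the header */
};

/* Header plus key, with the bit count rounded up to whole bytes. */
static size_t algo_len(const struct algo *alg)
{
	return sizeof(*alg) + (alg->key_len + 7) / 8;
}

/* Copy exactly algo_len() bytes; returns NULL if allocation fails. */
static struct algo *algo_clone(const struct algo *orig)
{
	struct algo *copy = malloc(algo_len(orig));

	if (copy)
		memcpy(copy, orig, algo_len(orig));
	return copy;
}
```

A 160-bit key therefore needs 20 bytes after the header, and a 161-bit key needs 21.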


^ permalink raw reply

* Re: [PATCH] [IPv6]: IPV6_MULTICAST_IF setting is ignored on link-local connect()
From: David Miller @ 2008-01-09  7:53 UTC (permalink / raw)
  To: dlstevens; +Cc: brian.haley, netdev, netdev-owner, yoshfuji
In-Reply-To: <OF3393858D.9874E03A-ON882573CA.00071BC8-882573CA.000738ED@us.ibm.com>

From: David Stevens <dlstevens@us.ibm.com>
Date: Mon, 7 Jan 2008 17:18:56 -0800

> Acked-by: David L Stevens <dlstevens@us.ibm.com>

Patch applied, thanks everyone.

^ permalink raw reply

* Re: [XFRM]: xfrm_algo_clone() allocates too much memory
From: David Miller @ 2008-01-09  7:53 UTC (permalink / raw)
  To: dada1; +Cc: herbert, netdev
In-Reply-To: <47847D0B.3050003@cosmosbay.com>

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 09 Jan 2008 08:51:39 +0100

> Yes, I have a patch for these divides, but it will only apply to 2.6.25 once
> this one lands there.  (It saves 192 bytes of kernel text, BTW.)

I never doubted you for a second.


^ permalink raw reply

* Re: [VLAN]: Avoid expensive divides
From: David Miller @ 2008-01-09  7:54 UTC (permalink / raw)
  To: dada1; +Cc: netdev
In-Reply-To: <47835E74.4040900@cosmosbay.com>

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Tue, 08 Jan 2008 12:28:52 +0100

> We can avoid divides (as seen with CONFIG_CC_OPTIMIZE_FOR_SIZE=y on x86)
> by changing the id parameter of vlan_group_get_device() and
> vlan_group_set_device() from signed to unsigned.
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

Applied, thanks Eric.
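
As a rough illustration of why the signedness matters, the sketch below splits an id into a sub-array index and an offset. The names and the part length of 8 are assumptions made for the example; the real constant and accessors are not shown in this thread.

```c
/* Hypothetical power-of-two part length, chosen for illustration. */
#define PART_LEN 8

struct slot {
	int part;	/* which sub-array */
	int offset;	/* index inside that sub-array */
};

/* Signed id: / and % truncate toward zero for negative values, so the
 * compiler cannot replace them with a bare shift and mask. */
static struct slot split_signed(int id)
{
	struct slot s = { id / PART_LEN, id % PART_LEN };
	return s;
}

/* Unsigned id: / and % by a power of two are exactly a shift and a mask. */
static struct slot split_unsigned(unsigned int id)
{
	struct slot s = { (int)(id / PART_LEN), (int)(id % PART_LEN) };
	return s;
}
```

For any valid (non-negative) id both variants give the same slot; only the generated code differs, and a compiler optimizing for size may pick a real divide instruction for the signed form.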

^ permalink raw reply

* Re: SACK scoreboard
From: David Miller @ 2008-01-09  7:58 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: lachlan.andrew, netdev, quetchen
In-Reply-To: <Pine.LNX.4.64.0801081258050.12911@kivilampi-30.cs.helsinki.fi>

From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Tue, 8 Jan 2008 14:12:47 +0200 (EET)

> If I'd hint my boss that I'm involved in something like this I'd
> bet that he also would get quite crazy... ;-) I'm partially paid
> for making TCP more RFCish :-), or at least that the places where
> things diverge are known and controllable for research purposes.

RFCs are great guides by which to implement things, but they have
often been completely wrong or not practical to follow strictly.

The handling of out of order ACKs with timestamps is my favorite
example.  Nobody performs an RFC compliant timestamp check on ACK
packets, or else their performance would go into the toilet during
packet reordering.

The URG bit setting is another one.

Especially when, practically speaking, we can in fact make changes,
as I believe we can here, I think we should.

^ permalink raw reply

* Re: [PATCH 0/3] bonding: 3 fixes for 2.6.24
From: Jay Vosburgh @ 2008-01-09  7:58 UTC (permalink / raw)
  To: Krzysztof Oledzki
  Cc: netdev, Jeff Garzik, David Miller, Andy Gospodarek, Herbert Xu
In-Reply-To: <Pine.LNX.4.64.0801090732490.1135@bizon.gios.gov.pl>

Krzysztof Oledzki <olel@ans.pl> wrote:

>Fine. Just letting you know that someone tests your patches and everything
>works, except for the mentioned problem.

	And I appreciate it; I just wanted to make sure our many fans
following along at home didn't misunderstand.

	Could you let me know if the patch below makes the lockdep
warning go away?  This applies on top of the previous three, although it
should be trivial to do by hand.

	I'm still checking to make sure this is safe with regard to
mutexing the bonding structures, but it would be good to know if it
eliminates the warning.

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 77d004d..1baaadc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3937,8 +3937,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	struct bonding *bond = bond_dev->priv;
 	struct dev_mc_list *dmi;
 
-	write_lock_bh(&bond->lock);
-
 	/*
 	 * Do promisc before checking multicast_mode
 	 */
@@ -3978,8 +3976,6 @@ static void bond_set_multicast_list(struct net_device *bond_dev)
 	/* save master's multicast list */
 	bond_mc_list_destroy(bond);
 	bond_mc_list_copy(bond_dev->mc_list, bond, GFP_ATOMIC);
-
-	write_unlock_bh(&bond->lock);
 }
 
 /*

^ permalink raw reply related

* Re: [PATCH][VLAN] Merge tree equal tails in vlan_skb_recv
From: Pavel Emelyanov @ 2008-01-09  8:15 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Linux Netdev List
In-Reply-To: <47591F9F.1020607@trash.net>

Hi, Patrick.

> Pavel Emelyanov wrote:
>> There are three paths in it, that set the skb->proto and then
>> perform common receive manipulations (basically call netif_rx()).
>>
>> I think, that we can make this code flow easier to understand
>> by introducing the vlan_set_encap_proto() function (I hope the 
>> name is good) to setup the skb proto and merge the paths calling 
>> netif_rx() together.
>>
>> Surprisingly, gcc detects this and merges these paths
>> by itself, so this patch doesn't make the vlan module smaller.
> 
> 
> I already have something similar queued, but your patch is a nice
> cleanup on top. I'll merge it into my tree and send it out after
> some testing, hopefully today.
> 

What are your plans about this patch? Should I resubmit this one?

Thanks,
Pavel

^ permalink raw reply

* Re: [PATCH net-2.6.25] [IPV4] Remove unsupported DNAT (RTCF_NAT and RTCF_NAT) in IPV4
From: David Miller @ 2008-01-09  8:19 UTC (permalink / raw)
  To: ramirose; +Cc: netdev
In-Reply-To: <eb3ff54b0801021331w63637142sd39b6605bd6f8be4@mail.gmail.com>

From: "Rami Rosen" <ramirose@gmail.com>
Date: Wed, 2 Jan 2008 23:31:37 +0200

> - The DNAT (Destination NAT) is not implemented in IPV4.
> 
> - This patch removes the code which checks these flags
> in net/ipv4/arp.c and net/ipv4/route.c.
> 
> The RTCF_NAT and RTCF_NAT should stay in the header (linux/in_route.h)
> because they are used in DECnet.
> 
> Signed-off-by: Rami Rosen <ramirose@gmail.com>

Applied, thanks Rami.

If someone has the stamina, it's very likely that the partial
NAT support in the decnet code can simply be removed.  If
successful, then the in_route.h macros can be removed.

^ permalink raw reply

* Re: [patch 5/9][NETNS][IPV6] make bindv6only sysctl per namespace
From: David Miller @ 2008-01-09  8:25 UTC (permalink / raw)
  To: dlezcano; +Cc: benjamin.thery, netdev, dada1
In-Reply-To: <47825513.50100@fr.ibm.com>

From: Daniel Lezcano <dlezcano@fr.ibm.com>
Date: Mon, 07 Jan 2008 17:36:35 +0100

> Thanks, Benjamin, for catching this.
> 
> I think I have to apologize to Eric: I thought I had tested with this
> option off, but I hadn't, and Eric was right. I will wait a little for
> feedback and send a V3.

I would have preferred if you just reposted a fixed version of patch 5
instead of waiting for the feedback that realistically isn't going to
come at all after this much time has passed.

I was about to apply this series, then looked through my mailbox for
reported regressions and hit this.

That makes me not start applying your patch series.  I want the whole
thing with known bug fixes available by the time I get around to
reviewing it.

Thanks.

^ permalink raw reply

* Re: make ipv6_sysctl_register to return a value
From: David Miller @ 2008-01-09  8:26 UTC (permalink / raw)
  To: dlezcano; +Cc: netdev
In-Reply-To: <47828DEB.9070301@fr.ibm.com>


Why did you post this again?  It's identical to patch 1/9
from the previous series you sent out.

This is confusing, since it makes me think you wanted me to
perhaps do something different or update an already
submitted patch.

^ permalink raw reply

* Re: [PATCH net-2.6.25 1/6][NET] Simple ctl_table to ctl_path conversions.
From: David Miller @ 2008-01-09  8:30 UTC (permalink / raw)
  To: xemul; +Cc: netdev, devel
In-Reply-To: <47839CD1.4080505@openvz.org>

From: Pavel Emelyanov <xemul@openvz.org>
Date: Tue, 08 Jan 2008 18:54:57 +0300

> This patch includes many places, that only required
> replacing the ctl_table-s with appropriate ctl_paths
> and call register_sysctl_paths().
> 
> Nothing special was done with them.
> 
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-2.6.25 2/6][IPVS] Switch to using ctl_paths.
From: David Miller @ 2008-01-09  8:30 UTC (permalink / raw)
  To: horms; +Cc: xemul, netdev, devel
In-Reply-To: <20080109024914.GG3403@verge.net.au>

From: Simon Horman <horms@verge.net.au>
Date: Wed, 9 Jan 2008 11:49:16 +0900

> On Tue, Jan 08, 2008 at 06:58:11PM +0300, Pavel Emelyanov wrote:
> > The feature of ipvs ctls is that the net/ipv4/vs path
> > is common for core ipvs ctls and for two schedulers,
> > so I make it exported and re-use it in modules.
> > 
> > Two other .c files required linux/sysctl.h to make the
> > extern declaration of this path compile well.
> > 
> > Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
> 
> Thanks, this looks good to me and I've confirmed that
> the same entries with the same permissions exist under
> /proc/sys/net/ipv4/vs before and after the change.
> 
> Acked-by: Simon Horman <horms@verge.net.au>

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH net-2.6.25 3/6][DECNET] Switch to using ctl_paths.
From: David Miller @ 2008-01-09  8:31 UTC (permalink / raw)
  To: xemul; +Cc: netdev, devel
In-Reply-To: <47839E88.2040604@openvz.org>

From: Pavel Emelyanov <xemul@openvz.org>
Date: Tue, 08 Jan 2008 19:02:16 +0300

> The decnet includes two places to patch. The first one is 
> the net/decnet table itself, and it is patched just like 
> other subsystems in the first patch in this series.
> 
> The second place is a bit more complex - it is the 
> net/decnet/conf/xxx entries, similar to those in 
> ipv4/devinet.c and ipv6/addrconf.c. This code is made similar 
> to that in ipv[46].
> 
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-2.6.25 4/6][AX25] Switch to using ctl_paths.
From: David Miller @ 2008-01-09  8:32 UTC (permalink / raw)
  To: xemul; +Cc: netdev, devel
In-Reply-To: <47839F3E.5080109@openvz.org>

From: Pavel Emelyanov <xemul@openvz.org>
Date: Tue, 08 Jan 2008 19:05:18 +0300

> This one is almost the same as the hunks in the
> first patch, but ax25 tables are created dynamically.
> 
> So this patch differs a bit to handle this case.
> 
> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

Applied.

^ permalink raw reply

* Re: [PATCH net-2.6.25 5/6][NETFILTER] Switch to using ctl_paths in nf_queue and conntrack modules
From: David Miller @ 2008-01-09  8:33 UTC (permalink / raw)
  To: kaber; +Cc: xemul, netdev, devel
In-Reply-To: <4783A092.1040605@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Tue, 08 Jan 2008 17:10:58 +0100

> Pavel Emelyanov wrote:
> > This includes the most simple cases for netfilter.
> > 
> > The first part is the queue modules for ipv4 and ipv6,
> > on which the net/ipv4/ and net/ipv6/ paths are reused 
> > from the appropriate ipv4 and ipv6 code.
> > 
> > The conntrack module is also patched, but this hunk is 
> > very small and simple.
> > 
> > Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
> 
> Looks good to me.

Applied.

^ permalink raw reply

* Re: [PATCH net-2.6.25 6/6][NETFILTER] Use the ctl paths instead of hand-made analogue
From: David Miller @ 2008-01-09  8:34 UTC (permalink / raw)
  To: kaber; +Cc: xemul, netdev, devel
In-Reply-To: <4783A12A.3060207@trash.net>

From: Patrick McHardy <kaber@trash.net>
Date: Tue, 08 Jan 2008 17:13:30 +0100

> Pavel Emelyanov wrote:
> > The conntracks subsystem has a similar infrastructure
> > to maintain ctl_paths, but since we already have it
> > on the generic level, I think it's OK to switch to
> > using it.
> > 
> > So, basically, this patch just replaces the ctl_table-s
> > with ctl_path-s, nf_register_sysctl_table with 
> > register_sysctl_paths() and removes no longer needed code.
> 
> Also looks good, thanks.

Also applied, thanks.


^ permalink raw reply

* Re: [PATCH net-2.6.25] [BRIDGE] Remove unused macros from ebt_vlan.c
From: David Miller @ 2008-01-09  8:35 UTC (permalink / raw)
  To: ramirose; +Cc: netdev
In-Reply-To: <eb3ff54b0801080638j48b51978j9beb11b310ef2b8@mail.gmail.com>

From: "Rami Rosen" <ramirose@gmail.com>
Date: Tue, 8 Jan 2008 16:38:15 +0200

> Remove two unused macros, INV_FLAG and SET_BITMASK
> from net/bridge/netfilter/ebt_vlan.c.
> 
> Signed-off-by: Rami Rosen <ramirose@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [RFC 1/2] [XFRM] remove ifdef crypto
From: David Miller @ 2008-01-09  8:36 UTC (permalink / raw)
  To: sebastian; +Cc: netdev
In-Reply-To: <E1JCLys-0000RU-2G@Chamillionaire.breakpoint.cc>

From: Sebastian Siewior <sebastian@breakpoint.cc>
Date: Sat, 8 Jan 2008 22:26:37 +0100

> and select the crypto subsystem if necessary
> 
> Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/3] drivers/net/ipg.c: Fix skbuff leak
From: David Miller @ 2008-01-09  8:39 UTC (permalink / raw)
  To: linux; +Cc: romieu, akpm, netdev
In-Reply-To: <20080109003840.22917.qmail@science.horizon.com>

From: linux@horizon.com
Date: 8 Jan 2008 19:38:40 -0500

> That doesn't seem to do it.  Not entirely, at least.  After downloading
> and partially re-uploading an 800M file, slabtop reports:

Ok, I'll let you and Francois work out how to fix this for
good.

Please submit just the outright leak bug fixes once this is
all resolved.  All of that code cleanup stuff needs to wait
until later, let's fix bugs before adding new ones. :-)

Thanks.

^ permalink raw reply

* Re: 2.6.24-rc6-mm1
From: Jarek Poplawski @ 2008-01-09  9:04 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: mingo, akpm, just.for.lkml, tomof, herbert, linux-kernel, neilb,
	bfields, netdev, tom
In-Reply-To: <20080109085753O.fujita.tomonori@lab.ntt.co.jp>

On Wed, Jan 09, 2008 at 08:57:53AM +0900, FUJITA Tomonori wrote:
...
> diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
> new file mode 100644
> index 0000000..495575a
> --- /dev/null
> +++ b/lib/iommu-helper.c
> @@ -0,0 +1,80 @@
> +/*
> + * IOMMU helper functions for the free area management
> + */
> +
> +#include <linux/module.h>
> +#include <linux/bitops.h>
> +
> +static unsigned long find_next_zero_area(unsigned long *map,
> +					 unsigned long size,
> +					 unsigned long start,
> +					 unsigned int nr,
> +					 unsigned long align_mask)
> +{
> +	unsigned long index, end, i;
> +again:
> +	index = find_next_zero_bit(map, size, start);
> +
> +	/* Align allocation */
> +	index = (index + align_mask) & ~align_mask;
> +
> +	end = index + nr;
> +	if (end >= size)
> +		return -1;

This '>=' looks doubtful to me, e.g.:
map points to 0s only,  size = 64, nr = 64,
we get: index = 0; end = 64;
and: return -1 ?!
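
The boundary case above can be checked in isolation. This is only a sketch of the range test, assuming `end` is an exclusive bound (index + nr is the first bit past the area); the rest of find_next_zero_area() is not quoted here, and the helper names are made up for the example.

```c
#include <stdbool.h>

/* The area [index, index + nr) fits in a map of `size` bits as long as
 * its exclusive end does not go past size. */
static bool area_fits(unsigned long index, unsigned long nr,
		      unsigned long size)
{
	return index + nr <= size;
}

/* The check as posted: rejects an area that ends exactly at size. */
static bool area_fits_as_posted(unsigned long index, unsigned long nr,
				unsigned long size)
{
	return !(index + nr >= size);
}
```

With size = 64 and nr = 64 at index 0, the first test accepts the request while the posted one refuses it, which is the case questioned above.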

Regards,
Jarek P.

^ permalink raw reply

* Bonding : Monitoring of 4965 wireless card
From: patnel972-linux @ 2008-01-09  9:00 UTC (permalink / raw)
  To: netdev

Hi,

I want to make a bond with my wireless card. The ipw driver creates two
interfaces (wlan0 and wmaster0). When I flip the rf_kill switch, ifplug
detects wlan0 as unplugged, but not wmaster0. If I bring wlan0 down (while
rf_kill is active), bonding detects the inactivity when I bring the
interface up.

Do you have any idea where the problem is: the driver, or the miimon of
the bonding module?

My module parameters: mode=1 miimon=100 primary eth0

Thanks




^ permalink raw reply

* Re: [Patch 2.6.22.2 ] : drivers/net/via-rhine.c:   Offload checksum handling to VT6105M
From: Roger Luethi @ 2008-01-09  9:13 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: K Naru, linux-kernel, netdev
In-Reply-To: <20071230223354.GA28568@1wt.eu>

[top posting because context may be missing otherwise, over a week later]

Excellent analysis, Willy. Quite frankly, I am not keen on making this
driver any more complex, especially if the gains are marginal at best. VIA
Rhine will never be high-performance hardware, and we have too much special
casing already.

Patches to fix actual problems (such as the recent irq init work by Dave
Jones) are much more interesting to me (and presumably to most via-rhine
users).

Roger

On Sun, 30 Dec 2007 23:33:54 +0100, Willy Tarreau wrote:
> Hi Kim,
> 
> On Fri, Aug 17, 2007 at 11:34:37AM -0700, K Naru wrote:
> > From: Kim Naru (squat_rack@yahoo.com)
> > 
> > Added support to offload TCP/UDP/IP checksum to the
> > VIA Technologies VT6105M chip.
> > Firstly, let the stack know this chip is capable of
> > doing its own checksum(IPV4 only).
> > Secondly offload checksum to VT6105M, if necessary. 
> > 
> > 
> > Verbose Mode:
> > 
> > #1. Define 3 bits(18,19,20) in Transmit Descriptor 1
> > of chip, which affect checksum processing.
> > The prefix(TDES1) for the 3 variables is the short
> > name for Transmit Descriptor 1.
> > #2. In rhine_init_one(), if pci_rev >=  VT6105M then
> > set  NETIF_F_IP_CSUM(see skbuff.h for details).
> > #3. In rhine_start_tx() if NETIF_F_IP_CSUM is set AND
> > the stack requires a checksum then
> > set either bit 19(UDP),20(TCP) AND bit 18(IP).
> > 
> > Note : The numbered items above(i.e.#1,#2,#3) denote
> > pseudo code.
> > 
> > This patch was developed and tested on Imedia
> > linux-2.6.20 under a PC-Engines Alix System board
> > (www.pcengines.ch/alix.htm). It was tested(compilation
> > only) on linux-2.6.22.2. The minor code change between
> > 2.6.20 and 2.6.22 is the use of ip_hdr() in 2.6.22.
> > 
> > In 2.6.20 :
> >                 struct iphdr *ip = skb->nh.iph;
> > In 2.6.22 :
> >                 const struct iphdr *ip = ip_hdr(skb);
> > 
> > Testing:
> > 
> > 
> > ttcp, netperf, ftp and top were used. There appears to
> > be a small CPU utilization gain. Throughput results 
> > were more inconclusive.
> > 
> > The data sheet used to get information is 'VT6105M
> > Data Sheet, Revision 1.63  June21,2006'.
> > 
> > Signed-off-by: Kim Naru (squat_rack@yahoo.com)
> > 
> > ---
> 
> Well, I've reformatted your patch so that it can be applied, and very
> slightly arranged it in order to save 13 bytes of code and a few CPU
> cycles.
> 
> Also, I moved the if block before the spinlock as there is no reason
> for this code to be run with the lock held.
> 
> I have run some performance measurements on an ALIX 3C2 motherboard
> with a 2.6.22-stable kernel. What I see is a reduction of CPU usage
> by about 20% when the network is saturated, but also a reduction of
> the network speed by 8%!
> 
> Without the patch, I can produce a continuous traffic of about 99 Mbps with
> about 11% CPU (system only, 89% idle).
> 
> With the patch, the traffic drops to 91 Mbps but CPU usage decreases to 9%.
> 
> Now, if I reduce the MTU to exactly 1000, then the traffic increases to about
> 98 Mbps, but it progressively reduces when the MTU moves away from 1000.
> 
> So I have run some deeper tests consisting of leaving NETIF_F_IP_CSUM unset
> and still asking the NIC to compute the checksums. The conclusion is very
> clear: as soon as *any* checksum bit is set (IP, TCP, UDP), the traffic
> immediately drops.
> 
> I think that what happens is that the NIC is not pipelined at all and that
> no data is transferred while a checksum is being computed. This would also
> explain why reducing the MTU increases performance, since it reduces the
> time required to compute a checksum, reducing the off time. And the more I
> think about it, the more I think this is the problem, because the VT6105M
> has a 2kB transmit buffer, so it cannot checksum a 1.5kB frame while sending
> another one if it does it inside the buffer.
> 
> And I'm pretty sure that the checksum is computed in the buffer and that the
> data is not transferred twice on the bus, because playing with PCI latency
> timer and other parameters does not change anything.
> 
> So basically, we're there with a chip which can offload the CPU by performing
> the checksums itself, but it reduces performance for packets larger than 1kB
> (or possibly 500 bytes if there's a 1.5kB packet being transferred).
> 
> The driver should be adjusted to permit the user to enable and disable this
> feature with ethtool. Right now, its status can only be consulted, and I'm
> using dd on /dev/mem and /dev/kmem to change the values on the fly.
> 
> Given the fact that a 20% reduction on CPU usage which was already 10% only
> leaves a net gain of about 2% more CPU available, I'm not convinced that there
> is any advantage in enabling this feature by default with this NIC.
> 
> Here's the updated patch for reference (maybe you'd want to enhance it).
> 
> --- linux-2.6.22-wt3/drivers/net/via-rhine.c	2007-11-22 17:48:34 +0100
> +++ linux-2.6.22-wt3.via-cksum/drivers/net/via-rhine.c	2007-12-30 20:53:30 +0100
> @@ -95,6 +95,8 @@
>  #include <linux/netdevice.h>
>  #include <linux/etherdevice.h>
>  #include <linux/skbuff.h>
> +#include <linux/in.h>
> +#include <linux/ip.h>
>  #include <linux/init.h>
>  #include <linux/delay.h>
>  #include <linux/mii.h>
> @@ -343,6 +345,9 @@
>  
>  /* Initial value for tx_desc.desc_length, Buffer size goes to bits 0-10 */
>  #define TXDESC		0x00e08000
> +#define TDES1_TCPCK	0x00100000  /* Bit 20, Transmit Desc 1  */
> +#define TDES1_UDPCK	0x00080000  /* Bit 19, Transmit Desc 1  */
> +#define TDES1_IPCK	0x00040000  /* Bit 18, Transmit Desc 1  */
>  
>  enum rx_status_bits {
>  	RxOK=0x8000, RxWholePkt=0x0300, RxErr=0x008F
> @@ -788,6 +793,9 @@
>  	if (rp->quirks & rqRhineI)
>  		dev->features |= NETIF_F_SG|NETIF_F_HW_CSUM;
>  
> +	if (pci_rev >=  VT6105M)
> +		dev->features |= NETIF_F_IP_CSUM;   /* chip does checksum */
> +
>  	/* dev->name not defined before register_netdev()! */
>  	rc = register_netdev(dev);
>  	if (rc)
> @@ -1260,6 +1268,18 @@
>  	rp->tx_ring[entry].desc_length =
>  		cpu_to_le32(TXDESC | (skb->len >= ETH_ZLEN ? skb->len : ETH_ZLEN));
>  
> +	if ((dev->features & NETIF_F_IP_CSUM) &&
> +	    (skb->ip_summed == CHECKSUM_PARTIAL)) {
> +		/* Offload checksum to chip. */
> +		const struct iphdr *ip = ip_hdr(skb);
> +		unsigned long flag;
> +
> +		flag = (ip->protocol == IPPROTO_TCP) ? TDES1_TCPCK|TDES1_IPCK :
> +		       (ip->protocol == IPPROTO_UDP) ? TDES1_UDPCK|TDES1_IPCK :
> +		       TDES1_IPCK;
> +		rp->tx_ring[entry].desc_length |= flag;
> +	}
> +
>  	/* lock eth irq */
>  	spin_lock_irq(&rp->lock);
>  	wmb();
> 
> Best regards,
> Willy
> 
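
For reference, the flag selection in the quoted rhine_start_tx() hunk can be exercised on its own. The sketch below reuses the TDES1_* values from the patch; the function name is hypothetical, and it runs in userspace with the protocol numbers from <netinet/in.h> rather than on a real descriptor.

```c
#include <stdint.h>
#include <netinet/in.h>	/* IPPROTO_TCP, IPPROTO_UDP */

#define TDES1_TCPCK	0x00100000u	/* bit 20 */
#define TDES1_UDPCK	0x00080000u	/* bit 19 */
#define TDES1_IPCK	0x00040000u	/* bit 18 */

/* Pick the checksum-offload bits for one IPv4 packet: the IP bit is
 * always set, plus the TCP or UDP bit when the payload is one of those. */
static uint32_t tx_csum_flags(uint8_t protocol)
{
	switch (protocol) {
	case IPPROTO_TCP:
		return TDES1_TCPCK | TDES1_IPCK;
	case IPPROTO_UDP:
		return TDES1_UDPCK | TDES1_IPCK;
	default:
		return TDES1_IPCK;
	}
}
```

The result is what the hunk ORs into desc_length; whether setting these bits is worthwhile on this hardware is a separate question, given the measurements above.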

-- 
If you make people think they're thinking, they'll love you;
but if you really make them think they'll hate you.

^ permalink raw reply

* Re: SACK scoreboard
From: Evgeniy Polyakov @ 2008-01-09  9:47 UTC (permalink / raw)
  To: Andi Kleen
  Cc: David Miller, jheffner, ilpo.jarvinen, lachlan.andrew, netdev,
	quetchen
In-Reply-To: <20080109070318.GA8581@one.firstfloor.org>

Hi.

On Wed, Jan 09, 2008 at 08:03:18AM +0100, Andi Kleen (andi@firstfloor.org) wrote:
> > It adds severe spikes in CPU utilization that, at even moderate
> > line rates, begin to affect RTTs.
> > 
> > Or do you think it's OK to process 500,000 SKBs while locked
> > in a software interrupt?
> 
> You can always push it into a work queue.  Even put it to
> other cores if you want. 
> 
> In fact this is already done partly for the ->completion_queue.
> Wouldn't be a big change to queue it another level down.
> 
> Also even freeing a lot of objects doesn't have to be
> that expensive. I suspect the most cost is in taking
> the slab locks, but that could be batched. Without
> that the kmem_free fast path isn't particularly
> expensive, as long as the headers are still in cache.

Postponing freeing of the skb has major drawbacks. Some time ago I
made a patch to postpone skb freeing behind RCU and got a 2.5 times slower
connection speed on some machines, although with decreased CPU usage.
So a queueing solution has to be proven with real data: although it
looks good in one situation, it can be really bad in another.

For the interested reader: results of the RCUfication of kfree_skbmem():
http://tservice.net.ru/~s0mbre/blog/devel/networking/2006/12/05

-- 
	Evgeniy Polyakov

^ permalink raw reply

* Re: [NET] ROUTE: fix rcu_dereference() uses in /proc/net/rt_cache
From: Herbert Xu @ 2008-01-09  9:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Paul E. McKenney, davem, dipankar, netdev
In-Reply-To: <47847A10.1020508@cosmosbay.com>

On Wed, Jan 09, 2008 at 08:38:56AM +0100, Eric Dumazet wrote:
> 
> I am not sure this is valid, since it will do this :
> 
> r = rt_hash_table[st->bucket].chain;
> if (r)
>     return rcu_dereference(r);
> 
> So the compiler might be dumb enough to dereference 
> &rt_hash_table[st->bucket].chain two times.

That wouldn't be a problem at all.  The key is to add a barrier between
reading the pointer:

	r = rt_hash_table[st->bucket].chain

and dereferencing it later, e.g.,

	r->u.dst.rt_next

The barrier is there so that when we dereference r we don't read
stale cache that was there before the memory at r was initialised.
How many times you read the pointer value before the barrier is
irrelevant to the effectiveness of the barrier preceding the
dereference.
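
A userspace sketch of that ordering, using C11 atomics as a stand-in: the writer initialises the node and then publishes the pointer, and the reader places a barrier between loading the pointer and dereferencing it. The names are hypothetical, and the acquire fence used here is stronger than the dependency barrier the kernel needs; it is only meant to show where the barrier sits.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int value;
	struct node *next;
};

/* Head of a hypothetical hash chain, shared between writer and readers. */
static _Atomic(struct node *) chain;

/* Writer: initialise the node fully, then publish the pointer. */
static void publish(struct node *n, int value)
{
	n->value = value;
	n->next = NULL;
	atomic_store_explicit(&chain, n, memory_order_release);
}

/* Reader: how the pointer is loaded does not matter for ordering; what
 * matters is the barrier between the load and the dereference. */
static int read_first(int fallback)
{
	struct node *r = atomic_load_explicit(&chain, memory_order_relaxed);

	if (!r)
		return fallback;
	atomic_thread_fence(memory_order_acquire);	/* barrier before use */
	return r->value;
}
```

Loading the pointer a second time before the fence would not weaken this; only a dereference placed ahead of the barrier could observe uninitialised contents.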

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: make ipv6_sysctl_register to return a value
From: Daniel Lezcano @ 2008-01-09  9:56 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20080109.002632.136758957.davem@davemloft.net>

David Miller wrote:
> Why did you post this again?  It's identical to patch 1/9
> from the previous series you sent out.
> 
> This is confusing, since it makes me think you wanted me to
> perhaps do something different or update an already
> submitted patch.

I thought you were expecting a new netns ipv6 sysctl series and had ignored 
the current one. While I was looking at them again I found a problem and 
I wanted to fix that before resending. In the meantime, there were three 
independent trivial patches I wanted to send as a prerequisite for the 
incoming sysctl patches. Bad idea ... :)

  - make ipv6_sysctl_register to return a value
  - make a subsystem for af_inet6
  - add ipv6 structure for netns

With Pavel's sysctl patch, the first one does not apply.

To clear up any confusion, could you please just ignore all my previous 
patches? I will resend a new series rebased on the work done by Pavel.

Sorry for the inconvenience.

Thanks.

   -- Daniel


^ permalink raw reply

