Netdev List
* Re: [net-next-2.6 PATCH] igb: add support for reporting 5GT/s during probe on PCIe Gen2
From: David Miller @ 2010-04-27 19:55 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, alexander.h.duyck
In-Reply-To: <20100427110238.23921.24825.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 04:02:40 -0700

> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> This change corrects the fact that we were not reporting Gen2 link speeds
> when we were in fact connected at Gen2 rates.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [net-next-2.6 PATCH 2/2] igbvf: double increment nr_frags
From: David Miller @ 2010-04-27 19:55 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sanagi.koki
In-Reply-To: <20100427110137.23872.8779.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 04:01:39 -0700

> From: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> 
> There is no need to increment nr_frags because skb_fill_page_desc increments
> it.
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.
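
The point of the patch description can be sketched in user space. This is only an illustration under stated assumptions: `mock_shinfo`, `mock_fill_page_desc()` and `append_frag()` are hypothetical stand-ins, not the kernel's structures or the driver's code. The one property carried over is that the fill helper itself sets nr_frags to the slot index plus one, so the caller must not bump the counter again.

```c
#include <stddef.h>

#define MOCK_MAX_FRAGS 17	/* arbitrary size for this sketch */

/* Hypothetical stand-ins for the fragment bookkeeping of an skb. */
struct mock_frag {
	void *page;
	int offset;
	int size;
};

struct mock_shinfo {
	int nr_frags;
	struct mock_frag frags[MOCK_MAX_FRAGS];
};

/* Fills slot i and updates the fragment count itself. */
static void mock_fill_page_desc(struct mock_shinfo *sh, int i,
				void *page, int offset, int size)
{
	sh->frags[i].page = page;
	sh->frags[i].offset = offset;
	sh->frags[i].size = size;
	sh->nr_frags = i + 1;
}

/* Caller after the fix: no separate increment of nr_frags. */
static int append_frag(struct mock_shinfo *sh, void *page, int size)
{
	mock_fill_page_desc(sh, sh->nr_frags, page, 0, size);
	return sh->nr_frags;
}
```

Calling append_frag() twice on a zeroed mock_shinfo leaves the count at two, with the second page in slot 1.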

^ permalink raw reply

* Re: [net-next-2.6 PATCH 1/2] igb: double increment nr_frags
From: David Miller @ 2010-04-27 19:55 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, sanagi.koki
In-Reply-To: <20100427110107.23872.86247.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 04:01:19 -0700

> From: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> 
> There is no need to increment nr_frags because skb_fill_page_desc increments
> it.
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [net-next-2.6 PATCH] ixgb: Use pr_<level> and netdev_<level>
From: David Miller @ 2010-04-27 19:55 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, joe
In-Reply-To: <20100427104952.23637.6317.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 03:50:58 -0700

> From: Joe Perches <joe@perches.com>
> 
> Convert DEBUGOUTx to pr_debug
> Convert DEBUGFUNC to more commonly used ENTER
> Convert mac address output to %pM
> Use #define pr_fmt
> Convert a few printks to pr_<level>
> Improve ixgb_mc_addr_list_update: use a temporary for current mc address
> Use etherdevice.h functions for mac address testing
> 
> Signed-off-by: Joe Perches <joe@perches.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: fix a lockdep rcu warning in __sk_dst_set()
From: David Miller @ 2010-04-27 19:55 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1272350443.4861.9.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 27 Apr 2010 08:40:43 +0200

> __sk_dst_set() might be called while no state can be integrated in a
> rcu_dereference_check() condition.
> 
> So use rcu_dereference_raw() to shut up lockdep warnings (if
> CONFIG_PROVE_RCU is set)
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH v4] TCP: avoid to send keepalive probes if receiving data
From: David Miller @ 2010-04-27 19:55 UTC (permalink / raw)
  To: ilpo.jarvinen; +Cc: fleitner, netdev, eric.dumazet
In-Reply-To: <alpine.DEB.2.00.1004270906320.13989@melkinpaasi.cs.helsinki.fi>

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Tue, 27 Apr 2010 09:08:08 +0300 (EEST)

> On Tue, 27 Apr 2010, Flavio Leitner wrote:
> 
>> RFC 1122 says the following:
>> ...
>>   Keep-alive packets MUST only be sent when no data or
>>   acknowledgement packets have been received for the
>>   connection within an interval.
>> ...
>> 
>> The acknowledgement packet is resetting the keepalive
>> timer but the data packet isn't. This patch fixes it by
>> checking the timestamp of the last received data packet
>> too when the keepalive timer expires.
>> 
>> Signed-off-by: Flavio Leitner <fleitner@redhat.com>
 ...
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
 ...
> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

Applied, thanks everyone.
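
The check described in the patch text can be sketched as follows. This is a user-space illustration, not the applied patch: `ka_state`, `ka_idle_time()` and `ka_should_probe()` are hypothetical names, and timestamps are plain tick counters. The idea is that the connection counts as idle only for the shorter of "time since last ACK" and "time since last data packet".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection timestamps, in arbitrary ticks. */
struct ka_state {
	uint32_t last_ack_rx;	/* when the last ACK was received */
	uint32_t last_data_rx;	/* when the last data packet was received */
};

/* Idle time is measured from whichever packet arrived most recently.
 * Unsigned subtraction keeps this correct across counter wrap-around. */
static uint32_t ka_idle_time(const struct ka_state *s, uint32_t now)
{
	uint32_t since_ack = now - s->last_ack_rx;
	uint32_t since_data = now - s->last_data_rx;

	return since_ack < since_data ? since_ack : since_data;
}

/* Send a probe only if nothing at all arrived within the interval. */
static bool ka_should_probe(const struct ka_state *s, uint32_t now,
			    uint32_t keepalive_time)
{
	return ka_idle_time(s, now) >= keepalive_time;
}
```

With the last ACK at tick 0, the last data packet at tick 90 and an interval of 50 ticks, a timer firing at tick 100 sees an idle time of 10 and sends no probe. A check based on the ACK timestamp alone would have probed.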

^ permalink raw reply

* Re: [PATCH net-next] bridge: multicast router list manipulation
From: David Miller @ 2010-04-27 19:54 UTC (permalink / raw)
  To: shemminger; +Cc: herbert, netdev
In-Reply-To: <20100427101311.2f445227@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 27 Apr 2010 10:13:11 -0700

> I prefer that the hlist be only accessed through the hlist macro
> objects. Explicit twiddling of links (especially with RCU) exposes
> the code to future bugs.
> 
> Compile tested only.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Yes this by-hand stuff was beyond awful, I'm sorry I didn't catch it
when pulling in these changes initially :-)

Applied, thanks Stephen.
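
Why helpers are preferable to by-hand link updates can be seen in a small user-space sketch. It is modelled on the shape of the kernel's hlist (a `next` pointer plus a `pprev` back-pointer), but the `hl_` names are hypothetical and the RCU variants are deliberately left out. Each helper keeps both pointers consistent in one place, which is exactly what open-coded link twiddling tends to get wrong.

```c
#include <stddef.h>

/* Simplified, non-RCU model of a hash-list node and head. */
struct hl_node {
	struct hl_node *next;
	struct hl_node **pprev;	/* address of the pointer that points at us */
};

struct hl_head {
	struct hl_node *first;
};

/* Insert n at the front, fixing up the old first node's back-pointer. */
static void hl_add_head(struct hl_node *n, struct hl_head *h)
{
	struct hl_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n; works the same for the first node and for inner nodes. */
static void hl_del(struct hl_node *n)
{
	struct hl_node *next = n->next;
	struct hl_node **pprev = n->pprev;

	*pprev = next;
	if (next)
		next->pprev = pprev;
}
```

Callers only ever use hl_add_head() and hl_del(), so a later change to the node layout or to the memory-ordering rules has a single place to be made.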

^ permalink raw reply

* Re: [PATCH net-next] bridge: use is_multicast_ether_addr
From: David Miller @ 2010-04-27 19:53 UTC (permalink / raw)
  To: shemminger; +Cc: herbert, netdev
In-Reply-To: <20100427101306.3a49104f@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 27 Apr 2010 10:13:06 -0700

> Use existing inline function.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied, thanks.
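
For readers unfamiliar with the test behind the inline function: an Ethernet address is a group (multicast) address when the least significant bit of its first octet is set, and broadcast is a special case of that. A user-space sketch, with the hypothetical name `mac_is_multicast()` standing in for the kernel helper:

```c
#include <stdbool.h>
#include <stdint.h>

/* Multicast (including broadcast) addresses have the I/G bit set,
 * i.e. the lowest bit of the first octet. */
static bool mac_is_multicast(const uint8_t addr[6])
{
	return (addr[0] & 0x01) != 0;
}
```

An address such as 01:00:5e:00:00:01 matches, while a unicast address such as 00:0e:0c:bc:43:43 does not.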

^ permalink raw reply

* Re: [PATCH net-next-2.6] net: fix a lockdep rcu warning in __sk_dst_set()
From: David Miller @ 2010-04-27 19:42 UTC (permalink / raw)
  To: paulmck; +Cc: eric.dumazet, netdev
In-Reply-To: <20100427161716.GB2424@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Date: Tue, 27 Apr 2010 09:17:16 -0700

> On Tue, Apr 27, 2010 at 08:40:43AM +0200, Eric Dumazet wrote:
>> __sk_dst_set() might be called while no state can be integrated in a
>> rcu_dereference_check() condition.
>> 
>> So use rcu_dereference_raw() to shut up lockdep warnings (if
>> CONFIG_PROVE_RCU is set)
> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I've applied this to net-next-2.6, thanks!

^ permalink raw reply

* Re: [net-next-2.6 PATCH 1/2] Add ndo_set_vf_port_profile
From: Arnd Bergmann @ 2010-04-27 19:38 UTC (permalink / raw)
  To: Anirban Chakraborty
  Cc: Scott Feldman, Rose, Gregory V, David Miller,
	netdev@vger.kernel.org, chrisw@redhat.com, Williams, Mitch A
In-Reply-To: <8966E338-1C9C-43D9-B6A3-A44349E7EE18@qlogic.com>

On Tuesday 27 April 2010 19:33:04 Anirban Chakraborty wrote:
> On Apr 27, 2010, at 5:35 AM, Arnd Bergmann wrote:
> > Anything that ties port profiles to VFs seems fundamentally flawed AFAICT,
> > at least when we want to extend this to adapters that don't do it in firmware.
> 
> Correct me if I am wrong. Shouldn't the port profile be tied to the physical NICs which are essentially
> PCI functions (be it PF or VF)? I'd think that a port profile would have configuration settings for all the
> physical NICs (PF/VF) of a specific physical port of the adapter. I liked the idea of querying the device
> for number of VFs as it will cover both SR-IOV and non SR-IOV PCI functions.

Yes, the port profile association is tied to whoever owns the link to the switch.
That can be a regular NIC, an SR-IOV PF, an ethernet bonding device or an S-component
implementing provider S-VLANs on top of any of these.

Usually it will be the same as a physical link, but in case of bonding it is two
physical links while in case of S-VLAN, you have multiple instances that each
have their own set of port profile association. If S-VLAN is implemented by
the NIC, that may be a VF.

Querying a PF for the number of VFs attached to it is a useful thing, but this
is independent of port profiles. Consider this (artificially complex) setup:

- eth0 is the PF of an SR-IOV NIC
- eth1 is a regular single-channel NIC
- vf0 is a VF of eth0, used by a guest using PCI passthrough mode on S-VLAN 2
- vf1 is a VF of eth0 owned by the host on S-VLAN 3
- vf1.23 is a VLAN port for VLAN 23 in S-VLAN 3
- br0 is a bridge connected to vf1
- br23 is a bridge VLAN device for br0
- vf2 is a VF of eth0 owned by the host on S-VLAN 4
- eth1.5 is a software vlan device for S-VLAN 4
- bond0 combines eth1.5 and vf2
- bond0.24 is a VLAN port for VLAN24 on bond0
- tap0 is a guest connected to br0 in trunk mode
- tap1 is a guest connected to br23 in access mode
- macvtap0 is a VEPA mode guest on bond0
- macvtap1 is a private mode guest on bond0.24

This means you have a total of five guests running, on vf0, tap0, tap1,
macvtap0 and macvtap1. Querying the number of VFs on eth0 will return '2',
for vf0 and vf1. What you are interested in however is which guests are
associated. Querying every single interface in the system will tell you

eth0: one guest (vf0)
vf1: two guests (tap0 and tap1)
bond0: two guests (macvtap0 and macvtap1)

	Arnd
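
The per-interface result above can be reproduced with a small table. This is a sketch only: the rows are transcribed from the example setup, each pairing a guest with the interface that owns its link to the switch, and the names `guest_assoc`, `assoc_table` and `guests_on()` are hypothetical. It shows that the number of associated guests is a property of the link owner and not of the VF count.

```c
#include <stddef.h>
#include <string.h>

/* One row per guest: which interface owns the guest's link to the switch. */
struct guest_assoc {
	const char *guest;
	const char *link_owner;
};

static const struct guest_assoc assoc_table[] = {
	{ "vf0",      "eth0"  },	/* PCI passthrough guest */
	{ "tap0",     "vf1"   },	/* via br0, trunk mode */
	{ "tap1",     "vf1"   },	/* via br23, access mode */
	{ "macvtap0", "bond0" },	/* VEPA mode */
	{ "macvtap1", "bond0" },	/* via bond0.24, private mode */
};

/* Count the guests whose association hangs off the given interface. */
static int guests_on(const char *owner)
{
	int count = 0;

	for (size_t i = 0; i < sizeof(assoc_table) / sizeof(assoc_table[0]); i++)
		if (strcmp(assoc_table[i].link_owner, owner) == 0)
			count++;
	return count;
}
```

Asking for "eth0", "vf1" and "bond0" gives one, two and two guests respectively, and an interface without guests, such as "eth1", gives zero.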

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Eric Dumazet @ 2010-04-27 19:30 UTC (permalink / raw)
  To: eilong
  Cc: Rick Jones, David Miller, therbert@google.com,
	netdev@vger.kernel.org
In-Reply-To: <1272393060.30392.2.camel@lb-tlvb-eilong.il.broadcom.com>

On Tuesday, 27 April 2010 at 21:31 +0300, Eilon Greenstein wrote:

> Though the thread is going in a different direction now, I just wanted
> to clarify two things:
> - yes, the 57710 and 57711 only handle the IP (src+dst) for UDP toeplitz
> hash. We all agree that it is much better to address the UDP ports as
> well, but I think Rick Jones explained the process very well - thank you
> Rick. Just to add one more (lame) excuse: the HW was designed before new
> NAPI was introduced and it complies with the requirements from Redmond
> - the next generation (57712) which we already sample does (finally)
> support it. We are working on a patch series to enhance the bnx2x to
> support this device now.
> 

Thanks Eilon !



^ permalink raw reply

* Re: [PATCH 0/4] net: ipmr netlink interface for route dumping
From: Patrick McHardy @ 2010-04-27 18:41 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20100427.100345.241441437.davem@davemloft.net>

David Miller wrote:
> Whoa, there are three of you now?!?!?!
> 
> :-)
> 

That would be nice, I'd have my two clones do all the work :)

Not sure what happened, some mishandling of git send-email
apparently :)

^ permalink raw reply

* Re: vlan performance issue on outgoing traffic
From: Brandeburg, Jesse @ 2010-04-27 18:32 UTC (permalink / raw)
  To: R. Weinedel; +Cc: netdev@vger.kernel.org
In-Reply-To: <4BD4C037.2070003@yahoo.de>

On Sun, 25 Apr 2010, R. Weinedel wrote:

> Hello,
> 
> I have a performance issue with vlan interfaces on a Debian Lenny
> server. The problem occurs only on outgoing traffic from the vlan
> interfaces. They use only half of the available bandwidth (490 Mbit/s
> measured with iperf). Incoming traffic is handled at 950 Mbit/s and is
> fine. The issue remains even with no switch and a direct connection
> between pc and server on the same nic. Removing the vlans from eth0
> (on the server) and configuring a single network on eth0 results in
> full speed (950 Mbit/s) in both directions. Even another nic (onboard
> nvidia3, forcedeth module) couldn't solve it. I tested only within the
> same network segment (vlan) without the need for ip forwarding or NAT,
> but the issue occurs on all my vlans.
> 
> All values were taken with iperf between the server and an Ubuntu 9.04
> workstation (and vice versa). I have verified (with ethtool and the
> stats from the switch) that all connections were 1000-BaseT/full duplex.
> It looks like some kind of traffic shaping to me, but I don't use tc,
> QoS, ToS or any other priority handling.
> 
> The network is quite simple: one server, one switch and then the
> workstations. No need for cascading or using (R)STP.
> 
> Here is some information about my network:
> 
> Switch: Netgear GSM7224 Layer 2 managed switch, FW 6.2.0.14
> (independent, issue remains on direct connection).
> 
> Server: Debian Lenny, kernel 2.6.26-2,

This version of the kernel doesn't support offloads for vlan adapters, 
which is probably causing most of your decrease in throughput due to 
either exhausting socket buffer size, or because of the round trip time 
being so much more relevant when not sending large bursts using TSO.  
Sometimes the flood of ACK packets causes higher CPU load, which could
also reduce your throughput.

The newer kernels will have a major impact on your setup due to a patch 
that enabled pass through of hardware offloads to the vlan device's 
offload advertisement.

The commit id of the patch is 5fb13570543f4ae022996c9d7c0c099c8abf22dd, 
you can view it at:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=5fb13570543f4ae022996c9d7c0c099c8abf22dd

 
> NIC: Intel Corporation 82541PI Gigabit Ethernet Con. (e1000 module).

This PCI adapter is bandwidth limited on the PCI bus, and so will be even 
more sensitive to offload on (TSO) vs offload off.

> # ethtool eth0
> Settings for eth0:
>         Supported ports: [ TP ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Supports auto-negotiation: Yes
>         Advertised link modes:  1000baseT/Full
>         Advertised auto-negotiation: Yes
>         Speed: 1000Mb/s
>         Duplex: Full
>         Port: Twisted Pair
>         PHYAD: 0
>         Transceiver: internal
>         Auto-negotiation: on
>         Supports Wake-on: umbg
>         Wake-on: g
>         Current message level: 0x00000007 (7)
>         Link detected: yes
> 
> 8021q:
> filename:       /lib/modules/2.6.26-2-686/kernel/net/8021q/8021q.ko
> version:        1.8
> license:        GPL
> alias:          rtnl-link-vlan
> srcversion:     A61E1168F65EE335A91D4E1
> depends:
> vermagic:       2.6.26-2-686 SMP mod_unload modversions 686
> 
> VLAN: #/proc/net/vlan/config
> VLAN Dev name    | VLAN ID
> Name-Type: VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD
> eth0.5         | 5  | eth0
> eth0.101       | 101  | eth0
> eth0.90        | 90  | eth0
> 
> IFCONFIG:
> eth0      Link encap:Ethernet  HWaddr 00:0e:0c:bc:43:43
>           inet6 addr: fe80::20e:cff:febc:4343/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:28140829 errors:0 dropped:218 overruns:0 frame:0
>           TX packets:44994420 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:3472864138 (3.2 GiB)  TX bytes:3908682627 (3.6 GiB)
> 
> eth0.5    Link encap:Ethernet  HWaddr 00:0e:0c:bc:43:43
>           inet addr:XXX.YYY.5.1  Bcast:XXX.YYY.5.255  Mask:255.255.255.0
>           inet6 addr: fe80::20e:cff:febc:4343/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:77807 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:69699 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:57578233 (54.9 MiB)  TX bytes:7782844 (7.4 MiB)
> 
> eth0.90   Link encap:Ethernet  HWaddr 00:0e:0c:bc:43:43
>           inet addr:XXX.YYY.90.1  Bcast:XXX.YYY.90.255  Mask:255.255.255.0
>           inet6 addr: fe80::20e:cff:febc:4343/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:457850 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:913988 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:23824841 (22.7 MiB)  TX bytes:1311485281 (1.2 GiB)
> 
> eth0.101  Link encap:Ethernet  HWaddr 00:0e:0c:bc:43:43
>           inet addr:XXX.YYY.101.1  Bcast:XXX.YYY.101.255  Mask:255.255.255.0
>           inet6 addr: fe80::20e:cff:febc:4343/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:24856818 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:41608593 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:423116676 (403.5 MiB)  TX bytes:3855703636 (3.5 GiB)
> 
> ROUTE: #route -n
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> XXX.YYY.101.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0.101
> XXX.YYY.5.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0.5
> XXX.YYY.90.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0.90
> 0.0.0.0         192.168.5.4     0.0.0.0         UG    0      0        0 eth0.5
> 
> Can someone give me a hint as to where I should look for a
> solution?
> 
> Many thanks !
> Regards
> Ralf Weinedel
> Falkensee/Germany
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Eilon Greenstein @ 2010-04-27 18:31 UTC (permalink / raw)
  To: Rick Jones, David Miller, therbert@google.com,
	eric.dumazet@gmail.com
  Cc: netdev@vger.kernel.org
In-Reply-To: <4BD601C3.5030108@hp.com>

On Mon, 2010-04-26 at 14:12 -0700, Rick Jones wrote:
> David Miller wrote:
> > From: Rick Jones <rick.jones2@hp.com>
> > Date: Mon, 26 Apr 2010 13:48:22 -0700
> > 
> >>Do not confuse explanation with endorsement.
> > 
> > Ok, fair enough.
> > 
> > But I don't see even the "other perspective" argument being even
> > valid.  Big shops still use UDP and it has to scale.
> 
> Preface - I too think it is massively stupid to ignore anything but TCP/IPv4, 
> and unwise to ignore IPv6 and so on, but there is a very real reason why one of 
> my email signatures reads:
> 
> "The road to hell is paved with business decisions"
> 
> > Or have they made multicast magically start working with TCP so
> > they can use it to do trades on the NASDAQ?
> 
> No. How many NIC chips can NASDAQ be expected to move? 0.1%? or even 1% of the 
> NIC chip market?
> 
> How many more NIC chips are in places where someone says "You sold me on 
> iSCSI/FCoE/whatnot, why can't I get 'link-rate'  to/from iSCSI storage/whatnot?!"
> 
> The NIC designer is there with his finance guys breathing down his neck shouting 
> "ROI Uber Alles!" and "Your budget is only this many monetary units!"  The 
> system designers at the system vendors are hearing the same things from their 
> own finance guys, have certain schedules, which then has them going to the NIC 
> firms, who want to sell chips to the system guys "You have to be ready to ship 
> by this date and your chip has to sell for no more than this."
> 
> Lather, rinse, repeat a few times and you get compromises on top of compromises.
> 
> Sometimes I think it is a wonder any of it actually works at all...
> 
> rick jones

Though the thread is going in a different direction now, I just wanted
to clarify two things:
- yes, the 57710 and 57711 only handle the IP (src+dst) for UDP toeplitz
hash. We all agree that it is much better to address the UDP ports as
well, but I think Rick Jones explained the process very well - thank you
Rick. Just to add one more (lame) excuse: the HW was designed before new
NAPI was introduced and it complies with the requirements from Redmond
- the next generation (57712) which we already sample does (finally)
support it. We are working on a patch series to enhance the bnx2x to
support this device now.

Eilon
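
The practical effect of hashing only the IP addresses can be sketched in user space. The code below is an illustration under stated assumptions, not the device's implementation: `example_key` is an arbitrary key chosen for the sketch, and `udp_flow` and `hash_flow()` are hypothetical names. It computes a Toeplitz hash over either the address pair alone or the address pair plus the UDP ports, so two flows between the same hosts can be compared in both modes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary example key; a real device is programmed with its own key. */
static const uint8_t example_key[16] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
};

/* Toeplitz hash over len input bytes, most significant bit first.
 * The key must provide at least len + 4 bytes. */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data,
			      size_t len)
{
	uint32_t hash = 0;
	/* The window starts as the first 32 bits of the key. */
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window by one key bit. */
			window <<= 1;
			if (key[i + 4] & (1u << bit))
				window |= 1u;
		}
	}
	return hash;
}

/* Hypothetical flow description; ports are in host byte order. */
struct udp_flow {
	uint8_t saddr[4];
	uint8_t daddr[4];
	uint16_t sport;
	uint16_t dport;
};

/* Hash src+dst address only, or src+dst address plus both ports. */
static uint32_t hash_flow(const struct udp_flow *f, int use_ports)
{
	uint8_t in[12];

	memcpy(in, f->saddr, 4);
	memcpy(in + 4, f->daddr, 4);
	in[8] = (uint8_t)(f->sport >> 8);
	in[9] = (uint8_t)(f->sport & 0xff);
	in[10] = (uint8_t)(f->dport >> 8);
	in[11] = (uint8_t)(f->dport & 0xff);
	return toeplitz_hash(example_key, in, use_ports ? 12 : 8);
}
```

Two UDP flows between the same pair of hosts that differ only in a port number get the same value from hash_flow(f, 0), so they land on the same receive queue. With hash_flow(f, 1) the port bits take part in the hash and the flows can be spread.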




^ permalink raw reply

* Re: [PATCH] RCU: don't turn off lockdep when find suspicious rcu_dereference_check() usage
From: Miles Lane @ 2010-04-27 17:58 UTC (permalink / raw)
  To: paulmck
  Cc: Eric W. Biederman, Vivek Goyal, Eric Paris, Lai Jiangshan,
	Ingo Molnar, Peter Zijlstra, LKML, nauman, eric.dumazet, netdev,
	Jens Axboe, Gui Jianfeng, Li Zefan, Johannes Berg
In-Reply-To: <20100427162201.GA5826@linux.vnet.ibm.com>

On Tue, Apr 27, 2010 at 12:22 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Apr 26, 2010 at 09:27:44PM -0700, Paul E. McKenney wrote:
>> On Mon, Apr 26, 2010 at 11:35:10AM -0700, Eric W. Biederman wrote:
>> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>> >
>> > > Eric Dumazet traced these down to a commit from Eric Biederman.
>> > >
>> > > If I don't hear from Eric Biederman in a few days, I will attempt a
>> > > patch, but it would be more likely to be correct coming from someone
>> > > with a better understanding of the code.  ;-)
>> >
>> > I already replied.
>> >
>> > http://lkml.org/lkml/2010/4/21/420
>>
>> You did indeed!!!  This experience is giving me an even better appreciation
>> of the maintainers' ability to keep all their patches straight!
>>
>> I will put together something based on your suggestion.
>
> How about the following?
>
>                                                        Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit 85fa42bd568ab99c375f018761ae6345249942cd
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date:   Mon Apr 26 21:40:05 2010 -0700
>
>    net: suppress RCU lockdep false positive in twsk_net()
>
>    Calls to twsk_net() are in some cases protected by reference counting
>    as an alternative to RCU protection.  Cases covered by reference counts
>    include __inet_twsk_kill(), inet_twsk_free(), inet_twdr_do_twkill_work(),
>    inet_twdr_twcal_tick(), and tcp_timewait_state_process().  RCU is used
>    by inet_twsk_purge().  Locking is used by established_get_first()
>    and established_get_next().  Finally, __inet_twsk_hashdance() is an
>    initialization case.
>
>    It appears to be non-trivial to locate the appropriate locks and
>    reference counts from within twsk_net(), so use rcu_dereference_raw().
>
>    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
> index 79f67ea..a066fdd 100644
> --- a/include/net/inet_timewait_sock.h
> +++ b/include/net/inet_timewait_sock.h
> @@ -224,7 +224,9 @@ static inline
>  struct net *twsk_net(const struct inet_timewait_sock *twsk)
>  {
>  #ifdef CONFIG_NET_NS
> -       return rcu_dereference(twsk->tw_net);
> +       return rcu_dereference_raw(twsk->tw_net); /* protected by locking, */
> +                                                 /* reference counting, */
> +                                                 /* initialization, or RCU. */
>  #else
>        return &init_net;
>  #endif
>

Worked for me.  Thanks!

           Miles

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Eric Dumazet @ 2010-04-27 17:37 UTC (permalink / raw)
  To: David Miller; +Cc: bmb, therbert, netdev, rick.jones2
In-Reply-To: <20100427.102038.57469310.davem@davemloft.net>

On Tuesday, 27 April 2010 at 10:20 -0700, David Miller wrote:

> 
> Indeed, a huge issue, in that we haven't converted the UDP hash over
> to RCU yet :-)
> 

I am not sure what you mean, UDP hash _is_ RCU converted ;)

> But because of the transient bind nature of UDP there are still a bunch
> of cases that won't even cure.
> --

We might use the ticket spinlock paradigm to let writers go in parallel
and let the user take the socket lock.

Instead of having the bh_lock_sock() to protect receive_queue *and*
backlog, writers get a unique slot in a table, that 'user' can handle
later.

Or serialize writers (before they try to bh_lock_sock()) with a
dedicated lock, so that user has 50% chances to get the sock lock,
contending with at most one writer.
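
The "unique slot in a table" idea can be sketched with C11 atomics. This is only one possible reading of the proposal, in user space, with hypothetical names (`slot_table`, `slot_claim()`). It shows just the ticket step: each writer takes a ticket with an atomic increment and stores its packet in the slot the ticket maps to, without touching the socket lock. Handling a full table and publishing filled slots to the socket owner are left out.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SLOT_COUNT 64u	/* arbitrary size for this sketch */

struct slot_table {
	atomic_uint next_ticket;	/* next ticket to hand out */
	void *slots[SLOT_COUNT];	/* packets parked for the socket owner */
};

static void slot_table_init(struct slot_table *t)
{
	atomic_init(&t->next_ticket, 0u);
	for (size_t i = 0; i < SLOT_COUNT; i++)
		t->slots[i] = NULL;
}

/* Writer side: take a ticket and park the packet in the matching slot.
 * Concurrent writers get distinct tickets, so they never share a slot
 * as long as fewer than SLOT_COUNT packets are outstanding. */
static unsigned int slot_claim(struct slot_table *t, void *pkt)
{
	unsigned int ticket = atomic_fetch_add(&t->next_ticket, 1u);
	unsigned int idx = ticket % SLOT_COUNT;

	t->slots[idx] = pkt;
	return idx;
}
```

The socket owner would later walk the filled slots in ticket order while holding the socket lock, which keeps the writers off that lock entirely.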




^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Eric Dumazet @ 2010-04-27 17:36 UTC (permalink / raw)
  To: Tom Herbert; +Cc: David Miller, bmb, netdev, rick.jones2
In-Reply-To: <g2k65634d661004271031r2eb2000bxc30013009509c410@mail.gmail.com>

On Tuesday, 27 April 2010 at 10:31 -0700, Tom Herbert wrote:

> This is the problem that we are addressing with so_reuseport!

How are standard applications protected against a DDoS?




^ permalink raw reply

* Re: [net-next-2.6 PATCH 1/2] Add ndo_set_vf_port_profile
From: Anirban Chakraborty @ 2010-04-27 17:33 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Scott Feldman, Rose, Gregory V, David Miller,
	netdev@vger.kernel.org, chrisw@redhat.com, Williams, Mitch A
In-Reply-To: <201004271435.25480.arnd@arndb.de>


On Apr 27, 2010, at 5:35 AM, Arnd Bergmann wrote:

> On Tuesday 27 April 2010, Scott Feldman wrote:
>>> Yes, I believe that's there today:
>>> 
>>>    NLA_PUT_U32(skb, IFLA_NUM_VF, dev_num_vf(dev->dev.parent));
>>> 
>>> The number of VFs is returned in RTM_GETLINK.  But, it's only returned if:
>>> 
>>>    if (dev->netdev_ops->ndo_get_vf_config && dev->dev.parent)
>>> 
>>> For my proposal, I'll need to return IFLA_NUM_VF unconditionally so callers
>>> can get num VFs.
>> 
>> Hmmm...seems IFLA_NUM_VF assumes a PCI device supporting SR-IOV when it uses
>> dev_num_vf().  I think a better option would have been to query the device
>> for the number of VFs, without assuming SR-IOV or even PCI.
>> 
>> I see a ndo_get_num_vf() coming...
> 
> Shouldn't the number of registered port profiles be totally independent of
> the number of virtual functions?
> 
> Any of the VFs could multiplex multiple guests using macvlan, which means you
> need to register each guest separately, not each VF.
> 
> Anything that ties port profiles to VFs seems fundamentally flawed AFAICT,
> at least when we want to extend this to adapters that don't do it in firmware.

Correct me if I am wrong. Shouldn't the port profile be tied to the physical NICs which are essentially
PCI functions (be it PF or VF)? I'd think that a port profile would have configuration settings for all the
physical NICs (PF/VF) of a specific physical port of the adapter. I liked the idea of querying the device
for number of VFs as it will cover both SR-IOV and non SR-IOV PCI functions.

thanks,
-Anirban

^ permalink raw reply

* Re: [patch v2] sctp: cleanup: remove duplicate assignment
From: Vlad Yasevich @ 2010-04-27 17:32 UTC (permalink / raw)
  To: David Miller
  Cc: error27, sri, yjwei, cdischino, linux-sctp, netdev,
	kernel-janitors
In-Reply-To: <20100427.095823.98890165.davem@davemloft.net>



David Miller wrote:
> From: Vlad Yasevich <vladislav.yasevich@hp.com>
> Date: Tue, 27 Apr 2010 10:32:34 -0400
> 
>>
>> Dan Carpenter wrote:
>>> This assignment isn't needed because we did it earlier already.
>>>
>>> Also another reason to delete the assignment is because it triggers a
>>> Smatch warning about checking for NULL pointers after a dereference.
>>>
>>> Reported-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>>> Signed-off-by: Dan Carpenter <error27@gmail.com>
>> Thanks.  I'll take this one.
> 
> And when will I get this from you? :-)

By the end of the week.  I am trying to get all the testing finished. :)

-vlad


^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: Tom Herbert @ 2010-04-27 17:31 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, bmb, netdev, rick.jones2
In-Reply-To: <1272388439.2295.369.camel@edumazet-laptop>

> So we have a BIG problem :
>
> All cpus are fighting to get the socket lock,
> and very little progress is made.
>
> Note this problem has nothing to do with RPS, we could have
> it with multiqueue as well.
>

This is the problem that we are addressing with so_reuseport!

> Oh well...
>
>
>
>

^ permalink raw reply

* Re: [PATCH] bnx2x: add support for receive hashing
From: David Miller @ 2010-04-27 17:20 UTC (permalink / raw)
  To: eric.dumazet; +Cc: bmb, therbert, netdev, rick.jones2
In-Reply-To: <1272388439.2295.369.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 27 Apr 2010 19:13:59 +0200

> So we have a BIG problem :
> 
> All cpus are fighting to get the socket lock,
> and very little progress is made.
> 
> Note this problem has nothing to do with RPS, we could have 
> it with multiqueue as well.
> 
> Oh well...

Indeed, a huge issue, in that we haven't converted the UDP hash over
to RCU yet :-)

But because of the transient bind nature of UDP there are still a bunch
of cases that won't even cure.

^ permalink raw reply

* Re: [PATCH] cxgb3: Wait longer for control packets on initialization
From: David Miller @ 2010-04-27 17:18 UTC (permalink / raw)
  To: divy; +Cc: adetsch, netdev
In-Reply-To: <4BD648FC.80602@chelsio.com>

From: Divy Le Ray <divy@chelsio.com>
Date: Mon, 26 Apr 2010 19:16:28 -0700

> Andre Detsch wrote:
>> In some Power7 platforms, when using VIOS (Virtual I/O Server), we
>> need to wait longer for control packets to finish transfer during
>> initialization.
>> Without this change, initialization may fail prematurely.
>>
>> Signed-off-by: Wen Xiong <wenxiong@us.ibm.com>
>> Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
>>   
> 
> Acked-by: Divy Le Ray <divy@chelsio.com>

Applied.

^ permalink raw reply

* Re: [net-2.6 PATCH] ixgbe: Power down PHY during driver resets
From: David Miller @ 2010-04-27 17:18 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, peter.p.waskiewicz.jr
In-Reply-To: <20100427103814.23338.47637.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 03:38:15 -0700

> From: Peter Waskiewicz <peter.p.waskiewicz.jr@intel.com>
> 
> The PHY laser is still on during driver init.  It's allowing
> garbage to hit our FIFO, which eventually can cause the entire
> device to die.  Power down the laser while setting up the device,
> and re-enable the laser before getting link.
> 
> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: [net-2.6 PATCH] e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata
From: David Miller @ 2010-04-27 17:18 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: netdev, gospo, mjg, bruce.w.allan
In-Reply-To: <20100427133232.25490.92973.stgit@localhost.localdomain>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Tue, 27 Apr 2010 06:33:04 -0700

> From: Bruce Allan <bruce.w.allan@intel.com>
> 
> Prompted by a previous patch submitted by Matthew Garrett <mjg@redhat.com>,
> further digging into errata documentation reveals the current enabling or
> disabling of ASPM L0s and L1 states for certain parts supported by this
> driver are incorrect.  82571 and 82572 should always disable L1.  For
> standard frames, 82573/82574/82583 can enable L1 but L0s must be disabled,
> and for jumbo frames 82573/82574 must disable L1.  This allows for some
> parts to enable L1 in certain configurations leading to better power
> savings.
> 
> Also according to the same errata, Early Receive (ERT) should be disabled
> on 82573 when using jumbo frames.
> 
> Cc: Matthew Garrett <mjg@redhat.com>
> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied.

^ permalink raw reply

* Re: linux-next: build failure after merge of the final tree (net tree related)
From: David Miller @ 2010-04-27 17:18 UTC (permalink / raw)
  To: sfr; +Cc: netdev, linux-next, linux-kernel, yoshfuji
In-Reply-To: <20100427.093430.258110898.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Tue, 27 Apr 2010 09:34:30 -0700 (PDT)

> From: Stephen Rothwell <sfr@canb.auug.org.au>
> Date: Tue, 27 Apr 2010 15:25:16 +1000
> 
>> After merging the bkl-ioctl tree, today's linux-next build (powerpc
>> ppc44x_defconfig) failed like this:
>> 
>> net/bridge/br_multicast.c: In function 'br_ip6_multicast_alloc_query':
>> net/bridge/br_multicast.c:469: error: implicit declaration of function 'csum_ipv6_magic'
>> 
>> Introduced by commit 08b202b6726459626c73ecfa08fcdc8c3efc76c2 ("bridge
>> br_multicast: IPv6 MLD support") from the net tree.
>> 
>> csum_ipv6_magic is declared in net/ip6_checksum.h ...
> 
> Bummer, powerpc is one of the few platforms that doesn't get the header
> file implicitly so you always trip over this whereas we never see it in
> x86 and sparc64 builds :-)
> 
> I'll fix this, thanks!

I just committed the following for this:

bridge: Fix build of ipv6 multicast code.

Based upon a report from Stephen Rothwell:

--------------------
net/bridge/br_multicast.c: In function 'br_ip6_multicast_alloc_query':
net/bridge/br_multicast.c:469: error: implicit declaration of function 'csum_ipv6_magic'

Introduced by commit 08b202b6726459626c73ecfa08fcdc8c3efc76c2 ("bridge
br_multicast: IPv6 MLD support") from the net tree.

csum_ipv6_magic is declared in net/ip6_checksum.h ...
--------------------

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 net/bridge/br_multicast.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index e481dbd..2048ef0 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -28,6 +28,7 @@
 #include <net/ipv6.h>
 #include <net/mld.h>
 #include <net/addrconf.h>
+#include <net/ip6_checksum.h>
 #endif
 
 #include "br_private.h"
-- 
1.7.0.4

^ permalink raw reply related

