Netdev List
* Re: Query on usage of multicast as source IPv6 address
From: Stephen Hemminger @ 2011-11-07 21:11 UTC (permalink / raw)
  To: Kumar Sanghvi; +Cc: netdev
In-Reply-To: <20111107204550.GB2980@kumar.asicdesigners.com>

On Tue, 8 Nov 2011 02:15:52 +0530
Kumar Sanghvi <divinekumar@gmail.com> wrote:

> However, what should be the behavior if a host receives a
> packet (probably from a malicious host with pktgen abilities)
> having a multicast address in source address field:
> 1) Should the receiving host discard the packet?
> 2) Should the receiving host discard the packet, and send back
>    ICMP error?
> 3) Or should the receiving host send a response to the multicast
>    address?

Before the Internet was full of people sending malicious packets,
the standards encouraged sending ICMP errors. Later RFCs discourage
sending ICMP errors in many cases (see RFC 1812).

IMHO, just drop the packet, making sure to increment the appropriate statistic.
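
As a rough illustration of "drop and count" (a user-space sketch only,
not the kernel's receive path; struct rx_stats, src_mcast_drops and
rx_check_source() are names made up for this example), the check
amounts to testing the source address for the ff00::/8 prefix:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter for this sketch; not an existing kernel statistic. */
struct rx_stats {
	unsigned long src_mcast_drops;
};

/* IPv6 multicast addresses are ff00::/8 (RFC 4291, section 2.7). */
static bool ipv6_src_is_multicast(const uint8_t addr[16])
{
	return addr[0] == 0xff;
}

/* Returns true if the packet may be processed further, false if dropped. */
static bool rx_check_source(const uint8_t saddr[16], struct rx_stats *stats)
{
	if (ipv6_src_is_multicast(saddr)) {
		stats->src_mcast_drops++;	/* count it, send nothing back */
		return false;
	}
	return true;
}
```

No ICMPv6 error is generated in this sketch, so nothing is ever sent
towards the multicast group.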


* softirq oops from b44_poll
From: Josh Boyer @ 2011-11-07 20:56 UTC (permalink / raw)
  To: Gary Zambrano, netdev; +Cc: linux-kernel, kernel-team

Hi all,

We've had two reports of a WARN_ON being spit out from kernel/softirq.c
that seem fairly related in symptoms.  Both seem to involve b44_poll
in the middle of some disk I/O.  An example of the output is
here:

:WARNING: at kernel/softirq.c:159 _local_bh_enable_ip+0x44/0x8e()
:Hardware name: Vostro 1500                     
:Modules linked in: fuse lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
nf_conntrack sunrpc uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec
snd_hwdep snd_seq snd_seq_device snd_pcm dell_wmi sparse_keymap dell_laptop
joydev dcdbas microcode r852 sm_common nand nand_ids b44 nand_ecc r592 mtd ssb
mii memstick arc4 i2c_i801 iTCO_wdt iTCO_vendor_support iwl3945 iwl_legacy
mac80211 cfg80211 rfkill snd_timer snd soundcore snd_page_alloc firewire_ohci
firewire_core crc_itu_t uas usb_storage sdhci_pci sdhci mmc_core nouveau ttm
drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video [last unloaded:
scsi_wait_scan]
:Pid: 1511, comm: nepomukservices Not tainted 3.1.0-1.fc16.x86_64 #1
:Call Trace:
: <IRQ>  [<ffffffff81057a56>] warn_slowpath_common+0x83/0x9b
: [<ffffffff81057a88>] warn_slowpath_null+0x1a/0x1c
: [<ffffffff8105d462>] _local_bh_enable_ip+0x44/0x8e
: [<ffffffff8105d4ba>] local_bh_enable_ip+0xe/0x10
: [<ffffffff814b5af4>] _raw_spin_unlock_bh+0x15/0x17
: [<ffffffffa03cc969>] destroy_conntrack+0x9d/0xdc [nf_conntrack]
: [<ffffffff813fa083>] nf_conntrack_destroy+0x19/0x1b
: [<ffffffff813ce4ed>] skb_release_head_state+0xa7/0xef
: [<ffffffff813ce2f1>] __kfree_skb+0x13/0x83
: [<ffffffff813ce3b7>] consume_skb+0x56/0x6b
: [<ffffffffa02e48c4>] b44_poll+0xaf/0x3ec [b44]
: [<ffffffff813d8137>] net_rx_action+0xa9/0x1b8
: [<ffffffffa02e202e>] ? br32+0x19/0x1d [b44]
: [<ffffffff8105d6b3>] __do_softirq+0xc9/0x1b5
: [<ffffffff81027719>] ? ack_APIC_irq+0x15/0x17
: [<ffffffff814be32c>] call_softirq+0x1c/0x30
: [<ffffffff81010b45>] do_softirq+0x46/0x81
: [<ffffffff8105d97b>] irq_exit+0x57/0xb1
: [<ffffffff814bec0e>] do_IRQ+0x8e/0xa5
: [<ffffffff814b5d2e>] common_interrupt+0x6e/0x6e
: <EOI>  [<ffffffff814bc1f4>] ? sysret_audit+0x16/0x20

You can find the original bug reports in the URLs below.  This has happened
on two different machines, one 32-bit and another 64-bit.  I'm fairly sure
both reports are the same issue, but I haven't a clue what that issue might
be at the moment.

Thoughts?

https://bugzilla.redhat.com/show_bug.cgi?id=749856
https://bugzilla.redhat.com/show_bug.cgi?id=741117

josh


* [PATCH] staging: octeon-ethernet: Fix compile error caused by changed to struct skb_frag_struct.
From: David Daney @ 2011-11-07 20:49 UTC (permalink / raw)
  To: ralf, linux-mips, netdev, gregkh, devel; +Cc: ddaney.cavm, David Daney

Evidently the definition of struct skb_frag_struct has changed, so we
need to change to match it.

Signed-off-by: David Daney <david.daney@cavium.com>
---
 drivers/staging/octeon/ethernet-tx.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/octeon/ethernet-tx.c b/drivers/staging/octeon/ethernet-tx.c
index b445cd6..2542c37 100644
--- a/drivers/staging/octeon/ethernet-tx.c
+++ b/drivers/staging/octeon/ethernet-tx.c
@@ -275,7 +275,7 @@ int cvm_oct_xmit(struct sk_buff *skb, struct net_device *dev)
 		CVM_OCT_SKB_CB(skb)[0] = hw_buffer.u64;
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			struct skb_frag_struct *fs = skb_shinfo(skb)->frags + i;
-			hw_buffer.s.addr = XKPHYS_TO_PHYS((u64)(page_address(fs->page) + fs->page_offset));
+			hw_buffer.s.addr = XKPHYS_TO_PHYS((u64)(page_address(fs->page.p) + fs->page_offset));
 			hw_buffer.s.size = fs->size;
 			CVM_OCT_SKB_CB(skb)[i + 1] = hw_buffer.u64;
 		}
-- 
1.7.2.3


* Query on usage of multicast as source IPv6 address
From: Kumar Sanghvi @ 2011-11-07 20:45 UTC (permalink / raw)
  To: netdev

Hi,

I am trying to understand IPv6 behavior in Linux.
And I have a question about the use of a multicast address
as the source address.

RFC 4291 in Section 2.7 states that:
"Multicast addresses must not be used as source addresses
 in IPv6 packets or appear in any Routing header."

However, what should be the behavior if a host receives a
packet (probably from a malicious host with pktgen abilities)
having a multicast address in source address field:
1) Should the receiving host discard the packet?
2) Should the receiving host discard the packet, and send back
   ICMP error?
3) Or should the receiving host send a response to the multicast
   address?

I tried to search for the usage of a multicast address in the source
address field. However, I could not find much detail (maybe I am
not looking hard enough...)

So I tried the experiment below between two Linux hosts:

Host1: Running Linux 3.1,
       IP address: 2001:db8:0:f101::2/64,
       and netserver listening on port 12865.

Host2: Running Linux 2.6.32,
       IP address: 2001:db8:0:f101::1/64,
       and with pktgen abilities.

Host1 and Host2 are back-to-back connected.

Now, from Host2, I send a TCP packet with a multicast address
(ff02::1) in source IP address field.
From the tcpdump running on Host2, I see below:
----
tcpdump ip6 -i eth6 -vv
tcpdump: WARNING: eth6: no IPv4 address assigned
tcpdump: listening on eth6, link-type EN10MB (Ethernet), capture size 65535 bytes
01:16:11.297469 IP6 (hlim 37, next-header TCP (6) payload length: 20) ff02::1.43373 > 2001:db8:0:f101::2.12865: Flags [S], cksum 0x0d91 (correct), seq 768433557, win 5760, length 0
01:16:11.297627 IP6 (hlim 64, next-header TCP (6) payload length: 24) 2001:db8:0:f101::2.12865 > ff02::1.43373: Flags [S.], cksum 0x4614 (correct), seq 4202338952, ack 768433558, win 14400, options [mss 1440], length 0
01:16:12.299063 IP6 (hlim 64, next-header TCP (6) payload length: 24) 2001:db8:0:f101::2.12865 > ff02::1.43373: Flags [S.], cksum 0x4614 (correct), seq 4202338952, ack 768433558, win 14400, options [mss 1440], length 0
01:16:14.298824 IP6 (hlim 64, next-header TCP (6) payload length: 24) 2001:db8:0:f101::2.12865 > ff02::1.43373: Flags [S.], cksum 0x4614 (correct), seq 4202338952, ack 768433558, win 14400, options [mss 1440], length 0
01:16:16.297476 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::7:4300:210:a410 > 2001:db8:0:f101::2: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2001:db8:0:f101::2
	  source link-address option (1), length 8 (1): 00:07:43:10:a4:10
	    0x0000:  0007 4310 a410
01:16:16.297591 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 24) 2001:db8:0:f101::2 > fe80::7:4300:210:a410: [icmp6 sum ok] ICMP6, neighbor advertisement, length 24, tgt is 2001:db8:0:f101::2, Flags [solicited]
01:16:18.498340 IP6 (hlim 64, next-header TCP (6) payload length: 24) 2001:db8:0:f101::2.12865 > ff02::1.43373: Flags [S.], cksum 0x4614 (correct), seq 4202338952, ack 768433558, win 14400, options [mss 1440], length 0
01:16:21.309998 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::7:4300:210:8450 > fe80::7:4300:210:a410: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::7:4300:210:a410
	  source link-address option (1), length 8 (1): 00:07:43:10:84:50
	    0x0000:  0007 4310 8450
01:16:21.310040 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 24) fe80::7:4300:210:a410 > fe80::7:4300:210:8450: [icmp6 sum ok] ICMP6, neighbor advertisement, length 24, tgt is fe80::7:4300:210:a410, Flags [solicited]
01:16:26.309429 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::7:4300:210:a410 > fe80::7:4300:210:8450: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::7:4300:210:8450
	  source link-address option (1), length 8 (1): 00:07:43:10:a4:10
	    0x0000:  0007 4310 a410
01:16:26.309482 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 24) fe80::7:4300:210:8450 > fe80::7:4300:210:a410: [icmp6 sum ok] ICMP6, neighbor advertisement, length 24, tgt is fe80::7:4300:210:8450, Flags [solicited]
01:16:26.497391 IP6 (hlim 64, next-header TCP (6) payload length: 24) 2001:db8:0:f101::2.12865 > ff02::1.43373: Flags [S.], cksum 0x4614 (correct), seq 4202338952, ack 768433558, win 14400, options [mss 1440], length 0
01:16:42.495518 IP6 (hlim 64, next-header TCP (6) payload length: 24) 2001:db8:0:f101::2.12865 > ff02::1.43373: Flags [S.], cksum 0x4614 (correct), seq 4202338952, ack 768433558, win 14400, options [mss 1440], length 0
^C
13 packets captured
13 packets received by filter
0 packets dropped by kernel
----

So it seems that Linux responds to packets that have a multicast
address in the source IP field, and sends the response to that
multicast address. Or am I interpreting this wrongly?

I would like to understand whether sending a response to a multicast
address is valid behavior. Will it not lead to some kind of
amplification attack if a malicious user on Host2 sends a
flood of TCP packets, with a multicast source IP, towards Host1,
and there are several hosts present on the same LAN segment?

Or is it completely valid to send a response to a multicast address?
Maybe my understanding is not clear, then.


Any help in this regard is appreciated.


Thanks,
Kumar.


* Re: patch "workflow" - what deferred state means?
From: David Miller @ 2011-11-07 20:19 UTC (permalink / raw)
  To: mazziesaccount; +Cc: netdev
In-Reply-To: <CANhJrGMsM7Pc9j0SB4G4Xxym0LLYmtaUq--vyPgt+7HWNz7H0Q@mail.gmail.com>

From: Maz The Northener <mazziesaccount@gmail.com>
Date: Mon, 7 Nov 2011 22:08:00 +0200

> I was talking about http://patchwork.ozlabs.org/patch/123407/ and
> patchwork.ozlabs.org/patch/123406/

Those are deferred because now is not an appropriate time to submit
new feature patches.

This kind of work should be resubmitted when net-next opens back up.


* Re: patch "workflow" - what deferred state means?
From: Maz The Northener @ 2011-11-07 20:08 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20111107.141421.1710547287569203789.davem@davemloft.net>

I was talking about http://patchwork.ozlabs.org/patch/123407/ and
patchwork.ozlabs.org/patch/123406/

On 11/7/11, David Miller <davem@davemloft.net> wrote:
> From: Maz The Northener <mazziesaccount@gmail.com>
> Date: Mon, 7 Nov 2011 21:07:45 +0200
>
>> So are the different states of patches explained somewhere?
>
> Tell us what specific patch you're talking about and maybe we
> can give you an answer.
>


-- 

-Matti "Maz" Vaittinen
CWF coding team leader
http://www.curlysworldoffreeware.com/

BrakesAreForCowards!!!
When you feel blue, no one sees your tears... When your down, no one
understands your struggle...
When you feel happy, no one notices your smile...
But fart just once...
I would love to create a freeware game with C - unless I was working at NSN.


* RE: [PATCH] net, wireless, mwifiex: Fix mem leak in mwifiex_update_curr_bss_params()
From: Bing Zhao @ 2011-11-07 19:27 UTC (permalink / raw)
  To: Srivatsa S. Bhat, Jesper Juhl
  Cc: linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, John W. Linville
In-Reply-To: <4EB70920.80905@linux.vnet.ibm.com>

> On 11/07/2011 03:28 AM, Jesper Juhl wrote:
> > If kmemdup() fails we leak the memory allocated to bss_desc.
> > This patch fixes the leak.
> > I also removed the pointless default assignment of 'NULL' to 'bss_desc'
> > while I was there anyway.
> >
> > Signed-off-by: Jesper Juhl <jj@chaosbits.net>

Hi Jesper,

Thanks for the patch.

> 
> Looks good to me.
> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Hi Srivatsa,

Thanks for your review.

Acked-by: Bing Zhao <bzhao@marvell.com>

Thanks,
Bing

> 
> Thanks,
> Srivatsa S. Bhat
> 
> > ---
> >  drivers/net/wireless/mwifiex/scan.c |    3 ++-
> >  1 files changed, 2 insertions(+), 1 deletions(-)
> >
> >  note: patch is compile tested only since I don't have the hardware.
> >
> > diff --git a/drivers/net/wireless/mwifiex/scan.c b/drivers/net/wireless/mwifiex/scan.c
> > index dae8dbb..8a3f959 100644
> > --- a/drivers/net/wireless/mwifiex/scan.c
> > +++ b/drivers/net/wireless/mwifiex/scan.c
> > @@ -1469,7 +1469,7 @@ mwifiex_update_curr_bss_params(struct mwifiex_private *priv, u8 *bssid,
> >  			       s32 rssi, const u8 *ie_buf, size_t ie_len,
> >  			       u16 beacon_period, u16 cap_info_bitmap, u8 band)
> >  {
> > -	struct mwifiex_bssdescriptor *bss_desc = NULL;
> > +	struct mwifiex_bssdescriptor *bss_desc;
> >  	int ret;
> >  	unsigned long flags;
> >  	u8 *beacon_ie;
> > @@ -1484,6 +1484,7 @@ mwifiex_update_curr_bss_params(struct mwifiex_private *priv, u8 *bssid,
> >
> >  	beacon_ie = kmemdup(ie_buf, ie_len, GFP_KERNEL);
> >  	if (!beacon_ie) {
> > +		kfree(bss_desc);
> >  		dev_err(priv->adapter->dev, " failed to alloc beacon_ie\n");
> >  		return -ENOMEM;
> >  	}
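
For readers following along, the error-path rule the patch applies
(free what was already allocated before returning) looks like this in
a user-space sketch. malloc()/free() stand in for the kernel
allocators, and struct desc, desc_create() and desc_destroy() are
hypothetical names, not mwifiex code:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor owning a private copy of an IE buffer. */
struct desc {
	unsigned char *ie;
	size_t ie_len;
};

static struct desc *desc_create(const unsigned char *ie, size_t ie_len)
{
	struct desc *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;

	/* Second allocation; allocate at least one byte so ie is non-NULL. */
	d->ie = malloc(ie_len ? ie_len : 1);
	if (!d->ie) {
		free(d);	/* without this, d would leak on failure */
		return NULL;
	}
	if (ie_len)
		memcpy(d->ie, ie, ie_len);
	d->ie_len = ie_len;
	return d;
}

static void desc_destroy(struct desc *d)
{
	if (d) {
		free(d->ie);
		free(d);
	}
}
```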


* Re: patch "workflow" - what deferred state means?
From: David Miller @ 2011-11-07 19:14 UTC (permalink / raw)
  To: mazziesaccount; +Cc: netdev
In-Reply-To: <CANhJrGNsyD1vsRA4xhgD3KrY9ZcdNOyM4JsM+kf71Pt9KTqh-Q@mail.gmail.com>

From: Maz The Northener <mazziesaccount@gmail.com>
Date: Mon, 7 Nov 2011 21:07:45 +0200

> So are the different states of patches explained somewhere?

Tell us what specific patch you're talking about and maybe we
can give you an answer.


* Re: [PATCH] route: fix ICMP secure_redirects
From: Flavio Leitner @ 2011-11-07 19:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20111107.133541.808346933690781560.davem@davemloft.net>

On Mon, 07 Nov 2011 13:35:41 -0500 (EST)
David Miller <davem@davemloft.net> wrote:

> From: Flavio Leitner <fbl@redhat.com>
> Date: Mon,  7 Nov 2011 13:41:45 -0200
> 
> > It should accept ICMP redirects from any host and not
> > just from gateways when secure_redirects is disabled.
> > 
> > Signed-off-by: Flavio Leitner <fbl@redhat.com>
> 
> This is changing the default behavior, and could break things for
> people.
> 
> We have sort-of discussed this already, and agreed that the tests
> made in this code before my inetpeer reworking had to be reinstated
> exactly as it was.

Right, so I cannot change the behavior of either value, 0 or 1, then. For some
reason I thought it was only the default behavior that I couldn't change.
I will think of something else then.
thanks,
fbl


* patch "workflow" - what deferred state means?
From: Maz The Northener @ 2011-11-07 19:07 UTC (permalink / raw)
  To: netdev

Hi!

I sent a few versions of a patch to netdev some days ago. I
recently stumbled upon the patchwork website and noticed that the latest
versions of my patches were in the "deferred" state. I tried searching for
what that means, but the only thing I managed to find was the U-Boot project's
explanation. They use the deferred state to mean that the patch in question
depends on something not currently in the source tree. I doubt that's the
case here, though. Maybe it is because my patch was created against an rc4
kernel.

I was just wondering whether I can conclude anything from the state.
Naturally, I am keen to know whether the patch is rejected, whether it may
still end up in the kernel, or whether there is something I could still do
to improve the situation. Maybe creating the patch against the new 3.1
kernel would help?

So are the different states of patches explained somewhere?

--Matti Vaittinen


* [GIT] Networking
From: David Miller @ 2011-11-07 18:45 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) The IXGBE build fix wrt. CONFIG_PCI_IOV from Gregory Rose.

2) Fixes for module unload races and statistic problems in forcedeth from
   david decotigny, Mike Ditto, and Mandeep Baines.

3) Kill stray BKL references from wanrouter code, from Richard Weinberger.

4) usbnet oopses due to unguarded skb_tx_timestamp() check, fix from
   Konstantin Khlebnikov.

5) tg3 driver bug fixes from Matt Carlson.

6) Fix bogus compare of u8 with -1 in bonding, from Dan Carpenter.

7) Netlink message validation fix from Johannes Berg.

8) Fix sky2 driver regression on Yukon Optima chips, from Stephen Hemminger.
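
Regarding item 6, a small sketch of why such a compare can never
succeed. This is not the bonding code itself, and
matches_sentinel_buggy() and matches_sentinel_fixed() are
hypothetical helpers used only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The u8 operand is promoted to int, giving a value in 0..255, so it
 * can never equal a negative sentinel such as -1.
 */
static bool matches_sentinel_buggy(uint8_t v, int sentinel)
{
	return v == sentinel;
}

/* One possible fix: truncate the sentinel to the same 8-bit type first. */
static bool matches_sentinel_fixed(uint8_t v, int sentinel)
{
	return v == (uint8_t)sentinel;
}
```

With v = 0xff and a sentinel of -1, the first helper returns false and
the second returns true.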

Please pull, thanks a lot!

The following changes since commit 83dbb15e9cd78a3619e3db36777e2f81d09b2914:

  Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux (2011-11-07 10:01:56 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

Andres Salomon (1):
      libertas: ensure we clean up a scan request properly

Christian Lamparter (1):
      carl9170: fix AMPDU TX_CTL_REQ_TX_STATUS handling

Dan Carpenter (1):
      bonding: comparing a u8 with -1 is always false

David Herrmann (4):
      Bluetooth: ath3k: Use GFP_KERNEL instead of GFP_ATOMIC
      Bluetooth: bcm203x: Fix race condition on disconnect
      Bluetooth: bcm203x: Use GFP_KERNEL in workqueue
      Bluetooth: bfusb: Fix error path on firmware load

David S. Miller (1):
      Merge branch 'for-davem' of git://git.kernel.org/.../linville/wireless

Eliad Peller (2):
      mac80211: fix remain_off_channel regression
      mac80211: config hw when going back on-channel

Emmanuel Grumbach (1):
      iwlagn: fix the race in the unmapping of the HCMD

Jeff Kirsher (2):
      i825xx:xscale:8390:freescale: Fix Kconfig dependancies
      etherh: Add MAINTAINERS entry for etherh

Johan Hedberg (1):
      Bluetooth: Set HCI_MGMT flag only in read_controller_info

Johannes Berg (4):
      mac80211: disable powersave for broken APs
      mac80211: warn only once about not finding a rate
      netlink: validate NLA_MSECS length
      netlink: clarify attribute length check documentation

John W. Linville (2):
      Merge branch 'master' of git://git.kernel.org/.../padovan/bluetooth
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem

Jouni Malinen (1):
      mac80211: Fix TDLS support validation in add_station handler

Konstantin Khlebnikov (1):
      usbnet: fix oops in usbnet_start_xmit

Larry Finger (1):
      b43: Remove unneeded message

Mandeep Baines (1):
      forcedeth: Improve stats counters

Matt Carlson (8):
      tg3: Fix APE mutex init and use
      tg3: Fix 4k tx bd segmentation code
      tg3: Fix 4k skb error recovery path
      tg3: Fix irq alloc error cleanup path
      tg3: Obtain PCI function number from device
      tg3: Schedule at most one tg3_reset_task run
      tg3: Eliminate timer race with reset_task
      tg3: Update version to 3.121

Mike Ditto (1):
      forcedeth: Acknowledge only interrupts that are being processed

Or Gerlitz (1):
      MAINTAINERS/rds: update maintainer

Rajkumar Manoharan (5):
      ath9k_hw: Fix regression of register offset for AR9003 chips
      ath9k_hw: Fix radio retention for AR9462
      ath9k_hw: Fix regression of register offset of AR9330/AR9340
      ath9k_hw: Update AR9485 initvals to fix system hang issue
      ath9k_hw: Fix noise floor calibration timeout on fast channel change

Richard Weinberger (1):
      wanrouter: Remove kernel_lock annotations

Rose, Gregory V (1):
      ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined

Szymon Janc (2):
      Bluetooth: rfcomm: Fix sleep in invalid context in rfcomm_security_cfm
      Bluetooth: Increase HCI reset timeout in hci_dev_do_close

Wey-Yi Guy (2):
      iwlwifi: allow pci_enable_msi fail
      iwlwifi: don't perform "echo test" when cmd queue stuck

david decotigny (3):
      forcedeth: fix race when unloading module
      forcedeth: remove unneeded stats updates
      forcedeth: fix a few sparse warnings (variable shadowing)

stephen hemminger (2):
      macvlan: receive multicast with local address
      sky2: fix regression on Yukon Optima

 MAINTAINERS                                      |    3 +-
 drivers/bluetooth/ath3k.c                        |    4 +-
 drivers/bluetooth/bcm203x.c                      |   12 ++-
 drivers/bluetooth/bfusb.c                        |   13 +-
 drivers/net/bonding/bond_main.c                  |    4 +-
 drivers/net/bonding/bond_procfs.c                |    4 +-
 drivers/net/ethernet/broadcom/tg3.c              |  195 ++++++++++++----------
 drivers/net/ethernet/broadcom/tg3.h              |   21 ++-
 drivers/net/ethernet/freescale/Kconfig           |    3 +-
 drivers/net/ethernet/intel/Kconfig               |    6 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c   |    2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h   |    4 +-
 drivers/net/ethernet/marvell/sky2.c              |   11 --
 drivers/net/ethernet/natsemi/Kconfig             |    5 +-
 drivers/net/ethernet/nvidia/forcedeth.c          |   88 ++++-------
 drivers/net/macvlan.c                            |    7 +
 drivers/net/usb/usbnet.c                         |    3 +-
 drivers/net/wireless/ath/ath9k/ar9002_calib.c    |    4 -
 drivers/net/wireless/ath/ath9k/ar9003_calib.c    |   11 +-
 drivers/net/wireless/ath/ath9k/ar9003_phy.h      |   34 ++--
 drivers/net/wireless/ath/ath9k/ar9485_initvals.h |   10 +-
 drivers/net/wireless/ath/ath9k/hw.c              |    3 +
 drivers/net/wireless/ath/carl9170/tx.c           |   11 +-
 drivers/net/wireless/b43/xmit.c                  |    1 -
 drivers/net/wireless/iwlwifi/iwl-core.c          |   10 -
 drivers/net/wireless/iwlwifi/iwl-pci.c           |    8 +-
 drivers/net/wireless/iwlwifi/iwl-trans-pcie.c    |   12 +-
 drivers/net/wireless/libertas/cfg.c              |   25 ++-
 drivers/net/wireless/libertas/cfg.h              |    1 +
 drivers/net/wireless/libertas/main.c             |    6 +-
 include/linux/ethtool.h                          |    2 +
 include/net/bluetooth/rfcomm.h                   |    1 +
 include/net/mac80211.h                           |    3 +-
 include/net/netlink.h                            |   11 +-
 lib/nlattr.c                                     |    1 +
 net/bluetooth/hci_core.c                         |    2 +-
 net/bluetooth/mgmt.c                             |    2 -
 net/bluetooth/rfcomm/core.c                      |    9 +-
 net/mac80211/cfg.c                               |   12 +-
 net/mac80211/ieee80211_i.h                       |    1 +
 net/mac80211/mlme.c                              |   18 ++-
 net/mac80211/work.c                              |    7 +-
 net/wanrouter/wanproc.c                          |    2 -
 43 files changed, 325 insertions(+), 267 deletions(-)


* Re: [PATCH net v5 0/5] forcedeth: minor fixes for stats, rmmod, sparse
From: David Miller @ 2011-11-07 18:44 UTC (permalink / raw)
  To: david.decotigny
  Cc: netdev, linux-kernel, ian.campbell, eric.dumazet,
	jeffrey.t.kirsher, jpirko, joe, szymon
In-Reply-To: <cover.1320539724.git.david.decotigny@google.com>

From: David Decotigny <david.decotigny@google.com>
Date: Sat,  5 Nov 2011 17:38:19 -0700

> This is a minor update over v4, re-adding a patch I left aside to
> study it.

All applied, thanks.


* Re: [PATCH 6/7] fsl_pmc: Add API to enable device as wakeup event source
From: Scott Wood @ 2011-11-07 18:41 UTC (permalink / raw)
  To: Tabi Timur-B04825
  Cc: Zhao Chenhui-B35336, netdev@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Li Yang-R58472
In-Reply-To: <CAOZdJXXB9zJWqC+kPq7ZDdzePtp8XNBnWcf5UmE8Ye50U-G7Dg@mail.gmail.com>

On 11/04/2011 07:08 PM, Tabi Timur-B04825 wrote:
> On Fri, Nov 4, 2011 at 7:39 AM, Zhao Chenhui <chenhui.zhao@freescale.com> wrote:
>> +       /* clear to enable clock in low power mode */
>> +       if (enable)
>> +               clrbits32(&pmc_regs->pmcdr, *pmcdr_mask);
>> +       else
>> +               setbits32(&pmc_regs->pmcdr, *pmcdr_mask);
> 
> You need to use be32_to_cpup() when dereferencing a pointer to a
> device tree property.

Or just use of_property_read_u32().
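
To make the endianness point concrete: device tree property cells are
stored big-endian, so a raw dereference only yields the right value on
a big-endian CPU. A portable user-space sketch of the conversion
follows. The helper name dt_cell_to_cpu() is made up for illustration;
in the kernel, the existing helpers mentioned above are the ones to use:

```c
#include <stdint.h>

/* Hypothetical helper: decode one big-endian 32-bit device tree cell. */
static uint32_t dt_cell_to_cpu(const void *cell)
{
	const uint8_t *p = cell;

	/* Assemble from bytes so the result is the same on any host. */
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```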

-Scott


* Re: [PATCH] route: fix ICMP secure_redirects
From: David Miller @ 2011-11-07 18:35 UTC (permalink / raw)
  To: fbl; +Cc: netdev
In-Reply-To: <1320680505-26367-1-git-send-email-fbl@redhat.com>

From: Flavio Leitner <fbl@redhat.com>
Date: Mon,  7 Nov 2011 13:41:45 -0200

> It should accept ICMP redirects from any host and not
> just from gateways when secure_redirects is disabled.
> 
> Signed-off-by: Flavio Leitner <fbl@redhat.com>

This is changing the default behavior, and could break things for people.

We have sort-of discussed this already, and agreed that the tests made in
this code before my inetpeer reworking had to be reinstated exactly as it
was.


* Re: [PATCH resend] MAINTAINERS/rds: update maintainer
From: David Miller @ 2011-11-07 18:28 UTC (permalink / raw)
  To: ogerlitz; +Cc: netdev
In-Reply-To: <alpine.LRH.2.00.1111071114280.20919@ogerlitz.voltaire.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Mon, 7 Nov 2011 11:39:49 +0200

> update for the actual maintainer
> 
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>

Applied.


* Re: [PATCH 3/3] wanrouter: Remove kernel_lock annotations
From: David Miller @ 2011-11-07 18:27 UTC (permalink / raw)
  To: richard; +Cc: netdev, linux-kernel
In-Reply-To: <1320654261-4473-1-git-send-email-richard@nod.at>

From: Richard Weinberger <richard@nod.at>
Date: Mon,  7 Nov 2011 09:24:21 +0100

> The BKL is gone, these annotations are useless.
> 
> Signed-off-by: Richard Weinberger <richard@nod.at>

Applied, thanks Richard.


* Re: [PATCH v2] usbnet: fix oops in usbnet_start_xmit
From: David Miller @ 2011-11-07 18:26 UTC (permalink / raw)
  To: richardcochran
  Cc: khlebnikov, oneukum, michael, alexey.orishko, netdev, devel
In-Reply-To: <20111107173919.GA2730@netboy.at.omicron.at>

From: Richard Cochran <richardcochran@gmail.com>
Date: Mon, 7 Nov 2011 18:39:19 +0100

> On Mon, Nov 07, 2011 at 06:54:58PM +0300, Konstantin Khlebnikov wrote:
>> This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
>> SKB can be NULL at this point, at least for cdc-ncm.
>> 
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> 
> Acked-by: Richard Cochran <richardcochran@gmail.com>

Applied, but the overall logic in the usbnet transmit path definitely
could use a good reconsideration and cleanup.

I'm even open to having generic support at the generic net device TX
level to fixup the segmentation layout of the SKB so that it meets
the device's requirements.  We can do it more efficiently there too.


* RE: [net-ext PATCH] ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined
From: Rose, Gregory V @ 2011-11-07 18:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org
In-Reply-To: <20111107.132340.124929090969671336.davem@davemloft.net>

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Monday, November 07, 2011 10:24 AM
> To: Rose, Gregory V
> Cc: netdev@vger.kernel.org
> Subject: Re: [net-ext PATCH] ixgbe: Fix compile for kernel without
> CONFIG_PCI_IOV defined
> 
> From: Greg Rose <gregory.v.rose@intel.com>
> Date: Mon, 07 Nov 2011 09:44:17 -0800
> 
> > Fix compiler errors and warnings with CONFIG_PCI_IOV defined and not
> > defined.
> >
> > Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
> 
> It's "net-next" not "net-ext" :-)
> 
> Applied, thanks.

Ugh... I'm really stepping in it deep.

Thanks Dave,

- Greg


* Re: [net-ext PATCH] ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined
From: David Miller @ 2011-11-07 18:23 UTC (permalink / raw)
  To: gregory.v.rose; +Cc: netdev
In-Reply-To: <20111107174417.8638.87569.stgit@gitlad.jf.intel.com>

From: Greg Rose <gregory.v.rose@intel.com>
Date: Mon, 07 Nov 2011 09:44:17 -0800

> Fix compiler errors and warnings with CONFIG_PCI_IOV defined and not
> defined.
> 
> Signed-off-by: Greg Rose <gregory.v.rose@intel.com>

It's "net-next" not "net-ext" :-)

Applied, thanks.


* RE: linux-next: build failure after merge of the origin tree
From: Rose, Gregory V @ 2011-11-07 17:46 UTC (permalink / raw)
  To: Rose, Gregory V, Kirsher, Jeffrey T, David Miller
  Cc: sfr@canb.auug.org.au, torvalds@linux-foundation.org,
	linux-next@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <43F901BD926A4E43B106BF17856F075501A1CCA399@orsmsx508.amr.corp.intel.com>

[-- Attachment #1: Type: text/plain, Size: 2308 bytes --]

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Rose, Gregory V
> Sent: Monday, November 07, 2011 8:47 AM
> To: Kirsher, Jeffrey T; David Miller
> Cc: sfr@canb.auug.org.au; torvalds@linux-foundation.org; linux-
> next@vger.kernel.org; linux-kernel@vger.kernel.org; netdev@vger.kernel.org
> Subject: RE: linux-next: build failure after merge of the origin tree
> 
> 
> 
> > -----Original Message-----
> > From: Kirsher, Jeffrey T
> > Sent: Sunday, November 06, 2011 9:30 PM
> > To: David Miller
> > Cc: sfr@canb.auug.org.au; torvalds@linux-foundation.org; linux-
> > next@vger.kernel.org; linux-kernel@vger.kernel.org; Rose, Gregory V;
> > netdev@vger.kernel.org
> > Subject: Re: linux-next: build failure after merge of the origin tree
> >
> >
> >
> > Cheers,
> > Jeff
> >
> > On Nov 6, 2011, at 19:38, "David Miller" <davem@davemloft.net> wrote:
> >
> > > From: Stephen Rothwell <sfr@canb.auug.org.au>
> > > Date: Mon, 7 Nov 2011 13:47:06 +1100
> > >
> > >>> If you just revert the commit in origin from -next, then you will
> get
> > >>> conflicts with you pull the net.git tree in.
> > >>
> > >> I got no conflicts when I merged in the net tree and can see no fix
> for
> > >> this problem in the net tree.  My current head of the net tree is
> > 1a6422f
> > >> "etherh: Add MAINTAINERS entry for etherh".
> > >
> > > Ok, Jeff please take a look at this and send me a fix soon.
> > >
> > > Thanks.
> >
> > Ok Dave, at this point, I am putting together a patch to revert this
> fix
> > since it appears that more trouble comes with this fix.  I will take a
> > look at it quickly before sending out a patch to fix the issue.
> 
> My bad...  I fixed a compiler warning that occurred with CONFIG_PCI_IOV
> turned on and didn't realize that my patch would cause an error when
> turning it back off.
> 
> I'll have it fixed ASAP.
> 
> - Greg

I have posted a fix for this problem to netdev and attached it to this email.

Again, my apologies for the mix up.

- Greg


> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: Type: message/rfc822, Size: 5342 bytes --]

From: "Rose, Gregory V" <gregory.v.rose@intel.com>
To: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: "davem@davemloft.net" <davem@davemloft.net>
Subject: [net-ext PATCH] ixgbe: Fix compile for kernel without CONFIG_PCI_IOV	defined
Date: Mon, 7 Nov 2011 09:44:17 -0800
Message-ID: <20111107174417.8638.87569.stgit@gitlad.jf.intel.com>

Fix compiler errors and warnings with CONFIG_PCI_IOV defined and not
defined.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c |    2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index db95731..00fcd39 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -442,12 +442,14 @@ static int ixgbe_set_vf_macvlan(struct ixgbe_adapter *adapter,

 int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter)
 {
+#ifdef CONFIG_PCI_IOV
        int i;
        for (i = 0; i < adapter->num_vfs; i++) {
                if (adapter->vfinfo[i].vfdev->dev_flags &
                                PCI_DEV_FLAGS_ASSIGNED)
                        return true;
        }
+#endif
        return false;
 }

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
index 4a5d889..df04f1a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
@@ -42,11 +42,11 @@ int ixgbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting);
 int ixgbe_ndo_get_vf_config(struct net_device *netdev,
                            int vf, struct ifla_vf_info *ivi);
 void ixgbe_check_vf_rate_limit(struct ixgbe_adapter *adapter);
-#ifdef CONFIG_PCI_IOV
 void ixgbe_disable_sriov(struct ixgbe_adapter *adapter);
+int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter);
+#ifdef CONFIG_PCI_IOV
 void ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
                        const struct ixgbe_info *ii);
-int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter);
 #endif





* [net-ext PATCH] ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined
From: Greg Rose @ 2011-11-07 17:44 UTC (permalink / raw)
  To: netdev; +Cc: davem

Fix compiler errors and warnings with CONFIG_PCI_IOV defined and not
defined.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
---

 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c |    2 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h |    4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index db95731..00fcd39 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -442,12 +442,14 @@ static int ixgbe_set_vf_macvlan(struct ixgbe_adapter *adapter,
 
 int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter)
 {
+#ifdef CONFIG_PCI_IOV
 	int i;
 	for (i = 0; i < adapter->num_vfs; i++) {
 		if (adapter->vfinfo[i].vfdev->dev_flags &
 				PCI_DEV_FLAGS_ASSIGNED)
 			return true;
 	}
+#endif
 	return false;
 }
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
index 4a5d889..df04f1a 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.h
@@ -42,11 +42,11 @@ int ixgbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting);
 int ixgbe_ndo_get_vf_config(struct net_device *netdev,
 			    int vf, struct ifla_vf_info *ivi);
 void ixgbe_check_vf_rate_limit(struct ixgbe_adapter *adapter);
-#ifdef CONFIG_PCI_IOV
 void ixgbe_disable_sriov(struct ixgbe_adapter *adapter);
+int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter);
+#ifdef CONFIG_PCI_IOV
 void ixgbe_enable_sriov(struct ixgbe_adapter *adapter,
 			const struct ixgbe_info *ii);
-int ixgbe_check_vf_assignment(struct ixgbe_adapter *adapter);
 #endif
 
 

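
The fix above keeps the prototype of ixgbe_check_vf_assignment() visible
regardless of configuration and compiles only the VF loop under
CONFIG_PCI_IOV, so callers need no #ifdef of their own. A minimal userspace
sketch of that pattern follows. It is not the driver code: struct adapter,
vf_assigned and HAVE_IOV are hypothetical stand-ins for the ixgbe adapter,
the PCI_DEV_FLAGS_ASSIGNED test and CONFIG_PCI_IOV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-adapter state. */
struct adapter {
	int num_vfs;
	const bool *vf_assigned;	/* one flag per VF; may be NULL if num_vfs == 0 */
};

/*
 * The prototype is visible no matter how the build is configured,
 * so callers never need their own #ifdef.
 */
bool check_vf_assignment(const struct adapter *adapter);

bool check_vf_assignment(const struct adapter *adapter)
{
	assert(adapter != NULL);
#ifdef HAVE_IOV	/* hypothetical stand-in for CONFIG_PCI_IOV */
	for (int i = 0; i < adapter->num_vfs; i++) {
		if (adapter->vf_assigned[i])
			return true;
	}
#endif
	/* With IOV compiled out, or with no VF flagged, nothing is assigned. */
	return false;
}
```

Built without -DHAVE_IOV the function always returns false; built with it,
the loop reports whether any VF flag is set.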

* Re: [PATCH v2] usbnet: fix oops in usbnet_start_xmit
From: Richard Cochran @ 2011-11-07 17:39 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Oliver Neukum, Michael Riesch, Alexey Orishko, netdev,
	David S. Miller, devel
In-Reply-To: <20111107145458.29997.79829.stgit@zurg>

On Mon, Nov 07, 2011 at 06:54:58PM +0300, Konstantin Khlebnikov wrote:
> This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e:
> the SKB can be NULL at this point, at least for cdc-ncm.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

Acked-by: Richard Cochran <richardcochran@gmail.com>

> ---
>  drivers/net/usb/usbnet.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
> index 7d60821..fae0fbd 100644
> --- a/drivers/net/usb/usbnet.c
> +++ b/drivers/net/usb/usbnet.c
> @@ -1057,7 +1057,8 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
>  	unsigned long		flags;
>  	int retval;
>  
> -	skb_tx_timestamp(skb);
> +	if (skb)
> +		skb_tx_timestamp(skb);
>  
>  	// some devices want funky USB-level framing, for
>  	// win32 driver (usually) and/or hardware quirks
> 

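
The one-line fix above guards a helper that dereferences its argument, because
the transmit entry point can legitimately be called with a NULL buffer. A
minimal userspace sketch of that guard follows. The names struct buf,
tx_timestamp() and start_xmit() are hypothetical stand-ins for struct sk_buff,
skb_tx_timestamp() and usbnet_start_xmit(); the return values are invented for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct buf {
	int timestamped;
};

/* Stand-in for skb_tx_timestamp(): requires a non-NULL buffer. */
static void tx_timestamp(struct buf *b)
{
	assert(b != NULL);
	b->timestamped = 1;
}

/*
 * Model of the fixed transmit entry point: NULL is a legal input, so
 * the helper call is guarded.
 * Returns 1 if a buffer was timestamped, 0 if the call carried none.
 */
int start_xmit(struct buf *b)
{
	if (b) {
		tx_timestamp(b);
		return 1;
	}
	return 0;
}
```

Calling start_xmit(NULL) returns 0 without touching memory, while a real
buffer is timestamped as before.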

* Re: data corruption in skge hardware
From: Mikulas Patocka @ 2011-11-07 17:34 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Stephen Hemminger, netdev
In-Reply-To: <20111107091327.79a8c6da@nehalam.linuxnetplumber.net>



On Mon, 7 Nov 2011, Stephen Hemminger wrote:

> On Mon, 7 Nov 2011 11:42:11 -0500 (EST)
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > Hi
> > 
> > I found data corruption with an skge network card.
> > 
> > The card is this: "03:06.0 Ethernet controller: 3Com Corporation 3c940 
> > 10/100/1000Base-T [Marvell] (rev 10)"
> > 
> > The machine is two quad core Opterons with HT2000 north bridge and HT1000 
> > south bridge.
> > 
> > When "scatter-gather" and "generic-segmentation-offload" are enabled, the 
> > card sends out corrupted packets.
> > 
> > It normally manifests as an ssh connection drop once every few days, but I
> > found a workload that triggers this bug quickly.
> > 
> > I ran tcpdump on both the sending and receiving machines and caught the
> > packet corruption:
> > 
> > correct packet (on the sending machine):
> > 19:03:21.131836 IP hydra.ssh > phoebe.58913: Flags [P.], seq 53712:53808, 
> > ack 1, win 193, options [nop,nop,TS val 8677173 ecr 1211608], length 96
> >         0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
> >         0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
> >         0x0020:  8018 00c1 81ed 0000 0101 080a 0084 6735
> >         0x0030:  0012 7cd8 4301 4af9 87c9 d2b4 8ba6 aedb
> >         0x0040:  0572 1738 93db 789c 634b 4386 d013 db27
> >         0x0050:  258b 6fa6 743c d429 a5e1 162f 2721 19bf
> >         0x0060:  6669 a5c3 6bea 89ec a635 b8b4 8727 38c1
> >         0x0070:  139f 5989 781b 49dd 79f5 4dfe 78ac ecb0
> >         0x0080:  546c 33e0 0953 04bc 0647 a9d4 2fc4 cba0
> >         0x0090:  44b2 3b01
> > 
> > incorrect packet (on the receiving machine):
> > 19:03:21.133174 IP hydra.ssh > phoebe.58913: Flags [P.], seq 53712:53808, 
> > ack 1, win 193, options [nop,nop,TS val 8677173 ecr 1211608], length 96
> >         0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
> >         0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
> >         0x0020:  8018 00c1 6aa4 0000 0101 080a 0084 6735
> >         0x0030:  0012 7cd8 0000 0000 0000 0000 0010 0000
> >         0x0040:  0000 0000 0000 0000 0000 0000 0000 0000
> >         0x0050:  0000 0000 0000 0000 0000 00c0 dc92 4702
> >         0x0060:  88ff ff00 0000 0000 0000 0000 0000 0000
> >         0x0070:  0000 0000 0000 0000 0000 0000 0000 0000
> >         0x0080:  0000 0000 0000 0000 0000 0000 0000 0000
> >         0x0090:  0000 00e0
> > 
> > Obviously, scatter-gather doesn't work: the header is correct, but the
> > packet body was likely read from random memory.
> > 
> > I tried to use "clflush" instruction on the transmit descriptor and the 
> > packet body to test if it is a cache-coherency issue, but the corruption 
> > was still there.
> > 
> > I tried to limit memory to 2G to test if it was a problem with high 
> > memory, but the corruption was still there.
> > 
> > I tried older kernels (as far back as 2.6.34); the corruption was still
> > there, but it took much more time to trigger it with old kernels.
> > 
> > 
> > Do you have other reports of data corruption with skge hardware? Shouldn't 
> > the driver set "scatter-gather" off by default because it is unreliable?
> 
> No reports of problems.
> Scatter-gather is used all the time by normal TCP connections.
> I suspect something different because of the IOMMU and separate sockets.

This card has 64-bit addressing, so it doesn't use the IOMMU. Or does it?
Anyway, when I booted with 2G of RAM, the IOMMU was disabled and the corruption
was still there.

Mikulas

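
The two tcpdump captures quoted above can be compared mechanically. The sketch
below embeds the first 64 bytes of each dump, derives the TCP payload offset
from the IPv4 and TCP header-length fields, and counts differing bytes before
and after that offset. The helper names payload_offset() and count_diffs() are
made up for this illustration, and only the first four rows of each dump are
included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First 64 bytes as captured on the sending machine. */
const uint8_t sent_pkt[64] = {
	0x45, 0x10, 0x00, 0x94, 0xc7, 0xbf, 0x40, 0x00,
	0x40, 0x06, 0xf1, 0x2d, 0xc0, 0xa8, 0x80, 0x07,
	0xc0, 0xa8, 0x80, 0x0e, 0x00, 0x16, 0xe6, 0x21,
	0x2d, 0x64, 0x84, 0xe6, 0x1f, 0xc2, 0x3f, 0x5b,
	0x80, 0x18, 0x00, 0xc1, 0x81, 0xed, 0x00, 0x00,
	0x01, 0x01, 0x08, 0x0a, 0x00, 0x84, 0x67, 0x35,
	0x00, 0x12, 0x7c, 0xd8, 0x43, 0x01, 0x4a, 0xf9,
	0x87, 0xc9, 0xd2, 0xb4, 0x8b, 0xa6, 0xae, 0xdb,
};

/* First 64 bytes as captured on the receiving machine. */
const uint8_t recv_pkt[64] = {
	0x45, 0x10, 0x00, 0x94, 0xc7, 0xbf, 0x40, 0x00,
	0x40, 0x06, 0xf1, 0x2d, 0xc0, 0xa8, 0x80, 0x07,
	0xc0, 0xa8, 0x80, 0x0e, 0x00, 0x16, 0xe6, 0x21,
	0x2d, 0x64, 0x84, 0xe6, 0x1f, 0xc2, 0x3f, 0x5b,
	0x80, 0x18, 0x00, 0xc1, 0x6a, 0xa4, 0x00, 0x00,
	0x01, 0x01, 0x08, 0x0a, 0x00, 0x84, 0x67, 0x35,
	0x00, 0x12, 0x7c, 0xd8, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
};

/* Offset of the first TCP payload byte in an IPv4/TCP packet. */
size_t payload_offset(const uint8_t *pkt)
{
	assert(pkt != NULL);
	size_t ip_hlen = (size_t)(pkt[0] & 0x0f) * 4;		/* IHL, 32-bit words */
	size_t tcp_hlen = (size_t)(pkt[ip_hlen + 12] >> 4) * 4;	/* TCP data offset */
	return ip_hlen + tcp_hlen;
}

/*
 * Count differing bytes in [from, to). *first receives the offset of
 * the first difference, or `to` when the ranges are identical.
 */
size_t count_diffs(const uint8_t *a, const uint8_t *b,
		   size_t from, size_t to, size_t *first)
{
	size_t n = 0;

	assert(a && b && first && from <= to);
	*first = to;
	for (size_t i = from; i < to; i++) {
		if (a[i] != b[i]) {
			if (n == 0)
				*first = i;
			n++;
		}
	}
	return n;
}
```

On these bytes the payload starts at offset 52 (0x34). The only header bytes
that differ are the two TCP checksum bytes at 0x24, and every payload byte in
the embedded range differs, which matches the observation that the header is
intact while the body is not.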

* RE: [PATCH v5 04/10] per-cgroup tcp buffers control
From: Glauber Costa @ 2011-11-07 17:28 UTC (permalink / raw)
  To: Glauber Costa, linux-kernel@vger.kernel.org
  Cc: paul@paulmenage.org, lizf@cn.fujitsu.com,
	kamezawa.hiroyu@jp.fujitsu.com, ebiederm@xmission.com,
	davem@davemloft.net, gthelen@google.com, netdev@vger.kernel.org,
	linux-mm@kvack.org, kirill@shutemov.name, Andrey Vagin,
	devel@openvz.org, eric.dumazet@gmail.com, Glauber Costa,
	kamezawa.hiroyu@jp.fujtisu.com

Ok, I forgot to change the temporary name I was using for the jump label. Shame on me :)

--- Original Message ---

From: Glauber Costa <glommer@parallels.com>
Sent: November 7, 2011
To: linux-kernel@vger.kernel.org
Cc: paul@paulmenage.org, lizf@cn.fujitsu.com, kamezawa.hiroyu@jp.fujitsu.com, ebiederm@xmission.com, davem@davemloft.net, gthelen@google.com, netdev@vger.kernel.org, linux-mm@kvack.org, kirill@shutemov.name, Andrey Vagin <avagin@parallels.com>, devel@openvz.org, eric.dumazet@gmail.com, Glauber Costa <glommer@parallels.com>, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Subject: [PATCH v5 04/10] per-cgroup tcp buffers control

With all the infrastructure in place, this patch implements
per-cgroup control for tcp memory pressure handling.

A resource counter is used to control allocated memory, except
for the root cgroup, which will keep using global counters.

This patch is the one that actually enables/disables the
jump labels controlling cgroup. Up to this point, they were always
disabled.

Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
CC: David S. Miller <davem@davemloft.net>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/tcp.h       |   18 +++++++
 include/net/transp_v6.h |    1 +
 mm/memcontrol.c         |  125 ++++++++++++++++++++++++++++++++++++++++++++++-
 net/core/sock.c         |   46 +++++++++++++++--
 net/ipv4/af_inet.c      |    3 +
 net/ipv4/tcp_ipv4.c     |   12 +++++
 net/ipv6/af_inet6.c     |    3 +
 net/ipv6/tcp_ipv6.c     |   10 ++++
 8 files changed, 211 insertions(+), 7 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index ccaa3b6..7301ca8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -253,6 +253,22 @@ extern int sysctl_tcp_cookie_size;
 extern int sysctl_tcp_thin_linear_timeouts;
 extern int sysctl_tcp_thin_dupack;
 
+struct tcp_memcontrol {
+	/* per-cgroup tcp memory pressure knobs */
+	struct res_counter tcp_memory_allocated;
+	struct percpu_counter tcp_sockets_allocated;
+	/* those two are read-mostly, leave them at the end */
+	long tcp_prot_mem[3];
+	int tcp_memory_pressure;
+};
+
+long *sysctl_mem_tcp(struct mem_cgroup *memcg);
+struct percpu_counter *sockets_allocated_tcp(struct mem_cgroup *memcg);
+int *memory_pressure_tcp(struct mem_cgroup *memcg);
+struct res_counter *memory_allocated_tcp(struct mem_cgroup *memcg);
+int tcp_init_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss);
+void tcp_destroy_cgroup(struct cgroup *cgrp, struct cgroup_subsys *ss);
+
 extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
 extern int tcp_memory_pressure;
@@ -305,6 +321,7 @@ static inline int tcp_synq_no_recent_overflow(const struct sock *sk)
 }
 
 extern struct proto tcp_prot;
+extern struct cg_proto tcp_cg_prot;
 
 #define TCP_INC_STATS(net, field)	SNMP_INC_STATS((net)->mib.tcp_statistics, field)
 #define TCP_INC_STATS_BH(net, field)	SNMP_INC_STATS_BH((net)->mib.tcp_statistics, field)
@@ -1022,6 +1039,7 @@ static inline void tcp_openreq_init(struct request_sock *req,
 	ireq->loc_port = tcp_hdr(skb)->dest;
 }
 
+extern void tcp_enter_memory_pressure_cg(struct sock *sk);
 extern void tcp_enter_memory_pressure(struct sock *sk);
 
 static inline int keepalive_intvl_when(const struct tcp_sock *tp)
diff --git a/include/net/transp_v6.h b/include/net/transp_v6.h
index 498433d..1e18849 100644
--- a/include/net/transp_v6.h
+++ b/include/net/transp_v6.h
@@ -11,6 +11,7 @@ extern struct proto rawv6_prot;
 extern struct proto udpv6_prot;
 extern struct proto udplitev6_prot;
 extern struct proto tcpv6_prot;
+extern struct cg_proto tcpv6_cg_prot;
 
 struct flowi6;
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7d684d0..f14d7d2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -49,6 +49,9 @@
 #include <linux/cpu.h>
 #include <linux/oom.h>
 #include "internal.h"
+#ifdef CONFIG_INET
+#include <net/tcp.h>
+#endif
 
 #include <asm/uaccess.h>
 
@@ -294,6 +297,10 @@ struct mem_cgroup {
 	 */
 	struct mem_cgroup_stat_cpu nocpu_base;
 	spinlock_t pcp_counter_lock;
+
+#ifdef CONFIG_INET
+	struct tcp_memcontrol tcp;
+#endif
 };
 
 /* Stuffs for move charges at task migration. */
@@ -377,7 +384,7 @@ enum mem_type {
 #define MEM_CGROUP_RECLAIM_SOFT		(1 << MEM_CGROUP_RECLAIM_SOFT_BIT)
 
 static struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg);
-
+static struct mem_cgroup *mem_cgroup_from_cont(struct cgroup *cont);
 static inline bool mem_cgroup_is_root(struct mem_cgroup *mem)
 {
 	return (mem == root_mem_cgroup);
@@ -387,6 +394,7 @@ static inline bool mem_cgroup_is_root(struct mem_cgroup *mem)
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM
 #ifdef CONFIG_INET
 #include <net/sock.h>
+#include <net/ip.h>
 
 void sock_update_memcg(struct sock *sk)
 {
@@ -451,6 +459,93 @@ u64 memcg_memory_allocated_read(struct mem_cgroup *memcg, struct cg_proto *prot)
 				    RES_USAGE) >> PAGE_SHIFT ;
 }
 EXPORT_SYMBOL(memcg_memory_allocated_read);
+/*
+ * Pressure flag: try to collapse.
+ * Technical note: it is used by multiple contexts non atomically.
+ * All the __sk_mem_schedule() is of this nature: accounting
+ * is strict, actions are advisory and have some latency.
+ */
+void tcp_enter_memory_pressure_cg(struct sock *sk)
+{
+	struct mem_cgroup *memcg = sk->sk_cgrp;
+	if (!memcg->tcp.tcp_memory_pressure) {
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMEMORYPRESSURES);
+		memcg->tcp.tcp_memory_pressure = 1;
+	}
+}
+EXPORT_SYMBOL(tcp_enter_memory_pressure_cg);
+
+long *sysctl_mem_tcp(struct mem_cgroup *memcg)
+{
+	return memcg

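
The patch description above combines a per-cgroup resource counter with a
memory pressure flag that is raised at most once per episode, as in
tcp_enter_memory_pressure_cg(). The following userspace sketch models just
those two ideas. It does not use the kernel's res_counter API: struct counter,
struct group, group_charge() and group_enter_pressure() are hypothetical
names, and the root cgroup's fallback to global counters is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct res_counter: usage bounded by a limit. */
struct counter {
	long usage;
	long limit;
};

/* Hypothetical stand-in for the per-cgroup state in struct tcp_memcontrol. */
struct group {
	struct counter mem;
	int pressure;		/* models tcp_memory_pressure */
	int pressure_events;	/* models the TCPMEMORYPRESSURES statistic */
};

/* Charge `pages` to the group; refuse if that would exceed the limit. */
bool group_charge(struct group *g, long pages)
{
	assert(g != NULL && pages >= 0);
	if (g->mem.usage + pages > g->mem.limit)
		return false;
	g->mem.usage += pages;
	return true;
}

/* Enter memory pressure; the statistic is bumped only on the 0 -> 1 edge. */
void group_enter_pressure(struct group *g)
{
	assert(g != NULL);
	if (!g->pressure) {
		g->pressure_events++;
		g->pressure = 1;
	}
}
```

A caller would typically try group_charge() first and call
group_enter_pressure() when the charge is refused; repeated calls while the
flag is already set leave the event count unchanged.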

* Re: data corruption in skge hardware
From: Stephen Hemminger @ 2011-11-07 17:13 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: Stephen Hemminger, netdev
In-Reply-To: <Pine.LNX.4.64.1111071109410.18030@hs20-bc2-1.build.redhat.com>

On Mon, 7 Nov 2011 11:42:11 -0500 (EST)
Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi
> 
> I found data corruption with an skge network card.
> 
> The card is this: "03:06.0 Ethernet controller: 3Com Corporation 3c940 
> 10/100/1000Base-T [Marvell] (rev 10)"
> 
> The machine is two quad core Opterons with HT2000 north bridge and HT1000 
> south bridge.
> 
> When "scatter-gather" and "generic-segmentation-offload" are enabled, the 
> card sends out corrupted packets.
> 
> It normally manifests as an ssh connection drop once every few days, but I
> found a workload that triggers this bug quickly.
> 
> I ran tcpdump on both the sending and receiving machines and caught the
> packet corruption:
> 
> correct packet (on the sending machine):
> 19:03:21.131836 IP hydra.ssh > phoebe.58913: Flags [P.], seq 53712:53808, 
> ack 1, win 193, options [nop,nop,TS val 8677173 ecr 1211608], length 96
>         0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
>         0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
>         0x0020:  8018 00c1 81ed 0000 0101 080a 0084 6735
>         0x0030:  0012 7cd8 4301 4af9 87c9 d2b4 8ba6 aedb
>         0x0040:  0572 1738 93db 789c 634b 4386 d013 db27
>         0x0050:  258b 6fa6 743c d429 a5e1 162f 2721 19bf
>         0x0060:  6669 a5c3 6bea 89ec a635 b8b4 8727 38c1
>         0x0070:  139f 5989 781b 49dd 79f5 4dfe 78ac ecb0
>         0x0080:  546c 33e0 0953 04bc 0647 a9d4 2fc4 cba0
>         0x0090:  44b2 3b01
> 
> incorrect packet (on the receiving machine):
> 19:03:21.133174 IP hydra.ssh > phoebe.58913: Flags [P.], seq 53712:53808, 
> ack 1, win 193, options [nop,nop,TS val 8677173 ecr 1211608], length 96
>         0x0000:  4510 0094 c7bf 4000 4006 f12d c0a8 8007
>         0x0010:  c0a8 800e 0016 e621 2d64 84e6 1fc2 3f5b
>         0x0020:  8018 00c1 6aa4 0000 0101 080a 0084 6735
>         0x0030:  0012 7cd8 0000 0000 0000 0000 0010 0000
>         0x0040:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x0050:  0000 0000 0000 0000 0000 00c0 dc92 4702
>         0x0060:  88ff ff00 0000 0000 0000 0000 0000 0000
>         0x0070:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x0080:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x0090:  0000 00e0
> 
> Obviously, scatter-gather doesn't work: the header is correct, but the
> packet body was likely read from random memory.
> 
> I tried to use "clflush" instruction on the transmit descriptor and the 
> packet body to test if it is a cache-coherency issue, but the corruption 
> was still there.
> 
> I tried to limit memory to 2G to test if it was a problem with high 
> memory, but the corruption was still there.
> 
> I tried older kernels (as far back as 2.6.34); the corruption was still
> there, but it took much more time to trigger it with old kernels.
> 
> 
> Do you have other reports of data corruption with skge hardware? Shouldn't 
> the driver set "scatter-gather" off by default because it is unreliable?

No reports of problems.
Scatter-gather is used all the time by normal TCP connections.
I suspect something different because of the IOMMU and separate sockets.
