* Re: [PATCH] xen-netfront: report link speed to ethtool
From: David Miller @ 2011-11-18 19:11 UTC
To: bhutchings; +Cc: olaf, netdev, xen-devel, jeremy.fitzhardinge, konrad.wilk
In-Reply-To: <1321638394.2883.32.camel@bwh-desktop>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Fri, 18 Nov 2011 17:46:34 +0000
> On Fri, 2011-11-18 at 17:48 +0100, Olaf Hering wrote:
>> Add .get_settings function, return fake data so that ethtool can get
>> enough information. For some applications, such as VCS, this is useful;
>> otherwise some application logic will panic.
>> The reported data refers to VMware vmxnet.
>>
>> Signed-off-by: Xin Wei Hu <xwhu@suse.com>
>> Signed-off-by: Chunyan Liu <cyliu@suse.com>
>> Signed-off-by: Olaf Hering <olaf@aepfle.de>
>
> NAK, we should not just make things up.
Agreed, if you cannot determine the values with certainty do not
implement this method.
Fix the tools which cannot function without this information.
* Re: NFS TCP race condition with SOCK_ASYNC_NOSPACE
From: Trond Myklebust @ 2011-11-18 19:14 UTC
To: Andrew Cooper
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <4EC6AC47.60404-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
On Fri, 2011-11-18 at 19:04 +0000, Andrew Cooper wrote:
> On 18/11/11 18:52, Trond Myklebust wrote:
> > On Fri, 2011-11-18 at 18:40 +0000, Andrew Cooper wrote:
> >> Hello,
> >>
> >> As described originally in
> >> http://www.spinics.net/lists/linux-nfs/msg25314.html, we were
> >> encountering a bug whereby the NFS session was unexpectedly timing out.
> >>
> >> I believe I have found the source of the race condition causing the timeout.
> >>
> >> Brief overview of setup:
> >> 10GiB network, NFS mounted using TCP. Problem reproduces with
> >> multiple different NICs, with synchronous or asynchronous mounts, and
> >> with soft and hard mounts. Reproduces on 2.6.32 and I am currently
> >> trying to reproduce with mainline. (I don't have physical access to the
> >> servers so installing stuff is not fantastically easy)
> >>
> >>
> >>
> >> In net/sunrpc/xprtsock.c:xs_tcp_send_request(), we try to write data to
> >> the sock buffer using xs_sendpages()
> >>
> >> When the sock buffer is nearly full, we get an EAGAIN from
> >> xs_sendpages() which causes a break out of the loop. Lower down the
> >> function, we switch on status which causes us to call xs_nospace() with
> >> the task.
> >>
> >> In xs_nospace(), we test the SOCK_ASYNC_NOSPACE bit from the socket, and
> >> in the rare case where that bit is clear, we return 0 instead of
> >> EAGAIN. This promptly overwrites status in xs_tcp_send_request().
> >>
> >> The result is that xs_tcp_release_xprt() finds a request which has no
> >> error, but has not sent all of the bytes in its send buffer. It cleans
> >> up by setting XPRT_CLOSE_WAIT which causes xprt_clear_locked() to queue
> >> xprt->task_cleanup, which closes the TCP connection.
> >>
> >>
> >> Under normal operation, the TCP connection goes down and back up without
> >> interruption to the NFS layer. However, when the NFS server hangs in a
> >> half closed state, the client forces a RST of the TCP connection,
> >> leading to the timeout.
> >>
> >> I have tried a few naive fixes such as changing the default return value
> >> in xs_nospace() from 0 to -EAGAIN (meaning that 0 will never be
> >> returned) but this causes a kernel memory leak. Can someone who has a
> >> better understanding of these interactions than me have a look? It
> >> seems that the if (test_bit()) test in xs_nospace() should have an else
> >> clause.
> > I fully agree with your analysis. The correct thing to do here is to
> > always return either EAGAIN or ENOTCONN. Thank you very much for working
> > this one out!
> >
> > Trond
>
> Returning EAGAIN seems to cause a kernel memory leak, as the oomkiller
> starts going after processes holding large amounts of LowMem. Returning
The EAGAIN should trigger a retry of the send.
> ENOTCONN causes the NFS session to complain about a timeout in the logs,
> and in the case of a soft mount, give an EIO to the calling process.
Correct. ENOTCONN means that the connection was lost.
> From the looks of the TCP stream, and from the looks of some
> targeted debugging, nothing is actually wrong, so the client should not
> be trying to FIN the TCP connection. Is it possible that there is a
> more sinister reason for SOCK_ASYNC_NOSPACE being clear?
Normally, it means that we're out of the out-of-write-buffer condition
that caused the socket to fail (i.e. the socket has made progress
sending more data, so that we can now resume sending more). Returning
EAGAIN in that condition is correct.
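The return-value problem discussed above can be shown with a toy user-space
model. This is only a sketch built from the description in this thread, not
the code in net/sunrpc/xprtsock.c; every toy_* name is made up for
illustration, and the model says nothing about the memory leak reported when
-EAGAIN is returned:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the transport state; not a kernel structure. */
struct toy_xprt {
    bool connected;
    bool async_nospace;   /* models the SOCK_ASYNC_NOSPACE bit */
};

/* Behaviour as reported: 0 is returned when the bit is clear. */
static int toy_nospace_old(const struct toy_xprt *x)
{
    int ret = 0;

    if (!x->connected)
        return -ENOTCONN;
    if (x->async_nospace)
        ret = -EAGAIN;    /* the real code would also wait for write space */
    return ret;
}

/* Contract stated in the reply: always -EAGAIN or -ENOTCONN. */
static int toy_nospace_fixed(const struct toy_xprt *x)
{
    return x->connected ? -EAGAIN : -ENOTCONN;
}

/* Tail of the send path: a short write gave -EAGAIN, then the
 * nospace handler's return value overwrites status. */
static int toy_send_status(const struct toy_xprt *x,
                           int (*nospace)(const struct toy_xprt *))
{
    int status = -EAGAIN;

    assert(x != NULL && nospace != NULL);
    if (status == -EAGAIN)
        status = nospace(x);
    return status;
}
```

With the bit clear on a connected transport, the old handler yields 0, so the
caller sees "no error" for a partially sent request; the fixed handler yields
-EAGAIN, which asks for a retry.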
> I can attempt to find which of the many calls to clear that bit is
> actually causing the problem, but I have a feeling that is going to be a
> little more tricky to narrow down.
>
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
www.netapp.com
* Re: [PATCH] xen-netfront: report link speed to ethtool
From: Olaf Hering @ 2011-11-18 19:17 UTC
To: Ben Hutchings
Cc: netdev, xen-devel, Jeremy Fitzhardinge, Konrad Rzeszutek Wilk
In-Reply-To: <1321643431.2883.39.camel@bwh-desktop>
On Fri, Nov 18, Ben Hutchings wrote:
> On Fri, 2011-11-18 at 19:43 +0100, Olaf Hering wrote:
> > On Fri, Nov 18, Ben Hutchings wrote:
> >
> > > On Fri, 2011-11-18 at 17:48 +0100, Olaf Hering wrote:
> > > > The reported data refers to VMware vmxnet.
> > > NAK, we should not just make things up.
> >
> > So how about removing veth_get_settings, vmxnet3_get_settings,
> > tun_get_settings and other functions that escaped my grep?
>
> If they can't provide meaningful information then maybe they should be
> removed. However, that could result in a regression for existing
> working configurations. (This isn't the same as the case you're trying
> to fix, since those applications have never worked with xen-netfront or
> many other drivers that don't implement get_settings.)
That may be.
How about a new generic ethtool_op_get_settings_veth which returns fake
values for all relevant drivers (virtio, xen-netfront, and the ones
listed above)?
Olaf
* Re: b43: BCM 4331: MacBook 8,1: No connection after suspend
From: John W. Linville @ 2011-11-18 19:22 UTC
To: Nico -telmich- Schottelius, LKML, netdev, Arend van Spriel
Cc: b43-dev, linux-wireless, Rafał Miłecki, Larry Finger,
Michael Buesch
In-Reply-To: <20111118173242.GA2101@schottelius.org>
Arend has nothing to do with b43, and b43 has its own
mailing list, b43-dev@lists.infradead.org. Even if it didn't,
linux-wireless@vger.kernel.org would be a more appropriate list than
this one.
John
On Fri, Nov 18, 2011 at 06:32:42PM +0100, Nico -telmich- Schottelius wrote:
> Hello,
>
> new notebook, new problems (*):
>
> Running 3.2.0-rc1 on the MacBook Pro 8,1 with the BCM4331
> (14e4:4331) the WLAN indeed works:
>
> [ 86.231702] b43-phy0: Broadcom 4331 WLAN found (core revision 29)
> [ 86.269190] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> [ 86.270486] Broadcom 43xx driver loaded [ Features: PMNLS ]
> [ 87.677265] b43-phy0: Loading firmware version 666.2 (2011-02-23 01:15:07)
>
> But: After a suspend it seems not to receive any packets anymore:
>
> [ 2334.494845] wlan0: authenticate with 64:87:d7:37:89:89 (try 1)
> [ 2334.694035] wlan0: authenticate with 64:87:d7:37:89:89 (try 2)
> [ 2334.893909] wlan0: authenticate with 64:87:d7:37:89:89 (try 3)
> [ 2335.093824] wlan0: authentication with 64:87:d7:37:89:89 timed out
>
> wpa_supplicant thus retries to connect to the network again and again
> without success.
>
> This is reproducible on the MBP 8,1 and is *NOT* fixed if I unload
> b43 and modprobe it again. It is also "not fixed" when doing
> multiple suspends/resumes.
>
> Any pointers to this?
>
> Cheers,
>
> Nico
>
>
> (*) feels like in good old times...
>
> --
> PGP key: 7ED9 F7D3 6B10 81D7 0EC5 5C09 D7DC C8E4 3187 7DF0
--
John W. Linville                Someday the world will need a hero, and you
linville@tuxdriver.com          might be all we have.  Be ready.
* Re: Should "N/A" dust bunnies be swept from fw_version?
From: Rick Jones @ 2011-11-18 19:23 UTC
To: David Miller; +Cc: netdev
In-Reply-To: <20111118.141046.1309959676401370888.davem@davemloft.net>
On 11/18/2011 11:10 AM, David Miller wrote:
> From: Rick Jones <rick.jones2@hp.com>
> Date: Fri, 18 Nov 2011 11:09:48 -0800
>
>> On 11/17/2011 04:19 PM, Ben Hutchings wrote:
>>> On Thu, 2011-11-17 at 15:27 -0800, Rick Jones wrote:
>>>> In the discussion on "enable virtio_net to return bus_info in ethtool -i
>>>> consistent with emulated NICs" Ben Hutchings had the following feedback
>>>> on what might go into bus_info:
>>>>
>>>>> Please use the existing 'not implemented' value, which is the empty
>>>>> string. If you think ethtool should print some helpful message instead
>>>>> of an empty string, please submit a patch for ethtool.
>>>>
>>>> When I was sweeping in the .get_drvinfo routines, I noticed many drivers
>>>> would return "N/A" for fw_version - presumably they were drivers for
>>>> cards without firmware. Should those be removed to have the fw_version
>>>> be the empty string, or should those sleeping dust bunnies be allowed to
>>>> lie?
>>>
>>> I much prefer the empty string; the ethtool utility can turn that into a
>>> user-friendly placeholder if it's considered confusing.
>>
>> Any other opinions out there? Anyone? Anyone?-)
>
> I agree with Ben, just provide the empty string.
OK, when I have time to pick-up my broom, I'll do some additional sweeping.
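The split Ben describes (drivers report the empty string, the utility decides
what to show) could look roughly like this on the display side. This is a
sketch only: toy_display_field() is a made-up name, and the "N/A" placeholder
text is an assumption, not what ethtool actually prints:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical display helper: map the "not implemented" value (the empty
 * string) to a placeholder chosen by the tool, not by each driver. */
static const char *toy_display_field(const char *value)
{
    assert(value != NULL);
    return value[0] != '\0' ? value : "N/A";
}
```

The point of the design is that the placeholder lives in one place, so it can
be changed or localized without touching any driver.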
rick
* Re: Occasional oops with IPSec and IPv6.
From: Nick Bowler @ 2011-11-18 19:26 UTC
To: Timo Teräs; +Cc: Eric Dumazet, netdev, David S. Miller
In-Reply-To: <4EC6A38E.6060404@iki.fi>
On 2011-11-18 20:27 +0200, Timo Teräs wrote:
> On 11/18/2011 06:39 PM, Eric Dumazet wrote:
> > Le vendredi 18 novembre 2011 à 11:27 -0500, Nick Bowler a écrit :
> >> On 2011-11-17 14:09 -0500, Nick Bowler wrote:
> >>> One of the tests we do with IPsec involves sending and receiving UDP
> >>> datagrams of all sizes from 1 to N bytes, where N is much larger than
> >>> the MTU. In this particular instance, the MTU is 1500 bytes and N is
> >>> 10000 bytes. This test works fine with IPv4, but I'm getting an
> >>> occasional oops on Linus' master with IPv6 (output at end of email). We
> >>> also run the same test where N is less than the MTU, and it does not
> >>> trigger this issue. The resulting fallout seems to eventually lock up
> >>> the box (although it continues to work for a little while afterwards).
> >>>
> >>> The issue appears timing related, and it doesn't always occur. This
> >>> probably also explains why I've not seen this issue before now, as we
> >>> recently upgraded all our lab systems to machines from this century
> >>> (with newfangled dual core processors). This also makes it somewhat
> >>> hard to reproduce, but I can trigger it pretty reliably by running 'yes'
> >>> in an ssh session (which doesn't use IPsec) while running the test:
> >>> it'll usually trigger in 2 or 3 runs. The choice of cipher suite
> >>> appears to be irrelevant.
[...]
> > Please note commit 80c802f307 added a known bug, fixed in commit
> > 0b150932197b (xfrm: avoid possible oopse in xfrm_alloc_dst)
> >
> > Given commit 80c802f307 complexity, we can assume other bugs are to be
> > fixed as well.
[...]
> This looks quite different. And I've been trying to figure out what
> causes this. However, the OOPS happens at ip6_fragment(), indicating
> that there was not enough allocated headroom (skb underrun). My initial
> thought is ipv6 bug that just got uncovered by my commit; especially
> since ipv4 side is happy. But I haven't yet been able to figure this one
> out.
>
> Could you also try Herbert's latest patch set:
> [0/6] Replace LL_ALLOCATED_SPACE to allow needed_headroom adjustment
>
> This changes how the headroom is calculated, and *might* fix this issue
> too if it's caused by the same SMP race condition which got uncovered by
> my other commit earlier.
I applied all six of those patches, but I still see a crash. However,
the call trace seems to be slightly different. I've appended the trace
from the run with these patches applied, just in case it's significant.
NOTE: I did not carefully look at the traces of all the crashes I've
triggered. This particular backtrace could potentially have appeared
before applying these patches and I would not have noticed.
[ 45.318137] NET: Registered protocol family 15
[ 125.153082] skb_under_panic: text:c1215d1d len:1462 put:14 head:f2ff1000 data:f2ff0ffa tail:0xf2ff15b0 end:0xf2ff1780 dev:p10p1
[ 125.165124] ------------[ cut here ]------------
[ 125.166001] kernel BUG at net/core/skbuff.c:147!
[ 125.166001] invalid opcode: 0000 [#1] PREEMPT SMP
[ 125.166001] Modules linked in: authenc esp6 xfrm6_mode_transport deflate zlib_deflate ctr twofish_generic twofish_common camellia serpent blowfish_generic blowfish_common cast5 des_generic cbc xcbc rmd160 sha512_generic sha256_generic sha1_generic md5 hmac crypto_null af_key nfs lockd auth_rpcgss sunrpc rng_core iptable_filter ip_tables ip6table_filter ip6_tables x_tables psmouse sg r8169 mii evdev button ipv6 autofs4 usbhid ohci_hcd ehci_hcd usbcore usb_common sd_mod radeon ttm drm_kms_helper drm backlight i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect [last unloaded: scsi_wait_scan]
[ 125.196579]
[ 125.196579] Pid: 2792, comm: udp_burst Not tainted 3.2.0-rc2-00115-g8b662f5 #54 System manufacturer System Product Name/M4A785T-M
[ 125.196579] EIP: 0060:[<c11ff2af>] EFLAGS: 00010246 CPU: 0
[ 125.196579] EIP is at skb_push+0x52/0x5b
[ 125.196579] EAX: 00000089 EBX: f39cb000 ECX: 00000080 EDX: 00000003
[ 125.196579] ESI: f39cb000 EDI: f39cb000 EBP: f29abb10 ESP: f29abae4
[ 125.196579] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 125.196579] Process udp_burst (pid: 2792, ti=f29aa000 task=f3d8a2c0 task.ti=f29aa000)
[ 125.196579] Stack:
[ 125.196579] c13a2756 c1215d1d 000005b6 0000000e f2ff1000 f2ff0ffa f2ff15b0 f2ff1780
[ 125.196579] f39cb000 00000000 f4fcdec0 f29abb28 c1215d1d 000086dd c1215d01 f29a5600
[ 125.196579] 00000002 f29abb44 c120ee6d f4fcdec0 00000000 000005a8 f4fcde00 f29a5600
[ 125.196579] Call Trace:
[ 125.196579] [<c1215d1d>] ? eth_header+0x1c/0x8b
[ 125.196579] [<c1215d1d>] eth_header+0x1c/0x8b
[ 125.196579] [<c1215d01>] ? eth_rebuild_header+0x53/0x53
[ 125.196579] [<c120ee6d>] dev_hard_header.constprop.12+0x28/0x32
[ 125.196579] [<c120ef74>] neigh_resolve_output+0xfd/0x138
[ 125.196579] [<f838af19>] ip6_finish_output2+0x280/0x31a [ipv6]
[ 125.196579] [<f838bf61>] ip6_fragment+0x3bd/0x939 [ipv6]
[ 125.196579] [<f838ac99>] ? NF_HOOK.constprop.4+0x30/0x30 [ipv6]
[ 125.196579] [<f838c51c>] ip6_finish_output+0x3f/0x4c [ipv6]
[ 125.196579] [<f838c5e1>] ip6_output+0xb8/0xc0 [ipv6]
[ 125.196579] [<c12520f1>] xfrm_output_resume+0x75/0x2c5
[ 125.196579] [<c125234e>] xfrm_output2+0xd/0xf
[ 125.196579] [<c12523e3>] xfrm_output+0x93/0x9c
[ 125.196579] [<f83a8b32>] xfrm6_output_finish+0x13/0x15 [ipv6]
[ 125.196579] [<f83a8a1f>] __xfrm6_output+0x108/0x10d [ipv6]
[ 125.196579] [<f83a8b7b>] xfrm6_output+0x47/0x4c [ipv6]
[ 125.196579] [<f838a7b4>] dst_output+0x12/0x15 [ipv6]
[ 125.196579] [<f838b36a>] ip6_local_out+0x17/0x1a [ipv6]
[ 125.196579] [<f838d27b>] ip6_push_pending_frames+0x2a4/0x346 [ipv6]
[ 125.196579] [<f839a035>] udp_v6_push_pending_frames+0x213/0x271 [ipv6]
[ 125.196579] [<f839ae84>] ? udpv6_sendmsg+0x68d/0x832 [ipv6]
[ 125.196579] [<f839aea6>] udpv6_sendmsg+0x6af/0x832 [ipv6]
[ 125.196579] [<c123fe84>] ? ip_fast_csum+0x30/0x30
[ 125.196579] [<c12403c0>] inet_sendmsg+0x4e/0x57
[ 125.196579] [<c11f8de6>] sock_sendmsg+0xbe/0xd9
[ 125.196579] [<c1052d64>] ? trace_hardirqs_off+0xb/0xd
[ 125.196579] [<c1270f48>] ? restore_all+0xf/0xf
[ 125.196579] [<c1055715>] ? trace_hardirqs_on_caller+0x10e/0x13f
[ 125.196579] [<c10542df>] ? mark_lock+0x26/0x1ea
[ 125.196579] [<c10acdbb>] ? fget_light+0x28/0x7c
[ 125.196579] [<c11fa23a>] sys_sendto+0xb1/0xcd
[ 125.196579] [<c10548e7>] ? __lock_acquire+0x444/0xb17
[ 125.196579] [<c1270bb1>] ? _raw_spin_unlock_irq+0x39/0x45
[ 125.196579] [<c1055038>] ? lock_release_non_nested+0x7e/0x1bb
[ 125.196579] [<c11fa26e>] sys_send+0x18/0x1a
[ 125.196579] [<c11fa877>] sys_socketcall+0xce/0x19a
[ 125.196579] [<c11507c0>] ? trace_hardirqs_on_thunk+0xc/0x10
[ 125.196579] [<c1271650>] sysenter_do_call+0x12/0x36
[ 125.196579] Code: c1 85 f6 0f 45 de 53 ff b1 98 00 00 00 ff b1 94 00 00 00 50 ff b1 9c 00 00 00 52 ff 71 50 ff 75 04 68 56 27 3a c1 e8 5a c7 06 00 <0f> 0b 8d 65 f8 5b 5e 5d c3 55 89 c1 89 e5 56 53 83 79 54 00 8b
[ 125.196579] EIP: [<c11ff2af>] skb_push+0x52/0x5b SS:ESP 0068:f29abae4
[ 125.544777] ---[ end trace 3ca7fd586035bfb5 ]---
[ 125.549588] BUG: sleeping function called from invalid context at kernel/rwsem.c:21
[ 125.557655] in_atomic(): 0, irqs_disabled(): 0, pid: 2792, name: udp_burst
[ 125.565415] INFO: lockdep is turned off.
[ 125.569682] Pid: 2792, comm: udp_burst Tainted: G D 3.2.0-rc2-00115-g8b662f5 #54
[ 125.578640] Call Trace:
[ 125.581476] [<c10307b1>] ? console_unlock+0x1b6/0x1c9
[ 125.587209] [<c1024dbd>] __might_sleep+0xe2/0xe9
[ 125.592457] [<c126ff47>] down_read+0x17/0x3b
[ 125.597311] [<c105fc85>] acct_collect+0x39/0x134
[ 125.602749] [<c1032c08>] do_exit+0x188/0x5de
[ 125.607604] [<c1031464>] ? kmsg_dump+0xdf/0xe7
[ 125.612710] [<c1004737>] oops_end+0x92/0x9a
[ 125.617647] [<c1004868>] die+0x51/0x59
[ 125.622008] [<c1002626>] do_trap+0x89/0xa2
[ 125.626665] [<c1002776>] ? do_bounds+0x52/0x52
[ 125.631781] [<c10027e7>] do_invalid_op+0x71/0x7b
[ 125.637157] [<c11ff2af>] ? skb_push+0x52/0x5b
[ 125.642175] [<c1270f48>] ? restore_all+0xf/0xf
[ 125.647256] [<c10307b1>] ? console_unlock+0x1b6/0x1c9
[ 125.653106] [<c102369b>] ? need_resched+0x14/0x1e
[ 125.658517] [<c126f1f7>] ? preempt_schedule+0x40/0x46
[ 125.664271] [<c1030c19>] ? vprintk+0x390/0x3ae
[ 125.669417] [<c1052d01>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 125.675999] [<c11507d0>] ? trace_hardirqs_off_thunk+0xc/0x10
[ 125.682561] [<c127140b>] error_code+0x5f/0x64
[ 125.687553] [<c1002776>] ? do_bounds+0x52/0x52
[ 125.692621] [<c11ff2af>] ? skb_push+0x52/0x5b
[ 125.697723] [<c1215d1d>] ? eth_header+0x1c/0x8b
[ 125.702905] [<c1215d1d>] eth_header+0x1c/0x8b
[ 125.707963] [<c1215d01>] ? eth_rebuild_header+0x53/0x53
[ 125.713945] [<c120ee6d>] dev_hard_header.constprop.12+0x28/0x32
[ 125.720617] [<c120ef74>] neigh_resolve_output+0xfd/0x138
[ 125.726714] [<f838af19>] ip6_finish_output2+0x280/0x31a [ipv6]
[ 125.733397] [<f838bf61>] ip6_fragment+0x3bd/0x939 [ipv6]
[ 125.739483] [<f838ac99>] ? NF_HOOK.constprop.4+0x30/0x30 [ipv6]
[ 125.746261] [<f838c51c>] ip6_finish_output+0x3f/0x4c [ipv6]
[ 125.752772] [<f838c5e1>] ip6_output+0xb8/0xc0 [ipv6]
[ 125.758684] [<c12520f1>] xfrm_output_resume+0x75/0x2c5
[ 125.764729] [<c125234e>] xfrm_output2+0xd/0xf
[ 125.769960] [<c12523e3>] xfrm_output+0x93/0x9c
[ 125.775292] [<f83a8b32>] xfrm6_output_finish+0x13/0x15 [ipv6]
[ 125.781988] [<f83a8a1f>] __xfrm6_output+0x108/0x10d [ipv6]
[ 125.788515] [<f83a8b7b>] xfrm6_output+0x47/0x4c [ipv6]
[ 125.794659] [<f838a7b4>] dst_output+0x12/0x15 [ipv6]
[ 125.800633] [<f838b36a>] ip6_local_out+0x17/0x1a [ipv6]
[ 125.806889] [<f838d27b>] ip6_push_pending_frames+0x2a4/0x346 [ipv6]
[ 125.814176] [<f839a035>] udp_v6_push_pending_frames+0x213/0x271 [ipv6]
[ 125.821792] [<f839ae84>] ? udpv6_sendmsg+0x68d/0x832 [ipv6]
[ 125.828447] [<f839aea6>] udpv6_sendmsg+0x6af/0x832 [ipv6]
[ 125.834931] [<c123fe84>] ? ip_fast_csum+0x30/0x30
[ 125.840522] [<c12403c0>] inet_sendmsg+0x4e/0x57
[ 125.845919] [<c11f8de6>] sock_sendmsg+0xbe/0xd9
[ 125.851343] [<c1052d64>] ? trace_hardirqs_off+0xb/0xd
[ 125.857271] [<c1270f48>] ? restore_all+0xf/0xf
[ 125.862642] [<c1055715>] ? trace_hardirqs_on_caller+0x10e/0x13f
[ 125.869618] [<c10542df>] ? mark_lock+0x26/0x1ea
[ 125.875028] [<c10acdbb>] ? fget_light+0x28/0x7c
[ 125.880431] [<c11fa23a>] sys_sendto+0xb1/0xcd
[ 125.885688] [<c10548e7>] ? __lock_acquire+0x444/0xb17
[ 125.891665] [<c1270bb1>] ? _raw_spin_unlock_irq+0x39/0x45
[ 125.898057] [<c1055038>] ? lock_release_non_nested+0x7e/0x1bb
[ 125.904803] [<c11fa26e>] sys_send+0x18/0x1a
[ 125.909815] [<c11fa877>] sys_socketcall+0xce/0x19a
[ 125.915539] [<c11507c0>] ? trace_hardirqs_on_thunk+0xc/0x10
[ 125.922127] [<c1271650>] sysenter_do_call+0x12/0x36
[ 185.166028] INFO: rcu_preempt detected stalls on CPUs/tasks: {} (detected by 1, t=60002 jiffies)
[ 185.167017] INFO: Stall ended before state dump start
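For anyone reading the skb_under_panic line: it reports head:f2ff1000 and
data:f2ff0ffa after a 14-byte push (put:14, the Ethernet header added by
eth_header), so data ended up 6 bytes below head. That means only 8 bytes of
headroom were available before the push. A toy user-space model of the
arithmetic follows; the toy_* names are made up, and unlike the kernel's
skb_push() this version checks before moving the pointer instead of panicking
afterwards:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy buffer descriptor; only the fields needed for the arithmetic. */
struct toy_skb {
    uintptr_t head;   /* start of the allocated buffer */
    uintptr_t data;   /* start of the packet data */
    unsigned int len; /* bytes of packet data */
};

/* Bytes available in front of the packet data. */
static unsigned int toy_headroom(const struct toy_skb *skb)
{
    assert(skb != NULL && skb->data >= skb->head);
    return (unsigned int)(skb->data - skb->head);
}

/* Prepend n bytes. Returns 0 on success, -1 if data would cross head
 * (the condition that triggers skb_under_panic in the trace). */
static int toy_skb_push(struct toy_skb *skb, unsigned int n)
{
    if (toy_headroom(skb) < n)
        return -1;
    skb->data -= n;
    skb->len += n;
    return 0;
}
```

Plugging in the values from the log (head 0xf2ff1000, data 0xf2ff1008 before
the push, n = 14) makes toy_skb_push() fail, which matches the crash.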
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
* Re: [Devel] Re: [PATCH v5 00/10] per-cgroup tcp memory pressure
From: Glauber Costa @ 2011-11-18 19:39 UTC
To: David Miller
Cc: jbottomley, eric.dumazet, linux-kernel, netdev, paul, lizf,
linux-mm, devel, kirill, gthelen, kamezawa.hiroyu
In-Reply-To: <20111117.163501.1963137869848419475.davem@davemloft.net>
On 11/17/2011 07:35 PM, David Miller wrote:
> From: James Bottomley <jbottomley@parallels.com>
> Date: Tue, 15 Nov 2011 18:27:12 +0000
>
>> Ping on this, please. We're blocked on this patch set until we can get
>> an ack that the approach is acceptable to network people.
>
> __sk_mem_schedule is now more expensive, because instead of short-circuiting
> the majority of the function's logic when "allocated <= prot->sysctl_mem[0]"
> and immediately returning 1, the whole rest of the function is run.
Not the whole rest of the function. Rather, just the other two tests.
But that's the behavior we need, since if your parent is under pressure, you
should be as well. How would you feel if we also provided two versions of
this:
1) non-cgroup, try to return 1 as fast as we can
2) cgroup, also check your parents.
That could be enclosed in the same static branch we're using right now.
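To make the two-version idea concrete, here is a toy user-space sketch. It
is not the patch set's code: the toy_* names are invented, a plain bool
stands in for the static branch, and the per-level low_limit field is an
assumed analogue of sysctl_mem[0]:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy per-cgroup accounting node; parent is NULL for the root. */
struct toy_memcg {
    struct toy_memcg *parent;
    long allocated;
    long low_limit;   /* assumed per-level analogue of sysctl_mem[0] */
};

/* Stands in for the static branch (jump label). */
static bool toy_cgroup_accounting_enabled;

/* Version 1, non-cgroup: one comparison, return as fast as possible. */
static bool toy_under_limit_fast(long allocated, long low_limit)
{
    return allocated <= low_limit;
}

/* Version 2, cgroup: a group is fine only if every ancestor is too. */
static bool toy_under_limit_hier(const struct toy_memcg *cg)
{
    for (; cg != NULL; cg = cg->parent)
        if (cg->allocated > cg->low_limit)
            return false;
    return true;
}

/* Dispatcher: the branch picks which version runs. */
static bool toy_mem_schedule_ok(const struct toy_memcg *cg,
                                long global_allocated, long global_low)
{
    assert(global_low >= 0);
    if (!toy_cgroup_accounting_enabled || cg == NULL)
        return toy_under_limit_fast(global_allocated, global_low);
    return toy_under_limit_hier(cg);
}
```

With the flag off, a child under a pressured parent still takes the fast
path; with it on, the parent's state propagates down.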
> The static branch protecting all of the cgroup code seems to be
> enabled if any memory based cgroup'ing is enabled. What if people use
> the memory cgroup facility but not for sockets? I am to understand
> that, of the very few people who are going to use this stuff in any
> capacity, this would be a common usage.
How about we make the jump_label only used for sockets (which is basically
what we have now, just need a clear name to indicate that), and then
enable it not when the first non-root cgroup is created, but when the
first one sets the limit to something different than unlimited?
Of course to that point, we'd be accounting only to the root structures,
but I guess this is not a big deal.
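A toy sketch of that enabling rule, under stated assumptions: the toy_* names
are invented, a counter stands in for the jump label, and switching the
accounting back off when the last limit returns to unlimited is my own
addition, not something proposed above:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_UNLIMITED LONG_MAX

/* Toy cgroup carrying only a limit. */
struct toy_group {
    long limit;
};

/* Number of groups with a finite limit; stands in for the jump label. */
static int toy_limited_groups;

static bool toy_socket_accounting_on(void)
{
    return toy_limited_groups > 0;
}

/* Creating a group does not enable anything by itself. */
static void toy_group_init(struct toy_group *g)
{
    assert(g != NULL);
    g->limit = TOY_UNLIMITED;
}

/* Accounting switches on when the first group gets a finite limit. */
static void toy_set_limit(struct toy_group *g, long new_limit)
{
    bool was_limited = g->limit != TOY_UNLIMITED;
    bool now_limited = new_limit != TOY_UNLIMITED;

    assert(new_limit >= 0);
    if (!was_limited && now_limited)
        toy_limited_groups++;
    else if (was_limited && !now_limited)
        toy_limited_groups--;
    g->limit = new_limit;
}
```

Users of the memory cgroup who never set a socket limit would then never pay
for the slow path.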
> TCP specific stuff in mm/memcontrol.c, at best that's not nice at all.
How crucial is that? Thing is that as far as I am concerned, all the
memcg people really want the inner layout of struct mem_cgroup to be
private to memcontrol.c. This means that at some point, we need to have
at least a wrapper in memcontrol.c that is able to calculate the offset
of the tcp structure, and since most functions are actually quite
simple, that would just make us do more function calls.
Well, an alternative to that would be to use a void pointer in the newly
added struct cg_proto to an already parsed memcg-related field
(in this case tcp_memcontrol), that would be passed to the functions
instead of the whole memcg structure. Do you think this would be
preferable?
> Otherwise looks mostly good.
Thank you for your time.
* Re: [Devel] Re: [PATCH v5 00/10] per-cgroup tcp memory pressure
From: David Miller @ 2011-11-18 19:51 UTC
To: glommer
Cc: jbottomley, eric.dumazet, linux-kernel, netdev, paul, lizf,
linux-mm, devel, kirill, gthelen, kamezawa.hiroyu
In-Reply-To: <4EC6B457.4010502@parallels.com>
From: Glauber Costa <glommer@parallels.com>
Date: Fri, 18 Nov 2011 17:39:03 -0200
> On 11/17/2011 07:35 PM, David Miller wrote:
>> From: James Bottomley <jbottomley@parallels.com>
>> Date: Tue, 15 Nov 2011 18:27:12 +0000
>>
>>> Ping on this, please. We're blocked on this patch set until we can get
>>> an ack that the approach is acceptable to network people.
>>
>> __sk_mem_schedule is now more expensive, because instead of short-circuiting
>> the majority of the function's logic when "allocated <= prot->sysctl_mem[0]"
>> and immediately returning 1, the whole rest of the function is run.
>
> Not the whole rest of the function. Rather, just the other two
> tests. But that's the behavior we need since if your parent is on
> pressure, you should be as well. How do you feel if we'd also provide
> two versions for this:
> 1) non-cgroup, try to return 1 as fast as we can
> 2) cgroup, also check your parents.
Fair enough.
> How about we make the jump_label only used for sockets (which is basically
> what we have now, just need a clear name to indicate that), and then
> enable it not when the first non-root cgroup is created, but when the
> first one sets the limit to something different than unlimited?
>
> Of course to that point, we'd be accounting only to the root structures,
> but I guess this is not a big deal.
This sounds good for now.
>> TCP specific stuff in mm/memcontrol.c, at best that's not nice at all.
>
> How crucial is that?
It's a big deal. We've been working for years to yank protocol specific
things even out of net/core/*.c, it simply doesn't belong there.
I'd even be happier if you had to create a net/ipv4/tcp_memcg.c and
include/net/tcp_memcg.h
> Thing is that as far as I am concerned, all the
> memcg people
...
What the memcg people want is entirely their problem, especially if it
involves crapping up non-networking files with protocol specific junk.
* Re: Bug#645308: tg3 broken for NetXtreme 5714S in squeeze 6.0.3 installer
From: Matt Carlson @ 2011-11-18 19:54 UTC
To: Ben Hutchings
Cc: Matthew Carlson, Michael Chan, 645308@bugs.debian.org, Marc Haber,
netdev
In-Reply-To: <1319614904.11727.17.camel@deadeye>
On Wed, Oct 26, 2011 at 12:41:44AM -0700, Ben Hutchings wrote:
> On Tue, 2011-10-25 at 17:20 -0700, Matt Carlson wrote:
> > On Mon, Oct 24, 2011 at 04:47:54PM -0700, Ben Hutchings wrote:
> > > On Mon, 2011-10-24 at 14:24 -0700, Matt Carlson wrote:
> > > > On Fri, Oct 21, 2011 at 05:19:39AM -0700, Ben Hutchings wrote:
> > > > > On Fri, 2011-10-21 at 11:08 +0200, Marc Haber wrote:
> > > > > > On Fri, Oct 21, 2011 at 11:00:46AM +0200, Marc Haber wrote:
> > > > > > > On Thu, Oct 20, 2011 at 05:28:34AM +0100, Ben Hutchings wrote:
> > > > > > > > I don't see any changes that would obviously change the way this device
> > > > > > > > is reconfigured during a down/up cycle. There were some changes to
> > > > > > > > power management that should just let the PCI core do some work that the
> > > > > > > > driver used to, but it's possible that the result isn't quite the same.
> > > > > > > > I built a module with those reverted; source and binary attached. Could
> > > > > > > > you test that? I checked that d-i does include an insmod command.
> > > > > > >
> > > > > > > The squeeze 6.0.3 installer with the shipped tg3.ko replaced with
> > > > > > > yours boots and networks just fine without any workaround and without
> > > > > > > manual interaction.
> > > > > >
> > > > > > I was a bit fast on that. The interface now fails right in the middle
> > > > > > of installation and needs the modprobe -r, modprobe stunt to network
> > > > > > again.
> > > > >
> > > > > Matt, Michael,
> > > > >
> > > > > The tg3 driver has regressed for the 5714S since Linux 2.6.32. Marc
> > > > > Haber found this in the backported version included in our stable
> > > > > update, but also confirmed it in Linux 3.0.
> > > > >
> > > > > Bringing the interface down and then up again (which the installer does
> > > > > for some reason) can leave it unable to pass traffic (possibly after
> > > > > working for a few packets) until the module is reloaded.
> > > > >
> > > > > I asked Marc to check whether reverting the power management changes
> > > > > (071697e2bcd8dff2af4d6fdd6525c2324f89553b,
> > > > > d237d9ecf06a00f0ebca657958cf2a1e92940796) made a difference, but it
> > > > > doesn't seem to.
> > > > >
> > > > > There is more information in the bug log at
> > > > > <http://bugs.debian.org/645308>.
> > > >
> > > > Where can I get the sources for this driver? Commit
> > > > 9e975cc291d80d5e4562d6bed15ec171e896d69b, entitled
> > > > "tg3: Fix io failures after chip reset" has been a common source of
> > > > problems.
> > >
> > > Our current package has Linux 3.0.6 which includes the backport of that
> > > change. However, it is *not* included in my backport to 2.6.32 so it
> > > doesn't explain the original report.
> > >
> > > The backported version can be found in:
> > >
> > > git://anonscm.debian.org/kernel/linux-2.6.git squeeze
> >
> > The kernel version of that repository is 3.0.0-rc1. Am I looking in the
> > right place?
>
> Look at the squeeze branch, not master.
>
> Ben.
>
> > But you're right. The version of the driver in that repository does not
> > have the change.
O.K. With Ben's help, I was able to see the diff between the two
driver versions.
I think I misspoke earlier. The commit I mentioned above fixes a problem
introduced in an earlier patch. (commit ID
d2394e6bb1aa636f3bd142cb6f7845a4332514b5, entitled
"tg3: Always turn on APE features in mac_mode reg") So the fact that
2.6.32-35 works and 2.6.32-36 doesn't means you applied the problematic
patch that had the bug, but you didn't apply the above fix to correct it.
Sorry for the confusion.
* Re: NFS TCP race condition with SOCK_ASYNC_NOSPACE
From: Andrew Cooper @ 2011-11-18 19:55 UTC
To: Trond Myklebust
Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <1321643673.2653.41.camel-SyLVLa/KEI9HwK5hSS5vWB2eb7JE58TQ@public.gmane.org>
On 18/11/2011 19:14, Trond Myklebust wrote:
> On Fri, 2011-11-18 at 19:04 +0000, Andrew Cooper wrote:
>> On 18/11/11 18:52, Trond Myklebust wrote:
>>> On Fri, 2011-11-18 at 18:40 +0000, Andrew Cooper wrote:
>>>> Hello,
>>>>
>>>> As described originally in
>>>> http://www.spinics.net/lists/linux-nfs/msg25314.html, we were
>>>> encountering a bug whereby the NFS session was unexpectedly timing out.
>>>>
>>>> I believe I have found the source of the race condition causing the timeout.
>>>>
>>>> Brief overview of setup:
>>>> 10GiB network, NFS mounted using TCP. Problem reproduces with
>>>> multiple different NICs, with synchronous or asynchronous mounts, and
>>>> with soft and hard mounts. Reproduces on 2.6.32 and I am currently
>>>> trying to reproduce with mainline. (I don't have physical access to the
>>>> servers so installing stuff is not fantastically easy)
>>>>
>>>>
>>>>
>>>> In net/sunrpc/xprtsock.c:xs_tcp_send_request(), we try to write data to
>>>> the sock buffer using xs_sendpages()
>>>>
>>>> When the sock buffer is nearly full, we get an EAGAIN from
>>>> xs_sendpages() which causes a break out of the loop. Lower down the
>>>> function, we switch on status which causes us to call xs_nospace() with
>>>> the task.
>>>>
>>>> In xs_nospace(), we test the SOCK_ASYNC_NOSPACE bit from the socket, and
>>>> in the rare case where that bit is clear, we return 0 instead of
>>>> EAGAIN. This promptly overwrites status in xs_tcp_send_request().
>>>>
>>>> The result is that xs_tcp_release_xprt() finds a request which has no
>>>> error, but has not sent all of the bytes in its send buffer. It cleans
>>>> up by setting XPRT_CLOSE_WAIT which causes xprt_clear_locked() to queue
>>>> xprt->task_cleanup, which closes the TCP connection.
>>>>
>>>>
>>>> Under normal operation, the TCP connection goes down and back up without
>>>> interruption to the NFS layer. However, when the NFS server hangs in a
>>>> half closed state, the client forces a RST of the TCP connection,
>>>> leading to the timeout.
>>>>
>>>> I have tried a few naive fixes such as changing the default return value
>>>> in xs_nospace() from 0 to -EAGAIN (meaning that 0 will never be
>>>> returned) but this causes a kernel memory leak. Can someone who has a
>>>> better understanding of these interactions than me have a look? It
>>>> seems that the if (test_bit()) test in xs_nospace() should have an else
>>>> clause.
>>> I fully agree with your analysis. The correct thing to do here is to
>>> always return either EAGAIN or ENOTCONN. Thank you very much for working
>>> this one out!
>>>
>>> Trond
>> Returning EAGAIN seems to cause a kernel memory leak, as the oomkiller
>> starts going after processes holding large amounts of LowMem. Returning
> The EAGAIN should trigger a retry of the send.
>
>> ENOTCONN causes the NFS session to complain about a timeout in the logs,
>> and in the case of a soft mount, give an EIO to the calling process.
> Correct. ENOTCONN means that the connection was lost.
>
>> From the looks of the TCP stream, and from the looks of some
>> targeted debugging, nothing is actually wrong, so the client should not
>> be trying to FIN the TCP connection. Is it possible that there is a
>> more sinister reason for SOCK_ASYNC_NOSPACE being clear?
> Normally, it means that we're out of the out-of-write-buffer condition
> that caused the socket to fail (i.e. the socket has made progress
> sending more data, so that we can now resume sending more). Returning
> EAGAIN in that condition is correct.
Following my latest set of tests, I would have to say that we are in the
abnormal case. Removing the test_bit check and always requeuing causes
the next call to xs_sendpages() to return with an ENOTCONN. I guess I
have some more debugging to do.
>
>> I can attempt to find which of the many calls to clear that bit is
>> actually causing the problem, but I have a feeling that is going to be a
>> little more tricky to narrow down.
>>
~Andrew
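The status overwrite described in the quoted analysis can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the names (model_sock, model_nospace, model_send_request) are invented for illustration and are not the sunrpc functions themselves; the model merely mirrors the described control flow, where a partial send yields -EAGAIN and the no-space helper may replace that status with 0.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy socket state; the flag stands in for the SOCK_ASYNC_NOSPACE bit. */
struct model_sock {
    bool async_nospace;
};

struct send_result {
    int status;    /* 0 or a negative errno value */
    size_t sent;   /* bytes actually written */
    size_t total;  /* bytes the request wanted to write */
};

/* Mirrors the described helper: -EAGAIN only when the bit is set, else 0. */
static int model_nospace(const struct model_sock *sk)
{
    int ret = 0;

    if (sk->async_nospace)
        ret = -EAGAIN;
    return ret;
}

/* Writes at most `room` bytes, then handles the no-space case. */
static struct send_result model_send_request(const struct model_sock *sk,
                                             size_t total, size_t room)
{
    struct send_result r = { 0, 0, total };

    if (room < total) {
        r.sent = room;
        r.status = -EAGAIN;          /* partial write: buffer is full */
    } else {
        r.sent = total;
    }

    if (r.status == -EAGAIN)
        r.status = model_nospace(sk); /* may overwrite -EAGAIN with 0 */

    return r;
}

/* The state the analysis says triggers the connection close. */
static bool looks_inconsistent(struct send_result r)
{
    return r.status == 0 && r.sent < r.total;
}
```

With the flag clear, a partial send comes back with status 0 and unsent bytes, which is the "no error but not everything sent" state described above. The model says nothing about why the bit is clear, which is the open question in this thread.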
* Re: [PATCH] net: add calxeda xgmac ethernet driver
From: David Miller @ 2011-11-18 19:54 UTC (permalink / raw)
To: robherring2; +Cc: netdev, devicetree-discuss, joe, saeed.bishara, rob.herring
In-Reply-To: <1321411611-28839-1-git-send-email-robherring2@gmail.com>
From: Rob Herring <robherring2@gmail.com>
Date: Tue, 15 Nov 2011 20:46:51 -0600
> From: Rob Herring <rob.herring@calxeda.com>
>
> Add support for the XGMAC 10Gb ethernet device in the Calxeda Highbank
> SOC.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Doesn't build, because you reference SZ_8K, which is a define not
available on all platforms. Only architectures which make use of
asm-generic/sizes.h will be able to see that definition.
* Re: [patch net-next 1/2] team: add fix_features
From: David Miller @ 2011-11-18 20:00 UTC (permalink / raw)
To: jpirko
Cc: netdev, eric.dumazet, bhutchings, shemminger, andy, fbl, jzupka,
ivecera, mirqus
In-Reply-To: <1321539365-1125-1-git-send-email-jpirko@redhat.com>
From: Jiri Pirko <jpirko@redhat.com>
Date: Thu, 17 Nov 2011 15:16:04 +0100
> do fix features in similar way as bonding code does
>
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Applied.
* Re: [patch net-next 2/2] team: avoid using variable-length array
From: David Miller @ 2011-11-18 20:00 UTC (permalink / raw)
To: jpirko
Cc: netdev, eric.dumazet, bhutchings, shemminger, andy, fbl, jzupka,
ivecera, mirqus
In-Reply-To: <1321539365-1125-2-git-send-email-jpirko@redhat.com>
From: Jiri Pirko <jpirko@redhat.com>
Date: Thu, 17 Nov 2011 15:16:05 +0100
> Apparently using variable-length array is not correct
> (https://lkml.org/lkml/2011/10/23/25). So remove it.
>
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Applied.
* Re: [patch net-next] team: replace kmalloc+memcpy by kmemdup
From: David Miller @ 2011-11-18 20:00 UTC (permalink / raw)
To: jpirko
Cc: netdev, eric.dumazet, bhutchings, shemminger, andy, fbl, jzupka,
ivecera
In-Reply-To: <1321547557-1175-1-git-send-email-jpirko@redhat.com>
From: Jiri Pirko <jpirko@redhat.com>
Date: Thu, 17 Nov 2011 17:32:37 +0100
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Applied.
* Re: [PATCH v8] Phonet: set the pipe handle using setsockopt
From: David Miller @ 2011-11-18 20:01 UTC (permalink / raw)
To: remi.denis-courmont; +Cc: netdev
In-Reply-To: <2868814.BQAWhlMiRL@hector>
From: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Date: Fri, 18 Nov 2011 13:41:56 +0200
> Le Vendredi 18 Novembre 2011 16:52:05 ext Hemant Vilas RAMDASI a écrit :
>> From: Dinesh Kumar Sharma <dinesh.sharma@stericsson.com>
>>
>> This provides flexibility to set the pipe handle
>> using setsockopt. The pipe can be enabled (if disabled) later
>> using ioctl.
>>
>> Signed-off-by: Hemant Ramdasi <hemant.ramdasi@stericsson.com>
>> Signed-off-by: Dinesh Kumar Sharma <dinesh.sharma@stericsson.com>
>
> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Applied.
* Re: [0/6] Replace LL_ALLOCATED_SPACE to allow needed_headroom adjustment
From: David Miller @ 2011-11-18 20:01 UTC (permalink / raw)
To: herbert; +Cc: eric.dumazet, evonlanthen, linux-kernel, netdev, timo.teras
In-Reply-To: <20111118121832.GA2868@gondor.apana.org.au>
From: Herbert Xu <herbert@gondor.hengli.com.au>
Date: Fri, 18 Nov 2011 20:18:32 +0800
> On Tue, Oct 25, 2011 at 11:12:04PM -0400, David Miller wrote:
>> From: Herbert Xu <herbert@gondor.hengli.com.au>
>> Date: Tue, 25 Oct 2011 13:54:25 +0200
>>
>> > So I'm going to get rid of LL_ALLOCATED_SPACE completely and
>> > replace it with explicit references to the tailroom as it doesn't
>> > need the alignment anyway (The headroom needs alignment since
>> > we use it to ensure the head is aligned).
>>
>> Ok.
>
> Here are the patches that do this. I also picked up one spot
> that should have used LL_ALLOCATED_SPACE but did not (see 2nd
> last patch).
This all looks good to me, applied to net-next, thanks!
* Re: b43: BCM 4331: MacBook 8,1: No connection after suspend
From: Arend van Spriel @ 2011-11-18 20:02 UTC (permalink / raw)
To: Nico Schottelius, Nico -telmich- Schottelius, LKML, netdev
In-Reply-To: <20111118173921.GB2101@schottelius.org>
On 11/18/2011 06:39 PM, Nico Schottelius wrote:
> Nico -telmich- Schottelius [Fri, Nov 18, 2011 at 06:32:42PM +0100]:
>> Hello,
>>
>> new notebook, new problems (*):
>
> + full dmesg attached.
>
It seems you are running b43 + bcma. Adding Rafał and friends.
Gr. AvS
* Re: Occasional oops with IPSec and IPv6.
From: Timo Teräs @ 2011-11-18 20:06 UTC (permalink / raw)
To: Nick Bowler; +Cc: Eric Dumazet, netdev, David S. Miller
In-Reply-To: <20111118192639.GA10531@elliptictech.com>
On 11/18/2011 09:26 PM, Nick Bowler wrote:
> On 2011-11-18 20:27 +0200, Timo Teräs wrote:
>> On 11/18/2011 06:39 PM, Eric Dumazet wrote:
>>> Le vendredi 18 novembre 2011 à 11:27 -0500, Nick Bowler a écrit :
>>>> On 2011-11-17 14:09 -0500, Nick Bowler wrote:
>>>>> One of the tests we do with IPsec involves sending and receiving UDP
>>>>> datagrams of all sizes from 1 to N bytes, where N is much larger than
>>>>> the MTU. In this particular instance, the MTU is 1500 bytes and N is
>>>>> 10000 bytes. This test works fine with IPv4, but I'm getting an
>>>>> occasional oops on Linus' master with IPv6 (output at end of email). We
>>>>> also run the same test where N is less than the MTU, and it does not
>>>>> trigger this issue. The resulting fallout seems to eventually lock up
>>>>> the box (although it continues to work for a little while afterwards).
>>>>>
>>>>> The issue appears timing related, and it doesn't always occur. This
>>>>> probably also explains why I've not seen this issue before now, as we
>>>>> recently upgraded all our lab systems to machines from this century
>>>>> (with newfangled dual core processors). This also makes it somewhat
>>>>> hard to reproduce, but I can trigger it pretty reliably by running 'yes'
>>>>> in an ssh session (which doesn't use IPsec) while running the test:
>>>>> it'll usually trigger in 2 or 3 runs. The choice of cipher suite
>>>>> appears to be irrelevant.
> [...]
>>> Please note commit 80c802f307 added a known bug, fixed in commit
>>> 0b150932197b (xfrm: avoid possible oopse in xfrm_alloc_dst)
>>>
>>> Given commit 80c802f307 complexity, we can assume other bugs are to be
>>> fixed as well.
> [...]
>> This looks quite different. And I've been trying to figure out what
>> causes this. However, the OOPS happens at ip6_fragment(), indicating
>> that there was not enough allocated headroom (skb underrun). My initial
>> thought is ipv6 bug that just got uncovered by my commit; especially
>> since ipv4 side is happy. But I haven't yet been able to figure this one
>> out.
>>
>> Could you also try Herbert's latest patch set:
>> [0/6] Replace LL_ALLOCATED_SPACE to allow needed_headroom adjustment
>>
>> This changes how the headroom is calculated, and *might* fix this issue
>> too if it's caused by the same SMP race condition which got uncovered by
>> my other commit earlier.
>
> I applied all six of those patches, but I still see a crash. However,
> the call trace seems to be slightly different. I've appended the trace
> from the run with these patches applied, just in case it's significant.
>
> NOTE: I did not carefully look at the traces of all the crashes I've
> triggered. This particular backtrace could potentially have appeared
> before applying these patches and I would not have noticed.
It's still headroom underrun.
I'm not too familiar with the relevant IPv6 code, but it seems to be
mostly modelled after the IPv4 side. Looking at the back trace offset
inside ip6_fragment(), I'd say it was taking the "fast path" for
constructing the fragments. So first guess is that the headroom check
for allowing fast path to happen is not right.
Since the code seems to be treating separately hlen and struct frag_hdr,
I'm wondering if the following patch would be in place?
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 1c9bf8b..c35d9fc 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -675,7 +675,7 @@ int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
/* Correct geometry. */
if (frag->len > mtu ||
((frag->len & 7) && frag->next) ||
- skb_headroom(frag) < hlen)
+ skb_headroom(frag) < hlen + sizeof(struct frag_hdr))
goto slow_path_clean;
/* Partially cloned skb? */
Alternatively, we could just run the "slow path" unconditionally with
the test load to see if it fixes the issue. At least that'd be pretty
good test if it's a problem in the ipv6 fragmentation code or something
else.
- Timo
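The geometry test being proposed can be written out as a standalone predicate. This is a sketch under assumptions, not the kernel function: needs_slow_path and its parameters are hypothetical names, and the only outside fact relied on is that the IPv6 fragment header is 8 bytes long. Whether this condition is the right fix is exactly what the thread is trying to establish.

```c
#include <stdbool.h>
#include <stddef.h>

/* Size of the IPv6 fragment extension header in bytes. */
#define FRAG_HDR_LEN 8u

/*
 * Returns true when a fragment cannot be handled by the fast path.
 * Mirrors the three conditions in the proposed hunk:
 *   - the fragment is larger than the MTU,
 *   - a non-final fragment is not a multiple of 8 bytes long,
 *   - there is not enough headroom for the unfragmentable part
 *     (hlen) plus the fragment header that gets pushed in front.
 */
static bool needs_slow_path(size_t frag_len, size_t mtu, bool has_next,
                            size_t headroom, size_t hlen)
{
    if (frag_len > mtu)
        return true;
    if ((frag_len & 7) && has_next)
        return true;
    if (headroom < hlen + FRAG_HDR_LEN)
        return true;
    return false;
}
```

For example, with hlen = 40 and a headroom of exactly 40 bytes, the original check (headroom < hlen) would allow the fast path, while the proposed one falls back to the slow path because the 8 extra bytes are missing.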
* Re: b43: BCM 4331: MacBook 8,1: No connection after suspend
From: Arend van Spriel @ 2011-11-18 20:09 UTC (permalink / raw)
To: Nico Schottelius, Nico -telmich- Schottelius, LKML,
"netdev@vger.kernel.org" <ne
In-Reply-To: <4EC6B9E9.50204@broadcom.com>
On 11/18/2011 09:02 PM, Arend van Spriel wrote:
> On 11/18/2011 06:39 PM, Nico Schottelius wrote:
>> Nico -telmich- Schottelius [Fri, Nov 18, 2011 at 06:32:42PM +0100]:
>>> Hello,
>>>
>>> new notebook, new problems (*):
>>
>> + full dmesg attached.
>>
>
> It seems you are running b43 + bcma. Adding Rafał and friends.
>
> Gr. AvS
Hi Rafał,
In brcmsmac we reprogram the PCI BAR windows upon resume. Not sure if
that is done or needed in bcma, but may be worth checking.
Gr. AvS
* Re: Occasional oops with IPSec and IPv6.
From: David Miller @ 2011-11-18 20:10 UTC (permalink / raw)
To: timo.teras; +Cc: nbowler, eric.dumazet, netdev
In-Reply-To: <4EC6BAD7.3010200@iki.fi>
From: Timo Teräs <timo.teras@iki.fi>
Date: Fri, 18 Nov 2011 22:06:47 +0200
> I'm not too familiar with the relevant IPv6 code, but it seems to be
> mostly modelled after the IPv4 side. Looking at the back trace offset
> inside ip6_fragment(), I'd say it was taking the "fast path" for
> constructing the fragments. So first guess is that the headroom check
> for allowing fast path to happen is not right.
>
> Since the code seems to be treating separately hlen and struct frag_hdr,
> I'm wondering if the following patch would be in place?
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 1c9bf8b..c35d9fc 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -675,7 +675,7 @@ int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
> /* Correct geometry. */
> if (frag->len > mtu ||
> ((frag->len & 7) && frag->next) ||
> - skb_headroom(frag) < hlen)
> + skb_headroom(frag) < hlen + sizeof(struct frag_hdr))
> goto slow_path_clean;
>
> /* Partially cloned skb? */
>
>
> Alternatively, we could just run the "slow path" unconditionally with
> the test load to see if it fixes the issue. At least that'd be pretty
> good test if it's a problem in the ipv6 fragmentation code or something
> else.
This reminds me of the following change from Steffen Klassert in net-next:
commit 299b0767642a65f0c5446ab6d35e6df0daf43d33
Author: Steffen Klassert <steffen.klassert@secunet.com>
Date: Tue Oct 11 01:43:33 2011 +0000
ipv6: Fix IPsec slowpath fragmentation problem
ip6_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
On IPsec the effective mtu is lower because we need to add the protocol
headers and trailers later when we do the IPsec transformations. So after
the IPsec transformations the packet might be too big, which leads to a
slowpath fragmentation then. This patch fixes this by building the packets
based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
handling to this.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 835c04b..1e20b64 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1193,6 +1193,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
struct sk_buff *skb;
unsigned int maxfraglen, fragheaderlen;
int exthdrlen;
+ int dst_exthdrlen;
int hh_len;
int mtu;
int copy;
@@ -1248,7 +1249,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
np->cork.hop_limit = hlimit;
np->cork.tclass = tclass;
mtu = np->pmtudisc == IPV6_PMTUDISC_PROBE ?
- rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+ rt->dst.dev->mtu : dst_mtu(&rt->dst);
if (np->frag_size < mtu) {
if (np->frag_size)
mtu = np->frag_size;
@@ -1259,16 +1260,17 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
cork->length = 0;
sk->sk_sndmsg_page = NULL;
sk->sk_sndmsg_off = 0;
- exthdrlen = rt->dst.header_len + (opt ? opt->opt_flen : 0) -
- rt->rt6i_nfheader_len;
+ exthdrlen = (opt ? opt->opt_flen : 0) - rt->rt6i_nfheader_len;
length += exthdrlen;
transhdrlen += exthdrlen;
+ dst_exthdrlen = rt->dst.header_len;
} else {
rt = (struct rt6_info *)cork->dst;
fl6 = &inet->cork.fl.u.ip6;
opt = np->cork.opt;
transhdrlen = 0;
exthdrlen = 0;
+ dst_exthdrlen = 0;
mtu = cork->fragsize;
}
@@ -1368,6 +1370,8 @@ alloc_new_skb:
else
alloclen = datalen + fragheaderlen;
+ alloclen += dst_exthdrlen;
+
/*
* The last fragment gets additional space at tail.
* Note: we overallocate on fragments with MSG_MODE
@@ -1419,9 +1423,9 @@ alloc_new_skb:
/*
* Find where to start putting bytes
*/
- data = skb_put(skb, fraglen);
- skb_set_network_header(skb, exthdrlen);
- data += fragheaderlen;
+ data = skb_put(skb, fraglen + dst_exthdrlen);
+ skb_set_network_header(skb, exthdrlen + dst_exthdrlen);
+ data += fragheaderlen + dst_exthdrlen;
skb->transport_header = (skb->network_header +
fragheaderlen);
if (fraggap) {
@@ -1434,6 +1438,7 @@ alloc_new_skb:
pskb_trim_unique(skb_prev, maxfraglen);
}
copy = datalen - transhdrlen - fraggap;
+
if (copy < 0) {
err = -EINVAL;
kfree_skb(skb);
@@ -1448,6 +1453,7 @@ alloc_new_skb:
length -= datalen - fraggap;
transhdrlen = 0;
exthdrlen = 0;
+ dst_exthdrlen = 0;
csummode = CHECKSUM_NONE;
/*
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 3486f62..6f7824e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -542,8 +542,7 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
goto out;
offset = rp->offset;
- total_len = inet_sk(sk)->cork.base.length - (skb_network_header(skb) -
- skb->data);
+ total_len = inet_sk(sk)->cork.base.length;
if (offset >= total_len - 1) {
err = -EINVAL;
ip6_flush_pending_frames(sk);
* Re: Unable to flush ICMP redirect routes in kernel 3.0+
From: David Miller @ 2011-11-18 20:26 UTC (permalink / raw)
To: eric.dumazet; +Cc: fbl, famzah, netdev, segoon
In-Reply-To: <1321632128.3277.29.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 18 Nov 2011 17:02:08 +0100
> David, unless I missed something, we should revert commit f39925dbde77
> (ipv4: Cache learned redirect information in inetpeer.)
>
> With following patch, redirects now work for me.
Yes, it doesn't work very well... sigh.
I've applied your patch and queued it up for stable.
Long term we need a different scheme for redirects.
Thanks!
* Re: Occasional oops with IPSec and IPv6.
From: Nick Bowler @ 2011-11-18 21:21 UTC (permalink / raw)
To: Timo Teräs; +Cc: Eric Dumazet, netdev, David S. Miller
In-Reply-To: <4EC6BAD7.3010200@iki.fi>
On 2011-11-18 22:06 +0200, Timo Teräs wrote:
> It's still headroom underrun.
>
> I'm not too familiar with the relevant IPv6 code, but it seems to be
> mostly modelled after the IPv4 side. Looking at the back trace offset
> inside ip6_fragment(), I'd say it was taking the "fast path" for
> constructing the fragments. So first guess is that the headroom check
> for allowing fast path to happen is not right.
>
> Since the code seems to be treating separately hlen and struct frag_hdr,
> I'm wondering if the following patch would be in place?
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 1c9bf8b..c35d9fc 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -675,7 +675,7 @@ int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
> /* Correct geometry. */
> if (frag->len > mtu ||
> ((frag->len & 7) && frag->next) ||
> - skb_headroom(frag) < hlen)
> + skb_headroom(frag) < hlen + sizeof(struct frag_hdr))
> goto slow_path_clean;
>
> /* Partially cloned skb? */
>
>
> Alternatively, we could just run the "slow path" unconditionally with
> the test load to see if it fixes the issue. At least that'd be pretty
> good test if it's a problem in the ipv6 fragmentation code or something
> else.
Good call. I replaced the "correct geometry" check with an
unconditional "goto slow_path_clean;", and I can no longer reproduce the
crash. So at the very least, I have a workaround now. (I still have
Herbert Xu's six patches applied on top of Linus' master).
I then tried the smaller change above, but this does not correct the
issue.
Cheers,
--
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)
* Re: [PATCH] net: add calxeda xgmac ethernet driver
From: Ben Hutchings @ 2011-11-18 22:36 UTC (permalink / raw)
To: Rob Herring; +Cc: netdev, devicetree-discuss, joe, saeed.bishara, Rob Herring
In-Reply-To: <1321411611-28839-1-git-send-email-robherring2@gmail.com>
On Tue, 2011-11-15 at 20:46 -0600, Rob Herring wrote:
[...]
> +static int desc_get_rx_status(struct xgmac_priv *priv, struct xgmac_dma_desc *p)
> +{
[...]
> + if (status & RXDESC_EXT_STATUS) {
> + if (ext_status & RXDESC_IP_HEADER_ERR)
> + x->rx_ip_header_error++;
> + if (ext_status & RXDESC_IP_PAYLOAD_ERR)
> + x->rx_payload_error++;
> + netdev_dbg(priv->dev, "IP checksum error - stat %08x\n",
> + ext_status);
> + return -1;
You must not drop packets with a checksum failure above the link level;
i.e. you should drop for bad Ethernet CRC but not bad IP checksum. The
return value here should be CHECKSUM_NONE.
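To illustrate the distinction, here is a minimal user-space sketch of the decision. The status bits and the rx_disposition() helper are hypothetical names made up for this example, not the driver's definitions, and the enum values only stand in for the kernel's CHECKSUM_NONE and CHECKSUM_UNNECESSARY. Marking clean frames as already verified also assumes the hardware really validated the checksum.

```c
#include <stdint.h>

/* Hypothetical receive status bits, invented for this sketch. */
#define RX_ERR_CRC        (1u << 0)  /* link-level error */
#define RX_ERR_IP_HEADER  (1u << 1)  /* IP header checksum error */
#define RX_ERR_IP_PAYLOAD (1u << 2)  /* TCP/UDP checksum error */

/* Stand-ins for the kernel's CHECKSUM_NONE / CHECKSUM_UNNECESSARY. */
enum rx_csum {
    RX_CSUM_NONE = 0,        /* let the stack verify the checksum */
    RX_CSUM_UNNECESSARY = 1  /* hardware already verified it */
};

/*
 * Returns -1 if the frame must be dropped, otherwise the checksum
 * state to hand to the stack. Only link-level errors cause a drop;
 * a bad IP or payload checksum is passed up unverified so the stack
 * can check it and account for the error itself.
 */
static int rx_disposition(uint32_t status)
{
    if (status & RX_ERR_CRC)
        return -1;
    if (status & (RX_ERR_IP_HEADER | RX_ERR_IP_PAYLOAD))
        return RX_CSUM_NONE;
    return RX_CSUM_UNNECESSARY;
}
```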
[...]
> +static int xgmac_dma_desc_rings_init(struct net_device *dev)
> +{
[...]
> + /* The base address of the RX/TX descriptor lists must be written into
> + * DMA CSR3 and CSR4, respectively. */
> + writel(priv->dma_tx_phy, priv->base + XGMAC_DMA_TX_BASE_ADDR);
> + writel(priv->dma_rx_phy, priv->base + XGMAC_DMA_RX_BASE_ADDR);
The code doesn't use the names 'CSR3' or 'CSR4' (thankfully) so this
comment is redundant.
[...]
> +static netdev_tx_t xgmac_xmit(struct sk_buff *skb, struct net_device *dev)
> +{
> + struct xgmac_priv *priv = netdev_priv(dev);
> + unsigned int entry;
> + int i;
> + int nfrags = skb_shinfo(skb)->nr_frags;
> + struct xgmac_dma_desc *desc, *first;
> + unsigned int desc_flags;
> + unsigned int len;
> + dma_addr_t paddr;
> +
> + if (dma_ring_space(priv->tx_head, priv->tx_tail, DMA_TX_RING_SZ) <
> + (nfrags + 1)) {
> + writel(DMA_INTR_DEFAULT_MASK | DMA_INTR_ENA_TIE,
> + priv->base + XGMAC_DMA_INTR_ENA);
> + netif_stop_queue(dev);
> + return NETDEV_TX_BUSY;
> + }
> +
> + desc_flags = (skb->ip_summed == CHECKSUM_PARTIAL) ?
> + TXDESC_CSUM_ALL : 0;
> + entry = priv->tx_head;
> + desc = priv->dma_tx + entry;
> + first = desc;
> +
> + priv->tx_skbuff[entry] = skb;
> + len = skb_headlen(skb);
> + paddr = dma_map_single(priv->device, skb->data, len, DMA_TO_DEVICE);
Don't you need to check for failure?
> + desc_set_buf_addr_and_size(desc, paddr, len);
> +
> + for (i = 0; i < nfrags; i++) {
> + skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
> +
> + len = frag->size;
> + entry = dma_ring_incr(entry, DMA_TX_RING_SZ);
> + desc = priv->dma_tx + entry;
> +
> + paddr = dma_map_page(priv->device, frag->page.p,
> + frag->page_offset, len, DMA_TO_DEVICE);
Use skb_frag_dma_map() and check for failure.
> + priv->tx_skbuff[entry] = NULL;
> +
> + desc_set_buf_addr_and_size(desc, paddr, len);
> + if (i < (nfrags - 1))
> + desc_set_tx_owner(desc, desc_flags);
> + }
[...]
> +static void xgmac_set_rx_mode(struct net_device *dev)
> +{
> + int i;
> + struct xgmac_priv *priv = netdev_priv(dev);
> + void __iomem *ioaddr = priv->base;
> + unsigned int value = 0;
> + u32 mc_filter[XGMAC_NUM_HASH];
Maybe call this hash_filter since you may use it for matching large
numbers of unicast addresses as well?
[...]
> +static int xgmac_change_mtu(struct net_device *dev, int new_mtu)
> +{
> + struct xgmac_priv *priv = netdev_priv(dev);
> + int old_mtu;
> +
> + if ((new_mtu < 46) || (new_mtu > MAX_MTU)) {
> + netdev_err(priv->dev, "invalid MTU, max MTU is: %d\n", MAX_MTU);
> + return -EINVAL;
> + }
> +
> + old_mtu = dev->mtu;
> + dev->mtu = new_mtu;
> +
> + /* return early if the buffer sizes will not change */
> + if (old_mtu <= ETH_DATA_LEN && new_mtu <= ETH_DATA_LEN)
> + return 0;
> + if (old_mtu == new_mtu)
> + return 0;
> +
> + /* Stop everything, get ready to change the MTU */
> + if (!netif_running(dev))
> + return 0;
> +
> + /* Bring the interface down and then back up */
> + xgmac_release(dev);
> + xgmac_open(dev);
> +
> + return 0;
> +}
This function should end with return xgmac_open(dev) so that a failure
of that function is properly reported.
You also need to make sure that it's safe to call xgmac_release() a
second time if this call to xgmac_open() fails; I think at the moment
that will result in a crash.
[...]
> +struct rtnl_link_stats64 *
> +xgmac_get_stats64(struct net_device *dev,
> + struct rtnl_link_stats64 *storage)
> +{
> + struct xgmac_priv *priv = netdev_priv(dev);
> + void __iomem *base = priv->base;
> + u64 count;
Calls to ndo_get_stats64 are *not* serialised and may be done in atomic
context. You need to serialise calls yourself using a spinlock.
> + storage->rx_packets = readl(base + XGMAC_MMC_RXFRAME_GB_LO);
> + storage->rx_packets |=
> + (u64)(readl(base + XGMAC_MMC_RXFRAME_GB_HI)) << 32;
> + storage->rx_bytes = readl(base + XGMAC_MMC_RXOCTET_G_LO);
> + storage->rx_bytes |= (u64)(readl(base + XGMAC_MMC_RXOCTET_G_HI)) << 32;
Does reading the 'LO' register latch the 'HI' value until you read that
as well? If not, you need to detect a rollover here.
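If the hardware does not latch, one common way to detect the rollover is to read the high word before and after the low word and retry until both high reads agree. The sketch below assumes that approach; read32_fn, read_counter64() and the fake_counter simulation are hypothetical names for illustration only (in a driver the accessor would wrap readl()), and nothing here is taken from the XGMAC documentation.

```c
#include <stdint.h>

/* Hypothetical register accessor; a driver would wrap readl() here. */
typedef uint32_t (*read32_fn)(void *ctx, unsigned int reg);

/*
 * Reads a 64-bit counter exposed as two 32-bit registers that are
 * not latched. If the high word changes while the low word is being
 * read, the low word rolled over in between, so the read is retried.
 */
static uint64_t read_counter64(read32_fn rd, void *ctx,
                               unsigned int reg_lo, unsigned int reg_hi)
{
    uint32_t hi, lo, hi2;

    hi = rd(ctx, reg_hi);
    for (;;) {
        lo = rd(ctx, reg_lo);
        hi2 = rd(ctx, reg_hi);
        if (hi2 == hi)
            break;
        hi = hi2;  /* rollover seen: retry with the new high word */
    }
    return ((uint64_t)hi << 32) | lo;
}

/* Simulated counter that advances by `step` on every register read. */
struct fake_counter {
    uint64_t value;
    uint64_t step;
};

enum { REG_LO = 0, REG_HI = 1 };

static uint32_t fake_read(void *ctx, unsigned int reg)
{
    struct fake_counter *c = ctx;
    uint32_t v = (reg == REG_HI) ? (uint32_t)(c->value >> 32)
                                 : (uint32_t)c->value;

    c->value += c->step;
    return v;
}
```

Starting the simulated counter at 0xFFFFFFFF with a step of 1, a naive low-then-high read would combine the old low word with the new high word and return 0x1FFFFFFFF, whereas the retry loop returns a value the counter actually held.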
> + storage->multicast = readl(base + XGMAC_MMC_RXMCFRAME_G);
> + storage->rx_crc_errors = readl(base + XGMAC_MMC_RXCRCERR);
> + storage->rx_length_errors = readl(base + XGMAC_MMC_RXLENGTHERR);
> + storage->rx_missed_errors = readl(base + XGMAC_MMC_RXOVERFLOW);
> +
> + storage->tx_packets = readl(base + XGMAC_MMC_TXFRAME_GB_LO);
> + storage->tx_packets |=
> + (u64)(readl(base + XGMAC_MMC_TXFRAME_GB_HI)) << 32;
> + storage->tx_bytes = readl(base + XGMAC_MMC_TXOCTET_G_LO);
> + storage->tx_bytes |= (u64)(readl(base + XGMAC_MMC_TXOCTET_G_HI)) << 32;
> +
> + count = readl(base + XGMAC_MMC_TXFRAME_G_LO);
> + count |= (__u64)(readl(base + XGMAC_MMC_TXFRAME_G_HI)) << 32;
> + storage->tx_errors = storage->tx_packets - count;
This subtraction is problematic: unless the TX frame counters are *all*
latched until you finish reading them, tx_errors can jump backwards.
> + storage->tx_fifo_errors = readl(base + XGMAC_MMC_TXUNDERFLOW);
> +
> + return storage;
> +}
[...]
> +static int xgmac_ethtool_getsettings(struct net_device *dev,
> + struct ethtool_cmd *cmd)
> +{
> + cmd->autoneg = 0;
> + cmd->duplex = DUPLEX_FULL;
> + ethtool_cmd_speed_set(cmd, 10000);
> + cmd->supported = SUPPORTED_10000baseT_Full;
> + cmd->advertising = 0;
> + cmd->transceiver = XCVR_INTERNAL;
> + return 0;
> +}
Please don't use SUPPORTED_10000baseT_Full. I know there are a lot of
drivers currently using that to mean any 10G full-duplex mode, but it's
not really correct. The supported mask really isn't that important in
the absence of autonegotiation, anyway.
[...]
> +static int xgmac_set_pauseparam(struct net_device *netdev,
> + struct ethtool_pauseparam *pause)
> +{
> + struct xgmac_priv *priv = netdev_priv(netdev);
> + return xgmac_set_flow_ctrl(priv, pause->rx_pause, pause->tx_pause);
> +}
This should reject requests to enable pause frame autonegotiation:
if (pause->autoneg)
return -EINVAL;
[...]
> +static const struct xgmac_stats xgmac_gstrings_stats[] = {
[...]
> + XGMAC_STAT(tx_undeflow_irq),
'underflow' is missing an 'r'.
Also, I don't think it's helpful to include '_irq' in the names reported
through the ethtool API.
[...]
> +static int xgmac_get_sset_count(struct net_device *netdev, int sset)
> +{
> + switch (sset) {
> + case ETH_SS_STATS:
> + return XGMAC_STATS_LEN;
> + default:
> + return -EOPNOTSUPP;
You support the get_sset_count operation, just not this argument value,
so I think EINVAL is the correct error code.
[...]
> +static int xgmac_set_wol(struct net_device *dev,
> + struct ethtool_wolinfo *wol)
> +{
> + struct xgmac_priv *priv = netdev_priv(dev);
> + u32 support = WAKE_MAGIC | WAKE_UCAST;
> +
> + if (!device_can_wakeup(priv->device))
> + return -EINVAL;
The error code should be EOPNOTSUPP, unless this capability can change
dynamically.
[...]
> +/**
> + * xgmac_probe
> + * @pdev: platform device pointer
> + * Description: the driver is initialized through platform_device.
> + */
> +static int xgmac_probe(struct platform_device *pdev)
> +{
[...]
> + netif_napi_add(ndev, &priv->napi, xgmac_poll, 64);
> + ret = register_netdev(ndev);
> + if (ret)
> + goto err_reg;
> +
> + return 0;
> +
> +err_reg:
> + free_irq(priv->pmt_irq, ndev);
[...]
You need to call netif_napi_del() on this error path, and in
xgmac_remove().
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
* Re: b43: BCM 4331: MacBook 8,1: No connection after suspend
From: Rafał Miłecki @ 2011-11-18 23:08 UTC (permalink / raw)
To: Arend van Spriel
Cc: Nico Schottelius, Nico -telmich- Schottelius, LKML,
netdev@vger.kernel.org, b43-dev@lists.infradead.org
In-Reply-To: <4EC6BB5E.4010101@broadcom.com>
2011/11/18 Arend van Spriel <arend@broadcom.com>:
> On 11/18/2011 09:02 PM, Arend van Spriel wrote:
>> On 11/18/2011 06:39 PM, Nico Schottelius wrote:
>>> Nico -telmich- Schottelius [Fri, Nov 18, 2011 at 06:32:42PM +0100]:
>>>> Hello,
>>>>
>>>> new notebook, new problems (*):
>>>
>>> + full dmesg attached.
>>>
>>
>> It seems you are running b43 + bcma. Adding Rafał and friends.
>>
>> Gr. AvS
>
> Hi Rafał,
>
> In brcmsmac we reprogram the PCI BAR windows upon resume. Not sure if
> that is done or needed in bcma, but may be worth checking.
Good point.
Please reload b43 *and bcma*. Both drivers. Share your results.
--
Rafał
* [GIT PULL net-next] Open vSwitch
From: Jesse Gross @ 2011-11-18 23:12 UTC (permalink / raw)
To: David S. Miller; +Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA
This series of patches proposes the Open vSwitch kernel components for
upstream. Open vSwitch has existed as a separate project for several
years and we now believe it to be mature enough for inclusion. The
actual functionality is described more fully in the commit that adds
the kernel code.
The following changes since commit f8a15af093b19b86d56933c8757cee298d0f32a8:
team: replace kmalloc+memcpy by kmemdup (2011-11-18 14:55:03 -0500)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch.git for-upstream
Jesse Gross (1):
net: Add Open vSwitch kernel components.
Pravin B Shelar (3):
genetlink: Add genl_notify()
genetlink: Add lockdep_genl_is_held().
vlan: Move vlan_set_encap_proto() to vlan header file
Documentation/networking/00-INDEX | 2 +
Documentation/networking/openvswitch.txt | 195 +++
MAINTAINERS | 8 +
include/linux/genetlink.h | 3 +
include/linux/if_vlan.h | 34 +
include/linux/openvswitch.h | 452 +++++++
include/net/genetlink.h | 2 +
net/8021q/vlan_core.c | 33 -
net/Kconfig | 1 +
net/Makefile | 1 +
net/netlink/genetlink.c | 21 +
net/openvswitch/Kconfig | 28 +
net/openvswitch/Makefile | 14 +
net/openvswitch/actions.c | 415 +++++++
net/openvswitch/datapath.c | 1888 ++++++++++++++++++++++++++++++
net/openvswitch/datapath.h | 125 ++
net/openvswitch/dp_notify.c | 70 ++
net/openvswitch/flow.c | 1373 ++++++++++++++++++++++
net/openvswitch/flow.h | 195 +++
net/openvswitch/vport-internal_dev.c | 241 ++++
net/openvswitch/vport-internal_dev.h | 28 +
net/openvswitch/vport-netdev.c | 200 ++++
net/openvswitch/vport-netdev.h | 42 +
net/openvswitch/vport.c | 396 +++++++
net/openvswitch/vport.h | 205 ++++
25 files changed, 5939 insertions(+), 33 deletions(-)
create mode 100644 Documentation/networking/openvswitch.txt
create mode 100644 include/linux/openvswitch.h
create mode 100644 net/openvswitch/Kconfig
create mode 100644 net/openvswitch/Makefile
create mode 100644 net/openvswitch/actions.c
create mode 100644 net/openvswitch/datapath.c
create mode 100644 net/openvswitch/datapath.h
create mode 100644 net/openvswitch/dp_notify.c
create mode 100644 net/openvswitch/flow.c
create mode 100644 net/openvswitch/flow.h
create mode 100644 net/openvswitch/vport-internal_dev.c
create mode 100644 net/openvswitch/vport-internal_dev.h
create mode 100644 net/openvswitch/vport-netdev.c
create mode 100644 net/openvswitch/vport-netdev.h
create mode 100644 net/openvswitch/vport.c
create mode 100644 net/openvswitch/vport.h