* Re: [PATCH 5/5] ipv4: Add FIB nexthop exceptions.
From: David Miller @ 2012-07-17 14:25 UTC
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1342533605.2626.680.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 17 Jul 2012 16:00:05 +0200
> On Tue, 2012-07-17 at 06:14 -0700, David Miller wrote:
>> In a regime where we have subnetted route entries, we need a way to
>> persistently store destination-specific learned values such as
>> redirects and PMTU values.
>>
>> This is implemented here via nexthop exceptions.
>>
>> The initial implementation is a simple linked list, and can be
>> expanded to a hash table when it is shown to be justified.
>
> Say a typical host uses a single default route, I am trying to convince
> myself it can really use a simple linked list ?
>
> Aren't PMTU entries added by messages coming from untrusted sources?
They are trusted when we validate them at the socket layer, at least
as is done for TCP.
I totally agree that we'll need to adjust the list into something more
sophisticated, but that's an implementation detail rather than
something that requires the actual infrastructure to be redone.
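For readers following the thread, here is a minimal user-space sketch of what a per-nexthop exception list could look like. It is not the code from the patch: struct nh_exception, struct nexthop and both helper functions are hypothetical names chosen for illustration, and locking, expiry and RCU are left out entirely. It only shows why lookup cost grows linearly with the number of learned destinations, which is the concern raised above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical exception record: one per destination whose learned
 * values differ from the covering route. Not the kernel's layout. */
struct nh_exception {
	uint32_t daddr;			/* destination address */
	uint32_t pmtu;			/* learned path MTU, 0 if none */
	uint32_t gw;			/* learned redirect gateway, 0 if none */
	struct nh_exception *next;
};

struct nexthop {
	struct nh_exception *exceptions;	/* head of the simple list */
};

/* Linear search: O(n) in the number of learned exceptions. */
static struct nh_exception *nh_exception_find(const struct nexthop *nh,
					      uint32_t daddr)
{
	struct nh_exception *e;

	for (e = nh->exceptions; e; e = e->next)
		if (e->daddr == daddr)
			return e;
	return NULL;
}

/* Record a PMTU for daddr, creating the entry on first use.
 * Returns 0 on success, -1 if allocation fails. */
static int nh_exception_set_pmtu(struct nexthop *nh, uint32_t daddr,
				 uint32_t pmtu)
{
	struct nh_exception *e = nh_exception_find(nh, daddr);

	if (!e) {
		e = calloc(1, sizeof(*e));
		if (!e)
			return -1;
		e->daddr = daddr;
		e->next = nh->exceptions;
		nh->exceptions = e;
	}
	e->pmtu = pmtu;
	return 0;
}
```

Swapping the list for a hash table would change only nh_exception_find() and the insertion step, which matches the point that the container is an implementation detail.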
* [GIT] Networking
From: David Miller @ 2012-07-17 14:36 UTC
To: torvalds; +Cc: akpm, netdev, linux-kernel
I know this looks like a lot more than you want to see right now,
however a) the stuff here are real OOPS'ers, memory leaks, and
regressions and b) it's been a full 2 weeks since I last sent bug
fixes your way.
If it makes you feel any better, my default has been to toss fixes
into net-next unless it was really serious like the stuff below.
I have a CIPSO ipv4 option processing oops'er I intend to work on
fixing myself if the maintainer of the code doesn't look at it in the
next 24 hours.
1) IPVS oops'ers:
a) Should not reset skb->nf_bridge in forwarding hook (Lin Ming)
b) 3.4 commit can cause ip_vs_control_cleanup to be invoked after
the ipvs_core_ops are unregistered during rmmod (Julian Anastasov)
2) ixgbevf bringup failure can crash in TX descriptor cleanup (Alexander Duyck)
3) AX25 switch missing break statement hoses ROSE sockets (Alan Cox)
4) CAIF accesses freed per-net memory (Sjur Brandeland)
5) Network cgroup code has out-of-bounds accesses (Eric Dumazet), and accesses
freed memory (Gao Feng)
6) Fix a crash in SCTP reported by Dave Jones caused by freeing an association
still on a list (Neil Horman)
7) __netdev_alloc_skb() regresses on GFP_DMA using drivers because that GFP
flag is not being retained for the allocation (Eric Dumazet).
8) Missing NULL check in sch_sfb netlink message parsing (Alan Cox)
9) bnx2 crashes because TX index iteration is not bounded correctly (Michael
Chan)
10) IPoIB generates warnings in TCP queue collapsing (via
skb_try_coalesce) because it does not set skb->truesize correctly
(Eric Dumazet)
11) vlan_info objects leak for the implicit vlan with ID 0 (Amir Hanania)
12) A fix for TX time stamp handling in gianfar does not transfer
socket ownership from one packet to another correctly, resulting
in a socket write space imbalance (Eric Dumazet)
13) Julia Lawall found several cases where we do a list iteration, and
then at the loop termination unconditionally assume we ended up with
real list object, rather than the list head itself (CNIC, RXRPC,
mISDN).
14) The bonding driver handles procfs moving incorrectly when a device
it manages is moved from one namespace to another (Eric Biederman)
15) Missing memory barriers in stmmac descriptor accesses result in
various crashes (Deepak Sikri)
16) Fix handling of broadcast packets in batman-adv (Simon Wunderlich)
17) Properly check the sanity of sendmsg() lengths in ieee802154's
dgram_sendmsg(). Dave Jones and others have hit and reported this
bug (Sasha Levin)
18) Some drivers (b44 and b43legacy) on 64-bit machines stopped
working because of how netdev_alloc_skb() was adjusted. Such
drivers should now use alloc_skb() for obtaining bounce buffers.
(Eric Dumazet)
19) atl1c mis-managed its link state in that it stops the queue by
hand on link down. The generic networking takes care of that and
this double stop locks the queue down. So simply removing the
driver's queue stop call fixes the problem (Cloud Ren)
20) Fix out-of-memory due to mis-accounting in netem packet scheduler
(Eric Dumazet)
21) If DCB and SR-IOV are configured at the same time in IXGBE the chip
will hang because this is not supported (Alexander Duyck)
22) A commit to stop drivers using netdev->base_addr broke the CNIC
driver (Michael Chan)
23) Timeout regression in ipset caused by an attempt to fix an overflow
bug (Jozsef Kadlecsik).
24) mac80211 minstrel code allocates memory using incorrect size
(Thomas Huehn)
25) llcp_sock_getname() needs to check for a NULL device otherwise we
OOPS (Sasha Levin)
26) mwifiex leaks memory (Bing Zhao)
27) Propagate iwlwifi fix to iwlegacy, even when we're not associated
we need to monitor for stuck queues in the watchdog handler
(Stanislaw Gruszka)
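Item 13 in the list above describes a recurring pattern that is easy to get wrong. The following user-space sketch, assuming a simplified circular list modelled on the kernel's struct list_head, illustrates it. struct peer and peer_find() are hypothetical; the actual CNIC, RXRPC and mISDN fixes are not reproduced here.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its list member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct peer {
	int id;
	struct list_head link;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Return the peer with the given id, or NULL when the walk wrapped
 * around to the list head without finding a match. */
static struct peer *peer_find(struct list_head *head, int id)
{
	struct list_head *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct peer *p = container_of(pos, struct peer, link);

		if (p->id == id)
			return p;	/* only a real entry is returned */
	}
	/* Here pos == head. Applying container_of() to it would yield a
	 * pointer that is not a struct peer, which is the bug class
	 * described in item 13. */
	return NULL;
}
```

The key design choice is that the "found" result is taken only inside the loop body; nothing after the loop dereferences the iterator.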
Please pull, thanks a lot.
The following changes since commit 9e85a6f9dc231f3ed3c1dc1b12217505d970142a:
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux (2012-07-03 18:06:49 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net master
for you to fetch changes up to 602e65a3b0c4f6b09fba19817ff798647a08e706:
Merge branch 'master' of git://1984.lsi.us.es/nf (2012-07-17 03:19:33 -0700)
----------------------------------------------------------------
Alan Cox (2):
sch_sfb: Fix missing NULL check
ax25: Fix missing break
Alexander Duyck (2):
ixgbe: DCB and SR-IOV can not co-exist and will cause hangs
ixgbevf: Fix panic when loading driver
Amir Hanania (1):
net: Fix memory leak - vlan_info struct
Bing Zhao (1):
mwifiex: fix Coverity SCAN CID 709078: Resource leak (RESOURCE_LEAK)
Bjørn Mork (1):
net: qmi_wwan: add ZTE MF60
Bruce Allan (1):
e1000e: fix test for PHY being accessible on 82577/8/9 and I217
Cloud Ren (1):
atl1c: fix issue of transmit queue 0 timed out
David Daney (1):
netdev/phy: Fixup lockdep warnings in mdio-mux.c
David S. Miller (4):
Merge branch 'master' of git://1984.lsi.us.es/nf
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Merge branch 'master' of git://git.kernel.org/.../jkirsher/net
Merge branch 'master' of git://1984.lsi.us.es/nf
Deepak Sikri (2):
stmmac: Fix for nfs hang on multiple reboot
stmmac: Fix for higher mtu size handling
Dmitry Eremin-Solenikov (1):
MAINTAINERS: reflect actual changes in IEEE 802.15.4 maintainership
Eliad Peller (1):
mac80211: destroy assoc_data correctly if assoc fails
Emmanuel Grumbach (1):
iwlegacy: don't mess up the SCD when removing a key
Eric Dumazet (6):
net: dont use __netdev_alloc_skb for bounce buffer
netem: add limitation to reordered packets
net: cgroup: fix out of bounds accesses
gianfar: fix potential sk_wmem_alloc imbalance
IPoIB: fix skb truesize underestimatiom
net: respect GFP_DMA in __netdev_alloc_skb()
Eric W. Biederman (2):
bonding: Manage /proc/net/bonding/ entries from the netdev events
bonding: debugfs and network namespaces are incompatible
Gao feng (2):
cgroup: fix panic in netprio_cgroup
net: cgroup: fix access the unallocated memory in netprio cgroup
John W. Linville (1):
Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem
Jozsef Kadlecsik (1):
netfilter: ipset: timeout fixing bug broke SET target special timeout value
Julia Lawall (3):
drivers/isdn/mISDN/stack.c: remove invalid reference to list iterator variable
net/rxrpc/ar-peer.c: remove invalid reference to list iterator variable
drivers/net/ethernet/broadcom/cnic.c: remove invalid reference to list iterator variable
Julian Anastasov (1):
ipvs: fix oops in ip_vs_dst_event on rmmod
Lin Ming (1):
ipvs: fix oops on NAT reply in br_nf context
Michael Chan (2):
cnic: Don't use netdev->base_addr
bnx2: Fix bug in bnx2_free_tx_skbs().
Narendra K (1):
ixgbevf: Prevent RX/TX statistics getting reset to zero
Neil Horman (1):
sctp: Fix list corruption resulting from freeing an association on a list
Pablo Neira Ayuso (1):
netfilter: nf_ct_ecache: fix crash with multiple containers, one shutting down
Sasha Levin (2):
ieee802154: verify packet size before trying to allocate it
NFC: Prevent NULL deref when getting socket name
Simon Wunderlich (1):
batman-adv: check incoming packet type for bla
Sjur Brændeland (1):
caif: Fix access to freed pernet memory
Stanislaw Gruszka (2):
rt2x00usb: fix indexes ordering on RX queue kick
iwlegacy: always monitor for stuck queues
Thomas Huehn (1):
mac80211: correct size the argument to kzalloc in minstrel_ht
Tushar Dave (1):
e1000e: Correct link check logic for 82571 serdes
MAINTAINERS | 3 +-
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 12 ++++---
drivers/isdn/mISDN/stack.c | 4 +--
drivers/net/bonding/bond_debugfs.c | 2 +-
drivers/net/bonding/bond_main.c | 9 ++++--
drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 1 -
drivers/net/ethernet/broadcom/b44.c | 4 +--
drivers/net/ethernet/broadcom/bnx2.c | 6 ++--
drivers/net/ethernet/broadcom/cnic.c | 10 ++++--
drivers/net/ethernet/freescale/gianfar.c | 7 ++--
drivers/net/ethernet/intel/e1000e/82571.c | 3 ++
drivers/net/ethernet/intel/e1000e/ich8lan.c | 42 ++++++++++++++++++------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 5 +++
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 15 ++-------
drivers/net/ethernet/stmicro/stmmac/ring_mode.c | 3 +-
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 3 ++
drivers/net/phy/mdio-mux.c | 10 ++++--
drivers/net/usb/qmi_wwan.c | 18 +++++++++++
drivers/net/wireless/b43legacy/dma.c | 2 +-
drivers/net/wireless/iwlegacy/4965-mac.c | 4 +--
drivers/net/wireless/iwlegacy/common.c | 14 ++++----
drivers/net/wireless/mwifiex/cfg80211.c | 1 +
drivers/net/wireless/rt2x00/rt2x00usb.c | 2 +-
include/net/ip_vs.h | 2 +-
include/net/netfilter/nf_conntrack_ecache.h | 2 +-
net/8021q/vlan.c | 3 ++
net/ax25/af_ax25.c | 1 +
net/batman-adv/bridge_loop_avoidance.c | 15 ++++++---
net/batman-adv/bridge_loop_avoidance.h | 5 +--
net/batman-adv/soft-interface.c | 6 +++-
net/caif/caif_dev.c | 2 +-
net/core/dev.c | 8 +++--
net/core/netprio_cgroup.c | 78 +++++++++++++++++++++++++++++++++------------
net/core/skbuff.c | 2 +-
net/ieee802154/dgram.c | 12 +++----
net/mac80211/mlme.c | 6 ++--
net/mac80211/rc80211_minstrel_ht.c | 2 +-
net/netfilter/ipvs/ip_vs_ctl.c | 5 +--
net/netfilter/xt_set.c | 4 ++-
net/nfc/llcp/sock.c | 2 +-
net/rxrpc/ar-peer.c | 2 +-
net/sched/sch_netem.c | 42 +++++++++---------------
net/sched/sch_sfb.c | 2 ++
net/sctp/input.c | 7 ++--
net/sctp/socket.c | 12 +++++--
45 files changed, 256 insertions(+), 144 deletions(-)
* Re: [PATCH net-next] tcp: implement RFC 5961 4.2
From: David Miller @ 2012-07-17 14:41 UTC
To: eric.dumazet; +Cc: netdev, kkiran
In-Reply-To: <1342525290.2626.459.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 17 Jul 2012 13:41:30 +0200
> From: Eric Dumazet <edumazet@google.com>
>
> Implement the RFC 5961 mitigation against Blind
> Reset attack using SYN bit.
>
> Section 4.2 of RFC 5961 advises to send a Challenge ACK and drop
> incoming packet, instead of resetting the session.
>
> Add a new SNMP counter to count number of challenge acks sent
> in response to SYN packets.
> (netstat -s | grep TCPSYNChallenge)
>
> Remove obsolete TCPAbortOnSyn, since we no longer abort a TCP session
> because of a SYN flag.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Looks good, applied, thanks Eric.
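The behaviour described in the commit message can be sketched as a small decision function. This is only an illustration of the rule in RFC 5961 section 4.2 for a connection in a synchronized state, not the code that was applied; enum syn_action, struct conn_stats and handle_syn_in_sync_state() are hypothetical names, and the counter merely stands in for the new SNMP counter mentioned above.

```c
#include <stdbool.h>

enum syn_action {
	SYN_ACTION_NONE,		/* no SYN bit: continue normal processing */
	SYN_ACTION_CHALLENGE_ACK,	/* send an ACK, then drop the segment */
};

struct conn_stats {
	unsigned long syn_challenge;	/* stand-in for TCPSYNChallenge */
};

/* For a connection in a synchronized state: a segment carrying SYN is
 * answered with a challenge ACK and dropped, whatever its sequence
 * number, instead of resetting the session. */
static enum syn_action handle_syn_in_sync_state(bool syn_set,
						struct conn_stats *stats)
{
	if (!syn_set)
		return SYN_ACTION_NONE;

	stats->syn_challenge++;
	return SYN_ACTION_CHALLENGE_ACK;
}
```

A legitimate peer that really restarted will answer the challenge ACK with a RST carrying an acceptable sequence number, so the session is still torn down in that case, just not on the word of a blind attacker.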
* PATCH net/ipv6/mip6.c destopt corruption
From: András Takács @ 2012-07-17 14:47 UTC
To: netdev
In-Reply-To: <A9CE2E85-2DDC-4182-B494-431A6A62BC95@wakoond.hu>
Dear All,
I have added a lot of debug messages to the kernel source and finally found the problem. When the kernel creates the skb from the iovec (in ip6_append_data), it sets the network header pointer to the wrong position: it is shifted by 24 bytes (the length of the HAO dest. opt. header with padding).
After this point the message is corrupted: the first 24 bytes of the MH part are truncated. Later, when the kernel adds the dest. opt. header itself, there isn't any issue.
So, back to the wrong network header pointer. It is shifted by exthdrlen (= 24) by the skb_set_network_header() function. This exthdrlen comes from rt->rt6i_nfheader_len, which comes from the dst_entry chain. This nfheader_len value comes from the header_len of the desired xfrm type (in this case hao dest opt):
(net/xfrm/xfrm_policy.c: xfrm_bundle_create)
header_len += xfrm[i]->props.header_len;
if (xfrm[i]->type->flags & XFRM_TYPE_NON_FRAGMENT)
nfheader_len += xfrm[i]->props.header_len;
...
xfrm_init_path((struct xfrm_dst *)dst0, dst, nfheader_len);
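To make the arithmetic in the excerpt above concrete, here is a small stand-alone sketch of the accumulation loop. It assumes a simplified model: struct xform, struct bundle_lens, TYPE_NON_FRAGMENT and sum_header_lens() are hypothetical stand-ins, not the kernel's types, and only the two length sums are modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for XFRM_TYPE_NON_FRAGMENT. */
#define TYPE_NON_FRAGMENT 0x1u

struct xform {
	unsigned int flags;
	int header_len;
};

struct bundle_lens {
	int header_len;		/* sum over all transforms */
	int nfheader_len;	/* sum over NON_FRAGMENT transforms only */
};

/* Mirrors the two accumulations in the excerpt: every transform adds to
 * header_len, but only flagged ones add to nfheader_len. */
static struct bundle_lens sum_header_lens(const struct xform *x, size_t n)
{
	struct bundle_lens out = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		out.header_len += x[i].header_len;
		if (x[i].flags & TYPE_NON_FRAGMENT)
			out.nfheader_len += x[i].header_len;
	}
	return out;
}
```

With a single 24-byte HAO transform carrying the flag, nfheader_len comes out as 24, which is the shift observed on the network header; with the flag cleared it stays 0.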
I ran a quick grep on the kernel tree, and XFRM_TYPE_NON_FRAGMENT appears to have no effect other than deciding whether nfheader_len is increased here. So, the following patch solves the issue:
diff -Nuar linux-3.4.2-orig/net/ipv6/mip6.c linux-3.4.2/net/ipv6/mip6.c
--- linux-3.4.2-orig/net/ipv6/mip6.c 2012-07-17 15:18:30.148777104 +0200
+++ linux-3.4.2/net/ipv6/mip6.c 2012-07-17 15:21:12.104779113 +0200
@@ -338,7 +338,7 @@
.description = "MIP6DESTOPT",
.owner = THIS_MODULE,
.proto = IPPROTO_DSTOPTS,
- .flags = XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_LOCAL_COADDR,
+ .flags = XFRM_TYPE_LOCAL_COADDR,
.init_state = mip6_destopt_init_state,
.destructor = mip6_destopt_destroy,
.input = mip6_destopt_input,
@@ -471,7 +471,7 @@
.description = "MIP6RT",
.owner = THIS_MODULE,
.proto = IPPROTO_ROUTING,
- .flags = XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_REMOTE_COADDR,
+ .flags = XFRM_TYPE_REMOTE_COADDR,
.init_state = mip6_rthdr_init_state,
.destructor = mip6_rthdr_destroy,
.input = mip6_rthdr_input,
What do you think about this fix? Does it have any drawbacks?
Regards,
András
On Jul 16, 2012, at 3:40 PM, András Takács wrote:
>
> Dear All,
>
>
> I have serious problems with HAO dest opt XFRM processing. In the past few days I have tried to find the problem, and I figured out the following:
>
> 1. case: No XFRM rules
> It works fine (as it was described in my previous e-mail)
>
> 2. case: HAO RO XFRM processing
> I have created the following rules manually:
> sudo ip -6 xfrm policy add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto 135 type 5 dir out priority 2 ptype sub tmpl src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro
> sudo ip -6 xfrm state add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro replay-window 0 coa 2001:470:7210:11:20c:29ff:fe46:a0e3 sel src 2001:470:7210:10::11 dst 2001:470:7210:10::1000
>
> The message format is corrupted, because during the xfrm processing, the beginning of the MH part will be overwritten by the DST OPT header.
>
> 3. case: ESP TUNNEL XFRM
> I have created ESP TUNNEL XFRM rules manually, and they worked fine.
> So the problem has to be somewhere in the net/ipv6/mip6.c or net/ipv6/xfrm_mode_ro.c files.
>
> -------------
>
> I added a lot of debug printk statements to the source, and I have figured out the following:
>
> When the kernel creates the skb from the iovec, it seems to be ok (in ip6_append_data):
>
> skb->data to skb->tail:
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 57 39 B1 FB 87 39 B1 FB 64 92 FF FF 18 00 00 00 00 00 00 00 00 00 00 00 3B 03 05 00 00 00 00 01 00 00 00 02 01 00 03 10 20 01 04 70 72 10 00 11 02 0C 29 FF FE 46 A0 E3
>
> Unfortunately at the beginning of the xfrm6_ro_output function, it seems to be corrupt:
>
> 60 00 00 00 00 08 87 40 20 01 04 70 72 10 00 10 00 00 00 00 00 00 00 11 20 01 04 70 72 10 00 10 00 00 00 00 00 00 10 00 02 0C 29 FF FE 46 A0 E3
>
> Here the first 24 bytes of the MH part are missing. This is quite suspicious, because the DST OPT header (with the necessary padding) is exactly that long.
>
> After this point xfrm6_ro_output and mip6_destopt_output work fine and insert the DST OPT header into this truncated skb.
>
>
> Could you please help me to find the connection ("call graph"?) between ip6_append_data and xfrm6_ro_output? I can't find the point where it fails. In ip6_append_data, the beginning of the skb is reserved for the IPv6 header, but where is this part filled in with the right values?
>
>
> Thank you very much for your help!
>
>
> Regards,
> András
>
>
> On Jun 21, 2012, at 10:41 PM, Andras Takacs wrote:
>
>> Dear All,
>>
>> I'm working with Mobile IPv6 systems, and I'm setting up a new MIP6 environment. I would like to use the latest stable kernel, so I'm using 3.4.2. Unfortunately I have some serious problems with destination option XFRM processing. I have done the following tests to find the issue:
>>
>> First case: No XFRM policies and states.
>> Sending MH messages without destopt header.
>> In this case the message format is OK, I have tested it with tcpdump and wireshark.
>>
>> 21:33:58.817130 IP6 2001:470:7210:10::11 > 2001:470:7210:10::1000: mobility: BU seq#=1 lifetime=8
>> 0x0000: 6000 0000 0020 8740 2001 0470 7210 0010 `......@...pr...
>> 0x0010: 0000 0000 0000 0011 2001 0470 7210 0010 ...........pr...
>> 0x0020: 0000 0000 0000 1000 3b03 0500 1c46 0001 ........;....F..
>> 0x0030: 0000 0002 0100 0310 2001 0470 7210 0011 ...........pr...
>> 0x0040: 020c 29ff fe46 a0e3 ..)..F..
>>
>> Second case: Adding destopt XFRM policy and state:
>>
>> ip -6 xfrm policy add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto 135 type 5 dir out priority 2 ptype sub tmpl src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro level use
>> ip -6 xfrm state add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro replay-window 0 coa 2001:470:7210:11:20c:29ff:fe46:a0e3 sel src 2001:470:7210:10::11 dst 2001:470:7210:10::1000
>>
>> In this case, the message format is corrupted:
>>
>> 21:30:42.350315 IP6 2001:470:7210:11:20c:29ff:fe46:a0e3 > 2001:470:7210:10::1000: DSTOPT mobility: type-#41 len=12
>> 0x0000: 6000 0000 0020 3c40 2001 0470 7210 0011 `.....<@...pr...
>> 0x0010: 020c 29ff fe46 a0e3 2001 0470 7210 0010 ..)..F.....pr...
>> 0x0020: 0000 0000 0000 1000 8702 0102 0000 c910 ................
>> 0x0030: 2001 0470 7210 0010 0000 0000 0000 0011 ...pr...........
>> 0x0040: 020c 29ff fe46 a0e3
>>
>> As you can see, the IPv6 header is OK. Next, the destination option header is OK. Finally, the following part of the packet isn't OK. If you compare the two dumps carefully, you will see that the last 8 bytes are identical. The mip6_destopt_output function adds the destination option header correctly, but overwrites the existing MH header instead of shifting it after the destopt header.
>>
>> I'm not familiar with the XFRM framework enough to fix the problem. :(
>> Could anyone help me to fix this issue?
>>
>> The last environment that worked fine was built on version 2.6.35. The problem appeared somewhere between 2.6.35 and 3.4.2. Sorry, I know it is quite a big interval. :(
>>
>> Thanks!
>>
>>
>> Best regards,
>> András Takács
>
* [patch net-next 0/2] team: add netpoll support
From: Jiri Pirko @ 2012-07-17 15:22 UTC
To: netdev; +Cc: davem
Also contains a little change to netpoll core.
Jiri Pirko (2):
netpoll: move np->dev and np->dev_name init into __netpoll_setup()
team: add netpoll support
drivers/net/bonding/bond_main.c | 4 +-
drivers/net/team/team.c | 113 +++++++++++++++++++++++++++++
drivers/net/team/team_mode_activebackup.c | 3 +-
drivers/net/team/team_mode_broadcast.c | 7 +-
drivers/net/team/team_mode_loadbalance.c | 3 +-
drivers/net/team/team_mode_roundrobin.c | 3 +-
include/linux/if_team.h | 33 +++++++++
include/linux/netpoll.h | 2 +-
net/8021q/vlan_dev.c | 5 +-
net/bridge/br_device.c | 5 +-
net/core/netpoll.c | 10 +--
11 files changed, 161 insertions(+), 27 deletions(-)
--
1.7.10.4
* [patch net-next 1/2] netpoll: move np->dev and np->dev_name init into __netpoll_setup()
From: Jiri Pirko @ 2012-07-17 15:22 UTC
To: netdev; +Cc: davem
In-Reply-To: <1342538556-22601-1-git-send-email-jiri@resnulli.us>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
drivers/net/bonding/bond_main.c | 4 +---
include/linux/netpoll.h | 2 +-
net/8021q/vlan_dev.c | 5 +----
net/bridge/br_device.c | 5 +----
net/core/netpoll.c | 10 +++++-----
5 files changed, 9 insertions(+), 17 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4ddcc3e..1eb3979 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1240,9 +1240,7 @@ static inline int slave_enable_netpoll(struct slave *slave)
if (!np)
goto out;
- np->dev = slave->dev;
- strlcpy(np->dev_name, slave->dev->name, IFNAMSIZ);
- err = __netpoll_setup(np);
+ err = __netpoll_setup(np, slave->dev);
if (err) {
kfree(np);
goto out;
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 5dfa091..28f5389 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -43,7 +43,7 @@ struct netpoll_info {
void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
void netpoll_print_options(struct netpoll *np);
int netpoll_parse_options(struct netpoll *np, char *opt);
-int __netpoll_setup(struct netpoll *np);
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev);
int netpoll_setup(struct netpoll *np);
int netpoll_trap(void);
void netpoll_set_trap(int trap);
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index da1bc9c..73a2a83 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -681,10 +681,7 @@ static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *n
if (!netpoll)
goto out;
- netpoll->dev = real_dev;
- strlcpy(netpoll->dev_name, real_dev->name, IFNAMSIZ);
-
- err = __netpoll_setup(netpoll);
+ err = __netpoll_setup(netpoll, real_dev);
if (err) {
kfree(netpoll);
goto out;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 929e48aed..f4be1bb 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -246,10 +246,7 @@ int br_netpoll_enable(struct net_bridge_port *p)
if (!np)
goto out;
- np->dev = p->dev;
- strlcpy(np->dev_name, p->dev->name, IFNAMSIZ);
-
- err = __netpoll_setup(np);
+ err = __netpoll_setup(np, p->dev);
if (err) {
kfree(np);
goto out;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index f9f40b9..b4c90e4 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -715,14 +715,16 @@ int netpoll_parse_options(struct netpoll *np, char *opt)
}
EXPORT_SYMBOL(netpoll_parse_options);
-int __netpoll_setup(struct netpoll *np)
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
{
- struct net_device *ndev = np->dev;
struct netpoll_info *npinfo;
const struct net_device_ops *ops;
unsigned long flags;
int err;
+ np->dev = ndev;
+ strlcpy(np->dev_name, ndev->name, IFNAMSIZ);
+
if ((ndev->priv_flags & IFF_DISABLE_NETPOLL) ||
!ndev->netdev_ops->ndo_poll_controller) {
np_err(np, "%s doesn't support polling, aborting\n",
@@ -851,13 +853,11 @@ int netpoll_setup(struct netpoll *np)
np_info(np, "local IP %pI4\n", &np->local_ip);
}
- np->dev = ndev;
-
/* fill up the skb queue */
refill_skbs();
rtnl_lock();
- err = __netpoll_setup(np);
+ err = __netpoll_setup(np, ndev);
rtnl_unlock();
if (err)
--
1.7.10.4
* [patch net-next 2/2] team: add netpoll support
From: Jiri Pirko @ 2012-07-17 15:22 UTC
To: netdev; +Cc: davem
In-Reply-To: <1342538556-22601-1-git-send-email-jiri@resnulli.us>
It's done in a very similar way to how it is done in bonding and bridge.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
drivers/net/team/team.c | 113 +++++++++++++++++++++++++++++
drivers/net/team/team_mode_activebackup.c | 3 +-
drivers/net/team/team_mode_broadcast.c | 7 +-
drivers/net/team/team_mode_loadbalance.c | 3 +-
drivers/net/team/team_mode_roundrobin.c | 3 +-
include/linux/if_team.h | 33 +++++++++
6 files changed, 152 insertions(+), 10 deletions(-)
diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 3620c63..1a13470 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -18,6 +18,7 @@
#include <linux/ctype.h>
#include <linux/notifier.h>
#include <linux/netdevice.h>
+#include <linux/netpoll.h>
#include <linux/if_vlan.h>
#include <linux/if_arp.h>
#include <linux/socket.h>
@@ -787,6 +788,58 @@ static void team_port_leave(struct team *team, struct team_port *port)
dev_put(team->dev);
}
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
+{
+ struct netpoll *np;
+ int err;
+
+ np = kzalloc(sizeof(*np), GFP_KERNEL);
+ if (!np)
+ return -ENOMEM;
+
+ err = __netpoll_setup(np, port->dev);
+ if (err) {
+ kfree(np);
+ return err;
+ }
+ port->np = np;
+ return err;
+}
+
+static void team_port_disable_netpoll(struct team_port *port)
+{
+ struct netpoll *np = port->np;
+
+ if (!np)
+ return;
+ port->np = NULL;
+
+ /* Wait for transmitting packets to finish before freeing. */
+ synchronize_rcu_bh();
+ __netpoll_cleanup(np);
+ kfree(np);
+}
+
+static struct netpoll_info *team_netpoll_info(struct team *team)
+{
+ return team->dev->npinfo;
+}
+
+#else
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
+{
+ return 0;
+}
+static void team_port_disable_netpoll(struct team_port *port)
+{
+}
+static struct netpoll_info *team_netpoll_info(struct team *team)
+{
+ return NULL;
+}
+#endif
+
static void __team_port_change_check(struct team_port *port, bool linkup);
static int team_port_add(struct team *team, struct net_device *port_dev)
@@ -853,6 +906,15 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
goto err_vids_add;
}
+ if (team_netpoll_info(team)) {
+ err = team_port_enable_netpoll(team, port);
+ if (err) {
+ netdev_err(dev, "Failed to enable netpoll on device %s\n",
+ portname);
+ goto err_enable_netpoll;
+ }
+ }
+
err = netdev_set_master(port_dev, dev);
if (err) {
netdev_err(dev, "Device %s failed to set master\n", portname);
@@ -892,6 +954,9 @@ err_handler_register:
netdev_set_master(port_dev, NULL);
err_set_master:
+ team_port_disable_netpoll(port);
+
+err_enable_netpoll:
vlan_vids_del_by_dev(port_dev, dev);
err_vids_add:
@@ -932,6 +997,7 @@ static int team_port_del(struct team *team, struct net_device *port_dev)
list_del_rcu(&port->list);
netdev_rx_handler_unregister(port_dev);
netdev_set_master(port_dev, NULL);
+ team_port_disable_netpoll(port);
vlan_vids_del_by_dev(port_dev, dev);
dev_close(port_dev);
team_port_leave(team, port);
@@ -1307,6 +1373,48 @@ static int team_vlan_rx_kill_vid(struct net_device *dev, uint16_t vid)
return 0;
}
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void team_poll_controller(struct net_device *dev)
+{
+}
+
+static void __team_netpoll_cleanup(struct team *team)
+{
+ struct team_port *port;
+
+ list_for_each_entry(port, &team->port_list, list)
+ team_port_disable_netpoll(port);
+}
+
+static void team_netpoll_cleanup(struct net_device *dev)
+{
+ struct team *team = netdev_priv(dev);
+
+ mutex_lock(&team->lock);
+ __team_netpoll_cleanup(team);
+ mutex_unlock(&team->lock);
+}
+
+static int team_netpoll_setup(struct net_device *dev,
+ struct netpoll_info *npifo)
+{
+ struct team *team = netdev_priv(dev);
+ struct team_port *port;
+ int err = 0;
+
+ mutex_lock(&team->lock);
+ list_for_each_entry(port, &team->port_list, list) {
+ err = team_port_enable_netpoll(team, port);
+ if (err) {
+ __team_netpoll_cleanup(team);
+ break;
+ }
+ }
+ mutex_unlock(&team->lock);
+ return err;
+}
+#endif
+
static int team_add_slave(struct net_device *dev, struct net_device *port_dev)
{
struct team *team = netdev_priv(dev);
@@ -1363,6 +1471,11 @@ static const struct net_device_ops team_netdev_ops = {
.ndo_get_stats64 = team_get_stats64,
.ndo_vlan_rx_add_vid = team_vlan_rx_add_vid,
.ndo_vlan_rx_kill_vid = team_vlan_rx_kill_vid,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ .ndo_poll_controller = team_poll_controller,
+ .ndo_netpoll_setup = team_netpoll_setup,
+ .ndo_netpoll_cleanup = team_netpoll_cleanup,
+#endif
.ndo_add_slave = team_add_slave,
.ndo_del_slave = team_del_slave,
.ndo_fix_features = team_fix_features,
diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
index 253b8a5..6262b4d 100644
--- a/drivers/net/team/team_mode_activebackup.c
+++ b/drivers/net/team/team_mode_activebackup.c
@@ -43,8 +43,7 @@ static bool ab_transmit(struct team *team, struct sk_buff *skb)
active_port = rcu_dereference_bh(ab_priv(team)->active_port);
if (unlikely(!active_port))
goto drop;
- skb->dev = active_port->dev;
- if (dev_queue_xmit(skb))
+ if (team_dev_queue_xmit(team, active_port, skb))
return false;
return true;
diff --git a/drivers/net/team/team_mode_broadcast.c b/drivers/net/team/team_mode_broadcast.c
index 5562345..c96e4d2 100644
--- a/drivers/net/team/team_mode_broadcast.c
+++ b/drivers/net/team/team_mode_broadcast.c
@@ -29,8 +29,8 @@ static bool bc_transmit(struct team *team, struct sk_buff *skb)
if (last) {
skb2 = skb_clone(skb, GFP_ATOMIC);
if (skb2) {
- skb2->dev = last->dev;
- ret = dev_queue_xmit(skb2);
+ ret = team_dev_queue_xmit(team, last,
+ skb2);
if (!sum_ret)
sum_ret = ret;
}
@@ -39,8 +39,7 @@ static bool bc_transmit(struct team *team, struct sk_buff *skb)
}
}
if (last) {
- skb->dev = last->dev;
- ret = dev_queue_xmit(skb);
+ ret = team_dev_queue_xmit(team, last, skb);
if (!sum_ret)
sum_ret = ret;
}
diff --git a/drivers/net/team/team_mode_loadbalance.c b/drivers/net/team/team_mode_loadbalance.c
index 51a4b19..cdc31b5 100644
--- a/drivers/net/team/team_mode_loadbalance.c
+++ b/drivers/net/team/team_mode_loadbalance.c
@@ -217,8 +217,7 @@ static bool lb_transmit(struct team *team, struct sk_buff *skb)
port = select_tx_port_func(team, lb_priv, skb, hash);
if (unlikely(!port))
goto drop;
- skb->dev = port->dev;
- if (dev_queue_xmit(skb))
+ if (team_dev_queue_xmit(team, port, skb))
return false;
lb_update_tx_stats(tx_bytes, lb_priv, get_lb_port_priv(port), hash);
return true;
diff --git a/drivers/net/team/team_mode_roundrobin.c b/drivers/net/team/team_mode_roundrobin.c
index 0cf38e9..ad7ed0e 100644
--- a/drivers/net/team/team_mode_roundrobin.c
+++ b/drivers/net/team/team_mode_roundrobin.c
@@ -55,8 +55,7 @@ static bool rr_transmit(struct team *team, struct sk_buff *skb)
port = __get_first_port_up(team, port);
if (unlikely(!port))
goto drop;
- skb->dev = port->dev;
- if (dev_queue_xmit(skb))
+ if (team_dev_queue_xmit(team, port, skb))
return false;
return true;
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index dfa0c8e..7fd0cde 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -13,6 +13,8 @@
#ifdef __KERNEL__
+#include <linux/netpoll.h>
+
struct team_pcpu_stats {
u64 rx_packets;
u64 rx_bytes;
@@ -60,6 +62,10 @@ struct team_port {
unsigned int mtu;
} orig;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ struct netpoll *np;
+#endif
+
long mode_priv[0];
};
@@ -73,6 +79,33 @@ static inline bool team_port_txable(struct team_port *port)
return port->linkup && team_port_enabled(port);
}
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static inline void team_netpoll_send_skb(struct team_port *port,
+ struct sk_buff *skb)
+{
+ struct netpoll *np = port->np;
+
+ if (np)
+ netpoll_send_skb(np, skb);
+}
+#else
+static inline void team_netpoll_send_skb(struct team_port *port,
+ struct sk_buff *skb)
+{
+}
+#endif
+
+static inline int team_dev_queue_xmit(struct team *team, struct team_port *port,
+ struct sk_buff *skb)
+{
+ skb->dev = port->dev;
+ if (unlikely(netpoll_tx_running(port->dev))) {
+ team_netpoll_send_skb(port, skb);
+ return 0;
+ }
+ return dev_queue_xmit(skb);
+}
+
struct team_mode_ops {
int (*init)(struct team *team);
void (*exit)(struct team *team);
--
1.7.10.4
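The transmit wrapper added to if_team.h follows a simple dispatch pattern, sketched here in user space. Everything in this block is a hypothetical model: struct port and struct packet replace the kernel objects, the two boolean fields stand in for netpoll_tx_running(port->dev) and port->np, and counters replace real transmission.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects involved. */
struct packet {
	int id;
};

struct port {
	bool netpoll_tx_running;	/* models netpoll_tx_running(port->dev) */
	bool has_netpoll;		/* models port->np != NULL */
	int sent_via_netpoll;
	int sent_via_queue;
};

/* Models dev_queue_xmit(): the ordinary transmit path. */
static int queue_xmit(struct port *port, struct packet *pkt)
{
	(void)pkt;
	port->sent_via_queue++;
	return 0;
}

/* Models team_netpoll_send_skb(): does nothing without a netpoll. */
static void netpoll_send(struct port *port, struct packet *pkt)
{
	(void)pkt;
	if (port->has_netpoll)
		port->sent_via_netpoll++;
}

/* Models team_dev_queue_xmit(): while a netpoll transmit is in
 * progress, bypass the normal queue and hand the packet to netpoll. */
static int port_xmit(struct port *port, struct packet *pkt)
{
	if (port->netpoll_tx_running) {
		netpoll_send(port, pkt);
		return 0;
	}
	return queue_xmit(port, pkt);
}
```

Routing every mode's transmit through one wrapper keeps the netpoll check in a single place, which is presumably why the per-mode files shrink in the diffstat.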
* ethtool 3.4.2 released
From: Ben Hutchings @ 2012-07-17 15:31 UTC
To: netdev
ethtool version 3.4.2 has been released. This fixes various bugs.
Home page: https://ftp.kernel.org/pub/software/network/ethtool/
Download link:
https://ftp.kernel.org/pub/software/network/ethtool/ethtool-3.4.2.tar.gz
Release notes:
* Fix: Fix regression in RX NFC rule insertion for drivers that do
not select rule locations (-N/-U option)
* Fix: Remove bogus error message when changing offload settings
on Linux < 2.6.39 (-K option)
* Fix: Use alternate method to check for VLAN tag offload on Linux
< 2.6.37 (-k option)
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [ethtool PATCH] ethtool: Resolve use of uninitialized memory in rxclass_get_dev_info
From: Ben Hutchings @ 2012-07-17 15:32 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, jeffrey.t.kirsher
In-Reply-To: <5004AD60.9090702@intel.com>
On Mon, 2012-07-16 at 17:10 -0700, Alexander Duyck wrote:
> On 07/16/2012 01:03 PM, Ben Hutchings wrote:
> > On Fri, 2012-07-13 at 09:55 -0700, Alexander Duyck wrote:
> >> The ethtool function for getting the rule count was not zeroing out the
> >> data field before passing it to the kernel. As a result the value started
> >> uninitialized and was incorrectly returning a result indicating that
> >> devices supported setting new rule indexes. In order to correct this I am
> >> adding a one line fix that sets data to zero before we pass the command to
> >> the kernel.
> > Right. For 'get' commands with no parameters (besides the device) the
> > data copied back to userland is normally zero-initialised and then
> > filled out by the driver, and I seem to have worked on that assumption.
> > But because of the odd multiplexing of RX NFC commands
> > ETHTOOL_GRXCLSRLCNT doesn't work like that. And for 'my' driver that
> > didn't matter. Sorry about that.
> >
> > (We should really have some explicit documentation of responsibility for
> > structure initialisation.)
> >
> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> >> ---
> >>
> >> I am resending this since I didn't see any notification that it had been seen.
> >> I also realized that I had not clearly identified that this is an ethtool user
> >> space patch and not an ethtool kernel space patch.
> > It was perfectly clear and I had queued it up to review but hadn't yet
> > done so.
> >
> > Ben.
> >
> Yeah, that was my mistake. I thought I hadn't sent it out with the
> ethtool prefix when I actually had.
So, anyway, I've applied it and just done a bug fix release (3.4.2).
Ben.
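
As an illustration of the initialisation problem discussed above, here is a
minimal userspace sketch. It does not use the real ethtool structures or
ioctl: struct demo_rxnfc, DEMO_LOC_SPECIAL and the mock driver are all
hypothetical stand-ins, chosen only to show what happens when a driver
fills in one field and leaves another untouched.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the real ethtool structures or constants. */
#define DEMO_LOC_SPECIAL 0x80000000u	/* flag: driver can pick rule locations */

struct demo_rxnfc {
	uint32_t cmd;
	uint64_t data;		/* flags on return; a driver may never write it */
	uint32_t rule_cnt;	/* filled in by the driver */
};

/* Mock "driver": reports a rule count but leaves data untouched. */
static void demo_driver_get_count(struct demo_rxnfc *nfc)
{
	nfc->rule_cnt = 4;
}

/* Returns 1 if the caller would conclude that special locations are
 * supported, 0 otherwise. */
static int demo_supports_special(int zero_first)
{
	struct demo_rxnfc nfc;

	/* Simulate stack garbage so the effect is deterministic. */
	memset(&nfc, 0xff, sizeof(nfc));
	if (zero_first)
		memset(&nfc, 0, sizeof(nfc));	/* initialise before the call */
	nfc.cmd = 1;				/* hypothetical command number */

	demo_driver_get_count(&nfc);
	return (nfc.data & DEMO_LOC_SPECIAL) != 0;
}
```

Without the zeroing step the caller reads back whatever was on the stack
and may wrongly conclude that the feature flag is set.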
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* [PATCH 5/5 v2] ipv4: Add FIB nexthop exceptions.
From: David Miller @ 2012-07-17 15:58 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet
In a regime where we have subnetted route entries, we need a way to
persistently store destination-specific learned values such as
redirects and PMTU values.
This is implemented here via nexthop exceptions.
The initial implementation is a 2048-entry hash table with reclaiming
starting at chain length 5. A more sophisticated scheme can be
devised if that proves necessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
---
Eric, just for you :-)
include/net/ip_fib.h | 18 ++++
net/ipv4/fib_semantics.c | 23 +++++
net/ipv4/route.c | 256 ++++++++++++++++++++++++++++++++++++++++------
3 files changed, 266 insertions(+), 31 deletions(-)
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 5697ace..e9ee1ca 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -18,6 +18,7 @@
#include <net/flow.h>
#include <linux/seq_file.h>
+#include <linux/rcupdate.h>
#include <net/fib_rules.h>
#include <net/inetpeer.h>
@@ -46,6 +47,22 @@ struct fib_config {
struct fib_info;
+struct fib_nh_exception {
+ struct fib_nh_exception __rcu *fnhe_next;
+ __be32 fnhe_daddr;
+ u32 fnhe_pmtu;
+ u32 fnhe_gw;
+ unsigned long fnhe_expires;
+ unsigned long fnhe_stamp;
+};
+
+struct fnhe_hash_bucket {
+ struct fib_nh_exception __rcu *chain;
+};
+
+#define FNHE_HASH_SIZE 2048
+#define FNHE_RECLAIM_DEPTH 5
+
struct fib_nh {
struct net_device *nh_dev;
struct hlist_node nh_hash;
@@ -63,6 +80,7 @@ struct fib_nh {
__be32 nh_gw;
__be32 nh_saddr;
int nh_saddr_genid;
+ struct fnhe_hash_bucket *nh_exceptions;
};
/*
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index d71bfbd..1e09852 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -140,6 +140,27 @@ const struct fib_prop fib_props[RTN_MAX + 1] = {
},
};
+static void free_nh_exceptions(struct fib_nh *nh)
+{
+ struct fnhe_hash_bucket *hash = nh->nh_exceptions;
+ int i;
+
+ for (i = 0; i < FNHE_HASH_SIZE; i++) {
+ struct fib_nh_exception *fnhe;
+
+ fnhe = rcu_dereference(hash[i].chain);
+ while (fnhe) {
+ struct fib_nh_exception *next;
+
+ next = rcu_dereference(fnhe->fnhe_next);
+ kfree(fnhe);
+
+ fnhe = next;
+ }
+ }
+ kfree(hash);
+}
+
/* Release a nexthop info record */
static void free_fib_info_rcu(struct rcu_head *head)
{
@@ -148,6 +169,8 @@ static void free_fib_info_rcu(struct rcu_head *head)
change_nexthops(fi) {
if (nexthop_nh->nh_dev)
dev_put(nexthop_nh->nh_dev);
+ if (nexthop_nh->nh_exceptions)
+ free_nh_exceptions(nexthop_nh);
} endfor_nexthops(fi);
release_net(fi->fib_net);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index b35d3bf..a5bd0b4 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1275,14 +1275,130 @@ static void rt_del(unsigned int hash, struct rtable *rt)
spin_unlock_bh(rt_hash_lock_addr(hash));
}
-static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb)
+static void __build_flow_key(struct flowi4 *fl4, struct sock *sk,
+ const struct iphdr *iph,
+ int oif, u8 tos,
+ u8 prot, u32 mark, int flow_flags)
+{
+ if (sk) {
+ const struct inet_sock *inet = inet_sk(sk);
+
+ oif = sk->sk_bound_dev_if;
+ mark = sk->sk_mark;
+ tos = RT_CONN_FLAGS(sk);
+ prot = inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol;
+ }
+ flowi4_init_output(fl4, oif, mark, tos,
+ RT_SCOPE_UNIVERSE, prot,
+ flow_flags,
+ iph->daddr, iph->saddr, 0, 0);
+}
+
+static void build_skb_flow_key(struct flowi4 *fl4, struct sk_buff *skb, struct sock *sk)
+{
+ const struct iphdr *iph = ip_hdr(skb);
+ int oif = skb->dev->ifindex;
+ u8 tos = RT_TOS(iph->tos);
+ u8 prot = iph->protocol;
+ u32 mark = skb->mark;
+
+ __build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
+}
+
+static void build_sk_flow_key(struct flowi4 *fl4, struct sock *sk)
+{
+ const struct inet_sock *inet = inet_sk(sk);
+ struct ip_options_rcu *inet_opt;
+ __be32 daddr = inet->inet_daddr;
+
+ rcu_read_lock();
+ inet_opt = rcu_dereference(inet->inet_opt);
+ if (inet_opt && inet_opt->opt.srr)
+ daddr = inet_opt->opt.faddr;
+ flowi4_init_output(fl4, sk->sk_bound_dev_if, sk->sk_mark,
+ RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
+ inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+ inet_sk_flowi_flags(sk),
+ daddr, inet->inet_saddr, 0, 0);
+ rcu_read_unlock();
+}
+
+static void ip_rt_build_flow_key(struct flowi4 *fl4, struct sock *sk,
+ struct sk_buff *skb)
+{
+ if (skb)
+ build_skb_flow_key(fl4, skb, sk);
+ else
+ build_sk_flow_key(fl4, sk);
+}
+
+static DEFINE_SPINLOCK(fnhe_lock);
+
+static struct fib_nh_exception *fnhe_oldest(struct fnhe_hash_bucket *hash, __be32 daddr)
+{
+ struct fib_nh_exception *fnhe, *oldest;
+
+ oldest = rcu_dereference(hash->chain);
+ for (fnhe = rcu_dereference(oldest->fnhe_next); fnhe;
+ fnhe = rcu_dereference(fnhe->fnhe_next)) {
+ if (time_before(fnhe->fnhe_stamp, oldest->fnhe_stamp))
+ oldest = fnhe;
+ }
+ return oldest;
+}
+
+static struct fib_nh_exception *find_or_create_fnhe(struct fib_nh *nh, __be32 daddr)
+{
+ struct fnhe_hash_bucket *hash = nh->nh_exceptions;
+ struct fib_nh_exception *fnhe;
+ int depth;
+ u32 hval;
+
+ if (!hash) {
+ hash = nh->nh_exceptions = kzalloc(FNHE_HASH_SIZE * sizeof(*hash),
+ GFP_ATOMIC);
+ if (!hash)
+ return NULL;
+ }
+
+ hval = (__force u32) daddr;
+ hval = (hval ^ (hval >> 11) ^ (hval >> 22)) & (FNHE_HASH_SIZE - 1);
+ hash += hval;
+
+ depth = 0;
+ for (fnhe = rcu_dereference(hash->chain); fnhe;
+ fnhe = rcu_dereference(fnhe->fnhe_next)) {
+ if (fnhe->fnhe_daddr == daddr)
+ goto out;
+ depth++;
+ }
+
+ if (depth > FNHE_RECLAIM_DEPTH) {
+ fnhe = fnhe_oldest(hash, daddr);
+ goto out_daddr;
+ }
+ fnhe = kzalloc(sizeof(*fnhe), GFP_ATOMIC);
+ if (!fnhe)
+ return NULL;
+
+ fnhe->fnhe_next = hash->chain;
+ rcu_assign_pointer(hash->chain, fnhe);
+
+out_daddr:
+ fnhe->fnhe_daddr = daddr;
+out:
+ fnhe->fnhe_stamp = jiffies;
+ return fnhe;
+}
+
+static void __ip_do_redirect(struct rtable *rt, struct sk_buff *skb, struct flowi4 *fl4)
{
__be32 new_gw = icmp_hdr(skb)->un.gateway;
__be32 old_gw = ip_hdr(skb)->saddr;
struct net_device *dev = skb->dev;
struct in_device *in_dev;
+ struct fib_result res;
struct neighbour *n;
- struct rtable *rt;
struct net *net;
switch (icmp_hdr(skb)->code & 7) {
@@ -1296,7 +1412,6 @@ static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buf
return;
}
- rt = (struct rtable *) dst;
if (rt->rt_gateway != old_gw)
return;
@@ -1320,11 +1435,21 @@ static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buf
goto reject_redirect;
}
- n = ipv4_neigh_lookup(dst, NULL, &new_gw);
+ n = ipv4_neigh_lookup(&rt->dst, NULL, &new_gw);
if (n) {
if (!(n->nud_state & NUD_VALID)) {
neigh_event_send(n, NULL);
} else {
+ if (fib_lookup(net, fl4, &res) == 0) {
+ struct fib_nh *nh = &FIB_RES_NH(res);
+ struct fib_nh_exception *fnhe;
+
+ spin_lock_bh(&fnhe_lock);
+ fnhe = find_or_create_fnhe(nh, fl4->daddr);
+ if (fnhe)
+ fnhe->fnhe_gw = new_gw;
+ spin_unlock_bh(&fnhe_lock);
+ }
rt->rt_gateway = new_gw;
rt->rt_flags |= RTCF_REDIRECTED;
call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
@@ -1349,6 +1474,17 @@ reject_redirect:
;
}
+static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb)
+{
+ struct rtable *rt;
+ struct flowi4 fl4;
+
+ rt = (struct rtable *) dst;
+
+ ip_rt_build_flow_key(&fl4, sk, skb);
+ __ip_do_redirect(rt, skb, &fl4);
+}
+
static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst)
{
struct rtable *rt = (struct rtable *)dst;
@@ -1508,33 +1644,51 @@ out: kfree_skb(skb);
return 0;
}
-static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
- struct sk_buff *skb, u32 mtu)
+static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu)
{
- struct rtable *rt = (struct rtable *) dst;
-
- dst_confirm(dst);
+ struct fib_result res;
if (mtu < ip_rt_min_pmtu)
mtu = ip_rt_min_pmtu;
+ if (fib_lookup(dev_net(rt->dst.dev), fl4, &res) == 0) {
+ struct fib_nh *nh = &FIB_RES_NH(res);
+ struct fib_nh_exception *fnhe;
+
+ spin_lock_bh(&fnhe_lock);
+ fnhe = find_or_create_fnhe(nh, fl4->daddr);
+ if (fnhe) {
+ fnhe->fnhe_pmtu = mtu;
+ fnhe->fnhe_expires = jiffies + ip_rt_mtu_expires;
+ }
+ spin_unlock_bh(&fnhe_lock);
+ }
rt->rt_pmtu = mtu;
dst_set_expires(&rt->dst, ip_rt_mtu_expires);
}
+static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
+ struct sk_buff *skb, u32 mtu)
+{
+ struct rtable *rt = (struct rtable *) dst;
+ struct flowi4 fl4;
+
+ ip_rt_build_flow_key(&fl4, sk, skb);
+ __ip_rt_update_pmtu(rt, &fl4, mtu);
+}
+
void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu,
int oif, u32 mark, u8 protocol, int flow_flags)
{
- const struct iphdr *iph = (const struct iphdr *)skb->data;
+ const struct iphdr *iph = (const struct iphdr *) skb->data;
struct flowi4 fl4;
struct rtable *rt;
- flowi4_init_output(&fl4, oif, mark, RT_TOS(iph->tos), RT_SCOPE_UNIVERSE,
- protocol, flow_flags,
- iph->daddr, iph->saddr, 0, 0);
+ __build_flow_key(&fl4, NULL, iph, oif,
+ RT_TOS(iph->tos), protocol, mark, flow_flags);
rt = __ip_route_output_key(net, &fl4);
if (!IS_ERR(rt)) {
- ip_rt_update_pmtu(&rt->dst, NULL, skb, mtu);
+ __ip_rt_update_pmtu(rt, &fl4, mtu);
ip_rt_put(rt);
}
}
@@ -1542,27 +1696,31 @@ EXPORT_SYMBOL_GPL(ipv4_update_pmtu);
void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu)
{
- const struct inet_sock *inet = inet_sk(sk);
+ const struct iphdr *iph = (const struct iphdr *) skb->data;
+ struct flowi4 fl4;
+ struct rtable *rt;
- return ipv4_update_pmtu(skb, sock_net(sk), mtu,
- sk->sk_bound_dev_if, sk->sk_mark,
- inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
- inet_sk_flowi_flags(sk));
+ __build_flow_key(&fl4, sk, iph, 0, 0, 0, 0, 0);
+ rt = __ip_route_output_key(sock_net(sk), &fl4);
+ if (!IS_ERR(rt)) {
+ __ip_rt_update_pmtu(rt, &fl4, mtu);
+ ip_rt_put(rt);
+ }
}
EXPORT_SYMBOL_GPL(ipv4_sk_update_pmtu);
void ipv4_redirect(struct sk_buff *skb, struct net *net,
int oif, u32 mark, u8 protocol, int flow_flags)
{
- const struct iphdr *iph = (const struct iphdr *)skb->data;
+ const struct iphdr *iph = (const struct iphdr *) skb->data;
struct flowi4 fl4;
struct rtable *rt;
- flowi4_init_output(&fl4, oif, mark, RT_TOS(iph->tos), RT_SCOPE_UNIVERSE,
- protocol, flow_flags, iph->daddr, iph->saddr, 0, 0);
+ __build_flow_key(&fl4, NULL, iph, oif,
+ RT_TOS(iph->tos), protocol, mark, flow_flags);
rt = __ip_route_output_key(net, &fl4);
if (!IS_ERR(rt)) {
- ip_do_redirect(&rt->dst, NULL, skb);
+ __ip_do_redirect(rt, skb, &fl4);
ip_rt_put(rt);
}
}
@@ -1570,12 +1728,16 @@ EXPORT_SYMBOL_GPL(ipv4_redirect);
void ipv4_sk_redirect(struct sk_buff *skb, struct sock *sk)
{
- const struct inet_sock *inet = inet_sk(sk);
+ const struct iphdr *iph = (const struct iphdr *) skb->data;
+ struct flowi4 fl4;
+ struct rtable *rt;
- return ipv4_redirect(skb, sock_net(sk), sk->sk_bound_dev_if,
- sk->sk_mark,
- inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
- inet_sk_flowi_flags(sk));
+ __build_flow_key(&fl4, sk, iph, 0, 0, 0, 0, 0);
+ rt = __ip_route_output_key(sock_net(sk), &fl4);
+ if (!IS_ERR(rt)) {
+ __ip_do_redirect(rt, skb, &fl4);
+ ip_rt_put(rt);
+ }
}
EXPORT_SYMBOL_GPL(ipv4_sk_redirect);
@@ -1722,14 +1884,46 @@ static void rt_init_metrics(struct rtable *rt, const struct flowi4 *fl4,
dst_init_metrics(&rt->dst, fi->fib_metrics, true);
}
+static void rt_bind_exception(struct rtable *rt, struct fib_nh *nh, __be32 daddr)
+{
+ struct fnhe_hash_bucket *hash = nh->nh_exceptions;
+ struct fib_nh_exception *fnhe;
+ u32 hval;
+
+ hval = (__force u32) daddr;
+ hval = (hval ^ (hval >> 11) ^ (hval >> 22)) & (FNHE_HASH_SIZE - 1);
+
+ for (fnhe = rcu_dereference(hash[hval].chain); fnhe;
+ fnhe = rcu_dereference(fnhe->fnhe_next)) {
+ if (fnhe->fnhe_daddr == daddr) {
+ if (fnhe->fnhe_pmtu) {
+ unsigned long expires = fnhe->fnhe_expires;
+ unsigned long diff = expires - jiffies;
+
+ if (time_before(jiffies, expires)) {
+ rt->rt_pmtu = fnhe->fnhe_pmtu;
+ dst_set_expires(&rt->dst, diff);
+ }
+ }
+ if (fnhe->fnhe_gw)
+ rt->rt_gateway = fnhe->fnhe_gw;
+ fnhe->fnhe_stamp = jiffies;
+ break;
+ }
+ }
+}
+
static void rt_set_nexthop(struct rtable *rt, const struct flowi4 *fl4,
const struct fib_result *res,
struct fib_info *fi, u16 type, u32 itag)
{
if (fi) {
- if (FIB_RES_GW(*res) &&
- FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK)
- rt->rt_gateway = FIB_RES_GW(*res);
+ struct fib_nh *nh = &FIB_RES_NH(*res);
+
+ if (nh->nh_gw && nh->nh_scope == RT_SCOPE_LINK)
+ rt->rt_gateway = nh->nh_gw;
+ if (unlikely(nh->nh_exceptions))
+ rt_bind_exception(rt, nh, fl4->daddr);
rt_init_metrics(rt, fl4, fi);
#ifdef CONFIG_IP_ROUTE_CLASSID
rt->dst.tclassid = FIB_RES_NH(*res).nh_tclassid;
--
1.7.10.4
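
For readers who want to experiment with the scheme described in the commit
message, here is a minimal userspace sketch of a fixed-size hash table whose
chains recycle their stalest entry once they grow past a depth limit. It is
not the kernel code: every demo_* name is hypothetical, there is no RCU or
locking, and the timestamp is a caller-supplied tick count instead of
jiffies. The folded hash value is masked so that it is always a valid
bucket index, and the depth comparison mirrors the "depth > limit" test
used in the patch.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_HASH_SIZE     2048	/* power of two, so the mask below works */
#define DEMO_RECLAIM_DEPTH 5

struct demo_exception {
	struct demo_exception *next;
	uint32_t daddr;
	unsigned long stamp;	/* last-use time, caller-supplied ticks */
};

static struct demo_exception *demo_table[DEMO_HASH_SIZE];

/* Fold the address, then mask it into the range [0, DEMO_HASH_SIZE). */
static uint32_t demo_hash(uint32_t daddr)
{
	uint32_t hval = daddr;

	hval ^= (hval >> 11) ^ (hval >> 22);
	return hval & (DEMO_HASH_SIZE - 1);
}

/* Number of entries chained in the bucket that daddr maps to. */
static int demo_chain_length(uint32_t daddr)
{
	struct demo_exception *e;
	int n = 0;

	for (e = demo_table[demo_hash(daddr)]; e; e = e->next)
		n++;
	return n;
}

/* Find the entry for daddr or create one. Once a chain is longer than
 * DEMO_RECLAIM_DEPTH, the least recently used entry is reused in place. */
static struct demo_exception *demo_find_or_create(uint32_t daddr,
						  unsigned long now)
{
	struct demo_exception **bucket = &demo_table[demo_hash(daddr)];
	struct demo_exception *e, *oldest = NULL;
	int depth = 0;

	for (e = *bucket; e; e = e->next) {
		if (e->daddr == daddr)
			goto out;
		/* wraparound-safe "is older than" comparison */
		if (!oldest || (long)(e->stamp - oldest->stamp) < 0)
			oldest = e;
		depth++;
	}

	if (depth > DEMO_RECLAIM_DEPTH) {
		e = oldest;		/* chain length stays the same */
	} else {
		e = calloc(1, sizeof(*e));
		if (!e)
			return NULL;
		e->next = *bucket;
		*bucket = e;
	}
	e->daddr = daddr;
out:
	e->stamp = now;
	return e;
}
```

Addresses of the form (x << 11) | x with x < 2048 all fold to bucket 0
under this hash, which makes it easy to exercise the reclaim path: the
chain stops growing at six entries no matter how many such addresses are
inserted.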
^ permalink raw reply related
* That's pretty much it for 3.5.0
From: David Miller @ 2012-07-17 16:01 UTC (permalink / raw)
To: netdev; +Cc: linux-wireless, netfilter-devel
Linus was _extremely_ generous and took in all the stuff that was
pending in the net tree just now.
Besides very serious issues, I'm not willing to consider any more bug
fixes for the 'net' tree at this time.
Only one pending known bug qualifies, and that's the CIPSO ip option
processing OOPS'er. And I'll work on that myself if Paul Moore
doesn't show a sign of life in the next day.
Thanks.
^ permalink raw reply
* Re: [patch net-next 0/2] team: add netpoll support
From: David Miller @ 2012-07-17 16:02 UTC (permalink / raw)
To: jiri; +Cc: netdev
In-Reply-To: <1342538556-22601-1-git-send-email-jiri@resnulli.us>
From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 17 Jul 2012 17:22:34 +0200
> Also contains a little change to netpoll core.
>
> Jiri Pirko (2):
> netpoll: move np->dev and np->dev_name init into __netpoll_setup()
> team: add netpoll support
Both applied, thanks Jiri.
^ permalink raw reply
* [PATCH ethtool 0/3] Cleanup for the RPM
From: Ben Hutchings @ 2012-07-17 16:21 UTC (permalink / raw)
To: netdev; +Cc: linux-net-drivers
I haven't tried building RPMs in a while, and I don't know whether
anyone actually uses the bundled spec file. Anyway, this should freshen
it up a bit.
Ben.
Ben Hutchings (3):
ethtool.spec: Update summary and description, based on Fedora package
ethtool.spec: Update URL to the current home page
ethtool.spec: Do not include ChangeLog or INSTALL
ethtool.spec.in | 14 ++++++--------
1 files changed, 6 insertions(+), 8 deletions(-)
--
1.7.7.6
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* [PATCH ethtool 1/3] ethtool.spec: Update summary and description, based on Fedora package
From: Ben Hutchings @ 2012-07-17 16:23 UTC (permalink / raw)
To: netdev; +Cc: linux-net-drivers
In-Reply-To: <1342542068.2698.7.camel@bwh-desktop.uk.solarflarecom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
It would probably be better to come up with a single concise description
to use in all the various places it's needed: RPMs, debs, web site,
manual page...
Ben.
ethtool.spec.in | 10 ++++------
1 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/ethtool.spec.in b/ethtool.spec.in
index 4ff736a..879555d 100644
--- a/ethtool.spec.in
+++ b/ethtool.spec.in
@@ -3,7 +3,7 @@ Version : @VERSION@
Release : 1
Group : Utilities
-Summary : A tool for setting ethernet parameters
+Summary : Settings tool for Ethernet and other network devices
License : GPL
URL : http://sourceforge.net/projects/gkernel/
@@ -13,11 +13,9 @@ Source : %{name}-%{version}.tar.gz
%description
-Ethtool is a small utility to get and set values from your your ethernet
-controllers. Not all ethernet drivers support ethtool, but it is getting
-better. If your ethernet driver doesn't support it, ask the maintainer to
-write support - it's not hard!
-
+This utility allows querying and changing settings such as speed,
+port, auto-negotiation, PCI locations and checksum offload on many
+network devices, especially Ethernet devices.
%prep
%setup -q
--
1.7.7.6
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* [PATCH ethtool 2/3] ethtool.spec: Update URL to the current home page
From: Ben Hutchings @ 2012-07-17 16:24 UTC (permalink / raw)
To: netdev; +Cc: linux-net-drivers
In-Reply-To: <1342542068.2698.7.camel@bwh-desktop.uk.solarflarecom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
ethtool.spec.in | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/ethtool.spec.in b/ethtool.spec.in
index 879555d..863dfd4 100644
--- a/ethtool.spec.in
+++ b/ethtool.spec.in
@@ -6,7 +6,7 @@ Group : Utilities
Summary : Settings tool for Ethernet and other network devices
License : GPL
-URL : http://sourceforge.net/projects/gkernel/
+URL : https://ftp.kernel.org/pub/software/network/ethtool/
Buildroot : %{_tmppath}/%{name}-%{version}
Source : %{name}-%{version}.tar.gz
--
1.7.7.6
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* [PATCH ethtool 3/3] ethtool.spec: Do not include ChangeLog or INSTALL
From: Ben Hutchings @ 2012-07-17 16:24 UTC (permalink / raw)
To: netdev; +Cc: linux-net-drivers
In-Reply-To: <1342542068.2698.7.camel@bwh-desktop.uk.solarflarecom.com>
The ChangeLog is ancient history, replaced by the version control
changelog.
INSTALL is redundant in a binary package.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
ethtool.spec.in | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/ethtool.spec.in b/ethtool.spec.in
index 863dfd4..6e9e1f5 100644
--- a/ethtool.spec.in
+++ b/ethtool.spec.in
@@ -34,7 +34,7 @@ make install DESTDIR=${RPM_BUILD_ROOT}
%defattr(-,root,root)
/usr/sbin/ethtool
%{_mandir}/man8/ethtool.8*
-%doc AUTHORS COPYING INSTALL NEWS README ChangeLog
+%doc AUTHORS COPYING NEWS README
%changelog
--
1.7.7.6
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply related
* Re: Crash in CIPSO_V4_TAG_LOCAL handling
From: Paul Moore @ 2012-07-17 16:25 UTC (permalink / raw)
To: David Miller; +Cc: mlin, alan, netdev
In-Reply-To: <20120714.130817.394766887121758073.davem@davemloft.net>
On Sat, Jul 14, 2012 at 4:08 PM, David Miller <davem@davemloft.net> wrote:
> From: Lin Ming <mlin@ss.pku.edu.cn>
> Date: Sun, 15 Jul 2012 01:22:30 +0800
>
>> It's caused by below code added in commit 15c45f7b.
>>
>> case CIPSO_V4_TAG_LOCAL:
>> /* This is a non-standard tag that we only allow for
>> * local connections, so if the incoming interface is
>> * not the loopback device drop the packet. */
>> if (!(skb->dev->flags & IFF_LOOPBACK)) {
>> err_offset = opt_iter;
>> goto validate_return_locked;
>> }
>
> Paul please fix this, as shown 'skb' can easily be NULL in this
> code path.
Just saw this ... I'll start looking into this today.
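
For reference, the kind of guard being asked for can be sketched in plain C
as below. This is only an illustration with hypothetical stand-in types
(demo_skb, demo_dev, DEMO_IFF_LOOPBACK), not the CIPSO code itself. Whether
a missing packet should be rejected or accepted is a policy decision for
the maintainer; this sketch assumes rejection.

```c
#include <stddef.h>

#define DEMO_IFF_LOOPBACK 0x8	/* stand-in for the kernel's IFF_LOOPBACK */

struct demo_dev {
	unsigned int flags;
};

struct demo_skb {
	struct demo_dev *dev;
};

/* Returns 1 if a local-only tag may be accepted, 0 if it must be rejected.
 * A NULL packet (or one without a device) is treated as "not loopback"
 * instead of being dereferenced. */
static int demo_local_tag_allowed(const struct demo_skb *skb)
{
	if (skb == NULL || skb->dev == NULL)
		return 0;
	return (skb->dev->flags & DEMO_IFF_LOOPBACK) != 0;
}
```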
--
paul moore
www.paul-moore.com
^ permalink raw reply
* [PATCH] jme: netpoll support
From: Lekensteyn @ 2012-07-17 16:29 UTC (permalink / raw)
To: Guo-Fu Tseng; +Cc: netdev
From: Peter Wu <lekensteyn@gmail.com>
This patch adds the netpoll function to support netconsole. Tested and works
fine on my "JMC250 PCI Express Gigabit Ethernet Controller" (PCI ID 0250).
Signed-off-by: Peter Wu <lekensteyn@gmail.com>
---
drivers/net/ethernet/jme.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index 4ea6580..c911d88 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -2743,6 +2743,17 @@ jme_set_features(struct net_device *netdev, netdev_features_t features)
return 0;
}
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void jme_netpoll(struct net_device *dev)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ jme_intr(dev->irq, dev);
+ local_irq_restore(flags);
+}
+#endif
+
static int
jme_nway_reset(struct net_device *netdev)
{
@@ -2944,6 +2955,9 @@ static const struct net_device_ops jme_netdev_ops = {
.ndo_tx_timeout = jme_tx_timeout,
.ndo_fix_features = jme_fix_features,
.ndo_set_features = jme_set_features,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ .ndo_poll_controller = jme_netpoll,
+#endif
};
static int __devinit
--
1.7.9.5
^ permalink raw reply related
* pull request: sfc-next 2012-07-17
From: Ben Hutchings @ 2012-07-17 17:05 UTC (permalink / raw)
To: David Miller; +Cc: linux-net-drivers, netdev
The following changes since commit 141e369de698f2e17bf716b83fcc647ddcb2220c:
xfrm: Initialize the struct xfrm_dst behind the dst_enty field (2012-07-14 00:29:12 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem
(commit c2dbab39db1c3c2ccbdbb2c6bac6f07cc7a7c1f6)
1. Fix potential badness when running a self-test with SR-IOV enabled.
2. Fix calculation of some interface statistics that could run backward.
3. Miscellaneous cleanup.
Ben.
Ben Hutchings (10):
sfc: Work around bogus 'uninitialised variable' warning
sfc: Use generic DMA API, not PCI-DMA API
sfc: Remove dead write to tso_state::packet_space
sfc: Stop changing header offsets on TX
sfc: Use strlcpy() to copy ethtool stats names
sfc: Use dev_kfree_skb() in efx_end_loopback()
sfc: Explain why efx_mcdi_exit_assertion() ignores result of efx_mcdi_rpc()
sfc: Disable VF queues during register self-test
sfc: Fix interface statistics running backward
sfc: Correct some comments on enum reset_type
drivers/net/ethernet/sfc/efx.c | 10 ++--
drivers/net/ethernet/sfc/enum.h | 8 ++--
drivers/net/ethernet/sfc/ethtool.c | 2 +-
drivers/net/ethernet/sfc/falcon.c | 35 +++++++++++--
drivers/net/ethernet/sfc/falcon_xmac.c | 12 ++--
drivers/net/ethernet/sfc/filter.c | 2 +-
drivers/net/ethernet/sfc/mcdi.c | 11 +++-
drivers/net/ethernet/sfc/net_driver.h | 9 ++-
drivers/net/ethernet/sfc/nic.c | 11 ++---
drivers/net/ethernet/sfc/nic.h | 18 ++++++
drivers/net/ethernet/sfc/rx.c | 22 ++++----
drivers/net/ethernet/sfc/selftest.c | 64 ++++++----------------
drivers/net/ethernet/sfc/siena.c | 37 ++++++++++---
drivers/net/ethernet/sfc/tx.c | 93 ++++++++++++++------------------
14 files changed, 181 insertions(+), 153 deletions(-)
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: Crash in CIPSO_V4_TAG_LOCAL handling
From: David Miller @ 2012-07-17 17:28 UTC (permalink / raw)
To: paul; +Cc: mlin, alan, netdev
In-Reply-To: <CAHC9VhQKhoRt8yD4WAgD5tyqjY4bbp4Z4hr5cWfs-d6GJt0pjQ@mail.gmail.com>
From: Paul Moore <paul@paul-moore.com>
Date: Tue, 17 Jul 2012 12:25:28 -0400
> On Sat, Jul 14, 2012 at 4:08 PM, David Miller <davem@davemloft.net> wrote:
>> From: Lin Ming <mlin@ss.pku.edu.cn>
>> Date: Sun, 15 Jul 2012 01:22:30 +0800
>>
>>> It's caused by below code added in commit 15c45f7b.
>>>
>>> case CIPSO_V4_TAG_LOCAL:
>>> /* This is a non-standard tag that we only allow for
>>> * local connections, so if the incoming interface is
>>> * not the loopback device drop the packet. */
>>> if (!(skb->dev->flags & IFF_LOOPBACK)) {
>>> err_offset = opt_iter;
>>> goto validate_return_locked;
>>> }
>>
>> Paul please fix this, as shown 'skb' can easily be NULL in this
>> code path.
>
> Just saw this ... I'll start looking into this today.
Thanks, sorry I messed up your email, I should have checked MAINTAINERS :)
^ permalink raw reply
* Re: pull request: sfc-next 2012-07-17
From: David Miller @ 2012-07-17 17:31 UTC (permalink / raw)
To: bhutchings; +Cc: linux-net-drivers, netdev
In-Reply-To: <1342544740.2698.13.camel@bwh-desktop.uk.solarflarecom.com>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Tue, 17 Jul 2012 18:05:40 +0100
> The following changes since commit 141e369de698f2e17bf716b83fcc647ddcb2220c:
>
> xfrm: Initialize the struct xfrm_dst behind the dst_enty field (2012-07-14 00:29:12 -0700)
>
> are available in the git repository at:
> git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem
>
> (commit c2dbab39db1c3c2ccbdbb2c6bac6f07cc7a7c1f6)
>
> 1. Fix potential badness when running a self-test with SR-IOV enabled.
> 2. Fix calculation of some interface statistics that could run backward.
> 3. Miscellaneous cleanup.
Please post the patches so that at least in theory people have the
opportunity to review them.
^ permalink raw reply
* Re: [PATCH] netem: fix rate extension and drop accounting
From: Eric Dumazet @ 2012-07-17 17:39 UTC (permalink / raw)
To: Mark Gordon
Cc: Hagen Paul Pfeifer, David Miller, netdev, Yuchung Cheng,
Andreas Terzis
In-Reply-To: <CAPVr9VMCYFO-7uEzO6ft2vpPhVvRgHB3EWJJG62OqGqux1LsZQ@mail.gmail.com>
Please Mark :
1) Don't top post on netdev
2) Don't write HTML mails on netdev (your mail never went to netdev,
only to CCed people). Only text mails are allowed.
On Tue, 2012-07-17 at 10:20 -0700, Mark Gordon wrote:
> Even the static delay case seems wrong with the new patch. Assume all
> packets have the same sched_time. Then if you spam packets that get
> processed at the same time by netem they will all get scheduled with
> the same time_to_send because the first packet will get time_to_send
> of [1] = clock_time + sched_time. Then packet n compute 'now' as
> [n-1] and delay as sched_time - (clock_time - [1]) = 0 so that [n] =
> [n-1]. Therefore every packet gets scheduled at the same time.
>
>
> The above modification seems to fix the issue when latency/jitter is 0
> but suffers from a missing non-linearity when delay is present. Is
> there a technical reason I'm missing that prevents us from doing rate
> and latency here? Why wouldn't the 'official' patch have correct
> rate?
Because delay is variable (jitter).
As it stands, netem does not work correctly if you have both a rate limit
and delay.
Hagen is working on a solution, but there is no easy fix.
The right solution is to have:
1) A rate stage, using a child qdisc (that you can graft to install your
own qdisc hierarchy if needed, say if you want codel or fq_codel ;))
That's basically a TBF...
2) skb orphan
3) drops/reorders/corrupt/additional delay (variable delay)
using an internal tfifo, to mimic real network behavior.
That's the reverse of how it's currently done.
Alternatively, this could be implemented as a special network device,
like bonding, instead of a qdisc.
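
To make the proposed ordering concrete, here is a small arithmetic sketch
of a rate stage followed by a fixed delay. It is not netem code: the
demo_* names and the nanosecond bookkeeping are assumptions made for
illustration only. The rate stage remembers when the link becomes free,
so packets that arrive in a burst get distinct, evenly spaced departure
times, and the delay is added afterwards.

```c
#include <stdint.h>

/* Hypothetical state of a rate stage. */
struct demo_rate_stage {
	uint64_t rate_bytes_per_sec;
	uint64_t link_free_ns;	/* when the previous packet finishes sending */
};

/* Departure time (ns) of a packet of len bytes that arrives at now_ns.
 * Transmission starts once the packet has arrived and the link is idle. */
static uint64_t demo_rate_depart(struct demo_rate_stage *rs,
				 uint64_t now_ns, uint32_t len)
{
	uint64_t start = now_ns > rs->link_free_ns ? now_ns : rs->link_free_ns;
	uint64_t tx_ns = (uint64_t)len * 1000000000ull / rs->rate_bytes_per_sec;

	rs->link_free_ns = start + tx_ns;
	return rs->link_free_ns;
}

/* Extra latency applied after the rate stage. */
static uint64_t demo_add_delay(uint64_t depart_ns, uint64_t delay_ns)
{
	return depart_ns + delay_ns;
}
```

With a rate of 1,000,000 bytes per second, three 1000-byte packets that
all arrive at time 0 leave the rate stage 1 ms apart, and a 10 ms delay
shifts each of them by the same amount without collapsing the spacing.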
^ permalink raw reply
* Re: That's pretty much it for 3.5.0
From: Rustad, Mark D @ 2012-07-17 17:41 UTC (permalink / raw)
To: David Miller
Cc: <netdev@vger.kernel.org>,
<linux-wireless@vger.kernel.org>,
<netfilter-devel@vger.kernel.org>
In-Reply-To: <20120717.090142.125145009944045241.davem@davemloft.net>
On Jul 17, 2012, at 9:01 AM, David Miller wrote:
> Linus was _extremely_ generous and took in all the stuff that was
> pending in the net tree just now.
Maybe *too* generous. :-) I just updated and when I boot I get an early crash in update_netdev_tables which is in netprio_cgroup.c.
> Besides very serious issues, I'm not willing to consider any more bug
> fixes for the 'net' tree at this time.
I think the above issue will have to be fixed, as it completely prevents booting for any kernel that includes the netprio_cgroup option.
> Only one pending known bug qualifies, and that's the CIPSO ip option
> processing OOPS'er. And I'll work on that myself if Paul Moore
> doesn't show a sign of life in the next day.
>
> Thanks.
I can start taking a look at this if you like, but I see that Gao feng has two patches in the last set of patches that may be related.
To give you an idea how early the crash is, here are a few log messages leading up to it:
[ 0.003455] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
[ 0.005550] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.007165] Mount-cache hash table entries: 256
[ 0.010289] Initializing cgroup subsys net_cls
[ 0.010947] Initializing cgroup subsys net_prio
[ 0.011039] BUG: unable to handle kernel NULL pointer dereference at 0000000000000828
[ 0.011998] IP: [<ffffffff814202c8>] update_netdev_tables+0x68/0xe0
--
Mark Rustad, LAN Access Division, Intel Corporation
^ permalink raw reply
* Re: [PATCH 0/5] Long term PMTU/redirect storage in ipv4.
From: David Miller @ 2012-07-17 18:03 UTC (permalink / raw)
To: netdev
In-Reply-To: <20120717.061418.1893307699868826531.davem@davemloft.net>
From: David Miller <davem@davemloft.net>
Date: Tue, 17 Jul 2012 06:14:18 -0700 (PDT)
> These patches implement the final mechanism necessary to really allow
> us to go without the route cache in ipv4.
Ok I pushed this out to net-next with the v2 of patch #5 and the merge
commit message adjusted to suit.
I think the routing cache will die in net-next for real some time
later this week.
I'll start respinning those patches.
^ permalink raw reply
* Re: pull request: sfc-next 2012-07-17
From: Ben Hutchings @ 2012-07-17 18:03 UTC (permalink / raw)
To: David Miller; +Cc: linux-net-drivers, netdev
In-Reply-To: <20120717.103103.655643871226631461.davem@davemloft.net>
On Tue, 2012-07-17 at 10:31 -0700, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Tue, 17 Jul 2012 18:05:40 +0100
>
> > The following changes since commit 141e369de698f2e17bf716b83fcc647ddcb2220c:
> >
> > xfrm: Initialize the struct xfrm_dst behind the dst_enty field (2012-07-14 00:29:12 -0700)
> >
> > are available in the git repository at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem
> >
> > (commit c2dbab39db1c3c2ccbdbb2c6bac6f07cc7a7c1f6)
> >
> > 1. Fix potential badness when running a self-test with SR-IOV enabled.
> > 2. Fix calculation of some interface statistics that could run backward.
> > 3. Miscellaneous cleanup.
>
> Please post the patches so that at least in theory people have the
> opportunity to review them.
Sorry, yes. They're the same as last time modulo the MMIO, but I
suppose they may have been ignored after the objections to that.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply