* Re: Bridge extensions to iproute2
From: Stephen Hemminger @ 2012-07-10 22:06 UTC (permalink / raw)
To: Maciej Żenczykowski; +Cc: Linux NetDev, David Miller
In-Reply-To: <CAHo-OozFCrcx4FpsuNzEV4Cp_4hbVw896utpstYL2qcD1EwCyA@mail.gmail.com>
On Tue, 10 Jul 2012 15:01:26 -0700
Maciej Żenczykowski <zenczykowski@gmail.com> wrote:
> > I will get back to these. There wasn't a motivation to go fast because
> > there wasn't a user of these. Now with fdb offload support they are needed.
>
> Do you have some semi-ready patches that could be used for test purposes?
>
> While it looks like the forwarding database capability is there in the kernel,
> I can't currently find an interface to turn learning off.
I'll put something in today.
^ permalink raw reply
* Re: [PATCH] tc: filter: validate filter priority in userspace.
From: Stephen Hemminger @ 2012-07-10 22:39 UTC (permalink / raw)
To: Li Wei; +Cc: netdev
In-Reply-To: <4FFBEBA8.1050802@cn.fujitsu.com>
On Tue, 10 Jul 2012 16:45:28 +0800
Li Wei <lw@cn.fujitsu.com> wrote:
>
> Because we use the high 16 bits of tcm_info to pass the prio value to
> the kernel, its range is [0, 0xffff]. Without validation in tc, when
> a user passes a larger (>65535) priority, the actual priority
> set in the kernel would confuse the user.
>
> So, add a validation to ensure prio in the range.
Applied
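For readers following along, here is a minimal userspace sketch of the range check and the 16-bit packing described in the quoted text. It is not the code from the applied patch: the names parse_prio(), pack_info() and the SKETCH_* masks are made up for illustration, and the masks only mirror the usual major/minor split of a 32-bit handle.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the high/low 16-bit masks of tcm_info. */
#define SKETCH_MAJ_MASK 0xFFFF0000U
#define SKETCH_MIN_MASK 0x0000FFFFU

/*
 * Parse a priority string. Returns 0 and stores the value on success,
 * -1 if the string is not a number in [0, 0xffff].
 */
static int parse_prio(const char *arg, uint32_t *prio)
{
	char *end;
	unsigned long val;

	if (!arg || !*arg)
		return -1;
	errno = 0;
	val = strtoul(arg, &end, 0);
	if (errno || *end != '\0' || val > 0xFFFFUL)
		return -1;
	*prio = (uint32_t)val;
	return 0;
}

/* Pack priority into the high 16 bits and protocol into the low 16 bits. */
static uint32_t pack_info(uint32_t prio, uint16_t protocol)
{
	return ((prio << 16) & SKETCH_MAJ_MASK) | (protocol & SKETCH_MIN_MASK);
}
```

Without the check, a priority of 65537 shifted left by 16 loses its top bit, so the packed value carries priority 1, which is the confusing case the patch description refers to.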
^ permalink raw reply
* Re: [PATCH iproute2] tc: u32: Fix icmp_code off.
From: Stephen Hemminger @ 2012-07-10 22:40 UTC (permalink / raw)
To: Hiroaki SHIMODA; +Cc: netdev
In-Reply-To: <20120710185318.075fc257cb6b9a2c8ae66479@gmail.com>
On Tue, 10 Jul 2012 18:53:18 +0900
Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> wrote:
> The off of icmp_code is not 20 but 21. Also offmask should be 0 unless
> nexthdr+ is specified.
>
> Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Both patches applied
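As a quick sanity check of the offset in the quoted description, the sketch below computes where the ICMP type and code fields sit relative to the start of an IPv4 header. It assumes a 20-byte header with no options for the fixed offsets; the struct and function names are illustrative only and are not taken from tc.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR_LEN 20	/* IHL = 5, i.e. no IP options */

/* First four octets of an ICMP header. */
struct icmp_hdr_sketch {
	uint8_t  type;
	uint8_t  code;
	uint16_t checksum;
};

/* Fixed offsets, valid only when the IPv4 header carries no options. */
static size_t icmp_type_off(void)
{
	return IPV4_MIN_HDR_LEN + offsetof(struct icmp_hdr_sketch, type);
}

static size_t icmp_code_off(void)
{
	return IPV4_MIN_HDR_LEN + offsetof(struct icmp_hdr_sketch, code);
}

/*
 * Read the ICMP code from a raw IPv4 packet, using the real header
 * length (IHL) instead of a fixed offset. Returns -1 on a short or
 * malformed packet.
 */
static int icmp_code_from_packet(const uint8_t *pkt, size_t len)
{
	size_t ihl;

	if (len < 1)
		return -1;
	ihl = (size_t)(pkt[0] & 0x0F) * 4;
	if (ihl < IPV4_MIN_HDR_LEN || len < ihl + 2)
		return -1;
	return pkt[ihl + 1];
}
```

The type field lands at offset 20 and the code field at 21. The last helper covers the case where IP options move the ICMP header, which a fixed offset cannot handle.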
^ permalink raw reply
* Re: [trivial PATCH 1/7] ixgb: use PCI_VENDOR_ID_*
From: Jeff Kirsher @ 2012-07-10 22:41 UTC (permalink / raw)
To: Jon Mason
Cc: trivial, linux-kernel, Jesse Brandeburg, Bruce Allan,
Carolyn Wyborny, Don Skidmore, Greg Rose, Peter P Waskiewicz Jr,
Alex Duyck, John Ronciak, netdev
In-Reply-To: <1341959492-31389-1-git-send-email-jdmason@kudzu.us>
On Tue, 2012-07-10 at 15:31 -0700, Jon Mason wrote:
> Use PCI_VENDOR_ID_* from pci_ids.h instead of creating #define locally.
>
> Signed-off-by: Jon Mason <jdmason@kudzu.us>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Bruce Allan <bruce.w.allan@intel.com>
> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
> Cc: Don Skidmore <donald.c.skidmore@intel.com>
> Cc: Greg Rose <gregory.v.rose@intel.com>
> Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> Cc: Alex Duyck <alexander.h.duyck@intel.com>
> Cc: John Ronciak <john.ronciak@intel.com>
> ---
> drivers/net/ethernet/intel/ixgb/ixgb_hw.c | 5 +++--
> drivers/net/ethernet/intel/ixgb/ixgb_ids.h | 5 -----
> drivers/net/ethernet/intel/ixgb/ixgb_main.c | 10 +++++-----
> 3 files changed, 8 insertions(+), 12 deletions(-)
This should go through David Miller's networking tree. Adding the netdev
mailing list.
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>
> diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_hw.c b/drivers/net/ethernet/intel/ixgb/ixgb_hw.c
> index 99b69ad..bf9a220 100644
> --- a/drivers/net/ethernet/intel/ixgb/ixgb_hw.c
> +++ b/drivers/net/ethernet/intel/ixgb/ixgb_hw.c
> @@ -32,6 +32,7 @@
>
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
> +#include <linux/pci_ids.h>
> #include "ixgb_hw.h"
> #include "ixgb_ids.h"
>
> @@ -96,7 +97,7 @@ static u32 ixgb_mac_reset(struct ixgb_hw *hw)
> ASSERT(!(ctrl_reg & IXGB_CTRL0_RST));
> #endif
>
> - if (hw->subsystem_vendor_id == SUN_SUBVENDOR_ID) {
> + if (hw->subsystem_vendor_id == PCI_VENDOR_ID_SUN) {
> ctrl_reg = /* Enable interrupt from XFP and SerDes */
> IXGB_CTRL1_GPI0_EN |
> IXGB_CTRL1_SDP6_DIR |
> @@ -271,7 +272,7 @@ ixgb_identify_phy(struct ixgb_hw *hw)
> }
>
> /* update phy type for sun specific board */
> - if (hw->subsystem_vendor_id == SUN_SUBVENDOR_ID)
> + if (hw->subsystem_vendor_id == PCI_VENDOR_ID_SUN)
> phy_type = ixgb_phy_type_bcm;
>
> return phy_type;
> diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_ids.h b/drivers/net/ethernet/intel/ixgb/ixgb_ids.h
> index 2a58847..32c1b30 100644
> --- a/drivers/net/ethernet/intel/ixgb/ixgb_ids.h
> +++ b/drivers/net/ethernet/intel/ixgb/ixgb_ids.h
> @@ -33,11 +33,6 @@
> ** The Device and Vendor IDs for 10 Gigabit MACs
> **********************************************************************/
>
> -#define INTEL_VENDOR_ID 0x8086
> -#define INTEL_SUBVENDOR_ID 0x8086
> -#define SUN_VENDOR_ID 0x108E
> -#define SUN_SUBVENDOR_ID 0x108E
> -
> #define IXGB_DEVICE_ID_82597EX 0x1048
> #define IXGB_DEVICE_ID_82597EX_SR 0x1A48
> #define IXGB_DEVICE_ID_82597EX_LR 0x1B48
> diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> index 5fce363..4e5a060 100644
> --- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> +++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> @@ -54,13 +54,13 @@ MODULE_PARM_DESC(copybreak,
> * Class, Class Mask, private data (not used) }
> */
> static DEFINE_PCI_DEVICE_TABLE(ixgb_pci_tbl) = {
> - {INTEL_VENDOR_ID, IXGB_DEVICE_ID_82597EX,
> + {PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX,
> PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> - {INTEL_VENDOR_ID, IXGB_DEVICE_ID_82597EX_CX4,
> + {PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX_CX4,
> PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> - {INTEL_VENDOR_ID, IXGB_DEVICE_ID_82597EX_SR,
> + {PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX_SR,
> PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
> - {INTEL_VENDOR_ID, IXGB_DEVICE_ID_82597EX_LR,
> + {PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX_LR,
> PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
>
> /* required last entry */
> @@ -195,7 +195,7 @@ ixgb_irq_enable(struct ixgb_adapter *adapter)
> {
> u32 val = IXGB_INT_RXT0 | IXGB_INT_RXDMT0 |
> IXGB_INT_TXDW | IXGB_INT_LSC;
> - if (adapter->hw.subsystem_vendor_id == SUN_SUBVENDOR_ID)
> + if (adapter->hw.subsystem_vendor_id == PCI_VENDOR_ID_SUN)
> val |= IXGB_INT_GPI0;
> IXGB_WRITE_REG(&adapter->hw, IMS, val);
> IXGB_WRITE_FLUSH(&adapter->hw);
^ permalink raw reply
* Re: [trivial PATCH 2/7] ixgbe: remove unused #define
From: Jeff Kirsher @ 2012-07-10 22:43 UTC (permalink / raw)
To: Jon Mason
Cc: trivial, linux-kernel, Jesse Brandeburg, Bruce Allan,
Carolyn Wyborny, Don Skidmore, Greg Rose, Peter P Waskiewicz Jr,
Alex Duyck, John Ronciak, netdev
In-Reply-To: <1341959492-31389-2-git-send-email-jdmason@kudzu.us>
On Tue, 2012-07-10 at 15:31 -0700, Jon Mason wrote:
> Remove unused IXGBE_INTEL_VENDOR_ID #define
>
> Signed-off-by: Jon Mason <jdmason@kudzu.us>
> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> Cc: Bruce Allan <bruce.w.allan@intel.com>
> Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
> Cc: Don Skidmore <donald.c.skidmore@intel.com>
> Cc: Greg Rose <gregory.v.rose@intel.com>
> Cc: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
> Cc: Alex Duyck <alexander.h.duyck@intel.com>
> Cc: John Ronciak <john.ronciak@intel.com>
> ---
> drivers/net/ethernet/intel/ixgbe/ixgbe_type.h | 3 ---
> 1 file changed, 3 deletions(-)
This should also go through David Miller's networking tree. Adding
the netdev mailing list.
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> index 204848d..c8d8040 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
> @@ -32,9 +32,6 @@
> #include <linux/mdio.h>
> #include <linux/netdevice.h>
>
> -/* Vendor ID */
> -#define IXGBE_INTEL_VENDOR_ID 0x8086
> -
> /* Device IDs */
> #define IXGBE_DEV_ID_82598 0x10B6
> #define IXGBE_DEV_ID_82598_BX 0x1508
^ permalink raw reply
* 3.5rc6 sctp panic
From: Dave Jones @ 2012-07-11 0:08 UTC (permalink / raw)
To: netdev; +Cc: Vlad Yasevich, Sridhar Samudrala
I just hit this while fuzz testing, and the box locked up immediately afterwards.
The serial log was a little mangled; I did my best to clean it up.
[22766.294255] general protection fault: 0000 [#1] PREEMPT SMP
[22766.295376] CPU 0
[22766.295384] Modules linked in:
[22766.387137] ffffffffa169f292 6b6b6b6b6b6b6b6b ffff880147c03a90 ffff880147c03a74
[22766.387135] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 00000000000
[22766.387136] Process trinity-watchdo (pid: 10896, threadinfo ffff88013e7d2000,
[22766.387137] Stack:
[22766.387140] ffff880147c03a10
[22766.387140] ffffffffa169f2b6
[22766.387140] ffff88013ed95728
[22766.387143] 0000000000000002
[22766.387143] 0000000000000000
[22766.387143] ffff880003fad062
[22766.387144] ffff88013c120000
[22766.387144]
[22766.387145] Call Trace:
[22766.387145] <IRQ>
[22766.387150] [<ffffffffa169f292>] ? __sctp_lookup_association+0x62/0xd0 [sctp]
[22766.387154] [<ffffffffa169f2b6>] __sctp_lookup_association+0x86/0xd0 [sctp]
[22766.387157] [<ffffffffa169f597>] sctp_rcv+0x207/0xbb0 [sctp]
[22766.387161] [<ffffffff810d4da8>] ? trace_hardirqs_off_caller+0x28/0xd0
[22766.387163] [<ffffffff815827e3>] ? nf_hook_slow+0x133/0x210
[22766.387166] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387168] [<ffffffff8159043d>] ip_local_deliver_finish+0x18d/0x4c0
[22766.387169] [<ffffffff815902fc>] ? ip_local_deliver_finish+0x4c/0x4c0
[22766.387171] [<ffffffff81590a07>] ip_local_deliver+0x47/0x80
[22766.387172] [<ffffffff8158fd80>] ip_rcv_finish+0x150/0x680
[22766.387174] [<ffffffff81590c54>] ip_rcv+0x214/0x320
[22766.387176] [<ffffffff81558c07>] __netif_receive_skb+0x7b7/0x910
[22766.387178] [<ffffffff8155856c>] ? __netif_receive_skb+0x11c/0x910
[22766.387180] [<ffffffff810d423e>] ? put_lock_stats.isra.25+0xe/0x40
[22766.387182] [<ffffffff81558f83>] netif_receive_skb+0x23/0x1f0
[22766.387183] [<ffffffff815596a9>] ? dev_gro_receive+0x139/0x440
[22766.387185] [<ffffffff81559280>] napi_skb_finish+0x70/0xa0
[22766.387187] [<ffffffff81559cb5>] napi_gro_receive+0xf5/0x130
[22766.387218] [<ffffffffa01c4679>] e1000_receive_skb+0x59/0x70 [e1000e]
[22766.387242] [<ffffffffa01c5aab>] e1000_clean_rx_irq+0x28b/0x460 [e1000e]
[22766.387266] [<ffffffffa01c9c18>] e1000e_poll+0x78/0x430 [e1000e]
[22766.387268] [<ffffffff81559fea>] net_rx_action+0x1aa/0x3d0
[22766.387270] [<ffffffff810a495f>] ? account_system_vtime+0x10f/0x130
[22766.387273] [<ffffffff810734d0>] __do_softirq+0xe0/0x420
[22766.387275] [<ffffffff8169826c>] call_softirq+0x1c/0x30
[22766.387278] [<ffffffff8101db15>] do_softirq+0xd5/0x110
[22766.387279] [<ffffffff81073bc5>] irq_exit+0xd5/0xe0
[22766.387281] [<ffffffff81698b03>] do_IRQ+0x63/0xd0
[22766.387283] [<ffffffff8168ee2f>] common_interrupt+0x6f/0x6f
[22766.387283] <EOI>
[22766.387284]
[22766.387285] [<ffffffff8168eed9>] ? retint_swapgs+0x13/0x1b
[22766.387285] Code: c0 90 5d c3 66 0f 1f 44 00 00 4c 89 c8 5d c3 0f 1f 00 55 48 89 e5 48 83
ec 20 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 66 66 90 <0f> b7 87 98 00 00 00 48 89 fb
49 89 f5 66 c1 c0 08 66 39 46 02
[22766.387307]
[22766.387307] RIP
[22766.387311] [<ffffffffa168a2c9>] sctp_assoc_is_match+0x19/0x90 [sctp]
[22766.387311] RSP <ffff880147c039b0>
[22766.387142] ffffffffa16ab120
[22766.599537] ---[ end trace 3f6dae82e37b17f5 ]---
[22766.601221] Kernel panic - not syncing: Fatal exception in interrupt
Disassembly of the function shows that we oopsed here..
/* Is this the association we are looking for? */
struct sctp_transport *sctp_assoc_is_match(struct sctp_association *asoc,
const union sctp_addr *laddr,
const union sctp_addr *paddr)
{
1070: 55 push %rbp
1071: 48 89 e5 mov %rsp,%rbp
1074: 48 83 ec 20 sub $0x20,%rsp
1078: 48 89 5d e8 mov %rbx,-0x18(%rbp)
107c: 4c 89 65 f0 mov %r12,-0x10(%rbp)
1080: 4c 89 6d f8 mov %r13,-0x8(%rbp)
1084: e8 00 00 00 00 callq 1089 <sctp_assoc_is_match+0x19>
struct sctp_transport *transport;
if ((htons(asoc->base.bind_addr.port) == laddr->v4.sin_port) &&
1089: 0f b7 87 98 00 00 00 movzwl 0x98(%rdi),%eax
^ permalink raw reply
* Re: [PATCH] etherdevice: introduce eth_broadcast_addr
From: Paul Gortmaker @ 2012-07-11 0:09 UTC (permalink / raw)
To: Johannes Berg; +Cc: David Miller, netdev, linux-wireless
In-Reply-To: <1341937124.4475.27.camel@jlt3.sipsolutions.net>
On Tue, Jul 10, 2012 at 12:18 PM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> A lot of code has either the memset or an inefficient copy
> from a static array that contains the all-ones broadcast
Shouldn't we see all that "lot of code" here in this same
commit, now using this new shortcut? If we apply this, we
have a new function, but with no users. If you have done
the audit, and found the inefficient cases, why isn't it here?
I would think it better to just fix those people who have a
pointless static array of all-ones to use the memset. If it was a
multi line thing to achieve the eth_broadcast_addr() then it
might make sense to exist. But as a one line alias, it does
seem somewhat pointless to me.
Paul.
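To make the two patterns under discussion concrete, here is a small userspace sketch that puts the proposed one-line helper next to the static-array copy it is meant to replace. The *_sketch names, bcast_table and set_broadcast_by_copy() are invented for this example; only the memset() body matches the quoted patch.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_SKETCH 6	/* octets in an Ethernet address */

/* Userspace copy of the helper proposed in the patch. */
static inline void eth_broadcast_addr_sketch(uint8_t *addr)
{
	memset(addr, 0xff, ETH_ALEN_SKETCH);
}

/* The pattern the patch description calls inefficient: a constant
 * all-ones array that exists only to be copied from. */
static const uint8_t bcast_table[ETH_ALEN_SKETCH] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

static inline void set_broadcast_by_copy(uint8_t *addr)
{
	memcpy(addr, bcast_table, ETH_ALEN_SKETCH);
}

/* Returns 1 if all six octets are 0xff. */
static int is_broadcast_sketch(const uint8_t *addr)
{
	return memcmp(addr, bcast_table, ETH_ALEN_SKETCH) == 0;
}
```

Both setters produce the same six bytes, so the debate is only about whether a named wrapper around the memset() is worth having.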
--
> address. Introduce eth_broadcast_addr() to fill an address
> with all ones, making the code clearer and allowing us to
> get rid of some constant arrays.
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> include/linux/etherdevice.h | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> index 3d406e0..98a27cc 100644
> --- a/include/linux/etherdevice.h
> +++ b/include/linux/etherdevice.h
> @@ -138,6 +138,17 @@ static inline void random_ether_addr(u8 *addr)
> }
>
> /**
> + * eth_broadcast_addr - Assign broadcast address
> + * @addr: Pointer to a six-byte array containing the Ethernet address
> + *
> + * Assign the broadcast address to the given address array.
> + */
> +static inline void eth_broadcast_addr(u8 *addr)
> +{
> + memset(addr, 0xff, ETH_ALEN);
> +}
> +
> +/**
> * eth_hw_addr_random - Generate software assigned random Ethernet and
> * set device flag
> * @dev: pointer to net_device structure
> --
> 1.7.10.4
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [GIT PULL nf] IPVS
From: Simon Horman @ 2012-07-11 0:19 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer
Hi Pablo,
this pull request consists of three bug fixes for IPVS.
Please consider for inclusion in 3.5 and stable.
The bug fix from Julian, "ipvs: fix oops in ip_vs_dst_event on rmmod"
fixes a regression introduced in 3.4 and thus I believe it is
only relevant to 3.5 and 3.4-stable.
The other two fixes appear to have been present since at least 2.6.37
(there were a lot of changes to IPVS around that time).
----------------------------------------------------------------
The following changes since commit 6bd0405bb4196b44f1acb7a58f11382cdaf6f7f0:
netfilter: nf_ct_ecache: fix crash with multiple containers, one shutting down (2012-07-09 10:53:19 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs.git master
for you to fetch changes up to 51878010232aaac12822e219b94e89de54faa1ef:
ipvs: fix oops in ip_vs_dst_event on rmmod (2012-07-11 09:00:47 +0900)
----------------------------------------------------------------
Julian Anastasov (1):
ipvs: fix oops in ip_vs_dst_event on rmmod
Lin Ming (1):
ipvs: fix oops on NAT reply in br_nf context
Xiaotian Feng (1):
ipvs: add missing lock in ip_vs_ftp_init_conn()
include/net/ip_vs.h | 2 +-
net/netfilter/ipvs/ip_vs_ctl.c | 5 +++--
net/netfilter/ipvs/ip_vs_ftp.c | 2 ++
3 files changed, 6 insertions(+), 3 deletions(-)
^ permalink raw reply
* [PATCH 3/3] ipvs: fix oops in ip_vs_dst_event on rmmod
From: Simon Horman @ 2012-07-11 0:19 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer,
Simon Horman
In-Reply-To: <1341965963-7275-1-git-send-email-horms@verge.net.au>
From: Julian Anastasov <ja@ssi.bg>
After commit 39f618b4fd95ae243d940ec64c961009c74e3333 (3.4)
"ipvs: reset ipvs pointer in netns" we can oops in
ip_vs_dst_event on rmmod ip_vs because ip_vs_control_cleanup
is called after the ipvs_core_ops subsys is unregistered and
net->ipvs is NULL. Fix it by exiting early from ip_vs_dst_event
if ipvs is NULL. It is safe because all services and dests
for the net are already freed.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_ctl.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index d43e3c1..84444dd 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -1521,11 +1521,12 @@ static int ip_vs_dst_event(struct notifier_block *this, unsigned long event,
{
struct net_device *dev = ptr;
struct net *net = dev_net(dev);
+ struct netns_ipvs *ipvs = net_ipvs(net);
struct ip_vs_service *svc;
struct ip_vs_dest *dest;
unsigned int idx;
- if (event != NETDEV_UNREGISTER)
+ if (event != NETDEV_UNREGISTER || !ipvs)
return NOTIFY_DONE;
IP_VS_DBG(3, "%s() dev=%s\n", __func__, dev->name);
EnterFunction(2);
@@ -1551,7 +1552,7 @@ static int ip_vs_dst_event(struct notifier_block *this, unsigned long event,
}
}
- list_for_each_entry(dest, &net_ipvs(net)->dest_trash, n_list) {
+ list_for_each_entry(dest, &ipvs->dest_trash, n_list) {
__ip_vs_dev_reset(dest, dev);
}
mutex_unlock(&__ip_vs_mutex);
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 2/3] ipvs: add missing lock in ip_vs_ftp_init_conn()
From: Simon Horman @ 2012-07-11 0:19 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer,
Xiaotian Feng, Xiaotian Feng, Patrick McHardy, David S. Miller,
Simon Horman
In-Reply-To: <1341965963-7275-1-git-send-email-horms@verge.net.au>
From: Xiaotian Feng <xtfeng@gmail.com>
We hit a kernel panic on a 2.6.32.43 kernel:
[2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
<snip>
[2680311.849009] general protection fault: 0000 [#1] SMP
[2680311.853001] RIP: 0010:[<ffffffff815f155c>] [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
[2680311.853001] RSP: 0018:ffff880028303e70 EFLAGS: 00010202
[2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
[2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
<snip>
[2680311.853001] Call Trace:
[2680311.853001] <IRQ>
[2680311.853001] [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
[2680311.853001] [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
[2680311.853001] [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
[2680311.853001] [<ffffffff81049a13>] __do_softirq+0xb3/0x150
[2680311.853001] [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
[2680311.853001] [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
[2680311.853001] [<ffffffff81049957>] irq_exit+0x77/0x80
[2680311.853001] [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
[2680311.853001] [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
[2680311.853001] <EOI>
[2680311.853001] [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
[2680311.853001] [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
[2680311.853001] [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
[2680311.853001] [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
RAX and RDX hold LIST_POISON2 and LIST_POISON1, so the kernel called list_del() on an
already deleted connection, which caused the general protection fault.
The "request for already hashed" warning tells us that someone may have changed the connection
flags incorrectly, as described in commit aea9d711: the flags are changed, but the connection
is not put back on the list. So ip_vs_conn_hash() throws a warning and returns.
Later, when ip_vs_conn_expire() fires again, ip_vs_conn_unhash() finds the HASHED connection
and calls list_del() on it, and the kernel panics.
After code review, the only place where the kernel changes the connection flags without
protection is ip_vs_ftp_init_conn().
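The double list_del() failure mode can be modelled in userspace. The sketch below is an illustration only: struct list_node, list_del_checked() and the *_SKETCH constants are made up, the poison values are the ones visible in RDX and RAX in the trace above, and a 64-bit build is assumed. Unlike a plain list_del(), this version refuses a second delete instead of writing through the poison pointers.

```c
#include <stddef.h>
#include <stdint.h>

struct list_node {
	struct list_node *next, *prev;
};

/* Poison values as seen in the register dump (64-bit assumed). */
#define POISON1_SKETCH ((struct list_node *)(uintptr_t)0xdead000000100100ULL)
#define POISON2_SKETCH ((struct list_node *)(uintptr_t)0xdead000000200200ULL)

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_sketch(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Unlink n and poison its pointers. Returns -1 if n was already
 * deleted; an unchecked delete would dereference the poison values
 * here, which is the fault shown in the trace.
 */
static int list_del_checked(struct list_node *n)
{
	if (n->next == POISON1_SKETCH || n->prev == POISON2_SKETCH)
		return -1;
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->next = POISON1_SKETCH;
	n->prev = POISON2_SKETCH;
	return 0;
}
```

The first delete leaves next and prev poisoned, so a second delete on the same node is detectable from the node alone.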
Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/ip_vs_ftp.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index b20b29c..c2bc264 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
static int
ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
{
+ spin_lock(&cp->lock);
/* We use connection tracking for the command connection */
cp->flags |= IP_VS_CONN_F_NFCT;
+ spin_unlock(&cp->lock);
return 0;
}
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 1/3] ipvs: fix oops on NAT reply in br_nf context
From: Simon Horman @ 2012-07-11 0:19 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer,
Lin Ming, Simon Horman
In-Reply-To: <1341965963-7275-1-git-send-email-horms@verge.net.au>
From: Lin Ming <mlin@ss.pku.edu.cn>
IPVS should not reset skb->nf_bridge in FORWARD hook
by calling nf_reset for NAT replies. It triggers oops in
br_nf_forward_finish.
[ 579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[ 579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[ 579.781792] PGD 218f9067 PUD 0
[ 579.781865] Oops: 0000 [#1] SMP
[ 579.781945] CPU 0
[ 579.781983] Modules linked in:
[ 579.782047]
[ 579.782080]
[ 579.782114] Pid: 4644, comm: qemu Tainted: G W 3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard /30E8
[ 579.782300] RIP: 0010:[<ffffffff817b1ca5>] [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
[ 579.782455] RSP: 0018:ffff88007b003a98 EFLAGS: 00010287
[ 579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
[ 579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
[ 579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
[ 579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
[ 579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
[ 579.783177] FS: 0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
[ 579.783306] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
[ 579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
[ 579.783919] Stack:
[ 579.783959] ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
[ 579.784110] ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
[ 579.784260] ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
[ 579.784477] Call Trace:
[ 579.784523] <IRQ>
[ 579.784562]
[ 579.784603] [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
[ 579.784707] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[ 579.784797] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[ 579.784906] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[ 579.784995] [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
[ 579.785175] [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
[ 579.785179] [<ffffffff817ac417>] __br_forward+0x97/0xa2
[ 579.785179] [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
[ 579.785179] [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
[ 579.785179] [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
[ 579.785179] [<ffffffff81704b58>] nf_iterate+0x47/0x7d
[ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[ 579.785179] [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
[ 579.785179] [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
[ 579.785179] [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
[ 579.785179] [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
[ 579.785179] [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
[ 579.785179] [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
[ 579.785179] [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
[ 579.785179] [<ffffffff816e6800>] net_rx_action+0xdf/0x242
[ 579.785179] [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
[ 579.785179] [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[ 579.785179] [<ffffffff8188812c>] call_softirq+0x1c/0x30
The steps to reproduce are as follows:
1. On Host1, set up bridge br0(192.168.1.106)
2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
3. Start IPVS service on Host1
ipvsadm -A -t 192.168.1.106:80 -s rr
ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
4. Run apache benchmark on Host2(192.168.1.101)
ab -n 1000 http://192.168.1.106/
ip_vs_reply4
ip_vs_out
handle_response
ip_vs_notrack
nf_reset()
{
skb->nf_bridge = NULL;
}
In this case IPVS only wants to replace nfct
with the untracked version. So replace the nf_reset(skb) call
in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
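The difference between the two approaches can be shown with a toy model. Nothing below is kernel code: struct skb_model, struct ref_obj and both notrack_*() functions are hypothetical, and the model assumes only what the description states, namely that the reset path clears the bridge pointer along with the conntrack pointer.

```c
#include <stddef.h>

/* Toy reference-counted object standing in for nfct / nf_bridge. */
struct ref_obj {
	int refs;
};

struct skb_model {
	struct ref_obj *nfct;
	struct ref_obj *nf_bridge;
};

static void obj_get(struct ref_obj *o)
{
	if (o)
		o->refs++;
}

static void obj_put(struct ref_obj *o)
{
	if (o)
		o->refs--;
}

/* Old behaviour: a full reset drops the bridge state as well. */
static void notrack_before(struct skb_model *skb, struct ref_obj *untracked)
{
	obj_put(skb->nfct);
	skb->nfct = NULL;
	obj_put(skb->nf_bridge);
	skb->nf_bridge = NULL;	/* later bridge code would dereference this */
	skb->nfct = untracked;
	obj_get(skb->nfct);
}

/* Fixed behaviour: only the conntrack reference is swapped. */
static void notrack_after(struct skb_model *skb, struct ref_obj *untracked)
{
	obj_put(skb->nfct);
	skb->nfct = untracked;
	obj_get(skb->nfct);
}
```

In the model, only the second variant leaves nf_bridge intact for whatever runs after the hook.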
Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
include/net/ip_vs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..95374d1 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1425,7 +1425,7 @@ static inline void ip_vs_notrack(struct sk_buff *skb)
struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
if (!ct || !nf_ct_is_untracked(ct)) {
- nf_reset(skb);
+ nf_conntrack_put(skb->nfct);
skb->nfct = &nf_ct_untracked_get()->ct_general;
skb->nfctinfo = IP_CT_NEW;
nf_conntrack_get(skb->nfct);
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [GIT PULL nf-next] IPVS
From: Simon Horman @ 2012-07-11 0:25 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer
Hi Pablo,
please consider the following enhancements to IPVS for inclusion in 3.6.
----------------------------------------------------------------
The following changes since commit 46ba5a25f521e3c50d7bb81b1abb977769047456:
netfilter: nfnetlink_queue: do not allow to set unsupported flag bits (2012-07-04 19:51:50 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next.git master
for you to fetch changes up to 1fd130ebf10e1185022a9c0470f2298943bad1c4:
ipvs: generalize app registration in netns (2012-07-10 17:58:10 +0900)
----------------------------------------------------------------
Julian Anastasov (2):
ipvs: ip_vs_ftp depends on nf_conntrack_ftp helper
ipvs: generalize app registration in netns
include/net/ip_vs.h | 5 ++--
net/netfilter/ipvs/Kconfig | 3 ++-
net/netfilter/ipvs/ip_vs_app.c | 61 +++++++++++++++++++++++++++++++-----------
net/netfilter/ipvs/ip_vs_ftp.c | 21 ++++-----------
4 files changed, 54 insertions(+), 36 deletions(-)
^ permalink raw reply
* [PATCH 1/2] ipvs: ip_vs_ftp depends on nf_conntrack_ftp helper
From: Simon Horman @ 2012-07-11 0:25 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer,
Simon Horman
In-Reply-To: <1341966327-16606-1-git-send-email-horms@verge.net.au>
From: Julian Anastasov <ja@ssi.bg>
The FTP application indirectly depends on the
nf_conntrack_ftp helper for proper NAT support. If the
module is not loaded, IPVS can resize the packets for the
command connection, eg. PASV response but the SEQ adjustment
logic in ipv4_confirm is not called without helper.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
net/netfilter/ipvs/Kconfig | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/ipvs/Kconfig b/net/netfilter/ipvs/Kconfig
index f987138..8b2cffd 100644
--- a/net/netfilter/ipvs/Kconfig
+++ b/net/netfilter/ipvs/Kconfig
@@ -250,7 +250,8 @@ comment 'IPVS application helper'
config IP_VS_FTP
tristate "FTP protocol helper"
- depends on IP_VS_PROTO_TCP && NF_CONNTRACK && NF_NAT
+ depends on IP_VS_PROTO_TCP && NF_CONNTRACK && NF_NAT && \
+ NF_CONNTRACK_FTP
select IP_VS_NFCT
---help---
FTP is a protocol that transfers IP address and/or port number in
--
1.7.10.2.484.gcd07cc5
^ permalink raw reply related
* [PATCH 2/2] ipvs: generalize app registration in netns
From: Simon Horman @ 2012-07-11 0:25 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
Julian Anastasov, Hans Schillstrom, Jesper Dangaard Brouer,
Simon Horman
In-Reply-To: <1341966327-16606-1-git-send-email-horms@verge.net.au>
From: Julian Anastasov <ja@ssi.bg>
Get rid of the ftp_app pointer and allow applications
to be registered without adding fields in the netns_ipvs structure.
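Since the registration function now returns a pointer that may encode an error, a userspace sketch of that convention may help. It assumes the usual scheme where the top 4095 address values are reserved for negative errno codes; err_ptr(), is_err(), ptr_err(), struct app and register_app() are illustrative names, not the kernel's.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno in a pointer, and decode it again. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct app {
	char name[16];
	struct app *next;
};

/*
 * Duplicate *tmpl into the registry. Returns the private copy, or an
 * encoded -EEXIST / -ENOMEM, so the caller needs no separate pointer
 * to remember what was registered.
 */
static struct app *register_app(struct app **registry, const struct app *tmpl)
{
	struct app *a;

	for (a = *registry; a; a = a->next)
		if (!strcmp(a->name, tmpl->name))
			return err_ptr(-EEXIST);

	a = malloc(sizeof(*a));
	if (!a)
		return err_ptr(-ENOMEM);
	memcpy(a, tmpl, sizeof(*a));
	a->next = *registry;
	*registry = a;
	return a;
}
```

Lookup by name is what lets the registry own the copy, which is the property that makes a dedicated per-application pointer in the namespace structure unnecessary.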
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
include/net/ip_vs.h | 5 ++--
net/netfilter/ipvs/ip_vs_app.c | 61 +++++++++++++++++++++++++++++++-----------
net/netfilter/ipvs/ip_vs_ftp.c | 21 ++++-----------
3 files changed, 52 insertions(+), 35 deletions(-)
diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index d6146b4..6cb4699 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -808,8 +808,6 @@ struct netns_ipvs {
struct list_head rs_table[IP_VS_RTAB_SIZE];
/* ip_vs_app */
struct list_head app_list;
- /* ip_vs_ftp */
- struct ip_vs_app *ftp_app;
/* ip_vs_proto */
#define IP_VS_PROTO_TAB_SIZE 32 /* must be power of 2 */
struct ip_vs_proto_data *proto_data_table[IP_VS_PROTO_TAB_SIZE];
@@ -1179,7 +1177,8 @@ extern void ip_vs_service_net_cleanup(struct net *net);
* (from ip_vs_app.c)
*/
#define IP_VS_APP_MAX_PORTS 8
-extern int register_ip_vs_app(struct net *net, struct ip_vs_app *app);
+extern struct ip_vs_app *register_ip_vs_app(struct net *net,
+ struct ip_vs_app *app);
extern void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app);
extern int ip_vs_bind_app(struct ip_vs_conn *cp, struct ip_vs_protocol *pp);
extern void ip_vs_unbind_app(struct ip_vs_conn *cp);
diff --git a/net/netfilter/ipvs/ip_vs_app.c b/net/netfilter/ipvs/ip_vs_app.c
index 64f9e8f..11caaea 100644
--- a/net/netfilter/ipvs/ip_vs_app.c
+++ b/net/netfilter/ipvs/ip_vs_app.c
@@ -180,22 +180,41 @@ register_ip_vs_app_inc(struct net *net, struct ip_vs_app *app, __u16 proto,
}
-/*
- * ip_vs_app registration routine
- */
-int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
+/* Register application for netns */
+struct ip_vs_app *register_ip_vs_app(struct net *net, struct ip_vs_app *app)
{
struct netns_ipvs *ipvs = net_ipvs(net);
- /* increase the module use count */
- ip_vs_use_count_inc();
+ struct ip_vs_app *a;
+ int err = 0;
+
+ if (!ipvs)
+ return ERR_PTR(-ENOENT);
mutex_lock(&__ip_vs_app_mutex);
- list_add(&app->a_list, &ipvs->app_list);
+ list_for_each_entry(a, &ipvs->app_list, a_list) {
+ if (!strcmp(app->name, a->name)) {
+ err = -EEXIST;
+ break;
+ }
+ }
+ if (!err) {
+ a = kmemdup(app, sizeof(*app), GFP_KERNEL);
+ if (!a)
+ err = -ENOMEM;
+ }
+ if (!err) {
+ INIT_LIST_HEAD(&a->incs_list);
+ list_add(&a->a_list, &ipvs->app_list);
+ /* increase the module use count */
+ ip_vs_use_count_inc();
+ }
mutex_unlock(&__ip_vs_app_mutex);
- return 0;
+ if (err)
+ return ERR_PTR(err);
+ return a;
}
@@ -205,20 +224,29 @@ int register_ip_vs_app(struct net *net, struct ip_vs_app *app)
*/
void unregister_ip_vs_app(struct net *net, struct ip_vs_app *app)
{
- struct ip_vs_app *inc, *nxt;
+ struct netns_ipvs *ipvs = net_ipvs(net);
+ struct ip_vs_app *a, *anxt, *inc, *nxt;
+
+ if (!ipvs)
+ return;
mutex_lock(&__ip_vs_app_mutex);
- list_for_each_entry_safe(inc, nxt, &app->incs_list, a_list) {
- ip_vs_app_inc_release(net, inc);
- }
+ list_for_each_entry_safe(a, anxt, &ipvs->app_list, a_list) {
+ if (app && strcmp(app->name, a->name))
+ continue;
+ list_for_each_entry_safe(inc, nxt, &a->incs_list, a_list) {
+ ip_vs_app_inc_release(net, inc);
+ }
- list_del(&app->a_list);
+ list_del(&a->a_list);
+ kfree(a);
- mutex_unlock(&__ip_vs_app_mutex);
+ /* decrease the module use count */
+ ip_vs_use_count_dec();
+ }
- /* decrease the module use count */
- ip_vs_use_count_dec();
+ mutex_unlock(&__ip_vs_app_mutex);
}
@@ -586,5 +614,6 @@ int __net_init ip_vs_app_net_init(struct net *net)
void __net_exit ip_vs_app_net_cleanup(struct net *net)
{
+ unregister_ip_vs_app(net, NULL /* all */);
proc_net_remove(net, "ip_vs_app");
}
diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index b20b29c..ad70b7e 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -441,16 +441,10 @@ static int __net_init __ip_vs_ftp_init(struct net *net)
if (!ipvs)
return -ENOENT;
- app = kmemdup(&ip_vs_ftp, sizeof(struct ip_vs_app), GFP_KERNEL);
- if (!app)
- return -ENOMEM;
- INIT_LIST_HEAD(&app->a_list);
- INIT_LIST_HEAD(&app->incs_list);
- ipvs->ftp_app = app;
- ret = register_ip_vs_app(net, app);
- if (ret)
- goto err_exit;
+ app = register_ip_vs_app(net, &ip_vs_ftp);
+ if (IS_ERR(app))
+ return PTR_ERR(app);
for (i = 0; i < ports_count; i++) {
if (!ports[i])
@@ -464,9 +458,7 @@ static int __net_init __ip_vs_ftp_init(struct net *net)
return 0;
err_unreg:
- unregister_ip_vs_app(net, app);
-err_exit:
- kfree(ipvs->ftp_app);
+ unregister_ip_vs_app(net, &ip_vs_ftp);
return ret;
}
/*
@@ -474,10 +466,7 @@ err_exit:
*/
static void __ip_vs_ftp_exit(struct net *net)
{
- struct netns_ipvs *ipvs = net_ipvs(net);
^ permalink raw reply related
* Re: [PATCH 03/16] tcp: Maintain dynamic metrics in local cache.
From: David Miller @ 2012-07-11 0:29 UTC (permalink / raw)
To: joe; +Cc: netdev
In-Reply-To: <1341939724.6118.145.camel@joe2Laptop>
From: Joe Perches <joe@perches.com>
Date: Tue, 10 Jul 2012 10:02:04 -0700
> Maybe something like this is a bit more legible?
> {
> if (a->family != b->family)
> return false;
>
> if (a->family == AF_INET)
> return a->addr.a4 == b->addr.a4;
>
> return ipv6_addr_equal((const struct in6_addr *)&a->addr.a6,
> (const struct in6_addr *)&b->addr.a6);
> }
My version was meant to be fast rather than legible :-)
^ permalink raw reply
* Re: 82571EB: Detected Hardware Unit Hang
From: Joe Jin @ 2012-07-11 0:34 UTC (permalink / raw)
To: Dave, Tushar N
Cc: netdev@vger.kernel.org, e1000-devel@lists.sf.net,
linux-kernel@vger.kernel.org
In-Reply-To: <061C8A8601E8EE4CA8D8FD6990CEA891274EE41F@ORSMSX102.amr.corp.intel.com>
On 07/11/12 03:02, Dave, Tushar N wrote:
>> -----Original Message-----
>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>> On Behalf Of Joe Jin
>> Sent: Tuesday, July 10, 2012 12:40 AM
>> To: Joe Jin
>> Cc: e1000-devel@lists.sf.net; netdev@vger.kernel.org; linux-
>> kernel@vger.kernel.org
>> Subject: Re: 82571EB: Detected Hardware Unit Hang
>>
>> While debugging the driver I found that, before the HW hang is detected,
>> the driver is unable to clean and reclaim the resources:
>>
>> while ((eop_desc->upper.data & cpu_to_le32(E1000_TXD_STAT_DD)) && <== upper.data is always 0x300 here
>> (count < tx_ring->count)) {
>> <--- snip --->
>> }
>>
>>
>> I checked all of the driver code and found nowhere that sets
>> E1000_TXD_STAT_DD in upper.data. I guess upper.data is set by hardware?
>
> Yes upper.data (part of it is STATUS byte) is set by HW. Basically driver checks E1000_TXD_STAT_DD (Descriptor Done) bit. If this bit is set that means HW has processed that descriptor and driver can now clean that descriptor.
> With value 0x300 , DD bit is not set. That means HW has not processed that descriptor.
Thanks for the clarification. Might this be a firmware issue?
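To make the status check above concrete, here is a minimal userspace sketch of the Descriptor Done test. It assumes the value is already in host byte order (the driver itself wraps the mask in cpu_to_le32), and tx_desc_done() is a hypothetical helper name, not a driver function. The mask value is the one I believe the e1000e headers use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Descriptor Done bit of the TX descriptor status; assumed to match the
 * definition in the e1000e driver headers. */
#define E1000_TXD_STAT_DD 0x00000001u

/* Hypothetical helper: true once hardware has written back the descriptor.
 * Assumes upper_data has already been converted to host byte order. */
static bool tx_desc_done(uint32_t upper_data)
{
	return (upper_data & E1000_TXD_STAT_DD) != 0;
}
```

With upper.data stuck at 0x300 the low bit is clear, so a check like this never passes and the clean-up loop never reclaims the descriptor.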
>
> How fast does tx hang reproduce? I suggest you to enable debug code in driver so when tx hang occurs it will dump the HW desc ring info into kernel log.
As soon as I copy a file from another server, the issue reproduces.
I'll enable debugging to get more information.
> You can run "ethtool -s ethx msglvl 0x2c00" to enable debug.
> Once tx hang occurs please send me the full dmesg log.
>
> Does tx hang occur with in-kernel e1000e driver too?
I tried several drivers, including the latest from rhel5, the latest from
Intel, and the latest from rhel6; the issue is seen with all of them.
Thanks,
Joe
>
> Thanks.
>
> -Tushar
>
>
>> If the OS is a 32-bit system, what would happen?
>
>
>>
>> Thanks in advance,
>> Joe
>>
>> On 07/09/12 16:51, Joe Jin wrote:
>>> Hi list,
>>>
>>> I'm seeing a Unit Hang even with the latest e1000e driver 2.0.0 when
>>> doing an scp test. This issue is easy to reproduce on SUN FIRE X2270 M2;
>>> just copying a big file (>500M) from another server hits it at once.
>>>
>>> Would you please help on this?
>>>
>>> device info:
>>> # lspci -s 05:00.0
>>> 05:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit
>>> Ethernet Controller (Copper) (rev 06)
>>>
>>> # lspci -s 05:00.0 -n
>>> 05:00.0 0200: 8086:10bc (rev 06)
>>>
>>> # ethtool -i eth0
>>> driver: e1000e
>>> version: 2.0.0-NAPI
>>> firmware-version: 5.10-2
>>> bus-info: 0000:05:00.0
>>>
>>> # ethtool -k eth0
>>> Offload parameters for eth0:
>>> rx-checksumming: on
>>> tx-checksumming: on
>>> scatter-gather: on
>>> tcp segmentation offload: on
>>> udp fragmentation offload: off
>>> generic segmentation offload: on
>>> generic-receive-offload: on
>>>
>>> kernel log:
>>> -----------
>>> e1000e 0000:05:00.0: eth0: Detected Hardware Unit Hang:
>>> TDH <6c>
>>> TDT <81>
>>> next_to_use <81>
>>> next_to_clean <6b>
>>> buffer_info[next_to_clean]:
>>> time_stamp <fffc7a23>
>>> next_to_watch <71>
>>> jiffies <fffc8c0c>
>>> next_to_watch.status <0>
>>> MAC Status <80387>
>>> PHY Status <792d>
>>> PHY 1000BASE-T Status <3c00>
>>> PHY Extended Status <3000>
>>> PCI Status <10>
>>> e1000e 0000:05:00.0: eth0: Detected Hardware Unit Hang:
>>> TDH <6c>
>>> TDT <81>
>>> next_to_use <81>
>>> next_to_clean <6b>
>>> buffer_info[next_to_clean]:
>>> time_stamp <fffc7a23>
>>> next_to_watch <71>
>>> jiffies <fffc9bac>
>>> next_to_watch.status <0>
>>> MAC Status <80387>
>>> PHY Status <792d>
>>> PHY 1000BASE-T Status <3c00>
>>> PHY Extended Status <3000>
>>> PCI Status <10>
>>> ------------[ cut here ]------------
>>> WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x225/0x230()
>>> Hardware name: SUN FIRE X2270 M2 NETDEV WATCHDOG: eth0 (e1000e):
>>> transmit queue 0 timed out Modules linked in: autofs4 hidp rfcomm
>>> bluetooth rfkill lockd sunrpc cpufreq_ondemand acpi_cpufreq mperf
>>> be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad
>>> ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3
>>> mdio libiscsi_tcp libiscsi scsi_transport_iscsi video sbs sbshc
>>> acpi_pad acpi_ipmi ipmi_msghandler parport_pc lp parport e1000e(U)
>>> snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
>>> igb snd_pcm_oss serio_raw snd_mixer_oss snd_pcm tpm_infineon snd_timer
>>> snd soundcore snd_page_alloc i2c_i801 iTCO_wdt i2c_core pcspkr
>>> i7core_edac iTCO_vendor_support ioatdma ghes dca edac_core hed
>>> dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage
>>> sd_mod crc_t10dif sg ahci libahci ext3 jbd mbcache [last unloaded:
>>> microcode]
>>> Pid: 0, comm: swapper Not tainted 2.6.39-200.24.1.el5uek #1 Call
>>> Trace:
>>> [<c07d9ac5>] ? dev_watchdog+0x225/0x230 [<c045ba61>]
>>> warn_slowpath_common+0x81/0xa0 [<c07d9ac5>] ?
>>> dev_watchdog+0x225/0x230 [<c045bb23>] warn_slowpath_fmt+0x33/0x40
>>> [<c07d9ac5>] dev_watchdog+0x225/0x230 [<c07d98a0>] ?
>>> dev_activate+0xb0/0xb0 [<c0468e82>] call_timer_fn+0x32/0xf0
>>> [<c04bceb0>] ? rcu_check_callbacks+0x80/0x80 [<c046a76d>]
>>> run_timer_softirq+0xed/0x1b0 [<c07d98a0>] ? dev_activate+0xb0/0xb0
>>> [<c0461a81>] __do_softirq+0x91/0x1a0 [<c04619f0>] ?
>>> local_bh_enable+0x80/0x80 <IRQ> [<c0462295>] ? irq_exit+0x95/0xa0
>>> [<c087f8b8>] ? smp_apic_timer_interrupt+0x38/0x42
>>> [<c08784f5>] ? apic_timer_interrupt+0x31/0x38 [<c046007b>] ?
>>> do_exit+0x11b/0x370 [<c065eae4>] ? intel_idle+0xa4/0x100
>>> [<c078d9b9>] ? cpuidle_idle_call+0xb9/0x1e0 [<c0411d77>] ?
>>> cpu_idle+0x97/0xd0 [<c085cbbd>] ? rest_init+0x5d/0x70 [<c0b07a7a>] ?
>>> start_kernel+0x28a/0x340 [<c0b074b0>] ? obsolete_checksetup+0xb0/0xb0
>>> [<c0b070a4>] ? i386_start_kernel+0x64/0xb0 ---[ end trace
>>> 5502b55cd4d4e5cb ]--- e1000e 0000:05:00.0: eth0: Reset adapter
>>> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>>
>>> Thanks,
>>> Joe
>>>
>>
>>
>>
^ permalink raw reply
* Re: [PATCH] etherdevice: introduce eth_broadcast_addr
From: David Miller @ 2012-07-11 0:41 UTC (permalink / raw)
To: paul.gortmaker; +Cc: johannes, netdev, linux-wireless
In-Reply-To: <CAP=VYLqNyAF1gSsKTd3YojeJXK_-pEzXAKyVQbQfwhtwOyMmwg@mail.gmail.com>
From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Tue, 10 Jul 2012 20:09:44 -0400
> On Tue, Jul 10, 2012 at 12:18 PM, Johannes Berg
> <johannes@sipsolutions.net> wrote:
>> From: Johannes Berg <johannes.berg@intel.com>
>>
>> A lot of code has either the memset or an inefficient copy
>> from a static array that contains the all-ones broadcast
>
> Shouldn't we see all that "lot of code" here in this same
> commit, now using this new shortcut?
I disagree and I intend to apply Johannes's patch as-is to net-next.
^ permalink raw reply
* Re: [PATCH 03/16] tcp: Maintain dynamic metrics in local cache.
From: Joe Perches @ 2012-07-11 0:44 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In-Reply-To: <20120710.172908.745359979722998717.davem@davemloft.net>
On Tue, 2012-07-10 at 17:29 -0700, David Miller wrote:
> From: Joe Perches <joe@perches.com>
> Date: Tue, 10 Jul 2012 10:02:04 -0700
>
> > Maybe something like this is a bit more legible?
> > {
> > if (a->family != b->family)
> > return false;
> >
> > if (a->family == AF_INET)
> > return a->addr.a4 == b->addr.a4;
> >
> > return ipv6_addr_equal((const struct in6_addr *)&a->addr.a6,
> > (const struct in6_addr *)&b->addr.a6);
> > }
>
> My version was meant to be fast rather than legible :-)
Fast to write you mean? ;)
I'd guess the one above is faster to execute.
If it's not, the code in ipv6_addr_equal
should be reverted. commit fed85383ac34d82
("[IPV6]: Use XOR and OR rather than mutiple ands for ipv6 address comparisons")
cheers, Joe
^ permalink raw reply
* Re: [PATCH 03/16] tcp: Maintain dynamic metrics in local cache.
From: David Miller @ 2012-07-11 1:01 UTC (permalink / raw)
To: joe; +Cc: netdev
In-Reply-To: <1341967486.13724.9.camel@joe2Laptop>
From: Joe Perches <joe@perches.com>
Date: Tue, 10 Jul 2012 17:44:46 -0700
> I'd guess the one above is faster to execute.
It is.
> If it's not, the code in ipv6_addr_equal
> should be reverted. commit fed85383ac34d82
> ("[IPV6]: Use XOR and OR rather than mutiple ands for ipv6 address comparisons")
Not necessarily.
My version here is faster because we unconditionally test
the first word, which we need to do for both the ipv4 and
ipv6 cases.
The ipv6 routine optimization you mention exists in a
world where we know we have an ipv6 address always, which
is not the case here.
If anything, we should do XOR's on the final three words,
but we should not remove the first word optimization for
ipv4 which is the common case.
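A rough sketch of the comparison described above, for illustration only. The struct layout and the names addr_sketch and addr_same are hypothetical, not the code from the patch: the first word is tested unconditionally for both families, and the remaining three words are folded with XOR/OR only for ipv6.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical address holder: an ipv4 address uses only addr[0]. */
struct addr_sketch {
	uint16_t family;
	uint32_t addr[4];
};

static bool addr_same(const struct addr_sketch *a, const struct addr_sketch *b)
{
	if (a->family != b->family)
		return false;

	/* First word is needed for both ipv4 and ipv6, so test it first. */
	if (a->addr[0] != b->addr[0])
		return false;

	/* ipv4, the common case, is done here. */
	if (a->family == AF_INET)
		return true;

	/* ipv6: XOR the last three words and OR the results together. */
	return ((a->addr[1] ^ b->addr[1]) |
		(a->addr[2] ^ b->addr[2]) |
		(a->addr[3] ^ b->addr[3])) == 0;
}
```

Whether this beats the plain ipv6_addr_equal() call in practice would have to be measured; the sketch only shows the shape of the argument.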
^ permalink raw reply
* Re: net-next kernel NULL pointer dereference at fib_rules_tclass
From: David Miller @ 2012-07-11 1:05 UTC (permalink / raw)
To: gregory.v.rose; +Cc: eric.dumazet, ogerlitz, netdev, shlomop, amirv, erezsh
In-Reply-To: <20120710111434.00003ba8@unknown>
From: Greg Rose <gregory.v.rose@intel.com>
Date: Tue, 10 Jul 2012 11:14:34 -0700
> On Tue, 10 Jul 2012 19:25:01 +0200
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> On Tue, 2012-07-10 at 09:44 -0700, David Miller wrote:
>> > From: Or Gerlitz <ogerlitz@mellanox.com>
>> > Date: Tue, 10 Jul 2012 10:16:55 +0300
>> >
>> > > Starting system logger: BUG: unable to handle kernel NULL pointer
>> > > dereference at 00000000000000ac IP: [<ffffffff81320393>]
>> > > fib_rules_tclass+0xf/0x17
>> >
>> > Ok, fib_rules_tclass() checks for res->r being NULL and only
>> > dereferences it if it is not.
>> >
>> > fib4_rule->tclassid has offset ~0x8c on x86-64, and this fault
>> > address is 0x10 bytes off.
>> >
>> > Does this patch fix the problem?
>> >
>> > diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
>> > index 539c672..000c467 100644
>> > --- a/include/net/ip_fib.h
>> > +++ b/include/net/ip_fib.h
>> > @@ -230,6 +230,7 @@ static inline int fib_lookup(struct net *net,
>> > struct flowi4 *flp, struct fib_result *res)
>> > {
>> > if (!net->ipv4.fib_has_custom_rules) {
>> > + res->r = NULL;
>> > if (net->ipv4.fib_local &&
>> > !fib_table_lookup(net->ipv4.fib_local, flp,
>> > res, FIB_LOOKUP_NOREF))
>>
>> It does here, thanks
>
> Works for me too.
Great, pushed out to net-next, thanks everyone.
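For anyone reading the thread later, a stripped-down illustration of why the one-line fix matters. The types and function names below are hypothetical stand-ins, not the kernel's: the consumer only dereferences res->r when it is non-NULL, so the fast path has to clear the field rather than leave whatever the caller had there.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the rule and lookup result structures. */
struct rule_sketch {
	unsigned int tclassid;
};

struct result_sketch {
	struct rule_sketch *r;
};

/* Mirrors the described behaviour: dereference r only when it is set. */
static unsigned int rules_tclass(const struct result_sketch *res)
{
	return res->r ? res->r->tclassid : 0;
}

/* Fast path without custom rules: no rule applies, so say so explicitly
 * instead of leaving a stale or uninitialized pointer behind. */
static void lookup_fast_path(struct result_sketch *res)
{
	res->r = NULL;
}
```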
^ permalink raw reply
* Re: [PATCH] etherdevice: introduce eth_broadcast_addr
From: David Miller @ 2012-07-11 1:07 UTC (permalink / raw)
To: johannes; +Cc: netdev, linux-wireless
In-Reply-To: <1341937124.4475.27.camel@jlt3.sipsolutions.net>
From: Johannes Berg <johannes@sipsolutions.net>
Date: Tue, 10 Jul 2012 18:18:44 +0200
> From: Johannes Berg <johannes.berg@intel.com>
>
> A lot of code has either the memset or an inefficient copy
> from a static array that contains the all-ones broadcast
> address. Introduce eth_broadcast_addr() to fill an address
> with all ones, making the code clearer and allowing us to
> get rid of some constant arrays.
>
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH] etherdevice: introduce eth_broadcast_addr
From: Joe Perches @ 2012-07-11 1:09 UTC (permalink / raw)
To: David Miller
Cc: paul.gortmaker, johannes, netdev, linux-wireless
In-Reply-To: <20120710.174142.995966539991957646.davem@davemloft.net>
On Tue, 2012-07-10 at 17:41 -0700, David Miller wrote:
> From: Paul Gortmaker <paul.gortmaker@windriver.com>
> Date: Tue, 10 Jul 2012 20:09:44 -0400
>
> > On Tue, Jul 10, 2012 at 12:18 PM, Johannes Berg
> > <johannes@sipsolutions.net> wrote:
> >> From: Johannes Berg <johannes.berg@intel.com>
> >>
> >> A lot of code has either the memset or an inefficient copy
> >> from a static array that contains the all-ones broadcast
> >
> > Shouldn't we see all that "lot of code" here in this same
> > commit, now using this new shortcut?
If I grepped properly, there are 42 instances of static arrays for
broadcast ethernet addresses in drivers/net and drivers/staging
so it'd save some smallish amount of code by using a combination of
is_broadcast_ether_addr and this new func.
I think there are 53 instances of the memset(foo, 0xff, 6|ETH_ALEN).
> I disagree and I intend to apply Johannes's patch as-is to net-next.
Sounds fine to me.
For some additional style symmetry, how about converting
random_ether_addr to eth_random_addr too, via:
o Rename random_ether_addr to eth_random_addr and add a
#define random_ether_addr eth_random_addr
o sed 's/\brandom_ether_addr\b/eth_random_addr/g' files_that_use_REA
o remove the #define after a while
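The rename steps above could look roughly like this in a userspace mock-up. Everything here is a stand-in for illustration: rand() replaces the kernel's random byte source, uint8_t replaces u8, and the helper bodies are my assumption of what the kernel versions do, not copies of etherdevice.h.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

/* New helper from the patch under discussion: fill with all ones. */
static inline void eth_broadcast_addr(uint8_t *addr)
{
	memset(addr, 0xff, ETH_ALEN);
}

static inline bool is_broadcast_ether_addr(const uint8_t *addr)
{
	return (addr[0] & addr[1] & addr[2] &
		addr[3] & addr[4] & addr[5]) == 0xff;
}

/* Proposed new name; rand() is only a userspace placeholder. */
static inline void eth_random_addr(uint8_t *addr)
{
	for (int i = 0; i < ETH_ALEN; i++)
		addr[i] = (uint8_t)rand();
	addr[0] &= 0xfe;	/* clear the multicast bit */
	addr[0] |= 0x02;	/* set the locally administered bit */
}

/* Transitional alias so existing callers keep building until converted. */
#define random_ether_addr eth_random_addr
```

The alias keeps the tree bisectable while callers are converted file by file, and deleting the #define at the end catches any stragglers at compile time.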
^ permalink raw reply
* [PATCH net-next 3/9] qlge: Fix ethtool WOL calls to operate only on devices that support WOL.
From: Jitendra Kalsaria @ 2012-07-11 0:57 UTC (permalink / raw)
To: davem; +Cc: netdev, ron.mercer, Dept_NX_Linux_NIC_Driver, Jitendra Kalsaria
In-Reply-To: <1341968259-18931-1-git-send-email-jitendra.kalsaria@qlogic.com>
From: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
---
drivers/net/ethernet/qlogic/qlge/qlge.h | 2 ++
drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c | 20 ++++++++++++++++----
2 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qlge/qlge.h b/drivers/net/ethernet/qlogic/qlge/qlge.h
index 6e7050c..ae7dddc 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge.h
+++ b/drivers/net/ethernet/qlogic/qlge/qlge.h
@@ -25,6 +25,8 @@
#define QLGE_VENDOR_ID 0x1077
#define QLGE_DEVICE_ID_8012 0x8012
#define QLGE_DEVICE_ID_8000 0x8000
+#define QLGE_MEZZ_SSYS_ID_068 0x0068
+#define QLGE_MEZZ_SSYS_ID_180 0x0180
#define MAX_CPUS 8
#define MAX_TX_RINGS MAX_CPUS
#define MAX_RX_RINGS ((MAX_CPUS * 2) + 1)
diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
index 8e2c2a7..c2adfa2 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
@@ -388,17 +388,29 @@ static void ql_get_drvinfo(struct net_device *ndev,
static void ql_get_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
{
struct ql_adapter *qdev = netdev_priv(ndev);
- /* What we support. */
- wol->supported = WAKE_MAGIC;
- /* What we've currently got set. */
- wol->wolopts = qdev->wol;
+ unsigned short ssys_dev = qdev->pdev->subsystem_device;
+
+ /* WOL is only supported for mezz card. */
+ if (ssys_dev == QLGE_MEZZ_SSYS_ID_068 ||
+ ssys_dev == QLGE_MEZZ_SSYS_ID_180) {
+ wol->supported = WAKE_MAGIC;
+ wol->wolopts = qdev->wol;
+ }
}
static int ql_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
{
struct ql_adapter *qdev = netdev_priv(ndev);
int status;
+ unsigned short ssys_dev = qdev->pdev->subsystem_device;
+ /* WOL is only supported for mezz card. */
+ if (ssys_dev != QLGE_MEZZ_SSYS_ID_068 &&
+ ssys_dev != QLGE_MEZZ_SSYS_ID_180) {
+ netif_info(qdev, drv, qdev->ndev,
+ "WOL is only supported for mezz card\n");
+ return -EOPNOTSUPP;
+ }
if (wol->wolopts & ~WAKE_MAGIC)
return -EINVAL;
qdev->wol = wol->wolopts;
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 4/9] qlge: Clean up ethtool set WOL routine.
From: Jitendra Kalsaria @ 2012-07-11 0:57 UTC (permalink / raw)
To: davem; +Cc: netdev, ron.mercer, Dept_NX_Linux_NIC_Driver, Jitendra Kalsaria
In-Reply-To: <1341968259-18931-1-git-send-email-jitendra.kalsaria@qlogic.com>
From: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
---
drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c | 9 ---------
1 files changed, 0 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
index c2adfa2..3b0912f 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_ethtool.c
@@ -401,7 +401,6 @@ static void ql_get_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
static int ql_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
{
struct ql_adapter *qdev = netdev_priv(ndev);
- int status;
unsigned short ssys_dev = qdev->pdev->subsystem_device;
/* WOL is only supported for mezz card. */
@@ -416,14 +415,6 @@ static int ql_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
qdev->wol = wol->wolopts;
netif_info(qdev, drv, qdev->ndev, "Set wol option 0x%x\n", qdev->wol);
- if (!qdev->wol) {
- u32 wol = 0;
- status = ql_mb_wol_mode(qdev, wol);
- netif_err(qdev, drv, qdev->ndev, "WOL %s (wol code 0x%x)\n",
- status == 0 ? "cleared successfully" : "clear failed",
- wol);
- }
-
return 0;
}
--
1.7.1
^ permalink raw reply related
* [PATCH net-next 0/9] qlge: bug fix
From: Jitendra Kalsaria @ 2012-07-11 0:57 UTC (permalink / raw)
To: davem; +Cc: netdev, ron.mercer, Dept_NX_Linux_NIC_Driver, Jitendra Kalsaria
From: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Please apply it to net-next.
Thanks,
Jitendra
^ permalink raw reply