* [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll
From: nhorman @ 2010-10-14 2:01 UTC
To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1287021713-1750-1-git-send-email-nhorman@tuxdriver.com>
From: Neil Horman <nhorman@tuxdriver.com>
The monitoring paths in the bonding driver take write locks that are shared by
the tx path. If netconsole is in use, these paths can call printk, which puts us
in the netpoll tx path. If netconsole is attached to the bonding driver, this
results in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the
monitor paths in the bonding driver don't claim the xmit_lock, nor should they).
The solution is to use a per cpu flag internal to the driver to indicate when a
cpu is holding the lock in a path that might recurse into the tx path for the
driver via netconsole. By checking this flag on transmit, we can defer the
sending of the netconsole frames until a later time using the retransmit feature
of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. I've
tested this and am able to transmit via netconsole while causing failover
conditions on the bond slave links.
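
For readers who want the idea without the kernel plumbing, here is a minimal
userspace sketch of the per cpu flag described above. It is not the driver
code: NR_CPUS, tx_blocked, block_tx, unblock_tx, start_xmit and the TX_OK /
TX_BUSY values are hypothetical stand-ins, and the cpu number is passed in
explicitly instead of coming from smp_processor_id().

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical cpu count for this sketch */

/* Stand-ins for NETDEV_TX_OK / NETDEV_TX_BUSY. */
enum tx_status { TX_OK = 0, TX_BUSY = 1 };

/* One flag per cpu, playing the role of the cpumask in the patch. */
static bool tx_blocked[NR_CPUS];

/* Mark this cpu as holding a lock that the tx path also needs. */
static void block_tx(int cpu)
{
	assert(!tx_blocked[cpu]);	/* nesting is treated as a bug */
	tx_blocked[cpu] = true;
}

/* Clear the mark once the lock has been dropped. */
static void unblock_tx(int cpu)
{
	assert(tx_blocked[cpu]);
	tx_blocked[cpu] = false;
}

/* Transmit entry point: only netpoll-originated frames are deferred. */
static enum tx_status start_xmit(int cpu, bool from_netpoll)
{
	if (from_netpoll && tx_blocked[cpu])
		return TX_BUSY;		/* caller is expected to retry later */
	return TX_OK;
}
```

A monitor path would call block_tx() before taking its write lock and
unblock_tx() after releasing it; any netpoll frame that arrives on the same cpu
in between gets the busy status and is left for a later retry.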
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
drivers/net/bonding/bond_main.c | 46 ++++++++++++++++++++++++++++++++++---
drivers/net/bonding/bond_sysfs.c | 8 ++++++
drivers/net/bonding/bonding.h | 30 ++++++++++++++++++++++++
3 files changed, 80 insertions(+), 4 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index eb7d089..8868a51 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -76,6 +76,7 @@
#include <linux/if_vlan.h>
#include <linux/if_bonding.h>
#include <linux/jiffies.h>
+#include <linux/preempt.h>
#include <net/route.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
@@ -169,6 +170,10 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link
/*----------------------------- Global variables ----------------------------*/
+#ifdef CONFIG_NET_POLL_CONTROLLER
+cpumask_var_t netpoll_block_tx;
+#endif
+
static const char * const version =
DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n";
@@ -310,6 +315,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id)
pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id);
+ block_netpoll_tx();
write_lock_bh(&bond->lock);
list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
@@ -344,6 +350,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id)
out:
write_unlock_bh(&bond->lock);
+ unblock_netpoll_tx();
return res;
}
@@ -1804,10 +1811,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
bond_set_carrier(bond);
#ifdef CONFIG_NET_POLL_CONTROLLER
- /*
- * Netpoll and bonding is broken, make sure it is not initialized
- * until it is fixed.
- */
if (disable_netpoll) {
bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
} else {
@@ -1892,6 +1895,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
return -EINVAL;
}
+ block_netpoll_tx();
netdev_bonding_change(bond_dev, NETDEV_BONDING_DESLAVE);
write_lock_bh(&bond->lock);
@@ -1901,6 +1905,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
pr_info("%s: %s not enslaved\n",
bond_dev->name, slave_dev->name);
write_unlock_bh(&bond->lock);
+ unblock_netpoll_tx();
return -EINVAL;
}
@@ -1994,6 +1999,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
}
write_unlock_bh(&bond->lock);
+ unblock_netpoll_tx();
/* must do this from outside any spinlocks */
bond_destroy_slave_symlinks(bond_dev, slave_dev);
@@ -2085,6 +2091,7 @@ static int bond_release_all(struct net_device *bond_dev)
struct net_device *slave_dev;
struct sockaddr addr;
+ block_netpoll_tx();
write_lock_bh(&bond->lock);
netif_carrier_off(bond_dev);
@@ -2183,6 +2190,7 @@ static int bond_release_all(struct net_device *bond_dev)
out:
write_unlock_bh(&bond->lock);
+ unblock_netpoll_tx();
return 0;
}
@@ -2232,9 +2240,11 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi
(old_active) &&
(new_active->link == BOND_LINK_UP) &&
IS_UP(new_active->dev)) {
+ block_netpoll_tx();
write_lock_bh(&bond->curr_slave_lock);
bond_change_active_slave(bond, new_active);
write_unlock_bh(&bond->curr_slave_lock);
+ unblock_netpoll_tx();
} else
res = -EINVAL;
@@ -2466,9 +2476,11 @@ static void bond_miimon_commit(struct bonding *bond)
do_failover:
ASSERT_RTNL();
+ block_netpoll_tx();
write_lock_bh(&bond->curr_slave_lock);
bond_select_active_slave(bond);
write_unlock_bh(&bond->curr_slave_lock);
+ unblock_netpoll_tx();
}
bond_set_carrier(bond);
@@ -2911,11 +2923,13 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
}
if (do_failover) {
+ block_netpoll_tx();
write_lock_bh(&bond->curr_slave_lock);
bond_select_active_slave(bond);
write_unlock_bh(&bond->curr_slave_lock);
+ unblock_netpoll_tx();
}
re_arm:
@@ -3074,9 +3088,11 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks)
do_failover:
ASSERT_RTNL();
+ block_netpoll_tx();
write_lock_bh(&bond->curr_slave_lock);
bond_select_active_slave(bond);
write_unlock_bh(&bond->curr_slave_lock);
+ unblock_netpoll_tx();
}
bond_set_carrier(bond);
@@ -4564,6 +4580,13 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct bonding *bond = netdev_priv(dev);
+ /*
+ * If we risk deadlock from transmitting this in the
+ * netpoll path, tell netpoll to queue the frame for later tx
+ */
+ if (is_netpoll_tx_blocked(dev))
+ return NETDEV_TX_BUSY;
+
if (TX_QUEUE_OVERRIDE(bond->params.mode)) {
if (!bond_slave_override(bond, skb))
return NETDEV_TX_OK;
@@ -5277,6 +5300,13 @@ static int __init bonding_init(void)
if (res)
goto out;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ if (!alloc_cpumask_var(&netpoll_block_tx, GFP_KERNEL)) {
+ res = -ENOMEM;
+ goto out;
+ }
+#endif
+
res = register_pernet_subsys(&bond_net_ops);
if (res)
goto out;
@@ -5295,6 +5325,7 @@ static int __init bonding_init(void)
if (res)
goto err;
+
register_netdevice_notifier(&bond_netdev_notifier);
register_inetaddr_notifier(&bond_inetaddr_notifier);
bond_register_ipv6_notifier();
@@ -5304,6 +5335,9 @@ err:
rtnl_link_unregister(&bond_link_ops);
err_link:
unregister_pernet_subsys(&bond_net_ops);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ free_cpumask_var(netpoll_block_tx);
+#endif
goto out;
}
@@ -5318,6 +5352,10 @@ static void __exit bonding_exit(void)
rtnl_link_unregister(&bond_link_ops);
unregister_pernet_subsys(&bond_net_ops);
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ free_cpumask_var(netpoll_block_tx);
+#endif
}
module_init(bonding_init);
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 01b4c3f..8fd0174 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1066,6 +1066,7 @@ static ssize_t bonding_store_primary(struct device *d,
if (!rtnl_trylock())
return restart_syscall();
+ block_netpoll_tx();
read_lock(&bond->lock);
write_lock_bh(&bond->curr_slave_lock);
@@ -1101,6 +1102,7 @@ static ssize_t bonding_store_primary(struct device *d,
out:
write_unlock_bh(&bond->curr_slave_lock);
read_unlock(&bond->lock);
+ unblock_netpoll_tx();
rtnl_unlock();
return count;
@@ -1146,11 +1148,13 @@ static ssize_t bonding_store_primary_reselect(struct device *d,
bond->dev->name, pri_reselect_tbl[new_value].modename,
new_value);
+ block_netpoll_tx();
read_lock(&bond->lock);
write_lock_bh(&bond->curr_slave_lock);
bond_select_active_slave(bond);
write_unlock_bh(&bond->curr_slave_lock);
read_unlock(&bond->lock);
+ unblock_netpoll_tx();
out:
rtnl_unlock();
return ret;
@@ -1232,6 +1236,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
if (!rtnl_trylock())
return restart_syscall();
+
+ block_netpoll_tx();
read_lock(&bond->lock);
write_lock_bh(&bond->curr_slave_lock);
@@ -1288,6 +1294,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
out:
write_unlock_bh(&bond->curr_slave_lock);
read_unlock(&bond->lock);
+ unblock_netpoll_tx();
+
rtnl_unlock();
return count;
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index c15f213..deef1aa 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -19,6 +19,7 @@
#include <linux/proc_fs.h>
#include <linux/if_bonding.h>
#include <linux/kobject.h>
+#include <linux/cpumask.h>
#include <linux/in6.h>
#include "bond_3ad.h"
#include "bond_alb.h"
@@ -117,6 +118,35 @@
bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave)
+#ifdef CONFIG_NET_POLL_CONTROLLER
+extern cpumask_var_t netpoll_block_tx;
+
+static inline void block_netpoll_tx(void)
+{
+ preempt_disable();
+ BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(),
+ netpoll_block_tx));
+}
+
+static inline void unblock_netpoll_tx(void)
+{
+ BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(),
+ netpoll_block_tx));
+ preempt_enable();
+}
+
+static inline int is_netpoll_tx_blocked(struct net_device *dev)
+{
+ if (unlikely(dev->priv_flags & IFF_IN_NETPOLL))
+ return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx);
+ return 0;
+}
+#else
+#define block_netpoll_tx()
+#define unblock_netpoll_tx()
+#define is_netpoll_tx_blocked(dev) (0)
+#endif
+
struct bond_params {
int mode;
int xmit_policy;
--
1.7.2.3
* [PATCH 3/5] Fix napi poll for bonding driver
From: nhorman @ 2010-10-14 2:01 UTC
To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1287021713-1750-1-git-send-email-nhorman@tuxdriver.com>
From: Neil Horman <nhorman@tuxdriver.com>
Usually the netpoll path, when performing a napi poll, can get away with just
polling all the napi instances of the configured device. That's not the case for
the bonding driver however, as the napi instances which may wind up getting
flagged as needing polling after the poll_controller call don't belong to the
bonded device, but rather to the slave devices. Fix this by checking the device
in question for the IFF_MASTER flag. If it is set, we know we need to check the
full poll list for this cpu, rather than just the device's napi instance list.
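
The selection rule above can be sketched in isolation as follows. This is a
simplified userspace model, not the netpoll code: FLAG_MASTER, struct
poll_list, struct device and select_poll_list are hypothetical names standing
in for IFF_MASTER, the list heads and the net_device.

```c
#include <assert.h>

#define FLAG_MASTER 0x1		/* stand-in for IFF_MASTER */

struct poll_list { int id; };	/* placeholder for a list head */

struct device {
	unsigned int flags;
	struct poll_list napi_list;	/* the device's own napi instances */
};

/* Pick which list the poll path should walk for this device. */
static struct poll_list *select_poll_list(struct device *dev,
					  struct poll_list *cpu_poll_list)
{
	if (dev->flags & FLAG_MASTER)
		return cpu_poll_list;	/* slave instances show up here */
	return &dev->napi_list;
}
```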
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
net/core/netpoll.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 4e98ffa..d79d221 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -156,8 +156,15 @@ static void poll_napi(struct net_device *dev)
{
struct napi_struct *napi;
int budget = 16;
+ struct softnet_data *sd = &__get_cpu_var(softnet_data);
+ struct list_head *nlist;
- list_for_each_entry(napi, &dev->napi_list, dev_list) {
+ if (dev->flags & IFF_MASTER)
+ nlist = &sd->poll_list;
+ else
+ nlist = &dev->napi_list;
+
+ list_for_each_entry(napi, nlist, dev_list) {
if (napi->poll_owner != smp_processor_id() &&
spin_trylock(&napi->poll_lock)) {
budget = poll_one_napi(dev->npinfo, napi, budget);
--
1.7.2.3
* [PATCH 4/5] Fix netconsole to not deadlock on rmmod
From: nhorman @ 2010-10-14 2:01 UTC
To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1287021713-1750-1-git-send-email-nhorman@tuxdriver.com>
From: Neil Horman <nhorman@tuxdriver.com>
Netconsole calls netpoll_cleanup on receipt of a NETDEV_UNREGISTER event.
The notifier subsystem calls these event handlers with rtnl_lock held, which
netpoll_cleanup also takes, resulting in deadlock. Fix this by calling the
__netpoll_cleanup interior function instead, and fixing up the additional
pointers.
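
The locked/unlocked split described above can be sketched in userspace as
follows. This is only a model under stated assumptions: big_lock stands in for
a non-recursive lock such as rtnl, and struct poller, poller_cleanup,
poller_cleanup_locked and on_unregister are hypothetical names, not the netpoll
API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool big_lock_held;	/* models a non-recursive lock */

static void big_lock(void)
{
	assert(!big_lock_held);	/* taking it twice would deadlock */
	big_lock_held = true;
}

static void big_unlock(void)
{
	assert(big_lock_held);
	big_lock_held = false;
}

struct poller {
	const char *dev;	/* non-NULL while attached to a device */
	int active;		/* resources torn down by cleanup */
};

/* Inner variant: the caller must already hold the lock. */
static void poller_cleanup_locked(struct poller *p)
{
	assert(big_lock_held);
	p->active = 0;
}

/* Outer variant: takes the lock itself, then fixes up the pointer. */
static void poller_cleanup(struct poller *p)
{
	big_lock();
	poller_cleanup_locked(p);
	big_unlock();
	p->dev = NULL;
}

/* Event handler that is invoked with the lock already held. */
static void on_unregister(struct poller *p)
{
	assert(big_lock_held);
	if (p->dev) {
		poller_cleanup_locked(p);
		p->dev = NULL;	/* fix-up the outer variant would have done */
	}
}
```

Calling poller_cleanup() from on_unregister() would trip the assertion in
big_lock(), which is the userspace analogue of the deadlock being fixed.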
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
drivers/net/netconsole.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ca142c4..94255f0 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -678,7 +678,14 @@ static int netconsole_netdev_event(struct notifier_block *this,
strlcpy(nt->np.dev_name, dev->name, IFNAMSIZ);
break;
case NETDEV_UNREGISTER:
- netpoll_cleanup(&nt->np);
+ /*
+ * rtnl_lock already held
+ */
+ if (nt->np.dev) {
+ __netpoll_cleanup(&nt->np);
+ dev_put(nt->np.dev);
+ nt->np.dev = NULL;
+ }
/* Fall through */
case NETDEV_GOING_DOWN:
case NETDEV_BONDING_DESLAVE:
--
1.7.2.3
* [PATCH 5/5] Re-enable netpoll over bonding
From: nhorman @ 2010-10-14 2:01 UTC
To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1287021713-1750-1-git-send-email-nhorman@tuxdriver.com>
From: Neil Horman <nhorman@tuxdriver.com>
With the inclusion of previous fixup patches, netpoll over bonding appears to
work reliably with failover conditions. This reverts Gospo's previous commit
c22d7ac844f1cb9c6a5fd20f89ebadc2feef891b, and allows access again to the netpoll
functionality in the bonding driver.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
drivers/net/bonding/bond_main.c | 29 ++++++++++-------------------
1 files changed, 10 insertions(+), 19 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8868a51..38d4ca0 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -184,9 +184,6 @@ static int arp_ip_count;
static int bond_mode = BOND_MODE_ROUNDROBIN;
static int xmit_hashtype = BOND_XMIT_POLICY_LAYER2;
static int lacp_fast;
-#ifdef CONFIG_NET_POLL_CONTROLLER
-static int disable_netpoll = 1;
-#endif
const struct bond_parm_tbl bond_lacp_tbl[] = {
{ "slow", AD_LACP_SLOW},
@@ -1811,19 +1808,15 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
bond_set_carrier(bond);
#ifdef CONFIG_NET_POLL_CONTROLLER
- if (disable_netpoll) {
+ if (slaves_support_netpoll(bond_dev)) {
+ bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
+ if (bond_dev->npinfo)
+ slave_dev->npinfo = bond_dev->npinfo;
+ } else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) {
bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
- } else {
- if (slaves_support_netpoll(bond_dev)) {
- bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
- if (bond_dev->npinfo)
- slave_dev->npinfo = bond_dev->npinfo;
- } else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) {
- bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
- pr_info("New slave device %s does not support netpoll\n",
- slave_dev->name);
- pr_info("Disabling netpoll support for %s\n", bond_dev->name);
- }
+ pr_info("New slave device %s does not support netpoll\n",
+ slave_dev->name);
+ pr_info("Disabling netpoll support for %s\n", bond_dev->name);
}
#endif
read_unlock(&bond->lock);
@@ -2030,10 +2023,8 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
#ifdef CONFIG_NET_POLL_CONTROLLER
read_lock_bh(&bond->lock);
- /* Make sure netpoll over stays disabled until fixed. */
- if (!disable_netpoll)
- if (slaves_support_netpoll(bond_dev))
- bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
+ if (slaves_support_netpoll(bond_dev))
+ bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
read_unlock_bh(&bond->lock);
if (slave_dev->netdev_ops->ndo_netpoll_cleanup)
slave_dev->netdev_ops->ndo_netpoll_cleanup(slave_dev);
--
1.7.2.3
* Re: ixgbe: normalize frag_list usage
From: David Miller @ 2010-10-14 2:17 UTC
To: alexander.h.duyck
Cc: jeffrey.t.kirsher, jesse.brandeburg, bruce.w.allan, netdev
In-Reply-To: <4CAE2692.8020401@intel.com>
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Thu, 07 Oct 2010 12:59:14 -0700
> I can track it in the RSC_CB if that works for you. Right now though
> I guess I am not seeing the difference between tracking this in
> skb->frag_next vs IXGBE_RSC_CB(skb)->frag_head. I think it might help
> if you were to provide some functions that demonstrate exactly what
> you had in mind for frag list handling. Specifically if you were to
> add a function for merging a frag into the frag list, and for how you
> want to approach cleaning up the skb->prev/frag_tail_tracker pointer
> when you are cleaning up an active frag_list.
Basically the one helper function will look like this.
static inline void skb_frag_list_add(struct sk_buff *head,
				     struct sk_buff *new)
{
	/* Append 'new' to head's frag list, tracking the tail on the head. */
	if (!skb_shinfo(head)->frag_list) {
		skb_shinfo(head)->frag_list = new;
		head->frag_list_tail = new;
	} else {
		head->frag_list_tail->frag_list_next = new;
		head->frag_list_tail = new;
	}
}
If you have to track the head from the tail packets, please do
so in a private control block.
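
To make the tail-tracking idea concrete outside the skb structures, here is a
small self-contained sketch. It is not the proposed kernel helper: struct buf,
its fields and frag_list_add are hypothetical names, and the head simply keeps
a pointer to the last fragment so that an append needs no list walk.

```c
#include <assert.h>
#include <stddef.h>

struct buf {
	int id;
	struct buf *next;	/* next fragment in the chain */
	struct buf *frag_list;	/* head only: first fragment */
	struct buf *frag_tail;	/* head only: last fragment */
};

/* Append new_frag to head's fragment chain in constant time. */
static void frag_list_add(struct buf *head, struct buf *new_frag)
{
	new_frag->next = NULL;
	if (!head->frag_list)
		head->frag_list = new_frag;		/* first fragment */
	else
		head->frag_tail->next = new_frag;	/* link after old tail */
	head->frag_tail = new_frag;
}
```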
* Re: [PATCH net-next 1/2]: stmmac: make ethtool functions local
From: Gustavo F. Padovan @ 2010-10-13 21:35 UTC
To: Stephen Hemminger; +Cc: David Miller, Giuseppe Cavallaro, Deepak SIKRI, netdev
In-Reply-To: <20101013175031.41c29349@nehalam>
* Stephen Hemminger <shemminger@vyatta.com> [2010-10-13 17:50:31 -0700]:
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
* Re: [PATCH net-next 2/2] stmmac: make function tables const
From: Gustavo F. Padovan @ 2010-10-13 21:35 UTC
To: Stephen Hemminger; +Cc: David Miller, Giuseppe Cavallaro, Deepak SIKRI, netdev
In-Reply-To: <20101013175125.6f6a64db@nehalam>
* Stephen Hemminger <shemminger@vyatta.com> [2010-10-13 17:51:25 -0700]:
> These tables only contain function pointers.
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
* Re: tbf/htb qdisc limitations
From: Bill Fink @ 2010-10-14 3:36 UTC
To: Jarek Poplawski; +Cc: Rick Jones, Steven Brudenell, netdev
In-Reply-To: <20101013062649.GA6915@ff.dom.local>
On Wed, 13 Oct 2010, Jarek Poplawski wrote:
> On Tue, Oct 12, 2010 at 03:17:18PM -0700, Rick Jones wrote:
> >>> my burst problem is the only semi-legitimate motivation i can think
> >>> of. the only other possible motivations i can imagine are setting
> >>> "limit" to buffer more than 4GB of packets and setting "rate" to
> >>> something more than 32 gigabit; both of these seem kind of dubious. is
> >>> there something else you had in mind?
> >>
> >>
> >> No, mainly 10 gigabit rates and additionally 64-bit stats.
> >
> > Any issue for bonded 10 GbE interfaces? Now that the IEEE have ratified
> > (June) how far out are 40 GbE interfaces? Or 100 GbE for that matter.
>
> Alas packet schedulers using rate tables are still around 1G. Above 2G
> they get less and less accurate, so hfsc is recommended.
I was just trying to do an 8 Gbps rate limit on a 10-GigE path,
and couldn't get it to work with either htb or tbf. Are you
saying this currently isn't possible? Or are you saying to use
this hfsc mechanism, which there doesn't seem to be a man page
for?
-Bill
* Re: BUG ? ipip unregister_netdevice_many()
From: Eric Dumazet @ 2010-10-14 3:57 UTC
To: David Miller; +Cc: daniel.lezcano, ebiederm, hans.schillstrom, netdev
In-Reply-To: <20101013.162321.183058542.davem@davemloft.net>
On Wednesday 13 October 2010 at 16:23 -0700, David Miller wrote:
> From: Daniel Lezcano <daniel.lezcano@free.fr>
> Date: Thu, 14 Oct 2010 00:16:15 +0200
>
> > do you mind to wait I test the patch before merging it ?
> > I would like to stress a bit this routine with multiple containers.
>
> Yes, it would be great if you could test this.
>
> Please make sure you get the fix for the bug that
> Jarek found ('list' needs to be initialized to NULL)
>
> I've included the latest version below:
>
> diff --git a/include/net/route.h b/include/net/route.h
> index 7e5e73b..8d24761 100644
> --- a/include/net/route.h
> +++ b/include/net/route.h
> @@ -106,7 +106,7 @@ extern int ip_rt_init(void);
> extern void ip_rt_redirect(__be32 old_gw, __be32 dst, __be32 new_gw,
> __be32 src, struct net_device *dev);
> extern void rt_cache_flush(struct net *net, int how);
> -extern void rt_cache_flush_batch(void);
> +extern void rt_cache_flush_batch(struct net *net);
> extern int __ip_route_output_key(struct net *, struct rtable **, const struct flowi *flp);
> extern int ip_route_output_key(struct net *, struct rtable **, struct flowi *flp);
> extern int ip_route_output_flow(struct net *, struct rtable **rp, struct flowi *flp, struct sock *sk, int flags);
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 919f2ad..4039f56 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -999,7 +999,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
> rt_cache_flush(dev_net(dev), 0);
> break;
> case NETDEV_UNREGISTER_BATCH:
> - rt_cache_flush_batch();
> + rt_cache_flush_batch(dev_net(dev));
> break;
> }
> return NOTIFY_DONE;
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 0755aa4..6ad730c 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -712,13 +712,14 @@ static inline int rt_is_expired(struct rtable *rth)
> * Can be called by a softirq or a process.
> * In the later case, we want to be reschedule if necessary
> */
> -static void rt_do_flush(int process_context)
> +static void rt_do_flush(struct net *net, int process_context)
> {
> unsigned int i;
> struct rtable *rth, *next;
> - struct rtable * tail;
>
> for (i = 0; i <= rt_hash_mask; i++) {
> + struct rtable *list = NULL, **pprev;
> +
> if (process_context && need_resched())
> cond_resched();
> rth = rt_hash_table[i].chain;
> @@ -726,41 +727,27 @@ static void rt_do_flush(int process_context)
> continue;
>
> spin_lock_bh(rt_hash_lock_addr(i));
> -#ifdef CONFIG_NET_NS
> - {
> - struct rtable ** prev, * p;
>
> - rth = rt_hash_table[i].chain;
> + pprev = &rt_hash_table[i].chain;
> + rth = *pprev;
> + while (rth) {
> + next = rth->dst.rt_next;
> + if (dev_net(rth->dst.dev) == net) {
if (net_eq(dev_net(rth->dst.dev), net)) {
> + *pprev = next;
>
> - /* defer releasing the head of the list after spin_unlock */
> - for (tail = rth; tail; tail = tail->dst.rt_next)
> - if (!rt_is_expired(tail))
> - break;
> - if (rth != tail)
> - rt_hash_table[i].chain = tail;
> -
> - /* call rt_free on entries after the tail requiring flush */
> - prev = &rt_hash_table[i].chain;
> - for (p = *prev; p; p = next) {
> - next = p->dst.rt_next;
> - if (!rt_is_expired(p)) {
> - prev = &p->dst.rt_next;
> - } else {
> - *prev = next;
> - rt_free(p);
> - }
> - }
> + rth->dst.rt_next = list;
> + list = rth;
I was wondering about RCU rules here.
We change pointers while a reader might enter in a loop.
It seems fine: as soon as we spin_unlock(), the loop should be closed.
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Minor coding style: you should add braces in the else clause:
	pprev = &rt_hash_table[i].chain;
	for (rth = *pprev; rth != NULL; rth = next) {
		next = rth->dst.rt_next;
		if (net_eq(dev_net(rth->dst.dev), net)) {
			*pprev = next;
			rth->dst.rt_next = list;
			list = rth;
		} else {
			pprev = &rth->dst.rt_next;
		}
	}
Thanks !
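
For anyone following along, the pprev idiom in the loop above can be tried in
userspace with the sketch below. It is a simplified model, not the route cache
code: struct entry, its owner field and detach_owned are hypothetical names,
with owner standing in for the namespace comparison.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
	int owner;		/* stand-in for the entry's namespace */
	struct entry *next;
};

/*
 * Unlink every entry owned by 'owner' from *chain and return the
 * unlinked entries as a private list (in reverse order of removal).
 */
static struct entry *detach_owned(struct entry **chain, int owner)
{
	struct entry *list = NULL, *e, *next;
	struct entry **pprev = chain;

	for (e = *pprev; e != NULL; e = next) {
		next = e->next;
		if (e->owner == owner) {
			*pprev = next;		/* splice e out of the chain */
			e->next = list;		/* push e onto the private list */
			list = e;
		} else {
			pprev = &e->next;	/* keep e, advance the link */
		}
	}
	return list;
}
```

The private list can then be walked and freed after the chain lock has been
dropped, which is the point of collecting the entries instead of freeing them
in place.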
* Nested GRE locking bug
From: Ben Hutchings @ 2010-10-14 4:00 UTC
To: netdev; +Cc: Beatrice Barbe, 599816
[-- Attachment #1.1: Type: text/plain, Size: 11854 bytes --]
Beatrice Barbe reported a reproducible crash after creating large
numbers of nested GRE tunnels and then pinging with the source address
forced. I was able to reproduce this using net-2.6. I'm attaching the
kernel config I used and a script to reproduce this based on the script
she provided. The magic number of tunnels to create is apparently 37.
With lockdep enabled, I get the following output:
=============================================
[ INFO: possible recursive locking detected ]
2.6.36-rc7-00040-gb0057c5 #5
---------------------------------------------
ping/2199 is trying to acquire lock:
(_xmit_IPGRE){+.....}, at: [<c1139968>] dev_queue_xmit+0x37e/0x454
but task is already holding lock:
(_xmit_IPGRE){+.....}, at: [<c1139968>] dev_queue_xmit+0x37e/0x454
other info that might help us debug this:
4 locks held by ping/2199:
#0: (sk_lock-AF_INET){+.+.+.}, at: [<c1168c46>] raw_sendmsg+0x590/0x64c
#1: (rcu_read_lock_bh){.+....}, at: [<c11395ea>] dev_queue_xmit+0x0/0x454
#2: (_xmit_IPGRE){+.....}, at: [<c1139968>] dev_queue_xmit+0x37e/0x454
#3: (rcu_read_lock_bh){.+....}, at: [<c11395ea>] dev_queue_xmit+0x0/0x454
stack backtrace:
Pid: 2199, comm: ping Not tainted 2.6.36-rc7-00040-gb0057c5 #5
Call Trace:
[<c1187b3c>] ? printk+0xf/0x13
[<c103a942>] __lock_acquire+0xbda/0x1311
[<c103a32b>] ? __lock_acquire+0x5c3/0x1311
[<c103b0d2>] lock_acquire+0x59/0x77
[<c1139968>] ? dev_queue_xmit+0x37e/0x454
[<c11898b4>] _raw_spin_lock+0x1b/0x2a
[<c1139968>] ? dev_queue_xmit+0x37e/0x454
[<c1139968>] dev_queue_xmit+0x37e/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c1151382>] ? ip_append_data+0x536/0x7dc
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c1151851>] ? ip_generic_getfrag+0x0/0x8a
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c1150dff>] ip_push_pending_frames+0x260/0x2ad
[<c1168c85>] raw_sendmsg+0x5cf/0x64c
[<c11708ad>] inet_sendmsg+0x46/0x4f
[<c112cea9>] sock_sendmsg+0xa4/0xba
[<c105897d>] ? might_fault+0x35/0x6f
[<c105897d>] ? might_fault+0x35/0x6f
[<c1134614>] ? verify_iovec+0x3e/0x6a
[<c112d2af>] sys_sendmsg+0x149/0x196
[<c104b079>] ? unlock_page+0x3f/0x42
[<c103b176>] ? lock_release_non_nested+0x86/0x221
[<c105897d>] ? might_fault+0x35/0x6f
[<c105897d>] ? might_fault+0x35/0x6f
[<c112e287>] sys_socketcall+0x146/0x18b
[<c10cb5c8>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c1189f5d>] syscall_call+0x7/0xb
------------[ cut here ]------------
WARNING: at kernel/softirq.c:143 local_bh_enable_ip+0x39/0xa5()
Hardware name: Bochs
Pid: 2199, comm: ping Not tainted 2.6.36-rc7-00040-gb0057c5 #5
Call Trace:
[<c101a092>] warn_slowpath_common+0x60/0x75
[<c101e534>] ? local_bh_enable_ip+0x39/0xa5
[<c114b993>] ? rt_intern_hash+0x4da/0x4f9
[<c101a0b6>] warn_slowpath_null+0xf/0x13
[<c101e534>] local_bh_enable_ip+0x39/0xa5
[<c1189d4e>] _raw_spin_unlock_bh+0x25/0x28
[<c114b993>] rt_intern_hash+0x4da/0x4f9
[<c114c1b8>] __ip_route_output_key+0x806/0x860
[<c114c220>] ip_route_output_flow+0xe/0x3e
[<c114c25c>] ip_route_output_key+0xc/0xe
[<c11793d6>] ipgre_tunnel_xmit+0x1ac/0x757
[<c1139968>] ? dev_queue_xmit+0x37e/0x454
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c11798e2>] ipgre_tunnel_xmit+0x6b8/0x757
[<c114a188>] ? ip_rt_update_pmtu+0x0/0x60
[<c11394fc>] dev_hard_start_xmit+0x33a/0x428
[<c1139987>] dev_queue_xmit+0x39d/0x454
[<c1151382>] ? ip_append_data+0x536/0x7dc
[<c115292e>] ip_finish_output+0x29d/0x2c7
[<c1151851>] ? ip_generic_getfrag+0x0/0x8a
[<c11529e2>] ip_output+0x8a/0x8f
[<c1150b9c>] ip_local_out+0x50/0x53
[<c1150dff>] ip_push_pending_frames+0x260/0x2ad
[<c1168c85>] raw_sendmsg+0x5cf/0x64c
[<c11708ad>] inet_sendmsg+0x46/0x4f
[<c112cea9>] sock_sendmsg+0xa4/0xba
[<c105897d>] ? might_fault+0x35/0x6f
[<c105897d>] ? might_fault+0x35/0x6f
[<c1134614>] ? verify_iovec+0x3e/0x6a
[<c112d2af>] sys_sendmsg+0x149/0x196
[<c104b079>] ? unlock_page+0x3f/0x42
[<c103b176>] ? lock_release_non_nested+0x86/0x221
[<c105897d>] ? might_fault+0x35/0x6f
[<c105897d>] ? might_fault+0x35/0x6f
[<c112e287>] sys_socketcall+0x146/0x18b
[<c10cb5c8>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c1189f5d>] syscall_call+0x7/0xb
<IRQ>
Ben.
--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
[-- Attachment #1.2: .config --]
[-- Type: text/x-mpsub, Size: 28654 bytes --]
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.36-rc7
# Thu Oct 14 04:18:32 2010
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
# CONFIG_NEED_DMA_MAP_STATE is not set
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
# CONFIG_HAVE_CPUMASK_OF_CPU_MAP is not set
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_EARLY_RES=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_32_LAZY_GS=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-ecx -fcall-saved-edx"
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_CONSTRUCTORS=y
#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_LZO is not set
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_AUDIT is not set
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_TINY_RCU is not set
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_FANOUT=32
# CONFIG_RCU_FANOUT_EXACT is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_IKCONFIG is not set
CONFIG_LOG_BUF_SHIFT=18
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
# CONFIG_CGROUPS is not set
# CONFIG_SYSFS_DEPRECATED_V2 is not set
# CONFIG_RELAY is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_HAVE_PERF_EVENTS=y
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# CONFIG_PERF_COUNTERS is not set
# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_PCI_QUIRKS=y
CONFIG_SLUB_DEBUG=y
CONFIG_COMPAT_BRK=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_PROFILING is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_ATTRS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_DMA_API_DEBUG=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
CONFIG_HAVE_USER_RETURN_NOTIFIER=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
#
# GCOV-based kernel profiling
#
CONFIG_HAVE_GENERIC_DMA_COHERENT=y
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
CONFIG_BLOCK=y
CONFIG_LBDAF=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
# CONFIG_INLINE_SPIN_TRYLOCK is not set
# CONFIG_INLINE_SPIN_TRYLOCK_BH is not set
# CONFIG_INLINE_SPIN_LOCK is not set
# CONFIG_INLINE_SPIN_LOCK_BH is not set
# CONFIG_INLINE_SPIN_LOCK_IRQ is not set
# CONFIG_INLINE_SPIN_LOCK_IRQSAVE is not set
CONFIG_INLINE_SPIN_UNLOCK=y
# CONFIG_INLINE_SPIN_UNLOCK_BH is not set
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
# CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE is not set
# CONFIG_INLINE_READ_TRYLOCK is not set
# CONFIG_INLINE_READ_LOCK is not set
# CONFIG_INLINE_READ_LOCK_BH is not set
# CONFIG_INLINE_READ_LOCK_IRQ is not set
# CONFIG_INLINE_READ_LOCK_IRQSAVE is not set
CONFIG_INLINE_READ_UNLOCK=y
# CONFIG_INLINE_READ_UNLOCK_BH is not set
CONFIG_INLINE_READ_UNLOCK_IRQ=y
# CONFIG_INLINE_READ_UNLOCK_IRQRESTORE is not set
# CONFIG_INLINE_WRITE_TRYLOCK is not set
# CONFIG_INLINE_WRITE_LOCK is not set
# CONFIG_INLINE_WRITE_LOCK_BH is not set
# CONFIG_INLINE_WRITE_LOCK_IRQ is not set
# CONFIG_INLINE_WRITE_LOCK_IRQSAVE is not set
CONFIG_INLINE_WRITE_UNLOCK=y
# CONFIG_INLINE_WRITE_UNLOCK_BH is not set
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
# CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE is not set
# CONFIG_MUTEX_SPIN_ON_OWNER is not set
# CONFIG_FREEZER is not set
#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
# CONFIG_SMP is not set
# CONFIG_X86_EXTENDED_PLATFORM is not set
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_PARAVIRT_GUEST=y
# CONFIG_VMI is not set
# CONFIG_KVM_CLOCK is not set
CONFIG_KVM_GUEST=y
# CONFIG_LGUEST_GUEST is not set
CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
CONFIG_NO_BOOTMEM=y
# CONFIG_MEMTEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
CONFIG_MCORE2=y
# CONFIG_MATOM is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CPU=y
CONFIG_X86_INTERNODE_CACHE_SHIFT=6
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=5
CONFIG_X86_DEBUGCTLMSR=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_CPU_SUP_CYRIX_32=y
CONFIG_CPU_SUP_AMD=y
CONFIG_CPU_SUP_CENTAUR=y
CONFIG_CPU_SUP_TRANSMETA_32=y
CONFIG_CPU_SUP_UMC_32=y
CONFIG_HPET_TIMER=y
CONFIG_DMI=y
# CONFIG_IOMMU_HELPER is not set
# CONFIG_IOMMU_API is not set
CONFIG_NR_CPUS=1
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_X86_UP_APIC is not set
# CONFIG_X86_MCE is not set
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
# CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ILLEGAL_POINTER_VALUE=0
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_PHYS_ADDR_T_64BIT is not set
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
# CONFIG_HIGHPTE is not set
# CONFIG_X86_CHECK_BIOS_CORRUPTION is not set
CONFIG_X86_RESERVE_LOW_64K=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
# CONFIG_SECCOMP is not set
# CONFIG_CC_STACKPROTECTOR is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_SCHED_HRTICK=y
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x1000000
# CONFIG_RELOCATABLE is not set
CONFIG_PHYSICAL_ALIGN=0x100000
# CONFIG_COMPAT_VDSO is not set
# CONFIG_CMDLINE_BOOL is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
#
# Power management and ACPI options
#
# CONFIG_PM is not set
# CONFIG_SFI is not set
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
# CONFIG_CPU_IDLE is not set
#
# Bus options (PCI etc.)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_DOMAINS=y
# CONFIG_PCI_CNB20LE_QUIRK is not set
# CONFIG_PCIEPORTBUS is not set
# CONFIG_ARCH_SUPPORTS_MSI is not set
# CONFIG_PCI_DEBUG is not set
# CONFIG_PCI_STUB is not set
# CONFIG_PCI_IOV is not set
CONFIG_ISA_DMA_API=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
# CONFIG_OLPC is not set
# CONFIG_OLPC_OPENFIRMWARE is not set
CONFIG_K8_NB=y
# CONFIG_PCCARD is not set
# CONFIG_HOTPLUG_PCI is not set
#
# Executable file formats / Emulations
#
CONFIG_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
CONFIG_HAVE_AOUT=y
# CONFIG_BINFMT_AOUT is not set
# CONFIG_BINFMT_MISC is not set
CONFIG_HAVE_ATOMIC_IOMAP=y
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=y
CONFIG_NET_IPGRE=y
# CONFIG_NET_IPGRE_BROADCAST is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
CONFIG_INET_TUNNEL=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
# CONFIG_INET_DIAG is not set
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
# CONFIG_TCP_MD5SIG is not set
# CONFIG_IPV6 is not set
# CONFIG_NETWORK_SECMARK is not set
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
# CONFIG_NETFILTER is not set
# CONFIG_IP_DCCP is not set
# CONFIG_IP_SCTP is not set
# CONFIG_RDS is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
# CONFIG_L2TP is not set
# CONFIG_BRIDGE is not set
# CONFIG_NET_DSA is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_PHONET is not set
# CONFIG_IEEE802154 is not set
# CONFIG_NET_SCHED is not set
# CONFIG_DCB is not set
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_CAN is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_AF_RXRPC is not set
CONFIG_WIRELESS=y
# CONFIG_CFG80211 is not set
# CONFIG_LIB80211 is not set
#
# CFG80211 needs to be enabled for MAC80211
#
#
# Some wireless drivers require a rate control algorithm
#
# CONFIG_WIMAX is not set
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set
# CONFIG_CAIF is not set
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
# CONFIG_DEVTMPFS is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
CONFIG_FIRMWARE_IN_KERNEL=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_SYS_HYPERVISOR is not set
# CONFIG_CONNECTOR is not set
# CONFIG_MTD is not set
# CONFIG_PARPORT is not set
# CONFIG_BLK_DEV is not set
CONFIG_MISC_DEVICES=y
# CONFIG_IBM_ASM is not set
# CONFIG_PHANTOM is not set
# CONFIG_SGI_IOC4 is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_CS5535_MFGPT is not set
# CONFIG_HP_ILO is not set
# CONFIG_VMWARE_BALLOON is not set
# CONFIG_C2PORT is not set
#
# EEPROM support
#
# CONFIG_EEPROM_93CX6 is not set
# CONFIG_CB710_CORE is not set
CONFIG_HAVE_IDE=y
# CONFIG_IDE is not set
#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
# CONFIG_SCSI_TGT is not set
# CONFIG_SCSI_NETLINK is not set
# CONFIG_SCSI_PROC_FS is not set
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
# CONFIG_CHR_DEV_SCH is not set
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
# CONFIG_SCSI_SCAN_ASYNC is not set
#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_LIBSAS is not set
# CONFIG_SCSI_SRP_ATTRS is not set
# CONFIG_SCSI_LOWLEVEL is not set
# CONFIG_SCSI_DH is not set
# CONFIG_SCSI_OSD_INITIATOR is not set
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
CONFIG_ATA_VERBOSE_ERROR=y
# CONFIG_SATA_PMP is not set
#
# Controllers with non-SFF native interface
#
# CONFIG_SATA_AHCI is not set
# CONFIG_SATA_AHCI_PLATFORM is not set
# CONFIG_SATA_INIC162X is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y
#
# SFF controllers with custom DMA interface
#
# CONFIG_PDC_ADMA is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_SX4 is not set
CONFIG_ATA_BMDMA=y
#
# SATA SFF controllers with BMDMA
#
CONFIG_ATA_PIIX=y
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
#
# PATA SFF controllers with BMDMA
#
# CONFIG_PATA_ALI is not set
# CONFIG_PATA_AMD is not set
# CONFIG_PATA_ARTOP is not set
# CONFIG_PATA_ATIIXP is not set
# CONFIG_PATA_ATP867X is not set
# CONFIG_PATA_CMD64X is not set
# CONFIG_PATA_CS5520 is not set
# CONFIG_PATA_CS5530 is not set
# CONFIG_PATA_CS5535 is not set
# CONFIG_PATA_CS5536 is not set
# CONFIG_PATA_CYPRESS is not set
# CONFIG_PATA_EFAR is not set
# CONFIG_PATA_HPT366 is not set
# CONFIG_PATA_HPT37X is not set
# CONFIG_PATA_HPT3X2N is not set
# CONFIG_PATA_HPT3X3 is not set
# CONFIG_PATA_IT8213 is not set
# CONFIG_PATA_IT821X is not set
# CONFIG_PATA_JMICRON is not set
# CONFIG_PATA_MARVELL is not set
# CONFIG_PATA_NETCELL is not set
# CONFIG_PATA_NINJA32 is not set
# CONFIG_PATA_NS87415 is not set
# CONFIG_PATA_OLDPIIX is not set
# CONFIG_PATA_OPTIDMA is not set
# CONFIG_PATA_PDC2027X is not set
# CONFIG_PATA_PDC_OLD is not set
# CONFIG_PATA_RADISYS is not set
# CONFIG_PATA_RDC is not set
# CONFIG_PATA_SC1200 is not set
# CONFIG_PATA_SCH is not set
# CONFIG_PATA_SERVERWORKS is not set
# CONFIG_PATA_SIL680 is not set
# CONFIG_PATA_SIS is not set
# CONFIG_PATA_TOSHIBA is not set
# CONFIG_PATA_TRIFLEX is not set
# CONFIG_PATA_VIA is not set
# CONFIG_PATA_WINBOND is not set
#
# PIO-only SFF controllers
#
# CONFIG_PATA_CMD640_PCI is not set
# CONFIG_PATA_MPIIX is not set
# CONFIG_PATA_NS87410 is not set
# CONFIG_PATA_OPTI is not set
# CONFIG_PATA_RZ1000 is not set
#
# Generic fallback / legacy drivers
#
# CONFIG_ATA_GENERIC is not set
# CONFIG_PATA_LEGACY is not set
# CONFIG_MD is not set
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
#
# You can enable one or both FireWire driver stacks.
#
#
# The newer stack is recommended.
#
# CONFIG_FIREWIRE is not set
# CONFIG_IEEE1394 is not set
# CONFIG_FIREWIRE_NOSY is not set
# CONFIG_I2O is not set
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
# CONFIG_DUMMY is not set
# CONFIG_BONDING is not set
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_VETH is not set
# CONFIG_ARCNET is not set
# CONFIG_PHYLIB is not set
# CONFIG_NET_ETHERNET is not set
CONFIG_NETDEV_1000=y
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
CONFIG_E1000=y
# CONFIG_E1000E is not set
# CONFIG_IP1000 is not set
# CONFIG_IGB is not set
# CONFIG_IGBVF is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set
# CONFIG_CNIC is not set
# CONFIG_QLA3XXX is not set
# CONFIG_ATL1 is not set
# CONFIG_ATL1E is not set
# CONFIG_ATL1C is not set
# CONFIG_JME is not set
# CONFIG_NETDEV_10000 is not set
# CONFIG_TR is not set
# CONFIG_WLAN is not set
#
# Enable WiMAX (Networking options) to see the WiMAX drivers
#
# CONFIG_WAN is not set
#
# CAIF transport drivers
#
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_VMXNET3 is not set
# CONFIG_ISDN is not set
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
# CONFIG_INPUT_POLLDEV is not set
# CONFIG_INPUT_SPARSEKMAP is not set
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
# CONFIG_KEYBOARD_OPENCORES is not set
# CONFIG_KEYBOARD_STOWAWAY is not set
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
# CONFIG_SERIO_ALTERA_PS2 is not set
# CONFIG_GAMEPORT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_DEVKMEM=y
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_N_GSM is not set
# CONFIG_NOZOMI is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
# CONFIG_SERIAL_MFD_HSU is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_TIMBERDALE is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
# CONFIG_RTC is not set
# CONFIG_GEN_RTC is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
# CONFIG_MWAVE is not set
# CONFIG_PC8736x_GPIO is not set
# CONFIG_NSC_GPIO is not set
# CONFIG_CS5535_GPIO is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set
CONFIG_DEVPORT=y
# CONFIG_RAMOOPS is not set
# CONFIG_I2C is not set
# CONFIG_SPI is not set
#
# PPS support
#
# CONFIG_PPS is not set
CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_SUPPLY is not set
# CONFIG_HWMON is not set
# CONFIG_THERMAL is not set
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
#
# Sonics Silicon Backplane
#
# CONFIG_SSB is not set
# CONFIG_MFD_SUPPORT is not set
# CONFIG_REGULATOR is not set
# CONFIG_MEDIA_SUPPORT is not set
#
# Graphics support
#
# CONFIG_AGP is not set
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
# CONFIG_DRM is not set
# CONFIG_VGASTATE is not set
# CONFIG_VIDEO_OUTPUT_CONTROL is not set
# CONFIG_FB is not set
# CONFIG_BACKLIGHT_LCD_SUPPORT is not set
#
# Display device support
#
# CONFIG_DISPLAY_SUPPORT is not set
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_VGACON_SOFT_SCROLLBACK=y
CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=128
CONFIG_DUMMY_CONSOLE=y
# CONFIG_SOUND is not set
# CONFIG_HID_SUPPORT is not set
# CONFIG_USB_SUPPORT is not set
# CONFIG_UWB is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_INFINIBAND is not set
# CONFIG_EDAC is not set
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set
# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set
# CONFIG_STAGING is not set
# CONFIG_X86_PLATFORM_DEVICES is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_FIRMWARE_MEMMAP=y
# CONFIG_DELL_RBU is not set
# CONFIG_DCDBAS is not set
CONFIG_DMIID=y
# CONFIG_ISCSI_IBFT_FIND is not set
#
# File systems
#
# CONFIG_EXT2_FS is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
# CONFIG_EXT3_FS_SECURITY is not set
# CONFIG_EXT4_FS is not set
CONFIG_JBD=y
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
CONFIG_FILE_LOCKING=y
# CONFIG_FSNOTIFY is not set
# CONFIG_DNOTIFY is not set
# CONFIG_INOTIFY_USER is not set
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_FUSE_FS is not set
CONFIG_GENERIC_ACL=y
#
# Caches
#
# CONFIG_FSCACHE is not set
#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set
#
# DOS/FAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
# CONFIG_CONFIGFS_FS is not set
# CONFIG_MISC_FILESYSTEMS is not set
# CONFIG_NETWORK_FILESYSTEMS is not set
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_NLS is not set
# CONFIG_DLM is not set
#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
# CONFIG_ENABLE_MUST_CHECK is not set
CONFIG_FRAME_WARN=1024
CONFIG_MAGIC_SYSRQ=y
# CONFIG_STRIP_ASM_SYMS is not set
CONFIG_UNUSED_SYMBOLS=y
# CONFIG_DEBUG_FS is not set
# CONFIG_HEADERS_CHECK is not set
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SHIRQ is not set
# CONFIG_LOCKUP_DETECTOR is not set
# CONFIG_HARDLOCKUP_DETECTOR is not set
CONFIG_DETECT_HUNG_TASK=y
# CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0
# CONFIG_SCHED_DEBUG is not set
# CONFIG_SCHEDSTATS is not set
CONFIG_TIMER_STATS=y
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
# CONFIG_DEBUG_KMEMLEAK is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_DEBUG_KOBJECT is not set
# CONFIG_DEBUG_HIGHMEM is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_VIRTUAL is not set
# CONFIG_DEBUG_WRITECOUNT is not set
CONFIG_DEBUG_MEMORY_INIT=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set
CONFIG_ARCH_WANT_FRAME_POINTERS=y
# CONFIG_FRAME_POINTER is not set
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
# CONFIG_SYSCTL_SYSCALL_CHECK is not set
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_FP_TEST=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_DMA_API_DEBUG is not set
# CONFIG_ATOMIC64_SELFTEST is not set
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
CONFIG_HAVE_ARCH_KMEMCHECK=y
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_X86_PTDUMP is not set
# CONFIG_DEBUG_RODATA is not set
# CONFIG_4KSTACKS is not set
CONFIG_DOUBLEFAULT=y
# CONFIG_IOMMU_STRESS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set
# CONFIG_DEBUG_STRICT_USER_COPY_CHECKS is not set
#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_DEFAULT_SECURITY=""
# CONFIG_CRYPTO is not set
CONFIG_HAVE_KVM=y
# CONFIG_VIRTUALIZATION is not set
# CONFIG_BINARY_PRINTF is not set
#
# Library routines
#
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
# CONFIG_CRC_CCITT is not set
# CONFIG_CRC16 is not set
# CONFIG_CRC_T10DIF is not set
# CONFIG_CRC_ITU_T is not set
# CONFIG_CRC32 is not set
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y
[-- Attachment #1.3: tunnels.sh --]
[-- Type: application/x-shellscript, Size: 1116 bytes --]
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply
* Re: tbf/htb qdisc limitations
From: Eric Dumazet @ 2010-10-14 4:01 UTC (permalink / raw)
To: Bill Fink; +Cc: Jarek Poplawski, Rick Jones, Steven Brudenell, netdev
In-Reply-To: <20101013233653.1e363692.billfink@mindspring.com>
On Wednesday, 13 October 2010 at 23:36 -0400, Bill Fink wrote:
> I was just trying to do an 8 Gbps rate limit on a 10-GigE path,
> and couldn't get it to work with either htb or tbf. Are you
> saying this currently isn't possible? Or are you saying to use
> this hfsc mechanism, which there doesn't seem to be a man page
> for?
man pages ? Oh well...
8Gbps rate limit sounds very optimistic with a central lock and one
queue...
Maybe it's possible to split this into 8 x 1 Gbps, using 8 queues,
or 16 x 500 Mbps.
^ permalink raw reply
* Re: Nested GRE locking bug
From: Eric Dumazet @ 2010-10-14 4:11 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev, Beatrice Barbe, 599816
In-Reply-To: <1287028842.11178.68.camel@localhost>
On Thursday, 14 October 2010 at 05:00 +0100, Ben Hutchings wrote:
> Beatrice Barbe reported a reproducible crash after creating large
> numbers of nested GRE tunnels and then pinging with the source address
> forced. I was able to reproduce this using net-2.6. I'm attaching the
> kernel config I used and a script to reproduce this based on the script
> she provided. The magic number of tunnels to create is apparently 37.
>
> With lockdep enabled, I get the following output:
>
That's a known problem, actually, called stack exhaustion :)
net-next-2.6 contains a fix for this, adding the percpu xmit_recursion
limit. We might push it to net-2.6.
Thanks
commit 745e20f1b626b1be4b100af5d4bf7b3439392f8f
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed Sep 29 13:23:09 2010 -0700
net: add a recursion limit in xmit path
As tunnel devices are going to be lockless, we need to make sure a
misconfigured machine wont enter an infinite loop.
Add a percpu variable, and limit to three the number of stacked xmits.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
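The idea in the commit message above can be sketched in user space. This is only an illustration under stated assumptions: `fake_dev`, `fake_xmit()` and `xmit_through_stack()` are hypothetical names, and a thread-local counter stands in for the kernel's per-CPU variable.

```c
#include <stdio.h>

/* Hypothetical user-space stand-ins; none of these names are kernel APIs. */
#define RECURSION_LIMIT 3

/* One counter per thread stands in for the kernel's per-CPU variable. */
static _Thread_local int xmit_recursion;

struct fake_dev {
	const char *name;
	struct fake_dev *lower;	/* device this one transmits through, or NULL */
};

/* Returns 0 on success, -1 if the nesting limit was exceeded. */
static int fake_xmit(struct fake_dev *dev)
{
	int rc = 0;

	if (xmit_recursion > RECURSION_LIMIT) {
		fprintf(stderr, "Dead loop on virtual device %s\n", dev->name);
		return -1;
	}

	xmit_recursion++;
	if (dev->lower)
		rc = fake_xmit(dev->lower);	/* a tunnel re-enters the xmit path */
	xmit_recursion--;

	return rc;
}

/* Build a chain of `depth` stacked devices and transmit through the top one. */
static int xmit_through_stack(int depth)
{
	struct fake_dev devs[64];
	int i;

	if (depth < 1 || depth > 64)
		return -1;
	for (i = 0; i < depth; i++) {
		devs[i].name = "tun";
		devs[i].lower = (i + 1 < depth) ? &devs[i + 1] : NULL;
	}
	return fake_xmit(&devs[0]);
}
```

With the comparison written as `>`, this sketch lets four nested calls through and rejects the fifth. The counter is back at zero after either outcome, because every increment is paired with a decrement on the way out.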
diff --git a/net/core/dev.c b/net/core/dev.c
index 48ad47f..50dacca 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2177,6 +2177,9 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
return rc;
}
+static DEFINE_PER_CPU(int, xmit_recursion);
+#define RECURSION_LIMIT 3
+
/**
* dev_queue_xmit - transmit a buffer
* @skb: buffer to transmit
@@ -2242,10 +2245,15 @@ int dev_queue_xmit(struct sk_buff *skb)
if (txq->xmit_lock_owner != cpu) {
+ if (__this_cpu_read(xmit_recursion) > RECURSION_LIMIT)
+ goto recursion_alert;
+
HARD_TX_LOCK(dev, txq, cpu);
if (!netif_tx_queue_stopped(txq)) {
+ __this_cpu_inc(xmit_recursion);
rc = dev_hard_start_xmit(skb, dev, txq);
+ __this_cpu_dec(xmit_recursion);
if (dev_xmit_complete(rc)) {
HARD_TX_UNLOCK(dev, txq);
goto out;
@@ -2257,7 +2265,9 @@ int dev_queue_xmit(struct sk_buff *skb)
"queue packet!\n", dev->name);
} else {
/* Recursion is detected! It is possible,
- * unfortunately */
+ * unfortunately
+ */
+recursion_alert:
if (net_ratelimit())
printk(KERN_CRIT "Dead loop on virtual device "
"%s, fix it urgently!\n", dev->name);
^ permalink raw reply related
* Re: BUG ? ipip unregister_netdevice_many()
From: Eric W. Biederman @ 2010-10-14 4:40 UTC (permalink / raw)
To: David Miller; +Cc: hans.schillstrom, daniel.lezcano, netdev
In-Reply-To: <20101012.130520.48517464.davem@davemloft.net>
David Miller <davem@davemloft.net> writes:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Fri, 08 Oct 2010 10:32:40 -0700
>
>> It is just dealing with not flushing the entire routing cache, just the
>> routes that have expired. Which prevents one network namespace from
>> flushing it's routes and DOS'ing another.
>
> That's a very indirect and obfuscated way of handling it.
>
> And I still don't know why we let the first contiguous set of expired
> entries in the chain get freed outside of the lock, and the rest
> inside the lock. That really isn't explained by anything I've read.
>
> How about we just do exactly what's intended, and with no ifdefs?
I'm all for no ifdefs.
And reading the code, your version looks much simpler and easier
to read, and I am all for that.
However, I think the test should still be rt_is_expired(), because
that is what rt_do_flush() is doing: removing the expired entries
from the list.
The only difference is that we remove the assumption that all hash
table entries must be expired at this point.
We have very straightforwardly expired all of the route table entries
for the namespace that got this going earlier.
Eric
> Signed-off-by: David S. Miller <davem@davemloft.net>
>
> diff --git a/include/net/route.h b/include/net/route.h
> index 7e5e73b..8d24761 100644
> --- a/include/net/route.h
> +++ b/include/net/route.h
> @@ -106,7 +106,7 @@ extern int ip_rt_init(void);
> extern void ip_rt_redirect(__be32 old_gw, __be32 dst, __be32 new_gw,
> __be32 src, struct net_device *dev);
> extern void rt_cache_flush(struct net *net, int how);
> -extern void rt_cache_flush_batch(void);
> +extern void rt_cache_flush_batch(struct net *net);
> extern int __ip_route_output_key(struct net *, struct rtable **, const struct flowi *flp);
> extern int ip_route_output_key(struct net *, struct rtable **, struct flowi *flp);
> extern int ip_route_output_flow(struct net *, struct rtable **rp, struct flowi *flp, struct sock *sk, int flags);
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 919f2ad..4039f56 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -999,7 +999,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
> rt_cache_flush(dev_net(dev), 0);
> break;
> case NETDEV_UNREGISTER_BATCH:
> - rt_cache_flush_batch();
> + rt_cache_flush_batch(dev_net(dev));
I believe this change is actually wrong. dev here is the first
element of a list of network devices, and that list may span multiple
network namespaces.
> break;
> }
> return NOTIFY_DONE;
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 0755aa4..6ad730c 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -712,13 +712,14 @@ static inline int rt_is_expired(struct rtable *rth)
> * Can be called by a softirq or a process.
> * In the later case, we want to be reschedule if necessary
> */
> -static void rt_do_flush(int process_context)
> +static void rt_do_flush(struct net *net, int process_context)
> {
> unsigned int i;
> struct rtable *rth, *next;
> - struct rtable * tail;
>
> for (i = 0; i <= rt_hash_mask; i++) {
> + struct rtable *list, **pprev;
> +
> if (process_context && need_resched())
> cond_resched();
> rth = rt_hash_table[i].chain;
> @@ -726,41 +727,27 @@ static void rt_do_flush(int process_context)
> continue;
>
> spin_lock_bh(rt_hash_lock_addr(i));
> -#ifdef CONFIG_NET_NS
> - {
> - struct rtable ** prev, * p;
>
> - rth = rt_hash_table[i].chain;
> + pprev = &rt_hash_table[i].chain;
> + rth = *pprev;
> + while (rth) {
> + next = rth->dst.rt_next;
> + if (dev_net(rth->dst.dev) == net) {
> + *pprev = next;
>
> - /* defer releasing the head of the list after spin_unlock */
> - for (tail = rth; tail; tail = tail->dst.rt_next)
> - if (!rt_is_expired(tail))
> - break;
> - if (rth != tail)
> - rt_hash_table[i].chain = tail;
> -
> - /* call rt_free on entries after the tail requiring flush */
> - prev = &rt_hash_table[i].chain;
> - for (p = *prev; p; p = next) {
> - next = p->dst.rt_next;
> - if (!rt_is_expired(p)) {
> - prev = &p->dst.rt_next;
> - } else {
> - *prev = next;
> - rt_free(p);
> - }
> - }
> + rth->dst.rt_next = list;
> + list = rth;
> + } else
> + pprev = &rth->dst.rt_next;
> +
> + rth = next;
> }
> -#else
> - rth = rt_hash_table[i].chain;
> - rt_hash_table[i].chain = NULL;
> - tail = NULL;
> -#endif
> +
> spin_unlock_bh(rt_hash_lock_addr(i));
>
> - for (; rth != tail; rth = next) {
> - next = rth->dst.rt_next;
> - rt_free(rth);
> + for (; list; list = next) {
> + next = list->dst.rt_next;
> + rt_free(list);
> }
> }
> }
> @@ -906,13 +893,13 @@ void rt_cache_flush(struct net *net, int delay)
> {
> rt_cache_invalidate(net);
> if (delay >= 0)
> - rt_do_flush(!in_softirq());
> + rt_do_flush(net, !in_softirq());
> }
>
> /* Flush previous cache invalidated entries from the cache */
> -void rt_cache_flush_batch(void)
> +void rt_cache_flush_batch(struct net *net)
> {
> - rt_do_flush(!in_softirq());
> + rt_do_flush(net, !in_softirq());
> }
>
> static void rt_emergency_hash_rebuild(struct net *net)
^ permalink raw reply
* Re: BUG ? ipip unregister_netdevice_many()
From: David Miller @ 2010-10-14 4:50 UTC (permalink / raw)
To: ebiederm; +Cc: hans.schillstrom, daniel.lezcano, netdev
In-Reply-To: <m1fww94ca6.fsf@fess.ebiederm.org>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Wed, 13 Oct 2010 21:40:49 -0700
> However I think the test should still be rt_is_expired(), because
> that is what rt_do_flush() is doing removing the expired entries
> from the list.
I can't see a reason for that test.
Everything calling into this code path has created a condition
that requires that all routing cache entries for that namespace
be deleted.
This function is meant to unconditionally flush the entire table.
I believe you added that extraneous test, and it never existed there
before.
^ permalink raw reply
* Re: [RFC PATCH net-next] drivers/net Documentation/networking: Create directory intel_wired_lan
From: Jeff Kirsher @ 2010-10-14 4:57 UTC (permalink / raw)
To: Joe Perches
Cc: Brandeburg, Jesse, Allan, Bruce W, Wyborny, Carolyn,
Skidmore, Donald C, Rose, Gregory V, Waskiewicz Jr, Peter P,
Duyck, Alexander H, Ronciak, John, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, e1000-devel
In-Reply-To: <1287008906.1117.428.camel@Joe-Laptop>
[-- Attachment #1: Type: text/plain, Size: 1926 bytes --]
On Wed, 2010-10-13 at 15:28 -0700, Joe Perches wrote:
> On Mon, 2010-10-11 at 17:00 -0700, Joe Perches wrote:
> > On Mon, 2010-10-11 at 16:52 -0700, Jeff Kirsher wrote:
> > > On Sun, Oct 10, 2010 at 13:42, Joe Perches <joe@perches.com> wrote:
> > > > Perhaps it's better to move drivers from the very populated
> > > > drivers/net directory into vendor specific directories similar
> > > > to the Atheros approach used for drivers/net/wireless/ath/
> > > NAK
> > > First, I think we need to keep the documentation in /Documentation/networking.
> > > Second, the changes are extensive and would create a lot of regression testing.
> > I don't see any actual changes here other than layout.
> > What kind of regression testing do you think necessary?
>
> Jeff?
>
Sorry, I am not ignoring you; I was taking a closer look at your patch.
> What regression testing would actually be done?
>
The Makefile and Kconfig need more work. I applied your patch and none
of the Intel Wired drivers build.
The statement that there would be a lot of regression testing was in
reference to your response to Stephen that it would "allow consolidation
of common code". Sorry for being vague about the regression testing.
In general, I do like the idea of moving all the Intel wired LAN drivers
into their own directory, like what Atheros has done in wireless.
I am working on providing an updated RFC patch to resolve the
Makefile/Kconfig issues and a few other minor issues I have
found.
> Any new objects are trivially validated against existing
> objects.
>
> > > We have been looking at solutions like this for future
> > > drivers/hardware and is on the list of items we are currently working
> > > on, but feel it should not be made retroactively due to the regression
> > > testing and massive changes that would need to be made.
> >
> > Might as well start somewhere.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply
* [PATCH] niu: introduce temp variables to avoid sparse warnings when swapping in-situ
From: Harvey Harrison @ 2010-10-14 4:59 UTC (permalink / raw)
To: netdev; +Cc: Harvey Harrison
Suppress a large block of warnings like:
drivers/net/niu.c:7094:38: warning: incorrect type in assignment (different base types)
drivers/net/niu.c:7094:38: expected restricted __be32 [usertype] ip4src
drivers/net/niu.c:7094:38: got unsigned long long
drivers/net/niu.c:7104:17: warning: cast from restricted __be32
...
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
---
This could be done without the temp variables, but they do make it a bit clearer.
drivers/net/niu.c | 92 +++++++++++++++++++++-------------------------------
1 files changed, 37 insertions(+), 55 deletions(-)
diff --git a/drivers/net/niu.c b/drivers/net/niu.c
index c0437fd..781e368 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -7090,24 +7090,20 @@ static int niu_get_hash_opts(struct niu *np, struct ethtool_rxnfc *nfc)
static void niu_get_ip4fs_from_tcam_key(struct niu_tcam_entry *tp,
struct ethtool_rx_flow_spec *fsp)
{
+ u32 tmp;
+ u16 prt;
- fsp->h_u.tcp_ip4_spec.ip4src = (tp->key[3] & TCAM_V4KEY3_SADDR) >>
- TCAM_V4KEY3_SADDR_SHIFT;
- fsp->h_u.tcp_ip4_spec.ip4dst = (tp->key[3] & TCAM_V4KEY3_DADDR) >>
- TCAM_V4KEY3_DADDR_SHIFT;
- fsp->m_u.tcp_ip4_spec.ip4src = (tp->key_mask[3] & TCAM_V4KEY3_SADDR) >>
- TCAM_V4KEY3_SADDR_SHIFT;
- fsp->m_u.tcp_ip4_spec.ip4dst = (tp->key_mask[3] & TCAM_V4KEY3_DADDR) >>
- TCAM_V4KEY3_DADDR_SHIFT;
-
- fsp->h_u.tcp_ip4_spec.ip4src =
- cpu_to_be32(fsp->h_u.tcp_ip4_spec.ip4src);
- fsp->m_u.tcp_ip4_spec.ip4src =
- cpu_to_be32(fsp->m_u.tcp_ip4_spec.ip4src);
- fsp->h_u.tcp_ip4_spec.ip4dst =
- cpu_to_be32(fsp->h_u.tcp_ip4_spec.ip4dst);
- fsp->m_u.tcp_ip4_spec.ip4dst =
- cpu_to_be32(fsp->m_u.tcp_ip4_spec.ip4dst);
+ tmp = (tp->key[3] & TCAM_V4KEY3_SADDR) >> TCAM_V4KEY3_SADDR_SHIFT;
+ fsp->h_u.tcp_ip4_spec.ip4src = cpu_to_be32(tmp);
+
+ tmp = (tp->key[3] & TCAM_V4KEY3_DADDR) >> TCAM_V4KEY3_DADDR_SHIFT;
+ fsp->h_u.tcp_ip4_spec.ip4dst = cpu_to_be32(tmp);
+
+ tmp = (tp->key_mask[3] & TCAM_V4KEY3_SADDR) >> TCAM_V4KEY3_SADDR_SHIFT;
+ fsp->m_u.tcp_ip4_spec.ip4src = cpu_to_be32(tmp);
+
+ tmp = (tp->key_mask[3] & TCAM_V4KEY3_DADDR) >> TCAM_V4KEY3_DADDR_SHIFT;
+ fsp->m_u.tcp_ip4_spec.ip4dst = cpu_to_be32(tmp);
fsp->h_u.tcp_ip4_spec.tos = (tp->key[2] & TCAM_V4KEY2_TOS) >>
TCAM_V4KEY2_TOS_SHIFT;
@@ -7118,54 +7114,40 @@ static void niu_get_ip4fs_from_tcam_key(struct niu_tcam_entry *tp,
case TCP_V4_FLOW:
case UDP_V4_FLOW:
case SCTP_V4_FLOW:
- fsp->h_u.tcp_ip4_spec.psrc =
- ((tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
- TCAM_V4KEY2_PORT_SPI_SHIFT) >> 16;
- fsp->h_u.tcp_ip4_spec.pdst =
- ((tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
- TCAM_V4KEY2_PORT_SPI_SHIFT) & 0xffff;
- fsp->m_u.tcp_ip4_spec.psrc =
- ((tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
- TCAM_V4KEY2_PORT_SPI_SHIFT) >> 16;
- fsp->m_u.tcp_ip4_spec.pdst =
- ((tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
- TCAM_V4KEY2_PORT_SPI_SHIFT) & 0xffff;
+ prt = ((tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
+ TCAM_V4KEY2_PORT_SPI_SHIFT) >> 16;
+ fsp->h_u.tcp_ip4_spec.psrc = cpu_to_be16(prt);
+
+ prt = ((tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
+ TCAM_V4KEY2_PORT_SPI_SHIFT) & 0xffff;
+ fsp->h_u.tcp_ip4_spec.pdst = cpu_to_be16(prt);
- fsp->h_u.tcp_ip4_spec.psrc =
- cpu_to_be16(fsp->h_u.tcp_ip4_spec.psrc);
- fsp->h_u.tcp_ip4_spec.pdst =
- cpu_to_be16(fsp->h_u.tcp_ip4_spec.pdst);
- fsp->m_u.tcp_ip4_spec.psrc =
- cpu_to_be16(fsp->m_u.tcp_ip4_spec.psrc);
- fsp->m_u.tcp_ip4_spec.pdst =
- cpu_to_be16(fsp->m_u.tcp_ip4_spec.pdst);
+ prt = ((tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
+ TCAM_V4KEY2_PORT_SPI_SHIFT) >> 16;
+ fsp->m_u.tcp_ip4_spec.psrc = cpu_to_be16(prt);
+
+ prt = ((tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
+ TCAM_V4KEY2_PORT_SPI_SHIFT) & 0xffff;
+ fsp->m_u.tcp_ip4_spec.pdst = cpu_to_be16(prt);
break;
case AH_V4_FLOW:
case ESP_V4_FLOW:
- fsp->h_u.ah_ip4_spec.spi =
- (tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
- TCAM_V4KEY2_PORT_SPI_SHIFT;
- fsp->m_u.ah_ip4_spec.spi =
- (tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
+ tmp = (tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
TCAM_V4KEY2_PORT_SPI_SHIFT;
+ fsp->h_u.ah_ip4_spec.spi = cpu_to_be32(tmp);
- fsp->h_u.ah_ip4_spec.spi =
- cpu_to_be32(fsp->h_u.ah_ip4_spec.spi);
- fsp->m_u.ah_ip4_spec.spi =
- cpu_to_be32(fsp->m_u.ah_ip4_spec.spi);
+ tmp = (tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
+ TCAM_V4KEY2_PORT_SPI_SHIFT;
+ fsp->m_u.ah_ip4_spec.spi = cpu_to_be32(tmp);
break;
case IP_USER_FLOW:
- fsp->h_u.usr_ip4_spec.l4_4_bytes =
- (tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
- TCAM_V4KEY2_PORT_SPI_SHIFT;
- fsp->m_u.usr_ip4_spec.l4_4_bytes =
- (tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
+ tmp = (tp->key[2] & TCAM_V4KEY2_PORT_SPI) >>
TCAM_V4KEY2_PORT_SPI_SHIFT;
+ fsp->h_u.usr_ip4_spec.l4_4_bytes = cpu_to_be32(tmp);
- fsp->h_u.usr_ip4_spec.l4_4_bytes =
- cpu_to_be32(fsp->h_u.usr_ip4_spec.l4_4_bytes);
- fsp->m_u.usr_ip4_spec.l4_4_bytes =
- cpu_to_be32(fsp->m_u.usr_ip4_spec.l4_4_bytes);
+ tmp = (tp->key_mask[2] & TCAM_V4KEY2_PORT_SPI) >>
+ TCAM_V4KEY2_PORT_SPI_SHIFT;
+ fsp->m_u.usr_ip4_spec.l4_4_bytes = cpu_to_be32(tmp);
fsp->h_u.usr_ip4_spec.proto =
(tp->key[2] & TCAM_V4KEY2_PROTO) >>
--
1.7.1
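For readers less familiar with the pattern, here is a minimal user-space sketch of what the temp variables buy. The mask and shift macros and the `flow_spec` struct are hypothetical, not the real niu TCAM layout, and `htonl()` stands in for `cpu_to_be32()`. The point is that the CPU-endian value lives only in `tmp`, so a field declared as big-endian never holds a host-order value, even briefly.

```c
#include <stdint.h>
#include <arpa/inet.h>	/* htonl(), ntohl() */

/* Hypothetical field layout, for illustration only; not the real niu masks. */
#define KEY_SADDR_MASK	0xffffffff00000000ULL
#define KEY_SADDR_SHIFT	32
#define KEY_DADDR_MASK	0x00000000ffffffffULL
#define KEY_DADDR_SHIFT	0

struct flow_spec {
	uint32_t ip4src;	/* stored big-endian (network order) */
	uint32_t ip4dst;	/* stored big-endian (network order) */
};

static void spec_from_key(uint64_t key, struct flow_spec *fsp)
{
	uint32_t tmp;	/* always CPU-endian */

	/* Extract in host order first, then convert exactly once. */
	tmp = (uint32_t)((key & KEY_SADDR_MASK) >> KEY_SADDR_SHIFT);
	fsp->ip4src = htonl(tmp);

	tmp = (uint32_t)((key & KEY_DADDR_MASK) >> KEY_DADDR_SHIFT);
	fsp->ip4dst = htonl(tmp);
}
```

The in-situ form assigns a host-order value to the big-endian field and then swaps it in place, which is the kind of mixed-type assignment the quoted sparse warnings complain about.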
^ permalink raw reply related
* Re: BUG ? ipip unregister_netdevice_many()
From: Eric W. Biederman @ 2010-10-14 5:20 UTC (permalink / raw)
To: David Miller; +Cc: hans.schillstrom, daniel.lezcano, netdev
In-Reply-To: <20101013.215013.104074480.davem@davemloft.net>
David Miller <davem@davemloft.net> writes:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Wed, 13 Oct 2010 21:40:49 -0700
>
>> However I think the test should still be rt_is_expired(), because
>> that is what rt_do_flush() is doing removing the expired entries
>> from the list.
>
> I can't see a reason for that test.
>
> Everything calling into this code path has created a condition
> that requires that all routing cache entries for that namespace
> be deleted.
>
> This function is meant to unconditionally flush the entire table.
>
> I believe you added that extraneous test, and it never existed there
> before.
At the point network namespaces entered the picture the logic was:
void rt_cache_flush(struct net *net, int delay)
{
	rt_cache_invalidate();
	if (delay >= 0)
		rt_do_flush(!in_softirq());
}

/* Strictly speaking rt_is_expired was just open coded in
 * rt_check_expire. But this is the check that was used.
 */
static inline int rt_is_expired(struct rtable *rth)
{
	return rth->rt_genid != atomic_read(&rt_genid);
}

static void rt_cache_invalidate(void)
{
	unsigned char shuffle;

	get_random_bytes(&shuffle, sizeof(shuffle));
	atomic_add(shuffle + 1U, &rt_genid);
}

static void rt_do_flush(int process_context)
{
	unsigned int i;
	struct rtable *rth, *next;
	struct rtable *tail;

	for (i = 0; i <= rt_hash_mask; i++) {
		if (process_context && need_resched())
			cond_resched();
		rth = rt_hash_table[i].chain;
		if (!rth)
			continue;

		/* Detach the whole chain under the lock, free it afterwards. */
		spin_lock_bh(rt_hash_lock_addr(i));
		rth = rt_hash_table[i].chain;
		rt_hash_table[i].chain = NULL;
		tail = NULL;
		spin_unlock_bh(rt_hash_lock_addr(i));

		for (; rth != tail; rth = next) {
			next = rth->dst.rt_next;
			rt_free(rth);
		}
	}
}
Because of the rt_cache_invalidate() in rt_cache_flush() this
guaranteed that rt_is_expired() was true for every route cache entry,
and this also guaranteed that every routing cache entry we were flushing
atomically became inaccessible.
So rt_is_expired() has always been valid, but in practice it was just
always optimized out as being redundant.
With the network namespace support we limit the scope of the
invalidate to just a single network namespace, and as such
rt_is_expired() stops being true for every cache entry. So we cannot
unconditionally throw away entire chains.
All of which can be done either by network namespace equality or by
rt_is_expired(), although Denis picked rt_is_expired() when he made
his change.
The only place it makes a noticeable difference in practice is what
happens when we do batched deletes of lots of network devices in
different network namespaces.
During batched network device deletes in fib_netdev_event we do
rt_cache_flush(dev_net(dev), -1) for each network device, and then a
final rt_cache_flush_batch() to remove the invalidated entries. These
devices can be from multiple network namespaces, so I suspect that is
a savings worth having.
So if we are going to change the tests we need to do something with
rt_cache_flush_batch(). Further, I do not see what is confusing about
a test that asks if the routing cache entry is unusable. Is
rt_is_expired() a bad name?
Eric
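The batching argument above can be illustrated with a small user-space sketch. It is a simplification under stated assumptions: `struct ns`, `struct entry` and the helper names are hypothetical, each namespace carries a plain integer generation id, and there is no locking. Invalidation only bumps the id, and a single later pass frees every entry whose recorded id no longer matches its owner's.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified types; the names only echo those in the thread. */
struct ns {
	int genid;
};

struct entry {
	struct entry *next;
	struct ns *owner;
	int genid;	/* owner's genid when the entry was created */
};

static int entry_is_expired(const struct entry *e)
{
	return e->genid != e->owner->genid;
}

/* Invalidate every entry of one namespace without touching the chain. */
static void ns_invalidate(struct ns *ns)
{
	ns->genid++;
}

/* Push a fresh entry for `owner` onto the front of the chain. */
static struct entry *entry_add(struct entry *head, struct ns *owner)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->next = head;
	e->owner = owner;
	e->genid = owner->genid;
	return e;
}

/* One pass frees whatever has expired, for any number of namespaces. */
static int chain_flush_expired(struct entry **head)
{
	struct entry **pprev = head, *e;
	int freed = 0;

	while ((e = *pprev) != NULL) {
		if (entry_is_expired(e)) {
			*pprev = e->next;	/* unlink, pprev stays put */
			free(e);
			freed++;
		} else {
			pprev = &e->next;
		}
	}
	return freed;
}

static int chain_len(const struct entry *e)
{
	int n = 0;

	for (; e; e = e->next)
		n++;
	return n;
}
```

In this model, invalidating two namespaces and then flushing once removes the entries of both and leaves the rest alone, which is the saving described for batched device deletes. A flush keyed on a single namespace pointer would need one pass per namespace instead.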
^ permalink raw reply
* [PATCH] stmmac: remove ifdef NETIF_F_TSO from stmmac_ethtool.c
From: Giuseppe CAVALLARO @ 2010-10-14 5:54 UTC (permalink / raw)
To: netdev; +Cc: Giuseppe Cavallaro
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Reported-by: Armando Visconti <armando.visconti@st.com>
---
drivers/net/stmmac/stmmac_ethtool.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/drivers/net/stmmac/stmmac_ethtool.c b/drivers/net/stmmac/stmmac_ethtool.c
index f080509..9262f4a 100644
--- a/drivers/net/stmmac/stmmac_ethtool.c
+++ b/drivers/net/stmmac/stmmac_ethtool.c
@@ -377,10 +377,8 @@ static struct ethtool_ops stmmac_ethtool_ops = {
.get_wol = stmmac_get_wol,
.set_wol = stmmac_set_wol,
.get_sset_count = stmmac_get_sset_count,
-#ifdef NETIF_F_TSO
.get_tso = ethtool_op_get_tso,
.set_tso = ethtool_op_set_tso,
-#endif
};
void stmmac_set_ethtool_ops(struct net_device *netdev)
--
1.5.5.6
^ permalink raw reply related
* Re: [RFC PATCH net-next] drivers/net Documentation/networking: Create directory intel_wired_lan
From: Joe Perches @ 2010-10-14 5:57 UTC (permalink / raw)
To: jeffrey.t.kirsher
Cc: Brandeburg, Jesse, Allan, Bruce W, Wyborny, Carolyn,
Skidmore, Donald C, Rose, Gregory V, Waskiewicz Jr, Peter P,
Duyck, Alexander H, Ronciak, John, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org, e1000-devel
In-Reply-To: <1287032255.4113.14.camel@jtkirshe-MOBL1>
On Wed, 2010-10-13 at 21:57 -0700, Jeff Kirsher wrote:
> On Wed, 2010-10-13 at 15:28 -0700, Joe Perches wrote:
> Sorry I am not ignoring you, I was taking a closer look at your patch.
> > What regression testing would actually be done?
> The Makefile and Kconfig needs more work. I applied your patch and none
> of the Intel Wired drivers build.
Care to describe the Makefile/Kconfig issues you have seen?
I built it allyesconfig, defconfig, allmodconfig and allnoconfig.
Perhaps you need to use "git am foo" in a test branch instead
of "patch -p1 < foo" ?
> I am working on providing an updated RFC patch to resolve the
> Makefile/Kconfig issues I found and few other minor issues I have
> found.
Oh good.
cheers, Joe
^ permalink raw reply
* Re: [PATCH net-next] net: allocate skbs on local node
From: Pekka Enberg @ 2010-10-14 6:22 UTC (permalink / raw)
To: David Rientjes
Cc: Christoph Lameter, Andrew Morton, Eric Dumazet, David Miller,
netdev, Michael Chan, Eilon Greenstein, Christoph Hellwig, LKML,
Nick Piggin
In-Reply-To: <alpine.DEB.2.00.1010131539440.27839@chino.kir.corp.google.com>
On Wed, 13 Oct 2010, Christoph Lameter wrote:
>>> I was going to mention that as an idea, but I thought storing the metadata
>>> for certain debugging features might differ from the two allocators so
>>> substantially that it would be even more convoluted and difficult to
>>> maintain?
>>
>> We could have some callbacks to store allocator specific metadata?
On 10/14/10 1:41 AM, David Rientjes wrote:
> It depends on whether we could share the same base for both slab (unified
> allocator) and slub, which you snipped from your reply, that would make
> this cleaner.
Argh. Why would we want to introduce something that's effectively a new
allocator based on SLUB? If there's something controversial in the
current patch series, let's just keep it out of mainline. A "rewrite" is
the reason we're in this mess, so let's not repeat the same mistake again!
Pekka
^ permalink raw reply
* Re: tbf/htb qdisc limitations
From: Bill Fink @ 2010-10-14 6:34 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Jarek Poplawski, Rick Jones, Steven Brudenell, netdev
In-Reply-To: <1287028881.2649.103.camel@edumazet-laptop>
On Thu, 14 Oct 2010, Eric Dumazet wrote:
> On Wednesday, 13 October 2010 at 23:36 -0400, Bill Fink wrote:
>
> > I was just trying to do an 8 Gbps rate limit on a 10-GigE path,
> > and couldn't get it to work with either htb or tbf. Are you
> > saying this currently isn't possible? Or are you saying to use
> > this hfsc mechanism, which there doesn't seem to be a man page
> > for?
>
> man pages ? Oh well...
>
> 8Gbps rate limit sounds very optimistic with a central lock and one
> queue...
>
> Maybe its possible to split this into 8 x 1Gbps, using 8 queues...
> or 16 x 500 Mbps
Not when I'm trying to rate limit a single flow.
-Bill
^ permalink raw reply
* Re: BUG ? ipip unregister_netdevice_many()
From: Hans Schillstrom @ 2010-10-14 6:41 UTC (permalink / raw)
To: David Miller
Cc: jarkao2@gmail.com, ebiederm@xmission.com, daniel.lezcano@free.fr,
netdev@vger.kernel.org
In-Reply-To: <20101013.145856.112590240.davem@davemloft.net>
On Wednesday 13 October 2010 23:58:56 David Miller wrote:
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Wed, 13 Oct 2010 11:19:47 +0000
>
> >> -static void rt_do_flush(int process_context)
> >> +static void rt_do_flush(struct net *net, int process_context)
> >> {
> >> unsigned int i;
> >> struct rtable *rth, *next;
> >> - struct rtable * tail;
> >>
> >> for (i = 0; i <= rt_hash_mask; i++) {
> >> + struct rtable *list, **pprev;
> >
> > Isn't "list = NULL" needed here?
>
> Yes it is, thanks for catching that.
>
It solves the crash, but...
#
Slab corruption: size-4096 start=ffff88000f950000, len=4096
010: 00 00 00 00 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b
unregister_netdevice: waiting for lo to become free. Usage count = 4
Slab corruption: size-4096 start=ffff88000f9af000, len=4096
010: 00 00 00 00 00 00 00 00 6b 6b 6b 6b 6b 6b 6b 6b
unregister_netdevice: waiting for lo to become free. Usage count = 4
unregister_netdevice: waiting for lo to become free. Usage count = 4
unregister_netdevice: waiting for lo to become free. Usage count = 4
Regards
Hans Schillstrom <hans.schillstrom@ericsson.com>
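The `list = NULL` point quoted above is easier to see in a stripped-down user-space sketch of the same unlinking loop. The types and names here are hypothetical stand-ins, and the chain lock is indicated only by a comment. Matching nodes are moved onto a private list through a pointer-to-pointer walk, and the private list is freed afterwards.

```c
#include <stddef.h>
#include <stdlib.h>

struct owner {
	int id;	/* hypothetical stand-in for a network namespace */
};

struct node {
	struct node *next;
	struct owner *owner;
};

static struct node *node_push(struct node *head, struct owner *o)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		abort();
	n->next = head;
	n->owner = o;
	return n;
}

/* Unlink every node owned by `o` from *chain; return them as a private list. */
static struct node *chain_detach_owner(struct node **chain, struct owner *o)
{
	struct node *list = NULL;	/* must start empty: it ends the private list */
	struct node **pprev = chain;
	struct node *n, *next;

	/* In the kernel version the chain lock would be held across this loop. */
	for (n = *pprev; n; n = next) {
		next = n->next;
		if (n->owner == o) {
			*pprev = next;	/* unlink from the shared chain */
			n->next = list;	/* push onto the private list */
			list = n;
		} else {
			pprev = &n->next;
		}
	}
	return list;
}

/* Free a detached list outside the lock; returns the number of nodes freed. */
static int list_free(struct node *list)
{
	struct node *next;
	int freed = 0;

	for (; list; list = next) {
		next = list->next;
		free(list);
		freed++;
	}
	return freed;
}
```

If `list` started out uninitialized, the first node pushed would carry a garbage `next` pointer and the free loop would walk off into arbitrary memory. This sketch says nothing about the reference count messages reported above, which look like a separate issue.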
^ permalink raw reply
* Re: tbf/htb qdisc limitations
From: Bill Fink @ 2010-10-14 7:13 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: Rick Jones, Steven Brudenell, netdev
In-Reply-To: <20101014064404.GA6219@ff.dom.local>
On Thu, 14 Oct, Jarek Poplawski wrote:
> On Wed, Oct 13, 2010 at 11:36:53PM -0400, Bill Fink wrote:
> > On Wed, 13 Oct 2010, Jarek Poplawski wrote:
> >
> > > On Tue, Oct 12, 2010 at 03:17:18PM -0700, Rick Jones wrote:
> > > >>> my burst problem is the only semi-legitimate motivation i can think
> > > >>> of. the only other possible motivations i can imagine are setting
> > > >>> "limit" to buffer more than 4GB of packets and setting "rate" to
> > > >>> something more than 32 gigabit; both of these seem kind of dubious. is
> > > >>> there something else you had in mind?
> > > >>
> > > >>
> > > >> No, mainly 10 gigabit rates and additionally 64-bit stats.
> > > >
> > > > Any issue for bonded 10 GbE interfaces? Now that the IEEE have ratified
> > > > (June) how far out are 40 GbE interfaces? Or 100 GbE for that matter.
> > >
> > > Alas packet schedulers using rate tables are still around 1G. Above 2G
> > > they get less and less accurate, so hfsc is recommended.
> >
> > I was just trying to do an 8 Gbps rate limit on a 10-GigE path,
> > and couldn't get it to work with either htb or tbf. Are you
> > saying this currently isn't possible?
>
> Let's start from reminding that no precise packet scheduling should be
> expected with gso/tso etc. turned on. I don't know current hardware
> limits for such a non-gso traffic, but for 8 Gbit rate htb or tbf
> would definitely have wrong rate tables (overflowed values) for packet
> sizes below 1500 bytes.
TSO/GSO was disabled, and I was using 9000-byte jumbo frames
(and specified mtu 9000 to the tc command).
Here was one attempt I made using tbf:
tc qdisc add dev eth2 root handle 1: prio
tc qdisc add dev eth2 parent 1:1 handle 10: tbf rate 8900mbit buffer 1112500 limit 10000 mtu 9000
tc filter add dev eth2 protocol ip parent 1: prio 1 u32 match ip dst 192.168.1.23 flowid 10:1
I tried many variations of the above, all without success.
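As a sanity check on the numbers, here is a minimal sketch of the arithmetic that appears to be behind the buffer value. It assumes that "mbit" means 10^6 bits per second and that the buffer was sized as one timer tick's worth of bytes with HZ=1000. Neither assumption is stated in the message, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Bytes a shaper at rate_bps releases during one tick, with hz ticks/second. */
static uint64_t bytes_per_tick(uint64_t rate_bps, unsigned int hz)
{
	if (hz == 0)
		return 0;
	return rate_bps / 8 / hz;
}

/* The same amount expressed in MTU-sized frames, rounded up. */
static uint64_t frames_per_tick(uint64_t rate_bps, unsigned int hz,
				unsigned int mtu)
{
	if (mtu == 0)
		return 0;
	return (bytes_per_tick(rate_bps, hz) + mtu - 1) / mtu;
}
```

Under those assumptions, 8900mbit with HZ=1000 gives 1112500 bytes per tick, which matches the buffer in the tbf command above, and that is about 124 frames of 9000 bytes per tick. A kernel built with HZ=250 would need four times as much.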
> > Or are you saying to use
> > this hfsc mechanism, which there doesn't seem to be a man page
> > for?
>
> There was a try:
> http://lists.openwall.net/netdev/2009/02/26/138
Thanks for the pointer. I will check it out in detail later,
but I'm already having difficulty deciding whether I have the
tc commands right for tbf and htb, and hfsc looks even more
involved.
-Bill
^ permalink raw reply
* Re: [PATCH net-next] net: allocate skbs on local node
From: David Rientjes @ 2010-10-14 7:23 UTC (permalink / raw)
To: Pekka Enberg
Cc: Christoph Lameter, Andrew Morton, Eric Dumazet, David Miller,
netdev, Michael Chan, Eilon Greenstein, Christoph Hellwig, LKML,
Nick Piggin
In-Reply-To: <4CB6A1AB.4030800@cs.helsinki.fi>
On Thu, 14 Oct 2010, Pekka Enberg wrote:
> Argh. Why would we want to introduce something that's effectively a new
> allocator based on SLUB? If there's something controversial in the current
> patch series, lets just keep it out of mainline. A "rewrite" is the reason
> we're in this mess so lets not repeat the same mistake again!
>
SLUB is a good base framework for developing just about any slab allocator
you can imagine, in part because of its enhanced debugging facilities.
Nick originally developed SLQB with much of the same SLUB framework, and
the queueing changes that Christoph is proposing in his new unified
allocator build upon SLUB.
Instead of the slab.c, slab_queue.c, and slab_nonqueue.c trifecta, I
suggested building as much of the core allocator into a single file as
possible and then extending that with a config option such as
CONFIG_SLAB_QUEUEING, if possible. Christoph knows his allocator better
than anybody, so he'd be the person to ask if this was indeed feasible and,
if so, I think it's in the best interest of a long-term maintainable
kernel.
I care about how this is organized because I think the current config
option, which demands that users select between SLAB and SLUB without
really understanding the differences, is detrimental. That is especially
true for users who run a very wide range of applications, for whom better
microbenchmark results for one allocator over another aren't at all
convincing.
^ permalink raw reply
* Re: tbf/htb qdisc limitations
From: Jarek Poplawski @ 2010-10-14 6:44 UTC (permalink / raw)
To: Bill Fink; +Cc: Rick Jones, Steven Brudenell, netdev
In-Reply-To: <20101013233653.1e363692.billfink@mindspring.com>
On Wed, Oct 13, 2010 at 11:36:53PM -0400, Bill Fink wrote:
> On Wed, 13 Oct 2010, Jarek Poplawski wrote:
>
> > On Tue, Oct 12, 2010 at 03:17:18PM -0700, Rick Jones wrote:
> > >>> my burst problem is the only semi-legitimate motivation i can think
> > >>> of. the only other possible motivations i can imagine are setting
> > >>> "limit" to buffer more than 4GB of packets and setting "rate" to
> > >>> something more than 32 gigabit; both of these seem kind of dubious. is
> > >>> there something else you had in mind?
> > >>
> > >>
> > >> No, mainly 10 gigabit rates and additionally 64-bit stats.
> > >
> > > Any issue for bonded 10 GbE interfaces? Now that the IEEE have ratified
> > > (June) how far out are 40 GbE interfaces? Or 100 GbE for that matter.
> >
> > Alas packet schedulers using rate tables are still around 1G. Above 2G
> > they get less and less accurate, so hfsc is recommended.
>
> I was just trying to do an 8 Gbps rate limit on a 10-GigE path,
> and couldn't get it to work with either htb or tbf. Are you
> saying this currently isn't possible?
Let's start by recalling that no precise packet scheduling should be
expected with gso/tso etc. turned on. I don't know the current hardware
limits for such non-gso traffic, but for an 8 Gbit rate htb or tbf
would definitely have wrong rate tables (overflowed values) for packet
sizes below 1500 bytes.
> Or are you saying to use
> this hfsc mechanism, which there doesn't seem to be a man page
> for?
There was an attempt:
http://lists.openwall.net/netdev/2009/02/26/138
Jarek P.
^ permalink raw reply