* Re: [RFC] CAIF Protocol Stack
From: Rémi Denis-Courmont @ 2009-09-18 12:31 UTC
To: netdev
In-Reply-To: <61D8D34BB13CFE408D154529C120E07902DF9076@eseldmw101.eemea.ericsson.se>
Hello,
On Wed, 16 Sep 2009 14:30:34 +0200, Sjur Brændeland
<sjur.brandeland@stericsson.com> wrote:
> The Implementation of CAIF is divided into:
> * CAIF Devices: Character Device, Net Device and Kernel API.
> * CAIF Protocol Implementation
> * CAIF Link Layer
I'm a bit confused here. What do you call a CAIF Device?
Do you mean a GPRS context is a network device, and an AT command interface
is a character device? Or is the CAIF modem a device? or what?
--
Rémi Denis-Courmont
* RE: [RFC] CAIF Protocol Stack
From: Sjur Brændeland @ 2009-09-18 13:38 UTC
To: Rémi Denis-Courmont, netdev
In-Reply-To: <0f510ae3e0a78a2c1345d8e08bdafb0e@chewa.net>
> -----Original Message-----
> From: Rémi Denis-Courmont
> Sent: 18 September 2009 14:32
> Hello,
>
> On Wed, 16 Sep 2009 14:30:34 +0200, Sjur Brændeland
> <sjur.brandeland@stericsson.com> wrote:
> > The Implementation of CAIF is divided into:
> > * CAIF Devices: Character Device, Net Device and Kernel API.
> > * CAIF Protocol Implementation
> > * CAIF Link Layer
>
> I'm a bit confused here. What do you call a CAIF Device?
>
> Do you mean a GPRS context is a network device, and an AT
> command interface is a character device? Or is the CAIF modem
> a device? or what?
What I meant was:
* "Net Device" - a "struct net_device" with one instance for each GPRS PDP context.
* "Character Device" - a chr device, with one instance for each AT channel towards the modem.
BR/Sjur Brændeland
* [PATCH net-next-2.6] bonding: introduce primary_reselect option
From: Jiri Pirko @ 2009-09-18 15:30 UTC
To: netdev; +Cc: davem, fubar, bonding-devel, nicolas.2p.debian
(updated 3)
In some cases it is not desirable to switch back to the primary interface when
its link recovers; it is better to stay with the currently active one, because
we need to avoid packet loss as much as we can. This is solved by introducing
the primary_reselect option. Note that a newly enslaved primary slave is set as
the current active slave no matter what.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
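
For readers who want to try the reselection rules outside the kernel, here is a
minimal user-space sketch of the decision described above. It mirrors the logic
of bond_should_change_active() from the diff below, but the model_* types and
the PRI_RESELECT_* enum are hypothetical stand-ins for illustration, not the
driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's policy values (0, 1, 2). */
enum pri_reselect {
	PRI_RESELECT_ALWAYS = 0,
	PRI_RESELECT_BETTER = 1,
	PRI_RESELECT_FAILURE = 2
};

/* Simplified model of a slave; the real struct slave has many more fields. */
struct model_slave {
	bool link_up;
	int speed;  /* e.g. Mbit/s */
	int duplex; /* 0 = half, 1 = full */
};

struct model_bond {
	struct model_slave *primary;
	struct model_slave *active;
	enum pri_reselect policy;
	bool force_primary; /* one-shot override, e.g. primary just enslaved */
};

/*
 * Returns true if the primary (assumed to be up) should become the active
 * slave. Consumes the one-shot force_primary flag when it is set.
 */
static bool should_change_active(struct model_bond *b)
{
	struct model_slave *prim, *curr;

	assert(b != NULL);
	prim = b->primary;
	curr = b->active;

	/* No usable active slave: nothing to lose by switching. */
	if (!prim || !curr || !curr->link_up)
		return true;
	if (b->force_primary) {
		b->force_primary = false;
		return true;
	}
	/* "better": stay put unless the primary is strictly better. */
	if (b->policy == PRI_RESELECT_BETTER &&
	    (prim->speed < curr->speed ||
	     (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
		return false;
	/* "failure": only switch when the active slave has failed. */
	if (b->policy == PRI_RESELECT_FAILURE)
		return false;
	return true;
}
```

With two equally fast full-duplex slaves, this model switches under "always",
stays put under "better" and "failure", and switches under any policy once the
active slave's link goes down.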
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index d5181ce..fd650e0 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -614,6 +614,32 @@ primary
The primary option is only valid for active-backup mode.
+primary_reselect
+
+ Specifies the behavior of the current active slave when the primary was
+ down and comes back up. This option is designed to prevent
+ flip-flopping between the primary slave and other slaves. The possible
+ values and their respective effects are:
+
+ always or 0 (default)
+
+ The primary slave becomes the active slave whenever it comes
+ back up.
+
+ better or 1
+
+ The primary slave becomes the active slave when it comes back
+ up, if the speed and duplex of the primary slave is better
+ than the speed and duplex of the current active slave.
+
+ failure or 2
+
+ The primary slave becomes the active slave only if the current
+ active slave fails and the primary slave is up.
+
+ When no slave is active, if the primary comes back up, it becomes the
+ active slave, regardless of the value of primary_reselect.
+
updelay
Specifies the time, in milliseconds, to wait before enabling a
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 699bfdd..1127361 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -94,6 +94,7 @@ static int downdelay;
static int use_carrier = 1;
static char *mode;
static char *primary;
+static char *primary_reselect;
static char *lacp_rate;
static char *ad_select;
static char *xmit_hash_policy;
@@ -126,6 +127,13 @@ MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
"6 for balance-alb");
module_param(primary, charp, 0);
MODULE_PARM_DESC(primary, "Primary network device to use");
+module_param(primary_reselect, charp, 0);
+MODULE_PARM_DESC(primary_reselect, "Reselect primary slave "
+ "once it comes up; "
+ "0 for always (default), "
+ "1 for only if speed of primary is "
+ "better, "
+ "2 for only on active slave failure");
module_param(lacp_rate, charp, 0);
MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner "
"(slow/fast)");
@@ -200,6 +208,13 @@ const struct bond_parm_tbl fail_over_mac_tbl[] = {
{ NULL, -1},
};
+const struct bond_parm_tbl pri_reselect_tbl[] = {
+{ "always", BOND_PRI_RESELECT_ALWAYS},
+{ "better", BOND_PRI_RESELECT_BETTER},
+{ "failure", BOND_PRI_RESELECT_FAILURE},
+{ NULL, -1},
+};
+
struct bond_parm_tbl ad_select_tbl[] = {
{ "stable", BOND_AD_STABLE},
{ "bandwidth", BOND_AD_BANDWIDTH},
@@ -1070,6 +1085,25 @@ out:
}
+static bool bond_should_change_active(struct bonding *bond)
+{
+ struct slave *prim = bond->primary_slave;
+ struct slave *curr = bond->curr_active_slave;
+
+ if (!prim || !curr || curr->link != BOND_LINK_UP)
+ return true;
+ if (bond->force_primary) {
+ bond->force_primary = false;
+ return true;
+ }
+ if (bond->params.primary_reselect == BOND_PRI_RESELECT_BETTER &&
+ (prim->speed < curr->speed ||
+ (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
+ return false;
+ if (bond->params.primary_reselect == BOND_PRI_RESELECT_FAILURE)
+ return false;
+ return true;
+}
/**
* find_best_interface - select the best available slave to be the active one
@@ -1094,7 +1128,8 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
}
if ((bond->primary_slave) &&
- bond->primary_slave->link == BOND_LINK_UP) {
+ bond->primary_slave->link == BOND_LINK_UP &&
+ bond_should_change_active(bond)) {
new_active = bond->primary_slave;
}
@@ -1675,8 +1710,10 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
/* if there is a primary slave, remember it */
- if (strcmp(bond->params.primary, new_slave->dev->name) == 0)
+ if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
bond->primary_slave = new_slave;
+ bond->force_primary = true;
+ }
}
write_lock_bh(&bond->curr_slave_lock);
@@ -4643,7 +4680,7 @@ int bond_parse_parm(const char *buf, const struct bond_parm_tbl *tbl)
static int bond_check_params(struct bond_params *params)
{
- int arp_validate_value, fail_over_mac_value;
+ int arp_validate_value, fail_over_mac_value, primary_reselect_value;
/*
* Convert string parameters.
@@ -4942,6 +4979,20 @@ static int bond_check_params(struct bond_params *params)
primary = NULL;
}
+ if (primary && primary_reselect) {
+ primary_reselect_value = bond_parse_parm(primary_reselect,
+ pri_reselect_tbl);
+ if (primary_reselect_value == -1) {
+ pr_err(DRV_NAME
+ ": Error: Invalid primary_reselect \"%s\"\n",
+ primary_reselect ==
+ NULL ? "NULL" : primary_reselect);
+ return -EINVAL;
+ }
+ } else {
+ primary_reselect_value = BOND_PRI_RESELECT_ALWAYS;
+ }
+
if (fail_over_mac) {
fail_over_mac_value = bond_parse_parm(fail_over_mac,
fail_over_mac_tbl);
@@ -4973,6 +5024,7 @@ static int bond_check_params(struct bond_params *params)
params->use_carrier = use_carrier;
params->lacp_fast = lacp_fast;
params->primary[0] = 0;
+ params->primary_reselect = primary_reselect_value;
params->fail_over_mac = fail_over_mac_value;
if (primary) {
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 6044e12..42c44f2 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1212,6 +1212,61 @@ static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR,
bonding_show_primary, bonding_store_primary);
/*
+ * Show and set the primary_reselect flag.
+ */
+static ssize_t bonding_show_primary_reselect(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct bonding *bond = to_bond(d);
+
+ return sprintf(buf, "%s %d\n",
+ pri_reselect_tbl[bond->params.primary_reselect].modename,
+ bond->params.primary_reselect);
+}
+
+static ssize_t bonding_store_primary_reselect(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int new_value, ret = count;
+ struct bonding *bond = to_bond(d);
+
+ if (!rtnl_trylock())
+ return restart_syscall();
+
+ new_value = bond_parse_parm(buf, pri_reselect_tbl);
+ if (new_value < 0) {
+ pr_err(DRV_NAME
+ ": %s: Ignoring invalid primary_reselect value %.*s.\n",
+ bond->dev->name,
+ (int) strlen(buf) - 1, buf);
+ ret = -EINVAL;
+ goto out;
+ } else {
+ bond->params.primary_reselect = new_value;
+ pr_info(DRV_NAME ": %s: setting primary_reselect to %s (%d).\n",
+ bond->dev->name, pri_reselect_tbl[new_value].modename,
+ new_value);
+ if (new_value == BOND_PRI_RESELECT_ALWAYS ||
+ new_value == BOND_PRI_RESELECT_BETTER) {
+ bond->force_primary = true;
+ read_lock(&bond->lock);
+ write_lock_bh(&bond->curr_slave_lock);
+ bond_select_active_slave(bond);
+ write_unlock_bh(&bond->curr_slave_lock);
+ read_unlock(&bond->lock);
+ }
+ }
+out:
+ rtnl_unlock();
+ return ret;
+}
+static DEVICE_ATTR(primary_reselect, S_IRUGO | S_IWUSR,
+ bonding_show_primary_reselect,
+ bonding_store_primary_reselect);
+
+/*
* Show and set the use_carrier flag.
*/
static ssize_t bonding_show_carrier(struct device *d,
@@ -1500,6 +1555,7 @@ static struct attribute *per_bond_attrs[] = {
&dev_attr_num_unsol_na.attr,
&dev_attr_miimon.attr,
&dev_attr_primary.attr,
+ &dev_attr_primary_reselect.attr,
&dev_attr_use_carrier.attr,
&dev_attr_active_slave.attr,
&dev_attr_mii_status.attr,
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 6824771..b5b1530 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -131,6 +131,7 @@ struct bond_params {
int lacp_fast;
int ad_select;
char primary[IFNAMSIZ];
+ int primary_reselect;
__be32 arp_targets[BOND_MAX_ARP_TARGETS];
};
@@ -190,6 +191,7 @@ struct bonding {
struct slave *curr_active_slave;
struct slave *current_arp_slave;
struct slave *primary_slave;
+ bool force_primary;
s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
rwlock_t lock;
rwlock_t curr_slave_lock;
@@ -258,6 +260,10 @@ static inline bool bond_is_lb(const struct bonding *bond)
|| bond->params.mode == BOND_MODE_ALB;
}
+#define BOND_PRI_RESELECT_ALWAYS 0
+#define BOND_PRI_RESELECT_BETTER 1
+#define BOND_PRI_RESELECT_FAILURE 2
+
#define BOND_FOM_NONE 0
#define BOND_FOM_ACTIVE 1
#define BOND_FOM_FOLLOW 2
@@ -348,6 +354,7 @@ extern const struct bond_parm_tbl bond_mode_tbl[];
extern const struct bond_parm_tbl xmit_hashtype_tbl[];
extern const struct bond_parm_tbl arp_validate_tbl[];
extern const struct bond_parm_tbl fail_over_mac_tbl[];
+extern const struct bond_parm_tbl pri_reselect_tbl[];
extern struct bond_parm_tbl ad_select_tbl[];
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
* Re: [PATCH 2/4] bonding: make sure tx and rx hash tables stay in sync when using alb mode
From: Andy Gospodarek @ 2009-09-18 15:36 UTC
To: Jay Vosburgh; +Cc: netdev, bonding-devel
In-Reply-To: <27763.1253144169@death.nxdomain.ibm.com>
On Wed, Sep 16, 2009 at 04:36:09PM -0700, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
>
> >
> >Subject: [PATCH] bonding: make sure tx and rx hash tables stay in sync when using alb mode
>
> When testing this, I'm getting a lockdep warning. It appears to
> be unhappy that tlb_choose_channel acquires the tx / rx hash table locks
> in the order tx then rx, but rlb_choose_channel -> alb_get_best_slave
> acquires the locks in the other order. I applied all four patches, but
> it looks like the change that trips lockdep is in this patch (#2).
>
> I haven't gotten an actual deadlock from this, although it seems
> plausible if there are two cpus in bond_alb_xmit at the same time, and
> one of them is sending an ARP.
>
> One fairly straightforward fix would be to combine the rx and tx
> hash table locks into a single lock. I suspect that wouldn't have any
> real performance penalty, since the rx hash table lock is generally not
> acquired very often (unlike the tx lock, which is taken for every packet
> that goes out).
>
> Also, FYI, two of the four patches had trailing whitespace. I
> believe it was #2 and #4.
>
> Thoughts?
Jay,
This patch should address both the deadlock and whitespace concerns.
I ran a kernel with LOCKDEP enabled and saw no warnings while passing
traffic on the bond while pulling cables and while removing the module.
Here it is....
[PATCH] bonding: make sure tx and rx hash tables stay in sync when using alb mode
I noticed that it was easy for alb (mode 6) bonding to get into a state
where the tx hash-table and rx hash-table are out of sync (there is
really nothing to keep them synchronized), and we will transmit traffic
destined for a host on one slave and send ARP frames to the same host
from another interface using a different source MAC.
There is no compelling reason to do this, so this patch makes sure the
rx hash-table changes whenever the tx hash-table is updated based on
device load. This patch also drops the code that does rlb re-balancing
since the balancing will now be controlled by the tx hash-table based on
transmit load. In order to address an issue found with the initial
patch, I have also combined the rx and tx hash table lock into a single
lock. This will facilitate moving these into a single table at some
point.
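
The idea can be sketched in user space as follows. This is only an
illustrative model under stated assumptions: pthread mutexes stand in for the
kernel spinlock, the model_* names are hypothetical, and the "least loaded"
helper is a plain minimum rather than the driver's own load-gap calculation.
It shows one lock guarding both tables, and the rx entry being overwritten
whenever a tx slave is chosen for the same hash index.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_HASH_SIZE 256

/* Hypothetical, simplified slave record. */
struct model_slave {
	const char *name;
	int up;            /* non-zero if the link is usable */
	unsigned int load; /* recent transmit load */
};

struct model_alb {
	pthread_mutex_t hashtbl_lock; /* single lock for both tables */
	struct model_slave *tx_slave[MODEL_HASH_SIZE];
	struct model_slave *last_slave[MODEL_HASH_SIZE];
	struct model_slave *rx_slave[MODEL_HASH_SIZE];
	struct model_slave *slaves;
	size_t nslaves;
};

/* Caller must hold hashtbl_lock. Simplified: picks the minimum load. */
static struct model_slave *least_loaded(struct model_alb *alb)
{
	struct model_slave *best = NULL;
	size_t i;

	for (i = 0; i < alb->nslaves; i++) {
		struct model_slave *s = &alb->slaves[i];

		if (s->up && (!best || s->load < best->load))
			best = s;
	}
	return best;
}

/* Caller must hold hashtbl_lock. Picks a slave and mirrors it into rx. */
static struct model_slave *get_best_slave(struct model_alb *alb,
					  unsigned int idx)
{
	struct model_slave *last = alb->last_slave[idx];
	struct model_slave *next = least_loaded(alb);

	/* Prefer the previous slave if it is essentially unloaded. */
	if (last && last->up && last->load < 10)
		next = last;
	/* Keep the rx table in step with the tx decision. */
	if (next)
		alb->rx_slave[idx] = next;
	return next;
}

/* Takes the single lock, so tx and rx updates cannot interleave. */
static struct model_slave *choose_tx_slave(struct model_alb *alb,
					   unsigned int idx)
{
	struct model_slave *s;

	assert(idx < MODEL_HASH_SIZE);
	pthread_mutex_lock(&alb->hashtbl_lock);
	s = alb->tx_slave[idx];
	if (!s) {
		s = get_best_slave(alb, idx);
		alb->tx_slave[idx] = s;
	}
	pthread_mutex_unlock(&alb->hashtbl_lock);
	return s;
}
```

Because every path takes the same lock, there is no tx-then-rx versus
rx-then-tx ordering left for a lock checker to complain about.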
---
drivers/net/bonding/bond_alb.c | 203 +++++++++++++++-------------------------
drivers/net/bonding/bond_alb.h | 3 +-
2 files changed, 75 insertions(+), 131 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index bcf25c6..04b7055 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -111,6 +111,7 @@ static inline struct arp_pkt *arp_pkt(const struct sk_buff *skb)
/* Forward declaration */
static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[]);
+static struct slave *alb_get_best_slave(struct bonding *bond, u32 hash_index);
static inline u8 _simple_hash(const u8 *hash_start, int hash_size)
{
@@ -124,18 +125,18 @@ static inline u8 _simple_hash(const u8 *hash_start, int hash_size)
return hash;
}
-/*********************** tlb specific functions ***************************/
-
-static inline void _lock_tx_hashtbl(struct bonding *bond)
+/********************* hash table lock functions *************************/
+static inline void _lock_hashtbl(struct bonding *bond)
{
- spin_lock_bh(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
+ spin_lock_bh(&(BOND_ALB_INFO(bond).hashtbl_lock));
}
-static inline void _unlock_tx_hashtbl(struct bonding *bond)
+static inline void _unlock_hashtbl(struct bonding *bond)
{
- spin_unlock_bh(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
+ spin_unlock_bh(&(BOND_ALB_INFO(bond).hashtbl_lock));
}
+/*********************** tlb specific functions ***************************/
/* Caller must hold tx_hashtbl lock */
static inline void tlb_init_table_entry(struct tlb_client_info *entry, int save_load)
{
@@ -163,7 +164,7 @@ static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_
struct tlb_client_info *tx_hash_table;
u32 index;
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
/* clear slave from tx_hashtbl */
tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl;
@@ -180,7 +181,7 @@ static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_
tlb_init_slave(slave);
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* Must be called before starting the monitor timer */
@@ -191,7 +192,7 @@ static int tlb_initialize(struct bonding *bond)
struct tlb_client_info *new_hashtbl;
int i;
- spin_lock_init(&(bond_info->tx_hashtbl_lock));
+ spin_lock_init(&(bond_info->hashtbl_lock));
new_hashtbl = kzalloc(size, GFP_KERNEL);
if (!new_hashtbl) {
@@ -200,7 +201,7 @@ static int tlb_initialize(struct bonding *bond)
bond->dev->name);
return -1;
}
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
bond_info->tx_hashtbl = new_hashtbl;
@@ -208,7 +209,7 @@ static int tlb_initialize(struct bonding *bond)
tlb_init_table_entry(&bond_info->tx_hashtbl[i], 1);
}
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return 0;
}
@@ -218,12 +219,12 @@ static void tlb_deinitialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
kfree(bond_info->tx_hashtbl);
bond_info->tx_hashtbl = NULL;
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* Caller must hold bond lock for read */
@@ -264,24 +265,6 @@ static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
return least_loaded;
}
-/* Caller must hold bond lock for read and hashtbl lock */
-static struct slave *tlb_get_best_slave(struct bonding *bond, u32 hash_index)
-{
- struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- struct tlb_client_info *tx_hash_table = bond_info->tx_hashtbl;
- struct slave *last_slave = tx_hash_table[hash_index].last_slave;
- struct slave *next_slave = NULL;
-
- if (last_slave && SLAVE_IS_OK(last_slave)) {
- /* Use the last slave listed in the tx hashtbl if:
- the last slave currently is essentially unloaded. */
- if (SLAVE_TLB_INFO(last_slave).load < 10)
- next_slave = last_slave;
- }
-
- return next_slave ? next_slave : tlb_get_least_loaded_slave(bond);
-}
-
/* Caller must hold bond lock for read */
static struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
{
@@ -289,13 +272,12 @@ static struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u3
struct tlb_client_info *hash_table;
struct slave *assigned_slave;
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_table = bond_info->tx_hashtbl;
assigned_slave = hash_table[hash_index].tx_slave;
if (!assigned_slave) {
- assigned_slave = tlb_get_best_slave(bond, hash_index);
-
+ assigned_slave = alb_get_best_slave(bond, hash_index);
if (assigned_slave) {
struct tlb_slave_info *slave_info =
&(SLAVE_TLB_INFO(assigned_slave));
@@ -319,20 +301,52 @@ static struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u3
hash_table[hash_index].tx_bytes += skb_len;
}
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return assigned_slave;
}
/*********************** rlb specific functions ***************************/
-static inline void _lock_rx_hashtbl(struct bonding *bond)
+
+/* Caller must hold bond lock for read and hashtbl lock */
+static struct slave *rlb_update_rx_table(struct bonding *bond, struct slave *next_slave, u32 hash_index)
{
- spin_lock_bh(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+
+ /* check rlb table and correct it if wrong */
+ if (bond_info->rlb_enabled) {
+ struct rlb_client_info *rx_client_info = &(bond_info->rx_hashtbl[hash_index]);
+
+ /* if the new slave computed by tlb checks doesn't match rlb, stop rlb from using it */
+ if (next_slave && (next_slave != rx_client_info->slave))
+ rx_client_info->slave = next_slave;
+ }
+ return next_slave;
}
-static inline void _unlock_rx_hashtbl(struct bonding *bond)
+/* Caller must hold bond lock for read and hashtbl lock */
+static struct slave *alb_get_best_slave(struct bonding *bond, u32 hash_index)
{
- spin_unlock_bh(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+ struct tlb_client_info *tx_hash_table = bond_info->tx_hashtbl;
+ struct slave *last_slave = tx_hash_table[hash_index].last_slave;
+ struct slave *next_slave = NULL;
+
+ /* presume the next slave will be the least loaded one */
+ next_slave = tlb_get_least_loaded_slave(bond);
+
+ if (last_slave && SLAVE_IS_OK(last_slave)) {
+ /* Use the last slave listed in the tx hashtbl if:
+ the last slave currently is essentially unloaded. */
+ if (SLAVE_TLB_INFO(last_slave).load < 10)
+ next_slave = last_slave;
+ }
+
+ /* update the rlb hashtbl if there was a previous entry */
+ if (bond_info->rlb_enabled)
+ rlb_update_rx_table(bond, next_slave, hash_index);
+
+ return next_slave;
}
/* when an ARP REPLY is received from a client update its info
@@ -344,7 +358,7 @@ static void rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
struct rlb_client_info *client_info;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = _simple_hash((u8*)&(arp->ip_src), sizeof(arp->ip_src));
client_info = &(bond_info->rx_hashtbl[hash_index]);
@@ -358,7 +372,7 @@ static void rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
bond_info->rx_ntt = 1;
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
static int rlb_arp_recv(struct sk_buff *skb, struct net_device *bond_dev, struct packet_type *ptype, struct net_device *orig_dev)
@@ -402,38 +416,6 @@ out:
return res;
}
-/* Caller must hold bond lock for read */
-static struct slave *rlb_next_rx_slave(struct bonding *bond)
-{
- struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- struct slave *rx_slave, *slave, *start_at;
- int i = 0;
-
- if (bond_info->next_rx_slave) {
- start_at = bond_info->next_rx_slave;
- } else {
- start_at = bond->first_slave;
- }
-
- rx_slave = NULL;
-
- bond_for_each_slave_from(bond, slave, i, start_at) {
- if (SLAVE_IS_OK(slave)) {
- if (!rx_slave) {
- rx_slave = slave;
- } else if (slave->speed > rx_slave->speed) {
- rx_slave = slave;
- }
- }
- }
-
- if (rx_slave) {
- bond_info->next_rx_slave = rx_slave->next;
- }
-
- return rx_slave;
-}
-
/* teach the switch the mac of a disabled slave
* on the primary for fault tolerance
*
@@ -468,14 +450,14 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
u32 index, next_index;
/* clear slave from rx_hashtbl */
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
rx_hash_table = bond_info->rx_hashtbl;
index = bond_info->rx_hashtbl_head;
for (; index != RLB_NULL_INDEX; index = next_index) {
next_index = rx_hash_table[index].next;
if (rx_hash_table[index].slave == slave) {
- struct slave *assigned_slave = rlb_next_rx_slave(bond);
+ struct slave *assigned_slave = alb_get_best_slave(bond, index);
if (assigned_slave) {
rx_hash_table[index].slave = assigned_slave;
@@ -499,7 +481,7 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
}
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
write_lock_bh(&bond->curr_slave_lock);
@@ -558,7 +540,7 @@ static void rlb_update_rx_clients(struct bonding *bond)
struct rlb_client_info *client_info;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
@@ -576,7 +558,7 @@ static void rlb_update_rx_clients(struct bonding *bond)
*/
bond_info->rlb_update_delay_counter = RLB_UPDATE_DELAY;
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* The slave was assigned a new mac address - update the clients */
@@ -587,7 +569,7 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
int ntt = 0;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
@@ -607,7 +589,7 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
bond_info->rlb_update_retry_counter = RLB_UPDATE_RETRY;
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* mark all clients using src_ip to be updated */
@@ -617,7 +599,7 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
struct rlb_client_info *client_info;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
@@ -643,7 +625,7 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
}
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* Caller must hold both bond and ptr locks for read */
@@ -655,7 +637,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
struct rlb_client_info *client_info;
u32 hash_index = 0;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = _simple_hash((u8 *)&arp->ip_dst, sizeof(arp->ip_src));
client_info = &(bond_info->rx_hashtbl[hash_index]);
@@ -671,7 +653,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
assigned_slave = client_info->slave;
if (assigned_slave) {
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return assigned_slave;
}
} else {
@@ -687,7 +669,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
}
}
/* assign a new slave */
- assigned_slave = rlb_next_rx_slave(bond);
+ assigned_slave = alb_get_best_slave(bond, hash_index);
if (assigned_slave) {
client_info->ip_src = arp->ip_src;
@@ -723,7 +705,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
}
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return assigned_slave;
}
@@ -771,36 +753,6 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
return tx_slave;
}
-/* Caller must hold bond lock for read */
-static void rlb_rebalance(struct bonding *bond)
-{
- struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- struct slave *assigned_slave;
- struct rlb_client_info *client_info;
- int ntt;
- u32 hash_index;
-
- _lock_rx_hashtbl(bond);
-
- ntt = 0;
- hash_index = bond_info->rx_hashtbl_head;
- for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
- client_info = &(bond_info->rx_hashtbl[hash_index]);
- assigned_slave = rlb_next_rx_slave(bond);
- if (assigned_slave && (client_info->slave != assigned_slave)) {
- client_info->slave = assigned_slave;
- client_info->ntt = 1;
- ntt = 1;
- }
- }
-
- /* update the team's flag only after the whole iteration */
- if (ntt) {
- bond_info->rx_ntt = 1;
- }
- _unlock_rx_hashtbl(bond);
-}
-
/* Caller must hold rx_hashtbl lock */
static void rlb_init_table_entry(struct rlb_client_info *entry)
{
@@ -817,8 +769,6 @@ static int rlb_initialize(struct bonding *bond)
int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
int i;
- spin_lock_init(&(bond_info->rx_hashtbl_lock));
-
new_hashtbl = kmalloc(size, GFP_KERNEL);
if (!new_hashtbl) {
printk(KERN_ERR DRV_NAME
@@ -826,7 +776,7 @@ static int rlb_initialize(struct bonding *bond)
bond->dev->name);
return -1;
}
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
bond_info->rx_hashtbl = new_hashtbl;
@@ -836,7 +786,7 @@ static int rlb_initialize(struct bonding *bond)
rlb_init_table_entry(bond_info->rx_hashtbl + i);
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
/*initialize packet type*/
pk_type->type = cpu_to_be16(ETH_P_ARP);
@@ -855,13 +805,13 @@ static void rlb_deinitialize(struct bonding *bond)
dev_remove_pack(&(bond_info->rlb_pkt_type));
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
kfree(bond_info->rx_hashtbl);
bond_info->rx_hashtbl = NULL;
bond_info->rx_hashtbl_head = RLB_NULL_INDEX;
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
@@ -869,7 +819,7 @@ static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
u32 curr_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
curr_index = bond_info->rx_hashtbl_head;
while (curr_index != RLB_NULL_INDEX) {
@@ -894,7 +844,7 @@ static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
curr_index = next_index;
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/*********************** tlb/rlb shared functions *********************/
@@ -1521,11 +1471,6 @@ void bond_alb_monitor(struct work_struct *work)
read_lock(&bond->lock);
}
- if (bond_info->rlb_rebalance) {
- bond_info->rlb_rebalance = 0;
- rlb_rebalance(bond);
- }
-
/* check if clients need updating */
if (bond_info->rx_ntt) {
if (bond_info->rlb_update_delay_counter) {
diff --git a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h
index b65fd29..09d755a 100644
--- a/drivers/net/bonding/bond_alb.h
+++ b/drivers/net/bonding/bond_alb.h
@@ -90,7 +90,7 @@ struct tlb_slave_info {
struct alb_bond_info {
struct timer_list alb_timer;
struct tlb_client_info *tx_hashtbl; /* Dynamically allocated */
- spinlock_t tx_hashtbl_lock;
+ spinlock_t hashtbl_lock; /* lock for both tables */
u32 unbalanced_load;
int tx_rebalance_counter;
int lp_counter;
@@ -98,7 +98,6 @@ struct alb_bond_info {
int rlb_enabled;
struct packet_type rlb_pkt_type;
struct rlb_client_info *rx_hashtbl; /* Receive hash table */
- spinlock_t rx_hashtbl_lock;
u32 rx_hashtbl_head;
u8 rx_ntt; /* flag - need to transmit
* to all rx clients
--
1.5.5.6
* Re: [PATCH 4/4 v2] bonding: add sysfs files to display tlb and alb hash table contents
From: Andy Gospodarek @ 2009-09-18 15:53 UTC
To: netdev, fubar, bonding-devel
In-Reply-To: <20090911211317.GT8515@gospo.rdu.redhat.com>
On Fri, Sep 11, 2009 at 05:13:17PM -0400, Andy Gospodarek wrote:
>
> bonding: add sysfs files to display tlb and alb hash table contents
>
> While debugging some problems with alb (mode 6) bonding I realized that
> being able to output the contents of both hash tables would be helpful.
> This is what the output looks like for the two files:
>
> device load
> eth1 491
> eth2 491
> hash device last device tx bytes load next previous
> 2 eth1 eth1 2254 491 0 0
> 3 eth2 eth2 2744 491 0 0
> 6 eth2 0 488 0 0
> 8 eth2 0 461698 0 0
> 1b eth2 0 249 0 0
> eb eth2 0 21 0 0
> ff eth2 0 22 0 0
>
> hash ip_src          ip_dst          mac_dst           slave assign ntt
> 2    10.0.3.2        10.0.3.11       00:e0:81:71:ee:a9 eth1  1      0
> 3    10.0.3.2        10.0.3.10       00:e0:81:71:ee:a9 eth2  1      0
> 8    10.0.3.2        10.0.3.1        00:e0:81:71:ee:a9 eth2  1      0
>
> These were a great help debugging the fixes I have just posted and they
> might be helpful for others, so I decided to include them in my
> patchset.
>
> Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
>
Needed to repost since patch 2/4 changed and first patch had whitespace
issues:
[PATCH v2] bonding: add sysfs files to display tlb and alb hash table contents
While debugging some problems with alb (mode 6) bonding I realized that
being able to output the contents of both hash tables would be helpful.
This is what the output looks like for the two files:
device load
eth1 491
eth2 491
hash device last device tx bytes load next previous
2 eth1 eth1 2254 491 0 0
3 eth2 eth2 2744 491 0 0
6 eth2 0 488 0 0
8 eth2 0 461698 0 0
1b eth2 0 249 0 0
eb eth2 0 21 0 0
ff eth2 0 22 0 0
hash ip_src          ip_dst          mac_dst           slave assign ntt
2    10.0.3.2        10.0.3.11       00:e0:81:71:ee:a9 eth1  1      0
3    10.0.3.2        10.0.3.10       00:e0:81:71:ee:a9 eth2  1      0
8    10.0.3.2        10.0.3.1        00:e0:81:71:ee:a9 eth2  1      0
These were a great help debugging the fixes I have just posted and they
might be helpful for others, so I decided to include them in my post.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
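
As a side note on how such fixed-width rows can be produced, below is a small
user-space sketch. It reuses the column widths from the tx table format string
in the diff that follows, but drops the next/previous columns, uses unsigned
conversions, and bounds every write with snprintf(). The append_row() helper
is hypothetical and is not part of the patch, which formats with sprintf()
into the sysfs buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Appends one table row at buf + off without writing past size.
 * Returns the new offset; on truncation the offset is clamped to size - 1,
 * which is the index of the terminating NUL.
 */
static size_t append_row(char *buf, size_t size, size_t off,
			 unsigned int hash, const char *dev, const char *last,
			 unsigned int tx_bytes, unsigned int load)
{
	int n;

	assert(buf != NULL && off < size);
	n = snprintf(buf + off, size - off, "%-4x %-8s %-13s %-14u %-11u\n",
		     hash, dev ? dev : "", last ? last : "", tx_bytes, load);
	if (n < 0)
		return off;      /* encoding error: leave offset unchanged */
	if ((size_t)n >= size - off)
		return size - 1; /* truncated: buffer is now full */
	return off + (size_t)n;
}
```

A caller would start with an offset of 0 and feed each returned offset into
the next call, stopping once the offset reaches size - 1.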
---
drivers/net/bonding/bond_alb.c | 61 ++++++++++++++++++++++++++++++++++++++
drivers/net/bonding/bond_alb.h | 2 +
drivers/net/bonding/bond_sysfs.c | 40 +++++++++++++++++++++++++
3 files changed, 103 insertions(+), 0 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 5d51489..adc5acd 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -750,6 +750,67 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
return tx_slave;
}
+int rlb_print_rx_hashtbl(struct bonding *bond, char *buf)
+{
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+ struct rlb_client_info *client_info;
+ u32 hash_index;
+ u32 count = 0;
+
+ _lock_hashtbl(bond);
+
+ count = sprintf(buf, "hash ip_src ip_dst mac_dst slave assign ntt\n");
+ hash_index = bond_info->rx_hashtbl_head;
+ for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
+ client_info = &(bond_info->rx_hashtbl[hash_index]);
+ count += sprintf(buf + count,"%-4x %-15pi4 %-15pi4 %pM %-5s %-6d %d\n",
+ hash_index,
+ &client_info->ip_src,
+ &client_info->ip_dst,
+ client_info->mac_dst,
+ client_info->slave->dev->name,
+ client_info->assigned,
+ client_info->ntt);
+ }
+
+ _unlock_hashtbl(bond);
+ return count;
+}
+
+int tlb_print_tx_hashtbl(struct bonding *bond, char *buf)
+{
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+ u32 hash_index;
+ u32 count = 0;
+ struct slave *slave;
+ int i;
+
+ _lock_hashtbl(bond);
+
+ count += sprintf(buf, "device load\n");
+ bond_for_each_slave(bond, slave, i) {
+ struct tlb_slave_info *slave_info = &(SLAVE_TLB_INFO(slave));
+ count += sprintf(buf + count, "%-7s %d\n", slave->dev->name, slave_info->load);
+ }
+ count += sprintf(buf + count, "hash device last device tx bytes load next previous\n");
+ for (hash_index = 0; hash_index < TLB_HASH_TABLE_SIZE; hash_index++) {
+ struct tlb_client_info *client_info = &(bond_info->tx_hashtbl[hash_index]);
+ if (client_info->tx_slave || client_info->last_slave) {
+ count += sprintf(buf + count, "%-4x %-8s %-13s %-14d %-11d %-4x %d\n",
+ hash_index,
+ (client_info->tx_slave) ? client_info->tx_slave->dev->name : "",
+ (client_info->last_slave) ? client_info->last_slave->dev->name : "",
+ client_info->tx_bytes,
+ client_info->load_history,
+ (client_info->next != TLB_NULL_INDEX) ? client_info->next : 0,
+ (client_info->prev != TLB_NULL_INDEX) ? client_info->prev : 0);
+ }
+ }
+
+ _unlock_hashtbl(bond);
+ return count;
+}
+
/* Caller must hold rx_hashtbl lock */
static void rlb_init_table_entry(struct rlb_client_info *entry)
{
diff --git a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h
index 09d755a..57e761b 100644
--- a/drivers/net/bonding/bond_alb.h
+++ b/drivers/net/bonding/bond_alb.h
@@ -131,5 +131,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev);
void bond_alb_monitor(struct work_struct *);
int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr);
void bond_alb_clear_vlan(struct bonding *bond, unsigned short vlan_id);
+int rlb_print_rx_hashtbl(struct bonding *bond, char *buf);
+int tlb_print_tx_hashtbl(struct bonding *bond, char *buf);
#endif /* __BOND_ALB_H__ */
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 55bf34f..1123e1f 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1480,6 +1480,44 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d,
static DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL);
+/*
+ * Show current tlb/alb tx hash table.
+ */
+static ssize_t bonding_show_tlb_tx_hash(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ int count = 0;
+ struct bonding *bond = to_bond(d);
+
+ if (bond->params.mode == BOND_MODE_ALB ||
+ bond->params.mode == BOND_MODE_TLB) {
+ count = tlb_print_tx_hashtbl(bond, buf);
+ }
+
+ return count;
+}
+static DEVICE_ATTR(tlb_tx_hash, S_IRUGO, bonding_show_tlb_tx_hash, NULL);
+
+
+/*
+ * Show current alb rx hash table.
+ */
+static ssize_t bonding_show_alb_rx_hash(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ int count = 0;
+ struct bonding *bond = to_bond(d);
+
+ if (bond->params.mode == BOND_MODE_ALB) {
+ count = rlb_print_rx_hashtbl(bond, buf);
+ }
+
+ return count;
+}
+static DEVICE_ATTR(alb_rx_hash, S_IRUGO, bonding_show_alb_rx_hash, NULL);
+
static struct attribute *per_bond_attrs[] = {
&dev_attr_slaves.attr,
@@ -1505,6 +1543,8 @@ static struct attribute *per_bond_attrs[] = {
&dev_attr_ad_actor_key.attr,
&dev_attr_ad_partner_key.attr,
&dev_attr_ad_partner_mac.attr,
+ &dev_attr_alb_rx_hash.attr,
+ &dev_attr_tlb_tx_hash.attr,
NULL,
};
--
1.5.5.6
^ permalink raw reply related
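One thing that may be worth checking in the two print helpers above: a sysfs show() callback is handed a single PAGE_SIZE buffer, and the tx table can emit up to TLB_HASH_TABLE_SIZE rows, so chained sprintf() calls could in principle run past the end of it. Below is a minimal userspace sketch of a bounded append, assuming a 4096-byte page. The names buf_append, print_table and SKETCH_PAGE_SIZE are hypothetical and are not bonding or kernel APIs; in-kernel code would more likely reach for scnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, stand-in for PAGE_SIZE */

/*
 * Append formatted text at buf + used without writing past size bytes.
 * Returns the new number of bytes used (excluding the trailing NUL),
 * which never exceeds size - 1.
 */
size_t buf_append(char *buf, size_t size, size_t used, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0 || used >= size - 1)
		return used;		/* buffer already full */

	va_start(args, fmt);
	n = vsnprintf(buf + used, size - used, fmt, args);
	va_end(args);

	if (n < 0)
		return used;		/* formatting error, nothing counted */
	if ((size_t)n >= size - used)
		return size - 1;	/* output was truncated to fit */
	return used + (size_t)n;
}

/* Emit a header plus `rows` identical lines, the way a show() helper might. */
size_t print_table(char *buf, size_t size, unsigned int rows)
{
	size_t count = 0;
	unsigned int i;

	count = buf_append(buf, size, count, "device load\n");
	for (i = 0; i < rows; i++)
		count = buf_append(buf, size, count, "%-7s %d\n", "eth1", 491);
	return count;
}
```

Calling print_table() with a row count far larger than the buffer can hold simply stops at size - 1 bytes and leaves the buffer NUL-terminated, rather than overrunning it.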
* Re: [PATCH 2/4 v3] bonding: make sure tx and rx hash tables stay in sync when using alb mode
From: Andy Gospodarek @ 2009-09-18 15:56 UTC (permalink / raw)
To: Jay Vosburgh; +Cc: netdev, bonding-devel
In-Reply-To: <20090918153622.GX8515@gospo.rdu.redhat.com>
On Fri, Sep 18, 2009 at 11:36:22AM -0400, Andy Gospodarek wrote:
> On Wed, Sep 16, 2009 at 04:36:09PM -0700, Jay Vosburgh wrote:
> > Andy Gospodarek <andy@greyhouse.net> wrote:
> >
> > >
> > >Subject: [PATCH] bonding: make sure tx and rx hash tables stay in sync when using alb mode
> >
> > When testing this, I'm getting a lockdep warning. It appears to
> > be unhappy that tlb_choose_channel acquires the tx / rx hash table locks
> > in the order tx then rx, but rlb_choose_channel -> alb_get_best_slave
> > acquires the locks in the other order. I applied all four patches, but
> > it looks like the change that trips lockdep is in this patch (#2).
> >
> > I haven't gotten an actual deadlock from this, although it seems
> > plausible if there are two cpus in bond_alb_xmit at the same time, and
> > one of them is sending an ARP.
> >
> > One fairly straightforward fix would be to combine the rx and tx
> > hash table locks into a single lock. I suspect that wouldn't have any
> > real performance penalty, since the rx hash table lock is generally not
> > acquired very often (unlike the tx lock, which is taken for every packet
> > that goes out).
> >
> > Also, FYI, two of the four patches had trailing whitespace. I
> > believe it was #2 and #4.
> >
> > Thoughts?
>
> Jay,
>
> This patch should address both the deadlock and whitespace concerns.
> I ran a kernel with LOCKDEP enabled and saw no warnings while passing
> traffic on the bond while pulling cables and while removing the module.
> Here it is....
>
Adding the version and signed-off-by lines might be nice, eh?
[PATCH v3] bonding: make sure tx and rx hash tables stay in sync when using alb mode
I noticed that it was easy for alb (mode 6) bonding to get into a state
where the tx hash-table and rx hash-table are out of sync (there is
really nothing to keep them synchronized), and we will transmit traffic
destined for a host on one slave and send ARP frames to the same host
from another interface using a different source MAC.
There is no compelling reason to do this, so this patch makes sure the
rx hash-table changes whenever the tx hash-table is updated based on
device load. This patch also drops the code that does rlb re-balancing
since the balancing will now be controlled by the tx hash-table based on
transmit load. In order to address an issue found with the initial
patch, I have also combined the rx and tx hash table lock into a single
lock. This will facilitate moving these into a single table at some
point.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
---
drivers/net/bonding/bond_alb.c | 203 +++++++++++++++-------------------------
drivers/net/bonding/bond_alb.h | 3 +-
2 files changed, 75 insertions(+), 131 deletions(-)
diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index bcf25c6..04b7055 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -111,6 +111,7 @@ static inline struct arp_pkt *arp_pkt(const struct sk_buff *skb)
/* Forward declaration */
static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[]);
+static struct slave *alb_get_best_slave(struct bonding *bond, u32 hash_index);
static inline u8 _simple_hash(const u8 *hash_start, int hash_size)
{
@@ -124,18 +125,18 @@ static inline u8 _simple_hash(const u8 *hash_start, int hash_size)
return hash;
}
-/*********************** tlb specific functions ***************************/
-
-static inline void _lock_tx_hashtbl(struct bonding *bond)
+/********************* hash table lock functions *************************/
+static inline void _lock_hashtbl(struct bonding *bond)
{
- spin_lock_bh(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
+ spin_lock_bh(&(BOND_ALB_INFO(bond).hashtbl_lock));
}
-static inline void _unlock_tx_hashtbl(struct bonding *bond)
+static inline void _unlock_hashtbl(struct bonding *bond)
{
- spin_unlock_bh(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
+ spin_unlock_bh(&(BOND_ALB_INFO(bond).hashtbl_lock));
}
+/*********************** tlb specific functions ***************************/
/* Caller must hold tx_hashtbl lock */
static inline void tlb_init_table_entry(struct tlb_client_info *entry, int save_load)
{
@@ -163,7 +164,7 @@ static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_
struct tlb_client_info *tx_hash_table;
u32 index;
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
/* clear slave from tx_hashtbl */
tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl;
@@ -180,7 +181,7 @@ static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_
tlb_init_slave(slave);
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* Must be called before starting the monitor timer */
@@ -191,7 +192,7 @@ static int tlb_initialize(struct bonding *bond)
struct tlb_client_info *new_hashtbl;
int i;
- spin_lock_init(&(bond_info->tx_hashtbl_lock));
+ spin_lock_init(&(bond_info->hashtbl_lock));
new_hashtbl = kzalloc(size, GFP_KERNEL);
if (!new_hashtbl) {
@@ -200,7 +201,7 @@ static int tlb_initialize(struct bonding *bond)
bond->dev->name);
return -1;
}
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
bond_info->tx_hashtbl = new_hashtbl;
@@ -208,7 +209,7 @@ static int tlb_initialize(struct bonding *bond)
tlb_init_table_entry(&bond_info->tx_hashtbl[i], 1);
}
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return 0;
}
@@ -218,12 +219,12 @@ static void tlb_deinitialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
kfree(bond_info->tx_hashtbl);
bond_info->tx_hashtbl = NULL;
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* Caller must hold bond lock for read */
@@ -264,24 +265,6 @@ static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
return least_loaded;
}
-/* Caller must hold bond lock for read and hashtbl lock */
-static struct slave *tlb_get_best_slave(struct bonding *bond, u32 hash_index)
-{
- struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- struct tlb_client_info *tx_hash_table = bond_info->tx_hashtbl;
- struct slave *last_slave = tx_hash_table[hash_index].last_slave;
- struct slave *next_slave = NULL;
-
- if (last_slave && SLAVE_IS_OK(last_slave)) {
- /* Use the last slave listed in the tx hashtbl if:
- the last slave currently is essentially unloaded. */
- if (SLAVE_TLB_INFO(last_slave).load < 10)
- next_slave = last_slave;
- }
-
- return next_slave ? next_slave : tlb_get_least_loaded_slave(bond);
-}
-
/* Caller must hold bond lock for read */
static struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
{
@@ -289,13 +272,12 @@ static struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u3
struct tlb_client_info *hash_table;
struct slave *assigned_slave;
- _lock_tx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_table = bond_info->tx_hashtbl;
assigned_slave = hash_table[hash_index].tx_slave;
if (!assigned_slave) {
- assigned_slave = tlb_get_best_slave(bond, hash_index);
-
+ assigned_slave = alb_get_best_slave(bond, hash_index);
if (assigned_slave) {
struct tlb_slave_info *slave_info =
&(SLAVE_TLB_INFO(assigned_slave));
@@ -319,20 +301,52 @@ static struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u3
hash_table[hash_index].tx_bytes += skb_len;
}
- _unlock_tx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return assigned_slave;
}
/*********************** rlb specific functions ***************************/
-static inline void _lock_rx_hashtbl(struct bonding *bond)
+
+/* Caller must hold bond lock for read and hashtbl lock */
+static struct slave *rlb_update_rx_table(struct bonding *bond, struct slave *next_slave, u32 hash_index)
{
- spin_lock_bh(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+
+ /* check rlb table and correct it if wrong */
+ if (bond_info->rlb_enabled) {
+ struct rlb_client_info *rx_client_info = &(bond_info->rx_hashtbl[hash_index]);
+
+ /* if the new slave computed by tlb checks doesn't match rlb, stop rlb from using it */
+ if (next_slave && (next_slave != rx_client_info->slave))
+ rx_client_info->slave = next_slave;
+ }
+ return next_slave;
}
-static inline void _unlock_rx_hashtbl(struct bonding *bond)
+/* Caller must hold bond lock for read and hashtbl lock */
+static struct slave *alb_get_best_slave(struct bonding *bond, u32 hash_index)
{
- spin_unlock_bh(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
+ struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+ struct tlb_client_info *tx_hash_table = bond_info->tx_hashtbl;
+ struct slave *last_slave = tx_hash_table[hash_index].last_slave;
+ struct slave *next_slave = NULL;
+
+ /* presume the next slave will be the least loaded one */
+ next_slave = tlb_get_least_loaded_slave(bond);
+
+ if (last_slave && SLAVE_IS_OK(last_slave)) {
+ /* Use the last slave listed in the tx hashtbl if:
+ the last slave currently is essentially unloaded. */
+ if (SLAVE_TLB_INFO(last_slave).load < 10)
+ next_slave = last_slave;
+ }
+
+ /* update the rlb hashtbl if there was a previous entry */
+ if (bond_info->rlb_enabled)
+ rlb_update_rx_table(bond, next_slave, hash_index);
+
+ return next_slave;
}
/* when an ARP REPLY is received from a client update its info
@@ -344,7 +358,7 @@ static void rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
struct rlb_client_info *client_info;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = _simple_hash((u8*)&(arp->ip_src), sizeof(arp->ip_src));
client_info = &(bond_info->rx_hashtbl[hash_index]);
@@ -358,7 +372,7 @@ static void rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
bond_info->rx_ntt = 1;
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
static int rlb_arp_recv(struct sk_buff *skb, struct net_device *bond_dev, struct packet_type *ptype, struct net_device *orig_dev)
@@ -402,38 +416,6 @@ out:
return res;
}
-/* Caller must hold bond lock for read */
-static struct slave *rlb_next_rx_slave(struct bonding *bond)
-{
- struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- struct slave *rx_slave, *slave, *start_at;
- int i = 0;
-
- if (bond_info->next_rx_slave) {
- start_at = bond_info->next_rx_slave;
- } else {
- start_at = bond->first_slave;
- }
-
- rx_slave = NULL;
-
- bond_for_each_slave_from(bond, slave, i, start_at) {
- if (SLAVE_IS_OK(slave)) {
- if (!rx_slave) {
- rx_slave = slave;
- } else if (slave->speed > rx_slave->speed) {
- rx_slave = slave;
- }
- }
- }
-
- if (rx_slave) {
- bond_info->next_rx_slave = rx_slave->next;
- }
-
- return rx_slave;
-}
-
/* teach the switch the mac of a disabled slave
* on the primary for fault tolerance
*
@@ -468,14 +450,14 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
u32 index, next_index;
/* clear slave from rx_hashtbl */
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
rx_hash_table = bond_info->rx_hashtbl;
index = bond_info->rx_hashtbl_head;
for (; index != RLB_NULL_INDEX; index = next_index) {
next_index = rx_hash_table[index].next;
if (rx_hash_table[index].slave == slave) {
- struct slave *assigned_slave = rlb_next_rx_slave(bond);
+ struct slave *assigned_slave = alb_get_best_slave(bond, index);
if (assigned_slave) {
rx_hash_table[index].slave = assigned_slave;
@@ -499,7 +481,7 @@ static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
}
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
write_lock_bh(&bond->curr_slave_lock);
@@ -558,7 +540,7 @@ static void rlb_update_rx_clients(struct bonding *bond)
struct rlb_client_info *client_info;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
@@ -576,7 +558,7 @@ static void rlb_update_rx_clients(struct bonding *bond)
*/
bond_info->rlb_update_delay_counter = RLB_UPDATE_DELAY;
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* The slave was assigned a new mac address - update the clients */
@@ -587,7 +569,7 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
int ntt = 0;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
@@ -607,7 +589,7 @@ static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *sla
bond_info->rlb_update_retry_counter = RLB_UPDATE_RETRY;
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* mark all clients using src_ip to be updated */
@@ -617,7 +599,7 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
struct rlb_client_info *client_info;
u32 hash_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
@@ -643,7 +625,7 @@ static void rlb_req_update_subnet_clients(struct bonding *bond, __be32 src_ip)
}
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/* Caller must hold both bond and ptr locks for read */
@@ -655,7 +637,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
struct rlb_client_info *client_info;
u32 hash_index = 0;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
hash_index = _simple_hash((u8 *)&arp->ip_dst, sizeof(arp->ip_src));
client_info = &(bond_info->rx_hashtbl[hash_index]);
@@ -671,7 +653,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
assigned_slave = client_info->slave;
if (assigned_slave) {
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return assigned_slave;
}
} else {
@@ -687,7 +669,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
}
}
/* assign a new slave */
- assigned_slave = rlb_next_rx_slave(bond);
+ assigned_slave = alb_get_best_slave(bond, hash_index);
if (assigned_slave) {
client_info->ip_src = arp->ip_src;
@@ -723,7 +705,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
}
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
return assigned_slave;
}
@@ -771,36 +753,6 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
return tx_slave;
}
-/* Caller must hold bond lock for read */
-static void rlb_rebalance(struct bonding *bond)
-{
- struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
- struct slave *assigned_slave;
- struct rlb_client_info *client_info;
- int ntt;
- u32 hash_index;
-
- _lock_rx_hashtbl(bond);
-
- ntt = 0;
- hash_index = bond_info->rx_hashtbl_head;
- for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
- client_info = &(bond_info->rx_hashtbl[hash_index]);
- assigned_slave = rlb_next_rx_slave(bond);
- if (assigned_slave && (client_info->slave != assigned_slave)) {
- client_info->slave = assigned_slave;
- client_info->ntt = 1;
- ntt = 1;
- }
- }
-
- /* update the team's flag only after the whole iteration */
- if (ntt) {
- bond_info->rx_ntt = 1;
- }
- _unlock_rx_hashtbl(bond);
-}
-
/* Caller must hold rx_hashtbl lock */
static void rlb_init_table_entry(struct rlb_client_info *entry)
{
@@ -817,8 +769,6 @@ static int rlb_initialize(struct bonding *bond)
int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
int i;
- spin_lock_init(&(bond_info->rx_hashtbl_lock));
-
new_hashtbl = kmalloc(size, GFP_KERNEL);
if (!new_hashtbl) {
printk(KERN_ERR DRV_NAME
@@ -826,7 +776,7 @@ static int rlb_initialize(struct bonding *bond)
bond->dev->name);
return -1;
}
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
bond_info->rx_hashtbl = new_hashtbl;
@@ -836,7 +786,7 @@ static int rlb_initialize(struct bonding *bond)
rlb_init_table_entry(bond_info->rx_hashtbl + i);
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
/*initialize packet type*/
pk_type->type = cpu_to_be16(ETH_P_ARP);
@@ -855,13 +805,13 @@ static void rlb_deinitialize(struct bonding *bond)
dev_remove_pack(&(bond_info->rlb_pkt_type));
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
kfree(bond_info->rx_hashtbl);
bond_info->rx_hashtbl = NULL;
bond_info->rx_hashtbl_head = RLB_NULL_INDEX;
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
@@ -869,7 +819,7 @@ static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
u32 curr_index;
- _lock_rx_hashtbl(bond);
+ _lock_hashtbl(bond);
curr_index = bond_info->rx_hashtbl_head;
while (curr_index != RLB_NULL_INDEX) {
@@ -894,7 +844,7 @@ static void rlb_clear_vlan(struct bonding *bond, unsigned short vlan_id)
curr_index = next_index;
}
- _unlock_rx_hashtbl(bond);
+ _unlock_hashtbl(bond);
}
/*********************** tlb/rlb shared functions *********************/
@@ -1521,11 +1471,6 @@ void bond_alb_monitor(struct work_struct *work)
read_lock(&bond->lock);
}
- if (bond_info->rlb_rebalance) {
- bond_info->rlb_rebalance = 0;
- rlb_rebalance(bond);
- }
-
/* check if clients need updating */
if (bond_info->rx_ntt) {
if (bond_info->rlb_update_delay_counter) {
diff --git a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h
index b65fd29..09d755a 100644
--- a/drivers/net/bonding/bond_alb.h
+++ b/drivers/net/bonding/bond_alb.h
@@ -90,7 +90,7 @@ struct tlb_slave_info {
struct alb_bond_info {
struct timer_list alb_timer;
struct tlb_client_info *tx_hashtbl; /* Dynamically allocated */
- spinlock_t tx_hashtbl_lock;
+ spinlock_t hashtbl_lock; /* lock for both tables */
u32 unbalanced_load;
int tx_rebalance_counter;
int lp_counter;
@@ -98,7 +98,6 @@ struct alb_bond_info {
int rlb_enabled;
struct packet_type rlb_pkt_type;
struct rlb_client_info *rx_hashtbl; /* Receive hash table */
- spinlock_t rx_hashtbl_lock;
u32 rx_hashtbl_head;
u8 rx_ntt; /* flag - need to transmit
* to all rx clients
--
1.5.5.6
^ permalink raw reply related
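The core idea of the v3 patch above is that a slave choice made on the transmit side is pushed into the receive table while one shared lock is held, so there is no second lock whose ordering could trip lockdep. That can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions: struct sketch_bond, choose_tx_slave and the integer slave ids are hypothetical, the spinlock is reduced to a flag so that the "caller must hold the lock" rule can be checked, and the driver's load accounting is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_TABLE_SIZE 256	/* assumed table size for this sketch */
#define NO_SLAVE (-1)

struct sketch_bond {
	bool hashtbl_locked;			/* stands in for the single spinlock */
	bool rlb_enabled;
	int tx_slave[SKETCH_TABLE_SIZE];	/* tx hash table: slave id per bucket */
	int rx_slave[SKETCH_TABLE_SIZE];	/* rx hash table: slave id per bucket */
};

void sketch_init(struct sketch_bond *bond, bool rlb_enabled)
{
	int i;

	bond->hashtbl_locked = false;
	bond->rlb_enabled = rlb_enabled;
	for (i = 0; i < SKETCH_TABLE_SIZE; i++) {
		bond->tx_slave[i] = NO_SLAVE;
		bond->rx_slave[i] = NO_SLAVE;
	}
}

void sketch_lock(struct sketch_bond *bond)
{
	assert(!bond->hashtbl_locked);	/* one lock, so no ordering to get wrong */
	bond->hashtbl_locked = true;
}

void sketch_unlock(struct sketch_bond *bond)
{
	assert(bond->hashtbl_locked);
	bond->hashtbl_locked = false;
}

/* Caller must hold the table lock. */
static void sync_rx_entry(struct sketch_bond *bond, unsigned int hash, int slave)
{
	assert(bond->hashtbl_locked);
	if (bond->rlb_enabled && slave != NO_SLAVE)
		bond->rx_slave[hash] = slave;
}

/*
 * Return the tx slave for a hash bucket. If the bucket has none yet,
 * assign least_loaded and bring the rx table entry into step with it.
 */
int choose_tx_slave(struct sketch_bond *bond, unsigned int hash, int least_loaded)
{
	int slave;

	hash %= SKETCH_TABLE_SIZE;
	sketch_lock(bond);
	slave = bond->tx_slave[hash];
	if (slave == NO_SLAVE) {
		slave = least_loaded;
		bond->tx_slave[hash] = slave;
		sync_rx_entry(bond, hash, slave);
	}
	sketch_unlock(bond);
	return slave;
}
```

Once a bucket has a tx slave, later calls return the same slave and leave the rx entry alone, which is the "stay in sync" property the patch description is after.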
* Re: [RFC] defer skb allocation in virtio_net -- mergable buff part
From: Shirley Ma @ 2009-09-18 17:04 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: netdev, kvm, linux-kernel
In-Reply-To: <20090825114143.GA13884@redhat.com>
Hello Michael,
I am working on a patch to address the question you raised below. I am
adding one more function, destroy_buf, to virtqueue_ops so that the
upper layer (such as virtio_net) does not need to maintain a list of
pending buffers. When the device is shut down or removed, this function
will be called to release all pending buffers in virtio_ring on behalf
of virtio_net.
The rest of the comments are minor. The new patch will defer skb allocation
for both mergeable and non-mergeable buffers.
Thanks
Shirley
On Tue, 2009-08-25 at 14:41 +0300, Michael S. Tsirkin wrote:
> > #define VIRTNET_SEND_COMMAND_SG_MAX 2
> >
> > +struct page_list
> > +{
>
> Kernel style is "struct page_list {".
> Also, prefix with virtnet_?
>
> > + struct page *page;
> > + struct list_head list;
> > +};
> > +
> > struct virtnet_info
> > {
> > struct virtio_device *vdev;
> > @@ -72,6 +79,8 @@ struct virtnet_info
> >
> > /* Chain pages by the private ptr. */
> > struct page *pages;
>
> Do we need the pages list now? Can we do without?
>
> Pls document fields below.
>
> > + struct list_head used_pages;
>
> Seems a waste to have this list just for dev down.
> Extend virtio to give us all buffers from vq
> on shutdown?
>
^ permalink raw reply
* Re: [origin tree build failure] [PATCH] Fix: (.text+0x22ec88): undefined reference to `ieee80211_unregister_hw'
From: Kalle Valo @ 2009-09-18 18:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: David Miller, Bob Copeland, Coelho Luciano (Nokia-D/Helsinki),
Oikarinen Juuso (Nokia-D/Tampere), torvalds@linux-foundation.org,
akpm@linux-foundation.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, John W. Linville,
linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20090915114958.GA26902-X9Un+BFzKDI@public.gmane.org>
Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org> writes:
> FYI, -tip testing found that something in this lot broke the build with
> certain configs (attached):
>
> drivers/built-in.o: In function `wl1251_free_hw':
> (.text+0x22ec88): undefined reference to `ieee80211_unregister_hw'
> drivers/built-in.o: In function `wl1251_free_hw':
> (.text+0x22ecf5): undefined reference to `ieee80211_free_hw'
> drivers/built-in.o: In function `wl1251_op_bss_info_changed':
> wl1251_main.c:(.text+0x22f161): undefined reference to `ieee80211_beacon_get'
> drivers/built-in.o: In function `wl1251_op_config':
> wl1251_main.c:(.text+0x22f2f8): undefined reference to `ieee80211_frequency_to_channel'
> drivers/built-in.o: In function `wl1251_op_stop':
> wl1251_main.c:(.text+0x22f554): undefined reference to `ieee80211_scan_completed'
> drivers/built-in.o: In function `wl1251_op_tx':
> wl1251_main.c:(.text+0x22f6a5): undefined reference to `ieee80211_queue_work'
> wl1251_main.c:(.text+0x22f6b6): undefined reference to `ieee80211_stop_queues'
> drivers/built-in.o: In function `wl1251_alloc_hw':
> (.text+0x22f710): undefined reference to `ieee80211_alloc_hw'
> drivers/built-in.o: In function `wl1251_alloc_hw':
> (.text+0x22f9e4): undefined reference to `ieee80211_free_hw'
> drivers/built-in.o: In function `wl1251_init_ieee80211':
> (.text+0x2305df): undefined reference to `ieee80211_register_hw'
> drivers/built-in.o: In function `wl1251_event_handle':
> (.text+0x2306c4): undefined reference to `ieee80211_scan_completed'
> drivers/built-in.o: In function `wl1251_tx_flush':
> (.text+0x230810): undefined reference to `ieee80211_tx_status'
> drivers/built-in.o: In function `wl1251_tx_flush':
> (.text+0x230846): undefined reference to `ieee80211_tx_status'
> drivers/built-in.o: In function `wl1251_tx_frame':
> wl1251_tx.c:(.text+0x230a97): undefined reference to `ieee80211_hdrlen'
> drivers/built-in.o: In function `wl1251_tx_complete':
> (.text+0x230d30): undefined reference to `ieee80211_get_hdrlen_from_skb'
> drivers/built-in.o: In function `wl1251_tx_complete':
> (.text+0x230d58): undefined reference to `ieee80211_tx_status'
> drivers/built-in.o: In function `wl1251_tx_complete':
> (.text+0x230dc0): undefined reference to `ieee80211_wake_queues'
> drivers/built-in.o: In function `wl1251_tx_work':
> (.text+0x230f57): undefined reference to `ieee80211_stop_queues'
> drivers/built-in.o: In function `wl1251_rx':
> (.text+0x231187): undefined reference to `ieee80211_channel_to_frequency'
> drivers/built-in.o: In function `wl1251_rx':
> (.text+0x2311e4): undefined reference to `ieee80211_rx'
>
> Turning CONFIG_WL1251 off makes it build.
>
> A (very) quick first look suggests that not all prior dependencies were
> carried over to the new drivers in drivers/net/wireless/wl12xx/Kconfig:
>
> -config WL12XX
> - tristate "TI wl1251/wl1271 support"
> - depends on MAC80211 && WLAN_80211 && SPI_MASTER &&
> GENERIC_HARDIRQS && EXPERIMENTAL
> +menuconfig WL12XX
> + boolean "TI wl12xx driver support"
> + depends on MAC80211 && WLAN_80211 && EXPERIMENTAL
> + ---help---
> + This will enable TI wl12xx driver support. The drivers make
> + use of the mac80211 stack.
> +
> +config WL1251
> + tristate "TI wl1251 support"
> + depends on WL12XX && GENERIC_HARDIRQS
>
> The friction is between modular and built-in mode:
>
> CONFIG_WL1251=y
> CONFIG_MAC80211=m
>
> Kconfig does not carry over the modular dependency from WL12XX to
> WL1251. An explicit rule via the patch below turns CONFIG_WL1251 into a
> modular entry as well:
>
> CONFIG_WL12XX=y
> CONFIG_WL1251=m
>
> ( Note: i have tested this patch with this particular config and it
> solves the problem there but have not investigated any deeper. )
>
> Ingo
>
> Signed-off-by: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
I missed this state entirely, thanks for fixing this.
Acked-by: Kalle Valo <kalle.valo-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org>
The only thing missing is the "wl1251:" prefix in the commit summary.
Who is going to take the patch? Should I send this to John?
--
Kalle Valo
^ permalink raw reply
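The modular versus built-in friction Ingo describes follows from how Kconfig evaluates dependencies: symbol values are ordered n < m < y, "&&" takes the lower of its operands, and a boolean symbol cannot be m, so a bool whose dependency is merely modular can still end up y (as the quoted config shows for WL12XX). A small C model of that rule is sketched below. The function names are hypothetical, the other dependencies of WL12XX are assumed to be y, and the "after" variant only assumes the fix adds an explicit MAC80211 dependency to WL1251, since the patch itself is not quoted in this message.

```c
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig-style "a && b": the lower of the two values. */
enum tristate tristate_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* Upper limit a dependency places on a bool symbol: m is promoted to y. */
enum tristate bool_limit(enum tristate dep)
{
	return dep == TRI_N ? TRI_N : TRI_Y;
}

/*
 * Highest value WL1251 may take when it only depends on
 * WL12XX && GENERIC_HARDIRQS, with WL12XX being a bool gated on MAC80211.
 */
enum tristate wl1251_limit_before(enum tristate mac80211, enum tristate hardirqs)
{
	enum tristate wl12xx = bool_limit(mac80211);

	return tristate_and(wl12xx, hardirqs);
}

/* Same, with an explicit dependency on MAC80211 (assumed shape of the fix). */
enum tristate wl1251_limit_after(enum tristate mac80211, enum tristate hardirqs)
{
	return tristate_and(wl1251_limit_before(mac80211, hardirqs), mac80211);
}
```

With MAC80211=m and GENERIC_HARDIRQS=y, the first function allows y (the broken CONFIG_WL1251=y case) while the second caps WL1251 at m.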
* Re: [PATCH net-next-2.6] bonding: introduce primary_reselect option
From: Nicolas de Pesloüan @ 2009-09-18 19:32 UTC (permalink / raw)
To: Jiri Pirko; +Cc: netdev, davem, fubar, bonding-devel
In-Reply-To: <20090918153006.GC2801@psychotron.redhat.com>
Jiri Pirko wrote:
> (updated 3)
>
> In some cases it is not desirable to switch back to the primary interface when
> its link recovers; it is better to stay with the currently active one. We need to
> avoid packet loss as much as we can in some cases. This is solved by introducing
> the primary_reselect option. Note that the primary slave, when enslaved, is set as
> the current active slave no matter what.
>
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
> diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
> index d5181ce..fd650e0 100644
> --- a/Documentation/networking/bonding.txt
> +++ b/Documentation/networking/bonding.txt
> @@ -614,6 +614,32 @@ primary
>
> The primary option is only valid for active-backup mode.
>
> +primary_reselect
> +
> + Specifies the behavior of the current active slave when the primary was
> + down and comes back up. This option is designed to prevent
> + flip-flopping between the primary slave and other slaves. The possible
> + values and their respective effects are:
> +
> + always or 0 (default)
> +
> + The primary slave becomes the active slave whenever it comes
> + back up.
> +
> + better or 1
> +
> + The primary slave becomes the active slave when it comes back
> + up, if the speed and duplex of the primary slave are better
> + than the speed and duplex of the current active slave.
> +
> + failure or 2
> +
> + The primary slave becomes the active slave only if the current
> + active slave fails and the primary slave is up.
> +
> + When no slave is active, if the primary comes back up, it becomes the
> + active slave, regardless of the value of primary_reselect.
> +
> updelay
>
> Specifies the time, in milliseconds, to wait before enabling a
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 699bfdd..1127361 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -94,6 +94,7 @@ static int downdelay;
> static int use_carrier = 1;
> static char *mode;
> static char *primary;
> +static char *primary_reselect;
> static char *lacp_rate;
> static char *ad_select;
> static char *xmit_hash_policy;
> @@ -126,6 +127,13 @@ MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
> "6 for balance-alb");
> module_param(primary, charp, 0);
> MODULE_PARM_DESC(primary, "Primary network device to use");
> +module_param(primary_reselect, charp, 0);
> +MODULE_PARM_DESC(primary_reselect, "Reselect primary slave "
> + "once it comes up; "
> + "0 for always (default), "
> + "1 for only if speed of primary is not "
> + "better, "
> + "2 for never");
You should remove "not" from the description of option value 1 and use the
word "failure" for option value 2:
MODULE_PARM_DESC(primary_reselect, "Reselect primary slave "
"once it comes up; "
"0 for always (default), "
"1 for only if speed of primary is "
"better, "
"2 for only on active slave "
"failure");
Apart from this small detail, this sounds good to me.
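For readers following the option semantics, the decision described in the quoted documentation (always, better, failure) can be modelled in a few lines of standalone C. This is a sketch under assumptions, not the patch's code: struct sketch_slave, primary_should_take_over and the enum names are hypothetical, "better" is read as strictly higher speed, or equal speed with strictly better duplex, and the force_primary handling on enslave is left out.

```c
#include <stdbool.h>
#include <stddef.h>

enum reselect_policy { RESELECT_ALWAYS, RESELECT_BETTER, RESELECT_FAILURE };

struct sketch_slave {
	bool link_up;
	int speed;	/* e.g. in Mbit/s */
	int duplex;	/* 0 = half, 1 = full */
};

/*
 * Decide whether a primary slave that has come back up should replace
 * the current active slave. curr may be NULL when no slave is active.
 */
bool primary_should_take_over(enum reselect_policy policy,
			      const struct sketch_slave *prim,
			      const struct sketch_slave *curr)
{
	if (!prim || !prim->link_up)
		return false;		/* primary is not usable */
	if (!curr || !curr->link_up)
		return true;		/* nothing active: every policy picks the primary */

	switch (policy) {
	case RESELECT_BETTER:
		return prim->speed > curr->speed ||
		       (prim->speed == curr->speed &&
			prim->duplex > curr->duplex);
	case RESELECT_FAILURE:
		return false;		/* current slave is healthy, stay put */
	case RESELECT_ALWAYS:
	default:
		return true;
	}
}
```

Note how the "no active slave" check comes before the policy switch, which matches the documentation's statement that the primary becomes active regardless of primary_reselect when nothing else is up.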
> module_param(lacp_rate, charp, 0);
> MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner "
> "(slow/fast)");
> @@ -200,6 +208,13 @@ const struct bond_parm_tbl fail_over_mac_tbl[] = {
> { NULL, -1},
> };
>
> +const struct bond_parm_tbl pri_reselect_tbl[] = {
> +{ "always", BOND_PRI_RESELECT_ALWAYS},
> +{ "better", BOND_PRI_RESELECT_BETTER},
> +{ "failure", BOND_PRI_RESELECT_FAILURE},
> +{ NULL, -1},
> +};
> +
> struct bond_parm_tbl ad_select_tbl[] = {
> { "stable", BOND_AD_STABLE},
> { "bandwidth", BOND_AD_BANDWIDTH},
> @@ -1070,6 +1085,25 @@ out:
>
> }
>
> +static bool bond_should_change_active(struct bonding *bond)
> +{
> + struct slave *prim = bond->primary_slave;
> + struct slave *curr = bond->curr_active_slave;
> +
> + if (!prim || !curr || curr->link != BOND_LINK_UP)
> + return true;
> + if (bond->force_primary) {
> + bond->force_primary = false;
> + return true;
> + }
> + if (bond->params.primary_reselect == BOND_PRI_RESELECT_BETTER &&
> + (prim->speed < curr->speed ||
> + (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
> + return false;
> + if (bond->params.primary_reselect == BOND_PRI_RESELECT_FAILURE)
> + return false;
> + return true;
> +}
>
> /**
> * find_best_interface - select the best available slave to be the active one
> @@ -1094,7 +1128,8 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
> }
>
> if ((bond->primary_slave) &&
> - bond->primary_slave->link == BOND_LINK_UP) {
> + bond->primary_slave->link == BOND_LINK_UP &&
> + bond_should_change_active(bond)) {
> new_active = bond->primary_slave;
> }
>
> @@ -1675,8 +1710,10 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
>
> if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
> /* if there is a primary slave, remember it */
> - if (strcmp(bond->params.primary, new_slave->dev->name) == 0)
> + if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
> bond->primary_slave = new_slave;
> + bond->force_primary = true;
> + }
> }
>
> write_lock_bh(&bond->curr_slave_lock);
> @@ -4643,7 +4680,7 @@ int bond_parse_parm(const char *buf, const struct bond_parm_tbl *tbl)
>
> static int bond_check_params(struct bond_params *params)
> {
> - int arp_validate_value, fail_over_mac_value;
> + int arp_validate_value, fail_over_mac_value, primary_reselect_value;
>
> /*
> * Convert string parameters.
> @@ -4942,6 +4979,20 @@ static int bond_check_params(struct bond_params *params)
> primary = NULL;
> }
>
> + if (primary && primary_reselect) {
> + primary_reselect_value = bond_parse_parm(primary_reselect,
> + pri_reselect_tbl);
> + if (primary_reselect_value == -1) {
> + pr_err(DRV_NAME
> + ": Error: Invalid primary_reselect \"%s\"\n",
> + primary_reselect ==
> + NULL ? "NULL" : primary_reselect);
> + return -EINVAL;
> + }
> + } else {
> + primary_reselect_value = BOND_PRI_RESELECT_ALWAYS;
> + }
> +
> if (fail_over_mac) {
> fail_over_mac_value = bond_parse_parm(fail_over_mac,
> fail_over_mac_tbl);
> @@ -4973,6 +5024,7 @@ static int bond_check_params(struct bond_params *params)
> params->use_carrier = use_carrier;
> params->lacp_fast = lacp_fast;
> params->primary[0] = 0;
> + params->primary_reselect = primary_reselect_value;
> params->fail_over_mac = fail_over_mac_value;
>
> if (primary) {
> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
> index 6044e12..42c44f2 100644
> --- a/drivers/net/bonding/bond_sysfs.c
> +++ b/drivers/net/bonding/bond_sysfs.c
> @@ -1212,6 +1212,61 @@ static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR,
> bonding_show_primary, bonding_store_primary);
>
> /*
> + * Show and set the primary_reselect flag.
> + */
> +static ssize_t bonding_show_primary_reselect(struct device *d,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct bonding *bond = to_bond(d);
> +
> + return sprintf(buf, "%s %d\n",
> + pri_reselect_tbl[bond->params.primary_reselect].modename,
> + bond->params.primary_reselect);
> +}
> +
> +static ssize_t bonding_store_primary_reselect(struct device *d,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + int new_value, ret = count;
> + struct bonding *bond = to_bond(d);
> +
> + if (!rtnl_trylock())
> + return restart_syscall();
> +
> + new_value = bond_parse_parm(buf, pri_reselect_tbl);
> + if (new_value < 0) {
> + pr_err(DRV_NAME
> + ": %s: Ignoring invalid primary_reselect value %.*s.\n",
> + bond->dev->name,
> + (int) strlen(buf) - 1, buf);
> + ret = -EINVAL;
> + goto out;
> + } else {
> + bond->params.primary_reselect = new_value;
> + pr_info(DRV_NAME ": %s: setting primary_reselect to %s (%d).\n",
> + bond->dev->name, pri_reselect_tbl[new_value].modename,
> + new_value);
> + if (new_value == BOND_PRI_RESELECT_ALWAYS ||
> + new_value == BOND_PRI_RESELECT_BETTER) {
> + bond->force_primary = true;
> + read_lock(&bond->lock);
> + write_lock_bh(&bond->curr_slave_lock);
> + bond_select_active_slave(bond);
> + write_unlock_bh(&bond->curr_slave_lock);
> + read_unlock(&bond->lock);
> + }
> + }
> +out:
> + rtnl_unlock();
> + return ret;
> +}
> +static DEVICE_ATTR(primary_reselect, S_IRUGO | S_IWUSR,
> + bonding_show_primary_reselect,
> + bonding_store_primary_reselect);
> +
> +/*
> * Show and set the use_carrier flag.
> */
> static ssize_t bonding_show_carrier(struct device *d,
> @@ -1500,6 +1555,7 @@ static struct attribute *per_bond_attrs[] = {
> &dev_attr_num_unsol_na.attr,
> &dev_attr_miimon.attr,
> &dev_attr_primary.attr,
> + &dev_attr_primary_reselect.attr,
> &dev_attr_use_carrier.attr,
> &dev_attr_active_slave.attr,
> &dev_attr_mii_status.attr,
> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
> index 6824771..b5b1530 100644
> --- a/drivers/net/bonding/bonding.h
> +++ b/drivers/net/bonding/bonding.h
> @@ -131,6 +131,7 @@ struct bond_params {
> int lacp_fast;
> int ad_select;
> char primary[IFNAMSIZ];
> + int primary_reselect;
> __be32 arp_targets[BOND_MAX_ARP_TARGETS];
> };
>
> @@ -190,6 +191,7 @@ struct bonding {
> struct slave *curr_active_slave;
> struct slave *current_arp_slave;
> struct slave *primary_slave;
> + bool force_primary;
> s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
> rwlock_t lock;
> rwlock_t curr_slave_lock;
> @@ -258,6 +260,10 @@ static inline bool bond_is_lb(const struct bonding *bond)
> || bond->params.mode == BOND_MODE_ALB;
> }
>
> +#define BOND_PRI_RESELECT_ALWAYS 0
> +#define BOND_PRI_RESELECT_BETTER 1
> +#define BOND_PRI_RESELECT_FAILURE 2
> +
> #define BOND_FOM_NONE 0
> #define BOND_FOM_ACTIVE 1
> #define BOND_FOM_FOLLOW 2
> @@ -348,6 +354,7 @@ extern const struct bond_parm_tbl bond_mode_tbl[];
> extern const struct bond_parm_tbl xmit_hashtype_tbl[];
> extern const struct bond_parm_tbl arp_validate_tbl[];
> extern const struct bond_parm_tbl fail_over_mac_tbl[];
> +extern const struct bond_parm_tbl pri_reselect_tbl[];
> extern struct bond_parm_tbl ad_select_tbl[];
>
> #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
>
Nicolas.
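The decision made by bond_should_change_active() in the patch above can be modelled outside the kernel. What follows is only a sketch: struct slave_model, struct bond_model and the PRI_* names are hypothetical stand-ins for struct slave, struct bonding and the BOND_PRI_RESELECT_* constants, and the "better" test is rewritten as a positive comparison that should be equivalent to the negated one in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reselect policies, numbered as in the patch (0, 1, 2). */
enum pri_reselect { PRI_ALWAYS = 0, PRI_BETTER = 1, PRI_FAILURE = 2 };

/* Hypothetical, stripped-down stand-in for struct slave. */
struct slave_model {
	bool link_up;
	int speed;	/* Mb/s */
	int duplex;	/* 0 = half, 1 = full */
};

/* Hypothetical stand-in for the struct bonding fields used here. */
struct bond_model {
	const struct slave_model *primary;
	const struct slave_model *active;
	enum pri_reselect policy;
	bool force_primary;
};

/*
 * Return true when the primary, having come back up, should replace
 * the current active slave.
 */
static bool should_change_active(struct bond_model *bond)
{
	const struct slave_model *prim = bond->primary;
	const struct slave_model *curr = bond->active;

	/* No usable active slave: nothing to protect, take the primary. */
	if (!prim || !curr || !curr->link_up)
		return true;

	/* One-shot override, consumed on first use. */
	if (bond->force_primary) {
		bond->force_primary = false;
		return true;
	}

	switch (bond->policy) {
	case PRI_BETTER:
		/* Switch only if the primary is strictly better. */
		if (prim->speed != curr->speed)
			return prim->speed > curr->speed;
		return prim->duplex > curr->duplex;
	case PRI_FAILURE:
		/* The active slave is still up, so stay on it. */
		return false;
	case PRI_ALWAYS:
	default:
		return true;
	}
}
```

Under "better", a primary with the same speed and duplex as the active slave does not take over, which matches the "<=" on duplex in the patch. The force_primary flag is cleared as a side effect, so the override applies to one selection only.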
* [net-2.6 PATCH 1/3] ixgbe: fix sfp_timer clean up in ixgbe_down
From: Jeff Kirsher @ 2009-09-18 19:45 UTC (permalink / raw)
To: davem
Cc: netdev, gospo, Shannon Nelson, Don Skidmore,
Peter P Waskiewicz Jr, Jeff Kirsher
From: Don Skidmore <donald.c.skidmore@intel.com>
We weren't stopping the sfp_timer after the device was brought down.
This patch properly cleans up.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ixgbe/ixgbe_main.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 59ad959..056434c 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -2926,6 +2926,8 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
ixgbe_napi_disable_all(adapter);
+ clear_bit(__IXGBE_SFP_MODULE_NOT_FOUND, &adapter->state);
+ del_timer_sync(&adapter->sfp_timer);
del_timer_sync(&adapter->watchdog_timer);
cancel_work_sync(&adapter->watchdog_task);
* [net-2.6 PATCH 2/3] ixgbe: Allow tx itr specific settings
From: Jeff Kirsher @ 2009-09-18 19:46 UTC (permalink / raw)
To: davem; +Cc: netdev, gospo, Shannon Nelson, Peter P Waskiewicz Jr,
Jeff Kirsher
In-Reply-To: <20090918194533.28898.49436.stgit@localhost.localdomain>
From: Nelson, Shannon <shannon.nelson@intel.com>
Allow the user to set Tx specific itr values. This only makes sense
when there are separate vectors for Tx and Rx. When the queues are
doubled up RxTx on the vectors, we still only use the rx itr value.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ixgbe/ixgbe.h | 6 ++-
drivers/net/ixgbe/ixgbe_ethtool.c | 75 ++++++++++++++++++++++++++++++-------
drivers/net/ixgbe/ixgbe_main.c | 31 +++++++++------
3 files changed, 83 insertions(+), 29 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe.h b/drivers/net/ixgbe/ixgbe.h
index dd688d4..385be60 100644
--- a/drivers/net/ixgbe/ixgbe.h
+++ b/drivers/net/ixgbe/ixgbe.h
@@ -267,7 +267,8 @@ struct ixgbe_adapter {
enum ixgbe_fc_mode last_lfc_mode;
/* Interrupt Throttle Rate */
- u32 itr_setting;
+ u32 rx_itr_setting;
+ u32 tx_itr_setting;
u16 eitr_low;
u16 eitr_high;
@@ -351,7 +352,8 @@ struct ixgbe_adapter {
struct ixgbe_hw_stats stats;
/* Interrupt Throttle Rate */
- u32 eitr_param;
+ u32 rx_eitr_param;
+ u32 tx_eitr_param;
unsigned long state;
u64 tx_busy;
diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index 026e94a..53b0a66 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -1929,7 +1929,7 @@ static int ixgbe_get_coalesce(struct net_device *netdev,
ec->tx_max_coalesced_frames_irq = adapter->tx_ring[0].work_limit;
/* only valid if in constant ITR mode */
- switch (adapter->itr_setting) {
+ switch (adapter->rx_itr_setting) {
case 0:
/* throttling disabled */
ec->rx_coalesce_usecs = 0;
@@ -1940,9 +1940,25 @@ static int ixgbe_get_coalesce(struct net_device *netdev,
break;
default:
/* fixed interrupt rate mode */
- ec->rx_coalesce_usecs = 1000000/adapter->eitr_param;
+ ec->rx_coalesce_usecs = 1000000/adapter->rx_eitr_param;
break;
}
+
+ /* only valid if in constant ITR mode */
+ switch (adapter->tx_itr_setting) {
+ case 0:
+ /* throttling disabled */
+ ec->tx_coalesce_usecs = 0;
+ break;
+ case 1:
+ /* dynamic ITR mode */
+ ec->tx_coalesce_usecs = 1;
+ break;
+ default:
+ ec->tx_coalesce_usecs = 1000000/adapter->tx_eitr_param;
+ break;
+ }
+
return 0;
}
@@ -1953,6 +1969,14 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
struct ixgbe_q_vector *q_vector;
int i;
+ /*
+ * don't accept tx specific changes if we've got mixed RxTx vectors
+ * test and jump out here if needed before changing the rx numbers
+ */
+ if ((1000000/ec->tx_coalesce_usecs) != adapter->tx_eitr_param &&
+ adapter->q_vector[0]->txr_count && adapter->q_vector[0]->rxr_count)
+ return -EINVAL;
+
if (ec->tx_max_coalesced_frames_irq)
adapter->tx_ring[0].work_limit = ec->tx_max_coalesced_frames_irq;
@@ -1963,26 +1987,49 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
return -EINVAL;
/* store the value in ints/second */
- adapter->eitr_param = 1000000/ec->rx_coalesce_usecs;
+ adapter->rx_eitr_param = 1000000/ec->rx_coalesce_usecs;
/* static value of interrupt rate */
- adapter->itr_setting = adapter->eitr_param;
+ adapter->rx_itr_setting = adapter->rx_eitr_param;
/* clear the lower bit as its used for dynamic state */
- adapter->itr_setting &= ~1;
+ adapter->rx_itr_setting &= ~1;
} else if (ec->rx_coalesce_usecs == 1) {
/* 1 means dynamic mode */
- adapter->eitr_param = 20000;
- adapter->itr_setting = 1;
+ adapter->rx_eitr_param = 20000;
+ adapter->rx_itr_setting = 1;
} else {
/*
* any other value means disable eitr, which is best
* served by setting the interrupt rate very high
*/
if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED)
- adapter->eitr_param = IXGBE_MAX_RSC_INT_RATE;
+ adapter->rx_eitr_param = IXGBE_MAX_RSC_INT_RATE;
else
- adapter->eitr_param = IXGBE_MAX_INT_RATE;
- adapter->itr_setting = 0;
+ adapter->rx_eitr_param = IXGBE_MAX_INT_RATE;
+ adapter->rx_itr_setting = 0;
+ }
+
+ if (ec->tx_coalesce_usecs > 1) {
+ /* check the limits */
+ if ((1000000/ec->tx_coalesce_usecs > IXGBE_MAX_INT_RATE) ||
+ (1000000/ec->tx_coalesce_usecs < IXGBE_MIN_INT_RATE))
+ return -EINVAL;
+
+ /* store the value in ints/second */
+ adapter->tx_eitr_param = 1000000/ec->tx_coalesce_usecs;
+
+ /* static value of interrupt rate */
+ adapter->tx_itr_setting = adapter->tx_eitr_param;
+
+ /* clear the lower bit as its used for dynamic state */
+ adapter->tx_itr_setting &= ~1;
+ } else if (ec->tx_coalesce_usecs == 1) {
+ /* 1 means dynamic mode */
+ adapter->tx_eitr_param = 10000;
+ adapter->tx_itr_setting = 1;
+ } else {
+ adapter->tx_eitr_param = IXGBE_MAX_INT_RATE;
+ adapter->tx_itr_setting = 0;
}
/* MSI/MSIx Interrupt Mode */
@@ -1992,17 +2039,17 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
for (i = 0; i < num_vectors; i++) {
q_vector = adapter->q_vector[i];
if (q_vector->txr_count && !q_vector->rxr_count)
- /* tx vector gets half the rate */
- q_vector->eitr = (adapter->eitr_param >> 1);
+ /* tx only */
+ q_vector->eitr = adapter->tx_eitr_param;
else
/* rx only or mixed */
- q_vector->eitr = adapter->eitr_param;
+ q_vector->eitr = adapter->rx_eitr_param;
ixgbe_write_eitr(q_vector);
}
/* Legacy Interrupt Mode */
} else {
q_vector = adapter->q_vector[0];
- q_vector->eitr = adapter->eitr_param;
+ q_vector->eitr = adapter->rx_eitr_param;
ixgbe_write_eitr(q_vector);
}
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 056434c..1aa9f6a 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -926,12 +926,12 @@ static void ixgbe_configure_msix(struct ixgbe_adapter *adapter)
r_idx + 1);
}
- /* if this is a tx only vector halve the interrupt rate */
if (q_vector->txr_count && !q_vector->rxr_count)
- q_vector->eitr = (adapter->eitr_param >> 1);
+ /* tx only */
+ q_vector->eitr = adapter->tx_eitr_param;
else if (q_vector->rxr_count)
- /* rx only */
- q_vector->eitr = adapter->eitr_param;
+ /* rx or mixed */
+ q_vector->eitr = adapter->rx_eitr_param;
ixgbe_write_eitr(q_vector);
}
@@ -1359,7 +1359,7 @@ static int ixgbe_clean_rxonly(struct napi_struct *napi, int budget)
/* If all Rx work done, exit the polling mode */
if (work_done < budget) {
napi_complete(napi);
- if (adapter->itr_setting & 1)
+ if (adapter->rx_itr_setting & 1)
ixgbe_set_itr_msix(q_vector);
if (!test_bit(__IXGBE_DOWN, &adapter->state))
ixgbe_irq_enable_queues(adapter,
@@ -1420,7 +1420,7 @@ static int ixgbe_clean_rxtx_many(struct napi_struct *napi, int budget)
/* If all Rx work done, exit the polling mode */
if (work_done < budget) {
napi_complete(napi);
- if (adapter->itr_setting & 1)
+ if (adapter->rx_itr_setting & 1)
ixgbe_set_itr_msix(q_vector);
if (!test_bit(__IXGBE_DOWN, &adapter->state))
ixgbe_irq_enable_queues(adapter,
@@ -1458,10 +1458,10 @@ static int ixgbe_clean_txonly(struct napi_struct *napi, int budget)
if (!ixgbe_clean_tx_irq(q_vector, tx_ring))
work_done = budget;
- /* If all Rx work done, exit the polling mode */
+ /* If all Tx work done, exit the polling mode */
if (work_done < budget) {
napi_complete(napi);
- if (adapter->itr_setting & 1)
+ if (adapter->tx_itr_setting & 1)
ixgbe_set_itr_msix(q_vector);
if (!test_bit(__IXGBE_DOWN, &adapter->state))
ixgbe_irq_enable_queues(adapter, ((u64)1 << q_vector->v_idx));
@@ -1848,7 +1848,7 @@ static void ixgbe_configure_msi_and_legacy(struct ixgbe_adapter *adapter)
struct ixgbe_hw *hw = &adapter->hw;
IXGBE_WRITE_REG(hw, IXGBE_EITR(0),
- EITR_INTS_PER_SEC_TO_REG(adapter->eitr_param));
+ EITR_INTS_PER_SEC_TO_REG(adapter->rx_eitr_param));
ixgbe_set_ivar(adapter, 0, 0, 0);
ixgbe_set_ivar(adapter, 1, 0, 0);
@@ -2991,7 +2991,7 @@ static int ixgbe_poll(struct napi_struct *napi, int budget)
/* If budget not fully consumed, exit the polling mode */
if (work_done < budget) {
napi_complete(napi);
- if (adapter->itr_setting & 1)
+ if (adapter->rx_itr_setting & 1)
ixgbe_set_itr(adapter);
if (!test_bit(__IXGBE_DOWN, &adapter->state))
ixgbe_irq_enable_queues(adapter, IXGBE_EIMS_RTX_QUEUE);
@@ -3601,7 +3601,10 @@ static int ixgbe_alloc_q_vectors(struct ixgbe_adapter *adapter)
if (!q_vector)
goto err_out;
q_vector->adapter = adapter;
- q_vector->eitr = adapter->eitr_param;
+ if (q_vector->txr_count && !q_vector->rxr_count)
+ q_vector->eitr = adapter->tx_eitr_param;
+ else
+ q_vector->eitr = adapter->rx_eitr_param;
q_vector->v_idx = q_idx;
netif_napi_add(adapter->netdev, &q_vector->napi, (*poll), 64);
adapter->q_vector[q_idx] = q_vector;
@@ -3870,8 +3873,10 @@ static int __devinit ixgbe_sw_init(struct ixgbe_adapter *adapter)
hw->fc.disable_fc_autoneg = false;
/* enable itr by default in dynamic mode */
- adapter->itr_setting = 1;
- adapter->eitr_param = 20000;
+ adapter->rx_itr_setting = 1;
+ adapter->rx_eitr_param = 20000;
+ adapter->tx_itr_setting = 1;
+ adapter->tx_eitr_param = 10000;
/* set defaults for eitr in MegaBytes */
adapter->eitr_low = 10;
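The mapping between the ethtool coalesce value in microseconds and the pair (eitr_param, itr_setting) used in the patch above can be sketched in plain C. This is an illustration only: struct itr_state, usecs_to_itr() and itr_to_usecs() are hypothetical names, and the SKETCH_* limits are placeholder values, since the driver's IXGBE_MIN_INT_RATE and IXGBE_MAX_INT_RATE definitions do not appear in the patch. Note that the early mixed-vector check in ixgbe_set_coalesce() divides by ec->tx_coalesce_usecs before any test for zero, so a zero value would seem to need a guard there; the sketch only divides after checking that the value is greater than 1.

```c
#include <stdint.h>

/* Placeholder limits, assumed for illustration only. */
#define SKETCH_MAX_INT_RATE	500000u
#define SKETCH_MIN_INT_RATE	1000u
/* Dynamic-mode default the patch uses for tx. */
#define SKETCH_DYNAMIC_RATE	10000u

struct itr_state {
	uint32_t eitr_param;	/* interrupts per second */
	uint32_t itr_setting;	/* 0 = off, 1 = dynamic, else fixed rate */
};

/* Translate a coalesce value in usecs; return 0 on success, -1 if out of range. */
static int usecs_to_itr(uint32_t usecs, struct itr_state *out)
{
	if (usecs > 1) {
		uint32_t rate = 1000000u / usecs;

		if (rate > SKETCH_MAX_INT_RATE || rate < SKETCH_MIN_INT_RATE)
			return -1;
		out->eitr_param = rate;
		/* The low bit is reserved to flag dynamic mode, so clear it. */
		out->itr_setting = rate & ~1u;
	} else if (usecs == 1) {
		/* 1 means dynamic mode. */
		out->eitr_param = SKETCH_DYNAMIC_RATE;
		out->itr_setting = 1;
	} else {
		/* 0 disables throttling: run at the highest rate. */
		out->eitr_param = SKETCH_MAX_INT_RATE;
		out->itr_setting = 0;
	}
	return 0;
}

/* Inverse mapping, following the switch in ixgbe_get_coalesce(). */
static uint32_t itr_to_usecs(const struct itr_state *s)
{
	switch (s->itr_setting) {
	case 0:
		return 0;
	case 1:
		return 1;
	default:
		return 1000000u / s->eitr_param;
	}
}
```

Keeping the rate and the mode in separate fields is what lets the polling routines test "itr_setting & 1" cheaply while the vector setup code reads the rate directly.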
* [net-2.6 PATCH 3/3] ixgbe: move rx queue RSC configuration to a separate function
From: Jeff Kirsher @ 2009-09-18 19:46 UTC (permalink / raw)
To: davem
Cc: netdev, gospo, Shannon Nelson, Peter P Waskiewicz Jr,
Don Skidmore, Jeff Kirsher
In-Reply-To: <20090918194533.28898.49436.stgit@localhost.localdomain>
From: Nelson, Shannon <shannon.nelson@intel.com>
Shorten ixgbe_configure_rx() and lessen indent depth.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ixgbe/ixgbe_main.c | 78 ++++++++++++++++++++++++----------------
1 files changed, 47 insertions(+), 31 deletions(-)
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 1aa9f6a..c407bd9 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -1970,6 +1970,50 @@ static u32 ixgbe_setup_mrqc(struct ixgbe_adapter *adapter)
}
/**
+ * ixgbe_configure_rscctl - enable RSC for the indicated ring
+ * @adapter: address of board private structure
+ * @index: index of ring to set
+ * @rx_buf_len: rx buffer length
+ **/
+static void ixgbe_configure_rscctl(struct ixgbe_adapter *adapter, int index,
+ int rx_buf_len)
+{
+ struct ixgbe_ring *rx_ring;
+ struct ixgbe_hw *hw = &adapter->hw;
+ int j;
+ u32 rscctrl;
+
+ rx_ring = &adapter->rx_ring[index];
+ j = rx_ring->reg_idx;
+ rscctrl = IXGBE_READ_REG(hw, IXGBE_RSCCTL(j));
+ rscctrl |= IXGBE_RSCCTL_RSCEN;
+ /*
+ * we must limit the number of descriptors so that the
+ * total size of max desc * buf_len is not greater
+ * than 65535
+ */
+ if (rx_ring->flags & IXGBE_RING_RX_PS_ENABLED) {
+#if (MAX_SKB_FRAGS > 16)
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_16;
+#elif (MAX_SKB_FRAGS > 8)
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_8;
+#elif (MAX_SKB_FRAGS > 4)
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_4;
+#else
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_1;
+#endif
+ } else {
+ if (rx_buf_len < IXGBE_RXBUFFER_4096)
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_16;
+ else if (rx_buf_len < IXGBE_RXBUFFER_8192)
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_8;
+ else
+ rscctrl |= IXGBE_RSCCTL_MAXDESC_4;
+ }
+ IXGBE_WRITE_REG(hw, IXGBE_RSCCTL(j), rscctrl);
+}
+
+/**
* ixgbe_configure_rx - Configure 8259x Receive Unit after Reset
* @adapter: board private structure
*
@@ -1990,7 +2034,6 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
u32 fctrl, hlreg0;
u32 reta = 0, mrqc = 0;
u32 rdrxctl;
- u32 rscctrl;
int rx_buf_len;
/* Decide whether to use packet split mode or not */
@@ -2148,36 +2191,9 @@ static void ixgbe_configure_rx(struct ixgbe_adapter *adapter)
if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED) {
/* Enable 82599 HW-RSC */
- for (i = 0; i < adapter->num_rx_queues; i++) {
- rx_ring = &adapter->rx_ring[i];
- j = rx_ring->reg_idx;
- rscctrl = IXGBE_READ_REG(hw, IXGBE_RSCCTL(j));
- rscctrl |= IXGBE_RSCCTL_RSCEN;
- /*
- * we must limit the number of descriptors so that the
- * total size of max desc * buf_len is not greater
- * than 65535
- */
- if (rx_ring->flags & IXGBE_RING_RX_PS_ENABLED) {
-#if (MAX_SKB_FRAGS > 16)
- rscctrl |= IXGBE_RSCCTL_MAXDESC_16;
-#elif (MAX_SKB_FRAGS > 8)
- rscctrl |= IXGBE_RSCCTL_MAXDESC_8;
-#elif (MAX_SKB_FRAGS > 4)
- rscctrl |= IXGBE_RSCCTL_MAXDESC_4;
-#else
- rscctrl |= IXGBE_RSCCTL_MAXDESC_1;
-#endif
- } else {
- if (rx_buf_len < IXGBE_RXBUFFER_4096)
- rscctrl |= IXGBE_RSCCTL_MAXDESC_16;
- else if (rx_buf_len < IXGBE_RXBUFFER_8192)
- rscctrl |= IXGBE_RSCCTL_MAXDESC_8;
- else
- rscctrl |= IXGBE_RSCCTL_MAXDESC_4;
- }
- IXGBE_WRITE_REG(hw, IXGBE_RSCCTL(j), rscctrl);
- }
+ for (i = 0; i < adapter->num_rx_queues; i++)
+ ixgbe_configure_rscctl(adapter, i, rx_buf_len);
+
/* Disable RSC for ACK packets */
IXGBE_WRITE_REG(hw, IXGBE_RSCDBU,
(IXGBE_RSCDBU_RSCACKDIS | IXGBE_READ_REG(hw, IXGBE_RSCDBU)));
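The descriptor limit chosen in ixgbe_configure_rscctl() for the non-packet-split case follows from the constraint quoted in its comment, that max descriptors times buffer length must not exceed 65535. A small sketch of that selection, assuming IXGBE_RXBUFFER_4096 and IXGBE_RXBUFFER_8192 are the byte counts their names suggest; rsc_max_desc() is a hypothetical helper that returns a count instead of the IXGBE_RSCCTL_MAXDESC_* register bits:

```c
/* Assumed to match IXGBE_RXBUFFER_4096 and IXGBE_RXBUFFER_8192. */
#define SKETCH_RXBUFFER_4096	4096
#define SKETCH_RXBUFFER_8192	8192

/* Total bytes one coalesced receive may span, per the patch comment. */
#define SKETCH_RSC_MAX_BYTES	65535

/*
 * Pick how many descriptors one RSC may use for a given rx buffer
 * length, mirroring the else branch of the patch.
 */
static int rsc_max_desc(int rx_buf_len)
{
	if (rx_buf_len < SKETCH_RXBUFFER_4096)
		return 16;
	else if (rx_buf_len < SKETCH_RXBUFFER_8192)
		return 8;
	else
		return 4;
}
```

With these thresholds the product stays within SKETCH_RSC_MAX_BYTES for any buffer shorter than 16384 bytes; larger buffers would need a smaller limit than this sketch provides.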
* Re: [PATCH net-next-2.6] bonding: introduce primary_reselect option
From: Jiri Pirko @ 2009-09-18 19:52 UTC (permalink / raw)
To: Nicolas de Pesloüan; +Cc: netdev, davem, fubar, bonding-devel
In-Reply-To: <4AB3E03B.3070205@free.fr>
Fri, Sep 18, 2009 at 09:32:11PM CEST, nicolas.2p.debian@free.fr wrote:
> Jiri Pirko a écrit :
>> (updated 3)
>>
>> In some cases it is not desirable to switch back to the primary interface when
>> its link recovers; staying with the currently active one is preferable. We need to avoid
>> packet loss as much as we can in some cases. This is solved by introducing the
>> primary_reselect option. Note that an enslaved primary slave is set as the current
>> active slave no matter what.
>>
>> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>>
>> diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
>> index d5181ce..fd650e0 100644
>> --- a/Documentation/networking/bonding.txt
>> +++ b/Documentation/networking/bonding.txt
>> @@ -614,6 +614,32 @@ primary
>> The primary option is only valid for active-backup mode.
>> +primary_reselect
>> +
>> + Specifies the behavior of the current active slave when the primary was
>> + down and comes back up. This option is designed to prevent
>> + flip-flopping between the primary slave and other slaves. The possible
>> + values and their respective effects are:
>> +
>> + always or 0 (default)
>> +
>> + The primary slave becomes the active slave whenever it comes
>> + back up.
>> +
>> + better or 1
>> +
>> + The primary slave becomes the active slave when it comes back
>> + up, if the speed and duplex of the primary slave is better
>> + than the speed and duplex of the current active slave.
>> +
>> + failure or 2
>> +
>> + The primary slave becomes the active slave only if the current
>> + active slave fails and the primary slave is up.
>> +
>> + When no slave is active, if the primary comes back up, it becomes the
>> + active slave, regardless of the value of primary_reselect.
>> +
>> updelay
>> Specifies the time, in milliseconds, to wait before enabling a
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 699bfdd..1127361 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -94,6 +94,7 @@ static int downdelay;
>> static int use_carrier = 1;
>> static char *mode;
>> static char *primary;
>> +static char *primary_reselect;
>> static char *lacp_rate;
>> static char *ad_select;
>> static char *xmit_hash_policy;
>> @@ -126,6 +127,13 @@ MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
>> "6 for balance-alb");
>> module_param(primary, charp, 0);
>> MODULE_PARM_DESC(primary, "Primary network device to use");
>> +module_param(primary_reselect, charp, 0);
>> +MODULE_PARM_DESC(primary_reselect, "Reselect primary slave "
>> + "once it comes up; "
>> + "0 for always (default), "
>> + "1 for only if speed of primary is not "
>> + "better, "
>> + "2 for never");
>
> You should remove "not" for option value 1 and use the word failure for
> option value 2.
>
> MODULE_PARM_DESC(primary_reselect, "Reselect primary slave "
> "once it comes up; "
> "0 for always (default), "
> "1 for only if speed of primary is "
> "better, "
> "2 for only on active slave "
> "failure");
Okay, I wasn't sure how to put it here. This sounds good, going to resend.
Thanks Nicolas.
>
> Apart from this small detail, this sounds good to me.
>
>> module_param(lacp_rate, charp, 0);
>> MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner "
>> "(slow/fast)");
>> @@ -200,6 +208,13 @@ const struct bond_parm_tbl fail_over_mac_tbl[] = {
>> { NULL, -1},
>> };
>> +const struct bond_parm_tbl pri_reselect_tbl[] = {
>> +{ "always", BOND_PRI_RESELECT_ALWAYS},
>> +{ "better", BOND_PRI_RESELECT_BETTER},
>> +{ "failure", BOND_PRI_RESELECT_FAILURE},
>> +{ NULL, -1},
>> +};
>> +
>> struct bond_parm_tbl ad_select_tbl[] = {
>> { "stable", BOND_AD_STABLE},
>> { "bandwidth", BOND_AD_BANDWIDTH},
>> @@ -1070,6 +1085,25 @@ out:
>> }
>> +static bool bond_should_change_active(struct bonding *bond)
>> +{
>> + struct slave *prim = bond->primary_slave;
>> + struct slave *curr = bond->curr_active_slave;
>> +
>> + if (!prim || !curr || curr->link != BOND_LINK_UP)
>> + return true;
>> + if (bond->force_primary) {
>> + bond->force_primary = false;
>> + return true;
>> + }
>> + if (bond->params.primary_reselect == BOND_PRI_RESELECT_BETTER &&
>> + (prim->speed < curr->speed ||
>> + (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
>> + return false;
>> + if (bond->params.primary_reselect == BOND_PRI_RESELECT_FAILURE)
>> + return false;
>> + return true;
>> +}
>> /**
>> * find_best_interface - select the best available slave to be the active one
>> @@ -1094,7 +1128,8 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
>> }
>> if ((bond->primary_slave) &&
>> - bond->primary_slave->link == BOND_LINK_UP) {
>> + bond->primary_slave->link == BOND_LINK_UP &&
>> + bond_should_change_active(bond)) {
>> new_active = bond->primary_slave;
>> }
>> @@ -1675,8 +1710,10 @@ int bond_enslave(struct net_device *bond_dev,
>> struct net_device *slave_dev)
>> if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
>> /* if there is a primary slave, remember it */
>> - if (strcmp(bond->params.primary, new_slave->dev->name) == 0)
>> + if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
>> bond->primary_slave = new_slave;
>> + bond->force_primary = true;
>> + }
>> }
>> write_lock_bh(&bond->curr_slave_lock);
>> @@ -4643,7 +4680,7 @@ int bond_parse_parm(const char *buf, const struct bond_parm_tbl *tbl)
>> static int bond_check_params(struct bond_params *params)
>> {
>> - int arp_validate_value, fail_over_mac_value;
>> + int arp_validate_value, fail_over_mac_value, primary_reselect_value;
>> /*
>> * Convert string parameters.
>> @@ -4942,6 +4979,20 @@ static int bond_check_params(struct bond_params *params)
>> primary = NULL;
>> }
>> + if (primary && primary_reselect) {
>> + primary_reselect_value = bond_parse_parm(primary_reselect,
>> + pri_reselect_tbl);
>> + if (primary_reselect_value == -1) {
>> + pr_err(DRV_NAME
>> + ": Error: Invalid primary_reselect \"%s\"\n",
>> + primary_reselect ==
>> + NULL ? "NULL" : primary_reselect);
>> + return -EINVAL;
>> + }
>> + } else {
>> + primary_reselect_value = BOND_PRI_RESELECT_ALWAYS;
>> + }
>> +
>> if (fail_over_mac) {
>> fail_over_mac_value = bond_parse_parm(fail_over_mac,
>> fail_over_mac_tbl);
>> @@ -4973,6 +5024,7 @@ static int bond_check_params(struct bond_params *params)
>> params->use_carrier = use_carrier;
>> params->lacp_fast = lacp_fast;
>> params->primary[0] = 0;
>> + params->primary_reselect = primary_reselect_value;
>> params->fail_over_mac = fail_over_mac_value;
>> if (primary) {
>> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>> index 6044e12..42c44f2 100644
>> --- a/drivers/net/bonding/bond_sysfs.c
>> +++ b/drivers/net/bonding/bond_sysfs.c
>> @@ -1212,6 +1212,61 @@ static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR,
>> bonding_show_primary, bonding_store_primary);
>> /*
>> + * Show and set the primary_reselect flag.
>> + */
>> +static ssize_t bonding_show_primary_reselect(struct device *d,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct bonding *bond = to_bond(d);
>> +
>> + return sprintf(buf, "%s %d\n",
>> + pri_reselect_tbl[bond->params.primary_reselect].modename,
>> + bond->params.primary_reselect);
>> +}
>> +
>> +static ssize_t bonding_store_primary_reselect(struct device *d,
>> + struct device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + int new_value, ret = count;
>> + struct bonding *bond = to_bond(d);
>> +
>> + if (!rtnl_trylock())
>> + return restart_syscall();
>> +
>> + new_value = bond_parse_parm(buf, pri_reselect_tbl);
>> + if (new_value < 0) {
>> + pr_err(DRV_NAME
>> + ": %s: Ignoring invalid primary_reselect value %.*s.\n",
>> + bond->dev->name,
>> + (int) strlen(buf) - 1, buf);
>> + ret = -EINVAL;
>> + goto out;
>> + } else {
>> + bond->params.primary_reselect = new_value;
>> + pr_info(DRV_NAME ": %s: setting primary_reselect to %s (%d).\n",
>> + bond->dev->name, pri_reselect_tbl[new_value].modename,
>> + new_value);
>> + if (new_value == BOND_PRI_RESELECT_ALWAYS ||
>> + new_value == BOND_PRI_RESELECT_BETTER) {
>> + bond->force_primary = true;
>> + read_lock(&bond->lock);
>> + write_lock_bh(&bond->curr_slave_lock);
>> + bond_select_active_slave(bond);
>> + write_unlock_bh(&bond->curr_slave_lock);
>> + read_unlock(&bond->lock);
>> + }
>> + }
>> +out:
>> + rtnl_unlock();
>> + return ret;
>> +}
>> +static DEVICE_ATTR(primary_reselect, S_IRUGO | S_IWUSR,
>> + bonding_show_primary_reselect,
>> + bonding_store_primary_reselect);
>> +
>> +/*
>> * Show and set the use_carrier flag.
>> */
>> static ssize_t bonding_show_carrier(struct device *d,
>> @@ -1500,6 +1555,7 @@ static struct attribute *per_bond_attrs[] = {
>> &dev_attr_num_unsol_na.attr,
>> &dev_attr_miimon.attr,
>> &dev_attr_primary.attr,
>> + &dev_attr_primary_reselect.attr,
>> &dev_attr_use_carrier.attr,
>> &dev_attr_active_slave.attr,
>> &dev_attr_mii_status.attr,
>> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>> index 6824771..b5b1530 100644
>> --- a/drivers/net/bonding/bonding.h
>> +++ b/drivers/net/bonding/bonding.h
>> @@ -131,6 +131,7 @@ struct bond_params {
>> int lacp_fast;
>> int ad_select;
>> char primary[IFNAMSIZ];
>> + int primary_reselect;
>> __be32 arp_targets[BOND_MAX_ARP_TARGETS];
>> };
>> @@ -190,6 +191,7 @@ struct bonding {
>> struct slave *curr_active_slave;
>> struct slave *current_arp_slave;
>> struct slave *primary_slave;
>> + bool force_primary;
>> s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
>> rwlock_t lock;
>> rwlock_t curr_slave_lock;
>> @@ -258,6 +260,10 @@ static inline bool bond_is_lb(const struct bonding *bond)
>> || bond->params.mode == BOND_MODE_ALB;
>> }
>> +#define BOND_PRI_RESELECT_ALWAYS 0
>> +#define BOND_PRI_RESELECT_BETTER 1
>> +#define BOND_PRI_RESELECT_FAILURE 2
>> +
>> #define BOND_FOM_NONE 0
>> #define BOND_FOM_ACTIVE 1
>> #define BOND_FOM_FOLLOW 2
>> @@ -348,6 +354,7 @@ extern const struct bond_parm_tbl bond_mode_tbl[];
>> extern const struct bond_parm_tbl xmit_hashtype_tbl[];
>> extern const struct bond_parm_tbl arp_validate_tbl[];
>> extern const struct bond_parm_tbl fail_over_mac_tbl[];
>> +extern const struct bond_parm_tbl pri_reselect_tbl[];
>> extern struct bond_parm_tbl ad_select_tbl[];
>> #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
>>
>
> Nicolas.
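The patch documents each primary_reselect value as "always or 0", "better or 1" and "failure or 2", and resolves the string through pri_reselect_tbl. The body of bond_parse_parm() is not part of this thread, so the following is not that function; it is a hypothetical user-space sketch (struct parm_tbl, parse_parm()) of how a NULL-terminated name/value table can accept either spelling.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bond_parm_tbl. */
struct parm_tbl {
	const char *modename;
	int mode;
};

static const struct parm_tbl pri_reselect_tbl[] = {
	{ "always", 0 },
	{ "better", 1 },
	{ "failure", 2 },
	{ NULL, -1 },
};

/*
 * Look up buf in tbl, accepting either the mode name or its number,
 * with an optional trailing newline. Returns the mode, or -1.
 */
static int parse_parm(const char *buf, const struct parm_tbl *tbl)
{
	char *end;
	long num = strtol(buf, &end, 10);
	int is_num = end != buf && (*end == '\0' || *end == '\n');
	size_t len = strcspn(buf, "\n");

	for (; tbl->modename; tbl++) {
		if (is_num) {
			if (num == tbl->mode)
				return tbl->mode;
		} else if (strlen(tbl->modename) == len &&
			   strncmp(buf, tbl->modename, len) == 0) {
			return tbl->mode;
		}
	}
	return -1;
}
```

Because the table entries are listed in value order starting at 0, the parsed value can also be used as an index into the table, which is what bonding_show_primary_reselect() relies on when it prints pri_reselect_tbl[...].modename.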
* [patch 1/1] net: fix CONFIG_NET=n build on sparc64
From: akpm @ 2009-09-18 19:52 UTC (permalink / raw)
To: davem; +Cc: netdev, akpm
From: Andrew Morton <akpm@linux-foundation.org>
sparc64 allnoconfig:
arch/sparc/kernel/built-in.o(.text+0x134e0): In function `sys32_recvfrom':
: undefined reference to `compat_sys_recvfrom'
arch/sparc/kernel/built-in.o(.text+0x134e4): In function `sys32_recvfrom':
: undefined reference to `compat_sys_recvfrom'
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/sys_ni.c | 1 +
1 file changed, 1 insertion(+)
diff -puN kernel/sys_ni.c~net-fix-config_net=n-build-on-sparc64 kernel/sys_ni.c
--- a/kernel/sys_ni.c~net-fix-config_net=n-build-on-sparc64
+++ a/kernel/sys_ni.c
@@ -49,6 +49,7 @@ cond_syscall(sys_sendmsg);
cond_syscall(compat_sys_sendmsg);
cond_syscall(sys_recvmsg);
cond_syscall(compat_sys_recvmsg);
+cond_syscall(compat_sys_recvfrom);
cond_syscall(sys_socketcall);
cond_syscall(sys_futex);
cond_syscall(compat_sys_futex);
_
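
For readers unfamiliar with cond_syscall(): it provides a weak fallback so that the symbol still resolves when the real implementation is compiled out, and the fallback reports "not implemented". A minimal user-space sketch of the same idea follows. It assumes a GCC/Clang toolchain, and the demo_* names are hypothetical; this is not the kernel macro itself.

```c
#include <errno.h>

/*
 * Weak default definition (GCC/Clang extension). The linker uses it only
 * when no other object file supplies a strong demo_compat_recvfrom().
 * With CONFIG_NET=n there is no strong definition, so callers end up here.
 */
__attribute__((weak)) long demo_compat_recvfrom(void)
{
	return -ENOSYS;	/* "function not implemented" */
}

/* Hypothetical caller standing in for sys32_recvfrom on sparc64. */
long demo_dispatch(void)
{
	return demo_compat_recvfrom();
}
```

Without the fallback the caller would reference a symbol that nobody defines, which is the link error quoted in the changelog.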
^ permalink raw reply
* [patch 2/6] hfc_usb: fix read buffer overflow
From: akpm @ 2009-09-18 19:53 UTC (permalink / raw)
To: isdn; +Cc: netdev, akpm, roel.kluin
From: Roel Kluin <roel.kluin@gmail.com>
Check whether index is within bounds before testing the element.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/isdn/hisax/hfc_usb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff -puN drivers/isdn/hisax/hfc_usb.c~hfc_usb-fix-read-buffer-overflow drivers/isdn/hisax/hfc_usb.c
--- a/drivers/isdn/hisax/hfc_usb.c~hfc_usb-fix-read-buffer-overflow
+++ a/drivers/isdn/hisax/hfc_usb.c
@@ -817,8 +817,8 @@ collect_rx_frame(usb_fifo * fifo, __u8 *
}
/* we have a complete hdlc packet */
if (finish) {
- if ((!fifo->skbuff->data[fifo->skbuff->len - 1])
- && (fifo->skbuff->len > 3)) {
+ if (fifo->skbuff->len > 3 &&
+ !fifo->skbuff->data[fifo->skbuff->len - 1]) {
if (fifon == HFCUSB_D_RX) {
DBG(HFCUSB_DBG_DCHANNEL,
_
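
The reordering above relies on short-circuit evaluation of &&: the length test has to run first so that the index is never computed for a too-short buffer. A small stand-alone sketch, with a hypothetical struct demo_buf standing in for the skb fields:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the skb fields used in collect_rx_frame(). */
struct demo_buf {
	const unsigned char *data;
	size_t len;
};

/*
 * True when the frame is longer than 3 bytes and its last byte is zero.
 * Because the length test comes first, data[len - 1] is never evaluated
 * for an empty buffer, where the unsigned len - 1 would wrap around.
 */
static bool demo_frame_ok(const struct demo_buf *b)
{
	return b->len > 3 && !b->data[b->len - 1];
}
```

With the operands in the original order, the element would be read before the length was known to be valid.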
^ permalink raw reply
* [patch 1/6] isdn: hisax, fix lock imbalance
From: akpm @ 2009-09-18 19:53 UTC (permalink / raw)
To: isdn; +Cc: netdev, akpm, jirislaby, Karsten-Keil
From: Jiri Slaby <jirislaby@gmail.com>
Add omitted unlocks to 2 functions.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Karsten Keil <Karsten-Keil@t-online.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/isdn/hisax/amd7930_fn.c | 1 +
drivers/isdn/hisax/icc.c | 1 +
2 files changed, 2 insertions(+)
diff -puN drivers/isdn/hisax/amd7930_fn.c~isdn-hisax-fix-lock-imbalance drivers/isdn/hisax/amd7930_fn.c
--- a/drivers/isdn/hisax/amd7930_fn.c~isdn-hisax-fix-lock-imbalance
+++ a/drivers/isdn/hisax/amd7930_fn.c
@@ -594,6 +594,7 @@ Amd7930_l1hw(struct PStack *st, int pr,
if (cs->debug & L1_DEB_WARN)
debugl1(cs, "Amd7930: l1hw: l2l1 tx_skb exist this shouldn't happen");
skb_queue_tail(&cs->sq, skb);
+ spin_unlock_irqrestore(&cs->lock, flags);
break;
}
if (cs->debug & DEB_DLOG_HEX)
diff -puN drivers/isdn/hisax/icc.c~isdn-hisax-fix-lock-imbalance drivers/isdn/hisax/icc.c
--- a/drivers/isdn/hisax/icc.c~isdn-hisax-fix-lock-imbalance
+++ a/drivers/isdn/hisax/icc.c
@@ -468,6 +468,7 @@ ICC_l1hw(struct PStack *st, int pr, void
if (cs->debug & L1_DEB_WARN)
debugl1(cs, " l2l1 tx_skb exist this shouldn't happen");
skb_queue_tail(&cs->sq, skb);
+ spin_unlock_irqrestore(&cs->lock, flags);
break;
}
if (cs->debug & DEB_DLOG_HEX)
_
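
The pattern being fixed is an early exit that leaves the lock held. A user-space sketch of the corrected shape, using a C11 atomic_flag as a toy spinlock; all demo_* names are hypothetical and nothing here is hisax code:

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static int demo_queued;

static void demo_spin_lock(void)
{
	while (atomic_flag_test_and_set(&demo_lock))
		;	/* spin until the flag was clear */
}

static void demo_spin_unlock(void)
{
	atomic_flag_clear(&demo_lock);
}

/*
 * Returns 1 when the item had to be queued (early exit), 0 otherwise.
 * Both paths release the lock before leaving the function.
 */
static int demo_submit(bool tx_busy)
{
	demo_spin_lock();
	if (tx_busy) {
		demo_queued++;
		demo_spin_unlock();	/* the unlock the early exit needs */
		return 1;
	}
	demo_spin_unlock();
	return 0;
}
```

If the unlock in the tx_busy branch were missing, the next demo_submit() call would spin forever.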
^ permalink raw reply
* [PATCH net-next-2.6 v2] bonding: introduce primary_reselect option
From: Jiri Pirko @ 2009-09-18 19:53 UTC (permalink / raw)
To: netdev; +Cc: davem, fubar, bonding-devel, nicolas.2p.debian
(updated 4)
In some cases it is not desirable to switch back to the primary interface when
its link recovers; it is better to stay with the currently active one. We need
to avoid packet loss as much as we can in some cases. This is solved by
introducing the primary_reselect option. Note that an enslaved primary slave is
set as the current active slave no matter what.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index d5181ce..fd650e0 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -614,6 +614,32 @@ primary
The primary option is only valid for active-backup mode.
+primary_reselect
+
+ Specifies the behavior of the current active slave when the primary was
+ down and comes back up. This option is designed to prevent
+ flip-flopping between the primary slave and other slaves. The possible
+ values and their respective effects are:
+
+ always or 0 (default)
+
+ The primary slave becomes the active slave whenever it comes
+ back up.
+
+ better or 1
+
+ The primary slave becomes the active slave when it comes back
+ up, if the speed and duplex of the primary slave are better
+ than the speed and duplex of the current active slave.
+
+ failure or 2
+
+ The primary slave becomes the active slave only if the current
+ active slave fails and the primary slave is up.
+
+ When no slave is active, if the primary comes back up, it becomes the
+ active slave, regardless of the value of primary_reselect.
+
updelay
Specifies the time, in milliseconds, to wait before enabling a
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 699bfdd..8f8a6cc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -94,6 +94,7 @@ static int downdelay;
static int use_carrier = 1;
static char *mode;
static char *primary;
+static char *primary_reselect;
static char *lacp_rate;
static char *ad_select;
static char *xmit_hash_policy;
@@ -126,6 +127,14 @@ MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
"6 for balance-alb");
module_param(primary, charp, 0);
MODULE_PARM_DESC(primary, "Primary network device to use");
+module_param(primary_reselect, charp, 0);
+MODULE_PARM_DESC(primary_reselect, "Reselect primary slave "
+ "once it comes up; "
+ "0 for always (default), "
+ "1 for only if speed of primary is "
+ "better, "
+ "2 for only on active slave "
+ "failure");
module_param(lacp_rate, charp, 0);
MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner "
"(slow/fast)");
@@ -200,6 +209,13 @@ const struct bond_parm_tbl fail_over_mac_tbl[] = {
{ NULL, -1},
};
+const struct bond_parm_tbl pri_reselect_tbl[] = {
+{ "always", BOND_PRI_RESELECT_ALWAYS},
+{ "better", BOND_PRI_RESELECT_BETTER},
+{ "failure", BOND_PRI_RESELECT_FAILURE},
+{ NULL, -1},
+};
+
struct bond_parm_tbl ad_select_tbl[] = {
{ "stable", BOND_AD_STABLE},
{ "bandwidth", BOND_AD_BANDWIDTH},
@@ -1070,6 +1086,25 @@ out:
}
+static bool bond_should_change_active(struct bonding *bond)
+{
+ struct slave *prim = bond->primary_slave;
+ struct slave *curr = bond->curr_active_slave;
+
+ if (!prim || !curr || curr->link != BOND_LINK_UP)
+ return true;
+ if (bond->force_primary) {
+ bond->force_primary = false;
+ return true;
+ }
+ if (bond->params.primary_reselect == BOND_PRI_RESELECT_BETTER &&
+ (prim->speed < curr->speed ||
+ (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
+ return false;
+ if (bond->params.primary_reselect == BOND_PRI_RESELECT_FAILURE)
+ return false;
+ return true;
+}
/**
* find_best_interface - select the best available slave to be the active one
@@ -1094,7 +1129,8 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
}
if ((bond->primary_slave) &&
- bond->primary_slave->link == BOND_LINK_UP) {
+ bond->primary_slave->link == BOND_LINK_UP &&
+ bond_should_change_active(bond)) {
new_active = bond->primary_slave;
}
@@ -1675,8 +1711,10 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
/* if there is a primary slave, remember it */
- if (strcmp(bond->params.primary, new_slave->dev->name) == 0)
+ if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
bond->primary_slave = new_slave;
+ bond->force_primary = true;
+ }
}
write_lock_bh(&bond->curr_slave_lock);
@@ -4643,7 +4681,7 @@ int bond_parse_parm(const char *buf, const struct bond_parm_tbl *tbl)
static int bond_check_params(struct bond_params *params)
{
- int arp_validate_value, fail_over_mac_value;
+ int arp_validate_value, fail_over_mac_value, primary_reselect_value;
/*
* Convert string parameters.
@@ -4942,6 +4980,20 @@ static int bond_check_params(struct bond_params *params)
primary = NULL;
}
+ if (primary && primary_reselect) {
+ primary_reselect_value = bond_parse_parm(primary_reselect,
+ pri_reselect_tbl);
+ if (primary_reselect_value == -1) {
+ pr_err(DRV_NAME
+ ": Error: Invalid primary_reselect \"%s\"\n",
+ primary_reselect ==
+ NULL ? "NULL" : primary_reselect);
+ return -EINVAL;
+ }
+ } else {
+ primary_reselect_value = BOND_PRI_RESELECT_ALWAYS;
+ }
+
if (fail_over_mac) {
fail_over_mac_value = bond_parse_parm(fail_over_mac,
fail_over_mac_tbl);
@@ -4973,6 +5025,7 @@ static int bond_check_params(struct bond_params *params)
params->use_carrier = use_carrier;
params->lacp_fast = lacp_fast;
params->primary[0] = 0;
+ params->primary_reselect = primary_reselect_value;
params->fail_over_mac = fail_over_mac_value;
if (primary) {
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 6044e12..42c44f2 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1212,6 +1212,61 @@ static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR,
bonding_show_primary, bonding_store_primary);
/*
+ * Show and set the primary_reselect flag.
+ */
+static ssize_t bonding_show_primary_reselect(struct device *d,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct bonding *bond = to_bond(d);
+
+ return sprintf(buf, "%s %d\n",
+ pri_reselect_tbl[bond->params.primary_reselect].modename,
+ bond->params.primary_reselect);
+}
+
+static ssize_t bonding_store_primary_reselect(struct device *d,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ int new_value, ret = count;
+ struct bonding *bond = to_bond(d);
+
+ if (!rtnl_trylock())
+ return restart_syscall();
+
+ new_value = bond_parse_parm(buf, pri_reselect_tbl);
+ if (new_value < 0) {
+ pr_err(DRV_NAME
+ ": %s: Ignoring invalid primary_reselect value %.*s.\n",
+ bond->dev->name,
+ (int) strlen(buf) - 1, buf);
+ ret = -EINVAL;
+ goto out;
+ } else {
+ bond->params.primary_reselect = new_value;
+ pr_info(DRV_NAME ": %s: setting primary_reselect to %s (%d).\n",
+ bond->dev->name, pri_reselect_tbl[new_value].modename,
+ new_value);
+ if (new_value == BOND_PRI_RESELECT_ALWAYS ||
+ new_value == BOND_PRI_RESELECT_BETTER) {
+ bond->force_primary = true;
+ read_lock(&bond->lock);
+ write_lock_bh(&bond->curr_slave_lock);
+ bond_select_active_slave(bond);
+ write_unlock_bh(&bond->curr_slave_lock);
+ read_unlock(&bond->lock);
+ }
+ }
+out:
+ rtnl_unlock();
+ return ret;
+}
+static DEVICE_ATTR(primary_reselect, S_IRUGO | S_IWUSR,
+ bonding_show_primary_reselect,
+ bonding_store_primary_reselect);
+
+/*
* Show and set the use_carrier flag.
*/
static ssize_t bonding_show_carrier(struct device *d,
@@ -1500,6 +1555,7 @@ static struct attribute *per_bond_attrs[] = {
&dev_attr_num_unsol_na.attr,
&dev_attr_miimon.attr,
&dev_attr_primary.attr,
+ &dev_attr_primary_reselect.attr,
&dev_attr_use_carrier.attr,
&dev_attr_active_slave.attr,
&dev_attr_mii_status.attr,
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 6824771..b5b1530 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -131,6 +131,7 @@ struct bond_params {
int lacp_fast;
int ad_select;
char primary[IFNAMSIZ];
+ int primary_reselect;
__be32 arp_targets[BOND_MAX_ARP_TARGETS];
};
@@ -190,6 +191,7 @@ struct bonding {
struct slave *curr_active_slave;
struct slave *current_arp_slave;
struct slave *primary_slave;
+ bool force_primary;
s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
rwlock_t lock;
rwlock_t curr_slave_lock;
@@ -258,6 +260,10 @@ static inline bool bond_is_lb(const struct bonding *bond)
|| bond->params.mode == BOND_MODE_ALB;
}
+#define BOND_PRI_RESELECT_ALWAYS 0
+#define BOND_PRI_RESELECT_BETTER 1
+#define BOND_PRI_RESELECT_FAILURE 2
+
#define BOND_FOM_NONE 0
#define BOND_FOM_ACTIVE 1
#define BOND_FOM_FOLLOW 2
@@ -348,6 +354,7 @@ extern const struct bond_parm_tbl bond_mode_tbl[];
extern const struct bond_parm_tbl xmit_hashtype_tbl[];
extern const struct bond_parm_tbl arp_validate_tbl[];
extern const struct bond_parm_tbl fail_over_mac_tbl[];
+extern const struct bond_parm_tbl pri_reselect_tbl[];
extern struct bond_parm_tbl ad_select_tbl[];
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
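
To make the three policies easier to reason about, here is a user-space sketch that mirrors the decision in bond_should_change_active() from the patch above. The demo_* types are hypothetical simplifications of struct slave and struct bonding; only the comparison logic is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_reselect { DEMO_ALWAYS = 0, DEMO_BETTER = 1, DEMO_FAILURE = 2 };

/* Hypothetical, simplified view of a slave. */
struct demo_slave {
	bool link_up;
	int speed;	/* e.g. Mbit/s */
	int duplex;	/* 0 = half, 1 = full */
};

/* Returns true when the primary should take over from the current slave. */
static bool demo_should_change_active(const struct demo_slave *prim,
				      const struct demo_slave *curr,
				      enum demo_reselect policy,
				      bool *force_primary)
{
	/* No usable current slave: always switch. */
	if (!prim || !curr || !curr->link_up)
		return true;
	/* One-shot override, e.g. right after the primary was enslaved. */
	if (*force_primary) {
		*force_primary = false;
		return true;
	}
	/* "better": stay put unless the primary is strictly better. */
	if (policy == DEMO_BETTER &&
	    (prim->speed < curr->speed ||
	     (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
		return false;
	/* "failure": never switch while the current slave is up. */
	if (policy == DEMO_FAILURE)
		return false;
	return true;
}
```

Note that with "better" a primary of equal speed and duplex does not take over, since the comparison requires a strict improvement.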
^ permalink raw reply related
* Re: [PATCH] i2400m: minimal ethtool support
From: Inaky Perez-Gonzalez @ 2009-09-18 19:57 UTC (permalink / raw)
To: Dan Williams; +Cc: netdev, wimax@linuxwimax.org
In-Reply-To: <1253217974.11454.12.camel@localhost.localdomain>
On Thu, 2009-09-17 at 13:06 -0700, Dan Williams wrote:
> Add minimal ethtool support for carrier detection.
>
> Signed-off-by: Dan Williams <dcbw@redhat.com>
Merged, thanks
--
-- Inaky
^ permalink raw reply
* [patch 4/6] mISDN: fix reversed if in st_own_ctrl()
From: akpm @ 2009-09-18 19:53 UTC (permalink / raw)
To: isdn; +Cc: netdev, akpm, error27
From: Dan Carpenter <error27@gmail.com>
The current code probably returns -EINVAL a lot. Otherwise it would oops.
Compile tested only. Found by smatch (http://repo.or.cz/w/smatch.git).
Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/isdn/mISDN/stack.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN drivers/isdn/mISDN/stack.c~misdn-fix-reversed-if-in-st_own_ctrl drivers/isdn/mISDN/stack.c
--- a/drivers/isdn/mISDN/stack.c~misdn-fix-reversed-if-in-st_own_ctrl
+++ a/drivers/isdn/mISDN/stack.c
@@ -364,7 +364,7 @@ add_layer2(struct mISDNchannel *ch, stru
static int
st_own_ctrl(struct mISDNchannel *ch, u_int cmd, void *arg)
{
- if (!ch->st || ch->st->layer1)
+ if (!ch->st || !ch->st->layer1)
return -EINVAL;
return ch->st->layer1->ctrl(ch->st->layer1, cmd, arg);
}
_
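
The corrected guard can be exercised outside the kernel with stand-in types. In this sketch the demo_* structs are hypothetical and carry only the members the check touches:

```c
#include <errno.h>
#include <stddef.h>

struct demo_layer1 {
	int (*ctrl)(struct demo_layer1 *l1, unsigned int cmd, void *arg);
};
struct demo_stack { struct demo_layer1 *layer1; };
struct demo_channel { struct demo_stack *st; };

/* Trivial ctrl handler for the demo: echoes the command number. */
static int demo_ctrl_echo(struct demo_layer1 *l1, unsigned int cmd, void *arg)
{
	(void)l1;
	(void)arg;
	return (int)cmd;
}

/* Same shape as the fixed st_own_ctrl(): reject when either link is missing. */
static int demo_own_ctrl(struct demo_channel *ch, unsigned int cmd, void *arg)
{
	if (!ch->st || !ch->st->layer1)
		return -EINVAL;
	return ch->st->layer1->ctrl(ch->st->layer1, cmd, arg);
}
```

With the reversed test, a channel that has a valid layer1 would get -EINVAL, and one with a NULL layer1 would fall through to the dereference.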
^ permalink raw reply
* [patch 3/6] isdn: fix netjet build errors
From: akpm @ 2009-09-18 19:53 UTC (permalink / raw)
To: isdn; +Cc: netdev, akpm, randy.dunlap
From: Randy Dunlap <randy.dunlap@oracle.com>
Fix netjet driver link errors when ISDN_I4L is not enabled:
drivers/built-in.o: In function `mode_tiger':
netjet.c:(.text+0x325dc8): undefined reference to `isdnhdlc_rcv_init'
netjet.c:(.text+0x325dd5): undefined reference to `isdnhdlc_out_init'
drivers/built-in.o: In function `fill_dma':
netjet.c:(.text+0x325fb6): undefined reference to `isdnhdlc_encode'
drivers/built-in.o: In function `read_dma':
netjet.c:(.text+0x32631a): undefined reference to `isdnhdlc_decode'
drivers/built-in.o: In function `nj_irq':
netjet.c:(.text+0x326e01): undefined reference to `isdnhdlc_encode'
or move isdnhdlc.c to some other sub-dir..
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/isdn/hardware/mISDN/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff -puN drivers/isdn/hardware/mISDN/Kconfig~isdn-fix-netjet-build-errors drivers/isdn/hardware/mISDN/Kconfig
--- a/drivers/isdn/hardware/mISDN/Kconfig~isdn-fix-netjet-build-errors
+++ a/drivers/isdn/hardware/mISDN/Kconfig
@@ -78,6 +78,7 @@ config MISDN_NETJET
depends on PCI
select MISDN_IPAC
select ISDN_HDLC
+ select ISDN_I4L # so that make will recurse into sub-dir.
help
Enable support for Traverse Technologies NETJet PCI cards.
_
^ permalink raw reply
* [patch 5/6] isdn: eicon, use offsetof
From: akpm @ 2009-09-18 19:53 UTC (permalink / raw)
To: isdn; +Cc: netdev, akpm, jirislaby, armin
From: Jiri Slaby <jirislaby@gmail.com>
Use offsetof instead of explicit implementation.
* fixes bug with omitted & like:
len = (byte)(((T30_INFO *) 0)->station_id + 20)
* avoids compiler warnings with wrong sizes (pointer-to-char cast):
len = (byte)(&(((T30_INFO *) 0)->universal_6));
* cleans up the code
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Acked-by: Armin Schindler <armin@melware.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/isdn/hardware/eicon/message.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff -puN drivers/isdn/hardware/eicon/message.c~isdn-eicon-use-offsetof drivers/isdn/hardware/eicon/message.c
--- a/drivers/isdn/hardware/eicon/message.c~isdn-eicon-use-offsetof
+++ a/drivers/isdn/hardware/eicon/message.c
@@ -2692,7 +2692,7 @@ static byte connect_b3_req(dword Id, wor
if (!(fax_control_bits & T30_CONTROL_BIT_MORE_DOCUMENTS)
|| (fax_feature_bits & T30_FEATURE_BIT_MORE_DOCUMENTS))
{
- len = (byte)(&(((T30_INFO *) 0)->universal_6));
+ len = offsetof(T30_INFO, universal_6);
fax_info_change = false;
if (ncpi->length >= 4)
{
@@ -2754,7 +2754,7 @@ static byte connect_b3_req(dword Id, wor
for (i = 0; i < w; i++)
((T30_INFO *)(plci->fax_connect_info_buffer))->station_id[i] = fax_parms[4].info[1+i];
((T30_INFO *)(plci->fax_connect_info_buffer))->head_line_len = 0;
- len = (byte)(((T30_INFO *) 0)->station_id + 20);
+ len = offsetof(T30_INFO, station_id) + 20;
w = fax_parms[5].length;
if (w > 20)
w = 20;
@@ -2788,7 +2788,7 @@ static byte connect_b3_req(dword Id, wor
}
else
{
- len = (byte)(&(((T30_INFO *) 0)->universal_6));
+ len = offsetof(T30_INFO, universal_6);
}
fax_info_change = true;
@@ -2892,7 +2892,7 @@ static byte connect_b3_res(dword Id, wor
&& (plci->nsf_control_bits & T30_NSF_CONTROL_BIT_ENABLE_NSF)
&& (plci->nsf_control_bits & T30_NSF_CONTROL_BIT_NEGOTIATE_RESP))
{
- len = ((byte)(((T30_INFO *) 0)->station_id + 20));
+ len = offsetof(T30_INFO, station_id) + 20;
if (plci->fax_connect_info_length < len)
{
((T30_INFO *)(plci->fax_connect_info_buffer))->station_id_len = 0;
@@ -3802,7 +3802,7 @@ static byte manufacturer_res(dword Id, w
break;
}
ncpi = &m_parms[1];
- len = ((byte)(((T30_INFO *) 0)->station_id + 20));
+ len = offsetof(T30_INFO, station_id) + 20;
if (plci->fax_connect_info_length < len)
{
((T30_INFO *)(plci->fax_connect_info_buffer))->station_id_len = 0;
@@ -6844,7 +6844,7 @@ static void nl_ind(PLCI *plci)
if ((plci->requested_options_conn | plci->requested_options | a->requested_options_table[plci->appl->Id-1])
& ((1L << PRIVATE_FAX_SUB_SEP_PWD) | (1L << PRIVATE_FAX_NONSTANDARD)))
{
- i = ((word)(((T30_INFO *) 0)->station_id + 20)) + ((T30_INFO *)plci->NL.RBuffer->P)->head_line_len;
+ i = offsetof(T30_INFO, station_id) + 20 + ((T30_INFO *)plci->NL.RBuffer->P)->head_line_len;
while (i < plci->NL.RBuffer->length)
plci->ncpi_buffer[++len] = plci->NL.RBuffer->P[i++];
}
@@ -7236,7 +7236,7 @@ static void nl_ind(PLCI *plci)
{
plci->RData[1].P = plci->RData[0].P;
plci->RData[1].PLength = plci->RData[0].PLength;
- plci->RData[0].P = v120_header_buffer + (-((int) v120_header_buffer) & 3);
+ plci->RData[0].P = v120_header_buffer + (-((unsigned long)v120_header_buffer) & 3);
if ((plci->NL.RBuffer->P[0] & V120_HEADER_EXTEND_BIT) || (plci->NL.RLength == 1))
plci->RData[0].PLength = 1;
else
@@ -8473,7 +8473,7 @@ static word add_b23(PLCI *plci, API_PARS
fax_control_bits |= T30_CONTROL_BIT_ACCEPT_SEL_POLLING;
}
len = nlc[0];
- pos = ((byte)(((T30_INFO *) 0)->station_id + 20));
+ pos = offsetof(T30_INFO, station_id) + 20;
if (pos < plci->fax_connect_info_length)
{
for (i = 1 + plci->fax_connect_info_buffer[pos]; i != 0; i--)
@@ -8525,7 +8525,7 @@ static word add_b23(PLCI *plci, API_PARS
}
PUT_WORD(&(((T30_INFO *)&nlc[1])->control_bits_low), fax_control_bits);
- len = ((byte)(((T30_INFO *) 0)->station_id + 20));
+ len = offsetof(T30_INFO, station_id) + 20;
for (i = 0; i < len; i++)
plci->fax_connect_info_buffer[i] = nlc[1+i];
((T30_INFO *) plci->fax_connect_info_buffer)->head_line_len = 0;
_
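
A short illustration of the replacement idiom. The struct below is a hypothetical layout made up for the example; the real T30_INFO has different members and sizes.

```c
#include <stddef.h>

/* Hypothetical layout, all unsigned char so there is no padding. */
struct demo_info {
	unsigned char code;
	unsigned char rate;
	unsigned char station_id_len;
	unsigned char station_id[20];
	unsigned char universal_6;
};

/* Byte offset just past the 20-byte station_id field. */
static size_t demo_station_end(void)
{
	return offsetof(struct demo_info, station_id) + 20;
}

/* Byte offset of universal_6, without any null-pointer arithmetic. */
static size_t demo_universal_off(void)
{
	return offsetof(struct demo_info, universal_6);
}
```

offsetof() yields a size_t directly, so there is no pointer-to-integer cast for the compiler to warn about.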
^ permalink raw reply
* [patch 6/6] isdn: eicon, return on error
From: akpm @ 2009-09-18 19:53 UTC (permalink / raw)
To: isdn; +Cc: netdev, akpm, jirislaby, armin
From: Jiri Slaby <jirislaby@gmail.com>
When diva_strace_read_uint returns an error, return even from
process_idi_event, because l2_state is uninitialized.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Acked-by: Armin Schindler <armin@melware.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/isdn/hardware/eicon/maintidi.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff -puN drivers/isdn/hardware/eicon/maintidi.c~isdn-eicon-return-on-error drivers/isdn/hardware/eicon/maintidi.c
--- a/drivers/isdn/hardware/eicon/maintidi.c~isdn-eicon-return-on-error
+++ a/drivers/isdn/hardware/eicon/maintidi.c
@@ -959,8 +959,9 @@ static int process_idi_event (diva_strac
}
if (!strncmp("State\\Layer2 No1", path, pVar->path_length)) {
char* tmp = &pLib->lines[0].pInterface->Layer2[0];
- dword l2_state;
- diva_strace_read_uint (pVar, &l2_state);
+ dword l2_state;
+ if (diva_strace_read_uint(pVar, &l2_state))
+ return -1;
switch (l2_state) {
case 0:
_
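
The same shape in isolation: check the reader's return value before touching the output variable. demo_read_uint() is a hypothetical stand-in that, like the patch assumes for diva_strace_read_uint(), returns non-zero on failure and leaves the output untouched.

```c
#include <stddef.h>

/* Hypothetical reader: fills *out and returns 0, or returns -1 on failure. */
static int demo_read_uint(const unsigned int *src, unsigned int *out)
{
	if (!src)
		return -1;
	*out = *src;
	return 0;
}

static int demo_process(const unsigned int *src)
{
	unsigned int state;	/* not initialized, as in process_idi_event() */

	if (demo_read_uint(src, &state))
		return -1;	/* bail out before 'state' is ever read */

	switch (state) {
	case 0:
		return 0;
	default:
		return 1;
	}
}
```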
^ permalink raw reply
* Re: [Bonding-devel] [PATCH net-next-2.6] bonding: set primary param via sysfs
From: Nicolas de Pesloüan @ 2009-09-18 20:21 UTC (permalink / raw)
To: Jiri Pirko; +Cc: netdev, fubar, davem, bonding-devel
In-Reply-To: <20090918121321.GB2801@psychotron.redhat.com>
Jiri Pirko wrote:
> Primary module parameter passed to bonding is pernament. That means if you
typo in "permanent".
> release the primary slave and enslave it again, it becomes the primary slave
> again. But if you set primary slave via sysfs, the primary slave is only set
> once and it's not remembered in bond->params structure. Therefore the setting is
> lost after releasing the primary slave. This simple one-liner fixes this.
Your patch also has the side effect of fixing this strange behavior:
If you move the primary slave from one bond device to another one, it becomes
the primary for this other bond device, ignoring what you might have set as the
primary for this other bond device.
#modprobe bonding mode=active-backup primary=eth0
#echo +eth0 > /sys/class/net/bond0/bonding/slaves
#cat /sys/class/net/bond0/bonding/primary
eth0
#echo -eth0 > /sys/class/net/bond0/bonding/slaves
#echo +eth1 > /sys/class/net/bond1/bonding/slaves
#echo eth1 > /sys/class/net/bond1/bonding/primary
#cat /sys/class/net/bond1/bonding/primary
eth1
#echo +eth0 > /sys/class/net/bond1/bonding/slaves
#cat /sys/class/net/bond1/bonding/primary
eth0
=> Primary just changed, for no good reason.
Can someone imagine that some current configurations rely on this incredible
side effect ?
Nicolas.
> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
> index 6044e12..ff449de 100644
> --- a/drivers/net/bonding/bond_sysfs.c
> +++ b/drivers/net/bonding/bond_sysfs.c
> @@ -1182,6 +1182,7 @@ static ssize_t bonding_store_primary(struct device *d,
> ": %s: Setting %s as primary slave.\n",
> bond->dev->name, slave->dev->name);
> bond->primary_slave = slave;
> + strcpy(bond->params.primary, slave->dev->name);
> bond_select_active_slave(bond);
> goto out;
> }
^ permalink raw reply
* Re: fanotify as syscalls
From: Eric Paris @ 2009-09-18 20:52 UTC (permalink / raw)
To: Andreas Gruenbacher
Cc: Jamie Lokier, Linus Torvalds, Evgeniy Polyakov, David Miller,
linux-kernel, linux-fsdevel, netdev, viro, alan, hch
In-Reply-To: <200909172207.01764.agruen@suse.de>
On Thu, 2009-09-17 at 22:07 +0200, Andreas Gruenbacher wrote:
> From my point of view, "global" events make no sense, and fanotify listeners
> should register which directories they are interested in (e.g., include "/",
> exclude "/proc"). This takes care of chroots and namespaces as well.
While I completely agree that most users don't want global events, the
antimalware vendors who today, unprotect and hack the syscall table on
their unsuspecting customer's machines to intercept every read, write,
open, close, mmap, etc syscall want EXACTLY that. They'd been asking
for a way to get this information for quite some time now. The largest
vendors in this market have agreed the interface (well, when it was a
socket interface that I talked about for so long) should meet their
needs.
Subtree watching / isn't any different or better, just harder and more
complex to implement. You still have to exclude /proc and /sys and
everything else. Just like one must with a global listener. Still
though, this sounds like an issue for the f_type and f_fsid exclusion
syscall I say I'm still not settled on. Not an issue with the basis of
fanotify or with the 3 proposed syscalls.
Jamie, do you see a problem with what I have been asking for review on
or see a problem with extending it moving forward?
Linus, do you see the value of 'yet another notification scheme' ?
-Eric
^ permalink raw reply
* Re: fanotify as syscalls
From: Andreas Gruenbacher @ 2009-09-18 22:00 UTC (permalink / raw)
To: Eric Paris
Cc: Jamie Lokier, Linus Torvalds, Evgeniy Polyakov, David Miller,
linux-kernel, linux-fsdevel, netdev, viro, alan, hch
In-Reply-To: <1253307128.2552.21.camel@dhcp231-106.rdu.redhat.com>
On Friday, 18 September 2009 22:52:08 Eric Paris wrote:
> On Thu, 2009-09-17 at 22:07 +0200, Andreas Gruenbacher wrote:
> > From my point of view, "global" events make no sense, and fanotify
> > listeners should register which directories they are interested in (e.g.,
> > include "/", exclude "/proc"). This takes care of chroots and namespaces
> > as well.
>
> While I completely agree that most users don't want global events, the
> antimalware vendors who today, unprotect and hack the syscall table on
> their unsuspecting customer's machines to intercept every read, write,
> open, close, mmap, etc syscall want EXACTLY that.
I understand that "global" is what those guys get today for lack of a
reasonable mechanism, but it's not what anybody can be given by fanotify: it
conflicts with filesystem namespaces.
Consider running several "virtual machines" in separate namespaces on the same
kernel. With "global" you are forced to run the same global fanotify
listeners everywhere; with per-mount-point listeners, you can choose
between "global" and something more fine-grained by identifying which
vfsmounts you are interested in. (Filesystem namespaces correspond to
vfsmount hierarchies.)
> [...] You still have to exclude /proc and /sys and everything else.
Those are mount points, and so convenient to handle with a per-mount-point
mechanism. No additional kernel code needed.
> [...] Still though, this sounds like an issue for the f_type and f_fsid
> exclusion syscall I say I'm still not settled on.
Those are also obsolete with a per-mount-point mechanism.
Thanks,
Andreas
^ permalink raw reply