Netdev List


* [PATCH 2/2] fec: Cleanup PHY probing
From: Denis Kirjanov @ 2010-06-02 19:17 UTC (permalink / raw)
  To: davem; +Cc: netdev

Cleanup PHY probing: use helpers from phylib

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
drivers/net/fec.c |   16 +++++-----------
 1 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/net/fec.c b/drivers/net/fec.c
index edfff92..c107d8e 100644
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -679,30 +679,24 @@ static int fec_enet_mii_probe(struct net_device *dev)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);
 	struct phy_device *phy_dev = NULL;
-	int phy_addr;
+	int ret;
 
 	fep->phy_dev = NULL;
 
 	/* find the first phy */
-	for (phy_addr = 0; phy_addr < PHY_MAX_ADDR; phy_addr++) {
-		if (fep->mii_bus->phy_map[phy_addr]) {
-			phy_dev = fep->mii_bus->phy_map[phy_addr];
-			break;
-		}
-	}
-
+	phy_dev = phy_find_first(fep->mii_bus);
 	if (!phy_dev) {
 		printk(KERN_ERR "%s: no PHY found\n", dev->name);
 		return -ENODEV;
 	}
 
 	/* attach the mac to the phy */
-	phy_dev = phy_connect(dev, dev_name(&phy_dev->dev),
+	ret = phy_connect_direct(dev, phy_dev,
 			     &fec_enet_adjust_link, 0,
 			     PHY_INTERFACE_MODE_MII);
-	if (IS_ERR(phy_dev)) {
+	if (ret) {
 		printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name);
-		return PTR_ERR(phy_dev);
+		return ret;
 	}
 
 	/* mask with MAC supported features */

^ permalink raw reply related
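
The open-coded loop removed by the patch above scans the bus for the first populated PHY address, which is what the phylib helper is used for instead. A minimal userspace sketch of that scan follows; the struct layouts are reduced stand-ins and `find_first_phy` is a hypothetical name for illustration, not the kernel's implementation.

```c
#include <stddef.h>

#define PHY_MAX_ADDR 32	/* size of the MDIO address space */

/* Hypothetical stand-ins for the kernel structures; only the fields
 * needed for the scan are modelled. */
struct phy_device { int addr; };
struct mii_bus { struct phy_device *phy_map[PHY_MAX_ADDR]; };

/* Return the PHY at the lowest populated address, or NULL if none. */
struct phy_device *find_first_phy(struct mii_bus *bus)
{
	int addr;

	for (addr = 0; addr < PHY_MAX_ADDR; addr++) {
		if (bus->phy_map[addr])
			return bus->phy_map[addr];
	}
	return NULL;
}
```

A caller treats a NULL result as "no PHY found", as the driver does with -ENODEV.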

* [PATCH 1/2] fec: convert TX hook to netdev_tx_t
From: Denis Kirjanov @ 2010-06-02 19:15 UTC (permalink / raw)
  To: davem; +Cc: netdev

Convert TX hook return value to netdev_tx_t

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
drivers/net/fec.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/fec.c b/drivers/net/fec.c
index 42d9ac9..64afed0 100644
--- a/drivers/net/fec.c
+++ b/drivers/net/fec.c
@@ -208,7 +208,7 @@ static void fec_stop(struct net_device *dev);
 /* Transmitter timeout */
 #define TX_TIMEOUT (2 * HZ)
 
-static int
+static netdev_tx_t
 fec_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct fec_enet_private *fep = netdev_priv(dev);

^ permalink raw reply related

* Re: sysfs class/net/ problem
From: Johannes Berg @ 2010-06-02 19:12 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Greg KH, netdev
In-Reply-To: <1275504953.3915.31.camel@jlt3.sipsolutions.net>

On Wed, 2010-06-02 at 20:55 +0200, Johannes Berg wrote:
> On Wed, 2010-06-02 at 11:05 -0700, Eric W. Biederman wrote:
> 
> > If you want to dig into this look at sysfs_delete_link.  instrument
> > it so that you can see if it is called for wlan{0,1,2} and see what
> > ns it is called for.
> > 
> > My current hypothesis is something is causing us to try and delete
> > the symlink from the wrong namespace, so we just skip that part of it.
> 
> [   78.253128] create link wlan0 ns=ffff88001ce1e600
> ...
> [   93.462268] delete link wlan0 ns=ffff88001ce1e600

No that was the sd, but sd->s_ns is NULL in both cases.

johannes


^ permalink raw reply

* Re: [PATCH v2] netfilter: Xtables: idletimer target implementation
From: Luciano Coelho @ 2010-06-02 19:05 UTC (permalink / raw)
  To: ext Jan Engelhardt
  Cc: netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	kaber@trash.net, Timo Teras
In-Reply-To: <1275503835.1574.0.camel@powerslave>

On Wed, 2010-06-02 at 20:37 +0200, Coelho Luciano (Nokia-D/Helsinki)
wrote:
> On Wed, 2010-06-02 at 17:16 +0200, ext Jan Engelhardt wrote:
> > On Wednesday 2010-06-02 15:41, Luciano Coelho wrote:
> > 
> > >+static int __init idletimer_tg_init(void)
> > >+{
> > >+	int ret;
> > >+
> > >+	idletimer_tg_kobj = kobject_create_and_add("idletimer",
> > >+						   &THIS_MODULE->mkobj.kobj);
> > 
> > Isn't this going to oops when you compile this module as =y?
> 
> Damn, that's true. :(
> 
> I'll investigate how to fix this.

Would it be too hacky to force it to be a module (i.e. add "depends on m"
in Kconfig)?

Besides /sys/module/xt_IDLETIMER and /sys/class/net, which we have
already discarded, I can't find any other place that would make sense to
add the idletimer in the kernel object hierarchy...


-- 
Cheers,
Luca.


^ permalink raw reply
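
The oops discussed above comes from THIS_MODULE being a NULL module pointer when code is built in (=y), so `&THIS_MODULE->mkobj.kobj` is not a valid kobject. A rough userspace sketch of a guard is shown below; the struct definitions are heavily reduced stand-ins and `module_parent_kobj` is a hypothetical helper, not an existing kernel function.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the kernel types. */
struct kobject { const char *name; };
struct module_kobject { struct kobject kobj; };
struct module { int state; struct module_kobject mkobj; };

/* Return the module's kobject, or NULL when the module pointer itself
 * is NULL, as THIS_MODULE is for built-in code. */
struct kobject *module_parent_kobj(struct module *mod)
{
	if (!mod)
		return NULL;	/* built-in: nothing to dereference */
	return &mod->mkobj.kobj;
}
```

With a check like this the caller can pick another parent (or fail cleanly) instead of dereferencing an offset from NULL.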

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
From: Christoph Lameter @ 2010-06-02 18:59 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, shemminger
In-Reply-To: <1275504070.2519.12.camel@edumazet-laptop>

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> > In my particular case it is a weird corner case for the rp_filter.
> >
> > Two NICs are on the same subnet. Different multicast groups are joined
> > on each (Using two NICs to balance the MC load since the drivers have
> > some multicast limitations and having different interrupt lines for each
> > NIC is also beneficial).
> >
>
> yeah, I know about this problem, and am working on it too...
>
> > The rp_filter rejects all multicast traffic to the subscriptions on the
> > second NIC. I guess this is because the source address of the MC traffic
> > (on the same subnet) is also reachable via the first NIC.
> >
>
> It's clearly a case where rp_filter should be set to 2, don't you think?

The rp_filter is rejecting traffic coming into a NIC for which the kernel
has a multicast join list that indicates that this traffic is expected on
this NIC. You could consult the MC subscription list to verify that the
traffic is coming into the right NIC.

In the MC case the user can explicitly specify through which NIC the
traffic is expected. See IP_ADD_MEMBERSHIP.

^ permalink raw reply
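
For reference, IP_ADD_MEMBERSHIP lets the application name the local interface on which a group is joined through the `imr_interface` field of `struct ip_mreq`. The sketch below shows one way to fill and apply that structure; the helper names `fill_mreq` and `join_group_on` are made up for illustration, and the addresses in the usage note are example values only.

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Fill an ip_mreq so that `group` is joined on the NIC that owns the
 * local address `ifaddr`.  Returns 0 on success, -1 on a bad address. */
int fill_mreq(struct ip_mreq *mreq, const char *group, const char *ifaddr)
{
	memset(mreq, 0, sizeof(*mreq));
	if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
		return -1;
	if (inet_pton(AF_INET, ifaddr, &mreq->imr_interface) != 1)
		return -1;
	return 0;
}

/* Join `group` on the interface identified by `ifaddr` for socket `fd`. */
int join_group_on(int fd, const char *group, const char *ifaddr)
{
	struct ip_mreq mreq;

	if (fill_mreq(&mreq, group, ifaddr) < 0)
		return -1;
	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
			  &mreq, sizeof(mreq));
}
```

For the two-NIC setup described in the thread, one would call `join_group_on()` once per group with the local address of the NIC that should receive it, for example `join_group_on(fd, "239.1.1.1", "192.168.1.20")`.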

* Re: [PATCH] ipconfig: send host-name in DHCP requests
From: Andi Kleen @ 2010-06-02 18:56 UTC (permalink / raw)
  To: David Miller; +Cc: fengguang.wu, netdev, linux-kernel, andi
In-Reply-To: <20100602.070531.73368509.davem@davemloft.net>

On Wed, Jun 02, 2010 at 07:05:31AM -0700, David Miller wrote:
> From: Wu Fengguang <fengguang.wu@intel.com>
> Date: Mon, 31 May 2010 11:19:53 +0800
> 
> > Normally dhclient can be configured to send the "host-name" option
> > in DHCP requests to update the client's DNS record. However for an
> > NFSROOT system, dhclient shall never be called (which may change the
> > IP addr and therefore lose your root NFS mount connection).
> > 
> > So enable updating the DNS record with kernel parameter
> > 
> > 	ip=::::$HOST_NAME::dhcp
> > 
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> 
> Applied, thanks.

Small nit: Fengguang, please document the new option in
Documentation/kernel-parameters.txt. That could be done in a follow-on patch.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply
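
The `ip=` parameter is a colon-separated list in the order client:server:gateway:netmask:hostname:device:autoconf, so in `ip=::::$HOST_NAME::dhcp` the host name is the fifth field and the rest are left empty. The userspace sketch below splits such a value while preserving empty fields; `split_ip_param` is a hypothetical helper and is not the kernel's own parser.

```c
#include <stddef.h>
#include <string.h>

#define IP_FIELDS 7	/* client:server:gw:netmask:hostname:device:autoconf */

/* Split `arg` in place on ':' and store up to `max` field pointers.
 * Empty fields are preserved, which strtok() would not do.
 * Returns the number of fields found. */
int split_ip_param(char *arg, char *fields[], int max)
{
	int n = 0;
	char *p = arg;

	while (n < max) {
		char *colon = strchr(p, ':');

		fields[n++] = p;
		if (!colon)
			break;
		*colon = '\0';
		p = colon + 1;
	}
	return n;
}
```

Given the value `::::myhost::dhcp`, index 4 holds the host name and index 6 holds the autoconfiguration method.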

* Re: sysfs class/net/ problem
From: Johannes Berg @ 2010-06-02 18:55 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Greg KH, netdev
In-Reply-To: <m1y6exux6x.fsf@fess.ebiederm.org>

On Wed, 2010-06-02 at 11:05 -0700, Eric W. Biederman wrote:

> If you want to dig into this look at sysfs_delete_link.  instrument
> it so that you can see if it is called for wlan{0,1,2} and see what
> ns it is called for.
> 
> My current hypothesis is something is causing us to try and delete
> the symlink from the wrong namespace, so we just skip that part of it.

[   78.253128] create link wlan0 ns=ffff88001ce1e600
...
[   93.462268] delete link wlan0 ns=ffff88001ce1e600

looks the same ...

Also note

[  109.872488] netconsole: network logging stopped, interface wlan0 unregistered
[  109.872910] PM: Removing info for No Bus:wlan0
[  109.872941] delete link wlan0 ns=ffff88001e9bd600
[  110.130563] PM: Removing info for No Bus:rfkill0
[  110.130599] delete link rfkill0 ns=ffff88001b61ea80
[  110.131135] PM: Removing info for No Bus:phy0
[  110.131161] delete link phy0 ns=ffff88001b61e240
[  110.131424] PM: Removing info for No Bus:hwsimdev0
[  110.131445] delete link hwsimdev0 ns=ffff88001b67ed80

(I changed the struct device thing in hwsim to be hwsimdev%d rather than
hwsim%d to tell the difference to hwsim0, the monitor netdev) so it's
getting removed from PM way after the wlan0 that links into it...

johannes


^ permalink raw reply

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
From: Eric Dumazet @ 2010-06-02 18:41 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: David Miller, netdev, shemminger
In-Reply-To: <alpine.DEB.2.00.1006021255001.32431@router.home>

On Wednesday, 02 June 2010 at 13:01 -0500, Christoph Lameter wrote:
> On Wed, 2 Jun 2010, Eric Dumazet wrote:
> 
> > Here is the patch I cooked to account for RP_FILTER errors in multicast
> > path.
> >
> > I will complete it to also do the unicast part before official
> > submission.
> >
> > Christoph, the official counter would be IPSTATS_MIB_INNOROUTES
> 
> Great. Thanks.
> 
> > ipSystemStatsInNoRoutes OBJECT-TYPE
> >     SYNTAX     Counter32
> >     MAX-ACCESS read-only
> >     STATUS     current
> >     DESCRIPTION
> >            "The number of input IP datagrams discarded because no route
> >             could be found to transmit them to their destination.
> 
> add "or because the rp_filter rejected the packet"? In the case of MC
> traffic you don't really need a route.
> 

Unicast traffic doesn't need a reverse route if you only receive packets.

rp_filter is an optional check, not covered by standard MIBs, so it's
borderline.


> In my particular case it is a weird corner case for the rp_filter.
> 
> Two NICs are on the same subnet. Different multicast groups are joined
> on each (Using two NICs to balance the MC load since the drivers have
> some multicast limitations and having different interrupt lines for each
> NIC is also beneficial).
> 

yeah, I know about this problem, and am working on it too...

> The rp_filter rejects all multicast traffic to the subscriptions on the
> second NIC. I guess this is because the source address of the MC traffic
> (on the same subnet) is also reachable via the first NIC.
> 

It's clearly a case where rp_filter should be set to 2, don't you think?

> So you could add also "because of breakage in the rp_filter (rp_filter
> ignores the multicast subscription tables when determining the correct
> reverse path of the packet)"
> 

In a standard RFC? I won't change it :)



^ permalink raw reply

* [PATCH net-next-2.6 2/2][v2] bonding: allow user-controlled output slave selection
From: Andy Gospodarek @ 2010-06-02 18:40 UTC (permalink / raw)
  To: netdev; +Cc: nhorman


v2: changed bonding module version, modified to apply on top of changes
from previous patch in series, and updated documentation to elaborate on
multiqueue awareness that now exists in bonding driver.

This patch gives the user the ability to control the output slave for
round-robin and active-backup bonding.  Similar functionality was
discussed in the past, but Jay Vosburgh indicated he would rather see a
feature like this added to existing modes rather than creating a
completely new mode.  Jay's thoughts as well as Neil's input surrounding
some of the issues with the first implementation pushed us toward a
design that relied on the queue_mapping rather than skb marks.
Round-robin and active-backup modes were chosen as the first users of
this slave selection as they seemed like the most logical choices when
considering a multi-switch environment.

Round-robin mode works without any modification, but active-backup does
require inclusion of the first patch in this series and setting
the 'all_slaves_active' flag.  This will allow reception of unicast traffic on
any of the backup interfaces.

This was tested with IPv4-based filters as well as VLAN-based filters
with good results.

More information as well as a configuration example is available in the
patch to Documentation/networking/bonding.txt.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 Documentation/networking/bonding.txt |   84 ++++++++++++++++++++++++-
 drivers/net/bonding/bond_main.c      |   75 +++++++++++++++++++++-
 drivers/net/bonding/bond_sysfs.c     |  116 ++++++++++++++++++++++++++++++++++
 drivers/net/bonding/bonding.h        |    9 ++-
 include/linux/if_bonding.h           |    1 +
 5 files changed, 278 insertions(+), 7 deletions(-)

diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 61f516b..d091478 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -49,6 +49,7 @@ Table of Contents
 3.3	Configuring Bonding Manually with Ifenslave
 3.3.1		Configuring Multiple Bonds Manually
 3.4	Configuring Bonding Manually via Sysfs
+3.5	Overriding Configuration for Special Cases
 
 4. Querying Bonding Configuration
 4.1	Bonding Configuration
@@ -1318,8 +1319,87 @@ echo 2000 > /sys/class/net/bond1/bonding/arp_interval
 echo +eth2 > /sys/class/net/bond1/bonding/slaves
 echo +eth3 > /sys/class/net/bond1/bonding/slaves
 
-
-4. Querying Bonding Configuration 
+3.5 Overriding Configuration for Special Cases
+----------------------------------------------
+When using the bonding driver, the physical port which transmits a frame is
+typically selected by the bonding driver, and is not relevant to the user or
+system administrator.  The output port is simply selected using the policies of
+the selected bonding mode.  On occasion however, it is helpful to direct certain
+classes of traffic to certain physical interfaces on output to implement
+slightly more complex policies.  For example, to reach a web server over a
+bonded interface in which eth0 connects to a private network, while eth1
+connects via a public network, it may be desirous to bias the bond to send said
+traffic over eth0 first, using eth1 only as a fall back, while all other traffic
+can safely be sent over either interface.  Such configurations may be achieved
+using the traffic control utilities inherent in linux.
+
+By default the bonding driver is multiqueue aware and 16 queues are created
+when the driver initializes (see Documentation/networking/multiqueue.txt
+for details).  If more or fewer queues are desired the module parameter
+tx_queues can be used to change this value.  There is no sysfs parameter
+available as the allocation is done at module init time.
+
+The output of the file /proc/net/bonding/bondX has changed so the output Queue
+ID is now printed for each slave:
+
+Bonding Mode: fault-tolerance (active-backup)
+Primary Slave: None
+Currently Active Slave: eth0
+MII Status: up
+MII Polling Interval (ms): 0
+Up Delay (ms): 0
+Down Delay (ms): 0
+
+Slave Interface: eth0
+MII Status: up
+Link Failure Count: 0
+Permanent HW addr: 00:1a:a0:12:8f:cb
+Slave queue ID: 0
+
+Slave Interface: eth1
+MII Status: up
+Link Failure Count: 0
+Permanent HW addr: 00:1a:a0:12:8f:cc
+Slave queue ID: 2
+
+The queue_id for a slave can be set using the command:
+
+# echo "eth1:2" > /sys/class/net/bond0/bonding/queue_id
+
+Any interface that needs a queue_id set should set it with multiple calls
+like the one above until proper priorities are set for all interfaces.  On
+distributions that allow configuration via initscripts, multiple 'queue_id'
+arguments can be added to BONDING_OPTS to set all needed slave queues.
+
+These queue id's can be used in conjunction with the tc utility to configure
+a multiqueue qdisc and filters to bias certain traffic to transmit on certain
+slave devices.  For instance, say we wanted, in the above configuration to
+force all traffic bound to 192.168.1.100 to use eth1 in the bond as its output
+device. The following commands would accomplish this:
+
+# tc qdisc add dev bond0 handle 1 root multiq
+
+# tc filter add dev bond0 protocol ip parent 1: prio 1 u32 match ip dst \
+	192.168.1.100 action skbedit queue_mapping 2
+
+These commands tell the kernel to attach a multiqueue queue discipline to the
+bond0 interface and filter traffic enqueued to it, such that packets with a dst
+ip of 192.168.1.100 have their output queue mapping value overwritten to 2.
+This value is then passed into the driver, causing the normal output path
+selection policy to be overridden, selecting instead qid 2, which maps to eth1.
+
+Note that qid values begin at 1.  Qid 0 is reserved to indicate to the driver
+that normal output policy selection should take place.  One benefit to simply
+leaving the qid for a slave to 0 is the multiqueue awareness in the bonding
+driver that is now present.  This awareness allows tc filters to be placed on
+slave devices as well as bond devices and the bonding driver will simply act as
+a pass-through for selecting output queues on the slave device rather than 
+output port selection.
+
+This feature first appeared in bonding driver version 3.7.0 and support for
+output slave selection was limited to round-robin and active-backup modes.
+
+4 Querying Bonding Configuration
 =================================
 
 4.1 Bonding Configuration
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index f22f6bf..1b19276 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -90,6 +90,7 @@
 #define BOND_LINK_ARP_INTERV	0
 
 static int max_bonds	= BOND_DEFAULT_MAX_BONDS;
+static int tx_queues	= BOND_DEFAULT_TX_QUEUES;
 static int num_grat_arp = 1;
 static int num_unsol_na = 1;
 static int miimon	= BOND_LINK_MON_INTERV;
@@ -111,6 +112,8 @@ static struct bond_params bonding_defaults;
 
 module_param(max_bonds, int, 0);
 MODULE_PARM_DESC(max_bonds, "Max number of bonded devices");
+module_param(tx_queues, int, 0);
+MODULE_PARM_DESC(tx_queues, "Max number of transmit queues (default = 16)");
 module_param(num_grat_arp, int, 0644);
 MODULE_PARM_DESC(num_grat_arp, "Number of gratuitous ARP packets to send on failover event");
 module_param(num_unsol_na, int, 0644);
@@ -1540,6 +1543,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 		goto err_undo_flags;
 	}
 
+	/*
+	 * Set the new_slave's queue_id to be zero.  Queue ID mapping
+	 * is set via sysfs or module option if desired.
+	 */
+	new_slave->queue_id = 0;
+
 	/* Save slave's original mtu and then set it to match the bond */
 	new_slave->original_mtu = slave_dev->mtu;
 	res = dev_set_mtu(slave_dev, bond->dev->mtu);
@@ -3285,6 +3294,7 @@ static void bond_info_show_slave(struct seq_file *seq,
 		else
 			seq_puts(seq, "Aggregator ID: N/A\n");
 	}
+	seq_printf(seq, "Slave queue ID: %d\n", slave->queue_id);
 }
 
 static int bond_info_seq_show(struct seq_file *seq, void *v)
@@ -4421,9 +4431,59 @@ static void bond_set_xmit_hash_policy(struct bonding *bond)
 	}
 }
 
+/*
+ * Lookup the slave that corresponds to a qid
+ */
+static inline int bond_slave_override(struct bonding *bond,
+				      struct sk_buff *skb)
+{
+	int i, res = 1;
+	struct slave *slave = NULL;
+	struct slave *check_slave;
+
+	read_lock(&bond->lock);
+
+	if (!BOND_IS_OK(bond) || !skb->queue_mapping)
+		goto out;
+
+	/* Find out if any slaves have the same mapping as this skb. */
+	bond_for_each_slave(bond, check_slave, i) {
+		if (check_slave->queue_id == skb->queue_mapping) {
+			slave = check_slave;
+			break;
+		}
+	}
+
+	/* If the slave isn't UP, use default transmit policy. */
+	if (slave && slave->queue_id && IS_UP(slave->dev) &&
+	    (slave->link == BOND_LINK_UP)) {
+		res = bond_dev_queue_xmit(bond, skb, slave->dev);
+	}
+
+out:
+	read_unlock(&bond->lock);
+	return res;
+}
+
+static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
+{
+	/*
+	 * This helper function exists to help dev_pick_tx get the correct
+	 * destination queue.  Using a helper function skips a call to
+	 * skb_tx_hash and will put the skbs in the queue we expect on their
+	 * way down to the bonding driver.
+	 */
+	return skb->queue_mapping;
+}
+
 static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
-	const struct bonding *bond = netdev_priv(dev);
+	struct bonding *bond = netdev_priv(dev);
+
+	if (TX_QUEUE_OVERRIDE(bond->params.mode)) {
+		if (!bond_slave_override(bond, skb))
+			return NETDEV_TX_OK;
+	}
 
 	switch (bond->params.mode) {
 	case BOND_MODE_ROUNDROBIN:
@@ -4508,6 +4568,7 @@ static const struct net_device_ops bond_netdev_ops = {
 	.ndo_open		= bond_open,
 	.ndo_stop		= bond_close,
 	.ndo_start_xmit		= bond_start_xmit,
+	.ndo_select_queue	= bond_select_queue,
 	.ndo_get_stats		= bond_get_stats,
 	.ndo_do_ioctl		= bond_do_ioctl,
 	.ndo_set_multicast_list	= bond_set_multicast_list,
@@ -4776,6 +4837,13 @@ static int bond_check_params(struct bond_params *params)
 		}
 	}
 
+	if (tx_queues < 1 || tx_queues > 255) {
+		pr_warning("Warning: tx_queues (%d) should be between "
+			   "1 and 255, resetting to %d\n",
+			   tx_queues, BOND_DEFAULT_TX_QUEUES);
+		tx_queues = BOND_DEFAULT_TX_QUEUES;
+	}
+
 	if ((all_slaves_active != 0) && (all_slaves_active != 1)) {
 		pr_warning("Warning: all_slaves_active module parameter (%d), "
 			   "not of valid value (0/1), so it was set to "
@@ -4953,6 +5021,7 @@ static int bond_check_params(struct bond_params *params)
 	params->primary[0] = 0;
 	params->primary_reselect = primary_reselect_value;
 	params->fail_over_mac = fail_over_mac_value;
+	params->tx_queues = tx_queues;
 	params->all_slaves_active = all_slaves_active;
 
 	if (primary) {
@@ -5040,8 +5109,8 @@ int bond_create(struct net *net, const char *name)
 
 	rtnl_lock();
 
-	bond_dev = alloc_netdev(sizeof(struct bonding), name ? name : "",
-				bond_setup);
+	bond_dev = alloc_netdev_mq(sizeof(struct bonding), name ? name : "",
+				bond_setup, tx_queues);
 	if (!bond_dev) {
 		pr_err("%s: eek! can't alloc netdev!\n", name);
 		rtnl_unlock();
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 066311a..f9a0343 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1412,6 +1412,121 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d,
 static DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL);
 
 /*
+ * Show the queue_ids of the slaves in the current bond.
+ */
+static ssize_t bonding_show_queue_id(struct device *d,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct slave *slave;
+	int i, res = 0;
+	struct bonding *bond = to_bond(d);
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	read_lock(&bond->lock);
+	bond_for_each_slave(bond, slave, i) {
+		if (res > (PAGE_SIZE - 6)) {
+			/* not enough space for another interface name */
+			if ((PAGE_SIZE - res) > 10)
+				res = PAGE_SIZE - 10;
+			res += sprintf(buf + res, "++more++ ");
+			break;
+		}
+		res += sprintf(buf + res, "%s:%d ",
+			       slave->dev->name, slave->queue_id);
+	}
+	read_unlock(&bond->lock);
+	if (res)
+		buf[res-1] = '\n'; /* eat the leftover space */
+	rtnl_unlock();
+	return res;
+}
+
+/*
+ * Set the queue_ids of the slaves in the current bond.  The named
+ * interface must already be enslaved for this to work.
+ */
+static ssize_t bonding_store_queue_id(struct device *d,
+				      struct device_attribute *attr,
+				      const char *buffer, size_t count)
+{
+	struct slave *slave, *update_slave;
+	struct bonding *bond = to_bond(d);
+	u16 qid;
+	int i, ret = count;
+	char *delim;
+	struct net_device *sdev = NULL;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	/* delim will point to queue id if successful */
+	delim = strchr(buffer, ':');
+	if (!delim)
+		goto err_no_cmd;
+
+	/*
+	 * Terminate string that points to device name and bump it
+	 * up one, so we can read the queue id there.
+	 */
+	*delim = '\0';
+	if (sscanf(++delim, "%hu\n", &qid) != 1)
+		goto err_no_cmd;
+
+	/* Check buffer length, valid ifname and queue id */
+	if (strlen(buffer) > IFNAMSIZ ||
+	    !dev_valid_name(buffer) ||
+	    qid > bond->params.tx_queues)
+		goto err_no_cmd;
+
+	/* Get the pointer to that interface if it exists */
+	sdev = __dev_get_by_name(dev_net(bond->dev), buffer);
+	if (!sdev)
+		goto err_no_cmd;
+
+	read_lock(&bond->lock);
+
+	/* Search for the slave and check for duplicate qids */
+	update_slave = NULL;
+	bond_for_each_slave(bond, slave, i) {
+		if (sdev == slave->dev)
+			/*
+			 * We don't need to check the matching
+			 * slave for dups, since we're overwriting it
+			 */
+			update_slave = slave;
+		else if (qid && qid == slave->queue_id) {
+			goto err_no_cmd_unlock;
+		}
+	}
+
+	if (!update_slave)
+		goto err_no_cmd_unlock;
+
+	/* Actually set the qids for the slave */
+	update_slave->queue_id = qid;
+
+	read_unlock(&bond->lock);
+out:
+	rtnl_unlock();
+	return ret;
+
+err_no_cmd_unlock:
+	read_unlock(&bond->lock);
+err_no_cmd:
+	pr_info("invalid input for queue_id set for %s.\n",
+		bond->dev->name);
+	ret = -EPERM;
+	goto out;
+}
+
+static DEVICE_ATTR(queue_id, S_IRUGO | S_IWUSR, bonding_show_queue_id,
+		   bonding_store_queue_id);
+
+
+/*
  * Show and set the all_slaves_active flag.
  */
 static ssize_t bonding_show_slaves_active(struct device *d,
@@ -1489,6 +1604,7 @@ static struct attribute *per_bond_attrs[] = {
 	&dev_attr_ad_actor_key.attr,
 	&dev_attr_ad_partner_key.attr,
 	&dev_attr_ad_partner_mac.attr,
+	&dev_attr_queue_id.attr,
 	&dev_attr_all_slaves_active.attr,
 	NULL,
 };
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index cecdea2..c6fdd85 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -23,8 +23,8 @@
 #include "bond_3ad.h"
 #include "bond_alb.h"
 
-#define DRV_VERSION	"3.6.0"
-#define DRV_RELDATE	"September 26, 2009"
+#define DRV_VERSION	"3.7.0"
+#define DRV_RELDATE	"June 2, 2010"
 #define DRV_NAME	"bonding"
 #define DRV_DESCRIPTION	"Ethernet Channel Bonding Driver"
 
@@ -60,6 +60,9 @@
 		 ((mode) == BOND_MODE_TLB)          ||	\
 		 ((mode) == BOND_MODE_ALB))
 
+#define TX_QUEUE_OVERRIDE(mode)				\
+			(((mode) == BOND_MODE_ACTIVEBACKUP) ||	\
+			 ((mode) == BOND_MODE_ROUNDROBIN))
 /*
  * Less bad way to call ioctl from within the kernel; this needs to be
  * done some other way to get the call out of interrupt context.
@@ -131,6 +134,7 @@ struct bond_params {
 	char primary[IFNAMSIZ];
 	int primary_reselect;
 	__be32 arp_targets[BOND_MAX_ARP_TARGETS];
+	int tx_queues;
 	int all_slaves_active;
 };
 
@@ -165,6 +169,7 @@ struct slave {
 	u8     perm_hwaddr[ETH_ALEN];
 	u16    speed;
 	u8     duplex;
+	u16    queue_id;
 	struct ad_slave_info ad_info; /* HUGE - better to dynamically alloc */
 	struct tlb_slave_info tlb_info;
 };
diff --git a/include/linux/if_bonding.h b/include/linux/if_bonding.h
index cd525fa..2c79943 100644
--- a/include/linux/if_bonding.h
+++ b/include/linux/if_bonding.h
@@ -83,6 +83,7 @@
 
 #define BOND_DEFAULT_MAX_BONDS  1   /* Default maximum number of devices to support */
 
+#define BOND_DEFAULT_TX_QUEUES 16   /* Default number of tx queues per device */
 /* hashing types */
 #define BOND_XMIT_POLICY_LAYER2		0 /* layer 2 (MAC only), default */
 #define BOND_XMIT_POLICY_LAYER34	1 /* layer 3+4 (IP ^ (TCP || UDP)) */
-- 
1.7.0.1


^ permalink raw reply related
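
The two pieces of logic in the patch above, parsing an "ifname:qid" string written to sysfs and picking an output slave from the skb's queue mapping, can be modelled in userspace roughly as follows. This is only a sketch under simplifying assumptions: `struct slave` is a reduced stand-in, `parse_queue_id` and `select_override` are hypothetical names, and locking and the duplicate-qid check are left out.

```c
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ 16

/* Hypothetical, reduced model of a bonding slave. */
struct slave {
	char name[IFNAMSIZ];
	unsigned short queue_id;	/* 0 means "no override" */
	int link_up;
};

/* Parse "ifname:qid" (e.g. "eth1:2").  Returns 0 on success. */
int parse_queue_id(const char *buf, char name[IFNAMSIZ], unsigned short *qid)
{
	const char *delim = strchr(buf, ':');
	size_t len;

	if (!delim)
		return -1;
	len = (size_t)(delim - buf);
	if (len == 0 || len >= IFNAMSIZ)
		return -1;
	if (sscanf(delim + 1, "%hu", qid) != 1)
		return -1;
	memcpy(name, buf, len);
	name[len] = '\0';
	return 0;
}

/* Return the slave to transmit on for `queue_mapping`, or NULL to fall
 * back to the bonding mode's normal policy. */
struct slave *select_override(struct slave *slaves, int n,
			      unsigned short queue_mapping)
{
	int i;

	if (!queue_mapping)
		return NULL;	/* qid 0: normal output policy */
	for (i = 0; i < n; i++) {
		if (slaves[i].queue_id == queue_mapping)
			return slaves[i].link_up ? &slaves[i] : NULL;
	}
	return NULL;
}
```

Unlike the patch, the sketch copies the interface name out rather than writing a terminator into the input buffer, which keeps the input `const`.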

* [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-06-02 18:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <4C04D41B.4050704@kernel.org>

Replace vhost_workqueue with per-vhost kthread.  Other than callback
argument change from struct work_struct * to struct vhost_work *,
there's no visible change to vhost_poll_*() interface.

This conversion is to make each vhost use a dedicated kthread so that
resource control via cgroup can be applied.

Partially based on Sridhar Samudrala's patch.

* Updated to use sub structure vhost_work instead of directly using
  vhost_poll at Michael's suggestion.

* Added flusher wake_up() optimization at Michael's suggestion.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>
---
Okay, just tested it.  dev->work_lock had to be updated to use irq
operations but other than that it worked just fine.  Copied a large
file using scp and it seems to perform pretty well although I don't
have any reference for comparison.  So, here's the updated version with
the sign off.

Thanks.

 drivers/vhost/net.c   |   56 ++++++++++---------------
 drivers/vhost/vhost.c |  111 ++++++++++++++++++++++++++++++++++++++------------
 drivers/vhost/vhost.h |   38 +++++++++++------
 3 files changed, 134 insertions(+), 71 deletions(-)

Index: work/drivers/vhost/net.c
===================================================================
--- work.orig/drivers/vhost/net.c
+++ work/drivers/vhost/net.c
@@ -294,54 +294,58 @@ static void handle_rx(struct vhost_net *
 	unuse_mm(net->dev.mm);
 }

-static void handle_tx_kick(struct work_struct *work)
+static void handle_tx_kick(struct vhost_work *work)
 {
-	struct vhost_virtqueue *vq;
-	struct vhost_net *net;
-	vq = container_of(work, struct vhost_virtqueue, poll.work);
-	net = container_of(vq->dev, struct vhost_net, dev);
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_net *net = container_of(vq->dev, struct vhost_net, dev);
+
 	handle_tx(net);
 }

-static void handle_rx_kick(struct work_struct *work)
+static void handle_rx_kick(struct vhost_work *work)
 {
-	struct vhost_virtqueue *vq;
-	struct vhost_net *net;
-	vq = container_of(work, struct vhost_virtqueue, poll.work);
-	net = container_of(vq->dev, struct vhost_net, dev);
+	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
+						  poll.work);
+	struct vhost_net *net = container_of(vq->dev, struct vhost_net, dev);
+
 	handle_rx(net);
 }

-static void handle_tx_net(struct work_struct *work)
+static void handle_tx_net(struct vhost_work *work)
 {
-	struct vhost_net *net;
-	net = container_of(work, struct vhost_net, poll[VHOST_NET_VQ_TX].work);
+	struct vhost_net *net = container_of(work, struct vhost_net,
+					     poll[VHOST_NET_VQ_TX].work);
 	handle_tx(net);
 }

-static void handle_rx_net(struct work_struct *work)
+static void handle_rx_net(struct vhost_work *work)
 {
-	struct vhost_net *net;
-	net = container_of(work, struct vhost_net, poll[VHOST_NET_VQ_RX].work);
+	struct vhost_net *net = container_of(work, struct vhost_net,
+					     poll[VHOST_NET_VQ_RX].work);
 	handle_rx(net);
 }

 static int vhost_net_open(struct inode *inode, struct file *f)
 {
 	struct vhost_net *n = kmalloc(sizeof *n, GFP_KERNEL);
+	struct vhost_dev *dev;
 	int r;
+
 	if (!n)
 		return -ENOMEM;
+
+	dev = &n->dev;
 	n->vqs[VHOST_NET_VQ_TX].handle_kick = handle_tx_kick;
 	n->vqs[VHOST_NET_VQ_RX].handle_kick = handle_rx_kick;
-	r = vhost_dev_init(&n->dev, n->vqs, VHOST_NET_VQ_MAX);
+	r = vhost_dev_init(dev, n->vqs, VHOST_NET_VQ_MAX);
 	if (r < 0) {
 		kfree(n);
 		return r;
 	}

-	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT);
-	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN);
+	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
+	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
 	n->tx_poll_state = VHOST_NET_POLL_DISABLED;

 	f->private_data = n;
@@ -644,25 +648,13 @@ static struct miscdevice vhost_net_misc

 static int vhost_net_init(void)
 {
-	int r = vhost_init();
-	if (r)
-		goto err_init;
-	r = misc_register(&vhost_net_misc);
-	if (r)
-		goto err_reg;
-	return 0;
-err_reg:
-	vhost_cleanup();
-err_init:
-	return r;
-
+	return misc_register(&vhost_net_misc);
 }
 module_init(vhost_net_init);

 static void vhost_net_exit(void)
 {
 	misc_deregister(&vhost_net_misc);
-	vhost_cleanup();
 }
 module_exit(vhost_net_exit);

Index: work/drivers/vhost/vhost.c
===================================================================
--- work.orig/drivers/vhost/vhost.c
+++ work/drivers/vhost/vhost.c
@@ -17,12 +17,12 @@
 #include <linux/mm.h>
 #include <linux/miscdevice.h>
 #include <linux/mutex.h>
-#include <linux/workqueue.h>
 #include <linux/rcupdate.h>
 #include <linux/poll.h>
 #include <linux/file.h>
 #include <linux/highmem.h>
 #include <linux/slab.h>
+#include <linux/kthread.h>

 #include <linux/net.h>
 #include <linux/if_packet.h>
@@ -37,8 +37,6 @@ enum {
 	VHOST_MEMORY_F_LOG = 0x1,
 };

-static struct workqueue_struct *vhost_workqueue;
-
 static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
 			    poll_table *pt)
 {
@@ -52,23 +50,31 @@ static void vhost_poll_func(struct file
 static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
 			     void *key)
 {
-	struct vhost_poll *poll;
-	poll = container_of(wait, struct vhost_poll, wait);
+	struct vhost_poll *poll = container_of(wait, struct vhost_poll, wait);
+
 	if (!((unsigned long)key & poll->mask))
 		return 0;

-	queue_work(vhost_workqueue, &poll->work);
+	vhost_poll_queue(poll);
 	return 0;
 }

 /* Init poll structure */
-void vhost_poll_init(struct vhost_poll *poll, work_func_t func,
-		     unsigned long mask)
+void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
+		     unsigned long mask, struct vhost_dev *dev)
 {
-	INIT_WORK(&poll->work, func);
+	struct vhost_work *work = &poll->work;
+
 	init_waitqueue_func_entry(&poll->wait, vhost_poll_wakeup);
 	init_poll_funcptr(&poll->table, vhost_poll_func);
 	poll->mask = mask;
+	poll->dev = dev;
+
+	INIT_LIST_HEAD(&work->node);
+	work->fn = fn;
+	init_waitqueue_head(&work->done);
+	atomic_set(&work->flushing, 0);
+	work->queue_seq = work->done_seq = 0;
 }

 /* Start polling a file. We add ourselves to file's wait queue. The caller must
@@ -92,12 +98,29 @@ void vhost_poll_stop(struct vhost_poll *
  * locks that are also used by the callback. */
 void vhost_poll_flush(struct vhost_poll *poll)
 {
-	flush_work(&poll->work);
+	struct vhost_work *work = &poll->work;
+	int seq = work->queue_seq;
+
+	atomic_inc(&work->flushing);
+	smp_mb__after_atomic_inc();	/* mb flush-b0 paired with worker-b1 */
+	wait_event(work->done, seq - work->done_seq <= 0);
+	atomic_dec(&work->flushing);
+	smp_mb__after_atomic_dec();	/* rmb flush-b1 paired with worker-b0 */
 }

 void vhost_poll_queue(struct vhost_poll *poll)
 {
-	queue_work(vhost_workqueue, &poll->work);
+	struct vhost_dev *dev = poll->dev;
+	struct vhost_work *work = &poll->work;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->work_lock, flags);
+	if (list_empty(&work->node)) {
+		list_add_tail(&work->node, &dev->work_list);
+		work->queue_seq++;
+		wake_up_process(dev->worker);
+	}
+	spin_unlock_irqrestore(&dev->work_lock, flags);
 }

 static void vhost_vq_reset(struct vhost_dev *dev,
@@ -125,10 +148,52 @@ static void vhost_vq_reset(struct vhost_
 	vq->log_ctx = NULL;
 }

+static int vhost_worker(void *data)
+{
+	struct vhost_dev *dev = data;
+	struct vhost_work *work;
+
+repeat:
+	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
+
+	if (kthread_should_stop()) {
+		__set_current_state(TASK_RUNNING);
+		return 0;
+	}
+
+	work = NULL;
+	spin_lock_irq(&dev->work_lock);
+	if (!list_empty(&dev->work_list)) {
+		work = list_first_entry(&dev->work_list,
+					struct vhost_work, node);
+		list_del_init(&work->node);
+	}
+	spin_unlock_irq(&dev->work_lock);
+
+	if (work) {
+		__set_current_state(TASK_RUNNING);
+		work->fn(work);
+		smp_wmb();	/* wmb worker-b0 paired with flush-b1 */
+		work->done_seq = work->queue_seq;
+		smp_mb();	/* mb worker-b1 paired with flush-b0 */
+		if (atomic_read(&work->flushing))
+			wake_up_all(&work->done);
+	} else
+		schedule();
+
+	goto repeat;
+}
+
 long vhost_dev_init(struct vhost_dev *dev,
 		    struct vhost_virtqueue *vqs, int nvqs)
 {
+	struct task_struct *worker;
 	int i;
+
+	worker = kthread_create(vhost_worker, dev, "vhost-%d", current->pid);
+	if (IS_ERR(worker))
+		return PTR_ERR(worker);
+
 	dev->vqs = vqs;
 	dev->nvqs = nvqs;
 	mutex_init(&dev->mutex);
@@ -136,6 +201,9 @@ long vhost_dev_init(struct vhost_dev *de
 	dev->log_file = NULL;
 	dev->memory = NULL;
 	dev->mm = NULL;
+	spin_lock_init(&dev->work_lock);
+	INIT_LIST_HEAD(&dev->work_list);
+	dev->worker = worker;

 	for (i = 0; i < dev->nvqs; ++i) {
 		dev->vqs[i].dev = dev;
@@ -143,9 +211,10 @@ long vhost_dev_init(struct vhost_dev *de
 		vhost_vq_reset(dev, dev->vqs + i);
 		if (dev->vqs[i].handle_kick)
 			vhost_poll_init(&dev->vqs[i].poll,
-					dev->vqs[i].handle_kick,
-					POLLIN);
+					dev->vqs[i].handle_kick, POLLIN, dev);
 	}
+
+	wake_up_process(worker);	/* avoid contributing to loadavg */
 	return 0;
 }

@@ -217,6 +286,9 @@ void vhost_dev_cleanup(struct vhost_dev
 	if (dev->mm)
 		mmput(dev->mm);
 	dev->mm = NULL;
+
+	WARN_ON(!list_empty(&dev->work_list));
+	kthread_stop(dev->worker);
 }

 static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
@@ -1113,16 +1185,3 @@ void vhost_disable_notify(struct vhost_v
 		vq_err(vq, "Failed to enable notification at %p: %d\n",
 		       &vq->used->flags, r);
 }
-
-int vhost_init(void)
-{
-	vhost_workqueue = create_singlethread_workqueue("vhost");
-	if (!vhost_workqueue)
-		return -ENOMEM;
-	return 0;
-}
-
-void vhost_cleanup(void)
-{
-	destroy_workqueue(vhost_workqueue);
-}
Index: work/drivers/vhost/vhost.h
===================================================================
--- work.orig/drivers/vhost/vhost.h
+++ work/drivers/vhost/vhost.h
@@ -5,13 +5,13 @@
 #include <linux/vhost.h>
 #include <linux/mm.h>
 #include <linux/mutex.h>
-#include <linux/workqueue.h>
 #include <linux/poll.h>
 #include <linux/file.h>
 #include <linux/skbuff.h>
 #include <linux/uio.h>
 #include <linux/virtio_config.h>
 #include <linux/virtio_ring.h>
+#include <asm/atomic.h>

 struct vhost_device;

@@ -20,19 +20,31 @@ enum {
 	VHOST_NET_MAX_SG = MAX_SKB_FRAGS + 2,
 };

+struct vhost_work;
+typedef void (*vhost_work_fn_t)(struct vhost_work *work);
+
+struct vhost_work {
+	struct list_head	  node;
+	vhost_work_fn_t		  fn;
+	wait_queue_head_t	  done;
+	atomic_t		  flushing;
+	int			  queue_seq;
+	int			  done_seq;
+};
+
 /* Poll a file (eventfd or socket) */
 /* Note: there's nothing vhost specific about this structure. */
 struct vhost_poll {
 	poll_table                table;
 	wait_queue_head_t        *wqh;
 	wait_queue_t              wait;
-	/* struct which will handle all actual work. */
-	struct work_struct        work;
+	struct vhost_work	  work;
 	unsigned long		  mask;
+	struct vhost_dev	 *dev;
 };

-void vhost_poll_init(struct vhost_poll *poll, work_func_t func,
-		     unsigned long mask);
+void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
+		     unsigned long mask, struct vhost_dev *dev);
 void vhost_poll_start(struct vhost_poll *poll, struct file *file);
 void vhost_poll_stop(struct vhost_poll *poll);
 void vhost_poll_flush(struct vhost_poll *poll);
@@ -63,7 +75,7 @@ struct vhost_virtqueue {
 	struct vhost_poll poll;

 	/* The routine to call when the Guest pings us, or timeout. */
-	work_func_t handle_kick;
+	vhost_work_fn_t handle_kick;

 	/* Last available index we saw. */
 	u16 last_avail_idx;
@@ -86,11 +98,11 @@ struct vhost_virtqueue {
 	struct iovec hdr[VHOST_NET_MAX_SG];
 	size_t hdr_size;
 	/* We use a kind of RCU to access private pointer.
-	 * All readers access it from workqueue, which makes it possible to
-	 * flush the workqueue instead of synchronize_rcu. Therefore readers do
+	 * All readers access it from worker, which makes it possible to
+	 * flush the vhost_work instead of synchronize_rcu. Therefore readers do
 	 * not need to call rcu_read_lock/rcu_read_unlock: the beginning of
-	 * work item execution acts instead of rcu_read_lock() and the end of
-	 * work item execution acts instead of rcu_read_lock().
+	 * vhost_work execution acts instead of rcu_read_lock() and the end of
+	 * vhost_work execution acts instead of rcu_read_unlock().
 	 * Writers use virtqueue mutex. */
 	void *private_data;
 	/* Log write descriptors */
@@ -110,6 +122,9 @@ struct vhost_dev {
 	int nvqs;
 	struct file *log_file;
 	struct eventfd_ctx *log_ctx;
+	spinlock_t work_lock;
+	struct list_head work_list;
+	struct task_struct *worker;
 };

 long vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue *vqs, int nvqs);
@@ -136,9 +151,6 @@ bool vhost_enable_notify(struct vhost_vi
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
 		    unsigned int log_num, u64 len);

-int vhost_init(void);
-void vhost_cleanup(void);
-
 #define vq_err(vq, fmt, ...) do {                                  \
 		pr_debug(pr_fmt(fmt), ##__VA_ARGS__);       \
 		if ((vq)->error_ctx)                               \
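
The flush test in vhost_poll_flush() above, "seq - work->done_seq <= 0",
depends on sequence arithmetic that keeps working when the counters wrap.
A minimal userspace sketch of that comparison follows. struct seq_pair and
the three helpers are hypothetical names that do not appear in the patch,
and the int conversions assume a two's-complement compiler such as gcc or
clang.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the two counters in struct vhost_work. */
struct seq_pair {
	int queue_seq;	/* bumped each time the work is queued */
	int done_seq;	/* set to queue_seq after the handler has run */
};

/* Hypothetical helper: mimic "work->queue_seq++" with defined wraparound. */
static void seq_queue(struct seq_pair *s)
{
	assert(s != NULL);
	s->queue_seq = (int)((unsigned int)s->queue_seq + 1u);
}

/* Hypothetical helper: mimic "work->done_seq = work->queue_seq". */
static void seq_complete(struct seq_pair *s)
{
	assert(s != NULL);
	s->done_seq = s->queue_seq;
}

/* True once done_seq has caught up with the snapshot the flusher took. */
static bool flush_complete(int snapshot_seq, const struct seq_pair *s)
{
	assert(s != NULL);
	/* Unsigned subtraction keeps the wrap well defined in portable C;
	 * only the signed difference is compared, never the raw values. */
	return (int)((unsigned int)snapshot_seq -
		     (unsigned int)s->done_seq) <= 0;
}
```

A flusher would snapshot queue_seq and then wait until flush_complete()
returns true. Because only the difference is compared, the test should
still hold after the counters wrap past INT_MAX.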

^ permalink raw reply

* [PATCH net-next-2.6 1/2][v2] bonding: add all_slaves_active parameter
From: Andy Gospodarek @ 2010-06-02 18:39 UTC (permalink / raw)
  To: netdev; +Cc: fubar, nhorman


v2: changed parameter name from 'keep_all' to 'all_slaves_active' and
skipped setting slaves to inactive rather than creating a new flag at
Jay's suggestion.

In an effort to suppress duplicate frames on certain bonding modes
(specifically the modes that do not require additional configuration on
the switch or switches connected to the host), code was added in the
generic receive path in 2.6.16.  The current behavior works quite well
for most users, but there are times when it would be nice to restore the
old functionality and allow all frames to make their way up the stack.

This patch adds support for a new module option and sysfs file called
'all_slaves_active' that will restore pre-2.6.16 functionality if the
user desires.  The default value is '0' and retains existing behavior,
but the user can set it to '1' and allow all frames up if desired.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 drivers/net/bonding/bond_main.c  |   13 +++++++++
 drivers/net/bonding/bond_sysfs.c |   52 ++++++++++++++++++++++++++++++++++++++
 drivers/net/bonding/bonding.h    |    4 ++-
 3 files changed, 68 insertions(+), 1 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index ef60244..f22f6bf 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -106,6 +106,7 @@ static int arp_interval = BOND_LINK_ARP_INTERV;
 static char *arp_ip_target[BOND_MAX_ARP_TARGETS];
 static char *arp_validate;
 static char *fail_over_mac;
+static int all_slaves_active = 0;
 static struct bond_params bonding_defaults;
 
 module_param(max_bonds, int, 0);
@@ -155,6 +156,10 @@ module_param(arp_validate, charp, 0);
 MODULE_PARM_DESC(arp_validate, "validate src/dst of ARP probes: none (default), active, backup or all");
 module_param(fail_over_mac, charp, 0);
 MODULE_PARM_DESC(fail_over_mac, "For active-backup, do not set all slaves to the same MAC.  none (default), active or follow");
+module_param(all_slaves_active, int, 0);
+MODULE_PARM_DESC(all_slaves_active, "Keep all frames received on an interface "
+				     "by setting active flag for all slaves.  "
+				     "0 for never (default), 1 for always.");
 
 /*----------------------------- Global variables ----------------------------*/
 
@@ -4771,6 +4776,13 @@ static int bond_check_params(struct bond_params *params)
 		}
 	}
 
+	if ((all_slaves_active != 0) && (all_slaves_active != 1)) {
+		pr_warning("Warning: all_slaves_active module parameter (%d), "
+			   "not of valid value (0/1), so it was set to "
+			   "0\n", all_slaves_active);
+		all_slaves_active = 0;
+	}
+
 	/* reset values for TLB/ALB */
 	if ((bond_mode == BOND_MODE_TLB) ||
 	    (bond_mode == BOND_MODE_ALB)) {
@@ -4941,6 +4953,7 @@ static int bond_check_params(struct bond_params *params)
 	params->primary[0] = 0;
 	params->primary_reselect = primary_reselect_value;
 	params->fail_over_mac = fail_over_mac_value;
+	params->all_slaves_active = all_slaves_active;
 
 	if (primary) {
 		strncpy(params->primary, primary, IFNAMSIZ);
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 496ac1e..066311a 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1411,7 +1411,58 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d,
 }
 static DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL);
 
+/*
+ * Show and set the all_slaves_active flag.
+ */
+static ssize_t bonding_show_slaves_active(struct device *d,
+					  struct device_attribute *attr,
+					  char *buf)
+{
+	struct bonding *bond = to_bond(d);
+
+	return sprintf(buf, "%d\n", bond->params.all_slaves_active);
+}
+
+static ssize_t bonding_store_slaves_active(struct device *d,
+					   struct device_attribute *attr,
+					   const char *buf, size_t count)
+{
+	int i, new_value, ret = count;
+	struct bonding *bond = to_bond(d);
+	struct slave *slave;
+
+	if (sscanf(buf, "%d", &new_value) != 1) {
+		pr_err("%s: no all_slaves_active value specified.\n",
+		       bond->dev->name);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (new_value == bond->params.all_slaves_active)
+		goto out;
+
+	if ((new_value == 0) || (new_value == 1)) {
+		bond->params.all_slaves_active = new_value;
+	} else {
+		pr_info("%s: Ignoring invalid all_slaves_active value %d.\n",
+			bond->dev->name, new_value);
+		ret = -EINVAL;
+		goto out;
+	}
 
+	bond_for_each_slave(bond, slave, i) {
+		if (slave->state == BOND_STATE_BACKUP) {
+			if (new_value)
+				slave->dev->priv_flags &= ~IFF_SLAVE_INACTIVE;
+			else
+				slave->dev->priv_flags |= IFF_SLAVE_INACTIVE;
+		}
+	}
+out:
+	return ret;
+}
+static DEVICE_ATTR(all_slaves_active, S_IRUGO | S_IWUSR,
+		   bonding_show_slaves_active, bonding_store_slaves_active);
 
 static struct attribute *per_bond_attrs[] = {
 	&dev_attr_slaves.attr,
@@ -1438,6 +1489,7 @@ static struct attribute *per_bond_attrs[] = {
 	&dev_attr_ad_actor_key.attr,
 	&dev_attr_ad_partner_key.attr,
 	&dev_attr_ad_partner_mac.attr,
+	&dev_attr_all_slaves_active.attr,
 	NULL,
 };
 
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index da80964..cecdea2 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -131,6 +131,7 @@ struct bond_params {
 	char primary[IFNAMSIZ];
 	int primary_reselect;
 	__be32 arp_targets[BOND_MAX_ARP_TARGETS];
+	int all_slaves_active;
 };
 
 struct bond_parm_tbl {
@@ -290,7 +291,8 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave)
 	struct bonding *bond = netdev_priv(slave->dev->master);
 	if (!bond_is_lb(bond))
 		slave->state = BOND_STATE_BACKUP;
-	slave->dev->priv_flags |= IFF_SLAVE_INACTIVE;
+	if (!bond->params.all_slaves_active)
+		slave->dev->priv_flags |= IFF_SLAVE_INACTIVE;
 	if (slave_do_arp_validate(bond, slave))
 		slave->dev->priv_flags |= IFF_SLAVE_NEEDARP;
 }
-- 
1.7.0.1
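
The sysfs store handler in the patch above validates a 0/1 value and then
flips the inactive flag on every backup slave. A minimal userspace sketch
of that flow follows. struct fake_slave, SLAVE_INACTIVE and
store_all_slaves_active() are hypothetical stand-ins, not bonding driver
code, and -1 is used where the kernel would return -EINVAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SLAVE_INACTIVE 0x4u	/* hypothetical stand-in for IFF_SLAVE_INACTIVE */

enum slave_state { STATE_ACTIVE, STATE_BACKUP };

struct fake_slave {
	enum slave_state state;
	unsigned int priv_flags;
};

/* Parse "0" or "1"; on success update *flag and every backup slave.
 * Returns 0 on success and -1 on invalid input. */
static int store_all_slaves_active(const char *buf, int *flag,
				   struct fake_slave *slaves, size_t n)
{
	int new_value;
	size_t i;

	assert(buf != NULL && flag != NULL && (slaves != NULL || n == 0));

	if (sscanf(buf, "%d", &new_value) != 1)
		return -1;		/* nothing parseable was written */
	if (new_value != 0 && new_value != 1)
		return -1;		/* only 0 and 1 are valid */
	if (new_value == *flag)
		return 0;		/* no change, nothing to do */

	*flag = new_value;
	for (i = 0; i < n; i++) {
		if (slaves[i].state != STATE_BACKUP)
			continue;	/* active slaves are left alone */
		if (new_value)
			slaves[i].priv_flags &= ~SLAVE_INACTIVE;
		else
			slaves[i].priv_flags |= SLAVE_INACTIVE;
	}
	return 0;
}
```

Note that the sketch reports the error to the caller on invalid input,
which is what a writer to the sysfs file would need in order to notice a
rejected value.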


^ permalink raw reply related

* Re: [PATCH v2] netfilter: Xtables: idletimer target implementation
From: Luciano Coelho @ 2010-06-02 18:37 UTC (permalink / raw)
  To: ext Jan Engelhardt
  Cc: netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	kaber@trash.net, Timo Teras
In-Reply-To: <alpine.LSU.2.01.1006021708370.27340@obet.zrqbmnf.qr>

On Wed, 2010-06-02 at 17:16 +0200, ext Jan Engelhardt wrote:
> On Wednesday 2010-06-02 15:41, Luciano Coelho wrote:
> 
> >+static int __init idletimer_tg_init(void)
> >+{
> >+	int ret;
> >+
> >+	idletimer_tg_kobj = kobject_create_and_add("idletimer",
> >+						   &THIS_MODULE->mkobj.kobj);
> 
> Isn't this going to oops when you compile this module as =y?

Damn, that's true. :(

I'll investigate how to fix this.

-- 
Cheers,
Luca.
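
For context on the oops Jan points out: when code is built in (=y),
THIS_MODULE is a NULL pointer, so "&THIS_MODULE->mkobj.kobj" does not
yield NULL but a small bogus address equal to the member offset. The
userspace sketch below only illustrates that arithmetic. The fake_*
structs are hypothetical and do not match the real struct module layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct fake_kobject {
	const char *name;
};

struct fake_module_kobject {
	struct fake_kobject kobj;
};

struct fake_module {
	char name[56];
	struct fake_module_kobject mkobj;
};

/* Address that "&mod->mkobj.kobj" would evaluate to for a module located
 * at module_addr, computed without dereferencing anything. */
static uintptr_t embedded_kobj_addr(uintptr_t module_addr)
{
	size_t off = offsetof(struct fake_module, mkobj) +
		     offsetof(struct fake_module_kobject, kobj);

	assert(module_addr <= UINTPTR_MAX - off);	/* no overflow */
	return module_addr + off;
}
```

With module_addr == 0 the result is non-zero, so a NULL check on the
parent kobject would not catch it; the caller would have to test
THIS_MODULE itself, or use a parent that exists in both build modes.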


^ permalink raw reply

* Re: [PATCH] net: mac8390 - Sort out memory/MMIO accesses and casts
From: Joe Perches @ 2010-06-02 18:21 UTC (permalink / raw)
  To: Geert Uytterhoeven, Finn Thain; +Cc: davem, netdev
In-Reply-To: <1275500180-32640-1-git-send-email-geert@linux-m68k.org>

On Wed, 2010-06-02 at 19:36 +0200, Geert Uytterhoeven wrote:
> commit 5c7fffd0e3b57cb63f50bbd710868f012d67654f ("drivers/net/mac8390.c: Remove
> useless memcpy casting") removed too many casts, introducing the following
> warnings:
> 
> | drivers/net/mac8390.c:248: warning: passing argument 1 of '__builtin_memcpy' makes pointer from integer without a cast
> | drivers/net/mac8390.c:253: warning: passing argument 1 of 'word_memcpy_tocard' makes pointer from integer without a cast
> | drivers/net/mac8390.c:255: warning: passing argument 2 of 'word_memcpy_fromcard' makes pointer from integer without a cast
> 
> Instead of just re-adding the casts,
>   - move all casts inside word_memcpy_{to,from}card(),
>   - replace an incorrect memcpy() by memcpy_toio(),
>   - add memcmp_withio() as a wrapper around memcmp(),
>   - replace an incorrect memcpy_toio() by memcpy_fromio().
> 
> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Tested-by: Finn Thain <fthain@telegraphics.com.au>

Thanks Geert and Finn.

Apologies for not setting up a cross-compiler to fix this.

cheers, Joe


^ permalink raw reply

* Re: sysfs class/net/ problem
From: Eric W. Biederman @ 2010-06-02 18:05 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Greg KH, netdev
In-Reply-To: <1275501157.3915.22.camel@jlt3.sipsolutions.net>

Johannes Berg <johannes@sipsolutions.net> writes:

> On Wed, 2010-06-02 at 10:23 -0700, Eric W. Biederman wrote:
>
>> So far the hypothesis that the target of the symlink is being removed before
>> the actual link looks like it could cause this.
>
> Yeah though I'm not sure how that would happen? Wouldn't the symlink
> cause the target kobject to still be referenced, and thus stay around
> until the symlink goes away?

The references don't affect visibility in sysfs.  All of that is manual
at the sysfs layer, and there doesn't appear to be a good substitute.  Generally
the device layer manages to handle all of the details automatically but it
appears something is missing.

>> Are there any other left overs in sysfs, besides just /sys/class/net/wlan0?
>
> No, not based on find /sys and diffing before/after anyway.

It is going to be a little bit before I manage to dig into this deeply.

If you want to dig into this, look at sysfs_delete_link.  Instrument
it so that you can see whether it is called for wlan{0,1,2} and which
ns it is called for.

My current hypothesis is something is causing us to try and delete
the symlink from the wrong namespace, so we just skip that part of it.

Eric

^ permalink raw reply

* [PATCH] chelsio: Remove remnants of CONFIG_CHELSIO_T1_COUGAR
From: Roland Dreier @ 2010-06-02 18:04 UTC (permalink / raw)
  To: netdev, David S. Miller, Stephen Hemminger

CONFIG_CHELSIO_T1_COUGAR cannot be set (it appears nowhere in any
Kconfig files), and the code it protects could never build (cspi.h was
never added to the kernel tree).  Therefore it's pretty safe to remove
all vestiges of this dead code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
---
 drivers/net/chelsio/common.h |    1 -
 drivers/net/chelsio/subr.c   |   49 +----------------------------------------
 2 files changed, 2 insertions(+), 48 deletions(-)

diff --git a/drivers/net/chelsio/common.h b/drivers/net/chelsio/common.h
index 036b2df..092f31a 100644
--- a/drivers/net/chelsio/common.h
+++ b/drivers/net/chelsio/common.h
@@ -286,7 +286,6 @@ struct board_info {
 	unsigned int            clock_mc3;
 	unsigned int            clock_mc4;
 	unsigned int            espi_nports;
-	unsigned int            clock_cspi;
 	unsigned int            clock_elmer0;
 	unsigned char           mdio_mdien;
 	unsigned char           mdio_mdiinv;
diff --git a/drivers/net/chelsio/subr.c b/drivers/net/chelsio/subr.c
index 53bde15..599d178 100644
--- a/drivers/net/chelsio/subr.c
+++ b/drivers/net/chelsio/subr.c
@@ -185,9 +185,6 @@ static int t1_pci_intr_handler(adapter_t *adapter)
 	return 0;
 }
 
-#ifdef CONFIG_CHELSIO_T1_COUGAR
-#include "cspi.h"
-#endif
 #ifdef CONFIG_CHELSIO_T1_1G
 #include "fpga_defs.h"
 
@@ -280,7 +277,7 @@ static void mi1_mdio_init(adapter_t *adapter, const struct board_info *bi)
 	t1_tpi_write(adapter, A_ELMER0_PORT0_MI1_CFG, val);
 }
 
-#if defined(CONFIG_CHELSIO_T1_1G) || defined(CONFIG_CHELSIO_T1_COUGAR)
+#if defined(CONFIG_CHELSIO_T1_1G)
 /*
  * Elmer MI1 MDIO read/write operations.
  */
@@ -317,7 +314,7 @@ static int mi1_mdio_write(struct net_device *dev, int phy_addr, int mmd_addr,
 	return 0;
 }
 
-#if defined(CONFIG_CHELSIO_T1_1G) || defined(CONFIG_CHELSIO_T1_COUGAR)
+#if defined(CONFIG_CHELSIO_T1_1G)
 static const struct mdio_ops mi1_mdio_ops = {
 	.init = mi1_mdio_init,
 	.read = mi1_mdio_read,
@@ -752,31 +749,6 @@ int t1_elmer0_ext_intr_handler(adapter_t *adapter)
 					 mod_detect ? "removed" : "inserted");
 		}
 		break;
-#ifdef CONFIG_CHELSIO_T1_COUGAR
-	case CHBT_BOARD_COUGAR:
-		if (adapter->params.nports == 1) {
-			if (cause & ELMER0_GP_BIT1) {         /* Vitesse MAC */
-				struct cmac *mac = adapter->port[0].mac;
-				mac->ops->interrupt_handler(mac);
-			}
-			if (cause & ELMER0_GP_BIT5) {     /* XPAK MOD_DETECT */
-			}
-		} else {
-			int i, port_bit;
-
-			for_each_port(adapter, i) {
-				port_bit = i ? i + 1 : 0;
-				if (!(cause & (1 << port_bit)))
-					continue;
-
-				phy = adapter->port[i].phy;
-				phy_cause = phy->ops->interrupt_handler(phy);
-				if (phy_cause & cphy_cause_link_change)
-					t1_link_changed(adapter, i);
-			}
-		}
-		break;
-#endif
 	}
 	t1_tpi_write(adapter, A_ELMER0_INT_CAUSE, cause);
 	return 0;
@@ -955,7 +927,6 @@ static int board_init(adapter_t *adapter, const struct board_info *bi)
 	case CHBT_BOARD_N110:
 	case CHBT_BOARD_N210:
 	case CHBT_BOARD_CHT210:
-	case CHBT_BOARD_COUGAR:
 		t1_tpi_par(adapter, 0xf);
 		t1_tpi_write(adapter, A_ELMER0_GPO, 0x800);
 		break;
@@ -1004,10 +975,6 @@ int t1_init_hw_modules(adapter_t *adapter)
 		       adapter->regs + A_MC5_CONFIG);
 	}
 
-#ifdef CONFIG_CHELSIO_T1_COUGAR
-	if (adapter->cspi && t1_cspi_init(adapter->cspi))
-		goto out_err;
-#endif
 	if (adapter->espi && t1_espi_init(adapter->espi, bi->chip_mac,
 					  bi->espi_nports))
 		goto out_err;
@@ -1061,10 +1028,6 @@ void t1_free_sw_modules(adapter_t *adapter)
 		t1_tp_destroy(adapter->tp);
 	if (adapter->espi)
 		t1_espi_destroy(adapter->espi);
-#ifdef CONFIG_CHELSIO_T1_COUGAR
-	if (adapter->cspi)
-		t1_cspi_destroy(adapter->cspi);
-#endif
 }
 
 static void __devinit init_link_config(struct link_config *lc,
@@ -1084,14 +1047,6 @@ static void __devinit init_link_config(struct link_config *lc,
 	}
 }
 
-#ifdef CONFIG_CHELSIO_T1_COUGAR
-	if (bi->clock_cspi && !(adapter->cspi = t1_cspi_create(adapter))) {
-		pr_err("%s: CSPI initialization failed\n",
-		       adapter->name);
-		goto error;
-	}
-#endif
-
 /*
  * Allocate and initialize the data structures that hold the SW state of
  * the Terminator HW modules.

-- 
Roland Dreier <rolandd@cisco.com> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

^ permalink raw reply related

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
From: Christoph Lameter @ 2010-06-02 18:01 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, shemminger
In-Reply-To: <1275500802.2519.7.camel@edumazet-laptop>

On Wed, 2 Jun 2010, Eric Dumazet wrote:

> Here is the patch I cooked to account for RP_FILTER errors in multicast
> path.
>
> I will complete it to also do the unicast part before official
> submission.
>
> Christoph, the official counter would be IPSTATS_MIB_INNOROUTES

Great. Thanks.

> ipSystemStatsInNoRoutes OBJECT-TYPE
>     SYNTAX     Counter32
>     MAX-ACCESS read-only
>     STATUS     current
>     DESCRIPTION
>            "The number of input IP datagrams discarded because no route
>             could be found to transmit them to their destination.

Add "or because the rp_filter rejected the packet"? In the case of MC
traffic you don't really need a route.

In my particular case it is a weird corner case for the rp_filter.

Two NICs are on the same subnet. Different multicast groups are joined
on each (Using two NICs to balance the MC load since the drivers have
some multicast limitations and having different interrupt lines for each
NIC is also beneficial).

The rp_filter rejects all multicast traffic to the subscriptions on the
second NIC. I guess this is because the source address of the MC traffic
(on the same subnet) is also reachable via the first NIC.

So you could add also "because of breakage in the rp_filter (rp_filter
ignores the multicast subscription tables when determining the correct
reverse path of the packet)"
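
The two-NIC case described above can be modelled with a toy strict
reverse-path check. This is a sketch of the guess in this mail, not the
kernel's fib_validate_source(): struct fake_route, best_ifindex_for() and
strict_rpf_accepts() are hypothetical, and "first matching route wins" is
an assumption standing in for the real route selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct fake_route {
	uint32_t prefix;	/* network address, host byte order */
	uint32_t mask;		/* netmask, host byte order */
	int ifindex;		/* interface this route points at */
};

/* Return the interface of the first route covering src, or -1 if none. */
static int best_ifindex_for(const struct fake_route *tbl, size_t n,
			    uint32_t src)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if ((src & tbl[i].mask) == tbl[i].prefix)
			return tbl[i].ifindex;
	return -1;
}

/* Strict mode: accept only if the route back to src uses the interface
 * the packet arrived on. Multicast subscriptions are not consulted. */
static bool strict_rpf_accepts(const struct fake_route *tbl, size_t n,
			       uint32_t src, int in_ifindex)
{
	return best_ifindex_for(tbl, n, src) == in_ifindex;
}
```

With two interfaces holding identical subnet routes, a sender on that
subnet always resolves to the first interface, so everything arriving on
the second one is rejected regardless of which groups were joined there.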

^ permalink raw reply

* Re: [Patch]8139too: remove unnecessary cast of ioread32()'s return value
From: Jeff Garzik @ 2010-06-02 17:52 UTC (permalink / raw)
  To: David Miller; +Cc: romieu, netdev
In-Reply-To: <20100530.183557.104047217.davem@davemloft.net>

On 05/30/2010 09:35 PM, David Miller wrote:
> From: David Miller<davem@davemloft.net>
> Date: Sun, 30 May 2010 18:29:48 -0700 (PDT)
>
>> From: Jeff Garzik<jeff@garzik.org>
>> Date: Sun, 30 May 2010 19:24:18 -0400
>>
>>> That was the genesis of the question.  Some arches still use unsigned
>>> long.
>>
>> They are 32-bit.
>
> In fact the only two offenders are h8300 and m32r, which are
> both 32-bit.

The main interesting one is an ARM sub-arch that supports PCI.


> This is really in the realm of "who cares."

Fair enough.  That answers my question.

	Jeff




^ permalink raw reply

* Re: sysfs class/net/ problem
From: Johannes Berg @ 2010-06-02 17:52 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Greg KH, netdev
In-Reply-To: <m1ocftwdp7.fsf@fess.ebiederm.org>

On Wed, 2010-06-02 at 10:23 -0700, Eric W. Biederman wrote:

> So far the hypothesis that the target of the symlink is being removed before
> the actual link looks like it could cause this.

Yeah though I'm not sure how that would happen? Wouldn't the symlink
cause the target kobject to still be referenced, and thus stay around
until the symlink goes away?

> Are there any other left overs in sysfs, besides just /sys/class/net/wlan0?

No, not based on find /sys and diffing before/after anyway.

johannes

^ permalink raw reply

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
From: Eric Dumazet @ 2010-06-02 17:46 UTC (permalink / raw)
  To: David Miller; +Cc: cl, netdev, shemminger
In-Reply-To: <20100602.103102.121237521.davem@davemloft.net>

On Wednesday, 2 June 2010 at 10:31 -0700, David Miller wrote:
> Just in case people are really so clueless as to be unable to figure
> this out:
> 
> echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
> ...do some stuff...
> cat /sys/kernel/debug/tracing/trace
> 
> You can even trace it using 'perf' by passing "skb:kfree_skb"
> as the event specifier.

Thanks !

Here is the patch I cooked to account for RP_FILTER errors in multicast
path.

I will complete it to also do the unicast part before official
submission.

Christoph, the official counter would be IPSTATS_MIB_INNOROUTES

ipSystemStatsInNoRoutes OBJECT-TYPE
    SYNTAX     Counter32
    MAX-ACCESS read-only
    STATUS     current
    DESCRIPTION
           "The number of input IP datagrams discarded because no route
            could be found to transmit them to their destination.



diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4f0ed45..f207289 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -284,7 +284,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 	if (no_addr)
 		goto last_resort;
 	if (rpf == 1)
-		goto e_inval;
+		goto e_rpf;
 	fl.oif = dev->ifindex;
 
 	ret = 0;
@@ -299,7 +299,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 
 last_resort:
 	if (rpf)
-		goto e_inval;
+		goto e_rpf;
 	*spec_dst = inet_select_addr(dev, 0, RT_SCOPE_UNIVERSE);
 	*itag = 0;
 	return 0;
@@ -308,6 +308,8 @@ e_inval_res:
 	fib_res_put(&res);
 e_inval:
 	return -EINVAL;
+e_rpf:
+	return -ENETUNREACH;
 }
 
 static inline __be32 sk_extract_addr(struct sockaddr *addr)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8495bce..8e9e2f9 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1851,6 +1851,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	__be32 spec_dst;
 	struct in_device *in_dev = in_dev_get(dev);
 	u32 itag = 0;
+	int err;
 
 	/* Primary sanity checks. */
 
@@ -1865,10 +1866,12 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		if (!ipv4_is_local_multicast(daddr))
 			goto e_inval;
 		spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
-	} else if (fib_validate_source(saddr, 0, tos, 0,
-					dev, &spec_dst, &itag, 0) < 0)
-		goto e_inval;
-
+	} else {
+		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
+					  &itag, 0);
+		if (err < 0)
+			goto e_err;
+	}
 	rth = dst_alloc(&ipv4_dst_ops);
 	if (!rth)
 		goto e_nobufs;
@@ -1922,6 +1925,9 @@ e_nobufs:
 e_inval:
 	in_dev_put(in_dev);
 	return -EINVAL;
+e_err:
+	in_dev_put(in_dev);
+	return err;
 }
 
 



^ permalink raw reply related

* [PATCH] net: mac8390 - Sort out memory/MMIO accesses and casts
From: Geert Uytterhoeven @ 2010-06-02 17:36 UTC (permalink / raw)
  To: davem; +Cc: netdev, Geert Uytterhoeven

commit 5c7fffd0e3b57cb63f50bbd710868f012d67654f ("drivers/net/mac8390.c: Remove
useless memcpy casting") removed too many casts, introducing the following
warnings:

| drivers/net/mac8390.c:248: warning: passing argument 1 of '__builtin_memcpy' makes pointer from integer without a cast
| drivers/net/mac8390.c:253: warning: passing argument 1 of 'word_memcpy_tocard' makes pointer from integer without a cast
| drivers/net/mac8390.c:255: warning: passing argument 2 of 'word_memcpy_fromcard' makes pointer from integer without a cast

Instead of just re-adding the casts,
  - move all casts inside word_memcpy_{to,from}card(),
  - replace an incorrect memcpy() by memcpy_toio(),
  - add memcmp_withio() as a wrapper around memcmp(),
  - replace an incorrect memcpy_toio() by memcpy_fromio().

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Finn Thain <fthain@telegraphics.com.au>
---
 drivers/net/mac8390.c |   44 ++++++++++++++++++++++----------------------
 1 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mac8390.c b/drivers/net/mac8390.c
index 1136c9a..c7476a4 100644
--- a/drivers/net/mac8390.c
+++ b/drivers/net/mac8390.c
@@ -157,6 +157,8 @@ static void dayna_block_output(struct net_device *dev, int count,
 #define memcpy_fromio(a, b, c)	memcpy((a), (void *)(b), (c))
 #define memcpy_toio(a, b, c)	memcpy((void *)(a), (b), (c))
 
+#define memcmp_withio(a, b, c)	memcmp((a), (void *)(b), (c))
+
 /* Slow Sane (16-bit chunk memory read/write) Cabletron uses this */
 static void slow_sane_get_8390_hdr(struct net_device *dev,
 				   struct e8390_pkt_hdr *hdr, int ring_page);
@@ -164,8 +166,8 @@ static void slow_sane_block_input(struct net_device *dev, int count,
 				  struct sk_buff *skb, int ring_offset);
 static void slow_sane_block_output(struct net_device *dev, int count,
 				   const unsigned char *buf, int start_page);
-static void word_memcpy_tocard(void *tp, const void *fp, int count);
-static void word_memcpy_fromcard(void *tp, const void *fp, int count);
+static void word_memcpy_tocard(unsigned long tp, const void *fp, int count);
+static void word_memcpy_fromcard(void *tp, unsigned long fp, int count);
 
 static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
 {
@@ -245,9 +247,9 @@ static enum mac8390_access __init mac8390_testio(volatile unsigned long membase)
 	unsigned long outdata = 0xA5A0B5B0;
 	unsigned long indata =  0x00000000;
 	/* Try writing 32 bits */
-	memcpy(membase, &outdata, 4);
+	memcpy_toio(membase, &outdata, 4);
 	/* Now compare them */
-	if (memcmp((char *)&outdata, (char *)membase, 4) == 0)
+	if (memcmp_withio(&outdata, membase, 4) == 0)
 		return ACCESS_32;
 	/* Write 16 bit output */
 	word_memcpy_tocard(membase, &outdata, 4);
@@ -731,7 +733,7 @@ static void sane_get_8390_hdr(struct net_device *dev,
 			      struct e8390_pkt_hdr *hdr, int ring_page)
 {
 	unsigned long hdr_start = (ring_page - WD_START_PG)<<8;
-	memcpy_fromio((void *)hdr, (char *)dev->mem_start + hdr_start, 4);
+	memcpy_fromio(hdr, dev->mem_start + hdr_start, 4);
 	/* Fix endianness */
 	hdr->count = swab16(hdr->count);
 }
@@ -745,14 +747,13 @@ static void sane_block_input(struct net_device *dev, int count,
 	if (xfer_start + count > ei_status.rmem_end) {
 		/* We must wrap the input move. */
 		int semi_count = ei_status.rmem_end - xfer_start;
-		memcpy_fromio(skb->data, (char *)dev->mem_start + xfer_base,
+		memcpy_fromio(skb->data, dev->mem_start + xfer_base,
 			      semi_count);
 		count -= semi_count;
-		memcpy_toio(skb->data + semi_count,
-			    (char *)ei_status.rmem_start, count);
-	} else {
-		memcpy_fromio(skb->data, (char *)dev->mem_start + xfer_base,
+		memcpy_fromio(skb->data + semi_count, ei_status.rmem_start,
 			      count);
+	} else {
+		memcpy_fromio(skb->data, dev->mem_start + xfer_base, count);
 	}
 }
 
@@ -761,7 +762,7 @@ static void sane_block_output(struct net_device *dev, int count,
 {
 	long shmem = (start_page - WD_START_PG)<<8;
 
-	memcpy_toio((char *)dev->mem_start + shmem, buf, count);
+	memcpy_toio(dev->mem_start + shmem, buf, count);
 }
 
 /* dayna block input/output */
@@ -812,7 +813,7 @@ static void slow_sane_get_8390_hdr(struct net_device *dev,
 				   int ring_page)
 {
 	unsigned long hdr_start = (ring_page - WD_START_PG)<<8;
-	word_memcpy_fromcard(hdr, (char *)dev->mem_start + hdr_start, 4);
+	word_memcpy_fromcard(hdr, dev->mem_start + hdr_start, 4);
 	/* Register endianism - fix here rather than 8390.c */
 	hdr->count = (hdr->count&0xFF)<<8|(hdr->count>>8);
 }
@@ -826,15 +827,14 @@ static void slow_sane_block_input(struct net_device *dev, int count,
 	if (xfer_start + count > ei_status.rmem_end) {
 		/* We must wrap the input move. */
 		int semi_count = ei_status.rmem_end - xfer_start;
-		word_memcpy_fromcard(skb->data,
-				     (char *)dev->mem_start + xfer_base,
+		word_memcpy_fromcard(skb->data, dev->mem_start + xfer_base,
 				     semi_count);
 		count -= semi_count;
 		word_memcpy_fromcard(skb->data + semi_count,
-				     (char *)ei_status.rmem_start, count);
+				     ei_status.rmem_start, count);
 	} else {
-		word_memcpy_fromcard(skb->data,
-				     (char *)dev->mem_start + xfer_base, count);
+		word_memcpy_fromcard(skb->data, dev->mem_start + xfer_base,
+				     count);
 	}
 }
 
@@ -843,12 +843,12 @@ static void slow_sane_block_output(struct net_device *dev, int count,
 {
 	long shmem = (start_page - WD_START_PG)<<8;
 
-	word_memcpy_tocard((char *)dev->mem_start + shmem, buf, count);
+	word_memcpy_tocard(dev->mem_start + shmem, buf, count);
 }
 
-static void word_memcpy_tocard(void *tp, const void *fp, int count)
+static void word_memcpy_tocard(unsigned long tp, const void *fp, int count)
 {
-	volatile unsigned short *to = tp;
+	volatile unsigned short *to = (void *)tp;
 	const unsigned short *from = fp;
 
 	count++;
@@ -858,10 +858,10 @@ static void word_memcpy_tocard(void *tp, const void *fp, int count)
 		*to++ = *from++;
 }
 
-static void word_memcpy_fromcard(void *tp, const void *fp, int count)
+static void word_memcpy_fromcard(void *tp, unsigned long fp, int count)
 {
 	unsigned short *to = tp;
-	const volatile unsigned short *from = fp;
+	const volatile unsigned short *from = (const void *)fp;
 
 	count++;
 	count /= 2;
-- 
1.7.0.4
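
For readers following the signature change, here is a minimal userspace sketch of the word-wise copy after the patch. It is an illustration, not the driver code: the name word_copy_to and the use of uintptr_t/uint16_t (in place of the driver's unsigned long and unsigned short) are assumptions, and the while (count--) loop condition is assumed because it falls outside the hunk context shown above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch of the patched word_memcpy_tocard(): the destination is
 * passed as an integer address and converted to a pointer in exactly one
 * place, so callers no longer need their own (char *) casts.
 * word_copy_to is a hypothetical name used only for this example.
 */
static void word_copy_to(uintptr_t tp, const void *fp, int count)
{
	volatile uint16_t *to = (volatile uint16_t *)tp;
	const uint16_t *from = fp;

	/* Round the byte count up to a whole number of 16-bit words. */
	count++;
	count /= 2;

	/* Assumed loop condition; only the loop body is visible in the hunk. */
	while (count--)
		*to++ = *from++;
}
```

Because the count is rounded up, an odd byte count transfers one extra byte, so both buffers need room for a whole number of 16-bit words.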


^ permalink raw reply related

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
From: Neil Horman @ 2010-06-02 17:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, cl, netdev, shemminger
In-Reply-To: <1275499150.2519.0.camel@edumazet-laptop>

On Wed, Jun 02, 2010 at 07:19:10PM +0200, Eric Dumazet wrote:
> On Wednesday, 02 June 2010 at 10:12 -0700, David Miller wrote:
> > From: Christoph Lameter <cl@linux-foundation.org>
> > Date: Wed, 2 Jun 2010 11:49:18 -0500 (CDT)
> > 
> > > On Wed, 2 Jun 2010, Eric Dumazet wrote:
> > > 
> > >> take a look at
> > >>
> > >> http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/SystemTap_Beginners_Guide/useful-systemtap-scripts.html#dropwatch
> > > 
> > > System tap?
> > 
> > You don't need to use system tap, just the normal tracing stuff using
> > sysfs files suffices.
> > 
> 
> It would be good if Neil could give us a man page or something ;)
> 
That stap script was really meant to be a stopgap measure.  As mentioned, you
can use the debugfs interface to turn tracepoints on and use them any way you
wish.  Or, if you want to use the kfree_skb and napi_poll tracepoints in a more
formalized way, you can use the dropwatch user space utility:

https://fedorahosted.org/dropwatch/

Which includes a man page on usage :)

I also recently updated it so that this utility can query /proc/kallsyms to
translate program counter values into symbolic names and offsets for you. :)
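
To make the translation step concrete, here is a minimal sketch of how a program counter can be mapped to a "symbol+offset" string. It is not dropwatch's actual code: struct ksym, ksym_lookup and ksym_format are hypothetical names, and the table is assumed to be already loaded and sorted by ascending address. Reading it from /proc/kallsyms (whose lines have the form "address type name") is left out.

```c
#include <stddef.h>
#include <stdio.h>

/* One symbol table entry; hypothetical type for this example. */
struct ksym {
	unsigned long addr;	/* start address of the symbol */
	const char *name;
};

/*
 * Binary search for the last entry whose start address is <= pc.
 * Returns NULL if pc lies below the first symbol.
 * The table must be sorted by ascending address.
 */
static const struct ksym *ksym_lookup(const struct ksym *tab, size_t n,
				      unsigned long pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].addr <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &tab[lo - 1] : NULL;
}

/* Format pc as "name+0xoffset", falling back to the raw address. */
static int ksym_format(char *buf, size_t len, const struct ksym *tab,
		       size_t n, unsigned long pc)
{
	const struct ksym *s = ksym_lookup(tab, n, pc);

	if (!s)
		return snprintf(buf, len, "0x%lx", pc);
	return snprintf(buf, len, "%s+0x%lx", s->name, pc - s->addr);
}
```

The nearest-preceding-symbol rule is the usual convention for this kind of lookup; a fuller version would also bound the match by the start of the next symbol or by a known symbol size.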

Regards
Neil


^ permalink raw reply

* Re: [PATCH] IP: Increment INADDRERRORS if routing for a packet is not successful
From: David Miller @ 2010-06-02 17:31 UTC (permalink / raw)
  To: cl; +Cc: eric.dumazet, netdev, shemminger
In-Reply-To: <20100602.101258.134121018.davem@davemloft.net>

Just in case people are really so clueless as to be unable to figure
this out:

echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
...do some stuff...
cat /sys/kernel/debug/tracing/trace

You can even trace it using 'perf' by passing "skb:kfree_skb"
as the event specifier.

^ permalink raw reply

* Re: [PATCH] net: mac8390 - Sort out memory/MMIO accesses and casts
From: David Miller @ 2010-06-02 17:26 UTC (permalink / raw)
  To: geert; +Cc: fthain, joe, netdev, linux-kernel, linux-m68k
In-Reply-To: <AANLkTilqknhgBCXd-5Yxj9U8gDzDfMmmZdwDOvf-8kbH@mail.gmail.com>

From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Wed, 2 Jun 2010 19:24:08 +0200

> On Sat, May 29, 2010 at 10:03, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> On Fri, May 28, 2010 at 19:21, Finn Thain <fthain@telegraphics.com.au> wrote:
>>> On Sun, 23 May 2010, Geert Uytterhoeven wrote:
>>>> >> But here's a better solution. I do not have the hardware to test it,
>>>> >> though. Finn, does it {look OK,work}?
>>>> >
>>>> > It looks fine. I can't test it right now, but I will do so when I get
>>>> > the opportunity.
>>>>
>>>> Any news from the test front?
>>>
>>> This is commit ba0f916ca7ac79356e2ed32a85c3aa8255b104e7, right?
>>
>> Yep.
>>
>>> If so, it tests OK here.
>>
>> Thanks for testing!
> 
> David, will you take this one too?

Please post a fresh copy, there is no way that sucker still applies cleanly
as there have been some changes in this area recently.

Thanks.

^ permalink raw reply

* Re: [PATCH] net: mac8390 - Sort out memory/MMIO accesses and casts (was: Re: drivers/net/mac8390.c: Remove useless memcpy casting)
From: Geert Uytterhoeven @ 2010-06-02 17:24 UTC (permalink / raw)
  To: Finn Thain
  Cc: Joe Perches, David S. Miller, netdev, Linux Kernel Mailing List,
	Linux/m68k
In-Reply-To: <AANLkTilLXOSb2P_m4uUubmJar2VMFXizzpa_mQ1j8GPz@mail.gmail.com>

On Sat, May 29, 2010 at 10:03, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Fri, May 28, 2010 at 19:21, Finn Thain <fthain@telegraphics.com.au> wrote:
>> On Sun, 23 May 2010, Geert Uytterhoeven wrote:
>>> >> But here's a better solution. I do not have the hardware to test it,
>>> >> though. Finn, does it {look OK,work}?
>>> >
>>> > It looks fine. I can't test it right now, but I will do so when I get
>>> > the opportunity.
>>>
>>> Any news from the test front?
>>
>> This is commit ba0f916ca7ac79356e2ed32a85c3aa8255b104e7, right?
>
> Yep.
>
>> If so, it tests OK here.
>
> Thanks for testing!

David, will you take this one too?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds

^ permalink raw reply
