Netdev List
 help / color / mirror / Atom feed
* Re: loosing IPMI-card by loading netconsole
From: Henning Fehrmann @ 2010-05-18 13:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Ronciak, John, Kirsher, Jeffrey T, Brandeburg, Jesse,
	Allan, Bruce W, Waskiewicz Jr, Peter P, netdev@vger.kernel.org,
	Carsten Aulbert
In-Reply-To: <4BEDCD87.4090602@kernel.org>


Hello,

> >> Yeap, sure, it would be effective but I kind of want to leave
> >> bisection as the last resort.  Bisection is a somewhat painful process
> >> especially when the machine isn't right next to you and someone who
> >> has overall knowledge can often identify the problem much easier with
> >> appropriate debugging info.
> > 
> > Well nothing jumps to mind in the netpoll/netconsole code and I haven't
> > heard any similar reports. My guess is it's something obscure, so I
> > think the sooner you start bisecting... Even one or two tests will get
> > us a lot closer to figuring out what changed in the last 1.5 years.
> 
> I see.  I was hoping it would ring a bell to someone.  We'll probably
> try to provide the info Jesse asked and if that doesn't lead anywhere
> start bisecting.


Let me re-describe the symptoms.
I am not loading any ipmi related modules and not the netconsole
module.
When booting out current 2.6.32 kernel we can not access the IPMI
remotely.

We had one case where the IPMI card was accessible while using this
kernel but probably due to the fact that eth0 was removed. We do not
consider this case anymore. 

This problem does not occur when using an older kernel. 

It has likely nothing to do with netconsole.

Here is the bisecting result:

The sha1 sum of the first bad commit is: 
6e50912a442947d5fafd296ca6fdcbeb36b163ff

Hence, the last good commit has:
b2f8f7525c8aa1fdd8ad8c72c832dfb571d5f768

Cheers,
Henning

^ permalink raw reply

* Re: [PATCH net-next-2.6 2/2] bonding: allow user-controlled output slave selection
From: Andy Gospodarek @ 2010-05-18 12:57 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Neil Horman, netdev
In-Reply-To: <12958.1274145714@death.nxdomain.ibm.com>

On Mon, May 17, 2010 at 9:21 PM, Jay Vosburgh <fubar@us.ibm.com> wrote:
> Jay Vosburgh <fubar@us.ibm.com> wrote:
> [...]
>>       For your patch, I'm exploring the idea of not setting
>>IFF_SLAVE_INACTIVE on "inactive" slaves for an "all_slaves_active"
>>option (I think that's a more descriptive name than "keep_all") instead
>>of adding a new KEEP_ALL flag bit to priv_flags.  Did you consider this
>>methodology and exclude it for some reason?
>
>        Following up to myself, I coded up approximately what I was
> talking about.  This doesn't require the extra priv_flag, and the sysfs
> _store is a little more complicated, but this appears to work (testing
> with ping -f after clearing the switch's MAC table to induce traffic
> flooding).  I didn't change the option name from "keep_all" here, but as
> far as the functionality goes, this seems to do what I think you want it
> to.
>

I can test it here to be sure, but overall this seems like a fine
approach.  The IFF_SLAVE_INACTIVE doesn't seem to be used for much
other than duplicate suppression at this point, so it seems like a
logical choice.

As for the original name 'keep_all,' I tried to use something that
user would find easy to understand.  Obviously not all users think
alike. :-)


>        -J
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 5e12462..c80d2ff 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -106,6 +106,7 @@ static int arp_interval = BOND_LINK_ARP_INTERV;
>  static char *arp_ip_target[BOND_MAX_ARP_TARGETS];
>  static char *arp_validate;
>  static char *fail_over_mac;
> +static int keep_all    = 0;
>  static struct bond_params bonding_defaults;
>
>  module_param(max_bonds, int, 0);
> @@ -155,6 +156,9 @@ module_param(arp_validate, charp, 0);
>  MODULE_PARM_DESC(arp_validate, "validate src/dst of ARP probes: none (default), active, backup or all");
>  module_param(fail_over_mac, charp, 0);
>  MODULE_PARM_DESC(fail_over_mac, "For active-backup, do not set all slaves to the same MAC.  none (default), active or follow");
> +module_param(keep_all, int, 0);
> +MODULE_PARM_DESC(keep_all, "Keep all frames received on an interface"
> +                          "0 for never (default), 1 for always.");
>
>  /*----------------------------- Global variables ----------------------------*/
>
> @@ -4756,6 +4760,13 @@ static int bond_check_params(struct bond_params *params)
>                }
>        }
>
> +       if ((keep_all != 0) && (keep_all != 1)) {
> +               pr_warning("Warning: keep_all module parameter (%d), "
> +                          "not of valid value (0/1), so it was set to "
> +                          "0\n", keep_all);
> +               keep_all = 0;
> +       }
> +
>        /* reset values for TLB/ALB */
>        if ((bond_mode == BOND_MODE_TLB) ||
>            (bond_mode == BOND_MODE_ALB)) {
> @@ -4926,6 +4937,7 @@ static int bond_check_params(struct bond_params *params)
>        params->primary[0] = 0;
>        params->primary_reselect = primary_reselect_value;
>        params->fail_over_mac = fail_over_mac_value;
> +       params->keep_all = keep_all;
>
>        if (primary) {
>                strncpy(params->primary, primary, IFNAMSIZ);
> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
> index b8bec08..73b3c03 100644
> --- a/drivers/net/bonding/bond_sysfs.c
> +++ b/drivers/net/bonding/bond_sysfs.c
> @@ -339,7 +339,6 @@ out:
>
>  static DEVICE_ATTR(slaves, S_IRUGO | S_IWUSR, bonding_show_slaves,
>                   bonding_store_slaves);
> -
>  /*
>  * Show and set the bonding mode.  The bond interface must be down to
>  * change the mode.
> @@ -1472,6 +1471,59 @@ static ssize_t bonding_show_ad_partner_mac(struct device *d,
>  }
>  static DEVICE_ATTR(ad_partner_mac, S_IRUGO, bonding_show_ad_partner_mac, NULL);
>
> +/*
> + * Show and set the keep_all flag.
> + */
> +static ssize_t bonding_show_keep(struct device *d,
> +                                struct device_attribute *attr,
> +                                char *buf)
> +{
> +       struct bonding *bond = to_bond(d);
> +
> +       return sprintf(buf, "%d\n", bond->params.keep_all);
> +}
> +
> +static ssize_t bonding_store_keep(struct device *d,
> +                                 struct device_attribute *attr,
> +                                 const char *buf, size_t count)
> +{
> +       int i, new_value, ret = count;
> +       struct bonding *bond = to_bond(d);
> +       struct slave *slave;
> +
> +       if (sscanf(buf, "%d", &new_value) != 1) {
> +               pr_err("%s: no keep_all value specified.\n",
> +                      bond->dev->name);
> +               ret = -EINVAL;
> +               goto out;
> +       }
> +
> +       if (new_value == bond->params.keep_all)
> +               goto out;
> +
> +       if ((new_value == 0) || (new_value == 1)) {
> +               bond->params.keep_all = new_value;
> +       } else {
> +               pr_info("%s: Ignoring invalid keep_all value %d.\n",
> +                       bond->dev->name, new_value);
> +               ret = -EINVAL;
> +               goto out;
> +       }
> +
> +       bond_for_each_slave(bond, slave, i) {
> +               if (slave->state == BOND_STATE_BACKUP) {
> +                       if (new_value)
> +                               slave->dev->priv_flags &= ~IFF_SLAVE_INACTIVE;
> +                       else
> +                               slave->dev->priv_flags |= IFF_SLAVE_INACTIVE;
> +               }
> +       }
> +out:
> +       return count;
> +}
> +static DEVICE_ATTR(keep_all, S_IRUGO | S_IWUSR,
> +                  bonding_show_keep, bonding_store_keep);
> +
>
>
>  static struct attribute *per_bond_attrs[] = {
> @@ -1499,6 +1551,7 @@ static struct attribute *per_bond_attrs[] = {
>        &dev_attr_ad_actor_key.attr,
>        &dev_attr_ad_partner_key.attr,
>        &dev_attr_ad_partner_mac.attr,
> +       &dev_attr_keep_all.attr,
>        NULL,
>  };
>
> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
> index 2aa3367..ef7969b 100644
> --- a/drivers/net/bonding/bonding.h
> +++ b/drivers/net/bonding/bonding.h
> @@ -131,6 +131,7 @@ struct bond_params {
>        char primary[IFNAMSIZ];
>        int primary_reselect;
>        __be32 arp_targets[BOND_MAX_ARP_TARGETS];
> +       int keep_all;
>  };
>
>  struct bond_parm_tbl {
> @@ -291,7 +292,8 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave)
>        struct bonding *bond = netdev_priv(slave->dev->master);
>        if (!bond_is_lb(bond))
>                slave->state = BOND_STATE_BACKUP;
> -       slave->dev->priv_flags |= IFF_SLAVE_INACTIVE;
> +       if (!bond->params.keep_all)
> +               slave->dev->priv_flags |= IFF_SLAVE_INACTIVE;
>        if (slave_do_arp_validate(bond, slave))
>                slave->dev->priv_flags |= IFF_SLAVE_NEEDARP;
>  }
>
>
> ---
>        -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
>

^ permalink raw reply

* Re: dev_get_valid_name buggy with hash collision
From: Octavian Purdila @ 2010-05-18 12:29 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Linux Netdev List
In-Reply-To: <4BF26926.4070507@free.fr>

On Tuesday 18 May 2010 13:17:10 you wrote:

> the commit:
> 
> commit d90310243fd750240755e217c5faa13e24f41536
> Author: Octavian Purdila <opurdila@ixiacom.com>
> Date:   Wed Nov 18 02:36:59 2009 +0000
> 
>      net: device name allocation cleanups
> 
> introduced a bug when there is a hash collision making impossible to
> rename a device with eth%d
<snip>

Hi Daniel,

Thanks for the detailed explanation ! 

> 
> IMO, the bug is to pass the 'dev->name' parameter to __dev_alloc_name
> directly instead of a temporary name.

I agree.

> 
> The patch in attachment fix the problem but I am not sure we shouldn't
> go further and do more cleanup around this bug, so you may consider it
> as a RFC more than a fix. If it is acceptable, I will send it as a patch
> against net-2.6.
> 

The patch looks good to me, just one doubt here:

>--- net-2.6.orig/net/core/dev.c
>+++ net-2.6/net/core/dev.c
>@@ -936,18 +936,22 @@ int dev_alloc_name(struct net_device *de
> }
> EXPORT_SYMBOL(dev_alloc_name);
> 
>-static int dev_get_valid_name(struct net *net, const char *name, char *buf,
>-                             bool fmt)
>+static int dev_get_valid_name(struct net_device *dev, const char *name, bool 
fmt)
> {
>+       struct net *net;
>+
>+       BUG_ON(!dev_net(dev));
>+       net = dev_net(dev);
>+
>        if (!dev_valid_name(name))
>                return -EINVAL;
> 
>        if (fmt && strchr(name, '%'))
>-               return __dev_alloc_name(net, name, buf);
>+               return dev_alloc_name(dev, name);
>        else if (__dev_get_by_name(net, name))
>                return -EEXIST;
>-       else if (buf != name)
>-               strlcpy(buf, name, IFNAMSIZ);
>+       else if (strncmp(dev->name, name, IFNAMSIZ))
>+                strlcpy(dev->name, name, IFNAMSIZ);
> 

Why do the strncmp, can't we preserve the (buf != name) condition?

Thanks,
tavi

^ permalink raw reply

* Re: [PATCH net-next-2.6] bonding: remove redundant checks from bonding_store_slaves
From: Jiri Pirko @ 2010-05-18 12:23 UTC (permalink / raw)
  To: netdev; +Cc: davem, fubar, bonding-devel
In-Reply-To: <20100518120944.GA2878@psychotron.lab.eng.brq.redhat.com>

Please scratch this one too, will repost it after I post 2nd version of
"[PATCH net-next-2.6] bonding: move slave MTU handling from sysfs"

Thanks, Jirka

Tue, May 18, 2010 at 02:09:45PM CEST, jpirko@redhat.com wrote:
>Remove checks that duplicates similar checks in bond_enslave.
>
>Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index 4e84cfc..6c44c07 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -243,7 +243,7 @@ static ssize_t bonding_store_slaves(struct device *d,
> 
> 	if (command[0] == '+') {
> 
>-		/* Got a slave name in ifname.  Is it already in the list? */
>+		/* Got a slave name in ifname. */
> 
> 		dev = __dev_get_by_name(dev_net(bond->dev), ifname);
> 		if (!dev) {
>@@ -253,24 +253,6 @@ static ssize_t bonding_store_slaves(struct device *d,
> 			goto out;
> 		}
> 
>-		if (dev->flags & IFF_UP) {
>-			pr_err("%s: Error: Unable to enslave %s because it is already up.\n",
>-			       bond->dev->name, dev->name);
>-			ret = -EPERM;
>-			goto out;
>-		}
>-
>-		read_lock(&bond->lock);
>-		bond_for_each_slave(bond, slave, i)
>-			if (slave->dev == dev) {
>-				pr_err("%s: Interface %s is already enslaved!\n",
>-				       bond->dev->name, ifname);
>-				ret = -EPERM;
>-				read_unlock(&bond->lock);
>-				goto out;
>-			}
>-		read_unlock(&bond->lock);
>-
> 		pr_info("%s: Adding slave %s.\n", bond->dev->name, ifname);
> 
> 		/* If this is the first slave, then we need to set

^ permalink raw reply

* Re: [PATCH net-next-2.6] bonding: move slave MTU handling from sysfs
From: Jiri Pirko @ 2010-05-18 12:21 UTC (permalink / raw)
  To: netdev; +Cc: davem, fubar, bonding-devel, monis
In-Reply-To: <20100517144731.GE2878@psychotron.lab.eng.brq.redhat.com>

Actually, please scratch this, the patch is not correct (res/ret). Will post the
correct one soon.

Jirka

Mon, May 17, 2010 at 04:47:31PM CEST, jpirko@redhat.com wrote:
>For some reason, MTU handling (storing, and restoring) is taking  place in
>bond_sysfs. The correct place for this code is in bond_enslave, bond_release.
>So move it there.
>
>Jirka
>
>Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 5e12462..2c3f9db 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -1533,6 +1533,14 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
> 	 */
> 	new_slave->original_flags = slave_dev->flags;
> 
>+	/* Save slave's original mtu and then set it to match the bond */
>+	new_slave->original_mtu = slave_dev->mtu;
>+	res = dev_set_mtu(slave_dev, bond->dev->mtu);
>+	if (res) {
>+		pr_debug("Error %d calling dev_set_mtu\n", res);
>+		goto err_free;
>+	}
>+
> 	/*
> 	 * Save slave's original ("permanent") mac address for modes
> 	 * that need it, and for restoring it upon release, and then
>@@ -1550,7 +1558,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
> 		res = dev_set_mac_address(slave_dev, &addr);
> 		if (res) {
> 			pr_debug("Error %d calling set_mac_address\n", res);
>-			goto err_free;
>+			goto err_restore_mtu;
> 		}
> 	}
> 
>@@ -1785,6 +1793,9 @@ err_restore_mac:
> 		dev_set_mac_address(slave_dev, &addr);
> 	}
> 
>+err_restore_mtu:
>+	dev_set_mtu(slave_dev, new_slave->original_mtu);
>+
> err_free:
> 	kfree(new_slave);
> 
>@@ -1969,6 +1980,8 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
> 		dev_set_mac_address(slave_dev, &addr);
> 	}
> 
>+	dev_set_mtu(slave_dev, slave->original_mtu);
>+
> 	slave_dev->priv_flags &= ~(IFF_MASTER_8023AD | IFF_MASTER_ALB |
> 				   IFF_SLAVE_INACTIVE | IFF_BONDING |
> 				   IFF_SLAVE_NEEDARP);
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index 392e291..4e84cfc 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -220,7 +220,6 @@ static ssize_t bonding_store_slaves(struct device *d,
> 	char command[IFNAMSIZ + 1] = { 0, };
> 	char *ifname;
> 	int i, res, ret = count;
>-	u32 original_mtu;
> 	struct slave *slave;
> 	struct net_device *dev = NULL;
> 	struct bonding *bond = to_bond(d);
>@@ -281,43 +280,22 @@ static ssize_t bonding_store_slaves(struct device *d,
> 			memcpy(bond->dev->dev_addr, dev->dev_addr,
> 			       dev->addr_len);
> 
>-		/* Set the slave's MTU to match the bond */
>-		original_mtu = dev->mtu;
>-		res = dev_set_mtu(dev, bond->dev->mtu);
>-		if (res) {
>-			ret = res;
>-			goto out;
>-		}
>-
> 		res = bond_enslave(bond->dev, dev);
>-		bond_for_each_slave(bond, slave, i)
>-			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0)
>-				slave->original_mtu = original_mtu;
>-		if (res)
>-			ret = res;
> 
> 		goto out;
> 	}
> 
> 	if (command[0] == '-') {
> 		dev = NULL;
>-		original_mtu = 0;
> 		bond_for_each_slave(bond, slave, i)
> 			if (strnicmp(slave->dev->name, ifname, IFNAMSIZ) == 0) {
> 				dev = slave->dev;
>-				original_mtu = slave->original_mtu;
> 				break;
> 			}
> 		if (dev) {
> 			pr_info("%s: Removing slave %s\n",
> 				bond->dev->name, dev->name);
>-				res = bond_release(bond->dev, dev);
>-			if (res) {
>-				ret = res;
>-				goto out;
>-			}
>-			/* set the slave MTU to the default */
>-			dev_set_mtu(dev, original_mtu);
>+			res = bond_release(bond->dev, dev);
> 		} else {
> 			pr_err("unable to remove non-existent slave %s for bond %s.\n",
> 			       ifname, bond->dev->name);

^ permalink raw reply

* [PATCH net-next-2.6] bonding: remove redundant checks from bonding_store_slaves
From: Jiri Pirko @ 2010-05-18 12:09 UTC (permalink / raw)
  To: netdev; +Cc: davem, fubar, bonding-devel

Remove checks that duplicates similar checks in bond_enslave.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 4e84cfc..6c44c07 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -243,7 +243,7 @@ static ssize_t bonding_store_slaves(struct device *d,
 
 	if (command[0] == '+') {
 
-		/* Got a slave name in ifname.  Is it already in the list? */
+		/* Got a slave name in ifname. */
 
 		dev = __dev_get_by_name(dev_net(bond->dev), ifname);
 		if (!dev) {
@@ -253,24 +253,6 @@ static ssize_t bonding_store_slaves(struct device *d,
 			goto out;
 		}
 
-		if (dev->flags & IFF_UP) {
-			pr_err("%s: Error: Unable to enslave %s because it is already up.\n",
-			       bond->dev->name, dev->name);
-			ret = -EPERM;
-			goto out;
-		}
-
-		read_lock(&bond->lock);
-		bond_for_each_slave(bond, slave, i)
-			if (slave->dev == dev) {
-				pr_err("%s: Interface %s is already enslaved!\n",
-				       bond->dev->name, ifname);
-				ret = -EPERM;
-				read_unlock(&bond->lock);
-				goto out;
-			}
-		read_unlock(&bond->lock);
-
 		pr_info("%s: Adding slave %s.\n", bond->dev->name, ifname);
 
 		/* If this is the first slave, then we need to set

^ permalink raw reply related

* Re: [PATCH] bfin_mac: fix memleak in mii_bus{probe|remove}
From: Denis Kirjanov @ 2010-05-18 11:34 UTC (permalink / raw)
  To: Sonic Zhang
  Cc: davem, michael.hennerich, cooloney, uclinux-dist-devel, netdev
In-Reply-To: <AANLkTikcB3f0KmkcnDsV46Xrpm3WcCLQklxbLSJfMTlW@mail.gmail.com>

On Tue, May 18, 2010 at 18:05 +0800, Sonic Zhang wrote:
> 2010/5/18 Denis Kirjanov <kirjanov@gmail.com <kirjanov@gmail.com>:
> > Fix memory leak with miibus->irq
> >
> > Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
> > ---
> >
> > drivers/net/bfin_mac.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/net/bfin_mac.c b/drivers/net/bfin_mac.c
> > index 39a54ba..7a17b8a 100644
> > --- a/drivers/net/bfin_mac.c
> > +++ b/drivers/net/bfin_mac.c
> > @@ -1627,6 +1627,7 @@ static int __devinit bfin_mii_bus_probe(struct platform_device *pdev)
> >
> >  out_err_mdiobus_register:
> >        mdiobus_free(miibus);
> > +       kfree(miibus->irq);
> 
> Should you move this kfree before mdiobus_free?
> 
> >  out_err_alloc:
> >        peripheral_free_list(pin_req);
> >
> > @@ -1638,6 +1639,7 @@ static int __devexit bfin_mii_bus_remove(struct platform_device *pdev)
> >        struct mii_bus *miibus = platform_get_drvdata(pdev);
> >        platform_set_drvdata(pdev, NULL);
> >        mdiobus_unregister(miibus);
> > +       kfree(miibus->irq);
> >        mdiobus_free(miibus);
> >        peripheral_free_list(pin_req);
> >        return 0;
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
Oops, yeah, fixed.
Thanks.

[PATCH] bfin_mac: fix memleak in mii_bus_{probe|remove}
Fix memory leak with miibus->irq

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
drivers/net/bfin_mac.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bfin_mac.c b/drivers/net/bfin_mac.c
index 39a54ba..368f333 100644
--- a/drivers/net/bfin_mac.c
+++ b/drivers/net/bfin_mac.c
@@ -1626,6 +1626,7 @@ static int __devinit bfin_mii_bus_probe(struct platform_device *pdev)
 	return 0;
 
 out_err_mdiobus_register:
+	kfree(miibus->irq);
 	mdiobus_free(miibus);
 out_err_alloc:
 	peripheral_free_list(pin_req);
@@ -1638,6 +1639,7 @@ static int __devexit bfin_mii_bus_remove(struct platform_device *pdev)
 	struct mii_bus *miibus = platform_get_drvdata(pdev);
 	platform_set_drvdata(pdev, NULL);
 	mdiobus_unregister(miibus);
+	kfree(miibus->irq);
 	mdiobus_free(miibus);
 	peripheral_free_list(pin_req);
 	return 0;


^ permalink raw reply related

* Re: [PATCHv2] vhost-net: utilize PUBLISH_USED_IDX feature
From: Juan Quintela @ 2010-05-18 11:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: davem, Rusty Russell, Paul E. McKenney, Arnd Bergmann, kvm,
	virtualization, netdev, linux-kernel, alex.williamson, amit.shah
In-Reply-To: <20100518022105.GA23129@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> With PUBLISH_USED_IDX, guest tells us which used entries
> it has consumed. This can be used to reduce the number
> of interrupts: after we write a used entry, if the guest has not yet
> consumed the previous entry, or if the guest has already consumed the
> new entry, we do not need to interrupt.
> This imporves bandwidth by 30% under some workflows.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>
> This is on top of Rusty's tree and depends on the virtio patch.
>
> Changes from v1:
> fix build

Minor nit if you have to respin it:

>  /* This actually signals the guest, using eventfd. */
> -void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> +static void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
>  {
>  	__u16 flags;
> +	__u16 used;

local var named "used" here

>  	/* Flush out used index updates. This is paired
>  	 * with the barrier that the Guest executes when enabling
>  	 * interrupts. */
> @@ -1053,6 +1057,17 @@ void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq)
>  	     !vhost_has_feature(dev, VIRTIO_F_NOTIFY_ON_EMPTY)))
>  		return;
>  
> +	if (vhost_has_feature(dev, VIRTIO_RING_F_PUBLISH_USED)) {
> +		__u16 used;

and a "more" local one also named "used" :)

I guess you want to remove the previous declaration, as var is only used
in this block.

> +		if (get_user(used, vq->last_used)) {
> +			vq_err(vq, "Failed to get last used idx");
> +			return;
> +		}
> +
> +		if (used != (u16)(vq->last_used_idx - 1))
> +			return;
> +	}
> +
>  	/* Signal the Guest tell them we used something up. */
>  	if (vq->call_ctx)
>  		eventfd_signal(vq->call_ctx, 1);

Rest of patch looks ok, and as a bonus un-export two functions.

Later, Juan.

^ permalink raw reply

* Re: [PATCHv2] virtio: put last seen used index into ring itself
From: Juan Quintela @ 2010-05-18 11:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, Jiri Pirko, Shirley Ma, Amit Shah, Mark McLoughlin,
	netdev, linux-kernel, alex.williamson
In-Reply-To: <20100518021315.GA22852@redhat.com>

"Michael S. Tsirkin" <mst@redhat.com> wrote:
> Generally, the Host end of the virtio ring doesn't need to see where
> Guest is up to in consuming the ring.  However, to completely understand
> what's going on from the outside, this information must be exposed.
> For example, host can reduce the number of interrupts by detecting
> that the guest is currently handling previous buffers.
>
> Fortunately, we have room to expand: the ring is always a whole number
> of pages and there's hundreds of bytes of padding after the avail ring
> and the used ring, whatever the number of descriptors (which must be a
> power of 2).
>
> We add a feature bit so the guest can tell the host that it's writing
> out the current value there, if it wants to use that.
>
> This is based on a patch by Rusty Russell, with the main difference
> being that we dedicate a feature bit to guest to tell the host it is
> writing the used index.  This way we don't need to force host to publish
> the last available index until we have a use for it.
>
> Another difference is that while the feature helps virtio-net,
> there have been conflicting reports wrt virtio-blk.
> The reason is unknown, it could be due to the fact that
> virtio-blk does not bother to disable interrupts at all.
> So for now, this patch only acks this feature for -net.
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

It looks good.

Later, Juan.

^ permalink raw reply

* RE: get beyond 1Gbps with pktgen on 10Gb nic?
From: Jon Zhou @ 2010-05-18 11:14 UTC (permalink / raw)
  To: Rick Jones; +Cc: netdev@vger.kernel.org
In-Reply-To: <4BEAF44E.8090201@hp.com>

hi rick:

do you mean "TCP_NODELAY" will send with packet size as I expect

without this option,netperf might sent packet with large size? (but eventually it will be splitted into MTU size?)

thanks
jon


-----Original Message-----
From: Rick Jones [mailto:rick.jones2@hp.com] 
Sent: Thursday, May 13, 2010 2:33 AM
To: Jesse Brandeburg
Cc: Jon Zhou; Ben Greear; Ben Hutchings; netdev@vger.kernel.org
Subject: Re: get beyond 1Gbps with pktgen on 10Gb nic?

Jesse Brandeburg wrote:
> On Tue, May 11, 2010 at 9:00 PM, Jon Zhou <Jon.Zhou@jdsu.com> wrote:
> 
>> I just used multi netperf instances to reach 900K pps/8Gb+ traffic on the
>> Broadcom 10G nic:

>>
>>command:
>>
>>for i in 1 2 3 4 5 6 7 8 9 10
>>do
>>netperf -l 60 -H 192.168.0.53 -- -m 60 -s 100M -S 100M &
>>done

100 Megabytes seems a trifle excessive as a socket buffer size.  I would suggest 
  lopping-off a few zeros and use 1M instead.  Or, one can let linux auto-tune 
the socket buffers/windows - just don't accept the socket buffer size reported 
by the classic netperf command - it is from the initial creation of the socket. 
  To get what it became by the end of the test one should use the "omni" tests. 
  Contact me offlist or via netperf-talk in the netperf.org domain for more on that.

>> the msg size was assigned as 64 bytes, but when I checked the file captured
>> by tcpdump, found that netperf sent many frames which are large than 64
>> bytes(i.e.4000-10K+ bytes) and these frames were truncated by tcpdump.
>>
>> so that the actual avg packet size is around 1500 bytes, but what I want is
>> avg packet: 300-400 bytes and reach 5Gb+.
>>
>>does it make sense?
> 
> 
> if you set the TCP_NODELAY (to disable nagle) option on netperf 

If he was seeing 4K to 10K byte frames in his tcpdump, that was likely TSO above 
and beyond nagle.  I was going to say it also suggests he was running tcpdump on 
the sending side rather than the receiver, but then there is LRO/GRO isn't there...

> (check netperf -t TCP_STREAM -- -h) then you should be able to control packet
>  size.

Or at least influence it meaningfully :)  If he was seeing 4K to 10K byte frames 
in his tcpdump, that was likely TSO above and beyond nagle.If there are packet 
losses and retransmissions, the retransmissions, which may or may not include 
new data may be larger.

The "netperf -t TCP_STREAM -- -h" to get test-specific help shown by Jesse will 
show the option you need to add to set TCP_NODELAY.  For additional descriptions 
of netperf command options:

http://www.netperf.org/svn/netperf2/tags/netperf-2.4.5/doc/netperf.html

For "quick and dirty" testing, the loop as it appears above is "ok" but I would 
suggest abusing the confidence intervals code to minimize the skew error:

http://www.netperf.org/svn/netperf2/tags/netperf-2.4.5/doc/netperf.html#Using-Netperf-to-Measure-Aggregate-Performance

happy benchmarking,

rick jones

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply

* [PATCH 3/4] ipv6: Use POSTDAD state
From: Herbert Xu @ 2010-05-18 11:04 UTC (permalink / raw)
  To: David Miller, jbohac, yoshfuji, netdev, shemminger
In-Reply-To: <20100518110243.GA7750@gondor.apana.org.au>

ipv6: Use POSTDAD state

This patch makes use of the new POSTDAD state.  This prevents
a race between DAD completion and failure.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 net/ipv6/addrconf.c |   29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d8d7c63..e7e9643 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1412,10 +1412,27 @@ static void addrconf_dad_stop(struct inet6_ifaddr *ifp, int dad_failed)
 		ipv6_del_addr(ifp);
 }
 
+static int addrconf_dad_end(struct inet6_ifaddr *ifp)
+{
+	int err = -ENOENT;
+
+	spin_lock(&ifp->state_lock);
+	if (ifp->state == INET6_IFADDR_STATE_DAD) {
+		ifp->state = INET6_IFADDR_STATE_POSTDAD;
+		err = 0;
+	}
+	spin_unlock(&ifp->state_lock);
+
+	return err;
+}
+
 void addrconf_dad_failure(struct inet6_ifaddr *ifp)
 {
 	struct inet6_dev *idev = ifp->idev;
 
+	if (addrconf_dad_end(ifp))
+		return;
+
 	if (net_ratelimit())
 		printk(KERN_INFO "%s: IPv6 duplicate address %pI6c detected!\n",
 			ifp->idev->dev->name, &ifp->addr);
@@ -2731,6 +2748,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 			/* Flag it for later restoration when link comes up */
 			ifa->flags |= IFA_F_TENTATIVE;
+			ifa->state = INET6_IFADDR_STATE_DAD;
 
 			write_unlock_bh(&idev->lock);
 
@@ -2895,6 +2913,9 @@ static void addrconf_dad_timer(unsigned long data)
 	struct inet6_dev *idev = ifp->idev;
 	struct in6_addr mcaddr;
 
+	if (!ifp->probes && addrconf_dad_end(ifp))
+		goto out;
+
 	read_lock(&idev->lock);
 	if (idev->dead || !(idev->if_flags & IF_READY)) {
 		read_unlock(&idev->lock);
@@ -2967,12 +2988,10 @@ static void addrconf_dad_run(struct inet6_dev *idev) {
 	read_lock_bh(&idev->lock);
 	for (ifp = idev->addr_list; ifp; ifp = ifp->if_next) {
 		spin_lock(&ifp->lock);
-		if (!(ifp->flags & IFA_F_TENTATIVE)) {
-			spin_unlock(&ifp->lock);
-			continue;
-		}
+		if (ifp->flags & IFA_F_TENTATIVE &&
+		    ifp->state == INET6_IFADDR_STATE_DAD)
+			addrconf_dad_kick(ifp);
 		spin_unlock(&ifp->lock);
-		addrconf_dad_kick(ifp);
 	}
 	read_unlock_bh(&idev->lock);
 }

^ permalink raw reply related

* [PATCH 4/4] ipv6: Never schedule DAD timer on dead address
From: Herbert Xu @ 2010-05-18 11:04 UTC (permalink / raw)
  To: David Miller, jbohac, yoshfuji, netdev, shemminger
In-Reply-To: <20100518110243.GA7750@gondor.apana.org.au>

ipv6: Never schedule DAD timer on dead address

This patch ensures that all places that schedule the DAD timer
look at the address state in a safe manner before scheduling the
timer.  This ensures that we don't end up with pending timers
after deleting an address.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 net/ipv6/addrconf.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index e7e9643..dd72f9c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2865,10 +2865,10 @@ static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags)
 	net_srandom(ifp->addr.s6_addr32[3]);
 
 	read_lock_bh(&idev->lock);
+	spin_lock(&ifp->lock);
 	if (ifp->state == INET6_IFADDR_STATE_DEAD)
 		goto out;
 
-	spin_lock(&ifp->lock);
 	if (dev->flags&(IFF_NOARP|IFF_LOOPBACK) ||
 	    idev->cnf.accept_dad < 1 ||
 	    !(ifp->flags&IFA_F_TENTATIVE) ||
@@ -2902,8 +2902,8 @@ static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags)
 		ip6_ins_rt(ifp->rt);
 
 	addrconf_dad_kick(ifp);
-	spin_unlock(&ifp->lock);
 out:
+	spin_unlock(&ifp->lock);
 	read_unlock_bh(&idev->lock);
 }
 
@@ -2923,6 +2923,12 @@ static void addrconf_dad_timer(unsigned long data)
 	}
 
 	spin_lock(&ifp->lock);
+	if (ifp->state == INET6_IFADDR_STATE_DEAD) {
+		spin_unlock(&ifp->lock);
+		read_unlock(&idev->lock);
+		goto out;
+	}
+
 	if (ifp->probes == 0) {
 		/*
 		 * DAD was successful

^ permalink raw reply related

* [PATCH 1/4] ipv6: Replace inet6_ifaddr->dead with state
From: Herbert Xu @ 2010-05-18 11:04 UTC (permalink / raw)
  To: David Miller, jbohac, yoshfuji, netdev, shemminger
In-Reply-To: <20100518110243.GA7750@gondor.apana.org.au>

ipv6: Replace inet6_ifaddr->dead with state

This patch replaces the boolean dead flag on inet6_ifaddr with
a state enum.  This allows us to roll back changes when deleting
an address according to whether DAD has completed or not.

This patch only adds the state field and does not change the logic.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 include/net/if_inet6.h |   12 ++++++++++--
 net/ipv6/addrconf.c    |   11 ++++++-----
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h
index 545d8b0..421a1e8 100644
--- a/include/net/if_inet6.h
+++ b/include/net/if_inet6.h
@@ -30,6 +30,13 @@
 #define IF_PREFIX_ONLINK	0x01
 #define IF_PREFIX_AUTOCONF	0x02
 
+enum {
+	INET6_IFADDR_STATE_DAD,
+	INET6_IFADDR_STATE_POSTDAD,
+	INET6_IFADDR_STATE_UP,
+	INET6_IFADDR_STATE_DEAD,
+};
+
 #ifdef __KERNEL__
 
 struct inet6_ifaddr {
@@ -40,6 +47,9 @@ struct inet6_ifaddr {
 	__u32			prefered_lft;
 	atomic_t		refcnt;
 	spinlock_t		lock;
+	spinlock_t		state_lock;
+
+	int			state;
 
 	__u8			probes;
 	__u8			flags;
@@ -62,8 +72,6 @@ struct inet6_ifaddr {
 	struct inet6_ifaddr	*ifpub;
 	int			regen_count;
 #endif
-
-	int			dead;
 };
 
 struct ip6_sf_socklist {
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 413054f..14b2ccf 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -539,7 +539,7 @@ void inet6_ifa_finish_destroy(struct inet6_ifaddr *ifp)
 	if (del_timer(&ifp->timer))
 		printk("Timer is still running, when freeing ifa=%p\n", ifp);
 
-	if (!ifp->dead) {
+	if (ifp->state != INET6_IFADDR_STATE_DEAD) {
 		printk("Freeing alive inet6 address %p\n", ifp);
 		return;
 	}
@@ -642,6 +642,7 @@ ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr, int pfxlen,
 	ipv6_addr_copy(&ifa->addr, addr);
 
 	spin_lock_init(&ifa->lock);
+	spin_lock_init(&ifa->state_lock);
 	init_timer(&ifa->timer);
 	ifa->timer.data = (unsigned long) ifa;
 	ifa->scope = scope;
@@ -716,7 +717,7 @@ static void ipv6_del_addr(struct inet6_ifaddr *ifp)
 
 	hash = ipv6_addr_hash(&ifp->addr);
 
-	ifp->dead = 1;
+	ifp->state = INET6_IFADDR_STATE_DEAD;
 
 	write_lock_bh(&addrconf_hash_lock);
 	for (ifap = &inet6_addr_lst[hash]; (ifa=*ifap) != NULL;
@@ -2679,7 +2680,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 	while ((ifa = idev->tempaddr_list) != NULL) {
 		idev->tempaddr_list = ifa->tmp_next;
 		ifa->tmp_next = NULL;
-		ifa->dead = 1;
+		ifa->state = INET6_IFADDR_STATE_DEAD;
 		write_unlock_bh(&idev->lock);
 		spin_lock_bh(&ifa->lock);
 
@@ -2724,7 +2725,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 			ifa->flags |= IFA_F_TENTATIVE;
 			in6_ifa_hold(ifa);
 		} else {
-			ifa->dead = 1;
+			ifa->state = INET6_IFADDR_STATE_DEAD;
 		}
 		write_unlock_bh(&idev->lock);
 
@@ -2827,7 +2828,7 @@ static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags)
 	net_srandom(ifp->addr.s6_addr32[3]);
 
 	read_lock_bh(&idev->lock);
-	if (ifp->dead)
+	if (ifp->state == INET6_IFADDR_STATE_DEAD)
 		goto out;
 
 	spin_lock(&ifp->lock);

^ permalink raw reply related

* [PATCH 2/4] ipv6: Use state_lock to protect ifa state
From: Herbert Xu @ 2010-05-18 11:04 UTC (permalink / raw)
  To: David Miller, jbohac, yoshfuji, netdev, shemminger
In-Reply-To: <20100518110243.GA7750@gondor.apana.org.au>

ipv6: Use state_lock to protect ifa state

This patch makes use of the new state_lock to synchronise between
updates to the ifa state.  This fixes the issue where a remotely
triggered address deletion (through DAD failure) coincides with a
local administrative address deletion, causing certain actions to
be performed twice incorrectly.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 net/ipv6/addrconf.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 14b2ccf..d8d7c63 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -711,13 +711,20 @@ static void ipv6_del_addr(struct inet6_ifaddr *ifp)
 {
 	struct inet6_ifaddr *ifa, **ifap;
 	struct inet6_dev *idev = ifp->idev;
+	int state;
 	int hash;
 	int deleted = 0, onlink = 0;
 	unsigned long expires = jiffies;
 
 	hash = ipv6_addr_hash(&ifp->addr);
 
+	spin_lock_bh(&ifp->state_lock);
+	state = ifp->state;
 	ifp->state = INET6_IFADDR_STATE_DEAD;
+	spin_unlock_bh(&ifp->state_lock);
+
+	if (state == INET6_IFADDR_STATE_DEAD)
+		goto out;
 
 	write_lock_bh(&addrconf_hash_lock);
 	for (ifap = &inet6_addr_lst[hash]; (ifa=*ifap) != NULL;
@@ -831,6 +838,7 @@ static void ipv6_del_addr(struct inet6_ifaddr *ifp)
 		dst_release(&rt->u.dst);
 	}
 
+out:
 	in6_ifa_put(ifp);
 }
 
@@ -2621,6 +2629,7 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 	struct inet6_dev *idev;
 	struct inet6_ifaddr *ifa, *keep_list, **bifa;
 	struct net *net = dev_net(dev);
+	int state;
 	int i;
 
 	ASSERT_RTNL();
@@ -2680,7 +2689,6 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 	while ((ifa = idev->tempaddr_list) != NULL) {
 		idev->tempaddr_list = ifa->tmp_next;
 		ifa->tmp_next = NULL;
-		ifa->state = INET6_IFADDR_STATE_DEAD;
 		write_unlock_bh(&idev->lock);
 		spin_lock_bh(&ifa->lock);
 
@@ -2723,14 +2731,25 @@ static int addrconf_ifdown(struct net_device *dev, int how)
 
 			/* Flag it for later restoration when link comes up */
 			ifa->flags |= IFA_F_TENTATIVE;
+
+			write_unlock_bh(&idev->lock);
+
 			in6_ifa_hold(ifa);
 		} else {
+			write_unlock_bh(&idev->lock);
+			spin_lock_bh(&ifa->state_lock);
+			state = ifa->state;
 			ifa->state = INET6_IFADDR_STATE_DEAD;
+			spin_unlock_bh(&ifa->state_lock);
+
+			if (state == INET6_IFADDR_STATE_DEAD)
+				goto put_ifa;
 		}
-		write_unlock_bh(&idev->lock);
 
 		__ipv6_ifa_notify(RTM_DELADDR, ifa);
 		atomic_notifier_call_chain(&inet6addr_chain, NETDEV_DOWN, ifa);
+
+put_ifa:
 		in6_ifa_put(ifa);
 
 		write_lock_bh(&idev->lock);

^ permalink raw reply related

* [0/4] Fix addrconf race conditions
From: Herbert Xu @ 2010-05-18 11:02 UTC (permalink / raw)
  To: David Miller; +Cc: jbohac, yoshfuji, netdev, shemminger
In-Reply-To: <20100423150540.GA26611@gondor.apana.org.au>

On Fri, Apr 23, 2010 at 11:05:40PM +0800, Herbert Xu wrote:
>
> This stuff is more broken than I thought.  For example, we perform
> a number of actions when DAD succeeds, e.g., joining an anycast
> group.  However, this is not synchronised with respect to address
> deletion at all, so if DAD succeeds just as someone deletes the
> address, you can easily get stuck on that anycast group.
> 
> I will try to untangle this mess tomorrow.

Tomorrow took a while to arrive :)

Here is a first batch of patches.  Note that this is by no means
a comprehensive fix for all the ndisc/addrconf race conditions.
It is just a first step in trying to address the problems.

The patchset revolves around a new lock, ifp->state_lock.  I
added it instead of trying to reuse the existing ifp->lock because
the latter has serious nesting issues that prevent it from easily
being used.  My long term plan is to restructure the locking and
eventually phase out ifp->lock in favour of ifp->state_lock.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: Kernel crash with sky2
From: Roedel, Joerg @ 2010-05-18 11:01 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
In-Reply-To: <20100517122236.7f33101f@nehalam>

Hi Stephen,

On Mon, May 17, 2010 at 03:22:36PM -0400, Stephen Hemminger wrote:
> On Mon, 17 May 2010 20:52:28 +0200
> Joerg Roedel <joerg.roedel@amd.com> wrote:
> > I experience the following crash with 2.6.34 in the sky2 code on my
> > laptop when I plug off the lan-cable and then plug-off the power cable
> > and switching to battery. It does not happen with acpi=off.
> 
> So you have a busted BIOS that powers off the device.

Yeah you are right, its a BIOS issue. I tried to find out how the OS is
informed about the device taken away. But none of the hotplug
drivers or enabling acpi debug showed anything here. I wonder how this
is done in the "other" operating system. Or how the device could be
re-enabled, plugging the wires back doesn't help. Very weird.
Anyway, I found a BIOS option to disable this behavior and now things
work unpatched. Thanks for your help.

	Joerg



^ permalink raw reply

* Re: [PATCH] virtio: put last seen used index into ring itself
From: Michael S. Tsirkin @ 2010-05-18 10:25 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Rusty Russell, Jiri Pirko, Shirley Ma, Amit Shah, Mark McLoughlin,
	netdev, linux-kernel, quintela, alex.williamson
In-Reply-To: <4BF26B87.3080902@redhat.com>

On Tue, May 18, 2010 at 12:27:19PM +0200, Jes Sorensen wrote:
> On 05/18/10 02:47, Michael S. Tsirkin wrote:
> > Generally, the Host end of the virtio ring doesn't need to see where
> > Guest is up to in consuming the ring.  However, to completely understand
> > what's going on from the outside, this information must be exposed.
> > For example, host can reduce the number of interrupts by detecting
> > that the guest is currently handling previous buffers.
> > 
> > Fortunately, we have room to expand: the ring is always a whole number
> > of pages and there's hundreds of bytes of padding after the avail ring
> > and the used ring, whatever the number of descriptors (which must be a
> > power of 2).
> 
> Hi Michael,
> 
> Small build fix for this against Linus upstream.
> 
> Jes
> 
> 
> Fix build of virtio_net.c
> 
> Signed-off-by: Jes.Sorensen@redhat.com

Thanks, this is already fixed in v2 that I sent.

> ---
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index d37e5be..679e2df 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -23,6 +23,7 @@
>  #include <linux/module.h>
>  #include <linux/virtio.h>
>  #include <linux/virtio_net.h>
> +#include <linux/virtio_ring.h>
>  #include <linux/scatterlist.h>
>  #include <linux/if_vlan.h>
>  #include <linux/slab.h>
> 

^ permalink raw reply

* Re: [PATCH] virtio: put last seen used index into ring itself
From: Jes Sorensen @ 2010-05-18 10:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rusty Russell, Jiri Pirko, Shirley Ma, Amit Shah, Mark McLoughlin,
	netdev, linux-kernel, quintela, alex.williamson
In-Reply-To: <20100518004744.GA21359@redhat.com>

On 05/18/10 02:47, Michael S. Tsirkin wrote:
> Generally, the Host end of the virtio ring doesn't need to see where
> Guest is up to in consuming the ring.  However, to completely understand
> what's going on from the outside, this information must be exposed.
> For example, host can reduce the number of interrupts by detecting
> that the guest is currently handling previous buffers.
> 
> Fortunately, we have room to expand: the ring is always a whole number
> of pages and there's hundreds of bytes of padding after the avail ring
> and the used ring, whatever the number of descriptors (which must be a
> power of 2).

Hi Michael,

Small build fix for this against Linus upstream.

Jes


Fix build of virtio_net.c

Signed-off-by: Jes.Sorensen@redhat.com
---
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d37e5be..679e2df 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -23,6 +23,7 @@
 #include <linux/module.h>
 #include <linux/virtio.h>
 #include <linux/virtio_net.h>
+#include <linux/virtio_ring.h>
 #include <linux/scatterlist.h>
 #include <linux/if_vlan.h>
 #include <linux/slab.h>

^ permalink raw reply related

* dev_get_valid_name buggy with hash collision
From: Daniel Lezcano @ 2010-05-18 10:17 UTC (permalink / raw)
  To: Octavian Purdila; +Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 3982 bytes --]


Hi all,

the commit:

commit d90310243fd750240755e217c5faa13e24f41536
Author: Octavian Purdila <opurdila@ixiacom.com>
Date:   Wed Nov 18 02:36:59 2009 +0000

     net: device name allocation cleanups

introduced a bug when there is a hash collision making impossible to 
rename a device with eth%d

This bug is very hard to reproduce and appears rarely, but finally I 
succeed to reproduce it with the program in attachment which fail to 
rename a device with errno ENFILE.

The test program creates a new network namespace in order to avoid 
messing the real network and to be sure we don't have any ethernet 
devices. Hence, we know if we create one ethernet device with the name 
eth%d the result will be 'eth0'.

The first step is to find a conflicting name with 'eth0':
  1) We compute the hash of 'eth0' with the hashing functions imported 
from the kernel
  2) We create a temporary name, compute its hash
  3) We compare the hash with the one we found for 'eth0'.

We loop until the hashes are different. When they are the same, then the 
temporary name is a conflicting name.

We create a dummy device with the temporary conflicting name and then we 
try to rename it with 'eth%d' (expecting 'eth0'), that fails with ENFILE 
due to the kernel bug.

 From the kernel POV, this is what happen:

dev_change_name does:
---------------------

     [ ... ]

     dev_get_valid_name(net, newname, dev->name, 1);

     [ ... ]

Note the dev->name parameter and newname is 'eth%d'.

dev_get_valid_name does:
------------------------

     [ ... ]

     if (fmt && strchr(name, '%'))
         return __dev_alloc_name(net, name, buf);

     [ ... ]

Note the 'buf' parameter is the 'dev->name' parameter and 'name' is "eth%d"

__dev_alloc_name does:
----------------------

     [ ... ]

         for_each_netdev(net, d) {
             if (!sscanf(d->name, name, &i))
                 continue;
             if (i < 0 || i >= max_netdevices)
                 continue;

             /*  avoid cases where sscanf is not exact inverse of printf */
             snprintf(buf, IFNAMSIZ, name, i);
             if (!strncmp(buf, d->name, IFNAMSIZ))
                 set_bit(i, inuse);
         }

     [ ... ]

Here the buf parameter is 'dev->name', so while we are browsing the 
network devices, we change the name of our eth device we want to rename. 
But in the context of our test program, that does not happen because 
there is no "eth[0-9]" network devices in the namespace, then we exit 
the loop with 'i == 0'.

Right after we do:

     if (buf != name)
         snprintf(buf, IFNAMSIZ, name, i);

Here buf and name pointers are different, so we modify 'buf' which is 
'dev->name', that is the network device name directly. So we have 
'dev->name' == "eth0" after this line.


So right after we are trying to find ourself :)

     [ ... ]

     if (!__dev_get_by_name(net, buf))
         return i;

     [ ... ]

When hashing are the same for the oldname and the newname, the function 
'__dev_get_by_name':

     [ ... ]

     struct hlist_head *head = dev_name_hash(net, name);

     hlist_for_each_entry(dev, p, head, name_hlist)
         if (!strncmp(dev->name, name, IFNAMSIZ))
             return dev;

     [ ... ]

will find the network device because we do __dev_get_by_name(net, 
"eth0"), the dev->name is already modified with "eth0" and the hashing 
of the temporary name and "eth0" are the same so returning the same hash 
entry.

We are lucky, most of the time, because the name of the network device 
and the new name have different hash entry, so we compare to ourself 
very rarely.


IMO, the bug is to pass the 'dev->name' parameter to __dev_alloc_name 
directly instead of a temporary name.

The patch in attachment fix the problem but I am not sure we shouldn't 
go further and do more cleanup around this bug, so you may consider it 
as a RFC more than a fix. If it is acceptable, I will send it as a patch 
against net-2.6.

Thanks
   -- Daniel















[-- Attachment #2: myethash.c --]
[-- Type: text/x-csrc, Size: 2623 bytes --]

#include <stdio.h>
#include <net/if.h>
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <sys/types.h>

/* 
 * We import the hash function from the kernel, so we can compute a conflicting
 * name directly in this program and reproduce the bug easily
 */

#define NETDEV_HASHBITS  8

/*************** from include/linux/dcache.h *****************************/

#define init_name_hash()  0
static inline unsigned long
partial_name_hash(unsigned long c, unsigned long prevhash)
{
        return (prevhash + (c << 4) + (c >> 4)) * 11;
}

static inline unsigned long end_name_hash(unsigned long hash)
{
        return (unsigned int) hash;
}

static inline unsigned int
full_name_hash(const unsigned char *name, unsigned int len)
{
        unsigned long hash = init_name_hash();
        while (len--)
                hash = partial_name_hash(*name++, hash);
        return end_name_hash(hash);
}

/***************** from include/linux/hash.h *****************************/

/* 2^31 + 2^29 - 2^25 + 2^22 - 2^19 - 2^16 + 1 */
#define GOLDEN_RATIO_PRIME_32 0x9e370001UL

static inline unsigned int hash_32(unsigned int val, unsigned int bits)
{
        /* On some cpus multiply is faster, on others gcc will do shifts */
        unsigned int hash = val * GOLDEN_RATIO_PRIME_32;

        /* High bits are more random, so use them. */
        return hash >> (32 - bits);
}

/*************************************************************************/


unsigned int dev_name_hash(const char *name)
{
	return hash_32(full_name_hash(name, strnlen(name, IFNAMSIZ)), 
                        NETDEV_HASHBITS);
}

int main(int argc, char *argv[])
{
	char devname[IFNAMSIZ];
	char cmd[4096];
	unsigned int val, val2;
	
	if (getuid()) {
		fprintf(stderr, "you have to be root !\n");
		return -1;
	}

	/*
	 * Unshare the network namespace, we don't mess the network
         * and we can assume the eth%d rename will be eth0
	 */
	if (unshare(CLONE_NEWNET)) {
		perror("unshare");
		return -1;
	}		

	val = dev_name_hash("eth0");

	while (1) {

		snprintf(devname, IFNAMSIZ, "ethXXXXXX");
		mktemp(devname);

		val2 = dev_name_hash(devname);

		if (val == val2) {
			printf("'%s' has same hash entry\n", devname);
			break;
		}
	}

	/*
	 * We create a dummy interface with the conflicting name
	 * and then we rename it with an kernel allocated name
	 * The kernel will fail to rename with -ENFILE.
	 */

	sprintf(cmd, "ip link add %s type dummy", devname);

	if (system(cmd)) {
		perror("system");
		return -1;
	}

	sprintf(cmd, "ip link set name eth%%d dev %s", devname);
	printf("%s\n", cmd);
	system(cmd); 

	return 0;
}

[-- Attachment #3: fix-dev_get_valid_name.patch --]
[-- Type: text/x-diff, Size: 2315 bytes --]

Subject: fix dev_get_valid_name
From: Daniel Lezcano <daniel.lezcano@free.fr>

the commit:

commit d90310243fd750240755e217c5faa13e24f41536
Author: Octavian Purdila <opurdila@ixiacom.com>
Date:   Wed Nov 18 02:36:59 2009 +0000

    net: device name allocation cleanups

introduced a bug when there is a hash collision making impossible
to rename a device with eth%d. This bug is very hard to reproduce
and appears rarely.

The problem is coming from we don't pass a temporary buffer to
__dev_alloc_name but 'dev->name' which is modified by the function.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
---
 net/core/dev.c |   20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

Index: net-2.6/net/core/dev.c
===================================================================
--- net-2.6.orig/net/core/dev.c
+++ net-2.6/net/core/dev.c
@@ -936,18 +936,22 @@ int dev_alloc_name(struct net_device *de
 }
 EXPORT_SYMBOL(dev_alloc_name);
 
-static int dev_get_valid_name(struct net *net, const char *name, char *buf,
-			      bool fmt)
+static int dev_get_valid_name(struct net_device *dev, const char *name, bool fmt)
 {
+	struct net *net;
+
+	BUG_ON(!dev_net(dev));
+	net = dev_net(dev);
+
 	if (!dev_valid_name(name))
 		return -EINVAL;
 
 	if (fmt && strchr(name, '%'))
-		return __dev_alloc_name(net, name, buf);
+		return dev_alloc_name(dev, name);
 	else if (__dev_get_by_name(net, name))
 		return -EEXIST;
-	else if (buf != name)
-		strlcpy(buf, name, IFNAMSIZ);
+	else if (strncmp(dev->name, name, IFNAMSIZ))
+		 strlcpy(dev->name, name, IFNAMSIZ);
 
 	return 0;
 }
@@ -979,7 +983,7 @@ int dev_change_name(struct net_device *d
 
 	memcpy(oldname, dev->name, IFNAMSIZ);
 
-	err = dev_get_valid_name(net, newname, dev->name, 1);
+	err = dev_get_valid_name(dev, newname, 1);
 	if (err < 0)
 		return err;
 
@@ -5083,7 +5087,7 @@ int register_netdevice(struct net_device
 		}
 	}
 
-	ret = dev_get_valid_name(net, dev->name, dev->name, 0);
+	ret = dev_get_valid_name(dev, dev->name, 0);
 	if (ret)
 		goto err_uninit;
 
@@ -5661,7 +5665,7 @@ int dev_change_net_namespace(struct net_
 		/* We get here if we can't use the current device name */
 		if (!pat)
 			goto out;
-		if (dev_get_valid_name(net, pat, dev->name, 1))
+		if (dev_get_valid_name(dev, pat, 1))
 			goto out;
 	}
 

^ permalink raw reply

* Re: [PATCH] bfin_mac: fix memleak in mii_bus{probe|remove}
From: Sonic Zhang @ 2010-05-18 10:05 UTC (permalink / raw)
  To: kirjanov, davem, michael.hennerich, sonic.zhang, cooloney,
	uclinux-dist-devel, netdev
In-Reply-To: <20100518092839.GA21791@coldcone>

2010/5/18 Denis Kirjanov <kirjanov@gmail.com <kirjanov@gmail.com>:
> Fix memory leak with miibus->irq
>
> Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
> ---
>
> drivers/net/bfin_mac.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/net/bfin_mac.c b/drivers/net/bfin_mac.c
> index 39a54ba..7a17b8a 100644
> --- a/drivers/net/bfin_mac.c
> +++ b/drivers/net/bfin_mac.c
> @@ -1627,6 +1627,7 @@ static int __devinit bfin_mii_bus_probe(struct platform_device *pdev)
>
>  out_err_mdiobus_register:
>        mdiobus_free(miibus);
> +       kfree(miibus->irq);

Should you move this kfree before mdiobus_free?

>  out_err_alloc:
>        peripheral_free_list(pin_req);
>
> @@ -1638,6 +1639,7 @@ static int __devexit bfin_mii_bus_remove(struct platform_device *pdev)
>        struct mii_bus *miibus = platform_get_drvdata(pdev);
>        platform_set_drvdata(pdev, NULL);
>        mdiobus_unregister(miibus);
> +       kfree(miibus->irq);
>        mdiobus_free(miibus);
>        peripheral_free_list(pin_req);
>        return 0;
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* [PATCH] bfin_mac: fix memleak in mii_bus{probe|remove}
From: Denis Kirjanov <kirjanov@gmail.com @ 2010-05-18  9:28 UTC (permalink / raw)
  To: davem; +Cc: michael.hennerich, sonic.zhang, cooloney, uclinux-dist-devel,
	netdev

Fix memory leak with miibus->irq

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---

drivers/net/bfin_mac.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/bfin_mac.c b/drivers/net/bfin_mac.c
index 39a54ba..7a17b8a 100644
--- a/drivers/net/bfin_mac.c
+++ b/drivers/net/bfin_mac.c
@@ -1627,6 +1627,7 @@ static int __devinit bfin_mii_bus_probe(struct platform_device *pdev)
 
 out_err_mdiobus_register:
 	mdiobus_free(miibus);
+	kfree(miibus->irq);
 out_err_alloc:
 	peripheral_free_list(pin_req);
 
@@ -1638,6 +1639,7 @@ static int __devexit bfin_mii_bus_remove(struct platform_device *pdev)
 	struct mii_bus *miibus = platform_get_drvdata(pdev);
 	platform_set_drvdata(pdev, NULL);
 	mdiobus_unregister(miibus);
+	kfree(miibus->irq);
 	mdiobus_free(miibus);
 	peripheral_free_list(pin_req);
 	return 0;

^ permalink raw reply related

* [PATCH NEXT 1/1] qlcnic: adding co maintainer
From: Amit Kumar Salecha @ 2010-05-18  9:48 UTC (permalink / raw)
  To: davem; +Cc: netdev, ameen.rahman, anirban.chakraborty

Adding Anirban as co maintainer

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
---
 MAINTAINERS |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index abd0a77..1ace1b9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3971,6 +3971,7 @@ F:	include/linux/*device.h
 
 NETXEN (1/10) GbE SUPPORT
 M:	Amit Kumar Salecha <amit.salecha@qlogic.com>
+M:	Anirban Chakraborty <anirban.chakraborty@qlogic.com>
 L:	netdev@vger.kernel.org
 W:	http://www.qlogic.com
 S:	Supported
-- 
1.6.3.3


^ permalink raw reply related

* [PATCH] iproute2: rework SR-IOV VF support
From: Chris Wright @ 2010-05-18  7:57 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Williams, Mitch A

The kernel interface changed just before 2.6.34 was released.  This brings
iproute2 in line with the current changes.  The VF portion of setlink is
comprised of a set of nested attributes.

  IFLA_VFINFO_LIST (NESTED)
    IFLA_VF_INFO (NESTED)
      IFLA_VF_MAC
      IFLA_VF_VLAN
      IFLA_VF_TX_RATE

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
---
Didn't forget the doc updates, tomorrow.

 ip/ipaddress.c |   60 ++++++++++++++++++------------
 ip/iplink.c    |  113 +++++++++++++++++++++++++++++++++++--------------------
 2 files changed, 108 insertions(+), 65 deletions(-)

diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 48f7b1e..f6db183 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -187,6 +187,36 @@ static void print_linktype(FILE *fp, struct rtattr *tb)
 	}
 }
 
+static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
+{
+	struct ifla_vf_mac *vf_mac;
+	struct ifla_vf_vlan *vf_vlan;
+	struct ifla_vf_tx_rate *vf_tx_rate;
+	struct rtattr *vf[IFLA_VF_MAX+1];
+	SPRINT_BUF(b1);
+
+	if (vfinfo->rta_type != IFLA_VF_INFO) {
+		fprintf(stderr, "BUG: rta type is %d\n", vfinfo->rta_type);
+		return;
+	}
+
+	parse_rtattr_nested(vf, IFLA_VF_MAX, vfinfo);
+
+	vf_mac = RTA_DATA(vf[IFLA_VF_MAC]);
+	vf_vlan = RTA_DATA(vf[IFLA_VF_VLAN]);
+	vf_tx_rate = RTA_DATA(vf[IFLA_VF_TX_RATE]);
+
+	fprintf(fp, "\n    vf %d MAC %s", vf_mac->vf,
+		ll_addr_n2a((unsigned char *)&vf_mac->mac,
+		ETH_ALEN, 0, b1, sizeof(b1)));
+	if (vf_vlan->vlan)
+		fprintf(fp, ", vlan %d", vf_vlan->vlan);
+	if (vf_vlan->qos)
+		fprintf(fp, ", qos %d", vf_vlan->qos);
+	if (vf_tx_rate->rate)
+		fprintf(fp, ", tx rate %d (Mbps)", vf_tx_rate->rate);
+}
+
 int print_linkinfo(const struct sockaddr_nl *who,
 		   struct nlmsghdr *n, void *arg)
 {
@@ -331,31 +361,13 @@ int print_linkinfo(const struct sockaddr_nl *who,
 				);
 		}
 	}
-	if (do_link && tb[IFLA_VFINFO] && tb[IFLA_NUM_VF]) {
-		SPRINT_BUF(b1);
-		struct rtattr *rta = tb[IFLA_VFINFO];
-		struct ifla_vf_info *ivi;
-		int i;
-		for (i = 0; i < *(int *)RTA_DATA(tb[IFLA_NUM_VF]); i++) {
-			if (rta->rta_type != IFLA_VFINFO) {
-				fprintf(stderr, "BUG: rta type is %d\n", rta->rta_type);
-				break;
-			}
-			ivi = RTA_DATA(rta);
-			fprintf(fp, "\n    vf %d: MAC %s",
-				ivi->vf,
-				ll_addr_n2a((unsigned char *)&ivi->mac,
-					    ETH_ALEN, 0, b1, sizeof(b1)));
-				if (ivi->vlan)
-					fprintf(fp, ", vlan %d", ivi->vlan);
-				if (ivi->qos)
-					fprintf(fp, ", qos %d", ivi->qos);
-				if (ivi->tx_rate)
-					fprintf(fp, ", tx rate %d (Mbps_",
-						ivi->tx_rate);
-			rta = (struct rtattr *)((char *)rta + RTA_ALIGN(rta->rta_len));
-		}
+	if (do_link && tb[IFLA_VFINFO_LIST] && tb[IFLA_NUM_VF]) {
+		struct rtattr *i, *vflist = tb[IFLA_VFINFO_LIST];
+		int rem = RTA_PAYLOAD(vflist);
+		for (i = RTA_DATA(vflist); RTA_OK(i, rem); i = RTA_NEXT(i, rem))
+			print_vfinfo(fp, i);
 	}
+
 	fprintf(fp, "\n");
 	fflush(fp);
 	return 0;
diff --git a/ip/iplink.c b/ip/iplink.c
index 97fca8b..cb2c4f5 100644
--- a/ip/iplink.c
+++ b/ip/iplink.c
@@ -176,6 +176,73 @@ struct iplink_req {
 	char			buf[1024];
 };
 
+int iplink_parse_vf(int vf, int *argcp, char ***argvp,
+			   struct iplink_req *req)
+{
+	int len, argc = *argcp;
+	char **argv = *argvp;
+	struct rtattr *vfinfo;
+
+	vfinfo = addattr_nest(&req->n, sizeof(*req), IFLA_VF_INFO);
+
+	while (NEXT_ARG_OK()) {
+		NEXT_ARG();
+		if (matches(*argv, "mac") == 0) {
+			struct ifla_vf_mac ivm;
+			NEXT_ARG();
+			ivm.vf = vf;
+			len = ll_addr_a2n((char *)ivm.mac, 32, *argv);
+			if (len < 0)
+				return -1;
+			addattr_l(&req->n, sizeof(*req), IFLA_VF_MAC, &ivm, sizeof(ivm));
+		} else if (matches(*argv, "vlan") == 0) {
+			struct ifla_vf_vlan ivv;
+			NEXT_ARG();
+			if (get_unsigned(&ivv.vlan, *argv, 0)) {
+				invarg("Invalid \"vlan\" value\n", *argv);
+			}
+			ivv.vf = vf;
+			ivv.qos = 0;
+			if (NEXT_ARG_OK()) {
+				NEXT_ARG();
+				if (matches(*argv, "qos") == 0) {
+					NEXT_ARG();
+					if (get_unsigned(&ivv.qos, *argv, 0)) {
+						invarg("Invalid \"qos\" value\n", *argv);
+					}
+				} else {
+					/* rewind arg */
+					PREV_ARG();
+				}
+			}
+			addattr_l(&req->n, sizeof(*req), IFLA_VF_VLAN, &ivv, sizeof(ivv));
+		} else if (matches(*argv, "rate") == 0) {
+			struct ifla_vf_tx_rate ivt;
+			NEXT_ARG();
+			if (get_unsigned(&ivt.rate, *argv, 0)) {
+				invarg("Invalid \"rate\" value\n", *argv);
+			}
+			ivt.vf = vf;
+			addattr_l(&req->n, sizeof(*req), IFLA_VF_TX_RATE, &ivt, sizeof(ivt));
+		
+		} else {
+			/* rewind arg */
+			PREV_ARG();
+			break;
+		}
+	}
+
+	if (argc == *argcp)
+		incomplete_command();
+
+	addattr_nest_end(&req->n, vfinfo);
+
+	*argcp = argc;
+	*argvp = argv;
+	return 0;
+}
+
+
 int iplink_parse(int argc, char **argv, struct iplink_req *req,
 		char **name, char **type, char **link, char **dev)
 {
@@ -283,53 +350,17 @@ int iplink_parse(int argc, char **argv, struct iplink_req *req,
 			} else
 				return on_off("noarp");
 		} else if (strcmp(*argv, "vf") == 0) {
+			struct rtattr *vflist;
 			NEXT_ARG();
 			if (get_integer(&vf,  *argv, 0)) {
 				invarg("Invalid \"vf\" value\n", *argv);
 			}
-		} else if (matches(*argv, "mac") == 0) {
-			struct ifla_vf_mac ivm;
-			NEXT_ARG();
-			if (vf < 0) {
-				missarg("vf");
-			}
-			ivm.vf = vf;
-			len = ll_addr_a2n((char *)ivm.mac, 32, *argv);
+			vflist = addattr_nest(&req->n, sizeof(*req),
+					      IFLA_VFINFO_LIST);
+			len = iplink_parse_vf(vf, &argc, &argv, req);
 			if (len < 0)
 				return -1;
-			addattr_l(&req->n, sizeof(*req), IFLA_VF_MAC, &ivm, sizeof(ivm));
-		} else if (matches(*argv, "vlan") == 0) {
-			struct ifla_vf_vlan ivv;
-			NEXT_ARG();
-			if (vf < 0) {
-				missarg("vf");
-			}
-			if (get_unsigned(&ivv.vlan, *argv, 0)) {
-				invarg("Invalid \"vlan\" value\n", *argv);
-			}
-			ivv.vf = vf;
-			ivv.qos = 0;
-			if (NEXT_ARG_OK()) {
-				NEXT_ARG();
-				if (matches(*argv, "qos") == 0) {
-					NEXT_ARG();
-					if (get_unsigned(&ivv.qos, *argv, 0)) {
-						invarg("Invalid \"qos\" value\n", *argv);
-					}
-				}
-			}
-			addattr_l(&req->n, sizeof(*req), IFLA_VF_VLAN, &ivv, sizeof(ivv));
-		} else if (matches(*argv, "rate") == 0) {
-			struct ifla_vf_tx_rate ivt;
-			NEXT_ARG();
-			if (vf < 0) {
-				missarg("vf");
-			}
-			if (get_unsigned(&ivt.rate, *argv, 0)) {
-				invarg("Invalid \"rate\" value\n", *argv);
-			}
-			ivt.vf = vf;
-			addattr_l(&req->n, sizeof(*req), IFLA_VF_TX_RATE, &ivt, sizeof(ivt));
+			addattr_nest_end(&req->n, vflist);
 #ifdef IFF_DYNAMIC
 		} else if (matches(*argv, "dynamic") == 0) {
 			NEXT_ARG();

^ permalink raw reply related

* [PATCH] tcp: tcp_md5_hash_skb_data() frag_list handling
From: Eric Dumazet @ 2010-05-18  6:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20100517.231403.148532542.davem@davemloft.net>

Le lundi 17 mai 2010 à 23:14 -0700, David Miller a écrit :
> Eric, would you please formally submit that tcp md5 fraglist
> fix to net/ipv4/tcp.c with proper signoffs?

Here it is, thanks !


[PATCH] tcp: tcp_md5_hash_skb_data() frag_list handling

tcp_md5_hash_skb_data() should handle skb->frag_list, and eventually
recurse.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/ipv4/tcp.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 6596b4f..49d0d2b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2999,6 +2999,7 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
 	const unsigned head_data_len = skb_headlen(skb) > header_len ?
 				       skb_headlen(skb) - header_len : 0;
 	const struct skb_shared_info *shi = skb_shinfo(skb);
+	struct sk_buff *frag_iter;
 
 	sg_init_table(&sg, 1);
 
@@ -3013,6 +3014,10 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
 			return 1;
 	}
 
+	skb_walk_frags(skb, frag_iter)
+		if (tcp_md5_hash_skb_data(hp, frag_iter, 0))
+			return 1;
+
 	return 0;
 }
 



^ permalink raw reply related

* Re: [PATCH v3 3/3] ptp: Added a clock that uses the eTSEC found on the MPC85xx.
From: Richard Cochran @ 2010-05-18  6:36 UTC (permalink / raw)
  To: Scott Wood
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
In-Reply-To: <4BF18582.6000500-KZfg59tc24xl57MIdRCFDg@public.gmane.org>

On Mon, May 17, 2010 at 01:05:54PM -0500, Scott Wood wrote:
> >>> >+  - tmr_fiper1   Fixed interval period pulse generator.
> >>> >+  - tmr_fiper2   Fixed interval period pulse generator.
> >>
> 
> MPC8572 and P2020 have fiper3 as well.

I doubt they really have a third fiper.

First of all, this signal is not routed anywhere on the boards. Also,
according to the documentation, it has no bit in the TMR_CTRL or the
TMR_TEMASK registers. Unless there is a bit in TMR_TEMASK, you cannot
get an interrupt from it.

If you cannot use the signal externally (in the "real" world) and you
cannot get an interrupt, what good is it to have such a periodic
signal? Polling the bit in the TMR_TEVENT to see when a pulse occurs
seems pointless.

Scott, you have connections, right? Can you clarify this for me?

Thanks,

Richard

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox