Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [MeeGo-Dev][PATCH v3] Topcliff: Update PCH_CAN driver to 2.6.35
From: Masayuki Ohtake @ 2010-10-13 10:09 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w,
	qi.wang-ral2JQCrhuEAvxtiuMwx3w,
	margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA, yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Christian Pellegrin,
	Tomoya MORINAGA, David S. Miller,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w, Samuel Ortiz
In-Reply-To: <4CA4541F.5040804@grandegger.com>

On Thursday, September 30, 2010 6:10 PM, Wolfgang Grandegger wrote:

>
> + iowrite32(num, &(priv->regs)->if2_creq);
> + while (counter) {
>> + if2_creq = (ioread32(&(priv->regs)->if2_creq)) &
>> +      CAN_IF_CREQ_BUSY;
>> + if (!if2_creq)
>> + break;
>> + counter--;
>> + }
>> + if (!counter)
>> + dev_err(&priv->ndev->dev, "IF2 BUSY Flag is set forever.\n");
>> +}
>
>Duplicated code!

No.
These are not the same.

Though it is possible to integrate to one function by adding parameter,
I think, current two function method is more easily to read.

>
>
>
>> + if (status & PCH_STUF_ERR)
>> + cf->data[2] |= CAN_ERR_PROT_STUFF;
>> +
>> + if (status & PCH_FORM_ERR)
>> + cf->data[2] |= CAN_ERR_PROT_FORM;
> +
> + if (status & PCH_ACK_ERR)
> + cf->data[2] |= CAN_ERR_PROT_LOC_ACK | CAN_ERR_PROT_LOC_ACK_DEL;
> +
> + if ((status & PCH_BIT1_ERR) || (status & PCH_BIT0_ERR))
> + cf->data[2] |= CAN_ERR_PROT_BIT;
> +
> + if (status & PCH_CRC_ERR)
> + cf->data[2] |= CAN_ERR_PROT_LOC_CRC_SEQ |
> + CAN_ERR_PROT_LOC_CRC_DEL;
> +
> + if (status & PCH_LEC_ALL)
> + iowrite32(status | PCH_LEC_ALL,
> +   &(priv->regs)->stat);
>
>A bit-wise test of the above values is wrong, I believe. Please use the
>switch statement instead.

The above conditions are not only one time.
I think "switch" is not suitable for the above.
Thus, current "if" processing is better.

>
>
> + u32 brp;
> +
> + pch_can_get_run_mode(priv, &curr_mode);
> + if (curr_mode == PCH_CAN_RUN)
> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
>
>The device is stopped when this function is called. Please remove.

No.
The above is necessary.
Because this is our HW specification.
Before setting bitrate, run-mode must be "STOP".

>
>
> +static netdev_tx_t pch_xmit(struct sk_buff *skb, struct net_device *ndev)
> +{
> + canid_t id;
> + u32 id1 = 0;
> + u32 id2 = 0;
>
>Need these values to be preset?

These values are not essential.
But these help a engineer to read this code.

>
>
> + /* Enable CAN Interrupts */
> + pch_can_set_int_custom(priv);
> +
> + /* Restore Run Mode */
> + pch_can_set_run_mode(priv, PCH_CAN_RUN);
> +
> + return retval;
> +}
>
>Are the suspend and resume functions tested?
>
Yes, we tested before.

=========================================

Except the above, we are modifying for your indications.

I will resubmit soon.

Thanks, Ohtake(OKISemi)

^ permalink raw reply

* Re: [PATCH -next] sundance: Add initial ethtool stats support
From: Denis Kirjanov @ 2010-10-13 10:28 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David Miller, Ben Hutchings, Jeff Garzik
In-Reply-To: <1286951637.2703.141.camel@edumazet-laptop>

On 10/13/2010 10:33 AM, Eric Dumazet wrote:
> Le mercredi 13 octobre 2010 à 10:06 +0400, Denis Kirjanov a écrit :
> 
>> @@ -1494,13 +1506,23 @@ static struct net_device_stats *get_stats(struct net_device *dev)
>>  	dev->stats.rx_missed_errors	+= ioread8(ioaddr + RxMissed);
>>  	dev->stats.tx_packets += ioread16(ioaddr + TxFramesOK);
>>  	dev->stats.rx_packets += ioread16(ioaddr + RxFramesOK);
>> -	dev->stats.collisions += ioread8(ioaddr + StatsLateColl);
>> -	dev->stats.collisions += ioread8(ioaddr + StatsMultiColl);
>> -	dev->stats.collisions += ioread8(ioaddr + StatsOneColl);
>>  	dev->stats.tx_carrier_errors += ioread8(ioaddr + StatsCarrierError);
>> -	ioread8(ioaddr + StatsTxDefer);
>> -	for (i = StatsTxDefer; i <= StatsMcastRx; i++)
>> -		ioread8(ioaddr + i);
>> +
>> +	np->xstats.tx_multiple_collisions += ioread8(ioaddr + StatsMultiColl);
>> +	np->xstats.tx_single_collisions += ioread8(ioaddr + StatsOneColl);
>> +	np->xstats.tx_late_collisions += ioread8(ioaddr + StatsLateColl);
>> +	dev->stats.collisions += np->xstats.tx_multiple_collisions
>> +		+ np->xstats.tx_single_collisions
>> +		+ np->xstats.tx_late_collisions;
> 
> 
> Oh well..., really ?
> 
> Each time we are calling get_stats(), you do
> 
> dev->stats.collisions += (number of accumulated collisions since device
> is up)
> 
> instead of
> 
> dev->stats.collisions += (number of accumulated collisions since last
> time we read counters)

Ok, this is a fixed patch.
Thanks.

Subject: [PATCH -next V5] sundance: Add ethtool stats support

Add ethtool stats support

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
V2:
 check for the ETH_SS_STATS in get_string()
 use xstats struct for ethtool stats
V3:
 make counters 64-bits wide
V4:
 fixed some misspellings
V5:
 fixed issue with collisions counting

 drivers/net/sundance.c |   94 ++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index 4283cc5..bfa2a01 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -363,6 +363,19 @@ struct netdev_private {
         dma_addr_t tx_ring_dma;
         dma_addr_t rx_ring_dma;
 	struct timer_list timer;		/* Media monitoring timer. */
+	/* ethtool extra stats */
+	struct {
+		u64 tx_multiple_collisions;
+		u64 tx_single_collisions;
+		u64 tx_late_collisions;
+		u64 tx_defered;
+		u64 tx_defered_excessive;
+		u64 tx_aborted;
+		u64 tx_bcasts;
+		u64 rx_bcasts;
+		u64 tx_mcasts;
+		u64 rx_mcasts;
+	} xstats;
 	/* Frequently used values: keep some adjacent for cache effect. */
 	spinlock_t lock;
 	int msg_enable;
@@ -1486,21 +1499,34 @@ static struct net_device_stats *get_stats(struct net_device *dev)
 {
 	struct netdev_private *np = netdev_priv(dev);
 	void __iomem *ioaddr = np->base;
-	int i;
 	unsigned long flags;
+	u8 late_coll, single_coll, mult_coll;
 
 	spin_lock_irqsave(&np->statlock, flags);
 	/* The chip only need report frame silently dropped. */
 	dev->stats.rx_missed_errors	+= ioread8(ioaddr + RxMissed);
 	dev->stats.tx_packets += ioread16(ioaddr + TxFramesOK);
 	dev->stats.rx_packets += ioread16(ioaddr + RxFramesOK);
-	dev->stats.collisions += ioread8(ioaddr + StatsLateColl);
-	dev->stats.collisions += ioread8(ioaddr + StatsMultiColl);
-	dev->stats.collisions += ioread8(ioaddr + StatsOneColl);
 	dev->stats.tx_carrier_errors += ioread8(ioaddr + StatsCarrierError);
-	ioread8(ioaddr + StatsTxDefer);
-	for (i = StatsTxDefer; i <= StatsMcastRx; i++)
-		ioread8(ioaddr + i);
+
+	mult_coll = ioread8(ioaddr + StatsMultiColl);
+	np->xstats.tx_multiple_collisions += mult_coll;
+	single_coll = ioread8(ioaddr + StatsOneColl);
+	np->xstats.tx_single_collisions += single_coll;
+	late_coll = ioread8(ioaddr + StatsLateColl);
+	np->xstats.tx_late_collisions += late_coll;
+	dev->stats.collisions += mult_coll
+		+ single_coll
+		+ late_coll;
+
+	np->xstats.tx_defered += ioread8(ioaddr + StatsTxDefer);
+	np->xstats.tx_defered_excessive += ioread8(ioaddr + StatsTxXSDefer);
+	np->xstats.tx_aborted += ioread8(ioaddr + StatsTxAbort);
+	np->xstats.tx_bcasts += ioread8(ioaddr + StatsBcastTx);
+	np->xstats.rx_bcasts += ioread8(ioaddr + StatsBcastRx);
+	np->xstats.tx_mcasts += ioread8(ioaddr + StatsMcastTx);
+	np->xstats.rx_mcasts += ioread8(ioaddr + StatsMcastRx);
+
 	dev->stats.tx_bytes += ioread16(ioaddr + TxOctetsLow);
 	dev->stats.tx_bytes += ioread16(ioaddr + TxOctetsHigh) << 16;
 	dev->stats.rx_bytes += ioread16(ioaddr + RxOctetsLow);
@@ -1566,6 +1592,21 @@ static int __set_mac_addr(struct net_device *dev)
 	return 0;
 }
 
+static const struct {
+	const char name[ETH_GSTRING_LEN];
+} sundance_stats[] = {
+	{ "tx_multiple_collisions" },
+	{ "tx_single_collisions" },
+	{ "tx_late_collisions" },
+	{ "tx_defered" },
+	{ "tx_defered_excessive" },
+	{ "tx_aborted" },
+	{ "tx_bcasts" },
+	{ "rx_bcasts" },
+	{ "tx_mcasts" },
+	{ "rx_mcasts" },
+};
+
 static int check_if_running(struct net_device *dev)
 {
 	if (!netif_running(dev))
@@ -1624,6 +1665,42 @@ static void set_msglevel(struct net_device *dev, u32 val)
 	np->msg_enable = val;
 }
 
+static void get_strings(struct net_device *dev, u32 stringset,
+		u8 *data)
+{
+	if (stringset == ETH_SS_STATS)
+		memcpy(data, sundance_stats, sizeof(sundance_stats));
+}
+
+static int get_sset_count(struct net_device *dev, int sset)
+{
+	switch (sset) {
+	case ETH_SS_STATS:
+		return ARRAY_SIZE(sundance_stats);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static void get_ethtool_stats(struct net_device *dev,
+		struct ethtool_stats *stats, u64 *data)
+{
+	struct netdev_private *np = netdev_priv(dev);
+	int i = 0;
+
+	get_stats(dev);
+	data[i++] = np->xstats.tx_multiple_collisions;
+	data[i++] = np->xstats.tx_single_collisions;
+	data[i++] = np->xstats.tx_late_collisions;
+	data[i++] = np->xstats.tx_defered;
+	data[i++] = np->xstats.tx_defered_excessive;
+	data[i++] = np->xstats.tx_aborted;
+	data[i++] = np->xstats.tx_bcasts;
+	data[i++] = np->xstats.rx_bcasts;
+	data[i++] = np->xstats.tx_mcasts;
+	data[i++] = np->xstats.rx_mcasts;
+}
+
 static const struct ethtool_ops ethtool_ops = {
 	.begin = check_if_running,
 	.get_drvinfo = get_drvinfo,
@@ -1633,6 +1710,9 @@ static const struct ethtool_ops ethtool_ops = {
 	.get_link = get_link,
 	.get_msglevel = get_msglevel,
 	.set_msglevel = set_msglevel,
+	.get_strings = get_strings,
+	.get_sset_count = get_sset_count,
+	.get_ethtool_stats = get_ethtool_stats,
 };
 
 static int netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
-- 
1.7.0



^ permalink raw reply related

* Re: [PATCH -next] sundance: Add initial ethtool stats support
From: Eric Dumazet @ 2010-10-13 10:36 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: netdev, David Miller, Ben Hutchings, Jeff Garzik
In-Reply-To: <4CB589B2.50907@kernel.org>

Le mercredi 13 octobre 2010 à 14:28 +0400, Denis Kirjanov a écrit :

> +		u64 tx_defered;
> +		u64 tx_defered_excessive;

Joe said correct english word is "deferred", not "defered"




^ permalink raw reply

* Re: [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll
From: Neil Horman @ 2010-10-13 10:51 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, bonding-devel, fubar, davem, andy
In-Reply-To: <4CB51D27.8030703@redhat.com>

On Wed, Oct 13, 2010 at 10:44:55AM +0800, Cong Wang wrote:
> On 10/13/10 05:55, nhorman@tuxdriver.com wrote:
> >--- a/drivers/net/bonding/bond_main.c
> >+++ b/drivers/net/bonding/bond_main.c
> >@@ -76,6 +76,7 @@
> >  #include<linux/if_vlan.h>
> >  #include<linux/if_bonding.h>
> >  #include<linux/jiffies.h>
> >+#include<linux/preempt.h>
> >  #include<net/route.h>
> >  #include<net/net_namespace.h>
> >  #include<net/netns/generic.h>
> >@@ -169,6 +170,35 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link
> >
> >  /*----------------------------- Global variables ----------------------------*/
> >
> >+#ifdef CONFIG_NET_POLL_CONTROLLER
> >+static cpumask_var_t netpoll_block_tx;
> >+
> >+static inline void block_netpoll_tx(void
> >+{
> >+	preempt_disable();
> >+	BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(),
> >+					netpoll_block_tx));
> >+}
> >+
> >+static inline void unblock_netpoll_tx(void)
> >+{
> >+	BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(),
> >+					  netpoll_block_tx));
> >+	preempt_enable();
> >+}
> >+
> >+static inline int is_netpoll_tx_blocked(struct net_device *dev)
> >+{
> >+	if (unlikely(dev->priv_flags&  IFF_IN_NETPOLL))
> >+		return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx);
> >+	return 0;
> >+}
> >+#else
> >+#define block_netpoll_tx()
> >+#define unblock_netpoll_tx()
> >+#define is_netpoll_tx_blocked(dev)
> >+#endif
> >+
> 
> These should go to netpoll.h, IMHO.
> 

Doh, you're right, they should.  Particularly because I just noticed there are
a few paths through sysfs for bonding that can recurse the same way the
monitoring code can.  I'll respin this shortly, thanks.
Neil


^ permalink raw reply

* Re: [PATCH -next] sundance: Add initial ethtool stats support
From: Denis Kirjanov @ 2010-10-13 10:56 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, David Miller, Ben Hutchings, Jeff Garzik
In-Reply-To: <1286966186.3876.246.camel@edumazet-laptop>

On 10/13/2010 02:36 PM, Eric Dumazet wrote:
> Le mercredi 13 octobre 2010 à 14:28 +0400, Denis Kirjanov a écrit :
> 
>> +		u64 tx_defered;
>> +		u64 tx_defered_excessive;
> 
PATCH -next v5] sundance: Add ethtool stats support

Add ethtool stats support.

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
---
V2:
 check for the ETH_SS_STATS in get_string()
 use xstats struct for ethtool stats
V3:
 make counters 64-bits wide
V4:
 fixed some misspellings
V5:
 fixed issue with collisions counting

drivers/net/sundance.c |   94 ++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/drivers/net/sundance.c b/drivers/net/sundance.c
index 4283cc5..3ed2a67 100644
--- a/drivers/net/sundance.c
+++ b/drivers/net/sundance.c
@@ -363,6 +363,19 @@ struct netdev_private {
         dma_addr_t tx_ring_dma;
         dma_addr_t rx_ring_dma;
 	struct timer_list timer;		/* Media monitoring timer. */
+	/* ethtool extra stats */
+	struct {
+		u64 tx_multiple_collisions;
+		u64 tx_single_collisions;
+		u64 tx_late_collisions;
+		u64 tx_deferred;
+		u64 tx_deferred_excessive;
+		u64 tx_aborted;
+		u64 tx_bcasts;
+		u64 rx_bcasts;
+		u64 tx_mcasts;
+		u64 rx_mcasts;
+	} xstats;
 	/* Frequently used values: keep some adjacent for cache effect. */
 	spinlock_t lock;
 	int msg_enable;
@@ -1486,21 +1499,34 @@ static struct net_device_stats *get_stats(struct net_device *dev)
 {
 	struct netdev_private *np = netdev_priv(dev);
 	void __iomem *ioaddr = np->base;
-	int i;
 	unsigned long flags;
+	u8 late_coll, single_coll, mult_coll;
 
 	spin_lock_irqsave(&np->statlock, flags);
 	/* The chip only need report frame silently dropped. */
 	dev->stats.rx_missed_errors	+= ioread8(ioaddr + RxMissed);
 	dev->stats.tx_packets += ioread16(ioaddr + TxFramesOK);
 	dev->stats.rx_packets += ioread16(ioaddr + RxFramesOK);
-	dev->stats.collisions += ioread8(ioaddr + StatsLateColl);
-	dev->stats.collisions += ioread8(ioaddr + StatsMultiColl);
-	dev->stats.collisions += ioread8(ioaddr + StatsOneColl);
 	dev->stats.tx_carrier_errors += ioread8(ioaddr + StatsCarrierError);
-	ioread8(ioaddr + StatsTxDefer);
-	for (i = StatsTxDefer; i <= StatsMcastRx; i++)
-		ioread8(ioaddr + i);
+
+	mult_coll = ioread8(ioaddr + StatsMultiColl);
+	np->xstats.tx_multiple_collisions += mult_coll;
+	single_coll = ioread8(ioaddr + StatsOneColl);
+	np->xstats.tx_single_collisions += single_coll;
+	late_coll = ioread8(ioaddr + StatsLateColl);
+	np->xstats.tx_late_collisions += late_coll;
+	dev->stats.collisions += mult_coll
+		+ single_coll
+		+ late_coll;
+
+	np->xstats.tx_deferred += ioread8(ioaddr + StatsTxDefer);
+	np->xstats.tx_deferred_excessive += ioread8(ioaddr + StatsTxXSDefer);
+	np->xstats.tx_aborted += ioread8(ioaddr + StatsTxAbort);
+	np->xstats.tx_bcasts += ioread8(ioaddr + StatsBcastTx);
+	np->xstats.rx_bcasts += ioread8(ioaddr + StatsBcastRx);
+	np->xstats.tx_mcasts += ioread8(ioaddr + StatsMcastTx);
+	np->xstats.rx_mcasts += ioread8(ioaddr + StatsMcastRx);
+
 	dev->stats.tx_bytes += ioread16(ioaddr + TxOctetsLow);
 	dev->stats.tx_bytes += ioread16(ioaddr + TxOctetsHigh) << 16;
 	dev->stats.rx_bytes += ioread16(ioaddr + RxOctetsLow);
@@ -1566,6 +1592,21 @@ static int __set_mac_addr(struct net_device *dev)
 	return 0;
 }
 
+static const struct {
+	const char name[ETH_GSTRING_LEN];
+} sundance_stats[] = {
+	{ "tx_multiple_collisions" },
+	{ "tx_single_collisions" },
+	{ "tx_late_collisions" },
+	{ "tx_deferred" },
+	{ "tx_deferred_excessive" },
+	{ "tx_aborted" },
+	{ "tx_bcasts" },
+	{ "rx_bcasts" },
+	{ "tx_mcasts" },
+	{ "rx_mcasts" },
+};
+
 static int check_if_running(struct net_device *dev)
 {
 	if (!netif_running(dev))
@@ -1624,6 +1665,42 @@ static void set_msglevel(struct net_device *dev, u32 val)
 	np->msg_enable = val;
 }
 
+static void get_strings(struct net_device *dev, u32 stringset,
+		u8 *data)
+{
+	if (stringset == ETH_SS_STATS)
+		memcpy(data, sundance_stats, sizeof(sundance_stats));
+}
+
+static int get_sset_count(struct net_device *dev, int sset)
+{
+	switch (sset) {
+	case ETH_SS_STATS:
+		return ARRAY_SIZE(sundance_stats);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static void get_ethtool_stats(struct net_device *dev,
+		struct ethtool_stats *stats, u64 *data)
+{
+	struct netdev_private *np = netdev_priv(dev);
+	int i = 0;
+
+	get_stats(dev);
+	data[i++] = np->xstats.tx_multiple_collisions;
+	data[i++] = np->xstats.tx_single_collisions;
+	data[i++] = np->xstats.tx_late_collisions;
+	data[i++] = np->xstats.tx_deferred;
+	data[i++] = np->xstats.tx_deferred_excessive;
+	data[i++] = np->xstats.tx_aborted;
+	data[i++] = np->xstats.tx_bcasts;
+	data[i++] = np->xstats.rx_bcasts;
+	data[i++] = np->xstats.tx_mcasts;
+	data[i++] = np->xstats.rx_mcasts;
+}
+
 static const struct ethtool_ops ethtool_ops = {
 	.begin = check_if_running,
 	.get_drvinfo = get_drvinfo,
@@ -1633,6 +1710,9 @@ static const struct ethtool_ops ethtool_ops = {
 	.get_link = get_link,
 	.get_msglevel = get_msglevel,
 	.set_msglevel = set_msglevel,
+	.get_strings = get_strings,
+	.get_sset_count = get_sset_count,
+	.get_ethtool_stats = get_ethtool_stats,
 };
 
 static int netdev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
-- 
1.7.0


^ permalink raw reply related

* Re: [MeeGo-Dev][PATCH v3] Topcliff: Update PCH_CAN driver to 2.6.35
From: Wolfgang Grandegger @ 2010-10-13 11:08 UTC (permalink / raw)
  To: Masayuki Ohtake
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w,
	qi.wang-ral2JQCrhuEAvxtiuMwx3w,
	margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA, yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w, Christian Pellegrin,
	Tomoya MORINAGA, David S. Miller,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w, Samuel Ortiz
In-Reply-To: <004a01cb6abe$c41a9cc0$66f8800a-a06+6cuVnkTSQfdrb5gaxUEOCMrvLtNR@public.gmane.org>

On 10/13/2010 12:09 PM, Masayuki Ohtake wrote:
> On Thursday, September 30, 2010 6:10 PM, Wolfgang Grandegger wrote:
> 
>>
>> + iowrite32(num, &(priv->regs)->if2_creq);
>> + while (counter) {
>>> + if2_creq = (ioread32(&(priv->regs)->if2_creq)) &
>>> +      CAN_IF_CREQ_BUSY;
>>> + if (!if2_creq)
>>> + break;
>>> + counter--;
>>> + }
>>> + if (!counter)
>>> + dev_err(&priv->ndev->dev, "IF2 BUSY Flag is set forever.\n");
>>> +}
>>
>> Duplicated code!
> 
> No.
> These are not the same.

Of course they are not the same. The only difference is the register
offset (of if1 or if2). A common function with a pointer to the if
register as argument makes sense.

> Though it is possible to integrate to one function by adding parameter,
> I think, current two function method is more easily to read.

I disagree.

>>
>>
>>
>>> + if (status & PCH_STUF_ERR)
>>> + cf->data[2] |= CAN_ERR_PROT_STUFF;
>>> +
>>> + if (status & PCH_FORM_ERR)
>>> + cf->data[2] |= CAN_ERR_PROT_FORM;
>> +
>> + if (status & PCH_ACK_ERR)
>> + cf->data[2] |= CAN_ERR_PROT_LOC_ACK | CAN_ERR_PROT_LOC_ACK_DEL;
>> +
>> + if ((status & PCH_BIT1_ERR) || (status & PCH_BIT0_ERR))
>> + cf->data[2] |= CAN_ERR_PROT_BIT;
>> +
>> + if (status & PCH_CRC_ERR)
>> + cf->data[2] |= CAN_ERR_PROT_LOC_CRC_SEQ |
>> + CAN_ERR_PROT_LOC_CRC_DEL;
>> +
>> + if (status & PCH_LEC_ALL)
>> + iowrite32(status | PCH_LEC_ALL,
>> +   &(priv->regs)->stat);

Well, if status==7 (PCH_LEC_ALL), all of the above conditions are true
as well... convinced now?

>> A bit-wise test of the above values is wrong, I believe. Please use the
>> switch statement instead.
> 
> The above conditions are not only one time.
> I think "switch" is not suitable for the above.
> Thus, current "if" processing is better.

I don't understand! The Last Error Code (LEC) can have values from 0 to
7. A "switch" statement is therefore the right choice. Or have I missed
something.

>>
>>
>> + u32 brp;
>> +
>> + pch_can_get_run_mode(priv, &curr_mode);
>> + if (curr_mode == PCH_CAN_RUN)
>> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
>>
>> The device is stopped when this function is called. Please remove.
> 
> No.
> The above is necessary.

Yes, because you started the device *too early* in pch_can_open() called
by pch_open(). See my other related comments of my previous mail.

> Because this is our HW specification.
> Before setting bitrate, run-mode must be "STOP".

I think it can be avoided easily.

>>
>>
>> +static netdev_tx_t pch_xmit(struct sk_buff *skb, struct net_device *ndev)
>> +{
>> + canid_t id;
>> + u32 id1 = 0;
>> + u32 id2 = 0;
>>
>> Need these values to be preset?
> 
> These values are not essential.
> But these help a engineer to read this code.

I disagree.

>> + /* Enable CAN Interrupts */
>> + pch_can_set_int_custom(priv);
>> +
>> + /* Restore Run Mode */
>> + pch_can_set_run_mode(priv, PCH_CAN_RUN);
>> +
>> + return retval;
>> +}
>>
>> Are the suspend and resume functions tested?
>>
> Yes, we tested before.
> 
> =========================================
> 
> Except the above, we are modifying for your indications.
> 
> I will resubmit soon.

Thanks,

Wolfgang.

^ permalink raw reply

* Re: BUG ? ipip unregister_netdevice_many()
From: Jarek Poplawski @ 2010-10-13 11:19 UTC (permalink / raw)
  To: David Miller; +Cc: ebiederm, hans.schillstrom, daniel.lezcano, netdev
In-Reply-To: <20101012.130520.48517464.davem@davemloft.net>

On 2010-10-12 22:05, David Miller wrote:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Fri, 08 Oct 2010 10:32:40 -0700
> 
>> It is just dealing with not flushing the entire routing cache, just the
>> routes that have expired.  Which prevents one network namespace from
>> flushing it's routes and DOS'ing another.
> 
> That's a very indirect and obfuscated way of handling it.
> 
> And I still don't know why we let the first contiguous set of expired
> entries in the chain get freed outside of the lock, and the rest
> inside the lock.  That really isn't explained by anything I've read.
> 
> How about we just do exactly what's intended, and with no ifdefs?
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
...
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 0755aa4..6ad730c 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -712,13 +712,14 @@ static inline int rt_is_expired(struct rtable *rth)
>   * Can be called by a softirq or a process.
>   * In the later case, we want to be reschedule if necessary
>   */
> -static void rt_do_flush(int process_context)
> +static void rt_do_flush(struct net *net, int process_context)
>  {
>  	unsigned int i;
>  	struct rtable *rth, *next;
> -	struct rtable * tail;
>  
>  	for (i = 0; i <= rt_hash_mask; i++) {
> +		struct rtable *list, **pprev;

Isn't "list = NULL" needed here?

Jarek P.

...
> +				rth->dst.rt_next = list;
> +				list = rth;
> +			} else
> +				pprev = &rth->dst.rt_next;
> +
> +			rth = next;
...

^ permalink raw reply

* Re: kernel panic in fib_rules_lookup [2.6.27.7 vendor-patched]
From: Joe Buehler @ 2010-10-13 11:18 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1286905245.2703.3.camel@edumazet-laptop>

Eric Dumazet wrote:
> 2.6.27 is a bit old, you might try :
>
> commit 7fa7cb7109d07c29ab28bb877bc7049a0150dbe5
> Author: Eric Dumazet <eric.dumazet@gmail.com>
> Date:   Mon Sep 27 04:18:27 2010 +0000
>   
Amazing -- a fix in 20 minutes -- thank you!  I need to do more load
testing but it looks like it may have solved the issue.

Joe Buehler


^ permalink raw reply

* Re: [MeeGo-Dev][PATCH v3] Topcliff: Update PCH_CAN driver to 2.6.35
From: Masayuki Ohtake @ 2010-10-13 12:05 UTC (permalink / raw)
  To: Wolfgang Grandegger
  Cc: andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w, Samuel Ortiz,
	margie.foster-ral2JQCrhuEAvxtiuMwx3w,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	yong.y.wang-ral2JQCrhuEAvxtiuMwx3w,
	kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w,
	joel.clark-ral2JQCrhuEAvxtiuMwx3w, Tomoya MORINAGA,
	David S. Miller, Christian Pellegrin,
	qi.wang-ral2JQCrhuEAvxtiuMwx3w
In-Reply-To: <4CB5931E.8030103@grandegger.com>

Hi Wolfgang,

On Wednesday, October 13, 2010 8:08 PM, Wolfgang Grandegger wrote:
> I disagree.

> I think it can be avoided easily.
> I disagree.
I will modify like you are saying.

> I don't understand! The Last Error Code (LEC) can have values from 0 to
> 7. A "switch" statement is therefore the right choice. Or have I missed
> something.
Yes, you are right.
I miss-understood.

Thanks, Ohtake(OKISemi)

----- Original Message ----- 
From: "Wolfgang Grandegger" <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
To: "Masayuki Ohtake" <masa-korg-ECg8zkTtlr0C6LszWs/t0g@public.gmane.org>
Cc: <joel.clark-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; "Tomoya MORINAGA" <morinaga526-ECg8zkTtlr0C6LszWs/t0g@public.gmane.org>; <kok.howg.ewe-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>;
<yong.y.wang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; <margie.foster-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; <qi.wang-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>; <andrew.chih.howe.khor-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>;
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>; <netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>; <socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org>; "Samuel Ortiz"
<sameo-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>; "Barry Song" <21cnbao-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>; "Christian Pellegrin" <chripell-VaTbYqLCNhc@public.gmane.org>; "Wolfram Sang"
<w.sang-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>; "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Sent: Wednesday, October 13, 2010 8:08 PM
Subject: Re: [MeeGo-Dev][PATCH v3] Topcliff: Update PCH_CAN driver to 2.6.35


> On 10/13/2010 12:09 PM, Masayuki Ohtake wrote:
> > On Thursday, September 30, 2010 6:10 PM, Wolfgang Grandegger wrote:
> >
> >>
> >> + iowrite32(num, &(priv->regs)->if2_creq);
> >> + while (counter) {
> >>> + if2_creq = (ioread32(&(priv->regs)->if2_creq)) &
> >>> +      CAN_IF_CREQ_BUSY;
> >>> + if (!if2_creq)
> >>> + break;
> >>> + counter--;
> >>> + }
> >>> + if (!counter)
> >>> + dev_err(&priv->ndev->dev, "IF2 BUSY Flag is set forever.\n");
> >>> +}
> >>
> >> Duplicated code!
> >
> > No.
> > These are not the same.
>
> Of course they are not the same. The only difference is the register
> offset (of if1 or if2). A common function with a pointer to the if
> register as argument makes sense.
>
> > Though it is possible to integrate to one function by adding parameter,
> > I think, current two function method is more easily to read.
>
> I disagree.
>
> >>
> >>
> >>
> >>> + if (status & PCH_STUF_ERR)
> >>> + cf->data[2] |= CAN_ERR_PROT_STUFF;
> >>> +
> >>> + if (status & PCH_FORM_ERR)
> >>> + cf->data[2] |= CAN_ERR_PROT_FORM;
> >> +
> >> + if (status & PCH_ACK_ERR)
> >> + cf->data[2] |= CAN_ERR_PROT_LOC_ACK | CAN_ERR_PROT_LOC_ACK_DEL;
> >> +
> >> + if ((status & PCH_BIT1_ERR) || (status & PCH_BIT0_ERR))
> >> + cf->data[2] |= CAN_ERR_PROT_BIT;
> >> +
> >> + if (status & PCH_CRC_ERR)
> >> + cf->data[2] |= CAN_ERR_PROT_LOC_CRC_SEQ |
> >> + CAN_ERR_PROT_LOC_CRC_DEL;
> >> +
> >> + if (status & PCH_LEC_ALL)
> >> + iowrite32(status | PCH_LEC_ALL,
> >> +   &(priv->regs)->stat);
>
> Well, if status==7 (PCH_LEC_ALL), all of the above conditions are true
> as well... convinced now?
>
> >> A bit-wise test of the above values is wrong, I believe. Please use the
> >> switch statement instead.
> >
> > The above conditions are not only one time.
> > I think "switch" is not suitable for the above.
> > Thus, current "if" processing is better.
>
> I don't understand! The Last Error Code (LEC) can have values from 0 to
> 7. A "switch" statement is therefore the right choice. Or have I missed
> something.
>
> >>
> >>
> >> + u32 brp;
> >> +
> >> + pch_can_get_run_mode(priv, &curr_mode);
> >> + if (curr_mode == PCH_CAN_RUN)
> >> + pch_can_set_run_mode(priv, PCH_CAN_STOP);
> >>
> >> The device is stopped when this function is called. Please remove.
> >
> > No.
> > The above is necessary.
>
> Yes, because you started the device *too early* in pch_can_open() called
> by pch_open(). See my other related comments of my previous mail.
>
> > Because this is our HW specification.
> > Before setting bitrate, run-mode must be "STOP".
>
> I think it can be avoided easily.
>
> >>
> >>
> >> +static netdev_tx_t pch_xmit(struct sk_buff *skb, struct net_device *ndev)
> >> +{
> >> + canid_t id;
> >> + u32 id1 = 0;
> >> + u32 id2 = 0;
> >>
> >> Need these values to be preset?
> >
> > These values are not essential.
> > But these help a engineer to read this code.
>
> I disagree.
>
> >> + /* Enable CAN Interrupts */
> >> + pch_can_set_int_custom(priv);
> >> +
> >> + /* Restore Run Mode */
> >> + pch_can_set_run_mode(priv, PCH_CAN_RUN);
> >> +
> >> + return retval;
> >> +}
> >>
> >> Are the suspend and resume functions tested?
> >>
> > Yes, we tested before.
> >
> > =========================================
> >
> > Except the above, we are modifying for your indications.
> >
> > I will resubmit soon.
>
> Thanks,
>
> Wolfgang.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply

* [PATCH 1/5] Fix bonding drivers improper modification of netpoll structure
From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw)
  To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1286973334-4339-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>

The bonding driver currently modifies the netpoll structure in its xmit path
while sending frames from netpoll.  This is racy, as other cpus can access the
netpoll structure in parallel. Since the bonding driver points np->dev to a
slave device, other cpus can inadvertently attempt to send data directly to
slave devices, leading to improper locking with the bonding master, lost frames,
and deadlocks.  This patch fixes that up.

This patch also removes the real_dev pointer from the netpoll structure as that
data is really only used by bonding in the poll_controller, and we can emulate
its behavior by check each slave for IS_UP.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 drivers/net/bonding/bond_main.c |   15 +++++++++------
 include/linux/netpoll.h         |    9 +++++++--
 net/core/netpoll.c              |    6 +++---
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index a0bf35d..eb7d089 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -449,11 +449,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 	if (unlikely(bond->dev->priv_flags & IFF_IN_NETPOLL)) {
 		struct netpoll *np = bond->dev->npinfo->netpoll;
 		slave_dev->npinfo = bond->dev->npinfo;
-		np->real_dev = np->dev = skb->dev;
 		slave_dev->priv_flags |= IFF_IN_NETPOLL;
-		netpoll_send_skb(np, skb);
+		netpoll_send_skb_on_dev(np, skb, slave_dev);
 		slave_dev->priv_flags &= ~IFF_IN_NETPOLL;
-		np->dev = bond->dev;
 	} else
 #endif
 		dev_queue_xmit(skb);
@@ -1332,9 +1330,14 @@ static bool slaves_support_netpoll(struct net_device *bond_dev)
 
 static void bond_poll_controller(struct net_device *bond_dev)
 {
-	struct net_device *dev = bond_dev->npinfo->netpoll->real_dev;
-	if (dev != bond_dev)
-		netpoll_poll_dev(dev);
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave;
+	int i;
+
+	bond_for_each_slave(bond, slave, i) {
+		if (slave->dev && IS_UP(slave->dev))
+			netpoll_poll_dev(slave->dev);
+	}
 }
 
 static void bond_netpoll_cleanup(struct net_device *bond_dev)
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 50d8009..79358bb 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -14,7 +14,6 @@
 
 struct netpoll {
 	struct net_device *dev;
-	struct net_device *real_dev;
 	char dev_name[IFNAMSIZ];
 	const char *name;
 	void (*rx_hook)(struct netpoll *, int, char *, int);
@@ -53,7 +52,13 @@ void netpoll_set_trap(int trap);
 void __netpoll_cleanup(struct netpoll *np);
 void netpoll_cleanup(struct netpoll *np);
 int __netpoll_rx(struct sk_buff *skb);
-void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb);
+void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
+			     struct net_device *dev);
+static inline void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
+{
+	netpoll_send_skb_on_dev(np, skb, np->dev);
+}
+
 
 
 #ifdef CONFIG_NETPOLL
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 537e01a..4e98ffa 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -288,11 +288,11 @@ static int netpoll_owner_active(struct net_device *dev)
 	return 0;
 }
 
-void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
+void netpoll_send_skb_on_dev(struct netpoll *np, struct sk_buff *skb,
+			     struct net_device *dev)
 {
 	int status = NETDEV_TX_BUSY;
 	unsigned long tries;
-	struct net_device *dev = np->dev;
 	const struct net_device_ops *ops = dev->netdev_ops;
 	/* It is up to the caller to keep npinfo alive. */
 	struct netpoll_info *npinfo = np->dev->npinfo;
@@ -346,7 +346,7 @@ void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
 		schedule_delayed_work(&npinfo->tx_work,0);
 	}
 }
-EXPORT_SYMBOL(netpoll_send_skb);
+EXPORT_SYMBOL(netpoll_send_skb_on_dev);
 
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 {
-- 
1.7.2.3


^ permalink raw reply related

* [PATCH] bonding: various fixes for bonding, netpoll & netconsole (v2)
From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw)
  To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman

Version 2, taking teh following changes into account:

1) Moved tx blocking/checking macros to netpoll.h as suggested by amwang

2) Added tx blocking macro calls to sysfs paths, as they can deadlock in the
same way that the link monitoring paths can.

Summary: 
A while ago we tried to enable netpoll on the bonding driver to enable
netconsole.  That worked well in a steady state, but deadlocked frequently in
failover conditions due to some recursive lock-taking (as well as a few other
problems).  I've gone through the driver, netconsole and netpoll code, fixed up
those deadlocks, and confirmed that, with this patch series, we can use
netconsole on bonding without deadlock in all bonding modes with all slaves,
even accross failovers.  I've also fixed up some incidental bugs that I ran
across while looking through this code, as described in individual patches

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* [PATCH 2/5] Fix deadlock in bonding driver resulting from internal locking when using netpoll
From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw)
  To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1286973334-4339-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>

The monitoring paths in the bonding driver take write locks that are shared by
the tx path.  If netconsole is in use, these paths can call printk which puts us
in the netpoll tx path, which, if netconsole is attached to the bonding driver,
result in deadlock (the xmit_lock guards are useless in netpoll_send_skb, as the
monitor paths in the bonding driver don't claim the xmit_lock, nor should they).
The solution is to use a per cpu flag internal to the driver to indicate when a
cpu is holding the lock in a path that might recusrse into the tx path for the
driver via netconsole.  By checking this flag on transmit, we can defer the
sending of the netconsole frames until a later time using the retransmit feature
of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY.  I've
tested this and am able to transmit via netconsole while causing failover
conditions on the bond slave links.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 drivers/net/bonding/bond_main.c  |   42 ++++++++++++++++++++++++++++++++++---
 drivers/net/bonding/bond_sysfs.c |    8 +++++++
 drivers/net/bonding/bonding.h    |   30 +++++++++++++++++++++++++++
 3 files changed, 76 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index eb7d089..bc38d69 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -76,6 +76,7 @@
 #include <linux/if_vlan.h>
 #include <linux/if_bonding.h>
 #include <linux/jiffies.h>
+#include <linux/preempt.h>
 #include <net/route.h>
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
@@ -169,6 +170,10 @@ MODULE_PARM_DESC(resend_igmp, "Number of IGMP membership reports to send on link
 
 /*----------------------------- Global variables ----------------------------*/
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+cpumask_var_t netpoll_block_tx;
+#endif
+
 static const char * const version =
 	DRV_DESCRIPTION ": v" DRV_VERSION " (" DRV_RELDATE ")\n";
 
@@ -310,6 +315,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id)
 
 	pr_debug("bond: %s, vlan id %d\n", bond->dev->name, vlan_id);
 
+	block_netpoll_tx();
 	write_lock_bh(&bond->lock);
 
 	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
@@ -344,6 +350,7 @@ static int bond_del_vlan(struct bonding *bond, unsigned short vlan_id)
 
 out:
 	write_unlock_bh(&bond->lock);
+	unblock_netpoll_tx();
 	return res;
 }
 
@@ -1804,10 +1811,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	bond_set_carrier(bond);
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-	/*
-	 * Netpoll and bonding is broken, make sure it is not initialized
-	 * until it is fixed.
-	 */
 	if (disable_netpoll) {
 		bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
 	} else {
@@ -1892,6 +1895,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 		return -EINVAL;
 	}
 
+	block_netpoll_tx();
 	netdev_bonding_change(bond_dev, NETDEV_BONDING_DESLAVE);
 	write_lock_bh(&bond->lock);
 
@@ -1901,6 +1905,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 		pr_info("%s: %s not enslaved\n",
 			bond_dev->name, slave_dev->name);
 		write_unlock_bh(&bond->lock);
+		unblock_netpoll_tx();
 		return -EINVAL;
 	}
 
@@ -1994,6 +1999,7 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	}
 
 	write_unlock_bh(&bond->lock);
+	unblock_netpoll_tx();
 
 	/* must do this from outside any spinlocks */
 	bond_destroy_slave_symlinks(bond_dev, slave_dev);
@@ -2085,6 +2091,7 @@ static int bond_release_all(struct net_device *bond_dev)
 	struct net_device *slave_dev;
 	struct sockaddr addr;
 
+	block_netpoll_tx();
 	write_lock_bh(&bond->lock);
 
 	netif_carrier_off(bond_dev);
@@ -2183,6 +2190,7 @@ static int bond_release_all(struct net_device *bond_dev)
 
 out:
 	write_unlock_bh(&bond->lock);
+	unblock_netpoll_tx();
 
 	return 0;
 }
@@ -2232,9 +2240,11 @@ static int bond_ioctl_change_active(struct net_device *bond_dev, struct net_devi
 	    (old_active) &&
 	    (new_active->link == BOND_LINK_UP) &&
 	    IS_UP(new_active->dev)) {
+		block_netpoll_tx();
 		write_lock_bh(&bond->curr_slave_lock);
 		bond_change_active_slave(bond, new_active);
 		write_unlock_bh(&bond->curr_slave_lock);
+		unblock_netpoll_tx();
 	} else
 		res = -EINVAL;
 
@@ -2466,9 +2476,11 @@ static void bond_miimon_commit(struct bonding *bond)
 
 do_failover:
 		ASSERT_RTNL();
+		block_netpoll_tx();
 		write_lock_bh(&bond->curr_slave_lock);
 		bond_select_active_slave(bond);
 		write_unlock_bh(&bond->curr_slave_lock);
+		unblock_netpoll_tx();
 	}
 
 	bond_set_carrier(bond);
@@ -2911,11 +2923,13 @@ void bond_loadbalance_arp_mon(struct work_struct *work)
 	}
 
 	if (do_failover) {
+		block_netpoll_tx();
 		write_lock_bh(&bond->curr_slave_lock);
 
 		bond_select_active_slave(bond);
 
 		write_unlock_bh(&bond->curr_slave_lock);
+		unblock_netpoll_tx();
 	}
 
 re_arm:
@@ -3074,9 +3088,11 @@ static void bond_ab_arp_commit(struct bonding *bond, int delta_in_ticks)
 
 do_failover:
 		ASSERT_RTNL();
+		block_netpoll_tx();
 		write_lock_bh(&bond->curr_slave_lock);
 		bond_select_active_slave(bond);
 		write_unlock_bh(&bond->curr_slave_lock);
+		unblock_netpoll_tx();
 	}
 
 	bond_set_carrier(bond);
@@ -4564,6 +4580,13 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct bonding *bond = netdev_priv(dev);
 
+	/*
+	 * If we risk deadlock from transmitting this in the
+	 * netpoll path, tell netpoll to queue the frame for later tx
+	 */
+	if (is_netpoll_tx_blocked(dev))
+		return NETDEV_TX_BUSY;
+
 	if (TX_QUEUE_OVERRIDE(bond->params.mode)) {
 		if (!bond_slave_override(bond, skb))
 			return NETDEV_TX_OK;
@@ -5295,6 +5318,13 @@ static int __init bonding_init(void)
 	if (res)
 		goto err;
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	if (!alloc_cpumask_var(&netpoll_block_tx, GFP_KERNEL)) {
+		res = -ENOMEM;
+		bond_destroy_sysfs();
+		goto err;
+	}
+#endif
 	register_netdevice_notifier(&bond_netdev_notifier);
 	register_inetaddr_notifier(&bond_inetaddr_notifier);
 	bond_register_ipv6_notifier();
@@ -5316,6 +5346,10 @@ static void __exit bonding_exit(void)
 
 	bond_destroy_sysfs();
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	free_cpumask_var(netpoll_block_tx);
+#endif
+
 	rtnl_link_unregister(&bond_link_ops);
 	unregister_pernet_subsys(&bond_net_ops);
 }
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 01b4c3f..8fd0174 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1066,6 +1066,7 @@ static ssize_t bonding_store_primary(struct device *d,
 
 	if (!rtnl_trylock())
 		return restart_syscall();
+	block_netpoll_tx();
 	read_lock(&bond->lock);
 	write_lock_bh(&bond->curr_slave_lock);
 
@@ -1101,6 +1102,7 @@ static ssize_t bonding_store_primary(struct device *d,
 out:
 	write_unlock_bh(&bond->curr_slave_lock);
 	read_unlock(&bond->lock);
+	unblock_netpoll_tx();
 	rtnl_unlock();
 
 	return count;
@@ -1146,11 +1148,13 @@ static ssize_t bonding_store_primary_reselect(struct device *d,
 		bond->dev->name, pri_reselect_tbl[new_value].modename,
 		new_value);
 
+	block_netpoll_tx();
 	read_lock(&bond->lock);
 	write_lock_bh(&bond->curr_slave_lock);
 	bond_select_active_slave(bond);
 	write_unlock_bh(&bond->curr_slave_lock);
 	read_unlock(&bond->lock);
+	unblock_netpoll_tx();
 out:
 	rtnl_unlock();
 	return ret;
@@ -1232,6 +1236,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
 
 	if (!rtnl_trylock())
 		return restart_syscall();
+
+	block_netpoll_tx();
 	read_lock(&bond->lock);
 	write_lock_bh(&bond->curr_slave_lock);
 
@@ -1288,6 +1294,8 @@ static ssize_t bonding_store_active_slave(struct device *d,
  out:
 	write_unlock_bh(&bond->curr_slave_lock);
 	read_unlock(&bond->lock);
+	unblock_netpoll_tx();
+
 	rtnl_unlock();
 
 	return count;
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index c15f213..deef1aa 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -19,6 +19,7 @@
 #include <linux/proc_fs.h>
 #include <linux/if_bonding.h>
 #include <linux/kobject.h>
+#include <linux/cpumask.h>
 #include <linux/in6.h>
 #include "bond_3ad.h"
 #include "bond_alb.h"
@@ -117,6 +118,35 @@
 		bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave)
 
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+extern cpumask_var_t netpoll_block_tx;
+
+static inline void block_netpoll_tx(void)
+{
+	preempt_disable();
+	BUG_ON(cpumask_test_and_set_cpu(smp_processor_id(),
+					netpoll_block_tx));
+}
+
+static inline void unblock_netpoll_tx(void)
+{
+	BUG_ON(!cpumask_test_and_clear_cpu(smp_processor_id(),
+					   netpoll_block_tx));
+	preempt_enable();
+}
+
+static inline int is_netpoll_tx_blocked(struct net_device *dev)
+{
+	if (unlikely(dev->priv_flags & IFF_IN_NETPOLL))
+		return cpumask_test_cpu(smp_processor_id(), netpoll_block_tx);
+	return 0;
+}
+#else
+#define block_netpoll_tx()
+#define unblock_netpoll_tx()
+#define is_netpoll_tx_blocked(dev)
+#endif
+
 struct bond_params {
 	int mode;
 	int xmit_policy;
-- 
1.7.2.3


^ permalink raw reply related

* [PATCH 3/5] Fix napi poll for bonding driver
From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw)
  To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1286973334-4339-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>

Usually the netpoll path, when preforming a napi poll can get away with just
polling all the napi instances of the configured device.  Thats not the case for
the bonding driver however, as the napi instances which may wind up getting
flagged as needing polling after the poll_controller call don't belong to the
bonded device, but rather to the slave devices.  Fix this by checking the device
in question for the IFF_MASTER flag, if set, we know we need to check the full
poll list for this cpu, rather than just the devices napi instance list.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 net/core/netpoll.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 4e98ffa..d79d221 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -156,8 +156,15 @@ static void poll_napi(struct net_device *dev)
 {
 	struct napi_struct *napi;
 	int budget = 16;
+	struct softnet_data *sd = &__get_cpu_var(softnet_data);
+	struct list_head *nlist;

-	list_for_each_entry(napi, &dev->napi_list, dev_list) {
+	if (dev->flags & IFF_MASTER)
+		nlist = &sd->poll_list;
+	else
+		nlist = &dev->napi_list;
+
+	list_for_each_entry(napi, nlist, dev_list) {
 		if (napi->poll_owner != smp_processor_id() &&
 		    spin_trylock(&napi->poll_lock)) {
 			budget = poll_one_napi(dev->npinfo, napi, budget);
-- 
1.7.2.3

^ permalink raw reply related

* [PATCH 4/5] Fix netconsole to not deadlock on rmmod
From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw)
  To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1286973334-4339-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>

Netconsole calls netpoll_cleanup on receipt of a NETDEVICE_UNREGISTER event.
The notifier subsystem calls these event handlers with rtnl_lock held, which
netpoll_cleanup also takes, resulting in deadlock.  Fix this by calling the
__netpoll_cleanup interior function instead, and fixing up the additional
pointers.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 drivers/net/netconsole.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ca142c4..94255f0 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -678,7 +678,14 @@ static int netconsole_netdev_event(struct notifier_block *this,
 				strlcpy(nt->np.dev_name, dev->name, IFNAMSIZ);
 				break;
 			case NETDEV_UNREGISTER:
-				netpoll_cleanup(&nt->np);
+				/*
+				 * rtnl_lock already held
+				 */
+				if (nt->np.dev) {
+					__netpoll_cleanup(&nt->np);
+					dev_put(nt->np.dev);
+					nt->np.dev = NULL;
+				}
 				/* Fall through */
 			case NETDEV_GOING_DOWN:
 			case NETDEV_BONDING_DESLAVE:
-- 
1.7.2.3


^ permalink raw reply related

* [PATCH 5/5] Re-enable netpoll over bonding
From: nhorman @ 2010-10-13 12:35 UTC (permalink / raw)
  To: netdev; +Cc: bonding-devel, fubar, davem, andy, amwang, nhorman
In-Reply-To: <1286973334-4339-1-git-send-email-nhorman@tuxdriver.com>

From: Neil Horman <nhorman@tuxdriver.com>

With the inclusion of previous fixup patches, netpoll over bonding apears to
work reliably with failover conditions.  This reverts Gospos previous commit
c22d7ac844f1cb9c6a5fd20f89ebadc2feef891b, and allows access again to the netpoll
functionality in the bonding driver.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
---
 drivers/net/bonding/bond_main.c |   29 ++++++++++-------------------
 1 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index bc38d69..d42c380 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -184,9 +184,6 @@ static int arp_ip_count;
 static int bond_mode	= BOND_MODE_ROUNDROBIN;
 static int xmit_hashtype = BOND_XMIT_POLICY_LAYER2;
 static int lacp_fast;
-#ifdef CONFIG_NET_POLL_CONTROLLER
-static int disable_netpoll = 1;
-#endif
 
 const struct bond_parm_tbl bond_lacp_tbl[] = {
 {	"slow",		AD_LACP_SLOW},
@@ -1811,19 +1808,15 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	bond_set_carrier(bond);
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-	if (disable_netpoll) {
+	if (slaves_support_netpoll(bond_dev)) {
+		bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
+		if (bond_dev->npinfo)
+			slave_dev->npinfo = bond_dev->npinfo;
+	} else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) {
 		bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
-	} else {
-		if (slaves_support_netpoll(bond_dev)) {
-			bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
-			if (bond_dev->npinfo)
-				slave_dev->npinfo = bond_dev->npinfo;
-		} else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) {
-			bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
-			pr_info("New slave device %s does not support netpoll\n",
-				slave_dev->name);
-			pr_info("Disabling netpoll support for %s\n", bond_dev->name);
-		}
+		pr_info("New slave device %s does not support netpoll\n",
+			slave_dev->name);
+		pr_info("Disabling netpoll support for %s\n", bond_dev->name);
 	}
 #endif
 	read_unlock(&bond->lock);
@@ -2030,10 +2023,8 @@ int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	read_lock_bh(&bond->lock);
 
-	 /* Make sure netpoll over stays disabled until fixed. */
-	if (!disable_netpoll)
-		if (slaves_support_netpoll(bond_dev))
-				bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
+	if (slaves_support_netpoll(bond_dev))
+		bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
 	read_unlock_bh(&bond->lock);
 	if (slave_dev->netdev_ops->ndo_netpoll_cleanup)
 		slave_dev->netdev_ops->ndo_netpoll_cleanup(slave_dev);
-- 
1.7.2.3


^ permalink raw reply related

* [PATCH net-next] fib6: use FIB_LOOKUP_NOREF in fib6_rule_lookup()
From: Eric Dumazet @ 2010-10-13 12:45 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Avoid two atomic ops on found rule in fib6_rule_lookup()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/ipv6/fib6_rules.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c
index b1108ed..d829874 100644
--- a/net/ipv6/fib6_rules.c
+++ b/net/ipv6/fib6_rules.c
@@ -34,11 +34,10 @@ struct dst_entry *fib6_rule_lookup(struct net *net, struct flowi *fl,
 {
 	struct fib_lookup_arg arg = {
 		.lookup_ptr = lookup,
+		.flags = FIB_LOOKUP_NOREF,
 	};
 
 	fib_rules_lookup(net->ipv6.fib6_rules_ops, fl, flags, &arg);
-	if (arg.rule)
-		fib_rule_put(arg.rule);
 
 	if (arg.result)
 		return arg.result;



^ permalink raw reply related

* Re: [PATCH net-next 1/5] tipc: Enhance enabling and disabling of Ethernet bearers
From: Neil Horman @ 2010-10-13 13:42 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: davem, netdev, allan.stephens
In-Reply-To: <1286929558-2954-1-git-send-email-paul.gortmaker@windriver.com>

On Tue, Oct 12, 2010 at 08:25:54PM -0400, Paul Gortmaker wrote:
> From: Allan Stephens <allan.stephens@windriver.com>
> 
> Use work queue to eliminate release of TIPC's configuration lock when
> registering for device notifications while activating Ethernet media
> support. Optimize code that locates an unused bearer entry when enabling
> an Ethernet bearer. Use work queue to break the association between a
> newly disabled Ethernet bearer and its device driver, thereby allowing
> quicker reuse of the bearer entry.
> 
> Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

> ---
>  net/tipc/config.c    |   13 +------
>  net/tipc/eth_media.c |   93 ++++++++++++++++++++++++++++++-------------------
>  2 files changed, 58 insertions(+), 48 deletions(-)
> 
> diff --git a/net/tipc/config.c b/net/tipc/config.c
> index 961d1b0..a43450c 100644
> --- a/net/tipc/config.c
> +++ b/net/tipc/config.c
> @@ -332,19 +332,8 @@ static struct sk_buff *cfg_set_own_addr(void)
>  		return tipc_cfg_reply_error_string(TIPC_CFG_NOT_SUPPORTED
>  						   " (cannot change node address once assigned)");
>  
> -	/*
> -	 * Must release all spinlocks before calling start_net() because
> -	 * Linux version of TIPC calls eth_media_start() which calls
> -	 * register_netdevice_notifier() which may block!
> -	 *
> -	 * Temporarily releasing the lock should be harmless for non-Linux TIPC,
> -	 * but Linux version of eth_media_start() should really be reworked
> -	 * so that it can be called with spinlocks held.
> -	 */
> -
> -	spin_unlock_bh(&config_lock);
>  	tipc_core_start_net(addr);
> -	spin_lock_bh(&config_lock);
> +
>  	return tipc_cfg_reply_none();
>  }
>  
Why is the work queue needed at all?  Looking at this path, the only entry to it
is from:
genl_rcv_msg
 handle_cmd
  tipc_cfg_do_cmd
   cfg_set_own_addr

That path looks to only be callable from a user space context, so sleeping on
the rtnl lock in when you call register_netdevice_notifier in eth_media_start
should be perfectly fine.  So you should just be able to remove the lock without
any additional work.  Does doing so cause other problems?

> diff --git a/net/tipc/eth_media.c b/net/tipc/eth_media.c
> index 6e988ba..479dbc0 100644
> --- a/net/tipc/eth_media.c
> +++ b/net/tipc/eth_media.c
> @@ -51,17 +51,20 @@
>   * @bearer: ptr to associated "generic" bearer structure
>   * @dev: ptr to associated Ethernet network device
>   * @tipc_packet_type: used in binding TIPC to Ethernet driver
> + * @cleanup: work item used when disabling bearer
>   */
>  
>  struct eth_bearer {
>  	struct tipc_bearer *bearer;
>  	struct net_device *dev;
>  	struct packet_type tipc_packet_type;
> +	struct work_struct cleanup;
>  };
>  
>  static struct eth_bearer eth_bearers[MAX_ETH_BEARERS];
>  static int eth_started = 0;
>  static struct notifier_block notifier;
> +static struct work_struct reg_notifier;
>  
>  /**
>   * send_msg - send a TIPC message out over an Ethernet interface
> @@ -157,22 +160,22 @@ static int enable_bearer(struct tipc_bearer *tb_ptr)
>  	if (!dev)
>  		return -ENODEV;
>  
> -	/* Find Ethernet bearer for device (or create one) */
> -
> -	for (;(eb_ptr != stop) && eb_ptr->dev && (eb_ptr->dev != dev); eb_ptr++);
> -	if (eb_ptr == stop)
> -		return -EDQUOT;
> -	if (!eb_ptr->dev) {
> -		eb_ptr->dev = dev;
> -		eb_ptr->tipc_packet_type.type = htons(ETH_P_TIPC);
> -		eb_ptr->tipc_packet_type.dev = dev;
> -		eb_ptr->tipc_packet_type.func = recv_msg;
> -		eb_ptr->tipc_packet_type.af_packet_priv = eb_ptr;
> -		INIT_LIST_HEAD(&(eb_ptr->tipc_packet_type.list));
> -		dev_hold(dev);
> -		dev_add_pack(&eb_ptr->tipc_packet_type);
> +	/* Create Ethernet bearer for device */
> +
> +	while (eb_ptr->dev != NULL) {
> +		if (++eb_ptr == stop)
> +			return -EDQUOT;
>  	}
>  
> +	eb_ptr->dev = dev;
> +	eb_ptr->tipc_packet_type.type = __constant_htons(ETH_P_TIPC);
> +	eb_ptr->tipc_packet_type.dev = dev;
> +	eb_ptr->tipc_packet_type.func = recv_msg;
> +	eb_ptr->tipc_packet_type.af_packet_priv = eb_ptr;
> +	INIT_LIST_HEAD(&eb_ptr->tipc_packet_type.list);
> +	dev_hold(dev);
> +	dev_add_pack(&eb_ptr->tipc_packet_type);
> +
>  	/* Associate TIPC bearer with Ethernet bearer */
>  
>  	eb_ptr->bearer = tb_ptr;
> @@ -185,16 +188,36 @@ static int enable_bearer(struct tipc_bearer *tb_ptr)
>  }
>  
>  /**
> + * cleanup_bearer - break association between Ethernet bearer and interface
> + *
> + * This routine must be invoked from a work queue because it can sleep.
> + */
> +
> +static void cleanup_bearer(struct work_struct *work)
> +{
> +	struct eth_bearer *eb_ptr =
> +		container_of(work, struct eth_bearer, cleanup);
> +
> +	dev_remove_pack(&eb_ptr->tipc_packet_type);
> +	dev_put(eb_ptr->dev);
> +	eb_ptr->dev = NULL;
> +}
> +
> +/**
>   * disable_bearer - detach TIPC bearer from an Ethernet interface
>   *
> - * We really should do dev_remove_pack() here, but this function can not be
> - * called at tasklet level. => Use eth_bearer->bearer as a flag to throw away
> - * incoming buffers, & postpone dev_remove_pack() to eth_media_stop() on exit.
> + * Mark Ethernet bearer as inactive so that incoming buffers are thrown away,
> + * then get worker thread to complete bearer cleanup.  (Can't do cleanup
> + * here because cleanup code needs to sleep and caller holds spinlocks.)
>   */
>  
>  static void disable_bearer(struct tipc_bearer *tb_ptr)
>  {
> -	((struct eth_bearer *)tb_ptr->usr_handle)->bearer = NULL;
> +	struct eth_bearer *eb_ptr = (struct eth_bearer *)tb_ptr->usr_handle;
> +
> +	eb_ptr->bearer = NULL;
> +	INIT_WORK(&eb_ptr->cleanup, cleanup_bearer);
If you do really need a workqueue for this, you should move this to someplace
like tipc_enable_bearers, in the restart loop, so you only initalize it once.
That also reduces the chance of a race if multiple processes attempt to disable
this barrier in rapid succession.

> +	schedule_work(&eb_ptr->cleanup);
>  }
>  
>  /**
> @@ -265,6 +288,19 @@ static char *eth_addr2str(struct tipc_media_addr *a, char *str_buf, int str_size
>  }
>  
>  /**
> + * do_registration - register TIPC to receive device notifications
> + *
> + * This routine must be invoked from a work queue because it can sleep.
> + */
> +
> +static void do_registration(struct work_struct *dummy)
> +{
> +	notifier.notifier_call = &recv_notification;
> +	notifier.priority = 0;
> +	register_netdevice_notifier(&notifier);
> +}
> +
> +/**
>   * tipc_eth_media_start - activate Ethernet bearer support
>   *
>   * Register Ethernet media type with TIPC bearer code.  Also register
> @@ -291,11 +327,9 @@ int tipc_eth_media_start(void)
>  	if (res)
>  		return res;
>  
> -	notifier.notifier_call = &recv_notification;
> -	notifier.priority = 0;
> -	res = register_netdevice_notifier(&notifier);
> -	if (!res)
> -		eth_started = 1;
> +	INIT_WORK(&reg_notifier, do_registration);
> +	schedule_work(&reg_notifier);
> +	eth_started = 1;
This is racy.  Even though tipc_cfg_do_cmd holds the config_lock, so only one
context can execute this at a time. two requests that execute this code in rapid
succession can re-initalize the reg_notifier work_struct while its sitting on a
work queue, causing an oops if the workqueue task tries to dequeue this entry
while its getting re-initalized.  You need to move this to someplace like the
module_init routine.

>  	return res;
>  }
>  
> @@ -305,22 +339,9 @@ int tipc_eth_media_start(void)
>  
>  void tipc_eth_media_stop(void)
>  {
> -	int i;
> -
>  	if (!eth_started)
>  		return;
>  
>  	unregister_netdevice_notifier(&notifier);
> -	for (i = 0; i < MAX_ETH_BEARERS ; i++) {
> -		if (eth_bearers[i].bearer) {
> -			eth_bearers[i].bearer->blocked = 1;
Where does this now get set when executing a tipc_eth_media_stop?  Don't you
want to block access to all bearers immediately when you call this?

> -			eth_bearers[i].bearer = NULL;
> -		}
> -		if (eth_bearers[i].dev) {
> -			dev_remove_pack(&eth_bearers[i].tipc_packet_type);
> -			dev_put(eth_bearers[i].dev);
> -		}
> -	}
> -	memset(&eth_bearers, 0, sizeof(eth_bearers));
>  	eth_started = 0;
>  }
> -- 
> 1.7.0.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* xfrm by MARK: tcp problems when mark for in and out differ
From: Gerd v. Egidy @ 2010-10-13 13:57 UTC (permalink / raw)
  To: jamal; +Cc: netdev, dev

Hi,

I use current strongswan git to set up ipsec connections with the xfrm by MARK 
feature. When I configure xfrm policies with different marks for incoming and 
outgoing packets, incoming tcp connections can't be established anymore. The 
SYN-ACK packet is never sent through the tunnel.

An example policy looks like this:

src 192.168.5.0/24 dst 192.168.1.0/24
        dir out priority 1760
        mark 5/0xffffffff
        tmpl src 172.16.1.131 dst 172.16.1.130
                proto esp reqid 16384 mode tunnel
src 192.168.1.0/24 dst 192.168.5.0/24
        dir fwd priority 1760
        tmpl src 172.16.1.130 dst 172.16.1.131
                proto esp reqid 16384 mode tunnel
src 192.168.1.0/24 dst 192.168.5.0/24
        dir in priority 1760
        tmpl src 172.16.1.130 dst 172.16.1.131
                proto esp reqid 16384 mode tunnel

-> incoming packets are without mark, outgoing packets are marked with 5

I traced the packet in the xfrm code and found out that the problem is in the 
flow data. When the SYN-ACK hits __xfrm_lookup, the value in fl->mark is 0 
(more precisely: the mark value used in the incoming packet). This means that 
xfrm_policy_match will not match on the correct policy because the mark values 
differ.

I'm not too familiar with the kernel networking code. But I guess that the 
flow for the SYN-ACK is set up based on the data used for the SYN and is not 
updated when my iptables rule changes the mark of the packet:

iptables -t raw -A OUTPUT -s 192.168.5.0/255.255.255.0 -d 
192.168.1.0/255.255.255.0 -j MARK --set-mark 5

I guess that the flow data should be updated somewhere. But I don't know what 
the correct place for that code would be.

Can somebody more familiar with the network stack help me with this please?

Thank you very much.

Kind regards,

Gerd

-- 
Address (better: trap) for people I really don't want to get mail from:
jonas@cactusamerica.com

^ permalink raw reply

* Re: [PATCH] ehea: Fix a checksum issue on the receive path
From: Breno Leitao @ 2010-10-13 14:06 UTC (permalink / raw)
  To: David Miller; +Cc: Stephen Hemminger, eric.dumazet, netdev, fubar
In-Reply-To: <20101009103136.341104c5@nehalam>

On 10/09/2010 02:31 PM, Stephen Hemminger wrote:
> Also hardware checksum can be wrong/broken. By passing up a packet
> which the driver thinks is bad, the software can still work.
I see. Unfortunately ehea driver is dropping the packets that have a 
wrong checksum. I will work on the fix, and soon I will send the patch.

David,

Just to clarify, this patch that started this thead is not invalidated 
by this "problem". So, I'd like to see this patch committed on your 
tree. Does it make sense?

Thanks
Breno

^ permalink raw reply

* Re: [PATCH net-next 2/5] tipc: Simplify bearer shutdown logic
From: Neil Horman @ 2010-10-13 14:39 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: davem, netdev, allan.stephens
In-Reply-To: <1286929558-2954-2-git-send-email-paul.gortmaker@windriver.com>

On Tue, Oct 12, 2010 at 08:25:55PM -0400, Paul Gortmaker wrote:
> From: Allan Stephens <allan.stephens@windriver.com>
> 
> Disable all active bearers when TIPC is shut down without having to do
> a name-based search to locate each bearer object.
> 
It seems like you're doing a good deal more in this patch than just disabling
all active bearers without doing a name search.  The description is implemented
in the for loop of tipc_bearer_stop.  Whats the rest of it for?

> Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> ---
>  net/tipc/bearer.c |   61 ++++++++++++++++++++--------------------------------
>  1 files changed, 24 insertions(+), 37 deletions(-)
> 
> diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
> index 9c10c6b..9969ec6 100644
> --- a/net/tipc/bearer.c
> +++ b/net/tipc/bearer.c
> @@ -280,39 +280,39 @@ static int bearer_name_validate(const char *name,
>  }
>  
>  /**
> - * bearer_find - locates bearer object with matching bearer name
> + * tipc_bearer_find_interface - locates bearer object with matching interface name
>   */
>  
> -static struct bearer *bearer_find(const char *name)
> +struct bearer *tipc_bearer_find_interface(const char *if_name)
>  {
>  	struct bearer *b_ptr;
> +	char *b_if_name;
>  	u32 i;
>  
> -	if (tipc_mode != TIPC_NET_MODE)
> -		return NULL;
> -
>  	for (i = 0, b_ptr = tipc_bearers; i < MAX_BEARERS; i++, b_ptr++) {
> -		if (b_ptr->active && (!strcmp(b_ptr->publ.name, name)))
> +		if (!b_ptr->active)
> +			continue;
> +		b_if_name = strchr(b_ptr->publ.name, ':') + 1;
> +		if (!strcmp(b_if_name, if_name))
>  			return b_ptr;
>  	}
>  	return NULL;
>  }
>  
>  /**
> - * tipc_bearer_find_interface - locates bearer object with matching interface name
> + * bearer_find - locates bearer object with matching bearer name
>   */
>  
> -struct bearer *tipc_bearer_find_interface(const char *if_name)
> +static struct bearer *bearer_find(const char *name)
>  {
>  	struct bearer *b_ptr;
> -	char *b_if_name;
>  	u32 i;
>  
> +	if (tipc_mode != TIPC_NET_MODE)
> +		return NULL;
> +
I get why you're removing the tipc_mode check above, but why are you adding it
here?  commit d0021b252eaf65ca07ed14f0d66425dd9ccab9a6 modified tipc_bearers so
that calling this function should be tipc_mode safe (i.e. looking for a bearer
prior to entering TIPC_NET_MODE will return a NULL result).  Did you find a
subsequent problem?

>  	for (i = 0, b_ptr = tipc_bearers; i < MAX_BEARERS; i++, b_ptr++) {
> -		if (!b_ptr->active)
> -			continue;
> -		b_if_name = strchr(b_ptr->publ.name, ':') + 1;
> -		if (!strcmp(b_if_name, if_name))
> +		if (b_ptr->active && (!strcmp(b_ptr->publ.name, name)))
>  			return b_ptr;
>  	}
>  	return NULL;
> @@ -630,30 +630,17 @@ int tipc_block_bearer(const char *name)
>   * Note: This routine assumes caller holds tipc_net_lock.
>   */
>  
> -static int bearer_disable(const char *name)
> +static int bearer_disable(struct bearer *b_ptr)
>  {
> -	struct bearer *b_ptr;
>  	struct link *l_ptr;
>  	struct link *temp_l_ptr;
>  
> -	b_ptr = bearer_find(name);
> -	if (!b_ptr) {
> -		warn("Attempt to disable unknown bearer <%s>\n", name);
> -		return -EINVAL;
> -	}
> -
> -	info("Disabling bearer <%s>\n", name);
> +	info("Disabling bearer <%s>\n", b_ptr->publ.name);
>  	tipc_disc_stop_link_req(b_ptr->link_req);
>  	spin_lock_bh(&b_ptr->publ.lock);
>  	b_ptr->link_req = NULL;
>  	b_ptr->publ.blocked = 1;
> -	if (b_ptr->media->disable_bearer) {
> -		spin_unlock_bh(&b_ptr->publ.lock);
> -		write_unlock_bh(&tipc_net_lock);
> -		b_ptr->media->disable_bearer(&b_ptr->publ);
> -		write_lock_bh(&tipc_net_lock);
> -		spin_lock_bh(&b_ptr->publ.lock);
> -	}
> +	b_ptr->media->disable_bearer(&b_ptr->publ);
>  	list_for_each_entry_safe(l_ptr, temp_l_ptr, &b_ptr->links, link_list) {
>  		tipc_link_delete(l_ptr);
>  	}
> @@ -664,10 +651,16 @@ static int bearer_disable(const char *name)
>  
>  int tipc_disable_bearer(const char *name)
>  {
> +	struct bearer *b_ptr;
>  	int res;
>  
>  	write_lock_bh(&tipc_net_lock);
> -	res = bearer_disable(name);
> +	b_ptr = bearer_find(name);
> +	if (b_ptr == NULL) {
> +		warn("Attempt to disable unknown bearer <%s>\n", name);
> +		res = -EINVAL;
> +	} else
> +		res = bearer_disable(b_ptr);
>  	write_unlock_bh(&tipc_net_lock);
>  	return res;
>  }
> @@ -680,13 +673,7 @@ void tipc_bearer_stop(void)
>  
>  	for (i = 0; i < MAX_BEARERS; i++) {
>  		if (tipc_bearers[i].active)
> -			tipc_bearers[i].publ.blocked = 1;
> -	}
> -	for (i = 0; i < MAX_BEARERS; i++) {
> -		if (tipc_bearers[i].active)
> -			bearer_disable(tipc_bearers[i].publ.name);
> +			bearer_disable(&tipc_bearers[i]);
>  	}
>  	media_count = 0;
>  }
> -
> -
> -- 
> 1.7.0.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* [PATCH net-next] fib: remove a useless synchronize_rcu() call
From: Eric Dumazet @ 2010-10-13 14:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

fib_nl_delrule() calls synchronize_rcu() for no apparent reason,
while rtnl is held.

I suspect it was done to avoid an atomic_inc_not_zero() in
fib_rules_lookup(), which commit 7fa7cb7109d07 added anyway.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/core/fib_rules.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 21698f8..1bc3f25 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -494,7 +494,6 @@ static int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg)
 			}
 		}
 
-		synchronize_rcu();
 		notify_rule_change(RTM_DELRULE, rule, ops, nlh,
 				   NETLINK_CB(skb).pid);
 		fib_rule_put(rule);



^ permalink raw reply related

* Re: [PATCH net-next 3/5] tipc: Optimizations to bearer enabling logic
From: Neil Horman @ 2010-10-13 14:58 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: davem, netdev, allan.stephens
In-Reply-To: <1286929558-2954-3-git-send-email-paul.gortmaker@windriver.com>

On Tue, Oct 12, 2010 at 08:25:56PM -0400, Paul Gortmaker wrote:
> From: Allan Stephens <allan.stephens@windriver.com>
> 
> Introduces "enabling" state during activation of a new TIPC bearer,
> which supplements the existing "disabled" and "enabled" states.
> This change allows the new bearer to be added without having to
> temporarily block the processing of incoming packets on existing
> bearers during the binding of the new bearer to its associated
> interface. It also makes it unnecessary to zero out the entire
> bearer structure at the start of activation.
> 
> Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> ---
>  net/tipc/bcast.c  |    2 +-
>  net/tipc/bearer.c |   27 +++++++++++++++++----------
>  net/tipc/bearer.h |    8 ++++++--
>  net/tipc/link.c   |    2 +-
>  4 files changed, 25 insertions(+), 14 deletions(-)
> 
> diff --git a/net/tipc/bcast.c b/net/tipc/bcast.c
> index ecfaac1..ba6dcb2 100644
> --- a/net/tipc/bcast.c
> +++ b/net/tipc/bcast.c
> @@ -645,7 +645,7 @@ void tipc_bcbearer_sort(void)
>  	for (b_index = 0; b_index < MAX_BEARERS; b_index++) {
>  		struct bearer *b = &tipc_bearers[b_index];
>  
> -		if (!b->active || !b->nodes.count)
> +		if ((b->state != BEARER_ENABLED) || !b->nodes.count)
>  			continue;
>  
>  		if (!bp_temp[b->priority].primary)
> diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
> index 9969ec6..379338f 100644
> --- a/net/tipc/bearer.c
> +++ b/net/tipc/bearer.c
> @@ -290,7 +290,7 @@ struct bearer *tipc_bearer_find_interface(const char *if_name)
>  	u32 i;
>  
>  	for (i = 0, b_ptr = tipc_bearers; i < MAX_BEARERS; i++, b_ptr++) {
> -		if (!b_ptr->active)
> +		if (b_ptr->state != BEARER_ENABLED)
>  			continue;
>  		b_if_name = strchr(b_ptr->publ.name, ':') + 1;
>  		if (!strcmp(b_if_name, if_name))
> @@ -312,7 +312,8 @@ static struct bearer *bearer_find(const char *name)
>  		return NULL;
>  
>  	for (i = 0, b_ptr = tipc_bearers; i < MAX_BEARERS; i++, b_ptr++) {
> -		if (b_ptr->active && (!strcmp(b_ptr->publ.name, name)))
> +		if ((b_ptr->state == BEARER_ENABLED) &&
> +		    (!strcmp(b_ptr->publ.name, name)))
>  			return b_ptr;
>  	}
>  	return NULL;
> @@ -337,7 +338,8 @@ struct sk_buff *tipc_bearer_get_names(void)
>  	for (i = 0, m_ptr = media_list; i < media_count; i++, m_ptr++) {
>  		for (j = 0; j < MAX_BEARERS; j++) {
>  			b_ptr = &tipc_bearers[j];
> -			if (b_ptr->active && (b_ptr->media == m_ptr)) {
> +			if ((b_ptr->state == BEARER_ENABLED) &&
> +			    (b_ptr->media == m_ptr)) {
>  				tipc_cfg_append_tlv(buf, TIPC_TLV_BEARER_NAME,
>  						    b_ptr->publ.name,
>  						    strlen(b_ptr->publ.name) + 1);
> @@ -532,7 +534,7 @@ restart:
>  	bearer_id = MAX_BEARERS;
>  	with_this_prio = 1;
>  	for (i = MAX_BEARERS; i-- != 0; ) {
> -		if (!tipc_bearers[i].active) {
> +		if (tipc_bearers[i].state != BEARER_ENABLED) {
>  			bearer_id = i;
>  			continue;
>  		}
> @@ -559,21 +561,23 @@ restart:
>  	}
>  
>  	b_ptr = &tipc_bearers[bearer_id];
> -	memset(b_ptr, 0, sizeof(struct bearer));
> -
> +	b_ptr->state = BEARER_ENABLING;
>  	strcpy(b_ptr->publ.name, name);
> +	b_ptr->priority = priority;
> +
> +	write_unlock_bh(&tipc_net_lock);
Why the 3rd state?  Doesn't seem needed. 

^ permalink raw reply

* [-next Oct 13] Build failure drivers/s390/net/ctcm_mpc.o
From: Sachin Sant @ 2010-10-13 15:09 UTC (permalink / raw)
  To: netdev; +Cc: linux-next@vger.kernel.org, linux-s390, ursula.braun,
	Eric Dumazet

Today's next fails to build on a s390 box with following
error

drivers/s390/net/ctcm_mpc.c: In function ctc_mpc_dealloc_ch:
drivers/s390/net/ctcm_mpc.c:541: error: struct net_device has no member named refcnt

The code in question was changed by
commit 29b4433d991c88d86ca48a4c1cc33c671475be4b
net: percpu net_device refcount

The net_device struct member refcnt was changed from atomic_t to
int __percpu.

Thanks
-Sachin


-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------


^ permalink raw reply

* Re: [PATCH v2] xps-mp: Transmit Packet Steering for multiqueue
From: Tom Herbert @ 2010-10-13 15:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, netdev
In-Reply-To: <1286948893.2703.88.camel@edumazet-laptop>

On Tue, Oct 12, 2010 at 10:48 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Le mardi 12 octobre 2010 à 17:20 -0700, Tom Herbert a écrit :
> > +#ifdef CONFIG_RPS
> > +       struct kobject          kobj;
> > +       struct netdev_queue     *first;
> > +       atomic_t                count;
> > +       struct xps_map          *xps_maps;
> > +#endif
>
> Please put kobj at the end of this block, because its a big object (64
> bytes)
>
Okay.

> +#ifdef CONFIG_RPS
> > +       struct xps_map          *xps_maps;
> > +       struct netdev_queue     *first;
> > +       atomic_t                count;
> > +       struct kobject          kobj;
> > +#endif
> >
>
> This way, we only use one cache line to access hot path fields, instead
> of two cache lines.
>
Okay.

>
>
> Tom, I believe you should split your patch in several parts, its really
> too hard to review.
>
>
> For example the netif_alloc_netdev_queues() stuff should be a patch on
> its own, as this stuff was already discussed on netdev some days ago, so
> that you could copy Ben and me on this one ;)
>
Ahh, that was some discussion about TX queues in the thread about RX
queue allocation.  I seemed to have missed that part.

> (Ie not allocating _tx in alloc_netdev_mq() but in register_netdevice())
>
Sorry about that.  The good changes in rx queue motivated these
changes.  I'll extract the parts that move the TX queue allocation
into a separate patch.

Thanks for reviewing,
Tom

>
>
> Its also a good thing to give diffstat output :
>
>  include/linux/netdevice.h |   28 +++
>  include/linux/skbuff.h    |    1
>  net/core/dev.c            |  191 ++++++++++++++++------
>  net/core/net-sysfs.c      |  305 +++++++++++++++++++++++++++++++++++-
>  net/core/net-sysfs.h      |    3
>  net/ipv4/tcp_output.c     |    4
>  6 files changed, 471 insertions(+), 61 deletions(-)
>
>
>

^ permalink raw reply

* Re: [PATCH net-next] net:  allocate skbs on local node
From: Christoph Lameter @ 2010-10-13 16:00 UTC (permalink / raw)
  To: David Rientjes
  Cc: Pekka Enberg, Andrew Morton, Eric Dumazet, David Miller, netdev,
	Michael Chan, Eilon Greenstein, Christoph Hellwig, LKML,
	Nick Piggin
In-Reply-To: <alpine.DEB.2.00.1010121234380.10165@chino.kir.corp.google.com>

On Tue, 12 Oct 2010, David Rientjes wrote:

> > Take the unified as a SLAB cleanup instead? Then at least we have
> > a large common code base and just differentiate through the locking
> > mechanism?
> >
>
> Will you be adding the extensive slub debugging to slab then?  It would be
> a shame to lose it because one allocator is chosen over another for
> performance reasons and then we need to recompile to debug issues as they
> arise.

Well basically we would copy SLUB to SLAB apply unification patches to
SLAB instead of SLUBB. We first have to make sure that the unified patches
have the same performance as SLAB.

It maybe much better to isolate the debug features and general bootstrap
from the particulars of the allocation strategy of either SLUB or SLAB.
That way a common code base exists and it would be easier to add different
allocation strategies.

Basically have slab.c with the basic functions and then slab_queueing.c
and slab_noqueue.c for SLAB/SLUB with the particulars of the allocation
strategy?

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox