Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] mISDN: Fix wrong usage of flush_work_sync while holding locks
From: Karsten Keil @ 2012-09-13 19:50 UTC (permalink / raw)
  To: David Miller; +Cc: keil, netdev
In-Reply-To: <20120913.150659.17331564270446673.davem@davemloft.net>

Am 13.09.2012 21:06, schrieb David Miller:
> From: David Miller <davem@davemloft.net>
> Date: Thu, 13 Sep 2012 14:59:37 -0400 (EDT)
> 
>> From: Karsten Keil <keil@b1-systems.de>
>> Date: Thu, 13 Sep 2012 16:36:20 +0200
>>
>>> It is a bad idea to hold a spinlock and call flush_work_sync.
>>> Move the workqueue cleanup outside the spinlock and use cancel_work_sync,
>>> on closing the channel this seems to be the more correct function.
>>> Remove the never used and constant return value of mISDN_freebchannel.
>>>
>>> Signed-off-by: Karsten Keil <keil@b1-systems.de>
>>> Cc: <stable@kernel.org>
>>
>> Applied, thanks.
> 
> BTW, about -stable:
> 
> 1) Even if it were appropriate to submit this directly to -stable,
>    stable@kernel.org is not the correct email address and you must
>    have seen the bounce produced by trying to send email there.
> 
>    Rather, stable@vger.kernel.org is the correct address.
> 
> 2) I queue up and submit networking bug fixes for -stable myself so
>    you should not submit them directly but rather make a request that
>    I add your patch to my networking -stable queue.
> 

OK, then I will put something like, "Should be considered for stable
too" in the comment and do not add the CC: stable@vger.kernel.org
for the next time.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/4] net_sched: gred: correct comment about qavg calculation in RIO mode
From: David Miller @ 2012-09-13 20:10 UTC (permalink / raw)
  To: jhs; +Cc: david.ward, netdev, brosler, cyril
In-Reply-To: <50521EFC.90701@mojatatu.com>

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Thu, 13 Sep 2012 13:59:24 -0400

> On 12-09-13 11:22 AM, David Ward wrote:
>> Signed-off-by: David Ward <david.ward@ll.mit.edu>
> 
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/4] net_sched: gred: eliminate redundant DP prio comparisons
From: David Miller @ 2012-09-13 20:10 UTC (permalink / raw)
  To: jhs; +Cc: david.ward, netdev, brosler, cyril
In-Reply-To: <50521F3E.7010101@mojatatu.com>

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Thu, 13 Sep 2012 14:00:30 -0400

> On 12-09-13 11:22 AM, David Ward wrote:
>> Each pair of DPs only needs to be compared once when searching for
>> a non-unique prio value.
>>
>> Signed-off-by: David Ward <david.ward@ll.mit.edu>
> 
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Applied.

^ permalink raw reply

* Re: [PATCH 3/4] net_sched: gred: fix qave reporting via netlink
From: David Miller @ 2012-09-13 20:10 UTC (permalink / raw)
  To: jhs; +Cc: david.ward, netdev, brosler, cyril
In-Reply-To: <50521F97.20503@mojatatu.com>

From: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Thu, 13 Sep 2012 14:01:59 -0400

> On 12-09-13 11:22 AM, David Ward wrote:
>> q->vars.qavg is a Wlog scaled value, but q->backlog is not. In order
>> to pass q->vars.qavg as the backlog value, we need to un-scale it.
>> Additionally, the qave value returned via netlink should not be Wlog
>> scaled, so we need to un-scale the result of red_calc_qavg().
>>
>> This caused artificially high values for "Average Queue" to be shown
>> by 'tc -s -d qdisc', but did not affect the actual operation of GRED.
>>
>> Signed-off-by: David Ward <david.ward@ll.mit.edu>
> 
> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Applied.

^ permalink raw reply

* Re: [PATCH 4/4] net_sched: gred: actually perform idling in WRED mode
From: David Miller @ 2012-09-13 20:10 UTC (permalink / raw)
  To: david.ward; +Cc: jhs, netdev, brosler, cyril
In-Reply-To: <50523632.60703@ll.mit.edu>

From: "Ward, David - 0663 - MITLL" <david.ward@ll.mit.edu>
Date: Thu, 13 Sep 2012 15:38:26 -0400

> On 13/09/12 14:08, Jamal Hadi Salim wrote:
>> On 12-09-13 11:22 AM, David Ward wrote:
>>> gred_dequeue() and gred_drop() do not seem to get called when the
>>> queue is empty, meaning that we never start idling while in WRED
>>> mode. And since qidlestart is not stored by gred_store_wred_set(),
>>> we would never stop idling while in WRED mode if we ever started.
>>> This messes up the average queue size calculation that influences
>>> packet marking/dropping behavior.
>>>
>>> Now, we start WRED mode idling as we are removing the last packet
>>> from the queue. Also we now actually stop WRED mode idling when we
>>> are enqueuing a packet.
>>>
>>> Cc: Bruce Osler <brosler@cisco.com>
>>> Signed-off-by: David Ward <david.ward@ll.mit.edu>
>> This is one is not so obvious. Iam assuming you vetted it via some
>> tests.
>> In which case:
>> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
>>
>> cheers,
>> jamal
>>
> 
> Before applying this patch, the average queue size (as seen with "tc
> -s qdisc") remained constant forever after I stopped sending any
> packets through the interface -- it didn't taper off as you would
> expect.  After the patch, the average queue size will now taper off if
> packets are not being sent.

Applied.

^ permalink raw reply

* Re: [PATCH net v2] net: qmi_wwan: call subdriver with control intf only
From: David Miller @ 2012-09-13 20:12 UTC (permalink / raw)
  To: bjorn; +Cc: netdev, linux-usb
In-Reply-To: <1347518675-8427-1-git-send-email-bjorn@mork.no>

From: Bjørn Mork <bjorn@mork.no>
Date: Thu, 13 Sep 2012 08:44:35 +0200

> This fixes a hang on suspend due to calling wdm_suspend on
> the unregistered data interface. The hang should have been
> a NULL pointer reference had it not been for a logic error
> in the cdc_wdm code.
> 
>   commit 230718bd net: qmi_wwan: bind to both control and data interface
> 
> changed qmi_wwan to use cdc_wdm as a subdriver for devices with
> a two-interface QMI/wwan function.  The commit failed to update
> qmi_wwan_suspend and qmi_wwan_resume, which were written to handle
> either a single combined interface function, or no subdriver at all.
> 
> The result was that we called into the subdriver both when the
> control interface was suspended and when the data interface was
> suspended.  Calling the subdriver suspend function with an
> unregistered interface is not supported and will make the
> subdriver bug out.
> 
> Signed-off-by: Bjørn Mork <bjorn@mork.no>

Applied, thanks.

^ permalink raw reply

* Re: [net PATCH v2 1/7] bnx2x: Avoid sending multiple statistics queries
From: David Miller @ 2012-09-13 20:17 UTC (permalink / raw)
  To: yuvalmin; +Cc: netdev, dmitry, eilong
In-Reply-To: <1347374054-16730-2-git-send-email-yuvalmin@broadcom.com>

From: "Yuval Mintz" <yuvalmin@broadcom.com>
Date: Tue, 11 Sep 2012 17:34:08 +0300

> From: Dmitry Kravkov <dmitry@broadcom.com>
> 
> During traffic when DCB is enabled, it is possible for multiple instances
> of statistics queries to be sent to the chip - this may cause the FW to assert.
> 
> This patch prevents the sending of an additional instance of statistics query
> while the previous query hasn't completed.
> 
> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
> Signed-off-by: Eilon Greenstein <eilong@broadcom.com>

This change results in no change in behavior as far as I can tell.

> -	if (bnx2x_storm_stats_update(bp) && (bp->stats_pending++ == 3)) {
> -		BNX2X_ERR("storm stats were not updated for 3 times\n");
> -		bnx2x_panic();
> +	if (bnx2x_storm_stats_update(bp)) {
> +		if (bp->stats_pending++ == 3) {
> +			BNX2X_ERR("storm stats were not updated for 3 times\n");
> +			bnx2x_panic();
> +		}

There is no difference between:

	if (A && B) {
		C;
	}

and:

	if (A) {
		if (B) {
			C;
		}
	}

Yet that's exactly what is happening in this patch.

And such a do-nothing change is certainly not appropriate this late in
the -rc series.

I'm tossing this entire series, please sort this out and submit
the real actual critical bug fixes.

^ permalink raw reply

* Re: [v3 PATCH 1/2] netprio_cgroup: Remove update_netdev_tables() since it is unnecessary
From: David Miller @ 2012-09-13 20:19 UTC (permalink / raw)
  To: srivatsa.bhat
  Cc: nhorman, David.Laight, john.r.fastabend, gaofeng, eric.dumazet,
	mark.d.rustad, lizefan, netdev, linux-kernel
In-Reply-To: <20120913063225.6278.87780.stgit@srivatsabhat.in.ibm.com>

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Date: Thu, 13 Sep 2012 12:02:25 +0530

> The update_netdev_tables() function appears to be unnecessary, since the
> write_update_netdev_table() function will adjust the priomaps as and when
> required anyway. So drop the usage of update_netdev_tables() entirely.
> 
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Applied to net-next

^ permalink raw reply

* Re: [v3 PATCH 2/2] netprio_cgroup: Use memcpy instead of the for-loop to copy priomap
From: David Miller @ 2012-09-13 20:19 UTC (permalink / raw)
  To: srivatsa.bhat
  Cc: nhorman, David.Laight, john.r.fastabend, gaofeng, eric.dumazet,
	mark.d.rustad, lizefan, netdev, linux-kernel
In-Reply-To: <20120913063233.6278.7615.stgit@srivatsabhat.in.ibm.com>

From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Date: Thu, 13 Sep 2012 12:02:34 +0530

> Replace the current (inefficient) for-loop with memcpy, to copy priomap.
> 
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Applied to net-next

^ permalink raw reply

* Remarks and comments about ipconfig behavior
From: Erwan Velu @ 2012-09-13 20:21 UTC (permalink / raw)
  To: netdev

Hey Fellows !

I've been figuring a strange behavior today and I'd like to share with 
you both experience and remarks.

On my system that runs a 3.2.29 but that's also applicable with 
linux-next and all other releases.

As shown here : 
http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=blob;f=net/ipv4/ipconfig.c;h=67e8a6b086ea7a0d2c4cc986ed6ed0e6b4414c6a;hb=HEAD#l262 
, if you specify an ip= option on the cmdline, the kernel is expecting 
the carrier to be present unless it will make a loop up to 2mn as per 
CONF_CARRIER_TIMEOUT value.

That lever for me two points :
- why is this timeout setup for so long ? Even with a spantree 
configuration, not having a carrier for 2mn is *waow* ... Does a 30sec 
could not be enough ? What is the need of waiting so long time ?

- Until we get the carrier, the kernel just stops and the boot process 
is totally locked but there isn't any message shown to the user. The 
system really look like frozen/dead for 2 minutes.

I spent a complete day trying to understand why my box was unable to 
boot while the network cable was removed.
I just discover this behavior by comparing log files to understand that 
was the reason.

So my suggestion would be the following and I can offer patches if you 
agree on thoses points :

- reducing the timeout to something smaller like 30sec
- display a message every second to inform the user we are waiting the 
carrier to satisfy the ipconfig option

What are you thoughts on that ?

Thanks you,
Erwan Velu

^ permalink raw reply

* Re: [net-next.git 3/8 (V2)] stmmac: add the initial tx coalesce schema
From: David Miller @ 2012-09-13 20:23 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: netdev, bhutchings
In-Reply-To: <1347346514-23411-4-git-send-email-peppe.cavallaro@st.com>

From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Date: Tue, 11 Sep 2012 08:55:09 +0200

> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&priv->tx_lock, flags);
>  
> -	spin_lock(&priv->tx_lock);
> +	priv->xstats.tx_clean++;

You are changing the locking here for the sake of the new timer.

But timers run in software interrupt context, so this change is
completely unnecessary since NAPI runs in software interrupt context
as well, and neither timers nor NAPI run in hardware interrupts
context.

Therefore, disabling hardware interrupts for this lock is unnecessary
and will decrease performance.

^ permalink raw reply

* Re: [PATCH] scsi_netlink: Remove dead and buggy code
From: David Miller @ 2012-09-13 20:24 UTC (permalink / raw)
  To: ebiederm; +Cc: James.Bottomley, netdev, James.Smart, linux-scsi
In-Reply-To: <20120911.184846.1973557158791638595.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Tue, 11 Sep 2012 18:48:46 -0400 (EDT)

> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 11 Sep 2012 15:40:08 -0700
> 
>> So just for curiosity I searched the entire git history for scsi_nl_add_
>> and the only commit that I found was the addition of that code to the
>> tree in August of 2008.
>> 
>> Does anyone have any reason to keep scsi_nl_add_transport or
>> scsi_nl_add_driver that have never been used in the 4 years since
>> they have been added?
> 
> That's basically the question on the table right now :-)

Time has run out.

Letting legitimate and reasonable patches simply rot for a week is
absolutely never acceptable, so I'm applying Eric's patch to net-next.

^ permalink raw reply

* Re: Remarks and comments about ipconfig behavior
From: David Miller @ 2012-09-13 20:31 UTC (permalink / raw)
  To: erwanaliasr1; +Cc: netdev
In-Reply-To: <5052403B.4040408@gmail.com>

From: Erwan Velu <erwanaliasr1@gmail.com>
Date: Thu, 13 Sep 2012 22:21:15 +0200

> That lever for me two points :
> - why is this timeout setup for so long ? Even with a spantree
> - configuration, not having a carrier for 2mn is *waow* ... Does a 30sec
> - could not be enough ? What is the need of waiting so long time ?

I've seen PHY/switch/hub combinations that take longer than 30 seconds
to fully negotiate the link.

There is really no upper limit to the link speed/duplex/etc.
negoatiation process.

Even if the actual negoatiation protocol had an upper limit on
negoatiation time, hardware implementations do things like try
sampling the quality of the cable signal and may choose to
down-rev the advertised features and restart the negoatiation.

^ permalink raw reply

* Re: [PATCH 1/2] ipv6: Add labels for site-local and 6bone testing addresses (RFC6724)
From: David Miller @ 2012-09-13 20:34 UTC (permalink / raw)
  To: yoshfuji; +Cc: netdev
In-Reply-To: <201209110508.q8B58GJO015601@94.43.138.210.xn.2iij.net>

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date: Tue, 11 Sep 2012 13:16:49 +0900

> Added labels for site-local addresses (fec0::/10) and 6bone testing
> addresses (3ffe::/16) in order to depreference them.
> 
> Note that the RFC introduced new rows for Teredo, ULA and 6to4 addresses
> in the default policy table.  Some of them have different labels from ours.
> For backward compatibility, we do not change the "default" labels.
> 
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH 2/2] ipv6: Compare addresses only bits up to the prefix length (RFC6724).
From: David Miller @ 2012-09-13 20:34 UTC (permalink / raw)
  To: yoshfuji; +Cc: netdev
In-Reply-To: <201209110508.q8B58Min015633@94.43.138.210.xn.2iij.net>

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Date: Tue, 11 Sep 2012 13:41:07 +0900

> Compare bits up to the source address's prefix length only to
> allows DNS load balancing to continue to be used as a tie breaker.
> 
> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Applied to net-next

^ permalink raw reply

* RE: [net PATCH v2 1/7] bnx2x: Avoid sending multiple statistics queries
From: Yuval Mintz @ 2012-09-13 20:34 UTC (permalink / raw)
  To: David Miller; +Cc: netdev@vger.kernel.org, Dmitry Kravkov, Eilon Greenstein
In-Reply-To: <20120913.161711.2180618409918407558.davem@davemloft.net>

> > This patch prevents the sending of an additional instance of statistics
> query
> > while the previous query hasn't completed.
> >
> This change results in no change in behavior as far as I can tell.
> 
> > -	if (bnx2x_storm_stats_update(bp) && (bp->stats_pending++ == 3)) {
> > -		BNX2X_ERR("storm stats were not updated for 3 times\n");
> > -		bnx2x_panic();
> > +	if (bnx2x_storm_stats_update(bp)) {
> > +		if (bp->stats_pending++ == 3) {
> > +			BNX2X_ERR("storm stats were not updated for 3
> times\n");
> > +			bnx2x_panic();
> > +		}

But you're missing the 'return;' statement at the end.

> 
> There is no difference between:
> 
> 	if (A && B) {
> 		C;
> 	}
> 
> and:
> 
> 	if (A) {
> 		if (B) {
> 			C;
> 		}
> 	}

But there is a difference between:
If (A && B) {
	C;
	D;
}

And:
if (A) {
	if (B) {
		C;
	}
	D;
}

The point of this patch was not to change the condition of the print & panic but rather to guarantee that
if bnx2x_storm_stats_update failed, no more ramrods will be posted, as the function will return 
(regardless of stats_pending value).

Sorry it wasn't clearer in the patch.

^ permalink raw reply

* RE: [net PATCH v2 1/7] bnx2x: Avoid sending multiple statistics queries
From: Dmitry Kravkov @ 2012-09-13 20:35 UTC (permalink / raw)
  To: David Miller, Yuval Mintz; +Cc: netdev@vger.kernel.org, Eilon Greenstein
In-Reply-To: <20120913.161711.2180618409918407558.davem@davemloft.net>

> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Thursday, September 13, 2012 11:17 PM
> To: Yuval Mintz
> Cc: netdev@vger.kernel.org; Dmitry Kravkov; Eilon Greenstein
> Subject: Re: [net PATCH v2 1/7] bnx2x: Avoid sending multiple statistics queries
> 
> From: "Yuval Mintz" <yuvalmin@broadcom.com>
> Date: Tue, 11 Sep 2012 17:34:08 +0300
> 
> > From: Dmitry Kravkov <dmitry@broadcom.com>
> >
> > During traffic when DCB is enabled, it is possible for multiple instances
> > of statistics queries to be sent to the chip - this may cause the FW to assert.
> >
> > This patch prevents the sending of an additional instance of statistics query
> > while the previous query hasn't completed.
> >
> > Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
> > Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
> > Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
> 
> This change results in no change in behavior as far as I can tell.
> 
> > -	if (bnx2x_storm_stats_update(bp) && (bp->stats_pending++ == 3)) {
> > -		BNX2X_ERR("storm stats were not updated for 3 times\n");
> > -		bnx2x_panic();
> > +	if (bnx2x_storm_stats_update(bp)) {
> > +		if (bp->stats_pending++ == 3) {
> > +			BNX2X_ERR("storm stats were not updated for 3
> times\n");
> > +			bnx2x_panic();
> > +		}
> 
> There is no difference between:
> 
> 	if (A && B) {
> 		C;
> 	}
> 
> and:
> 
> 	if (A) {
> 		if (B) {
> 			C;
> 		}
> 	}
> 
> Yet that's exactly what is happening in this patch.
> 
> And such a do-nothing change is certainly not appropriate this late in
> the -rc series.
> 
> I'm tossing this entire series, please sort this out and submit
> the real actual critical bug fixes.
return statement is not seen in the patch:
Before the change we returned from the function if (A &&B)
Now we return even if (A)

^ permalink raw reply

* Re: Remarks and comments about ipconfig behavior
From: Erwan Velu @ 2012-09-13 20:37 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120913.163134.361379133013906511.davem@davemloft.net>

Le 13/09/2012 22:31, David Miller a écrit :
> From: Erwan Velu<erwanaliasr1@gmail.com>
> Date: Thu, 13 Sep 2012 22:21:15 +0200
>
>> That lever for me two points :
>> - why is this timeout setup for so long ? Even with a spantree
>> - configuration, not having a carrier for 2mn is *waow* ... Does a 30sec
>> - could not be enough ? What is the need of waiting so long time ?
> I've seen PHY/switch/hub combinations that take longer than 30 seconds
> to fully negotiate the link.
>
> There is really no upper limit to the link speed/duplex/etc.
> negoatiation process.
>
> Even if the actual negoatiation protocol had an upper limit on
> negoatiation time, hardware implementations do things like try
> sampling the quality of the cable signal and may choose to
> down-rev the advertised features and restart the negoatiation.
Ok but shouldn't we display some message to the user trying to explain 
why the kernel is stopped and perfectly silent during its booting 
process ? 2 minutes, that's a pretty long time with an almost frozen 
kernel isn't it ?

^ permalink raw reply

* Re: [PATCH 4/4] net_sched: gred: actually perform idling in WRED mode
From: Jamal Hadi Salim @ 2012-09-13 20:37 UTC (permalink / raw)
  To: Ward, David - 0663 - MITLL
  Cc: netdev@vger.kernel.org, Bruce Osler, Cyril Chemparathy
In-Reply-To: <50523632.60703@ll.mit.edu>

On 12-09-13 03:38 PM, Ward, David - 0663 - MITLL wrote:
> Before applying this patch, the average queue size (as seen with "tc 
> -s qdisc") remained constant forever after I stopped sending any 
> packets through the interface -- it didn't taper off as you would 
> expect.  After the patch, the average queue size will now taper off if 
> packets are not being sent.
>


That makes sense. thanks

cheers,
jamal

^ permalink raw reply

* Re: [net PATCH v2 1/7] bnx2x: Avoid sending multiple statistics queries
From: David Miller @ 2012-09-13 20:38 UTC (permalink / raw)
  To: yuvalmin; +Cc: netdev, dmitry, eilong
In-Reply-To: <979A8436335E3744ADCD3A9F2A2B68A504B81E31@SJEXCHMB10.corp.ad.broadcom.com>

From: "Yuval Mintz" <yuvalmin@broadcom.com>
Date: Thu, 13 Sep 2012 20:34:57 +0000

> But there is a difference between:
> If (A && B) {
> 	C;
> 	D;
> }
> 
> And:
> if (A) {
> 	if (B) {
> 		C;
> 	}
> 	D;
> }

Indeed, I missed the return, it make a lot more sense now.

I've applied this series to 'net' and will push it out after
build testing.

Thanks.

^ permalink raw reply

* Re: [net-next.git 3/8 (V2)] stmmac: add the initial tx coalesce schema
From: Ben Hutchings @ 2012-09-13 20:42 UTC (permalink / raw)
  To: David Miller; +Cc: peppe.cavallaro, netdev
In-Reply-To: <20120913.162333.1518469374321928795.davem@davemloft.net>

On Thu, 2012-09-13 at 16:23 -0400, David Miller wrote:
> From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
> Date: Tue, 11 Sep 2012 08:55:09 +0200
> 
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&priv->tx_lock, flags);
> >  
> > -	spin_lock(&priv->tx_lock);
> > +	priv->xstats.tx_clean++;
> 
> You are changing the locking here for the sake of the new timer.
> 
> But timers run in software interrupt context, so this change is
> completely unnecessary since NAPI runs in software interrupt context
> as well, and neither timers nor NAPI run in hardware interrupts
> context.

NAPI pollers can be called by netpoll in hardware interrupt context, and
this driver supports netpoll.

Ben.

> Therefore, disabling hardware interrupts for this lock is unnecessary
> and will decrease performance.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: Remarks and comments about ipconfig behavior
From: David Miller @ 2012-09-13 20:45 UTC (permalink / raw)
  To: erwanaliasr1; +Cc: netdev
In-Reply-To: <50524419.4080404@gmail.com>

From: Erwan Velu <erwanaliasr1@gmail.com>
Date: Thu, 13 Sep 2012 22:37:45 +0200

> Ok but shouldn't we display some message to the user trying to explain
> why the kernel is stopped and perfectly silent during its booting
> process ? 2 minutes, that's a pretty long time with an almost frozen
> kernel isn't it ?

Patches welcome.

^ permalink raw reply

* Re: [net-next.git 3/8 (V2)] stmmac: add the initial tx coalesce schema
From: David Miller @ 2012-09-13 20:46 UTC (permalink / raw)
  To: bhutchings; +Cc: peppe.cavallaro, netdev
In-Reply-To: <1347568971.13258.19.camel@deadeye.wl.decadent.org.uk>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 13 Sep 2012 21:42:51 +0100

> On Thu, 2012-09-13 at 16:23 -0400, David Miller wrote:
>> From: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
>> Date: Tue, 11 Sep 2012 08:55:09 +0200
>> 
>> > +	unsigned long flags;
>> > +
>> > +	spin_lock_irqsave(&priv->tx_lock, flags);
>> >  
>> > -	spin_lock(&priv->tx_lock);
>> > +	priv->xstats.tx_clean++;
>> 
>> You are changing the locking here for the sake of the new timer.
>> 
>> But timers run in software interrupt context, so this change is
>> completely unnecessary since NAPI runs in software interrupt context
>> as well, and neither timers nor NAPI run in hardware interrupts
>> context.
> 
> NAPI pollers can be called by netpoll in hardware interrupt context, and
> this driver supports netpoll.

And the netpoll handler need to make amends to make sure that
hardware the environment expected by the software interrupt
code is preserved.

Well written NAPI drivers never need to disable hardware interrupts
in their ->poll() method and it's callers, neither should you.

I'm not considering your patches until you implement this properly.

^ permalink raw reply

* Re: bnx2 cards intermittantly going offline
From: Michael Chan @ 2012-09-13 20:30 UTC (permalink / raw)
  To: Sven Ulland; +Cc: netdev, Marc A. Donges
In-Reply-To: <5051FFAA.8060501@opera.com>


On Thu, 2012-09-13 at 17:45 +0200, Sven Ulland wrote:
> On 09/13/2012 03:51 PM, Marc A. Donges wrote:
> > After 55 days of operation the machine (A) suddenly was no longer
> > reachable via network. Strangely, a second machine (B) that should
> > take over the IP addresses (keepalived) did not take over. Only
> > after shutting the switchport to which A is attached did B take
> > over.

The rx_ftq_discards problem is a firmware problem.  FTQ discards mean
that the firmware is no longer running and the packets are dropped at
the FTQ.  This is likely fixed in:

commit 22fa159d37efbfe781bbb99279efe83f58b87d29
Author: Michael Chan <mchan@broadcom.com>
Date:   Mon Oct 11 16:12:00 2010 -0700

    bnx2: Update firmware to 6.0.x.


> 
> Hi. We've had the same symptom with our BCM5709S [14e4:163a] on
> Debian. Like you, we were on stable's 2.6.32-41squeeze2. Google led us
> to many similar issues [1,2,3]. They concluded with the fix being in
> mainline commit c441b8d2 [4]: "bnx2: Fix lost MSI-X problem on 5709
> NICs".

This is a different problem and will not result in FTQ discards.

> 
> Broadcom: Can you publish a tool that decodes ethtool -d dumps to make
> debugging easier, or do you deem it no longer necessary with the the
> register dump commits in 555069da?

The register dump during tx timeout is now quite comprehensive.

> 
> Now, Debian's 2.6.32-41squeeze2 is based on longterm release 2.6.32.54
> [5]. That version includes commit 0b7817ed [6], which is a backport of
> the already mentioned mainline commit c441b8d2.
> 
> So we tried digging further and applying some seemingly relevant
> commits [7,8] to our 2.6.32, but without any change in behaviour. Our
> temporary fix was to run 'ethtool -t ethX' to reset the device every
> time it locked up.
> 
> This dragged on with various builds, until we ended up on mainline
> 2.6.38 where we no longer saw any symptoms. I don't know in which
> kernel version it was fixed, but we ended up on that one, sort of by
> chance. Unfortunately, it had severe issues with kswapd memory
> compaction causing CPU soft lockups [9], so we went straight to
> squeeze-backports' 3.2.23-1~bpo60+2. We've been happy since then.
> 
> > We have five pairs of basically identical machines performing the
> > same task (each pair for one site). The error has not occured with
> > any other one, but this site is the busiest:
> 
> We also saw the issue only at a site with generally higher load
> compared to other sites.
> 
> I'd love to know exactly which commit fixed the issue, but it's fairly
> tricky to reproduce the issue, and the bisect count is fairly high (it
> need not be a specific fix for bnx2).

If you see the same FTQ discards, please try that firmware commit
mentioned above.  Thanks.

> 
> sven
> 
> 
> [1]: bnx2 driver crashes under random circumstances
> https://bugzilla.redhat.com/show_bug.cgi?id=520888
> 
> [2]: Access denied. Come on, Red Hat!
> https://bugzilla.redhat.com/show_bug.cgi?id=511368
> 
> [3]: NIC doesn't register packets [rhel-5.5.z]
> https://bugzilla.redhat.com/show_bug.cgi?id=587799
> 
> [4]: bnx2: Fix lost MSI-X problem on 5709 NICs.
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=object;h=c441b8d2cb2194b05550a558d6d95d8944e56a84
> 
> [5]: Debian Changelog linux-2.6 (2.6.32-45)
> http://packages.debian.org/changelogs/pool/main/l/linux-2.6/linux-2.6_2.6.32-45/changelog#version2.6.32-41
> 
> [6]: bnx2: Fix lost MSI-X problem on 5709 NICs.
> http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=0b7817edda5e44e5fa769645bd1220f5e7b0beb5
> 
> [7]: bnx2: reset_task is crashing the kernel. Fixing it.
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4529819c45161e4a119134f56ef504e69420bc98
> 
> [8]: bnx2: fixing a timout error due not refreshing TX timers correctly
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e6bf95ffa8d6f8f4b7ee33ea01490d95b0bbeb6e
> 
> [9]: [PATCH] remove compaction from kswapd
> http://thread.gmane.org/gmane.linux.kernel.mm/58962
> https://lkml.org/lkml/2011/3/25/664
> 
> 

^ permalink raw reply

* Re: [PATCH net-next] ipv6: route templates can be const
From: David Miller @ 2012-09-13 20:53 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1347436071.13103.695.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 12 Sep 2012 09:47:51 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> We kmemdup() templates, so they can be const.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox