Netdev List
 help / color / mirror / Atom feed
* Re: TCP and reordering
From: Vijay Subramanian @ 2012-11-28 18:52 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Rick Jones, Saku Ytti, netdev
In-Reply-To: <1354127585.14302.467.camel@edumazet-glaptop>

>
> Note that this sysctl controls the initial value of the per socket
> reordering value. It _does_ increase automatically (assuming SACK is
> enabled of course)
>
>
>

I believe enabling SACK is not a requirement. Even with plain Reno,
reordering  is tracked in tp->reordering and the reordering event is
counted in LINUX_MIB_TCPRENOREORDER.
tcp_add_reno_sack()-->tcp_check_reno_reordering() is the code path.

Thanks,
Vijay

^ permalink raw reply

* Re: [GIT net] Open vSwitch
From: David Miller @ 2012-11-28 18:48 UTC (permalink / raw)
  To: jesse; +Cc: netdev, dev
In-Reply-To: <1354041423-3050-1-git-send-email-jesse@nicira.com>


Date: Tue, 27 Nov 2012 10:37:01 -0800

Is it really Tuesday morning where you are?

^ permalink raw reply

* Re: TCP and reordering
From: Saku Ytti @ 2012-11-28 18:44 UTC (permalink / raw)
  To: netdev
In-Reply-To: <50B656CC.1040001@hp.com>

On (2012-11-28 10:24 -0800), Rick Jones wrote:

> By increasing net.ipv4.tcp_reordering you make the sending TCP less
> "sensitive" to duplicate ACKs and so less sensitive to reordering
> detected by the receiver.  The receiver is generating as many
> (duplicate) ACKs as before, it is just that the sender is ignoring
> them a bit more.

Thanks, I didn't know this.

-- 
  ++ytti

^ permalink raw reply

* Re: [PATCH net-next] gro: Handle inline VLAN tags
From: David Miller @ 2012-11-28 18:39 UTC (permalink / raw)
  To: gallatin; +Cc: eric.dumazet, bhutchings, netdev, linux-net-drivers, herbert
In-Reply-To: <50B64A24.60200@myri.com>

From: Andrew Gallatin <gallatin@myri.com>
Date: Wed, 28 Nov 2012 12:30:12 -0500

> Perfect, thanks.  Eric seemed OK with doing the vlan decap
> in the driver.  But.. I didn't get an "nack" or an "applied" from
> you in response to that myri10ge patchset.  Do I need to re-submit it?
> 
> Sorry, I'm not terribly familiar with the etiquette for this sort
> of thing.

Please just repost the patch set.

^ permalink raw reply

* Re: [PATCH 1/2] arping: Fix finding of a default interface when no -I is specified.
From: YOSHIFUJI Hideaki @ 2012-11-28 18:38 UTC (permalink / raw)
  To: Jan Synacek; +Cc: netdev, YOSHIFUJI Hideaki
In-Reply-To: <1354018775-4966-1-git-send-email-jsynacek@redhat.com>

Jan Synacek wrote:
> By default, no interface string should be supplied. This ensures that we can
> recognize if an interface was specified or omitted. Previously, when -I was not
> used, default interface string was compiled-in and the automatic selection
> didn't work correctly.

Okay, but to retain default device support compiled in,
different patch applied.

Thank you, anyway.

P.S. Even if default interface compiled in, you can say "-I ''" to
override that.

--yoshfuji

^ permalink raw reply

* Re: TCP and reordering
From: Eric Dumazet @ 2012-11-28 18:33 UTC (permalink / raw)
  To: Rick Jones; +Cc: Saku Ytti, netdev
In-Reply-To: <50B656CC.1040001@hp.com>

On Wed, 2012-11-28 at 10:24 -0800, Rick Jones wrote:
> On 11/28/2012 12:54 AM, Saku Ytti wrote:
> > On (2012-11-28 00:35 -0800), Vijay Subramanian wrote:
> >
> >> Also note that reordering is tracked on the sender side using the
> >> per flow variable tp->reordering . This measures the amount of
> >> reordering on the connection so that fast retransmit and other loss
> >> recovery mechanisms are not entered prematurely. Doesn't this
> >> behavior at the  sender already provide the behavior you seek?
> >
> > Sorry I don't seem to understand what you mean. Do you mind explaining how
> > the sender can help to restore performance on reordering network?
> 
> tp->reordering is initialized via the sysctl net.ipv4.tcp_reordering 
> which controls how anxious TCP will be to fast retransmit.
> 
> By increasing net.ipv4.tcp_reordering you make the sending TCP less 
> "sensitive" to duplicate ACKs and so less sensitive to reordering 
> detected by the receiver.  The receiver is generating as many 
> (duplicate) ACKs as before, it is just that the sender is ignoring them 
> a bit more.

Note that this sysctl controls the initial value of the per socket
reordering value. It _does_ increase automatically (assuming SACK is
enabled of course)

^ permalink raw reply

* Re: [RFC PATCH] 8139cp: properly support change of MTU values
From: John Greene @ 2012-11-28 18:00 UTC (permalink / raw)
  To: Rami Rosen; +Cc: netdev
In-Reply-To: <CAKoUArk7R3_JBn7Qdn8Ry+qy7NnidroqSArTpFwdP4WkryJypA@mail.gmail.com>

On 11/28/2012 02:23 AM, Rami Rosen wrote:
> Hi,
>
> In cp_change_mtu(), there is the following check:
> ...
> if (new_mtu < CP_MIN_MTU || new_mtu > CP_MAX_MTU)
> 		return -EINVAL;
> ...
> Later on, we set dev->mtu to new_mtu.
>
> The CP_MIN_MTU is defined to be 60; shouldn't it be 68 ?
>
>
> The reason for 68 is (RFC 791,  Internet Protocol,
> http://www.ietf.org/rfc/rfc791.txt):
> "Every internet module must be able to forward a datagram of 68 octets
> without further fragmentation.  This is because an internet  header
> may be up to 60 octets, and the minimum fragment is 8 octets."
>
> See also the generic Ethernet () method in eth_change_mtu() (net/ethernet/eth.c)
>
> int eth_change_mtu(struct net_device *dev, int new_mtu)
> {
> 	if (new_mtu < 68 || new_mtu > ETH_DATA_LEN)
> 		return -EINVAL;
> 	dev->mtu = new_mtu;
> 	return 0;
> }
>
>
> regards,
> Rami Rosen
>
> http://ramirose.wix.com/ramirosen
>
> On Tue, Nov 27, 2012 at 10:08 PM, John Greene <jogreene@redhat.com> wrote:
>> The 8139cp driver has a change_mtu function that has not been
>> enabled since the dawn of the git repository. However, the
>> generic eth_change_mtu is not used in its place, so that
>> invalid MTU values can be set on the interface.
>>
>> Original patch salvages the broken code for the single case of
>> setting the MTU while the interface is down, which is safe
>> and also includes the range check.  Now enhanced to support up
>> or down interface.
>>
>> Original patch from
>> http://lkml.indiana.edu/hypermail/linux/kernel/1202.2/00770.html
>>
>> Testing: has been test on virtual 8139cp setup without issue,
>> awaiting real hardware and retest again, but wanted to post now.
>>
>> Signed-Off-By: "John Greene" <jogreene@redhat.com>
>> CC: "David S. Miller" <davem@davemloft.net>commit c3f214170cd1abb89290815c3894b945a86894fe
Author: John Greene <jogreene@redhat.com>
Date:   Mon Nov 26 16:05:06 2012 -0500

     8139cp: properly support change of MTU values

>> ---
>>   drivers/net/ethernet/realtek/8139cp.c | 22 +++-------------------
>>   1 file changed, 3 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
>> index 6cb96b4..7847c83 100644
>> --- a/drivers/net/ethernet/realtek/8139cp.c
>> +++ b/drivers/net/ethernet/realtek/8139cp.c
>> @@ -1226,12 +1226,9 @@ static void cp_tx_timeout(struct net_device *dev)
>>          spin_unlock_irqrestore(&cp->lock, flags);
>>   }
>>
>> -#ifdef BROKEN
>>   static int cp_change_mtu(struct net_device *dev, int new_mtu)
>>   {
>>          struct cp_private *cp = netdev_priv(dev);
>> -       int rc;
>> -       unsigned long flags;
>>
>>          /* check for invalid MTU, according to hardware limits */
>>          if (new_mtu < CP_MIN_MTU || new_mtu > CP_MAX_MTU)
>> @@ -1244,22 +1241,11 @@ static int cp_change_mtu(struct net_device *dev, int new_mtu)
>>                  return 0;
>>          }
>>
>> -       spin_lock_irqsave(&cp->lock, flags);
>> -
>> -       cp_stop_hw(cp);                 /* stop h/w and free rings */
>> -       cp_clean_rings(cp);
>> -
>> +       /* network IS up, close it, reset MTU, and come up again. */
>> +       cp_close(dev);
>>          dev->mtu = new_mtu;
>> -       cp_set_rxbufsize(cp);           /* set new rx buf size */
>> -
>> -       rc = cp_init_rings(cp);         /* realloc and restart h/w */
>> -       cp_start_hw(cp);
>> -
>> -       spin_unlock_irqrestore(&cp->lock, flags);
>> -
>> -       return rc;
>> +       return cp_open(dev);
>>   }
>> -#endif /* BROKEN */
>>
>>   static const char mii_2_8139_map[8] = {
>>          BasicModeCtrl,
>> @@ -1835,9 +1821,7 @@ static const struct net_device_ops cp_netdev_ops = {
>>          .ndo_start_xmit         = cp_start_xmit,
>>          .ndo_tx_timeout         = cp_tx_timeout,
>>          .ndo_set_features       = cp_set_features,
>> -#ifdef BROKEN
>>          .ndo_change_mtu         = cp_change_mtu,
>> -#endif
>>
>>   #ifdef CONFIG_NET_POLL_CONTROLLER
>>          .ndo_poll_controller    = cp_poll_controller,
>> --
>> 1.7.11.7
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Good catch Rami.  Thanks.

^ permalink raw reply

* Re: TCP and reordering
From: Rick Jones @ 2012-11-28 18:24 UTC (permalink / raw)
  To: Saku Ytti; +Cc: netdev
In-Reply-To: <20121128085404.GB26010@pob.ytti.fi>

On 11/28/2012 12:54 AM, Saku Ytti wrote:
> On (2012-11-28 00:35 -0800), Vijay Subramanian wrote:
>
>> Also note that reordering is tracked on the sender side using the
>> per flow variable tp->reordering . This measures the amount of
>> reordering on the connection so that fast retransmit and other loss
>> recovery mechanisms are not entered prematurely. Doesn't this
>> behavior at the  sender already provide the behavior you seek?
>
> Sorry I don't seem to understand what you mean. Do you mind explaining how
> the sender can help to restore performance on reordering network?

tp->reordering is initialized via the sysctl net.ipv4.tcp_reordering 
which controls how anxious TCP will be to fast retransmit.

By increasing net.ipv4.tcp_reordering you make the sending TCP less 
"sensitive" to duplicate ACKs and so less sensitive to reordering 
detected by the receiver.  The receiver is generating as many 
(duplicate) ACKs as before, it is just that the sender is ignoring them 
a bit more.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: move inet_dport/inet_num in sock_common
From: Eric Dumazet @ 2012-11-28 18:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Ling Ma, Joe Perches, Ben Hutchings
In-Reply-To: <1354107378.14302.149.camel@edumazet-glaptop>

On Wed, 2012-11-28 at 04:56 -0800, Eric Dumazet wrote:

> +#define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif)	\
> +	((inet_sk(__sk)->inet_portpair == (__ports))		&&	\
> +	 (inet_sk(__sk)->inet_addrpair == (__addrpair))		&&	\
> +	 (!(__sk)->sk_bound_dev_if	||				\
> +	   ((__sk)->sk_bound_dev_if == (__dif))) 		&& 	\
> +	 net_eq(sock_net(__sk), (__net)))
> +#define INET_TW_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif)\
> +	((inet_twsk(__sk)->tw_portpair == (__ports))	&&		\
> +	 (inet_twsk(__sk)->tw_addrpair == (__addrpair))	&&		\
> +	 (!(__sk)->sk_bound_dev_if	||				\
> +	   ((__sk)->sk_bound_dev_if == (__dif)))	&&		\
> +	 net_eq(sock_net(__sk), (__net)))

Last minute change in a param name __cookie / __addrpair

I reverted this change as the 32bit part is not using the 'cookie', so
keeping __the cookie name makes sense.

I'll fix this and add performance results in the changelog.

^ permalink raw reply

* Re: [PATCH 2/2] arping: Call usage() before limiting capabilities.
From: YOSHIFUJI Hideaki @ 2012-11-28 18:18 UTC (permalink / raw)
  To: Jan Synacek; +Cc: netdev, YOSHIFUJI Hideaki
In-Reply-To: <1354018775-4966-2-git-send-email-jsynacek@redhat.com>

Jan Synacek wrote:
> Otherwise, running arping binary without the capabilities set results in printing
> warnings with the usage.
> 
> Signed-off-by: Jan Synacek <jsynacek@redhat.com>

Fixed in different way.  ping/ping6 has also been fixed.
Thank you.

--yoshfuji

^ permalink raw reply

* Re: [PATCH] smsc95xx: fix suspend buffer overflow
From: Steve Glendinning @ 2012-11-28 18:06 UTC (permalink / raw)
  To: Joe Perches; +Cc: Steve Glendinning, netdev, Dan Carpenter
In-Reply-To: <1354038178.8028.0.camel@joe-AO722>

>>       if (pdata->wolopts & (WAKE_BCAST | WAKE_MCAST | WAKE_ARP | WAKE_UCAST)) {
>> -             u32 *filter_mask = kzalloc(32, GFP_KERNEL);
>> +             u32 *filter_mask = kzalloc(sizeof(u32) * 32, GFP_KERNEL);
>
> It's also unchecked for alloc failure.

Thanks both, I've resubmitted with an alloc failure check,

I completely agree - that filter code isn't pretty!  If you have time
to knock up a patch I'd be happy to test it.

Steve

^ permalink raw reply

* Re: TCP and reordering
From: David Woodhouse @ 2012-11-28 18:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Benjamin LaHaise, Vijay Subramanian, David Miller, saku,
	rick.jones2, netdev
In-Reply-To: <1354122996.14302.427.camel@edumazet-glaptop>

[-- Attachment #1: Type: text/plain, Size: 643 bytes --]

On Wed, 2012-11-28 at 09:16 -0800, Eric Dumazet wrote:
> Its the driver responsibility to maintain the coherent 'bytes' value for
> each transmitted/completed packet.
> 
> If a driver calls an external entity, it cannot possibly use BQL, unless
> doing an approximation (bytes becomes a fixed value)

If tracking the original destructor, I can also track the original size
before the skb got passed down the stack.

> BQL was really something to control/limit queueing on ethernet links,
> not for stacked devices, as stacked devices normally have no queue.

Stacked devices have more queue than anything else :)

-- 
dwmw2


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 6171 bytes --]

^ permalink raw reply

* [PATCH] smsc75xx: don't call usbnet_resume if usbnet_suspend fails
From: Steve Glendinning @ 2012-11-28 17:57 UTC (permalink / raw)
  To: netdev; +Cc: Steve Glendinning

If usbnet_suspend returns an error we don't want to call
usbnet_resume to clean up, but instead just return the error.

If usbnet_suspend *does* succeed, and we have a problem further
on, the desired behaviour is still to call usbnet_resume
to clean up before returning.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
---
 drivers/net/usb/smsc75xx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c
index 1823806..86d9249 100644
--- a/drivers/net/usb/smsc75xx.c
+++ b/drivers/net/usb/smsc75xx.c
@@ -1411,7 +1411,7 @@ static int smsc75xx_suspend(struct usb_interface *intf, pm_message_t message)
 	int ret;
 
 	ret = usbnet_suspend(intf, message);
-	check_warn_goto_done(ret, "usbnet_suspend error\n");
+	check_warn_return(ret, "usbnet_suspend error\n");
 
 	if (pdata->suspend_flags) {
 		netdev_warn(dev->net, "error during last resume\n");
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 3/3] smsc95xx: don't enable remote wakeup directly
From: Steve Glendinning @ 2012-11-28 17:46 UTC (permalink / raw)
  To: netdev; +Cc: Steve Glendinning, Bjorn Mork
In-Reply-To: <1354124819-29531-1-git-send-email-steve.glendinning@shawell.net>

As pointed out by Bjorn Mork, the generic "usb" driver sets this
for us so no need to directly set it in this driver.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Cc: Bjorn Mork <bjorn@mork.no>
---
 drivers/net/usb/smsc95xx.c |   30 +++++-------------------------
 1 file changed, 5 insertions(+), 25 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index ffeaf13..064df1a 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -154,25 +154,6 @@ static int __must_check smsc95xx_write_reg(struct usbnet *dev, u32 index,
 {
 	return __smsc95xx_write_reg(dev, index, data, 0);
 }
-static int smsc95xx_set_feature(struct usbnet *dev, u32 feature)
-{
-	if (WARN_ON_ONCE(!dev))
-		return -EINVAL;
-
-	return usbnet_write_cmd_nopm(dev, USB_REQ_SET_FEATURE,
-				     USB_RECIP_DEVICE, feature, 0,
-				     NULL, 0);
-}
-
-static int smsc95xx_clear_feature(struct usbnet *dev, u32 feature)
-{
-	if (WARN_ON_ONCE(!dev))
-		return -EINVAL;
-
-	return usbnet_write_cmd_nopm(dev, USB_REQ_CLEAR_FEATURE,
-				     USB_RECIP_DEVICE, feature,
-				     0, NULL, 0);
-}
 
 /* Loop until the read is completed with timeout
  * called with phy_mutex held */
@@ -685,8 +666,13 @@ static int smsc95xx_ethtool_set_wol(struct net_device *net,
 {
 	struct usbnet *dev = netdev_priv(net);
 	struct smsc95xx_priv *pdata = (struct smsc95xx_priv *)(dev->data[0]);
+	int ret;
 
 	pdata->wolopts = wolinfo->wolopts & SUPPORTED_WAKE;
+
+	ret = device_set_wakeup_enable(&dev->udev->dev, pdata->wolopts);
+	check_warn_return(ret, "device_set_wakeup_enable error %d\n", ret);
+
 	return 0;
 }
 
@@ -1160,8 +1146,6 @@ static int smsc95xx_enter_suspend0(struct usbnet *dev)
 	ret = smsc95xx_read_reg_nopm(dev, PM_CTRL, &val);
 	check_warn_return(ret, "Error reading PM_CTRL\n");
 
-	smsc95xx_set_feature(dev, USB_DEVICE_REMOTE_WAKEUP);
-
 	return 0;
 }
 
@@ -1204,8 +1188,6 @@ static int smsc95xx_enter_suspend1(struct usbnet *dev)
 	ret = smsc95xx_write_reg_nopm(dev, PM_CTRL, val);
 	check_warn_return(ret, "Error writing PM_CTRL\n");
 
-	smsc95xx_set_feature(dev, USB_DEVICE_REMOTE_WAKEUP);
-
 	return 0;
 }
 
@@ -1456,8 +1438,6 @@ static int smsc95xx_resume(struct usb_interface *intf)
 	BUG_ON(!dev);
 
 	if (pdata->wolopts) {
-		smsc95xx_clear_feature(dev, USB_DEVICE_REMOTE_WAKEUP);
-
 		/* clear wake-up sources */
 		ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
 		check_warn_return(ret, "Error reading WUCSR\n");
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 2/3] smsc95xx: fix error handling in suspend failure case
From: Steve Glendinning @ 2012-11-28 17:46 UTC (permalink / raw)
  To: netdev; +Cc: Steve Glendinning
In-Reply-To: <1354124819-29531-1-git-send-email-steve.glendinning@shawell.net>

This patch ensures that if we fail to suspend the LAN9500 device
we call usbnet_resume before returning failure, instead of
leaving the usbnet driver in an unusable state.

Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
---
 drivers/net/usb/smsc95xx.c |   50 +++++++++++++++++++++++++-------------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index c397b3a..ffeaf13 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1248,35 +1248,37 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 
 		/* disable energy detect (link up) & wake up events */
 		ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
-		check_warn_return(ret, "Error reading WUCSR\n");
+		check_warn_goto_done(ret, "Error reading WUCSR\n");
 
 		val &= ~(WUCSR_MPEN_ | WUCSR_WAKE_EN_);
 
 		ret = smsc95xx_write_reg_nopm(dev, WUCSR, val);
-		check_warn_return(ret, "Error writing WUCSR\n");
+		check_warn_goto_done(ret, "Error writing WUCSR\n");
 
 		ret = smsc95xx_read_reg_nopm(dev, PM_CTRL, &val);
-		check_warn_return(ret, "Error reading PM_CTRL\n");
+		check_warn_goto_done(ret, "Error reading PM_CTRL\n");
 
 		val &= ~(PM_CTL_ED_EN_ | PM_CTL_WOL_EN_);
 
 		ret = smsc95xx_write_reg_nopm(dev, PM_CTRL, val);
-		check_warn_return(ret, "Error writing PM_CTRL\n");
+		check_warn_goto_done(ret, "Error writing PM_CTRL\n");
 
-		return smsc95xx_enter_suspend2(dev);
+		ret = smsc95xx_enter_suspend2(dev);
+		goto done;
 	}
 
 	if (pdata->wolopts & WAKE_PHY) {
 		ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
 			(PHY_INT_MASK_ANEG_COMP_ | PHY_INT_MASK_LINK_DOWN_));
-		check_warn_return(ret, "error enabling PHY wakeup ints\n");
+		check_warn_goto_done(ret, "error enabling PHY wakeup ints\n");
 
 		/* if link is down then configure EDPD and enter SUSPEND1,
 		 * otherwise enter SUSPEND0 below
 		 */
 		if (!link_up) {
 			netdev_info(dev->net, "entering SUSPEND1 mode\n");
-			return smsc95xx_enter_suspend1(dev);
+			ret = smsc95xx_enter_suspend1(dev);
+			goto done;
 		}
 	}
 
@@ -1292,7 +1294,8 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 
 		if (!filter_mask) {
 			netdev_warn(dev->net, "Unable to allocate filter_mask\n");
-			return -ENOMEM;
+			ret = -ENOMEM;
+			goto done;
 		}
 
 		memset(command, 0, sizeof(command));
@@ -1354,49 +1357,49 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 			ret = smsc95xx_write_reg_nopm(dev, WUFF, filter_mask[i]);
 			if (ret < 0)
 				kfree(filter_mask);
-			check_warn_return(ret, "Error writing WUFF\n");
+			check_warn_goto_done(ret, "Error writing WUFF\n");
 		}
 		kfree(filter_mask);
 
 		for (i = 0; i < (wuff_filter_count / 4); i++) {
 			ret = smsc95xx_write_reg_nopm(dev, WUFF, command[i]);
-			check_warn_return(ret, "Error writing WUFF\n");
+			check_warn_goto_done(ret, "Error writing WUFF\n");
 		}
 
 		for (i = 0; i < (wuff_filter_count / 4); i++) {
 			ret = smsc95xx_write_reg_nopm(dev, WUFF, offset[i]);
-			check_warn_return(ret, "Error writing WUFF\n");
+			check_warn_goto_done(ret, "Error writing WUFF\n");
 		}
 
 		for (i = 0; i < (wuff_filter_count / 2); i++) {
 			ret = smsc95xx_write_reg_nopm(dev, WUFF, crc[i]);
-			check_warn_return(ret, "Error writing WUFF\n");
+			check_warn_goto_done(ret, "Error writing WUFF\n");
 		}
 
 		/* clear any pending pattern match packet status */
 		ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
-		check_warn_return(ret, "Error reading WUCSR\n");
+		check_warn_goto_done(ret, "Error reading WUCSR\n");
 
 		val |= WUCSR_WUFR_;
 
 		ret = smsc95xx_write_reg_nopm(dev, WUCSR, val);
-		check_warn_return(ret, "Error writing WUCSR\n");
+		check_warn_goto_done(ret, "Error writing WUCSR\n");
 	}
 
 	if (pdata->wolopts & WAKE_MAGIC) {
 		/* clear any pending magic packet status */
 		ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
-		check_warn_return(ret, "Error reading WUCSR\n");
+		check_warn_goto_done(ret, "Error reading WUCSR\n");
 
 		val |= WUCSR_MPR_;
 
 		ret = smsc95xx_write_reg_nopm(dev, WUCSR, val);
-		check_warn_return(ret, "Error writing WUCSR\n");
+		check_warn_goto_done(ret, "Error writing WUCSR\n");
 	}
 
 	/* enable/disable wakeup sources */
 	ret = smsc95xx_read_reg_nopm(dev, WUCSR, &val);
-	check_warn_return(ret, "Error reading WUCSR\n");
+	check_warn_goto_done(ret, "Error reading WUCSR\n");
 
 	if (pdata->wolopts & (WAKE_BCAST | WAKE_MCAST | WAKE_ARP | WAKE_UCAST)) {
 		netdev_info(dev->net, "enabling pattern match wakeup\n");
@@ -1415,11 +1418,11 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 	}
 
 	ret = smsc95xx_write_reg_nopm(dev, WUCSR, val);
-	check_warn_return(ret, "Error writing WUCSR\n");
+	check_warn_goto_done(ret, "Error writing WUCSR\n");
 
 	/* enable wol wakeup source */
 	ret = smsc95xx_read_reg_nopm(dev, PM_CTRL, &val);
-	check_warn_return(ret, "Error reading PM_CTRL\n");
+	check_warn_goto_done(ret, "Error reading PM_CTRL\n");
 
 	val |= PM_CTL_WOL_EN_;
 
@@ -1428,14 +1431,19 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 		val |= PM_CTL_ED_EN_;
 
 	ret = smsc95xx_write_reg_nopm(dev, PM_CTRL, val);
-	check_warn_return(ret, "Error writing PM_CTRL\n");
+	check_warn_goto_done(ret, "Error writing PM_CTRL\n");
 
 	/* enable receiver to enable frame reception */
 	smsc95xx_start_rx_path(dev, 1);
 
 	/* some wol options are enabled, so enter SUSPEND0 */
 	netdev_info(dev->net, "entering SUSPEND0 mode\n");
-	return smsc95xx_enter_suspend0(dev);
+	ret = smsc95xx_enter_suspend0(dev);
+
+done:
+	if (ret)
+		usbnet_resume(intf);
+	return ret;
 }
 
 static int smsc95xx_resume(struct usb_interface *intf)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 1/3] smsc95xx: fix suspend buffer overflow
From: Steve Glendinning @ 2012-11-28 17:46 UTC (permalink / raw)
  To: netdev; +Cc: Steve Glendinning, Bjorn Mork, Joe Perches
In-Reply-To: <1354124819-29531-1-git-send-email-steve.glendinning@shawell.net>

This patch fixes a buffer overflow introduced by bbd9f9e, where
the filter_mask array is accessed beyond its bounds.

Updated to also add a check for kzalloc failure, as reported by
Bjorn Mork and Joe Perches.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Cc: Bjorn Mork <bjorn@mork.no>
Cc: Joe Perches <joe@perches.com>
---
 drivers/net/usb/smsc95xx.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 79d495d..c397b3a 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1281,7 +1281,7 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 	}
 
 	if (pdata->wolopts & (WAKE_BCAST | WAKE_MCAST | WAKE_ARP | WAKE_UCAST)) {
-		u32 *filter_mask = kzalloc(32, GFP_KERNEL);
+		u32 *filter_mask = kzalloc(sizeof(u32) * 32, GFP_KERNEL);
 		u32 command[2];
 		u32 offset[2];
 		u32 crc[4];
@@ -1290,6 +1290,11 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 			LAN9500A_WUFF_NUM : LAN9500_WUFF_NUM;
 		int i, filter = 0;
 
+		if (!filter_mask) {
+			netdev_warn(dev->net, "Unable to allocate filter_mask\n");
+			return -ENOMEM;
+		}
+
 		memset(command, 0, sizeof(command));
 		memset(offset, 0, sizeof(offset));
 		memset(crc, 0, sizeof(crc));
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 0/3] smsc95xx enhancements
From: Steve Glendinning @ 2012-11-28 17:46 UTC (permalink / raw)
  To: netdev; +Cc: Steve Glendinning

This patchset is a resubmission of 1 updated patch and two new
enhancements.

Please consider all for net-next

Steve Glendinning (3):
  smsc95xx: fix suspend buffer overflow
  smsc95xx: fix error handling in suspend failure case
  smsc95xx: don't enable remote wakeup directly

 drivers/net/usb/smsc95xx.c |   85 ++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 46 deletions(-)

-- 
1.7.10.4

^ permalink raw reply

* Re: [PATCH] bonding: make arp_ip_target parameter checks consistent with sysfs
From: Bjørn Mork @ 2012-11-28 17:34 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: David Miller, netdev, andy, fubar
In-Reply-To: <50B63CF4.8070404@redhat.com>

Nikolay Aleksandrov <nikolay@redhat.com> writes:

> Actually I did see it from kernel/module.c and didn't come up with it
> myself.

Just FYI, running  scripts/checkpatch.pl -f kernel/module.c results in:

total: 27 errors, 65 warnings, 3639 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

so I'd not use that as a style guide if I were you....



Bjørn

^ permalink raw reply

* Re: [PATCH net-next] gro: Handle inline VLAN tags
From: Andrew Gallatin @ 2012-11-28 17:30 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, bhutchings, netdev, linux-net-drivers, herbert
In-Reply-To: <20121128.114602.1208211617672057364.davem@davemloft.net>

On 11/28/12 11:46, David Miller wrote:
> From: Andrew Gallatin <gallatin@myri.com>
> Date: Mon, 26 Nov 2012 10:04:44 -0500
>
>> How do you feel about the patchset I posted on 11/14/2012
>> ([PATCH net-next 0/3] myri10ge: LRO to GRO conversion,
>> http://marc.info/?l=linux-netdev&m=135289838223920&w=2)
>> which moves myri10ge from LRO to GRO?
>>
>> Specifically, if doing vlan decap in GRO is not OK, then how
>> about doing it in the driver?
>
> I'm going to trust Eric's judgment on this one, as he has been
> thinking more about the long term GRO issues than I have.
>
> Thanks.
>

Perfect, thanks.  Eric seemed OK with doing the vlan decap
in the driver.  But.. I didn't get an "nack" or an "applied" from
you in response to that myri10ge patchset.  Do I need to re-submit it?

Sorry, I'm not terribly familiar with the etiquette for this sort
of thing.

Thanks,

Drew

^ permalink raw reply

* Re: [PATCH v2 net-next] net: move inet_dport/inet_num in sock_common
From: Eric Dumazet @ 2012-11-28 17:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ling.ma.program, joe, bhutchings
In-Reply-To: <20121128.120243.1053933388686939425.davem@davemloft.net>

On Wed, 2012-11-28 at 12:02 -0500, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Wed, 28 Nov 2012 11:48:35 -0500 (EST)
> 
> > I like this a lot, applied, thanks Eric!
> 
> And I'd like it even more if it was build tested :-/
> 
> net/ipv4/inet_hashtables.c: In function ‘__inet_lookup_established’:
> net/ipv4/inet_hashtables.c:242:7: error: expected expression before ‘)’ token
> net/ipv4/inet_hashtables.c:246:8: error: expected expression before ‘)’ token
> net/ipv4/inet_hashtables.c:267:7: error: expected expression before ‘)’ token
> net/ipv4/inet_hashtables.c:274:8: error: expected expression before ‘)’ token
> net/ipv4/inet_hashtables.c:226:2: warning: unused variable ‘acookie’ [-Wunused-variable]
> net/ipv4/inet_hashtables.c: In function ‘__inet_check_established’:
> net/ipv4/inet_hashtables.c:326:7: error: expected expression before ‘)’ token
> net/ipv4/inet_hashtables.c:341:7: error: expected expression before ‘)’ token
> net/ipv4/inet_hashtables.c:307:2: warning: unused variable ‘acookie’ [-Wunused-variable]
> net/ipv4/inet_hashtables.c:306:6: warning: unused variable ‘dif’ [-Wunused-variable]
> 
> This is with gcc-4.7.2-2 on Fedora 17 on an allmodconfig build.

Hmm, sorry David, I'll double check and send a v3.

Thanks

^ permalink raw reply

* Re: TCP and reordering
From: Eric Dumazet @ 2012-11-28 17:16 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Benjamin LaHaise, Vijay Subramanian, David Miller, saku,
	rick.jones2, netdev
In-Reply-To: <1354120887.21562.87.camel@shinybook.infradead.org>

On Wed, 2012-11-28 at 16:41 +0000, David Woodhouse wrote:

> Another fun issue with tunnelling protocols and BQL... packets tend to
> *grow* as they get encapsulated. So you might end up calling
> netdev_sent_queue() with a given size, then netdev_completed_queue()
> with a bigger packet later...

Its the driver responsibility to maintain the coherent 'bytes' value for
each transmitted/completed packet.

If a driver calls an external entity, it cannot possibly use BQL, unless
doing an approximation (bytes becomes a fixed value)

BQL was really something to control/limit queueing on ethernet links,
not for stacked devices, as stacked devices normally have no queue.

^ permalink raw reply

* Re: [PATCH 1/2] smsc75xx: refactor entering suspend modes
From: Steve Glendinning @ 2012-11-28 17:16 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: Alan Stern, netdev, linux-usb, David Miller
In-Reply-To: <87wqx528ci.fsf@nemi.mork.no>

>> +
>> + ret = device_set_wakeup_enable(&net->dev, pdata->wolopts);
>
> You are touching the network device here.  That should have been the USB
> device.  Try something like
>
>  ret = device_set_wakeup_enable(&dev->udev->dev, pdata->wolopts);

Perfect, this works exactly as expected now.  Patch included in
resubmitted patchset, thanks Bjorn!

^ permalink raw reply

* Re: [PATCH] br2684: don't send frames on not-ready vcc
From: David Miller @ 2012-11-28 17:11 UTC (permalink / raw)
  To: dwmw2; +Cc: chas, krzysiek, netdev, linux-kernel, nathan
In-Reply-To: <1354122555.21562.92.camel@shinybook.infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 28 Nov 2012 17:09:15 +0000

> And then I'll either send an explicit pull request, or submit it as
> patches ― whichever you prefer.

The canonical thing is to do both, send the pull request in the
"[PATCH 0/N]" email, and then the patches so everyone can see the
final form of the changes.

> Hopefully in the next day or so; the merge window is approaching...

Yep, thanks.

^ permalink raw reply

* Re: [PATCH] br2684: don't send frames on not-ready vcc
From: David Woodhouse @ 2012-11-28 17:09 UTC (permalink / raw)
  To: David Miller; +Cc: chas, krzysiek, netdev, linux-kernel, nathan
In-Reply-To: <20121128.120415.778262714601372592.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 610 bytes --]

On Wed, 2012-11-28 at 12:04 -0500, David Miller wrote:
> Do you want me to pull that tree into net-next or is there a plan to
> repost the entire series of work for a final submission?

I think it needs a little more testing/consensus first. I'd like an ack
from Chas on the atm ->release_cb() thing, at least. And I wouldn't mind
confirmation from Nathan's customer that they're no longer seeing the
panics.

And then I'll either send an explicit pull request, or submit it as
patches — whichever you prefer.

Hopefully in the next day or so; the merge window is approaching...

-- 
dwmw2


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 6171 bytes --]

^ permalink raw reply

* Re: TCP and reordering
From: Benjamin LaHaise @ 2012-11-28 17:08 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Eric Dumazet, Vijay Subramanian, David Miller, saku, rick.jones2,
	netdev
In-Reply-To: <1354120887.21562.87.camel@shinybook.infradead.org>

On Wed, Nov 28, 2012 at 04:41:27PM +0000, David Woodhouse wrote:
> Absolutely. But in the cases where they *do* connect to the congested
> link, and the packets are backing up on the *same* host, there's no
> excuse for not actually knowing that and behaving appropriately :)

Agreed.

> And even if the congested link is somewhere upstream, you'd hope that
> something exists (like ECN) to let you know about it.
> 
> In the LNS case that I'm most familiar with, the LNS *does* know about
> the bandwidth of each customer's ADSL line, and limits the bandwidth of
> each session appropriately. It's much better to decide at the LNS which
> packets to drop, than to let the telco decide. Or worse, to have the
> ADSL link drop one ATM cell out of *every* packet when it's
> overloaded...

I'm speaking from experience: the telcos I've dealt with (2 different 
companies here in Canada) do *not* know the speed of the ADSL link being 
fed with PPPoE at the customer premises that a LAC receives as an incoming 
session.  The issue is that the aggregation network does not propagate 
that information from the DSLAM to the LAC.  It's a big mess where the 
aggregation network has a mix of ATM and L2 ethernet switches, and much 
of the gear has no support for protocols that can carry that information.

> > This sort of chaining of destructors is going to be very expensive in 
> > terms of CPU cycles.  If this does get implemented, please ensure there is 
> > a way to turn it off.
> 
> You asked that before, and I think we agreed that it would be acceptable
> to use the existing CONFIG_BQL option?

No, that would not be sufficient, as otherwise there is no means to control 
the behaviour of distribution vendor kernels -- they would most likely 
default to on.

> I'm looking at adding ppp-channel equivalents of
> netdev_{reset,sent,completed}_queue, and having the PPP channels call
> them as appropriate. For some it's trivial, but in the PPPoE/L2TP cases
> because we want to install destructors without stomping on TSQ it'll be
> substantial enough that it should be compiled out if CONFIG_BQL isn't
> enabled.

This sounds like overhead.  That said, I'd like to measure it to see what 
sort of actual effect this has on performance before passing any judgement.  
I'd be happy to put together a test setup to run anything you've come up 
with through.

> > That said, if there is local congestion, the benefits of BQL would be 
> > worthwhile to have.
> 
> If there is local congestion... *or* if you have proper bandwidth
> management on the link to the clients; either by knowing the bandwidth
> and voluntarily limiting to it, or by something like ECN.

Improved ECN support is a very good idea.

> > > But I wish there was a nicer way to chain destructors. And no, I don't
> > > count what GSO does. We can't use the cb here anyway since we're passing
> > > it down the stack.
> > 
> > I think all the tunneling protocols are going to have the same problem 
> > here, so it deserves some thought about how to tackle the issue in a 
> > generic way without incurring a large amount of overhead. 
> 
> Right. There are a few cases of skb->destructor being used at different
> levels of the stack where I suspect this might already be an issue, in
> fact. And things like TSQ will silently be losing track of packets
> because of skb_orphan, even before they've left the box.
> 
> Hah, and I note that l2tp is *already* stomping on skb->destructor for
> its own purposes. So I could potentially just use its existing callback
> and pretend I hadn't seen that it screws up TSQ, and leave the issue of
> chaining destructors to be Someone Else's Problem???.

*nod*

> Actually, I think it overwrites the destructor without calling
> skb_orphan() first ??? which will *really* upset TSQ, won't it?

Yes, that would defeat things.

> >  This exact 
> > problem is one of the reasons multilink PPP often doesn't work well over 
> > L2TP or PPPoE as compared to its behaviour over ttys.
> 
> Another fun issue with tunnelling protocols and BQL... packets tend to
> *grow* as they get encapsulated. So you might end up calling
> netdev_sent_queue() with a given size, then netdev_completed_queue()
> with a bigger packet later...

Oh fun!

Ultimately, we want to know about congestion as early as possible in the 
packet processing.  In the case of L2TP, it would be helpful to use the 
knowledge of the path the packet will be sent out on to correclty set the 
ECN bits on the packet inside the L2TP encapsulation.  The L2TP code does 
not appear to do this at present, so this needs work.

		-ben

> -- 
> dwmw2
> 



-- 
"Thought is the essence of where you are now."

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox