Netdev List


* Re: [PATCH] tcp: undo_retrans counter fixes
From: Yuchung Cheng @ 2011-02-08  0:22 UTC (permalink / raw)
  To: Ilpo Järvinen; +Cc: David Miller, Netdev
In-Reply-To: <alpine.DEB.2.00.1102080112450.29228@melkinpaasi.cs.helsinki.fi>

On Mon, Feb 7, 2011 at 3:36 PM, Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> wrote:
>
> On Mon, 7 Feb 2011, David Miller wrote:
>
> > From: Yuchung Cheng <ycheng@google.com>
> > Date: Mon,  7 Feb 2011 14:57:04 -0800
> >
> > > Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
> > > not set or undo_retrans is already 0. This happens when sender receives
> > > more DSACK ACKs than packets retransmitted during the current
> > > undo phase. This may also happen when sender receives DSACK after
> > > the undo operation is completed or cancelled.
> > >
> > > Fix another bug that undo_retrans is incorrectly incremented when
> > > sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
> > > is rare but not impossible.
> > >
> > > Signed-off-by: Yuchung Cheng <ycheng@google.com>
> >
> > Looks good, Ilpo could you please review this real quick?
>
> I already took a quick look so you're real lucky, only delay of writing is
> needed... :-)
thanks.
>
> Neither is harmful to "fix" but I think they're partially also checking
> for things which shouldn't cause problems... E.g., undo_retrans is only
> used after checking undo_marker's validity first so I don't think
> undo_marker check is exactly necessary there (but on the other hand it
> does no harm)...
Logically, shouldn't we check the validity of undo_marker/undo_retrans
before we use them? The current code has no problem if
tcp_fastretrans_alert() always calls the tcp_try_undo_* functions whenever
undo_marker != 0 and undo_retrans == 0. I don't think that's always
true.

>
> The tcp_retransmit_skb problem I don't understand at all as we should be
> fragmenting or resetting pcount to 1 (the latter is true only if all
> bugfixes were included to the kernel where >1 pcount for a rexmitted skb
> was seen). If pcount is indeed >1 we might have other issues too somewhere
We found that sometimes pcount > 1 on real servers. This change keeps
the retrans_out/undo_retrans counters consistent.

> but I fail to remember immediately what they would be. That change is not
> bad though since using +/-1 is something we should be getting rid of
> anyway and on long term it would be nice to make tcp_retransmit_skb to be
> able to take advantage of TSO anyway whenever possible.
>
> I also noticed that the undo_retrans code in sacktag side is still doing
> undo_retrans-- ops which could certainly cause real miscounts, though
> it is extremely unlikely due to the fact that DSACK should be sent
> immediately for a single segment at a time (so the sender would need to
> split+recollapse in between).
I have the same doubt but our servers never hit this condition
(pcount > 1). So I keep this part intact.

>
> --
>  i.


* [PATCH] batman-adv: Linearize fragment packets before merge
From: Sven Eckelmann @ 2011-02-07 23:59 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r, Marek Lindner
In-Reply-To: <1297123161-25612-1-git-send-email-sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org>

We access the data inside the skbs of two fragments directly using memmove
during the merge. The data of an skb can span multiple skb pages. A
direct access without knowledge of the pages would lead to an invalid memory
access.

Signed-off-by: Sven Eckelmann <sven-KaDOiPu9UxWEi8DpZVb4nw@public.gmane.org>
[lindner_marek-LWAfsSFWpa4@public.gmane.org: Move return from function to the end]
Signed-off-by: Marek Lindner <lindner_marek-LWAfsSFWpa4@public.gmane.org>
---
 net/batman-adv/unicast.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/batman-adv/unicast.c b/net/batman-adv/unicast.c
index ee41fef..d1a6113 100644
--- a/net/batman-adv/unicast.c
+++ b/net/batman-adv/unicast.c
@@ -50,12 +50,12 @@ static struct sk_buff *frag_merge_packet(struct list_head *head,
 		skb = tfp->skb;
 	}
 
+	if (skb_linearize(skb) < 0 || skb_linearize(tmp_skb) < 0)
+		goto err;
+
 	skb_pull(tmp_skb, sizeof(struct unicast_frag_packet));
-	if (pskb_expand_head(skb, 0, tmp_skb->len, GFP_ATOMIC) < 0) {
-		/* free buffered skb, skb will be freed later */
-		kfree_skb(tfp->skb);
-		return NULL;
-	}
+	if (pskb_expand_head(skb, 0, tmp_skb->len, GFP_ATOMIC) < 0)
+		goto err;
 
 	/* move free entry to end */
 	tfp->skb = NULL;
@@ -70,6 +70,11 @@ static struct sk_buff *frag_merge_packet(struct list_head *head,
 	unicast_packet->packet_type = BAT_UNICAST;
 
 	return skb;
+
+err:
+	/* free buffered skb, skb will be freed later */
+	kfree_skb(tfp->skb);
+	return NULL;
 }
 
 static void frag_create_entry(struct list_head *head, struct sk_buff *skb)
-- 
1.7.2.3
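
The reason for linearizing before the merge can be illustrated outside the kernel. The following is a minimal userspace sketch, not batman-adv code: `struct toy_buf`, `toy_linearize()` and `toy_merge()` are hypothetical names, and a single `page` pointer stands in for the paged part of an skb. It assumes `head` and `page` are either NULL or heap-allocated. A flat copy over `head` alone would miss the paged bytes, so both buffers are made contiguous first, mirroring the `skb_linearize(...) < 0` checks in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy buffer, NOT struct sk_buff: a contiguous head plus
 * one optional out-of-line chunk standing in for paged data. */
struct toy_buf {
	unsigned char *head;   /* contiguous part (heap-allocated or NULL) */
	size_t head_len;
	unsigned char *page;   /* paged part, NULL once the buffer is linear */
	size_t page_len;
};

/* Pull the paged part into the contiguous head.
 * Returns 0 on success, -1 on allocation failure. */
int toy_linearize(struct toy_buf *b)
{
	unsigned char *merged;

	assert(b != NULL);
	if (b->page == NULL)
		return 0;               /* already linear */
	if (b->page_len > 0) {
		merged = realloc(b->head, b->head_len + b->page_len);
		if (merged == NULL)
			return -1;
		memcpy(merged + b->head_len, b->page, b->page_len);
		b->head = merged;
		b->head_len += b->page_len;
	}
	free(b->page);
	b->page = NULL;
	b->page_len = 0;
	return 0;
}

/* Append src's payload to dst. Both are linearized first so that the
 * flat memcpy below sees every byte. Returns 0 on success, -1 on error. */
int toy_merge(struct toy_buf *dst, struct toy_buf *src)
{
	unsigned char *grown;

	assert(dst != NULL && src != NULL);
	if (toy_linearize(dst) < 0 || toy_linearize(src) < 0)
		return -1;
	if (src->head_len == 0)
		return 0;               /* nothing to append */
	grown = realloc(dst->head, dst->head_len + src->head_len);
	if (grown == NULL)
		return -1;
	memcpy(grown + dst->head_len, src->head, src->head_len);
	dst->head = grown;
	dst->head_len += src->head_len;
	return 0;
}
```

As in the patch, every failure funnels into a single error return, so the caller has one place to free the buffered fragment.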


* pull request: batman-adv 2011-02-08
From: Sven Eckelmann @ 2011-02-07 23:59 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r

Hi,

I would like to propose the following patch for net-2.6.git/2.6.38, which fixes a
possible kernel oops in the unicast fragmentation code.

thanks,
	Sven

The following changes since commit 1181e1daace88018b2ff66592aa10a4791d705ff:

  batman-adv: Make vis info stack traversal threadsafe (2011-01-30 10:32:08 +0100)

are available in the git repository at:
  git://git.open-mesh.org/ecsv/linux-merge.git batman-adv/merge

Sven Eckelmann (1):
      batman-adv: Linearize fragment packets before merge

 net/batman-adv/unicast.c |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)


* Re: [PATCH] tcp: undo_retrans counter fixes
From: Ilpo Järvinen @ 2011-02-07 23:36 UTC (permalink / raw)
  To: David Miller; +Cc: ycheng, Netdev
In-Reply-To: <20110207.150522.28821840.davem@davemloft.net>

On Mon, 7 Feb 2011, David Miller wrote:

> From: Yuchung Cheng <ycheng@google.com>
> Date: Mon,  7 Feb 2011 14:57:04 -0800
> 
> > Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
> > not set or undo_retrans is already 0. This happens when sender receives
> > more DSACK ACKs than packets retransmitted during the current
> > undo phase. This may also happen when sender receives DSACK after
> > the undo operation is completed or cancelled.
> > 
> > Fix another bug that undo_retrans is incorrectly incremented when
> > sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
> > is rare but not impossible.
> > 
> > Signed-off-by: Yuchung Cheng <ycheng@google.com>
> 
> Looks good, Ilpo could you please review this real quick?

I already took a quick look so you're real lucky, only delay of writing is 
needed... :-)

Neither is harmful to "fix" but I think they're partially also checking 
for things which shouldn't cause problems... E.g., undo_retrans is only 
used after checking undo_marker's validity first so I don't think 
undo_marker check is exactly necessary there (but on the other hand it 
does no harm)... 

The tcp_retransmit_skb problem I don't understand at all as we should be 
fragmenting or resetting pcount to 1 (the latter is true only if all 
bugfixes were included to the kernel where >1 pcount for a rexmitted skb 
was seen). If pcount is indeed >1 we might have other issues too somewhere 
but I fail to remember immediately what they would be. That change is not 
bad though since using +/-1 is something we should be getting rid of 
anyway and on long term it would be nice to make tcp_retransmit_skb to be 
able to take advantage of TSO anyway whenever possible.

I also noticed that the undo_retrans code in sacktag side is still doing 
undo_retrans-- ops which could certainly cause real miscounts, though 
it is extremely unlikely due to the fact that DSACK should be sent 
immediately for a single segment at a time (so the sender would need to 
split+recollapse in between).

-- 
 i.


* Re: [Bugme-new] [Bug 28482] New: ADSL PPPOE kernel bug at /arch/x86/kernel/pci-nommu.c
From: Andrew Morton @ 2011-02-07 23:30 UTC (permalink / raw)
  To: wm666, linux-ide; +Cc: bugzilla-daemon, bugme-daemon, netdev
In-Reply-To: <bug-28482-10286@https.bugzilla.kernel.org/>


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 7 Feb 2011 13:24:48 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=28482
> 
>            Summary: ADSL PPPOE kernel bug at /arch/x86/kernel/pci-nommu.c
>            Product: Drivers
>            Version: 2.5

2.6.36.x, actually.

>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org

OK, this is strange.  The BUG trace points very firmly at the ATA code,
but your report attributes the crash to the PPPOE code.  How can this be?

Perhaps the PPPOE code is scribbling on memory which the ata code uses.

If you have more oops traces then it would be useful for us to see them
please.  You could attach them to the bugzilla report and then let us
know via reply-to-all to this email, thanks.



>         ReportedBy: wm666@mail.ru
>         Regression: No
> 
> 
> Created an attachment (id=46712)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=46712)
> Screenshot
> 
> uname -a
> Linux server 2.6.36-gentoo-r5-sfireman #1 SMP Sun Feb 6 17:34:43 MSK 2011
> x86_64 Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz GenuineIntel GNU/Linux
> 
> lsmod
> xt_CLASSIFY              877  24
> sch_sfq                 5791  4
> sch_htb                14250  1
> xt_limit                1830  2
> xt_state                1143  5
> ipt_addrtype            1865  0
> xt_DSCP                 2059  0
> xt_dscp                 1579  0
> xt_string               1211  0
> xt_owner                1063  0
> xt_NFQUEUE              1565  0
> xt_multiport            1702  6
> xt_mark                 1093  0
> xt_iprange              1456  0
> xt_hashlimit            5797  0
> xt_conntrack            2551  0
> xt_connmark             1629  0
> scsi_wait_scan           679  0
> 
> lspci
> 00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev
> 02)
> 00:01.0 PCI bridge: Intel Corporation 82P965/G965 PCI Express Root Port (rev
> 02)
> 00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1
> (rev 02)
> 00:1c.5 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 6
> (rev 02)
> 00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
> Controller #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
> Controller #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI
> Controller #3 (rev 02)
> 00:1d.3 USB Controller: Intel Corporation Device 2833 (rev 02)
> 00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI
> Controller #1 (rev 02)
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
> 00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface
> Controller (rev 02)
> 00:1f.2 SATA controller: Intel Corporation 82801HR/HO/HH (ICH8R/DO/DH) 6 port
> SATA AHCI Controller (rev 02)
> 00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
> 02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E
> Gigabit Ethernet Controller (rev 12)
> 03:00.0 VGA compatible controller: nVidia Corporation Device 10c3 (rev a2)
> 03:00.1 Audio device: nVidia Corporation Device 0be3 (rev a1)
> 04:00.0 Multimedia controller: Philips Semiconductors SAA7146 (rev 01)
> 04:02.0 Ethernet controller: D-Link System Inc DGE-530T Gigabit Ethernet
> Adapter (rev 11) (rev 11)
> 04:04.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit
> Ethernet Controller (rev 14)
> 
> kernel bug at /arch/x86/kernel/pci-nommu.c
> 
> how to reproduce:
> I use pppoe(pppd with compiled in kernel module for pppoe)
> gentoo /etc/conf.d/net
> 
> #HOME DOMAIN NAME
> dns_domain_lo="home"
> 
> #LAN
> config_eth0=( "192.168.10.1 netmask 255.255.255.0 brd 192.168.10.255" )
> 
> #WIFI
> config_eth1=( "192.168.0.2 netmask 255.255.255.0 brd 192.168.0.255" )
> 
> #ADSL
> config_eth2=( "192.168.1.2 netmask 255.255.255.0 brd 192.168.1.255" )
> 
> #ADSL Primary
> config_ppp0=(
>         "ppp"
>         "192.168.1.254" #ALIAS
> )
> link_ppp0="eth2"
> plugins_ppp0=( "pppoe" )
> username_ppp0='sfireman'
> password_ppp0='zpres9'
> pppd_ppp0=(
>         "noauth"
>         "defaultroute"
>         "usepeerdns"
>         "holdoff 3"
>         "child-timeout 60"
>         "lcp-echo-interval 15"
>         "lcp-echo-failure 3"
>         noaccomp noccp nobsdcomp nodeflate nopcomp novj novjccomp
> )
> 
> depend_ppp0() {
>     need net.eth2
> }
> 
> #ADSL Statistic only
> config_ppp1=( "ppp" )
> link_ppp1="eth2"
> plugins_ppp1=( "pppoe" )
> username_ppp1='stat'
> password_ppp1='stat'
> pppd_ppp1=(
>         "noauth"
>         "defaultroute"
>         "usepeerdns"
>         "holdoff 3"
>         "child-timeout 60"
>         "lcp-echo-interval 15"
>         "lcp-echo-failure 3"
>         noaccomp noccp nobsdcomp nodeflate nopcomp novj novjccomp
> )
> 
> depend_ppp1() {
>     need net.eth2
> }
> 
> do.: 
> /etc/init.d/net.ppp0 start
> /etc/init.d/net.ppp1 start
> 
> So the bug is triggered by starting the second ppp link while the first is
> already present.
> 
> Bumps! :(
> It reproduces 100% of the time.
> But the stack traces differ.
> Please, fix it.
> 
> P.S. sorry for my ugly english :)



* Re: [Bugme-new] [Bug 28512] New: IPv6 SLAAC address preferred over static one as source address
From: Andrew Morton @ 2011-02-07 23:20 UTC (permalink / raw)
  To: netdev; +Cc: bugzilla-daemon, bugme-daemon, ghen
In-Reply-To: <bug-28512-10286@https.bugzilla.kernel.org/>


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 7 Feb 2011 16:15:16 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=28512
> 
>            Summary: IPv6 SLAAC address preferred over static one as source
>                     address
>            Product: Networking
>            Version: 2.5
>     Kernel Version: 2.6.36
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: IPV6
>         AssignedTo: yoshfuji@linux-ipv6.org
>         ReportedBy: ghen@telenet.be
>         Regression: No
> 
> 
> Linux IPv6 source address selection rules are described here:
> http://www.davidc.net/networking/ipv6-source-address-selection-linux
> 
> In case of a tie, "Linux chooses to use the latest address added".
> 
> A very common tie is where a host has a SLAAC (Stateless address
> auto-configuration) address as well as one or more statically assigned ones in
> the same /64.  The SLAAC address will almost always be the most recently
> "added" one, as it is renewed with every Router Advertisement on the network,
> and there will be a tie for all other rules.
> 
> As a consequence, the kernel chooses this address by default for outgoing
> connections.  This is usually not the preferred scenario; the static address
> will more likely have proper reverse DNS, be configured in ACL's, etc.
> 
> This has been discussed on the ipv6-ops mailing list
> (ipv6-ops@lists.cluenet.de), and a better suggestion for a tie-breaker came
> out: the preferred lifetime of the address.
> 
> SLAAC addresses will have a limited preferred lifetime (as defined by the
> router), static addresses will usually have an unlimited preferred lifetime
> (0).  So it makes a lot of sense to take this preferred lifetime into account
> for source address selection (how is it otherwise "preferred"?).
> 
> This could be added as rule #9 before using the most recently added as a final
> tie breaker?
> 
> Geert



* Re: [Bugme-new] [Bug 28532] New: Link state change detection problem on Moschip MCS7832 [mcs7830]
From: Andrew Morton @ 2011-02-07 23:14 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA
  Cc: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r,
	bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r,
	myxal.mxl-Re5JQEeQqe8AvxtiuMwx3w, Andreas Mohr
In-Reply-To: <bug-28532-10286-3bo0kxnWaOQUvHkbgXJLS5sdmw4N0Rt+2LY78lusg7I@public.gmane.org/>


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 7 Feb 2011 18:14:56 GMT
bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=28532
> 
>            Summary: Link state change detection problem on Moschip MCS7832
>                     [mcs7830]
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.38-rc2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network-ztI5WcYan/vQLgFONoPN62D2FQJk+8+b@public.gmane.org
>         ReportedBy: myxal.mxl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
>         Regression: No
> 
> 
> Created an attachment (id=46752)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=46752)
> lsusb -vv output
> 
> Hi.
> 
> I have a network adapter which uses the aforementioned driver and while
> checking for the link state via ethtool reports the correct state, many
> networking userspace utilities seem to have no clue about it (NM 0.8.1 starts
> dhclient BEFORE any cable is plugged) - and more importantly, don't notice when
> the cable is (dis)connected. Since there's not even a kernel message when
> (dis)connecting the cable, I suspect the driver does not implement Link state
> change detection at all. Is this accurate?
> 
> LSCD works in Windows, where it's apparently implemented through periodic
> polling (judging by virtualbox's blinking USB icon).
> How is this situation normally handled? Is it kernel's job to do the polling?
> Or are userspace utilities expected to do this?
> 
> Docs to the chip are available here:
> http://www.moschip.com/data/products/MCS7830/Data%20Sheet_7830.pdf
> 
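
The periodic polling the report describes can be sketched as a small state machine. This is a hypothetical userspace model, not mcs7830 driver code: `struct toy_link_monitor` and `toy_link_poll()` are made-up names, and the sampled state is passed in as a plain boolean rather than read from hardware. In a real Linux driver, the branch that detects a change is where the carrier state would be reported with `netif_carrier_on()` or `netif_carrier_off()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical monitor state: last known link state plus a counter of
 * observed up/down transitions. */
struct toy_link_monitor {
	bool link_up;
	unsigned int transitions;
};

/* Feed one polled sample into the monitor.
 * Returns true only when the sample differs from the last known state,
 * which is the moment a notification should be sent to listeners. */
bool toy_link_poll(struct toy_link_monitor *mon, bool sampled_up)
{
	assert(mon != NULL);
	if (sampled_up == mon->link_up)
		return false;           /* no change, nothing to report */
	mon->link_up = sampled_up;
	mon->transitions++;
	return true;
}
```

Calling `toy_link_poll()` from a timer with each fresh sample yields exactly one notification per cable plug or unplug, regardless of how often the poll runs.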

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH v4 0/5] net: Unified offload configuration
From: Michał Mirosław @ 2011-02-07 23:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings
In-Reply-To: <20110207.145259.246536239.davem@davemloft.net>

On Mon, Feb 07, 2011 at 02:52:59PM -0800, David Miller wrote:
> From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Date: Mon, 7 Feb 2011 23:49:37 +0100
> > What driver did you get this output from?
> NIU, which sets NETIF_F_HW_CSUM in netdev->features, which is
> absolutely correct.

Quick look at niu_process_rx_pkt() reveals that if you disable
TX checksumming you will also get 'rx-checksumming: off' from ethtool,
but RX checksumming will still be used.

Best Regards,
Michał Mirosław


* Re: [Bugme-new] [Bug 28542] New: 3c59x.c: Regression since 2.6.36 (incl.), for e. g., TCP stalls on receive
From: Andrew Morton @ 2011-02-07 23:09 UTC (permalink / raw)
  To: netdev
  Cc: bugzilla-daemon, bugme-daemon, Jan Beulich, Ben Hutchings,
	Neil Horman, Steffen Klassert, for.poige+bugzilla.kernel.org
In-Reply-To: <bug-28542-10286@https.bugzilla.kernel.org/>

On Mon, 7 Feb 2011 18:55:29 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=28542
> 
>            Summary: 3c59x.c: Regression since 2.6.36 (incl.), for e. g.,
>                     TCP stalls on receive
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.36—2.6.37
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org
>         ReportedBy: for.poige+bugzilla.kernel.org@gmail.com
>         Regression: Yes
> 
> 
> Dump of kernel source bz2 download being stalled:
> http://poige.livejournal.com/475372.html#cutid1 (see the bunch of ACKs
> #5555523).
> 
> As a workaround I've just replaced drivers/net/3c59x.c with its former
> (2.6.35.11) version, the connectivity problems are gone now.
> 



* Re: [Bugme-new] [Bug 28552] New: ipheth stopped working between 2.6.35.9 and 2.6.36.2
From: Andrew Morton @ 2011-02-07 23:07 UTC (permalink / raw)
  To: linux-usb, netdev; +Cc: bugzilla-daemon, bugme-daemon, linux-kernel
In-Reply-To: <bug-28552-10286@https.bugzilla.kernel.org/>


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 7 Feb 2011 19:24:29 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=28552
> 
>            Summary: ipheth stopped working between 2.6.35.9 and 2.6.36.2
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.36.2
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Network
>         AssignedTo: drivers_network@kernel-bugs.osdl.org
>         ReportedBy: linux-kernel@jmbreuer.net
>         Regression: Yes
> 
> 
> Created an attachment (id=46762)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=46762)
> tcpdump of iPhone tethering attempt in 2.6.36.2
> 
> Overview:
> 
>     Internet connection via iPhone 3G ("USB tethering") no longer works in
> 2.6.36.2; it appears that the packets coming from the iPhone are
> misinterpreted.
> 
> Steps to Reproduce:
> 
>     1) Connect iPhone to computer via USB cable
> 
>     2) Put iPhone in "tethering" mode
> 
> Actual Results:
> 
>     The DHCP client times out trying to obtain a lease.
> 
> Expected Results: 
> 
>     The computer gets IP information via DHCP, and is able to share the
> iPhone's internet connection.
> 
> Build Date & Platform: 
> 
>     2.6.36.2 on x86_64
>     iPhone 3G Firmware 3.1.3
> 
> Additional Builds and Platforms:
> 
>     2.6.35.9 on x86_64 (same hardware) works as expected.
> 
> Additional Information:
> 
>     Attached are 2 tcpdump logs:
>       2.6.36.2-tcpdump.txt in the failure case
>       2.6.35.9-tcpdump.txt, working connection setup
> 
> Comparing the DHCP reply at 19:53:36.581403 (works) to 19:05:05.744666 (fails)
> it appears that 2.6.36.2 misinterprets the data coming from the iPhone, so that
> the failing packet is not recognized as an IP packet / valid DHCP reply.
> 
> The change to ipheth.c itself between 2.6.36 and 2.6.35 seems trivial:
> http://lxr.free-electrons.com/diff/drivers/net/usb/ipheth.c?v=2.6.35;diffval=2.6.36;diffvar=v
> 
> Therefore I'd at first tried copying 2.6.35's ipheth.c into my 2.6.36 tree and
> rebuilding this driver; it appears to fail the same as vanilla 2.6.36 does
> [sorry, no logs - I can repeat this if required].
> 
> This suggests that the failure happens somewhere in either the networking or
> USB layer, not the ipheth.c driver itself.
> 
> I'm happy to perform any additional testing required.
> 



* Re: [PATCH] tcp: undo_retrans counter fixes
From: David Miller @ 2011-02-07 23:05 UTC (permalink / raw)
  To: ycheng; +Cc: netdev, ilpo.jarvinen
In-Reply-To: <1297119424-19956-1-git-send-email-ycheng@google.com>

From: Yuchung Cheng <ycheng@google.com>
Date: Mon,  7 Feb 2011 14:57:04 -0800

> Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
> not set or undo_retrans is already 0. This happens when sender receives
> more DSACK ACKs than packets retransmitted during the current
> undo phase. This may also happen when sender receives DSACK after
> the undo operation is completed or cancelled.
> 
> Fix another bug that undo_retrans is incorrectly incremented when
> sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
> is rare but not impossible.
> 
> Signed-off-by: Yuchung Cheng <ycheng@google.com>

Looks good, Ilpo could you please review this real quick?

Thanks.


* [PATCH] tcp: undo_retrans counter fixes
From: Yuchung Cheng @ 2011-02-07 22:57 UTC (permalink / raw)
  To: netdev; +Cc: ilpo.jarvinen, Yuchung Cheng

Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
not set or undo_retrans is already 0. This happens when sender receives
more DSACK ACKs than packets retransmitted during the current
undo phase. This may also happen when sender receives DSACK after
the undo operation is completed or cancelled.

Fix another bug that undo_retrans is incorrectly incremented when
sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
is rare but not impossible.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_input.c  |    5 +++--
 net/ipv4/tcp_output.c |    2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2f692ce..08ea735 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1222,7 +1222,7 @@ static int tcp_check_dsack(struct sock *sk, struct sk_buff *ack_skb,
 	}
 
 	/* D-SACK for already forgotten data... Do dumb counting. */
-	if (dup_sack &&
+	if (dup_sack && tp->undo_marker && tp->undo_retrans &&
 	    !after(end_seq_0, prior_snd_una) &&
 	    after(end_seq_0, tp->undo_marker))
 		tp->undo_retrans--;
@@ -1299,7 +1299,8 @@ static u8 tcp_sacktag_one(struct sk_buff *skb, struct sock *sk,
 
 	/* Account D-SACK for retransmitted packet. */
 	if (dup_sack && (sacked & TCPCB_RETRANS)) {
-		if (after(TCP_SKB_CB(skb)->end_seq, tp->undo_marker))
+		if (tp->undo_marker && tp->undo_retrans &&
+		    after(TCP_SKB_CB(skb)->end_seq, tp->undo_marker))
 			tp->undo_retrans--;
 		if (sacked & TCPCB_SACKED_ACKED)
 			state->reord = min(fack_count, state->reord);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 406f320..dfa5beb 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2162,7 +2162,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 		if (!tp->retrans_stamp)
 			tp->retrans_stamp = TCP_SKB_CB(skb)->when;
 
-		tp->undo_retrans++;
+		tp->undo_retrans += tcp_skb_pcount(skb);
 
 		/* snd_nxt is stored to detect loss of retransmitted segment,
 		 * see tcp_input.c tcp_sacktag_write_queue().
-- 
1.7.3.1
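
The two counter rules in the patch can be exercised in isolation. The following is a simplified userspace sketch, not the kernel code: `struct undo_state`, `undo_account_dsack()` and `undo_account_retrans()` are hypothetical names, and the sequence-number checks (`after(...)`) from the real hunks are left out so that only the guard and the per-segment increment remain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two tcp_sock fields the patch touches. */
struct undo_state {
	uint32_t undo_marker;  /* 0 means no undo phase is active */
	int undo_retrans;      /* retransmitted segments not yet DSACKed */
};

/* Account one DSACK for retransmitted data.
 * Returns 1 if the counter was decremented, 0 if the DSACK was ignored
 * because no undo phase is active or the counter is already 0. */
int undo_account_dsack(struct undo_state *st)
{
	assert(st != NULL);
	if (!st->undo_marker || !st->undo_retrans)
		return 0;
	st->undo_retrans--;
	return 1;
}

/* Account the retransmission of one skb that covers pcount segments
 * (pcount > 1 corresponds to the TSO case in the changelog). */
void undo_account_retrans(struct undo_state *st, int pcount)
{
	assert(st != NULL && pcount >= 1);
	st->undo_retrans += pcount;
}
```

With the guard in place, a surplus DSACK or one that arrives after the undo phase has ended leaves the counter untouched instead of driving it below zero.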



* Re: [PATCH v4 0/5] net: Unified offload configuration
From: David Miller @ 2011-02-07 22:52 UTC (permalink / raw)
  To: mirq-linux; +Cc: netdev, bhutchings
In-Reply-To: <20110207224937.GA32549@rere.qmqm.pl>

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date: Mon, 7 Feb 2011 23:49:37 +0100

> What driver did you get this output from?

NIU, which sets NETIF_F_HW_CSUM in netdev->features, which is
absolutely correct.


* Re: [PATCH v4 0/5] net: Unified offload configuration
From: Michał Mirosław @ 2011-02-07 22:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bhutchings
In-Reply-To: <20110207.133721.48496023.davem@davemloft.net>

On Mon, Feb 07, 2011 at 01:37:21PM -0800, David Miller wrote:
> From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Date: Thu,  3 Feb 2011 15:21:21 +0100 (CET)
> > Here's a v4 of the ethtool unification patch series.
[cut list]
> After these changes the ethtool output is now inaccurate for
> RX checksumming.
> 
> Before:
> 
> davem@maramba:~$ /usr/sbin/ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
[...]
 
> After:
> 
> davem@maramba:~$ /usr/sbin/ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: off
> tx-checksumming: on
[...]
> 
> If the issue is that you require driver or ethtool utility changes in
> order for things to keep working properly, then that is not
> acceptable.

> I'm reverting all of these changes, resubmit them when you have them
> in a state such that no regressions will be introduced.

What driver did you get this output from?  All drivers that implement
set_rx_csum also implement their own get_rx_csum and so should not be
affected by patch #4. For others that don't implement get_rx_csum
rx-checksumming status is unreliable. Looking at random drivers:
 - via-rhine: will advertise rx-checksumming when it doesn't support
	it (as a side effect of hardware workaround - checksum in driver)
 - sunhme: has no way to disable rx- and tx-checksumming so was
	correctly showing rx-checksumming enabled
 - 8139too: will show rx-checksumming enabled but doesn't support it
	(side effect of hardware workaround - checksumming in driver)

I wouldn't be surprised if there was a driver which doesn't advertise
RX checksumming but uses it anyway.

So yes - this patchset uncovers bugs in drivers. I'll see how I can
make the RXCSUM patch retain the previous behaviour for this case.

Best Regards,
Michał Mirosław
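
One plausible reading of the before/after difference discussed in this thread can be sketched as follows. Everything here is hypothetical and is not the ethtool core: `struct toy_dev`, the `TOY_F_*` bits and both `toy_rx_csum_*()` helpers are made-up names. The sketch assumes that a driver without its own getter used to have its RX status inferred from the TX checksum feature, and that the new code instead reads a dedicated RX checksum bit which such a driver never set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits, not the kernel's NETIF_F_* values. */
#define TOY_F_TX_CSUM (1u << 0)
#define TOY_F_RX_CSUM (1u << 1)

/* Hypothetical device: a feature mask plus an optional driver getter. */
struct toy_dev {
	unsigned int features;
	int (*get_rx_csum)(const struct toy_dev *dev);
};

/* Assumed old behaviour: fall back to the TX checksum feature when the
 * driver does not provide its own getter. */
int toy_rx_csum_before(const struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->get_rx_csum)
		return dev->get_rx_csum(dev);
	return (dev->features & TOY_F_TX_CSUM) != 0;
}

/* Assumed new behaviour: report the dedicated RX checksum bit. */
int toy_rx_csum_after(const struct toy_dev *dev)
{
	assert(dev != NULL);
	if (dev->get_rx_csum)
		return dev->get_rx_csum(dev);
	return (dev->features & TOY_F_RX_CSUM) != 0;
}
```

Under these assumptions a device that only advertises TX checksumming flips from "on" to "off" in the report although the hardware behaviour is unchanged, which matches the kind of regression shown in the quoted output.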


* Re: [PATCH 1/2] CDC NCM errata updates for cdc.h
From: Greg KH @ 2011-02-07 22:03 UTC (permalink / raw)
  To: Alexey Orishko
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, gregkh-l3A5Bk7waGM,
	yauheni.kaliuta-xNZwKgViW5gAvxtiuMwx3w, Alexey Orishko
In-Reply-To: <1297107910-18263-1-git-send-email-alexey.orishko-0IS4wlFg1OjSUeElwK9/Pw@public.gmane.org>

On Mon, Feb 07, 2011 at 08:45:09PM +0100, Alexey Orishko wrote:
> Changes are based on the following documents:
> - CDC NCM errata:
> http://www.usb.org/developers/devclass_docs/NCM10_012011.zip
> - CDC and WMC errata link:
> http://www.usb.org/developers/devclass_docs/CDC1.2_WMC1.1_012011.zip
> 
> Signed-off-by: Alexey Orishko <alexey.orishko-0IS4wlFg1OjSUeElwK9/Pw@public.gmane.org>

Acked-by: Greg Kroah-Hartman <gregkh-l3A5Bk7waGM@public.gmane.org>


* Re: x25: possible skb leak on bad facilities
From: David Miller @ 2011-02-07 21:42 UTC (permalink / raw)
  To: andrew.hendry; +Cc: apw, john, linux-x25, netdev, linux-kernel, tim.gardner
In-Reply-To: <1297073295.9577.13.camel@jaunty>

From: Andrew Hendry <andrew.hendry@gmail.com>
Date: Mon, 07 Feb 2011 21:08:15 +1100

> 
> Originally x25_parse_facilities returned
> -1 for an error
>  0 meaning 0 length facilities
> >0 the length of the facilities parsed.
> 
> 5ef41308f94dc introduced more error checking in x25_parse_facilities
> however used 0 to indicate bad parsing
> a6331d6f9a429 followed this further for DTE facilities, again using 0 for bad parsing.
> 
> The meaning of 0 got confused in the callers.
> If the facilities are messed up we can't determine where the data starts.
> So patch makes all parsing errors return -1 and ensures callers close and don't use the skb further.
> 
> Reported-by: Andy Whitcroft <apw@canonical.com>
> Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>

Please reference the commit header line text when referring to SHA1
IDs, because when backporting to other GIT trees the SHA1 IDs might be
different.

I took care of this when applying your patch, thanks.
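
The return-value convention described in the quoted changelog can be sketched in userspace. This is a toy model, not `x25_parse_facilities()`: `toy_parse_facilities()` and `toy_data_offset()` are hypothetical names, and the facilities block is assumed to be a single length byte followed by that many bytes. The point is that 0 stays a valid "no facilities" result while every parsing error returns -1, so the caller can stop using the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parser: buf[0] holds the length of the facilities block
 * that follows it.
 * Returns -1 on a malformed block, 0 for an empty block, and the block
 * length (> 0) otherwise. */
int toy_parse_facilities(const unsigned char *buf, size_t buf_len)
{
	size_t fac_len;

	assert(buf != NULL);
	if (buf_len < 1)
		return -1;              /* no room for the length byte */
	fac_len = buf[0];
	if (fac_len > buf_len - 1)
		return -1;              /* block claims more than is present */
	return (int)fac_len;
}

/* Hypothetical caller: returns the offset at which user data starts, or
 * -1 when the facilities are malformed and the packet must be dropped
 * because the start of the data cannot be determined. */
int toy_data_offset(const unsigned char *buf, size_t buf_len)
{
	int len = toy_parse_facilities(buf, buf_len);

	if (len < 0)
		return -1;
	return 1 + len;                 /* skip length byte and facilities */
}
```

Keeping 0 and -1 distinct is what lets the caller tell "nothing to skip" apart from "cannot locate the data".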


* Re: [PATCH v4 0/5] net: Unified offload configuration
From: David Miller @ 2011-02-07 21:37 UTC (permalink / raw)
  To: mirq-linux; +Cc: netdev, bhutchings
In-Reply-To: <cover.1296741561.git.mirq-linux@rere.qmqm.pl>

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date: Thu,  3 Feb 2011 15:21:21 +0100 (CET)

> Here's a v4 of the ethtool unification patch series.
> 
> What's in it?
>  1:
> 	the patch - implement unified ethtool setting ops
>  2..3:
> 	implement interoperation between old and new ethtool ops
>  4:
> 	include RX checksum in features and plug it into new framework
>  5:
> 	convert loopback device to new framework

After these changes the ethtool output is now inaccurate for
RX checksumming.

Before:

davem@maramba:~$ /usr/sbin/ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off
davem@maramba:~$ 

After:

davem@maramba:~$ /usr/sbin/ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: off
udp fragmentation offload: off
generic segmentation offload: on
large receive offload: off

If the issue is that you require driver or ethtool utility changes in
order for things to keep working properly, then that is not
acceptable.

I'm reverting all of these changes, resubmit them when you have them
in a state such that no regressions will be introduced.

Thanks.

^ permalink raw reply

* Re: [PATCH v4 5/5] loopback: convert to hw_features
From: David Miller @ 2011-02-07 21:18 UTC (permalink / raw)
  To: mirq-linux; +Cc: netdev, bhutchings
In-Reply-To: <38aaa7ba1c3fa19ec27536ccdf44c7368d1b21ac.1296741562.git.mirq-linux@rere.qmqm.pl>

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date: Thu,  3 Feb 2011 15:21:22 +0100 (CET)

> This also enables TSOv6, TSO-ECN, and UFO as loopback clearly can handle them.
> 
> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>

Applied.

^ permalink raw reply

* Re: [PATCH] bonding/vlan: Avoid mangled NAs on slaves without VLAN tag insertion
From: David Miller @ 2011-02-07 21:17 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, fubar, stable, bonding-devel
In-Reply-To: <1297106455.4077.7.camel@bwh-desktop>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 07 Feb 2011 19:20:55 +0000

> This is related to commit f88a4a9b65a6f3422b81be995535d0e69df11bb8
> upstream, but the bug cannot be properly fixed without the other
> changes to VLAN tagging in 2.6.37.
> 
> bond_na_send() attempts to insert a VLAN tag in between building and
> sending packets of the respective formats.  If the slave does not
> implement hardware VLAN tag insertion then vlan_put_tag() will mangle
> the network-layer header because the Ethernet header is not present at
> this point (unlike in bond_arp_send()).
> 
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply

* Re: [PATCH v4 4/5] net: introduce NETIF_F_RXCSUM
From: David Miller @ 2011-02-07 21:12 UTC (permalink / raw)
  To: mirq-linux; +Cc: netdev, bhutchings
In-Reply-To: <629a5e9cc87a171997e611b8227d58cfe4fbe6ff.1296741562.git.mirq-linux@rere.qmqm.pl>

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date: Thu,  3 Feb 2011 15:21:22 +0100 (CET)

> Introduce NETIF_F_RXCSUM to replace device-private flags for RX checksum
> offload. Integrate it with ndo_fix_features.
> 
> ethtool_op_get_rx_csum() is removed altogether as nothing in-tree uses it.
> 
> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>                                                    

Applied.

^ permalink raw reply

* Re: [PATCH v4 3/5] net: use ndo_fix_features for ethtool_ops->set_flags
From: David Miller @ 2011-02-07 21:03 UTC (permalink / raw)
  To: bhutchings; +Cc: mirq-linux, netdev
In-Reply-To: <1297107972.4077.11.camel@bwh-desktop>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 07 Feb 2011 19:46:12 +0000

> On Thu, 2011-02-03 at 15:21 +0100, Michał Mirosław wrote:
>> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
>> ---
>>  net/core/ethtool.c |   31 +++++++++++++++++++++++++++++--
>>  1 files changed, 29 insertions(+), 2 deletions(-)
>> 
>> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
>> index 555accf..6e7c6f2 100644
>> --- a/net/core/ethtool.c
>> +++ b/net/core/ethtool.c
>> @@ -240,6 +240,34 @@ static int ethtool_set_features(struct net_device *dev, void __user *useraddr)
>>  	return ret;
>>  }
>>  
>> +static int __ethtool_set_flags(struct net_device *dev, u32 data)
>> +{
>> +	u32 changed;
>> +
>> +	if (data & ~flags_dup_features)
>> +		return -EINVAL;
>> +
>> +	/* legacy set_flags() op */
>> +	if (dev->ethtool_ops->set_flags) {
>> +		if (unlikely(dev->hw_features & flags_dup_features))
>> +			netdev_warn(dev,
>> +				"driver BUG: mixed hw_features and set_flags()\n");
>> +		return dev->ethtool_ops->set_flags(dev, data);
>> +	}
>> +
>> +	/* allow changing only bits set in hw_features */
>> +	changed = (data ^ dev->wanted_features) & flags_dup_features;
>> +	if (changed & ~dev->hw_features)
>> +		return -EOPNOTSUPP;
> [...]
> 
> The error code should only be EOPNOTSUPP if (dev->hw_features &
> flags_dup_features) == 0.  Otherwise it should be EINVAL.

I'll fix this up when I apply his patch, thanks Ben.
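
Ben's suggested error-code rule can be sketched as a small standalone function. This is a sketch of the rule as stated in his review, not the code that was eventually applied: check_flags_change() is a hypothetical name, and flags_mask stands in for flags_dup_features from the quoted patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a legacy set_flags request may
 * proceed. flags_mask plays the role of flags_dup_features.
 */
static int check_flags_change(uint32_t hw_features, uint32_t wanted_features,
			      uint32_t flags_mask, uint32_t data)
{
	uint32_t changed;

	if (data & ~flags_mask)
		return -EINVAL;		/* bits outside the legacy flag set */

	/* only bits set in hw_features may be changed */
	changed = (data ^ wanted_features) & flags_mask;
	if (changed & ~hw_features) {
		/*
		 * If none of these flags is toggleable, the operation as a
		 * whole is unsupported; otherwise only this particular
		 * request is invalid.
		 */
		return (hw_features & flags_mask) ? -EINVAL : -EOPNOTSUPP;
	}
	return 0;
}
```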

^ permalink raw reply

* Re: Oops in tcp_output.c, kernel 2.6.38-rc3
From: David Miller @ 2011-02-07 21:02 UTC (permalink / raw)
  To: cebbert; +Cc: netdev, ilpo.jarvinen
In-Reply-To: <20110204153254.5c37c6f2@katamari>

From: Chuck Ebbert <cebbert@redhat.com>
Date: Fri, 4 Feb 2011 15:32:54 -0500

> Analysis is below. (From https://bugzilla.redhat.com/show_bug.cgi?id=674622)
> 
>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>  IP: [<ffffffff81407b16>] tcp_write_xmit+0x694/0x7af

I bet this is some kind of bug in the driver or similar, what device
and also what kind of network config (netfilter, packet scheduler,
interface, routes, etc.) is this?

^ permalink raw reply

* Re: [PATCH v4 2/5] net: ethtool: use ndo_fix_features for offload setting
From: David Miller @ 2011-02-07 21:01 UTC (permalink / raw)
  To: mirq-linux; +Cc: netdev, bhutchings
In-Reply-To: <ca3bcc3bec8779b67b476d5d7325ea9fbbf54308.1296741561.git.mirq-linux@rere.qmqm.pl>

From: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date: Thu,  3 Feb 2011 15:21:21 +0100 (CET)

> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>                                                    

Applied, and now I see why the tree "built" successfully for
you.

You remove the duplicate EXPORT_SYMBOL() in this patch.

You absolutely cannot test your patch sets like this, only
build testing at the end.

Every single individual change must not introduce any functional
or build regressions, therefore you must make sure the build works
fine after each and every patch in your series, not just after they
are all applied.

What disturbs me even more, is that really this problem was introduced
because you mixed functional and cleanup changes in the first patch.
Something you should also never do.

^ permalink raw reply

* Re: [PATCH v4 1/5] net: Introduce new feature setting ops
From: David Miller @ 2011-02-07 20:55 UTC (permalink / raw)
  To: bhutchings; +Cc: mirq-linux, netdev
In-Reply-To: <20110207.125120.71110722.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Mon, 07 Feb 2011 12:51:20 -0800 (PST)

> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Mon, 07 Feb 2011 19:39:57 +0000
> 
>> On Thu, 2011-02-03 at 15:21 +0100, Michał Mirosław wrote:
>>> This introduces a new framework to handle device features setting.
>>> It consists of:
>>>   - new fields in struct net_device:
>>> 	+ hw_features - features that hw/driver supports toggling
>>> 	+ wanted_features - features that user wants enabled, when possible
>>>   - new netdev_ops:
>>> 	+ feat = ndo_fix_features(dev, feat) - API checking constraints for
>>> 		enabling features or their combinations
>>> 	+ ndo_set_features(dev) - API updating hardware state to match
>>> 		changed dev->features
>>>   - new ethtool commands:
>>> 	+ ETHTOOL_GFEATURES/ETHTOOL_SFEATURES: get/set dev->wanted_features
>>> 		and trigger device reconfiguration if resulting dev->features
>>> 		changed
>>> 	+ ETHTOOL_GSTRINGS(ETH_SS_FEATURES): get feature bits names (meaning)
>>> 
>>> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
>> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
> 
> Applied, thanks.

I had to make a fix to this patch, there were duplicate EXPORT_SYMBOL()
lines in net/core/ethtool.c for ethtool_op_set_tx_csum() after your
changes.

How did this build successfully for you?

^ permalink raw reply

* Re: [PATCH v4 1/5] net: Introduce new feature setting ops
From: David Miller @ 2011-02-07 20:51 UTC (permalink / raw)
  To: bhutchings; +Cc: mirq-linux, netdev
In-Reply-To: <1297107597.4077.8.camel@bwh-desktop>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 07 Feb 2011 19:39:57 +0000

> On Thu, 2011-02-03 at 15:21 +0100, Michał Mirosław wrote:
>> This introduces a new framework to handle device features setting.
>> It consists of:
>>   - new fields in struct net_device:
>> 	+ hw_features - features that hw/driver supports toggling
>> 	+ wanted_features - features that user wants enabled, when possible
>>   - new netdev_ops:
>> 	+ feat = ndo_fix_features(dev, feat) - API checking constraints for
>> 		enabling features or their combinations
>> 	+ ndo_set_features(dev) - API updating hardware state to match
>> 		changed dev->features
>>   - new ethtool commands:
>> 	+ ETHTOOL_GFEATURES/ETHTOOL_SFEATURES: get/set dev->wanted_features
>> 		and trigger device reconfiguration if resulting dev->features
>> 		changed
>> 	+ ETHTOOL_GSTRINGS(ETH_SS_FEATURES): get feature bits names (meaning)
>> 
>> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>

Applied, thanks.
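
One plausible reading of how the fields and ops in the quoted description fit together is sketched below. It is not the applied patch: struct dev_sketch, fix_features(), update_features() and the F_* bit values are hypothetical, and the "TSO needs scatter-gather" constraint is used only as an example of what a fix_features hook might enforce.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define F_SG		(1u << 0)
#define F_TSO		(1u << 1)
#define F_RXCSUM	(1u << 2)

struct dev_sketch {
	uint32_t features;		/* currently active features */
	uint32_t hw_features;		/* features the user may toggle */
	uint32_t wanted_features;	/* what the user asked for */
};

/* Stand-in for ndo_fix_features(): drop combinations that cannot work. */
static uint32_t fix_features(uint32_t feat)
{
	if ((feat & F_TSO) && !(feat & F_SG))
		feat &= ~F_TSO;		/* example constraint: TSO needs SG */
	return feat;
}

/* Recompute features; returns 1 if they changed, 0 otherwise. */
static int update_features(struct dev_sketch *dev)
{
	uint32_t feat;

	/* non-toggleable bits stay as they are; toggleable bits follow wanted */
	feat = (dev->features & ~dev->hw_features) |
	       (dev->wanted_features & dev->hw_features);
	feat = fix_features(feat);

	if (feat == dev->features)
		return 0;
	/* a real driver's ndo_set_features() would reprogram hardware here */
	dev->features = feat;
	return 1;
}
```

Storing the user's request separately from the active set lets a wanted feature come on later, once the constraint blocking it is lifted.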

^ permalink raw reply
