Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2 3/3] RFC tcp: early retransmit: delayed fast retransmit
From: Neal Cardwell @ 2012-05-01 17:55 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: davem, ilpo.jarvinen, nanditad, edumazet, netdev
In-Reply-To: <1335638781-960-3-git-send-email-ycheng@google.com>

On Sat, Apr 28, 2012 at 2:46 PM, Yuchung Cheng <ycheng@google.com> wrote:
> Implementing the advanced early retransmit (sysctl_tcp_early_retrans==2).
> Delays the fast retransmit by an interval of RTT/4. We borrow the
> RTO timer to implement the delay. If we receive another ACK or send
> a new packet, the timer is cancelled and restored to original RTO
> value offset by time elapsed.  When the delayed-ER timer fires,
> we enter fast recovery and perform fast retransmit.
>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> ---
>  include/linux/tcp.h   |    4 +-
>  include/net/tcp.h     |    3 ++
>  net/ipv4/tcp_input.c  |   69 ++++++++++++++++++++++++++++++++++++++++++++-----
>  net/ipv4/tcp_output.c |    5 +--
>  net/ipv4/tcp_timer.c  |    5 +++
>  5 files changed, 74 insertions(+), 12 deletions(-)

Acked-by: Neal Cardwell <ncardwell@google.com>

neal

^ permalink raw reply

* Re: [PATCH v2 1/3] RFC tcp: early retransmit: tcp_enter_recovery()
From: Neal Cardwell @ 2012-05-01 17:54 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: davem, ilpo.jarvinen, nanditad, edumazet, netdev
In-Reply-To: <1335638781-960-1-git-send-email-ycheng@google.com>

On Sat, Apr 28, 2012 at 2:46 PM, Yuchung Cheng <ycheng@google.com> wrote:
> This a prepartion patch that refactors the code to enter recovery
> into a new function tcp_enter_recovery(). It's needed to implement
> the delayed fast retransmit in ER.
>
> Signed-off-by: Yuchung Cheng <ycheng@google.com>
> ---
> ChangeLog in v2:
>  Fix sysctl_tcp_early_retrans default to 2
>  Re-arrange the patch parts so they are individually compilable and functional.
>
>  net/ipv4/tcp_input.c |   61 +++++++++++++++++++++++++++----------------------
>  1 files changed, 34 insertions(+), 27 deletions(-)

Acked-by: Neal Cardwell <ncardwell@google.com>

BTW, I think this is a nice clean-up, independent of early retransmit.

neal

^ permalink raw reply

* Re: [PATCH v2 2/3] RFC tcp: early retransmit
From: Neal Cardwell @ 2012-05-01 17:53 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: davem, ilpo.jarvinen, nanditad, edumazet, netdev
In-Reply-To: <1335638781-960-2-git-send-email-ycheng@google.com>

On Sat, Apr 28, 2012 at 2:46 PM, Yuchung Cheng <ycheng@google.com> wrote:
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt
> @@ -202,6 +202,20 @@ tcp_ecn - INTEGER
>                  not support ECN, behavior is like with ECN disabled.
>        Default: 2
>
> +tcp_early_retrans - INTEGER
> +       Enable Early Retransmit (ER), per RFC 5827. ER lowers the threshold

I get a trailing-whitespace error in the documentation when I apply this patch:

> git am /tmp/er2.patch
Applying: RFC tcp: early retransmit
.../.git/rebase-apply/patch:20:
trailing whitespace.
       Enable Early Retransmit (ER), per RFC 5827. ER lowers the threshold
warning: 1 line adds whitespace errors.

Also, to keep the sysctls alphabetical, tcp_early_retrans should probably go
before tcp_ecn.

> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -365,12 +365,13 @@ struct tcp_sock {
>
>        u32     frto_highmark;  /* snd_nxt when RTO occurred */
>        u16     advmss;         /* Advertised MSS                       */
> -       u8      frto_counter;   /* Number of new acks after RTO */
> -       u8      nonagle     : 4,/* Disable Nagle algorithm?             */
> +       u16     nonagle     : 4,/* Disable Nagle algorithm?             */
>                thin_lto    : 1,/* Use linear timeouts for thin streams */
>                thin_dupack : 1,/* Fast retransmit on first dupack      */
>                repair      : 1,
> -               unused      : 1;
> +               do_early_retrans: 1;/* Enable RFC5827 early-retransmit  */
> +
> +       u8      frto_counter;   /* Number of new acks after RTO */
>        u8      repair_queue;

To keep the change minimal and reduce the risk of mysterious
performance regressions from cache effects, I'd suggest keeping the
frto_counter and nonagle u8 bytes as u8 bytes in their current
location, and add a new u8 for the two ER bits. Same amount of space
as the scheme in the patch, just less shuffling.

> @@ -987,6 +989,7 @@ static void tcp_update_reordering(struct sock *sk, const int metric,
>                       tp->undo_marker ? tp->undo_retrans : 0);
>  #endif
>                tcp_disable_fack(tp);
> +               tcp_disable_early_retrans(tp);
>        }
>  }

I think we should stick with the behavior where we disable early
retransmit any time tcp_update_reordering() is called with a non-zero
reordering metric. This is what we've tested and measured, and my
sense is that we could risk a significant number of spurious ER
firings if instead we relax this so that only reordering >3 causes us
to disable ER. I know the delayed ER should help avoid spurious ER
firings when there is a small degree of reordering, but my guess would
be that the max(RTT/4, 2ms) is perhaps not big enough if we're
allowing delayed ER for connections that have already witnessed small
degrees of reordering. So until we have more experimental data, I'd
recommend sticking with:

	if (metric > 0)
		tcp_disable_early_retrans(tp);

Otherwise, looks good to me.

neal

^ permalink raw reply

* Re: [PATCH] bridge: make brctl showstp display port id
From: Joakim Tjernlund @ 2012-05-01 17:47 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
In-Reply-To: <20120501103955.1553a9cb@s6510.linuxnetplumber.net>

Stephen Hemminger <shemminger@vyatta.com> wrote on 2012/05/01 19:39:55:
>
> > Stephen Hemminger <shemminger@vyatta.com> wrote on 2012/04/30 18:07:47:
> > >
> > > On Mon, 30 Apr 2012 10:39:08 +0200
> > > Joakim Tjernlund <Joakim.Tjernlund@transmode.se> wrote:
> > >
> > > > My brctl showstp br0 always shows a 0 port id:
> > > > eth2 (1)
> > > >  port id      0         state             disabled
> > > >  designated root   8000.00069c00b2fb   path cost       100
> > > >
> > > > because port id is printed as a hex number in sys fs. Change the
> > > > two hex occurrences(port no and port id) to decimal, just like all
> > > > the other numbers in this area.
> > > >
> > > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > > > ---
> > > >  net/bridge/br_sysfs_if.c |    4 ++--
> > > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
> > > > index fd5799c..9c4c2eb 100644
> > > > --- a/net/bridge/br_sysfs_if.c
> > > > +++ b/net/bridge/br_sysfs_if.c
> > > > @@ -86,13 +86,13 @@ static BRPORT_ATTR(designated_cost, S_IRUGO, show_designated_cost, NULL);
> > > >
> > > >  static ssize_t show_port_id(struct net_bridge_port *p, char *buf)
> > > >  {
> > > > -   return sprintf(buf, "0x%x\n", p->port_id);
> > > > +   return sprintf(buf, "%d\n", p->port_id);
> > > >  }
> > > >  static BRPORT_ATTR(port_id, S_IRUGO, show_port_id, NULL);
> > > >
> > > >  static ssize_t show_port_no(struct net_bridge_port *p, char *buf)
> > > >  {
> > > > -   return sprintf(buf, "0x%x\n", p->port_no);
> > > > +   return sprintf(buf, "%d\n", p->port_no);
> > > >  }
> > > >
> > > >  static BRPORT_ATTR(port_no, S_IRUGO, show_port_no, NULL);
> > >
> > > No. This would be a visible change to applications.
> > > The bridge utilities should be fixed instead.
> >
> > Well, my reasoning was that if not the native tools got this right, what are the
> > chances that other apps got it right? I don't know any other apps than
> > brctl using this info, do you? We could check one or two apps to see what they do.
> >
> >  Jocke
> >
>
> Current bridge-utils does not have the issue. It has been fixed
> for 5 years. The problem was fixed by:
>
> commit 0397c6aa48769505e6aa4481f0786c8f4eaaf890
> Author: Jeremy Jackson <jerj@coplanar.net>
> Date:   Tue May 8 11:10:34 2007 -0700
>
>     I've noticed for a while that
>
>     output is showing 0 for port_no and  port_id
>
>     It seems that somewhere in 2.6 sysfs land the following items got
>     printed in hexadecimal, and brctl code was parsing for decimal only
>
>     doug:/sys/class/net/eth0/brport# cat port_id
>     0x8001
>     doug:/sys/class/net/eth0/brport# cat port_no
>     0x1


Yeah, just noticed that too. On our embedded board we have the release just before that fix :(

 Jocke

^ permalink raw reply

* Re: [PATCH] netem: fix possible skb leak
From: David Miller @ 2012-05-01 17:41 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, shemminger
In-Reply-To: <1335726502.3897.8.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 29 Apr 2012 21:08:22 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> skb_checksum_help(skb) can return an error, we must free skb in this
> case. qdisc_drop(skb, sch) can also be feeded with a NULL skb (if
> skb_unshare() failed), so lets use this generic helper.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH] bridge: make brctl showstp display port id
From: Stephen Hemminger @ 2012-05-01 17:39 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: netdev@vger.kernel.org
In-Reply-To: <OF4DD70AEF.9EE04586-ONC12579F1.00502FF3-C12579F1.00506F9C@transmode.se>

On Tue, 1 May 2012 16:38:34 +0200
Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:

> Stephen Hemminger <shemminger@vyatta.com> wrote on 2012/04/30 18:07:47:
> >
> > On Mon, 30 Apr 2012 10:39:08 +0200
> > Joakim Tjernlund <Joakim.Tjernlund@transmode.se> wrote:
> >
> > > My brctl showstp br0 always shows a 0 port id:
> > > eth2 (1)
> > >  port id      0         state             disabled
> > >  designated root   8000.00069c00b2fb   path cost       100
> > >
> > > because port id is printed as a hex number in sys fs. Change the
> > > two hex occurrences(port no and port id) to decimal, just like all
> > > the other numbers in this area.
> > >
> > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > > ---
> > >  net/bridge/br_sysfs_if.c |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
> > > index fd5799c..9c4c2eb 100644
> > > --- a/net/bridge/br_sysfs_if.c
> > > +++ b/net/bridge/br_sysfs_if.c
> > > @@ -86,13 +86,13 @@ static BRPORT_ATTR(designated_cost, S_IRUGO, show_designated_cost, NULL);
> > >
> > >  static ssize_t show_port_id(struct net_bridge_port *p, char *buf)
> > >  {
> > > -   return sprintf(buf, "0x%x\n", p->port_id);
> > > +   return sprintf(buf, "%d\n", p->port_id);
> > >  }
> > >  static BRPORT_ATTR(port_id, S_IRUGO, show_port_id, NULL);
> > >
> > >  static ssize_t show_port_no(struct net_bridge_port *p, char *buf)
> > >  {
> > > -   return sprintf(buf, "0x%x\n", p->port_no);
> > > +   return sprintf(buf, "%d\n", p->port_no);
> > >  }
> > >
> > >  static BRPORT_ATTR(port_no, S_IRUGO, show_port_no, NULL);
> >
> > No. This would be a visible change to applications.
> > The bridge utilities should be fixed instead.
> 
> Well, my reasoning was that if not the native tools got this right, what are the
> chances that other apps got it right? I don't know any other apps than
> brctl using this info, do you? We could check one or two apps to see what they do.
> 
>  Jocke
> 

Current bridge-utils does not have the issue. It has been fixed
for 5 years. The problem was fixed by:

commit 0397c6aa48769505e6aa4481f0786c8f4eaaf890
Author: Jeremy Jackson <jerj@coplanar.net>
Date:   Tue May 8 11:10:34 2007 -0700

    I've noticed for a while that
    
    output is showing 0 for port_no and  port_id
    
    It seems that somewhere in 2.6 sysfs land the following items got
    printed in hexadecimal, and brctl code was parsing for decimal only
    
    doug:/sys/class/net/eth0/brport# cat port_id
    0x8001
    doug:/sys/class/net/eth0/brport# cat port_no
    0x1

^ permalink raw reply

* Re: [PATCH 0/2] sky2 patches
From: David Miller @ 2012-05-01 17:39 UTC (permalink / raw)
  To: shemminger; +Cc: netdev
In-Reply-To: <20120430154944.890709139@vyatta.com>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Mon, 30 Apr 2012 08:49:44 -0700

> These are both patches against net-next but should be
> applied to -net and stable as well.

Both applied to 'net' and queued up for -stable, thanks.

^ permalink raw reply

* Re: 3.4-rc: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
From: Francois Romieu @ 2012-05-01 17:27 UTC (permalink / raw)
  To: Stefan Richter; +Cc: netdev
In-Reply-To: <20120501182405.3dda59b9@stein>

Stefan Richter <stefanr@s5r6.in-berlin.de> :
[...]
> I switched from 3.3 to 3.4-rc5 yesterday and am getting the following warning
> some time after boot (e.g. this time about two hours after boot).  What's up
> with that?

The device stopped being able to transmit anything at some point (netdev
watchdog message). The warnings always look the same. If the device does
not recover, we have a problem. Do we ?

[...]
> There are no noticeable issues, but I don't run anything demanding on the
> interface right now which would make any downtime obvious.

I don't understand: is it a low traffic interface or a no traffic one ?

> It is an RTL8168(?) on an Asus M3A78-EM motherboard.

dmesg | grep XID should identify the chipset. You may consider sending
an 'ethtool eth0' and 'ethtool -i eth0'.

-- 
Ueimor

^ permalink raw reply

* Re: [PATCH 3/8] Sometimes the ISDN chip only controls the D-channel
From: David Miller @ 2012-05-01 17:30 UTC (permalink / raw)
  To: kkeil; +Cc: netdev
In-Reply-To: <1335613404-10187-4-git-send-email-kkeil@linux-pingi.de>

From: Karsten Keil <kkeil@linux-pingi.de>
Date: Sat, 28 Apr 2012 13:43:19 +0200

> The B-channels are only accessed via the PCM backplane.
> Add infrastruckture for this special mode.
> 
> Signed-off-by: Karsten Keil <kkeil@linux-pingi.de>

I dread reviewing these ISDN patch sets because they are so
full of problems, and it's so damn obvious how little care is
put into preparing them.

What I see is that you put the minimum amount of work necessary
into splitting up your huge ISDN patch set submission into more
managable pieces, and as a result you are introducing problems.

> diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
> index abe2d69..502bcf1 100644
> --- a/drivers/isdn/mISDN/socket.c
> +++ b/drivers/isdn/mISDN/socket.c
> @@ -270,6 +270,7 @@ data_sock_release(struct socket *sock)
>  		break;
>  	case ISDN_P_LAPD_TE:
>  	case ISDN_P_LAPD_NT:
> +	case ISDN_P_B_PCM:
>  	case ISDN_P_B_RAW:
>  	case ISDN_P_B_HDLC:
>  	case ISDN_P_B_X75SLP:

Ok, that's fine.

> @@ -148,6 +149,8 @@ struct bchannel {
>  	u_int			state;
>  	void			*hw;
>  	int			slot;	/* multiport card channel slot */
> +	int			pcm_tx;	/* PCM tx slot nr */
> +	int			pcm_rx;	/* PCM rx slot nr */
>  	struct timer_list	timer;
>  	/* receive data */
>  	struct sk_buff		*rx_skb;

But what the hell is this?  These structure members are unused by
this patch, and in fact no patch in your entire series uses them.

> @@ -360,8 +360,8 @@ clear_channelmap(u_int nr, u_char *map)
>  #define MISDN_CTRL_LOOP			0x0001
>  #define MISDN_CTRL_CONNECT		0x0002
>  #define MISDN_CTRL_DISCONNECT		0x0004
> -#define MISDN_CTRL_PCMCONNECT		0x0010
> -#define MISDN_CTRL_PCMDISCONNECT	0x0020
> +#define MISDN_CTRL_GET_PCM_SLOTS	0x0010
> +#define MISDN_CTRL_SET_PCM_SLOTS	0x0020
>  #define MISDN_CTRL_SETPEER		0x0040
>  #define MISDN_CTRL_UNSETPEER		0x0080
>  #define MISDN_CTRL_RX_OFF		0x0100

Another completely unrelated change, nothing in this patch uses
these new defines.

> @@ -381,6 +381,10 @@ clear_channelmap(u_int nr, u_char *map)
>  #define MISDN_CTRL_HFC_WD_INIT		0x4009
>  #define MISDN_CTRL_HFC_WD_RESET		0x400A
>  
> +/* special PCM slot numbers */
> +#define MISDN_PCM_SLOT_DISABLE	-1	/* PCM disabled */
> +#define MISDN_PCM_SLOT_IGNORE	-2	/* PCM setting will be not changed */
> +
>  /* socket options */
>  #define MISDN_TIME_STAMP		0x0001
>  

Same thing.

> @@ -389,6 +393,7 @@ struct mISDN_ctrl_req {
>  	int		channel;
>  	int		p1;
>  	int		p2;
> +	int		p3;
>  };
>  
>  /* muxer options */

And again, same problem.

You really need to get your act in gear and prepare your patches
properly, so that they don't have unrelated changes in them.

This is not amateur hour.

^ permalink raw reply

* Re: [PATCH net-next 0/5] be2net fixes
From: David Miller @ 2012-05-01 17:22 UTC (permalink / raw)
  To: somnath.kotur; +Cc: netdev
In-Reply-To: <ef51171e-d498-4201-b836-db6fbdb2ee4f@exht1.ad.emulex.com>

From: Somnath Kotur <somnath.kotur@emulex.com>
Date: Mon, 30 Apr 2012 11:33:47 +0530

> Re-posting patches after incorporating Ben Hutchings's 
> review comments in patches 2 and one of the comments for patch 5

Ben still has feedback, please address it and resubmit this series.

^ permalink raw reply

* Re: [PATCH net-next] netem: add ECN capability
From: David Miller @ 2012-05-01 17:17 UTC (permalink / raw)
  To: shemminger; +Cc: eric.dumazet, netdev, therbert, ncardwell, hagen
In-Reply-To: <20120501095938.7214e4a5@s6510.linuxnetplumber.net>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Tue, 1 May 2012 09:59:38 -0700

> On Tue, 01 May 2012 09:40:51 -0400 (EDT)
> David Miller <davem@davemloft.net> wrote:
> 
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Tue, 01 May 2012 11:11:05 +0200
>> 
>> > From: Eric Dumazet <edumazet@google.com>
>> > 
>> > Add ECN (Explicit Congestion Notification) marking capability to netem
>> > 
>> > tc qdisc add dev eth0 root netem drop 0.5 ecn
>> > 
>> > Instead of dropping packets, try to ECN mark them.
>> > 
>> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>> 
>> Applied.
> 
> At least give a day of review, rather than premature acceptance

Yeah, because Eric Dumazet is going to fall off the face of the
planet and not fix whatever problems you have with his work.

Get real Stephen.

^ permalink raw reply

* Re: [PATCH net-next] netem: add ECN capability
From: Eric Dumazet @ 2012-05-01 17:10 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, netdev, Tom Herbert, Neal Cardwell,
	Hagen Paul Pfeifer
In-Reply-To: <20120501095914.0e60ff6a@s6510.linuxnetplumber.net>

On Tue, 2012-05-01 at 09:59 -0700, Stephen Hemminger wrote:

> The concept is fine, but a couple of questions.
>  1. Why a whole u32 for boolean?

a boolean in this structure wont save space, and this file is full of
u32. Why bother ?

IMHO boolean are fine for function arguments, but in a structure, not
very helpful.

>  2. The logic in this part of netem is setup to handle case of random duplication
>     combined with random loss. With ecn option set, will this code correctly
>     handled a duplication combined with a loss and send one packet?
>     It looks like the new code would change that behaviour.

Hmm, I'll take a look.

^ permalink raw reply

* skb_under_panic bug
From: Pieper, Jeffrey E @ 2012-05-01 17:08 UTC (permalink / raw)
  To: netdev@vger.kernel.org; +Cc: Pieper, Jeffrey E

I've seen the following stack trace several times recently, and was wondering if anyone else has ran into this. This occurs during bi-directional TCP/UDP traffic (netperf) on multiple platforms/devices. Whatever is occurring also seems to invoke hdparm.  I'm running 3.4.0-rc4 (recent net-next pull) with CONFIG_PREEMPT=y:

skb_under_panic: text:ffffffff8130d538 len:120 put:14 head:ffff880128c89800 data:ffff880128c897f4 tail:0x6c end:0xc0 dev:eth0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:145!
invalid opcode: 0000 [#1] PREEMPT SMP 
CPU 4 
Modules linked in: nfsd lockd exportfs sunrpc e1000e [last unloaded: scsi_wait_scan]

Pid: 5030, comm: netperf Not tainted 3.4.0-rc2-net-next-e1000e-queue_20120423 #2                  /DQ57TM
RIP: 0010:[<ffffffff812ef1b4>]  [<ffffffff812ef1b4>] skb_push+0x72/0x7b
RSP: 0018:ffff8801283459d8  EFLAGS: 00010292
RAX: 0000000000000084 RBX: ffff880127ffc000 RCX: 00000000fffffff3
RDX: 00000000000000d6 RSI: 0000000000000046 RDI: ffffffff8162375a
RBP: ffff8801283459f8 R08: 0000000000008d0a R09: ffff88012f019000
R10: 0000000000000001 R11: 0000000000000078 R12: ffff880128ff6ca0
R13: 0000000000000800 R14: 0000000000000212 R15: ffff880128ff6c98
FS:  00007fc4c3feb700(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffff600400 CR3: 0000000127e81000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process netperf (pid: 5030, threadinfo ffff880128344000, task ffff8801279859d0)
Stack:
000000000000006c 00000000000000c0 ffff880127ffc000 ffff880128ff6ca0
ffff880128345a38 ffffffff8130d538 ffff88010000006a 0000000000000000
ffffffff81300cd5 ffff88011dfad9c0 ffff880128ff6c00 ffff880127ffc000
Call Trace:
[<ffffffff8130d538>] eth_header+0x28/0xb4
[<ffffffff81300cd5>] ? neigh_resolve_output+0x14d/0x19a
[<ffffffff81300cd7>] neigh_resolve_output+0x14f/0x19a
[<ffffffff8131c35a>] ip_finish_output+0x22f/0x271
[<ffffffff8131c3d9>] ip_output+0x3d/0x3f
[<ffffffff81319dd4>] ip_local_out+0x62/0x64
[<ffffffff81319ddf>] ip_send_skb+0x9/0x2c
[<ffffffff81337a85>] udp_send_skb+0x250/0x2aa
[<ffffffff81338fa2>] udp_sendmsg+0x4e3/0x6f8
[<ffffffff8131b345>] ? ip_append_page+0x4b2/0x4b2
[<ffffffff813a442b>] ? preempt_schedule_irq+0x3c/0x51
[<ffffffff81049998>] ? __dequeue_entity+0x2e/0x33
[<ffffffff8133f654>] inet_sendmsg+0x93/0x9c
[<ffffffff812e741d>] sock_sendmsg+0xbb/0xd4
[<ffffffff81049998>] ? __dequeue_entity+0x2e/0x33
[<ffffffff81049998>] ? __dequeue_entity+0x2e/0x33
[<ffffffff813a442b>] ? preempt_schedule_irq+0x3c/0x51
[<ffffffff813a5516>] ? retint_kernel+0x26/0x30
[<ffffffff812e748f>] ? sockfd_lookup_light+0x1b/0x54
[<ffffffff812e7b3f>] sys_sendto+0xfa/0x122
[<ffffffff813a442b>] ? preempt_schedule_irq+0x3c/0x51
[<ffffffff813a97a2>] system_call_fastpath+0x16/0x1b
Code: 8b 57 68 48 89 44 24 10 8b 87 b0 00 00 00 48 89 44 24 08 31 c0 8b bf ac 00 00 00 48 89 3c 24 48 c7 c7 b7 6d 4c 81 e8 26 37 0b 00 <0f> 0b eb fe 4c 89 c8 c9 c3 55 89 f1 48 89 e5 48 83 ec 20 4c 8b 
RIP  [<ffffffff812ef1b4>] skb_push+0x72/0x7b
RSP <ffff8801283459d8>
---[ end trace 35a690c4aebb4bd0 ]---
hdparm: sending ioctl 330 to a partition!
hdparm: sending ioctl 330 to a partition!
hdparm: sending ioctl 330 to a partition!
hdparm: sending ioctl 330 to a partition!

Thanks in advance,

Jeff Pieper

^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Eric Dumazet @ 2012-05-01 17:04 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <4FA00C9F.8080409@intel.com>

On Tue, 2012-05-01 at 09:17 -0700, Alexander Duyck wrote:
> On 04/30/2012 11:39 PM, Eric Dumazet wrote:
> > On Mon, 2012-04-30 at 22:33 -0700, Alexander Duyck wrote:
> >
> >> The question I had was more specific to GRO.  As long as we have
> >> skb->users == 1 and the skb isn't cloned we should be fine.   It just
> >> hadn't occurred to me before that napi_gro_receive had the extra
> >> requirement that the skb couldn't be cloned.
> >>
> > OK
> >
> > By the way, even if skb was cloned, we would be allowed to steal
> > skb->head.
> >
> > When we clone an oskb we :
> >
> > 1) allocate a struct nskb sk_buff (or use the shadow in case of TCP)
> > 2) increment dataref
> The problem I have is with this piece right here.  So you increment
> dataref.  Now you have an skb that is still pointing to the shared info
> on this page and dataref is 2.  What about the side that is stealing the
> head?  Is it going to be tracking the dataref as well and decrementing
> it before put_page or does it just assume that dataref is 1 and call
> put_page directly?  I am guessing the latter since I didn't see anything
> that allowed for tracking the dataref of stolen heads.

The only changed thing is the kfree() replaced by put_page()

This kfree() was done when last reference to dataref was released.

If we had a problem before, we have same problem after my patch.

Truth is : In TCP (coalesce and splice()) and GRO, we owns skbs.

(See the various __kfree_skb(skb) calls in net/ipv4/tcp_input.c
There is one exception in ipv6 / treq->pktopts ) but its for SYN packet
and this wont be merged with a previous packet.

^ permalink raw reply

* Re: [PATCH net-next] netem: add ECN capability
From: Stephen Hemminger @ 2012-05-01 16:59 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, therbert, ncardwell, hagen
In-Reply-To: <20120501.094051.655617860554790911.davem@davemloft.net>

On Tue, 01 May 2012 09:40:51 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue, 01 May 2012 11:11:05 +0200
> 
> > From: Eric Dumazet <edumazet@google.com>
> > 
> > Add ECN (Explicit Congestion Notification) marking capability to netem
> > 
> > tc qdisc add dev eth0 root netem drop 0.5 ecn
> > 
> > Instead of dropping packets, try to ECN mark them.
> > 
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> 
> Applied.

At least give a day of review, rather than premature acceptance

^ permalink raw reply

* Re: [PATCH net-next] netem: add ECN capability
From: Stephen Hemminger @ 2012-05-01 16:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Tom Herbert, Neal Cardwell,
	Hagen Paul Pfeifer
In-Reply-To: <1335863465.11396.45.camel@edumazet-glaptop>

On Tue, 01 May 2012 11:11:05 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> From: Eric Dumazet <edumazet@google.com>
> 
> Add ECN (Explicit Congestion Notification) marking capability to netem
> 
> tc qdisc add dev eth0 root netem drop 0.5 ecn
> 
> Instead of dropping packets, try to ECN mark them.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> Cc: Tom Herbert <therbert@google.com>
> Cc: Hagen Paul Pfeifer <hagen@jauu.net>
> Cc: Stephen Hemminger <shemminger@vyatta.com>
> ---
>  include/linux/pkt_sched.h |    1 +
>  net/sched/sch_netem.c     |   18 +++++++++++++++---
>  2 files changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
> index 410b33d..ffe975c 100644
> --- a/include/linux/pkt_sched.h
> +++ b/include/linux/pkt_sched.h
> @@ -509,6 +509,7 @@ enum {
>  	TCA_NETEM_CORRUPT,
>  	TCA_NETEM_LOSS,
>  	TCA_NETEM_RATE,
> +	TCA_NETEM_ECN,
>  	__TCA_NETEM_MAX,
>  };
>  
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index 1109731..231cd11 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -26,6 +26,7 @@
>  
>  #include <net/netlink.h>
>  #include <net/pkt_sched.h>
> +#include <net/inet_ecn.h>
>  
>  #define VERSION "1.3"
>  
> @@ -78,6 +79,7 @@ struct netem_sched_data {
>  	psched_tdiff_t jitter;
>  
>  	u32 loss;
> +	u32 ecn;
>  	u32 limit;
>  	u32 counter;
>  	u32 gap;
> @@ -374,9 +376,12 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
>  		++count;
>  
>  	/* Drop packet? */
> -	if (loss_event(q))
> -		--count;
> -
> +	if (loss_event(q)) {
> +		if (q->ecn && INET_ECN_set_ce(skb))
> +			sch->qstats.drops++; /* mark packet */
> +		else
> +			--count;
> +	}
>  	if (count == 0) {
>  		sch->qstats.drops++;
>  		kfree_skb(skb);
> @@ -706,6 +711,7 @@ static const struct nla_policy netem_policy[TCA_NETEM_MAX + 1] = {
>  	[TCA_NETEM_CORRUPT]	= { .len = sizeof(struct tc_netem_corrupt) },
>  	[TCA_NETEM_RATE]	= { .len = sizeof(struct tc_netem_rate) },
>  	[TCA_NETEM_LOSS]	= { .type = NLA_NESTED },
> +	[TCA_NETEM_ECN]		= { .type = NLA_U32 },
>  };
>  
>  static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla,
> @@ -776,6 +782,9 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt)
>  	if (tb[TCA_NETEM_RATE])
>  		get_rate(sch, tb[TCA_NETEM_RATE]);
>  
> +	if (tb[TCA_NETEM_ECN])
> +		q->ecn = nla_get_u32(tb[TCA_NETEM_ECN]);
> +
>  	q->loss_model = CLG_RANDOM;
>  	if (tb[TCA_NETEM_LOSS])
>  		ret = get_loss_clg(sch, tb[TCA_NETEM_LOSS]);
> @@ -902,6 +911,9 @@ static int netem_dump(struct Qdisc *sch, struct sk_buff *skb)
>  	if (nla_put(skb, TCA_NETEM_RATE, sizeof(rate), &rate))
>  		goto nla_put_failure;
>  
> +	if (q->ecn && nla_put_u32(skb, TCA_NETEM_ECN, q->ecn))
> +		goto nla_put_failure;
> +
>  	if (dump_loss_model(q, skb) != 0)
>  		goto nla_put_failure;

The concept is fine, but a couple of questions.
 1. Why a whole u32 for boolean?
 2. The logic in this part of netem is setup to handle case of random duplication
    combined with random loss. With ecn option set, will this code correctly
    handled a duplication combined with a loss and send one packet?
    It looks like the new code would change that behaviour.

^ permalink raw reply

* Re: [PATCH 1/1] sky2: fix YUKON FE+ rev. A0, ingress VLAN tagged frame problem
From: Stephen Hemminger @ 2012-05-01 16:54 UTC (permalink / raw)
  To: Michele Cucchi; +Cc: Mirko Lindner, netdev
In-Reply-To: <1335887340-24284-1-git-send-email-cucchimichele@tiscali.it>

On Tue,  1 May 2012 17:49:00 +0200
Michele Cucchi <cucchimichele@tiscali.it> wrote:

> The cards based on chip YUKON FE+ revision A0, fails to correctly receive VLAN tagged frames. The ingress frames stays on raw physical interface and don't reach VLANS allocated interfaces.
> The problem is the double call of __vlan_hwaccel_put_tag function, one in the sky2_skb_rx function and the other in the vlans global function vlan_untag.
> Adding a check of card version and revision in the sky2_skb_rx function allow to skip the first time call of __vlan_hwaccel_put_tag and to fix the problem.
> 
> This fix is tested on 3.4-rc5.
> 
> Signed-off-by: Michele Cucchi <cucchimichele@tiscali.it>
> ---
>  drivers/net/ethernet/marvell/sky2.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
> index c9b504e..2be0351 100644
> --- a/drivers/net/ethernet/marvell/sky2.c
> +++ b/drivers/net/ethernet/marvell/sky2.c
> @@ -2649,7 +2649,10 @@ static inline void sky2_tx_done(struct net_device *dev, u16 last)
>  static inline void sky2_skb_rx(const struct sky2_port *sky2,
>  			       u32 status, struct sk_buff *skb)
>  {
> -	if (status & GMR_FS_VLAN)
> +	/* Workaround for YUKON FE+ rev. A0, vlan untag problem */
> +	if ((status & GMR_FS_VLAN) &&
> +	(!(sky2->hw->chip_id == CHIP_ID_YUKON_FE_P &&
> +	sky2->hw->chip_rev == CHIP_REV_YU_FE2_A0)))
>  		__vlan_hwaccel_put_tag(skb, be16_to_cpu(sky2->rx_tag));
>  
>  	if (skb->ip_summed == CHECKSUM_NONE)

A better solution to this was started by Mirko and revised by me.
It is was posted yesterday.

^ permalink raw reply

* Re: [PATCH] bridge: make brctl showstp display port id
From: Stephen Hemminger @ 2012-05-01 16:53 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: netdev@vger.kernel.org
In-Reply-To: <OF4DD70AEF.9EE04586-ONC12579F1.00502FF3-C12579F1.00506F9C@transmode.se>

On Tue, 1 May 2012 16:38:34 +0200
Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:

> Stephen Hemminger <shemminger@vyatta.com> wrote on 2012/04/30 18:07:47:
> >
> > On Mon, 30 Apr 2012 10:39:08 +0200
> > Joakim Tjernlund <Joakim.Tjernlund@transmode.se> wrote:
> >
> > > My brctl showstp br0 always shows a 0 port id:
> > > eth2 (1)
> > >  port id      0         state             disabled
> > >  designated root   8000.00069c00b2fb   path cost       100
> > >
> > > because port id is printed as a hex number in sys fs. Change the
> > > two hex occurrences(port no and port id) to decimal, just like all
> > > the other numbers in this area.
> > >
> > > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > > ---
> > >  net/bridge/br_sysfs_if.c |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
> > > index fd5799c..9c4c2eb 100644
> > > --- a/net/bridge/br_sysfs_if.c
> > > +++ b/net/bridge/br_sysfs_if.c
> > > @@ -86,13 +86,13 @@ static BRPORT_ATTR(designated_cost, S_IRUGO, show_designated_cost, NULL);
> > >
> > >  static ssize_t show_port_id(struct net_bridge_port *p, char *buf)
> > >  {
> > > -   return sprintf(buf, "0x%x\n", p->port_id);
> > > +   return sprintf(buf, "%d\n", p->port_id);
> > >  }
> > >  static BRPORT_ATTR(port_id, S_IRUGO, show_port_id, NULL);
> > >
> > >  static ssize_t show_port_no(struct net_bridge_port *p, char *buf)
> > >  {
> > > -   return sprintf(buf, "0x%x\n", p->port_no);
> > > +   return sprintf(buf, "%d\n", p->port_no);
> > >  }
> > >
> > >  static BRPORT_ATTR(port_no, S_IRUGO, show_port_no, NULL);
> >
> > No. This would be a visible change to applications.
> > The bridge utilities should be fixed instead.
> 
> Well, my reasoning was that if not the native tools got this right, what are the
> chances that other apps got it right? I don't know any other apps than
> brctl using this info, do you? We could check one or two apps to see what they do.

Unfortunately, that isn't really possible. API's get very hard to change
safely. It isn't worth the effort.

^ permalink raw reply

* 3.4-rc: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
From: Stefan Richter @ 2012-05-01 16:24 UTC (permalink / raw)
  To: netdev

Hi,

I switched from 3.3 to 3.4-rc5 yesterday and am getting the following warning
some time after boot (e.g. this time about two hours after boot).  What's up
with that?

May  1 15:08:35 stein kernel: WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x16b/0x20f()
May  1 15:08:35 stein kernel: Hardware name: System Product Name
May  1 15:08:35 stein kernel: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
May  1 15:08:35 stein kernel: Modules linked in: usb_storage snd_isight snd_firewire_lib firewire_ohci firewire_core crc_itu_t cpufreq_ondemand hwmon_vid nfsd lockd sunrpc ext4 jbd2 crc16 sr_mod cdrom snd_hda_codec_realtek powernow_k8 freq_table mperf yenta_socket k10temp pcmcia_core pcmcia_rsrc pata_atiixp sg snd_hda_intel snd_hda_codec snd_pcm snd_timer snd snd_page_alloc processor r8169 mii
May  1 15:08:35 stein kernel: Pid: 0, comm: swapper/3 Not tainted 3.4.0-rc5 #4
May  1 15:08:35 stein kernel: Call Trace:
May  1 15:08:35 stein kernel: <IRQ>  [<ffffffff8102562d>] ? warn_slowpath_common+0x78/0x8c
May  1 15:08:35 stein kernel: [<ffffffff810256e2>] ? warn_slowpath_fmt+0x45/0x4a
May  1 15:08:35 stein kernel: [<ffffffff810174c4>] ? lapic_next_event+0x18/0x1f
May  1 15:08:35 stein kernel: [<ffffffff812c2040>] ? dev_watchdog+0x16b/0x20f
May  1 15:08:35 stein kernel: [<ffffffff8103da4c>] ? hrtimer_interrupt+0x100/0x19b
May  1 15:08:35 stein kernel: [<ffffffff8102e3a8>] ? run_timer_softirq+0x177/0x209
May  1 15:08:35 stein kernel: [<ffffffff8104e201>] ? clockevents_program_event+0xaa/0xce
May  1 15:08:35 stein kernel: [<ffffffff812c1ed5>] ? qdisc_reset+0x35/0x35
May  1 15:08:35 stein kernel: [<ffffffff8102a122>] ? __do_softirq+0x7f/0x106
May  1 15:08:35 stein kernel: [<ffffffff8131750c>] ? call_softirq+0x1c/0x30
May  1 15:08:35 stein kernel: [<ffffffff810034ea>] ? do_softirq+0x31/0x67
May  1 15:08:35 stein kernel: [<ffffffff8102a3cf>] ? irq_exit+0x44/0xae
May  1 15:08:35 stein kernel: [<ffffffff8100322b>] ? do_IRQ+0x94/0xad
May  1 15:08:35 stein kernel: [<ffffffff81315c67>] ? common_interrupt+0x67/0x67
May  1 15:08:35 stein kernel: <EOI>  [<ffffffff81008a6c>] ? default_idle+0x20/0x34
May  1 15:08:35 stein kernel: [<ffffffff81008b46>] ? amd_e400_idle+0xc6/0xe2
May  1 15:08:35 stein kernel: [<ffffffff810092c6>] ? cpu_idle+0x62/0x86
May  1 15:08:35 stein kernel: ---[ end trace 11b2855a75e24d28 ]---
May  1 15:08:35 stein kernel: r8169 0000:0b:00.0: eth0: link up

The "link up" message occurs several times per session.

(booted on Apr 30 20:38:16)
Apr 30 20:38:17 stein kernel: r8169 0000:0b:00.0: eth0: link up
Apr 30 23:57:36 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 01:57:46 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 04:20:21 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 04:28:57 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 06:11:35 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 06:13:59 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 07:51:58 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:14:56 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:15:20 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:16:08 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:26:20 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:35:04 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:46:00 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:50:24 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 08:57:48 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 09:07:26 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 10:33:23 stein kernel: r8169 0000:0b:00.0: eth0: link up

(booted on May  1 12:47:45)
May  1 12:47:46 stein kernel: r8169 0000:0b:00.0: eth0: link up
May  1 15:08:35 stein kernel: r8169 0000:0b:00.0: eth0: link up

I do not remember such messages from any earlier kernel.

There are no noticeable issues, but I don't run anything demanding on the
interface right now which would make any downtime obvious.

It is an RTL8168(?) on an Asus M3A78-EM motherboard.

-[0000:00]-+-00.0  Advanced Micro Devices [AMD] RS780 Host Bridge [1022:9600]
           +-01.0-[01]--+-05.0  Advanced Micro Devices [AMD] nee ATI RS780 [Radeon HD 3200] [1002:9610]
           |            \-05.1  Advanced Micro Devices [AMD] nee ATI RS780 HDMI Audio [Radeon HD 3000-3300 Series] [1002:960f]
           +-02.0-[02-07]----00.0-[03-07]--+-01.0-[04]----00.0  Agere Systems FW643 PCI Express 1394b Controller (PHY/Link) [11c1:5901]
           |                               +-02.0-[05]----00.0  Agere Systems FW643 PCI Express 1394b Controller (PHY/Link) [11c1:5901]
           |                               +-03.0-[06]--
           |                               \-04.0-[07]--
           +-04.0-[08-09]----00.0-[09]----00.0  Texas Instruments XIO2213A/B/XIO2221 IEEE-1394b OHCI Controller [Cheetah Express] [104c:823f]
           +-05.0-[0a]----00.0  JMicron Technology Corp. IEEE 1394 Host Controller [197b:2380]
           +-06.0-[0b]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller [10ec:8168]
           +-11.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] [1002:4391]
           +-12.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
           +-12.1  Advanced Micro Devices [AMD] nee ATI SB7x0 USB OHCI1 Controller [1002:4398]
           +-12.2  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
           +-13.0  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller [1002:4397]
           +-13.1  Advanced Micro Devices [AMD] nee ATI SB7x0 USB OHCI1 Controller [1002:4398]
           +-13.2  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controller [1002:4396]
           +-14.0  Advanced Micro Devices [AMD] nee ATI SBx00 SMBus Controller [1002:4385]
           +-14.1  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 IDE Controller [1002:439c]
           +-14.2  Advanced Micro Devices [AMD] nee ATI SBx00 Azalia (Intel HDA) [1002:4383]
           +-14.3  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 LPC host controller [1002:439d]
           +-14.4-[0c-10]----06.0  Ricoh Co Ltd RL5c475 [1180:0475]
           +-14.5  Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI2 Controller [1002:4399]
           +-18.0  Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration [1022:1200]
           +-18.1  Advanced Micro Devices [AMD] Family 10h Processor Address Map [1022:1201]
           +-18.2  Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller [1022:1202]
           +-18.3  Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control [1022:1203]
           \-18.4  Advanced Micro Devices [AMD] Family 10h Processor Link Control [1022:1204]

-- 
Stefan Richter
-=====-===-- -=-= ----=
http://arcgraph.de/sr/

^ permalink raw reply

* Re: [PATCH 3/4 v2 net-next] net: make GRO aware of skb->head_frag
From: Alexander Duyck @ 2012-05-01 16:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexander Duyck, David Miller, netdev, Neal Cardwell, Tom Herbert,
	Jeff Kirsher, Michael Chan, Matt Carlson, Herbert Xu,
	Ben Hutchings, Ilpo Järvinen, Maciej Żenczykowski
In-Reply-To: <1335854378.11396.26.camel@edumazet-glaptop>

On 04/30/2012 11:39 PM, Eric Dumazet wrote:
> On Mon, 2012-04-30 at 22:33 -0700, Alexander Duyck wrote:
>
>> The question I had was more specific to GRO.  As long as we have
>> skb->users == 1 and the skb isn't cloned we should be fine.   It just
>> hadn't occurred to me before that napi_gro_receive had the extra
>> requirement that the skb couldn't be cloned.
>>
> OK
>
> By the way, even if skb was cloned, we would be allowed to steal
> skb->head.
>
> When we clone an oskb we :
>
> 1) allocate a struct nskb sk_buff (or use the shadow in case of TCP)
> 2) increment dataref
The problem I have is with this piece right here.  So you increment
dataref.  Now you have an skb that is still pointing to the shared info
on this page and dataref is 2.  What about the side that is stealing the
head?  Is it going to be tracking the dataref as well and decrementing
it before put_page or does it just assume that dataref is 1 and call
put_page directly?  I am guessing the latter since I didn't see anything
that allowed for tracking the dataref of stolen heads.

> 3) link skb->head
>
> After cloning()  nskb->users == 1, and oskb->users is unchanged
>
> I believe you are referring to skb_get(oskb), that ones increment
> oskb->users and skb_shared(oskb) then returns true.
skb_clone and skb_get are two completely separate things.  My concern
with the skb_get portion is if skb->users is greater than 1 we should
not be freeing the skbuff back to the head cache.  This was addressed
with the fact that GRO is requiring that skb->users be 1 before it is
handed the skb, although I didn't look too closely at the other patches
that are also stealing skb->head.  My concern with skb_clone, as I
mentioned above, is that I am not sure how the dataref tracking will
still work.

It looks like Dave applied the patches last night so I will pull the
latest net-next and do some poking around.  It is possible I am just
being dense and not getting this, but I just feel like there is
something that got missed or overlooked.

Thanks,

Alex

^ permalink raw reply

* [PATCH 1/1] sky2: fix YUKON FE+ rev. A0, ingress VLAN tagged frame problem
From: Michele Cucchi @ 2012-05-01 15:49 UTC (permalink / raw)
  To: Mirko Lindner, Stephen Hemminger; +Cc: netdev, Michele Cucchi

The cards based on chip YUKON FE+ revision A0, fails to correctly receive VLAN tagged frames. The ingress frames stays on raw physical interface and don't reach VLANS allocated interfaces.
The problem is the double call of __vlan_hwaccel_put_tag function, one in the sky2_skb_rx function and the other in the vlans global function vlan_untag.
Adding a check of card version and revision in the sky2_skb_rx function allow to skip the first time call of __vlan_hwaccel_put_tag and to fix the problem.

This fix is tested on 3.4-rc5.

Signed-off-by: Michele Cucchi <cucchimichele@tiscali.it>
---
 drivers/net/ethernet/marvell/sky2.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/marvell/sky2.c b/drivers/net/ethernet/marvell/sky2.c
index c9b504e..2be0351 100644
--- a/drivers/net/ethernet/marvell/sky2.c
+++ b/drivers/net/ethernet/marvell/sky2.c
@@ -2649,7 +2649,10 @@ static inline void sky2_tx_done(struct net_device *dev, u16 last)
 static inline void sky2_skb_rx(const struct sky2_port *sky2,
 			       u32 status, struct sk_buff *skb)
 {
-	if (status & GMR_FS_VLAN)
+	/* Workaround for YUKON FE+ rev. A0, vlan untag problem */
+	if ((status & GMR_FS_VLAN) &&
+	(!(sky2->hw->chip_id == CHIP_ID_YUKON_FE_P &&
+	sky2->hw->chip_rev == CHIP_REV_YU_FE2_A0)))
 		__vlan_hwaccel_put_tag(skb, be16_to_cpu(sky2->rx_tag));
 
 	if (skb->ip_summed == CHECKSUM_NONE)
-- 
1.7.2.5

^ permalink raw reply related

* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Lennert Buytenhek @ 2012-05-01 15:49 UTC (permalink / raw)
  To: sedat.dilek
  Cc: John W. Linville, linux-wireless, netdev, linux-kernel, lautriv,
	Jim Cromie, Ben Hutchings, Hauke Mehrtens
In-Reply-To: <CA+icZUV=qtXwVxb+uvQJcsza0y+6k2LHJFFPnECY0M=QWBexaQ@mail.gmail.com>

On Tue, May 01, 2012 at 03:54:49PM +0200, Sedat Dilek wrote:

> Some more concerns...
> Dunno, if [4] is really correct and how to use one and same device-id
> for 8363 ***and*** 8361P:
> ...
> { PCI_VDEVICE(MARVELL, 0x2a02), .driver_data = MWL8363, },

No, you'll have to add a MWL8361P to the enum, and add a
mwl8k_device_info table entry for that, and then add the PCI ID
table entry with the MWL8361P identifier.

^ permalink raw reply

* Re: [PATCH] bridge: make brctl showstp display port id
From: Joakim Tjernlund @ 2012-05-01 14:38 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev@vger.kernel.org
In-Reply-To: <20120430090747.1d64198b@nehalam.linuxnetplumber.net>

Stephen Hemminger <shemminger@vyatta.com> wrote on 2012/04/30 18:07:47:
>
> On Mon, 30 Apr 2012 10:39:08 +0200
> Joakim Tjernlund <Joakim.Tjernlund@transmode.se> wrote:
>
> > My brctl showstp br0 always shows a 0 port id:
> > eth2 (1)
> >  port id      0         state             disabled
> >  designated root   8000.00069c00b2fb   path cost       100
> >
> > because port id is printed as a hex number in sys fs. Change the
> > two hex occurrences(port no and port id) to decimal, just like all
> > the other numbers in this area.
> >
> > Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
> > ---
> >  net/bridge/br_sysfs_if.c |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
> > index fd5799c..9c4c2eb 100644
> > --- a/net/bridge/br_sysfs_if.c
> > +++ b/net/bridge/br_sysfs_if.c
> > @@ -86,13 +86,13 @@ static BRPORT_ATTR(designated_cost, S_IRUGO, show_designated_cost, NULL);
> >
> >  static ssize_t show_port_id(struct net_bridge_port *p, char *buf)
> >  {
> > -   return sprintf(buf, "0x%x\n", p->port_id);
> > +   return sprintf(buf, "%d\n", p->port_id);
> >  }
> >  static BRPORT_ATTR(port_id, S_IRUGO, show_port_id, NULL);
> >
> >  static ssize_t show_port_no(struct net_bridge_port *p, char *buf)
> >  {
> > -   return sprintf(buf, "0x%x\n", p->port_no);
> > +   return sprintf(buf, "%d\n", p->port_no);
> >  }
> >
> >  static BRPORT_ATTR(port_no, S_IRUGO, show_port_no, NULL);
>
> No. This would be a visible change to applications.
> The bridge utilities should be fixed instead.

Well, my reasoning was that if not the native tools got this right, what are the
chances that other apps got it right? I don't know any other apps than
brctl using this info, do you? We could check one or two apps to see what they do.

 Jocke

^ permalink raw reply

* Re: [net-next v2 6/8] ixgbe: add syfs interface for to export read only driver information
From: Ben Greear @ 2012-05-01 14:30 UTC (permalink / raw)
  To: David Miller
  Cc: jeffrey.t.kirsher, donald.c.skidmore, netdev, gospo, sassmann,
	bhutchings
In-Reply-To: <20120501.100241.1409452912879198250.davem@davemloft.net>

On 05/01/2012 07:02 AM, David Miller wrote:
> From: Jeff Kirsher<jeffrey.t.kirsher@intel.com>
> Date: Tue,  1 May 2012 01:51:07 -0700
>
>> From: Don Skidmore<donald.c.skidmore@intel.com>
>>
>> This patch exports non-thermal (which was done via hwmon in an earlier
>> patch) data to sysfs which isn't readily available elsewhere.  All of the
>> fields are read only as this interface is to only export driver data.
>>
>> Signed-off-by: Don Skidmore<donald.c.skidmore@intel.com>
>> Tested-by: Stephen Ko<stephen.s.ko@intel.com>
>> Signed-off-by: Jeff Kirsher<jeffrey.t.kirsher@intel.com>
>
> I don't like it.
>
> Some of this stuff is generic and belongs somewhere like ethtool, for
> example the descriptor sizes and queue sizes.
>
> The others are reading registers, and we have an ethtool API for that
> already.
>
> But putting anything like this in sysfs is pointless, because the
> stuff that other cards have too will then go into differently named
> sysfs files which, as is oft repeated here, is a terrible user
> experience.
>
> If you want to do this right, add a new ethtool interface that allows
> the publication of card specific unchanging values, in a style like
> what we already do for statistics.  Have one query that gets the
> string list, and then another which fetches the actual values.

And if you decide to go forward with this, I have some ideas for API
and would like to discuss it.  Or, if I beat you to it, I'll post some
RFC patches when I get a chance to write them up.

Basically, I'm thinking of a get-strings(flags), get-types(flags), and get-values(flags)
API.  The types would return an array of enums that would define the data
type, like uint32, int8, etc.  Strings would be same as it is today,
just with a new type.  Values would return array of uint64 as it does now.

To each method you could specify a uint32 flags that would be device specific.
I would like to use flags in wifi to specify whether to get the underlying NIC
stats v/s just the soft-device stats, for instance.  And maybe some additional
granularity if some stats are more costly to get than others so that one could
probe expensive stats more rarely....

And while strings would still be free-form, we could at least attempt to use
the same strings for the same kinds of stats where possible.

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* [PATCH 1/2] iproute2: allow IPv6 addresses for l2tp local and remote parameters
From: James Chapman @ 2012-05-01 14:25 UTC (permalink / raw)
  To: netdev; +Cc: Chris Elston
In-Reply-To: <1335882323-6219-1-git-send-email-jchapman@katalix.com>

From: Chris Elston <celston@katalix.com>

Adds support for parsing IPv6 addresses to the parameters local and
remote in the l2tp commands. Requires netlink attributes L2TP_ATTR_IP6_SADDR
and L2TP_ATTR_IP6_DADDR, added in a required kernel patch already submitted
to netdev.

Also enables printing of IPv6 addresses returned by the L2TP_CMD_TUNNEL_GET
request.

Signed-off-by: Chris Elston <celston@katalix.com>
Signed-off-by: James Chapman <jchapman@katalix.com>
---
 ip/ipl2tp.c |   59 +++++++++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/ip/ipl2tp.c b/ip/ipl2tp.c
index c5683f5..a05e1a3 100644
--- a/ip/ipl2tp.c
+++ b/ip/ipl2tp.c
@@ -50,8 +50,8 @@ struct l2tp_parm {
 	uint8_t cookie[8];
 	int peer_cookie_len;
 	uint8_t peer_cookie[8];
-	struct in_addr local_ip;
-	struct in_addr peer_ip;
+	inet_prefix local_ip;
+	inet_prefix peer_ip;
 
 	uint16_t pw_type;
 	uint16_t mtu;
@@ -97,6 +97,8 @@ static int create_tunnel(struct l2tp_parm *p)
 		struct genlmsghdr	g;
 		char   			buf[1024];
 	} req;
+	uint32_t local_attr = L2TP_ATTR_IP_SADDR;
+	uint32_t peer_attr = L2TP_ATTR_IP_DADDR;
 
 	memset(&req, 0, sizeof(req));
 	req.n.nlmsg_type = genl_family;
@@ -110,8 +112,14 @@ static int create_tunnel(struct l2tp_parm *p)
 	addattr8(&req.n, 1024, L2TP_ATTR_PROTO_VERSION, 3);
 	addattr16(&req.n, 1024, L2TP_ATTR_ENCAP_TYPE, p->encap);
 
-	addattr32(&req.n, 1024, L2TP_ATTR_IP_SADDR, p->local_ip.s_addr);
-	addattr32(&req.n, 1024, L2TP_ATTR_IP_DADDR, p->peer_ip.s_addr);
+	if (p->local_ip.family == AF_INET6)
+		local_attr = L2TP_ATTR_IP6_SADDR;
+	addattr_l(&req.n, 1024, local_attr, &p->local_ip.data, p->local_ip.bytelen);
+
+	if (p->peer_ip.family == AF_INET6)
+		peer_attr = L2TP_ATTR_IP6_DADDR;
+	addattr_l(&req.n, 1024, peer_attr, &p->peer_ip.data, p->peer_ip.bytelen);
+
 	if (p->encap == L2TP_ENCAPTYPE_UDP) {
 		addattr16(&req.n, 1024, L2TP_ATTR_UDP_SPORT, p->local_udp_port);
 		addattr16(&req.n, 1024, L2TP_ATTR_UDP_DPORT, p->peer_udp_port);
@@ -225,13 +233,14 @@ static void print_cookie(char *name, const uint8_t *cookie, int len)
 static void print_tunnel(const struct l2tp_data *data)
 {
 	const struct l2tp_parm *p = &data->config;
+	char buf[INET6_ADDRSTRLEN];
 
 	printf("Tunnel %u, encap %s\n",
 	       p->tunnel_id,
 	       p->encap == L2TP_ENCAPTYPE_UDP ? "UDP" :
 	       p->encap == L2TP_ENCAPTYPE_IP ? "IP" : "??");
-	printf("  From %s ", inet_ntoa(p->local_ip));
-	printf("to %s\n", inet_ntoa(p->peer_ip));
+	printf("  From %s ", inet_ntop(p->local_ip.family, p->local_ip.data, buf, sizeof(buf)));
+	printf("to %s\n", inet_ntop(p->peer_ip.family, p->peer_ip.data, buf, sizeof(buf)));
 	printf("  Peer tunnel %u\n",
 	       p->peer_tunnel_id);
 
@@ -315,10 +324,30 @@ static int get_response(struct nlmsghdr *n, void *arg)
 
 	if (attrs[L2TP_ATTR_RECV_TIMEOUT])
 		p->reorder_timeout = rta_getattr_u64(attrs[L2TP_ATTR_RECV_TIMEOUT]);
-	if (attrs[L2TP_ATTR_IP_SADDR])
-		p->local_ip.s_addr = rta_getattr_u32(attrs[L2TP_ATTR_IP_SADDR]);
-	if (attrs[L2TP_ATTR_IP_DADDR])
-		p->peer_ip.s_addr = rta_getattr_u32(attrs[L2TP_ATTR_IP_DADDR]);
+	if (attrs[L2TP_ATTR_IP_SADDR]) {
+		p->local_ip.family = AF_INET;
+		p->local_ip.data[0] = rta_getattr_u32(attrs[L2TP_ATTR_IP_SADDR]);
+		p->local_ip.bytelen = 4;
+		p->local_ip.bitlen = -1;
+	}
+	if (attrs[L2TP_ATTR_IP_DADDR]) {
+		p->peer_ip.family = AF_INET;
+		p->peer_ip.data[0] = rta_getattr_u32(attrs[L2TP_ATTR_IP_DADDR]);
+		p->peer_ip.bytelen = 4;
+		p->peer_ip.bitlen = -1;
+	}
+	if (attrs[L2TP_ATTR_IP6_SADDR]) {
+		p->local_ip.family = AF_INET6;
+		memcpy(&p->local_ip.data, RTA_DATA(attrs[L2TP_ATTR_IP6_SADDR]),
+			p->local_ip.bytelen = 16);
+		p->local_ip.bitlen = -1;
+	}
+	if (attrs[L2TP_ATTR_IP6_DADDR]) {
+		p->peer_ip.family = AF_INET6;
+		memcpy(&p->peer_ip.data, RTA_DATA(attrs[L2TP_ATTR_IP6_DADDR]),
+			p->peer_ip.bytelen = 16);
+		p->peer_ip.bitlen = -1;
+	}
 	if (attrs[L2TP_ATTR_UDP_SPORT])
 		p->local_udp_port = rta_getattr_u16(attrs[L2TP_ATTR_UDP_SPORT]);
 	if (attrs[L2TP_ATTR_UDP_DPORT])
@@ -529,10 +558,12 @@ static int parse_args(int argc, char **argv, int cmd, struct l2tp_parm *p)
 			p->ifname = *argv;
 		} else if (strcmp(*argv, "remote") == 0) {
 			NEXT_ARG();
-			p->peer_ip.s_addr = get_addr32(*argv);
+			if (get_addr(&p->peer_ip, *argv, AF_UNSPEC))
+				invarg("invalid remote address\n", *argv);
 		} else if (strcmp(*argv, "local") == 0) {
 			NEXT_ARG();
-			p->local_ip.s_addr = get_addr32(*argv);
+			if (get_addr(&p->local_ip, *argv, AF_UNSPEC))
+				invarg("invalid local address\n", *argv);
 		} else if ((strcmp(*argv, "tunnel_id") == 0) ||
 			   (strcmp(*argv, "tid") == 0)) {
 			__u32 uval;
@@ -648,10 +679,10 @@ static int do_add(int argc, char **argv)
 		missarg("peer_tunnel_id");
 
 	if (p.tunnel) {
-		if (p.local_ip.s_addr == 0)
+		if (p.local_ip.family == AF_UNSPEC)
 			missarg("local");
 
-		if (p.peer_ip.s_addr == 0)
+		if (p.peer_ip.family == AF_UNSPEC)
 			missarg("remote");
 
 		if (p.encap == L2TP_ENCAPTYPE_UDP) {
-- 
1.7.0.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox