Netdev List

Netdev List
 help / color / mirror / Atom feed

* Broken rndis_host with #define DEBUG (by a475f603d23392f386e45cf377b17c30ed3bbb80)
From: Jussi Kivilinna @ 2010-06-12 20:39 UTC (permalink / raw)
  To: Joe Perches; +Cc: David S. Miller, netdev, David Brownell

Hello!

Commit a475f603d23392f386e45cf377b17c30ed3bbb80 "drivers/net/usb: Use  
netif_<level> logging facilities" causes problems with rndis_host.

rndis_host oopses with #define DEBUG, in generic_rndis_bind() when  
calling netif_dbg(). Problem is that (netdev)->dev.parent is NULL  
(used in netdev_printk) while generic_rndis_bind is called.

-Jussi

^ permalink raw reply

* Re: hamachi on offer
From: Benjamin LaHaise @ 2010-06-12 17:30 UTC (permalink / raw)
  To: Andi Kleen; +Cc: netdev
In-Reply-To: <20100612095125.GA16967@basil.fritz.box>

Hi Andi,

On Sat, Jun 12, 2010 at 11:51:25AM +0200, Andi Kleen wrote:
> Hi,
> 
> I found an old hamachi (Packet Engines GNIC II) PCI GigE card while
> going through old junk. 
..
> Please only serious offers if you want to actually work with the code.

I have a fibre NatSemi DP83820 gige that should interoperate with this, and 
which needs testing as well.  I'd be willing to update and test the two 
drivers to the point of getting them working.  I also have access to media 
convertors, plus a few reasons to set up an automated test suite (more to 
test the media convertors, but testing the nics would be a beneficial side 
effect).  If someone else wants to maintain this longer term, I'd be 
willing to pass along the other fibre nic and some media convertors to 
help out.

		-ben

^ permalink raw reply

* [PATCH  kernel 2.6.35-rc2] pcnet_cs: add new id (TOSHIBA Modem/LAN Card)
From: Ken Kawasaki @ 2010-06-12 10:17 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20100228083420.9ca8e285.ken_kawasaki@spring.nifty.jp>


pcnet_cs:
serial_cs:
    add new id (TOSHIBA Modem/LAN Card)

Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>

---

--- linux-2.6.35-rc2/drivers/net/pcmcia/pcnet_cs.c.orig	2010-06-12 08:19:26.000000000 +0900
+++ linux-2.6.35-rc2/drivers/net/pcmcia/pcnet_cs.c	2010-06-12 08:21:35.000000000 +0900
@@ -1727,6 +1727,7 @@ static struct pcmcia_device_id pcnet_ids
 	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "cis/PCMLM28.cis"),
 	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "cis/PCMLM28.cis"),
 	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "TOSHIBA", "Modem/LAN Card", 0xb4585a1a, 0x53f922f8, "cis/PCMLM28.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID12(0, "DAYNA COMMUNICATIONS", "LAN AND MODEM MULTIFUNCTION", 0x8fdf8f89, 0xdd5ed9e8, "cis/DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID4(0, "NSC MF LAN/Modem", 0x58fc6056, "cis/DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_MANF_CARD(0, 0x0175, 0x0000, "cis/DP83903.cis"),
--- linux-2.6.35-rc2/drivers/serial/serial_cs.c.orig	2010-06-12 09:08:11.000000000 +0900
+++ linux-2.6.35-rc2/drivers/serial/serial_cs.c	2010-06-12 09:08:47.000000000 +0900
@@ -821,6 +821,7 @@ static struct pcmcia_device_id serial_id
 	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "cis/PCMLM28.cis"),
 	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "cis/PCMLM28.cis"),
 	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "TOSHIBA", "Modem/LAN Card", 0xb4585a1a, 0x53f922f8, "cis/PCMLM28.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID12(1, "DAYNA COMMUNICATIONS", "LAN AND MODEM MULTIFUNCTION", 0x8fdf8f89, 0xdd5ed9e8, "cis/DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID4(1, "NSC MF LAN/Modem", 0x58fc6056, "cis/DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_MANF_CARD(1, 0x0101, 0x0556, "cis/3CCFEM556.cis"),

^ permalink raw reply

* mpd client timeouts (bisected) 2.6.35-rc3
From: Markus Trippelsdorf @ 2010-06-12 10:28 UTC (permalink / raw)
  To: john.r.fastabend; +Cc: linux-kernel, netdev, davem

Commit 597a264b1a9c7e36d1728f677c66c5c1f7e3b837:
»net: deliver skbs on inactive slaves to exact matches«

causes large timeouts when mpd clients try to connect to a locally
running mpd (music player demon) on my machine. This makes it 
impossible to control mpd.

I bisected this down to the commit mentioned above.
Reverting the commit from 2.6.35-rc3 also solves the problem.

-- 
Markus

^ permalink raw reply

* hamachi on offer
From: Andi Kleen @ 2010-06-12  9:51 UTC (permalink / raw)
  To: netdev

Hi,

I found an old hamachi (Packet Engines GNIC II) PCI GigE card while
going through old junk. 

That was one of the first Gigabit drivers in Linux (hamachi.c), but 
probably hasn't seen much use in the last years and it's unclear if the 
driver still works.

If anyone is interested in maintaining the driver I would be willing to 
ship the card for free to them. The card is still a quite respectable Gigabit
Ethernet card.

The main drawback is that it has a fiber PHY, not copper, so one would 
need a suitable fiber switch or run it loopback with another fiber card.

I believe documentation is also still available.

Please only serious offers if you want to actually work with the code.

Thanks,
-Andi

^ permalink raw reply

* Re: Weak host model vs .interface down
From: Joakim Tjernlund @ 2010-06-12  9:34 UTC (permalink / raw)
  To: Mark Smith; +Cc: netdev, Rick Jones
In-Reply-To: <20100612092748.5fed4baf@lk-netdev.nosense.org>

Mark Smith <lk-netdev@lk-netdev.nosense.org> wrote on 2010/06/12 01:57:48:
>
> On Fri, 11 Jun 2010 21:41:45 +0200
> Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:
>
> > Rick Jones <rick.jones2@hp.com> wrote on 2010/06/11 19:13:42:
> > >
> > > > The weak model doesn't go into such detail, it is assumption/impl. detail
> > > > to assume that the ip address still is part of the system even when the interface
> > > > is down. One could just as well define interface down as temporarly removing
> > > > the IP address from the system too. This makes make much more sense to me and
> > > > if you always want the system to answer on a IP adress you make it an IP alias.
> > > >
> > > > Since the current behaviour is a problem to me and routers in general, can
> > > > we change this? What is the current usage model which needs it to stay as is?
> > >
> > > Router != end-system  so I wouldn't think the weak or strong end-system model
> > > would apply to a router.  I think Stephen already posted a patch to allow that
> > > for when one's box was a router rather than an end-system.
> >
> > Not really an anwser to what I was asking but I choose to read that as
> > you agree with me. The rest is an impl. detail. :)
> > Stephen's patch is good but I would not mind making I/F down removing the
> > IP address from the system unconditionally.
> >
>
> I've asked the same question a few years back and got the same answer.
> I accept the strong host / weak host argument, however I've also
> thought about the problem a bit more, and why people get confused about
> it.
>
> The problem is the mental model. Assigning an IP address to an
> interface implies that the IP address as attached and associated with
> the interface and therefore the state of the interface. That is
> certainly the case for people like me who work with networking
> equipment, typically routers, which follow the strong host model. It is
> very convenient to know that by shutting down an interface the
> associated IP address stops working too. Other measures, such as
> ACLing, or writing down and deleting and then having put it back, are
> relatively much more effort and error prone.

Very well put!

>
> While I'm sure past operational history is likely to make this
> impractical, it would be far more intuitive for weak host model IP
> address assignments to be made to a single, forced always up virtual
> interface on the host, and strong host IP address assignments made to
> any other "non-weak host" interfaces.
>
> It'd be an interesting experiment to see if loopback could be used as a
> "host interface" in the weak host model.

Or you can use the dummy I/F too. I have used lo/dummy to assign a host/system
address and it works fine. I am not aware of any limitations but if there are
I am sure someone will point them out :)

 Jocke


^ permalink raw reply

* RE: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
From: Xin, Xiaohui @ 2010-06-12  9:31 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Stephen Hemminger, netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mst@redhat.com, mingo@elte.hu,
	davem@davemloft.net, jdike@linux.intel.com
In-Reply-To: <20100611052112.GA25649@gondor.apana.org.au>

>-----Original Message-----
>From: Herbert Xu [mailto:herbert@gondor.apana.org.au]
>Sent: Friday, June 11, 2010 1:21 PM
>To: Xin, Xiaohui
>Cc: Stephen Hemminger; netdev@vger.kernel.org; kvm@vger.kernel.org;
>linux-kernel@vger.kernel.org; mst@redhat.com; mingo@elte.hu; davem@davemloft.net;
>jdike@linux.intel.com
>Subject: Re: [RFC PATCH v7 01/19] Add a new structure for skb buffer from external.
>
>On Wed, Jun 09, 2010 at 05:54:02PM +0800, Xin, Xiaohui wrote:
>>
>> I'm not sure if I understand your way correctly:
>> 1) Does the way only deal with driver with SG feature? Since packet
>> is non-linear...
>
>No the hardware doesn't have to support SG.  You just need to
>place the entire packet contents in a page instead of skb->head.
>
>> 2) Is skb->data still pointing to guest user buffers?
>> If yes, how to avoid the modifications to net core change to skb?
>
>skb->data would not point to guest user buffers.  In the common
>case the packet is not modified on its way to the guest so this
>is not an issue.
>
>In the rare case where it is modified, you only have to copy the
>bits which are modified and the cost of that is inconsequential
>since you have to write to that memory anyway.
>
>> 3) In our way only parts of drivers need be modified to support zero-copy.
>> and here, need we modify all the drivers?
>
>If you're asking the portion of each driver supporting zero-copy
>that needs to be modified, then AFAICS this doesn't change that
>very much at all.
>
>> I think to make skb->head empty at first will cause more effort to pass the check with
>> skb header. Have I missed something here? I really make the skb->head NULL
>> just before kfree(skb) in skb_release_data(), it's done by callback we have made for skb.
>
>No I'm not suggesting you set it to NULL.  It should have some
>memory allocated, but skb_headlen(skb) should be zero.
>
>Please have a look at how the napi_gro_frags interface works (e.g.,
>in drivers/net/cxgb3/sge.c).  This is exactly the model that I am
>suggesting.
>
>Cheers,
>--
>Visit Openswan at http://www.openswan.org/
>Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
>Home Page: http://gondor.apana.org.au/~herbert/
>PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Herbert,
I explained what I think the thought in your mind here, please clarify if 
something missed.

1) Modify driver from netdev_alloc_skb() to alloc user pages if dev is zero-copyed.
  If the driver support PS mode, then modify alloc_page() too.
2) Add napi_gro_frags() in driver to receive the user pages instead of driver's receiving 
function.
3) napi_gro_frags() will allocate small skb and pull the header data from 
the first page to skb->data.

Is above the way what you have suggested?
I have thought something in detail about the way.

1) The first page will have an offset after the header is copied into allocated kernel skb. 
The offset should be recalculated when the user page data is transferred to guest. This 
may modify some of the gro code.

2) napi_gro_frags() may remove a page when it's data is totally be pulled, but we cannot 
put a user page as normally. This may modify the gro code too.

3) When the user buffer returned to guest, some of them need to be appended a vnet header.
That means for some pages, the vnet header room should be reserved when allocated.
But we cannot know which one will be used as the first page when allocated. If we reserved vnet header for each page, since the set_skb_frag() in guest driver only use the offset 0 for second pages, then page data will be wrong.

4) Since the user buffer pages should be released, so we still need a dtor callback to do that, and then I still need a place to hold it. How do you think about to put it in skb_shinfo?

Currently I can only think of this.
How do you think about then?

Thanks
Xiaohui 

^ permalink raw reply

* Re: [PATCH] tcp: unify tcp flag macros
From: Changli Gao @ 2010-06-12  9:21 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: David S. Miller, Alexey Kuznetsov, Pekka Savola (ipv6),
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, netdev,
	netfilter-devel
In-Reply-To: <alpine.LSU.2.01.1006121017380.28286@obet.zrqbmnf.qr>

On Sat, Jun 12, 2010 at 4:18 PM, Jan Engelhardt <jengelh@medozas.de> wrote:
>
> On Saturday 2010-06-12 07:59, Changli Gao wrote:
>>--- a/include/net/tcp.h
>>+++ b/include/net/tcp.h
>>@@ -602,6 +602,17 @@ extern u32        __tcp_select_window(struct sock *sk);
>>  */
>> #define tcp_time_stamp                ((__u32)(jiffies))
>>
>>+#define tcp_flag_byte(th) (((u_int8_t *)th)[13])
>>+
>>+#define       TH_FIN  0x01
>>+#define       TH_SYN  0x02
>>+#define       TH_RST  0x04
>>+#define       TH_PUSH 0x08
>>+#define       TH_ACK  0x10
>>+#define       TH_URG  0x20
>>+#define       TH_ECE  0x40
>>+#define       TH_CWR  0x80
>
> I'd prefer something less generic, like TCPHDR_{FIN,SYN,etc}.
>
>

The same macros are definied by BSDs too. So I think it is better to
keep them the same.

-- 
Regards,
Changli Gao(xiaosuo@gmail.com)

^ permalink raw reply

* Re: [PATCH] tcp: unify tcp flag macros
From: Jan Engelhardt @ 2010-06-12  8:18 UTC (permalink / raw)
  To: Changli Gao
  Cc: David S. Miller, Alexey Kuznetsov, Pekka Savola (ipv6),
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, netdev,
	netfilter-devel
In-Reply-To: <1276322372-12584-1-git-send-email-xiaosuo@gmail.com>


On Saturday 2010-06-12 07:59, Changli Gao wrote:
>--- a/include/net/tcp.h
>+++ b/include/net/tcp.h
>@@ -602,6 +602,17 @@ extern u32	__tcp_select_window(struct sock *sk);
>  */
> #define tcp_time_stamp		((__u32)(jiffies))
> 
>+#define tcp_flag_byte(th) (((u_int8_t *)th)[13])
>+
>+#define	TH_FIN	0x01
>+#define	TH_SYN	0x02
>+#define	TH_RST	0x04
>+#define	TH_PUSH	0x08
>+#define	TH_ACK	0x10
>+#define	TH_URG	0x20
>+#define	TH_ECE	0x40
>+#define	TH_CWR	0x80

I'd prefer something less generic, like TCPHDR_{FIN,SYN,etc}.


^ permalink raw reply

* [PATCH] tcp: unify tcp flag macros
From: Changli Gao @ 2010-06-12  5:59 UTC (permalink / raw)
  To: David S. Miller
  Cc: Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, netfilter-devel,
	Changli Gao

unify tcp flag macros

unify tcp flag macros: TH_FIN, TH_SYN, TH_RST, TH_PUSH, TH_ACK, TH_URG, TH_ECE
and TH_CWR. TCBCB_FLAG_* are replaced with the corresponding TH_*.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 include/net/tcp.h                      |   24 +++++-------
 net/ipv4/tcp.c                         |    8 ++--
 net/ipv4/tcp_input.c                   |    2 -
 net/ipv4/tcp_output.c                  |   62 +++++++++++++++------------------
 net/netfilter/nf_conntrack_proto_tcp.c |   11 -----
 net/netfilter/xt_TCPMSS.c              |    2 -
 6 files changed, 46 insertions(+), 63 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 5731664..238b636 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -602,6 +602,17 @@ extern u32	__tcp_select_window(struct sock *sk);
  */
 #define tcp_time_stamp		((__u32)(jiffies))
 
+#define tcp_flag_byte(th) (((u_int8_t *)th)[13])
+
+#define	TH_FIN	0x01
+#define	TH_SYN	0x02
+#define	TH_RST	0x04
+#define	TH_PUSH	0x08
+#define	TH_ACK	0x10
+#define	TH_URG	0x20
+#define	TH_ECE	0x40
+#define	TH_CWR	0x80
+
 /* This is what the send packet queuing engine uses to pass
  * TCP per-packet control information to the transmission
  * code.  We also store the host-order sequence numbers in
@@ -620,19 +631,6 @@ struct tcp_skb_cb {
 	__u32		end_seq;	/* SEQ + FIN + SYN + datalen	*/
 	__u32		when;		/* used to compute rtt's	*/
 	__u8		flags;		/* TCP header flags.		*/
-
-	/* NOTE: These must match up to the flags byte in a
-	 *       real TCP header.
-	 */
-#define TCPCB_FLAG_FIN		0x01
-#define TCPCB_FLAG_SYN		0x02
-#define TCPCB_FLAG_RST		0x04
-#define TCPCB_FLAG_PSH		0x08
-#define TCPCB_FLAG_ACK		0x10
-#define TCPCB_FLAG_URG		0x20
-#define TCPCB_FLAG_ECE		0x40
-#define TCPCB_FLAG_CWR		0x80
-
 	__u8		sacked;		/* State flags for SACK/FACK.	*/
 #define TCPCB_SACKED_ACKED	0x01	/* SKB ACK'd by a SACK block	*/
 #define TCPCB_SACKED_RETRANS	0x02	/* SKB retransmitted		*/
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 49d0d2b..ada200d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -511,7 +511,7 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 
 static inline void tcp_mark_push(struct tcp_sock *tp, struct sk_buff *skb)
 {
-	TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_PSH;
+	TCP_SKB_CB(skb)->flags |= TH_PUSH;
 	tp->pushed_seq = tp->write_seq;
 }
 
@@ -527,7 +527,7 @@ static inline void skb_entail(struct sock *sk, struct sk_buff *skb)
 
 	skb->csum    = 0;
 	tcb->seq     = tcb->end_seq = tp->write_seq;
-	tcb->flags   = TCPCB_FLAG_ACK;
+	tcb->flags   = TH_ACK;
 	tcb->sacked  = 0;
 	skb_header_release(skb);
 	tcp_add_write_queue_tail(sk, skb);
@@ -815,7 +815,7 @@ new_segment:
 		skb_shinfo(skb)->gso_segs = 0;
 
 		if (!copied)
-			TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_PSH;
+			TCP_SKB_CB(skb)->flags &= ~TH_PUSH;
 
 		copied += copy;
 		poffset += copy;
@@ -1061,7 +1061,7 @@ new_segment:
 			}
 
 			if (!copied)
-				TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_PSH;
+				TCP_SKB_CB(skb)->flags &= ~TH_PUSH;
 
 			tp->write_seq += copy;
 			TCP_SKB_CB(skb)->end_seq += copy;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 548d575..fbadd66 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3286,7 +3286,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets,
 		 * connection startup slow start one packet too
 		 * quickly.  This is severely frowned upon behavior.
 		 */
-		if (!(scb->flags & TCPCB_FLAG_SYN)) {
+		if (!(scb->flags & TH_SYN)) {
 			flag |= FLAG_DATA_ACKED;
 		} else {
 			flag |= FLAG_SYN_ACKED;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index b4ed957..011199a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -294,9 +294,9 @@ static u16 tcp_select_window(struct sock *sk)
 /* Packet ECN state for a SYN-ACK */
 static inline void TCP_ECN_send_synack(struct tcp_sock *tp, struct sk_buff *skb)
 {
-	TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_CWR;
+	TCP_SKB_CB(skb)->flags &= ~TH_CWR;
 	if (!(tp->ecn_flags & TCP_ECN_OK))
-		TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_ECE;
+		TCP_SKB_CB(skb)->flags &= ~TH_ECE;
 }
 
 /* Packet ECN state for a SYN.  */
@@ -306,7 +306,7 @@ static inline void TCP_ECN_send_syn(struct sock *sk, struct sk_buff *skb)
 
 	tp->ecn_flags = 0;
 	if (sysctl_tcp_ecn == 1) {
-		TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_ECE | TCPCB_FLAG_CWR;
+		TCP_SKB_CB(skb)->flags |= TH_ECE | TH_CWR;
 		tp->ecn_flags = TCP_ECN_OK;
 	}
 }
@@ -361,7 +361,7 @@ static void tcp_init_nondata_skb(struct sk_buff *skb, u32 seq, u8 flags)
 	skb_shinfo(skb)->gso_type = 0;
 
 	TCP_SKB_CB(skb)->seq = seq;
-	if (flags & (TCPCB_FLAG_SYN | TCPCB_FLAG_FIN))
+	if (flags & (TH_SYN | TH_FIN))
 		seq++;
 	TCP_SKB_CB(skb)->end_seq = seq;
 }
@@ -820,7 +820,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	tcb = TCP_SKB_CB(skb);
 	memset(&opts, 0, sizeof(opts));
 
-	if (unlikely(tcb->flags & TCPCB_FLAG_SYN))
+	if (unlikely(tcb->flags & TH_SYN))
 		tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
 	else
 		tcp_options_size = tcp_established_options(sk, skb, &opts,
@@ -843,7 +843,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	*(((__be16 *)th) + 6)	= htons(((tcp_header_size >> 2) << 12) |
 					tcb->flags);
 
-	if (unlikely(tcb->flags & TCPCB_FLAG_SYN)) {
+	if (unlikely(tcb->flags & TH_SYN)) {
 		/* RFC1323: The window in SYN & SYN/ACK segments
 		 * is never scaled.
 		 */
@@ -866,7 +866,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 	}
 
 	tcp_options_write((__be32 *)(th + 1), tp, &opts);
-	if (likely((tcb->flags & TCPCB_FLAG_SYN) == 0))
+	if (likely((tcb->flags & TH_SYN) == 0))
 		TCP_ECN_send(sk, skb, tcp_header_size);
 
 #ifdef CONFIG_TCP_MD5SIG
@@ -880,7 +880,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 
 	icsk->icsk_af_ops->send_check(sk, skb);
 
-	if (likely(tcb->flags & TCPCB_FLAG_ACK))
+	if (likely(tcb->flags & TH_ACK))
 		tcp_event_ack_sent(sk, tcp_skb_pcount(skb));
 
 	if (skb->len != tcp_header_size)
@@ -1023,7 +1023,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 
 	/* PSH and FIN should only be set in the second packet. */
 	flags = TCP_SKB_CB(skb)->flags;
-	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_FIN | TCPCB_FLAG_PSH);
+	TCP_SKB_CB(skb)->flags = flags & ~(TH_FIN | TH_PUSH);
 	TCP_SKB_CB(buff)->flags = flags;
 	TCP_SKB_CB(buff)->sacked = TCP_SKB_CB(skb)->sacked;
 
@@ -1328,8 +1328,7 @@ static inline unsigned int tcp_cwnd_test(struct tcp_sock *tp,
 	u32 in_flight, cwnd;
 
 	/* Don't be strict about the congestion window for the final FIN.  */
-	if ((TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN) &&
-	    tcp_skb_pcount(skb) == 1)
+	if ((TCP_SKB_CB(skb)->flags & TH_FIN) && tcp_skb_pcount(skb) == 1)
 		return 1;
 
 	in_flight = tcp_packets_in_flight(tp);
@@ -1398,7 +1397,7 @@ static inline int tcp_nagle_test(struct tcp_sock *tp, struct sk_buff *skb,
 	 * Nagle can be ignored during F-RTO too (see RFC4138).
 	 */
 	if (tcp_urg_mode(tp) || (tp->frto_counter == 2) ||
-	    (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN))
+	    (TCP_SKB_CB(skb)->flags & TH_FIN))
 		return 1;
 
 	if (!tcp_nagle_check(tp, skb, cur_mss, nonagle))
@@ -1487,7 +1486,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len,
 
 	/* PSH and FIN should only be set in the second packet. */
 	flags = TCP_SKB_CB(skb)->flags;
-	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_FIN | TCPCB_FLAG_PSH);
+	TCP_SKB_CB(skb)->flags = flags & ~(TH_FIN | TH_PUSH);
 	TCP_SKB_CB(buff)->flags = flags;
 
 	/* This packet was never sent out yet, so no SACK bits. */
@@ -1518,7 +1517,7 @@ static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	u32 send_win, cong_win, limit, in_flight;
 
-	if (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN)
+	if (TCP_SKB_CB(skb)->flags & TH_FIN)
 		goto send_now;
 
 	if (icsk->icsk_ca_state != TCP_CA_Open)
@@ -1644,7 +1643,7 @@ static int tcp_mtu_probe(struct sock *sk)
 
 	TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(skb)->seq;
 	TCP_SKB_CB(nskb)->end_seq = TCP_SKB_CB(skb)->seq + probe_size;
-	TCP_SKB_CB(nskb)->flags = TCPCB_FLAG_ACK;
+	TCP_SKB_CB(nskb)->flags = TH_ACK;
 	TCP_SKB_CB(nskb)->sacked = 0;
 	nskb->csum = 0;
 	nskb->ip_summed = skb->ip_summed;
@@ -1669,7 +1668,7 @@ static int tcp_mtu_probe(struct sock *sk)
 			sk_wmem_free_skb(sk, skb);
 		} else {
 			TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags &
-						   ~(TCPCB_FLAG_FIN|TCPCB_FLAG_PSH);
+						   ~(TH_FIN|TH_PUSH);
 			if (!skb_shinfo(skb)->nr_frags) {
 				skb_pull(skb, copy);
 				if (skb->ip_summed != CHECKSUM_PARTIAL)
@@ -2020,7 +2019,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
 
 	if (!sysctl_tcp_retrans_collapse)
 		return;
-	if (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_SYN)
+	if (TCP_SKB_CB(skb)->flags & TH_SYN)
 		return;
 
 	tcp_for_write_queue_from_safe(skb, tmp, sk) {
@@ -2112,7 +2111,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	 * since it is cheap to do so and saves bytes on the network.
 	 */
 	if (skb->len > 0 &&
-	    (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN) &&
+	    (TCP_SKB_CB(skb)->flags & TH_FIN) &&
 	    tp->snd_una == (TCP_SKB_CB(skb)->end_seq - 1)) {
 		if (!pskb_trim(skb, 0)) {
 			/* Reuse, even though it does some unnecessary work */
@@ -2301,7 +2300,7 @@ void tcp_send_fin(struct sock *sk)
 	mss_now = tcp_current_mss(sk);
 
 	if (tcp_send_head(sk) != NULL) {
-		TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_FIN;
+		TCP_SKB_CB(skb)->flags |= TH_FIN;
 		TCP_SKB_CB(skb)->end_seq++;
 		tp->write_seq++;
 	} else {
@@ -2317,8 +2316,7 @@ void tcp_send_fin(struct sock *sk)
 		/* Reserve space for headers and prepare control bits. */
 		skb_reserve(skb, MAX_TCP_HEADER);
 		/* FIN eats a sequence byte, write_seq advanced by tcp_queue_skb(). */
-		tcp_init_nondata_skb(skb, tp->write_seq,
-				     TCPCB_FLAG_ACK | TCPCB_FLAG_FIN);
+		tcp_init_nondata_skb(skb, tp->write_seq, TH_ACK | TH_FIN);
 		tcp_queue_skb(sk, skb);
 	}
 	__tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_OFF);
@@ -2342,8 +2340,7 @@ void tcp_send_active_reset(struct sock *sk, gfp_t priority)
 
 	/* Reserve space for headers and prepare control bits. */
 	skb_reserve(skb, MAX_TCP_HEADER);
-	tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk),
-			     TCPCB_FLAG_ACK | TCPCB_FLAG_RST);
+	tcp_init_nondata_skb(skb, tcp_acceptable_seq(sk), TH_ACK | TH_RST);
 	/* Send it off. */
 	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	if (tcp_transmit_skb(sk, skb, 0, priority))
@@ -2363,11 +2360,11 @@ int tcp_send_synack(struct sock *sk)
 	struct sk_buff *skb;
 
 	skb = tcp_write_queue_head(sk);
-	if (skb == NULL || !(TCP_SKB_CB(skb)->flags & TCPCB_FLAG_SYN)) {
+	if (skb == NULL || !(TCP_SKB_CB(skb)->flags & TH_SYN)) {
 		printk(KERN_DEBUG "tcp_send_synack: wrong queue state\n");
 		return -EFAULT;
 	}
-	if (!(TCP_SKB_CB(skb)->flags & TCPCB_FLAG_ACK)) {
+	if (!(TCP_SKB_CB(skb)->flags & TH_ACK)) {
 		if (skb_cloned(skb)) {
 			struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC);
 			if (nskb == NULL)
@@ -2381,7 +2378,7 @@ int tcp_send_synack(struct sock *sk)
 			skb = nskb;
 		}
 
-		TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_ACK;
+		TCP_SKB_CB(skb)->flags |= TH_ACK;
 		TCP_ECN_send_synack(tcp_sk(sk), skb);
 	}
 	TCP_SKB_CB(skb)->when = tcp_time_stamp;
@@ -2459,8 +2456,7 @@ struct sk_buff *tcp_make_synack(struct sock *sk, struct dst_entry *dst,
 	/* Setting of flags are superfluous here for callers (and ECE is
 	 * not even correctly set)
 	 */
-	tcp_init_nondata_skb(skb, tcp_rsk(req)->snt_isn,
-			     TCPCB_FLAG_SYN | TCPCB_FLAG_ACK);
+	tcp_init_nondata_skb(skb, tcp_rsk(req)->snt_isn, TH_SYN | TH_ACK);
 
 	if (OPTION_COOKIE_EXTENSION & opts.options) {
 		if (s_data_desired) {
@@ -2592,7 +2588,7 @@ int tcp_connect(struct sock *sk)
 	skb_reserve(buff, MAX_TCP_HEADER);
 
 	tp->snd_nxt = tp->write_seq;
-	tcp_init_nondata_skb(buff, tp->write_seq++, TCPCB_FLAG_SYN);
+	tcp_init_nondata_skb(buff, tp->write_seq++, TH_SYN);
 	TCP_ECN_send_syn(sk, buff);
 
 	/* Send it off. */
@@ -2698,7 +2694,7 @@ void tcp_send_ack(struct sock *sk)
 
 	/* Reserve space for headers and prepare control bits. */
 	skb_reserve(buff, MAX_TCP_HEADER);
-	tcp_init_nondata_skb(buff, tcp_acceptable_seq(sk), TCPCB_FLAG_ACK);
+	tcp_init_nondata_skb(buff, tcp_acceptable_seq(sk), TH_ACK);
 
 	/* Send it off, this clears delayed acks for us. */
 	TCP_SKB_CB(buff)->when = tcp_time_stamp;
@@ -2732,7 +2728,7 @@ static int tcp_xmit_probe_skb(struct sock *sk, int urgent)
 	 * end to send an ack.  Don't queue or clone SKB, just
 	 * send it.
 	 */
-	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TCPCB_FLAG_ACK);
+	tcp_init_nondata_skb(skb, tp->snd_una - !urgent, TH_ACK);
 	TCP_SKB_CB(skb)->when = tcp_time_stamp;
 	return tcp_transmit_skb(sk, skb, 0, GFP_ATOMIC);
 }
@@ -2762,13 +2758,13 @@ int tcp_write_wakeup(struct sock *sk)
 		if (seg_size < TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq ||
 		    skb->len > mss) {
 			seg_size = min(seg_size, mss);
-			TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_PSH;
+			TCP_SKB_CB(skb)->flags |= TH_PUSH;
 			if (tcp_fragment(sk, skb, seg_size, mss))
 				return -1;
 		} else if (!tcp_skb_pcount(skb))
 			tcp_set_skb_tso_segs(sk, skb, mss);
 
-		TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_PSH;
+		TCP_SKB_CB(skb)->flags |= TH_PUSH;
 		TCP_SKB_CB(skb)->when = tcp_time_stamp;
 		err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
 		if (!err)
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index 9dd8cd4..8c65d62 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -736,15 +736,6 @@ static bool tcp_in_window(const struct nf_conn *ct,
 	return res;
 }
 
-#define	TH_FIN	0x01
-#define	TH_SYN	0x02
-#define	TH_RST	0x04
-#define	TH_PUSH	0x08
-#define	TH_ACK	0x10
-#define	TH_URG	0x20
-#define	TH_ECE	0x40
-#define	TH_CWR	0x80
-
 /* table of valid flag combinations - PUSH, ECE and CWR are always valid */
 static const u8 tcp_valid_flags[(TH_FIN|TH_SYN|TH_RST|TH_ACK|TH_URG) + 1] =
 {
@@ -803,7 +794,7 @@ static int tcp_error(struct net *net, struct nf_conn *tmpl,
 	}
 
 	/* Check TCP flags. */
-	tcpflags = (((u_int8_t *)th)[13] & ~(TH_ECE|TH_CWR|TH_PUSH));
+	tcpflags = (tcp_flag_byte(th) & ~(TH_ECE|TH_CWR|TH_PUSH));
 	if (!tcp_valid_flags[tcpflags]) {
 		if (LOG_INVALID(net, IPPROTO_TCP))
 			nf_log_packet(pf, 0, skb, NULL, NULL, NULL,
diff --git a/net/netfilter/xt_TCPMSS.c b/net/netfilter/xt_TCPMSS.c
index 1841388..5bd203b 100644
--- a/net/netfilter/xt_TCPMSS.c
+++ b/net/netfilter/xt_TCPMSS.c
@@ -220,8 +220,6 @@ tcpmss_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 }
 #endif
 
-#define TH_SYN 0x02
-
 /* Must specify -p tcp --syn */
 static inline bool find_syn_match(const struct xt_entry_match *m)
 {

^ permalink raw reply related

* Re: [ath5k-devel] [PATCH] [ath5k][leds] Ability to disable leds support. If leds support enabled do not force mac802.11 leds layer selection.
From: Dmytro Milinevskyy @ 2010-06-12  2:02 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Bob Copeland, ath5k-devel-xDcbHBWguxEUs3QNXV6qNA, Kalle Valo,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA, GeunSik Lim, Jiri Slaby,
	Greg Kroah-Hartman, John W. Linville, Keng-Yu Lin,
	netdev-u79uwXL29TY76Z2rM5mHXA, Jiri Kosina, Shahar Or,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Luca Verdesca
In-Reply-To: <1276288163.3918.1.camel-8upI4CBIZJIJvtFkdXX2HixXY32XiHfO@public.gmane.org>

In this case no way to control LEDs. If you turn LEDs support you
can't tune them out, no?

I believe that user should be able to chose whether LEDs support is
needed or not. This is how done in intel wireless drivers. The driver
should be tolerant.

-- Dima

On Fri, Jun 11, 2010 at 11:29 PM, Johannes Berg
<johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> wrote:
> On Fri, 2010-06-11 at 23:26 +0300, Dmytro Milinevskyy wrote:
>> Generic LED class is not disabled so it's possible to control leds in sysfs.
>
> But if you turn off ATH5K_LEDS??
>
> johannes
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 1/2] pktgen: increasing transmission granularity
From: David Miller @ 2010-06-12  1:40 UTC (permalink / raw)
  To: robert; +Cc: daniel.turull, netdev, jens.laas
In-Reply-To: <19473.7957.743957.808639@gargle.gargle.HOWL>

From: Robert Olsson <robert@herjulf.net>
Date: Thu, 10 Jun 2010 19:21:25 +0200

> So with this improved rate control we could do RFC2544-like testing as well.
>
> Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>

Applied, thanks everyone.

^ permalink raw reply

* Re: [PATCH net-next-2.6] econet: fix locking
From: David Miller @ 2010-06-12  1:39 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1276137185.2475.19.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 10 Jun 2010 04:33:05 +0200

> econet lacks proper locking. It holds econet_lock only when inserting or
> deleting an entry in econet_sklist, not during lookups.
> 
> - convert econet_lock from rwlock to spinlock
> 
> - use econet_lock in ec_listening_socket() lookup
> 
> - use appropriate sock_hold() / sock_put() to avoid corruptions.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] pkt_sched: gen_kill_estimator() rcu fixes
From: David Miller @ 2010-06-12  1:39 UTC (permalink / raw)
  To: eric.dumazet; +Cc: jarkao2, xiaosuo, netdev, shemminger, kaber
In-Reply-To: <1276085363.2442.125.camel@edumazet-laptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 09 Jun 2010 14:09:23 +0200

> [PATCH v2] pkt_sched: gen_kill_estimator() rcu fixes
> 
> gen_kill_estimator() API is incomplete or not well documented, since
> caller should make sure an RCU grace period is respected before
> freeing stats_lock.
> 
> This was partially addressed in commit 5d944c640b4 
> (gen_estimator: deadlock fix), but same problem exist for all
> gen_kill_estimator() users, if lock they use is not already RCU
> protected.
> 
> A code review shows xt_RATEEST.c, act_api.c, act_police.c have this
> problem. Other are ok because they use qdisc lock, already RCU
> protected.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] bnx2: Fix compiler warning in bnx2_disable_forced_2g5().
From: David Miller @ 2010-06-12  1:38 UTC (permalink / raw)
  To: mchan; +Cc: prarit, netdev
In-Reply-To: <1276017690-5442-1-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Tue, 8 Jun 2010 10:21:30 -0700

> drivers/net/bnx2.c: In function 'bnx2_disable_forced_2g5':
> drivers/net/bnx2.c:1489: warning: 'bmcr' may be used uninitialized in this function
> 
> We fix it by checking return values from all bnx2_read_phy() and proceeding
> to do read-modify-write only if the read operation is successful.
> 
> The related bnx2_enable_forced_2g5() is also fixed the same way.
> 
> Reported-by: Prarit Bhargava <prarit@redhat.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [patch] enic: cleanup vic_provinfo_alloc()
From: David Miller @ 2010-06-12  1:37 UTC (permalink / raw)
  To: error27; +Cc: scofeldm, vkolluri, roprabhu, netdev, kernel-janitors
In-Reply-To: <20100610075903.GL5483@bicker>

From: Dan Carpenter <error27@gmail.com>
Date: Thu, 10 Jun 2010 09:59:03 +0200

> If oui were a null variable then vic_provinfo_alloc() would leak memory.
> But this function is only called from one place and oui is not null so 
> I removed the check.
> 
> I also moved the memory allocation down a line so it was easier to spot.
> (No one ever reads variable declarations).
> 
> Signed-off-by: Dan Carpenter <error27@gmail.com>

Applied to net-next-2.6, thanks.

^ permalink raw reply

* Re: [patch] enic: cleanup vic_provinfo_alloc()
From: Scott Feldman @ 2010-06-12  0:05 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: Vasanthy Kolluri, Roopa Prabhu, netdev, kernel-janitors
In-Reply-To: <20100610075903.GL5483@bicker>

On 6/10/10 12:59 AM, "Dan Carpenter" <error27@gmail.com> wrote:

> If oui were a null variable then vic_provinfo_alloc() would leak memory.
> But this function is only called from one place and oui is not null so
> I removed the check.
> 
> I also moved the memory allocation down a line so it was easier to spot.
> (No one ever reads variable declarations).
> 
> Signed-off-by: Dan Carpenter <error27@gmail.com>

We'll pick this one up and resubmit with the next enic patch bomb.  Thanks
Dan.

-scott


^ permalink raw reply

* Re: Weak host model vs .interface down
From: Mark Smith @ 2010-06-11 23:57 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Rick Jones, netdev
In-Reply-To: <OF73173D19.3439A8A1-ONC125773F.006BA38C-C125773F.006C31A4@transmode.se>

On Fri, 11 Jun 2010 21:41:45 +0200
Joakim Tjernlund <joakim.tjernlund@transmode.se> wrote:

> Rick Jones <rick.jones2@hp.com> wrote on 2010/06/11 19:13:42:
> >
> > > The weak model doesn't go into such detail, it is assumption/impl. detail
> > > to assume that the ip address still is part of the system even when the interface
> > > is down. One could just as well define interface down as temporarly removing
> > > the IP address from the system too. This makes make much more sense to me and
> > > if you always want the system to answer on a IP adress you make it an IP alias.
> > >
> > > Since the current behaviour is a problem to me and routers in general, can
> > > we change this? What is the current usage model which needs it to stay as is?
> >
> > Router != end-system  so I wouldn't think the weak or strong end-system model
> > would apply to a router.  I think Stephen already posted a patch to allow that
> > for when one's box was a router rather than an end-system.
> 
> Not really an anwser to what I was asking but I choose to read that as
> you agree with me. The rest is an impl. detail. :)
> Stephen's patch is good but I would not mind making I/F down removing the
> IP address from the system unconditionally.
> 

I've asked the same question a few years back and got the same answer.
I accept the strong host / weak host argument, however I've also
thought about the problem a bit more, and why people get confused about
it.

The problem is the mental model. Assigning an IP address to an
interface implies that the IP address as attached and associated with
the interface and therefore the state of the interface. That is
certainly the case for people like me who work with networking
equipment, typically routers, which follow the strong host model. It is
very convenient to know that by shutting down an interface the
associated IP address stops working too. Other measures, such as
ACLing, or writing down and deleting and then having put it back, are
relatively much more effort and error prone.

While I'm sure past operational history is likely to make this
impractical, it would be far more intuitive for weak host model IP
address assignments to be made to a single, forced always up virtual
interface on the host, and strong host IP address assignments made to
any other "non-weak host" interfaces.

It'd be an interesting experiment to see if loopback could be used as a
"host interface" in the weak host model.

Regards,
Mark.

^ permalink raw reply

* [net-2.6 PATCH] ixgbe: fix for race with 8259(8|9) during shutdown
From: Jeff Kirsher @ 2010-06-11 23:20 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Akihiko Saitou, Don Skidmore, Jeff Kirsher

From: Don Skidmore <donald.c.skidmore@intel.com>

There is a small window where the watchdog could be running as the
interface is brought down on a NIC with two ports wired back to back.
If ixgbe_update_status is then called can lead to a panic.  This patch
allows the update to bail if we are in that condition.

This issue was orignally reported and fix proposed by Akihiko Saitou.

CC: Akihiko Saitou <asaitou@users.sourceforge.net>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_main.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index b2af2f6..ce30c62 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -5282,6 +5282,10 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 	u32 i, missed_rx = 0, mpc, bprc, lxon, lxoff, xon_off_tot;
 	u64 non_eop_descs = 0, restart_queue = 0;
 
+	if (test_bit(__IXGBE_DOWN, &adapter->state) ||
+	    test_bit(__IXGBE_RESETTING, &adapter->state))
+		return;
+
 	if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED) {
 		u64 rsc_count = 0;
 		u64 rsc_flush = 0;


^ permalink raw reply related

* Re: [PATCH 3/8] netpoll: Fix RCU usage
From: Paul E. McKenney @ 2010-06-11 23:10 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Michael S. Tsirkin, Qianfeng Zhang, David S. Miller, netdev,
	WANG Cong, Stephen Hemminger, Matt Mackall
In-Reply-To: <E1OMtjg-0006Ob-Sb@gondolin.me.apana.org.au>

On Fri, Jun 11, 2010 at 12:12:44PM +1000, Herbert Xu wrote:
> netpoll: Fix RCU usage
> 
> The use of RCU in netpoll is incorrect in a number of places:
> 
> 1) The initial setting is lacking a write barrier.
> 2) The synchronize_rcu is in the wrong place.
> 3) Read barriers are missing.
> 4) Some places are even missing rcu_read_lock.
> 5) npinfo is zeroed after freeing.
> 
> This patch fixes those issues.  As most users are in BH context,
> this also converts the RCU usage to the BH variant.

Looks good for me from an RCU viewpoint, but as usual I confess much
ignorance of the surrounding code.

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> ---
> 
>  include/linux/netpoll.h |   13 ++++++++-----
>  net/core/netpoll.c      |   20 ++++++++++++--------
>  2 files changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
> index e9e2312..95c9f7e 100644
> --- a/include/linux/netpoll.h
> +++ b/include/linux/netpoll.h
> @@ -57,12 +57,15 @@ void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb);
>  #ifdef CONFIG_NETPOLL
>  static inline bool netpoll_rx(struct sk_buff *skb)
>  {
> -	struct netpoll_info *npinfo = skb->dev->npinfo;
> +	struct netpoll_info *npinfo;
>  	unsigned long flags;
>  	bool ret = false;
> 
> +	rcu_read_lock_bh();
> +	npinfo = rcu_dereference(skb->dev->npinfo);
> +
>  	if (!npinfo || (list_empty(&npinfo->rx_np) && !npinfo->rx_flags))
> -		return false;
> +		goto out;
> 
>  	spin_lock_irqsave(&npinfo->rx_lock, flags);
>  	/* check rx_flags again with the lock held */
> @@ -70,12 +73,14 @@ static inline bool netpoll_rx(struct sk_buff *skb)
>  		ret = true;
>  	spin_unlock_irqrestore(&npinfo->rx_lock, flags);
> 
> +out:
> +	rcu_read_unlock_bh();
>  	return ret;
>  }
> 
>  static inline int netpoll_rx_on(struct sk_buff *skb)
>  {
> -	struct netpoll_info *npinfo = skb->dev->npinfo;
> +	struct netpoll_info *npinfo = rcu_dereference(skb->dev->npinfo);
> 
>  	return npinfo && (!list_empty(&npinfo->rx_np) || npinfo->rx_flags);
>  }
> @@ -91,7 +96,6 @@ static inline void *netpoll_poll_lock(struct napi_struct *napi)
>  {
>  	struct net_device *dev = napi->dev;
> 
> -	rcu_read_lock(); /* deal with race on ->npinfo */
>  	if (dev && dev->npinfo) {
>  		spin_lock(&napi->poll_lock);
>  		napi->poll_owner = smp_processor_id();
> @@ -108,7 +112,6 @@ static inline void netpoll_poll_unlock(void *have)
>  		napi->poll_owner = -1;
>  		spin_unlock(&napi->poll_lock);
>  	}
> -	rcu_read_unlock();
>  }
> 
>  #else
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index 748c930..4b7623d 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -292,6 +292,7 @@ void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
>  	unsigned long tries;
>  	struct net_device *dev = np->dev;
>  	const struct net_device_ops *ops = dev->netdev_ops;
> +	/* It is up to the caller to keep npinfo alive. */
>  	struct netpoll_info *npinfo = np->dev->npinfo;
> 
>  	if (!npinfo || !netif_running(dev) || !netif_device_present(dev)) {
> @@ -841,10 +842,7 @@ int netpoll_setup(struct netpoll *np)
>  	refill_skbs();
> 
>  	/* last thing to do is link it to the net device structure */
> -	ndev->npinfo = npinfo;
> -
> -	/* avoid racing with NAPI reading npinfo */
> -	synchronize_rcu();
> +	rcu_assign_pointer(ndev->npinfo, npinfo);
> 
>  	return 0;
> 
> @@ -888,6 +886,16 @@ void netpoll_cleanup(struct netpoll *np)
> 
>  			if (atomic_dec_and_test(&npinfo->refcnt)) {
>  				const struct net_device_ops *ops;
> +
> +				ops = np->dev->netdev_ops;
> +				if (ops->ndo_netpoll_cleanup)
> +					ops->ndo_netpoll_cleanup(np->dev);
> +
> +				rcu_assign_pointer(np->dev->npinfo, NULL);
> +
> +				/* avoid racing with NAPI reading npinfo */
> +				synchronize_rcu_bh();
> +
>  				skb_queue_purge(&npinfo->arp_tx);
>  				skb_queue_purge(&npinfo->txq);
>  				cancel_rearming_delayed_work(&npinfo->tx_work);
> @@ -895,10 +903,6 @@ void netpoll_cleanup(struct netpoll *np)
>  				/* clean after last, unfinished work */
>  				__skb_queue_purge(&npinfo->txq);
>  				kfree(npinfo);
> -				ops = np->dev->netdev_ops;
> -				if (ops->ndo_netpoll_cleanup)
> -					ops->ndo_netpoll_cleanup(np->dev);
> -				np->dev->npinfo = NULL;
>  			}
>  		}
> 

^ permalink raw reply

* [net-2.6 PATCH] e1000: Fix message logging defect
From: Jeff Kirsher @ 2010-06-11 22:51 UTC (permalink / raw)
  To: davem; +Cc: netdev, gospo, Joe Perches, Jeff Kirsher

From: Joe Perches <joe@perches.com>

commit 675ad47375c76a7c3be4ace9554d92cd55518ced
removed the capability to use ethtool.set_msglevel to
control the types of messages emitted by the driver.

That commit should probably be reverted.

If not, then this patch fixes a message logging defect
introduced by converting a printk without KERN_<level>
to e_info.

This also reduces text by about 200 bytes.

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/e1000/e1000_main.c |   17 ++++++++---------
 1 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index ebdea08..68a8089 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -1047,15 +1047,14 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
 		goto err_register;
 
 	/* print bus type/speed/width info */
-	e_info("(PCI%s:%s:%s) ",
-		((hw->bus_type == e1000_bus_type_pcix) ? "-X" : ""),
-		((hw->bus_speed == e1000_bus_speed_133) ? "133MHz" :
-		 (hw->bus_speed == e1000_bus_speed_120) ? "120MHz" :
-		 (hw->bus_speed == e1000_bus_speed_100) ? "100MHz" :
-		 (hw->bus_speed == e1000_bus_speed_66) ? "66MHz" : "33MHz"),
-		((hw->bus_width == e1000_bus_width_64) ? "64-bit" : "32-bit"));
-
-	e_info("%pM\n", netdev->dev_addr);
+	e_info("(PCI%s:%dMHz:%d-bit) %pM\n",
+	       ((hw->bus_type == e1000_bus_type_pcix) ? "-X" : ""),
+	       ((hw->bus_speed == e1000_bus_speed_133) ? 133 :
+		(hw->bus_speed == e1000_bus_speed_120) ? 120 :
+		(hw->bus_speed == e1000_bus_speed_100) ? 100 :
+		(hw->bus_speed == e1000_bus_speed_66) ? 66 : 33),
+	       ((hw->bus_width == e1000_bus_width_64) ? 64 : 32),
+	       netdev->dev_addr);
 
 	/* carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);


^ permalink raw reply related

* [net-2.6 PATCH] ixgbe: fix automatic LRO/RSC settings for low latency
From: Jeff Kirsher @ 2010-06-11 22:47 UTC (permalink / raw)
  To: davem
  Cc: netdev, gospo, Andy Gospodarek, Stanislaw Gruszka, stable,
	Jeff Kirsher

From: Andy Gospodarek <andy@greyhouse.net>

This patch added to 2.6.34:

	commit f8d1dcaf88bddc7f282722ec1fdddbcb06a72f18
	Author: Jesse Brandeburg <jesse.brandeburg@intel.com>
	Date:   Tue Apr 27 01:37:20 2010 +0000

	    ixgbe: enable extremely low latency

introduced a feature where LRO (called RSC on the hardware) was disabled
automatically when setting rx-usecs to 0 via ethtool.  Some might not
like the fact that LRO was disabled automatically, but I'm fine with
that.  What I don't like is that LRO/RSC is automatically enabled when
rx-usecs is set >0 via ethtool.

This would certainly be a problem if the device was used for forwarding
and it was determined that the low latency wasn't needed after the
device was already forwarding.  I played around with saving the state of
LRO in the driver, but it just didn't seem worthwhile and would require
a small change to dev_disable_lro() that I did not like.

This patch simply leaves LRO disabled when setting rx-usecs >0 and
requires that the user enable it again.  An extra informational message
will also now appear in the log so users can understand why LRO isn't
being enabled as they expect.

Inconsistency of LRO setting first noticed by Stanislaw Gruszka.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Stanislaw Gruszka <sgruszka@redhat.com>
CC: stable@kernel.org
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

 drivers/net/ixgbe/ixgbe_ethtool.c |   37 ++++++++-----------------------------
 1 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethtool.c b/drivers/net/ixgbe/ixgbe_ethtool.c
index c50a754..3a93a81 100644
--- a/drivers/net/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ixgbe/ixgbe_ethtool.c
@@ -2077,25 +2077,6 @@ static int ixgbe_get_coalesce(struct net_device *netdev,
 	return 0;
 }
 
-/*
- * this function must be called before setting the new value of
- * rx_itr_setting
- */
-static bool ixgbe_reenable_rsc(struct ixgbe_adapter *adapter,
-                               struct ethtool_coalesce *ec)
-{
-	/* check the old value and enable RSC if necessary */
-	if ((adapter->rx_itr_setting == 0) &&
-	    (adapter->flags2 & IXGBE_FLAG2_RSC_CAPABLE)) {
-		adapter->flags2 |= IXGBE_FLAG2_RSC_ENABLED;
-		adapter->netdev->features |= NETIF_F_LRO;
-		DPRINTK(PROBE, INFO, "rx-usecs set to %d, re-enabling RSC\n",
-		        ec->rx_coalesce_usecs);
-		return true;
-	}
-	return false;
-}
-
 static int ixgbe_set_coalesce(struct net_device *netdev,
                               struct ethtool_coalesce *ec)
 {
@@ -2124,9 +2105,6 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
 		    (1000000/ec->rx_coalesce_usecs < IXGBE_MIN_INT_RATE))
 			return -EINVAL;
 
-		/* check the old value and enable RSC if necessary */
-		need_reset = ixgbe_reenable_rsc(adapter, ec);
-
 		/* store the value in ints/second */
 		adapter->rx_eitr_param = 1000000/ec->rx_coalesce_usecs;
 
@@ -2135,9 +2113,6 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
 		/* clear the lower bit as its used for dynamic state */
 		adapter->rx_itr_setting &= ~1;
 	} else if (ec->rx_coalesce_usecs == 1) {
-		/* check the old value and enable RSC if necessary */
-		need_reset = ixgbe_reenable_rsc(adapter, ec);
-
 		/* 1 means dynamic mode */
 		adapter->rx_eitr_param = 20000;
 		adapter->rx_itr_setting = 1;
@@ -2157,10 +2132,11 @@ static int ixgbe_set_coalesce(struct net_device *netdev,
 		 */
 		if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED) {
 			adapter->flags2 &= ~IXGBE_FLAG2_RSC_ENABLED;
-			netdev->features &= ~NETIF_F_LRO;
-			DPRINTK(PROBE, INFO,
-			        "rx-usecs set to 0, disabling RSC\n");
-
+			if (netdev->features & NETIF_F_LRO) {
+				netdev->features &= ~NETIF_F_LRO;
+				DPRINTK(PROBE, INFO, "rx-usecs set to 0, "
+					"disabling LRO/RSC\n");
+			}
 			need_reset = true;
 		}
 	}
@@ -2255,6 +2231,9 @@ static int ixgbe_set_flags(struct net_device *netdev, u32 data)
 			}
 		} else if (!adapter->rx_itr_setting) {
 			netdev->features &= ~ETH_FLAG_LRO;
+			if (data & ETH_FLAG_LRO)
+				DPRINTK(PROBE, INFO, "rx-usecs set to 0, "
+					"LRO/RSC cannot be enabled.\n");
 		}
 	}
 


^ permalink raw reply related

* [PATCH] gianfar: Fix oversized packets handling
From: Anton Vorontsov @ 2010-06-11 20:51 UTC (permalink / raw)
  To: David Miller; +Cc: Sandeep Gopalpet, Andy Fleming, netdev, linuxppc-dev

Issuing the following command on host:

$ ifconfig eth2 mtu 1600 ; ping 10.0.0.27 -s 1485 -c 1

Makes some boards (tested with MPC8315 rev 1.1 and MPC8313 rev 1.0)
oops like this:

  skb_over_panic: text:c0195914 len:1537 put:1537 head:c79e4800 data:c79e4880 tail:0xc79e4e81 end:0xc79e4e80 dev:eth1
  ------------[ cut here ]------------
  kernel BUG at net/core/skbuff.c:127!
  Oops: Exception in kernel mode, sig: 5 [#1]
  MPC831x RDB
  last sysfs file: /sys/kernel/uevent_seqnum
  Modules linked in:
  NIP: c01c1840 LR: c01c1840 CTR: c016d918
  [...]
  NIP [c01c1840] skb_over_panic+0x48/0x5c
  LR [c01c1840] skb_over_panic+0x48/0x5c
  Call Trace:
  [c0339d50] [c01c1840] skb_over_panic+0x48/0x5c (unreliable)
  [c0339d60] [c01c3020] skb_put+0x5c/0x60
  [c0339d70] [c0195914] gfar_clean_rx_ring+0x25c/0x3d0
  [c0339dc0] [c01976e8] gfar_poll+0x170/0x1bc

Dumped buffer descriptors showed that eTSEC's length/truncation
logic sometimes passes oversized packets, i.e. for the above ICMP
packet the following two buffer descriptors may become ready:

  status=1400 length=1536
  status=1800 length=1541

So, it seems that gianfar actually receives the whole big frame,
and it tries to place the packet into two BDs. This situation
confuses the driver, and so the skb_put() sanity check fails.

This patch fixes the issue by adding an appropriate check, i.e.
the driver should not try to process frames with buffer
descriptor's length over rx_buffer_size (i.e. maxfrm and mrblr).

Note that sometimes eTSEC works correctly, i.e. in the second
(last) buffer descriptor bits 'truncated' and 'crcerr' are set,
and so there's no oops. Though I couldn't find any logic when
it works correctly and when not.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---
 drivers/net/gianfar.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 46c69cd..7c31f0f 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2642,6 +2642,10 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
 		dma_unmap_single(&priv->ofdev->dev, bdp->bufPtr,
 				priv->rx_buffer_size, DMA_FROM_DEVICE);

+		if (unlikely(!(bdp->status & RXBD_ERR) &&
+				bdp->length > priv->rx_buffer_size))
+			bdp->status = RXBD_LARGE;
+
 		/* We drop the frame if we failed to allocate a new buffer */
 		if (unlikely(!newskb || !(bdp->status & RXBD_LAST) ||
 				 bdp->status & RXBD_ERR)) {
-- 
1.7.0.5

^ permalink raw reply related

* Re: [PATCH 1/4][RFC] Introduction to sk_buff state checking
From: David VomLehn @ 2010-06-11 20:50 UTC (permalink / raw)
  To: David Miller; +Cc: to, linux-kernel, akpm, netdev
In-Reply-To: <20100611.133811.52184181.davem@davemloft.net>

On Fri, Jun 11, 2010 at 01:38:11PM -0700, David Miller wrote:
> 
> This is the second time you've submitted this patch set,

There are changes, based on the feedback I got.

> it is also the second time that:
> 
> 1) Your email client has emitted these strange
>    "linux-kernel@vger.kernel.org"@cisco.com
>    addresses.
> 
>    I saw that you were specifically notified about this
>    problem during the previous submission.
> 
>    What did you do to try and correct it?
> 
> 2) You are not CC:'ing the networking developer mailing
>    list "netdev@vger.kernel.org"  Many networking developers
>    do not read linux-kernel, so by not CC:'ing that list
>    you are not reaching the folks most capable of reviewing
>    your patches.

All true, and I apologize. I know there's enough traffic on kernel mailing
lists as it is.

> Please fix this up.

Working on it now.

-- 
David VL

^ permalink raw reply

* Re: Weak host model vs .interface down
From: Joakim Tjernlund @ 2010-06-11 20:46 UTC (permalink / raw)
  To: Mitchell Erblich; +Cc: netdev, Rick Jones
In-Reply-To: <2297DADB-883E-4156-998A-854BE6CAC079@earthlink.net>

Mitchell Erblich <erblichs@earthlink.net> wrote on 2010/06/11 21:50:14:
>
>
> On Jun 11, 2010, at 10:06 AM, Joakim Tjernlund wrote:
>
> > Rick Jones <rick.jones2@hp.com> wrote on 2010/06/11 18:32:20:
> >> Joakim Tjernlund wrote:
> >>> Linux uses the weak host model which makes the IP addresses part of the system
> >>> rather than the interface. However consider this:
> >>>
> >>> System A, eth0 connected to the network
> >>> # > ifconfig eth0 192.168.1.16
> >>> # > ifconfig eth1 192.168.1.17 down
> >>>
> >>> System B
> >>> # > ping 192.168.1.17
> >>> PING 192.168.1.17 (192.168.1.17) 56(84) bytes of data.
> >>> 64 bytes from 192.168.1.17: icmp_seq=1 ttl=64 time=0.618 ms
> >>>
> >>> Isn't it a bit much to respond on 192.168.1.17 when its interface is down?
> >>
> >> As you said at the beginning, the weak end system model presumes the IP address
> >> is part of the system.  Seems to me that means unless one removes the IP address
> >> from the system it is reasonable for the system to continue to respond to that
> >> IP address.  Regardless of what happens to any individual interface.
> >
> > The weak model doesn't go into such detail, it is assumption/impl. detail
> > to assume that the ip address still is part of the system even when the interface
> > is down. One could just as well define interface down as temporarly removing
> > the IP address from the system too. This makes make much more sense to me and
> > if you always want the system to answer on a IP adress you make it an IP alias.
> >
> > Since the current behaviour is a problem to me and routers in general, can
> > we change this? What is the current usage model which needs it to stay as is?
> >
> >>
> >> Now, I wouldn't expect it to continue to respond to 192.168.1.17 through eth1,
> >> but if eth0 is indeed connected to the same broadcast domain, given the
> >> following of the weak end-system model, continuing to respond seems consistent
> >> with enthusiasticaly following the weak end-system model.
> >
> > Dosnt matter if it is in the same broadcast domain, you can use a bridge
> > interface or dummy interface too. It will still respond to 192.168.1.17
> > I can't find a way disable this behaviour, can you?
> >
> > --
>
> Guys
>
> Isn't this the diff between models of a host/end system and a
> router/intermediate system?

Not sure what you mean here, but there is no such assumtion in
the models.

>
> Can you verify that xmit capability on the intf is disabled with the
> down arg?

umm, isn't that true by definition? if an I/F is put into down state, it
cannot xmit nor receive.

>
> IMO, One possible behaviour is to allow the receipt of a magic
> packet to bring up a down system for the "energy star protocol".

isn't that something totally different? I cannot se how that relates
to the matter at hand.

 Jocke


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox