Netdev List

* Re: [PATCH 00/13] net: Add and use ether_addr_equal
From: Johannes Berg @ 2012-05-10  6:48 UTC (permalink / raw)
  To: David Miller
  Cc: coreteam, netdev, bridge, linux-wireless, linux-kernel,
	linux-bluetooth, netfilter, netfilter-devel, joe
In-Reply-To: <20120509.212110.437114897180819434.davem@davemloft.net>

On Wed, 2012-05-09 at 21:21 -0400, David Miller wrote:

> That case you didn't convert in mac80211 is probably the
> bug Johannes was talking about which started this whole
> discussion.

The bug case that started it all is in net/wireless/scan.c and Emmanuel
has since changed it back to memcmp(). Not sure if that's the one you
were referring to or not :-)

johannes

^ permalink raw reply

* Re: [PATCH 00/13] net: Add and use ether_addr_equal
From: Joe Perches @ 2012-05-10  6:54 UTC (permalink / raw)
  To: Johannes Berg
  Cc: David Miller, netdev, bridge, netfilter-devel, netfilter,
	coreteam, linux-wireless, linux-kernel, linux-bluetooth
In-Reply-To: <1336632528.4334.1.camel@jlt3.sipsolutions.net>

On Thu, 2012-05-10 at 08:48 +0200, Johannes Berg wrote:
> On Wed, 2012-05-09 at 21:21 -0400, David Miller wrote:
> 
> > That case you didn't convert in mac80211 is probably the
> > bug Johannes was talking about which started this whole
> > discussion.
> 
> The bug case that started it all is in net/wireless/scan.c and Emmanuel
> has since changed it back to memcmp(). Not sure if that's the one you
> were referring to or not :-)

That's the one that I left alone.

Post patch:
$ git grep -n -w compare_ether_addr net
net/batman-adv/main.h:198: * note: can't use compare_ether_addr() as it requires aligned memory
net/wireless/scan.c:381:        return compare_ether_addr(a->bssid, b->bssid);



^ permalink raw reply

* Re: [PATCH 00/13] net: Add and use ether_addr_equal
From: Johannes Berg @ 2012-05-10  6:56 UTC (permalink / raw)
  To: Joe Perches
  Cc: netfilter, netdev, bridge, linux-wireless, linux-kernel,
	linux-bluetooth, coreteam, netfilter-devel, David Miller
In-Reply-To: <1336632860.22495.6.camel@joe2Laptop>

On Wed, 2012-05-09 at 23:54 -0700, Joe Perches wrote:
> On Thu, 2012-05-10 at 08:48 +0200, Johannes Berg wrote:
> > On Wed, 2012-05-09 at 21:21 -0400, David Miller wrote:
> > 
> > > That case you didn't convert in mac80211 is probably the
> > > bug Johannes was talking about which started this whole
> > > discussion.
> > 
> > The bug case that started it all is in net/wireless/scan.c and Emmanuel
> > has since changed it back to memcmp(). Not sure if that's the one you
> > were referring to or not :-)
> 
> That's the one that I left alone.
> 
> Post patch:
> $ git grep -n -w compare_ether_addr net
> net/batman-adv/main.h:198: * note: can't use compare_ether_addr() as it requires aligned memory
> net/wireless/scan.c:381:        return compare_ether_addr(a->bssid, b->bssid);

Ok, great, then that means the fix from Emmanuel won't conflict when it
gets in.

johannes

^ permalink raw reply

* Re: [PATCH 00/13] net: Add and use ether_addr_equal
From: Emmanuel Grumbach @ 2012-05-10  7:08 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Joe Perches, David Miller, netdev, bridge, netfilter-devel,
	netfilter, coreteam, linux-wireless, linux-kernel,
	linux-bluetooth
In-Reply-To: <1336633015.4334.2.camel@jlt3.sipsolutions.net>

>> >
>> > > That case you didn't convert in mac80211 is probably the
>> > > bug Johannes was talking about which started this whole
>> > > discussion.
>> >
>> > The bug case that started it all is in net/wireless/scan.c and Emmanuel
>> > has since changed it back to memcmp(). Not sure if that's the one you
>> > were referring to or not :-)
>>
>> That's the one that I left alone.
>>
>> Post patch:
>> $ git grep -n -w compare_ether_addr net
>> net/batman-adv/main.h:198: * note: can't use compare_ether_addr() as it requires aligned memory
>> net/wireless/scan.c:381:        return compare_ether_addr(a->bssid, b->bssid);
>
> Ok, great, then that means the fix from Emmanuel won't conflict when it
> gets in.
>

Thanks, Joe. As Johannes said, this won't conflict with my patch. And
yes, the code is now clearer.
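
For readers joining the thread here, the distinction under discussion can be sketched in userspace C. This is only an illustration under assumptions, not the kernel implementation: sketch_ether_addr_equal() and sketch_bssid_cmp() are hypothetical names, and both are built on plain memcmp(). The first answers only "same or different", which is all an equality helper provides. The second returns a signed three-way result, which is what a comparison used for ordering needs, and is presumably why the scan.c case went back to memcmp().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6	/* length of an Ethernet address in bytes */

/* Hypothetical userspace stand-in: true when both 6-byte addresses match. */
static bool sketch_ether_addr_equal(const unsigned char *a,
				    const unsigned char *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a, b, SKETCH_ETH_ALEN) == 0;
}

/*
 * Hypothetical three-way comparison: -1, 0 or 1 depending on byte-wise
 * ordering. An equality-only helper cannot provide this.
 */
static int sketch_bssid_cmp(const unsigned char *a, const unsigned char *b)
{
	int r;

	assert(a != NULL && b != NULL);
	r = memcmp(a, b, SKETCH_ETH_ALEN);
	return (r > 0) - (r < 0);
}
```

A caller that only tests for a match can use the first helper; a caller that sorts entries or walks a search tree needs the second.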

^ permalink raw reply

* Re: [PATCH 07/14] batman-adv: split neigh_new function into generic and batman iv specific parts
From: Sven Eckelmann @ 2012-05-10  7:34 UTC (permalink / raw)
  To: b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, lindner_marek-LWAfsSFWpa4,
	David Miller
In-Reply-To: <20120509.204111.1931607501484626500.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Wednesday, May 09, 2012 08:41:11 PM David Miller wrote:
[...]
> The namespace pollution of the batman-adv code needs to improve,
> and I'm putting my foot down starting with this change.
> 
> If you have a static function which is therefore private to a
> source file, name it whatever you want.
> 
> But once it gets exported out of that file, you have to give it
> an appropriate name.  Probably with a "batman_adv_" prefix or
> similar.

I agree, but would like to have a shorter prefix, batadv_. I know that
you said "or similar", but some developers still fear your
response to a patch that only adds the prefix batadv_ instead of the longer
version.

Could you please approve or disapprove this proposal?

Thanks,
	Sven

^ permalink raw reply

* Re: Fwd: Memory exhaust issue with only IPsec policies configured on continuous traffic
From: Nikhil Agarwal @ 2012-05-10  9:42 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linux-kernel, netdev, herbert, benjamin.thery, davem, pstaszewski
In-Reply-To: <1336627923.12504.128.camel@edumazet-glaptop>

If I disable this DST cache, the dst entries work fine. But now an XFRM
state is allocated for every incoming packet and then freed, and freeing
the xfrm state goes through the same garbage collection logic. Since
packets arrive continuously, the garbage collector may not get
scheduled, and a large amount of memory is left waiting to be freed,
putting the system into a non-recoverable state.

It seems that there should be some change to the garbage collection
scheduling logic, or some mechanism to decide whether to cache an entry.

On Thu, May 10, 2012 at 11:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-05-10 at 07:27 +0200, Eric Dumazet wrote:
>
>> Yep, we can use DST_NOCACHE
>>
>
> Please try following patch :
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 5773f5d..172c251 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2896,6 +2896,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
>        if (rt) {
>                struct dst_entry *new = &rt->dst;
>
> +               new->flags |= DST_NOCACHE;
>                new->__use = 1;
>                new->input = dst_discard;
>                new->output = dst_discard;
>
>

^ permalink raw reply

* [PATCH net-next 1/2] l2tp: fix reorder timeout recovery
From: James Chapman @ 2012-05-10  9:43 UTC (permalink / raw)
  To: netdev

When L2TP data packet reordering is enabled, packets are held in a
queue while waiting for out-of-sequence packets. If a packet gets
lost, packets will be held until the reorder timeout expires, when we
are supposed to then advance to the sequence number of the next packet
but we don't currently do so. As a result, the data channel is stuck
because we are waiting for a packet that will never arrive - all
packets age out and none are passed.

The fix is to add a flag to the session context, which is set when the
reorder timeout expires and tells the receive code to reset the next
expected sequence number to that of the next packet in the queue.

Tested in a production L2TP network with Starent and Nortel L2TP gear.

Signed-off-by: James Chapman <jchapman@katalix.com>
---
 net/l2tp/l2tp_core.c |    9 +++++++++
 net/l2tp/l2tp_core.h |    1 +
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 456b52d..d1ab3a2 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -428,6 +428,7 @@ start:
 			       session->name, L2TP_SKB_CB(skb)->ns,
 			       L2TP_SKB_CB(skb)->length, session->nr,
 			       skb_queue_len(&session->reorder_q));
+			session->reorder_skip = 1;
 			__skb_unlink(skb, &session->reorder_q);
 			kfree_skb(skb);
 			if (session->deref)
@@ -436,6 +437,14 @@ start:
 		}
 
 		if (L2TP_SKB_CB(skb)->has_seq) {
+			if (session->reorder_skip) {
+				PRINTK(session->debug, L2TP_MSG_SEQ, KERN_DEBUG,
+				       "%s: advancing nr to next pkt: %u -> %u",
+				       session->name, session->nr,
+				       L2TP_SKB_CB(skb)->ns);
+				session->reorder_skip = 0;
+				session->nr = L2TP_SKB_CB(skb)->ns;
+			}
 			if (L2TP_SKB_CB(skb)->ns != session->nr) {
 				PRINTK(session->debug, L2TP_MSG_SEQ, KERN_DEBUG,
 				       "%s: holding oos pkt %u len %d, "
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 0bf60fc..9002634 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -123,6 +123,7 @@ struct l2tp_session {
 						 * categories */
 	int			reorder_timeout; /* configured reorder timeout
 						  * (in jiffies) */
+	int			reorder_skip;	/* set if skip to next nr */
 	int			mtu;
 	int			mru;
 	enum l2tp_pwtype	pwtype;
-- 
1.7.0.4
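
The recovery logic described in this patch can be modelled outside the kernel. The sketch below is a simplified userspace model, not the l2tp_core.c code: struct sketch_session, sketch_on_timeout() and sketch_try_deliver() are hypothetical names, the reorder queue itself is omitted, and sequence-number wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced session state for illustration only. */
struct sketch_session {
	uint32_t nr;		/* next expected sequence number */
	bool reorder_skip;	/* set when a queued packet aged out */
};

/* Models the timeout path: a held packet expired and was dropped. */
static void sketch_on_timeout(struct sketch_session *s)
{
	assert(s != NULL);
	s->reorder_skip = true;
}

/*
 * Models the receive path for a packet carrying sequence number ns.
 * Returns true if the packet can be passed up now, false if it has
 * to stay in the reorder queue.
 */
static bool sketch_try_deliver(struct sketch_session *s, uint32_t ns)
{
	assert(s != NULL);
	if (s->reorder_skip) {
		/* Give up on the lost packet and resync to this one. */
		s->reorder_skip = false;
		s->nr = ns;
	}
	if (ns != s->nr)
		return false;	/* still out of sequence: hold it */
	s->nr++;
	return true;
}
```

Without the flag, a session expecting packet 5 that only ever sees 7, 8, 9 would hold and age out every packet; with it, the first packet seen after a timeout becomes the new reference point.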

^ permalink raw reply related

* [PATCH net-next 2/2] l2tp: fix data packet sequence number handling
From: James Chapman @ 2012-05-10  9:43 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1336642989-4785-1-git-send-email-jchapman@katalix.com>

If enabled, L2TP data packets have sequence numbers which a receiver
can use to drop out of sequence frames or try to reorder them. The
first frame has sequence number 0, but the L2TP code currently expects
it to be 1. This results in the first data frame being handled as out
of sequence.

This one-line patch fixes the problem.

Signed-off-by: James Chapman <jchapman@katalix.com>
---
 net/l2tp/l2tp_core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index d1ab3a2..0d6aedc 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1762,7 +1762,7 @@ struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunn
 
 		session->session_id = session_id;
 		session->peer_session_id = peer_session_id;
-		session->nr = 1;
+		session->nr = 0;
 
 		sprintf(&session->name[0], "sess %u/%u",
 			tunnel->tunnel_id, session->session_id);
-- 
1.7.0.4

^ permalink raw reply related

* Re: [PATCH] netlink: connector: implement cn_netlink_reply
From: Alban Crequy @ 2012-05-10 10:39 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: Ben Hutchings, netdev, Vincent Sanders, Javier Martinez Canillas,
	Rodrigo Moya
In-Reply-To: <20120510004553.GB8362@ioremap.net>

On Thu, 10 May 2012 04:45:53 +0400,
Evgeniy Polyakov <zbr@ioremap.net> wrote :

> On Thu, May 10, 2012 at 01:20:48AM +0100, Ben Hutchings
> (bhutchings@solarflare.com) wrote:
> > On Wed, 2012-05-09 at 15:37 +0100, Alban Crequy wrote:
> > > In a connector callback, it was not possible to reply to a
> > > message only to a sender. This patch implements
> > > cn_netlink_reply(). It uses the connector socket to send an
> > > unicast netlink message back to the sender.
> > [...]
> > 
> > We try to avoid adding functions with no users.  You'll need to
> > submit the code that's intended to use this as well.
> 
> I have no objection against this patch, but as correctly stated it is
> useless without users. Alban, what is the code you want this
> functionality to be used in? Do you plan to submit it? Can you submit
> this change in the patch with your code?

The code to use the feature is not yet ready for submission and we will
add this patch to the front of that submission in due course.

We are just being good community members and making each patch
available early. Thanks for your feedback on this patch. Please let
me know if I can add any reviewed-by.

Alban

^ permalink raw reply

* Re: [PATCH 9/9] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack.
From: Michael S. Tsirkin @ 2012-05-10 11:19 UTC (permalink / raw)
  To: Ian Campbell
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, David Miller, Eric Dumazet,
	Neil Brown, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1336056971-7839-9-git-send-email-ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>

On Thu, May 03, 2012 at 03:56:11PM +0100, Ian Campbell wrote:
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index f6d8c73..1145929 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -198,7 +198,8 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
>  	while (pglen > 0) {
>  		if (slen == size)
>  			flags = 0;
> -		result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
> +		result = kernel_sendpage(sock, *ppage, xdr->destructor,
> +					 base, size, flags);
>  		if (result > 0)
>  			len += result;
>  		if (result != size)

So I tried triggering this by simply creating an nfs export on localhost
and copying a large file out with dd, but this never seems to trigger
this code.

Any idea how to test?

-- 
MST

^ permalink raw reply

* Re: [PATCH 8/9] net: add paged frag destructor support to kernel_sendpage.
From: Michael S. Tsirkin @ 2012-05-10 11:48 UTC (permalink / raw)
  To: Ian Campbell; +Cc: netdev, David Miller, Eric Dumazet
In-Reply-To: <1336056971-7839-8-git-send-email-ian.campbell@citrix.com>

On Thu, May 03, 2012 at 03:56:10PM +0100, Ian Campbell wrote:
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 2d590ca..bee7864 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -822,8 +822,11 @@ static int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
>  	return mss_now;
>  }
>  
> -static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffset,
> -			 size_t psize, int flags)
> +static ssize_t do_tcp_sendpages(struct sock *sk,
> +				struct page **pages,
> +				struct skb_frag_destructor *destroy,
> +				int poffset,
> +				size_t psize, int flags)
>  {
>  	struct tcp_sock *tp = tcp_sk(sk);
>  	int mss_now, size_goal;
> @@ -870,7 +873,7 @@ new_segment:
>  			copy = size;
>  
>  		i = skb_shinfo(skb)->nr_frags;
> -		can_coalesce = skb_can_coalesce(skb, i, page, NULL, offset);
> +		can_coalesce = skb_can_coalesce(skb, i, page, destroy, offset);
>  		if (!can_coalesce && i >= MAX_SKB_FRAGS) {
>  			tcp_mark_push(tp, skb);
>  			goto new_segment;
> @@ -881,8 +884,9 @@ new_segment:
>  		if (can_coalesce) {
>  			skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
>  		} else {
> -			get_page(page);
>  			skb_fill_page_desc(skb, i, page, offset, copy);
> +			skb_frag_set_destructor(skb, i, destroy);
> +			skb_frag_ref(skb, i);
>  		}
>  
>  		skb->len += copy;
> @@ -937,18 +941,20 @@ out_err:
>  	return sk_stream_error(sk, flags, err);
>  }
>  
> -int tcp_sendpage(struct sock *sk, struct page *page, int offset,
> -		 size_t size, int flags)
> +int tcp_sendpage(struct sock *sk, struct page *page,
> +		 struct skb_frag_destructor *destroy,
> +		 int offset, size_t size, int flags)
>  {
>  	ssize_t res;
>  
>  	if (!(sk->sk_route_caps & NETIF_F_SG) ||
>  	    !(sk->sk_route_caps & NETIF_F_ALL_CSUM))
> -		return sock_no_sendpage(sk->sk_socket, page, offset, size,
> -					flags);
> +		return sock_no_sendpage(sk->sk_socket, page, destroy,
> +					offset, size, flags);
>  
>  	lock_sock(sk);
> -	res = do_tcp_sendpages(sk, &page, offset, size, flags);
> +	res = do_tcp_sendpages(sk, &page, destroy,
> +			       offset, size, flags);
>  	release_sock(sk);
>  	return res;
>  }


Sorry about making more noise but I realized there's
something I don't understand here.

Is it true that all this does is stick the destructor in the frag list?

If so, could this deadlock (or delay application significantly) if tcp
has queued the skb on the write queue but is not transmitting it, while
the application is waiting for pages to complete?

-- 
MST

^ permalink raw reply

* [PATCH 0/3] tcp: Validate recv queue on repair and related stuff
From: Pavel Emelyanov @ 2012-05-10 11:49 UTC (permalink / raw)
  To: David Miller, Eric Dumazet; +Cc: Linux Netdev List

As noted by Eric, no checks are performed when repairing data in the TCP
read queue. He also suggested out-lining tcp_try_rmem_schedule() for
this.

This set does both of the above; more details are in the patch comments.

Applies to net-next.

Thanks,
Pavel

^ permalink raw reply

* [PATCH 1/3] tcp: Move rcvq sending to tcp_input.c
From: Pavel Emelyanov @ 2012-05-10 11:49 UTC (permalink / raw)
  To: David Miller, Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <4FABAB39.6090201@parallels.com>

It actually works on the input queue and will use its read mem
routines, thus it's better to have it in the tcp_input.c file.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 include/net/tcp.h    |    3 +--
 net/ipv4/tcp.c       |   33 ---------------------------------
 net/ipv4/tcp_input.c |   35 ++++++++++++++++++++++++++++++++++-
 3 files changed, 35 insertions(+), 36 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 92faa6a..aaf5de9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -432,8 +432,7 @@ extern int tcp_disconnect(struct sock *sk, int flags);
 
 void tcp_connect_init(struct sock *sk);
 void tcp_finish_connect(struct sock *sk, struct sk_buff *skb);
-int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb,
-			       int hdrlen, bool *fragstolen);
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size);
 
 /* From syncookies.c */
 extern __u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 5654062..86e2cf2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -978,39 +978,6 @@ static inline int select_size(const struct sock *sk, bool sg)
 	return tmp;
 }
 
-static int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
-{
-	struct sk_buff *skb;
-	struct tcphdr *th;
-	bool fragstolen;
-
-	skb = alloc_skb(size + sizeof(*th), sk->sk_allocation);
-	if (!skb)
-		goto err;
-
-	th = (struct tcphdr *)skb_put(skb, sizeof(*th));
-	skb_reset_transport_header(skb);
-	memset(th, 0, sizeof(*th));
-
-	if (memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size))
-		goto err_free;
-
-	TCP_SKB_CB(skb)->seq = tcp_sk(sk)->rcv_nxt;
-	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(skb)->seq + size;
-	TCP_SKB_CB(skb)->ack_seq = tcp_sk(sk)->snd_una - 1;
-
-	if (tcp_queue_rcv(sk, skb, sizeof(*th), &fragstolen)) {
-		WARN_ON_ONCE(fragstolen); /* should not happen */
-		__kfree_skb(skb);
-	}
-	return size;
-
-err_free:
-	kfree_skb(skb);
-err:
-	return -ENOMEM;
-}
-
 int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		size_t size)
 {
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index eb58b94..7c6c99d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4746,7 +4746,7 @@ end:
 		skb_set_owner_r(skb, sk);
 }
 
-int tcp_queue_rcv(struct sock *sk, struct sk_buff *skb, int hdrlen,
+static int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb, int hdrlen,
 		  bool *fragstolen)
 {
 	int eaten;
@@ -4763,6 +4763,39 @@ int tcp_queue_rcv(struct sock *sk, struct sk_buff *skb, int hdrlen,
 	return eaten;
 }
 
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
+{
+	struct sk_buff *skb;
+	struct tcphdr *th;
+	bool fragstolen;
+
+	skb = alloc_skb(size + sizeof(*th), sk->sk_allocation);
+	if (!skb)
+		goto err;
+
+	th = (struct tcphdr *)skb_put(skb, sizeof(*th));
+	skb_reset_transport_header(skb);
+	memset(th, 0, sizeof(*th));
+
+	if (memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size))
+		goto err_free;
+
+	TCP_SKB_CB(skb)->seq = tcp_sk(sk)->rcv_nxt;
+	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(skb)->seq + size;
+	TCP_SKB_CB(skb)->ack_seq = tcp_sk(sk)->snd_una - 1;
+
+	if (tcp_queue_rcv(sk, skb, sizeof(*th), &fragstolen)) {
+		WARN_ON_ONCE(fragstolen); /* should not happen */
+		__kfree_skb(skb);
+	}
+	return size;
+
+err_free:
+	kfree_skb(skb);
+err:
+	return -ENOMEM;
+}
+
 static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 {
 	const struct tcphdr *th = tcp_hdr(skb);
-- 
1.5.5.6

^ permalink raw reply related

* [PATCH 2/3] tcp: Schedule rmem for rcvq repair send
From: Pavel Emelyanov @ 2012-05-10 11:50 UTC (permalink / raw)
  To: David Miller, Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <4FABAB39.6090201@parallels.com>

As noted by Eric, no checks are performed on the data size we're
putting in the read queue during repair. Thus, validate the given
data size with the common rmem management routine.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 net/ipv4/tcp_input.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 7c6c99d..164659f 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4769,6 +4769,9 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
 	struct tcphdr *th;
 	bool fragstolen;
 
+	if (tcp_try_rmem_schedule(sk, size + sizeof(*th)))
+		goto err;
+
 	skb = alloc_skb(size + sizeof(*th), sk->sk_allocation);
 	if (!skb)
 		goto err;
-- 
1.5.5.6
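
The idea of the check added here can be illustrated with a small userspace model. This is a sketch under assumptions, not the kernel routine: struct sketch_sock and both functions are hypothetical, the header length is a fixed stand-in for sizeof(struct tcphdr), and the real tcp_try_rmem_schedule() also tries to prune queues before giving up, which is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_HDR_LEN 20u	/* stand-in for sizeof(struct tcphdr) */

/* Hypothetical, reduced socket accounting for illustration only. */
struct sketch_sock {
	size_t rmem_alloc;	/* bytes already charged to the receive queue */
	size_t rcvbuf;		/* receive buffer limit */
};

/* Returns 0 if size more bytes fit under the limit, nonzero otherwise. */
static int sketch_try_rmem_schedule(const struct sketch_sock *sk, size_t size)
{
	assert(sk != NULL);
	if (sk->rmem_alloc > sk->rcvbuf)
		return -1;
	if (size > sk->rcvbuf - sk->rmem_alloc)
		return -1;
	return 0;
}

/* Validate the requested size first, and only then account for the data. */
static long sketch_send_rcvq(struct sketch_sock *sk, size_t size)
{
	if (sketch_try_rmem_schedule(sk, size + SKETCH_HDR_LEN))
		return -ENOMEM;
	sk->rmem_alloc += size + SKETCH_HDR_LEN;
	return (long)size;
}
```

With a 1000-byte limit, a first 500-byte repair write is accepted and a second one is refused, instead of the queue growing without bound.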

^ permalink raw reply related

* [PATCH 3/3] tcp: Out-line tcp_try_rmem_schedule
From: Pavel Emelyanov @ 2012-05-10 11:50 UTC (permalink / raw)
  To: David Miller, Eric Dumazet; +Cc: Linux Netdev List
In-Reply-To: <4FABAB39.6090201@parallels.com>

As proposed by Eric, make the tcp_input.o thinner.

add/remove: 1/1 grow/shrink: 1/4 up/down: 868/-1329 (-461)
function                                     old     new   delta
tcp_try_rmem_schedule                          -     864    +864
tcp_ack                                     4811    4815      +4
tcp_validate_incoming                        817     815      -2
tcp_collapse                                 860     858      -2
tcp_send_rcvq                                555     353    -202
tcp_data_queue                              3435    3033    -402
tcp_prune_queue                              721       -    -721

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
 net/ipv4/tcp_input.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 164659f..b99ada2 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4511,7 +4511,7 @@ static void tcp_ofo_queue(struct sock *sk)
 static int tcp_prune_ofo_queue(struct sock *sk);
 static int tcp_prune_queue(struct sock *sk);
 
-static inline int tcp_try_rmem_schedule(struct sock *sk, unsigned int size)
+static int tcp_try_rmem_schedule(struct sock *sk, unsigned int size)
 {
 	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf ||
 	    !sk_rmem_schedule(sk, size)) {
-- 
1.5.5.6

^ permalink raw reply related

* Re: [PATCH 1/3] tcp: Move rcvq sending to tcp_input.c
From: Eric Dumazet @ 2012-05-10 12:00 UTC (permalink / raw)
  To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
In-Reply-To: <4FABAB55.5050803@parallels.com>

On Thu, 2012-05-10 at 15:49 +0400, Pavel Emelyanov wrote:
> It actually works on the input queue and will use its read mem
> routines, thus it's better to have it in the tcp_input.c file.
> 
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
> ---
>  include/net/tcp.h    |    3 +--
>  net/ipv4/tcp.c       |   33 ---------------------------------
>  net/ipv4/tcp_input.c |   35 ++++++++++++++++++++++++++++++++++-
>  3 files changed, 35 insertions(+), 36 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH 2/3] tcp: Schedule rmem for rcvq repair send
From: Eric Dumazet @ 2012-05-10 12:00 UTC (permalink / raw)
  To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
In-Reply-To: <4FABAB69.70401@parallels.com>

On Thu, 2012-05-10 at 15:50 +0400, Pavel Emelyanov wrote:
> As noted by Eric, no checks are performed on the data size we're
> putting in the read queue during repair. Thus, validate the given
> data size with the common rmem management routine.
> 
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
> ---
>  net/ipv4/tcp_input.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH 3/3] tcp: Out-line tcp_try_rmem_schedule
From: Eric Dumazet @ 2012-05-10 12:00 UTC (permalink / raw)
  To: Pavel Emelyanov; +Cc: David Miller, Linux Netdev List
In-Reply-To: <4FABAB7C.6040306@parallels.com>

On Thu, 2012-05-10 at 15:50 +0400, Pavel Emelyanov wrote:
> As proposed by Eric, make the tcp_input.o thinner.
> 
> add/remove: 1/1 grow/shrink: 1/4 up/down: 868/-1329 (-461)
> function                                     old     new   delta
> tcp_try_rmem_schedule                          -     864    +864
> tcp_ack                                     4811    4815      +4
> tcp_validate_incoming                        817     815      -2
> tcp_collapse                                 860     858      -2
> tcp_send_rcvq                                555     353    -202
> tcp_data_queue                              3435    3033    -402
> tcp_prune_queue                              721       -    -721
> 
> Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
> ---
>  net/ipv4/tcp_input.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

Thanks !

^ permalink raw reply

* Re: [PATCH] xfrm: take iphdr size into account for esp payload size calculation
From: Steffen Klassert @ 2012-05-10 12:18 UTC (permalink / raw)
  To: Benjamin Poirier
  Cc: netdev, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel
In-Reply-To: <1336602952-10479-1-git-send-email-bpoirier@suse.de>

On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote:
> 
> According to what is done, mainly in esp_output(), net_header_len aka
> sizeof(struct iphdr) must be taken into account before doing the alignment
> calculation.

Why do you need to take the ip header into account here? Your patch breaks
pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes).

With your patch applied:

tracepath -n 192.168.1.2
 1?: [LOCALHOST]     pmtu 1442
 1:  send failed
 1:  send failed
     Resume: pmtu 1442

Without your patch:

tracepath -n 192.168.1.2
 1?: [LOCALHOST]     pmtu 1438
 1:  192.168.1.2       0.736ms reached
 1:  192.168.1.2       0.390ms reached
     Resume: pmtu 1438 hops 1 back 64 

Your patch increases the mtu by 4 bytes. Be aware that adding
one byte of payload may increase the packet size up to 16 bytes
in the case of aes, as we have to pad the encryption payload
always to a multiple of the cipher blocksize.
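
The padding effect can be checked with a few lines of arithmetic. The helper below is a hypothetical sketch, not the xfrm code: it assumes the encrypted part is the payload plus the two ESP trailer bytes (pad length and next header), rounded up to the cipher block size, and it leaves out the ESP header, IV and ICV.

```c
#include <assert.h>

/*
 * Hypothetical helper: length of the encrypted part of an ESP packet,
 * i.e. payload plus 2 trailer bytes, rounded up to a multiple of blksize.
 */
static unsigned int sketch_esp_padded_len(unsigned int payload_len,
					  unsigned int blksize)
{
	unsigned int clen;

	assert(blksize > 0);
	clen = payload_len + 2;
	return (clen + blksize - 1) / blksize * blksize;
}
```

With a 16-byte block size, a 14-byte payload fills exactly one block (16 bytes), while a 15-byte payload needs two (32 bytes): one extra payload byte costs 16 bytes on the wire.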

> -
> -	switch (x->props.mode) {
> -	case XFRM_MODE_TUNNEL:
> -		break;
> -	default:
> -	case XFRM_MODE_TRANSPORT:
> -		/* The worst case */
> -		mtu -= blksize - 4;
> -		mtu += min_t(u32, blksize - 4, rem);
> -		break;

Btw, why are we doing the calculation above for transport mode?

^ permalink raw reply

* [PATCH 0/2 net] 6lowpan fixes
From: alex.bluesman.smirnov @ 2012-05-10 13:22 UTC (permalink / raw)
  To: davem; +Cc: netdev

Hi David,

this patch set contains 2 fixes for the 6lowpan module. Please find a detailed
description in the patch headers.

With best regards,
Alex

8<--

The following changes since commit 4b31b26441fb3c5f1e61ee13832358c09f8ca12d:

  6lowpan: duplicate definition of IEEE802154_ALEN (2012-04-26 13:02:33 +0400)

are available in the git repository at:
  http://github.com/linux-wsn/kernel.git 6lowpan_dev

Alexander Smirnov (2):
      6lowpan: add missing pskb_may_pull() check
      6lowpan: fix hop limit compression

 net/ieee802154/6lowpan.c |    3 +++

From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Subject: [PATCH 0/2 net] 6lowpan fixes

^ permalink raw reply

* [PATCH 1/2 net] 6lowpan: add missing pskb_may_pull() check
From: alex.bluesman.smirnov @ 2012-05-10 13:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, Alexander Smirnov
In-Reply-To: <1336656163-19382-1-git-send-email-y>

From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>

Add pskb_may_pull() call when fetching u8 from skb.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
---
 net/ieee802154/6lowpan.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index 32eb417..0ab3efe 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -295,6 +295,8 @@ static u8 lowpan_fetch_skb_u8(struct sk_buff *skb)
 {
 	u8 ret;
 
+	BUG_ON(!pskb_may_pull(skb, 1));
+
 	ret = skb->data[0];
 	skb_pull(skb, 1);
 
-- 
1.7.2.3
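
The point of the check is that a byte must be known to be present before it is read and consumed. A userspace sketch of the same pattern follows. It is an illustration only: struct sketch_buf and sketch_fetch_u8() are hypothetical, and unlike the patch, which uses BUG_ON(), this version reports the short buffer to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced buffer: a cursor plus the bytes left after it. */
struct sketch_buf {
	const uint8_t *data;
	size_t len;
};

/*
 * Reads one byte and advances the cursor. Returns 0 on success, or -1
 * when no byte is left, in which case the buffer is not touched.
 */
static int sketch_fetch_u8(struct sketch_buf *b, uint8_t *out)
{
	assert(b != NULL && out != NULL);
	if (b->len < 1)
		return -1;	/* nothing to pull: refuse to read */
	*out = b->data[0];
	b->data += 1;
	b->len -= 1;
	return 0;
}
```

Returning an error instead of asserting is a design choice of this sketch, so that a truncated frame is rejected rather than treated as fatal.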

^ permalink raw reply related

* [PATCH 2/2 net] 6lowpan: fix hop limit compression
From: alex.bluesman.smirnov @ 2012-05-10 13:22 UTC (permalink / raw)
  To: davem; +Cc: netdev, Alexander Smirnov, Tony Cheneau
In-Reply-To: <1336656163-19382-1-git-send-email-y>

From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>

Add missing pointer shift for the 'default' case.

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
Cc: Tony Cheneau <tony.cheneau+zigbeedev@amnesiak.org>
---
 net/ieee802154/6lowpan.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index 0ab3efe..86f0013 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -492,6 +492,7 @@ static int lowpan_header_create(struct sk_buff *skb,
 		break;
 	default:
 		*hc06_ptr = hdr->hop_limit;
+		hc06_ptr += 1;
 		break;
 	}
 
-- 
1.7.2.3
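
Why the missing increment matters can be seen in a small model of the header writer. The function below is a hypothetical sketch, not lowpan_header_create(): the 2-bit codes follow the hop limit encoding of RFC 6282 as I understand it (1, 64 and 255 are compressed, anything else is carried inline), and the kernel's own macro names are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: encode hop_limit, storing the 2-bit code in
 * *hlim_bits. Returns the write position for the next header field.
 */
static uint8_t *sketch_put_hop_limit(uint8_t *ptr, uint8_t hop_limit,
				     uint8_t *hlim_bits)
{
	assert(ptr != NULL && hlim_bits != NULL);
	switch (hop_limit) {
	case 1:
		*hlim_bits = 1;
		break;
	case 64:
		*hlim_bits = 2;
		break;
	case 255:
		*hlim_bits = 3;
		break;
	default:
		*hlim_bits = 0;
		*ptr = hop_limit;	/* carried inline ... */
		ptr += 1;		/* ... so step past it */
		break;
	}
	return ptr;
}
```

If the pointer is not advanced in the inline case, the next field written to the buffer lands on top of the hop limit byte.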

^ permalink raw reply related

* Re: [PATCH 9/9] sunrpc: use SKB fragment destructors to delay completion until page is released by network stack.
From: Ian Campbell @ 2012-05-10 13:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, David Miller,
	Eric Dumazet, Neil Brown, J. Bruce Fields,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <20120510111948.GA9609-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

On Thu, 2012-05-10 at 12:19 +0100, Michael S. Tsirkin wrote:
> On Thu, May 03, 2012 at 03:56:11PM +0100, Ian Campbell wrote:
> > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> > index f6d8c73..1145929 100644
> > --- a/net/sunrpc/svcsock.c
> > +++ b/net/sunrpc/svcsock.c
> > @@ -198,7 +198,8 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
> >  	while (pglen > 0) {
> >  		if (slen == size)
> >  			flags = 0;
> > -		result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
> > +		result = kernel_sendpage(sock, *ppage, xdr->destructor,
> > +					 base, size, flags);
> >  		if (result > 0)
> >  			len += result;
> >  		if (result != size)
> 
> So I tried triggering this by simply creating an nfs export on localhost
> and copying a large file out with dd, but this never seems to trigger
> this code.
> 
> Any idea how to test?

My test code, which is a bit overly complex for this because it also
tries to demonstrate corruption on the wire, is attached.

Using dd I suspect you probably need to increase the block size, and
possibly enable O_DIRECT (iflag=direct / oflag=direct?)

My typical scenario has been to mount a remote NFS and run
tcpdump -s 4096 -x -ne -v -i eth0 'host $client and ip[184:4] == 0x55555555'
to watch for on-wire corruption.

FYI, I've triggered a BUG_ON in my local debug patches with your series
applied. I'm investigating whether it's my debugging or something in
the series that causes it.

After that I'll try it with local NFS and VMs with bridging etc to test
the extra aspects which your series is exercising.

Ian.

[-- Attachment #2: blktest3.c --]
[-- Type: text/x-csrc, Size: 1090 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

#include <fcntl.h>
#include <unistd.h>

#include <err.h>
#include <errno.h>

#define NR 256
#define SIZE 4096

int main(int argc, char **argv)
{
	int fd;
	ssize_t rc;
	unsigned int iter = 0;
	const char *path;
	static unsigned char  __attribute__ ((aligned (4096))) buf[NR][SIZE];

	if (argc != 2) {
		fprintf(stderr, "usage: blktest3 PATH\n");
		exit(1);
	}
	path = argv[1];

	/* O_DIRECT is left disabled here; enabling it needs _GNU_SOURCE on glibc. */
	printf("opening %s\n", path);
	fd = open(path, O_CREAT/*|O_DIRECT*/|O_RDWR, 0666);
	if (fd < 0)
		err(1, "unable to open file");

	while (1) {
		if ((iter%10)==0)
			printf("iteration %u ...", iter);
		/* Cycle through NR page-sized slots of the file. */
		if (lseek(fd, (off_t)(iter%NR)*SIZE, SEEK_SET) < 0)
			err(1, "seek for write %u", iter);
		memset(buf[iter%NR], 0xaa, SIZE);
		rc = write(fd, buf[iter%NR], SIZE);
		/*
		 * Scribble over the buffer as soon as write() returns: if the
		 * network stack still references the page, 0x55 shows up on
		 * the wire instead of 0xaa.
		 */
		memset(buf[iter%NR], 0x55, SIZE);
		if (rc == -1)
			err(1, "write failed");
		else if (rc != SIZE)
			errx(1, "only wrote %zd/%d bytes", rc, SIZE);
		if ((iter%10)==0)
			printf("\n");

		iter++;
	}
}

^ permalink raw reply

* [PATCH net-next] 6lowpan: IPv6 link local address
From: Alexander Smirnov @ 2012-05-10 13:25 UTC (permalink / raw)
  To: davem; +Cc: netdev, Alexander Smirnov

According to RFC 4944 (Transmission of IPv6 Packets over
IEEE 802.15.4 Networks), section 7:

The IPv6 link-local address [RFC4291] for an IEEE 802.15.4 interface
is formed by appending the Interface Identifier, as defined above, to
the prefix FE80::/64.

  10 bits            54 bits                  64 bits
+----------+-----------------------+----------------------------+
|1111111010|         (zeros)       |    Interface Identifier    |
+----------+-----------------------+----------------------------+

This patch adds IPv6 link-local address generation support for
6lowpan interfaces.
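
As an illustration of the layout in the diagram above, here is a minimal
userspace sketch, not the kernel code from this patch. The function name
make_link_local() and the IID_LEN constant are hypothetical, and the
Interface Identifier is taken as an already-formed 8-byte input; how it
is derived from the device address is outside the sketch.

```c
#include <stdint.h>
#include <string.h>

#define IID_LEN 8	/* 64-bit Interface Identifier (hypothetical name) */

/*
 * Build a 16-byte IPv6 link-local address: the fe80::/64 prefix
 * (10 bits 1111111010, then 54 zero bits) followed by the identifier.
 */
void make_link_local(uint8_t addr[16], const uint8_t iid[IID_LEN])
{
	memset(addr, 0, 16);
	addr[0] = 0xfe;			/* 1111 1110 */
	addr[1] = 0x80;			/* 10, then the zero bits begin */
	memcpy(addr + 8, iid, IID_LEN);	/* low 64 bits: identifier */
}
```

Bytes 2 through 7 stay zero, so an identifier of 00:11:22:33:44:55:66:77
yields fe80::11:2233:4455:6677.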

Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
---
 net/ipv6/addrconf.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index e3b3421..8b7f100 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -66,6 +66,7 @@
 #include <net/sock.h>
 #include <net/snmp.h>
 
+#include <net/af_ieee802154.h>
 #include <net/ipv6.h>
 #include <net/protocol.h>
 #include <net/ndisc.h>
@@ -1514,6 +1515,14 @@ static int addrconf_ifid_eui48(u8 *eui, struct net_device *dev)
 	return 0;
 }
 
+static int addrconf_ifid_eui64(u8 *eui, struct net_device *dev)
+{
+	if (dev->addr_len != IEEE802154_ADDR_LEN)
+		return -1;
+	memcpy(eui, dev->dev_addr, 8);
+	return 0;
+}
+
 static int addrconf_ifid_arcnet(u8 *eui, struct net_device *dev)
 {
 	/* XXX: inherit EUI-64 from other interface -- yoshfuji */
@@ -1577,6 +1586,8 @@ static int ipv6_generate_eui64(u8 *eui, struct net_device *dev)
 		return addrconf_ifid_sit(eui, dev);
 	case ARPHRD_IPGRE:
 		return addrconf_ifid_gre(eui, dev);
+	case ARPHRD_IEEE802154:
+		return addrconf_ifid_eui64(eui, dev);
 	}
 	return -1;
 }
@@ -2438,7 +2449,8 @@ static void addrconf_dev_config(struct net_device *dev)
 	    (dev->type != ARPHRD_FDDI) &&
 	    (dev->type != ARPHRD_IEEE802_TR) &&
 	    (dev->type != ARPHRD_ARCNET) &&
-	    (dev->type != ARPHRD_INFINIBAND)) {
+	    (dev->type != ARPHRD_INFINIBAND) &&
+	    (dev->type != ARPHRD_IEEE802154)) {
 		/* Alas, we support only Ethernet autoconfiguration. */
 		return;
 	}
-- 
1.7.2.3

^ permalink raw reply related

* Re: [PATCH 1/2 net] 6lowpan: add missing pskb_may_pull() check
From: Eric Dumazet @ 2012-05-10 13:30 UTC (permalink / raw)
  To: alex.bluesman.smirnov; +Cc: davem, netdev
In-Reply-To: <4fabc166.9208cc0a.4de8.ffff9211@mx.google.com>

On Thu, 2012-05-10 at 17:22 +0400, alex.bluesman.smirnov@gmail.com
wrote:
> From: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
> 
> Add pskb_may_pull() call when fetching u8 from skb.
> 
> Signed-off-by: Alexander Smirnov <alex.bluesman.smirnov@gmail.com>
> ---
>  net/ieee802154/6lowpan.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
> index 32eb417..0ab3efe 100644
> --- a/net/ieee802154/6lowpan.c
> +++ b/net/ieee802154/6lowpan.c
> @@ -295,6 +295,8 @@ static u8 lowpan_fetch_skb_u8(struct sk_buff *skb)
>  {
>  	u8 ret;
>  
> +	BUG_ON(!pskb_may_pull(skb, 1));
> +
>  	ret = skb->data[0];
>  	skb_pull(skb, 1);
>  

No, you can't do that.

pskb_may_pull() can fail, and then BUG_ON() crashes the machine instead of
reporting the error gracefully.
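
The alternative to BUG_ON() can be sketched in userspace as follows. This
is only a model under stated assumptions: struct mock_buf, mock_may_pull()
and mock_fetch_u8() are hypothetical stand-ins, the length check models
only the "enough data?" part of pskb_may_pull(), and -EINVAL is an assumed
error code, not necessarily what a real fix would return.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sk_buff: a cursor over a byte buffer. */
struct mock_buf {
	const uint8_t *data;
	size_t len;
};

/* Models the role of pskb_may_pull(): are at least n bytes available? */
int mock_may_pull(const struct mock_buf *b, size_t n)
{
	return b->len >= n;
}

/*
 * Fetch one byte and advance the cursor. On a short buffer, return
 * -EINVAL and leave *val untouched so the caller can drop the packet
 * instead of crashing.
 */
int mock_fetch_u8(struct mock_buf *b, uint8_t *val)
{
	if (!mock_may_pull(b, 1))
		return -EINVAL;

	*val = b->data[0];
	b->data += 1;
	b->len -= 1;
	return 0;
}
```

The design point is that the fetch helper returns a status and passes the
byte back through a pointer, so every caller has to decide what to do with
a truncated packet.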

^ permalink raw reply
