Netdev List
* Re: [TCP] IPV6 : Change a divide into a right shift in tcp_v6_send_ack()
From: Eric Dumazet @ 2007-12-21  7:39 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki / 吉藤英明; +Cc: davem, netdev
In-Reply-To: <20071221.162833.82587283.yoshfuji@linux-ipv6.org>

YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <476B65F8.10201@cosmosbay.com> (at Fri, 21 Dec 2007 08:06:32 +0100), Eric Dumazet <dada1@cosmosbay.com> says:
> 
>> YOSHIFUJI Hideaki / 吉藤英明 wrote:
>>> In article <476B574E.80601@cosmosbay.com> (at Fri, 21 Dec 2007 07:03:58 +0100), Eric Dumazet <dada1@cosmosbay.com> says:
>>>
>>>> Because tot_len is signed in tcp_v6_send_ack(), tot_len/4 forces compiler
>>>> to emit an integer divide, while we can help it to use a right shift,
>>>> less expensive.
>>> Are you really sure?
>>> At least, gcc-4.1.2-20061115 (debian) does not make any difference.
>>>
>>> And, IMHO, because shifting a signed variable is fragile, we should
>>> avoid using it.
>>>
>> Yes I am sure, but maybe you are on x86_64 ?
>>
>> gcc-4.2.2 on x86
> 
> I'm on gcc-4.1.2 20061115 (prerelease) (Debian 4.1.1-21), on x86 (i686).
> Maybe compiler difference?!
> 
>> If you think tot_len can be negative, I understand you can be against this
>> patch. But I am sure it's always > 0, even if I am a total ipv6 newbie :)
> 
> Okay, anyway, I'll convert them to unsigned int, which is more
> appropriate.

I didn't choose this path, because David was against changing some fields from
'int' to 'unsigned'. If you look in other parts of networking, we have many
'>> 1' or '>> 2' already there.



^ permalink raw reply

* Re: [TCP] IPV6 : Change a divide into a right shift in tcp_v6_send_ack()
From: Ilpo Järvinen @ 2007-12-21  7:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S. Miller,
	YOSHIFUJI Hideaki / 吉藤英明,
	Linux Netdev List
In-Reply-To: <476B574E.80601@cosmosbay.com>

[-- Attachment #1: Type: text/plain, Size: 320 bytes --]

On Fri, 21 Dec 2007, Eric Dumazet wrote:

> Because tot_len is signed in tcp_v6_send_ack(), tot_len/4 forces compiler
> to emit an integer divide, while we can help it to use a right shift,
> less expensive.

Can't you just change tot_len to unsigned here? It's just sizeof and some 
positive constants added...
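
A minimal sketch of the unsigned variant suggested here, assuming a
stand-alone helper (doff_words_unsigned is a hypothetical name, not kernel
code). For unsigned operands C defines x / 4 and x >> 2 to give the same
value, so the source can keep the division and the compiler is still free
to pick the shift:

```c
#include <assert.h>

/* Hypothetical stand-in for the doff computation; not the kernel function. */
static unsigned int doff_words_unsigned(unsigned int tot_len)
{
	/* With an unsigned operand, divide-by-4 and shift-by-2 always agree. */
	assert((tot_len / 4) == (tot_len >> 2));

	/* Header length in bytes -> length in 32-bit words. */
	return tot_len / 4;
}
```

A bare 20-byte TCP header gives 5 words; 20 bytes plus a 12-byte aligned
timestamp option gives 8.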

-- 
 i.

[-- Attachment #2: Type: text/plain, Size: 415 bytes --]

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0268e11..92f0fda 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1124,7 +1124,7 @@ static void tcp_v6_send_ack(struct tcp_timewait_sock *tw,
 	memset(t1, 0, sizeof(*t1));
 	t1->dest = th->source;
 	t1->source = th->dest;
-	t1->doff = tot_len/4;
+	t1->doff = tot_len >> 2;
 	t1->seq = htonl(seq);
 	t1->ack_seq = htonl(ack);
 	t1->ack = 1;

^ permalink raw reply related

* Re: [TCP] IPV6 : Change a divide into a right shift in tcp_v6_send_ack()
From: Eric Dumazet @ 2007-12-21  7:06 UTC (permalink / raw)
  To: YOSHIFUJI Hideaki / 吉藤英明; +Cc: davem, netdev
In-Reply-To: <20071221.155030.131184865.yoshfuji@linux-ipv6.org>

YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <476B574E.80601@cosmosbay.com> (at Fri, 21 Dec 2007 07:03:58 +0100), Eric Dumazet <dada1@cosmosbay.com> says:
> 
>> Because tot_len is signed in tcp_v6_send_ack(), tot_len/4 forces compiler
>> to emit an integer divide, while we can help it to use a right shift,
>> less expensive.
> 
> Are you really sure?
> At least, gcc-4.1.2-20061115 (debian) does not make any difference.
> 
> And, IMHO, because shifting a signed variable is fragile, we should
> avoid using it.
> 

Yes I am sure, but maybe you are on x86_64 ?

gcc-4.2.2 on x86

# objdump --disassemble net/ipv6/tcp_ipv6.o|grep -6 idiv
       b2:       66 8b 42 02             mov    0x2(%edx),%ax
       b6:       ba 04 00 00 00          mov    $0x4,%edx
       bb:       89 d7                   mov    %edx,%edi
       bd:       66 89 45 00             mov    %ax,0x0(%ebp)
       c1:       89 d8                   mov    %ebx,%eax
       c3:       99                      cltd
       c4:       f7 ff                   idiv   %edi
       c6:       88 c2                   mov    %al,%dl
       c8:       8a 45 0c                mov    0xc(%ebp),%al
       cb:       c1 e2 04                shl    $0x4,%edx
       ce:       83 e0 0f                and    $0xf,%eax
       d1:       09 d0                   or     %edx,%eax
       d3:       88 45 0c                mov    %al,0xc(%ebp)


If you think tot_len can be negative, I understand you can be against this
patch. But I am sure it's always > 0, even if I am a total ipv6 newbie :)
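
For a signed operand, / 4 must round toward zero, while an arithmetic right
shift rounds toward negative infinity, so a compiler that wants to avoid
idiv has to add a bias to negative inputs first. A rough sketch of that
fix-up, assuming >> on a negative int is an arithmetic shift (this is
implementation-defined in C; GCC documents sign extension).
div4_with_fixup is a hypothetical name:

```c
#include <assert.h>

/* Hypothetical helper: computes x / 4 for signed x without a divide. */
static int div4_with_fixup(int x)
{
	/* The sketch assumes >> on a negative int is an arithmetic shift. */
	assert((-1 >> 1) == -1);

	/* Add (divisor - 1) to negative inputs so the shift rounds toward zero. */
	int bias = (x < 0) ? 3 : 0;

	return (x + bias) >> 2;
}
```

When the operand is known to be non-negative the bias is always zero, which
is why a plain shift is enough in that case.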

Thank you

^ permalink raw reply

* Re: [TCP] IPV6 : Change a divide into a right shift in tcp_v6_send_ack()
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2007-12-21  6:50 UTC (permalink / raw)
  To: dada1; +Cc: davem, netdev, yoshfuji
In-Reply-To: <476B574E.80601@cosmosbay.com>

In article <476B574E.80601@cosmosbay.com> (at Fri, 21 Dec 2007 07:03:58 +0100), Eric Dumazet <dada1@cosmosbay.com> says:

> Because tot_len is signed in tcp_v6_send_ack(), tot_len/4 forces compiler
> to emit an integer divide, while we can help it to use a right shift,
> less expensive.

Are you really sure?
At least, gcc-4.1.2-20061115 (debian) does not make any difference.

And, IMHO, because shifting a signed variable is fragile, we should
avoid using it.
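
To make the concern concrete, here is a sketch with hypothetical helper
names. C99 defines signed division as truncating toward zero, whereas the
result of >> on a negative value is implementation-defined, so the shift is
only a safe replacement while the operand is known to be non-negative:

```c
#include <assert.h>

/* Signed divide: C99 truncates toward zero, so -7 / 4 is -1. */
static int quarter_by_divide(int len)
{
	return len / 4;
}

/* Shift form: guarded, because >> on a negative int is implementation-defined. */
static int quarter_by_shift(int len)
{
	assert(len >= 0);
	return len >> 2;
}
```

On compilers that implement >> as an arithmetic shift, -7 >> 2 evaluates
to -2 rather than -1, which is the mismatch at issue.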

--yoshfuji

^ permalink raw reply

* [SOCK] Avoid integer divides where not necessary in include/net/sock.h
From: Eric Dumazet @ 2007-12-21  6:18 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 368 bytes --]

Because sk_wmem_queued, sk_sndbuf are signed, a divide per two
forces compiler to use an integer divide. We can instead use
a right shift.

SK_STREAM_MEM_QUANTUM deserves to be declared as an unsigned
quantity, so that sk_stream_pages() and __sk_stream_mem_reclaim()
can use right shifts instead of integer divides.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>


[-- Attachment #2: sock_h.patch --]
[-- Type: text/plain, Size: 1480 bytes --]

diff --git a/include/net/sock.h b/include/net/sock.h
index 803d8f2..6da08fc 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -445,7 +445,7 @@ static inline int sk_acceptq_is_full(struct sock *sk)
  */
 static inline int sk_stream_min_wspace(struct sock *sk)
 {
-	return sk->sk_wmem_queued / 2;
+	return sk->sk_wmem_queued >> 1;
 }
 
 static inline int sk_stream_wspace(struct sock *sk)
@@ -715,7 +715,7 @@ static inline struct inode *SOCK_INODE(struct socket *socket)
 extern void __sk_stream_mem_reclaim(struct sock *sk);
 extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind);
 
-#define SK_STREAM_MEM_QUANTUM ((int)PAGE_SIZE)
+#define SK_STREAM_MEM_QUANTUM ((unsigned int)PAGE_SIZE)
 
 static inline int sk_stream_pages(int amt)
 {
@@ -1187,7 +1187,7 @@ static inline void sk_wake_async(struct sock *sk, int how, int band)
 static inline void sk_stream_moderate_sndbuf(struct sock *sk)
 {
 	if (!(sk->sk_userlocks & SOCK_SNDBUF_LOCK)) {
-		sk->sk_sndbuf = min(sk->sk_sndbuf, sk->sk_wmem_queued / 2);
+		sk->sk_sndbuf = min(sk->sk_sndbuf, sk->sk_wmem_queued >> 1);
 		sk->sk_sndbuf = max(sk->sk_sndbuf, SOCK_MIN_SNDBUF);
 	}
 }
@@ -1211,7 +1211,7 @@ static inline struct page *sk_stream_alloc_page(struct sock *sk)
  */
 static inline int sock_writeable(const struct sock *sk) 
 {
-	return atomic_read(&sk->sk_wmem_alloc) < (sk->sk_sndbuf / 2);
+	return atomic_read(&sk->sk_wmem_alloc) < (sk->sk_sndbuf >> 1);
 }
 
 static inline gfp_t gfp_any(void)
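
To illustrate why the type of SK_STREAM_MEM_QUANTUM matters, here is a
sketch that assumes sk_stream_pages() rounds a byte count up to whole
quanta and assumes a 4096-byte page. SKETCH_PAGE_SIZE, SKETCH_MEM_QUANTUM
and sketch_stream_pages are hypothetical stand-ins. With an unsigned
power-of-two divisor, the usual arithmetic conversions make the whole
division unsigned, which a compiler can lower to a shift:

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE   4096UL	/* assumption: 4 KiB pages */
#define SKETCH_MEM_QUANTUM ((unsigned int)SKETCH_PAGE_SIZE)

/* Hypothetical model of a "bytes -> quanta, rounded up" helper. */
static int sketch_stream_pages(int amt)
{
	/* The unsigned arithmetic below is only meaningful for amt >= 0. */
	assert(amt >= 0);

	/* int + unsigned int is computed as unsigned, so this is an unsigned divide. */
	return (amt + SKETCH_MEM_QUANTUM - 1) / SKETCH_MEM_QUANTUM;
}
```

The side effect worth noting is that a negative amt would now be converted
to a large unsigned value, so the change is only equivalent for
non-negative amounts.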

^ permalink raw reply related

* Re: [PATCH] PS3: gelic: Add wireless support for PS3
From: Masakazu Mokuno @ 2007-12-21  6:17 UTC (permalink / raw)
  To: Jouni Malinen
  Cc: Dan Williams, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	geoffrey.levand-mEdOJwZ7QcZBDgjK7y7TUQ, Geert Uytterhoeven
In-Reply-To: <20071215014244.GI5698-mgr6C1c9aYeHXe+LvDLADg@public.gmane.org>

	Hi Jouni,

On Fri, 14 Dec 2007 17:42:44 -0800
Jouni Malinen <j@w1.fi> wrote:

> However, there is a part that you are not going to like.. This is likely
> using a private ioctl for some parts of the association requests, i.e.,
> no -Dwext.. I would assume that this could be cleaned up, though, if
> WEXT would be extended a bit to allow one more enc_capa to notify
> whether the driver wants to take care of 4-way handshake and to allow
> the PSK to be configured with a new key type.

Yes, as the current WEXT did not have this feature, I could not but
design the new PS3 driver in that way.

> It would be interesting to see whether the driver/firmware/hypervisor
> could be convinced to allow EAPOL frames to go through between
> association and 4-way handshake (which would be completed by
> driver/firmware). This is the way I can support WPA/WPA2-Enterprise with
> OSX..

I had asked the hypervisor guy. Unfortunately all EAPOL frames would be
dropped by the wireless chip firmware.

-- 
Masakazu MOKUNO

^ permalink raw reply

* Re: After many hours all outbound connections get stuck in SYN_SENT
From: Jan Engelhardt @ 2007-12-21  6:06 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: James Nichols, Glen Turner, Eric Dumazet, LKML, Linux Netdev List
In-Reply-To: <Pine.LNX.4.64.0712202247410.20288@kivilampi-30.cs.helsinki.fi>


On Dec 20 2007 23:05, Ilpo Järvinen wrote:
>> 
>> Given the fact that I've had this problem for so long, over a variety
>> of networking hardware vendors and colo-facilities, this really sounds
>> good to me.  It will be challenging for me to justify a kernel core
>> dump, but a simple patch to dump the Sack data would be do-able.
>
>If your symptoms really are: SYNs leaving (if they show up in tcpdump, for 
>sure they've left TCP code already) and SYN-ACK not showing up even in 
>something as early as in tcpdump (for sure TCP side code didn't execute at 
>that point yet), there's very little chance that Linux' TCP code has some 
>bug in it, only things that do something in such scenario are the SYN 
>generation and retransmitting SYNs (and those are trivially verifiable 
>from tcpdump).
>
Take a machine, put two interfaces in it, configure as bridge (br0
over eth0 and eth1 without any assigned ip addresses), put it between
end node and the cisco. tcpdump there, which should give an unbiased
view wrt. endnode/cisco. Then perhaps, also configure such a network
listening bridge on the other side of the cisco, e.g. on the link to
the internet and watch that. Compare the two tcpdumps and see if
sack got trashed.

^ permalink raw reply

* [TCP] IPV6 : Change a divide into a right shift in tcp_v6_send_ack()
From: Eric Dumazet @ 2007-12-21  6:03 UTC (permalink / raw)
  To: David S. Miller,
	YOSHIFUJI Hideaki / 吉藤英明
  Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 211 bytes --]

Because tot_len is signed in tcp_v6_send_ack(), tot_len/4 forces compiler
to emit an integer divide, while we can help it to use a right shift,
less expensive.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

[-- Attachment #2: tcp_ipv6.patch --]
[-- Type: text/plain, Size: 415 bytes --]

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0268e11..92f0fda 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1124,7 +1124,7 @@ static void tcp_v6_send_ack(struct tcp_timewait_sock *tw,
 	memset(t1, 0, sizeof(*t1));
 	t1->dest = th->source;
 	t1->source = th->dest;
-	t1->doff = tot_len/4;
+	t1->doff = tot_len >> 2;
 	t1->seq = htonl(seq);
 	t1->ack_seq = htonl(ack);
 	t1->ack = 1;

^ permalink raw reply related

* [TCP] tcp_write_timeout.c cleanup
From: Eric Dumazet @ 2007-12-21  5:56 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 248 bytes --]

Before submitting a patch to change a divide to a right shift, I felt it
necessary to create a helper function tcp_mtu_probing() to reduce the length
of lines exceeding 100 chars in tcp_write_timeout().

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>


[-- Attachment #2: tcp_timer_cleanup.patch --]
[-- Type: text/plain, Size: 1838 bytes --]

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index d8970ec..8f14808 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -114,13 +114,31 @@ static int tcp_orphan_retries(struct sock *sk, int alive)
 	return retries;
 }
 
+static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk)
+{
+	int mss;
+
+	/* Black hole detection */
+	if (sysctl_tcp_mtu_probing) {
+		if (!icsk->icsk_mtup.enabled) {
+			icsk->icsk_mtup.enabled = 1;
+			tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
+		} else {
+			struct tcp_sock *tp = tcp_sk(sk);
+			mss = tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low)/2;
+			mss = min(sysctl_tcp_base_mss, mss);
+			mss = max(mss, 68 - tp->tcp_header_len);
+			icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
+			tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
+		}
+	}
+}
+
 /* A write timeout has occurred. Process the after effects. */
 static int tcp_write_timeout(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
-	struct tcp_sock *tp = tcp_sk(sk);
 	int retry_until;
-	int mss;
 
 	if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
 		if (icsk->icsk_retransmits)
@@ -129,18 +147,7 @@ static int tcp_write_timeout(struct sock *sk)
 	} else {
 		if (icsk->icsk_retransmits >= sysctl_tcp_retries1) {
 			/* Black hole detection */
-			if (sysctl_tcp_mtu_probing) {
-				if (!icsk->icsk_mtup.enabled) {
-					icsk->icsk_mtup.enabled = 1;
-					tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
-				} else {
-					mss = min(sysctl_tcp_base_mss,
-						  tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_low)/2);
-					mss = max(mss, 68 - tp->tcp_header_len);
-					icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
-					tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
-				}
-			}
+			tcp_mtu_probing(icsk, sk);
 
 			dst_negative_advice(&sk->sk_dst_cache);
 		}
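
The clamping that tcp_mtu_probing() applies in the patch above can be tried
in isolation. A minimal user-space sketch, where base_mss and header_len
are plain parameters standing in for sysctl_tcp_base_mss and
tp->tcp_header_len, and sketch_probe_mss, sketch_min and sketch_max are
hypothetical names:

```c
#include <assert.h>

static int sketch_min(int a, int b) { return a < b ? a : b; }
static int sketch_max(int a, int b) { return a > b ? a : b; }

/* Hypothetical model of the "halve, cap, floor" step on the probing MSS. */
static int sketch_probe_mss(int current_mss, int base_mss, int header_len)
{
	int mss;

	assert(current_mss >= 0);

	mss = current_mss / 2;			/* halve the current low bound */
	mss = sketch_min(base_mss, mss);	/* never above the base MSS */
	mss = sketch_max(mss, 68 - header_len);	/* never below 68 bytes minus header */
	return mss;
}
```

With a 20-byte header and a base MSS of 512, an MSS of 1460 drops to 512,
600 drops to 300, and 80 is held at the floor of 48.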

^ permalink raw reply related

* Re: [TCP] Avoid two divides in tcp_output.c
From: David Miller @ 2007-12-21  5:40 UTC (permalink / raw)
  To: dada1; +Cc: netdev
In-Reply-To: <476B5064.5010302@cosmosbay.com>

From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 21 Dec 2007 06:34:28 +0100

> Because 'free_space' variable in __tcp_select_window() is signed,
> expression (free_space / 2) forces compiler to emit an integer divide.
> 
> This can be changed to a plain right shift, less expensive.
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

Applied to net-2.6.25, thanks Eric.

^ permalink raw reply

* [TCP] Avoid two divides in tcp_output.c
From: Eric Dumazet @ 2007-12-21  5:34 UTC (permalink / raw)
  To: David S. Miller; +Cc: Linux Netdev List

[-- Attachment #1: Type: text/plain, Size: 249 bytes --]

Because 'free_space' variable in __tcp_select_window() is signed,
expression (free_space / 2) forces compiler to emit an integer divide.

This can be changed to a plain right shift, less expensive.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

[-- Attachment #2: tcp_output.patch --]
[-- Type: text/plain, Size: 696 bytes --]

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 7c50271..9a9510a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1627,7 +1627,7 @@ u32 __tcp_select_window(struct sock *sk)
 	if (mss > full_space)
 		mss = full_space;
 
-	if (free_space < full_space/2) {
+	if (free_space < (full_space >> 1)) {
 		icsk->icsk_ack.quick = 0;
 
 		if (tcp_memory_pressure)
@@ -1666,7 +1666,7 @@ u32 __tcp_select_window(struct sock *sk)
 		if (window <= free_space - mss || window > free_space)
 			window = (free_space/mss)*mss;
 		else if (mss == full_space &&
-			 free_space > window + full_space/2)
+			 free_space > window + (full_space >> 1))
 			window = free_space;
 	}
 

^ permalink raw reply related

* Re: (usagi-core 34097) Re: [PATCH] [XFRM] IPv6: Fix dst/routing check at transformation.
From: Masahide NAKAMURA @ 2007-12-21  5:09 UTC (permalink / raw)
  To: David Miller; +Cc: usagi-core, herbert, netdev
In-Reply-To: <200712211406.53583.nakam@linux-ipv6.org>

Friday 21 December 2007 14:06, Masahide NAKAMURA wrote:
> Thanks, I'll resend by hand this time.
> Maybe I'll use your e-mail address without the name
> by current git-send-email.

Ah, they are already applied. I don't need to resend anymore.
I'll be more careful next time.

Regards,

-- 
Masahide NAKAMURA

^ permalink raw reply

* Re: [PATCH] [XFRM] IPv6: Fix dst/routing check at transformation.
From: Masahide NAKAMURA @ 2007-12-21  5:06 UTC (permalink / raw)
  To: David Miller; +Cc: usagi-core, herbert, netdev
In-Reply-To: <20071220.195054.23654691.davem@davemloft.net>

Friday 21 December 2007 12:50, David Miller wrote:
> From: Masahide NAKAMURA <nakam@linux-ipv6.org>
> Date: Fri, 21 Dec 2007 12:48:31 +0900
> 
> > My 5 patches for XFRM sent to netdev should have been addressed To: David, but they were not.
> > 
> > It does not seem that the command works for me.
> > git-send-email --to "David S. Miller <davem@davemloft.net>" --to herbert@gondor.apana.org.au --cc...
> > 
> > Please see my patches, even though they are not addressed To: you.
> 
> All of your patches won't make it anywhere.
> 
> In the email headers my name shows up like this:
> 
> 	David S. Miller
> 
> Email SMTP rules dictate that if special characters like
> "." appear in the name it must be surrounded by double
> quotes otherwise it is a syntax error.
> 
> This is a bug in git-send-email that I thought was fixed
> by now.  Perhaps it is fixed in git mainline and not any
> of the stable releases yet.
> 
> Perhaps you can submit them by hand until you resolve the
> git-send-email problem?

Thanks, I'll resend by hand this time.
Maybe I'll use your e-mail address without the name
by current git-send-email.

-- 
Masahide NAKAMURA
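
The quoting rule described above can be sketched as follows. This is an
illustration only, not git-send-email's code: name_needs_quotes and
format_mailbox are hypothetical names, the character set is the "specials"
list from RFC 5322, and names that already contain a double quote or
backslash are rejected rather than escaped:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns non-zero if the display name contains an RFC 5322 "special". */
static int name_needs_quotes(const char *name)
{
	return strpbrk(name, "()<>[]:;@\\,.\"") != NULL;
}

/* Writes `name <addr>` into out, quoting the display name when required. */
static int format_mailbox(char *out, size_t outlen,
			  const char *name, const char *addr)
{
	/* Escaping of embedded quotes/backslashes is not handled in this sketch. */
	assert(strpbrk(name, "\"\\") == NULL);

	if (name_needs_quotes(name))
		return snprintf(out, outlen, "\"%s\" <%s>", name, addr);
	return snprintf(out, outlen, "%s <%s>", name, addr);
}
```

A name such as David S. Miller gets wrapped in double quotes because of the
".", while a name without specials is emitted unchanged.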

^ permalink raw reply

* Re: After many hours all outbound connections get stuck in SYN_SENT
From: Glen Turner @ 2007-12-21  4:51 UTC (permalink / raw)
  To: James Nichols
  Cc: Jan Engelhardt, Eric Dumazet, linux-kernel, Linux Netdev List
In-Reply-To: <83a51e120712200837p9e3d1a4g15b5f4763597073e@mail.gmail.com>


> I do have TCP Sequence # Randomization enabled on my router.

Huh?  Do you mean a PIX blade in a Cisco switch-router chassis? It
would be very useful if you could be less vague about the
equipment in use.

>  However,
> if this was causing an issue, wouldn't it always occur and cause
> connection issues, not just after 38 hours of correct operation?

That depends more on your customers' networking attributes
than you are sharing or perhaps even know.  Perhaps your customer
base is very Windows-skewed and you simply aren't seeing any Sack
Permitted negotiations for the first 37.999 hours. Or
perhaps you've had a network glitch, and all of your
connections have done a Selective Ack, which the firewall
has trashed, leaving all the connections in a wacko state,
not just a few which you haven't noticed.

The actual failure mode needs a packet trace to determine,
but you should be able to do this yourself (or ask your
local network engineering staff).

If your firewall is trashing the Sack field, then it needs
to be fixed.  Time to raise a case with the Cisco TAC and
ask them directly if your PIX version has bug CSCse14419.
You can't expect Sack to work when it's being fed trash,
so it is important to make sure that is not happening.

Cheers, Glen
#include <network_engineer.h>
#undef KERNEL_HACKER


^ permalink raw reply

* Re: [PATCH 1/3] XFRM: Assorted IPsec fixups
From: David Miller @ 2007-12-21  4:49 UTC (permalink / raw)
  To: jmorris; +Cc: paul.moore, netdev, linux-audit, latten
In-Reply-To: <Xine.LNX.4.64.0712210925130.27551@us.intercode.com.au>

From: James Morris <jmorris@namei.org>
Date: Fri, 21 Dec 2007 09:25:38 +1100 (EST)

> On Thu, 20 Dec 2007, Paul Moore wrote:
> 
> > This patch fixes a number of small but potentially troublesome things in the
> > XFRM/IPsec code:
> > 
> >  * Use the 'audit_enabled' variable already in include/linux/audit.h
> >    Removed the need for extern declarations local to each XFRM audit function
> > 
> >  * Convert 'sid' to 'secid' everywhere we can
> >    The 'sid' name is specific to SELinux, 'secid' is the common naming
> >    convention used by the kernel when referring to tokenized LSM labels,
> >    unfortunately we have to leave 'ctx_sid' in 'struct xfrm_sec_ctx' otherwise
> >    we risk breaking userspace
> > 
> >  * Convert address display to use standard NIP* macros
> >    Similar to what was recently done with the SPD audit code, this also
> >    includes the removal of some unnecessary memcpy() calls
> > 
> >  * Move common code to xfrm_audit_common_stateinfo()
> >    Code consolidation from the "less is more" book on software development
> > 
> >  * Proper spacing around commas in function arguments
> >    Minor style tweak since I was already touching the code
> > 
> > Signed-off-by: Paul Moore <paul.moore@hp.com>
> 
> Acked-by: James Morris <jmorris@namei.org>

Applied.

^ permalink raw reply

* Re: [PATCH 3/3] [XFRM]: Add packet processing statistics option.
From: David Miller @ 2007-12-21  4:44 UTC (permalink / raw)
  To: nakam; +Cc: herbert, netdev, usagi-core
In-Reply-To: <119820852455-git-send-email-nakam@linux-ipv6.org>

From: Masahide NAKAMURA <nakam@linux-ipv6.org>
Date: Fri, 21 Dec 2007 12:42:04 +0900

> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>

Applied, thanks again.

^ permalink raw reply

* Re: [PATCH 2/3] [XFRM]: Support to increment packet dropping statistics.
From: David Miller @ 2007-12-21  4:43 UTC (permalink / raw)
  To: nakam; +Cc: herbert, netdev, usagi-core
In-Reply-To: <11982085193489-git-send-email-nakam@linux-ipv6.org>

From: Masahide NAKAMURA <nakam@linux-ipv6.org>
Date: Fri, 21 Dec 2007 12:41:59 +0900

> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>

Applied.

^ permalink raw reply

* Re: [PATCH 1/3] [XFRM]: Define packet dropping statistics.
From: David Miller @ 2007-12-21  4:43 UTC (permalink / raw)
  To: nakam; +Cc: herbert, netdev, usagi-core
In-Reply-To: <11982085132010-git-send-email-nakam@linux-ipv6.org>

From: Masahide NAKAMURA <nakam@linux-ipv6.org>
Date: Fri, 21 Dec 2007 12:41:53 +0900

> These statistics show the factors that caused packets to be dropped by
> transformation, at /proc/net/xfrm_stat, for developers.
> It is a counter designed from current transformation source code
> and defined as linux private MIB.
> 
> See Documentation/networking/xfrm_proc.txt for the detail.
> 
> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>

Patch applied.

This is very useful, thanks for implementing this feature.

^ permalink raw reply

* Re: [PATCH] [XFRM] MIPv6: Fix to input RO state correctly.
From: David Miller @ 2007-12-21  4:42 UTC (permalink / raw)
  To: nakam; +Cc: herbert, netdev, usagi-core
In-Reply-To: <11982084484155-git-send-email-nakam@linux-ipv6.org>

From: Masahide NAKAMURA <nakam@linux-ipv6.org>
Date: Fri, 21 Dec 2007 12:40:48 +0900

> Disable spin_lock during xfrm_type.input() function.
> Follow design as IPsec inbound does.
> 
> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>

Applied.

^ permalink raw reply

* Re: [PATCH] [XFRM] IPv6: Fix dst/routing check at transformation.
From: David Miller @ 2007-12-21  4:41 UTC (permalink / raw)
  To: nakam; +Cc: herbert, netdev, usagi-core
In-Reply-To: <11982084391595-git-send-email-nakam@linux-ipv6.org>

From: Masahide NAKAMURA <nakam@linux-ipv6.org>
Date: Fri, 21 Dec 2007 12:40:39 +0900

> IPv6 specific thing is wrongly removed from transformation at net-2.6.25.
> This patch recovers it with current design.
> 
> o Update "path" of xfrm_dst since IPv6 transformation should
>   care about routing changes. It is required by MIPv6 and
>   off-link destined IPsec.
> o Rename nfheader_len which is for non-fragment transformation used by
>   MIPv6 to rt6i_nfheader_len as IPv6 name space.
> 
> Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>

Applied.

^ permalink raw reply

* BNX2 warning
From: David Miller @ 2007-12-21  4:39 UTC (permalink / raw)
  To: mchan; +Cc: netdev


[ misspelled netdev list first time, retrying... ]

Michael, please fix this, thanks :-)

drivers/net/bnx2.c: In function 'bnx2_init_napi':
drivers/net/bnx2.c:7329: warning: no return statement in function returning non-void

^ permalink raw reply

* Re: TSO trimming question
From: David Miller @ 2007-12-21  4:36 UTC (permalink / raw)
  To: jheffner; +Cc: ilpo.jarvinen, netdev, herbert
In-Reply-To: <476A920D.6020401@psc.edu>

From: John Heffner <jheffner@psc.edu>
Date: Thu, 20 Dec 2007 11:02:21 -0500

> David Miller wrote:
> > From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
> > Date: Thu, 20 Dec 2007 13:40:51 +0200 (EET)
> > 
> >> [PATCH] [TCP]: Fix TSO deferring
> >>
> >> I'd say that most of what tcp_tso_should_defer had in between
> >> there was dead code because of this.
> >>
> >> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> > 
> > Yikes!!!!!
> > 
> > John, we've been living a lie for more than a year. :-/
> > 
> > On the bright side this explains a lot of small TSO frames I've been
> > seeing in traces over the past year but never got a chance to
> > investigate.
> 
> Ouch.  This fix may improve some benchmarks.
> 
> Re-checking this function was on my list of things to do because I had 
> also noticed some TSO frames that seemed a bit small.  This clearly 
> explains it.

What I'll do for now is I'll put this into net-2.6.25 to let it
"cook" for a while.  It's an obvious fix but it's enabling code
that's effectively been disabled for more than a year so something
might turn up that we need to fix.

^ permalink raw reply

* Evgeniy Polyakov
From: David Miller @ 2007-12-21  4:34 UTC (permalink / raw)
  To: netdev


If someone has a way other than email to contact Evgeniy, could
you please let him know that his email is bouncing in strange
ways.

I'll have to unsubscribe him if this goes on much longer, which
I don't want to do.

Thanks.

Here is some example bounce text:

451 4.0.0 readqf: cannot open ./dflBL48UH3032179: No such file or directory
552 5.3.4 Message is too large; 15000000 bytes max
554 5.0.0 Service unavailable

^ permalink raw reply

* Re: [PATCH 2/4] [CORE]: datagram: basic memory accounting functions
From: David Miller @ 2007-12-21  4:31 UTC (permalink / raw)
  To: haoki
  Cc: herbert, netdev, tyasui, mhiramat, satoshi.oshima.fk, billfink,
	andi, johnpol, shemminger, yoshfuji, yumiko.sugita.yf
In-Reply-To: <476B3EAE.4070809@redhat.com>

From: Hideo AOKI <haoki@redhat.com>
Date: Thu, 20 Dec 2007 23:18:54 -0500

> > Also, the memory accounting is done at different parts in
> > the socket code paths for stream vs. datagram.  This is why
> > everything is inconsistent, and, a mess.
> 
> Could you tell me more detailed information?

I think the core thing is that TCP and INET protocols call into
the memory accounting internally, either inside their own code
paths or with inet_*() helpers.

This is versus what we really want is everything happening via generic
sk_foo() helpers.

If that's what's happening already, great, just consolidate the
datagram vs. stream stuff and it should be good.
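
One way to read the suggestion above, as a rough sketch only: a single
scheduling helper shared by stream and datagram sockets, with one
per-protocol test. Every name here (sketch_proto, sketch_sock,
sketch_mem_schedule, the memory_accounting flag, the 4096-byte quantum) is
hypothetical and is not the kernel's API:

```c
#include <assert.h>

#define SKETCH_QUANTUM 4096	/* assumption: accounting unit in bytes */

struct sketch_proto {
	int  memory_accounting;	/* hypothetical opt-in flag per protocol */
	long memory_allocated;	/* quanta currently charged to the protocol */
	long limit;		/* maximum quanta the protocol may hold */
};

struct sketch_sock {
	struct sketch_proto *prot;
	int forward_alloc;	/* bytes already reserved for this socket */
};

/* Returns 1 if `size` bytes may be charged to the socket, 0 otherwise. */
static int sketch_mem_schedule(struct sketch_sock *sk, int size)
{
	struct sketch_proto *prot = sk->prot;
	int quanta;

	assert(size >= 0);

	/* The single per-protocol test: protocols that opt out always succeed. */
	if (!prot->memory_accounting)
		return 1;

	quanta = (size + SKETCH_QUANTUM - 1) / SKETCH_QUANTUM;
	if (prot->memory_allocated + quanta > prot->limit)
		return 0;

	prot->memory_allocated += quanta;
	sk->forward_alloc += quanta * SKETCH_QUANTUM;
	return 1;
}
```

The point of the shape is that the helper never asks whether the socket is
stream or datagram; only the protocol's own flag and counters differ.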


^ permalink raw reply

* Re: [PATCH 2/4] [CORE]: datagram: basic memory accounting functions
From: Hideo AOKI @ 2007-12-21  4:18 UTC (permalink / raw)
  To: David Miller
  Cc: herbert, netdev, tyasui, mhiramat, satoshi.oshima.fk, billfink,
	andi, johnpol, shemminger, yoshfuji, yumiko.sugita.yf, haoki
In-Reply-To: <20071220.034304.212231150.davem@davemloft.net>

Hello,

Thank you so much for your comments.

David Miller wrote:

> All of these other functions are identical copies of the stream
> counterparts, they should all be consolidated.
> 
> I still see a lot of special casing, instead of large pieces of common
> code.
> 
> There should be one core set of functions that handle the memory
> accounting, regardless of socket type.  Maybe there is one spot where
> something like sk->prot->doing_memory_accounting is tested, but that's
> it.

I understood. I'll re-write my patch set to make memory accounting
core functions.

> Also, the memory accounting is done at different parts in
> the socket code paths for stream vs. datagram.  This is why
> everything is inconsistent, and, a mess.

Could you tell me more detailed information?

Does this comment mean interface and usage of memory accounting functions?
If so, I'll consolidate functions like sk_stream_set_owner_r() and
sk_stream_free_skb(). And, I may have to use memory accounting functions in
memory allocating functions like sk_stream_alloc_pskb() as possible
instead of inside of socket operating functions.

Or, does the comment mean that send buffer accounting in IP layer
(e.g. ip_append_data()) is wrong?

Anyway, in next patch set, I'm going to consolidate mem_schedule
functions and mem_reclaim functions. To do so, some of memory
accounting functions for stream protocols will be renamed or
be moved to core/sock.c from core/stream.c.

I would like to know what kind of enhancement must be needed for
memory accounting core functions.

Again, thank you for taking your time to review this feature.

Best regards,
Hideo

-- 
Hitachi Computer Products (America) Inc.

^ permalink raw reply

