From: "David S. Miller" <davem@davemloft.net>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: ak@suse.de, niv@us.ibm.com, andy.grover@gmail.com,
anton@samba.org, netdev@oss.sgi.com
Subject: Re: bad TSO performance in 2.6.9-rc2-BK
Date: Tue, 28 Sep 2004 14:10:02 -0700
Message-ID: <20040928141002.164c60af.davem@davemloft.net>
In-Reply-To: <20040927213233.GC7243@gondor.apana.org.au>
[-- Attachment #1: Type: text/plain, Size: 1069 bytes --]
On Tue, 28 Sep 2004 07:32:33 +1000
Herbert Xu <herbert@gondor.apana.org.au> wrote:
> On Mon, Sep 27, 2004 at 12:01:54PM -0700, David S. Miller wrote:
> >
> > > tcp_current_mss() doesn't call tcp_sync_mss() unless the PMTU changes.
> >
> > Good catch, probably we should make it do so when sk_route_caps
> > indicates we are doing TSO.
>
> Alternatively we could move the TSO code out of tcp_sync_mss() and
> put it in tcp_current_mss() instead. It seems to be the only one
> using the factor anyway.
Ok, here are 2 patches incorporating all of the things
we discussed in this area:
1) Uninline tcp_current_mss(), fix tcp_sync_mss() return
value to match tcp_current_mss()'s
2) Fix the do_large calculation bug in tcp_current_mss() as
per Herbert's original patch.
3) Move TSO mss calculation work to tcp_current_mss(). We have
to do something like this since tcp_sync_mss() is only invoked
when the PMTU changes, whereas the TSO MTU is dependent upon
both the path and the current congestion window.
So, these patches should wrap up these issues.
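As a rough illustration of item 3, the calculation that moves into
tcp_current_mss() can be sketched as standalone userspace C. This is
only a sketch under assumptions, not the kernel code: struct mss_inputs
and sketch_current_mss() are hypothetical names, the four header
lengths are collapsed into one hdr_len field, and the two SACK option
sizes are hard-coded stand-ins for TCPOLEN_SACK_BASE_ALIGNED and
TCPOLEN_SACK_PERBLOCK.

```c
#include <assert.h>

/* Hypothetical stand-in for the struct tcp_opt fields the patch reads. */
struct mss_inputs {
	unsigned int mss_std;     /* tp->mss_cache_std, set on PMTU change */
	unsigned int hdr_len;     /* net + ext + ext2 + tcp header bytes */
	unsigned int tcp_hdr_len; /* tp->tcp_header_len */
	unsigned int max_window;  /* tp->max_window, 0 if not known yet */
	unsigned int snd_cwnd;    /* tp->snd_cwnd, in segments */
	unsigned int eff_sacks;   /* tp->eff_sacks */
};

/* Assumed values of the kernel's SACK option length constants. */
#define SACK_BASE_ALIGNED 4U
#define SACK_PERBLOCK     8U

/* Recompute the effective MSS on every call, so that a change in
 * snd_cwnd is picked up even when the PMTU has not changed.
 */
static unsigned int sketch_current_mss(const struct mss_inputs *in,
				       int do_large)
{
	unsigned int mss_now = in->mss_std;

	assert(in->mss_std > 0);

	if (do_large) {
		unsigned int large_mss = 65535U - in->hdr_len;
		unsigned int half_win = in->max_window >> 1;
		unsigned int factor;

		/* Never build a frame larger than half the peer's window. */
		if (in->max_window && large_mss > half_win) {
			unsigned int min_mss = 68U - in->tcp_hdr_len;

			large_mss = half_win > min_mss ? half_win : min_mss;
		}

		/* Keep the large mss a multiple of the real mss, but do
		 * not exceed the congestion window. As in the patch, a
		 * factor of zero is not special-cased here.
		 */
		factor = large_mss / in->mss_std;
		if (factor > in->snd_cwnd)
			factor = in->snd_cwnd;

		mss_now = in->mss_std * factor;
	}

	/* Leave room for the SACK option when SACK blocks are pending. */
	if (in->eff_sacks)
		mss_now -= SACK_BASE_ALIGNED + in->eff_sacks * SACK_PERBLOCK;

	return mss_now;
}
```

With mss_std = 1448, hdr_len = 52 and no known max_window, a cwnd of 2
gives 2896 and a cwnd of 10 gives 14480, with the PMTU untouched the
whole time. That is the case tcp_sync_mss() alone cannot track.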
[-- Attachment #2: diff1 --]
[-- Type: application/octet-stream, Size: 3591 bytes --]
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/09/28 13:26:54-07:00 davem@nuts.davemloft.net
# [TCP]: Uninline tcp_current_mss().
#
# Also fix the return value of tcp_sync_mss() to
# be unsigned.
#
# Signed-off-by: David S. Miller <davem@davemloft.net>
#
# net/ipv4/tcp_output.c
# 2004/09/28 13:26:01-07:00 davem@nuts.davemloft.net +31 -1
# [TCP]: Uninline tcp_current_mss().
#
# include/net/tcp.h
# 2004/09/28 13:26:00-07:00 davem@nuts.davemloft.net +2 -32
# [TCP]: Uninline tcp_current_mss().
#
diff -Nru a/include/net/tcp.h b/include/net/tcp.h
--- a/include/net/tcp.h 2004-09-28 13:49:22 -07:00
+++ b/include/net/tcp.h 2004-09-28 13:49:22 -07:00
@@ -961,7 +961,8 @@
extern void tcp_delete_keepalive_timer (struct sock *);
extern void tcp_reset_keepalive_timer (struct sock *, unsigned long);
-extern int tcp_sync_mss(struct sock *sk, u32 pmtu);
+extern unsigned int tcp_sync_mss(struct sock *sk, u32 pmtu);
+extern unsigned int tcp_current_mss(struct sock *sk, int large);
extern const char timer_bug_msg[];
@@ -1033,37 +1034,6 @@
default:
printk(timer_bug_msg);
};
-}
-
-/* Compute the current effective MSS, taking SACKs and IP options,
- * and even PMTU discovery events into account.
- *
- * LARGESEND note: !urg_mode is overkill, only frames up to snd_up
- * cannot be large. However, taking into account rare use of URG, this
- * is not a big flaw.
- */
-
-static inline unsigned int tcp_current_mss(struct sock *sk, int large)
-{
- struct tcp_opt *tp = tcp_sk(sk);
- struct dst_entry *dst = __sk_dst_get(sk);
- int do_large, mss_now;
-
- do_large = (large &&
- (sk->sk_route_caps & NETIF_F_TSO) &&
- !tp->urg_mode);
- mss_now = do_large ? tp->mss_cache : tp->mss_cache_std;
-
- if (dst) {
- u32 mtu = dst_pmtu(dst);
- if (mtu != tp->pmtu_cookie ||
- tp->ext2_header_len != dst->header_len)
- mss_now = tcp_sync_mss(sk, mtu);
- }
- if (tp->eff_sacks)
- mss_now -= (TCPOLEN_SACK_BASE_ALIGNED +
- (tp->eff_sacks * TCPOLEN_SACK_PERBLOCK));
- return mss_now;
}
/* Initialize RCV_MSS value.
diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c 2004-09-28 13:49:22 -07:00
+++ b/net/ipv4/tcp_output.c 2004-09-28 13:49:22 -07:00
@@ -603,7 +603,7 @@
this function. --ANK (980731)
*/
-int tcp_sync_mss(struct sock *sk, u32 pmtu)
+unsigned int tcp_sync_mss(struct sock *sk, u32 pmtu)
{
struct tcp_opt *tp = tcp_sk(sk);
struct dst_entry *dst = __sk_dst_get(sk);
@@ -661,6 +661,36 @@
return mss_now;
}
+/* Compute the current effective MSS, taking SACKs and IP options,
+ * and even PMTU discovery events into account.
+ *
+ * LARGESEND note: !urg_mode is overkill, only frames up to snd_up
+ * cannot be large. However, taking into account rare use of URG, this
+ * is not a big flaw.
+ */
+
+unsigned int tcp_current_mss(struct sock *sk, int large)
+{
+ struct tcp_opt *tp = tcp_sk(sk);
+ struct dst_entry *dst = __sk_dst_get(sk);
+ int do_large, mss_now;
+
+ do_large = (large &&
+ (sk->sk_route_caps & NETIF_F_TSO) &&
+ !tp->urg_mode);
+ mss_now = do_large ? tp->mss_cache : tp->mss_cache_std;
+
+ if (dst) {
+ u32 mtu = dst_pmtu(dst);
+ if (mtu != tp->pmtu_cookie ||
+ tp->ext2_header_len != dst->header_len)
+ mss_now = tcp_sync_mss(sk, mtu);
+ }
+ if (tp->eff_sacks)
+ mss_now -= (TCPOLEN_SACK_BASE_ALIGNED +
+ (tp->eff_sacks * TCPOLEN_SACK_PERBLOCK));
+ return mss_now;
+}
/* This routine writes packets to the network. It advances the
* send_head. This happens as incoming acks open up the remote
[-- Attachment #3: diff2 --]
[-- Type: application/octet-stream, Size: 2712 bytes --]
# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
# 2004/09/28 13:46:58-07:00 davem@nuts.davemloft.net
# [TCP]: Move TSO mss calcs to tcp_current_mss()
#
# Based upon a bug fix patch and suggestions from
# Herbert Xu <herbert@gondor.apana.org.au>
#
# Signed-off-by: David S. Miller <davem@davemloft.net>
#
# net/ipv4/tcp_output.c
# 2004/09/28 13:46:28-07:00 davem@nuts.davemloft.net +29 -24
# [TCP]: Move TSO mss calcs to tcp_current_mss()
#
# Based upon a bug fix patch and suggestions from
# Herbert Xu <herbert@gondor.apana.org.au>
#
# Signed-off-by: David S. Miller <davem@davemloft.net>
#
diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
--- a/net/ipv4/tcp_output.c 2004-09-28 13:49:37 -07:00
+++ b/net/ipv4/tcp_output.c 2004-09-28 13:49:37 -07:00
@@ -639,25 +639,6 @@
tp->pmtu_cookie = pmtu;
tp->mss_cache = tp->mss_cache_std = mss_now;
- if (sk->sk_route_caps & NETIF_F_TSO) {
- int large_mss, factor;
-
- large_mss = 65535 - tp->af_specific->net_header_len -
- tp->ext_header_len - tp->ext2_header_len - tp->tcp_header_len;
-
- if (tp->max_window && large_mss > (tp->max_window>>1))
- large_mss = max((tp->max_window>>1), 68U - tp->tcp_header_len);
-
- /* Always keep large mss multiple of real mss, but
- * do not exceed congestion window.
- */
- factor = large_mss / mss_now;
- if (factor > tp->snd_cwnd)
- factor = tp->snd_cwnd;
-
- tp->mss_cache = mss_now * factor;
- }
-
return mss_now;
}
@@ -675,17 +656,41 @@
struct dst_entry *dst = __sk_dst_get(sk);
int do_large, mss_now;
- do_large = (large &&
- (sk->sk_route_caps & NETIF_F_TSO) &&
- !tp->urg_mode);
- mss_now = do_large ? tp->mss_cache : tp->mss_cache_std;
-
+ mss_now = tp->mss_cache_std;
if (dst) {
u32 mtu = dst_pmtu(dst);
if (mtu != tp->pmtu_cookie ||
tp->ext2_header_len != dst->header_len)
mss_now = tcp_sync_mss(sk, mtu);
}
+
+ do_large = (large &&
+ (sk->sk_route_caps & NETIF_F_TSO) &&
+ !tp->urg_mode);
+
+ if (do_large) {
+ int large_mss, factor;
+
+ large_mss = 65535 - tp->af_specific->net_header_len -
+ tp->ext_header_len - tp->ext2_header_len -
+ tp->tcp_header_len;
+
+ if (tp->max_window && large_mss > (tp->max_window>>1))
+ large_mss = max((tp->max_window>>1),
+ 68U - tp->tcp_header_len);
+
+ /* Always keep large mss multiple of real mss, but
+ * do not exceed congestion window.
+ */
+ factor = large_mss / mss_now;
+ if (factor > tp->snd_cwnd)
+ factor = tp->snd_cwnd;
+
+ tp->mss_cache = mss_now * factor;
+
+ mss_now = tp->mss_cache;
+ }
+
if (tp->eff_sacks)
mss_now -= (TCPOLEN_SACK_BASE_ALIGNED +
(tp->eff_sacks * TCPOLEN_SACK_PERBLOCK));