From: Fan Du <fengyuleidian0615@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Fan Du <fan.du@intel.com>,
johnwheffner@gmail.com, edumazet@google.com, davem@davemloft.net,
netdev@vger.kernel.org
Subject: Re: [PATCHv2 net-next 4/4] ipv4: Create probe timer for tcp PMTU as per RFC4821
Date: Thu, 26 Feb 2015 14:24:10 +0800 [thread overview]
Message-ID: <54EEBC0A.1040103@gmail.com> (raw)
In-Reply-To: <1424924362.5565.143.camel@edumazet-glaptop2.roam.corp.google.com>
于 2015年02月26日 12:19, Eric Dumazet 写道:
> On Thu, 2015-02-26 at 11:49 +0800, Fan Du wrote:
>> As per RFC4821 7.3. Selecting Probe Size, a probe timer should
>> be armed once probing has converged. Once this timer expired,
>> probing again to take advantage of any path PMTU change. The
>> recommended probing interval is 10 minutes per RFC1981. Probing
>> interval could be sysctled by sysctl_tcp_probe_interval.
>>
>> Eric suggested to implement pseudo timer based on 32bits jiffies
>> tcp_time_stamp instead of using classic timer for such rare event.
>>
>> Signed-off-by: Fan Du <fan.du@intel.com>
>> ---
>> v2:
>> - Create a pseudo timer based on 32bits jiffies tcp_time_stamp
>> to control when to reprobing as suggested by Eric.
>
> ...
>
>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> index c418829..0e9ee6a 100644
>> --- a/net/ipv4/tcp_output.c
>> +++ b/net/ipv4/tcp_output.c
>> @@ -1354,6 +1354,8 @@ void tcp_mtup_init(struct sock *sk)
>> icsk->icsk_af_ops->net_header_len;
>> icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, net->ipv4.sysctl_tcp_base_mss);
>> icsk->icsk_mtup.probe_size = 0;
>> + if (icsk->icsk_mtup.enabled)
>
> I guess the test can be removed.
>
> Assignment of probe_timestamp can be done regardless of the flag.
Exactly speaking, the assignment should honor enabled variable,
once enabled is 1, probing actually starts, so probe_timestamp should also be marked from that point.
one place missing is below:
@@ -108,6 +108,7 @@ static void tcp_blackhole_probe(struct inet_connection_sock *icsk,
if (net->ipv4.sysctl_tcp_mtu_probing) {
if (!icsk->icsk_mtup.enabled) {
icsk->icsk_mtup.enabled = 1;
+ icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
>> + icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
>> }
>> EXPORT_SYMBOL(tcp_mtup_init);
>>
>> @@ -1823,6 +1825,35 @@ send_now:
>> return false;
>> }
>>
>> +static inline void tcp_mtu_check_reprobe(struct sock *sk)
>> +{
>> + struct inet_connection_sock *icsk = inet_csk(sk);
>> + struct net *net = sock_net(sk);
>> + struct tcp_sock *tp = tcp_sk(sk);
>> +
>
> extra newline
>
>> + u32 delta;
>> + u32 now = tcp_time_stamp;
>> + u32 interval = net->ipv4.sysctl_tcp_probe_interval;
>
>
> Please try to keep variables sorted by lengths, it helps readability :
>
> u32 interval = net->ipv4.sysctl_tcp_probe_interval;
> struct inet_connection_sock *icsk = inet_csk(sk);
> struct tcp_sock *tp = tcp_sk(sk);
> struct net *net = sock_net(sk);
> s32 delta;
then I have to arrange those variable like this ;)
+static inline void tcp_mtu_check_reprobe(struct sock *sk)
+{
+ struct inet_connection_sock *icsk = inet_csk(sk);
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct net *net = sock_net(sk);
+ u32 interval;
+ s32 delta;
+
+ interval = net->ipv4.sysctl_tcp_probe_interval;
+ delta = tcp_time_stamp - icsk->icsk_mtup.probe_timestamp;
+ if (unlikely(delta >= interval * HZ)) {
+ int mss = tcp_current_mss(sk);
+
+ /* Update current search range */
+ icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp +
+ sizeof(struct tcphdr) +
+ icsk->icsk_af_ops->net_header_len;
+ icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
+ icsk->icsk_mtup.probe_size = 0;
+
+ /* Update probe time stamp */
+ icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
+ }
+}
And thanks for the all the comments.
>
>
>> +
>> + if (likely(now > icsk->icsk_mtup.probe_timestamp))
>
> Wrapping issue.
>
> if ((s32)(now - icsk->icsk_mtup.probe_timestamp)) > 0) ...
>
>
>> + delta = now - icsk->icsk_mtup.probe_timestamp;
>> + else
>> + delta = now + (U32_MAX - icsk->icsk_mtup.probe_timestamp);
>
>
> oh well..
>
> delta = tcp_time_stamp - icsk->icsk_mtup.probe_timestamp;
>
> is fine (if delta is declared as s32).
> Adding U32_MAX makes no sense. No need for @now var.
>
>> + if (unlikely(jiffies_to_msecs(delta) >= interval * MSEC_PER_SC)) {
>
> if (unlikely(delta >= interval * HZ)) {
>
>> + int mss = tcp_current_mss(sk);
>> +
>> + /* Update current search range */
>> + icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp +
>> + sizeof(struct tcphdr) +
>> + icsk->icsk_af_ops->net_header_len;
>> + icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
>> + icsk->icsk_mtup.probe_size = 0;
>> +
>> + /* Update probe time stamp */
>> + icsk->icsk_mtup.probe_timestamp = now;
>
> icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
>> + }
>> +}
>> +
>> /* Create a new MTU probe if we are ready.
>> * MTU probe is regularly attempting to increase the path MTU by
>> * deliberately sending larger packets. This discovers routing
>> @@ -1866,8 +1897,11 @@ static int tcp_mtu_probe(struct sock *sk)
>> size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache;
>> interval = icsk->icsk_mtup.search_high - icsk->icsk_mtup.search_low;
>> if (probe_size > tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_high) ||
>> - interval < net->ipv4.sysctl_tcp_probe_threshold) {
>> - /* TODO: set timer for probe_converge_event */
>> + interval < net->ipv4.sysctl_tcp_probe_threshold) {
>> + /* Check whether enough time has elaplased for
>> + * another round of probing.
>> + */
>> + tcp_mtu_check_reprobe(sk);
>> return -1;
>> }
>>
>
>
next prev parent reply other threads:[~2015-02-26 6:29 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-13 8:16 [PATCH net-next 0/3] Small fix for TCP PMTU Fan Du
2015-02-13 8:16 ` [PATCH net-next 1/3] ipv4: Raise tcp PMTU probe mss base size Fan Du
2015-02-13 9:49 ` yzhu1
2015-02-16 5:15 ` Fan Du
2015-02-13 8:16 ` [PATCH net-next 2/3] ipv4: Use binary search to choose tcp PMTU probe_size Fan Du
2015-02-13 17:52 ` John Heffner
2015-02-16 5:27 ` Fan Du
2015-02-16 23:59 ` John Heffner
2015-02-13 8:16 ` [PATCH net-next 3/3] ipv4: Create probe timer for tcp PMTU as per RFC4821 Fan Du
2015-02-13 9:59 ` Ying Xue
2015-02-16 5:28 ` Fan Du
2015-02-13 12:31 ` Eric Dumazet
2015-02-16 5:38 ` Fan Du
2015-02-16 12:19 ` Eric Dumazet
2015-02-26 3:49 ` [PATCHv2 net-next 0/4] Small fix for TCP PMTU Fan Du
2015-02-26 3:49 ` [PATCHv2 net-next 1/4] ipv4: Raise tcp PMTU probe mss base size Fan Du
2015-02-26 3:49 ` [PATCHv2 net-next 2/4] ipv4: Use binary search to choose tcp PMTU probe_size Fan Du
2015-02-27 22:17 ` David Miller
2015-02-26 3:49 ` [PATCHv2 net-next 3/4] ipv4: shrink current mss for tcp PMTU blackhole detection Fan Du
2015-02-26 3:49 ` [PATCHv2 net-next 4/4] ipv4: Create probe timer for tcp PMTU as per RFC4821 Fan Du
2015-02-26 4:19 ` Eric Dumazet
2015-02-26 6:24 ` Fan Du [this message]
2015-02-26 13:40 ` [PATCHv2 net-next 0/4] Small fix for TCP PMTU David Laight
2015-02-27 5:37 ` Fan Du
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54EEBC0A.1040103@gmail.com \
--to=fengyuleidian0615@gmail.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.dumazet@gmail.com \
--cc=fan.du@intel.com \
--cc=johnwheffner@gmail.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.