* [DCCP] [Patch 0/6] [BUG-Fix]: Fixes for CCID3, seqnos, and dccp_probe
@ 2008-01-28 10:16 Gerrit Renker
2008-01-28 10:16 ` [PATCH 1/6] [CCID3]: Function returned wrong value Gerrit Renker
0 siblings, 1 reply; 8+ messages in thread
From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw)
To: acme; +Cc: dccp, netdev
This is a set of bug fixes for CCID3 and general DCCP.
Please consider patches #1, #2, #3. The remainder are for the test tree
(but are fixes nonetheless) and may not apply directly onto mainline; with
regard to patch #6, please see note at end of message.
Patch #1: Fixes a CCID3 bug: when loss was encountered, the value returned at
minimum resolution was wrong and over 80 times too high.
Patch #2: Fixes a bug in the assignment of GAR (used ISR instead of ISS).
Patch #3: Another bug fix: AWL was only ever set after the Response.
Patch #4: Fixes adjustments to AWL and SWL. These were only updated at the
begin of the connection (where the statements reduced to much
simpler assignments), but need to be adjusted for the reception
of subsequent packets also.
(Note: This patch is best used after going through the feature
negotiation patches, since the feature handlers for Sequence
Window ensure that these values are up to date.)
Patch #5: As a consequence of the preceding patches, the dccp_connect_init
function may be merged into dccp_connect. Is a suggestion (which
would make sense), not a bug fix.
Patch #6: A fix for dccp_probe - this used to be noisy and created huge
files. By attaching the dccp_probe onto a different function, this
was fixed, the output file size shrank by a factor of over 100,
with qualitatively the same or better output.
I have put these after the feature-negotiation patches onto
git://eden-feed.erg.abdn.ac.uk/dccp_exp {dccp,ccid4}
and updated the CCID4 tree to take advantage of the dccp_probe update.
If people would like to use the dccp_probe update for a mainline kernel,
I have uploaded a similar patch to patch #6 onto
http://www.erg.abdn.ac.uk/users/gerrit/dccp/dccp_probe_for_mainline.diff
^ permalink raw reply [flat|nested] 8+ messages in thread* [PATCH 1/6] [CCID3]: Function returned wrong value 2008-01-28 10:16 [DCCP] [Patch 0/6] [BUG-Fix]: Fixes for CCID3, seqnos, and dccp_probe Gerrit Renker @ 2008-01-28 10:16 ` Gerrit Renker 2008-01-28 10:16 ` [PATCH 2/6] [DCCP]: Bug in initial acknowledgment number assignment Gerrit Renker 0 siblings, 1 reply; 8+ messages in thread From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw) To: acme; +Cc: dccp, netdev, Gerrit Renker This fixes a bug in the reverse lookup of p, given a value f(p) (used in the reverse-lookup of the first/synthetic loss interval in RFC 3448, 6.3.1): instead of p, the function returned the smallest table value f(p). The smallest tabulated value of 10^6 * f(p) = sqrt(2*p/3) + 12 * sqrt(3*p/8) * (32 * p^3 + p) for p=0.0001 is 8172. Since this value is scaled by 10^6, the impact of this bug is that a loss of 8172/10^6 = 0.8172% was reported whenever the input was below the table resolution of 0.01%. This means that the value was over 80 times too high, resulting in large spikes of the initial loss interval, and thus unnecessarily reducing the throughput. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> --- net/dccp/ccids/lib/tfrc_equation.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) --- a/net/dccp/ccids/lib/tfrc_equation.c +++ b/net/dccp/ccids/lib/tfrc_equation.c @@ -677,11 +677,11 @@ u32 tfrc_calc_x_reverse_lookup(u32 fvalue) /* Error cases. */ if (fvalue < tfrc_calc_x_lookup[0][1]) { DCCP_WARN("fvalue %d smaller than resolution\n", fvalue); - return tfrc_calc_x_lookup[0][1]; + return TFRC_SMALLEST_P; } if (fvalue > tfrc_calc_x_lookup[TFRC_CALC_X_ARRSIZE - 1][0]) { DCCP_WARN("fvalue %d exceeds bounds!\n", fvalue); - return 1000000; + return 1000000; /* The maximum: 100% scaled by 10^6 */ } if (fvalue <= tfrc_calc_x_lookup[TFRC_CALC_X_ARRSIZE - 1][1]) { ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/6] [DCCP]: Bug in initial acknowledgment number assignment 2008-01-28 10:16 ` [PATCH 1/6] [CCID3]: Function returned wrong value Gerrit Renker @ 2008-01-28 10:16 ` Gerrit Renker 2008-01-28 10:16 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Gerrit Renker 0 siblings, 1 reply; 8+ messages in thread From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw) To: acme; +Cc: dccp, netdev, Gerrit Renker Step 8.5 in RFC 4340 says for the newly cloned socket Initialize S.GAR := S.ISS, but what in fact the code (minisocks.c) does is Initialize S.GAR := S.ISR, which is wrong (typo?) -- fixed by the patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> --- net/dccp/minisocks.c | 7 +++---- 1 files changed, 3 insertions(+), 4 deletions(-) --- a/net/dccp/minisocks.c +++ b/net/dccp/minisocks.c @@ -122,13 +122,12 @@ struct sock *dccp_create_openreq_child(struct sock *sk, * Initialize S.GAR := S.ISS * Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookies */ + newdp->dccps_gar = newdp->dccps_iss = dreq->dreq_iss; + dccp_update_gss(newsk, dreq->dreq_iss); - newdp->dccps_gar = newdp->dccps_isr = dreq->dreq_isr; + newdp->dccps_isr = dreq->dreq_isr; dccp_update_gsr(newsk, dreq->dreq_isr); - newdp->dccps_iss = dreq->dreq_iss; - dccp_update_gss(newsk, dreq->dreq_iss); - /* * SWL and AWL are initially adjusted so that they are not less than * the initial Sequence Numbers received and sent, respectively: ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated 2008-01-28 10:16 ` [PATCH 2/6] [DCCP]: Bug in initial acknowledgment number assignment Gerrit Renker @ 2008-01-28 10:16 ` Gerrit Renker 2008-01-28 10:16 ` [PATCH 4/6] [DCCP]: Fix the adjustments to AWL and SWL Gerrit Renker 2008-01-28 17:59 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Ian McDonald 0 siblings, 2 replies; 8+ messages in thread From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw) To: acme; +Cc: dccp, netdev, Gerrit Renker This patch was triggered by finding the following message in the syslog: "kernel: dccp_check_seqno: DCCP: Step 6 failed for DATAACK packet, [...] P.ackno exists or LAWL(82947089) <= P.ackno(82948208) <= S.AWH(82948728), sending SYNC..." Note the difference between AWH and AWL: it is 1639 packets (while Sequence Window was actually at 100). A closer look at the trace showed that LAWL = AWL = 82947089 equalled the ISS on the Response. The cause of the bug was that AWL was only ever set on the first packet - the DCCP-Request sent by dccp_v{4,6}_connect(). The fix is to continually update AWL/AWH with each new packet (as GSS=AWH). In addition, AWL/AWH are now updated to enforce more stringent checks on the initial sequence numbers when connecting: * AWL is initialised to ISS and remains at this value; * AWH is always set to GSS (via dccp_update_gss()); * so on the first Request: AWL = AWH = ISS, and on the n-th Request: AWL = ISS, AWH = ISS+n. As a consequence, only Response packets that refer to Requests sent by this host will pass, all others are discarded. This is the intention and in effect implements the initial adjustments for AWL as specified in RFC 4340, 7.5.1. Note: A problem that remains is that ISS can potentially be under-run even after the initial handshake; this is addressed a subsequent patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> --- net/dccp/output.c | 34 +++++++++++++++------------------- 1 files changed, 15 insertions(+), 19 deletions(-) --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -53,8 +53,11 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) dccp_packet_hdr_len(dcb->dccpd_type); int err, set_ack = 1; u64 ackno = dp->dccps_gsr; - - dccp_inc_seqno(&dp->dccps_gss); + /* + * Increment GSS here already in case the option code needs it. + * Update GSS for real only if option processing below succeeds. + */ + dcb->dccpd_seq = ADD48(dp->dccps_gss, 1); switch (dcb->dccpd_type) { case DCCP_PKT_DATA: @@ -66,6 +69,9 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) case DCCP_PKT_REQUEST: set_ack = 0; + /* Use ISS on the first (non-retransmitted) Request. */ + if (icsk->icsk_retransmits == 0) + dcb->dccpd_seq = dp->dccps_iss; /* fall through */ case DCCP_PKT_SYNC: @@ -84,14 +90,11 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) break; } - dcb->dccpd_seq = dp->dccps_gss; - if (dccp_insert_options(sk, skb)) { kfree_skb(skb); return -EPROTO; } - /* Build DCCP header and checksum it. */ dh = dccp_zeroed_hdr(skb, dccp_header_size); dh->dccph_type = dcb->dccpd_type; @@ -103,7 +106,7 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) /* XXX For now we're using only 48 bits sequence numbers */ dh->dccph_x = 1; - dp->dccps_awh = dp->dccps_gss; + dccp_update_gss(sk, dcb->dccpd_seq); dccp_hdr_set_seq(dh, dp->dccps_gss); if (set_ack) dccp_hdr_set_ack(dccp_hdr_ack_bits(skb), ackno); @@ -112,6 +115,11 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb) case DCCP_PKT_REQUEST: dccp_hdr_request(skb)->dccph_req_service = dp->dccps_service; + /* + * Limit Ack window to ISS <= P.ackno <= GSS, so that + * only Responses to Requests we sent are considered. + */ + dp->dccps_awl = dp->dccps_iss; break; case DCCP_PKT_RESET: dccp_hdr_reset(skb)->dccph_reset_code = @@ -447,19 +455,7 @@ static inline void dccp_connect_init(struct sock *sk) dccp_sync_mss(sk, dst_mtu(dst)); - /* - * SWL and AWL are initially adjusted so that they are not less than - * the initial Sequence Numbers received and sent, respectively: - * SWL := max(GSR + 1 - floor(W/4), ISR), - * AWL := max(GSS - W' + 1, ISS). - * These adjustments MUST be applied only at the beginning of the - * connection. - */ - dccp_update_gss(sk, dp->dccps_iss); - dccp_set_seqno(&dp->dccps_awl, max48(dp->dccps_awl, dp->dccps_iss)); - - /* S.GAR - greatest valid acknowledgement number received on a non-Sync; - * initialized to S.ISS (sec. 8.5) */ + /* Initialise GAR as per 8.5; AWL/AWH are set in dccp_transmit_skb() */ dp->dccps_gar = dp->dccps_iss; icsk->icsk_retransmits = 0; ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 4/6] [DCCP]: Fix the adjustments to AWL and SWL 2008-01-28 10:16 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Gerrit Renker @ 2008-01-28 10:16 ` Gerrit Renker 2008-01-28 10:16 ` [PATCH 5/6] [DCCP]: Merge now-reduced connect_init() function Gerrit Renker 2008-01-28 17:59 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Ian McDonald 1 sibling, 1 reply; 8+ messages in thread From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw) To: acme; +Cc: dccp, netdev, Gerrit Renker This fixes a problem and a potential loophole with regard to seqno/ackno validity: the problem is that the initial adjustments to AWL/SWL were only performed at the begin of the connection, during the handshake. Since the Sequence Window feature is always greater than Wmin=32 (7.5.2), it is however necessary to perform these adjustments at least for the first W/W' (variables as per 7.5.1) packets in the lifetime of a connection. This requirement is complicated by the fact that W/W' can change at any time during the lifetime of a connection. Therefore the consequence is to perform this safety check each time SWL/AWL are updated. A second problem solved by this patch is that the remote/local Sequence Window feature values (which set the bounds for AWL/SWL/SWH) are undefined until the feature negotiation has completed. During the initial handshake we have more stringent sequence number protection, the changes added by this patch effect that {A,S}W{L,H} are within the correct bounds at the instant that feature negotiation completes (since the SeqWin feature activation handlers call dccp_update_gsr/gss()). A detailed rationale is below -- can be removed from the commit message. 1. Server sequence number checks during initial handshake --------------------------------------------------------- The server can not use the fields of the listening socket for seqno/ackno checks and thus needs to store all relevant information on a per-connection basis on the dccp_request socket. This is a size-constrained structure and has currently only ISS (dreq_iss) and ISR (dreq_isr) defined. Adding further fields (SW{L,H}, AW{L,H}) would increase the size of the struct and it is questionable whether this will have any practical gain. The currently implemented solution is as follows. * receiving first Request: dccp_v{4,6}_conn_request sets ISR := P.seqno, ISS := dccp_v{4,6}_init_sequence() * sending first Response: dccp_v{4,6}_send_response via dccp_make_response() sets P.seqno := ISS, sets P.ackno := ISR * receiving retransmitted Request: dccp_check_req() overrides ISR := P.seqno * answering retransmitted Request: dccp_make_response() sets ISS += 1, otherwise as per first Response * completing the handshake: succeeds in dccp_check_req() for the first Ack where P.ackno == ISS (P.seqno is not tested) * creating child socket: ISS, ISR are copied from the request_sock This solution will succeed whenever the server can receive the Request and the subsequent Ack in succession, without retransmissions. If there is packet loss, the client needs to retransmit until this condition succeeds; it will otherwise eventually give up. Adding further fields to the request_sock could increase the robustness a bit, in that it would make possible to let a reordered Ack (from a retransmitted Response) pass. The argument against such a solution is that if the packet loss is not persistent and an Ack gets through, why not wait for the one answering the original response: if the loss is persistent, it is probably better to not start the connection in the first place. Long story short: the present design (by Arnaldo) is simple and will likely work just as well as a more complicated solution. As a consequence, {A,S}W{L,H} are not needed until the moment the request_sock is cloned into the accept queue. At that stage feature negotiation has completed, so that the values for the local and remote Sequence Window feature (7.5.2) are known, i.e. we are now in a better position to compute {A,S}W{L,H}. 2. Client sequence number checks during initial handshake --------------------------------------------------------- Until entering PARTOPEN the client does not need the adjustments, since it constrains the Ack window to the packet it sent. * sending first Request: dccp_v{4,6}_connect() choose ISS, dccp_connect() then sets GAR := ISS (as per 8.5), dccp_transmit_skb() (with the previous bug fix) sets GSS := ISS, AWL := ISS, AWH := GSS * n-th retransmitted Request (with previous patch): dccp_retransmit_skb() via timer calls dccp_transmit_skb(), which sets GSS := ISS+n and then AWL := ISS, AWH := ISS+n * receiving any Response: dccp_rcv_request_sent_state_process() -- accepts packet if AWL <= P.ackno <= AWH; -- sets GSR = ISR = P.seqno * sending the Ack completing the handshake: dccp_send_ack() calls dccp_transmit_skb(), which sets GSS += 1 and AWL := ISS, AWH := GSS Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> --- net/dccp/dccp.h | 20 ++++++++++++++++++++ net/dccp/input.c | 18 ++++++------------ net/dccp/minisocks.c | 30 +++++++++--------------------- 3 files changed, 35 insertions(+), 33 deletions(-) --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -392,6 +392,23 @@ static inline void dccp_update_gsr(struct sock *sk, u64 seq) dp->dccps_gsr = seq; /* Sequence validity window depends on remote Sequence Window (7.5.1) */ dp->dccps_swl = SUB48(ADD48(dp->dccps_gsr, 1), dp->dccps_r_seq_win / 4); + /* + * Adjust SWL so that it is not below ISR. In contrast to RFC 4340, + * 7.5.1 we perform this check beyond the initial handshake: W/W' are + * always > 32, so for the first W/W' packets in the lifetime of a + * connection we always have to adjust SWL. + * A second reason why we are doing this is that the window depends on + * the feature-remote value of Sequence Window: nothing stops the peer + * from updating this value while we are busy adjusting SWL for the + * first W packets (we would have to count from scratch again then). + * Therefore it is safer to always make sure that the Sequence Window + * is not artificially extended by a peer who grows SWL downwards by + * continually updating the feature-remote Sequence-Window. + * If sequence numbers wrap it is bad luck. But that will take a while + * (48 bit), and this measure prevents Sequence-number attacks. + */ + if (before48(dp->dccps_swl, dp->dccps_isr)) + dp->dccps_swl = dp->dccps_isr; dp->dccps_swh = ADD48(dp->dccps_gsr, (3 * dp->dccps_r_seq_win) / 4); } @@ -402,6 +419,9 @@ static inline void dccp_update_gss(struct sock *sk, u64 seq) dp->dccps_gss = seq; /* Ack validity window depends on local Sequence Window value (7.5.1) */ dp->dccps_awl = SUB48(ADD48(dp->dccps_gss, 1), dp->dccps_l_seq_win); + /* Adjust AWL so that it is not below ISS - see comment above for SWL */ + if (before48(dp->dccps_awl, dp->dccps_iss)) + dp->dccps_awl = dp->dccps_iss; dp->dccps_awh = dp->dccps_gss; } --- a/net/dccp/input.c +++ b/net/dccp/input.c @@ -440,20 +440,14 @@ static int dccp_rcv_request_sent_state_process(struct sock *sk, dp->dccps_syn_rtt = dccp_sample_rtt(sk, 10 * (tstamp - dp->dccps_options_received.dccpor_timestamp_echo)); - dp->dccps_isr = DCCP_SKB_CB(skb)->dccpd_seq; - dccp_update_gsr(sk, dp->dccps_isr); /* - * SWL and AWL are initially adjusted so that they are not less than - * the initial Sequence Numbers received and sent, respectively: - * SWL := max(GSR + 1 - floor(W/4), ISR), - * AWL := max(GSS - W' + 1, ISS). - * These adjustments MUST be applied only at the beginning of the - * connection. - * - * AWL was adjusted in dccp_v4_connect -acme + * Set ISR, GSR from packet. ISS was set in dccp_v{4,6}_connect + * and GSS in dccp_transmit_skb(). Setting AWL/AWH and SWL/SWH + * is done as part of activating the feature values below, since + * these settings depend on the local/remote Sequence Window + * features, which were undefined or not confirmed until now. */ - dccp_set_seqno(&dp->dccps_swl, - max48(dp->dccps_swl, dp->dccps_isr)); + dp->dccps_gsr = dp->dccps_isr = DCCP_SKB_CB(skb)->dccpd_seq; dccp_sync_mss(sk, icsk->icsk_pmtu_cookie); --- a/net/dccp/minisocks.c +++ b/net/dccp/minisocks.c @@ -120,30 +120,18 @@ struct sock *dccp_create_openreq_child(struct sock *sk, * * Choose S.ISS (initial seqno) or set from Init Cookies * Initialize S.GAR := S.ISS - * Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookies - */ - newdp->dccps_gar = newdp->dccps_iss = dreq->dreq_iss; - dccp_update_gss(newsk, dreq->dreq_iss); - - newdp->dccps_isr = dreq->dreq_isr; - dccp_update_gsr(newsk, dreq->dreq_isr); - - /* - * SWL and AWL are initially adjusted so that they are not less than - * the initial Sequence Numbers received and sent, respectively: - * SWL := max(GSR + 1 - floor(W/4), ISR), - * AWL := max(GSS - W' + 1, ISS). - * These adjustments MUST be applied only at the beginning of the - * connection. + * Set S.ISR, S.GSR from packet (or Init Cookies) + * + * Setting AWL/AWH and SWL/SWH happens as part of the feature + * activation below, as these windows all depend on the local + * and remote Sequence Window feature values (7.5.2). */ - dccp_set_seqno(&newdp->dccps_swl, - max48(newdp->dccps_swl, newdp->dccps_isr)); - dccp_set_seqno(&newdp->dccps_awl, - max48(newdp->dccps_awl, newdp->dccps_iss)); + newdp->dccps_gss = newdp->dccps_iss = dreq->dreq_iss; + newdp->dccps_gar = newdp->dccps_iss; + newdp->dccps_gsr = newdp->dccps_isr = dreq->dreq_isr; /* - * Activate features after initialising the sequence numbers since - * CCID initialisation may depend on GSS, ISR, ISS etc. + * Activate features: initialise CCIDs, sequence windows etc. */ if (dccp_feat_activate_values(newsk, &dreq->dreq_featneg)) { /* It is still raw copy of parent, so invalidate ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 5/6] [DCCP]: Merge now-reduced connect_init() function 2008-01-28 10:16 ` [PATCH 4/6] [DCCP]: Fix the adjustments to AWL and SWL Gerrit Renker @ 2008-01-28 10:16 ` Gerrit Renker 2008-01-28 10:16 ` [PATCH 6/6] [DCCP-PROBE]: Reduce noise in output and convert to ktime_t Gerrit Renker 0 siblings, 1 reply; 8+ messages in thread From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw) To: acme; +Cc: dccp, netdev, Gerrit Renker After moving the assignment of GAR/ISS from dccp_connect_init() to dccp_transmit_skb(), the former function becomes very small, so that a merger with dccp_connect() suggested itself. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> --- net/dccp/output.c | 18 +++++------------- 1 files changed, 5 insertions(+), 13 deletions(-) --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -444,8 +444,9 @@ int dccp_send_reset(struct sock *sk, enum dccp_reset_codes code) /* * Do all connect socket setups that can be done AF independent. */ -static inline void dccp_connect_init(struct sock *sk) +int dccp_connect(struct sock *sk) { + struct sk_buff *skb; struct dccp_sock *dp = dccp_sk(sk); struct dst_entry *dst = __sk_dst_get(sk); struct inet_connection_sock *icsk = inet_csk(sk); @@ -455,22 +456,12 @@ static inline void dccp_connect_init(struct sock *sk) dccp_sync_mss(sk, dst_mtu(dst)); - /* Initialise GAR as per 8.5; AWL/AWH are set in dccp_transmit_skb() */ - dp->dccps_gar = dp->dccps_iss; - - icsk->icsk_retransmits = 0; -} - -int dccp_connect(struct sock *sk) -{ - struct sk_buff *skb; - struct inet_connection_sock *icsk = inet_csk(sk); - /* do not connect if feature negotiation setup fails */ if (dccp_feat_finalise_settings(dccp_sk(sk))) return -EPROTO; - dccp_connect_init(sk); + /* Initialise GAR as per 8.5; AWL/AWH are set in dccp_transmit_skb() */ + dp->dccps_gar = dp->dccps_iss; skb = alloc_skb(sk->sk_prot->max_header, sk->sk_allocation); if (unlikely(skb == NULL)) @@ -486,6 +477,7 @@ int dccp_connect(struct sock *sk) DCCP_INC_STATS(DCCP_MIB_ACTIVEOPENS); /* Timer for repeating the REQUEST until an answer. */ + icsk->icsk_retransmits = 0; inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, icsk->icsk_rto, DCCP_RTO_MAX); return 0; ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 6/6] [DCCP-PROBE]: Reduce noise in output and convert to ktime_t 2008-01-28 10:16 ` [PATCH 5/6] [DCCP]: Merge now-reduced connect_init() function Gerrit Renker @ 2008-01-28 10:16 ` Gerrit Renker 0 siblings, 0 replies; 8+ messages in thread From: Gerrit Renker @ 2008-01-28 10:16 UTC (permalink / raw) To: acme; +Cc: dccp, netdev, Gerrit Renker This fixes the problem that dccp_probe output can grow quite large without apparent benefit (many identical data points), creating huge files (up to over one Gigabyte for a few minutes' test run) which are very hard to post-process (in one instance it got so bad that gnuplot ate up all memory plus swap). The cause for the problem is that the kprobe is inserted into dccp_sendmsg, which can be called in a polling-mode (whenever the TX queue is full due to congestion-control issues, EAGAIN is returned). This creates many very similar data points, i.e. the increase of processing time does not increase the quality/information of the probe output. The fix is to attach the probe to a different function -- write_xmit was chosen since it gets called continually (both via userspace and timer); an input-path function would stop sampling as soon as the other end stops sending feedback. For comparison the output file sizes for the same 20 second test run over a lossy link: * before / without patch: 118 Megabytes * after / with patch: 1.2 Megabytes and there was much less noise in the output. To allow backward compatibility with scripts that people use, the now-unused `size' field in the output has been replaced with the CCID identifier. This also serves for future compatibility - support for CCID2 is work in progress (depends on the still unfinished SRTT/RTTVAR updates). While at it, the update to ktime_t was also performed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> --- net/dccp/probe.c | 68 ++++++++++++++++++----------------------------------- 1 files changed, 23 insertions(+), 45 deletions(-) --- a/net/dccp/probe.c +++ b/net/dccp/probe.c @@ -46,77 +46,55 @@ struct { struct kfifo *fifo; spinlock_t lock; wait_queue_head_t wait; - struct timeval tstart; + ktime_t start; } dccpw; -static void printl(const char *fmt, ...) -{ - va_list args; - int len; - struct timeval now; - char tbuf[256]; - - va_start(args, fmt); - do_gettimeofday(&now); - - now.tv_sec -= dccpw.tstart.tv_sec; - now.tv_usec -= dccpw.tstart.tv_usec; - if (now.tv_usec < 0) { - --now.tv_sec; - now.tv_usec += 1000000; - } - - len = sprintf(tbuf, "%lu.%06lu ", - (unsigned long) now.tv_sec, - (unsigned long) now.tv_usec); - len += vscnprintf(tbuf+len, sizeof(tbuf)-len, fmt, args); - va_end(args); - - kfifo_put(dccpw.fifo, tbuf, len); - wake_up(&dccpw.wait); -} - -static int jdccp_sendmsg(struct kiocb *iocb, struct sock *sk, - struct msghdr *msg, size_t size) +static void jdccp_write_xmit(struct sock *sk) { const struct inet_sock *inet = inet_sk(sk); struct ccid3_hc_tx_sock *hctx = NULL; + struct timespec tv; + char tbuf[256]; + int len, ccid = ccid_get_current_id(dccp_sk(sk), false); - if (ccid_get_current_id(dccp_sk(sk), false) == DCCPC_CCID3) + if (ccid == DCCPC_CCID3) hctx = ccid3_hc_tx_sk(sk); - if (port == 0 || ntohs(inet->dport) == port || - ntohs(inet->sport) == port) { - if (hctx) - printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d %d %d %d %u " - "%llu %llu %d\n", + if (!port || ntohs(inet->dport) == port || ntohs(inet->sport) == port) { + + tv = ktime_to_timespec(ktime_sub(ktime_get(), dccpw.start)); + len = sprintf(tbuf, "%lu.%09lu %d.%d.%d.%d:%u %d.%d.%d.%d:%u %d", + (unsigned long) tv.tv_sec, + (unsigned long) tv.tv_nsec, NIPQUAD(inet->saddr), ntohs(inet->sport), - NIPQUAD(inet->daddr), ntohs(inet->dport), size, + NIPQUAD(inet->daddr), ntohs(inet->dport), ccid); + + if (hctx) + len += sprintf(tbuf + len, " %d %d %d %u %llu %llu %d", hctx->ccid3hctx_s, hctx->ccid3hctx_rtt, hctx->ccid3hctx_p, hctx->ccid3hctx_x_calc, hctx->ccid3hctx_x_recv >> 6, hctx->ccid3hctx_x >> 6, hctx->ccid3hctx_t_ipi); - else - printl("%d.%d.%d.%d:%u %d.%d.%d.%d:%u %d\n", - NIPQUAD(inet->saddr), ntohs(inet->sport), - NIPQUAD(inet->daddr), ntohs(inet->dport), size); + + len += sprintf(tbuf + len, "\n"); + kfifo_put(dccpw.fifo, tbuf, len); + wake_up(&dccpw.wait); } jprobe_return(); - return 0; } static struct jprobe dccp_send_probe = { .kp = { - .symbol_name = "dccp_sendmsg", + .symbol_name = "dccp_write_xmit", }, - .entry = jdccp_sendmsg, + .entry = jdccp_write_xmit, }; static int dccpprobe_open(struct inode *inode, struct file *file) { kfifo_reset(dccpw.fifo); - do_gettimeofday(&dccpw.tstart); + dccpw.start = ktime_get(); return 0; } ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated 2008-01-28 10:16 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Gerrit Renker 2008-01-28 10:16 ` [PATCH 4/6] [DCCP]: Fix the adjustments to AWL and SWL Gerrit Renker @ 2008-01-28 17:59 ` Ian McDonald 1 sibling, 0 replies; 8+ messages in thread From: Ian McDonald @ 2008-01-28 17:59 UTC (permalink / raw) To: Gerrit Renker; +Cc: acme, dccp, netdev On Jan 28, 2008 11:16 PM, Gerrit Renker <gerrit@erg.abdn.ac.uk> wrote: > This patch was triggered by finding the following message in the syslog: > "kernel: dccp_check_seqno: DCCP: Step 6 failed for DATAACK packet, [...] > P.ackno exists or LAWL(82947089) <= P.ackno(82948208) > <= S.AWH(82948728), sending SYNC..." > > Note the difference between AWH and AWL: it is 1639 packets (while Sequence > Window was actually at 100). A closer look at the trace showed that > LAWL = AWL = 82947089 equalled the ISS on the Response. > > The cause of the bug was that AWL was only ever set on the first packet - the > DCCP-Request sent by dccp_v{4,6}_connect(). > > The fix is to continually update AWL/AWH with each new packet (as GSS=AWH). > > In addition, AWL/AWH are now updated to enforce more stringent checks on the > initial sequence numbers when connecting: > * AWL is initialised to ISS and remains at this value; > * AWH is always set to GSS (via dccp_update_gss()); > * so on the first Request: AWL = AWH = ISS, > and on the n-th Request: AWL = ISS, AWH = ISS+n. > > As a consequence, only Response packets that refer to Requests sent by this > host will pass, all others are discarded. This is the intention and in effect > implements the initial adjustments for AWL as specified in RFC 4340, 7.5.1. > > Note: A problem that remains is that ISS can potentially be under-run even after > the initial handshake; this is addressed a subsequent patch. > > Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Yes I had seen this and had worked out that variables weren't being updated as they should be but hadn't got as far as a fix before I stopped my coding days so much :-( Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-01-28 17:59 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-01-28 10:16 [DCCP] [Patch 0/6] [BUG-Fix]: Fixes for CCID3, seqnos, and dccp_probe Gerrit Renker 2008-01-28 10:16 ` [PATCH 1/6] [CCID3]: Function returned wrong value Gerrit Renker 2008-01-28 10:16 ` [PATCH 2/6] [DCCP]: Bug in initial acknowledgment number assignment Gerrit Renker 2008-01-28 10:16 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Gerrit Renker 2008-01-28 10:16 ` [PATCH 4/6] [DCCP]: Fix the adjustments to AWL and SWL Gerrit Renker 2008-01-28 10:16 ` [PATCH 5/6] [DCCP]: Merge now-reduced connect_init() function Gerrit Renker 2008-01-28 10:16 ` [PATCH 6/6] [DCCP-PROBE]: Reduce noise in output and convert to ktime_t Gerrit Renker 2008-01-28 17:59 ` [PATCH 3/6] [DCCP]: Bug-Fix - AWL was never updated Ian McDonald
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox