* Re: [PATCH] sctp: Fix broken RTO-doubling for data retransmits
2009-02-19 17:11 [PATCH] sctp: Fix broken RTO-doubling for data retransmits Vlad Yasevich
@ 2009-02-20 1:18 ` Wei Yongjun
2009-02-20 2:39 ` Vlad Yasevich
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Wei Yongjun @ 2009-02-20 1:18 UTC (permalink / raw)
To: linux-sctp
Vlad Yasevich wrote:
> Commit faee47cdbfe8d74a1573c2f81ea6dbb08d735be6
> (sctp: Fix the RTO-doubling on idle-link heartbeats)
> broke the RTO doubling for data retransmits. If the
> heartbeat was sent before the data T3-rtx time, the
> the RTO will not double upon the T3-rtx expiration.
> Distingish between the operations by passing an argument
> to the function.
>
Hi Vlad
I think we should talk about what the SCTP_CMD_TRANSPORT_RESET and
SCTP_CMD_HB_TIMER_UPDATE command do.
SCTP_CMD_TRANSPORT_RESET
--> sctp_cmd_transport_reset()
--> sctp_transport_lower_cwnd()
... transport->cwnd = XXX if timer expire
... partial_bytes_acked = 0;
--> sctp_do_8_2_transport_strike()
... asoc->overall_error_count++;
... transport->rto = XXX if not send HB
SCTP_CMD_HB_TIMER_UPDATE
--> sctp_cmd_hb_timer_update()
... mod_timer(&t->hb_timer, sctp_transport_timeout(t))
In sctp_sf_do_prm_requestheartbeat(), it used
SCTP_CMD_TRANSPORT_RESET command, not SCTP_CMD_HB_TIMER_UPDATE,
so if user request to send HB, the partial_bytes_acked will be clear,
and the overall_error_count be increased. And the heartbeat primitive
may not send HB, but only mark the transport state to SCTP_INACTIVE.
We user request to send HB immediately:
- the HB timer will be update, so HB will expires after
(HB_INTERVAL + RTO +/- 50% * RTO).
- the overall_error_count should not be increased. Because this is not
a idle-link heartbeat.
- the cwnd and partial_bytes_acked should not be changed.
Is this correct?
And after the patch I sent, used SCTP_CMD_HB_TIMER_UPDATE instead of
SCTP_CMD_TRANSPORT_RESET command. The action of idle-link heartbeat
and user initiated heartbeat are like this.
idle-link heartbeat:
Endpoint A Endpoint B
|
| idle
|
HEARTBEAT ----------->
| lower cwnd
| overall_error_count++
| RTO = RTO * 2
| expires HB_INTERVAL + RTO +/- 50% * RTO
HEARTBEAT ----------->
| lower cwnd
| overall_error_count++
| RTO = RTO * 2
| expires HB_INTERVAL + RTO +/- 50% * RTO
user initiated heartbeat
Endpoint A Endpoint B
|
|
|
HEARTBEAT -----------> (*) first HB
|
| reset hb_timer, not update RTO
|
HEARTBEAT ----------->
| lower cwnd
| overall_error_count++
| RTO = RTO * 2
| expires HB_INTERVAL + RTO +/- 50% * RTO
The different between two is that only reset the HB timer, the
other increased overall_error_count and lower cwnd. I agree with
reset the HB timer.
Regards.
> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
> ---
> net/sctp/sm_sideeffect.c | 10 ++++++----
> 1 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
> index 0146cfb..4ac6163 100644
> --- a/net/sctp/sm_sideeffect.c
> +++ b/net/sctp/sm_sideeffect.c
> @@ -434,7 +434,8 @@ sctp_timer_event_t *sctp_timer_events[SCTP_NUM_TIMEOUT_TYPES] = {
> *
> */
> static void sctp_do_8_2_transport_strike(struct sctp_association *asoc,
> - struct sctp_transport *transport)
> + struct sctp_transport *transport,
> + int is_hb)
> {
> /* The check for association's overall error counter exceeding the
> * threshold is done in the state function.
> @@ -466,7 +467,7 @@ static void sctp_do_8_2_transport_strike(struct sctp_association *asoc,
> * The first unacknowleged HB triggers it. We do this with a flag
> * that indicates that we have an outstanding HB.
> */
> - if (transport->hb_sent) {
> + if (!is_hb || transport->hb_sent) {
> transport->last_rto = transport->rto;
> transport->rto = min((transport->rto * 2), transport->asoc->rto_max);
> }
> @@ -667,7 +668,7 @@ static void sctp_cmd_transport_reset(sctp_cmd_seq_t *cmds,
> sctp_transport_lower_cwnd(t, SCTP_LOWER_CWND_INACTIVE);
>
> /* Mark one strike against a transport. */
> - sctp_do_8_2_transport_strike(asoc, t);
> + sctp_do_8_2_transport_strike(asoc, t, 1);
>
> t->hb_sent = 1;
> }
> @@ -1459,7 +1460,8 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
>
> case SCTP_CMD_STRIKE:
> /* Mark one strike against a transport. */
> - sctp_do_8_2_transport_strike(asoc, cmd->obj.transport);
> + sctp_do_8_2_transport_strike(asoc, cmd->obj.transport,
> + 0);
> break;
>
> case SCTP_CMD_TRANSPORT_RESET:
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] sctp: Fix broken RTO-doubling for data retransmits
2009-02-19 17:11 [PATCH] sctp: Fix broken RTO-doubling for data retransmits Vlad Yasevich
2009-02-20 1:18 ` Wei Yongjun
@ 2009-02-20 2:39 ` Vlad Yasevich
2009-02-20 2:43 ` Vlad Yasevich
2009-02-23 19:15 ` Vlad Yasevich
3 siblings, 0 replies; 5+ messages in thread
From: Vlad Yasevich @ 2009-02-20 2:39 UTC (permalink / raw)
To: linux-sctp
Wei Yongjun wrote:
> Vlad Yasevich wrote:
>> Commit faee47cdbfe8d74a1573c2f81ea6dbb08d735be6
>> (sctp: Fix the RTO-doubling on idle-link heartbeats)
>> broke the RTO doubling for data retransmits. If the
>> heartbeat was sent before the data T3-rtx time, the
>> the RTO will not double upon the T3-rtx expiration.
>> Distingish between the operations by passing an argument
>> to the function.
>>
>
> Hi Vlad
>
> I think we should talk about what the SCTP_CMD_TRANSPORT_RESET and
> SCTP_CMD_HB_TIMER_UPDATE command do.
>
> SCTP_CMD_TRANSPORT_RESET
> --> sctp_cmd_transport_reset()
> --> sctp_transport_lower_cwnd()
> ... transport->cwnd = XXX if timer expire
> ... partial_bytes_acked = 0;
> --> sctp_do_8_2_transport_strike()
> ... asoc->overall_error_count++;
> ... transport->rto = XXX if not send HB
>
> SCTP_CMD_HB_TIMER_UPDATE
> --> sctp_cmd_hb_timer_update()
> ... mod_timer(&t->hb_timer, sctp_transport_timeout(t))
>
> In sctp_sf_do_prm_requestheartbeat(), it used
> SCTP_CMD_TRANSPORT_RESET command, not SCTP_CMD_HB_TIMER_UPDATE,
> so if user request to send HB, the partial_bytes_acked will be clear,
> and the overall_error_count be increased.
I am concerned about partial_bytes_acked, but the error count should be
increased.
> And the heartbeat primitive
> may not send HB, but only mark the transport state to SCTP_INACTIVE.
How? Once you reach sctp_sf_do_prm_requestheartbeat, you will either
send a HB, or error out with NOMEM. The transport reset happens after
the HB is sent.
>
> We user request to send HB immediately:
> - the HB timer will be update, so HB will expires after
> (HB_INTERVAL + RTO +/- 50% * RTO).
> - the overall_error_count should not be increased. Because this is not
> a idle-link heartbeat.
No, the error count should still increase.
> - the cwnd and partial_bytes_acked should not be changed.
This is true. Because the link is not idle, these 2 things shouldn't happen.
Looks like there is more work to do here.
-vlad
>
> Is this correct?
>
> And after the patch I sent, used SCTP_CMD_HB_TIMER_UPDATE instead of
> SCTP_CMD_TRANSPORT_RESET command. The action of idle-link heartbeat
> and user initiated heartbeat are like this.
>
> idle-link heartbeat:
>
> Endpoint A Endpoint B
> |
> | idle
> |
> HEARTBEAT ----------->
> | lower cwnd
> | overall_error_count++
> | RTO = RTO * 2
> | expires HB_INTERVAL + RTO +/- 50% * RTO
> HEARTBEAT ----------->
> | lower cwnd
> | overall_error_count++
> | RTO = RTO * 2
> | expires HB_INTERVAL + RTO +/- 50% * RTO
>
> user initiated heartbeat
>
> Endpoint A Endpoint B
> |
> |
> |
> HEARTBEAT -----------> (*) first HB
> |
> | reset hb_timer, not update RTO
> |
> HEARTBEAT ----------->
> | lower cwnd
> | overall_error_count++
> | RTO = RTO * 2
> | expires HB_INTERVAL + RTO +/- 50% * RTO
>
>
> The different between two is that only reset the HB timer, the
> other increased overall_error_count and lower cwnd. I agree with
> reset the HB timer.
>
> Regards.
>
>> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
>> ---
>> net/sctp/sm_sideeffect.c | 10 ++++++----
>> 1 files changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
>> index 0146cfb..4ac6163 100644
>> --- a/net/sctp/sm_sideeffect.c
>> +++ b/net/sctp/sm_sideeffect.c
>> @@ -434,7 +434,8 @@ sctp_timer_event_t *sctp_timer_events[SCTP_NUM_TIMEOUT_TYPES] = {
>> *
>> */
>> static void sctp_do_8_2_transport_strike(struct sctp_association *asoc,
>> - struct sctp_transport *transport)
>> + struct sctp_transport *transport,
>> + int is_hb)
>> {
>> /* The check for association's overall error counter exceeding the
>> * threshold is done in the state function.
>> @@ -466,7 +467,7 @@ static void sctp_do_8_2_transport_strike(struct sctp_association *asoc,
>> * The first unacknowleged HB triggers it. We do this with a flag
>> * that indicates that we have an outstanding HB.
>> */
>> - if (transport->hb_sent) {
>> + if (!is_hb || transport->hb_sent) {
>> transport->last_rto = transport->rto;
>> transport->rto = min((transport->rto * 2), transport->asoc->rto_max);
>> }
>> @@ -667,7 +668,7 @@ static void sctp_cmd_transport_reset(sctp_cmd_seq_t *cmds,
>> sctp_transport_lower_cwnd(t, SCTP_LOWER_CWND_INACTIVE);
>>
>> /* Mark one strike against a transport. */
>> - sctp_do_8_2_transport_strike(asoc, t);
>> + sctp_do_8_2_transport_strike(asoc, t, 1);
>>
>> t->hb_sent = 1;
>> }
>> @@ -1459,7 +1460,8 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
>>
>> case SCTP_CMD_STRIKE:
>> /* Mark one strike against a transport. */
>> - sctp_do_8_2_transport_strike(asoc, cmd->obj.transport);
>> + sctp_do_8_2_transport_strike(asoc, cmd->obj.transport,
>> + 0);
>> break;
>>
>> case SCTP_CMD_TRANSPORT_RESET:
>>
>
>
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] sctp: Fix broken RTO-doubling for data retransmits
2009-02-19 17:11 [PATCH] sctp: Fix broken RTO-doubling for data retransmits Vlad Yasevich
2009-02-20 1:18 ` Wei Yongjun
2009-02-20 2:39 ` Vlad Yasevich
@ 2009-02-20 2:43 ` Vlad Yasevich
2009-02-23 19:15 ` Vlad Yasevich
3 siblings, 0 replies; 5+ messages in thread
From: Vlad Yasevich @ 2009-02-20 2:43 UTC (permalink / raw)
To: linux-sctp
Vlad Yasevich wrote:
> Wei Yongjun wrote:
>> Vlad Yasevich wrote:
>>> Commit faee47cdbfe8d74a1573c2f81ea6dbb08d735be6
>>> (sctp: Fix the RTO-doubling on idle-link heartbeats)
>>> broke the RTO doubling for data retransmits. If the
>>> heartbeat was sent before the data T3-rtx time, the
>>> the RTO will not double upon the T3-rtx expiration.
>>> Distingish between the operations by passing an argument
>>> to the function.
>>>
>> Hi Vlad
>>
>> I think we should talk about what the SCTP_CMD_TRANSPORT_RESET and
>> SCTP_CMD_HB_TIMER_UPDATE command do.
>>
>> SCTP_CMD_TRANSPORT_RESET
>> --> sctp_cmd_transport_reset()
>> --> sctp_transport_lower_cwnd()
>> ... transport->cwnd = XXX if timer expire
>> ... partial_bytes_acked = 0;
>> --> sctp_do_8_2_transport_strike()
>> ... asoc->overall_error_count++;
>> ... transport->rto = XXX if not send HB
>>
>> SCTP_CMD_HB_TIMER_UPDATE
>> --> sctp_cmd_hb_timer_update()
>> ... mod_timer(&t->hb_timer, sctp_transport_timeout(t))
>>
>> In sctp_sf_do_prm_requestheartbeat(), it used
>> SCTP_CMD_TRANSPORT_RESET command, not SCTP_CMD_HB_TIMER_UPDATE,
>> so if user request to send HB, the partial_bytes_acked will be clear,
>> and the overall_error_count be increased.
>
> I am concerned about partial_bytes_acked, but the error count should be
> increased.
>
>> And the heartbeat primitive
>> may not send HB, but only mark the transport state to SCTP_INACTIVE.
>
> How? Once you reach sctp_sf_do_prm_requestheartbeat, you will either
> send a HB, or error out with NOMEM. The transport reset happens after
> the HB is sent.
>
>> We user request to send HB immediately:
>> - the HB timer will be update, so HB will expires after
>> (HB_INTERVAL + RTO +/- 50% * RTO).
>> - the overall_error_count should not be increased. Because this is not
>> a idle-link heartbeat.
>
> No, the error count should still increase.
>
>> - the cwnd and partial_bytes_acked should not be changed.
>
> This is true. Because the link is not idle, these 2 things shouldn't happen.
>
> Looks like there is more work to do here.
>
> -vlad
>
>> Is this correct?
Oh, forgot to address the state bellow.
>>
>> And after the patch I sent, used SCTP_CMD_HB_TIMER_UPDATE instead of
>> SCTP_CMD_TRANSPORT_RESET command. The action of idle-link heartbeat
>> and user initiated heartbeat are like this.
>>
>> idle-link heartbeat:
>>
>> Endpoint A Endpoint B
>> |
>> | idle
>> |
>> HEARTBEAT ----------->
>> | lower cwnd
>> | overall_error_count++
>> | RTO = RTO * 2
no, in this case, RTO should not be doubled, since we did not have an
outstanding HB that was not acknowleged.
>> | expires HB_INTERVAL + RTO +/- 50% * RTO
>> HEARTBEAT ----------->
>> | lower cwnd
>> | overall_error_count++
>> | RTO = RTO * 2
>> | expires HB_INTERVAL + RTO +/- 50% * RTO
In this case, RTO doubles correctly.
>>
>> user initiated heartbeat
>>
>> Endpoint A Endpoint B
>> |
>> |
>> |
>> HEARTBEAT -----------> (*) first HB
>> |
>> | reset hb_timer, not update RTO
We also need to update the error count.
>> |
>> HEARTBEAT ----------->
>> | lower cwnd
>> | overall_error_count++
>> | RTO = RTO * 2
>> | expires HB_INTERVAL + RTO +/- 50% * RTO
Yes, RTO doubles.
>>
>>
>> The different between two is that only reset the HB timer, the
>> other increased overall_error_count and lower cwnd. I agree with
>> reset the HB timer.
>>
It also needs to count the HB towards the error count. They have count
the same way as if there was a timeout prior to the first HB sent.
-vlad
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] sctp: Fix broken RTO-doubling for data retransmits
2009-02-19 17:11 [PATCH] sctp: Fix broken RTO-doubling for data retransmits Vlad Yasevich
` (2 preceding siblings ...)
2009-02-20 2:43 ` Vlad Yasevich
@ 2009-02-23 19:15 ` Vlad Yasevich
3 siblings, 0 replies; 5+ messages in thread
From: Vlad Yasevich @ 2009-02-23 19:15 UTC (permalink / raw)
To: linux-sctp
Commit faee47cdbfe8d74a1573c2f81ea6dbb08d735be6
(sctp: Fix the RTO-doubling on idle-link heartbeats)
broke the RTO doubling for data retransmits. If the
heartbeat was sent before the data T3-rtx time, the
the RTO will not double upon the T3-rtx expiration.
Distingish between the operations by passing an argument
to the function.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
---
include/net/sctp/command.h | 3 ++-
net/sctp/sm_sideeffect.c | 32 +++++++++++++-------------------
net/sctp/sm_statefuns.c | 6 ++++--
3 files changed, 19 insertions(+), 22 deletions(-)
diff --git a/include/net/sctp/command.h b/include/net/sctp/command.h
index 88988ab..3b96680 100644
--- a/include/net/sctp/command.h
+++ b/include/net/sctp/command.h
@@ -77,7 +77,8 @@ typedef enum {
SCTP_CMD_HB_TIMERS_START, /* Start the heartbeat timers. */
SCTP_CMD_HB_TIMER_UPDATE, /* Update a heartbeat timers. */
SCTP_CMD_HB_TIMERS_STOP, /* Stop the heartbeat timers. */
- SCTP_CMD_TRANSPORT_RESET, /* Reset the status of a transport. */
+ SCTP_CMD_TRANSPORT_HB_SENT, /* Reset the status of a transport. */
+ SCTP_CMD_TRANSPORT_IDLE, /* Do manipulations on idle transport */
SCTP_CMD_TRANSPORT_ON, /* Mark the transport as active. */
SCTP_CMD_REPORT_ERROR, /* Pass this error back out of the sm. */
SCTP_CMD_REPORT_BAD_TAG, /* Verification tags didn't match. */
diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 0146cfb..5385150 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -434,7 +434,8 @@ sctp_timer_event_t *sctp_timer_events[SCTP_NUM_TIMEOUT_TYPES] = {
*
*/
static void sctp_do_8_2_transport_strike(struct sctp_association *asoc,
- struct sctp_transport *transport)
+ struct sctp_transport *transport,
+ int is_hb)
{
/* The check for association's overall error counter exceeding the
* threshold is done in the state function.
@@ -466,7 +467,7 @@ static void sctp_do_8_2_transport_strike(struct sctp_association *asoc,
* The first unacknowleged HB triggers it. We do this with a flag
* that indicates that we have an outstanding HB.
*/
- if (transport->hb_sent) {
+ if (!is_hb || transport->hb_sent) {
transport->last_rto = transport->rto;
transport->rto = min((transport->rto * 2), transport->asoc->rto_max);
}
@@ -657,20 +658,6 @@ static void sctp_cmd_transport_on(sctp_cmd_seq_t *cmds,
sctp_transport_hold(t);
}
-/* Helper function to do a transport reset at the expiry of the hearbeat
- * timer.
- */
-static void sctp_cmd_transport_reset(sctp_cmd_seq_t *cmds,
- struct sctp_association *asoc,
- struct sctp_transport *t)
-{
- sctp_transport_lower_cwnd(t, SCTP_LOWER_CWND_INACTIVE);
-
- /* Mark one strike against a transport. */
- sctp_do_8_2_transport_strike(asoc, t);
-
- t->hb_sent = 1;
-}
/* Helper function to process the process SACK command. */
static int sctp_cmd_process_sack(sctp_cmd_seq_t *cmds,
@@ -1459,12 +1446,19 @@ static int sctp_cmd_interpreter(sctp_event_t event_type,
case SCTP_CMD_STRIKE:
/* Mark one strike against a transport. */
- sctp_do_8_2_transport_strike(asoc, cmd->obj.transport);
+ sctp_do_8_2_transport_strike(asoc, cmd->obj.transport,
+ 0);
+ break;
+
+ case SCTP_CMD_TRANSPORT_IDLE:
+ t = cmd->obj.transport;
+ sctp_transport_lower_cwnd(t, SCTP_LOWER_CWND_INACTIVE);
break;
- case SCTP_CMD_TRANSPORT_RESET:
+ case SCTP_CMD_TRANSPORT_HB_SENT:
t = cmd->obj.transport;
- sctp_cmd_transport_reset(commands, asoc, t);
+ sctp_do_8_2_transport_strike(asoc, t, 1);
+ t->hb_sent = 1;
break;
case SCTP_CMD_TRANSPORT_ON:
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index 3a0cd07..a907bab 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -988,7 +988,9 @@ sctp_disposition_t sctp_sf_sendbeat_8_3(const struct sctp_endpoint *ep,
/* Set transport error counter and association error counter
* when sending heartbeat.
*/
- sctp_add_cmd_sf(commands, SCTP_CMD_TRANSPORT_RESET,
+ sctp_add_cmd_sf(commands, SCTP_CMD_TRANSPORT_IDLE,
+ SCTP_TRANSPORT(transport));
+ sctp_add_cmd_sf(commands, SCTP_CMD_TRANSPORT_HB_SENT,
SCTP_TRANSPORT(transport));
}
sctp_add_cmd_sf(commands, SCTP_CMD_HB_TIMER_UPDATE,
@@ -4967,7 +4969,7 @@ sctp_disposition_t sctp_sf_do_prm_requestheartbeat(
* to that address and not acknowledged within one RTO.
*
*/
- sctp_add_cmd_sf(commands, SCTP_CMD_TRANSPORT_RESET,
+ sctp_add_cmd_sf(commands, SCTP_CMD_TRANSPORT_HB_SENT,
SCTP_TRANSPORT(arg));
return SCTP_DISPOSITION_CONSUME;
}
--
1.5.4.3
^ permalink raw reply related [flat|nested] 5+ messages in thread