* NAK: [PATCH 1/1] DCCP: Fix up t_nom
@ 2007-01-08 11:28 Gerrit Renker
2007-01-09 22:55 ` Ian McDonald
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Gerrit Renker @ 2007-01-08 11:28 UTC (permalink / raw)
To: dccp
Due to the following reasons:
1) The race condition is not tackled. As before, you allow several routines running
asynchronously in software interrupts (timer interrupt, net backlog softirq) and
in user context (dccp_sendmsg) to write/read on t_nom. Without locking, you will
have the race condition that one routine performs a write (e.g. sub_usecs) while
another performs a read and sends the packet to early.
2) It introduces new problems / BUGs that did not previously exist - please see inline.
3) By removing t_ipi you are now recalculating it each time a packet is sent - which is
far more expensive then only recomputing it when related data changes. This will
indeed contribute to slow-down.
I am not in support of this patch. This patch is against _working_ code. Instead of solving
problems, it introduces new ones.
Can I suggest that we spent our time more profitably on fixing the many bugs that exist?
Fix broken, not working code.
| We are calculating t_nom in the past we have made it far too complicated
| and not matching the RFC.
What made you come to this conclusion?
| @@ -275,11 +266,10 @@ static void ccid3_hc_tx_no_feedback_timer(unsigned long data)
| ccid3_hc_tx_update_x(sk, &now);
| }
| /*
| - * Schedule no feedback timer to expire in
| - * max(t_RTO, 2 * s/X) = max(t_RTO, 2 * t_ipi)
| + * Schedule no feedback timer to expire in t_RTO
| * See comments in packet_recv() regarding the value of t_RTO.
| */
| - t_nfb = max(hctx->ccid3hctx_t_rto, 2 * hctx->ccid3hctx_t_ipi);
| + t_nfb = hctx->ccid3hctx_t_rto;
| break;
| case TFRC_SSTATE_NO_SENT:
| DCCP_BUG("%s(%p) - Illegal state NO_SENT", dccp_role(sk), sk);
| @@ -337,13 +327,13 @@ static int ccid3_hc_tx_send_packet(struct sock *sk, struct sk_buff *skb)
| hctx->ccid3hctx_x = hctx->ccid3hctx_s;
| hctx->ccid3hctx_x <<= 6;
|
| - /* First timeout, according to [RFC 3448, 4.2], is 1 second */
| - hctx->ccid3hctx_t_ipi = USEC_PER_SEC;
| /* Initial delta: minimum of 0.5 sec and t_gran/2 */
| hctx->ccid3hctx_delta = TFRC_OPSYS_HALF_TIME_GRAN;
|
| /* Set t_0 for initial packet */
| + /* First timeout, according to [RFC 3448, 4.2], is 1 second */
| hctx->ccid3hctx_t_nom = now;
| + timeval_add_usecs(&hctx->ccid3hctx_t_nom, USEC_PER_SEC);
BUG: The first packet will be delayed for 1 second instead of being sent immediately as specified
in RFC 3448, 4.6.
| break;
| case TFRC_SSTATE_NO_FBACK:
| case TFRC_SSTATE_FBACK:
| @@ -361,6 +351,7 @@ static int ccid3_hc_tx_send_packet(struct sock *sk, struct sk_buff *skb)
| return delay / 1000L;
|
| ccid3_hc_tx_update_win_count(hctx, &now);
| + ccid3_update_send_time(hctx);
Recalculating t_ipi, t_delta, and t_nom each time a packet is sent - very expensive.
| @@ -486,6 +474,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
| hctx->ccid3hctx_x = scaled_div(w_init << 6, r_sample);
| hctx->ccid3hctx_t_ld = now;
|
| + timeval_sub_usecs(&hctx->ccid3hctx_t_nom, USEC_PER_SEC);
| ccid3_update_send_time(hctx);
|
| ccid3_pr_debug("%s(%p), s=%u, MSS=%u, w_init=%u, "
Why this complicated code - it was simpler before. Now you first add 1 second, then send the packet
immediately and then you subtract again.
| @@ -539,11 +528,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
| hctx->ccid3hctx_t_rto = max_t(u32, 4 * hctx->ccid3hctx_rtt,
| CONFIG_IP_DCCP_CCID3_RTO *
| (USEC_PER_SEC/1000));
| - /*
| - * Schedule no feedback timer to expire in
| - * max(t_RTO, 2 * s/X) = max(t_RTO, 2 * t_ipi)
| - */
| - t_nfb = max(hctx->ccid3hctx_t_rto, 2 * hctx->ccid3hctx_t_ipi);
| + t_nfb = hctx->ccid3hctx_t_rto;
|
| ccid3_pr_debug("%s(%p), Scheduled no feedback timer to "
| "expire in %lu jiffies (%luus)\n",
BUG: This removes conformance with RFC 3448, section 4.3 and section 4.4.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: NAK: [PATCH 1/1] DCCP: Fix up t_nom
2007-01-08 11:28 NAK: [PATCH 1/1] DCCP: Fix up t_nom Gerrit Renker
@ 2007-01-09 22:55 ` Ian McDonald
2007-01-10 10:13 ` Gerrit Renker
2007-01-13 4:54 ` Ian McDonald
2 siblings, 0 replies; 4+ messages in thread
From: Ian McDonald @ 2007-01-09 22:55 UTC (permalink / raw)
To: dccp
Gerrit,
Comments inline.
On 09/01/07, Gerrit Renker <gerrit@erg.abdn.ac.uk> wrote:
> Due to the following reasons:
> 1) The race condition is not tackled. As before, you allow several routines running
> asynchronously in software interrupts (timer interrupt, net backlog softirq) and
> in user context (dccp_sendmsg) to write/read on t_nom. Without locking, you will
> have the race condition that one routine performs a write (e.g. sub_usecs) while
> another performs a read and sends the packet to early.
I don't believe so. For a start now I'm only altering t_nom in one
place except for the initial setup which I'll discuss later. And in
addition I'm only doing additions apart from initial setup. As such I
believe mine is just as safe or more so. I have not observed any
problems in testing and the only way either of our patches could be
more robust is with locking. However I don't think this is needed from
my observation and logging.
> 2) It introduces new problems / BUGs that did not previously exist - please see inline.
Discussed inline.
> 3) By removing t_ipi you are now recalculating it each time a packet is sent - which is
> far more expensive then only recomputing it when related data changes. This will
> indeed contribute to slow-down.
>
Re-read section 4.6 of RFC3448. We are required to use s and X_inst
and the current code base does not update these every packet we
transmit. It is simpler to do it the way specified in the RFC. My
patch removes some of these comparisons so not all bad.
> I am not in support of this patch. This patch is against _working_ code. Instead of solving
> problems, it introduces new ones.
>
> Can I suggest that we spent our time more profitably on fixing the many bugs that exist?
> Fix broken, not working code.
>
I am fixing broken code. I don't think either of our initial solutions
were correct.
>
> | We are calculating t_nom in the past we have made it far too complicated
> | and not matching the RFC.
> What made you come to this conclusion?
>
Reading the RFC as per above.
> | hctx->ccid3hctx_t_nom = now;
> | + timeval_add_usecs(&hctx->ccid3hctx_t_nom, USEC_PER_SEC);
> BUG: The first packet will be delayed for 1 second instead of being sent immediately as specified
> in RFC 3448, 4.6.
>
No you are reading my code incorrectly. It breaks out of the switch
and sends immediately as returns 0.
> | break;
> | case TFRC_SSTATE_NO_FBACK:
> | case TFRC_SSTATE_FBACK:
> | @@ -361,6 +351,7 @@ static int ccid3_hc_tx_send_packet(struct sock *sk, struct sk_buff *skb)
> | return delay / 1000L;
> |
> | ccid3_hc_tx_update_win_count(hctx, &now);
> | + ccid3_update_send_time(hctx);
> Recalculating t_ipi, t_delta, and t_nom each time a packet is sent - very expensive.
>
It is slightly more expensive but we are removing other cases
>
>
> | @@ -486,6 +474,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
> | hctx->ccid3hctx_x = scaled_div(w_init << 6, r_sample);
> | hctx->ccid3hctx_t_ld = now;
> |
> | + timeval_sub_usecs(&hctx->ccid3hctx_t_nom, USEC_PER_SEC);
> | ccid3_update_send_time(hctx);
> Why this complicated code - it was simpler before. Now you first add 1 second, then send the packet
> immediately and then you subtract again.
>
Overall I think this is much simpler. Before we were adding and
subtracting with each change but now we are doing only on the first
feedback (not immediately as you say)
> | @@ -539,11 +528,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
> | - /*
> | - * Schedule no feedback timer to expire in
> | - * max(t_RTO, 2 * s/X) = max(t_RTO, 2 * t_ipi)
> | - */
> | - t_nfb = max(hctx->ccid3hctx_t_rto, 2 * hctx->ccid3hctx_t_ipi);
> | + t_nfb = hctx->ccid3hctx_t_rto;
> |
> BUG: This removes conformance with RFC 3448, section 4.3 and section 4.4.
>
I think I see what you are saying but the code is/was wrong here.
Section 4.3 and 4.4 talks about t_mbi, not t_ipi. t_mbi is a constant
of 64 seconds. So the code was wrong, I've removed it and it's still
wrong but in a different way.
Can I ask one favour? Why we still carry on debating which approach is
better can you move the refactor patch one up in the sequence so
Arnaldo can merge a set of undisputed patches? We'll get this one
sorted I know but we may as well get merged the ones we both agree on
just fine!
Ian
--
Web: http://wand.net.nz/~iam4
Blog: http://iansblog.jandi.co.nz
WAND Network Research Group
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: NAK: [PATCH 1/1] DCCP: Fix up t_nom
2007-01-08 11:28 NAK: [PATCH 1/1] DCCP: Fix up t_nom Gerrit Renker
2007-01-09 22:55 ` Ian McDonald
@ 2007-01-10 10:13 ` Gerrit Renker
2007-01-13 4:54 ` Ian McDonald
2 siblings, 0 replies; 4+ messages in thread
From: Gerrit Renker @ 2007-01-10 10:13 UTC (permalink / raw)
To: dccp
Hello,
below I have corrected where you pointed out that I was wrong.
However, I think we are both barking up the wrong tree here, i.e. I'd say that neither your or my patch is
`better', but rather: the evil lies somewhere else. So no matter which patch lands in Arnaldos tree, the main
problem will be unresolved irrespectively.
Both patches, when we put aside other quibbles here, are in effect a variation of the same theme.
| > 1) The race condition is not tackled. As before, you allow several routines running
| > asynchronously in software interrupts (timer interrupt, net backlog softirq) and
| > in user context (dccp_sendmsg) to write/read on t_nom. Without locking, you will
| > have the race condition that one routine performs a write (e.g. sub_usecs) while
| > another performs a read and sends the packet to early.
|
| I don't believe so. For a start now I'm only altering t_nom in one
| place except for the initial setup which I'll discuss later. And in
| addition I'm only doing additions apart from initial setup. As such I
| believe mine is just as safe or more so.
Went through the patch again and have to agree: there is only the subtraction in rx_packet_recv,
which would be debatable. So your patch also, by and large, tries to avoid the race condition
by monopolising the access to t_nom. Sorry I didn't realize earlier.
| > I am not in support of this patch. This patch is against _working_ code. Instead of solving
| > problems, it introduces new ones.
| >
| > Can I suggest that we spent our time more profitably on fixing the many bugs that exist?
| > Fix broken, not working code.
| >
| I am fixing broken code. I don't think either of our initial solutions
| were correct.
I agree but my point is that we are wasting our energy on a discussion of patches which, by and
large, effect the same: before you sent this email, I had consulted with colleagues about this
and will send the notes in a follow-up email.
| > | hctx->ccid3hctx_t_nom = now;
| > | + timeval_add_usecs(&hctx->ccid3hctx_t_nom, USEC_PER_SEC);
| > BUG: The first packet will be delayed for 1 second instead of being sent immediately as specified
| > in RFC 3448, 4.6.
| >
| No you are reading my code incorrectly. It breaks out of the switch
| and sends immediately as returns 0.
You are right and I had meant to point this out in a follow-up: the code above has the same effect as
before, there is no delay. Sorry.
| > | @@ -486,6 +474,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
| > | hctx->ccid3hctx_x = scaled_div(w_init << 6, r_sample);
| > | hctx->ccid3hctx_t_ld = now;
| > |
| > | + timeval_sub_usecs(&hctx->ccid3hctx_t_nom, USEC_PER_SEC);
| > | ccid3_update_send_time(hctx);
| > Why this complicated code - it was simpler before. Now you first add 1 second, then send the packet
| > immediately and then you subtract again.
| >
| Overall I think this is much simpler. Before we were adding and
| subtracting with each change but now we are doing only on the first
| feedback (not immediately as you say)
The race-condition patch also removes the add/subtract pair. Please let me restate the above - I think it
is a matter of programming taste - your patch does affect some simplification, mine also does; but I'd
say neither is `better'.
|
| > | @@ -539,11 +528,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
| > | - /*
| > | - * Schedule no feedback timer to expire in
| > | - * max(t_RTO, 2 * s/X) = max(t_RTO, 2 * t_ipi)
| > | - */
| > | - t_nfb = max(hctx->ccid3hctx_t_rto, 2 * hctx->ccid3hctx_t_ipi);
| > | + t_nfb = hctx->ccid3hctx_t_rto;
| > |
| > BUG: This removes conformance with RFC 3448, section 4.3 and section 4.4.
| >
| I think I see what you are saying but the code is/was wrong here.
| Section 4.3 and 4.4 talks about t_mbi, not t_ipi. t_mbi is a constant
| of 64 seconds.
I meant the passages where the nofeedback timer is set to expire after max(4*R, 2*s/X), where t_ipi = s/X.
Now it is only reset to max(4*R). However - again this is a sideline and not central to what I think your
patch tries to achieve.
| can you move the refactor patch one up in the sequence so
I have shifted it up by 4 - from 3f to 3a - and refreshed the offsets of the other patches.
Summary of differences
----------------------
I would like to come to a kind of settlement for this discussion. We are both arguing that each of our
patches is `better', but I think this discussion leads nowhere, since the real problems (see follow-up)
remain untouched.
Taking the subtraction in packet_recv aside, I think it the patch I submitted the exclusive access to
t_nom is clearer to see. In your solution, the access is within a function (ccid3_update_send_time), which
could end up being called by other functions and thus reintroduce the race condition through the back door.
It is safe only because this function has only a single caller at the moment.
In the patch I submitted (3d/3e) t_nom is only updated within send_packet and not via a function which
is available to other ccid3 routines. Recomputing t_ipi and delta on each send is slightly more expensive.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: NAK: [PATCH 1/1] DCCP: Fix up t_nom
2007-01-08 11:28 NAK: [PATCH 1/1] DCCP: Fix up t_nom Gerrit Renker
2007-01-09 22:55 ` Ian McDonald
2007-01-10 10:13 ` Gerrit Renker
@ 2007-01-13 4:54 ` Ian McDonald
2 siblings, 0 replies; 4+ messages in thread
From: Ian McDonald @ 2007-01-13 4:54 UTC (permalink / raw)
To: dccp
Hi there,
On 10/01/07, Gerrit Renker <gerrit@erg.abdn.ac.uk> wrote:
> Hello,
>
> |
> | > | @@ -539,11 +528,7 @@ static void ccid3_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb)
> | > | - /*
> | > | - * Schedule no feedback timer to expire in
> | > | - * max(t_RTO, 2 * s/X) = max(t_RTO, 2 * t_ipi)
> | > | - */
> | > | - t_nfb = max(hctx->ccid3hctx_t_rto, 2 * hctx->ccid3hctx_t_ipi);
> | > | + t_nfb = hctx->ccid3hctx_t_rto;
> | > |
> | > BUG: This removes conformance with RFC 3448, section 4.3 and section 4.4.
> | >
> | I think I see what you are saying but the code is/was wrong here.
> | Section 4.3 and 4.4 talks about t_mbi, not t_ipi. t_mbi is a constant
> | of 64 seconds.
> I meant the passages where the nofeedback timer is set to expire after max(4*R, 2*s/X), where t_ipi = s/X.
> Now it is only reset to max(4*R). However - again this is a sideline and not central to what I think your
> patch tries to achieve.
>
I've reworked the patch to not make changes here as I reread the spec
and you are right. This means I keep t_ipi in the structure.
>
> | can you move the refactor patch one up in the sequence so
> I have shifted it up by 4 - from 3f to 3a - and refreshed the offsets of the other patches.
>
Thanks. I'm now in agreement with all your patches except 3d, 3e, 4e.
If you want to you can move the other ones in front of these three so
Arnaldo has a list of agreed patches.
I'm about to send my revised patch which I believe should replace 3d and 3e.
Ian
--
Web: http://wand.net.nz/~iam4
Blog: http://iansblog.jandi.co.nz
WAND Network Research Group
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2007-01-13 4:54 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-01-08 11:28 NAK: [PATCH 1/1] DCCP: Fix up t_nom Gerrit Renker
2007-01-09 22:55 ` Ian McDonald
2007-01-10 10:13 ` Gerrit Renker
2007-01-13 4:54 ` Ian McDonald
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.