All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] 1/1 dccp: transmit buffering
@ 2006-02-06 23:04 Ian McDonald
  2006-02-15  1:18 ` Ian McDonald
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Ian McDonald @ 2006-02-06 23:04 UTC (permalink / raw)
  To: dccp

> See the comments below, I guess we don't need yet another lock :-)

Will look at some more about locks as time permits - we take
possession of our house today :-)
> > @@ -191,7 +192,7 @@ static int dccp_wait_for_ccid(struct soc
> >         while (1) {
> >                 prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE);
> >
> > -               if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> > +               if (sk->sk_err)
>
> Humm, have to check this one...

The reason for this is to allow sending of queued packets when user
application says close...

> sk_reset_timer() please, so that we make sure we have a reference count
> for the sock as its going to be on a timer

As I read the timer source code if the timer returns do I have to do a
sock_put (I am guessing so as I don't see how it would happen
elsewise).

> > +       dp->dccps_xmit_lock = SPIN_LOCK_UNLOCKED;
>
> Not needed

I presume you mean because that is the default state? (Provided I
don't get rid of the lock altogether)

> sk_stop_timer please, to drop the refcount we grabbed with sk_reset_timer()
> elsewhere.
>
The thing with sk_stop_timer is that it doesn't check if the timer
routine is in use which would mean that we can have problems (I hit
this in testing). I presume the best way is to create a
sk_stop_timer_sync rather than use generic timer...

Ian
--
Ian McDonald
http://wand.net.nz/~iam4
WAND Network Research Group
University of Waikato
New Zealand

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 1/1 dccp: transmit buffering
  2006-02-06 23:04 [PATCH] 1/1 dccp: transmit buffering Ian McDonald
@ 2006-02-15  1:18 ` Ian McDonald
  2006-02-15 11:57 ` Arnaldo Carvalho de Melo
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ian McDonald @ 2006-02-15  1:18 UTC (permalink / raw)
  To: dccp

On 2/7/06, Ian McDonald <imcdnzl@gmail.com> wrote:
> > See the comments below, I guess we don't need yet another lock :-)
>
> Will look at some more about locks as time permits - we take
> possession of our house today :-)

Still bit short of time but tested some more and found it clashed with
feature negotiation so will need more work :-( One of the problems is
with the code stopping the retransmit timer in feature negotiation.
Removing that fixes it for standard test cases but it then fails with
delay of 50 ms and 10% loss using netem where it worked without
feature negotiation for me. Might be bottom half locking perhaps -
haven't tested changes to that yet.
>
> The reason for this is to allow sending of queued packets when user
> application says close...
>
> > sk_reset_timer() please, so that we make sure we have a reference count
> > for the sock as its going to be on a timer
>
> As I read the timer source code if the timer returns do I have to do a
> sock_put (I am guessing so as I don't see how it would happen
> elsewise).

Are you able to answer this?
>
> > sk_stop_timer please, to drop the refcount we grabbed with sk_reset_timer()
> > elsewhere.
> >
> The thing with sk_stop_timer is that it doesn't check if the timer
> routine is in use which would mean that we can have problems (I hit
> this in testing). I presume the best way is to create a
> sk_stop_timer_sync rather than use generic timer...

Any comments on this anyone as this is how I intend to code it...

Ian
--
Ian McDonald
http://wand.net.nz/~iam4
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 1/1 dccp: transmit buffering
  2006-02-06 23:04 [PATCH] 1/1 dccp: transmit buffering Ian McDonald
  2006-02-15  1:18 ` Ian McDonald
@ 2006-02-15 11:57 ` Arnaldo Carvalho de Melo
  2006-02-15 17:48 ` Ian McDonald
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-02-15 11:57 UTC (permalink / raw)
  To: dccp

Ian McDonald wrote:
> On 2/7/06, Ian McDonald <imcdnzl@gmail.com> wrote:
>>> See the comments below, I guess we don't need yet another lock :-)
>> Will look at some more about locks as time permits - we take
>> possession of our house today :-)
> 
> Still bit short of time but tested some more and found it clashed with
> feature negotiation so will need more work :-( One of the problems is
> with the code stopping the retransmit timer in feature negotiation.
> Removing that fixes it for standard test cases but it then fails with
> delay of 50 ms and 10% loss using netem where it worked without
> feature negotiation for me. Might be bottom half locking perhaps -
> haven't tested changes to that yet.

Andrea, can you take a look at this one?

>> The reason for this is to allow sending of queued packets when user
>> application says close...
>>
>>> sk_reset_timer() please, so that we make sure we have a reference count
>>> for the sock as its going to be on a timer
>> As I read the timer source code if the timer returns do I have to do a
>> sock_put (I am guessing so as I don't see how it would happen
>> elsewise).
> 
> Are you able to answer this?

Yes, you have to do a sock_put at the end of the timer routine, if
you restart the timer with sk_reset_timer it'll grab another refcount,
so the one done at first firing the timer will be dropped by the
sock_put at the end of the timer routine.

>>> sk_stop_timer please, to drop the refcount we grabbed with sk_reset_timer()
>>> elsewhere.
>>>
>> The thing with sk_stop_timer is that it doesn't check if the timer
>> routine is in use which would mean that we can have problems (I hit
>> this in testing). I presume the best way is to create a
>> sk_stop_timer_sync rather than use generic timer...
> 
> Any comments on this anyone as this is how I intend to code it...

Well, it checks, look at its usage of mod_timer to see if the timer was
pending and if not drop the lock, i.e. no refcount leakage here, if you
are worried about races that is why we use the timer routine under
bh_lock_sock()/bh_release_sock(), look at ccid3_hc_tx_no_feedback_timer.

Or is it some other worry you have? can you elaborate? If you need
mod_timer return forwarded to the sk_reset_user() user we can do it if
needed.

- Arnaldo


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 1/1 dccp: transmit buffering
  2006-02-06 23:04 [PATCH] 1/1 dccp: transmit buffering Ian McDonald
  2006-02-15  1:18 ` Ian McDonald
  2006-02-15 11:57 ` Arnaldo Carvalho de Melo
@ 2006-02-15 17:48 ` Ian McDonald
  2006-02-21  2:14 ` Ian McDonald
  2006-02-21 12:09 ` Arnaldo Carvalho de Melo
  4 siblings, 0 replies; 6+ messages in thread
From: Ian McDonald @ 2006-02-15 17:48 UTC (permalink / raw)
  To: dccp

> > Still bit short of time but tested some more and found it clashed with
> > feature negotiation so will need more work :-( One of the problems is
> > with the code stopping the retransmit timer in feature negotiation.
> > Removing that fixes it for standard test cases but it then fails with
> > delay of 50 ms and 10% loss using netem where it worked without
> > feature negotiation for me. Might be bottom half locking perhaps -
> > haven't tested changes to that yet.
>
> Andrea, can you take a look at this one?

It may yet be a race I think. You can wait until I test my code with
bh locks (it only happens when my code is merged...)
>
> Well, it checks, look at its usage of mod_timer to see if the timer was
> pending and if not drop the lock, i.e. no refcount leakage here, if you
> are worried about races that is why we use the timer routine under
> bh_lock_sock()/bh_release_sock(), look at ccid3_hc_tx_no_feedback_timer.

That is my exact worry - races - so that solves it now for me. Thanks
a lot Arnaldo.

--
Ian McDonald
http://wand.net.nz/~iam4
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] 1/1 dccp: transmit buffering
  2006-02-06 23:04 [PATCH] 1/1 dccp: transmit buffering Ian McDonald
                   ` (2 preceding siblings ...)
  2006-02-15 17:48 ` Ian McDonald
@ 2006-02-21  2:14 ` Ian McDonald
  2006-02-21 12:09 ` Arnaldo Carvalho de Melo
  4 siblings, 0 replies; 6+ messages in thread
From: Ian McDonald @ 2006-02-21  2:14 UTC (permalink / raw)
  To: dccp

[-- Attachment #1: Type: text/plain, Size: 7094 bytes --]

Arnaldo, (or anybody else)

I can get it working with the extra lock but not without... Comments
inline and at the end. Help much appreciated!


> Please leave the "extern"
>
Yep!

> I guess we don't need this lock, but instead use bh_lock_sock, check if
> there are users, look at ccid3_hc_tx_no_feedback_timer & ccid2_hc_tx_rto_expire
> (I left a comment even having fixed the problem, duh will fix).

I'm not sure what you mean by users here. Users of what? Can you explain

> sk_reset_timer() please, so that we make sure we have a reference count
> for the sock as its going to be on a timer
>
Will tidyup once get other problems sorted...

> No need for the {}, just one statement

Understand. Lazy when removing debugging...

> sk_eat_skb()  later?
>
It would be sk_eat_skb in dccp_write_xmit for two reasons:
1) sk_eat_skb alters receive queue. This is write
2) Existing xmit doesn't seem to do anything like this. To me it looks
like the lower level does the kfree_skb...

>
> > +       dp->dccps_xmit_lock = SPIN_LOCK_UNLOCKED;
>
> Not needed

Understand - will remove

Now I have attached two patches. acme1.diff is slightly tidied up
version of original patch and works with normal testing and netem
testing.

acme2.diff is an attempt to get rid of spinlock and use sock_lock as I
didn't think I needed bh_sock_lock as not receiving packets as that is
seemed aimed at that (stopping the interrupt driver trashing receive
queues...)

If I do away with spinlocks as per acme2.diff I get this:
Feb 21 14:37:56 localhost kernel: [17180168.280000] Debug: sleeping
function called from invalid context at net/core/sock.c:1327
Feb 21 14:37:56 localhost kernel: [17180168.280000] in_atomic():1,
irqs_disabled():0
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[show_trace+19/21] show_trace+0x13/0x15
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[dump_stack+22/26] dump_stack+0x16/0x1a
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[__might_sleep+135/143] __might_sleep+0x87/0x8f
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[lock_sock+26/158] lock_sock+0x1a/0x9e
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[pg0+411554706/1070261248] dccp_write_xmit+0x25/0x264 [dccp]
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[pg0+411555519/1070261248] dccp_write_xmit_timer+0xd/0x11 [dccp]
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[run_timer_softirq+296/387] run_timer_softirq+0x128/0x183
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[__do_softirq+66/144] __do_softirq+0x42/0x90
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[do_softirq+42/47] do_softirq+0x2a/0x2f
Feb 21 14:37:56 localhost kernel: [17180168.280000]  [irq_exit+48/60]
irq_exit+0x30/0x3c
Feb 21 14:37:56 localhost kernel: [17180168.280000]  [do_IRQ+72/84]
do_IRQ+0x48/0x54
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[common_interrupt+26/32] common_interrupt+0x1a/0x20
Feb 21 14:37:56 localhost kernel: [17180168.280000]  [cpu_idle+71/115]
cpu_idle+0x47/0x73
Feb 21 14:37:56 localhost kernel: [17180168.280000]  [_stext+55/60]
rest_init+0x37/0x3c
Feb 21 14:37:56 localhost kernel: [17180168.280000] 
[start_kernel+639/641] start_kernel+0x27f/0x281
Feb 21 14:37:56 localhost kernel: [17180168.280000]  [L6+0/2] 0xc0100199

which seems to be that it doesn't like lock_sock being called from a
timer as I read this. Can I not call lock_sock from a timer? When I
substituted lock_sock/release_sock for bh_lock_sock/bh_release_sock in
dccp_write_xmit (it doesn't call might_sleep) I lock my machine solid
with no output...

Feb 21 14:37:57 localhost kernel: [17180169.060000] scheduling while
atomic: ttcp_acme/0x00000101/11660
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[show_trace+19/21] show_trace+0x13/0x15
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[dump_stack+22/26] dump_stack+0x16/0x1a
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[schedule+67/1316] schedule+0x43/0x524
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[lock_sock+113/158] lock_sock+0x71/0x9e
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[pg0+411554706/1070261248] dccp_write_xmit+0x25/0x264 [dccp]
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[pg0+411555519/1070261248] dccp_write_xmit_timer+0xd/0x11 [dccp]
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[run_timer_softirq+296/387] run_timer_softirq+0x128/0x183
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[__do_softirq+66/144] __do_softirq+0x42/0x90
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[do_softirq+42/47] do_softirq+0x2a/0x2f
Feb 21 14:37:57 localhost kernel: [17180169.060000]  [irq_exit+48/60]
irq_exit+0x30/0x3c
Feb 21 14:37:57 localhost kernel: [17180169.060000]  [do_IRQ+72/84]
do_IRQ+0x48/0x54
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[common_interrupt+26/32] common_interrupt+0x1a/0x20
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[do_gettimeofday+20/148] do_gettimeofday+0x14/0x94
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[pg0+411550955/1070261248] dccp_timestamp+0x11/0x42 [dccp]
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[pg0+411872049/1070261248] ccid3_hc_tx_send_packet+0xeb/0x290
[dccp_ccid3]
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[pg0+411554738/1070261248] dccp_write_xmit+0x45/0x264 [dccp]
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[pg0+411558587/1070261248] dccp_sendmsg+0x11e/0x15f [dccp]
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[inet_sendmsg+52/64] inet_sendmsg+0x34/0x40
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[do_sock_write+145/153] do_sock_write+0x91/0x99
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[sock_aio_write+85/99] sock_aio_write+0x55/0x63
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[do_sync_write+159/208] do_sync_write+0x9f/0xd0
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[vfs_write+180/327] vfs_write+0xb4/0x147
Feb 21 14:37:57 localhost kernel: [17180169.060000]  [sys_write+58/97]
sys_write+0x3a/0x61
Feb 21 14:37:57 localhost kernel: [17180169.060000] 
[syscall_call+7/11] syscall_call+0x7/0xb

Here when I am transmitting a packet the timer fires because I go to
get the time and it causes an issue.....

acme2.diff doesn't work for me.

I can't see how to get this working without a transmit lock which
works perfectly for me. Are you able to tell me what I am doing wrong?
I personally can't see what I am doing wrong - it is just that
lock_sock isn't enough to prevent races which is why I need to put in
a transmit lock... the issue is really the timer firing. TCP doesn't
have the issue in quite this way as it uses ACK clocking in effect
rather than being time based like CCID3.

Comments/help really appreciated as I'm pulling my hair out and just
want to use my extra lock....

Ian
--
Ian McDonald
http://wand.net.nz/~iam4
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand

[-- Attachment #2: acme1.diff --]
[-- Type: text/plain, Size: 7557 bytes --]

diff --git a/include/linux/dccp.h b/include/linux/dccp.h
index 088529f..4a2f845 100644
--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -411,6 +411,8 @@ struct dccp_ackvec;
  * @dccps_role - Role of this sock, one of %dccp_role
  * @dccps_ndp_count - number of Non Data Packets since last data packet
  * @dccps_hc_rx_ackvec - rx half connection ack vector
+ * @dccps_xmit_timer - timer for when CCID is not ready to send
+ * @dccps_xmit_lock - lock for transmitting
  */
 struct dccp_sock {
 	/* inet_connection_sock has to be the first member of dccp_sock */
@@ -443,6 +445,8 @@ struct dccp_sock {
 	enum dccp_role			dccps_role:2;
 	__u8				dccps_hc_rx_insert_options:1;
 	__u8				dccps_hc_tx_insert_options:1;
+	struct timer_list		dccps_xmit_timer;
+	spinlock_t			dccps_xmit_lock;
 };
  
 static inline struct dccp_sock *dccp_sk(const struct sock *sk)
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 93f26dd..ad38765 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -5,7 +5,7 @@
  *
  *  An implementation of the DCCP protocol
  *  Copyright (c) 2005 Arnaldo Carvalho de Melo <acme@conectiva.com.br>
- *  Copyright (c) 2005 Ian McDonald <iam4@cs.waikato.ac.nz>
+ *  Copyright (c) 2005-6 Ian McDonald <imcdnzl@gmail.com>
  *
  *	This program is free software; you can redistribute it and/or modify it
  *	under the terms of the GNU General Public License version 2 as
@@ -126,7 +126,7 @@ extern void dccp_send_delayed_ack(struct
 extern void dccp_send_sync(struct sock *sk, const u64 seq,
 			   const enum dccp_pkt_type pkt_type);
 
-extern int dccp_write_xmit(struct sock *sk, struct sk_buff *skb, long *timeo);
+extern void dccp_write_xmit(struct sock *sk, int block);
 extern void dccp_write_space(struct sock *sk);
 
 extern void dccp_init_xmit_timers(struct sock *sk);
diff --git a/net/dccp/output.c b/net/dccp/output.c
index efd7ffb..efadaaa 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -2,7 +2,8 @@
  *  net/dccp/output.c
  * 
  *  An implementation of the DCCP protocol
- *  Arnaldo Carvalho de Melo <acme@conectiva.com.br>
+ *  Copyright (c) 2005 Arnaldo Carvalho de Melo <acme@conectiva.com.br>
+ *  Copyright (c) 2006 Ian McDonald <imcdnzl@gmail.com>
  *
  *	This program is free software; you can redistribute it and/or
  *	modify it under the terms of the GNU General Public License
@@ -191,7 +192,7 @@ static int dccp_wait_for_ccid(struct soc
 	while (1) {
 		prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE);
 
-		if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
+		if (sk->sk_err)
 			goto do_error;
 		if (!*timeo)
 			goto do_nonblock;
@@ -207,9 +208,11 @@ static int dccp_wait_for_ccid(struct soc
 			goto do_nonblock;
 
 		sk->sk_write_pending++;
+		spin_unlock(&dp->dccps_xmit_lock);
 		release_sock(sk);
 		*timeo -= schedule_timeout(delay);
 		lock_sock(sk);
+		spin_lock(&dp->dccps_xmit_lock);
 		sk->sk_write_pending--;
 	}
 out:
@@ -227,37 +230,55 @@ do_interrupted:
 	goto out;
 }
 
-int dccp_write_xmit(struct sock *sk, struct sk_buff *skb, long *timeo)
-{
-	const struct dccp_sock *dp = dccp_sk(sk);
-	int err = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb,
-					 skb->len);
+static void dccp_write_xmit_timer(unsigned long sk) {
+	dccp_write_xmit((struct sock *)sk,0);
+}
 
-	if (err > 0)
-		err = dccp_wait_for_ccid(sk, skb, timeo);
+void dccp_write_xmit(struct sock *sk, int block)
+{
+	struct dccp_sock *dp = dccp_sk(sk);
+	struct sk_buff *skb;
+	long timeo = 2000; /* FIXME imcdnzl - 2 second default */
 
-	if (err == 0) {
-		struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
-		const int len = skb->len;
+	spin_lock(&dp->dccps_xmit_lock);
+	
+	while ((skb = skb_peek(&sk->sk_write_queue))) {
+		int err = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb,
+					 skb->len);
+		
+		if (err > 0) {
+			if (likely(!block)) { 
+				mod_timer(&dp->dccps_xmit_timer, 
+						msecs_to_jiffies(err)+jiffies);
+				goto xmit_out;
+			} else
+				err = dccp_wait_for_ccid(sk, skb, &timeo);
+		}
 
-		if (sk->sk_state == DCCP_PARTOPEN) {
-			/* See 8.1.5.  Handshake Completion */
-			inet_csk_schedule_ack(sk);
-			inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
+		skb_dequeue(&sk->sk_write_queue);
+		if (err == 0) {
+			struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
+			const int len = skb->len;
+
+			if (sk->sk_state == DCCP_PARTOPEN) {
+				/* See 8.1.5.  Handshake Completion */
+				inet_csk_schedule_ack(sk);
+				inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
 						  inet_csk(sk)->icsk_rto,
 						  DCCP_RTO_MAX);
-			dcb->dccpd_type = DCCP_PKT_DATAACK;
-		} else if (dccp_ack_pending(sk))
-			dcb->dccpd_type = DCCP_PKT_DATAACK;
-		else
-			dcb->dccpd_type = DCCP_PKT_DATA;
-
-		err = dccp_transmit_skb(sk, skb);
-		ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
-	} else
-		kfree_skb(skb);
-
-	return err;
+				dcb->dccpd_type = DCCP_PKT_DATAACK;
+			} else if (dccp_ack_pending(sk))
+				dcb->dccpd_type = DCCP_PKT_DATAACK;
+			else
+				dcb->dccpd_type = DCCP_PKT_DATA;
+
+			err = dccp_transmit_skb(sk, skb);
+			ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
+		} else 
+			kfree(skb);
+	}
+xmit_out:
+	spin_unlock(&dp->dccps_xmit_lock);
 }
 
 int dccp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
@@ -396,6 +417,10 @@ static inline void dccp_connect_init(str
 	dccp_set_seqno(&dp->dccps_awl, max48(dp->dccps_awl, dp->dccps_iss));
 
 	icsk->icsk_retransmits = 0;
+	init_timer(&dp->dccps_xmit_timer);
+	dp->dccps_xmit_timer.data = (unsigned long)sk;
+	dp->dccps_xmit_timer.function = dccp_write_xmit_timer;
+	dp->dccps_xmit_lock = SPIN_LOCK_UNLOCKED;
 }
 
 int dccp_connect(struct sock *sk)
@@ -527,8 +552,10 @@ void dccp_send_close(struct sock *sk, co
 					DCCP_PKT_CLOSE : DCCP_PKT_CLOSEREQ;
 
 	if (active) {
+		dccp_write_xmit(sk, 1);
 		dccp_skb_entail(sk, skb);
 		dccp_transmit_skb(sk, skb_clone(skb, prio));
+		/* FIXME do we need a retransmit timer here? */
 	} else
 		dccp_transmit_skb(sk, skb);
 }
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 65b11ea..119ecc5 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -385,6 +385,7 @@ int dccp_sendmsg(struct kiocb *iocb, str
 
 	lock_sock(sk);
 	timeo = sock_sndtimeo(sk, noblock);
+	/* FIXME imcdnzl - we should be using timeo for memory allocation */
 
 	/*
 	 * We have to use sk_stream_wait_connect here to set sk_write_pending,
@@ -407,19 +408,13 @@ int dccp_sendmsg(struct kiocb *iocb, str
 	if (rc != 0)
 		goto out_discard;
 
-	rc = dccp_write_xmit(sk, skb, &timeo);
-	/*
-	 * XXX we don't use sk_write_queue, so just discard the packet.
-	 *     Current plan however is to _use_ sk_write_queue with
-	 *     an algorith similar to tcp_sendmsg, where the main difference
-	 *     is that in DCCP we have to respect packet boundaries, so
-	 *     no coalescing of skbs.
-	 *
-	 *     This bug was _quickly_ found & fixed by just looking at an OSTRA
-	 *     generated callgraph 8) -acme
-	 */
+	skb_queue_tail(&sk->sk_write_queue, skb);
+	release_sock(sk);
+	dccp_write_xmit(sk,0);
+	goto out_end;
 out_release:
 	release_sock(sk);
+out_end:
 	return rc ? : len;
 out_discard:
 	kfree_skb(skb);
@@ -591,6 +586,7 @@ static int dccp_close_state(struct sock 
 
 void dccp_close(struct sock *sk, long timeout)
 {
+	struct dccp_sock *dp = dccp_sk(sk);
 	struct sk_buff *skb;
 
 	lock_sock(sk);
@@ -606,6 +602,8 @@ void dccp_close(struct sock *sk, long ti
 		goto adjudge_to_death;
 	}
 
+	del_timer_sync(&dp->dccps_xmit_timer);
+
 	/*
 	 * We need to flush the recv. buffs.  We do this only on the
 	 * descriptor close, not protocol-sourced closes, because the

[-- Attachment #3: acme2.diff --]
[-- Type: text/plain, Size: 7803 bytes --]

diff --git a/include/linux/dccp.h b/include/linux/dccp.h
index 088529f..3604d1d 100644
--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -411,6 +411,8 @@ struct dccp_ackvec;
  * @dccps_role - Role of this sock, one of %dccp_role
  * @dccps_ndp_count - number of Non Data Packets since last data packet
  * @dccps_hc_rx_ackvec - rx half connection ack vector
+ * @dccps_xmit_timer - timer for when CCID is not ready to send
+ * @dccps_xmit_lock - lock for transmitting
  */
 struct dccp_sock {
 	/* inet_connection_sock has to be the first member of dccp_sock */
@@ -443,6 +445,8 @@ struct dccp_sock {
 	enum dccp_role			dccps_role:2;
 	__u8				dccps_hc_rx_insert_options:1;
 	__u8				dccps_hc_tx_insert_options:1;
+	struct timer_list		dccps_xmit_timer;
+//	spinlock_t			dccps_xmit_lock;
 };
  
 static inline struct dccp_sock *dccp_sk(const struct sock *sk)
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 93f26dd..ad38765 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -5,7 +5,7 @@
  *
  *  An implementation of the DCCP protocol
  *  Copyright (c) 2005 Arnaldo Carvalho de Melo <acme@conectiva.com.br>
- *  Copyright (c) 2005 Ian McDonald <iam4@cs.waikato.ac.nz>
+ *  Copyright (c) 2005-6 Ian McDonald <imcdnzl@gmail.com>
  *
  *	This program is free software; you can redistribute it and/or modify it
  *	under the terms of the GNU General Public License version 2 as
@@ -126,7 +126,7 @@ extern void dccp_send_delayed_ack(struct
 extern void dccp_send_sync(struct sock *sk, const u64 seq,
 			   const enum dccp_pkt_type pkt_type);
 
-extern int dccp_write_xmit(struct sock *sk, struct sk_buff *skb, long *timeo);
+extern void dccp_write_xmit(struct sock *sk, int block);
 extern void dccp_write_space(struct sock *sk);
 
 extern void dccp_init_xmit_timers(struct sock *sk);
diff --git a/net/dccp/output.c b/net/dccp/output.c
index efd7ffb..f0e60af 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -2,7 +2,8 @@
  *  net/dccp/output.c
  * 
  *  An implementation of the DCCP protocol
- *  Arnaldo Carvalho de Melo <acme@conectiva.com.br>
+ *  Copyright (c) 2005 Arnaldo Carvalho de Melo <acme@conectiva.com.br>
+ *  Copyright (c) 2006 Ian McDonald <imcdnzl@gmail.com>
  *
  *	This program is free software; you can redistribute it and/or
  *	modify it under the terms of the GNU General Public License
@@ -191,7 +192,7 @@ static int dccp_wait_for_ccid(struct soc
 	while (1) {
 		prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE);
 
-		if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
+		if (sk->sk_err)
 			goto do_error;
 		if (!*timeo)
 			goto do_nonblock;
@@ -207,9 +208,12 @@ static int dccp_wait_for_ccid(struct soc
 			goto do_nonblock;
 
 		sk->sk_write_pending++;
+//		spin_unlock(&dp->dccps_xmit_lock);
 		release_sock(sk);
+/* not using bh_lock as comes through dccp_close which uses lock_sock */
 		*timeo -= schedule_timeout(delay);
 		lock_sock(sk);
+//		spin_lock(&dp->dccps_xmit_lock);
 		sk->sk_write_pending--;
 	}
 out:
@@ -227,37 +231,63 @@ do_interrupted:
 	goto out;
 }
 
-int dccp_write_xmit(struct sock *sk, struct sk_buff *skb, long *timeo)
+static void dccp_write_xmit_timer(unsigned long sk) {
+	dccp_write_xmit((struct sock *)sk,0);
+}
+
+void dccp_write_xmit(struct sock *sk, int block)
 {
-	const struct dccp_sock *dp = dccp_sk(sk);
-	int err = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb,
-					 skb->len);
+	struct dccp_sock *dp = dccp_sk(sk);
+	struct sk_buff *skb;
+	long timeo = 2000; /* FIXME imcdnzl - 2 second default */
 
-	if (err > 0)
-		err = dccp_wait_for_ccid(sk, skb, timeo);
+//	spin_lock(&dp->dccps_xmit_lock);
+//
+	if (!block)
+		lock_sock(sk);
 
-	if (err == 0) {
-		struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
-		const int len = skb->len;
+	/* this is here because dccp_close locks the sock and
+	 * we want to avoid deadlock */
+	
+	while ((skb = skb_peek(&sk->sk_write_queue))) {
+		int err = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb,
+					 skb->len);
+		
+		if (err > 0) {
+			if (likely(!block)) { 
+				mod_timer(&dp->dccps_xmit_timer, 
+						msecs_to_jiffies(err)+jiffies);
+				goto xmit_out;
+			} else
+				err = dccp_wait_for_ccid(sk, skb, &timeo);
+		}
 
-		if (sk->sk_state == DCCP_PARTOPEN) {
-			/* See 8.1.5.  Handshake Completion */
-			inet_csk_schedule_ack(sk);
-			inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
+		skb_dequeue(&sk->sk_write_queue);
+		if (err == 0) {
+			struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
+			const int len = skb->len;
+
+			if (sk->sk_state == DCCP_PARTOPEN) {
+				/* See 8.1.5.  Handshake Completion */
+				inet_csk_schedule_ack(sk);
+				inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
 						  inet_csk(sk)->icsk_rto,
 						  DCCP_RTO_MAX);
-			dcb->dccpd_type = DCCP_PKT_DATAACK;
-		} else if (dccp_ack_pending(sk))
-			dcb->dccpd_type = DCCP_PKT_DATAACK;
-		else
-			dcb->dccpd_type = DCCP_PKT_DATA;
-
-		err = dccp_transmit_skb(sk, skb);
-		ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
-	} else
-		kfree_skb(skb);
-
-	return err;
+				dcb->dccpd_type = DCCP_PKT_DATAACK;
+			} else if (dccp_ack_pending(sk))
+				dcb->dccpd_type = DCCP_PKT_DATAACK;
+			else
+				dcb->dccpd_type = DCCP_PKT_DATA;
+
+			err = dccp_transmit_skb(sk, skb);
+			ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len);
+		} else 
+			kfree(skb);
+	}
+xmit_out:
+	if (!block)
+		release_sock(sk);
+//	spin_unlock(&dp->dccps_xmit_lock);
 }
 
 int dccp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
@@ -396,6 +426,10 @@ static inline void dccp_connect_init(str
 	dccp_set_seqno(&dp->dccps_awl, max48(dp->dccps_awl, dp->dccps_iss));
 
 	icsk->icsk_retransmits = 0;
+	init_timer(&dp->dccps_xmit_timer);
+	dp->dccps_xmit_timer.data = (unsigned long)sk;
+	dp->dccps_xmit_timer.function = dccp_write_xmit_timer;
+//	dp->dccps_xmit_lock = SPIN_LOCK_UNLOCKED;
 }
 
 int dccp_connect(struct sock *sk)
@@ -527,8 +561,10 @@ void dccp_send_close(struct sock *sk, co
 					DCCP_PKT_CLOSE : DCCP_PKT_CLOSEREQ;
 
 	if (active) {
+		dccp_write_xmit(sk, 1);
 		dccp_skb_entail(sk, skb);
 		dccp_transmit_skb(sk, skb_clone(skb, prio));
+		/* FIXME do we need a retransmit timer here? */
 	} else
 		dccp_transmit_skb(sk, skb);
 }
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 65b11ea..119ecc5 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -385,6 +385,7 @@ int dccp_sendmsg(struct kiocb *iocb, str
 
 	lock_sock(sk);
 	timeo = sock_sndtimeo(sk, noblock);
+	/* FIXME imcdnzl - we should be using timeo for memory allocation */
 
 	/*
 	 * We have to use sk_stream_wait_connect here to set sk_write_pending,
@@ -407,19 +408,13 @@ int dccp_sendmsg(struct kiocb *iocb, str
 	if (rc != 0)
 		goto out_discard;
 
-	rc = dccp_write_xmit(sk, skb, &timeo);
-	/*
-	 * XXX we don't use sk_write_queue, so just discard the packet.
-	 *     Current plan however is to _use_ sk_write_queue with
-	 *     an algorith similar to tcp_sendmsg, where the main difference
-	 *     is that in DCCP we have to respect packet boundaries, so
-	 *     no coalescing of skbs.
-	 *
-	 *     This bug was _quickly_ found & fixed by just looking at an OSTRA
-	 *     generated callgraph 8) -acme
-	 */
+	skb_queue_tail(&sk->sk_write_queue, skb);
+	release_sock(sk);
+	dccp_write_xmit(sk,0);
+	goto out_end;
 out_release:
 	release_sock(sk);
+out_end:
 	return rc ? : len;
 out_discard:
 	kfree_skb(skb);
@@ -591,6 +586,7 @@ static int dccp_close_state(struct sock 
 
 void dccp_close(struct sock *sk, long timeout)
 {
+	struct dccp_sock *dp = dccp_sk(sk);
 	struct sk_buff *skb;
 
 	lock_sock(sk);
@@ -606,6 +602,8 @@ void dccp_close(struct sock *sk, long ti
 		goto adjudge_to_death;
 	}
 
+	del_timer_sync(&dp->dccps_xmit_timer);
+
 	/*
 	 * We need to flush the recv. buffs.  We do this only on the
 	 * descriptor close, not protocol-sourced closes, because the

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] 1/1 dccp: transmit buffering
  2006-02-06 23:04 [PATCH] 1/1 dccp: transmit buffering Ian McDonald
                   ` (3 preceding siblings ...)
  2006-02-21  2:14 ` Ian McDonald
@ 2006-02-21 12:09 ` Arnaldo Carvalho de Melo
  4 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2006-02-21 12:09 UTC (permalink / raw)
  To: dccp

On 2/20/06, Ian McDonald <imcdnzl@gmail.com> wrote:
> Arnaldo, (or anybody else)
>
> I can get it working with the extra lock but not without... Comments
> inline and at the end. Help much appreciated!
>
>
> > Please leave the "extern"
> >
> Yep!
>
> > I guess we don't need this lock, but instead use bh_lock_sock, check if
> > there are users, look at ccid3_hc_tx_no_feedback_timer & ccid2_hc_tx_rto_expire
> > (I left a comment even having fixed the problem, duh will fix).
>
> I'm not sure what you mean by users here. Users of what? Can you explain

Look at lock_sock(), release_sock(), bh_lock_sock(), bh_release_sock() and
at sk_receive_skb and you'll see what I mean :-)

> > sk_reset_timer() please, so that we make sure we have a reference count
> > for the sock as its going to be on a timer
> >
> Will tidyup once get other problems sorted...
>
> > No need for the {}, just one statement
>
> Understand. Lazy when removing debugging...
>
> > sk_eat_skb()  later?
> >
> It would be sk_eat_skb in dccp_write_xmit for two reasons:
> 1) sk_eat_skb alters receive queue. This is write
> 2) Existing xmit doesn't seem to do anything like this. To me it looks
> like the lower level does the kfree_skb...

Scrap that suggestion, brainfart :-\

> >
> > > +       dp->dccps_xmit_lock = SPIN_LOCK_UNLOCKED;
> >
> > Not needed
>
> Understand - will remove
>
> Now I have attached two patches. acme1.diff is slightly tidied up
> version of original patch and works with normal testing and netem
> testing.
>
> acme2.diff is an attempt to get rid of spinlock and use sock_lock as I
> didn't think I needed bh_sock_lock as not receiving packets as that is
> seemed aimed at that (stopping the interrupt driver trashing receive
> queues...)

I'll try to take a look later today, now have to visit a customer...

- Arnaldo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-02-21 12:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-02-06 23:04 [PATCH] 1/1 dccp: transmit buffering Ian McDonald
2006-02-15  1:18 ` Ian McDonald
2006-02-15 11:57 ` Arnaldo Carvalho de Melo
2006-02-15 17:48 ` Ian McDonald
2006-02-21  2:14 ` Ian McDonald
2006-02-21 12:09 ` Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.