netdev.vger.kernel.org archive mirror
* udp weirdness
@ 2002-09-24  6:50 Eric Lemoine
  2002-09-27 12:02 ` Eric Lemoine
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Lemoine @ 2002-09-24  6:50 UTC (permalink / raw)
  To: netdev

I'm observing some UDP weirdness, or I'd better say some UDP behaviour
that I can't explain.

Two machines: one sending a UDP flow (using sendto) and another receiving 
this UDP flow (using bind + recv). 

When the dgram length is 357 bytes or lower I observe strange results
at the send side. My home-made udp_tx program gives the following:

$./udp_tx -h 192.168.4.1 -m 357
357 1312621 357.518

357 is the dgram length (in B), 1312621 the number of dgrams sent and 
357.518 the perceived throughput (in Mbits/s). The weirdness is that I
get 357.518 Mbits/s whereas the underlying network is 10 Mbits/s!
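
A minimal sketch of the arithmetic behind the third column, under two
assumptions that the post does not state: the run lasts 10 seconds, and
1 Mbit is counted as 2^20 bits. With those assumptions the sender's
figure comes out at about 357.5. The function names are hypothetical;
udp_tx itself is not shown in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Throughput in Mbit/s for a run of `dgrams` datagrams of `dgram_len`
 * bytes each. Assumption: 1 Mbit = 2^20 bits.
 */
double mbits_per_sec(uint64_t dgram_len, uint64_t dgrams, double seconds)
{
    assert(seconds > 0.0);
    double bits = (double)(dgram_len * dgrams) * 8.0;
    return bits / (1024.0 * 1024.0) / seconds;
}

/*
 * Fraction of datagrams accepted by sendto() that never reached the
 * wire, given the count seen by the application and the count seen
 * on the interface.
 */
double lost_fraction(uint64_t accepted, uint64_t on_wire)
{
    assert(accepted > 0 && on_wire <= accepted);
    return (double)(accepted - on_wire) / (double)accepted;
}
```

Fed with the counts from the 357-byte run (1312621 accepted, 29519 on
the wire), lost_fraction() says that nearly 98% of the datagrams
disappeared between sendto() and the interface.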

At the receive side the results are consistent (obviously):

$./udp_rx -m 357
357 29519 8.00884

<netstat -s --udp> on the send machine before and after the run also
gives me such a large amount of sent packets (~1312700), whereas
</sbin/ifconfig> confirms that about 29519 packets have been sent
out.

Below 357 Bytes, the same kind of results are observed. Above 357 Bytes,
the results make more sense to me:

$./udp_tx -h 192.168.4.1 -m 358
358 29505 8.04393

$./udp_rx -m 358
358 29468 8.0179

Does anybody know where I lose packets? And why do I lose them only when
the dgram length is 357 bytes or below?

BTW, I'm running 2.4.18-vanilla w/ the 3c59x driver.

Thx.
-- 
Eric

* Re: udp weirdness
  2002-09-24  6:50 udp weirdness Eric Lemoine
@ 2002-09-27 12:02 ` Eric Lemoine
  2002-09-27 14:53   ` jamal
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Lemoine @ 2002-09-27 12:02 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: netdev

On Tue, Sep 24, 2002 at 08:50:46AM +0200, Eric Lemoine wrote:
> I'm observing some UDP weirdness, or I'd better say some UDP behaviour
> that I can't explain.
> 
> Two machines: one sending a UDP flow (using sendto) and another receiving 
> this UDP flow (using bind + recv). 
> 
> When the dgram length is lower that 357 Bytes I observe strange results
> at the send side. My home-made udp_tx program gives the following:
> 
> $./udp_tx -h 192.168.4.1 -m 357
> 357 1312621 357.518
> 
> 357 is the dgram length (in B), 1312621 the number of dgrams sent and 
> 357.518 the perceived thruput (in Mbits/s). The weirdness is that I
> get 357.518 Mbits/s whereas the underlying network is 10Mbits/s!
> 
> At the receive side the results are consistent (obviously):
> 
> $./udp_rx -m 357
> 357 29519 8.00884
> 
> <netstat -s --udp> on the send machine before and after the run also
> gives me such a large amount of sent packets (~1312700), whereas
> </sbin/ifconfig> confirms that about 29519 packets have been sent
> out.
> 
> Below 357 Bytes, the same kind of results are observed. Above 357 Bytes,
> the results make more sense to me:
> 
> $./udp_tx -h 192.168.4.1 -m 358
> 358 29505 8.04393
> 
> $./udp_rx -m 358
> 358 29468 8.0179
> 
> Does anybody know where I lose packets? And why do I lose them only when
> the dgram length is below 357 Bytes?
> 
> BTW, I'm running 2.4.18-vanilla w/ the 3c59x driver.

I figured out that packets can be dropped in pfifo_fast_enqueue()
[the default qdisc's enqueue func], even though the driver/kernel 
flow control has triggered. 

And sendto does not notify the user when a packet gets dropped because 
the output queue overflows (as indicated in the sendto manpage).

Why doesn't the kernel just put the process to sleep instead of 
dropping packets?

Thanks.
-- 
Eric

* Re: udp weirdness
  2002-09-27 12:02 ` Eric Lemoine
@ 2002-09-27 14:53   ` jamal
  2002-09-27 15:04     ` Matti Aarnio
                       ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: jamal @ 2002-09-27 14:53 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: netdev



On Fri, 27 Sep 2002, Eric Lemoine wrote:

> I figured out that packets can be dropped in pfifo_fast_enqueue()
> [the default qdisc's enqueue func], even though the driver/kernel
> flow control has triggered.
>
> And sendto does not notify the user when packet gets dropped because
> the output queue overflows (as indicated in sendto manpage).
>
> Why doesn't the kernel just put the process into sleep instead of
> dropping packets?
>

What trigger do you suggest to wake up the process again?
A better idea may be to return something to the socket so it can
manage things instead -- not sure what to return, though, that wouldn't
break some standard.

cheers,
jamal

* Re: udp weirdness
  2002-09-27 14:53   ` jamal
@ 2002-09-27 15:04     ` Matti Aarnio
  2002-09-29 14:47       ` jamal
  2002-09-27 15:19     ` Eric Lemoine
  2002-09-27 15:57     ` Eric Lemoine
  2 siblings, 1 reply; 36+ messages in thread
From: Matti Aarnio @ 2002-09-27 15:04 UTC (permalink / raw)
  To: jamal; +Cc: Eric Lemoine, netdev

On Fri, Sep 27, 2002 at 10:53:00AM -0400, jamal wrote:
> On Fri, 27 Sep 2002, Eric Lemoine wrote:
> 
> > I figured out that packets can be dropped in pfifo_fast_enqueue()
> > [the default qdisc's enqueue func], even though the driver/kernel
> > flow control has triggered.
...
> What trigger do you suggest to wake up the process again?
> A better idea maybe to return something to the socket so it can
> manage things instead -- not sure what to return though that wouldnt
> break some standard;


"man sendto" error return codes:

       ENOBUFS
              The  output queue for a network interface was full.
              This generally indicates  that  the  interface  has
              stopped  sending,  but  may  be caused by transient
              congestion.  (This cannot occur in  Linux,  packets
              are just silently dropped when a device queue
              overflows.)

> cheers,
> jamal

/Matti Aarnio

* Re: udp weirdness
  2002-09-27 14:53   ` jamal
  2002-09-27 15:04     ` Matti Aarnio
@ 2002-09-27 15:19     ` Eric Lemoine
  2002-09-27 15:57     ` Eric Lemoine
  2 siblings, 0 replies; 36+ messages in thread
From: Eric Lemoine @ 2002-09-27 15:19 UTC (permalink / raw)
  To: jamal; +Cc: Eric Lemoine, netdev

On Fri, Sep 27, 2002 at 10:53:00AM -0400, jamal wrote:
>
> On Fri, 27 Sep 2002, Eric Lemoine wrote:
> 
> > I figured out that packets can be dropped in pfifo_fast_enqueue()
> > [the default qdisc's enqueue func], even though the driver/kernel
> > flow control has triggered.
> >
> > And sendto does not notify the user when packet gets dropped because
> > the output queue overflows (as indicated in sendto manpage).
> >
> > Why doesn't the kernel just put the process into sleep instead of
> > dropping packets?
> >
> 
> What trigger do you suggest to wake up the process again?
> A better idea maybe to return something to the socket so it can
> manage things instead -- not sure what to return though that wouldnt
> break some standard;

Linux seems to be the only one that does not return ENOBUFS in this
overflow case (from the sendto manpage), so I suspect it would not break
any standard.

-- 
Eric

* Re: udp weirdness
  2002-09-27 14:53   ` jamal
  2002-09-27 15:04     ` Matti Aarnio
  2002-09-27 15:19     ` Eric Lemoine
@ 2002-09-27 15:57     ` Eric Lemoine
  2 siblings, 0 replies; 36+ messages in thread
From: Eric Lemoine @ 2002-09-27 15:57 UTC (permalink / raw)
  To: jamal; +Cc: Eric Lemoine, netdev

> > I figured out that packets can be dropped in pfifo_fast_enqueue()
> > [the default qdisc's enqueue func], even though the driver/kernel
> > flow control has triggered.
> >
> > And sendto does not notify the user when packet gets dropped because
> > the output queue overflows (as indicated in sendto manpage).
> >
> > Why doesn't the kernel just put the process into sleep instead of
> > dropping packets?
> >
> 
> What trigger do you suggest to wake up the process again?

I was thinking of putting processes to sleep on per-device lists.
Once the net driver sees the NIC is ready to send again, it wakes up all
processes sleeping on its list. [If the socket the packet comes from
is marked non-blocking, the process is not put to sleep 
(obviously) and EAGAIN is returned.]

Throwing away packets when not absolutely necessary does not make sense
to me. Please correct me if I'm wrong.

> A better idea maybe to return something to the socket so it can
> manage things instead -- not sure what to return though that wouldnt
> break some standard;
> 
> cheers,
> jamal

-- 
Eric

* Re: udp weirdness
  2002-09-27 15:04     ` Matti Aarnio
@ 2002-09-29 14:47       ` jamal
  2002-09-30  8:49         ` Eric Lemoine
  0 siblings, 1 reply; 36+ messages in thread
From: jamal @ 2002-09-29 14:47 UTC (permalink / raw)
  To: Matti Aarnio; +Cc: Eric Lemoine, netdev


Ok, understood. Actually we already seem to have ENOBUFS being returned.

Eric,
Does the attached patch fix it? It is not tested or even compiled.
Is someone going to change the manpages?

cheers,
jamal

On Fri, 27 Sep 2002, Matti Aarnio wrote:

> On Fri, Sep 27, 2002 at 10:53:00AM -0400, jamal wrote:
> > On Fri, 27 Sep 2002, Eric Lemoine wrote:
> >
> > > I figured out that packets can be dropped in pfifo_fast_enqueue()
> > > [the default qdisc's enqueue func], even though the driver/kernel
> > > flow control has triggered.
> ...
> > What trigger do you suggest to wake up the process again?
> > A better idea maybe to return something to the socket so it can
> > manage things instead -- not sure what to return though that wouldnt
> > break some standard;
>
>
> "man sendto" error return codes:
>
>        ENOBUFS
>               The  output queue for a network interface was full.
>               This generally indicates  that  the  interface  has
>               stopped  sending,  but  may  be caused by transient
>               congestion.  (This cannot occur in  Linux,  packets
>               are just silently dropped when a device queue over­
>               flows.)
>
> > cheers,
> > jamal
>
> /Matti Aarnio
>

[-- Attachment #2: Type: TEXT/PLAIN, Size: 922 bytes --]

--- linux/net/ipv4/ip_output.c	2002/09/29 10:43:10	1.1
+++ linux/net/ipv4/ip_output.c	2002/09/29 10:43:10
@@ -603,8 +603,11 @@
 		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, 
 			      skb->dst->dev, output_maybe_reroute);
 		if (err) {
-			if (err > 0)
-				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+			if (err > 0) {
+				err = net_xmit_errno(err);
+				if (err && sk->protinfo.af_inet.recverr)
+				       sk->protinfo.af_inet.recverr = err;
+			}
 			if (err)
 				goto error;
 		}
@@ -713,8 +716,11 @@
 
 	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
 		      output_maybe_reroute);
-	if (err > 0)
-		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+	if (err > 0) {
+		err = net_xmit_errno(err);
+		if (err && sk->protinfo.af_inet.recverr)
+		       	 sk->protinfo.af_inet.recverr = err;
+	}
 	if (err)
 		goto error;
 out:

* Re: udp weirdness
  2002-09-29 14:47       ` jamal
@ 2002-09-30  8:49         ` Eric Lemoine
  2002-09-30 11:09           ` jamal
  2002-09-30 12:10           ` jamal
  0 siblings, 2 replies; 36+ messages in thread
From: Eric Lemoine @ 2002-09-30  8:49 UTC (permalink / raw)
  To: jamal; +Cc: shell.cyberus.ca, Eric Lemoine, netdev

> Ok, understood. Actually we already seem to have enobufs being returned.
> 
> Eric,
> Does the attached patch fix it? Not tested or even compiled.
> Someone going to change the manpages?

protinfo.af_inet.recverr is a flag (1-bit variable) to enable the
"extended reliable error message passing"; I don't see any reason 
for modifying it here. Setting this flag is the user's responsibility, 
isn't it?

My patch would look like this:

--- ip_output.c.old     Mon Sep 30 10:34:07 2002
+++ ip_output.c Mon Sep 30 10:40:08 2002
@@ -604,7 +604,7 @@
                              skb->dst->dev, output_maybe_reroute);
                if (err) {
                        if (err > 0)
-                               err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+                               err = (err == NET_XMIT_DROP || sk->protinfo.af_inet.recverr ) ? net_xmit_errno(err) : 0;
                        if (err)
                                goto error;
                }
@@ -714,7 +714,7 @@
        err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
                      output_maybe_reroute);
        if (err > 0)
-               err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+               err = (err == NET_XMIT_DROP || sk->protinfo.af_inet.recverr ) ? net_xmit_errno(err) : 0;
        if (err)
                goto error;
 out:

> --- linux/net/ipv4/ip_output.c	2002/09/29 10:43:10	1.1
> +++ linux/net/ipv4/ip_output.c	2002/09/29 10:43:10
> @@ -603,8 +603,11 @@
>  		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, 
>  			      skb->dst->dev, output_maybe_reroute);
>  		if (err) {
> -			if (err > 0)
> -				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
> +			if (err > 0) {
> +				err = net_xmit_errno(err);
> +				if (err && sk->protinfo.af_inet.recverr)
> +				       sk->protinfo.af_inet.recverr = err;
> +			}
>  			if (err)
>  				goto error;
>  		}
> @@ -713,8 +716,11 @@
>  
>  	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
>  		      output_maybe_reroute);
> -	if (err > 0)
> -		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
> +	if (err > 0) {
> +		err = net_xmit_errno(err);
> +		if (err && sk->protinfo.af_inet.recverr)
> +		       	 sk->protinfo.af_inet.recverr = err;
> +	}
>  	if (err)
>  		goto error;
>  out:

* Re: udp weirdness
  2002-09-30  8:49         ` Eric Lemoine
@ 2002-09-30 11:09           ` jamal
  2002-09-30 12:10           ` jamal
  1 sibling, 0 replies; 36+ messages in thread
From: jamal @ 2002-09-30 11:09 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: netdev



On Mon, 30 Sep 2002, Eric Lemoine wrote:

> > Ok, understood. Actually we already seem to have enobufs being returned.
> >
> > Eric,
> > Does the attached patch fix it? Not tested or even compiled.
> > Someone going to change the manpages?
>
> protinfo.af_inet.recverr is a flag (1-bit variable) to enable the
> "extended reliable error message passing"; I dont see any reason
> for modifying it here. Setting this flag is the user's responsability,
> isn't it?

I don't see any reason for modifying it there either - it is used
mostly for incoming ICMP errors; but the original piece of code was
modifying it so I left it as is; however, that's not what you were chasing
and I don't see how what you just posted solves your problem.
Did you even bother to test what I sent you? Or for that matter
bother testing what you posted?

cheers,
jamal

* Re: udp weirdness
  2002-09-30  8:49         ` Eric Lemoine
  2002-09-30 11:09           ` jamal
@ 2002-09-30 12:10           ` jamal
  2002-09-30 12:23             ` jamal
  1 sibling, 1 reply; 36+ messages in thread
From: jamal @ 2002-09-30 12:10 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: Eric Lemoine, netdev



On Mon, 30 Sep 2002, Eric Lemoine wrote:

> > Ok, understood. Actually we already seem to have enobufs being returned.
> >
> > Eric,
> > Does the attached patch fix it? Not tested or even compiled.
> > Someone going to change the manpages?
>
> protinfo.af_inet.recverr is a flag (1-bit variable) to enable the
> "extended reliable error message passing"; I dont see any reason
> for modifying it here. Setting this flag is the user's responsability,
> isn't it?
>

Ok, sorry, I didn't realize I was the one setting it. What I meant
was to use sk->protinfo.af_inet.recverr only when everything is
clean from the device layer,
i.e. do something along the lines of:

--- ip_output.c	2002/09/29 08:50:36	1.1
+++ ip_output.c	2002/09/30 06:24:32
@@ -603,8 +603,11 @@
 		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL,
 			      skb->dst->dev, output_maybe_reroute);
 		if (err) {
-			if (err > 0)
-				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+			if (err > 0) {
+				err = net_xmit_errno(err);
+				if (!err && sk->protinfo.af_inet.recverr)
+				       err = sk->protinfo.af_inet.recverr;
+			}
 			if (err)
 				goto error;
 		}
@@ -713,8 +716,11 @@

 	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
 		      output_maybe_reroute);
-	if (err > 0)
-		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+	if (err > 0) {
+		err = net_xmit_errno(err);
+		if (!err && sk->protinfo.af_inet.recverr)
+		       	 err = sk->protinfo.af_inet.recverr;
+	}
 	if (err)
 		goto error;
 out:

* Re: udp weirdness
  2002-09-30 12:10           ` jamal
@ 2002-09-30 12:23             ` jamal
  2002-10-01  0:22               ` PATCH " jamal
  0 siblings, 1 reply; 36+ messages in thread
From: jamal @ 2002-09-30 12:23 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: Eric Lemoine, netdev



On Mon, 30 Sep 2002, jamal wrote:

> was to use sk->protinfo.af_inet.recverr only in the case of all is
> clean from the device layer.

Of course, if you want to say sk->protinfo.af_inet.recverr has higher
priority, then the patch becomes ;->

--- ip_output.c	2002/09/29 08:50:36	1.1
+++ ip_output.c	2002/09/30 06:38:32
@@ -603,8 +603,9 @@
 		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL,
 			      skb->dst->dev, output_maybe_reroute);
 		if (err) {
-			if (err > 0)
-				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+			if (err > 0) {
+				 err = sk->protinfo.af_inet.recverr ? sk->protinfo.af_inet.recverr:net_xmit_errno(err);
+			}
 			if (err)
 				goto error;
 		}
@@ -713,8 +714,9 @@

 	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
 		      output_maybe_reroute);
-	if (err > 0)
-		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+	if (err > 0) {
+			err = sk->protinfo.af_inet.recverr ? sk->protinfo.af_inet.recverr:net_xmit_errno(err);
+	}
 	if (err)
 		goto error;
 out:

* PATCH Re: udp weirdness
  2002-09-30 12:23             ` jamal
@ 2002-10-01  0:22               ` jamal
  2002-10-01  6:35                 ` Eric Lemoine
  2002-10-01 13:53                 ` kuznet
  0 siblings, 2 replies; 36+ messages in thread
From: jamal @ 2002-10-01  0:22 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: Eric Lemoine, netdev



Eric,

you are right about sk->protinfo.af_inet.recverr.
I checked what other OSes do and I am convinced that the patch below
at least makes us behave like the BSDs.
Alexey/Dave, sk->protinfo.af_inet.recverr is not needed for ENOBUFS to
be propagated back to the socket level; please review and probably
apply:

cheers,
jamal

--- ip_output.c	2002/09/29 08:50:36	1.1
+++ ip_output.c	2002/09/30 06:24:32
@@ -603,8 +603,11 @@
 		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL,
 			      skb->dst->dev, output_maybe_reroute);
 		if (err) {
-			if (err > 0)
-				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+			if (err > 0) {
+				err = net_xmit_errno(err);
+				if (!err && sk->protinfo.af_inet.recverr)
+				       err = sk->protinfo.af_inet.recverr;
+			}
 			if (err)
 				goto error;
 		}
@@ -713,8 +716,11 @@

 	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
 		      output_maybe_reroute);
-	if (err > 0)
-		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+	if (err > 0) {
+		err = net_xmit_errno(err);
+		if (!err && sk->protinfo.af_inet.recverr)
+		       	 err = sk->protinfo.af_inet.recverr;
+	}
 	if (err)
 		goto error;
 out:

* Re: PATCH Re: udp weirdness
  2002-10-01  0:22               ` PATCH " jamal
@ 2002-10-01  6:35                 ` Eric Lemoine
  2002-10-01  9:51                   ` jamal
  2002-10-01 13:53                 ` kuznet
  1 sibling, 1 reply; 36+ messages in thread
From: Eric Lemoine @ 2002-10-01  6:35 UTC (permalink / raw)
  To: jamal; +Cc: Eric Lemoine, Eric Lemoine, netdev

> Eric,
> 
> you are right about sk->protinfo.af_inet.recverr ..
> I checked what other OSes do and i am convinced that the patch below
> at least makes us behave like the BSDs.
> Alexey/Dave, sk->protinfo.af_inet.recverr is not needed for enobufs to
> be propagated back to the socket level; please review and probably
> apply:

Jamal, 

Have you posted the right patch? I see that sk->protinfo.af_inet.recverr is
still around. Here follows the patch that I think is the correct one.
Please confirm. Thx.

--- ip_output.c.old     Mon Sep 30 10:34:07 2002
+++ ip_output.c Tue Oct  1 00:00:05 2002
@@ -604,7 +604,7 @@
                              skb->dst->dev, output_maybe_reroute);
                if (err) {
                        if (err > 0)
-                               err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+                               err = net_xmit_errno(err);
                        if (err)
                                goto error;
                }
@@ -714,7 +714,7 @@
        err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
                      output_maybe_reroute);
        if (err > 0)
-               err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+               err = net_xmit_errno(err);
        if (err)
                goto error;
 out:

> --- ip_output.c	2002/09/29 08:50:36	1.1
> +++ ip_output.c	2002/09/30 06:24:32
> @@ -603,8 +603,11 @@
>  		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL,
>  			      skb->dst->dev, output_maybe_reroute);
>  		if (err) {
> -			if (err > 0)
> -				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
> +			if (err > 0) {
> +				err = net_xmit_errno(err);
> +				if (!err && sk->protinfo.af_inet.recverr)
> +				       err = sk->protinfo.af_inet.recverr;
> +			}
>  			if (err)
>  				goto error;
>  		}
> @@ -713,8 +716,11 @@
> 
>  	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
>  		      output_maybe_reroute);
> -	if (err > 0)
> -		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
> +	if (err > 0) {
> +		err = net_xmit_errno(err);
> +		if (!err && sk->protinfo.af_inet.recverr)
> +		       	 err = sk->protinfo.af_inet.recverr;
> +	}
>  	if (err)
>  		goto error;
>  out:

-- 
Eric

* Re: PATCH Re: udp weirdness
  2002-10-01  6:35                 ` Eric Lemoine
@ 2002-10-01  9:51                   ` jamal
  0 siblings, 0 replies; 36+ messages in thread
From: jamal @ 2002-10-01  9:51 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: Eric Lemoine, netdev




On Tue, 1 Oct 2002, Eric Lemoine wrote:

> Jamal,
>
> Have you posted the right patch?

;-> I don't know what happened; the patch below is what I was trying to post.
I deleted all the old patches this time to make sure ;->

> I see that sk->protinfo.af_inet.recverr is
> still around. Here follows the patch that I think is the correct one.
> Please confirm. Thx.
>

Yes, it is correct, but it was missing one more bit for IPv6, added below.
Alexey/Dave, this is the correct version ;->

cheers,
jamal

--- linux/net/ipv4/ip_output.c	2002/09/29 08:50:36     1.1
+++ linux/net/ipv4/ip_output.c	2002/09/30 09:11:15
@@ -603,8 +603,8 @@
 		err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL,
 			      skb->dst->dev, output_maybe_reroute);
 		if (err) {
-			if (err > 0)
-				err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+			if (err > 0)
+				 err = net_xmit_errno(err);
 			if (err)
 				goto error;
 		}
@@ -713,8 +713,8 @@

 	err = NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev,
 		      output_maybe_reroute);
-	if (err > 0)
-		err = sk->protinfo.af_inet.recverr ? net_xmit_errno(err) : 0;
+	if (err > 0)
+			err = net_xmit_errno(err);
 	if (err)
 		goto error;
 out:
--- linux/net/ipv6/ip6_output.c	2002/09/30 18:30:52     1.1
+++ linux/net/ipv6/ip6_output.c	2002/09/30 18:31:30
@@ -684,7 +684,7 @@
 out:
 	ip6_dst_store(sk, dst, fl->nl_u.ip6_u.daddr == &np->daddr ? &np->daddr : NULL);
 	if (err > 0)
-		err = np->recverr ? net_xmit_errno(err) : 0;
+		err = net_xmit_errno(err);
 	return err;
 }
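
The effect of the hunks above on the value returned to the socket layer
can be modelled in user space. This is a sketch, not kernel code: the
constants and model_xmit_errno() are stand-ins, and their values and
shape (everything except a congestion notification maps to -ENOBUFS) are
assumptions about the 2.4 headers, which the thread does not quote.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the qdisc return codes; the values are assumptions. */
enum { XMIT_SUCCESS = 0, XMIT_DROP = 1, XMIT_CN = 2 };

/* Assumed shape of net_xmit_errno(): a congestion notice is not an error. */
static int model_xmit_errno(int e)
{
    return e != XMIT_CN ? -ENOBUFS : 0;
}

/* Before the patch: a local drop is reported only if recverr is set. */
int err_before_patch(int xmit_ret, int recverr)
{
    assert(xmit_ret > 0);
    return recverr ? model_xmit_errno(xmit_ret) : 0;
}

/* After the patch: a local drop is reported regardless of recverr. */
int err_after_patch(int xmit_ret, int recverr)
{
    assert(xmit_ret > 0);
    (void)recverr; /* no longer consulted */
    return model_xmit_errno(xmit_ret);
}
```

For a dropped packet on a socket without the flag, the first function
returns 0 (the silent drop reported at the start of the thread) and the
second returns -ENOBUFS.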

* Re: PATCH Re: udp weirdness
  2002-10-01  0:22               ` PATCH " jamal
  2002-10-01  6:35                 ` Eric Lemoine
@ 2002-10-01 13:53                 ` kuznet
  2002-10-01 14:14                   ` jamal
  2002-10-01 14:26                   ` Chris Friesen
  1 sibling, 2 replies; 36+ messages in thread
From: kuznet @ 2002-10-01 13:53 UTC (permalink / raw)
  To: jamal; +Cc: netdev

Hello!

> be propagated back to the socket level; please review and probably
> apply:

That was tried and failed miserably ages ago. Applications go
insane when they get this error: anything from logging tons of useless
messages to abort()ing and CPU hogging. On the other hand, a silent reaction
to loss is almost exactly the desired behaviour in the presence of congestion
in most cases.

I have no idea why those OSes work. Probably they are so great that they
never lose packets. We do. 

Probably the world has changed since that time, but I have no desire
to repeat the experiment, to be honest.

Alexey

* Re: PATCH Re: udp weirdness
  2002-10-01 13:53                 ` kuznet
@ 2002-10-01 14:14                   ` jamal
  2002-10-01 14:26                   ` Chris Friesen
  1 sibling, 0 replies; 36+ messages in thread
From: jamal @ 2002-10-01 14:14 UTC (permalink / raw)
  To: kuznet; +Cc: Eric Lemoine, Eric Lemoine, netdev


Alexey,

I agree; so just ignore the patch. You know you could have saved me
about an hour if you had spoken earlier ;->
Eric, note what Alexey says. So to get ENOBUFS, please
use the IP_RECVERR socket option.
Someone should document this in the manpages.
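
A minimal user-space sketch of turning the option on for an IPv4
datagram socket and reading it back. The helper names are hypothetical;
the setsockopt()/getsockopt() calls at the IPPROTO_IP level are the
standard Linux interface for this option.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Enable IP_RECVERR on an IPv4 socket. Returns 0 on success, -1 on error. */
int enable_recverr(int fd)
{
    assert(fd >= 0);
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}

/* Read the flag back. Returns 1 if set, 0 if clear, -1 on error. */
int recverr_enabled(int fd)
{
    assert(fd >= 0);
    int val = 0;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_IP, IP_RECVERR, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

The option is per socket and off by default, so a sender that wants the
error has to call enable_recverr() on its own socket before the
sendto() loop.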

cheers,
jamal

On Tue, 1 Oct 2002 kuznet@ms2.inr.ac.ru wrote:

> Hello!
>
> > be propagated back to the socket level; please review and probably
> > apply:
>
> That was tried and failed miserably ages ago. Applications become
> insane when getting this error from logging tons of useless messages
> to abort()ing and cpu hogging. From the other hand silent reaction
> to loss is almost exactly desired behaviour in presence of congestion
> in the most of cases.
>
> I have no idea, why that OSes work. Probably, they are so great that
> never lose packets. We do.
>
> Probably, the world has changed since that time, but I have no desire
> to repeat the experiment to be honest.
>
> Alexey
>
>
>

* Re: PATCH Re: udp weirdness
  2002-10-01 13:53                 ` kuznet
  2002-10-01 14:14                   ` jamal
@ 2002-10-01 14:26                   ` Chris Friesen
  2002-10-01 14:40                     ` kuznet
  1 sibling, 1 reply; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 14:26 UTC (permalink / raw)
  To: kuznet; +Cc: jamal, netdev

kuznet@ms2.inr.ac.ru wrote:

> That was tried and failed miserably ages ago. Applications become
> insane when getting this error from logging tons of useless messages
> to abort()ing and cpu hogging. From the other hand silent reaction
> to loss is almost exactly desired behaviour in presence of congestion
> in the most of cases.
> 
> I have no idea, why that OSes work. Probably, they are so great that
> never lose packets. We do. 

One of the principles of software design that I was taught was the 
principle of least surprise.

If I'm looping on a sendto() with a blocking socket, I would expect the 
syscall to block until the packet has been handed off to the device 
driver.  This may mean blocking until local congestion backs off.

If I'm using non-blocking IO and there is local congestion, I would 
expect to get ENOBUFS or maybe EAGAIN/EWOULDBLOCK.
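
A sender written to that expectation might look like the sketch below.
It treats ENOBUFS and EAGAIN/EWOULDBLOCK as "queue full, try again
shortly" and anything else as a real failure. The function names, the
1 ms pause and the retry limit are illustrative assumptions, and whether
the kernel reports the condition at all is what this thread is arguing
about.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Non-zero if the errno value means "local queue full, retry later". */
int is_transient_send_errno(int err)
{
    return err == ENOBUFS || err == EAGAIN || err == EWOULDBLOCK;
}

/*
 * sendto() with a bounded number of retries on transient errors.
 * Returns the sendto() result; on persistent congestion returns -1
 * with errno left at the last transient error.
 */
ssize_t sendto_retry(int fd, const void *buf, size_t len,
                     const struct sockaddr *to, socklen_t tolen,
                     int max_tries)
{
    assert(max_tries > 0);
    struct timespec pause = { 0, 1000000 }; /* 1 ms, arbitrary */
    int saved = 0;

    for (int i = 0; i < max_tries; i++) {
        ssize_t n = sendto(fd, buf, len, 0, to, tolen);
        if (n >= 0 || !is_transient_send_errno(errno))
            return n;
        saved = errno;            /* nanosleep() may overwrite errno */
        nanosleep(&pause, NULL);
    }
    errno = saved;
    return -1;
}
```

Sleeping instead of spinning is the point: the process stops burning CPU
building packets that would be thrown away.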

The way we do it now means that we can chew up massive amounts of cpu 
creating packets in userspace and throwing them away in the kernel, with 
no way of knowing from userspace that it is happening.  This just 
doesn't seem like the right thing to do.

Chris

* Re: PATCH Re: udp weirdness
  2002-10-01 14:26                   ` Chris Friesen
@ 2002-10-01 14:40                     ` kuznet
  2002-10-01 14:52                       ` Chris Friesen
  0 siblings, 1 reply; 36+ messages in thread
From: kuznet @ 2002-10-01 14:40 UTC (permalink / raw)
  To: Chris Friesen; +Cc: hadi, netdev

Hello!

> If I'm looping on a sendto() with a blocking socket, I would expect the 
> syscall to block until the packet has been handed off to the device 
> driver.  This may mean blocking until local congestion backs off.

Feel free to implement. :-)


> The way we do it now means that we can chew up massive amounts of cpu 
> creating packets in userspace and throwing them away in the kernel, with 
> no way of knowing from userspace that it is happening.

What? If your application is clever enough to handle ENOBUFS right,
set IP_RECVERR and live in peace.

Alexey

* Re: PATCH Re: udp weirdness
  2002-10-01 14:40                     ` kuznet
@ 2002-10-01 14:52                       ` Chris Friesen
  2002-10-01 15:31                         ` kuznet
  0 siblings, 1 reply; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 14:52 UTC (permalink / raw)
  To: kuznet; +Cc: hadi, netdev

kuznet@ms2.inr.ac.ru wrote:

> Feel free to implement. :-)

I may have to poke around...if nothing else I'll learn more about the 
networking code...

> What? If your applications is enough clever to handle ENOBUFS right,
> set IP_RECVERR and live in peace.

The original poster was complaining about messages being silently 
dropped when there is congestion.  If IP_RECVERR is turned on, would 
sendto() then return -1 so I know to try and read the error messages? 
I'm assuming I get ENOBUFS back in the ee_code field?  Or can I get away 
with reading errno and ignoring the error queue?

I've never used IP_RECVERR and there doesn't seem to be a lot of 
documentation about it.

Chris

* Re: PATCH Re: udp weirdness
  2002-10-01 14:52                       ` Chris Friesen
@ 2002-10-01 15:31                         ` kuznet
  2002-10-01 16:16                           ` Chris Friesen
                                             ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: kuznet @ 2002-10-01 15:31 UTC (permalink / raw)
  To: Chris Friesen; +Cc: hadi, netdev

Hello!

> > Feel free to implement. :-)
> 
> I may have to poke around...if nothing else I'll learn more about the 
> networking code...

It is a difficult task, if possible at all.

The main obstacle is that we must not block after select() has succeeded,
otherwise applications will lock up. Given the nature of datagram
services (and of networking services generally, where routes change etc.),
you do not know at the time of select() where the datagram will go.
So blocking can only be based on a criterion that does not depend on this.

We use sndbuf (like all the OSes). Actually, the problem with silent
losses is solved by tuning SO_SNDBUF. Though it is not a complete
solution (it fails with lots of senders), it solves all those bullshit
problems with silent losses. People just do not care about this, so
they get what they deserve.
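
The SO_SNDBUF tuning mentioned above can be sketched roughly as follows.
This is an illustration only: `set_sndbuf` is a hypothetical helper name,
and the size to request is something each application has to choose for
itself. On Linux the granted size may differ from the request (socket(7)
describes the kernel doubling the value and capping it at a system limit),
so the sketch reads the value back.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for a send buffer of `bytes` and return the size
 * the kernel actually granted, or -1 if either socket call fails. */
int set_sndbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    assert(bytes > 0);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    /* Read the value back: the kernel may adjust what was requested. */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) < 0)
        return -1;
    return granted;
}
```

A caller would typically invoke this once, right after socket(), and log
the returned size if it is smaller than what a burst is expected to need.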


> dropped when there is congestion.  If IP_RECVERR is turned on, would 
> sendto() then return -1 so I know to try and read the error messages? 

Yes.


> I'm assuming I get ENOBUFS back in the ee_code field?  Or can I get away 
> with reading errno and ignoring the error queue?

No, the error queue should be read. But not all errors are queued,
only those which are supported asynchronously. ENOBUFS is not yet one of them.
For example, when a packet is dropped late (some qdiscs do this, dropping
already queued packets to make room for other ones), ENOBUFS is not sent back.


>			     there doesn't seem to be a lot of 
> documentation about it.

Damn! (I'm sorry) It is documented _very_ well, thanks to Andi.
I really hate it when people do not even bother to look at the manpages
before making such statements.

man recvmsg
man ip

Alexey


* Re: PATCH Re: udp weirdness
  2002-10-01 15:31                         ` kuznet
@ 2002-10-01 16:16                           ` Chris Friesen
  2002-10-01 16:41                             ` kuznet
  2002-10-01 16:42                           ` Ben Greear
  2002-10-02 11:13                           ` Eric Lemoine
  2 siblings, 1 reply; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 16:16 UTC (permalink / raw)
  To: kuznet; +Cc: hadi, netdev

kuznet@ms2.inr.ac.ru wrote:

> But not all the errors are queued,
> only those which are supported asynchronously. ENOBUFS is still not.
> F.e. when packet is dropped lately (some qdiscs do this, dropping already
> queued packets to give place for another ones) ENOBUFS is not sent back.

Hmm...so even with IP_RECVERR I may not be notified if the packet is 
dropped?


>>			     there doesn't seem to be a lot of 
>>documentation about it.
>>
> 
> Damn! (I'm sorry) It is documented _very_ well, thanks to Andi.
> I really hate this mode when people do not worrying even to look to manpages
> before pronouncing such statements.
> 
> man recvmsg
> man ip

Yes, I looked at both of those, and man udp as well.  My point about a 
lack of documentation was more related to corner cases and exactly what 
gets returned when.  The example above (where there are cases where no 
errors are returned even with IP_RECVERR turned on) is not mentioned 
anywhere in the documentation.

I write code for a telephony softswitch.  We are running a legacy app on 
top of an emulator on top of linux.  I want to ensure that my packets 
either a) got out onto the wire, or b) my app got an error message back 
explaining why the message didn't get onto the wire so that it can 
decide how to proceed.

Now my overall bandwidth requirements aren't too bad, but the traffic is 
bursty, with a batch of messages being sent in a tight sendto() loop.  I 
do NOT want messages to be silently dropped in the kernel with no error 
returned.

Chris


* Re: PATCH Re: udp weirdness
  2002-10-01 16:16                           ` Chris Friesen
@ 2002-10-01 16:41                             ` kuznet
  2002-10-01 17:17                               ` Chris Friesen
  0 siblings, 1 reply; 36+ messages in thread
From: kuznet @ 2002-10-01 16:41 UTC (permalink / raw)
  To: Chris Friesen; +Cc: hadi, netdev

Hello!

> Hmm...so even with IP_RECVERR I may not be notified if the packet is 
> dropped?

Are you joking, by any chance? :-) "Even" sounds interesting, because
this kind of error cannot be reported, in principle, without IP_RECVERR,
which provides asynchronous error reporting.

Packets can be dropped or damaged _after_ they are successfully queued
to the device. And this need not be emphasized in the documentation
on IP_RECVERR, except in a section "FUTURE DEVELOPMENT".


> either a) got out onto the wire, or b) my app got an error message back 
> explaining why the message didn't get onto the wire so that it can 
> decide how to proceed.

Then use IP_RECVERR. It reports all the errors which are detectable.

Alexey


* Re: PATCH Re: udp weirdness
  2002-10-01 15:31                         ` kuznet
  2002-10-01 16:16                           ` Chris Friesen
@ 2002-10-01 16:42                           ` Ben Greear
  2002-10-01 16:58                             ` Chris Friesen
  2002-10-01 17:55                             ` jamal
  2002-10-02 11:13                           ` Eric Lemoine
  2 siblings, 2 replies; 36+ messages in thread
From: Ben Greear @ 2002-10-01 16:42 UTC (permalink / raw)
  To: kuznet; +Cc: Chris Friesen, hadi, netdev

kuznet@ms2.inr.ac.ru wrote:
> Hello!
> 
> 
>>>Feel free to implement. :-)
>>
>>I may have to poke around...if nothing else I'll learn more about the 
>>networking code...
> 
> 
> It is difficult task, if possible at all.
> 
> The main obstacle is that we must not block after select() succeeded,
> otherwise applications will lockup. Taking into account nature of datagram
> services (and generally of networking services, where routes change et al.)
> you do not know at time of select(), where the datagram will go.
> So, blocking can be made only based on a criterium not depending on this.
> 
> We use sndbuf (like all the OSes). Actually, the problem with silent
> losses is solved by tuning SO_SNDBUF. Though it is not a complete
> solution (failing whith lots of senders), it solves all those bullshit
> problems with silent losses. People just do not care about this, so
> they get the thing which they deserve.

Changing the size of the sending buffer will make you able to drop fewer
packets in bursty behaviour, but it does not guarantee you anything, does
it?  For those of us wanting to write programs with very little slop in
their behaviour, 'most of the time' is not good enough.

I have not yet looked at the UDP code in detail, but it does appear we
can know if a NIC accepts the packet for transmit or not.  If it does
not, then just don't dequeue off of the send list, retry it next time.

I cannot see any logical reason why we cannot ensure that a packet is
at least given to the driver once sendto has accepted it.  As soon as I
get my pktgen and send-to-self code cleaned up, I am planning to start
working on making UDP reliably send packets, or return an error to
the calling code.  I will, of course, keep you informed if I actually
get something working...

Enjoy,
Ben


> 
> 
> 
>>dropped when there is congestion.  If IP_RECVERR is turned on, would 
>>sendto() then return -1 so I know to try and read the error messages? 
> 
> 
> Yes.
> 
> 
> 
>>I'm assuming I get ENOBUFS back in the ee_code field?  Or can I get away 
>>with reading errno and ignoring the error queue?
> 
> 
> No, error queue should be read. But not all the errors are queued,
> only those which are supported asynchronously. ENOBUFS is still not.
> F.e. when packet is dropped lately (some qdiscs do this, dropping already
> queued packets to give place for another ones) ENOBUFS is not sent back.
> 
> 
> 
>>			     there doesn't seem to be a lot of 
>>documentation about it.
> 
> 
> Damn! (I'm sorry) It is documented _very_ well, thanks to Andi.
> I really hate this mode when people do not worrying even to look to manpages
> before pronouncing such statements.
> 
> man recvmsg
> man ip
> 
> Alexey
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear


* Re: PATCH Re: udp weirdness
  2002-10-01 16:42                           ` Ben Greear
@ 2002-10-01 16:58                             ` Chris Friesen
  2002-10-01 17:55                             ` jamal
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 16:58 UTC (permalink / raw)
  To: Ben Greear; +Cc: kuznet, hadi, netdev

Ben Greear wrote:

> I cannot see any logical reason that we cannot ensure that a packet is
> not at least given to the driver when accepted by sendto.  As soon as I
> get my pktgen and send-to-self code cleaned up, I am planning to start
> working on making UDP reliably send packets, or return an error to
> the calling code.  I will, of course, keep you informed if I actually
> get something working...


That would be fantastic.

Chris


* Re: PATCH Re: udp weirdness
  2002-10-01 16:41                             ` kuznet
@ 2002-10-01 17:17                               ` Chris Friesen
  0 siblings, 0 replies; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 17:17 UTC (permalink / raw)
  To: kuznet; +Cc: hadi, netdev

kuznet@ms2.inr.ac.ru wrote:

>>Hmm...so even with IP_RECVERR I may not be notified if the packet is 
>>dropped?
>>
> 
> Do you not jest occasionally? :-) "Even" sounds interesting, because
> this kind of errors cannot be reported _principially_ without IP_RECVERR,
> providing asynchronous error reporting.

The point that I fundamentally need IP_RECVERR for async errors is 
interesting, I'll have to look into that.  Thanks for the pointer.

> Packet can be dropped, damaged _after_ they are successfully queued
> to the device. And this need not to be emphasized in documentation
> on IP_RECVERR, except for a section "FUTURE DEVELOPMENT".

This is certainly true, but I would hope that packets being damaged or 
dropped after being queued to the device would be a rare event.  It 
would still be good to have them reported though.

I understand the problems with regards to blocking sockets, since you 
can't block after the select() passes and select() doesn't know where 
the packet is going.  It seems that it should be pretty straightforward 
to enable ENOBUFS for nonblocking sockets though--am I missing something?

For now I'll play with SO_SNDBUF and IP_RECVERR.

Thanks,

Chris


* Re: PATCH Re: udp weirdness
  2002-10-01 16:42                           ` Ben Greear
  2002-10-01 16:58                             ` Chris Friesen
@ 2002-10-01 17:55                             ` jamal
  2002-10-01 18:36                               ` Chris Friesen
  2002-10-01 18:52                               ` Ben Greear
  1 sibling, 2 replies; 36+ messages in thread
From: jamal @ 2002-10-01 17:55 UTC (permalink / raw)
  To: Ben Greear; +Cc: kuznet, Chris Friesen, netdev



On Tue, 1 Oct 2002, Ben Greear wrote:

> get my pktgen and send-to-self code cleaned up, I am planning to start
> working on making UDP reliably send packets, or return an error to
> the calling code.  I will, of course, keep you informed if I actually
> get something working...

If you want reliability, then that's what TCP is for.
I am curious why you would even want to retransmit a voice packet or
why a local drop should be treated any different from a remote/network
drop in a voice application ...

When you fail in sendto/sendmsg, errno is set to ENOBUFS as long as you
set IP_RECVERR in the socket options; you can also receive ICMP errors
as described in the manpages (use a msg_control buffer and call recvmsg
with MSG_ERRQUEUE).

BTW, a good sample of an app that makes good use of ENOBUFS to do
congestion control, IP_RECVERR and MSG_ERRQUEUE is the ping app in Alexey's
iputils package. That I did not remember all this before chasing the
phantom with Eric is an indication I need to increase my caffeine
consumption.

cheers,
jamal


* Re: PATCH Re: udp weirdness
  2002-10-01 18:36                               ` Chris Friesen
@ 2002-10-01 18:35                                 ` jamal
  2002-10-01 18:54                                   ` Ben Greear
  2002-10-01 19:03                                   ` Chris Friesen
  0 siblings, 2 replies; 36+ messages in thread
From: jamal @ 2002-10-01 18:35 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Ben Greear, kuznet, netdev



On Tue, 1 Oct 2002, Chris Friesen wrote:

> to be silently dropped by the kernel because userspace is sending faster
> than they can get onto the wire during that tight loop.
>

So what happens when you find packets being dropped?
AFAIK, a dropped voice packet is as good as dead whether local or remote.

> Okay, so with IP_RECVERR set the case that Eric saw will not happen?  I
> mean that sendto() will return with -1 and errno set to ENOBUFS?

yes

cheers,
jamal


* Re: PATCH Re: udp weirdness
  2002-10-01 17:55                             ` jamal
@ 2002-10-01 18:36                               ` Chris Friesen
  2002-10-01 18:35                                 ` jamal
  2002-10-01 18:52                               ` Ben Greear
  1 sibling, 1 reply; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 18:36 UTC (permalink / raw)
  To: jamal; +Cc: Ben Greear, kuznet, netdev

jamal wrote:
> 
> On Tue, 1 Oct 2002, Ben Greear wrote:
> 
> 
>>get my pktgen and send-to-self code cleaned up, I am planning to start
>>working on making UDP reliably send packets, or return an error to
>>the calling code.  I will, of course, keep you informed if I actually
>>get something working...
>>
> 
> If you want realibility then thats what TCP is for.

There is a requirement to interwork with other equipment using UDP.

> I am curious why you would even want to retransmit a voice packet or
> why a local drop should be treated any different from a remote/network
> drop in a voice application ...

The legacy app generates batches of messages, and the emulator layer 
then sends them out in a tight loop on sendto().  I don't want packets 
to be silently dropped by the kernel because userspace is sending faster 
than they can get onto the wire during that tight loop.

Eric's original testcase was a tight loop on sendto() resulting in 
userspace sending at a way higher rate than could be put onto the wire, 
so the kernel was silently dropping them.  This is exactly what I want 
to avoid.

> When you fail in sendto/sendmsg, errno is set to ENOBUFS as long as you
> set IP_RECVERR in the socket options; you can also receive ICMP errors
> as described in the manpages (use a msg_control buffer and call recvmsg
> with MSG_ERRQUEUE).

Okay, so with IP_RECVERR set the case that Eric saw will not happen?  I 
mean that sendto() will return with -1 and errno set to ENOBUFS?

> BTW, a good sample of an app that makes good use of ENOBUFS to do
> congestion control, IP_RECVERR and MSG_ERRQUEUE is the ping app in Alexeys
> iputils package. Why did i not remember all this before chasing the
> phantom with Eric is an indication i need to increase my cafeine
> consumption

I'll take a look and try things out.

Chris


* Re: PATCH Re: udp weirdness
  2002-10-01 17:55                             ` jamal
  2002-10-01 18:36                               ` Chris Friesen
@ 2002-10-01 18:52                               ` Ben Greear
  1 sibling, 0 replies; 36+ messages in thread
From: Ben Greear @ 2002-10-01 18:52 UTC (permalink / raw)
  To: jamal; +Cc: kuznet, Chris Friesen, netdev

jamal wrote:
> 
> On Tue, 1 Oct 2002, Ben Greear wrote:
> 
> 
>>get my pktgen and send-to-self code cleaned up, I am planning to start
>>working on making UDP reliably send packets, or return an error to
>>the calling code.  I will, of course, keep you informed if I actually
>>get something working...
> 
> 
> If you want realibility then thats what TCP is for.
> I am curious why you would even want to retransmit a voice packet or
> why a local drop should be treated any different from a remote/network
> drop in a voice application ...

Don't worry so much about why I want to do this, just assume that I do! :)
I have explained in the past, but since our needs are different, it does
not seem to impress upon anyone (I want to send at high speeds, and detect
every possible packet dropped by the network.  If my local machine drops
the packet, my detection of network-dropped packets is bogus...)

A local packet drop is different because it can be treated differently.
Any feedback of this nature that we can give to user-space can be used to
make any recovery or throttling decisions better.  I accept that I cannot
guarantee a UDP packet is received by its intended target, but I do not
accept that the local machine cannot even guarantee that it has sent the
packet onto the network.

> 
> When you fail in sendto/sendmsg, errno is set to ENOBUFS as long as you
> set IP_RECVERR in the socket options; you can also receive ICMP errors
> as described in the manpages (use a msg_control buffer and call recvmsg
> with MSG_ERRQUEUE).

I will investigate the IP_RECVERR more closely, it may do just what I need.

Thanks,
Ben

> 
> BTW, a good sample of an app that makes good use of ENOBUFS to do
> congestion control, IP_RECVERR and MSG_ERRQUEUE is the ping app in Alexeys
> iputils package. Why did i not remember all this before chasing the
> phantom with Eric is an indication i need to increase my cafeine
> consumption.
> 
> cheers,
> jamal
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear


* Re: PATCH Re: udp weirdness
  2002-10-01 18:35                                 ` jamal
@ 2002-10-01 18:54                                   ` Ben Greear
  2002-10-01 19:03                                   ` Chris Friesen
  1 sibling, 0 replies; 36+ messages in thread
From: Ben Greear @ 2002-10-01 18:54 UTC (permalink / raw)
  To: jamal; +Cc: Chris Friesen, kuznet, netdev

jamal wrote:
> 
> On Tue, 1 Oct 2002, Chris Friesen wrote:
> 
> 
>>to be silently dropped by the kernel because userspace is sending faster
>>than they can get onto the wire during that tight loop.
>>
> 
> 
> So what happens when you find packets being dropped?
> AFAIK, a dropped voice packet is as good as dead whether local or remote.

If it is dropped locally, you can re-send in way less than 1 millisecond,
which is well within the realm of expected jitter.  You can also back off
for 5 milliseconds to alleviate congestion, which is still within acceptable
jitter.

If you think you sent it, but didn't actually send it, then it looks like
your network is shitting, when in fact your local machine is acting shitty.

Ben

> 
> 
>>Okay, so with IP_RECVERR set the case that Eric saw will not happen?  I
>>mean that sendto() will return with -1 and errno set to ENOBUFS?
> 
> 
> yes
> 
> cheers,
> jamal
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear


* Re: PATCH Re: udp weirdness
  2002-10-01 18:35                                 ` jamal
  2002-10-01 18:54                                   ` Ben Greear
@ 2002-10-01 19:03                                   ` Chris Friesen
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Friesen @ 2002-10-01 19:03 UTC (permalink / raw)
  To: jamal; +Cc: Ben Greear, kuznet, netdev

jamal wrote:
 >
 > On Tue, 1 Oct 2002, Chris Friesen wrote:
 >
 >> to be silently dropped by the kernel because userspace is sending
 >> faster than they can get onto the wire during that tight loop.

 > So what happens when you find packets being dropped? AFAIK, a
 > dropped voice packet is as good as dead whether local or remote.

We do have some leeway in terms of latency, and delayed leaving the box
is not the same as dropped.

We know that call processing messaging can only ever go out one ethernet
interface at a time, and we know that the call agent is guaranteed 90% 
of the userspace cpu time (scheduler changes).  We certify the box for a 
certain engineered throughput, so we know the average packets/sec value. 
We also have total knowledge/control over the other apps running on the 
box in question.

So really all I'm protecting against is from one single userspace app 
generating packets faster than the network can keep up.  As long as I 
get EAGAIN/EWOULDBLOCK/ENOBUFS on my non-blocking socket then I'll just 
try again until it succeeds or I get a more serious error.  So far this 
seems to be working nicely in 2.2, but we're just in the middle of a 
switch to 2.4 for the new release and I want to make sure I've got a 
handle on things.
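
The retry-until-sent approach described above might look roughly like
this. It is a sketch under assumptions: `send_with_retry` is a hypothetical
name, and the 5 ms poll timeout, the 1 ms pause and the retry cap are
illustrative values rather than anything taken from the real emulator.
It treats EAGAIN/EWOULDBLOCK as "socket buffer full" and waits for
writability, and treats ENOBUFS as "queue further down is full" and simply
pauses before trying again.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <time.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: send one datagram without blocking, retrying on
 * transient errors. Returns the byte count from sendto(), or -1 with errno
 * set to the last failure once the error is not transient or max_tries
 * attempts have been used. */
ssize_t send_with_retry(int fd, const void *buf, size_t len,
                        const struct sockaddr *dst, socklen_t dstlen,
                        int max_tries)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int last_err = EAGAIN;
    int attempt;

    assert(max_tries > 0);
    for (attempt = 0; attempt < max_tries; attempt++) {
        ssize_t n = sendto(fd, buf, len, MSG_DONTWAIT, dst, dstlen);
        if (n >= 0)
            return n;
        last_err = errno;
        if (last_err == EAGAIN || last_err == EWOULDBLOCK) {
            /* Socket send buffer is full: wait (up to 5 ms) for it to drain. */
            poll(&pfd, 1, 5);
        } else if (last_err == ENOBUFS) {
            /* Assumed to mean a full queue below the socket, where the socket
             * may still look writable, so pause briefly instead of polling. */
            nanosleep(&pause, NULL);
        } else {
            break;  /* a more serious error: give up immediately */
        }
    }
    errno = last_err;
    return -1;
}
```

Capping the retries keeps a persistently congested or dead link from
stalling the sender forever; the caller decides what to do with the
message when -1 comes back.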

Thanks for your help,

Chris

PS. I realize that this design is simplistic, but we are severely 
constrained by the legacy app running on the emulator.


* Re: PATCH Re: udp weirdness
  2002-10-01 15:31                         ` kuznet
  2002-10-01 16:16                           ` Chris Friesen
  2002-10-01 16:42                           ` Ben Greear
@ 2002-10-02 11:13                           ` Eric Lemoine
  2002-10-02 14:09                             ` Chris Friesen
  2002-10-02 15:25                             ` Ben Greear
  2 siblings, 2 replies; 36+ messages in thread
From: Eric Lemoine @ 2002-10-02 11:13 UTC (permalink / raw)
  To: kuznet; +Cc: Chris Friesen, hadi, netdev

> > I may have to poke around...if nothing else I'll learn more about the 
> > networking code...
> 
> It is difficult task, if possible at all.
> 
> The main obstacle is that we must not block after select() succeeded,
> otherwise applications will lockup. Taking into account nature of datagram
> services (and generally of networking services, where routes change et al.)
> you do not know at time of select(), where the datagram will go.
> So, blocking can be made only based on a criterium not depending on this.
> problems with silent losses. People just do not care about this, so
> they get the thing which they deserve.

Alexey,

Would you mind explaining a bit more why apps will lock up if we block
after select() has succeeded? Or anyone else?

Thx.

	Eric.


* Re: PATCH Re: udp weirdness
  2002-10-02 11:13                           ` Eric Lemoine
@ 2002-10-02 14:09                             ` Chris Friesen
  2002-10-02 15:25                             ` Ben Greear
  1 sibling, 0 replies; 36+ messages in thread
From: Chris Friesen @ 2002-10-02 14:09 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: kuznet, hadi, netdev

Eric Lemoine wrote:

> Would you mind explaining a bit more why apps will lockup if we block
> after select() succeeded. Or anyone?

I suspect he means a temporary lockup.

There are many apps that use blocking sockets and rely on select() to 
tell them when they may write out to the socket without blocking.

If the write then blocks, the whole app is blocked.  Some of them 
probably wouldn't like this.

Was that it Alexey? Or is there something more that I missed?

Chris


* Re: PATCH Re: udp weirdness
  2002-10-02 11:13                           ` Eric Lemoine
  2002-10-02 14:09                             ` Chris Friesen
@ 2002-10-02 15:25                             ` Ben Greear
  2002-10-03 15:58                               ` Eric Lemoine
  1 sibling, 1 reply; 36+ messages in thread
From: Ben Greear @ 2002-10-02 15:25 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: kuznet, Chris Friesen, hadi, netdev

Eric Lemoine wrote:
>>>I may have to poke around...if nothing else I'll learn more about the 
>>>networking code...
>>
>>It is difficult task, if possible at all.
>>
>>The main obstacle is that we must not block after select() succeeded,
>>otherwise applications will lockup. Taking into account nature of datagram
>>services (and generally of networking services, where routes change et al.)
>>you do not know at time of select(), where the datagram will go.
>>So, blocking can be made only based on a criterium not depending on this.
>>problems with silent losses. People just do not care about this, so
>>they get the thing which they deserve.
> 
> 
> Alexey,
> 
> Would you mind explaining a bit more why apps will lockup if we block
> after select() succeeded. Or anyone?

Actually, I'm more interested to know why we would **need** to block after
select has succeeded.  It would seem to me that select is busted in this
case.  For the case of a very large UDP packet and a small send buffer,
select gets confused, but at least when the send buffer is > 128k, it should
be right...


> 
> Thx.
> 
> 	Eric.
> 


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear


* Re: PATCH Re: udp weirdness
  2002-10-02 15:25                             ` Ben Greear
@ 2002-10-03 15:58                               ` Eric Lemoine
  2002-10-03 16:29                                 ` kuznet
  0 siblings, 1 reply; 36+ messages in thread
From: Eric Lemoine @ 2002-10-03 15:58 UTC (permalink / raw)
  To: Ben Greear; +Cc: Eric Lemoine, kuznet, Chris Friesen, hadi, netdev

> > >It is difficult task, if possible at all.
> > >
> > >The main obstacle is that we must not block after select() succeeded,
> > >otherwise applications will lockup. Taking into account nature of datagram
> > >services (and generally of networking services, where routes change et 
> > >al.)
> > >you do not know at time of select(), where the datagram will go.
> > >So, blocking can be made only based on a criterium not depending on this.
> > >problems with silent losses. People just do not care about this, so
> > >they get the thing which they deserve.
> >
> >Alexey,
> >
> >Would you mind explaining a bit more why apps will lockup if we block
> >after select() succeeded. Or anyone?
> 
> Actually, I'm more interested to know why we would **need** to block after
> select has succeeded.  It would seem to me that select is busted in this
> case.  For the case of a very large UDP packet and a small send buffer,
> select gets confused, but at least when the send buffer is > 128k, it should
> be right...

With the current implementation, a process calling sendto() doesn't go to
sleep after select() has succeeded. My question applied to the case where
the process would go to sleep when the qdisc queue overflows.

-- 
Eric


* Re: PATCH Re: udp weirdness
  2002-10-03 15:58                               ` Eric Lemoine
@ 2002-10-03 16:29                                 ` kuznet
  0 siblings, 0 replies; 36+ messages in thread
From: kuznet @ 2002-10-03 16:29 UTC (permalink / raw)
  To: Eric Lemoine; +Cc: greearb, Eric.Lemoine, cfriesen, hadi, netdev

Hello!

> > >Would you mind explaining a bit more why apps will lockup if we block
> > >after select() succeeded. Or anyone?

Because queue overflow is a persistent condition. Do you want your routing
daemon to go into a coma when sending an update to a dead ppp link?

But, actually, it is simply required by standards. No blocking after
select succeeds is the definition of select, in fact.

Alexey


Thread overview: 36+ messages
2002-09-24  6:50 udp weirdness Eric Lemoine
2002-09-27 12:02 ` Eric Lemoine
2002-09-27 14:53   ` jamal
2002-09-27 15:04     ` Matti Aarnio
2002-09-29 14:47       ` jamal
2002-09-30  8:49         ` Eric Lemoine
2002-09-30 11:09           ` jamal
2002-09-30 12:10           ` jamal
2002-09-30 12:23             ` jamal
2002-10-01  0:22               ` PATCH " jamal
2002-10-01  6:35                 ` Eric Lemoine
2002-10-01  9:51                   ` jamal
2002-10-01 13:53                 ` kuznet
2002-10-01 14:14                   ` jamal
2002-10-01 14:26                   ` Chris Friesen
2002-10-01 14:40                     ` kuznet
2002-10-01 14:52                       ` Chris Friesen
2002-10-01 15:31                         ` kuznet
2002-10-01 16:16                           ` Chris Friesen
2002-10-01 16:41                             ` kuznet
2002-10-01 17:17                               ` Chris Friesen
2002-10-01 16:42                           ` Ben Greear
2002-10-01 16:58                             ` Chris Friesen
2002-10-01 17:55                             ` jamal
2002-10-01 18:36                               ` Chris Friesen
2002-10-01 18:35                                 ` jamal
2002-10-01 18:54                                   ` Ben Greear
2002-10-01 19:03                                   ` Chris Friesen
2002-10-01 18:52                               ` Ben Greear
2002-10-02 11:13                           ` Eric Lemoine
2002-10-02 14:09                             ` Chris Friesen
2002-10-02 15:25                             ` Ben Greear
2002-10-03 15:58                               ` Eric Lemoine
2002-10-03 16:29                                 ` kuznet
2002-09-27 15:19     ` Eric Lemoine
2002-09-27 15:57     ` Eric Lemoine
