netdev.vger.kernel.org archive mirror
* [PATCH] sendfile() and UDP socket
@ 2008-09-10 12:39 Johann Baudy
  2008-09-10 20:16 ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Johann Baudy @ 2008-09-10 12:39 UTC (permalink / raw)
  To: netdev; +Cc: Evgeniy Polyakov

Hi All,

sendfile() over a UDP socket is currently limited to files of about 64 KB
(the maximum cork.length).
If you run sendfile() on a file larger than 64 KB over a UDP
socket, the system call stops and returns about 64 KB without sending
anything on the network.
This patch pushes the pending frames when the frame buffer is full, to
prevent overflow.

Signed-off-by: Johann Baudy <johann.baudy@gmail.com>

 net/ipv4/udp.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 56fcda3..d019e13 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -743,7 +743,22 @@ int udp_sendpage(struct sock *sk, struct page
*page, int offset,
                 size_t size, int flags)
 {
        struct udp_sock *up = udp_sk(sk);
+       struct inet_sock *inet = inet_sk(sk);
        int ret;
+       int fragheaderlen;
+       struct ip_options *opt = NULL;
+
+       lock_sock(sk);
+       if (inet->cork.flags & IPCORK_OPT)
+               opt = inet->cork.opt;
+       fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
+
+       if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
+               ret = udp_push_pending_frames(sk);
+               if (ret)
+                       goto out;
+       }
+       release_sock(sk);

        if (!up->pending) {
                struct msghdr msg = {   .msg_flags = flags|MSG_MORE };


* Re: [PATCH] sendfile() and UDP socket
  2008-09-10 12:39 [PATCH] sendfile() and UDP socket Johann Baudy
@ 2008-09-10 20:16 ` David Miller
  0 siblings, 0 replies; 29+ messages in thread
From: David Miller @ 2008-09-10 20:16 UTC (permalink / raw)
  To: johaahn; +Cc: netdev, johnpol

From: "Johann Baudy" <johaahn@gmail.com>
Date: Wed, 10 Sep 2008 14:39:55 +0200

> Hi All,
> 
> sendfile() over a UDP socket is currently limited to files of about 64 KB
> (the maximum cork.length).
> If you run sendfile() on a file larger than 64 KB over a UDP
> socket, the system call stops and returns about 64 KB without sending
> anything on the network.
> This patch pushes the pending frames when the frame buffer is full, to
> prevent overflow.
> 
> Signed-off-by: Johann Baudy <johann.baudy@gmail.com>

Your email client mangled the patch, turning tabs into spaces etc.
Please correct this and resubmit.


* [PATCH] sendfile() and UDP socket
@ 2008-09-14 10:25 Johann Baudy
  2008-09-16  4:17 ` Simon Horman
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Johann Baudy @ 2008-09-14 10:25 UTC (permalink / raw)
  To: netdev; +Cc: David Miller

Hi All,

sendfile() over a UDP socket is currently limited to files of about 64 KB (the maximum cork.length).
If you run sendfile() on a file larger than 64 KB over a UDP socket, the system call stops and returns about 64 KB without sending anything on the network.
This patch pushes the pending frames when the frame buffer is full, to prevent overflow.

Signed-off-by: Johann Baudy <johann.baudy@gmail.com>

 net/ipv4/udp.c |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8e42fbb..64e0857 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -743,7 +743,22 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
 		 size_t size, int flags)
 {
 	struct udp_sock *up = udp_sk(sk);
+	struct inet_sock *inet = inet_sk(sk);
 	int ret;
+	int fragheaderlen;
+	struct ip_options *opt = NULL;
+
+	lock_sock(sk);
+	if (inet->cork.flags & IPCORK_OPT)
+		opt = inet->cork.opt;
+	fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
+
+	if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
+		ret = udp_push_pending_frames(sk);
+		if (ret)
+			goto out;
+	}
+	release_sock(sk);
 
 	if (!up->pending) {
 		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };
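
The size test added by the patch can be tried out in user space. The following is only a sketch of the arithmetic, not kernel code: must_push_first() and IPV4_HDR_LEN are hypothetical names, and the 20-byte header length assumes an IPv4 header without options.

```c
#include <stddef.h>

#define IPV4_HDR_LEN 20u	/* assumed: sizeof(struct iphdr), no IP options */

/*
 * Mirrors the comparison in the patch: returns 1 when appending "size"
 * bytes to "cork_length" already-corked bytes would reach the 0xFFFF
 * IPv4 length limit, so pending frames would have to be pushed first.
 */
int must_push_first(size_t cork_length, size_t size, size_t optlen)
{
	size_t fragheaderlen = IPV4_HDR_LEN + optlen;

	return cork_length + size >= 0xFFFF - fragheaderlen;
}
```

With 4096-byte pages, fifteen pages (61440 bytes) still fit, and the sixteenth would trigger the push; IP options lower the threshold by their length.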




* Re: [PATCH] sendfile() and UDP socket
  2008-09-14 10:25 [PATCH] sendfile() and UDP socket Johann Baudy
@ 2008-09-16  4:17 ` Simon Horman
  2008-09-16  4:24   ` Simon Horman
  2008-09-16 12:01 ` Hirokazu Takahashi
  2008-09-21  8:04 ` David Miller
  2 siblings, 1 reply; 29+ messages in thread
From: Simon Horman @ 2008-09-16  4:17 UTC (permalink / raw)
  To: Johann Baudy; +Cc: netdev, David Miller

On Sun, Sep 14, 2008 at 12:25:56PM +0200, Johann Baudy wrote:
> Hi All,
> 
> sendfile() over a UDP socket is currently limited to files of about 64 KB (the maximum cork.length).
> If you run sendfile() on a file larger than 64 KB over a UDP socket, the system call stops and returns about 64 KB without sending anything on the network.
> This patch pushes the pending frames when the frame buffer is full, to prevent overflow.
> 
> Signed-off-by: Johann Baudy <johann.baudy@gmail.com>
> 
>  net/ipv4/udp.c |   15 +++++++++++++++
>  1 files changed, 15 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 8e42fbb..64e0857 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -743,7 +743,22 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
>  		 size_t size, int flags)
>  {
>  	struct udp_sock *up = udp_sk(sk);
> +	struct inet_sock *inet = inet_sk(sk);
>  	int ret;
> +	int fragheaderlen;
> +	struct ip_options *opt = NULL;
> +
> +	lock_sock(sk);
> +	if (inet->cork.flags & IPCORK_OPT)
> +		opt = inet->cork.opt;
> +	fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
> +
> +	if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
> +		ret = udp_push_pending_frames(sk);
> +		if (ret)
> +			goto out;
> +	}
> +	release_sock(sk);
>  
>  	if (!up->pending) {
>  		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };

Hi,

I wonder if it would be slightly nicer to do without the opt variable.
I _think_ it's safe to access inet->cork.opt->optlen based
on the (inet->cork.flags & IPCORK_OPT) check.

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 8e42fbb..969f6cd 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -743,7 +743,21 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
 		 size_t size, int flags)
 {
 	struct udp_sock *up = udp_sk(sk);
+	struct inet_sock *inet = inet_sk(sk);
 	int ret;
+	int fragheaderlen;
+
+	fragheaderlen = sizeof(struct iphdr);
+	lock_sock(sk);
+	if (inet->cork.flags & IPCORK_OPT)
+		fragheaderlen += inet->cork.opt->optlen;
+
+	if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
+		ret = udp_push_pending_frames(sk);
+		if (ret)
+			goto out;
+	}
+	release_sock(sk);
 
 	if (!up->pending) {
 		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en



* Re: [PATCH] sendfile() and UDP socket
  2008-09-16  4:17 ` Simon Horman
@ 2008-09-16  4:24   ` Simon Horman
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Horman @ 2008-09-16  4:24 UTC (permalink / raw)
  To: Johann Baudy; +Cc: netdev, David Miller

On Tue, Sep 16, 2008 at 02:17:05PM +1000, Simon Horman wrote:
> On Sun, Sep 14, 2008 at 12:25:56PM +0200, Johann Baudy wrote:
> > Hi All,
> > 
> > sendfile() over a UDP socket is currently limited to files of about 64 KB (the maximum cork.length).
> > If you run sendfile() on a file larger than 64 KB over a UDP socket, the system call stops and returns about 64 KB without sending anything on the network.
> > This patch pushes the pending frames when the frame buffer is full, to prevent overflow.
> > 
> > Signed-off-by: Johann Baudy <johann.baudy@gmail.com>
> > 
> >  net/ipv4/udp.c |   15 +++++++++++++++
> >  1 files changed, 15 insertions(+), 0 deletions(-)
> > 
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index 8e42fbb..64e0857 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -743,7 +743,22 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
> >  		 size_t size, int flags)
> >  {
> >  	struct udp_sock *up = udp_sk(sk);
> > +	struct inet_sock *inet = inet_sk(sk);
> >  	int ret;
> > +	int fragheaderlen;
> > +	struct ip_options *opt = NULL;
> > +
> > +	lock_sock(sk);
> > +	if (inet->cork.flags & IPCORK_OPT)
> > +		opt = inet->cork.opt;
> > +	fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
> > +
> > +	if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
> > +		ret = udp_push_pending_frames(sk);
> > +		if (ret)
> > +			goto out;
> > +	}
> > +	release_sock(sk);
> >  
> >  	if (!up->pending) {
> >  		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };
> 
> Hi,
> 
> I wonder if it would be slightly nicer to do without the opt variable.
> I _think_ it's safe to access inet->cork.opt->optlen based
> on the (inet->cork.flags & IPCORK_OPT) check.
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 8e42fbb..969f6cd 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -743,7 +743,21 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
>  		 size_t size, int flags)
>  {
>  	struct udp_sock *up = udp_sk(sk);
> +	struct inet_sock *inet = inet_sk(sk);
>  	int ret;
> +	int fragheaderlen;

Also, I wonder if this should be an unsigned int
as both sizeof(struct iphdr) and inet->cork.opt->optlen are unsigned,
though of different width.

> +
> +	fragheaderlen = sizeof(struct iphdr);
> +	lock_sock(sk);
> +	if (inet->cork.flags & IPCORK_OPT)
> +		fragheaderlen += inet->cork.opt->optlen;
> +
> +	if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
> +		ret = udp_push_pending_frames(sk);
> +		if (ret)
> +			goto out;
> +	}
> +	release_sock(sk);
>  
>  	if (!up->pending) {
>  		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };
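
For the signed-versus-unsigned question, the implicit conversions in the comparison can be written out explicitly. This is a user-space sketch with hypothetical function names; it assumes, as in the patch, that cork.length is an int and size is a size_t.

```c
#include <stddef.h>

/*
 * fragheaderlen declared as int: the right-hand side is computed as an
 * int and then converted to size_t, because the left-hand side is size_t.
 */
int check_signed(int cork_length, size_t size, int fragheaderlen)
{
	return (size_t)cork_length + size >= (size_t)(0xFFFF - fragheaderlen);
}

/*
 * fragheaderlen declared as unsigned int: the subtraction is already
 * unsigned, so no signed intermediate value is involved.
 */
int check_unsigned(int cork_length, size_t size, unsigned int fragheaderlen)
{
	return (size_t)cork_length + size >= (size_t)(0xFFFFu - fragheaderlen);
}
```

For IPv4 header lengths between 20 and 60 bytes both variants give the same result, so under these assumptions the choice of type looks like a matter of clarity rather than behaviour.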

-- 
Simon Horman
  VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
  H: www.vergenet.net/~horms/             W: www.valinux.co.jp/en



* Re: [PATCH] sendfile() and UDP socket
  2008-09-14 10:25 [PATCH] sendfile() and UDP socket Johann Baudy
  2008-09-16  4:17 ` Simon Horman
@ 2008-09-16 12:01 ` Hirokazu Takahashi
  2008-09-18 17:31   ` Rémi Denis-Courmont
  2008-09-21  8:04 ` David Miller
  2 siblings, 1 reply; 29+ messages in thread
From: Hirokazu Takahashi @ 2008-09-16 12:01 UTC (permalink / raw)
  To: johaahn; +Cc: netdev, davem

Hi, Johann,

> Hi All,
> 
> sendfile() over a UDP socket is currently limited to files of about 64 KB (the maximum cork.length).
> If you run sendfile() on a file larger than 64 KB over a UDP socket, the system call stops and returns about 64 KB without sending anything on the network.
> This patch pushes the pending frames when the frame buffer is full, to prevent overflow.

UDP is a datagram protocol, so I think applications using UDP should
care about the size of the packets they are going to send rather than
expecting that the messages will be split into several packets automatically.
If some of the packets are lost, it will be really hard for the
applications to re-create the same ones to send again.

You can pass the "offset" within the file and the "count" of bytes
to be sent to the sendfile() system call, so you can split the file into
several pieces and send each of them.

If you want to send a large file over UDP, the typical code will look like:

while (...) {
        setsockopt(fd, UDP_CORK, 1);
        sendmsg(fd, &apl_header, sizeof(apl_header));
        offset += sendfile(fd, offset, count);
        setsockopt(fd, UDP_CORK, 0);
}

or:

while (...) {
        sendmsg(fd, &apl_header, sizeof(apl_header), MSG_MORE);
        offset += sendfile(fd, offset, count);
}
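
The MSG_MORE variant can be fleshed out into something compilable. This is a rough sketch for Linux, not a definitive implementation: struct apl_header, send_file_in_datagrams() and loopback_demo() are hypothetical names, the header layout is invented for illustration, and the chunk size plus header is assumed to stay below the UDP payload limit.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical application header that prefixes every datagram. */
struct apl_header {
	unsigned int seq;	/* datagram sequence number */
	unsigned int len;	/* payload bytes that follow */
};

/*
 * Send file_size bytes of file_fd over the connected UDP socket sock,
 * one datagram per chunk. Returns the number of datagrams sent, or -1.
 */
long send_file_in_datagrams(int sock, int file_fd, off_t file_size,
			    size_t chunk)
{
	off_t offset = 0;
	long sent = 0;

	while (offset < file_size) {
		off_t left = file_size - offset;
		size_t count = left < (off_t)chunk ? (size_t)left : chunk;
		struct apl_header hdr = { (unsigned int)sent, (unsigned int)count };

		/* MSG_MORE corks the header until the payload is appended. */
		if (send(sock, &hdr, sizeof(hdr), MSG_MORE) < 0)
			return -1;
		/* sendfile() advances offset itself; no "offset +=" is needed. */
		if (sendfile(sock, file_fd, &offset, count) != (ssize_t)count)
			return -1;
		sent++;
	}
	return sent;
}

/* Loopback demo: returns how many datagrams arrived with the expected size. */
int loopback_demo(void)
{
	static char data[3000];
	char buf[2048];
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	struct timeval tv = { 2, 0 };	/* do not block forever in recv() */
	int rx = socket(AF_INET, SOCK_DGRAM, 0);
	int tx = socket(AF_INET, SOCK_DGRAM, 0);
	FILE *f = tmpfile();
	size_t left = sizeof(data);
	int ok = 0;

	if (rx < 0 || tx < 0 || !f)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);	/* port 0: kernel picks */
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) < 0 ||
	    connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
		return -1;
	memset(data, 'x', sizeof(data));
	if (fwrite(data, 1, sizeof(data), f) != sizeof(data) || fflush(f) != 0)
		return -1;
	if (send_file_in_datagrams(tx, fileno(f), (off_t)sizeof(data), 1024) != 3)
		return -1;
	while (left > 0) {
		size_t want = left < 1024 ? left : 1024;
		ssize_t n = recv(rx, buf, sizeof(buf), 0);

		if (n != (ssize_t)(sizeof(struct apl_header) + want))
			break;
		left -= want;
		ok++;
	}
	close(rx);
	close(tx);
	fclose(f);
	return ok;
}
```

Each loop iteration produces exactly one datagram, so the application, not the kernel, decides where the datagram boundaries fall.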


> Signed-off-by: Johann Baudy <johann.baudy@gmail.com>
> 
>  net/ipv4/udp.c |   15 +++++++++++++++
>  1 files changed, 15 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 8e42fbb..64e0857 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -743,7 +743,22 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
>  		 size_t size, int flags)
>  {
>  	struct udp_sock *up = udp_sk(sk);
> +	struct inet_sock *inet = inet_sk(sk);
>  	int ret;
> +	int fragheaderlen;
> +	struct ip_options *opt = NULL;
> +
> +	lock_sock(sk);
> +	if (inet->cork.flags & IPCORK_OPT)
> +		opt = inet->cork.opt;
> +	fragheaderlen = sizeof(struct iphdr) + (opt ? opt->optlen : 0);
> +
> +	if (inet->cork.length + size >= 0xFFFF - fragheaderlen) {
> +		ret = udp_push_pending_frames(sk);
> +		if (ret)
> +			goto out;
> +	}
> +	release_sock(sk);
>  
>  	if (!up->pending) {
>  		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };


* Re: [PATCH] sendfile() and UDP socket
  2008-09-16 12:01 ` Hirokazu Takahashi
@ 2008-09-18 17:31   ` Rémi Denis-Courmont
  2008-09-19 12:28     ` Hirokazu Takahashi
  0 siblings, 1 reply; 29+ messages in thread
From: Rémi Denis-Courmont @ 2008-09-18 17:31 UTC (permalink / raw)
  To: Hirokazu Takahashi, johaahn; +Cc: netdev

Le mardi 16 septembre 2008 15:01:17 Hirokazu Takahashi, vous avez écrit :
> UDP is a datagram protocol, so I think applications using UDP should
> care about the size of packets they are going to send rather than
> expecting that the messages will be split into several packets
> automatically. If some of the packets are lost, it will be really hard for
> the applications to re-create the same ones to send again.

Also, why use UDP for this... If you want stream semantics, why not use TCP or 
SCTP instead?

> If you want to send a large file over UDP, the typical code will look like:

> while (...) {
>         sendmsg(fd, &apl_header, sizeof(apl_header), MSG_MORE);
>         offset += sendfile(fd, offset, count);
> }

Correct me if I am wrong, but... Unless you have a big MTU (as _not_ in 1500 
bytes :D), doing an extra syscall might be slower than copying data in a 
single vectorized sendmsg() syscall.
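
For comparison, the single vectorized sendmsg() alternative could look roughly like this. It is a sketch only; send_header_and_payload() is a hypothetical helper, and the payload is assumed to already sit in a user-space buffer, so the kernel copies it.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send header and payload as one datagram with a single sendmsg() call.
 * The kernel copies both buffers; no MSG_MORE or second syscall is needed.
 * Returns the number of bytes sent, or -1 with errno set.
 */
ssize_t send_header_and_payload(int sock, const void *hdr, size_t hdr_len,
				const void *payload, size_t payload_len)
{
	struct iovec iov[2];
	struct msghdr msg = { 0 };

	iov[0].iov_base = (void *)hdr;		/* sendmsg() only reads from it */
	iov[0].iov_len = hdr_len;
	iov[1].iov_base = (void *)payload;
	iov[1].iov_len = payload_len;

	msg.msg_iov = iov;
	msg.msg_iovlen = 2;

	return sendmsg(sock, &msg, 0);
}
```

The trade-off in question is one syscall plus a copy here, against two syscalls and no payload copy with send(MSG_MORE) followed by sendfile().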

-- 
Rémi Denis-Courmont
http://www.remlab.net/


* Re: [PATCH] sendfile() and UDP socket
  2008-09-18 17:31   ` Rémi Denis-Courmont
@ 2008-09-19 12:28     ` Hirokazu Takahashi
  2008-09-19 13:14       ` Rémi Denis-Courmont
  0 siblings, 1 reply; 29+ messages in thread
From: Hirokazu Takahashi @ 2008-09-19 12:28 UTC (permalink / raw)
  To: rdenis; +Cc: johaahn, netdev

Hi,

> > UDP is a datagram protocol, so I think applications using UDP should
> > care about the size of packets they are going to send rather than
> > expecting that the messages will be split into several packets
> > automatically. If some of the packets are lost, it will be really hard for
> > the applications to re-create the same ones to send again.
> 
> Also, why use UDP for this... If you want stream semantics, why not use TCP or 
> SCTP instead?

I think a lot of VoIP and video streaming services are working on UDP.
Linux NFS over UDP also uses this feature.

> > If you want to send a large file over UDP, the typical code will look like:
> 
> > while (...) {
> >         sendmsg(fd, &apl_header, sizeof(apl_header), MSG_MORE);
> >         offset += sendfile(fd, offset, count);
> > }
> 
> Correct me if I am wrong, but... Unless you have a big MTU (as _not_ in 1500 
> bytes :D), doing an extra syscall might be slower than copying data in a 
> single vectorized sendmsg() syscall.

That's not true.
Even if the MTU is small, you can send a UDP message of up to 64KB
at once. It will be split into several IP packets without
any copies.

Moreover, copying data pollutes the cache considerably, which cannot
be ignored if you want to send large amounts of file data.


Thanks,
Hirokazu Takahashi.


* Re: [PATCH] sendfile() and UDP socket
  2008-09-19 12:28     ` Hirokazu Takahashi
@ 2008-09-19 13:14       ` Rémi Denis-Courmont
  0 siblings, 0 replies; 29+ messages in thread
From: Rémi Denis-Courmont @ 2008-09-19 13:14 UTC (permalink / raw)
  To: Hirokazu Takahashi; +Cc: johaahn, netdev

Le vendredi 19 septembre 2008 15:28:54 Hirokazu Takahashi, vous avez écrit :
> > > UDP is a datagram protocol, so I think applications using UDP should
> > > care about the size of packets they are going to send rather than
> > > expecting that the messages will be split into several packets
> > > automatically. If some of the packets are lost, it will be really hard
> > > for the applications to re-create the same ones to send again.
> >
> > Also, why use UDP for this... If you want stream semantics, why not use
> > TCP or SCTP instead?
>
> I think a lot of VoIP and video streaming services are working on UDP.

VoIP uses lots of small packets, considering the typical packetization times. 
Using send(MSG_MORE) + sendfile would definitely be slower than a single 
sendmsg() in such case, because the per-packet memcpy() will be quite short.

Video streaming typically does send lots of large packets, and might well read 
the data from a mmap-able file. But video streaming protocols such as RTP 
typically try to avoid fragmentation, so large sendfile() won't work.

> > > If you want to send a large file over UDP, the typical code will look like:
> > >
> > > while (...) {
> > >         sendmsg(fd, &apl_header, sizeof(apl_header), MSG_MORE);
> > >         offset += sendfile(fd, offset, count);
> > > }
> >
> > Correct me if I am wrong, but... Unless you have a big MTU (as _not_ in
> > 1500 bytes :D), doing an extra syscall might be slower than copying data
> > in a single vectorized sendmsg() syscall.
>
> That's not true.
> Even if the MTU is small, you can send a UDP message of up to 64KB
> at once. It will be split into several IP packets without
> any copies.

And you will encounter fragmentation, which sucks at high data rates.
As for low rates, you typically would not bother to optimize memcpy. But more 
importantly, you should not assume that the other end is only talking to 
you - it might as well be talking to many different people and not appreciate 
if all of them send lots of fragments.

And let's not get into how poorly fragmentation works through middleboxes...

> Moreover, copying data pollutes the cache considerably, which cannot
> be ignored if you want to send large amounts of file data.

-- 
Rémi Denis-Courmont
http://www.remlab.net/


* Re: [PATCH] sendfile() and UDP socket
  2008-09-14 10:25 [PATCH] sendfile() and UDP socket Johann Baudy
  2008-09-16  4:17 ` Simon Horman
  2008-09-16 12:01 ` Hirokazu Takahashi
@ 2008-09-21  8:04 ` David Miller
  2008-09-22  0:21   ` Evgeniy Polyakov
  2 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2008-09-21  8:04 UTC (permalink / raw)
  To: johaahn; +Cc: netdev

From: Johann Baudy <johaahn@gmail.com>
Date: Sun, 14 Sep 2008 12:25:56 +0200

> sendfile() over a UDP socket is currently limited to files of about
> 64 KB (the maximum cork.length).  If you run sendfile() on a file
> larger than 64 KB over a UDP socket, the system call stops and returns
> about 64 KB without sending anything on the network.  This patch pushes
> the pending frames when the frame buffer is full, to prevent overflow.
>
> Signed-off-by: Johann Baudy <johann.baudy@gmail.com>

Applications which work over datagram protocols must perform their own
segmentation.  It is not like doing a send over a stream protocol like
TCP, where you can use whatever length you want for send calls and
segmentation is done for the application.

If you look, this is what things like NFS using SUNRPC over UDP do.
They have a transmission unit for the data transfer and use that for
each "send".

A sendfile() with length >= 64K is the same as a sendmsg() with such a
length, which is defined as:

	if (len > 0xFFFF)
		return -EMSGSIZE;

So we could technically even return an error for this sendfile() over
UDP case.
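
The sendmsg() length limit can be observed from user space. The sketch below assumes Linux and an IPv4 UDP socket; udp_send_errno() is a hypothetical helper, and the destination (127.0.0.1, port 9) is an arbitrary choice.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to send one UDP datagram of len bytes to 127.0.0.1:9.
 * Returns 0 on success, or the errno value of the failing call.
 */
int udp_send_errno(size_t len)
{
	static char buf[70000];		/* zero-filled payload */
	struct sockaddr_in dst;
	int fd, err = 0;

	if (len > sizeof(buf))
		return EINVAL;
	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return errno;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(9);
	dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	/* A length above 0xFFFF is rejected before anything is queued. */
	if (sendto(fd, buf, len, 0, (struct sockaddr *)&dst, sizeof(dst)) < 0)
		err = errno;
	close(fd);
	return err;
}
```

Calling udp_send_errno(70000) is expected to yield EMSGSIZE, which is the error a sendfile() of the same length over UDP could arguably return as well.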


* Re: [PATCH] sendfile() and UDP socket
  2008-09-21  8:04 ` David Miller
@ 2008-09-22  0:21   ` Evgeniy Polyakov
  2008-09-22  0:44     ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-22  0:21 UTC (permalink / raw)
  To: David Miller; +Cc: johaahn, netdev

On Sun, Sep 21, 2008 at 01:04:58AM -0700, David Miller (davem@davemloft.net) wrote:
> Applications which work over datagram protocols must perform their own
> segmentation.  It is not like doing a send over a stream protocol like
> TCP, where you can use whatever length you want for send calls and
> segmentation is done for the application.

But isn't the whole idea of sendfile() to send a file no matter
what the underlying medium is?

> If you look, this is what things like NFS using SUNRPC over UDP do.
> They have a transmission unit for the data transfer and use that for
> each "send".

Maybe that's because udp_sendpage() does not support sending pending
data if the new packet is too big to attach?

-- 
	Evgeniy Polyakov


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  0:21   ` Evgeniy Polyakov
@ 2008-09-22  0:44     ` David Miller
  2008-09-22  1:08       ` Evgeniy Polyakov
  0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2008-09-22  0:44 UTC (permalink / raw)
  To: johnpol; +Cc: johaahn, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Mon, 22 Sep 2008 04:21:57 +0400

> On Sun, Sep 21, 2008 at 01:04:58AM -0700, David Miller (davem@davemloft.net) wrote:
> > Applications which work over datagram protocols must perform their own
> > segmentation.  It is not like doing a send over a stream protocol like
> > TCP, where you can use whatever length you want for send calls and
> > segmentation is done for the application.
> 
> But isn't the whole idea of sendfile() to send a file no matter
> what the underlying medium is?

It's a way to fabricate a send() directly from the page cache.

> Maybe that's because udp_sendpage() does not support sending pending
> data if the new packet is too big to attach?

I don't think so.

It's simply enforcing the wsize/rsize that's configured for the mount.
And this is exactly deciding what the UDP segment size should be.


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  0:44     ` David Miller
@ 2008-09-22  1:08       ` Evgeniy Polyakov
  2008-09-22  2:07         ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-22  1:08 UTC (permalink / raw)
  To: David Miller; +Cc: johaahn, netdev

On Sun, Sep 21, 2008 at 05:44:50PM -0700, David Miller (davem@davemloft.net) wrote:
> > > Applications which work over datagram protocols must perform their own
> > > segmentation.  It is not like doing a send over a stream protocol like
> > > TCP, where you can use whatever length you want for send calls and
> > > segmentation is done for the application.
> > 
> > But isn't the whole idea of sendfile() to send a file no matter
> > what the underlying medium is?
> 
> It's a way to fabricate a send() directly from the page cache.

And to send exactly the required number of bytes (or the size of the cache)?
To send a single page (combined with several other pages) we have the simple
->sendpage() callback, which should not return an error when it is asked to
send data and can do so by actually submitting two packets, without
special TCP-like processing of the segments.

> > Maybe that's because udp_sendpage() does not support sending pending
> > data if the new packet is too big to attach?
> 
> I don't think so.
> 
> It's simply enforcing the wsize/rsize that's configured for the mount.
> And this is exactly deciding what the UDP segment size should be.

And what if it is just a result of knowing how udp_sendpage()
behaves, so the code adjusts packet sizes in advance?

-- 
	Evgeniy Polyakov


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  1:08       ` Evgeniy Polyakov
@ 2008-09-22  2:07         ` David Miller
  2008-09-22  4:19           ` Evgeniy Polyakov
  0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2008-09-22  2:07 UTC (permalink / raw)
  To: johnpol; +Cc: johaahn, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Mon, 22 Sep 2008 05:08:34 +0400

> On Sun, Sep 21, 2008 at 05:44:50PM -0700, David Miller (davem@davemloft.net) wrote:
> > > > Applications which work over datagram protocols must perform their own
> > > > segmentation.  It is not like doing a send over a stream protocol like
> > > > TCP, where you can use whatever length you want for send calls and
> > > > segmentation is done for the application.
> > > 
> > > But isn't the whole idea of sendfile() to send a file no matter
> > > what the underlying medium is?
> > 
> > It's a way to fabricate a send() directly from the page cache.
> 
> And to send exactly the required number of bytes (or the size of the cache)?
> To send a single page (combined with several other pages) we have the simple
> ->sendpage() callback, which should not return an error when it is asked to
> send data and can do so by actually submitting two packets, without
> special TCP-like processing of the segments.

You're basically throwing away the difference between datagram and stream
socket semantics.

I don't see what else I can explain if you cannot see that this is
significant.


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  2:07         ` David Miller
@ 2008-09-22  4:19           ` Evgeniy Polyakov
  2008-09-22  4:27             ` David Miller
  2008-09-25 13:03             ` Using skb_get() to recycle skbs Ram.Natarajan
  0 siblings, 2 replies; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-22  4:19 UTC (permalink / raw)
  To: David Miller; +Cc: johaahn, netdev

On Sun, Sep 21, 2008 at 07:07:15PM -0700, David Miller (davem@davemloft.net) wrote:
> > And to send exactly the required number of bytes (or the size of the cache)?
> > To send a single page (combined with several other pages) we have the simple
> > ->sendpage() callback, which should not return an error when it is asked to
> > send data and can do so by actually submitting two packets, without
> > special TCP-like processing of the segments.
> 
> You're basically throwing away the difference between datagram and stream
> socket semantics.
> 
> I don't see what else I can explain if you cannot see that this is
> significant.

Hey David, that's going in the wrong direction :)
Do not make a decision based on what you read or decided to think
beforehand, instead of making things clear.

A stream socket means that whatever data we put into it becomes
completely boundary-free, in the sense that the receiving side will not
be able to recover the original sending packet sizes (without too much effort).

A datagram socket just preserves the boundaries, and that's what we have with this
patch. Previously we accumulated a single segment up to a predefined size
and sent it when it was complete. Now we are able to send it when it is
complete and start creating a new segment without returning an error.

Like:
previously: A + A + A + A + send + return
now: A + A + A + A + send + A + A and so on, so effectively nothing
changes except maybe the time when the segment is sent: previously it was
at the full size, and now it is when a new packet does not fit within that
predefined size.
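
The two sequences can be compared with a toy model. This is a user-space sketch of the sequences as written above, not of the kernel's actual code paths: model_sendfile(), struct result and the constants are hypothetical, and each "A" is assumed to be a page-sized piece appended to the corked datagram.

```c
#include <stddef.h>

#define MAX_DGRAM 0xFFFFu	/* IPv4 total-length limit */
#define IP_HDR    20u		/* assumed: IPv4 header without options */
#define UDP_HDR   8u

struct result {
	size_t bytes_accepted;	/* bytes the call would report as handled */
	unsigned int datagrams;	/* datagrams handed to the network */
};

/*
 * Append piece-sized chunks to a corked datagram. When the next chunk
 * would not fit, the segment is sent; with push_when_full the loop then
 * starts a new segment ("now"), otherwise it returns ("previously").
 */
struct result model_sendfile(size_t file_size, size_t piece, int push_when_full)
{
	struct result r = { 0, 0 };
	size_t limit = MAX_DGRAM - IP_HDR;
	size_t pending = 0;	/* payload bytes corked in the current segment */

	if (piece == 0 || UDP_HDR + piece >= limit)
		return r;	/* outside what this toy model covers */

	while (r.bytes_accepted < file_size) {
		size_t n = file_size - r.bytes_accepted;

		if (n > piece)
			n = piece;
		if (UDP_HDR + pending + n >= limit) {
			r.datagrams++;		/* segment complete: send */
			pending = 0;
			if (!push_when_full)
				return r;	/* "send + return" */
		}
		pending += n;
		r.bytes_accepted += n;
	}
	if (pending > 0)
		r.datagrams++;	/* final, partly filled segment */
	return r;
}
```

For a file of twenty 4096-byte pages, the model stops after fifteen pages in the first case and emits two datagrams covering all twenty pages in the second.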

-- 
	Evgeniy Polyakov


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  4:19           ` Evgeniy Polyakov
@ 2008-09-22  4:27             ` David Miller
  2008-09-22  4:40               ` Evgeniy Polyakov
  2008-09-25 13:03             ` Using skb_get() to recycle skbs Ram.Natarajan
  1 sibling, 1 reply; 29+ messages in thread
From: David Miller @ 2008-09-22  4:27 UTC (permalink / raw)
  To: johnpol; +Cc: johaahn, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Mon, 22 Sep 2008 08:19:29 +0400

> A stream socket means that whatever data we put into it becomes
> completely boundary-free, in the sense that the receiving side will not
> be able to recover the original sending packet sizes (without too much effort).

Right.

> A datagram socket just preserves the boundaries, and that's what we have with this
> patch.

Not exactly.

Each and every send() operation on a datagram socket must result in
exactly one packet.  sendfile() was following this rule, when it
returned successfully only one single packet was emitted.

The new sendfile() semantics are outside of this model.


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  4:27             ` David Miller
@ 2008-09-22  4:40               ` Evgeniy Polyakov
  2008-09-22  5:06                 ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-22  4:40 UTC (permalink / raw)
  To: David Miller; +Cc: johaahn, netdev

On Sun, Sep 21, 2008 at 09:27:24PM -0700, David Miller (davem@davemloft.net) wrote:
> > A datagram socket just preserves the boundaries, and that's what we have with this
> > patch.
> 
> Not exactly.
> 
> Each and every send() operation on a datagram socket must result in
> exactly one packet.  sendfile() was following this rule, when it
> returned successfully only one single packet was emitted.
>
> The new sendfile() semantics are outside of this model.

That's for send(), not sendfile(). The latter now works like lots of
send()s, while previously it worked as a single send().

Why should sendfile() be completely different compared to the stream case?
Obviously send() has to be different, and it is, but sendfile() is just a
bunch of sends over the pages in the cache, so let's allow it to be
multiple sends and not only a single-packet send.

-- 
	Evgeniy Polyakov


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  4:40               ` Evgeniy Polyakov
@ 2008-09-22  5:06                 ` David Miller
  2008-09-22  5:49                   ` Evgeniy Polyakov
  0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2008-09-22  5:06 UTC (permalink / raw)
  To: johnpol; +Cc: johaahn, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Mon, 22 Sep 2008 08:40:52 +0400

> Why should sendfile() be completely different compared to the stream case?

Because datagram sockets are completely different from stream sockets.

You program them differently: segmentation is done by the socket user,
not by the protocol itself.

sendfile() should behave in a way congruent to the other data transfer
APIs of the BSD socket layer.


* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  5:06                 ` David Miller
@ 2008-09-22  5:49                   ` Evgeniy Polyakov
  2008-09-22  6:54                     ` David Miller
  0 siblings, 1 reply; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-22  5:49 UTC (permalink / raw)
  To: David Miller; +Cc: johaahn, netdev

On Sun, Sep 21, 2008 at 10:06:05PM -0700, David Miller (davem@davemloft.net) wrote:
> Because datagram sockets are completely different from stream sockets.
> 
> You program them differently: segmentation is done by the socket user,
> not by the protocol itself.
> 
> sendfile() should behave in a way congruent to the other data transfer
> APIs of the BSD socket layer.

So effectively you are saying that sendfile() is just a pure send(),
but with different arguments? I.e. it is not supposed to send the whole
data it points to, but as much as possible according to the send() standard?

Well, that may be a right or wrong decision, and in my opinion sendfile()
is very different from send(), since it should require the whole data to
be sent, i.e. be like a loop of send()s. But since sendfile() is
effectively a very new approach, it could have different behaviour
rules.

Because of this, Johann, this patch will not be applied, but thanks a
lot for your work; it made the intended usage of the interface clear.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  5:49                   ` Evgeniy Polyakov
@ 2008-09-22  6:54                     ` David Miller
  2008-09-22  7:04                       ` Evgeniy Polyakov
  0 siblings, 1 reply; 29+ messages in thread
From: David Miller @ 2008-09-22  6:54 UTC (permalink / raw)
  To: johnpol; +Cc: johaahn, netdev

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Mon, 22 Sep 2008 09:49:20 +0400

> On Sun, Sep 21, 2008 at 10:06:05PM -0700, David Miller (davem@davemloft.net) wrote:
> > Because datagram sockets are completely different from stream sockets.
> > 
> > You program them differently, segmentation is made by the socket user
> > not within by the protocol itself.
> > 
> > sendfile() should behave in a way congruent to the other data transfer
> > APIs of the BSD socket layer.
> 
> So effectively you are saying, that sendfile() is just a pure send(),
> > but with different arguments? I.e. it is not supposed to send the whole
> data it points to, but as much as possible according to send() standard?

It already does this. Guess what happens when a socket error occurs
midstream during sendfile(), even for TCP?

First, we return the length successfully sent.

The user has to retry the sendfile() call with the remaining length, and
at this point they will immediately get the socket error return value.

This was broken at one point and I remember applying the fix for this
several years ago :-)

> Well, it may be a right or wrong decision, and in my opinion sendfile()
> > is very different than send() since it should require the whole data to
> be sent, i.e. being like a loop of send()s, but since sendfile() is
> effectively a very new approach, it could have different behaviour
> rules.

It is in fact exactly and precisely like send().  It must even give
the same error return semantics as other socket data transfer calls
do.  See above.

The user has to have a resending loop _anyways_, in order to do
correct error handling.

Nothing is gained from the proposal, really.  It can only mislead application
developers into thinking that segmentation over datagram sockets is not
their responsibility, when it absolutely is.
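
A minimal userspace sketch of the resending loop described above, assuming Linux sendfile(2). The helper name sendfile_all() and its out-parameter are hypothetical and not from this thread. It reports how many bytes were transferred before any error, so a short transfer followed by an error on the next call is visible to the caller:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/sendfile.h>

/*
 * Hypothetical helper (not part of any kernel or libc API): call
 * sendfile() repeatedly until `count` bytes are transferred, end of
 * file is reached, or an error occurs.
 *
 * Returns 0 on success or end of file, -1 on error (errno is the one
 * set by sendfile()). *sent always holds the bytes transferred so far.
 */
int sendfile_all(int out_fd, int in_fd, off_t *offset,
                 size_t count, size_t *sent)
{
    assert(sent != NULL);
    *sent = 0;

    while (*sent < count) {
        ssize_t n = sendfile(out_fd, in_fd, offset, count - *sent);

        if (n > 0) {
            /* Short transfer: account for it and retry the rest. */
            *sent += (size_t)n;
            continue;
        }
        if (n == 0)
            break;          /* end of input file */
        if (errno == EINTR)
            continue;       /* interrupted before any data moved */
        return -1;          /* real error, reported on the retry */
    }
    return 0;
}
```

The caller checks both the return value and *sent: a -1 with a non-zero *sent means part of the data went out before the error was reported.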

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  6:54                     ` David Miller
@ 2008-09-22  7:04                       ` Evgeniy Polyakov
  2008-09-23  4:54                         ` Herbert Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-22  7:04 UTC (permalink / raw)
  To: David Miller; +Cc: johaahn, netdev

On Sun, Sep 21, 2008 at 11:54:44PM -0700, David Miller (davem@davemloft.net) wrote:
> > So effectively you are saying, that sendfile() is just a pure send(),
> > > but with different arguments? I.e. it is not supposed to send the whole
> > data it points to, but as much as possible according to send() standard?
> 
> It already makes this, guess what happens when socket error occurs
> midstream during sendfile(), even for TCP?
> 
> First, we return length successfully sent.
> 
> User has to retry sendfile() call with remaining length, and at this
> point they will immediately get the socket error return value.
> 
> This was broken at one point and I remember applying the fix for this
> several years ago :-)

Well, TCP errors are still completely different from UDP ones: the
former only result from a real problem or an interrupt/timeout in the
simple case. UDP adds an 'it is too big to be transferred' one. And instead
of just sending the previous frame and starting a new one, we return an
error so that the user can do the same with an additional syscall.

> > Well, it may be a right or wrong decision, and in my opinion sendfile()
> > > is very different than send() since it should require the whole data to
> > be sent, i.e. being like a loop of send()s, but since sendfile() is
> > effectively a very new approach, it could have different behaviour
> > rules.
> 
> It is in fact exactly and precisely like send().  It must even give
> the same error return semantics as other socket data transfer calls
> do.  See above.

That's the main difference in how you and I look at sendfile().

> The user has to have a resending loop _anyways_, in order to do
> correct error handling.
> 
> Nothing is gained from the proposal, really.  It can only harm application
> developers into thinking that segmentation over datagram sockets is not
> their responsibility, when it absolutely is.

Which basically means that sendfile() for UDP is exactly send() of the
mapped data. And it is not about segmentation, since packets are
correctly segmented with this proposal: each packet is rounded to the
submitted page. This is about optimization compared to send() of the
data. You are basically drawing a line and saying that with UDP it is
impossible to do.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-22  7:04                       ` Evgeniy Polyakov
@ 2008-09-23  4:54                         ` Herbert Xu
  2008-09-23  6:27                           ` Evgeniy Polyakov
  0 siblings, 1 reply; 29+ messages in thread
From: Herbert Xu @ 2008-09-23  4:54 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: davem, johaahn, netdev

Evgeniy Polyakov <johnpol@2ka.mipt.ru> wrote:
> On Sun, Sep 21, 2008 at 11:54:44PM -0700, David Miller (davem@davemloft.net) wrote:
>
>> It is in fact exactly and precisely like send().  It must even give
>> the same error return semantics as other socket data transfer calls
>> do.  See above.
> 
> That's the main difference on how you and me look at sendfile().

I think this discussion is moot since the only time you want to use
sendfile is for bulk transfers, and these days anybody designing new
applications that do bulk transfers over UDP should be taken out
and shot.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-23  4:54                         ` Herbert Xu
@ 2008-09-23  6:27                           ` Evgeniy Polyakov
  2008-09-23  7:01                             ` Herbert Xu
  0 siblings, 1 reply; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-23  6:27 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, johaahn, netdev

On Tue, Sep 23, 2008 at 12:54:27PM +0800, Herbert Xu (herbert@gondor.apana.org.au) wrote:
> I think this discussion is moot since the only time you want to use
> sendfile is for bulk transfers and these days anybody designing new
> applications that does bulk transfers over UDP should be taken out
> and shot.

One can defend oneself by pointing out how slow the memory bus may be in
some hardware setups, which does not allow performing any copy at all.

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-23  6:27                           ` Evgeniy Polyakov
@ 2008-09-23  7:01                             ` Herbert Xu
  2008-09-23  7:07                               ` Evgeniy Polyakov
  2008-09-24  4:53                               ` Bill Fink
  0 siblings, 2 replies; 29+ messages in thread
From: Herbert Xu @ 2008-09-23  7:01 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: davem, johaahn, netdev

On Tue, Sep 23, 2008 at 10:27:10AM +0400, Evgeniy Polyakov wrote:
> On Tue, Sep 23, 2008 at 12:54:27PM +0800, Herbert Xu (herbert@gondor.apana.org.au) wrote:
> > > I think this discussion is moot since the only time you want to use
> > sendfile is for bulk transfers and these days anybody designing new
> > applications that does bulk transfers over UDP should be taken out
> > and shot.
> 
> One can protect himself pointing how slow may be memory bus in some
> hardware setup, which completely does not allow to perform any copy.

Yes, but bulk transfers over UDP are a bad idea regardless of how
slow your bus is :)

So what application needs this?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-23  7:01                             ` Herbert Xu
@ 2008-09-23  7:07                               ` Evgeniy Polyakov
  2008-09-24  4:53                               ` Bill Fink
  1 sibling, 0 replies; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-23  7:07 UTC (permalink / raw)
  To: Herbert Xu; +Cc: davem, johaahn, netdev

On Tue, Sep 23, 2008 at 03:01:33PM +0800, Herbert Xu (herbert@gondor.apana.org.au) wrote:
> Yes but bulk transfers over UDP is a bad idea regardless of how
> slow your bus is :)

It still deserves to live if it is not high-priority traffic, like a video
dataflow.

> So what application needs this?

Johann did not show his exact usage scenario, but from what I gathered I
concluded that it is a kind of video sensor (or some other source of data
which is allowed to be lost), which has to add a header
to the frame and submit it to the network without any copy because of
a hardware limitation of the memory bus. The sensor can put data via DMA
to the needed location. His first idea (parallel to this one) was to
extend the packet socket to allow sending of the mapped data.
I proposed to do the same with sendfile, i.e. DMA the data to the mapped
area of some file in the ramdisk, add the header, do it for multiple frames
and then send the given file using sendfile().

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
  2008-09-23  7:01                             ` Herbert Xu
  2008-09-23  7:07                               ` Evgeniy Polyakov
@ 2008-09-24  4:53                               ` Bill Fink
       [not found]                                 ` <7e0dd21a0810140009k49c8876ax66f744d0a3a4931b@mail.gmail.com>
  1 sibling, 1 reply; 29+ messages in thread
From: Bill Fink @ 2008-09-24  4:53 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Evgeniy Polyakov, davem, johaahn, netdev

On Tue, 23 Sep 2008, Herbert Xu wrote:

> On Tue, Sep 23, 2008 at 10:27:10AM +0400, Evgeniy Polyakov wrote:
> > On Tue, Sep 23, 2008 at 12:54:27PM +0800, Herbert Xu (herbert@gondor.apana.org.au) wrote:
> > > > I think this discussion is moot since the only time you want to use
> > > sendfile is for bulk transfers and these days anybody designing new
> > > applications that does bulk transfers over UDP should be taken out
> > > and shot.
> > 
> > One can protect himself pointing how slow may be memory bus in some
> > hardware setup, which completely does not allow to perform any copy.
> 
> Yes but bulk transfers over UDP is a bad idea regardless of how
> slow your bus is :)
> 
> So what application needs this?

It seems it might be useful for a video server.  The one thing that
seems to be missing from the sendfile() semantics is a message size
to be used for splitting the file into UDP datagrams, but this could
be provided by a separate ioctl(), and could default to the largest
message size that would fit in the MTU.

						-Bill
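
The splitting suggested above can already be approximated from userspace by limiting the count passed to each sendfile() call. This is only a sketch under stated assumptions: a connected UDP socket over IPv4 without IP options, and a kernel that emits one datagram per sendfile() call, which is the behaviour argued for in this thread but not verified here. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/sendfile.h>

#define IPV4_HDR_LEN 20u  /* IPv4 header without options (assumption) */
#define UDP_HDR_LEN   8u  /* fixed UDP header size */

/* Largest UDP payload that fits in one unfragmented IPv4 packet. */
size_t udp_payload_for_mtu(size_t mtu)
{
    assert(mtu > IPV4_HDR_LEN + UDP_HDR_LEN);
    return mtu - IPV4_HDR_LEN - UDP_HDR_LEN;
}

/*
 * Hypothetical helper: send `count` bytes of in_fd, asking for at most
 * msg_size bytes per sendfile() call. Returns the number of successful
 * sendfile() calls, or -1 on the first error (errno set by sendfile()).
 */
long sendfile_in_messages(int out_fd, int in_fd, off_t *offset,
                          size_t count, size_t msg_size)
{
    long calls = 0;

    assert(msg_size > 0);
    while (count > 0) {
        size_t want = count < msg_size ? count : msg_size;
        ssize_t n = sendfile(out_fd, in_fd, offset, want);

        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* end of input file */
        count -= (size_t)n;     /* n never exceeds want */
        calls++;
    }
    return calls;
}
```

With a 1500-byte Ethernet MTU, udp_payload_for_mtu(1500) yields 1472, which would serve as the default message size mentioned above.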

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Using skb_get() to recycle skbs
  2008-09-22  4:19           ` Evgeniy Polyakov
  2008-09-22  4:27             ` David Miller
@ 2008-09-25 13:03             ` Ram.Natarajan
  2008-09-25 14:28               ` Evgeniy Polyakov
  1 sibling, 1 reply; 29+ messages in thread
From: Ram.Natarajan @ 2008-09-25 13:03 UTC (permalink / raw)
  To: netdev


In our netdev driver, we preallocate a pool of skb which are
used by our h/w to DMA frames.

Normally, when we indicate a packet to stack (netif_rx),
the stack ends up freeing it. If we do a skb_get() prior
to that, then the stack just reduces the use_count,
and our driver is able to re-use it. We do have to use
caution to see that stack has dropped its reference
(atomic_read(&skb->users) <= 1) prior to returning
the buffer to our h/w.

Is this a valid approach? If so, it could save some
CPU cycles which are used in alloc_skb(). If this
can be done, how come more drivers in tree are not
doing it, is there any flip side to it? One could be
that stack may not free it in time (we could starve
the hardware of free buffers), but in that case
we can drop our reference, and let the stack
free it up, and force an allocation instead of
reuse.

Thanks for your response.

Ram Natarajan
Emulex Corporation

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Using skb_get() to recycle skbs
  2008-09-25 13:03             ` Using skb_get() to recycle skbs Ram.Natarajan
@ 2008-09-25 14:28               ` Evgeniy Polyakov
  0 siblings, 0 replies; 29+ messages in thread
From: Evgeniy Polyakov @ 2008-09-25 14:28 UTC (permalink / raw)
  To: Ram.Natarajan; +Cc: netdev

Hi.

On Thu, Sep 25, 2008 at 06:03:03AM -0700, Ram.Natarajan@Emulex.Com (Ram.Natarajan@Emulex.Com) wrote:
> In our netdev driver, we preallocate a pool of skb which are
> used by our h/w to DMA frames.
> 
> Normally, when we indicate a packet to stack (netif_rx),
> the stack ends up freeing it. If we do a skb_get() prior
> to that, then the stack just reduces the use_count,
> and our driver is able to re-use it. We do have to use
> caution to see that stack has dropped it's reference
> (atomic_read(skb->users) <= 1) prior to returning
> the buffer to our h/w.
> 
> Is this a valid approach? It so it could save some
> CPU cycles which are used in alloc_skb(). If this
> can be done, how come more drivers in tree are not
> doing it, is there any flip side to it? One could be
> that stack may not free it in time (we could starve
> the hardware of free buffers), but in that case
> we can drop our reference, and let the stack
> free it up, and force an allocation instead of
> reuse.

Now the stack allocates a new skb and copies data from the old one to the
new, since it believes that the skb is shared between different users and
thus cannot be modified in the rx path. You can check how skb_share_check()
is called in ip_rcv() and similar receive functions.
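
To make the effect concrete, here is a toy userspace model of the reference counting involved. It is not kernel code: struct toy_buf and the toy_*() functions are invented for illustration and only mimic the roles of skb_get(), kfree_skb() and skb_share_check(). The toy duplicates the whole buffer for simplicity; what exactly the kernel duplicates is not modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an skb: a reference count plus a small payload. */
struct toy_buf {
    int users;
    size_t len;
    unsigned char data[64];
};

struct toy_buf *toy_alloc(size_t len)
{
    struct toy_buf *b = calloc(1, sizeof(*b));

    assert(len <= sizeof(b->data));
    if (b) {
        b->users = 1;
        b->len = len;
    }
    return b;
}

/* Like skb_get(): take an extra reference. */
struct toy_buf *toy_get(struct toy_buf *b)
{
    b->users++;
    return b;
}

/* Like kfree_skb(): drop a reference, free on the last one. */
void toy_put(struct toy_buf *b)
{
    if (--b->users == 0)
        free(b);
}

/*
 * Like skb_share_check(): if somebody else also holds a reference,
 * hand the receive path a private buffer and drop its reference on
 * the shared one.
 */
struct toy_buf *toy_share_check(struct toy_buf *b)
{
    struct toy_buf *copy;

    if (b->users == 1)
        return b;           /* not shared, use as is */

    copy = malloc(sizeof(*copy));
    if (copy) {
        memcpy(copy, b, sizeof(*copy));
        copy->users = 1;
    }
    toy_put(b);
    return copy;
}
```

In this model, a driver that keeps its own reference before handing the buffer up forces the receive path through the duplicating branch, so the allocation saved in the driver is paid for again in the stack.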

-- 
	Evgeniy Polyakov

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] sendfile() and UDP socket
       [not found]                                 ` <7e0dd21a0810140009k49c8876ax66f744d0a3a4931b@mail.gmail.com>
@ 2008-10-14  7:10                                   ` Johann Baudy
  0 siblings, 0 replies; 29+ messages in thread
From: Johann Baudy @ 2008-10-14  7:10 UTC (permalink / raw)
  To: Herbert Xu, Evgeniy Polyakov, David Miller, Bill Fink; +Cc: netdev

Hi All,

I'm sorry for this late reply, but I was without Internet access for a month.
So if my understanding is correct, sendfile() over UDP can't be used
to send a file > 64 KB in one system call, as each request must fit in
one UDP packet.

However, is it expected that sendfile() returns 64 KB but sends nothing
over the network in this particular case
(file > 64 KB)? Should it return 0?

Thanks in advance,
Johann

>
>
>
> On Wed, Sep 24, 2008 at 6:53 AM, Bill Fink <billfink@mindspring.com> wrote:
>>
>> On Tue, 23 Sep 2008, Herbert Xu wrote:
>>
>> > On Tue, Sep 23, 2008 at 10:27:10AM +0400, Evgeniy Polyakov wrote:
>> > > On Tue, Sep 23, 2008 at 12:54:27PM +0800, Herbert Xu (herbert@gondor.apana.org.au) wrote:
>> > > > I think this discussion is moot since the only time you want to use
>> > > > sendfile is for bulk transfers and these days anybody designing new
>> > > > applications that does bulk transfers over UDP should be taken out
>> > > > and shot.
>> > >
>> > > One can protect himself pointing how slow may be memory bus in some
>> > > hardware setup, which completely does not allow to perform any copy.
>> >
>> > Yes but bulk transfers over UDP is a bad idea regardless of how
>> > slow your bus is :)
>> >
>> > So what application needs this?
>>
>> It seems it might be useful for a video server.  The one thing that
>> seems to be missing from the sendfile() semantics is a message size
>> to be used for splitting the file into UDP datagrams, but this could
>> be provided by a separate ioctl(), and could default to the largest
>> message size that would fit in the MTU.
>>
>>                                                -Bill
>
>
>
> --
> Johann Baudy
> johaahn@gmail.com



--
Johann Baudy
johaahn@gmail.com

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2008-10-14  7:10 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-14 10:25 [PATCH] sendfile() and UDP socket Johann Baudy
2008-09-16  4:17 ` Simon Horman
2008-09-16  4:24   ` Simon Horman
2008-09-16 12:01 ` Hirokazu Takahashi
2008-09-18 17:31   ` Rémi Denis-Courmont
2008-09-19 12:28     ` Hirokazu Takahashi
2008-09-19 13:14       ` Rémi Denis-Courmont
2008-09-21  8:04 ` David Miller
2008-09-22  0:21   ` Evgeniy Polyakov
2008-09-22  0:44     ` David Miller
2008-09-22  1:08       ` Evgeniy Polyakov
2008-09-22  2:07         ` David Miller
2008-09-22  4:19           ` Evgeniy Polyakov
2008-09-22  4:27             ` David Miller
2008-09-22  4:40               ` Evgeniy Polyakov
2008-09-22  5:06                 ` David Miller
2008-09-22  5:49                   ` Evgeniy Polyakov
2008-09-22  6:54                     ` David Miller
2008-09-22  7:04                       ` Evgeniy Polyakov
2008-09-23  4:54                         ` Herbert Xu
2008-09-23  6:27                           ` Evgeniy Polyakov
2008-09-23  7:01                             ` Herbert Xu
2008-09-23  7:07                               ` Evgeniy Polyakov
2008-09-24  4:53                               ` Bill Fink
     [not found]                                 ` <7e0dd21a0810140009k49c8876ax66f744d0a3a4931b@mail.gmail.com>
2008-10-14  7:10                                   ` Johann Baudy
2008-09-25 13:03             ` Using skb_get() to recycle skbs Ram.Natarajan
2008-09-25 14:28               ` Evgeniy Polyakov
  -- strict thread matches above, loose matches on Subject: below --
2008-09-10 12:39 [PATCH] sendfile() and UDP socket Johann Baudy
2008-09-10 20:16 ` David Miller
