netdev.vger.kernel.org archive mirror
* [PATCH] tcp: do not promote SPLICE_F_NONBLOCK to socket O_NONBLOCK
@ 2008-07-17 13:33 Octavian Purdila
  2008-07-17 14:21 ` Evgeniy Polyakov
  0 siblings, 1 reply; 19+ messages in thread
From: Octavian Purdila @ 2008-07-17 13:33 UTC
  To: netdev; +Cc: linux-kernel

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 2185 bytes --]

commit 11134aa8499b6fd67569e8fd21bde6fc481898d1
Author: Octavian Purdila <opurdila@ixiacom.com>
Date:   Thu Jul 17 16:25:23 2008 +0300

    tcp: do not promote SPLICE_F_NONBLOCK to socket O_NONBLOCK
    
    This patch changes tcp_splice_read to the behavior implied by man 2
    splice:
    
         SPLICE_F_NONBLOCK - Do not block on I/O. This makes the splice
         pipe operations non-blocking, but splice() may nevertheless block
         because the file descriptors that are spliced to/from may block
         (unless they have the O_NONBLOCK flag set).
    
    This approach also provides a simple solution to the splice
    transfer size problem. Say we have the following common sequence:
    
         splice(socket, pipe);
         splice(pipe, file);
    
    Unless we specify SPLICE_F_NONBLOCK, we can't use arbitrarily large
    transfer sizes with the 1st splice since otherwise we will deadlock
    due to pipe being full.  But if we use SPLICE_F_NONBLOCK, the current
    implementation will make the underlying socket non-blocking and thus
    will force us to use poll or another async I/O notification mechanism.
    
    Choosing a splice transfer size that won't deadlock is not trivial: we
    need to stay under PIPE_BUFFERS packets and since packets can have
    arbitrary sizes we will need to be conservative and use a small
    transfer size. That can degrade performance due to excessive system
    calls.
    
    Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 56a133c..cc5082b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -570,7 +570,7 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 
 	lock_sock(sk);
 
-	timeo = sock_rcvtimeo(sk, flags & SPLICE_F_NONBLOCK);
+	timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK);
 	while (tss.len) {
 		ret = __tcp_splice_read(sk, &tss);
 		if (ret < 0)
@@ -578,10 +578,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 		else if (!ret) {
 			if (spliced)
 				break;
-			if (flags & SPLICE_F_NONBLOCK) {
-				ret = -EAGAIN;
-				break;
-			}
 			if (sock_flag(sk, SOCK_DONE))
 				break;
 			if (sk->sk_err) {


Thread overview: 19+ messages
2008-07-17 13:33 [PATCH] tcp: do not promote SPLICE_F_NONBLOCK to socket O_NONBLOCK Octavian Purdila
2008-07-17 14:21 ` Evgeniy Polyakov
2008-07-17 14:47   ` Octavian Purdila
2008-07-17 17:41     ` Evgeniy Polyakov
2008-07-17 21:52       ` Octavian Purdila
2008-07-18 10:53         ` Evgeniy Polyakov
2008-07-18 11:18           ` Octavian Purdila
2008-07-18 12:24             ` Evgeniy Polyakov
2008-07-18 14:04               ` Octavian Purdila
2008-07-18 14:32                 ` Evgeniy Polyakov
2008-07-18 15:50                   ` Octavian Purdila
2008-07-18 16:00                     ` Evgeniy Polyakov
2008-07-18 17:04                       ` Octavian Purdila
2008-07-18 17:53                         ` Evgeniy Polyakov
2008-07-18 18:16                           ` Octavian Purdila
2008-07-18 18:35                             ` Evgeniy Polyakov
2008-07-18 18:43                               ` Octavian Purdila
2008-07-19  8:51                                 ` Evgeniy Polyakov
2008-07-19 11:18                                   ` Octavian Purdila
