From: Octavian Purdila <opurdila@ixiacom.com>
To: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH] tcp: do not promote SPLICE_F_NONBLOCK to socket O_NONBLOCK
Date: Thu, 17 Jul 2008 16:33:49 +0300
Message-ID: <200807171633.49791.opurdila@ixiacom.com>
commit 11134aa8499b6fd67569e8fd21bde6fc481898d1
Author: Octavian Purdila <opurdila@ixiacom.com>
Date: Thu Jul 17 16:25:23 2008 +0300
tcp: do not promote SPLICE_F_NONBLOCK to socket O_NONBLOCK

This patch changes tcp_splice_read to the behavior implied by man 2
splice:

    SPLICE_F_NONBLOCK - Do not block on I/O. This makes the splice
    pipe operations non-blocking, but splice() may nevertheless block
    because the file descriptors that are spliced to/from may block
    (unless they have the O_NONBLOCK flag set).

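To illustrate the man page wording quoted above: whether the socket side blocks is meant to be chosen on the file descriptor itself, independently of SPLICE_F_NONBLOCK. A minimal userspace sketch follows; the helper name set_fd_nonblocking is hypothetical and not part of this patch.

```c
#include <fcntl.h>

/*
 * Hypothetical helper (illustration only): set or clear O_NONBLOCK on
 * an open file descriptor. Returns 0 on success, -1 with errno set.
 */
static int set_fd_nonblocking(int fd, int enable)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -1;
	fl = enable ? (fl | O_NONBLOCK) : (fl & ~O_NONBLOCK);
	return fcntl(fd, F_SETFL, fl) < 0 ? -1 : 0;
}
```

With the behavior proposed here, a caller that wants the socket read to return -EAGAIN instead of sleeping would call set_fd_nonblocking(sock_fd, 1) before splice(); a caller that leaves the socket blocking can still pass SPLICE_F_NONBLOCK for the pipe side.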
This approach also provides a simple solution to the splice
transfer size problem. Say we have the following common sequence:

    splice(socket, pipe);
    splice(pipe, file);

Unless we specify SPLICE_F_NONBLOCK, we can't use arbitrarily large
transfer sizes with the first splice, since otherwise we will deadlock
when the pipe fills up. But if we use SPLICE_F_NONBLOCK, the current
implementation makes the underlying socket non-blocking and thus
forces us to use poll or another async I/O notification mechanism.

Choosing a splice transfer size that won't deadlock is not trivial: we
need to stay under PIPE_BUFFERS packets, and since packets can have
arbitrary sizes we need to be conservative and use a small
transfer size. That can degrade performance due to excessive system
calls.
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 56a133c..cc5082b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -570,7 +570,7 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 
 	lock_sock(sk);
 
-	timeo = sock_rcvtimeo(sk, flags & SPLICE_F_NONBLOCK);
+	timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK);
 	while (tss.len) {
 		ret = __tcp_splice_read(sk, &tss);
 		if (ret < 0)
@@ -578,10 +578,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 		else if (!ret) {
 			if (spliced)
 				break;
-			if (flags & SPLICE_F_NONBLOCK) {
-				ret = -EAGAIN;
-				break;
-			}
 			if (sock_flag(sk, SOCK_DONE))
 				break;
 			if (sk->sk_err) {
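
The effect of the two hunks can be summarized with a small model. This is only a reading of the diff, not kernel code: the function name socket_side_may_block and its patched parameter are hypothetical, and it assumes the socket receive timeout (SO_RCVTIMEO) is left at its default, so that a non-zero timeo means the read may sleep.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for SPLICE_F_NONBLOCK */
#endif
#include <fcntl.h>
#include <stdbool.h>

/*
 * Model only (not kernel code): returns true if tcp_splice_read() on a
 * socket with no queued data may sleep waiting for more.
 *   splice_flags: flags passed to splice()
 *   file_flags:   f_flags of the socket's struct file
 *   patched:      false = behavior before the diff, true = after
 */
static bool socket_side_may_block(unsigned int splice_flags, int file_flags,
				  bool patched)
{
	bool nonblock = patched ? (file_flags & O_NONBLOCK) != 0
				: (splice_flags & SPLICE_F_NONBLOCK) != 0;

	return !nonblock;
}
```

Under this model, SPLICE_F_NONBLOCK on a blocking socket yields false before the change (-EAGAIN when no data is queued) and true after it, while O_NONBLOCK on the socket is what selects the non-blocking path once the change is applied.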