From: Eric Dumazet <eric.dumazet@gmail.com>
To: unlisted-recipients:; (no To-header on input)
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>,
netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Volker Lendecke <vl@samba.org>,
Octavian Purdila <opurdila@ixiacom.com>
Subject: Re: Splice on blocking TCP sockets again..
Date: Wed, 30 Sep 2009 08:19:37 +0200
Message-ID: <4AC2F879.4080807@gmail.com>
In-Reply-To: <4AC2F3E4.5000904@gmail.com>
Eric Dumazet wrote:
> Eric Dumazet wrote:
>> Jason Gunthorpe wrote:
>>>> One way to handle this is to switch tcp_read() to use the underlying file O_NONBLOCK
>>>> flag, as other socket operations do. And let SPLICE_F_NONBLOCK control the pipe output only.
>> Argh, that should have read tcp_splice_read(), of course.
>>
>>> Thanks Eric, this seems reasonable from my userspace perspective.
>>>
>>> I admit I don't understand why SPLICE_F_NONBLOCK exists, it seems very
>>> un-unixy to have a syscall completely ignore the NONBLOCK flag of the
>>> fd it is called on. Ie setting NONBLOCK on the pipe itself does
>>> nothing when using splice..
>>>
>> Hmm, good question. I don't have the answer, but I'll dig one up.
>>
>
> commit 29e350944fdc2dfca102500790d8ad6d6ff4f69d
> splice: add SPLICE_F_NONBLOCK flag
>
> It doesn't make the splice itself necessarily nonblocking (because the
> actual file descriptors that are spliced from/to may block unless they
> have the O_NONBLOCK flag set), but it makes the splice pipe operations
> nonblocking.
>
> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
>
>
> See, Linus' intention was pretty clear: O_NONBLOCK should be taken into account
> by the 'actual file descriptors that are spliced from/to', regardless of the SPLICE_F_NONBLOCK flag.
>
I also found the first submission of this patch, from Octavian Purdila,
so credit should be given to Octavian as well:
http://lkml.indiana.edu/hypermail/linux/kernel/0807.2/0687.html
We could bring Linus into the discussion if that would help make progress on this point.
I personally stopped using splice(tcp -> pipe) in my projects because it could not be
used reliably.
Thanks
[PATCH] net: splice() from tcp to pipe should take into account O_NONBLOCK
tcp_splice_read() doesn't take into account the socket's O_NONBLOCK flag.
Before this patch:
splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE);
can block endlessly, at unpredictable times (if the pipe is full), and
splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
will return 0 immediately if the TCP buffer is empty.
A user application has no way to instruct splice() that the socket should be in blocking mode
but the pipe in nonblocking mode.
Many projects cannot use splice(tcp -> pipe) because of this flaw.
http://git.samba.org/?p=samba.git;a=history;f=source3/lib/recvfile.c;h=ea0159642137390a0f7e57a123684e6e63e47581;hb=HEAD
http://lkml.indiana.edu/hypermail/linux/kernel/0807.2/0687.html
Linus introduced SPLICE_F_NONBLOCK in commit 29e350944fdc2dfca102500790d8ad6d6ff4f69d
(splice: add SPLICE_F_NONBLOCK flag):
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.
Linus' intention was clear: let SPLICE_F_NONBLOCK control the splice pipe mode only.
This patch instructs tcp_splice_read() to use the underlying file's O_NONBLOCK
flag, as other socket operations do.
Users will then call:
splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
to block on data coming from the socket (if the file is in blocking mode),
and not block on pipe output (to avoid deadlock).
The first version of this patch was submitted by Octavian Purdila.
Reported-by: Volker Lendecke <vl@samba.org>
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
---
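For illustration only, not part of the patch: a minimal userspace sketch of the
calling convention described in the changelog above, assuming the patched
behaviour. The helper names set_nonblocking() and splice_from_socket() are
hypothetical; only fcntl(2) and splice(2) are real interfaces here.

```c
#define _GNU_SOURCE		/* needed for splice() and SPLICE_F_* */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set or clear O_NONBLOCK on fd. Returns 0 or -1. */
static int set_nonblocking(int fd, int on)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0)
		return -1;
	fl = on ? (fl | O_NONBLOCK) : (fl & ~O_NONBLOCK);
	return fcntl(fd, F_SETFL, fl);
}

/*
 * Hypothetical helper: move up to len bytes from a TCP socket into the
 * write end of a pipe. Assuming the patch is applied, the socket side
 * follows the socket's own O_NONBLOCK flag, while SPLICE_F_NONBLOCK only
 * keeps the pipe side from blocking.
 *
 * Returns the number of bytes moved, 0 on EOF, or -1 with errno set.
 * EAGAIN means nothing could be moved without blocking (for a blocking
 * socket, typically because the pipe is full: drain it and retry).
 */
static ssize_t splice_from_socket(int sock, int pipe_wr, size_t len)
{
	ssize_t n;

	/* A zero length would make a 0 return indistinguishable from EOF. */
	assert(len > 0);

	do {
		n = splice(sock, NULL, pipe_wr, NULL, len,
			   SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
	} while (n < 0 && errno == EINTR);	/* retry if interrupted by a signal */

	return n;
}
```

A caller that wants to wait for socket data would leave the socket in blocking
mode, e.g. set_nonblocking(sock, 0), and then call
splice_from_socket(sock, pipefd[1], 128 * 1024).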
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 21387eb..8cdfab6 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -580,7 +580,7 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 	lock_sock(sk);
-	timeo = sock_rcvtimeo(sk, flags & SPLICE_F_NONBLOCK);
+	timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK);
 	while (tss.len) {
 		ret = __tcp_splice_read(sk, &tss);
 		if (ret < 0)
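
For illustration only: a simplified userspace model of what the one-line change
above does to the receive timeout. This is not kernel code. The functions
model_rcvtimeo(), timeo_before() and timeo_after() are hypothetical names, and
the model assumes that sock_rcvtimeo() yields a zero timeout when its second
argument is non-zero and the socket's configured receive timeout otherwise.

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdbool.h>

/* Same bit value as SPLICE_F_NONBLOCK; redefined so no _GNU_SOURCE is needed. */
#define MODEL_SPLICE_F_NONBLOCK 0x02

/* Model of sock_rcvtimeo(): do not wait at all when noblock is set. */
static long model_rcvtimeo(long sk_rcvtimeo, bool noblock)
{
	return noblock ? 0 : sk_rcvtimeo;
}

/* Before the patch: the splice() flag decides whether the socket read waits. */
static long timeo_before(long sk_rcvtimeo, unsigned int splice_flags,
			 int f_flags)
{
	(void)f_flags;		/* the file's O_NONBLOCK was ignored */
	return model_rcvtimeo(sk_rcvtimeo,
			      splice_flags & MODEL_SPLICE_F_NONBLOCK);
}

/* After the patch: only the socket's own file flags decide. */
static long timeo_after(long sk_rcvtimeo, unsigned int splice_flags,
			int f_flags)
{
	(void)splice_flags;	/* SPLICE_F_NONBLOCK now affects the pipe only */
	return model_rcvtimeo(sk_rcvtimeo, f_flags & O_NONBLOCK);
}
```

With LONG_MAX standing in for "wait indefinitely", a blocking socket spliced
with SPLICE_F_NONBLOCK gets a timeout of 0 from timeo_before() but LONG_MAX from
timeo_after(), which is the blocking-socket, non-blocking-pipe combination the
changelog asks for.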