From: Rick Jones <rick.jones2@hp.com>
To: Willy Tarreau <w@1wt.eu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
David Miller <davem@davemloft.net>,
netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH] tcp: splice: fix an infinite loop in tcp_read_sock()
Date: Thu, 10 Jan 2013 15:48:47 -0800
Message-ID: <50EF535F.4040905@hp.com>
In-Reply-To: <20130110232153.GE17390@1wt.eu>
On 01/10/2013 03:21 PM, Willy Tarreau wrote:
> On Thu, Jan 10, 2013 at 03:05:55PM -0800, Eric Dumazet wrote:
>> Thats because you splice( very_large_amount_of_bytes), so you dont
>> hit this bug.
>
> Not always, I use many sizes (from 1k to very large).
>
>> netperf does the splice ( exact_amount_of_bytes ) so hits this pretty
>> fast on loopback at least.
>
> OK I see, if we need an exact size to trigger it, that explains it !
Netperf does not use one specific size all the time. The size it uses on
the receive side is the "receive_size", calculated the same way it has
been since the beginning: either a size specified by a test-specific -M
option, or based on the value of SO_RCVBUF at the time the socket was
created.
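As a rough illustration of that selection rule (not netperf's actual code), a minimal sketch could look like the following. The helper name pick_receive_size() and its "requested" parameter are hypothetical; only getsockopt() and SO_RCVBUF are real interfaces here.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, not taken from netperf: return the explicitly
   requested receive size if one was given (think of a -M style option),
   otherwise fall back to the socket's current SO_RCVBUF value.
   Returns -1 if SO_RCVBUF cannot be read. */
int pick_receive_size(int sock, int requested)
{
    int rcvbuf = 0;
    socklen_t len = sizeof(rcvbuf);

    if (requested > 0)
        return requested;            /* explicit size wins */

    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len) < 0)
        return -1;                   /* caller decides how to handle this */

    return rcvbuf;                   /* size derived from the socket buffer */
}
```

Calling pick_receive_size(fd, 0) right after socket() would then give a size that stays the same from run to run as long as the system defaults do not change.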
The kernel of the code making the splice calls, recv_data_no_copy() in
src/nettest_omni.c, looks like:
recv_data_no_copy(SOCKET data_socket, struct ring_elt *recv_ring,
                  uint32_t bytes_to_recv, struct sockaddr *source,
                  netperf_socklen_t *sourcelen, uint32_t flags,
                  uint32_t *num_receives) {
...
  do {
    bytes_recvd = splice(data_socket,
                         NULL,
                         pfd[1],
                         NULL,
                         bytes_left,
                         my_flags);
    if (bytes_recvd > 0) {
      /* per Eric Dumazet, we should just let this second splice call
         move as many bytes as it can and not worry about how much.
         this should make the call more robust when made on a system
         under memory pressure */
      splice(pfd[0], NULL, fdnull, NULL, 1 << 30, my_flags);
      bytes_left -= bytes_recvd;
    }
    else {
      break;
    }
    my_recvs++; /* should the pair of splices count as one? */
  } while ((bytes_left > 0) && (flags & NETPERF_WAITALL));
where NETPERF_WAITALL is only set for an _RR test. bytes_left is
initialized to bytes_to_recv, which is the "receive_size". my_flags is
set to 0x03 (SPLICE_F_MOVE | SPLICE_F_NONBLOCK).
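For anyone who wants to poke at the same pattern outside of netperf, here is a self-contained sketch of the two-step splice (source fd into a pipe, then pipe into /dev/null). It is an approximation under stated assumptions, not the netperf source: splice_discard() and its wait_all argument are made-up names, the pipe and /dev/null descriptors are opened on every call for simplicity, and the flags are written symbolically as SPLICE_F_MOVE | SPLICE_F_NONBLOCK, which is 0x03 on Linux.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: consume up to bytes_to_recv bytes from in_fd by
   splicing them through a pipe into /dev/null.  If wait_all is zero,
   only one pass is made.  Returns the number of bytes consumed from
   in_fd, or -1 if the pipe or /dev/null could not be set up. */
ssize_t splice_discard(int in_fd, size_t bytes_to_recv, int wait_all)
{
    const unsigned int my_flags = SPLICE_F_MOVE | SPLICE_F_NONBLOCK;
    size_t bytes_left = bytes_to_recv;
    int pfd[2];
    int fdnull;

    if (pipe(pfd) < 0)
        return -1;
    fdnull = open("/dev/null", O_WRONLY);
    if (fdnull < 0) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    do {
        /* first splice: at most bytes_left bytes, source fd -> pipe */
        ssize_t n = splice(in_fd, NULL, pfd[1], NULL, bytes_left, my_flags);
        if (n <= 0)
            break;   /* EOF, error, or nothing available right now */

        /* second splice: let it move as much as it can, pipe -> /dev/null */
        splice(pfd[0], NULL, fdnull, NULL, 1 << 30, my_flags);
        bytes_left -= (size_t)n;
    } while (bytes_left > 0 && wait_all);

    close(fdnull);
    close(pfd[0]);
    close(pfd[1]);
    return (ssize_t)(bytes_to_recv - bytes_left);
}
```

With a connected TCP socket as in_fd and bytes_to_recv set to the receive_size, this issues the same kind of exact-size splice request described above. Because SPLICE_F_NONBLOCK is set, the loop simply stops when nothing is available instead of waiting.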
Now, if there is no test-specific -M option (or -s or -S, depending on
the test), netperf will use the same receive_size from run to run.
Under Linux, chances are quite good that it will be 87380.
happy benchmarking,
rick jones