From: Ben Greear <greearb@candelatech.com>
To: "linux-os (Dick Johnson)" <linux-os@analogic.com>
Cc: Linux kernel <linux-kernel@vger.kernel.org>
Subject: Re: Network compatibility and performance
Date: Sat, 12 Aug 2006 12:21:40 -0700
Message-ID: <44DE2A44.5070006@candelatech.com>
In-Reply-To: <Pine.LNX.4.61.0608101131530.4239@chaos.analogic.com>
linux-os (Dick Johnson) wrote:
> Hello,
>
> Network throughput is seriously defective with linux-2.6.16.24
> if the length given to 'write()' is a large number.
>
> Given this code on a connected socket........
>
> //-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> //
> // Copyright(c) 2005 Analogic Corporation (rjohnson@analogic.com)
> //
> // This program may be distributed under the GNU Public License
> // version 2, as published by the Free Software Foundation, Inc.,
> // 59 Temple Place, Suite 330 Boston, MA, 02111.
> //
> //-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>
> #include <stdio.h>
> #include <unistd.h>
> #include <stdlib.h>
> #include <stdint.h>
> #include <signal.h>
> #include <string.h>
> #include <stdarg.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <netinet/in.h>
> #include <netinet/tcp.h>
> #include <sys/poll.h>
> #include <sys/param.h> // for MIN()
>
> #define BUF_LEN 0x1000
> #define FAIL -1
>
> //-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> //
> // This sends a message that could exceed the size of the network buffers.
> // It returns 0 if everything went okay, and FAIL if not.
> //
> int32_t sender(int32_t fd, void *buf, size_t len)
> {
>     int32_t ret_val;
>     uint8_t *cp;
>     cp = (uint8_t *) buf;
>     while(len) {
>         if((ret_val = write(fd, cp, MIN(len, BUF_LEN))) == FAIL) {
>             if(errno == EAGAIN)
>                 continue;
>             return ret_val;
>         }
>         len -= ret_val;
>         cp += ret_val;
>     }
>     return 0;
> }
>
> It used to work quite well with:
>
>     while(len) {
>         if((ret_val = write(fd, cp, len)) == FAIL) {
>             return ret_val;
>         }
>         len -= ret_val;
>         cp += ret_val;
>     }
>
> The network socket layer would return the amount of bytes
> actually sent and the code would walk its way up through the
> buffer. This was the expected behavior for many years.
>
> Then after about Linux-2.6.8, I needed to do:
>
>     while(len) {
>         if((ret_val = write(fd, cp, len)) == FAIL) {
>             if(errno == EAGAIN)
>                 continue;
>             return ret_val;
>         }
>         len -= ret_val;
>         cp += ret_val;
>     }
>
> This was because Linux would claim to run out of resources
> even though there was nothing else running on the system.
>
> Now at Linux-2.6.16.24, the code needed to be further modified
> to:
>     while(len) {
>         if((ret_val = write(fd, cp, MIN(len, 0x1000))) == FAIL) {
>             if(errno == EAGAIN)
>                 continue;
>             return ret_val;
>         }
>         len -= ret_val;
>         cp += ret_val;
>     }
In the case where you are getting EAGAIN, this is a busy-spin. You
might want to sleep in a select() or similar call as soon as you get
EAGAIN on this socket, or go off and do other work while the OS clears
out the send queue.
Also, from your description, this code should return 0 on success. It
is returning 'ret_val' instead, which should be > 0.
I have no idea why you need to add the MIN() logic; that seems like
something that should not be required.
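To make the select()/poll() suggestion concrete, here is a minimal sketch
of a send loop that sleeps in poll() when a non-blocking socket reports
EAGAIN, rather than retrying immediately. The name send_all() is
hypothetical and is not code from this thread; the sketch assumes a
connected stream socket (or any descriptor that supports POLLOUT) and
passes the full remaining length to write() without any MIN() clamp.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * send_all() is a hypothetical helper, not code from this thread.
 * It writes len bytes to fd and returns 0 on success or -1 on error
 * (errno is left as set by write() or poll()).
 */
int send_all(int fd, const void *buf, size_t len)
{
    const unsigned char *cp = buf;

    while (len > 0) {
        ssize_t n = write(fd, cp, len);

        if (n > 0) {
            /* Short writes are normal: skip what the kernel accepted. */
            cp += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted by a signal, retry */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* real error */

        /* Send queue is full (or nothing was accepted): sleep until the
         * descriptor is writable again instead of spinning. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int rc;
        do {
            rc = poll(&pfd, 1, -1);
        } while (rc < 0 && errno == EINTR);
        if (rc < 0)
            return -1;
        /* Error conditions flagged in revents are reported by the next
         * write() call. */
    }
    return 0;
}
```

A caller would use it as `if (send_all(fd, buf, len) < 0) perror("send_all");`.
On a blocking socket the poll() branch is normally never reached, so the
same loop works in both modes.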
> ... or else it would spin <forever> returning 0 with no errno set.
> In all cases, these problems exist when 'len' is a large value, perhaps
> 0x01000000, much greater than Linux could ever find an available
> buffer for. Linux used to send what it could. Now it will just fail to
> send anything at all, returning 0 if it 'feels' like it doesn't want
> to bother. This is not good. With the hacked code, the data throughput
> is about 100,000 bytes per second on a dedicated link. The previous
> code ran 112,000 bytes per second. Once the 'errno' happens, the
> network stumbles to a measly 12,000 bytes per second. This
> breaks our applications.
Even 112,000 bytes per second sucks on a decent network. What is the
speed of your network, and what protocol are you using? If TCP, what is
the latency of your network?
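For scale, the arithmetic behind those questions can be sketched as
below. The two helper names are hypothetical, and the window size and
round-trip time in the usage note are illustrative assumptions, not
measurements from this thread. The second helper uses the general rule
that a single TCP connection can have at most one window of data in
flight per round trip.

```c
/* Convert a rate in bytes per second to megabits per second. */
double bytes_per_sec_to_mbit(double bytes_per_sec)
{
    return bytes_per_sec * 8.0 / 1e6;
}

/*
 * Rough upper bound on the throughput of one TCP connection, in bytes
 * per second: at most one window of data per round trip.
 */
double tcp_window_limit(double window_bytes, double rtt_sec)
{
    return window_bytes / rtt_sec;
}
```

bytes_per_sec_to_mbit(112000.0) is about 0.9 Mbit/s. With an assumed
65535-byte window and an assumed 1 ms round trip,
tcp_window_limit(65535.0, 0.001) is about 65 million bytes per second,
so latency alone would not explain a rate this low on a local link.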
Ben