netdev.vger.kernel.org archive mirror
* sendfile() behavior while troubleshooting netdevice
@ 2008-07-29  0:07 Jay Cliburn
  2008-07-29  0:27 ` Ian Jeffray
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jay Cliburn @ 2008-07-29  0:07 UTC (permalink / raw)
  To: netdev; +Cc: ian

[-- Attachment #1: Type: text/plain, Size: 2022 bytes --]

I'm troubleshooting the problem reported here and I need some help:  
http://lkml.org/lkml/2008/7/15/325.  

Summary of problem:

sendfile() + TSO + atl1 driver == corrupted file at the receiver

According to the reporter, remove either sendfile() or TSO from the
equation and no corruption occurs.

In order to bound the scope of the problem, I wrote a little sendfile()
client/server program so I can control the data being transferred and
isolate the cause of the corruption, but I can't get the program to
work right no matter *what* NIC or kernel version I use.  I get
corrupted data at the receiver.  Where am I going wrong?

Basically, the server listens on port 5000.  The client connects and
sends the filename it wants to fetch from the server.  The server
responds with the size of the file first, then calls sendfile() to ship
the file proper.

At small file sizes (<10k ish), I get good copies most of the time.
At larger file sizes (e.g. 150k), I *never* get a good copy.

Here's an example.

The hosts:
server, petrel,  r8169, 192.168.1.6,   2.6.20-1.2320.fc5
client, sparrow, e100,  192.168.1.195, 2.6.25.6-27.fc8

(Note that neither of these hosts uses the atl1 driver; however, I get
similar behavior whenever I employ a host that does.)

[jcliburn@petrel ~]$ ./sfsrv
sending file 'testfile' size 1600 bytes
1600 bytes sent
sending file 'testfile' size 1600 bytes
1600 bytes sent
sending file 'testfile' size 1600 bytes
1600 bytes sent


[jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
connected...
file size is 1600 bytes
received 1600 bytes
[jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
connected...
file size is 1600 bytes
received 1600 bytes
[jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
connected...
file size is 1600 bytes
received 1592 bytes
error: expected 1600, received 1592

The last round came up short 8 bytes.  Any idea why?  Client and server
source code attached.

It doesn't matter which host is the server or client; data is lost
either way.

Thanks for any assistance.

Jay

[-- Attachment #2: sfclient.c --]
[-- Type: text/x-csrc, Size: 2224 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char **argv)
{
	int sock;
	int ret;
	unsigned int fsize;
	unsigned int count = 0;
	unsigned int prev;
	unsigned short port = 5000;
	char fname[128];
	char cfsize[12];
	char buf[65536];
	FILE* outf;
	struct sockaddr_in addr;

	if (argc != 3) {
		fprintf(stderr, "Usage: %s IP_addr filename\n", argv[0]);
		exit(1);
	}

	memset(&addr, 0, sizeof(addr));
	ret = inet_aton(argv[1], &addr.sin_addr);
	if (!ret) {
		fprintf(stderr, "invalid IP address %s\n", argv[1]);
		exit(1);
	}

	/* bounded conversion: fname holds 128 bytes including the NUL */
	ret = sscanf(argv[2], "%127s", fname);
	if (ret != 1) {
		fprintf(stderr, "filename error\n");
		exit(1);
	}

	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);

	sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (sock == -1) {
		perror("socket()");
		exit(1);
	}

	ret = connect(sock, (struct sockaddr *)&addr, sizeof(addr));
	if (ret == -1) {
		perror("connect()");
		fprintf(stderr, "connection failed\n");
		exit(1);
	}

	fprintf(stderr, "connected...\n");

	/* send the filename to the server */
	ret = send(sock, fname, strlen(fname), 0);
	if (ret == -1) {
		perror("send()");
		fprintf(stderr, "send failed\n");
		exit(1);
	}

	/* get the file size back from the server */
	ret = recv(sock, cfsize, sizeof(cfsize), 0);
	if (ret == -1) {
		perror("recv()");
		fprintf(stderr, "recv failed\n");
		exit(1);
	}
	fsize = atoi(cfsize);
	fprintf(stderr, "file size is %d bytes\n", fsize);

	/* open the output file */
	outf = fopen(fname, "w");
	if (!outf) {
		perror("fopen()");
		fprintf(stderr, "file open failed\n");
		exit(1);
	}

	/* receive the incoming data and write it to the output file */
	while (count < fsize) {
		ret = recv(sock, buf, sizeof(buf), MSG_DONTWAIT);
		if (ret == -1) {
			perror("recv()");
			fprintf(stderr, "recv failed\n");
		}

		if (ret > 0) {
			fwrite(buf, sizeof(char), (size_t) ret, outf);
			fprintf(stderr, "received %d bytes\n", ret);
		}

		prev = count;
		count += ret;
		if (prev == count) {
			fprintf(stderr, "error: expected %d, received %d\n",
				fsize, count);
			exit(1);
		}
	}
	fclose(outf);
	exit(0);
}



[-- Attachment #3: sfserver.c --]
[-- Type: text/x-csrc, Size: 2422 bytes --]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/stat.h>
#include <sys/sendfile.h>

int main(int argc, char **argv)
{
	int sock;
	int ret;
	int desc;
	int fd;
	socklen_t len;	/* accept() takes a socklen_t length */
	unsigned short port = 5000;
	char fname[128];
	char fsize[12];
	off_t offset = 0;
	struct sockaddr_in addr;
	struct sockaddr_in addr2;
	struct stat stat_buf;

	sock = socket(AF_INET, SOCK_STREAM, 0);
	if (sock == -1) {
		perror("socket()");
		fprintf(stderr, "socket failed\n");
		exit(1);
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_port = htons(port);

	ret =  bind(sock, (struct sockaddr *) &addr, sizeof(addr));
	if (ret) {
		perror("bind()");
		fprintf(stderr, "bind socket failed\n");
		exit(1);
	}

	/* listen for client */
	ret = listen(sock, 1);
	if (ret) {
		perror("listen()");
		fprintf(stderr, "listen failed\n");
		exit(1);
	}

	while (1) {
		len = sizeof(addr2);
		desc = accept(sock, (struct sockaddr *) &addr2, &len);
		if (desc == -1) {
			perror("accept()");
			fprintf(stderr, "accept failed\n");
			exit(1);
		}

		/* get the file name from the client (leave room for the NUL) and open it */
		ret = recv(desc, fname, sizeof(fname) - 1, 0);
		if (ret == -1) {
			perror("recv()");
			fprintf(stderr, "recv failed\n");
			exit(1);
		}
		fname[ret] = '\0';
		fd = open(fname, O_RDONLY);
		if (fd == -1) {
			perror("open()");
			fprintf(stderr, "unable to open '%s'\n", fname);
			exit(1);
		}

		/* get the size of the file and tell the client */
		fstat(fd, &stat_buf);
		/* st_size is an off_t, so cast it to match the format */
		snprintf(fsize, sizeof(fsize), "%ld", (long) stat_buf.st_size);
		ret = send(desc, fsize, strlen(fsize), 0);
		if (ret == -1) {
			perror("send()");
			fprintf(stderr, "send failed\n");
			exit(1);
		}

		fprintf(stderr, "sending file '%s' size %ld bytes\n",
			fname, (long) stat_buf.st_size);

		/* send the file */
		offset = 0;
		ret = sendfile(desc, fd, &offset, stat_buf.st_size);
		if (ret == -1) {
			perror("sendfile()");
			fprintf(stderr, "sendfile failed\n");
			exit(1);
		}
		fprintf(stderr, "%d bytes sent\n", ret);

		/* did the whole file get sent? */
		if (ret != stat_buf.st_size) {
			fprintf(stderr, "error: %d of %d bytes transferred\n",
				ret, (int) stat_buf.st_size);
			exit(1);
		}
		close(fd);
		close(desc);
	}
	close(sock);
	exit(0);
}


* Re: sendfile() behavior while troubleshooting netdevice
  2008-07-29  0:07 sendfile() behavior while troubleshooting netdevice Jay Cliburn
@ 2008-07-29  0:27 ` Ian Jeffray
  2008-07-29  0:52   ` Jay Cliburn
  2008-07-29  6:28 ` Evgeniy Polyakov
  2008-07-29  9:32 ` Jarek Poplawski
  2 siblings, 1 reply; 7+ messages in thread
From: Ian Jeffray @ 2008-07-29  0:27 UTC (permalink / raw)
  To: Jay Cliburn; +Cc: netdev

Jay Cliburn wrote:
> I'm troubleshooting the problem reported here and I need some help:  
> http://lkml.org/lkml/2008/7/15/325.  

[snip]

> The last round came up short 8 bytes.  Any idea why?  Client and server
> source code attached.
> 
> It doesn't matter which host is the server or client; data is lost
> either way.

Because you send() a strlen() amount of data for the filesize,
but recv() a sizeof(cfsize) amount of data?   I suspect 8 bytes
is being sucked up in your first recv().  Try sending a fixed
size block for the file size.

Ian.
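
To illustrate the suggestion: TCP is a byte stream, so a 12-byte recv() issued after a 4-character size string such as "1600" can also return the first bytes of the file, which could account for a shortfall of up to 8 bytes. A minimal sketch of a fixed-size header follows. The 4-byte network-order format and the names `recv_exact`, `send_file_size`, and `recv_file_size` are assumptions made for illustration; the attached programs use an ASCII size string instead.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read exactly len bytes. Returns 0 on success, -1 on error or early EOF. */
static int recv_exact(int sock, void *buf, size_t len)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = recv(sock, p, len, 0);

		if (n == -1 && errno == EINTR)
			continue;	/* interrupted: retry */
		if (n <= 0)
			return -1;	/* error, or peer closed the connection */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Sender side: the size always occupies exactly 4 bytes on the wire. */
static int send_file_size(int sock, uint32_t size)
{
	uint32_t wire = htonl(size);

	return send(sock, &wire, sizeof(wire), 0) == (ssize_t)sizeof(wire)
		? 0 : -1;
}

/* Receiver side: consume those 4 bytes and nothing more. */
static int recv_file_size(int sock, uint32_t *size)
{
	uint32_t wire;

	if (recv_exact(sock, &wire, sizeof(wire)) == -1)
		return -1;
	*size = ntohl(wire);
	return 0;
}
```

Because the receiver asks for exactly the header length, the file data that follows stays in the socket buffer for the later receive loop.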



* Re: sendfile() behavior while troubleshooting netdevice
  2008-07-29  0:27 ` Ian Jeffray
@ 2008-07-29  0:52   ` Jay Cliburn
  0 siblings, 0 replies; 7+ messages in thread
From: Jay Cliburn @ 2008-07-29  0:52 UTC (permalink / raw)
  To: Ian Jeffray; +Cc: netdev

On Tue, 29 Jul 2008 01:27:09 +0100
Ian Jeffray <ian@jeffray.co.uk> wrote:


> Because you send() a strlen() amount of data for the filesize,
> but recv() a sizeof(cfsize) amount of data?   I suspect 8 bytes
> is being sucked up in your first recv().  Try sending a fixed
> size block for the file size.

Thanks for the reply.  I changed strlen(fsize) to sizeof(fsize) in
sfserver.c; no difference.

[jcliburn@petrel ~]$ ./sfsrv
sending file 'testfile' size 1600 bytes
1600 bytes sent
sending file 'testfile' size 1600 bytes
1600 bytes sent
sending file 'testfile' size 1600 bytes
1600 bytes sent
sending file 'testfile' size 1600 bytes
1600 bytes sent


[jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
connected...
file size is 1600 bytes
received 1592 bytes
error: expected 1600, received 1592
[jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
connected...
file size is 1600 bytes
received 1600 bytes
[jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
connected...
file size is 1600 bytes
received 1592 bytes
error: expected 1600, received 1592


* Re: sendfile() behavior while troubleshooting netdevice
  2008-07-29  0:07 sendfile() behavior while troubleshooting netdevice Jay Cliburn
  2008-07-29  0:27 ` Ian Jeffray
@ 2008-07-29  6:28 ` Evgeniy Polyakov
  2008-07-29 13:32   ` J. K. Cliburn
  2008-07-29  9:32 ` Jarek Poplawski
  2 siblings, 1 reply; 7+ messages in thread
From: Evgeniy Polyakov @ 2008-07-29  6:28 UTC (permalink / raw)
  To: Jay Cliburn; +Cc: netdev, ian

Hi Jay.

On Mon, Jul 28, 2008 at 07:07:07PM -0500, Jay Cliburn (jacliburn@bellsouth.net) wrote:
> sendfile() + TSO + atl1 driver == corrupted file at the receiver

...

> The hosts:
> server, petrel,  r8169, 192.168.1.6,   2.6.20-1.2320.fc5
> client, sparrow, e100,  192.168.1.195, 2.6.25.6-27.fc8

...

> [jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
> connected...
> file size is 1600 bytes
> received 1592 bytes
> error: expected 1600, received 1592

I.e. trouble also happens with r8169 driver?

-- 
	Evgeniy Polyakov


* Re: sendfile() behavior while troubleshooting netdevice
  2008-07-29  0:07 sendfile() behavior while troubleshooting netdevice Jay Cliburn
  2008-07-29  0:27 ` Ian Jeffray
  2008-07-29  6:28 ` Evgeniy Polyakov
@ 2008-07-29  9:32 ` Jarek Poplawski
  2008-07-29 13:16   ` J. K. Cliburn
  2 siblings, 1 reply; 7+ messages in thread
From: Jarek Poplawski @ 2008-07-29  9:32 UTC (permalink / raw)
  To: Jay Cliburn; +Cc: netdev, ian

On 29-07-2008 02:07, Jay Cliburn wrote:
...
> file size is 1600 bytes
> received 1592 bytes
> error: expected 1600, received 1592

sfclient.c:
...
		prev = count;
		count += ret;
		if (prev == count) {
			fprintf(stderr, "error: expected %d, received %d\n",
				fsize, count);
--->			exit(1);
		}

Maybe you could try without this exit if anything comes later?

Jarek P.


* Re: sendfile() behavior while troubleshooting netdevice
  2008-07-29  9:32 ` Jarek Poplawski
@ 2008-07-29 13:16   ` J. K. Cliburn
  0 siblings, 0 replies; 7+ messages in thread
From: J. K. Cliburn @ 2008-07-29 13:16 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev, ian

Hi Jarek,

Jarek Poplawski wrote:
> sfclient.c:
> ...
> 		prev = count;
> 		count += ret;
> 		if (prev == count) {
> 			fprintf(stderr, "error: expected %d, received %d\n",
> 				fsize, count);
> --->			exit(1);
> 		}
> 
> Maybe you could try without this exit if anything comes later?

That's the way I initially had it, but I'd get infinite loops when the 
bytes were lost; no additional bytes ever showed up.  The 'if (prev == 
count)' check was added later, just to kick the thing out of that 
infinite loop.

However, your question made me realize that ret can be -1, and I don't 
want to change 'count' in that case.  Changed to

if (ret > 0)
	count += ret;

Thanks,
Jay
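
For reference, the receive loop with that change folded in might look like the sketch below. It assumes a blocking recv() rather than the MSG_DONTWAIT used in sfclient.c, since with MSG_DONTWAIT recv() returns -1 with errno set to EAGAIN whenever no data has arrived yet. `recv_to_file` is a hypothetical helper name.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Copy up to fsize bytes from sock to outf.
 * Returns the number of bytes received (less than fsize if the peer
 * closed early), or -1 on a socket or write error.
 * Hypothetical helper; assumes sock is in blocking mode.
 */
static long recv_to_file(int sock, FILE *outf, unsigned long fsize)
{
	char buf[65536];
	unsigned long count = 0;

	while (count < fsize) {
		unsigned long left = fsize - count;
		size_t want = left < sizeof(buf) ? (size_t)left : sizeof(buf);
		ssize_t ret = recv(sock, buf, want, 0);

		if (ret == -1) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			return -1;		/* count is left untouched */
		}
		if (ret == 0)
			break;			/* peer closed the connection */
		if (fwrite(buf, 1, (size_t)ret, outf) != (size_t)ret)
			return -1;
		count += (unsigned long)ret;
	}
	return (long)count;
}
```

A return value smaller than fsize then distinguishes "the sender closed early" from a local error, without needing the prev/count comparison.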


* Re: sendfile() behavior while troubleshooting netdevice
  2008-07-29  6:28 ` Evgeniy Polyakov
@ 2008-07-29 13:32   ` J. K. Cliburn
  0 siblings, 0 replies; 7+ messages in thread
From: J. K. Cliburn @ 2008-07-29 13:32 UTC (permalink / raw)
  To: Evgeniy Polyakov; +Cc: netdev, ian

Evgeniy Polyakov wrote:
> Hi Jay.
> 
> On Mon, Jul 28, 2008 at 07:07:07PM -0500, Jay Cliburn (jacliburn@bellsouth.net) wrote:
>> sendfile() + TSO + atl1 driver == corrupted file at the receiver
> 
> ...
> 
>> The hosts:
>> server, petrel,  r8169, 192.168.1.6,   2.6.20-1.2320.fc5
>> client, sparrow, e100,  192.168.1.195, 2.6.25.6-27.fc8
> 
> ...
> 
>> [jcliburn@sparrow ~]$ ./sfcli 192.168.1.6 testfile
>> connected...
>> file size is 1600 bytes
>> received 1592 bytes
>> error: expected 1600, received 1592
> 
> I.e. trouble also happens with r8169 driver?
> 

I think you may be right, Evgeniy.  I thought I had tested things using 
e100 as the server side, but apparently I didn't.  If I run the server 
from the e100 host, it works flawlessly every time.  (But of course, the 
NIC doesn't support TSO, either.)

More testing tonight...

Thanks.

