public inbox for linux-kernel@vger.kernel.org
* [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
@ 2001-01-08  1:24 David S. Miller
  2001-01-08 10:39 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: David S. Miller @ 2001-01-08  1:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev


I've put a patch up for testing on the kernel.org mirrors:

/pub/linux/kernel/people/davem/zerocopy-2.4.0-1.diff.gz

It provides a framework for zerocopy transmits and delayed
receive fragment coalescing.  TUX-1.01 uses this framework.

Zerocopy transmit requires some driver support; things run
as they did before for drivers which do not have the support
added.  Currently sg+csum driver support has been added to the
Acenic, 3c59x, sunhme, and loopback drivers.  We had eepro100
support coded at one point, but it was removed because we didn't know
how to identify the cards which support hw csum assist vs. ones
which do not.

I would like people to test this hard and report any bugs they
discover.  _PLEASE_ check whether 2.4.0 without this patch produces
the same problem, and if so report it as a 2.4.0 bug, _not_ as a
bug in the zerocopy patch.  Thank you.

In particular, I am interested in hearing about any new breakage
caused by the zerocopy patches when using netfilter.  When reporting
bugs, please note which networking cards you are using, since whether
the card is actually using hw csum assist and sg support is an
important data point.

Finally, regardless of networking card, there should be a measurable
performance boost for NFS clients with this patch due to the delayed
fragment coalescing.  KNFSD does not take full advantage of this
facility yet.

Later,
David S. Miller
davem@redhat.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

* Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
@ 2001-01-09 13:08 Stephen Landamore
  2001-01-09 13:24 ` Ingo Molnar
  0 siblings, 1 reply; 119+ messages in thread
From: Stephen Landamore @ 2001-01-09 13:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: mingo

Ingo Molnar wrote:
> On Tue, 9 Jan 2001, Christoph Hellwig wrote:
>
>> Sure.  But sendfile is not one of the fundamental UNIX operations...
>
> Neither were e.g. kernel-based semaphores. So what? Unix wasn't
> perfect and isn't perfect - but it was a (very) good starting
> point. If you are arguing against the existence or importance of
> sendfile() you should re-think, sendfile() is a unique (and
> important) interface because it enables moving information between
> files (streams) without involving any interim user-space memory
> buffer. No original Unix API did this AFAIK, so we obviously had to
> add it. It's an important Linux API category.

Ehh, that's not correct. HP-UX was the first to implement sendfile().
Linux (and other commercial unices) then copied the idea...

For the record, sendfile() exists because we (Zeus) asked HP for
it. (So of course we agree that sendfile is important!)

Regards,
Stephen

--
Stephen Landamore, <slandamore@zeus.com>              Zeus Technology
Tel: +44 1223 525000                      Universally Serving the Net
Fax: +44 1223 525100                              http://www.zeus.com
Zeus Technology, Zeus House, Cowley Road, Cambridge, CB4 0ZT, ENGLAND
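
For readers unfamiliar with the call under discussion, here is a minimal
sketch of copying from one descriptor to another with the Linux
sendfile(2) call from <sys/sendfile.h>, so that the data never passes
through a user-space buffer. The helper name copy_with_sendfile() is
hypothetical and is not taken from any message in this thread, and the
HP-UX call mentioned above has a different signature that is not shown.
On 2.4-era kernels the output descriptor had to be a socket; Linux
2.6.33 and later also accept a regular file.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy up to `count` bytes from the start of in_fd to out_fd using
 * sendfile(2).  Returns the number of bytes actually transferred, or
 * -1 on error.  (Hypothetical helper, for illustration only.)
 */
ssize_t copy_with_sendfile(int out_fd, int in_fd, size_t count)
{
	off_t offset = 0;	/* read position in in_fd, advanced by the kernel */
	size_t left = count;

	while (left > 0) {
		ssize_t n = sendfile(out_fd, in_fd, &offset, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* end of input file */
		left -= (size_t)n;
	}
	return (ssize_t)(count - left);
}
```

The loop is there because sendfile() may transfer fewer bytes than
requested; a caller serving a file over a socket would pass the socket
as out_fd and the file size as count.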

* Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
@ 2001-01-09 17:46 Manfred Spraul
  0 siblings, 0 replies; 119+ messages in thread
From: Manfred Spraul @ 2001-01-09 17:46 UTC (permalink / raw)
  To: sct, mingo; +Cc: linux-kernel

sct wrote:
> We've already got measurements showing how insane this is. Raw IO 
> requests, plus internal pagebuf contiguous requests from XFS, have to 
> get broken down into page-sized chunks by the current ll_rw_block() 
> API, only to get reassembled by the make_request code. It's 
> *enormous* overhead, and the kiobuf-based disk IO code demonstrates 
> this clearly. 

Stephen, I see one big difference between ll_rw_block and the proposed
tcp_sendpage():
With ll_rw_block you must allocate and initialize a complete buffer
head for each page you want to read, and then you pass the array of
buffer heads to ll_rw_block with one function call.
I'm certain the overhead is the allocation/initialization/freeing of the
buffer heads, not the function call.

AFAICS the proposed tcp_sendpage interface is the other way around:
you need one function call for each page, but no memory
allocation/setup. The memory is allocated internally by the tcp_sendpage
implementation, and it merges requests when possible. Thus for a 9000
byte jumbo packet you'd need 3 function calls to tcp_sendpage(MSG_MORE),
but only one skb is allocated and set up.

Ingo, is that correct?

--
	Manfred
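
The "3 function calls" figure above is a ceiling division of the payload
size by the page size. A small sketch of that arithmetic follows,
assuming 4096-byte pages (as on i386) and a payload that starts on a
page boundary. The function name sendpage_calls_needed() is made up for
this illustration and is not part of the proposed interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of page-sized chunks needed to cover payload_bytes when the
 * payload starts on a page boundary.  One chunk corresponds to one
 * per-page call in the scheme described above.  (Illustrative helper.)
 */
size_t sendpage_calls_needed(size_t payload_bytes, size_t page_size)
{
	assert(page_size > 0);
	/* round up: a partial last page still needs its own call */
	return (payload_bytes + page_size - 1) / page_size;
}
```

With payload_bytes = 9000 and page_size = 4096 this gives two full pages
plus one partial page of 808 bytes, i.e. 3 calls feeding a single skb.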

* Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
@ 2001-01-10  8:41 Manfred Spraul
  2001-01-10  8:31 ` David S. Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 119+ messages in thread
From: Manfred Spraul @ 2001-01-10  8:41 UTC (permalink / raw)
  To: mingo, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 908 bytes --]

> > In user space, how do you know when it's safe to reuse the buffer that 
> > was handed to sendmsg() with the MSG_NOCOPY flag? Or does sendmsg() 
> > with that flag block until the buffer isn't needed by the kernel any 
> > more? If it does block, doesn't that defeat the use of non-blocking 
> > I/O? 
> 
> sendmsg() marks those pages COW and copies the original page into a new 
> one for further usage. (the old page is used until the packet is 
> released.) So for maximum performance user-space should not reuse such 
> buffers immediately. 
>
That means sendmsg() changes the page tables? I measured
smp_call_function on my Dual Pentium 350, and it took around 1950 cpu
ticks.
I'm sure that for an 8-way server the total lost time on all cpus
(multi-threaded server) is larger than the time required to copy the
complete page.
(I've attached my patch, just run "insmod dummy p_shift=0")


--
	Manfred

[-- Attachment #2: patch-newperf --]
[-- Type: text/plain, Size: 3500 bytes --]

--- 2.4/drivers/net/dummy.c	Mon Dec  4 02:45:22 2000
+++ build-2.4/drivers/net/dummy.c	Wed Jan 10 09:15:20 2001
@@ -95,9 +95,168 @@
 
 static struct net_device dev_dummy;
 
+/* ************************************* */
+int p_shift = -1;
+MODULE_PARM     (p_shift, "1i");
+MODULE_PARM_DESC(p_shift, "Shift for the profile buffer");
+
+int p_size = 0;
+MODULE_PARM     (p_size, "1i");
+MODULE_PARM_DESC(p_size, "size");
+
+
+#define STAT_TABLELEN		16384
+static unsigned long totals[STAT_TABLELEN];
+static unsigned int overflows;
+
+static unsigned long long stime;
+static void start_measure(void)
+{
+	 __asm__ __volatile__ (
+		".align 64\n\t"
+	 	"pushal\n\t"
+		"cpuid\n\t"
+		"popal\n\t"
+		"rdtsc\n\t"
+		"movl %%eax,(%0)\n\t"
+		"movl %%edx,4(%0)\n\t"
+		: /* no output */
+		: "c"(&stime)
+		: "eax", "edx", "memory" );
+}
+
+static void end_measure(void)
+{
+static unsigned long long etime;
+	__asm__ __volatile__ (
+		"pushal\n\t"
+		"cpuid\n\t"
+		"popal\n\t"
+		"rdtsc\n\t"
+		"movl %%eax,(%0)\n\t"
+		"movl %%edx,4(%0)\n\t"
+		: /* no output */
+		: "c"(&etime)
+		: "eax", "edx", "memory" );
+	{
+		unsigned long time = (unsigned long)(etime-stime);
+		time >>= p_shift;
+		if(time < STAT_TABLELEN) {
+			totals[time]++;
+		} else {
+			overflows++;
+		}
+	}
+}
+
+static void clean_buf(void)
+{
+	memset(totals,0,sizeof(totals));
+	overflows = 0;
+}
+
+static void print_line(unsigned long* array)
+{
+	int i;
+	for(i=0;i<32;i++) {
+		if((i%32)==16)
+			printk(":");
+		printk("%lx ",array[i]); 
+	}
+}
+
+static void print_buf(char* caption)
+{
+	int i, other = 0;
+	printk("Results - %s - shift %d",
+		caption, p_shift);
+
+	for(i=0;i<STAT_TABLELEN;i+=32) {
+		int j;
+		int local = 0;
+		for(j=0;j<32;j++)
+			local += totals[i+j];
+
+		if(local) {
+			printk("\n%3x: ",i);
+			print_line(&totals[i]);
+			other += local;
+		}
+	}
+	printk("\nOverflows: %d.\n",
+		overflows);
+	printk("Sum: %ld\n",other+overflows);
+}
+
+static void return_immediately(void* dummy)
+{
+	return;
+}
+
+static void just_one_page(void* dummy)
+{
+	__flush_tlb_one(0x12345678);
+	return;
+}
+
+
 static int __init dummy_init_module(void)
 {
 	int err;
+
+	if(p_shift != -1) {
+		int i;
+		void* p;
+		kmem_cache_t* cachep;
+		/* empty test measurement: */
+		printk("******** kernel cpu benchmark started **********\n");
+		clean_buf();
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		schedule_timeout(200);
+		for(i=0;i<100;i++) {
+			start_measure();
+			return_immediately(NULL);
+			return_immediately(NULL);
+			return_immediately(NULL);
+			return_immediately(NULL);
+			end_measure();
+		}
+		print_buf("zero");
+		clean_buf();
+
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		schedule_timeout(200);
+		for(i=0;i<100;i++) {
+			start_measure();
+			return_immediately(NULL);
+			return_immediately(NULL);
+			smp_call_function(return_immediately,NULL,
+						1, 1);
+			return_immediately(NULL);
+			return_immediately(NULL);
+			end_measure();
+		}
+		print_buf("empty smp_call_function()");
+		clean_buf();
+
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		schedule_timeout(200);
+		for(i=0;i<100;i++) {
+			start_measure();
+			return_immediately(NULL);
+			return_immediately(NULL);
+			smp_call_function(just_one_page,NULL,
+						1, 1);
+			just_one_page(NULL);
+			return_immediately(NULL);
+			return_immediately(NULL);
+			end_measure();
+		}
+		print_buf("flush_one_page()");
+		clean_buf();	
+
+		return -EINVAL;
+	}
 
 	dev_dummy.init = dummy_init;
 	SET_MODULE_OWNER(&dev_dummy);
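
The bookkeeping done in end_measure() above can be tried in user space
without the rdtsc parts. The following is a minimal sketch of just the
bucketing step, under the assumption that the elapsed tick count has
already been obtained by some other means. The struct and function
names are hypothetical and do not appear in the patch; only
STAT_TABLELEN and the shift-then-bucket logic are taken from it.

```c
#include <assert.h>
#include <stddef.h>

#define STAT_TABLELEN 16384	/* same table size as in the patch */

/* Histogram of elapsed-tick samples (illustrative type). */
struct tick_histogram {
	unsigned long totals[STAT_TABLELEN];
	unsigned int overflows;
};

/*
 * Record one sample: scale it down by `shift` bits, then count it in
 * the matching bucket, or as an overflow if it falls outside the table.
 */
void record_sample(struct tick_histogram *h,
		   unsigned long long elapsed, unsigned int shift)
{
	unsigned long long bucket;

	assert(shift < 64);	/* shifting by >= 64 bits is undefined */
	bucket = elapsed >> shift;
	if (bucket < STAT_TABLELEN)
		h->totals[bucket]++;
	else
		h->overflows++;
}
```

With shift 0 each bucket is one tick wide, so the table covers samples
below 16384 ticks; a larger shift trades resolution for range, which is
what the p_shift module parameter controls in the patch.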



Thread overview: 119+ messages
2001-01-08  1:24 [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1 David S. Miller
2001-01-08 10:39 ` Christoph Hellwig
2001-01-08 10:34   ` David S. Miller
2001-01-08 18:05     ` Rik van Riel
2001-01-08 21:07       ` David S. Miller
2001-01-09 10:23       ` Ingo Molnar
2001-01-09 10:31         ` Christoph Hellwig
2001-01-09 10:31           ` David S. Miller
2001-01-09 11:28             ` Christoph Hellwig
2001-01-09 11:42               ` David S. Miller
2001-01-09 12:04               ` Ingo Molnar
2001-01-09 14:25                 ` Stephen C. Tweedie
2001-01-09 14:33                   ` Alan Cox
2001-01-09 15:00                   ` Ingo Molnar
2001-01-09 15:27                     ` Stephen C. Tweedie
2001-01-09 16:16                       ` Ingo Molnar
2001-01-09 16:37                         ` Alan Cox
2001-01-09 16:48                           ` Ingo Molnar
2001-01-09 17:29                             ` Alan Cox
2001-01-09 17:38                               ` Jens Axboe
2001-01-09 18:38                                 ` Ingo Molnar
2001-01-09 19:54                                   ` Andrea Arcangeli
2001-01-09 20:10                                     ` Ingo Molnar
2001-01-10  0:00                                       ` Andrea Arcangeli
2001-01-09 20:12                                     ` Jens Axboe
2001-01-09 23:20                                       ` Andrea Arcangeli
2001-01-09 23:34                                         ` Jens Axboe
2001-01-09 23:52                                           ` Andrea Arcangeli
2001-01-17  5:16                                     ` Rik van Riel
2001-01-09 17:56                             ` Chris Evans
2001-01-09 18:41                               ` Ingo Molnar
2001-01-09 22:58                                 ` [patch]: ac4 blk (was Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1) Jens Axboe
2001-01-09 19:20                           ` [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1 J Sloan
2001-01-09 18:10                         ` Stephen C. Tweedie
2001-01-09 15:38                     ` Benjamin C.R. LaHaise
2001-01-09 16:40                       ` Ingo Molnar
2001-01-09 17:30                         ` Benjamin C.R. LaHaise
2001-01-09 18:12                           ` Stephen C. Tweedie
2001-01-09 18:35                           ` Ingo Molnar
2001-01-09 17:53                       ` Christoph Hellwig
2001-01-09 21:13                   ` David S. Miller
2001-01-09 19:14               ` Linus Torvalds
2001-01-09 20:07                 ` Ingo Molnar
2001-01-09 20:15                   ` Linus Torvalds
2001-01-09 20:36                     ` Christoph Hellwig
2001-01-09 20:55                       ` Linus Torvalds
2001-01-09 21:12                         ` Christoph Hellwig
2001-01-09 21:26                           ` Linus Torvalds
2001-01-10  7:42                             ` Christoph Hellwig
2001-01-10  8:05                               ` Linus Torvalds
2001-01-10  8:33                                 ` Christoph Hellwig
2001-01-10  8:37                                 ` Andrew Morton
2001-01-10 23:32                                   ` Linus Torvalds
2001-01-19 15:55                                     ` Andrew Scott
2001-01-17 14:05                               ` Rik van Riel
2001-01-18  0:53                                 ` Christoph Hellwig
2001-01-18  1:13                                   ` Linus Torvalds
2001-01-18 17:50                                     ` Christoph Hellwig
2001-01-18 18:04                                       ` Linus Torvalds
2001-01-18 21:12                                     ` Albert D. Cahalan
2001-01-19  1:52                                       ` 2.4.1-pre8 video/ohci1394 compile problem ebi4
2001-01-19  6:55                                       ` [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1 Linus Torvalds
2001-01-09 23:06                         ` Benjamin C.R. LaHaise
2001-01-09 23:54                           ` Linus Torvalds
2001-01-10  7:51                             ` Gerd Knorr
2001-01-12  1:42                 ` Stephen C. Tweedie
2001-01-09 11:05           ` Ingo Molnar
2001-01-09 18:27             ` Christoph Hellwig
2001-01-09 19:19               ` Ingo Molnar
2001-01-09 14:18         ` Stephen C. Tweedie
2001-01-09 14:40           ` Ingo Molnar
2001-01-09 14:51             ` Alan Cox
2001-01-09 15:17             ` Stephen C. Tweedie
2001-01-09 15:37               ` Ingo Molnar
2001-01-09 21:18               ` David S. Miller
2001-01-09 22:25               ` Linus Torvalds
2001-01-10 15:21                 ` Stephen C. Tweedie
2001-01-09 15:25             ` Stephen Frost
2001-01-09 15:40               ` Ingo Molnar
2001-01-09 15:48                 ` Stephen Frost
2001-01-10  1:14                 ` Dave Zarzycki
2001-01-10  1:14                   ` David S. Miller
2001-01-10  2:18                     ` Dave Zarzycki
2001-01-10  1:19                   ` Ingo Molnar
2001-01-10  2:56         ` storage over IP (was Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1) dean gaudet
2001-01-10  2:58           ` David S. Miller
2001-01-10  3:18             ` dean gaudet
2001-01-10  3:09               ` David S. Miller
2001-01-10  3:05           ` storage over IP (was Re: [PLEASE-TESTME] Zerocopy networking patch, Alan Cox
2001-01-08 21:56 ` [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1 Jes Sorensen
2001-01-08 21:48   ` David S. Miller
2001-01-08 22:32     ` Jes Sorensen
2001-01-08 22:36       ` David S. Miller
2001-01-09 12:12         ` Ingo Molnar
2001-01-08 22:43       ` Stephen Frost
2001-01-08 22:37         ` David S. Miller
2001-01-09 13:52 ` Trond Myklebust
2001-01-09 13:42   ` David S. Miller
2001-01-09 15:27     ` Trond Myklebust
2001-01-09 21:19       ` David S. Miller
2001-01-10  9:21         ` Trond Myklebust
  -- strict thread matches above, loose matches on Subject: below --
2001-01-09 13:08 Stephen Landamore
2001-01-09 13:24 ` Ingo Molnar
2001-01-09 13:47   ` Andrew Morton
2001-01-09 19:15     ` Dan Hollis
2001-01-09 19:14   ` Dan Hollis
2001-01-09 22:03     ` David S. Miller
2001-01-09 22:58       ` Dan Hollis
2001-01-09 22:59         ` Ingo Molnar
2001-01-09 23:11           ` Dan Hollis
2001-01-10  3:24           ` Chris Wedgwood
2001-01-09 17:46 Manfred Spraul
2001-01-10  8:41 Manfred Spraul
2001-01-10  8:31 ` David S. Miller
2001-01-10 11:25 ` Ingo Molnar
2001-01-10 12:03   ` Manfred Spraul
2001-01-10 12:07     ` Ingo Molnar
2001-01-10 16:18       ` Jamie Lokier
2001-01-13 15:43 ` yodaiken
