From: torvalds@transmeta.com (Linus Torvalds)
To: linux-kernel@vger.kernel.org
Subject: Re: [PLEASE-TESTME] Zerocopy networking patch, 2.4.0-1
Date: 9 Jan 2001 11:14:54 -0800
Message-ID: <93fnve$250$1@penguin.transmeta.com>
In-Reply-To: <Pine.LNX.4.21.0101081603080.21675-100000@duckman.distro.conectiva> <20010109113145.A28758@caldera.de> <200101091031.CAA01242@pizda.ninka.net> <20010109122810.A3115@caldera.de>
In article <20010109122810.A3115@caldera.de>,
Christoph Hellwig <hch@caldera.de> wrote:
>
>You get that multiple page call with kiobufs for free...
No, you don't.
kiobufs are crap. Face it. They do NOT allow proper multi-page scatter
gather, regardless of what the kiobuf PR department has said.
I've complained about it before, and nobody listened. David's zero-copy
network code had the same bug. I complained about it to David, and David
took about a day to understand my arguments, and fixed it.
It's more likely that the zero-copy network code will be used in real
life than kiobufs will ever be. The kiobufs are damn ugly by
comparison, and the fact that the kiobuf people don't even seem to
realize the problems makes me just more convinced that it's not worth
even arguing about.
What is the problem with kiobufs? Simple: they have an "offset" and a
"length", and an array of pages. What that completely and utterly
misses is that if you have an array of pages, you should have an array
of "offset" and "length" too. As it is, kiobufs cannot be used for
things like readv() and writev().
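
As a rough illustration of that point (a sketch only, not actual kernel code: struct flat_buf, struct sg_seg and both helper functions are hypothetical names, and struct page here is a stand-in), compare one offset/length pair shared by the whole page array with a per-page tuple. The helper sg_fits_flat() checks whether a segment list could be described by the single-offset form at all:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

/* Stand-in for the kernel's struct page; hypothetical, for illustration. */
struct page { unsigned char data[PAGE_SIZE]; };

/* kiobuf-style layout: ONE offset and ONE length for the whole array.
 * It can only describe a contiguous run starting at `offset` in pages[0]. */
struct flat_buf {
    struct page **pages;
    size_t nr_pages;
    size_t offset;   /* offset into the first page only */
    size_t length;   /* total bytes, running across page boundaries */
};

/* Scatter-gather layout: every page carries its own offset and length. */
struct sg_seg {
    struct page *page;
    size_t offset;
    size_t length;
};

/* Total number of bytes described by a segment list. */
size_t sg_total(const struct sg_seg *segs, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += segs[i].length;
    return total;
}

/* Returns 1 if the segment list could be expressed as a single flat_buf:
 * every segment but the first must start at offset 0, and every segment
 * but the last must run to the end of its page. Returns 0 otherwise. */
int sg_fits_flat(const struct sg_seg *segs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(segs[i].offset + segs[i].length <= PAGE_SIZE);
        if (i > 0 && segs[i].offset != 0)
            return 0;
        if (i + 1 < n && segs[i].offset + segs[i].length != PAGE_SIZE)
            return 0;
    }
    return 1;
}
```

A readv()-style request made of two small buffers sitting at arbitrary offsets in two different pages fails sg_fits_flat(), which is the case the single offset/length design cannot express.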
Yes, to work around this limitation, there's the notion of "kiovec", an
array of kiobufs. Never mind the fact that if kiobufs had been
properly designed in the first place, you wouldn't need kiovecs at all.
And kiovecs are too damn heavy to use for something like the networking
zero-copy, with all the double indirection etc.
I told David that he can fix the network zero-copy code two ways: either
he makes it _truly_ scatter-gather (an array of not just pages, but of
proper page-offset-length tuples), or he makes it just a single area and
lets the low-level TCP/whatever code build up multiple segments
internally. Either of those is a good design.
It so happens that none of the users actually wanted multi-page
scatter-gather, and the only thing that really wanted to do the sg was
the networking layer when it created a single packet out of multiple
areas, so the zero-copy stuff uses the simpler non-array interface.
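
The "single area" alternative might look roughly like the sketch below. This is an assumption-laden illustration, not the actual zero-copy networking code: struct packet, struct frag, MAX_FRAGS and packet_add_area() are all hypothetical names. The caller hands over one (page, offset, length) area per call, and the low-level layer builds up the multi-segment packet internally:

```c
#include <stddef.h>

#define MAX_FRAGS 8   /* hypothetical per-packet fragment limit */

/* Stand-in for the kernel's struct page; hypothetical, for illustration. */
struct page { unsigned char data[4096]; };

/* One fragment of a packet: a proper page-offset-length tuple. */
struct frag {
    const struct page *page;
    size_t offset;
    size_t length;
};

/* The low-level layer owns the array; callers never see it. */
struct packet {
    struct frag frags[MAX_FRAGS];
    size_t nr_frags;
    size_t total_len;
};

/* Single-area entry point: the caller passes exactly one area per call.
 * Returns 0 on success, -1 if the packet has no room for a new fragment. */
int packet_add_area(struct packet *pkt, const struct page *page,
                    size_t offset, size_t length)
{
    if (pkt->nr_frags > 0) {
        struct frag *last = &pkt->frags[pkt->nr_frags - 1];
        /* Merge with the previous fragment when the new area directly
         * follows it within the same page. */
        if (last->page == page && last->offset + last->length == offset) {
            last->length += length;
            pkt->total_len += length;
            return 0;
        }
    }
    if (pkt->nr_frags == MAX_FRAGS)
        return -1;
    pkt->frags[pkt->nr_frags].page = page;
    pkt->frags[pkt->nr_frags].offset = offset;
    pkt->frags[pkt->nr_frags].length = length;
    pkt->nr_frags++;
    pkt->total_len += length;
    return 0;
}
```

The design choice shown here is that the scatter-gather array exists only inside the layer that actually needs it, so the interface seen by callers stays a plain single-area call.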
And kiobufs can rot in hell for their design mistakes. Maybe somebody
will listen some day and fix them up, and in the meantime they can look
at the networking code for an example of how to do it.
Linus