From: Jeff Garzik <jeff@garzik.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
"Nicholas A. Bellinger" <nab@linux-iscsi.org>,
James Bottomley <James.Bottomley@HansenPartnership.com>,
Vladislav Bolkhovitin <vst@vlnb.net>,
Bart Van Assche <bart.vanassche@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
linux-scsi@vger.kernel.org, scst-devel@lists.sourceforge.net,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Mike Christie <michaelc@cs.wisc.edu>
Subject: Re: Integration of SCST in the mainstream Linux kernel
Date: Mon, 04 Feb 2008 17:57:47 -0500
Message-ID: <47A7986B.1070206@garzik.org>
In-Reply-To: <alpine.LFD.1.00.0802041317280.3034@hp.linux-foundation.org>
Linus Torvalds wrote:
> So no, performance is not the only reason to move to kernel space. It can
> easily be things like needing direct access to internal data queues (for an
> iSCSI target, this could be things like barriers or just tagged commands -
> yes, you can probably emulate things like that without access to the
> actual IO queues, but are you sure the semantics will be entirely right?).
>
> The kernel/userland boundary is not just a performance boundary, it's an
> abstraction boundary too, and these kinds of protocols tend to break
> abstractions. NFS broke it by having "file handles" (which is not
> something that really exists in user space, and is almost impossible to
> emulate correctly), and I bet the same thing happens when emulating a SCSI
> target in user space.
Well, speaking as a complete nutter who just finished the bare bones of
an NFSv4 userland server[1]... it depends on your approach.
If the userland server is the _only_ one accessing the data[2] -- i.e.
the database server model where ls(1) shows a couple of multi-gigabyte
files or a raw partition -- then it's easy to get all the semantics
right, including file handles. You're not racing with local kernel
fileserving.
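To make the file-handle point concrete, here is a minimal sketch in Python, assuming a server that keeps every object in its own private store. The class `HandleStore` and its methods are hypothetical names made up for illustration, not taken from any real NFS server. Because nothing else touches the data, a handle can simply be a server-assigned ID that survives renames and goes stale on removal.

```python
import struct


class HandleStore:
    """Hypothetical server-private object store keyed by opaque handles."""

    def __init__(self):
        self._next_id = 1   # IDs are never reused, so old handles cannot alias
        self._names = {}    # name -> object id
        self._data = {}     # object id -> file contents

    def create(self, name):
        """Create an empty object and return its opaque 8-byte handle."""
        obj_id = self._next_id
        self._next_id += 1
        self._names[name] = obj_id
        self._data[obj_id] = bytearray()
        return struct.pack("!Q", obj_id)

    def rename(self, old, new):
        """Change the name only; the object id (and so the handle) is kept."""
        self._names[new] = self._names.pop(old)

    def remove(self, name):
        """Delete the object; any handle still held by a client goes stale."""
        obj_id = self._names.pop(name)
        del self._data[obj_id]

    def resolve(self, handle):
        """Map a handle back to its contents, or report it as stale."""
        (obj_id,) = struct.unpack("!Q", handle)
        try:
            return self._data[obj_id]
        except KeyError:
            raise LookupError("stale file handle") from None
```

The handle is keyed on the ID rather than the path, which is the property that is hard to emulate when the kernel or other local users can rename or replace files behind the server's back.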
Couple that with sendfile(2), sync_file_range(2) and a few other
Linux-specific syscalls, and you've got an efficient NFS file server.
It becomes a solution similar to Apache or MySQL or Oracle.
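As a rough sketch of that data path, assuming Linux and Python's `os` module: `os.sendfile` wraps sendfile(2) directly. Python's standard library does not expose sync_file_range(2), so the write side below uses `os.fdatasync` as a coarser stand-in. The helper names `send_range` and `write_stable` are hypothetical.

```python
import os


def send_range(sock, fd, offset, count):
    """Send up to `count` bytes of `fd`, starting at `offset`, over `sock`.

    Uses sendfile(2), so the data is not copied through user space.
    Returns the number of bytes actually sent.
    """
    sent = 0
    while sent < count:
        n = os.sendfile(sock.fileno(), fd, offset + sent, count - sent)
        if n == 0:  # reached end of file before `count` bytes
            break
        sent += n
    return sent


def write_stable(fd, offset, data):
    """Write `data` at `offset` and flush it to stable storage.

    Assumption: fdatasync(2) stands in for sync_file_range(2) here; it
    flushes the whole file's data rather than a single byte range.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        written += os.pwrite(fd, view[written:], offset + written)
    os.fdatasync(fd)
```

A read reply would call `send_range` with the client's socket and the backing file's descriptor; a write that must be stable before the reply goes out would call `write_stable` first.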
I quite grant there are many good reasons to do the NFS or iSCSI data
path in the kernel... my point is more that "impossible" is just from one
point of view ;-)
> Maybe not. I _really_ haven't looked into iSCSI, I'm just guessing there
> would be things like ordering issues.
iSCSI and NBD were passé ideas at birth. :)
Networked block devices are attractive because the concepts and
implementation are simpler than networked filesystems... but usually
you want to run some sort of filesystem on top. At that point you might
as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your
networked block device (and associated complexity).
iSCSI is barely useful, because at least someone finally standardized
SCSI over LAN/WAN.
But you just don't need its complexity if your filesystem must have its
own authentication, distributed coordination, and multiple-connection
management code anyway.
Jeff
P.S. Clearly my NFSv4 server is NOT intended to replace the kernel one.
It's more for experiments, and doing FUSE-like filesystem work.
[1] http://linux.yyz.us/projects/nfsv4.html
[2] well, outside of dd(1) and similar tricks... the same "going around
its back" tricks that can screw up a mounted filesystem.