From: Vladislav Bolkhovitin <vst@vlnb.net>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Bart Van Assche <bart.vanassche@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>,
linux-scsi@vger.kernel.org, scst-devel@lists.sourceforge.net,
linux-kernel@vger.kernel.org
Subject: Re: Integration of SCST in the mainstream Linux kernel
Date: Mon, 04 Feb 2008 21:38:02 +0300
Message-ID: <47A75B8A.3020503@vlnb.net>
In-Reply-To: <1202149322.3096.66.camel@localhost.localdomain>

James Bottomley wrote:
> On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:
>
>>James Bottomley wrote:
>>
>>>On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:
>>>
>>>
>>>>James Bottomley wrote:
>>>>
>>>>
>>>>>>>>So, James, what is your opinion on the above? Or the overall SCSI target
>>>>>>>>project simplicity doesn't matter much for you and you think it's fine
>>>>>>>>to duplicate Linux page cache in the user space to keep the in-kernel
>>>>>>>>part of the project as small as possible?
>>>>>>>
>>>>>>>
>>>>>>>The answers were pretty much contained here
>>>>>>>
>>>>>>>http://marc.info/?l=linux-scsi&m=120164008302435
>>>>>>>
>>>>>>>and here:
>>>>>>>
>>>>>>>http://marc.info/?l=linux-scsi&m=120171067107293
>>>>>>>
>>>>>>>Weren't they?
>>>>>>
>>>>>>No, sorry, it doesn't look so for me. They are about performance, but
>>>>>>I'm asking about the overall project's architecture, namely about one
>>>>>>part of it: simplicity. Particularly, what do you think about
>>>>>>duplicating Linux page cache in the user space to have zero-copy cached
>>>>>>I/O? Or can you suggest another architectural solution for that problem
>>>>>>in the STGT's approach?
>>>>>
>>>>>
>>>>>Isn't that an advantage of a user space solution? It simply uses the
>>>>>backing store of whatever device supplies the data. That means it takes
>>>>>advantage of the existing mechanisms for caching.
>>>>
>>>>No, please reread this thread, especially this message:
>>>>http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of
>>>>the advantages of the kernel space implementation. The user space
>>>>implementation has to have data copied between the cache and user space
>>>>buffer, but the kernel space one can use pages in the cache directly,
>>>>without extra copy.
>>>
>>>
>>>Well, you've said it thrice (the bellman cried) but that doesn't make it
>>>true.
>>>
>>>The way a user space solution should work is to schedule mmapped I/O
>>>from the backing store and then send this mmapped region off for target
>>>I/O. For reads, the page gather will ensure that the pages are up to
>>>date from the backing store to the cache before sending the I/O out.
>>>For writes, You actually have to do a msync on the region to get the
>>>data secured to the backing store.
>>
>>James, have you checked how fast is mmaped I/O if work size > size of
>>RAM? It's several times slower comparing to buffered I/O. It was many
>>times discussed in LKML and, seems, VM people consider it unavoidable.
>
>
> Erm, but if you're using the case of work size > size of RAM, you'll
> find buffered I/O won't help because you don't have the memory for
> buffers either.

James, just check and you will see: buffered I/O is a lot faster.

>>So, using mmaped IO isn't an option for high performance. Plus, mmaped
>>IO isn't an option for high reliability requirements, since it doesn't
>>provide a practical way to handle I/O errors.
>
> I think you'll find it does ... the page gather returns -EFAULT if
> there's an I/O error in the gathered region.

Err, return to whom? If you try to read from an mmapped page that can't
be populated because of an I/O error, you will get SIGBUS or SIGSEGV, I
don't remember exactly which. It's quite tricky to get back to the
faulted command from the signal handler.

Or do you mean mmap(MAP_POPULATE)/munmap() for each command? Do you
think that such mapping/unmapping is good for performance?

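To make the signal-handler problem concrete, here is a minimal sketch of
what a user space target would have to do to survive a fault on a mapped
page. It is not taken from SCST or STGT; the names guarded_copy() and
sigbus_demo() are hypothetical. As an assumption for the demo, the fault
is simulated by touching a mapped page that lies beyond the end of the
file, which raises SIGBUS on Linux, rather than by a real media error.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Recovery point for the command currently touching mapped memory.
 * A single global is only good enough for a single-threaded demo. */
static sigjmp_buf fault_env;
static volatile sig_atomic_t fault_armed;

static void on_sigbus(int sig)
{
    (void)sig;
    if (fault_armed)
        siglongjmp(fault_env, 1);   /* unwind back into guarded_copy() */
    _exit(1);                       /* SIGBUS from an unguarded access */
}

/* Copy len bytes out of a mapping.
 * Returns 0 on success, -1 if the access raised SIGBUS. */
static int guarded_copy(void *dst, const void *src, size_t len)
{
    const volatile unsigned char *s = src;  /* volatile: reads must happen */
    unsigned char *d = dst;
    size_t i;

    if (sigsetjmp(fault_env, 1) != 0) {     /* we get here via the handler */
        fault_armed = 0;
        return -1;
    }
    fault_armed = 1;
    for (i = 0; i < len; i++)
        d[i] = s[i];
    fault_armed = 0;
    return 0;
}

/* Maps two pages over a one-page temporary file and reads from both.
 * Returns 0 if the first read succeeds and the second one faults. */
static int sigbus_demo(void)
{
    char path[] = "/tmp/sigbus_demoXXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    struct sigaction sa;
    unsigned char buf[16];
    int fd, first, second;
    char *map;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, 2 * (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    first = guarded_copy(buf, map, sizeof(buf));          /* inside the file */
    second = guarded_copy(buf, map + page, sizeof(buf));  /* past end of file */
    munmap(map, 2 * (size_t)page);

    return (first == 0 && second == -1) ? 0 : -1;
}
```

Calling sigbus_demo() from main() is enough to try it. Note what the
sketch leaves out: with many commands in flight on several threads, the
handler would need per-thread state to find the command that faulted,
and every access to the mapping would have to go through such a guard.
With buffered I/O the error simply comes back as the return value of the
read or write call that was issued for that command.
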
> msync does something
> similar if there's a write failure.
>
>>>You also have to pull tricks with
>>>the mmap region in the case of writes to prevent useless data being read
>>>in from the backing store.
>>
>>Can you be more exact and specify what kind of tricks should be done for
>>that?
>
> Actually, just avoid touching it seems to do the trick with a recent
> kernel.

Hmm, how can one write to an mmapped page without touching it?

> James
>
>
>