* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: Ming Zhang @ 2005-03-01 19:29 UTC
To: Vikas Aggarwal
Cc: Bryan Henderson, Tomonori Fujita, arjan, iet-dev, linux-scsi

On Tue, 2005-03-01 at 14:01, Vikas Aggarwal wrote:
> And if a future IET is envisioned as an Enterprise Class Array (multiple
> Host-Side Adapters, i.e., FAs, plus multiple Device-Side Adapters, i.e., DAs),
> it had better be in direct control of all the system resources without
> being pushed out of the kernel.

Of course, this is the reason why we use that name.

ming

> Bryan Henderson <hbryan@us.ibm.com> wrote:
> One thing that's implicit in your reasons for wanting to be in the kernel
> is that you've chosen to exploit the kernel's page cache. As a user of
> the page cache, you have more control from inside the kernel than from
> user space. The page cache was designed to be fundamentally invisible to
> user space.
>
> A pure user space implementation of an ISCSI target would use process
> virtual memory for a cache and manage it itself. It would access the
> storage with direct I/O. It looks to me like this is aimed at a
> single-application Linux system (the whole system is just an ISCSI
> target), which means there's not much need for a kernel to manage shared
> resources.
>
> --
> Bryan Henderson                     IBM Almaden Research Center
> San Jose CA                         Filesystems
* [ANNOUNCE] iSCSI enterprise target software
From: FUJITA Tomonori @ 2005-03-01 7:19 UTC
To: linux-scsi; +Cc: iscsitarget-devel

Hi,

I would like to announce the iSCSI enterprise target (IET) software,
which is open-source software for building iSCSI storage systems. It can
provide disk volumes to iSCSI initiators by using any kind of file
(regular files, block devices, virtual block devices like RAID and LVM,
etc).

The project was started by forking the Ardis target implementation
(http://www.ardistech.com/iscsi/) about one year ago.

The source code and further information are available from:

http://iscsitarget.sourceforge.net/

The user-space daemon handles authentication and the kernel threads
take care of network and disk I/O requests from initiators by using
the VFS interface. The kernel-space code is not intrusive. It doesn't
touch other parts of the kernel. The code is already stable and
usable. More than 130 people currently subscribe to the project's
mailing list. The developers aim for inclusion into the mainline
kernel.

The latest code against 2.6.11-rc5 for review can be found at:

http://zaal.org/iscsi/iet/0.4.6/r996.tar.gz

Could you please review the code? Any comments are greatly
appreciated.
* Re: [ANNOUNCE] iSCSI enterprise target software
From: Arjan van de Ven @ 2005-03-01 8:40 UTC
To: FUJITA Tomonori; +Cc: linux-scsi, iscsitarget-devel

On Tue, 2005-03-01 at 16:19 +0900, FUJITA Tomonori wrote:
> The user-space daemon handles authentication and the kernel threads
> take care of network and disk I/O requests from initiators by using
> the VFS interface. The kernel-space code is not intrusive. It doesn't
> touch other parts of the kernel. The code is already stable and
> usable. More than 130 people currently subscribe to the project's
> mailing list. The developers aim for inclusion into the mainline
> kernel.

> Could you please review the code? Any comments are greatly
> appreciated.

can you explain why the target has to be inside the kernel and can't be
a pure userspace daemon ?
* Re: [ANNOUNCE] iSCSI enterprise target software
From: FUJITA Tomonori @ 2005-03-01 9:35 UTC
To: linux-scsi; +Cc: arjan, iscsitarget-devel

From: Arjan van de Ven <arjan@infradead.org>
Subject: Re: [ANNOUNCE] iSCSI enterprise target software
Date: Tue, 01 Mar 2005 09:40:38 +0100

> > Could you please review the code? Any comments are greatly
> > appreciated.
>
> can you explain why the target has to be inside the kernel and can't be
> a pure userspace daemon ?

o synchronization

Suppose that a target runs in the user space and an initiator sends
two WRITE commands (A and B) with the simple attribute.

The target can write A and B simultaneously. Before the target sends
the response of A, A must be committed to disk (that is, some dirty
page cache must be committed). So the target calls fsync(). It commits
A to disk. Moreover, it also commits B to disk unnecessarily. This
really hurts performance.

The current code uses the sync_page_range function.

o disk drive cache

When the target calls fsync(), dirty page cache is supposed to be
committed to disk. However, if the disk drive uses a write-back policy, it
is not. The data is still in disk drive cache. There is no system call
to control disk drive cache. So the target (in the user space) cannot
make good use of it.

The current code also assumes the disk drive uses write-through
policy. This is because there is no handy VFS interface for controlling disk
drive cache. I think that there is some room for further improvement
in the Linux kernel for storage systems.

If the kernel maintainers add new system calls to do the above jobs
for storage systems, we can implement good iSCSI target software
running in the user space.

The last reason is user-space cost such as memory copies. With 1Gbs
Ethernet, it is not critical. However, with 10G, it is critical, I
expect. I've been setting up 10G experimental infrastructure to
evaluate iSCSI performance.
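The synchronization point above can be seen from user space. What follows is only a minimal sketch in Python, whose os.pwrite() and os.fsync() wrap the system calls of the same names; the helper name commit_write, the block size, and the temporary file are assumptions made for illustration and are not IET code. It shows that fsync() takes nothing but a file descriptor, so committing WRITE A also asks the kernel to flush WRITE B's dirty pages.

```python
import os
import tempfile

BLOCK = 4096  # illustrative block size, not taken from IET


def commit_write(fd, offset, data):
    """Write `data` at `offset`, then force it to stable storage.

    os.fsync() takes only a file descriptor: there is no offset/length
    argument, so every dirty page of the file is flushed, not just ours.
    """
    os.pwrite(fd, data, offset)
    os.fsync(fd)


def demo():
    fd, path = tempfile.mkstemp()
    try:
        # WRITE B lands in the page cache first and is still dirty.
        os.pwrite(fd, b"B" * BLOCK, BLOCK)
        # Committing WRITE A also flushes B's pages as a side effect.
        commit_write(fd, 0, b"A" * BLOCK)
        return os.pread(fd, 2 * BLOCK, 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

The sketch cannot measure the wasted flush of B; it only makes visible that the call has no way to name a byte range.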
* Re: [ANNOUNCE] iSCSI enterprise target software
From: Arjan van de Ven @ 2005-03-01 9:46 UTC
To: FUJITA Tomonori; +Cc: linux-scsi, iscsitarget-devel

On Tue, 2005-03-01 at 18:35 +0900, FUJITA Tomonori wrote:
> From: Arjan van de Ven <arjan@infradead.org>
> Subject: Re: [ANNOUNCE] iSCSI enterprise target software
> Date: Tue, 01 Mar 2005 09:40:38 +0100
>
> > > Could you please review the code? Any comments are greatly
> > > appreciated.
> >
> > can you explain why the target has to be inside the kernel and can't be
> > a pure userspace daemon ?
>
> o synchronization
>
> Suppose that a target runs in the user space and an initiator sends
> two WRITE commands (A and B) with the simple attribute.
>
> The target can write A and B simultaneously. Before the target sends
> the response of A, A must be committed to disk (that is, some dirty
> page cache must be committed). So the target calls fsync(). It commits
> A to disk. Moreover, it also commits B to disk unnecessarily. This
> really hurts performance.

fsync or msync() ? I would imagine the target mmapping its backend in
userspace and using msync() to kick off IO. At which point it's not that
much different from the control you do of the pagecache from inside the
kernel...

> o disk drive cache
>
> When the target calls fsync(), dirty page cache is supposed to be
> committed to disk. However, if the disk drive uses a write-back policy, it
> is not. The data is still in disk drive cache. There is no system call
> to control disk drive cache. So the target (in the user space) cannot
> make good use of it.

fsync() (and I suppose msync()) nowadays send a "flush cache" command to
the physical disk as well. This is new since 2.6.9 or so.

> The current code also assumes the disk drive uses write-through
> policy. This is because there is no handy VFS interface for controlling disk
> drive cache. I think that there is some room for further improvement
> in the Linux kernel for storage systems.

that's already present since 2.6.9.....

> The last reason is user-space cost such as memory copies. With 1Gbs
> Ethernet, it is not critical. However, with 10G, it is critical, I
> expect. I've been setting up 10G experimental infrastructure to
> evaluate iSCSI performance.

if you use the mmap not write/read approach this copy isn't there.
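The mmap()/msync() suggestion above can be sketched as follows. This is an illustration under stated assumptions, not code from IET or from any poster: it uses Python's mmap module, whose flush(offset, size) method is, to my understanding, implemented with msync() on Unix; the helper name write_and_sync_range and the four-page backing file are hypothetical. The point it tries to show is that only the pages covering WRITE A are flushed, while WRITE B's pages are left dirty.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def write_and_sync_range(mm, offset, data):
    """Store `data` into the mapping and flush only the pages it covers."""
    mm[offset:offset + len(data)] = data
    # Round the range out to page boundaries, as msync() requires.
    start = (offset // PAGE) * PAGE
    end = -(-(offset + len(data)) // PAGE) * PAGE
    mm.flush(start, end - start)  # ranged flush, not the whole file
    return start, end - start


def demo():
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4 * PAGE)  # hypothetical 4-page backing store
        with mmap.mmap(fd, 4 * PAGE) as mm:
            mm[2 * PAGE:3 * PAGE] = b"B" * PAGE  # WRITE B: left dirty
            flushed = write_and_sync_range(mm, 0, b"A" * PAGE)  # WRITE A
        return flushed, os.pread(fd, PAGE, 0)
    finally:
        os.close(fd)
        os.unlink(path)
```

Whether the ranged flush also reaches the drive's own cache depends on the kernel version and the drive, as discussed in the message above; the sketch makes no claim about that.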
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: FUJITA Tomonori @ 2005-03-01 10:22 UTC
To: arjan; +Cc: linux-scsi, iscsitarget-devel

From: Arjan van de Ven <arjan@infradead.org>
Subject: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
Date: Tue, 01 Mar 2005 10:46:03 +0100

> fsync or msync() ? I would imagine the target mmapping its backend in
> userspace and using msync() to kick off IO. At which point it's not that
> much different from the control you do of the pagecache from inside the
> kernel...

Can we avoid calling mmap() and munmap() repeatedly with a large disk?

> > When the target calls fsync(), dirty page cache is supposed to be
> > committed to disk. However, if the disk drive uses a write-back policy, it
> > is not. The data is still in disk drive cache. There is no system call
> > to control disk drive cache. So the target (in the user space) cannot
> > make good use of it.
>
> fsync() (and I suppose msync()) nowadays send a "flush cache" command to
> the physical disk as well. This is new since 2.6.9 or so.
>
> > The current code also assumes the disk drive uses write-through
> > policy. This is because there is no handy VFS interface for controlling disk
> > drive cache. I think that there is some room for further improvement
> > in the Linux kernel for storage systems.
>
> that's already present since 2.6.9.....

Thanks a lot. I had not noticed these changes. I'll look at the code later.
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: Arjan van de Ven @ 2005-03-01 10:33 UTC
To: FUJITA Tomonori; +Cc: linux-scsi, iscsitarget-devel

On Tue, 2005-03-01 at 19:22 +0900, FUJITA Tomonori wrote:
> From: Arjan van de Ven <arjan@infradead.org>
> Subject: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
> Date: Tue, 01 Mar 2005 10:46:03 +0100
>
> > fsync or msync() ? I would imagine the target mmapping its backend in
> > userspace and using msync() to kick off IO. At which point it's not that
> > much different from the control you do of the pagecache from inside the
> > kernel...
>
> Can we avoid calling mmap() and munmap() repeatedly with a large disk?

my server has 512Gb address space with 2.6.9/2.6.10, and a lot more than
that with the 2.6.11 kernel (4 level page tables rock). So the answer
would be yes.

(and on old servers without 64 bit, you indeed need to mmap/munmap
lazily to create a window, but I suspect that the 3 Gb of address space
you have there can be managed smartly to minimize the number of unmaps if
you really try)
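The lazily remapped window described above for machines with limited address space can be sketched like this. It is a hypothetical illustration in Python, not something taken from IET or proposed verbatim in the thread: the class name WindowedStore, the window size, and the remap counter are all assumptions. The mapping is replaced only when an access falls outside the current window, so accesses with good locality pay for very few mmap()/munmap() cycles.

```python
import mmap
import os
import tempfile

GRAN = mmap.ALLOCATIONGRANULARITY


class WindowedStore:
    """Map one fixed-size window of a file at a time, remapping lazily."""

    def __init__(self, fd, window_size):
        assert window_size % GRAN == 0, "window must be granularity-aligned"
        self.fd = fd
        self.window_size = window_size
        self.base = None   # file offset of the current window
        self.mm = None
        self.remaps = 0    # how many times a new window was mapped

    def _window_for(self, offset):
        base = (offset // self.window_size) * self.window_size
        if base != self.base:
            if self.mm is not None:
                self.mm.close()  # unmap the old window
            self.mm = mmap.mmap(self.fd, self.window_size, offset=base)
            self.base = base
            self.remaps += 1
        return self.mm

    def read(self, offset, length):
        """Read bytes; this sketch assumes the range stays in one window."""
        mm = self._window_for(offset)
        rel = offset - self.base
        assert rel + length <= self.window_size
        return mm[rel:rel + length]

    def close(self):
        if self.mm is not None:
            self.mm.close()
            self.mm = None


def demo():
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 4 * GRAN)           # stand-in for a large disk
        os.pwrite(fd, b"x", 3 * GRAN)
        store = WindowedStore(fd, 2 * GRAN)  # window covers half the file
        store.read(0, 16)
        store.read(GRAN, 16)                 # same window: no remap
        last = store.read(3 * GRAN, 1)       # other window: one remap
        remaps = store.remaps
        store.close()
        return remaps, last
    finally:
        os.close(fd)
        os.unlink(path)
```

A real target would also have to handle requests that straddle two windows and would likely keep several windows mapped at once; both are left out to keep the sketch short.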
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: Arjan van de Ven @ 2005-03-01 10:46 UTC
To: FUJITA Tomonori; +Cc: linux-scsi, iscsitarget-devel

On Tue, 2005-03-01 at 11:33 +0100, Arjan van de Ven wrote:
> On Tue, 2005-03-01 at 19:22 +0900, FUJITA Tomonori wrote:
> > From: Arjan van de Ven <arjan@infradead.org>
> > Subject: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
> > Date: Tue, 01 Mar 2005 10:46:03 +0100
> >
> > > fsync or msync() ? I would imagine the target mmapping its backend in
> > > userspace and using msync() to kick off IO. At which point it's not that
> > > much different from the control you do of the pagecache from inside the
> > > kernel...
> >
> > Can we avoid calling mmap() and munmap() repeatedly with a large disk?
>
> my server has 512Gb address space with 2.6.9/2.6.10, and a lot more than
> that with the 2.6.11 kernel (4 level page tables rock). So the answer
> would be yes.
>
> (and on old servers without 64 bit, you indeed need to mmap/munmap
> lazily to create a window, but I suspect that the 3 Gb of address space
> you have there can be managed smartly to minimize the number of unmaps if
> you really try)

note that on 32 bit servers the kernel side needs to do kmap() on the
pages anyway, and that a kmap/kunmap series is very much equivalent to a
mmap/munmap series in lots of ways, so I doubt there are many additional
savings from doing it in kernel space.
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: FUJITA Tomonori @ 2005-03-01 11:23 UTC
To: arjan; +Cc: linux-scsi, iscsitarget-devel

From: Arjan van de Ven <arjan@infradead.org>
Subject: Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
Date: Tue, 01 Mar 2005 11:46:32 +0100

> note that on 32 bit servers the kernel side needs to do kmap() on the
> pages anyway, and that a kmap/kunmap series is very much equivalent to a
> mmap/munmap series in lots of ways, so I doubt there are many additional
> savings from doing it in kernel space.

The code uses the VFS interface, and kmap_atomic() is used instead of
kmap(). kmap_atomic() is much faster than kmap().
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: Libor Vanek @ 2005-03-01 10:48 UTC
To: Arjan van de Ven; +Cc: FUJITA Tomonori, linux-scsi, iscsitarget-devel

Arjan van de Ven wrote:

>On Tue, 2005-03-01 at 19:22 +0900, FUJITA Tomonori wrote:
>
>>From: Arjan van de Ven <arjan@infradead.org>
>>Subject: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
>>Date: Tue, 01 Mar 2005 10:46:03 +0100
>>
>>>fsync or msync() ? I would imagine the target mmapping its backend in
>>>userspace and using msync() to kick off IO. At which point it's not that
>>>much different from the control you do of the pagecache from inside the
>>>kernel...
>>>
>>Can we avoid calling mmap() and munmap() repeatedly with a large disk?
>>
>
>my server has 512Gb address space with 2.6.9/2.6.10, and a lot more than
>that with the 2.6.11 kernel (4 level page tables rock). So the answer
>would be yes.
>
>(and on old servers without 64 bit, you indeed need to mmap/munmap
>lazily to create a window, but I suspect that the 3 Gb of address space
>you have there can be managed smartly to minimize the number of unmaps if
>you really try)
>

I don't know in detail what you are talking about (whether the whole disk
must fit in the address space), but please consider that we're speaking
about TBs (a 10-20 TB RAID is quite cheap nowadays with 400 GB SATA disks).

--

Best regards,

Libor Vanek
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: Arjan van de Ven @ 2005-03-01 10:51 UTC
To: Libor Vanek; +Cc: FUJITA Tomonori, linux-scsi, iscsitarget-devel

On Tue, 2005-03-01 at 11:48 +0100, Libor Vanek wrote:
> I don't know in detail what are you talking about (if whole disk must
> fit address space) but please consider we're speaking about TBs (10-20
> TB RAID is quite cheap nowadays with 400 GB SATA disks).

so? if you need one map/unmap per terabyte, the cost of that is like
zero.
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: FUJITA Tomonori @ 2005-03-02 5:04 UTC
To: arjan; +Cc: linux-scsi, iscsitarget-devel

From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
Date: Tue, 01 Mar 2005 18:35:49 +0900

> The last reason is user-space cost such as memory copies. With 1Gbs
> Ethernet, it is not critical. However, with 10G, it is critical, I
> expect. I've been setting up 10G experimental infrastructure to
> evaluate iSCSI performance.

If we try to build high-performance iSCSI target software on 10Gbs
Ethernet, I think that we need to implement it in the kernel
space, although this topic is still in the research stage.

IIRC, Intel provides open-source iSCSI target software, which uses
mmap() in the user space, as you suggested. However, they chose a
different approach to implement iSCSI target software for 10Gbs. They
modified the Linux kernel TCP stack to fully integrate iSCSI
functionality with it. It is still not clear how to get the best
performance out of 10Gbs Ethernet; however, it seems that using just
the socket interface is not good enough. So I think that we need to
control all the system resources to do it.

We've not optimized our code yet. The current code simply uses the
socket interface; however, moving some functionality to the TCP stack
can improve the performance, although I don't plan to modify the TCP
stack. A possible approach to do it is exploiting sk_data_ready().

In addition, you can find several papers (e.g., Chelsio technical
papers) saying that a UP kernel can achieve better throughput than SMP
kernels with 10Gbs. There is some room for further improvement in the
Linux kernel.
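For the user-space mmap() approach mentioned in this message, the copy that disappears is the one read() would make into a private user-space buffer. The following is a minimal sketch of that idea in Python, with every name in it (send_from_mapping, the socketpair standing in for the initiator connection, the one-page backing file) assumed for illustration. The kernel still copies the bytes into socket buffers when sending, so this says nothing about the in-kernel or TCP-stack optimizations discussed above.

```python
import mmap
import os
import socket
import tempfile


def send_from_mapping(sock, mm, offset, length):
    """Send file bytes straight from the mapping, without a read() buffer."""
    # memoryview slices reference the mapped pages; no bytes object is built.
    with memoryview(mm) as view, view[offset:offset + length] as chunk:
        sock.sendall(chunk)


def demo():
    size = mmap.PAGESIZE
    fd, path = tempfile.mkstemp()
    try:
        os.pwrite(fd, b"d" * size, 0)       # hypothetical backing store
        left, right = socket.socketpair()   # stands in for the initiator link
        with left, right, mmap.mmap(fd, size) as mm:
            send_from_mapping(left, mm, 0, 512)
            received = b""
            while len(received) < 512:
                received += right.recv(512 - len(received))
        return received
    finally:
        os.close(fd)
        os.unlink(path)
```

Both memoryview objects are released before the mapping is closed, which Python requires for buffers exported from an mmap object.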
* Re: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
From: Dmitry Yusupov @ 2005-03-02 5:21 UTC
To: FUJITA Tomonori; +Cc: arjan, linux-scsi, iscsitarget-devel

On Wed, 2005-03-02 at 14:04 +0900, FUJITA Tomonori wrote:
> From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Subject: [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software
> Date: Tue, 01 Mar 2005 18:35:49 +0900
>
> > The last reason is user-space cost such as memory copies. With 1Gbs
> > Ethernet, it is not critical. However, with 10G, it is critical, I
> > expect. I've been setting up 10G experimental infrastructure to
> > evaluate iSCSI performance.
>
> If we try to build high-performance iSCSI target software on 10Gbs
> Ethernet, I think that we need to implement it in the kernel
> space, although this topic is still in the research stage.
>
> IIRC, Intel provides open-source iSCSI target software, which uses
> mmap() in the user space, as you suggested. However, they chose a
> different approach to implement iSCSI target software for 10Gbs. They
> modified the Linux kernel TCP stack to fully integrate iSCSI
> functionality with it. It is still not clear how to get the best
> performance out of 10Gbs Ethernet; however, it seems that using just
> the socket interface is not good enough. So I think that we need to
> control all the system resources to do it.
>
> We've not optimized our code yet. The current code simply uses the
> socket interface; however, moving some functionality to the TCP stack
> can improve the performance, although I don't plan to modify the TCP
> stack. A possible approach to do it is exploiting sk_data_ready().

The www.open-iscsi.org project uses this technique on the receive path
and shows very nice performance numbers for Read operations, comparable
with HW accelerators. Adding such functionality to an existing project
will probably require a total redesign.

Regards,
Dima
Thread overview: 9+ messages
[not found] <20050301190140.97212.qmail@web30006.mail.mud.yahoo.com>
2005-03-01 19:29 ` [Iscsitarget-devel] Re: [ANNOUNCE] iSCSI enterprise target software Ming Zhang
2005-03-01 7:19 FUJITA Tomonori
2005-03-01 8:40 ` Arjan van de Ven
2005-03-01 9:35 ` FUJITA Tomonori
2005-03-01 9:46 ` Arjan van de Ven
2005-03-01 10:22 ` [Iscsitarget-devel] " FUJITA Tomonori
2005-03-01 10:33 ` Arjan van de Ven
2005-03-01 10:46 ` Arjan van de Ven
2005-03-01 11:23 ` FUJITA Tomonori
2005-03-01 10:48 ` Libor Vanek
2005-03-01 10:51 ` Arjan van de Ven
2005-03-02 5:04 ` FUJITA Tomonori
2005-03-02 5:21 ` Dmitry Yusupov