* Re: working on extent locks for i_mutex
From: Dave Chinner @ 2012-01-13 4:34 UTC
To: Allison Henderson; +Cc: Lukas Czerner, Ext4 Developers List, xfs

On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> Hi All,
>
> I know this is an old topic, but I am poking it again because I've
> had some work items wrap up, and I'm planning on picking up on this
> one again. I am thinking about implementing extent locks to replace
> i_mutex. So I just wanted to touch base with folks and see what
> people are working on because I know there were some folks out there
> that were thinking about doing similar solutions.

What locking API are you looking at? If you are looking at
something like:

    read_range_{try}lock(lock, off, len)
    read_range_unlock(lock, off, len)
    write_range_{try}lock(lock, off, len)
    write_range_unlock(lock, off, len)

and implementing with an rbtree or a btree for tracking, then I
definitely have a use for it in XFS - replacing the current rwsem
that is used for the iolock. Range locks like this are the only
thing we need to allow concurrent buffered writes to the same file
to maintain the per-write exclusion that posix requires.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
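To make the proposed interface concrete, here is a minimal user-space sketch of a read/write range lock. Only the six function names come from the message above. The `struct range_lock` layout, the return types, the half-open byte ranges, and the use of a pthread mutex, a condition variable and a linked list are all assumptions made for illustration; this is not kernel code and not anyone's actual patch, and a real version would track ranges in an rbtree or btree as suggested.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* One held range. A plain list keeps the sketch short (assumption);
 * the message above suggests an rbtree or btree for tracking. */
struct held_range {
    uint64_t start, end;        /* half-open byte range [start, end) */
    bool exclusive;             /* true for write locks */
    struct held_range *next;
};

struct range_lock {
    pthread_mutex_t mu;         /* protects the list of held ranges */
    pthread_cond_t released;    /* broadcast whenever a range is dropped */
    struct held_range *held;
};

void range_lock_init(struct range_lock *rl)
{
    pthread_mutex_init(&rl->mu, NULL);
    pthread_cond_init(&rl->released, NULL);
    rl->held = NULL;
}

/* Two ranges conflict if they overlap and at least one is exclusive. */
static bool range_conflicts(const struct range_lock *rl, uint64_t start,
                            uint64_t end, bool exclusive)
{
    for (const struct held_range *h = rl->held; h; h = h->next)
        if (h->start < end && start < h->end && (exclusive || h->exclusive))
            return true;
    return false;
}

/* Returns false only when wait is false and the range is contended.
 * No fairness: a steady stream of readers can starve a writer. */
static bool range_acquire(struct range_lock *rl, uint64_t off, uint64_t len,
                          bool exclusive, bool wait)
{
    struct held_range *h = malloc(sizeof(*h));

    assert(len > 0);
    if (!h)
        abort();                /* sketch only: no graceful OOM handling */
    h->start = off;
    h->end = off + len;
    h->exclusive = exclusive;

    pthread_mutex_lock(&rl->mu);
    while (range_conflicts(rl, h->start, h->end, exclusive)) {
        if (!wait) {
            pthread_mutex_unlock(&rl->mu);
            free(h);
            return false;
        }
        pthread_cond_wait(&rl->released, &rl->mu);
    }
    h->next = rl->held;
    rl->held = h;
    pthread_mutex_unlock(&rl->mu);
    return true;
}

/* Drops one held range that matches (off, len, mode) exactly. */
static void range_release(struct range_lock *rl, uint64_t off, uint64_t len,
                          bool exclusive)
{
    pthread_mutex_lock(&rl->mu);
    for (struct held_range **pp = &rl->held; *pp; pp = &(*pp)->next) {
        struct held_range *h = *pp;
        if (h->start == off && h->end == off + len && h->exclusive == exclusive) {
            *pp = h->next;
            free(h);
            break;
        }
    }
    pthread_cond_broadcast(&rl->released);
    pthread_mutex_unlock(&rl->mu);
}

void read_range_lock(struct range_lock *rl, uint64_t off, uint64_t len)
{ (void)range_acquire(rl, off, len, false, true); }
bool read_range_trylock(struct range_lock *rl, uint64_t off, uint64_t len)
{ return range_acquire(rl, off, len, false, false); }
void read_range_unlock(struct range_lock *rl, uint64_t off, uint64_t len)
{ range_release(rl, off, len, false); }
void write_range_lock(struct range_lock *rl, uint64_t off, uint64_t len)
{ (void)range_acquire(rl, off, len, true, true); }
bool write_range_trylock(struct range_lock *rl, uint64_t off, uint64_t len)
{ return range_acquire(rl, off, len, true, false); }
void write_range_unlock(struct range_lock *rl, uint64_t off, uint64_t len)
{ range_release(rl, off, len, true); }
```

In this sketch two writers to disjoint ranges of the same file never wait on each other, while overlapping writers serialise, which is the per-write exclusion the message refers to. The linear scan is the part a tree would replace.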
* Re: working on extent locks for i_mutex
From: Tao Ma @ 2012-01-13 7:14 UTC
To: Dave Chinner; +Cc: Lukas Czerner, xfs, Ext4 Developers List, Allison Henderson

On 01/13/2012 12:34 PM, Dave Chinner wrote:
> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>> Hi All,
>>
>> I know this is an old topic, but I am poking it again because I've
>> had some work items wrap up, and I'm planning on picking up on this
>> one again. I am thinking about implementing extent locks to replace
>> i_mutex. So I just wanted to touch base with folks and see what
>> people are working on because I know there were some folks out there
>> that were thinking about doing similar solutions.
>
> What locking API are you looking at? If you are looking at
> something like:
>
>     read_range_{try}lock(lock, off, len)
>     read_range_unlock(lock, off, len)
>     write_range_{try}lock(lock, off, len)
>     write_range_unlock(lock, off, len)
>
> and implementing with an rbtree or a btree for tracking, then I
> definitely have a use for it in XFS - replacing the current rwsem
> that is used for the iolock. Range locks like this are the only
> thing we need to allow concurrent buffered writes to the same file
> to maintain the per-write exclusion that posix requires.
Interesting, so xfs already has these range locks, right? If yes, any
possibility that the code can be reused in ext4, since we have the same
thing in mind but don't have any resources to work on it right now?

btw, IIRC flock(2) uses a list to indicate the range lock, so if we can
make these pieces of code common, there are at least 3 places that can
benefit from it. ;)

Thanks
Tao
* Re: working on extent locks for i_mutex
From: Dave Chinner @ 2012-01-13 11:52 UTC
To: Tao Ma; +Cc: Lukas Czerner, xfs, Ext4 Developers List, Allison Henderson

On Fri, Jan 13, 2012 at 03:14:51PM +0800, Tao Ma wrote:
> On 01/13/2012 12:34 PM, Dave Chinner wrote:
> > On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >> Hi All,
> >>
> >> I know this is an old topic, but I am poking it again because I've
> >> had some work items wrap up, and I'm planning on picking up on this
> >> one again. I am thinking about implementing extent locks to replace
> >> i_mutex. So I just wanted to touch base with folks and see what
> >> people are working on because I know there were some folks out there
> >> that were thinking about doing similar solutions.
> >
> > What locking API are you looking at? If you are looking at
> > something like:
> >
> >     read_range_{try}lock(lock, off, len)
> >     read_range_unlock(lock, off, len)
> >     write_range_{try}lock(lock, off, len)
> >     write_range_unlock(lock, off, len)
> >
> > and implementing with an rbtree or a btree for tracking, then I
> > definitely have a use for it in XFS - replacing the current rwsem
> > that is used for the iolock. Range locks like this are the only
> > thing we need to allow concurrent buffered writes to the same file
> > to maintain the per-write exclusion that posix requires.
> Interesting, so xfs already has these range locks, right? If yes, any
> possibility that the code can be reused in ext4, since we have the same
> thing in mind but don't have any resources to work on it right now?

No, it doesn't have range locks. It has separate locks for IO
exclusion vs metadata modification (i_iolock vs i_ilock). Both are
rwsems, the ilock nests inside and protects the extent list and
other metadata.

What I want to do is replace the i_iolock with a read/write range
lock so that we can do sane cache coherent concurrent IO to separate
ranges of the file. We can't do concurrent modifications to the
extent tree, so we have no need for changing the i_ilock (metadata)
lock to range locks.

> btw, IIRC flock(2) uses a list to indicate the range lock, so if we can
> make these pieces of code common, there are at least 3 places that can
> benefit from it. ;)

flock is way more complex than simple read/write range locks and has
fixed semantics and lots of scope for difficult to find regressions,
so I wouldn't even bother trying to support them...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: working on extent locks for i_mutex
From: Tao Ma @ 2012-01-13 11:57 UTC
To: Dave Chinner; +Cc: Lukas Czerner, xfs, Ext4 Developers List, Allison Henderson

On 01/13/2012 07:52 PM, Dave Chinner wrote:
> On Fri, Jan 13, 2012 at 03:14:51PM +0800, Tao Ma wrote:
>> On 01/13/2012 12:34 PM, Dave Chinner wrote:
>>> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>>>> Hi All,
>>>>
>>>> I know this is an old topic, but I am poking it again because I've
>>>> had some work items wrap up, and I'm planning on picking up on this
>>>> one again. I am thinking about implementing extent locks to replace
>>>> i_mutex. So I just wanted to touch base with folks and see what
>>>> people are working on because I know there were some folks out there
>>>> that were thinking about doing similar solutions.
>>>
>>> What locking API are you looking at? If you are looking at
>>> something like:
>>>
>>>     read_range_{try}lock(lock, off, len)
>>>     read_range_unlock(lock, off, len)
>>>     write_range_{try}lock(lock, off, len)
>>>     write_range_unlock(lock, off, len)
>>>
>>> and implementing with an rbtree or a btree for tracking, then I
>>> definitely have a use for it in XFS - replacing the current rwsem
>>> that is used for the iolock. Range locks like this are the only
>>> thing we need to allow concurrent buffered writes to the same file
>>> to maintain the per-write exclusion that posix requires.
>> Interesting, so xfs already has these range locks, right? If yes, any
>> possibility that the code can be reused in ext4, since we have the same
>> thing in mind but don't have any resources to work on it right now?
>
> No, it doesn't have range locks. It has separate locks for IO
> exclusion vs metadata modification (i_iolock vs i_ilock). Both are
> rwsems, the ilock nests inside and protects the extent list and
> other metadata.
>
> What I want to do is replace the i_iolock with a read/write range
> lock so that we can do sane cache coherent concurrent IO to separate
> ranges of the file. We can't do concurrent modifications to the
> extent tree, so we have no need for changing the i_ilock (metadata)
> lock to range locks.
OK, I see. Thanks for the information.
>
>> btw, IIRC flock(2) uses a list to indicate the range lock, so if we can
>> make these pieces of code common, there are at least 3 places that can
>> benefit from it. ;)
>
> flock is way more complex than simple read/write range locks and has
> fixed semantics and lots of scope for difficult to find regressions,
> so I wouldn't even bother trying to support them...
fair enough. :)

Thanks
Tao
* Re: working on extent locks for i_mutex
From: Allison Henderson @ 2012-01-13 20:50 UTC
To: Dave Chinner; +Cc: Lukas Czerner, Ext4 Developers List, xfs

On 01/12/2012 09:34 PM, Dave Chinner wrote:
> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>> Hi All,
>>
>> I know this is an old topic, but I am poking it again because I've
>> had some work items wrap up, and I'm planning on picking up on this
>> one again. I am thinking about implementing extent locks to replace
>> i_mutex. So I just wanted to touch base with folks and see what
>> people are working on because I know there were some folks out there
>> that were thinking about doing similar solutions.
>
> What locking API are you looking at? If you are looking at
> something like:
>
>     read_range_{try}lock(lock, off, len)
>     read_range_unlock(lock, off, len)
>     write_range_{try}lock(lock, off, len)
>     write_range_unlock(lock, off, len)
>
> and implementing with an rbtree or a btree for tracking, then I
> definitely have a use for it in XFS - replacing the current rwsem
> that is used for the iolock. Range locks like this are the only
> thing we need to allow concurrent buffered writes to the same file
> to maintain the per-write exclusion that posix requires.
>
> Cheers,
>
> Dave.

Yes, that is generally the idea I was thinking about doing, but at
the time, I was not thinking outside the scope of ext4. You are
thinking maybe it should be in the vfs layer so that it's something
that all the filesystems will use? That seems to be the impression
I'm getting from folks. Thx!

Allison Henderson
* Re: working on extent locks for i_mutex
From: Dave Chinner @ 2012-01-15 23:57 UTC
To: Allison Henderson; +Cc: Lukas Czerner, Ext4 Developers List, xfs

On Fri, Jan 13, 2012 at 01:50:52PM -0700, Allison Henderson wrote:
> On 01/12/2012 09:34 PM, Dave Chinner wrote:
> >On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >>Hi All,
> >>
> >>I know this is an old topic, but I am poking it again because I've
> >>had some work items wrap up, and I'm planning on picking up on this
> >>one again. I am thinking about implementing extent locks to replace
> >>i_mutex. So I just wanted to touch base with folks and see what
> >>people are working on because I know there were some folks out there
> >>that were thinking about doing similar solutions.
> >
> >What locking API are you looking at? If you are looking at
> >something like:
> >
> >    read_range_{try}lock(lock, off, len)
> >    read_range_unlock(lock, off, len)
> >    write_range_{try}lock(lock, off, len)
> >    write_range_unlock(lock, off, len)
> >
> >and implementing with an rbtree or a btree for tracking, then I
> >definitely have a use for it in XFS - replacing the current rwsem
> >that is used for the iolock. Range locks like this are the only
> >thing we need to allow concurrent buffered writes to the same file
> >to maintain the per-write exclusion that posix requires.
>
> Yes, that is generally the idea I was thinking about doing, but at
> the time, I was not thinking outside the scope of ext4. You are
> thinking maybe it should be in the vfs layer so that it's something
> that all the filesystems will use? That seems to be the impression
> I'm getting from folks. Thx!

Yes, that's what I'm suggesting. Not so much a vfs layer function,
but a library (range locks could be useful outside filesystems) so
locating it in lib/ was what I was thinking....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
[parent not found: <4F146275.8090304@linux.vnet.ibm.com>]
* Re: working on extent locks for i_mutex
From: Zheng Liu @ 2012-01-18 12:02 UTC
To: Allison Henderson; +Cc: Lukas Czerner, Tao Ma, Ext4 Developers List, xfs

On Mon, Jan 16, 2012 at 10:46:29AM -0700, Allison Henderson wrote:
> On 01/15/2012 04:57 PM, Dave Chinner wrote:
> >On Fri, Jan 13, 2012 at 01:50:52PM -0700, Allison Henderson wrote:
> >>On 01/12/2012 09:34 PM, Dave Chinner wrote:
> >>>On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >>>>Hi All,
> >>>>
> >>>>I know this is an old topic, but I am poking it again because I've
> >>>>had some work items wrap up, and I'm planning on picking up on this
> >>>>one again. I am thinking about implementing extent locks to replace
> >>>>i_mutex. So I just wanted to touch base with folks and see what
> >>>>people are working on because I know there were some folks out there
> >>>>that were thinking about doing similar solutions.
> >>>
> >>>What locking API are you looking at? If you are looking at
> >>>something like:
> >>>
> >>>    read_range_{try}lock(lock, off, len)
> >>>    read_range_unlock(lock, off, len)
> >>>    write_range_{try}lock(lock, off, len)
> >>>    write_range_unlock(lock, off, len)
> >>>
> >>>and implementing with an rbtree or a btree for tracking, then I
> >>>definitely have a use for it in XFS - replacing the current rwsem
> >>>that is used for the iolock. Range locks like this are the only
> >>>thing we need to allow concurrent buffered writes to the same file
> >>>to maintain the per-write exclusion that posix requires.
> >>
> >>Yes, that is generally the idea I was thinking about doing, but at
> >>the time, I was not thinking outside the scope of ext4. You are
> >>thinking maybe it should be in the vfs layer so that it's something
> >>that all the filesystems will use? That seems to be the impression
> >>I'm getting from folks. Thx!
> >
> >Yes, that's what I'm suggesting. Not so much a vfs layer function,
> >but a library (range locks could be useful outside filesystems) so
> >locating it in lib/ was what I was thinking....
> >
> >Cheers,
> >
> >Dave.
>
> Alrighty, that sounds good to me. I will aim to keep it as general
> purpose as I can. I am going to start some prototyping and will
> post back when I get something working. Thx for the feedback all!
> :)

Hi Allison,

For this project, do you have a schedule? Would you mind sharing it
with me? This lock contention heavily impacts the performance of direct
IO in our production environment, so we hope to improve it ASAP.

I have done some direct IO benchmarks to compare ext4 with xfs using
fio on an Intel SSD. The results show that, in direct IO, xfs
outperforms both ext4 and ext4 with dioread_nolock.

To understand the effect of lock contention, I defined a new function
called ext4_file_aio_write() that calls __generic_file_aio_write()
without acquiring the i_mutex lock. I also removed the DIO_LOCKING flag
when __blockdev_direct_IO() is called, and ran similar benchmarks. The
results show that the performance of ext4 is then almost the same as
xfs. This shows that i_mutex heavily impacts the performance. Hopefully
the results are useful for you. :-)

I post the results here.

config file:
[global]
filesize=64G
size=64G
bs=16k
ioengine=psync
direct=1
filename=/mnt/ext4/benchmark
runtime=600
group_reporting
thread

[randrw]
numjobs=32
rw=randrw
rwmixread=90

result:
iops (r/w)               run 1             run 2             run 3
ext4                     5584/622          5726/636          5719/636
ext4+dioread_nolock      7105/789          7117/793          7129/795
ext4+dio_nolock          8920/992          8956/995          8976/997
xfs                      8726/971          8962/994          8975/998

bandwidth (r/w, KB/s)    run 1             run 2             run 3
ext4                     89359/9955.3      91621/10186       91519/10185
ext4+dioread_nolock      113691/12635      113882/12692      114066/12728
ext4+dio_nolock          142731/15888      143301/15930      143617/15959
xfs                      139627/15537      143400/15914      143603/15980

latency (r/w, usec)      run 1             run 2             run 3
ext4                     5163.28/5048.31   5037.81/4914.82   5041.49/4932.81
ext4+dioread_nolock      1220.04/29510.5   1213.67/29418.9   1208.77/29361.49
ext4+dio_nolock          3226.61/3194.35   3214.59/3178.09   3207.34/3173.78
xfs                      3299.87/3266.32   3213.73/3182.20   3208.16/3178.10

Regards,
Zheng

> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
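As a quick sanity check on the figures above, the sketch below averages the three read-IOPS runs per configuration and expresses each as a ratio against unmodified ext4. The numbers are copied from the results in the message; the helper names (`mean`, `speedup_vs_ext4`, `iops_to_kbps`) are made up for this illustration, and the choice to compare means of the read column only is an assumption about what is most useful to look at.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Read IOPS for runs 1..3, copied from the posted results. */
static const double ext4_read_iops[]           = { 5584, 5726, 5719 };
static const double dioread_nolock_read_iops[] = { 7105, 7117, 7129 };
static const double dio_nolock_read_iops[]     = { 8920, 8956, 8976 };
static const double xfs_read_iops[]            = { 8726, 8962, 8975 };

/* Mean read IOPS of a three-run configuration relative to plain ext4. */
double speedup_vs_ext4(const double *runs)
{
    return mean(runs, 3) / mean(ext4_read_iops, 3);
}

/* Bandwidth in KB/s implied by an IOPS figure at a fixed block size. */
double iops_to_kbps(double iops, double block_kb)
{
    return iops * block_kb;
}
```

With these inputs the i_mutex-free ext4 variant comes out at roughly 1.58 times the mean read IOPS of unmodified ext4, with xfs at about 1.57 times and dioread_nolock at about 1.25 times. The bandwidth rows are consistent with the IOPS rows at bs=16k: 5584 IOPS times 16 KB is 89344 KB/s, close to the 89359 KB/s reported for ext4 in run 1.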
* Re: working on extent locks for i_mutex
From: Frank Mayhar @ 2012-01-19 21:16 UTC
To: Zheng Liu; +Cc: xfs, Allison Henderson, Lukas Czerner, Tao Ma, Ext4 Developers List

On Wed, 2012-01-18 at 20:02 +0800, Zheng Liu wrote:
> For this project, do you have a schedule? Would you mind sharing it
> with me? This lock contention heavily impacts the performance of direct
> IO in our production environment, so we hope to improve it ASAP.
>
> I have done some direct IO benchmarks to compare ext4 with xfs using
> fio on an Intel SSD. The results show that, in direct IO, xfs
> outperforms both ext4 and ext4 with dioread_nolock.
>
> To understand the effect of lock contention, I defined a new function
> called ext4_file_aio_write() that calls __generic_file_aio_write()
> without acquiring the i_mutex lock. I also removed the DIO_LOCKING flag
> when __blockdev_direct_IO() is called, and ran similar benchmarks. The
> results show that the performance of ext4 is then almost the same as
> xfs. This shows that i_mutex heavily impacts the performance. Hopefully
> the results are useful for you. :-)

For the record, I have a patchset that, while not affecting i_mutex (or
locking in general), does allow AIO append writes to actually be done
asynchronously. (Currently they're forced to be done synchronously.)
It makes a big difference in performance for that particular case, even
for spinning media. Performance roughly doubled when testing with fio
against a regular two-terabyte drive; the performance improvement
against SSD would have to be much greater.

One day soon I'll accumulate enough spare time to port the patchset
forward to the latest kernel and submit it here.
--
Frank Mayhar
fmayhar@google.com
* Re: working on extent locks for i_mutex
From: Zheng Liu @ 2012-01-20 2:26 UTC
To: Frank Mayhar; +Cc: xfs, Allison Henderson, Lukas Czerner, Tao Ma, Ext4 Developers List

On Thu, Jan 19, 2012 at 01:16:10PM -0800, Frank Mayhar wrote:
> On Wed, 2012-01-18 at 20:02 +0800, Zheng Liu wrote:
> > For this project, do you have a schedule? Would you mind sharing it
> > with me? This lock contention heavily impacts the performance of direct
> > IO in our production environment, so we hope to improve it ASAP.
> >
> > I have done some direct IO benchmarks to compare ext4 with xfs using
> > fio on an Intel SSD. The results show that, in direct IO, xfs
> > outperforms both ext4 and ext4 with dioread_nolock.
> >
> > To understand the effect of lock contention, I defined a new function
> > called ext4_file_aio_write() that calls __generic_file_aio_write()
> > without acquiring the i_mutex lock. I also removed the DIO_LOCKING flag
> > when __blockdev_direct_IO() is called, and ran similar benchmarks. The
> > results show that the performance of ext4 is then almost the same as
> > xfs. This shows that i_mutex heavily impacts the performance. Hopefully
> > the results are useful for you. :-)
>
> For the record, I have a patchset that, while not affecting i_mutex (or
> locking in general), does allow AIO append writes to actually be done
> asynchronously. (Currently they're forced to be done synchronously.)
> It makes a big difference in performance for that particular case, even
> for spinning media. Performance roughly doubled when testing with fio
> against a regular two-terabyte drive; the performance improvement
> against SSD would have to be much greater.
>
> One day soon I'll accumulate enough spare time to port the patchset
> forward to the latest kernel and submit it here.

Interesting. I think it might help us with this issue. So could you
please post your test case and results in detail? Thank you. :-)

Regards,
Zheng
Thread overview: 9+ messages
     [not found] <4F0F9E97.1090403@linux.vnet.ibm.com>
2012-01-13  4:34 ` working on extent locks for i_mutex Dave Chinner
2012-01-13  7:14   ` Tao Ma
2012-01-13 11:52     ` Dave Chinner
2012-01-13 11:57       ` Tao Ma
2012-01-13 20:50   ` Allison Henderson
2012-01-15 23:57     ` Dave Chinner
     [not found]       ` <4F146275.8090304@linux.vnet.ibm.com>
2012-01-18 12:02         ` Zheng Liu
2012-01-19 21:16           ` Frank Mayhar
2012-01-20  2:26             ` Zheng Liu