linux-ext4.vger.kernel.org archive mirror
* working on extent locks for i_mutex
@ 2012-01-13  3:01 Allison Henderson
  2012-01-13  4:01 ` Andreas Dilger
  2012-01-13  4:34 ` Dave Chinner
  0 siblings, 2 replies; 13+ messages in thread
From: Allison Henderson @ 2012-01-13  3:01 UTC
  To: Ext4 Developers List, Lukas Czerner

Hi All,

I know this is an old topic, but I am poking it again because I've had
some work items wrap up, and I'm planning on picking up on this one
again.  I am thinking about implementing extent locks to replace
i_mutex.  So I just wanted to touch base with folks and see what people
are working on because I know there were some folks out there that were
thinking about doing similar solutions.

A while ago I had done some investigation on where i_mutex is currently
used, so I did a review and updated my list.  Only one thing had been
removed, but I will leave the list here since it was a while ago.  Let
me know if anyone has been working on a similar concept.  Thx!

Allison


List of ext4 functions that lock i_mutex:
ext4_sync_file
ext4_fallocate
ext4_move_extents via two helper routines:
     mext_inode_double_lock and mext_inode_double_unlock
ext4_ioctl (for the EXT4_IOC_SETFLAGS ioctl)
ext4_quota_write
ext4_llseek
ext4_end_io_work
ext4_ind_direct_IO (only while calling ext4_flush_completed_IO)


Functions called by vfs with i_mutex locked:
ext4_setattr
ext4_da_writepages
ext4_rmdir
ext4_unlink
ext4_symlink
ext4_link
ext4_rename
ext4_get_block



* Re: working on extent locks for i_mutex
  2012-01-13  3:01 working on extent locks for i_mutex Allison Henderson
@ 2012-01-13  4:01 ` Andreas Dilger
  2012-01-13 20:50   ` Allison Henderson
  2012-01-13  4:34 ` Dave Chinner
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Dilger @ 2012-01-13  4:01 UTC
  To: Allison Henderson; +Cc: Ext4 Developers List, Lukas Czerner, Zhen Liang

On 2012-01-12, at 8:01 PM, Allison Henderson wrote:
> I know this is an old topic, but I am poking it again because I've had some work items wrap up, and I'm planning on picking up on this one again.  I am thinking about implementing extent locks to replace i_mutex.  So I just wanted to touch base with folks and see what people are working on because I know there were some folks out there that were thinking about doing similar solutions.
> 
> A while ago I had done some investigation on where i_mutex is currently used, so I did a review and updated my list.  Only one thing had been removed, but I will leave the list here since it was a while ago.  Let me know if anyone has been working on a similar concept.  Thx!

The in-ext4 users appear to all be file IO related, while the VFS functions
are mostly directory related, though I don't see any mention of the callers
of ext4_mkdir() or ext4_mknod() or ext4_create()?

For Lustre we developed a patch that allows parallel metadata operations on
directories (e.g. concurrent lookup, mkdir, rmdir, create, unlink in a single
directory) which would replace i_mutex for the namespace operations.  Patch:

http://git.whamcloud.com/?p=fs/lustre-release.git;a=blob;f=ldiskfs/kernel_patches/patches/ext4_pdirop-rhel6.patch;hb=HEAD

though this in itself isn't enough to allow the VFS to do parallel directory
operations.  We're of course also interested in parallel file IO operations
through the VFS for the client, though this hasn't been a focus of ours since
we typically have a large number of clients doing IO concurrently.

> List of ext4 functions that lock i_mutex:
> ext4_sync_file
> ext4_fallocate
> ext4_move_extents via two helper routines:
>    mext_inode_double_lock and mext_inode_double_unlock
> ext4_ioctl (for the EXT4_IOC_SETFLAGS ioctl)
> ext4_quota_write
> ext4_llseek
> ext4_end_io_work
> ext4_ind_direct_IO (only while calling ext4_flush_completed_IO)
> 
> 
> Functions called by vfs with i_mutex locked:
> ext4_setattr
> ext4_da_writepages
> ext4_rmdir
> ext4_unlink
> ext4_symlink
> ext4_link
> ext4_rename
> ext4_get_block
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas







* Re: working on extent locks for i_mutex
  2012-01-13  3:01 working on extent locks for i_mutex Allison Henderson
  2012-01-13  4:01 ` Andreas Dilger
@ 2012-01-13  4:34 ` Dave Chinner
  2012-01-13  7:14   ` Tao Ma
  2012-01-13 20:50   ` Allison Henderson
  1 sibling, 2 replies; 13+ messages in thread
From: Dave Chinner @ 2012-01-13  4:34 UTC
  To: Allison Henderson; +Cc: Ext4 Developers List, Lukas Czerner, xfs

On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> Hi All,
> 
> I know this is an old topic, but I am poking it again because I've
> had some work items wrap up, and I'm planning on picking up on this
> one again.  I am thinking about implementing extent locks to replace
> i_mutex.  So I just wanted to touch base with folks and see what
> people are working on because I know there were some folks out there
> that were thinking about doing similar solutions.

What locking API are you looking at? If you are looking at
something like:

read_range_{try}lock(lock, off, len)
read_range_unlock(lock, off, len)
write_range_{try}lock(lock, off, len)
write_range_unlock(lock, off, len)

and implementing with an rbtree or a btree for tracking, then I
definitely have a use for it in XFS - replacing the current rwsem
that is used for the iolock. Range locks like this are the only
thing we need to allow concurrent buffered writes to the same file
to maintain the per-write exclusion that posix requires.
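
As a rough illustration of the API shape above, here is a minimal userspace sketch in C. It is not kernel code and not anyone's actual patch: the `struct range_lock` layout, the fixed-size table, the `RANGE_LOCK_SLOTS` capacity and the helper names are all assumptions made for the example, only the non-blocking `trylock` variants are shown, and a linear scan stands in for the rbtree or btree suggested above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RANGE_LOCK_SLOTS 64          /* hypothetical fixed capacity */

struct held_range {
	uint64_t off, len;
	bool write, used;
};

struct range_lock {
	pthread_mutex_t mu;              /* protects held[] */
	struct held_range held[RANGE_LOCK_SLOTS];
};

void range_lock_init(struct range_lock *rl)
{
	pthread_mutex_init(&rl->mu, NULL);
	for (size_t i = 0; i < RANGE_LOCK_SLOTS; i++)
		rl->held[i].used = false;
}

/* Half-open ranges [off, off + len) overlap iff each starts before the other ends. */
static bool overlaps(const struct held_range *h, uint64_t off, uint64_t len)
{
	return off < h->off + h->len && h->off < off + len;
}

/* Readers conflict only with overlapping writers; writers with any overlap. */
static bool range_trylock(struct range_lock *rl, uint64_t off, uint64_t len, bool write)
{
	struct held_range *slot = NULL;
	bool ok = true;

	assert(len > 0 && off + len > off);  /* non-empty, no wrap-around */
	pthread_mutex_lock(&rl->mu);
	for (size_t i = 0; i < RANGE_LOCK_SLOTS && ok; i++) {
		struct held_range *h = &rl->held[i];

		if (!h->used) {
			if (!slot)
				slot = h;            /* remember the first free slot */
		} else if ((write || h->write) && overlaps(h, off, len)) {
			ok = false;
		}
	}
	if (ok && slot)
		*slot = (struct held_range){ .off = off, .len = len,
					     .write = write, .used = true };
	else
		ok = false;                  /* conflict, or table full */
	pthread_mutex_unlock(&rl->mu);
	return ok;
}

/* Drop one previously granted range; false if no exact match is held. */
static bool range_unlock(struct range_lock *rl, uint64_t off, uint64_t len, bool write)
{
	bool found = false;

	pthread_mutex_lock(&rl->mu);
	for (size_t i = 0; i < RANGE_LOCK_SLOTS && !found; i++) {
		struct held_range *h = &rl->held[i];

		if (h->used && h->write == write && h->off == off && h->len == len) {
			h->used = false;
			found = true;
		}
	}
	pthread_mutex_unlock(&rl->mu);
	return found;
}

bool read_range_trylock(struct range_lock *rl, uint64_t off, uint64_t len)
{
	return range_trylock(rl, off, len, false);
}

bool read_range_unlock(struct range_lock *rl, uint64_t off, uint64_t len)
{
	return range_unlock(rl, off, len, false);
}

bool write_range_trylock(struct range_lock *rl, uint64_t off, uint64_t len)
{
	return range_trylock(rl, off, len, true);
}

bool write_range_unlock(struct range_lock *rl, uint64_t off, uint64_t len)
{
	return range_unlock(rl, off, len, true);
}
```

With this sketch, two readers of [0, 4096) can hold the range together, a writer to [4096, 8192) proceeds alongside them, and a writer to any byte inside [0, 4096) is refused until both readers unlock. A blocking variant would need a wait queue on top, which is left out here.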

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: working on extent locks for i_mutex
  2012-01-13  4:34 ` Dave Chinner
@ 2012-01-13  7:14   ` Tao Ma
  2012-01-13 11:52     ` Dave Chinner
  2012-01-13 20:50   ` Allison Henderson
  1 sibling, 1 reply; 13+ messages in thread
From: Tao Ma @ 2012-01-13  7:14 UTC
  To: Dave Chinner; +Cc: Allison Henderson, Ext4 Developers List, Lukas Czerner, xfs

On 01/13/2012 12:34 PM, Dave Chinner wrote:
> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>> Hi All,
>>
>> I know this is an old topic, but I am poking it again because I've
>> had some work items wrap up, and I'm planning on picking up on this
>> one again.  I am thinking about implementing extent locks to replace
>> i_mutex.  So I just wanted to touch base with folks and see what
>> people are working on because I know there were some folks out there
>> that were thinking about doing similar solutions.
> 
> What locking API are you looking at? If you are looking at
> something like:
> 
> read_range_{try}lock(lock, off, len)
> read_range_unlock(lock, off, len)
> write_range_{try}lock(lock, off, len)
> write_range_unlock(lock, off, len)
> 
> and implementing with an rbtree or a btree for tracking, then I
> definitely have a use for it in XFS - replacing the current rwsem
> that is used for the iolock. Range locks like this are the only
> thing we need to allow concurrent buffered writes to the same file
> to maintain the per-write exclusion that posix requires.
Interesting, so xfs already has these range locks, right? If so, is there
any possibility that the code can be reused in ext4? We have the same
thing in mind but don't have any resources to work on it right now.

btw, IIRC flock(2) uses a list to track range locks, so if we can
make these pieces of code common, at least there are 3 places that can
benefit from it. ;)

Thanks
Tao


* Re: working on extent locks for i_mutex
  2012-01-13  7:14   ` Tao Ma
@ 2012-01-13 11:52     ` Dave Chinner
  2012-01-13 11:57       ` Tao Ma
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2012-01-13 11:52 UTC
  To: Tao Ma; +Cc: Allison Henderson, Ext4 Developers List, Lukas Czerner, xfs

On Fri, Jan 13, 2012 at 03:14:51PM +0800, Tao Ma wrote:
> On 01/13/2012 12:34 PM, Dave Chinner wrote:
> > On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >> Hi All,
> >>
> >> I know this is an old topic, but I am poking it again because I've
> >> had some work items wrap up, and I'm planning on picking up on this
> >> one again.  I am thinking about implementing extent locks to replace
> >> i_mutex.  So I just wanted to touch base with folks and see what
> >> people are working on because I know there were some folks out there
> >> that were thinking about doing similar solutions.
> > 
> > What locking API are you looking at? If you are looking at
> > something like:
> > 
> > read_range_{try}lock(lock, off, len)
> > read_range_unlock(lock, off, len)
> > write_range_{try}lock(lock, off, len)
> > write_range_unlock(lock, off, len)
> > 
> > and implementing with an rbtree or a btree for tracking, then I
> > definitely have a use for it in XFS - replacing the current rwsem
> > that is used for the iolock. Range locks like this are the only
> > thing we need to allow concurrent buffered writes to the same file
> > to maintain the per-write exclusion that posix requires.
> Interesting, so xfs already has these range locks, right? If so, is there
> any possibility that the code can be reused in ext4? We have the same
> thing in mind but don't have any resources to work on it right now.

No, it doesn't have range locks. It has separate locks for IO
exclusion vs metadata modification (i_iolock vs i_ilock). Both are
rwsems; the ilock nests inside the iolock and protects the extent list
and other metadata.

What I want to do is replace the i_iolock with a read/write range
lock so that we can do sane cache coherent concurrent IO to separate
ranges of the file. We can't do concurrent modifications to the
extent tree, so we have no need for changing the i_ilock (metadata)
lock to range locks.


> btw, IIRC flock(2) uses a list to track range locks, so if we can
> make these pieces of code common, at least there are 3 places that can
> benefit from it. ;)

flock is way more complex than simple read/write range locks and has
fixed semantics and lots of scope for difficult to find regressions,
so I wouldn't even bother trying to support them...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: working on extent locks for i_mutex
  2012-01-13 11:52     ` Dave Chinner
@ 2012-01-13 11:57       ` Tao Ma
  0 siblings, 0 replies; 13+ messages in thread
From: Tao Ma @ 2012-01-13 11:57 UTC
  To: Dave Chinner; +Cc: Allison Henderson, Ext4 Developers List, Lukas Czerner, xfs

On 01/13/2012 07:52 PM, Dave Chinner wrote:
> On Fri, Jan 13, 2012 at 03:14:51PM +0800, Tao Ma wrote:
>> On 01/13/2012 12:34 PM, Dave Chinner wrote:
>>> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>>>> Hi All,
>>>>
>>>> I know this is an old topic, but I am poking it again because I've
>>>> had some work items wrap up, and I'm planning on picking up on this
>>>> one again.  I am thinking about implementing extent locks to replace
>>>> i_mutex.  So I just wanted to touch base with folks and see what
>>>> people are working on because I know there were some folks out there
>>>> that were thinking about doing similar solutions.
>>>
>>> What locking API are you looking at? If you are looking at
>>> something like:
>>>
>>> read_range_{try}lock(lock, off, len)
>>> read_range_unlock(lock, off, len)
>>> write_range_{try}lock(lock, off, len)
>>> write_range_unlock(lock, off, len)
>>>
>>> and implementing with an rbtree or a btree for tracking, then I
>>> definitely have a use for it in XFS - replacing the current rwsem
>>> that is used for the iolock. Range locks like this are the only
>>> thing we need to allow concurrent buffered writes to the same file
>>> to maintain the per-write exclusion that posix requires.
>> Interesting, so xfs already has these range locks, right? If so, is there
>> any possibility that the code can be reused in ext4? We have the same
>> thing in mind but don't have any resources to work on it right now.
> 
> No, it doesn't have range locks. It has separate locks for IO
> exclusion vs metadata modification (i_iolock vs i_ilock). Both are
> rwsems; the ilock nests inside the iolock and protects the extent list
> and other metadata.
> 
> What I want to do is replace the i_iolock with a read/write range
> lock so that we can do sane cache coherent concurrent IO to separate
> ranges of the file. We can't do concurrent modifications to the
> extent tree, so we have no need for changing the i_ilock (metadata)
> lock to range locks.
OK, I see. Thanks for the information.
> 
> 
>> btw, IIRC flock(2) uses a list to track range locks, so if we can
>> make these pieces of code common, at least there are 3 places that can
>> benefit from it. ;)
> 
> flock is way more complex than simple read/write range locks and has
> fixed semantics and lots of scope for difficult to find regressions,
> so I wouldn't even bother trying to support them...
fair enough. :)

Thanks
Tao


* Re: working on extent locks for i_mutex
  2012-01-13  4:01 ` Andreas Dilger
@ 2012-01-13 20:50   ` Allison Henderson
  0 siblings, 0 replies; 13+ messages in thread
From: Allison Henderson @ 2012-01-13 20:50 UTC
  To: Andreas Dilger; +Cc: Ext4 Developers List, Lukas Czerner, Zhen Liang

On 01/12/2012 09:01 PM, Andreas Dilger wrote:
> On 2012-01-12, at 8:01 PM, Allison Henderson wrote:
>> I know this is an old topic, but I am poking it again because I've had some work items wrap up, and I'm planning on picking up on this one again.  I am thinking about implementing extent locks to replace i_mutex.  So I just wanted to touch base with folks and see what people are working on because I know there were some folks out there that were thinking about doing similar solutions.
>>
>> A while ago I had done some investigation on where i_mutex is currently used, so I did a review and updated my list.  Only one thing had been removed, but I will leave the list here since it was a while ago.  Let me know if anyone has been working on a similar concept.  Thx!
>
> The in-ext4 users appear to all be file IO related, while the VFS functions
> are mostly directory related, though I don't see any mention of the callers
> of ext4_mkdir() or ext4_mknod() or ext4_create()?
Hmm, those ones didn't turn up when I was looking for i_mutex locking.
Though it would make sense that they would be locked.  Did I overlook
it in a helper function somewhere?

>
> For Lustre we developed a patch that allows parallel metadata operations on
> directories (e.g. concurrent lookup, mkdir, rmdir, create, unlink in a single
> directory) which would replace i_mutex for the namespace operations.  Patch:
>
> http://git.whamcloud.com/?p=fs/lustre-release.git;a=blob;f=ldiskfs/kernel_patches/patches/ext4_pdirop-rhel6.patch;hb=HEAD
>
> though this in itself isn't enough to allow the VFS to do parallel directory
> operations.  We're of course also interested in parallel file IO operations
> through the VFS for the client, though this hasn't been a focus of ours since
> we typically have a large number of clients doing IO concurrently.
>

I see, I will take a look at it, maybe there will be some things I can 
borrow from it.  Thx!

>> List of ext4 functions that lock i_mutex:
>> ext4_sync_file
>> ext4_fallocate
>> ext4_move_extents via two helper routines:
>>     mext_inode_double_lock and mext_inode_double_unlock
>> ext4_ioctl (for the EXT4_IOC_SETFLAGS ioctl)
>> ext4_quota_write
>> ext4_llseek
>> ext4_end_io_work
>> ext4_ind_direct_IO (only while calling ext4_flush_completed_IO)
>>
>>
>> Functions called by vfs with i_mutex locked:
>> ext4_setattr
>> ext4_da_writepages
>> ext4_rmdir
>> ext4_unlink
>> ext4_symlink
>> ext4_link
>> ext4_rename
>> ext4_get_block
>>
>
>
> Cheers, Andreas



* Re: working on extent locks for i_mutex
  2012-01-13  4:34 ` Dave Chinner
  2012-01-13  7:14   ` Tao Ma
@ 2012-01-13 20:50   ` Allison Henderson
  2012-01-15 23:57     ` Dave Chinner
  1 sibling, 1 reply; 13+ messages in thread
From: Allison Henderson @ 2012-01-13 20:50 UTC
  To: Dave Chinner; +Cc: Lukas Czerner, Ext4 Developers List, xfs

On 01/12/2012 09:34 PM, Dave Chinner wrote:
> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>> Hi All,
>>
>> I know this is an old topic, but I am poking it again because I've
>> had some work items wrap up, and I'm planning on picking up on this
>> one again.  I am thinking about implementing extent locks to replace
>> i_mutex.  So I just wanted to touch base with folks and see what
>> people are working on because I know there were some folks out there
>> that were thinking about doing similar solutions.
>
> What locking API are you looking at? If you are looking at
> something like:
>
> read_range_{try}lock(lock, off, len)
> read_range_unlock(lock, off, len)
> write_range_{try}lock(lock, off, len)
> write_range_unlock(lock, off, len)
>
> and implementing with an rbtree or a btree for tracking, then I
> definitely have a use for it in XFS - replacing the current rwsem
> that is used for the iolock. Range locks like this are the only
> thing we need to allow concurrent buffered writes to the same file
> to maintain the per-write exclusion that posix requires.
>
> Cheers,
>
> Dave.

Yes that is generally the idea I was thinking about doing, but at the 
time, I was not thinking outside the scope of ext4.  You are thinking 
maybe it should be in vfs layer so that it's something that all the 
filesystems will use?  That seems to be the impression I'm getting from 
folks.  Thx!

Allison Henderson



* Re: working on extent locks for i_mutex
  2012-01-13 20:50   ` Allison Henderson
@ 2012-01-15 23:57     ` Dave Chinner
  2012-01-16 17:46       ` Allison Henderson
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2012-01-15 23:57 UTC
  To: Allison Henderson; +Cc: Lukas Czerner, Ext4 Developers List, xfs

On Fri, Jan 13, 2012 at 01:50:52PM -0700, Allison Henderson wrote:
> On 01/12/2012 09:34 PM, Dave Chinner wrote:
> >On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >>Hi All,
> >>
> >>I know this is an old topic, but I am poking it again because I've
> >>had some work items wrap up, and I'm planning on picking up on this
> >>one again.  I am thinking about implementing extent locks to replace
> >>i_mutex.  So I just wanted to touch base with folks and see what
> >>people are working on because I know there were some folks out there
> >>that were thinking about doing similar solutions.
> >
> >What locking API are you looking at? If you are looking at
> >something like:
> >
> >read_range_{try}lock(lock, off, len)
> >read_range_unlock(lock, off, len)
> >write_range_{try}lock(lock, off, len)
> >write_range_unlock(lock, off, len)
> >
> >and implementing with an rbtree or a btree for tracking, then I
> >definitely have a use for it in XFS - replacing the current rwsem
> >that is used for the iolock. Range locks like this are the only
> >thing we need to allow concurrent buffered writes to the same file
> >to maintain the per-write exclusion that posix requires.
> 
> Yes that is generally the idea I was thinking about doing, but at
> the time, I was not thinking outside the scope of ext4.  You are
> thinking maybe it should be in vfs layer so that it's something that
> all the filesystems will use?  That seems to be the impression I'm
> getting from folks.  Thx!

Yes, that's what I'm suggesting. Not so much a vfs layer function,
but a library (range locks could be useful outside filesystems) so
locating it in lib/ was what I was thinking....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: working on extent locks for i_mutex
  2012-01-15 23:57     ` Dave Chinner
@ 2012-01-16 17:46       ` Allison Henderson
  2012-01-18 12:02         ` Zheng Liu
  0 siblings, 1 reply; 13+ messages in thread
From: Allison Henderson @ 2012-01-16 17:46 UTC
  To: Dave Chinner; +Cc: Lukas Czerner, Ext4 Developers List, xfs

On 01/15/2012 04:57 PM, Dave Chinner wrote:
> On Fri, Jan 13, 2012 at 01:50:52PM -0700, Allison Henderson wrote:
>> On 01/12/2012 09:34 PM, Dave Chinner wrote:
>>> On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
>>>> Hi All,
>>>>
>>>> I know this is an old topic, but I am poking it again because I've
>>>> had some work items wrap up, and I'm planning on picking up on this
>>>> one again.  I am thinking about implementing extent locks to replace
>>>> i_mutex.  So I just wanted to touch base with folks and see what
>>>> people are working on because I know there were some folks out there
>>>> that were thinking about doing similar solutions.
>>>
>>> What locking API are you looking at? If you are looking at
>>> something like:
>>>
>>> read_range_{try}lock(lock, off, len)
>>> read_range_unlock(lock, off, len)
>>> write_range_{try}lock(lock, off, len)
>>> write_range_unlock(lock, off, len)
>>>
>>> and implementing with an rbtree or a btree for tracking, then I
>>> definitely have a use for it in XFS - replacing the current rwsem
>>> that is used for the iolock. Range locks like this are the only
>>> thing we need to allow concurrent buffered writes to the same file
>>> to maintain the per-write exclusion that posix requires.
>>
>> Yes that is generally the idea I was thinking about doing, but at
>> the time, I was not thinking outside the scope of ext4.  You are
>> thinking maybe it should be in vfs layer so that it's something that
>> all the filesystems will use?  That seems to be the impression I'm
>> getting from folks.  Thx!
>
> Yes, that's what I'm suggesting. Not so much a vfs layer function,
> but a library (range locks could be useful outside filesystems) so
> locating it in lib/ was what I was thinking....
>
> Cheers,
>
> Dave.

Alrighty, that sounds good to me.  I will aim to keep it as general
purpose as I can.  I am going to start some prototyping and will post
back when I get something working.  Thx for the feedback all!  :)

Allison



* Re: working on extent locks for i_mutex
  2012-01-16 17:46       ` Allison Henderson
@ 2012-01-18 12:02         ` Zheng Liu
  2012-01-19 21:16           ` Frank Mayhar
  0 siblings, 1 reply; 13+ messages in thread
From: Zheng Liu @ 2012-01-18 12:02 UTC
  To: Allison Henderson
  Cc: Dave Chinner, Lukas Czerner, Ext4 Developers List, Tao Ma, xfs

On Mon, Jan 16, 2012 at 10:46:29AM -0700, Allison Henderson wrote:
> On 01/15/2012 04:57 PM, Dave Chinner wrote:
> >On Fri, Jan 13, 2012 at 01:50:52PM -0700, Allison Henderson wrote:
> >>On 01/12/2012 09:34 PM, Dave Chinner wrote:
> >>>On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >>>>Hi All,
> >>>>
> >>>>I know this is an old topic, but I am poking it again because I've
> >>>>had some work items wrap up, and I'm planning on picking up on this
> >>>>one again.  I am thinking about implementing extent locks to replace
> >>>>i_mutex.  So I just wanted to touch base with folks and see what
> >>>>people are working on because I know there were some folks out there
> >>>>that were thinking about doing similar solutions.
> >>>
> >>>What locking API are you looking at? If you are looking at
> >>>something like:
> >>>
> >>>read_range_{try}lock(lock, off, len)
> >>>read_range_unlock(lock, off, len)
> >>>write_range_{try}lock(lock, off, len)
> >>>write_range_unlock(lock, off, len)
> >>>
> >>>and implementing with an rbtree or a btree for tracking, then I
> >>>definitely have a use for it in XFS - replacing the current rwsem
> >>>that is used for the iolock. Range locks like this are the only
> >>>thing we need to allow concurrent buffered writes to the same file
> >>>to maintain the per-write exclusion that posix requires.
> >>
> >>Yes that is generally the idea I was thinking about doing, but at
> >>the time, I was not thinking outside the scope of ext4.  You are
> >>thinking maybe it should be in vfs layer so that it's something that
> >>all the filesystems will use?  That seems to be the impression I'm
> >>getting from folks.  Thx!
> >
> >Yes, that's what I'm suggesting. Not so much a vfs layer function,
> >but a library (range locks could be useful outside filesystems) so
> >locating it in lib/ was what I was thinking....
> >
> >Cheers,
> >
> >Dave.
> 
> Alrighty, that sounds good to me.  I will aim to keep it as general
> purpose as I can.  I am going to start some prototyping and will
> post back when I get something working.  Thx for the feedback all!
> :)

Hi Allison,

For this project, do you have a schedule? Could you share it with me? This
lock contention heavily impacts the performance of direct IO in our production
environment, so we hope to improve it ASAP.

I have run some direct IO benchmarks with fio on an Intel SSD to compare ext4
with xfs. The results show that, for direct IO, xfs outperforms both ext4 and
ext4 with dioread_nolock.

To understand the effect of lock contention, I defined a new function called
ext4_file_aio_write() that calls __generic_file_aio_write() without acquiring
the i_mutex lock. I also removed the DIO_LOCKING flag when __blockdev_direct_IO()
is called, and ran similar benchmarks. The results show that the performance
of ext4 is then almost the same as that of xfs, which indicates that i_mutex
heavily impacts performance. Hopefully the results are useful for you. :-)

I post the results here.

config file:
[global]
filesize=64G
size=64G
bs=16k
ioengine=psync
direct=1
filename=/mnt/ext4/benchmark
runtime=600
group_reporting
thread

[randrw]
numjobs=32
rw=randrw
rwmixread=90

result:

iops                    1 (r/w)            2                  3
ext4                    5584/622           5726/636           5719/636
ext4+dioread_nolock     7105/789           7117/793           7129/795
ext4+dio_nolock         8920/992           8956/995           8976/997
xfs                     8726/971           8962/994           8975/998

bandwidth (KB/s)        1 (r/w)            2                  3
ext4                    89359/9955.3       91621/10186        91519/10185
ext4+dioread_nolock     113691/12635       113882/12692       114066/12728
ext4+dio_nolock         142731/15888       143301/15930       143617/15959
xfs                     139627/15537       143400/15914       143603/15980

latency (usec)          1 (r/w)            2                  3
ext4                    5163.28/5048.31    5037.81/4914.82    5041.49/4932.81
ext4+dioread_nolock     1220.04/29510.5    1213.67/29418.9    1208.77/29361.49
ext4+dio_nolock         3226.61/3194.35    3214.59/3178.09    3207.34/3173.78
xfs                     3299.87/3266.32    3213.73/3182.20    3208.16/3178.10
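
To put the gap in relative terms, the read IOPS figures above can be compared with a few lines of C. This is only a sketch of the arithmetic: the array names and the `percent_change` helper are made up for the example, and the numbers are copied from the iops rows of the table, nothing else is measured here.

```c
#include <assert.h>

/* Read IOPS for columns 1, 2 and 3, copied from the iops table. */
const int ext4_iops[3]       = { 5584, 5726, 5719 };
const int dio_nolock_iops[3] = { 8920, 8956, 8976 };
const int xfs_iops[3]        = { 8726, 8962, 8975 };

/* Percentage change of `value` relative to `base`, truncated toward zero. */
int percent_change(int base, int value)
{
	assert(base != 0);
	return (value - base) * 100 / base;
}
```

For example, `percent_change(ext4_iops[0], dio_nolock_iops[0])` evaluates to 59, so dropping the locking gives ext4 roughly 56% to 60% more read IOPS across the three columns, while the same comparison against xfs stays within about 2%.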

Regards,
Zheng


* Re: working on extent locks for i_mutex
  2012-01-18 12:02         ` Zheng Liu
@ 2012-01-19 21:16           ` Frank Mayhar
  2012-01-20  2:26             ` Zheng Liu
  0 siblings, 1 reply; 13+ messages in thread
From: Frank Mayhar @ 2012-01-19 21:16 UTC
  To: Zheng Liu
  Cc: Allison Henderson, Dave Chinner, Lukas Czerner,
	Ext4 Developers List, Tao Ma, xfs

On Wed, 2012-01-18 at 20:02 +0800, Zheng Liu wrote:
> For this project, do you have a schedule? Could you share it with me? This
> lock contention heavily impacts the performance of direct IO in our production
> environment, so we hope to improve it ASAP.
> 
> I have run some direct IO benchmarks with fio on an Intel SSD to compare ext4
> with xfs. The results show that, for direct IO, xfs outperforms both ext4 and
> ext4 with dioread_nolock.
> 
> To understand the effect of lock contention, I defined a new function called
> ext4_file_aio_write() that calls __generic_file_aio_write() without acquiring
> the i_mutex lock. I also removed the DIO_LOCKING flag when __blockdev_direct_IO()
> is called, and ran similar benchmarks. The results show that the performance
> of ext4 is then almost the same as that of xfs, which indicates that i_mutex
> heavily impacts performance. Hopefully the results are useful for you. :-)

For the record, I have a patchset that, while not affecting i_mutex (or
locking in general), does allow AIO append writes to actually be done
asynchronously.  (Currently they're forced to be done synchronously.)
It makes a big difference in performance for that particular case, even
for spinning media.  Performance roughly doubled when testing with fio
against a regular two-terabyte drive; the performance improvement
against SSD would have to be much greater.

One day soon I'll accumulate enough spare time to port the patchset
forward to the latest kernel and submit it here.
-- 
Frank Mayhar
fmayhar@google.com



* Re: working on extent locks for i_mutex
  2012-01-19 21:16           ` Frank Mayhar
@ 2012-01-20  2:26             ` Zheng Liu
  0 siblings, 0 replies; 13+ messages in thread
From: Zheng Liu @ 2012-01-20  2:26 UTC
  To: Frank Mayhar
  Cc: Allison Henderson, Dave Chinner, Lukas Czerner,
	Ext4 Developers List, Tao Ma, xfs

On Thu, Jan 19, 2012 at 01:16:10PM -0800, Frank Mayhar wrote:
> On Wed, 2012-01-18 at 20:02 +0800, Zheng Liu wrote:
> > For this project, do you have a schedule? Could you share it with me? This
> > lock contention heavily impacts the performance of direct IO in our production
> > environment, so we hope to improve it ASAP.
> > 
> > I have run some direct IO benchmarks with fio on an Intel SSD to compare ext4
> > with xfs. The results show that, for direct IO, xfs outperforms both ext4 and
> > ext4 with dioread_nolock.
> > 
> > To understand the effect of lock contention, I defined a new function called
> > ext4_file_aio_write() that calls __generic_file_aio_write() without acquiring
> > the i_mutex lock. I also removed the DIO_LOCKING flag when __blockdev_direct_IO()
> > is called, and ran similar benchmarks. The results show that the performance
> > of ext4 is then almost the same as that of xfs, which indicates that i_mutex
> > heavily impacts performance. Hopefully the results are useful for you. :-)
> 
> For the record, I have a patchset that, while not affecting i_mutex (or
> locking in general), does allow AIO append writes to actually be done
> asynchronously.  (Currently they're forced to be done synchronously.)
> It makes a big difference in performance for that particular case, even
> for spinning media.  Performance roughly doubled when testing with fio
> against a regular two-terabyte drive; the performance improvement
> against SSD would have to be much greater.
> 
> One day soon I'll accumulate enough spare time to port the patchset
> forward to the latest kernel and submit it here.
Interesting. I think it might help us to improve this issue. So could
you please post your test case and result in detail? Thank you. :-)

Regards,
Zheng
