From: Zheng Liu <gnehzuil.liu@gmail.com>
To: Allison Henderson <achender@linux.vnet.ibm.com>
Cc: Dave Chinner <david@fromorbit.com>,
Lukas Czerner <lczerner@redhat.com>,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
Tao Ma <tm@tao.ma>,
xfs@oss.sgi.com
Subject: Re: working on extent locks for i_mutex
Date: Wed, 18 Jan 2012 20:02:23 +0800
Message-ID: <20120118120223.GA4322@gmail.com>
In-Reply-To: <4F146275.8090304@linux.vnet.ibm.com>
On Mon, Jan 16, 2012 at 10:46:29AM -0700, Allison Henderson wrote:
> On 01/15/2012 04:57 PM, Dave Chinner wrote:
> >On Fri, Jan 13, 2012 at 01:50:52PM -0700, Allison Henderson wrote:
> >>On 01/12/2012 09:34 PM, Dave Chinner wrote:
> >>>On Thu, Jan 12, 2012 at 08:01:43PM -0700, Allison Henderson wrote:
> >>>>Hi All,
> >>>>
> >>>>I know this is an old topic, but I am poking it again because I've
> >>>>had some work items wrap up, and Im planning on picking up on this
> >>>>one again. I am thinking about implementing extent locks to replace
> >>>>i_mutex. So I just wanted to touch base with folks and see what
> >>>>people are working on because I know there were some folks out there
> >>>>that were thing about doing similar solutions.
> >>>
> >>>What locking API are you looking at? If you are looking at an
> >>>something like:
> >>>
> >>>read_range_{try}lock(lock, off, len)
> >>>read_range_unlock(lock, off, len)
> >>>write_range_{try}lock(lock, off, len)
> >>>write_range_unlock(lock, off, len)
> >>>
> >>>and implementing with an rbtree or a btree for tracking, then I
> >>>definitely have a use for it in XFS - replacing the current rwsem
> >>>that is used for the iolock. Range locks like this are the only
> >>>thing we need to allow concurrent buffered writes to the same file
> >>>to maintain the per-write exclusion that posix requires.
> >>
> >>Yes that is generally the idea I was thinking about doing, but at
> >>the time, I was not thinking outside the scope of ext4. You are
> >>thinking maybe it should be in vfs layer so that it's something that
> >>all the filesystems will use? That seems to be the impression I'm
> >>getting from folks. Thx!
> >
> >Yes, that's what I'm suggesting. Not so much a vfs layer function,
> >but a library (range locks could be useful outside filesystems) so
> >locating it in lib/ was what I was thinking....
> >
> >Cheers,
> >
> >Dave.
>
> Alrighty, that sounds good to me. I will aim to keep it as general
> purpose as I can. I am going to start some proto typing and will
> post back when I get something working. Thx for the feedback all!
> :)

Hi Allison,

Do you have a schedule for this project? If so, would you mind sharing it
with me? This lock contention heavily impacts direct IO performance in our
production environment, so we hope to improve it ASAP.

I have run some direct IO benchmarks with fio on an Intel SSD to compare
ext4 with xfs. The results show that, for direct IO, xfs outperforms both
ext4 and ext4 with dioread_nolock.

To understand the effect of lock contention, I defined a new function called
ext4_file_aio_write() that calls __generic_file_aio_write() without acquiring
i_mutex. I also removed the DIO_LOCKING flag where __blockdev_direct_IO() is
called, and then ran the same benchmarks again (the ext4+dio_nolock rows
below). With these changes, ext4 performs almost the same as xfs, which shows
that i_mutex heavily impacts performance. Hopefully the results are useful to
you. :-)
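
For illustration only, here is a minimal userspace sketch of the kind of
read/write range lock discussed in the quoted text above. It is a toy model in
Python, not kernel code: the class name RangeLock and its methods are
hypothetical, it tracks held ranges in a plain list rather than an rbtree or
btree, and it makes no attempt at fairness between readers and writers. It only
shows why writers to disjoint ranges of one file would no longer serialize the
way they do under a single i_mutex.

```python
import threading


class RangeLock:
    """Toy byte-range reader/writer lock (hypothetical, userspace only)."""

    def __init__(self):
        self._cond = threading.Condition()
        # Each entry is (start, end, exclusive); end is one past the last byte.
        self._held = []

    def _conflicts(self, start, end, exclusive):
        # Two ranges conflict if they overlap and at least one is a writer.
        for s, e, x in self._held:
            overlaps = start < e and s < end
            if overlaps and (exclusive or x):
                return True
        return False

    def trylock(self, off, length, exclusive):
        """Take the range if possible; return False instead of blocking."""
        with self._cond:
            if self._conflicts(off, off + length, exclusive):
                return False
            self._held.append((off, off + length, exclusive))
            return True

    def lock(self, off, length, exclusive):
        """Block until the range can be taken."""
        with self._cond:
            while self._conflicts(off, off + length, exclusive):
                self._cond.wait()
            self._held.append((off, off + length, exclusive))

    def unlock(self, off, length, exclusive):
        """Release a range previously taken with the same arguments."""
        with self._cond:
            self._held.remove((off, off + length, exclusive))
            self._cond.notify_all()


if __name__ == "__main__":
    lk = RangeLock()
    bs = 16 * 1024  # same block size as the fio job
    print(lk.trylock(0, bs, True))         # writer on block 0
    print(lk.trylock(bs, bs, True))        # writer on block 1, disjoint
    print(lk.trylock(bs // 2, bs, False))  # reader overlapping both writers
```

With a single mutex the second writer would have to wait for the first; here
only the overlapping reader is refused.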

The results are posted below.

config file:

[global]
filesize=64G
size=64G
bs=16k
ioengine=psync
direct=1
filename=/mnt/ext4/benchmark
runtime=600
group_reporting
thread

[randrw]
numjobs=32
rw=randrw
rwmixread=90

result (three runs, each value is read/write):

iops                  1                  2                  3
ext4                  5584/622           5726/636           5719/636
ext4+dioread_nolock   7105/789           7117/793           7129/795
ext4+dio_nolock       8920/992           8956/995           8976/997
xfs                   8726/971           8962/994           8975/998

bandwidth (KB/s)      1                  2                  3
ext4                  89359/9955.3       91621/10186        91519/10185
ext4+dioread_nolock   113691/12635       113882/12692       114066/12728
ext4+dio_nolock       142731/15888       143301/15930       143617/15959
xfs                   139627/15537       143400/15914       143603/15980

latency (usec)        1                  2                  3
ext4                  5163.28/5048.31    5037.81/4914.82    5041.49/4932.81
ext4+dioread_nolock   1220.04/29510.5    1213.67/29418.9    1208.77/29361.49
ext4+dio_nolock       3226.61/3194.35    3214.59/3178.09    3207.34/3173.78
xfs                   3299.87/3266.32    3213.73/3182.20    3208.16/3178.10
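
As a quick sanity check on the tables above, the reported bandwidth should be
roughly iops times the 16k block size from the job file. The short Python
sketch below redoes that arithmetic for the run-1 read figures and also prints
each configuration's read iops relative to plain ext4. The variable and
function names are only illustrative; the numbers are copied from the tables.

```python
# Run-1 read figures copied from the iops and bandwidth tables.
BLOCK_KB = 16  # bs=16k in the fio job file

read_iops = {
    "ext4": 5584,
    "ext4+dioread_nolock": 7105,
    "ext4+dio_nolock": 8920,
    "xfs": 8726,
}

read_bw_kb = {
    "ext4": 89359,
    "ext4+dioread_nolock": 113691,
    "ext4+dio_nolock": 142731,
    "xfs": 139627,
}


def expected_bw_kb(iops, block_kb=BLOCK_KB):
    """Bandwidth in KB/s implied by an iops figure at a fixed block size."""
    return iops * block_kb


def relative_error(config):
    """Relative gap between the implied and the reported read bandwidth."""
    est = expected_bw_kb(read_iops[config])
    return abs(est - read_bw_kb[config]) / read_bw_kb[config]


def speedup_over_ext4(config):
    """Read iops of a configuration relative to plain ext4."""
    return read_iops[config] / read_iops["ext4"]


for name in read_iops:
    print(
        f"{name:20s} implied={expected_bw_kb(read_iops[name])} KB/s "
        f"reported={read_bw_kb[name]} KB/s "
        f"vs ext4={speedup_over_ext4(name):.2f}x"
    )
```

The implied and reported bandwidths agree to well under 1% for every row, so
the iops and bandwidth tables are consistent with each other.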
Regards,
Zheng