* possible fsync02() xfs slowness regression on power7
[not found] <130819766.6999797.1362043338279.JavaMail.root@redhat.com>
@ 2013-02-28 9:28 ` CAI Qian
2013-02-28 11:54 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: CAI Qian @ 2013-02-28 9:28 UTC (permalink / raw)
To: xfs; +Cc: Steve Best
This LTP test started to fail with the latest upstream kernel on one of the power7
systems here: http://tinyurl.com/bngwouj
# ./fsync02
fsync02 1 TFAIL : fsync took too long: 252.000000 seconds; max_block: 214
When it works, the test returns almost immediately. Bisecting so far
indicates that one or more of the following commits could be the culprit.
# git log --pretty=oneline 498f7f505dc79934c878c7667840c50c64f232fc..b199c8a4ba11879df87daad496ceee41fdc6aa82
b199c8a4ba11879df87daad496ceee41fdc6aa82 xfs: Pull EFI/EFD handling out from under the AIL lock
9c5f8414efd5eeed9f498d4170337a3eb126341f xfs: fix EFI transaction cancellation.
821eb21d97a8b686649c08b7284d0b9f34d0e138 xfs: connect up buffer reclaim priority hooks
430cbeb86fdcbbdabea7d4aa65307de8de425350 xfs: add a lru to the XFS buffer cache
ff57ab21995a8636cfc72efeebb09cc6034d756f xfs: convert xfsbud shrinker to a per-buftarg shrinker.
1a427ab0c1b205d1bda8da0b77ea9d295ac23c57 xfs: convert pag_ici_lock to a spin lock
1a3e8f3da09c7082d25b512a0ffe569391e4c09a xfs: convert inode cache lookups to use RCU locking
d95b7aaf9ab6738bef1ebcc52ab66563085e44ac xfs: rcu free inodes
6e857567dbbfe14dd6cc3f7414671b047b1ff5c7 xfs: don't truncate prealloc from frequently accessed inodes
055388a3188f56676c21e92962fc366ac8b5cb72 xfs: dynamic speculative EOF preallocation
622d81494fa32343a4b97b607619656c7a4a6d1a xfs: use KM_NOFS for allocations during attribute list operations
dcfcf20512cb517ac18b9433b676183fa1257911 xfs: provide a inode iolock lockdep class
489a150f6454e2cd93d9e0ee6d7c5a361844f62a xfs: factor duplicate code in xfs_alloc_ag_vextent_near into a helper
9f9baab38dacd11fe6095a1e59f3783a305f7020 xfs: clean up xfs_alloc_ag_vextent_exact
ecff71e677c6d469f525dcf31ada709d5858307c xfs: simplify xfs_map_at_offset
aeea1b1f81800e362a3aca86d769d02e137a8fa7 xfs: refactor xfs_vm_writepage
2fa24f92530edaf86c3b5f662464e0d2e3b3e517 xfs: remove the all_bh flag from xfs_convert_page
ed1e7b7e484dfb64168755613d499f32a97409bd xfs: remove xfs_probe_cluster
8ff2957d581582890693affc09920108a67cb05d xfs: simplify xfs_map_blocks
a206c817c864583c44e2f418db8e6c7a000fbc38 xfs: kill xfs_iomap
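The range above could be narrowed further with a path-limited bisect. The following is only a sketch: xfs_bisect_start is a hypothetical helper name, and treating 498f7f5 as the good end and b199c8a as the bad end is an assumption read off the git log range, not something stated explicitly in the report.

```shell
# Hypothetical helper (not a git command): start a bisect that only
# visits commits touching fs/xfs between a good and a bad commit.
xfs_bisect_start() {
    good=$1
    bad=$2
    # "git bisect start <bad> <good> -- <path>" limits the search to <path>.
    git bisect start "$bad" "$good" -- fs/xfs
}

# Assumed endpoints, taken from the git log range quoted above:
#   xfs_bisect_start 498f7f505dc79934c878c7667840c50c64f232fc \
#                    b199c8a4ba11879df87daad496ceee41fdc6aa82
# For each commit git checks out: build, boot, run ./fsync02, then report
#   git bisect good    # the test returns almost immediately
#   git bisect bad     # the test prints "fsync took too long"
# until git names the first bad commit; finish with "git bisect reset".
```

Since every step needs a kernel build and a reboot, the good/bad verdicts are given by hand here rather than through "git bisect run".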
CAI Qian
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: possible fsync02() xfs slowness regression on power7
2013-02-28 9:28 ` possible fsync02() xfs slowness regression on power7 CAI Qian
@ 2013-02-28 11:54 ` Dave Chinner
2013-03-01 9:20 ` CAI Qian
0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2013-02-28 11:54 UTC (permalink / raw)
To: CAI Qian; +Cc: Steve Best, xfs
On Thu, Feb 28, 2013 at 04:28:35AM -0500, CAI Qian wrote:
> This LTP test started to fail with the latest upstream kernel on one of the power7
> systems here: http://tinyurl.com/bngwouj
>
> # ./fsync02
> fsync02 1 TFAIL : fsync took too long: 252.000000 seconds; max_block: 214
>
> When it works, the test returns almost immediately. Bisecting so far
> indicates that one or more of the following commits could be the culprit.
>
> # git log --pretty=oneline 498f7f505dc79934c878c7667840c50c64f232fc..b199c8a4ba11879df87daad496ceee41fdc6aa82
They are all patches committed more than 2 years ago, and none of
them are platform-specific. This sounds more like a machine-specific
issue than a platform-specific problem (e.g. lots of RAM and a slow,
slow disk).
In future, when reporting a bug, please tell us what hardware you are
using, as per:
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
Given that this is a sparse file test that repeatedly extends the
file, this is the likely culprit:
> 055388a3188f56676c21e92962fc366ac8b5cb72 xfs: dynamic speculative EOF preallocation
And this commit in 3.9-rc1:
a1e16c2 xfs: limit speculative prealloc size on sparse files
should fix the problem. Please confirm these commits are the cause
and the fix respectively....
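For anyone wanting to try the access pattern outside LTP, a minimal sketch follows. It is not the fsync02 source: sparse_extend and timed_fsync are hypothetical helper names, the 4 KiB block size and the stride are arbitrary choices, and "sync FILE" flushing a single file assumes GNU coreutils 8.24 or later.

```shell
# sparse_extend FILE NBLOCKS STRIDE
# Write NBLOCKS blocks of 4 KiB, one every STRIDE blocks, so each write
# extends the file and leaves a hole behind it.
sparse_extend() {
    file=$1; nblocks=$2; stride=$3
    : > "$file"                      # start from an empty file
    i=0
    while [ "$i" -lt "$nblocks" ]; do
        dd if=/dev/zero of="$file" bs=4096 count=1 \
           seek=$((i * stride)) conv=notrunc 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# timed_fsync FILE
# Print the whole seconds spent flushing FILE to disk.
timed_fsync() {
    start=$(date +%s)
    sync "$1" || return 1            # assumption: sync FILE calls fsync() on FILE
    end=$(date +%s)
    echo $((end - start))
}

# Example (path and sizes are placeholders):
#   sparse_extend /mnt/xfs/sparse.dat 200 1024
#   timed_fsync /mnt/xfs/sparse.dat
```

Comparing the printed time on a kernel with and without the two commits named above would give a rough check, with the caveat that date +%s only resolves whole seconds.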
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: possible fsync02() xfs slowness regression on power7
2013-02-28 11:54 ` Dave Chinner
@ 2013-03-01 9:20 ` CAI Qian
2013-03-04 4:23 ` CAI Qian
0 siblings, 1 reply; 6+ messages in thread
From: CAI Qian @ 2013-03-01 9:20 UTC (permalink / raw)
To: Dave Chinner; +Cc: Steve Best, xfs
----- Original Message -----
> From: "Dave Chinner" <david@fromorbit.com>
> To: "CAI Qian" <caiqian@redhat.com>
> Cc: xfs@oss.sgi.com, "Steve Best" <sbest@redhat.com>
> Sent: Thursday, February 28, 2013 7:54:10 PM
> Subject: Re: possible fsync02() xfs slowness regression on power7
>
> On Thu, Feb 28, 2013 at 04:28:35AM -0500, CAI Qian wrote:
> > This LTP test started to fail with the latest upstream kernel on
> > one of the power7
> > systems here: http://tinyurl.com/bngwouj
> >
> > # ./fsync02
> > fsync02 1 TFAIL : fsync took too long: 252.000000 seconds;
> > max_block: 214
> >
> > When it works, the test returns almost immediately. Bisecting
> > so far indicates that one or more of the following commits could be
> > the culprit.
> >
> > # git log --pretty=oneline
> > 498f7f505dc79934c878c7667840c50c64f232fc..b199c8a4ba11879df87daad496ceee41fdc6aa82
>
> They are all patches committed more than 2 years ago, and none of
> them are platform-specific. This sounds more like a machine-specific
> issue than a platform-specific problem (e.g. lots of RAM and a slow,
> slow disk).
>
> In future, when reporting a bug, please tell us what hardware you are
> using, as per:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
> Given that this is a sparse file test that repeatedly extends the
> file, this is the likely culprit:
>
> > 055388a3188f56676c21e92962fc366ac8b5cb72 xfs: dynamic speculative
> > EOF preallocation
Yes, you are right, just confirmed it. Nice catch!
>
> And this commit in 3.9-rc1:
>
> a1e16c2 xfs: limit speculative prealloc size on sparse files
I have just back-ported it and the test is running at the moment; I will let you know. Thanks, Dave.
>
> should fix the problem. Please confirm these commits are the cause
> and the fix respectively....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: possible fsync02() xfs slowness regression on power7
2013-03-01 9:20 ` CAI Qian
@ 2013-03-04 4:23 ` CAI Qian
2013-03-04 5:55 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: CAI Qian @ 2013-03-04 4:23 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
> > And this commit in 3.9-rc1:
> >
> > a1e16c2 xfs: limit speculative prealloc size on sparse files
> > should fix the problem. Please confirm these commits are the cause
> > and the fix respectively....
Confirmed this fixed the problem. I'd like to request this to be back-ported
to stable-3.0, stable-3.4 and stable-3.8. What do you think?
Thanks,
CAI Qian
* Re: possible fsync02() xfs slowness regression on power7
2013-03-04 4:23 ` CAI Qian
@ 2013-03-04 5:55 ` Dave Chinner
2013-03-04 6:14 ` CAI Qian
0 siblings, 1 reply; 6+ messages in thread
From: Dave Chinner @ 2013-03-04 5:55 UTC (permalink / raw)
To: CAI Qian; +Cc: xfs
On Sun, Mar 03, 2013 at 11:23:26PM -0500, CAI Qian wrote:
>
> > > And this commit in 3.9-rc1:
> > >
> > > a1e16c2 xfs: limit speculative prealloc size on sparse files
> > > should fix the problem. Please confirm these commits are the cause
> > > and the fix respectively....
> Confirmed this fixed the problem. I'd like to request this to be back-ported
> to stable-3.0, stable-3.4 and stable-3.8. What do you think?
IMO, no, it is not a backport candidate. The patch has quite a few
dependencies, and at least for 3.0 xfs_bmapi_read() doesn't exist
and hence is not a trivial backport.
Further, it's taken 2 years for this to be noticed, and you haven't
explained why the problem exists on your power machine and not any
others that it has been tested on. And there have been very few
complaints about performance of such workloads over the past 2
years, so either the workload is not important or only your power7
machine is having problems.
Hence I don't see any need to back port it - it's not a critical fix
and very few people see the problem so there's no real need to do
the backport. Maybe someone else has the time and resources to
waste on backporting non-critical fixes to stable kernels, but I
don't....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: possible fsync02() xfs slowness regression on power7
2013-03-04 5:55 ` Dave Chinner
@ 2013-03-04 6:14 ` CAI Qian
0 siblings, 0 replies; 6+ messages in thread
From: CAI Qian @ 2013-03-04 6:14 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
----- Original Message -----
> From: "Dave Chinner" <david@fromorbit.com>
> To: "CAI Qian" <caiqian@redhat.com>
> Cc: xfs@oss.sgi.com
> Sent: Monday, March 4, 2013 1:55:02 PM
> Subject: Re: possible fsync02() xfs slowness regression on power7
>
> On Sun, Mar 03, 2013 at 11:23:26PM -0500, CAI Qian wrote:
> >
> > > > And this commit in 3.9-rc1:
> > > >
> > > > a1e16c2 xfs: limit speculative prealloc size on sparse files
> > > > should fix the problem. Please confirm these commits are the
> > > > cause
> > > > and the fix respectively....
> > Confirmed this fixed the problem. I'd like to request this to be
> > back-ported
> > to stable-3.0, stable-3.4 and stable-3.8. What do you think?
>
> IMO, no, it is not a backport candidate. The patch has quite a few
> dependencies, and at least for 3.0 xfs_bmapi_read() doesn't exist
> and hence is not a trivial backport.
It is fine to skip 3.0 then, but on the other stable branches the patch applies
as is and builds fine.
>
> Further, it's taken 2 years for this to be noticed, and you haven't
> explained why the problem exists on your power machine and not any
> others that it has been tested on. And there have been very few
> complaints about performance of such workloads over the past 2
> years, so either the workload is not important or only your power7
> machine is having problems.
Hmm, it could also be that the problem is easier to reproduce now with the
new kernel, new user space and new compiler. Also, those systems
only recently started to switch from ext4 to XFS for their root partitions,
so we never noticed this in the past when running those LTP tests.
XFS may also have become more popular than it was 2 years ago. :)
>
> Hence I don't see any need to back port it - it's not a critical fix
> and very few people see the problem so there's no real need to do
> the backport. Maybe someone else has the time and resources to
> waste on backporting non-critical fixes to stable kernels, but I
> don't....
I have time so I can do the back-port for you guys to review if you
don't mind. Thanks for your time.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>