linux-ext4.vger.kernel.org archive mirror
* [PATCH 0/2] ext4: Reduce contention on s_orphan_lock
@ 2014-04-29 23:32 Jan Kara
  2014-04-29 23:32 ` [PATCH 1/2] ext4: Use sbi in ext4_orphan_del() Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Jan Kara @ 2014-04-29 23:32 UTC (permalink / raw)
  To: T Makphaibulchoke; +Cc: linux-ext4


  Hello,

  so I finally got to looking into your patches for reducing contention
on s_orphan_lock. AFAICT these two patches (the first one is just a
small cleanup) should have the same speed gain as the patches you wrote
and they are simpler. Can you give them a try please? Thanks!

								Honza

* [PATCH 0/2 v2] Improve orphan list scaling
@ 2014-05-15 20:17 Jan Kara
  2014-05-15 20:17 ` [PATCH 2/2] ext4: Reduce contention on s_orphan_lock Jan Kara
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Kara @ 2014-05-15 20:17 UTC (permalink / raw)
  To: linux-ext4; +Cc: Ted Tso, Thavatchai Makphaibulchoke


  Hello,

  this is my version of patches to improve orphan list scaling by
reducing the amount of work done under the global s_orphan_lock.
Thavatchai and I disagree about whose patches are better (see thread
http://www.spinics.net/lists/linux-ext4/msg43220.html) so I guess it's
up to Ted or other people on this list to decide.

When running code stressing orphan list operations [1] with these
patches, I see s_orphan_lock move from number 1 in the lock_stat report
to unmeasurable. So with the patches there are other, much more
problematic locks (superblock buffer lock and bh_state lock,
j_list_lock, buffer locks for inode buffers when several inodes share a
block...). The average times over 10 runs of the test program on my
48-way box with ext4 on ramdisk are:
        Vanilla                     Patched
Procs   Avg          Stddev         Avg          Stddev
 1        2.769200   0.056194         2.890700   0.061727
 2        5.756500   0.313268         4.383500   0.161629
 4       11.852500   0.130221         6.542900   0.160039
10       33.590900   0.394888        27.749500   0.615517
20       71.035400   0.320914        76.368700   3.734557
40      236.671100   2.856885       228.236800   2.361391

So we can see the biggest speedup was for 2, 4, and 10 threads. For
higher thread counts the contention and cache bouncing prevented any
significant speedup (we can even see a barely-out-of-noise performance
drop for 20 threads). 

Changes since v1:
* Fixed up various bugs in error handling pointed out by Thavatchai and
  some others as well
* Somewhat reduced critical sections under s_orphan_lock

[1] The test program runs a given number of processes; each process
truncates a 4k file by 1 byte at a time until it reaches 1 byte in size,
and then the file is extended to 4k again.

								Honza

* [PATCH 0/2 v3] Improve orphan list scaling
@ 2014-05-20 12:45 Jan Kara
  2014-05-20 12:45 ` [PATCH 2/2] ext4: Reduce contention on s_orphan_lock Jan Kara
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Kara @ 2014-05-20 12:45 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Thavatchai Makphaibulchoke, Jan Kara

  Hello,

  here is another version of my patches to improve orphan list scaling by
reducing the amount of work done under the global s_orphan_lock. Since the
previous version I've fixed some bugs (thanks Thavatchai!), retested with
updated xfstests to verify that the problem Ted spotted is fixed, and rerun
the performance tests because the bugs had a non-trivial impact on the
functionality.

To stress orphan list operations I run my artificial test program. The test
program runs a given number of processes; each process truncates a 4k file
by 1 byte at a time until it reaches 1 byte in size, and then the file is
extended to 4k again.
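
For readers who want to reproduce a similar load, here is a minimal sketch
of such a test in C. It is not the program used for the numbers in this
mail. The function names, the per-process file name, the use of
ftruncate() to extend the file, and the cycle count are all assumptions
made for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define FILE_SIZE 4096

/* One cycle: shrink the file one byte at a time down to 1 byte, then
 * extend it back to FILE_SIZE. Returns 0 on success, -1 on error. */
static int truncate_cycle(int fd)
{
	off_t size;

	for (size = FILE_SIZE - 1; size >= 1; size--)
		if (ftruncate(fd, size) < 0)
			return -1;
	return ftruncate(fd, FILE_SIZE);
}

/* Worker: create a private file (assumed: one file per process) and
 * run the given number of truncate cycles on it. */
static int worker(const char *dir, int id, int cycles)
{
	char path[4096];
	int fd, i, ret = 0;

	snprintf(path, sizeof(path), "%s/orphan-stress-%d", dir, id);
	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, FILE_SIZE) < 0)
		ret = -1;
	for (i = 0; ret == 0 && i < cycles; i++)
		ret = truncate_cycle(fd);
	close(fd);
	unlink(path);
	return ret;
}

/* Fork nprocs workers and wait for them. Returns the number of workers
 * that could not be started or that reported a failure. */
static int run_stress(const char *dir, int nprocs, int cycles)
{
	int i, status, started = 0, failed = 0;

	for (i = 0; i < nprocs; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failed++;
			continue;
		}
		if (pid == 0)
			_exit(worker(dir, i, cycles) ? 1 : 0);
		started++;
	}
	while (started-- > 0)
		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed++;
	return failed;
}
```

A caller would point run_stress() at a directory on the filesystem under
test, for example run_stress("/mnt/ramdisk", 10, 1000), and time the call;
the mount point and the counts here are placeholders.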

The average times for 10 runs for the test program to run on my 48-way box
with ext4 on ramdisk are:
        Vanilla                     Patched
Procs   Avg          Stddev         Avg                    Stddev
 1        2.769200   0.056194         2.750000 (-0.6%)     0.054772
 2        5.756500   0.313268         5.669000 (-1.5%)     0.587528
 4       11.852500   0.130221         8.311000 (-29.9%)    0.257544
10       33.590900   0.394888        20.162000 (-40.0%)    0.189832
20       71.035400   0.320914        55.854000 (-21.4%)    0.478815
40      236.671100   2.856885       174.543000 (-26.3%)    0.974547
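
The percentages in the Patched column are the relative change of the
patched average against the vanilla average. A small helper showing that
arithmetic, with a function name chosen here for illustration:

```c
/* Relative change of the patched time against the vanilla time, in
 * percent. Negative values mean the patched kernel was faster. */
static double percent_change(double vanilla, double patched)
{
	return (patched - vanilla) / vanilla * 100.0;
}
```

For instance, percent_change(33.590900, 20.162000) gives the figure for the
10-process row.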

In the lockstat reports, s_orphan_lock has been #1 in both cases; however,
the patches significantly reduced the contention. For 10 threads the numbers
look like:

         con-bounces contentions waittime-min waittime-max waittime-total
Orig         7089335     7089335         9.07   3504220.69  1473754546.28
Patched      2547868     2547868         9.18      8218.64   547550185.12

         waittime-avg acq-bounces acquisitions holdtime-min holdtime-max
Orig           207.88    14487647     16381236         0.16       211.62
Patched        214.91     7994533      8191236         0.16       203.81

         holdtime-total holdtime-avg
Orig        79738146.84         4.87
Patched     30660307.81         3.74
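
The two -avg columns can be rederived from the totals in the same report:
the total wait time divided by the number of contentions, and the total
hold time divided by the number of acquisitions. A short sketch of that
check, where the struct and function names are made up for this example:

```c
/* The lockstat columns needed to rederive the per-event averages. */
struct lock_sample {
	double contentions;
	double waittime_total;
	double acquisitions;
	double holdtime_total;
};

/* Average wait per contended acquisition. */
static double wait_avg(const struct lock_sample *s)
{
	return s->waittime_total / s->contentions;
}

/* Average hold time per acquisition. */
static double hold_avg(const struct lock_sample *s)
{
	return s->holdtime_total / s->acquisitions;
}
```

Feeding in the Orig and Patched rows above reproduces the reported
waittime-avg and holdtime-avg values to within rounding.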

We can see the number of acquisitions dropped by half (we now check
whether the inode already is / is not part of the orphan list before
acquiring s_orphan_lock). The average hold time is somewhat smaller as well,
and given that the patched kernel doesn't have those 50% of short-lived
acquisitions just for checking whether the inode is part of the orphan list,
we can see that the patched kernel really does significantly less work with
s_orphan_lock held.
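
The idea of checking list membership before taking the global lock can be
illustrated with a userspace analogue. This is only a sketch and not the
code from the patch: struct item, struct registry and all function names
are hypothetical, and it assumes the caller already holds a per-item lock
so that the membership of one item cannot change concurrently.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode that may sit on a global list. */
struct item {
	struct item *prev, *next;
	bool on_list;	/* changed only with the per-item lock held */
};

/* Hypothetical stand-in for the per-filesystem state. */
struct registry {
	pthread_mutex_t lock;	/* plays the role of the global lock */
	struct item head;	/* circular list head */
	unsigned long lock_acquisitions;
};

static void registry_init(struct registry *r)
{
	pthread_mutex_init(&r->lock, NULL);
	r->head.prev = r->head.next = &r->head;
	r->head.on_list = false;
	r->lock_acquisitions = 0;
}

static void item_init(struct item *it)
{
	it->prev = it->next = it;
	it->on_list = false;
}

/* Caller is assumed to hold the per-item lock. */
static void registry_add(struct registry *r, struct item *it)
{
	if (it->on_list)	/* cheap check, global lock not taken */
		return;
	pthread_mutex_lock(&r->lock);
	r->lock_acquisitions++;
	it->next = r->head.next;
	it->prev = &r->head;
	r->head.next->prev = it;
	r->head.next = it;
	it->on_list = true;
	pthread_mutex_unlock(&r->lock);
}

/* Caller is assumed to hold the per-item lock. */
static void registry_del(struct registry *r, struct item *it)
{
	if (!it->on_list)	/* nothing to do, global lock not taken */
		return;
	pthread_mutex_lock(&r->lock);
	r->lock_acquisitions++;
	it->prev->next = it->next;
	it->next->prev = it->prev;
	it->prev = it->next = it;
	it->on_list = false;
	pthread_mutex_unlock(&r->lock);
}
```

With this structure, a repeated add or a repeated delete of the same item
returns early, so only real list modifications count as acquisitions of the
global lock.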

Changes since v2:
* Fixed bug in ext4_orphan_del() leading to orphan list corruption - thanks
  to Thavatchai for pointing that out.
* Fixed bug in ext4_orphan_del() that could lead to using freed inodes

Changes since v1:
* Fixed up various bugs in error handling pointed out by Thavatchai and
  some others as well
* Somewhat reduced critical sections under s_orphan_lock

								Honza


Thread overview: 25+ messages
2014-04-29 23:32 [PATCH 0/2] ext4: Reduce contention on s_orphan_lock Jan Kara
2014-04-29 23:32 ` [PATCH 1/2] ext4: Use sbi in ext4_orphan_del() Jan Kara
2014-04-29 23:32 ` [PATCH 2/2] ext4: Reduce contention on s_orphan_lock Jan Kara
2014-05-02 21:56   ` Thavatchai Makphaibulchoke
2014-05-02 21:56 ` [PATCH 0/2] " Thavatchai Makphaibulchoke
2014-05-06 11:49   ` Jan Kara
2014-05-09  6:24     ` Thavatchai Makphaibulchoke
  -- strict thread matches above, loose matches on Subject: below --
2014-05-15 20:17 [PATCH 0/2 v2] Improve orphan list scaling Jan Kara
2014-05-15 20:17 ` [PATCH 2/2] ext4: Reduce contention on s_orphan_lock Jan Kara
2014-05-20  3:23   ` Theodore Ts'o
2014-05-20  8:33   ` Thavatchai Makphaibulchoke
2014-05-20  9:18     ` Jan Kara
2014-05-20 13:57     ` Theodore Ts'o
2014-05-20 17:16       ` Thavatchai Makphaibulchoke
2014-06-02 17:45       ` Thavatchai Makphaibulchoke
2014-06-03  8:52         ` Jan Kara
2014-06-16 19:20           ` Thavatchai Makphaibulchoke
2014-06-17  9:29             ` Jan Kara
2014-06-18  4:38               ` Thavatchai Makphaibulchoke
2014-06-18 10:37                 ` Jan Kara
2014-07-22  4:35                   ` Thavatchai Makphaibulchoke
2014-07-23  8:15                     ` Jan Kara
2014-05-20 12:45 [PATCH 0/2 v3] Improve orphan list scaling Jan Kara
2014-05-20 12:45 ` [PATCH 2/2] ext4: Reduce contention on s_orphan_lock Jan Kara
2014-05-20 16:45   ` Thavatchai Makphaibulchoke
2014-05-20 21:03     ` Jan Kara
2014-05-20 23:27       ` Thavatchai Makphaibulchoke
