* [PATCHv3 0/1] Optimize ext4 file overwrites - perf improvement
From: Ritesh Harjani @ 2020-09-18  5:06 UTC (permalink / raw)
  To: linux-ext4
  Cc: tytso, jack, dan.j.williams, anju, linux-fsdevel, linux-kernel,
	Ritesh Harjani

Hello,

v2 -> v3
1. Switched to the approach suggested by Jan, which makes the optimization
general for all file writes rather than DAX-only.
(So both DAX & DIO should now benefit from this, since both use the same
iomap path. Note, however, that I only measured the performance improvement
for DAX.)

Gave a run on xfstests with -g quick,dax and didn't observe any new
issues with this patch.

For file writes, we currently start a journal txn irrespective of whether
the write is an overwrite or not. For an overwrite we don't need to start a
jbd2 txn, since the blocks are already allocated.
So this patch optimizes away the txn start for file (DAX/DIO) overwrites.
This could significantly boost performance for multi-threaded writes,
especially random overwrites.
The fio script used to collect the perf numbers is given below.
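
The idea described above can be modeled with a small user-space toy. This
is only an illustrative sketch, not the patch itself: ToyInode, ToyJournal,
is_overwrite() and write_blocks() are hypothetical names, and the real
change lives in fs/ext4/inode.c. The sketch only shows that a transaction
start is skipped when every block in the written range is already allocated.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInode:
    """Toy stand-in for an inode: just the set of allocated block numbers."""
    mapped: set = field(default_factory=set)


@dataclass
class ToyJournal:
    """Counts how many (simulated) transactions were started."""
    txn_starts: int = 0

    def start(self):
        self.txn_starts += 1


def is_overwrite(inode, first, count):
    """True if every block in [first, first + count) is already allocated."""
    return all(b in inode.mapped for b in range(first, first + count))


def write_blocks(inode, journal, first, count, optimize=True):
    """Simulate block mapping for a write; start a txn only when needed."""
    if optimize and is_overwrite(inode, first, count):
        return "overwrite"      # fast path: no transaction is started
    journal.start()             # allocating write, or unoptimized behaviour
    inode.mapped.update(range(first, first + count))
    return "journaled"
```

With optimize=False the toy starts a transaction on every write, which
mirrors the current behaviour; with optimize=True only writes that touch
unallocated blocks start one.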

The numbers below were collected on a QEMU setup on a ppc64 box with a
simulated pmem (fsdax) device.

Didn't observe any new failures with this patch in xfstests "-g quick,dax"

Performance numbers with different thread counts (~10x improvement)
==========================================

vanilla_kernel(kIOPS) (randomwrite)
 60 +-+------+-------+--------+--------+--------+-------+------+-+   
     |        +       +        +        +**      +       +        |   
  55 +-+                                 **                     +-+   
     |                          **       **                       |   
     |                          **       **                       |   
  50 +-+                        **       **                     +-+   
     |                          **       **                       |   
  45 +-+                        **       **                     +-+   
     |                          **       **                       |   
     |                          **       **                       |   
  40 +-+                        **       **                     +-+   
     |                          **       **                       |   
  35 +-+               **       **       **                     +-+   
     |                 **       **       **               **      |   
     |                 **       **       **      **       **      |   
  30 +-+      **       **       **       **      **       **    +-+   
     |        **      +**      +**      +**      **      +**      |   
  25 +-+------**------+**------+**------+**------**------+**----+-+   
              1       2        4        8       12      16            
                                     Threads

patched_kernel(kIOPS) (randomwrite)
  600 +-+-----+--------+--------+-------+--------+-------+------+-+   
      |       +        +        +       +        +       +**      |   
      |                                                   **      |   
  500 +-+                                                 **    +-+   
      |                                                   **      |   
      |                                           **      **      |   
  400 +-+                                         **      **    +-+   
      |                                           **      **      |   
  300 +-+                                **       **      **    +-+   
      |                                  **       **      **      |   
      |                                  **       **      **      |   
  200 +-+                                **       **      **    +-+   
      |                         **       **       **      **      |   
      |                         **       **       **      **      |   
  100 +-+               **      **       **       **      **    +-+   
      |                 **      **       **       **      **      |   
      |       +**      +**      **      +**      +**     +**      |   
    0 +-+-----+**------+**------**------+**------+**-----+**----+-+   
              1        2        4       8       12      16            
                                    Threads

fio script
==========
[global]
rw=randwrite
norandommap=1
invalidate=0
bs=4k
; numjobs was changed for the different thread counts (1, 2, 4, 8, 12, 16)
numjobs=16
time_based=1
ramp_time=30
runtime=60
group_reporting=1
ioengine=psync
direct=1
size=16G
filename=file1.0.0:file1.0.1:file1.0.2:file1.0.3:file1.0.4:file1.0.5:file1.0.6:file1.0.7:file1.0.8:file1.0.9:file1.0.10:file1.0.11:file1.0.12:file1.0.13:file1.0.14:file1.0.15:file1.0.16:file1.0.17:file1.0.18:file1.0.19:file1.0.20:file1.0.21:file1.0.22:file1.0.23:file1.0.24:file1.0.25:file1.0.26:file1.0.27:file1.0.28:file1.0.29:file1.0.30:file1.0.31
file_service_type=random
nrfiles=32
directory=/mnt/

[name]
directory=/mnt/
direct=1
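
Since only numjobs changes between runs and the filename list follows a
regular pattern, the job file above could be generated per thread count.
The following is a minimal sketch under that assumption; job_file() and
THREAD_COUNTS are hypothetical helper names, and the thread counts are
taken from the x-axis of the charts above.

```python
THREAD_COUNTS = (1, 2, 4, 8, 12, 16)  # x-axis values of the charts
NRFILES = 32                          # file1.0.0 .. file1.0.31


def job_file(numjobs, directory="/mnt/"):
    """Return the text of the fio job file for a given thread count."""
    filenames = ":".join(f"file1.0.{i}" for i in range(NRFILES))
    lines = [
        "[global]",
        "rw=randwrite",
        "norandommap=1",
        "invalidate=0",
        "bs=4k",
        f"numjobs={numjobs}",
        "time_based=1",
        "ramp_time=30",
        "runtime=60",
        "group_reporting=1",
        "ioengine=psync",
        "direct=1",
        "size=16G",
        f"filename={filenames}",
        "file_service_type=random",
        f"nrfiles={NRFILES}",
        f"directory={directory}",
        "",
        "[name]",
        f"directory={directory}",
        "direct=1",
    ]
    return "\n".join(lines) + "\n"
```

Calling job_file(n) for each n in THREAD_COUNTS and writing the result to
a file of your choice gives one job file per data point.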

NOTE:
======
1. Given the ~10x perf delta, I probed a bit deeper to understand what is
causing this scalability problem. It seems that when we start a jbd2 txn,
the slab alloc code sees serious spinlock contention.

I think the spinlock contention in the slab alloc path could be optimized
on PPC in general; I will look into it separately. But I could still see a
perf improvement of close to 2x on a QEMU setup on x86 with a simulated pmem
device, comparing patched_kernel vs. vanilla_kernel with the same fio workload.

perf report from vanilla_kernel (this is not seen with patched kernel) (ppc64)
=======================================================================

  47.86%  fio              [kernel.vmlinux]            [k] do_raw_spin_lock
             |
             ---do_raw_spin_lock
                |
                |--19.43%--_raw_spin_lock
                |          |
                |           --19.31%--0
                |                     |
                |                     |--9.77%--deactivate_slab.isra.61
                |                     |          ___slab_alloc
                |                     |          __slab_alloc
                |                     |          kmem_cache_alloc
                |                     |          jbd2__journal_start
                |                     |          __ext4_journal_start_sb
<...>

2. This problem was reported by Dan Williams at [1]

Links
======
[1]: https://lore.kernel.org/linux-ext4/20190802144304.GP25064@quack2.suse.cz/T/
[v2]: https://lkml.org/lkml/2020/8/22/123

Ritesh Harjani (1):
  ext4: Optimize file overwrites

 fs/ext4/inode.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

-- 
2.26.2

