linux-block.vger.kernel.org archive mirror
* [PATCH 0/3] zram: queue flag nowait and minor cleanup
@ 2023-05-12  8:29 Chaitanya Kulkarni
  2023-05-12  8:29 ` [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT Chaitanya Kulkarni
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-12  8:29 UTC (permalink / raw)
  To: minchan, senozhatsky; +Cc: axboe, linux-block, Chaitanya Kulkarni

Hi Minchan/Sergey,

Consolidate zram_bio_read() and zram_bio_write() into one function, add
the missing flush_dcache_page() call before zram_bvec_write(), and allow
the user to set QUEUE_FLAG_NOWAIT for better performance with
io_uring:-

* linux-block (for-next) # grep IOPS  zram*fio | column -t

default-nowait-off-1.fio:  read:  IOPS=792k,  BW=3095MiB/s
default-nowait-off-2.fio:  read:  IOPS=812k,  BW=3173MiB/s
default-nowait-off-3.fio:  read:  IOPS=803k,  BW=3137MiB/s

nowait-on-1.fio:           read:  IOPS=864k,  BW=3374MiB/s
nowait-on-2.fio:           read:  IOPS=863k,  BW=3371MiB/s
nowait-on-3.fio:           read:  IOPS=864k,  BW=3374MiB/s

* linux-block (for-next) # grep cpu  zram*fio | column -t
default-nowait-off-1.fio:  cpu  :  usr=5.92%,  sys=13.05%,  ctx=37784503
default-nowait-off-2.fio:  cpu  :  usr=6.07%,  sys=13.66%,  ctx=37821095
default-nowait-off-3.fio:  cpu  :  usr=6.06%,  sys=13.33%,  ctx=38395321

nowait-on-1.fio:           cpu  :  usr=1.78%,  sys=97.58%,  ctx=23848,
nowait-on-2.fio:           cpu  :  usr=1.78%,  sys=97.61%,  ctx=22537,
nowait-on-3.fio:           cpu  :  usr=1.80%,  sys=97.60%,  ctx=21582,

* linux-block (for-next) # grep slat  zram*fio | column -t
default-nowait-off-1.fio: slat (nsec): min=410, max=4827.1k,  avg=1868.3
default-nowait-off-2.fio: slat (nsec): min=411, max=5335.4k,  avg=1953.2
default-nowait-off-3.fio: slat (nsec): min=420, max=5102.4k,  avg=1852.9

nowait-on-1.fio:          slat (nsec): min=1072,max=33725k,  avg=54834.7
nowait-on-2.fio:          slat (nsec): min=972, max=16048k,  avg=54874.8
nowait-on-3.fio:          slat (nsec): min=1042,max=7085.1k, avg=54823.7
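
The fio job files referenced below (fio/randread.fio etc.) are not included
in this posting; an approximate randread invocation, inferred from the log
(48 jobs, io_uring, 4k blocks, 60s runs against /dev/zram0) with the
remaining parameters being guesses rather than the exact job file, would be:

  # approximate fio/randread.fio equivalent; iodepth/direct are assumptions
  fio --name=randread --filename=/dev/zram0 --size=20G \
      --ioengine=io_uring --rw=randread --bs=4k --iodepth=32 \
      --numjobs=48 --runtime=60 --time_based --direct=1 --group_reporting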

Please let me know if further testing is needed. I've run an fio
verification job in order to verify these changes.

-ck

Chaitanya Kulkarni (3):
  zram: allow user to set QUEUE_FLAG_NOWAIT
  zram: consolidate zram_bio_read()_zram_bio_write()
  zram: add flush_dcache_page() call for write

 drivers/block/zram/zram_drv.c | 63 +++++++++++++++--------------------
 1 file changed, 27 insertions(+), 36 deletions(-)

linux-block (for-next) # sh test-zram-nowait.sh 
+ git log -4
commit f6654c00ddea9b02952ac94551e0d7469fc8e8d8 (HEAD -> for-next)
Author: Chaitanya Kulkarni <kch@nvidia.com>
Date:   Thu May 11 23:50:37 2023 -0700

    zram: add flush_dcache_page() call for write
    
    Just like we have a flush_dcache_page() call after zram_bvec_read(), add
    the missing flush_dcache_page() call before zram_bvec_write() in order to
    handle the cache coherency of the kernel and userspace mappings of the
    page when zram does a write.
    
    Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>

commit bfe3e67f81284663bf84bc5121c27bf90410d433
Author: Chaitanya Kulkarni <kch@nvidia.com>
Date:   Thu May 11 23:46:43 2023 -0700

    zram: consolidate zram_bio_read()_zram_bio_write()
    
    zram_bio_read() and zram_bio_write() are 26 lines each and share most of
    the code except for the zram_bvec_read() and zram_bvec_write() calls.
    Consolidate the code into a single zram_bio_rw() to remove the duplicate
    code and an extra function that is only needed for 2 lines of code :-
    
    1c1
    < static void zram_bio_read(struct zram *zram, struct bio *bio)

commit 6f5b4f6dec19464158ead387739ceb72b97e991c
Author: Chaitanya Kulkarni <kch@nvidia.com>
Date:   Thu May 11 23:03:32 2023 -0700

    zram: allow user to set QUEUE_FLAG_NOWAIT
    
    Allow the user to optionally set QUEUE_FLAG_NOWAIT using a module
    parameter, retaining the default behaviour otherwise. Also, update the
    respective allocation flags in the write path. Following are the
    performance numbers with the io_uring fio engine for random reads; note
    that the device has been fully populated with a randwrite workload
    before taking these numbers :-
    
    * linux-block (for-next) # grep IOPS  zram*fio | column -t
    
    default-nowait-off-1.fio:  read:  IOPS=792k,  BW=3095MiB/s
    default-nowait-off-2.fio:  read:  IOPS=812k,  BW=3173MiB/s
    default-nowait-off-3.fio:  read:  IOPS=803k,  BW=3137MiB/s
    
    nowait-on-1.fio:           read:  IOPS=864k,  BW=3374MiB/s
    nowait-on-2.fio:           read:  IOPS=863k,  BW=3371MiB/s
    nowait-on-3.fio:           read:  IOPS=864k,  BW=3374MiB/s
    
    * linux-block (for-next) # grep cpu  zram*fio | column -t
    default-nowait-off-1.fio:  cpu  :  usr=5.92%,  sys=13.05%,  ctx=37784503
    default-nowait-off-2.fio:  cpu  :  usr=6.07%,  sys=13.66%,  ctx=37821095
    default-nowait-off-3.fio:  cpu  :  usr=6.06%,  sys=13.33%,  ctx=38395321
    
    nowait-on-1.fio:           cpu  :  usr=1.78%,  sys=97.58%,  ctx=23848,
    nowait-on-2.fio:           cpu  :  usr=1.78%,  sys=97.61%,  ctx=22537,
    nowait-on-3.fio:           cpu  :  usr=1.80%,  sys=97.60%,  ctx=21582,
    
    * linux-block (for-next) # grep slat  zram*fio | column -t
    default-nowait-off-1.fio: slat (nsec): min=410, max=4827.1k,  avg=1868.3
    default-nowait-off-2.fio: slat (nsec): min=411, max=5335.4k,  avg=1953.2
    default-nowait-off-3.fio: slat (nsec): min=420, max=5102.4k,  avg=1852.9
    
    nowait-on-1.fio:          slat (nsec): min=1072,max=33725k,  avg=54834.7
    nowait-on-2.fio:          slat (nsec): min=972, max=16048k,  avg=54874.8
    nowait-on-3.fio:          slat (nsec): min=1042,max=7085.1k, avg=54823.7
    
    Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>

commit 6622cc4e0c7139d5173ae5f115a7d87e45a51c1a
Merge: 2742538c4d28 7387653a6022
Author: Chaitanya Kulkarni <kch@nvidia.com>
Date:   Thu May 11 22:57:28 2023 -0700

    Merge branch 'for-next' of git://git.kernel.dk/linux-block into for-next
+ ./compile_zram.sh
+ umount /mnt/zram
umount: /mnt/zram: no mount point specified.
+ dmesg -c
+ modprobe -r zram
+ lsmod
+ grep zram
++ nproc
+ make -j 48 M=drivers/block modules
+ HOST=drivers/block/zram/zram.ko
++ uname -r
+ HOST_DEST=/lib/modules/6.4.0-rc1lblk+/kernel/drivers/block/zram/
+ cp drivers/block/zram/zram.ko /lib/modules/6.4.0-rc1lblk+/kernel/drivers/block/zram//
+ ls -lrth /lib/modules/6.4.0-rc1lblk+/kernel/drivers/block/zram//zram.ko
-rw-r--r--. 1 root root 676K May 12 00:23 /lib/modules/6.4.0-rc1lblk+/kernel/drivers/block/zram//zram.ko
+ dmesg -c
+ lsmod
+ grep zram
+ modprobe zram
+ sleep 1
+ zramctl -s 20GB /dev/zram0
+ sleep 1
+ test_nowait default-nowait-off
+ fio fio/verify.fio --ioengine=io_uring --size=2G --filename=/dev/zram0
write-and-verify: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=16
fio-3.34
Starting 1 process
Jobs: 1 (f=1): [V(1)][100.0%][r=995MiB/s][r=255k IOPS][eta 00m:00s]                  
write-and-verify: (groupid=0, jobs=1): err= 0: pid=176756: Fri May 12 00:23:34 2023
  read: IOPS=255k, BW=995MiB/s (1043MB/s)(1295MiB/1302msec)
    slat (nsec): min=922, max=37201, avg=2879.25, stdev=1196.56
    clat (usec): min=3, max=134, avg=58.98, stdev= 5.43
     lat (usec): min=6, max=137, avg=61.85, stdev= 5.61
    clat percentiles (usec):
     |  1.00th=[   48],  5.00th=[   51], 10.00th=[   53], 20.00th=[   55],
     | 30.00th=[   57], 40.00th=[   58], 50.00th=[   59], 60.00th=[   60],
     | 70.00th=[   62], 80.00th=[   63], 90.00th=[   67], 95.00th=[   69],
     | 99.00th=[   76], 99.50th=[   78], 99.90th=[   86], 99.95th=[   92],
     | 99.99th=[  103]
  write: IOPS=201k, BW=785MiB/s (823MB/s)(2048MiB/2609msec); 0 zone resets
    slat (nsec): min=1132, max=148451, avg=4590.62, stdev=2151.57
    clat (usec): min=19, max=250, avg=74.77, stdev=11.44
     lat (usec): min=23, max=338, avg=79.36, stdev=11.92
    clat percentiles (usec):
     |  1.00th=[   54],  5.00th=[   61], 10.00th=[   65], 20.00th=[   69],
     | 30.00th=[   71], 40.00th=[   73], 50.00th=[   75], 60.00th=[   76],
     | 70.00th=[   78], 80.00th=[   81], 90.00th=[   86], 95.00th=[   91],
     | 99.00th=[  116], 99.50th=[  145], 99.90th=[  165], 99.95th=[  176],
     | 99.99th=[  204]
   bw (  KiB/s): min=161184, max=886504, per=86.97%, avg=699050.67, stdev=266603.81, samples=6
   iops        : min=40296, max=221626, avg=174762.67, stdev=66650.95, samples=6
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=1.57%, 100=97.06%
  lat (usec)   : 250=1.37%, 500=0.01%
  cpu          : usr=42.84%, sys=56.55%, ctx=4056, majf=0, minf=9079
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=331621,524288,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=995MiB/s (1043MB/s), 995MiB/s-995MiB/s (1043MB/s-1043MB/s), io=1295MiB (1358MB), run=1302-1302msec
  WRITE: bw=785MiB/s (823MB/s), 785MiB/s-785MiB/s (823MB/s-823MB/s), io=2048MiB (2147MB), run=2609-2609msec

Disk stats (read/write):
  zram0: ios=327141/524288, merge=0/0, ticks=484/1980, in_queue=2464, util=97.57%
+ fio fio/randwrite.fio --ioengine=io_uring --size=20G --filename=/dev/zram0
RANDWRITE: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=2
...
fio-3.34
Starting 48 processes
Jobs: 48 (f=48): [w(48)][100.0%][w=714MiB/s][w=183k IOPS][eta 00m:00s]
RANDWRITE: (groupid=0, jobs=48): err= 0: pid=176785: Fri May 12 00:24:34 2023
  write: IOPS=214k, BW=836MiB/s (877MB/s)(49.0GiB/60002msec); 0 zone resets
    slat (nsec): min=381, max=1706.8k, avg=2462.04, stdev=2710.88
    clat (nsec): min=221, max=19317k, avg=445142.67, stdev=614360.19
     lat (usec): min=8, max=19319, avg=447.60, stdev=614.51
    clat percentiles (usec):
     |  1.00th=[   21],  5.00th=[   26], 10.00th=[   31], 20.00th=[   43],
     | 30.00th=[   55], 40.00th=[   74], 50.00th=[  251], 60.00th=[  465],
     | 70.00th=[  515], 80.00th=[  725], 90.00th=[ 1020], 95.00th=[ 1418],
     | 99.00th=[ 3228], 99.50th=[ 4015], 99.90th=[ 5407], 99.95th=[ 6063],
     | 99.99th=[ 7570]
   bw (  KiB/s): min=479679, max=1759836, per=100.00%, avg=857832.81, stdev=5081.52, samples=5712
   iops        : min=119913, max=439954, avg=214451.85, stdev=1270.39, samples=5712
  lat (nsec)   : 250=0.01%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.86%, 50=25.67%
  lat (usec)   : 100=18.97%, 250=4.47%, 500=17.37%, 750=12.90%, 1000=8.96%
  lat (msec)   : 2=8.51%, 4=1.77%, 10=0.50%, 20=0.01%
  cpu          : usr=5.18%, sys=9.25%, ctx=9890182, majf=0, minf=579
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12840341,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=836MiB/s (877MB/s), 836MiB/s-836MiB/s (877MB/s-877MB/s), io=49.0GiB (52.6GB), run=60002-60002msec

Disk stats (read/write):
  zram0: ios=0/12822005, merge=0/0, ticks=0/2558435, in_queue=2558435, util=99.96%
+ for i in 1 2 3
+ fio fio/randread.fio --ioengine=io_uring --size=20G --filename=/dev/zram0 --output=zram-default-nowait-off-1.fio
+ for i in 1 2 3
[r(48)][100.0%][r=3138MiB/s][r=803k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=20G --filename=/dev/zram0 --output=zram-default-nowait-off-2.fio
+ for i in 1 2 3
[r(48)][100.0%][r=3027MiB/s][r=775k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=20G --filename=/dev/zram0 --output=zram-default-nowait-off-3.fio
+ modprobe -r zram
[r(48)][100.0%][r=3074MiB/s][r=787k IOPS][eta 00m:00s]
+ modprobe zram nowait=1
+ sleep 1
+ zramctl -s 20GB /dev/zram0
+ sleep 1
+ test_nowait nowait-on
+ fio fio/verify.fio --ioengine=io_uring --size=2G --filename=/dev/zram0
write-and-verify: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=16
fio-3.34
Starting 1 process
Jobs: 1 (f=1): [V(1)][80.0%][r=1192MiB/s,w=124MiB/s][r=305k,w=31.8k IOPS][eta 00m:01s]
write-and-verify: (groupid=0, jobs=1): err= 0: pid=177580: Fri May 12 00:27:46 2023
  read: IOPS=386k, BW=1507MiB/s (1581MB/s)(1293MiB/858msec)
    slat (nsec): min=1242, max=44093, avg=1777.71, stdev=460.23
    clat (nsec): min=721, max=92537, avg=38932.85, stdev=2677.45
     lat (nsec): min=2494, max=95692, avg=40710.56, stdev=2756.19
    clat percentiles (nsec):
     |  1.00th=[36608],  5.00th=[37120], 10.00th=[37632], 20.00th=[37632],
     | 30.00th=[38144], 40.00th=[38144], 50.00th=[38144], 60.00th=[38656],
     | 70.00th=[38656], 80.00th=[39168], 90.00th=[39680], 95.00th=[44800],
     | 99.00th=[49920], 99.50th=[54016], 99.90th=[64256], 99.95th=[65280],
     | 99.99th=[79360]
  write: IOPS=169k, BW=660MiB/s (692MB/s)(2048MiB/3104msec); 0 zone resets
    slat (usec): min=3, max=1097, avg= 5.59, stdev= 2.13
    clat (nsec): min=601, max=1389.5k, avg=88874.27, stdev=12749.86
     lat (usec): min=5, max=1402, avg=94.46, stdev=13.41
    clat percentiles (usec):
     |  1.00th=[   69],  5.00th=[   73], 10.00th=[   76], 20.00th=[   80],
     | 30.00th=[   84], 40.00th=[   87], 50.00th=[   90], 60.00th=[   92],
     | 70.00th=[   95], 80.00th=[   97], 90.00th=[  101], 95.00th=[  104],
     | 99.00th=[  116], 99.50th=[  128], 99.90th=[  167], 99.95th=[  186],
     | 99.99th=[  241]
   bw (  KiB/s): min=122368, max=788104, per=88.69%, avg=599186.29, stdev=219358.99, samples=7
   iops        : min=30592, max=197026, avg=149796.57, stdev=54839.75, samples=7
  lat (nsec)   : 750=0.01%
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=38.32%, 100=55.08%
  lat (usec)   : 250=6.60%, 500=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=35.09%, sys=64.76%, ctx=6, majf=0, minf=9067
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=100.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=331086,524288,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=1507MiB/s (1581MB/s), 1507MiB/s-1507MiB/s (1581MB/s-1581MB/s), io=1293MiB (1356MB), run=858-858msec
  WRITE: bw=660MiB/s (692MB/s), 660MiB/s-660MiB/s (692MB/s-692MB/s), io=2048MiB (2147MB), run=3104-3104msec

Disk stats (read/write):
  zram0: ios=304865/524288, merge=0/0, ticks=262/1514, in_queue=1776, util=97.57%
+ fio fio/randwrite.fio --ioengine=io_uring --size=20G --filename=/dev/zram0
RANDWRITE: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=2
...
fio-3.34
Starting 48 processes
Jobs: 48 (f=48): [w(48)][100.0%][w=759MiB/s][w=194k IOPS][eta 00m:00s]
RANDWRITE: (groupid=0, jobs=48): err= 0: pid=177583: Fri May 12 00:28:47 2023
  write: IOPS=227k, BW=887MiB/s (931MB/s)(52.0GiB/60001msec); 0 zone resets
    slat (usec): min=6, max=26750, avg=210.18, stdev=316.16
    clat (nsec): min=681, max=26752k, avg=211628.33, stdev=316425.45
     lat (usec): min=18, max=27199, avg=421.81, stdev=457.77
    clat percentiles (usec):
     |  1.00th=[   16],  5.00th=[   18], 10.00th=[   19], 20.00th=[   21],
     | 30.00th=[   23], 40.00th=[   26], 50.00th=[   29], 60.00th=[   36],
     | 70.00th=[  457], 80.00th=[  506], 90.00th=[  545], 95.00th=[  594],
     | 99.00th=[ 1074], 99.50th=[ 1254], 99.90th=[ 3425], 99.95th=[ 3687],
     | 99.99th=[ 4686]
   bw (  KiB/s): min=710189, max=1653080, per=100.00%, avg=910501.29, stdev=3653.62, samples=5712
   iops        : min=177535, max=413270, avg=227617.18, stdev=913.41, samples=5712
  lat (nsec)   : 750=0.01%, 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=16.56%, 50=49.44%
  lat (usec)   : 100=0.93%, 250=0.06%, 500=11.29%, 750=18.02%, 1000=1.51%
  lat (msec)   : 2=1.91%, 4=0.24%, 10=0.03%, 20=0.01%, 50=0.01%
  cpu          : usr=0.69%, sys=98.61%, ctx=26601, majf=0, minf=582
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,13631002,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
  WRITE: bw=887MiB/s (931MB/s), 887MiB/s-887MiB/s (931MB/s-931MB/s), io=52.0GiB (55.8GB), run=60001-60001msec

Disk stats (read/write):
  zram0: ios=0/13604021, merge=0/0, ticks=0/2807107, in_queue=2807107, util=99.95%
+ for i in 1 2 3
+ fio fio/randread.fio --ioengine=io_uring --size=20G --filename=/dev/zram0 --output=zram-nowait-on-1.fio
+ for i in 1 2 3
[r(48)][100.0%][r=3378MiB/s][r=865k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=20G --filename=/dev/zram0 --output=zram-nowait-on-2.fio
+ for i in 1 2 3
[r(48)][100.0%][r=3357MiB/s][r=859k IOPS][eta 00m:00s]
+ fio fio/randread.fio --ioengine=io_uring --size=20G --filename=/dev/zram0 --output=zram-nowait-on-3.fio
+ modprobe -r zram
[r(48)][100.0%][r=3357MiB/s][r=859k IOPS][eta 00m:00s]

linux-block (for-next) # for i in IOPS cpu; do grep $i zram*fio | column -t ; done
zram-default-nowait-off-1.fio:  read:  IOPS=802k,  BW=3133MiB/s  (3285MB/s)(184GiB/60001msec)
zram-default-nowait-off-2.fio:  read:  IOPS=796k,  BW=3111MiB/s  (3262MB/s)(182GiB/60002msec)
zram-default-nowait-off-3.fio:  read:  IOPS=796k,  BW=3108MiB/s  (3259MB/s)(182GiB/60002msec)
zram-nowait-on-1.fio:           read:  IOPS=857k,  BW=3346MiB/s  (3509MB/s)(196GiB/60001msec)
zram-nowait-on-2.fio:           read:  IOPS=857k,  BW=3347MiB/s  (3509MB/s)(196GiB/60001msec)
zram-nowait-on-3.fio:           read:  IOPS=858k,  BW=3353MiB/s  (3516MB/s)(196GiB/60001msec)
zram-default-nowait-off-1.fio:  cpu  :  usr=5.82%,  sys=13.54%,  ctx=36301915,  majf=0,  minf=687
zram-default-nowait-off-2.fio:  cpu  :  usr=5.84%,  sys=13.03%,  ctx=37781937,  majf=0,  minf=640
zram-default-nowait-off-3.fio:  cpu  :  usr=5.84%,  sys=12.90%,  ctx=37492533,  majf=0,  minf=688
zram-nowait-on-1.fio:           cpu  :  usr=1.74%,  sys=97.62%,  ctx=24068,     majf=0,  minf=783
zram-nowait-on-2.fio:           cpu  :  usr=1.74%,  sys=97.57%,  ctx=24674,     majf=0,  minf=763
zram-nowait-on-3.fio:           cpu  :  usr=1.76%,  sys=97.59%,  ctx=24725,     majf=0,  minf=723



-- 
2.40.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-12  8:29 [PATCH 0/3] zram: queue flag nowait and minor cleanup Chaitanya Kulkarni
@ 2023-05-12  8:29 ` Chaitanya Kulkarni
  2023-05-12 14:31   ` Christoph Hellwig
  2023-05-12  8:29 ` [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write() Chaitanya Kulkarni
  2023-05-12  8:29 ` [PATCH 3/3] zram: add flush_dcache_page() call for write Chaitanya Kulkarni
  2 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-12  8:29 UTC (permalink / raw)
  To: minchan, senozhatsky; +Cc: axboe, linux-block, Chaitanya Kulkarni

Allow the user to optionally set QUEUE_FLAG_NOWAIT using a module
parameter, retaining the default behaviour otherwise. Also, update the
respective allocation flags in the write path. Following are the
performance numbers with the io_uring fio engine for random reads; note
that the device has been fully populated with a randwrite workload
before taking these numbers :-

* linux-block (for-next) # grep IOPS  zram*fio | column -t 

default-nowait-off-1.fio:  read:  IOPS=802k,  BW=3133MiB/s
default-nowait-off-2.fio:  read:  IOPS=796k,  BW=3111MiB/s
default-nowait-off-3.fio:  read:  IOPS=796k,  BW=3108MiB/s

nowait-on-1.fio:           read:  IOPS=857k,  BW=3346MiB/s
nowait-on-2.fio:           read:  IOPS=857k,  BW=3347MiB/s
nowait-on-3.fio:           read:  IOPS=858k,  BW=3353MiB/s

* linux-block (for-next) # grep cpu  zram*fio | column -t 
default-nowait-off-1.fio:  cpu  :  usr=5.82%,  sys=13.54%,  ctx=36301915
default-nowait-off-2.fio:  cpu  :  usr=5.84%,  sys=13.03%,  ctx=37781937
default-nowait-off-3.fio:  cpu  :  usr=5.84%,  sys=12.90%,  ctx=37492533

nowait-on-1.fio:           cpu  :  usr=1.74%,  sys=97.62%,  ctx=24068,  
nowait-on-2.fio:           cpu  :  usr=1.74%,  sys=97.57%,  ctx=24674,  
nowait-on-3.fio:           cpu  :  usr=1.76%,  sys=97.59%,  ctx=24725,  

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/block/zram/zram_drv.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index f6d90f1ba5cf..b2e419f15f71 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -36,6 +36,10 @@
 
 #include "zram_drv.h"
 
+static bool g_nowait;
+module_param_named(nowait, g_nowait, bool, 0444);
+MODULE_PARM_DESC(nowait, "set QUEUE_FLAG_NOWAIT. Default: False");
+
 static DEFINE_IDR(zram_index_idr);
 /* idr index must be protected */
 static DEFINE_MUTEX(zram_index_mutex);
@@ -1540,9 +1544,10 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 static int zram_bvec_write_partial(struct zram *zram, struct bio_vec *bvec,
 				   u32 index, int offset, struct bio *bio)
 {
-	struct page *page = alloc_page(GFP_NOIO);
+	struct page *page;
 	int ret;
 
+	page = alloc_page(bio->bi_opf & REQ_NOWAIT ? GFP_NOWAIT : GFP_NOIO);
 	if (!page)
 		return -ENOMEM;
 
@@ -2215,6 +2220,8 @@ static int zram_add(void)
 	/* zram devices sort of resembles non-rotational disks */
 	blk_queue_flag_set(QUEUE_FLAG_NONROT, zram->disk->queue);
 	blk_queue_flag_set(QUEUE_FLAG_SYNCHRONOUS, zram->disk->queue);
+	if (g_nowait)
+		blk_queue_flag_set(QUEUE_FLAG_NOWAIT, zram->disk->queue);
 
 	/*
 	 * To ensure that we always get PAGE_SIZE aligned
-- 
2.40.0
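
Usage with this patch applied would be along these lines (a sketch; the
parameters path follows from the 0444 permission above, output not taken
from this thread):

  modprobe zram nowait=1
  cat /sys/module/zram/parameters/nowait    # expected to show 'Y'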


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write()
  2023-05-12  8:29 [PATCH 0/3] zram: queue flag nowait and minor cleanup Chaitanya Kulkarni
  2023-05-12  8:29 ` [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT Chaitanya Kulkarni
@ 2023-05-12  8:29 ` Chaitanya Kulkarni
  2023-05-12 14:32   ` Christoph Hellwig
  2023-05-12  8:29 ` [PATCH 3/3] zram: add flush_dcache_page() call for write Chaitanya Kulkarni
  2 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-12  8:29 UTC (permalink / raw)
  To: minchan, senozhatsky; +Cc: axboe, linux-block, Chaitanya Kulkarni

zram_bio_read() and zram_bio_write() are 26 lines each and share most of
the code except for the zram_bvec_read() and zram_bvec_write() calls.
Consolidate the code into a single zram_bio_rw() to remove the duplicate
code and an extra function that is only needed for 2 lines of code :-

1c1
< static void zram_bio_read(struct zram *zram, struct bio *bio)
---
> static void zram_bio_write(struct zram *zram, struct bio *bio)
13,14c13,14
< 		if (zram_bvec_read(zram, &bv, index, offset, bio) < 0) {
< 			atomic64_inc(&zram->stats.failed_reads);
---
> 		if (zram_bvec_write(zram, &bv, index, offset, bio) < 0) {
> 			atomic64_inc(&zram->stats.failed_writes);
18d17
< 		flush_dcache_page(bv.bv_page);

diff stats with this patch :-

 drivers/block/zram/zram_drv.c | 53 ++++++++++++-----------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/block/zram/zram_drv.c | 53 ++++++++++++-----------------------
 1 file changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index b2e419f15f71..fc37419b3735 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1873,38 +1873,12 @@ static void zram_bio_discard(struct zram *zram, struct bio *bio)
 	bio_endio(bio);
 }
 
-static void zram_bio_read(struct zram *zram, struct bio *bio)
-{
-	struct bvec_iter iter;
-	struct bio_vec bv;
-	unsigned long start_time;
-
-	start_time = bio_start_io_acct(bio);
-	bio_for_each_segment(bv, bio, iter) {
-		u32 index = iter.bi_sector >> SECTORS_PER_PAGE_SHIFT;
-		u32 offset = (iter.bi_sector & (SECTORS_PER_PAGE - 1)) <<
-				SECTOR_SHIFT;
-
-		if (zram_bvec_read(zram, &bv, index, offset, bio) < 0) {
-			atomic64_inc(&zram->stats.failed_reads);
-			bio->bi_status = BLK_STS_IOERR;
-			break;
-		}
-		flush_dcache_page(bv.bv_page);
-
-		zram_slot_lock(zram, index);
-		zram_accessed(zram, index);
-		zram_slot_unlock(zram, index);
-	}
-	bio_end_io_acct(bio, start_time);
-	bio_endio(bio);
-}
-
-static void zram_bio_write(struct zram *zram, struct bio *bio)
+static void zram_bio_rw(struct zram *zram, struct bio *bio)
 {
 	struct bvec_iter iter;
 	struct bio_vec bv;
 	unsigned long start_time;
+	int ret;
 
 	start_time = bio_start_io_acct(bio);
 	bio_for_each_segment(bv, bio, iter) {
@@ -1912,10 +1886,21 @@ static void zram_bio_write(struct zram *zram, struct bio *bio)
 		u32 offset = (iter.bi_sector & (SECTORS_PER_PAGE - 1)) <<
 				SECTOR_SHIFT;
 
-		if (zram_bvec_write(zram, &bv, index, offset, bio) < 0) {
-			atomic64_inc(&zram->stats.failed_writes);
-			bio->bi_status = BLK_STS_IOERR;
-			break;
+		if (op_is_write(bio_op(bio))) {
+			ret = zram_bvec_write(zram, &bv, index, offset, bio);
+			if (ret < 0) {
+				atomic64_inc(&zram->stats.failed_writes);
+				bio->bi_status = BLK_STS_IOERR;
+				break;
+			}
+		} else {
+			ret = zram_bvec_read(zram, &bv, index, offset, bio);
+			if (ret < 0) {
+				atomic64_inc(&zram->stats.failed_reads);
+				bio->bi_status = BLK_STS_IOERR;
+				break;
+			}
+			flush_dcache_page(bv.bv_page);
 		}
 
 		zram_slot_lock(zram, index);
@@ -1935,10 +1920,8 @@ static void zram_submit_bio(struct bio *bio)
 
 	switch (bio_op(bio)) {
 	case REQ_OP_READ:
-		zram_bio_read(zram, bio);
-		break;
 	case REQ_OP_WRITE:
-		zram_bio_write(zram, bio);
+		zram_bio_rw(zram, bio);
 		break;
 	case REQ_OP_DISCARD:
 	case REQ_OP_WRITE_ZEROES:
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] zram: add flush_dcache_page() call for write
  2023-05-12  8:29 [PATCH 0/3] zram: queue flag nowait and minor cleanup Chaitanya Kulkarni
  2023-05-12  8:29 ` [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT Chaitanya Kulkarni
  2023-05-12  8:29 ` [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write() Chaitanya Kulkarni
@ 2023-05-12  8:29 ` Chaitanya Kulkarni
  2023-05-12 14:34   ` Christoph Hellwig
  2 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-12  8:29 UTC (permalink / raw)
  To: minchan, senozhatsky; +Cc: axboe, linux-block, Chaitanya Kulkarni

Just like we have a flush_dcache_page() call after zram_bvec_read(), add
the missing flush_dcache_page() call before zram_bvec_write() in order to
handle the cache coherency of the kernel and userspace mappings of the
page for REQ_OP_WRITE handling.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
 drivers/block/zram/zram_drv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index fc37419b3735..a7954ae80d40 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1887,6 +1887,7 @@ static void zram_bio_rw(struct zram *zram, struct bio *bio)
 				SECTOR_SHIFT;
 
 		if (op_is_write(bio_op(bio))) {
+			flush_dcache_page(bv.bv_page);
 			ret = zram_bvec_write(zram, &bv, index, offset, bio);
 			if (ret < 0) {
 				atomic64_inc(&zram->stats.failed_writes);
-- 
2.40.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-12  8:29 ` [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT Chaitanya Kulkarni
@ 2023-05-12 14:31   ` Christoph Hellwig
  2023-05-12 14:34     ` Jens Axboe
  2023-05-13  1:05     ` Chaitanya Kulkarni
  0 siblings, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-05-12 14:31 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: minchan, senozhatsky, axboe, linux-block

On Fri, May 12, 2023 at 01:29:56AM -0700, Chaitanya Kulkarni wrote:
> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
> parameter to retain the default behaviour. Also, update respective
> allocation flags in the write path. Following are the performance
> numbers with io_uring fio engine for random read, note that device has
> been populated fully with randwrite workload before taking these
> numbers :-

Why would you add a module option, except to make everyone's life hell?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write()
  2023-05-12  8:29 ` [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write() Chaitanya Kulkarni
@ 2023-05-12 14:32   ` Christoph Hellwig
  2023-05-13  1:06     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2023-05-12 14:32 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: minchan, senozhatsky, axboe, linux-block

On Fri, May 12, 2023 at 01:29:57AM -0700, Chaitanya Kulkarni wrote:
> zram_bio_read() and zram_bio_write() are 26 lines rach and share most of
> the code except call to zram_bdev_read() and zram_bvec_write() calls.
> Consolidate code into single zram_bio_rw() to remove the duplicate code
> and an extra function that is only needed for 2 lines of code :-

And it adds an extra conditional to every page in the fast path..

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] zram: add flush_dcache_page() call for write
  2023-05-12  8:29 ` [PATCH 3/3] zram: add flush_dcache_page() call for write Chaitanya Kulkarni
@ 2023-05-12 14:34   ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2023-05-12 14:34 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: minchan, senozhatsky, axboe, linux-block

On Fri, May 12, 2023 at 01:29:58AM -0700, Chaitanya Kulkarni wrote:
> Just like we have a call flush_dcache_read() after zrma_bvec_read(), add
> missing flush_dcache_page() call before zram_bdev_write() in order to
> handle the cache congruency of the kernel and userspace mappings of
> page for REQ_OP_WRITE handling.

And why do you think this should be required?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-12 14:31   ` Christoph Hellwig
@ 2023-05-12 14:34     ` Jens Axboe
  2023-05-13  1:06       ` Chaitanya Kulkarni
  2023-05-13  1:05     ` Chaitanya Kulkarni
  1 sibling, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2023-05-12 14:34 UTC (permalink / raw)
  To: Christoph Hellwig, Chaitanya Kulkarni; +Cc: minchan, senozhatsky, linux-block

On 5/12/23 8:31 AM, Christoph Hellwig wrote:
> On Fri, May 12, 2023 at 01:29:56AM -0700, Chaitanya Kulkarni wrote:
>> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
>> parameter to retain the default behaviour. Also, update respective
>> allocation flags in the write path. Following are the performance
>> numbers with io_uring fio engine for random read, note that device has
>> been populated fully with randwrite workload before taking these
>> numbers :-
> 
> Why would you add a module option, except to make everyones life hell?

Yeah that makes no sense. Either the driver is nowait compatible and
it should just set the flag, or it's not.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-12 14:31   ` Christoph Hellwig
  2023-05-12 14:34     ` Jens Axboe
@ 2023-05-13  1:05     ` Chaitanya Kulkarni
  1 sibling, 0 replies; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-13  1:05 UTC (permalink / raw)
  To: Christoph Hellwig, Chaitanya Kulkarni
  Cc: minchan@kernel.org, senozhatsky@chromium.org, axboe@kernel.dk,
	linux-block@vger.kernel.org

On 5/12/23 07:31, Christoph Hellwig wrote:
> On Fri, May 12, 2023 at 01:29:56AM -0700, Chaitanya Kulkarni wrote:
>> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
>> parameter to retain the default behaviour. Also, update respective
>> allocation flags in the write path. Following are the performance
>> numbers with io_uring fio engine for random read, note that device has
>> been populated fully with randwrite workload before taking these
>> numbers :-
> Why would you add a module option, except to make everyones life hell?

send v2 without modparam.

-ck



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-12 14:34     ` Jens Axboe
@ 2023-05-13  1:06       ` Chaitanya Kulkarni
  2023-05-16  5:51         ` Chaitanya Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-13  1:06 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig, Chaitanya Kulkarni
  Cc: minchan@kernel.org, senozhatsky@chromium.org,
	linux-block@vger.kernel.org

On 5/12/23 07:34, Jens Axboe wrote:
> On 5/12/23 8:31 AM, Christoph Hellwig wrote:
>> On Fri, May 12, 2023 at 01:29:56AM -0700, Chaitanya Kulkarni wrote:
>>> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
>>> parameter to retain the default behaviour. Also, update respective
>>> allocation flags in the write path. Following are the performance
>>> numbers with io_uring fio engine for random read, note that device has
>>> been populated fully with randwrite workload before taking these
>>> numbers :-
>> Why would you add a module option, except to make everyones life hell?
> Yeah that makes no sense. Either the driver is nowait compatible and
> it should just set the flag, or it's not.
>

send v2 without modparam.

-ck



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write()
  2023-05-12 14:32   ` Christoph Hellwig
@ 2023-05-13  1:06     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-13  1:06 UTC (permalink / raw)
  To: Christoph Hellwig, Chaitanya Kulkarni
  Cc: minchan@kernel.org, senozhatsky@chromium.org, axboe@kernel.dk,
	linux-block@vger.kernel.org

On 5/12/23 07:32, Christoph Hellwig wrote:
> On Fri, May 12, 2023 at 01:29:57AM -0700, Chaitanya Kulkarni wrote:
>> zram_bio_read() and zram_bio_write() are 26 lines rach and share most of
>> the code except call to zram_bdev_read() and zram_bvec_write() calls.
>> Consolidate code into single zram_bio_rw() to remove the duplicate code
>> and an extra function that is only needed for 2 lines of code :-
> And it adds an extra conditional to every page in the fast path..

will create a version without additional branch in v2 ..

-ck



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-13  1:06       ` Chaitanya Kulkarni
@ 2023-05-16  5:51         ` Chaitanya Kulkarni
  2023-05-16 13:08           ` Sergey Senozhatsky
  0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-16  5:51 UTC (permalink / raw)
  To: Jens Axboe, Christoph Hellwig
  Cc: minchan@kernel.org, senozhatsky@chromium.org,
	linux-block@vger.kernel.org

Hi Jens/Christoph,

On 5/12/23 18:06, Chaitanya Kulkarni wrote:
> On 5/12/23 07:34, Jens Axboe wrote:
>> On 5/12/23 8:31 AM, Christoph Hellwig wrote:
>>> On Fri, May 12, 2023 at 01:29:56AM -0700, Chaitanya Kulkarni wrote:
>>>> Allow user to set the QUEUE_FLAG_NOWAIT optionally using module
>>>> parameter to retain the default behaviour. Also, update respective
>>>> allocation flags in the write path. Following are the performance
>>>> numbers with io_uring fio engine for random read, note that device has
>>>> been populated fully with randwrite workload before taking these
>>>> numbers :-
>>> Why would you add a module option, except to make everyones life hell?
>> Yeah that makes no sense. Either the driver is nowait compatible and
>> it should just set the flag, or it's not.
>>
> send v2 without modparam.
>
> -ck

With the modparam removed, v2 is ready to send, but I've a few concerns
about enabling nowait unconditionally for zram :-

 From brd data [1] and zram data [2] from my setup :-

         IOPs  (old->new)     | sys cpu% (old->new)
--------------------------------------------------
brd  | 1.5x  (3919 -> 5874)   | 3x (29 -> 87)
zram | 1.09x ( 845 ->  917)   | 9x (11 -> 97)

brd:-
IOPs increased by               ~1.5  times (50% up)
sys CPU percentage increased by ~3.0  times (200% up)

zram:-
IOPs increased by               ~1.09 times (  9% up)
sys CPU percentage increased by ~8.81 times (781% up)

This comparison clearly demonstrates that zram experiences a much more
substantial CPU load relative to the increase in IOPs compared to brd.
Such a significant difference might suggest a potential CPU regression
in zram ?

Especially for zram, if applications are not expecting this high CPU
usage then we'll get regression reports with the default nowait
approach. How about we avoid something like this with one of the
following options ?

1. Provide a fix with a module parameter (already NACKed).
2. Allow the user to configure nowait from the command line using zramctl.
   Keep QUEUE_FLAG_NOWAIT disabled by default.
3. Add a generic block layer sysfs attribute nowait, like nomerges, since
   similar changes I've posted for pmem [3] and bcache [4] have the same
   issue. This generic way avoids duplicating code in each driver and
   gives the user the freedom to set or unset it based on their needs
   (a rough sketch follows below). Keep QUEUE_FLAG_NOWAIT disabled by
   default.

or please suggest any other way you think is appropriate ...
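
Option 3 is not fleshed out anywhere in this thread; purely as a sketch of
what such a generic attribute could look like, modeled on the existing
nomerges attribute in block/blk-sysfs.c (the name, wiring and exact shape
here are assumptions, not proposed code):

/*
 * Sketch for block/blk-sysfs.c; would also need a &queue_nowait_entry.attr
 * slot in queue_attrs[].
 */
static ssize_t queue_nowait_show(struct request_queue *q, char *page)
{
	return queue_var_show(blk_queue_nowait(q), page);
}

static ssize_t queue_nowait_store(struct request_queue *q, const char *page,
				  size_t count)
{
	unsigned long nowait;
	ssize_t ret;

	ret = queue_var_store(&nowait, page, count);
	if (ret < 0)
		return ret;

	if (nowait)
		blk_queue_flag_set(QUEUE_FLAG_NOWAIT, q);
	else
		blk_queue_flag_clear(QUEUE_FLAG_NOWAIT, q);
	return ret;
}

QUEUE_RW_ENTRY(queue_nowait, "nowait");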

-ck

[1] brd nowait off vs nowait on :-

linux-block (zram-nowait) #  grep cpu  brd-*fio | column -t

brd-default-nowait-off-1.fio:  cpu  :  usr=6.34%,   sys=29.84%,  ctx=216249754,
brd-default-nowait-off-2.fio:  cpu  :  usr=6.41%,   sys=29.83%,  ctx=217773657,
brd-default-nowait-off-3.fio:  cpu  :  usr=6.37%,   sys=30.05%,  ctx=222667703,

brd-nowait-on-1.fio:           cpu  :  usr=10.18%,  sys=88.35%, ctx=23221,
brd-nowait-on-2.fio:           cpu  :  usr=10.02%,  sys=86.82%, ctx=22396,
brd-nowait-on-3.fio:           cpu  :  usr=10.17%,  sys=86.29%, ctx=22207,

linux-block (zram-nowait) #  grep IOPS  brd-*fio | column -t

brd-default-nowait-off-1.fio:  read:  IOPS=3872k,  BW=14.8GiB/s
brd-default-nowait-off-2.fio:  read:  IOPS=3933k,  BW=15.0GiB/s
brd-default-nowait-off-3.fio:  read:  IOPS=3953k,  BW=15.1GiB/s

brd-nowait-on-1.fio:           read:  IOPS=5884k,  BW=22.4GiB/s
brd-nowait-on-2.fio:           read:  IOPS=5870k,  BW=22.4GiB/s
brd-nowait-on-3.fio:           read:  IOPS=5870k,  BW=22.4GiB/s

linux-block (zram-nowait) #  grep slat  brd-*fio | column -t

brd-default-nowait-off-1.fio:  slat  (nsec):  min=440,  max=56072k,  avg=9579.17
brd-default-nowait-off-2.fio:  slat  (nsec):  min=440,  max=42743k,  avg=9468.83
brd-default-nowait-off-3.fio:  slat  (nsec):  min=431,  max=32493k,  avg=9532.96

brd-nowait-on-1.fio:           slat  (nsec):  min=1523, max=37786k,  avg=7596.58
brd-nowait-on-2.fio:           slat  (nsec):  min=1503, max=40101k,  avg=7612.64
brd-nowait-on-3.fio:           slat  (nsec):  min=1463, max=37298k,  avg=7610.89

[2] zram nowait off vs nowait on:-

linux-block (zram-nowait) #  grep IOPS  zram-*fio | column -t
zram-default-nowait-off-1.fio:  read:  IOPS=833k,  BW=3254MiB/s
zram-default-nowait-off-2.fio:  read:  IOPS=845k,  BW=3301MiB/s
zram-default-nowait-off-3.fio:  read:  IOPS=845k,  BW=3301MiB/s

zram-nowait-on-1.fio:           read:  IOPS=917k,  BW=3582MiB/s
zram-nowait-on-2.fio:           read:  IOPS=914k,  BW=3569MiB/s
zram-nowait-on-3.fio:           read:  IOPS=917k,  BW=3581MiB/s

linux-block (zram-nowait) #  grep cpu  zram-*fio | column -t
zram-default-nowait-off-1.fio:  cpu  :  usr=5.18%,  sys=11.31%, ctx=39945072
zram-default-nowait-off-2.fio:  cpu  :  usr=5.20%,  sys=11.49%, ctx=40591907
zram-default-nowait-off-3.fio:  cpu  :  usr=5.31%,  sys=11.86%, ctx=40252142

zram-nowait-on-1.fio:           cpu  :  usr=1.87%,  sys=97.05%, ctx=24337
zram-nowait-on-2.fio:           cpu  :  usr=1.83%,  sys=97.20%, ctx=21452
zram-nowait-on-3.fio:           cpu  :  usr=1.84%,  sys=97.20%, ctx=21051

linux-block (zram-nowait) #  grep slat  zram-*fio | column -t
zram-default-nowait-off-1.fio:  slat  (nsec):  min=420,  max=6960.6k,  avg=1859.09
zram-default-nowait-off-2.fio:  slat  (nsec):  min=411,  max=5387.7k,  avg=1848.79
zram-default-nowait-off-3.fio:  slat  (nsec):  min=410,  max=6914.2k,  avg=1902.51

zram-nowait-on-1.fio:           slat  (nsec):  min=1092, max=32268k,   avg=51605.49
zram-nowait-on-2.fio:           slat  (nsec):  min=1052, max=7676.3k,  avg=51806.82
zram-nowait-on-3.fio:           slat  (nsec):  min=1062, max=10444k,   avg=51625.89

[3] 
https://lore.kernel.org/nvdimm/b90ff1ba-b22c-bb37-db0a-4c46bb5f2a06@nvidia.com/T/#t
[4] https://marc.info/?l=linux-bcache&m=168388522305946&w=1
https://marc.info/?l=linux-bcache&m=168401821129652&w=2



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-16  5:51         ` Chaitanya Kulkarni
@ 2023-05-16 13:08           ` Sergey Senozhatsky
  2023-05-16 20:41             ` Chaitanya Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Sergey Senozhatsky @ 2023-05-16 13:08 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Jens Axboe, Christoph Hellwig, minchan@kernel.org,
	senozhatsky@chromium.org, linux-block@vger.kernel.org

On (23/05/16 05:51), Chaitanya Kulkarni wrote:
> Removed modparam v2 is ready to send, but I've few  concerns enabling
> nowait unconditionally for zram :-
> 
>  From brd data [1] and zram data [2] from my setup :-
> 
>          IOPs  (old->new)    | sys cpu% (old->new)
> --------------------------------------------------
> brd  | 1.5x (3919 -> 5874) | 3x (29 -> 87)
> zram | 1.09x ( 29 ->   87) | 9x (11 -> 97)
> 
> brd:-
> IOPs increased by               ~1.5  times (50% up)
> sys CPU percentage increased by ~3.0  times (200% up)
> 
> zram:-
> IOPs increased by               ~1.09 times (  9% up)
> sys CPU percentage increased by ~8.81 times (781% up)
> 
> This comparison clearly demonstrates that zram experiences a much more
> substantial CPU load relative to the increase in IOPs compared to brd.
> Such a significant difference might suggest a potential CPU regression
> in zram ?
> 
> Especially for zram, if applications are not expecting this high cpu
> usage then they we'll get regression reports with default nowait
> approach. How about we avoid something like this with one of the
> following options ?

Well, zram performs decompression/compression on the CPU (per-CPU
crypto streams) for each IO operation, so zram IO is CPU intensive.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-16 13:08           ` Sergey Senozhatsky
@ 2023-05-16 20:41             ` Chaitanya Kulkarni
  2023-05-16 20:43               ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-16 20:41 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jens Axboe, Christoph Hellwig, minchan@kernel.org,
	linux-block@vger.kernel.org

On 5/16/23 06:08, Sergey Senozhatsky wrote:
> On (23/05/16 05:51), Chaitanya Kulkarni wrote:
>> Removed modparam v2 is ready to send, but I've few  concerns enabling
>> nowait unconditionally for zram :-
>>
>>   From brd data [1] and zram data [2] from my setup :-
>>
>>           IOPs  (old->new)    | sys cpu% (old->new)
>> --------------------------------------------------
>> brd  | 1.5x (3919 -> 5874) | 3x (29 -> 87)
>> zram | 1.09x ( 29 ->   87) | 9x (11 -> 97)
>>
>> brd:-
>> IOPs increased by               ~1.5  times (50% up)
>> sys CPU percentage increased by ~3.0  times (200% up)
>>
>> zram:-
>> IOPs increased by               ~1.09 times (  9% up)
>> sys CPU percentage increased by ~8.81 times (781% up)
>>
>> This comparison clearly demonstrates that zram experiences a much more
>> substantial CPU load relative to the increase in IOPs compared to brd.
>> Such a significant difference might suggest a potential CPU regression
>> in zram ?
>>
>> Especially for zram, if applications are not expecting this high cpu
>> usage then they we'll get regression reports with default nowait
>> approach. How about we avoid something like this with one of the
>> following options ?
> Well, zram performs decompression/compression on the CPU (per-CPU
> crypto streams) for each IO operation, so zram IO is CPU intensive.

and that is exactly why I've raised this issue. Are you okay with that?
I'll send V2 with nowait enabled by default ..

-ck



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-16 20:41             ` Chaitanya Kulkarni
@ 2023-05-16 20:43               ` Jens Axboe
  2023-05-16 21:32                 ` Chaitanya Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Jens Axboe @ 2023-05-16 20:43 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Sergey Senozhatsky
  Cc: Christoph Hellwig, minchan@kernel.org,
	linux-block@vger.kernel.org

On 5/16/23 2:41 PM, Chaitanya Kulkarni wrote:
> On 5/16/23 06:08, Sergey Senozhatsky wrote:
>> On (23/05/16 05:51), Chaitanya Kulkarni wrote:
>>> Removed modparam v2 is ready to send, but I've few  concerns enabling
>>> nowait unconditionally for zram :-
>>>
>>>   From brd data [1] and zram data [2] from my setup :-
>>>
>>>           IOPs  (old->new)    | sys cpu% (old->new)
>>> --------------------------------------------------
>>> brd  | 1.5x (3919 -> 5874) | 3x (29 -> 87)
>>> zram | 1.09x ( 29 ->   87) | 9x (11 -> 97)
>>>
>>> brd:-
>>> IOPs increased by               ~1.5  times (50% up)
>>> sys CPU percentage increased by ~3.0  times (200% up)
>>>
>>> zram:-
>>> IOPs increased by               ~1.09 times (  9% up)
>>> sys CPU percentage increased by ~8.81 times (781% up)
>>>
>>> This comparison clearly demonstrates that zram experiences a much more
>>> substantial CPU load relative to the increase in IOPs compared to brd.
>>> Such a significant difference might suggest a potential CPU regression
>>> in zram ?
>>>
>>> Especially for zram, if applications are not expecting this high cpu
>>> usage then they we'll get regression reports with default nowait
>>> approach. How about we avoid something like this with one of the
>>> following options ?
>> Well, zram performs decompression/compression on the CPU (per-CPU
>> crypto streams) for each IO operation, so zram IO is CPU intensive.
> 
> and that is exactly I've raised this issue, are you okay with that ?
> I'll send V2 with nowait enabled by default ..

Did you check that it's actually nowait sane to begin with? I spent
literally 30 seconds on that when you sent the first patch, and the
partial/sync path does not look kosher.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-16 20:43               ` Jens Axboe
@ 2023-05-16 21:32                 ` Chaitanya Kulkarni
  2023-05-16 22:03                   ` Jens Axboe
  0 siblings, 1 reply; 17+ messages in thread
From: Chaitanya Kulkarni @ 2023-05-16 21:32 UTC (permalink / raw)
  To: Jens Axboe, Sergey Senozhatsky
  Cc: Christoph Hellwig, minchan@kernel.org,
	linux-block@vger.kernel.org

On 5/16/23 13:43, Jens Axboe wrote:
> On 5/16/23 2:41 PM, Chaitanya Kulkarni wrote:
>> On 5/16/23 06:08, Sergey Senozhatsky wrote:
>>> On (23/05/16 05:51), Chaitanya Kulkarni wrote:
>>>> Removed modparam v2 is ready to send, but I've few  concerns enabling
>>>> nowait unconditionally for zram :-
>>>>
>>>>    From brd data [1] and zram data [2] from my setup :-
>>>>
>>>>            IOPs  (old->new)    | sys cpu% (old->new)
>>>> --------------------------------------------------
>>>> brd  | 1.5x (3919 -> 5874) | 3x (29 -> 87)
>>>> zram | 1.09x ( 29 ->   87) | 9x (11 -> 97)
>>>>
>>>> brd:-
>>>> IOPs increased by               ~1.5  times (50% up)
>>>> sys CPU percentage increased by ~3.0  times (200% up)
>>>>
>>>> zram:-
>>>> IOPs increased by               ~1.09 times (  9% up)
>>>> sys CPU percentage increased by ~8.81 times (781% up)
>>>>
>>>> This comparison clearly demonstrates that zram experiences a much more
>>>> substantial CPU load relative to the increase in IOPs compared to brd.
>>>> Such a significant difference might suggest a potential CPU regression
>>>> in zram ?
>>>>
>>>> Especially for zram, if applications are not expecting this high cpu
>>>> usage then they we'll get regression reports with default nowait
>>>> approach. How about we avoid something like this with one of the
>>>> following options ?
>>> Well, zram performs decompression/compression on the CPU (per-CPU
>>> crypto streams) for each IO operation, so zram IO is CPU intensive.
>> and that is exactly I've raised this issue, are you okay with that ?
>> I'll send V2 with nowait enabled by default ..
> Did you check that it's actually nowait sane to begin with? I spent
> literally 30 seconds on that when you sent the first patch, and the
> partial/sync path does not look kosher.
>

I did check for it, and it needs nowait-sane preparation in V2 with
something like this [1], plus returning an error with bio_wouldblock_error()
when we end up in read_from_bdev_sync() and it is not called from
writeback_store() (rough changes [1]).

But after taking the performance numbers repeatedly, the main concern I
failed to resolve is the above numbers and enabling nowait by default for
zram ...

-ck

[1]
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 08d198431a88..c2f154911cc0 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -55,7 +55,7 @@ static const struct block_device_operations zram_devops;

  static void zram_free_page(struct zram *zram, size_t index);
  static int zram_read_page(struct zram *zram, struct page *page, u32 index,
-              struct bio *parent, blk_opf_t opf);
+              struct bio *parent, bool nowait);

  static int zram_slot_trylock(struct zram *zram, u32 index)
  {
@@ -690,7 +690,7 @@ static ssize_t writeback_store(struct device *dev,
          /* Need for hugepage writeback racing */
          zram_set_flag(zram, index, ZRAM_IDLE);
          zram_slot_unlock(zram, index);
-        if (zram_read_page(zram, page, index, NULL, 0)) {
+        if (zram_read_page(zram, page, index, NULL, false)) {
              zram_slot_lock(zram, index);
              zram_clear_flag(zram, index, ZRAM_UNDER_WB);
              zram_clear_flag(zram, index, ZRAM_IDLE);
@@ -772,6 +772,7 @@ struct zram_work {
      unsigned long entry;
      struct page *page;
      int error;
+    bool nowait;
  };

  static void zram_sync_read(struct work_struct *work)
@@ -779,8 +780,9 @@ static void zram_sync_read(struct work_struct *work)
      struct zram_work *zw = container_of(work, struct zram_work, work);
      struct bio_vec bv;
      struct bio bio;
+    blk_opf_t nowait = zw->nowait ? REQ_NOWAIT : 0;

-    bio_init(&bio, zw->zram->bdev, &bv, 1, REQ_OP_READ);
+    bio_init(&bio, zw->zram->bdev, &bv, 1, REQ_OP_READ | nowait);
      bio.bi_iter.bi_sector = zw->entry * (PAGE_SIZE >> 9);
      __bio_add_page(&bio, zw->page, PAGE_SIZE, 0);
      zw->error = submit_bio_wait(&bio);
@@ -792,18 +794,14 @@ static void zram_sync_read(struct work_struct *work)
   * use a worker thread context.
   */
  static int read_from_bdev_sync(struct zram *zram, struct page *page,
-                unsigned long entry, blk_opf_t opf)
+                unsigned long entry, bool nowait)
  {
      struct zram_work work;

      work.page = page;
      work.zram = zram;
      work.entry = entry;
-
-    if (opf & REQ_NOWAIT) {
-        zram_sync_read(&work);
-        return work.error;
-    }
+    work.nowait = nowait;

      INIT_WORK_ONSTACK(&work.work, zram_sync_read);
      queue_work(system_unbound_wq, &work.work);
@@ -814,21 +812,21 @@ static int read_from_bdev_sync(struct zram *zram, struct page *page,
  }

  static int read_from_bdev(struct zram *zram, struct page *page,
-            unsigned long entry, struct bio *parent, blk_opf_t opf)
+            unsigned long entry, struct bio *parent, bool nowait)
  {
      atomic64_inc(&zram->stats.bd_reads);
      if (!parent) {
          if (WARN_ON_ONCE(!IS_ENABLED(ZRAM_PARTIAL_IO)))
              return -EIO;
-        return read_from_bdev_sync(zram, page, entry, opf);
+        return read_from_bdev_sync(zram, page, entry, nowait);
      }
-    read_from_bdev_async(zram, page, entry, parent, opf);
+    read_from_bdev_async(zram, page, entry, parent, nowait);
      return 0;
  }
  #else
  static inline void reset_bdev(struct zram *zram) {};
  static int read_from_bdev(struct zram *zram, struct page *page,
-            unsigned long entry, struct bio *parent, blk_opf_t opf)
+            unsigned long entry, struct bio *parent, bool nowait)
  {
      return -EIO;
  }
@@ -1361,7 +1359,7 @@ static int zram_read_from_zspool(struct zram *zram, struct page *page,
  }

  static int zram_read_page(struct zram *zram, struct page *page, u32 index,
-              struct bio *parent, blk_opf_t bi_opf)
+              struct bio *parent, bool nowait)
  {
      int ret;

@@ -1378,7 +1376,7 @@ static int zram_read_page(struct zram *zram, struct page *page, u32 index,
          zram_slot_unlock(zram, index);

          ret = read_from_bdev(zram, page, zram_get_element(zram, index),
-                     parent, bi_opf);
+                     parent, nowait);
      }

      /* Should NEVER happen. Return bio error if it does. */
@@ -1395,13 +1393,14 @@ static int zram_read_page(struct zram *zram, struct page *page, u32 index,
  static int zram_bvec_read_partial(struct zram *zram, struct bio_vec *bvec,
                    u32 index, int offset, struct bio *bio)
  {
+    bool nowait = bio->bi_opf & REQ_NOWAIT;
      struct page *page;
      int ret;

-    page = alloc_page(bio->bi_opf & REQ_NOWAIT ? GFP_NOWAIT : GFP_NOIO);
+    page = alloc_page(nowait ? GFP_NOWAIT : GFP_NOIO);
      if (!page)
          return -ENOMEM;
-    ret = zram_read_page(zram, page, index, NULL, bio->bi_opf);
+    ret = zram_read_page(zram, page, index, NULL, nowait);
      if (likely(!ret))
          memcpy_to_bvec(bvec, page_address(page) + offset);
      __free_page(page);
@@ -1413,7 +1412,8 @@ static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
  {
      if (is_partial_io(bvec))
          return zram_bvec_read_partial(zram, bvec, index, offset, bio);
-    return zram_read_page(zram, bvec->bv_page, index, bio, bio->bi_opf);
+    return zram_read_page(zram, bvec->bv_page, index, bio,
+                  bio->bi_opf & REQ_NOWAIT);
  }

  static int zram_write_page(struct zram *zram, struct page *page, u32 index)
@@ -1547,14 +1547,15 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
  static int zram_bvec_write_partial(struct zram *zram, struct bio_vec *bvec,
                     u32 index, int offset, struct bio *bio)
  {
+    bool nowait = bio->bi_opf & REQ_NOWAIT;
      struct page *page;
      int ret;

-    page = alloc_page(bio->bi_opf & REQ_NOWAIT ? GFP_NOWAIT : GFP_NOIO);
+    page = alloc_page(nowait ? GFP_NOWAIT : GFP_NOIO);
      if (!page)
          return -ENOMEM;

-    ret = zram_read_page(zram, page, index, bio, bio->bi_opf);
+    ret = zram_read_page(zram, page, index, bio, nowait);
      if (!ret) {
          memcpy_from_bvec(page_address(page) + offset, bvec);
          ret = zram_write_page(zram, page, index);


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT
  2023-05-16 21:32                 ` Chaitanya Kulkarni
@ 2023-05-16 22:03                   ` Jens Axboe
  0 siblings, 0 replies; 17+ messages in thread
From: Jens Axboe @ 2023-05-16 22:03 UTC (permalink / raw)
  To: Chaitanya Kulkarni, Sergey Senozhatsky
  Cc: Christoph Hellwig, minchan@kernel.org,
	linux-block@vger.kernel.org

On 5/16/23 3:32?PM, Chaitanya Kulkarni wrote:
> On 5/16/23 13:43, Jens Axboe wrote:
>> On 5/16/23 2:41?PM, Chaitanya Kulkarni wrote:
>>> On 5/16/23 06:08, Sergey Senozhatsky wrote:
>>>> On (23/05/16 05:51), Chaitanya Kulkarni wrote:
>>>>> Removed modparam v2 is ready to send, but I've few  concerns enabling
>>>>> nowait unconditionally for zram :-
>>>>>
>>>>>    From brd data [1] and zram data [2] from my setup :-
>>>>>
>>>>>            IOPs  (old->new)    | sys cpu% (old->new)
>>>>> --------------------------------------------------
>>>>> brd  | 1.5x (3919 -> 5874) | 3x (29 -> 87)
>>>>> zram | 1.09x ( 29 ->   87) | 9x (11 -> 97)
>>>>>
>>>>> brd:-
>>>>> IOPs increased by               ~1.5  times (50% up)
>>>>> sys CPU percentage increased by ~3.0  times (200% up)
>>>>>
>>>>> zram:-
>>>>> IOPs increased by               ~1.09 times (  9% up)
>>>>> sys CPU percentage increased by ~8.81 times (781% up)
>>>>>
>>>>> This comparison clearly demonstrates that zram experiences a much more
>>>>> substantial CPU load relative to the increase in IOPs compared to brd.
>>>>> Such a significant difference might suggest a potential CPU regression
>>>>> in zram ?
>>>>>
>>>>> Especially for zram, if applications are not expecting this high cpu
>>>>> usage then they we'll get regression reports with default nowait
>>>>> approach. How about we avoid something like this with one of the
>>>>> following options ?
>>>> Well, zram performs decompression/compression on the CPU (per-CPU
>>>> crypto streams) for each IO operation, so zram IO is CPU intensive.
>>> and that is exactly I've raised this issue, are you okay with that ?
>>> I'll send V2 with nowait enabled by default ..
>> Did you check that it's actually nowait sane to begin with? I spent
>> literally 30 seconds on that when you sent the first patch, and the
>> partial/sync path does not look kosher.
>>
> 
> I did check for it and it needs a nowait sane preparation in V2 with
> something like this [1] plus returning error with bio_wouldblock_error()

That looks pretty awful... Things like:

> @@ -779,8 +780,9 @@ static void zram_sync_read(struct work_struct *work)
>       struct zram_work *zw = container_of(work, struct zram_work, work);
>       struct bio_vec bv;
>       struct bio bio;
> +    blk_opf_t_nowait = zw->nowait ? REQ_NOWAIT : 0;
> 
> -    bio_init(&bio, zw->zram->bdev, &bv, 1, REQ_OP_READ);
> +    bio_init(&bio, zw->zram->bdev, &bv, 1, REQ_OP_READ | nowait);
>       bio.bi_iter.bi_sector = zw->entry * (PAGE_SIZE >> 9);
>       __bio_add_page(&bio, zw->page, PAGE_SIZE, 0);
>       zw->error = submit_bio_wait(&bio);
>

make absolutely no sense, setting REQ_NOWAIT and then waiting on IO
completion.

> when we end up in read_from_bdev_sync() when it is not called from
> writeback_store(). (rough changes [1])

writeback_store() will never have nowait set. Outside of that, pushing
the nowait handling to the workqueue makes no sense, that flag only
applies for inline issue.

> But after taking performance numbers repeatedly, the main concern I
> failed to resolve is above numbers & default enabling nowait for
> zram ...

It's a choice you make if you do nowait IO, normal block IO would not
change at all. So I think it's fine, but the driver needs to be in a
sane state to support nowait to begin with.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 17+ messages in thread
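
For context, the nowait-sane pattern Jens is referring to boils down to
failing fast instead of blocking whenever a bio carries REQ_NOWAIT. A
minimal sketch (illustrative only, not code from this thread; the function
name and surrounding flow are assumptions):

/*
 * Sketch: any path that may block (a GFP_NOIO allocation, punting to a
 * workqueue and waiting, submit_bio_wait()) must instead complete the
 * bio with BLK_STS_AGAIN when REQ_NOWAIT is set, so callers such as
 * io_uring can retry from a context that is allowed to block.
 */
static int zram_partial_rw_sketch(struct bio *bio)
{
	struct page *page;

	if (bio->bi_opf & REQ_NOWAIT) {
		page = alloc_page(GFP_NOWAIT);
		if (!page) {
			/* completes the bio with BLK_STS_AGAIN */
			bio_wouldblock_error(bio);
			return -EAGAIN;
		}
	} else {
		page = alloc_page(GFP_NOIO);
		if (!page)
			return -ENOMEM;
	}

	/* ... read-modify-write of the partial page would go here ... */

	__free_page(page);
	return 0;
}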

end of thread, other threads:[~2023-05-16 22:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-12  8:29 [PATCH 0/3] zram: queue flag nowait and minor cleanup Chaitanya Kulkarni
2023-05-12  8:29 ` [PATCH 1/3] zram: allow user to set QUEUE_FLAG_NOWAIT Chaitanya Kulkarni
2023-05-12 14:31   ` Christoph Hellwig
2023-05-12 14:34     ` Jens Axboe
2023-05-13  1:06       ` Chaitanya Kulkarni
2023-05-16  5:51         ` Chaitanya Kulkarni
2023-05-16 13:08           ` Sergey Senozhatsky
2023-05-16 20:41             ` Chaitanya Kulkarni
2023-05-16 20:43               ` Jens Axboe
2023-05-16 21:32                 ` Chaitanya Kulkarni
2023-05-16 22:03                   ` Jens Axboe
2023-05-13  1:05     ` Chaitanya Kulkarni
2023-05-12  8:29 ` [PATCH 2/3] zram: consolidate zram_bio_read()_zram_bio_write() Chaitanya Kulkarni
2023-05-12 14:32   ` Christoph Hellwig
2023-05-13  1:06     ` Chaitanya Kulkarni
2023-05-12  8:29 ` [PATCH 3/3] zram: add flush_dcache_page() call for write Chaitanya Kulkarni
2023-05-12 14:34   ` Christoph Hellwig
