public inbox for linux-btrfs@vger.kernel.org
From: Wang Yugui <wangyugui@e16-tech.com>
To: Chris Mason <clm@meta.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-btrfs@vger.kernel.org, Chris Mason <clm@fb.com>
Subject: Re: btrfs write-bandwidth performance regression of 6.5-rc4/rc3
Date: Sun, 13 Aug 2023 17:50:38 +0800
Message-ID: <20230813175032.AA17.409509F4@e16-tech.com>
In-Reply-To: <20ab0be0-e7d0-632b-b94c-89d76911f1ed@meta.com>

[-- Attachment #1: Type: text/plain, Size: 2322 bytes --]

Hi,

> On 8/11/23 10:23 AM, Wang Yugui wrote:
> > Hi,
> > 
> > 
> >> On Wed, Aug 02, 2023 at 08:04:57AM +0800, Wang Yugui wrote:
> >>>> And with only a revert of
> >>>>
> >>>> "btrfs: submit IO synchronously for fast checksum implementations"?
> >>>
> >>> GOOD performance when only (Revert "btrfs: submit IO synchronously for fast
> >>> checksum implementations") 
> >>
> >> Ok, so you have a case where the offload for the checksumming generation
> >> actually helps (by a lot).  Adding Chris to the Cc list as he was
> >> involved with this.
> >>
> >>>>> -       if (test_bit(BTRFS_FS_CSUM_IMPL_FAST, &bbio->fs_info->flags))
> >>>>> +       if ((bbio->bio.bi_opf & REQ_META) && test_bit(BTRFS_FS_CSUM_IMPL_FAST, &bbio->fs_info->flags))
> >>>>>                 return false;
> >>>>
> >>>> This disables synchronous checksum calculation entirely for data I/O.
> >>>
> >>> without this fix, data I/O checksum is always synchronous?
> >>> this is a feature change of "btrfs: submit IO synchronously for fast checksum implementations"?
> >>
> >> It is never with the above patch.
> >>
> >>>
> >>>> Also I'm curious if you see any difference for a non-RAID0 (i.e.
> >>>> single profile) workload.
> >>>
> >>> '-m single -d single' is about 10% slower than '-m raid1 -d raid0' in this test
> >>> case.
> >>
> >> How does it compare with and without the revert?  Can you add the numbers?
> > 
> 
> Looking through the thread, you're comparing -m single -d single, but
> btrfs is still doing the raid.
> 
> Sorry to keep asking for more runs, but these numbers are a surprise,
> and I probably won't have time today to reproduce before vacation next
> week (sadly, Christoph and I aren't going together).
> 
> Can you please do a run where lvm or MD raid are providing the raid0?

No LVM or MD is used here.

> It doesn't look like you're using compression, but I wanted to double check.

Yes. mkfs.btrfs was run with '-m xx -d yy' and otherwise default options, so no compression.

> How much ram do you have?

192G ECC memory.

Two CPU NUMA nodes, but all PCIe3 NVMe SSDs are connected to one NVMe HBA /
one NUMA node.


> Your fio run has 4 jobs going, can I please see the full fio output for
> a fast run and a slow run?

The fio results are attached as files (fast.txt and slow.txt).

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2023/08/13

[-- Attachment #2: fast.txt --]
[-- Type: application/octet-stream, Size: 6215 bytes --]

+ fio -name write-bandwidth -rw=write -bs=1024Ki -size=32Gi -runtime=30 -iodepth 1 -ioengine sync -zero_buffers=1 -direct=0 -end_fsync=1 -numjobs=4 -directory=/mnt/test
write-bandwidth: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
...
fio-3.19
Starting 4 processes
write-bandwidth: Laying out IO file (1 file / 32768MiB)
write-bandwidth: Laying out IO file (1 file / 32768MiB)
write-bandwidth: Laying out IO file (1 file / 32768MiB)
write-bandwidth: Laying out IO file (1 file / 32768MiB)
Jobs: 1 (f=1): [_(3),F(1)][100.0%][eta 00m:00s]
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=2570: Sun Aug 13 17:47:02 2023
  write: IOPS=697, BW=697MiB/s (731MB/s)(21.7GiB/31883msec); 0 zone resets
    clat (usec): min=325, max=23003, avg=1345.13, stdev=3158.16
     lat (usec): min=326, max=23003, avg=1345.57, stdev=3158.16
    clat percentiles (usec):
     |  1.00th=[  355],  5.00th=[  388], 10.00th=[  408], 20.00th=[  553],
     | 30.00th=[  734], 40.00th=[  832], 50.00th=[  865], 60.00th=[  889],
     | 70.00th=[  914], 80.00th=[ 1020], 90.00th=[ 1418], 95.00th=[ 1696],
     | 99.00th=[21627], 99.50th=[21890], 99.90th=[22152], 99.95th=[22152],
     | 99.99th=[22676]
   bw (  KiB/s): min=505856, max=1527017, per=26.13%, avg=755313.46, stdev=181122.74, samples=59
   iops        : min=  494, max= 1491, avg=737.59, stdev=176.88, samples=59
  lat (usec)   : 500=17.84%, 750=12.78%, 1000=48.48%
  lat (msec)   : 2=17.07%, 4=1.44%, 50=2.39%
  cpu          : usr=0.42%, sys=59.17%, ctx=7412, majf=0, minf=282
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,22238,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=2571: Sun Aug 13 17:47:02 2023
  write: IOPS=718, BW=719MiB/s (754MB/s)(21.8GiB/31099msec); 0 zone resets
    clat (usec): min=326, max=23115, avg=1338.32, stdev=3160.69
     lat (usec): min=326, max=23115, avg=1338.75, stdev=3160.69
    clat percentiles (usec):
     |  1.00th=[  351],  5.00th=[  383], 10.00th=[  404], 20.00th=[  478],
     | 30.00th=[  627], 40.00th=[  824], 50.00th=[  865], 60.00th=[  889],
     | 70.00th=[  914], 80.00th=[ 1074], 90.00th=[ 1418], 95.00th=[ 1696],
     | 99.00th=[21627], 99.50th=[21890], 99.90th=[22152], 99.95th=[22676],
     | 99.99th=[22938]
   bw (  KiB/s): min=501760, max=1881416, per=26.22%, avg=757748.20, stdev=206407.89, samples=59
   iops        : min=  490, max= 1837, avg=739.98, stdev=201.54, samples=59
  lat (usec)   : 500=21.56%, 750=12.61%, 1000=44.20%
  lat (msec)   : 2=17.72%, 4=1.53%, 20=0.01%, 50=2.38%
  cpu          : usr=0.38%, sys=61.88%, ctx=17364, majf=0, minf=284
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,22353,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=2572: Sun Aug 13 17:47:02 2023
  write: IOPS=720, BW=721MiB/s (756MB/s)(22.4GiB/31760msec); 0 zone resets
    clat (usec): min=341, max=22932, avg=1307.11, stdev=3226.40
     lat (usec): min=341, max=22932, avg=1307.52, stdev=3226.41
    clat percentiles (usec):
     |  1.00th=[  367],  5.00th=[  392], 10.00th=[  404], 20.00th=[  437],
     | 30.00th=[  553], 40.00th=[  709], 50.00th=[  832], 60.00th=[  873],
     | 70.00th=[  914], 80.00th=[ 1037], 90.00th=[ 1270], 95.00th=[ 1565],
     | 99.00th=[21365], 99.50th=[21890], 99.90th=[22152], 99.95th=[22152],
     | 99.99th=[22676]
   bw (  KiB/s): min=516047, max=2278277, per=26.81%, avg=774783.25, stdev=257993.39, samples=59
   iops        : min=  503, max= 2224, avg=756.59, stdev=251.87, samples=59
  lat (usec)   : 500=25.91%, 750=16.60%, 1000=33.29%
  lat (msec)   : 2=20.14%, 4=1.55%, 50=2.50%
  cpu          : usr=0.45%, sys=56.77%, ctx=13204, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,22894,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=2573: Sun Aug 13 17:47:02 2023
  write: IOPS=713, BW=713MiB/s (748MB/s)(22.3GiB/31998msec); 0 zone resets
    clat (usec): min=342, max=23050, avg=1310.51, stdev=3306.47
     lat (usec): min=343, max=23050, avg=1310.95, stdev=3306.48
    clat percentiles (usec):
     |  1.00th=[  371],  5.00th=[  388], 10.00th=[  404], 20.00th=[  433],
     | 30.00th=[  537], 40.00th=[  685], 50.00th=[  848], 60.00th=[  881],
     | 70.00th=[  914], 80.00th=[ 1029], 90.00th=[ 1188], 95.00th=[ 1401],
     | 99.00th=[21365], 99.50th=[21890], 99.90th=[22152], 99.95th=[22152],
     | 99.99th=[22414]
   bw (  KiB/s): min=509952, max=2291342, per=26.73%, avg=772539.32, stdev=258105.99, samples=59
   iops        : min=  498, max= 2237, avg=754.41, stdev=252.01, samples=59
  lat (usec)   : 500=27.87%, 750=14.51%, 1000=34.68%
  lat (msec)   : 2=20.30%, 50=2.64%
  cpu          : usr=0.40%, sys=54.73%, ctx=16519, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,22826,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=2822MiB/s (2959MB/s), 697MiB/s-721MiB/s (731MB/s-756MB/s), io=88.2GiB (94.7GB), run=31099-31998msec
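
As a sanity check on the summary line above, the aggregate figure can be reproduced from the per-job numbers. This is a minimal sketch in Python, assuming (it is not stated in the thread) that fio reports group bandwidth as total data written divided by the longest job runtime. The variable names are illustrative.

```python
# Issued writes per job ("issued rwts: total=0,<writes>,...") and per-job
# runtimes in milliseconds, copied from the four job sections of fast.txt.
issued_writes = [22238, 22353, 22894, 22826]
runtimes_ms = [31883, 31099, 31760, 31998]

# The fio command uses -bs=1024Ki, so each issued write is 1 MiB.
block_size_mib = 1
total_mib = sum(issued_writes) * block_size_mib

# Assumption: the group bandwidth is total data over the longest runtime.
longest_runtime_s = max(runtimes_ms) / 1000
aggregate_mib_s = total_mib / longest_runtime_s

print(f"total written: {total_mib} MiB")
print(f"aggregate bandwidth: {aggregate_mib_s:.0f} MiB/s")
```

Under that assumption the result rounds to the 2822MiB/s shown in the "Run status" line, which is lower than the plain sum of the four per-job bandwidths because the jobs did not all run for the same length of time.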

[-- Attachment #3: slow.txt --]
[-- Type: application/octet-stream, Size: 6259 bytes --]

+ fio -name write-bandwidth -rw=write -bs=1024Ki -size=32Gi -runtime=30 -iodepth 1 -ioengine sync -zero_buffers=1 -direct=0 -end_fsync=1 -numjobs=4 -directory=/mnt/test
write-bandwidth: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=sync, iodepth=1
...
fio-3.19
Starting 4 processes
write-bandwidth: Laying out IO file (1 file / 32768MiB)
write-bandwidth: Laying out IO file (1 file / 32768MiB)
write-bandwidth: Laying out IO file (1 file / 32768MiB)
write-bandwidth: Laying out IO file (1 file / 32768MiB)
Jobs: 4 (f=4): [F(4)][100.0%][w=89.9MiB/s][w=89 IOPS][eta 00m:00s]
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=1795: Sun Aug 13 17:37:22 2023
  write: IOPS=325, BW=326MiB/s (342MB/s)(10.1GiB/31758msec); 0 zone resets
    clat (usec): min=335, max=28799, avg=2894.34, stdev=5373.81
     lat (usec): min=335, max=28799, avg=2894.92, stdev=5373.81
    clat percentiles (usec):
     |  1.00th=[  359],  5.00th=[  379], 10.00th=[  404], 20.00th=[  832],
     | 30.00th=[  857], 40.00th=[  873], 50.00th=[  881], 60.00th=[  898],
     | 70.00th=[  963], 80.00th=[ 1336], 90.00th=[16319], 95.00th=[16909],
     | 99.00th=[17957], 99.50th=[18744], 99.90th=[21627], 99.95th=[21627],
     | 99.99th=[21890]
   bw (  KiB/s): min=256000, max=1909180, per=26.26%, avg=347924.54, stdev=223553.56, samples=59
   iops        : min=  250, max= 1864, avg=339.76, stdev=218.26, samples=59
  lat (usec)   : 500=12.11%, 750=4.66%, 1000=57.22%
  lat (msec)   : 2=13.31%, 4=0.16%, 20=12.19%, 50=0.36%
  cpu          : usr=0.18%, sys=31.60%, ctx=1435, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,10348,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=1796: Sun Aug 13 17:37:22 2023
  write: IOPS=327, BW=328MiB/s (344MB/s)(10.1GiB/31576msec); 0 zone resets
    clat (usec): min=337, max=28887, avg=2894.97, stdev=5453.77
     lat (usec): min=337, max=28888, avg=2895.57, stdev=5453.76
    clat percentiles (usec):
     |  1.00th=[  359],  5.00th=[  383], 10.00th=[  404], 20.00th=[  775],
     | 30.00th=[  832], 40.00th=[  840], 50.00th=[  857], 60.00th=[  865],
     | 70.00th=[  938], 80.00th=[ 1020], 90.00th=[16450], 95.00th=[16909],
     | 99.00th=[18220], 99.50th=[19006], 99.90th=[21627], 99.95th=[21890],
     | 99.99th=[22152]
   bw (  KiB/s): min=260096, max=1901014, per=26.25%, avg=347786.14, stdev=222061.26, samples=59
   iops        : min=  254, max= 1856, avg=339.63, stdev=216.80, samples=59
  lat (usec)   : 500=12.40%, 750=6.49%, 1000=58.10%
  lat (msec)   : 2=10.15%, 20=12.57%, 50=0.30%
  cpu          : usr=0.20%, sys=29.40%, ctx=1441, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,10346,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=1797: Sun Aug 13 17:37:22 2023
  write: IOPS=317, BW=317MiB/s (332MB/s)(9.82GiB/31715msec); 0 zone resets
    clat (usec): min=327, max=22557, avg=2978.67, stdev=5434.74
     lat (usec): min=327, max=22557, avg=2979.27, stdev=5434.73
    clat percentiles (usec):
     |  1.00th=[  347],  5.00th=[  392], 10.00th=[  848], 20.00th=[  865],
     | 30.00th=[  865], 40.00th=[  873], 50.00th=[  881], 60.00th=[  898],
     | 70.00th=[  914], 80.00th=[ 1401], 90.00th=[16450], 95.00th=[17695],
     | 99.00th=[17957], 99.50th=[18482], 99.90th=[21890], 99.95th=[22152],
     | 99.99th=[22414]
   bw (  KiB/s): min=258048, max=1419228, per=25.64%, avg=339759.12, stdev=166043.31, samples=59
   iops        : min=  252, max= 1385, avg=331.78, stdev=162.04, samples=59
  lat (usec)   : 500=7.23%, 750=0.51%, 1000=67.28%
  lat (msec)   : 2=12.24%, 4=0.03%, 20=12.41%, 50=0.30%
  cpu          : usr=0.10%, sys=30.82%, ctx=1334, majf=0, minf=285
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,10055,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1
write-bandwidth: (groupid=0, jobs=1): err= 0: pid=1798: Sun Aug 13 17:37:22 2023
  write: IOPS=325, BW=326MiB/s (342MB/s)(10.1GiB/31751msec); 0 zone resets
    clat (usec): min=327, max=22259, avg=2895.95, stdev=5395.88
     lat (usec): min=327, max=22259, avg=2896.53, stdev=5395.87
    clat percentiles (usec):
     |  1.00th=[  355],  5.00th=[  388], 10.00th=[  506], 20.00th=[  857],
     | 30.00th=[  865], 40.00th=[  873], 50.00th=[  881], 60.00th=[  889],
     | 70.00th=[  906], 80.00th=[ 1254], 90.00th=[16450], 95.00th=[17695],
     | 99.00th=[17957], 99.50th=[18482], 99.90th=[21627], 99.95th=[21890],
     | 99.99th=[22152]
   bw (  KiB/s): min=260096, max=1825888, per=26.27%, avg=348074.85, stdev=215896.47, samples=59
   iops        : min=  254, max= 1783, avg=339.92, stdev=210.83, samples=59
  lat (usec)   : 500=9.79%, 750=1.76%, 1000=64.38%
  lat (msec)   : 2=11.60%, 4=0.07%, 20=12.09%, 50=0.32%
  cpu          : usr=0.15%, sys=31.31%, ctx=2367, majf=0, minf=284
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,10342,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=1294MiB/s (1357MB/s), 317MiB/s-328MiB/s (332MB/s-344MB/s), io=40.1GiB (43.1GB), run=31576-31758msec
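
To put the two attachments side by side, the following sketch pulls the aggregate write bandwidth out of each "Run status" line and computes the ratio. The helper name aggregate_bw_mib and the regular expression are illustrative assumptions, not part of fio; the two input lines are copied verbatim from fast.txt and slow.txt.

```python
import re

# Summary lines copied from the "Run status group 0" sections of each file.
FAST = ("  WRITE: bw=2822MiB/s (2959MB/s), 697MiB/s-721MiB/s "
        "(731MB/s-756MB/s), io=88.2GiB (94.7GB), run=31099-31998msec")
SLOW = ("  WRITE: bw=1294MiB/s (1357MB/s), 317MiB/s-328MiB/s "
        "(332MB/s-344MB/s), io=40.1GiB (43.1GB), run=31576-31758msec")


def aggregate_bw_mib(line: str) -> float:
    """Return the aggregate write bandwidth in MiB/s from a summary line.

    Hypothetical helper: it only handles values printed in MiB/s, which is
    what both attachments use. Other units would need extra handling.
    """
    match = re.search(r"WRITE: bw=(\d+(?:\.\d+)?)MiB/s", line)
    if match is None:
        raise ValueError("no MiB/s aggregate write bandwidth found")
    return float(match.group(1))


fast = aggregate_bw_mib(FAST)
slow = aggregate_bw_mib(SLOW)

speedup = fast / slow                 # how many times faster the fast run is
drop_pct = (1 - slow / fast) * 100    # bandwidth lost in the slow run

print(f"fast={fast:.0f} MiB/s slow={slow:.0f} MiB/s")
print(f"fast/slow ratio: {speedup:.2f}x, drop: {drop_pct:.1f}%")
```

With these two runs the fast case is roughly 2.18 times the slow case, i.e. the slow run loses a little over half of the write bandwidth.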

