linux-bcache.vger.kernel.org archive mirror
* a strange bcache0 100% busy with no IO rw and no cpu consumption
@ 2017-05-14  3:20 朱菁
       [not found] ` <20170514084515.GA23435@xoff>
  0 siblings, 1 reply; 2+ messages in thread
From: 朱菁 @ 2017-05-14  3:20 UTC (permalink / raw)
  To: linux-bcache; +Cc: qlg

Hi, all

[root@scst-test bcache]# lsblk
NAME                     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
dfa                      250:0    0   1.1T  0 disk 
└─dfa1                   250:1    0   1.1T  0 part 
  └─bcache0              249:0    0  21.8T  0 disk 
    ├─vg_bcache0-fc_vol3 253:3    0     4T  0 lvm  
    ├─vg_bcache0-fc_vol4 253:4    0     4T  0 lvm  
    ├─vg_bcache0-fc_vol5 253:5    0     4T  0 lvm  
    └─vg_bcache0-fc_vol6 253:6    0     4T  0 lvm  
sda                        8:0    0 557.8G  0 disk 
├─sda1                     8:1    0     1G  0 part /boot
└─sda2                     8:2    0 556.8G  0 part 
  ├─cl-root              253:0    0    50G  0 lvm  /
  ├─cl-swap              253:1    0     4G  0 lvm  [SWAP]
  └─cl-home              253:2    0 502.8G  0 lvm  /home
sdb                        8:16   0  21.8T  0 disk 
└─sdb1                     8:17   0  21.8T  0 part 
  └─bcache0              249:0    0  21.8T  0 disk 
    ├─vg_bcache0-fc_vol3 253:3    0     4T  0 lvm  
    ├─vg_bcache0-fc_vol4 253:4    0     4T  0 lvm  
    ├─vg_bcache0-fc_vol5 253:5    0     4T  0 lvm  
    └─vg_bcache0-fc_vol6 253:6    0     4T  0 lvm  

[root@scst-test bcache]# modinfo  bcache
filename:       /lib/modules/4.4.65-1.el7.elrepo.x86_64/kernel/drivers/md/bcache/bcache.ko
license:        GPL
author:         Kent Overstreet <koverstreet@google.com>
author:         Kent Overstreet <kent.overstreet@gmail.com>
license:        GPL
srcversion:     391A0B3836FE95B29F75289
depends:        
intree:         Y
vermagic:       4.4.65-1.el7.elrepo.x86_64 SMP mod_unload modversions 

My test setup:
node1: stor-node, running an SCST target on top of bcache (1.2T PCIe SSD cache device, 21T LSI RAID10 LUN backing device)
node2: esxi-node, running FIO tests in a VM (RHEL 6.5)

First, I ran about 100T of 128k random writes against the bcache0 device:

[root@localhost ~]# cat myfio.sh 
# /dev/sdb inside the VM is the bcache0 device exported from the stor-node
i=0
while [ $i -le 99 ]
do
        fio --filename=/dev/sdb --rw=randwrite --bs=128k --ioengine=libaio --iodepth=16 --randrepeat=0 --refill_buffers --norandommap --size=1024G --name=test --numjobs=1
        sleep 5
        let i=i+1
done

Then I stopped all applications, flushed all dirty data, set cache_mode to none, and set writeback_running to 0, and then checked the sysfs state:

[root@scst-test bcache]# service scst stop
[root@scst-test bcache]# cat dirty_data 
0
[root@scst-test bcache]# cat state
clean
[root@scst-test bcache]# cat writeback_percent 
0
[root@scst-test bcache]# cat writeback_running 
1
[root@scst-test bcache]# echo 0 > writeback_running 
[root@scst-test bcache]# cat writeback_running 
0
[root@scst-test bcache]# cat cache_mode 
writethrough writeback writearound [none]
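The sequence above could be collected into one script. This is a hedged sketch, not the exact commands I ran; the sysfs paths assume the usual /sys/block/bcache0/bcache layout on this 4.4 kernel, and the cache_mode write is implied by the [none] output rather than shown in the transcript:

```shell
# Sketch of the shutdown sequence (paths assumed, see note above).
B=/sys/block/bcache0/bcache
service scst stop                             # stop the SCST target first
echo 0 > $B/writeback_percent                 # target 0% dirty: flush everything
while [ "$(cat $B/state)" != "clean" ]; do sleep 1; done
echo 0 > $B/writeback_running                 # park the writeback thread
echo none > $B/cache_mode                     # new IO bypasses the cache
```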

At the same time:

[root@scst-test cache0]# cat priority_stats 
Unused:         66%
Clean:          0%
Dirty:          26% <<<<<< I doubt this value; as shown above, the cache device was clean with no dirty_data
Metadata:       0%
Average:        267
Sectors per Q:  24295296
Quantiles:      [3 7 7 13 25 36 48 59 71 83 95 106 118 130 142 154 165 177 190 204 217 230 243 261 280 298 324 347 381 426 531]

Finally, iostat showed bcache0 as 100% busy with an enormous avgqu-sz:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
bcache0           0.00     0.00    0.00    0.00     0.00     0.00     0.00 1227360.00    0.00    0.00    0.00   0.00 100.00

vmstat showed no CPU consumption and no I/O wait:

[root@scst-test b0bc3e56-530b-4541-b73d-30007d1e6709]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
0  0      0 243501152    976 710832    0    0     2 17246    2    2  0  3 97  0  0
0  0      0 243501152    976 710848    0    0     0     0   47  138  0  0 100  0  0
0  0      0 243501296    976 710848    0    0     0     0  107  110  0  0 100  0  0
0  0      0 243501312    976 710848    0    0     1     0   57  109  0  0 100  0  0
0  0      0 243501312    976 710848    0    0     0     0  104  110  0  0 100  0  0
0  0      0 243501312    976 710848    0    0     0     0   31   82  0  0 100  0  0


Is this an iostat bug, or am I doing something wrong?

P.S. Everything is fine again after I reboot the system.

Thanks for your reply!

Jing.zhu


* Re: a strange bcache0 100% busy with no IO rw and no cpu consumption
       [not found]   ` <000e01d2cc95$1b81cad0$52856070$@ecloudtech.com.cn>
@ 2017-05-14 13:53     ` Matthias Ferdinand
  0 siblings, 0 replies; 2+ messages in thread
From: Matthias Ferdinand @ 2017-05-14 13:53 UTC (permalink / raw)
  To: 朱菁; +Cc: linux-bcache

On Sun, May 14, 2017 at 05:33:12PM +0800, 朱菁 wrote:
> Hi:
> 	Thanks for your reply.
> 	/dev/sdb on the esxi-node is the exported bcache0 from the stor-node.

sorry, it just looked too easy :-)

I have also seen suspiciously high iostat values on bcache devices,
where extreme values of "wkB/s" go together with a high "avgrq-sz"
(probably a derived value), but never on two consecutive measurements;
there were always several days in between. "%util" always looked OK,
though.

Since iostat just reads /proc/diskstats, I would rather suspect some
(bcache-related) bug there.
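For reference, iostat derives %util and avgqu-sz from the last stat columns of /proc/diskstats, which the kernel keeps advancing as long as the device's in-flight counter is nonzero. A small illustration with an invented diskstats line (the values are made up, not taken from the report above):

```shell
# An invented /proc/diskstats line for bcache0; columns 4-14 are the 11
# stat fields documented in the kernel's Documentation/iostats.txt.
line="249 0 bcache0 10 0 80 5 20 0 160 9 3 100 1227360"
# Column 12 = "I/Os currently in progress". While it is nonzero the kernel
# keeps incrementing io_ticks (col 13) and time_in_queue (col 14), so
# iostat shows 100% %util and a huge avgqu-sz even when r/s and w/s are
# zero -- consistent with a leaked or miscounted in-flight request.
in_flight=$(echo "$line" | awk '{print $12}')
echo "in-flight: $in_flight"
```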

Sometimes I also see odd numbers in writeback_rate_debug, e.g.
"dirty" values increasing like this: 
   2.8M .. 2.9M .. 2.10M .. 3.0M
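The ".10" step suggests the fractional digit is computed independently of
the integer part and never clamped to 0-9. A minimal sketch (an assumed
formatter, not the actual bcache code) that reproduces the sequence:

```shell
# Hypothetical human-readable formatter: integer MiB part plus one
# rounded decimal digit taken from the KiB remainder. Because the digit
# is rounded on its own, a remainder near 1024 KiB rounds up to 10
# instead of carrying into the integer part.
for kib in 2867 2970 3021 3072; do
  awk -v v="$kib" 'BEGIN { printf "%d.%dM\n", int(v/1024), int((v%1024)*10/1024 + 0.5) }'
done
```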

Matthias

