* (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-13  8:13 UTC
From: Jure Erznožnik
To: linux-raid

First of all, I apologise if this mailing list is not intended for layman
help, but that is what I am and I couldn't get an explanation elsewhere.

My problem is that mdadm (as it seems) is touching the HDD superblocks
once per second, one write at sector 8, the next at sector 16. Total
traffic is a few kilobytes per second, writes only, no other detectable
traffic.

I have detailed the problem here:
http://unix.stackexchange.com/questions/329477/

Shortened:
Kubuntu 16.10, kernel 4.8.0-30-generic #32, mdadm v3.4 (2016-01-28).
My configuration: 4 spinning disks (/dev/sd[cdef]) assembled into a
RAID5 array, then bcache set to cache (hopefully) everything
(cache_mode = writeback, sequential_cutoff = 0). On top of the bcache
volume I have set up LVM.

* iostat shows traffic on sd[cdef] and md0
* iotop shows no traffic
* iosnoop shows COMM=[idle, md0_raid5, kworker] as the processes working
  on the disk. The blocks reported are 8, 16 (data size a few KB) and
  18446744073709500000 (data size 0). That last one must be some virtual
  thingie as the disks are nowhere near that large.
* enabling block_dump shows the md0_raid5 process writing to block 8
  (1 sector) and block 16 (8 sectors)

This touching is triggered by any write into the array and goes on for
quite a while after the write has finished (a couple of hours for 60GB
of writes). When services actually work with the array, it becomes
pretty much constant.

What am I observing, and is there any way of stopping it?

Thanks,
Jure

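(For reference, a minimal sketch of how observations like the above can be
gathered; block_dump output lands in the kernel log, and the exact iostat
option syntax may vary with the installed sysstat version:)

    # per-device throughput, refreshed every second
    iostat -x 1

    # log every block-layer write to the kernel log (noisy; switch it off again)
    echo 1 > /proc/sys/vm/block_dump
    dmesg -w | grep md0_raid5
    echo 0 > /proc/sys/vm/block_dump
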
* Re: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-14  1:15 UTC
From: NeilBrown
To: Jure Erznožnik, linux-raid

On Tue, Dec 13 2016, Jure Erznožnik wrote:

> First of all, I apologise if this mailing list is not intended for layman
> help, but that is what I am and I couldn't get an explanation elsewhere.
>
> My problem is that mdadm (as it seems) is touching the HDD superblocks
> once per second, one write at sector 8, the next at sector 16. Total
> traffic is a few kilobytes per second, writes only, no other detectable
> traffic.
>
> I have detailed the problem here:
> http://unix.stackexchange.com/questions/329477/
>
> Shortened:
> Kubuntu 16.10, kernel 4.8.0-30-generic #32, mdadm v3.4 (2016-01-28).
> My configuration: 4 spinning disks (/dev/sd[cdef]) assembled into a
> RAID5 array, then bcache set to cache (hopefully) everything
> (cache_mode = writeback, sequential_cutoff = 0). On top of the bcache
> volume I have set up LVM.
>
> * iostat shows traffic on sd[cdef] and md0
> * iotop shows no traffic
> * iosnoop shows COMM=[idle, md0_raid5, kworker] as the processes working
>   on the disk. The blocks reported are 8, 16 (data size a few KB) and
>   18446744073709500000 (data size 0). That last one must be some virtual
>   thingie as the disks are nowhere near that large.
> * enabling block_dump shows the md0_raid5 process writing to block 8
>   (1 sector) and block 16 (8 sectors)
>
> This touching is triggered by any write into the array and goes on for
> quite a while after the write has finished (a couple of hours for 60GB
> of writes). When services actually work with the array, it becomes
> pretty much constant.
>
> What am I observing, and is there any way of stopping it?

Start with the uppermost layer which has I/O that you cannot explain.
Presumably that is md0.
Run 'blktrace' on that device for a little while, then 'blkparse' to
look at the results.

  blktrace -w 10 md0
  blkparse *blktrace*

It will give the name of the process that initiated the request in []
at the end of some lines.

NeilBrown

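(Spelled out a little more fully as an example; the paths, output file
names and the 10-second window are just illustrative choices, blktrace
needs the real device node and root privileges, and blkparse is pointed
at the per-CPU trace files blktrace leaves in the working directory:)

    cd /tmp
    blktrace -d /dev/md0 -w 10          # trace md0 for 10 seconds
    blkparse -i md0 -o md0_parsed.txt   # merge the md0.blktrace.* files into text
    grep ' W ' md0_parsed.txt | less    # rough filter for write requests and who queued them
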
[parent not found: <CAJ=9zieRuTNiEGuB_RouqbdLGoxNkn09yiogR6rND84LtMdbxA@mail.gmail.com>]
* Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-15  7:01 UTC
From: Jure Erznožnik
To: NeilBrown, linux-raid

Thanks for helping, Neil. I have run the suggested utilities and here
are my findings:

It is always [kworker/x:yy] (x:yy changes somewhat) or [0].
A few lines from one of the outputs:

9,0 3 0 0.061577998 0 m N raid5 rcw 3758609392 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137872647 0 m N raid5 rcw 3758671448 2 0 1
9,0 3 0 4.137872647 0 m N raid5 rcw 3758671448 2 0 1
9,0 3 0 4.137872647 0 m N raid5 rcw 3758671448 2 0 1

LP,
Jure

On Wed, Dec 14, 2016 at 2:15 AM, NeilBrown <neilb@suse.com> wrote:
> On Tue, Dec 13 2016, Jure Erznožnik wrote:
>
>> [...]
>
> Start with the uppermost layer which has I/O that you cannot explain.
> Presumably that is md0.
> Run 'blktrace' on that device for a little while, then 'blkparse' to
> look at the results.
>
>   blktrace -w 10 md0
>   blkparse *blktrace*
>
> It will give the name of the process that initiated the request in []
> at the end of some lines.
>
> NeilBrown

* Re: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-18 19:40 UTC
From: Jure Erznožnik
To: NeilBrown, linux-raid

My further attempts to solve this issue include the following (all
unsuccessful):

1. Installing a fresh Ubuntu and assembling the array
2. Installing OpenSUSE and assembling the array
3. Tearing the array down and creating it anew from scratch (it now has
   a new UUID, but the data seems to have been preserved, so my bcache /
   LVM2 configuration remains the same). Interestingly though: during the
   initial array rebuild, which took the better part of today, there was
   no clicking even though the drives were constantly in action. Either
   it was inaudible or the touching didn't take place.

I think I'm barking up the wrong tree with these experiments. Not sure
how to proceed from here.

LP,
Jure

On Thu, Dec 15, 2016 at 8:01 AM, Jure Erznožnik
<jure.erznoznik@gmail.com> wrote:
> Thanks for helping, Neil. I have run the suggested utilities and here
> are my findings:
>
> It is always [kworker/x:yy] (x:yy changes somewhat) or [0].
> A few lines from one of the outputs:
>
> [...]

* Re: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-18 21:30 UTC
From: Theophanis Kontogiannis
To: Jure Erznožnik
Cc: NeilBrown, Linux RAID

Hello All,

Kind reminder that I had to start a similar thread last month:

https://marc.info/?t=147871214200005&r=1&w=2

Just in case it rings any bells.

BR,
Theo

---
Best regards,
ΜΦΧ,
Theophanis Kontogiannis

On Sun, Dec 18, 2016 at 9:40 PM, Jure Erznožnik
<jure.erznoznik@gmail.com> wrote:
> My further attempts to solve this issue include the following (all
> unsuccessful):
>
> [...]

* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-18 22:21 UTC
From: NeilBrown
To: Jure Erznožnik, linux-raid

On Thu, Dec 15 2016, Jure Erznožnik wrote:

> Thanks for helping, Neil. I have run the suggested utilities and here
> are my findings:
>
> It is always [kworker/x:yy] (x:yy changes somewhat) or [0].
> A few lines from one of the outputs:

That's disappointing.  "kworker" could be any work queue.

It might be useful to look for large scale patterns.  What different
addresses are written to?  Is there a regular pattern?  What is the
period?

md doesn't use work queues for I/O, so these must be coming from
elsewhere.  bcache uses a few work queues...

NeilBrown

> [...]

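(One way to eyeball the addresses and the period asked about here is to
post-process the blkparse text output. This is only a sketch against the
sample shown above: the file name md0_parsed.txt is assumed from the
earlier example, and the awk field positions match the default blkparse
output format but may differ if other options are used:)

    # timestamp and sector of every queued write, to spot the once-per-second pattern
    awk '$6 == "Q" && $7 ~ /^W/ {print $4, $8}' md0_parsed.txt

    # count how often each sector is hit, most frequent first
    awk '$6 == "Q" && $7 ~ /^W/ {print $8}' md0_parsed.txt | sort -n | uniq -c | sort -rn | head
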
[parent not found: <CAJ=9zidNV4sPj7KC7_mJEo8+=-YTKyWD5RiLsGG9p33CV12Qdg@mail.gmail.com>]
* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-19  4:01 UTC
From: NeilBrown
To: Jure Erznožnik
Cc: linux-raid

[please remember to keep linux-raid cc:ed]

On Mon, Dec 19 2016, Jure Erznožnik wrote:

> I wrote this in OP: iosnoop shows COMM=[idle, md0_raid5, kworker] as
> processes working on the disk. Blocks reported are 8, 16 (data size a
> few KB) and 18446744073709500000 (data size 0). That last one must be
> some virtual thingie as the disks are nowhere near that large.
>
> Does this answer the question or did you mean something else?

Maybe if you just make the blktrace logs available somewhere and I will
look at them myself.

NeilBrown

* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-19  7:12 UTC
From: Jure Erznožnik
To: NeilBrown
Cc: linux-raid

I have made two blktraces at the same time: one for md0 and one for a
member of the array. I hope they will show something sensible.

I dropped them here:
http://expirebox.com/download/ee081fa4f85ffbd0bfad68e4ee257e11.html

The file will be available for 48 hours, or so they say.

LP,
Jure

On Mon, Dec 19, 2016 at 5:01 AM, NeilBrown <neilb@suse.com> wrote:
> [please remember to keep linux-raid cc:ed]
>
> On Mon, Dec 19 2016, Jure Erznožnik wrote:
>
>> I wrote this in OP: iosnoop shows COMM=[idle, md0_raid5, kworker] as
>> processes working on the disk. Blocks reported are 8, 16 (data size a
>> few KB) and 18446744073709500000 (data size 0). That last one must be
>> some virtual thingie as the disks are nowhere near that large.
>>
>> Does this answer the question or did you mean something else?
>
> Maybe if you just make the blktrace logs available somewhere and I will
> look at them myself.
>
> NeilBrown

* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-19 23:39 UTC
From: NeilBrown
To: Jure Erznožnik
Cc: linux-raid

On Mon, Dec 19 2016, Jure Erznožnik wrote:

> I have made two blktraces at the same time: one for md0 and one for a
> member of the array. I hope they will show something sensible.
>
> I dropped them here:
> http://expirebox.com/download/ee081fa4f85ffbd0bfad68e4ee257e11.html
>
> The file will be available for 48 hours, or so they say.

Thanks.
The blktrace_md0.txt file shows several writes that originate from
iscsi_trx, so it looks like 'md0' (or some other device that is built
on it) is being exported as an iscsi target.  Is that correct?

There are also some large (1024 sector) writes being generated about
once per second.  That is very clearly something being written to md0,
not something that md0 is doing itself.
They are generated by a kworker.  It might be iscsi related, but it
might be something else.

Try disabling the iscsi export and, if the strange traffic still
appears, collect another blktrace (maybe for a bit longer - 30 seconds?).
Also please report the output of

  ls -l /sys/block/md0/holders/
  fuser /dev/md0

when the problem is occurring.

NeilBrown

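(A possible way to run that check, assuming the export is handled by the
in-kernel LIO target managed through targetcli; that is only a guess about
this particular setup, and a different iSCSI target stack or service name
would need its own commands:)

    targetcli ls                    # see what is currently exported
    systemctl stop target           # temporarily stop the LIO target service
    ls -l /sys/block/md0/holders/   # which devices sit directly on top of md0
    fuser -v /dev/md0               # which processes hold md0 open
    blktrace -d /dev/md0 -w 30      # re-collect a longer trace with iSCSI down
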
* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-21 11:33 UTC
From: Jure Erznožnik
To: NeilBrown, linux-raid, tkonto

Mr Brown,

Let me begin with: please give me your paypal address or something, so
that I can at least buy you a beer.

Your analysis and discovery that iSCSI is the origin of the writes got
me thinking: how can he see that on the md0 device if that device has
two more layers (bcache + LVM) before iSCSI even comes into play? Maybe
the system propagates the origin down the block devices or something;
totally not relevant here. So I embarked on a journey of total data
destruction by disabling one layer at a time. I started by simply
detaching bcache, as that was the first thing on the list - and was
non-destructive to boot :)

I have found the culprit:

It is bcache that does the one-second writes. I have yet to find the
exact parameters that influence this behaviour, but the output of
writeback_rate_debug is EXTREMELY clear: it's writing a bit of data
each second, reducing the dirty cache by that tiny amount. This is what
causes the write "amplification" resulting in clicks long after a write
has been done - because bcache only writes tiny amounts each second
instead of flushing the entire cache at once when the time comes.

Thank you for your time, and please consider the first sentence of this
mail.

LP,
Jure

On Tue, Dec 20, 2016 at 12:39 AM, NeilBrown <neilb@suse.com> wrote:
> [...]

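(For anyone hitting the same behaviour, this is roughly where that
information lives in sysfs; "bcache0" is just an example device name and
the exact set of writeback tunables depends on the kernel version:)

    # how much dirty data is left and how fast it is being written back
    cat /sys/block/bcache0/bcache/writeback_rate_debug

    # knobs that shape the writeback trickle
    cat /sys/block/bcache0/bcache/writeback_percent
    cat /sys/block/bcache0/bcache/writeback_delay
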
* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-21 22:52 UTC
From: NeilBrown
To: Jure Erznožnik, linux-raid, tkonto

On Wed, Dec 21 2016, Jure Erznožnik wrote:

> Mr Brown,
>
> Let me begin with: please give me your paypal address or something, so
> that I can at least buy you a beer.

Thanks :-)
My first inclination is to say "no thanks" as I am very adequately
compensated by SUSE, and it is part of my role at SUSE to ensure the
upstream kernel remains healthy.  Encouraging a healthy community is
part of that (and I often learn something while helping people fix
things).

But my second inclination is to recognize that gratitude is an
important part of human interactions, and that a community is strong
when gratitude is appropriately given and received.  It is not my place
to direct others how they should show gratitude.

So I'll tell you my paypal address is neil@brown.name and that I'm more
likely to enjoy hot chocolate than beer, but I'll also emphasize that
there is no expectation attached to this information.

:-)

> Your analysis and discovery that iSCSI is the origin of the writes got
> me thinking: how can he see that on the md0 device if that device has
> two more layers (bcache + LVM) before iSCSI even comes into play?
> Maybe the system propagates the origin down the block devices or
> something; totally not relevant here. So I embarked on a journey of
> total data destruction by disabling one layer at a time. I started by
> simply detaching bcache, as that was the first thing on the list - and
> was non-destructive to boot :)
>
> I have found the culprit:
>
> It is bcache that does the one-second writes. I have yet to find the
> exact parameters that influence this behaviour, but the output of
> writeback_rate_debug is EXTREMELY clear: it's writing a bit of data
> each second, reducing the dirty cache by that tiny amount. This is
> what causes the write "amplification" resulting in clicks long after a
> write has been done - because bcache only writes tiny amounts each
> second instead of flushing the entire cache at once when the time
> comes.

Now that we have an understanding of what is happening, I can recommend
that you increase /sys/block/md0/md/safe_mode_delay.  It is measured in
seconds.  If you make it larger than the period of the bcache writes,
it should stop the 'ticking' you mentioned.

NeilBrown

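(Concretely, something along these lines. The value accepts fractional
seconds, and 5 seconds is only an illustrative choice that comfortably
exceeds bcache's one-second writeback interval; the right value for a
given setup may differ, and the setting does not persist across reboots
unless reapplied:)

    cat /sys/block/md0/md/safe_mode_delay      # default is around 0.2 seconds
    echo 5.0 > /sys/block/md0/md/safe_mode_delay
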