* (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-13  8:13 UTC
From: Jure Erznožnik
To: linux-raid

First of all, I apologise if this mailing list is not intended for layman
help, but that is what I am and I couldn't get an explanation elsewhere.

My problem is that mdadm (as it seems) is touching the HDD superblocks
once per second, one write at sector 8, the next at sector 16. Total
traffic is a few kilobytes per second, writes only, no other detectable
traffic.

I have detailed the problem here:
http://unix.stackexchange.com/questions/329477/

Shortened:
Kubuntu 16.10, kernel 4.8.0-30-generic #32, mdadm v3.4 (2016-01-28).
My configuration: 4 spinning disks (/dev/sd[cdef]) assembled into a
RAID5 array, then bcache set to cache (hopefully) everything
(cache_mode = writeback, sequential_cutoff = 0). On top of the bcache
volume I have set up LVM.

* iostat shows traffic on sd[cdef] and md0
* iotop shows no traffic
* iosnoop shows COMM=[idle, md0_raid5, kworker] as the processes working
  on the disk. The blocks reported are 8, 16 (data size a few KB) and
  18446744073709500000 (data size 0). That last one must be some virtual
  thingie as the disks are nowhere near that large.
* enabling block_dump shows the md0_raid5 process writing to block 8
  (1 sector) and block 16 (8 sectors)

This touching is triggered by any write into the array and goes on for
quite a while after the write has finished (a couple of hours for 60GB
of writes). When services actually work with the array, it becomes
pretty much constant.

What am I observing, and is there any way of stopping it?

Thanks,
Jure

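(For reference, a minimal sketch of how observations like the above can be
gathered; block_dump output lands in the kernel log, and the exact iostat
option syntax may vary with the installed sysstat version:)

    # per-device throughput, refreshed every second
    iostat -x 1

    # log every block-layer write to the kernel log (noisy; switch it off again)
    echo 1 > /proc/sys/vm/block_dump
    dmesg -w | grep md0_raid5
    echo 0 > /proc/sys/vm/block_dump
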
* Re: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-14  1:15 UTC
From: NeilBrown
To: Jure Erznožnik, linux-raid

On Tue, Dec 13 2016, Jure Erznožnik wrote:

> First of all, I apologise if this mailing list is not intended for layman
> help, but that is what I am and I couldn't get an explanation elsewhere.
>
> My problem is that mdadm (as it seems) is touching the HDD superblocks
> once per second, one write at sector 8, the next at sector 16. Total
> traffic is a few kilobytes per second, writes only, no other detectable
> traffic.
>
> I have detailed the problem here:
> http://unix.stackexchange.com/questions/329477/
>
> Shortened:
> Kubuntu 16.10, kernel 4.8.0-30-generic #32, mdadm v3.4 (2016-01-28).
> My configuration: 4 spinning disks (/dev/sd[cdef]) assembled into a
> RAID5 array, then bcache set to cache (hopefully) everything
> (cache_mode = writeback, sequential_cutoff = 0). On top of the bcache
> volume I have set up LVM.
>
> * iostat shows traffic on sd[cdef] and md0
> * iotop shows no traffic
> * iosnoop shows COMM=[idle, md0_raid5, kworker] as the processes working
>   on the disk. The blocks reported are 8, 16 (data size a few KB) and
>   18446744073709500000 (data size 0). That last one must be some virtual
>   thingie as the disks are nowhere near that large.
> * enabling block_dump shows the md0_raid5 process writing to block 8
>   (1 sector) and block 16 (8 sectors)
>
> This touching is triggered by any write into the array and goes on for
> quite a while after the write has finished (a couple of hours for 60GB
> of writes). When services actually work with the array, it becomes
> pretty much constant.
>
> What am I observing, and is there any way of stopping it?

Start with the uppermost layer which has I/O that you cannot explain.
Presumably that is md0.
Run 'blktrace' on that device for a little while, then 'blkparse' to
look at the results.

  blktrace -w 10 md0
  blkparse *blktrace*

It will give the name of the process that initiated the request in []
at the end of some lines.

NeilBrown

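(Spelled out a little more fully as an example; the paths, output file
names and the 10-second window are just illustrative choices, blktrace
needs the real device node and root privileges, and blkparse is pointed
at the per-CPU trace files blktrace leaves in the working directory:)

    cd /tmp
    blktrace -d /dev/md0 -w 10          # trace md0 for 10 seconds
    blkparse -i md0 -o md0_parsed.txt   # merge the md0.blktrace.* files into text
    grep ' W ' md0_parsed.txt | less    # rough filter for write requests and who queued them
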
[parent not found: <CAJ=9zieRuTNiEGuB_RouqbdLGoxNkn09yiogR6rND84LtMdbxA@mail.gmail.com>]
* Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-15  7:01 UTC
From: Jure Erznožnik
To: NeilBrown, linux-raid

Thanks for helping, Neil. I have run the suggested utilities and here
are my findings:

It is always [kworker/x:yy] (x:yy changes somewhat) or [0].
A few lines from one of the outputs:

9,0 3 0 0.061577998 0 m N raid5 rcw 3758609392 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 3 0 0.061580084 0 m N raid5 rcw 3758609400 2 2 0
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 0 1 0.065333879 283 C W 11275825480 [0]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 3 2 1.022155200 2861 Q W 11275826504 + 32 [kworker/3:38]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 0 2 1.054590402 283 C W 11275826504 [0]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 3 3 2.046065106 2861 Q W 11275861232 + 8 [kworker/3:38]
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075247515 0 m N raid5 rcw 3758619888 2 0 1
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 0 0 2.075250686 0 m N raid5 rcw 3758619888 2 2 0
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 2 1 2.086924691 283 C W 11275861232 [0]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 0 3 2.967340614 1061 Q FWS [kworker/0:18]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 3 4 3.070092310 2861 Q W 11275861272 + 8 [kworker/3:38]
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101966398 0 m N raid5 rcw 3758619928 2 0 1
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 0 3.101969169 0 m N raid5 rcw 3758619928 2 2 0
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 0 4 3.102340646 283 C W 11275861272 [0]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 5 4.094666938 2861 Q W 11276014160 + 336 [kworker/3:38]
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137869804 0 m N raid5 rcw 3758671440 2 0 1
9,0 3 0 4.137872647 0 m N raid5 rcw 3758671448 2 0 1
9,0 3 0 4.137872647 0 m N raid5 rcw 3758671448 2 0 1
9,0 3 0 4.137872647 0 m N raid5 rcw 3758671448 2 0 1

LP,
Jure

On Wed, Dec 14, 2016 at 2:15 AM, NeilBrown <neilb@suse.com> wrote:
> On Tue, Dec 13 2016, Jure Erznožnik wrote:
>
>> [...]
>
> Start with the uppermost layer which has I/O that you cannot explain.
> Presumably that is md0.
> Run 'blktrace' on that device for a little while, then 'blkparse' to
> look at the results.
>
>   blktrace -w 10 md0
>   blkparse *blktrace*
>
> It will give the name of the process that initiated the request in []
> at the end of some lines.
>
> NeilBrown

* Re: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-18 19:40 UTC
From: Jure Erznožnik
To: NeilBrown, linux-raid

My further attempts to solve this issue include the following (all
unsuccessful):

1. Installing a fresh Ubuntu and assembling the array
2. Installing OpenSUSE and assembling the array
3. Tearing the array down and creating it anew from scratch (it now has
   a new UUID, but the data seems to have been preserved, so my bcache /
   LVM2 configuration remains the same). Interestingly though: during the
   initial array rebuild, which took the better part of today, there was
   no clicking even though the drives were constantly in action. Either
   it was inaudible or the touching didn't take place.

I think I'm barking up the wrong tree with these experiments. Not sure
how to proceed from here.

LP,
Jure

On Thu, Dec 15, 2016 at 8:01 AM, Jure Erznožnik
<jure.erznoznik@gmail.com> wrote:
> Thanks for helping, Neil. I have run the suggested utilities and here
> are my findings:
>
> It is always [kworker/x:yy] (x:yy changes somewhat) or [0].
> A few lines from one of the outputs:
>
> [...]

* Re: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-18 21:30 UTC
From: Theophanis Kontogiannis
To: Jure Erznožnik
Cc: NeilBrown, Linux RAID

Hello All,

Kind reminder that I had to start a similar thread last month:

https://marc.info/?t=147871214200005&r=1&w=2

Just in case it rings any bells.

BR,
Theo

---
Best regards,
ΜΦΧ,
Theophanis Kontogiannis

On Sun, Dec 18, 2016 at 9:40 PM, Jure Erznožnik
<jure.erznoznik@gmail.com> wrote:
> My further attempts to solve this issue include the following (all
> unsuccessful):
>
> [...]

* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-18 22:21 UTC
From: NeilBrown
To: Jure Erznožnik, linux-raid

On Thu, Dec 15 2016, Jure Erznožnik wrote:

> Thanks for helping, Neil. I have run the suggested utilities and here
> are my findings:
>
> It is always [kworker/x:yy] (x:yy changes somewhat) or [0].
> A few lines from one of the outputs:

That's disappointing.  "kworker" could be any work queue.

It might be useful to look for large scale patterns.  What different
addresses are written to?  Is there a regular pattern?  What is the
period?

md doesn't use work queues for I/O, so these must be coming from
elsewhere.  bcache uses a few work queues...

NeilBrown

> [...]

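(One way to eyeball the addresses and the period asked about here is to
post-process the blkparse text output. This is only a sketch against the
sample shown above: the file name md0_parsed.txt is assumed from the
earlier example, and the awk field positions match the default blkparse
output format but may differ if other options are used:)

    # timestamp and sector of every queued write, to spot the once-per-second pattern
    awk '$6 == "Q" && $7 ~ /^W/ {print $4, $8}' md0_parsed.txt

    # count how often each sector is hit, most frequent first
    awk '$6 == "Q" && $7 ~ /^W/ {print $8}' md0_parsed.txt | sort -n | uniq -c | sort -rn | head
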
[parent not found: <CAJ=9zidNV4sPj7KC7_mJEo8+=-YTKyWD5RiLsGG9p33CV12Qdg@mail.gmail.com>]
* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-19  4:01 UTC
From: NeilBrown
To: Jure Erznožnik
Cc: linux-raid

[please remember to keep linux-raid cc:ed]

On Mon, Dec 19 2016, Jure Erznožnik wrote:

> I wrote this in OP: iosnoop shows COMM=[idle, md0_raid5, kworker] as
> processes working on the disk. Blocks reported are 8, 16 (data size a
> few KB) and 18446744073709500000 (data size 0). That last one must be
> some virtual thingie as the disks are nowhere near that large.
>
> Does this answer the question or did you mean something else?

Maybe if you just make the blktrace logs available somewhere and I will
look at them myself.

NeilBrown

* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-19  7:12 UTC
From: Jure Erznožnik
To: NeilBrown
Cc: linux-raid

I have made two blktraces at the same time: one for md0 and one for a
member of the array. I hope they will show something sensible.

I dropped them here:
http://expirebox.com/download/ee081fa4f85ffbd0bfad68e4ee257e11.html

The file will be available for 48 hours, or so they say.

LP,
Jure

On Mon, Dec 19, 2016 at 5:01 AM, NeilBrown <neilb@suse.com> wrote:
> [please remember to keep linux-raid cc:ed]
>
> On Mon, Dec 19 2016, Jure Erznožnik wrote:
>
>> I wrote this in OP: iosnoop shows COMM=[idle, md0_raid5, kworker] as
>> processes working on the disk. Blocks reported are 8, 16 (data size a
>> few KB) and 18446744073709500000 (data size 0). That last one must be
>> some virtual thingie as the disks are nowhere near that large.
>>
>> Does this answer the question or did you mean something else?
>
> Maybe if you just make the blktrace logs available somewhere and I will
> look at them myself.
>
> NeilBrown

* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-19 23:39 UTC
From: NeilBrown
To: Jure Erznožnik
Cc: linux-raid

On Mon, Dec 19 2016, Jure Erznožnik wrote:

> I have made two blktraces at the same time: one for md0 and one for a
> member of the array. I hope they will show something sensible.
>
> I dropped them here:
> http://expirebox.com/download/ee081fa4f85ffbd0bfad68e4ee257e11.html
>
> The file will be available for 48 hours, or so they say.

Thanks.
The blktrace_md0.txt file shows several writes that originate from
iscsi_trx, so it looks like 'md0' (or some other device that is built
on it) is being exported as an iscsi target.  Is that correct?

There are also some large (1024 sector) writes being generated about
once per second.  That is very clearly something being written to md0,
not something that md0 is doing itself.
They are generated by a kworker.  It might be iscsi related, but it
might be something else.

Try disabling the iscsi export and, if the strange traffic still
appears, collect another blktrace (maybe for a bit longer - 30 seconds?).
Also please report the output of

  ls -l /sys/block/md0/holders/
  fuser /dev/md0

when the problem is occurring.

NeilBrown

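(A possible way to run that check, assuming the export is handled by the
in-kernel LIO target managed through targetcli; that is only a guess about
this particular setup, and a different iSCSI target stack or service name
would need its own commands:)

    targetcli ls                    # see what is currently exported
    systemctl stop target           # temporarily stop the LIO target service
    ls -l /sys/block/md0/holders/   # which devices sit directly on top of md0
    fuser -v /dev/md0               # which processes hold md0 open
    blktrace -d /dev/md0 -w 30      # re-collect a longer trace with iSCSI down
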
* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-21 11:33 UTC
From: Jure Erznožnik
To: NeilBrown, linux-raid, tkonto

Mr Brown,

Let me begin with: please give me your paypal address or something, so
that I can at least buy you a beer.

Your analysis and discovery that iSCSI is the origin of the writes got
me thinking: how can he see that on the md0 device if that device has
two more layers (bcache + LVM) before iSCSI even comes into play? Maybe
the system propagates the origin down the block devices or something;
totally not relevant here. So I embarked on a journey of total data
destruction by disabling one layer at a time. I started by simply
detaching bcache, as that was the first thing on the list - and was
non-destructive to boot :)

I have found the culprit:

It is bcache that does the one-second writes. I have yet to find the
exact parameters that influence this behaviour, but the output of
writeback_rate_debug is EXTREMELY clear: it's writing a bit of data
each second, reducing the dirty cache by that tiny amount. This is what
causes the write "amplification" resulting in clicks long after a write
has been done - because bcache only writes tiny amounts each second
instead of flushing the entire cache at once when the time comes.

Thank you for your time, and please consider the first sentence of this
mail.

LP,
Jure

On Tue, Dec 20, 2016 at 12:39 AM, NeilBrown <neilb@suse.com> wrote:
> [...]

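(For anyone hitting the same behaviour, this is roughly where that
information lives in sysfs; "bcache0" is just an example device name and
the exact set of writeback tunables depends on the kernel version:)

    # how much dirty data is left and how fast it is being written back
    cat /sys/block/bcache0/bcache/writeback_rate_debug

    # knobs that shape the writeback trickle
    cat /sys/block/bcache0/bcache/writeback_percent
    cat /sys/block/bcache0/bcache/writeback_delay
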
* Re: Fwd: (user) Help needed: mdadm seems to constantly touch my disks
@ 2016-12-21 22:52 UTC
From: NeilBrown
To: Jure Erznožnik, linux-raid, tkonto

On Wed, Dec 21 2016, Jure Erznožnik wrote:

> Mr Brown,
>
> Let me begin with: please give me your paypal address or something, so
> that I can at least buy you a beer.

Thanks :-)
My first inclination is to say "no thanks" as I am very adequately
compensated by SUSE, and it is part of my role at SUSE to ensure the
upstream kernel remains healthy.  Encouraging a healthy community is
part of that (and I often learn something while helping people fix
things).

But my second inclination is to recognize that gratitude is an
important part of human interactions, and that a community is strong
when gratitude is appropriately given and received.  It is not my place
to direct others how they should show gratitude.

So I'll tell you my paypal address is neil@brown.name and that I'm more
likely to enjoy hot chocolate than beer, but I'll also emphasize that
there is no expectation attached to this information.

:-)

> Your analysis and discovery that iSCSI is the origin of the writes got
> me thinking: how can he see that on the md0 device if that device has
> two more layers (bcache + LVM) before iSCSI even comes into play?
> Maybe the system propagates the origin down the block devices or
> something; totally not relevant here. So I embarked on a journey of
> total data destruction by disabling one layer at a time. I started by
> simply detaching bcache, as that was the first thing on the list - and
> was non-destructive to boot :)
>
> I have found the culprit:
>
> It is bcache that does the one-second writes. I have yet to find the
> exact parameters that influence this behaviour, but the output of
> writeback_rate_debug is EXTREMELY clear: it's writing a bit of data
> each second, reducing the dirty cache by that tiny amount. This is
> what causes the write "amplification" resulting in clicks long after a
> write has been done - because bcache only writes tiny amounts each
> second instead of flushing the entire cache at once when the time
> comes.

Now that we have an understanding of what is happening, I can recommend
that you increase /sys/block/md0/md/safe_mode_delay.  It is measured in
seconds.  If you make it larger than the period of the bcache writes,
it should stop the 'ticking' you mentioned.

NeilBrown

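(Concretely, something along these lines. The value accepts fractional
seconds, and 5 seconds is only an illustrative choice that comfortably
exceeds bcache's one-second writeback interval; the right value for a
given setup may differ, and the setting does not persist across reboots
unless reapplied:)

    cat /sys/block/md0/md/safe_mode_delay      # default is around 0.2 seconds
    echo 5.0 > /sys/block/md0/md/safe_mode_delay
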