All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yi Zhang <yizhan@redhat.com>
To: linux-raid@vger.kernel.org
Cc: shli@fb.com, NeilBrown <neilb@suse.com>
Subject: INFO: task systemd-udevd:794 blocked for more than 600 seconds observed after mdadm -G operation
Date: Thu, 4 Aug 2016 12:01:19 -0400 (EDT)	[thread overview]
Message-ID: <1363340670.11420734.1470326479339.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <2047059297.11279583.1470314510395.JavaMail.zimbra@redhat.com>

Hello folks

Seems I hit one issue that task blocked issue on 4.7.0 during write journal testing, could anyone help check it? 
I have list the log and reproduce steps below, thanks.

Log:
[  106.688135] md: bind<sdb1>
[  106.688301] md: bind<sdc1>
[  106.689831] md: bind<sdd1>
[  106.689897] md: bind<sdf1>
[  106.710079] md: bind<sde1>
[  106.804284] raid6: sse2x1   gen()  6707 MB/s
[  106.821281] raid6: sse2x1   xor()  5667 MB/s
[  106.838284] raid6: sse2x2   gen()  8875 MB/s
[  106.855282] raid6: sse2x2   xor()  6230 MB/s
[  106.872278] raid6: sse2x4   gen() 10359 MB/s
[  106.889278] raid6: sse2x4   xor()  7697 MB/s
[  106.889280] raid6: using algorithm sse2x4 gen() 10359 MB/s
[  106.889281] raid6: .... xor() 7697 MB/s, rmw enabled
[  106.889282] raid6: using ssse3x2 recovery algorithm
[  106.907347] async_tx: api initialized (async)
[  106.924571] xor: automatically using best checksumming function:
[  106.934274]    avx       : 19356.000 MB/sec
[  107.036697] md: raid6 personality registered for level 6
[  107.036699] md: raid5 personality registered for level 5
[  107.036700] md: raid4 personality registered for level 4
[  107.037215] md/raid:md0: device sdd1 operational as raid disk 2
[  107.037217] md/raid:md0: device sdc1 operational as raid disk 1
[  107.037218] md/raid:md0: device sdb1 operational as raid disk 0
[  107.037508] md/raid:md0: allocated 4374kB
[  107.037533] md/raid:md0: raid level 5 active with 3 out of 4 devices, algorithm 2
[  107.037534] RAID conf printout:
[  107.037535]  --- level:5 rd:4 wd:3
[  107.037536]  disk 0, o:1, dev:sdb1
[  107.037537]  disk 1, o:1, dev:sdc1
[  107.037538]  disk 2, o:1, dev:sdd1
[  107.037548] md/raid:md0: using device sdf1 as journal
[  107.037829] md0: detected capacity change from 0 to 32187088896
[  107.037849] RAID conf printout:
[  107.037853]  --- level:5 rd:4 wd:3
[  107.037855]  disk 0, o:1, dev:sdb1
[  107.037857]  disk 1, o:1, dev:sdc1
[  107.037858]  disk 2, o:1, dev:sdd1
[  107.037860]  disk 3, o:1, dev:sde1
[  107.037945] md: recovery of RAID array md0
[  107.037948] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  107.037949] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[  107.037954] md: using 128k window, over a total of 10477568k.
[  216.232974] md: md0: recovery done.
[  216.431212] RAID conf printout:
[  216.431216]  --- level:5 rd:4 wd:4
[  216.431218]  disk 0, o:1, dev:sdb1
[  216.431219]  disk 1, o:1, dev:sdc1
[  216.431220]  disk 2, o:1, dev:sdd1
[  216.431221]  disk 3, o:1, dev:sde1
[  222.897854] md/raid:md0: Disk failure on sdf1, disabling device.
md/raid:md0: Operation continuing on 4 devices.
[  223.174949] RAID conf printout:
[  223.174952]  --- level:5 rd:4 wd:4
[  223.174953]  disk 0, o:1, dev:sdb1
[  223.174954]  disk 1, o:1, dev:sdc1
[  223.174955]  disk 2, o:1, dev:sdd1
[  223.174956]  disk 3, o:1, dev:sde1
[  241.640253] md: unbind<sdf1>
[  241.643448] md: export_rdev(sdf1)
[  261.329087] md: bind<sdf1>
[  261.607691] RAID conf printout:
[  261.607694]  --- level:5 rd:4 wd:4
[  261.607695]  disk 0, o:1, dev:sdb1
[  261.607696]  disk 1, o:1, dev:sdc1
[  261.607697]  disk 2, o:1, dev:sdd1
[  261.607698]  disk 3, o:1, dev:sde1
[  292.581113] RAID conf printout:
[  292.581116]  --- level:5 rd:5 wd:5
[  292.581118]  disk 0, o:1, dev:sdb1
[  292.581119]  disk 1, o:1, dev:sdc1
[  292.581120]  disk 2, o:1, dev:sdd1
[  292.581121]  disk 3, o:1, dev:sde1
[  292.581121]  disk 4, o:1, dev:sdf1
[  292.581199] md: reshape of RAID array md0
[  292.581202] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
[  292.581204] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
[  292.581214] md: using 128k window, over a total of 10477568k.
[ 1201.734199] INFO: task systemd-udevd:794 blocked for more than 600 seconds.
[ 1201.734204]       Not tainted 4.7.0 #5
[ 1201.734204] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1201.734206] systemd-udevd   D ffff8800310a77b0     0   794      1 0x00000000
[ 1201.734209]  ffff8800310a77b0 0000000000000001 ffff880031de0000 ffff8807cd3c5688
[ 1201.734211]  ffff8800310a8000 0000000000000000 0000000000000002 ffff8807c3e4c100
[ 1201.734213]  0000000000000000 ffff8800310a77c8 ffffffff816e4935 ffff8807cd3c5400
[ 1201.734214] Call Trace:
[ 1201.734220]  [<ffffffff816e4935>] schedule+0x35/0x80
[ 1201.734226]  [<ffffffffa09a0ba9>] raid5_make_request+0x8b9/0xce0 [raid456]
[ 1201.734229]  [<ffffffff810d0240>] ? prepare_to_wait_event+0xf0/0xf0
[ 1201.734233]  [<ffffffff8156758e>] md_make_request+0xee/0x230
[ 1201.734236]  [<ffffffff81320803>] generic_make_request+0x103/0x1d0
[ 1201.734237]  [<ffffffff81320947>] submit_bio+0x77/0x150
[ 1201.734240]  [<ffffffff813183bb>] ? bio_alloc_bioset+0x1ab/0x2d0
[ 1201.734242]  [<ffffffff81254ecd>] ? alloc_page_buffers+0x5d/0xb0
[ 1201.734244]  [<ffffffff81256a1f>] submit_bh_wbc+0x12f/0x160
[ 1201.734246]  [<ffffffff81256d58>] block_read_full_page+0x208/0x380
[ 1201.734248]  [<ffffffff81259000>] ? I_BDEV+0x20/0x20
[ 1201.734250]  [<ffffffff812597e8>] blkdev_readpage+0x18/0x20
[ 1201.734253]  [<ffffffff81197718>] do_read_cache_page+0x138/0x300
[ 1201.734255]  [<ffffffff812597d0>] ? blkdev_writepages+0x40/0x40
[ 1201.734257]  [<ffffffff811978f9>] read_cache_page+0x19/0x20
[ 1201.734259]  [<ffffffff81333b50>] read_dev_sector+0x80/0xb0
[ 1201.734261]  [<ffffffff81337f14>] read_lba+0x104/0x1c0
[ 1201.734263]  [<ffffffff8133852a>] find_valid_gpt+0xfa/0x710
[ 1201.734266]  [<ffffffff811d7d00>] ? vmap_page_range_noflush+0x190/0x380
[ 1201.734267]  [<ffffffff81338b40>] ? find_valid_gpt+0x710/0x710
[ 1201.734269]  [<ffffffff81338bc9>] efi_partition+0x89/0x440
[ 1201.734271]  [<ffffffff81359209>] ? snprintf+0x49/0x70
[ 1201.734273]  [<ffffffff81338b40>] ? find_valid_gpt+0x710/0x710
[ 1201.734275]  [<ffffffff81335b31>] check_partition+0x101/0x1f0
[ 1201.734277]  [<ffffffff81334462>] rescan_partitions+0xc2/0x2c0
[ 1201.734279]  [<ffffffff8132f294>] __blkdev_reread_part+0x64/0x70
[ 1201.734281]  [<ffffffff8132f2c3>] blkdev_reread_part+0x23/0x40
[ 1201.734282]  [<ffffffff8132fd3a>] blkdev_ioctl+0x46a/0x8f0
[ 1201.734284]  [<ffffffff81259aa1>] block_ioctl+0x41/0x50
[ 1201.734286]  [<ffffffff812323b6>] do_vfs_ioctl+0xa6/0x5c0
[ 1201.734288]  [<ffffffff81232949>] SyS_ioctl+0x79/0x90
[ 1201.734290]  [<ffffffff81274d67>] ? SyS_flock+0x117/0x1a0
[ 1201.734292]  [<ffffffff816e86b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[ 1201.734305] INFO: task md0_reshape:4089 blocked for more than 600 seconds.
[ 1201.734306]       Not tainted 4.7.0 #5
[ 1201.734307] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1201.734308] md0_reshape     D ffff88080db77ac0     0  4089      2 0x00000080
[ 1201.734310]  ffff88080db77ac0 0000000000000000 ffff8807d6724380 ffff8807cd3c5670
[ 1201.734311]  ffff88080db78000 ffff88080db77b20 ffff8807cd3c5670 0000000000000c20
[ 1201.734313]  ffff8807cd3c5418 ffff88080db77ad8 ffffffff816e4935 ffff8807cd3c5400
[ 1201.734314] Call Trace:
[ 1201.734316]  [<ffffffff816e4935>] schedule+0x35/0x80
[ 1201.734319]  [<ffffffffa099fe8b>] raid5_get_active_stripe+0x20b/0x670 [raid456]
[ 1201.734321]  [<ffffffff810d0240>] ? prepare_to_wait_event+0xf0/0xf0
[ 1201.734323]  [<ffffffffa09a4733>] reshape_request+0x4d3/0x8a0 [raid456]
[ 1201.734326]  [<ffffffffa09a4e21>] raid5_sync_request+0x321/0x3a0 [raid456]
[ 1201.734327]  [<ffffffff810cfc54>] ? __wake_up+0x44/0x50
[ 1201.734329]  [<ffffffff8156ce97>] md_do_sync+0x9e7/0xf60
[ 1201.734332]  [<ffffffff810c200d>] ? update_curr+0xed/0x180
[ 1201.734333]  [<ffffffff810c025e>] ? account_entity_dequeue+0xae/0xd0
[ 1201.734336]  [<ffffffff81098d43>] ? kernel_sigaction+0x43/0xe0
[ 1201.734339]  [<ffffffff81566dc6>] md_thread+0x136/0x150
[ 1201.734341]  [<ffffffff81566c90>] ? find_pers+0x70/0x70
[ 1201.734343]  [<ffffffff810ab678>] kthread+0xd8/0xf0
[ 1201.734344]  [<ffffffff816e88bf>] ret_from_fork+0x1f/0x40
[ 1201.734346]  [<ffffffff810ab5a0>] ? kthread_park+0x60/0x60


Reproduce steps:

#mdadm -CR /dev/md0 -l5 -n4 /dev/sd[b-e]1 --write-journal /dev/sdf1
#mdadm --wait /dev/md0 
#mdadm /dev/md0 -f /dev/sdf1
#mdadm /dev/md0 -r /dev/sdf1
#mdadm /dev/md0 -a /dev/sdf1
#mdadm -D /dev/md0 
/dev/md0:
        Version : 1.2
  Creation Time : Thu Aug  4 20:37:41 2016
     Raid Level : raid5
     Array Size : 31432704 (29.98 GiB 32.19 GB)
  Used Dev Size : 10477568 (9.99 GiB 10.73 GB)
   Raid Devices : 4
  Total Devices : 5
    Persistence : Superblock is persistent

    Update Time : Thu Aug  4 20:40:16 2016
          State : clean 
 Active Devices : 4
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 512K

           Name : dhcp-12-125.nay.redhat.com:0  (local to host dhcp-12-125.nay.redhat.com)
           UUID : 09607542:a8dccc81:f5d2f80c:c586923f
         Events : 22

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       2       8       49        2      active sync   /dev/sdd1
       5       8       65        3      active sync   /dev/sde1

       4       8       81        -      spare   /dev/sdf1

#mdadm -G -n5 /dev/md0

Best Regards,
  Yi Zhang



       reply	other threads:[~2016-08-04 16:01 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <2047059297.11279583.1470314510395.JavaMail.zimbra@redhat.com>
2016-08-04 16:01 ` Yi Zhang [this message]
2016-08-05  6:39   ` INFO: task systemd-udevd:794 blocked for more than 600 seconds observed after mdadm -G operation Guoqing Jiang
2016-08-05  7:05     ` yizhan
2016-08-06  4:45       ` Shaohua Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1363340670.11420734.1470326479339.JavaMail.zimbra@redhat.com \
    --to=yizhan@redhat.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.com \
    --cc=shli@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.